The big internal debate is which to use: plain RAID6 or RAIDZ2… and the ZFS goodies.
Phaselockedloopable- PLL's continued exploration of networking, self-hosting and decoupling from big tech
Well, in my book, there’s literally no reason to go traditional raid.
If your controller dies, so does your array. See, raid controllers usually have a proprietary storage format and all that fun stuff. RAIDZ is just ZFS, so you can plug it into any old computer and get to work.
Small performance sacrifice, but how bad is it really?
That was my thought, but given that mdadm exists, which one has the advantage over the other?
mdadm has speed.
ZFS has reliability and feature set.
If you have 8GB of ram that you won’t use, go ZFS every day of the week.
64 GB of WAM so in theory I can support up to a 48 TB array and still take full advantage of all of ZFS’ great features
The only RAM requirement for ZFS is that you have at least 4GB to support the ARC and other things it does in ram.
There is no array size dependent memory requirement. (except for dedup, which we don’t talk about)
Novasty was saying something about 1 GB of ram per terabyte or something. Is it some feature that requires that?
That’s only for dedup, and even that’s questionable now that SSDs are properly supported.
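For anyone wondering where per-terabyte numbers like that come from, here's a rough back-of-the-envelope sketch. It assumes the commonly quoted figure of about 320 bytes of RAM per dedup table entry and one entry per unique block. Both are assumptions for illustration, not numbers from this thread, and real usage depends on your data and recordsize.

```shell
#!/bin/sh
# Rough estimate of dedup table (DDT) size in MiB.
# Assumptions: ~320 bytes per DDT entry (a commonly quoted rule of thumb)
# and every block is unique (worst case).
ddt_ram_mib() {
    pool_tib=$1          # amount of stored data, in TiB
    recordsize=$2        # average block size in bytes, e.g. 131072 for 128K
    blocks=$(( pool_tib * 1024 * 1024 * 1024 * 1024 / recordsize ))
    echo $(( blocks * 320 / 1024 / 1024 ))
}

ddt_ram_mib 1 131072     # 1 TiB of 128K blocks
ddt_ram_mib 48 131072    # 48 TiB of 128K blocks
```

Under those assumptions, 128K blocks work out to about 2.5 GiB of table per TiB of unique data, and smaller blocks make it worse. With dedup off, none of this applies.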
IIRC CERN uses ZFS and Ceph in the multiple petabyte order to do their LHC work.
Well, it’s going on spinning rust (not budging on that)… so dedup seemed like something I wanted to have?
I read a bit into this
Dedup is something you want only if you do WORM operations and actually plan on storing deduplicatable data.
If you need performant writes, dedup is basically like tying a 5 ton anchor to a diesel truck. It’ll go, but not quick.
I see. So tell me more. This is just going to be a massive storage bank. It’s going to store what’s in the OP (reference it for what I am doing with the server). Is dedup or any other feature really necessary? What features increase the reliability of the array? (I want to minimize rebuilds.)
If I am not limited by the size of disks… man, how big can I go… What’s a good sweet spot for CMR drives (preferably ENT or NAS drives)?
See, the problem with the ZFS implementation of dedup is that every write (and every free, for that matter) needs to do a dedup table lookup: ZFS checksums the block being written and looks that checksum up in the table. It then either adds the block to the table or references an existing block. If the table doesn’t fit in RAM, those lookups go to disk.
BTRFS has a much more sane solution for dedup, where it scans and dereferences blocks on a schedule. This allows fast writes, fast reads and no huge memory requirements.
- block checksums
- Copy on Write
- ZFS Intent Log (ZIL)
As far as the features you really want, you’ll probably get good use out of snapshots, which are deduplicated even without the dedup feature enabled. That’s the nature of Copy on Write. Additionally, the ability to zfs send and zfs recv data to and from pools over UNIX pipes is really handy. It means you can
zfs send rpool/dataset@snapshot | ssh remote.zfs.system zfs recv rpool/dataset
and send an entire dataset to another server. (zfs send works on a snapshot, and zfs recv has to be the command that ssh runs on the remote end.)
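A slightly fuller sketch of that workflow, in case it helps: zfs send operates on a snapshot, so this takes one first, and zfs recv is handed to ssh so it runs on the remote side. The function name, the snapshot naming scheme and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Sketch: snapshot a dataset and replicate it to another host over ssh.
# Set DRY_RUN=1 to print the commands instead of running them.
replicate() {
    src=$1       # local dataset, e.g. rpool/dataset
    host=$2      # remote host reachable over ssh
    dst=$3       # dataset name to receive into on the remote side
    snap="${src}@${4:-backup-$(date +%Y%m%d)}"   # 4th argument is optional

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "zfs snapshot $snap"
        echo "zfs send $snap | ssh $host zfs recv $dst"
        return 0
    fi

    zfs snapshot "$snap" || return 1
    # zfs recv runs remotely because it is passed to ssh as the command
    zfs send "$snap" | ssh "$host" zfs recv "$dst"
}

DRY_RUN=1 replicate rpool/dataset remote.zfs.system rpool/dataset first
```

Running it with DRY_RUN=1 first is a cheap way to check the dataset and host names before anything touches the pool.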
The other features I found to be nice are transparent compression and native encryption. You can tell zfs to compress the entire dataset with lz4, gzip or zstd (as of the most recent version, IIRC), and it’s 100% transparent to the user. Encryption works similarly. There’s no need to
cryptsetup luksOpen /dev/sdc <name>, it’s just
zfs load-key rpool/dataset and then
zfs mount -a.
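Here's roughly what setting that up looks like, as a sketch. The dataset name rpool/private, the passphrase key format and the DRY_RUN helper are assumptions for illustration, and zstd needs a reasonably recent OpenZFS (lz4 is the usual fallback).

```shell
#!/bin/sh
# Sketch: create a compressed, natively encrypted dataset, then unlock and
# mount it later. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

create_private_dataset() {
    # Compression and encryption are just dataset properties set at creation.
    run zfs create \
        -o compression=zstd \
        -o encryption=on \
        -o keyformat=passphrase \
        "$1"
}

unlock_dataset() {
    # Needed after a reboot or import, when the key is no longer loaded.
    run zfs load-key "$1" && run zfs mount "$1"
}

DRY_RUN=1
create_private_dataset rpool/private
unlock_dataset rpool/private
```

Encryption has to be chosen when the dataset is created, so it's worth deciding on the dataset layout before copying data in.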
I don’t know what the sweet spot is, but there’s practically no upper limit. Wendell has a 192TB zfs pool for Level1, and that’s honestly probably not anywhere near the biggest pool he’s worked with.
Linus uses ZFS on his “petabyte project” pools I think, but I’m not 100% sure.
The maximum pool size is 256 quadrillion zebibytes (2^128 bytes), so any pool you can reasonably construct, ZFS won’t even sneeze at.
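Quick sanity check on that figure. 2^128 doesn't fit in shell arithmetic, but the zebibyte count does: one ZiB is 2^70 bytes, so 2^128 bytes is 2^58 ZiB. The "256" presumably comes from counting in powers of 1024 (2^58 = 256 × 2^50); that reading is my assumption.

```shell
#!/bin/sh
# 2^128 bytes in zebibytes (1 ZiB = 2^70 bytes) is 2^(128-70) = 2^58.
zib=$(( 1 << 58 ))
echo "$zib ZiB"                       # plain decimal count of zebibytes
# Counting in binary steps instead: 2^58 = 256 * 2^50.
echo "$(( zib >> 50 )) x 2^50 ZiB"
```

Either way you count it, it's far beyond anything you can buy drives for.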
Sorry I was getting memeSSL setup
(libreSSL over openSSL)
I will remember this. Im assuming there is some fine tuning to this?
Awesomesauce. I would put that to good use in case anything dumb happened with my data or I made a mistake administering something.
Sweet… That’s pretty convenient. Yeah, I might go ZFS.
This is pretty much a must for me. You know me. I even encrypt stuff that makes no sense to encrypt.
IDK how I feel about Facebook code, but it’s open so it can’t hurt. They have done some interesting math with this.
So there’s a high likelihood it may just be one single array dataset (Z2). I will assign space through LXC to my stuff such as Nextcloud and Collabora. Otherwise I’ll generally store data on it for Jellyfin (see above… moving away from Plex and Emby).
Sometimes I love his insanity
What I meant was the price-to-space ratio: the most gigabytes for the money spent.
Currently WAN only. I have almost entirely open devices internally. I’m not worried about an internal attack.
Does anyone on the forum know of good suricata lists? Asking for a friend.
Current lists above
Thinking my next step is an SSH Honeypot for kicks
She’s coming together slowly. I prefer a more solarized look.
(Don’t care if my IP is shown; it changes every 6 hrs.)
For an engineer I can be big dumb sometimes. Sits for 30 minutes wondering why he can’t set IPv6 prefixes in his dual stack.
Now I can set OPT2 to its own damn thing and LAN and OPT1 to their respectives
Parity is just if you’re running raidz, and the only tuning is if it’s 1 2 or 3 drive loss tolerances.
CoW is always on; ZFS doesn’t give you a switch for it.
Checksums can technically be turned off per dataset (checksum=off), but you really shouldn’t.
ZIL is definitely something you can tune, but you can also set up a SLOG device (separate log) for this, to improve your overall performance.
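On the parity point, the 1/2/3 choice mostly comes down to how much raw space you give up. A rough sketch of the arithmetic, assuming a single raidz vdev of equal-size drives; the function name is made up, and the result ignores padding, metadata and the free space ZFS likes to keep, so treat it as an upper bound.

```shell
#!/bin/sh
# Rough usable capacity of a single raidz vdev: (disks - parity) * disk size.
raidz_usable_tb() {
    disks=$1     # number of drives in the vdev
    parity=$2    # 1, 2 or 3 (raidz1, raidz2, raidz3)
    size_tb=$3   # size of each drive in TB
    if [ "$disks" -le "$parity" ]; then
        echo "need more disks than parity drives" >&2
        return 1
    fi
    echo $(( (disks - parity) * size_tb ))
}

raidz_usable_tb 6 2 8    # six 8 TB drives in raidz2
```

So six 8 TB drives in raidz2 give at most about 32 TB usable, and the pool survives any two drive failures.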
Same here. I hate facebook, but you can’t deny that their compression algorithm is the best out there. Arch compresses all their packages with zstd now.
datasets are like thin-provisioned filesystems.
Think of it as a folder, but it has tunable filesystem properties and ZFS can administer it. You should have a dataset for each distinct category of data. I can give you more details if you want.
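To make that concrete, here's a sketch of a per-category layout for the kind of setup described in this thread. The pool name tank, the dataset names, the compression choices and the 500G quota are all placeholders I picked for illustration.

```shell
#!/bin/sh
# Sketch: one dataset per category of data, each with its own properties.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

make_datasets() {
    pool=$1
    # Media is mostly already-compressed video, so keep compression cheap.
    run zfs create -o compression=lz4 "$pool/media"
    # Documents compress well; cap the space so one service can't eat the pool.
    run zfs create -o compression=zstd "$pool/nextcloud"
    run zfs set quota=500G "$pool/nextcloud"
}

DRY_RUN=1
make_datasets tank
```

Since datasets are thin-provisioned, they all draw from the same pool free space unless you set a quota or reservation.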
Used to be 4TB, but it might be going up now. I’d check /r/datahoarder for best results.
It’s worth noting that BTRFS is a worthy contender if you use raid10 instead of parity. (their parity implementation is broken and the write hole still exists, IIRC)
BTRFS has the benefit of being flexible, but ZFS is working on flexibility soon, I think, with device removal or raidz resizing.
Thanks for the help, I’ll go to reddit.
Right now I’m reverse engineering the damn switch to figure out how to proof-of-concept my dumb AP with my previous router.
Numbers 0-3 are Ports 4 to 1 as labeled on the unit, 5 is the internal connection to the router itself. Don’t be fooled: Port 1 on the unit is number 3 when configuring VLANs. vlan0 = eth0.0, vlan1 = eth0.1 and so on.
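Since that reversed numbering is easy to trip over, here's a tiny helper sketch that encodes the mapping exactly as described (label 4 is switch port 0, label 1 is switch port 3). The function name is made up, and the mapping is only what's stated here for this particular unit.

```shell
#!/bin/sh
# Map a LAN port label on the case (1-4) to the switch port number used in
# the VLAN config. Port 5 is the internal link to the router itself.
label_to_switch_port() {
    case $1 in
        1|2|3|4) echo $(( 4 - $1 )) ;;
        *) echo "unknown port label: $1" >&2; return 1 ;;
    esac
}

for label in 1 2 3 4; do
    echo "label $label -> switch port $(label_to_switch_port "$label")"
done
```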
Going to add /etc/config/network:
config switch
	option name 'switch0'
	option reset 1
	option enable_vlan 1

config switch_vlan
	option device 'switch0'
	option vlan 1
	option ports '0 1 2 3 5'
I need food first. It’s not OpenWrt’s fault, it’s the hard-baked settings of the God damn Netgear blob.
Because when I just use the LAN ports I get the IP from the Protectli but it won’t route, i.e. I can’t get past the dumb switch / AP.
Once I get this I can transfer the edited configs over, modify them slightly and be off to the races on the R7800.
At the end of the day it’s math: how do you store data in fewer bytes by manipulating the binary data? It’s hard stuff. Huffman coding is useful to index where the literal sections of the math start.
I kind of wonder what Facebook did special to it
All the entropy coding
OH MY GOD ITS GORGEOUS… I can manage every machine from one spot
Now I just need to upgrade the machine, get the drives and fix my ZFS support LMAO