So, since Wendell scared me with his bit rot video (and a few other articles I read recently), I was wondering if anyone has recently built a low-cost, low-power NAS.
Theoretically, I could use a Raspberry Pi 3 and connect the drives via USB, but then I'd need n+1 separate power connectors for an n-drive NAS. In the end, I'd like a NAS with only one power connector and, as the title suggests, I'm going to use btrfs as the filesystem, since afaik btrfs doesn't have the "x TiB of drives needs x GiB of RAM" rule of thumb that ZFS does.
Also, I don't mind if the NAS is slow, as long as my files are more or less safe. Do you have any suggestions?
Edit: It would be awesome if I could use an ARM CPU xD
Gonna agree with @Streetguru on this one. If you can avoid it, though, don't use BTRFS. Parity in BTRFS is majorly broken, in that it will sometimes recalculate the parity incorrectly.
This "x TiB needs x GiB" rule only applies when you're doing deduplication. If you don't want to dedup your data, you don't need that much RAM. You can also tune ZFS's RAM usage settings.
Obviously, use BTRFS if you want, but don't assume anything about it being reliable.
In my opinion, LVM (or mdadm) plus XFS is better than BTRFS in its current (Linux 4.7) state.
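In case it helps, here's roughly what that stack looks like on the command line. A minimal sketch, assuming two spare drives at /dev/sdb and /dev/sdc (placeholders, check yours with lsblk) in a simple two-disk mirror:

```bash
# Build a two-disk mdadm RAID 1 mirror (device names are placeholders).
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

# Put XFS on the array and mount it.
sudo mkfs.xfs /dev/md0
sudo mkdir -p /mnt/nas
sudo mount /dev/md0 /mnt/nas

# Record the array so it assembles on boot
# (config path is /etc/mdadm/mdadm.conf on Debian-based distros).
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
```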
If you are going with BTRFS, a quick PSA: do not use RAID 5 or 6 with it. They are not ready for prime time and you will experience data loss. The code behind them is bad and will most likely require a rewrite to fix.
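The mirror profiles are generally considered the stable ones, so if you want BTRFS redundancy today, something like this (placeholder device names again) is the safer route:

```bash
# RAID 1 for both data (-d) and metadata (-m); avoid the raid5/raid6 profiles.
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc

# Mounting any member device brings up the whole array.
sudo mount /dev/sdb /mnt/nas
```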
Depending on the number of drives you want to use, which services you'll run, and how many people will be using those services, I would recommend a quad-core Xeon D or Avoton.
ZFS will still use a lot of RAM without deduplication. By default it will use up to half of your available memory for cache, plus whatever it needs for the filesystem itself. An SSD cache will require an additional 1 GB per 10 GB of cache or something like that. Deduplication requires an additional 4 GB per 1 TB, etc.
How much RAM ZFS needs depends a lot on the workload, so if you're not hitting the NAS very hard it won't use much. But if you don't have much RAM and ZFS needs more than you have, it might cause a system crash.
There are ways to tune ZFS for low-memory systems, however, probably at the cost of performance.
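For example, on ZFS on Linux the main knob is the zfs_arc_max module parameter, which caps the ARC. A minimal sketch, assuming a 1 GiB cap (pick whatever fits your box):

```bash
# Persist the cap across reboots (value is in bytes; 1 GiB here).
echo "options zfs zfs_arc_max=1073741824" | sudo tee /etc/modprobe.d/zfs.conf

# Apply it to the running system without a reboot;
# the ARC shrinks down to the new limit over time.
echo 1073741824 | sudo tee /sys/module/zfs/parameters/zfs_arc_max
```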
I actually like the implementation of BTRFS on the newer Synology boxes. It uses BTRFS features on top of old-school mdraid. So it won't repair stuff automatically, but it will say "hey, that file is fucked". As long as you have another source for the file (which you should if you care about your data), you can then replace it yourself. I plan on using CrashPlan for that, and I also have another off-site backup of my most important stuff.
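By the way, you can trigger that "this file is fucked" detection yourself on any btrfs volume with a scrub; it re-reads everything and verifies the checksums. Mount point below is a placeholder:

```bash
# Start a scrub (runs in the background) and check on it.
sudo btrfs scrub start /mnt/nas
sudo btrfs scrub status /mnt/nas

# Per-device error counters, handy for spotting a dying disk.
sudo btrfs device stats /mnt/nas
```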
Oh, seems I either have to read up on ZFS tuning or stick with btrfs.
I know btrfs is experimental and I wouldn't have minded, but the parity part is a little scary. As far as I've read, btrfs' checksums are better.
I didn't intend to use btrfs with RAID 5/6 anyway, because of it being unstable. As for the number of people using the NAS, it would just be my wife and me. I'd be fine with it being slower, since I have other, faster network-attached storage.
@noenken: Since I'm looking for a rather cheap NAS and I would really like to build one myself, I'm probably not going to buy a Synology, but thanks.
Well, that turned into a rather lengthy answer; I hope I didn't forget anyone - thanks again :-)
Yeah, since @Azulath is going to stay away from RAID 5/6, he's totally fine using BTRFS.
> ZFS will still use a lot of RAM without deduplication. By default it will use up to half of your available memory for cache, plus whatever it needs for the filesystem itself. An SSD cache will require an additional 1 GB per 10 GB of cache or something like that. Deduplication requires an additional 4 GB per 1 TB, etc.
Odd. I've not seen anything like that with my SSD cache. I've got 16 GB of RAM on the machine that hosts my array, with 120 GB of SSD L2ARC and 40 GB more for ZIL.
As far as the "half your available memory" goes, that's true, but again, it's just the default. I usually limit the ARC to 1 GB of RAM. I've got SSDs, and the only need for more RAM would be to help with 10G networking, which I don't use.
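For anyone following along, attaching those SSDs is standard zpool stuff. The pool and device names below are made up, not my actual layout:

```bash
# Add an SSD partition as L2ARC (read cache) ...
sudo zpool add tank cache /dev/sdd1

# ... and another as a SLOG/ZIL device (sync-write log).
sudo zpool add tank log /dev/sdd2

# Confirm the cache and log vdevs show up.
zpool status tank
```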
I think the ARC is used to cache metadata more than anything, so it saves the disks from having to keep going back and forth between data and metadata when reading small files. The SSD cache memory usage seems to be workload-dependent: the cache can be full while not much memory is being used, but after some heavy reads it goes right up. I think that 1 GB per 10 GB of cache is the maximum that would be required to reference the cached data.
That's just based on stuff I've read anyway; I've only been playing around with ZFS for a couple of weeks.
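If you want to see what the L2ARC headers actually cost on a given box rather than guessing, ZFS on Linux exposes it in the ARC stats (paths assume ZoL; FreeBSD uses sysctl instead):

```bash
# Bytes of ARC memory currently used by L2ARC headers.
grep l2_hdr_size /proc/spl/kstat/zfs/arcstats

# arc_summary, if your distro ships it, gives a friendlier breakdown.
arc_summary
```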