Experience with ZFS Raid and USB Docks?

So I bought

Which for a little more money has hw raid, but I opted for the cheaper option to see if it can do software raid.

Does anyone know how USB docks like the one I bought work with TrueNAS?

The RAID version is nearly 50 dollars more, so I am trying to find a cheap option for a remote pool (at my sister’s house) to sync her stuff and my stuff between locations. An off-site backup, if you will.

ZFS over USB has been surprisingly solid in testing so far, using a 4-drive 10 Gbps enclosure.

That’s interesting. I have an older enclosure which used to work great over eSATA, but when I had to switch to USB 3.0 the performance tanked. I ended up running ZFS on top of its built-in RAID. I’m sure USB has improved a lot since then, however old that thing was.

Do any of these docks bring the disks up after a power loss induced restart?

Or what happens if the host the dock is attached to reboots automatically after a power loss? If, say, the dock itself powers up but the drives have to be brought up individually and manually afterwards, is it as simple as hitting the buttons and letting ZFS sort itself out? Would the host need a reboot once the drives are on? Or is it better not to have the host auto-reboot at all, so that recovery means manually turning on the dock and drives first, then the host?

I am not sure.

That is a fair question though.

Does SMART for hard drives work over USB?

Not specifically ZFS, but I did long-term testing with USB-SATA bridge chipsets from ASMedia in 2018-2019:

  • Installed Windows 10 on the USB SATA SSD with “Win-To-Go”.

  • Booted it from USB 3.0 on an ASRock X470 Taichi/2600X/32 GB ECC UDIMM

  • 0 (!) stability issues in over a year of use. Instead of properly shutting down the system I only used S3 sleep (Suspend-to-RAM); the only reboots were due to driver updates or the monthly Windows updates.

  • No issues with data corruption.

  • I only ended the experiment since Windows Upgrades (for example version 21H2 to 22H2) are unfortunately blocked on “Win-To-Go” Windows installations and I didn’t want to toy around finding a way around that limitation.

  • I recommend getting an adapter with a modern USB 3.x bridge chipset from a manufacturer that also ships firmware updates.

Not an issue on modern USB bridge chipsets, but stay away from ones that offer integrated “RAID” solutions.

Do you already have this box in your hands? What are your general impressions? Worth it?

Not shipping till next week supposedly.

Those are some huge balls right there!

Yeah, I was only using it for backup and needed ZFS send/receive. I’ve since replaced it with a proper NAS.

I’ve run an array on USB-docked HDDs and it worked okay, to be fair. I had expected more dropouts causing false “corruption” reports, and expected to have to do loads of scrubs/resilvers as it lost drives to flaky USB connection dropouts… But it mostly just worked.

This is what I’m hoping for. I already have three NASes and just don’t want to buy a completely new computer. My budget is at most 250 USD for a DAS with 5 or more bays.

Performance is bad on USB, but for infrequent access, good enough.

If 5 drives are hanging off one USB connection, the whole zpool might drop and trigger alarms, but a replug or a reboot should fix it. To be safe, re-save any files that were being transferred or saved, in case of a partial or interrupted write. ZFS won’t serve bad data, but an OS may assume that because part of a file was saved successfully, it doesn’t need to restart from the beginning of the file next time, leaving a stub of a file behind because of the bad connection. Not corrupt, just incomplete. It’s like doing a large copy and cancelling it, with the incomplete file not deleted afterwards.
Because when you go back to resume, the OS will see a 1 MB stub and just skip writing the whole 4 GB file.
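
One way to catch those stubs before resuming a copy is to compare file sizes between source and destination. This is only a minimal sketch: the function name `find_incomplete_copies` and the directory layout are made up for illustration, and a size check will not catch a same-size file with different content.

```python
import os


def find_incomplete_copies(src_root, dst_root):
    """Return relative paths whose destination copy exists but differs in size.

    Hypothetical helper: a size mismatch suggests an interrupted write left
    a stub behind that a naive "skip existing files" resume would keep.
    """
    incomplete = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(src_path, src_root)
            dst_path = os.path.join(dst_root, rel_path)
            if not os.path.exists(dst_path):
                continue  # never copied, so a normal resume will pick it up
            if os.path.getsize(src_path) != os.path.getsize(dst_path):
                incomplete.append(rel_path)
    return sorted(incomplete)
```

Anything this returns is worth deleting on the destination and copying again in full. Comparing checksums instead of sizes would be a stronger check, at the cost of reading every file over the USB link.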

I’m kind of curious as to why it shouldn’t be. It’s software RAID. It shouldn’t trust the USB controller or the disks.

Even if you lose a disk due to loss of USB power or a bug, ZFS should be able to handle that.

Particularly if these are SSDs in a UASP enclosure, maybe with a couple of extra disks as a SLOG for the ZIL? Then it wouldn’t just be in RAM; it would have been written to the log devices as well.

ZFS seems more than robust enough to handle the task.

Unless I’m missing part of the picture here?

historically, especially before changes to the USB protocol… errors would accumulate until the controller would drop the device and re-add it to the bus. Like resetting a PCIe slot to get a device back. by design.

Not great for continuous access like disks. But modern 5 and 10 Gbps controllers use larger packets and have saner error correction protocols.

I had also found that a lot of USB chipsets aren’t really designed for continuous use. in testing a 10g file transfer… one that takes longer than, say… 5 minutes… every single chip would overheat/throttle/die in the extreme cases.

zpool is great. right up till scrub. then blyat.

zfs tolerates these things, but it is also tuned to be intolerant of hardware hiccups like that and keeps counters, so it’s not as straightforward as you might imagine in practice.
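
Those counters show up in the READ, WRITE, and CKSUM columns of `zpool status`. As a rough sketch of keeping an eye on them, the snippet below parses that table from text. The pool name, device names, and counts in `SAMPLE_STATUS` are made up, and `devices_with_errors` is a hypothetical helper; on a real system the text would come from running `zpool status` yourself.

```python
# Made-up sample in the usual `zpool status` layout (names and counts are illustrative).
SAMPLE_STATUS = """\
  pool: backup
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        backup      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     3
            sdc     ONLINE       0     0     0

errors: No known data errors
"""


def devices_with_errors(status_text):
    """Map device name -> (read, write, cksum) for rows with a non-zero counter."""
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True
            continue
        if not in_table:
            continue
        if not fields:
            in_table = False  # a blank line ends the config table
            continue
        if len(fields) < 5:
            continue  # section labels such as "logs" carry no counters
        name, _state, read, write, cksum = fields[:5]
        # Counters are kept as strings because large values may be abbreviated.
        if (read, write, cksum) != ("0", "0", "0"):
            flagged[name] = (read, write, cksum)
    return flagged
```

On the sample above this flags only `sdb`, the one device with checksum errors. A USB hiccup that ZFS rode out without dropping the pool would typically leave a trace like this.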

I had incorrectly assumed things were like the bad old times. Motherboards do still sometimes ship with the otherwise-reject ASMedia controller mixed in with the other USB ports that are good. usually that’s good enough/no one notices/whatever, but mostly things have dramatically improved over the last 2-3 years.

NUCs seem to have surprisingly robust USB ports, probably more so than motherboards, tbh. which can also be good for this use case if a USB disk enclosure has a good chipset that doesn’t overheat and go blyat under zfs scrub pressure.

Oh, I get that much. One thing I dislike is how developers often handle the software side of dealing with USB errors. Granted, in a well-done implementation an error at the PHY layer will only happen about once every 10^12 bits. But tbh this is often how it goes, lol: a fail/retry loop with a max counter, or worse, no max counter, because of laziness and not wanting to pay for the man-hours, so they just hammer the bus. I’ve fixed lots of these issues in my time as an engineer. If the dev is extra nice they add a suspend between retries, and if they are even extraer nice they increase the suspend duration on every iteration.

The other thing is that the USB protocol includes separate CRCs for headers and data packet payloads. Additionally, the link control word (in each packet header) has its own CRC. A failed CRC in the header or link control word is considered a serious error and results in a link-level retry to recover from it. A failed CRC in a data packet payload is considered to indicate corrupted data and can be handled by the protocol layer with a request to resend the data packet. So the hardware has a mechanism to detect it, and the transceivers will then resend the data multiple times, but this can in turn cause data packets to drop, as the receiver may consider them duplicate data. If you spend a little time you can catch it in firmware and stop this bus-hammering condition from occurring, but most don’t spend that much time on it.
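
The “extra nice” version of that retry loop, with a max counter and a suspend that grows on every iteration, could look roughly like this. It is a generic sketch and not code from any real USB stack: `TransferError`, `retry_with_backoff`, and all the default values are made up for illustration.

```python
import time


class TransferError(Exception):
    """Hypothetical error raised by a failed transfer attempt."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.01,
                       max_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The pause doubles after each failure (capped at max_delay) so a
    struggling link is not hammered with back-to-back retries.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransferError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the failure
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter is only there so the waiting can be swapped out in tests. The key points are the hard cap on attempts and the increasing pause, which are exactly the two things the lazy version skips.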

That’s interesting. I’ve never used it for this case. It would be interesting to look at some documented use cases.

If that’s the case, it doesn’t seem like a good idea. Scrubs are an essential part of maintaining a ZFS array, aren’t they?

my point was that a zpool scrub can run for days as a constant low background hum, and both external usb enclosure design AND chipset design assumed that kind of access pattern would never happen.

this is much less of an issue today than several years ago, probably because usb is improving and because 10 gigabit isn’t as fast as it used to be

sidenote: there were access patterns discovered to murder mechanical disks. Think rowhammer, but for spinning rust. That was mostly solved and not forgotten since the 80s tho.

shudders

Ahh okay, I’m tracking. That sucks. That would be incredibly painful to find out after the fact too, because of the sunk cost in going the suggested route.

It turns out going a little slower to avoid so many errors happens to be a good thing. I’m sure that as needs expand and as USB4 pushes toward everything in its goal set, USB will continue to improve. That said, it’s still a dumpster fire (on the implementation end).

Do you have a recommended dock/DAS? I would be willing to pay more for a superior setup.
