Well yeah,

[quote="Log, post:61, topic:191533"]
it’s fucked
[/quote]
Seems like you've got a nice capacity-optimized RAID0 configuration now. With that RAIDZ vdev in the pool you can't remove the special vdevs anymore; you'd have to destroy the pool and restore from backup to change this (which is what you want to do anyway, because you want the metadata currently sitting on the HDDs to end up on the special vdev, and ZFS doesn't migrate existing data by itself, so it stays on the HDDs).
Use the command I posted a couple of posts earlier in the shell to get a properly mirrored setup.
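For reference, recreating the pool with mirrored special vdevs looks roughly like this. This is just a sketch of the general form, not the exact command from that earlier post; the pool name, disk count and device names are placeholders:

```
# Sketch only: one RAIDZ2 data vdev plus three 2-way special mirrors.
# Swap "tank" and the device names for your own.
zpool create tank \
  raidz2 sda sdb sdc sdd sde sdf \
  special \
    mirror nvme0n1 nvme1n1 \
    mirror nvme2n1 nvme3n1 \
    mirror nvme4n1 nvme5n1
```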
It took a few days, but I backed up the data and then rebuilt the pool. Before I transfer the data back, is this the configuration I want? The goal was a 330GB striped/mirrored special vdev for metadata.
Yeah, that looks like it's probably what you want, presuming you wanted 3 special vdevs of 2-disk mirrors and not 2 vdevs of triple mirrors. I'm assuming the special vdev drives are ~110GB.
Note that as a consequence, your RAIDZ2 vdev has to lose 3 disks to kill the pool and 2 to be vulnerable to bitrot, but a special vdev mirror only 2 and 1 respectively. As long as you have automatic backups, and another drive to throw in should a special vdev drive fail, this shouldn't be an issue. If you ever needed to, you could take almost any other HDD or SSD and throw it in to replace a failed special vdev drive, because it'd be larger. The performance of that special vdev would suck, but it'd work, and the drive could be swapped out later when you have a proper replacement.
Note that you can explicitly check the size, allocated and free space of a pool and each of its vdevs using zpool list -v.
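With the pool name as a placeholder; the special mirrors show up in the output as their own rows with their own SIZE / ALLOC / FREE columns:

```
zpool list -v yourpool
```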
HDDs, but especially flash, can be really quick to resilver, and having any drive on hand with the right size/capacity can serve as a band-aid to maintain integrity until another drive of better quality is ordered and badblocked/tested.
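Swapping such a band-aid drive in is just a regular replace; a rough sketch with placeholder names:

```
# Replace the failed special vdev member with whatever suitable drive is on hand,
# then repeat the same replace later once the proper drive has been tested.
zpool replace yourpool old-or-failed-device spare-on-hand
zpool status yourpool   # watch the resilver complete
```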
This value is highly dependent on the record/block size of your datasets: smaller records mean more blocks and therefore more metadata, larger records mean less. But unless you're running a lot of 4k/8k records on your pool, you'll be fine with metadata-only on the special vdev. With small blocks included, though, 330GB isn't that much really.
edit: but judging from that ~3GB of metadata per 5.5TB of data from the list (roughly 0.05%), all is fine.
The tool is called fio and available in basically all repos. Great tool.
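If you want to poke at it yourself, a minimal run looks something like this; the file path, size and block size are placeholders rather than a tuned benchmark:

```
# 4k random reads against a file on the pool -- roughly the kind of workload
# where metadata and small-block latency shows up.
fio --name=randread-test \
    --filename=/yourpool/yourdataset/fio-testfile \
    --rw=randread --bs=4k --size=4G \
    --ioengine=libaio --iodepth=16 \
    --runtime=60 --time_based
```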
64k or 128k is probably what most people use. I have mostly 256k recordsize and I'm using 128k for special_small_blocks. Anything up to and including this value is stored on the special vdev.
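For reference, setting and checking that per dataset looks roughly like this (pool/dataset names are placeholders):

```
# Blocks up to and including 128K go to the special vdev for this dataset.
# Keep special_small_blocks below the dataset's recordsize, otherwise *all*
# of the dataset's data will land on the special vdev.
zfs set special_small_blocks=128K yourpool/yourdataset
zfs get recordsize,special_small_blocks yourpool/yourdataset
```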
I was using special devices in my home NAS. However, I don't think it was worth the effort. In most cases, you're better off just adding some cache devices. A cache device can cache metadata too; you just need to wait for the cache to warm up.
Adding log devices is also recommended to improve sync writes.
As someone mentioned, a special device has to be mirrored, so it takes at least 8 PCIe lanes. Generally, I don't mirror cache and log devices.
Special devices also add a point of failure, and an opportunity for human error. If you lose the special devices, you lose the pool. Losing cache or log devices is not an issue.
In my opinion, the priority should be: cache > log > special.
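For anyone following along, adding those looks roughly like this (pool and device names are placeholders):

```
# L2ARC read cache -- no redundancy needed, losing it is harmless.
zpool add yourpool cache nvme0n1
# SLOG for sync writes -- a single device works, though a mirror protects in-flight sync writes.
zpool add yourpool log nvme1n1
```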
There is a roughly fixed number of metadata bytes per record or block, so the more blocks/records you have, the more metadata you get. Storing data in large records/blocks results in relatively tiny amounts of metadata. Usually you end up with 1-10GB of metadata per TB of data.
Special vdevs are particularly useful if you use a lot of 4k or 8k records on a large pool, because you can't cache all that metadata. Or for storing small files, which is probably the most beneficial part for home usage. Sizing entirely depends on how many small files you have (see the rough estimate below).
The special vdev being too small is not a problem: excess data just gets allocated to the HDDs as normal and cached in ARC. Just get a mirror of 1-2TB NVMe or SATA SSDs if you're short on PCIe lanes and you'll be fine.
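A rough way to estimate the small-file side of that sizing, counting file sizes rather than exact ZFS block allocation; the path and the 128k cutoff are placeholders:

```
# Sum of all files smaller than 128k under the dataset mountpoint (GNU find).
find /yourpool/yourdataset -type f -size -128k -printf '%s\n' \
  | awk '{ total += $1 } END { printf "%.1f GiB in small files\n", total / 1024^3 }'
```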
Wendell is selling me hard on low latency 118GB discount Optane drives.
I do not see a PCIe bifurcation option for the board, despite Anandtech saying the processor is capable.
“the Atom C3000 features a PCIe 3.0 x16 controller (with x2, x4 and x8 bifurcation), 16 SATA 3.0 ports, four 10 GbE controllers, and four USB 3.0 ports.”
This helpful poster @jode has replied here, basically telling me I'm boned if I want 4x M.2.
So, to my question(s):
PCIe 3.0 x4 can do ~4GB/s. If I'm comfortable losing bandwidth, can I purchase a PCIe-switched 4x M.2 carrier card? (I'd still get the great latency, right?)
Or is this simply not possible? The 118GB drives are reasonably priced and I have no other use for the PCIe slot. The NAS has been particularly slow at accessing drives with lots of files lately, and I highly suspect Wendell's metadata 'trick' is right up my alley.
Well, that is unfortunate, but it is what it is. (Funnily enough, I have an N40L and an N54L under a desk here that I need to eventually sell, but that's another story.)
So the question is: if I can't do PCIe bifurcation, is there a solution that gives me 4x M.2 devices, fits into the x4 slot on this old classic motherboard, and doesn't hold me back too much?
Yes, you can use a card with a PLX chip (no bifurcation needed). There are a few things to watch out for:
Most such cards are PCIe x8 or x16, meaning they're physically larger than the slot on your mobo. However (and I can't really tell from the pics), it looks like the PCIe slot on your mobo is open-ended on one side, allowing larger cards to fit into it. Assuming this is true (or you're willing to modify the slot as a customization), these cards would work in your mobo, with speed reduced to the 4 available PCIe lanes.
Any hop required between device and CPU will add latency. I cannot guess how much latency is added by a PLX switch, but it’s going to be measurable.
PLX cards are pretty expensive (thanks, Broadcom). Expect to add a card that costs as much as your mobo.
Finally, with such a card you'll only get speed benefits from up to two Optane 1600X devices, because these support up to 2 PCIe lanes (at superior latency). Adding more devices (say 4) would expand the Optane capacity you can access, but not the bandwidth. It will likely add a tiny bit of latency, because the PLX chip has more switching work to do (but that's just a wild guess on my part). If you need more capacity than the cheaper Intel Optane 1600X offers, look at the more expensive but higher-capacity and faster (4 PCIe lanes) Intel Optane 905P.
Before I make further posts, can I confirm that even if I compromise to just 2x M.2 slots, I'd still require an (expensive) PLX-switch-based card? There's no way for me to add a dirt cheap one?