ZFS - Swapping lz4 > zstd

I’m thinking of swapping the compression on my main ZFS pool from lz4 to zstd, to get a bit more space back.

This pool is on slow 5400 RPM spinning-rust drives. It contains my Steam library.

Does anyone have experience? Presumably zstd won’t add much in terms of latency given the HDDs (I also have SSD special devices for small files).

What is a good level of zstd to aim for?
Is there a command I can use to apply the new compression without a copy & paste job?

Default is level 3, and how high you can crank the levels depends entirely on your CPU. My Ryzen 5900 can keep up with ZSTD-9 over a 10Gbit connection, but that’s more or less the limit, and it uses a lot of threads. I run ZSTD (i.e. level 3) for most stuff on my pool, higher levels for backup/archival.

Changing compression is easy (it’s a dataset property, a simple zfs set compression=zstd), but it doesn’t apply retroactively, so only new or modified data is written with ZSTD. Otherwise you would need to read, decompress, recompress, and write the entire pool all over again.
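For reference, the property change plus a check on the result looks like this (dataset name hypothetical):

zfs set compression=zstd tank/steam        # default, i.e. level 3
zfs set compression=zstd-9 tank/steam      # or pin a specific level (1-19)
zfs get compression,compressratio tank/steam

And if you do want the existing data recompressed without a manual copy job, a send/receive into a fresh dataset rewrites every block under the new setting (again, names are just examples; you’d rename or destroy datasets afterwards):

zfs snapshot tank/steam@recompress
zfs send tank/steam@recompress | zfs recv -o compression=zstd tank/steam-new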


I set zstd-9 on my archive pool. It reduces throughput a little, but I’m happy. Going above that didn’t seem to make a huge difference.

I’m running a 5600 on my server.

Are you happy with zstd-3? I may get an increase in performance setting it to zstd-5-ish, given my HDDs are slow.

Do you see an appreciable compression increase with zstd-3 over lz4?

I do. But LZ4 is basically free, while ZSTD uses quite a bit of CPU, depending on your CPU and how much throughput you push. I just keep everything at defaults, and a modern CPU is just fine… it doesn’t really matter much. I even have two NVMe pools with ZSTD (1x BTRFS, 1x ZFS) and it’s surprisingly not noticeable.

CPUs are damn fast today. It’s hard to hit the limits unless you want to compress multiple GB/s or set the levels very high (diminishing returns in compressratio, barely worth it for me).

Is your server on Linux?

I have been worrying recently that my whole server is on a single NVMe drive. Not only is it not backed up/redundant, it’s wasting a drive.

I’ve definitely noticed an uptick in CPU consumption on my Xeon E5 v4 after making the change on my home NAS, but I’ve got cycles to spare so it’s fine for my use case. OpenZFS 2.2 introduced early abort for zstd (with quite an interesting implementation), which was the main thing I was waiting for.


I was thinking of trying zstd-5

Is the early abort a parameter that needs setting?

No. 2.2.0 just adds an LZ4 trick to check whether a block is compressible or not, saving CPU time.
If you’re on 2.2.0, all ZSTD compression will use the LZ4 early-abort feature to speed things up.
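On Linux the feature is also exposed as ZFS module parameters in 2.2; I believe these are the names, but double-check on your build:

cat /sys/module/zfs/parameters/zstd_earlyabort_pass   # minimum zstd level that uses the LZ4 pre-pass
cat /sys/module/zfs/parameters/zstd_abort_size        # minimum block size (bytes) it applies to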

If CPU usage gets too high, just reduce the compression level. The change doesn’t apply retroactively, but decompression is way faster anyway, so that’s usually not a problem.

I’m running with defaults basically everywhere, and some archival stuff has ZSTD-9 and 11. I probably wouldn’t use higher compression than the default on NVMe because the CPU will become the bottleneck at some point, and I like the network to be the bottleneck.

How good is this early-abort feature?

Is it worth swapping lz4 to zstd even on an incompressible (media) drive?

Or upgrade my platform. :wink:

100%.

It’s at least as good as lz4, because that’s exactly what it uses as its first pass.

It’s effectively moot. If it won’t compress with lz4 it won’t compress with zstd either, for the reason I gave above. But if the pool is all incompressible files then I have to ask: why bother changing the compression setting at all? I would probably err on the side of maximum pool compatibility in that case, but that’s just my 2¢.
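If you want to sanity-check how (in)compressible the files actually are, zstd’s built-in benchmark mode gives a quick estimate on a sample file (the path is just an example):

zstd -b3 /NAS/media/sample.mkv

That prints the achieved ratio plus compression/decompression speed for that level.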

Hmm

Seems my Ubuntu server is still on 2.1… I assumed it was 2.2, so I will probably hold off on the switch for now.

zfs-2.1.5-1ubuntu6~22.04.2
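For anyone wanting to check their own install, this prints both the userland and the kernel module versions:

zfs version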

If you want to tune your compression, lzbench is a nice utility for easily benchmarking a number of algorithms or compression levels on files/folders from the particular dataset. For example, my vdev bottleneck turns into a CPU bottleneck between zstd-12 and zstd-13, so I’ve set the dataset to zstd-10 so the CPU never caps out during large transfers. It makes accessing the server over xrdp more comfortable; the screen won’t freeze whenever something needs to be written to disk.
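A minimal run looks something like this; -e takes a slash-separated list of compressors, each with optional comma-separated levels (the file path is hypothetical):

lzbench -elz4/zstd,1,3,5,9 /NAS/Fast/sample.pak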


After all the time I spent years ago reading up on ZFS, following all the advice on FreeNAS’s forum, I now realize I’ve been doing it wrong: I should have cranked the compression up instead of leaving the CPU twiddling its thumbs (I’m sure Intel chips do have thumbs in them). I mean, it’s been working just fine, but man, do I love this forum. Thanks!

Some months later… Is zstd restricted in how many cores it can use? I am unable to get lzbench to tax my CPU (max 10% on a Ryzen 5600).

lzbench is dynamically linked, so it should use your distro’s zstd. I hope that’s a multithreaded build in most modern distros, but I guess it might not be? You can statically link lzbench to make sure it spawns as many threads as you like; there are instructions for that on its GitHub page.


GitHub - inikep/lzbench: an in-memory benchmark of open-source LZ77/LZSS/LZMA compressors
https://github.com/inikep/lzbench

?

I can’t see anything about threads. If it IS using a single thread… does that mean ZFS is also using a single thread?

I have recently moved to Debian… I don’t remember this being an issue on Ubuntu…

ZFS compression is multithreaded. It doesn’t compress files, it compresses blocks, so parallel compute is easy to do. Just enable zstd and your CPU will go to 100%, unless there isn’t really any work to do, e.g. a low compression level or few blocks to compress. Modern CPUs are rather fast; for most things you won’t even notice the CPU working.
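You can watch this happen: the compression runs inside the write-issue threads of the ZFS I/O pipeline, which show up on Linux as kernel threads named z_wr_iss. During a big copy, something like

top -H -b -n 1 | grep z_wr_iss

should show a pile of them burning CPU.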


I can see lots of z_wr_iss threads when copying to zstd-9 on my ZFS pool, so presumably it is multithreaded and working correctly. It’s not at 100% though, and it seems to bounce around a lot, up to around 80%.

I cannot get lzbench to multithread (I assume that’s the issue, as I get similar results running zstd manually), even when building the library with lib-mt.

I can get consistently high CPU usage with the following command:

zstd -T12 -b9 -e9 /NAS/Fast/steamArch.img
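(-b9 -e9 restricts zstd’s built-in benchmark to level 9 only, and -T12 lets it use 12 worker threads, which is why this loads the CPU where lzbench’s single-threaded calls don’t.)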