ZFS arc_prune using 100% CPU. Why is ZFS caching so badly on RAID0?

Hi. I’m using an 8 TB RAID0 pool on ZFS. The pool is shared by multiple users in a heavy I/O scenario (long, often simultaneous project compilations). I’m hitting disk performance bottlenecks, and checking htop I see very high CPU usage from arc_prune. A quick search suggests this is ZFS caching overhead.

Now I wonder whether ZFS is the right tool for this workload. It doesn’t seem to make sense for ZFS to spend so much time on caching, and I wonder if ZFS is better suited to redundancy and disk-failure scenarios than to the raw performance we want from a RAID0 pool.

Asking for advice 🙂

Hard to know for sure without some more data. Is the system under memory pressure?

You could try limiting caching to just metadata for your dataset with zfs set primarycache=metadata, though if you rely on the cache for your performance, this might just substitute one problem for another. Can you check /proc/spl/kstat/zfs/arcstats to see if ZFS is bumping up against any size limits?
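Something like this is what I’d look at first (a rough sketch, assuming OpenZFS on Linux; the exact field names differ a little between OpenZFS versions, and tank/builds is just a placeholder dataset name):

```sh
# ARC size vs. its ceiling, plus metadata accounting. If "size" is pinned
# at "c_max", or metadata is at its limit, arc_prune will be working hard.
grep -E '^(size|c_max|arc_meta_used|arc_meta_limit) ' /proc/spl/kstat/zfs/arcstats

# If metadata pressure is the culprit, cache only metadata for the busy
# dataset (tank/builds is a placeholder):
sudo zfs set primarycache=metadata tank/builds
```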


Since you are using a stripe, you can also switch off the other safety features and tune ZFS for pure performance: disable sync if your applications issue sync calls, disable atime, increase the dirty-data buffering, and get the recordsize right (rough sketch after the link below).

https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html
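Not a definitive recipe, just a sketch of those knobs, assuming OpenZFS on Linux and a placeholder dataset name tank/builds; benchmark each change against your compile workload, and note that sync=disabled trades crash safety for speed:

```sh
# Placeholder dataset name; substitute your own.
DS=tank/builds

# Sync writes return immediately instead of waiting on stable storage.
# On power loss you can lose the last few seconds of writes.
sudo zfs set sync=disabled $DS

# Skip access-time updates on every read.
sudo zfs set atime=off $DS

# Compile trees are full of small files, so a smaller recordsize than the
# 128K default is worth benchmarking; 32K here is only a starting guess.
sudo zfs set recordsize=32K $DS

# Let more dirty data accumulate before ZFS forces a writeout (module
# parameter, in bytes; 4 GiB here, size it to your RAM).
echo $((4 * 1024**3)) | sudo tee /sys/module/zfs/parameters/zfs_dirty_data_max
```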
