ZFS vs EXT

I'm still testing… going to reconfigure the vdev layout at least one more time.

Think I will change it to 7 mirror vdevs for better IOPS.
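The back-of-the-envelope reasoning, for anyone curious (the disk count, per-disk IOPS, and the "one wide vdev" comparison below are all assumptions for illustration, not numbers from my pool): random IOPS scale roughly with the number of top-level vdevs, so splitting the same disks into more mirror vdevs buys more random IOPS than one wide vdev does.

/* Back-of-the-envelope IOPS for mirror vdevs.
 * Everything here is an assumption for illustration: 14 disks,
 * ~100 random IOPS per spinning disk, 2-way mirrors, and random
 * IOPS scaling roughly with the number of top-level vdevs. */
#include <stdio.h>
int
main(void)
{
    const int disks = 14;          /* assumed disk count */
    const int iops_per_disk = 100; /* assumed random IOPS per 7200 RPM disk */

    /* One wide vdev (e.g. a single raidz): random write IOPS is roughly
     * that of a single disk, because every write touches the whole vdev. */
    int one_vdev_write = 1 * iops_per_disk;

    /* Seven 2-way mirrors: writes spread across 7 independent vdevs,
     * and reads can be served by either side of each mirror. */
    int mirror_vdevs = disks / 2;
    int mirror_write = mirror_vdevs * iops_per_disk;
    int mirror_read  = disks * iops_per_disk;

    printf("1 wide vdev, 14 disks : ~%d random write IOPS\n", one_vdev_write);
    printf("7 mirror vdevs        : ~%d random write IOPS, ~%d random read IOPS\n",
           mirror_write, mirror_read);
    return 0;
}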

Okay, excellent, that’s what I thought people were talking about, but I’ve never come across anyone who expanded on the subject. The way I’ve heard some people talk about it sounded like they were talking about something much more complicated.

If you’re testing, you really need to put a significant amount of data in that pool, especially if using deduplication.

Because until you have the data in the pool, you won't have a hash table of comparable size in RAM, and any performance comparisons you do against an empty pool are irrelevant.

Also, ideally, data that is representative of what you will be putting in it.

Because (for an extreme example) 4 TB of zeroed data (best case) will produce one block hash (I think).
4 TB of random data (worst case) will produce many, many more hashes and thus a much bigger in-memory hash table.

Your data is likely somewhere between those extremes, and how deduplication-friendly it is will determine how much RAM (and CPU?) the hash table uses, which will in turn impact performance.
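To put a very rough number on the worst case: the ballpark figure people usually quote is something like 320 bytes of dedup table per unique block. Treat that, plus the 128K recordsize and 4 TB size below, as assumptions to make the arithmetic concrete, not gospel:

/* Back-of-the-envelope dedup table (DDT) size estimate.
 * Assumptions: ~320 bytes of DDT per unique block (the commonly quoted
 * ballpark, not an exact figure), 128 KiB recordsize, and 4 TiB of data
 * that dedups poorly (every block unique). */
#include <stdio.h>
int
main(void)
{
    const double data_bytes      = 4.0 * 1024 * 1024 * 1024 * 1024; /* 4 TiB */
    const double record_bytes    = 128.0 * 1024;                    /* 128 KiB recordsize */
    const double ddt_entry_bytes = 320.0;                           /* assumed per-entry cost */

    double unique_blocks = data_bytes / record_bytes;   /* worst case: all unique */
    double ddt_bytes     = unique_blocks * ddt_entry_bytes;

    printf("unique blocks : %.0f\n", unique_blocks);
    printf("DDT size      : ~%.1f GiB\n", ddt_bytes / (1024 * 1024 * 1024));
    /* Best case (all-zero data): roughly one unique block, so the DDT is tiny. */
    return 0;
}

So a pool that dedups badly can eat on the order of 10 GB of RAM just for the dedup table, before the ARC caches any actual data.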

ZFS and F2FS are the future. Running Fedora Silverblue 29 with F2FS on all partitions, inside an LVM2 volume group on my 1 TB Mushkin SSD. Performance has never been better. Also running F2FS on my five-year-old OnePlus One, and it keeps it running buttery smooth. I'm planning on doing a ZFS Fedora Silverblue conversion soon, since I have so many random separate storage drives that I want to join together without having to use a hardware RAID controller.

But the same type of HDD, right? i.e. Hitachi 2TB, Y RPM, Model #X?

You got any good TL;DR pros of F2FS vs ext4 for SSDs, or resources to read or watch on it?

I've just recently got yet another NVMe SSD. But this time not as a boot drive, but for my home lab VM host, to put disks for VMs on instead of spinning rust (so kinda sorta a boot drive actually). I formatted it with ext4 for now, but it wouldn't be hard to move everything back to the HDD and reformat the SSD. Except F2FS didn't seem to be in any CentOS repo (sure, there are probably RPM files to download somewhere). So… I wanted it to just work, and since I had exactly zero clue about F2FS and googling it turned up more phones than PCs, I formatted it ext4. Because, you know… it's easygoing. Still kinda interested in why I would consider F2FS, though.

If set up correctly, ZFS will take into account the difference in storage tech. Theoretically you can pair an HDD and an SSD together as one large storage platform, using both storage devices dynamically for their strengths. Frequently used files will move to the faster storage media, and less used files will be moved to the spinning rust. It's the magic of ZFS :sunglasses:

F2FS is faster across the board than EXT4 on any flash device; it's been proven time and time again in benchmarks across the internet, and it's only getting more efficient as time goes on. At this point it's a nicely matured filesystem, now offering every utility EXT4 has and more.

F2FS also increases the longevity of your flash storage: it takes each used block of data into account (hence the large cache partition) and writes new data onto the least used blocks first, essentially using your SSD's flash storage evenly, so the entire drive wears evenly, much more like an HDD. EXT4 treats your SSD like an HDD and just stuffs all the data into the first section of the drive; eventually you end up wearing out the first section of the SSD while leaving the rest practically unused. Because of all these optimizations F2FS benefits from huge read and write speed gains over EXT4, most notably on writes.

I think you may be confusing something, or maybe I'm getting confused about what you wrote, but last I checked ZFS currently does NOT have the general-purpose tiered storage system most people think of when they want such things.

ZFS has

  • Regular data: your storage pools.
  • RAM cache: the ARC. ZFS will use up to about half your RAM by default, but doesn't actually need to. There's also L2ARC, but don't fucking use that.
  • Logs: the SLOG, a (hopefully mirrored) pair of small SSDs that ZFS writes its ZIL data to instead of the regular rust pool, which can help speed things up in certain circumstances.

There is currently no mechanism (or, to my knowledge, even one in real development) that will promote/demote frequently used data between a rust pool and an SSD pool. You need something else to handle that right now.

Also, don’t mix rust and SSD’s in the same pool.

As perhaps stated above somewhere, people need to take ZFS defaults with a pinch of salt, as the filesystem was intended to run on a dedicated file server box on enterprise hardware. The defaults reflect that: serve files as fast as reliably possible with the resources available, on the assumption that the box's single purpose is to act as a file server.

If you’re not doing that (and running OTHER stuff on the same box), sometimes the defaults don’t make sense.

Definitely. I would say I don't recommend going lower than 4 GB. People can, and have, but to my understanding it's not a use case that's really tested against, and I've seen reports of occasional issues below that. Of course, larger pools will still need more RAM for good performance.

Another thing, when adjusting the maximum ARC size, is to also check the ARC size over time, to make sure it doesn't repeatedly fill up and then suddenly clear itself. I'm pretty sure that issue has been resolved, though.
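If you're on ZFS on Linux, a quick way to watch it is to read /proc/spl/kstat/zfs/arcstats (that path is a Linux-specific assumption; FreeBSD and illumos expose the same counters through sysctl/kstat instead). Something like this prints the current size against the configured ceiling:

/* Tiny sketch of watching ARC size on ZFS on Linux, which exposes ARC
 * statistics at /proc/spl/kstat/zfs/arcstats as "name type value" lines.
 * The path and field names are assumptions for that platform. */
#include <stdio.h>
#include <string.h>
int
main(void)
{
    FILE *f = fopen("/proc/spl/kstat/zfs/arcstats", "r");
    if (!f) {
        perror("arcstats");
        return 1;
    }

    char line[256], name[64];
    int type;
    unsigned long long value;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "%63s %d %llu", name, &type, &value) != 3)
            continue;
        /* "size" is the current ARC size, "c_max" the configured ceiling. */
        if (strcmp(name, "size") == 0 || strcmp(name, "c_max") == 0)
            printf("%-6s %llu MiB\n", name, value / (1024 * 1024));
    }
    fclose(f);
    return 0;
}

Run it every few seconds (or just use arc_summary, if you have it installed) and watch whether size keeps climbing to c_max and then collapsing.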

I did say that a little confusingly; what I really meant is that the L2ARC cache is what lets you do that. If you dedicate the SSD to L2ARC, the ZFS pool's frequently used data gets stored there, hence using the SSD for what it's good at. The storage pool goes on the spinning rust, of course.

[You have three solutions depending on the use case.

  1. If your disks are going to store a lot of data that isn't accessed often and doesn't need special read or write speed, and if you need room for programs that will fit on the SSD, you can use your SSD separately from your zpool and format it simply as ext4.
  2. If you need high read performance on your ZFS pool, you should use at least part of the SSD as L2ARC cache. The ZFS filesystem caches data in RAM first (the ARC), and can use an SSD to store a level 2 (L2ARC) cache. It will contain the files used most frequently.
  3. If instead you need fast write performance on your ZFS pool, you can use your SSD as a SLOG, which will cache the write requests before they are sequentially written to the spinning disks. However, it is good practice to use a mirror of SSDs to store the SLOG, as losing the SSD may corrupt recently written data.

In the first and third cases, beware of the reliability issues linked to the potential failure of the ssd.](https://askubuntu.com/questions/743941/how-to-use-a-small-ssd-with-zfs)

It’s worth clarifying some of the details here:

  1. The L2ARC is not persistent; it must be repopulated after each boot through a normal usage cycle (your “most frequently used files” aren’t remembered across reboots)
  2. Using L2ARC has additional memory overhead (every block cached on the L2ARC needs a header tracked in RAM), reducing the amount of data that can fit in RAM, which can actually reduce performance if the working set no longer fits
  3. A SLOG does not improve write performance in general; it specifically helps synchronous writes, where the application explicitly blocks until the data is flushed to disk (not common for performance-sensitive applications). See the sketch below.
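To make point 3 concrete, here's a minimal sketch of what a synchronous write looks like from the application's side (the path and data are made up for illustration). It's the fsync() call, or opening with O_SYNC/O_DSYNC, that forces a ZIL commit, and that ZIL commit is the only thing a SLOG accelerates:

/* Minimal sketch of a synchronous write: the data is only guaranteed
 * durable once fsync() returns. A plain buffered write()+close() without
 * fsync() is asynchronous and goes through the normal transaction group
 * commit instead. "/tank/db/journal" is just an illustrative path. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int
main(void)
{
    int fd = open("/tank/db/journal", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    const char record[] = "commit 42\n";
    if (write(fd, record, sizeof record - 1) == -1) {
        perror("write");
        return 1;
    }

    /* Blocks until the data is on stable storage -- on ZFS this forces a
     * ZIL commit, which is what lands on the SLOG device if you have one. */
    if (fsync(fd) == -1) {
        perror("fsync");
        return 1;
    }

    close(fd);
    return 0;
}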

awww DOOOOOOOOD fekkin zfs da best. Wat rong vit u?!?!?!?!?!??!?

These three things are exactly what I was going to type up later, thanks for saving me the effort!

L2ARC sounds cool, but if you aren't an enterprise user with certain enterprise workloads, you're more likely to see a loss in performance with L2ARC. Regular users are better off getting more RAM and/or using those SSDs as their own regular pool.

There is almost no benefit to deduplicating on a per-block basis though, other than creating fragmented files, which will read slower and require more resources just to read, since you have to put the file back together in a sense; and that's if you even qualify for the slight decrease in space used on a per-block basis.

I'm saying that if you want to install ZFS from nothing, you can't get checksums in that format for verifying the data, because nobody publishes them in that format; you get MD5, SHA-1, SHA-512, whatever: one hash for the one ISO.

ZFS also needs a bunch of RAM just to run fast at all, because, you know, it's doing a bunch of extra BS, so you have to implement your own disk caching on top of an operating system that already does disk caching; because, you know, your filesystem is slow/low performance, so you cheat: just use the RAM and say the operation is done when it's not even close to being on the disk yet.

You have a serious misunderstanding about the point of the checksums. ZFS is a Merkle tree; the checksums ensure the integrity of the storage. They tell you whether the data the disk reads back is different from what you wrote, which happens for a lot of reasons, from bit rot to cabling issues to buggy drive firmware. ZFS uses the checksum (which is stored in a different block on the disk) to make sure the data it gets back was not corrupted since it was written. The checksum is calculated by ZFS itself when you give it data to write.

Here’s a simple example:

#include <unistd.h>
#include <fcntl.h>
int
main(void)
{
    /* O_CREAT requires the mode argument (0644 here). */
    int fd = open("/foo/bar.txt", O_WRONLY|O_CREAT|O_TRUNC, 0644);
    write(fd, "foobar", 6);
    close(fd);
    return 0;
}

ZFS will create an entry for the file and calculate a checksum using the configured algorithm. No manual checksumming is required.

Now imagine a firmware bug in the disk causes it to send the wrong data next time the block storing this file is read. Here is a sample program again:

#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
int
main(void)
{
    char buf[7] = { 0 };
    int fd = open("/foo/bar.txt", O_RDONLY);
    /* If the block's checksum doesn't match what ZFS recorded (and no
     * redundant copy is available to repair it), the read fails with EIO
     * instead of silently handing back bad data. */
    if (read(fd, buf, 6) == -1) {
        puts("read failed");
        return 1;
    }
    printf("%s\n", buf);
    return 0;
}

On ext4 and most other filesystems, this program would happily print the wrong data, for example “asdfgh”, while on ZFS the read() would fail with an I/O error (assuming no redundant copy is available to repair the block), so it would tell you “read failed”.

I wonder if the fact that you're using high-end hardware is why you've seen the L2ARC trip things up. If you've got consumer-grade hardware, the L2ARC will basically never do anything except make performance better, generally by leaps and bounds. I've seen a single 7200 RPM disk used as an L2ARC, and it made performance on the machine worlds better.

I tend to work with a lot of consumer-grade hardware (mainly because I have a hate-on for “enterprise” hardware). I've never seen an L2ARC do anything short of amazing things for a system.

I wonder how you measured that.

fio is a favorite of mine. But in some cases, it’s just a night-and-day difference in how a VM behaves.