Looking for an introduction to Linux storage


#1

I’ve built a testbed for playing with storage (60GB SSD OS drive, 3x1TB 7200 RPM hard drives), but I’m really not familiar with what options are out there. I have built a RAID 5 array with mdadm and LVM and did some informal benchmarking (write speeds seemed really slow, about 20MB/sec sequential). However, I simply used ext4 on top of these arrays.

I know there are other filesystems I can use (XFS, ZFS, btrfs), but it is unclear to me whether they are simply layered on top of LVM or mdadm like ext4, or whether they have their own built-in RAID functionality. Is there a document somewhere that explains the different software RAID and filesystem implementations?


#2

Flagging @sgtawesomesauce as this seems like their territory.


#3

Hi, check out ZFS, my favorite filesystem for storage. It’s so awesome. ZFS and btrfs are a newer generation of filesystems with a built-in volume manager, so they do not need LVM. Here is a link to read more about ZFS: http://www.allanjude.com/bsd/zfs.html


#4

@habbis Thanks for the link. However, it seems geared to BSD. Casual googling indicates that Fedora may not be the best distro to use ZFS with, since it’s not supported natively. Are there any Linux distros with out-of-the-box support for ZFS?


#5

Ubuntu has support from the official repos. I think all you need to do is install a package.

The Arch ecosystem also has support, but that’s not for the faint of heart.


#6

And keep in mind that ZFS loves RAM. For a production system, recommendations usually say 2GB of RAM plus an additional 1GB per TB of storage.

For a lab situation that’s not a major concern, though. Performance might suffer a bit.
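
As a rough sketch, that rule of thumb works out like this in Python. The function name and defaults are made up for illustration, and the numbers are only the commonly quoted guideline, not a hard requirement:

```python
def zfs_ram_estimate_gb(storage_tb, base_gb=2.0, gb_per_tb=1.0):
    """Rule-of-thumb RAM estimate: a base amount plus a fixed amount per TB of storage.

    Hypothetical helper; the defaults encode the "2GB + 1GB per TB" guideline.
    """
    if storage_tb < 0:
        raise ValueError("storage_tb must be non-negative")
    return base_gb + gb_per_tb * storage_tb


if __name__ == "__main__":
    # The 3 x 1TB testbed from the first post.
    print(zfs_ram_estimate_gb(3))
```

For the 3x1TB testbed this suggests about 5GB of RAM, which most lab machines can manage.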


#7

While not a bad filesystem, don’t use RAID5/6 with it. Just… don’t.


#8

Did you wait for the md device to finish its build/sync before you did your write test?


#9

@nx2l Actually, I left it to build overnight and ran the benchmark the next day. LVM and mdadm were both similarly slow.


#10

There are other reasons to stay away from it. Snapshots/subvols exponentially degrade performance, there are occasional breaking changes and in general, it’s unstable.


#11

That’s odd.
On CentOS using mdadm on 4 WD Green 3TB drives in RAID5, I’d see 310MB/s writes and 400MB/s reads (sequential).


#12

@nx2l Read speeds were OK, 220MB/sec or so, but write speeds were only 15-20MB/sec according to the GNOME Disk Utility. There was one other test I ran, whose name I can’t remember, that showed similar numbers. Any thoughts on what the problem might be? Keep in mind I’m using 3 disks, not 4.


#13

Disks all the same model?
Check smart data on them?
Watch iostat while testing?


#14

@nx2l All three drives are 1TB 7200 RPM drives, but only 2 are identical Western Digital models. The third odd man out is a Hitachi drive. It shows 2 reallocated sectors, but still passes the SMART self-test. Casual googling finds that the WDs average about 120MB/s whereas the Hitachi is only rated at about 80MB/s.

Looking at iostat shows nothing unusual to my eyes, but maybe I’m missing something.

Linux 4.17.12-200.fc28.x86_64 (raidtest) 	08/27/2018 	_x86_64_	(4 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.89    0.01    6.83   10.07    0.00   82.21
Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              16.64       319.57       313.97     672518     660732
sdb             276.19     67292.41     16851.52  141611495   35462675
sdc             276.05     67292.19     16853.74  141611020   35467355
sdd             273.42     67292.43     16853.62  141611540   35467087
dm-0             17.93       310.76       312.32     653965     657248
dm-1              0.07         1.58         0.00       3320          0
dm-2              0.45         3.71         1.91       7800       4024
dm-3              4.67         0.03        13.57         59      28551
dm-4          20946.90     67292.08     16844.74  141610800   35448400
dm-5              4.66         0.00        13.57          0      28551
dm-6          20947.40     67291.91     16846.96  141610432   35453080
dm-7              4.66         0.00        13.57          0      28551
dm-8          20947.46     67292.17     16846.83  141610984   35452816
dm-9            546.14     19056.59    556821.35   40103076 1171785980

EDIT: Looking at top, I see two processes mdX_raid5 and mdX_resync using about 20% and 10% of CPU respectively. Is this indicative of the system still trying to build the array?

I should also note that I’m using SATA II ports for the spinning drives on this older system.
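
In case it helps anyone reading the table, here is a minimal Python sketch that pulls the per-device read/write rates out of iostat output like the above. The function name is hypothetical, and the sample is just the four physical-disk rows copied from the output; it assumes the plain (non -x) column layout shown here:

```python
def parse_iostat_devices(text):
    """Parse a plain `iostat` device table into {device: (kB_read/s, kB_wrtn/s)}.

    Hypothetical helper; assumes the column order
    Device, tps, kB_read/s, kB_wrtn/s, kB_read, kB_wrtn.
    """
    rates = {}
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            in_table = True  # rows after this header are device rows
            continue
        if in_table and len(fields) >= 6:
            rates[fields[0]] = (float(fields[2]), float(fields[3]))
    return rates


# Rows copied from the iostat output above (physical disks only).
SAMPLE = """\
Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              16.64       319.57       313.97     672518     660732
sdb             276.19     67292.41     16851.52  141611495   35462675
sdc             276.05     67292.19     16853.74  141611020   35467355
sdd             273.42     67292.43     16853.62  141611540   35467087
"""

if __name__ == "__main__":
    for dev, (read_kb, write_kb) in parse_iostat_devices(SAMPLE).items():
        print(f"{dev}: read {read_kb / 1024:.1f} MiB/s, write {write_kb / 1024:.1f} MiB/s")
```

One caveat: iostat run without an interval reports averages since boot, so these figures mix the array build with the benchmark runs.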


#15

Usually, yeah.

Have you checked the mdstat file?

Also, it is generally not recommended to mix different drive models in the same array.


#16

Just trying to run an experiment without breaking the bank. I have a wide variety of drives 2TB and less, but very few are alike. Since I already have the two identical WD drives I may go ahead and get a 3rd identical model.


#17

Try cat /proc/mdstat and see if it shows anything insightful.
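
If you want to check that from a script, something like this Python sketch would do. The function name is made up, and the sample text is an illustrative mdstat-style listing, not output from your machine; on a real system you would pass in the contents of /proc/mdstat instead:

```python
import re

# Matches the progress line md prints while an array is syncing or rebuilding.
PROGRESS_RE = re.compile(r"(resync|recovery|reshape|check)\s*=\s*([\d.]+)%")


def sync_status(mdstat_text):
    """Return {array: (operation, percent)} for arrays with a sync operation in progress."""
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)  # remember which array the next lines belong to
            continue
        progress = PROGRESS_RE.search(line)
        if progress and current:
            status[current] = (progress.group(1), float(progress.group(2)))
    return status


# Illustrative sample only (made-up numbers).
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd[3] sdc[1] sdb[0]
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [=>...................]  recovery =  8.5% (83253760/976630272) finish=120.3min speed=123456K/sec

unused devices: <none>
"""

if __name__ == "__main__":
    print(sync_status(SAMPLE))
    # On a real system: print(sync_status(open("/proc/mdstat").read()))
```

An empty result means no array is reporting a resync, recovery, reshape or check at that moment.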

There are also the --assume-clean and --wait-clean flags, which might be interesting in your case.

You can also try storing the bitmap/journal (or whatever you’re using) on flash on your OS drive in order to speed up writes. For sequential writes in your case, there’s no reason you shouldn’t be able to get between 150 and 200 MB/s.
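
To show roughly where a figure like that comes from, here is a back-of-the-envelope sketch in Python. It assumes an idealised model (full-stripe sequential writes, every disk paced by the slowest one, no controller or parity overhead) and uses the approximate per-disk speeds quoted earlier in the thread; the function name is hypothetical:

```python
def raid5_seq_write_ceiling(disk_mb_s):
    """Idealised sequential-write ceiling for RAID5, in MB/s.

    Assumes full-stripe writes: each stripe carries (n - 1) data chunks,
    and the array can only go as fast as its slowest member.
    """
    if len(disk_mb_s) < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (len(disk_mb_s) - 1) * min(disk_mb_s)


if __name__ == "__main__":
    # Two WD drives at ~120MB/s and one Hitachi at ~80MB/s (rough figures from the thread).
    print(raid5_seq_write_ceiling([120, 120, 80]))
```

Real numbers will land below this ceiling, but 16-20MB/s is far enough off that something else is going on.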

Also, atop will show you per-block-device stats, and you can use iostat -x to glean some performance data as well.


#18

So I tore down my mdadm RAID5 array and rebuilt it in lvm, then waited for the build to complete. Same results - about 200MB/s read and 16MB/s write. At this point I got a bit frustrated.

So I broke out an old Dell SAS 6/iR controller I had from an old Dell tower. In the Dell tower, it only functioned as an HBA, but to my surprise when I put it in my ancient Asus motherboard it had RAID 0 and 1 functionality. So I mirrored the 2 WD drives and set up the Hitachi as a hot spare, and proceeded to benchmark.

That nets me 280MB/s sequential read speed and 150MB/s write speed. Much better, though now my 3TB of storage is reduced to 1TB. And the SAS 6/iR can only handle drives 2TB or less. But at least I didn’t have to wait hours for the array to build.
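
For anyone following along, the capacity arithmetic can be sketched in Python like this. It is a simplified model that assumes equal-sized disks and a single mirror set for RAID 1; the function name and parameters are made up for illustration:

```python
def usable_tb(level, disk_tb, disks, spares=0):
    """Usable capacity in TB for equal-sized disks (simplified model)."""
    active = disks - spares  # hot spares hold no data until a failure
    if level == "raid0":
        if active < 1:
            raise ValueError("RAID 0 needs at least 1 active disk")
        return active * disk_tb
    if level == "raid1":
        if active < 2:
            raise ValueError("RAID 1 needs at least 2 active disks")
        return disk_tb  # every active disk holds the same data
    if level == "raid5":
        if active < 3:
            raise ValueError("RAID 5 needs at least 3 active disks")
        return (active - 1) * disk_tb  # one disk's worth of space goes to parity
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    print(usable_tb("raid5", 1, 3))            # software RAID 5 across all three disks
    print(usable_tb("raid1", 1, 3, spares=1))  # mirror of two disks plus a hot spare
```

So the 3TB of raw disk gives 2TB usable as RAID 5, versus 1TB as a two-disk mirror with a hot spare.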

Is hardware RAID always this much better?


#19

Hardware RAID has various disadvantages, though. Wendell did a video on it a while ago comparing it to md, LVM and btrfs.



#20

In terms of performance, I found software RAID on Linux to be pretty darn fast.