Linux copy speed way higher than device can support #LIEnux

Playing around with a thumb drive.

Tried to copy a file from the hard drive to the usb stick. The stick’s best effort write speed is around 12-13 MB/s.

But when I start a copy, there is a burst of speed initially in the 300-900 MB/s range.

I know for a fact the device cannot do this (it’s a cheap, old flash stick).

Why is Linux lying?

And let me speculate a bit here:
It looks like that classic case where the OS is engaging in some kind of write caching behavior. On a removable device. By default. You get the idea. What happens when the transfer finishes and you yank the thumb drive out?

By the almighty power of google search, I found a few references to sudo hdparm -W 0 /dev/sdx. Except it reports “not supported”. Take that. Even if it worked, the question remains - how do you make that the default? Sigh I miss windows.

Message sent from Chrome on Windows.

Short version:
The kernel caches data in RAM first; this is generally considered bad practice as it may lead to data loss if the transfer is interrupted unexpectedly. I’ve never understood why it’s the default. It does increase performance, but most drives are fast enough these days that it’s not really necessary.

5 Likes

Why yes. That much is clear.

And

watch -n 1 "grep -e Dirty: -e Writeback: /proc/meminfo"

may be used to monitor this cache and watch it fill up and empty out.

Dirty is the total pending pool, Writeback is the chunk currently being written.
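
For example, forcing a flush by hand with sync and then re-checking should show both pools drain to (near) zero; neither command needs special privileges:

```shell
# Push all dirty pages out to their devices (sync blocks until done),
# then re-read the two pools; both should drop to (near) zero kB.
sync
grep -e Dirty: -e Writeback: /proc/meminfo
```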

…unfortunately, when I tested, even when the pool was depleted, Linux still complained that the device was in use and would not let me remove it. Ultimately I just yanked the device out of the USB port, physically and supposedly “unsafely”.

And I still cannot locate a method to disable this caching by default, preferably for only removable drives.

Having to dig into /proc/meminfo after the transfer has finished, and even then having the system refuse to relinquish the device, is a usability nightmare for a task as basic as plugging in a flash drive to copy some photos over.

1 Like

And some aren’t. If you don’t use some kind of buffering, latencies start to add up and the process requires far more work. If you want a seamless, fast and lag free read/write experience, you need some kind of buffer and DRAM is the most common denominator because nothing has lower latency or higher bandwidth. Even your drives have buffers.

It’s very much a cosmetic issue. If the file copy app weren’t so honest with the user and instead displayed more convenient, calming numbers, nobody would complain; if that’s what you want, just use a tool that displays more comfortable numbers.
If the OS sent 500 MB to the drive in the last second, then 500 MB/s for that second is correct. It very much comes down to how you measure things, and where.

I don’t agree on the KDE Dolphin file transfer numbers, I value other methods more. But they’re 100% correct in technical terms and display what they are supposed to.

We all like being lied to by software. A smooth and linearly increasing progress bar is what we want. Well, that’s not how hardware and computing usually work, so developers make things up for a better user experience.

If a program doesn’t do this and lacks “user convenience features”, it is perceived as lying, when it is in fact a more accurate representation of the computing process.

Linux isn’t lying. You’re just used to being lied to by other programs.

That is the fate of having volatile storage. You can only remove this risk by removing (volatile) memory entirely, which is not an option. And no one really thinks this is a problem; if it were, every PC would ship with a UPS and ECC, and sync writes would be the only option available to the OS.

We rely on keeping stuff in memory all the time. We made it work. Memory is a good thing, not something to avoid.

4 Likes

I’m going to guess that it’s a relic that no one “wants” to touch. Neither Windows nor *BSD has this behaviour by default, or at least nowhere near as aggressively as Linux, and they work fine (I’m not going into the debate about I/O performance in detail); it’s been fine for years, and still is in 2023.

As for your argument that buffered speed is what programs report, I’d say it’s rather the opposite for most applications, and that it’s deceptive to put it lightly. …and your perception of software in general seems a bit skewed…

The behaviour of “Linux” in this regard is strange and not common; a very good example of this weird behaviour and its consequences can be found here: mount - Disable write cache for exFAT partitions - Unix & Linux Stack Exchange . Yes, drives have write caches, but this adds on top, and it’s not like your target device is going to perform better if you have gigabytes of data queued (“cached”) in RAM.

1 Like

Hard disagree on that one. When the transfer program tells me it’s done and I pull out my drive and my data is missing, I don’t care who is lying to me! I was misled. Bamboozled. Deceived. Done dirty.

Excuse me, but this is not a “user convenience feature”.

It’s a “don’t break my fucking data” feature.

Perhaps we could agree that to be somewhat important, no?

Do you think I am the only one? Certainly not. Not by a longshot!

I am feeling sassy today. Imagine a few finger snaps when you read the above.
:crossed_fingers::point_left::crossed_fingers::point_right::crossed_fingers::point_left::crossed_fingers:

1 Like

This has always bothered me. Caching is necessary for performance, but the progress bar should never reflect data cached, only data written. There is no value in telling the user that data is reading fast in a copy operation, because the entire operation will be finished only when the full data is written.
That said, it’s not strange or uncommon. Windows 10 does this by default as well. I believe it doesn’t claim the transfer is finished and kill the dialog until it’s actually done, though, but I don’t use that shit anymore and don’t remember.

I don’t remember any good way to fix this, either. You can force it to write synchronously across the board, at the cost of performance. I think this might be a more complicated issue on Linux than just “the file managers are lying to me”, though. AFAIK every file manager operates this way, and I think it has something to do with the kernel reporting a write as complete as soon as the data is in memory, so that software can carry on quickly as though the transfer were done. Something like that would make sense to me.
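
The across-the-board lever, if you want it, is the kernel’s dirty-page budget. A sketch, with made-up values you’d want to tune for your own RAM and devices (these sysctls are global, not per-device):

```
# /etc/sysctl.d/99-small-writeback.conf  (illustrative values)
# Start background flushing once 16 MiB is dirty...
vm.dirty_background_bytes = 16777216
# ...and stall writers once 48 MiB is dirty.
vm.dirty_bytes = 50331648
```

Apply with sudo sysctl --system (or reboot). Setting the *_bytes knobs automatically zeroes their *_ratio counterparts, so only one pair is in effect at a time.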

Basically, it’s just a thing that should have been fixed a long time ago, but no one has come up with the right solution and gotten it patched in correctly, which happens a lot with the Linux ecosystem in general.

1 Like

Windows purposely disables write caching on removable media, which I think is a sane decision.
I think Linux should follow; the only reason I can think of for why this isn’t already done is that Linux’s recognition of media properties isn’t as robust as Windows’.

3 Likes

Internal and external hard disks connected over SATA, SAS, Thunderbolt, and FireWire (yes, I tested Firewire just for you!) present the following panel in Gnome’s disk manager:

The same panel is not available for USB flash drives or my NVMe disks (and I didn’t have any SATA SSDs to test).

Either of the following options, the stop icon in Disks or the eject icon in Nautilus, will safely eject the drive after finishing all pending writes:

Screenshot from 2023-05-06 19-11-05

You might try “Dconf Editor” for modifying the default behavior:

And if you aren’t using Gnome well… that’s your mistake :smile:

3 Likes

Or unmount on command line, or just “eject” in file browser (it’ll refuse if still writing)

It’s the same in Windows by default, with the icon in the taskbar?

Windows does allow you to change a drive to non-caching mode though
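
From a shell, a sketch of the unmount route (mountpoint made up; umount won’t return until pending writes are flushed):

```shell
# Unmount if mounted; umount blocks until all dirty data for the
# filesystem has been written out.
MP=/media/stick                      # illustrative mountpoint
if mountpoint -q "$MP"; then
    umount "$MP" && echo "safe to remove"
else
    echo "$MP is not mounted"
fi
```

udisksctl power-off -b /dev/sdX goes one step further and powers the port down, so the stick’s LED goes dark.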

1 Like

It absolutely DOES perform better. With lots of data cached in RAM, the OS can re-order the queued data to the most performant sequence of writes. Not to mention the cases where you copy a file and quickly change your mind and delete it. I certainly like that when I copy a file to my flash drive it just goes instantly and I can get on with things, even if it needs time to write to disk in the background.

If you’re mounting the drive, use the “flush” option, either from the command-line or in /etc/fstab.
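
For example (label and mountpoint are made up), a FAT stick can get flush as its default via fstab:

```
# /etc/fstab  (illustrative entry; "flush" is a vfat/fat-specific option)
LABEL=MYSTICK  /media/stick  vfat  noauto,user,flush  0  0
```

For a one-off, sudo mount -o flush /dev/sdb1 /media/stick does the same. Note flush is gentler than the generic sync option, which forces every write out immediately and chews through flash.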

If your drive is being automounted, it’s not “Linux” but your desktop environment doing it, so you’ll have to dig into its options.

Linux is very much different than Windows. If you’re going to blow a fuse over everything it does any differently at all, and demand that it be made to act just like Windows, you’re in for a very bad time. I bet you wouldn’t be happy switching over to Mac OS either.

The lsof command will help you figure out what is holding your drive open, e.g. lsof +f -- /media/stick lists processes with files open on that mount (the path is just an example).

4 Likes

I found these settings, but they’re greyed out to me.

And I’m not sure what I did, but I managed to crash my desktop environment - permanently.
Now it crashes on every login. (I’m in the middle of a long operation, so I need to wait for it to finish :frowning: )

If you’re going to blow a fuse over everything it does any differently at all, and demand that it be made to act just like Windows, you’re in for a very bad time.

There’s always that one person that doesn’t get the joke.

The lsof command will help you figure out what is holding your drive open.

@rcxb Tnx!

In GNOME at least, there is an eject button next to the drive in Nautilus. It’s easy and accessible IMO, although I also would prefer if Nautilus did not report the transfer as being complete until the data has been physically written to the drive.

I guess there’s nothing stopping the progress bars or whatever from reporting both bytes written and (bytes_written - bytes_dirty) somehow.

Either way, if you care about the data, always eject/unmount and wait for it to finish; never “just pull the drive” like a caveman.
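
And if you want the progress number itself to track the device instead of the page cache, GNU dd can already do that. A sketch, with a scratch file standing in for a real source and destination:

```shell
# conv=fsync makes dd flush the output to the device before exiting, so
# "done" means physically written; status=progress shows the rate
# converging on the device's true speed once the page cache fills.
src=$(mktemp); dst=$(mktemp)         # stand-ins for your file and stick
head -c 8M /dev/urandom > "$src"     # 8 MiB of test data
dd if="$src" of="$dst" bs=1M conv=fsync status=progress
cmp "$src" "$dst" && echo "copy verified"
rm -f "$src" "$dst"
```

There’s also oflag=direct, which bypasses the page cache entirely, so the rate shown is the device’s real rate from the first megabyte.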

Which is more caveman technology?

Having to babysit the drive and ask it nicely to disconnect?

Or being able to dispense with it as you please?

:thinking:

2 Likes

I see your point and I’ll raise you a “UGH physical storage media, YUCK!”


More seriously though, this way (with write-back caching) you can use the files at the destination before the data has been fully flushed to storage, be it a network mount or USB or whatever, and it’s especially useful for squeezing every last bit of performance out of these slow “non-local”/USB devices.

Some USB sticks have a blinking “activity LED” - there are probably widgets/extensions that keep some kind of UI indicator displayed while dirty data is being flushed to removable media (e.g. Chrome OS does this with its “do not unplug your device” indicator).

1 Like

Windows does the same, mate.
The first 500-700 MB burst, then it slows to 120-240 MB/s depending on the drive I’m writing to.

wait, Win does not lie when showing transfer stats…

Moving: 2mins…5mins…20secs…20hrs…done!

(as in, it’s never been reliable…)

3 Likes

If Windows has the drive marked as removable storage, it will not write-cache by default:

2 Likes

I prefer it the way Linux does it. Of course, I always end up tweaking disk caching to smooth out performance and responsiveness even more (the default is often as bad as Windows).