That’s very strange. I would almost suggest submitting that as a bug (if it hasn’t been submitted already…).
I usually let mine run in the background, which is probably why I never noticed the lack of hyperthreading use. I looked a bit and someone suggested running multiple instances in parallel, but I feel like it should be handled by the package maintainer. Hm. A project for another day, I suppose.
Lolno. I spoke too soon. You won’t believe it, but again, without changing anything, Plex transcoding is now “usable” and better than before, yet still awfully slow: it takes up to a minute to start watching, and I have to wait for the transcoder to catch up every minute or so.
ffmpeg now gives me 15fps:
My only remaining thought is that, when you get higher fps, the data is cached in memory. Ffmpeg will happily work with something in memory after it has recently been moved or copied.
That said, the only way to really test it would be to sync the data after moving it (if you are moving it), unmount the drive, remount it, and then try ffmpeg. You will get slower speeds here.
Then copy the data and try again. You will get significantly higher speeds here. I would compare those numbers to what you’re seeing. If that’s the case, you just have a bottleneck somewhere, possibly the source drive (it is a rotational drive, after all).
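If you want numbers for the drive alone, without ffmpeg in the way, something like this rough Python sketch could time a plain sequential read of the file. Run it once right after the remount (cold) and once again right away (warm, likely served from memory). The function name and the 4 MiB chunk size are my own picks, nothing ffmpeg or Plex specific.

```python
import time


def read_throughput(path, chunk_size=4 * 1024 * 1024):
    """Read the whole file sequentially; return (bytes_read, MB_per_second).

    chunk_size is an arbitrary choice for this sketch, not a tuned value.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS page cache still applies,
    # which is exactly the cold vs. warm difference being measured.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, rate


# Example use (the path is a placeholder):
# size, rate = read_throughput("/path/to/video.mkv")
# print(f"{size} bytes at {rate:.1f} MB/s")
```

If the cold number is already far above what the video bitrate needs, the drive is probably not the bottleneck.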
The file stayed in a single location the whole time. And I don’t believe that a new 7.2k rpm drive can’t read fast enough for 24 fps when it sustained 200 MB/s reads a few days ago while I was copying data from it to my PC.
Now I am pretty sure it’s a memory-related issue. I’ve had problems with ECC correctable errors before, but they looked like a bug to me: I swapped several RAM modules, and memtest86 gave me 0 errors after roughly 24 hours of testing. The errors kept appearing in syslog anyway, so I changed the memory mode to mirroring and lived with it, thinking I was safe because there were no more system hangs (I actually have 32 GB of RAM).
Now I’ve found that the first OVERFLOW area:DRAM err_code:0001:0092 error in syslog after a reboot and the transcoding slowdown happen pretty close together in time, so the transcoding likely slows down because of memory error correction.
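To line the errors up against the time the slowdown starts, a small Python sketch like this could count matching syslog lines per hour. It assumes the classic syslog timestamp at the start of each line ("Mon DD HH:MM:SS"), so the first 9 characters give the month, day, and hour. The function name and the default search string are just my choices for illustration.

```python
from collections import Counter


def count_matches_by_hour(lines, needle="area:DRAM"):
    """Count lines containing `needle`, grouped by the 'Mon DD HH' prefix.

    Assumes the traditional syslog timestamp layout at the start of each line.
    """
    counts = Counter()
    for line in lines:
        if needle in line:
            counts[line[:9]] += 1
    return counts


# Example use (reading the log usually needs root or membership in group adm):
# with open("/var/log/syslog", errors="replace") as f:
#     for hour, n in count_matches_by_hour(f).items():
#         print(hour, n)
```

Comparing the first hour that shows up after a reboot with the time transcoding starts to crawl should show whether the two really go together.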
Lol, does anyone have any ideas how to fix this error without changing RAM, CPUs and motherboard?
Google gave me several results about a similar-looking bug with Supermicro and Dell motherboards on RedHat/CentOS. I have Debian 9.5 and an Asus Z9PE-D16. I bought it completely unused and sealed, but it is still 5-6 years old, so it’s unlikely to be broken, but obviously there is no warranty. It is also pretty expensive, and I am not really willing to spend another 15-20k rub ($220-290) on the same or an equivalent motherboard without any confidence in success.
The errors show up a lot when the system is idling, so it might be bit flips. I am pretty frustrated right now.
I’ve got a total of 48 gigs lying around and swapped out the reported DIMMs, but the same issue still gets reported. This MB does not allow tweaking any voltages; I can only go down to the 1066 preset. I’ll try that.
The modules are Hynix HMT351V7BMR4C.
Official response from Dell in one of the threads:
The reason this occurs (and why blacklisting EDAC “resolves” the problem) is due to a bug in the EDAC module present in all major Linux distributions at this time. EDAC does not communicate properly with the Intel Node Manager on the latest generation Intel processors; this causes false error reporting whenever a variety of status triggers are met. Any time the processor increases or decreases clock speed or voltage to meet demands due to different loads, any thermal sensor check, HT being turned on or off, or several other things of this nature will cause this to occur.
We need to disable the edac modules to stop it from attempting to take over the hardware management features of the Lifecycle Controller and the BMC
Ok. 1066 freq, NUMA on, blacklisted edac modules.
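To double check that the blacklist actually took effect after the reboot, a quick Python sketch like this could list any loaded kernel modules with "edac" in the name. It parses /proc/modules, where the module name is the first field on each line. Matching on the substring "edac" is my assumption about how the relevant modules are named, and the function name is made up for this example.

```python
def loaded_edac_modules(proc_modules_text):
    """Return names of loaded modules whose name contains 'edac'.

    Expects the text of /proc/modules (module name is the first field).
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and "edac" in fields[0].lower():
            names.append(fields[0])
    return names


# Example use:
# with open("/proc/modules") as f:
#     print(loaded_edac_modules(f.read()) or "no EDAC modules loaded")
```

An empty result means nothing EDAC-related is loaded, so any remaining slowdown would have to come from somewhere else.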
2 hours of idling, everything looks good, transcoding works great. I’ll wait and watch, hopefully this is it. And hopefully the system won’t crash randomly, proving EDAC was right about the errors. I really hope this is over after all my “fixing”. ROFL in tears.
NUMA had been on since the beginning; I turned it off and it changed nothing. I don’t think you can really fix anything with the NUMA setting, the only thing you can do with it is turn it OFF.
Either the frequency or EDAC solved the issue. Blacklisting EDAC was reported to fix this for several people and was suggested by Dell. According to info online, EDAC tries to take control of the memory controller and some other functions of the former northbridge, now integrated into the CPU, and fails at it on Sandy Bridge. This was reported back when these CPUs were current, and it looks like it still hasn’t been fixed.