2000% CPU usage, what is going on?

So I’m transcoding video with ffmpeg on an AWS instance, and I wanted to make sure I was using the CPU to the best of its ability.
The command I ran was:
for f in *.webm; do ffmpeg -i "$f" -strict -2 "${f:0: -4}mp4"; done;
and it does seem to be transcoding properly, but top is showing very strange numbers. Does anyone know why this would appear that way?
I also tried with -threads 16 and the results are identical.
Thanks,
Andrew
Screenshot from 2017-10-27 01-03-31

Note: each fully busy thread is shown as 100%, so this is OK.

For example, let's say you have 4 threads: 100% utilization of all 4 will be shown in top as 400%.

Try htop if plain top is too confusing.
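
The arithmetic is simply 100% per logical CPU. Here is a minimal sketch, assuming top is in its default per-core display (what its man page calls Irix mode). `max_cpu_percent` is a made-up helper name, not part of any tool:

```shell
# Ceiling for one process's %CPU column in top's default (Irix) mode:
# 100% for every logical CPU the process can keep busy.
max_cpu_percent() {
    # $1: number of logical CPUs; defaults to what getconf reports for this machine
    cores="${1:-$(getconf _NPROCESSORS_ONLN)}"
    echo $(( cores * 100 ))
}

max_cpu_percent 4    # prints 400
```

Calling `max_cpu_percent` with no argument gives the ceiling for the machine you are on. A per-process figure above that number would be worth investigating.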


It is also worth noting that you can pass the -H flag, or press H (Shift+h) while top is running, to show each thread on its own line instead of one total per process.
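
To see how many threads are behind a single %CPU figure, you can count them directly. This is a rough sketch that assumes a Linux system with /proc mounted. `thread_count` is a hypothetical helper name:

```shell
# Count the threads of a process (Linux only).
# Every thread of PID shows up as one directory under /proc/PID/task.
thread_count() {
    # $1: process ID
    ls "/proc/$1/task" | wc -l
}

# Example usage (assumes exactly one ffmpeg process is running and pgrep is installed):
#   thread_count "$(pgrep -n ffmpeg)"
#   top -H -p "$(pgrep -n ffmpeg)"    # per-thread view of that one process
```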


I’d also recommend htop and glances to help out.


For production systems, you’d typically want load-avg closer to 0.7 so as to give you some headroom. At 1.00 your system really hasn’t much room to keep up if any workload spikes.

I have some resque workers in AWS chewing at 24-28 load-avg on a m4.xlarge. Totally underpowered :crazy_face: - this also means that at a load-avg of say 28, that at least 27 processes were waiting for CPU time and the system was basically overloaded by 2700%.

Looking at your 6.45 load-avg for 1min means 645% overloaded, with 5.45 processes waiting for the CPU.

uh, that’s only true on a single core system.

load average is core utilization of each core all added together.

If you have 8 cores/threads, a load average of up to 8.0 means you’re not overloading the cpu.
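
As a rough rule of thumb only, you can divide the load average by the number of logical CPUs. Keep in mind that on Linux the load average also counts tasks stuck in uninterruptible I/O wait, so this is not pure CPU pressure. The helper names below are made up for illustration:

```shell
# Load average normalised by core count; around 1.00 means roughly
# one runnable (or I/O-blocked) task per logical CPU.
load_per_core() {
    # $1: load average, $2: number of logical CPUs
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Same calculation for the live system (Linux only): the first field of
# /proc/loadavg is the 1-minute load average.
current_load_per_core() {
    read -r one_min _ < /proc/loadavg
    load_per_core "$one_min" "$(getconf _NPROCESSORS_ONLN)"
}

load_per_core 28 4    # prints 7.00
```

The `28 4` example uses the figures mentioned in this thread: a load average of 28 on a 4-vCPU instance.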


not sure why you are telling me this, but ok.

not really,

loadavg of 28 could mean you have 28 things waiting to read a thing each off the filesystem, and your cpu could be basically idle…

it could also mean that 4 things are numbercrunching on 4 cores and 24 things are waiting to get scheduled

having a load average of 2000 on a 4 core system, … doesn’t necessarily mean the system is unresponsive (although, load avg of 2000 is like “wtf … what’s this code doing” kind of moment).

when it comes to aiming for 0.7 or 70% of load average in production, it wildly depends on what your production is about, … 70% * cores as a load number, even if all load is cpu load, also doesn’t mean that the CPU utilization is 70% (due to how statistics and queuing theory work, too complicated to explain here).


Well, this thread certainly has been very interesting. So far I’ve learned quite a bit, but does anyone know why the %Cpu load numbers are strange? I would have expected ffmpeg to put the overall CPU usage at 100%, or close to it. It always does on my Windows machine. I have seen per-process %CPU numbers over 200% before, and I always assumed that indicated, for example, 2 cores at 100% each. This 2106% seems to indicate over 20 cores at 100% each? I thought maybe something like turbo boost would cause this result, yet the overall usage in the top left is only 13%. Confusing.

Thanks mate - sorry, I should have touched on the multiple-threaded scenario. I was thinking of a single vcore/single thread VM (For whatever reason!!)


Oops - Yes, I was wrong!!

I have some resque workers in AWS chewing at 24-28 load-avg on a m4.xlarge. Totally underpowered :crazy_face: - this also means that at a load-avg of say 28, that at least 27 processes were waiting for CPU time and the system was basically overloaded by 2700%.

m4.xlarge has 4 vcores/threads, so a load-avg of 28 means 28/4 = 7, i.e. we are only 7x overloaded, with at least 6 processes per core waiting for CPU - right? Similarly, a 600% overload.


2000% on a less than 20 core system makes no sense.

Try passing -threads 32 to ffmpeg and look at whether the overall system idle number goes down.

Also, you have multiple files, so run multiple ffmpegs in parallel. Depending on the codec and video, a single ffmpeg may end up not scaling perfectly with the number of cores.
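
One way to do that is with `xargs -P`. This is a sketch rather than a tuned setup: the helper names are hypothetical, the default of 4 parallel jobs is just an assumption to experiment with, and `-0`/`-P` assume GNU or BSD xargs:

```shell
# Derive the output name: clip.webm -> clip.mp4
mp4_name() {
    echo "${1%.webm}.mp4"
}

# Transcode every .webm in the current directory, up to $1 files at a time.
transcode_all() {
    njobs="${1:-4}"
    for f in *.webm; do
        [ -e "$f" ] || continue    # skip the literal pattern if nothing matched
        printf '%s\0' "$f"         # NUL-separated so odd file names survive
    done | xargs -0 -P "$njobs" -I{} sh -c \
        'ffmpeg -nostdin -i "$1" -strict -2 "${1%.webm}.mp4"' _ {}
}

# Usage: transcode_all 4
```

The inner `sh -c` uses the same `${1%.webm}.mp4` expansion as `mp4_name`, and the ffmpeg options are the ones from the original command plus `-nostdin`, so the parallel jobs do not compete for the terminal.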

Also, try checking iostat and see what your average wait times are. It’s unlikely, but maybe you’re not using the CPU because you can’t read or write the videos fast enough.