I think if it were as big a problem as all that, Linux wouldn’t be the operating system of choice for massively parallel processors, nor would it consistently lead Windows in multithreaded performance scaling.
There is always old stuff in the code…it might still be necessary, but it could also be obsolete or only relevant for legacy hardware. I’m not a kernel hacker, but I know my scaling across cores is pretty much linear on every system I own.
Of all the major OSes, Linux has proven to keep up with any core count…even FreeBSD had some hiccups, and Windows has always been troublesome in this respect and even put licensing paywalls on higher core counts.
Which may or may not be a problem. It sounds like that’s about scheduling more processes on the same core, which gets less relevant as core counts increase.
And catchy titles make for good clickbait. So this title makes sense…although it’s not necessarily constructive.
Recent AMD desktop CPUs have 32 threads; recent AMD server CPUs have 128 threads each, and servers often have multiple physical CPUs.
Desktop, yes; for servers, the author may want to update his data.
The official comments in the code say it scales with log2(1 + cores), but it doesn’t (see the sketch after this list).
All the comments in the code are incorrect.
Official documentation and man pages are incorrect.
Every blog article, Stack Overflow answer, and guide ever published about the scheduler is incorrect.
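To make the mismatch concrete, here is a standalone paraphrase of get_update_sysctl_factor() from kernel/sched/fair.c (simplified and from memory, so check your own kernel version; the real code uses min_t(), num_online_cpus(), and ilog2()). The comment in that file only advertises "6ms * (1 + ilog(ncpus))"; the clamp to 8 is what the article is pointing at:

```c
#include <stdio.h>

/* Mirrors SCHED_TUNABLESCALING_{NONE,LINEAR,LOG} from the kernel. */
enum tunable_scaling { SCALING_NONE, SCALING_LINEAR, SCALING_LOG };

/* Integer log2, standing in for the kernel's ilog2(). */
static unsigned int ilog2u(unsigned int n)
{
	unsigned int l = 0;

	while (n >>= 1)
		l++;
	return l;
}

/*
 * Paraphrase of get_update_sysctl_factor(): the online CPU count is
 * clamped to 8 *before* the log is taken, so with the default LOG
 * scaling the factor never exceeds 1 + ilog2(8) = 4.
 */
static unsigned int sysctl_factor(unsigned int online_cpus,
				  enum tunable_scaling mode)
{
	unsigned int cpus = online_cpus < 8 ? online_cpus : 8;

	switch (mode) {
	case SCALING_NONE:
		return 1;
	case SCALING_LINEAR:
		return cpus;
	case SCALING_LOG:
	default:
		return 1 + ilog2u(cpus);
	}
}

int main(void)
{
	unsigned int cpus;

	for (cpus = 1; cpus <= 64; cpus *= 2)
		printf("%2u cpus -> factor %u\n", cpus,
		       sysctl_factor(cpus, SCALING_LOG));
	return 0;
}
```

Running it shows the factor climbing 1, 2, 3, 4 and then staying at 4 from 8 CPUs onward.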
I’m not sure why the author suggests this was added by accident. 8U in this case is just a fancy(ier) way of writing 8. If they had dug further, they would have found this context on the Linux kernel mailing list:
"Based on Peter Zijlstras patch suggestion this enables recalculation of the scheduler tunables in response of a change in the number of cpus. It also adds a max of eight cpus that are considered in that scaling."
The first version of the patch even had a bug where it used max instead of min (which was quickly pointed out); the sketch below shows why that one character matters.
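To see why the max/min mixup matters, compare the two clamps side by side (a tiny standalone sketch with hypothetical helper names, reusing the log2 scaling from above):

```c
#include <stdio.h>

/* Integer log2, standing in for the kernel's ilog2(). */
static unsigned int ilog2u(unsigned int n)
{
	unsigned int l = 0;

	while (n >>= 1)
		l++;
	return l;
}

int main(void)
{
	unsigned int cpus;

	puts("cpus | 1+ilog2(min(cpus,8)) | 1+ilog2(max(cpus,8))");
	for (cpus = 1; cpus <= 128; cpus *= 2) {
		unsigned int with_min = cpus < 8 ? cpus : 8; /* intended cap */
		unsigned int with_max = cpus > 8 ? cpus : 8; /* the draft bug */

		printf("%4u | %20u | %20u\n", cpus,
		       1 + ilog2u(with_min), 1 + ilog2u(with_max));
	}
	return 0;
}
```

With max, small machines would get the big-machine factor of 4 and large machines would keep scaling without bound, which is the opposite of the intended cap at 8.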
Surprise, surprise! A hard-locked scaling limit would have been obvious even 20 years ago. The only non-linear scaling I’ve ever noticed was on the hardware side of things, and benchmarks tell me exactly that.
The problem we are having is applications that don’t scale…I even remember a video where Wendell had to start multiple instances to use all the threads the CPU and Linux provided.
AMD and Intel invest a whole lot into kernel development…they don’t want bad numbers or bottlenecks for their new flagships.
Not only that, Peter Zijlstra (the main person behind the Linux scheduler) works for Intel. I’m sure that if it couldn’t scale beyond 8 cores, it would be fixed very quickly.
Isn’t there a mix-up between kernel threads being limited to 8, as opposed to user threads, where we can all agree Linux easily uses all available CPUs/threads (as shown in some of the responses here)?
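A quick way to convince yourself that user threads aren’t capped at 8 is to spin up one busy thread per online CPU and watch htop; a minimal sketch (a hypothetical demo, not anything from the article; build with `cc spin.c -pthread`):

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* One busy-loop per online CPU; htop should show every core pegged. */
static void *spin(void *arg)
{
	volatile unsigned long long x = 0;

	(void)arg;
	for (;;)
		x++; /* busy work the compiler can't optimize away */
	return NULL;
}

int main(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN); /* online CPUs */
	pthread_t *threads = calloc((size_t)n, sizeof(*threads));
	long i;

	if (n < 1 || !threads)
		return 1;

	printf("spawning %ld spinner threads; stop with Ctrl-C\n", n);
	for (i = 0; i < n; i++)
		pthread_create(&threads[i], NULL, spin, NULL);
	for (i = 0; i < n; i++) /* never returns; runs until killed */
		pthread_join(&threads[i], NULL);
	return 0;
}
```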
Okay, looking at the HN comments, I get the impression that the scaling tops out at 8 cores by design: beyond that point, the workload doesn’t need to be sliced any finer, because the resources aren’t as contended and there is more room for more tasks, so the system doesn’t have to keep preempting individual tasks to fairly free up threads?
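If I’ve understood that right, the cap only affects how far the latency/granularity tunables stretch, not how many cores get used. A back-of-the-envelope illustration (standalone C; the 6 ms and 0.75 ms base values are the defaults quoted in the kernel comments, though newer kernels moved these knobs to debugfs):

```c
#include <stdio.h>

/* Integer log2, standing in for the kernel's ilog2(). */
static unsigned int ilog2u(unsigned int n)
{
	unsigned int l = 0;

	while (n >>= 1)
		l++;
	return l;
}

int main(void)
{
	/* Base (normalized) defaults from the kernel comments, in ns. */
	const unsigned long long base_latency_ns     = 6000000ULL; /* 6 ms    */
	const unsigned long long base_granularity_ns =  750000ULL; /* 0.75 ms */
	unsigned int cpus;

	printf("cpus  sched_latency_ns  sched_min_granularity_ns\n");
	for (cpus = 1; cpus <= 64; cpus *= 2) {
		unsigned int capped = cpus < 8 ? cpus : 8;
		unsigned int factor = 1 + ilog2u(capped);

		/* From 8 CPUs on, the factor stays at 4: preemption
		 * windows stop growing, but every CPU still gets used. */
		printf("%4u  %16llu  %24llu\n", cpus,
		       base_latency_ns * factor,
		       base_granularity_ns * factor);
	}
	return 0;
}
```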
At work, we are running servers with 4 sockets that each have 26 CPU cores with hyperthreading. That gives us 4 × 26 × 2 = 208 CPU threads, which makes looking at htop, umm, difficult on a small screen. However, it definitely reports us maxing out all the threads under our heaviest loads. We often have 40 individual users on a server, each running their own builds. Oh, and these systems have 3 TB of DDR4 RAM.