Chips so advanced, they're unpredictable

Weird title.

As the article starts:

Computer chips have advanced to the point that they’re no longer reliable: they’ve become “mercurial,” as Google puts it, and may not perform their calculations in a predictable manner.

Not something I ever had at the front of my mind, but yeah, this seems kinda scary.

Behold, the heisenbug.

http://www.catb.org/~esr/jargon/html/H/heisenbug.html

3 Likes

Fuzzy Logic?

:hole:

Yeah, this is nothing new. When you pile more and more functionality into a CPU (new instruction sets, pipeline optimizations, registers, funky ways to access memory and so on), you're bound to hit a workload that whoever designed the architecture never predicted, one that leads to more frequent errors or incredible slowdowns.

Since the article is talking about “computer CPUs” I’m inclined to think these misbehaviours are happening mostly on x86. It’s a bloated architecture with tons and tons of instruction set extensions, and its CISC instructions compound the complexity: a single instruction can read, modify and write data, which increases the possibility of data corruption from unpredictable instruction flows.

This makes me think even more that RISC is the way forward, and it’s unfortunate that Apple is leading the charge alone for now.

3 Likes

I don’t think it’s that they are “so advanced” but more that the transistors are “so small”.

That it only shows up in certain cores, in certain CPUs among thousands, says to me that these are hardware faults. The chips may pass validation and work to spec, but when loaded a certain way or paired with a certain system, a flaw in a few transistors (or even one) among the billions in a core doesn’t behave as expected.

It is worth remembering that we are nearing atomic-scale transistors with current materials; that leaves little room for error or tolerance of flaws in the silicon.

As frequencies get higher, power budgets get tighter, and instructions per cycle get packed in even more, it’s not surprising that unforeseeable structural flaws in the silicon will cause some odd results.

2 Likes

ARM has its own unique issues. Some OEMs have custom-built chips with multiple cores running at different speeds, and those suffer random faults you’ll find in crash logs. Several ARM-based chips have failed in a way similar to the well-known Intel Atom/Avoton server chips, where certain usage patterns managed to burn out a core… one ARM chipmaker I won’t name dedicates two cores to its hardware encryption.

1 Like

I’d say those are growing pains that will be ironed out over time. Let’s not forget that ARM is 20 years younger than x86, so it has that gap to make up while the market’s needs keep growing (privacy, high levels of security, optimization, AI-driven branch predictors and so on).

I’m more concerned that some OEMs’ practice of dedicating cores to encryption will lead to early failures, or to security flaws that could be exploited in some abnormal way, similar to how Intel’s SGX enclaves turned into a bag of hurt.

1 Like

That’s absolutely a valid point, but it doesn’t really speak to the architecture itself; it highlights the lengths some manufacturers go to just to tick a box on the spec sheet (mostly).
Anyway, I don’t want to derail the conversation any more, so feel free to DM me if you want this part of the conversation to go on.