Brand new 14700K on an MSI Z790 Tomahawk: what to do? (Confused)

Guys, I tried following this circus but I’m at a loss atm, feeling kind of confused as to what the best course of action is for running this chip without damaging it.

This is a Linux workstation that does a lot of build and compile jobs, so the aim is stability and speed (within reason).

What am I supposed to do with this so that it doesn’t fry? MSI has a latest stable BIOS with microcode 0x129. Is flashing this sufficient? Can I just do this, reset to defaults, then enable XMP? The number of articles and the official confusion from Intel themselves, as well as from established reviewers, is kind of making me want to just dump it and grab a Ryzen.

Please advise

Updating the BIOS is all you should need to do. Intel has extended the warranty on all 13th and 14th gen CPUs that have been affected, so even if it does fail you should be good.

Yeah def keep your BIOS updated for latest microcode patches (which may continue to come out).
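To confirm from Linux that the flash actually took, you can read the microcode revision the kernel reports in /proc/cpuinfo. Here is a minimal sketch: the function names and the 0x129 threshold are my own choices for illustration (0x129 is just the version mentioned in this thread), and it assumes revision numbers for this CPU only ever go up.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo text."""
    pattern = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)
    return {int(m.group(1), 16) for m in pattern.finditer(cpuinfo_text)}


def is_patched(cpuinfo_text, minimum=0x129):
    """True if every logical CPU reports at least `minimum` (assumed threshold)."""
    revs = microcode_revisions(cpuinfo_text)
    return bool(revs) and min(revs) >= minimum


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        text = cpuinfo.read_text()
        revs = sorted(hex(r) for r in microcode_revisions(text))
        print("microcode revisions:", revs or "none reported")
        print("at or above 0x129:", is_patched(text))
```

If this still shows an older revision after flashing, the BIOS update probably did not apply, so check the version shown in the BIOS setup screen as well.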

If you haven’t booted yet, pretty sure your mobo supports BIOS flashback (MSI calls it the Flash BIOS Button), which updates the BIOS without fully booting. Usually you format a USB drive as FAT32, rename the BIOS file to MSI.ROM or whatever the manual says, then push the magic button.

In theory the new Intel Recommended Defaults should no longer cause degradation due to overvoltage. So yeah, just set up XMP for your RAM and you should be fine.

Some folks seem to undervolt just a touch as well, as honestly the chip runs hot (my old one would cap out at 100 deg C and throttle regularly with a 240mm AIO).

When doing bursty compiles in Linux, e.g. make -j$(nproc), a degraded chip can lock up or throw compiler segfaults. Compiling llama.cpp would fail almost a quarter of the time on my degraded chip; oddly not while all cores were at 100%, but when most cores were idle and a couple cores would boost up in the middle of the compilation…
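Since the failures described above are intermittent, one build passing proves little. A rough way to put a number on it is to repeat the same build many times and count the non-zero exits. This is only a sketch: count_failures is a made-up helper name, and the build command is whatever you normally run.

```python
import subprocess
import sys


def count_failures(cmd, runs, cwd=None):
    """Run `cmd` `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the build command.
    runs = 20  # arbitrary repeat count, adjust to taste
    failed = count_failures(sys.argv[1:], runs)
    print(f"{failed} of {runs} runs failed")
```

For a real test the command has to rebuild from scratch each time, for example by passing sh -c 'make clean && make -j$(nproc)' as the command. A healthy chip should give 0 failures on a build that is known to be good; any random failures that come and go between runs are worth following up in the kernel log.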

If something seems fishy, e.g. code that used to compile no longer compiles, check your kernel logs with sudo dmesg -T:

[Sat Jul  6 18:41:44 2024] mce: [Hardware Error]: CPU 9: Machine Check: 0 Bank 0: 8000004000050005
[Sat Jul  6 18:41:44 2024] mce: [Hardware Error]: TSC 578e290d131
[Sat Jul  6 18:41:44 2024] mce: [Hardware Error]: PROCESSOR 0:b0671 TIME 1720305704 SOCKET 0 APIC 21 microcode 123
[Sat Jul  6 18:41:44 2024] mce: [Hardware Error]: Machine check events logged

You might also see errors like these while running LLM inference:

[Fri Jul 26 12:44:08 2024] llama-server[13422]: segfault at 55 ip 00007bc3ad0b7d55 sp 00007ffc1ab1ffa0 error 4 in libc.so.6[7bc3ad038000+16c000] likely on CPU 4 (core 8, socket 0)
[Fri Jul 26 12:44:08 2024] Code: e8 00 0d f9 ff f3 0f 1e fa 48 85 ff 0f 84 d3 00 00 00 55 48 89 e5 41 55 4c 8d 6f f0 41 54 53 48 83 ec 18 48 8b 1d bb df 13 00 <48> 8b 47 f8 64 44 8b 23 a8 02 75 5f 48 8b 15 38 df 13 00 64 48 83
[Fri Jul 26 12:44:18 2024] llama-server[13495]: segfault at 55 ip 000076091928ad55 sp 00007ffdef195540 error 4 in libc.so.6[76091920b000+16c000] likely on CPU 31 (core 47, socket 0)
[Fri Jul 26 12:44:18 2024] Code: e8 00 0d f9 ff f3 0f 1e fa 48 85 ff 0f 84 d3 00 00 00 55 48 89 e5 41 55 4c 8d 6f f0 41 54 53 48 83 ec 18 48 8b 1d bb df 13 00 <48> 8b 47 f8 64 44 8b 23 a8 02 75 5f 48 8b 15 38 df 13 00 64 48 83

This would be a canary in the coal mine for degradation. As xyz says though, they should RMA under extended warranty if the new microcode doesn’t fix it. Time will tell!
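If you would rather not eyeball the log, the two kinds of lines shown above (machine checks and segfaults) can be tallied per CPU with a small script. This is a sketch that matches only the exact line shapes quoted in this post; the regexes and the summarize name are my own, and other kernel versions may format these messages differently.

```python
import re
import sys
from collections import Counter

# Patterns modelled on the dmesg lines quoted above (assumed format).
MCE_RE = re.compile(r"mce: \[Hardware Error\]: CPU (\d+): Machine Check")
SEGV_RE = re.compile(r"(\S+)\[\d+\]: segfault at .* likely on CPU (\d+)")


def summarize(log_text):
    """Count machine-check and segfault lines per logical CPU."""
    mce_cpus = Counter()
    segv_cpus = Counter()
    for line in log_text.splitlines():
        match = MCE_RE.search(line)
        if match:
            mce_cpus[int(match.group(1))] += 1
            continue
        match = SEGV_RE.search(line)
        if match:
            segv_cpus[int(match.group(2))] += 1
    return mce_cpus, segv_cpus


if __name__ == "__main__" and len(sys.argv) == 2:
    # Expects a file saved beforehand, e.g. sudo dmesg -T > dmesg.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        mce, segv = summarize(handle.read())
    print("machine checks per CPU:", dict(mce))
    print("segfaults per CPU:", dict(segv))
```

A single segfault can easily be a plain software bug, so the useful signal is machine-check lines, or segfaults spread across unrelated programs. Also note that the "microcode 123" field in the machine-check line above is, as far as I know, printed in hex without the 0x prefix, which would make it 0x123, older than 0x129.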

Enjoy the blazing fast low core count performance though!

Thank you for your answers. Will do.

Brace yourselves, new BIOS w/ microcode 0x12B is coming to address the CPU over-voltaging itself during idle/light activity periods!