ASRock Rack W680D4ID-2T/G5/X550 C-states / ASPM problem

Hi, I recently got this board, but I cannot figure out how to get below pkg C2 (pc2).

My setup:

  • CPU: i9-14900K
  • RAM: two 48 GB ECC UDIMM sticks from Kingston
  • Motherboard: ASRock Rack W680D4ID-2T/G5/X550 (latest BIOS 2.09)
  • Storage:
    • 2 x M.2 (Kingston DC2000B 960GB)
    • 4 x SATA (Samsung SM863 1.96TB).
  • PSU: HDPlex 250W

I also have an Intel A380 GPU, but I pulled it out for now to simplify debugging.

What I tried:

  1. Enabled ASPM for basically everything, including “secret” Ctrl+Alt+F3 BIOS menus (Chipset -> SA, Chipset -> PCH and others).
  2. Enabled everything related to C-states (excluding some hidden options which I am scared to enable).
  3. Enabled Modern Standby (S0ix).
  4. Tried to enable D3Cold (D3 states start to work, but it breaks the one NVMe drive that is connected directly to the CPU).
  5. Set power limits 1/2 (PL1/PL2) to 125W.
  6. Installed 6.11 kernel.
  7. Used the latest powertop 2.15.

The CPU cores sit at C7 (cc7), but the package doesn’t go below C2 (pc2).
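
One way to check which package C-states were ever reached is the residency counters that the intel_pmc_core driver exposes in debugfs. The snippet below is a minimal sketch, not a definitive tool: it assumes the file prints one "Package Cn : <counter>" line per state in order of increasing depth, and the function name deepest_pkg_cstate is made up for illustration.

```shell
# Print the deepest package C-state whose residency counter is non-zero.
# Reads package_cstate_show-style text on stdin. The "Package Cn : N" line
# format is an assumption about the intel_pmc_core debugfs output.
deepest_pkg_cstate() {
    awk -F' *: *' '
        /^Package C[0-9]+/ && $2 + 0 > 0 { deepest = $1 }
        END { if (deepest != "") print deepest }
    '
}

# Usage (as root, with debugfs mounted):
#   deepest_pkg_cstate < /sys/kernel/debug/pmc_core/package_cstate_show
```

The counters are cumulative since boot, so a non-zero value only shows that the state was entered at some point, not that the system is spending meaningful time there.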

This is my PCI tree:

# lspci -tv
-[0000:00]-+-00.0  Intel Corporation Device a700
           +-02.0  Intel Corporation Raptor Lake-S GT1 [UHD Graphics 770]
           +-06.0-[01]----00.0  Kingston Technology Company, Inc. DC2000B NVMe SSD E18DC
           +-14.0  Intel Corporation Alder Lake-S PCH USB 3.2 Gen 2x2 XHCI Controller
           +-14.2  Intel Corporation Alder Lake-S PCH Shared SRAM
           +-15.0  Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #0
           +-16.0  Intel Corporation Alder Lake-S PCH HECI Controller #1
           +-17.0  Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
           +-1a.0-[02]----00.0  Kingston Technology Company, Inc. DC2000B NVMe SSD E18DC
           +-1b.0-[03]--
           +-1b.4-[04]--
           +-1c.0-[05]--+-00.0  Intel Corporation Ethernet Controller X550
           |            \-00.1  Intel Corporation Ethernet Controller X550
           +-1c.4-[06]--
           +-1c.7-[07-08]----00.0-[08]----00.0  ASPEED Technology, Inc. ASPEED Graphics Family
           +-1d.0-[09]--
           +-1f.0  Intel Corporation Device 7a88
           +-1f.3  Intel Corporation Alder Lake-S HD Audio Controller
           +-1f.4  Intel Corporation Alder Lake-S PCH SMBus Controller
           \-1f.5  Intel Corporation Alder Lake-S PCH SPI Controller

ASPM states:

# lspci -vv | awk '/ASPM/{print $0}' RS= | grep --color -P '(^[a-z0-9:.]+|ASPM )'
00:06.0 PCI bridge: Intel Corporation Raptor Lake PCIe 4.0 Graphics Port (rev 01) (prog-if 00 [Normal decode])
                LnkCap: Port #5, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <16us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1a.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #25 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #25, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1b.0 PCI bridge: Intel Corporation Device 7ac0 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #17, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1b.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #21, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1c.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #1 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #1, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1c.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #5 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #5, Speed 8GT/s, Width x1, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1c.7 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #8 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #8, Speed 8GT/s, Width x1, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
00:1d.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #9 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #9, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
01:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. DC2000B NVMe SSD E18DC (rev 01) (prog-if 02 [NVM Express])
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
02:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. DC2000B NVMe SSD E18DC (rev 01) (prog-if 02 [NVM Express])
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
05:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
05:00.1 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
07:00.0 PCI bridge: ASRock Incorporation Device 1150 (rev 06) (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <32us
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+

From what I understand, the BMC is connected to the PCH-S over a PCIe x1 Gen 3 link (port #8). For some reason it cannot use ASPM (BIOS bug? BIOS misconfiguration?), and that prevents the system from reaching deeper C-states. Or do I just not understand how it works?

I would probably just try disabling port #8 as a test, but I am not sure it won’t kill the board (since the BMC would no longer be able to connect to the motherboard).

EDIT 20.12.2024: After almost a month of research and experiments I can conclude that a modern server platform with at least an AST2600 BMC and an AST1150 PCIe-to-PCIe bridge has NO hardware limitation that prevents it from reaching the package C10 state.

In my case I achieved it with a single NVMe drive (Kingston DC2000B), 96 GB of ECC UDIMM RAM, a 14900K and an HDPlex 250W PSU, and got a drastic thermal improvement.

  • Intel X550 is now barely warm (45 Celsius instead of a really hot 60)
  • HDPlex is barely warm (it was pretty hot previously)
  • 14900K is chilling at 28 Celsius, which is crazy

The only thing that stayed the same is the SATA disks, but that is just because my enterprise disks don’t support DevSlp.

Unfortunately, with the second NVMe drive installed (the one connected directly to the CPU), the system is limited to the pkg C6 state for some reason. I can also connect the Intel A380 and still retain the C6 state.

This command:

echo -n powersave > /sys/module/pcie_aspm/parameters/policy

allowed L1 to be enabled for the AST1150, and now basically all my PCIe devices should support L1 (thanks to a user from this thread):

# lspci -vv | awk '/ASPM/{print $0}' RS= | grep --color -P '(^[a-z0-9:.]+|ASPM )'
00:06.0 PCI bridge: Intel Corporation Raptor Lake PCIe 4.0 Graphics Port (rev 01) (prog-if 00 [Normal decode])
                LnkCap: Port #5, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <16us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1a.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #25 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #25, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1b.0 PCI bridge: Intel Corporation Device 7ac0 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #17, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1c.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #1 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #1, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1c.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #5 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #5, Speed 8GT/s, Width x1, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1c.7 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #8 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #8, Speed 8GT/s, Width x1, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1d.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #9 (rev 11) (prog-if 00 [Normal decode])
                LnkCap: Port #9, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk-
01:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. DC2000B NVMe SSD E18DC (rev 01) (prog-if 02 [NVM Express])
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
02:00.0 Non-Volatile memory controller: Kingston Technology Company, Inc. DC2000B NVMe SSD E18DC (rev 01) (prog-if 02 [NVM Express])
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
06:00.0 PCI bridge: ASRock Incorporation Device 1150 (rev 06) (prog-if 00 [Normal decode])
                LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <32us
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+

But the system is still not able to go below pkg C2.
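
To confirm which ASPM policy is actually in effect after the write, the same sysfs file can be read back: the kernel lists all policies and marks the active one with square brackets. A minimal sketch (the helper name active_aspm_policy is made up for illustration):

```shell
# Extract the active policy (the bracketed word) from the sysfs policy string,
# for example "default performance [powersave] powersupersave".
active_aspm_policy() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Usage:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
```

A write to this file does not survive a reboot. As far as I know, the pcie_aspm.policy=powersave kernel command line parameter sets the same policy at boot.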

Tried this tonight:

  1. Boot without 1 SSD.
  2. Boot without 2 SSDs (from USB stick).
  3. Boot without 2 SSDs and 10GbE network (Intel X550, turned off its PCIe slot).

No success: PC2 is the deepest state it could achieve. I also discovered that if the board is powered off but plugged into the wall (so the BMC is running, but the server itself is not), the Intel X550 is still hot and working (WoL is off).

I am pretty sure I tried everything. I see a lot of CPU wakeups while the system does nothing. My guess is that the BMC somehow polls the platform too frequently, which causes unnecessary interrupts, but I cannot confirm it. I will wait for an answer from ASRock Support.

Another thing I tried: I disabled basically every PCIe port in the PCH settings, including the AST1150 PCIe-to-PCIe bridge (it broke iKVM, btw). Only basic PCIe devices such as the M.2 drives and the SATA controller remained, and every one of them supported ASPM out of the box.

Unfortunately it didn’t help. At this point I am sure there is some kind of BIOS bug. I will contact ASRock support.

I have made major progress: I achieved package C6 (pc6)! It seems ASRock has some kind of BIOS bug. I guess in my case it took a combination of a lot of BIOS options, but the key was to put the system into S0ix sleep. After that you can power the server off and on, and pkg C6 will keep working until you change the hardware configuration or enter the BIOS. In that case you need to repeat the sleep trick. I am not sure it will work with regular S3, since I enabled Modern Standby in my case.

With LTR ignored, package C10 is achieved, but only without one NVMe drive (the one connected directly to the CPU) and without the Intel A380. So there are problems with the platform, but they are probably fixable.

In my case I had to ignore LTR for PMC0:SOUTHPORT_G and PMC0:SATA. For SATA I guess this is because my drives don’t support DevSlp and probably Slumber (?). But I am not sure about this south port.

cat /sys/kernel/debug/pmc_core/ltr_show
0       PMC0:SOUTHPORT_A                        LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
1       PMC0:SOUTHPORT_B                        LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
2       PMC0:SATA                               LTR: RAW: 0x884a                Non-Snoop(ns): 0                Snoop(ns): 75776                LTR_IGNORE: 1
3       PMC0:GIGABIT_ETHERNET                   LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
4       PMC0:XHCI                               LTR: RAW: 0x936                 Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
5       PMC0:SOUTHPORT_F                        LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
6       PMC0:ME                                 LTR: RAW: 0x8000800             Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
7       PMC0:SATA1                              LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
8       PMC0:SOUTHPORT_C                        LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
9       PMC0:HD_AUDIO                           LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
10      PMC0:CNV                                LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
11      PMC0:LPSS                               LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
12      PMC0:SOUTHPORT_D                        LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
13      PMC0:SOUTHPORT_E                        LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
14      PMC0:SATA2                              LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
15      PMC0:ESPI                               LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
16      PMC0:SCC                                LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
17      PMC0:ISH                                LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
18      PMC0:UFSX2                              LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
19      PMC0:EMMC                               LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
20      PMC0:WIGIG                              LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
21      PMC0:THC0                               LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
22      PMC0:THC1                               LTR: RAW: 0x0                   Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
23      PMC0:SOUTHPORT_G                        LTR: RAW: 0x80008000            Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 1
24      PMC0:CURRENT_PLATFORM                   LTR: RAW: 0x40201               Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
25      PMC0:AGGREGATED_SYSTEM                  LTR: RAW: 0x7ffffff             Non-Snoop(ns): 0                Snoop(ns): 0                    LTR_IGNORE: 0
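
For anyone who wants to repeat the LTR ignore step: the same pmc_core debugfs directory has an ltr_ignore file that takes the index of the IP block to ignore. The sketch below looks the index up by name instead of hard-coding it. It assumes the number in the first column of ltr_show is the index that ltr_ignore expects, which is consistent with the output above but should be checked against your kernel version. The helper name ltr_index is made up for illustration.

```shell
# Look up the index (first column) of an IP block in ltr_show-style text.
# $1 = name exactly as printed in the second column, e.g. PMC0:SOUTHPORT_G
ltr_index() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Usage (as root); the write asks the PMC to ignore that block's LTR.
# Assumption: ltr_ignore takes the same index that ltr_show prints.
#   idx=$(ltr_index PMC0:SOUTHPORT_G < /sys/kernel/debug/pmc_core/ltr_show)
#   echo "$idx" > /sys/kernel/debug/pmc_core/ltr_ignore
```

The setting is not persistent, so it has to be reapplied after every boot.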

I found BIOS options that enable L1 substates for the CPU PCIe root ports:

  • “Enable ClockReq Messaging” – from what I understand it enables CLKREQ#
  • L1 Substates options become active and you can choose “L1.1 and L1.2”

Unfortunately, while Linux shows that both L1.1 and L1.2 are supported by the link, I don’t see them being used. In fact, I don’t see any of my bridges using L1 substates at all. Moreover, enabling those options reduces my deepest pkg C-state from C6 to C3.

Multi-VC was disabled in those cases.

I even tried to enable it by writing the register directly with setpci:

setpci -s 00:06.0 208.b=1f

Here 00:06.0 is the PCI address of the root port for my M.2 slot (what Linux calls a “PCI bridge”). Still, the kernel reported that the link uses L1 rather than L1.2.
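
To make the 1f value in the setpci write easier to read, here is a small decoder for the enable bits. It is only a sketch and rests on an assumption the post does not state: that offset 0x208 on this port is the L1 PM Substates Control 1 register, whose low four bits are the PCI-PM L1.2, PCI-PM L1.1, ASPM L1.2 and ASPM L1.1 enables. The function name l1ss_enables is made up for illustration.

```shell
# Decode the four enable bits in the low nibble of L1 PM Substates Control 1.
# $1 = register byte, e.g. 0x1f
# Assumption: the byte comes from the L1SS Control 1 register of the port.
l1ss_enables() {
    local v=$(( $1 ))
    local out=""
    if [ $(( v & 0x1 )) -ne 0 ]; then out="$out PCI-PM_L1.2"; fi
    if [ $(( v & 0x2 )) -ne 0 ]; then out="$out PCI-PM_L1.1"; fi
    if [ $(( v & 0x4 )) -ne 0 ]; then out="$out ASPM_L1.2"; fi
    if [ $(( v & 0x8 )) -ne 0 ]; then out="$out ASPM_L1.1"; fi
    echo "${out# }"
}

# Usage: l1ss_enables 0x1f
```

L1 substates generally have to be enabled on both ends of a link, with the LTR threshold programmed as well, so a write to the root port alone may not be enough to change what the kernel reports.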

At some point I thought pkg C10 required L1.1/L1.2, but after researching it more I found out that C10 had been achieved previously with all PCI bridges using plain L1 links. So the problem lies elsewhere.

I found out that SOUTHPORT_G is probably the ASPEED AST1150 PCIe-to-PCIe bridge, which for some reason is labeled “ASRock Incorporation Device 1150”.

It requires ignoring LTR after a fresh power-on, but after sleep (suspend or rtcwake -s 10 -m freeze) it can reach C10 even without ignoring LTR. What is more surprising is that this behavior persists across reboots, but it is lost after a complete power-off.

So I guess there is some bug related to the PCIe link between the PCH and the AST1150.

My observations at the moment:

  1. The AST1150 bug I described above probably limits my system to PC2 instead of PC6 when I use both NVMe drives.
  2. Some problem with the direct-to-CPU PCIe ports probably holds my system at PC2 and keeps it from reaching PC6/PC10.
  3. With a single NVMe drive connected to the PCH, and after sleep, my system can reach PC10 even without powertop --auto-tune at all, though PC10 residency will be low.
  4. With a single NVMe drive connected to the PCH, and before sleep, my system can reach only PC2.
  5. With a single NVMe drive connected to the PCH, and before sleep but with LTR ignored for SOUTHPORT_G, the system reaches PC10.

I suspect that there are two bugs:

  1. The firmware cannot set up a proper PCIe link between the AST1150 and the PCH after the first boot. Putting the platform into the S3 or S0ix state fixes the problem (maybe some runtime values in firmware change, idk), but a complete shutdown brings the bug back.
  2. The CPU PCIe ports limit the system to PC2 for some reason, even though Multi-VC is disabled for all ports in my case (which is what Intel suggests to fix this problem).

I wrote ASRock two messages 3 weeks ago, but still haven’t received any answer from them. I will try to write again with more findings.

I believe my troubleshooting could be similar for other ASRock Intel based boards, maybe even some of AMD ones.

At the end of the day I was wrong that the AST1150 is SOUTHPORT_G. By changing LTR settings one by one I was surprised to learn that it is actually the M.2_1 port (the NVMe connected to the PCH). I wasn’t able to disable LTR for this port, but fortunately I was able to set manual Snoop and Non Snoop latencies: 310 * 1024 ns and 500 * 1024 ns respectively. I am not sure whether it affects NVMe drive performance, but at least now, when I boot with a single drive, it ALWAYS reaches the PC10 state, even without the “sleep trick”.

Unfortunately, setting the same LTR latencies manually for the other PCIe slot, which is connected to the CPU (System Agent), didn’t help, so I continue digging.

I found a Samsung 970 Evo Plus at home, so I decided to test with it:

  • Samsung 970 Evo Plus sends the LTR value 0x09f409f4, which translates to:

    • There is no requirement to honor the LTR values the disk sends
    • Snoop and Non Snoop latencies are 500 * 1024 ns (basically just recommended values which the BIOS can use)
  • Kingston DC2000B sends the LTR value 0x80008000, which translates to:

    • There is a requirement to honor the LTR values the disk sends
    • Snoop and Non Snoop latencies are 0 ns. In LTR terms, zero latency means “never allow the PCIe link to enter power saving states (L1, etc.)”

So while the Kingston DC2000B supports ASPM itself, it sends LTR values which prevent ASPM from actually working. I contacted Kingston; maybe they will provide some options (reflashing with proper LTR values?).

Otherwise I would need to change drives.

How LTR messages are encoded:
[Image: LTR message encoding]
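
The decoding applied to the two drives above can be sketched in a few lines of shell. It uses the PCIe LTR layout as I understand it: in each 16-bit field, bits 9:0 are the latency value, bits 12:10 are the scale (latency = value * 2^(5*scale) ns, scales 6 and 7 are not permitted) and bit 15 is the requirement bit. The low half of the 32-bit raw value is taken as snoop and the high half as no-snoop, which matches the SATA row of ltr_show above. The function names are made up for illustration.

```shell
# Latency in ns encoded by one 16-bit LTR field:
# bits 9:0 = value, bits 12:10 = scale, latency = value * 2^(5*scale) ns.
ltr_field_ns() {
    local field=$(( $1 & 0xffff ))
    local value=$(( field & 0x3ff ))
    local scale=$(( (field >> 10) & 0x7 ))
    echo $(( value << (5 * scale) ))
}

# Requirement bit (bit 15) of one 16-bit LTR field: 1 = must be honored.
ltr_field_required() {
    echo $(( ($1 >> 15) & 1 ))
}

# Split a 32-bit raw value into snoop (low half) and no-snoop (high half).
ltr_decode() {
    local raw=$(( $1 ))
    local snoop=$(( raw & 0xffff ))
    local nosnoop=$(( (raw >> 16) & 0xffff ))
    echo "snoop=$(ltr_field_ns $snoop)ns req=$(ltr_field_required $snoop)" \
         "nosnoop=$(ltr_field_ns $nosnoop)ns req=$(ltr_field_required $nosnoop)"
}

# Usage:
#   ltr_decode 0x09f409f4   # Samsung 970 Evo Plus value from this thread
#   ltr_decode 0x80008000   # Kingston DC2000B value from this thread
```

For 0x09f4 the value is 500 and the scale is 2, which gives the 500 * 1024 ns mentioned above. For 0x8000 only the requirement bit is set, so the latency is 0 ns.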

I’ve hit the same PC2 minimum c-state problem with a consumer Arrow Lake board – Z890 Pro-A Wifi, and I’ve had zero success contacting ASRock technical support through their web form.

You seem to be much more knowledgeable than I in this area, and have gotten much farther than I have. Thank you very much for publishing your findings.

Alas I’m juggling a bunch of Christmas logistics at the moment, but I will follow up as soon as I can diff what you’ve done against the settings available on my board and try it out.

Unfortunately I didn’t get any response from ASRock either. But a lot of my friends did, mostly within 3-4 days. I don’t understand why, or what the pattern is. Maybe they just need some time to debug it, I don’t know, but after 4 weeks it would be nice to get something, at least “we are working on it”.

With a consumer board you will probably need to unlock the BIOS first. A lot of the options I use are in the “hidden menu”, which as far as I know is available only on ASRock Rack boards (Ctrl+Alt+F3 during boot, instead of Del/F10). SCEWIN should work for you.

I saw that a lot of users got results on ASRock boards by switching Low Power S0 Idle Capability (this is basically the S0ix Modern Standby switch, which is usually used on laptops, but I hope newer PCs will also use it at some point). Here is the original post: Reduce power consumption with powertop - Page 25 - User Customizations - Unraid

Also, if your CPU PCIe slots are populated, you need to disable Multi-VC for the Root Ports you populate (or just all of them): Reduce power consumption with powertop - Page 35 - User Customizations - Unraid

Maybe this will help someone. I use a 14900K with an HDPlex 250W PSU and noticed that the system randomly shuts down under load peaks. PL1 and PL2 were set to 65W and 125W respectively, but that wasn’t enough. Fortunately this board also allows setting PL3 and PL4. I hoped setting PL4 alone would be enough, so I set it to 160W, and the problem was completely solved.

Another note: as far as I understand if your PCIe device reports wrong LTR values and you want to override them, they should be bigger than L1.2 threshold value in nanoseconds (you can see it using lspci -vvvv).
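
As a quick sanity check for that rule, the override can be converted to nanoseconds and compared with the threshold. This is a minimal sketch with made-up function names. The threshold has to be read from your own lspci output (I believe it appears as LTR1.2_Threshold on the L1SubCtl1 line), and the 81920 in the usage comment is only a placeholder, not a value from this board.

```shell
# Convert an override given as "value * multiplier" (as entered in the BIOS)
# to nanoseconds. $1 = value, $2 = multiplier in ns (e.g. 1024)
ltr_override_ns() {
    echo $(( $1 * $2 ))
}

# Succeeds (exit status 0) if the override is above the L1.2 threshold.
# $1 = value, $2 = multiplier in ns, $3 = L1.2 threshold in ns
ltr_override_above_threshold() {
    [ "$(ltr_override_ns "$1" "$2")" -gt "$3" ]
}

# Usage (81920 is a placeholder threshold, substitute your own):
#   ltr_override_above_threshold 310 1024 81920 && echo "override is large enough"
```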

Sorry this is so delayed. I went down a different rabbit hole chasing mystery throttling.

Alas, the SCEWIN approach didn’t work. The output file it produced contained only language and boot order.

However, although ASRock didn’t reply to my support ticket or forum post (and took a week and a half to respond by email), somebody apparently saw it, because as of BIOS version 2.22.AS05, the CPU is no longer restricted to C2:

cpu0: MSR_PKG_CST_CONFIG_CONTROL: 0x74008008 (UNdemote-C1, demote-C1, locked, pkg-cstate-limit=8 (unlimited))
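
The turbostat line above is a decode of MSR 0xE2 (MSR_PKG_CST_CONFIG_CONTROL). A minimal sketch of the two fields that matter here, using the bit positions turbostat uses as far as I know: the package C-state limit in bits 3:0 and the lock bit in bit 15. The function names are made up, and what a given limit number means is model specific, so only the raw number is extracted.

```shell
# Decode two fields of MSR_PKG_CST_CONFIG_CONTROL (MSR 0xE2).
# $1 = raw MSR value, e.g. 0x74008008
pkg_cst_limit() {
    echo $(( $1 & 0xf ))           # bits 3:0, package C-state limit
}
pkg_cst_locked() {
    echo $(( ($1 >> 15) & 1 ))     # bit 15, 1 = firmware locked the register
}

# Usage, assuming msr-tools is installed and the msr module is loaded:
#   pkg_cst_limit "0x$(rdmsr 0xe2)"
```

When the lock bit is set, the limit cannot be changed from the OS until the next reset, so a restrictive value there has to be fixed in firmware, which is consistent with the BIOS update making the difference here.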

Thanks ASRock!

This is great, because I was very close to driving back to Microcenter and exchanging this board (and another $30) for one from a different manufacturer. Would be very big hassle, have to buy more thermal interface material, other vendor has worse I/O, etc.

So now the ball is in my court.

Findings so far:

  • With all the ASPM and ALPM BIOS settings at max, powertop --auto-tune gets it from 28W at the wall to 23W… Until powertop itself pokes the Realtek NIC, which I preemptively disconnected in favor of WiFi, because lspci says it doesn’t support ASPM.

  • The actually-effective powertop tweaks were the one that enabled SATA ALPM (-3 W), and the one that autosuspended the mouse and keyboard (-2 W). Which is unfortunate because it makes the desktop functionally unusable when your input devices take a second to start registering input when they’ve not been touched for ~10s.

  • acpi_osi="Windows 2015" in kernel command line doesn’t help.

  • pcie_aspm=force didn’t help, which is good because it’s scary.

  • Neither did BIOS-disabling the Realtek NIC, the NPU, the Intel VMD, or Integrated Thunderbolt.

  • The CPU still does not go below package C2 in Linux, but it does in Windows, only with the “power saver” power plan, which got to 20 W. The first time Windows went screen-off-idle in that condition, it hard-reset. Perhaps my ancient Antec Neo Eco PSU just doesn’t like loads that small.

  • The second time Windows went screen-off-idle, it got down to 17W. Maybe it set a chicken bit on something? Linux echo s2idle >/sys/power/mem_sleep; systemctl suspend goes down to 14 W and comes back up again fine.

  • Windows “balanced” is 28 W just like Fedora out-of-the-box, and does not go deeper than package C2.

  • I read through drivers/idle/intel_idle.c, and it looks like INTEL_ARROWLAKE is not specifically supported yet, so I’m getting the CPU idle states exposed by ACPI.

  • Enabling SAGV (System Agent Geyserville) in BIOS, using the parameters in the documentation for points 1, 2, and 3, and the 6000 MT/s XMP speed for point 4, gets it down to 20 W at the wall after powertop --auto-tune, a further 3W savings. C8? We don’t need no stinkin’ C8! =P …er, unless we want to have properly functioning keyboards, that is.

  • At 20W idle in Linux, I saw a random hard reset there too. Plugged in a couple case fans to add some load, and there have been no more resets. This supports the “14 year old group regulated PSU can’t take it” theory. Apparently there were doubts at the time about whether it was Haswell ready. And I never got my Haswell below 35 W… Time to buy a new PSU and enjoy modern LLC resonant 80+ gold, I think. Will it help my bottom line or save the Earth? Absolutely not… but it’s fun, Jan!

  • Disabling WiFi power save (iw dev $wlpwhatever set power_save off) did not have enough of an impact to show up on the wall power meter, but it reduced outgoing ping time by ~1 ms, and eliminated the worst case incoming pings of >100 ms.

Suffice to say, although it’s not going deeper than C2 yet, it’s not locked out by the firmware anymore, and I have seen that it can reach C8. Once kernel support for this platform matures (perhaps as laptop users get their hands on it), hopefully it will become possible on Linux.

It’s already idling nearly as low as Windows just with Geyserville, and without whatever compromises the “power saver” power plan entails – which I have heard are quite drastic, at least in the launch day reviews.

“I read through drivers/idle/intel_idle.c, and it looks like INTEL_ARROWLAKE is not specifically supported yet, so I’m getting the CPU idle states exposed by ACPI.”

You don’t need special support for Raptor Lake or Arrow Lake. By default Linux will use the ACPI C-states via the MWAIT instruction. That is enough to reach C2, C6 and C10 (though AFAIK it’s impossible to reach C8 without proper support).

For some reason Intel didn’t find it useful for new desktop platforms, maybe their reasons have some ground, I don’t know, but it’s definitely not a requirement.

“Enabling SAGV (System Agent Geyserville) in BIOS, using the parameters in the documentation”

Do you have an Intel Premier account? I cannot open this documentation without logging in. I found it on web.archive.org, though. You can also find those points in any processor specification PDF.

“Do you have an Intel Premier account?”

I do not. Intel’s hosted documentation has been flaky lately. It seems to be progressively switching to Azure login gated, document-by-document… among other things. The reason I linked the older one is that the version of the Desktop 200S Datasheet I saved last week (rev 001 from October) has an empty spot where the SAGV table is supposed to be:

[Image: sagv-table-if-i-had-one]

Also, it says there are 3 operating points, the BIOS definitely shows 4. I wonder if 1 is unused?

The 13th/14th gen datasheet, volume 1, which includes the table and is available right now: https://www.intel.com/content/www/us/en/content-details/743844/13th-generation-intel-core-intel-core-14th-generation-intel-core-processor-series-1-and-series-2-and-intel-xeon-e-2400-processor-datasheet-volume-1-of-2.html

13_14_gen_datasheet.zip (12.6 MB)