AMD HSMP & eSMI - On-the-fly control of fabric, clocks, power and more

Reference Material from the video

GitHub - amd/amd_hsmp: AMD HSMP module to provide user interface to system management features.

GitHub - amd/esmi_ib_library: E-SMI: EPYC™ System management Interface In-band Library

Processor Programming Reference (PPR) for AMD Family 1Ah Model 02h, Revision C1 Processors (57238)

What is AMD e-SMI?

The EPYC™ System Management Interface In-band Library, or E-SMI library, is part of the EPYC™ System Management Inband software stack. It is a C library for Linux that provides a user space interface to monitor and control the CPU’s power, energy, performance and other system management features.

… this uses the HSMP module which is the interface to these system management features.

What does this do for us?

If you know Ryzen Master on desktop-class CPUs, this provides an interface for a lot of similar functionality, but for EPYC CPUs. If you need to adjust the cTDP all the way down on the fly, that is possible. That’s one of the power-saving scenarios I demo in the video.
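For anyone curious what an on-the-fly power limit change looks like underneath E-SMI, here is a minimal Python sketch of talking to the HSMP driver directly. Treat it as a sketch under assumptions, not the library's method: the `struct hsmp_message` layout, the `0xF8` ioctl base, the `/dev/hsmp` node and the two message IDs are my reading of the kernel's `asm/amd_hsmp.h` header, so check them against your kernel and the PPR. The helper names (`pack_msg`, `unpack_response`, `hsmp_call`) are made up for illustration.

```python
import os
import struct

# ASSUMPTION: layout of `struct hsmp_message` as in the Linux uapi header
# asm/amd_hsmp.h: u32 msg_id, u16 num_args, u16 response_sz,
# u32 args[8], u16 sock_ind, followed by 2 bytes of tail padding.
HSMP_MSG_FMT = "=IHH8IH2x"
HSMP_MSG_SIZE = struct.calcsize(HSMP_MSG_FMT)

# ASSUMPTION: message IDs; confirm against the PPR / kernel header.
HSMP_GET_SOCKET_POWER = 4        # response word 0: socket power in mW
HSMP_SET_SOCKET_POWER_LIMIT = 5  # argument word 0: power limit in mW


def iowr(ioc_type, nr, size):
    """Encode a Linux _IOWR ioctl number (x86 bit layout)."""
    return (3 << 30) | (size << 16) | (ioc_type << 8) | nr


HSMP_IOCTL_CMD = iowr(0xF8, 0, HSMP_MSG_SIZE)


def pack_msg(msg_id, args=(), response_sz=0, sock_ind=0):
    """Serialize one HSMP message into a mutable buffer."""
    if len(args) > 8:
        raise ValueError("HSMP messages carry at most 8 argument words")
    padded = list(args) + [0] * (8 - len(args))
    return bytearray(struct.pack(HSMP_MSG_FMT, msg_id, len(args),
                                 response_sz, *padded, sock_ind))


def unpack_response(buf, response_sz):
    """Return the first `response_sz` words of the args/response array."""
    fields = struct.unpack(HSMP_MSG_FMT, bytes(buf))
    return list(fields[3:3 + response_sz])


def hsmp_call(msg_id, args=(), response_sz=0, sock_ind=0, dev="/dev/hsmp"):
    """Send one message to the driver. Needs root and HSMP enabled in BIOS."""
    import fcntl  # Linux-only, imported here so the helpers above stay portable
    buf = pack_msg(msg_id, args, response_sz, sock_ind)
    fd = os.open(dev, os.O_RDWR)
    try:
        fcntl.ioctl(fd, HSMP_IOCTL_CMD, buf, True)  # driver fills buf in place
    finally:
        os.close(fd)
    return unpack_response(buf, response_sz)
```

If those assumptions hold, `hsmp_call(HSMP_SET_SOCKET_POWER_LIMIT, args=(200_000,))` should request a 200 W limit on socket 0, and `hsmp_call(HSMP_GET_SOCKET_POWER, response_sz=1)` should return the current socket power in milliwatts. For anything beyond experimenting, the E-SMI library is the safer route.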

It is also possible to fine-tune the Infinity Fabric speed (within supported limits – no overclocking) and to tell the CPU to prioritize power for, say, fabric and I/O operations rather than for compute and cores.

For advanced BIOS features like streaming enhancement and L3 cache management policies, it is also possible to tweak those on the fly using the HSMP registers documented in the PPR updates that just dropped.

It is incredibly useful for anyone looking to do workload diagnostics and fine-tuning. Options that have been in the BIOS since the Rome generation can now be tweaked on the fly from the command line, which greatly speeds up testing and optimization efforts.

The Key To Using

(from github)

HSMP PCIe interface needs to be enabled in the BIOS. The CBS option can be found by navigating to the following path

Advanced > AMD CBS > NBIO Common Options > SMU Common Options > HSMP Support

BIOS Default: “Auto” (Disabled)

If the option is disabled, calls to the SMU will result in a timeout.

Not all BIOSes expose this option; if your board does not have it, please ping your board vendor to inquire.

Does this work on Threadripper?

No, not presently. If this is something you’d like to see however, please PLEASE engage and let AMD know so they can do the qual work to get it going.


Hey Wendell! Cool video and thanks for the description. Is there a way to make use of it to monitor the EPYC CPU properly via Grafana / telegraf or something like that? Sure, it can do much more, but monitoring would really be awesome as a start.

Yes, and in fact, if you paste the relevant 3-4 pages from the PDF, Claude or ChatGPT can write a plugin that gets you most of the way there.
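As a rough idea of what such a plugin boils down to, here is a minimal Python sketch of the output side: turning a set of readings into InfluxDB line protocol, which Telegraf's `exec` input can ingest with `data_format = "influx"`. The measurement name `epyc_hsmp`, the field names and the sample values are all hypothetical, and how you obtain the readings (E-SMI library, its CLI tool, or raw HSMP) is deliberately left out.

```python
def _escape(value):
    """Escape characters that are special in line-protocol tag keys/values."""
    text = str(value)
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, ts_ns=None):
    """Format one sample as an InfluxDB line-protocol line."""
    if not fields:
        raise ValueError("line protocol needs at least one field")
    tag_part = "".join(
        f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items())
    )
    field_items = []
    for key, value in sorted(fields.items()):
        if isinstance(value, bool):
            field_items.append(f"{_escape(key)}={'true' if value else 'false'}")
        elif isinstance(value, int):
            field_items.append(f"{_escape(key)}={value}i")  # 'i' marks integers
        else:
            field_items.append(f"{_escape(key)}={float(value)}")
    line = f"{_escape(measurement)}{tag_part} {','.join(field_items)}"
    if ts_ns is not None:
        line += f" {ts_ns}"  # nanosecond timestamp is optional
    return line


if __name__ == "__main__":
    # Hypothetical readings; replace with values from your own collector.
    sample = {"power_mw": 185000, "fclk_mhz": 1800.0}
    print(to_line_protocol("epyc_hsmp", {"socket": 0}, sample))
```

Pointing a Telegraf `[[inputs.exec]]` section at a script like this, one printed line per socket per interval, should be enough to get the numbers into Grafana.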


How recent does the Epyc need to be, does this work for Naples or Rome too?


Sure does. It depends on the board more than the chip. Technically Milan is the first supported generation, but it happens to work, unofficially and unsupported, on the Rome board I tested with a recent BIOS.

You might need to build your own E-SMI-like tool, however, as the registers are somewhat different, though they are still documented in the updated docs.

Naples is too old.


How best to engage AMD regarding Threadripper?

igotchufam, already hitting that angle hard

Shame Naples is too old, still have one in use as a home server.