ASRock Rack has created the first AM4 socket server boards, X470D4U, X470D4U2-2T

I run ESXi 7 with no issues.

It seems I have finally found a working configuration. These are the BIOS settings I changed. Now I have to check one by one which ones are really needed.

  • Advanced
    • CPU Configuration
      • PSS Support -> Disabled (Performance State, Cool&Quiet)
      • CPB Mode -> Disabled (Core Performance Boost)
      • C6 Mode -> Disabled
  • AMD CBS
    • CPU Common Options
      • Platform First Error Handling -> Disabled
      • Core Performance Boost -> Disabled
      • Power Supply Idle Control -> Typical Current Idle
      • ACPI _CST C1 Declaration -> Disabled
    • NBIO Common Options
      • IOMMU -> Enabled
      • ACS Enable -> Enable
      • PCIE ARI Support -> Enable
      • Enable AER Cap -> Enable
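
The "check one by one" step can be sketched as a small test plan generator: for each trial, one changed setting goes back to its BIOS default while the rest stay as listed above. This is only a sketch, assuming the instability comes from a single setting; the names CHANGED_SETTINGS and one_at_a_time_plan are made up for illustration, and the defaults are not listed here, so "revert" simply means "back to whatever the BIOS default is".

```python
# Settings changed from default, copied from the list above.
CHANGED_SETTINGS = [
    ("Advanced > CPU Configuration > PSS Support", "Disabled"),
    ("Advanced > CPU Configuration > CPB Mode", "Disabled"),
    ("Advanced > CPU Configuration > C6 Mode", "Disabled"),
    ("AMD CBS > CPU Common Options > Platform First Error Handling", "Disabled"),
    ("AMD CBS > CPU Common Options > Core Performance Boost", "Disabled"),
    ("AMD CBS > CPU Common Options > Power Supply Idle Control", "Typical Current Idle"),
    ("AMD CBS > CPU Common Options > ACPI _CST C1 Declaration", "Disabled"),
    ("AMD CBS > NBIO Common Options > IOMMU", "Enabled"),
    ("AMD CBS > NBIO Common Options > ACS Enable", "Enable"),
    ("AMD CBS > NBIO Common Options > PCIE ARI Support", "Enable"),
    ("AMD CBS > NBIO Common Options > Enable AER Cap", "Enable"),
]


def one_at_a_time_plan(settings):
    """Yield (reverted_setting, settings_left_changed) for each trial."""
    for index, (name, _value) in enumerate(settings):
        # Everything except the setting under test keeps its changed value.
        remaining = settings[:index] + settings[index + 1:]
        yield name, remaining


if __name__ == "__main__":
    for trial, (reverted, remaining) in enumerate(one_at_a_time_plan(CHANGED_SETTINGS), 1):
        print(f"Trial {trial}: revert '{reverted}' to default, "
              f"keep {len(remaining)} other settings changed")
```

If a trial stays stable, that setting was probably not needed; if the problem comes back, it was.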

Thanks again for the help!

@RoLee What PSU/PSUs did you use when you were experiencing issues?

Maybe we are seeing something similar to the 2013 Intel Haswell situation here:

At that time many PSUs started appearing with a ‘Haswell-ready’ sticker

Or it’s just what @Mastic_Warrior mentioned in this old thread: Ryzen C-State related problems -- what is the root cause?

@Tenrag Hmm, interesting.

Currently it is running with a "Seasonic Prime Ultra Platinum 550W ATX 2.4 (SSR-550PD2)"
/seasonic.com/prime-ultra-platinum

The other model I tested it with was an "Enermax Modu 82+ (EMD425AWT)"
/www.enermax.com/home.php?fn=eng/product_a1_1_1&lv0=1&lv1=54&no=7

Drop a support email to Seasonic and ASRock Rack, they know about PSU compatibility issues with this motherboard (just search the first half of this thread), but I seem to be black-listed and never received a resolution regarding the issues I had reported.

In the meantime I’m using SFX-L SilverStone Titanium PSUs, they don’t trigger anything (so far).

With BMC version 1.90 it seems to have become much better but I still don’t trust the X470D4Us with my Seasonic PSUs :confused:

I’m happy to report that, after disabling “Platform First Error Handling (PFEH)” in the BIOS, (corrected) single-bit errors are properly reported to the OS, also when overclocking / undervolting! So I’m now getting the same results as Diversity with his memory pin shorting method…

The reason I was failing to detect this earlier was:

  • Memtest86 v8.2 reported “unknown” for ECC support. Memtest86 v8.3 reported “enabled” for ECC support. So I assumed, if it was working, Memtest86 v8.3 should be able to detect them. However, a couple of days ago I figured out that Memtest86 v8.4 beta had Zen2 ECC support in its changelog. So after testing I figured out that Memtest86 v8.3 does NOT support Zen2 ECC, but Memtest86 v8.4 beta DOES support Zen2 ECC.
  • I only discovered the BIOS option “Platform First Error Handling (PFEH)” very recently. During all my previous testing, except for the very last couple short tests, it was set to the default “enabled”. I probably did too little testing with Linux / Windows after disabling it.

So in short:

  1. (corrected) single-bit memory errors -> motherboard (BIOS) -> OS ==> works 100%
  2. (corrected) single-bit memory errors -> motherboard (BIOS) -> OS -> IPMI ==> not sure if the OS properly forwards the error to the IPMI
  3. (corrected) single-bit memory errors -> motherboard (BIOS) -> IPMI ==> 100% broken
  4. (corrected) single-bit memory errors -> motherboard (BIOS) -> IPMI -> OS ==> 100% broken
  5. (uncorrected) multi-bit memory errors -> * ==> I’m not sure if it is broken (or perhaps not even possible on Zen2) or if we just haven’t been able to trigger them yet. I’ve run Memtest86 v8.4 with unstable memory for many hours now. In doing so, I’ve triggered about 3000 “ECC Correctable Errors” (=single-bit) and about 100 CPU errors, but 0 “ECC Uncorrectable Errors” (=multi-bit). Also using the shorting method, we haven’t achieved any “ECC Uncorrectable Errors” (=multi-bit) yet. We are currently in contact with the people who wrote the paper (see link above for details) that explained the shorting method, to see how to trigger multi-bit errors reliably.

So if I understand it correctly

  1. = ok
  2. I think we can only validate this once 3) is fixed
  3. Is actually a bug and should be fixed by Asrock Rack (with help of AMD and perhaps the IPMI-chip manufacturer Aspeed)
  4. Can only work / be fixed once 3) gets fixed
  5. Suggestions are welcome. Perhaps AMD can confirm whether Zen2 properly supports this? But not in the way AMD TW did, claiming that “reporting is not supported”, which we have now clearly proven to be false

I’ve sent this information to Asrock Rack + AMD. Asrock Rack, on the same day, confirmed that they, together with AMD, had come to the exact same conclusion (using error injection in Linux) and that they asked AMD for assistance to report these errors in the IPMI as well. So hopefully we’ll someday get this important feature on this motherboard!!
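
For path 1 above (corrected errors reaching the OS), the Linux EDAC driver exposes per-memory-controller counters in sysfs, so the result can be checked without any extra tools. A minimal sketch, assuming the usual EDAC layout under /sys/devices/system/edac/mc with ce_count and ue_count files per controller; the function name read_edac_counts is mine and not part of any existing tool:

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected_count, uncorrected_count)} from EDAC sysfs."""
    root = Path(edac_root)
    if not root.is_dir():
        # EDAC driver not loaded, or ECC reporting not available to the OS.
        return {}
    counts = {}
    for mc_dir in sorted(root.glob("mc[0-9]*")):
        ce_file = mc_dir / "ce_count"   # corrected (single-bit) errors
        ue_file = mc_dir / "ue_count"   # uncorrected (multi-bit) errors
        if not (ce_file.is_file() and ue_file.is_file()):
            continue
        counts[mc_dir.name] = (
            int(ce_file.read_text().strip()),
            int(ue_file.read_text().strip()),
        )
    return counts


if __name__ == "__main__":
    results = read_edac_counts()
    if not results:
        print("No EDAC memory controllers found")
    for name, (corrected, uncorrected) in results.items():
        print(f"{name}: corrected={corrected} uncorrected={uncorrected}")
```

With PFEH disabled and unstable memory, the corrected count should go up during a memtester run; an empty result means the kernel is not seeing any EDAC memory controller at all.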

In the meantime, Diversity (especially) and I are still trying to trigger (uncorrected) multi-bit errors as well. We’re in contact with the interesting folks of ECCploit for this, who have very profound knowledge of this matter… (Check out Lucian’s talk at OffensiveCon19: https://www.youtube.com/watch?v=R2aPo_wwmZw).

Finally some real progress on this matter! Thanks to Diversity for getting my hopes up again, cause I almost gave up on this…

PFEH setting in BIOS
[image]
Screenshot taken after 1m33sec with memory overclocked / undervolted


Screenshot after almost 2h, after ending the run with memory overclocked / undervolted
[image]
And screenshot in Linux with memory overclocked / undervolted (during memtester run in the background):

3rd working method

This is how Asrock Rack and AMD tested it

The BIOS

Everything default except

  • “Platform First Error Handling” was changed from the default “Enabled” to “Disabled”
    [image]
  • “Disable Memory Error Injection”, strangely enough, was (accidentally) left at the default “True”. I think Asrock Rack fell for their own double negation confusion [image] I haven’t tried it yet with it set to “False”. I also haven’t retried Memtest86 error injection with these settings…
    [image]

The OS

I used a fresh install of “Fedora-Server-dvd-x86_64-32-1.6.iso” for this. I might have selected a few additional package groups during the install, not sure if it will make a difference to the below instructions.

[root@localhost mce-inject-master]# cat /etc/fedora-release

Fedora release 32 (Thirty Two)

[root@localhost ~]# uname -r

5.6.8-300.fc32.x86_64

Installing / configuring additional packages / tools

edac-utils

[root@localhost ~]# yum install edac-utils

Fedora 32 openh264 (From Cisco) - x86_64                                                4.8 kB/s | 5.1 kB     00:01

Fedora Modular 32 - x86_64                                                              2.2 MB/s | 4.9 MB     00:02

Fedora Modular 32 - x86_64 - Updates                                                    881 kB/s | 1.4 MB     00:01

Fedora 32 - x86_64 - Updates                                                            4.1 MB/s | 7.8 MB     00:01

Fedora 32 - x86_64                                                                      4.3 MB/s |  70 MB     00:16

Dependencies resolved.

========================================================================================================================

 Package                      Architecture             Version                           Repository                Size

========================================================================================================================

Installing:

edac-utils                   x86_64                   0.16-22.fc32                      fedora                    49 k

Installing dependencies:

sysfsutils                   x86_64                   2.1.0-28.fc32                     fedora                    44 k

Transaction Summary

========================================================================================================================

Install  2 Packages

Total download size: 93 k

Installed size: 238 k

Is this ok [y/N]: y

Downloading Packages:

(1/2): edac-utils-0.16-22.fc32.x86_64.rpm                                               406 kB/s |  49 kB     00:00

(2/2): sysfsutils-2.1.0-28.fc32.x86_64.rpm                                              337 kB/s |  44 kB     00:00

------------------------------------------------------------------------------------------------------------------------

Total                                                                                   115 kB/s |  93 kB     00:00

warning: /var/cache/dnf/fedora-558931b5e76b51a7/packages/edac-utils-0.16-22.fc32.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID 12c944d0: NOKEY

Fedora 32 - x86_64                                                                      1.6 MB/s | 1.6 kB     00:00

Importing GPG key 0x12C944D0:

Userid     : "Fedora (32) <[email protected]>"

Fingerprint: 97A1 AE57 C3A2 372C CA3A 4ABA 6C13 026D 12C9 44D0

From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-32-x86_64

Is this ok [y/N]: y

Key imported successfully

Running transaction check

Transaction check succeeded.

Running transaction test

Transaction test succeeded.

Running transaction

  Preparing        :                                                                                                1/1

  Installing       : sysfsutils-2.1.0-28.fc32.x86_64                                                                1/2

  Installing       : edac-utils-0.16-22.fc32.x86_64                                                                 2/2

  Running scriptlet: edac-utils-0.16-22.fc32.x86_64                                                                 2/2

  Verifying        : edac-utils-0.16-22.fc32.x86_64                                                                 1/2

  Verifying        : sysfsutils-2.1.0-28.fc32.x86_64                                                                2/2

Installed:

  edac-utils-0.16-22.fc32.x86_64                             sysfsutils-2.1.0-28.fc32.x86_64

Complete!

Bison

[root@localhost mce-inject-master]# yum install bison

Last metadata expiration check: 0:06:26 ago on Fri 08 May 2020 12:45:14 AM CEST.

Dependencies resolved.

=============================================================================================================================================================================================================================================

Package                                               Architecture                                           Version                                                           Repository                                              Size

=============================================================================================================================================================================================================================================

Installing:

bison                                                 x86_64                                                 3.5-2.fc32                                                        fedora                                                 818 k

Installing dependencies:

m4                                                    x86_64                                                 1.4.18-12.fc32                                                    fedora                                                 218 k

Transaction Summary

=============================================================================================================================================================================================================================================

Install  2 Packages

Total download size: 1.0 M

Installed size: 3.0 M

Is this ok [y/N]: y

Downloading Packages:

(1/2): m4-1.4.18-12.fc32.x86_64.rpm                                                                                                                                                                          946 kB/s | 218 kB     00:00

(2/2): bison-3.5-2.fc32.x86_64.rpm                                                                                                                                                                           2.0 MB/s | 818 kB     00:00

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Total                                                                                                                                                                                                        1.2 MB/s | 1.0 MB     00:00

Running transaction check

Transaction check succeeded.

Running transaction test

Transaction test succeeded.

Running transaction

  Preparing        :                                                                                                                                                                                                                     1/1

  Installing       : m4-1.4.18-12.fc32.x86_64                                                                                                                                                                                            1/2

  Installing       : bison-3.5-2.fc32.x86_64                                                                                                                                                                                             2/2

  Running scriptlet: bison-3.5-2.fc32.x86_64                                                                                                                                                                                             2/2

  Verifying        : bison-3.5-2.fc32.x86_64                                                                                                                                                                                             1/2

  Verifying        : m4-1.4.18-12.fc32.x86_64                                                                                                                                                                                            2/2

Installed:

  bison-3.5-2.fc32.x86_64                                                                                              m4-1.4.18-12.fc32.x86_64

Complete!

Flex

[root@localhost mce-inject-master]# yum install flex

Last metadata expiration check: 0:06:41 ago on Fri 08 May 2020 12:45:14 AM CEST.

Dependencies resolved.

=============================================================================================================================================================================================================================================

Package                                               Architecture                                            Version                                                         Repository                                               Size

=============================================================================================================================================================================================================================================

Installing:

flex                                                  x86_64                                                  2.6.4-4.fc32                                                    fedora                                                  318 k

Transaction Summary

=============================================================================================================================================================================================================================================

Install  1 Package

Total download size: 318 k

Installed size: 927 k

Is this ok [y/N]: y

Downloading Packages:

flex-2.6.4-4.fc32.x86_64.rpm                                                                                                                                                                                 1.5 MB/s | 318 kB     00:00

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Total                                                                                                                                                                                                        482 kB/s | 318 kB     00:00

Running transaction check

Transaction check succeeded.

Running transaction test

Transaction test succeeded.

Running transaction

  Preparing        :                                                                                                                                                                                                                     1/1

  Installing       : flex-2.6.4-4.fc32.x86_64                                                                                                                                                                                            1/1

  Running scriptlet: flex-2.6.4-4.fc32.x86_64                                                                                                                                                                                            1/1

  Verifying        : flex-2.6.4-4.fc32.x86_64                                                                                                                                                                                            1/1

Installed:

  flex-2.6.4-4.fc32.x86_64

Complete!

Rasdaemon

[root@localhost mce-inject-master]# yum install rasdaemon

Last metadata expiration check: 0:02:51 ago on Fri 08 May 2020 12:52:01 AM CEST.

Dependencies resolved.

=============================================================================================================================================================================================================================================

Package                                                       Architecture                                       Version                                                           Repository                                          Size

=============================================================================================================================================================================================================================================

Installing:

rasdaemon                                                     x86_64                                             0.6.4-1.fc32                                                      fedora                                             117 k

Installing dependencies:

perl-DBD-SQLite                                               x86_64                                             1.64-4.fc32                                                       fedora                                             196 k

perl-DBI                                                      x86_64                                             1.643-2.fc32                                                      fedora                                             707 k

perl-Math-BigInt                                              noarch                                             1:1.9998.18-2.fc32                                                fedora                                             190 k

perl-Math-Complex                                             noarch                                             1.59-452.fc32                                                     fedora                                              56 k

Transaction Summary

=============================================================================================================================================================================================================================================

Install  5 Packages

Total download size: 1.2 M

Installed size: 3.5 M

Is this ok [y/N]: y

Downloading Packages:

(1/5): perl-Math-BigInt-1.9998.18-2.fc32.noarch.rpm                                                                                                                                                          497 kB/s | 190 kB     00:00

(2/5): perl-DBD-SQLite-1.64-4.fc32.x86_64.rpm                                                                                                                                                                498 kB/s | 196 kB     00:00

(3/5): perl-Math-Complex-1.59-452.fc32.noarch.rpm                                                                                                                                                            1.3 MB/s |  56 kB     00:00

(4/5): rasdaemon-0.6.4-1.fc32.x86_64.rpm                                                                                                                                                                     1.1 MB/s | 117 kB     00:00

(5/5): perl-DBI-1.643-2.fc32.x86_64.rpm                                                                                                                                                                      1.1 MB/s | 707 kB     00:00

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Total                                                                                                                                                                                                        1.1 MB/s | 1.2 MB     00:01

Running transaction check

Transaction check succeeded.

Running transaction test

Transaction test succeeded.

Running transaction

  Preparing        :                                                                                                                                                                                                                     1/1

  Installing       : perl-Math-Complex-1.59-452.fc32.noarch                                                                                                                                                                              1/5

  Installing       : perl-Math-BigInt-1:1.9998.18-2.fc32.noarch                                                                                                                                                                          2/5

  Installing       : perl-DBI-1.643-2.fc32.x86_64                                                                                                                                                                                        3/5

  Installing       : perl-DBD-SQLite-1.64-4.fc32.x86_64                                                                                                                                                                                  4/5

  Installing       : rasdaemon-0.6.4-1.fc32.x86_64                                                                                                                                                                                       5/5

  Running scriptlet: rasdaemon-0.6.4-1.fc32.x86_64                                                                                                                                                                                       5/5

  Verifying        : perl-DBD-SQLite-1.64-4.fc32.x86_64                                                                                                                                                                                  1/5

  Verifying        : perl-DBI-1.643-2.fc32.x86_64                                                                                                                                                                                        2/5

  Verifying        : perl-Math-BigInt-1:1.9998.18-2.fc32.noarch                                                                                                                                                                          3/5

  Verifying        : perl-Math-Complex-1.59-452.fc32.noarch                                                                                                                                                                              4/5

  Verifying        : rasdaemon-0.6.4-1.fc32.x86_64                                                                                                                                                                                       5/5

Installed:

  perl-DBD-SQLite-1.64-4.fc32.x86_64             perl-DBI-1.643-2.fc32.x86_64             perl-Math-BigInt-1:1.9998.18-2.fc32.noarch             perl-Math-Complex-1.59-452.fc32.noarch             rasdaemon-0.6.4-1.fc32.x86_64

Complete!

[root@localhost machinecheck0]# rasdaemon -e

rasdaemon: ras:mc_event event enabled

rasdaemon: ras:aer_event event enabled

rasdaemon: mce:mce_record event enabled

rasdaemon: Can't write to set_event

rasdaemon: devlink:devlink_health_report event enabled

rasdaemon: block:block_rq_complete event enabled

[root@localhost machinecheck0]# systemctl start rasdaemon

[root@localhost machinecheck0]# systemctl enable rasdaemon

Created symlink /etc/systemd/system/multi-user.target.wants/rasdaemon.service → /usr/lib/systemd/system/rasdaemon.service.

[root@localhost machinecheck0]# systemctl status rasdaemon.service

● rasdaemon.service - RAS daemon to log the RAS events

     Loaded: loaded (/usr/lib/systemd/system/rasdaemon.service; enabled; vendor preset: disabled)

     Active: active (running) since Fri 2020-05-08 00:57:46 CEST; 23s ago

   Main PID: 33914 (rasdaemon)

      Tasks: 1 (limit: 38389)

     Memory: 7.1M

        CPU: 10ms

     CGroup: /system.slice/rasdaemon.service

             └─33914 /usr/sbin/rasdaemon -f -r

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: rasdaemon: diskerror_eventstore: 0x564510eb9918

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: rasdaemon: register inserted at db

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: overriding event (1360) ras:mc_event with new print handler

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: overriding event (1357) ras:aer_event with new print handler

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: overriding event (114) mce:mce_record with new print handler

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: overriding event (1441) net:net_dev_xmit_timeout with new print handler

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: overriding event (1449) devlink:devlink_health_report with new print handler

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: overriding event (1154) block:block_rq_complete with new print handler

May 08 00:57:46 localhost.localdomain rasdaemon[33914]: Calling ras_mc_event_opendb()

May 08 00:57:46 localhost.localdomain rasdaemon[33914]:            <...>-36    [005]     0.000095: block_rq_complete:    2020-05-08 00:57:45 +0200
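
The "rasdaemon -e" output above mixes "event enabled" lines with one "Can't write to set_event" message, so it is easy to miss whether the memory-related events (ras:mc_event, mce:mce_record) actually got enabled. A small sketch that sorts the lines, assuming the output format is exactly what is pasted above; summarize_rasdaemon_enable is a made-up helper, not part of rasdaemon:

```python
import re

# Matches lines such as "rasdaemon: ras:mc_event event enabled".
EVENT_ENABLED = re.compile(r"^rasdaemon: (\S+:\S+) event enabled$")


def summarize_rasdaemon_enable(output):
    """Split `rasdaemon -e` output into (enabled_events, other_messages)."""
    enabled, problems = [], []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        match = EVENT_ENABLED.match(line)
        if match:
            enabled.append(match.group(1))
        elif line.startswith("rasdaemon:"):
            # Anything else from rasdaemon is kept for manual review.
            problems.append(line[len("rasdaemon:"):].strip())
    return enabled, problems


if __name__ == "__main__":
    sample = """\
rasdaemon: ras:mc_event event enabled
rasdaemon: ras:aer_event event enabled
rasdaemon: mce:mce_record event enabled
rasdaemon: Can't write to set_event
rasdaemon: devlink:devlink_health_report event enabled
rasdaemon: block:block_rq_complete event enabled
"""
    enabled, problems = summarize_rasdaemon_enable(sample)
    for needed in ("ras:mc_event", "mce:mce_record"):
        print(needed, "enabled" if needed in enabled else "MISSING")
    print("other messages:", problems)
```

The output does not say which event the "Can't write to set_event" line belongs to, so the sketch only collects it instead of guessing.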

Development Tools (for make)

[root@localhost mce-inject-master]# yum groupinstall "Development Tools"

Last metadata expiration check: 0:07:50 ago on Fri 08 May 2020 12:52:01 AM CEST.

Dependencies resolved.

=============================================================================================================================================================================================================================================

Package                                                                 Architecture                              Version                                                                  Repository                                  Size

=============================================================================================================================================================================================================================================

Installing group/module packages:

diffstat                                                                x86_64                                    1.63-2.fc32                                                              fedora                                      43 k

...

xorg-x11-server-utils                                                   x86_64                                    7.7-34.fc32                                                              fedora                                     188 k

Installing weak dependencies:

kernel-devel                                                            x86_64                                    5.6.8-300.fc32                                                           updates                                     13 M

Installing Groups:

Development Tools


Transaction Summary

=============================================================================================================================================================================================================================================

Install  79 Packages


Total download size: 124 M

Installed size: 448 M

Is this ok [y/N]: y

Downloading Packages:

(1/79): git-2.26.2-1.fc32.x86_64.rpm                                                                                                                                                                         787 kB/s | 126 kB     00:00

...

(79/79): xorg-x11-fonts-ISO8859-1-100dpi-7.5-24.fc32.noarch.rpm                                                                                                                                              2.6 MB/s | 1.0 MB     00:00

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Total                                                                                                                                                                                                        6.9 MB/s | 124 MB     00:18

Running transaction check

Transaction check succeeded.

Running transaction test

Transaction test succeeded.

Running transaction

  Preparing        :                                                                                                                                                                                                                     1/1

  Installing       : urw-base35-fonts-common-20170801-14.fc32.noarch                                                                                                                                                                    1/79

...

  Running scriptlet: diffstat-1.63-2.fc32.x86_64                                                                                                                                                                                       79/79

  Verifying        : cpp-10.0.1-0.14.fc32.x86_64                                                                                                                                                                                        1/79

...

  Verifying        : xorg-x11-server-utils-7.7-34.fc32.x86_64                                                                                                                                                                          79/79


Installed:

  adobe-mappings-cmap-20171205-7.fc32.noarch                    adobe-mappings-cmap-deprecated-20171205-7.fc32.noarch  adobe-mappings-pdf-20180407-5.fc32.noarch                   binutils-2.34-2.fc32.x86_64

  binutils-gold-2.34-2.fc32.x86_64                              boost-filesystem-1.69.0-15.fc32.x86_64                 boost-system-1.69.0-15.fc32.x86_64                          boost-thread-1.69.0-15.fc32.x86_64

  cpp-10.0.1-0.14.fc32.x86_64                                   diffstat-1.63-2.fc32.x86_64                            doxygen-1:1.8.17-2.fc32.x86_64                              dyninst-10.1.0-5.fc32.x86_64

  gcc-10.0.1-0.14.fc32.x86_64                                   gd-2.3.0-1.fc32.x86_64                                 git-2.26.2-1.fc32.x86_64                                    git-core-2.26.2-1.fc32.x86_64

  git-core-doc-2.26.2-1.fc32.noarch                             glibc-devel-2.31-2.fc32.x86_64                         glibc-headers-2.31-2.fc32.x86_64                            google-droid-sans-fonts-20200215-3.fc32.noarch

  graphviz-2.42.4-1.fc32.x86_64                                 gtk2-2.24.32-7.fc32.x86_64                             gts-0.7.6-37.20121130.fc32.x86_64                           guile22-2.2.6-4.fc32.x86_64

  isl-0.16.1-10.fc32.x86_64                                     jbig2dec-libs-0.17-4.fc32.x86_64                       kernel-devel-5.6.8-300.fc32.x86_64                          kernel-headers-5.6.7-300.fc32.x86_64

  lasi-1.1.3-2.fc32.x86_64                                      libXaw-1.0.13-14.fc32.x86_64                           libXmu-1.1.3-3.fc32.x86_64                                  libXpm-3.5.13-2.fc32.x86_64

  libXt-1.2.0-1.fc32.x86_64                                     libfontenc-1.1.3-12.fc32.x86_64                        libgs-9.52-1.fc32.x86_64                                    libidn-1.35-7.fc32.x86_64

  libijs-0.35-11.fc32.x86_64                                    libimagequant-2.12.6-2.fc32.x86_64                     libmcpp-2.7.2-25.fc32.x86_64                                libmpc-1.1.0-8.fc32.x86_64

  libpaper-1.1.24-26.fc32.x86_64                                libraqm-0.7.0-5.fc32.x86_64                            librsvg2-2.48.4-1.fc32.x86_64                               libserf-1.3.9-15.fc32.x86_64

  libwebp-1.1.0-2.fc32.x86_64                                   libxcrypt-devel-4.4.16-3.fc32.x86_64                   make-1:4.2.1-16.fc32.x86_64                                 mcpp-2.7.2-25.fc32.x86_64

  netpbm-10.90.00-1.fc32.x86_64                                 openjpeg2-2.3.1-6.fc32.x86_64                          patch-2.7.6-12.fc32.x86_64                                  patchutils-0.3.4-15.fc32.x86_64

  perl-Error-1:0.17029-1.fc32.noarch                            perl-Git-2.26.2-1.fc32.noarch                          perl-TermReadKey-2.38-6.fc32.x86_64                         subversion-1.12.2-7.fc32.x86_64

  subversion-libs-1.12.2-7.fc32.x86_64                          systemtap-4.3-0.20200211git91ffb97ad335.fc32.x86_64    systemtap-client-4.3-0.20200211git91ffb97ad335.fc32.x86_64  systemtap-devel-4.3-0.20200211git91ffb97ad335.fc32.x86_64

  systemtap-runtime-4.3-0.20200211git91ffb97ad335.fc32.x86_64   tbb-2020.2-1.fc32.x86_64                               urw-base35-bookman-fonts-20170801-14.fc32.noarch            urw-base35-c059-fonts-20170801-14.fc32.noarch

  urw-base35-d050000l-fonts-20170801-14.fc32.noarch             urw-base35-fonts-20170801-14.fc32.noarch               urw-base35-fonts-common-20170801-14.fc32.noarch             urw-base35-gothic-fonts-20170801-14.fc32.noarch

  urw-base35-nimbus-mono-ps-fonts-20170801-14.fc32.noarch       urw-base35-nimbus-roman-fonts-20170801-14.fc32.noarch  urw-base35-nimbus-sans-fonts-20170801-14.fc32.noarch        urw-base35-p052-fonts-20170801-14.fc32.noarch

  urw-base35-standard-symbols-ps-fonts-20170801-14.fc32.noarch  urw-base35-z003-fonts-20170801-14.fc32.noarch          utf8proc-2.4.0-3.fc32.x86_64                                xapian-core-libs-1.4.14-1.fc32.x86_64

  xorg-x11-font-utils-1:7.5-44.fc32.x86_64                      xorg-x11-fonts-ISO8859-1-100dpi-7.5-24.fc32.noarch     xorg-x11-server-utils-7.7-34.fc32.x86_64


Complete!

mce-inject

[root@localhost ~]# wget https://github.com/andikleen/mce-inject/archive/master.zip

--2020-05-08 00:49:09--  https://github.com/andikleen/mce-inject/archive/master.zip

Resolving github.com (github.com)... 140.82.118.3

Connecting to github.com (github.com)|140.82.118.3|:443... connected.

HTTP request sent, awaiting response... 302 Found

Location: https://codeload.github.com/andikleen/mce-inject/zip/master [following]

--2020-05-08 00:49:09--  https://codeload.github.com/andikleen/mce-inject/zip/master

Resolving codeload.github.com (codeload.github.com)... 140.82.114.9

Connecting to codeload.github.com (codeload.github.com)|140.82.114.9|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: unspecified [application/zip]

Saving to: ‘master.zip’


master.zip                        [ <=>                                              ]  13.21K  --.-KB/s    in 0.09s


2020-05-08 00:49:10 (139 KB/s) - ‘master.zip’ saved [13530]


[root@localhost ~]# unzip master.zip

Archive:  master.zip

4cbe46321b4a81365ff3aafafe63967264dbfec5

   creating: mce-inject-master/

  inflating: mce-inject-master/Makefile

  inflating: mce-inject-master/README

  inflating: mce-inject-master/inject.h

  inflating: mce-inject-master/mce-inject.8

  inflating: mce-inject-master/mce-inject.c

  inflating: mce-inject-master/mce.h

  inflating: mce-inject-master/mce.lex

  inflating: mce-inject-master/mce.y

  inflating: mce-inject-master/parser.h

   creating: mce-inject-master/test/

  inflating: mce-inject-master/test/corrected

  inflating: mce-inject-master/test/fatal

  inflating: mce-inject-master/test/uncorrected

  inflating: mce-inject-master/util.c

  inflating: mce-inject-master/util.h

[root@localhost ~]# cd mce-inject-master/

[root@localhost mce-inject-master]# ls -la

total 48

drwxr-xr-x. 3 root root  189 Jan 19  2013 .

drwxr-xr-x. 3 root root   49 May  8 00:49 ..

-rw-r--r--. 1 root root  193 Jan 19  2013 inject.h

-rw-r--r--. 1 root root  904 Jan 19  2013 Makefile

-rw-r--r--. 1 root root 3863 Jan 19  2013 mce.h

-rw-r--r--. 1 root root 3793 Jan 19  2013 mce-inject.8

-rw-r--r--. 1 root root 6506 Jan 19  2013 mce-inject.c

-rw-r--r--. 1 root root 3487 Jan 19  2013 mce.lex

-rw-r--r--. 1 root root 3822 Jan 19  2013 mce.y

-rw-r--r--. 1 root root  385 Jan 19  2013 parser.h

-rw-r--r--. 1 root root 1460 Jan 19  2013 README

drwxr-xr-x. 2 root root   55 Jan 19  2013 test

-rw-r--r--. 1 root root  364 Jan 19  2013 util.c

-rw-r--r--. 1 root root  290 Jan 19  2013 util.h


[root@localhost mce-inject-master]# make

bison -d mce.y

flex mce.lex

cc -MM -DDEPS_RUN -I. mce-inject.c util.c mce.tab.c lex.yy.c > .depend.X && \

        mv .depend.X .depend

cc -Os -g -Wall   -c -o mce-inject.o mce-inject.c

cc -Os -g -Wall   -c -o mce.tab.o mce.tab.c

cc -Os -g -Wall   -c -o lex.yy.o lex.yy.c

cc -Os -g -Wall   -c -o util.o util.c

cc -pthread  mce-inject.o mce.tab.o lex.yy.o util.o   -o mce-inject

[root@localhost mce-inject-master]# ls -la

total 400

drwxr-xr-x. 3 root root  4096 May  8 01:01 .

drwxr-xr-x. 3 root root    49 May  8 00:49 ..

-rw-r--r--. 1 root root    45 May  8 00:54 correct

-rw-r--r--. 1 root root   185 May  8 01:01 .depend

-rw-r--r--. 1 root root   193 Jan 19  2013 inject.h

-rw-r--r--. 1 root root 47534 May  8 01:01 lex.yy.c

-rw-r--r--. 1 root root 73320 May  8 01:01 lex.yy.o

-rw-r--r--. 1 root root   904 Jan 19  2013 Makefile

-rw-r--r--. 1 root root  3863 Jan 19  2013 mce.h

-rwxr-xr-x. 1 root root 84584 May  8 01:01 mce-inject

-rw-r--r--. 1 root root  3793 Jan 19  2013 mce-inject.8

-rw-r--r--. 1 root root  6506 Jan 19  2013 mce-inject.c

-rw-r--r--. 1 root root 38960 May  8 01:01 mce-inject.o

-rw-r--r--. 1 root root  3487 Jan 19  2013 mce.lex

-rw-r--r--. 1 root root 56619 May  8 01:01 mce.tab.c

-rw-r--r--. 1 root root  2922 May  8 01:01 mce.tab.h

-rw-r--r--. 1 root root 25552 May  8 01:01 mce.tab.o

-rw-r--r--. 1 root root  3822 Jan 19  2013 mce.y

-rw-r--r--. 1 root root   385 Jan 19  2013 parser.h

-rw-r--r--. 1 root root  1460 Jan 19  2013 README

drwxr-xr-x. 2 root root    55 Jan 19  2013 test

-rw-r--r--. 1 root root   364 Jan 19  2013 util.c

-rw-r--r--. 1 root root   290 Jan 19  2013 util.h

-rw-r--r--. 1 root root  8128 May  8 01:01 util.o

[root@localhost mce-inject-master]# modprobe mce_inject

[root@localhost mce-inject-master]# vi correct

[root@localhost mce-inject-master]# cat correct

CPU 1 BANK 2

STATUS corrected

RIP 0x12341234
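
If you would rather script this step than edit the file in vi, here is a minimal Python sketch that writes the same three-line input. The names `build_inject_text` and `write_inject_file` are made up for illustration; only the layout (a CPU/BANK line, a STATUS line, a RIP line) is taken from the `cat correct` output above, and I have not checked it against any other mce-inject keywords.

```python
from pathlib import Path


def build_inject_text(cpu: int, bank: int, status: str, rip: int) -> str:
    """Return mce-inject input text in the same layout as the 'correct' file above."""
    return f"CPU {cpu} BANK {bank}\nSTATUS {status}\nRIP {rip:#x}\n"


def write_inject_file(path: str, cpu: int = 1, bank: int = 2,
                      status: str = "corrected", rip: int = 0x12341234) -> Path:
    """Write the inject description to 'path' (hypothetical helper) and return the Path."""
    target = Path(path)
    target.write_text(build_inject_text(cpu, bank, status, rip))
    return target
```

With the default arguments the text is 45 bytes long, which matches the size of `correct` in the `ls -la` listing above.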

Prevent the machine from crashing

[root@localhost mce-inject-master]# cd /sys/devices/system/machinecheck/machinecheck0

[root@localhost machinecheck0]# cat tolerant

1

[root@localhost machinecheck0]# vi tolerant

[root@localhost machinecheck0]# cat tolerant

3

[root@localhost machinecheck0]# 
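
The same change can be made without an editor. Below is a rough Python sketch, assuming the kernel exposes the `tolerant` file at the path used in the session above (it may be absent on other kernels) and that you run it as root. `set_tolerant` is a hypothetical helper name; the 0 to 3 range is the one described in the kernel's x86-64 machine check documentation, where 3 means "never panic" and is meant for testing only.

```python
from pathlib import Path

# Path taken from the session above; the knob may not exist on every kernel.
TOLERANT = Path("/sys/devices/system/machinecheck/machinecheck0/tolerant")


def set_tolerant(level: int, knob: Path = TOLERANT) -> int:
    """Write a new tolerance level to the sysfs knob and return the previous one."""
    if level not in (0, 1, 2, 3):
        raise ValueError("tolerant must be between 0 and 3")
    previous = int(knob.read_text().strip())
    knob.write_text(f"{level}\n")
    return previous
```

Keeping the returned value around makes it easy to restore the old level once the injection test is finished, for example `old = set_tolerant(3)` before the test and `set_tolerant(old)` afterwards.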

Check edac status

[root@localhost ~]# ls /sys/devices/system/edac/mc

mc0  power  subsystem  uevent

[root@localhost ~]# find /lib/modules/$(uname -r) -name '*edac*'

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/amd64_edac_mod.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/e752x_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/edac_mce_amd.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i10nm_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i3000_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i3200_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i5000_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i5100_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i5400_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i7300_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i7core_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/i82975x_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/ie31200_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/pnd2_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/sb_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/skx_edac.ko.xz

/lib/modules/5.6.8-300.fc32.x86_64/kernel/drivers/edac/x38_edac.ko.xz

[root@localhost ~]# edac-util -rfull

mc0:csrow2:mc#0csrow#2channel#0:CE:0

mc0:csrow2:mc#0csrow#2channel#1:CE:0

mc0:csrow3:mc#0csrow#3channel#0:CE:0

mc0:csrow3:mc#0csrow#3channel#1:CE:0

mc0:noinfo:all:UE:0

mc0:noinfo:all:CE:0
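
To compare the counters before and after an injection, the report can be totalled with a few lines of Python. This is only a sketch: it assumes every counter line ends in `:<type>:<count>` exactly as in the output above, and `sum_edac_counts` is a made-up name.

```python
from collections import Counter


def sum_edac_counts(report: str) -> Counter:
    """Total the CE/UE counters from 'edac-util -rfull' style text."""
    totals = Counter()
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue  # the pasted output is double-spaced
        parts = line.rsplit(":", 2)
        # Skip anything that does not end in ':<type>:<number>'.
        if len(parts) != 3 or not parts[2].isdigit():
            continue
        _, kind, count = parts
        totals[kind] += int(count)
    return totals
```

Feeding it the text captured before and after running mce-inject and comparing `totals["CE"]` shows whether EDAC counted anything at all.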

Inject the error and observe the result

[root@localhost mce-inject-master]# modprobe mce_inject

[root@localhost mce-inject-master]# ./mce-inject correct

[root@localhost mce-inject-master]#

Message from syslogd@localhost at May  8 01:02:10 ...

kernel:[Hardware Error]: Corrected error, no action required.


Message from syslogd@localhost at May  8 01:02:10 ...

kernel:[Hardware Error]: CPU:1 (17:71:0) MC2_STATUS[-|CE|-|-|-|-|-|-|-|-]: 0x9000000000000000


Message from syslogd@localhost at May  8 01:02:10 ...

kernel:[Hardware Error]: IPID: 0x0000000000000000


Message from syslogd@localhost at May  8 01:02:10 ...

kernel:[Hardware Error]: L2 Cache Ext. Error Code: 0, L2M Tag Multiple-Way-Hit error.


Message from syslogd@localhost at May  8 01:02:10 ...

kernel:[Hardware Error]: cache level: RESV, tx: INSN


[root@localhost mce-inject-master]# ras-mc-ctl --summary

No Memory errors.


No PCIe AER errors.


No Extlog errors.


No devlink errors.

Disk errors summary:

        0:0 has 1 errors

MCE records summary:

        1 Corrected error, no action required. errors

[root@localhost mce-inject-master]#
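
For reference, the `0x9000000000000000` in the MC2_STATUS line can be decoded by hand. The sketch below covers only the architectural flag bits at the top of an MCi_STATUS register (bits 57 to 63); the helper names are made up, and the vendor-specific fields lower down are deliberately ignored.

```python
from typing import List

# Architectural MCi_STATUS flag bits (x86 machine check architecture).
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MCi_MISC holds valid data
    58: "ADDRV",  # MCi_ADDR holds valid data
    57: "PCC",    # processor context corrupt
}


def decode_status(status: int) -> List[str]:
    """Return the names of the flag bits set in an MCi_STATUS value."""
    return [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]


def is_corrected(status: int) -> bool:
    """True when the entry is valid (VAL set) and not marked uncorrected (UC clear)."""
    return bool((status >> 63) & 1) and not (status >> 61) & 1
```

For the injected value, `decode_status(0x9000000000000000)` gives `['VAL', 'EN']` and `is_corrected` is true, which agrees with the `CE` shown in the kernel's own decode.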

Thanks, that was a good read.
I also was digging a little into how ECC errors should be propagated and there are a few things:

  • ECC errors are reported through Machine Check Exceptions (MCE). Those exceptions are essentially just events in which the CPU populates the MCA registers. From what I gathered, the kernel MCA handler periodically polls those registers for changes and reports any errors (or panics, for example). I also read that MCEs are essentially interrupts similar to NMIs, so I am not sure how that fits with the polling strategy.

  • AGESA decides whether ECC should be enabled on a specific CPU, and depending on that the BIOS can allow the OS/kernel to register an MCA handler.
    (old AGESA code: https://github.com/coreboot/coreboot/tree/master/src/vendorcode/amd/agesa)


Now the most interesting part:

  • AFAIK the BMC and the host OS do not talk to each other when it comes to MCEs. I contacted ASPEED support and they informed me that MCEs are normally reported to the BMC chip through APML, which is an I2C bus (on AMD CPUs).
    According to the APML spec, the CPU exports the same set of MCA registers to the BMC as it does to the host OS. So when it comes to MCAs, the host OS and IPMI detect/log/report them independently.
    AMD docs: https://developer.amd.com/resources/developer-guides-manuals/
    APML spec: https://developer.amd.com/wordpress/media/2012/10/41918.pdf
    It could be that APML is simply disabled for AM4 CPUs. I can’t find definitive info, but most of the APML marketing seems to point to EPYC exclusivity. At the same time it’s only 2 CPU pins (SIC and SID), and even AM2 had them:
    https://en.wikichip.org/wiki/amd/packages/socket_am2
    If this is true and we are simply missing the APML link, then ECC errors should not be the only thing missing from the IPMI logs. Critical temperature events are also reported through this bus. Has anyone seen overheat events in the IPMI log?
    It is also possible that the X470D4U simply did not wire those 2 pins.

  • A reminder about mce-inject: looking at the documentation, it is only for testing the MCA handlers in the kernel, not for testing the whole platform. Be sure not to rely on it when it comes to simulating real ECC errors. The mce-inject code seems to suggest it uses the EDAC driver inject points. You can find more info here: https://www.kernel.org/doc/html/latest/admin-guide/ras.html#edac-error-detection-and-correction
    I believe ECC injection in the BIOS is a separate feature.
    Edit: According to the BKDG for the previous architecture (http://support.amd.com/TechDocs/50742_15h_Models_60h-6Fh_BKDG.pdf), the EDAC driver can support error injection using dedicated CPU registers. I don’t know whether the logs differ from those produced by injecting errors only at the kernel driver level.

  • The host OS can communicate with IPMI (/dev/ipmi), so it should be possible, for example, to bypass the APML requirement in the MCA handler and communicate with IPMI directly (with an “ipmitool event”-like mechanism). I do not believe anything like this is being done currently (at least not in the MCA handler kernel code). So I think what you wrote about OS forwarding is simply not supposed to happen now:



12 boards available at the time of writing this

Edit: Price seems to be going up, it started at ~247 Euro
Edit2: another 12 boards at itboost.de
Edit3: Aaand It’s gone

Edit4: The stock seems more stable now - several retailers list it, and those that sold out a few days ago seem to be getting more units.

Hey guys,

I’m having some weird issues with this board that I was hoping you could help me figure out. I didn’t notice anyone with this exact issue but apologies if it’s already been mentioned.

To make a long story short, I don’t think I can get to the BIOS, and my machine isn’t POSTing.

The computer will automatically start itself if it’s plugged into power. If the computer is off and I plug in the PSU and turn the PSU on, the BMC light will be solid, then after about 10 seconds there is a click, the BMC heartbeat starts and the machine starts to power up. Once it starts booting I can’t power it off by holding the power button or anything; I have to switch off the PSU.

I’ve connected a monitor over VGA and spammed Delete during this whole process, but can’t seem to access BIOS.

I am able to look at the machine over IPMI…kind of. I can log in and see the dashboard, but my MB model and BIOS version are blank. There are no alerts, only some system sensor/fan data. When I go to system inventory I see, “System information will be refreshed when the system POST, please restart the system if you see nothing on screen.”

I can’t KVM into the machine, either Java or web, and I have no power control over the machine over IPMI. I can’t turn the machine on or off over IPMI, but if I try the KVM I can see the correct server power status on the top right. If I manually turn off the server while on IPMI I see the power switch go red before I lose connection.

I was able to update to the most recent BMC version (1.9) but I can’t remote flash the BIOS. I am able to upload the file and get it to the point where it says, “preparing to flash bios” but it just hangs on processing. I think it’s because the IPMI can’t power on or off the computer.

I’m going to probably take it all apart tonight and reconnect every component, but curious if there’s anything you all can think of with this.

Thanks!!

Did it stop POSTing, or has it never POSTed since purchase?

Quite normal - there is a BIOS option for this behavior

How does Dr. Debug look? (the boot codes)
I don’t really know what you mean by a click. I don’t have anything like it.
About the BMC light: yup, first few seconds solid, then pulsing - sounds normal to me

Bad BIOS flash? Try flashing through IPMI again, but using a clean browser without any addons/adblockers. And be patient.

If the BIOS is busted then KVM won’t have much to show.

This is concerning, this should work even without functioning BIOS. Did it work in the past?

I had similar issues, and your suspicion aligns with my experience a little. I once tried updating the BIOS through the IPMI while the machine was ON. Make sure it is OFF when flashing.

But since holding the power button doesn’t work, I have only one suggestion: disconnect any and all front panel connectors for power/reset/LEDs and use a screwdriver to short the pins.
I once had a similar, very frustrating problem with a PC that was boot looping. It was a broken power button in a chassis.

Additional question: What components? CPU/RAM/PCIe cards/Drives

Edit:
2 more things:

  • Try resetting the CMOS - instruction in the manual
  • Make sure there are no misaligned standoffs or stray screws shorting anything on the back of the board

No, it’s never worked. This is a new comp so all components are new.

No boot codes come up; the display on the MB doesn’t light up or anything. The BMC and power ready green LEDs are on though, which I think is normal.

I let it run for like an hour and a half last night. I think the issue is that, since I can’t turn the machine power on or off (the request times out), the IPMI can’t shut down the computer to flash the BIOS. I can’t try to flash the BIOS with the machine off because it will auto-start if it’s connected to the PSU.

As far as I understand it, it is working. The cable is connected to the IPMI port and then goes to my router. I’ve been controlling it from my PC, which is also hard-wired into the router. I was able to update the BMC today to the current version, so I think it’s hopefully at least connected right. I had admittedly not heard of IPMI until I bought this board haha.

After work today I’m planning on resetting the CMOS and trying to mess with ram variations to see if it will boot. If it doesn’t work I’ll just have to take it apart and reseat everything and go from there.

Comp parts are…
PSU - EVGA SuperNova 650 G3 80+ Gold
RAM - 16GB Corsair Vengeance LPX SDRAM DDR4 3000
Case - Fractal Design 804
CPU - AMD Ryzen 5 1600
MB - ASRock Rack X470D4U
NAS Storage - 2x 6gb WD RED
SSD/Cache - Kingston 240gb SSD

I meant the power control over IPMI.


If it were a DDR fault you would most likely see it boot-loop a few times while cycling error codes. My current bet is a shorted power/reset button, or just an incorrectly connected front panel.

Nope, power over IPMI has never worked.

Hopefully it’s something silly like the power button being connected wrong.

Hopefully.


One more comment about:

1st gen isn’t on the supported CPU list but STH posted an article with tests including 1600 (non AF). So it shouldn’t be an issue: https://www.servethehome.com/amd-ryzen-5-1600-af-review-a-wildcard-option/2/


Also: X470D4U review on STH is finally up: https://www.servethehome.com/asrock-rack-x470d4u-review-amd-ryzen-meet-server/

Has anyone been able to get AMD-RAID with the two NVMe slots to work when installing CentOS/Red Hat 8? It just shows both NVMe disks, but not the AMD RAID array I created in the BIOS.

Thanks

In my thread with the Intel P4500 NVMe issues @wendell mentions that there is no Linux driver support for an AMD NVMe RAID array so you’ll probably have to use general software RAID solutions from the intended distributions to get there.

Was way off, only got an 8.9

The review doesn’t really touch on the various bugs of the motherboard, so I guess we’ll see a few new users here soon :wink:


OP raises from the grave

Ugh… wha… STH finally reviewed this? Cool.

reads review

Well that was a useless waste of time…

For the moment, until more widespread support materializes for the Ryzen platform, our recommendation for the X470D4U will come with a huge caveat; we like the platform, but you will have to test it for yourself to make sure the platform works for your organization (or you if this is a lab environment) and your particular set of applications.

They should have just hotlinked this thread and called it a day.
