Please help! - LM_SENSORS on Asus Pro WS WRX80E-SAGE SE WIFI seems to have broken KVM functionality and ethernet stability

TL;DR - I ran the lm_sensors command ‘sensors-detect’ on the ASUS Pro WS WRX80E-SAGE SE WIFI w/ 3975WX. I chose all the default options (i.e. hit Enter on every question about what to probe). Immediately afterwards the following issues appeared:

  1. I no longer could get the video to work on the KVM

  2. While I was in the OS (Fedora 35 Workstation Gnome) I kept seeing a message saying that ‘Activation of network connection failed’

I’m looking for help resolving these issues and getting my system back to working.


The Details:

My System Hardware:

  • Asus Pro WS WRX80E-SAGE SE WIFI
  • AMD 3975WX
  • 256 GB ECC RAM (Crucial 3200 MHz)
  • Sapphire Pulse 5500XT 8GB
  • Corsair AX1600i PSU
  • 2x 1TB Samsung 970 Evo
  • 1 TB Samsung 980 Pro
  • LSI SAS 9300-8i 8PT 12GB/S SATA+ SAS PCIE HBA (Avago 9300-8i storage controller)
  • 3x Seagate 6TB IronWolf Pro connected to LSI HBA
  • 3x WD 6TB Red connected to LSI HBA
  • AMD 6900XT (Not connected to monitor at the time)
  • 2x Samsung 850 EVO 500GB

Other System Details

  • Running Fedora 35 Workstation - everything was up to date
  • Running latest BIOS (0701) and Firmware (1.14.0)

Issues in More Detail:

KVM

When I power on the system and watch it through the KVM, it no longer provides video once the OS is fully booted. I do see the ASUS w/ Fedora splash screen while the OS is booting, but once it reaches the login screen the KVM just shows a black screen saying ‘No Signal’. Prior to running ‘sensors-detect’ I was able to get video output. Note that I still have keyboard and mouse control via the KVM. A couple of other things to note: 1) I do not see the prompt to enter the BIOS (UEFI) via the KVM while booting (it does show on the monitor connected to the graphics card), nor am I able to enter the BIOS via the KVM by hitting the Delete key; I’m not sure whether I was able to do this before running ‘sensors-detect’. 2) When starting and stopping the KVM I get some short audio static/popping; this did not happen before I ran ‘sensors-detect’.

Activation of network connection failed

I get this notice at the top of my desktop while using Fedora. It is intermittent. I have not experimented a ton, but it may only happen while I am also logged into the IPMI web service.

Things I’ve tried:

  1. Fresh install of Fedora 35 workstation
  2. Doing a firmware reset via the IPMI web service (note this did not seem to totally reset the firmware as it maintained the custom fan speed settings I had inputted previously, but it did reset the fan speed alarm levels)
  3. Doing a BIOS restore defaults via the BIOS interface
  4. Pressing the Clear CMOS button on the motherboard

I wanted to try removing the CMOS battery but I was not able to locate it. I scoured the internet and based on this picture, I think it may be located under the rear I/O port shield/VRM heatsink. I’m not sure how to access this area. My fear is that I would need to uninstall the motherboard and remove its back plate, which would reveal some screws that let me take off the outer shield and access the battery. If so, this would take a lot of time. I’m fairly frustrated with ASUS’s design in this regard. If anyone can confirm where the CMOS battery is, and how best to access it, that would be helpful.

How You Can Help

Basically I am hoping someone can suggest things to try or check to get the functionality of my motherboard back. Or, if you have had a similar experience, can you confirm whether it is repairable or whether I permanently broke my motherboard? For anyone interested, after encountering these problems I did more research and it seems I should probably use ipmitool to get sensor data. Also, as a PSA, it looks like the last commit to the lm_sensors project was a year ago, so I’m not sure whether it has been abandoned. While using the software it did provide some warnings, but I guess I was too trusting of the comments I read suggesting that the defaults were most likely safe.
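For anyone who wants to go the IPMI route, here is a minimal sketch of reading sensor values through the BMC rather than probing the buses directly. It assumes `ipmitool` is installed, that the local IPMI interface is available, and that `ipmitool sensor` prints its usual pipe-separated table; the function names are my own and I have not verified this on the WRX80E.

```python
import subprocess


def parse_ipmitool_sensor(text):
    """Parse the pipe-separated table printed by `ipmitool sensor`.

    Returns {sensor name: (reading, unit, status)} for numeric sensors only.
    Rows whose reading is "na" or a discrete hex value are skipped.
    """
    readings = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        name, value, unit, status = fields[:4]
        try:
            readings[name] = (float(value), unit, status)
        except ValueError:
            continue  # "na", "0x0", etc.
    return readings


def read_sensors():
    """Run `ipmitool sensor` locally (usually needs root) and parse it."""
    result = subprocess.run(
        ["ipmitool", "sensor"], capture_output=True, text=True, check=True
    )
    return parse_ipmitool_sensor(result.stdout)
```

Calling `read_sensors()` as root should return a dictionary keyed by whatever sensor names the BMC reports.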

Thanks in advance for any help you can provide! FYI - I live in Thailand, so if you do offer help I may not respond right away due to the timezone difference.

Have you turned it off completely?

The best idea I can come up with is that I2C transactions went to a clock generator, which then got confused and now isn’t correctly clocking PCIe for the NIC and the BMC GPU.
Those clock generators are likely powered by 5 V standby, so just rebooting / resetting won’t reset them.

The good thing is that they usually don’t contain permanent storage so they should recover from this.


First, thanks so much for your reply and help!

Yes, I have completely turned off my power supply and pressed the power button in an attempt to drain the capacitors. However, that has not worked. The computer has been in this state for about a week now. Do you know if the CMOS battery powers any of the BMC/IPMI settings? (FYI - I have a very limited understanding of the electronics.)

That is sad to hear because that means something got persistently corrupted.

One such case I could find (via DuckDuckGo) is the ThinkWiki page “Problem with lm-sensors”.

With knowledge of this, there are a few steps you can take.

First would be to warn others about this, like with the Thinkpad case.
Then get in contact with Asus for confirmation and resolution of your specific issue.

Since your board is already affected and you might want to dig deeper, there are a few things you can do to figure out the specifics on your own.
You may even be able to fix the issue yourself with some information from others.

  • Identify EEPROMs from pictures of the board.
  • Use i2cdetect to find all I2C slave devices on all connected buses.
  • Use i2cdump to dump the contents of the detected devices.
  • Use everything so far to identify the likely culprit.
  • Get known-good EEPROM content from Asus or another owner of such a board.
  • Write the good values to the EEPROM in question.
  • Have a working board again.
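
The detect and dump steps could be sketched roughly like this. It is only a sketch, assuming i2c-tools is installed, the `i2c-dev` module is loaded, and you run it as root; the helper names are made up for illustration. Keep in mind that probing the buses is the same kind of access that sensors-detect performs, and the i2c-tools documentation itself warns that it can confuse devices, so review each command before running it.

```python
import re
import subprocess


def parse_i2cdetect(text):
    """Parse the address grid printed by `i2cdetect -y <bus>`.

    Returns a list of (address, state) pairs, where state is "present"
    for a responding device and "busy" for "UU" (claimed by a driver).
    """
    found = []
    for line in text.splitlines():
        head, sep, _ = line.partition(":")
        if not sep or len(head) != 2:
            continue  # header row or unrelated line
        try:
            base = int(head, 16)
        except ValueError:
            continue
        for col in range(16):
            # Each grid cell is three characters wide, after the "XX:" prefix.
            cell = line[3 + 3 * col : 6 + 3 * col].strip()
            if cell == "UU":
                found.append((base + col, "busy"))
            elif re.fullmatch(r"[0-9a-f]{2}", cell):
                found.append((base + col, "present"))
    return found


def scan_bus(bus):
    """Run `i2cdetect -y <bus>` and parse the result (needs root)."""
    result = subprocess.run(
        ["i2cdetect", "-y", str(bus)], capture_output=True, text=True, check=True
    )
    return parse_i2cdetect(result.stdout)


def dump_commands(bus, found):
    """Build, but do not run, byte-mode i2cdump commands for free devices."""
    return [
        f"i2cdump -y {bus} 0x{addr:02x} b"
        for addr, state in found
        if state == "present"
    ]
```

`i2cdetect -l` lists the bus numbers to pass to `scan_bus`. The dump commands are only printed, not executed, so each one can be checked against the chips identified from the board pictures first.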

Thanks! While a little outside of what I’m familiar with, this is exactly the kind of information I was looking for. I really appreciate the time and this provides me a concrete list of things I can work on.

To clarify, it isn’t certain that the board has such an EEPROM or that one got corrupted this way.

But it is currently the only plausible lead to follow.

Are you able to post detailed pictures of the board?
If we were to spot an Atmel EEPROM of one of the known problematic types on the board, that would confirm the whole theory and make it easier going forward.

So if you can, please post as many closeup pictures of the board as you can.

I’m not sure the KVM video issue is related to the lm_sensors stuff.

I haven’t run sensors-detect on my WRX80 Asus board yet and I see the same behavior: the BIOS screen doesn’t show on the KVM but does show on the screen attached to the graphics card. I think this is normal, because the board seems able to send the output to only one device.

In the BIOS I have CSM support disabled (if enabled, the primary graphics device will be the onboard VGA) and the primary display set to the graphics card.
→ BIOS, GRUB menu, (boot) consoles and X all show on my screen connected to the graphics card, and when I connect via KVM it only shows a black screen.

If I enable CSM support or set the primary display to onboard VGA, I see the BIOS, GRUB menu and (boot) consoles in the KVM, but my display stays black until X is loaded, which I configured explicitly to use the nvidia driver. Once X is loaded the KVM display goes black, because output has switched to nvidia, but you can get the console back by sending Ctrl-Alt-F1/F2, etc.