No POST when LSI 9207-8e is installed in motherboard

Hello, I’m trying to install an LSI 9207-8e in an ASRock Rack EP2C621D12 WS motherboard, but everything I’ve tried so far has led to a failure to POST.

I’ve been testing the system stability with all of my other chosen hardware and the server is rock solid but the introduction of this HBA just breaks the whole system and I’m baffled as to why.

I assumed the HBA was faulty but plugging it into my desktop shows it’s working just fine. Host boots, kernel loads a driver for the HBA, and everything is hunky-dory.

But for some reason I can’t get past the error code B2 if this HBA is installed on this workstation motherboard.

The motherboard is already on the latest BIOS.
I can’t track down any firmware updates for this card, in case that would have helped.
I explored CSM parameters to see if Legacy vs UEFI was causing some sort of storage conflict.

What worries me most is that the plan is for this platform to be a drop-in replacement for a current system. The current system has three LSI 9207-8i’s, and given that the only difference I’m aware of is external vs. internal ports, I don’t know if the motherboard will work with the 8i variants, which would be a deal breaker.

Has anybody else run into this before? Not necessarily this specific combination of hardware, but just in general an HBA that works but is not letting a system POST?

Have you tried plugging it into every slot, and do you have a GPU installed?

It could be worth going to “Advanced” > “Chipset Configuration” in the BIOS and setting “IGPU Multi-Monitor” to disabled.

If I recall, there were some HBAs where an OEM vendor (probably Dell, the bastards) fucked with one of the PCIe pins in a way that would cause boot issues. I think there was a fix. I’ll check for that in a moment.

Found it, here’s a video explaining the issue. As expected, it’s Dell’s bullshit. Also note that misbehaving cards might work in some systems but not others, and the errors they produce are strange and seemingly unrelated, because the cards are fucking with the SMBus, which is used for error reporting.

So the question is, are your cards Dell OEM LSI 9207 cards?

For the sake of the question: I tried, and I get the same behavior in slots 1 through 7. I previously had a GPU installed just to test virtualization and hardware pass-through on Linux (a little annoying, but I got it working in the end). The only BIOS setting I toyed with was changing the Primary Display Adapter from onboard to PCI_e, but changing that back didn’t change anything for the current issue.

This motherboard uses LGA3647 and supports no iGPU. No setting like that exists under Chipset Configuration. Onboard display comes from the ASPEED AST2500 BMC IIRC.

Just to make sure I’m not going insane I am sitting on my desktop again with the HBA installed and everything’s fine. I don’t have the SFF-8088 equipment to connect drives from here but I’m in the OS and the HBA appears fine:

42:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	    Subsystem: Broadcom / LSI 9207-8e SAS2.1 HBA [1000:3040]
	    Flags: bus master, fast devsel, latency 0, IRQ 47
	    I/O ports at 3000 [size=256]
	    Memory at 9f340000 (64-bit, non-prefetchable) [size=64K]
	    Memory at 9f300000 (64-bit, non-prefetchable) [size=256K]
	    Expansion ROM at 9f200000 [disabled] [size=1M]
	    Capabilities: <access denied>
	    Kernel driver in use: mpt3sas
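
For anyone who wants to answer the "is it a Dell OEM card?" question from output like the above, here is a minimal Python sketch that pulls the IDs and bound driver out of one `lspci -nnv` stanza. The function name `summarize_lspci` and the trimmed `SAMPLE` text are made up for illustration. The only outside fact it leans on is that `1000` is the LSI/Broadcom PCI vendor ID while Dell-branded cards normally report `1028` as the subsystem vendor, so treat the conclusion as a hint rather than proof.

```python
import re

# Trimmed copy of the lspci output quoted in the post (hypothetical test input).
# On a live system the text would come from something like `lspci -nnv`.
SAMPLE = """\
42:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
\tSubsystem: Broadcom / LSI 9207-8e SAS2.1 HBA [1000:3040]
\tKernel driver in use: mpt3sas
"""

ID_PAIR = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def summarize_lspci(text):
    """Extract address, vendor:device, subsystem IDs and driver from one stanza."""
    info = {}
    lines = text.splitlines()
    if not lines:
        return info
    header = lines[0]
    info["address"] = header.split()[0]
    # The class code [0107] has no colon, so only vendor:device matches here.
    pairs = ID_PAIR.findall(header)
    if pairs:
        info["vendor"], info["device"] = pairs[-1]
    for line in lines[1:]:
        line = line.strip()
        if line.startswith("Subsystem:"):
            match = ID_PAIR.search(line)
            if match:
                info["subsystem"] = f"{match.group(1)}:{match.group(2)}"
        elif line.startswith("Kernel driver in use:"):
            info["driver"] = line.split(":", 1)[1].strip()
    return info


info = summarize_lspci(SAMPLE)
print(info)
# A subsystem vendor of 1028 would point at a Dell-branded card.
print("Dell subsystem vendor:", info.get("subsystem", "").startswith("1028:"))
```

With the output quoted above, the subsystem vendor is `1000`, which at least suggests this is not a Dell-branded card.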

Addressing both of you I don’t know if this would be of help but it was a refurb card sold through Avago. At boot on the desktop it looks like it’s running some Avago firmware and not LSI/Broadcom direct. I can try to get a screen shot but it’ll be ugly. Desktop doesn’t have IPMI.

It is actually an initialization issue. Just tape those pins and go on with life. Unless you feel like modifying the bios. (Of the card, board, or both)

You might give the pin fix a try anyways, it’s easy and non destructive to do with some paper and an exacto knife.

Beyond that, I’ve no idea what could cause the card to make the system to fail POST. There are lots of reasons it could be unrecognized or something, but to halt POST is interesting.

It does look as though the instant the card is supposed to initialize and show its BIOS firmware screen is the moment the system halts with B2.

Of course it’d be on the Xeon platform where this occurs but the card doesn’t know how to complain on AMD Threadripper (the desktop).

So long as the tape trick is non-destructive I don’t mind giving it a shot. I’d use kapton if I had it but…maybe some basic box tape will work for a quick test.

I’ve done it with regular sellotape on my HBA before.

Just make sure you leave enough on the end to pull it out, in case it gets stuck in the slot.

Any plastic tape works fine (I’ve taped hard drives with the 3.3 V forced power-off feature/issue), though it can be hard to deal with a piece that big.

Paper tapes can become goo and possibly cause problems in the slot.

Packing tape or scotch tape. Wrap it around the bottom edge of the card just barely, so it does not slide up when shoving the card in.

The tape trick unfortunately did not work.

So from here I could query the seller and ask what’s up or I could flash the firmware to another version that isn’t AVAGO brand and see if that goes anywhere because the card works. Just not in a Xeon server. So it does seem to be a soft-lock of sorts.

Are you on the newest BIOS for your mainboard?

Yep. V2.10

If the tape trick does not work, it is more an issue with the MB BIOS initialization code than it is with the card’s firmware. The best thing to do would be to look on the FreeNAS forums for cards that work with your board. Whether you are using FreeNAS or not is irrelevant at this point.

I am open to looking at other HBAs, no problem. I’m very confused by that though, because during testing I dropped older, definitely unsupported cards into the server and it took them just fine.

I installed a Dell PERC H310 with IT firmware and it worked.
I installed a Radeon HD 7970.
I installed a Mellanox ConnectX-3 CX311A.

Everything worked. None of this hardware being a part of the final system. Just for testing. So why does a legitimate card I plan to use just not work…that’s so weird.

And it worries me because that means I might have to replace the LSI 9207-8i cards because I expect they won’t work either.

I guess I could e-mail ASRock Rack support and see if they could give me any information but I expect they probably won’t have anything helpful.

A quick Google search yielded tons of forum posts regarding LSI cards not POSTing on ASRock boards. Even the server-centric Rack series seems to have issues. So my assumption of it being the MB BIOS seems to hold weight.

Emailing their support may be a good plan, they may not have a fix, but they may be able to confirm non-working HBAs so at least you do not have to guess.

In my experience, their consumer division are insanely quick at turning around beta BIOS builds to fix particular issues, so this is definitely worth chasing.

@Log @Zedicus @Zavar
Oh shit, oh jeez, WE HAVE POST! :laughing:

And it shows up in the OS:

It even took the vfio-pci driver that I forgot to disable the script for on the 3b:00.0 address. :sweat_smile:
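
If you want to check which driver has claimed the card without reading through lspci output, the kernel exposes it as a `driver` symlink under sysfs. Here is a minimal Python sketch. The helper name `bound_driver` is made up, and the `0000:` PCI domain prefix on the address is an assumption (it is the usual default on single-domain systems).

```python
import os


def bound_driver(address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None."""
    link = os.path.join(sysfs_root, address, "driver")
    # An unbound device simply has no `driver` symlink.
    if not os.path.islink(link):
        return None
    # The link target ends in the driver name, e.g. .../drivers/vfio-pci
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Assumed address: the 3b:00.0 slot mentioned above, in PCI domain 0000.
    print(bound_driver("0000:3b:00.0"))
```

For actual use as an HBA on the host you would expect `mpt3sas` here rather than `vfio-pci`, going by the desktop output earlier in the thread.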

Now I do still need to connect a disk shelf to actually verify it works but this is a very good sign.

To get it booting I tracked down the latest firmware and, apparently, BIOS (I didn’t know those were two separate things on these cards).

Firmware: 20.00.00.00 → 20.00.07.00
BIOS: 7.39.00.00 → 7.39.02.00
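
Since the difference is only in the third field, it is easy to miss when comparing by eye. Here is a small Python sketch for comparing these dotted version strings numerically. The helper names `parse_version` and `needs_update` are made up, and it assumes every field is a plain decimal number.

```python
def parse_version(version):
    """Turn '20.00.07.00' into (20, 0, 7, 0) so fields compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_update(current, latest):
    """True when `current` is strictly older than `latest`."""
    return parse_version(current) < parse_version(latest)


# Version numbers taken from the post above.
print(needs_update("20.00.00.00", "20.00.07.00"))  # firmware
print(needs_update("7.39.00.00", "7.39.02.00"))    # BIOS
```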

Not a significant version number change but that’s all it took. Getting the card flashed was a PITA though.

  1. Couldn’t go through the OS.
  2. Apparently none of my open/free motherboards support BIOS32 Service Directory, so DOS kept giving me PAL errors.
  3. Had to track down a means to enter Shell.efi and use the sas2flash.efi utility.

Once I finally had access to a working shell though it was just a matter of running sas2flash.efi -o -f 9207-8e.bin -b mptsas2.rom and waiting a minute or two.
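
For anyone repeating this, one way to avoid typing at the EFI prompt is to put the commands in a shell script on the USB stick next to the firmware files. The Python sketch below just writes out such a script. The file name `flash.nsh`, the `fs0:` volume mapping, and the helper name `build_flash_script` are all assumptions, and the `-listall` calls are only there to show the versions before and after. The flash command itself is the one quoted above.

```python
def build_flash_script(firmware="9207-8e.bin", bios="mptsas2.rom", volume="fs0:"):
    """Build the text of a UEFI shell script that flashes firmware and BIOS.

    `volume` is an assumed mapping; check the `map` output in the EFI shell
    to find which fsN: your USB stick actually is.
    """
    lines = [
        volume,                                        # switch to the USB stick
        "sas2flash.efi -listall",                      # versions before flashing
        f"sas2flash.efi -o -f {firmware} -b {bios}",   # command from the post
        "sas2flash.efi -listall",                      # versions after flashing
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical output name; copy it to the stick and run it from the shell.
    with open("flash.nsh", "w", newline="") as handle:
        handle.write(build_flash_script())
```

Running the script by hand from the shell, rather than naming it `startup.nsh`, keeps it from firing automatically on a machine you did not mean to flash.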

I’ll run some tests with it before I really call it good though.

Nice, and thanks for the update on what fixed it!
Yeah, the update process is asinine.