Out of ideas: HP Quadro RTX 6000 Invisible on Intel systems

Hello! So after far too long lurking, I have a hardware issue on my hands that I have no idea how to approach.

I bought a second-hand Quadro RTX 6000 (so 24GB Turing) for cheap, and it only works in some of my systems.

GPU-Z reports the subsystem as HP and the seller confirms it came from an HP workstation. The actual device and subsystem IDs are 10DE 1E30 103C 12BA as reported by HP’s firmware updater tool.

So, the issue at hand: it works (almost*) perfectly fine in all of my AMD Ryzen systems. It powers on and functions just fine, all VRAM is usable, and performance is as expected. But it does not work in any of my Intel systems (Z79 Haswell/Broadwell, B760/Z790 with a 12700KF/14900K). The system POSTs and boots up just fine, but the GPU just isn’t there: Windows Device Manager doesn’t see anything and Linux lspci shows no Quadro at all. I also couldn’t spot anything weird in dmesg.

*I say “almost” perfectly because it seems like the PCIe PHY or PCB is a bit broken, since it only negotiates an x8 link width. But all of these issues crop up in an x4 slot too, so I think this is unrelated.
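
For anyone wanting to check the negotiated link width on Linux without GPU-Z, here is a minimal sketch. It assumes the standard sysfs attributes `vendor`, `current_link_width` and `max_link_width` are present, and filters on vendor ID `0x10de` (the 10DE quoted above). The helper names `describe_link` and `scan` are made up for illustration, and this only helps on a system where the card actually enumerates.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # matches the 10DE vendor ID quoted above


def describe_link(current_width, max_width):
    """Summarise a link width pair; hypothetical helper, not part of any tool."""
    current, maximum = int(current_width), int(max_width)
    if current < maximum:
        return f"x{current} (device supports up to x{maximum})"
    return f"x{current} (full width)"


def scan(sysfs_root="/sys/bus/pci/devices"):
    """Yield (address, summary) for each NVIDIA PCIe function found in sysfs."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return  # not Linux, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        try:
            if (dev / "vendor").read_text().strip() != NVIDIA_VENDOR_ID:
                continue
            current = (dev / "current_link_width").read_text().strip()
            maximum = (dev / "max_link_width").read_text().strip()
            summary = describe_link(current, maximum)
        except (OSError, ValueError):
            continue  # attribute missing or unreadable for this device
        yield dev.name, summary


if __name__ == "__main__":
    for address, summary in scan():
        print(address, summary)
```

A narrower-than-maximum result is also normal when the slot itself is narrower than the card, so it is only a hint, not proof of a fault.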

I tried setting the PCIe link speed to Gen1 in the BIOS, and various combinations of VT-d, SR-IOV support, Above 4G Decoding, Resizable BAR and of course ASPM, but none of it does anything. The systems with an iGPU also don’t see the card as present in the BIOS and won’t let me select dGPU output.

Another weird bit: since the VBIOS on the card is basically a launch-time one (90.02.15.00.05), I tried to update it with HPE’s updater from here, but all 3 versions of the updater just say the card is incompatible or that no supported hardware was found. I find that quite weird, since it does seem to be a genuine HP card.

I plan to eventually put the card into an EPYC-based NAS, which hopefully behaves the same as AM4/AM5, so it shouldn’t be an issue. But I’d rather have a card that works in all systems, so if anyone has any ideas, please help :sweat_smile:

1 Like

It is unlikely to work in an X79 system if it’s using its full BAR space.
You might be able to toggle the BAR space options using the recently unlocked NVIDIA BIOS flasher, but do so at your own risk.

You can find a VBIOS on TechPowerUp and flash it via the NVIDIA BIOS flasher.
Make sure you have CSM disabled, Above 4G Decoding enabled, and ReBAR on auto.

1 Like

Oops, I meant to say Z97 up there; I swapped the numbers around and didn’t notice. Thanks for the info. The Z97 test was more so just to test something non-ASRock, since both of my Alder Lake/Raptor Lake boards are ASRock…

Anyway, I don’t really mind if I can’t get it to work on the old board, but the LGA1700 ones not working is puzzling. I did try pretty much every combination of Above 4G/ReBAR/etc. on those, to no avail.

1 Like

Try using NVFlash and a VBIOS from TechPowerUp.

2 Likes

Tried that (after finally gathering the courage to possibly brick the thing entirely, thank you for the final push to try it). I found a VBIOS one version older on TechPowerUp and flashed it, overwriting the subsystem ID from HP to NVIDIA in the process.

Curiously, after this I was able to use the HP VBIOS updater I found on the HPE support site, so I guess it didn’t like the subsystem ID being HP, which is odd… Anyway, I updated to the latest VBIOS using that, which is 90.02.63.00.04 and dated 2022, so newer than anything I could find on TechPowerUp or Dell’s publicly searchable updaters.

Sadly the card still behaves the same - nothing on the LGA1700 system, works fine on the AM4 system (which is where I did the flashing) - but I guess at least it isn’t a full brick :smiley:

(I did also try taping over the SMBus pins but that didn’t change anything on the Intel system)

Have you tried booting with the NVIDIA VBIOS? It could be something funky with the HP one, like it wanting an HP board, which is just ignored on AMD boards.

Quadros have historically had 256MB BAR spaces so they work on consumer workstations, but with the advent of ReBAR they could have exposed the whole BAR space.

In addition to full BAR and 256MB BAR, there was an odd 1GB BAR option when using the unlocked NVFlash.

Maybe HP is using the weird 1GB one and Intel doesn’t like it.

1 Like

Alright one last round of testing for the night.

Flashed back the launch day NVIDIA VBIOS and tried stuff, alongside SMBus pin masking again. No change.

When it boots up on the AM4 system I see the BAR size as 256MB, so that shouldn’t be an issue. At least it doesn’t seem like it’s trying something fancy.
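
For reference, the BAR sizes can also be read on Linux from the device’s sysfs `resource` file, where each line holds a start address, end address and flags in hex. This is only a sketch: the function names are made up, and the example address `0000:01:00.0` is a placeholder that will differ per system (check lspci for the real one).

```python
import sys
from pathlib import Path


def bar_size_bytes(resource_line):
    """Size of one region from a sysfs `resource` line ('start end flags', hex)."""
    start, end, _flags = (int(field, 16) for field in resource_line.split())
    if start == 0 and end == 0:
        return 0  # unused region
    return end - start + 1


def list_regions(pci_address):
    """Return {region index: size in bytes} for the non-empty regions."""
    path = Path("/sys/bus/pci/devices") / pci_address / "resource"
    lines = [line for line in path.read_text().splitlines() if line.strip()]
    sizes = {index: bar_size_bytes(line) for index, line in enumerate(lines)}
    return {index: size for index, size in sizes.items() if size > 0}


if __name__ == "__main__":
    # Placeholder address; pass the real one as the first argument.
    address = sys.argv[1] if len(sys.argv) > 1 else "0000:01:00.0"
    for index, size in list_regions(address).items():
        print(f"region {index}: {size / (1024 * 1024):g} MiB")
```

A 256MB aperture shows up as one region of exactly 256 MiB; a full-size ReBAR aperture would show up as a much larger region instead.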

2 Likes

I bring “good” news. A friend of mine tipped me off that the x8 link-up and the weird Intel incompatibility are likely related to bad PCIe lane training, so I did some testing specifically for that and yep!

PCIe lane 0 (so literally the first lane) is dead. I then looked a lot closer at the card and noticed one of the decoupling capacitors on the RX0 trace was knocked to the side. I only lightly touched it and it fell off, so it seems the seller knocked it at some point in the past (honestly I’m surprised the cap stayed there during shipping). Oh well, NVIDIA, you should have put backplates on these things.

Anyway, as for why the card works in some systems but not others: PCIe lane reversal. When lane 0 fails to link up, the link flips over to using only the latter 8 lanes. Likewise, in an x4 slot the card negotiates at x2, in an x2 slot at x1, and in an x1 slot it obviously won’t link up at all.
Lane reversal is supported on Intel and AMD server platforms, which explains why the card worked in the seller’s Intel workstation. It is not supported on Intel consumer hardware, but interestingly it seems both AM4 and AM5 do support lane reversal, even on chipset PCIe slots, so that is something new I’ve learnt.
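
To make the widths above easier to follow, here is a toy model of the behaviour described in this post. It is a deliberate simplification and not how real link training is implemented: it assumes a link must start at logical lane 0 and use a contiguous power-of-two number of lanes, and that a host with lane reversal can number the slot’s lanes from the other end. The function name `negotiated_width` is made up for illustration.

```python
def negotiated_width(slot_lanes, dead_lanes, reversal_supported):
    """Widest link width in this toy model, or 0 if no link forms.

    slot_lanes: number of physical lanes wired in the slot (1, 2, 4, 8 or 16)
    dead_lanes: set of physical lane numbers that cannot train
    reversal_supported: whether the host can reverse the lane numbering
    """
    # Normal orientation: logical lane 0 is physical lane 0.
    orientations = [list(range(slot_lanes))]
    if reversal_supported:
        # Reversed orientation: logical lane 0 is the highest physical lane.
        orientations.append(list(reversed(range(slot_lanes))))

    best = 0
    for order in orientations:
        for width in (16, 8, 4, 2, 1):
            if width > slot_lanes:
                continue
            # The link uses logical lanes 0..width-1; all of them must work.
            if all(lane not in dead_lanes for lane in order[:width]):
                best = max(best, width)
                break
    return best


if __name__ == "__main__":
    for slot in (16, 4, 2, 1):
        for reversal in (True, False):
            width = negotiated_width(slot, {0}, reversal)
            result = f"x{width}" if width else "no link"
            print(f"x{slot} slot, reversal={reversal}: {result}")
```

With physical lane 0 dead, the model gives x8, x2, x1 and no link for x16, x4, x2 and x1 slots when reversal is available, and no link at all without it, which matches what I saw.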

Anyway, the seller sent me a GPU-Z pic showing that the second card he has links up at x16. He offered a replacement, and I ended up saying I will just buy both. I got a discount as the first card is effectively defective, and I should be receiving the second card in the coming days. The first card will be destined for AMD systems only, or I could get it repaired I guess, but there is no real need since it works fine on AMD.

5 Likes

There was this ASRock board that had something funky with its PCIe and wouldn’t boot if the last half of the PCIe lanes were present, so you could put Scotch tape on the last half of the connector and it worked.

I did a similar thing with the Chinese Alder Lake board Wendell had gotten. They told people the x16 slot didn’t work but still included it, with a cover in the slot,
but if you tape over the last 8-12 lanes of the connector you can use GPUs.

1 Like

Alright, and just for complete confirmation: I just received the second card, and it links up at x16 and works on Intel just fine with the stock HP VBIOS.

So that confirms it (not that it wasn’t confirmed before already given the SMD cap literally fell off the PCB). After thinking about it more I will likely send the first card for a repair just to have two fully working ones.

Thank you for the advice and help throughout @GigaBusterEXE!

1 Like