How to SR-IOV Mod the W7100 GPU

You can always check Wikipedia, which has tables with chip codenames, SU:TMU:ROP counts, etc.



As you can see, it’s basically an R9 285 with 8 GB of VRAM. It should be a little behind the RX 470.

The big question is how robust W7100 VF passthrough is. I guess GIM is only tested against a few LTS kernels, and the whole thing can be extremely sensitive to the environment. Latest stable kernels, dozens of VM start-stop cycles, VM crashes, suspend-resume: can it handle it all?

Soon enough I might be able to find out. I snatched a W7100 for $100 and it is on its way right now :smiley:

I will flash it and place it into my colo server, where I would like to have multiple GPU-accelerated VMs, with both Windows and Linux, because it has 0.5 ms latency to school (as if I’m going back there anytime soon… :sweat_smile:). God knows, maybe it will even work with Docker :thinking:

The colo server luckily only needs to be up for like 30min a day for a scheduled rsync backup :rofl:

@wendell I’ve got a former AMD engineer working on something with me. We’re currently in discussions about gaining access to some internal software branches, and on top of that I’m trying to get ahold of newer MxGPU hardware beyond what’s available outside of sole purchaser agreements. I’m looking into reproducing some of the steps involved here, but applied to a newer architecture with blobs produced from newer MxGPU devices.

Best case scenario, I’d like to resolve these issues (1 2), specifically with AMF encoding, via access to the internal gim repo and/or by manipulating EEPROM firmware to enable SR-IOV on newer consumer hardware using flashrom and some blobs I might be able to pull off a newer SR-IOV device (newer than the S7150 x2 or V340). Someone on my team is also looking into tampering with the blobs themselves to see if we can unpack them, similar to the methodology applied here (obviously as it applies to GPU firmware stored on the EEPROM rather than CPU firmware, as is the case with me_cleaner/coreboot).

I could use some help on this if anyone is interested in peeking down the rabbit hole with me. Happy to share what I find if I can get a hand from some people.

I would like to support this only if AMD okays it, or on obsolete hardware. It gets weird too if non-public access things are used, so please don’t do that, as there are potential legal implications. :slight_smile:

AMD might release bits and pieces to help us dot some i’s and cross some t’s as a sort of unofficial assist, but I’d rather not give them a reason to resist these types of efforts.

Intel’s GVT-g is a go as well, so this kind of thing may be accessible sooner than we thought.

@wendell
I wouldn’t be dumping a repo we get through an agreement with AMD. Definitely not interested in legal issues or doing a disservice to the SR-IOV space. I just want bug fixes for a production workload (specific customer that is asking - also happens to be the AMD engineer I’m working with). The same encoding job is currently being handled by CPU encoding instead of GPU encoding (maxing out all cores on 2 Xeons with 48 threads).

If necessary we’ll purchase whatever volume is needed to get the codebase which is apparently only available to select vendors who have entered into a sole purchaser agreement for large volumes of new stock MxGPU hardware. I’d obviously rather not go that route if I can avoid paying for a huge volume order to get it. Other route is writing the encoder patches myself which would be much harder and far more time consuming.

On newer hardware conversions (SR-IOV being enabled on architectures beyond the s7150 and w7100) dumping a blob and reflashing elsewhere seems fine.

Beyond simple reflashing of bins to other cards I’d like to begin tampering with the blobs using a similar approach to that of Flashrom+Coreboot+ME_Cleaner (dump EEPROM blob via SPI flashing device, dump blob contents or reverse engineer via tools such as: 1, 2, 3, 4, 5 then finally making use of a hash collision to repack the bin aligning the bits such that the card does not detect a modification to the binary blob loaded firmware).

Best case is I’d like to get some deeper hooks in the firmware and begin to have a more definitive grasp on forcing SR-IOV on across the board of current generation AMD devices. It doesn’t seem possible without pwning the blobs. AMD won’t let go of SR-IOV as an enterprise feature and that’s not a tenable scenario with my long term plans to incorporate it into consumer software.

I’m aware of what I’m getting myself into and that there’s a host of ways it can go wrong (code signing, hash algorithms used without existing collisions, etc.). I still think it’s worth trying given the possible benefits that could come if it works.

Also been following this for several months.
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.2-IOMMU-AUX

I’ll fork my project to use it if I can get ahold of the hardware to make it work. Measured against current production performance gvt-g sounds like it could be a game changer. I’m just not counting on something panning out that’s not here (and may or may not be as maliciously implemented as GRID and/or SR-IOV when it finally lands).

Hi there and merry Christmas!

I have a W7100 with a GD25Q41B (512 KB) flash chip lying around and tried to flash the S7150 BIOS posted by a guy named kino0924, which I found (on TechPowerUp, overclock.net and here) while researching the interwebs on the matter. I also tried some from the TechPowerUp website (though not necessarily classified as S7150).

Long story short, I ended up with a card that is no longer recognized by “amdvbflash -(a)i” after a power cycle. I need to use the 1+8 pin method to get the card recognized again.

I know that @wendell is a busy guy but could you maybe share your rom (since you seem to have done the patch in the past) or some hints on what to try? Or can anyone elaborate if I have to make some adjustments to the file?

Any hint would be greatly appreciated.

Cheers!

Maybe it is because the BIOS file supports only Hynix H5GC4H24AJR RAM and my card has Hynix H5GQ4H24MFR RAM? I’m no VGA BIOS expert …

If so, is it possible to combine those two files?

I’m interested in doing the mod as well. Can you tell me what the 1+8 pin method is, and can you flash the W7100 firmware back to fix the card?

You have to short pins 1 and 8 of the card’s EEPROM chip while booting the computer to get the card into some kind of fallback mode. When booted this way, it shows up as flashable in amdflash again and you can unbrick the card with a BIOS backup you hopefully made earlier.

You can search for this method in your preferred search engine for further details.
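Since the unbrick path depends entirely on having a good backup, it may be worth dumping the ROM twice and checking that both reads agree before flashing anything. A minimal sketch of that check in Python; the file names in the usage note are hypothetical, and this only confirms the two reads are identical, not that the image itself is valid:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def dumps_match(first_dump, second_dump):
    """True if two ROM dumps have the same size and the same SHA-256."""
    a, b = Path(first_dump), Path(second_dump)
    if a.stat().st_size != b.stat().st_size:
        return False
    return sha256_of(a) == sha256_of(b)
```

For example, `dumps_match("w7100_read1.rom", "w7100_read2.rom")` should return `True` before you trust either file as your backup.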

I ended up in some kind of locked-up state after flashing various S7150 BIOSes I found online, and even with a backup that a friend of mine personally took from an S7150 card.
I’m not sure if this is an expected behavior. Unfortunately I haven’t had the time to investigate further.

I’ll probably have a spare server in a few days where I can install Proxmox to test my theory, though I was hoping to get a few hints here on whether I’m on the right or wrong track.

Thanks for your response! Maybe you can try with a hardware programmer like CH341A?

Good point. I’ve ordered one and will try when it gets here!

Could someone assist me in locating the S7150.bin file? I have been searching TechPowerUp. The file I found was 228431.rom, which is 128K, but the dump I did from the W7100 card is 512K. I cannot locate a 512K file. Thank you for your help.
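One way to compare a 128K download with a 512K dump is to read the size the ROM image declares about itself. A rough sketch in Python, assuming the file starts with a standard PCI expansion ROM header (0x55AA signature, pointer to the "PCIR" structure at offset 0x18, image length in 512-byte units at PCIR + 0x10). It only looks at the first image, while a VBIOS file can chain several, so treat the result as a hint rather than the full picture:

```python
import struct


def describe_rom(data):
    """Summarize the first PCI expansion ROM image found in `data` (bytes)."""
    if len(data) < 0x1A or data[0:2] != b"\x55\xaa":
        raise ValueError("no 0x55AA option ROM signature at offset 0")
    # Offset 0x18 holds a little-endian pointer to the PCIR data structure.
    (pcir,) = struct.unpack_from("<H", data, 0x18)
    if pcir + 0x12 > len(data) or data[pcir:pcir + 4] != b"PCIR":
        raise ValueError("PCIR data structure not found")
    vendor, device = struct.unpack_from("<HH", data, pcir + 4)
    # Image length is stored in 512-byte units at PCIR + 0x10.
    (blocks,) = struct.unpack_from("<H", data, pcir + 0x10)
    return {
        "file_size": len(data),
        "image_size": blocks * 512,
        "vendor_id": vendor,
        "device_id": device,
    }
```

Running it on both files, e.g. `describe_rom(open("228431.rom", "rb").read())`, shows whether the declared image size is much smaller than the chip capacity. AMD’s PCI vendor ID is 0x1002, which is a quick sanity check on the result.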

Please help me install the GIM driver on Kubuntu 20.04 :pleading_face:

Please excuse the intrusion.

Are you sure the actual payload is 512K or is it just zero-padded? Open it in a hex editor and take a look.
Often they just use larger chips for supply-chain reasons and/or to share the same chip across multiple cards.
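If you would rather not eyeball it in a hex editor, the same check can be scripted. A small sketch in Python that strips trailing 0x00 and 0xFF bytes (erased flash reads back as 0xFF) and reports how much is left. It assumes the padding sits at the end of the file, and a payload that legitimately ends in 0x00 or 0xFF would be undercounted by a few bytes:

```python
def payload_length(data, pad_bytes=(0x00, 0xFF)):
    """Return the length of `data` after stripping trailing padding bytes."""
    end = len(data)
    while end > 0 and data[end - 1] in pad_bytes:
        end -= 1
    return end


def padding_report(path):
    """Report total size, apparent payload size and padding size of a dump."""
    with open(path, "rb") as f:
        data = f.read()
    used = payload_length(data)
    return {"file_size": len(data), "payload": used, "padding": len(data) - used}
```

If `padding_report` on the 512K dump shows a payload of roughly 128K or less, the extra space is most likely just unused flash.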

Has anyone managed to run this successfully on a recent kernel on a TRX40 Aorus Master/Xtreme?
I had some luck in building the GIM driver on Fedora 34 but loading it just locks the machine.
SysRq’d stack trace shows the driver’s stuck at atom_exec_bios_table.

I haven’t been able to install Proxmox 6.4. The installer doesn’t start due to graphics driver issues, I assume.

Are you using it as an add-in or as the primary GPU? Once you mod it, it only works as a secondary card.

As an add-in in the 3rd slot. It shows up in lspci after flashing the VBIOS but unfortunately gives no output. It’s a Barco MXRT-7600.
1st slot is occupied by a 6800 XT.
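Beyond lspci, one thing that can be checked from the host is whether the flashed card advertises an SR-IOV capability at all. A hedged sketch in Python that walks the Linux PCI sysfs tree and reports `sriov_totalvfs` for AMD devices (vendor ID 0x1002). The `sysfs_root` parameter is only there so the function can be pointed at a test directory, and a non-zero value says nothing about whether GIM will actually load:

```python
from pathlib import Path


def amd_sriov_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return {PCI address: total VFs} for AMD devices that expose SR-IOV."""
    found = {}
    for dev in sorted(Path(sysfs_root).iterdir()):
        vendor_file = dev / "vendor"
        totalvfs_file = dev / "sriov_totalvfs"
        # Devices without the SR-IOV capability have no sriov_totalvfs file.
        if not vendor_file.is_file() or not totalvfs_file.is_file():
            continue
        if vendor_file.read_text().strip().lower() != "0x1002":
            continue
        found[dev.name] = int(totalvfs_file.read_text().strip())
    return found
```

Calling `amd_sriov_devices()` on the host returns an empty dict if no AMD device exposes the capability, which would point at the VBIOS flash rather than the driver.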

I’ve had trouble getting it to work with the AMDGPU open source drivers, because under Fedora they crash (the 6800 XT works fine). I’m currently trying to get it to work under Debian testing with the AMDGPU Pro drivers and the original VBIOS. If it works, I’m going to try flashing it again.

Edit: I was able to get the W7100 working with the regular BIOS on Debian testing. After flashing the S7150 BIOS, modprobe gim just panics but, fortunately, doesn’t end up locking the machine up.
panic.txt (8.7 KB)

Looking at this in May of 2021, I am seeing the W7100 going for over $400. I was wondering if there is a way to turn a Vega 56 into a mini V340, since I have one at arm’s length.

Did anyone get past that hot reload problem with the W7100? I’ve got the 8 GB W7100 with the 512 KB 25Q41B dual/quad EEPROM.

I get to the part where this payload gets delivered (dmesg -wH above):
https://raw.githubusercontent.com/GPUOpen-LibrariesAndSDKs/MxGPU-Virtualization/master/drv/gim_vbios_patch.h

Then the whole machine hard hangs. I wonder if this is the hot reload bug that’s been discussed… Anyone have any ideas? I could really use some help here as I’m finding myself somewhat stuck.

So far I have flashed the S7150 firmware with amdvbflash, made the modifications to GIM, and compiled it as described in @wendell’s OP. The problem comes when trying to modprobe gim: that is when we get the crash.
