Anyone looked at TRX40 IOMMU Groups and Passthrough yet?

I have spent hundreds and hundreds of hours building a midlife crisis of a PC:

Handbuilt custom case
Full water cooling loop
with an old EVGA SR-2 motherboard

Attempting to do a “2 gamers 1 cpu” + NAS + Docker / VM box all in one. It almost worked. But the stability / compatibility of that old SR-2 board just isn’t up to it.

After trying everything under the sun for months I’ve given up on that core system and hope to move to either a 3950x or the 3960x.

I’ve seen some positive experiences with the 3900x on the ASUS x570-e gaming boards regarding pci-e passthrough and good iommu groups.

But I think I would rather go with a 3960x threadripper to allow an upgrade path down the road should I need it.

That entire TRX40 platform is still super new, but I figured if anyone had experience attempting pcie passthrough on one of the new 3rd gen threadripper boards it would be someone here.

Is the Zen 2 Threadripper gear actually shipping “retail”, or is AMD just accepting orders and sending press kits to tubers? All the typical YT channels seem to be slathering praise on the severely bandwidth limited 3950x and forgetting about TRX40.

As far as I know it is. I mean it seems to be sold out everywhere, but I kind of expected that.

Assuming I can get confirmation at some point that PCI passthrough and decent IOMMU groups can happen on the TRX40 boards, the only thing I feel like I’d be missing is a board with more physical PCI-E slots. Every 3rd gen Threadripper board I’ve seen has at most like 4 slots. :frowning:

I need to fit at minimum

2x GTX 980s

Some other video card for the host (unraid). I don’t think Threadripper has onboard video, but I also never looked…

My M1015 SAS controller card

At least 1 USB card; hopefully the two-controller card I have breaks out into different IOMMU groups on the new board as well.

And down the road, a 10 Gig NIC
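
For anyone who wants to check how a board actually splits devices up before planning a layout like this, here is a minimal Python sketch. It assumes a Linux host with the IOMMU enabled, where the kernel exposes groups under /sys/kernel/iommu_groups. The function names are made up for illustration, and it only prints PCI addresses, so pair it with lspci to see which card is which.

```python
from collections import defaultdict
from pathlib import Path


def group_devices(paths):
    """Map IOMMU group number -> sorted PCI addresses.

    Expects paths shaped like .../iommu_groups/<group>/devices/<pci address>.
    """
    groups = defaultdict(list)
    for p in paths:
        parts = Path(p).parts
        idx = parts.index("iommu_groups")
        groups[int(parts[idx + 1])].append(parts[-1])
    return {g: sorted(devs) for g, devs in sorted(groups.items())}


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Read the groups from sysfs; returns {} if the directory is missing."""
    base = Path(root)
    if not base.is_dir():
        return {}
    return group_devices(str(p) for p in base.glob("*/devices/*"))


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        print(f"group {group}: {', '.join(devices)}")
```

A device that shares a group with something the host still needs generally can't be passed through cleanly on its own, which is why the USB card splitting into separate groups matters.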

Need a video from “someone” into the hardware side of things, testing all the bifurcation cards out there that can add more slots to a Threadripper system.

The ones you have to order in person from a company and more or less beg them to sell you just one, that’s no good, need another solution.

I don’t think the TRX40 boards are sold out, more than likely they’re not really in production yet, only announced.

Just a reminder, if you purchase a TRX40 system, you likely will not get an upgrade option at all. The next Zen architecture will come out this coming 2020 with a vastly different architecture that is very likely not compatible with the system you have in mind. TRX40 will be the last and only one of its kind with no upgrade path in sight. TRX40 will also demand a new motherboard, unlike the previous Ryzen systems.

Very good point, I guess I meant that I’d have other CPUs to upgrade to within the same architecture.

That’s as opposed to going AM4 with a 3950x, which I feel pretty confident will be the top of the line for that socket. Whereas if (big if…) I somehow outgrew 24 cores, I could potentially upgrade years down the road to the 32 or maybe 64 core versions.

I know I saw several as available to ship from newegg. When I was talking about being sold out I was referring to the CPUs, both of the new 3rd gen Threadripper parts: the 3960x and 3970x.

What happened to the AMD promise from a few weeks ago to support sTR4+ for “several years”?

They will support it for RMA, software bug fixes, firmware, bios and driver updates. You just have no future upgrade paths.

You could just use one of the GPUs to install unraid and once you can get to the webpage then tag that GPU for passthrough :stuck_out_tongue:

There will be at least one more CPU upgrade to this platform (Zen3/Milan) as they are also Sp3/DDR4. The one after will not be compatible.

I have the TRX40 platform w/ a 3960, and it’s interesting… not straightforward, and a little buggy. The 1660 Ti works 100% of the time, but only once per reboot: if you shut down the VM, you can’t start it again. The Quadro4000 (i.e. 2070) refuses to work; libvirt just says the VM crashed. Both these cards worked flawlessly on a 1950X X399 platform. Both motherboards are Zenith Extremes.

Libvirt/QEMU has horrible support for 6 chiplet Zen2s, with no way to get the correct cache layout. If you use the new “-smp dies” parameter, it works for 2 dies, but not more than two (i.e. if you have 3 dies with 3 cores each, it just appears as 6 cores, not 9, in Windows).
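
To make the arithmetic concrete, here is a small Python sketch that builds a -smp value for a given layout. It assumes the sockets/dies/cores/threads keys of QEMU's -smp option (dies arrived in QEMU 4.1); the helper name smp_arg is hypothetical. Generating the string says nothing about whether the guest will actually see the layout correctly, which is exactly the problem described above.

```python
def smp_arg(sockets, dies, cores, threads):
    """Build a value for QEMU's -smp option.

    cores is per die, threads is per core, dies is per socket.
    The vCPU total must equal the product of all four, so compute it here.
    """
    for name, value in (("sockets", sockets), ("dies", dies),
                        ("cores", cores), ("threads", threads)):
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    total = sockets * dies * cores * threads
    return f"{total},sockets={sockets},dies={dies},cores={cores},threads={threads}"


if __name__ == "__main__":
    # The 3 dies x 3 cores case from the post, without SMT: 9 vCPUs expected.
    print("-smp", smp_arg(sockets=1, dies=3, cores=3, threads=1))
```

If the guest reports fewer cores than the total computed here, the topology is being collapsed somewhere between QEMU and the guest OS rather than in the arithmetic.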

There is probably quite a lot of work to do. Will give my Vega 64 a spin later today. Performance is decent, but not so much better vs the 1950X (that was very finely tuned, so, probably room to improve on the new platform).

Update #1
Resolved the issue w/ the Quadro crash: you can’t use the new USB controller on nVidia cards; it instantly freezes & crashes the guest. Confirmed with two different USB hubs that worked when the cards were in the X399 mobos.

Topology is still a challenge. Using TopoExt and the EPYC model, you can get two dies to look “correct” if you also use smp, but if you try for more, or want only non-smp cores, they get ignored or look wrong (all cores share L3, for example, which is not true). Also, dies get treated as sockets in Windows, which looks funny, but behaves correctly.

Greetings,

From which establishment(s) did you order your gear? I ordered on the 25th and all I could find was pre-order option from B&H for the CPU (3970x). I ordered, and have received, a Zenith 2 Extreme.

So I’m sitting around w/nearly a full system’s worth of parts representing ~$3K just waiting for AMD to actually ship cpus. Seems like the 25th was a paper launch and aside from the few they made for the tech press they don’t really exist.

So, again, what ‘magic’ did you employ to get one? :slight_smile:

Hi NaCl

I live in Sweden, there are stores that still have them in stock or have had re-fills. I ordered it on the day (didn’t pre-order) and it shipped same day, had it on the 26th. Took a while to put together.

Update #2
Another thing to note with the M/B: I bought a kit of G Skill Neo 3600 / 16-16-16-36 32GB (4x8) memory. On this motherboard it cannot do 3600 stably, and it shuts off if you try to push MHz or timings even the smallest bit higher. Memtest fails invariably on the first pass.

I have an older set of GSkill Fury 3200C14 memory; that memory runs fine @ 3400, and even boots / works (but with Memtest errors, no tweaking to make it better) @ 3733. It seems there is some incompatibility between these memories (or my specific copy of them) and the Zenith Extreme II.

While using my older GSkill Furies in the TRX, I put the Neos in the X399, which passed memtest @ 3200 just fine (that mobo/chip has a hard time above 3400 and I just wanted to get it back up and running).

“Vastly Different” is likely an overstatement. AMD had to break backwards compatibility with TR39X0: getting the IO die to talk to memory, chipset and direct PCIe lanes required them to re-assign pins in the socket.

Most of the changes I’ve seen rumored are to the distribution of CCXs and cache, as well as a lot of (unspecified) micro-architectural details to improve floating point.

I’d wager that the packet switched fabric and the i/o die will be able to hide most of the differences from the motherboard.

Look into how to ensure your hypervisor is using d3 to d0 instead of FLR reset methods. I had to do this on my ESXi host to be able to reboot my VMs if I passed through the USB ports that stem from the CPU on my x470 board, and it solved some hitching issues with my 1080 Ti.

Hi,

That sounds interesting, can you give some (more) hints at what to look at?

What types of hints? I’m not sure what kind of info you’re looking for really. Here’s some random stuff:

  • One thing I found out for Ryzen CPUs concerns core pinning: a core and its SMT sibling aren’t numbered right next to each other. To explain: the Ryzen 3700x is 8c/16t, physical cores are 0-7, and the SMT threads are 8-15. So core 0 has threads 0 and 8, core 1 has threads 1 and 9, etc. So if you want to pin a VM with 4c/8t all from the same CPU die, then you’d pin 0-3 and 8-11.
  • Found some QEMU documentation with a string to find which devices support resetting on your current Linux kernel version. I’d check it out, as what I mentioned may be exclusive to VMware ESXi and not KVM/QEMU.
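
The pinning rule in the first bullet can be sketched in a few lines of Python. This is only an illustration under the numbering described there (sibling of core c is c + number of physical cores); the function names are hypothetical. Since the numbering can differ between platforms and hypervisors, the sketch also includes a reader for Linux's thread_siblings_list sysfs file so the assumption can be checked on the actual host.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,8' or '0-3,8-11' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_siblings(cpu, root="/sys/devices/system/cpu"):
    """Return the logical CPUs sharing a physical core with `cpu` (Linux only)."""
    path = Path(root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(path.read_text())


def pin_set(first_core, n_cores, total_cores):
    """Logical CPUs to pin for n_cores cores plus their SMT siblings.

    Assumes the layout from the post: the sibling of core c is c + total_cores.
    """
    cores = list(range(first_core, first_core + n_cores))
    return cores + [c + total_cores for c in cores]


if __name__ == "__main__":
    # 4c/8t out of an 8c/16t part, as in the 3700x example.
    print(pin_set(0, 4, 8))
```

If read_siblings(0) on the host comes back as something other than [0, 8] on an 8-core part, the c + total_cores assumption doesn't hold there and the pin list should be built from the sysfs values instead.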

Thanks for the reply, the pinning etc, I am well versed in, but the documentation bit for reset is kind of new to me, I will look into it.

The most curious thing is that it worked on the other platform, and from the sounds of it, as long as the kernel knows of the device, it should work. Except if the platform hides / changes what can be done with the add-in cards on the PCIe bus, which I can see happening.
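
One way to see what the kernel thinks it can do with each device is a rough Python sketch like the one below. This is not the string from the QEMU documentation mentioned earlier (it isn't quoted in this thread); it relies on two Linux sysfs facts: a `reset` file appears under /sys/bus/pci/devices/<address>/ when the kernel knows a way to reset that function, and newer kernels also expose a `reset_method` file listing the methods. The function name is made up for illustration.

```python
from pathlib import Path


def devices_with_reset(root="/sys/bus/pci/devices"):
    """List (pci_address, reset_methods) for devices the kernel can reset.

    reset_methods is empty when the kernel is too old to expose
    reset_method or the file cannot be read.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    found = []
    for dev in sorted(base.iterdir()):
        if not (dev / "reset").exists():
            continue
        try:
            methods = (dev / "reset_method").read_text().split()
        except OSError:
            methods = []
        found.append((dev.name, methods))
    return found


if __name__ == "__main__":
    for address, methods in devices_with_reset():
        print(address, " ".join(methods) if methods else "(method not reported)")
```

Running it on both the X399 and TRX40 hosts with the same card installed would show whether the platform changes which reset options are offered for it.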

Thanks again!