Asus ROG Zenith Extreme X399 Threadripper IOMMU Details

Hi all -

I’ve been away for a short while, mostly running smoke tests on the parts that arrived for my Ballin’ AF Workstation build and waiting on another (yikes, my fourth Corsair case!) tempered glass case. That RGB nonsense is growing on me…

In any case, here’s the IOMMU mapping for the ROG Zenith Extreme X399 board — details in a gist.

I’m a tad confused about the Aquantia NIC that’s sitting in PCIEX8_2, so I wonder if @wendell @MisteryAngel or anyone else here could chime in. Here’s my muddled thought stream:

  • In the lspci -tv output, the Aquantia shows up under [41]. How does 0000:40, if at all, map to the IOMMU groups?
  • Do I need to enable a UEFI option for the IOMMU mapping to ‘fix’ itself? I thought I saw something along those lines; I’ll follow up on this in due course.
  • lspci output: it looks like the entire PCH is in group 11? Ugh…
  • Group 6 has a lone USB controller; I guess that could be used to pass peripherals through?
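
The numbers lspci prints aren’t the groups themselves; the kernel exposes the grouping under sysfs. A minimal sketch for dumping it (assuming the standard /sys/kernel/iommu_groups layout, which is empty when the IOMMU is off or not enumerated):

```shell
#!/bin/sh
# Print "IOMMU group N: <pci-address>" for every device the kernel has
# placed in a group. The directory is empty (or absent) when the IOMMU
# is disabled or not enumerated.
list_iommu_groups() {
    base=${1:-/sys/kernel/iommu_groups}
    for dev in "$base"/*/devices/*; do
        [ -e "$dev" ] || continue          # glob matched nothing
        group=${dev%/devices/*}            # strip trailing /devices/<addr>
        printf 'IOMMU group %s: %s\n' "${group##*/}" "${dev##*/}"
    done
}

list_iommu_groups
```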

I haven’t tried throwing M.2 NVMe drives in (yet) and will be testing that with Asus’s DIMM.2 slot later this weekend. I’m taking Veterans Day off, so I get a long weekend to play with this stuff - if you have any questions and want me to try anything, here’s your chance!

You can also tweet me if you like

Other links of interest:

1 Like

Oh, this issue hasn’t gone away (with Nvidia cards), although I haven’t tested a more recent UEFI (yet):

[10482.195548] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
[10482.195565] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
[10482.195569] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
[10482.195573] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
[10482.195576] pcieport 0000:00:01.1:    [ 6] Bad TLP
[11144.059410] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
[11144.059428] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
[11144.059432] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Transmitter ID)
[11144.059437] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00001000/00006000
[11144.059439] pcieport 0000:00:01.1:    [12] Replay Timer Timeout
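
For what it’s worth, those status/mask words decode against the PCIe Correctable Error Status register; a quick sketch decoding the two values from the log above (bit positions are from the PCIe spec’s AER capability):

```shell
#!/bin/sh
# Decode a PCIe Correctable Error Status word into the bit names the
# kernel prints. Bit positions follow the PCIe spec's AER capability.
decode_aer() {
    status=$(( $1 ))
    [ $(( status & 0x0001 )) -ne 0 ] && echo "Receiver Error"
    [ $(( status & 0x0040 )) -ne 0 ] && echo "Bad TLP"
    [ $(( status & 0x0080 )) -ne 0 ] && echo "Bad DLLP"
    [ $(( status & 0x0100 )) -ne 0 ] && echo "REPLAY_NUM Rollover"
    [ $(( status & 0x1000 )) -ne 0 ] && echo "Replay Timer Timeout"
    return 0
}

decode_aer 0x00000040    # first event  -> Bad TLP
decode_aer 0x00001000    # second event -> Replay Timer Timeout
```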

that [41] isn’t showing the IOMMU group. I’m not sure what it is showing, but it’s definitely not the group.

That makes sense. Everything is connected to a bridge that’s using a single PCIe connection (I think it’s an x4) to go back to the CPU. You’ll see this on a lot of devices. Just grab some USB controllers.

Group 6 also has an unknown device at 0a:00.2. There’s a bit of info on it here, though it’s more or less unknown what this is.

Due to your newfound interest in RGB, you’ve lost your Threadripper authorization. You are hereby required to surrender your workstations to me at your earliest convenience.

3 Likes

:rofl: :rofl: :rofl:

Let me rephrase it slightly – what’s the reason for 0000:40 to appear as a separate ‘root’ node in lspci vs. 0000:00?

Point being, the Aquantia doesn’t show up in the IOMMU listing at all… hmm!
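
One quick way to confirm that, assuming the usual sysfs layout: every device the kernel has grouped carries an iommu_group symlink, so its absence means the device isn’t in any group at all. The 0000:41:00.0 address below is just my guess at the Aquantia’s slot; substitute the one from lspci -D:

```shell
#!/bin/sh
# Report which IOMMU group a PCI device sits in, or "none" if the kernel
# hasn't grouped it. Takes the device's sysfs directory as its argument.
group_of() {
    if [ -e "$1/iommu_group" ]; then
        basename "$(readlink "$1/iommu_group")"
    else
        echo "none"
    fi
}

# 0000:41:00.0 is a guess at the Aquantia's address; adjust as needed.
group_of /sys/bus/pci/devices/0000:41:00.0
```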

I haven’t seen this before. I don’t do much IOMMU stuff with server/workstation boards. I’ll have to defer to those with more experience.

1 Like

Just for context, this is the mapping on the Gigabyte Threadripper box - https://gist.github.com/bsodmike/f481dce37e6e9fd3a051742016791017

Notice how the GTX 1070 sits in the 0000:40 block, yet gets its own IOMMU group (No. 18).

CC @wendell @ryan @kreestuh

Here’s an updated IOMMU/lspci mapping with the latest UEFI 0701 on the ROG Zenith

Hardware changes now include the addition of a Samsung 960 PRO 512GB M.2 NVMe SSD in the ROG DIMM.2 (R slot), which shows up in the same 0000:40 block in lspci but isn’t shown at all in the IOMMU listing.

IMG_3873

Should “Enumerate all IOMMU in IVRs” be enabled? cc @wendell @ryan @kreestuh

1 Like

Success - The Aquantia 10G card & 960 PRO NVMe get their own IOMMU groups!

Setting “Enumerate all IOMMU in IVRs” as Enabled did the trick!
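
For anyone following along, a cheap before/after check is just counting the group directories (a sketch; the path is the standard sysfs location):

```shell
#!/bin/sh
# Count IOMMU groups; run before and after flipping the UEFI setting.
# Prints 0 if the directory is absent (IOMMU off entirely).
count_groups() {
    ls "${1:-/sys/kernel/iommu_groups}" 2>/dev/null | wc -l
}

count_groups
# dmesg | grep -i 'AMD-Vi'   # enumeration messages, if you want detail
```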

cc @SgtAwesomesauce

1 Like

Noice!

Threadripper is looking really good. Stop tempting me!

1 Like

If you decide to take the dive, I highly recommend the Zenith Extreme board; so much so that I ordered a second one to replace my Gigabyte board that’s barely a month old.

Why? The UEFI has so much more polish, and their ROG forum is excellent, with UEFI devs replying regularly and even offering a beta (0801 right now) for download there.

On the Gigabyte front, I really haven’t seen much progress with their latest vF3g. Sure, IOMMU is enumerated across both Threadripper dies by default on the Gigabyte board, but the groupings are pretty nasty.

Since I’m going 10G as well, the Aquantia card bundled with the ZE suits me fine, meaning the ZE mainboard’s actual cost is closer to $300, since most decent new Intel 10G (copper) cards are in the $200 range anyway. Getting features like DIMM.2 (which makes swapping out and accessing M.2 sticks SUUUUUPER easy) just pushes me over the edge in terms of convenience.

I also believe the Asus UEFI has better support for PCIe bifurcation (which the Gigabyte may be lacking?), should you want to, say, run 4x M.2 NVMe in RAID off a single slot via the Asus Hyper M.2 card (only $60… wow).
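
If you try the Hyper M.2 route, an easy sanity check that bifurcation took is counting the NVMe controllers lspci reports; here’s a sketch that reads lspci text on stdin (so it also works on a saved dump like the gists in this thread):

```shell
#!/bin/sh
# Count NVMe controllers in lspci text fed on stdin. With x4/x4/x4/x4
# bifurcation working, each stick on the Hyper M.2 card shows up as its
# own "Non-Volatile memory controller" line.
count_nvme() {
    grep -c 'Non-Volatile memory controller' || true
}

# Live usage (needs pciutils):
#   lspci | count_nvme
```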

Ah, that’s enough for me. I was also looking at the Asrock board. Any experience with it? I think Wendell made a video, I’ll have to check the channel.

I’m not on 10G, but at some point I’ll probably get there. I have really slow peripherals, still running lots of spinning drives and consumer SATA SSDs, so it’s not really worth it for me considering the amount of performance I’d get out of it.

I’m also moving in the next year or so, so it’s not worth running Cat 7 in the house.

FYI, groupings are nasty on the Gigabyte board only, at least from what I’ve seen so far.

Yeah, I can’t exactly recall, but I think he was happy with the Asrock and MSI boards; double-check his reviews first. I don’t recall him covering the Zenith Extreme, though. @wendell does far more thorough NVMe testing (as he has so many FREE toys to play with!!).

I’m hoping to order a Vega 64 by mid-December; only then will I be able to play with GPU passthrough.

For my 10G link, I plan to go SFP+ (where possible), as the non-copper 10G switches are cheaper, especially this Ubiquiti Networks ES-16-XG EdgeSwitch 16 10G 16-Port Managed Aggregation Switch.

I want to go SFP+ to at least ensure my network link will no longer be an IOPS bottleneck for iSCSI, as I’m running a fair few simultaneous shares from the FreeNAS box, most likely with at least one block-level share for each VM that will run on the final XenServer.

Wow… I had a hard time sourcing anything decent beyond Cat 5e, but then again my primary use case was outdoor runs for PoE cameras (from Ubiquiti), so I ended up ordering a 1000 ft roll of Ubiquiti Networks 1000’ TOUGHCableCARRIER Outdoor Shielded Cat 5e Ethernet Cable (Level 2) from B&H (since their DHL rates are the lowest to Colombo). Since I have a fair bit of excess, I’ve been running it indoors as well.

Cat5E is about as good as you’ll get for outdoor, from what I can tell.

Technically, for 10G, you only need Cat6A, but I figure “go big or go home” is an appropriate model for this solution.

1 Like

Haha, that’s been my recent model too :wink: In any case, these are expansions that I expect to last a minimum of 5 years, so even the ‘over the top’ hardware on the TR boxes is in line with that plan…

1 Like

Yep, and when all else fails, X399 is supposed to support the next couple generations of TR CPUs, if I remember correctly.

1 Like

Another reason for ditching the Gigabyte board: it doesn’t have an Intel NIC on board. I tried installing XenServer, but the installer bombs out as it doesn’t have drivers for the ‘Killer NIC’.

On the plus side, the Zenith Extreme has a good ol’ Intel Gigabit NIC, so I’m hoping the install will work with that.

Mmm, believe so…

I had more IOMMU groups on the Zenith. Unfortunately, the system stopped POSTing before I could test it. I’m not sure, but it could have been the “Enumerate all IOMMU in IVRs” (or some such) setting.

Maybe Asus support can think of something I haven’t and get it to POST. If so, I should be able to test my setup tomorrow.

Can you reset to defaults in your UEFI and try posting again? Which UEFI do you have on the Zenith?

The problem is that I can’t enable PCIE_ARI and have my PNY M.2 card installed under the southbridge. It will fail to POST with a 00 code, and it isn’t even possible to reset the BIOS. I left the battery out overnight, and it still would not boot. Maybe I needed to leave it out longer.

Asus says it will work with a stick from ‘the list’. I have my doubts, but we’ll see.

I put in place what worked with my old X58 chipset, but having referenced your post and a couple of others previously, I know there are a few more settings I’ve gotta pin down. I’ll update the thread with my results.