Threadripper 1920x on ASRock Fatal1ty with Linux 5.0.x: some of the issues and quirks encountered

I’m not sure this is the most appropriate forum, since this mostly affects VFIO users. On with the show…

I’ve been playing with VFIO on Linux 5.0.x on my 1920x Threadripper build. I have run into some issues that pop up every now and then, but before describing them, here is the hardware:

  • 1920x Threadripper (gen.1)
  • ASRock Fatal1ty x399
  • 4 x 8GB Corsair Vengeance RGB, tested at 3200, 3000 and 2666 MT/s with the XMP profile (1.35 V)
  • 1 x Mellanox ConnectX-2 with SFP+ transceiver
  • 1 x USB 3.0 switch to swap between two hosts for the input devices (more on this later)
  • 1 x Asus 1060 (dual card, 6GB) (passthrough)
  • 1 x Sapphire Nitro RX580 (host)
  • 1 x Intel P3500 2TB
  • several SATA SSDs and M.2 sticks
  • H110v2 AIO cooler (got it before the Liqtech was available at sensible prices)

Now, some of the issues I have come across:

  • ASRock's BIOS updates after 2.10 ship USB modules that broke compatibility with my USB 3 switch (brand is Ugreen). This means I need to connect the keyboard directly to a port, otherwise the pre-OS boot environment gets no keyboard input. I reported this to them and never heard back (their forums show other people having the same problem).
  • Regardless of the XMP configuration, I get hangs/lockups during parallelized compilation workloads. I’m still investigating and will likely need kgdb over a serial (COM) port to debug the kernel, but initial tests suggest this is a problem in the kernel rather than in the hardware. I can’t replicate any issues when running memtest, for example.
  • A BIOS reset means you lose your liquid cooler settings, e.g. the pump needs to be set to DC and full speed. This is expected, as the EEPROM gets erased, but be very mindful of the implications (read: fix that in your settings immediately).

I will post my configurations later. When it works, VFIO is a breeze with Win10: I pass a SATA SSD through directly, plus one of the USB3 controllers via VFIO binding (so I get four ports in the back dedicated to the guest).
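
Until then, for anyone wanting to check which devices (such as a USB3 controller) sit in their own IOMMU group before binding them to VFIO, here is a minimal Python sketch. It is not my actual configuration. It assumes the IOMMU is enabled in the BIOS and kernel, so that the groups show up under /sys/kernel/iommu_groups; the function name list_iommu_groups is made up for illustration.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} as read from sysfs.

    `root` is a parameter only so the function can be pointed at a
    different directory; the default is the usual sysfs location.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled or not exposed by this kernel: nothing to list.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        # Each group is a numbered directory containing a "devices" directory.
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    # Sort by group number so the output is stable and easy to read.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        print(f"group {group}: {' '.join(devices)}")
```

A device that shares its group with other devices generally cannot be handed to the guest on its own, so it is worth checking this before picking which USB controller to pass through.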

Is anyone here willing to share their experiences with BIOS settings and stability, or has anyone had similar problems? (They disappeared for me, then came back today while compiling QEMU 4.0 on the host!) I haven’t modified much between the working state and the one with lockups, so I’m inclined to think there is a regression in the kernel packages (the host OS is Ubuntu Disco Dingo).

Cheers!