#Fedora 27 for Threadripper
# Virtualization with GPU pass-through
# (specifically for dual, identical nVidia cards)
#
#=================================================================================================
#My user is pasty2k2 - search and replace on this document, sub pasty2k2 for your Fedora username.
#=================================================================================================
#
#This guide assumes you know how to use vi. If you don't, you should, it's weird, but good.
#Should be a song.
#
#Here is a very quick vi tutorial:
#
#The default mode when opening a file is VIEW mode (vi itself calls this command mode) - can't
#enter text, all keys are commands:
#    Arrows/pgup/pgdn to move around.
#    / key to search for a string
#    i key to Insert/change text - like a normal editor. INSERT mode.
#    ESC key to return to VIEW mode.
#
#    When in VIEW mode, press : (colon) key to insert "useful" commands, think of it as the
#    FILE menu in a GUI. When : is pressed a prompt is displayed at the bottom. Enter commands
#    (CASE SENSITIVE) after colon:
#        q  (so colon then lowercase-q keys) to quit without saving, if you made no changes.
#        q! to quit while not saving changes.
#        wq to write changes and quit
#
#There is a loooot more to learn about vi but this should get you by.
#
#=================================================================================================
#When I give a command to use vi, I'll give a blank line when it's time to write/quit.
#=================================================================================================
#
#My system for reference:
#
#ASUS ROG Zenith Extreme
#AMD Threadripper 1950X at 3800MHz over all cores
#Noctua 12cm for Threadripper
#32GB Corsair Vengeance RGB 3600MHz (running at 3066 at the moment, it can do
#    3466 but reboot failures are a ballache)
#4x SanDisk Extreme 240GB SATA3 SSD (all about 4 years old, no NVME...
#    yet)
#1x SanDisk 128GB SATA3 SSD (low-end model, for boot partitions only)
#2x ASUS/Nvidia Geforce GTX980 Ti Lightning edition (SLI capable and bridged but not using
#    this at the moment)
#Mellanox ConnectX2-based InfiniBand, dual port 40Gbps (for fast remote storage access. Glad this
#    works right out of the box on Fedora!)
#
#Mobo options: Turned on both IOMMU options - "enumerate all IOMMU" and the one buried in AMD
#              CBS/PBS or something...
#              Secure boot is enabled but set to Other OS not Windows.
#              Compatibility Support Module (CSM) is disabled entirely.
#              SVM, SMT etc is all of course on.
#              SATA is set to AHCI, I'm not using onboard RAID. An AMD linux driver is available
#              for it but it's a pain to integrate and MDRAID is fine performance-wise. Just
#              backup your boot disk (and of course the RAID array, RAID is not a backup!)
#
#=================================================================================================
#Known issue - at the moment I need to have the second GPU unplugged from its monitor whenever I
#boot Fedora. After VFIO is set up, you can plug the monitor connection back in once booted.
#=================================================================================================
#
#Finally, I've included a few extra steps for setting up remote CIFS shares and google chrome, not
#necessary but may be useful to some, omit/include as you please.
#
#
#
#BEGINNING OF STEPS
#
#
#
#Get latest Fedora livecd, burn to usb using fedora media writer on some computer.
#Boot it, don't bother testing, press e to edit boot flags on "Install Fedora" entry.
#Add to the end of the linuxefi command: nomodeset
#Press ctrl x to boot. Install. Configure networking from the live environment, hostname from the
#installer.
#For storage, I use blivet-gui to set up a software MDRAID RAID10 array on 4x SATA SSDs, followed
#by LVM groups for root and swap. /boot and /boot/efi go onto a single 120GB SSD, only 1GB
#partitions needed on there.
#You can use the normal custom partitioning for non-RAID setups, it's easier. blivet-gui can work
#fine but it seems to barf for me a few times before eventually working correctly, locking itself
#up and possibly causing a full crash. Keep your other data disks away from it, and keep
#trying/reboot if necessary. The extra speed is worth it!
#When installed, don't install any updates yet!

#Mount r/w Windows/CIFS shares at startup:
#Run this as your username, don't su to root yet:
vi /home/pasty2k2/.credentials
#Add lines, put in your credentials for windows share. Domain is the remote hostname.
#No " or ' quotes needed for special characters:
username=
password=
domain=

#Set security on file so only your username can read/write this file:
chmod 600 /home/pasty2k2/.credentials
su -
mkdir /media/shared
vi /etc/fstab
#Add line at bottom - replace ip with your windows ip, this also works over InfiniBand/any IP
#address, or valid hostname if you prefer.
#If you have a space in the source share pathname, write it as \040 (fstab does not understand
#quotes):
//192.168.3.1/Shared /media/shared cifs credentials=/home/pasty2k2/.credentials,vers=1.0,rw,file_mode=0777,dir_mode=0777,suid,_netdev 0 0

#Now test the mount works under root:
mount -a

#Install Google Chrome:
#Create proper repo manually:
su -
vi /etc/yum.repos.d/google-chrome.repo
[google-chrome]
name=google-chrome
baseurl=http://dl.google.com/linux/chrome/rpm/stable/x86_64
enabled=1
gpgcheck=1
gpgkey=https://dl.google.com/linux/linux_signing_key.pub

#Install:
dnf install -y google-chrome-stable

#Nvidia driver installation:
#This tends to work best if you haven't done dnf upgrade yet.
#Download driver from nvidia.com directly.
#Run all cmds as root:
su -
vi /etc/dnf/dnf.conf
#Remove line, if it exists:
exclude=xorg-x11*

#Set execute permission on downloaded nvidia .run driver file:
chmod +x /home/pasty2k2/Downloads/NVIDIA-Linux-*.run

#Install needed extra packages:
dnf install gcc dkms acpid libglvnd-glx libglvnd-opengl libglvnd-devel pkgconfig kernel-headers kernel-devel

#Disable nouveau driver from loading:
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf

#Edit grub boot options:
vi /etc/sysconfig/grub
#Append to end of GRUB_CMDLINE_LINUX:
rd.driver.blacklist=nouveau

#Rebuild default grub command line options:
grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

#Remove nouveau driver for good:
dnf remove xorg-x11-drv-nouveau

#Rebuild the initramfs now we have uninstalled nouveau:
dracut -f --kver $(uname -r)

#Set default target for systemd to be multi-user - command line only:
systemctl set-default multi-user.target

#Make sure you can read the next few steps elsewhere, you are about to lose the gui!
reboot

#Login as root user and run the driver (absolute path, root's home is not /):
/home/pasty2k2/Downloads/NVIDIA-Linux-*.run

#Install should work with default options. If it throws an error, view the log file it mentions to
#troubleshoot. If it's a gcc version error, try downgrading/upgrading with dnf (down/up)grade gcc.
#If it's missing kernel-something, install it. If it complains about a certain module not being
#allowed due to a GPL violation, start this entire guide over again - reinstall fedora, and try
#installing nvidia driver AFTER doing a dnf upgrade.
#Once installed, set systemd target back to graphical so it actually tries to load vga drivers:
systemctl set-default graphical.target
reboot

#GUI should now load, if it doesn't use CTRL-ALT-F6 to change to a text-only console. Try
#nvidia-xconfig to generate a new Xconfig file for your display and reboot.
#When back into GUI, finally, do a full upgrade of everything and reboot.
#DKMS should rebuild the
#driver modules for the new kernel without error/any output during install:
su -
dnf upgrade
reboot

#Use below command to check what driver is using each PCI device if still having issues:
lspci -nnk

#Custom build 4.15 kernel:
#Once you can get dkms (the nvidia driver rebuilder tool) to rebuild the nvidia driver OK on a
#released current kernel as above, you can feel pretty confident dkms will do this for a custom
#built kernel just fine.
#Download gnif's patch (modified for Fedora by wendell) from the forum post to your
#/home/pasty2k2/Downloads directory:
https://forum.level1techs.com/t/threadripper-reset-fixes/123937

#Install needed packages:
su -
dnf install fedpkg fedora-packager rpmdevtools ncurses-devel pesign
dnf install rpm-build perl-devel perl-generators openssl-devel hmaccalc elfutils-devel bison flex
reboot

#Load terminal again as root:
su -
#Download the latest kernel bits:
fedpkg clone -a kernel
#Enter newly created directory:
cd kernel
#Change the branch/code level to f27 at this time, other options would be master or f28
fedpkg switch-branch f27
#Copy wendell modded gnif patch to this directory and set permissions:
cp /home/pasty2k2/Downloads/tr.patch .
chmod 644 tr.patch
#Edit kernel.spec build configuration file:
vi kernel.spec
#About a page down is a line:
# define buildid .local
#Uncomment the line and remove the space, then add % to the start of line.
#Change .local to .onlyalphanumericcharacters
%define buildid .trpatch
#Then find the patches section about 10 more pages down, and at the end add:
Patch999: tr.patch

#Done editing. Now build the kernel:
fedpkg local
#Wait about 20mins, ignore cat broken pipe errors and errors relating to other arches.
#When done (last few parts single-threaded...) install the rpms in /root/kernel/x86_64
#However you need to give the full rpm filename - can't use wildcards.
#Since this will be custom for your build, use your brain.
cd x86_64
#Then rpm -i kernel, kernel-core, kernel-devel, kernel-modules and kernel-modules-extra - all in
#the same command. Ignore cat broken pipe errors.
rpm -i kernel-... kernel-core... etc
#Finally installed, reboot and ensure the new entry in grub is selected (it should be default)
reboot

#Check if nvidia is still alive:
lspci -nnk

#Virtualization, Windows 10/Nvidia passthrough on AMD threadripper
#Enable VFIO & disable second GPU from loading:
#First create script that forces vfio-pci to grab specific devices instead of normal graphics
#module:
su -
vi /sbin/vfio-pci-override-vga.sh
#Replace the PCI addresses below. To find them out: lspci -nnk and look at /var/log/Xorg.0.log to
#see which card had your display attached to it at boot, or use nvidia-smi if installed. Second
#dev is audio over hdmi, need both:
#!/bin/sh
DEVS="0000:09:00.0 0000:09:00.1"
for DEV in $DEVS; do
    echo "vfio-pci" > "/sys/bus/pci/devices/$DEV/driver_override"
done
modprobe -i vfio-pci

#Set permissions on script:
chmod 755 /sbin/vfio-pci-override-vga.sh

#Script is called from this file, which is read whenever modprobe is called. Add lines:
vi /etc/modprobe.d/local.conf
install vfio-pci /sbin/vfio-pci-override-vga.sh
#Possibly also add this line - disables arbitration, apparently allowing you to leave the second
#monitor plugged in during host bootup. Doesn't seem to work for me.
options vfio-pci disable_vga=1

#Modprobe is used/run when dracut runs.
#This config file is used whenever dracut is run to install/update modules/stuff in kernels.
vi /etc/dracut.conf.d/local.conf
#Add lines (keep the spaces inside the quotes, dracut appends these values to its own lists):
add_drivers+=" vfio vfio_iommu_type1 vfio_pci vfio_virqfd "
install_items+=" /sbin/vfio-pci-override-vga.sh "

#Run dracut to chainsmoke all this nonsense.
dracut -f --kver $(uname -r)

#Add some extra options to grub.
#I think some of these may be superfluous after the above but do
#it anyway, it works:
vi /etc/sysconfig/grub
#Add to line GRUB_CMDLINE_LINUX:
amd_iommu=on iommu=pt rd.driver.pre=vfio-pci vfio-pci=enable
#Set grub to boot last used option - not necessary but handy. Add line:
GRUB_SAVEDEFAULT=true

#Rebuild grub config and reboot to use it. This is for EFI systems:
grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
reboot

#Install/update virtualization and further prep:
#Set SELinux booleans to allow use of samba (and cifs) for loading iso's from windows network
#shares (my use case), and something else I forget that it moans about so may as well set it here
#anyway. -P option makes it permanent:
setsebool -P virt_use_samba 1
setsebool -P domain_kernel_load_modules 1

#Add repo for windows virtio drivers:
curl --output /etc/yum.repos.d/virtio-win.repo https://fedorapeople.org/groups/virt/virtio-win/virtio-win.repo

#Install virtualization:
dnf install "@virt*"
dnf install libvirt virtio-win virt-manager
dnf install xorg-x11-xauth "xorg-x11-fonts-*" xorg-x11-utils

#Wendell uses this, I haven't set up Xen yet but have installed Xorg stuff above and done this in
#advance in case I do/you do:
export XAUTHORITY=/home/pasty2k2/.Xauthority

#Check if everything is running ok:
virt-host-validate
#Should all PASS

#Create virtual bridge network:
#Mostly GUI now, home stretch!
#First enable IP forwarding rule for virtual bridges:
echo "net.ipv4.ip_forward = 1"|sudo tee /etc/sysctl.d/99-ipforward.conf
#Check if the command worked/the value is present and reboot:
sudo sysctl -p /etc/sysctl.d/99-ipforward.conf
reboot

#Create virtual bridge:
#Load virt-manager from Applications or from terminal (your username), right click qemu/kvm
#header, details.
#Go to virtual networks, stop then delete the default already there.
#Add new, call it default, give it 192.168.4.0 or other unused subnet, enable dhcp.
#Forward to physical, choose ethernet, mode NAT.
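#Optional sanity check before carrying on in virt-manager. This is only a sketch, not part of the
#original steps: two small helpers that confirm the grub options and the IP forwarding setting from
#above actually took effect after the reboot. The function names and the optional file argument
#are made up for this guide (the file argument only exists so the helpers can be tried against a
#test file instead of /proc).

```shell
#!/bin/sh
# has_boot_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline (the override is a hypothetical testing aid).
has_boot_param() {
    case " $(cat "${2:-/proc/cmdline}") " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# ip_forward_on [FILE]
# Succeeds if IPv4 forwarding is switched on (the file contains "1").
# FILE defaults to /proc/sys/net/ipv4/ip_forward.
ip_forward_on() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}
```

#Usage, from any terminal:
#    has_boot_param amd_iommu=on && has_boot_param iommu=pt && echo "grub options OK"
#    ip_forward_on && echo "forwarding OK"
#If either stays silent, recheck /etc/sysconfig/grub or /etc/sysctl.d/99-ipforward.conf.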
#While in here, go to Storage tab and set up your disks/pools now. Also add a directory pool:
#/usr/share/virtio-win/ - this is the location of the Windows virtio driver iso's the package
#manager downloaded.

#Creating a Windows 10 VM:
#Create new VM and use the manual OS choice. For the iso, do not select your OS iso, instead
#select /usr/share/virtio-win/virtio-win-.iso. For the hard disk don't worry about it for
#now just hit forward, then customise on exit.
#Overview - change to UEFI, leave on I440FX.
#CPUs - tick "Copy host CPU configuration", then tick "Manually set CPU topology" - choose 1
#socket, all your cores, and only 1 thread. Then adjust current allocation again as it will have
#set back to 1.
#Remove the generated IDE disk, but not the IDE CDROM or IDE controller.
#Remove Display Spice.
#Remove Video.
#Remove Channel.
#Remove Serial.
#Remove Controller Virtio Serial.
#Remove Touch.
#Leave Sound on ICH6.
#Set Controller USB to model USB 3.
#Set NIC device model to virtio.
#Add a SCSI disk and a SCSI controller FOR EACH DISK, this has better performance - if you do add
#more than one you will have to configure each disk to use a separate controller (unsure how).
#Add a SCSI CDROM and mount your OS iso image to it
#Add USB host devices for second physical keyboard and mouse
#Add PCI host devices for nvidia card and the DP/HDMI audio "device" - the same PCI addresses as
#those you've passed to VFIO-PCI as per above. You can find out the exact ones again by doing
#lspci -nnk and looking at /var/log/Xorg.0.log to see which card had your display attached to it
#at boot, or run nvidia-smi to see current card in use, it should only display one if the other
#belongs to VFIO-PCI.
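#To save squinting at the full lspci -nnk listing, here is a minimal sketch that boils it down to
#"address driver" pairs for the nvidia functions only. Assumptions: the usual lspci layout (device
#line starts at column 0, the "Kernel driver in use:" line is indented beneath it) and the nvidia
#PCI vendor ID 10de. The function name nvidia_drivers is made up for this guide.

```shell
#!/bin/sh
# Reads `lspci -nnk` output on stdin and prints "<pci address> <driver>"
# for every NVIDIA function (vendor ID 10de) that has a driver bound.
nvidia_drivers() {
    awk '
        # A new device block starts at column 0; remember its address
        # only if the line carries the NVIDIA vendor ID.
        /^[0-9a-f]/ { slot = ""; if ($0 ~ /\[10de:/) slot = $1 }
        # Inside a remembered block, report the bound driver.
        slot != "" && /Kernel driver in use:/ { print slot, $NF }
    '
}
```

#Usage:
#    lspci -nnk | nvidia_drivers
#The card you are passing through (both its VGA and audio functions) should show vfio-pci, the
#host card should show nvidia. A function with no driver bound is not printed at all.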
#At the end you should have:
#At least 1 SCSI disk
#1 SCSI CDROM
#1 IDE CDROM
#1 Mouse (can't be removed)
#1 Keyboard (can't be removed)
#Sound ICH6
#Passed-through USB keyboard and mouse devices (I have a wireless set with 1 dongle for both so
#it's one device for me)
#Passed-through PCI bus device for your GPU's VGA device
#Passed-through PCI bus device for your GPU's DP/HDMI audio "device"
#Controller USB (model USB 3)
#Controller Virtio SCSI
#Controller PCI (this was already here)

#Now that's all done, set the boot options again, they will be messed up. Set SCSI CDROM to be
#primary, SCSI hard disk secondary.
#Get your monitor/keyboard/mouse ready!
#Begin/save, check it builds and is working. If you get pushed to the EFI shell for some reason,
#type exit, then go to boot menu and boot the QEMU CDROM, need to press enter within 5 seconds,
#old skool. Turn off VM before installing anything/continuing.
#Enter commands:
virsh edit
#The editor is vi. Slightly down a bit is the section. Add these entries:
#should already be here
#should already be here
#should already be here
#should already be here
#should already be here

#Next find the CPU mode line and amend both options to look like this:

#Write and exit. From now on, don't change the CPU model in the CPU settings within the GUI
#options, or the CPU mode customisation will be removed and you'll have to edit the VM again.
#Power on your win10 VM again. Install windows. When prompted, browse the IDE CDROM for the SCSI
#virtio drivers and it should continue without issue.
#Check device manager and install missing drivers for all remaining devices from IDE CDROM,
#should include virtio network card (10gbps, default Realtek option is only 100mbps)
#Once connected to the omniweb Windows should acquire some out of date nvidia drivers all on its
#own. BE PREPARED for either success or failure with an abrupt resolution change or kernel crash!
#Install windows updates and proper nvidia drivers.
#Should install, reboot without issue, and all
#work like it's a physical installation.
#Remove the IDE CDROM and the IDE Controller when you are done with the driver iso.
#Check out task manager for what Windows sees, run a Userbenchmark and 3DMark, be happy! Except
#we need to pass through sound still.

#Audio pass-through:
#By default libvirt tries to pass audio through pulse, but is running under root, and because
#pulse runs under your username this is going nowhere.
#Edit the VM again:
virsh edit
#Change the top line to:

#Add these lines to the bottom of the file, but before . Replace the user id 1000 with
#your user if it is different (1000 is the default for the first non-root user):

#Now edit qemu config:
vi /etc/libvirt/qemu.conf
#Find the line (press / while not in INSERT mode to search a string) with
#nographics_allow_host_audio in it, uncomment the line and set it to 1:
nographics_allow_host_audio = 1

#Ensure pulse audio is running under your username - run this command under your username, not as
#root:
systemctl --user enable pulseaudio
#Now reboot:
reboot

#Start the VM, be prepared for about 3 SELinux alerts. There is probably a boolean for all of
#these but I can't find one and I've done all the below now so it's too late for me!
#If you do get any, follow the "allow it for now" instructions on each of them, run the commands
#under root user.
#Each time you accept a round of alerts, shutdown the vm, boot it again, you'll get more.
#Do this about 10 times (no joke). Use the virt-manager shutdown and "Play" buttons to make this
#a little quicker, no need to use Windows.
#Eventually, the SELinux alerts will stop. Audio passthrough NOW WORKS! Test it - it sounds
#dodgy/choppy...
#In windows right click volume icon, go to playback options, set the playback sample rate to
#44.1kHz
#In Fedora you may need to install pavucontrol and set the Recording option for the VM to point
#to a different device, if one exists.
#Or generally fiddle around in there - that seemed to get
#it working best for me.

#DONE!
#Running a full nvidia DirectX enabled GPU to a second screen/same screen on a second
#cable/channel, with Audio passthrough from guest VM to host onboard audio card, using a
#secondary mouse and keyboard, should now be working perfectly! Have a cup of tea and do a little
#dance.
#One thing I haven't included here is setting up a Xen server and using the feature "copying GPU0
#buffer to GPU1" Wendell speaks of. Haven't yet felt the need, still experimenting, but
#eventually I will probably try that out and document my findings.

#Useful commands:
lspci -nnk                  #check name and driver-in-use status of each pci port
lspci -t                    #show the pci bus tree, good visual when working out IOMMU groups
top                         #text based task manager/system monitor.
ps aux | grep -i "string"   #find out if "string" is running, ignore case

#Nice tools:
vim                         #nicer vi, adds colours and allows pasting text in without needing
                            #to change to insert mode!
inxi -Fx                    #temp/system monitoring
lm_sensors                  #more temp/system monitoring. Check with command: sensors
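
#One more that sits well next to lspci: a minimal sketch for listing the actual IOMMU groups
#straight from sysfs. It assumes the kernel exposes them under /sys/kernel/iommu_groups, which it
#does once the IOMMU is enabled. The function name and the optional directory argument are made
#up for this guide (the argument just lets you point it at a test directory).

```shell
#!/bin/sh
# list_iommu_groups [ROOT]
# Prints one "group <n>: <pci address>" line per device.
# ROOT defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # If nothing matched (IOMMU off), the pattern stays literal: skip it.
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"   # drop the trailing "/devices/<address>"
        group="${group##*/}"        # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}
```

#Usage:
#    list_iommu_groups
#No output at all means no groups exist, so go back and check the IOMMU options in the BIOS and
#on the grub command line. Ideally the GPU you pass through and its audio function share a group
#with nothing else you need on the host.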