Linux (Debian/Proxmox) kernel recompile needing over 60 GB (and counting)

Hey Everybody! Need a Sanity check.

I’m switching over from ESXi 6.7 to Proxmox. One of my VMs will be running a 5600 XT and will need the Navi patch. My motherboard also has RMRR problems, so I’m having to patch that as well. It’s a simple fix:

--- a/drivers/iommu/intel-iommu.c 2019-11-14 10:20:18.717161513 +0100
+++ b/drivers/iommu/intel-iommu.c 2019-11-14 10:23:31.202402702 +0100
@@ -5112,8 +5112,7 @@
 
     if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
         device_is_rmrr_locked(dev)) {
-        dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
-        return -EPERM;
+        dev_warn(dev, "Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.\n");
     }
 
     if (is_aux_domain(dev, domain))

Outside of that, it’s just a standard Proxmox compilation.

Question is, is it common for a kernel compilation to take this much space? I had to lvextend /root to 70 GB before it stopped failing for lack of space.

Thanks,

-Felix

It’s been a while since I built an x86 kernel, but it sounds like too much.

Normally, I’d build OpenWrt … building entire images takes about 15 GB of space, and that’s with multiple copies of the toolchain and sources of various things downloaded off the internet and so on.

What build instructions are you following?

I’m doing:

Code:

apt-get update
apt-get install git nano screen patch fakeroot build-essential devscripts libncurses5 libncurses5-dev libssl-dev bc flex bison libelf-dev libaudit-dev libgtk2.0-dev libperl-dev asciidoc xmlto gnupg gnupg2 rsync lintian debhelper libdw-dev libnuma-dev libslang2-dev sphinx-common asciidoc-base automake cpio dh-python file gcc kmod libiberty-dev libpve-common-perl libtool perl-modules python-minimal sed tar zlib1g-dev lz4

Then download the pve-kernel git:

Code:

cd /usr/src/
git clone git://git.proxmox.com/git/pve-kernel.git

Then place both patches in the appropriate folder

Then

Code:

nano /usr/src/pve-kernel/debian/scripts/find-firmware.pl

Comment out the fourth line from the top with a # so that it looks like this:

Code:

#die "strange directory name" if..

Save with ctrl-x and y.
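If you’d rather script that edit than do it by hand in nano, here is a rough sketch with sed. It assumes GNU sed (for -i) and that the line to disable is the one containing the text "strange directory name"; the function name comment_out_die is made up for illustration.

```shell
#!/bin/sh
# comment_out_die FILE
# Prefix '#' to every line containing "strange directory name",
# skipping lines that are already commented out.
# Assumes GNU sed (in-place editing with -i).
comment_out_die() {
    sed -i '/^[[:space:]]*#/!s/^\(.*strange directory name.*\)$/#\1/' "$1"
}

# Usage (path taken from the step above):
# comment_out_die /usr/src/pve-kernel/debian/scripts/find-firmware.pl
```

Matching on the message text rather than on "line 4" means the edit still lands on the right line if the script gains or loses a line upstream, and running it twice does not add a second #.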

To finish up and give your system a nice identifier, edit the Makefile:

Code:

nano /usr/src/pve-kernel/Makefile

Edit the EXTRAVERSION line near the top of the Makefile and add this to the end:

Code:

-removermrr-NaviPatch
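The same edit can be scripted. This is only a sketch: it assumes the Makefile has a line beginning with EXTRAVERSION and appends the suffix to the first such line. The function name append_extraversion is hypothetical, and the function is not idempotent, so run it once.

```shell
#!/bin/sh
# append_extraversion MAKEFILE SUFFIX
# Append SUFFIX to the first line that starts with "EXTRAVERSION".
# Other lines are copied through unchanged.
append_extraversion() {
    tmp=$(mktemp) || return 1
    awk -v suffix="$2" '
        !seen && /^EXTRAVERSION/ { print $0 suffix; seen = 1; next }
        { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (path and suffix taken from the steps above):
# append_extraversion /usr/src/pve-kernel/Makefile -removermrr-NaviPatch
```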

Then I ran:

Code:

make

Definitely too much.

Not sure what’s going on there.

When I get back to my workstation I’ll give the pve kernel rebuild a try to see if I can reproduce.

Hopefully it’s obvious you can just make a directory inside your home directory to build - you don’t have to do it on your root partition.

The Proxmox kernel in the git repo above looks like a lightly patched Ubuntu kernel. The docs say 15G should be enough… which already sounds like a lot to me.


Try starting clean in your home dir.
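Before kicking off a clean build somewhere else, it can help to check how much room that filesystem actually has. A minimal sketch, assuming a POSIX df; the helper names free_gb and has_room are made up, and the 15 in the usage line is just the figure from the docs mentioned above.

```shell
#!/bin/sh
# free_gb DIR
# Print the free space on DIR's filesystem in whole GiB.
# Relies on the POSIX output format of df -P (4th column = available KB).
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_room DIR MIN_GB
# Succeed if DIR's filesystem has at least MIN_GB GiB free.
has_room() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# Usage: build under $HOME instead of on the root partition.
# mkdir -p "$HOME/build" && has_room "$HOME/build" 15 && cd "$HOME/build"
```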

I wanna say anything over 2 GB for a non-git build is too much.

The kernel is a big project but it’s not that big. Remember, the resulting binaries are typically 20 to 70 MB.

60G sounds like maybe some retry loop gone wrong and a log file filling up the disk.
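One way to test that theory is to list what is actually taking the space. A rough sketch, assuming a du that supports -d (GNU and BSD both do); biggest_entries is a made-up helper name.

```shell
#!/bin/sh
# biggest_entries DIR [N]
# List the N largest files and directories (default 10) up to two
# levels below DIR, largest first. Sizes are in KB. -x keeps du on
# one filesystem; -a includes plain files such as a runaway log.
biggest_entries() {
    du -a -k -x -d 2 "$1" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Usage (path taken from the build steps earlier in the thread):
# biggest_entries /usr/src/pve-kernel 15
```

If one log file or one build subdirectory dominates the list, that narrows down where to look in the Makefile.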

If you look at the git repo, you’ll notice it also tries building ZFS and some other stuff, so it’s not just the kernel. Perhaps @Camofelix thought he was building just the kernel when in fact he might be building some minimal userland version of the entire Proxmox.

Potentially, that does make sense. I spend much more time in the BSD space these days, and I’m not doing tons of kernel work. I’m using Proxmox to get my feet wet with open source virtualization, as a stopgap while bhyve matures to the point of being usable.

Yeah, I realize I could have done it in home, it’s just a preference thing. What threw me for a loop (pun fully intended) was the ~62 GB of final space it peaked at. Most of the extension is there so I can test stuff in the future; I wanted to be one and done with the rebuild.

No way, something is broken there. I just double-checked on my Talos II Power9 system, and the kernel source and output directories together come to 16 GB. That includes the .git repo as well, which is 2.6 GB by itself.

And for funsies I ran make -j zImage and make -j modules which ran the load average up to 800 and RAM usage over 100 GiB. Fun! (Note: make -j allows make to launch an unlimited number of parallel jobs.)
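For anyone without 100 GiB of RAM to spare, the usual alternative is to cap the job count at the number of CPUs. A small sketch, assuming nproc (coreutils) or getconf is available; build_jobs is a hypothetical helper name.

```shell
#!/bin/sh
# build_jobs
# Print a bounded job count for make -j: the number of online CPUs,
# falling back to 1 if it cannot be determined.
build_jobs() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    # Guard against empty or non-numeric output.
    [ "$n" -ge 1 ] 2>/dev/null || n=1
    echo "$n"
}

# Usage:
# make -j"$(build_jobs)" modules
```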

:Cries in 32 gigs:

@SgtAwesomesauce @zlynx @risk If you guys need any of my patch files let me know! Would love to get to the bottom of this. Now just having to deal with the normal passthrough headaches of VFIO

also, @Trooper_ish :grimacing::sweat_smile:

[Image: Screen Shot 2020-07-27 at 7.39.42 PM]

Dude, if you flex just a little harder, I might just burst a blood vessel…

Then you find out that the CPUs are nearly ten years old and that a modern i3 outperforms the entire system.

Just download a bit more.

I’m working on it.

Configuring the test container now.

Hmmmm…

:thonk:

root@44f7aa8ff52d:/usr/src/pve-kernel# make
test -f "submodules/ubuntu-focal/README" || git submodule update --init submodules/ubuntu-focal
Submodule 'submodules/ubuntu-focal' (git://git.proxmox.com/git/mirror_ubuntu-focal-kernel) registered for path 'submodules/ubuntu-focal'
Cloning into '/usr/src/pve-kernel/submodules/ubuntu-focal'...

That’s definitely a thing…

Had to get enough RAM to populate the CPU channels, didn’t I?

Honestly, 128 GiB is rarely useful. It often doesn’t even fill up the cache since there just aren’t that many files to store.

So, full disclosure:

[Image]

I’m getting the same results.

Not really sure what the hell is going on.

30 seconds later, it’s up to 41GB in build.
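To see how fast it grows while the build runs, you can sample the directory size at intervals. A sketch only: watch_size is a made-up helper, and the path in the usage line assumes the build output lands in a build directory under the pve-kernel checkout.

```shell
#!/bin/sh
# watch_size DIR [COUNT] [INTERVAL]
# Print a timestamp and DIR's total size in KB, COUNT times
# (default 10), sleeping INTERVAL seconds (default 30) between samples.
watch_size() {
    i=0
    while [ "$i" -lt "${2:-10}" ]; do
        printf '%s %s\n' "$(date +%T)" "$(du -sk "$1" 2>/dev/null | cut -f1)"
        i=$((i + 1))
        if [ "$i" -lt "${2:-10}" ]; then
            sleep "${3:-30}"
        fi
    done
}

# Usage (the path is an assumption, run it in a second terminal):
# watch_size /usr/src/pve-kernel/build 20 30
```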

At least the sanity check is in play :sweat_smile: I was shocked myself.

Any theories on why?

Not yet. I would have to dig through the makefile.

Not sure if it’s related to the ballooning compile, but I did notice that I’m experiencing a lot of DMAR faults. I’m not sure whether that has to do with the PCRMRR patch, the Navi patch, or something else altogether.

The web interface locks up completely, though I can still reset via IPMI.

I’m also having VFIO issues passing through an NVMe drive that uses the SMI 2263 controller. I’m not sure if it’s related, and I’ve got a non-kernel patch that should work [comment 42], but I’m not sure how to apply it to a Proxmox VM (https://bugzilla.kernel.org/show_bug.cgi?id=202055)
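To pull just those messages out of the kernel log for comparison before and after each patch, a simple filter is enough. A sketch with a made-up helper name, dmar_lines; reading dmesg usually needs root.

```shell
#!/bin/sh
# dmar_lines
# Read log text on stdin and keep only the lines that mention DMAR
# (case-insensitive).
dmar_lines() {
    grep -i 'DMAR'
}

# Usage:
# dmesg | dmar_lines | tail -n 20
```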