Trouble with TrueNAS SCALE install on AMD EPYC 4004 system

I am installing TrueNAS SCALE on a new set of server hardware and it gets stuck at a GRUB screen. I made the install stick with Etcher (Rufus did not make a bootable drive). When booting from the stick, it gets to a screen with the following message and hangs:

Redfish : Pushing firmware layout to OData server.Welcome to Grub!

I have tried enabling legacy boot, but it causes the motherboard to not boot/POST. I then have to pull the CMOS battery and wait for it to reset before booting again.

As a sanity check I installed Ubuntu Server 22.04 LTS. It would not install using the normal “try and install” option; I had to use the “HWE kernel” option to install.

In the past I had a TrueNAS installation on an Intel NUC but did not use it much.

Is there any advice on steps to take to try and get TrueNAS to install? I’m stuck and not sure what to try at this point.

System specs:
ASRock Rack B650D4U with latest BIOS 10.18 (this BIOS added the EPYC 4004 support)
AMD EPYC 4344P
32 GB of unregistered ECC RAM (1 stick)
WD Black M.2 NVMe SSD

A beta BIOS is where it’s at. A new AGESA solves loads of issues.

That could very well be it. 10.18 is the latest and was released on 6/21/2024 with a release note of support for the new EPYC 4004 CPUs. The first thing I did was flash the BIOS. So I might just need to wait a month until some bug fix comes out.

Reset to defaults and reboot

Disable CSM support and reboot

Reload secure boot keys and reboot

Enable secure boot and reboot

Delete all UEFI entries and reboot

Boot with a Ventoy live USB as a sanity check, then chainload into a live OS and wipe the boot drive using dd (just the first sectors; you don’t have to clear the entire drive)

After all that, install your base OS and you may have to edit the UEFI entry manually from the UEFI shell to point to the grubx64.efi stub. Not difficult, just different from most consumer boards, where it just works.
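For anyone following along, here is a minimal sketch of what "wipe just the first sectors" amounts to, written in Python rather than dd. The function name `zero_head` and the 16 MiB default are my own assumptions, not anything from the steps above; the target path is whatever your boot drive shows up as in the live OS. It is destructive, so double-check the device before running it as root.

```python
import os


def zero_head(target_path, mib=16):
    """Overwrite the first `mib` MiB of target_path with zeros.

    target_path is a block device (or a scratch file for testing).
    Returns the number of bytes written.
    """
    chunk = bytes(1024 * 1024)  # one MiB of zero bytes
    written = 0
    # "r+b" writes in place and fails if the path does not exist,
    # so a mistyped device name will not silently create a regular file.
    with open(target_path, "r+b") as dev:
        for _ in range(mib):
            written += dev.write(chunk)
        dev.flush()
        os.fsync(dev.fileno())  # push the zeros out to the device
    return written
```

Trying it on a throwaway file first is a cheap way to confirm it only touches the start of the target and leaves everything after it alone.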

Thanks, I’ll give these a try. A few thoughts I have:

Reset to defaults and reboot

The BIOS has pretty much remained in a default state because I had to keep pulling the CMOS battery. But I gave it a try and had the same results.

Disable CSM support and reboot

CSM has been disabled this whole time. Enabling it actually causes the BIOS to not POST. This is why I had to pull the CMOS battery.

Enable secure boot and reboot

I forgot about this. Secure boot has been on and has some config options that I will try adjusting. But my gut instinct is that neither Ubuntu Server nor TrueNAS will like secure boot being on. Turning it off did not help TrueNAS; Ubuntu still got to the GRUB screen.

After all that, install your base OS and you may have to edit the UEFI entry manually from the UEFI shell to point to the grubx64.efi stub.

I’ll see what happens, but given that:
- Ubuntu is reaching the GRUB screen fine
- the Ubuntu GRUB screen works (just some GRUB selections do not)
- TrueNAS is giving a “Welcome to Grub!” message and not loading the GRUB menu

I think the BIOS is finding the GRUB loader just fine but having an issue somewhere later on. I’ll still give your recommendations a try and see what happens.

Is it the WD SN770? There’s a hardware bug involved with that one (not even directly related to ZFS; just Linux in general, if I understood correctly):

Unsuitable SSD/NVMe hardware for ZFS - WD BLACK SN770 and others · openzfs/zfs · Discussion #14793 · GitHub
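If you can get any Linux live environment booted, a quick way to answer the "is it the SN770?" question is to read the model string the kernel exposes in sysfs. This is only a rough sketch: the helper names are made up for illustration, and the substring match on "SN770" is my assumption about how the drive reports itself.

```python
from pathlib import Path


def nvme_models(sys_root="/sys/class/nvme"):
    """Map each NVMe controller name (e.g. 'nvme0') to its model string."""
    models = {}
    root = Path(sys_root)
    if not root.is_dir():
        return models  # no NVMe controllers, or not a Linux system
    for ctrl in sorted(root.iterdir()):
        model_file = ctrl / "model"
        if model_file.is_file():
            # sysfs pads the model with trailing spaces and a newline
            models[ctrl.name] = model_file.read_text().strip()
    return models


def looks_like_sn770(model):
    """Assumed check: the model string contains 'SN770'."""
    return "SN770" in model.upper()
```

Calling `nvme_models()` with no arguments reads the real sysfs tree; passing another directory is only there so the function can be tried against a fake layout.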

I don’t think so. This is just installing an OS from a flash drive on a machine that has no OS. It seems Debian-based OSes (Proxmox, TrueNAS) can’t get to the initial GRUB screen where you would select “install OS”. Ubuntu Server has to use its HWE option, which installs a newer hardware-enablement kernel; I’m not sure why. It’s an AM5 motherboard and an (admittedly new) EPYC CPU, all things that have been around for a while now but seem not to be supported by the Linux kernel (or maybe the motherboard BIOS is just FUBAR).

As a note, Ubuntu Server is installed on the WD M.2 drive and seems happy. I think I’m OK with this because the boot drive is not a ZFS pool, since my motherboard only has one M.2 port.

I got this to work. No need for another AMD system; I used an Intel NUC. The NUC installed TrueNAS SCALE onto the M.2 NVMe drive. Then I placed the drive in the ASRock Rack motherboard, and it boots just fine.

Mostly. It did hit the hang after “initializing ramdisk” issue. According to the forums this seems to be related to serial terminals, but that was all disabled (I checked by loading the M.2 drive back into the NUC). For some reason, having a GPU installed in the system reduced the number of times it gets stuck. I have no idea why. The fix is mostly to just keep rebooting if you get the error (you won’t see it because the screen goes blank).
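One way to double-check the serial terminal theory from a running system is to look at the `console=` arguments on the kernel command line. A small sketch, assuming the usual Linux convention that consoles are passed as `console=...` tokens and that serial ports are named `ttyS*`; the function names are my own.

```python
def console_args(cmdline):
    """Return the value of every console= argument, in order of appearance."""
    return [
        tok.split("=", 1)[1]
        for tok in cmdline.split()
        if tok.startswith("console=")
    ]


def serial_consoles(cmdline):
    """Return only the console= values that point at a serial port (ttyS*)."""
    return [c for c in console_args(cmdline) if c.startswith("ttyS")]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()
```

For a made-up command line such as `"quiet console=ttyS0,115200 console=tty1"`, `console_args` returns `['ttyS0,115200', 'tty1']` and `serial_consoles` returns `['ttyS0,115200']`. On the installed system, `serial_consoles(read_cmdline())` coming back empty would at least confirm that no serial console is being requested at boot.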

I have almost the same hardware and I was running into a similar issue. I was seeing “Redfish : Pushing firmware layout to OData server.” and nothing else. Couldn’t install the OS. I had my monitor plugged into the motherboard HDMI. After I switched to the graphics card HDMI, I could see the installation screen. Weird, but that worked for me.

I’ve seen this a few times, and in each instance it was because of how the USB drive was burned.

Use Rufus in “dd mode”, not “ISO mode”, or use balenaEtcher,

or if your workstation isn’t Windows… simply “dd” the ISO to the drive from the terminal.
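To make the "dd mode" idea concrete: it is a byte-for-byte copy of the ISO onto the raw device, with no repartitioning or file extraction. Below is a rough Python equivalent with a read-back check. The function names and the 4 MiB chunk size are my own choices, and the target path is whatever your USB stick is called on your machine. It overwrites the target, so be sure of the device name.

```python
import hashlib
import os


def raw_write(image_path, target_path, chunk_size=4 * 1024 * 1024):
    """Copy image_path byte-for-byte onto target_path; return bytes copied."""
    copied = 0
    # "r+b" requires the target to exist already, which is true for a block
    # device and avoids creating a stray file if the path is mistyped.
    with open(image_path, "rb") as src, open(target_path, "r+b") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
        dst.flush()
        os.fsync(dst.fileno())  # make sure the data actually reaches the stick
    return copied


def _sha256_prefix(path, length, chunk_size=4 * 1024 * 1024):
    """SHA-256 of the first `length` bytes of path."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def verify(image_path, target_path):
    """Check that the start of the target matches the image exactly."""
    size = os.path.getsize(image_path)
    return _sha256_prefix(image_path, size) == _sha256_prefix(target_path, size)
```

The verify step only compares the first image-sized stretch of the stick, since the rest of the device is left untouched by a raw write.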

My update is that I sold the board and will just buy a Supermicro instead and never stray again. What really got me is that even after getting an OS on it, the board would take minutes to boot. I am just planning to avoid any ASRock Rack boards going forward.

That board has an IPMI, right? My gut feeling is that IPMI is to blame here, presumably because grub was initializing on the wrong console.

I tried messing with its settings, including a “wait for IPMI to boot” option, to no effect. I have never had any problem like this with Supermicro.

Same exact issue, same exact solution.
