Rage's diskless Windows 10 boot from iSCSI - Adventures / Results

Hello there,
this is a short* dump of my adventures trying to boot Windows 10 from an iSCSI target.

I’ll ignore the iSCSI target setup since that is the smallest problem.

The PXE
I played around with iPXE and FlexBoot (basically the same thing) on Mellanox network cards.
Those can only do legacy boot, so FFFFFF if what you want to boot is installed in UEFI mode.
Yes, iPXE is available as an EFI build, but it behaved flaky for me and didn’t work with the Mellanox cards.
Since ConnectX-4 there is an EFI-capable PXE ROM on the cards, but I have CX3 and CX2.

NOTE: iPXE can only use 3 NIC ports, so if you have more, FFF good luck figuring out which is which.

The DHCP server that tells iPXE what to do took me something like two years of intermittent work to get working on FreeNAS, but I won’t write about that here right now. If you have questions, ask them!
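For orientation only (I won’t cover the FreeNAS specifics here): if you end up with ISC dhcpd, the usual chainloading pattern from the iPXE docs looks roughly like this. The server IP, filenames and the boot-script URL are placeholders, not my actual setup:

```
# hand undionly.kpxe to plain PXE clients, and a boot script to iPXE itself
if exists user-class and option user-class = "iPXE" {
    filename "http://10.0.0.2/boot.ipxe";
} else {
    filename "undionly.kpxe";
}
next-server 10.0.0.2;
```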

Hint: read everything before attempting anything!

Install Windows to the iSCSI target

All Windows installers seem to be iBFT-capable.

0: Set the proper boot order in the BIOS.
1: Boot into PXE: iPXE / FlexBoot / whatever.

With just an install USB stick

Proper boot order: first PXE, second the installer.

2: sanhook the target that you want to install to (see the example script after this list).
sanboot is also OK, since it hooks the target and then fails to boot if the target is empty.
3: Exit iPXE.
4: The installer should now boot; if not, check the boot order.
5: Proceed through the installer.
6: Add the relevant drivers, for example the Mellanox NIC drivers.
7: The previously hooked target should be available as an install target since iPXE left an iBFT entry. If it isn’t, there was an iPXE version with a buggy iBFT, or try the keep-san option mentioned later.
8: Install Windows.
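A minimal iPXE sketch for step 2; the IP and IQN are placeholders for your own target:

```
#!ipxe
dhcp
# hook the (still empty) install target
sanhook iscsi:10.0.0.10::::iqn.2019-01.local.nas:win10
# drop back to the BIOS boot order so the USB installer starts next
exit
```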

Notes:
You need to hook the target and install over the network card that will later be used to boot, because the installer does the necessary driver un-binding for that specific NIC for you.

My Asus Maximus IX Apex failed to boot the Windows 10 installer after exiting iPXE/FlexBoot even though the correct boot order was set. That is a BUG and could be due to other settings, or the fact that it runs a modded BIOS to support Coffee Lake on Z270.

I suggest you try the WinPE route, or manually prepare and transfer an image, as described in the other spoiler.

With WinPE + install USB

Make sure WinPE has the correct NIC drivers, for example for the Mellanox NICs.

2: sanhook the install target.
3: sanboot or just boot the WinPE (see the sketch after this list).
4: Start the installer “setup.exe” from within WinPE. Ideally this is built into WinPE, on an attached USB stick, or wherever.
5: Proceed with the installation.
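One way to do step 3 is iPXE’s wimboot. This is just a sketch, assuming you serve the WinPE files over HTTP; the server address and paths are placeholders:

```
#!ipxe
dhcp
sanhook iscsi:10.0.0.10::::iqn.2019-01.local.nas:win10
# boot WinPE via wimboot
kernel http://10.0.0.2/wimboot
initrd http://10.0.0.2/winpe/BCD        BCD
initrd http://10.0.0.2/winpe/boot.sdi   boot.sdi
initrd http://10.0.0.2/winpe/boot.wim   boot.wim
boot
```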

Has it worked? I had a “cannot install to this disk” error, but I was too dumb to add the Mellanox drivers to WinPE.

NOTE: iPXE / FlexBoot can only attach to two iSCSI targets simultaneously; a third one will fail to attach.
The error was “no space”. Maybe the 8 GB of RAM weren’t enough.

done.

NOTE: FlexBoot has a keep-san option that iPXE also has.
It shouldn’t matter, but the Mellanox release notes state different things, so you might want to set keep-san 1 with FlexBoot just in case.
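In script form that would be something like this (target URI is a placeholder again):

```
#!ipxe
set keep-san 1
sanhook iscsi:10.0.0.10::::iqn.2019-01.local.nas:win10
exit
```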

Prepare an existing W10 for iSCSI boot

Any of the ways described under “workarounds” here should work.

https://support.microsoft.com/en-us/help/976042/windows-may-fail-to-boot-from-an-iscsi-drive-if-networking-hardware-is

In short, what you need to do is tell the NDIS Lightweight Filter driver not to bind to the NIC you are booting from.
You do that by disabling the filter’s binding to that NIC.
The link above describes two ways: the BindView UI application and the Nvspbind command-line tool.

There are PowerShell cmdlets that let you do the same, but I forgot where I found those.
And in the end, all of this should just be registry entries, but I didn’t dive into those.
Some keys are mentioned here:
https://support.microsoft.com/en-us/help/2507616/0x0000007b-stop-error-after-replacing-or-switching-to-an-alternate-isc
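As a hedged sketch of the PowerShell route: list the bindings first, then disable the WFP lightweight filter entries on the boot NIC. The adapter name and the two ComponentIDs are what I’d expect on Windows 10, but verify them against the listing before disabling anything:

```
# list all bindings on the boot NIC, including hidden filters
Get-NetAdapterBinding -Name "Ethernet" -IncludeHidden

# disable the WFP lightweight filter bindings on that NIC
# (ComponentIDs are assumptions - check the listing above for your system)
Disable-NetAdapterBinding -Name "Ethernet" -ComponentID "ms_wfplwf_upper" -IncludeHidden
Disable-NetAdapterBinding -Name "Ethernet" -ComponentID "ms_wfplwf_lower" -IncludeHidden
```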

If you want to use BindView: well, I needed about 25 GB of space for Visual Studio and many different libs and SDKs just to compile it.

Here is my compiled version (AMD64). It should work; I haven’t had problems with it yet.
https://github.com/Ragebone/MS_bindview/releases/tag/1.0

Shortened BindView steps from M$
  1. Open BindView with admin privileges.
  2. Under “Show bindings for”, select “All Services”.
  3. You will see two “WFP Lightweight Filter … MAC …” nodes, one each for IPv4 and IPv6. Expand them to see their binding paths.
  4. Right-click the “Binding Path #” entries for the iSCSI boot NIC(s) and select “Disable”. If the Disable option is not available, the binding is already disabled; the list also shows whether a binding is disabled.
    That’s it!

Transfer your Windows to the iSCSI target.
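I don’t prescribe a transfer method here; one possible approach, just as a sketch, is a Linux live system with open-iscsi, logging in to the target and raw-copying the prepared disk onto it. IP, IQN and device names are placeholders, so double-check them before running dd:

```
# discover and log in to the target
iscsiadm -m discovery -t sendtargets -p 10.0.0.10
iscsiadm -m node -T iqn.2019-01.local.nas:win10 -p 10.0.0.10 --login

# the target shows up as a new block device, e.g. /dev/sdb
lsblk

# raw-copy the prepared Windows disk onto the target
dd if=/dev/sda of=/dev/sdb bs=4M status=progress conv=fsync
```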

Yay, now you should be able to successfully boot Windows from an iSCSI target.
If you get an INACCESSIBLE_BOOT_DEVICE BSOD, set the gateway of the boot NIC to 0.0.0.0, and if that doesn’t work, well, keep on reading.

Pain with hardware changes:
The moment you change the hardware in a way that changes the IDs of the boot NIC, Windows will detect it as a new device and bind the NDIS LWF (WFP) driver to it.
That then causes an “inaccessible boot device” blue screen.
The only way I currently know of to recover from that is to transfer the target to a “real disk” and boot from that disk to get new bindings.
Then disable the boot-NIC bindings again and transfer it back to the iSCSI target.

Windows 10 at least seems fault-tolerant enough that you can replace the NIC in case of a fault.
With other versions, you will need to apply those M$ patches to even be able to swap the NIC for an identical one.

Funny observation: my two W10 test installs couldn’t shut down properly, but restarts work fine, and so does applying updates.
Upgrades, on the other hand, have extra potential for breaking things.
I haven’t done one yet.

Why did I do this?
Well, friends and I do overclocking, and I got really annoyed by the time we waste every session shuffling SSDs, installing, reinstalling, updating, and recovering Windows installs.

So we dreamt of snapshots and “all disks in one place”.
My choice, because it is already in place and running: FreeNAS with ZFS zvols, snapshots, and clones.
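For illustration, the snapshot/clone workflow on the FreeNAS side could look roughly like this; pool and zvol names are placeholders, and each clone then gets its own iSCSI extent:

```
# snapshot a known-good Windows zvol
zfs snapshot tank/iscsi/win10-base@gold

# give each machine its own copy-on-write clone of that snapshot
zfs clone tank/iscsi/win10-base@gold tank/iscsi/win10-rig1
zfs clone tank/iscsi/win10-base@gold tank/iscsi/win10-rig2

# roll a broken clone back to one of its own snapshots later on
zfs rollback tank/iscsi/win10-rig1@before-oc-session
```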

The only thing that I haven’t found yet:
one image that multiple machines boot read-only, where each machine gets a few gigabytes of separate space for its differences and copy-on-write handles changes to the boot image.

M$ Server can do that if you need it.


Good luck with all of that! If you know a way to disable the NDIS LWF driver completely, or just for a family of NICs, so that I can boot any machine from one base image, please let me know!

4 Likes

And the boot times, even off an SSD, are surprisingly long for Windows alone. iPXE takes its time too. FFFFFFFFFF

1 Like

PXE is very slow as it uses TFTP to transfer data and TFTP is slow. TFTP is rocks-in-head stupid, as it was never intended to be fast, just simple to implement.

Have you considered trying to find an iSCSI HBA (i.e., a hardware-based iSCSI initiator) that can just pretend to be a disk (to Windows)?

Yes, it costs money for an HBA, but… it will likely make things a lot easier. And MUCH faster.

They definitely exist, usually used in hypervisor deployments when you want diskless servers that iSCSI-boot off a SAN. I haven’t looked, but you might be able to find some ex-server ones cheap. Whether or not Windows 10 can use them is another issue, but…

1 Like

I mean the time from the moment iPXE has connected to the iSCSI target and actually boots Windows:
black screen -> Windows logo -> login screen.

Even though it is CX3 40GbE with striped Samsung 840s, it takes long.
It feels as long as booting from the WD Reds.
So I doubt that a fast NVMe would change that.
Maybe it is because the Windows iSCSI initiator isn’t RDMA-capable and copies stuff around.

I have read about those iSCSI HBAs, though I haven’t looked into them yet.
I’m a bit skeptical performance-wise, but I’ll see when I look into it.

Ah, OK, so you’re definitely booting from iSCSI once PXE does its thing.

You’re definitely on gigabit Ethernet, yeah? Not WiFi? Not routing through a router? (That will fuck it up speed-wise unless you’ve got a real high-end routing device / decent layer 3 switch.)

I’m booting VMs over NFS (off a NetApp filer) from my Linux workstation in VMware Workstation at work and they seem snappy enough disk-IO-wise…

I’d break out Wireshark and see if there’s anything weird going on at the Ethernet layer 2 level.

1 Like

I’m on 40GbE, to be precise.
Booting off a stripe (RAID 0 in ZFS)* of two Samsung SATA SSDs.
CrystalDiskMark says about 500 MB/s read/write.

Ahh… make sure atime tracking is turned OFF on that dataset…
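On the FreeNAS shell that’s a one-liner (dataset name is a placeholder; the web UI exposes the same option per dataset):

```
zfs set atime=off tank/iscsi
zfs get atime tank/iscsi    # verify the setting
```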

1 Like

I think I did that; I will if I didn’t.
I expected lower boot times on the SSDs vs. the HDDs.
But it feels the same. Guess I’ll have to really measure it later.

1 Like

Thank you for sharing your pain.

A diskless Windows 10 system is something I’d very much like to try for myself, since I want to enjoy the benefits of ZFS like file integrity features, encryption (hence Oracle and not OpenZFS), caching, snapshots, etc. even for the boot device.

Since this is a personal, non-commercial project, I intended to use ESXi and Oracle Solaris with Intel XL710 40 GbE Ethernet parts (direct connection between server and client). ZFS would get 64 GB of RAM, a 480 GB Optane 905P as ZIL and L2ARC, and common SATA SSDs as the main data storage.

(The server isn’t running just for that Windows system, but it would be a nice add-on.)

1 Like

Regarding the Windows-only boot time after iPXE has executed the sanboot command:
after 50 seconds, the Windows ring starts spinning.
It took 1 minute 41 seconds to get to the desktop and another 10 seconds until the previously opened Task Manager had opened.

The disk usage (“Datenträger” in the German Task Manager) jumps to 100% a few times.

And that is from the striped SSDs.

On the WD Reds:
50 sec until the wheel spins,
2 min 57 sec to desktop.
The system is kinda unresponsive though; Task Manager still takes like 30 seconds to open.

atime is still on in both cases though, so the next try is with atime off.

Booting from the HDDs with atime off: 1 min 45 sec till desktop and 2 min till Task Manager is open and usable.

No change on the SSDs with atime off though, still 1 min 40 + 10.
It’s questionable how much iSCSI without RDMA and the NAS’s CPU are the limit in this case.
I mean, the NAS has a 1.5 GHz E5-2628L v4 …

Still, the % usage and response time of the SSDs are phenomenal in comparison to the HDDs. Who would have thought.

Hope I didn’t overlook it somewhere - what hardware is the client box using (CPU/Motherboard/Memory/Ethernet Adapter)?

You’re already using 40 GbE… is that also an Intel XL710 like I’d like to try, or a Mellanox part?

I haven’t really mentioned anything in that detail.

Asus Maximus IX Apex with a lent 8700K on it; the board is modded to do that.
8 GB RAM; I could slap the second DIMM in but didn’t, because I’m lazy and I don’t think that’s the problem.
The NICs are Mellanox CX354-FCBT, the HP OEM 544*? reflashed to the Mellanox part. It is InfiniBand VPI / Eth, set to Eth in the hardware config as default.
It sits in the last x16 slot, which could / should be PCIe x4 from the PCH. If that is Gen 2, it could cause performance problems with the NIC, but it should still perform better than it does, I guess.

I can test that rather easily though.

I’m kind of curious to compare your Mellanox to the Intel 40 GbE in a diskless environment - “normal” non-RDMA/iWARP SMB3MC performance with Windows Server 2016 is quite pleasant. But I’m in the process of moving everything to ZFS.

(screenshot: xl710-smb3mc)

1 Like

Be aware, the Windows iSCSI initiator is not RDMA-capable as far as I’m informed.
So SMB3MC, or rather SMB Direct, will perform better.

iSCSI + RDMA = iSER. No problem on Linux, just to have mentioned it.
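For reference, enabling iSER with open-iscsi on Linux looks roughly like this; IP/IQN are placeholders, and the target side has to support iSER too:

```
# create an interface definition that uses the iser transport
iscsiadm -m iface -I iser0 --op=new
iscsiadm -m iface -I iser0 --op=update -n iface.transport_name -v iser

# discover and log in through that interface
iscsiadm -m discovery -t sendtargets -p 10.0.0.10 -I iser0
iscsiadm -m node -T iqn.2019-01.local.nas:win10 -p 10.0.0.10 -I iser0 --login
```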

1 Like

The Intel XL710 can’t do RDMA or iWARP (I think that was Intel’s name for RDMA).

Any idea how Oracle Solaris performs here? In Windows, the maximum single-core performance for a single SMB3 connection and the lower turbo frequency with two active cores limit the SMB3MC performance.

Although RSS is supposed to be active, I’m only seeing one core above 95 % when saturating one 40 GbE link.

Hardware of the test systems: Xeon E3-1275 v5 / 64 GB ECC and Xeon E5-1650 v4 / 64 GB ECC.

I have no clue how Oracle Solaris will perform here; I’m on FreeNAS, so I assume the iX ZFS and, in the future, the ZoL implementation.

I can’t put iSCSI or iSER in relation to SMB, sorry.
If there is a way to boot over SMB, especially Multichannel and Direct (which should be the same thing, because fuck naming), go SMB Direct boot.

If you look here, iSER is significantly better than plain iSCSI.
And depending on the implementation (I’m looking at you, Windows), iSCSI might even be worse.

I did an iSER test over QDR (40GbE-like) InfiniBand with a Samsung 950 Pro and it behaved identically; just the rather low access times doubled, from something like 0.00004 to 0.00008.
Plain iSCSI as a test didn’t even reach 10 GBit/s, if I remember correctly.
It was horrible, and single-core limited.

I suspect it may well be something in the Windows 10 boot sequence that is the issue.

How is performance if you were to try and do this with, say, Windows 7 or Linux?

Might be an idea to attempt to isolate the problem to either your iSCSI/network/backend storage or the OS, because it would not surprise me in the least if Windows is attempting to do “stuff” that makes assumptions about the underlying storage that aren’t valid when running from iSCSI… and eventually it times out and continues after sitting there waiting for a while.

Are there any messages in the Windows event viewer under the system log (or any other relevant log)?

Maybe hit the event viewer main screen after boot and look for warnings/errors within the last hour.
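Something like this from PowerShell would pull warnings and errors from the System log for the last hour; just a rough sketch, adjust the filter to taste:

```
Get-WinEvent -FilterHashtable @{
    LogName   = 'System'
    Level     = 1, 2, 3          # critical, error, warning
    StartTime = (Get-Date).AddHours(-1)
} | Format-Table TimeCreated, ProviderName, Id, Message -AutoSize
```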

Because as far as performance goes - you should be getting WAY better performance than I am with this setup (well, one would hope, given your hardware):

  • Fedora 30 VM host (running VMware Workstation 15)
  • NFS-mounted datastore over 1GbE from FreeNAS RAIDZ2 with 6x 10k SAS drives (it’s an old PowerEdge R710, I think)

Booting Windows from that (i.e., a VMware Workstation 15 VMDK file hosted on the NFS-mounted folder) is relatively “normal” in terms of performance.

1 Like

Linux is next on the list for testing, since Windows 7 will be nearly identical to Windows 10 in this regard.

Will take a look at the logs later.

1 Like

Do you have a tutorial for that NFS VMware image boot?
That might actually be a better solution for my problem :smile: .
At least I’d like to give it a try.

Lol, no tutorial, but essentially: just export an NFS share from FreeNAS, allow access by IP or subnet, map user = root (from memory; or even better, set the security on the dataset and map user = the owner of the share on the FreeNAS side, I think I fixed it like that) and mount it under your Linux login.
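On the client side it’s just a standard NFS mount; a rough sketch with placeholder host and paths:

```
# one-off mount
sudo mount -t nfs freenas.local:/mnt/tank/vmstore /mnt/vmstore

# or persistently via /etc/fstab
# freenas.local:/mnt/tank/vmstore  /mnt/vmstore  nfs  defaults,_netdev  0  0
```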

I’m off work until Monday, but remind me via PM or something next week (otherwise I WILL forget) and I’ll check my configuration on both ends.

edit: and yeah, I’m definitely doing this via ZFS off FreeNAS. Dunno why I posted NetApp above, must have had NetApp on the brain for some other work-related reason at the time. We use NFS off NetApp for VMware ESXi over 10GbE in the production environment… my “lab” is a workstation with 64 GB RAM, a TB of SSD, and NFS to a FreeNAS box :smiley:

edit:
You can probably figure it out in half an hour of reading up on NFS if you’ve never used it before (judging by your level of understanding from what you’ve done in this thread). It’s pretty simple; much simpler than iSCSI to set up, IMHO.

1 Like