This may be a really dumb idea, and please feel free to tell me so, but I would like to split my home lab into two sections: compute and storage. Compute will be multiple Proxmox nodes; storage will be a TrueNAS server with ~80 TB of available pool. All of these nodes will be connected via 10 Gb links, with VLAN segregation and such as appropriate.
I would like to have all of my VM and CT disks (including OS disks) placed on the storage node and accessible via iSCSI (probably a ‘share’ per VM/CT to segregate data).
A couple of days of research has not turned up anyone else doing this, but perhaps my Google-fu is weak… any thoughts on this idea?
1. You can make a ginormous zvol, share it via iSCSI, and add that as storage in Proxmox. You can then carve out all the storage for VMs, containers and whatever else from it. This is the quick and (very) dirty approach.
2. One iSCSI share and storage per VM, one for each VM… this gets kind of disorganized and complicated if you’ve got a whole lot of zvols. I moved from this approach to 3. after some time.
3. Called “ZFS over iSCSI”, this is a kind of Proxmox plugin that automates the creation and iSCSI sharing of zvols over an SSH connection. This is the most straightforward and scalable approach, but it also requires more than a couple of clicks in the Proxmox UI.
I’d say this is the best-practice approach, and it functions similarly to a local_zfs storage on the Proxmox server. It means each VM disk is created as a zvol instead of a qcow2 file by default, which is nice and easy once you’ve got it running.
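To make the "one zvol per VM disk" idea a bit more concrete, here is a rough Python sketch that only builds the `zfs create` command for a single disk; it does not run anything. The pool name `tank`, the helper names and the sizes are made up for illustration, and the `vm-<vmid>-disk-<n>` naming simply mirrors what Proxmox uses for its own ZFS-backed disks. The plugin does the equivalent for you over SSH, so treat this as a picture of the result rather than something you need to script yourself.

```python
from typing import List


def zvol_name(pool: str, vmid: int, disk_index: int) -> str:
    """Return a dataset name in the vm-<vmid>-disk-<n> style."""
    return f"{pool}/vm-{vmid}-disk-{disk_index}"


def create_zvol_command(pool: str, vmid: int, disk_index: int,
                        size_gib: int, sparse: bool = True) -> List[str]:
    """Build (but do not execute) a `zfs create` command for one VM disk.

    -V makes the dataset a volume (zvol) of the given size,
    -s makes it sparse, i.e. space is not reserved up front.
    """
    cmd = ["zfs", "create"]
    if sparse:
        cmd.append("-s")
    cmd += ["-V", f"{size_gib}G", zvol_name(pool, vmid, disk_index)]
    return cmd


if __name__ == "__main__":
    # Hypothetical example: a 32 GiB first disk for VM 100 on pool "tank".
    print(" ".join(create_zvol_command("tank", 100, 0, 32)))
```

With approach 2 you would also be creating an iSCSI target per zvol by hand, which is the part that gets messy as the number of VMs grows.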
It isn’t dumb…a lot of people use ZFS storage outside of Proxmox. Virtualizing e.g. TrueNAS in a VM is rather popular and having a separate bare-metal server is even better. And ZFS can’t be clustered, so you can’t use shared storage on Proxmox other than with external storage.
Wow, OK. I had seen ‘ZFS over iSCSI’ but had skipped it because it looked really complex to set up. Now that I have actually RTFM, it is somewhat complex but doable. Sounds like the one I will set up.
What is your (or the generally accepted) method for moving over the RSA keys? Just turn on password auth for a minute?
I ask my dad to read it for me while I type it in on the other side… just as he asks me to read his Wi-Fi password all the time. Just kidding.
You can save the public key on a USB drive and plug it into the storage server (~/.ssh/authorized_keys is where the public keys go). Or just switch to password auth… unplug the WAN if you feel too uncomfortable doing this.
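If you go the USB route, the step on the storage server boils down to appending the public key to the account's `authorized_keys` file with tight permissions. Here is a minimal Python sketch of that step, assuming a stock OpenSSH setup; the function name and the example key are hypothetical, and on an appliance like TrueNAS you may prefer to paste the key into the user settings in the UI instead.

```python
import os
from pathlib import Path


def install_public_key(public_key: str, ssh_dir: Path) -> bool:
    """Append public_key to ssh_dir/authorized_keys unless it is already there.

    Returns True if the key was added, False if it was already present.
    """
    key = public_key.strip()
    if not key:
        raise ValueError("empty public key")

    # sshd is picky about permissions: 0700 on the directory, 0600 on the file.
    ssh_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(ssh_dir, 0o700)

    auth_file = ssh_dir / "authorized_keys"
    existing = auth_file.read_text() if auth_file.exists() else ""
    already_present = key in (line.strip() for line in existing.splitlines())

    if not already_present:
        with auth_file.open("a") as fh:
            # Make sure we start on a fresh line if the file lacks a trailing newline.
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(key + "\n")

    os.chmod(auth_file, 0o600)
    return not already_present


if __name__ == "__main__":
    # Hypothetical usage: read the key from a mounted USB stick.
    # pub = Path("/mnt/usb/id_rsa.pub").read_text()
    # install_public_key(pub, Path.home() / ".ssh")
    pass
```

Only the public half of the key pair needs to travel; the private key stays on the Proxmox side.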
I don’t use password auth and I got key pairs all over, but I’m not that strict on my home network.
That’s not unusual. If you set up a cluster with fixed connections between nodes, it’s a root SSH keypair. I’m not sure if Proxmox has the flexibility to do otherwise, because of the rights needed to do all the stuff on the storage server that makes everything work, like meddling with the iSCSI daemon service, setting up portals and targets, etc.
I’d check on the Proxmox forums or the general internet. But I don’t see a problem myself. If I don’t trust my Proxmox host, I wouldn’t entrust it with accessing my pool either.
One thing I have an opinion on is not putting your VM partition on the same pool as your data.
The reason for this is snapshots and backups.
The VM pool gets neither snapshots nor backups. The data drives get both.
Ideally the backups get pulled from the file server to an independent host. The backup server can push files to the file server, but the file server cannot access any of the data on the backup server.
In case of ransomware your backup server does not get compromised.
That is an interesting thought. As far as pivoting around the network, I don’t know much, but my thoughts are this (and please correct me if I’m wrong).
If a VM is compromised it would only be able to get to its ZFS/iSCSI share and destroy that, but if the host Proxmox system was compromised it would destroy both the compute and storage because it has root shell access.
So is there a way to prevent a pivot from the VM to the host system, and if so, how do you do it? Is it just network segmentation (VLANs) and firewall rules? Can it be done without leaving the box?
While you COULD do it all as VMs, I have always put the backup server on a different physical host. When you find out after a lightning strike that your surge suppressor was worn out and all of the drives connected to the backplane on your disk shelf are now fried, it is nice to know that as soon as you get the new hardware in you can be online within a few hours. If you live in a city, or near one, you could have your system online within half a day. (When you make a production system using mirrors, put at least one member of each mirror in an independent disk shelf, with a different power and data path, hopefully through an independent SAS controller.)
Presumably the file server has a PCIe SAS card, not just volumes. You want the storage server to report on things like excessive CRC errors, or interface resets that occur between the drives and the SAS expanders, or between SAS expanders and SAS controllers. You also want the storage controller to be able to blink a dead drive so you don’t pull the live drive from a degraded mirror, killing the entire pool.
If you did it all as VMs, I would give the backup server its own SAS card to attach the backup hard drives to; that way it is much more difficult for another VM to compromise the backup drives. It would be more vulnerable to user error if you just gave it volumes.
Having the backup server and the primary server use independent, isolated SAS controllers does mean that you have to be physically present if you find you need to replace your primary pool with your backup pool.
If you are just assigning drives to VMs, you can unallocate a set of drives from your backup server and then allocate it to your primary server.
The backup server does not need much CPU or RAM, it needs to perform its work eventually, not right now. Usually just scheduling backups for off times is enough.
Snapshot the data drives hourly or daily, then send the diff between snapshots to the backup server. I have heard some ZFS horror stories about corrupted directories, so I usually wait an extra day before applying the last snapshot to the replicated array. This does mean that I need a fairly high-bandwidth, high-endurance location to store the short-lived daily diffs.
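As a rough illustration of that workflow, here is a small Python sketch. One helper builds the incremental `zfs send -i` command for the diff between two snapshots, and the other picks which staged snapshots are old enough to apply after the one-day hold. The dataset and snapshot names are invented, the one-day default is just the delay described above, and nothing here actually runs `zfs`; it only shows the logic.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def incremental_send_command(dataset: str, prev_snap: str, new_snap: str) -> List[str]:
    """Build (but do not execute) an incremental send between two snapshots."""
    return ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]


def snapshots_ready_to_apply(snapshots: List[Tuple[str, datetime]],
                             now: datetime,
                             hold: timedelta = timedelta(days=1)) -> List[str]:
    """Return snapshot names, oldest first, that have aged past the hold period.

    Anything newer than `now - hold` stays in the staging area for now.
    """
    cutoff = now - hold
    ordered = sorted(snapshots, key=lambda item: item[1])
    return [name for name, taken in ordered if taken <= cutoff]


if __name__ == "__main__":
    # Hypothetical daily snapshots of a dataset called tank/data.
    now = datetime(2024, 1, 4, 3, 0)
    staged = [
        ("daily-2024-01-03", datetime(2024, 1, 3, 2, 0)),
        ("daily-2024-01-02", datetime(2024, 1, 2, 2, 0)),
        ("daily-2024-01-04", datetime(2024, 1, 4, 2, 0)),
    ]
    print(snapshots_ready_to_apply(staged, now))
    print(" ".join(incremental_send_command("tank/data", "daily-2024-01-02", "daily-2024-01-03")))
```

The output of each send would sit in the staging location until its snapshot clears the hold, which is why that location sees so much write traffic.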