Does a SAN make sense?

That would be a scenario I’d be OK to go with, but again, if I hit a tree…

Two storage boxes (start with two).

e.g. TrueNAS Scale with ZFS, and you can mix iSCSI with Samba and NFS. [edit: don’t bother with spinning storage for 10T of data; go all flash, but use enterprise flash with power-loss protection that won’t die once you exceed the rated write limits. This is much simpler than various weird hierarchical storage caching setups; you can go with raid10 or raidz made from large-capacity 3.5" flash drives, relatively cheap per byte]
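A rough sketch of what that flash pool could look like (hypothetical /dev/disk/by-id names; striped mirrors for the raid10-style layout, raidz2 shown as the parity alternative):

```bash
# "raid10" style: striped mirrors - fast resilver, half the raw capacity usable
zpool create tank \
  mirror /dev/disk/by-id/ssd0 /dev/disk/by-id/ssd1 \
  mirror /dev/disk/by-id/ssd2 /dev/disk/by-id/ssd3

# raidz alternative: better capacity efficiency, slower resilver
# zpool create tank raidz2 /dev/disk/by-id/ssd0 ... /dev/disk/by-id/ssd5
```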

Two compute (“cores”) boxes [edit: I should really start saying RAM boxes] for your VM services (start with two), and put Proxmox on them.

These two will give you clicky web UIs for management that’ll save you time in the long run.

As for off-site backup, you can use some cloud or you can use a third storage box.


If you can deal with some downtime, with having your data and services single-homed for a while, or with having your data go back in time (restore from backup) and then resolving the resulting inconsistencies between systems, you can simplify your life dramatically with just 4 hosts (5 counting off-site backups), as I described above.

E.g. you can run 2 AD domain controllers, replicate your SQL databases, and fail over manually.

You can cron ZFS snapshot sending and receiving.
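A minimal sketch of what that could look like, assuming a dataset called tank/vms and a second box reachable as backup-box (all names hypothetical):

```bash
#!/bin/sh
# replicate.sh - snapshot a dataset and incrementally send it to the second box
DATASET=tank/vms
REMOTE=backup-box
NOW=$(date +%Y%m%d-%H%M)

# newest existing snapshot becomes the incremental base (empty on first run)
PREV=$(zfs list -H -t snapshot -o name -s creation "$DATASET" | tail -n 1)

zfs snapshot "${DATASET}@${NOW}"

if [ -n "$PREV" ]; then
    zfs send -i "$PREV" "${DATASET}@${NOW}" | ssh "$REMOTE" zfs recv -F backup/vms
else
    zfs send "${DATASET}@${NOW}" | ssh "$REMOTE" zfs recv backup/vms
fi
```

Then a crontab entry like `0 */8 * * * /root/replicate.sh` gives you three replicated snapshots a day; tools like sanoid/syncoid do the same job with retention policies if you’d rather not hand-roll it.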


An alternative would be to build a Ceph cluster with 5+ hosts and run a combination of iSCSI, filesystem, and Kubernetes stuff.

I think for a 70-person organization, unless you’re planning on growing substantially over the next year or two (to 500 people), you probably don’t want to maintain Ceph/Kubernetes “in house”, and I think you’d be best off outsourcing it… if that HA stuff is what you need/want.


I was figuring on combining the VM hosts and ZFS local storage to avoid the complexity of NFS/iSCSI/etc.

Avoiding the network storage for VM stuff just eliminates so much configuration complexity, network speed requirements, performance diagnostics, etc.

Store local snapshots on the hosts for fast recovery in the “I trashed a VM” case; the backup host is for when a host dies and you need to restore in a DR scenario.
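For the “I trashed a VM” case, recovery from a local snapshot is a one-liner (hypothetical dataset name; note that rollback discards everything newer than the chosen snapshot):

```bash
# list what's available, then roll the VM's dataset back to the last good state
zfs list -t snapshot -o name,creation tank/vms/vm101
zfs rollback -r tank/vms/vm101@20240115-0800
```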

Given your data is stored in at least 2 places at a time (i.e., on the host, snapshotted 3x/day, and replicated to your archive), you could probably get away with one archive host (if it fails, the data is still on the original host). But ideally have a tape unit on that or something as well.

Theme of the above is “keep it simple”. Basic, basic VM maintenance via something like virt-manager or even boxes… no live migration etc. ZFS would be the most complex bit.
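To give a feel for how basic that maintenance is, the day-to-day CLI equivalents of the virt-manager clicks are roughly this (hypothetical VM name):

```bash
virsh list --all              # every defined VM and its state
virsh start fileserver01      # boot a VM
virsh shutdown fileserver01   # clean ACPI shutdown
virsh dominfo fileserver01    # quick status/resource check
```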

Again, this is something I’d happily run in a lab (not in prod, at least not without significant testing first in a dev/staging environment), but I’m dealing with 4000+ users in a 24/7 operation (and the capacity to spend is there; I just need to put in the effort and forward a business case). YMMV

edit:
let me provide an additional caveat: if the above sounds really simple… it’s because you haven’t tried it yet :slight_smile:

You’d need to lab it up, test various scenarios, and make sure that your replication actually works, etc.
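“Actually works” is checkable in a couple of lines, e.g. confirming the newest snapshot really made it to the far end (hypothetical names again); the real test is still periodically booting a restored VM:

```bash
# compare the newest snapshot name on source and destination
src=$(zfs list -H -t snapshot -o name -s creation tank/vms | tail -n 1)
dst=$(ssh backup-box zfs list -H -t snapshot -o name -s creation backup/vms | tail -n 1)

if [ "${src##*@}" = "${dst##*@}" ]; then
    echo "replication up to date (${src##*@})"
else
    echo "WARNING: destination is behind (src=${src##*@}, dst=${dst##*@})"
fi
```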

edit:
you could also probably do this with 3 TrueNAS boxes, as they can run VMs. And you could even get support for that, I suspect.


Could also run 2 VM hosts and 1 Veeam box for backups. That way, you can “instant restore” VMs on the remaining host should one fail.

My same design with 2x smaller hosts and still the 1x TrueNAS box would work identically, and could give you VM redundancy. Whether you do true HA of the VMs or the cheaper “I will start the clone myself if one host goes down” redundancy would be up to your budget.

There are multiple strategies you could employ, like 2x hosts and 1x storage appliance as their central storage (so you can use that storage OS’s snapshotting/replication functions), or having the 2x hosts operate on their own local storage and using the storage appliance as the backup appliance.

However, seeing as you already have a backup NAS, I’d probably go with the 2x hosts and centralized storage approach, and use the other backup NAS to back up your primary NAS. Keep the layout relatively simple and straightforward that way.

If you don’t already have it, make a point of having 10Gb switching between all your storage and servers. I’ve seen way too many small shops and governments brush it aside and not realize how much of their slowdown is just due to having 1Gb networking in their core.

Depending on what you’re familiar with, both TrueNAS and Synology can work well.


Tbh it sounds like most of that might be better off just moved to Azure.

You can just use Office 365 and Azure AD and call it a day. Depending on the file server requirements you might still want that local, but the other stuff I would offload to Azure if I were in your shoes.


You can look at Liqid to build an environment from scratch. Wendell had some really good videos talking about composable technologies; I’m definitely sold on its future potential for on-prem hardware.

Agreed, but then you’ve added network speed and configuration into your VM storage path; you’ll need faster switching and some additional performance diagnostics.

It’s definitely the better (or perhaps more complete and scalable) solution given enough network and skill set, but not storing the VMs on network storage while they’re live saves a heap of complexity and will be easier to diagnose performance-wise.

And a 1G network? No problem; it’s just used for snapshot copies and user access.

I’d also consider that. With the advent of carrier-grade NAT, VPNs are becoming a shit show for work-from-home etc., and Microsoft is less likely to go down than your head office ISP.

You’ll also get multi-factor auth for free, not have to patch Exchange any more, etc…

If I was starting over, the only thing I’d put on-prem today would be maybe file and print, and the files would be replicated up to Azure anyway.

I literally just did a project for my BAS Cyber program about your exact situation. It really makes more sense to just get what you can into Azure, especially with 70 users and being an MS shop. I would look into it. You might have to spend a few days learning all the Azure lingo and products, but they have a calculator that makes it pretty easy to estimate stuff out.


I had, perhaps naively, assumed @dem_Geist had considered cloud, but there are file servers in the mix; and good connectivity is not always a given, and then there’s cost control.

JumpCloud and other stuff might not be a bad idea either, and you may be able to save on CAL fees.

@ucav117 and others, how well does Azure deal with simple file shares? Is it just a VM with network latencies be damned, or are they doing something smarter (a distributed cache-coherent filesystem as a service)?


Morning y’all,
cloud isn’t really an option for us, since we use a lot of old local government applications that have no support if the system is hosted in the cloud.
But I’ve got a few ideas now for keeping the costs a bit lower. Replace the old ESX hosts with a set of newer ones, say at a ratio of 3 to 1. vMotion is nice to have, but in our case maybe overkill, since a bit of downtime is OK. So: new VMware hosts with Essentials licensing, and either 2 all-flash NAS systems with sync and snapshots enabled, or a SAN. Everything connected with iSCSI over dual 10G interfaces at least, or, in the case of a SAN, the systems directly connected to the storage via SAS. If one host goes down, we can mount the VM on another host, or recover the system from snapshot/backup etc. onto the still-alive hosts. Applications we can offload into the cloud, like Exchange, will be offloaded next year. Backups will still be done with Veeam to a separate NAS and off-site.
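For the iSCSI leg of that, attaching each host to the storage is roughly this with open-iscsi (hypothetical portal IP and target IQN):

```bash
# discover the targets the NAS/SAN is offering
iscsiadm -m discovery -t sendtargets -p 10.0.10.10

# log in to the discovered target
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:vmstore -p 10.0.10.10 --login

# make the session come back automatically after a reboot
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:vmstore -o update \
    -n node.startup -v automatic
```

With dual 10G interfaces you’d typically run one session per interface and let multipathd handle the failover.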
What about that plan?

I’d say ditch VMware and go with a KVM or Xen based solution - there’s no licensing to worry about.

Since you don’t seem to need many VM hosts, I’d advise staying away from things like oVirt (vCenter analogue-ish) or KubeVirt; do Proxmox instead, or xcp-ng.

(with Proxmox/xcp-ng you can half-train a local helper who can take care of basic stuff when you go on vacation).
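For a sense of scale, the “basic stuff” on Proxmox is a handful of commands a helper can learn in an afternoon (hypothetical VM ID and backup storage name):

```bash
qm list                            # all VMs on this node and their state
qm start 101                       # start a VM
qm shutdown 101                    # clean guest shutdown
vzdump 101 --storage backup-nas    # ad-hoc backup of a single VM
```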


Speaking of vacations, have you considered outsourcing to a consultancy that would take over dealing with basic infra stuff, while you provide “white gloves” treatment …
… are you in touch with any peers?

edit:

Ah ok, it’s not a “one man show”, good.

Yes and no. I am the only “real” sysadmin, so from the technical perspective it’s a one-man show, but my boss is willing to throw money at a new colleague as well as SLA contracts for said systems.

Honestly, if you want to do it the easy way, you can just OneDrive it with Office 365.

Otherwise, there are a few different ways of doing it, starting cheap and going up to astronomical depending on your needs.

What I was looking for is apparently called “Azure Files”; I don’t know how practical it is…

For example, something like: you have an office in Berlin and an office in NYC (a mount sketch follows the list):

  • both have the same share / UNC path mounted
  • both can read/write to the share
  • both can read the same 1MB file off of the share in 5ms or less.
  • both can open two separate excel spreadsheets for read write in the same directory in 100ms or less (typical public internet RTT is around 85ms)
  • no planned downtime, no maintenance, no upgrades, no Windows to manage - pure service.
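From what I can tell, Azure Files presents as a normal SMB share, so mounting it is ordinary cifs; a sketch under assumed account/share names:

```bash
# mount an Azure Files share over SMB 3.x (hypothetical storage account/share)
sudo mkdir -p /mnt/teamshare
sudo mount -t cifs //mystorageacct.file.core.windows.net/teamshare /mnt/teamshare \
    -o vers=3.1.1,username=mystorageacct,password='<storage-account-key>',serverino
```

Whether it can hit the 5ms/100ms targets above comes down to distance to the region rather than the protocol, so it would need testing from both offices.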
