I have been playing with Proxmox for a few weeks now.
Hardware
Two Dell R720s with 24 cores (plus a third workstation for quorum…).
Both have Intel 10Gb NICs hooked to a Ubiquiti 10Gb switch.
The problem I am having is network speed between the nodes. A replication between node A and node B takes hours; this should be pretty fast over 10GBase-T.
I am getting a max of 150MB/s between nodes.
The VMs are on a ZFS SSD RAID; they seem snappy.
What should I be looking at?
Thanks!
PS - rsync is even slower… 15MB/s
When did the performance degrade?
Do you know what changes were made leading up to the issue, or is this a new setup that just isn't reaching its stated specs?
This is a new setup… so it's just not reaching the expected performance, unless I am asking too much. (This is relatively old hardware…)
The plan is to buy a Gigabyte EPYC server… soon.
It’s tough to say without complete specs, but I’d be looking closely at my performance counters if they are available.
I am not familiar with that specific virtualization software, but I would also check your drive block sizes and software-based striping parameters (if that is relevant in ZFS). If you're using any software-based link aggregation, hopefully you have a recipe book for setting that up to spec for your desired application.
Also, if you have the ability to see network utilization on your Ubiquiti device, perhaps you've already determined that the bottleneck exists elsewhere within your machines.
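If the nodes are Linux, the generic counters are easy to watch while a transfer runs; a minimal sketch, assuming the sysstat package is installed and with a made-up interface name:

```bash
# Per-interface throughput, sampled every second (needs the sysstat package)
sar -n DEV 1

# Confirm the NIC actually negotiated 10Gb/s (interface name is a placeholder)
ethtool enp3s0 | grep -i speed
```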
150MB/s sounds like a disk drive cap or a cabling issue, assuming it's constant. Have you tried eliminating variables one at a time?
- run Linux from a USB pen and test a single address to single address copy to verify the network (a quick sketch follows this list)
- add drives one at a time to see if you have faulty SSDs or if one is full / worn
- try a non-ZFS array to see if it's related to the config
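A rough sketch of the first two checks; the IP address and device name are placeholders:

```bash
# On node B: start an iperf3 server
iperf3 -s

# On node A: 30-second TCP test against node B, no disks involved
iperf3 -c 10.0.0.2 -t 30

# Read-only sequential read off one SSD, bypassing the page cache
# (/dev/sdb is a placeholder for whichever drive is under test)
dd if=/dev/sdb of=/dev/null bs=1M count=4096 iflag=direct
```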
A little more testing… I have 3 nodes; one is a 3rd-gen HP workstation. I have both a CIFS and an NFS share that I can copy from. Both seem to be capped at 40MB/s or less (cp'ing a file from either share runs at this speed). The HP has a 1Gb NIC in it (same performance), so I put a 10Gb NIC in it for grins. Same: <40MB/s transfers.
Maybe Proxmox? I don't think so. I have a coworker who uses Ubuntu as his daily driver. He also seems to peak at 40MB/s copying from the CIFS server.
Now, we mostly connect with Windows machines. My workstation connects to the same server (Samba) and I can pull files at the peak rate of the 1Gb NIC in my computer (slightly over 100MB/s).
(The CIFS server is an Ubuntu server running Samba with 4 bonded 1Gb NICs tied in a LAG.)
thanks!
If the machines are linked by both 1Gb and 10Gb, are you sure the traffic is going over the right link? You should be able to monitor traffic per individual link/NIC/interface.
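For example (addresses and interface names are placeholders), `ip route get` shows which interface the kernel will pick for a given destination, and the bonding status file shows the mode and active slaves if a bond is in play:

```bash
# Which interface will traffic to the other node actually leave on?
ip route get 10.0.0.2

# Byte counters for one interface; sample before and after a transfer
ip -s link show dev enp3s0

# If the link is a bond, check its mode and which slaves are active
cat /proc/net/bonding/bond0
```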
OK, after some playing: I can scp between two of the servers and am getting around 350MB/s… To one of them I get 100MB/s… Well, I figured out the SSDs I am using for the boot drive on that server only manage 100MB/s reads. So, duh…
The RAID pools (6× 2TB SSDs in ZFS RAIDZ1) on the two Dell R720 servers seem to transfer at a rate of about 150MB/s.
This might just be the limit of the R720s. It will be fun to play with the EPYC server when it gets here.
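If anyone wants to watch the same thing, pool throughput is visible live while a replication runs; a minimal sketch, with the pool name as a placeholder:

```bash
# Live per-vdev read/write throughput, refreshed every second
# ("tank" is a placeholder for the actual pool name)
zpool iostat -v tank 1
```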
I still have an issue between servers over Samba, as I said before. For some reason, my Windows machines can pull files off the Samba server at 100MB/s+, while the Proxmox servers seem to peak at 50MB/s (NFS, Samba, rsync, whatever).
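One thing I still need to try on the Linux side is mounting the share with explicit options instead of the defaults, to see if the SMB dialect or read size is the limiter; a sketch, with the server, share, user, and sizes all placeholders:

```bash
# Force a modern SMB dialect and larger read/write buffers
# (server address, share name, user, and sizes are all placeholders)
mount -t cifs //192.168.1.50/share /mnt/share \
  -o vers=3.0,rsize=1048576,wsize=1048576,username=me
```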