So I’ve been looking to containerise some HPC/CFD/FEA applications, moving them from running on bare metal into some form of container (I haven’t decided yet whether that will be LXC containers or Docker containers).
Given that, I’m trying to set things up so that a single 100 Gbps port/connection can be shared between the LXC containers and/or the Docker containers.
I’ve enabled SR-IOV on my Mellanox ConnectX-4 NICs, and I can see the virtual functions in ip link, but when I try to pass one of them to an LXC container (or even a VM), it shows “NO-CARRIER”.
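In case it helps, this is roughly what I did to create the VFs (the interface name ibp1s0 and the GUID values here are just examples from my setup; yours will differ). I’ve also since read that on ConnectX cards in IB link mode, the VFs reportedly won’t come up until you assign node/port GUIDs to them, so that may be my NO-CARRIER culprit:

```shell
# Create 4 virtual functions on the physical function
# (replace ibp1s0 with your PF's interface name):
echo 4 > /sys/class/net/ibp1s0/device/sriov_numvfs

# In IB link mode, VFs supposedly stay down until they have GUIDs.
# These GUID values are made up for illustration:
ip link set ibp1s0 vf 0 node_guid 00:11:22:33:44:55:66:77
ip link set ibp1s0 vf 0 port_guid 00:11:22:33:44:55:66:88

# Force the VF's link state on, rather than having it follow the PF:
ip link set ibp1s0 vf 0 state enable
```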
(Meanwhile, ibstat on the hosts in my 3-node cluster shows:
...
State: Active
Physical state: LinkUp
...
so the physical links themselves are fine.)
So right now I’m thinking of dropping SR-IOV altogether and potentially going with Docker containers instead, but I haven’t found any good documentation on how Docker networking works (or doesn’t work) with InfiniBand.
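From what I can tell so far, Docker’s built-in network drivers don’t really manage RDMA/InfiniBand at all; the pattern I keep seeing is to pass the RDMA character devices into the container directly and use host networking for the IPoIB side. A rough sketch of that (device names under /dev/infiniband will vary per system, and the image is just a placeholder):

```shell
# Pass the RDMA devices through and raise the locked-memory limit,
# which RDMA verbs applications need:
docker run -it --rm \
  --device=/dev/infiniband/uverbs0 \
  --device=/dev/infiniband/rdma_cm \
  --cap-add=IPC_LOCK \
  --ulimit memlock=-1 \
  --net=host \
  ubuntu:22.04 bash
```

With --net=host the container just uses the host’s IPoIB interface, so none of Docker’s network drivers are actually involved; the trade-off is you lose per-container network isolation.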
Anybody have any ideas on how to get that set up?
(And yes, I did try setting one of the ports on my dual-port card to the ETH link type rather than IB. Host-to-host, it can do 96.9 Gbps (out of a possible 100 Gbps) with 8 parallel streams. But VM-to-VM, it tops out at 34 Gbps, which, for FEA work, would be abysmal.)
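For reference, those throughput numbers came from something like the following iperf3 run (the IP address is just an example):

```shell
# On the receiving host/VM:
iperf3 -s

# On the sending side: 8 parallel streams, 30-second run
iperf3 -c 192.168.1.2 -P 8 -t 30
```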
This is why I am trying to find a different solution to this problem.
Any help is greatly appreciated.