I suspect, but am not sure, that I have run into a Supermicro IPMI bug.
I’ve been replacing site routers with different VPN gateway devices. After each swap, every other device on my OOB management subnet is fine, but my Supermicros won’t talk to the gateway and route to other subnets, even though I can clearly ping them from the local subnet.
If we change the IP address on the IPMI then things come back to normal.
Has anyone else seen this? Any idea if it’s doing some sort of brain-damaged caching of the gateway’s MAC address for some indeterminate amount of time, unless or until you change the network configuration on the IPMI? I’m not talking minutes or hours either. I’ve got one that hasn’t been able to route through its gateway for 2 weeks, which I’m working on right now. The gateway IP hasn’t changed, but it’s a different device; again, every other device on this subnet has no problem.
I’m building a local on-site VM on my OOB management network to reconfigure one of the IPMI ports now, will get more details on model, software version, etc. shortly.
Model shows as Supermicro SYS-E302-9D in msinfo.
In this vein, will a Supermicro BMC reset leave the running OS alone?
I’m fairly sure it should, but I don’t want to test that live.
You are correct, a BMC reset won’t disrupt the host OS unless there is something really weird going on like you have some kind of poorly written daemon that will crash if it stops seeing BMC data it expects.
Are you trying to get into the IPMI via the web interface when it’s acting up, or are you using one of Supermicro’s dedicated management tools? I know it’s not supposed to be the case, but I swear I’ve gotten different behavior between the two when trying to do the same thing.
To confirm: I can get into the web GUI or ping it from the local subnet, but not from remote.
Every other device on the subnet can be accessed or pinged from remote (as in, a different subnet several thousand km away), as could this one prior to the gateway device being physically swapped out. They all have a gateway of 10.152.6.1 configured, and the subnet mask is the same everywhere, 255.255.255.0. This hasn’t changed.
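The split between "works locally" and "fails remotely" matches how a host picks where to send a frame: on-subnet destinations are resolved directly with ARP, everything else is handed to the gateway's MAC. A minimal Python sketch of that decision, using the gateway and mask quoted above; the BMC address and the two destination addresses are made up for illustration:

```python
import ipaddress

def next_hop(src_ip: str, netmask: str, gateway: str, dst_ip: str) -> str:
    """Return the IP address the sender has to ARP for in order to reach dst_ip."""
    local_net = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    dst = ipaddress.ip_address(dst_ip)
    # On-link destinations are reached directly; anything else goes via the gateway.
    return dst_ip if dst in local_net else gateway

BMC_IP = "10.152.6.50"     # hypothetical BMC address on the OOB subnet
MASK = "255.255.255.0"     # from the thread
GATEWAY = "10.152.6.1"     # from the thread

print(next_hop(BMC_IP, MASK, GATEWAY, "10.152.6.20"))  # hypothetical local peer
print(next_hop(BMC_IP, MASK, GATEWAY, "10.10.0.5"))    # hypothetical HQ host
```

Only the second case depends on the gateway's MAC being resolved correctly, which is why local pings can keep working while remote access is dead.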
It’s almost as if the IPMI has done an ARP lookup for the gateway address and refuses to re-ARP for the new device holding that IP address, and hence can’t route.
I’ll reset the BMC and see if it fixes it.
Oh, also to confirm: my colleague told me he fixed one of these Supermicro IPMI problems (we have a number of them and are changing out routers on sites) by changing the IP; however, that did squat for mine.
I changed it both via the web UI and also from within the running OS via ipmicfg tool, neither made any difference.
Could log in/ping the new IP from local OOB management subnet, but not from my HQ site.
Like I said, it seems like the device has done an ARP lookup for the IP address of its gateway on BMC boot and just caches it forever, even if that device/MAC address is no longer on the network. I figured I might be able to get it to reset the IP stack by setting the BMC to DHCP and back, but it did nothing.
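To make the hypothesis concrete, here is a toy Python model of a neighbour cache that never ages out entries, next to one that does. It is only an illustration of the suspected failure mode, not a description of what the BMC firmware actually does; the MAC addresses and the 60-second timeout are invented for the example.

```python
from typing import Dict, Optional, Tuple

class ArpCache:
    """Toy ARP cache. max_age=None models entries that are never expired."""

    def __init__(self, max_age: Optional[float]) -> None:
        self.max_age = max_age
        self.entries: Dict[str, Tuple[str, float]] = {}  # ip -> (mac, learned_at)

    def learn(self, ip: str, mac: str, now: float) -> None:
        self.entries[ip] = (mac, now)

    def lookup(self, ip: str, now: float) -> Optional[str]:
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self.max_age is not None and now - learned_at > self.max_age:
            del self.entries[ip]  # stale entry: force a fresh ARP request
            return None
        return mac

def resolve(cache: ArpCache, ip: str, now: float, network: Dict[str, str]) -> str:
    """Use the cached MAC if present, otherwise 'ARP' by asking the network dict."""
    mac = cache.lookup(ip, now)
    if mac is None:
        mac = network[ip]  # stands in for an ARP request/reply exchange
        cache.learn(ip, mac, now)
    return mac

GATEWAY = "10.152.6.1"
OLD_GW_MAC = "00:11:22:33:44:55"  # hypothetical MAC of the old router
NEW_GW_MAC = "66:77:88:99:aa:bb"  # hypothetical MAC of the replacement device
TWO_WEEKS = 14 * 24 * 3600

network = {GATEWAY: OLD_GW_MAC}
sticky = ArpCache(max_age=None)   # suspected behaviour
normal = ArpCache(max_age=60.0)   # assumed timeout, for contrast
resolve(sticky, GATEWAY, 0.0, network)
resolve(normal, GATEWAY, 0.0, network)

network[GATEWAY] = NEW_GW_MAC     # router swapped, same IP, new MAC

print(resolve(sticky, GATEWAY, TWO_WEEKS, network))  # still the old MAC
print(resolve(normal, GATEWAY, TWO_WEEKS, network))  # re-learned new MAC
```

In the sticky case every off-subnet frame is still addressed to hardware that is no longer on the wire, which would look exactly like "can ping locally, can't route".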
So, rebooting the BMC from the GUI did nothing (well, it rebooted the BMC of course, but…)
Cold reset of BMC using ipmicfg did nothing (as above).
This is kind of a shot in the dark, but what if you access the BMC through Supermicro’s IPMIView software? Do the settings displayed in that match up with the web-based IPMI GUI? I just don’t trust Supermicro’s (or ASRock’s) web interfaces; they’ve always seemed a little flaky to me.
Another shot in the dark:
Does toggling GARP enable/disable change behavior?
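For anyone following along, a gratuitous ARP is a broadcast ARP in which the sender and target IP are both the sender's own address, so neighbours refresh their cache entry for that IP. A rough Python sketch that only builds such a frame (it does not send anything); the MAC address is made up, and the ARP-request form with a zeroed target hardware address is one common variant, not necessarily what any particular BMC or gateway emits:

```python
import socket
import struct

def build_garp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for ip."""
    sha = bytes.fromhex(mac.replace(":", ""))  # sender hardware address
    spa = socket.inet_aton(ip)                 # sender protocol address
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth_header = broadcast + sha + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP, zeroed target MAC, target IP equal to the sender IP
    arp += sha + spa + b"\x00" * 6 + spa
    return eth_header + arp

frame = build_garp("66:77:88:99:aa:bb", "10.152.6.1")  # hypothetical gateway MAC
print(len(frame), frame.hex())
```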
ipmicfg (command line tool) matches the GUI settings and they are correct.
It won’t let me turn on gratuitous ARP (already tried that).
I think I’ll need to hard power cycle it.
Either that or it’s a weird bug in the Silverpeak unit in front of it, but like I said, this isn’t the first time I’ve seen this, and every other device on that subnet behind the Silverpeak unit is 100% OK every time.
Looks like this isn’t an IPMI bug; it may be a Silverpeak SD-WAN software bug in build 220.127.116.11. For some reason, all the devices it won’t route (seemingly at random from device to device) are on my management VLAN 6.
I’ve seen this sort of behaviour on a few devices now behind a couple of Silverpeaks on different sites. I fixed one of them by upgrading to 18.104.22.168, so will see how that goes elsewhere.