Hi All,
I’ve been happily running a Brocade ICX7250 48P-2X10G network switch 24/7 for a few months now. Recently I noticed my Proxmox cluster nodes regularly rebooting my VMs, and I think I have traced the problem to the Brocade switch: it reboots intermittently, multiple times per day, and I don’t see an obvious pattern to it.
I have remote syslog logging to a server, so those logs persist, whereas the output of the show logging command doesn’t seem to survive a reboot. I don’t see anything unusual in there other than regular warnings along the lines of:
May 26 19:42:24:A:System: Stack unit 1 Temperature 67.0 C degrees,
I see this rise steadily to about 80C, then at some stage a reboot occurs, after which the temperature is back closer to 60C and starts rising again.
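In case it helps anyone look at the same data, here is a minimal Python sketch of how the remote syslog file could be scanned for this rise-then-drop pattern. The regular expression is based only on the single sample line above, and the `year` default, the 10-degree `min_drop` threshold and the function names are my own assumptions. The second and third sample lines are made up for illustration and are not taken from my logs.

```python
import re
from datetime import datetime

# Pattern derived from one sample line only; other firmware or syslog
# servers may format the message differently (assumption).
TEMP_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}):\w:System: "
    r"Stack unit (?P<unit>\d+) Temperature (?P<temp>\d+(?:\.\d+)?) C degrees"
)


def parse_temps(lines, year=2025):
    """Return (timestamp, temperature) pairs for lines matching TEMP_RE.

    The log timestamp has no year, so one is supplied by the caller.
    """
    readings = []
    for line in lines:
        m = TEMP_RE.search(line)
        if m:
            ts_text = " ".join(m["ts"].split())  # collapse repeated spaces
            ts = datetime.strptime(f"{year} {ts_text}", "%Y %b %d %H:%M:%S")
            readings.append((ts, float(m["temp"])))
    return readings


def find_drops(readings, min_drop=10.0):
    """Return consecutive reading pairs where the temperature fell by
    at least min_drop degrees, which may hint at a reboot in between."""
    drops = []
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        if c0 - c1 >= min_drop:
            drops.append((t0, c0, t1, c1))
    return drops


SAMPLE = [
    "May 26 19:42:24:A:System: Stack unit 1 Temperature 67.0 C degrees,",  # real
    "May 26 20:10:00:A:System: Stack unit 1 Temperature 80.0 C degrees,",  # made up
    "May 26 20:25:00:A:System: Stack unit 1 Temperature 60.0 C degrees,",  # made up
]

if __name__ == "__main__":
    for t0, c0, t1, c1 in find_drops(parse_temps(SAMPLE)):
        print(f"{c0} C at {t0} -> {c1} C at {t1}")
```

A sharp drop between two consecutive readings only suggests a reboot; it would still need to be checked against the start time reported by show version.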
So I wonder whether these issues are actually thermal, despite the temperature staying below the shutdown threshold of 105C. I should also mention that I recently discovered one of my ports with an SFP+ to 10GBase-T adapter had no LED activity and was not providing a network connection. When I replaced the adapter, the switch quickly went into a boot loop with all LEDs amber. I left the system powered off overnight, and the next day it powered on OK with the adapter in, apart from this more intermittent rebooting.
To test the thermals I tried blocking the fans. The temperature reached around 90C according to the logs, but the high fan mode simply kicked in, so the switch cooled down and did not reboot. So maybe under heavy load the switch overheats to the shutdown temperature so quickly that it has no time to send a remote syslog message first? (The highest I see in the logs in normal usage is around 80C, before it drops lower, e.g. 50C, after the next reboot.) But this seems unusual. Are there any other explanations or fixes people can think of?
I’m hoping someone can make sense of this or has seen something similar on their own switch, and can suggest how I can fix it. Any advice would be very much appreciated at this stage, as aside from this new issue I have been very happy with this device. Thanks for your help!
Here is some further debug output:
show version
Copyright (c) Ruckus Networks, Inc. All rights reserved.
UNIT 1: compiled on Aug 8 2023 at 23:06:54 labeled as SPR08095m
(33554432 bytes) from Primary SPR08095m.bin (UFI)
SW: Version 08.0.95mT213
Compressed Primary Boot Code size = 786944, Version:10.1.26T215 (spz10126)
Compiled on Tue Nov 29 23:13:15 2022
HW: Stackable ICX7250-48-HPOE
==========================================================================
UNIT 1: SL 1: ICX7250-48P POE 48-port Management Module
Serial #UK3845L1DZ
Software Package: ICX7250_L3_SOFT_PACKAGE (LID: fwmINJKnGfb)
Current License: l3-prem-8X10G
P-ASIC 0: type B344, rev 01 Chip BCM56344_A0
==========================================================================
UNIT 1: SL 2: ICX7250-SFP-Plus 8-port 80G Module
==========================================================================
1000 MHz ARM processor ARMv7 88 MHz bus
8 MB boot flash memory
2 GB code flash memory
2 GB DRAM
STACKID 1 system uptime is 3 hour(s) 44 minute(s) 17 second(s)
The system started at 19:38:19 CST Mon May 26 2025
The system : started=cold start
show chassis
The stack unit 1 chassis info:
Power supply 1 (AC - PoE) present, status ok
Power supply 2 not present
Power supply 3 not present
Fan 1 ok, speed (auto): [[1]]<->2
Fan 2 ok, speed (auto): [[1]]<->2
Fan 3 ok, speed (auto): [[1]]<->2
Fan controlled temperature:
Rule 1/2 (MGMT THERMAL PLANE): 91.3 deg-C
Rule 2/2 (AIR OUTLET NEAR PSU): 40.5 deg-C
Fan speed switching temperature thresholds:
Rule 1/2 (MGMT THERMAL PLANE):
Speed 1: NM<-----> 95 deg-C
Speed 2: 85<----->105 deg-C (shutdown)
Rule 2/2 (AIR OUTLET NEAR PSU):
Speed 1: NM<-----> 41 deg-C
Speed 2: 34<----->105 deg-C (shutdown)
Fan 1 Air Flow Direction: Front to Back
Fan 2 Air Flow Direction: Front to Back
Fan 3 Air Flow Direction: Front to Back
Slot 1 Current Temperature: 91.3 deg-C (Sensor 1), 40.5 deg-C (Sensor 2)
Slot 2 Current Temperature: NA
Warning level…: 85.0 deg-C
Shutdown level…: 105.0 deg-C
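For what it’s worth, here is how I read the MGMT THERMAL PLANE thresholds in the show chassis output, written as a small Python sketch. I’m assuming the fans step up from speed 1 to speed 2 at 95 deg-C, step back down at 85 deg-C, and that the switch shuts down at 105 deg-C. Whether each comparison is inclusive is a guess, and the function name is my own. This is only my interpretation of the table, not documented switch behaviour.

```python
# Thresholds copied from the "Rule 1/2 (MGMT THERMAL PLANE)" lines above.
SPEED_UP_AT = 95.0     # speed 1 -> speed 2 (assumed meaning of "NM<----->95")
SPEED_DOWN_AT = 85.0   # speed 2 -> speed 1 (assumed meaning of "85<----->105")
SHUTDOWN_AT = 105.0    # marked "(shutdown)" in the output


def next_state(speed, temp_c):
    """Return the new fan speed (1 or 2), or "shutdown".

    Models simple hysteresis: the step-up and step-down thresholds differ,
    so the fans do not flap when the temperature hovers near one value.
    """
    if temp_c >= SHUTDOWN_AT:
        return "shutdown"
    if speed == 1 and temp_c >= SPEED_UP_AT:
        return 2
    if speed == 2 and temp_c <= SPEED_DOWN_AT:
        return 1
    return speed


if __name__ == "__main__":
    speed = 1
    for temp in (80.0, 91.3, 96.0, 90.0, 84.0):
        speed = next_state(speed, temp)
        print(temp, speed)
```

Under these assumptions `next_state(1, 91.3)` stays at speed 1, which would match the fans reporting speed 1 at 91.3 deg-C in the output above.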