Rosewill is basically Newegg’s store brand. I don’t think Newegg has much of a presence outside the USA so it is unsurprising you wouldn’t know about Rosewill.
Damn, that’s weird. And lucky that it didn’t kill the BMC ROM.
I’ve sent a message to ASRock Rack support. We’ll see how it goes. But right now I am reconsidering how much I really need IPMI. I “upgraded” from an ASRock X370 Pro4 for this, and if the issues don’t get resolved within my return window, I might have to “upgrade” back to it.
Very nice
I’m on the 2T version with 2x Kingston KSM26ED8/16ME and a Ryzen 3700X, running Unraid without any issues.
Has anyone attempted to pass hardware through to their VMs? I had to turn IOMMU off because it was causing a lot of system instability.
OK, I just found out my syslog is being flooded with entries like the ones below. I’m not sure whether it is a hardware issue.
Aug 18 22:08:34 Tower kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: device [8086:1563] error status/mask=00001000/00002000
Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: [12] Timeout
Aug 18 22:08:34 Tower kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: device [8086:1563] error status/mask=00001000/00002000
Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: [12] Timeout
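If it helps anyone judge how bad a flood like this is, here is a minimal Python sketch that tallies the "PCIe Bus Error" lines per device. The regular expression is only fitted to the lines pasted above, and the function name `count_aer_errors` and the `SAMPLE` list are my own for illustration; other kernels or drivers may format these messages differently.

```python
import re
from collections import Counter

# Fitted to the AER summary lines pasted in this thread (an assumption, not a
# general syslog parser): "<driver> <domain:bus:dev.fn>: PCIe Bus Error: ..."
AER_LINE = re.compile(
    r"kernel: (?P<driver>\S+) "
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+)"
)


def count_aer_errors(lines):
    """Count AER bus-error lines per (driver, device, severity, layer)."""
    counts = Counter()
    for line in lines:
        match = AER_LINE.search(line)
        if match:
            key = (match["driver"], match["bdf"], match["severity"], match["layer"])
            counts[key] += 1
    return counts


# Four of the lines from the post above, used as sample input.
SAMPLE = [
    "Aug 18 22:08:34 Tower kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0",
    "Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)",
    "Aug 18 22:08:34 Tower kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0",
    "Aug 18 22:08:34 Tower kernel: ixgbe 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)",
]

if __name__ == "__main__":
    for (driver, bdf, severity, layer), n in count_aer_errors(SAMPLE).items():
        print(f"{n} x {severity} / {layer} on {driver} {bdf}")
```

Any iterable of lines works as input, so an open syslog file can be passed in place of `SAMPLE`. A count that keeps climbing on a single device would at least narrow things down to that card or its slot.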
If you are in Europe and can wait about a week for the actual product, you can send these guys an email (it is not listed in the shop yet); a customer got an X470D4U2-2T for EUR 308 there.
That’s quite a good value 
(Note: Not affiliated in any way)
The journey continues. I’m up to 5 unRAID crashes now. It usually crashes within an hour or two. I let Windows run for over 24 hours and it was fine. This is strikingly similar to what I experienced when this 1800X CPU was new. I had similar crashes running Ubuntu, and ended up settling on Windows for stability.
Reading things online, it seems that unRAID, and in fact *nix in general, has been fairly unstable on some Ryzen systems. I’ve read of some users achieving unRAID stability by disabling Global C-State Control in UEFI, and also by setting Power Supply Idle Control to “Typical Current Idle” (this is what I’m trying right now; uptime is 30 minutes so far). Others say that neither setting makes it stable. One guy said he had to RMA his affected CPUs, and the replacements were stable. Well, that isn’t going to happen in this case.
Regarding CPU_PROCHOT, I may have found a temporary workaround. I learned that a program called Ryzen Master can disable the PROCHOT flag. It comes with a warning that it is intended only for extreme overclocking attempts. Anyway, I clicked the button to disable PROCHOT and it said I couldn’t do that until I increased CPU speed past the base clock speed. Yay, more hoops to jump through. Using Ryzen Master, I told the CPU to run at 3700 MHz (100 MHz over base, still well below normal turbo speeds). This required a reboot. At some point during the reboot, CPU_PROCHOT got deasserted. Nonetheless, I went back into the Ryzen Master settings panel, pushed the button again to disable PROCHOT, and then used Ryzen Master to reset the CPU speed back to defaults (which again required a reboot). Well, CPU_PROCHOT has not gone back to State Asserted since doing that about 48 hours ago, including several reboots into both Windows and unRAID. I think the button to disable CPU_PROCHOT was supposed to have a temporary effect, but I have not yet fully power cycled the motherboard so the BMC hasn’t restarted, and I’m guessing that is why.
Are you using a 3rd gen CPU?
Nope, maybe in the future, but for now, as noted, it is an 1800X (1st gen).
I have a 2700 in my MB. Except for when I replaced my 4Gb FC HBA with an 8Gb model… mine has been running about a week and a half.
Have you seen the option Settings -> “Processor Hot” where you can disable the throttling of the CPU when it’s hot? Might this be the same option you were able to set using the Ryzen Master Tool?
Maybe. It did not immediately clear the CPU_PROCHOT state when I disabled the Processor Hot option in the IPMI settings, but I did disable it a few minutes before I did the Ryzen Master song and dance.
With Power Supply Idle Control set to Typical Current Idle, I have now had unRAID running for 8 hours 29 minutes, which is at least 4 times longer than it has gone without crashing before. So that setting may have done the trick! Hopefully I’ve seen the last of CPU_PROCHOT too, but I’m going to shut down now and physically unplug the power and plug it back in to reboot the BMC and I’ll see how things shake out from there. If PROCHOT comes back, I’ll need to fish up some alternate power supplies.
Edit: Nope. CPU_PROCHOT immediately asserted again before Windows even booted up.
ASRock support wanted me to try a higher wattage power supply, so I’ve swapped the Rosewill 500w PSU for an older OCZ 700w, and so far it is looking good.
For reference, this is my original build: https://pcpartpicker.com/list/gGR6vn
This Rosewill Valens 500 PSU caused my ASRock X470D4U to often show CPU_PROCHOT State Asserted and throttle the CPU to 550 MHz:

And this is the OCZ700MXSP which, so far, has not caused the same problem:

Since installing the OCZ700MXSP, I have booted the server into Windows several times and done PassMark benchmarks, with full power cycles between. Then I went into the UEFI settings and changed Power Supply Idle Control back to Auto, and I am now running unraid again to see how things work out. I’m up to 40 minutes uptime without a crash yet. If it survives the night, that will be a very good sign.
For anyone curious about the CPU performance change between running the memory at default (Auto) settings versus at its advertised speed, leaving the UEFI settings alone resulted in 1866 MHz 13/13/13/30 and a CPU Mark score of 15258. Manually assigning only the clock ratio to 1333 MHz in UEFI settings, leaving everything else on defaults, resulted in 2666 MHz 20/19/19/43 and a CPU Mark score of 16537. Here are screenshots of the results.
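For a quick sense of scale, here is a small Python sketch of the arithmetic behind those two results. The helper names are my own; the only inputs are the two CPU Mark scores and the 1333 MHz clock setting quoted above, plus the fact that DDR memory transfers data twice per clock cycle, which is why a 1333 MHz setting shows up as 2666.

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


def ddr_data_rate(clock_mhz):
    """DDR transfers on both clock edges, so the data rate is twice the clock."""
    return 2 * clock_mhz


AUTO_SCORE = 15258    # CPU Mark with memory left on Auto (1866)
MANUAL_SCORE = 16537  # CPU Mark with the memory clock set to 1333 MHz (2666)

if __name__ == "__main__":
    print(f"CPU Mark gain: {percent_gain(AUTO_SCORE, MANUAL_SCORE):.1f}%")  # 8.4%
    print(f"Data rate at a 1333 MHz clock: {ddr_data_rate(1333)}")          # 2666
```

So the faster memory setting is worth roughly 8% in this particular benchmark on this particular system.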
Yes, this is in line with the findings in this thread. However the wattage of the PSU has nothing to do with CPU_PROCHOT:
For me that setting doesn’t have any effect.
Sadly, still no word from ASRock Rack if they really know where these issues with certain PSUs are coming from.
Seasonic said they would contact ASRock Rack in Taiwan and cooperate with them, since a few of their PSUs are publicly mentioned here. But no news there either.
I was thinking that maybe the +3.3V and +5V current capabilities made a difference (because heck if I know how much of which voltages each component uses). But looking at the specs of the PSUs above, that is clearly false. The entire Seasonic FOCUS Plus Platinum series, which is reported good, is rated for 20A at those voltages and labeled as 100W total, exactly like my Rosewill PSU which is clearly not good for this motherboard. Meanwhile other PSUs with 25A / 125W at those voltages are bad while my older OCZ model has 25A / 150W on the sticker and it does appear to be good. It seems arbitrary. I honestly wonder if it is even consistent within particular PSU models or if it is down to random chance.
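To make that comparison concrete, here is a rough Python sketch that lists the four data points mentioned above and checks whether any single +3.3V/+5V rating shows up with both a good and a bad outcome. The class and function names are made up for illustration, and the "other PSUs" entry is just a stand-in for the unnamed 25A / 125W models reported as bad.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MinorRailReport:
    name: str
    amps: int            # rated current on each of +3.3V and +5V
    combined_watts: int  # combined +3.3V/+5V limit printed on the label
    works: bool          # True if reported good with this motherboard


REPORTS = [
    MinorRailReport("Seasonic FOCUS Plus Platinum series", 20, 100, True),
    MinorRailReport("Rosewill Valens 500", 20, 100, False),
    MinorRailReport("other PSUs (unnamed placeholder)", 25, 125, False),
    MinorRailReport("OCZ700MXSP", 25, 150, True),
]


def conflicting_ratings(reports):
    """Return the (amps, watts) ratings that appear with both outcomes."""
    outcomes = defaultdict(set)
    for report in reports:
        outcomes[(report.amps, report.combined_watts)].add(report.works)
    return {rating for rating, seen in outcomes.items() if len(seen) == 2}


if __name__ == "__main__":
    for amps, watts in sorted(conflicting_ratings(REPORTS)):
        print(f"{amps}A / {watts}W appears on both good and bad PSUs")
```

With the numbers from this thread, the 20A / 100W rating lands on both sides, so the label rating alone cannot be what separates the good units from the bad ones.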
Anyway, under the OCZ700MXSP power supply, with my earlier UEFI configuration tweak for idle current undone, Unraid ran all night and is currently at 11 hours 15 minutes uptime. There is no sign of CPU_PROCHOT in the logs since I swapped power supplies last night. So now, barring any stability issues that have yet to be revealed, my biggest remaining issue is the motherboard thinking CPU temperature is 20°C higher than it really is.
No chance to tell without proper test equipment… the poor sods at ASRock who have to figure this out…
Since I don’t know of any other manufacturer that offers such a unique motherboard, I just hope that there is no critical hardware design flaw that cannot be rectified by BIOS/firmware updates.
But since ASRock Rack seems to have outsourced QA to customers my compassion is somewhat limited…
What is that saying in the States that doesn’t mean anything, “Thoughts and prayers”? 
That’s intended by AMD for first-gen Ryzen CPUs to ensure adequate CPU cooling due to higher fan speeds even with shittier CPU heatsinks.