Scratching my head on this one. journalctl --list-boots and journalctl -b -1 help me to determine at which point the system took a nose dive.
Dec 21 05:58:40 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:8007:1::3
Dec 21 05:59:45 i9-7920X ntpd[964]: Soliciting pool server 2400:8901::f03c:91ff:fefb:7b7c
Dec 21 06:00:50 i9-7920X ntpd[964]: Soliciting pool server 2a04:3543:1000:2310:d862:f5ff:fe4e:6e9a
Dec 21 06:01:01 i9-7920X CROND[67761]: (root) CMD (run-parts /etc/cron.hourly)
Dec 21 06:01:01 i9-7920X run-parts[67764]: (/etc/cron.hourly) starting 0anacron
Dec 21 06:01:01 i9-7920X run-parts[67770]: (/etc/cron.hourly) finished 0anacron
Dec 21 06:01:57 i9-7920X ntpd[964]: Soliciting pool server 2001:418:3ff::1:53
Dec 21 06:03:02 i9-7920X ntpd[964]: Soliciting pool server 2a04:3543:1000:2310:d862:f5ff:fe4e:6e9a
Dec 21 06:04:09 i9-7920X ntpd[964]: Soliciting pool server 2001:a98:11::40
Dec 21 06:05:15 i9-7920X ntpd[964]: Soliciting pool server 2400:6180:0:d0::1494:e001
Dec 21 06:06:20 i9-7920X ntpd[964]: Soliciting pool server 2001:df1:801:a005:3::1
Dec 21 06:07:25 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:8007:1::30
Dec 21 06:08:30 i9-7920X ntpd[964]: Soliciting pool server 2001:3c8:e10e:399f::20
Dec 21 06:09:36 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:8007:1::3
Dec 21 06:10:44 i9-7920X ntpd[964]: Soliciting pool server 2405:aa00:2::10
Dec 21 06:11:48 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:8007:1::3
Dec 21 06:12:53 i9-7920X ntpd[964]: Soliciting pool server 2406:f000:3:e000::2
Dec 21 06:13:58 i9-7920X ntpd[964]: Soliciting pool server 2001:418:3ff::53
Dec 21 06:15:05 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:d800::1
Dec 21 06:16:12 i9-7920X ntpd[964]: Soliciting pool server 2402:f000:1:416:101:6:6:172
Dec 21 06:17:18 i9-7920X ntpd[964]: Soliciting pool server 2a02:2a50:6::123
Dec 21 06:18:24 i9-7920X ntpd[964]: Soliciting pool server 2001:418:3ff::1:53
Dec 21 06:19:31 i9-7920X ntpd[964]: Soliciting pool server 2a02:2a50:6::123
Dec 21 06:20:36 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:d800::1
Dec 21 06:21:41 i9-7920X ntpd[964]: Soliciting pool server 2a04:3543:1000:2310:d862:f5ff:fe4e:6e9a
Dec 21 06:22:45 i9-7920X ntpd[964]: Soliciting pool server 2001:a98:11::40
Dec 21 06:23:53 i9-7920X ntpd[964]: Soliciting pool server 2a04:3543:1000:2310:d862:f5ff:fe4e:6e9a
Dec 21 06:24:58 i9-7920X ntpd[964]: Soliciting pool server 2001:418:3ff::1:53
Dec 21 06:26:04 i9-7920X ntpd[964]: Soliciting pool server 2001:418:3ff::53
Dec 21 06:27:10 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:8007:1::30
Dec 21 06:28:16 i9-7920X ntpd[964]: Soliciting pool server 2001:da8:9000::81
As you can see, pretty boring. What else should/could I try?
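One thing a log tail like this can still tell you is whether the previous boot ended with any shutdown messages at all. Here is a minimal Python sketch of that check. The marker strings are my assumption about what systemd typically logs on an orderly shutdown (not an exhaustive list, and wording varies between versions), and both function names are made up for illustration.

```python
import subprocess

# ASSUMPTION: strings that tend to appear when systemd shuts down cleanly.
# Adjust these to whatever your distro actually logs.
CLEAN_SHUTDOWN_MARKERS = (
    "Journal stopped",
    "Reached target Shutdown",
    "Shutting down",
)


def looks_like_clean_shutdown(lines, tail=50):
    """Return True if any of the last `tail` lines contains a shutdown marker."""
    recent = lines[-tail:]
    return any(marker in line for line in recent for marker in CLEAN_SHUTDOWN_MARKERS)


def previous_boot_tail(count=50):
    """Fetch the last `count` journal lines of the previous boot via journalctl."""
    result = subprocess.run(
        ["journalctl", "-b", "-1", "-n", str(count), "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling `looks_like_clean_shutdown(previous_boot_tail())` on the affected box should return False for a log that simply stops mid-stream like the one above, which points toward a hard reset or power loss rather than anything the OS initiated.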
Is the memory running with XMP enabled? You may have to apply an XMP profile, because the timings without XMP are extremely conservative and loose, which could simply cause random shutdowns.
[root@i9-7920X ~]# dmidecode -t 17
# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.
Handle 0x004B, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0049
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 8192 MB
Form Factor: DIMM
Set: None
Locator: DIMM_A1
Bank Locator: NODE 1
Type: DDR4
Type Detail: Synchronous
Speed: 2666 MT/s
Manufacturer: CRUCIAL
Serial Number: A41A7021
Asset Tag:
Part Number: BLE8G4D26AFEA.16FAD
Rank: 2
Configured Clock Speed: 2666 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V
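In the output above, the rated and configured speeds both read 2666 MT/s for DIMM_A1. With several DIMMs it is easier to let a script compare them. A rough Python sketch, assuming the text comes from `dmidecode -t 17`; the function names are hypothetical, and the second field name is there because newer dmidecode releases label the field "Configured Memory Speed" instead of "Configured Clock Speed".

```python
def parse_memory_devices(text):
    """Split `dmidecode -t 17` output into one dict of fields per Memory Device."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            # A new DIMM record starts here.
            current = {}
            devices.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return devices


def speed_mismatches(devices):
    """Return (locator, rated, configured) for DIMMs whose two speeds differ."""
    mismatches = []
    for dev in devices:
        rated = dev.get("Speed", "")
        # dmidecode 3.1 prints "Configured Clock Speed"; newer releases
        # print "Configured Memory Speed", so try both.
        configured = dev.get("Configured Clock Speed") or dev.get(
            "Configured Memory Speed", ""
        )
        if rated and configured and rated != configured:
            mismatches.append((dev.get("Locator", "?"), rated, configured))
    return mismatches
```

An empty list from `speed_mismatches` only means the firmware reports matching speeds; it says nothing about whether the timings or voltage are stable.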
I noticed a couple of reboot loops taking place; it did about 3 reboots within the span of 2-3 minutes, and it has been running stable since the last reboot for an hour or so (but I've stopped running the ZCash miner since).
On an island, mate; sending it back is another $100 in DHL costs, but I may end up doing it. Moving it onto a different UPS to isolate any power issues.
Re. the pfSense box (7700K / TUF Z270): another new build. Was working fine, now totally dead. Won't even go past the 5 VDC standby voltage. Will most likely spend the weekend taking that apart. Tried switching the AX860i PSU for an AX760W PSU (new), nada.
I tried installing the second kit of Ballistix RAM (total 32GB). Things I noticed -
Couldn't get it to POST once properly
However, on one bootup, it ended up in Asus' "Safe Mode", which lets me into the UEFI - I was able to enable XMP, bump the DRAM channel voltage to 1.48 VDC, and it loaded up Fedora.
I had it run the miner for almost a day, but then we had a power cut and...
Since it wouldn't POST at all like normal, I yanked out the second kit and pushed on the original kit a bit to make sure (again!) that it was seated properly. Didn't notice any extra give or any further clicking into place.
This time around, I disabled XMP, set the DRAM channel voltage to 1.48 VDC, manually entered the timings 16-17-17, and left the rest on auto. It boots fine now, and has been running for a while - 04:50:26 up 1 day, 9:20, 3 users, load average: 0.26, 0.24, 0.20. I also disabled Asus' MultiCore Enhancement and left it on Intel's default-auto.
Previously, I had Asus' MultiCore Enhancement on "auto" (which means on) - this pushes core clocks to the max. It could be this boils down to power draw, or maybe overheating VRMs - seems unlikely though: high clocks, but at idle? Remember, the miner just offloads all load to the GPU.
I really shouldn't have cut corners on this build; I used a spare Noctua cooler I had, an NH-L12. This is most likely the cause of the random reboots.
Today I ran the Phoronix cryptography benchmarks and noticed reported CPU temps (in Fedora) were north of 100 °C!! The Corsair AX860i PSU is about adequate, although I'd need to do more testing to comment on that (remember, another 125 W is being pulled through the GTX1070).
Idle temps are around 40 °C per core.
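For anyone wanting to log these temps during a benchmark run, here is a small Python sketch that reads the kernel's hwmon sysfs files directly. It assumes the usual Linux layout under /sys/class/hwmon, where each `temp*_input` file holds millidegrees Celsius. The function names and the 90 °C threshold are illustrative choices of mine, not anything from the board or CPU spec.

```python
from pathlib import Path


def read_hwmon_temps(base="/sys/class/hwmon"):
    """Return {"chip/label": degrees_C} for every temp*_input found under `base`."""
    temps = {}
    for chip in sorted(Path(base).glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for sensor in sorted(chip.glob("temp*_input")):
            # temp2_input pairs with temp2_label when the driver provides one.
            label_file = chip / sensor.name.replace("_input", "_label")
            label = label_file.read_text().strip() if label_file.exists() else sensor.name
            try:
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors refuse reads or return junk
            temps[f"{chip_name}/{label}"] = millideg / 1000.0
    return temps


def over_threshold(temps, limit_c=90.0):
    """Pick out sensors at or above `limit_c` (90 is an arbitrary example value)."""
    return {name: value for name, value in temps.items() if value >= limit_c}
```

Running `over_threshold(read_hwmon_temps())` in a loop while the benchmark runs would show which cores cross the limit and when.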
So it makes sense that MCE (the OC) would push this over the edge. Got 4 days 13+ hrs of uptime since the last change (disabling MCE).
I'm also going to see if I can get the full 32GB of Ballistix RAM to work at some point.
@FurryJackman should I bump any of the Vcore settings, or just leave everything on auto for the time being (apart from MCE...)?
You need the Noctua U14 cooler badly, or a 280mm AIO. Seems that cooler is not enough even at stock. Remember, it's 180+W TDP with turbo boost engaged. If you go the AIO route, I recommend what I currently use, the Cryorig A80 with an included VRM fan. I got it before NCIX ceased to exist due to bankruptcy, so it may be hard to find now. You could get the Fractal Design Celsius S24 as an alternative, but it will have no VRM cooling.
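As a back-of-the-envelope check on the PSU question raised earlier in the thread, the quoted figures can be summed against the AX860i's 860 W rating. This Python sketch is only an estimate: the CPU and GPU numbers are the ones posted here, the 75 W for everything else is my own guess, and transient spikes are ignored.

```python
def psu_load_fraction(psu_watts, *component_watts):
    """Fraction of the PSU's rated output used by the summed component draws."""
    return sum(component_watts) / psu_watts


cpu_w = 180    # "180+W" with turbo engaged, per the post above (a lower bound)
gpu_w = 125    # GTX1070 figure quoted earlier in the thread
other_w = 75   # ASSUMPTION: board, RAM, drives, fans; not measured

load = psu_load_fraction(860, cpu_w, gpu_w, other_w)
print(f"Estimated load: {cpu_w + gpu_w + other_w} W, {load:.0%} of 860 W")
```

On those numbers the PSU sits at roughly 44% of its rating, so on paper it has headroom; that does not rule out a faulty unit or a sagging rail.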
I'd always stick to G.Skill memory and Samsung B-Die if you go down the 32GB route. Two 2x8GB kits should do for that. The non-RGB 3200 Trident Z kit with a model number ending in GTZB is a good bet. That's guaranteed Samsung B-Die.