Ryzen Pre-Week 25 fabrication RMA issue

I’m assuming that what you are saying is that you tested with an Ubuntu 16.04 guest OS in a virtual machine on a Windows host?

Exactly

Got another 1600X and will test it now. It is a 1723 though…

Crap, I have that error too. I don’t really want to wait long for the RMA; I hope the shop handles it itself rather than sending it on to AMD.

Then you can’t be sure of getting a post-week-25 Ryzen. Their RMA process seems very well done so far. They’ll cover the full DHL expense upfront and send you the replacement once you give them a tracking number.

I’m very pleased with their RMA department currently.

CPU # 3… and it’s week 28. :smiley:

waiting for new mobo. i took the opportunity to get a good one.

need-more-data


I am running the blender build right now and it just passed 50%. I might be overly optimistic but … well, I’ll hammer it a while more before I finish that sentence.

[update] Blender passed. Now running mprime to heat the chip a bit and start a new blender build.

Just realized I did not take a picture before installing the chip…
I know it is a week 23 1600X from Malaysia, microcode is 0x8001126.
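If you want to double-check the microcode revision without scrolling through one entry per thread, here is a minimal Python sketch that collapses `/proc/cpuinfo`-style text into distinct (model name, microcode) pairs. The function name `summarize_cpuinfo` and the sample text are my own illustration, not part of any tool mentioned in this thread.

```python
from collections import Counter


def summarize_cpuinfo(text):
    """Count logical CPUs per (model name, microcode) pair.

    Expects text in the "key : value" layout used by /proc/cpuinfo on Linux.
    """
    pairs = Counter()
    model = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key : value"
        key, value = key.strip().lower(), value.strip()
        if key == "model name":
            model = value
        elif key == "microcode" and model is not None:
            pairs[(model, value)] += 1
            model = None  # wait for the next logical CPU entry
    return pairs


# Illustrative sample only; on a real system, read /proc/cpuinfo instead.
SAMPLE = """\
model name : AMD Ryzen 5 1600X Six-Core Processor
microcode : 0x8001126
model name : AMD Ryzen 5 1600X Six-Core Processor
microcode : 0x8001126
"""

for (model_name, microcode), count in summarize_cpuinfo(SAMPLE).items():
    print(f"{count} x {model_name} (microcode {microcode})")
```

On Linux you would pass it the contents of `/proc/cpuinfo`; more than one distinct pair would be worth a second look.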

[update] Second blender build successful. Looking good. :smiley:

Third build done. This seems to be a winner. Yeehaw!

taichi. should be here by the end of the week (I wasn’t expecting the CPU to come so soon).

OK, I ran a lot of blender builds, with and without mprime -t running right next to it. After that I built mesa-git with all dependencies. Took half a day, completed without any problems. Is there anything worse to throw at it or is it a reasonable assumption that my chip is fine?

chip is probably fine.

Don’t be too sure. Blender always worked for me and I have a week 21 Malaysia one. I had to run kill-ryzen for it to trigger.

Good to know.

Couldn’t get that ryzen kill thing to run on Manjaro. So I’ll try this one, which seems to be written for stupid people like me. :smiley:

I think (U know nothing Jon Snow) I am OK as well. Long Blender renders, HandBrake encodes, and the kill-ryzen code for about 4 hours, then I mined Ethereum on the CPU for a day, and no problems.

I would have an aneurysm after my MB issues if the CPU was bad as well. I hope it gets sorted quickly for the people affected.

OK, I’ve got my replacement CPU. I’ve gone from a 1721PGT, Malaysia, to a 1730SUS, China. No bugs yet and crazy binning.

4.1 GHz @ 1.3375V prime95 and AVX stable.

Weird thing is, I’m experiencing opposite issues with offset/voltage/pstate compared to my old Ryzen. With the old one, manual voltage override would result in a stuck multiplier and offset + pstate worked. With the new one only manual voltage override works and offset/pstate doesn’t.

As for the whole RMA process: their response time by e-mail is about one working day. After I provided all the info and referenced this thread, I was given a DHL Express pre-paid account number, and they picked the CPU up the day after.
6 days later the replacement part arrived. Pretty good.
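For anyone comparing batch codes like 1721PGT and 1730SUS, here is a rough Python sketch that pulls the year and week out of the four digits. It assumes the digits are a two-digit year followed by a two-digit week, which is how people in this thread read them, and it takes the week 25 cutoff from the thread title. The names `parse_date_code` and `is_pre_cutoff` are made up for illustration. A pre-cutoff code only means the chip is worth testing, not that it is defective.

```python
import re

# Assumption: four digits = YYWW, optionally followed by three letters (e.g. PGT, SUS).
DATE_CODE_RE = re.compile(r"\b(\d{2})(\d{2})(?:[A-Z]{3})?\b")

CUTOFF_YEAR = 2017  # assumption: the cutoff refers to 2017 production
CUTOFF_WEEK = 25    # taken from the thread title


def parse_date_code(code):
    """Return (year, week) from a batch code such as 'UA 1709PGT'."""
    match = DATE_CODE_RE.search(code.upper())
    if match is None:
        raise ValueError(f"no YYWW date code found in {code!r}")
    year = 2000 + int(match.group(1))
    week = int(match.group(2))
    if not 1 <= week <= 53:
        raise ValueError(f"implausible week number {week} in {code!r}")
    return year, week


def is_pre_cutoff(code, cutoff_year=CUTOFF_YEAR, cutoff_week=CUTOFF_WEEK):
    """True if the chip was made before the assumed cutoff week."""
    return parse_date_code(code) < (cutoff_year, cutoff_week)


for code in ("1721PGT", "1730SUS"):
    year, week = parse_date_code(code)
    print(f"{code}: year {year}, week {week}, pre-cutoff: {is_pre_cutoff(code)}")
```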

I’ve been reading this thread for a while now. Today I took the cooler off the CPU and checked the serial, the part and the batch number: UA 1709PGT on a Ryzen 1700. SHIT!
I was curious, so I stopped my work, git cloned the ryzen-kill script, and started the test.
Seems like I have a bad one.

Extract GCC sources
Download prerequisites
2017-10-01 15:10:35 URL: ftp://gcc.gnu.org/pub/gcc/infrastructure/gmp-6.1.0.tar.bz2 [2383840] -> “./gmp-6.1.0.tar.bz2” [1]
2017-10-01 15:10:40 URL: ftp://gcc.gnu.org/pub/gcc/infrastructure/mpfr-3.1.4.tar.bz2 [1279284] -> “./mpfr-3.1.4.tar.bz2” [1]
2017-10-01 15:10:45 URL: ftp://gcc.gnu.org/pub/gcc/infrastructure/mpc-1.0.3.tar.gz [669925] -> “./mpc-1.0.3.tar.gz” [1]
2017-10-01 15:10:50 URL: ftp://gcc.gnu.org/pub/gcc/infrastructure/isl-0.16.1.tar.bz2 [1626446] -> “./isl-0.16.1.tar.bz2” [1]
gmp-6.1.0.tar.bz2: OK
mpfr-3.1.4.tar.bz2: OK
mpc-1.0.3.tar.gz: OK
isl-0.16.1.tar.bz2: OK
All prerequisites downloaded successfully.
cat /proc/cpuinfo | grep -i -E "(model name|microcode)"
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
model name : AMD Ryzen 7 1700 Eight-Core Processor
microcode : 0x8001129
sudo dmidecode -t memory | grep -i -E "(rank|speed|part)" | grep -v -i unknown
Speed: 2134 MHz
Part Number: F4-3200C16-16GTZSW
Rank: 2
Configured Clock Speed: 1067 MHz
Speed: 2134 MHz
Part Number: F4-3200C16-16GTZSW
Rank: 2
Configured Clock Speed: 1067 MHz
Speed: 2134 MHz
Part Number: F4-3200C16-16GTZSW
Rank: 2
Configured Clock Speed: 1067 MHz
Speed: 2134 MHz
Part Number: F4-3200C16-16GTZSW
Rank: 2
Configured Clock Speed: 1067 MHz
uname -a
Linux COMPUTERNAME 4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
cat /proc/sys/kernel/randomize_va_space
2
/ /mnt/ramdisk/workdir
/mnt/ramdisk/workdir
Using 16 parallel processes
Hint: You are currently not seeing messages from other users and the system.
Users in the ‘systemd-journal’ group can see all messages. Pass -q to
turn off this notice.
No journal files were opened due to insufficient permissions.
[loop-0] Sun Oct 1 15:10:51 CEST 2017 start 0
[loop-1] Sun Oct 1 15:10:52 CEST 2017 start 0
[loop-2] Sun Oct 1 15:10:53 CEST 2017 start 0
[loop-3] Sun Oct 1 15:10:54 CEST 2017 start 0
[loop-4] Sun Oct 1 15:10:55 CEST 2017 start 0
[loop-5] Sun Oct 1 15:10:56 CEST 2017 start 0
[loop-6] Sun Oct 1 15:10:57 CEST 2017 start 0
[loop-7] Sun Oct 1 15:10:58 CEST 2017 start 0
[loop-8] Sun Oct 1 15:10:59 CEST 2017 start 0
[loop-9] Sun Oct 1 15:11:00 CEST 2017 start 0
[loop-10] Sun Oct 1 15:11:01 CEST 2017 start 0
[loop-11] Sun Oct 1 15:11:02 CEST 2017 start 0
[loop-12] Sun Oct 1 15:11:03 CEST 2017 start 0
[loop-13] Sun Oct 1 15:11:04 CEST 2017 start 0
[loop-14] Sun Oct 1 15:11:05 CEST 2017 start 0
[loop-15] Sun Oct 1 15:11:06 CEST 2017 start 0
> [loop-9] Sun Oct 1 15:12:31 CEST 2017 build failed
> [loop-9] TIME TO FAIL: 100 s
> [loop-10] Sun Oct 1 15:12:52 CEST 2017 build failed
> [loop-10] TIME TO FAIL: 121 s
> [loop-1] Sun Oct 1 15:49:45 CEST 2017 build failed
> [loop-1] TIME TO FAIL: 2334 s
> [loop-2] Sun Oct 1 18:05:58 CEST 2017 build failed
> [loop-2] TIME TO FAIL: 10507 s
> [loop-3] Sun Oct 1 18:08:18 CEST 2017 build failed
> [loop-3] TIME TO FAIL: 10647 s
> [loop-5] Sun Oct 1 18:38:33 CEST 2017 build failed
> [loop-5] TIME TO FAIL: 12462 s
[loop-6] Sun Oct 1 18:46:49 CEST 2017 start 1
[loop-4] Sun Oct 1 18:46:50 CEST 2017 start 1
[loop-7] Sun Oct 1 18:47:04 CEST 2017 start 1
[loop-13] Sun Oct 1 18:47:16 CEST 2017 start 1
[loop-0] Sun Oct 1 18:47:21 CEST 2017 start 1
[loop-15] Sun Oct 1 18:47:24 CEST 2017 start 1
[loop-12] Sun Oct 1 18:47:26 CEST 2017 start 1
[loop-8] Sun Oct 1 18:47:36 CEST 2017 start 1
[loop-14] Sun Oct 1 18:47:39 CEST 2017 start 1
> [loop-7] Sun Oct 1 18:48:04 CEST 2017 build failed
> [loop-7] TIME TO FAIL: 13033 s
[loop-11] Sun Oct 1 18:48:14 CEST 2017 start 1
> [loop-12] Sun Oct 1 19:21:48 CEST 2017 build failed
> [loop-12] TIME TO FAIL: 15057 s
> [loop-13] Sun Oct 1 19:21:48 CEST 2017 build failed
> [loop-13] TIME TO FAIL: 15057 s
> [loop-11] Sun Oct 1 21:35:50 CEST 2017 build failed
> [loop-11] TIME TO FAIL: 23099 s
[loop-6] Sun Oct 1 21:39:22 CEST 2017 start 2
[loop-4] Sun Oct 1 21:39:30 CEST 2017 start 2
[loop-15] Sun Oct 1 21:39:34 CEST 2017 start 2
[loop-8] Sun Oct 1 21:39:34 CEST 2017 start 2
[loop-14] Sun Oct 1 21:39:46 CEST 2017 start 2
[loop-0] Sun Oct 1 21:39:53 CEST 2017 start 2
> [loop-8] Sun Oct 1 23:59:22 CEST 2017 build failed
> [loop-8] TIME TO FAIL: 31711 s
[loop-6] Mon Oct 2 00:32:24 CEST 2017 start 3
[loop-15] Mon Oct 2 00:32:39 CEST 2017 start 3
[loop-4] Mon Oct 2 00:32:48 CEST 2017 start 3
[loop-0] Mon Oct 2 00:32:48 CEST 2017 start 3
[loop-14] Mon Oct 2 00:33:02 CEST 2017 start 3
> [loop-15] Mon Oct 2 00:34:17 CEST 2017 build failed
> [loop-15] TIME TO FAIL: 33806 s

Test was run under Linux Mint 18.2 KDE, Kernel 4.11.0-14, no OC, all settings in EFI at default, with latest EFI downloaded as of today.
RAM 64GB (4x16) at default 2133 MHz; Mobo: x370 Taichi
Number of parallel processes: 16; test duration: about 9h 23min (the last failure is logged at 33806 s); number of failures: 12
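To tally a log like the one above without counting by hand, here is a small Python sketch that extracts the `TIME TO FAIL` lines. The regular expression is based only on the line format visible in this log, so treat it as an assumption about the script's output rather than a documented format. The helper names are hypothetical.

```python
import re

# Matches lines such as "> [loop-9] TIME TO FAIL: 100 s" (format as seen in the log above).
FAIL_RE = re.compile(r"\[(loop-\d+)\]\s+TIME TO FAIL:\s+(\d+)\s*s")


def parse_failures(log_text):
    """Return a list of (loop name, seconds until failure) in log order."""
    return [(m.group(1), int(m.group(2))) for m in FAIL_RE.finditer(log_text)]


def format_duration(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}min {secs:02d}s"


# A few lines copied from the log above; a real run would read the whole file.
SAMPLE_LOG = """\
> [loop-9] Sun Oct 1 15:12:31 CEST 2017 build failed
> [loop-9] TIME TO FAIL: 100 s
> [loop-10] Sun Oct 1 15:12:52 CEST 2017 build failed
> [loop-10] TIME TO FAIL: 121 s
> [loop-15] Mon Oct 2 00:34:17 CEST 2017 build failed
> [loop-15] TIME TO FAIL: 33806 s
"""

failures = parse_failures(SAMPLE_LOG)
print(f"{len(failures)} failures")
for loop, seconds in failures:
    print(f"{loop}: failed after {format_duration(seconds)}")
```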

I also have another Ryzen chip in my server waiting to be tested, R3 1200 with batch number UA 1724PGT. Tomorrow will be a long day :tired_face:

ouch. that fucking sucks

Do you have 32GB of RAM?

You might be running out of memory. If the ramdisk says 64 capacity / 64 utilized then the test will fail once it reaches that mem cap.

df -h
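If you would rather poll this from a script while the test runs, the same check can be sketched in Python with `shutil.disk_usage`. The 95% threshold and the function names are arbitrary choices of mine, and `/mnt/ramdisk` in the note below is simply the path that shows up in the log posted earlier; adjust it to wherever your ramdisk is mounted.

```python
import shutil


def usage_report(path):
    """Return (total bytes, used bytes, percent used) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return usage.total, usage.used, percent


def is_nearly_full(percent_used, threshold=95.0):
    """True once usage reaches the (arbitrary) threshold percentage."""
    return percent_used >= threshold


def describe(path):
    """One-line summary comparable to the matching row of `df -h`."""
    total, used, percent = usage_report(path)
    gib = 1024 ** 3
    flag = " (nearly full)" if is_nearly_full(percent) else ""
    return f"{path}: {used / gib:.1f} GiB of {total / gib:.1f} GiB used, {percent:.0f}%{flag}"
```

Calling `describe("/mnt/ramdisk")` every few minutes during a run should show whether usage creeps toward the cap before the builds start failing.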

I point it out because I just so happen to have a 1700 with 32GB of RAM. I received a replacement UA 1733SUS, and with the script at default settings it also runs out of memory at ~15,000 seconds, or 4-5 hrs, every time, depending on OC settings.

There are some options on the ryzen-test GitHub page that lower the memory usage, so you can test for longer periods with fewer GCC compile instances. In the majority of cases the GCC test triggers the concurrency bug within a few minutes.

Low memory should not be an issue. I have 64GB.
I also got the first failures less than two minutes into the test, and random failures kept coming after that.

It is defective then. :frowning:

If the crashes happen all at once with memory full, then it is memory, but if they are spread out with the first few very soon, then there is instability in the system.
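That rule of thumb can be written down as a rough Python heuristic. The two thresholds (first failure within 300 s counts as "very soon", everything within 120 s counts as "all at once") are numbers I picked for illustration, not anything from the test script, and the result is only a hint, not a diagnosis.

```python
def classify_failures(fail_times, early_s=300, cluster_window_s=120):
    """Classify a list of time-to-fail values (seconds since test start).

    Returns one of: "none", "memory", "instability", "inconclusive".
    Thresholds are illustrative assumptions, not calibrated values.
    """
    times = sorted(fail_times)
    if not times:
        return "none"
    first = times[0]
    spread = times[-1] - first
    clustered = spread <= cluster_window_s  # failures happened all at once
    early = first <= early_s                # first failure came very soon
    if clustered and not early:
        return "memory"        # consistent with the ramdisk or RAM filling up
    if early and not clustered:
        return "instability"   # spread out, with the first failures very soon
    return "inconclusive"


# Time-to-fail values from the log posted above.
posted_log = [100, 121, 2334, 10507, 10647, 12462, 13033,
              15057, 15057, 23099, 31711, 33806]
print(classify_failures(posted_log))
```

With the times from the log posted above it lands on "instability", which matches the reading here. A run where every loop dies within a couple of minutes of each other hours in would come back as "memory".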