[FIX] Linux RNG performance sucks on Ryzen/TR4

I noticed something weird with my Threadripper 1950X and Ryzen 1700 systems.

Both have terrible /dev/urandom performance for some reason.
You can benchmark it yourself via ‘pv /dev/urandom > /dev/null’

Intel can do well over 170 MB/s even on my 5200U laptop, and my OnePlus 3 can do 120 MB/s,
while both of my AMD systems top out at 75 MB/s.

The RNG is used everywhere, so its performance matters quite a bit.

When I googled ‘poor RNG performance ryzen’, I found this patch:
https://patchwork.kernel.org/patch/9856707

The problem seems to be that AMD CPUs can’t execute the 64-bit (long) form of RDRAND
(which afaik is an instruction that Intel suggested/implemented)
fast enough.

The patch suggests using arch_get_random_int() instead of arch_get_random_long().
But we can avoid patching the kernel by simply avoiding RDRAND altogether.

Imo, the Linux kernel has a poor implementation of gathering entropy from RDRAND.
Both my Intel and AMD systems can now do 330+ MB/s after appending “nordrand” to the kernel cmdline.

RDRAND should be an additional entropy source, not something that slows the RNG down.
The Linux kernel should have an asynchronous design for reading RDRAND, imo.

But until then, I suggest everyone (maybe even Intel users) add “nordrand” to the kernel cmdline.
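
Before rebooting, it can help to check whether the CPU advertises RDRAND at all and whether “nordrand” is already on the running kernel's command line. A minimal sketch, assuming a Linux system with /proc mounted; the helper names has_flag and has_cmdline_param are made up for this example:

```shell
# has_flag FLAG [FILE]: succeeds if FLAG appears in the first "flags" line
# of FILE (defaults to /proc/cpuinfo, which is x86-specific in this form).
has_flag() {
    grep '^flags' "${2:-/proc/cpuinfo}" | head -n 1 | grep -qw -- "$1"
}

# has_cmdline_param PARAM [FILE]: succeeds if PARAM is a whole word on the
# kernel command line (FILE defaults to /proc/cmdline).
has_cmdline_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qx -- "$1"
}

if has_flag rdrand; then
    echo "CPU advertises RDRAND"
else
    echo "no rdrand flag found"
fi

if has_cmdline_param nordrand; then
    echo "nordrand is active on this boot"
else
    echo "nordrand is not set"
fi
```

To make the option stick on GRUB-based distros, the usual route is to add nordrand to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and regenerate the GRUB config (update-grub on Debian/Ubuntu, grub-mkconfig or grub2-mkconfig elsewhere). Other bootloaders differ, so check your distro's documentation.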

in case you’re too lazy to get pv

$ time head -c 1G /dev/urandom > /dev/null

real    0m6.515s
user    0m0.029s
sys     0m6.483s

^^ 2.5GHz Haswell Xeon E5 – this works out to roughly 157 MiB/s


How did you notice?

Funny thing, I was just reading the Wikipedia article about it:

On an Intel Core i7-7700K, 4500 MHz (45 x 100MHz) processor (Kaby Lake-S microarchitecture), a single RDRAND or RDSEED instruction takes 110ns or 463 clock cycles, regardless of the operand size (16/32/64 bits). This number of clock cycles applies to all processors with Skylake or Kaby Lake microarchitecture. On the Silvermont microarchitecture processors, each of the instructions take around 1472 clock cycles, regardless of the operand size; and on Ivy Bridge processors it takes up to 117 clock cycles[18].

On an AMD Ryzen CPU, each of the instructions takes around 1200 clock cycles for 16-bit or 32-bit operand, and around 2500 clock cycles for a 64-bit operand.
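
For a feel of what those cycle counts mean, here is a back-of-envelope conversion to a raw per-core RDRAND rate if the instruction were issued back to back. This is only a sketch: the 3.4 GHz clock for Ryzen is an assumption (not from the quote), rdrand_mib_per_s is a made-up helper name, and the result is not a prediction of /dev/urandom speed, since the kernel does not take all of its output from RDRAND.

```shell
# rdrand_mib_per_s CYCLES BYTES GHZ
# CYCLES: clock cycles per instruction, BYTES: bytes returned per instruction,
# GHZ: assumed core clock. Prints MiB/s with one decimal.
rdrand_mib_per_s() {
    awk -v cycles="$1" -v bytes="$2" -v ghz="$3" \
        'BEGIN { printf "%.1f\n", bytes * ghz * 1e9 / cycles / (1024 * 1024) }'
}

rdrand_mib_per_s 463  8 4.5   # i7-7700K, 64-bit operand, 4.5 GHz as quoted
rdrand_mib_per_s 2500 8 3.4   # Ryzen, 64-bit operand, 3.4 GHz assumed
rdrand_mib_per_s 1200 4 3.4   # Ryzen, 32-bit operand, 3.4 GHz assumed
```

With these inputs the 64-bit form on Ryzen yields no more bytes per second than the 32-bit form while each call takes about twice as long, which fits the reasoning behind the patch linked in the first post.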

And it gets even cooler when you read the part about Linus and Linux:

Linus Torvalds dismissed concerns about the use of RdRand in the Linux kernel, and pointed out that it is not used as the only source of entropy for /dev/random, but rather used to improve the entropy by combining the values received from RdRand with other sources of randomness.[25][26] However, Taylor Hornby of Defuse Security demonstrated that the Linux random number generator could become insecure if a backdoor is introduced into the RdRand instruction that specifically targets the code using it. Hornby’s proof-of-concept implementation works on an unmodified Linux kernel prior to version 3.13

Damn that thing seems like it should be disabled :smiley:

[mdesilva@i9-7920X ~]$ time head -c 1G /dev/urandom > /dev/null

real    0m4.580s
user    0m0.017s
sys     0m4.558s

# 1950X... ouch
[mdesilva@ballinripper ~]$ time head -c 1G /dev/urandom > /dev/null

real    0m14.794s
user    0m0.026s
sys     0m14.736s

  • i9 7920X
    ~222 MB/s

  • 1950X Threadripper
    ~69.2 MB/s
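
If you want to convert the timings yourself: head -c 1G writes 1024 MiB, so throughput is just 1024 divided by the "real" time. A small sketch using awk for the division; mib_per_s is a made-up helper name:

```shell
# mib_per_s MIB SECONDS: print throughput in MiB/s with one decimal.
mib_per_s() {
    awk -v mib="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", mib / secs }'
}

# The seconds are the "real" times posted in this thread.
mib_per_s 1024 6.515    # 2.5GHz Haswell Xeon E5
mib_per_s 1024 4.580    # i9 7920X
mib_per_s 1024 14.794   # 1950X Threadripper
```

That comes out at roughly 157, 224 and 69 MiB/s, close to the figures quoted in the thread.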

:exploding_head:

Really?

My i5-7500U (dual core) laptop is between 150 MB/s and 180 MB/s with a single SATA SSD.

Are you sure it’s not a chipset or storage controller issue?

Isn’t /dev/null still on the root partition?

Try making the comparison with the same SSD in both computers. If you can, compare an m.2 and single SATA or even stripe-RAID SATA to see the difference.

The last time I rebuilt one of the computers in my family, I went with a cheap unlocked dual core Pentium and just put a pair of cheap SSDs in a stripe RAID because power consumption was the main concern.

It’s still running perfectly fine at 4.2 GHz with no complaints and is one of the most power-efficient computers I support.

Devices under /dev have nothing to do with the root device.
/dev/null, zero, random, urandom and many more are virtual devices implemented by the kernel.

Ok, so I guess I just don’t understand where the data is being written or the path it is traveling then? :man_shrugging:
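
The path is short: these nodes are character devices, so reads and writes go straight to kernel code rather than to a filesystem on disk. In the benchmark, the kernel generates the bytes in memory for /dev/urandom and throws away whatever is written to /dev/null, so no SSD or storage controller is involved. A small sketch to see this for yourself, assuming a Linux system; is_chardev is just a helper name for this example:

```shell
# is_chardev PATH: succeeds if PATH is a character device node.
is_chardev() {
    [ -c "$1" ]
}

for dev in /dev/null /dev/zero /dev/random /dev/urandom; do
    if is_chardev "$dev"; then
        echo "$dev: character device, handled by a kernel driver"
    else
        echo "$dev: not a character device on this system"
    fi
done
```

Running ls -l on the same paths shows a leading "c" in the mode column and a major, minor device number pair where a regular file would show its size.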

But is the kernel using RDSEED properly right now?

https://www.phoronix.com/scan.php?page=news_item&px=AMD-Zen-CPU-Znver1

RDSEED Search in kernel source:

https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed

http://www.felixcloutier.com/x86/RDSEED.html

Some more info I just found about how the Zen random number generator works

It’s rather interesting to read how it’s put together, from the noise source built on 16 ring oscillators through the entropy conditioning and on downwards.
