Official Press Deck Slides
COMSOL
FGMRES Compute
Pardiso Compute
CFD Only 10GB
EM Only 260GB
Phoronix Test Suite
AMD EPYC Turin Benchmarks.rar (896.4 KB)
Test Systems
Summary
Timed Compilations
Timed Linux Kernel Compilation 6.8, Build: defconfig
Timed Linux Kernel Compilation 6.8, Build: allmodconfig
Timed FFmpeg Compilation 7.0, Time To Compile
Timed Godot Game Engine Compilation 4.0, Time To Compile
Timed Node.js Compilation 21.7.2, Time To Compile
Timed Gem5 Compilation 23.0.1, Time To Compile
Timed LLVM Compilation 16.0, Build System: Ninja
OpenSSL
OpenSSL 3.3, Algorithm: RSA4096
OpenSSL 3.3, Algorithm: RSA4096 (2)
OpenSSL 3.3, Algorithm: SHA256
OpenSSL 3.3, Algorithm: SHA512
OpenSSL 3.3, Algorithm: AES-128-GCM
OpenSSL, Algorithm: AES-128-GCM
OpenSSL 3.3, Algorithm: AES-256-GCM
OpenSSL, Algorithm: AES-256-GCM
OpenSSL 3.3, Algorithm: ChaCha20
OpenSSL, Algorithm: ChaCha20
OpenSSL 3.3, Algorithm: ChaCha20-Poly1305
OpenSSL, Algorithm: ChaCha20-Poly1305
John The Ripper
John The Ripper 2023.03.14, Test: Blowfish
John The Ripper 2023.03.14, Test: bcrypt
John The Ripper 2023.03.14, Test: WPA PSK
RocksDB 9.0, Test: Random Read
Speedb
Speedb 2.7, Test: Random Read
Speedb 2.7, Test: Read While Writing
Speedb 2.7, Test: Random Fill & Variant: Monero - Hash Count: 1M
SecureMark 1.0.4, Benchmark: SecureMark-TLS
Coremark 1.0, CoreMark Size 666 - Iterations Per Second
Google SynthMark 20201109, Test: VoiceMark_100
Algebraic Multi-Grid Benchmark 1.2
WRF 4.2.2, Input: conus 2.5km
ACES DGEMM 1.0, Sustained Floating-Point Rate
RELION 4.0.1, Test: Basic - Device: CPU
LULESH 2.0.3
miniBUDE
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2
LAMMPS
LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: Rhodopsin Protein
LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: 20k Atoms
m-queens 1.2, Time To Solve
miniFE 2.2, Problem Size: Small
ASKAP
ASKAP 1.0, Test: tConvolve MPI - Degridding
ASKAP 1.0, Test: tConvolve MPI - Gridding
NAMD
NAMD 3.0b6, Input: ATPase with 327,506 Atoms
NAMD 3.0b6, Input: STMV with 1,066,628 Atoms
GROMACS 2024, Implementation: MPI CPU - Input: water_GMX50_bare
QuantLib
QuantLib 1.32, Configuration: Single-Threaded
QuantLib 1.32, Configuration: Multi-Threaded
QMCPACK 3.17.1, Input: Li2_STO_ae
GPAW 23.6, Input: Carbon Nanotube
High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60
Pennant
Pennant 1.0.1, Test: leblancbig
Pennant 1.0.1, Test: sedovbig
NAS Parallel
NAS Parallel Benchmarks 3.4, Test / Class: EP.D
NAS Parallel Benchmarks 3.4, Test / Class: LU.C
NAS Parallel Benchmarks 3.4, Test / Class: SP.C
NAS Parallel Benchmarks 3.4, Test / Class: IS.D
NAS Parallel Benchmarks 3.4, Test / Class: MG.C
NAS Parallel Benchmarks 3.4, Test / Class: CG.C
NWChem 7.0.2, Input: C240 Buckyball
Xcompact3d
Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction
Xcompact3d Incompact3d 2021-03-11, Input: X3D-benchmarking input.i3d
BRL-CAD 7.38.2, VGR Performance Metric
OpenFOAM
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Mesh Time
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Execution Time
OpenFOAM 10, Input: drivaerFastback, Medium Mesh Size - Execution Time
OpenFOAM 10 & Speedb 2.7, Test: Sequential Fill
OpenRadioss
OpenRadioss 2023.09.15, Model: INIVOL and Fluid Structure Interaction Drop Container
OpenRadioss 2023.09.15, Model: Chrysler Neon 1M
Blender
Blender 4.1, Blend File: BMW27 - Compute: CPU-Only
Blender 4.1, Blend File: Classroom - Compute: CPU-Only
Blender 4.1, Blend File: Fishy Cat - Compute: CPU-Only
Blender 4.1, Blend File: Pabellon Barcelona - Compute: CPU-Only
Blender 4.1, Blend File: Barbershop - Compute: CPU-Only
Blender 4.1, Blend File: Junkshop - Compute: CPU-Only
LuxCoreRender
LuxCoreRender 2.6, Scene: DLSC - Acceleration: CPU
LuxCoreRender 2.6, Scene: LuxCore Benchmark - Acceleration: CPU
LuxCoreRender 2.6, Scene: Orange Juice - Acceleration: CPU
OSPRay
OSPRay 3.1, Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time
OSPRay 3.1, Benchmark: particle_volume/ao/real_time
OSPRay 3.1, Benchmark: particle_volume/scivis/real_time
OSPRay Studio 1.0, Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
OSPRay Studio 1.0, Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
OSPRay Studio 1.0, Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU
OSPRay Studio 1.0, Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
OSPRay Studio 1.0, Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
OSPRay Studio 1.0, Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU
Embree
Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon
Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon Obj
Embree 4.3, Binary: Pathtracer ISPC - Model: Crown
Intel Open Image Denoise
Intel Open Image Denoise 2.2, Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only
Intel Open Image Denoise 2.2, Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only
Intel Open Image Denoise 2.2, Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only
OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU ISPC
7-Zip Compression
7-Zip Compression 22.01, Test: Compression Rating
7-Zip Compression 22.01, Test: Decompression Rating
Parallel BZIP2 Compression 1.1.13, FreeBSD-13.0-RELEASE-amd64-memstick.img Compression
PyBench 2018-02-16, Total For Average Test Times
Numpy Benchmark
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 4K
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Highest Compression
libavif avifenc
libavif avifenc 1.0, Encoder Speed: 0
libavif avifenc 1.0, Encoder Speed: 2
libavif avifenc 1.0, Encoder Speed: 6, Lossless
libavif avifenc 1.0, Encoder Speed: 10, Lossless
ASTC Encoder
ASTC Encoder 4.7, Preset: Thorough
ASTC Encoder 4.7, Preset: Very Thorough
ASTC Encoder 4.7, Preset: Exhaustive
GraphicsMagick 1.3.43, Operation: Noise-Gaussian
Liquid-DSP
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 32
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 57
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 512
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 32
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 32
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 512
Liquid-DSP 1.6, Threads: 256 - Buffer Length: 256 - Filter Length: 57
Liquid-DSP 1.6, Threads: 256 - Buffer Length: 256 - Filter Length: 512
srsRAN Project
srsRAN Project 23.10.1-20240325, Test: PUSCH Processor Benchmark, Throughput Total
srsRAN Project 23.10.1-20240325, Test: PUSCH Processor Benchmark, Throughput Thread
srsRAN Project 23.10.1-20240325, Test: PDSCH Processor Benchmark, Throughput Total
srsRAN Project 23.10.1-20240325, Test: PDSCH Processor Benchmark, Throughput Thread
TensorFlow 2.16.1, Device: CPU - Batch Size: 512 - Model: ResNet-50
OpenVINO
OpenVINO 2024.0, Model: Face Detection FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Face Detection FP16-INT8 - Device: CPU (2)
OpenVINO 2024.0, Model: Person Detection FP16 - Device: CPU
OpenVINO 2024.0, Model: Person Detection FP16 - Device: CPU
OpenVINO 2024.0, Model: Weld Porosity Detection FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Weld Porosity Detection FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Vehicle Detection FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Vehicle Detection FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Person Vehicle Bike Detection FP16 - Device: CPU
OpenVINO 2024.0, Model: Person Vehicle Bike Detection FP16 - Device: CPU
OpenVINO 2024.0, Model: Machine Translation EN To DE FP16 - Device: CPU
OpenVINO 2024.0, Model: Machine Translation EN To DE FP16 - Device: CPU
OpenVINO 2024.0, Model: Face Detection Retail FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Face Detection Retail FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Handwritten English Recognition FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Handwritten English Recognition FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16-INT8 - Device: CPU
OpenVINO 2024.0, Model: Person Re-Identification Retail FP16 - Device: CPU
OpenVINO 2024.0, Model: Person Re-Identification Retail FP16 - Device: CPU
ONNX Runtime
ONNX Runtime 1.17, Model: GPT-2 - Device: CPU - Executor: Standard
ONNX Runtime 1.17, Model: GPT-2 - Device: CPU - Executor: Standard
ONNX Runtime 1.17, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
ONNX Runtime 1.17, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
ONNX Runtime 1.17, Model: yolov4 - Device: CPU - Executor: Standard
Xmrig 6.21, Variant: GhostRider - Hash Count: 1M
Helsing 1.0-beta, Digit Range: 14 digit
Stockfish 16.1, Chess Benchmark
Primesieve
Primesieve 12.1, Length: 1e12
Primesieve 12.1, Length: 1e13
Y-Cruncher
Y-Cruncher 0.8.3, Pi Digits To Calculate: 500M
Y-Cruncher 0.8.3, Pi Digits To Calculate: 1B
Y-Cruncher 0.8.3, Pi Digits To Calculate: 10B & OpenVINO, Model: Face Detection FP16
8 Likes
Let me see how much money I have saved up for this…
2 Likes
That is a lot of graphs…if you can measure scrolling down all the graphs in seconds, this test is probably the record so far.
I guess AMD didn’t like all the Xeon 6 attention lately. I’ll check the results later…how long does the entire suite run?
Does L1T have a standard run with selected benchmarks, or does it run everything that’s out there?
Maybe we get an upgrade to Siena and Bergamo as we did with the Genoa generation…Zen5 Siena certainly is more within John Travolta’s budget. And AMD plans to keep SP6 for more than a generation.
And the 9575F is only $15k. You saved some money on that SSD lately, so you’re getting closer.
edit: Amber streamlined the graphs and shortened the scroll time by 80%. She must be using one of these new Turin chips!
2 Likes
Yes, but my personal EPYC Genoa build just shifted price by $3k with the CPU alone. Faster RAM is pricier, but necessary…
@JayVenturi How will you reinforce the motherboard for the air coolers you’ll inevitably place on these 500 watt TDP monsters?
1 Like
@JayVenturi said he must use a custom water loop for his next EPYC upgrade. I want the 192-core EPYC, but I can’t justify the $15,000 price tag, especially since I want to purchase two, which are $30,000 just for CPUs.
Looking at the Xeon 6 with 500W, it seems like a good old passive cooling block + enough 11k RPM fans will do the job just as before.
Interesting note…Turin uses Zen5c cores. So that’s more of an updated Bergamo than an updated Genoa. Seems like AMD can’t scale “standard” cores any higher. Zen4c and Zen5c becoming the standard soon?
Or they had a bad experience with Bergamo+Genoa marketing and just merged everything under Turin.
So far only 5 of the Turin SKUs use Zen 5c cores, the majority are normal Zen 5 cores.
Shadowbane:
I want the 192-core EPYC, but I can’t justify the $15,000 price tag, especially since I want to purchase two, which are $30,000 just for CPUs.
Yes, BUTTTT more cores better
It would be fiscally irresponsible to forego potential performance
And I am goin single socket. When I can now single socket 192 cores vs 128…
well that’s a 50% bump, which is 200% more than the 96 core I had in the cart!
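As a quick sanity check on the percentages above, here is a minimal sketch. The helper name `percent_increase` is made up for illustration. Going from 128 to 192 cores is indeed a 50% bump, while going from 96 to 192 doubles the count, which is a 100% increase (or 200% of the original).

```python
def percent_increase(old: int, new: int) -> float:
    """Return how much larger `new` is than `old`, as a percentage of `old`."""
    return (new - old) / old * 100.0


# Core counts mentioned in the thread (single-socket options).
for old_cores in (128, 96):
    bump = percent_increase(old_cores, 192)
    print(f"{old_cores} -> 192 cores: +{bump:.0f}%")
```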
1 Like
Exard3k
October 10, 2024, 8:34pm
11
Imagine the bump when using a 2P system. Imagine all the threads in top were actual cores and not SMT cores. Where others have threads, you have cores.
A single 192-core system is really only a poor man’s workstation. Life starts >200
400% more cores are 800% more performance just because of dopamine and serotonin enhancing the perceived performance by x2. Placebo purchase is a thing
Ok, so they just merged Bergamo+Genoa into one and the higher core counts are Zen5c and everything below is Zen5. Makes sense. Because a 24-core Zen5c for 4x the price of a 24-core Siena would be outrageous
1 Like
Exard3k:
Life starts >200
I’m dyin here
My EPYC build was supposed to be a meme machine home server build
Duplicating what we’ve been building for customers.
Did some napkin math and it would have been able to replace at least 1 of my prod servers at the house.
Now, it can replace…still 1
1 Like
@Level1_Amber you forgot the most important slide
1 Like
Is this the part where I post a picture of two Rolls Royce turbine coolers dangling off some home workstation?
…going to have to come up with something new for dual 500W. It is feasible to go air, but no longer viable in a relatively small footprint.
Try holding, with yer bare palm, a full on 500W glass floodlamp, now imagine trying to dissipate the heat of TWO of those (1000W) just using air.
Please include the words “silently” and “small form factor” in the build and I say bollocks!
It will have to be a very well thought-out liquid cooling or phase-change system.
…not to mention PRICEY. I don’t know if I can add a third and fourth full time job for just this habit.
I will be available to gladly help other folks spend THEIR money on this.
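To put the 1000W figure above in perspective, here is a rough back-of-the-envelope sketch of the airflow needed to carry that heat away. It uses the standard relation flow = power / (air density × specific heat × temperature rise). The air properties and the allowed temperature rises are assumptions chosen for illustration, not measurements from any build in this thread, and the function name is hypothetical.

```python
AIR_DENSITY = 1.2            # kg/m^3, assumed (roughly sea level, ~20 °C)
AIR_SPECIFIC_HEAT = 1005.0   # J/(kg*K), assumed
M3S_TO_CFM = 2118.88         # cubic feet per minute in 1 m^3/s


def required_airflow_cfm(power_w: float, delta_t_k: float) -> float:
    """Airflow (CFM) needed to remove `power_w` watts with an air
    temperature rise of `delta_t_k` kelvin between intake and exhaust."""
    flow_m3s = power_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_k)
    return flow_m3s * M3S_TO_CFM


# Two 500W CPUs, for a few assumed intake-to-exhaust temperature rises.
for delta_t in (10, 15, 20):
    cfm = required_airflow_cfm(1000, delta_t)
    print(f"1000 W at a {delta_t} K rise: about {cfm:.0f} CFM")
```

This covers the CPUs only and ignores RAM, drives, and PSU losses, so real requirements would be higher. The smaller the allowed temperature rise, the more air has to move, which is where the noise comes from.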
2 Likes
We always used the Dynatron J12s in 4U cases
Dynamic fan control kept it under control…during installs
Then we got the calls, “there’s a LOUD noise comin from the server closet”
Looks like Dynatron already has a solution for the punishment gluttons:
4 Likes
fixed ;p I added the official press deck slides to the post:
Level1_Amber:
4 Likes
Oh for Turin, you have no idea xD Also, the AC couldn’t keep up for the room. When you walked by the server closet it was a waft of heat
2 Likes
Exard3k
October 10, 2024, 9:56pm
19
janitors, cleaning personnel or generally “the uninitiated”…treating the data center like some kind of witchcraft or a nuclear power plant where every flashing light or noise can result in an immediate meltdown. my standard answer: “If we can log into the AS/400 and nothing is burning → it’s fine! and don’t plug any cables”
JayVenturi:
Looks like a screamer
It’s a fan row that fits into 1U. So probably 40mm fans with 6A running at 11k RPM@100%. With 2U, you can at least physically use 80mm fans and keep RPM at a sane level. But I like that Dynatron AIO…provides a lot more surface.
2 Likes
TryTwiceMedia:
Yes, BUT more cores better
It would be fiscally irresponsible to forego potential performance
And I am going single socket. When I can now single socket 192 cores vs 128…
well, that’s a 50% bump, which is 200% more than the 96 cores I had in the cart!
The difference between you and me is that my new virtualization server is for personal use, while I hope your new server is used to make money. Even if I were to go with just one EPYC 192 core, my final budget would be between 20,000 and 25,000 dollars, and I couldn’t justify spending that much money on personal use.
1 Like