Ryzen build for MySQL database server

Hi all,

I need to set up a new database server for my small business to host an industry analysis / market research database. Below is what I am currently planning.

Database specs:

  • MySQL 8.4
  • Database size of ~1TB, in ~150 tables
  • The largest table has ~10B rows, and the average table has 100M rows
  • Very few writes, from a single user (besides the initial load; updates will be inserted by different crawlers on an hourly or daily basis)
  • Very few but complex queries (many JOIN and WHERE clauses) from a single user (for daily reporting and ad-hoc data analysis as needed); plus backups, of course

Proposed new hardware:

  • CPU: AMD Ryzen 9 7900X (based on the above use case, I think single-core performance is key)
  • RAM: 4x 32GB DDR5 ECC UDIMM 2Rx8 4800 (probably Crucial)
  • Motherboard: AsRock Rack B650D4U-2L2T/BCM
    • 10GbE and IPMI onboard
    • Supports x4x4x4x4 bifurcation
    • Alternative: Gigabyte MC13-LE1
  • Storage:
    • 4x Samsung PM9A3 1920GB 2.5" in ZFS RAID 10 (a pool with 2 mirror vdevs of 2 drives each, 16KiB recordsize to match InnoDB's page size)
    • 1x Intel Optane P4800X as ZFS SLOG vdev
    • Delock 90169 (PCIe x16 to 4x U.2 card)
    • Boot drive: 1TB consumer M.2 NVMe
  • OS:
    • Debian 12
    • Dedicated bare-metal server

FYI, the current hardware:

  • Intel Xeon E3-1230 v6
  • 32GB of non-ECC RAM
  • 2TB consumer NVMe SSD

My first question: Since I’m neither a DBA nor a real systems admin, I am curious about your thoughts on my current plan. Do you see any issues or do you have any better ideas? Happy to learn!

My second question: Should I instead build something around a used EPYC, e.g., a 7F52? It would only boost to 3.9GHz instead of 5.6GHz, but it would have 4x the L3 cache and support more RAM. I am not sure which will deliver more performance for my use case…

Thank you!

Why do you feel the need to upgrade from your current setup? What do you expect the proposed setup to provide that your current one doesn’t?

  • If the few but complex queries actually traverse a significant portion of your data, the bottleneck is likely not the CPU or RAM but the storage. Your choice of enterprise-class NVMe drives is not bad.
  • IMHO, stock MySQL is the wrong software for this workload. You need database management software that supports analytical query loads well; ideally, it supports concepts like columnar storage, partitioning, etc.

Assuming you want to upgrade because you’re not happy with the current level of performance, and not because your existing hardware just reached EOL (Xeon CPU), I’d rather invest the time and effort in migrating away from MySQL to a different DBMS platform.

1TB of data in MySQL compresses to roughly 100GB in a columnar format (e.g., Parquet) suited to the type of workload you describe. Queries should be much faster to process, and the full dataset may even fit into memory.

MariaDB (the community fork of Oracle’s MySQL) offers a columnar storage engine (ColumnStore) with a good pedigree, but it seems to be distributed as a commercial product. The obvious benefit here is that it should be a mostly drop-in replacement for your current setup.

PostgreSQL is (again, IMHO) way more robust as a database engine. Citus Data has extended PostgreSQL with the required functionality for analytical processing. It is available freely, but as a small business you may benefit from some help, especially if you’re neither a DBA nor a real systems admin.

Alternatively, you could look at ClickHouse or the Apache-supported OLAP tools.

3 Likes

Thank you jode for your answer! Let me try to better explain some of the points you highlighted:

  • Regarding the need to upgrade:

    • It’s definitely to get better performance (e.g., I am currently running with a 1Gbit NIC and PCIe 3 connectivity only)
    • But also to “professionalize” the system and make it more resilient (e.g., I am currently running non-ECC RAM and a single consumer-grade SSD)
  • Regarding the DBMS: Great, I will look into migrating to PostgreSQL first then! (I am limited by the options that the market research data provider supports, but PostgreSQL is one of them. So I will start here.)

  • Regarding the queries: Maybe “complex queries” was the wrong term. Essentially, I have a few SQL queries that join a large number of long relational tables into a handful of Polars data frames (based on Apache Arrow). All further processing is then done in-memory on a workstation, just like you described. I would love to go directly to a columnar format, but again, I’m limited by what the data provider supports here. So an intermediate step through a relational DB it is…

Ah, that potentially changes the requirements quite a bit. I guess there is no need to find a DBMS suitable for an OLAP workload, as you already use one downstream.

So, in essence you want to build/upgrade a machine that exists solely for the purpose of staging a data load into the final OLAP system.
Fair enough.

I still think that you have the chance to benefit significantly from switching to PostgreSQL. MySQL processes each query single-threaded (I don’t think that has changed, though I haven’t looked in a few years), while PostgreSQL has successively added parallel processing capabilities to its engine since version 9.6.

Whether you’re going to benefit from the effort of switching to PostgreSQL really depends on the queries. It has the potential to yield quite a dramatic speedup, but it will likely require more elbow grease than you put into setting up MySQL.
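
If you want to see what the planner does with one of your joins, here is a minimal sketch (database, table, and column names are made up; max_parallel_workers_per_gather is the main knob):

```bash
# Sketch only: check whether PostgreSQL parallelizes a join.
# "staging", "facts", and "companies" are hypothetical names.
psql -d staging <<'SQL'
-- allow up to 4 worker processes per Gather node (0 disables parallelism)
SET max_parallel_workers_per_gather = 4;
-- look for "Gather", "Parallel Seq Scan", or "Parallel Hash Join" in the plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.industry, count(*)
FROM facts f
JOIN companies c ON c.id = f.company_id
WHERE f.year = 2022
GROUP BY c.industry;
SQL
```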

Assuming my characterization as a “staging db” is not too far from the truth, let me ask you a couple of technical questions:

I assume the “market research data provider” can export the data into a plain text format (some form of CSV is commonly supported). Have you considered loading that into your target DBMS and processing the data there? That would allow you to cut out the middleman completely and save the money on the upgrade.

How often does the machine load data? Am I correct to assume it loads bulk data at regular intervals (e.g., daily, hourly)? What data volume is processed during any of these loads? Does the data load perform a full refresh (replacing all data) or an incremental load (loading only the data that changed since the last load)?

Do you know what your current performance bottleneck is (i.e., which hardware part is holding you back)?

  • Is the network fully pegged during data export from the “market research data provider”?
  • Is a single core pegged during data load and data processing in MySQL?
  • How is memory usage?
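
A few stock Linux tools (mostly from the sysstat package) answer exactly these questions; the 5-second sample interval is just an example:

```bash
# Is the NIC saturated during the export? Per-interface throughput:
sar -n DEV 5
# Is a single core pegged during the load? Per-core utilization:
mpstat -P ALL 5
# How is memory usage (free RAM vs. page cache)?
free -h
# Is the drive the bottleneck? Per-device utilization and latency:
iostat -x 5
```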

I expect that your proposed upgrade from a single gen3 NVMe drive to 4x gen4 NVMe drives in a “RAID 10” config with ZFS will not yield noticeable performance benefits. It will be way more resilient to failure, but I wonder whether that is really necessary for a staging db.

Did you change any settings in the MySQL config (and if so, which) to “tune” or “adjust” it for the data load and hardware?

If you really want to go ahead and switch towards PostgreSQL, I recommend doing that before the hw purchase.
PostgreSQL doesn’t work well out of the box and requires, at a minimum, the adjustment of about a handful of settings.
If you go through that learning curve on your existing hw, you will know which bottlenecks your hw has under the setup you plan to run after the upgrade. As a result, you may find that your hw choices differ significantly.
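
For orientation, a minimal sketch of the usual first handful of knobs, set via ALTER SYSTEM; the values are placeholders for a hypothetical 128GB box, not recommendations:

```bash
# Sketch: typical first-pass PostgreSQL settings (placeholder values!)
psql <<'SQL'
ALTER SYSTEM SET shared_buffers = '32GB';           -- PostgreSQL's own buffer cache
ALTER SYSTEM SET effective_cache_size = '96GB';     -- planner hint: OS cache included
ALTER SYSTEM SET work_mem = '256MB';                -- per sort/hash node, so be careful
ALTER SYSTEM SET maintenance_work_mem = '2GB';      -- index builds, VACUUM
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
SQL
sudo systemctl restart postgresql   # shared_buffers only changes on restart
```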

2 Likes

As someone who works with large MySQL databases every day, I would recommend you steer more towards IOPS performance and more memory. MySQL really isn’t limited that much by the CPU; it’s mostly a lack of RAM once the database is properly warmed up, or not enough IOPS for the queries. Queries need to be optimized to run in parallel, so if you’re just running a single unoptimized query that has to scan 10B rows, more cores will not magically make it go faster.

If it’s alright, maybe you could post a sample query, along with its current execution time, so we can better understand what we’re working with?

Your storage hardware config otherwise looks solid. But if you’re going to use ZFS with MySQL, you might have to tweak some ZFS settings, because the ARC might conflict with the memory MySQL expects to be able to use, which could lead to increased memory pressure. Based on your config including a SLOG, I am assuming this is for safety and reliability.
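
For example, something along these lines (a sketch only; pool/dataset names are made up, and the values assume InnoDB's default 16KiB page size):

```bash
# Sketch: common ZFS knobs for an InnoDB data dataset (names are hypothetical)
zfs set recordsize=16k tank/mysql         # match InnoDB's 16KiB page size
zfs set primarycache=metadata tank/mysql  # let the InnoDB buffer pool cache the data
zfs set logbias=throughput tank/mysql     # data files don't need the low-latency SLOG path
zfs set compression=lz4 tank/mysql        # cheap on CPU, often a net win
# cap the ARC (here 16GiB) so it doesn't fight the buffer pool for RAM
echo $((16 * 1024 * 1024 * 1024)) | sudo tee /sys/module/zfs/parameters/zfs_arc_max
```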

1 Like

You might also want to consider FreeBSD, which usually performs well as a database server.

After sleeping on this statement, I need to clarify it.

There are many database management systems (DBMS) out there designed to work well under certain workloads.
This is not unlike cars where there are very different types of cars for different workloads, e.g.:

  • a minivan is a great tool for every soccer mom driving their offspring to the local sports meet in the pm.
  • a bus is a great and efficient tool to haul 50+ people across town, or even between towns
  • a dump truck is a great tool to haul a ton of garbage around, a workload where both bus and minivan would perform quite poorly

MySQL (as well as PostgreSQL) was designed to work well in OLTP workload scenarios (simply characterized as: many fast queries that each touch only a fraction of the overall dataset). In our analogy above, it’s the minivan.
Your initial description of your db workload made it sound more like an OLAP workload (fewer queries, most touching a significant portion of the total dataset). I would characterize that more like the bus.
Your response to my initial post gave me the impression that the workload could possibly better be serviced by the equivalent of the proverbial dump truck (no offense).

The fact that this is my third post in this thread documents my passion in this area, both personally as well as professionally.

However, a database server in a small biz should first and foremost work without requiring a lot of effort. MySQL certainly has a proven track record in this regard in general, and seemingly also in your environment.

As a guideline, I would not recommend changing this in your environment unless you could get a significant improvement, roughly an order of magnitude, as a result (aside from the hw change that you’re asking about).

The main job of a DBMS is to process data volumes that are not easily handled by a couple of lines of software code (e.g., a large array in C++ or Python). DBMSs are designed not to fail. Most do that by silently selecting a less efficient method of processing the data (e.g., spilling to disk in order to stay within a specified amount of RAM).
Therefore, understanding the bottleneck(s) of a given workload is critical to

  1. choosing the best DBMS for the job, and
  2. optimally using the chosen DBMS

I am happy to engage further in this conversation, but I feel my passion may divert this discussion away from your objective. Feel free to respond or PM me.

A few notes about your specific question:

  • On paper, the proposed system will add a lot more horsepower: 3x the number of cores, each core slightly faster; ~2x the memory bandwidth and 4x the memory capacity; and 4-10x the storage bandwidth
  • The reason I responded initially is that I have concerns about whether you will actually receive the theoretically possible improvements after the upgrade. My main concern is the storage setup, the most expensive part of the upgrade.

1 Like

Thank you all for your help! This gave me a lot to think about, hence the delay in responding…

  • I migrated the DB from MySQL to a PostgreSQL instance. This led to some significant performance improvements (for example, with MySQL the full data-ingestion load took >3 days; with PG, ~10h). Of course, I am fully aware that this says way more about my inability to properly tune the MySQL DB than about the DBMSs themselves! Nevertheless, as a layman I take all the out-of-the-box performance I can get.
    • For MySQL, the main performance setting I used was innodb_dedicated_server=1 (following the guidance in this talk)
    • For PostgreSQL, I used the settings proposed by PGTune for a data warehouse
  • Now I will spend some time simplifying and optimizing the PostgreSQL queries, in particular shifting more of the OLAP workload into Polars instead of running it as SQL queries.
  • Regarding hardware, I think I will pass on the Ryzen system for now and first investigate the bottlenecks some more. Meanwhile, I will keep my eyes open for a used 2nd- or 3rd-gen Xeon Scalable system that I can load up with memory (looking at you, PMem).
  • My reason for thinking about a ZFS SLOG was this article stating that database workloads can benefit from low-latency log (SLOG) devices. But since my specific workload isn’t particularly write-heavy on a day-to-day basis (incremental loads only), I guess that’s not a priority.
3 Likes

Look for a package called postgresqltuner on your Debian. It connects to your db instance and suggests configuration adjustments based on actual usage. Better still, it explains the reasoning behind the suggestions.

Have a look at the pgsql log file. Chances are you will see a lot of checkpoint messages, maybe even warnings about checkpoints occurring too frequently.
This is best fixed by increasing the value of max_wal_size, which also comes with increased data-ingest performance. The cost of increasing this value is more storage space dedicated to the write-ahead log (WAL), and, in case of a crash, it takes PostgreSQL longer to recover.
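
Something like this (the log path is Debian’s default; the 16GB value is a placeholder, size it to your ingest bursts):

```bash
# Sketch: spot the checkpoint warnings, then raise max_wal_size (placeholder value)
grep -i checkpoint /var/log/postgresql/postgresql-*.log | tail
psql -c "ALTER SYSTEM SET max_wal_size = '16GB';"
psql -c "SELECT pg_reload_conf();"   # max_wal_size needs only a reload, not a restart
```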

3 Likes

I read through this and saw that the hw upgrade was delayed until performance is addressed. Also, +1 for Postgres!

IDK anything about DB performance tuning, but what I do know is that DBs, even with good ZFS tuning, don’t get as much performance as on an md array with ext4 or XFS. The good part about ZFS is that you can put the instance into backup mode (for postgres it’s just pg_backup_start() in 15 and 16), take a snapshot, then leave backup mode (pg_backup_stop() in postgres) and zfs-send the data (and transaction logs / WAL) for fast backups.
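
Roughly like this (a sketch; dataset names are made up, and it assumes data and WAL live on the same dataset so the snapshot is atomic; in PG 15+ pg_backup_start() and pg_backup_stop() must run in the same session, hence the \! shell escape):

```bash
# Sketch: zfs-snapshot backup of a PG 15/16 instance (dataset names are hypothetical)
psql <<'SQL'
SELECT pg_backup_start('nightly');
\! zfs snapshot tank/pgdata@nightly
SELECT pg_backup_stop();
SQL
# ship it somewhere else
zfs send tank/pgdata@nightly | ssh backuphost zfs receive -F backup/pgdata
```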

And I’m not so sure you need to back up a staging DB, or that you need much resilience (which ZFS provides). I love ZFS, but does ZFS even make sense in your case? If the DB is not that important, then use whatever gets you better performance.

1 Like

Having spent a lot of time around failing databases, I can say that having a small cluster of databases (one write master and several replicas) gives you a much better recovery posture. Replica databases can live on other filesystems depending on their needs. If you want your write master to have the best write performance, consider putting it on an XFS filesystem. Do backups from a replica on ZFS for its snapshotting and read-mostly performance.

2 Likes

In general, having just one database server is not as good as having two or more. Replication gives you the opportunity to backup from something that is not on the critical path.

From a hardware standpoint, I would jump from Ryzen to Epyc for the amount of ECC RAM it can hold. Used Epyc parts are pretty affordable. If you want to expand to NVMe storage, you’ll have plenty of PCIe lanes to do it with.

ZFS is not necessary here and will only slow down the DB, no matter how you configure it.

What you are missing is a 2nd server. MySQL (or any DBMS) is best set up with a 2nd box that maintains a real-time replica. This is important for recovery and for off-loading heavy queries, and it is common practice.
I use Percona’s MySQL as my daily driver. It’s more compatible and very reliable, and they have set the industry standard in backups with xtrabackup. Oracle’s community version is also pretty good.

I’d avoid MariaDB unless you are sure you can use it and that it has the features you need. As they have diverged well away from the standard db structures, converting away from MariaDB later is impossible without a lot of downtime on large systems.

Ideally, both machines should be identical hardware, but if cost is an issue, spend on the primary for extra RAM and faster storage.

Databases are seldom limited by CPU type; your critical issues are storage speed, mainly storage latency, followed by RAM.

More RAM, faster RAM, and fast storage:
DO NOT USE SATA for anything other than a backup store.
SATA deliberately introduces unnecessary latency in any multi-threaded app or access scheme. DO NOT USE SATA FOR DATA, ever. No SSD or RAID will fix SATA. Don’t do it.

Do not use ZFS; your RAM is better spent on database cache. Your redundancy should come from having at least 1 replica, and COW will slow you down further. I primarily use XFS on LVM volumes so that I can add storage and grow the data volume dynamically. As long as you are using RAID 10, you can do the RAID in software (LVM) or hardware; hardware usually has a bit of an advantage but can be risky due to dead cache batteries (monitoring is your friend here).
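
Growing the data volume later is just a couple of commands (VG/LV names and mount point are just examples):

```bash
# Sketch: grow an XFS data volume on LVM (names are hypothetical)
vgextend vg_data /dev/nvme2n1                # add a new disk to the volume group
lvextend -l +100%FREE /dev/vg_data/lv_mysql  # grow the logical volume
xfs_growfs /var/lib/mysql                    # XFS grows online, while mounted
```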

RAM of course is very useful, but don’t forget to configure it in the MySQL cnf file(s). If you can use MySQL 8.x, do so; you’ll have a much better experience due to dynamic memory allocation. Also, always use ECC, for the same reasons we require it for ZFS.
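
For example (the 96GiB figure is a placeholder; size the buffer pool to your RAM minus OS and connection overhead):

```bash
# Sketch: give InnoDB the bulk of the RAM; MySQL 8 can resize and persist this live
mysql -e "SET PERSIST innodb_buffer_pool_size = 96 * 1024 * 1024 * 1024;"  # 96GiB
mysql -e "SELECT @@innodb_buffer_pool_size;"                               # verify
```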

See Percona’s web site for best practices and …

Best of luck.
Clay

3 Likes

Also, in case you were thinking of using ZFS or anything else for snapshots, don’t. This does not work with any database system. At all. With the exception of Oracle BCVs, transactional DBs are corrupted by OS- or lower-level snapshots. Take the time to configure and learn how to use xtrabackup. Use it to create your replicas and to test your backup recovery.
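
The basic xtrabackup cycle is short (paths are just examples):

```bash
# Sketch: full backup cycle with Percona XtraBackup (paths are hypothetical)
xtrabackup --backup --target-dir=/backup/full     # copy data while mysqld runs
xtrabackup --prepare --target-dir=/backup/full    # make the copy consistent
# to restore: stop mysqld, empty the datadir, then:
xtrabackup --copy-back --target-dir=/backup/full
```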

1 Like

@memnoch_proxy @Clay_Bell this is a staging DB; the DB is not important and can be wiped at any time, because it is only used to ingest data into another system. In fact, the DB was switched to postgres in the meantime.

I generally agree with you: xfs on md is faster than zfs for mysql, even with heavily tuned zfs. And replicas should be used if you need the uptime. But this scenario is neither here nor there; the db doesn’t even need backups.

I would still go for a Proxmox server with a VM; just my advice in general. It makes backups of the server easier (only the VM, without the DB data), reduces downtime, etc.

Maybe then mount the storage into the VM as external data. That is what I do as well.

Anyhow, besides the specs… did you also think about fine-tuning? People really forget about the configuration part, and it’s not very straightforward. The default configs are NOT good.

I share my MySQL config (and many more configs) over here; please take a close look:

Also, if you use PHP (or another client-side stack interfacing with the server), be sure to configure that part correctly as well. E.g., PHP: My PHP 8 configs ($91) · Snippets · GitLab

P.S. I keep my configs up to date here. The post is maybe 2 years old, but I update/improve my configs regularly, and thus also my snippet.

I personally love MariaDB. Sorry to hear this.

Ahem, says who?
ZFS snapshots work just fine with postgres, provided you take a snapshot of both the data and the WAL in the same operation …

1 Like

If you don’t need backups and don’t care about having to reload the data every now and then (provided it’s quick), you have no need for zfs or raid; they need a lot of care to perform well when you use multiple devices.

I vote for an Epyc-based single-instance server with 2TB of RAM and the database loaded onto a ramdisk or, second best, an Optane-based system (not sure Ryzen supports it)…
You don’t need clusters or redundancy for your use case…
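
For what it’s worth, a throwaway ramdisk-backed instance is only a few commands (Debian paths; the size is a placeholder, and everything is gone on reboot):

```bash
# Sketch: disposable staging instance on a ramdisk (all data is lost on reboot!)
mount -t tmpfs -o size=1200g tmpfs /srv/pg_ram
chown postgres:postgres /srv/pg_ram
sudo -u postgres /usr/lib/postgresql/15/bin/initdb -D /srv/pg_ram/data
sudo -u postgres /usr/lib/postgresql/15/bin/pg_ctl -D /srv/pg_ram/data -l /tmp/pg.log start
```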

1 Like

ZFS might still help with getting correct query results, if OP values that; speaking from bad experience with a big MySQL DB machine on XFS on SSDs that had buggy firmware. On XFS the database kept getting corrupted; switching to ZFS found and corrected all the block errors caused by the bad SSDs.

That is an interesting combination… I wouldn’t want that for my gaming rig, let alone for a server…
XFS I never liked, and I have been using Linux since v0.99; even reiserfs was more reliable than that… ext4 is going to be much more reliable…