Old techie, new to the forums, building something new

65 million rows is more than I've ever had, but it doesn't shock me that much either. I think either MySQL or PostgreSQL can handle that without much trouble. Comparing either of them with Access is a very bad comparison, though. Access didn't die without reason.

General consensus seems to be that PostgreSQL holds its own better against the likes of Oracle and MSSQL, but I'm still trying to wrap my head around why that would be the case. Then again, you have big fish like Twitter and Facebook using MySQL and they seem to handle it just fine, so either one can scale. Which one is better for you kinda depends on what you're doing with it. And that's really the hard part.

Row count can be a poor indicator of how well an RDBMS will perform; the data types used, the number of columns, how you index, and how you join other tables all have a massive impact. I've seen smaller databases with terrible performance vs larger ones with no such issues. A common mistake is joining more than 7 tables in a single query - the query planner cannot evaluate all the possible ways of joining the tables, so it never finds a plan that is 'good enough', times out, and just settles for a way to do it that might be terrible.
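To see what the planner actually chose, EXPLAIN is your friend. A minimal sketch in PostgreSQL with made-up table names - the settings shown at the end are the knobs that control how many join orders the planner will even consider, and the default of 8 lines up roughly with that 'more than 7 tables' rule of thumb:

```sql
-- Hypothetical schema: an orders table joined to a few dimension tables.
EXPLAIN ANALYZE
SELECT o.id, c.name, p.title
FROM orders o
JOIN customers c  ON c.id = o.customer_id
JOIN products p   ON p.id = o.product_id
JOIN warehouses w ON w.id = o.warehouse_id
WHERE o.created_at > now() - interval '30 days';

-- join_collapse_limit (default 8) caps how many tables the planner
-- will reorder exhaustively; past geqo_threshold (default 12) it
-- switches to a heuristic genetic search instead of an exhaustive one.
SHOW join_collapse_limit;
SHOW geqo_threshold;
```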

Both PostgreSQL and MySQL have lots of plugins (and forks) available that help achieve specific goals. With a proprietary database, companies have to wait for the vendor to decide to adopt such a feature, e.g. SQL Server has only recently adopted support for JSON, whereas PostgreSQL has supported it in one form or another for a number of years.
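For a concrete taste of the PostgreSQL side, here's a minimal jsonb sketch (table and keys invented for the example):

```sql
-- Store documents in a jsonb column, index a key inside them, query them.
CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

INSERT INTO events (payload)
VALUES ('{"type": "signup", "user": {"email": "a@example.com"}}');

-- Expression index on one key inside the document.
CREATE INDEX events_type_idx ON events ((payload->>'type'));

SELECT payload->'user'->>'email' AS email
FROM events
WHERE payload->>'type' = 'signup';
```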

Generally, if you are building a webapp and you plan to keep most logic in the app layer and use the database as a plain datastore, MySQL will probably be fine. If you plan to actually use the database properly, and get it to do the work of dealing with the data, PostgreSQL will often be a better bet for the reasons already explained.
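One small example of what 'getting the database to do the work' looks like - pushing integrity rules and concurrency handling down into the data layer instead of re-checking everything in app code (PostgreSQL syntax, hypothetical table):

```sql
-- Constraints enforce the rules no matter which application writes the data.
CREATE TABLE accounts (
    id      bigserial PRIMARY KEY,
    email   text    NOT NULL UNIQUE,
    balance numeric NOT NULL CHECK (balance >= 0)
);

-- Insert-or-update in one atomic statement instead of a racy
-- SELECT-then-INSERT round trip from the application.
INSERT INTO accounts (email, balance)
VALUES ('a@example.com', 100)
ON CONFLICT (email) DO UPDATE
SET balance = accounts.balance + EXCLUDED.balance;
```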

If used properly (and optimised and played to its strengths) there shouldn't be many reasons why you would find any of the common RDBMSs lacking in performance, but unfortunately most developers barely scratch the surface of database functionality and will often reinvent functionality that already exists. Where I currently work we have a lot of developers who get described as 'full stack', and the DBAs are kept very busy :wink:
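A classic example of reinvented functionality is 'latest row per group', which many apps solve by fetching everything and filtering in code, when a window function does it in one query. A sketch with invented names:

```sql
-- Most recent order per customer, done entirely in the database.
SELECT *
FROM (
    SELECT o.*,
           row_number() OVER (PARTITION BY o.customer_id
                              ORDER BY o.created_at DESC) AS rn
    FROM orders o
) ranked
WHERE rn = 1;
```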


Going with what you know is damned good advice. I brought the system up on the motherboard box and started playing. The objectives are to get a system running a type 1 hypervisor, then get VMs up so I can learn the Linux environment and the database choices - starting from "don't know nothing" about Linux.

I wanted to run Xen or KVM. After much reading I launched into getting KVM installed under Linux Mint. It was not pretty. I have a little experience with Mint, so installing that was easy. KVM under Mint was a disaster. Error after error after error... Googling, installing more stuff, more errors, more Googling, more installing... 12 hours later, no end in sight. I was making progress, the various things I was doing were inching me forward. But really?

I gave up, downloaded Windows 10 Pro, installed it, enabled Hyper-V with one click, and two hours later had VMs up and running.

More later.

I found a nice NVMe SSD for this build: 960 GB for $200. Unfortunately it's the long form factor, so I can only use one of them in the Taichi motherboard.

What I am going to do is buy another one and swap it for the drive I just built with Windows 10 and Hyper-V, put Linux Mint on it, and then document what has to be done to get KVM and its virtual machines running. Once that works I'd like to run some benchmarks. Having a single fairly fast SSD to swap out would allow easy, fast, repeatable A/B comparisons between the hypervisors.

Even better would be if the VMs could live on an SSD that stays put and be mounted under whichever hypervisor I want to test. We'll see if that works.

I got a client who needed me to import that database and run counts against it. It was interesting, but as the job progressed I ended up building dozens of queries, stored procedures, and so on. Plus the client kept bringing more databases. I call them databases, but they were really just single large tables.

I ended up doing 60-hour weeks on this, so I decided to automate it with C#. I went to the community college and took two semesters, then hired the sharpest kid in the class to come to my home office and work with me. I (we) built a system that dumps 420 million names and addresses to disk in files of at most 2 million rows each, feeds those into virtual machines running AccuZIP for address updating and NCOA, then loads the results back into SQL Server.
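One way to carve a 420-million-row table into 2-million-row files is keyset pagination - a rough sketch in SQL Server syntax with invented table and column names, not the actual production code:

```sql
-- Export in batches of at most 2 million rows, keyed on id so each
-- pass is an index seek rather than an ever-growing OFFSET scan.
DECLARE @last_id bigint = 0;

SELECT TOP (2000000) id, full_name, address_line, city, state, zip
FROM mailing_list
WHERE id > @last_id
ORDER BY id;

-- The calling program records the highest id returned into @last_id
-- and repeats until no rows come back; each batch becomes one file.
```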

It runs to this day.

I really can't "replace" SQL Server, simply because the C# program calls into an MS library to perform the automation.

What I can do, however, is move the server into a VM and run that VM on my hardware.
