Why are relational Databases faster than a spreadsheet

MCDerbuss · July 7, 2023, 1:24pm

Why are relational databases faster than spreadsheets? I’ve been googling a bit but all I can find is people saying they are…so what’s the maths?

MadMatt · July 7, 2023, 3:12pm

Faster doing what task?
Also, in how many words and how technical an answer can you absorb ?

MadMatt · July 7, 2023, 3:26pm

Spreadsheets are generic, free form tables where you can define relations between data on a cell by cell (or matrix by matrix) fashion.
Spreadsheets recalculate each cell value every time any of the cell in the sheet changes value, unless you disable auto calc
Spreadsheets give you maximum flexibility in terms of setting relations/ formulas and let you store whatever type of value in whatever cell. Spreadsheets perform all calculations on a sheet in memory, and they read all the data in memory as well
You pay that, when compared to relational databases, with limited total capacity,linear slowdown until you have memory available and then an unusable brick of data once you go past the memory threshold

Relational databases, provided you know your data model, let you overcome most of the slowdowns because they can offload to disk parts you are not using, they let you define indexes for faster data access and relations to make sure constraints like lookups and logical keys are defined. You do not get a gui with a database, so you lose most of the easy access and visualization, also, you need to define exactly what type of data you are going to load in a column in advance …
Two different products for two different use cases…

Dynamic_Gravity · July 7, 2023, 3:27pm

Spreadsheets are basically just doubly linked lists with the data stored in XML format. Spreadsheets don’t really understand what is in the cells just that something is there.

Databases store data and index data with b trees. They understand what the data is based on relations defined by the schema.

For small things spread sheets are fine but they are not designed to scale very well on their own they need a rediculous amount of infrastructure.

jode · July 7, 2023, 5:09pm

Database technology was developed in the early 1960s when programmers on early mainframe technology had the problem that every time they needed to process more data than fit into the measly 64kb main memory they had to manually write a routine that would save some data to disk to allow other data to be loaded into memory (essentially swap).
After doing that a couple of times they noticed that this is such a common need that it would be wise to establish a common and structured way to process datasets larger than fit into memory.
Almost two decades and several iterations of this technology later they decided that tabular relations as the basis for generic data models are the bees knees. This is when the relational database technology was born that is most commonly used.

In this sense (memory consumption) a spreadsheet is a single table that fits into RAM.
Today’s spreadsheet software (Excel and the like) obviously offer a lot of flexibility and convenience for quick calculations that are not found in typical database software.
Database software on the other hand allows performing calculations across any dataset that fits on your permanent storage without failing. It will slow down compared to in-memory only calculations, but it will succeed.