Can anyone explain the difference between a NoSQL database versus a SQL database?

I get that NoSQL databases are like flat files, and I have read that the data structure is different than traditional SQL. But I haven't had a chance to program with one yet, so I really can comprehend the advantages, other than they maybe are more mobile, then having to dump a database when you want to migrate between machines...

Versus SQL databases which come in a few flavors, but basically seem to be more of a service, have a daemon, and I think they store data out of band (?)...

Aren't databases basically all tables, rows, and columns?

Or are NoSQL databases more like objects, so they have have anything like arrays, or potentially you could even scaffold data? It would seem like one file would inherently by like one complete database, or object block. It couldn't be a string of mysql injections?

Not too sure how to answer this directly, so I'll give some sort of database history explanation :P

Historically seen, there have been a few types of databases. The most important ones being hierarchical, a network database or a relational database. This is also more or less the following order in which they appeared, but the first two ones are considered to be "legacy software".

Now every database needs a "language" to work with the data. This is called a "Data Manipulation Language", and SQL is an implementation of such a language. More specific, SQL is the DML for most relational databases. A lot of vendors of relational databases choose SQL as their DML because it's easy to work with and because people already know it. But it's also pretty specific to relational databases. So you won't find it easily for another type.

And of course, relational databases have some characteristics. For example: before you can store any data, you have to define what data you want to store, in what format it will be and so on. In other terms, you have to create tables and columns first, and define the links between these tables.

NoSQL is just a collection of other types of databases, so not relational databases. It stands for "not only SQL". It often has the advantage that you can set up a DB and you can use it straight away. Take mongodb as an example. You could just make an object and then tell mongoDB: "Yo, store this thing for me please". There's no defining the data beforehand involved, so you could pretty much store whatever you want.

So it's all about the use case. Relational databases are really good for storing data that's very structured and where you know beforehand how it will look like. Sometimes you need something else though, and these types of databases will often not use SQL. I gave the example of mongoDB already. Another example would be redis, which is an in-memory DB. This means it's really fast, but can't store as much as a traditional, relational DB.

Yet another example, the other extreme if you will, is hadoop, which allows to create giant clusters to form one big database.

So no, not all databases are tables, rows and columns. Some databases store their data internally in a different way so that it's easier to work with in certain use cases.

I hope this cleared things up a bit.


Thanks. That was really beneficial.

Just to add to @herrsubset great answer and to give you a really basic example of the two types:

Let's say I want to store info about my customers. In a relational database a table might look like this with 9 or so columns;

CustomerID FirstName LastName DoB SSN Address1 Address2 City Zip

These columns will have defined datatypes, this means I can opnly store a DATETIME in the DoB and a NUMERIC value in the SSN. They will also be defined as to if a record has to be in a colomn or not. You would not store your customers order information in this table. That would go to an Orders tables and you would have a column in the Orders table that would link an order to a CustomerID (the first column in the customers table).

Setting this up properly means you get great performance and can trust your data, you can also offload a lot of business logic into the database keeping the front end app very lightweight. Most relational DB engines are also very mature and stable giving full ACID compliance (look up ACID), this means zero dataloss in the event of a server restart etc.

The problem is it takes time to set it up properly, you need to learn SQL and it can be difficult to change later on if you need to. Traditional relational databases can also be a nightmare to scaleout across lots of servers.

In a NoSQL database I could have a customers table with just two columns;

ID Customer

The customer column will accept any data. I could place the customers personal details on one row and store their order info on another. I could even store them together. I also don't need to learn SQL and in many cases may just store the JSON or XML from the application. NoSQL is therefore faster for most developers to get a datastore up and running that can evolve with the application. The downside is it's even easier to make a mess of your data or to deploy a half baked solution :-)

Fortunately we seem to have moved past the era when many developers thought NoSQL was the future and tried to use it for everything. I now see both types used together to play to their strengths and some of the relational engines now also imitate some NoSQL features and vice versa - e.g. support for JSON and XML and using Key Value Pairs in relational DB engines. An increasingly common deployment pattern is to have a relational DB for the backoffice (Stock Control, finance, CRM) type databases and a NoSQL layer for the frontend (retail website).

Hope that helps you further. If you are interested in learning some SQL codeacademy now has a course on it.