Some of the bigger Internet companies are moving away from the relational database approach utilized by SQL database systems, in favor of a new breed of database called NoSQL. There are many different NoSQL databases, each with unique approaches to storing data; the “NoSQL” moniker derives from their non-relational approach (as opposed to SQL, which is meant for managing data in relational database management systems).
Indeed, the number of NoSQL databases keeps growing, as do the features of each. Here’s a list of some of the more common ones and what they offer.
First, Some Document-Based Types:
MongoDB: This document-based NoSQL database uses JSON (well, technically it uses “BSON,” a binary form of JSON) to store its data. You store your JSON structures in collections. MongoDB features a huge number of language drivers, and is used by many different companies that serve up documents (including SourceForge).
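To make the document model concrete, here is a minimal sketch in plain Python of a collection holding JSON-like documents. It is not MongoDB's driver API: the `Collection` class and its `insert` and `find` methods are hypothetical names chosen for illustration, and a real application would go through one of MongoDB's language drivers.

```python
import json
from itertools import count


class Collection:
    """Toy in-memory stand-in for a document collection (illustrative only)."""

    def __init__(self):
        self._docs = {}
        self._ids = count(1)

    def insert(self, doc):
        # Store a copy under a generated id; documents need not share a schema.
        doc_id = next(self._ids)
        self._docs[doc_id] = {**doc, "_id": doc_id}
        return doc_id

    def find(self, **criteria):
        # Return every document whose fields match all the given criteria.
        return [
            d for d in self._docs.values()
            if all(d.get(k) == v for k, v in criteria.items())
        ]


projects = Collection()
projects.insert({"name": "editor", "language": "C", "tags": ["text", "cli"]})
projects.insert({"name": "wiki", "language": "Python"})  # no "tags" field

matches = projects.find(language="Python")
print(json.dumps(matches))
```

Note that the two documents have different fields. That flexibility is the main contrast with a relational table, where every row follows the same column layout.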
CouchDB: This one’s similar in many ways to MongoDB, in that it’s also JSON-based. What’s the difference between MongoDB and CouchDB? One difference is that CouchDB uses MVCC (multi-version concurrency control) to handle extreme concurrency, which may or may not suit your application, depending on how much overhead you can tolerate.
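The MVCC idea can be sketched as follows. This is a simplified, in-memory illustration in Python, not CouchDB's HTTP API: every stored document carries a revision marker, and a write succeeds only if the writer presents the revision it last read. The `VersionedStore` class, the `ConflictError` exception, and the integer revision counter are all assumptions made for this example.

```python
class ConflictError(Exception):
    """Raised when a writer's revision is stale."""


class VersionedStore:
    def __init__(self):
        self._data = {}  # doc_id -> (revision, document)

    def get(self, doc_id):
        rev, doc = self._data[doc_id]
        return rev, dict(doc)  # copy, so callers cannot mutate stored state

    def put(self, doc_id, doc, rev=None):
        current = self._data.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            # Someone else wrote since this caller last read the document.
            raise ConflictError(f"expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._data[doc_id] = (new_rev, dict(doc))
        return new_rev


store = VersionedStore()
store.put("page", {"title": "Draft"})      # first write, no revision needed

rev_a, doc_a = store.get("page")           # two clients read the same revision
rev_b, doc_b = store.get("page")

doc_a["title"] = "Final"
store.put("page", doc_a, rev=rev_a)        # client A's write is accepted

doc_b["title"] = "Other"
try:
    store.put("page", doc_b, rev=rev_b)    # client B holds a stale revision
    conflict = False
except ConflictError:
    conflict = True
```

No locks are taken at any point. The cost is that a losing writer has to re-read the document and retry, which is the kind of overhead to weigh against your application's write patterns.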
Now for Some Key-Value Types:
BigTable: This is Google’s own database, used by many of its applications. It was developed internally, and the code remains unavailable. But the company has published many technical papers about the database and how it works, thus allowing other developers to create databases that function in a similar way.
As Google puts it, “A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map” (that quote comes from the official paper). The layout of a BigTable is almost deceptively simple. Each value is indexed by a row key, a column key, and a timestamp. In the paper’s example, a “contents” column holds the stored page for each row.
Want to use BigTable? You can’t download and install it, but you have a couple of options. You can access it via the Google App Engine, along with a query language called GQL, which is actually similar in nature to SQL. Or you can install open-source databases that others have modeled after the design. Cassandra is one; there’s also an open-source one called HBase.
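As a rough illustration of that “sorted map” description, the sketch below models a table in Python as a map from (row key, column key, timestamp) to a value, which is how the paper describes the index. The `SparseTable` class and its methods are hypothetical, the sample row and column names follow the web-page example in the paper, and nothing here reflects Google's actual implementation.

```python
import bisect


class SparseTable:
    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._cells = {}   # same key -> value

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negate so newer versions sort first
        if key not in self._cells:
            bisect.insort(self._keys, key)
        self._cells[key] = value

    def get_latest(self, row, column):
        # The first key at or after (row, column, -inf) is the newest version.
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._cells[self._keys[i]]
        return None  # sparse: cells that were never written take no space

    def scan_row(self, row):
        # Keys for one row are contiguous in the sorted list.
        start = bisect.bisect_left(self._keys, (row,))
        for key in self._keys[start:]:
            if key[0] != row:
                break
            yield key[1], -key[2], self._cells[key]


table = SparseTable()
table.put("com.cnn.www", "contents:", 3, "<html>v1</html>")
table.put("com.cnn.www", "contents:", 5, "<html>v2</html>")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")

latest = table.get_latest("com.cnn.www", "contents:")
row = list(table.scan_row("com.cnn.www"))
```

Keeping the keys sorted is what makes row scans cheap, and storing several timestamps per cell is what lets older versions of a value remain readable.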
Cassandra: This was created by Facebook (love it or hate it); the original designer claims it was modeled after BigTable. It’s now a top-level Apache project, used by many large organizations, including Reddit and Netflix.
HBase: HBase is designed for very large tables, with the capacity to handle billions of rows and millions of columns. It was also modeled after BigTable.
SimpleDB: This is Amazon’s answer to the key-value approach (although some people put it in the document-based category). It runs on Amazon’s servers, with access from your code via a REST interface. Although you can’t install it on your own servers, people have made development versions that mimic its behavior, which you can run on your own development machines. (Try Googling to find some; these projects tend to come and go quickly. And if you find one, check the dates to make sure it’s still being maintained.)
There are dozens of different NoSQL options, and each one seems to have its own unique way of managing data. If you’re new to NoSQL, try out several different ones to see which features work best for your own needs.