No-SQL. The topography of a database

In our previous blog post on No-SQL databases, we focused on Big Data. We explored the idea that, because of the enormous size of the underlying data, our former notions of data efficiency and order no longer apply. Rather than spread related data across numerous normalized tables, we strive to keep related data together. In doing so, we greatly simplify the task of retrieving and storing data. When we need the data, it is already stored in one complex record in one table. One read and we have it.

But in simplifying the retrieval and storage of data, we create complexity of another kind. How do we keep track of data that we formerly parceled out with logical precision to individual tables?

  • Customers can place many orders.
  • The orders can contain many line items.
  • The line items can, in turn, represent many products.
  • There are invoices to be sent out,
  • backorders to be dealt with, and
  • payments to be received.

How do we propose to store all of this data in one record? In answering this question, we find that our data takes on an unusual shape or topography. Each “record” is no longer flat like Kansas. On the contrary, it has contours, shapes, and texture, like Colorado.

We find that each of our data records is lumpy. Each one accommodates all the data necessary to describe the underlying business or information problem.
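To make the lumpy record concrete, here is a minimal sketch in Python of how the customer data listed above might sit together in one document. Every field name and value is hypothetical, a plain dictionary stands in for the database, and prices are kept in whole cents to sidestep rounding.

```python
import json

# Hypothetical customer document: all names and values are illustrative only.
customer = {
    "customer_id": "C-1001",
    "name": "Acme Hardware",
    "orders": [
        {
            "order_id": "O-1",
            "line_items": [
                {"product": "hammer", "qty": 2, "unit_price_cents": 1250},
                {"product": "nails", "qty": 100, "unit_price_cents": 5},
            ],
            "invoices": [{"invoice_id": "I-1", "amount_cents": 3000}],
            "backorders": [],
            "payments": [{"payment_id": "P-1", "amount_cents": 3000}],
        },
    ],
}


def order_total_cents(order):
    """Sum quantity times unit price over the order's line items."""
    return sum(item["qty"] * item["unit_price_cents"] for item in order["line_items"])


# One read of the document gives us the orders, line items, invoices, and payments.
first_order = customer["orders"][0]
print(order_total_cents(first_order))  # prints 3000

# The same nested structure serializes directly to JSON text.
print(json.dumps(customer, indent=2))
```

Note how the orders, line items, invoices, backorders, and payments that would each have had their own table are nested inside the one customer record.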

In No-SQL, the records and tables are so different, in fact, that when we refer to them we must use different terms. We refer to collections rather than tables because the structure of a collection is diverse enough to accommodate many different aspects of one data problem. And we refer to documents rather than records because a record implies structural uniformity rather than the diversity of information that the No-SQL database can accommodate.

But do not confuse No-SQL documents with Word documents or other kinds of unstructured text files. These are highly structured, data-rich groupings of information designed expressly to accommodate our high-performance data storage and retrieval needs.

In attempting to understand the benefits of No-SQL, we can find a helpful analogy in physics: the conceptual transition from Newtonian physics to Einsteinian physics. In Einsteinian physics, space is no longer Euclidean. It becomes curved. And time no longer passes in purely fixed intervals; it behaves differently depending on the relative speed of the objects in question.

Similarly, in No-SQL we no longer think of documents (formerly records) as uniform in length or field count.  Documents can contain a variety of related information that is stored together to describe our business problem or data problem.
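As a small illustration of documents that differ in field count, the sketch below models a collection as a plain Python list of dictionaries. The collection name, field names, and values are all made up for the example, and no particular No-SQL product is assumed.

```python
# Hypothetical collection: three documents, each with a different shape.
customers = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {
        "id": 2,
        "name": "Grace",
        "phones": ["555-0100", "555-0101"],
        "address": {"city": "Denver", "state": "CO"},
    },
    {"id": 3, "name": "Linus"},
]


def field_names(doc):
    """Return the top-level field names of a document in sorted order."""
    return sorted(doc)


# Each document carries only the fields that apply to it.
for doc in customers:
    print(doc["id"], field_names(doc))
```

A relational table would force all three rows into the same columns, with empty values where nothing applies. Here each document simply omits what it does not need.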

We refer to this diverse shape of the records as the topography of the documents in a No-SQL database. Understanding this topography, and knowing that it is dynamic and can be changed over time with relative ease, is a powerful concept indeed.
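One way to picture a topography that changes over time is a new field that appears on newer documents while older ones are left as they are. The sketch below assumes a hypothetical loyalty_tier field and again uses plain Python dictionaries in place of a real database.

```python
def add_field(doc, name, value):
    """Return a copy of the document with one extra field added."""
    updated = dict(doc)
    updated[name] = value
    return updated


def loyalty_tier(doc):
    """Read the hypothetical loyalty_tier field, tolerating older documents."""
    return doc.get("loyalty_tier", "none")


# An older document written before the field existed.
old_doc = {"id": 1, "name": "Ada"}

# A newer document gains the field; the older one is not rewritten.
new_doc = add_field(old_doc, "loyalty_tier", "gold")

print(loyalty_tier(old_doc))  # prints none
print(loyalty_tier(new_doc))  # prints gold
```

The reading code supplies a default for documents that predate the change, so both shapes can live side by side in the same collection.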