What is a Multivalue Database? |
![]() ![]() ![]() |
There are many different databases available but they all fall into a small number of basic types. One of these is the relational database such as Oracle or Access. A relational database holds data in the form of tables in just the same way that we could store information as tables written on paper.
Consider an order processing system. We need to hold information about the orders that each customer has placed. Keeping things very simple, at a minimum we might need a table such as that shown below.
In this simple table, each row represents an order and each column holds data associated with that order.
Relational databases are built following a set of rules known as the Laws of Normalisation [E. Codd : "A Relational Model of Data for Large Shared Data Banks", Communications of the ACM, June 1970]. The process of transforming data to fit the rules of a relational database is called normalisation and the steps in this process are referred to as first normal form, second normal form, and so on.
The First Law of Normalisation states that we may not have repeating data. In practical terms this means that we cannot add extra columns to the right of the table to allow a customer to order more than one item at the same time.
Clearly this restriction is not acceptable in the real world.
There are many reasons why the Laws of Normalisation do not allow this, mostly based on the way in which the data might be stored by the computer system. If we are to observe the First Law of Normalisation, we must reconstruct our data in some way that removes the additional columns. One way would be to split an order that has multiple item across several rows of our table.
Although we can now store as many items in an order as we wish, things have become more complicated. Firstly, the details of a single order are now split across multiple rows of our table. Secondly, we have been forced to add an extra column so that we can know how many lines there are in the order. Also, we have duplicated some information, a step which actually breaks another of the Laws of Normalisation. To avoid this last complication, a typical implementation of this sort of data in a fully normalised system (e.g. Oracle or Access) would break the order into two separate tables, one containing the basic information about the order and the other containing the details of the items ordered.
Things are becoming complex and this is supposed to be a trivial application!
Multivalue database products avoid this complication by removing the need to adhere to the First Law of Normalisation. We allow a single cell of our table to hold more than one value (hence "multivalue").
If you have spent many years working with fully normalised databases, you are probably shaking your head and saying that we cannot do this. Yes, we can do it; it's just a different way to hold our data.
Think about the advantages: The entire order is all held as a single record; there is no redundant duplication of data; we do not need an item counter.
The end result of this is that our multivalue view of the world is typically much faster than its fully normalised counterpart though there will always be situations where this model is not ideal. In such cases, you can freely revert to using the fully normalised approach. Notice that fully normalised data can be stored in a multivalue database. The opposite tends not to be true.
The time has come to introduce some terminology. A typical application will have many tables, perhaps hundreds or even thousands though the multivalue model usually results in far fewer tables than in other data models. Each table is stored as a file. The rows of our table are known as records and the columns as fields (some users refer to these as attributes). The data stored in a field may be made up of multiple values.
Note how in our multivalued implementation of the above example, the values in the product and quantity columns are related together. For any particular order, the first product number belongs with the first quantity, the second product number belongs with the second quantity and so on. A typical realistic table may have several separate sets of fields that are linked in this way. The relationship between the values in different fields (e.g. product and quantity above) is referred to as an association.
By adopting this data model instead of using additional columns, the data model imposes no limit to the number of items that may be included in an order.
This extended form of the relational database model is at the heart of the QM database. You may also see it referenced as post-relational, nested table or NF2 (non-first normal form). They all mean the same thing.
In a multivalue database, the tables can gain a fourth dimension (subvalues). Continuing with the above example, perhaps we need to record the serial number of each item that we sell. Thus each value line in the table depicted above would have subvalues containing the serial numbers for each item supplied.
The History of Multivalue Databases
The original multivalue database is usually attributed to Dick Pick (hence the frequently used term "Pick databases") back in 1968 though their origins can be tracked back further. The current D3 database from Raining Data is a direct descendant of the original Pick product but there have been many other players along the way, some large, some small. Some of these are significant to the way in which QM works.
The Reality database, originally implemented on McDonnell Douglas systems but now owned by Northgate Information Solutions, closely follows the Pick style of operation. The long defunct Prime Information database from Prime Computer retained the same data model and general principles but made some fairly significant changes to the command and programming languages.
In the mid-1980's the various companies with multivalue products hit a problem. The world was standardising on the Unix operating system but these products did not run on Unix. As a result of this, McDonnell Douglas developed an "open systems" version of Reality (Reality X) and Prime Computer developed the PI/open database. At the same time, two start up companies appeared each with their own Unix based multivalue implementation, VMark (UniVerse) and Unidata (Unidata). These companies set out to capture users from the existing products as well as taking on new users. The history is long and complex but to bring it up to date in one step, UniVerse and Unidata are now both owned by IBM. Technically, IBM also own Prime Information and PI/open but both products have been retired.
The UniVerse and Unidata products (usually referred to collectively as U2) follow the Information style of implementation by default but have features that allow them to look more like the Pick style if required.
QM was originally developed in 1993 for use as an embedded database but not released as a product in its own right until 2001. Like the U2 products, it is an Information style database but has options to make it more like Pick for those who need it. |