September 1, 2016
The role of NoSQL in MDM & PIM
What does NoSQL mean to Riversand’s MDM & PIM solutions?
“Are you going to replace your relational database store with a NoSQL?”
“What role does NoSQL play in your MDM or PIM products & platform?”
“How will you ensure ACID transaction sanctity in a NoSQL environment?”
Replace the term “NoSQL” in the questions above with Graph, Big Data, Hadoop, schema-less, etc., and you get the set of questions we have increasingly been asked over the last 6-12 months. In this post I will do my best to address Riversand’s approach and position on this topic as simply and succinctly as possible.
My team at Riversand started looking into this roughly 12-18 months ago, primarily because of the implications of the digital revolution for the information landscape at large and for the MDM capability set specifically. This research became the basis for the innovation we are now bringing to market under the aegis of NextGenMDM / MDM2.0. Enterprises are asking questions about NoSQL as a direct consequence of the digital revolution now under way and their need to pivot, adopt and succeed in this age of the customer, where data is the currency and the enterprise asset.
Before I get into where NoSQL fits, it is important to outline the types of data involved in the digital age. I am going to borrow an analogy from the realm of online strategy games (the Age of Empires series from Microsoft Studios) to explain this. I apologize to the generations (before and after mine) that don’t get the analogy, but the iceberg analogy doesn’t do justice to the depth and breadth here. The screenshot below shows the map where you start as a player and build your empire using resources. What you can see is divided into three areas:
1. Your immediate scope, which is fully lit and therefore completely visible at all times
2. The surrounding scope, where you have context because you previously traveled through it but no visibility into what might be going on there now (an enemy building in that area, for example)
3. The area beyond the surrounding scope, where you have no visibility or context
Much like the theater you operate within in the game, today’s enterprise information landscape is a theater where data is an asset, and knowing how to wield it to your advantage will be the difference maker. Applying the analogy further to the enterprise landscape, the “data you know” is primarily #1 and partially #2, while the “data you don’t know” refers to most of #2 and #3 above.
Figure 1 – Age of Empires courtesy Microsoft Studios
The “data you know” is the data you are familiar with and have visibility into, primarily because you defined the schema, model or structure to persist it in and the policies to manage or govern it in a prescriptive manner. Examples include product, customer or vendor master data that reflects your understanding of each of these data domains, with virtually no ability to:
- Correlate it end to end (sometimes referred to as a 360° view) or with a third-party perspective
- Understand the implications of the data’s interactions with the users and systems it touches – are the operations on my data today helping or impeding my business objectives?
The “data you don’t know” is the data you are either not familiar with or have no visibility into (or control over), either because you don’t know it exists (outside your four walls: social, mobile, IoT / agent-based data…) or because you don’t consider it master or authored data (within your four walls: order, sales and other transaction data).
So great – why is all this relevant to understanding the role of NoSQL in MDM or PIM? Because the type of data you are dealing with is the primary driver of which data construct is best suited to manage, persist and operate on it. In addition, data constructs aren’t a Boolean categorization of either SQL or NoSQL. Here is how we break it down:
- SQL or RDBMS oriented – strict schema orientation, with strong support for concepts like cardinality, defined relationships and ACID transactions (commit / rollback) – meant for authoring the “data you know”
- NoSQL – schema-less, geared toward searchability, understanding or discovery of data
- Graph – node (vertex) and edge oriented, geared toward dynamic relationships, inference and analytical use cases
- Big Data & DFS oriented – columnar and/or distributed file system based (like the Apache Hadoop related projects), allowing excellent data processing capabilities on very large volumes of data
Figure 2. Mapping of types of data elements to data constructs best suited – Level 1
The table above is the first layer of mapping the types of data involved and the data constructs best suited to handle them. Your three key takeaways from this table:
- There isn’t a one size fits all answer – Sorry!
- Recognizing and sifting the type of data is critical to matching it to the data construct that will manage it
- The data constructs other than SQL / RDBMS cannot be assumed to guarantee ACID transactions reliably (… yet), so that guarantee must be choreographed at the processing / service layer.
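To make the last takeaway concrete, here is a minimal sketch of what choreographing at the service layer could look like. It is not Riversand’s implementation: the `InMemoryStore` class and the `write_everywhere` function are hypothetical names invented for illustration, and the technique shown is a simple compensating-action (saga-style) rollback, which is only one of several ways to approach this.

```python
from typing import Callable, Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a store with no multi-store transactions."""

    def __init__(self, name: str, fail_on_put: bool = False) -> None:
        self.name = name
        self.fail_on_put = fail_on_put  # lets us simulate a failed write
        self.records: Dict[str, dict] = {}

    def put(self, key: str, value: dict) -> None:
        if self.fail_on_put:
            raise RuntimeError(f"{self.name}: write failed")
        self.records[key] = value

    def delete(self, key: str) -> None:
        self.records.pop(key, None)


def write_everywhere(stores: List[InMemoryStore], key: str, value: dict) -> bool:
    """Write one record to every store; undo completed writes on failure."""
    undo: List[Callable[[], None]] = []
    try:
        for store in stores:
            previous = store.records.get(key)
            store.put(key, value)
            # Register the compensating action only after the write succeeded.
            if previous is None:
                undo.append(lambda s=store: s.delete(key))
            else:
                undo.append(lambda s=store, p=previous: s.put(key, p))
    except RuntimeError:
        # Roll back in reverse order so every store returns to its prior state.
        for compensate in reversed(undo):
            compensate()
        return False
    return True


if __name__ == "__main__":
    document = InMemoryStore("document")
    graph = InMemoryStore("graph")
    search = InMemoryStore("search", fail_on_put=True)
    print(write_everywhere([document, graph], "sku-1", {"name": "Lamp"}))
    print(write_everywhere([document, graph, search], "sku-2", {"name": "Desk"}))
    print(sorted(document.records))
```

Note that this only approximates the all-or-nothing part of ACID. Other readers can still observe the intermediate state between a write and its compensation, so isolation and durability would need separate handling in the service layer.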
The layers below this high-level perspective are a topic for another blog post or a paper. If you need more details in a hurry, just schedule a demo or contact us. Our approach isn’t to say that one data construct is better than another in absolute terms, but rather to outline what lends itself naturally to which type of data (and thereby data problem) and to seamlessly offer the same through our next generation MDM products and platforms for the appropriate use case.