“The main purpose of a data model is actually not to design a database: it’s to describe a business,” said Christopher Bradley, information strategist at DMA Advisors. Bradley spoke at the DATAVERSITY® Data Architecture Online conference about the purpose of Data Modeling and its role in Data Governance and the modern successful business.
Big Data: Are Models Obsolete?
Data Modeling and Data Governance are inextricably linked, Bradley said. Data Modeling plays a very significant part in all the knowledge areas of Data Management, and it informs everything. “The amount of information that we can hold in data models that aids us in the Data Governance journey is quite staggering,” he said. However, many people still believe that working with big data means that data models are unnecessary. “I’m afraid I have to tell you that this is a complete myth,” he said.
No Magic Bullet
Bradley cited Gartner reports saying that more than 85% of data lake projects were complete failures, placing the blame on a widespread belief that schema-on-demand means no data models. “It’s not the same thing,” he said. This erroneous belief leads vendors to over-promise, and their customers to assume that they can “just chuck data into a magic bucket, say some big data magic words and get actionable insights, straightaway.”
Fundamentally, big data, AI, machine learning, and other new technologies still depend on a degree of reliable data that doesn’t have to be perfect to be useful. But it’s more than just dumping data in and keeping your fingers crossed. “That’s the bit that’s really missing.”
The Beginning of Wisdom or What Has the Data Model Ever Done for Us?
The beginning of wisdom is the definition of terms, Bradley said, citing Socrates, and clarity of definition is absolutely key. “And it’s never more so than in the Data Modeling world and thinking about big data.” A data model can provide the clarity of definition and an understanding of where processes and data interact.
Why Produce a Data Model?
There are a variety of reasons to create and use a data model, and Bradley presented a list of them from a survey of over 200 data modelers. Key among them is avoiding late discovery of missed requirements, which can result in unnecessary costs. Assessment of package solutions for proper fit is another, as well as identifying and managing redundant data. Not specifically mentioned on the list, but also important, is the interaction between business and process, to ensure that there is a clear understanding of where the data is used and how it helps to define business rules.
Why Is Data Modeling Important?
To paraphrase Tim Berners-Lee, data is a precious thing that will last longer than the systems themselves. Data is an organizational asset, and to manage it properly, it needs to be understood. Data models, whether high-level or detailed, provide a shared vocabulary and common ground for understanding.
Data Focused Mindset
The prevailing application-centric mindset has caused the fundamental problems that we have today, Bradley said, with multiple disparate copies of the same concept in system after system after system after system. Unless we replace that mindset with one that is more data-focused, the situation will continue to propagate, he said.
Data Models: Not Just for Data
Models have a wide variety of applicable uses and can present different levels of detail based on the intended user and context. A map is a useful analogy: it is a model that can be used much as models are used in a business. Like data models, maps come at different levels for different audiences and different purposes. A map of the counties in an election will provide a different view than a street map used for finding an address. A construction team needs a different type of detail on a map they use to connect a building to city water, and a globe used for a lesson about different countries offers still another level of detail targeted to a different type of user. Similarly, some models are more focused on communication and others are used for implementation.
The Audience Determines the Data Model
A model for integrating a new CRM system will have a high level of detail presented in highly technical terms. A model of the same project for business stakeholders needs a different level of information presented in an accessible way. Higher level models should be a starting point, he said, so that gaps or incorrect assumptions can be addressed early on before more detailed models are developed, and costly errors can be avoided in the implementation process later on.
Where Does Data Modeling Fit?
Bradley used the DAMA-DMBoK2 Data Management Framework (The DAMA Wheel) to illustrate where Data Modeling fits among the eleven disciplines or knowledge areas involved in Data Management. Shown as spokes in a wheel, and positioning Data Governance at the center, the knowledge areas are:
- Data Modeling and Design
- Data Architecture
- Data Quality
- Data Warehousing and Business Intelligence
- Reference and Master Data
- Document and Content Management
- Data Integration and Interoperability
- Data Security
- Data Storage and Operations
- Metadata
Data Modeling plays a part in all Data Management disciplines, facilitating analysis, database design, and implementation, through the process of enterprise, conceptual, and logical Data Modeling.
Data Modeling at the Business Level
Data Modeling can facilitate answers to core business questions, such as:
1. What is the data that we need to run our business?
This entails having business level data models linked to process models, with proper governance, good descriptions, and a solid understanding of how processes and data are used.
2. Do we agree on what the data means?
So many problems are caused by differing understanding of basic concepts, like how “customer” is defined. Terms used in the model glossary should agree with the vocabulary used by business stakeholders.
3. Do we know where the data is?
Legislation drives the demand for knowledge of where data is located. That understanding requires business level models linked to glossaries, with definitions linking them to the physical models and the systems. Those physical models must have their data documented and cataloged in a technical data dictionary, which should be cross-referenced to the glossary.
4. Have accountabilities with the right skills and processes been allocated to manage it?
Competencies, skills, and capabilities should be allocated to the right people, agreed on with the business stakeholders, with the correct security, by business data subject area.
5. Is it fit for purpose?
“Fit for purpose” isn’t simply about the Data Quality aspect. It also covers security, regulatory compliance quality, business criticality, whether or not critical data elements are defined, etc.
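The cross-referencing described under question 3 can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a tool or method Bradley presented: the `GlossaryTerm` and `DictionaryEntry` structures, the system, table, and column names, and the sample definition of “Customer” are all hypothetical. It shows how a technical data dictionary that points back to a business glossary could answer “where is the data?” and flag physical columns that have no agreed business meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GlossaryTerm:
    """A business term agreed on with stakeholders (hypothetical structure)."""
    name: str
    definition: str


@dataclass(frozen=True)
class DictionaryEntry:
    """One physical column documented in a technical data dictionary."""
    system: str
    table: str
    column: str
    glossary_term: Optional[str]  # business term this column implements, if any


def locate(term, dictionary):
    """List every physical location that holds the given business term."""
    return sorted(
        f"{e.system}.{e.table}.{e.column}"
        for e in dictionary
        if e.glossary_term == term
    )


def unlinked(glossary, dictionary):
    """List columns whose glossary reference is missing or unknown."""
    return sorted(
        f"{e.system}.{e.table}.{e.column}"
        for e in dictionary
        if e.glossary_term is None or e.glossary_term not in glossary
    )


# Illustrative data only; none of these names come from the talk.
glossary = {
    "Customer": GlossaryTerm(
        "Customer", "A party that has purchased at least one product."
    ),
}

dictionary = [
    DictionaryEntry("crm", "accounts", "account_id", "Customer"),
    DictionaryEntry("billing", "payer", "payer_no", "Customer"),
    DictionaryEntry("web", "visitors", "visitor_id", "Client"),  # term not in glossary
    DictionaryEntry("erp", "misc", "fld_17", None),              # never documented
]

print(locate("Customer", dictionary))
print(unlinked(glossary, dictionary))
```

In this sketch, the two “Customer” columns surface as known locations of the same concept, while the column tagged “Client” and the undocumented column are reported as gaps to resolve with business stakeholders.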
There’s More to Data Modeling Than You Thought
Models can be useful as a way of communicating with different areas in the business and for different purposes. They can be used to help understand service architectures, message-based architectures, virtualization, package selection, lineage, Master Data Management, Business Intelligence—all of these things and more can benefit from modeling. Information is at the heart of all architecture disciplines. There’s no one definitive statement about what a data model is, Bradley said, but data has to be understood to be managed—and data models are the best tool to provide that understanding.
Read Bradley’s paper Data Modeling is NOT Just for RDBMS’s.