“Data Architecture is the physical implementation of the Business Strategy,” said Nigel Turner, Principal Consultant in E.M.E.A. at Global Data Strategy, Ltd., speaking at the DATAVERSITY® Enterprise Data Governance Online Conference. “It’s a key part of the whole continuum that you need to build within an organization to manage data effectively,” he said, and Data Governance forms an important bridge between those strategies and their real-world implementation in the business.
Data Architecture: What is it?
The DAMA DMBoK2 says that Data Architecture “defines the blueprint for managing data assets by aligning with organizational strategy to establish strategic data requirements and designs to meet these requirements.” Turner pointed out three key parts of this definition, the first being the word “blueprint.” “What that implies is that any Data Architecture that doesn’t have an implementation plan will probably remain on the shelf until the mists of eternity have risen.”
The second key part is “aligning with organizational strategy.” Data Architecture must be directly connected to the goals of the business and how data supports those goals, he said. The third part is about establishing strategic data requirements, because “any effective Data Architecture must be forward-looking.”
Citing a DATAVERSITY research report entitled Trends in Data Architecture, by Donna Burbank and Charles Roe, he noted the range of responses to the question: “What is Data Architecture?”
“One of the issues we have in Data Management is if you take any Data Management concept or term or discipline, different people define it in different ways.”
Regardless of how it’s defined, Data Architecture must have some concrete deliverables, such as specifications, master design documents at different abstraction levels, and descriptions of all the containers and paths that data takes through the system. Without these deliverables, he said, “then clearly, you’re not actually delivering anything of value to the business.”
Typical Deliverables from Data Architecture
Standard deliverables include:
- Policies about data usage, guiding principles, statements of intent about usage, and mechanisms for accountability
- Data models including enterprise conceptual models, logical data models, physical data models, and application-specific logical data models
- Data Catalogs
- Inventories of data sources
- Master Data or Reference Data, and which data is widely shared
- Defined key data, with glossaries, dictionaries, definitions, and applied standards
- Metadata and how it should be managed
- Data Lineage and flow through systems
- A road map for implementation
“If you’ve got all those things in place, you’ve got a shouting chance of getting a Data Architecture that will work.”
Data Architecture: How to Fail
- Trying to devise an architecture that encompasses managing, processing, collecting, and storing everything: “Avoid boiling the ocean. Focus your architecture on the things that are critical to make your business work and operate.”
- A Data Architecture entirely managed, driven, and designed by an IT department can end up being a shopping list for new technology, rather than a plan to support the Business Strategy. “With all respect to people in I.T. departments, they’re not always the best people to understand how data supports the business strategy and therefore, how the architecture needs to evolve to make that happen.”
- Without the active support of senior management, on both the business and IT sides, success is unlikely. “It shouldn’t be led and developed exclusively by people in the middle of your organization.”
- If your architecture is too complicated, it’s unlikely to stay current. Turner shared a story about a company he worked with as a consultant. The company had a very detailed data model covering the entire wall of a room. They were very proud of the model, yet over the course of several years Turner noticed that the same model was on the same wall, unchanged, the implication being that it wasn’t used for anything more than a wall decoration.
- Long-term planning is important, but don’t leave out concrete, short-term benefits. Data Architecture, he said, “remains a dream if you don’t have hard deliverables attached to it.” Build in some quick wins.
Getting it Right
Key features of an effective Data Architecture include a Data Strategy that is in alignment with business drivers, targets essential data, delineates clear activities and milestones, and is flexible enough to evolve with the business needs and the technology available. Most importantly, architecture must be manageable. “You can never sort out all your data everywhere. You need to focus on the things that really make a difference.”
Developing a Data Strategy
Turner outlined a simple path to a Data Strategy. Start with the Business Strategy and determine what data is critical to supporting that strategy. Evaluate the data you have and decide if it’s up to the task, and if it isn’t, decide what is needed to improve it. Turner pointed out that improvements may need to come from the business side, rather than exclusively from IT. For example, if every department uses a different code or term to indicate “customer,” “then that obviously would influence the business strategy, which might need to change in order to accommodate that barrier.”
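The “customer” example can be made concrete with a small reference-data mapping. The sketch below is illustrative only: the department names, the local codes, and the `CANONICAL_TERMS` table are hypothetical and do not come from Turner’s talk. It shows one way each department’s local term could be translated to a single shared term, with unmapped codes surfaced as gaps for the business side to resolve.

```python
# Hypothetical mapping from (department, local code) to a shared business term.
# All department names and codes here are invented for illustration.
CANONICAL_TERMS = {
    ("sales", "CUST"): "customer",
    ("billing", "ACCT_HOLDER"): "customer",
    ("support", "CLIENT"): "customer",
}


def harmonize(records):
    """Split records into those with a shared term and those still unmapped."""
    mapped, unmapped = [], []
    for record in records:
        key = (record["department"], record["code"])
        term = CANONICAL_TERMS.get(key)
        if term is None:
            unmapped.append(record)  # a gap the business side has to resolve
        else:
            mapped.append({**record, "term": term})
    return mapped, unmapped


sample = [
    {"department": "sales", "code": "CUST", "id": 1},
    {"department": "billing", "code": "ACCT_HOLDER", "id": 2},
    {"department": "marketing", "code": "LEAD", "id": 3},
]
mapped, unmapped = harmonize(sample)
print(len(mapped), "mapped;", len(unmapped), "unmapped")
```

Keeping the mapping as data rather than logic means the agreed terms can be reviewed and changed by business owners without touching the code that applies them.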
Today’s Data: Scope, Scale, and Complexity
The volumes of data that companies and organizations are handling have increased phenomenally in the last ten years. Ninety percent of all the data stored today was created in the last two years. Put another way:
“There are 2.5 quintillion grains of sand on the earth. A quintillion, by the way, is a one followed by eighteen zeros. Three times that number of bytes of data are created every single day. So, the scope and scale of this is absolutely phenomenal.”
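To put that figure in more familiar units, the arithmetic can be worked through directly. The sketch below uses only the numbers in the quote; the conversion to exabytes assumes the decimal (SI) definition of 10^18 bytes, which is an assumption added here rather than something stated in the talk.

```python
# Figures come from the quote above; the exabyte conversion assumes
# the decimal (SI) definition of 10**18 bytes.
QUINTILLION = 10**18                      # a one followed by eighteen zeros
grains_of_sand = 25 * QUINTILLION // 10   # 2.5 quintillion, kept as an integer
bytes_per_day = 3 * grains_of_sand        # "three times that number"

EXABYTE = 10**18
exabytes_per_day = bytes_per_day / EXABYTE

print(f"{bytes_per_day:,} bytes per day")
print(f"{exabytes_per_day} exabytes per day")
```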
Yet it’s not just about the scope and scale, he said. Complexity is just as much a factor. Because so many companies don’t yet have a handle on the basics — Data Management, Data Quality, and ensuring that data used fits the intended purpose — “forget about all the new technologies in the future. This is the reality today.”
Business Drivers for Data Architecture
Business Intelligence and Data Science are drivers for Data Architecture because those are strong growth areas in IT. At the same time, cost reductions, increased efficiency, and regulatory compliance are also creating pressure to improve Data Governance. Another reason, he said, is that “the current status quo with the management of data in most organizations, I suspect, is still pretty poor.”
He cited a study that was published last year in Harvard Business Review. Researchers surveyed 75 companies, asking senior executives from those companies to check the accuracy of a series of records from key systems identified as essential to the efficient running of a company. “The outcome was really quite shocking,” he said, because only three records out of a hundred were error free. “Ninety-seven percent of all records examined across those 75 companies had some critical errors in them which could impact business performance.”
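A similar spot check can be approximated on a small sample of records. The sketch below is not the method used in the study Turner cited; the field names and the single “required field is present” rule are hypothetical stand-ins for whatever checks an organization actually considers critical.

```python
# Hypothetical validation rule: a record has an error if any required
# field is missing or blank. Real audits would apply their own rules.
REQUIRED_FIELDS = ("customer_id", "name", "country")


def has_error(record):
    """Return True if any required field is missing, None, or blank."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            return True
    return False


def error_free_share(records):
    """Fraction of records with no detected error (0.0 for an empty sample)."""
    if not records:
        return 0.0
    clean = sum(1 for record in records if not has_error(record))
    return clean / len(records)


sample = [
    {"customer_id": "C1", "name": "Acme", "country": "GB"},
    {"customer_id": "C2", "name": "", "country": "FR"},        # blank name
    {"customer_id": "", "name": "Globex", "country": "US"},    # blank id
    {"customer_id": "C4", "name": "Initech"},                  # no country
]
print(f"{error_free_share(sample):.0%} of sampled records are error free")
```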
Turner said that the biggest problem with Data Lakes is a lack of effective Data Governance. There is a lack of consistent data definitions and metadata, so when people access the data in those Data Lakes, “they haven’t got the foggiest idea what it means.”
Data scientists who are being paid high salaries to find insights from data are instead spending most of their time doing lower level tasks just to get data in usable condition, he said.
“All the great promise that Big Data and Analytics brings, with all this data that companies are now collecting — less than one percent of it is actually being used.”
He likens the current situation to the process of fighting fires instead of creating a proactive way to prevent them. What’s needed is a coherent and effective Data Architecture, and a focus on identifying problems, creating solutions, and building in preventive, proactive governance. “In other words, you stop the fires from breaking out rather than wait till they do and then fighting them in a reactive way.”
Data Governance: Moving from Reactive to Proactive
Turner shared Global Data Strategy’s definition of Data Governance: a business-led continuous process of improving data for the benefit of all data stakeholders. Although the initial implementation might start off as a project, “Ultimately, you’re in this to make sure it runs in the background as a business process, in effect, alongside all your other business processes.”
Seven Key Principles of Data Governance
- Data must be actively managed
- The business should be responsible for leading governance efforts
- The business must set the priorities for improving data, what data to focus on, and what impacts it should have
- Data owners must be accountable for critical data
- Data stewards are responsible for data improvement
- IT provides the technology to make Data Governance real in the physical world
- Everybody in an organization must be included as part of any Data Governance activity
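The accountability principles in this list lend themselves to a simple automated check. The sketch below is one possible illustration, not something presented in the talk: the `DataElement` class, its fields, and the sample entries are all hypothetical. It flags critical data elements that have no named owner or steward.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataElement:
    """Hypothetical catalog entry for one governed data element."""
    name: str
    critical: bool
    owner: Optional[str] = None    # accountable business owner
    steward: Optional[str] = None  # person responsible for improvement


def governance_gaps(elements):
    """Return names of critical elements lacking an owner or a steward."""
    return [
        element.name
        for element in elements
        if element.critical and (element.owner is None or element.steward is None)
    ]


elements = [
    DataElement("customer_email", critical=True, owner="Head of Sales", steward="J. Doe"),
    DataElement("product_price", critical=True, owner="Head of Finance"),
    DataElement("page_theme", critical=False),
]
print(governance_gaps(elements))
```

Non-critical elements are deliberately ignored, in keeping with the advice to focus governance effort on the data that matters most to the business.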
“Every organization needs Data Architecture, but how much and where it needs to be applied varies from organization to organization, and nobody is better placed than governance professionals to help the architect make those decisions.”
Data Governance and Data Architecture support and reinforce each other, he said. Sharing a slide outlining the synergies between Data Governance and Data Architecture, Turner highlighted key advantages to both.
Data stewards are in a position to identify critical data and how the state of that data impacts the business, which can help with prioritizing and evolving the architecture. Data owners should inform business rules about data that are then implemented within the architecture. Owners and stewards are in a good position to serve as champions and help architects build a case for more investment in Data Architecture because “they will understand the current implications of shortcomings in data and the importance of managing data in more structured ways,” he said.
Data Architecture can support Data Governance by realizing governance strategies at a physical level, so that they can be implemented in the real world rather than remaining abstract ideas. Data models can illustrate which data requires governance and can highlight reference and master data sets, which, Turner says, “need to be the most closely stewarded and owned within an organization.” Data Architecture helps to build business and IT consensus around critical data, ensuring that the business collaborates with IT to carry out identified priorities.
Aligning Data Governance with Data Architecture
Turner said that it doesn’t matter where you start. The important thing is that the disciplines of Data Architecture and Data Governance come together to form a continuous improvement cycle, ensuring that “your data is getting better and evolving in line with business requirements.”
Turner ended his presentation with a photo of the Acropolis to illustrate the importance of enduring architecture. In order to build a temple that would last for more than 2,500 years, the Greeks spent a lot of time leveling and preparing the foundation of the hill on which it stands.
“Any organization aspiring to do the same with data, in other words — to create an enduring, and endurable data-driven business — should recognize that you need Data Architecture and Data Governance, and that they need to work together to provide the foundation for the future.”