Socializing Data Governance: Utilizing a Common Data Matrix


“Data Governance is like Lord of the Rings. You are on a journey to cast obfuscation and bad decisions into the fires of Mount Doom. To succeed in this quest, you will need a fellowship of data,” said Mark Horseman, Data Enthusiast at the Northern Alberta Institute of Technology (NAIT), in his presentation titled Socializing Data Governance at the DATAVERSITY® Enterprise Data World Conference. The “fellowship” comprises champions of knowledge and wisdom to guide the forces of good in a quest against ignorance. When it comes time to raise an army against the forces of idiocy, this fellowship will be the key to success, he said.

Creating Your Data Governance Council

NAIT is a leading polytech trade school in Canada committed to student success. The tool NAIT used to “raise an army” of data stewards was not the Elven sword Orcrist or Gondor’s Red Arrow – it was the Common Data Matrix from Robert Seiner’s book, Non-Invasive Data Governance. (Horseman highly recommends this book.) This group of stewards is the key to connecting the governance program to business goals and functions. Horseman said that once they tied the governance plan to NAIT’s strategic plan, it became clear who needed to be on the council.

But rather than identifying or appointing individuals as stewards, Horseman said, they recognized people who were already doing stewardship, so the message became more about providing help and understanding rather than adding extra work. “These people are the executors of the production, use, and definition of data across the organization [and] should be recognized for the work that they already do.”

A Common Data Matrix

The next step, Horseman said, is to build the Common Data Matrix. The Matrix provides documentation of the production, use, and definition of data domains across the organization, and defines and documents domains, roles, and responsibilities. “Student” and “staff” are examples of domains for NAIT, and subdomains for these might include different types of demographic data or personal information governed by GDPR.

Documenting roles and responsibilities for each business unit across the organization allows everyone to understand who is using data, what data is being used by each job, who is putting data into what system, and who is responsible for defining the data’s meaning for the institution.

The Matrix also serves as a communication document. If an upcoming project or event will affect student data, such as adding additional gender codes, Horseman said they can use the Matrix to identify the affected user group and communicate the change proactively, before those users see something they weren’t expecting on a report.
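As a rough illustration of how a matrix like this could support that kind of communication, here is a minimal Python sketch. The field names, business units, systems, and rows are all hypothetical and chosen only for illustration; they are not NAIT's actual matrix, and the three roles simply mirror the "production, use, and definition" wording above.

```python
from dataclasses import dataclass

# Hypothetical role names, mirroring "production, use, and definition" of data.
ROLES = ("produces", "uses", "defines")


@dataclass(frozen=True)
class MatrixEntry:
    domain: str         # e.g. "student"
    subdomain: str      # e.g. "demographics"
    business_unit: str  # illustrative unit name
    role: str           # one of ROLES
    system: str         # illustrative system name


# Made-up rows for illustration only.
MATRIX = [
    MatrixEntry("student", "demographics", "Registrar", "defines", "Student Information System"),
    MatrixEntry("student", "demographics", "Admissions", "produces", "Student Information System"),
    MatrixEntry("student", "demographics", "Institutional Research", "uses", "Data Warehouse"),
    MatrixEntry("staff", "personal information", "Human Resources", "produces", "HR System"),
]


def units_to_notify(matrix, domain, subdomain=None):
    """Return the sorted business units that touch a domain (and optional subdomain)."""
    return sorted({
        entry.business_unit
        for entry in matrix
        if entry.domain == domain
        and (subdomain is None or entry.subdomain == subdomain)
    })


# Who should hear about a change to student demographic codes?
print(units_to_notify(MATRIX, "student", "demographics"))
# ['Admissions', 'Institutional Research', 'Registrar']
```

In practice the matrix may well live in a spreadsheet or a governance tool rather than in code; the point of the sketch is only that each row ties a domain to a unit, a role, and a system, which makes "who is affected" a simple lookup.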

Socializing the Common Data Matrix

Horseman used a social process to build out the Matrix with informal chats over coffee, interviews, and discussions with users, data stewards, and stakeholders.

Questions that helped focus the conversations included:

  • What are the major domains of data that you interact with?
  • What system(s) does your team use to interact with that data? (For department heads and deans.)
  • Do you have data off the side of your desk?
  • Do you have anything outside of the centrally managed system?

With the last question, Horseman stressed that it’s especially important to understand where these “pockets of awfulness” are hiding outside the system – usually in Excel. “Well-meaning people come up with a fantastic little idea, and they track it in Excel, and they’re doing it at their desk,” he said, but when that information becomes critical to the business, it creates risk. Over time, the spreadsheet builds and becomes more challenging to manage, “and suddenly, we’ve got two FTE manning a spreadsheet, which everybody in the organization needs to look at, but nobody can,” because it’s on one person’s desktop.

Identifying sources that are not in the central system reinforces the idea that the team is there to provide support, not to force change, he said. The entire Matrix build-out process was thoroughly documented so the team could understand what was being done with data throughout the institution and also voice concerns and fears.

Data Sensitivity

Another outcome of the interview process was the identification of confidential and sensitive data. To make it easier for people to understand those different classifications of data, Horseman illustrated variations based on the impact data would have on the organization:

  • Confidential: “If it gets out, we will get sued.” This category includes personal information, HIPAA, and anything covered by regulation.
  • Sensitive: “If it gets out, you will be embarrassed.”
  • Public: “It’s already out there.”
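The three tiers above amount to a simple decision rule, which could be sketched in Python as follows. The function name and its two yes/no parameters are assumptions made for illustration; the source describes only the three categories and their rules of thumb, not a formal procedure.

```python
from enum import Enum


class Sensitivity(Enum):
    # Values are the rules of thumb quoted in the list above.
    CONFIDENTIAL = "If it gets out, we will get sued."
    SENSITIVE = "If it gets out, you will be embarrassed."
    PUBLIC = "It's already out there."


def classify(could_get_sued: bool, would_embarrass: bool) -> Sensitivity:
    """Pick the strictest tier that applies (hypothetical helper)."""
    if could_get_sued:
        return Sensitivity.CONFIDENTIAL
    if would_embarrass:
        return Sensitivity.SENSITIVE
    return Sensitivity.PUBLIC


level = classify(could_get_sued=False, would_embarrass=True)
print(level.name, "-", level.value)
```

Checking the legal question first means data that is both regulated and embarrassing lands in the stricter category.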

Using the Common Data Matrix

Project Management

The Common Data Matrix can serve as a tool to insert operationalized and socialized governance into project management. In the context of event management, for example, an event planner can look at the Common Data Matrix and see which departments, business units, and jobs are using and interacting with event-related data. A business analyst working on an event told Horseman that by using the Matrix during the planning process, he saved two months’ worth of effort going around the institution ensuring that all the right people were in place for the event.

Governed Reporting

To show the reliability of reports and governance of the data behind them, Horseman and his team developed a ranked approval process that includes a visible watermark on the report. Clicking on the watermark brings up a summary of the reasons for its score, as well as further documentation. In a workshop Horseman attended with Kelle O’Neal of First San Francisco Partners, O’Neal called this watermarking process “governance at the point of usage,” and Horseman said that this concept was the most important takeaway from his presentation:

“So we’re punching people with definitions, we’re hitting them with lineage, we’re explaining why it got the mark it got right there at the point that somebody is consuming that content. And that has been amazing.”

Creating a Rubric

Developing a scoring rubric for the certification started with prioritizing business goals, with emphasis on areas that aligned with the focus of the governance program and the existing tools they were using.

Horseman shared how they determined point values for certification levels with a total of 20 points possible. The point system emphasized their Data Quality, Business Intelligence, and security goals:

  • The business owner or data steward approves the report and certifies it for distribution – 1 point
  • The Common Data Matrix is complete for major and subdomains of data used on the report – 2 points
  • The data is supported in the data warehouse – 4 points
  • The data is supported by automated Data Quality screens – 4 points
  • The data is supported by a dimensional model – 4 points
  • Data lineage is known – 4 points
  • Business owner has control or full audit of the use of the report – 1 point

The first two criteria and the last one (steward approval, a complete Common Data Matrix, and owner control or audit of use) are required to get a stamp at all, with nine points as the bare minimum to get a bronze rating. Silver is 10 to 14 points, gold is 15 to 19 points, and platinum is 20 out of 20 points. Not all reports are eligible for the governed approval process, Horseman said. Canned reports out of PeopleSoft or SAP, or reports that are specific to a particular position or operational task – such as a list of students to call – are not governed because the focus of the certification process is on reporting that is tied to a strategic decision-making point, he said.
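The arithmetic of the rubric above can be sketched in a few lines of Python. The criterion keys and the function name are hypothetical. Two readings are assumptions rather than anything stated in the presentation: a required criterion is treated as met when it has been awarded at least one point, and bronze is taken literally as exactly nine points, since silver starts at 10.

```python
# Maximum points per criterion, as listed in the rubric above (total: 20).
MAX_POINTS = {
    "steward_approval": 1,
    "matrix_complete": 2,
    "in_data_warehouse": 4,
    "automated_data_quality": 4,
    "dimensional_model": 4,
    "lineage_known": 4,
    "owner_audit_control": 1,
}

# The first two criteria and the last one are required for any stamp.
REQUIRED = ("steward_approval", "matrix_complete", "owner_audit_control")


def certify(scores):
    """Return (total, rating); rating is None when no stamp is earned."""
    for name, points in scores.items():
        if name not in MAX_POINTS:
            raise ValueError(f"unknown criterion: {name}")
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")

    total = sum(scores.values())

    # Assumption: "required" means at least one point was awarded.
    if any(scores.get(name, 0) == 0 for name in REQUIRED):
        return total, None
    if total == 20:
        return total, "platinum"
    if total >= 15:
        return total, "gold"
    if total >= 10:
        return total, "silver"
    if total >= 9:
        return total, "bronze"
    return total, None


example = {
    "steward_approval": 1,
    "matrix_complete": 2,
    "in_data_warehouse": 4,
    "automated_data_quality": 4,
    "dimensional_model": 4,
    "lineage_known": 0,
    "owner_audit_control": 1,
}
print(certify(example))  # (16, 'gold')
```

Rejecting scores above a criterion's maximum keeps the total capped at 20, so the tier boundaries cannot be crossed by awarding extra points.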

Communication is Key

Horseman said, “The importance of a rubric like this is to be completely honest and consistent, so as long as you don’t give people bonus points, you’re going to be fine.” Provide communication about the meaning of each set of points, clarifying what one out of two means, what two out of four means, etc., and users will gain an understanding, he said.

Expect that there may be disagreements about Data Quality. He worked with a group who did regular manual audits of their source system and didn’t understand why they didn’t get four out of four points for automated Data Quality. “We’ve all heard this story, right? ‘My data is perfect!’” he said, but the goal is to have an automated Data Quality process in the warehouse so they are alerted when referential integrity breaks down. “Once it gets explained, then they kind of go, ‘Okay, okay. I understand, Mark. Thank you.’”
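To make the referential-integrity point concrete, here is a minimal Python sketch of one kind of automated screen: it flags rows whose foreign key has no matching parent row. The table and column names are hypothetical, and an actual warehouse would more likely run a check like this in SQL or in a Data Quality tool on a schedule.

```python
def find_orphans(child_rows, parent_rows, foreign_key, primary_key):
    """Return child rows whose foreign key matches no parent row."""
    known_keys = {row[primary_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in known_keys]


# Made-up sample data: the second application points at a missing program.
programs = [{"program_id": 10}, {"program_id": 20}]
applications = [
    {"application_id": 1, "program_id": 10},
    {"application_id": 2, "program_id": 30},
]

orphans = find_orphans(applications, programs, "program_id", "program_id")
if orphans:
    print(f"ALERT: {len(orphans)} application row(s) reference a missing program")
```

Unlike a periodic manual audit of the source system, a screen like this can run on every load and raise an alert the moment a reference breaks.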

Using the ranking process with NAIT’s financial sustainability framework meant that reporting tuition revenue was an identified priority. One indicator of potential tuition revenue is the number of applications. Additional factors are the qualifications of applicants, the number of applicants who go on to enroll, and the success of those applicants. “We did the gold stamp on that. It’s in the warehouse; we have automated Data Quality. You can trust the content on there.” Horseman is able to monitor the interactions users have with reports via SharePoint and is aware of what reports are being used and for what purpose. “If any questions come back to us, people know there’s a stamp on there, so they can talk to us about it.”

Recognition of Value

As use of the scoring system became more widespread, NAIT published workforce reports that were classified as silver. The provost said, “I want all my stuff to be gold or better.” So others started asking for a “gold-stamped report” as well, providing an opportunity for Horseman to educate about the process: “These are the steps that you have to follow to get that, and here’s how it works.” Although not everyone has access to the highest quality automated data, there is a growing recognition of the value of the data behind every report. “The executive is driving governance because they have governance at the point of usage.”

The systems and tools needed to launch an effective, engaging Data Governance program may not glow like fire or be covered in precious stones, but the knowledge and wisdom they bring can help the forces of good governance triumph over chaos and ignorance.
