Dr. Thomas C. Redman, the Data Doc and President of Data Quality Solutions, has written another book that tackles the same issue his firm regularly deals with: Data Quality. Joining his previous works, which include Data Driven: Profiting from Your Most Important Business Asset and Data Quality: The Field Guide, is Getting in Front on Data. This time around, though, Dr. Redman is talking more about the people and the roles they should play in achieving Data Quality rather than providing a step-by-step how-to guide to the topic.
DATAVERSITY® had a chance to speak with Dr. Redman, a contributor both to our website and events, about the new book and the help it offers business and IT leaders who want to make an impact on Data Quality in their organization.
DATAVERSITY (DV): Tell us a little more about the impetus for writing this book.
Dr. Redman: In the past I mostly focused on the “how” of doing the work of Data Quality. Here I wanted to focus more on the “who does what” to achieve Data Quality. With the right people and structure in place, the fact is that Data Quality improves really, really fast. So in this book I try to tease out how you get the right people in place – who they are, where they come from, what sets those who approach Data Quality properly apart from those who don’t – and tell the story from the organizational viewpoint.
DV: What data issues continue to drive the need for business and IT professionals to have resources, such as your past books and now Getting in Front on Data, at their disposal?
Dr. Redman: Bad data is like a virus. You can start at the bottom of the organizational chart and find someone infected by it. For instance, there’s the person who’s charged with fulfilling an online order but the address is wrong. That’s a problem for that person, who is doing real work in real time and who may or may not be able to correct it. Crank it up a bit and there’s the large class of knowledge workers who may spend up to 50% of their time looking for data they need, correcting simple errors, or searching for confirmatory sources for things that look wrong.
Further up the organizational chart you’ll meet a senior executive who resembles someone whose story I tell in the new book: This person asked me to explain Data Quality to him in simple terms. I asked him to remember the last time he made a big decision and whether he trusted the data it was based on. He thought about it a minute and then he said that he didn’t think he ever trusted very much of the data he uses for making decisions. It was just his job to try and steer through it all.
So up and down the organization, people run into bad data all the time. They do their best to accommodate it and sometimes they do that well. But a lot of the time they don’t, and the results could be anything from sending a package to the wrong address and irritating a customer to making a bad business decision that has an enormous cost to the company.
Bad data may simply be wrong data, like the address, or it may be poorly defined data – for instance, numbers in a report are represented as metric units but that isn’t clearly defined, so people assume they are English units and make decisions based on that assumption. Problems with unclear definitions happen a lot because companies so often are pulling data together from different sources and these sources don’t always have the same understanding of what, for example, a customer is.
DV: What changes, then, should take place in an organization’s staff to get these problems in hand?
Dr. Redman: Today, data customers often take steps to fix data that seems wrong – they take responsibility for the quality of data that others have created. Instead, when they see bad things come in, they should work their way upstream and find the data creators to explain to them their needs and requirements. It’s not like the data creators are consciously sending out junk, but in the absence of feedback they think what they are sending is fine. When the issues are articulated to them, most make reasonable efforts to find and eliminate the root causes of problems. And these two roles – data creator and data customer – are the most important. By the way, I’ve personified these roles but you also can think of the data creator and customer as processes that create or use the data.
In any case, the practical reality that Getting in Front on Data acknowledges is that, left on their own, most data customers and creators don’t step into these roles. Organizations have to put support roles in place to make this happen across the company.
Things start with the Data Provocateur – the first person in his or her work team who recognizes the complete inanity of the current situation where some data comes in and their team spends extra time fixing it up before it can be used. Usually a provocateur is in the low or middle level of an organization; occasionally they are more senior. But the important thing is that this person makes a big improvement in their work team by, at a small scale, implementing the new data customer-creator interaction to eliminate the root causes of problems. It’s diligent but not heroic work, maybe eliminating 90% of the errors their particular team faces. But by doing that the provocateur has provided a role model and script for the rest of the company to take up.
Data Quality teams, then, can use that script to try to realize the same results at a larger scale, making sure that the most important customers and creators connect. Then, of course, if you are going to make change throughout an entire organization, you need senior leadership. In the book we lay out specific things we expect senior leaders to do to advance initiatives beyond anything a provocateur or even a Data Quality team can do on their own.
In some respects this idea models how human resource operations work in a business. The result of scaling up a Data Provocateur’s work should be that everyone steps up their efforts as a data customer or creator in the same way that today most actual HR management is done in the line, by employees and managers every day. They step up to that task following HR’s scripts on how to do performance reviews and so on.
The book works out a couple of examples in great detail about finding and building up the support roles within a company to make finding and eliminating root causes of error standard practice. And eliminating just one root cause may prevent tens, hundreds or even thousands of errors down the line.
There’s an awful lot about Getting in Front on Data that’s about advancing the culture and about dealing with the dynamics of change. It starts with understanding the problem of bad data to a good degree and requires establishing a beachhead – which Data Provocateurs are so good at doing – to prove change can happen successfully. Without the kind of examples they provide, it’s hard to convince people to do anything new. And as a result, they won’t realize the cost reductions they otherwise might, they won’t reduce their risks or the tedious work they have to keep doing, and they won’t be able to act on their data-based decisions with confidence that they’re likely making the correct decision.
If you’re interested in getting a copy of Getting in Front on Data, go here and enter the promo code DataDoc for a 25% discount.