Designing in Data Quality with the User Interface

by David Plotkin

Larry English recommends “designing in” data quality when building new systems, and I agree that this is very important. One way to accomplish this “designing in” of data quality is to work closely with the User Interface team. Doing so is even more important when multiple new systems (or major system upgrades) are being designed at the same time, because the opportunity then exists to achieve consistency across the user interfaces. In the real-world experience discussed here, we were designing two major new systems simultaneously.

The User Interface team (not surprisingly) is charged with designing the user interface for the systems. Of course, they wanted to ensure that the user interfaces were consistent across the two systems, including field lengths, labels, field types (e.g., text field, radio button, drop-down list), and color schemes. As you can probably imagine, this sort of consistency from screen to screen and from system to system helps make data entry more accurate. For example, both systems collected customer information and displayed it in blocks on various screens. The same color scheme was used everywhere customer information was referenced, so that data entry personnel could easily find that information on the screen. Field lengths were also standardized: street address was a character field of a given length everywhere it appeared, including customer address, shipping address, garaging address, and so on. Finally, information was captured using consistent field types whenever possible. For example, the preferred communication method was always a set of check boxes so the customer could specify multiple methods if they wished.

We (the Data Governance team) first met the folks on the User Interface (UI) team in meetings to review the screen designs. We soon realized that we had much the same goals in mind and that we could help each other. The UI team was struggling with the following issues:

  • What field labels should be used on the screens and how could they be kept consistent?
  • What is the best type of field to use to gather the needed information, especially when that information is considered mandatory? And how do we keep track of decisions about which field types to use?
  • For fields with limited value sets, what should those value sets be?
  • What field definition (to explain the contents of the field) should be available and who will supply that definition?

The Data Governance organization and our business (metadata) glossary could help in many ways.

Field labels, we got field labels

For field labels, we started with the standardized business names in the glossary, and then went through a process to create a screen field name. We normally didn’t use the data element business name as-is because it was often too long. To shorten it, we used entries from the standard abbreviation list. From there, the UI team used the screen context to devise the actual label to use. What does that mean? Basically, the data on a screen has a context that isn’t present in the business glossary. For example, in the business glossary, we have terms such as “Customer First Name”, “Customer Last Name”, “Customer Street Address”, and so on. But on the screen, there is often a box labeled “Customer” which contains these fields. Thus, it is not necessary to have the word “Customer” in front of every field! But since screen design determines context, the UI team did the final pass to set the field names.
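To make that derivation concrete, here is a minimal sketch of how a screen label might be produced from a glossary business name. The abbreviation list and both function names are hypothetical illustrations, not our actual tooling:

```typescript
// A minimal sketch of deriving a screen label from a glossary business name.
// The ABBREVIATIONS list and both function names are hypothetical.

const ABBREVIATIONS: Record<string, string> = {
  Street: "St.",
  Address: "Addr.",
  Number: "No.",
};

// Shorten a business name word-by-word using the standard abbreviation list.
function abbreviate(businessName: string): string {
  return businessName
    .split(" ")
    .map((word) => ABBREVIATIONS[word] ?? word)
    .join(" ");
}

// Drop the context prefix that the surrounding screen block already supplies,
// so "Customer Street Address" inside a "Customer" block needs no "Customer".
function screenLabel(businessName: string, screenContext: string): string {
  const shortened = abbreviate(businessName);
  const prefix = screenContext + " ";
  return shortened.startsWith(prefix) ? shortened.slice(prefix.length) : shortened;
}

console.log(screenLabel("Customer Street Address", "Customer")); // "St. Addr."
```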

Field types and lengths

Standardization of the field types (e.g., character, drop-down list, radio button, check box) for different data elements is another important UI consideration. One of the issues the UI team faced was how to remember what had been decided in each instance. When they brought that concern to us, the answer was obvious, because we’re metadata people. We added fields to our business glossary to document the field type, standardized length, and other field properties, such as patterns, precision, and logical data type (integer, y/n, date, character, etc.). Thus, anytime such a field needed to be added to a screen, the UI team could look it up in the glossary to see what the UI characteristics should be. Data Governance didn’t enforce this, but the UI team was very good at self-policing, and they appreciated both our help and having a tool in which to record this information.
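As an illustration, a glossary entry extended with UI properties might look something like the following sketch. The field names and types here are hypothetical; our actual glossary tool’s schema differed, but the idea is the same:

```typescript
// Hypothetical shape of a glossary entry extended with UI properties;
// the actual glossary tool's schema is not reproduced here.

type LogicalType = "integer" | "yn" | "date" | "character";
type FieldControl = "text" | "drop-down" | "radio" | "check-box";

interface GlossaryEntry {
  businessName: string;      // e.g., "Customer Street Address"
  definition: string;        // the full definition of record
  shortDefinition?: string;  // shorter form managed for on-screen use
  control: FieldControl;     // standardized UI control type
  maxLength?: number;        // standardized field length
  pattern?: string;          // e.g., a format mask or regex
  precision?: number;        // for numeric fields
  logicalType: LogicalType;
}

const glossary = new Map<string, GlossaryEntry>();

glossary.set("Customer Street Address", {
  businessName: "Customer Street Address",
  definition: "The street portion of the customer's address.", // illustrative
  control: "text",
  maxLength: 40, // illustrative length
  logicalType: "character",
});

// Before adding a field to a screen, the UI team looks up its standardized
// characteristics rather than deciding (and possibly diverging) on the spot.
const entry = glossary.get("Customer Street Address");
```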

We got into some interesting discussions on how best to ensure that data was entered when it was considered mandatory. The initial approach was to provide a default so that a value was always available, but having participated in data quality efforts many times, I knew what the result would be – a preponderance of values that corresponded to the default. As a result, we prevailed on the UI team to NOT provide a default, but instead to force the data entry personnel to enter a value.
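A rough sketch of that rule in practice: the mandatory field starts out empty, with no default pre-selected, and validation blocks the save until the data entry person supplies a real value. The MandatoryField shape, validate function, and field name below are illustrative only:

```typescript
// Illustrative sketch: a mandatory field starts with no value (no default),
// and validation blocks the save until the data entry person supplies one.

interface MandatoryField {
  name: string;
  value: string | null; // null = nothing entered; no default is pre-selected
}

// Return the names of mandatory fields that still lack a real value.
function validate(fields: MandatoryField[]): string[] {
  return fields
    .filter((f) => f.value === null || f.value.trim() === "")
    .map((f) => f.name);
}

const form: MandatoryField[] = [{ name: "Customer State", value: null }];

const missing = validate(form);
if (missing.length > 0) {
  // Block the save and prompt for the missing entries, rather than
  // silently accepting a defaulted value.
  console.log(`Please complete: ${missing.join(", ")}`);
}
```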

And speaking of values…

Quite a few of the fields on the screens had limited sets of values, and Data Governance played a key role in providing the values that should be used. We felt it was important that the data entry personnel see values that were identical to those determined to be correct by the data stewards. The UI team was understandably willing to take advantage of all the pre-work we had done to track the values down and document them. This also ensured that the value lists would be standard across the applications.
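Conceptually, both applications drew their value lists from a single governed source rather than hard-coding them per screen. A minimal sketch, with a hypothetical term name and example values rather than the stewards’ actual lists:

```typescript
// Sketch of a single governed source for value lists; the term name and
// values below are illustrative, not the stewards' actual lists.

const governedValueSets: Record<string, string[]> = {
  "Preferred Communication Method": ["Email", "Phone", "Postal Mail", "Text Message"],
};

// Both applications render their controls from the same governed list, so
// data entry personnel always see exactly the steward-approved values.
function optionsFor(term: string): string[] {
  const values = governedValueSets[term];
  if (!values) {
    throw new Error(`No governed value set for "${term}"`);
  }
  return values;
}

const checkBoxOptions = optionsFor("Preferred Communication Method");
```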

Definitions

We also tied the definitions that popped up in tooltips to those in the business glossary. The glossary is our “system of record” for definitions, so ideally those definitions should be presented to the business people. One small issue arose in that some of our definitions are quite long and involved, so in some cases we had to create a shorter definition for use on the screens. But Data Governance managed those as well, and kept them in the business glossary.
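The screen logic for choosing which definition to show is simple enough to sketch: prefer the steward-managed short definition when one exists, and fall back to the full definition of record otherwise. The type and function names here are illustrative:

```typescript
// Illustrative tooltip logic: prefer the steward-managed short definition
// when one exists; otherwise fall back to the full definition of record.

interface DefinitionEntry {
  definition: string;        // full definition, from the business glossary
  shortDefinition?: string;  // shorter form created for on-screen use
}

function tooltipText(entry: DefinitionEntry): string {
  return entry.shortDefinition ?? entry.definition;
}
```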

And don’t forget search…

One of the more vexing problems was how to get the data entry personnel to locate an existing customer using the Search functionality, rather than simply entering the data again. In the old system, Search often returned so many hits that it was just easier to create a duplicate customer record – which leads to all kinds of bad things, discussed elsewhere in this series. Data Governance worked with the UI team to define:

  • What fields made sense to search on. We knew not only which fields we had available, but also (due to data profiling) which fields had quality that was good enough to locate the right customer. Having implemented a Master Customer effort, we also knew which fields were used (in combination) to figure out who was who – the “identifying attributes”.
  • The rules under which, if a data entry person created a new customer anyway, the Search would run again and force the choice of an existing customer when a match exceeded a reasonable threshold (see the sketch after this list). Again, this work was considerably aided by the Master Customer effort, in which the customer data was exhaustively analyzed for quality and completeness.
  • What to do if the user managed to create a duplicate anyway (they ARE clever). This would be detected by the matching engine, the appropriate adjustments made, and a note sent to the supervisor to point out a “training opportunity” for the employee. Constant reinforcement of the rules – coupled with the fact that duplicate customer records did NOT contribute to the employee’s total (and thus they weren’t paid for that work) – reduced the number of deliberately created duplicates to a trickle.
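Here is the sketch promised above: a simplified version of the “search again on create” rule. The weights in matchScore and the 0.9 threshold are hypothetical stand-ins for the matching engine, which compared the identifying attributes far more rigorously than this:

```typescript
// Simplified sketch of the "search again on create" rule. The weights in
// matchScore and the 0.9 threshold are hypothetical stand-ins for the
// matching engine.

interface Customer {
  firstName: string;
  lastName: string;
  streetAddress: string;
}

// Crude stand-in for the matching engine's comparison of identifying
// attributes; a real engine would use probabilistic/fuzzy matching.
function matchScore(a: Customer, b: Customer): number {
  let score = 0;
  if (a.lastName.toLowerCase() === b.lastName.toLowerCase()) score += 0.5;
  if (a.firstName.toLowerCase() === b.firstName.toLowerCase()) score += 0.3;
  if (a.streetAddress.toLowerCase() === b.streetAddress.toLowerCase()) score += 0.2;
  return score;
}

const MATCH_THRESHOLD = 0.9; // the "reasonable threshold" from the rules above

// On an attempted create, search again; a strong match forces the data
// entry person to pick the existing record instead of inserting a duplicate.
function createCustomer(
  candidate: Customer,
  existing: Customer[]
): Customer | "use-existing" {
  const match = existing.find((c) => matchScore(candidate, c) >= MATCH_THRESHOLD);
  return match ? "use-existing" : candidate;
}
```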

Summing Up

UI and Data Governance provide a powerful combination. Data Governance understands how data quality can be degraded by a poorly designed interface, and the UI designers know how to turn that knowledge into an interface that safeguards quality. And by the way, it feels good to provide metadata to a group that values it and will use it to increase their productivity and enforce standardization.



David Plotkin is an Advisory Consultant for EMC, helping clients implement or mature Data Governance programs in their organizations. He previously served as Manager of Data Governance for the AAA of Northern California, Nevada, and Utah; Manager of Data Quality for a large bank; and Data Administration Manager at a drug store chain. He has been working with data modeling, data governance, metadata, and data quality for over 20 years. He serves as a subject matter expert on many topics around metadata, data governance, and data quality, and speaks often at industry conferences.

  1 comment for “Designing in Data Quality with the User Interface”

  1. John Biderman
    March 13, 2012 at 3:25 pm

    Great work, David, to integrate data governance with user interface design to build in data quality controls at the point of data origination. This is how it should be — and largely has been — but usually in an application project-specific silo without the cross-functional representation of data stakeholders involved. This, I think, represents the next level of maturation of data governance in the software lifecycle.
    And I completely agree about techniques for trapping duplicate entry at the source rather than doing all that de-duplication nonsense we’ve been involved in for years.
    From a technical architecture standpoint, the next level is to abstract the data validation rules, and even the data presentation rules, from a specific application and store them as metadata — not just in a documentary way, but in a machine interpretable form. In other words, have some structured representation of validation rules and a validation engine that can be invoked as a service, so if multiple applications need to maintain the same data items they can do so sharing common presentation and validation rules. Then you have metadata in action, rather than static metadata inaction (apologies for the word play).
