I recently read an article about health care in the United States. The article made the case that although considerable progress has been made toward health care in the last 100 years, we don’t have a health care program in the United States—we have an illness treatment industry. Minimal emphasis is placed on a wellness program to prevent illness. Instead, major emphasis is placed on treating illnesses after they occur. Even major health initiatives like weight reduction, exercise, stress reduction, and so on, are oriented toward resolving health problems that have already occurred.
I also read a similar article about the tremendous progress that has been made in treating mental illness in the last 50 years. To be sure, progress has been made toward both recognizing and treating mental illness. However, the article also emphasized that the approach is to allow the mental illness to happen and then proceed to treat it. Minimal effort has been made to ensure mental wellness, let alone establish any formal mental wellness program.
After reading these two articles, I saw a similar pattern in the way organizations manage their data resource. I wondered if we really have a physical data manipulation industry rather than a data resource management program. Are we physically manipulating the data for short-term needs, without any formal design or any consideration for long-term needs? Are we physically manipulating the data according to the business processes using those data rather than managing the data according to formal data management concepts and principles?
Most data modeling tools are actually used to physically design and implement the database without any formal logical design. The physical design is often oriented toward the data used by specific business processes, without normalizing the data for use by other business processes. The primary objective of most data modeling tools seems to be cutting the code to develop a physical database for a set of business processes.
Many purchased applications have a physical orientation toward a fixed way of doing business and managing data for organizations, without regard for how organizations conduct their business as they perceive the business world. The result is that an organization’s way of doing business becomes warped to fit the application. Many organizations are serially warping their business from one purchased application to the next without any consideration for the way the business operates. Many organizations have parallel warping of the business, where part of the business is warped one way to fit a purchased application and another part of the business is warped another way to fit another purchased application.
Many generic data architectures and universal data models appear to be doing the same thing: forcing organizations to manage their data in a set manner without any regard for the way the organization perceives the business world. Many data files are oriented toward supporting specific business processes rather than being designed according to formal data management concepts and principles. Data are often created redundantly in different data files to support specific business processes. The redundant data require bridges and feeds to keep those data in sync, and those bridges and feeds are seldom fully effective.
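A minimal sketch can make the bridge-and-feed problem concrete. It assumes two hypothetical process-oriented files, `billing_file` and `shipping_file`, that each keep their own copy of a customer's address, along with a hypothetical one-way `nightly_feed` between them. None of these names come from a real system; they only illustrate how a feed can appear to resolve a disparity while actually discarding the correct value.

```python
# Hypothetical illustration: two process-oriented files each keep their own
# copy of a customer's address, linked only by a one-way nightly feed.
billing_file = {"C001": {"name": "Acme Co", "address": "12 Elm St"}}
shipping_file = {"C001": {"name": "Acme Co", "address": "12 Elm St"}}


def nightly_feed(source, target):
    """Copy addresses from source to target for customers present in both files."""
    for customer_id, record in source.items():
        if customer_id in target:
            target[customer_id]["address"] = record["address"]


def find_disparity(file_a, file_b):
    """Return the ids of customers whose address differs between the two files."""
    return sorted(
        customer_id
        for customer_id in file_a.keys() & file_b.keys()
        if file_a[customer_id]["address"] != file_b[customer_id]["address"]
    )


# The shipping process updates its own copy, so the two files now disagree.
shipping_file["C001"]["address"] = "99 Oak Ave"
disparity_before_feed = find_disparity(billing_file, shipping_file)

# The feed was built to flow billing -> shipping, so it overwrites the newer
# shipping value with the stale billing value.
nightly_feed(billing_file, shipping_file)
address_after_feed = shipping_file["C001"]["address"]
```

After the feed runs, the two files agree again, but they agree on the stale address. Keeping the address once, in a data structure designed independently of either business process, would remove the need for the feed altogether.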
Most applications and databases have very few physical data edits and seldom have extensive logical data integrity rules. Most applications and databases lack formal data names and comprehensive data definitions meaningful to the business. Many applications actively create data disparity and many software tools are developed to resolve that disparity. Most data integration and ETL activities intended to resolve disparity actually make the disparity worse.
All of these situations, and many others, fall into the category of brute-force-physical data manipulation without any formal design or regard for the way an organization does business. Brute-force-physical data manipulation is the theme of a physical data manipulation industry. People react, sometimes violently, to the term ‘brute-force-physical’, but it’s true and the truth often hurts.
I noticed another analogy between an illness treatment industry and a data manipulation industry. In the illness treatment industry a class of illnesses known as nosocomial infections runs rampant in many medical facilities. A nosocomial infection is an infection that a person did not have when they entered a medical facility, but had when they left the medical facility. It’s an infection that was acquired at the medical facility that was not related to the illness that person had when they entered the medical facility.
Many people claim that a nosocomial infection follows the principle of unintended consequences, which states that any intervention in a complex system may or may not have the intended result, but will inevitably create unintended and often undesirable outcomes (Brackett, 2011). In actuality, a nosocomial infection is a result of not following established sanitary techniques in medical facilities. Had the medical facility followed established concepts, principles, and techniques, most nosocomial infections would not have occurred.
The data resource in many public and private organizations has a nosocomial infection, known as disparate data. Disparate data are any data that are essentially not alike, or are distinctly different in kind, quality, or character. They are unequal and cannot be readily integrated to meet the business information demand. They are low quality, defective, discordant, ambiguous, heterogeneous data (Brackett, 2011). The disparate data resulted from not planning and designing the data resource to support the current and future business information demand of the organization according to established concepts, principles, and techniques.
The result of an illness treatment industry is a high probability of nosocomial infections. The result of a physical data manipulation industry is a high probability of disparate data. Nosocomial infections impact the patient’s wellbeing and hamper their pursuit of a productive life. Disparate data impact an organization’s wellbeing and hamper its pursuit of a productive business.
I asked several medical professionals why we have an illness treatment industry rather than a health care program. The response in most cases was that it’s profit motivated. Keeping people healthy is not as profitable as treating illnesses. The profit is in medical procedures and medications—more profit exists for being reactive than for being proactive. I made the case that a formal health care program oriented toward the proactive prevention of illness could be profitable, but the responses in most cases were that it would not be as profitable as the reactive treatment of illnesses. In addition, more satisfaction comes from solving a problem than from preventing a problem.
The same situation appears to be true for the physical data manipulation industry. Physical data manipulation seems to have a greater profit motive than formal data resource management. More profit exists for being reactive than for being proactive. More profit exists in software application and software tool sales than in formal planning and design. More satisfaction is gained from getting a database up and running to support current business processes than from following formal concepts, principles, and techniques. More satisfaction is gained from building and implementing than from planning and designing. More satisfaction is gained from playing with tools than from making hard decisions about a high-quality data resource that provides long-term support to the business.
Look at your organization. Do you follow established and proven concepts, principles, and techniques for formal data resource management? Or are you oriented toward brute-force-physical development of databases to support current business processes? Are you warping your business into one or more applications? Do you have disparate data that are impacting the wellbeing of the organization?
The fact that we have a physical data manipulation industry rather than a data resource management program should be obvious. The question becomes: what can be done to resolve the situation? What is needed to turn a physical data manipulation industry into a data resource management program? What needs to be done to stop the creation of disparate data, the nosocomial infection of databases?
The answer is almost too obvious—create a data resource management program that formally manages data as a critical resource of the organization! Follow established and proven concepts, principles, and techniques to build a data resource that adequately meets the current and future business information demand of the organization. Stop warping the business to fit applications. Stop building databases that directly support specific business processes, because the structure of data is orthogonal to the structure of business processes. Stop creating any further disparity in the data resource and clean up the existing disparity.
Stopping data disparity is like stopping nosocomial infections. Cleaning up existing data management practices to stop data disparity is like cleaning up medical facilities to stop nosocomial infections. A strong case can be made that formal data resource management is just as profitable as a physical data manipulation industry, and far better for the wellbeing of the organization. Both organizations and data management professionals must start a formal data resource management program that replaces the current physical data manipulation industry.
That’s the professional approach. That’s the only way to develop a formal, certified, recognized, and respected data management profession.
Brackett, Michael. Data Resource Simplexity: How Organizations Choose Data Resource Success Or Failure. New Jersey: Technics Publications, LLC, 2011.