“Why?” a Chief Technology Officer (CTO) may ask when the subject of automated Data Management tools arises. After all, they have probably already been storing, archiving, and backing up enterprise data, day after day, with success.
For example, setting up a Database-as-a-Service (DBaaS) from a reputable cloud computing provider, with appropriate access to data, enough storage, suitable integration, and strong enough security should be reasonable and keep the business running. Having a self-configuring and self-tuning database does save time and money by automating technical Data Management tasks. But this only represents a sliver of what is possible. To miss the full potential of Data Management automation is to remain mired in time-consuming, menial, manual business tasks.
Business forms the core of why organizations do Data Management. Data Management ensures “an organization’s entire body of data is accurate, consistent, readily accessible, and properly secured.” While backing up and keeping data safe — with performance humming among different data systems — maintains the data already there, these activities, alone, do not address the suitability of that data content or Data Quality. When the data is not fit for consumption, data scientists and business analysts spend 80 percent of their time each day cleaning and preparing data. As a result, these analysts have less time to gain potential business insights.
It may not seem critical to automate redundant business Data Management work while reports still generate, analyses still run, and the database systems remain active. But over time, the accumulated slowness becomes costly, and the organization grows less able to handle dynamic, quickly changing data. Some firms realize this and are jumping on the automated Data Management bandwagon.
Gartner estimates that automation will reduce Data Management manual tasks by 45 percent through 2022. Executives would benefit from learning how automated Data Management tools save time and money. These managers need to understand how automation does repetitive business Data Management chores, what tasks still need humans, and how to evaluate automated Data Management capabilities.
What are Automated Data Management Tools?
Think of automated Data Management tools as mechanisms to streamline enterprise-wide Data Management tasks. They fall under a technical practice called robotic process automation (RPA), which speeds up handling business operations and reduces costs. In the context of Data Management, RPA handles tasks like “data extraction and cleaning via existing user interfaces,” as well as technical DBMS tasks, like backup and storage. Implementing RPA through automated Data Management tools allows businesses to make sense of lots of data by mechanizing redundant work and leaving high-level tasks to humans.
Automated Data Management tools simplify operations through machine learning (ML) and artificial intelligence (AI). Both ML and AI technologies have become quite sophisticated in identifying data patterns and adapting to business rules by matching data, detecting and correcting errors, and mapping different data elements. Because of this, automated Data Management tools can do tasks like stripping out or evaluating unique data points, or self-correcting bad data (e.g., merging duplicate records). Through ML and AI, which continue to advance, organizations can quickly increase their ability to manage dynamic data, as both technical and business Data Management tasks complete in real time.
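To make the idea of self-correcting bad data concrete, here is a minimal Python sketch of merging duplicate records. It is an illustration only, not how any particular platform works: the function names, the record fields, and the choice of a normalized email address as the matching key are all assumptions made for the example.

```python
def normalize_email(email):
    """Lowercase and trim an email so trivially different spellings compare equal."""
    return email.strip().lower()


def merge_duplicates(records):
    """Merge records that share a normalized email (a hypothetical matching key).

    The first record seen for a key wins; later duplicates only fill in
    fields that are still empty. Returns the merged records and a log.
    """
    merged = {}
    log = []
    for record in records:
        key = normalize_email(record["email"])
        if key not in merged:
            # First time this key appears: keep the record as the survivor.
            merged[key] = {**record, "email": key}
            continue
        target = merged[key]
        for field, value in record.items():
            # Only fill gaps; never overwrite a value the survivor already has.
            if field != "email" and not target.get(field) and value:
                target[field] = value
                log.append(f"filled {field} for {key}")
        log.append(f"merged duplicate of {key}")
    return list(merged.values()), log


records = [
    {"email": "Ana@Example.com ", "name": "Ana Ruiz", "phone": ""},
    {"email": "ana@example.com", "name": "", "phone": "555-0100"},
    {"email": "bo@example.com", "name": "Bo Chen", "phone": "555-0199"},
]
clean, log = merge_duplicates(records)
```

Real tools typically rely on learned or fuzzy matching rather than a single exact key, but the shape is the same: find records that refer to the same thing, keep one survivor, and record what was changed.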
Business Data Management work, when automated, leverages on-hand changing information better and includes the following:
- Data Quality: Profiling, cleaning, linking, and reconciling data against a master source, which sets the standard format for master data (the data needed to do business). In addition to making corrections to the data, these types of algorithms ensure organizations keep to well-defined rules and have logs detailing the processes used to make the data consumable.
- Metadata Management: Metadata describes information about the data on hand and its context, crucial to finding needed information within millions of records. Automated metadata management platforms prioritize data and quickly identify records to remain private to comply with regulations. They also speed up tracking data assets during use, making Data Quality and Data Management that much easier.
- Master Data Management: Master data describes essential data about people, places, and things needed to do business. Standardization makes master data stable through a centralized reference point, validating that one person’s record is indeed unique. Automating Master Data Management keeps this kind of data consistent and trustworthy across systems and up-to-date.
- Data Integration: Many firms have been dealing with multiple database systems with different standards. The end-user, whether human or AI, needs this mish-mash of data floating through the pipelines to be unified. Integration processes deliver consistent Data Quality when a user retrieves this information.
Automating these business Data Management components saves the manual labor needed to make the data usable, either at the beginning or end of the DBMS pipeline. Consequently, businesses reduce costs and optimize data scientists’ and analysts’ talents.
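As a rough illustration of rule-based cleaning that also keeps a log of what was changed, consider the Python sketch below. The two rules (tidy name capitalization, and phone numbers reduced to exactly ten digits) and all function and field names are hypothetical, chosen only to show the pattern.

```python
import re


def clean_phone(value):
    """Keep digits only; return None when the result is not exactly 10 digits.

    The 10-digit rule is an assumption made for this example.
    """
    digits = re.sub(r"\D", "", value or "")
    return digits if len(digits) == 10 else None


def apply_rules(records):
    """Apply the cleaning rules and log every change as
    (row index, field, old value, new value)."""
    cleaned, audit_log = [], []
    for i, record in enumerate(records):
        row = dict(record)  # work on a copy so the input stays untouched

        # Rule 1: collapse extra whitespace and use title case for names.
        name = " ".join(row["name"].split()).title()
        if name != row["name"]:
            audit_log.append((i, "name", row["name"], name))
            row["name"] = name

        # Rule 2: standardize phone numbers, or blank out invalid ones.
        phone = clean_phone(row["phone"])
        if phone != row["phone"]:
            audit_log.append((i, "phone", row["phone"], phone))
            row["phone"] = phone

        cleaned.append(row)
    return cleaned, audit_log


rows = [
    {"name": "  ana   ruiz ", "phone": "(555) 010-0100"},
    {"name": "Bo Chen", "phone": "12345"},
]
cleaned, audit_log = apply_rules(rows)
```

The audit log is the important part: it lets an analyst see which rule touched which record, instead of having to trust the cleaned output blindly.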
Know the Business Reasons for Automating Data Management
Automating routine Data Management helps when the business reasons have been made clear at the outset and the strengths of each tool's algorithms are known. Each automated Data Management platform has a different focus. For example, Talend excels at maintaining clean and reliable data when integrating different DBMS, while Informatica specializes in Data Quality and Master Data Management. Other Data Management platforms work only with specific applications, as Cloudingo does with Salesforce. Because of these differences, blindly choosing a platform without understanding the business needs and aligning them with a good Data Strategy is counterproductive.
Throwing together a bunch of automated Data Management tools, of any kind, without a Data Strategy, risks cost overruns, and not just because each platform differs. Preparation must be done before using any automated Data Management tool, as IBM learned while trying to use Watson, its artificial intelligence platform, as a clinical diagnostic tool. Many of these initiatives languished due to poor strategizing, as found by the University of Texas’ MD Anderson Cancer Center. This institution used older data sets, not the information needed for Watson to learn to diagnose.
Arvind Krishna, IBM’s senior VP of cloud and cognitive software, says that 80 percent of the “work with an AI project is gathering and preparing data.” Data Management automation tools “cannot fix completely broken, incomplete, missing” or poorly managed data. Automated tools find that work just too complex. Feeding the wrong data to automated Data Management tools sets them up for failure. So, some prep and cleaning will be necessary to ensure an automated Data Management tool works with the correct data.
Evaluating Automated Data Management Tools
As shown above, automated Data Management tools have different capabilities. Even the most sophisticated platforms reach limits running Data Management tasks, like defining how metadata from various data sources should be categorized. Is there a way to assess automated Data Management capabilities among different applications? Also, can we compare these to human ones, just as the automotive industry has done for the self-driving car?
The DMM Capability Maturity Model Levels, a schema used to evaluate enterprise-wide Data Management, show promise and apply to both automated tools and human performance.
This schema emphasizes behavior and work products, regardless of where they originate.
For example, take Data Quality. Excellent data preparation and cleansing platforms can perform some business-level processes by matching an alias to a master record. But this type of ability falls between level 2 and level 3 of the model. Such a platform cannot measure and review whether a database record is indeed an alias of a master record or a unique record. That has to be set by a human operator. This example demonstrates that the DMM Capability Maturity Model Levels can help leaders see how software and employee Data Quality capabilities can be best leveraged.
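The split between what software matches on its own and what a person must decide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two thresholds, the function names, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are choices made for the example, not a description of any vendor's method.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; a real deployment would tune these on its own data.
AUTO_MATCH = 0.90
REVIEW = 0.60


def similarity(a, b):
    """Case-insensitive similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def route_record(candidate, master_names):
    """Decide what to do with an incoming name.

    High scores are matched automatically, middling scores are sent to a
    human operator, and low scores are treated as new, unique records.
    """
    best = max(master_names, key=lambda m: similarity(candidate, m))
    score = similarity(candidate, best)
    if score >= AUTO_MATCH:
        return ("auto_match", best)
    if score >= REVIEW:
        return ("human_review", best)
    return ("new_record", None)


masters = ["Acme Corporation", "Globex Inc"]
exact = route_record("acme corporation", masters)
unsure = route_record("Acme Corp", masters)
unknown = route_record("Qqq", masters)
```

Here the exact spelling is matched automatically, "Acme Corp" lands in the human review queue, and "Qqq" is treated as a new record. The middle band is where the human operator described above comes in.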
Why consider a full range of Data Management tools? They automate not only technical Data Management tasks, such as backing up and optimizing a DBMS, but also business Data Management tasks. Both kinds of automation are needed to manage dynamic data most effectively. Progress in artificial intelligence has made repetitive business Data Management less cumbersome and time-consuming.
With a robust Data Strategy and helpful automated management tools, organizations can better keep up with the marketplace and expand opportunities. It’s true that automated Data Management tools do not solve every Data Management business need, but the tasks they do accomplish make a significant difference to the bottom line.