Data Governance 2.0 at the Hub of Data Curation

By Michelle Knight  /  April 26, 2018

Data Curation needs to be guided by Data Governance so it can secure itself a place as a core Data Management requirement. The MIT Technology Review Insights states, “Data is the single biggest asset at most organizations. It’s time to start treating it accordingly.” Towards that end, leaders and boards have been called to action through the Leader’s Data Manifesto, making their organizations truly data driven.

Kelle O’Neal, the CEO and Founder of First San Francisco Partners, breaks this charge down by noting that an enterprise needs to “ensure its data adds value to the enterprise as well as enabling the enterprise to derive value from data.”  For data to be useful, Data Curation, guided by a clear Data Governance plan, must make its way into the mature organization.

For data to be an asset, it needs to be maintained, preserved, culled, and added to throughout its life cycle, making Data Discovery and Analytics easier to sustain for both internal and external consumers. Customers and innovative technologies demand it. This defines the Data Curation realm.

Organizations cannot curate in a vacuum. Data policies and controls around curation goals and customer activities need to be applied. This explains the importance of the Data Governance realm. As Ken Kring, Principal at PGTT, states, Data Governance today means “the continual cleaning and integration of data to drive profitable behaviors of customers and employees on an ongoing basis.”

Data Curation is a necessary part of mature Data Management, while Data Governance provides the consistency and balance to allow organizations to leverage Data Curation as a primary element of that entire system.

Data Curation in the Mature Organization

Why associate Data Curation with mature organizations now? Investors and organizations recognize Big Data (and all data) as a corporate asset. As with currency, companies are beginning to understand that they can’t just continue to blindly “store up” the vast piles of data streaming into them without developing a way to value this data and to determine which data has present or potential value. Data curators collect data from diverse sources, integrating it into repositories that are many times more valuable than the independent parts.

Organizations, to be mature, can no longer ignore Data Curation.

Increasing Variety of Data Sources

The 2018 Data & Analytics Global Executive Study and Research Report by MIT Sloan Management Review finds that innovative, analytically mature organizations make use of data from multiple sources.

This includes a variety of data types such as mobile, social, and public data. With so many sources streaming in, companies need a way to value this data and to determine which data has present or potential value and which will remain virtually useless.

To keep and maintain data as an asset, as Pat Hennel notes, corporations need to consider Data Curation.

Machine Learning Growth

Machine Learning will become one of the game changers of the coming decade. As Gartner notes, 35 percent of IT resources will be spent to support the creation of new digital revenue streams, and by 2020 almost 50 percent of IT budgets will be tied to digital transformation initiatives, including Machine Learning.

Stephanie McReynolds, VP of marketing at Alation, says, “Curations are about where the humans can actually add their knowledge to what the machine has automated.” This prepares data for intelligent self-service processes, setting organizations up for insights. Hence the drive for Data Curation to stay ahead of, and become more effective on, the Machine Learning curve.

The Data Lake Problem from Moving into the Cloud

By 2020, Gartner analysts say that almost 40 percent of enterprises will use the Cloud to support more than half of transactional systems of record. In fact, according to a 2017 survey, 72 percent of US finance executives said they are either using Cloud-based solutions or plan to do so in the future, which is an increase from 62 percent in the 2016 survey.

With the increase of Cloud migration and streaming, data storage will look more like a Data Lake. The Cloud has become the default ingest point for data for many organizations. To avoid sending Big Data to its death in the Cloud, organizations must consider Data Curation with customized data collections. As the Geological Survey of Alabama (GSA) has discovered with its data graveyard, Data Curation is imperative in resurrecting data.

Data Governance: The Hub of Data Curation

Creating and maintaining data collections through Data Curation needs to be guided by Data Governance. Questions such as who will have access to a data collection, and who will be responsible for adding to and culling it, must be addressed. Data Governance is required to prevent collections from becoming data cesspools (aka Data Swamps) and to facilitate communication about a collection with a common vocabulary (e.g., a Business Glossary).
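As a rough illustration of the questions above, the following minimal Python sketch records who may add to or cull a collection, alongside a small business glossary. Every name in it (CollectionPolicy, the role names, the glossary entry) is hypothetical and chosen only for illustration; it is not drawn from any particular governance tool.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionPolicy:
    """Hypothetical governance policy for one curated data collection."""
    name: str
    steward: str                                   # person accountable for the collection
    can_add: set = field(default_factory=set)      # roles allowed to add data
    can_cull: set = field(default_factory=set)     # roles allowed to remove data
    glossary: dict = field(default_factory=dict)   # shared business vocabulary

    def is_allowed(self, role: str, action: str) -> bool:
        """Return True if the role may perform the action ("add" or "cull")."""
        allowed = {"add": self.can_add, "cull": self.can_cull}
        # Unknown actions are denied by default.
        return role in allowed.get(action, set())


# Illustrative policy; the roles and the definition are assumptions.
policy = CollectionPolicy(
    name="customer_orders",
    steward="sales_data_steward",
    can_add={"data_curator", "analyst"},
    can_cull={"data_curator"},
    glossary={"active customer": "A customer with an order in the last 12 months"},
)

print(policy.is_allowed("analyst", "add"))
print(policy.is_allowed("analyst", "cull"))
```

In this sketch an analyst may add to the collection but only a data curator may cull it, and any action the policy does not name is refused. A real program would likely keep such policies in a shared catalog rather than in code.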

Three reasons why Data Governance needs to direct Data Curation are:

  • Self-Service Analytics

Customers and corporations depend on and demand Self-Service Analytics. This form of Business Intelligence enables and encourages business professionals to query data and generate reports themselves. Self-Service Analytics platforms with superior data visualization capabilities are expected to gain momentum in 2018.

Organizations embracing this “self-service” model will of course need more governance, so that business users have more control over the data they analyze. To connect strict Data Governance with business requirements, an organization needs Data Curation. Any analyst can curate data knowledge and this information can be reused, notes the company Alation. Data Governance policies ensure compliance, collaboration, and faster insights from Data Curation.

  • General Data Protection Regulation (GDPR)

The General Data Protection Regulation (GDPR) affects the processing and storage of personal information in any data collection. Should any Data Curator add, maintain, or archive financial or health data, he or she will likely find personal information in the mix. Non-compliance can cost up to 4 percent of an organization’s annual global revenue.

To help customers deal with the GDPR, in effect by May 2018, a number of companies have created systems to help manage the data and the regulations. Issues for DBAs, Data Protection Officers, and many others are being discussed and dealt with by organizations all over the world that will have to contend with these new laws. Data Curation and Data Governance are of the utmost necessity.

  • Leveraging Human Interaction

Data Curation provides access to human knowledge and open communication by leveraging human responses towards customized information. As Wilder-James writes, “you can’t cleanly separate the data from its intended use…every new problem has its unique aspects that usually reach back into data acquisition and preparation.”

Data Silos, from structural, political, and growth factors, can make it “prohibitively costly to extract data.” Data Curation without direction from Data Governance promotes Data Silos. Data Governance 2.0 approaches authority and policy from a company-wide view. Data Governance makes it possible, for example, for a person in sales to retrieve information from an engineering Data Curator and data collection to answer a query.

As Danny Sandwell elegantly says, “Data Governance is the foundation of an Enterprise Data Strategy.” Data as an asset needs to be useful to people across a corporation, not just a specific team or department.

Conclusion

Many organizations want to be data-driven. Data Curation provides a means to get value. Data Curation needs to enter into company considerations because of the need to aggregate data from diverse sources to form a unique picture of a business situation.

In addition, staying competitive with Machine Learning and moving Big Data into the Cloud require Data Curation. But Data Curation must be part of a comprehensive Data Governance program. Self-Service Analytics and regulations like the GDPR require it. Setting up a Data Governance program without Data Curation will result in an inability to keep up with the changing Data Management landscape.

 


Photo Credit: Anna_leni/Shutterstock.com

About the author

Michelle Knight enjoys putting her information specialist background to use by writing technical articles on enhancing Data Quality, lending to useful information. Michelle has written articles on the W3C validator for SiteProNews, SEO competitive analysis for the SLA (Special Libraries Association), search engine alternatives to Google for the Business Information Alert, and introductions on the Semantic Web, HTML 5, and Agile, Seabourne INC LLC, through AboutUs.com. She has worked as a software tester, a researcher, and a librarian. She has over five years of experience contracting as a quality assurance engineer at a variety of organizations, including Intel, Cigna, and Umpqua Bank. During that time Michelle used HTML, XML, and SQL to verify software behavior through databases. Michelle graduated from Simmons College with a Masters in Library and Information with an Outstanding Information Science Student Award from the ASIST (The American Society for Information Science and Technology) and has a Bachelor of Arts in Psychology from Smith College. Michelle has a talent for digging into data, a natural eye for detail, and an abounding curiosity about finding and using data effectively.
