by Charles Roe
In the old world shepherds used to roam the countryside with their flocks of sheep, happy in the idyllic simplicity of nature; their only worries were predators, inclement weather, and disease. As large estates became the norm, fences were built and the shepherds became more like stewards who watched over not just flocks, but other possessions as well.
According to the Online Etymology Dictionary, the word steward originates from the Old English terms stiward or stigweard, which mean “house guardian,” or “hall/pen + weard (guard).” Following the Norman Conquest of England in the 11th Century C.E., the term became synonymous with the Old French term seneschal or “overseer of workmen.” As history progressed ever onward, the term took on the meaning “one who manages affairs of an estate on behalf of his employer.” In the digital world of the 21st Century, such old world truths still exist in the form of Data Stewards, who devotedly watch over their own flocks, with some particular contemporary variations:
- Purdue University: “A Data Steward must participate with IT Data Administration staff, application development teams, and knowledgeable departmental staff on projects creating, maintaining, and using University data.”
- CDI-MDM: “Person responsible for managing the data in a corporation in terms of integrated, consistent definitions, structures, calculations, derivations, and so on.”
- Experian-QAS: “Data steward refers to the lead role in a data governance project. Data stewards take ownership of the data and work with the business to define the programme’s objectives.”
- The Data Administration Newsletter: “It is the Data Stewards’ responsibility to approve business naming standards, develop consistent data definitions, determine data aliases, develop standard calculations and derivations, document the business rules of the corporation, monitor the quality of the data in the data warehouse, define security requirements, and so forth.”
The DAMA Guide to the Data Management Body of Knowledge (DAMA-DMBOK) breaks down data stewardship jobs into many positions, including Business Data Steward, Coordinating Data Steward, Executive Data Steward and Data Stewardship Facilitator.
Any enterprise that takes its data seriously will have similar roles and definitions for one of the most important jobs in the Data Management industry today. Data is no longer just the worry of IT Departments; data is business, and without data modern businesses are not competitive. C-level executives and their lower-level business compatriots worldwide are proclaiming throughout boardrooms, across virtual meeting spaces, during business lunches and across a multitude of other communication channels that “someone must own our data!” It all comes down to the bottom line; without quality data the bottom line suffers.
Data Stewards “own” data, or to be more precise, Data Stewards are responsible for the data owned by the enterprise. If the enterprise is the old world Lord’s Estate, then the Data Stewardship Team consists of the people who watch over the lifeblood of the estate, including the shepherds who make sure the data is flowing smoothly from field to field, safe from internal and external predators, safe from inclement weather, and safe from disease.
What does it take to be a Data Steward?
Do you love data? Does the idea of working with data like a painter works with oils, one drop at a time, for months or years, until the work is done (though a Data Steward’s work is never done) sound appealing? Does defining, capturing and maintaining Metadata excite you? Does terminology like data validation, tolerance limits, data mining, data profiling and process standardization fill your dreams with visions of vast digital landscapes that you control? Do you like a nice paycheck and good benefits? A career in Data Stewardship may be right for you, but getting there takes a long list of skills and training:
- Programming Expertise: Data Stewards love data, but to get into the inner workings of data you must understand the programming involved. A comprehensive knowledge of at least some of the primary languages used is necessary, including Python, Perl, PHP, C/C++, Java and others.
- Relational Database Proficiency: Even with the growth of non-relational systems (see below), a Data Steward must know how to manage the relational databases still used within many enterprises. Simple knowledge of SQL is not enough, though; proven experience working with various SQL-based systems including (but not limited to) Oracle RDBMS, Informix, Sybase, IBM DB2, MS SQL Server, PostgreSQL and others will add weight to your skillset and resume. During a technical interview, you should be able to discuss some of the particularities of queries, DML, DDL, transaction controls, data types, DCL and procedural extensions if necessary.
- Data Modeling: A general understanding of Data Modeling, backed by some experience, is a plus. Data Stewards are not Data Modelers, but they interact with them often in meetings and when working on Data Governance and Master Data Management initiatives throughout the enterprise. A comprehension of such features as ORM diagrams, modeling applications like ERwin, and the differences between conceptual, logical and physical schemas is helpful.
- Data Warehouse Concepts: It is necessary to understand and have experience with OLAP (and its variations), data integrity, ETL platforms, ODS, OLTP, schema and bottom-up versus top-down designs among others. Real world Data Warehousing experience is highly preferred.
- Understanding of Non-Relational Systems: The Big Data onslaught on enterprises has changed the landscape of Data Management forever, so everyone in the field is working to build better skill sets for dealing with Big Data and Unstructured Data. A solid background in the various NoSQL systems has become a prime requirement for many Data Management jobs, including Data Steward. A clear understanding of MapReduce, BigTable implementations, Memcache, sharding, distributed computing techniques, and the differences among the multitude of products on the market today, including Hadoop/HBase, Cassandra, Redis, MongoDB, Riak et al., is crucial.
- Technical Writing: There is a belief that high levels of logical thought (a necessity for any Data Management job) somehow presuppose an inability to express oneself clearly (and creatively) with the written word. Luckily, such a belief is only a stereotype and thus not always true. Hone your skills as a writer, which include the clear expression of ideas, good grammar, and the ability to evoke interest in the reader, and your chances of landing a good Data Steward position will increase. Data Stewards must be able to write; it’s an essential part of their skill package. They are frequently the intermediaries between the IT and Business Departments, and the ability to express yourself clearly so that both sides of the enterprise understand the message will increase your usefulness by orders of magnitude.
- Formal Education/Certification: Look at the Data Steward jobs available today and nearly all state that a B.S. in Information Technology, Computer Science, MIS, Mathematics or a related field is a must. Most of the top jobs prefer a Master’s Degree, though 3-5 years of relevant experience can substitute for the M.S. in many cases. Added to the college degree is a list of possible certifications that include Certified Data Management Professional (CDMP), Certified Data Steward (CDS), Data Governance & Stewardship Professional (DGSP) and Certified Information Management Professional (CIMP).
- Business Acumen: If necessary, take some Business courses during your college education years. If those years are already past, then take some continuing education courses. Learn about the business world; understand it, thrive in its concepts. Data Stewards are the SMEs of the business in terms of data; Data Stewards understand the data in an enterprise often better than anyone else; Data Stewardship is not an IT function; it is a business function. The primary responsibility of a Data Steward is to make sure that the data of the enterprise is business worthy. To be an effective and efficient Data Steward you must understand the inner workings of the business; data knowledge and business knowledge are inseparable skills.
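To make a few of the terms above less abstract, here is a minimal Python sketch of what data profiling and a tolerance-limit validation check might look like on a handful of records. The records, field names, and the tolerance range are all hypothetical and chosen only for illustration; a real stewardship program would define these rules together with the business.

```python
# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"customer_id": 1, "email": "ann@example.com", "order_total": 120.50},
    {"customer_id": 2, "email": None, "order_total": 89.99},
    {"customer_id": 3, "email": "raj@example.com", "order_total": -15.00},
    {"customer_id": 4, "email": "li@example.com", "order_total": 25000.00},
]


def profile(rows, field):
    """Profile one field: total rows, missing values, distinct values."""
    values = [row.get(field) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def out_of_tolerance(rows, field, low, high):
    """Return the rows whose numeric field falls outside [low, high]."""
    return [
        row
        for row in rows
        if row.get(field) is not None and not (low <= row[field] <= high)
    ]


# Profile the email field, then flag order totals outside an assumed range.
email_profile = profile(records, "email")
violations = out_of_tolerance(records, "order_total", low=0.0, high=10000.0)

print(email_profile)
print([row["customer_id"] for row in violations])
```

The technical part is the easy half. Deciding that an order total below 0 or above 10,000 is suspicious is a business rule, which is why the steward's business knowledge matters as much as the code.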
Data Stewards have not been a part of human history as long as shepherds or traditional Old English stewards, but history is on their side. Where shepherds bring about romantic ancient visions of simpler lives, Data Stewards are the guardians of the digital new world.
Data is not going to disappear; it’s an integral part of the landscape of the modern world, and Data Stewards are needed to make sure the digital landscapes remain unobstructed so the data flows effortlessly ever onward into the future.