Karen Lopez says that when it comes to surviving as a Data Architect, “Hybrid is the future.” It’s no longer enough to speak only one language or stay attached to one set of technologies. According to Lopez, “purely relational (SQL) databases don’t exist any longer,” since most applications developed now use various types of database and data store technologies, while the actual data structures within them “are being expressed in a variety of places.”
Lopez is Senior Project Manager and Architect at InfoAdvisors, with more than twenty years of experience managing large multi-project programs, and she doesn’t see the hybrid concept going away anytime soon:
“We’ve already seen the NoSQL world and Hadoop add relational-like query structures or relational-like layers into their non-relational designs, and now we’re starting to see the relational vendors add NoSQL features in. Because Microsoft and other vendors have added non-relational features, I expect this to spread because this is how most features make it into vendor products – it’s competitive advantage.”
Lopez spoke specifically to Data Architects and Data Modelers during her presentation “Surviving as a Data Architect in a Polyglot Database World” at the DATAVERSITY® Database Online Now! 2017 Conference. She spoke about the mindset they need to stay relevant during this shift. “With the advent of these new database models, we’re being excluded because we’re perceived as being relational Data Architects only.” Embracing the hybrid world provides a way to show your value to the business as the technology changes while continuing to be involved in “all the data-related architectural decisions,” she said.
What Lopez considers ‘hybrid’ is a combination of relational and non-relational database features, supporting a variety of functions. She said Data Architects must become ‘polyglots,’ or speakers of many languages, an idea she extends to understanding different schemas as well.
“Most of the major database vendors have column store features. They support XML data types, they support JSON data types, they support other kinds of NoSQL non-relational features right inside their relational database, and that’s been going on for a long time. It’s becoming even more polyglot and poly-schematic. The concept of a purely relational database really doesn’t exist anymore.”
She clarified that her use of ‘SQL’ and ‘NoSQL’ does not specifically refer to ‘structured query language,’ but that she uses the terms to designate relational vs. non-relational, or extra-relational. Over time, features of non-relational databases have been added to relational databases, she said. “One of the biggest things that changed in the database world in the last 10 or 15 years” is the idea that schemas can live in a variety of places – they don’t just apply to a persistence layer.
“Applications make use of multiple database and data store technologies. We have hybrid applications, hybrid data technologies in an application, and now the new concept that NoSQL brought to us is that schemas are now being expressed in a variety of places.”
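As a rough illustration of schemas living in more than one place, the sketch below expresses the same rule twice: once as a constraint in the persistence layer, and once in application code that validates a JSON document. It is not from Lopez’s presentation; SQLite, the `customer` table, and the `CustomerDocument` class are all hypothetical choices made for this example.

```python
import json
import sqlite3
from dataclasses import dataclass

# Schema expressed in the persistence layer: the database enforces it.
# (SQLite and the table layout are assumptions made for this sketch.)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL CHECK (email LIKE '%@%'))"
)


def insert_customer(connection, customer_id, email):
    """Return True if the database accepted the row, False if a constraint rejected it."""
    try:
        connection.execute(
            "INSERT INTO customer VALUES (?, ?)", (customer_id, email)
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Schema expressed in application code: a document store would accept any
# JSON text, so the application decides what a valid customer looks like.
@dataclass
class CustomerDocument:
    customer_id: int
    email: str


def parse_customer(raw_json):
    """Validate a JSON document against the application-side schema."""
    data = json.loads(raw_json)
    doc = CustomerDocument(**data)  # TypeError on missing or unexpected keys
    if not isinstance(doc.customer_id, int):
        raise ValueError("customer_id must be an integer")
    if not isinstance(doc.email, str) or "@" not in doc.email:
        raise ValueError("email must be a string containing '@'")
    return doc


accepted = insert_customer(conn, 1, "ana@example.com")
rejected = insert_customer(conn, 2, "not-an-email")
document = parse_customer('{"customer_id": 3, "email": "li@example.com"}')
```

In both cases the rule is the same; what differs is where it is written down and which layer is responsible for enforcing it, which is the part a Data Architect would need to keep track of.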
A Good Data Architect
Lopez also clarified her use of the term ‘Data Architect’:
“I don’t make a huge distinction between Data Modeler and Data Architect. I tend to use the term Data Architect even though other people in the industry use it to mean something more physical like a Storage Architect. They are not just someone who draws boxes and lines all day, but someone who makes decisions about how to do that, and understands what the business requirements and models are.”
She included in that definition finding the right design models for business needs, as well as responsibility for data protection, security, privacy, and other business needs that go beyond the structure of the data. Lopez wants to expand the boundaries and the usefulness of the architect and modeler, so that as the technology evolves, so does the concept of the Data Architect. She encourages data model-driven development in an environment where the architect is a team player, valued for meeting business needs with ahead-of-the-curve thinking.
“I want us to be ‘team data’ not just ‘team relational.’ I want all of our data models – whether it’s a graph model, whether it’s a JSON document or an expression, or design of a JSON document – I want them to continue to be wanted and appreciated. And of course for that to happen, we have to want them and appreciate them even outside the relational world.”
After a quick review of the overarching differences between SQL and NoSQL and the arguments given by ‘Team SQL’ and ‘Team NoSQL,’ she said:
“All of this means, with all of these hybrid approaches, that the SQL versus NoSQL [debate] isn’t a thing any longer. If you’re a Data Architect who specializes in relational-only modeling and relational-only design, you’re going to be considered overly specialized.”
Five years ago, it might have been possible to ignore NoSQL technologies, but now that Data Modeling tool vendors are involved, organizations will have to design for them. Architects can’t rule out a Graph Database solution, or avoid using “graph nodes in SQL Server because my Data Modeling tool doesn’t support it,” she said. Survival now depends on learning the best use cases for each technology and asking vendors to support them as well. Everyone needs this hybrid thinking to find the best fit for the data.
Hybrid is the Future
Most relational databases are adding NoSQL support, she said: Column store, Graph, XML, and JSON features can now all sit side by side in the same engine. “Truly hybrid – this is a major change from how things worked a decade ago,” she said. Although some Data Architects are already using these together, “now they are going to be native features in your tools” as vendors move to a hybrid focus.
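A minimal sketch of what “the same engine” can look like in practice: one table with ordinary relational columns plus a JSON document column, queried in a single SQL statement. This is an illustration only, not an example from the presentation. It assumes SQLite (through Python’s built-in `sqlite3` module) compiled with its JSON functions, which is typical of recent builds, and the `product` table and its contents are made up.

```python
import json
import sqlite3

# In-memory SQLite database; SQLite stands in here for any relational
# engine with JSON support (an assumption made for this sketch).
conn = sqlite3.connect(":memory:")

# Relational part: typed columns and a primary key.
# Non-relational part: a free-form JSON document stored in `attributes`.
conn.execute(
    """
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        attributes TEXT NOT NULL  -- JSON document, shape varies per row
    )
    """
)

rows = [
    (1, "Laptop", json.dumps({"ram_gb": 16, "ports": ["usb-c", "hdmi"]})),
    (2, "Desk", json.dumps({"material": "oak", "height_cm": 75})),
]
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", rows)


def products_with_attribute(connection, key):
    """Return (name, value) pairs for rows whose JSON document has `key`."""
    path = f"$.{key}"
    cursor = connection.execute(
        """
        SELECT name, json_extract(attributes, ?)
        FROM product
        WHERE json_extract(attributes, ?) IS NOT NULL
        ORDER BY product_id
        """,
        (path, path),
    )
    return cursor.fetchall()


ram_rows = products_with_attribute(conn, "ram_gb")
material_rows = products_with_attribute(conn, "material")
print(ram_rows)
print(material_rows)
```

The two rows carry differently shaped documents, yet both are reachable through the same SQL query and the same primary key, which is the kind of modeling decision a relational-only notation has no way to record.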
“The more SQL-like features that are added to your NoSQL tools, the more likely it is your Data Modeling tool will be supported. And the serious NoSQL vendors or projects – since a lot of them are open source – understand that hybrid is the enterprise data story.”
How to Become a Great Enterprise Data Modeler
To be what Lopez calls a ‘Great Enterprise Data Modeler,’ it’s essential to cultivate a set of characteristics that includes a lack of attachment to relational models, empathy for teams that want you to design something your modeling notation doesn’t support, and a hard-working attitude.
“This is really exciting to me, because now we get to love our data in a way that best meets its need, and we get to stop being in those endless debates about whether relational or non-relational is better, or whether Data Warehouse or transactional is the place to be.”
She recommends talking to vendors, reading, getting some hands-on training, and possibly getting certified in a new area. It’s important to learn and use the lingo, stay ahead of the curve, describe Data Modeling and related Data Governance within the hybrid context, and understand use cases for each type of technology.
“We need to be able to let go and think differently about data and data design, especially when it comes to consistency and constraints, and even data quality, because there’s a good business case for that.”
She says she often gets overly excited about this because she’s been “doing this relational stuff for 30 years.” There have been so many changes in the past few years that it is hard to keep up. She compares embracing the hybrid world to driving a new car off the lot: “We should enjoy that new database smell.”
Check out Database Now! Online at http://www.databasenow.com