Anthony J. Algmin believes Data Architecture is moving from a time of chaos and tangles into something more clean and organized. Speaking at the DATAVERSITY® Data Architecture Online Conference, Algmin looked at past predictions, current hot topics, and predictions for the future. He is the Founder and CEO of Algmin Data Leadership.
A Quick Look Backward: Revisiting Past Predictions
Reflecting on a panel discussion he facilitated at the Data Architecture Summit in Chicago, Algmin said four themes emerged from the discussion that have subsequently been validated:
- Cloud and its derivatives have become, and continue to be, a hot topic, and most organizations have acknowledged that these technologies have a future.
- Business understanding for data architects has become increasingly important, and Algmin says it is now crucial to the data architect’s success.
- At the same time as the data architect’s role is encompassing more business acumen, the desire and ability to dig in and take on technical implementation is still important.
- The expanding role of data architects is something he continues to see.
Recent Data Architecture Misses
Algmin next looked at recent projections that didn’t pan out as expected:
- “The data warehouse is dead!” The prediction that NoSQL will kill the data warehouse and replace it with in-memory analytics has not become a reality. Although it has an impact, NoSQL has not killed the data warehouse. Instead, it has become a complement to it. In-memory analytics are incredibly powerful, Algmin said, but they are not destroying the data warehouse, and like NoSQL, work well alongside it. The reality is that operational reporting is best served by a data warehouse and will continue to be relevant in the foreseeable future.
- “Ad-hoc analyses by data scientists will fix everything.” Although Data Science is a hot topic, “The fact is that the relevancy of data scientists has not yet reached its actual potential. It hasn’t lived up to the hype,” he said. Companies expect their scientists, working in relative isolation from most other business functions, to produce analytics, do predictive modeling, and power machine learning and AI. In reality, most data scientists still spend most of their time on data prep because they often don’t trust others to do that work for them. “Even if you have a robust Data Governance and Data Management capability, you’re still going to find that you’re doing a lot of data prep as a data scientist today, and we haven’t yet been able to find a way around that.” Some useful tools have arisen to mitigate this, but increasing data volumes have offset those advantages, he said. Context is critical to success with Data Science, and scientists have struggled to connect their academic analyses to operational improvements in the business, he said, but there is a lot of potential.
Beyond the Hype
Some of the hottest topics have reached maturity and are now mainstream.
“Big data” was a popular term that came about because existing technology couldn’t keep up with the volume. Now that the technology exists, he said, the challenge has shifted to the difficulty people have with understanding today’s massive data stores.
Real-time analytics and streams have been held back because many organizations either cannot react in real time or have no need to. In use cases like day trading, reacting very quickly is necessary, but many business use cases today don’t need to, or can’t, react quickly enough to justify the expense of real-time analytics. Event-based batch processing, whether hourly, minute-by-minute, or overnight, works just as well for most uses.
Agile development was a “mysterious thing” several years ago, he said, but it’s now being used in many organizations. The term “agile” has also come into popular use, with companies using it to describe themselves as nimble, relevant, innovative organizations. “It’s kind of hilarious to me how many organizations out there say, ‘We’re agile,’ and you look under the covers at what they actually do process-wise, and they are nowhere near agile.” Despite the mainstream playing fast and loose with the term, Algmin said that true agile development is undeniably useful.
Along with the emergence of dashboards and information reporting, he said, there was a strong desire to have access to analytics on the phone, because executives needed to be able to see their numbers anytime, anywhere. Now responsive design decouples the output format from the analytics calculation, so the receiver can choose a form factor independently of how the analytics were created. “Phones and mobile analytics used to be super-hot. Now they’ve settled down, and now they’re just part of the fabric of everything that we’re doing.”
“It was the peak of hilarity to me that when we first started talking about the Internet of Things, we were saying, ‘Okay, the Twitter-enabled refrigerator.’ You remember that?” Not surprisingly, refrigerators with a screen enabling tweets from the kitchen have not become commonplace. “Who thought that was really going to help?”
Algmin said that we’ve reached a point where many organizations have a Chief Data Officer (CDO) or equivalent, because they recognize that they want more from their data. The topic is still relevant, but it’s becoming more mainstream, and he would like to see more dialogue about the role of the CDO in the organization. Algmin suggests that the CDO should be part of the business side, and the Chief Information Officer (CIO) should remain in IT.
NoSQL is an area that peaked with the “kill the data warehouse” story, but has settled into being an important part of search and retrieval for datasets tied to a specific key.
In-memory analytics has become a robust tool for ad hoc analytics, data profiling, and proving out ideas that are then best hardened through a data warehouse.
Although Data Science hasn’t entirely lived up to expectations, Algmin said, “A functioning, impactful Data Science team can do some incredible stuff for your organization.”
Current Trends: Hottest Topics in Data Architecture Today
“Machine learning and artificial intelligence are the hottest areas, bar none,” Algmin said. Even organizations with no Data Governance, poor Data Quality, and processing performance at a slow crawl are clamoring for AI and ML. Algmin suggested that the introduction of these advanced technologies be contingent upon improving the basics, such as Metadata Management.
“We’re not going to change what’s hot, but we can change how we approach it, and recognize that the supporting capabilities to drive machine learning and artificial intelligence are the things that we need [in order] to do anything with data.”
Algmin said that conversation with chatbots such as Alexa or Siri, “who don’t understand what I’m talking about,” hasn’t quite lived up to its potential. It has definitely reached critical mass, he said, and it’s not going away, so in some use cases, enabling voice capability in consumer-facing products may be worth pursuing.
Cloud technology is starting to mature, as evidenced by the continued growth of the major cloud platforms, as well as consolidation and retrenchment strategies in some of the secondary players.
Convergence of Data Architecture and Enterprise Architecture
“The data architect is on the upswing in terms of its heat index, but the enterprise architecture heat index is very, very cold, and it has been for a while.”
The popularity of data catalogs and data lineage, along with the resurgence of Data Modeling, is in many cases a reaction to poor Data Management and decentralized data sources. He sees companies that were not successful with managing a warehouse moving to a data lake and repeating the same mistakes, resulting in a “contained terrible mess.”
Implementing a data catalog that could show where the pieces of the mess are, and providing an ability to trace lineage and data flow, he said, can be a step in the right direction. Modeling can provide a single source of truth and eliminate work spent repeating basic tasks. “Hey, you know what’s hard? Transforming everything every time you need to move it somewhere. Why not get it right and then use it everywhere?”
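As a rough illustration of what tracing lineage through a catalog can look like, the sketch below walks a small dependency map from a report back to its raw sources. The dataset names, the `LINEAGE` dictionary, and the `trace_upstream` function are all hypothetical, invented for this example; they do not come from Algmin's talk or from any particular catalog product.

```python
from collections import deque

# Hypothetical catalog: each dataset maps to the upstream datasets it is built from.
LINEAGE = {
    "sales_dashboard": ["sales_mart"],
    "sales_mart": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["crm_export"],
    "orders_raw": [],
    "crm_export": [],
}


def trace_upstream(dataset, lineage):
    """Return every dataset that feeds `dataset`, nearest sources first."""
    seen = []
    queue = deque(lineage.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


if __name__ == "__main__":
    # Lists the datasets the dashboard depends on, directly or indirectly.
    print(trace_upstream("sales_dashboard", LINEAGE))
```

Even a simple map like this makes it possible to ask which reports are affected when a source changes, which is the kind of visibility into the “mess” described above.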
Innovation Pace is Flat
“We are exiting one of the most profound rapid-innovation periods in history,” Algmin said, and although innovation is still happening, more derivatives are emerging, such as the development of cloud features, compared to the original creation of cloud technology. “We need time to refine and operationalize all of the potential that has been created.”
In the few years between now and when 5G becomes ubiquitous, expect great productivity gains and value creation, as well as changing service delivery models. The streaming video space, for example, is undergoing massive fragmentation, with providers like Netflix, Hulu, and Disney offering services not available to traditional cable and satellite customers. “Traditional cable and satellite providers are just hemorrhaging customers,” he said, and fragmentation is amplified due to their “rich history of terrible customer service.” He predicts that consumers may never again have just one package, but instead will have multiple different cloud-based streaming content providers.
Implications for Data Architecture
Data Architecture innovation is experiencing similar patterns in areas with related technologies where the use case potential is in its infancy, such as blockchain and graph databases, and the role of Data Architecture is changing to accommodate them. Algmin cautions against interpreting this as an overall slowing and says it’s more like refactoring than brainstorming. Overall output will continue to increase and will exponentially expand the value, he said, “because we’re doing less hypothesizing and more iterative improvement.”
On the Horizon: Future Hot Data Architecture Topics
Algmin predicts the extension of ML and AI into Metadata Management and Data Governance with things like blockchain and distributed ledgers. “We’re going to start seeing how we as data architects can do some of the things that are really holding our organizations back.”
Algmin predicts continued convergence of architect roles around data, cloud, infrastructure, enterprise, applications, and business processes, because business expects answers from trusted experts tuned in to the systems making digital transformation happen. Arising out of that expectation, effective communication will help the architect have meaningful influence in other functional areas within the organization.
As privacy concerns and risk continue to be a priority, Algmin cautions that those in the newest generation entering the workforce, although they grew up with data and are accustomed to using it daily, may not fully realize the potential downside of information sharing. He also cautions against assuming that sophisticated data consumers will inherently have deep Data Architecture skills.
Quantified reality, augmented reality, and virtual reality, including wearables and embedded tech, are truly in their infancy and have much potential beyond the “glorified step counters” of today.
Algmin said that the importance of data is continually increasing, and this is not going to change. “We are going to continue to realize that data is the closest thing we have in our organizations to truth, and continue sharing and spreading that truth to each other in whatever way is necessary.” Data informs improvement, allowing businesses to react, take action, and improve.
Algmin writes a quarterly column at The Data Administration Newsletter (TDAN) and has released popular online training courses in Data Leadership. His book, Data Leadership: Stop Talking about Data and Start Making an Impact! is available now.