
Big Data’s Coming End-State Architecture

By James Kobielus  /  August 28, 2013

Maturity is a complex concept where big data is concerned. I see it mentioned in many industry contexts, but I rarely see anybody put together a clear picture of where it’s all heading. How would we recognize a mature end-state big data architecture if it ever arrives?

I’ve addressed big data maturity head-on in many recent posts, but I still have more thoughts in that regard. You can speak of “mature best practices” for deploying, managing, and optimizing big data environments, such as I did here. You can broaden the focus to discussing, in turn, the “maturity” of big data deployments, platforms, and industry ecosystems, as I did in this post. You might even talk about big data as a “mature” industry paradigm, as I did here.

But is it possible to conceptualize big data “maturity” in terms of the evolutionary end-state, if any, in which all of today’s innovations might culminate? I left that question hanging in my previous discussions, in which I posed the following questions:

“So, then, what, in the final historical perspective, will be big data’s target architecture…? How much of this plateau end-state has already been realized in practice in production-grade data management and analytics? How far do we as an industry and as users of the technology still need to evolve to arrive at that promised plateau?”

I like to call this end-state the “omega architecture” for big data. Some might call it the “settling point of big data systems.” Regardless of what you call it, you must wonder whether it’s wishful thinking, a mirage that forever recedes into the future. You must also wonder whether, if it ever arrived, it would be a utopia or a dystopia. The end of evolution means death, after all.

Is the omega architecture some future point of “maturation” when most commercially available big data systems from established solution providers are to be considered functionally sufficient for the full range of apps and workloads that they are expected to handle? Considering the widening range of uses to which big data systems are being applied, I doubt that this settling point would emerge from organic industry competitive dynamics.

Suffice it to say, I’m not all that optimistic that we can frame a valid “S-curve” (à la Gartner) for big data analytics. Big data encompasses far too diverse a range of approaches to permit clear identification of the “slough of despond,” or what have you. One would be hard pressed to identify industry “maturation” toward that end-state, much less a roadmap or milestones for us all “getting there.”

If there’s any prospect of a big data omega architecture, it would have to be a vision, articulated by someone, that catalyzes the industry toward a specific goal. If I ruled the big data world, this would be the (admittedly complex) omega architecture:

  • Provides an analytic resource of elastic, fluid topology
  • Provides an all-consuming resource that ingests information originating in any source, format and schema
  • Provides a latency-agile resource that persists, aggregates and processes any dynamic mix of at-rest and in-motion information
  • Provides a federated resource that sprawls within and across value chains, spanning both private and public clouds
  • Provides an analytics-optimized information persistence and delivery layer
  • Aggregates information into integrated, nonvolatile, time-variant repositories under unified governance
  • Organizes information into subject-area data marts that correspond with one or more business, process, and/or application domains
  • Supports flexible deployment topologies, such as centralized, hub-and-spoke, federated, and independent data marts and ODSes
  • Enables unified conformance and governance of detailed, aggregated and derived information, as well as associated metadata and schemas, by business stakeholders
  • Extracts, loads and consolidates information from sources through various approaches
  • Governs the controlled distribution of information to various downstream repositories, applications and consumers
  • Maintains the availability, reliability, scalability, load balancing, mixed workload management, backup and recovery, security and other robust platform features necessary to meet the most demanding, changing enterprise mix of analytics, data management and decision support workloads

Whatever omega vision we agree on, we can measure “gaps” in terms of “how far commercially available solutions and industry best practices need to evolve, from today to some indefinite point in the future, to realize that end-state.”

Or one can short-circuit this discussion by saying the omega is “quantum analytics in the cloud.” But I’m not sure that’s sufficient either.

My crystal ball on this stuff is as cloudy as yours (pun semi-intended).

About the author

James Kobielus, Wikibon Lead Analyst

Jim is Wikibon's Lead Analyst for Data Science, Deep Learning, and Application Development. Previously, Jim was IBM's data science evangelist. He managed IBM's thought leadership, social, and influencer marketing programs targeted at developers of big data analytics, machine learning, and cognitive computing applications. Prior to his 5-year stint at IBM, Jim was an analyst at Forrester Research, Current Analysis, and the Burton Group. He is also a prolific blogger, a popular speaker, and a familiar face from his many appearances as an expert on theCUBE and at industry events.
