
Internet of Things May Disrupt Predictive Analytics in Big Data Clouds

July 3, 2013  /  3 Comments

by James Kobielus

Many operational big data applications have predictive analytics at their core. So, given how much of the business may be riding on a predictive analytics infrastructure, business professionals have to ask: how stable is this infrastructure under the chaotic, complex, dynamic conditions that I find in many operational environments?

Predictive models are usually built around specific scenarios with well-defined dependent and independent variables in expected distributions. Predictive models thrive on linear relationships among variables – in other words, the sort that regression modeling captures most effectively. But what happens when the underlying reality being modeled becomes non-linear – in other words, when seemingly inconsequential new events, neither expected nor modeled explicitly, render formerly powerful predictive models impotent?

That’s the famed “butterfly effect” of chaos theory. It’s technically defined as “the sensitive dependence on initial conditions, where a small change at one place in a deterministic nonlinear system can result in large differences to a later state.” There are many mathematical techniques for modeling non-linear relationships, but, given that these are highly specialized and often unfamiliar to business-oriented data scientists, you probably haven’t incorporated any of them into the sorts of predictive models that drive your big data applications.
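To make that definition concrete, here is a minimal sketch in Python. It uses the logistic map, a standard textbook example of a deterministic non-linear system, which is my choice for illustration and not something the definition itself names. The function names, the starting value 0.2, and the perturbation of 1e-10 are all assumptions picked for the demonstration.

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0.

    r = 4.0 puts the map in its chaotic regime (an illustrative choice).
    Returns a list of steps + 1 values, including the starting point.
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0, delta=1e-10, r=4.0, steps=100):
    """Absolute gap, step by step, between two runs that start delta apart."""
    a = logistic_trajectory(x0, r, steps)
    b = logistic_trajectory(x0 + delta, r, steps)
    return [abs(p - q) for p, q in zip(a, b)]


if __name__ == "__main__":
    gaps = divergence(0.2)
    # Print the gap every 10 steps to watch the tiny difference grow.
    for step in range(0, len(gaps), 10):
        print(f"step {step:3d}: gap = {gaps[step]:.3e}")
```

Both runs follow exactly the same rule and start almost at the same point. The gap stays negligible for the first few steps and then grows until the two trajectories bear no resemblance to each other, which is the "small change, large difference" behavior the definition describes.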

The next frontier for operational big data applications is the Internet of Things (IoT), which is likely to become a deepening vortex of butterfly effects waiting to happen. That is because non-linear effects are likely to be far more prevalent in IoT environments – such as smart grids and real-time distributed process monitoring – than in traditional B2C-oriented big data applications. The chief causes will be the continued growth in the number of IoT endpoints and in the range of messages those endpoints generate and consume under a broadening range of operational scenarios. If nothing else, the sheer combinatorial explosion in IoT interaction patterns is a recipe for chaotic traffic loads.
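A rough way to see the scale of that combinatorial explosion is to count possible interactions as endpoints are added. The sketch below is only an illustration under my own assumptions: it treats an "interaction pattern" as either a pair of endpoints or any group of two or more endpoints, and the function names are hypothetical. It shows how fast the count grows, nothing more.

```python
from math import comb


def pairwise_links(n):
    """Number of distinct endpoint pairs among n endpoints: n * (n - 1) / 2."""
    return comb(n, 2)


def interaction_groups(n):
    """Number of distinct groups of two or more endpoints among n endpoints.

    All 2**n subsets, minus the empty set and the n single-endpoint subsets.
    """
    return 2 ** n - n - 1


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(f"{n} endpoints: {pairwise_links(n)} pairs, "
              f"{interaction_groups(n)} possible groups")
```

Pairs grow quadratically with the number of endpoints, while possible groups grow exponentially, so each new device adds far more potential interactions than the one before it.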

Think about it. Every new sensor, gadget, system, and other device that enters the IoT becomes yet another butterfly, and every new piece of data it emits or action it takes becomes another flapping of the butterfly’s wings. Throughout the world, as more of these butterflies come online, produce and consume more data, and cavort in countless combinations in every possible circumstance, the non-linear effects are almost certain to intensify. How can we do effective predictive analysis under those conditions?

These thoughts came to me as I read a recent article by Geoffrey West. He focused on the accelerating complexity of distributed big data systems and called for a “big theory” to encompass it all and enable better prediction of complex behaviors. While reading this article, though, it occurred to me that we already have such a theoretical framework: the chaos and complexity theories developed by IBM’s Benoit Mandelbrot and others. It seems to me that we can’t truly harness and control the coming global IoT if we don’t revisit these theories with a renewed emphasis on prediction under chaotic conditions.

The vision of a planet-wide optimization depends on keeping the butterfly effect under control in operational IoT clouds. But how will that be possible?

One key approach will be to deploy federated IoT clouds. In this scenario, which I expect to see first in autonomic smart-grid applications in energy and utilities, each cloud is a distinct domain (business unit, region, application, etc.) with its own dedicated predictive infrastructure that ensures continuous, closed-loop local optimization. In addition, the IoT cloud domains would be loosely coupled to one another, lessening the likelihood that anomalous non-linear events (i.e., “butterfly effects”) in one or more of them trigger chain reactions that cascade across them all.

Big data could potentially supply part of the solution to this vision in the form of an event management bus shared by the federated IoT clouds. Non-linear predictive models and associated rules that leverage the pooled real-time event data on this shared bus could act as shock absorbers that prevent the butterflies from running riot globally.

What do you think?

About the author

James Kobielus is an industry veteran and serves as IBM’s big data evangelist. He spearheads IBM’s thought leadership activities in Big Data, Hadoop, enterprise data warehousing, advanced analytics, business intelligence, data management, and next best action technologies. He works with IBM’s product management and marketing teams in Big Data. He has spoken at such leading industry events as Hadoop Summit, Strata, and Forrester Business Process Forum. He has published several business technology books and is a very popular provider of original commentary on blogs and social media.

  • I disagree that the IoT will be as chaotic as you describe. In my technologies, I use “inductive modeling” neural network AI for predicting human behavior. In the IoT, gadgets have to speak a common language in order to communicate with the IoT. Therefore, the functionality of the “gadget” has known behaviors. These “gadgets” are not going to do unpredictable things that can’t be identified. Therefore, as more and more devices are added to the IoT, they will have predictable behaviors that CAN be managed and will transmit data that CAN be expected, even when the devices are different. This whole “big data” marketing thing is, frankly, BS, as far as I’m concerned. The more I read about it the more nonsense it appears to be. To me. I don’t get it, and I think I know what I’m talking about. But maybe you can convince me otherwise. I’ll keep an open mind.

  • James Kobielus

    You didn’t respond to my core point: “If nothing else, the sheer combinatorial explosion in IoT interaction patterns is a recipe for chaotic traffic loads.” You focused your comments on some ostensibly “predictable” behavior of individual gadgets in isolation. Then you took a side trip into some tangential, unsupported attitude that had no bearing on the point you were trying to make.

    • Partially correct. I did head off on the myth of “big data.” So more to the point, the IoT won’t be a sudden “explosion” but it will increase over time. However, the devices will still have predictable behavior. If the devices transmit 10 separate pieces of data, the device won’t suddenly create an 11th. So it is still predictable. Therefore, not chaotic. Just more data to manage. And you were apparently offended by my remarks about “big data” since you are IBM’s evangelist and paid more attention to that than what I was saying about predictable behavior.
