How to Avoid Messing Up Big Data Analytics

Click to learn more about author Ravi Shankar.

“Big Data” is still in the news these days, but the story has evolved from describing what it is to how an organization can actually use all that data for actionable intelligence, or “Big Data Analytics.” In fact, Forbes just reported on a Honeywell study of 200 manufacturing executives, which found that the majority of them (68 percent) are currently investing in Big Data Analytics because they consider it essential for improving manufacturing intelligence and operational performance.

The reality is that while many understand the value of Big Data Analytics, far too many are messing it up. Why?

How Big Data Initiatives Are Creating Big Data Silos

As businesses continue to amass terabytes upon terabytes of data, they are building Big Data systems whose volumes are “big” by any reasonable measure, but each one is typically organized around a specific line of business or business objective. The trouble is that these systems are not single repositories, and that makes all the difference. If you look inside, you’ll find that they’re made up of multiple Big Data systems: one for marketing, one for finance, one for HR, and so on.

The result is that you have the same problem you had before: the data is separated into silos, and it cannot easily be integrated. Before the Big Data “explosion,” these silos existed in separate, physical data centers, on servers from different vendors that might not work well with each other. This was especially true of systems from the ’80s that were still in use for one reason or another. The difference now is that these independent silos all reside in the same Big Data system, yet they remain just that: silos. Marketing sees only marketing data, Finance sees only finance data, and so on.

The objective of Big Data has always been to identify and tap into insights from a variety of structured and unstructured sources, and there are many reasons to look across the silos. Doing so lets the organization ask questions like, “Of the customers who bought lawnmowers, which might be likely to buy a hedge trimmer?” That may sound simple, but the first half of the question relies on data from finance, whereas the second half needs data from marketing. Put simply, such questions can be answered only if the silos are somehow stitched together to enable a comprehensive view of all the data. This is precisely where Data Virtualization can help.
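To make the lawnmower question concrete, here is a minimal Python sketch of what answering it involves once both silos can be read together. Everything in it is an assumption for illustration: the `purchases` and `interest` records, their field names, the `likely_buyers` helper, and the 0.5 score threshold are hypothetical and do not come from any particular product.

```python
# Hypothetical finance silo: who bought what.
purchases = [
    {"customer_id": 1, "product": "lawnmower"},
    {"customer_id": 2, "product": "rake"},
    {"customer_id": 3, "product": "lawnmower"},
]

# Hypothetical marketing silo: estimated interest in a product (0.0 to 1.0).
interest = [
    {"customer_id": 1, "product": "hedge trimmer", "score": 0.82},
    {"customer_id": 2, "product": "hedge trimmer", "score": 0.90},
    {"customer_id": 3, "product": "hedge trimmer", "score": 0.35},
]


def likely_buyers(purchases, interest, bought, target, threshold=0.5):
    """Return ids of customers who bought `bought` and score well for `target`."""
    # First half of the question: answered from finance data alone.
    buyers = {p["customer_id"] for p in purchases if p["product"] == bought}
    # Second half: answered from marketing data, restricted to those buyers.
    return sorted(
        i["customer_id"]
        for i in interest
        if i["product"] == target
        and i["score"] >= threshold
        and i["customer_id"] in buyers
    )


candidates = likely_buyers(purchases, interest, "lawnmower", "hedge trimmer")
print(candidates)  # [1]
```

Customer 2 has the highest interest score but never bought a lawnmower, and customer 3 bought one but scores low, so neither silo could have produced the answer by itself.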

A Layer of Intelligence

Data Virtualization technology establishes an intelligent layer that sits above the silos, managing access to each in a way that is transparent to business users. To query across the silos, a user just has to query the Data Virtualization layer, which chases down the results and brings them back to the user in real time.
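As a rough illustration of the idea, and not a description of how any real Data Virtualization product is built, the sketch below models the layer as a small Python class. The `VirtualLayer` class, its `register` and `join` methods, and the sample rows are all hypothetical. The point it shows is that the layer holds no copy of the data: it only knows how to reach each silo, and it fetches rows at the moment a question is asked.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class VirtualLayer:
    """Hypothetical stand-in for a layer that sits above several silos."""

    def __init__(self) -> None:
        # Maps a silo name to a function that fetches its current rows.
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, fetch: Callable[[], Iterable[Row]]) -> None:
        self._sources[name] = fetch

    def join(self, left: str, right: str, key: str) -> List[Row]:
        """Combine rows from two silos that share the same value for `key`."""
        # Rows are fetched at query time, so nothing is copied in advance.
        # Assumes at most one row per key value in the right-hand silo.
        right_index = {row[key]: row for row in self._sources[right]()}
        merged: List[Row] = []
        for row in self._sources[left]():
            match = right_index.get(row[key])
            if match is not None:
                merged.append({**row, **match})
        return merged


# Hypothetical silos; in practice these would be separate systems.
finance_rows = [{"customer_id": 1, "total_spent": 420.0}]
marketing_rows = [{"customer_id": 1, "segment": "gardening"}]

layer = VirtualLayer()
layer.register("finance", lambda: finance_rows)
layer.register("marketing", lambda: marketing_rows)

first = layer.join("finance", "marketing", key="customer_id")

# New data lands in the silos; the next query sees it without a reload step.
finance_rows.append({"customer_id": 2, "total_spent": 35.0})
marketing_rows.append({"customer_id": 2, "segment": "diy"})
second = layer.join("finance", "marketing", key="customer_id")

print(len(first), len(second))  # 1 2
```

The user of `layer.join` never touches the finance or marketing systems directly, which is the sense in which access is transparent. A real layer would also have to handle security, query optimization, and differing data formats, none of which this sketch attempts.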

The beauty of Data Virtualization technology is that it stitches together all the data across the enterprise: not only the multiple Big Data systems that make up an organization’s monolithic Big Data set, but also all the other enterprise systems that an organization may hold, including legacy on-premises systems or systems recently absorbed through an acquisition, for a single virtual view of all of an organization’s data.

So while nearly half of the manufacturing executives surveyed (46 percent) agreed that implementing and using Analytics is no longer optional, the reality is that breaking down data silos, both big and small, is the only way to deliver on the full potential of Big Data Analytics.

