Not Your Type? Big Data Matchmaker On Five Data Types You Need To Explore Today

June 3, 2012

by Sunil Soares

I have met with dozens of clients this year in industries such as financial services, retail, and government. When the subject of big data comes up, the first question is usually “What data am I supposed to be looking at?”

Before anyone can answer that question, a broader question must be answered: “what problem are you trying to solve?”

Let’s assume that answer comes quickly, and it’s time to look at the different kinds of big data that businesses aren’t necessarily getting insights from today.

We need a good classification for big data. The figure above provides a framework that I have been fine-tuning for the past several months.

I believe that most big data can be broadly classified into five types:

  1. Web and social media
    This includes clickstream and social media data such as Facebook, Twitter, LinkedIn, and blogs. Big data governance programs will increasingly be required to integrate this data with master data and with core business processes such as customer loyalty programs. The big data governance program needs to establish policies regarding the acceptable use of social media data, especially as regulations and precedents are continually evolving. The program also needs to establish guidelines regarding the acceptable use of cookies, especially third-party cookies, to track users and to personalize their web interactions. Metadata is also critical to web and social media. For example, two sites may measure the term “unique visitors” differently for clickstream analytics. One site may measure unique visitors within a month, while another may measure them within a week.
  2. Machine-to-machine data
    Machine-to-machine (M2M) refers to technologies that allow both wireless and wired systems to communicate with other devices. M2M uses a device such as a sensor or meter to capture an event (such as speed, temperature, pressure, flow, or salinity) which is relayed through a wireless, wired, or hybrid network to an application that translates the captured event into meaningful information. M2M communications create the so-called “internet of things.” The big data governance program needs to establish a number of policies around M2M data. For example, the program needs to draw up guidelines around the acceptable use of geolocation and RFID data that can be used to build a profile of individuals and potentially violate their privacy. The program also needs to establish retention policies around the massive volumes of M2M data that can easily overwhelm IT budgets if not properly controlled. The big data governance program also needs to address any data quality concerns such as RFID read rates in environments with high moisture content and lots of congestion.
  3. Big transaction data
    This includes healthcare claims, telecommunications call detail records, and utility billing records. Big transaction data is increasingly available in semi-structured and unstructured formats. Information governance challenges such as metadata, data quality, privacy, and information lifecycle management also apply to this data.
  4. Biometrics
    Biometric information includes fingerprints, retinal scans, facial recognition, and genetics. Advances in technology have vastly increased the available biometric data. Law enforcement, the legal system, and intelligence agencies have been using this information for a long time. However, biometric data is increasingly available in the commercial arena where it can be co-mingled with other types of data such as social media. For example, page 45 of the attached FTC report describes a scenario where retailers can combine facial recognition with social media to personalize messages to customers.
    http://ftc.gov/os/2012/03/120326privacyreport.pdf
    All this opens up new business opportunities as well as several governance issues relating to privacy and data retention.
  5. Human-generated data
    Human beings generate vast quantities of data such as call center agents’ notes, voice recordings, email, paper documents, surveys, and electronic medical records. This data may contain sensitive information that needs to be masked. It may contain insights that can improve the quality of structured data sets and should be integrated with MDM. Finally, organizations need to establish policies regarding the retention period for this data to adhere to regulations and to manage storage costs.
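
To make the “unique visitors” metadata point from the first data type concrete, here is a minimal Python sketch. The sample visits, the function names, and the choice of calendar months and ISO weeks as counting windows are all assumptions made for illustration; they are not taken from any particular analytics product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical clickstream sample: (visitor_id, visit_date) pairs, invented for illustration.
visits = [
    ("alice", date(2012, 6, 4)),
    ("alice", date(2012, 6, 12)),
    ("bob", date(2012, 6, 5)),
    ("bob", date(2012, 6, 20)),
    ("carol", date(2012, 6, 13)),
]


def unique_visitors(visits, window_key):
    """Count distinct visitors per window; window_key maps a date to a window label."""
    seen = defaultdict(set)
    for visitor_id, day in visits:
        seen[window_key(day)].add(visitor_id)
    return {window: len(ids) for window, ids in seen.items()}


def month_window(day):
    # Assumed definition A: a visitor is counted once per calendar month.
    return (day.year, day.month)


def week_window(day):
    # Assumed definition B: a visitor is counted once per ISO week.
    iso = day.isocalendar()
    return (iso[0], iso[1])


monthly = unique_visitors(visits, month_window)
weekly = unique_visitors(visits, week_window)

# Adding up the weekly figures counts returning visitors more than once.
weekly_total = sum(weekly.values())

print("Unique visitors per month:", monthly)
print("Unique visitors per week:", weekly)
print("Sum of weekly counts:", weekly_total)
```

With this sample, the monthly definition reports 3 unique visitors for June, while adding up the weekly counts gives 5. The same traffic produces two different numbers, which is why the definition behind the metric needs to be captured as metadata before figures from two sites are compared.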

Of course, the true test of a framework is that it can withstand the test of time and address different scenarios. I had to evolve this framework a number of times to account for new types of big data. As such, it is entirely possible that I missed something and I am looking forward to your feedback.

 

 


3 Responses to Not Your Type? Big Data Matchmaker On Five Data Types You Need To Explore Today

  1. [...] Soares, Director of Information Governance at IBM, kicks off his first blog with DATAVERSITY: Not Your Type? Big Data Matchmaker On Five Data Types You Need To Explore Today. I really think you’re going to enjoy his blogs on Data Governance. We have more video blogs [...]

  2. Big Reads On Big Data | Numrush – Nederland on June 20, 2012 at 2:38 pm

    [...] “Not Your Type? Big Data Matchmaker On Five Data Types You Need To Explore Today,” [...]

  3. [...] A first step toward clarifying the situation could be to draw up a typology of the heterogeneous data to be processed. That is the task Sunil Soares, director of information governance at IBM, took on in this post. [...]
