by Angela Guess
According to a recent article by Evangelos Simoudis, “it appears that more companies are getting comfortable collecting Petabyte scale data sets from sensor networks, social applications, IT log files, and online advertising applications. However, many of the companies collecting these data sets appear to have been struggling with the analysis of the collected data… Larger corporations that have already developed data warehousing and analytic infrastructures must understand how to use Hadoop in order to deal with the data volumes being collected in the context of their existing infrastructures. Smaller private companies, on the other hand, are trying to establish such infrastructures for the first time and trying to determine what role Hadoop can ultimately play.”
Simoudis continues, “Companies analyze data either through interactive queries or by creating models. Such models can predict the outcome of future actions or automatically describe characteristics of data sets. Examples of emerging analytic big data applications include social media sentiment analysis, and analysis of various forms of data logs for online advertising, computer security, etc. Hadoop’s MapReduce batch processing programming model has proven effective for model creation. One may even say that for model creation Hadoop could further benefit by interfacing with open source modeling tools such as those offered by Revolution Analytics.”
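To make the MapReduce model Simoudis mentions a little more concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps, applied to a toy version of social media sentiment analysis. It does not use Hadoop or any Hadoop API. The word lists, function names, and sample posts are all hypothetical and chosen only for illustration, and a real sentiment model would be far more involved.

```python
from collections import defaultdict

# Hypothetical word lists, assumed purely for illustration.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "hate", "broken"}


def map_phase(post):
    """Emit a (label, 1) pair for each sentiment-bearing word in one post."""
    for word in post.lower().split():
        if word in POSITIVE:
            yield ("positive", 1)
        elif word in NEGATIVE:
            yield ("negative", 1)


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single total."""
    return key, sum(values)


def run_job(posts):
    """Run the three steps in sequence over an in-memory list of posts."""
    pairs = (pair for post in posts for pair in map_phase(post))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample_posts = [
        "Love the fast checkout",
        "Checkout is slow and broken",
        "great service",
    ]
    print(run_job(sample_posts))
```

Because each post is mapped independently and each key is reduced independently, the same logic can in principle be spread across many machines, which is what makes the batch model a fit for very large data sets.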
For more on this topic, see Simoudis’ full article.