James Kobielus of InfoWorld recently wrote, “Machine-generated log data is the dark matter of the big data cosmos. It is generated at every layer, node, and component within distributed information technology ecosystems, including smartphones and Internet-of-things endpoints… Clearly, automation is key to finding insights within log data, especially as it all scales into big data territory. Automation can ensure that data collection, analytical processing, and rule- and event-driven responses to what the data reveals are executed as rapidly as the data flows. Key enablers for scalable log-analysis automation include machine-data integration middleware, business rules management systems, semantic analysis, stream computing platforms, and machine-learning algorithms.”
He goes on, “Among these, machine learning is the key for automating and scaling distillation of insights from log data. But machine learning is not a one-size-fits-all approach to log-data analysis. Different machine-learning techniques are suited to different types of log data and to different analytical challenges. When the correlations and other patterns sought through machine learning can be specified a priori, supervised learning is the way to proceed. However, supervised learning requires a human expert to prepare a reference ‘training data’ set from the log in order to refine a machine-learning algorithm’s ability to discern the most relevant patterns.”
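To make the "training data" point concrete, here is a minimal sketch of supervised learning on log lines. It is not Kobielus's method: the log lines, the `alert`/`normal` labels, and the choice of a small naive Bayes classifier are all assumptions made for illustration. The idea it shows is only that a person labels a handful of lines first, and the algorithm then generalizes from those labels.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-labeled log lines: the reference "training data"
# that a human expert would prepare. Both lines and labels are invented.
TRAINING = [
    ("ERROR disk write failed on /dev/sda1", "alert"),
    ("ERROR connection timeout to db01", "alert"),
    ("CRITICAL out of memory killing process", "alert"),
    ("INFO user login succeeded", "normal"),
    ("INFO scheduled backup completed", "normal"),
    ("DEBUG cache refresh completed", "normal"),
]


def tokenize(line):
    """Split a log line into lowercase whitespace-separated tokens."""
    return line.lower().split()


def train(examples):
    """Count tokens per label; these counts are the whole model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for line, label in examples:
        label_counts[label] += 1
        for tok in tokenize(line):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, label_counts, vocab


def classify(line, model):
    """Return the most probable label under naive Bayes with add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior for this label
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(line):
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify("ERROR write failed on db01", model))
print(classify("INFO backup completed", model))
```

With six labeled lines the classifier is a toy, but it illustrates the cost the quote describes: every label in `TRAINING` had to be supplied by someone who already knew which patterns matter.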
Image: Courtesy Flickr/zachstern