How to Keep Machine Learning From Becoming a Big Data Bottleneck

By Angela Guess

Jack Vaughan of SearchDataManagement recently wrote, “The recent rush of machine learning technology and products is formidable, but machine learning techniques are far from new. What’s new is the number of parallelized data processing platforms becoming available for machine learning applications of big data. At the recent Strata + Hadoop World conference in San Jose, Calif., data specialists said the complexity of predictive machine learning algorithms and models, as well as the sheer numbers of such models, can limit use of machine learning in large corporations. They also discussed tools that help them address these limits. ‘The power of the machine learning techniques scales with the data, but training times can increase exponentially,’ said Ryan Michaluk, a data scientist at Allstate Insurance Co. in Northbrook, Ill.”

Vaughan goes on, “With more sophisticated models and growing masses of digital data to process, Michaluk added, iterative machine learning actually became a bit of a bottleneck in his part of the organization. As a result, models ran on samples, not full or near-full data sets, which resulted in some compromised accuracy and predictability. He said that using Hadoop data pooling is a sensible step toward addressing size issues with models and data, but that machine learning problems still may remain hard to solve. ‘Some algorithms parallelize trivially — some don’t,’ he said.”

He continues, “Michaluk said his group began using Hadoop along with machine learning software from Skytree Inc. in San Jose to speed the time it took for parallel model development. He and his colleagues are now able to take existing learning models and run them on larger sets of data, which can lead to better predictions. These models can improve decision making around pricing, fraud prevention, underwriting, marketing and webpage design.”
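For readers wondering what Michaluk's remark that some algorithms "parallelize trivially" and some don't looks like in practice, here is a minimal Python sketch. It is an illustration only and is not drawn from Allstate's or Skytree's code; the function names and the toy data are assumptions made up for this example. The first function splits its work into chunks that never need to talk to each other, while the second is iterative, so each step has to wait for the result of the one before it.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    # Each chunk is processed without looking at any other chunk.
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_chunks=4):
    # Trivially parallel: split, process the pieces concurrently, combine.
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        return sum(pool.map(partial_sum, chunks))


def gradient_descent(data, steps=50, lr=0.1):
    # Iterative: step k cannot start until step k-1 has produced w.
    # This toy objective, mean((w - x)^2), is minimized at the mean of data.
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w - x) for x in data) / len(data)
        w -= lr * grad
    return w


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
    print(gradient_descent([1.0, 2.0, 3.0, 4.0]))
```

Threads are used here only to show the shape of the work; in standard Python they would not speed up this arithmetic. The point is structural: the work inside a single gradient step could be spread across machines, but the steps themselves must run in order, which is one reason iterative training can remain a bottleneck even on a cluster.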

Read more here.

photo credit: Flickr/ Let Ideas Compete
