
Mining Unstructured Data: The Silver Standard


by Angela Guess

Alex Woodie recently wrote in Datanami, “The traditional approach to mining unstructured data typically involves training machine learning models upon high-quality ‘gold standard’ data that’s been meticulously groomed. But thanks to innovations in deep learning, more insight may be extracted at less cost by training upon larger amounts of raw data, or what’s being called ‘silver standard’ data. This is the approach advocated by indico, a Boston startup that’s looking to make a mark in the burgeoning world of deep learning. The company is leveraging deep learning models running atop a combination of CPUs and GPUs to help customers in financial services and marketing to analyze large amounts of unstructured data, primarily text and images.”

Woodie continues, “There’s a tremendous amount of value hidden in unstructured data, including social media posts, news stories, legal documents, and other free sources of data. But as indico CEO Slater Victoroff explains, it can be very difficult to get useful insights out of these sources, such as performing sentiment analyses. ‘We found the majority of people are either not doing anything with unstructured data, or not doing nearly enough with it,’ Victoroff tells Datanami. ‘The main barrier to sentiment analysis is not making a better model. It’s getting more data.’ Indico claims it has come up with a better way to analyze unstructured data. While much of how the platform works is secret, key breakthroughs involve the combination of silver-standard corpora training data, as well as the use of transfer learning techniques to accelerate the training of its recurrent neural network (RNN) model.”

Read more here.

Photo credit: Flickr/Breibeest
