by Angela Guess
Lukas Biewald recently wrote for CrowdFlower, “Earlier this week, Google open-sourced TensorFlow. According to Matt Cutts, who has run Google’s spam algorithms for years, TensorFlow is essentially Google’s ‘secret-sauce.’ That said, Google clearly believes machine learning is incredibly important and is willing to invest billions in R&D. So why would they be willing [to] open source their core technology? It’s simple. Machine learning algorithms aren’t the secret sauce. The data is the secret sauce.”
Biewald goes on, “Companies don’t usually come out and say that data is the most important thing, but you can see it in their actions. In the past month, IBM completed a multibillion dollar acquisition of Merge Healthcare and announced the multibillion dollar acquisition of weather.com. Neither company has a business with obvious synergies with IBM, but both had giant, unique data sets that IBM wanted to use to train its algorithms. Notably, no machine learning algorithm company has been acquired for billions.”
He continues, “Google can safely open-source their core technology because without training data for their algorithms, you can’t build a search algorithm anywhere near as good as Google’s. And Google knows that no one can build a training data set as good as they have. By open-sourcing their algorithm they can count on the whole world to help make it efficient and powerful. But by keeping their data, they keep a large moat between them and their competitors.”
photo credit: Flickr/ Angelbattle bros