Will deep learning take us where we want to go? It’s one of the questions that Oxford University professor of Computational Linguistics Stephen Pulman will be delving into at this week’s Sentiment Analysis Symposium. There, he’ll be participating in a workshop session today on compositional sentiment analysis and giving a presentation tomorrow on bleeding-edge natural language processing.
“There is a lot of hype about deep learning, but it’s not a magic solution,” says Pulman. “I worry whenever there is hype about some technologies like this that it raises expectations to the point where people are bound to be disappointed.”
That’s not to imply, however, that important progress isn’t being made in deep learning, a family of machine learning methods based on learning representations of data, with applications ranging from NLP to computer vision to speech recognition.
Deep learning, for example, has its roots in the neural network tradition, which began with attempts to understand how a human brain could, in principle, perform the kinds of computations it manages. For that reason, “deep learning may possibly give insight into the kind of human computational process in a way that brute force machine learning algorithms really don’t,” Pulman notes. The way IBM’s Deep Blue plays chess, for example, tells you nothing about what’s going on in the human brain when a person plays chess.
There are also the gains that have been made in applying deep learning to Big Data analytics, where the scale and speed are beyond human capacity. “These technologies can deal with those vast amounts of data,” Pulman says. Importantly, one “computer science aspect of deep learning is that people have figured out how to parallelize a lot of the learning to take advantage of huge networks of processors. That means you can train on amounts of data that even five years ago wouldn’t have been possible.” Speech recognition systems, for example, are trained on more data than any human being would experience in several lifetimes, he comments.
Another promise of deep learning is that it can reduce, or in the best cases perhaps even eliminate, the hand-crafted, analyst-dependent feature engineering of most traditional machine-learning approaches to NLP. “Careful feature engineering has been a big component in the success of accurate systems,” Pulman says. In NLP, probably the most widely used application of deep learning relies on a pre-training step that builds up a large collection of correct and incorrect examples of language usage: one word in a correctly used sequence of words is replaced with a randomly chosen different word, and a classifier is trained to distinguish the genuine sequence from the corrupted one. The result is a vector that represents the properties of each word in terms of the contexts in which it occurs, “and if you use those input features for more sophisticated programs like parsing, the accuracy somehow increases,” he says.
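The example-building step Pulman describes can be sketched roughly as follows. This is a minimal illustration only, assuming a toy corpus and a three-word context window; the corpus, function name, and parameters are invented for the example, and a real system would generate millions of such pairs to train a neural network:

```python
import random

# Toy corpus standing in for the large text collection a real system would use.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))

def make_examples(tokens, window=3, rng=None):
    """Build (window, label) pairs: label 1 for a genuine window drawn
    from the corpus, label 0 for the same window with its centre word
    replaced by a randomly chosen different vocabulary word."""
    rng = rng or random.Random(0)
    centre = window // 2
    examples = []
    for i in range(len(tokens) - window + 1):
        win = tuple(tokens[i:i + window])
        examples.append((win, 1))                 # correct usage
        corrupted = list(win)
        corrupted[centre] = rng.choice(
            [w for w in vocab if w != win[centre]])
        examples.append((tuple(corrupted), 0))    # corrupted usage
    return examples

examples = make_examples(corpus)
```

In a full-scale system, pairs like these would train a classifier whose learned internal representation of each word becomes the context vector Pulman mentions, reusable as input features for downstream tasks such as parsing.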
“It’s not completely obvious why, but the assumption is that the pre-training homes in on regularities in the data that are not observable to the human analyst, and that deep learning algorithms can capitalize more efficiently on those regularities,” Pulman says.
So far, though, Pulman believes the deep learning approach has been more compelling in computer vision than in natural language processing. “There,” he says, “you can really see some multi-layer neural network systems learning features at successive layers of abstraction.”