By James Kobielus.
Machine Learning and other statistical algorithms are like muscles: how you train a statistical algorithm makes all the difference in how well it performs the task for which it was constructed. You could say training an algorithm is a bit like conditioning your muscles in a gym. It can only be considered fit for the specific function for which it has been trained. Typically its strength in a particular function—such as classifying customers for target marketing or predicting whether they will accept your offer—does not carry over to other tasks for which you might wish to repurpose it. For other uses, the algorithm, if not rejiggered and retrained, may be entirely useless.
The Data Science community hopes to someday perfect a sort of general Artificial Intelligence (AI) powered by a “master learning algorithm.” But if they’re going to achieve a breakthrough of that magnitude, they’re going to need to build agility into how statistical algorithms are trained and optimized. Is it possible to train data-driven AI algorithms so that they can achieve proficiency in new, unfamiliar tasks with as much agility and speed as a human being?
The non-profit institute OpenAI is working toward that end. As described in this recent article, the group has built a virtual, distributed platform, called Universe, for training the next generation of Machine Learning algorithms. As I pondered this initiative, I began to sense that the next generation of Machine Learning and other statistical algorithms will need to be trained in a multi-step process so that they can serve as modular building blocks in an agile, superintelligent, generalized AI environment.
Here’s what I mean:
- Templatize the training: Typically, Machine Learning projects focus on a standard training “template” associated with the type of neural network (e.g., recurrent, deep convolutional, long short-term memory) and learning style (e.g., supervised, unsupervised, semi-supervised, reinforcement). When supervised learning is involved, determining the model’s fitness for its function (e.g., classification) relies on training it with labeled data and adjusting weights between neural nodes. When unsupervised learning is involved, the training template involves using algorithmic cluster analysis to discern hidden structure in unlabeled data. When reinforcement learning is employed, the template involves training AI apps to maximize a reward function rather than, as with supervised and unsupervised learning, to minimize a loss function.
- Productionize the training: Once they’ve been trained in a templatized manner, AI apps should next be trained to approximate how humans would execute those same application functions in practice. In this way, Machine Learning agents can be readied for production deployment as fully automated intelligent application resources, such as chatbots, for a specific function. Per the approach enabled through Universe, this could involve using reinforcement learning to benchmark apps’ automated performance against training data collected from actual human interactions within the same application environment. Universe enables AI apps to learn by observing how humans interactively accomplish a task within any app that has been built with a clear reward function: a Universe-plugged AI app can virtually “see” app-screen pixels and issue commands through a virtual keyboard and mouse, thereby shadowing every interaction step that a human user might make in the same application. Universe’s virtual network computing (VNC) environment sends updated screenshots back to AI apps, enabling them to execute each scenario’s next step much as a human user would. Human-user sessions are recorded in Universe to provide interaction-sourced training data for benchmarking AI app performance.
- Generalize the training: Finally, once a function-specific AI app has been trained through the templatized and productionized approaches outlined above, its knowledge should be further generalized through a training technique called “transfer learning.” This is necessary if an AI agent is to serve as a modular component in a generalized AI environment, which may be required to make data-driven decisions outside its core scope. Universe facilitates transfer learning by enabling developers to train algorithmically driven games, websites, and other applications in such a way that each new app can tap into the statistical knowledge built up from training prior apps that may have been built for different purposes. By “statistical knowledge,” I’m referring to the feature representations, neural-node layering, weights, learning rate, loss function, and other properties of a prior model that has been trained successfully for a particular purpose. In Universe, AI agents accomplish this by flexibly hopping between games and other apps, taking the statistical knowledge they’ve gained in one app and applying it to their ability to execute other apps effectively.
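To make the first step concrete, here is a minimal sketch of the supervised training “template”: a toy logistic classifier fit to labeled data by gradient descent, minimizing a loss function. (The unsupervised template would instead cluster unlabeled data; the reinforcement template would maximize a reward rather than minimize a loss.) The data and hyperparameters here are invented for illustration, not drawn from any specific project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled data: two Gaussian blobs, e.g. customers who accept an
# offer (label 1) vs. those who do not (label 0).
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

w, b = np.zeros(2), 0.0

def loss(w, b):
    p = 1 / (1 + np.exp(-(X @ w + b)))          # sigmoid predictions
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # cross-entropy

initial = loss(w, b)
for _ in range(200):                            # gradient-descent training loop
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(y)           # adjust weights to reduce loss
    b -= 0.1 * np.mean(p - y)

print(initial, loss(w, b))                      # loss should fall with training
```

As the article notes, this fitness is narrow: the learned weights encode one classification task and say nothing about any other.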
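The productionize step follows an observe/act loop: the platform streams screen pixels to the agent, which replies with keyboard/mouse actions and receives a reward. The toy environment below is a hypothetical stand-in written for this sketch (Universe’s real API wrapped OpenAI Gym environments; none of these class or method names are Universe’s own).

```python
import numpy as np

class ToyScreenEnv:
    """Hypothetical stand-in for a Universe-style VNC environment."""
    def __init__(self, steps=10):
        self.steps, self.t = steps, 0

    def reset(self):
        self.t = 0
        return np.zeros((4, 4))                 # initial "screenshot" observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == "click" else 0.0    # a clear reward function
        done = self.t >= self.steps
        return np.full((4, 4), float(self.t)), reward, done  # next frame

env = ToyScreenEnv()
obs, total, done = env.reset(), 0.0, False
while not done:
    action = "click"                            # a real agent chooses from pixels
    obs, reward, done = env.step(action)
    total += reward                             # RL maximizes cumulative reward
print(total)
```

Benchmarking then amounts to comparing the agent’s cumulative reward in such a loop against the reward earned by recorded human sessions in the same environment.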
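The generalize step, transfer learning, can be sketched in miniature: keep the feature-extraction weights carried over from prior training frozen, and fit only a new output layer for the new task. A real Universe agent would transfer far richer statistical knowledge (layered representations, learned weights); here the shared hidden layer is simply a fixed random projection, an assumption made to keep the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)
W_hidden = rng.normal(size=(2, 8))              # "prior-model" feature layer, frozen

def features(X):
    return np.tanh(X @ W_hidden)                # shared representation across tasks

def train_head(X, y, epochs=300, lr=0.5):
    """Fit only a fresh output layer on top of the frozen features."""
    H = features(X)
    w, b = np.zeros(H.shape[1]), 0.0
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(H @ w + b)))
        w -= lr * H.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# A "new task": different labeled data, same reused feature layer.
X = np.vstack([rng.normal(-1, 0.5, (40, 2)), rng.normal(1, 0.5, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
w, b = train_head(X, y)
acc = np.mean(((features(X) @ w + b) > 0) == (y == 1))
print(acc)
```

Because only the small output layer is retrained, adaptation to the new task is far cheaper than training from scratch, which is the practical payoff of transferring statistical knowledge between apps.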
It remains to be seen whether Universe will gain broad adoption in the Machine Learning community, but OpenAI’s initial results in achieving transfer learning among disparate gaming apps appear encouraging. Likewise, it’s not yet clear that this approach will be successful in bringing transfer learning more fully into the development of a wider range of Machine Learning, Deep Learning, and other AI-driven apps (e.g., streaming, collaboration, e-commerce, mobility, industrial, and Internet of Things). Furthermore, transfer learning is not yet a widely adopted approach or discrete set of practices that are well understood among mainstream Data Scientists, so it may take some time for them to warm up to the need to engineer it into their algorithm training procedures.
Nevertheless, development of the long-sought master learning algorithm will need to exploit a progressively layered training regimen along the lines that I’ve presented. Without a common framework for sharing and reusing the statistical knowledge that’s locked up in scattered AI projects, the vision of generalized AI can never come to fruition.