By Paul Barba.
Intuition is a core component of human intelligence. But it doesn’t always send us in the right direction. Anyone versed in mathematics will know that sometimes the world behaves opposite to expectations. Things that seem normal or given can actually be outliers. Outliers are the norm.
Take the physical dimensions of pilots. In The End of Average, Todd Rose recounts an anecdote about a 1950s US Air Force investigation into plane crashes. Ten physically relevant dimensions were measured on over 4,000 pilots so the cockpit could be redesigned to fit them well. If a pilot’s arms were too short, they might not be able to reach an important lever; if too long, they might catch on something. The expectation was that most pilots would be average.
But fewer than 4% of pilots were average across even three of the ten dimensions, and that already low percentage dropped further with each additional dimension. No one was close to uniformly average. The anticipated average became meaningless: everyone was an outlier. This is what we call the “curse of dimensionality”: the more dimensions you add, the less “average” anyone – or anything – is.
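A toy simulation makes this concrete. This is not the Air Force data: it assumes each dimension is an independent, normally distributed measurement, and (as a stand-in for the study’s criterion) calls a pilot “average” on a dimension if they fall in the middle 30% of that distribution. The fraction of uniformly average pilots collapses as dimensions are added:

```python
import random

random.seed(0)

# For a standard normal distribution, the middle 30% of probability
# mass lies roughly within +/-0.385 standard deviations of the mean.
CUTOFF = 0.385

def fraction_average(n_pilots, n_dims):
    """Fraction of simulated pilots who are 'average' on every dimension."""
    count = 0
    for _ in range(n_pilots):
        measurements = [random.gauss(0, 1) for _ in range(n_dims)]
        if all(abs(m) <= CUTOFF for m in measurements):
            count += 1
    return count / n_pilots

for dims in (1, 3, 10):
    print(dims, fraction_average(100_000, dims))
```

With one dimension, about 30% of simulated pilots are “average”; with three, roughly 3%; with ten, effectively none – mirroring the Air Force’s finding that no pilot was average across the board.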
The usual approach to dealing with a class of things is to reason about how its average members behave. But what about when there are no average members?
Most problems in Machine Learning involve enormous numbers of variables. Take written language as an example. Even a single paragraph of text involves the interplay of hundreds of words, and we can be dead certain a great many of them will be used in “non-average” ways. For every “obviously” negative word (e.g. “dead”), there’s a slew of counter-uses (e.g. “dead certain”).
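A toy word-level sentiment scorer (purely illustrative, with a made-up three-word lexicon) shows why: it scores any sentence containing “dead” as negative, so it misreads the intensifier in “dead certain” exactly as the paragraph above warns:

```python
# Hypothetical mini-lexicon for illustration only.
NEGATIVE_WORDS = {"dead", "awful", "terrible"}

def naive_sentiment(text):
    """Score text by counting negative lexicon hits, ignoring context."""
    return -sum(1 for word in text.lower().split() if word in NEGATIVE_WORDS)

print(naive_sentiment("the battery is dead"))     # -1: plausibly correct
print(naive_sentiment("we can be dead certain"))  # -1: wrong, "dead" intensifies here
```

Because the scorer only knows the “average” use of each word, every contextual, non-average use becomes an error.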
As Machine Learning has evolved, we’ve moved from carefully quantifying the world and logically reasoning about it to shoveling data into Deep Learning algorithms and watching as mystifying but effective decision-making networks grow. We’ve gone from drawing conclusions about averages in a handful of dimensions to considering them in countless dimensions.
As a result, those “averages” become less meaningful. They tell you almost nothing about individual instances, because every instance is an outlier. Perhaps there is a prototypical example somewhere, a turn of phrase that’s a true exemplar. But it’s almost impossible to pinpoint – because everything is unique.
We must take a new perspective on what we’re doing with Machine Learning. We need to step away from the notion of “average” or “normal” or “prototypical” – and away from the idea that we can simply produce the “answer” to a single problem. When you’re swimming in variables, even the problem itself becomes vague and complex.
We shouldn’t be designing models that solve a single “problem.” We should be designing models that solve a class of infinitely variable problems.
What does this look like? Let’s say that I’m writing a model to identify company names in text. I’m not just trying to quantify exactly what a company name is. I’m creating software to solve unique company name problems. It’s a subtle distinction, but an important one that will underlie further advances in AI as we shift from solving individual problems to solving classes of problems.
To return to the Air Force pilot example: the key insight was that no single cockpit could comfortably fit more than a small handful of humans. The problem of finding one optimal configuration was itself flawed. Instead, the Air Force had to build an adjustable cockpit: every important dimension could be changed when one pilot left the plane and another got in. Building in that degree of configurability was difficult, but it ultimately solved the problem. With Machine Learning, we likewise need to develop systems that constantly re-contextualize themselves: figuring out the context of the specific instance of the problem they’re trying to solve, and solving precisely that, rather than trying to predict how an average context would have behaved.