Does AI Truly Learn And Why We Need to Stop Overhyping Deep Learning
Source: Kalev Leetaru
AI today is described in breathless terms as computer algorithms that use silicon incarnations of our organic brains to learn and reason about the world, intelligent superhumans rapidly making their creators obsolete. The reality could not be further from the truth. As deep learning moves from the lab into production use in mission critical fields from medicine to driverless cars, we must recognize that these systems are nothing more than piles of software code and statistics, not the learning and thinking intelligences we describe them as.
Every day data scientists build machine learning algorithms to make sense of the world and harness large piles of data into marketable insights. As guided machine assistance tools, they operate much like the large classical observation equipment of the traditional sciences, software microscopes and telescopes onto society. However, a physicist does not proclaim that their analysis software is alive and thinking its own thoughts about the universe. They list the algorithm they used to analyze their dataset and talk about how and why it surfaced a new finding. To them, no matter how advanced the software, it is still strictly a statistical algorithm that found a pattern in data through statistics and programming.
In contrast, data scientists all too often treat their algorithmic creations as if they were alive, proclaiming that their algorithm “learned” a new task, when it merely induced a set of statistical patterns from a hand-picked set of training data, under the direct supervision of a human programmer who chose which algorithms, parameters and workflows to use to build it.
Algorithms that use statistics to extrapolate from known training data to create unexpected outcomes are proclaimed to have “created” something new and are immediately labeled as superhuman entities sure to bring about the end of human life as we know it. Why is a passage of text “created” by a neural network any different from one “generated” through a classical probability model?
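To make the comparison concrete, here is a minimal sketch (a toy illustration, not any system referenced in this article) of a classical bigram Markov chain that “generates” novel text purely from word-transition counts, with no neural network anywhere in sight:

```python
import random

# A classical probability model: count which word follows which
# in a tiny toy corpus, then sample new text from those counts.
corpus = "the cat sat on the mat the cat ate the fish".split()

transitions = {}
for prev, nxt in zip(corpus, corpus[1:]):
    transitions.setdefault(prev, []).append(nxt)

def generate(start, length, seed=0):
    """Sample a short word sequence from the bigram transition table."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = transitions.get(words[-1])
        if not followers:
            break
        words.append(rng.choice(followers))
    return " ".join(words)

print(generate("the", 6))
```

The output is “new” in exactly the sense claimed for neural text generators: a recombination of statistical patterns induced from training data.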
Generative adversarial algorithms are described as superhuman titans battling through more outcomes in a few hours than all of humankind has since the dawn of history. Why are paired neural networks described in terms of humanlike qualities when paired classical adversarial refinement algorithms are seen merely as adjusting their parameters?
Most dangerously, we take successful algorithms and assign stories to their successes that extrapolate far beyond what they actually did. A neural network that correctly distinguishes one breed of dog from others is said to have “learned” the innate biological characteristics of that breed. In reality it may merely have noticed that all examples of that breed wore red collars in the training dataset. In fact, the underlying neural network doesn’t actually understand what a “dog” or a “breed” or “red” or a “collar” is. It merely associates specific spatial groupings of colors and textures with particular strings of text. Stray too far from the examples it has seen in the past and it fails, with disastrous consequences if it is screening for cancer or driving a car with human passengers.
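The red-collar failure mode can be shown in a few lines. The following is a hedged toy sketch (a one-feature decision stump, not a real vision model): because every training photo of the target breed happens to include a red collar, the collar, not the dog, is what gets “learned,” and a genuine member of the breed photographed without one is misclassified:

```python
# Toy training set: each example is (features, label), label 1 = target breed.
# The "red_collar" feature spuriously agrees with the label on every example.
train = [
    ({"pointy_ears": 1, "red_collar": 1}, 1),
    ({"pointy_ears": 0, "red_collar": 1}, 1),
    ({"pointy_ears": 1, "red_collar": 1}, 1),
    ({"pointy_ears": 1, "red_collar": 0}, 0),
    ({"pointy_ears": 0, "red_collar": 0}, 0),
    ({"pointy_ears": 0, "red_collar": 0}, 0),
]

def best_feature(examples):
    """Pick the single feature whose value most often agrees with the label."""
    features = examples[0][0].keys()
    def accuracy(f):
        return sum(x[f] == y for x, y in examples) / len(examples)
    return max(features, key=accuracy)

chosen = best_feature(train)   # the spurious cue wins with 100% training accuracy
predict = lambda x: x[chosen]

print(chosen)                                         # red_collar
print(predict({"pointy_ears": 1, "red_collar": 0}))   # 0: the breed, misclassified
```

Nothing here knows what a dog, a breed, or a collar is; the model simply latched onto whichever pattern best separated the training data.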
Few AI image classifiers are capable of genuine reasoning: the ability to take an image of an entirely unknown artificial object and work out what it might be from its high order semantic characteristics, like the presence of a battery, an LED, a driving circuit and an oversized switch. Today’s deep learning algorithms see just combinations of shapes and textures; they do not recognize the presence of a battery as indicating a portable power source, or its combination with an LED, converter and switch as suggestive of a flashlight. Some systems can learn basic associations that approximate aspects of this descriptive process, but current systems are incapable of true high order reasoning about the world around them.
A neural network of today no more “learns” or “reasons” about the world than a linear regression of the past. They merely induce patterns through statistics. Those patterns may be more opaque, more mediated and more automatic than historical approaches, and capable of representing more complex statistical phenomena, but they are still merely mathematical incarnations, not intelligent entities, no matter how spectacular their results.
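The kinship with linear regression is not just rhetorical. A single “neuron” with identity activation, trained by gradient descent, is mathematically nothing but linear regression. This toy sketch (hypothetical data, generated from y = 2x + 1) shows such a “network” simply recovering the line:

```python
# Data drawn from the line y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(-5, 6)]

w, b = 0.0, 0.0          # the "network's" two parameters: weight and bias
lr = 0.01                # learning rate
for _ in range(2000):    # plain batch gradient descent on squared error
    grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # recovers roughly 2.0 and 1.0
```

Whether we call this “training a neural network” or “fitting a regression,” the machine has done the same thing: found the statistical parameters that best fit a pile of numbers. No understanding is involved at either name.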
Whether neural network, Naïve Bayes or simple linear regression, data scientists train their machine learning models on carefully constructed piles of training examples, then claim their algorithms have “learned” about the world. Yet machine learning is in reality merely another form of machine instruction, different from purely expert manual coding of rules, but still guided, with the algorithms and workflows manually tuned for each application.
Why does this matter? It matters because as we increasingly deploy AI systems into mission critical applications directly affecting human life, from driverless cars to medicine, we must understand their very real limitations and brittleness in order to properly understand their risks.
Putting this all together, in the end, when we ascribe our own aspirations to mundane piles of code, anthropomorphizing them into living, breathing silicon humans rather than statistical representations of patterns in data, we lose track of their very real limitations and think in terms of utopian hyperbole rather than the real risk calculus needed to ensure their safe and robust integration into our lives.