Not much time passes these days between so-called major advancements in artificial intelligence. Yet researchers are not much closer than they were decades ago to the big goal: actually replicating human intelligence. That’s the most surprising revelation by a team of eminent scholars who just released the first in what is meant to be a series of annual reports on the state of AI.
The report is a great opportunity to finally recognize that the current methods we now know as AI and deep learning do not qualify as “intelligent.” They are based on the “brute force” of computers and limited by the quantity and quality of available training data. Many experts agree.
The steering committee of “AI Index, November 2017” includes Stanford’s Yoav Shoham and Massachusetts Institute of Technology’s Erik Brynjolfsson, an eloquent writer who did much to promote the modern-day orthodoxy that machines will soon displace people in many professions. The team behind the effort tracked the activity around AI in recent years and found thousands of published papers (18,664 in 2016), hundreds of venture capital-backed companies (743 in July 2017) and tens of thousands of job postings. It’s a vibrant academic field and an equally dynamic market (with the number of U.S. startups in the space increasing by a factor of 14 since 2000).
All this concentrated effort cannot help but produce results. According to the AI Index, the best systems surpassed human performance in image detection in 2014 and are on their way to 100% results. Error rates in labeling images (“this is a dog with a tennis ball”) have fallen to less than 2.5% from 28.5% in 2010. Machines have matched humans when it comes to recognizing speech in a telephone conversation and are getting close to parsing the structure of sentences, finding answers to questions within a document and translating news stories from German into English. They have also learned to beat humans at poker and Pac-Man. But, the authors of the index wrote:
Tasks for AI systems are often framed in narrow contexts for the sake of making progress on a specific problem or application. While machines may exhibit stellar performance on a certain task, performance may degrade dramatically if the task is modified even slightly. For example, a human who can read Chinese characters would likely understand Chinese speech, know something about Chinese culture and even make good recommendations at Chinese restaurants. In contrast, very different AI systems would be needed for each of these tasks.
The AI systems are such one-trick ponies because they’re designed to be trained on specific, diverse, huge datasets. It could be argued that they still exist within philosopher John Searle’s “Chinese Room.” In that thought experiment, Searle, who doesn’t speak Chinese, is alone in a room with a set of instructions, in English, on correlating sets of Chinese characters with other sets of Chinese characters. Chinese speakers slide notes in Chinese under the door, and Searle pushes his own notes back, following the instructions. The people outside can be fooled into thinking his replies are intelligent, but Searle understands none of what he reads or writes. He devised the “Chinese Room” argument in 1980, and it has drawn dozens of replies and attempted rebuttals since. But modern AI is still working in a way that fits his description.
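The room’s mechanics can be reduced to a few lines of Python. This is only an illustrative sketch: the rule book entries, the fallback reply and the function name `room_reply` are invented for the example and come from neither Searle nor the AI Index.

```python
# Hypothetical rule book: pairs of Chinese notes, matched symbol for symbol.
# Nothing in the program knows what any of these strings mean.
RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",
    "你会说中文吗？": "会，当然。",
}

# Hypothetical reply used when no rule matches the incoming note.
FALLBACK = "请再说一遍。"


def room_reply(note: str) -> str:
    """Return the reply the rule book dictates for an incoming note."""
    return RULE_BOOK.get(note.strip(), FALLBACK)


if __name__ == "__main__":
    # An exact match produces a fluent-looking answer.
    print(room_reply("你好吗？"))
    # Dropping the question mark is enough to miss the rule entirely.
    print(room_reply("你好吗"))
```

The second call hints at the brittleness the index authors describe: change the input slightly and the apparent competence disappears, because there was only ever a lookup.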
Machine translation is one example. Google Translate, which has drastically improved since it started using neural networks, trains the networks on billions of lines of parallel text in different languages, translated by humans. Where lots of these lines exist, Google Translate does OK — about 80% as well as an expert human. Where the data are lacking, it produces hilarious results. I like putting in Russian text and telling Google Translate it’s Hmong. The results, in English or Russian, will often be surprising — like the pronouncements found inside fortune cookies.
I doubt this is accidental. There are probably not many legitimate calls for translations from Hmong, so idle tricksters must have helped train Google’s translation machine to produce various kinds of exquisite nonsense.
Researchers are trying to overcome the data insufficiency problem. Two recently published papers show how machine translation can work based on monolingual datasets, using the statistical likelihood of certain words being grouped together. The quality is not as good as with bilingual training data, but it’s still not complete nonsense and workable in a pinch. These are, however, mere crutches that don’t change the general brute force approach.
Solving complex tasks requires ever more power and ever more data. A computer beat humans at Othello the year Searle wrote about the Chinese Room and at poker this year, but that’s a quantitative leap rather than a qualitative one.
This kind of “artificial intelligence” continues to be a promising line of both research and business while there are growing quantities of “big data” to parse. Kai-Fu Lee of Chinese investment firm Sinovation Ventures, one of the experts who contributed essays to AI Index 2017, wrote that China was competitive against the U.S. in artificial intelligence because it generates oodles of data:
In China, people use their mobile phones to pay for goods 50 times more often than Americans. Food delivery volume in China is 10 times more than that of the US. It took bike-sharing company Mobike 10 months to go from nothing to 20 million orders (or rides) per day. There are over 20 million bicycle rides transmitting their GPS and other sensor information up to the server, creating 20 terabytes of data everyday. Similarly, China’s ride-hailing operator Didi is reported to connect its data with traffic control in some pilot cities. All of these Internet-connected things will yield data that helps make existing products and applications more efficient and enable new applications we never thought of.
The data dependence, however, isn’t great for AI’s future development. A backlash against the limitless data collection is gathering strength in the West; nation states are putting up barriers to data sharing; the weaponization of datasets to produce intentionally flawed results and flawed responses to them is not far off. And it’ll be far harder to detect than, for example, the weaponization of social networks by Russian information warriors has been.
Meanwhile, the AI Index estimates that modern machines’ capacity for common sense reasoning is far less than that of a 5-year-old child. Hardly any progress is being made in that area, and what progress there is remains hard to quantify.
An increasing capacity for data crunching can be both helpful and dangerous to humans. It isn’t, however, a game changer. And it’s up to us to keep this branch of computer science in its place by only giving it as much data as we’re comfortable handing over — and only using it for those applications in which it can’t produce dangerously wrong results if fed lots of garbage. The technology itself is not the kind that can push us away from the controls — entirely new approaches would be necessary to create that threat.
By Leonid Bershidsky, a Bloomberg View columnist. He was the founding editor of the Russian business daily Vedomosti and founded the opinion website Slon.ru. This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.