Danielle discovered, to her horror, that part of a conversation she had with her husband had been recorded and sent without her knowledge to someone whose name was in a contacts list accessible to “Alexa,” the fictive assistant inside Amazon Echo smart speakers.
Amazon investigated, and reported the following on Thursday, May 24:
Echo woke up due to a word in background conversation sounding like “Alexa.” Then, the subsequent conversation was heard as a “send message” request. At which point, Alexa said out loud “To whom?” At which point, the background conversation was interpreted as a name in the customer’s contact list. Alexa then asked out loud, “[contact name], right?” Alexa then interpreted background conversation as “right.”
As unlikely as this string of events is, we are evaluating options to make this case even less likely.
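Set out in code, the flimsiness of that chain is easier to see. Here is a toy reconstruction of the four-stage dialogue Amazon describes; the stage names, the matching rule, and the sample chatter are all my invention, not anything Amazon has disclosed:

```python
# A toy reconstruction of the four-stage dialogue Amazon describes.
# Stage names, the matching rule, and the sample fragments are invented
# for illustration; this is not Amazon's code.

STAGES = [
    ("wake word",    ["alexa"]),
    ("intent",       ["send a message", "send message"]),
    ("contact",      ["dan", "danielle"]),      # names from a contact list
    ("confirmation", ["right", "yes", "yeah"]),
]

def sounds_like(heard, phrases):
    """Crude stand-in for acoustic matching: substring overlap."""
    return any(p in heard.lower() for p in phrases)

def run_dialogue(fragments):
    """Feed successive fragments of overheard talk to each stage."""
    for (stage, phrases), heard in zip(STAGES, fragments):
        if not sounds_like(heard, phrases):
            return f"abandoned at the {stage} stage"
        # No stage ever asks the one question a human would ask:
        # is anyone actually addressing the device?
    return "message sent"

# What the recognizer made of four fragments of background conversation,
# none of them addressed to the device:
overheard = ["alexa", "send a message", "dan", "right"]
print(run_dialogue(overheard))  # -> message sent
```

Each stage is satisfied by any fragment that sounds close enough, and a weak match at each one is treated as assent.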
A human being would have been able to tell in a hundred different common-sense ways that a husband and wife chatting to each other about hardwood floors are not trying to send a voice message to a workmate. The Amazon Echo doesn’t have an iota of that common sense.
The absolute probability of this sort of fluke triggering of a smart speaker is hard to estimate. But let me just testify, anecdotally, that on the evening of April 17, as I chatted with friends in my neighbors’ apartment, their Apple HomePod woke up and started quietly saying something to us (I didn’t catch it, and whether it sent any messages or ordered any creamed corn is not known; its owners didn’t seem worried).
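Hard to estimate, yes, but a back-of-the-envelope calculation shows why “unlikely” is cold comfort at scale. Every figure below is an assumption I have invented, not a measured error rate; the point is only that four small probabilities multiplied together, then multiplied by tens of millions of always-listening devices, yield a steady trickle of accidents:

```python
# Back-of-the-envelope only: every figure here is an invented assumption.
p_wake    = 1e-4   # chance a fragment of chatter wakes the device
p_intent  = 1e-2   # chance the next fragment parses as "send message"
p_contact = 1e-2   # chance a fragment matches some contact's name
p_confirm = 5e-2   # chance a fragment is heard as assent

p_cascade = p_wake * p_intent * p_contact * p_confirm  # 5e-10 per fragment

fragments_per_day = 1_000    # fragments a device overhears daily (assumed)
devices           = 30e6     # installed base, order of magnitude (assumed)

per_year = p_cascade * fragments_per_day * devices * 365
print(f"roughly {per_year:,.0f} accidental messages a year")  # ~5,475
```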
What’s going wrong in the industry — terribly wrong, I think — is that attempts are being made to link front ends based on natural-language processing (NLP) to back ends based on artificial intelligence (AI) in exactly the wrong sorts of domains, before either the NLP or the AI is anywhere near ready for prime time (let alone Amazon Prime time).
Natural-language processing is not ready because the science just isn’t there to support the engineering. Brilliant software expertise has gotten us this far by faking it with heavy use of probability computations rooted in gigantic quantities of data, but it is not backed up by reliable linguistics. We simply don’t know how to simulate natural-language understanding in a serious way. And the AI is not ready because without radical domain limitations its task is outright impossible.
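To make “faking it with probability” concrete: a recognizer does not understand a word; it simply prefers whatever word sequence was frequent in its training data. A toy bigram model (corpus and all invented for the example; real systems are vastly larger, but the trick is the same) shows the mechanism:

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# A toy corpus standing in for "gigantic quantities of data."
corpus = ("alexa send a message to dan "
          "alexa send a message to mom "
          "we picked a hardwood floor for the den").split()

bigrams  = Counter(pairwise(corpus))
unigrams = Counter(corpus)

def score(sentence):
    """Product of conditional bigram frequencies: counts, not comprehension."""
    p = 1.0
    for a, b in pairwise(sentence.split()):
        p *= bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0
    return p

# Given acoustically ambiguous audio, the recognizer picks whichever
# transcript its counts favor; meaning never enters into it.
print(score("alexa send a message"))  # nonzero: frequent in the "data"
print(score("a lexer sent my sage"))  # zero: unseen, however it sounded
```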
The applications that I think should be attempted, given the state of the art, are modest uses of NLP that if successful will benefit users and if unsuccessful cannot possibly cause damage, harm, embarrassment, or expense. Natural-language question-answering systems to replace FAQ lists would be an example. Purchase of nonrefundable airline tickets to Oakland or Auckland would not.
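To show what I mean by a safe application, here is a bag-of-words FAQ matcher, sketched with invented questions and an invented threshold. If it matches wrongly, the user gets an irrelevant answer and shrugs; nothing is bought, sent, or recorded:

```python
# Minimal FAQ matcher. Questions, answers, and threshold are invented;
# what matters is the failure mode, which costs the user nothing.

FAQ = {
    "how do i reset my password": "Use the 'Forgot password' link.",
    "how do i cancel my subscription": "Go to Settings, then Billing.",
    "do you offer refunds": "Refunds are available within 30 days.",
}

def overlap(q1, q2):
    """Jaccard similarity over word sets: crude, but safely crude."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    return len(a & b) / len(a | b)

def answer(question, threshold=0.3):
    best_answer, best_score = max(
        ((ans, overlap(question, q)) for q, ans in FAQ.items()),
        key=lambda pair: pair[1],
    )
    return best_answer if best_score >= threshold else "Sorry, no idea."

print(answer("how can I reset the password"))  # -> the password answer
```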
The sending of spoken-word messages containing overheard conversations should certainly not be under the control of an AI system triggered by something as uncertain as speech recognition.
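The remedy is not mysterious, and it can be stated in a dozen lines: gate every irreversible action on both high recognition confidence and an explicit, unmistakable confirmation. The intent names and threshold below are hypothetical; what matters is the guard itself:

```python
# Hypothetical guard for irreversible actions triggered by speech.
# Intent names and the confidence threshold are invented for illustration.

IRREVERSIBLE = {"send_message", "place_order"}
MIN_CONFIDENCE = 0.95

def execute(intent, confidence, explicitly_confirmed):
    """Refuse risky intents unless recognition was confident AND the user
    gave a deliberate confirmation, not a stray 'right' across the room."""
    if intent in IRREVERSIBLE and (
        confidence < MIN_CONFIDENCE or not explicitly_confirmed
    ):
        return "refused: please confirm explicitly"
    return f"executing {intent}"

# Overheard chatter yields low confidence and no deliberate confirmation:
print(execute("send_message", confidence=0.62, explicitly_confirmed=False))
```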
Amazon, Apple, Google, Microsoft, and other companies (like Alibaba and Baidu in China) are trying to do unrestricted understanding of unrehearsed spontaneous speech in everyday contexts (which basically guarantees a high failure rate) and apply it to the riskiest domains you can think of (sending messages to names plucked out of a contact list, making purchases from a supplier). Few will listen when I urge caution: NPR and Edison Research estimate that 16 percent of the over-18 population owns a smart speaker already. But for those who will heed me, my advice would be this: do not try this at home. Not for about 20 to 30 years, anyway.
Arthur C. Clarke figured that by 2001 we would have high-quality vocal NLP connected to AI (and scheduled Pan Am flights to the moon with drinks service). By the time I came to know a bit about computational linguistics in the 1980s, I was more pessimistic, but I did think that maybe by 2010 we might have functional NLP. I was not pessimistic enough. It’s 2018 and we are not even close. NLP is still unable to replicate the linguistic competence of a 5-year-old.
And AI cannot even simulate the powers of the average insect.
It is good to see enthusiastic attempts being made at applying NLP and AI; we’ll never perfect such technologies if practical products are not on the drawing board. But never forget that the technology is not up to it yet. Early adopters proceed at their peril.