As anyone who has ever had the ‘pleasure’ of typing “I want to speak to a human” into a chatbot knows, natural language processing still has a way to go before we’re faced with Turing-test-passing robots. While we’ve made massive strides toward intelligent computing, language remains one of the most stubborn hurdles to truly intelligent machines.
Recently, NewtonX hosted a talk in which the heads of NLP at two competing tech giants discussed the latest progress and challenges in applied NLP. Both experts began by structuring the debate around the two aspects of how machines understand and interpret human language. The first is transcription: the machine converts spoken language into a verbatim transcript. The second aspect, which is the focus of this article, consists of the machine parsing intent from a perfect transcript, mapping that intent to a knowledge graph, and finally using category identification to output a response to the user’s query. It is this aspect of NLP that enables machines to “understand” human language and respond to it accordingly. But if we already know how to do this, why do our phones, Alexa, and online chatbots so often have absolutely no idea what we are trying to tell them?
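The three steps described above — parse intent, map it to a knowledge graph, pick a response category — can be sketched in a toy Python example. Everything here (the intents, the keywords, the mini knowledge graph, the category names) is invented for illustration; it is not how any production system from the talk actually works:

```python
# Toy intent pipeline: parse intent from a transcript, look it up in a
# tiny "knowledge graph", and answer with a category-tagged response.
# All intents, keywords, and answers are invented for illustration.

KNOWLEDGE_GRAPH = {
    "reset_password": {"category": "account",
                       "answer": "Visit Settings > Security to reset it."},
    "order_status":   {"category": "orders",
                       "answer": "Your order ships tomorrow."},
}

INTENT_KEYWORDS = {
    "reset_password": {"reset", "password"},
    "order_status":   {"order", "status", "shipped"},
}

def parse_intent(transcript):
    """Pick the intent whose keywords overlap the transcript most."""
    words = set(transcript.lower().split())
    best = max(INTENT_KEYWORDS, key=lambda i: len(words & INTENT_KEYWORDS[i]))
    return best if words & INTENT_KEYWORDS[best] else None

def respond(transcript):
    intent = parse_intent(transcript)
    if intent is None:
        # The all-too-familiar fallback when intent parsing fails.
        return "Sorry, I didn't understand that."
    node = KNOWLEDGE_GRAPH[intent]
    return f"[{node['category']}] {node['answer']}"

print(respond("I need to reset my password"))
```

Real systems replace the keyword overlap with learned models, but the shape of the pipeline is the same.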
Don’t Call It NLP If It’s A Decision Tree
Many of the chatbots you interact with today are built on a decision-tree structure. In other words, you either type a keyword or choose a menu option within the chat and are led down a path in which each decision cues an outcome. “Most of these chatbots have such limited NLP that I would actually not call it NLP,” one of our experts said. He then gave an illustration using the Domino’s Pizza chatbot. This bot can understand a set of pre-defined keywords that fit into one of its pre-programmed paths, but cannot understand requests that fall outside the paths defined by the Domino’s customer service team. If, for instance, you ask it what payment methods it accepts, it cannot respond, because it doesn’t address payment until you have already entered your location and order.
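A decision-tree bot of this kind fits in a few lines. This sketch (with an invented ordering flow, not Domino’s actual one) shows both the mechanism and the failure mode the expert describes: input that isn’t one of the current node’s options simply goes nowhere.

```python
# Minimal decision-tree "chatbot": each node offers fixed options, and the
# user's choice selects the next node. Nothing here parses free-form language.
# The flow is invented for illustration.

TREE = {
    "start": {"prompt": "What would you like? (order/track)",
              "options": {"order": "size", "track": "track"}},
    "size":  {"prompt": "Which size? (small/large)",
              "options": {"small": "done", "large": "done"}},
    "track": {"prompt": "Enter your order number.", "options": {}},
    "done":  {"prompt": "Order placed!", "options": {}},
}

def step(node, user_input):
    """Advance one step; unrecognized input just keeps you at the same node."""
    return TREE[node]["options"].get(user_input.lower().strip(), node)

node = "start"
node = step(node, "what payment do you accept?")  # not an option: stays at "start"
node = step(node, "order")                        # moves to "size"
node = step(node, "large")                        # moves to "done"
print(TREE[node]["prompt"])
```

The payment question isn’t misunderstood so much as never examined: it is not a key in the current node’s options, so the bot cannot react to it at all.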
Our second expert put this point in perspective by arguing that these “simple” chatbots work for most situations, and often work better than humans for these precise use cases. Insurance chatbots, for instance, offer a mobile- and tablet-friendly format for what is essentially choosing among options: being led through a path to resolution, one choice at a time. More holistic, NLP-based chatbots, however, are less ubiquitous. One of our two experts recalled that after the incidents of IBM and Microsoft chatbots going rogue, developers, and more importantly the employers financing these expensive programs, are wary of allowing chatbots to learn from human interactions without parameters. Accidents such as Tay posting racist comments can occur rapidly, and their impact on brand equity can be dramatic. This is why customer-facing NLP remains very conservative and mostly decision-tree-based.
That said, there are enterprise applications for NLP-based bots actively deployed today. Both experts report that the chatbot with the highest volume of interactions today is the one developed by Alibaba, the Chinese e-commerce giant. Its robo-assistant, AliMe, handles customer service and can also act as a personal shopper. AliMe functions much like a search engine, helping users discover what they’re looking for in a mobile-friendly format. It follows two different models depending on the request: if a user says “I’d like to reset my password,” the bot follows a knowledge graph or retrieval model (like the Domino’s bot), which works for all FAQs. If, however, a user says something like “I’m looking for a chic and slimming pair of jeans,” AliMe uses a knowledge graph combined with semantic indexes to try to match the user’s intent to a specific product. This side of the bot’s abilities is extremely hard to develop because of the many ambiguities inherent in natural language.
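The two-model split can be illustrated with a toy router: try exact FAQ retrieval first, and fall back to fuzzy matching against a product index. The FAQ entries, products, and keyword-overlap scoring below are all invented stand-ins; AliMe’s actual retrieval model and semantic indexes are far more sophisticated.

```python
# Toy router in the spirit of the two-model split: FAQ retrieval first,
# otherwise approximate semantic matching against a product "index".
# All data and the scoring rule are invented for illustration.

FAQ = {
    "reset my password": "Go to Account > Security > Reset password.",
}

PRODUCTS = {
    "Slim-fit dark-wash jeans": {"chic", "slimming", "jeans", "slim"},
    "Classic straight jeans":   {"jeans", "classic", "comfortable"},
}

def route(query):
    q = query.lower()
    # Retrieval model: does the query contain a known FAQ phrase?
    for phrase, answer in FAQ.items():
        if phrase in q:
            return answer
    # "Semantic" matching, crudely approximated by keyword overlap.
    words = set(q.replace(",", " ").split())
    best = max(PRODUCTS, key=lambda p: len(words & PRODUCTS[p]))
    return f"How about: {best}?"

print(route("I'd like to reset my password"))
print(route("I'm looking for a chic and slimming pair of jeans"))
```

The hard part, as the next section explains, is everything this sketch waves away: deciding what “chic and slimming” actually means.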
Why NLP is so Hard
NLP is hard for two primary reasons: humans don’t always express intent through semantically accurate language, and there are numerous ambiguities in language. Some examples include:
- Semantics: “Gabe invited me to his medical school ball.” What is a “ball” in this context?
- Morphology: parts of a word can be deconstructed to create different meanings.
- Ambiguity of intent: “I just got back from New York.” What does the speaker want?
- Situational ambiguity: “Elaina was found by the river head.” This could mean near the head of the river (a place) or by the executive of the river authority (a person).
- Unknown words: machines cannot deduce the meaning of unfamiliar words from context the way humans can.
- Disambiguation: “jaguar” can refer to a car or to an animal.
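One classic approach to the last item is Lesk-style word-sense disambiguation: pick the sense whose signature words overlap the sentence’s context most. The sense signatures below are hand-invented; real systems derive them from dictionaries or learn them from corpora.

```python
# Lesk-style sketch: choose the sense of "jaguar" whose signature words
# overlap the sentence's context most. Signature lists are invented.

SENSES = {
    "animal": {"jungle", "prey", "spotted", "cat", "wild"},
    "car":    {"engine", "drive", "leather", "speed", "dealership"},
}

def disambiguate(sentence):
    context = set(sentence.lower().split())
    return max(SENSES, key=lambda s: len(context & SENSES[s]))

print(disambiguate("the jaguar stalked its prey through the jungle"))  # animal
print(disambiguate("he took the jaguar for a drive to the dealership"))  # car
```

The trouble, of course, is that real sentences rarely contain such convenient clue words, which is exactly why these ambiguities remain hard.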
These difficulties arise even when a machine is trying to understand a perfectly written piece of text. Most of the time, though, the text machines receive is imperfect: riddled with typos, slang, or unclear intentions. Voice compounds these issues: in addition to the difficulties above, the machine also has to parse phonetics and phonology, such as determining whether someone said “I scream” or “ice cream.” It’s no wonder that smart speakers still mishear, misinterpret, or overhear other conversations and take them as commands.
Advances in NLP are Imminent
In the past few years alone, bots have made vast improvements in identifying intent, mapping it to a knowledge graph, and using category identification to respond. Numerous bots have also been developed using machine learning through public interactions, including Microsoft’s Zo and IBM’s Watson.
“The key to NLP is data — the more data you collect, the more you can correct your algorithm’s mistakes, and reinforce its correct answers. With unlimited data and unlimited compute, we would have perfect NLP today,” claims one of our experts.
The biggest advance in NLP has been word embeddings, which exploit the fact that words with similar meanings tend to occur in similar contexts, enabling the machine to learn from corpora of text drawn from Wikipedia, Twitter, and news sites. This technique also lets the machine approach unknown words much the way humans do: it examines the context of the word and compares that context to other, similar ones in order to glean meaning. In 2017, the Facebook AI Research lab released pre-trained vectors in 294 languages using this model.
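The core idea is that each word becomes a vector, and similarity between words becomes cosine similarity between vectors. This sketch uses tiny hand-made 3-dimensional vectors; real embeddings such as fastText’s have hundreds of dimensions learned from large corpora.

```python
# Toy embedding space: hand-made 3-d vectors standing in for real learned
# embeddings. Words used in similar contexts get similar vectors.
import math

EMB = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "pizza": [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine(EMB["king"], EMB["queen"]))  # high: similar contexts
print(cosine(EMB["king"], EMB["pizza"]))  # low: unrelated contexts

# An unknown word can be roughly placed by averaging its context's vectors,
# mimicking how humans glean meaning from surrounding words.
context = ["king", "queen"]
guess = [sum(EMB[w][i] for w in context) / len(context) for i in range(3)]
```

The averaging trick at the end is a simplification of how subword and context-based models handle out-of-vocabulary words, but it conveys the intuition.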
A recent NewtonX survey predicted that AI will outperform humans at translating languages and editing high school essays within the next decade. As NLP models become better at mapping intent, using word embeddings, and predicting sentiment through text, AI will (finally) master human language, at least to the extent that it can provide value to enterprises.
The data and insights in this article are sourced from NewtonX experts. For the purposes of this blog, we keep our experts anonymous and ensure that no confidential data or information has been disclosed. Experts are a mix of industry consultants and previous employees of the company or companies referenced.