If you've ever used an AI technology that conversed with you, whether by speech or text, then you have used AI with natural language processing (NLP) capabilities. Language-capable AI differs from most other machine learning systems in that the data it has to deal with is inherently messier. Instead of being given structured data, usually in a row/column format, language-capable AI must deal with unstructured text, different styles of writing, typos, accents, etc., in order to make accurate predictions and take specific actions. This article focuses on algorithms that work with written language, but the concepts transfer quite well to AI that deals with speech. Here are four major components of designing AI to read and write.
1. Context Is King
Words can mean different things to different people at different times. For example, the word "terribly" can be used either negatively or as a neutral intensifier (e.g., "The service was handled terribly" vs. "I'm terribly sorry I'm late"), so the word "terribly" on its own is ambiguous. This is not news to humans, but AI systems struggle with this concept. One key development in NLP in recent years, called word vector embeddings, is used to attempt to solve this problem. Methods such as Word2vec (out of Google) and GloVe (out of Stanford) create numeric representations of words based on the contexts in which those words appear, so that words used in similar ways end up with similar vectors.
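A common way to illustrate this is simple word association. If I were to say "Man is to King as Woman is to ___," you would say "Queen." The same logic applies to word vectors: taking the vector for "king," subtracting the vector for "man" and adding the vector for "woman" gives a result that lies close to the vector for "queen."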
AI models can use these numerical/vector representations of words and invoke the concept of context in order to read text at a much deeper level.
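To make the word-association example concrete, here is a minimal sketch in Python. The three-dimensional vectors are hypothetical values made up for illustration, not output from Word2vec or GloVe (real embeddings typically have hundreds of dimensions), and the function names are assumptions of this sketch.

```python
import numpy as np

# Hypothetical toy embeddings. The dimensions loosely stand for
# (royalty, maleness, femaleness); real embeddings are learned from text.
VECTORS = {
    "man":   np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "apple": np.array([0.1, 0.1, 0.1]),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def analogy(a, b, c, vectors=VECTORS):
    """Return the word d that best completes 'a is to b as c is to d'."""
    target = vectors[b] - vectors[a] + vectors[c]
    # Exclude the three query words so the answer is a different word.
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


print(analogy("man", "king", "woman"))
```

With these toy vectors the arithmetic lands exactly on "queen"; with learned embeddings the result is only approximately the nearest neighbor.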
2. AI See, AI Do
AI usually learns to read and write based on how humans have done so in the past. In order to train AI models, historical conversations are fed into the system to teach the machine what constitutes good or bad writing. Using training data to train and tune models is not a new concept in machine learning, but it is a huge limiting factor on the AI's ability to process and create text. AI is generally only able to repeat what humans have said previously and isn't able to generate new sequences of words and trains of thought. A recent development in deep learning called sequence-to-sequence learning is able to ingest and generate sequences of data, and it is showing great increases in an AI's ability to learn the "style" of writing and then go on to generate new pieces of text, given a prompt.
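The idea of learning from past conversations and then continuing a prompt can be sketched with something far simpler than sequence-to-sequence learning. The example below is a word-level bigram model (a Markov chain), not a neural network, and the training sentences and function names are invented for illustration. It shows both the strength and the limitation described above: every word pair it produces was seen in the training text, although the full sentence may be a new recombination.

```python
import random
from collections import defaultdict

# Hypothetical "historical conversations" used as training data.
HISTORY = [
    "thanks for reaching out",
    "thanks for your patience",
    "sorry for the delay",
    "sorry for your trouble",
]


def train_bigrams(sentences):
    """Map each word to the list of words that followed it in the training text."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, prompt, max_words=8, seed=0):
    """Continue the prompt by repeatedly sampling a word seen after the last one."""
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    words = prompt.lower().split()
    if not words:
        return ""
    # Stop at the length limit or when the last word was never followed by anything.
    while len(words) < max_words and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


model = train_bigrams(HISTORY)
print(generate(model, "thanks"))
```

A sequence-to-sequence model replaces the lookup table with a learned encoder and decoder, which is what lets it capture longer-range style rather than only adjacent word pairs.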
3. Metadata/Interconnected Systems
The same question could be answered differently depending on the information surrounding the message, called metadata. If the customer of an e-commerce site asks about the status of their online order, the AI could ask the customer which order they were talking about, look up the status of that order and call it a day. The AI could also go a step further and look up the history of that user's conversations with the company. If it went even further, the AI could notice that this particular user has asked the same question three times in the last week and adjust its answer to be more sympathetic, for example, "I'm so sorry you've had to ask so many times." Metadata could also include the time of day that the message comes in or the channel the message is on (whether the question is a tweet or an email).
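The order-status scenario can be sketched as a small function that looks at a user's recent history before answering. Everything here is hypothetical: the `Message` fields, the `topic` label (assumed to come from some upstream classifier), the threshold of three and the reply wording are all assumptions of this sketch. The channel is carried along as metadata but not acted on.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    user_id: str
    topic: str         # assumed label, e.g. "order_status"
    channel: str       # e.g. "email" or "tweet"; unused in this sketch
    sent_at: datetime


def reply_to_order_status(message, history, status, repeat_threshold=3):
    """Answer an order-status question, apologizing if the user keeps asking."""
    week_ago = message.sent_at - timedelta(days=7)
    # Count earlier messages from the same user on the same topic in the last week.
    earlier = sum(
        1
        for past in history
        if past.user_id == message.user_id
        and past.topic == message.topic
        and week_ago <= past.sent_at < message.sent_at
    )
    reply = f"Your order is {status}."
    # The current message counts toward the threshold.
    if earlier + 1 >= repeat_threshold:
        reply = "I'm so sorry you've had to ask so many times. " + reply
    return reply


now = datetime(2024, 1, 10, 12, 0)
current = Message("u1", "order_status", "email", now)
recent = [
    Message("u1", "order_status", "email", now - timedelta(days=1)),
    Message("u1", "order_status", "tweet", now - timedelta(days=3)),
]
print(reply_to_order_status(current, recent, "shipped"))
```

The same pattern extends to other metadata: the time of day or the channel could select a different tone or length of reply.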
4. Vocabulary Size
One of the simplest and most prevalent challenges in NLP is that the sheer number of words and character combinations in a human language is staggering, and handling them takes time and processing power. This problem is especially acute when a machine has to remember all possible words, combinations of words and typos of common words. This usually means committing to memory millions of possible tokens of text. To combat this, a technique called dimensionality reduction can be used to extract latent structure within a text, trading a very large vocabulary for a much smaller set of features to save computing power and improve speed. This kind of technique is also used to identify high-level topics and use these topics to give better responses.
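One common way to do this kind of reduction is a truncated singular value decomposition of a document-by-word count matrix, an approach often called latent semantic analysis. The sketch below uses it as a stand-in, since the article does not name a specific method; the sample documents and function names are made up for illustration.

```python
import numpy as np

# Hypothetical customer messages.
DOCS = [
    "where is my order",
    "my order is late",
    "reset my password please",
    "password reset link broken",
]


def document_term_matrix(docs):
    """Build a (documents x vocabulary) matrix of raw word counts."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for row, doc in enumerate(docs):
        for word in doc.lower().split():
            counts[row, index[word]] += 1
    return counts, vocab


def reduce_dimensions(counts, k=2):
    """Project each document onto the top-k singular directions (truncated SVD)."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]


counts, vocab = document_term_matrix(DOCS)
reduced = reduce_dimensions(counts, k=2)
print(counts.shape, "->", reduced.shape)
```

Each document is now described by two numbers instead of one count per vocabulary word. On real data the vocabulary runs to many thousands of words, and the reduced dimensions tend to line up with broad topics.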
Technologies like Alexa, Siri and customer service conversation AIs are already using these four components, among others, to create language-capable AI to interact with us on a daily basis. Natural language processing will continue to present new obstacles and inspire new solutions and advancements in AI and deep learning.