Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and re-launched the Real Estate Business Intelligence Tool, where he created one of the leading business intelligence tools for property price analysis in 2012. He also writes, researches, and shares knowledge about Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language, and related topics.
4 Approaches to Natural Language Processing and Understanding
In 1971, Terry Winograd wrote the SHRDLU program while completing his PhD at MIT. SHRDLU features a world of toy blocks where the computer translates human commands into physical actions, such as "move the red pyramid next to the blue cube." To succeed in such tasks, the computer must build up semantic knowledge iteratively, a process Winograd discovered was brittle and limited.
The rise of chatbots and voice-activated technologies has renewed interest in natural language processing (NLP) and natural language understanding (NLU) techniques that can produce satisfying human-computer dialogs. Unfortunately, academic breakthroughs have not yet translated into improved user experiences, with Gizmodo writer Darren Orf declaring Messenger chatbots "frustrating and useless" and Facebook admitting a 70% failure rate for its highly anticipated conversational assistant M.
Nevertheless, researchers forge ahead with new plans of attack, occasionally revisiting the same tactics and principles Winograd tried in the 70s. OpenAI recently leveraged reinforcement learning to teach agents to design their own language by "dropping them into a set of simple worlds, giving them the ability to communicate, and then giving them goals that can be best achieved by communicating with other agents." The agents independently developed a simple "grounded" language.
MIT Media Lab presents this satisfying clarification on what "grounded" means in the context of language: "Language is grounded in experience. Unlike dictionaries which define words in terms of other words, humans understand many basic words in terms of associations with sensory-motor experiences. People must interact physically with their world to grasp the essence of words like "red," "heavy," and "above." Abstract words are acquired only in relation to more concretely grounded terms. Grounding is thus a fundamental aspect of spoken language, which enables humans to acquire and to use words and sentences in context."
The antithesis of grounded language is inferred language. Inferred language derives meaning from words themselves rather than what they represent. When trained only on large corpuses of text, but not on real-world representations, statistical methods for NLP and NLU lack true understanding of what words mean. OpenAI points out that such approaches share the weaknesses revealed by John Searle's famous Chinese Room thought experiment. Equipped with a universal dictionary to map all possible Chinese input sentences to Chinese output sentences, anyone can perform a brute force lookup and produce conversationally acceptable answers without understanding what they're actually saying.
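The brute force lookup in the Chinese Room can be sketched in a few lines of Python. This is only an illustration: the rule book entries, the fallback reply, and the function name `chinese_room` are all hypothetical, and a real "universal dictionary" would of course be impossibly large.

```python
# Hypothetical rule book: every entry here is an illustrative assumption.
RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",        # "How are you?" -> "I'm fine, thanks."
    "你叫什么名字？": "我叫小明。",      # "What is your name?" -> "My name is Xiao Ming."
}

# Reply used when the input is not in the rule book ("Please say that again.").
FALLBACK = "请再说一遍。"


def chinese_room(sentence: str) -> str:
    """Return the scripted reply by exact string match.

    Nothing here models what the sentence means; the function only
    matches symbols against symbols.
    """
    return RULE_BOOK.get(sentence, FALLBACK)


print(chinese_room("你好吗？"))
```

The function produces acceptable replies for inputs it has seen and falls back blindly for everything else, which is the weakness the thought experiment points at.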
WHY IS LANGUAGE SO COMPLEX?
Percy Liang, a Stanford CS professor and NLP expert, breaks down the various approaches to NLP / NLU into four distinct categories:
1) Distributional
2) Frame-based
3) Model-theoretical
4) Interactive learning
You might appreciate a brief linguistics lesson before we continue on to define and describe those categories. There are three levels of linguistic analysis:
1) Syntax - what is grammatical?
2) Semantics - what is the meaning?
3) Pragmatics - what is the purpose or goal?
Drawing upon a programming analogy, Liang likens successful syntax to "no compiler errors", semantics to "no implementation bugs", and pragmatics to "implemented the right algorithm."
He highlights that sentences can have the same semantics, yet different syntax, such as "3+2" versus "2+3". Similarly, they can have identical syntax yet different semantics: for example, 3/2 evaluates to 1 in Python 2.7 (integer division) but to 1.5 in Python 3 (true division).
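Both halves of this point can be checked in Python 3. The sketch below is an illustration, not Liang's own code: the helper names are made up, "syntax" is approximated by the parsed expression tree, "semantics" by the evaluated value, and Python 2.7's behavior for two integers is emulated with floor division (`//`).

```python
import ast


def syntax_tree(expr: str) -> str:
    """Return a text dump of the parsed expression (a stand-in for its syntax)."""
    return ast.dump(ast.parse(expr, mode="eval"))


def value(expr: str) -> object:
    """Evaluate a trusted arithmetic expression (a stand-in for its semantics)."""
    tree = ast.parse(expr, mode="eval")
    # Builtins are emptied because only plain arithmetic is expected here.
    return eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}})


def divide_py3(a: int, b: int) -> float:
    """How Python 3 interprets a / b: true division."""
    return a / b


def divide_py27(a: int, b: int) -> int:
    """How Python 2.7 interpreted a / b for two ints: floor division."""
    return a // b


# Different syntax, same semantics.
print(syntax_tree("3+2") == syntax_tree("2+3"))  # False
print(value("3+2") == value("2+3"))              # True

# Same syntax ("3/2"), different semantics depending on the interpreter.
print(divide_py3(3, 2))   # 1.5
print(divide_py27(3, 2))  # 1
```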
Ultimately, pragmatics is key, since language is created from the need to motivate an action in the world. If you implement a complex neural network to model a simple coin flip, you have excellent semantics but poor pragmatics since there are a plethora of easier and more efficient approaches to solve the same problem.
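To make the coin flip point concrete, here is a minimal sketch of one of the "easier and more efficient approaches": simulate the coin with a single random draw and estimate its bias by counting. The function names and the `p_heads` parameter are assumptions chosen for illustration.

```python
import random
from typing import Optional, Sequence


def flip(p_heads: float = 0.5, rng: Optional[random.Random] = None) -> str:
    """Simulate one coin flip with a single uniform random draw."""
    source = rng if rng is not None else random
    return "heads" if source.random() < p_heads else "tails"


def estimate_p_heads(flips: Sequence[str]) -> float:
    """Estimate the heads probability as the observed fraction of heads."""
    return sum(f == "heads" for f in flips) / len(flips)


observed = [flip() for _ in range(1000)]
print(estimate_p_heads(observed))
```

A neural network could be trained to reproduce the same estimate, but the counting approach gives the same answer with no training at all, which is the sense in which it wins on pragmatics.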
Plenty of other linguistics terms exist which demonstrate the complexity of language. Words take on different meanings when combined with other words, such as "light" versus "light bulb" (i.e. multi-word expressions), or used in various sentences such as "I stepped into the light" and "the suitcase was light" (polysemy).
Hyponymy shows how a specific instance is related to a general term (e.g. a cat is a mammal) and meronymy denotes that one term is a part of another (e.g. a cat has a tail). Such relationships must be understood to perform the task of textual entailment: recognizing when one sentence is logically entailed by another. "You're reading this article" entails the sentence "you can read".
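These lexical relations can be pictured as small lookup tables. The sketch below is a toy, assuming a hand-written lexicon: the dictionaries, the extra "animal" and "whisker" entries, and the function names are all hypothetical, and real systems rely on much larger resources.

```python
# Hypothetical toy lexicon. Hyponymy: each term maps to its more general term.
HYPERNYM = {"cat": "mammal", "mammal": "animal"}

# Meronymy: each part maps to the whole it belongs to.
PART_OF = {"tail": "cat", "whisker": "cat"}


def is_a(term: str, category: str) -> bool:
    """Follow hyponymy links upward to see whether term is a kind of category."""
    while term in HYPERNYM:
        term = HYPERNYM[term]
        if term == category:
            return True
    return False


def has_part(whole: str, part: str) -> bool:
    """Check a direct meronymy link: is part recorded as a part of whole?"""
    return PART_OF.get(part) == whole


def entails_is_a(premise_term: str, hypothesis_term: str) -> bool:
    """'X is a <premise_term>' entails 'X is a <hypothesis_term>' in this toy lexicon."""
    return premise_term == hypothesis_term or is_a(premise_term, hypothesis_term)


print(is_a("cat", "mammal"))          # True
print(has_part("cat", "tail"))        # True
print(entails_is_a("cat", "animal"))  # True
print(entails_is_a("mammal", "cat"))  # False
```

Entailment here runs only from the specific term to the general one: knowing something is a cat tells you it is a mammal, but not the reverse.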
Aside from complex lexical relationships, your sentences also involve beliefs, conversational implicatures, and presuppositions. Liang provides excellent examples of each. Superman and Clark Kent are the same person, but Lois Lane believes Superman is a hero while Clark Kent is not. If you say "Where is the roast beef?" and your conversation partner replies "Well, the dog looks happy", the conversational implicature is that the dog ate the roast beef. Presuppositions are background assumptions that hold regardless of the truth value of a sentence. "I have stopped eating meat" has the presupposition "I once ate meat", and the presupposition survives even if you negate the sentence to "I have not stopped eating meat."
Adding to the complexity are vagueness, ambiguity, and uncertainty. Uncertainty is when you see a word you don't know and must guess at the meaning. If you're stalking a crush on Facebook and their relationship status says "It's Complicated", you already understand vagueness. Richard Socher, Chief Scientist at Salesforce, gave an excellent example of ambiguity at a recent AI conference: "The question 'can I cut you?' means very different things if I'm standing next to you in line or if I am holding a knife".
Now that you're more enlightened about the myriad challenges of language, let's return to Liang's four categories of approaches to semantic analysis in NLP / NLU.