THE FUTURE OF ARTIFICIAL INTELLIGENCE:
Communicating With Computers
Editor's Note: Will M. ('15) wrote this essay for his science research class; in it, he describes the purpose and potential effects of his research on language.
When my grandma developed a chronic eye disease that caused complete blindness, the way she interacted with the world changed forever. Basic tasks like walking down the stairs or reading a letter became impossible, and she was forced to give up many of her favorite activities. On top of all this, my grandma’s blindness made her feel disconnected from the outside world. Thankfully, voice control technology has been able to lessen the negative ramifications of her disease. Using the iPhone’s primitive Siri service, my grandma can now stay more connected to the world around her than she otherwise could. Although simple algorithms like Siri are helpful to blind people, they are only a first step. Advances in neurolinguistics and language processing could lead to more sophisticated speech recognition technology that would benefit blind people like my grandma around the world even more.
To understand how to build a computer algorithm that can communicate in human language, the best place to look is the human brain itself. According to classical neurolinguistics, language in the brain is processed in two main regions: Broca’s Area and Wernicke’s Area. Broca’s Area coordinates syntax (the order of words within a sentence) during language production and comprehension, while Wernicke’s Area acts as a lexical dictionary that efficiently maps sound to semantics (meaning). Although this division may seem straightforward, modern studies in linguistics are finding that language processing is far more complex than previously thought and that the functions performed by these two regions overlap considerably. For example, scientists are still unclear about where morphology, a feature of language that sits somewhere between syntax and semantics, is handled neurologically. Another important unknown is how the brain anticipates upcoming words in order to speed up its comprehension of a full sentence. Processing the subtleties of human language as quickly as the brain does is clearly one of its most advanced functions, and before software engineers can efficiently replicate this behavior in computer programs, neuroscientists need to map out exactly how the brain’s machinery for language operates. To create advanced artificial intelligence systems that replicate human language, the traditional model of language in the brain must be expanded and the inner workings of neurolinguistic pathways understood in greater detail.
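As a purely illustrative aside, the classical division of labor described above can be caricatured in a few lines of code: one stage that looks words up in a stored lexicon (the role attributed to Wernicke’s Area) and another that checks their order (the role attributed to Broca’s Area). The word list and the single ordering rule below are invented for this sketch and are not drawn from any real linguistic model.

```python
# A toy illustration of the classical two-stage picture described above.
# The lexicon and the single word-order rule are invented for this example.

# "Wernicke" stage: a lexical dictionary mapping word forms to meanings and categories.
LEXICON = {
    "the":    {"meaning": "DEF",   "category": "determiner"},
    "dog":    {"meaning": "DOG",   "category": "noun"},
    "ball":   {"meaning": "BALL",  "category": "noun"},
    "chases": {"meaning": "CHASE", "category": "verb"},
}

def look_up(words):
    """Map each word form to a stored meaning, like a lexical lookup."""
    return [LEXICON.get(w, {"meaning": "?", "category": "unknown"}) for w in words]

# "Broca" stage: a crude check that the words appear in an allowed order.
def is_well_ordered(entries):
    return [e["category"] for e in entries] == ["determiner", "noun", "verb", "determiner", "noun"]

sentence = "the dog chases the ball".split()
entries = look_up(sentence)
print([e["meaning"] for e in entries])  # ['DEF', 'DOG', 'CHASE', 'DEF', 'BALL']
print(is_well_ordered(entries))         # True
```

Even this caricature hints at why the clean separation breaks down: deciding what a word means and deciding where it may go in a sentence are rarely independent questions.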
A computer that communicates with its user through artificial intelligence, as opposed to a traditional graphical user interface, is much more accessible to the visually impaired because it can be controlled by voice alone. Although truly sophisticated artificial intelligence software has yet to be developed, I have personally seen how the relatively primitive algorithm behind Apple’s Siri has allowed my blind grandma to use her phone in a way that would not otherwise be possible. Through simple voice commands, my grandma is able to check her email to stay in contact with the outside world, play music for her personal enjoyment, and perform countless other tasks that would otherwise be difficult. However, although Siri is a step in the right direction, it does not achieve the true potential of artificial intelligence in such an application. Siri is merely a limited extension of the pre-existing iOS interface, and the voice commands to which it can respond belong to a limited set pre-defined by its developers.
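To make that limitation concrete, here is a minimal sketch of how a fixed-command voice interface of this kind might work; the command phrases and canned responses are hypothetical and do not reflect Apple’s actual implementation or API. The point is simply that anything outside the predefined set is rejected.

```python
# Hypothetical fixed-command interface: only predefined phrases are understood.
COMMANDS = {
    "play music":  lambda arg: "Playing music...",
    "check email": lambda arg: "You have new messages.",   # made-up response
    "call":        lambda arg: f"Calling {arg}...",
}

def handle(utterance: str) -> str:
    for phrase, action in COMMANDS.items():
        if utterance.lower().startswith(phrase):
            argument = utterance[len(phrase):].strip()
            return action(argument)
    return "Sorry, I didn't understand that."   # everything else falls through

print(handle("Play music"))           # Playing music...
print(handle("Call (917) 123-4567"))  # Calling (917) 123-4567...
print(handle("Where are my keys?"))   # Sorry, I didn't understand that.
```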
Products similar to Siri with improved natural language algorithms would have a wide range of practical applications. Rather than only parsing simple commands like “play music” or “call (917) 123-4567,” a more sophisticated natural language algorithm could relay information about the surrounding physical world to a blind user through spoken language. For example, questions like “where are my keys?” and “what color is this chair?” could be answered by combining a high-quality camera with an advanced language processing algorithm. In addition, a computer able not only to respond to imperative statements (commands) but also to process indicative ones (descriptions of the world) would revolutionize how we interact with technology. Visually impaired people could use the voice control interface to write down reminders or notes, just as sighted people use sticky notes or blackboards. Computers would no longer just “do” things; they could capture a mental snapshot of any described scenario, hypothetical or real. The development of statements, as opposed to just commands, in natural human language revolutionized human communication thousands of years ago, and it is not unreasonable to assume that such a development in computational communication would have far-reaching applications today.
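A hedged sketch of the statement-versus-command idea follows: indicative statements are stored as facts about the world, and later questions are answered from those facts. The phrasing patterns and the simple classifier are invented for illustration only and are nothing like a real natural language system.

```python
# Illustrative only: store indicative statements as facts, answer questions from them.
memory = {}   # facts captured from statements

def process(utterance: str) -> str:
    text = utterance.strip().rstrip(".?!").lower()
    if text.startswith("my ") and " are " in text:
        # Indicative statement: capture the described state of the world.
        subject, _, location = text.partition(" are ")
        memory[subject] = location
        return f"Noted: {subject} are {location}."
    if text.startswith("where are "):
        # Question: answer from previously captured statements.
        subject = text[len("where are "):]
        return f"{subject} are {memory[subject]}." if subject in memory else "I don't know yet."
    return "Treating this as a command."   # everything else is handled as an imperative

print(process("My keys are on the kitchen table."))  # Noted: my keys are on the kitchen table.
print(process("Where are my keys?"))                 # my keys are on the kitchen table.
print(process("Play music"))                         # Treating this as a command.
```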
Although science fiction has given artificial intelligence a bad name, advanced language processing algorithms based on discoveries in neurolinguistics could revolutionize how the blind interact with technology. Although the classic GUI will by no means disappear, computers run by human language would be useful not just to the blind but to the average consumer as well. Language is one of the most natural modes of communication for people, if not the most natural, and leaving it out of our interactions with computers means failing to use technology to its greatest potential. In the near future, you will not just type, click, and drag when using a computer; you will talk to it as well.
"Brain's Magnetic Fields Reveal Language Delays in Autism." Brain's Magnetic Fields Reveal Language Delays in Autism. N.p., 2003-2013. Web. 19 Sept. 2013. <http://phys.org/news147357357.html>.
Caplan, David. "Language and the Brain." The Harvard Mahoney Institution Letter on the Brain. N.p., 1995. Web. 7 Nov. 2012. <http://www.hms.harvard.edu/hmni/On_The_Brain/Volume04/Number4/F95Lang.html>.
Chomsky, Noam, and Mitsou Ronat. On Language. New York: New Free, 1998. Print.
Crystal, David. A Little Book of Language. New Haven: Yale UP, 2010. Print.
Head, J. R., W. S. Helton, E. Neumann, P. N. Russell, and C. Shears. "Text-Speak Processing." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 55.1 (2011): 470-74. Web.
Mahoney, Nicole. "Paths of Change." Language and Linguistics: Language Change Path of Change. N.p., n.d. Web. <http://www.nsf.gov/news/special_reports/linguistics/paths.jsp>.
Malone, Elizabeth. "Dialects." Language and Linguistics: Dialects. N.p., n.d. Web. 22 Oct. 2012. <http://www.nsf.gov/news/special_reports/linguistics/dialects.jsp>.
"MEG Imaging." Nature.com. N.p., n.d. Web. 3 Dec. 2013. <nature.com>.
Nowak, Martin A. "The Chomsky Hierarchy and the Logical Necessity of Universal Grammar." Nature.com. N.p., 6 June 2002. Web. 3 Dec. 2013. <http://www.nature.com/nature/journal/v417/n6889/fig_tab/nature00771_F3.html>.
Pylkkänen, Liina, and Alec Marantz. "Tracking the Time Course of Word Recognition with MEG." Trends in Cognitive Sciences. N.p., 2003. Web. 30 Jan. 2014.
Smith, Pamela A. "The Lexical Organization and Processing of Text Messages: Evidence from Priming." Contemporary Issues in Communications Science and Disorders 36 (2010): 27-39. Print.
Schutte, John. "Researchers Fine-Tune F-35 Pilot-Aircraft Speech System."U.S. Air Force. N.p., 15 Oct. 2007. Web. 11 Nov. 2012. <http://www.af.mil/news/story.asp?id=123071861>.
Sejnowski, Terry, and Tobi Delbruck. "The Language of the Brain." Scientific American n.d.: 54-57. Web.
Zarina, Agnew, Hans Van De Koot, Carolyn McGettigan, and Sophie Scott. "Is Syntactic Movement a Unitary Operation? Evidence from Neuroimaging." UCLWPL 24 (2012): n. pag. Web. 24 Oct. 2013.