From smart speakers to customer service chatbots, natural language processing (NLP) provides numerous opportunities for technologists. MarketsandMarkets expects the worldwide NLP market to grow from $11.6 billion in 2020 to $35.1 billion by 2026, a compound annual growth rate of 20.3 percent.

NLP draws on machine learning (ML) algorithms, which learn rules by studying examples and drawing statistical inferences. It helps make human language more comprehensible to computers.

Practical use cases for natural language processing include automating legal documents in finance. Chatbots also use NLP to triage medical symptoms in health care and improve retail sales and customer service.

Dr. Rohini Srihari, professor of computer science and engineering at the University at Buffalo’s School of Engineering and Applied Sciences, noted that many applications can require NLP to understand large amounts of text, process resumes and medical records, and analyze news. NLP also helps marketing and research professionals with opinion analysis, social media mining and consumer brand analysis. In addition, startups need NLP engineers for customer service chatbots, and manufacturers rely on NLP to sift through manuals.

Zein Tawil, product engineer and data scientist at Primer AI, a provider of industrial-grade NLP applications, noted how organizations use NLP to analyze thousands of reviews on Glassdoor. “You see Glassdoor does this already with pulling out key phrases and then aggregating them, but you can get in at a much deeper level by doing some targeted analysis on those,” Tawil said.
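As a rough illustration of the "pull out key phrases, then aggregate them" idea Tawil describes, here is a minimal Python sketch. It is not Glassdoor's or Primer AI's actual method: it simply treats adjacent word pairs without stop words as "key phrases" and counts how many reviews mention each one. The stop-word list, the function names and the sample reviews are all hypothetical.

```python
from collections import Counter
import re

# Small illustrative stop-word list (an assumption); a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "but", "of", "to", "with"}


def key_phrases(review):
    """Return the set of adjacent word pairs that contain no stop words."""
    words = re.findall(r"[a-z]+", review.lower())
    return {
        f"{first} {second}"
        for first, second in zip(words, words[1:])
        if first not in STOP_WORDS and second not in STOP_WORDS
    }


def aggregate(reviews):
    """Count how many reviews mention each phrase."""
    counts = Counter()
    for review in reviews:
        counts.update(key_phrases(review))
    return counts


# Hypothetical reviews, not real Glassdoor data.
reviews = [
    "Great work life balance and friendly coworkers",
    "Work life balance is poor but the pay is good",
    "Friendly coworkers but long hours",
]
counts = aggregate(reviews)
print(counts.most_common(3))
```

The "deeper" targeted analysis Tawil mentions would start from counts like these, for example by checking whether a frequent phrase appears mostly in positive or negative reviews.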

To train effectively in NLP, you can study it as part of a computer science program at a university or college, or take online courses through platforms such as Coursera, edX or Udacity. Here are seven things to keep in mind if you’re getting started in this discipline.

Know Your Math

Like other aspects of ML, natural language processing requires a deep understanding of mathematics, including probability, statistics, linear algebra and calculus. “If you want to be an NLP engineer, NLP researcher or NLP scientist, you need to be absolutely ready and have a mindset of, ‘I'm going to learn all the core math that's required for machine learning that gets applied in NLP,’” said Dr. Sameer Maskey, CEO of Fusemachines, an AI talent platform and service provider, as well as a Columbia University adjunct associate professor of international and public affairs.

Study Programming and Computer Science

Learn at least one programming language such as Python or Java, Maskey advised: “You need to know algorithms, data structures and one programming language really well so you can write good programs and code with a solid foundation of algorithms and data structures.”

Learn a Language

Gain a solid understanding of computational linguistics and language, which make up the classic form of natural language processing rooted in ML, Srihari advised. Studying languages helps NLP professionals understand the syntax of the language their systems handle.

“Know the language you're using to build the NLP system,” Maskey said. That way, NLP engineers can evaluate how well the machine is performing as they are building it. Maskey advised that if engineers are building an NLP system in English, they should master the nuances of English so they can evaluate the systems correctly by analyzing the output and finding errors. Maskey wished he had studied Chinese or Spanish when he built machine translation systems at IBM Watson.

Study How Search Engines Work

Srihari recommends that, before students take her course in natural language processing, they study information retrieval in search engines. “That turns out to be [an] important and very valuable experience when developing NLP applications to know how to process large volumes of textual data, like how do you index a million tweets or 2 million tweets and search it quickly,” Srihari said.
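To make the indexing idea concrete, here is a minimal Python sketch of an inverted index, the basic data structure behind fast text search. It is a toy under stated assumptions, not how any particular search engine is built: the tokenizer, the function names and the three sample tweets are hypothetical, and a production system would add ranking, compression and on-disk storage.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(docs):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    postings = [index.get(token, set()) for token in tokens]
    return sorted(set.intersection(*postings))


# Hypothetical sample data standing in for a large tweet collection.
tweets = [
    "Loving the new phone camera",
    "The camera on this phone is terrible",
    "New chatbot handles customer service well",
]
index = build_index(tweets)
print(search(index, "phone camera"))  # IDs of tweets containing both words
```

Because each lookup touches only the posting sets for the query words, search time depends on how many documents contain those words rather than on the size of the whole collection, which is what makes searching millions of tweets quickly feasible.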

Dig Into ML and Deep Learning

Srihari said research scientists will need natural language processing skills based on deep learning and ML. “What's really in vogue these days is the deep learning approaches, which work well because of the large volumes of data and some of the sophistication of these deep learning models, their ability to learn those representations.”

Deep learning models can work out which person or thing a pronoun refers to, a task known as coreference resolution or pronoun resolution.

Learn Error Analysis

Learn how to debug chatbots and make sure they provide the proper answers for users. Error analysis involves creating new rules along with new types of data and retraining, according to Srihari, who teaches this concept in her natural language processing courses.

“That’s something I emphasize quite a bit—figure out what’s not working, do error analysis and that tells you how to fine-tune things,” she said.
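One simple way to start "figuring out what's not working" is to run a labeled test set through the system and tally which kinds of mistakes it makes. The Python sketch below does that for a toy chatbot intent classifier. It is only an illustration, not Srihari's course material: the keyword-based classifier, the intent labels and the test utterances are all hypothetical stand-ins for a trained model and real data.

```python
from collections import Counter


def error_breakdown(examples, predict):
    """Collect misclassified examples and count each (expected, predicted) pair."""
    errors = []
    confusions = Counter()
    for text, expected in examples:
        predicted = predict(text)
        if predicted != expected:
            errors.append((text, expected, predicted))
            confusions[(expected, predicted)] += 1
    return errors, confusions


def toy_intent_classifier(text):
    """Hypothetical keyword rule standing in for a trained chatbot model."""
    lowered = text.lower()
    if "refund" in lowered:
        return "billing"
    if "password" in lowered:
        return "account"
    return "other"


# Hypothetical labeled test utterances.
test_set = [
    ("I want a refund", "billing"),
    ("Reset my password", "account"),
    ("I was charged twice", "billing"),
    ("Why was my card charged?", "billing"),
    ("Where is my order?", "shipping"),
]

errors, confusions = error_breakdown(test_set, toy_intent_classifier)
for (expected, predicted), count in confusions.most_common():
    print(f"expected={expected} predicted={predicted} count={count}")
```

Here the most frequent confusion is billing questions that mention charges falling through to "other", which suggests adding a rule or more training examples for that phrasing before retraining and measuring again.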

Explore Speech Recognition and Conversational AI

The part of natural language processing that makes voice assistants such as Alexa and Siri work is called speech recognition. It covers speech input, in which engineers ensure that a conversation is heard correctly and then transcribed accurately as text.

In the future, NLP could move beyond simple questions and answers and enable back-and-forth conversation with an Alexa-like device. Srihari is conducting this type of research in conversational AI.

“Can we develop an AI agent that can really converse intelligently, empathetically and engagingly with humans?” Srihari asked. “If we can do that, then all kinds of applications are enabled.”