When IBM’s Watson Learned Too Much About Natural Language
IBM’s Watson can crush opponents on “Jeopardy” and even help with improving cancer care, but the supercomputer does have some difficulties with language, including trouble comprehending slang. In order to boost Watson’s aptitude with everyday lingo, its software engineers began teaching it the Urban Dictionary, that massive online repository of current slang. There was just one small problem: Watson couldn’t differentiate “clean” terms from profanity. It’s one thing when a supercomputer uses “OMG,” but quite another when it starts cursing like a Quentin Tarantino character (hopefully not in the middle of a “Jeopardy” episode, although that could prove memorable for everyone involved). Fortune offers a glimpse into how it all played out:
“Ultimately, [IBM research scientist Eric Brown's] 35-person team developed a filter to keep Watson from swearing and scraped the Urban Dictionary from its memory. But the trial proves just how thorny it will be to get artificial intelligence to communicate naturally. Brown is now training Watson as a diagnostic tool for hospitals. No knowledge of OMG required.”
Watson also adopted some “bad habits” from Wikipedia, which contains a number of risqué definitions. Hundreds of IBM staffers are involved in the Watson project. It takes at least several days to train the system for a new task, whether “Jeopardy” or medical research. Watson’s “mind” depends on a combination of geospatial, statistical, and temporal reasoning; for those interested in a little more detail, Dr. David Ferrucci, IBM Fellow and Principal Investigator for Watson, once gave a speech at the Future Health Technology Summit that breaks down how Watson reasons and learns from its data.
Despite its sophistication, Watson sometimes fails, as in the one “Jeopardy” episode where it confused the location of Chicago airports and responded with, “What is Toronto.” Nonetheless, IBM is openly considering how best to use Watson in some high-pressure industries, where its massive datasets and ability to process natural-language queries could make it an invaluable resource for researchers, doctors, and other workers. For example, Watson could end up deployed in patient evaluation, applying its considerable brainpower to determining the exact cause of a medical complaint. But thanks to IBM’s run-in with the Urban Dictionary, Watson probably won’t be delivering that diagnosis with an “LOL.”