Last week’s horrific bombing of the Boston Marathon presented the FBI and Boston Police Department with an enormous challenge: find the perpetrators and stop them before they could escape or commit another atrocity. It would require sifting through mountains of physical evidence (everything from victims’ torn clothing to bomb fragments scattered around the scene) as well as thousands of photos and videos shot by bystanders.
If there was ever a time to deploy Big Data tools such as facial-recognition technology, which can find matches within enormous image databases, this was it. Even as law enforcement scrambled to analyze the data with the platforms available, crowds of amateur sleuths took to online forums to pick through data en masse—a very real example of crowdsourcing, another much-hyped Big Data method.
But in the end, the death of bombing suspect Tamerlan Tsarnaev in a shootout with the Boston police and the capture of his younger brother Dzhokhar came down to a more old-fashioned brand of police work. Despite the hype, Big Data techniques didn’t prove a magic bullet in this case, but that could change in coming years as the underlying tools become more sophisticated.
Facial-Recognition Technology Failed
A number of Big Data platforms have focused on drawing out relevant data from images and video. Facebook, for example, offers facial-recognition tools that allow users to tag friends in photos—a feature that initially sparked protests from European Union regulators (and many others) concerned with privacy risks. The Pentagon recently signed a contract with AOptix to develop a smartphone add-on that scans and transmits biometric data (such as facial features) from a distance.
Whether to sell ads or discover terrorists, it’s clear that analyzing visual information is Big Data’s next big frontier. But when the FBI tried to use facial-recognition software to scan the crowds around the Boston bombing site for suspects, it apparently came up empty.
“[Boston Police Commissioner Edward Davis] said he was told that facial-recognition software did not identify the men in the ball caps,” The Washington Post reported April 20. “The technology came up empty even though both Tsarnaevs’ images exist in official databases: Dzhokhar had a Massachusetts driver’s license; the brothers had legally immigrated; and Tamerlan had been the subject of some FBI investigation.”
Does that mean the technology developed in recent years to correlate visual information with existing databases is an outright failure? Well, no. But it’s clear that such systems aren’t nearly as foolproof as some technologists would lead one to believe. In the end, some FBI agents had to watch the same video segments hundreds of times over a few days in order to construct proper timelines and eventually settle on the two suspects.
Much has been made in recent years of the “power of crowds,” or having groups unite in a collective “hive mind” to solve particularly vexing problems. Ten thousand brains are better than one, after all. Gavin Newsom, Tim O’Reilly, and other intellectuals have even advocated the idea of crowdsourcing government, on the theory that a population working in concert can resolve community issues better than a slow and entrenched bureaucracy.
Crowdsourcing might work for something like Amazon’s Mechanical Turk (MTurk), a marketplace that unleashes crowds onto gargantuan-yet-repetitive tasks in exchange for small fees; it might prove a fantastic tool for Kaggle.com, which lets organizations and individuals post data problems to a massive community of analysts and scientists. But there’s also a point when crowdsourcing veers into the madness of crowds.
With emotions high in the hours and days following the bombing, hundreds of people took to Reddit’s user-generated forums to pick over images from the crime scene. Could a crowd of sharp-eyed citizens uncover evidence of the perpetrators? No, but they could definitely focus attention on the wrong people.
“Though started with noble intentions, some of the activity on reddit fueled online witch hunts and dangerous speculation which spiraled into very negative consequences for innocent parties,” read an April 22 posting on Reddit’s official blog. “The reddit staff and the millions of people on reddit around the world deeply regret that this happened.”
The most notable example of this witch-hunting is Sunil Tripathi, whom the Reddit community targeted as a possible suspect in the bombings after users thought they had identified him in surveillance images taken at the scene. Tripathi is innocent, and he has been missing since mid-March. “We hope that this painful event will be channeled into something positive and the increased awareness will lead to Sunil’s quick and safe return home,” the blog posting continued. “We encourage everyone to join and show your support to the Tripathi family and their search.”
Tripathi wasn’t the only individual falsely identified as a suspect in the chaos; on the night of April 19, one “Mike Mulugeta” ended up Tweeted and re-Tweeted as “Suspect 1” after an official recited the name “Mulugeta” on the Boston Police Department scanner. Later that day, The Atlantic offered a good breakdown of how that name (its owner never identified) ended up as chum for a voracious online community. “There was a full-on frenzy as thousands upon thousands of tweets poured out, many celebrating new media’s victory in trouncing old media,” author Alexis Madrigal wrote. “It was all so shockingly new and the pitch was so high and it was so late at night on one of the craziest days in memory.”
Law enforcement will likely spend the next several months—if not years—piecing together the attackers’ motives. They’ll also spend that time improving their data-analytics tools. According to BusinessWeek, the FBI is in the midst of deploying a biometric information system, developed in conjunction with IBM and other technology titans, which could eventually contain 12 million images searchable by algorithm. And that’s surely not the end of it.
Editor’s Note: A previous version of this article stated that Sunil Tripathi disappeared after the Boston bombings; according to news reports, he has actually been missing since mid-March. The article has been updated to reflect this.