Facebook is rolling out its Graph Search platform to its U.S. English-speaking audience.
The social network claims that Graph Search’s months in beta have resulted in improved features such as speedier results, a simpler-to-use search box, and a better understanding of how users ask questions in natural language.
“Everyone using US English should start seeing their search box automatically updated,” Facebook posted on its Newsroom site July 8. “This is just the beginning. We’re currently working on making it easier for people to search and discover topics, including posts and comments.” Work on a mobile version of Graph Search is apparently underway.
From the very beginning, Graph Search was an ambitious project, allowing users to search through the entirety of Facebook’s vast social webs via natural-language queries. (You could ask it to find “friends of friends who live in San Francisco and like pizza,” for example.) To accomplish that goal, Facebook needed to upgrade its hardware infrastructure to deal with the inevitable traffic spike from all those users querying the system in complex ways. A big part of that solution was the so-called Disaggregated Rack, which can scale resources independently of one another. Jason Taylor, director of capacity engineering and analysis at Facebook, told an audience at this year’s Open Compute Summit in Santa Clara, Calif., that powering Graph Search could require as many as 20 compute servers, 8 flash sleds, 2 RAM sleds and a storage sled, for a total of 320 CPU cores, 3 terabytes of RAM, and 30 TB of flash.
Graph Search also required quite a bit of work on the software side of things. The team tasked with building the platform knew that keyword-based search wasn’t a viable solution. “Keywords, which usually consist of nouns or proper nouns, can be nebulous in their intent,” Facebook engineering manager Xiao Li wrote in an April 29 posting on Facebook’s blog.
Instead, the software team focused on an interface built from three components: entity recognition and resolution (“finding possible entries and their categories in an input query and resolving them to database entries”), lexical analysis (“analyzing the morphological, syntactical and semantic information of the words/phrases in the input query”), and semantic parsing (“finding the top N interpretations of an input query given a grammar expressing what one can potentially search for using Graph Search”).
With those components laid out, the team began building Graph Search’s query language using a weighted context-free grammar (WCFG). Imagine a tree: its root or base is the “Start” of a particular query. From there, various “limbs” branching from that base include verbs, objects, and so on. The “leaves” at the top of this tree (which Facebook refers to as the “parse tree”) are terminal symbols, or entities such as users, cities, employers, groups, and the phrases that link those entities together.
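To make the tree picture concrete, here is a minimal Python sketch of how a weighted context-free grammar can turn a query into a parse tree. It is an illustration only, not Facebook’s implementation: the grammar, the symbol names (START, USERS, USER_HEAD, FILTER, CITY, PAGE), the costs, and the best_parse function are all hypothetical, and the grammar covers just a handful of phrases.

```python
from functools import lru_cache

# Hypothetical toy grammar: nonterminal -> list of (cost, right-hand side).
# Any symbol that is not a key of GRAMMAR is treated as a terminal word.
GRAMMAR = {
    "START": [(0.0, ("USERS",))],
    "USERS": [(0.0, ("USER_HEAD", "FILTER")), (0.5, ("USER_HEAD",))],
    "USER_HEAD": [(0.0, ("friends",)), (0.2, ("people",))],
    "FILTER": [(0.0, ("who", "live", "in", "CITY")),
               (0.0, ("who", "like", "PAGE"))],
    "CITY": [(0.0, ("san", "francisco")), (0.0, ("palo", "alto"))],
    "PAGE": [(0.0, ("pizza",))],
}


def best_parse(tokens, grammar, start="START"):
    """Return (cost, tree) for the cheapest parse of tokens, or None."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def parse(symbol, i, j):
        # Cheapest way to derive tokens[i:j] from a single symbol.
        if symbol not in grammar:  # terminal: must match exactly one token
            if j - i == 1 and tokens[i] == symbol:
                return (0.0, symbol)
            return None
        best = None
        for rule_cost, rhs in grammar[symbol]:
            result = match_seq(rhs, i, j)
            if result is None:
                continue
            total = rule_cost + result[0]
            if best is None or total < best[0]:
                best = (total, (symbol,) + result[1])
        return best

    @lru_cache(maxsize=None)
    def match_seq(rhs, i, j):
        # Cheapest way to split tokens[i:j] among the symbols in rhs.
        if not rhs:
            return (0.0, ()) if i == j else None
        best = None
        # Each symbol must cover at least one token, which bounds the split.
        for k in range(i + 1, j - len(rhs) + 2):
            head = parse(rhs[0], i, k)
            if head is None:
                continue
            tail = match_seq(rhs[1:], k, j)
            if tail is None:
                continue
            total = head[0] + tail[0]
            if best is None or total < best[0]:
                best = (total, (head[1],) + tail[1])
        return best

    return parse(start, 0, len(tokens))
```

Calling best_parse("friends who live in san francisco".split(), GRAMMAR) returns a zero-cost tree rooted at START, with the words of the query as its leaves. Swapping “friends” for “people” still parses, but in this toy grammar that rule carries a 0.2 penalty, so the resulting tree costs more.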
Facebook also built a “semantic tree” to go along with the parse tree, as well as something it calls “parameterization,” which gives grammar a “cost structure” in order to rank parse trees. In addition, the team had to incorporate synonyms, a robust system of lexical analysis, inflections, and other points of language that probably had engineers ripping their hair out until the wee hours.
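The idea of a “cost structure” can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than Facebook’s actual scheme: the REWRITES table, its cost values, and the canonicalize and top_n helpers are hypothetical. It charges a small penalty whenever a synonym or inflected form has to be mapped to a canonical word, then keeps the N cheapest interpretations.

```python
import heapq

# Hypothetical rewrite table: surface word -> (canonical word, added cost).
REWRITES = {
    "buddies": ("friends", 0.3),  # synonym
    "friend": ("friends", 0.1),   # inflection
    "living": ("live", 0.1),      # inflection
    "likes": ("like", 0.1),       # inflection
}


def canonicalize(tokens):
    """Map each token to its canonical form and sum the rewrite costs."""
    total = 0.0
    canonical = []
    for token in tokens:
        word, cost = REWRITES.get(token, (token, 0.0))
        canonical.append(word)
        total += cost
    return canonical, total


def top_n(candidates, n):
    """Return the n lowest-cost (cost, interpretation) pairs, cheapest first."""
    return heapq.nsmallest(n, candidates, key=lambda pair: pair[0])
```

Under these made-up costs, “buddies who likes pizza” is rewritten to “friends who like pizza” at a higher total cost than the same query typed in canonical form, so an exact phrasing would rank above it when both are candidates.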
While Graph Search is certainly an impressive feat from a technical standpoint, it’s raised some questions about privacy. Graph Search can surface all sorts of deep connections, because Facebook records pretty much everything anyone does on its network. “When a billion people are listing everything they like (or ‘Like’) on Facebook, from travel and games to photos and people, all that information is stored in one gigantic graph,” Slashdot writer Jeff Cogswell wrote in February. “That’s somewhat more ominous than Google’s tracking.”
But Facebook insists that its users still have granular control over their information. “Graph Search makes finding things easier, but you can only see what you could already view elsewhere on Facebook,” a spokesperson wrote in an email to Slashdot soon after Cogswell’s article hit the Web. “You control who you share your interests and likes with on Facebook. Each category of interests and likes has its own privacy setting.”
Whatever Graph Search’s opportunities and dangers, the system is now out there for most of Facebook’s users to experience.