Software That Fills Gaps in Search, Programming


The biggest problem with computers has always been that they do what we tell them, rather than what we want them to do.

Separate projects at MIT and the Max Planck Institute for Informatics are trying to change that by filling incomplete statements with what we mean rather than what we’ve typed so far… and doing it more accurately than the notoriously inaccurate correction- and word-completion algorithms in smartphone texting applications.

At the Max Planck Institute, researchers have created an application designed to decide whether searchers who type the word “Merkel” are looking for information on the German Chancellor Angela Merkel or the soccer coach Max Merkel. The application, called AIDA, analyzes a user’s own writing for clues to the topic or context of the text, then matches that context against its analysis of large text databases, in this case the online encyclopedia Wikipedia or the available online text of the German National Laboratory. AIDA gives words or people “mention entity” scores that rise along with the number of mentions of that person or place within the overall text offered by Wikipedia.

The higher the mention-entity score, the more likely a user is looking for that person or place rather than someone or something else with a similar name, according to Johannes Hoffart, who co-wrote the app. AIDA handles category searches the same way to correctly identify “Angela Merkel” + “phone call” + “soccer” as text mentioning a congratulatory call from the German chancellor to a winning sports team rather than references to the alleged bugging of her phone by the U.S. National Security Agency.
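
To make the idea concrete, here is a minimal Python sketch of how a mention-count score might be combined with context clues to pick between two people who share a name. It is not AIDA's actual algorithm: the mention counts, keyword sets, the `disambiguate` function and the `prior_weight` blend are all invented for illustration.

```python
# Hypothetical toy knowledge base: the mention counts and keyword sets
# are invented for illustration and are not taken from AIDA or Wikipedia.
CANDIDATES = {
    "Merkel": [
        {
            "entity": "Angela Merkel",
            "mentions": 900,
            "keywords": {"chancellor", "germany", "government", "berlin"},
        },
        {
            "entity": "Max Merkel",
            "mentions": 100,
            "keywords": {"soccer", "coach", "football", "club"},
        },
    ]
}


def disambiguate(mention, context_words, prior_weight=0.3):
    """Pick the candidate with the best blend of popularity and context fit.

    prior_weight is an assumed tuning knob: the share of the final score
    that comes from how often each candidate is mentioned overall.
    """
    candidates = CANDIDATES[mention]
    total_mentions = sum(c["mentions"] for c in candidates)
    context = {w.lower() for w in context_words}

    scored = []
    for c in candidates:
        # Popularity: fraction of all mentions of this name that refer to c.
        prior = c["mentions"] / total_mentions
        # Context fit: fraction of c's keywords found in the user's text.
        overlap = len(context & c["keywords"]) / len(c["keywords"])
        score = prior_weight * prior + (1 - prior_weight) * overlap
        scored.append((score, c["entity"]))

    # Highest score wins.
    return max(scored)[1]


print(disambiguate("Merkel", ["the", "coach", "of", "the", "soccer", "club"]))
print(disambiguate("Merkel", ["speech", "by", "the", "chancellor"]))
```

With no useful context, the more frequently mentioned candidate wins; enough soccer-related words in the surrounding text tip the result toward the coach.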

“With our new technique we can not only build better search engines, but also make computers understand texts almost as a human does, in an efficient way,” according to the other developer, Gerhard Weikum, scientific director at the Institute. A test version of AIDA is available online as a general-purpose search engine (which works in English). Downloads of the source code, repositories and other data about it are available here.

MIT computer-science and EE researcher Armando Solar-Lezama, on the other hand, is working on a more efficient way for programmers to work by filling in gaps in their code automatically as they write it: he’s created a language, called Sketch, that uses “program synthesis” or automatic code generation to decide what a developer is trying to accomplish and generate the resulting code automatically.

Sketch treats program synthesis as a search problem, evaluating all the variations of the same program or code sequence to decide which the developer is most likely to pursue. “When you’re trying to synthesize a larger piece of code, you’re relying on other functions, other subparts of the code,” according to Rishabh Singh, a member of the team developing Sketch in MIT’s Computer Science and Artificial Intelligence Laboratory. “If it just so happens that your system only depends on certain properties of the subparts, you should be able to express that somehow in a high-level language. Once you are able to specify that only certain properties are required, then you are able to successfully synthesize the larger code.”
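
The “synthesis as search” idea can be illustrated with a toy Python example. This is not Sketch's syntax or its solver: it assumes a partial program with two unknown constants (“holes”) and simply tries every combination in a small range until one matches the input/output examples the developer supplies. The names `partial_program` and `synthesize` and the search range are hypothetical.

```python
from itertools import product


def partial_program(x, a, b):
    """A program with two gaps: the constants a and b are left unspecified."""
    return a * x + b


def synthesize(examples, hole_range=range(-5, 6)):
    """Search for hole values that make the program match every example.

    examples is a list of (input, expected_output) pairs that stand in
    for the developer's intent. Returns (a, b) or None if nothing in the
    assumed search range fits.
    """
    for a, b in product(hole_range, repeat=2):
        if all(partial_program(x, a, b) == y for x, y in examples):
            return a, b
    return None


# The developer wants a function that maps 0 -> 3, 1 -> 5, 2 -> 7.
print(synthesize([(0, 3), (1, 5), (2, 7)]))
```

Exhaustive enumeration like this only works for tiny search spaces, which is part of why synthesizing larger pieces of code depends on describing subparts by just the properties that matter.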

The team recently published a paper describing a model-based way to automate the generation of large chunks of code and algorithms to use models for program synthesis. The text is available here. A paper describing an earlier version of Sketch, which Solar-Lezama used for his Ph.D. thesis project, is available here. Neither version will be taking work away from professional developers any time soon, however.

“The application as a tool-building infrastructure,” Solar-Lezama said. “It still requires a level of expertise and understanding about the underlying technology in order for it not to blow up. As far as the more ambitious goal of everybody dumping C and using Sketch instead, we’d still have to push quite a bit.”


Image: chungking