Santa’s Sleigh: The Ultimate Data Problem?

Faster! Run faster!

On Christmas Eve, as we all know, Santa Claus hops in his sled and delivers toys to all the good boys and girls across the world. It’s a monumental job, and since elves refuse to talk to journalists, we really have no idea how he accomplishes it.

However, we could possibly figure out the route Santa takes across the world. In fact, a good chunk of that work is already finished.

The “Santa problem” is essentially the traditional “traveling salesman” problem, writ large. In simple terms, the original problem is this: for three or more cities, what route would minimize the distance (and, therefore, the travel time and cost) between them? For a small number of cities, someone relatively skilled in the relevant mathematics can solve the problem almost by hand, simply by calculating the number of routes involved (a factorial of the number of cities) and then comparing the distances. Larger sample sizes expand the scale exponentially, requiring the use of heuristic algorithms to determine optimal solutions.

A Brief History of the TSP

According to a history of the problem compiled by Georgia Tech, the traveling salesman problem (or “TSP”) dates back to the 1800s, when Irish mathematician Sir William Rowan Hamilton challenged friends to navigate 20 points in the shortest path using only a series of specified connections. But it wasn’t until the first decades of the 20th century when statisticians began to tackle the challenge in a more systematic way. In 1954, George Dantzig, Ray Fulkerson, and Selmer Johnson figured out the optimal route through 49 U.S. cities, one in each state, plus Washington D.C. Further elaborations on the problem have included 532 AT&T switch locations (1987), the 13,509 cities in the U.S. with populations greater than 500 (1998) and 24,798 cities in Sweden (2004).

David Applegate of AT&T Labs, along with a number of others, crafted that Swedish conundrum; the solution is hosted on a Website run by Georgia Tech. The problem was run on a cluster of 96 2.8-GHz dual-core Xeons at Georgia Tech’s School of Industrial and Systems Engineering in 2004, and it took months (if normalized to a single CPU, the process would have required 84.8 CPU years).

“The total computation for the Sweden TSP makes it one of the largest efforts to date to solve an optimization problem,” the authors at the time. “We can expect that more powerful computational platforms in the future will permit the study of even larger instances, but further progress in the solution of the TSP will no doubt depend much more heavily on improved algorithmic techniques than improved computing hardware.”

From there, the obvious question was: what about every city in the world? Enter the World TSP Problem, which involved a 1,904,711-city instance of locations throughout the world: in other words, all locations that were registered as populated cities or towns, plus several research bases in Antarctica.

According to Georgia Tech, the best solution came from Danish researcher Keld Hausgaun, who in October 2011 established the most optimized route: 7,515,778.188 kilometers (or about 4,670,090 miles). That’s roughly 19 one-way trips back and forth from the earth to the moon. According to the university, Keld Helsgaun’s tour length is, at most, 0.0474 percent greater than the length of an optimal tour. (We attempted to contact Helsgaun for more information, but haven’t heard back.)

The Santa Problem

Just for fun, we asked Apple, Google, Microsoft, Mapquest, and Navteq to chart the most optimized route for Santa to fly on Dec. 31, originating and ending at the North Pole, to enable Santa to travel the least distance in distributing toys around the globe. Although we originally hoped a global solution, we decided to accept a solution just for the United States. (Which, of course, is much more feasible.)

Specifically, we asked the map providers to deliver the shortest route between every home on the globe, or at least the United States. We’d like to know how many homes Santa will visit, and we’d like to the distance traveled to do so. That would require knowing how many buildings there are (and which are businesses—sorry, no toys for you) and the fastest route between them. Heck, we would even accept a road route, using the providers’ established navigation algorithms, rather than an airborne path.

Google’s declined, and Microsoft has yet to formally commit, although it was happy to point out that it powers the NORAD Santa Tracker app.

How hard is the problem? According to 2009 housing data published by the U.S. Census, there are about 130,112,000 homes within the United States, making the U.S. Santa problem at least a couple of orders of magnitude more difficult than the world-city TSP problem that Hausgaun has calculated. (Part of the problem would be figuring out where those homes were located.)

We’d have to expect that the Santa problem is solvable, at least for the United States; it’s just a matter of the computing resources available to throw at it, plus a database of home locations, which a company like Google should already have in its possession.

Trying to solve the problem would arguably be an excellent test of both computing infrastructure and algorithmic know-how. But so far, however, the mapping companies have proven to be more of a Grinch than anything else. Bah, humbug!


Image: ollyy/