CERN Testing Cloud for Crunching Universe’s Secrets

In order to obtain results from its experiments, including those with the Large Hadron Collider, CERN writes its own research and analytics software.

The European Organization for Nuclear Research (known as CERN) requires truly epic hardware and software to analyze some of the biggest questions about the nature of the universe. While much of that computing power stems from a network of data centers, CERN is considering a more aggressive move to the cloud for its data-crunching needs.

To that end, CERN has partnered with Rackspace on a hybrid cloud built atop OpenStack, an open-source Infrastructure-as-a-Service (IaaS) platform originally developed by Rackspace as part of a joint effort with NASA. OpenStack has gained favor with a number of companies seeking to expand their cloud portfolios, including Hewlett-Packard and IBM.

Joint initiatives between CERN and Rackspace will include creating a reference architecture and operational model for federated cloud technologies based on OpenStack, as well as personnel support (Rackspace will fund one full-time CERN member).

Tim Bell, leader of CERN’s OIS Group within its IT department, suggested in an interview with Slashdot that CERN and Rackspace will initially focus on simulations—which he characterized as “putting into place the theory and then working out what the collision will have to look like.”

“I would expect that there would be investigations into data analysis in the cloud in the future but there is no timeframe for it at the moment,” Bell wrote in a follow-up email. “The experiences running between the two CERN data centers in Geneva and Budapest will already give us early indications of the challenges of the more data intensive work.”

CERN’s physicists write their own research and analytics software, using a combination of C++ and Python running atop Linux. “Complex physics frameworks and the fundamental nature of the research makes it difficult to use off-the-shelf [software] packages,” Bell added.

The outcomes of the collaboration between CERN and Rackspace will help everyone involved better understand the workloads that can be placed on the public cloud. CERN’s private cloud will run 15,000 hypervisors and 150,000 virtual machines by 2015—any public cloud will likely need to handle similarly massive loads with a minimum of latency. By running small tests with a variety of public-cloud providers, CERN can determine how to best distribute workloads, puzzle out those latency questions, and eventually take on some of its bigger computational challenges.

And what challenges those are. Last summer, for example, CERN announced the discovery of a particle consistent in its behavior with the Higgs boson. Pursued by physicists for decades, the Higgs particle is considered a missing element in the Standard Model of particle physics; identify it with certainty, and you take a major step toward answering how elementary particles assume mass. CERN runs regular experiments with the Large Hadron Collider (LHC), a giant particle accelerator; it’s likely that, thanks to all the high-tech equipment at its disposal (and a little help from the cloud), many more universe-bending discoveries await in the future.


Image: CERN