Breaking Supercomputers’ Exaflop Barrier

Titan, now the world’s second-fastest supercomputer.

Breaking the exaflop barrier remains a development goal for many who research high-performance computing (HPC).

Some developers predicted that China’s new Tianhe-2 supercomputer would be the first to break through. Indeed, Tianhe-2 did well when it was finally revealed, knocking the U.S.-based Titan off the top of the Top500 list of the world’s fastest supercomputers. Yet despite sustained performance of 33 to 35 petaflops and a peak as high as 55 petaflops, even the world’s fastest supercomputer couldn’t make it past (or even close to) the big barrier.

Now, the HPC market is back to chattering over who’ll first build an exascale computer, and how long it might take to bring such a platform online.

Bottom line: It will take a really long time, combined with major breakthroughs in chip design, power utilization and programming, according to Nvidia chief scientist Bill Dally, who gave the keynote speech at the 2013 International Supercomputing Conference last week in Leipzig, Germany.

In a speech he called “Future Challenges of Large-scale Computing” (and in a blog post covering similar ground), Dally described some of the incredible performance hurdles that need to be overcome in pursuit of the exaflop.

New supercomputers are showing up with a hybrid CPU/GPU design. The CPU is responsible for primary processing; the GPU acts as an accelerator, taking on the massively parallel number-crunching that dominates supercomputing workloads. Without the GPU co-processor, neither Tianhe-2 nor Titan would have been able to come close to their current levels of performance.

The GPU driving both of them (as well as many of the 62 other Top500 supercomputers that rely on CPU/GPU combinations) comes from Nvidia. Rather than touting more powerful GPUs, artful design or an even more ridiculous number of processing cores packed into one system (Tianhe-2 has more than 3 million), Dally zeroed in on the characteristics that could keep even the most powerful supercomputer designs from threatening the exaflop.

Power—electrons travelling in very large groups—is the main limitation, he said. An exascale supercomputer built from today’s Intel CPUs and Nvidia GPUs would require 150 megawatts of steady, clean power: ten times the amount currently used by Tianhe-2. HPC developers need to take a different tack on the way supercomputers use electricity, improving average power efficiency by a factor of 25. That would work out to about 50 gigaflops per watt in an exascale computer.
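The efficiency target can be checked with quick arithmetic. A minimal sketch using only the article’s figures (Tianhe-2’s roughly 15 MW draw is inferred here from the “ten times” comparison to 150 MW, not stated directly):

```python
# Back-of-the-envelope check of Dally's 25x efficiency target.
# Figures from the article; Tianhe-2's ~15 MW draw is inferred
# from "ten times" the projected 150 MW.

EXAFLOP = 1e18  # floating-point operations per second

tianhe2_flops = 33e15   # ~33 petaflops sustained
tianhe2_power = 15e6    # ~15 MW (inferred)

current_eff = tianhe2_flops / tianhe2_power   # flops per watt today
target_eff = current_eff * 25                 # Dally's 25x improvement

print(f"current efficiency: {current_eff / 1e9:.1f} gigaflops/W")
print(f"25x target:         {target_eff / 1e9:.1f} gigaflops/W")
print(f"exaflop power at target: {EXAFLOP / target_eff / 1e6:.0f} MW")
```

The 25x multiple lands close to the 50-gigaflops-per-watt figure Dally cites, and it brings an exaflop machine’s power budget down from 150 megawatts to a few tens of megawatts.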

Which would be great if it were easy to do—but that big a leap in efficiency isn’t easy under any circumstances, let alone the rarefied engineering environment that surrounds the supercomputer competition. Dally is designing better power-management features into Nvidia chips, but it would take quite a long time for those benefits to find their way downstream to supercomputers.

Big improvements in power management will bring big improvements in performance, but it’s not the power or the processors that are the biggest drag on supercomputer speed right now, Dally told Scientific Computing.

To reach exaflop-level performance, a system built with x86 processors would have to draw a tremendous amount of power: two full gigawatts, roughly the entire power output of the Hoover Dam, he said. Legacy software built for backward compatibility rather than optimized for speed can also put the drag on a supercomputer faster than anything else.
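The two-gigawatt figure also implies how inefficient an x86-only exascale design would be. A quick check, using only the numbers in the article:

```python
# What efficiency does "2 GW for an exaflop" imply for an
# x86-only design? Figures taken from the article.

EXAFLOP = 1e18      # floating-point operations per second
x86_power = 2e9     # ~2 GW, roughly the Hoover Dam's output

implied_eff = EXAFLOP / x86_power  # flops per watt
print(f"implied x86-only efficiency: {implied_eff / 1e9:.1f} gigaflops/W")
```

That works out to half a gigaflop per watt, two orders of magnitude short of the roughly 50 gigaflops per watt Dally says an exascale machine needs.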

Dally suggested it would likely be necessary to create a whole new software architecture and programming techniques that could give the computer its instructions as quickly as possible and, otherwise, “stay out of the way.”


Image: Oak Ridge National Laboratory