Cray is blending Apache Hadoop into its CS300 cluster supercomputers.
“The convergence of data-intensive HPC and high-end commercial analytics is forming a new Big Data market IDC calls High Performance Data Analysis, and most of this work is and will be done on clusters,” Steve Conway, an analyst with research firm IDC, wrote in a statement reprinted by Cray on its website. In theory, integrating Hadoop with Cray’s systems could help mitigate some of the issues associated with crunching data-analytics problems on large hardware clusters.
Apache Hadoop has become a popular framework for those who need to analyze massive amounts of unstructured data stored on clusters. A variety of companies, including Intel and Hewlett-Packard, have released hardware and software platforms built atop Hadoop; as a result, it has become difficult for any company (even the huge ones with millions of dollars to spend on research and marketing) to stand out in an increasingly crowded field.
Cray’s Hadoop package includes a Linux operating system, software for managing workloads, Cray Advanced Cluster Engine (ACE) management software, and Intel’s Hadoop distribution.
Announced in February, Intel’s Hadoop distribution was supposedly built “from the silicon up” to help the framework do its job more efficiently, taking advantage of Intel’s hardware work to process data at superior speeds. Intel claims its distribution can process one terabyte of data in seven minutes, versus hours for other systems. The distribution also supports the Intel Advanced Encryption Standard New Instructions (Intel AES-NI) built into Intel Xeon processors, which accelerate AES encryption in hardware.
Owners can load up the CS300 with Intel Xeon processors, Intel Xeon Phi coprocessors for parallel workloads, AMD Opteron processors, and Nvidia Tesla K20/K20X GPU computing accelerators; they would need to opt for the Intel hardware, however, to take full advantage of the Intel Hadoop distribution.
Among supercomputer makers, Cray faces stiff competition: it trails IBM and Hewlett-Packard by a significant margin on the Top500 ranking of the world’s fastest supercomputers, although it leads smaller companies such as Appro, SGI, and Bull.