Intel plans on launching an Apache Hadoop distribution optimized for its own hardware, with global distribution planned for the second quarter of this year.
While the move may come as a surprise to some, Intel has been signaling its software ambitions since at least 2009, when it bought embedded software developer Wind River for just under $900 million. Intel then purchased security vendor McAfee, along with a host of smaller developers, in a bid to enhance its hardware offerings. That’s led to the appearance of products such as Intel’s cache acceleration software for SSDs, which works with system memory to speed up data access beyond what flash alone can deliver.
Intel also beefed up its big data capabilities. It invested in MongoDB company 10gen and big data analytics solution provider Guavus Analytics; its Project Rhino is an attempt to harden the data-protection capabilities of Hadoop. The company’s Intel Graph Builder software, designed to provide visualizations of the analyzed data, has been optimized for the new distribution.
Intel follows closely on the heels of EMC, which launched its own distribution on Monday; Hortonworks and Cloudera have also rolled out improvements to their Hadoop platforms in the past few days.
Intel will contribute improvements from the Intel Distribution for Apache Hadoop (as it’s formally known) back to the open-source community as fast as it can, Boyd Davis, vice president and general manager of Intel’s Datacenter Software division, said at a Feb. 26 press conference in San Francisco: “The goal is not to fork the code or cause any dissension in the community.”
Intel said that by upgrading an existing server—swapping an older Xeon 5690 chip for an E5-2690, adding an Intel 520 SSD, and 10 Gbit Ethernet adapters—Hadoop’s performance (as measured by a terabyte-sort operation) can be dramatically improved.
The distribution will also run Hive queries up to 8.5 times faster and add hardware-enhanced compression using AVX and SSE 4.2 instructions. Intel said that the data being processed could be encrypted as well, tapping into hardware AES support within the Xeon.
Intel obviously has its own interests in mind; by selling the Hadoop distribution, the company believes that it can pull in the launch ramp of new Xeons for the data center by two years. Davis said that for every dollar of operating margin the new Hadoop distribution generates, Intel expects to sell four dollars’ worth of hardware. That, in turn, has prompted Intel to commit to Hadoop in a big way.
Image: Mark Hachman