Craft website Etsy.com has delivered a detailed look at the hardware that powers its data center. The takeaway? It evidently takes quite a bit of processor muscle to deliver cute prints and homemade jewelry to buyers’ screens.
Laurie Denness, a senior operations engineer at Etsy, described in a recent blog post what hardware powers the company’s database servers, Hadoop cluster, and search stack, providing a bit of context for each.
“Traditionally, discussing hardware configurations when running a large website is something done inside private circles; and normally to discuss how vendor X did something very poorly, and vendor Y’s support sucks,” he wrote. “With the advent of the ‘cloud’, this has changed slightly. Suddenly people are talking about how big their instances are, and how many of them. And I think this is a great practice to get in to with physical servers in datacenters too.”
Open discussions of hardware aren’t unusual; the OpenCompute project, for example, provides specifications and design documents for the custom-built servers, racks, and other equipment used in Facebook’s data centers. Even so, hardware configurations may be seen as proprietary, or as so optimized to a particular workload that sharing the data might be of little use. But 37signals got the ball rolling, and Etsy followed through.
For MySQL database servers, Etsy uses the HP DL380. Each occupies 2U of rack space and includes 2x 8 core Intel E5630 CPUs (@ 2.53ghz), 96GB of RAM (for the MySQL buffer cache) and 16x 15,000 RPM 146GB hard disks. The company is just beginning to test SSDs.
“This gives us the right balance of disk space to store user data, and spindles/RAM to retrieve it quickly enough,” Denness wrote.
For general-purpose machines, such as Web servers, Etsy uses a 2U Supermicro chassis, with two power supplies and 12 3.5″ disks on the front of the chassis. “A general configuration for these would be 2x 8 core Intel E5620 CPUs (@ 2.40ghz), 12GB-96GB of RAM, and either a 600GB 7200rpm hard disk or an Intel 160GB SSD,” Denness added.
Denness and Etsy also made the decision to avoid RAID here, if only because the company’s use of Cobbler and Chef means that it can rebuild a machine in under 20 minutes.
Etsy’s Hadoop cluster uses a similar design to its Web servers, incorporating a chassis with 24 2.5-inch disks on the front. “Each node (with 4 in a 2U chassis) has 2x 12 core Intel E5646 CPUs (@ 2.40ghz), 96GB of RAM, and 6x 1Tb 2.5″ 7200rpm disks. That’s 96 cores, 384GB of RAM and 24TB per 2U of rack space,” according to Denness.
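The per-chassis totals Denness quotes follow directly from the per-node figures. A minimal sketch of that arithmetic is below; the `Node` class and `chassis_totals` function are hypothetical names introduced only for illustration, and the core counts are taken exactly as stated in the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Per-node figures as quoted for the Hadoop cluster (hypothetical helper type)."""
    sockets: int
    cores_per_socket: int  # core count as stated in the quote
    ram_gb: int
    disks: int
    disk_tb: int


def chassis_totals(node: Node, nodes_per_chassis: int) -> dict:
    """Multiply the per-node figures by the number of nodes sharing one chassis."""
    return {
        "cores": node.sockets * node.cores_per_socket * nodes_per_chassis,
        "ram_gb": node.ram_gb * nodes_per_chassis,
        "disk_tb": node.disks * node.disk_tb * nodes_per_chassis,
    }


# Figures from the quote: 2x 12 cores, 96GB of RAM, 6x 1TB disks, 4 nodes per 2U.
hadoop_node = Node(sockets=2, cores_per_socket=12, ram_gb=96, disks=6, disk_tb=1)
totals = chassis_totals(hadoop_node, nodes_per_chassis=4)
print(totals)  # {'cores': 96, 'ram_gb': 384, 'disk_tb': 24}
```

The result matches the 96 cores, 384GB of RAM and 24TB per 2U that Denness cites.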
For search, Denness got a new toy to play with: an Intel Sandy Bridge machine, including 2x 16-core Intel E5-2690s clocked at 2.90 GHz, for a total of 128 CPU cores per 2U, Hyperthreading included. “This works so well because search is really CPU bound; we’ve been using SSDs to get around I/O issues in these machines for a few years now,” he wrote.
For backups, Etsy uses another Supermicro chassis, this one a 4U model with 36 3.5-inch disks, paired with a battery-backed LSI RAID controller. Once the chassis is stocked with 2-Tbyte 7,200-RPM drives, Etsy has access to 60 TB of usable drive space across two RAID6 volumes.
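For readers wondering how 36 2-Tbyte drives turn into 60 TB, here is a rough sketch of RAID6 capacity arithmetic, in which each volume gives up two disks' worth of space to parity. The article does not say how Etsy divides the disks between the two volumes, so both layouts below are assumptions, and `raid6_usable_tb` is a hypothetical helper.

```python
def raid6_usable_tb(disks_per_volume: int, disk_tb: int, volumes: int) -> int:
    """Usable space when each RAID6 volume loses two disks' worth of capacity to parity."""
    if disks_per_volume < 4:
        raise ValueError("RAID6 needs at least four disks per volume")
    return (disks_per_volume - 2) * disk_tb * volumes


# Raw capacity from the article: 36 disks of 2 TB each.
raw_tb = 36 * 2

# Assumption 1: all 36 disks split evenly, 18 per volume.
even_split_tb = raid6_usable_tb(disks_per_volume=18, disk_tb=2, volumes=2)

# Assumption 2: 17 disks per volume, leaving two disks as spares.
with_spares_tb = raid6_usable_tb(disks_per_volume=17, disk_tb=2, volumes=2)

print(raw_tb, even_split_tb, with_spares_tb)  # 72 64 60
```

Only the second, purely hypothetical layout lands exactly on the 60 TB figure; the real split may differ, and filesystem overhead could also account for part of the gap.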
Etsy promised a future look at its networking needs, although its internal data traffic doesn’t seem to be overwhelmingly high: both its Web and database machines use only a single Gigabit Ethernet port, even though the chosen machines have more than one. The exception is the backup machines, where Etsy is using both Gigabit Ethernet connections, bonded together to the switch to allow redundancy and extra bandwidth.