Amazon’s Glacier Offers Archival Storage in the Cloud

by Nick Kolakowski Aug 21, 2012 3 min read

[caption id="attachment_3743" align="aligncenter" width="618"]


With Glacier, users create a "vault" for data storage.[/caption]

Amazon is expanding its reach into the low-cost, high-durability archival storage market with the newly announced Glacier. While Glacier allows companies to transfer their data-archiving duties to the cloud (a potentially money-saving boon for many a budget-squeezed organization), the service comes with some caveats. Its cost structure and slow speed of data retrieval make it best suited for data that needs to be accessed infrequently, such as years-old legal records and research data.

“With Glacier,” Jeff Barr, an evangelist for Amazon Web Services, wrote in an August 21 posting on the Amazon Web Services Blog, “you can store any amount of data with high durability at a cost that will allow you to get rid of your tape libraries and robots and all the operational complexity and overhead that have been part and parcel of data archiving for decades.”

How much is that cost, exactly? Amazon apparently plans on charging “as low as $0.01 (one US penny, one one-hundredth of a dollar) per Gigabyte, per month,” with no upfront fee. “You don’t have to worry about capacity planning and you will never run out of storage space,” Barr wrote.

If you think that sounds quite a bit like Amazon Simple Storage Service, otherwise known as Amazon S3, you’d be correct. Both Amazon S3 and Glacier have been designed to store and retrieve data from anywhere with a Web connection. However, Amazon S3, which the company describes as “designed to make Web-scale computing easier for developers,” is meant for rapid data retrieval. Contrast that with a Glacier data-retrieval request (referred to as a “job”), which can take between 3 and 5 hours before the data is ready for downloading. Glacier retrieval requests also follow a different pricing structure from S3, with retrieval fees starting at $0.01 per GB (retrieving the first 5 percent of the customer’s average monthly storage is apparently free, however).
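To make those figures concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a deliberately simplified model built only from the numbers quoted above: a flat $0.01 per GB per month for storage, a flat $0.01 per GB for retrieval, and a free retrieval allowance equal to 5 percent of the data stored. The function name and the flat-rate treatment are illustrative assumptions, not Amazon's actual billing rules, which may differ by region and usage pattern.

```python
# Simplified cost model using only the figures quoted in the article.
# The flat-rate treatment is an assumption, not Amazon's billing logic.
STORAGE_PRICE_PER_GB_MONTH = 0.01  # "as low as" storage price
RETRIEVAL_PRICE_PER_GB = 0.01      # starting retrieval fee
FREE_RETRIEVAL_FRACTION = 0.05     # first 5 percent of stored data is free


def estimate_monthly_cost(stored_gb: float, retrieved_gb: float) -> float:
    """Return a rough monthly bill in US dollars, rounded to cents."""
    storage_cost = stored_gb * STORAGE_PRICE_PER_GB_MONTH
    # Only retrieval beyond the free allowance is billed.
    free_gb = stored_gb * FREE_RETRIEVAL_FRACTION
    billable_gb = max(0.0, retrieved_gb - free_gb)
    retrieval_cost = billable_gb * RETRIEVAL_PRICE_PER_GB
    return round(storage_cost + retrieval_cost, 2)
```

Under these assumptions, an organization storing 10,000 GB and retrieving 200 GB in a month would stay inside the free allowance and pay only for storage, while retrieving 1,500 GB would add a fee for the 1,000 GB beyond the allowance.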
“For data that you’ll need to retrieve in greater volume more frequently, S3 may be a more cost-effective service,” Barr added, in an unusual bit of candor for these types of corporate blog postings.

Glacier also assigns a unique ID to each archive at upload time, in contrast to S3, which lets the user give each object its own name. When starting with Glacier, users create a “vault” (the limit is 1,000 vaults per region in the customer’s AWS account) and upload their data. Amazon refers to each unit of uploaded data as an “archive,” which can hold up to 40 terabytes.

The data is subsequently encrypted with AES-256 and stored with “high durability.” Amazon claims that Glacier can “sustain the concurrent loss of data in two facilities,” and that it conducts systematic data-integrity checks without the need for user intervention.

Glacier is available starting Aug. 21 in Amazon’s US-East (N. Virginia), US-West (N. California), US-West (Oregon), Asia Pacific (Tokyo), and EU-West (Ireland) Regions. Amazon has added Glacier support to the AWS SDKs, and additional documentation is also available.

Image: Amazon
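The vault, archive, and retrieval-job concepts described above can be illustrated with a toy in-memory model in Python. This is not the AWS SDK and makes no network calls: the class names (`Region`, `Vault`, `RetrievalJob`) and their methods are hypothetical, chosen only to mirror the terms in the article. It also assumes that "40 terabytes" means binary terabytes and that archive IDs are random strings.

```python
import uuid
from dataclasses import dataclass, field

MAX_VAULTS_PER_REGION = 1000        # limit quoted in the article
MAX_ARCHIVE_BYTES = 40 * 1024 ** 4  # "up to 40 terabytes" (binary TB assumed)


@dataclass
class Vault:
    name: str
    archives: dict = field(default_factory=dict)  # archive ID -> payload

    def upload_archive(self, data: bytes) -> str:
        """Store data and return an ID chosen by the service, not the caller."""
        if len(data) > MAX_ARCHIVE_BYTES:
            raise ValueError("archive exceeds the 40 TB limit")
        archive_id = uuid.uuid4().hex
        self.archives[archive_id] = data
        return archive_id


@dataclass
class Region:
    vaults: dict = field(default_factory=dict)  # vault name -> Vault

    def create_vault(self, name: str) -> Vault:
        """Create a vault, or return the existing one with that name."""
        if name not in self.vaults and len(self.vaults) >= MAX_VAULTS_PER_REGION:
            raise RuntimeError("vault limit reached for this region")
        return self.vaults.setdefault(name, Vault(name))


@dataclass
class RetrievalJob:
    vault: Vault
    archive_id: str
    ready: bool = False  # the real service reportedly takes 3 to 5 hours

    def complete(self) -> None:
        """Stand-in for the service finishing the job some hours later."""
        self.ready = True

    def download(self) -> bytes:
        if not self.ready:
            raise RuntimeError("retrieval job is not ready yet")
        return self.vault.archives[self.archive_id]
```

In this sketch, a caller would create a vault through `Region.create_vault`, keep the ID returned by `upload_archive` (there is no user-chosen name to fall back on), and later start a `RetrievalJob` that only yields data once it has been marked complete, mirroring the delayed, job-based retrieval that sets Glacier apart from S3.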

[caption id="attachment_3743" align="aligncenter" width="618"]