Expanding the Cloud – Managing Cold Storage with Amazon Glacier
Managing long-term digital archiving is a challenge for almost every company. With the introduction of Amazon Glacier, IT organizations now have a solution that removes the headaches of digital archiving and provides extremely low-cost storage.
Many organizations have to manage some form of long-term archiving. Enterprises have regulatory and business requirements to retain everything from email to customers’ transactions; hospitals create archives of all digital assets related to patients; research and scientific organizations are creating substantial historical archives of their findings; governments want to provide long-term open data access; media companies are creating huge repositories of digital assets; and libraries and other organizations have been looking to archive everything that takes place in society.
It’s no surprise that digital archiving is growing exponentially. One factor in this growth is the presence of image, audio and video archives, which are often archived in their original unedited form. This is apparent in the media industry, where film re-edits, for example, are no longer just about revisiting the original 35 mm or 65 mm film but rather all the digital content captured by the 2K or 4K cameras that were used during filming.
Building and managing archive storage that needs to remain operational for decades, if not centuries, is a major challenge for most organizations. From selecting the right technology, to maintaining multi-site facilities, to dealing with exponential and often unpredictable growth, to ensuring long-term digital integrity, digital archiving can be a major headache. It requires substantial upfront capital investments in cold data storage systems such as tape robots and tape libraries. Then there are the expensive support contracts, and the ongoing operational expenditures such as rent and power. This can be extremely painful for most organizations, as many of these expenditures of financial and intellectual capital do not contribute to the operational success of the business.
Using Amazon Glacier, AWS customers no longer need to worry about how to plan and manage their archiving infrastructure. Unlimited archival storage is available to them with a familiar pay-as-you-go model, and with storage priced as low as 1 cent per GB per month it is extremely cost-effective. The service redundantly stores data in multiple facilities and on multiple devices within each facility, and Amazon Glacier is designed to provide average annual durability of 99.999999999% for each item stored.
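To make the pricing concrete, here is a small back-of-the-envelope sketch. It assumes the quoted 1 cent per GB is a flat monthly storage rate and ignores any request or retrieval charges; the function names and the growth scenario are purely illustrative and not part of any AWS tooling.

```python
# Illustrative only: assumes the "1 cent per GB" figure is a flat rate
# billed per GB per month, with no request or retrieval charges.
PRICE_PER_GB_MONTH_USD = 0.01


def monthly_storage_cost(stored_gb, price_per_gb=PRICE_PER_GB_MONTH_USD):
    """Cost in USD of keeping `stored_gb` gigabytes archived for one month."""
    return stored_gb * price_per_gb


def cumulative_cost(start_gb, monthly_growth_gb, months,
                    price_per_gb=PRICE_PER_GB_MONTH_USD):
    """Total cost in USD over `months`, with the archive growing linearly."""
    total = 0.0
    stored = start_gb
    for _ in range(months):
        total += monthly_storage_cost(stored, price_per_gb)
        stored += monthly_growth_gb  # new data is billed from the next month
    return total
```

Under these assumptions a 1 TB (1,024 GB) archive costs roughly 10 dollars a month to keep, which is the kind of number that changes the "keep it or delete it" conversation.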
In Amazon Glacier, data is stored as archives. Archives are uploaded to Glacier and organized in vaults, and customers can control access to those vaults using the AWS Identity and Access Management (IAM) service. Data is retrieved by scheduling a job, which typically completes within 3 to 5 hours.
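The vault, archive and retrieval-job flow described above can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes a client object shaped like the one returned by `boto3.client("glacier")`, and the helper names `store_archive` and `retrieve_archive`, as well as the polling interval, are my own choices for illustration.

```python
import time


def store_archive(glacier, vault_name, payload, description=""):
    """Upload `payload` (bytes) as one archive and return its archive ID."""
    # Creating a vault that already exists is treated as a no-op by the service.
    glacier.create_vault(vaultName=vault_name)
    response = glacier.upload_archive(
        vaultName=vault_name,
        archiveDescription=description,
        body=payload,
    )
    return response["archiveId"]


def retrieve_archive(glacier, vault_name, archive_id, poll_seconds=900):
    """Schedule a retrieval job, wait for it to complete, return the bytes."""
    job = glacier.initiate_job(
        vaultName=vault_name,
        jobParameters={"Type": "archive-retrieval", "ArchiveId": archive_id},
    )
    job_id = job["jobId"]
    # Retrieval is asynchronous and typically takes hours, so poll sparingly.
    while not glacier.describe_job(vaultName=vault_name, jobId=job_id)["Completed"]:
        time.sleep(poll_seconds)
    output = glacier.get_job_output(vaultName=vault_name, jobId=job_id)
    return output["body"].read()
```

With boto3 installed and credentials configured, `glacier = boto3.client("glacier")` would be passed as the first argument. Note that the archive ID returned by the upload is the only handle to the data, so it needs to be recorded somewhere durable. In practice a blocking poll loop is rarely what you want for a multi-hour job; a notification on job completion is the more natural fit, and the loop here is only to keep the sketch self-contained.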
Amazon Glacier integrates seamlessly with other AWS services such as Amazon S3 and the various AWS database services. What’s more, in the coming months, Amazon S3 will introduce an option that will allow customers to seamlessly move data between Amazon S3 and Amazon Glacier based on data lifecycle policies.
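For a sense of what such a lifecycle policy can look like, here is a hedged sketch. The post only announces the option, so the shape below is an assumption on my part: it follows the lifecycle configuration format that the S3 API accepts today, and the helper name, rule ID, prefix and day count are all hypothetical.

```python
def glacier_transition_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build an S3 lifecycle configuration that moves objects under
    `prefix` to the GLACIER storage class `days` days after creation."""
    return {
        "Rules": [
            {
                "ID": rule_id,                  # hypothetical rule name
                "Filter": {"Prefix": prefix},   # which objects the rule covers
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }
```

With boto3, the resulting dictionary would be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration` on an S3 client, along with the bucket name.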
Although archiving is often associated with established enterprises, many SMBs and startups have similar archival needs, but dedicated archiving solutions have been out of their reach, either because of the upfront capital investments required or because they lack the bandwidth to deal with the operational burden of traditional storage systems. With Amazon Glacier, any organization now has access to the same data archiving capabilities as the world’s largest organizations. We see many young businesses engaging in large-scale big-data collection activities, and storing all this data can become rather expensive over time. Archiving their historical data sets in Amazon Glacier is an ideal solution.
A Complete Storage Solution
With the arrival of Amazon Glacier, AWS now has a complete package of data storage solutions that customers can choose from:
Amazon Simple Storage Service (S3) – provides highly available and highly durable (“designed for 11 nines”) storage that is directly accessible.
Amazon S3 Reduced Redundancy Storage (RRS) – provides the same highly available, directly accessible data storage, but relaxes the durability guarantee to 99.99% at a reduced cost. This is an ideal solution for customers who can regenerate objects or who keep the master copies in separate storage locations.
Amazon Glacier – provides the same high durability guarantee as Amazon S3 but relaxes the access times to a few hours. This is the right service for customers who have archival data that requires highly reliable storage but for which immediate access is not needed.
AWS Direct Connect – provides dedicated bandwidth between customers’ on-premises systems and AWS regions to ensure sufficient transmission capacity.
AWS Import/Export – for those datasets that are too large to transmit via the network, AWS offers the ability to upload and download data using disks that can be shipped.
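The durability figures in the list above are easier to compare once they are turned into expected losses. The sketch below assumes that "average annual durability" can be read as the expected fraction of stored objects that survive a year, which is a simplification; the function name and the object count are illustrative. `Decimal` is used so that eleven nines do not drown in floating-point noise.

```python
from decimal import Decimal


def expected_annual_losses(object_count, durability):
    """Expected number of objects lost per year, given a durability
    expressed as a decimal string such as "0.9999"."""
    loss_rate = Decimal(1) - Decimal(durability)
    return Decimal(object_count) * loss_rate


# Ten million stored objects under the two durability levels in the list.
eleven_nines = expected_annual_losses(10_000_000, "0.99999999999")  # S3, Glacier
four_nines = expected_annual_losses(10_000_000, "0.9999")           # RRS
```

Under that reading, ten million objects at 99.99% durability means on the order of a thousand expected losses a year, while at eleven nines the expectation is one ten-thousandth of an object a year, or a single loss every 10,000 years on average. That gap is why RRS is positioned for data that can be regenerated and Glacier for data that cannot.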
More information
With the arrival of Amazon Glacier, AWS now has a set of very powerful, easy-to-use storage solutions that can serve almost all scenarios. For more information on Amazon Glacier, visit the detail page and the posting on the AWS developer blog.
If you are an engineer or engineering manager with an interest in massive-scale distributed storage systems, we’d love to hear from you. Please send your resume to glacier-jobs@amazon.com.