The number of players in the cloud storage field seems to grow daily. Among the best known are Amazon S3, Microsoft Azure, Google, AT&T, OpenStack (Rackspace), Oracle and IBM. As a developer, where do you start?
If you have ever worked with corporate storage, you may be used to different brand names than these. You may have worked with a Storage Area Network (SAN) or Network Attached Storage (NAS). But cloud storage is typically different from a corporate data store or file system. Most vendors implement it as an object store, where the data object you want to store or retrieve is located somewhere in a large cluster of commodity servers using standard drives (JBOD). The “secret sauce” is the software that controls these data objects, drives and servers.
When evaluating cloud storage for the first time, the biggest hurdle is thinking in terms of objects instead of directories and files. Most cloud storage repositories have a root or primary bucket that you create through your management console, or programmatically via the vendor’s API. This API will become the developer’s best or worst friend, depending on the vendor’s implementation.
As a developer, you may be thinking that all you want to do is store a given file in a given directory that you create in the cloud, yet you can’t find a way to create that directory. The problem is that most, if not all, cloud storage vendors don’t offer one.
Earlier, I mentioned that cloud storage is object-based, and not directory/file based. This simply means you create an object whose name “looks” like a directory structure and file. For example, in your corporate SAN or NAS you may have a directory named something like “myapp/customer1/invoices/” and a file named “inv0010.pdf”, giving a full path of “myapp/customer1/invoices/inv0010.pdf”, where your app is only interested in that particular file.
With cloud storage you can emulate this, but your file name is no longer “inv0010.pdf” but “myapp/customer1/invoices/inv0010.pdf”. Here “myapp” is the primary bucket and “customer1/invoices/inv0010.pdf” is the object that stands in for the file. Note that in some systems, this implementation may be a little different, though the basics will be the same.
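To make the idea concrete, here is a minimal sketch in Python of how a directory-style path can be emulated with a flat object name. It does not use any vendor’s API: the `split_object_path` helper and the in-memory `FlatBucket` class are hypothetical names invented for this illustration, and a real service will differ in the details.

```python
from typing import Dict, List, Tuple


def split_object_path(path: str) -> Tuple[str, str]:
    """Split 'bucket/rest/of/name' into (bucket, object key)."""
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"expected 'bucket/key', got {path!r}")
    return bucket, key


class FlatBucket:
    """Hypothetical in-memory stand-in for a bucket (not a vendor API).

    There are no directories here: every object lives in one flat
    dictionary, and the slashes are just characters in its key.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> List[str]:
        # A "directory listing" is only a filter on the key prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket_name, key = split_object_path("myapp/customer1/invoices/inv0010.pdf")
bucket = FlatBucket(bucket_name)
bucket.put(key, b"%PDF-...")
bucket.put("customer2/invoices/inv0001.pdf", b"%PDF-...")
customer1_invoices = bucket.list_keys("customer1/invoices/")
```

Notice that nothing ever creates “customer1/” or “invoices/”. They exist only as part of the object’s key, which is why you will not find a “create directory” call to go with them.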
A key point is that cloud storage is essentially a virtual storage mapping, as opposed to the absolute mapping via directories that you get with SAN or NAS. Cloud storage systems have a front-end Web service that handles all PUTs and GETs and then cross-references the object to an internal database entry that records where the actual data object resides. As a developer, you never know the absolute path of the object, only a virtual representation of it.
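The virtual mapping can be pictured with a toy model in Python. This is only a sketch under loud assumptions: the `FrontEnd` and `Location` names, the hash-based choice of server and the random blob identifier are all invented for illustration, and no vendor’s internals are being described.

```python
import hashlib
import uuid
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Location:
    """Hypothetical internal address: which server, which stored blob."""
    node: str
    blob_id: str


class FrontEnd:
    """Toy front end: callers see (bucket, key), never a Location."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._nodes = list(nodes)
        # The "internal database": virtual name -> physical location.
        self._index: Dict[Tuple[str, str], Location] = {}
        # Stand-in for the drives on the commodity servers.
        self._blobs: Dict[Location, bytes] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        # Assumed placement rule: pick a server from a hash of the name.
        digest = hashlib.sha256(f"{bucket}/{key}".encode()).digest()
        node = self._nodes[digest[0] % len(self._nodes)]
        location = Location(node, uuid.uuid4().hex)
        previous = self._index.get((bucket, key))
        if previous is not None:
            del self._blobs[previous]  # an overwrite replaces the old blob
        self._blobs[location] = data
        self._index[(bucket, key)] = location

    def get(self, bucket: str, key: str) -> bytes:
        # Cross-reference the virtual name to where the data really is.
        return self._blobs[self._index[(bucket, key)]]


store = FrontEnd(["server-a", "server-b", "server-c"])
store.put("myapp", "customer1/invoices/inv0010.pdf", b"invoice bytes")
invoice = store.get("myapp", "customer1/invoices/inv0010.pdf")
```

The caller passes in a bucket and a key and gets bytes back. Which server holds them, and under what internal identifier, stays hidden behind the front end.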
I’ll follow up with articles to describe how-tos and what to look for in cloud storage for those of you using a mobile device to store/retrieve data objects. Stay tuned.