Amazon Web Services recently announced that it is reducing charges for its popular S3 cloud storage offering. The announcement boasts a reduction of up to 19%. For the first terabyte, the price went from 15 cents per GB per month to 14 cents per GB per month, which works out to a little under 7%; savings are higher as the amount of storage you use increases. For those not familiar with how resources are priced in the cloud, Amazon S3 pricing is in keeping with the cloud promise of consumption-based pricing. In other words, you pay as you go and you pay only for what you actually use. While a reduction of a penny per gigabyte per month does not seem like a big deal, I like the trend. Amazon and other cloud providers have been driving prices down, and that is great news for all of us.
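To make the arithmetic concrete, here is a minimal sketch of consumption-based pricing at the first-terabyte tier. The two prices come from the announcement as described above; the function names and the 500 GB figure are made up for illustration, and higher-volume tiers are not modelled.

```python
OLD_PRICE = 0.15  # dollars per GB per month, first-terabyte tier (before the cut)
NEW_PRICE = 0.14  # dollars per GB per month, first-terabyte tier (after the cut)


def monthly_cost(gb_stored, price_per_gb):
    """Pay-as-you-go: the bill is simply what you stored times the unit price."""
    return gb_stored * price_per_gb


def percent_reduction(old_price, new_price):
    """Size of a price cut, as a percentage of the old price."""
    return (old_price - new_price) / old_price * 100


stored_gb = 500  # hypothetical amount of data kept in S3
print(f"old bill: ${monthly_cost(stored_gb, OLD_PRICE):.2f}")
print(f"new bill: ${monthly_cost(stored_gb, NEW_PRICE):.2f}")
print(f"reduction: {percent_reduction(OLD_PRICE, NEW_PRICE):.1f}%")
```

At this tier the cut comes to roughly 6.7%, which is why the bigger headline number only applies to larger storage volumes.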
Understanding Amazon cloud storage
For us database people, storage is a very important resource, so any time storage prices go down, we cheer. However, when it comes to the cloud, not all storage is created equal. As a matter of fact, Amazon S3 storage is not suitable as “database storage”. Amazon S3 storage does not look like your regular disk. It is more of a file-based store organized into buckets, and you interact with it using HTTP PUT/GET requests. I am not aware of any DBMS that can use the HTTP protocol to work with storage, nor would I want to use such a DBMS even if it existed. There are two other types of AWS storage that are much more suitable for database work. One is called “instance storage”, and it is included in the price of the various Amazon EC2 instances (servers), with the exception of the new, very inexpensive (2 cents/hour) Micro instances. Instance storage is a block device, i.e. a disk drive, and you put a file system on it that looks exactly like what you would expect on a physical server in your own data centre. There is an unexpected twist to instance storage, though. As the name suggests, this storage is part of your instance, and it appears and disappears with it. So, unlike a real server in your data centre, which keeps the data on its hard drive when it crashes or is shut down, instance storage loses any and all of its data when the instance that owns it is terminated. Ouch, I hope you did not keep your critical database there.
Don’t despair: Amazon Web Services does offer storage that is more in line with what we database people are used to and need. It is called Elastic Block Store, or EBS. Amazon EBS storage behaves exactly as we would expect storage to behave. It is available as volumes of 1 GB to 1 TB that you attach to your EC2 instance (server in the cloud). You can attach multiple volumes and you can RAID them. You put your favourite file system on them so your DBMS can read and write data.
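Because a single volume tops out at 1 TB, a larger database has to be spread over several volumes. Here is a rough sizing sketch, assuming equal-sized volumes combined into a striped set with no redundancy overhead and taking 1 TB to mean 1024 GB. The function name is hypothetical and this is not an AWS API.

```python
import math

MAX_VOLUME_GB = 1024  # assumption: the 1 TB per-volume ceiling, expressed in GB


def plan_volumes(database_gb, max_volume_gb=MAX_VOLUME_GB):
    """Return (volume_count, size_of_each_gb) for an equal-sized striped set."""
    if database_gb <= 0:
        raise ValueError("database size must be positive")
    # Fewest volumes that can hold the data without exceeding the per-volume cap.
    count = math.ceil(database_gb / max_volume_gb)
    # Split the space evenly, rounding up so the set is never too small.
    size_each = math.ceil(database_gb / count)
    return count, size_each


print(plan_volumes(2500))  # (3, 834)
```

A mirrored or parity-based RAID level would need more raw capacity than this sketch accounts for.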
What is S3 storage good for?
So, if Amazon S3 is this strange HTTP-accessible storage that DB2 and other DBMSs can’t use, why do we care that its price went down? Our team (IBM Information Management Cloud Computing Centre of Competence) puts out best practices for leveraging the cloud, and we also codify these best practices by creating templates and images. The best practices that we have put into RightScale templates, for example, implement this configuration:
As you can see in this diagram, we utilize all three storage types in a typical DB2 configuration. We use EBS as database storage; no surprises there. We recommend (and codify this in our scripts) using instance storage as a temporary repository for backup images. This allows for very fast disk-based backups and, more importantly, quick recovery times. However, since instance storage is not persistent, we provide scripts that copy backup images and archived logs into Amazon S3. Amazon S3 storage is very durable and inexpensive and, as I said before, you pay only for what you use. This means you can keep multiple backup images around and not worry too much about the cost. This solution is much more resilient, and lets you deliver on more stringent recovery time objectives, than a comparable tape-based backup solution in a typical on-premises system.
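The copy step can be sketched roughly as follows. This is not our actual script: the function names, the `.bak` file pattern and the `backups/` key prefix are made up for illustration, and the uploader is a stand-in that copies into a local directory so the example runs without an AWS account. In practice you would pass in a function that issues the S3 PUT request instead.

```python
import shutil
import tempfile
from pathlib import Path


def archive_backups(staging_dir, upload, pattern="*.bak"):
    """Send every backup image found in staging_dir to durable storage.

    `upload` is any callable taking (local_path, object_key). The local copy
    is left in place so a recent image stays on fast disk for quick recovery.
    """
    uploaded = []
    for image in sorted(Path(staging_dir).glob(pattern)):
        upload(image, f"backups/{image.name}")
        uploaded.append(image.name)
    return uploaded


def make_directory_uploader(target_dir):
    """Stand-in for an S3 client: 'uploads' by copying into a local directory."""
    target = Path(target_dir)

    def upload(local_path, key):
        destination = target / key
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(local_path, destination)

    return upload


# Demo with throwaway directories playing the instance storage and the bucket.
with tempfile.TemporaryDirectory() as staging, tempfile.TemporaryDirectory() as bucket:
    (Path(staging) / "SAMPLE.0.bak").write_bytes(b"backup image")
    print(archive_backups(staging, make_directory_uploader(bucket)))
```

Passing the uploader in as a parameter keeps the staging logic independent of the storage client, so the same function could handle archived logs by changing the pattern.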
There is more. Those of you who use the free DB2 Express-C in the cloud and have large databases may be interested in getting the yearly subscription. We just reduced its price to $1,990/server, and it offers value that cannot be beaten. One of the reasons to get the subscription is that you get the ability to compress backups. Compressing backups makes the backup and recovery process faster and saves on the space needed to keep backup images. We typically see 70% compression ratios. This means that, even for multi-terabyte databases, your chances of running out of space on the instance storage used as a temporary location for backup images during backup or recovery are almost nil. And because backup images are compressed, they are much smaller and much cheaper to keep around on Amazon S3. Backup compression is only one of the extra features you get when you purchase the yearly subscription for $1,990/server. You also get double the memory and CPU capacity, data replication, clustering for high availability and disaster recovery, not to mention 24x7 worldwide support. Also, as of October 5, 2010, you get Optim Database Administrator ($5,410/user value) and Optim Development Studio ($860/user value). $1,990 for a DB2 server with an unlimited number of users and all of that function sounds pretty good, doesn’t it? This is about $3,000 per year cheaper than a comparable (I am being generous) MySQL subscription.
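To put a number on the S3 savings from compression, here is a back-of-the-envelope sketch. It reads the 70% figure as a 70% reduction in image size, which is an assumption, and it uses the 14 cents per GB per month first-terabyte price as a flat rate. The database size, the number of images kept and the function names are hypothetical.

```python
S3_PRICE_PER_GB_MONTH = 0.14  # first-terabyte price, applied as a flat rate


def compressed_size_gb(raw_gb, space_saved=0.70):
    """Size of a backup image after compression removes `space_saved` of it."""
    return raw_gb * (1 - space_saved)


def monthly_s3_cost(image_gb, images_kept, price=S3_PRICE_PER_GB_MONTH):
    """Monthly cost of keeping several backup images of the same size."""
    return image_gb * images_kept * price


raw_image_gb = 200  # hypothetical uncompressed backup image
images_kept = 4     # hypothetical retention: four images on hand

for label, size in (
    ("uncompressed", raw_image_gb),
    ("compressed", compressed_size_gb(raw_image_gb)),
):
    cost = monthly_s3_cost(size, images_kept)
    print(f"{label}: {size:.0f} GB per image, ${cost:.2f}/month for {images_kept} images")
```

The same ratio applies to the temporary space needed on instance storage, which is why running out of room there becomes unlikely.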