Amazon Web Services (AWS) S3 is probably one of the most popular storage options today. Over the years S3 has proven to be very cost effective and reliable. It was initially targeted at small to medium-sized companies that did not want to make a huge investment in storage infrastructure, but its ease of use has made it a popular option for everyone. S3 not only offers a fast and reliable way to store your data, but also offers various ways to interact with and retrieve that data via SDKs and APIs. S3 today is very different from what it was 5 or 10 years ago. If you want to use S3 today, you are faced with choices like:
- S3 Standard Storage.
- S3 Infrequent Access Storage.
- S3 Reduced Redundancy Storage.
- S3 Glacier Storage.
These are referred to as storage classes. If your storage needs are small, the default option of S3 Standard Storage will suffice, but if you are storing or retrieving data on a multi-terabyte or multi-petabyte scale, you need to look into the different S3 storage classes to find the most cost-effective option. Remember that storage cost is only one dimension of your total S3 costs. Your total also includes the cost of making requests to S3 and of transferring data out of it:
- Number of PUT/COPY/POST/LIST requests.
- Number of GET and other requests.
- The amount of data that is transferred from S3 and out to the internet.
The request costs you pay depend on your usage pattern. If you can match your usage pattern to one of the non-standard S3 storage classes, you can potentially save a lot on your total S3 costs.
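To make the cost dimensions above concrete, here is a minimal sketch that adds up storage, request and transfer charges for one month. The `Prices` and `Usage` types, the `monthly_cost` function and every price in the example are hypothetical placeholders rather than real AWS rates, so substitute the current price list for your region.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    """Hypothetical unit prices in USD (placeholders, not real AWS rates)."""
    storage_per_gb_month: float
    put_per_1000: float        # PUT/COPY/POST/LIST requests
    get_per_10000: float       # GET and other requests
    transfer_out_per_gb: float


@dataclass
class Usage:
    """One month of usage for a single storage class."""
    stored_gb: float
    put_requests: int
    get_requests: int
    transfer_out_gb: float


def monthly_cost(usage: Usage, prices: Prices) -> dict:
    """Break the monthly bill into its four dimensions plus a total."""
    parts = {
        "storage": usage.stored_gb * prices.storage_per_gb_month,
        "put": usage.put_requests / 1000 * prices.put_per_1000,
        "get": usage.get_requests / 10000 * prices.get_per_10000,
        "transfer": usage.transfer_out_gb * prices.transfer_out_per_gb,
    }
    parts["total"] = sum(parts.values())
    return parts


# Example with made-up prices and usage figures.
example_prices = Prices(0.03, 0.005, 0.004, 0.09)
example_usage = Usage(stored_gb=1000, put_requests=2_000_000,
                      get_requests=50_000_000, transfer_out_gb=200)
for name, cost in monthly_cost(example_usage, example_prices).items():
    print(f"{name:>8}: ${cost:,.2f}")
```

With these example numbers, requests and transfer together cost more than storage does, which is why the storage price alone is not enough to compare classes.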
So how do you go about choosing the right S3 Storage class? The answer is in usage pattern, usage pattern and usage pattern.
S3 Standard Storage
This is the default S3 storage class. You should start off with S3 Standard storage if you are new to S3 or if your data usage pattern fits one of the following:
- Images for a high-traffic online store.
- Application logs that are continuously analyzed for machine learning.
- Static websites with high traffic.
In all of the above usage patterns, data continues to be requested at a very high frequency after it is stored.
S3 Infrequent Access Storage
AWS introduced this storage class after analyzing existing usage of S3 Standard and finding that data request frequency goes down as the data ages. This storage class is best for use cases that fit the following usage patterns:
- Images for a low-volume online store.
- Data that is only requested when it is new.
- The overall request frequency is very low relative to the number of S3 objects stored.
- You would like the data to be readily available when requested, even though the request frequency is low.
A good example of this usage pattern is your web logs. You store your web logs on S3 and use the most recent data to drive a dashboard, and as time goes by the older data is never accessed. If the usage pattern fits, this storage class can save you a lot of money while still keeping the data readily available at any time. Working out how much you can really save is complicated, because request pricing is higher for this storage class. To help with this, I have a calculator that tells you how much you can save once you input your usage data. The calculator can be found at gulamshakir.com/apps/s3calc/index.html.
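The trade-off can be sketched in a few lines: Infrequent Access charges less for storage but more per request, and AWS also bills a per-GB retrieval fee for this class. The code below is only an illustration of that comparison, not the logic of the calculator mentioned above. The function names and all prices are hypothetical placeholders, and the sketch ignores other rules such as minimum object size and minimum storage duration.

```python
# Hypothetical monthly prices in USD (placeholders, not real AWS rates).
STANDARD = {"storage_per_gb_month": 0.03,
            "get_per_10000": 0.004,
            "retrieval_per_gb": 0.0}
INFREQUENT_ACCESS = {"storage_per_gb_month": 0.0125,
                     "get_per_10000": 0.01,
                     "retrieval_per_gb": 0.01}


def class_cost(stored_gb, get_requests, retrieved_gb, prices):
    """Monthly cost of storing and reading data under one price table."""
    return (stored_gb * prices["storage_per_gb_month"]
            + get_requests / 10000 * prices["get_per_10000"]
            + retrieved_gb * prices["retrieval_per_gb"])


def ia_monthly_savings(stored_gb, get_requests, retrieved_gb):
    """Positive result: Infrequent Access is cheaper. Negative: Standard is."""
    standard = class_cost(stored_gb, get_requests, retrieved_gb, STANDARD)
    infrequent = class_cost(stored_gb, get_requests, retrieved_gb,
                            INFREQUENT_ACCESS)
    return standard - infrequent


# Old web logs: lots of data, rarely read.
print(ia_monthly_savings(stored_gb=10_000, get_requests=100_000,
                         retrieved_gb=100))
# Same data volume, but read very heavily.
print(ia_monthly_savings(stored_gb=10_000, get_requests=1_000_000_000,
                         retrieved_gb=50_000))
```

Under these placeholder prices the first case comes out in favor of Infrequent Access and the second in favor of Standard, which is the usage-pattern effect described above.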
S3 Reduced Redundancy Storage
S3 Reduced Redundancy storage can also save costs over S3 Standard. The cost savings come at the expense of data durability. S3 Standard is designed for 99.999999999% (eleven 9s) durability, compared with 99.99% (four 9s) for S3 Reduced Redundancy storage. This storage class is best if the usage pattern fits the following:
- Data can be recreated if lost. For example, thumbnails can be rebuilt from the original image.
- Data loss can be tolerated.
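To get a feel for what the durability figures mean, the sketch below turns a durability percentage into an average expected number of objects lost per year. It takes AWS's published design targets of 99.999999999% for S3 Standard and 99.99% for Reduced Redundancy as inputs. The function name is hypothetical, and the result is a long-run average, not a prediction for any particular year.

```python
from decimal import Decimal


def expected_annual_losses(object_count, durability_percent):
    """Average number of objects expected to be lost per year.

    durability_percent is passed as a string (for example "99.99") so that
    Decimal can represent it exactly, without binary floating-point error.
    """
    loss_fraction = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return Decimal(object_count) * loss_fraction


# 10,000 objects in each class.
print(expected_annual_losses(10_000, "99.99"))          # Reduced Redundancy
print(expected_annual_losses(10_000, "99.999999999"))   # Standard
```

For 10,000 objects this works out to an average of one lost object per year on Reduced Redundancy, versus one in roughly ten million years on Standard, so the class only makes sense for data you can regenerate.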
S3 Glacier Storage
S3 Glacier is the cheapest of the S3 storage classes. As the term “glacier” implies, this storage class is best for archiving. This storage class is best if you want to store your data and forget it.
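The guidance for the four classes can be condensed into a rough decision helper. This is only a sketch of the rules of thumb in this post: the function name, its three yes/no inputs and the order of the checks are my own simplification, and real decisions should also weigh the request and transfer pricing discussed earlier.

```python
def suggest_storage_class(archive_only, loss_tolerable, frequently_requested):
    """Map a coarse usage pattern to one of the four S3 storage classes.

    archive_only         -- data is stored and then essentially forgotten
    loss_tolerable       -- data can be recreated, or losing it is acceptable
    frequently_requested -- data keeps being requested at high frequency
    """
    if archive_only:
        return "S3 Glacier Storage"
    if loss_tolerable:
        return "S3 Reduced Redundancy Storage"
    if frequently_requested:
        return "S3 Standard Storage"
    # Rarely requested, but must be readily available when it is.
    return "S3 Infrequent Access Storage"


# Images for a high-traffic online store.
print(suggest_storage_class(False, False, True))
# Thumbnails that can be rebuilt from the original images.
print(suggest_storage_class(False, True, True))
# Older web logs that still need to be readable on demand.
print(suggest_storage_class(False, False, False))
# Records kept purely for archiving.
print(suggest_storage_class(True, False, False))
```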
I hope this helps in choosing the right storage class for your data needs. There are also other calculators available online in addition to my S3 Infrequent Access calculator.