Simple Storage Service (S3)
Amazon Simple Storage Service (S3) was one of Amazon’s first cloud services and is the foundation for many other AWS services. Various ways exist to store and read objects (files) in S3. You can access S3 objects through a web browser, the AWS Management Console, the AWS Command Line Interface, other AWS services, and third-party (on-premises) tools and applications. Developers can also use the S3 API to build new software solutions on top of S3; Dropbox and Netflix are probably the best-known examples.
S3 objects are organized in so-called buckets. Bucket names must be unique across all of AWS; aside from that, buckets are used like root folders in practice. The protocols supported to access S3 objects are HTTP, HTTPS, and BitTorrent. Considering that an S3 object can have a size of up to 5 terabytes, BitTorrent can be useful for large-scale distributions. Amazon claims that the number of objects you can store is unlimited, where “unlimited” means that you are most likely not able to reach the actual limit. S3 is designed for 99.999999999% durability and 99.99% availability of objects over a given year.
If you have large amounts of data that can’t be transferred over the Internet, you can use the AWS Import/Export service: you ship hard disks to Amazon, and Amazon transfers the data on them to S3 (or from S3 onto the disks).
S3 supports various ways to ensure that only authorized users can access your S3 objects: Identity and Access Management (IAM) (discussed in a later post), Access Control Lists (ACLs), bucket policies, and query string authentication. ACLs are comparable to the ACLs you know from file systems. Bucket policies allow you to add access restrictions such as allowed IP addresses or specific dates and times. Query string authentication lets you create signed URLs for an object that nobody can guess and that expire after a time you set. In addition, you may use Server Side Encryption (SSE) to secure your data.
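To make the bucket policy idea more concrete, here is a minimal sketch of a policy that allows reading objects only from one IP range and only until a given date. The bucket name, the CIDR range, the cutoff date, and the helper name `build_bucket_policy` are all made up for illustration; the sketch only builds the JSON document and does not talk to AWS.

```python
import json


def build_bucket_policy(bucket, allowed_cidr, not_after_utc):
    """Build a bucket policy dict (hypothetical helper, not an AWS API).

    Allows s3:GetObject on every object in `bucket`, but only for requests
    coming from `allowed_cidr` and only before `not_after_utc` (ISO 8601).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromOfficeUntilDeadline",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    # Restrict by source IP address range
                    "IpAddress": {"aws:SourceIp": allowed_cidr},
                    # Restrict by date and time
                    "DateLessThan": {"aws:CurrentTime": not_after_utc},
                },
            }
        ],
    }


# Example values are assumptions, not taken from a real account.
policy = build_bucket_policy(
    "example-bucket", "203.0.113.0/24", "2030-01-01T00:00:00Z"
)
print(json.dumps(policy, indent=2))
```

You would paste the printed JSON into the bucket's policy editor in the console or upload it through the API; both conditions must match for a request to be allowed by this statement.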
Monthly S3 storage costs range between $0.095/GB and $0.010/GB depending on the amount of data you store in S3. In addition, you pay for data transfer out from Amazon (not for transfers in). Data transfer prices range between $0.120/GB and $0.050/GB.
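Because the exact price tiers are not listed here, the following sketch only brackets a monthly bill between the cheapest and the most expensive rate quoted above. The function name and the example workload (500 GB stored, 100 GB transferred out) are assumptions for illustration.

```python
from decimal import Decimal

# Per-GB rates quoted in the text (highest and lowest tier)
STORAGE_RATE_HIGH = Decimal("0.095")
STORAGE_RATE_LOW = Decimal("0.010")
TRANSFER_OUT_RATE_HIGH = Decimal("0.120")
TRANSFER_OUT_RATE_LOW = Decimal("0.050")


def s3_monthly_cost_bounds(stored_gb, transfer_out_gb):
    """Return (lowest, highest) possible monthly cost in USD.

    Transfers into S3 are free, so only storage and transfer out count.
    """
    stored = Decimal(stored_gb)
    out = Decimal(transfer_out_gb)
    low = stored * STORAGE_RATE_LOW + out * TRANSFER_OUT_RATE_LOW
    high = stored * STORAGE_RATE_HIGH + out * TRANSFER_OUT_RATE_HIGH
    return low, high


# Hypothetical workload: 500 GB stored, 100 GB transferred out per month
low, high = s3_monthly_cost_bounds(500, 100)
print(f"Between ${low:.2f} and ${high:.2f} per month")
```

A small workload like this one will sit close to the upper bound, since the lower rates only apply to very large amounts of data.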
AWS Elastic Block Store (EBS)
Elastic Block Store (EBS) is not really an independent service, as it requires EC2. EBS volumes are essentially virtual disks that you can attach to EC2 instances (virtual machines). Although it is possible to create and manage EBS volumes through the AWS Management Console or the API, it is not possible (as far as I know) to access the data on EBS volumes through ways other than EC2, unlike with S3.
An EBS volume can range from 1 GB to 1 TB. If you need more than 1 TB, for instance for a database system, you can attach multiple volumes to an EC2 instance and stripe the data across them (RAID 0).
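In practice the operating system's software RAID or volume manager does the striping for you, but the underlying arithmetic is simple. The sketch below shows how a byte offset on the combined (striped) device maps to one of the underlying volumes; the function name, the four-volume layout, and the 64 KiB stripe size are assumptions chosen for illustration.

```python
def locate_block(logical_offset, num_volumes, stripe_size):
    """Map a byte offset on a RAID 0 set to (volume index, offset on volume).

    Consecutive stripes of `stripe_size` bytes are placed on the volumes
    in round-robin order.
    """
    stripe_index, offset_in_stripe = divmod(logical_offset, stripe_size)
    volume = stripe_index % num_volumes            # which volume holds the stripe
    stripe_on_volume = stripe_index // num_volumes  # how many stripes precede it there
    return volume, stripe_on_volume * stripe_size + offset_in_stripe


STRIPE = 64 * 1024  # 64 KiB stripe size (assumed)
VOLUMES = 4         # four EBS volumes of up to 1 TB each (assumed)

print(locate_block(0, VOLUMES, STRIPE))                 # first stripe, first volume
print(locate_block(STRIPE, VOLUMES, STRIPE))            # second stripe, second volume
print(locate_block(4 * STRIPE + 100, VOLUMES, STRIPE))  # wraps back to the first volume
```

With four 1 TB volumes you get roughly 4 TB of usable space, but keep in mind that plain striping offers no redundancy: losing one volume loses the whole set.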
The AWS Management Console and the Command Line Tools allow you to create live snapshots of volumes, which you can use to create new volumes. Some EC2 instances have their root file system on EBS volumes; this enables you to easily create a snapshot of an online or offline virtual machine and then create a new AMI (OS image) from the snapshot.
You pay $0.10/GB per month for provisioned EBS storage and $0.095/GB per month for snapshots. In addition, Amazon charges $0.10 per 1 million I/O requests. In my experience, the latter costs are negligible in most scenarios.
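A quick calculation shows how the three price components add up. This is only a sketch that treats the storage and snapshot rates as monthly charges; the function name and the example workload (100 GB provisioned, 50 GB of snapshots, 5 million I/O requests) are hypothetical.

```python
from decimal import Decimal

PROVISIONED_RATE = Decimal("0.10")   # USD per GB of provisioned storage
SNAPSHOT_RATE = Decimal("0.095")     # USD per GB of snapshot data
IO_RATE_PER_MILLION = Decimal("0.10")  # USD per 1 million I/O requests


def ebs_monthly_cost(provisioned_gb, snapshot_gb, io_requests):
    """Return the cost components and the total in USD."""
    storage = Decimal(provisioned_gb) * PROVISIONED_RATE
    snapshots = Decimal(snapshot_gb) * SNAPSHOT_RATE
    io = Decimal(io_requests) / Decimal(1_000_000) * IO_RATE_PER_MILLION
    return {
        "storage": storage,
        "snapshots": snapshots,
        "io": io,
        "total": storage + snapshots + io,
    }


# Hypothetical workload
cost = ebs_monthly_cost(provisioned_gb=100, snapshot_gb=50, io_requests=5_000_000)
for name, amount in cost.items():
    print(f"{name:>9}: ${amount:.2f}")
```

In this example the I/O charge is the smallest of the three components; a very I/O-heavy workload would shift that balance.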
Amazon Glacier
Glacier is an archiving and backup service. Some confusion exists as to whether Glacier is tape-based because it takes three to five hours until archives are available for retrieval. Although Amazon is secretive about the storage technology behind it, Glacier most likely stores the data on hard disks. You can back up data from S3 to Glacier through the API, and developers can create archiving applications with the .NET and Java libraries. You can also use Amazon’s Storage Gateway (see below) to back up data from your data center to Glacier. The AWS Import/Export service supports Glacier as well.
AWS Storage Gateway
The AWS Storage Gateway enables you to connect your data center to Amazon’s storage services S3 and Glacier. The Gateway can run on an EC2 instance; however, in most scenarios, you will run the Gateway on an on-prem VMware ESXi or Hyper-V 2008 R2 host.
So-called Volume gateways are mounted as iSCSI devices from your on-premises application server and can either be cached volumes or stored volumes. With cached volumes, your primary data lives in S3 and only frequently accessed data is kept on-prem, whereas stored volumes keep all data on-prem and asynchronously back up point-in-time snapshots to S3.
In addition to Volume gateways, AWS Storage Gateway supports Gateway virtual tape library (VTL), a service that allows you to back up on-premises data to S3 and Glacier using your on-premises backup software.
It should be clear by now that Amazon fully supports hybrid clouds; that is, you can move only parts of your IT to Amazon’s public cloud and keep the rest on-prem in your private cloud. This also applies to Amazon’s database services Relational Database Service (RDS), DynamoDB, SimpleDB, ElastiCache, and Data Pipeline, which will be the topics of my next post in this AWS series.