Amazon Simple Storage Service (S3) is a scalable and secure object storage service widely used for storing and retrieving data in the cloud. Unlike a traditional file system, S3 organizes data into “buckets” and manages objects within them. This article explores the key aspects of S3, how it differs from a standard file system, and how to interact with it from the command line.
What Is an S3 Bucket?
An S3 bucket is a top-level container that stores objects (files) and their metadata. Bucket names must be globally unique across all AWS accounts, and every object in a bucket is addressed by a key. S3 offers high availability, durability, and security through fine-grained IAM (Identity and Access Management) policies.
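Object metadata is part of what makes S3 more than a plain file store. As a quick sketch (the bucket and file names below are placeholders), the lower-level s3api commands can attach custom metadata to an object and read it back without downloading the file body:
aws s3api put-object --bucket my-bucket --key notes.txt --body notes.txt --metadata project=demo,owner=data-team
aws s3api head-object --bucket my-bucket --key notes.txt   # the response includes the custom Metadata map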
Differences Between S3 and a Traditional File System
| Feature | S3 Bucket | Traditional File System |
|---|---|---|
| Storage Structure | Flat namespace with object keys | Hierarchical directory structure |
| Accessibility | Cloud-based, accessed via APIs | Local or network-based access |
| Metadata | Extensive, including custom tags | Limited to standard file attributes |
| Scaling | Virtually unlimited storage | Limited by disk space |
| Permissions | Managed via IAM roles and bucket policies | Managed via OS file permissions |
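One practical consequence of the flat namespace is that "folders" in S3 are just key prefixes. There is no mkdir step; a prefix exists only because some object key contains it. A short sketch, using a placeholder bucket name:
aws s3 cp report.csv s3://my-example-bucket/reports/2024/report.csv   # "reports/2024/" is part of the object key, not a directory
aws s3 ls s3://my-example-bucket/reports/2024/                        # listing simply filters keys by that prefix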
Connecting to an S3 Bucket from the Command Line
To interact with S3 from the command line, you need the AWS CLI installed and configured. The configuration files live in the ~/.aws/ directory.
Setting Up AWS Credentials
- Install the AWS CLI:
sudo apt install awscli   # Ubuntu
brew install awscli       # macOS
- Configure AWS credentials:
aws configure
You’ll be prompted to enter:
- AWS Access Key ID
- AWS Secret Access Key
- Default region
- Default output format (json, table, text)
- The credentials are stored in ~/.aws/credentials:
[default]
aws_access_key_id=YOUR_ACCESS_KEY
aws_secret_access_key=YOUR_SECRET_KEY
- IAM Role Considerations: If the CLI runs on an EC2 instance, attach an IAM role with the necessary S3 permissions and the CLI will retrieve temporary credentials automatically. Either way, you can confirm which identity the CLI is using, as shown in the sketch after this list.
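Whichever method you use, it is worth verifying the setup before running S3 commands. A minimal sketch (the profile name is just an example):
aws --version                        # confirm the CLI is installed
aws sts get-caller-identity          # show which account and IAM identity the CLI resolves to
aws configure --profile my-project   # optionally keep a separate set of credentials under a named profile
aws s3 ls --profile my-project       # any s3 command can then target that profile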
Basic S3 Operations Using AWS CLI
Listing Buckets
aws s3 ls
Creating a New Bucket
aws s3 mb s3://my-new-bucket
Uploading a File to a Bucket
aws s3 cp myfile.txt s3://my-new-bucket/
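cp also accepts whole directories and a few useful options. A brief sketch with placeholder names:
aws s3 cp ./local-reports s3://my-new-bucket/reports/ --recursive           # upload an entire directory tree
aws s3 cp big-archive.zip s3://my-new-bucket/ --storage-class STANDARD_IA   # optionally choose a cheaper storage class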
Downloading a File from a Bucket
aws s3 cp s3://my-new-bucket/myfile.txt ./
Listing Objects in a Bucket
aws s3 ls s3://my-new-bucket/
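The plain listing shows only the top level of the bucket. To walk every object and get a size summary, ls supports a few extra flags; a quick sketch:
aws s3 ls s3://my-new-bucket/ --recursive --human-readable --summarize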
Deleting an Object from a Bucket
aws s3 rm s3://my-new-bucket/myfile.txt
Deleting a Bucket
aws s3 rb s3://my-new-bucket --force
(--force ensures the bucket is emptied before deletion.)
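If you prefer to empty the bucket as a separate, reviewable step, the same result can be reached in two commands; a sketch:
aws s3 rm s3://my-new-bucket/ --recursive   # delete every object in the bucket
aws s3 rb s3://my-new-bucket                # then remove the now-empty bucket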
Syncing Data with an S3 Bucket
The sync command copies only new or changed files, which makes it useful for mirroring a local directory with an S3 bucket (or one bucket with another).
Syncing a Local Directory to a Bucket
aws s3 sync ./my-local-folder s3://my-new-bucket/
Syncing a Bucket to a Local Directory
aws s3 sync s3://my-new-bucket/ ./my-local-folder
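By default, sync only adds and updates files; it never deletes anything at the destination. A few commonly used options, shown as a sketch (running --dryrun first is a good habit):
aws s3 sync ./my-local-folder s3://my-new-bucket/ --delete            # also remove destination objects that no longer exist locally
aws s3 sync ./my-local-folder s3://my-new-bucket/ --exclude "*.tmp"   # skip files matching a pattern
aws s3 sync ./my-local-folder s3://my-new-bucket/ --dryrun            # preview the transfer without performing it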
Syncing Two Buckets
aws s3 sync s3://source-bucket s3://destination-bucket
Does Data Pass Through Your Machine? No. When you sync or copy between two buckets with the AWS CLI, the CLI issues server-side copy requests (the CopyObject API), so the object data moves directly from the source bucket to the destination bucket. Your machine only lists the objects and sends the copy requests, and it needs permissions on both buckets. For large-scale or recurring transfers, a managed service such as AWS DataSync may still be a better fit.
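The server-side copy the CLI relies on can also be invoked directly through the lower-level s3api interface; a minimal sketch with placeholder bucket and key names:
aws s3api copy-object --copy-source source-bucket/myfile.txt --bucket destination-bucket --key myfile.txt   # S3 copies the object internally; the contents never leave AWS
The high-level aws s3 commands use the same mechanism under the hood.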
A Few Words at the End
S3 provides a powerful, scalable, and secure alternative to a traditional file system for storing data in the cloud. With the AWS CLI, managing buckets and objects is straightforward, allowing efficient file transfers and synchronization. Understanding these fundamental operations will help streamline data workflows in the cloud.
For more details, refer to the AWS S3 Documentation.