AWS S3 - Advanced

Moving Between Storage Classes

You can move objects between storage classes:

  • Manually via AWS CLI/SDK
  • Automatically using Lifecycle Rules
  • Examples:
    • From S3 Standard → S3 Infrequent Access (IA) after 30 days
    • From S3 IA → S3 Glacier after 90 days

CLI Example:

aws s3 cp s3://my-bucket/file.txt s3://my-bucket/file.txt \
    --storage-class STANDARD_IA
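
The same in-place change can be sketched from the SDK. This assumes Python with boto3 installed and credentials already configured; `build_copy_request` and `change_storage_class` are hypothetical helper names, and the bucket and key are the placeholders from the CLI example. The idea is to copy the object onto itself while asking for a different storage class.

```python
def build_copy_request(bucket: str, key: str, storage_class: str = "STANDARD_IA") -> dict:
    """Build copy_object arguments that rewrite an object in place with a new storage class."""
    return {
        "Bucket": bucket,
        "Key": key,
        # Source and destination are the same object.
        "CopySource": {"Bucket": bucket, "Key": key},
        "StorageClass": storage_class,
        # Keep the existing user metadata unchanged.
        "MetadataDirective": "COPY",
    }


def change_storage_class(bucket: str, key: str, storage_class: str = "STANDARD_IA") -> dict:
    """Send the copy request (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    s3 = boto3.client("s3")
    return s3.copy_object(**build_copy_request(bucket, key, storage_class))
```

Calling `change_storage_class("my-bucket", "file.txt")` would move the object to Standard-IA. A single `copy_object` call handles objects up to 5 GB; larger objects need a multipart copy.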


Lifecycle Rules

Lifecycle rules automate transitions and deletions of S3 objects based on object age, and can be scoped by prefix, tags, object size, or version status (current vs. noncurrent).

Rule Type | Description
Transition | Move to cheaper storage class after N days
Expiration | Delete object after N days
Non-current Version Expiration | Delete old versions of files
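
The three rule types in the table can be combined in one rule. Below is a minimal sketch in Python with boto3 (assumed installed and configured). The 30-day and 90-day transitions come from the examples earlier in this post; the 365-day expiration, the 30-day noncurrent cleanup, the rule ID, and the helper names are illustrative assumptions.

```python
def build_lifecycle_configuration(prefix: str = "") -> dict:
    """Return a lifecycle configuration with one rule covering all three rule types."""
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",          # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},      # empty prefix = whole bucket
                # Transition: days are counted from object creation.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                # Expiration: delete current versions after a year (assumed value).
                "Expiration": {"Days": 365},
                # Noncurrent version expiration (assumed value).
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    }


def apply_lifecycle(bucket: str, prefix: str = "") -> None:
    """Upload the configuration (needs boto3 and AWS credentials)."""
    import boto3  # lazy import keeps the builder usable without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_lifecycle_configuration(prefix),
    )
```

Note that `put_bucket_lifecycle_configuration` replaces the bucket's whole lifecycle configuration, so any existing rules need to be included in the same call.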

Useful for:

  • Archiving
  • Cost optimization
  • Data retention policies

Storage Class Analysis

S3 Storage Class Analysis observes object access patterns over time to recommend when to transition data from S3 Standard to S3 Standard-IA.

  • Helps identify infrequently accessed data
  • Based on object size, last access time, frequency
  • Works with lifecycle rules for automation

Enable it in the bucket's Management tab. Once the analysis has gathered enough data, use the results to set up lifecycle rules that automate the transitions.
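
Storage Class Analysis can also be switched on through the API. The following is a rough sketch in Python with boto3 (assumed installed and configured); the configuration ID, the optional prefix filter, and the helper names are made up for illustration. An empty `StorageClassAnalysis` block enables the analysis without exporting results to another bucket.

```python
def build_analytics_configuration(config_id: str, prefix: str = "") -> dict:
    """Return a minimal analytics configuration (no data export)."""
    config = {"Id": config_id, "StorageClassAnalysis": {}}
    if prefix:
        # Only analyze objects under this prefix.
        config["Filter"] = {"Prefix": prefix}
    return config


def enable_storage_class_analysis(bucket: str, config_id: str, prefix: str = "") -> None:
    """Create or update the analysis configuration (needs boto3 and AWS credentials)."""
    import boto3  # lazy import keeps the builder usable without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_analytics_configuration(
        Bucket=bucket,
        Id=config_id,
        AnalyticsConfiguration=build_analytics_configuration(config_id, prefix),
    )
```

For example, `enable_storage_class_analysis("my-bucket", "whole-bucket")` would analyze every object in the bucket.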


Requester Pays

Normally, the bucket owner pays for storage, requests, and data transfer.
With Requester Pays, the requester pays the request and data transfer costs; the bucket owner still pays for storage.

Use Case | Description
Data lakes or public datasets | Useful when you host large files
Shared resources | Offload cost to API users

Enable via:

aws s3api put-bucket-request-payment \
    --bucket my-bucket \
    --request-payment-configuration Payer=Requester
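
On the requester's side, every request has to acknowledge the charge explicitly. Here is a small sketch in Python with boto3 (assumed installed, with the requester's own credentials); the helper names and the destination path are hypothetical. The key detail is the `RequestPayer="requester"` argument; without it, requests to a Requester Pays bucket are rejected with an access denied error.

```python
def build_requester_pays_get(bucket: str, key: str) -> dict:
    """Build get_object arguments that accept the Requester Pays charge."""
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def download_as_requester(bucket: str, key: str, dest_path: str) -> None:
    """Download an object from a Requester Pays bucket (needs boto3 and credentials)."""
    import boto3  # lazy import keeps the builder usable without the SDK

    s3 = boto3.client("s3")
    response = s3.get_object(**build_requester_pays_get(bucket, key))
    with open(dest_path, "wb") as out:
        # Stream the body in chunks rather than loading it all into memory.
        for chunk in response["Body"].iter_chunks():
            out.write(chunk)
```

Requests must be authenticated so that AWS knows which account to bill; anonymous access does not work on a Requester Pays bucket.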


Event Notifications with Amazon EventBridge

You can configure S3 to emit events to EventBridge for actions like:

  • Object created
  • Object deleted
  • Restore completed

Target Service | Use Case
Lambda | Process uploaded images
SNS | Send SMS/email notifications
SQS | Queue object creation events
EventBridge | Trigger complex workflows or apps

This allows real-time processing and event-driven architecture.
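
To make the EventBridge path concrete, here is a sketch in Python with boto3 (assumed installed and configured). It turns on EventBridge delivery for a bucket and creates a rule that matches "Object Created" events from that bucket. The rule name and the helper names are hypothetical.

```python
import json


def build_notification_configuration() -> dict:
    """An empty EventBridgeConfiguration turns on delivery of bucket events to EventBridge."""
    return {"EventBridgeConfiguration": {}}


def build_object_created_pattern(bucket: str) -> str:
    """Return an EventBridge event pattern (as JSON) for object creation in one bucket."""
    pattern = {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket]}},
    }
    return json.dumps(pattern)


def enable_eventbridge(bucket: str, rule_name: str = "s3-object-created") -> None:
    """Enable delivery and create the rule (needs boto3 and AWS credentials)."""
    import boto3  # lazy import keeps the builders usable without the SDK

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_configuration(),
    )
    boto3.client("events").put_rule(
        Name=rule_name,
        EventPattern=build_object_created_pattern(bucket),
        State="ENABLED",
    )
```

Two caveats: `put_bucket_notification_configuration` replaces the bucket's existing notification settings, and the rule does nothing until a target (Lambda, SQS, and so on) is attached to it with `put_targets`.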


Baseline Performance

  • S3 automatically scales for massive concurrency
  • Recommended best practices:
    • Use parallel uploads/downloads
    • Use multipart upload for files >100MB
    • Spread keys across prefixes when you need more throughput (request rates scale per prefix, and S3 now supports high parallelism even under the same prefix)

Each S3 prefix supports at least 3,500 PUT/COPY/POST/DELETE requests and 5,500 GET/HEAD requests per second
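
The multipart and parallelism advice can be sketched in Python with boto3 (assumed installed and configured). The 100 MB threshold comes from the list above; the 16 MB part size, the concurrency of 10, and the helper names are assumptions picked for illustration. `plan_parts` is plain arithmetic that shows how many parts a file would be split into.

```python
import math

MB = 1024 * 1024
MULTIPART_THRESHOLD = 100 * MB  # from the ">100MB" guidance above
PART_SIZE = 16 * MB             # assumed part size
MAX_PARTS = 10_000              # S3 limit on parts per multipart upload


def plan_parts(size_bytes: int, part_size: int = PART_SIZE) -> int:
    """Return how many parts an upload would use (1 means a single PUT)."""
    if size_bytes < MULTIPART_THRESHOLD:
        return 1
    parts = math.ceil(size_bytes / part_size)
    if parts > MAX_PARTS:
        raise ValueError("part_size is too small for an object of this size")
    return parts


def upload_parallel(path: str, bucket: str, key: str, concurrency: int = 10) -> None:
    """Upload a file, using parallel multipart upload above the threshold."""
    import boto3  # lazy imports keep plan_parts usable without the SDK
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(
        multipart_threshold=MULTIPART_THRESHOLD,
        multipart_chunksize=PART_SIZE,
        max_concurrency=concurrency,  # parts are uploaded by this many threads
    )
    boto3.client("s3").upload_file(path, bucket, key, Config=config)
```

With these settings a 1 GB file is split into 64 parts of 16 MB, uploaded up to 10 at a time.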


S3 Select & Glacier Select

  • S3 Select allows you to query part of a file using SQL, instead of downloading the entire file.

Example:

SELECT s.name FROM S3Object s WHERE s.age > 30

  • Works with CSV, JSON, and Parquet
  • Glacier Select enables similar queries on archived data (Glacier)

Reduces cost and speeds up processing for analytics workloads
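
Here is roughly how the example query could be run from Python with boto3 (assumed installed and configured). The sketch assumes the object is a CSV file with a header row containing `name` and `age` columns; the helper names are hypothetical. Because CSV values arrive as text, the query casts `age` to an integer before comparing.

```python
def build_select_request(bucket: str, key: str) -> dict:
    """Build select_object_content arguments for the example query on a CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": "SELECT s.name FROM S3Object s WHERE CAST(s.age AS INT) > 30",
        # FileHeaderInfo=USE lets the query refer to columns by header name.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def run_select(bucket: str, key: str) -> str:
    """Run the query and return the matching rows as text (needs boto3 and credentials)."""
    import boto3  # lazy import keeps the builder usable without the SDK

    response = boto3.client("s3").select_object_content(**build_select_request(bucket, key))
    chunks = []
    # The result is an event stream; only "Records" events carry row data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")
```

Only the selected column for the matching rows travels over the network, which is where the cost and speed benefit comes from.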


S3 Batch Operations

Used to perform large-scale operations on many S3 objects:

  • Replace object metadata
  • Restore objects from Glacier
  • Copy objects across buckets
  • Trigger Lambda functions on each object

Supports millions of objects in a single job

You specify:

  • Manifest (list of objects)
  • Operation (e.g., copy, restore, replace tags, invoke Lambda)
  • Optional Lambda function to run on each object
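
The manifest is the piece you usually have to build yourself. Below is a minimal Python sketch that produces a CSV manifest with one `bucket,key` row per object and URL-encoded keys. The helper name and the sample keys are made up for illustration, and this covers only the manifest, not job creation.

```python
import csv
import io
from typing import Iterable
from urllib.parse import quote


def build_manifest(bucket: str, keys: Iterable[str]) -> str:
    """Return CSV manifest text with one 'bucket,key' row per object."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for key in keys:
        # URL-encode each key, leaving "/" so the path structure stays readable.
        writer.writerow([bucket, quote(key, safe="/")])
    return buffer.getvalue()


manifest = build_manifest("my-bucket", ["a.txt", "photos/my cat.jpg"])
print(manifest, end="")
```

The manifest text would then be uploaded to S3 and referenced when the job is created.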

S3 Storage Lens

A centralized analytics dashboard that provides:

  • Usage metrics
  • Cost optimization suggestions
  • Object count, size, age, storage class distribution

Useful for:

  • Understanding storage trends
  • Identifying unused data
  • Enforcing data governance policies

Found under: S3 → Storage Lens


Summary Table

Feature | Purpose
Moving between storage classes | Cost optimization manually or via lifecycle rules
Lifecycle Rules | Automate transitions/deletion of objects
Storage Class Analysis | Suggests best class based on access patterns
Requester Pays | Shifts cost to the user requesting the object
EventBridge Integration | Enables real-time processing with serverless workflows
Baseline Performance | High concurrency with optimized access patterns
S3 Select / Glacier Select | Query object content without full download
Batch Operations | Bulk operations on millions of objects
S3 Storage Lens | Visibility into storage usage and optimization