- S3 is a simple key-based object store.
- S3 has a globally unique namespace: a bucket name must be unique across all regions, but each bucket is created in a specific region. Data is stored on multiple devices spanning at least 3 AZs.
- S3 has three URL schemes
  - Global (legacy)
  - Virtual-hosted style (domain based)
  - Path style
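To make the three schemes concrete, here is a minimal sketch that builds each URL form for one object. The helper name `s3_urls` and the bucket, key and region values are made up for illustration; the hostname patterns are the commonly documented ones.

```python
from urllib.parse import quote


def s3_urls(bucket, key, region):
    """Return the three URL forms for one object (hypothetical helper)."""
    safe_key = quote(key)  # percent-encode the key, keeping "/" as-is
    return {
        # Global (legacy): no region in the hostname
        "global": f"https://{bucket}.s3.amazonaws.com/{safe_key}",
        # Virtual-hosted style: the bucket name is part of the hostname
        "virtual_hosted": f"https://{bucket}.s3.{region}.amazonaws.com/{safe_key}",
        # Path style: the bucket name is the first path segment
        "path": f"https://s3.{region}.amazonaws.com/{bucket}/{safe_key}",
    }


urls = s3_urls("my-example-bucket", "photos/cat.jpg", "eu-west-1")
for style, url in urls.items():
    print(f"{style:15} {url}")
```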
- Object size ranges from 0 bytes to 5 TB. The largest single PUT is 5 GB; larger objects must use multipart upload.
- A successful upload returns HTTP 200.
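The size limits can be turned into a small planning helper. This is a rough sketch: `plan_upload` and the 100 MiB default part size are my own choices, binary units are assumed for the 5 GB and 5 TB limits, and the 10,000-part cap is the multipart upload limit. In practice multipart upload can also be used for objects well under 5 GB.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_SINGLE_PUT = 5 * GIB   # largest single PUT
MAX_OBJECT_SIZE = 5 * TIB  # largest object
MAX_PARTS = 10_000         # multipart upload part limit


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def plan_upload(size_bytes, part_size=100 * MIB):
    """Return ("single_put", 1) or ("multipart", number_of_parts)."""
    if not 0 <= size_bytes <= MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 0 bytes and 5 TB")
    if size_bytes <= MAX_SINGLE_PUT:
        return ("single_put", 1)
    parts = ceil_div(size_bytes, part_size)
    if parts > MAX_PARTS:
        # Grow the part size so the object fits into 10,000 parts
        part_size = ceil_div(size_bytes, MAX_PARTS)
        parts = ceil_div(size_bytes, part_size)
    return ("multipart", parts)


print(plan_upload(1 * GIB))
print(plan_upload(6 * GIB))
print(plan_upload(5 * TIB))
```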
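- Consistency
  - Originally: read-after-write consistency for PUTs of new objects, and eventual consistency for overwrite PUTs and DELETEs
  - Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs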
- S3 stores objects, which have the following properties
  - Key
  - Value
  - Version ID
  - Metadata
  - Subresources
    - Access control list
    - Torrents
- S3 Standard is designed for 99.99% availability (the Amazon SLA guarantees 99.9%) and 99.999999999% (11 nines) durability, so losing an object is extremely unlikely.
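A quick back-of-the-envelope calculation shows what these percentages mean. It is only a sketch: the figure of 10,000,000 stored objects is an example, durability is treated as an annual per-object probability, and the availability numbers are converted to minutes per 365-day year purely for illustration.

```python
from fractions import Fraction

durability = Fraction(99_999_999_999, 100_000_000_000)  # eleven nines
availability = Fraction(9_999, 10_000)                  # 99.99% design target
sla = Fraction(999, 1_000)                              # 99.9% guaranteed

# Durability: expected object losses per year for an example fleet
objects_stored = 10_000_000
expected_losses_per_year = objects_stored * (1 - durability)
years_per_lost_object = 1 / expected_losses_per_year

# Availability: minutes of unavailability per year
minutes_per_year = 365 * 24 * 60
downtime_designed = float((1 - availability) * minutes_per_year)
downtime_sla = float((1 - sla) * minutes_per_year)

print(f"one object lost every {years_per_lost_object} years on average")
print(f"designed downtime: {downtime_designed:.2f} min/year")
print(f"SLA downtime:      {downtime_sla:.1f} min/year")
```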
- Tiered storage
  - S3 Standard
  - S3 Standard-IA (Infrequent Access)
  - S3 One Zone-IA
  - S3 Intelligent-Tiering
    - Only objects larger than 128 KB are auto-tiered
    - An object must go 30 days without access before it is moved to a lower-cost tier
  - S3 Glacier
  - S3 Glacier Deep Archive
- Lifecycle management
  - Automates moving objects between storage classes
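A lifecycle rule is just a small configuration document. The sketch below builds one as a plain dictionary; the rule ID, the `logs/` prefix and the day counts are invented for illustration. As far as I know, this shape matches what boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but nothing is sent to AWS here.

```python
import json


def lifecycle_rule(rule_id, prefix, transitions, expire_after_days=None):
    """Build one lifecycle rule; `transitions` is a list of (days, storage_class)."""
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in sorted(transitions)
        ],
    }
    if expire_after_days is not None:
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


config = {
    "Rules": [
        lifecycle_rule(
            "archive-logs",            # hypothetical rule name
            "logs/",                   # hypothetical key prefix
            [(30, "STANDARD_IA"), (90, "GLACIER"), (180, "DEEP_ARCHIVE")],
            expire_after_days=365,
        )
    ]
}
print(json.dumps(config, indent=2))
```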
- Versioning
  - Once enabled, it cannot be disabled, only suspended
  - Versioning is set at the bucket level
  - Stores all versions of an object
  - MFA Delete with versioning provides extra security
  - Uploading a new version resets its permissions and makes it private again; permissions on older versions don't change
  - The size of a bucket is the sum of all its object versions
  - Deleting an object places a delete marker on top of it; previous versions are still there. To restore the object, delete the delete marker
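The delete-marker behaviour is easier to see in a toy model. The class below is an in-memory illustration only, not the S3 API; `VersionedBucket` and its method names are made up.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}             # key -> list of versions, newest last
        self._ids = itertools.count(1)

    def _push(self, key, body, delete_marker):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append(
            {"id": version_id, "body": body, "delete_marker": delete_marker}
        )
        return version_id

    def put(self, key, body):
        return self._push(key, body, delete_marker=False)

    def delete(self, key):
        """A plain delete only stacks a delete marker on top."""
        return self._push(key, None, delete_marker=True)

    def delete_version(self, key, version_id):
        """Permanently remove one specific version (or delete marker)."""
        self._versions[key] = [
            v for v in self._versions[key] if v["id"] != version_id
        ]

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1]["delete_marker"]:
            raise KeyError(key)  # stands in for a 404
        return versions[-1]["body"]

    def size(self):
        """Bucket size is the sum of every stored version."""
        return sum(
            len(v["body"])
            for versions in self._versions.values()
            for v in versions
            if not v["delete_marker"]
        )


bucket = VersionedBucket()
bucket.put("notes.txt", b"first")
bucket.put("notes.txt", b"second!")
marker = bucket.delete("notes.txt")

try:
    bucket.get("notes.txt")
    hidden = False
except KeyError:
    hidden = True          # the delete marker hides the object

bucket.delete_version("notes.txt", marker)   # remove the delete marker
restored = bucket.get("notes.txt")           # latest version is visible again
```

Both versions still count toward `bucket.size()`, which is why a versioned bucket can grow faster than expected.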
- Encryption at rest: server side or client side
  - SSE-S3: Amazon-managed keys
  - SSE-KMS: keys managed through AWS Key Management Service (KMS)
  - SSE-C: customer-provided keys
  - With SSE-KMS, every upload and download calls KMS and counts against the KMS request quota. The quota is per region: 5,500, 10,000 or 30,000 requests per second, depending on the region. An increase has to be requested through Service Quotas
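The three server-side options differ mainly in which request parameters accompany an upload. The sketch below returns the extra keyword arguments for each mode; `encryption_args` is a hypothetical helper, and the parameter names are the ones I believe boto3's `put_object` uses. No request is made.

```python
import os


def encryption_args(mode, kms_key_id=None, customer_key=None):
    """Return the extra upload keyword arguments for one SSE mode."""
    if mode == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            # Without a key ID, the AWS managed key for S3 is used
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit customer-provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown mode: {mode}")


sse_s3 = encryption_args("SSE-S3")
sse_kms = encryption_args("SSE-KMS", kms_key_id="alias/my-example-key")  # made-up alias
sse_c = encryption_args("SSE-C", customer_key=os.urandom(32))
print(sse_s3, sse_kms, sorted(sse_c))
```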
- Secure data using
  - Access control lists
    - Grant READ, WRITE or FULL_CONTROL to specific users
  - Bucket policies
    - Fine-grained access
  - IAM policies
  - Query string authentication: a URL that is valid for a limited time
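As an illustration of the fine-grained access a bucket policy allows, here is a minimal sketch that builds a read-only policy for a single prefix. The bucket name, prefix and principal ARN are placeholders, and `read_only_policy` is a hypothetical helper.

```python
import json


def read_only_policy(bucket, prefix, principal_arn):
    """Allow one principal to read objects under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnPrefix",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


policy = read_only_policy(
    "my-example-bucket",                      # placeholder bucket
    "reports/",                               # placeholder prefix
    "arn:aws:iam::111122223333:user/alice",   # placeholder principal
)
print(json.dumps(policy, indent=2))
```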
- Charges are based on
  - Storage size
  - Number of requests
  - Storage class
  - Data transfer
  - Transfer Acceleration: takes advantage of edge locations to transfer data
  - Cross-region replication
- By default all buckets and objects are private.
- Object Lock modes (WORM: write once, read many)
  - Governance mode: an object version cannot be deleted or overwritten unless you have special permissions
  - Compliance mode: an object version cannot be deleted or modified by any user, including root, for the duration of the retention period
  - Legal hold: has no retention period; prevents an object version from being overwritten or deleted until the hold is removed
  - Object Lock requires versioning. It was originally available only at bucket creation; S3 now also allows enabling it on an existing bucket
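The interaction between the two retention modes and a legal hold can be written as a small decision function. This is a simplified model of the rules in these notes, not the S3 API; the function and parameter names are invented, and `can_bypass_governance` stands for holding the special permission (`s3:BypassGovernanceRetention`).

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now,
                       legal_hold=False, can_bypass_governance=False):
    """Decide whether a locked object version may be deleted (toy model)."""
    if legal_hold:
        return False                  # a legal hold blocks deletes on its own
    if mode is None or now >= retain_until:
        return True                   # no retention, or retention has expired
    if mode == "GOVERNANCE":
        return can_bypass_governance  # only with the special permission
    if mode == "COMPLIANCE":
        return False                  # nobody can delete before expiry
    raise ValueError(f"unknown mode: {mode}")


until = datetime(2030, 1, 1, tzinfo=timezone.utc)
now = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(can_delete_version("GOVERNANCE", until, now))
print(can_delete_version("GOVERNANCE", until, now, can_bypass_governance=True))
print(can_delete_version("COMPLIANCE", until, now, can_bypass_governance=True))
```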
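- S3 Glacier Vault Lock policy: once the policy is locked, it cannot be changed.
- S3 prefixes are essentially the folder path within a bucket. Each prefix supports 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second, so spreading objects across more prefixes allows more requests.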
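The "more prefixes, more requests" rule is simple multiplication. The sketch below assumes the documented per-prefix rates of 5,500 reads and 3,500 writes per second, and treats everything up to the last `/` of a key as its prefix; the keys are examples. It ignores that S3 scales partitions gradually, so treat the result as an upper bound.

```python
GET_PER_PREFIX = 5_500   # GET/HEAD requests per second per prefix
PUT_PER_PREFIX = 3_500   # PUT/COPY/POST/DELETE requests per second per prefix


def distinct_prefixes(keys):
    """Prefix = everything up to and including the last "/" of a key."""
    return {key.rsplit("/", 1)[0] + "/" if "/" in key else "" for key in keys}


def max_request_rate(prefix_count):
    return {
        "reads_per_second": prefix_count * GET_PER_PREFIX,
        "writes_per_second": prefix_count * PUT_PER_PREFIX,
    }


keys = [
    "logs/2024/01/a.log",
    "logs/2024/02/b.log",
    "images/cat.jpg",
    "readme.txt",
]
prefixes = distinct_prefixes(keys)
rates = max_request_rate(len(prefixes))
print(sorted(prefixes), rates)
```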
- Sharing buckets across accounts
  - Bucket policies and IAM: apply to the entire bucket; programmatic access only
  - ACLs and IAM: apply to individual objects; programmatic access only
  - Cross-account IAM roles: programmatic and console access
- Cross-Region Replication (CRR)
  - Only new objects, or new versions of existing objects, are replicated
  - CRR requires versioning to be enabled on both the source and destination buckets
  - Deleting an object from the source bucket won't delete it from the replica bucket (delete markers are not replicated by default)
- S3 Transfer Acceleration
  - You upload the file to an edge location rather than directly to S3
  - The Amazon backbone network then transfers the file to S3
  - https://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/en/accelerate-speed-comparsion.html - a tool that compares direct upload speed to various regions with upload through edge locations
- DataSync
  - Tool for syncing large amounts of data from on-premises storage to AWS
  - Works with NFS- or SMB-compatible file systems
  - Can also sync EFS to EFS
- CloudFront - content delivery network (CDN) service
  - Origin: the source of the content to be delivered - S3, EC2, ELB or Route 53
  - Distribution: a collection of edge locations
    - Web distribution: typically for websites
    - RTMP distribution: used for media streaming; it is being replaced by
      - HLS - HTTP Live Streaming
      - HDS - HTTP Dynamic Streaming
      - MSS - Microsoft Smooth Streaming
      - DASH - Dynamic Adaptive Streaming over HTTP
  - Signed URL: for the scenario 1 URL = 1 file
  - Signed cookies: for the scenario 1 cookie = multiple files
- CloudFront signed URL
  - The client application uses the AWS SDK to generate the signed URL
  - You attach a policy to each signed URL
- S3 presigned URL
  - Limited lifetime
  - Requests are issued as the IAM user who created the presigned URL
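The idea behind any signed URL is that an expiry time and a signature travel with the link. The toy sketch below shows that idea with an HMAC over the URL and expiry. It is deliberately not AWS Signature Version 4 and not CloudFront's key-pair signing; `sign_url`, `verify_url` and the secret are invented for illustration. Real links would come from the SDK (for example boto3's `generate_presigned_url`).

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only


def _signature(url, expires_at, secret):
    message = f"{url}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(url, expires_at, secret=SECRET):
    """Append an expiry timestamp and a signature (URL must have no query)."""
    query = urlencode(
        {"Expires": expires_at, "Signature": _signature(url, expires_at, secret)}
    )
    return f"{url}?{query}"


def verify_url(signed_url, now, secret=SECRET):
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(signed_url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["Expires"][0])
        signature = params["Signature"][0]
    except (KeyError, ValueError):
        return False
    url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = _signature(url, expires_at, secret)
    return now < expires_at and hmac.compare_digest(signature, expected)


signed = sign_url("https://example.com/private/report.pdf", expires_at=1_700_000_600)
valid_now = verify_url(signed, now=1_700_000_000)      # within the lifetime
valid_later = verify_url(signed, now=1_700_000_601)    # after expiry
tampered = verify_url(signed.replace("report.pdf", "other.pdf"), now=1_700_000_000)
```

Here one signature covers exactly one path, which mirrors the "1 URL = 1 file" rule of thumb.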
- Snowball
  - Petabyte-scale solution for transferring data into and out of AWS
  - Available in 50 TB and 80 TB sizes
  - Snowmobile holds 100 PB in a 45-foot-long shipping container
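A rough calculation shows why shipping a device can beat the network. The sketch assumes decimal terabytes, a link that sustains 80% of its nominal speed, and example link speeds of 100 Mbps and 1 Gbps; all of these are illustrative assumptions.

```python
def transfer_days(terabytes, megabits_per_second, utilization=0.8):
    """Days needed to push `terabytes` over a link at the given utilization."""
    bits = terabytes * 10**12 * 8
    seconds = bits / (megabits_per_second * 10**6 * utilization)
    return seconds / 86_400


for link in (100, 1_000):
    days = transfer_days(80, link)
    print(f"80 TB over {link} Mbps: about {days:.1f} days")
```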
- Storage Gateway
  - Virtual or physical appliance that connects on-premises applications to cloud storage
  - Three types
    - File Gateway
    - Volume Gateway
    - Tape Gateway
- Athena vs Macie
  - Athena is a serverless service that provides a SQL-like interface to data stored in S3
  - Macie uses machine learning and NLP to identify whether S3 objects contain PII
- Access Analyzer
  - Evaluates bucket access policies and lets you discover and quickly remediate buckets with unintended access
  - Must be enabled from IAM Access Analyzer
- Access Points
  - Used for shared data sets
  - Instead of a single bucket policy, you create an access point for each application
  - Each access point provides a unique path (hostname) with its own access policy
  - You can create multiple access points for the same bucket
- Querying S3 data
  - S3 Select: SQL-like queries over data stored as CSV, JSON and Apache Parquet
  - S3 Select fetches a subset of an object's data rather than the entire object, e.g. you can query a compressed (GZIP) CSV instead of downloading the entire file
  - Athena: interactive queries over data stored as CSV, JSON, ORC, Apache Parquet and Avro
  - Redshift Spectrum: runs queries against data in S3
- S3 Batch Operations
  - Automates an operation across many objects
- Notifications: S3 can send notifications about changes to objects through the following channels
  - SQS
  - SNS
  - Lambda
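Each notification channel is one section of the bucket's notification configuration. The sketch below builds that document as a dictionary; the ARNs are placeholders and `notification_config` is a hypothetical helper. To my knowledge the key names match what boto3's `put_bucket_notification_configuration` expects, but nothing is sent to AWS here.

```python
def notification_config(queue_arn=None, topic_arn=None, lambda_arn=None,
                        events=("s3:ObjectCreated:*",)):
    """Build a notification configuration for the channels that are given."""
    config = {}
    if queue_arn:
        config["QueueConfigurations"] = [
            {"QueueArn": queue_arn, "Events": list(events)}
        ]
    if topic_arn:
        config["TopicConfigurations"] = [
            {"TopicArn": topic_arn, "Events": list(events)}
        ]
    if lambda_arn:
        config["LambdaFunctionConfigurations"] = [
            {"LambdaFunctionArn": lambda_arn, "Events": list(events)}
        ]
    return config


config = notification_config(
    queue_arn="arn:aws:sqs:eu-west-1:111122223333:uploads",          # placeholder
    lambda_arn="arn:aws:lambda:eu-west-1:111122223333:function:thumbs",  # placeholder
    events=("s3:ObjectCreated:*", "s3:ObjectRemoved:*"),
)
print(sorted(config))
```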
- Read S3 FAQ - https://aws.amazon.com/s3/faqs/
Very often while reviewing code for my team, I come across a semicolon at the start of a JavaScript function, as shown below:

`;(function () { 'use strict'; ...`

and I often wondered what purpose it served. Guess what: it is insurance to make sure your script works when all the other scripts are merged together. The leading `;` in front of an immediately-invoked function expression (IIFE) is there to prevent errors when the file is appended, during concatenation, to a file containing an expression that is not properly terminated with a `;`. Without it, the opening parenthesis of the IIFE would be parsed as a call on that unterminated expression.

So there you go. Now you know what that little semicolon is doing there in your code.