AWS storage services

These were my preparation notes for the AWS Certified Solutions Architect – Associate exam. I cleared it on the first attempt with a good margin. I am sharing them here in the hope that they help beginners and aspirants.

Simple Storage Service (Amazon S3)
  • Objects can be from 0 bytes to 5 TB in size.
  • Bucket names must be globally unique.
  • Consistency
    • Read-after-write consistency for new uploads (PUTs of new objects).
    • Eventual consistency for updates (overwrites) and deletes. Changes take some time to propagate.
  • S3 supports versioning.
  • 99.99% availability, 11 nines (99.999999999%) durability.
  • Supports encryption.
  • Secure using ACLs and bucket policies.
  • S3 classes
    • S3 Standard: 99.99% availability, 11 nines durability.
    • S3 Infrequent Access (S3-IA): 99.9% availability.
    • S3 Reduced Redundancy Storage (RRS): only 4 nines (99.99%) durability.
    • Glacier (3-5 hrs to restore).
      • Suitable for backup and archival.
      • Notion of vaults and archives. Vaults are containers. Archive is any data.
      • Unlimited archives per vault. Each archive can be up to 40 TB. Up to 1,000 vaults per account per region.
      • A data retrieval request is processed in 3-5 hrs. Once processed, the data is available for 24 hrs.
      • 11-9 durability.
      • 2 APIs supported
        • Via S3 API
        • Via Glacier native API.
  • Interfaces
    • REST based
    • SOAP based
  • Pricing based on
    • Storage size
    • # of requests
    • Amount of data transfer
  • Versioning
    • Once turned on, it cannot be turned off again, only suspended.
    • It uses more storage, and therefore costs more.
    • All versions of an object are kept, including all writes and deletes.
  • Cross region replication
    • Existing assets won’t be replicated.
    • Only the new assets uploaded will be replicated.
    • Versioning needs to be enabled on both the source and destination buckets.
  • Life cycle rules
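    • An object must be at least 128 KB in size and 30 days old before it can move to Infrequent Access storage.
    • An object must be at least 1 day old before it can move to Glacier.
    • When versioning is enabled, rules have to deal with object versions as well.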
  • S3 Transfer Acceleration
    • Has to be enabled at the bucket level.
    • Uses CloudFront edge locations to upload into S3.
    • Costs extra.
  • Security
    • Bucket policies can be set.
    • ACLs can be set at the individual object level.
    • ACLs cannot be set at the group level, only at the user or account level.
    • S3 access can also be restricted based on conditions such as IP address, date, etc.
    • MFA (Multi-Factor Authentication) Delete adds an additional layer of security. It requires versioning to be enabled.
  • Encryption
    • In transit – using SSL/TLS.
    • At rest
      • Server-side encryption with S3-managed keys (SSE-S3)
      • Server-side encryption with KMS-managed keys (SSE-KMS)
      • Server-side encryption with customer-provided keys (SSE-C)
      • Object metadata is not encrypted.
  • S3 URL format
    • With static site hosting enabled – http://<bucket name>.s3-website-<aws region>.amazonaws.com
    • Otherwise – https://s3-<aws region>.amazonaws.com/<bucket name>
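The two URL shapes above can be sketched as small string helpers. This is only an illustration of the formats listed in these notes: the function names, the bucket name `my-notes` and the object key are hypothetical, and the region is assumed to be one that uses the dash-separated `s3-website-<region>` form. Note that static website endpoints are served over plain HTTP, while the REST endpoint supports HTTPS.

```python
def website_endpoint(bucket: str, region: str) -> str:
    """Static website hosting endpoint (HTTP only)."""
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"


def path_style_url(bucket: str, region: str, key: str = "") -> str:
    """Path-style REST URL: the bucket name is the first path segment."""
    base = f"https://s3-{region}.amazonaws.com/{bucket}"
    # Append the object key only when one is given.
    return f"{base}/{key}" if key else base


if __name__ == "__main__":
    # Hypothetical bucket, region and key, for illustration only.
    print(website_endpoint("my-notes", "us-west-2"))
    print(path_style_url("my-notes", "us-west-2", "index.html"))
```

With these sample values, the first call yields the website address of the bucket and the second the REST address of a single object in it.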
Elastic Block Store (EBS)
    • Similar to a hard disk. Attached to an EC2 instance.
    • EBS types
      • Magnetic disk
        • By default – 100 IOPS. By striping the disks you get better IOPS.
      • SSD GP2 (General purpose)
        • 99.99% availability
        • Up to 10k IOPS. By default 2k IOPS.
      • SSD Provisioned IOPS
        • More than 10k IOPS
        • Designed for databases.
    • Snapshots
      • Stored in S3. Snapshots are incremental.
      • Snapshots are point-in-time copies of a volume.
      • From a snapshot, it’s possible to create an EBS volume and attach it to an EC2 instance.
      • Can be shared across accounts or made public.
      • To increase disk size, create a snapshot of the existing volume and then create a new volume with the required capacity from that snapshot.
      • While a snapshot is being created, its status is shown as “pending”.
    • Encryption
      • Encryption happens on the EC2 instance the volume is attached to.
      • It’s an overhead on the instance and hence supported only on more powerful instance types.
      • Encryption has to be configured while creating the volume.
    • Cost
      • Factors
        • Storage size
        • Snapshot size
        • Number of I/O requests
      • No charge for transferring data between AWS storage offerings such as S3, RDS, EC2, EBS, etc.
    • Durability
      • Size < 20 GB – 99.5 to 99.9%
      • Size > 20 GB – Less than above
    • By default, the EBS root volume is deleted when the EC2 instance is terminated, unless configured otherwise while creating the instance.
    • Deleting an EBS volume attached to an Amazon Machine Image (AMI) requires de-registering the AMI first.
    • For temporary storage such as buffers, temp files, etc. on EC2, it’s recommended to use the EC2 instance store.
    • RAID (Redundant Array of Independent Disks)
      • Makes multiple disks act as a single disk to the operating system.
      • Opt for RAID when the IOPS of out-of-the-box (OOTB) volumes are not sufficient.
      • RAID gives
        • More space.
        • More performance as we combine more disks.
        • Redundancy, depending on the RAID type.
      • RAID types – RAID 0, RAID 1, RAID 5, RAID 10.
        • RAID 5 – not recommended in AWS.
        • Either go for RAID 0 (striped) or RAID 10 (striped and mirrored).
      • RAID is possible with Windows and Unix OS.
      • Taking snapshots when using RAID
        • As disks are virtualized, there is a chance that data written by the application is still in cache and not yet on disk.
        • To create an application-consistent snapshot or backup, the options are
          • Freeze the filesystem.
          • Unmount the RAID array.
          • Shut down the associated EC2 instance.
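To make the RAID 0 versus RAID 10 trade-off from the notes above concrete, here is a rough, idealized calculation. It is a sketch under stated assumptions, not a performance model: all volumes are assumed identical, striping is assumed to scale perfectly, and in RAID 10 every write is assumed to land on both halves of a mirror pair. The names `RaidEstimate` and `estimate_raid` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidEstimate:
    usable_gb: int
    write_iops: int
    survives_single_volume_loss: bool


def estimate_raid(level: int, volumes: int, size_gb: int, iops: int) -> RaidEstimate:
    """Idealized estimate for an array of identical EBS volumes."""
    if level == 0:
        if volumes < 2:
            raise ValueError("RAID 0 needs at least 2 volumes")
        # Striping only: capacity and IOPS add up, but there is no redundancy.
        return RaidEstimate(volumes * size_gb, volumes * iops, False)
    if level == 10:
        if volumes < 4 or volumes % 2:
            raise ValueError("RAID 10 needs an even number of volumes, at least 4")
        # Striping across mirrored pairs: half the raw capacity is usable.
        pairs = volumes // 2
        return RaidEstimate(pairs * size_gb, pairs * iops, True)
    raise ValueError("only RAID 0 and RAID 10 are modelled here")


if __name__ == "__main__":
    # Four hypothetical 500 GB volumes rated at 3000 IOPS each.
    print(estimate_raid(0, 4, 500, 3000))
    print(estimate_raid(10, 4, 500, 3000))
```

Under these assumptions, four such volumes give 2000 GB and 12000 write IOPS in RAID 0 with no tolerance for a lost volume, versus 1000 GB and 6000 write IOPS in RAID 10 with tolerance for a single lost volume.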
AWS Import/Export
  • Doesn’t support export from Glacier.
  • Disk based.
    • Import and export from S3.
    • Import into EBS.
    • Import into Glacier.
  • Snowball based.
    • Available only in the US.
    • Import and export from S3 only.
    • Supports encryption.
Elastic File System (EFS)
  • Similar to shared file system.
  • Allows a single file system to be shared across multiple EC2 instances within a region.
  • Read-after-write consistency.
  • Supports NFSv4 (Network File System version 4).
  • Can scale up to petabytes.
  • Can support thousands of concurrent NFS connections.
  • Data is stored across multiple AZs within a region.
Storage Gateway
  • Connects on-premises IT to AWS.
  • It’s a software appliance (VM image) to be installed on one of the host machines.
  • Types
    • Gateway stored volume
      • Data stored in client data center. Asynchronously backed up to S3.
      • 1 TB per volume.
    • Gateway cached volume.
      • Data is stored in AWS but cached in client data center.
      • 32 TB per volume.
    • Virtual Tape Library
      • Data is stored in S3 (Library) or Glacier (Shelf).
      • Can be used if the client infrastructure already backs things up using backup applications such as NetBackup, Backup Exec, Veeam, etc.
