AWS Solutions Architect

SAA-C03
DataEngineering
Engineering
Software
Author

Gurpreet Johl

Published

July 18, 2025

1. Getting Started

1.1. AWS Global Infrastructure

Regions are geographic locations, e.g. eu-west-3, us-east-1, etc.

How should we choose a region?

  • Compliance - data governance rules may require data within a certain location
  • Proximity to reduce latency
  • Available services vary by region
  • Pricing varies by region

Each region can have multiple Availability Zones. There are usually between 3 and 6, e.g. ap-southeast-2a, ap-southeast-2b and ap-southeast-2c.

Each AZ contains multiple data centers with redundant power, networking and connectivity.

There are also multiple Edge Locations / Points of Presence: 400+ locations around the world.

1.2. Tour of the Console

Some services are global: IAM, Route 53, CloudFront, WAF

Most are region-scoped: EC2, Elastic Beanstalk, Lambda, Rekognition

The region selector is in the top right. The service selector is in the top left; alternatively, use the search bar.

1.3. AWS Billing

Click on Billing and Cost Management in the top right of the screen.

This needs to first be activated for administrator IAM users. From the root account: Account (top right) -> IAM user and role access to billing information -> tick the Activate IAM Access checkbox.

  • Bills tab - You can see bills per service and per region.
  • Free Tier tab - Check what the free tier quotas are, and your current and forecasted usage.
  • Budgets tab - set a budget. Use a template -> Zero spend budget -> Budget name and email recipients. This will alert as soon as you spend any money. There is also a monthly cost budget for regular reporting.

2. IAM

2.1. Overview

Identity and access management. This is a global service.

The root account is created by default. It shouldn’t be used or shared; just use it to create users.

Users are people within the org and can be grouped. Groups cannot contain other groups. A user can belong to multiple groups (or none, but this is generally not best practice).

2.2. Permissions

Users or groups can be assigned policies which are specified as a JSON document.

Least privilege principle means you shouldn’t give a user more permissions than they need.

2.3. Creating Users and Groups

In the IAM dashboard, there is a Users tab.

There is a Create User button. We give them a user name and can choose a password (or autogenerate a password if this is for another user).

Then we can add permissions directly, or create a group and add the user.

To create a group, specify the name and permissions policy.
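The same flow can be scripted with the AWS CLI; a minimal sketch, assuming illustrative names and an AWS managed policy:

aws iam create-group --group-name developers
aws iam attach-group-policy --group-name developers --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
aws iam create-user --user-name alice
aws iam add-user-to-group --user-name alice --group-name developers
aws iam create-login-profile --user-name alice --password 'ChangeMe123!' --password-reset-required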

Tags are optional key-value pairs we can add to assign custom metadata to different resources.

We can also create an account alias in IAM to simplify the account sign in, rather than having to remember the account ID.

When signing in to the AWS console, you can choose to log in as root user or IAM user.

2.4. IAM Policies

Policies can be attached to groups, or assigned as inline policies to a specific user. Groups are best practice.

Components of JSON document:

  • Version: Policy language version (date)
  • Id: Identifier for the policy
  • Statement: Specifies the permissions

Each statement consists of:

  • Sid: Optional identifier for the statement
  • Effect: “Allow” or “Deny”
  • Principal: The account/user/role that this policy applies to
  • Action: List of actions that this policy allows or denies
  • Resource: What the actions apply to, eg a bucket
  • Condition: Optional, conditions when this policy should apply

“*” is a wildcard that matches anything.
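For illustration, a minimal sketch of a policy document granting read-only access to a single S3 bucket (the bucket name is a placeholder; Principal is omitted because it appears in resource-based policies rather than identity-based ones):

{
  "Version": "2012-10-17",
  "Id": "ExampleS3ReadPolicy",
  "Statement": [
    {
      "Sid": "AllowReadMyBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
    }
  ]
}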

2.5. MFA

Password policy can have different settings: minimum length, specific characters, password expiration, prevent password re-use.

Multi-factor authentication requires the password you know and the device you own to log in.
A hacker needs both to compromise the account.

MFA devices:

  • Virtual MFA devices - Google Authenticator, Authy. Support for multiple tokens on a single device.
  • Universal 2nd Factor Security Key (U2F) - eg YubiKey. Support for multiple root and IAM users on a single security key.
  • Hardware key fob MFA device
  • Hardware key fob MFA device for AWS GovCloud

2.6. Access Keys

There are 3 approaches to access AWS:

  • Management console (web UI) - password + MFA
  • Command line interface (CLI) - access keys
  • Software Development Kit (SDK) - access keys

Access keys are generated through the console and managed by the user. Access Key ID is like a username. Secret access key is like a password. Do not share access keys.

AWS CLI gives programmatic access to public APIs of AWS. It is open source. Configure access keys in the CLI using aws configure.
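A minimal sketch of setting up and sanity-checking the CLI (the key values are prompted for and stored locally):

aws configure                  # prompts for Access Key ID, Secret Access Key, default region and output format
aws sts get-caller-identity    # confirms which account and user the configured keys belong to
aws iam list-users             # example call; requires the relevant IAM permissions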

AWS SDK is for language-specific APIs.

2.7. AWS CloudShell

Access using the terminal icon in the toolbar next to the search bar.

This is an alternative to using your own terminal to access the AWS CLI. It is a cloud-based terminal.

You can pass --region to a command to run in a region other than the region selected in the AWS console.

CloudShell has a file system attached so we can upload and download files.

2.8. IAM Roles for Services

Some AWS services can perform actions on your behalf. To do so, they need the correct permissions, which we can grant with an IAM role.

For example, EC2 instance roles, Lambda Function roles, CloudFormation roles.

In IAM, select Roles. Choose AWS Service and select the use case, e.g. EC2. Then we attach a permissions policy, such as IAMReadOnlyAccess.
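The console wizard can be approximated with the CLI; a sketch assuming illustrative names (the trust policy is what allows the EC2 service to assume the role, and EC2 attaches roles via an instance profile):

aws iam create-role --role-name DemoEC2Role --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name DemoEC2Role --policy-arn arn:aws:iam::aws:policy/IAMReadOnlyAccess
aws iam create-instance-profile --instance-profile-name DemoEC2Profile
aws iam add-role-to-instance-profile --instance-profile-name DemoEC2Profile --role-name DemoEC2Role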

2.9. IAM Security Tools

  • IAM Credentials Report. Account-level report on all users and their credentials.
  • IAM Access Advisor. User-level report on the service permissions granted to a user and when they were last accessed. This can help to see unused permissions to enforce principle of least privilege. This is in the Access Advisor tab under Users in IAM.

2.10. IAM Guidelines and Best Practices

  • Don’t use root account except for account set up
  • One physical user = One AWS user
  • Assign users to groups and assign permissions to groups
  • Create a strong password policy and use MFA
  • Use Roles to give permissions to AWS services
  • Use Access Keys for programmatic access via CLI and SDK
  • Audit permissions using credentials report and access advisor
  • Never share IAM users or access keys

2.11. Summary

  • Users map to a physical user
  • Groups contain users. They can’t contain other groups.
  • Policies are JSON documents denoting the permissions for a user / group
  • Roles grant permissions for AWS services like EC2 instances
  • Security: use MFA and a strong password policy
  • Programmatic use of services via CLI or SDK. Access keys are the credentials for these.
  • Audit usage via credentials report or access advisor

3. EC2

3.1. EC2 Overview

Elastic Compute Cloud (EC2) is AWS's infrastructure-as-a-service offering.

Encompasses a few different use cases:

  • Renting virtual machines (EC2)
  • Storing data on virtual drives (EBS)
  • Distributing load across machines (ELB)
  • Scaling services using an auto-scaling group (ASG)

Sizing and configuration options:

  • OS
  • CPU
  • RAM
  • Storage - This can be network-attached (EBS and EFS) or hardware (EC2 Instance Store)
  • Network Card - Speed of card and IP address
  • Firewall rules - Security group
  • Bootstrap script - Configure a script to run at first launch using an EC2 User Data script. This runs as the root user, so it has sudo access.

There are different instance types that have different combinations of the configuration options above.

3.2. Creating an EC2 Instance

  1. Specify a “name” tag for the instance and any other optional tags.
  2. Choose a base image (AMI), i.e. the OS.
  3. Choose an instance type.
  4. Key pair. This is optional and allows you to ssh into your instance.
  5. Configure network settings: auto-assign public IP address, and checkboxes to allow SSH and HTTP access.
  6. Configure storage amount and type. “Delete on Termination” is an important setting that deletes the EBS volume once the corresponding EC2 instance is terminated.
  7. The “user data” box allows us to pass a bootstrap shell script.
  8. Check the summary and click Launch Instance.
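The same launch can be scripted; a hedged sketch where the AMI ID, key pair, security group and bootstrap script are placeholders for your own:

aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type t2.micro \
    --count 1 \
    --key-name my-key-pair \
    --security-group-ids sg-0123456789abcdef0 \
    --user-data file://bootstrap.sh \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=my-first-instance}]'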

The Instance Details tab tells you the Instance ID, public IP address (to access from the internet) and the private IP address (to access from within AWS).

We can stop an instance to keep the storage state of the attached EBS volume without incurring any more EC2 costs. The public IP address might change when stopping and starting. The private IP address stays the same.

Alternatively, we can terminate it completely.

3.3. EC2 Instance Types

There are several families of instances: general purpose, compute-optimised, memory-optimised, accelerated computing, storage-optimised.

See the AWS website for an overview of all instances. There are also handy third-party comparison websites.

The naming convention is, for example, m5.large:

  • m is the instance class
  • 5 is the generation (AWS releases new versions over time)
  • large is the size within the class

The use cases for each of the instance types:

  • General purpose is for generic workloads like web servers. Balance between compute, memory and networking.
  • Compute-optimized instances for tasks that require good processors, such as batch processing, HPC, scientific modelling.
  • Memory-optimized instances for large RAM, e.g. in-memory databases and big unstructured data processing.
  • Storage-optimised instances for tasks that require reading and writing a lot of data from local storage, e.g. high-frequency transaction processing, cache for in-memory databases, data warehouses.

3.4. Security Groups

Security groups control how traffic is allowed into or out of EC2 instances. They act as a “firewall” on EC2 instances.

Security groups only contain allow rules. Security groups can reference IP addresses or other security groups.

They regulate:

  • Access to ports
  • Authorised IP ranges (IPv4 and IPv6)
  • Inbound and outbound network

By default, any inbound traffic is blocked and any outbound traffic is allowed.

A security group can be attached to multiple instances. It is locked down to a (region, VPC) combination.

The security group exists “outside” of the EC2 instance, so if traffic is blocked then the instance will never see it.

  • Any time you get a timeout when trying to access your EC2 instance, it’s almost always a result of a security group rule.
  • If the application gives “connection refused” then it’s an application error.
  • It can be helpful to keep a security group specifically for SSH access

Access security groups under:

EC2 -> Network & Security -> Security Groups

We can set the type of connection, the port and the IP address/range.

A security group can reference other security groups, i.e. “allow traffic from any other EC2 instance which has Security Group A or Security Group B attached to it”. This saves us from having to reference IP addresses all the time, which can be handy when these are not static.
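A hedged sketch of creating a security group and adding both rule styles with the CLI (IDs are placeholders):

aws ec2 create-security-group --group-name web-sg --description "Allow HTTP and SSH"
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 80 --cidr 0.0.0.0/0          # HTTP from anywhere
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr 203.0.113.0/24     # SSH from one IP range
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 80 --source-group sg-0fedcba9876543210   # traffic from instances with another security group attached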

Typical ports to know:

  • 21 - FTP, file transfer protocol
  • 22 - SSH or SFTP (because SFTP uses SSH), secure shell and secure FTP
  • 80 - HTTP, access unsecured websites
  • 443 - HTTPS, access secured websites
  • 3389 - RDP, Remote Desktop Protocol, SSH equivalent for Windows

3.5. Connecting to Instances

SSH works natively on Linux, Mac, and Windows 10 or later. PuTTY works for all versions of Windows. EC2 Instance Connect works from any operating system.

3.5.1. Linux via SSH

SSH allows you to control a remote machine using the command line.

You need your .pem file (or .ppk for PuTTY) containing your private key. The EC2 instance needs to allow inbound connections for SSH access.

ssh <username>@<public IP address>

The default username is ec2-user for Amazon Linux AMIs.

We can pass the file path for the key with the argument -i path/to/file.pem

3.5.2. EC2 Instance Connect

This opens a terminal in browser. No security keys are required since it generates temporary keys.

This relies on SSH behind the scenes, so the instance’s security group must still allow inbound SSH (port 22).

Use the EC2 Instance Connect tab in the EC2 section for your running instance.

3.6. EC2 Instance Roles

Never enter your IAM details on an EC2 instance as this would be available to anybody else who can access the instance. Poor security practices!

Instead, we use EC2 instance roles.

In the tab for the instance, we can do this with:

Action -> Security -> Modify IAM Role

Then select a role to attach to the instance.

3.7. EC2 Instance Purchase Options

3.7.1. Purchase Options

More common:

  • Spot: short workloads, cheap but can be terminated. Not suitable for critical jobs or databases.
  • On-demand: short uninterrupted workloads, pay per second.
  • Reserved: long workloads like a database. 1 or 3 years. Convertible reserved instances allow you to change the instance type over the reserved period.
  • Savings Plan: a 1 or 3 year commitment to an amount of usage, rather than to a specific instance size or OS.

Less common:

  • Dedicated Hosts: book an entire physical server and control instance placement. Most expensive. Useful to meet compliance requirements, or where you have Bring Your Own Licence (BYOL) software.
  • Dedicated Instances: no other customers share your hardware. No control over instance placement, so the physical hardware might move after stopping and starting. May share hardware with other instances in the same account.
  • Capacity Reservations: reserve capacity in a specific AZ for any duration. No time commitment and no billing discounts; you’re charged on-demand rates whether you run the instance or not. Suitable for short-term uninterrupted workloads that need to be in a specific AZ.

3.7.2. IPv4 Charges

There is a $0.005 per hour charge for every public IPv4 address in your account.

There is a free tier covering public IPv4 for the EC2 service; there is no such free tier for any other service.

There is no charge for IPv6 addresses, but not all ISPs support IPv6 yet.

AWS Public IP Insights Service under Billing is helpful to see these costs.

3.7.3. Spot Instances

Discount of up to 90% compared to on demand instances.

You define the max spot price you are willing to pay, and you get the instance for as long as the current price is less than your max price. The hourly spot price varies by offer and capacity. If the current price rises above your max, you have a 2 minute grace period to stop or terminate your instance.

“Spot Block” was a way to reserve a spot instance for a specified time frame (1-6 hours) without interruption. Spot Blocks are no longer available but could potentially still come up on the exam.

A spot request consists of:

  • Max price
  • Desired number of instances
  • Launch specification - AMI
  • Request type: one-time or persistent. A persistent request will automatically request more spot instances whenever any are terminated, for as long as the spot request is valid.
  • Valid from and until
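A minimal sketch of making a one-time spot request containing these fields with the CLI (the max price, AMI ID and instance details are placeholders):

aws ec2 request-spot-instances --spot-price "0.05" --instance-count 1 --type "one-time" --launch-specification '{"ImageId":"ami-0123456789abcdef0","InstanceType":"t3.medium","KeyName":"my-key-pair"}'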

Spot Instance Requests can only be cancelled if they are open, active or disabled. Canceling a spot request does not terminate the instance. You need to cancel the request then terminate the instance, to ensure a persistent request does not launch another.

3.7.4. Spot Fleets

A spot fleet is a set of spot instances + optional on-demand instances.

It will try to meet the target capacity within the price constraints. You specify the launch pool: instance type, OS and availability zone. You can have multiple launch pools so the fleet can choose the best. It will stop launching instances either when it reaches target capacity or max cost.

There are several strategies for allocating spot instances:

  • lowestPrice: from the pool with lowest price
  • diversified: distributed across pools for better availability
  • capacityOptimized: from the pool with optimal capacity
  • priceCapacityOptimized (recommended): pool with highest capacity first, then lowest price

4. EC2 Networking

4.1. Private vs Public IP

There are two types of IP in networking: IPv4 and IPv6. IPv4 is by far the most commonly used; IPv6 is newer and is common in IoT.

Public IP means the machine can be identified on the internet. It must be unique across the whole web.

Private IP means the machine can only be located on the private network. It must be unique across that private network. Only a specified range of IP addresses can be used as private addresses. Machines connect to the internet using an internet gateway (a proxy).

4.2. Elastic IP

When you start and stop an EC2 instance, its public IP can change. If you need a fixed public IP, you can use an Elastic IP: a public IPv4 address that you own until you release it and can remap to another instance. You can only have 5 Elastic IP addresses in your account (a soft limit that can be increased).

It is best practice to avoid elastic IP addresses as they often are a symptom of bad design choices. Instead, use a random public IP and register a DNS name to it. Or alternatively, use a load balancer and don’t use a public IP.

4.3. Placement Groups

Placement groups allow you to control the EC2 instance placement strategy. You don’t get direct access to / knowledge of the hardware, but you can specify one of three strategies:

  • Cluster - cluster instances into a low latency group in a single AZ. High performance but high risk; low latency and high bandwidth. Useful for big data jobs that need to complete quickly.
  • Spread - spread instances across different hardware (max 7 instances per AZ). Useful for critical applications as the risk of all instances simultaneously failing is minimised. But the max instance count limits the size of the job.
  • Partition - Spread instances across many different sets of partitions within an AZ. Each partition represents a physical rack of hardware. Max 7 partitions per AZ, but each partition can have many instances. Useful for applications with hundreds of instances or more, like Hadoop.

Creating a Placement Group

To create a placement group:

EC2 -> Network & Security -> Placement Groups -> Create Placement Group

Give it a name, e.g. my-critical-group, then select one of the three strategies.

To launch an instance in a group:

Click Launch Instances -> Advanced Settings -> Placement Group Name
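Or with the CLI; a minimal sketch using the group name above:

aws ec2 create-placement-group --group-name my-critical-group --strategy spread
aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type t2.micro --placement GroupName=my-critical-group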

4.4. Elastic Network Interfaces

4.4.1. What is an ENI?

Elastic Network Interfaces (ENIs) represent a virtual network card in a VPC. They are bound to a specific AZ.

Each ENI can have the following attributes:

  • One private IPv4, plus one or more secondary IPv4 addresses
  • One Elastic IPv4 per private IPv4
  • One public IPv4
  • One or more security groups
  • One MAC address

An ENI can be created and then attached to EC2 instances on the fly. This makes them useful for failover, as the ENI from the failed instance can be moved to its replacement to keep the IP addresses consistent.

Another use case is for deployments. We have the current version of the application running on instance A with an ENI, accessible by its IP address. We then run the new version of the application on instance B. When we are ready to deploy, move the ENI to instance B.

4.4.2. Creating an ENI

Click on the Instance in the UI and see the Network Interfaces section.

Under the Network & Security tab we can see Network Interfaces. We can create ENIs here.

Specify: description, subnet, Private IPv4 address (auto assign), attach a Security Group.

4.4.3. Attaching an ENI to an Instance

On the Network Interfaces UI, Actions -> Attach. Select the instance.
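A minimal CLI sketch of the same create-and-attach flow (subnet, security group and instance IDs are placeholders):

aws ec2 create-network-interface --subnet-id subnet-0123456789abcdef0 --description "demo ENI" --groups sg-0123456789abcdef0
aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device-index 1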

More on ENIs: https://aws.amazon.com/blogs/aws/new-elastic-network-interfaces-in-the-virtual-private-cloud/

4.5. EC2 Hibernate

4.5.1. Why Hibernate?

When we stop an instance, the data on disk (EBS) is kept intact until the next start. When we start it again, the OS boots up and runs the EC2 User Data script. This can take time.

EC2 Hibernate is a way of reducing boot time. When the instance is hibernated, the RAM state is saved to disk (EBS). When the instance is started again, it loads the RAM state from disk. This avoids having to boot up and initialise the instance from scratch.

Use cases:

  • Long-running processing
  • Services that take time to initialise

An instance cannot be hibernated for more than 60 days. The instance RAM must be less than 150 GB, and the EBS root volume must be large enough to store it.

4.5.2. Enable Hibernation on an Instance

We can enable hibernation when creating an instance: under Advanced Details there is a “Stop - Hibernate behaviour” dropdown that we can enable.

Under Storage, the EBS volume must be encrypted and larger than the RAM.

To then hibernate a specific instance, on the Instance Summary select Instance State -> Hibernate Instance.

5. EC2 Instance Storage

5.1. EBS

5.1.1. What is an EBS Volume?

An Elastic Block Store (EBS) volume is a network drive that you can attach to your instance. Think of it like a “cloud USB stick”. It allows us to persist data even after an instance is terminated. EBS volumes have a provisioned capacity: size in GB and IOPS.

Each EBS volume can only be mounted to one EC2 instance at a time, and is bound to a specific AZ. To move a volume across AZs, you need to snapshot it. Each EC2 instance can have multiple EBS volumes.

There is a “Delete on Termination” option. By default, this is on for the root volume but not for any additional volumes. We can control this in the AWS console or CLI.

5.1.2. Creating an EBS Volume on an Instance

We can see existing volumes under

EC2 -> Elastic Block Store -> Volumes

We can select Create Volume. We then choose volume type, size, AZ (same as instance).

This makes the volume available. We can then attach the volume using Actions -> Attach Volume. The Volume State will now be “In-use” instead of “Available”.
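A minimal CLI sketch of the same steps (the AZ must match the instance; IDs are placeholders):

aws ec2 create-volume --availability-zone eu-west-2a --size 10 --volume-type gp3
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf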

5.1.3. EBS Snapshots

A snapshot is a backup of your EBS volume at a point in time.

It is recommended, but not necessary, to detach the volume from the instance before taking a snapshot.

We can copy snapshots across AZ and regions.

Features:

  • EBS Snapshot Archive. Moving a snapshot to the archive tier is up to 75% cheaper, but it then takes 24-72 hours to restore.
  • Recycle Bin. You can set up rules to retain deleted snapshots so they can be recovered after deletion. The retention period is 1 day to 1 year.
  • Fast Snapshot Restore (FSR). Forces full initialisation of the snapshot so there is no latency on first use. This costs more.

5.1.4. EBS Features Hands On

Create an EBS snapshot:

Select the volume -> Actions -> Create Snapshot -> Add a description

See snapshots:

EBS -> Snapshots tab

Copy it to another region:

Right-click the snapshot -> Copy Snapshot -> Select the description and destination region

Recreate a volume from a snapshot:

Select the snapshot -> Actions -> Create Volume From Snapshot ->  Select size and AZ

Archive the snapshot:

Select the snapshot -> Actions -> Archive Snapshot

Recover a snapshot after deletion:

Recycle Bin -> Select the snapshot -> Recover
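The same hands-on steps have CLI equivalents; a hedged sketch with placeholder IDs and regions:

aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "before upgrade"
aws ec2 copy-snapshot --source-region eu-west-2 --source-snapshot-id snap-0123456789abcdef0 --region us-east-1   # copy to another region
aws ec2 create-volume --availability-zone us-east-1a --snapshot-id snap-0fedcba9876543210                        # recreate a volume from the copy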

5.1.5. EBS Volume Types

EBS volumes are characterised by: size, throughput and IOPS.

Types of EBS volumes:

  • gp2/gp3 - General purpose SSD. The newer gp3 options allow size and IOPS to be varied independently, for the older gp2 types they were linked.
  • io1/io2 Block Express - High throughput low latency SSD. Support EBS Multi Attach.
  • st1 - Low cost HDD for frequently accessed data.
  • sc1 - Lowest cost HDD for infrequently accessed data.

Only the SSD options can be used as boot volumes.

5.1.6. EBS Multi Attach

Attach the same EBS volume to multiple EC2 instances (up to 16) in the same AZ.

Only available for io1 and io2 EBS volume types. You must use a file system that is cluster-aware.

For use cases with higher application availability in clustered applications, or where applications must manage concurrent write operations.

5.1.7. EBS Encryption

When you create an encrypted EBS volume you get:

  • Data at rest is encrypted inside the volume
  • Data in flight is encrypted between the volume and the instance
  • Snapshots are encrypted
  • Volumes created from the snapshot are encrypted

Encryption and decryption are handled entirely by AWS. The latency impact is minimal. It uses KMS (AES-256) keys.

To encrypt an existing unencrypted volume, copy an unencrypted snapshot with encryption enabled:

  1. Create an EBS snapshot of the volume.
  2. Encrypt the snapshot using copy.
  3. Create a new EBS volume from the snapshot. This will be encrypted.
  4. Attach the encrypted volume to the original instance.
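A hedged CLI sketch of this encrypt-by-copy flow (IDs are placeholders; copy-snapshot can also take a specific KMS key):

aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "to encrypt"
aws ec2 copy-snapshot --source-region eu-west-2 --source-snapshot-id snap-0123456789abcdef0 --encrypted
aws ec2 create-volume --availability-zone eu-west-2a --snapshot-id snap-0fedcba9876543210    # the new volume inherits the snapshot's encryption
aws ec2 attach-volume --volume-id vol-0fedcba9876543210 --instance-id i-0123456789abcdef0 --device /dev/sdf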

5.2. AMI

Amazon Machine Image (AMI) is the customisation of an EC2 instance. It defines the OS, installed software, configuration, monitoring, etc. AMIs are built for a specific region.

Putting this in the AMI rather than the boot script results in a faster boot time since the software is prepackaged.

We can launch EC2 instances from:

  • A public AMI (provided by AWS)
  • An AMI from the AWS Marketplace (provided by a third-party)
  • Your own AMI

Create an AMI from a running instance that we have customised to our liking:

Right-click the instance -> Images and Templates -> Create Image

See the AMI:

EC2 UI -> Images -> AMIs

Launch an instance from an AMI

Select the AMI -> Launch Instance from AMI
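Or with the CLI; a minimal sketch (the instance and AMI IDs are placeholders):

aws ec2 create-image --instance-id i-0123456789abcdef0 --name "my-golden-ami" --description "web server with software preinstalled"
aws ec2 run-instances --image-id ami-0fedcba9876543210 --instance-type t2.micro --count 1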

5.3. EC2 Instance Store

EBS volumes are network drives, which gives adequate but potentially slow read/write.

EC2 Instance Store is a physical disk attached to the server that is running the EC2 instance.

They give better I/O performance but are ephemeral, the data is lost if the instance is stopped or the hardware fails.

Good for cache or temporary data.

5.4. Elastic File System (EFS)

5.4.1. What is EFS?

EFS is a managed Network File System (NFS) that can be mounted on multiple EC2 instances. The EC2 instances can be in multiple AZs.

Highly available, scalable, but more expensive. It scales automatically and you pay per GB. You don’t need to plan the capacity in advance.

A security group is required to control access to EFS. It is compatible with Linux AMIs only, not Windows. It is a POSIX (Linux-ish) file system with the standard API.

Use cases: content management, web serving, data sharing, WordPress.
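A minimal sketch of mounting an EFS file system from an EC2 instance over NFS (the file system ID and region are placeholders; the amazon-efs-utils mount helper is an alternative):

sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1 fs-0123456789abcdef0.efs.eu-west-2.amazonaws.com:/ /mnt/efs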

5.4.2. Performance Modes

  • EFS Scale - This gives thousands of concurrent NFS clients for >10GB/s of throughput.
  • Performance modes - This can be set to general purpose for latency-sensitive use cases, or Max I/O for higher throughput at the expense of higher latency.
  • Throughput mode - This can be set to bursting which scales throughput with the total storage used, provisioned which sets a fixed throughput, or elastic which scales throughput depending on the demand (ie the requests received)

5.4.3. Storage classes

Storage tiers are a lifecycle management feature to move files to cheaper storage after N days. You can implement lifecycle policies to automatically move files between tiers based on the number of days since it was last accessed.

  • Standard. For frequently accessed files.
  • Infrequent access (EFS-IA). There is a cost to retrieve files, but lower cost to store.
  • Archive. For rarely accessed data.

There are two different availability options:

  • Regional. Multi-AZ within a region, good for production.
  • One zone. Only one AZ with backup enabled by default. Good for dev.

5.5. EBS vs EFS

EBS volumes are attached to one instance (mostly, apart from multi-attach) and are locked at the AZ level.

EFS can be mounted to hundreds of instances across AZs. It is more expensive, but storage tiers can help reduce this.

Instance Store is attached to a specific instance, and is lost when that instance goes down.

6. ELB and ASG

6.1. Scalability and Availability

Scalability means an application can adapt to handle greater loads.

  • Vertical scalability. Increase the size of a single instance. The scaling limit is often a hardware limit. “Scale up and down”.
  • Horizontal scalability. Also called elasticity. Distribute across more instances. “Scale out and in”.

High availability is the ability to survive a data center loss. This often comes with horizontal scale. Run the application across multiple AZs.

6.2. ELB

6.2.1. Load balancing

Load balancers are servers that forward traffic to multiple servers downstream.

Benefits:

  • Spread the load across downstream instances
  • Perform health checks on instances and handle downstream failures
  • Expose a single point of access (DNS) to your application
  • Provide SSL termination
  • Separate public traffic from private traffic
  • High availability across zones

Elastic Load Balancer (ELB) is a managed load balancer.

Health checks are done on a port and route; the target is considered healthy if it responds with a 200 status.

There are four kinds of managed load balancer:

  • Classic load balancer (Deprecated)
  • Application load balancer
  • Network load balancer
  • Gateway load balancer

Some can be set up as internal (private) or external (public).

6.2.2. Security Groups

Users can connect via HTTP/HTTPS from anywhere. So the security groups typically allow inbound TCP connections on ports 80 and 443.

The security groups for the downstream EC2 instances then only need to allow inbound connections from the load balancer, i.e. one specific source. This means we can forbid users from connecting directly to the instances and force them to go via the load balancer.

6.2.3. Application Load Balancer (ALB)

These are layer 7 load balancers, meaning they take HTTP requests. They support HTTP/2 and WebSocket, and can also redirect from HTTP to HTTPS.

You get a fixed hostname, i.e. XXX.region.elb.amazonaws.com. This gives a stable address for reaching instances that are constantly being created and destroyed and whose IP addresses keep changing. The application servers don’t see the IP of the client directly; it is inserted in the X-Forwarded-For header, and there are also headers for the port (X-Forwarded-Port) and protocol (X-Forwarded-Proto).

Use cases are microservices and container-based applications (e.g. ECS). One load balancer can route traffic between multiple applications. There is a port mapping feature to redirect to a dynamic port in ECS.

They can route requests to multiple HTTP applications across machines (called target groups) or multiple applications on the same machine (e.g. containers).

Routing options:

  • By path in URL - e.g. /users endpoint, /blog endpoint
  • By hostname in URL - e.g. one.example.com and two.example.com
  • By query string or headers - e.g. ?id=123&order=true

ALB can route to multiple target groups. Health checks are at the target group level. Target groups can be:

  • EC2 instances
  • ECS tasks
  • Lambda functions
  • Private IP addresses
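A hedged CLI sketch of wiring this together: an ALB, a target group of EC2 instances with a health check, and an HTTP listener (subnet, security group, VPC and instance IDs are placeholders):

aws elbv2 create-load-balancer --name my-alb --type application --subnets subnet-aaaa1111 subnet-bbbb2222 --security-groups sg-0123456789abcdef0
aws elbv2 create-target-group --name my-targets --protocol HTTP --port 80 --vpc-id vpc-0123456789abcdef0 --health-check-path /health
aws elbv2 register-targets --target-group-arn <target-group-arn> --targets Id=i-0123456789abcdef0
aws elbv2 create-listener --load-balancer-arn <alb-arn> --protocol HTTP --port 80 --default-actions Type=forward,TargetGroupArn=<target-group-arn>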

6.2.4. Network Load Balancer (NLB)

These are layer 4 load balancers, meaning they route TCP and UDP traffic. Ultra-low latency and can handle millions of requests per second. NLB has one static IP per AZ.

Target groups can be:

  • EC2 instances
  • Private IP addresses
  • Application load balancers. You may want the NLB for a static IP, routing to an ALB for the http routing rules.

Health checks support TCP, HTTP and HTTPS protocols.

6.2.5. Gateway Load Balancer (GWLB)

This is a layer 3 load balancer, meaning it routes IP packets.

This is useful when we want to route traffic via a target group of a 3rd party network virtual appliance (e.g. a firewall) before it reaches our application.

User traffic > GWLB > Firewall > GWLB > Our application 

It uses the GENEVE protocol on port 6081.

Target groups can be:

  • EC2 instances
  • Private IP addresses

6.2.6. Sticky Sessions

Stickiness means a particular client is always routed to the same instance behind the load balancer. This means the user doesn’t lose their session data. It does this via a cookie which has an expiration date.

Overusing sticky sessions can result in imbalanced loads, since they’re constraining the load balancer to direct traffic to instances that may not be optimal.

Application-based cookies. Two options for this:

  • A custom cookie is generated by the target. The cookie name must be specified for each target group and cannot be one of the reserved keywords: AWSALB, AWSALBAPP, AWSALBTG.
  • An application cookie is generated by the load balancer. The cookie name is always AWSALBAPP.

Duration-based cookies. This is generated by the load balancer. The cookie name is always AWSALB for ALB (or AWSELB for CLB).

ELB UI -> Target Groups -> Select a target group -> Edit Target Group 
-> Turn On Stickiness -> Select cookie type and duration

6.2.7. Cross-Zone Load Balancing

With cross-zone load balancing, each load balancer node distributes requests evenly across all registered instances in all AZs, regardless of which zone the request arrived in.

Without cross-zone load balancing, there can be big disparities between load in different AZs.

Cross-zone load balancing is enabled by default for ALB, with no charge for data transfer between AZs. It is disabled by default for NLB and GWLB, which will charge for inter-AZ data transfer if you enable it. (For the deprecated CLB it is disabled by default, but free to enable.)

6.2.8. Connection Draining

The load balancer allows time to complete “in-flight requests” while an instance is de-registering or unhealthy, and stops sending new requests to the EC2 instance which is de-registering.

This is called connection draining for CLB, and deregistration delay for ALB and NLB.

It can be 0-3600 seconds. By default it is 300 seconds. Disable connection draining by setting it to 0.
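For ALB and NLB this is a target group attribute; a minimal sketch of disabling it via the CLI (the ARN is a placeholder):

aws elbv2 modify-target-group-attributes --target-group-arn <target-group-arn> --attributes Key=deregistration_delay.timeout_seconds,Value=0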

6.3. SSL Certificates

6.3.1. SSL and TLS

An SSL certificate allows in-flight encryption - traffic between clients and load balancer is encrypted. They have an expiration date (that you set) and must be renewed. Public SSL certificates are issued by Certificate Authorities (CA) like GoDaddy, GlobalSign etc.

TLS certificates are actually used in practice, but the name SSL has stuck.

  • SSL = Secure Sockets Layer
  • TLS = Transport Layer Security (a newer version of SSL)

The load balancer uses a default X.509 certificate, but you can upload your own. AWS Certificate Manager (ACM) allows you to manage these certificates.

6.3.2. SNI

Clients can use Server Name Indication (SNI) to specify the hostname they reach.

SNI solves the problem of loading multiple SSL certificates on one web server. We may have a single load balancer serving two websites: www.example.com and www.company.com

Each of these websites has an SSL certificate uploaded to the load balancer. When a client request comes in, it indicates which website it wants to reach and the load balancer will use the corresponding SSL certificate.

This works for ALB, NLB or CloudFront.

ELB UI -> Select a load balancer -> Add a listener -> Select the default SSL/TLS certificate
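A hedged CLI sketch of requesting a certificate and adding it to an existing HTTPS listener, so SNI can select it per hostname (the domain and ARNs are placeholders):

aws acm request-certificate --domain-name www.example.com --validation-method DNS
aws elbv2 add-listener-certificates --listener-arn <listener-arn> --certificates CertificateArn=<certificate-arn>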

6.4. Auto Scaling Groups (ASG)

6.4.1. What is an ASG?

The goal of an ASG is to scale out/in (add/remove EC2 instances) to match load. It ensures we have a minimum and maximum number of instances running.

If running an ASG connected to a load balancer, any EC2 instances created will be part of that load balancer.

ASG is free, you only pay for the underlying EC2 instances.

You need to create an ASG Launch Template, which specifies:

  • AMI and instance type
  • EC2 user data
  • EBS volumes
  • Security groups
  • SSH key pair
  • IAM roles for EC2 instances
  • Network and subnet information
  • Load balancer information

The ASG has a minimum, maximum and initial size as well as a scaling policy.

It is possible to scale the ASG based on CloudWatch alarms.
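A hedged CLI sketch of a launch template and an ASG attached to a load balancer target group (IDs, subnets and the target group ARN are placeholders):

aws ec2 create-launch-template --launch-template-name my-template --launch-template-data '{"ImageId":"ami-0123456789abcdef0","InstanceType":"t2.micro","SecurityGroupIds":["sg-0123456789abcdef0"]}'
aws autoscaling create-auto-scaling-group --auto-scaling-group-name my-asg --launch-template LaunchTemplateName=my-template --min-size 1 --max-size 5 --desired-capacity 2 --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222" --target-group-arns <target-group-arn>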

6.4.2. Scaling Policies

Dynamic scaling:

  • Target tracking scaling. Keep a certain metric at a target value, e.g. keep average ASG CPU at 40%.
  • Simple / step scaling. When a CloudWatch alarm is triggered, e.g. CPU > 70%, add 2 units.

Scheduled scaling:

  • Anticipate scaling based on known usage patterns. E.g. market open.

Predictive scaling:

  • Continuously forecast load and schedule scaling accordingly.

Good metrics to scale on:

  • CPUUtilization
  • RequestCountPerTarget
  • Average Network In/Out
  • Application-specific metrics
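A minimal sketch of a target tracking policy that keeps average ASG CPU around 40% (the group name is a placeholder):

aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg --policy-name keep-cpu-at-40 --policy-type TargetTrackingScaling --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":40.0}'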

6.4.3. Scaling Cooldown

After a scaling activity happens, there is a cooldown period (default 300 seconds) where the ASG will not launch or remove any more instances while it waits for metrics to stabilise.

Using a ready-to-use AMI means the EC2 instances start quicker, allowing you to use a shorter cooldown and be more reactive.

7. Relational Databases

7.1. RDS

7.1.1. What is Relational Database Service?

Relational Database Service (RDS) is a managed DB using SQL as a query language.

Supported database engines: Postgres, MySQL, MariaDB, Oracle, Microsoft SQL Server, IBM DB2, Aurora (AWS proprietary database).

Instead of RDS, we could run our own EC2 instance with a database inside. The benefit of RDS is that it is a managed service, so you get:

  • Automated provisioning and OS patching
  • Continuous backups and point-in-time restore
  • Monitoring dashboards
  • Read replicas
  • Multi-AZ setup for disaster recovery
  • Maintenance windows for upgrades
  • Horizontal and vertical scaling capabilities
  • Storage backed by EBS

But the downside is you can’t SSH into the underlying instances.

7.1.2. Storage Auto-Scaling

RDS will increase the storage on your DB instance automatically as you run out of free space. You set a Maximum Storage Threshold.

This will automatically modify storage if:

  • Free storage is less than 10%
  • Low-storage lasts at least 5 mins
  • 6 hours have passed since the last notification
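A hedged sketch of creating an instance with storage auto-scaling enabled by setting a maximum storage threshold (names, credentials and sizes are placeholders):

aws rds create-db-instance --db-instance-identifier my-postgres --engine postgres --db-instance-class db.t3.micro --master-username dbadmin --master-user-password 'ChangeMe123!' --allocated-storage 20 --max-allocated-storage 100 --backup-retention-period 7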

7.1.3. Read Replicas

Read replicas allow better read scalability. Read replicas are obviously read-only, so only support SELECT statements.

We can create up to 15 read replicas. They can be within AZ, cross-AZ or cross-region. The replication is asynchronous so they are eventually consistent. Replicas can be promoted to their own DB.

Applications must update their connection string to use the read replicas.

Use cases may be if you have an existing production application, and now you want to add a reporting application without affecting performance of the existing process.

Network costs:

  • If the read replicas are in the same region, there are no network costs for the data transfer.
  • There is a network cost for cross-region read replicas.
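A minimal sketch of adding a read replica with the CLI (identifiers are placeholders; a cross-region replica is created from the destination region using the source instance's ARN):

aws rds create-db-instance-read-replica --db-instance-identifier my-postgres-replica --source-db-instance-identifier my-postgres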

7.1.4. Multi-AZ

This is typically for disaster recovery.

This is a synchronous replication.

There is one DNS name, and the application will automatically failover to the standby database if the master database goes down. No manual intervention is required. This increases availability.

Aside from the disaster case, no traffic is normally routed to the standby database. It is only for failovers, not scaling.

You can set up read replicas as multi-AZ for disaster recovery.

Single-AZ to multi-AZ is a zero downtime operation, the DB does not stop. We just click “modify” on the database.

Internally, what happens is: a snapshot is taken, a new DB is restored from this snapshot in a new AZ, and synchronisation is established between the two databases.
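A minimal sketch of the same "modify" step from the CLI (the identifier is a placeholder):

aws rds modify-db-instance --db-instance-identifier my-postgres --multi-az --apply-immediately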

7.1.5. RDS Custom

This is a managed Oracle or Microsoft SQL Server database with full admin access to the underlying OS and database, which standard RDS does not give you.

It allows us to configure the OS and settings, and access the underlying EC2 instance using SSH or SSM Session Manager.

7.1.6. RDS Proxy

An RDS Proxy pools and shares incoming connections together resulting in fewer connections to the database. Think of it like a load balancer for the database.

This is useful when you have multiple instances scaling in and out that might connect to your database, then disappear and leave lingering connections open. For example, when using Lambda functions.

It is serverless and supports autoscaling. It reduces failover time by up to 66%. It supports both RDS and Aurora, covering most database engines.

No code changes are required for most apps, just point the connection details to the proxy rather than the database directly. Authentication is via IAM using credentials stored in AWS Secrets Manager. The RDS Proxy can only be accessed from inside the VPC, it is never publicly accessible.

7.2. Amazon Aurora

7.2.1. What is Aurora?

Aurora is a proprietary database from AWS with compatibility with Postgres and MySQL.

Aurora is “cloud-optimised” with faster read/write performance and less lag when creating read replicas. Storage grows automatically. Failover is instantaneous.

It stores 6 copies of your data across 3 AZs: 4 out of 6 copies are needed for writes, and 3 out of 6 for reads.

Storage is striped across hundreds of volumes. There is self-healing with peer-to-peer replication.

  • One master Aurora instance takes writes. There is automated failover within 30 seconds if the master instance goes down.
  • Master + up to 15 read replicas serve reads. You can set up auto-scaling for read replicas. There is support for cross-region replication.

The Aurora DB cluster: you don’t interact with any instance directly, since instances can be added and removed and the connection URLs would be constantly changing. Instead there is a writer endpoint that always points to the master instance, and a reader endpoint which points to a load balancer that directs your query to a read replica.

(Figure: Aurora cluster writer and reader endpoints)

7.2.2. Advanced Concepts

Auto scaling. Read replicas scale based on CPU usage or number of connections breaching a user-defined threshold.

Custom endpoint. Define a subset of the read replicas as a custom endpoint. This means we can route traffic for jobs that we know are database-intensive, like analytical queries, to a subset of the instances without affecting the performance on the other read replicas.

Aurora serverless. Automated database instantiation and auto-scaling based on actual usage.

Good for infrequent, intermittent or unpredictable workloads. No capacity planning needed, you pay per second of usage.

The client connects to a proxy fleet, which is like a load balancer that directs requests to Aurora instances that are scaled in the background.

Global Aurora. Cross-region replicas are useful for disaster recovery. Aurora Global Database is the recommended approach.

You create 1 primary region for read/write. You can then have up to 5 secondary read-only regions. Replication lag is <1 second. Up to 16 read replicas per secondary region.

Promoting another region in the event of disaster recovery has a Recovery Time Objective (RTO) < 1 minute.

Aurora Machine Learning. Add ML-based predictions to your application via SQL. The supported services are SageMaker and Amazon Comprehend.

Babelfish for Aurora PostgreSQL. Babelfish allows Aurora PostgreSQL to understand commands targeted for Microsoft SQL Server (written in T-SQL). It automatically translates between these flavours of SQL to make migration easier.

7.3. Backups and Monitoring

7.3.1. RDS

There are automated backups:

  • Full backup daily during the backup window.
  • Transaction logs backed up every 5 mins. This gives the ability to do a point-in-time restore.
  • 1-35 days of retention. Can be disabled by setting to 0.

Manual DB snapshots are triggered by the user and can be retained as long as you want.

A use case for this is for an infrequently used database. A stopped RDS database will still incur storage costs. If you intend to stop it for a long time, you can snapshot it then restore it later.

7.3.2. Aurora

Automated backups are retained for 1-35 days and cannot be disabled. Point-in-time recovery is available for any point in that timeframe.

Manual DB snapshots. Triggered by user and retained for as long as you want.

7.3.3. Restore Options

  • Restore an RDS / Aurora backup or snapshot to create a new database.
  • Restore a MySQL RDS database from S3. Create a backup of your on-premises database, store it in S3, then restore the backup file onto a new instance running MySQL.
  • Restore a MySQL Aurora cluster from S3. Same as for RDS, except the on-premises backup must be created using Percona XtraBackup.

7.3.4. Aurora Database Cloning

Create a new Aurora DB cluster from an existing one. An example use case is cloning a production database into dev and staging.

It is faster than doing a snapshot + restore. It uses the copy-on-write protocol: initially the clone uses the same data volume as the original cluster; then, when updates are made to the cloned DB cluster, additional storage is allocated and data is copied so it is kept separate.

7.4. Encryption

Applies to both RDS and Aurora.

At rest encryption. Database master and read replicas are encrypted using AWS KMS. This must be defined at launch time. If the master is not encrypted then the read replicas cannot be encrypted. If you want to encrypt an unencrypted database, you need to take a snapshot of it and restore a new database with encryption set up at launch time.

In flight encryption. RDS and Aurora are TLS-ready by default. Applications must use the provided AWS TLS root certificates on the client side.

Authentication can be via IAM or by the standard username/password used to connect to databases. Security groups can also be used to control access. Audit logs can be enabled and sent to CloudWatch Logs for longer retention.

7.5. ElastiCache

7.5.1. What is ElastiCache?

ElastiCache is a managed Redis or Memcached. Analogous to how RDS is a managed SQL database. It is managed, meaning AWS takes care of OS maintenance, configuration, monitoring, failure recovery, backups, etc.

Caches are in-memory databases with low latency. They reduce the load on your database for read-intensive workloads.

This can help make your application stateless. For example, when the user logs in, their session is written to the cache. If their workload is moved to another instance, their session can be retrieved from the cache.

It does, however, require significant changes to your application’s code. Instead of querying the database directly, we need to:

  1. Query the cache. If we get a cache hit, use that result.
  2. If we get a cache miss, read from the database directly.
  3. Then write that result to the cache ready for the next query.

We also need to define a cache invalidation strategy to ensure the data in the cache is not stale.
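A minimal cache-aside (lazy loading) sketch using redis-cli against an ElastiCache Redis endpoint (the endpoints, key and query are illustrative; assumes redis-cli and psql are installed on the client):

KEY="user:123"
VALUE=$(redis-cli -h my-cache.abc123.euw2.cache.amazonaws.com GET "$KEY")
if [ -z "$VALUE" ]; then
    # cache miss: read from the database, then write the result back with a 5-minute TTL
    VALUE=$(psql -h my-db.abc123.eu-west-2.rds.amazonaws.com -t -c "SELECT name FROM users WHERE id = 123")
    redis-cli -h my-cache.abc123.euw2.cache.amazonaws.com SET "$KEY" "$VALUE" EX 300
fi
echo "$VALUE"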

7.5.2. Redis vs Memcached

Redis replicates whereas Memcached shards.

Redis:

  • Multi-AZ with auto-failover
  • Read replicas to scale for high availability
  • AOF persistence
  • Backup and restore features
  • Supports sets and sorted sets. Sorted sets allow for things like real-time leaderboards

Memcached:

  • Multi-node for partitioning (sharding) data
  • No replication (therefore not high availability)
  • Not persistent
  • Backup and restore available for the serverless option only
  • Multi-threaded architecture

7.5.3. ElastiCache Security

ElastiCache supports IAM authentication for Redis. IAM policies on ElastiCache are only used for AWS API-level security.

For Memcached, it needs to be username/password. Memcached supports SASL-based authentication.

With Redis AUTH you can set a password/token when you create a cluster, which provides an extra level of security on top of security groups. It supports SSL in-flight encryption.

Common patterns for ElastiCache.

  • Lazy loading - All the read data is cached, but data in the cache may become stale.
  • Write through - Data is inserted/updated in the cache any time it is written to the DB. Ensures no stale data.
  • Session store - Using the cache to store temporary session data, and using TTL to determine cache validation.

7.5.4. Common Port Numbers

Useful port numbers to know:

  • 21 - FTP
  • 22 - SFTP, SSH
  • 80 - HTTP
  • 443 - HTTPS

Common database ports:

  • 5432 - PostgreSQL, Aurora
  • 3306 - MySQL, MariaDB, Aurora
  • 1433 - Microsoft SQL Server
  • 1521 - Oracle RDS
