DEV Community

Neil Clark for AWS Community Builders

Building Scalable Applications in AWS

Overview of Scalability in AWS

In simple terms, scalability refers to a system’s ability to handle increased load by adding resources. This is crucial, as it ensures that applications can maintain performance levels despite growing user demand. So what does this mean in terms of AWS services?

Scalability comes in two variants:

  • Vertical Scaling
  • Horizontal Scaling

Vertical Scaling - This is where we add more horsepower to an existing instance: more CPU, more RAM. This is achieved by changing the instance size or type.

Horizontal Scaling - This is where we add more of the same type of machine for world domination... I mean, increased capacity.

AWS Core Services for Scalability

Amazon EC2

Probably AWS's most used service, EC2 provides on-demand compute capacity for servers deployed in the cloud. EC2 instances come in a huge array of types and sizes to fill just about any need, from burstable T3 instances to the recently announced U7inh family with 1920 vCPUs and 32TB of RAM, which is truly huge computing capacity for current times.

So the instance families, and the range of sizes within those families, cover the vertical scaling element, and we can scale horizontally by adding additional instances...

Well hold on, are you saying add them manually??

Yes I am...

Ok... well, is there a way to do this automatically?

Well of course there is... this is called EC2 Auto Scaling.

Auto Scaling allows you to create Auto Scaling groups. For each group you set a minimum and maximum number of instances, plus a desired number for normal operations. You then define scaling policies that determine when the group grows or shrinks. For example, I have a group with a minimum of 2 instances, a maximum of 8 and a desired number of 3. My scaling policy monitors EC2 instance CPU utilisation: when it breaches 60% a new instance is added, and when utilisation settles back down the instance count gradually reduces towards the desired number.

Scaling types are broken down into the following:
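To make the 2 / 8 / 3 example above a bit more concrete, here is a minimal sketch of that behaviour in plain Python. It is a toy model only: the class name, the one-instance-per-period step and the CPU readings are all assumptions for illustration, and the real service works from CloudWatch alarms, cooldowns and policies that this ignores.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Toy model of an Auto Scaling group (hypothetical, not the AWS logic)."""
    minimum: int = 2
    maximum: int = 8
    desired: int = 3          # size for normal operations
    threshold: float = 60.0   # average CPU % that triggers a scale-out

    def next_size(self, current: int, avg_cpu: float) -> int:
        """Return the instance count after one evaluation period."""
        if avg_cpu > self.threshold:
            target = current + 1      # breach: add one instance
        elif current > self.desired:
            target = current - 1      # load has eased: step back down
        else:
            target = current          # nothing to do
        # Never leave the [minimum, maximum] range.
        return max(self.minimum, min(self.maximum, target))


group = ScalingGroup()
size = group.desired
history = []
for cpu in [45, 72, 81, 90, 55, 40, 35]:   # made-up CPU readings
    size = group.next_size(size, cpu)
    history.append(size)
print(history)
```

The group grows while CPU stays above 60% and then drifts back to three instances once the load drops, without ever going below two or above eight.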

  • Dynamic scaling: Adapts capacity to actual loads to optimize resource utilization
  • Predictive scaling: Uses workload forecasting to plan future capacity
  • Scheduled scaling: Allows you to scale based on a schedule
  • Fixed number of instances: Allows you to maintain a fixed number of instances
  • Proactive scaling: Allows you to scale proactively
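As a rough illustration of scheduled scaling, the sketch below builds the arguments for boto3's `put_scheduled_update_group_action` call on the `autoscaling` client. The group name, action name and cron expression are made up, and the `apply` helper is only defined, never called, so nothing here touches a real account.

```python
def scheduled_action_kwargs(group_name, action_name, cron, minimum, maximum, desired):
    """Build keyword arguments for put_scheduled_update_group_action."""
    if not minimum <= desired <= maximum:
        raise ValueError("desired must sit between minimum and maximum")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        "Recurrence": cron,            # cron expression for the schedule
        "MinSize": minimum,
        "MaxSize": maximum,
        "DesiredCapacity": desired,
    }


def apply(kwargs):
    """Send the action to AWS (needs boto3 and credentials; not run here)."""
    import boto3
    boto3.client("autoscaling").put_scheduled_update_group_action(**kwargs)


# Hypothetical example: grow the group every weekday morning at 08:00.
morning = scheduled_action_kwargs(
    "web-asg", "weekday-morning-scale-out", "0 8 * * 1-5", 2, 8, 5
)
print(morning["Recurrence"], morning["DesiredCapacity"])
```

A matching evening action with a lower desired capacity would bring the group back down again.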

Elastic Load Balancing

Here is a service that goes hand in hand with EC2 and Auto Scaling. While it is great to be able to automatically scale the number of instances to meet demand, you would generally have to add new instances to your application manually, or run scripts, to get the most from them. Combining Elastic Load Balancers (ELBs) with Auto Scaling groups overcomes this issue. The ELB acts as the front end to your servers, and as Auto Scaling starts new servers they are registered with the ELB and automatically become part of your application landscape. When they are no longer required and scale-in starts, they are removed from the ELB and shut down.
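The register / route / deregister lifecycle described above can be sketched with a toy round-robin front end. This is an illustration only: the class, the instance IDs and the simple round-robin choice are assumptions, and a real ELB also does health checks, connection draining and more.

```python
from itertools import count


class ToyLoadBalancer:
    """Round-robin front end over whichever targets are registered."""

    def __init__(self):
        self.targets = []
        self._counter = count()

    def register(self, instance_id):
        """Called when a new instance is launched (scale-out)."""
        if instance_id not in self.targets:
            self.targets.append(instance_id)

    def deregister(self, instance_id):
        """Called before an instance is shut down (scale-in)."""
        self.targets.remove(instance_id)

    def route(self):
        """Pick the next target for an incoming request."""
        if not self.targets:
            raise RuntimeError("no targets registered")
        return self.targets[next(self._counter) % len(self.targets)]


lb = ToyLoadBalancer()
for instance in ("i-aaa", "i-bbb"):        # hypothetical instance IDs
    lb.register(instance)
before = [lb.route() for _ in range(4)]

lb.register("i-ccc")                        # scale-out: new server joins
after = [lb.route() for _ in range(3)]

lb.deregister("i-ccc")                      # scale-in: server leaves again
print(before, after, lb.targets)
```

The callers never need to know which servers exist; they only ever talk to the front end.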

Amazon RDS

Amazon RDS is Amazon's managed database service, supporting a wide variety of popular database engines. Scaling RDS is really quite simple. In a traditional setup a single database node handles both write and read operations from your applications. For an application with heavy read and write traffic, relying on a single node is not always the best option, although the application may leave you no alternative. In this case RDS allows you to scale your instance vertically, both in instance size and in storage. Resizing the instance does mean disruption during the scaling event, which is not ideal, so it has to be planned for the least impactful time, which for some apps could be never...

The other option that RDS offers is read replicas. These let you scale out the database's read operations across multiple replicas, removing load from the primary database, freeing up capacity and adding durability to your application. For situations where you need to keep scaling up as an option, RDS should be deployed in a Multi-AZ setup. With this in place, when you need to scale up your primary node, the standby node is scaled up first, a failover then takes place from the primary to the standby, and the old primary is scaled up afterwards. This keeps downtime for your database to a minimum.
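Read replicas only help if the application actually sends its reads to them. Below is a minimal sketch of that routing decision, assuming a hypothetical `QueryRouter` class, made-up endpoint names and a deliberately naive "starts with SELECT" check. Replicas are updated asynchronously, so reads that must see the very latest write would still need to go to the primary.

```python
from itertools import cycle


class QueryRouter:
    """Send reads to replicas in turn and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        words = sql.split()
        # Naive classification: only plain SELECT statements count as reads.
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


# Hypothetical endpoints, not real RDS hostnames.
router = QueryRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
first_read = router.endpoint_for("SELECT * FROM orders")
second_read = router.endpoint_for("select count(*) from orders")
write = router.endpoint_for("INSERT INTO orders VALUES (1)")
print(first_read, second_read, write)
```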

AWS Serverless Services

AWS Lambda

AWS Lambda is a serverless compute service that allows you to run code without provisioning any compute resources. It is triggered by various supported AWS services and scales automatically based on the number of incoming requests. In other words, it is a service that scales by default with no intervention required from the infrastructure team (unless you hit the concurrency limits).

There are some best practices to help with scaling, though:

  • Optimize Function Code: Ensure your function code is efficient and minimizes execution time. This helps reduce the overall load and cost.
  • Use Asynchronous Invocations: For non-blocking tasks, use asynchronous invocations to allow Lambda to handle more requests concurrently.
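Here is a small sketch of both points, under some assumptions. For the first, one common way to trim execution time is to do expensive setup once per execution environment rather than on every request; the config dictionary stands in for whatever your function really loads. For the second, the helper builds the arguments for boto3's Lambda `invoke` call with `InvocationType="Event"`, which is the asynchronous mode. The function name and payload are made up, and nothing is sent to AWS.

```python
import json

_CONFIG = None      # cached between invocations in a warm environment
INIT_COUNT = 0      # only here so we can see how often setup runs


def _load_config():
    """Pretend-expensive setup that should happen once, not per request."""
    global _CONFIG, INIT_COUNT
    if _CONFIG is None:
        INIT_COUNT += 1
        _CONFIG = {"greeting": "hello"}
    return _CONFIG


def handler(event, context):
    """Lambda-style handler that reuses the cached setup."""
    config = _load_config()
    name = event.get("name", "world")
    body = json.dumps({"message": f"{config['greeting']} {name}"})
    return {"statusCode": 200, "body": body}


def async_invoke_kwargs(function_name, payload):
    """Arguments for boto3 lambda client's invoke() in asynchronous mode."""
    return {
        "FunctionName": function_name,
        "InvocationType": "Event",   # queue the event, don't wait for a result
        "Payload": json.dumps(payload).encode("utf-8"),
    }


response = handler({"name": "Neil"}, None)
kwargs = async_invoke_kwargs("resize-image", {"key": "photo.jpg"})  # hypothetical
print(response["body"], kwargs["InvocationType"])
```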

Amazon API Gateway

API Gateway enables you to create, publish, maintain and monitor APIs at any scale. It is another serverless managed service that requires no management of infrastructure.

As with AWS Lambda, API Gateway scales automatically with demand, and again there are some recommendations to help it scale effectively:

  • API Gateway Caching: Enable caching to store responses and reduce the load on your backend services. This can significantly improve performance and reduce latency for frequently accessed data.
  • Request Throttling: Set up throttling to limit the number of requests per second for each API key. This helps protect your backend services from being overwhelmed by too many requests.
  • Usage Plans: Create usage plans to enforce throttling and quota limits on individual API keys, ensuring fair usage and preventing abuse.
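Throttling with a steady rate plus a burst allowance is usually explained with a token bucket, which is the model AWS describes for API Gateway. The sketch below is a generic token bucket in plain Python, not API Gateway's implementation; the class, the numbers and the explicit `now` argument (used instead of a real clock to keep it predictable) are all assumptions.

```python
class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # time of the previous check, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                 # throttled


bucket = TokenBucket(rate=2, burst=3)
at_zero = [bucket.allow(0.0) for _ in range(4)]   # four requests at once
later = [bucket.allow(1.0) for _ in range(3)]     # three more a second later
print(at_zero, later)
```

For per-key limits like those in a usage plan, you would keep one bucket per API key.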

AWS Fargate

AWS Fargate is a serverless compute engine for containers that works with Amazon ECS and EKS. It allows you to run containers without managing the underlying infrastructure. This means you can focus on building and running applications without worrying about the servers.

Fargate scales automatically based on the resource requirements of your containers. When you define your task, you specify the CPU and memory requirements, and Fargate takes care of provisioning the right amount of resources. This ensures that your applications can handle varying loads without manual intervention.

Fargate can auto scale using the policies below:

  • Auto Scaling Policies: Use Amazon ECS Service Auto Scaling to automatically adjust the number of running tasks based on demand.
  • Target Tracking Policies: Adjust the task count to maintain a specified metric (e.g., CPU utilization).
  • Step Scaling Policies: Add or remove tasks based on specific thresholds.
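As a sketch of how a target tracking policy for an ECS service on Fargate might be wired up, the function below builds the arguments for two boto3 `application-autoscaling` calls: `register_scalable_target` and `put_scaling_policy`. The cluster name, service name, task limits and 60% target are assumptions, and the `apply` helper is defined but never called, so nothing is sent to AWS.

```python
def ecs_target_tracking(cluster, service, min_tasks, max_tasks, cpu_target):
    """Build arguments for scaling an ECS service on average CPU."""
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    # Limits for the number of running tasks.
    target = {**common, "MinCapacity": min_tasks, "MaxCapacity": max_tasks}
    # Policy that keeps average service CPU near the chosen value.
    policy = {
        **common,
        "PolicyName": f"{service}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(cpu_target),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    }
    return target, policy


def apply(target, policy):
    """Send both calls to AWS (needs boto3 and credentials; not run here)."""
    import boto3
    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


# Hypothetical cluster and service names.
target, policy = ecs_target_tracking("demo-cluster", "web", 2, 10, 60)
print(target["ResourceId"], policy["PolicyType"])
```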

Amazon DynamoDB

Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. It is designed to handle high-traffic applications and can scale horizontally to accommodate growing workloads.

DynamoDB automatically adjusts throughput capacity based on traffic patterns, ensuring that your application can handle sudden spikes in demand. It also offers features like Global Tables for multi-region replication and on-demand capacity mode for flexible scaling.

DynamoDB can scale using the policies below:

  • Auto Scaling Policies: Use DynamoDB auto scaling to automatically adjust the provisioned throughput capacity based on actual traffic patterns. This ensures that your tables and global secondary indexes can handle sudden increases in traffic without throttling.
  • Target Utilization: Set a target utilization percentage for your table's read and write capacity. Auto scaling will adjust the provisioned throughput to maintain this target, ensuring efficient use of resources.
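The arithmetic behind target utilization is simple enough to sketch. If a table is consuming a given number of capacity units and you want that to be, say, 70% of what is provisioned, the provisioned figure is the consumed figure divided by 0.7, kept within your minimum and maximum. The function below is a simplified model with made-up numbers, not DynamoDB's actual algorithm, which also reacts to sustained alarms over time.

```python
def provisioned_for_target(consumed_units, target_percent, minimum, maximum):
    """Capacity needed so that consumed_units is target_percent of it."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be between 1 and 100")
    # Integer ceiling division avoids floating point rounding surprises.
    wanted = -(-consumed_units * 100 // target_percent)
    # Stay inside the configured limits.
    return max(minimum, min(maximum, wanted))


normal = provisioned_for_target(350, 70, 5, 1000)    # typical traffic
quiet = provisioned_for_target(2, 70, 5, 1000)       # held up by the minimum
spike = provisioned_for_target(5000, 70, 5, 1000)    # capped by the maximum
print(normal, quiet, spike)
```

A lower target percentage leaves more headroom for sudden spikes at the cost of paying for more unused capacity.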

Amazon S3

Amazon Simple Storage Service (S3) provides scalable object storage for a wide range of use cases. S3 is designed to store and retrieve any amount of data from anywhere on the web, making it ideal for backup, archiving, big data analytics, and more.

S3 is a great service that can scale infinitely... ok, well, not infinitely, since it depends on AWS storage hardware having the capacity, but let’s just say I don’t think AWS will be running out when people need it.

This has been just a whistle-stop look at some AWS services and how they scale so that your applications can scale for your users. There are many more AWS services you can integrate with your applications, and the theme is the same there too.
