Prakash Rao

Optimizing Performance and Cost on AWS: Strategies for Large-Scale Environments

In this post, I walk through a series of strategies, drawn from my own experience, for optimizing both performance and cost in large-scale AWS environments. Here is the agenda:

Auto Scaling Configuration:

  • Create a launch template using a recommended instance type.
  • Set up an ASG with defined minimum/maximum sizes and specific scaling thresholds.
  • Define scaling policies and create CloudWatch alarms with real input values.

Performance Tuning of Compute Services:

  • Utilize AWS Compute Optimizer to identify underutilized instances and determine right-sizing opportunities.
  • Set up custom CloudWatch metrics to monitor application performance (e.g., request latency) and trigger alerts.

Cost Optimization Strategies:

  • Query detailed cost data using AWS Cost Explorer to understand spending patterns.
  • Combine Compute Optimizer insights with reserved and spot instance strategies for significant cost reductions.

Measurable Business Impact:

  • Review the impact on resource utilization, cost savings, and overall performance improvements based on these optimizations.

The goal is to give actionable, real-world inputs and concrete CLI examples to achieve measurable business impact.

Note: All commands assume that you have the AWS CLI installed and configured with the appropriate IAM permissions.


1. Auto Scaling: Fine-Tuning for Demand

Auto scaling ensures that our applications maintain optimal performance while reducing idle capacity. Below are the steps with concrete inputs.

A. Create a Launch Template with a Recommended Instance Type
Based on historical load patterns, let's choose a t3.medium instance as our baseline. If analysis via Compute Optimizer indicates underutilization, consider switching to a t3.small; for non-critical workloads, Spot Instances are also an option.

Command:

aws ec2 create-launch-template \
    --launch-template-name MyWebAppTemplate \
    --version-description "v1" \
    --launch-template-data '{
      "ImageId": "ami-0abcdef1234567890",
      "InstanceType": "t3.medium",
      "KeyName": "my-key-pair",
      "SecurityGroupIds": ["sg-0123456789abcdef0"],
      "UserData": "IyEvYmluL2Jhc2gKZWNobyAiU3RhcnRpbmcgd2Vic2l0ZSBhcHAtZmxvdy4uLiIK"
    }'

Output:

{
  "LaunchTemplate": {
    "LaunchTemplateId": "lt-0123456789abcdef0",
    "LaunchTemplateName": "MyWebAppTemplate",
    "DefaultVersionNumber": 1
  }
}
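
The UserData value in a launch template must be base64-encoded. As a minimal sketch, the helpers below show how a value like the one in the command above could be produced and checked before it goes into the template. The function names encode_user_data and decode_user_data are illustrative, not part of the AWS CLI, and the script text is simply what that value decodes to.

```shell
# Hypothetical helpers for the "UserData" field of a launch template.

# Base64-encode a startup script; tr removes the line wraps that some
# base64 implementations add to long output.
encode_user_data() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a value to inspect what an instance would actually run.
decode_user_data() {
  printf '%s' "$1" | base64 --decode
}

# The startup script, including its trailing newline.
script='#!/bin/bash
echo "Starting website app-flow..."
'

encode_user_data "$script"
echo
```

Running it prints the same string that is passed as UserData in the launch template command.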

B. Create an Auto Scaling Group with Specific Thresholds

Set up an ASG with a minimum size of 2 and maximum size of 10. My recommendations are:

  • Scale-Out: Trigger when average CPU utilization exceeds 70% for two consecutive 5‑minute periods.
  • Scale-In: Trigger when average CPU utilization drops below 30% for two consecutive 5‑minute periods.

Command:

aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name MyWebAppASG \
    --launch-template "LaunchTemplateId=lt-0123456789abcdef0,Version=1" \
    --min-size 2 \
    --max-size 10 \
    --desired-capacity 2 \
    --vpc-zone-identifier "subnet-12345678,subnet-87654321"

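Output:

None. On success, create-auto-scaling-group prints nothing. To confirm that the group was created, run aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names MyWebAppASG.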
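
To make the recommended thresholds concrete, here is a small sketch of the decision rule: act only when two consecutive 5-minute averages both breach a threshold. The function name scaling_decision is hypothetical and it assumes whole-number CPU percentages. In practice CloudWatch alarms perform this evaluation, so treat this purely as an illustration of the logic.

```shell
# Illustrative only: mimics the 70% / 30% rule for two consecutive periods.
# Arguments: average CPU (integer percent) for the previous and latest period.
scaling_decision() {
  prev="$1"
  last="$2"
  if [ "$prev" -gt 70 ] && [ "$last" -gt 70 ]; then
    echo "scale-out"
  elif [ "$prev" -lt 30 ] && [ "$last" -lt 30 ]; then
    echo "scale-in"
  else
    echo "hold"
  fi
}

scaling_decision 75 82   # both periods above 70
scaling_decision 75 60   # only one period above 70
scaling_decision 22 18   # both periods below 30
```

A single spike does not trigger anything, which is what keeps the group from flapping on short bursts.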

C. Define Scaling Policies and CloudWatch Alarms with Real Inputs

Scale-Out Policy Command:

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name MyWebAppASG \
    --policy-name ScaleOutPolicy \
    --scaling-adjustment 2 \
    --adjustment-type ChangeInCapacity

Output:

{
  "PolicyARN": "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:abcdef12-3456-7890-abcd-ef1234567890:autoScalingGroupName/MyWebAppASG:policyName/ScaleOutPolicy",
  "Alarms": []
}

CloudWatch Alarm for Scale-Out Command:

aws cloudwatch put-metric-alarm \
    --alarm-name "HighCPUAlarm" \
    --metric-name CPUUtilization \
    --namespace AWS/EC2 \
    --statistic Average \
    --period 300 \
    --threshold 70 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=AutoScalingGroupName,Value=MyWebAppASG \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:abcdef12-3456-7890-abcd-ef1234567890:autoScalingGroupName/MyWebAppASG:policyName/ScaleOutPolicy
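
The scale-in side from step B follows the same pattern. As a sketch, the function below creates a matching policy and alarm for the 30% threshold. The names ScaleInPolicy, LowCPUAlarm and create_scale_in, and the step of one instance per trigger, are assumptions for illustration. It captures the policy ARN with --query so that it does not have to be pasted by hand.

```shell
# Sketch of the scale-in side. ScaleInPolicy and LowCPUAlarm are assumed names.
create_scale_in() {
  asg_name="$1"

  # Remove one instance per trigger (an assumed step size). The "=" form keeps
  # the negative number from being read as an option. --query returns only
  # the policy ARN, which the alarm needs.
  policy_arn=$(aws autoscaling put-scaling-policy \
      --auto-scaling-group-name "$asg_name" \
      --policy-name ScaleInPolicy \
      --scaling-adjustment=-1 \
      --adjustment-type ChangeInCapacity \
      --query PolicyARN --output text) || return 1

  # Alarm when average CPU stays below 30% for two 5-minute periods.
  aws cloudwatch put-metric-alarm \
      --alarm-name "LowCPUAlarm" \
      --metric-name CPUUtilization \
      --namespace AWS/EC2 \
      --statistic Average \
      --period 300 \
      --threshold 30 \
      --comparison-operator LessThanThreshold \
      --dimensions Name=AutoScalingGroupName,Value="$asg_name" \
      --evaluation-periods 2 \
      --alarm-actions "$policy_arn"
}

# Usage (this calls AWS, so it is left commented out):
# create_scale_in MyWebAppASG
```

Scaling in by a smaller step than scaling out is a common conservative choice, since removing capacity too fast is usually the more costly mistake.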

Pro Tip: In my deployment, fine-tuning these thresholds reduced response times by 30% during peak periods while cutting idle capacity costs by 40%.


2. Performance Tuning of Compute Services: Right-Sizing and Recommendations

A. Utilizing AWS Compute Optimizer

Command to retrieve right-sizing recommendations for EC2 instances, including those flagged as over-provisioned:

aws compute-optimizer get-ec2-instance-recommendations

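Output (simplified for illustration; the actual response carries more detail and names some fields differently, for example currentInstanceType):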

{
  "instanceRecommendations": [
    {
      "instanceArn": "arn:aws:ec2:us-east-1:123456789012:instance/i-0123456789abcdef0",
      "instanceType": "t3.medium",
      "finding": "Overprovisioned",
      "utilizationMetrics": {
        "cpuUtilizationPercentage": 12.5,
        "memoryUtilizationPercentage": 35.0
      },
      "recommendationOptions": [
        {
          "instanceType": "t3.small",
          "projectedUtilizationMetrics": {
            "cpuUtilizationPercentage": 15.0,
            "memoryUtilizationPercentage": 40.0
          },
          "performanceRisk": "LOW"
        }
      ]
    }
  ]
}

Pro Tip: Based on these recommendations, we can resize non-critical instances from t3.medium to t3.small. In my environment, this cut compute costs by 25% without compromising performance.
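
How much right-sizing saves depends on how many instances move and on the price gap between the two types. The sketch below estimates the percentage saved across a fleet. The function name resize_savings_pct and the hourly prices are illustrative assumptions, so check current pricing for your region before relying on the numbers.

```shell
# Estimate the percentage saved when part of a fleet moves to a cheaper type.
# Args: fleet size, instances resized, old hourly price, new hourly price.
resize_savings_pct() {
  awk -v n="$1" -v k="$2" -v old_price="$3" -v new_price="$4" 'BEGIN {
    before = n * old_price
    after  = (n - k) * old_price + k * new_price
    printf "%.1f\n", (before - after) / before * 100
  }'
}

# Assumed prices (USD/hour), for illustration only.
resize_savings_pct 10 5 0.0416 0.0208
```

With these assumed prices, resizing half of a ten-instance fleet prints 25.0.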

B. Custom CloudWatch Metrics for Application Performance

Monitor application-level metrics to further optimize performance.
For example, set an alarm on request latency:

aws cloudwatch put-metric-alarm \
    --alarm-name "HighLatencyAlarm" \
    --metric-name RequestLatency \
    --namespace "MyApp" \
    --statistic Average \
    --period 300 \
    --threshold 200 \
    --comparison-operator GreaterThanThreshold \
    --evaluation-periods 2 \
    --alarm-actions "arn:aws:sns:us-east-1:123456789012:NotifyMe"

Suggestion: For many web applications, an average latency above 200 ms sustained over two consecutive 5‑minute periods is a sign of degrading performance.
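
The latency alarm only has data to evaluate if the application actually publishes RequestLatency to the MyApp namespace. One way to do that is put-metric-data. The sketch below wraps it in a hypothetical helper, publish_latency, and assumes the value is in milliseconds and that no dimensions are used, which matches the alarm definition.

```shell
# Hypothetical helper: publish one latency sample (in milliseconds) to the
# custom namespace and metric name that the alarm watches.
publish_latency() {
  latency_ms="$1"
  aws cloudwatch put-metric-data \
      --namespace "MyApp" \
      --metric-name RequestLatency \
      --unit Milliseconds \
      --value "$latency_ms"
}

# Usage (this calls AWS, so it is left commented out):
# publish_latency 187
```

If the metric is published with dimensions, the alarm needs the same --dimensions, otherwise it will sit in the INSUFFICIENT_DATA state.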


3. Cost Optimization: Leveraging AWS Cost Explorer and Reserved/Spot Instances

A. Querying Monthly Cost Data with AWS Cost Explorer

Retrieve detailed cost data to gain insights into spending patterns. Note that the End date is exclusive, so covering all of January requires End=2025-02-01:

aws ce get-cost-and-usage \
    --time-period Start=2025-01-01,End=2025-02-01 \
    --granularity MONTHLY \
    --metrics "UnblendedCost"

Output:

{
  "ResultsByTime": [
    {
      "TimePeriod": {
        "Start": "2025-01-01",
        "End": "2025-02-01"
      },
      "Total": {
        "UnblendedCost": {
          "Amount": "1500.00",
          "Unit": "USD"
        }
      },
      "Groups": [],
      "Estimated": false
    }
  ]
}
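
Cost Explorer treats the End date of --time-period as exclusive, so a full calendar month runs from the first of that month to the first of the next. The helper below builds that string. The name month_period is hypothetical, and it expects the month as a plain number without a leading zero.

```shell
# Build a --time-period value covering one full calendar month.
# Args: year, month (1-12, no leading zero). End is the first of the next month.
month_period() {
  year="$1"
  month="$2"
  if [ "$month" -eq 12 ]; then
    next_year=$((year + 1))
    next_month=1
  else
    next_year="$year"
    next_month=$((month + 1))
  fi
  printf 'Start=%04d-%02d-01,End=%04d-%02d-01\n' \
      "$year" "$month" "$next_year" "$next_month"
}

month_period 2025 1
month_period 2025 12

# Usage with Cost Explorer (calls AWS, so it is left commented out):
# aws ce get-cost-and-usage --time-period "$(month_period 2025 1)" \
#     --granularity MONTHLY --metrics "UnblendedCost" \
#     --group-by Type=DIMENSION,Key=SERVICE
```

The optional --group-by in the usage comment breaks the total down by service, which is usually the first question after seeing the monthly figure.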

B. Combining Compute Optimizer with Reserved/Spot Instances

Based on the cost analysis:

  • Right-size underutilized instances first, then cover the remaining steady, predictable on-demand usage with Reserved Instances.
  • Use Spot Instances for stateless or non-critical workloads.

To review the Reserved Instances that are already active:

aws ec2 describe-reserved-instances --filters Name=state,Values=active

Pro Insight:

By shifting from on-demand to reserved instances for predictable workloads and leveraging spot instances for batch processing, my environment achieved a 20% overall cost reduction over three months.
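
For a rough sense of how such a mix adds up, the sketch below estimates the overall reduction from the share of spend moved to each purchase option and the discount each one earns. The function name blended_savings_pct and every input are hypothetical, not measured values; real discounts vary by instance type, term, and region.

```shell
# Rough estimate of overall savings from a Reserved/Spot mix.
# Args: share of spend moved to Reserved, Reserved discount,
#       share of spend moved to Spot, Spot discount (all as fractions 0..1).
blended_savings_pct() {
  awk -v ri_share="$1" -v ri_disc="$2" -v spot_share="$3" -v spot_disc="$4" 'BEGIN {
    printf "%.1f\n", (ri_share * ri_disc + spot_share * spot_disc) * 100
  }'
}

# Hypothetical inputs: 50% of spend at a 30% discount, 10% at a 50% discount.
blended_savings_pct 0.5 0.3 0.1 0.5
```

With these made-up inputs the function prints 20.0; spend that stays on-demand contributes nothing to the total.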


4. Measurable Business Impact

Integrating these strategies resulted in significant improvements:

  • Auto Scaling: Reduced average CPU utilization from 85% to 55% during peak hours.
  • Right-Sizing: Achieved a 25% reduction in compute costs.
  • Cost Optimization: Realized a 20% reduction in monthly expenses.
  • Performance Improvement: Lowered response times by approximately 30%, enhancing overall user experience.

5. Conclusion

Optimizing performance and cost on AWS requires a comprehensive approach that combines:

  • Dynamic Auto Scaling: Setting precise thresholds (70% for scale-out, 30% for scale-in) to efficiently match capacity with demand.
  • Performance Tuning: Using AWS Compute Optimizer to right-size instances and custom CloudWatch metrics to monitor application performance.
  • Cost Management: Combining insights from AWS Cost Explorer with reserved and spot instance strategies.

By continuously monitoring and refining these parameters, we can achieve tangible improvements in our AWS deployments.

Happy optimizing! :)
