DEV Community

Optimization and Automation of AWS Resources to reduce costs without impacting operations

In today's exercise, we will build a way to reduce costs without impacting operations. The workload must be available 24 hours a day, 7 days a week, but during the night shift it does not need the same computing resources. We therefore need an automation that, at a set time in the evening, stops the instances, resizes them to smaller types, and starts them again. In the morning, it stops them once more and restores the day-shift instance types.

The architecture diagram below shows how it will work.

Image description

The experiment was run in the São Paulo Region (sa-east-1).

Revised cost estimates

  • DB-1-VOL-WEB (c6i.8xlarge) = $1,530.08/month
  • APP-1-VOL-WEB (m5.2xlarge) = $446.76/month

Total cost without resizing

  • Monthly: $1,530.08 + $446.76 = $1,976.84
  • Half-yearly: $1,976.84 * 6 = $11,861.04
  • Annual: $1,976.84 * 12 = $23,722.08

Scenario with automatic resizing

  • During the day, the instances keep the same settings (c6i.8xlarge for DB-1-VOL-WEB and m5.2xlarge for APP-1-VOL-WEB).
  • During the night and on weekends, the instances are resized to smaller configurations. In our work, we will use t3.2xlarge for DB-1-VOL-WEB and t3.large for APP-1-VOL-WEB.

Daily Calculations:

Day cost (12 hours/day):

  • DB-1-VOL-WEB (c6i.8xlarge) = $765.04/month
  • Hourly cost: $2.125 * 12 hours = $25.50/day * 30 days = $765.00, or $765.04 when taken as half of the $1,530.08 monthly estimate

  • APP-1-VOL-WEB (m5.2xlarge) = $223.38/month
  • Hourly cost: $0.612 * 12 hours = $7.34/day * 30 days = $220.32, or $223.38 when taken as half of the $446.76 monthly estimate (the figure used in the totals below)

Night cost (12 hours/day):

  • DB-1-VOL-WEB (t3.2xlarge) = $193.53/month
  • Hourly cost: $0.5376 * 12 hours = $6.45/day * 30 days = $193.53

  • APP-1-VOL-WEB (t3.large) = $48.38/month
  • Hourly cost: $0.1344 * 12 hours = $1.61/day * 30 days = $48.38

Total Cost with resizing:

  • Monthly: $765.04 + $223.38 + $193.53 + $48.38 = $1,230.33
  • Half-yearly: $1,230.33 * 6 = $7,381.98
  • Annual: $1,230.33 * 12 = $14,763.96

Revised Potential Savings:

  • Monthly savings: $1,976.84 - $1,230.33 = $746.51
  • Half-yearly savings: $11,861.04 - $7,381.98 = $4,479.06
  • Annual savings: $23,722.08 - $14,763.96 = $8,958.12
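The arithmetic above can be reproduced with a short script. This is only a sketch: it takes the monthly figures quoted in this article as inputs (it does not look up AWS prices), and the variable and function names are illustrative. `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

# Monthly figures quoted in the article (USD, sa-east-1); not fetched from AWS.
BASELINE = {
    "DB-1-VOL-WEB (c6i.8xlarge)": Decimal("1530.08"),
    "APP-1-VOL-WEB (m5.2xlarge)": Decimal("446.76"),
}
RESIZED = {
    "DB day (c6i.8xlarge, 12 h)": Decimal("765.04"),
    "APP day (m5.2xlarge, 12 h)": Decimal("223.38"),
    "DB night (t3.2xlarge, 12 h)": Decimal("193.53"),
    "APP night (t3.large, 12 h)": Decimal("48.38"),
}


def savings(baseline, resized, months=1):
    """Return the savings over `months` months as an exact Decimal."""
    monthly_saving = sum(baseline.values()) - sum(resized.values())
    return monthly_saving * months


if __name__ == "__main__":
    for label, months in (("Monthly", 1), ("Half-yearly", 6), ("Annual", 12)):
        print(f"{label} savings: ${savings(BASELINE, RESIZED, months):,.2f}")
```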

Details of Lambda, EventBridge, and Custom Policies

Lambda

Two Lambda functions were created to automate the Start/Stop/Resize process of the DB-1-VOL-WEB and APP-1-VOL-WEB instances. The Lambda functions were configured with the Python 3.12 runtime and use the boto3 library to interact with AWS.

  • Function: db-stop-resize-start

-- Goal: To downscale the DB-1-VOL-WEB instance at night and upscale it in the morning.
-- Steps:
--- Stop: The function stops the instance at 06:00 p.m.
--- Resize: Once stopped, the instance is resized to a smaller type (t3.2xlarge).
--- Start: The instance is started again on the smaller type. At 06:00 a.m., the same stop, resize, and start sequence runs again to return it to the larger type (c6i.8xlarge).

Image description

Code to insert on Lambda Function:

import boto3

ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    instance_id = 'i-instanceidnumber'  # Replace with your instance ID
    action = event.get('action')  # Set in the EventBridge payload
    print(event)

    if action == 'downscale':  # Must match the payload of the evening schedule
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])

        # Change to the smaller instance type (only possible while stopped)
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 't3.2xlarge'})  # Replace with your night-time instance type

        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])

    elif action == 'upscale':  # Must match the payload of the morning schedule
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])

        # Change to the larger instance type (only possible while stopped)
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 'c6i.8xlarge'})  # Replace with your day-time instance type

        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])

    else:
        # Unknown or missing action: do nothing, but leave a trace in the logs
        print(f"Unsupported action: {action!r}")


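Note that the upscale branch also stops the instance first. An EC2 instance type can only be changed while the instance is stopped, so the function stops the instance, applies the larger type, and starts it again at the scheduled time.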

  • Function: app-stop-resize-start

-- Goal: To downscale the APP-1-VOL-WEB instance at night and upscale it in the morning.
-- Steps:
--- Stop: The function stops the instance at 06:00 p.m.
--- Resize: The instance is resized to a smaller type (t3.large).
--- Start: The instance is started again on the smaller type. At 06:00 a.m., the same stop, resize, and start sequence runs again to return it to the larger type (m5.2xlarge).

Image description

Code to insert on Lambda Function:

import boto3

ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    instance_id = 'i-instanceidnumber'  # Replace with your instance ID
    action = event.get('action')  # Set in the EventBridge payload
    print(event)

    if action == 'downscale':  # Must match the payload of the evening schedule
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])

        # Change to the smaller instance type (only possible while stopped)
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 't3.large'})  # Replace with your night-time instance type

        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])

    elif action == 'upscale':  # Must match the payload of the morning schedule
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])

        # Change to the larger instance type (only possible while stopped)
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 'm5.2xlarge'})  # Replace with your day-time instance type

        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])

    else:
        # Unknown or missing action: do nothing, but leave a trace in the logs
        print(f"Unsupported action: {action!r}")

As with the DB function, the upscale branch stops the instance before changing its type and starting it again.
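Since the two functions differ only in the instance ID and the instance types, one possible variation is a single function that reads both from the payload. The sketch below is not the setup used in this article: the payload keys `instance_id` and `instance_type` are hypothetical, and the optional `ec2` argument exists only so the logic can be exercised with a stand-in client. It also skips the stop/start cycle when the instance already has the requested type, which avoids unnecessary downtime if a schedule fires twice.

```python
def resize_instance(ec2, instance_id, target_type):
    """Stop, resize, and start one instance.

    `ec2` is a boto3 EC2 client (or any object exposing the same methods).
    Returns True if a resize was performed, False if it was skipped.
    """
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    current_type = reservations[0]["Instances"][0]["InstanceType"]
    if current_type == target_type:
        return False  # Already the requested type: avoid needless downtime

    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id, InstanceType={"Value": target_type}
    )
    ec2.start_instances(InstanceIds=[instance_id])
    return True


def lambda_handler(event, context, ec2=None):
    # Hypothetical payload: {"instance_id": "i-...", "instance_type": "t3.large"}
    if ec2 is None:
        import boto3  # Imported lazily so the sketch can be tested without AWS

        ec2 = boto3.client("ec2")
    return resize_instance(ec2, event["instance_id"], event["instance_type"])
```

With this shape, each schedule would carry its own instance ID and target type, so the same function could serve both the DB and APP instances.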

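A tip for both functions: increase the timeout to 10 minutes. The default Lambda timeout of 3 seconds is far too short, because the function waits for the instance to stop before resizing it.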
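To keep the waiter from outlasting the Lambda timeout, the polling settings can be derived from the timeout itself. boto3 waiters accept a `WaiterConfig` argument with `Delay` (seconds between polls) and `MaxAttempts`. The helper below is a hypothetical sketch: the name `waiter_config` and the 60-second reserve for the resize and start calls are assumptions, not values from this article.

```python
def waiter_config(timeout_seconds, delay_seconds=15, reserve_seconds=60):
    """Build a WaiterConfig whose worst-case wait fits inside the Lambda timeout.

    reserve_seconds is time kept back for the calls that follow the wait
    (modify_instance_attribute and start_instances).
    """
    budget = timeout_seconds - reserve_seconds
    if budget < delay_seconds:
        raise ValueError("Timeout too short for even one polling attempt")
    return {"Delay": delay_seconds, "MaxAttempts": budget // delay_seconds}


# Usage inside the Lambda function, with a 10-minute (600 s) timeout:
# waiter.wait(InstanceIds=[instance_id], WaiterConfig=waiter_config(600))
```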

Image description

Custom Policies

  • ec2:StopInstances and ec2:StartInstances: Allow Lambda functions to stop and start instances.
  • ec2:ModifyInstanceAttribute: Allows the type of instance to be modified for resizing.
  • ec2:DescribeInstances and ec2:DescribeInstanceStatus: Needed to check the current status of instances before and after modification.
  • logs:* actions: Allow Lambda functions to create and write logs to Amazon CloudWatch, important for monitoring and debugging.
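As a rough illustration, the permissions listed above could be expressed as the following policy document, built here as a Python dictionary and printed as JSON. Several details are assumptions: the account ID `123456789012` is a placeholder, the `logs:*` actions are narrowed to the three that Lambda logging typically needs, and the stop/start/modify actions are scoped to a single instance ARN. The Describe actions are left on `"*"` because they do not support resource-level restrictions.

```python
import json


def build_policy(region, account_id, instance_id):
    """Return an IAM policy document (as a dict) for one resize function."""
    instance_arn = f"arn:aws:ec2:{region}:{account_id}:instance/{instance_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StopResizeStart",
                "Effect": "Allow",
                "Action": [
                    "ec2:StopInstances",
                    "ec2:StartInstances",
                    "ec2:ModifyInstanceAttribute",
                ],
                "Resource": instance_arn,
            },
            {
                "Sid": "DescribeForWaiter",
                "Effect": "Allow",
                "Action": ["ec2:DescribeInstances", "ec2:DescribeInstanceStatus"],
                "Resource": "*",
            },
            {
                "Sid": "WriteLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and instance ID; replace with your own values.
    print(json.dumps(build_policy("sa-east-1", "123456789012", "i-instanceidnumber"), indent=2))
```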

One role for DB
Image description

One role for APP
Image description

In the Lambda configuration, each role must be associated with its function, as in the example below.

Image description

Everything described in this document was created following AWS Well-Architected best practices, and every step taken is recorded here for reference.

EventBridge (Scheduler)

When you create the schedules in EventBridge Scheduler, set the cron expression to the agreed times. Each schedule also needs a payload containing the action key that the Python code reads to decide which branch to run, as in the example below.

Image description

For APP EC2
Image description

The action value in the payload determines which branch of the Lambda function runs. The screenshots below show all the settings used in this work.

For APP EC2
Image description

For DB EC2
Image description

For DB EC2
Image description

Once configured, each schedule triggers the whole sequence at its cron time: the instance is shut down, its type is changed, and it is started again.
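For readers who prefer code to the console, the schedules could be described roughly as follows. The helper only builds the arguments and calls nothing in AWS. The resulting dictionary matches the parameters of `create_schedule` on the boto3 `scheduler` client. The schedule names, the ARNs, and the `America/Sao_Paulo` time zone are assumptions for illustration. The 18:00 and 06:00 times and the `action` payload key come from this article.

```python
import json


def schedule_kwargs(name, action, hour, lambda_arn, role_arn,
                    timezone="America/Sao_Paulo"):
    """Build keyword arguments for boto3.client('scheduler').create_schedule()."""
    return {
        "Name": name,
        # Scheduler cron fields: minutes hours day-of-month month day-of-week year
        "ScheduleExpression": f"cron(0 {hour} * * ? *)",
        "ScheduleExpressionTimezone": timezone,
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": lambda_arn,    # The Lambda function to invoke
            "RoleArn": role_arn,  # Role that Scheduler assumes to invoke it
            "Input": json.dumps({"action": action}),  # Read by event.get('action')
        },
    }


# Hypothetical ARNs; replace with your own function and role.
LAMBDA_ARN = "arn:aws:lambda:sa-east-1:123456789012:function:db-stop-resize-start"
ROLE_ARN = "arn:aws:iam::123456789012:role/scheduler-invoke-lambda"

evening = schedule_kwargs("db-downscale", "downscale", 18, LAMBDA_ARN, ROLE_ARN)
morning = schedule_kwargs("db-upscale", "upscale", 6, LAMBDA_ARN, ROLE_ARN)

# To create them: boto3.client("scheduler").create_schedule(**evening)
```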

Evidence, shown in my local time:

At night:

Image description

CloudTrail:

Image description

In the morning:

Image description

CloudTrail:

Image description

As you can see, this is a worthwhile exercise that reduces costs without impacting operations.

I hope you enjoyed this article! Keep it up.

Top comments (2)

Tom Brown

What about downtime during resizing?

Carlos Filho

Hello @vtexperts, how are you?

The downtime is the period during which the EC2 instance is stopped, resized, and started again: 10 minutes at most.

Before doing a job like this, always ask the owner about the particularities of the EC2 instance, because some instances run applications with specific requirements.

Best regards.