In today's exercise, we will build a way to reduce costs without impacting operations. The workload must be available 24 hours a day, 7 days a week, but the night shift does not need the same computing resources as the day shift. We therefore need an automation that, at a set time in the evening, stops the instances, resizes them to smaller types, and starts them again; in the morning, it stops them once more, restores the day-shift sizes, and starts them.
The architecture diagram below shows how it works.
The experiment was run in the São Paulo Region (sa-east-1).
Revised cost estimates
- DB-1-VOL-WEB (c6i.8xlarge) = $1,530.08/month
- APP-1-VOL-WEB (m5.2xlarge) = $446.76/month
Total cost without resizing
- Monthly: $1,530.08 + $446.76 = $1,976.84
- Half-yearly: $1,976.84 * 6 = $11,861.04
- Annual: $1,976.84 * 12 = $23,722.08
Scenario with automatic resizing
- During the day, the instances keep the same settings (c6i.8xlarge for DB-1-VOL-WEB and m5.2xlarge for APP-1-VOL-WEB).
- During the night and on weekends, the instances are resized to smaller configurations. In our work, we will use t3.2xlarge for DB-1-VOL-WEB and t3.large for APP-1-VOL-WEB.
Daily Calculations:
# Day Cost (12 hours/day):
- DB-1-VOL-WEB (c6i.8xlarge) = $765.04/month
Hourly cost: $2.125 * 12 hours = $25.50/day * 30 days = $765.00 (the $765.04 used in the totals is half of the $1,530.08 monthly estimate)
- APP-1-VOL-WEB (m5.2xlarge) = $223.38/month
Hourly cost: $0.612 * 12 hours = $7.34/day * 30 days = $220.32 (the $223.38 used in the totals is half of the $446.76 monthly estimate)
# Night Cost (12 hours/day):
- DB-1-VOL-WEB (t3.2xlarge) = $193.53/month
Hourly cost: $0.5376 * 12 hours = $6.45/day * 30 days = $193.53
- APP-1-VOL-WEB (t3.large) = $48.38/month
Hourly cost: $0.1344 * 12 hours = $1.61/day * 30 days = $48.38
Total Cost with resizing:
- Monthly: $765.04 + $223.38 + $193.53 + $48.38 = $1,230.33
- Half-yearly: $1,230.33 * 6 = $7,381.98
- Annual: $1,230.33 * 12 = $14,763.96
Revised Potential Savings:
- Monthly savings: $1,976.84 - $1,230.33 = $746.51
- Half-yearly savings: $11,861.04 - $7,381.98 = $4,479.06
- Annual savings: $23,722.08 - $14,763.96 = $8,958.12
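The arithmetic above can be double-checked with a few lines of Python. This is a minimal sketch that only re-adds the monthly figures quoted in this article (it does not query AWS pricing); the dictionary and function names are illustrative.

```python
from decimal import Decimal

# Monthly figures quoted in this article (USD, sa-east-1).
BASELINE = {
    "DB-1-VOL-WEB": Decimal("1530.08"),   # c6i.8xlarge, 24 hours/day
    "APP-1-VOL-WEB": Decimal("446.76"),   # m5.2xlarge, 24 hours/day
}
RESIZED = {
    # Day share on the large type + night share on the small type.
    "DB-1-VOL-WEB": Decimal("765.04") + Decimal("193.53"),
    "APP-1-VOL-WEB": Decimal("223.38") + Decimal("48.38"),
}


def savings(months):
    """Return the savings over the given number of months."""
    monthly = sum(BASELINE.values()) - sum(RESIZED.values())
    return monthly * months


for label, months in (("Monthly", 1), ("Half-yearly", 6), ("Annual", 12)):
    print(f"{label}: ${savings(months):,.2f}")
```

With these inputs, the script reports savings of $746.51 per month, $4,479.06 per half-year, and $8,958.12 per year.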
Details of Lambda, EventBridge, and Custom Policies
Lambda
Two Lambda functions were created to automate the Stop/Resize/Start process for the DB-1-VOL-WEB and APP-1-VOL-WEB instances. Both use the Python 3.12 runtime and the boto3 library to interact with AWS.
- Function: db-stop-resize-start
-- Goal: To downscale the DB-1-VOL-WEB instance at night and upscale it in the morning.
-- Steps:
--- Stop: The function stops the instance at 06:00 p.m.
--- Resize: Once stopped, the instance is resized to a smaller type (t3.2xlarge).
--- Start: The instance is started again on the smaller type. At 06:00 a.m., the same stop, resize, and start sequence returns it to the larger type (c6i.8xlarge).
Code to insert in the Lambda function:
import boto3

ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    instance_id = 'i-instanceidnumber'  # Replace with your instance ID
    action = event.get('action')
    print(event)

    if action == 'downscale':  # Must match the "action" value in the EventBridge payload
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])
        # Change the instance type to the smaller one
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 't3.2xlarge'})  # Replace with the smaller instance type
        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])
    elif action == 'upscale':  # Must match the "action" value in the EventBridge payload
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])
        # Change the instance type to the larger one
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 'c6i.8xlarge'})  # Replace with the larger instance type
        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])
In the upscale branch, the function also stops the instance first, because the instance type can only be changed while the instance is stopped; it then applies the larger type and starts the instance as scheduled.
- Function: app-stop-resize-start
-- Goal: To downscale the APP-1-VOL-WEB instance at night and upscale it in the morning.
-- Steps:
--- Stop: The function stops the instance at 06:00 p.m.
--- Resize: Once stopped, the instance is resized to a smaller type (t3.large).
--- Start: The instance is started again on the smaller type. At 06:00 a.m., the same stop, resize, and start sequence returns it to the larger type (m5.2xlarge).
Code to insert in the Lambda function:
import boto3

ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    instance_id = 'i-instanceidnumber'  # Replace with your instance ID
    action = event.get('action')
    print(event)

    if action == 'downscale':  # Must match the "action" value in the EventBridge payload
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])
        # Change the instance type to the smaller one
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 't3.large'})  # Replace with the smaller instance type
        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])
    elif action == 'upscale':  # Must match the "action" value in the EventBridge payload
        # Stop the instance and wait until it is fully stopped
        ec2.stop_instances(InstanceIds=[instance_id])
        waiter = ec2.get_waiter('instance_stopped')
        waiter.wait(InstanceIds=[instance_id])
        # Change the instance type to the larger one
        ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={'Value': 'm5.2xlarge'})  # Replace with the larger instance type
        # Start the instance
        ec2.start_instances(InstanceIds=[instance_id])
As in the first function, the upscale branch stops the instance before applying the larger type and starting it again as scheduled.
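The two functions differ only in the instance ID and the instance types, so the shared stop, resize, and start sequence could be factored out. The following is a minimal sketch, assuming the EC2 client is passed in as a parameter; `resize_instance`, `handle`, and `TARGET_TYPES` are hypothetical names that are not part of the original functions. Unlike the original code, it raises an error for an unknown action instead of silently doing nothing.

```python
# Hypothetical mapping from payload action to instance type
# (values shown are the ones used for APP-1-VOL-WEB in this article).
TARGET_TYPES = {"downscale": "t3.large", "upscale": "m5.2xlarge"}


def resize_instance(ec2, instance_id, target_type):
    """Stop the instance, change its type, and start it again."""
    ec2.stop_instances(InstanceIds=[instance_id])
    # The type can only be changed once the instance is fully stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id, InstanceType={"Value": target_type}
    )
    ec2.start_instances(InstanceIds=[instance_id])


def handle(ec2, instance_id, event):
    """Resize the instance according to the 'action' key of the event."""
    action = event.get("action")
    if action not in TARGET_TYPES:
        raise ValueError(f"Unsupported action: {action!r}")
    resize_instance(ec2, instance_id, TARGET_TYPES[action])
    return {"instance_id": instance_id, "instance_type": TARGET_TYPES[action]}
```

Inside a Lambda handler this would be called as `handle(boto3.client('ec2'), 'i-instanceidnumber', event)`. Passing the client in also makes the sequence easy to exercise with a stub, without touching a real instance.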
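A tip for both functions: raise the Lambda timeout to 10 minutes. The default timeout is 3 seconds, which is far too short for the function to wait until the instance has stopped.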
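The timeout can be changed in the Lambda console or through the API. As a sketch, assuming credentials that are allowed to call `lambda:UpdateFunctionConfiguration` (a permission this article does not otherwise cover), the same setting could be applied with boto3's `update_function_configuration`; the two helper names below are hypothetical.

```python
MAX_LAMBDA_TIMEOUT_SECONDS = 900  # Lambda's upper limit (15 minutes)


def timeout_update_args(function_name, minutes=10):
    """Build the arguments for update_function_configuration."""
    seconds = minutes * 60
    if not 1 <= seconds <= MAX_LAMBDA_TIMEOUT_SECONDS:
        raise ValueError("Lambda timeout must be between 1 and 900 seconds")
    return {"FunctionName": function_name, "Timeout": seconds}


def apply_timeout(function_name, minutes=10):
    """Apply the timeout to a deployed function (needs AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("lambda")
    return client.update_function_configuration(
        **timeout_update_args(function_name, minutes)
    )
```

For example, `apply_timeout('db-stop-resize-start')` and `apply_timeout('app-stop-resize-start')` would set both functions from this article to 600 seconds.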
Custom Policies
- ec2:StopInstances and ec2:StartInstances: Allow Lambda functions to stop and start instances.
- ec2:ModifyInstanceAttribute: Allows the type of instance to be modified for resizing.
- ec2:DescribeInstances and ec2:DescribeInstanceStatus: Needed to check the current status of instances before and after modification.
- logs:* actions: Allow Lambda functions to create and write logs to Amazon CloudWatch, important for monitoring and debugging.
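The permissions listed above could be expressed as a single IAM policy document. The sketch below builds one as a Python dictionary and prints it as JSON; the statement IDs and the `"Resource": "*"` scope are assumptions made for illustration, and the original policies may be organized differently or scoped more narrowly.

```python
import json


def build_policy():
    """Return an IAM policy document covering the actions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StopResizeStart",
                "Effect": "Allow",
                "Action": [
                    "ec2:StopInstances",
                    "ec2:StartInstances",
                    "ec2:ModifyInstanceAttribute",
                    "ec2:DescribeInstances",
                    "ec2:DescribeInstanceStatus",
                ],
                "Resource": "*",  # assumption: not restricted to specific instances
            },
            {
                "Sid": "WriteLogs",
                "Effect": "Allow",
                "Action": "logs:*",
                "Resource": "*",  # assumption: not restricted to specific log groups
            },
        ],
    }


print(json.dumps(build_policy(), indent=2))
```

The printed JSON can be pasted into the policy editor when creating the custom policy for the Lambda role.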
In the Lambda configuration, the role carrying these policies must be associated with the function, as in the example below.
All the items mentioned in this document were created following AWS Well-Architected best practices, and every step taken is recorded here for your reference.
EventBridge (Scheduler)
When you create EventBridge Scheduler schedules, set the cron expression to the agreed times. You also need to define the payload: its "action" value must match one of the actions the Python code checks for, as in the example below.
This action determines which branch of the function runs when EventBridge invokes it. I am including all the settings used in this work.
Once configured, when the scheduled cron time arrives, the whole sequence runs: the instance is shut down, its type is changed, and it is started again.
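As a sketch of what such a schedule looks like through the API, the helper below builds the arguments for EventBridge Scheduler's `create_schedule` call in boto3. The 06:00 p.m. and 06:00 a.m. times and the `action` payload come from this article; the schedule names, the `America/Sao_Paulo` time zone, and the ARNs are placeholder assumptions to replace with your own values.

```python
import json


def schedule_args(name, hour, action, function_arn, role_arn,
                  timezone="America/Sao_Paulo"):
    """Build create_schedule arguments for one daily resize schedule."""
    return {
        "Name": name,
        # Fields: minutes hours day-of-month month day-of-week year
        "ScheduleExpression": f"cron(0 {hour} * * ? *)",
        "ScheduleExpressionTimezone": timezone,
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": function_arn,   # Lambda function to invoke
            "RoleArn": role_arn,   # role that Scheduler assumes to invoke it
            # Payload read by the handler through event.get('action')
            "Input": json.dumps({"action": action}),
        },
    }


def create_schedules(function_arn, role_arn):
    """Create the evening downscale and morning upscale schedules."""
    import boto3  # imported here so the helper above works without boto3

    scheduler = boto3.client("scheduler")
    for name, hour, action in (("db-downscale", 18, "downscale"),
                               ("db-upscale", 6, "upscale")):
        scheduler.create_schedule(
            **schedule_args(name, hour, action, function_arn, role_arn))
```

The `Input` string is what arrives as `event` in the Lambda handler, so `{"action": "downscale"}` triggers the downscale branch and `{"action": "upscale"}` the upscale branch.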
Evidence, shown in my local time:
At night:
CloudTrail:
In the morning:
CloudTrail:
As you can see, this is a worthwhile way to reduce costs without impacting operations.
I hope you enjoyed this article! Keep it up.
Top comments (2)
What about downtime during resizing?
Hello @vtexperts , how are you?
The downtime is the time the EC2 instance takes to switch off and on again: 10 minutes at most.
Before doing a job like this, it is always recommended to ask the owner about the particularities of the EC2 instance, since some instances run specific applications. If you get a job like this, always ask the owner before running it.
Best regards.