AWS CloudWatch: The Gatekeeper for Your AWS Environment

#aws #cloudwatch #cloudcomputing #devops

Introduction

Maintaining your infrastructure’s operational health and peak performance is crucial in the world of cloud computing. AWS CloudWatch provides robust monitoring, alerting, reporting, and logging features, acting as a gatekeeper for your AWS environment. This article explores AWS CloudWatch, its benefits, and a real-world example of setting up an alarm to track CPU usage.

What is AWS CloudWatch?

Amazon Web Services (AWS) offers a flexible monitoring and management solution called AWS CloudWatch. You may use it to trigger alarms, monitor and evaluate metrics, and get real-time insights about your AWS apps and resources. By serving as a central repository for all monitoring data, CloudWatch enables you to keep your infrastructure operating efficiently and in good condition.

Advantages of AWS CloudWatch

AWS CloudWatch is a vital tool for managing AWS environments because of its many important benefits, which include:

Monitoring: Continuously observes your AWS resources and applications to ensure they are functioning correctly.
Real-Time Metrics: Provides up-to-date data on resource utilization, enabling informed decision-making.
Alarms: Automatically notifies you when specific metrics exceed predefined thresholds, allowing for timely intervention.
Logs and Logs Insights: Centralizes your logs and lets you query them, simplifying troubleshooting and application behavior monitoring.
Custom Metrics: Tracks specific metrics relevant to your application or business needs.
Cost Optimization: Monitors resource usage and lets you set up billing alarms to help manage and optimize AWS costs.
Scaling: Scales with your AWS environment, handling millions of events per minute.

Creating an Alarm in CloudWatch

Let’s walk through a practical use case: creating a CloudWatch alarm that notifies us by email when the average CPU utilization of an EC2 instance rises above 50%.

Log in to the AWS Management Console and navigate to the CloudWatch service.
Click on “Alarms” in the left-hand menu and select “Create alarm.”
Choose “Select metric” and pick “EC2” from the metric source.
Under “Namespace,” select “AWS/EC2.”
For “Metric Name,” choose “CPUUtilization.”
Select the specific EC2 instance you want to monitor from the “Instance ID” dropdown. For this walkthrough, I created a dedicated EC2 instance called “cloud-watch-demo”.

To see graphs of your instance’s metrics, such as CPU utilization, select the instance in the EC2 console and click the “Monitoring” tab.

Under “Statistic,” select “Average” to monitor the average CPU utilization over a period.
In the “Period” field, enter the desired time window for averaging CPU usage (e.g., 5 minutes).
For “Comparison operator,” choose “Greater than (>).”
In the “Threshold” field, enter “50” to trigger the alarm when CPU utilization exceeds 50%.
Leave “Evaluation periods” set to “1” so that the alarm triggers as soon as the average CPU utilization is above 50% for a single period.
Under “Alarm name,” enter a descriptive name for your alarm (e.g., “High CPU Utilization on [Instance ID]”).
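Next, configure the notification for the alarm. Click on “Add action” and choose “SNS topic.”
Select an existing SNS topic or create a new one, and enter the email address where you want to receive notifications. If the email subscription is new, confirm it using the link in the confirmation email SNS sends; notifications are not delivered until the subscription is confirmed.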
Click “Next” and review the alarm configuration.
Finally, click “Create alarm” to set up your CloudWatch alarm.
Once the alarm is created, you can click on it to see a detailed view.
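
The console steps above map fairly directly onto the CloudWatch API. As a rough sketch, assuming the boto3 SDK is installed and AWS credentials are configured, the same alarm could be expressed as arguments to `put_metric_alarm`. The instance ID and SNS topic ARN are placeholders, and `build_cpu_alarm_params` and `create_cpu_alarm` are hypothetical helper names rather than part of boto3.

```python
def build_cpu_alarm_params(instance_id, topic_arn, threshold=50.0, period_seconds=300):
    """Return keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"High CPU Utilization on {instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period_seconds,          # 5-minute averaging window
        "EvaluationPeriods": 1,            # one period above the threshold is enough
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],       # SNS topic that sends the email
    }


def create_cpu_alarm(instance_id, topic_arn):
    """Create the alarm; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_cpu_alarm_params(instance_id, topic_arn))


if __name__ == "__main__":
    # Placeholder values: substitute your own instance ID and topic ARN
    params = build_cpu_alarm_params(
        "i-0123456789abcdef0",
        "arn:aws:sns:us-east-1:123456789012:cpu-alerts",
    )
    for key, value in params.items():
        print(f"{key}: {value}")
```

Running the file as-is only prints the parameters; calling `create_cpu_alarm` with real values is what would actually create the alarm.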

Click on the “Metrics” tab in CloudWatch and search for the metric name “CPUUtilization”. Then select your EC2 instance to see the graph (in my case, “cloud-watch-demo”).
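
The same datapoints that back the graph can also be read programmatically. The sketch below assumes a boto3 CloudWatch client is passed in (for example `boto3.client("cloudwatch")`) and uses its `get_metric_statistics` call; `recent_cpu_averages` is a hypothetical helper name, and the 30-minute lookback is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone


def recent_cpu_averages(cloudwatch, instance_id, minutes=30, period_seconds=300):
    """Return (timestamp, average) pairs for CPUUtilization, oldest first."""
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=period_seconds,
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to arrive in time order, so sort them
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Average"]) for p in points]
```

Each returned pair corresponds to one point on the console graph for the chosen period.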

To check that the alarm works, I used a Python program that generates CPU spikes, which in turn raise the CPU utilization of the instance. Credit for the Python script: Abhishek Veeramalla.

import time

def simulate_cpu_spike(duration=30, cpu_percent=80):
    # Note: cpu_percent only scales how much work is done below. The loop
    # itself runs flat out on one core, so utilization is not capped at this value.
    print(f"Simulating CPU spike at {cpu_percent}%...")
    start_time = time.time()

    # Scale the amount of busy work by the requested percentage
    target_percent = cpu_percent / 100
    total_iterations = int(target_percent * 5_000_000)  # Adjust the number as needed

    # Perform simple arithmetic operations to keep the CPU busy
    for _ in range(total_iterations):
        result = 0
        for i in range(1, 1001):
            result += i

    # If the work finished before `duration` seconds, sleep for the remainder;
    # if it took longer, remaining_time is 0 and there is no extra wait
    elapsed_time = time.time() - start_time
    remaining_time = max(0, duration - elapsed_time)
    time.sleep(remaining_time)

    print("CPU spike simulation completed.")

if __name__ == '__main__':
    # Run the busy loop (scaled by cpu_percent=80) and make sure the
    # whole run lasts at least 30 seconds
    simulate_cpu_spike(duration=30, cpu_percent=80)
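
It can help to estimate whether a burst of load is long enough to push the period average past the threshold. The sketch below is a simplification: it assumes the burst falls entirely within one alarm period and that the reported value is the time-weighted average of busy and idle utilization. `period_average` and `would_trigger` are hypothetical helpers for illustration only.

```python
def period_average(busy_seconds, busy_percent, period_seconds=300, idle_percent=0.0):
    """Time-weighted average CPU utilization over one alarm period."""
    busy_seconds = min(busy_seconds, period_seconds)
    idle_seconds = period_seconds - busy_seconds
    return (busy_seconds * busy_percent + idle_seconds * idle_percent) / period_seconds


def would_trigger(average, threshold=50.0):
    """The alarm in this article uses a strict 'greater than' comparison."""
    return average > threshold


if __name__ == "__main__":
    short_burst = period_average(busy_seconds=30, busy_percent=80)
    long_burst = period_average(busy_seconds=240, busy_percent=100)
    print(short_burst, would_trigger(short_burst))  # 8.0 False
    print(long_burst, would_trigger(long_burst))    # 80.0 True
```

Under these assumptions, a load that lasts only 30 seconds barely moves a 5-minute average, so the CPU needs to stay busy for most of a period before the alarm fires.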

Run this Python script on your EC2 instance. Then wait 2–5 minutes and check the email address you subscribed to the SNS topic. You will also see the alarm change state on the CloudWatch alarms dashboard.
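
Instead of refreshing the dashboard, the alarm state can be polled from code. This is a minimal sketch that assumes a boto3 CloudWatch client is passed in and relies on its `describe_alarms` call; `alarm_state` and `wait_for_alarm` are hypothetical helper names, and the timeout and polling interval are arbitrary defaults.

```python
import time


def alarm_state(cloudwatch, alarm_name):
    """Return the alarm's StateValue (e.g. 'OK' or 'ALARM'), or None if not found."""
    response = cloudwatch.describe_alarms(AlarmNames=[alarm_name])
    alarms = response.get("MetricAlarms", [])
    return alarms[0]["StateValue"] if alarms else None


def wait_for_alarm(cloudwatch, alarm_name, timeout_seconds=600,
                   poll_seconds=30, sleep=time.sleep):
    """Poll until the alarm reports ALARM or the timeout elapses; return the last state seen."""
    waited = 0
    state = alarm_state(cloudwatch, alarm_name)
    while state != "ALARM" and waited < timeout_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
        state = alarm_state(cloudwatch, alarm_name)
    return state
```

With real credentials, this could be called as `wait_for_alarm(boto3.client("cloudwatch"), "<your alarm name>")` while the script is running on the instance.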

Conclusion

AWS CloudWatch is a crucial tool for monitoring and managing your AWS resources and applications. It offers real-time insights, automated alarms, centralized logging, and cost management features, enabling you to maintain the health and performance of your infrastructure. By setting up alarms, such as the one for monitoring CPU utilization, you can proactively address issues and ensure your applications run smoothly.

Thank you for reading, and I hope you found this blog post helpful in your AWS journey!