Ebenezer Lamptey

Posted on Dec 4, 2024

Comprehensive Bash System Monitoring Script: Technical Breakdown

#linux #bash #ubuntu

In today's fast-paced IT environments, keeping track of system performance is crucial. This bash script provides an elegant, automated solution for monitoring critical system metrics and sending timely alerts when resources are under strain.

Understanding the Script's Architecture

The script is designed to monitor three primary system health indicators:

CPU Usage
Memory Consumption
Disk Space Utilization

Key Functionalities

Logging Mechanism

The script includes a robust logging function that timestamps every event. All system metrics and alerts are recorded in /var/log/sysmonitor.log, providing a comprehensive audit trail for system administrators.

logfile=/var/log/sysmonitor.log

log(){     
    local message=$1     
    echo "$message - $(date +%T)" >> $logfile 
}
# Logging function with timestamp
# Uses local variable for message
# Appends to log file with current time

Metric Collection

Each metric collection function leverages standard Linux command-line tools:

top for CPU usage
free **for memory consumption
**df for disk space utilization

CPU Usage Metrics

CPU_metrics(){    
    local cpu_usage=$(top -bn1 | grep "CPU(s)" | awk '{print $1 + $2 + $3}')
    echo "$cpu_usage"
    log "Cpu usage is $cpu_usage" 
}

# Uses `top` command to retrieve CPU usage
# Adds first three CPU percentage columns
# Logs the result

Memory Usage Metrics

Memory_metrics(){     
    local Mem_usage=$(free -m | awk '/Mem:/ {print ($3 / $2) * 100.00}')
    echo "$Mem_usage"
    log "Cpu usage is $Mem_usage" 
}
# Uses `free` command to calculate memory usage
# Calculates percentage of used memory
# Converts to percentage with decimal precision

Disk Usage Metrics

disk_usage_metrics(){     
    local disk_usage=$(df -h --total | grep ^total | awk '{print $5}' | sed 's/%//')
    echo "$disk_usage"
    log "Disk Usage is $disk_usage" 
}
# Uses `df` command to get total disk usage
# Removes percentage symbol for calculation

Metric Processing

dskusage=$(printf "%.0f" "$(disk_usage_metrics)")
memusage=$(printf "%.0f" "$(Memory_metrics)")
cpuUsage=$(printf "%.0f" "$(CPU_metrics)")
# Converts floating-point metrics to integers
# Uses printf to round to nearest whole number

Threshold-Based Alerting

The script defines critical thresholds for each metric:

Disk Usage: 75%
Memory Usage: 85%
CPU Usage: 90%

When any metric exceeds its predefined threshold, the script:

Logs the event in the system log
Sends an email alert to the specified address

CPU_threshold=90
Mem_threshold=85
Disk_threshold=75

if [ "$dskusage" -gt "$Disk_threshold" ] then
    echo "Disk limt exceeded!!!" 1>>$logfile
    mail -s "Disk Limit Alert" email@address.com <<< "Disk usage has exceeded Limit. Kindly review"
fi
# Checks if disk usage exceeds 75%
# Logs the event
# Sends email alert
# Similar logic for memory and CPU thresholds

Performance Automation

The script is scheduled using a crontab entry to run every 4 hours, ensuring continuous, hands-off system monitoring.

cron_job="0 */4 * * */workspace/Automated-System-Monitoring-/Monitor.sh 1>>$logfile"
(crontab -l | grep -v -F "$cron_job"; echo "$cron_job") | crontab -
# Schedules script to run every 4 hours
# Adds to crontab without duplicating existing entries

Implementation Tips

To use this script:

Replace email@address.com with your actual email
Ensure necessary permissions are set (chmod +x Monitor.sh)
Verify mail utility is configured on your system

Conclusion

Automated system monitoring is essential for maintaining infrastructure health. This bash script provides a lightweight, effective solution for tracking and alerting on critical system resources.

Sample Use Cases

DevOps teams monitoring server performance
Small to medium enterprise infrastructure management
Personal server and workstation health tracking

By proactively monitoring system metrics, you can prevent performance bottlenecks and potential system failures before they impact your operations. You can share ideas on improving this code in the comment section. Feel free to have discussions. you can accesss this code in my github : Automated Monitoring Script

DEV Community

Comprehensive Bash System Monitoring Script: Technical Breakdown

Understanding the Script's Architecture

Key Functionalities

Logging Mechanism

Metric Collection

Threshold-Based Alerting

Performance Automation

Implementation Tips

Conclusion

Sample Use Cases

Top comments (0)

Read next

Testing post

How to Find All URLs on a Domain

🚀 Database Optimization: Essential Tips and Tricks for Developers

Navigating the Future: 3 Essential AI Security Practices for Cyber Defense