
Priyank Sevak


DevOps: Linux Performance Monitoring

In my previous post, DevOps: Understanding Process Monitoring on Linux, I discussed how a Linux process works and why it's important to monitor processes.

Let's dig deeper into how we can keep an eye on how a Linux server is "performing".

TL;DR:

This article dives into Linux process monitoring tools and techniques, helping you keep an eye on your server's performance. It covers command-line tools like top, htop, vmstat, and sar for in-depth monitoring, along with system utilities like System Monitor for a graphical overview. The article also demonstrates a sample script using top and uptime to monitor CPU, memory, and system uptime, laying the groundwork for integrating push notifications.

Performance Monitoring

1./proc

In my previous post, I explained that "everything is a file" on a Linux system. So where are these process files stored?

Open a terminal and type ls /proc; you will see one numbered directory per running PID. /proc is a virtual filesystem that the kernel generates on the fly, and each numbered directory holds the kernel's view of one process. The files under /proc contain (including but not limited to):

  • Current state of the Linux kernel.
  • Information about system hardware.
  • Currently running processes.

Try running the commands below to find out more about what /proc contains:

cat /proc/cpuinfo

cat /proc/devices # lists character and block device drivers with their major numbers

cat /proc/cmdline # kernel boot parameters; useful when debugging boot failures

Some files under /proc are writable, notably those in /proc/sys, and can be used to communicate configuration changes directly to the running kernel.
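To make this concrete, here is a minimal sketch that reads process names and a kernel tunable straight from /proc. The function names and the optional proc_root argument are my own additions for illustration (the argument only exists so the functions can be pointed at a copy of /proc); the paths /proc/[PID]/comm and /proc/sys/vm/swappiness are standard on Linux.

```shell
# Print "PID NAME" for every process directory found under proc_root.
list_processes() {
  local proc_root="${1:-/proc}" dir pid
  for dir in "$proc_root"/[0-9]*; do
    [ -r "$dir/comm" ] || continue          # skip entries that vanished or are unreadable
    pid="${dir##*/}"                        # directory name is the PID
    printf '%s %s\n' "$pid" "$(cat "$dir/comm" 2>/dev/null)"
  done
}

# Print the current value of a kernel tunable, e.g. vm/swappiness.
# Writing works the same way in reverse (as root), but is left out here.
read_tunable() {
  local name="$1" proc_root="${2:-/proc}"
  cat "$proc_root/sys/$name"
}

if [ -d /proc/self ]; then
  list_processes | sort -n | tail -n 5      # the five highest PIDs
  if [ -r /proc/sys/vm/swappiness ]; then
    read_tunable vm/swappiness
  fi
fi
```

This only reads values. Changing a tunable requires root and takes effect immediately, so try it on a test machine first.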

Most Linux distributions ship the procps package, which contains useful tools such as ps, top, vmstat, free, and uptime to help with performance and process monitoring. iostat and sar come from the separate sysstat package.

Besides the top and ps commands discussed previously, there are alternatives to top that offer extra features or a more graphical display.

htop


htop offers a more visually appealing interface with color-coded bars for CPU and memory utilization.

It can also display processes in a tree view (press F5), making it easier to understand the relationships between processes.

atop


atop can be configured to run as a service on remote systems, making it suitable for large-scale monitoring environments.

atop also supports long-term monitoring and analysis. It logs system data to a file, which lets you review historical trends and identify performance issues over time.

2.Where's my task manager?

Isn't it easy to find out what's going on in my system and process on Windows by just hitting "Ctrl+Alt+Del" and going to "Task Manager"? Why doesn't Linux provide something like that?

If you are in a GNOME environment, you can find a similar tool by searching your applications for "System Monitor". System Monitor has 4 tabs:

  • System: Shows basic system info.
  • Process: Lists all the running processes. You can sort them and perform operations such as stopping, ending, or killing a process.
  • Resources: Lists current CPU usage, memory and swap usage, network usage, and disk usage.
  • File System: Lists all currently mounted file systems with additional info such as mount point, file system type, and space used.

3.Virtual Memory statistics: vmstat

As the name suggests, the vmstat command reports virtual memory statistics, along with information about processes, paging, block I/O, traps, disks, and CPU activity.

The first report vmstat prints shows averages since the last reboot. When you pass a delay (vmstat [delay [count]]), each subsequent report covers the sampling period of that delay.
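As a small sketch of sampling with a delay, the helper below turns vmstat output into a CPU busy percentage per report. The function name vmstat_cpu_busy is hypothetical; it looks up the "id" (idle) column by name in the header row, since I am assuming the column position may vary between vmstat versions.

```shell
# Print CPU busy % (100 - idle) for each sample line of vmstat output
# read from stdin.
vmstat_cpu_busy() {
  awk '
    # Header row: remember which column is labelled "id".
    $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
    # Data rows start with a number (the run queue length).
    col && $1 ~ /^[0-9]+$/ { print 100 - $col }
  '
}

if command -v vmstat >/dev/null 2>&1; then
  # 1-second delay, 3 reports; the first report is the since-boot average.
  vmstat 1 3 | vmstat_cpu_busy
fi
```

Because the first line is the since-boot average, the later lines are usually the ones worth alerting on.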

Some useful options with vmstat:

vmstat -s # lists event counters and memory statistics


Running vmstat -s gives you info regarding:

  • Memory: Total memory, currently used memory, active/inactive memory, free, buffer, and cache memory, plus total, used, and free swap.
  • CPU ticks: Time spent on normal and low-priority (nice) user processes, kernel (system) work, idle, I/O wait, hardware and software interrupts, and stolen time.
  • Memory paging: Total pages paged in and paged out, and total pages swapped in from and out to swap memory.
  • Event counters: Total interrupts, context switches, and forks since boot, along with the boot time.
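If you only need one of these counters in a script, a small filter is enough. This is a sketch with a hypothetical helper name, vmstat_counter; it assumes each vmstat -s line has the form "value label", with the unit (such as K) treated as part of the label.

```shell
# Print the numeric value of the vmstat -s line whose label matches exactly.
# Reads vmstat -s output from stdin; label examples: "forks", "K total memory".
vmstat_counter() {
  awk -v label="$1" '
    {
      value = $1            # first field is the number
      $1 = ""               # drop it, leaving the label
      sub(/^ +/, "")        # trim the space left behind
      if ($0 == label) print value
    }
  '
}

if command -v vmstat >/dev/null 2>&1; then
  vmstat -s | vmstat_counter "forks"
fi
```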

4.System Activity Reporter: sar

Go to your terminal and write the below command:

ls /var/log/sysstat

You will see a set of files named either saDD or saYYYYMMDD, where YYYY, MM, and DD stand for year, month, and day. These are the "Standard System Activity Daily Data Files". The path above is the default on Debian and Ubuntu; some distributions, such as RHEL, use /var/log/sa instead.

These files are written by the sysstat data collector. Run without an interval, sar reads the current day's file and reports the system activity recorded so far. sar can also sample live and save its readings to a different file, or report on a previous day:

sar -o [filename] 2 5 # sample every 2 seconds, 5 times, and save the readings to a binary file

sar -1 # shows sar output from the previous day
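Here is a hedged sketch of locating a daily data file and replaying it with sar -f. The helper name sa_file_for is hypothetical, the default directory is an assumption based on the path used above, and the date arithmetic relies on GNU date (date -d).

```shell
# Print the path of the sysstat daily data file for a given day.
# Arguments: a day understood by GNU date (default "today") and the data
# directory (default /var/log/sysstat; /var/log/sa on some distributions).
sa_file_for() {
  local when="${1:-today}" dir="${2:-/var/log/sysstat}" candidate
  # Try the saYYYYMMDD naming first, then the shorter saDD naming.
  for candidate in "$dir/sa$(date -d "$when" +%Y%m%d)" "$dir/sa$(date -d "$when" +%d)"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Replay CPU statistics (-u) from yesterday's file when sar and the file exist.
if command -v sar >/dev/null 2>&1 && file=$(sa_file_for yesterday); then
  sar -u -f "$file"
fi
```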


Real world example:

Problem statement:

You want to keep a check on the current performance of your Linux server. You want to be notified when CPU usage, memory usage, or system load goes over a certain threshold, so that you can prevent unintentional overutilization.

Assumptions & lab setup

I will be using the stress tool, which is not installed by default but is packaged by most distributions, to put the hardware under load. It can generate various types of load, including CPU, memory, I/O, and disk:

stress --cpu 8 --io 4 --vm 4 --vm-bytes 1024M --timeout 10s

I have specified:

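  • --cpu 8: 8 CPU workers, enough to fully load 8 CPU cores.
  • --io 4: 4 workers generating concurrent I/O operations.
  • --vm 4 --vm-bytes 1024M: 4 memory workers, each allocating 1024 MB.
  • --timeout 10s: stop the test after 10 seconds.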
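If your machine has fewer than 8 cores, you may want to size the CPU load to the hardware. The sketch below only builds and prints the command line; build_stress_cmd is a hypothetical helper, and it assumes nproc (from coreutils) is available to count the cores.

```shell
# Build a stress command line sized to the machine instead of hard-coding
# 8 CPU workers. Arguments: number of CPU workers (default: output of nproc)
# and a timeout (default: 10s).
build_stress_cmd() {
  local cpus="${1:-$(nproc)}" timeout="${2:-10s}"
  printf 'stress --cpu %s --io 4 --vm 4 --vm-bytes 1024M --timeout %s\n' "$cpus" "$timeout"
}

# Print the command for this machine; review it before running it.
build_stress_cmd
```

Note that the four memory workers together allocate about 4 GB, so lower --vm or --vm-bytes on machines with less RAM.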

Solution & Explanation:

#!/bin/bash

# Set the interval for monitoring (in seconds)
interval=5

while true; do
  # Get CPU usage: run top once in batch mode (-b) so it can be piped, then
  # subtract the idle percentage (the value before "id") from 100
  cpu_usage=$(LC_ALL=C top -bn1 | awk -F'[, ]+' '/Cpu\(s\)/ {for (i = 2; i <= NF; i++) if ($i == "id") print 100 - $(i-1)}')
  # Get the 1, 5 and 15 minute load averages
  avg_load=$(uptime | awk -F'load average: ' '{print $2}')

  # Get memory usage (values in MiB) from a single call to free
  read -r mem_total mem_used mem_free < <(free -m | awk '/^Mem:/ {print $2, $3, $4}')
  mem_usage=$(( mem_used * 100 / mem_total ))

  # Get network statistics
  echo "Network Packets:"
  # Iterate over each interface
  for interface in $(ifconfig | grep 'flags' | cut -d':' -f1); do
    # Get the RX packets and bytes line, trimming leading whitespace
    packets_transferred=$(ifconfig "$interface" | awk '/RX packets/ {$1 = $1; print}')

    # Print the interface name and the received data
    echo "$interface : $packets_transferred"
  done

  # Get system uptime: the text between "up" and the logged-in user count
  uptime_hours=$(uptime | awk -F'( up +|, +[0-9]+ users?,)' '{print $2}')

  echo "CPU Usage: $cpu_usage%"
  echo "Average Load: $avg_load"
  echo "Memory Usage: $mem_usage%"
  echo "System Uptime: $uptime_hours"
  echo

  sleep "$interval"
done


Code explanation:

Getting the CPU usage:

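LC_ALL=C top -bn1 | awk -F'[, ]+' '/Cpu\(s\)/ {for (i = 2; i <= NF; i++) if ($i == "id") print 100 - $(i-1)}'

  • The top command shows details about the tasks, CPU, swap, and physical memory on the system. Running it once in batch mode (-b -n 1) makes its output safe to pipe. awk then finds the idle percentage (the value just before "id") on the Cpu(s) line and subtracts it from 100. Picking the value by its label rather than by a fixed field number matters because top prints "ni,100.0 id" without a space when the system is fully idle, which shifts the field positions.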
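An alternative that avoids parsing top at all is to read the CPU counters from /proc/stat twice and compare them. This is a sketch rather than what the script above does; cpu_busy_between is a hypothetical helper, and it treats idle plus iowait as idle time.

```shell
# Compute CPU busy % between two "cpu ..." lines taken from /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal ...
cpu_busy_between() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      idle = $5 + $6                       # idle + iowait
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i
      if (NR == 1) { idle0 = idle; total0 = total }
      else if (total > total0)
        printf "%.1f\n", 100 * (1 - (idle - idle0) / (total - total0))
    }
  '
}

if [ -r /proc/stat ]; then
  first=$(head -n 1 /proc/stat)            # aggregate line for all CPUs
  sleep 1
  second=$(head -n 1 /proc/stat)
  cpu_busy_between "$first" "$second"
fi
```

The result covers exactly the interval between the two reads, so you control the sampling window with the sleep.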

Getting the Average system load:

In addition to vmstat and sar commands, we can use uptime command to get concise details about the system. uptime command will output the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5 and 15 minutes.


avg_load=$(uptime | awk -F'load average: ' '{print $2}')

I am simply manipulating the output to only fetch the required average system load from uptime.
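A load average is easier to judge relative to the number of CPU cores. As a sketch, the hypothetical helper below divides the 1-minute load by the core count; it assumes /proc/loadavg and nproc are available, and the optional arguments exist mainly so it can be tried with fixed numbers.

```shell
# Print the 1-minute load average divided by the number of CPU cores.
# A value above 1.00 suggests more runnable work than cores.
load_per_core() {
  local load="${1:-$(cut -d' ' -f1 /proc/loadavg)}" cores="${2:-$(nproc)}"
  awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.2f\n", l / c }'
}

if [ -r /proc/loadavg ]; then
  load_per_core
fi
```

For example, load_per_core 3.00 4 prints 0.75: a load of 3 on a 4-core machine still leaves headroom.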

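Later in the script, I manipulate the same uptime output to get the current uptime. I am using a regular expression as the awk field separator to accommodate different uptime formats, e.g. "15 days, 12:02" or "5 min". Note that the command substitution $( ) is required; without it the variable is never assigned:

uptime_hours=$(uptime | awk -F'( up +|, +[0-9]+ users?,)' '{print $2}')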
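If you would rather not parse the uptime text, the kernel also exposes the uptime in seconds as the first value in /proc/uptime. The sketch below formats that number; format_uptime is a hypothetical helper name.

```shell
# Convert a number of seconds into "D days, H hours, M minutes".
format_uptime() {
  local total="${1%%.*}"     # drop the fractional part found in /proc/uptime
  printf '%d days, %d hours, %d minutes\n' \
    $(( total / 86400 )) $(( total % 86400 / 3600 )) $(( total % 3600 / 60 ))
}

if [ -r /proc/uptime ]; then
  # First value: seconds since boot; second value (ignored): idle seconds.
  read -r seconds _ < /proc/uptime
  format_uptime "$seconds"
fi
```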

Getting the Memory usage:

free is another useful command to get detailed output regarding the memory available, used, and free on the system. You can think of it as a more concise version of vmstat -s.


I am manipulating the string returned by free to get the precise memory currently being used.
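One caveat: the "free" column can look alarmingly low because Linux uses spare memory for caches. The "available" column is the kernel's estimate of memory that new applications could use. As a sketch (mem_used_percent is a hypothetical helper, and it assumes the procps-ng layout where "available" is the seventh field of the Mem: row), you could base the percentage on it instead:

```shell
# Print memory in use as a whole-number percentage, based on the
# "available" column of free. Reads free output from stdin.
mem_used_percent() {
  awk '/^Mem:/ { printf "%d\n", ($2 - $7) * 100 / $2 }'
}

if command -v free >/dev/null 2>&1; then
  free -m | mem_used_percent
fi
```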

The output

Here's the output. For now, I am only printing the CPU usage, memory usage, system load, and uptime. We can extend the bash code to send alerts through a notification service such as Pushover, or by email with sendmail.
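As a starting point for that extension, here is a minimal sketch of a threshold check. All three function names are hypothetical, and notify only prints to stderr; you would replace its body with a call to whichever notification service you use.

```shell
# Return success (0) when value exceeds limit; awk handles decimal values.
exceeds() {
  awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v > limit) }'
}

# Placeholder notifier: prints to stderr. Swap the body for a real
# notification call (push service, mail, chat webhook, ...).
notify() {
  printf 'ALERT: %s\n' "$1" >&2
}

# Compare one metric against its threshold and notify when it is exceeded.
# Returns 1 when an alert was raised, 0 otherwise.
check_metric() {
  local name="$1" value="$2" limit="$3"
  if exceeds "$value" "$limit"; then
    notify "$name is at $value (threshold $limit)"
    return 1
  fi
}

# Example with fixed numbers; in the monitoring loop you would pass
# $cpu_usage and $mem_usage instead.
check_metric "CPU usage" 42.5 80
```

Inside the loop, calls such as check_metric "Memory usage" "$mem_usage" 90 would raise an alert on each iteration where the threshold is crossed, so you may want to add rate limiting before wiring this to a real service.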


Additional Performance Monitoring Tools:

I would also like to list some additional tools that can help you monitor the performance of your Linux server:

1.stacer:
A GUI for monitoring CPU, memory, and other system resources.


2.saidar:
A terminal-based tool similar to atop or htop.


3.cpu-x:

My personal favorite as it gives very precise details on the CPU, memory and Disk usage and feels familiar to use.


You can also benchmark and stress test the CPU directly inside cpu-x, similar to the stress command I ran at the beginning.


Conclusion:

This article explored various Linux performance monitoring tools. From command-line utilities offering detailed insights to GUI tools providing a visual representation, you're equipped to choose the tools that best suit your needs. The script example demonstrates a practical application of these tools and opens the door to further customization with push notifications.

Feel free to ask any questions or share your preferred monitoring tools and techniques! Let's keep the discussion going!

