Introduction
Modern applications generate a vast amount of data, including performance metrics like response times, error rates, CPU usage, memory consumption, and request counts. To monitor these, tools like Prometheus (for data collection) and Grafana (for visualization) work together.
But have you ever wondered:
❓ Where do these metrics come from?
❓ How do they get processed and stored?
❓ How do they finally appear on a Grafana dashboard?
In this blog, we’ll break down the entire flow of metrics from their source (applications, databases, infrastructure) to visualization in Grafana.
1️⃣ Step 1: Metrics Collection (Instrumentation Layer)
🔹 Application-Level Metrics (Developers add instrumentation)
- Applications generate performance data such as request latency, error rate, throughput, etc.
- Developers use Micrometer (Spring Boot), StatsD, OpenTelemetry, or Prometheus client libraries to expose these metrics.
- Example: Exposing request latency in Spring Boot with Micrometer:
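import io.micrometer.core.annotation.Timed;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MyController {

    // Custom timer name: Spring Boot already records "http.server.requests" for every endpoint.
    // Depending on the Spring Boot version, @Timed may also need a TimedAspect bean to take effect.
    @GetMapping("/hello")
    @Timed(value = "hello.request", description = "Response time of /hello")
    public String hello() {
        return "Hello, World!";
    }
}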
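- Once the micrometer-registry-prometheus dependency is on the classpath and the prometheus Actuator endpoint is exposed, these metrics are available via an HTTP endpoint (/actuator/prometheus).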
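To make it concrete what a timer like the one above records, here is a minimal sketch in plain Java. It is not Micrometer's implementation; the class TimerSketch and the metric name hello_request_seconds are made up for illustration. The idea is that every call adds one to a count and adds its duration to a running total, and the scrape output exposes both numbers.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for a metrics timer (not Micrometer's actual code).
class TimerSketch {
    private final String name;
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    TimerSketch(String name) {
        this.name = name;
    }

    // Run the work, then record one call and how long it took.
    <T> T record(Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            recordNanos(System.nanoTime() - start);
        }
    }

    // Record a single observation of the given duration.
    void recordNanos(long nanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(nanos);
    }

    // Render the two series a scraper would read: number of calls and total seconds.
    String scrape() {
        double seconds = totalNanos.get() / 1_000_000_000.0;
        return name + "_count " + count.get() + "\n"
             + name + "_sum " + seconds + "\n";
    }

    public static void main(String[] args) {
        TimerSketch timer = new TimerSketch("hello_request_seconds");
        String body = timer.record(() -> "Hello, World!");
        System.out.println(body);
        System.out.print(timer.scrape());
    }
}
```

Keeping a count and a sum (rather than a precomputed average) is what lets the monitoring side compute averages over any time window later.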
🔹 Middleware Metrics (Databases, Queues, etc.)
- Databases (MySQL, PostgreSQL) expose query execution time, connections, and query errors.
- Queues (Kafka, RabbitMQ, Redis) provide message throughput, queue depth, and latency.
- Tools Used: Prometheus Exporters (e.g., mysqld_exporter, kafka_exporter).
🔹 Infrastructure Metrics (Servers, Containers, Kubernetes, etc.)
- Node Exporter (for Linux systems): Collects CPU, RAM, disk I/O, network usage.
- Kubernetes Metrics Server: Tracks CPU and memory usage for Pods and nodes.
- Cloud monitoring services (AWS CloudWatch, Google Cloud Monitoring (formerly Stackdriver), Azure Monitor) collect cloud-based metrics.
2️⃣ Step 2: Metric Collection & Scraping (Prometheus Layer)
Prometheus acts as the data aggregator. It periodically scrapes (pulls) metrics from applications, databases, and infrastructure.
How Does Prometheus Collect Metrics?
- Scrapes HTTP endpoints (e.g., http://app:8080/actuator/prometheus) at a configurable interval (scrape_interval, commonly 15 to 60 seconds).
- Stores time-series metrics in its TSDB (Time-Series Database).
- Uses PromQL (Prometheus Query Language) to aggregate and filter the data.
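As a rough illustration of the scrape step, the sketch below parses the kind of plain-text body a metrics endpoint returns into series/value pairs. It is a simplification written for this post, not Prometheus's parser: the class name ScrapeParserSketch is hypothetical, and it assumes lines have no trailing timestamp and that values are plain numbers (special values such as +Inf are not handled).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns a scrape body into "series -> value" pairs.
class ScrapeParserSketch {

    // Skips blank lines and comment lines (# HELP / # TYPE), then splits each
    // remaining line at its last space: everything before is the series
    // (name plus labels), everything after is the sample value.
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String line : body.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int split = trimmed.lastIndexOf(' ');
            if (split < 0) {
                continue; // not a "series value" pair
            }
            String series = trimmed.substring(0, split).trim();
            double value = Double.parseDouble(trimmed.substring(split + 1));
            samples.put(series, value);
        }
        return samples;
    }

    public static void main(String[] args) {
        String body = String.join("\n",
                "# HELP http_requests_total Total HTTP requests",
                "# TYPE http_requests_total counter",
                "http_requests_total{method=\"GET\", status=\"200\"} 350",
                "up 1");
        parse(body).forEach((series, value) -> System.out.println(series + " = " + value));
    }
}
```

On each scrape, Prometheus attaches its own timestamp to every sample it reads, which is how a sequence of scrapes becomes a time series.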
Example Prometheus Config (prometheus.yml)
scrape_configs:
  - job_name: "spring-boot-app"
    metrics_path: "/actuator/prometheus"
    static_configs:
      - targets: ["localhost:8080"]
  - job_name: "node-metrics"
    static_configs:
      - targets: ["node_exporter:9100"]
This configuration tells Prometheus to:
✅ Collect metrics from a Spring Boot app (/actuator/prometheus).
✅ Collect infrastructure metrics from Node Exporter (:9100).
3️⃣ Step 3: Storing & Querying Metrics (Prometheus Database)
Prometheus stores all collected data in its Time-Series Database (TSDB).
🔹 Each metric is stored with labels (metadata) to categorize it.
Example:
http_requests_total{method="GET", endpoint="/home", status="200"} 350
This means there were 350 successful GET requests to /home.
🔹 You can use PromQL queries to fetch and analyze data:
rate(http_requests_total[5m]) # Per-second request rate, averaged over the last 5 minutes
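The arithmetic behind a rate() query like the one above can be sketched as follows. This is a simplified model, not Prometheus's exact algorithm (the real function also extrapolates to the edges of the window); the class RateSketch and its inputs are hypothetical. It takes the counter samples inside the window, sums the increases between neighbours, treats any drop as a counter reset, and divides by the elapsed seconds.

```java
// Hypothetical, simplified model of how a per-second rate is derived from counter samples.
class RateSketch {

    // timestamps are in seconds, values are the counter readings at those times.
    static double rate(double[] timestamps, double[] values) {
        if (timestamps.length != values.length || values.length < 2) {
            throw new IllegalArgumentException("need at least two matching samples");
        }
        double increase = 0.0;
        for (int i = 1; i < values.length; i++) {
            double delta = values[i] - values[i - 1];
            // A drop means the counter reset (e.g. the process restarted),
            // so the new reading itself is the increase since the reset.
            increase += delta >= 0 ? delta : values[i];
        }
        double elapsed = timestamps[timestamps.length - 1] - timestamps[0];
        if (elapsed <= 0) {
            throw new IllegalArgumentException("timestamps must be increasing");
        }
        return increase / elapsed;
    }

    public static void main(String[] args) {
        // Six scrapes, one per minute, covering a 5-minute window.
        double[] timestamps = {0, 60, 120, 180, 240, 300};
        double[] values = {100, 160, 220, 280, 340, 400};
        System.out.println("requests per second: " + rate(timestamps, values));
    }
}
```

This is why rate() is applied to counters rather than to raw values: the counter only ever goes up, and the query turns that growth into a per-second figure.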
4️⃣ Step 4: Visualization in Grafana
Grafana connects to Prometheus as a data source to visualize collected metrics.
🔹 Connecting Grafana to Prometheus
- Open Grafana and go to Configuration > Data Sources (Connections > Data sources in recent Grafana versions).
- Add Prometheus as a data source and enter the Prometheus URL (e.g., http://prometheus:9090).
- Click Save & Test.
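Under the hood, a data source like this talks to Prometheus over its HTTP API. The sketch below shows roughly what such a request looks like, using the instant-query endpoint /api/v1/query. The class PromQuerySketch and its method names are hypothetical, and the base URL is the example one from the step above; Grafana's own client is more involved (dashboards mostly use range queries).

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds and sends a PromQL instant query.
class PromQuerySketch {

    // Build the request URI, URL-encoding the PromQL expression.
    static URI instantQueryUri(String baseUrl, String promql) {
        String encoded = URLEncoder.encode(promql, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/v1/query?query=" + encoded);
    }

    // Send the query and return the raw JSON response body.
    static String runQuery(String baseUrl, String promql) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(instantQueryUri(baseUrl, promql)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed base URL; pass a reachable one as the first argument to actually run the query.
        String base = args.length > 0 ? args[0] : "http://prometheus:9090";
        System.out.println(instantQueryUri(base, "rate(http_requests_total[5m])"));
        if (args.length > 0) {
            System.out.println(runQuery(base, "up"));
        }
    }
}
```

If Save & Test fails in Grafana, requesting a URL like this by hand from the same network is a quick way to check whether Prometheus is reachable at all.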
🔹 Building Dashboards
- Grafana provides pre-built dashboards for common services like Kubernetes, MySQL, Redis, and JVM.
- You can create custom panels using PromQL queries.
🔹 Example Queries for Dashboard Panels
- CPU Usage Over Time
1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
- Average Response Time
sum(rate(http_request_duration_seconds_sum[5m])) / sum(rate(http_request_duration_seconds_count[5m]))
- Error Rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100
🔹 Sample Grafana Dashboard (Visual Representation)
🔥 Metrics Visualized on Grafana Panels
✅ Request Count & Response Time
✅ System CPU, Memory, and Disk Usage
✅ Application Errors & Latency
✅ Service Health & Alerts
5️⃣ Step 5: Alerting & Notifications
Grafana can trigger alerts based on thresholds defined for metrics.
🔹 Setting Up Alerts in Grafana
- Define a condition (e.g., Response time > 2s).
- Choose an action (Email, Slack, PagerDuty, etc.).
- Grafana continuously monitors and sends alerts if conditions are met.
Example: Alert when CPU Usage > 80%
1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8
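It helps to see how a CPU usage figure is derived before comparing it to a threshold. node_cpu_seconds_total is a counter of seconds spent per CPU mode (idle, user, system, and so on), so a common approach is to measure how much idle time accumulated over a window and treat the rest as busy. The sketch below shows that arithmetic; the class CpuAlertSketch, its parameters, and the sample numbers are hypothetical.

```java
// Hypothetical sketch of the arithmetic behind a "CPU usage > 80%" alert.
class CpuAlertSketch {

    // Busy fraction between two scrapes, derived from the idle-seconds counter.
    // idleStart / idleEnd: idle seconds summed over all cores at each scrape.
    static double cpuUsage(double idleStart, double idleEnd, double elapsedSeconds, int cpuCount) {
        if (elapsedSeconds <= 0 || cpuCount <= 0) {
            throw new IllegalArgumentException("elapsedSeconds and cpuCount must be positive");
        }
        double idleFraction = (idleEnd - idleStart) / (elapsedSeconds * cpuCount);
        return 1.0 - idleFraction;
    }

    // The alert condition: fire only when usage is strictly above the threshold.
    static boolean shouldAlert(double usage, double threshold) {
        return usage > threshold;
    }

    public static void main(String[] args) {
        // One core that accumulated 30 idle seconds in a 300-second window.
        double usage = cpuUsage(1000, 1030, 300, 1);
        System.out.println("usage=" + usage + " alert=" + shouldAlert(usage, 0.8));
    }
}
```

Real alert rules usually also require the condition to hold for some minutes before notifying, so that a short spike does not page anyone.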
Final Summary: Full Metrics Flow
Here’s how the data flows from applications to Grafana:
✅ Step 1: Application & Infrastructure Generate Metrics
- Spring Boot apps, MySQL, Kubernetes, and Linux servers expose metrics, either directly or through exporters.
✅ Step 2: Prometheus Scrapes & Stores Metrics
- Pulls metrics from endpoints.
- Stores in time-series database (TSDB).
✅ Step 3: Grafana Queries & Visualizes Metrics
- Connects to Prometheus as a data source.
- Executes PromQL queries and displays dashboards.
✅ Step 4: Alerting & Monitoring
- Grafana alerts notify engineers when metrics cross their defined thresholds.
Conclusion
By following this structured flow, organizations can monitor applications, databases, and infrastructure efficiently, ensuring high availability and performance.
💡 Key Takeaways:
- Prometheus collects & stores metrics.
- Grafana visualizes & alerts on metrics.
- Developers must instrument applications properly.
- Pre-built exporters help collect infrastructure metrics.
Would you like a full hands-on tutorial to set this up in a Docker-based environment? 🚀 Let me know!