Prometheus Query Cheat Sheet: 50+ Essential PromQL
The Prometheus Query Cheat Sheet is your ultimate reference for mastering PromQL — the powerful query language used to extract, analyze, and visualize time-series metrics in Prometheus. Whether you’re a DevOps engineer, SRE, or developer, this guide simplifies the most important queries, operators, and aggregation functions you’ll need for effective monitoring.
Prometheus Query Language (PromQL) lets you perform complex operations on metrics, including filtering, mathematical computations, and data aggregation — all in real time. By the end of this guide, you’ll be able to query metrics confidently, visualize performance trends, and create meaningful dashboards.
What Is Prometheus and PromQL?
Prometheus is an open-source monitoring and alerting system originally developed at SoundCloud. It scrapes metrics from configured targets at a regular interval, stores them as time-series data, and lets you query that data using PromQL (Prometheus Query Language).
PromQL is the backbone of Prometheus — it allows you to slice, dice, and analyze your metrics in real-time. You can retrieve metrics for CPU usage, memory consumption, network activity, HTTP request latency, and much more.
Example:
up
This query returns one series per scrape target, with the value 1 if the last scrape succeeded and 0 if it failed.
Why Use This Prometheus Query Cheat Sheet?
Prometheus has a vast query language, and memorizing every function or operator isn’t practical. This Prometheus Query Cheat Sheet gives you:
- A quick reference for all key PromQL commands.
- Ready-to-use examples for system and application monitoring.
- Aggregation, mathematical, and rate function usage.
- Simplified explanations for faster debugging and dashboard building.
PromQL Query Types Explained
Prometheus supports different query types for various use cases:
1. Instant Queries
An instant query evaluates an expression at a single point in time (by default, now) and returns the most recent sample of each matching series.
node_memory_MemFree_bytes
2. Range Queries
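A range query evaluates an expression repeatedly at fixed steps between a start and an end time, which is what a graph panel does. The [5m] range selector in the expression below is a separate concept: at every evaluation step it hands rate() the previous 5 minutes of samples.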
rate(http_requests_total[5m])
3. Vector Queries
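Most expressions return an instant vector: one sample per matching series, all sharing a metric name but differing in their labels. A selector with a range such as [5m] returns a range vector instead, which must be passed to a function like rate() before it can be graphed.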
Understanding these types helps you choose the right query format for alerts, graphs, or analysis.
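The difference between instant and range queries is easiest to see in the HTTP API, where they map to the /api/v1/query and /api/v1/query_range endpoints. The following is a minimal Python sketch that only builds the request URLs; the base URL, the helper names, and the timestamps are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9090"  # assumption: a Prometheus server on its default port


def instant_query_url(expr, time=None):
    """URL for /api/v1/query: evaluate expr at one timestamp (default: now)."""
    params = {"query": expr}
    if time is not None:
        params["time"] = time
    return f"{BASE_URL}/api/v1/query?{urlencode(params)}"


def range_query_url(expr, start, end, step):
    """URL for /api/v1/query_range: evaluate expr every `step` from start to end."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{BASE_URL}/api/v1/query_range?{urlencode(params)}"


# Hypothetical Unix timestamps covering one hour, sampled once per minute.
print(instant_query_url("up"))
print(range_query_url("rate(http_requests_total[5m])", 1700000000, 1700003600, "60s"))
```

The same expression can be sent to either endpoint; only the evaluation schedule changes.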
Basic Prometheus Query Examples
Here are a few commonly used PromQL queries for everyday monitoring.
| Purpose | Query Example |
| --- | --- |
| Show target status (1 = up, 0 = down) | up |
| Show CPU time per second, per core and mode | rate(node_cpu_seconds_total[1m]) |
| Show memory that is not free (includes cache and buffers) | node_memory_MemTotal_bytes - node_memory_MemFree_bytes |
| Disk usage percentage | (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 |
| HTTP request rate | rate(http_requests_total[5m]) |
These queries are the foundation of system-level monitoring.
Using Labels in Prometheus Queries
Labels allow fine-grained filtering of metrics. Each time series in Prometheus is uniquely identified by its metric name and label set.
Filter by Label
http_requests_total{method="GET"}
Exclude a Label Value
http_requests_total{method!="POST"}
Multiple Label Filters
http_requests_total{method="GET", handler="/api"}
Labels help you zoom in on specific applications, endpoints, or nodes for precise analysis.
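To make the matcher semantics concrete, here is a small Python sketch of how a selector filters series. It covers the four PromQL matcher operators (=, !=, =~, !~); the series data and the matches() helper are hypothetical, and Python's re module stands in for the RE2 syntax that Prometheus actually uses.

```python
import re


def matches(labels, matchers):
    """Return True if a series' labels satisfy every (name, op, value) matcher."""
    for name, op, value in matchers:
        actual = labels.get(name, "")  # a missing label behaves like an empty value
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "=~":
            ok = re.fullmatch(value, actual) is not None  # regex matchers are fully anchored
        elif op == "!~":
            ok = re.fullmatch(value, actual) is None
        else:
            raise ValueError(f"unknown matcher operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical series for illustration.
series = [
    {"__name__": "http_requests_total", "method": "GET", "handler": "/api"},
    {"__name__": "http_requests_total", "method": "POST", "handler": "/api"},
    {"__name__": "http_requests_total", "method": "GET", "handler": "/static"},
]

# Similar in spirit to: http_requests_total{method="GET", handler=~"/api.*"}
selected = [s for s in series if matches(s, [("method", "=", "GET"), ("handler", "=~", "/api.*")])]
print(selected)
```

Every matcher in the braces must hold, so adding matchers can only narrow the result.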
Working with Rate and Increase Functions
The rate() and increase() functions are essential for analyzing time-series changes.
1. rate()
Returns the per-second average rate of increase over the range, adjusting for counter resets.
rate(http_requests_total[5m])
2. increase()
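Returns the total increase over the time range. The result is extrapolated to cover the whole range, so it can be fractional even for integer counters.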
increase(http_requests_total[1h])
3. irate()
Instantaneous rate computed from the last two samples in the range; suited to zoomed-in graphs of fast-moving counters.
irate(node_cpu_seconds_total[1m])
Use rate() for alerts and smoother graphs, and irate() only for graphing volatile counters where responsiveness matters more than stability.
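The relationship between the three functions can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it handles counter resets but does not reproduce the extrapolation to the window boundaries that Prometheus applies, and the sample data and function names are hypothetical.

```python
def counter_increase(samples):
    """Total increase across (timestamp, value) samples, treating a drop as a counter reset."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            total += cur  # after a reset the counter restarted from 0
    return total


def simple_rate(samples):
    """Per-second average over the span between the first and last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed


def simple_irate(samples):
    """Per-second rate using only the last two samples."""
    return simple_rate(samples[-2:])


# Hypothetical samples scraped every 15 s; the counter resets between t=30 and t=45.
samples = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 10.0), (60, 70.0)]
print(counter_increase(samples))  # 130.0
print(simple_rate(samples))       # 130 / 60 seconds
print(simple_irate(samples))      # 4.0
```

Because irate looks at only two samples, a single burst moves it far more than it moves rate over the same window.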
Aggregation Operators in PromQL
Aggregations summarize data across multiple dimensions.
| Operator | Description |
| --- | --- |
| sum() | Sum of all values |
| avg() | Average of all values |
| max() | Maximum value |
| min() | Minimum value |
| count() | Count of series |
| stddev() | Standard deviation |
| topk() | Top K series by value |
Examples:
sum(rate(http_requests_total[5m])) by (method)
avg(rate(node_cpu_seconds_total[5m])) by (mode)
topk(5, rate(http_requests_total[5m]))
Aggregation is vital for service-level metrics like average latency or total traffic.
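What sum(...) by (method) does to the label set can be shown with a short Python sketch: series are grouped by the listed labels, all other labels are dropped, and the values in each group are added. The input data and the sum_by() helper are hypothetical.

```python
from collections import defaultdict


def sum_by(series, by):
    """Sum sample values per distinct combination of the `by` labels."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple((name, labels.get(name, "")) for name in by)
        groups[key] += value
    return dict(groups)


# Hypothetical per-instance request rates (requests per second).
series = [
    ({"method": "GET", "instance": "a"}, 12.0),
    ({"method": "GET", "instance": "b"}, 8.0),
    ({"method": "POST", "instance": "a"}, 3.0),
]

totals = sum_by(series, by=["method"])
print(totals)
```

The instance label disappears from the result, which is why aggregated series are a common input for service-level dashboards.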
Mathematical Operations in Prometheus Queries
PromQL allows math operations between metrics and constants.
Examples:
node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100
rate(http_requests_total[5m]) * 60
You can even combine metrics:
(rate(http_requests_errors_total[1m]) / rate(http_requests_total[1m])) * 100
This calculates the error percentage: failed requests per second divided by all requests per second.
Comparison and Logical Operators
Use these operators to compare or combine metrics.
| Operator | Meaning |
| --- | --- |
| == | Equal to |
| != | Not equal to |
| > | Greater than |
| < | Less than |
| and, or, unless | Logical (set) operations |
Example:
up == 0
Lists all targets that are down.
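1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9
Shows instances whose average CPU utilization over the last 5 minutes is above 90%. The raw node_cpu_seconds_total counter holds cumulative seconds, so it cannot be compared with 0.9 directly.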
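A comparison between a vector and a number acts as a filter: series that fail the test are removed from the result rather than being marked false. PromQL's bool modifier (for example, up == bool 0) switches to returning 1 or 0 for every series instead. A rough Python sketch of both behaviors, with hypothetical data and helper names:

```python
def filter_greater(vector, threshold):
    """Like `vector > threshold`: keep series above the threshold, drop the rest."""
    return [(labels, value) for labels, value in vector if value > threshold]


def bool_greater(vector, threshold):
    """Like `vector > bool threshold`: keep every series, with 1.0 or 0.0 as its value."""
    return [(labels, 1.0 if value > threshold else 0.0) for labels, value in vector]


# Hypothetical CPU utilization per instance, as a fraction between 0 and 1.
utilization = [({"instance": "a"}, 0.95), ({"instance": "b"}, 0.40)]

print(filter_greater(utilization, 0.9))
print(bool_greater(utilization, 0.9))
```

The filtering form is what alerting rules rely on: an empty result means no alert.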
Rate vs. Increase vs. Irate
Let’s summarize these commonly confused PromQL functions:
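| Function | Purpose | Use Case |
| --- | --- | --- |
| rate() | Per-second average over range | Smooth trends and alerts |
| increase() | Total increase over range | Totals per window, such as requests in the last hour |
| irate() | Instantaneous rate from the last two samples | Live dashboards |

All three expect counters and need at least two samples inside the range, so choose a range that spans several scrape intervals.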
Working with Histogram Metrics
Histogram metrics measure distribution, such as request durations.
Example:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
This returns the 95th percentile latency for HTTP requests.
Key Histogram Terms:
- le: the "less than or equal to" label that holds each bucket's upper boundary; bucket counts are cumulative.
- sum by (le): aggregates bucket counts across series while keeping the le label that histogram_quantile() requires.
Histograms are crucial for performance and SLA/SLO analysis.
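How a quantile is estimated from cumulative buckets can be sketched in Python. This is a simplified model of the interpolation for classic histograms, assuming observations are spread evenly inside each bucket; the bucket counts are hypothetical and edge cases that Prometheus handles (such as malformed buckets) are left out.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must include the +Inf bucket and use cumulative counts.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # rank falls in the +Inf bucket: report the highest finite bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical request-duration buckets (seconds): 100 observations in total.
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (math.inf, 100.0)]
print(histogram_quantile(0.95, buckets))
```

With these counts the 95th observation lands halfway through the 0.5 to 1.0 bucket, so the estimate is about 0.75 seconds. The result is only as precise as the bucket boundaries allow.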
Vector Matching in Prometheus Queries
When performing operations between two metrics, Prometheus uses vector matching to align series by labels.
Example:
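http_requests_total{job="api"} / http_requests_total{job="frontend"}
By default, series are paired only when every label matches exactly. Because the job label differs between the two sides, this expression as written returns an empty result unless the matching is adjusted.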
Use:
- on() to match specific labels.
- ignoring() to exclude labels from matching.
Example:
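rate(errors_total{code="500"}[5m]) / ignoring(code) rate(requests_total[5m])
Here errors_total is assumed to carry a code label that requests_total lacks. ignoring(code) leaves that label out of the match, so each error series pairs with the request series that shares its remaining labels.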
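One-to-one matching with ignoring() can be modeled in Python as a join on a label signature. The sketch below is an illustration under stated assumptions: the series, the code label, and the divide_ignoring() helper are hypothetical, and the duplicate check is applied to the right-hand side only.

```python
def divide_ignoring(left, right, ignoring):
    """One-to-one division of two vectors, matching on all labels except `ignoring`."""

    def signature(labels):
        return tuple(sorted((k, v) for k, v in labels.items() if k not in ignoring))

    right_by_sig = {}
    for labels, value in right:
        sig = signature(labels)
        if sig in right_by_sig:
            raise ValueError("right side has several series with the same signature")
        right_by_sig[sig] = value

    result = {}
    for labels, value in left:
        sig = signature(labels)
        if sig in right_by_sig:  # series without a partner are dropped
            result[sig] = value / right_by_sig[sig]
    return result


# Hypothetical per-second rates: errors carry an extra `code` label.
errors = [({"method": "GET", "code": "500"}, 2.0), ({"method": "POST", "code": "500"}, 1.0)]
requests = [({"method": "GET"}, 40.0), ({"method": "PUT"}, 5.0), ({"method": "POST"}, 10.0)]

ratios = divide_ignoring(errors, requests, ignoring={"code"})
print(ratios)
```

The PUT series has no partner on the error side, so it is absent from the output, which mirrors how unmatched series silently vanish from a PromQL result.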
Prometheus Query Cheat Sheet for Node Exporter Metrics
Here are the most used PromQL commands for host-level metrics:
| Metric | Description | Example |
| --- | --- | --- |
| CPU Usage | CPU time used per mode | rate(node_cpu_seconds_total[5m]) |
| Memory Usage | Fraction of memory that is not free | 1 - (node_memory_MemFree_bytes / node_memory_MemTotal_bytes) |
| Disk Usage | Used space percentage | (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 |
| Load Average | 1-minute system load | node_load1 |
| Network In/Out | Bytes received per second (use node_network_transmit_bytes_total for outbound traffic) | rate(node_network_receive_bytes_total[5m]) |
These form the backbone of system observability in Prometheus.
PromQL Cheat Sheet for Kubernetes Metrics
Prometheus integrates deeply with Kubernetes, allowing cluster-level insights.
| Use Case | Query Example |
| --- | --- |
| Pod CPU Usage | sum(rate(container_cpu_usage_seconds_total[5m])) by (pod) |
| Pod Memory Usage | sum(container_memory_usage_bytes) by (pod) |
| Node Disk Space Available (%) | node_filesystem_avail_bytes / node_filesystem_size_bytes * 100 |
| Pod Restart Count | sum(increase(kube_pod_container_status_restarts_total[1h])) by (pod) |
| Running Pods | sum(kube_pod_status_phase{phase="Running"}) |
These queries are essential for monitoring Kubernetes cluster health.
Alerting Rules with Prometheus Queries
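PromQL expressions drive alerting rules: Prometheus evaluates each rule on a schedule and sends firing alerts to Alertmanager, which handles grouping, routing, and notifications.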
Example Rule:
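groups:
  - name: node-alerts   # illustrative group name
    rules:
      - alert: HighCPULoad
        # Utilization = 1 minus the idle fraction, averaged over all cores of an instance
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"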
By mastering PromQL, you can design intelligent alerts to catch anomalies early.
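The for: clause in a rule means the expression must keep returning a result for that long before the alert fires; until then the alert is pending. A minimal Python sketch of that state logic, assuming a fixed evaluation interval and a hypothetical alert_states() helper:

```python
def alert_states(condition_by_eval, eval_interval_s, for_s):
    """Return 'inactive', 'pending' or 'firing' for each rule evaluation."""
    states = []
    active_since = None
    for i, condition in enumerate(condition_by_eval):
        now = i * eval_interval_s
        if not condition:
            active_since = None  # any gap resets the pending timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


# Hypothetical run: rule evaluated every 30 s with `for: 2m` (120 s).
conditions = [True, True, False, True, True, True, True, True]
print(alert_states(conditions, eval_interval_s=30, for_s=120))
```

A single evaluation where the condition does not hold restarts the wait, which is why short for: durations suit noisy expressions poorly.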
Advanced PromQL Functions
| Function | Purpose |
| --- | --- |
| avg_over_time() | Average over a time range |
| sum_over_time() | Sum over time |
| max_over_time() | Maximum in range |
| quantile_over_time() | Percentile values |
| predict_linear() | Forecast trends |
| deriv() | Derivative of time series |
Example:
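predict_linear(http_requests_total[1h], 3600)
Extrapolates the value of http_requests_total one hour (3600 seconds) ahead by fitting a straight line to the last hour of samples. The function is intended for gauges, so a more typical use is forecasting free disk space: predict_linear(node_filesystem_avail_bytes[1h], 4 * 3600).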
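The forecast rests on simple linear regression. The Python sketch below fits a least-squares line through the samples and evaluates it at a future time; it is an approximation of the idea rather than Prometheus's exact implementation, and the sample values are hypothetical.

```python
def predict_linear(samples, now, horizon_s):
    """Fit value = slope * t + intercept by least squares and evaluate it at now + horizon_s."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope * (now + horizon_s) + intercept


# Hypothetical gauge: free disk space in GiB, shrinking by 6 GiB per minute.
samples = [(0, 100.0), (60, 94.0), (120, 88.0)]
print(predict_linear(samples, now=120, horizon_s=600))
```

With these perfectly linear samples the line loses 0.1 GiB per second, so the forecast ten minutes ahead is about 28 GiB. Real data is noisier, and a longer range smooths the fit at the cost of reacting more slowly to a change in trend.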
PromQL Query Optimization Tips
- Use shorter time ranges for faster queries.
- Filter labels precisely to reduce data scans.
- Use recording rules for repetitive queries.
- Avoid using heavy joins in dashboards.
- Test queries incrementally.
These optimizations make dashboards load faster and alerts more responsive.
Best Practices for Prometheus Querying
- Wrap counters (e.g., request counts) in rate() or increase() instead of graphing their raw values.
- Use gauge metrics for instantaneous values (e.g., temperature, memory).
- Normalize metrics for consistent visualization.
- Use recording rules to precompute metrics for dashboards.
- Regularly test queries in Prometheus UI or Grafana Explore.
A disciplined PromQL strategy improves observability and scalability.
Conclusion
This Prometheus Query Cheat Sheet gives you everything you need to query, aggregate, and analyze metrics efficiently. From system monitoring and Kubernetes metrics to alerting and forecasting, PromQL’s flexibility makes it a must-have skill for any DevOps engineer.
Mastering these commands and best practices will help you troubleshoot faster, optimize performance, and create meaningful dashboards — all while unlocking the full power of Prometheus.
FAQs About Prometheus Query Cheat Sheet
1. What is PromQL in Prometheus?
PromQL (Prometheus Query Language) is used to query and analyze time-series data collected by Prometheus. It supports filtering, aggregation, and mathematical operations.
2. How do I query data in Prometheus?
You can query data using the Prometheus UI or API. For example:
rate(http_requests_total[5m])
3. What is the difference between rate() and increase()?
rate() shows the per-second rate of increase, while increase() shows the total increase over the selected range.
4. Can Prometheus queries be used in Grafana?
Yes, Grafana supports Prometheus as a data source. You can use PromQL queries directly in Grafana dashboards.
5. How do I create custom alerts in Prometheus?
Define alerting rules in a YAML rule file, each with a PromQL expression. Prometheus evaluates the rules and forwards firing alerts to Alertmanager, which sends the notifications.