Prometheus Query Cheat Sheet: 50+ Essential PromQL

2025-10-29

The Prometheus Query Cheat Sheet is your ultimate reference for mastering PromQL — the powerful query language used to extract, analyze, and visualize time-series metrics in Prometheus. Whether you’re a DevOps engineer, SRE, or developer, this guide simplifies the most important queries, operators, and aggregation functions you’ll need for effective monitoring.

Prometheus Query Language (PromQL) lets you perform complex operations on metrics, including filtering, mathematical computations, and data aggregation — all in real time. By the end of this guide, you’ll be able to query metrics confidently, visualize performance trends, and create meaningful dashboards.

What Is Prometheus and PromQL?

Prometheus is an open-source monitoring and alerting system originally developed at SoundCloud and now a Cloud Native Computing Foundation project. It scrapes metrics from configured targets at regular intervals, stores them as time-series data, and lets you query that data using PromQL (Prometheus Query Language).

PromQL is the backbone of Prometheus: it lets you select, filter, and analyze your metrics in real time. You can retrieve metrics for CPU usage, memory consumption, network activity, HTTP request latency, and much more.

Example:

up

This simple query returns the scrape status of every monitored target (1 if the last scrape succeeded, 0 if it failed).

Why Use This Prometheus Query Cheat Sheet?

Prometheus has a vast query language, and memorizing every function or operator isn’t practical. This Prometheus Query Cheat Sheet gives you:

  • A quick reference for all key PromQL commands.
  • Ready-to-use examples for system and application monitoring.
  • Aggregation, mathematical, and rate function usage.
  • Simplified explanations for faster debugging and dashboard building.

PromQL Query Types Explained

Prometheus supports different query types for various use cases:

1. Instant Queries

Instant queries evaluate an expression at a single point in time (by default, now) and return the latest value of each matching series.

node_memory_MemFree_bytes

2. Range Queries

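Range queries evaluate an expression at regular steps across a time window, which is how graphs are drawn. Note that the [5m] in the example below is a range vector selector, not a range query: it hands the last 5 minutes of samples to rate(), and the expression as a whole can be run as either an instant or a range query.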

rate(http_requests_total[5m])

3. Vector Queries

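Strictly speaking, this describes a result type rather than a query type. A selector such as up returns an instant vector: one sample for every matching series, and a single metric name usually matches many series that differ only in their labels.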

Understanding these types helps you choose the right query format for alerts, graphs, or analysis.
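
Instant and range queries correspond to two endpoints of the Prometheus HTTP API, /api/v1/query and /api/v1/query_range. The Python sketch below only builds the request URLs so the difference in parameters is visible. The base URL http://localhost:9090 and the helper names are assumptions for illustration, not part of Prometheus.

```python
from urllib.parse import urlencode

# Assumed address of a local Prometheus server; adjust for your setup.
BASE_URL = "http://localhost:9090"


def instant_query_url(expr, time=None):
    """Build a URL for /api/v1/query, which evaluates expr at one instant."""
    params = {"query": expr}
    if time is not None:
        params["time"] = time  # evaluation timestamp; the server uses "now" if omitted
    return f"{BASE_URL}/api/v1/query?{urlencode(params)}"


def range_query_url(expr, start, end, step):
    """Build a URL for /api/v1/query_range, which evaluates expr at every step."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{BASE_URL}/api/v1/query_range?{urlencode(params)}"


print(instant_query_url("up"))
print(range_query_url("rate(http_requests_total[5m])", 1700000000, 1700003600, "60s"))
```

The same expression can be sent to either endpoint; only the time parameters change.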

Basic Prometheus Query Examples

Here are a few commonly used PromQL queries for everyday monitoring.

| Purpose | Query Example |
|---|---|
| Show scrape status of all targets | up |
| Show CPU time per core and mode | rate(node_cpu_seconds_total[1m]) |
| Show memory in use | node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes |
| Disk usage percentage | (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 |
| HTTP request rate | rate(http_requests_total[5m]) |

These queries are the foundation of system-level monitoring.

Using Labels in Prometheus Queries

Labels allow fine-grained filtering of metrics. Each time series in Prometheus is uniquely identified by its metric name and label set.

Filter by Label

http_requests_total{method="GET"}

Exclude a Label

http_requests_total{method!="POST"}

Multiple Label Filters

http_requests_total{method="GET", handler="/api"}

Labels help you zoom in on specific applications, endpoints, or nodes for precise analysis.
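
To make the selection rules concrete, here is a minimal Python sketch of how label matchers pick series. It covers the = and != matchers shown above plus the regex matchers =~ and !~, which PromQL also supports and which always match the whole label value. The function names and sample data are hypothetical, and Python's re module only approximates the RE2 syntax Prometheus uses.

```python
import re


def matches(labels, name, op, value):
    """Apply one matcher to a series' label set."""
    actual = labels.get(name, "")  # a missing label behaves like an empty string
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None  # fully anchored
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


def select(series, matchers):
    """Keep the series that satisfy every matcher."""
    return [s for s in series if all(matches(s, *m) for m in matchers)]


series = [
    {"method": "GET", "handler": "/api"},
    {"method": "POST", "handler": "/api"},
    {"method": "GET", "handler": "/health"},
]
print(select(series, [("method", "=", "GET"), ("handler", "=", "/api")]))
print(select(series, [("handler", "=~", "/a.*")]))
```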

Working with Rate and Increase Functions

The rate() and increase() functions are essential for analyzing time-series changes.

1. rate()

Shows per-second average rate of increase.

rate(http_requests_total[5m])

2. increase()

Displays the total increase over a time range.

increase(http_requests_total[1h])

3. irate()

Instantaneous rate, computed from only the last two samples in the range. Good for real-time dashboards, but noisier than rate().

irate(node_cpu_seconds_total[1m])

Use rate() for smoother graphs and for alerting rules, and irate() for more responsive charts of fast-moving counters.
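
The difference between the three functions is easier to see with numbers. The following Python sketch is a simplified model, not Prometheus's implementation: it works on raw (timestamp, value) samples and leaves out the extrapolation to the window edges that Prometheus applies, so real results differ slightly. All function names are made up for this example.

```python
# Simplified model: samples are (timestamp_seconds, counter_value) pairs
# that fall inside the query window.

def counter_deltas(samples):
    """Return the increase between each pair of consecutive samples."""
    deltas = []
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset and restarted from zero.
        deltas.append(cur - prev if cur >= prev else cur)
    return deltas


def simple_increase(samples):
    """Total increase across the window."""
    return sum(counter_deltas(samples))


def simple_rate(samples):
    """Average per-second increase across the window."""
    elapsed = samples[-1][0] - samples[0][0]
    return simple_increase(samples) / elapsed


def simple_irate(samples):
    """Per-second increase between the last two samples only."""
    (t_prev, _), (t_last, _) = samples[-2:]
    return counter_deltas(samples[-2:])[0] / (t_last - t_prev)


window = [(0, 100), (15, 130), (30, 160), (45, 400)]
print(simple_increase(window), simple_rate(window), simple_irate(window))
```

With this window, the burst in the final interval dominates the irate-style result while the rate-style result averages it over the whole 45 seconds.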

Aggregation Operators in PromQL

Aggregations summarize data across multiple dimensions.

| Operator | Description |
|---|---|
| sum() | Sum of all values |
| avg() | Average of all values |
| max() | Maximum value |
| min() | Minimum value |
| count() | Count of series |
| stddev() | Standard deviation |
| topk() | Top K series by value |

Examples:

sum(rate(http_requests_total[5m])) by (method)
avg(rate(node_cpu_seconds_total[5m])) by (mode)
topk(5, rate(http_requests_total[5m]))

Aggregation is vital for service-level metrics like average latency or total traffic.
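
As a rough mental model of what "by (label)" does, the Python sketch below groups series by the listed labels and sums each group, and also shows a topk-style selection. The data layout, a list of (labels, value) pairs, and the function names are assumptions made for this illustration.

```python
from collections import defaultdict


def sum_by(series, by):
    """Group series by the given label names and sum the values per group."""
    totals = defaultdict(float)
    for labels, value in series:
        # Labels not listed in `by` are dropped from the result.
        key = tuple((name, labels.get(name, "")) for name in by)
        totals[key] += value
    return dict(totals)


def top_k(series, k):
    """Return the k series with the largest values."""
    return sorted(series, key=lambda item: item[1], reverse=True)[:k]


request_rates = [
    ({"method": "GET", "instance": "a"}, 2.0),
    ({"method": "GET", "instance": "b"}, 3.0),
    ({"method": "POST", "instance": "a"}, 1.0),
]
print(sum_by(request_rates, ["method"]))
print(top_k(request_rates, 2))
```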

Mathematical Operations in Prometheus Queries

PromQL allows math operations between metrics and constants.

Examples:

node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100
rate(http_requests_total[5m]) * 60

You can even combine metrics:

(rate(http_requests_errors_total[1m]) / rate(http_requests_total[1m])) * 100

This calculates the error percentage by dividing the error rate by the total request rate.

Comparison and Logical Operators

Use these operators to compare or combine metrics.

| Operator | Meaning |
|---|---|
| == | Equal to |
| != | Not equal to |
| > | Greater than |
| < | Less than |
| and, or, unless | Logical operations |

Example:

up == 0

Lists all targets that are down.

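1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9

Shows instances with CPU usage above 90%. Because node_cpu_seconds_total is a counter of CPU seconds, it has to be turned into a utilization fraction with rate() before it can be compared against 0.9.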
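
One detail that often surprises newcomers is that a comparison filters series instead of returning true or false: series that fail the comparison simply disappear from the result. PromQL's bool modifier (for example, up == bool 0) switches to returning 1 or 0 for every series. The Python sketch below models both behaviours; the function names and sample targets are hypothetical.

```python
def compare_filter(series, predicate):
    """Default behaviour: keep only the series whose value passes the test."""
    return [(labels, value) for labels, value in series if predicate(value)]


def compare_bool(series, predicate):
    """Behaviour with `bool`: keep every series, with value 1.0 or 0.0."""
    return [(labels, 1.0 if predicate(value) else 0.0) for labels, value in series]


targets = [({"instance": "a:9100"}, 1.0), ({"instance": "b:9100"}, 0.0)]
print(compare_filter(targets, lambda v: v == 0))
print(compare_bool(targets, lambda v: v == 0))
```

The filtering behaviour is what makes comparisons usable directly as alert conditions: an empty result means nothing is wrong.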

Rate vs. Increase vs. Irate

Let’s summarize these commonly confused PromQL functions:

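| Function | Purpose | Use Case |
|---|---|---|
| rate() | Per-second average over range | Smooth trends, alerting |
| increase() | Total increase over range | Totals per window (e.g., requests in the last hour) |
| irate() | Instantaneous rate from the last two samples | Live dashboards |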

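All three functions apply to counters. Whichever you use, choose a range that spans several scrape intervals (a common rule of thumb is at least four), because rate() and increase() need at least two samples inside the window to return a value.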

Working with Histogram Metrics

Histogram metrics measure distribution, such as request durations.

Example:

histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

This returns the 95th percentile latency for HTTP requests.

Common Histogram Buckets:

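  • le = "less than or equal to": each bucket counts observations at or below its upper boundary, so buckets are cumulative.
  • sum by (le) aggregates the buckets across all other labels while keeping the le label that histogram_quantile() needs.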

Histograms are crucial for performance and SLA/SLO analysis.
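
To show where the percentile estimate comes from, here is a small Python sketch of the linear interpolation that histogram_quantile() performs over classic cumulative buckets. It is a simplified model under stated assumptions: buckets are sorted, the last one is +Inf, the lowest bucket starts at 0, and 0 < q <= 1. The function name and bucket values are invented for the example.

```python
import math


def quantile_from_buckets(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) buckets."""
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    if len(buckets) < 2 or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]  # position of the quantile among all observations
    # First bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if idx == len(buckets) - 1:
        # The quantile lies in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    upper, count = buckets[idx]
    lower, prev_count = buckets[idx - 1] if idx > 0 else (0.0, 0.0)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
print(quantile_from_buckets(0.95, latency_buckets))
```

Because the result is interpolated inside one bucket, its accuracy depends on how finely the bucket boundaries are chosen around the latencies you care about.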

Vector Matching in Prometheus Queries

When performing operations between two metrics, Prometheus uses vector matching to align series by labels.

Example:

http_requests_total{job="api"} / http_requests_total{job="frontend"}

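By default, two series match only when all of their labels are identical, so the query above returns no results: the job label differs between the two sides. To control which labels are compared, use: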

  • on() to match specific labels.
  • ignoring() to exclude labels from matching.

Example:

rate(requests_total[5m]) / ignoring(instance) rate(errors_total[5m])

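Here the instance label is left out of the comparison, so series pair up on their remaining labels. Each side must still contribute exactly one series per pair; otherwise the query fails with a matching error.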
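
One-to-one matching with ignoring() can be pictured as a join on the labels that remain after the ignored ones are removed. The Python sketch below models that idea for division. It is an illustration only: the function names, the code label, and the sample data are assumptions, and it expects non-zero right-hand values.

```python
def match_key(labels, ignoring):
    """Label set used for matching, with the ignored labels removed."""
    return frozenset((k, v) for k, v in labels.items() if k not in ignoring)


def index_by_key(series, ignoring):
    """Index series by match key, rejecting duplicates within one side."""
    indexed = {}
    for labels, value in series:
        key = match_key(labels, ignoring)
        if key in indexed:
            raise ValueError("more than one series in a match group")
        indexed[key] = value
    return indexed


def divide_ignoring(left, right, ignoring):
    """Divide each left series by the right series with the same match key."""
    right_by_key = index_by_key(right, ignoring)
    left_by_key = index_by_key(left, ignoring)
    # Series without a partner on the other side are dropped from the result.
    return [
        (dict(key), value / right_by_key[key])
        for key, value in left_by_key.items()
        if key in right_by_key
    ]


errors = [({"method": "GET", "code": "500"}, 5.0)]
requests = [({"method": "GET"}, 50.0), ({"method": "POST"}, 10.0)]
print(divide_ignoring(errors, requests, {"code"}))
```

The POST series has no partner on the left-hand side, so it does not appear in the output.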

Prometheus Query Cheat Sheet for Node Exporter Metrics

Here are the most used PromQL commands for host-level metrics:

| Metric | Description | Example |
|---|---|---|
| CPU Usage | CPU time used per core and mode | rate(node_cpu_seconds_total[5m]) |
| Memory Usage | Fraction of memory in use | 1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) |
| Disk Usage | Used space percentage | (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 |
| Load Average | 1-minute system load | node_load1 |
| Network In/Out | Bytes received per second (use node_network_transmit_bytes_total for outbound traffic) | rate(node_network_receive_bytes_total[5m]) |

These form the backbone of system observability in Prometheus.

PromQL Cheat Sheet for Kubernetes Metrics

Prometheus integrates deeply with Kubernetes, allowing cluster-level insights.

| Use Case | Query Example |
|---|---|
| Pod CPU Usage | sum(rate(container_cpu_usage_seconds_total[5m])) by (pod) |
| Pod Memory Usage | sum(container_memory_usage_bytes) by (pod) |
| Node Disk Usage (%) | (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 |
| Pod Restart Count (last hour) | sum(increase(kube_pod_container_status_restarts_total[1h])) by (pod) |
| Running Pods | sum(kube_pod_status_phase{phase="Running"}) |

These queries are essential for monitoring Kubernetes cluster health.

Alerting Rules with Prometheus Queries

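Prometheus evaluates alerting rules written as PromQL expressions and sends the resulting alerts to Alertmanager, which handles grouping, routing, and notifications.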

Example Rule:

- alert: HighCPULoad
  expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "High CPU usage detected on {{ $labels.instance }}"
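
The "for" clause means the expression must stay true across consecutive rule evaluations for the whole duration before the alert fires; until then the alert is pending. The Python sketch below is a simplified model of that state machine, assuming one evaluation per listed timestamp. The function name and data are hypothetical.

```python
def alert_states(evaluations, for_seconds):
    """Return the alert state after each (timestamp, condition_true) evaluation."""
    states = []
    active_since = None  # when the condition first became true in the current run
    for timestamp, condition_true in evaluations:
        if not condition_true:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held_for = timestamp - active_since
        states.append("firing" if held_for >= for_seconds else "pending")
    return states


# Evaluations every 60 seconds with "for: 2m" (120 seconds).
history = [(0, True), (60, True), (120, True), (180, False), (240, True)]
print(alert_states(history, 120))
```

A longer "for" value trades detection speed for fewer alerts on short spikes.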

By mastering PromQL, you can design intelligent alerts to catch anomalies early.

Advanced PromQL Functions

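| Function | Purpose |
|---|---|
| avg_over_time() | Average over a time range |
| sum_over_time() | Sum of all samples in the range |
| max_over_time() | Maximum in range |
| quantile_over_time() | Quantile (percentile) of the samples in the range |
| predict_linear() | Forecast a gauge's future value by linear regression |
| deriv() | Per-second derivative of a gauge |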

Example:

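predict_linear(node_filesystem_avail_bytes[1h], 3600)

Predicts the available disk space one hour from now by fitting a linear trend to the last hour of samples. predict_linear() is intended for gauges, so it is applied here to a gauge rather than to a raw counter such as http_requests_total.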
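
The forecast is an ordinary least-squares line fitted to the samples in the range and extended into the future. The Python sketch below shows the idea under simplifying assumptions: it measures time relative to the last sample, whereas Prometheus uses the evaluation time, and it needs at least two samples with distinct timestamps. The function name and sample data are invented for the example.

```python
def linear_forecast(samples, seconds_ahead):
    """Fit a least-squares line to (timestamp, value) samples and extend it."""
    n = len(samples)
    t_ref = samples[-1][0]  # measure time relative to the last sample
    xs = [t - t_ref for t, _ in samples]
    ys = [v for _, v in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x  # fitted value at t_ref
    return intercept + slope * seconds_ahead


# A gauge that drops by 1 unit per second, sampled once a minute.
free_space = [(0, 1000.0), (60, 940.0), (120, 880.0), (180, 820.0)]
print(linear_forecast(free_space, 3600))
```

A negative forecast, as here, is the usual signal for "this resource will run out within the forecast horizon".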

PromQL Query Optimization Tips

  1. Use shorter time ranges for faster queries.
  2. Filter labels precisely to reduce data scans.
  3. Use recording rules for repetitive queries.
  4. Avoid using heavy joins in dashboards.
  5. Test queries incrementally.

These optimizations make dashboards load faster and alerts more responsive.

Best Practices for Prometheus Querying

  • Wrap counters (e.g., request counts) in rate() or increase() instead of graphing their raw values.
  • Use gauge metrics for instantaneous values (e.g., temperature, memory).
  • Normalize metrics for consistent visualization.
  • Use recording rules to precompute metrics for dashboards.
  • Regularly test queries in Prometheus UI or Grafana Explore.

A disciplined PromQL strategy improves observability and scalability.

Conclusion

This Prometheus Query Cheat Sheet gives you everything you need to query, aggregate, and analyze metrics efficiently. From system monitoring and Kubernetes metrics to alerting and forecasting, PromQL’s flexibility makes it a must-have skill for any DevOps engineer.

Mastering these commands and best practices will help you troubleshoot faster, optimize performance, and create meaningful dashboards — all while unlocking the full power of Prometheus.

FAQs About Prometheus Query Cheat Sheet

1. What is PromQL in Prometheus?

PromQL (Prometheus Query Language) is used to query and analyze time-series data collected by Prometheus. It supports filtering, aggregation, and mathematical operations.

2. How do I query data in Prometheus?

You can query data using the Prometheus UI or API. For example:
rate(http_requests_total[5m])

3. What is the difference between rate() and increase()?

rate() shows the per-second rate of increase, while increase() shows the total increase over the selected range.

4. Can Prometheus queries be used in Grafana?

Yes, Grafana supports Prometheus as a data source. You can use PromQL queries directly in Grafana dashboards.

5. How do I create custom alerts in Prometheus?

Use alerting rules in YAML format with PromQL expressions. Prometheus evaluates these rules and sends firing alerts to Alertmanager, which delivers the notifications.