Free Prometheus Alert Rule and SLO Generator

Tools for Prometheus monitoring: an SLO-based PromQL generator, an error budget calculator, and a resource calculator for sizing deployments to avoid OOM kills.

Brought to you by Cardinality Cloud, LLC.

Frequently Asked Questions

Common questions about Prometheus alerting, SLO-based monitoring, and using our tools

What is an SLO and why should I use SLO-based alerts?

October 20, 2025 Cardinality Cloud

Traditional infrastructure alerts page you when CPU hits 80%, even though your users are fine. Meanwhile, degraded API performance goes unnoticed because no arbitrary threshold was crossed. An SLO (Service Level Objective) changes this: it is a reliability target that measures what users actually experience, such as “99.9% of requests succeed over 30 days.” Rooted in Google’s Site Reliability Engineering (SRE) practices, SLO-based alerting pages only when user experience is genuinely at risk, which reduces alert fatigue while still catching real issues early.
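
As a rough illustration of the idea, the check behind a request-based SLO can be sketched in a few lines of Python. The function names and the choice to treat a window with no traffic as compliant are assumptions made for this example, not something the generated rules are known to do.

```python
def sli(good_requests: int, total_requests: int) -> float:
    """Service level indicator: the fraction of requests that succeeded."""
    if total_requests == 0:
        # Assumption for this sketch: no traffic counts as meeting the target.
        return 1.0
    return good_requests / total_requests


def meets_slo(good_requests: int, total_requests: int, slo_target: float) -> bool:
    """True when the measured success ratio is at or above the SLO target."""
    return sli(good_requests, total_requests) >= slo_target


# 9,992,000 successes out of 10,000,000 requests is 99.92%, above a 99.9% target.
print(meets_slo(9_992_000, 10_000_000, slo_target=0.999))
```

Note that nothing here mentions CPU or memory: the only input is what users experienced.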


Why is burn rate alerting useful?

October 18, 2025 Cardinality Cloud

Traditional threshold alerts fire on every spike, creating alert fatigue. Burn rate alerting is different: it tracks how quickly you’re consuming your error budget and alerts only when errors are sustained enough to threaten your reliability target. This gives you early warning before the error budget is exhausted, with far less noise.
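
A minimal sketch of the arithmetic, assuming the common definition of burn rate as the observed error ratio divided by the error ratio the SLO allows. The two-window check and the 14.4 threshold follow the widely cited example from Google's SRE Workbook (2% of a 30-day budget consumed in one hour); the thresholds this tool actually generates may differ.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring.

    A burn rate of 1.0 uses up the whole error budget exactly at the end
    of the SLO window; 2.0 uses it up in half the window, and so on.
    """
    allowed_error_ratio = 1.0 - slo_target
    return error_ratio / allowed_error_ratio


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # illustrative value, not taken from the tool
) -> bool:
    """Page only when both a long and a short window exceed the threshold.

    The long window filters out brief spikes; the short window confirms
    that the problem is still happening right now.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) > threshold
        and burn_rate(short_window_error_ratio, slo_target) > threshold
    )


# Sustained 2% errors against a 99.9% SLO: burn rate is about 20, so page.
print(should_page(0.02, 0.02, slo_target=0.999))
# A short spike that barely moves the long window does not page.
print(should_page(0.001, 0.05, slo_target=0.999))
```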


How does this tool efficiently calculate error budget over long SLO windows?

October 16, 2025 Cardinality Cloud

Calculating error budget over 30 days should be simple, but naive Prometheus queries can time out on high-cardinality metrics. This tool uses a Riemann sum-inspired technique that pre-computes error ratios at 5-minute intervals, turning an expensive range query into a single fast aggregation. The result: accurate error budget calculations that scale.
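
The idea can be sketched outside Prometheus. Assuming one pre-computed error ratio per 5-minute interval, averaging those samples approximates the error ratio over the whole window, much as a Riemann sum approximates an integral with equal-width slices. This is an illustration of the general technique, not the tool's actual rules, and it weights every interval equally rather than by request volume.

```python
def window_error_ratio(samples: list[float]) -> float:
    """Approximate the window's error ratio as the mean of equal-width samples."""
    if not samples:
        return 0.0
    return sum(samples) / len(samples)


def budget_remaining(samples: list[float], slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_error_ratio = 1.0 - slo_target
    return 1.0 - window_error_ratio(samples) / allowed_error_ratio


# 30 days of 5-minute samples is 30 * 288 = 8640 points.
# Hypothetical data: ten bad intervals at 50% errors, everything else clean.
samples = [0.0] * 8630 + [0.5] * 10
print(f"{budget_remaining(samples, slo_target=0.999):.1%} of the budget left")
```

The expensive part, computing each 5-minute ratio from raw series, happens once per interval. The 30-day answer then only has to aggregate a few thousand cheap points.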


How do I query these generated rules in Prometheus to monitor my application?

October 14, 2025 Cardinality Cloud

You’ve deployed the generated SLO rules to Prometheus. Now what? The recording rules pre-compute your SLO metrics every minute, but how do you actually check whether you’re meeting your targets, monitor error budget consumption, or build dashboards? This guide shows the essential PromQL queries for SLO monitoring, from checking current status to visualizing long-term trends.
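
If you would rather check a recorded value from a script than from the Prometheus UI, a sketch along these lines may help. It uses the standard Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and its documented response shape. The server address and the recording rule name `slo:error_budget_remaining:ratio` are hypothetical placeholders; substitute the names your generated rules actually define.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def first_value(response_body: str) -> Optional[float]:
    """Return the first sample value of an instant-vector response, if any."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return None
    result = payload["data"]["result"]
    if not result:
        return None
    # Each instant-vector sample is [unix_timestamp, "value-as-string"].
    return float(result[0]["value"][1])


# Hypothetical rule name and server address, for illustration only.
print(instant_query_url("http://localhost:9090", "slo:error_budget_remaining:ratio"))

# A response shaped like the ones the API returns (made-up numbers).
sample = (
    '{"status":"success","data":{"resultType":"vector",'
    '"result":[{"metric":{},"value":[1700000000,"0.42"]}]}}'
)
print(first_value(sample))
```

Fetching the URL is left out so the sketch runs without a live server; any HTTP client will do.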


How do I size my Prometheus deployment?

October 12, 2025 Cardinality Cloud

Your monitoring just went down because Prometheus got OOM-killed again. Or maybe you’re paying for 32GB of RAM when 8GB would suffice. Sizing Prometheus shouldn’t be guesswork: it is largely predictable math based on three inputs: active time series, scrape interval, and retention period. Our Resource Calculator does the math for you, showing memory, CPU, and disk requirements with visual guidance on safe ranges and real-world scaling examples.
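
To give a feel for the math, here is a sketch of the disk portion only, using the rule of thumb from the Prometheus storage documentation: retention time in seconds, times ingested samples per second, times bytes per sample (typically around 1 to 2 bytes after compression). The Resource Calculator's own formulas are not shown here and may differ, and memory and CPU are deliberately left out because they depend on more factors than this sketch models.

```python
def ingested_samples_per_second(active_series: int, scrape_interval_seconds: float) -> float:
    """Each active series yields one sample per scrape interval."""
    return active_series / scrape_interval_seconds


def disk_bytes(
    active_series: int,
    scrape_interval_seconds: float,
    retention_days: float,
    bytes_per_sample: float = 2.0,  # conservative end of the documented 1-2 byte range
) -> float:
    """Estimate on-disk size for the given retention period."""
    retention_seconds = retention_days * 24 * 60 * 60
    rate = ingested_samples_per_second(active_series, scrape_interval_seconds)
    return retention_seconds * rate * bytes_per_sample


# Hypothetical deployment: 1M active series, 15 s scrapes, 15 days of retention.
estimate = disk_bytes(1_000_000, 15, 15)
print(f"about {estimate / 1e9:.1f} GB")
```

Halving the scrape interval or doubling retention doubles the estimate, which is why those three inputs dominate the sizing.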


What is an Error Budget?

October 10, 2025 Cardinality Cloud

Engineering wants to slow down and fix stability issues. Product wants to ship faster and hit deadlines. Who’s right? Both, and neither. The real question isn’t “should we prioritize reliability or velocity?” but “how much unreliability can we tolerate while still meeting our promises?” That’s your error budget: the quantitative answer that turns endless debates into data-driven decisions. With a 99.9% SLO, you get 43.2 minutes of downtime per 30-day month to spend on innovation, experiments, or controlled risks.
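
The 43.2-minute figure comes from simple arithmetic: a 30-day window has 30 × 24 × 60 = 43,200 minutes, and a 99.9% target leaves 0.1% of them as budget. A small sketch, with a function name chosen for this example:

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full downtime the SLO tolerates over the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1.0 - slo_target)


for target in (0.99, 0.999, 0.9999):
    print(f"{target:.2%} SLO: {error_budget_minutes(target):.2f} minutes per 30 days")
```

Each extra nine cuts the budget by a factor of ten, which is why the target deserves a deliberate choice rather than a default.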


How do I report a bug or request a feature?

October 4, 2025 Cardinality Cloud

We use GitHub Issues to track bugs and feature requests.

To report a bug or request a feature:

  • Visit our Issues page
  • Search existing issues to avoid duplicates
  • Click “New Issue” and provide detailed information
  • For bugs: include steps to reproduce, expected vs actual behavior
  • For features: describe the use case and desired functionality

Your feedback helps us improve the tool for everyone!
