Free Prometheus Alert Rule and SLO Generator

Tools for Prometheus monitoring: SLO-based PromQL generator, error budget calculator, and scaling to avoid OOMs.

Brought to you by Cardinality Cloud, LLC.

FAQ

Articles tagged with "FAQ"

What is an SLO and why should I use SLO-based alerts?

October 20, 2025 Cardinality Cloud 9 min read

Traditional infrastructure alerts page you when CPU hits 80%, even though your users are fine. Meanwhile, degraded API performance goes unnoticed because no arbitrary threshold was crossed. An SLO (Service Level Objective) changes this - it’s a reliability target that measures what users actually experience, like “99.9% of requests succeed over 30 days.” Born from Google’s Site Reliability Engineering (SRE) practices, SLO-based alerting pages only when user experience is genuinely at risk, reducing alert fatigue while catching real issues early.

Why is burn rate alerting useful?

October 18, 2025 Cardinality Cloud 2 min read

Traditional threshold alerts fire on every spike, creating alert fatigue. Burn rate alerting is different - it tracks how quickly you’re consuming your error budget and only alerts when errors are sustained enough to threaten your reliability target. This gives you early warnings before user experience degrades, while dramatically reducing noise.
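
As a rough illustration of the idea, burn rate can be read as the observed error ratio divided by the error ratio the SLO allows. The sketch below assumes a 99.9% target; the function name and the sample numbers are hypothetical and are not taken from the generated rules.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than sustainable the budget is being spent.

    A burn rate of 1.0 means the error budget would run out exactly at the
    end of the SLO window; higher values exhaust it sooner.
    """
    allowed = 1.0 - slo_target  # error ratio the SLO tolerates
    if allowed <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / allowed


# Hypothetical example: 0.5% of requests failing against a 99.9% SLO.
rate = burn_rate(error_ratio=0.005, slo_target=0.999)
print(f"burn rate: {rate:.1f}x")
```

At that rate a 30-day budget would last roughly six days, which is why a sustained high burn rate is worth a page while a brief spike is not.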

How does this tool efficiently calculate error budget over long SLO windows?

October 16, 2025 Cardinality Cloud 2 min read

Calculating error budget over 30 days should be simple, but naive Prometheus queries time out on high-cardinality metrics. This tool uses a Riemann sum-inspired technique that pre-computes error ratios at 5-minute intervals, turning an expensive 30-day range query into a single fast aggregation over those pre-computed values. The result: accurate error budget calculations that scale.
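
The Riemann-sum idea can be sketched outside Prometheus. The example below is an assumption-laden illustration, not the tool's actual rules: it treats each 5-minute error ratio as one equal-width slice of the window (so slices are weighted by time, not by request volume), and the function name and sample data are hypothetical.

```python
from typing import Sequence


def budget_consumed(interval_error_ratios: Sequence[float],
                    slo_target: float,
                    intervals_in_window: int) -> float:
    """Fraction of the error budget spent so far (1.0 means fully spent)."""
    allowed = 1.0 - slo_target
    # Riemann-style sum: each sample stands in for one equal-width slice of
    # the window; slices not yet observed contribute zero.
    window_error_ratio = sum(interval_error_ratios) / intervals_in_window
    return window_error_ratio / allowed


INTERVALS_30D = 30 * 24 * 12  # number of 5-minute intervals in 30 days

# Hypothetical data: one hour (12 intervals) at 10% errors, the rest clean.
samples = [0.10] * 12 + [0.0] * (INTERVALS_30D - 12)
spent = budget_consumed(samples, slo_target=0.999,
                        intervals_in_window=INTERVALS_30D)
print(f"error budget spent: {spent:.1%}")
```

Because the per-interval ratios are already computed, the long-window figure is a cheap sum over a few thousand values instead of a scan over every raw sample.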

How do I query these generated rules in Prometheus to monitor my application?

October 14, 2025 Cardinality Cloud 2 min read

You’ve deployed the generated SLO rules to Prometheus - now what? The recording rules are pre-computing your SLO metrics every minute, but how do you actually check if you’re meeting your targets, monitor error budget consumption, or build dashboards? This guide shows you the essential PromQL queries to unlock the full power of your SLO monitoring, from checking current status to visualizing long-term trends.

How do I size my Prometheus deployment?

October 12, 2025 Cardinality Cloud 3 min read

Your monitoring just went down because Prometheus got OOM-killed again. Or maybe you’re paying for 32GB of RAM when 8GB would suffice. Sizing Prometheus shouldn’t be guesswork - it’s actually predictable math based on three inputs: active time series, scrape interval, and retention period. Our Resource Calculator does the math for you, showing memory, CPU, and disk requirements with visual guidance on safe ranges and real-world scaling examples.
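
To give a feel for the arithmetic, here is a minimal sketch of the disk part only, using the rule of thumb from the Prometheus storage documentation (retention seconds × ingested samples per second × bytes per sample, at roughly 1 to 2 bytes per sample). It is not the Resource Calculator's own formula; the function name and the example inputs are hypothetical, and memory and CPU are left out because they depend on more factors.

```python
def estimate_disk_bytes(active_series: int,
                        scrape_interval_s: float,
                        retention_days: float,
                        bytes_per_sample: float = 2.0) -> float:
    """Rough on-disk size for the given retention period.

    bytes_per_sample defaults to the upper end of the documented 1-2 byte
    range, so the result is a conservative estimate.
    """
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


# Hypothetical deployment: 1M active series, 15s scrapes, 15 days retention.
disk_gb = estimate_disk_bytes(1_000_000, 15, 15) / 1e9
print(f"estimated disk: {disk_gb:.1f} GB")
```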

What is an Error Budget?

October 10, 2025 Cardinality Cloud 2 min read

Engineering wants to slow down and fix stability issues. Product wants to ship faster and hit deadlines. Who’s right? Both - and neither. The real question isn’t “should we prioritize reliability or velocity?” but “how much unreliability can we tolerate while still meeting our promises?” That’s your error budget: the quantitative answer that turns endless debates into data-driven decisions. With a 99.9% SLO, you get 43.2 minutes of downtime per 30-day window to spend on innovation, experiments, or controlled risks.
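
The 43.2-minute figure follows directly from the SLO target and the window length. A small sketch of that arithmetic, assuming a 30-day window and a hypothetical helper name:

```python
def downtime_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full downtime the SLO tolerates over the window."""
    window_minutes = window_days * 24 * 60  # 43,200 minutes in 30 days
    return window_minutes * (1.0 - slo_target)


budget = downtime_budget_minutes(0.999)
print(f"99.9% over 30 days allows about {budget:.1f} minutes of downtime")
```

Tightening the target to 99.99% shrinks the same budget by a factor of ten, to a little over four minutes.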

How do I contribute to this project?

October 6, 2025 Cardinality Cloud 1 min read

We welcome contributions! This project is open source under the Apache 2.0 license.

To contribute:

  • Fork the repository on GitHub
  • Read our Contributing Guidelines
  • Create a feature branch
  • Make your changes and write tests if applicable
  • Submit a pull request

We appreciate all contributions, whether they’re bug fixes, new features, documentation improvements, or examples!

How do I report a bug or request a feature?

October 4, 2025 Cardinality Cloud 1 min read

We use GitHub Issues to track bugs and feature requests.

To report a bug or request a feature:

  • Visit our Issues page
  • Search existing issues to avoid duplicates
  • Click “New Issue” and provide detailed information
  • For bugs: include steps to reproduce and the expected vs. actual behavior
  • For features: describe the use case and desired functionality

Your feedback helps us improve the tool for everyone!