Prometheus
open source / community (CNCF)
The de-facto standard pull-based metrics engine and time-series database for cloud-native and Kubernetes monitoring; everything else in the space orbits it.
- Category: Infra & metrics
- License: Open source
- Deployment: Self-hosted
- Cost: Free
- Free tier: Yes
- Self-host effort: Heavy
- Maturity: Incumbent
- Popularity: ≈64k GitHub stars; category center of gravity
The catch
Single-node by design — no native HA or long-term storage — so any serious deployment becomes a 4-5 component stack (Alertmanager, Grafana, Thanos/Mimir, exporters) you assemble and operate yourself.
The honest take
Prometheus is the cloud-native default, and the single most useful thing I can tell you about it is that it’s the wrong thing to call “a product.” When someone says they “use Prometheus,” what they actually run is a stack: Prometheus itself for scraping and storage, Alertmanager for routing alerts, Grafana for dashboards, an exporter per thing you want to watch, and — the moment you need high availability or more than a few weeks of history — Thanos, Mimir or VictoriaMetrics bolted on behind it. Adopting Prometheus is adopting an architecture you operate, not installing a tool. That’s not a criticism; it’s the thing people underestimate, and the source of most “Prometheus is hard” complaints.
Where it’s genuinely the right call: anything Kubernetes or container-native. The pull model, service discovery and PromQL are the lingua franca of cloud-native monitoring for a reason — the ecosystem, the exporters and the community knowledge are unmatched, and config-as-code means your monitoring lives in Git like everything else. If you’re building on the standard, you’re building on the thing everything else integrates with.
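To make the pull model concrete, here is a minimal sketch of what a scrape target looks like from the application side: the app exposes a plain-text `/metrics` endpoint and Prometheus fetches it on a schedule. This uses only the Python standard library rather than an official client library, and the metric name `app_requests_total`, the `REQUESTS` dictionary, the `serve` helper and port 8000 are all hypothetical names chosen for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters, keyed by (method, status).
REQUESTS = {
    ("GET", "200"): 42,
    ("POST", "500"): 3,
}


def render_metrics(counts):
    """Render one counter in the Prometheus text exposition format."""
    lines = [
        "# HELP app_requests_total Total HTTP requests handled.",
        "# TYPE app_requests_total counter",
    ]
    # One line per label combination: each combination is its own time series.
    for (method, status), value in sorted(counts.items()):
        lines.append(
            f'app_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers scrapes on /metrics and returns 404 for everything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving scrapes on the given (assumed) port."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape job at `127.0.0.1:8000` would be enough for Prometheus to start collecting this counter. In practice you would more likely reach for an official client library or an existing exporter; the point of the sketch is only that the target is passive and Prometheus does the fetching.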
Where it’s the wrong call: a traditional, SNMP-heavy network. Yes, snmp_exporter exists, and yes, it works — but it’s the clunky path, and you’ll spend your time fighting generator configs instead of monitoring. If you’re escaping SolarWinds and you’re not going cloud-native, Zabbix is the gentler landing; reach for Prometheus when the destination is Kubernetes, not just “something free.”
Two traps worth naming up front. The first is long-term storage: vanilla Prometheus is single-node and short-memoried by design — local retention is days-to-weeks, with no HA — so any serious deployment eventually grows a remote-storage backend, and that decision (Thanos vs Mimir vs VictoriaMetrics) is its own project. The second is cardinality: a single label with unbounded values (a user ID, a full URL, a pod name baked into a metric) can explode your time-series count and your memory bill overnight. Get your labels right and it’s cheap to run; get them wrong and you’ll learn what “cardinality” means the hard way.
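The cardinality trap is easiest to see as arithmetic: in the worst case, one metric produces a time series for every combination of label values, so the count is the product of the distinct values per label. The sketch below is a back-of-the-envelope estimate, not a Prometheus API; the label names, the value counts and the 15-second scrape interval are assumptions picked for illustration.

```python
from math import prod


def series_count(label_values):
    """Worst-case series for one metric: product of distinct values per label."""
    return prod(label_values.values())


def samples_per_second(series, scrape_interval_s=15):
    """Ingest rate if every series is scraped once per interval (assumed 15 s)."""
    return series / scrape_interval_s


# Hypothetical label sets: number of distinct values each label can take.
bounded = {"method": 5, "status": 6, "service": 20}
unbounded = {**bounded, "user_id": 100_000}

print(series_count(bounded))    # 600
print(series_count(unbounded))  # 60000000
print(samples_per_second(series_count(bounded)))  # 40.0
```

Adding a single `user_id` label turns 600 series into 60 million for the same metric. Real deployments rarely hit the full product, since not every combination occurs, but the multiplication is why one unbounded label is enough to cause trouble.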
The honest summary: free in license, expensive in ownership, and worth it when you’re cloud-native and treat operating the stack as work you want to own. If that sentence makes you tired, that’s useful information. Compare it head-to-head with Zabbix or managed Grafana Cloud before you commit.
Pricing in the real world
- Software: $0
- Grafana Enterprise support (optional): $10-30k/yr
Free software; cost is engineering time + the surrounding stack.