ADR-048: Prometheus metrics exposure

Category: architecture
Provenance: human

Decision

Expose Prometheus metrics from the operator to monitor reconciliation performance, API calls, rate limiting, and errors.

Rationale

Metrics make operator health and performance observable. They track reconciliation lag, API bottlenecks, and the impact of rate limiting, and they support alerts on error rates and reconciliation failures. Prometheus is the standard for Kubernetes monitoring. Metrics also help with capacity planning and troubleshooting, which makes them essential for production operations.

Agent Instructions

Expose Prometheus metrics on the operator's metrics port. Track reconciliation duration, API calls, rate limit waits, errors, and resource counts. Label metrics by resource type, namespace, and operation. Import metrics from keycloak_operator.observability.metrics; see src/keycloak_operator/observability/metrics.py for the available metrics.
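The instructions above can be illustrated with a minimal sketch built on the prometheus_client library. It is not the operator's actual module: the metric names (prefixed example_), the track_reconcile helper, and the dedicated registry are all hypothetical, chosen only to show duration, error, and resource-count metrics carrying resource type, namespace, and operation labels. Real code should import the existing metrics from keycloak_operator.observability.metrics instead of defining new ones.

```python
import time
from contextlib import contextmanager

from prometheus_client import (
    CollectorRegistry,
    Counter,
    Gauge,
    Histogram,
    generate_latest,
)

# Dedicated registry so this sketch does not collide with metrics
# registered elsewhere in the process (an assumption for illustration).
registry = CollectorRegistry()

# Hypothetical metric names; the operator's real metrics are defined in
# src/keycloak_operator/observability/metrics.py.
RECONCILE_DURATION = Histogram(
    "example_reconcile_duration_seconds",
    "Time spent in one reconciliation.",
    ["resource_type", "namespace", "operation"],
    registry=registry,
)
RECONCILE_ERRORS = Counter(
    "example_reconcile_errors_total",
    "Reconciliations that raised an exception.",
    ["resource_type", "namespace", "operation"],
    registry=registry,
)
MANAGED_RESOURCES = Gauge(
    "example_managed_resources",
    "Resources currently managed by the operator.",
    ["resource_type", "namespace"],
    registry=registry,
)


@contextmanager
def track_reconcile(resource_type, namespace, operation):
    """Record duration for every reconciliation and count the failures."""
    labels = {
        "resource_type": resource_type,
        "namespace": namespace,
        "operation": operation,
    }
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # Count the failure, then let the caller see the original error.
        RECONCILE_ERRORS.labels(**labels).inc()
        raise
    finally:
        # Duration is observed for successful and failed runs alike.
        RECONCILE_DURATION.labels(**labels).observe(time.perf_counter() - start)


# Usage: wrap the reconciliation body and update the resource count.
with track_reconcile("realm", "default", "update"):
    pass  # reconciliation work would go here

MANAGED_RESOURCES.labels(resource_type="realm", namespace="default").set(3)

# Text exposition format, as a Prometheus scrape would receive it.
exposition = generate_latest(registry).decode()
```

To serve a registry like this over HTTP, prometheus_client provides start_http_server(port, registry=registry). Which port the operator actually uses, and whether it serves metrics this way, is not stated here. Label values should stay low-cardinality (resource type, namespace, operation) so the number of time series remains bounded.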

Rejected Alternatives

Logging only, no metrics

Logs don't aggregate easily. They can't be graphed as trends or used for numeric alerts, so they give poor visibility into system health.

Custom metrics format

Prometheus is the Kubernetes standard. A custom format requires special tooling and reduces interoperability.