ADR-048: Prometheus metrics exposure
Category: architecture
Provenance: human
Decision
Expose Prometheus metrics from the operator to monitor reconciliation performance, API calls, rate limiting, and errors.
Rationale
Metrics make the operator's health and performance observable. They track reconciliation lag, API bottlenecks, and the impact of rate limiting, and they allow alerting on error rates or reconciliation failures. Prometheus is the de facto standard for Kubernetes monitoring. Metrics also support capacity planning and troubleshooting, which makes them essential for production operations.
Agent Instructions
Expose Prometheus metrics on the operator's metrics port. Track reconciliation duration, API calls, rate limit waits, errors, and resource counts. Label metrics by resource type, namespace, and operation. Import metrics from keycloak_operator.observability.metrics; see src/keycloak_operator/observability/metrics.py for the available metrics.
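As an illustration of the labeling scheme described above, here is a minimal sketch using the `prometheus_client` library. The metric names (`example_reconcile_duration_seconds`, `example_reconcile_errors`) and the `track_reconcile` helper are hypothetical and are not taken from `keycloak_operator.observability.metrics`; real code should import the metrics defined there rather than declare new ones.

```python
import time
from contextlib import contextmanager

from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

# Hypothetical metrics for illustration only. The operator's real metrics live in
# src/keycloak_operator/observability/metrics.py and may use different names.
REGISTRY = CollectorRegistry()
LABELS = ["resource_type", "namespace", "operation"]

RECONCILE_DURATION = Histogram(
    "example_reconcile_duration_seconds",
    "Time spent in one reconciliation.",
    LABELS,
    registry=REGISTRY,
)
RECONCILE_ERRORS = Counter(
    "example_reconcile_errors",  # exposed as example_reconcile_errors_total
    "Reconciliations that raised an exception.",
    LABELS,
    registry=REGISTRY,
)


@contextmanager
def track_reconcile(resource_type: str, namespace: str, operation: str):
    """Observe the duration of every reconciliation and count the failed ones."""
    labels = {
        "resource_type": resource_type,
        "namespace": namespace,
        "operation": operation,
    }
    start = time.perf_counter()
    try:
        yield
    except Exception:
        RECONCILE_ERRORS.labels(**labels).inc()
        raise
    finally:
        RECONCILE_DURATION.labels(**labels).observe(time.perf_counter() - start)


# A successful reconciliation.
with track_reconcile("realm", "default", "update"):
    pass

# A failing reconciliation: the error is counted, then re-raised to the caller.
try:
    with track_reconcile("client", "default", "create"):
        raise RuntimeError("simulated API failure")
except RuntimeError:
    pass

# Text in the Prometheus exposition format, as a scrape would return it.
exposition = generate_latest(REGISTRY).decode()
```

In a running process, `prometheus_client.start_http_server(port, registry=REGISTRY)` could serve this registry for scraping; the port is deployment-specific and not stated in this ADR. Keeping label values to a small, bounded set (resource type, namespace, operation) avoids creating an unbounded number of time series.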
Rejected Alternatives
Logging only, no metrics
Logs do not aggregate easily, and they cannot readily be graphed as trends or drive numeric alerts. The result is poor visibility into system health.
Custom metrics format
Prometheus is the Kubernetes standard. A custom format would require special tooling and reduce interoperability.