Migration & Upgrade Guide¶
This guide covers upgrading the Keycloak operator and comparing this operator with the official Keycloak operator.
Table of Contents¶
- Upgrading the Operator
- Upgrading Keycloak Version
- Comparison with Official Keycloak Operator
- Backup & Rollback
Upgrading the Operator¶
Pre-Upgrade Checklist¶
- Backup current state - Export all Keycloak resources
- Review release notes - Check for breaking changes
- Test in non-production - Upgrade staging environment first
- Check database backups - Ensure recent backup exists
- Document current versions - Record operator and Keycloak versions
Step 1: Backup Current State¶
# Backup all Keycloak resources
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces -o yaml \
> keycloak-resources-backup-$(date +%Y%m%d).yaml
# Backup operator configuration
helm get values keycloak-operator -n keycloak-operator-system \
> operator-values-backup-$(date +%Y%m%d).yaml
# Backup CRDs
kubectl get crd -o name | grep "vriesdemichael.github.io" \
  | xargs kubectl get -o yaml \
  > crds-backup-$(date +%Y%m%d).yaml
Step 2: Check Current Version¶
# Get current operator version
helm list -n keycloak-operator-system
# Get operator image version
kubectl get deployment keycloak-operator -n keycloak-operator-system \
-o jsonpath='{.spec.template.spec.containers[0].image}'
Step 3: Review Release Notes¶
Check the Releases Page for:
- Breaking changes
- New features
- Bug fixes
- Migration requirements
Step 4: Upgrade Operator (Helm)¶
# Check available versions (OCI)
helm show chart oci://ghcr.io/vriesdemichael/charts/keycloak-operator
# Upgrade operator
helm upgrade keycloak-operator oci://ghcr.io/vriesdemichael/charts/keycloak-operator \
--namespace keycloak-operator-system \
--values operator-values-backup-$(date +%Y%m%d).yaml \
--version <version> \
--wait
Important: The --wait flag ensures the upgrade completes before returning.
Step 5: Verify Upgrade¶
# Check operator pods are running new version
kubectl get pods -n keycloak-operator-system
# Check operator logs for startup
kubectl logs -n keycloak-operator-system -l app=keycloak-operator --tail=50
# Verify CRDs updated
kubectl get crd keycloaks.vriesdemichael.github.io -o yaml | grep -A5 version
# Check all resources still healthy
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces
All resources should remain in Ready phase.
Step 6: Test Reconciliation¶
# Trigger reconciliation on a test realm
kubectl annotate keycloakrealm <test-realm> -n <test-namespace> \
reconcile=$(date +%s) --overwrite
# Watch logs
kubectl logs -n keycloak-operator-system -l app=keycloak-operator -f
# Verify realm still Ready
kubectl get keycloakrealm <test-realm> -n <test-namespace>
Rollback Procedure¶
If upgrade fails:
# Rollback Helm release
helm rollback keycloak-operator -n keycloak-operator-system
# Verify operator rolled back
kubectl get pods -n keycloak-operator-system
# Check resources still healthy
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces
Important: CRD changes cannot be automatically rolled back. You may need to manually restore CRDs from the backup taken in Step 1, e.g. `kubectl apply -f crds-backup-<date>.yaml`.
Upgrading Keycloak Version¶
Supported Keycloak Versions¶
- Minimum: Keycloak 25.0.0 (management port 9000 requirement)
- Recommended: Keycloak 26.0.0+
- Maximum: Latest Keycloak release
Automated Pre-Upgrade Backups¶
Automatic for CNPG and Managed tiers
When upgrading Keycloak to a new major or minor version, the operator automatically creates a backup before applying the change. Patch-level upgrades (e.g., 26.0.1 → 26.0.2) skip this step.
The backup behavior depends on your database tier:
- CNPG: Creates a CNPG `Backup` CR and waits for completion before proceeding.
- Managed: Creates a `VolumeSnapshot` of the database PVC.
- External: Cannot back up automatically. The operator logs a warning and proceeds. Users with external databases must handle backups independently before upgrading. Flat-field (legacy) configs are treated identically (ADR-091).
See Backup & Restore: Automated Pre-Upgrade Backups for full configuration details.
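For the CNPG tier, the pre-upgrade backup is a standard CloudNativePG `Backup` resource. A minimal sketch of what gets created (the name is illustrative, not what the operator actually generates):

```yaml
# Illustrative CNPG Backup resource for the keycloak-db cluster
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: keycloak-db-pre-upgrade   # illustrative name
  namespace: keycloak-db
spec:
  cluster:
    name: keycloak-db             # the CNPG Cluster to back up
```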
Pre-Upgrade Checklist¶
- Check Keycloak release notes - Review breaking changes
- Verify backup configuration - Ensure upgradePolicy settings match your requirements
- Test in non-production - Verify compatibility
- Schedule maintenance window - Plan for brief downtime (rolling update) or zero-downtime (blue-green, Phase 3)
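As a sketch only: the `upgradePolicy` name comes from the checklist above, but the subfield below is an assumption, not the documented schema (see the Backup & Restore guide for the real fields):

```yaml
# Sketch - subfield name is an assumption; consult the Backup & Restore guide
spec:
  upgradePolicy:
    preUpgradeBackup: true   # hypothetical: require a backup before upgrades
```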
Upgrade Strategy¶
Blue-Green Deployment (Future — Phase 3):
1. Deploy new Keycloak version alongside old version
2. Migrate database schema via Liquibase job
3. Switch traffic to new version
4. Keep old version for quick rollback
5. Remove old version after verification

Rolling Update (Current):
1. Update Keycloak resource with new image tag
2. Operator triggers pre-upgrade backup (automatic for CNPG/Managed)
3. Operator performs rolling update
4. Brief downtime during pod restarts
Rolling Update Procedure¶
# Check current Keycloak version
kubectl get keycloak <name> -n <namespace> \
-o jsonpath='{.spec.image.tag}'
# Update to new version
kubectl patch keycloak <name> -n <namespace> --type=merge -p '
spec:
  image:
    tag: "26.0.0"
'
# Watch rollout
kubectl rollout status statefulset/<keycloak-name> -n <namespace>
# Verify all pods running new version
kubectl get pods -n <namespace> -l app=keycloak \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'
Verify Upgrade¶
# Check Keycloak status
kubectl get keycloak <name> -n <namespace>
# Should show PHASE=Ready
# Check all realms still working
kubectl get keycloakrealm --all-namespaces
# Test OAuth2 flow
# (Use test client to verify authentication)
# Check database schema version
kubectl exec -it -n <namespace> <keycloak-pod> -- \
psql -h <db-host> -U keycloak -d keycloak \
-c "SELECT * FROM databasechangelog ORDER BY orderexecuted DESC LIMIT 5;"
Rollback to Previous Version¶
# Revert to previous image tag
kubectl patch keycloak <name> -n <namespace> --type=merge -p '
spec:
  image:
    tag: "25.0.6"
'
# Watch rollout
kubectl rollout status statefulset/<keycloak-name> -n <namespace>
# Verify rollback
kubectl get pods -n <namespace> -l app=keycloak
Note: Keycloak database migrations are forward-only. Rolling back may require database restore if schema was upgraded.
Cache Isolation During Upgrades¶
When running multiple Keycloak pods, all members must form a single Infinispan/JGroups cluster to share distributed caches (user sessions, action tokens, login flows in progress). If pods from two different major versions try to cluster together during a rolling upgrade, they may encounter serialization incompatibilities in the JGroups protocol, causing subtle split-brain issues.
The cacheIsolation feature solves this by restricting which pods can join the same JGroups cluster using a Kubernetes label selector on the headless discovery service.
Recommended: autoRevision: true for semver images¶
If you tag your Keycloak image with a proper semver version (e.g., quay.io/keycloak/keycloak:26.4.1), enable automatic revision-based isolation:
# keycloak-values.yaml
keycloak:
  image: quay.io/keycloak/keycloak
  version: "26.4.1"
  cacheIsolation:
    autoRevision: true
The operator derives the cluster identity from the major version only (v26), so patch and minor upgrades (e.g., 26.4.1 → 26.5.0) remain in the same cluster. A major upgrade (e.g., 26.x → 27.x) automatically creates a new isolated cluster.
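A minimal shell sketch of that derivation (the operator does this internally; "keycloak" stands in for the instance name in the `<name>-v<major>` scheme):

```shell
# Derive the cluster identity from a semver image tag
tag="26.4.1"
major="${tag%%.*}"                 # keep everything before the first dot
cluster_id="keycloak-v${major}"
echo "${cluster_id}"               # -> keycloak-v26
```

Because only the major component is used, `26.4.1` and `26.5.0` yield the same identity, while `27.0.0` yields a new one.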
The operator also labels each pod with this cluster identity, and the discovery service selector is updated to match the current major version on every reconcile loop, so stale selectors after upgrades are corrected automatically.
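As an illustration of the mechanism (the label key below is invented for the example; the operator's actual key may differ), the headless discovery Service ends up selecting only pods from the current major version:

```yaml
# Illustrative only - the isolation label key is an assumption
apiVersion: v1
kind: Service
metadata:
  name: keycloak-discovery
spec:
  clusterIP: None                  # headless: used for JGroups/DNS discovery
  selector:
    app: keycloak
    cache-cluster: keycloak-v26    # pods labeled for the current major version
```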
Alternative: clusterName for non-semver or custom names¶
If you use non-semver tags (e.g., :nightly, :latest, or custom CI builds), use an explicit cluster name:
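Mirroring the values layout from the autoRevision example (registry and cluster name below are placeholders):

```yaml
# keycloak-values.yaml
keycloak:
  image: registry.example.com/internal/keycloak   # placeholder image
  version: "nightly"
  cacheIsolation:
    clusterName: sso-prod   # any stable string; pods sharing it form one cluster
```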
:latest and autoRevision
If autoRevision: true is set but the image tag is non-semver (:latest, :nightly, SHA digest), the operator cannot determine a major version. It will log a warning and disable cache isolation for that instance. You must use clusterName or a semver-tagged image in this case.
What is isolated — and what survives an upgrade¶
| Data | Survives upgrade? | Why |
|---|---|---|
| User sessions | ✅ Yes | Stored in the database |
| Realm & client config | ✅ Yes | Stored in the database |
| Offline tokens | ✅ Yes | Stored in the database |
| In-progress login flows (action tokens) | ⚠️ Lost during upgrade window | Stored in Infinispan only |
| Active SSO sessions (not yet flushed) | ⚠️ May be lost | Flushed periodically to DB |
Users mid-flow during the rolling upgrade window (typically seconds to a few minutes) may need to restart the flow. This is the same behaviour as any rolling pod restart.
Priority of cacheIsolation options¶
If multiple fields are set, the operator applies this resolution order (highest priority first):
1. clusterName — explicit static name, always wins
2. autoRevision — derives <name>-v<major> from image tag
3. autoSuffix — appends the full image tag as-is
4. No isolation — pods join Keycloak's default cluster
Comparison with Official Keycloak Operator¶
Overview¶
| Aspect | This Operator | Official Keycloak Operator |
|---|---|---|
| Primary Focus | GitOps-native, multi-tenant | General Keycloak deployment |
| Language | Python (Kopf) | Go (Operator SDK) |
| CRDs | Keycloak, KeycloakRealm, KeycloakClient | Keycloak, KeycloakRealmImport |
| Authorization | Namespace grant lists + RBAC | RBAC + direct access |
| Multi-tenancy | First-class support | Limited |
| GitOps Compatibility | Excellent | Good |
| Secret Management | Kubernetes-native | Kubernetes + Keycloak |
| Database | CloudNativePG (CNPG) primary | External PostgreSQL |
When to Use This Operator¶
✅ Choose this operator if:
- Multi-tenant environment (10+ teams)
- GitOps-first workflow (ArgoCD, Flux)
- Strong namespace isolation required
- Declarative authorization via grant lists
- CloudNativePG database management preferred
When to Use Official Operator¶
✅ Choose official operator if:
- Single-tenant environment
- Need Keycloak's built-in security model
- Organization policy requires official/upstream operators
- Integration with Red Hat/RHSSO required
- Prefer Go-based operators
- Need features not yet in this operator
Feature Comparison¶
Realm Management¶
| Feature | This Operator | Official Operator |
|---|---|---|
| Declarative realm config | ✅ KeycloakRealm CRD | ✅ KeycloakRealmImport |
| Live realm updates | ✅ Automatic reconciliation | ⚠️ Import-based |
| Drift detection | ✅ Built-in | ❌ Not supported |
| Multi-namespace realms | ✅ Fully supported | ⚠️ Limited |
| Realm deletion | ✅ Automatic | ⚠️ Manual |
Client Management¶
| Feature | This Operator | Official Operator |
|---|---|---|
| Declarative client config | ✅ KeycloakClient CRD | ⚠️ Via RealmImport |
| Client secret management | ✅ Automatic Kubernetes secret | ⚠️ Via RealmImport |
| Protocol mappers | ✅ CRD support | ✅ Via RealmImport |
| Service accounts | ✅ CRD support | ✅ Via RealmImport |
| Cross-namespace clients | ✅ Fully supported | ❌ Not supported |
Security Model¶
| Feature | This Operator | Official Operator |
|---|---|---|
| Authorization method | Namespace Grant + RBAC | Keycloak admin credentials |
| Client secret rotation | ✅ Automatic | ❌ Manual |
| Multi-tenant isolation | ✅ Namespace Grant Lists | ⚠️ RBAC-based |
| Audit trail | ✅ K8s API + ConfigMap | ⚠️ Keycloak logs |
| Secret distribution | ✅ GitOps-friendly | ⚠️ Manual |
Operations¶
| Feature | This Operator | Official Operator |
|---|---|---|
| Database management | ✅ CNPG integration | ⚠️ External required |
| Backup/restore | ✅ Via CNPG | ⚠️ Manual |
| High availability | ✅ Multi-replica support | ✅ Multi-replica support |
| Monitoring | ✅ Prometheus metrics | ✅ Prometheus metrics |
| Rate limiting | ✅ Built-in API rate limiting | ❌ Not supported |
Migration from Official Operator¶
Not Automated - Migration requires manual steps:
1. Export data from existing Keycloak
2. Deploy this operator alongside (different namespace)
3. Create new Keycloak instance with this operator
4. Import realm exports - create KeycloakRealm CRDs based on exports
5. Create KeycloakClient CRDs for each client
6. Switch application traffic to new Keycloak
7. Decommission old operator after verification

Note: Direct migration is complex. Recommend running both operators in parallel during transition.
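When recreating realms as CRDs, a minimal KeycloakRealm might look roughly like this; the API version suffix and spec field are assumptions inferred from the CRD group seen earlier, not the documented schema:

```yaml
apiVersion: vriesdemichael.github.io/v1   # version suffix is an assumption
kind: KeycloakRealm
metadata:
  name: my-realm
  namespace: team-a
spec:
  realmName: my-realm   # hypothetical field name
```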
Backup & Rollback¶
Pre-Upgrade Backup¶
Always backup before major changes:
#!/bin/bash
# Full backup script
BACKUP_DIR="keycloak-backup-$(date +%Y%m%d-%H%M%S)"
mkdir -p ${BACKUP_DIR}
# Backup resources
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces -o yaml \
> ${BACKUP_DIR}/resources.yaml
# Backup operator config
helm get values keycloak-operator -n keycloak-operator-system \
> ${BACKUP_DIR}/operator-values.yaml
# Backup CRDs
kubectl get crd -o name | grep "vriesdemichael.github.io" \
  | xargs kubectl get -o yaml \
  > ${BACKUP_DIR}/crds.yaml
# Backup database (if using CNPG)
kubectl cnpg backup keycloak-db -n keycloak-db
echo "Backup complete: ${BACKUP_DIR}"
Database Backup (CloudNativePG)¶
# Trigger manual backup
kubectl cnpg backup keycloak-db -n keycloak-db
# List backups
kubectl get backup -n keycloak-db
# Verify backup succeeded
kubectl describe backup <backup-name> -n keycloak-db
Restore from Backup¶
Restore Kubernetes Resources:
# Restore all resources
kubectl apply -f keycloak-backup-<date>/resources.yaml
# Verify resources restored
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces
Restore Database (see Backup & Restore Guide):
# CNPG has no restore subcommand: recovery is declarative.
# Create a new Cluster whose spec.bootstrap.recovery.backup.name
# points at <backup-name>, then apply it:
kubectl apply -f cluster-restore.yaml   # Cluster manifest with bootstrap.recovery
Rollback Operator¶
# Rollback to previous Helm release
helm rollback keycloak-operator -n keycloak-operator-system
# Or rollback to specific revision
helm history keycloak-operator -n keycloak-operator-system
helm rollback keycloak-operator <revision> -n keycloak-operator-system
# Verify rollback
kubectl get pods -n keycloak-operator-system
Emergency Procedures¶
Operator Completely Broken:
# Uninstall operator (resources remain)
helm uninstall keycloak-operator -n keycloak-operator-system
# Resources continue working (Keycloak still serves traffic)
# Reinstall operator when ready:
helm install keycloak-operator ./charts/keycloak-operator \
--namespace keycloak-operator-system \
--values operator-values-backup.yaml
Keycloak Database Corrupted:
# Restore from backup (requires downtime)
kubectl delete cluster keycloak-db -n keycloak-db
# Recreate the Cluster from backup: apply a Cluster manifest whose
# spec.bootstrap.recovery.backup.name points at <backup-name>
# (CNPG has no restore subcommand; see the Backup & Restore Guide)
kubectl apply -f cluster-restore.yaml
# Wait for database to come back
kubectl wait --for=condition=Ready cluster/keycloak-db \
-n keycloak-db --timeout=10m
# Restart Keycloak pods
kubectl rollout restart statefulset/<keycloak-name> -n <namespace>
Best Practices¶
Upgrade Strategy¶
- Test First - Always test upgrades in non-production
- Backup Always - Never upgrade without recent backup
- Read Release Notes - Check for breaking changes
- Rolling Updates - Use rolling updates for zero downtime
- Verify Thoroughly - Test all critical flows after upgrade
- Monitor - Watch metrics and logs during upgrade
- Have Rollback Plan - Know how to rollback before starting
Maintenance Windows¶
Schedule upgrades during low-traffic periods:
# Check current traffic
kubectl exec -n keycloak-operator-system deployment/keycloak-operator -- \
curl -s localhost:8081/metrics | grep keycloak_operator_reconciliation_total
# Notify users of maintenance window
# Perform upgrade
# Verify and re-enable traffic
Documentation¶
Document your upgrade:
- Pre-upgrade state (versions, configurations)
- Steps taken
- Issues encountered
- Resolution steps
- Post-upgrade verification
- Rollback procedure used (if any)