Backup & Restore Guide

Backup and restore procedures for Keycloak and its PostgreSQL database, which is managed with CloudNativePG.

What Gets Backed Up

| Component | Content | Backup Method |
|---|---|---|
| Database | Users, realms, clients, sessions | CloudNativePG barman |
| Kubernetes Resources | CRDs, manifests | kubectl export |
| Token Metadata | Token rotation state | ConfigMap backup |
| Secrets | Credentials (⚠️ encrypt) | kubectl export |

Not backed up: operator code and container images. Restore these from the image registry instead.


Quick Backup

One-Command Backup

#!/bin/bash
set -euo pipefail

BACKUP_DIR="keycloak-backup-$(date +%Y%m%d-%H%M%S)"
mkdir -p "${BACKUP_DIR}"

# Back up Kubernetes resources
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces -o yaml \
  > "${BACKUP_DIR}/resources.yaml"

# Back up token metadata
kubectl get configmap keycloak-operator-token-metadata \
  -n keycloak-operator-system -o yaml \
  > "${BACKUP_DIR}/token-metadata.yaml"

# Trigger database backup (creates a Backup resource that completes asynchronously)
kubectl cnpg backup keycloak-db -n keycloak-db

echo "Resources saved to ${BACKUP_DIR}; database backup requested"

Database Backup

Configure Automatic Backups

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-db
  namespace: keycloak-db
spec:
  instances: 3

  backup:
    barmanObjectStore:
      destinationPath: s3://my-backup-bucket/keycloak-db
      s3Credentials:
        accessKeyId:
          name: backup-s3-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-s3-credentials
          key: ACCESS_SECRET_KEY
      wal:
        compression: gzip
      data:
        compression: gzip
    retentionPolicy: "30d"

Scheduled Backups

apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: keycloak-db-daily
  namespace: keycloak-db
spec:
  schedule: "0 0 2 * * *"  # Daily at 02:00 (six-field cron: seconds come first)
  backupOwnerReference: self
  cluster:
    name: keycloak-db
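
CloudNativePG documents the `schedule` field as using the Go cron format, which has six fields with seconds first, unlike the five-field Unix crontab. A small pre-flight check can catch a five-field expression before the manifest is applied. This is only a sketch: the helper name `has_six_cron_fields` is hypothetical, and it counts fields without validating their values.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the expression has exactly six
# whitespace-separated fields
# (seconds minute hour day-of-month month day-of-week).
# It checks the field count only, not whether each field is valid.
has_six_cron_fields() {
  local expr="$1" count
  # awk splits on whitespace and reports the number of fields on the first line.
  count="$(printf '%s\n' "$expr" | awk 'NR == 1 { print NF }')"
  [ "$count" -eq 6 ]
}
```

For example, `has_six_cron_fields "0 0 2 * * *"` succeeds, while a crontab-style `"0 2 * * *"` is rejected.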

Manual Backup

# Trigger backup
kubectl cnpg backup keycloak-db -n keycloak-db

# List backups
kubectl get backup -n keycloak-db

# Check backup status
kubectl describe backup <backup-name> -n keycloak-db
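
`kubectl cnpg backup` returns as soon as the Backup resource is created, so a script that depends on the result has to poll for it. The sketch below assumes the Backup resource reports `completed` or `failed` in `.status.phase`; the function name `wait_for_backup` and its default timeout and interval are hypothetical.

```shell
#!/bin/bash
# Sketch: poll a CloudNativePG Backup until it reaches a terminal phase.
# Arguments: <backup-name> <namespace> [timeout-seconds] [poll-interval-seconds]
# Returns 0 on completion, 1 on failure, 2 on timeout.
wait_for_backup() {
  local name="$1" namespace="$2" timeout="${3:-600}" interval="${4:-10}"
  local waited=0 phase=""
  while [ "$waited" -le "$timeout" ]; do
    # .status.phase is empty until the operator has picked the Backup up.
    phase="$(kubectl get backup "$name" -n "$namespace" \
      -o jsonpath='{.status.phase}' 2>/dev/null || true)"
    case "$phase" in
      completed) echo "backup $name completed"; return 0 ;;
      failed)    echo "backup $name failed" >&2; return 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out after ${timeout}s (last phase: ${phase:-unknown})" >&2
  return 2
}
```

Usage: `wait_for_backup <backup-name> keycloak-db 900 15`, with the name taken from `kubectl get backup -n keycloak-db`.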

Kubernetes Resources Backup

Backup Script

#!/bin/bash
set -euo pipefail

BACKUP_DIR="k8s-backup-$(date +%Y%m%d)"
mkdir -p "${BACKUP_DIR}"

# Back up all Keycloak resources
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces -o yaml \
  > "${BACKUP_DIR}/keycloak-resources.yaml"

# Back up operator configuration (-o yaml omits the "USER-SUPPLIED VALUES:" header)
helm get values keycloak-operator -n keycloak-operator-system -o yaml \
  > "${BACKUP_DIR}/operator-values.yaml"

# Back up token metadata
kubectl get configmap keycloak-operator-token-metadata \
  -n keycloak-operator-system -o yaml \
  > "${BACKUP_DIR}/token-metadata.yaml"

# Back up the operator's CRDs (select them by name so the output stays valid YAML)
kubectl get crd -o name | grep "vriesdemichael.github.io" \
  | xargs kubectl get -o yaml > "${BACKUP_DIR}/crds.yaml"

# Back up secrets (⚠️ ENCRYPT THIS FILE)
kubectl get secret --all-namespaces -l vriesdemichael.github.io/managed-by=keycloak-operator \
  -o yaml > "${BACKUP_DIR}/secrets.yaml"

echo "Backup saved to: ${BACKUP_DIR}"
echo "⚠️ IMPORTANT: Encrypt secrets.yaml before storing!"
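
A redirect such as `> file` creates the file even when the command before it fails, so a backup directory can look complete while holding empty files. The following sketch checks that every file the script writes exists and is non-empty. The helper name `verify_backup_dir` is hypothetical, and the check is meant to run right after the backup script, before `secrets.yaml` is encrypted and removed.

```shell
#!/bin/bash
# Sketch: fail if any expected backup file is missing or empty.
# The file list mirrors the backup script and must be kept in sync with it.
verify_backup_dir() {
  local dir="$1" file missing=0
  for file in keycloak-resources.yaml operator-values.yaml \
              token-metadata.yaml crds.yaml secrets.yaml; do
    # -s is true only when the file exists and is larger than zero bytes.
    if [ ! -s "$dir/$file" ]; then
      echo "missing or empty: $dir/$file" >&2
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}
```

Usage: `verify_backup_dir "${BACKUP_DIR}" || exit 1`.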

Encrypt Secrets

# Using GPG (writes secrets.yaml.gpg next to the original)
gpg --symmetric --cipher-algo AES256 "${BACKUP_DIR}/secrets.yaml"

# Or using age
age -p "${BACKUP_DIR}/secrets.yaml" > "${BACKUP_DIR}/secrets.yaml.age"

# Remove plaintext once the encrypted file has been written
rm "${BACKUP_DIR}/secrets.yaml"
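
Before the backup directory is uploaded anywhere, it is worth confirming that the plaintext file is really gone and that an encrypted copy exists. This is a minimal sketch; `check_secrets_encrypted` is a hypothetical helper name, and it only checks file presence, not that the ciphertext can be decrypted.

```shell
#!/bin/bash
# Sketch: succeed only if secrets.yaml has been removed and an encrypted
# copy (GPG or age output) is present in the backup directory.
check_secrets_encrypted() {
  local dir="$1"
  if [ -e "$dir/secrets.yaml" ]; then
    echo "plaintext secrets.yaml still present in $dir" >&2
    return 1
  fi
  # Accept either the GPG (.gpg) or the age (.age) output file.
  if [ -s "$dir/secrets.yaml.gpg" ] || [ -s "$dir/secrets.yaml.age" ]; then
    return 0
  fi
  echo "no encrypted secrets file found in $dir" >&2
  return 1
}
```

Usage: `check_secrets_encrypted "${BACKUP_DIR}" && echo "safe to upload"`.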

Database Restore

Full Cluster Restore

# 1. Delete existing cluster (⚠️ DOWNTIME)
kubectl delete cluster keycloak-db -n keycloak-db

# 2. Confirm the cluster's PVCs are gone (re-run until none are listed)
kubectl get pvc -n keycloak-db

# 3. Create restore manifest
cat <<EOF | kubectl apply -f -
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-db
  namespace: keycloak-db
spec:
  instances: 3

  bootstrap:
    recovery:
      source: keycloak-db-backup

  externalClusters:
    - name: keycloak-db-backup
      barmanObjectStore:
        destinationPath: s3://my-backup-bucket/keycloak-db
        s3Credentials:
          accessKeyId:
            name: backup-s3-credentials
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-s3-credentials
            key: ACCESS_SECRET_KEY
EOF

# 4. Wait for restore
kubectl wait --for=condition=Ready cluster/keycloak-db \
  -n keycloak-db --timeout=10m

# 5. Verify data
kubectl exec -it -n keycloak-db keycloak-db-1 -- \
  psql -U keycloak -d keycloak -c "SELECT COUNT(*) FROM public.realm;"

Point-in-Time Restore

bootstrap:
  recovery:
    source: keycloak-db-backup
    recoveryTarget:
      targetTime: "2025-01-15 10:00:00+00"  # UTC timestamp
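
The `targetTime` value above is a timestamp with an explicit UTC offset. When the recovery point is known as a Unix epoch (for example from a log line), a helper can render it in the same form. This sketch assumes GNU `date`, because the `-d @EPOCH` form is not available in BSD or macOS `date`; the function name `pitr_target_time` is hypothetical.

```shell
#!/bin/bash
# Sketch: format epoch seconds as a UTC timestamp in the same
# "YYYY-MM-DD HH:MM:SS+00" form used for targetTime above.
# Assumes GNU date.
pitr_target_time() {
  local epoch="$1"
  date -u -d "@${epoch}" '+%Y-%m-%d %H:%M:%S+00'
}
```

For example, `pitr_target_time 1736935200` prints `2025-01-15 10:00:00+00`, and `pitr_target_time "$(( $(date +%s) - 3600 ))"` gives a target one hour in the past.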

Restore to New Cluster

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-db-restored  # Different name
  namespace: keycloak-db
spec:
  instances: 3
  bootstrap:
    recovery:
      source: keycloak-db-backup

  externalClusters:
    - name: keycloak-db-backup
      barmanObjectStore:
        destinationPath: s3://my-backup-bucket/keycloak-db
        s3Credentials:
          accessKeyId:
            name: backup-s3-credentials
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-s3-credentials
            key: ACCESS_SECRET_KEY

Kubernetes Resources Restore

Restore All Resources

# 1. Restore CRDs first
kubectl apply -f k8s-backup-20250115/crds.yaml

# 2. Restore secrets (decrypt first)
gpg --decrypt k8s-backup-20250115/secrets.yaml.gpg | kubectl apply -f -

# 3. Restore token metadata
kubectl apply -f k8s-backup-20250115/token-metadata.yaml

# 4. Restore Keycloak resources
kubectl apply -f k8s-backup-20250115/keycloak-resources.yaml

# 5. Verify
kubectl get keycloak,keycloakrealm,keycloakclient --all-namespaces

Selective Restore

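# kubectl cannot select objects from a backup file by name, so these examples
# filter the file with yq v4 (mikefarah/yq), which must be installed separately

# Restore single realm
yq '.items[] | select(.kind == "KeycloakRealm" and .metadata.name == "my-realm" and .metadata.namespace == "my-app")' \
  k8s-backup-20250115/keycloak-resources.yaml | kubectl apply -f -

# Restore single namespace
yq '.items[] | select(.metadata.namespace == "my-app")' \
  k8s-backup-20250115/keycloak-resources.yaml | kubectl apply -f -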

Disaster Recovery Procedures

Scenario 1: Database Corruption

Symptoms: Data integrity errors, query failures.

Recovery:

# 1. Scale down Keycloak (prevent new writes)
kubectl scale keycloak keycloak -n keycloak-system --replicas=0

# 2. Restore database from backup (see above)

# 3. Verify database integrity
kubectl exec -it -n keycloak-db keycloak-db-1 -- \
  psql -U keycloak -d keycloak -c "
    SELECT tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
    FROM pg_tables WHERE schemaname = 'public' LIMIT 5;
  "

# 4. Scale up Keycloak
kubectl scale keycloak keycloak -n keycloak-system --replicas=3

# 5. Test authentication

RTO: 15-30 minutes
RPO: Time since the last archived WAL segment, or since the last base backup if WAL archiving is not configured

Scenario 2: Accidental Resource Deletion

Symptoms: Realm/client deleted from Kubernetes and Keycloak.

Recovery:

# 1. Find resource in backup
grep -A50 "name: my-realm" k8s-backup-20250115/keycloak-resources.yaml

# 2. Restore resource
kubectl apply -f - <<EOF
# (paste resource YAML)
EOF

# 3. Verify reconciliation
kubectl describe keycloakrealm my-realm -n my-app

RTO: 5-10 minutes
RPO: Last backup time

Scenario 3: Complete Cluster Loss

Symptoms: Entire Kubernetes cluster destroyed.

Recovery:

# 1. Deploy new Kubernetes cluster

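# 2. Install operators
helm repo add cnpg https://cloudnative-pg.github.io/charts
helm repo update
helm install cnpg cnpg/cloudnative-pg -n cnpg-system --create-namespace
helm install keycloak-operator ./charts/keycloak-operator -n keycloak-operator-system --create-namespace

# 3. Restore database
# (Recreate the backup-s3-credentials secret in the keycloak-db namespace first,
#  then apply the restore manifest from the Full Cluster Restore procedure above)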

# 4. Restore Kubernetes resources
# (Use Kubernetes Resources Restore procedure above)

# 5. Verify end-to-end

RTO: 2-4 hours
RPO: Last backup time


Backup Verification

Test Restore Monthly

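#!/bin/bash
# Monthly backup test script
set -euo pipefail

# 1. Create test namespace and copy the S3 credentials into it
#    (the test cluster can only reference secrets in its own namespace)
kubectl create namespace backup-test
s3_key() {
  kubectl get secret backup-s3-credentials -n keycloak-db \
    -o jsonpath="{.data.$1}" | base64 -d
}
kubectl create secret generic backup-s3-credentials -n backup-test \
  --from-literal=ACCESS_KEY_ID="$(s3_key ACCESS_KEY_ID)" \
  --from-literal=ACCESS_SECRET_KEY="$(s3_key ACCESS_SECRET_KEY)"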

# 2. Restore database to test cluster
cat <<EOF | kubectl apply -f -
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-db-test
  namespace: backup-test
spec:
  instances: 1
  bootstrap:
    recovery:
      source: keycloak-db-backup
  externalClusters:
    - name: keycloak-db-backup
      barmanObjectStore:
        destinationPath: s3://my-backup-bucket/keycloak-db
        s3Credentials:
          accessKeyId:
            name: backup-s3-credentials
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-s3-credentials
            key: ACCESS_SECRET_KEY
EOF

# 3. Wait for restore
kubectl wait --for=condition=Ready cluster/keycloak-db-test -n backup-test --timeout=10m

# 4. Verify data
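# No -it: the script runs unattended, so no TTY is available.
# One -c per query so that every count is printed, regardless of psql version.
kubectl exec -n backup-test keycloak-db-test-1 -- \
  psql -U keycloak -d keycloak \
    -c "SELECT COUNT(*) FROM public.realm;" \
    -c "SELECT COUNT(*) FROM public.user_entity;" \
    -c "SELECT COUNT(*) FROM public.client;"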

# 5. Cleanup
kubectl delete namespace backup-test

echo "Backup test complete ✓"

Backup Monitoring

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: backup-alerts
  namespace: keycloak-db
spec:
  groups:
    - name: backups
      rules:
        - alert: BackupFailed
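          # Metric names as listed in the CloudNativePG monitoring docs; verify them against your operator version
          expr: cnpg_collector_last_failed_backup_timestamp > cnpg_collector_last_available_backup_timestamp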
          labels:
            severity: critical
          annotations:
            summary: "Backup failed for {{ $labels.cluster }}"

        - alert: BackupOld
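          # Metric name as listed in the CloudNativePG monitoring docs; verify it against your operator version
          expr: time() - cnpg_collector_last_available_backup_timestamp > 86400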
          labels:
            severity: warning
          annotations:
            summary: "No successful backup in 24h for {{ $labels.cluster }}"
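
The `BackupOld` rule fires when the last successful backup is more than 86400 seconds (24 hours) old. The same comparison can be run ad hoc from a script, for instance in a pre-maintenance check. This is a sketch of that arithmetic only; `backup_is_stale` is a hypothetical helper and the caller has to supply the timestamp of the last successful backup as epoch seconds.

```shell
#!/bin/bash
# Sketch: succeed (exit 0) when the last successful backup is older than
# max_age seconds. Arguments: <last-success-epoch> [now-epoch] [max-age-seconds]
backup_is_stale() {
  local last_success="$1" now="${2:-$(date +%s)}" max_age="${3:-86400}"
  # Strictly greater than, matching the "> 86400" comparison in the alert rule.
  [ $(( now - last_success )) -gt "$max_age" ]
}
```

Usage: `backup_is_stale "$last_success_epoch" && echo "no successful backup in 24h"`.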

Best Practices

1. Backup Frequency

| Environment | Database | Kubernetes Resources | Retention |
|---|---|---|---|
| Production | Hourly | Daily | 30 days |
| Staging | Daily | Weekly | 14 days |
| Development | Daily | Weekly | 7 days |

2. Storage Strategy

  • Primary: S3/GCS/Azure Blob (encrypted)
  • Secondary: Different region/provider
  • Tertiary: Offline/tape (compliance)

3. Encryption

Always encrypt backups containing:

  • Kubernetes secrets
  • Database dumps
  • Token metadata

4. Testing

  • Monthly restore tests (automated)
  • Quarterly disaster recovery drills
  • Document restore procedures
  • Train team on restore process

5. Retention

retentionPolicy: "30d"  # Base backups; WAL archives are retained for PITR within this window
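
CloudNativePG expresses `retentionPolicy` as a positive integer followed by a unit letter (`d` for days, `w` for weeks, `m` for months). To compare a policy against the retention targets per environment, it helps to convert it to days. The sketch below handles only `d` and `w`, since the exact length of a month depends on the backup tool; `retention_to_days` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: convert a retention policy such as "30d" or "2w" to a number of days.
# Months ("m") are deliberately rejected because their length in days is
# implementation-defined.
retention_to_days() {
  local policy="$1" count unit
  count="${policy%?}"          # everything except the last character
  unit="${policy#"$count"}"    # the last character
  case "$count" in
    ''|0*|*[!0-9]*)
      echo "unsupported retention policy: $policy" >&2
      return 1 ;;
  esac
  case "$unit" in
    d) echo "$count" ;;
    w) echo $(( count * 7 )) ;;
    *) echo "unsupported retention unit in: $policy" >&2
       return 1 ;;
  esac
}
```

For example, `retention_to_days 30d` prints `30` and `retention_to_days 2w` prints `14`.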

Troubleshooting

Backup Fails with S3 Error

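# Test S3 access with the same credentials the cluster uses
kubectl run aws-cli --rm -it --restart=Never --image=amazon/aws-cli -n keycloak-db \
  --env="AWS_ACCESS_KEY_ID=$(kubectl get secret backup-s3-credentials -n keycloak-db -o jsonpath='{.data.ACCESS_KEY_ID}' | base64 -d)" \
  --env="AWS_SECRET_ACCESS_KEY=$(kubectl get secret backup-s3-credentials -n keycloak-db -o jsonpath='{.data.ACCESS_SECRET_KEY}' | base64 -d)" \
  -- s3 ls s3://my-backup-bucket/ --region us-east-1

# Verify the secret exists and contains both keys (values are not printed)
kubectl describe secret backup-s3-credentials -n keycloak-db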

Restore Hangs

# Check cluster events
kubectl describe cluster keycloak-db -n keycloak-db

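# Check pod logs (during recovery the work may run in a separate recovery pod;
# list the pods first if keycloak-db-1 does not exist yet)
kubectl get pods -n keycloak-db
kubectl logs -n keycloak-db keycloak-db-1

# Verify backup exists (barman stores base backups under
# <destinationPath>/<cluster-name>/base/)
kubectl run aws-cli --rm -it --image=amazon/aws-cli -- \
  s3 ls s3://my-backup-bucket/keycloak-db/keycloak-db/base/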

Data Mismatch After Restore

# Check backup timestamp
kubectl describe backup <backup-name> -n keycloak-db

# Verify you restored correct backup
# Consider point-in-time recovery if needed
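
To check which completed backup is the newest one at or before a given moment, the list of Backup resources can be filtered by completion time. The sketch below assumes each input line holds a backup name and its `.status.stoppedAt` value as a UTC RFC 3339 timestamp (such as `2025-01-15T02:00:05Z`), separated by whitespace; `latest_backup_before` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: read "<name> <stoppedAt>" lines on stdin and print the name of the
# newest backup whose stoppedAt is at or before the target timestamp.
# UTC RFC 3339 timestamps in one format sort lexicographically, so a plain
# string comparison is sufficient. Exits 1 when no backup qualifies.
latest_backup_before() {
  local target="$1"
  awk -v target="$target" '
    NF >= 2 {
      stopped = $2 ""                       # force string comparison
      if (stopped <= (target "") && stopped > (best_time "")) {
        best_time = stopped
        best_name = $1
      }
    }
    END { if (best_name != "") print best_name; else exit 1 }
  '
}

# Example input from the cluster (backups still running have no stoppedAt
# and are skipped because their line has a single field):
#   kubectl get backup -n keycloak-db \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.stoppedAt}{"\n"}{end}' \
#     | latest_backup_before "2025-01-15T10:00:00Z"
```

If the name printed differs from the backup you expected to restore, a point-in-time recovery with an explicit `targetTime` is likely the safer option.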