ADR-067: Integration test coverage collection via SIGUSR1 signal

Category: development
Provenance: guided-ai

Decision

Coverage data from integration tests running in Kubernetes pods must be collected using SIGUSR1 signal handling rather than relying on process termination. The operator registers a signal handler that flushes coverage data on demand, allowing retrieval while the process continues running.
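A minimal sketch of such a handler follows; it is not the operator's actual code. The function name `install_coverage_flush_handler` and its `flush` and `environ` parameters are hypothetical, added so the behaviour can be exercised without a running operator. It assumes coverage.py was started through `COVERAGE_PROCESS_START`, so that `coverage.Coverage.current()` returns the active instance.

```python
import os
import signal


def install_coverage_flush_handler(flush=None, environ=None):
    """Register a SIGUSR1 handler that flushes coverage data on demand.

    Returns True if the handler was installed, False if coverage is disabled.
    `flush` and `environ` are hypothetical injection points for testing.
    """
    environ = os.environ if environ is None else environ

    # Production safety: do nothing unless coverage was explicitly enabled.
    if "COVERAGE_PROCESS_START" not in environ:
        return False

    if flush is None:
        def flush():
            # Imported lazily so production images need not ship coverage.py.
            import coverage

            cov = coverage.Coverage.current()  # None if no measurement is active
            if cov is not None:
                cov.save()

    def _on_sigusr1(signum, frame):
        # Write the data collected so far; the process keeps running afterwards.
        flush()

    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGUSR1, _on_sigusr1)
    return True
```

Calling `install_coverage_flush_handler()` once during startup would be enough under these assumptions; when the variable is unset the function returns immediately and no handler is registered.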

Rationale

- Kubernetes termination issues: When pods are deleted, atexit handlers often do not run, or the coverage data is lost before it can be retrieved.
- Signal-based approach: SIGUSR1 allows explicit, controlled flushing while the process continues running.
- Works with parallel tests: Retrieval is triggered from pytest-xdist's pytest_sessionfinish hook on the controller node, so it happens once, after all workers complete.
- Path differences: The host uses relative paths (src/keycloak_operator), while the container uses absolute paths (/app/src/keycloak_operator), so the two environments need separate coverage configs.
- Production safety: The signal handler is registered only when the COVERAGE_PROCESS_START environment variable is set, so production deployments incur no overhead.
- Combined coverage: Retrieval happens before cluster teardown, which allows unit test and integration test coverage data to be combined.
- Current baseline: 54.5% combined coverage, with integration tests contributing real operator execution paths.
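To make the path difference concrete, the sketch below maps a container path back to its host-relative form. It is only an illustration of the mapping: in practice coverage.py performs this kind of remapping itself through the `[paths]` section of its config when data files are combined. The helper `to_host_path` and the `/app` root constant are assumptions inferred from the paths quoted above.

```python
from pathlib import PurePosixPath

# Assumed container root, inferred from /app/src/keycloak_operator.
CONTAINER_ROOT = PurePosixPath("/app")


def to_host_path(container_path):
    """Return the host-relative equivalent of a path recorded in the container.

    Paths outside CONTAINER_ROOT (for example, installed libraries) are
    returned unchanged.
    """
    path = PurePosixPath(container_path)
    try:
        return str(path.relative_to(CONTAINER_ROOT))
    except ValueError:
        # relative_to raises ValueError when path is not under CONTAINER_ROOT.
        return str(path)
```

For example, `to_host_path("/app/src/keycloak_operator/__init__.py")` yields `src/keycloak_operator/__init__.py`, which matches the form the host-side data uses.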

Agent Instructions

When modifying integration test infrastructure, preserve the coverage collection mechanism. The operator listens for SIGUSR1 and calls coverage.save() when the signal arrives. The pytest_sessionfinish hook in conftest.py triggers coverage retrieval by calling an internal script, which sends SIGUSR1 to PID 1 in the operator pod, waits for the flush, then retrieves the files via kubectl cp. Never rely on atexit handlers for coverage in containerized tests; they are unreliable under Kubernetes pod termination. Coverage config differs between host (.coveragerc) and container (images/operator/coveragerc.container) because the path structures differ.
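The retrieval flow can be sketched roughly as below. This is not the internal script itself, which is not shown here. The pod name, namespace, directories, and the `retrieve_coverage` and `is_xdist_worker` helpers are all hypothetical. The sketch also assumes the container image provides a `kill` binary and `tar` (which `kubectl cp` relies on), and it uses the `workerinput` attribute that pytest-xdist sets on worker configs to skip retrieval on workers.

```python
import subprocess
import time

# Hypothetical values for illustration only.
NAMESPACE = "keycloak-system"
POD = "keycloak-operator-0"
REMOTE_DIR = "/tmp/coverage"
LOCAL_DIR = ".coverage-integration"


def is_xdist_worker(config):
    """pytest-xdist sets `workerinput` on worker configs, not on the controller."""
    return hasattr(config, "workerinput")


def retrieve_coverage(pod, namespace, remote_dir, local_dir,
                      run=subprocess.run, sleep=time.sleep, flush_wait=2.0):
    """Signal the operator to flush coverage, wait, then copy the files out."""
    # Ask PID 1 (the operator process) to flush its coverage data.
    run(["kubectl", "exec", "-n", namespace, pod, "--", "kill", "-USR1", "1"],
        check=True)
    # Give the signal handler time to write the data file before copying.
    sleep(flush_wait)
    # Copy the flushed data out of the pod while the cluster still exists.
    run(["kubectl", "cp", "-n", namespace, f"{pod}:{remote_dir}", local_dir],
        check=True)


def pytest_sessionfinish(session, exitstatus):
    # Only the controller retrieves coverage, so it happens exactly once.
    if is_xdist_worker(session.config):
        return
    retrieve_coverage(POD, NAMESPACE, REMOTE_DIR, LOCAL_DIR)
```

The `run` and `sleep` parameters exist only so the command sequence can be checked without a cluster; a fixed wait is the simplest option here, and polling for the data file would be an alternative if the flush time varies.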