Routine tasks — overview

Nine recurring operator workflows that own the fleet's running state: secret rotation, operator bumps, fleet onboarding, policy rollout, evidence backfill, kubeadmin and RHACS Central rotation, division onboarding, Loki OBC bridge.

This subsection is the operator’s playbook for the recurring work the fleet requires during steady state. Each task is small enough to complete in one session, common enough to standardise, and important enough that rushing it causes real damage. Read every page once; come back when the task is on your plate.

Every routine task lands as a platform-gitops MR. There are no exceptions — secret rotations land via ESO config changes, evidence backfills land via lifecycle rule updates, even RHACS init-bundle rotations land via Vault writes + ExternalSecret refreshes captured in GitOps. The MR loop is the standardisation; the per-task page tells you what to put in the MR.

The five-step shape

Every page in this subsection follows the same shape:

  1. When it runs. Cadence (calendar-driven, alert-driven, install-driven) and the trigger that surfaces the task on TODO.
  2. What is in scope. The specific systems/credentials/files touched. Boundaries against adjacent tasks.
  3. Pre-checks. The state you confirm before you start mutating. Always read-only.
  4. The change. The actual MR / commands / config that implement the rotation/bump/onboarding.
  5. Validation. The end-of-task evidence that the fleet returned to steady state.

If a task page does not have those five sections, the page is incomplete; file an issue under #229 against this section.
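The completeness rule above can be checked mechanically. The following is a minimal sketch, assuming each task page is available as plain text and that the five section titles appear at the start of a line, optionally behind a heading marker or list number. The function name `missing_sections` and that layout assumption are illustrative and not part of the site tooling.

```python
import re

# The five section titles from the list above. Matching them at the start
# of a line is an assumption about how task pages are laid out.
REQUIRED_SECTIONS = [
    "When it runs",
    "What is in scope",
    "Pre-checks",
    "The change",
    "Validation",
]


def missing_sections(page_text: str) -> list[str]:
    """Return the required section titles that no line of the page starts with."""
    missing = []
    for title in REQUIRED_SECTIONS:
        # Allow an optional heading marker ("##") or list number ("3.") first.
        pattern = rf"^\s*(?:#+\s*|\d+\.\s*)?{re.escape(title)}\b"
        if not re.search(pattern, page_text, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(title)
    return missing
```

An empty result means the page has all five sections; anything else is the list to quote in the issue filed under #229.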

The nine routine tasks

| # | Page | Cadence | Triggers |
|---|---|---|---|
| 02 | Rotate secrets and tokens | Quarterly or on personnel change | Calendar; offboarding event; drift-check failure |
| 03 | Bump operator version | When upstream releases a CSV with a security fix or required feature | CSV release; CVE; ADR-mandated feature |
| 04 | Add a cluster to the fleet | On capacity event or DR build-out | New ManagedCluster onboarding |
| 05 | Roll out a policy | On compliance gate or new finding | PCI-DSS scan finding; ACM PolicySet update |
| 06 | Backfill evidence to MinIO | When an evidence-pack window closes | CI run; compliance close-out; lifecycle audit |
| 07 | Rotate the kubeadmin password | Quarterly or on personnel change | OCP 4.20 23-char minimum; htpasswd-Secret recreate; Vault custody |
| 08 | Rotate the RHACS Central admin | Quarterly or after init-bundle regen | Vault secret/ocp/platform/rhacs-admin -> ESO -> rollout restart deploy/central |
| 09 | Add a division to federated GitLab | On new client engagement or compliance scope-change | ct-* role groups; CODEOWNERS; runner-class tags; Vault path tree |
| 10 | Publish the Loki OBC bridge | One-shot under #233 | LokiStack stuck Warning Degraded; backport of the Tempo bridge !43 |

The cadence column matters: tasks 02 / 06 / 07 / 08 are calendar-driven; tasks 03 / 04 / 09 are event-driven; tasks 05 / 10 are gate-driven (compliance and incident-follow-up respectively). The on-call rotation escalation matrix tells you who owns each.
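The grouping above can be kept as a small lookup, for example when building a reminder or a TODO filter. This is a sketch only: the `Cadence` enum, the `TASK_CADENCE` mapping, and `tasks_for` are hypothetical names, and the task numbers are transcribed from the paragraph above.

```python
from enum import Enum


class Cadence(Enum):
    CALENDAR = "calendar-driven"
    EVENT = "event-driven"
    GATE = "gate-driven"


# Task number -> cadence class, transcribed from the paragraph above.
TASK_CADENCE = {
    "02": Cadence.CALENDAR,
    "06": Cadence.CALENDAR,
    "07": Cadence.CALENDAR,
    "08": Cadence.CALENDAR,
    "03": Cadence.EVENT,
    "04": Cadence.EVENT,
    "09": Cadence.EVENT,
    "05": Cadence.GATE,
    "10": Cadence.GATE,
}


def tasks_for(cadence: Cadence) -> list[str]:
    """Return the task numbers in the given cadence class, in page order."""
    return sorted(task for task, c in TASK_CADENCE.items() if c is cadence)
```

For instance, `tasks_for(Cadence.CALENDAR)` lists the tasks a quarterly calendar reminder would need to cover.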

What is not in this subsection

  • Incidents (live failures) live in the incidents subsection. Routine tasks run during steady state; incidents run when the fleet has left steady state.
  • Cluster upgrades (OCP minor/patch). Upgrade procedures are governed by ADR 0018’s pull model and the OCP upgrade documentation. They are not yet a published runbook on this site; the next operator who runs one should write the page.
  • OADP backup drills. Backup/restore drills follow the OADP operator’s documentation plus the lab’s evidence-capture convention. A drill runbook is planned under #229 follow-ups.
  • Trivy scanner refresh and CI evidence rotation. These touch the developer-side pipeline (GitLab -> Jenkins -> Trivy -> Nexus -> Docker runtime VM) which is paused for app-delivery scope per the user’s 2026-05-09 decision (project_app_dev_direction). When OpenShift app delivery is reopened, these will join routine tasks.

A note on naming

The session-handoff convention is one routine task = one MR = one issue = one session report. The branch name embeds the issue key. The MR description references the routine-task page on this site, which is the public-facing “why”. The session report captures the validation evidence.

Naming patterns observed in the active repo:

| Task class | Branch prefix | Example |
|---|---|---|
| Secret rotation | secret-rot/ | secret-rot/gitlab-pat-rotate-202605 |
| Operator bump | op-bump/ | op-bump/oadp-1.5.5-to-1.5.6 |
| Cluster onboarding | cluster-onb/ | cluster-onb/spoke-dr-v6-import |
| Policy rollout | policy/ | policy/pci-dss-4-allowedregistries |
| Evidence backfill | evidence/ | evidence/pci-baseline-2026-05-11 |

These prefixes are not enforced by any hook — they are the lab’s conventional shape.
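Because no hook enforces the prefixes, a local check is the only way to test a branch name against the convention. The sketch below is one possible check, not an existing tool: `BRANCH_PREFIXES` is transcribed from the table above, and `task_class_for` is a hypothetical helper name.

```python
from typing import Optional

# Task class -> branch prefix, transcribed from the table above.
BRANCH_PREFIXES = {
    "Secret rotation": "secret-rot/",
    "Operator bump": "op-bump/",
    "Cluster onboarding": "cluster-onb/",
    "Policy rollout": "policy/",
    "Evidence backfill": "evidence/",
}


def task_class_for(branch: str) -> Optional[str]:
    """Return the task class whose prefix the branch uses, or None."""
    for task_class, prefix in BRANCH_PREFIXES.items():
        # A bare prefix with nothing after it is not a usable branch name.
        if branch.startswith(prefix) and len(branch) > len(prefix):
            return task_class
    return None
```

A result of `None` only means the branch is outside the observed patterns; since the prefixes are conventional, it is a prompt to double-check rather than a failure.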

References

  • opp-full-plat/connection-details/platform-admin-handoff.md
  • opp-full-plat/runbooks/secrets-custody-drift-check.md
  • opp-full-plat/plans/disconnected-rebuild/environments/dc-lab/operator-version-lock.md
  • Issues: #229 (this section), #137 (hub mirror capture — example of evidence-backfill class), #233 (Loki OBC bridge backport), #255 / MR !73 (RHACS Central admin via Vault + ESO)

Last reviewed: 2026-05-12