Installation Manual - 71 OADP post-Vault-poweroff backup validation

Post-power-off OADP backup validation while the old lost-custody Vault VMs remained off.

This chapter records the post-power-off OADP backup validation gate for hub-dc-v7 and spoke-dc-v7.

The old lost-custody Vault VMs remained powered off for the whole gate. The only intended live change was a temporary GitOps schedule acceleration, followed by restoration to the normal daily schedules.

Governance

Field              Value
Issue              OP-GF-VAULTRECOVERY-1 / #389
Milestone          Workspace Governance
ADR                ADR 0028: Greenfield Vault Replacement After Custody Loss
Existing controls  ADR 0016 and ADR 0025

Preflight

Before changing the schedules:

  • hub-dc-v7 and spoke-dc-v7 were on OpenShift 4.20.18;
  • all nodes were Ready;
  • no ClusterOperator exceptions were reported;
  • old Vault VMs were shut off and autostart disabled;
  • R1 Vault VMs were running and autostart enabled;
  • R1 Vault health with standbyok=true returned HTTP 200 on 30.30.200.35-.37;
  • hub/spoke ExternalSecrets were 6/6 Ready;
  • OADP DPAs were Reconciled and BSLs were Available;
  • live schedules were at the normal values:
    • hub 15 2 * * *;
    • spoke 45 2 * * *.
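The R1 Vault health item in the list above can be sketched as a small helper that builds the probe URLs and evaluates the collected status codes. This is a minimal sketch, not the procedure actually used: the HTTPS scheme and port 8200 (Vault's default API port) are assumptions, and the helper names are hypothetical. The /v1/sys/health endpoint and its standbyok parameter are part of the Vault HTTP API; with standbyok=true a standby node answers 200 instead of 429.

```python
from urllib.parse import urlencode

# R1 Vault addresses from the preflight list (30.30.200.35-.37).
R1_VAULT_HOSTS = ["30.30.200.35", "30.30.200.36", "30.30.200.37"]


def health_url(host, scheme="https", port=8200):
    """Build the Vault health URL for one host.

    Assumption: scheme and port are illustrative defaults, not taken
    from the manual. Adjust them to the real Vault listener.
    """
    query = urlencode({"standbyok": "true"})
    return f"{scheme}://{host}:{port}/v1/sys/health?{query}"


def all_healthy(status_by_host):
    """Return True only if every R1 host reported HTTP 200."""
    return all(status_by_host.get(host) == 200 for host in R1_VAULT_HOSTS)
```

The status codes themselves would come from whichever HTTP client the operator prefers; the helper only decides whether the gate condition holds.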

Temporary Schedule Acceleration

Temporary GitOps commit:

2d368c3 Temporarily accelerate post-poweroff OADP schedules

Temporary schedule values:

Cluster      Temporary cron
hub-dc-v7    33 6 * * *
spoke-dc-v7  34 6 * * *

Validation before reconcile:

  • local render passed for both OADP overlays;
  • server-side dry-run accepted both overlays using force-conflicts because Argo CD owns .spec.schedule;
  • bootstrap clone on gf-ocp-bootstrap-01 fast-forwarded to 2d368c333b241c2b8ec6d2ac6e0f9aa3433bc04c;
  • Argo CD hard-refresh was requested.
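The pre-reconcile checks above can be sketched as two helpers: one that assembles a server-side dry-run command line, and one that confirms the bootstrap clone's full commit ID corresponds to the short commit recorded in this chapter. This is only one plausible form. The exact command used is not recorded here, and the overlay directory and context name passed in are hypothetical placeholders. The flags themselves (--server-side, --dry-run=server, --force-conflicts, -k) are standard oc/kubectl apply options.

```python
def dry_run_argv(overlay_dir, context):
    """Assemble an `oc apply` server-side dry-run for a kustomize overlay.

    Assumption: overlay_dir and context are supplied by the operator;
    neither value is documented in this chapter.
    """
    return [
        "oc", "--context", context,
        "apply",
        "--server-side",       # let the API server compute the merge
        "--dry-run=server",    # validate only, persist nothing
        "--force-conflicts",   # Argo CD owns .spec.schedule
        "-k", overlay_dir,
    ]


def matches_short_sha(full_sha, short_sha):
    """Return True if the full commit ID begins with the short commit ID."""
    return full_sha.startswith(short_sha)
```

For the temporary commit, matches_short_sha("2d368c333b241c2b8ec6d2ac6e0f9aa3433bc04c", "2d368c3") holds, which ties the fast-forwarded clone to the commit named above.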

Argo CD converged to Synced/Healthy at 2d368c3, and live schedules showed the temporary values before the backup windows.

Backup Results

Cluster      Backup                                  Phase      Items        Warnings  Errors
hub-dc-v7    platform-resource-daily-20260518063347  Completed  10122/10122  0         0
spoke-dc-v7  platform-resource-daily-20260518063423  Completed  16808/16808  0         0

Backup timestamps:

Cluster      Started               Completed
hub-dc-v7    2026-05-18T06:33:47Z  2026-05-18T06:34:14Z
spoke-dc-v7  2026-05-18T06:34:23Z  2026-05-18T06:34:55Z
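The recorded names and timestamps can be cross-checked against the temporary cron values. The sketch below is illustrative and assumes the trailing 14 digits of each backup name are a UTC timestamp in YYYYMMDDHHMMSS form, which is how Velero names schedule-created backups. The cron helper is deliberately limited to the "M H * * *" shape used in this chapter and is not a general cron parser.

```python
from datetime import datetime, timezone


def parse_utc(ts):
    """Parse an RFC 3339 UTC timestamp such as 2026-05-18T06:33:47Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def name_timestamp(backup_name):
    """Extract the UTC timestamp suffix from a schedule-created backup name."""
    suffix = backup_name.rsplit("-", 1)[1]
    return datetime.strptime(suffix, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def fires_in_minute(cron, when):
    """Check a daily 'M H * * *' cron expression against a UTC datetime."""
    minute, hour, dom, month, dow = cron.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("only daily 'M H * * *' expressions are supported")
    return when.hour == int(hour) and when.minute == int(minute)


def duration_seconds(started, completed):
    """Return the whole seconds elapsed between two UTC timestamps."""
    return int((parse_utc(completed) - parse_utc(started)).total_seconds())
```

With the values recorded above, the hub backup ran for 27 seconds and the spoke backup for 32 seconds, and both name suffixes fall inside the minute selected by the corresponding temporary cron expression.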

MinIO Object Validation

Object validation used the stored OADP backup user credential without printing credential values.

Cluster      Prefix                                                              Objects  velero-backup.json
hub-dc-v7    hub-dc-v7/general/backups/platform-resource-daily-20260518063347    12       1
spoke-dc-v7  spoke-dc-v7/general/backups/platform-resource-daily-20260518063423  12       1
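The per-prefix counts in the table above can be derived from a plain object listing. The sketch below assumes the object keys have already been listed by some S3-compatible client using the stored credential. The listing step, the bucket name, and the helper name are not part of this chapter and are left to the operator. No credential value is handled by the helper.

```python
def summarize_prefix(keys, prefix):
    """Count objects under a backup prefix and its velero-backup.json files.

    keys   -- iterable of object keys from any S3-compatible listing
    prefix -- backup prefix, with or without a trailing slash
    Returns (objects_under_prefix, velero_backup_json_count).
    """
    # Normalize to exactly one trailing slash so sibling prefixes that
    # merely share a leading string are not counted.
    root = prefix.rstrip("/") + "/"
    under = [key for key in keys if key.startswith(root)]
    metadata = [key for key in under if key.rsplit("/", 1)[-1] == "velero-backup.json"]
    return len(under), len(metadata)
```

A prefix passes this check when it holds at least one object and exactly one velero-backup.json.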

Schedule Restoration

Restore GitOps commit:

0bb0cca Restore post-poweroff OADP schedules

Restored schedule values:

Cluster      Restored cron
hub-dc-v7    15 2 * * *
spoke-dc-v7  45 2 * * *

Argo CD final state:

Context  Application                 Sync    Health   Revision
hub      hub-dc-v7-bootstrap         Synced  Healthy  0bb0cca
hub      spoke-dc-v7-cluster-config  Synced  Healthy  0bb0cca
spoke    spoke-dc-v7-cluster-config  Synced  Healthy  0bb0cca

Final OADP state:

Cluster      DPA         BSL        Schedule    Last backup
hub-dc-v7    Reconciled  Available  15 2 * * *  platform-resource-daily-20260518063347
spoke-dc-v7  Reconciled  Available  45 2 * * *  platform-resource-daily-20260518063423

Post-Gate Health

Cluster state:

Cluster      OpenShift  Nodes      ClusterOperators
hub-dc-v7    4.20.18    3/3 Ready  steady
spoke-dc-v7  4.20.18    6/6 Ready  steady

External Secrets:

Cluster  Result
hub      6/6 ExternalSecrets Ready
spoke    6/6 ExternalSecrets Ready

R1-backed ClusterSecretStores remained Ready/Valid on both clusters:

  • vault-r1-eso-smoke;
  • vault-r1-oadp;
  • vault-r1-rhacs.

Vault egress policies still allowed only R1 Vault CIDRs:

30.30.200.35/32
30.30.200.36/32
30.30.200.37/32
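The effect of this allow-list can be illustrated with Python's standard ipaddress module. This is a sketch of the address arithmetic only. It does not read the live egress policies, and the helper name is hypothetical. The old Vault addresses used in the usage note are the 30.30.200.30-.33 range recorded in this chapter.

```python
import ipaddress

# CIDRs copied from the egress allow-list above.
ALLOWED_VAULT_CIDRS = [
    ipaddress.ip_network(cidr)
    for cidr in ("30.30.200.35/32", "30.30.200.36/32", "30.30.200.37/32")
]


def egress_allowed(ip):
    """Return True if the address falls inside one of the allowed CIDRs."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ALLOWED_VAULT_CIDRS)
```

Under this list, egress_allowed is true for each R1 address (.35 to .37) and false for each old Vault address (.30 to .33), so a workload could not fall back to the old Vault VMs even if they were powered on.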

Vault VM and health state:

Area              Result
old Vault VMs     shut off, autostart disabled
R1 Vault VMs      running, autostart enabled
old Vault health  HTTP 000 on 30.30.200.30-.33
R1 Vault health   HTTP 200 on 30.30.200.35-.37 with standbyok=true

RHACS:

  • hub Central reported Available and Deployed;
  • no non-running StackRox pods were found on hub or spoke.

OCM report:

clusters=1 synced=1 healthy=1 inProgress=0 notHealthy=0 notSynced=0
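A report line in this key=value form is easy to check mechanically. The sketch below parses the line into integers and applies one possible reading of "clean": every cluster is synced and healthy, and the three problem counters are zero. The helper names and that pass criterion are assumptions for illustration, not part of the OCM tooling.

```python
def parse_report(line):
    """Parse 'key=value' pairs separated by whitespace into a dict of ints."""
    return {key: int(value) for key, value in (pair.split("=", 1) for pair in line.split())}


def is_clean(report):
    """Return True if all clusters are synced and healthy with no problem counts."""
    return (
        report["clusters"] == report["synced"] == report["healthy"]
        and report["inProgress"] == report["notHealthy"] == report["notSynced"] == 0
    )
```

Applied to the line above, parse_report yields six integer fields and is_clean returns True.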

Result

The post-power-off OADP backup validation passed.

Both clusters created and completed new backups while the old Vault VMs stayed off. The corresponding MinIO object prefixes were present, normal schedules were restored, and platform health remained steady.

Disk deletion remains a separate, explicit, final retention decision and was not part of this gate.