Installation Manual - 63 Vault post-promotion soak cleanup plan

Post-promotion Vault R1 soak results and cleanup plan for the unused vault-platform store.

This chapter records the read-only post-promotion soak after vault.v7.comptech-lab.com was promoted from the old locked Vault to replacement Vault R1.

Governance

Field              Value
Issue              OP-GF-VAULTRECOVERY-1 / #389
Milestone          Workspace Governance
ADR                ADR 0028: Greenfield Vault Replacement After Custody Loss
Existing controls  ADR 0016 and ADR 0025

Validation Summary

Access path:

local coordinator -> dl385-2 -> gf-ocp-bootstrap-01 -> v7 kubeconfigs

Stable DNS still resolves to R1:

vault.v7.comptech-lab.com -> 30.30.200.35, 30.30.200.36, 30.30.200.37

Resolution was confirmed from:

  • dl385-2
  • gf-ocp-bootstrap-01
  • hub ESO controller pod
  • spoke ESO controller pod
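
The resolution check above could be repeated from any host with a short script. This is a minimal sketch, not the procedure used during the soak: the function names and the injectable `resolver` parameter are illustrative, and it only reports what the local resolver returns on the host where it runs.

```python
import socket

# Expected R1 addresses, taken from the DNS result recorded above.
EXPECTED_R1 = {"30.30.200.35", "30.30.200.36", "30.30.200.37"}


def resolve_a_records(name, resolver=socket.getaddrinfo):
    """Return the set of IPv4 addresses that `name` resolves to."""
    infos = resolver(name, 443, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (ip, port).
    return {info[4][0] for info in infos}


def points_at_r1(name, expected=EXPECTED_R1, resolver=socket.getaddrinfo):
    """True when the resolved address set equals the expected R1 set."""
    return resolve_a_records(name, resolver) == expected


if __name__ == "__main__":
    print(points_at_r1("vault.v7.comptech-lab.com"))
```

The `resolver` argument exists only so the comparison can be exercised without network access.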

Explicit Vault health checks:

IP            Cluster    Result
30.30.200.31  old Vault  initialized, unsealed, standby
30.30.200.32  old Vault  initialized, unsealed, active
30.30.200.33  old Vault  initialized, unsealed, standby
30.30.200.35  R1         initialized, unsealed, active
30.30.200.36  R1         initialized, unsealed, standby
30.30.200.37  R1         initialized, unsealed, standby
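
The result strings in the table above correspond to three boolean fields (`initialized`, `sealed`, `standby`) that Vault returns from its `/v1/sys/health` endpoint. The sketch below only shows how an already-parsed response body could be turned into the same wording. How the body is fetched (port, CA bundle, token-less access) is left out because the source does not record it, and the function name is hypothetical.

```python
def summarize_health(payload):
    """Summarize a parsed /v1/sys/health JSON body.

    `payload` is assumed to be the decoded JSON object. The three keys are
    read with direct indexing so that a missing field raises KeyError
    instead of being silently reported as healthy.
    """
    init = "initialized" if payload["initialized"] else "not initialized"
    seal = "sealed" if payload["sealed"] else "unsealed"
    role = "standby" if payload["standby"] else "active"
    return f"{init}, {seal}, {role}"


if __name__ == "__main__":
    sample = {"initialized": True, "sealed": False, "standby": False}
    print(summarize_health(sample))
```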

OpenShift state:

Cluster      OpenShift  Nodes      ClusterOperators
hub-dc-v7    4.20.18    3/3 Ready  steady
spoke-dc-v7  4.20.18    6/6 Ready  steady

Active ExternalSecrets remain Ready / SecretSynced:

Cluster      Consumer                          Store
hub-dc-v7    ESO smoke                         vault-r1-eso-smoke
hub-dc-v7    OADP cloud credentials            vault-r1-oadp
hub-dc-v7    RHACS TLS/admin material          vault-r1-rhacs
spoke-dc-v7  ESO smoke                         vault-r1-eso-smoke
spoke-dc-v7  OADP cloud credentials            vault-r1-oadp
spoke-dc-v7  logging object-store credentials  logging-local
spoke-dc-v7  RHACS TLS material                vault-r1-rhacs
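
The Ready / SecretSynced check can be expressed as a filter over the ExternalSecret list. This sketch assumes the input is the parsed JSON of a list query such as `oc get externalsecrets -A -o json`, and that each object reports a `Ready` condition with reason `SecretSynced` when it is healthy. The function name is illustrative.

```python
def not_synced(externalsecret_list):
    """Return 'namespace/name' for every ExternalSecret that is not
    Ready=True with reason SecretSynced."""
    failing = []
    for item in externalsecret_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        healthy = any(
            c.get("type") == "Ready"
            and c.get("status") == "True"
            and c.get("reason") == "SecretSynced"
            for c in conditions
        )
        if not healthy:
            meta = item["metadata"]
            failing.append(f'{meta["namespace"]}/{meta["name"]}')
    return failing
```

An empty return value matches the state recorded above. It reads only status fields, so no secret values are touched.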

OADP remains healthy:

Cluster      DPA         BSL        Schedule  Latest scheduled Backup CR
hub-dc-v7    Reconciled  Available  Enabled   platform-resource-daily-20260517223546
spoke-dc-v7  Reconciled  Available  Enabled   platform-resource-daily-20260517224523
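
Backups created by a Velero schedule are named after the schedule with a 14-digit `YYYYMMDDHHMMSS` suffix, so the creation time of the latest Backup CR can be read from its name. The helper below is a small sketch with a hypothetical function name. It returns a naive datetime and does not assume a time zone.

```python
from datetime import datetime


def backup_timestamp(name):
    """Split '<schedule>-<YYYYMMDDHHMMSS>' into (schedule, datetime)."""
    schedule, _, stamp = name.rpartition("-")
    # strptime raises ValueError if the suffix is not a valid timestamp.
    return schedule, datetime.strptime(stamp, "%Y%m%d%H%M%S")


if __name__ == "__main__":
    print(backup_timestamp("platform-resource-daily-20260517223546"))
```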

StackRox remained acceptable on hub and spoke.

Finding

The unused ClusterSecretStore/vault-platform is now invalid on both clusters:

hub-dc-v7:   vault-platform Ready=False / InvalidProviderConfig
spoke-dc-v7: vault-platform Ready=False / InvalidProviderConfig

Argo CD is therefore Synced/Degraded for the applications that own that object:

hub-dc-v7-bootstrap
spoke-dc-v7-cluster-config

This is not an active secret delivery outage. No live ExternalSecret and no GitOps-managed ExternalSecret references vault-platform.
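
The claim that nothing references vault-platform can be re-checked mechanically. The sketch below assumes each input is a parsed ExternalSecret object (for example one item from `oc get externalsecrets -A -o json`) and inspects the store references the External Secrets Operator API allows: `spec.secretStoreRef` plus any per-entry `sourceRef.storeRef` under `spec.data` and `spec.dataFrom`. The function names are illustrative.

```python
def store_refs(external_secret):
    """Collect every store reference declared by one ExternalSecret."""
    spec = external_secret.get("spec", {})
    refs = []
    if "secretStoreRef" in spec:
        refs.append(spec["secretStoreRef"])
    for entry in spec.get("data", []) + spec.get("dataFrom", []):
        ref = entry.get("sourceRef", {}).get("storeRef")
        if ref:
            refs.append(ref)
    return refs


def references_store(external_secret, name="vault-platform",
                     kind="ClusterSecretStore"):
    """True if any reference points at the named store.

    A reference without an explicit kind is treated as 'SecretStore',
    which is the API default.
    """
    return any(
        r.get("name") == name and r.get("kind", "SecretStore") == kind
        for r in store_refs(external_secret)
    )
```

Running the same function over the rendered GitOps manifests would cover the GitOps-managed half of the statement.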

Likely cause:

  • vault-platform still has the old Vault CA bundle.
  • vault.v7.comptech-lab.com now resolves to R1.
  • The R1 serving certificate is issued by the R1 CA.
  • The R1 serving certificate includes vault-r1.v7.comptech-lab.com and 30.30.200.35, but not vault.v7.comptech-lab.com.
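
The last point is a hostname-coverage question: the stable name is absent from the certificate's subject alternative names. The sketch below illustrates that check on a certificate dictionary shaped like the one Python's `ssl.SSLSocket.getpeercert()` returns for a validated peer. The sample data only mirrors the two SAN entries named above, the helper names are hypothetical, and the wildcard handling is deliberately simplified (a single leftmost `*.` label).

```python
def san_dns_names(cert):
    """Return the DNS entries from a getpeercert()-style dictionary."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def hostname_covered(cert, hostname):
    """True if `hostname` matches a DNS SAN exactly or via a leading '*.'."""
    host = hostname.lower().rstrip(".")
    for pattern in san_dns_names(cert):
        pattern = pattern.lower()
        if pattern == host:
            return True
        if pattern.startswith("*.") and "." in host:
            # Compare everything after the first label of the hostname.
            if host.split(".", 1)[1] == pattern[2:]:
                return True
    return False


# Sample mirroring the SAN entries recorded in this chapter (assumption:
# the real certificate may carry additional entries).
SAMPLE_R1_CERT = {
    "subjectAltName": (
        ("DNS", "vault-r1.v7.comptech-lab.com"),
        ("IP Address", "30.30.200.35"),
    )
}

if __name__ == "__main__":
    print(hostname_covered(SAMPLE_R1_CERT, "vault.v7.comptech-lab.com"))
```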

Cleanup Inventory

DNS:

Name                                 A records
vault.v7.comptech-lab.com            30.30.200.35, 30.30.200.36, 30.30.200.37
gf-ocp-vault-01.v7.comptech-lab.com  30.30.200.31
gf-ocp-vault-02.v7.comptech-lab.com  30.30.200.32
gf-ocp-vault-03.v7.comptech-lab.com  30.30.200.33
vault-r1.v7.comptech-lab.com         no A record

VM state on dl385-2:

VM                       State
gf-ocp-vault-seed-01     running
gf-ocp-vault-01          running
gf-ocp-vault-02          running
gf-ocp-vault-03          running
gf-ocp-vault-r1-seed-01  running
gf-ocp-vault-r1-01       running
gf-ocp-vault-r1-02       running
gf-ocp-vault-r1-03       running

Proposed cleanup (not yet applied): remove the unused ClusterSecretStore/vault-platform resources from the hub and spoke GitOps overlays:

  • clusters/hub-dc-v7/secrets/eso/clustersecretstore-vault.yaml
  • clusters/spoke-dc-v7/secrets/eso/clustersecretstore-vault.yaml

Then validate:

  • hub and spoke overlays render;
  • server-side dry-run accepts both overlays;
  • Argo returns to Synced/Healthy;
  • active R1 stores remain Ready / Valid;
  • active ExternalSecrets remain Ready / SecretSynced;
  • OADP and RHACS remain healthy.
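
The Argo step in the list above could be checked from the Application objects themselves. This sketch assumes the input is the parsed JSON of an Argo CD Application resource and reads its `status.sync.status` and `status.health.status` fields. The function names are illustrative, and how the object is fetched (namespace, CLI) is not specified by the source.

```python
def argo_state(app):
    """Return (sync, health) for a parsed Argo CD Application object.

    Missing fields are reported as 'Unknown' so they never count as healthy.
    """
    status = app.get("status", {})
    sync = status.get("sync", {}).get("status", "Unknown")
    health = status.get("health", {}).get("status", "Unknown")
    return sync, health


def is_synced_healthy(app):
    """True only for the Synced/Healthy target state."""
    return argo_state(app) == ("Synced", "Healthy")
```

For the two applications named in the finding, the expected transition is from `("Synced", "Degraded")` to `("Synced", "Healthy")`.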

This is lower risk than rotating the R1 serving certificate because no active consumer uses vault-platform.

Do not decommission the old Vault VMs yet. Keep the old node-specific DNS records until the Argo degradation is cleared and at least one more scheduled backup window has completed with OADP still healthy.

Actions Not Taken

  • No Vault secret, policy, auth role, auth mount, or token was changed.
  • No DNS record was changed.
  • No VM was stopped or modified.
  • No GitOps desired state was changed in this gate.
  • No secret values were printed.