Secrets Architecture (Vault + ESO)
How secrets reach OpenShift workloads on the v6 fleet: HashiCorp Vault on a VM, External Secrets Operator on each cluster, a single platform ClusterSecretStore, per-tenant SecretStores, ExternalSecrets, and the kubernetes-provider bridge for OBC-backed operands.
This page is the orientation map for everything else in §10. Read it once, then jump to the page covering the specific CR or flow you need (operator install, ClusterSecretStore, per-tenant SecretStore, ExternalSecret, OBC bridge, rotation). The wider §3 sections (image supply, GitOps, ACM) often touch secrets at the edges; the design point is that those sections never have to author Secret YAML — they reference what this layer materializes.
What this layer does
External Secrets Operator (ESO) is the “secret delivery” plane for the v6 fleet. Its job is one sentence: take desired-state references to secrets stored in a system of record (HashiCorp Vault on a VM, or another Kubernetes Secret in the case of the OBC bridge), and materialize them as Kubernetes Secret objects in the namespaces that consume them — without ever putting the secret value into Git.
The system of record is:
- Vault (VM-based, lab /24, addresses redacted): the canonical store for tokens, passwords, robot accounts, pull credentials, certificates, and any opaque key/value that an operator or app needs. KV-v2 mount at secret/.
- NooBaa ObjectBucketClaim Secret + ConfigMap: the local cluster-side store for in-cluster S3 access credentials produced by ODF. Loki, Tempo, and Quay consume these.
The delivery plane is:
- External Secrets Operator (Red Hat distribution, openshift-external-secrets-operator v1.1.0). Provides the controllers, the webhook, and the four CR kinds shown in the diagram.
- ClusterSecretStore: exactly one, named vault-platform. Cluster-wide; used only by platform/operator namespaces. Defined under ADR 0019 conventions.
- SecretStore (per tenant): one per apps-<division>-<app> namespace, named vault-apps. Namespace-scoped; cannot read another division’s Vault paths.
- ExternalSecret: the request CR. References either a ClusterSecretStore or a tenant SecretStore, declares a target Secret shape, and (optionally) templates the value.
- ClusterExternalSecret: used once, for the cluster-wide app-registry-pull Secret that needs to fan out to every namespace labelled apps.platform/tenant=true. Documented separately.
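As a rough illustration of the fan-out described in the last bullet, a ClusterExternalSecret might look like the sketch below. The apiVersion (adjust to whatever the installed CRDs serve), both refresh values, the choice of vault-platform as the backing store, and the Vault field name dockerconfigjson are assumptions, not confirmed by this page. `<cluster>` is a placeholder to substitute.

```yaml
# Sketch only. Assumed: apiVersion, refresh values, store reference, Vault field name.
apiVersion: external-secrets.io/v1
kind: ClusterExternalSecret
metadata:
  name: app-registry-pull
spec:
  externalSecretName: app-registry-pull    # name of the ExternalSecret stamped into each namespace
  namespaceSelectors:
    - matchLabels:
        apps.platform/tenant: "true"       # label named in the list above
  refreshTime: 1m                          # assumed: how often namespace matches are re-evaluated
  externalSecretSpec:
    refreshInterval: 1h                    # assumed
    secretStoreRef:
      kind: ClusterSecretStore
      name: vault-platform                 # assumed store for this cluster-wide credential
    target:
      name: app-registry-pull
      template:
        type: kubernetes.io/dockerconfigjson
        data:
          .dockerconfigjson: "{{ .dockerconfigjson }}"
    data:
      - secretKey: dockerconfigjson
        remoteRef:
          key: ocp/<cluster>/registries/app-registry-pull   # path relative to the secret/ mount
          property: dockerconfigjson                        # hypothetical field name in Vault
```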
Architecture
Reading the diagram:
- Top row (platform path). The ESO operand authenticates to Vault via the cluster’s Kubernetes auth mount (auth/kubernetes-<cluster>). The vault-platform ClusterSecretStore is read by ExternalSecrets in operator namespaces (openshift-cert-manager, openshift-pipelines, openshift-gitops, stackrox, etc.); their materialized Secrets are the operator/system credentials.
- Middle row (tenant path). Each tenant namespace runs its own app-eso ServiceAccount and its own namespaced SecretStore vault-apps. The Vault role behind it is per-division (apps-<cluster>-<division>) with a namespace-glob bind of apps-<division>-*. Tenant ExternalSecrets materialize the actual app credentials, robot tokens, and DB passwords.
- Bottom row (OBC bridge). ODF/NooBaa creates an ObjectBucketClaim, which yields a Secret (with AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) and a ConfigMap (with BUCKET_HOST / BUCKET_NAME / BUCKET_PORT). Loki/Tempo/Quay expect a Secret with different lowercase keys (endpoint, bucketnames, access_key_id, access_key_secret). An ExternalSecret using ESO’s kubernetes provider templates the bridge Secret in the shape the operand expects.
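The bottom-row bridge can be sketched as a namespaced SecretStore on ESO's kubernetes provider plus an ExternalSecret that renames the keys. This is an illustration under assumptions, not the fleet's actual manifest: the store name obc-local, the namespace, the ServiceAccount obc-bridge, the OBC name loki-obc, the refresh interval, and the apiVersion are all hypothetical. The sketch reads only the OBC Secret, so the ConfigMap-derived values appear as placeholders to fill in.

```yaml
# Sketch only. All names, the namespace, and the bucket placeholders are hypothetical.
apiVersion: external-secrets.io/v1
kind: SecretStore
metadata:
  name: obc-local
  namespace: openshift-logging
spec:
  provider:
    kubernetes:
      remoteNamespace: openshift-logging   # namespace holding the OBC-generated Secret
      server:
        caProvider:
          type: ConfigMap
          name: kube-root-ca.crt
          key: ca.crt
      auth:
        serviceAccount:
          name: obc-bridge                 # needs RBAC to read Secrets in remoteNamespace
---
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: loki-s3
  namespace: openshift-logging
spec:
  refreshInterval: 10m                     # assumed
  secretStoreRef:
    kind: SecretStore
    name: obc-local
  target:
    name: loki-s3
    template:
      data:
        # Rename the uppercase OBC keys to the lowercase keys the operand expects.
        access_key_id: "{{ .AWS_ACCESS_KEY_ID }}"
        access_key_secret: "{{ .AWS_SECRET_ACCESS_KEY }}"
        bucketnames: "<BUCKET_NAME>"                      # placeholder from the OBC ConfigMap
        endpoint: "https://<BUCKET_HOST>:<BUCKET_PORT>"   # placeholder from the OBC ConfigMap
  dataFrom:
    - extract:
        key: loki-obc                      # assumed: OBC Secret named after the claim
```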
Three reasons this is one diagram, not three:
- The same operand reconciler runs all three flows. Sizing, NetworkPolicy egress, and observability are shared.
- The CR vocabulary is identical. A reviewer reading any ExternalSecret page can predict the others.
- Failure modes overlap. The “Vault auth times out” symptom in the platform path is the same NetworkPolicy egress fix as the tenant path.
Why this design (and not the alternatives)
| Alternative considered | Why we did not pick it |
|---|---|
| Vault Agent Injector | Pod-side sidecar that mounts secret files. Requires per-Pod annotations and reissues at restart only; conflicts with Argo-managed manifests and forces ServiceAccount-bound annotations into app repos. ESO keeps the indirection in CRs the platform owns. |
| Sealed Secrets | The secret value is in Git (encrypted). Recovery, audit, and rotation are clumsy at fleet scale; an encrypted Secret blob in a public-ish-feeling repo still gives auditors heartburn. |
| oc apply with a Secret out of band | Defeats GitOps. Breaks the “everything reconciles back” property under ADR 0025. |
| Cluster-wide ClusterSecretStore for tenants | A single CSS would let any namespace’s ESO request read any Vault path. We want a hard cross-tenant boundary, so tenants get a namespace-scoped SecretStore with a role that refuses to issue tokens for namespaces outside apps-<division>-*. |
| Vault Kubernetes auth with one shared role | One leaked token would read every division. Per-division role + per-division policy keeps blast radius bounded. |
ADR 0014 (developer-readiness contract) requires “approved pull-secret delivery”; ADR 0019 requires that image-pull credentials come from a controlled source rather than being committed; ADR 0025 forbids oc apply mutations of secrets in the live cluster. The Vault + ESO + per-tenant SecretStore composition satisfies all three.
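To make the per-tenant boundary concrete, a tenant SecretStore could look roughly like the following. The Vault server URL is a hypothetical stand-in (real addresses are redacted), the apiVersion is an assumption, and CA bundle wiring is left out. Angle-bracket placeholders follow the conventions used on this page and need substituting.

```yaml
# Sketch only. Server URL and apiVersion are assumptions; CA configuration omitted.
apiVersion: external-secrets.io/v1
kind: SecretStore
metadata:
  name: vault-apps
  namespace: apps-<division>-<app>
spec:
  provider:
    vault:
      server: https://vault.example.internal:8200   # hypothetical address
      path: secret                                  # KV-v2 mount
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes-<cluster>           # i.e. auth/kubernetes-<cluster>
          role: apps-<cluster>-<division>           # per-division role, bound to apps-<division>-*
          serviceAccountRef:
            name: app-eso                           # tenant-local ServiceAccount
```

Because the store is namespaced and the role is bound to the division's namespace glob, an ExternalSecret in another division's namespace has no store through which to request these paths.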
Inventory at a glance
| Concept | Identifier | Where it lives |
|---|---|---|
| Vault auth method | auth/kubernetes-<cluster> | Vault VM |
| Platform Vault policy | ocp-<cluster>-eso | Vault VM |
| Platform Vault role | ocp-<cluster>-eso | auth/kubernetes-<cluster>/role/... |
| Tenant Vault policy | apps-<division>-read | Vault VM (cluster-agnostic) |
| Tenant Vault role | apps-<cluster>-<division> | auth/kubernetes-<cluster>/role/... |
| ESO operator namespace | openshift-external-secrets-operator | OperatorHub install |
| ESO operand namespace | external-secrets | created by ESO operator |
| Platform ClusterSecretStore | vault-platform | per cluster, in clusters/<cluster>/secrets/eso/ |
| Tenant SecretStore | vault-apps (in each tenant ns) | per tenant overlay, from tenants/_template/secretstore-vault-apps.yaml |
| Vault path tree (platform) | secret/ocp/<cluster>/..., secret/ocp/platform/... | KV-v2 |
| Vault path tree (tenant) | secret/apps/<division>/<app>/<env>/<key> | KV-v2 |
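The platform-side identifiers in the table fit together roughly as in this ClusterSecretStore sketch. The server URL, the apiVersion, and the operand ServiceAccount name external-secrets are assumptions; the CA bundle is omitted here.

```yaml
# Sketch only. Server URL, apiVersion, and ServiceAccount name are assumptions.
apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: vault-platform
spec:
  provider:
    vault:
      server: https://vault.example.internal:8200   # hypothetical address
      path: secret                                  # KV-v2 mount
      version: v2
      auth:
        kubernetes:
          mountPath: kubernetes-<cluster>           # Vault auth method from the table
          role: ocp-<cluster>-eso                   # platform Vault role
          serviceAccountRef:
            name: external-secrets                  # hypothetical operand ServiceAccount
            namespace: external-secrets             # ESO operand namespace
```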
What lives in each Vault subtree
| Subtree | Owner | Example contents |
|---|---|---|
| secret/ocp/<cluster>/... | platform admin | cluster-specific bootstrap secrets, ESO smoke-test data, ODF route credentials |
| secret/ocp/platform/rhacs-init-bundle | platform admin | RHACS sensor init-bundle materials (delivered via §11) |
| secret/ocp/<cluster>/registries/app-registry-pull | platform admin | cluster-wide app-registry dockerconfigjson (see §6) |
| secret/ocp/<cluster>/quay/config-bundle | platform admin | Quay registry operand config bundle (consumed via OBC bridge sibling pattern) |
| secret/apps/<division>/<app>/<env>/<key> | tenant team (write) / platform (audit) | app credentials, DB passwords, OIDC client secrets |
| secret/apps/<division>/<app>/ci/quay-robot | platform admin (creates) / tenant (consumes) | Tekton Path B push-robot dockerconfigjson |
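A tenant request against the apps subtree might be sketched as below. The ExternalSecret name, refresh interval, and the field name password are hypothetical, and the sketch assumes `<key>` is the last path segment of the KV entry, with the value held in a field inside it.

```yaml
# Sketch only. Name, refreshInterval, and the Vault field name are assumptions.
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: db-credentials                 # hypothetical
  namespace: apps-<division>-<app>
spec:
  refreshInterval: 1h                  # assumed
  secretStoreRef:
    kind: SecretStore
    name: vault-apps                   # the tenant's namespaced store
  target:
    name: db-credentials               # Secret materialized in this namespace
    creationPolicy: Owner
  data:
    - secretKey: password              # key in the resulting Kubernetes Secret
      remoteRef:
        key: apps/<division>/<app>/<env>/<key>   # path relative to the secret/ mount
        property: password                       # hypothetical field within the KV entry
```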
Cross-references
- 02-eso-operator-and-policies.mdx: install path, the operator/operand split, default-deny NetworkPolicy gotcha.
- 03-vault-clustersecretstore.mdx: the vault-platform shape, CA bundle, auth wiring.
- 04-tenant-secretstore-pattern.mdx: per-tenant vault-apps + role/policy/namespace-glob.
- 05-externalsecret-and-templates.mdx: request shapes, templating, refresh interval choices.
- 06-obc-to-operand-bridge.mdx: the kubernetes-provider Loki/Tempo/Quay bridge.
- 07-rotation-and-revocation.mdx: Vault-side rotation, ESO refresh semantics, revocation playbook.
References
- connection-details/vault-app-secrets.md: tenant path convention (DEV-OCP-0.4 / #174).
- connection-details/app-registry-pullsecret.md: ClusterExternalSecret-fanned pull credential (DEV-OCP-0.2 / #172).
- ADRs: 0014 (developer readiness), 0019 (Nexus-only supply chain), 0025 (GitOps-only operations).