Lab Infrastructure: the VM fleet outside OpenShift

The supporting VM platform that sits outside OpenShift — edge ingress, DNS, object storage, secret store, image mirror, CI, and observability — and why each piece lives on a VM rather than inside a cluster.

A working OpenShift fleet is not just OpenShift. It needs a DNS plane, an ingress edge, an image mirror, a secret store, durable object storage, a build system, and at least one place to look at what is happening. In this lab every one of those sits on a small Linux VM next to the clusters, never inside them. This section documents that VM platform.

This page is the orientation map. Subsections drill into each VM family: the libvirt+cloud-init build pattern, PowerDNS, HAProxy, MinIO, Vault. The image supply chain (Nexus), CI (Jenkins), and observability (SigNoz / monitoring-0) live in adjacent sections under this same top-level — they share the platform conventions described here.

The fleet, in one picture

HAProxy is the only public-facing component. Everything else is reachable only on the private 30.30.0.0/16 lab bridge, and is exposed selectively through HAProxy’s *.apps.sub.comptech-lab.com wildcard when a public hostname is needed. OpenShift clusters get their own ingress (the in-cluster OpenShift router) — they do not ride this HAProxy. That boundary is set by ADR 0005 and reinforced in the HAProxy subsection.
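The boundary between the two ingress paths can be illustrated with TLS wildcard matching, where a wildcard covers exactly one leftmost label. The following is a minimal sketch, not the lab's tooling: the helper name `matches_wildcard` and the cluster name `hub` in the second hostname are assumptions made for illustration.

```python
def matches_wildcard(hostname: str, pattern: str) -> bool:
    """Return True if hostname is covered by a single-label wildcard pattern.

    A TLS wildcard such as *.example.com covers exactly one extra
    leftmost label: a.example.com matches, a.b.example.com does not.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    if pattern_labels[0] != "*":
        # No wildcard: require an exact match.
        return host_labels == pattern_labels
    # Same label count, non-empty first label, identical remainder.
    return (
        len(host_labels) == len(pattern_labels)
        and host_labels[0] != ""
        and host_labels[1:] == pattern_labels[1:]
    )


PLATFORM_WILDCARD = "*.apps.sub.comptech-lab.com"

# Platform VM hostname: covered by the HAProxy wildcard.
print(matches_wildcard("signoz.apps.sub.comptech-lab.com", PLATFORM_WILDCARD))

# Cluster app hostname ("hub" is a hypothetical cluster name): not covered,
# so it is served by that cluster's own OpenShift router.
print(matches_wildcard("console.apps.hub.sub.comptech-lab.com", PLATFORM_WILDCARD))
```

The first call prints True and the second prints False, which is the split described above: platform VM names fall under the HAProxy wildcard, cluster app names do not.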

What each VM family does

| Family | Hostnames | Role |
| --- | --- | --- |
| HAProxy edge | haproxy (private only) | Public+private TLS edge for the platform VM fleet. ~40 frontend/backend blocks, SNI passthrough to a loopback re-decrypt with the Let’s Encrypt wildcard cert. Off-limits for OpenShift Routes. |
| PowerDNS | pdns.local; lab resolver on the recursor IP | Authoritative DNS for sub.comptech-lab.com (PowerDNS Authoritative 4.8.3 over SQLite) and a recursor (PowerDNS Recursor 4.9.3) that every other lab VM points to. Forwards the lab subdomain locally, everything else to public resolvers. |
| MinIO | minio.sub.comptech-lab.com, minio-console.apps.sub.comptech-lab.com | S3-compatible object store. Hosts Loki/Tempo storage, Quay backing store, OADP backups, CI evidence, Vault Raft snapshots. |
| Vault OSS | vault.sub.comptech-lab.com (DNS RR to 3 raft voters), vault-seal-0 | HashiCorp Vault OSS 1.21.1 on dedicated VMs. Three-node Raft cluster auto-unsealed against a separate single-node transit-seal Vault. Backs ESO in both OpenShift clusters. |
| Nexus | nexus.apps, mirror-registry.apps, docker-group.apps, app-registry.apps | Sonatype Nexus on a single VM exposing three independent Docker endpoints: install mirror, dev pull-through, app push target. Detailed under §3 of the docs site. |
| Jenkins + agent | jenkins.apps, jenkins-agent-0 | Single-controller LTS Jenkins on Ubuntu, dedicated build-agent VM with Podman/Buildah/Skopeo/Trivy. Detailed under §4. |
| GitLab | gitlab.apps.sub.comptech-lab.com | Self-hosted GitLab CE; canonical source of truth for the platform GitOps repo (openshift-platform-gitops). Detailed under §8. |
| SigNoz | signoz.apps | SigNoz EE Docker-Compose VM. The intended production OTLP/UI track. |
| monitoring-0 | monitoring.apps, grafana.apps, *.mon.sub.comptech-lab.com | Native-systemd LGTM stack (Grafana, Prometheus, Alertmanager, Loki, Tempo, Pyroscope, Alloy). Learning sandbox, not a production destination. |
| docker-runtime-vm | docker-runtime.sub.comptech-lab.com | Ubuntu host running Docker Engine 29 / Podman / Buildx as a non-OpenShift app target (paused OpenShift-app delivery track). |
| DefectDojo / Trivy / WSO2 / Kafka / Redis | various *.apps.sub.comptech-lab.com | Adjacent platform VMs. Touched in their own sections. |

Why VM-hosted, not in-cluster

The defining constraint of the lab is disconnected rebuild safety: OpenShift must be reinstallable end-to-end without depending on anything hosted on the clusters being rebuilt. The services that deliver secrets, images, and DNS to OpenShift therefore sit outside the data plane they serve.

Specifically:

  1. Vault must not live inside OpenShift, because OpenShift secrets (kubeconfigs, registry pull-secrets, htpasswd hashes, OperatorHub credentials, ACM init bundles) need to be reachable before any cluster has finished installing. The plan document vault-oss-vm-plan.md is explicit: “the new active target is a Vault OSS service running directly on dedicated VMs on the private lab network. Vault must be independent of OpenShift and RKE2 so OpenShift secret delivery does not depend on a cluster that is itself being rebuilt.”
  2. Nexus is the local image mirror that OpenShift’s release payload, OperatorHub catalogs, operator operands, and certified-catalog images all resolve to via IDMS/ITMS. If the mirror were instead a Quay instance running inside the cluster, a cluster failure would take the mirror down with it and break the cluster’s own reinstall path. ADR 0019 forces the mirror onto its own VM.
  3. DNS has to be running before any cluster node boots. PowerDNS on a single dual-homed VM is fine for a lab; the moment it depends on OpenShift it stops being a recovery tool and becomes another thing that has to be recovered.
  4. HAProxy terminates TLS for ~40 platform service hostnames using a public Let’s Encrypt wildcard. OpenShift’s own ingress controller terminates TLS for *.apps.<cluster>.sub.comptech-lab.com independently. Putting OpenShift Routes behind HAProxy would mean two TLS terminators fighting over the same wildcard, duplicated edge logic, and a fourth tier of certificate distribution; see the HAProxy subsection for the full rationale.
  5. Object storage (MinIO) backs Loki, Tempo, Quay, OADP backups, and Vault Raft snapshots. Putting it inside a cluster would mean cluster backups depend on the cluster they are meant to restore, which is a bootstrap loop.
  6. CI (Jenkins) builds images that flow into Nexus. Putting Jenkins in-cluster on the cluster it builds for would mean every cluster upgrade has to either evict Jenkins or run with no CI for the duration.

The pattern across all six items is “the dependency must not move into the thing that depends on it.” VM-hosting is how this lab keeps that property without introducing a second OpenShift cluster just for management. (The hub cluster does management, but for OpenShift-native operations — ACM placement, GitOps wiring — not for the underlying supply chain.)
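The rule can be expressed as a small placement check. This is a sketch under stated assumptions rather than anything the lab runs: the `HOSTED_ON` and `CLUSTER_NEEDS` tables, the function `rebuild_blockers`, and the cluster name `hub` are hypothetical names chosen for illustration, and the set of rebuild dependencies is simplified from the list above.

```python
# Where each service is hosted: "vm" or the name of a cluster.
HOSTED_ON = {
    "vault": "vm",
    "nexus": "vm",
    "powerdns": "vm",
    "haproxy": "vm",
    "minio": "vm",
    "jenkins": "vm",
}

# Services a cluster relies on while being reinstalled or restored
# (simplified from the list above: secrets, images, DNS, backups).
CLUSTER_NEEDS = {"vault", "nexus", "powerdns", "minio"}


def rebuild_blockers(cluster, needs, hosted_on):
    """Return the services a cluster needs that are hosted on that same cluster.

    A non-empty result means rebuilding the cluster depends on the
    cluster already being up, which is the loop the lab avoids.
    """
    return sorted(svc for svc in needs if hosted_on.get(svc) == cluster)


# Current placement: nothing the cluster needs lives inside it.
print(rebuild_blockers("hub", CLUSTER_NEEDS, HOSTED_ON))

# Hypothetical placement: the image mirror moved into the cluster.
moved = dict(HOSTED_ON, nexus="hub")
print(rebuild_blockers("hub", CLUSTER_NEEDS, moved))
```

With the VM placement the check returns an empty list. Moving the mirror into the hypothetical `hub` cluster makes it return `['nexus']`, the situation described in item 2.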

What runs in-cluster vs. on a VM

| Function | VM | OpenShift |
| --- | --- | --- |
| Public TLS edge (platform VM hostnames) | HAProxy edge VM | - |
| Public TLS edge (cluster app hostnames) | - | OpenShift Router (per cluster) |
| Authoritative DNS for lab zone | PowerDNS VM | - |
| Image mirror & app registry | Nexus VM | - |
| Secret store | Vault VM cluster | - |
| Secret delivery into pods | - | ESO (External Secrets Operator) per cluster |
| Object storage backing services | MinIO VM | - |
| Backup product (OADP) | - | OADP operator on each cluster |
| Service mesh | - | OSSM 3 + Kiali |
| Logging/tracing pipelines | SigNoz, monitoring-0 (collectors) | Loki/Tempo operator (per cluster, ships to MinIO) |
| CI builds | Jenkins + jenkins-agent VMs | - |
| Build runtime (non-OCP) | docker-runtime VM | - |

Network plane (general shape)

  • A single Linux bridge per hypervisor — br30 — carries the lab /16 allocation. VMs on the bridge get static addresses inside a platform /24; the OpenShift node range lives one zone over inside a separate /24 in the same /16. IPv6 is intentionally disabled across the platform per ADR 0005.
  • A single recursor address on the DNS VM is what every other VM and OpenShift node points to as upstream DNS. That recursor forwards the lab subdomain to the authoritative daemon on the same host and recurses public names via 8.8.8.8 and 1.1.1.1.
  • HAProxy listens on a private edge address (:443/:80) and on the public address(es) routed in from outside. Both binds reach the same set of frontends so lab traffic and public traffic produce identical TLS behavior.
  • The *.apps.sub.comptech-lab.com Let’s Encrypt wildcard certificate is loaded once at HAProxy’s loopback bind (127.0.0.1:8443). All VM-side TLS for the wildcard re-decrypts there.
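The addressing and forwarding rules above can be sketched with Python's standard `ipaddress` module. The two /24 networks below are hypothetical placeholders (the real allocations are in the allocation table listed under References), and `pick_upstream` with its `local-authoritative` label is an illustrative stand-in for the recursor's forwarding configuration, not PowerDNS syntax.

```python
import ipaddress

LAB_BRIDGE = ipaddress.ip_network("30.30.0.0/16")  # br30 allocation, from the text

# Hypothetical /24s for illustration only; not the lab's real allocations.
PLATFORM_NET = ipaddress.ip_network("30.30.10.0/24")
OPENSHIFT_NET = ipaddress.ip_network("30.30.20.0/24")

LAB_ZONE = "sub.comptech-lab.com"
PUBLIC_RESOLVERS = ["8.8.8.8", "1.1.1.1"]  # from the text


def pick_upstream(qname: str) -> list:
    """Choose where the recursor sends a query.

    Names at or below the lab zone go to the authoritative daemon on the
    same host; every other name is recursed via the public resolvers.
    """
    name = qname.lower().rstrip(".")
    if name == LAB_ZONE or name.endswith("." + LAB_ZONE):
        return ["local-authoritative"]
    return PUBLIC_RESOLVERS


# Both /24s sit inside the bridge /16 and do not overlap each other.
print(PLATFORM_NET.subnet_of(LAB_BRIDGE), OPENSHIFT_NET.subnet_of(LAB_BRIDGE))
print(PLATFORM_NET.overlaps(OPENSHIFT_NET))

print(pick_upstream("signoz.apps.sub.comptech-lab.com"))
print(pick_upstream("example.org"))
```

The suffix test requires a leading dot, so a name such as `notsub.comptech-lab.com` is treated as public rather than as part of the lab zone.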

Internal-only specifics (exact host addresses, hardware identifiers, credential locations) are kept in opp-full-plat/connection-details/platform-admin-handoff.md.

How a request reaches a platform service

  1. Browser opens https://signoz.apps.sub.comptech-lab.com/.
  2. PowerDNS Recursor answers from cache or from its own authoritative side: name → HAProxy’s public edge IP.
  3. HAProxy accepts the TCP connection on :443, reads SNI, matches signoz.apps.sub.comptech-lab.com, hands the bytes to its internal vm-tls frontend at 127.0.0.1:8443 via PROXY protocol.
  4. HAProxy vm-tls terminates TLS with the wildcard cert, reads the HTTP Host header, and routes to signoz-vm-be → SigNoz VM port :8080.
  5. SigNoz answers; the response returns along the same path, re-encrypted by vm-tls on the way out.

That five-step path is the same for every VM-hosted service that uses the wildcard. The subsection on HAProxy describes the SNI-passthrough plus loopback re-decrypt pattern in detail, including why this lab uses it instead of decrypting at the public bind directly.
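The two routing decisions in steps 3 and 4 can be modeled as two lookups. This is a simplified sketch, not HAProxy configuration: it assumes the public bind matches on the wildcard suffix (the real frontend may match hostnames individually), and the `grafana-vm-be` entry with port 3000 is a hypothetical second row added only to show the table shape. The `signoz-vm-be` backend, port 8080, and the 127.0.0.1:8443 bind come from the steps above.

```python
VM_TLS_BIND = ("127.0.0.1", 8443)  # loopback vm-tls frontend, from the text
PLATFORM_SUFFIX = ".apps.sub.comptech-lab.com"

# Stage 2 table: HTTP Host header -> (backend name, target port).
# Only the signoz row comes from the text; the grafana row is hypothetical.
HOST_BACKENDS = {
    "signoz.apps.sub.comptech-lab.com": ("signoz-vm-be", 8080),
    "grafana.apps.sub.comptech-lab.com": ("grafana-vm-be", 3000),
}


def stage1_sni(sni: str):
    """Public bind: read SNI without decrypting and pick the next hop.

    Wildcard names are handed to the loopback vm-tls frontend; anything
    else has no next hop in this sketch.
    """
    if sni.lower().endswith(PLATFORM_SUFFIX):
        return VM_TLS_BIND
    return None


def stage2_host(host: str):
    """vm-tls frontend: TLS is terminated here, so the Host header is readable."""
    return HOST_BACKENDS.get(host.lower().split(":")[0])


def route(sni: str, host: str):
    """Run both stages; return (backend, port), or None if either stage rejects."""
    if stage1_sni(sni) is None:
        return None
    return stage2_host(host)


print(route("signoz.apps.sub.comptech-lab.com", "signoz.apps.sub.comptech-lab.com"))
print(route("example.org", "example.org"))
```

Keeping the stages separate mirrors the layout described above: the first decision needs only the TLS ClientHello, while the second needs the decrypted HTTP request.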

Reading order in this section

  1. VM platform (§2.2) — libvirt + cloud-init + br30, deterministic MAC/IP allocation, the base image and how new VMs join the fleet.
  2. DNS (§2.3) — the authoritative + recursor split on one dual-homed VM, zone contents, forwarder topology, failure modes.
  3. Edge ingress / HAProxy (§2.4) — architecture, frontends, SNI passthrough + loopback re-decrypt, backend conventions, certificates.
  4. Object storage / MinIO (§2.5) — deployment, bucket inventory, IAM, evidence lifecycle.
  5. Vault VM (§2.6) — deployment + storage, TLS, auth methods, secret engines, app namespacing, rotation/DR, monitoring/audit.

§2.7 onward (Nexus, CI VMs, observability VMs, other platform VMs, GitLab) are documented in their own sections under this same top-level 02-lab-infrastructure.

References

  • opp-full-plat/connection-details/platform-admin-handoff.md
  • opp-full-plat/connection-details/minio.md
  • opp-full-plat/connection-details/vault-app-secrets.md
  • opp-full-plat/plans/disconnected-rebuild/environments/dc-lab/environment-profile.md
  • opp-full-plat/plans/disconnected-rebuild/environments/dc-lab/allocation-table.md
  • opp-full-plat/plans/disconnected-rebuild/environments/dc-lab/vault-oss-vm-plan.md
  • ADRs 0005 (rebuild network/ingress/PKI), 0009 (Jenkins VM), 0010 (SigNoz VM), 0012 (monitoring VM), 0013 (DefectDojo VM), 0019 (Nexus-only image supply chain)

Last reviewed: 2026-05-11