Installation Manual - 28 Spoke compliance findings triage

How to collect the first spoke-dc-v7 Compliance Operator findings, classify them, and choose the first remediation order.

This chapter records the read-only triage performed immediately after the first spoke-dc-v7 compliance baseline run. Complete it before applying any generated Compliance Operator remediations.

The goal is to separate true greenfield configuration gaps from install-time design decisions, future-service dependencies, and manual attestations.

Target State

| Item | Value |
|------|-------|
| Governance issue | OP-GF-SPOKEDCV7-16, issue #362 |
| Cluster | spoke-dc-v7 |
| GitOps revision under review | 0932f14 |
| Action type | Read-only evidence collection and classification |
| Evidence namespace | openshift-compliance |
| Local evidence report | reports/compliance/spoke-dc-v7/20260516/triage.md |

Collection

Collect structured API output from the bootstrap host. Do not copy raw result PVC contents for this triage gate, and do not print Secret manifests, kubeconfigs, pull secrets, or token material.

export SPOKE_KUBECONFIG=/home/ze/ocp-greenfield-deployment/artifacts/openshift/spoke-dc-v7/auth/kubeconfig

mkdir -p reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)

oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-compliance \
  get compliancecheckresults -o json \
  > reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/compliancecheckresults.json

oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-compliance \
  get complianceremediations -o json \
  > reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/complianceremediations.json
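The commands in this chapter re-evaluate `$(date -u +%Y%m%d)` on every invocation, so a session that crosses midnight UTC, or resumes on a later day, would read and write a different directory. One way to avoid that, sketched here, is to pin the directory once. `REPORT_DATE` and `REPORT_DIR` are illustrative variable names and are not part of the original procedure.

```shell
# Pin the evidence directory once so that later commands share the same path.
# REPORT_DATE and REPORT_DIR are hypothetical names introduced for this sketch.
REPORT_DATE="${REPORT_DATE:-$(date -u +%Y%m%d)}"
REPORT_DIR="reports/compliance/spoke-dc-v7/${REPORT_DATE}"
mkdir -p "$REPORT_DIR"
echo "$REPORT_DIR"
```

With this in place, the redirections can target `"$REPORT_DIR/compliancecheckresults.json"` and so on. To resume work on an earlier evidence set, export `REPORT_DATE` (for example `20260516`) before running the snippet.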

Generate a compact status count:

jq -r '.items[].status' \
  reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/compliancecheckresults.json \
  | sort | uniq -c

Generate a non-pass rule list:

jq -r '
  .items[]
  | select(.status != "PASS")
  | [
      .metadata.name,
      .status,
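      # rule name: try the annotation first, then the label
      (.metadata.annotations["compliance.openshift.io/rule"] // .metadata.labels["compliance.openshift.io/rule"] // ""),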
      (.metadata.labels["compliance.openshift.io/scan-name"] // "")
    ]
  | @tsv
' reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/compliancecheckresults.json \
  > reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/nonpass-checkresults.tsv

Generate a remediation inventory:

jq -r '
  .items[]
  | [
      .metadata.name,
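      # rule name: try the annotation first, then the label
      (.metadata.annotations["compliance.openshift.io/rule"] // .metadata.labels["compliance.openshift.io/rule"] // ""),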
      (.metadata.labels["compliance.openshift.io/scan-name"] // ""),
      (.status.applicationState // "unknown")
    ]
  | @tsv
' reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/complianceremediations.json \
  > reports/compliance/spoke-dc-v7/$(date -u +%Y%m%d)/remediations.tsv

First Triage Result

The first evidence run completed with all fourteen scans in DONE state.

| Status | Count |
|--------|-------|
| PASS | 1022 |
| FAIL | 438 |
| MANUAL | 107 |

Unique non-pass findings:

| Type | Count |
|------|-------|
| Unique non-pass rules | 247 |
| Unique failed rules | 217 |
| Unique manual rules | 30 |
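The unique counts can be reproduced from the non-pass rule list generated during collection. The sketch below assumes that uniqueness is measured on the rule column (column 3) of `nonpass-checkresults.tsv` and that the column is populated; `count_unique_rules` is a hypothetical helper name, and the chapter does not state how the original counts were derived.

```shell
# count_unique_rules STATUS < nonpass-checkresults.tsv
# Columns: name, status, rule, scan-name.
# An empty STATUS counts unique rules across every row.
count_unique_rules() {
  awk -F'\t' -v want="$1" '
    (want == "" || $2 == want) && $3 != "" { seen[$3] = 1 }
    END { n = 0; for (r in seen) n++; print n }
  '
}
```

For example, `count_unique_rules FAIL < nonpass-checkresults.tsv` prints the number of distinct failed rules, and `count_unique_rules '' < nonpass-checkresults.tsv` prints the number of distinct non-pass rules.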

The large result count is expected for the first run because the rhcos4-high node profiles generate many MachineConfig-backed audit, sysctl, kernel module, SSH, and host-hardening checks.

Classification

Classify findings into these groups before opening remediation work.

| Group | Examples | Handling |
|-------|----------|----------|
| Immediate greenfield configuration gaps | audit-profile-set, OAuth inactivity and token age, image registry allow lists, ingress certificate and TLS checks, audit forwarding, File Integrity Operator, namespace baseline controls, login/MOTD banner | Remediate first through GitOps in small platform-config batches. |
| Install-time or design-decision gaps | machine-volume-encrypted, node auditd/sysctl/kernel hardening, SSH service posture, USB boot/device posture | Require ADR or explicit issue decision before changing node posture. |
| Documented or likely tailoring exceptions | ODF route TLS behavior, temporary HTPasswd break-glass IdP, ingress TLS profile compatibility, no cluster-wide proxy | Carry as tailored exceptions or operational attestations until policy changes. |
| Future-service dependencies | enterprise IdP, alert receiver, Security Profiles Operator, tenant templates | Defer until the dependent service is installed. |
| Manual attestations | RBAC least privilege, SCC review, ServiceAccount use, secrets handling, namespace partitioning | Build an evidence pack; do not treat these as automatic remediation work. |
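A first-pass classification can be scripted so that only the leftovers need manual sorting. The sketch below is a starting point under stated assumptions: the glob patterns and group labels are illustrative guesses at rule-name shapes, not an authoritative mapping, and `classify_rule` and `classify_rules` are hypothetical helper names. Review every assignment before recording it in the triage report.

```shell
# classify_rule RULE: print a provisional group for one rule name.
# The patterns are illustrative assumptions; the first matching pattern wins.
classify_rule() {
  case "$1" in
    audit-profile-set|oauth-*|*allowed-registries*|ingress-*|file-integrity-*)
      echo immediate-config-gap ;;
    *machine-volume-encrypted*|audit-rules-*|sysctl-*|kernel-module-*|sshd-*)
      echo design-decision ;;
    rbac-*|scc-*|*secrets*)
      echo manual-attestation ;;
    *)
      echo unclassified ;;
  esac
}

# classify_rules: read rule names (one per line) and emit "rule<TAB>group".
classify_rules() {
  while IFS= read -r rule; do
    if [ -n "$rule" ]; then
      printf '%s\t%s\n' "$rule" "$(classify_rule "$rule")"
    fi
  done
}
```

A possible invocation is `cut -f3 nonpass-checkresults.tsv | sort -u | classify_rules`. Anything reported as `unclassified` still needs a manual decision against the table above.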

Remediation Order

Use this order after the triage gate:

  1. Low-risk platform config:
    • API audit profile;
    • OAuth inactivity and token max-age;
    • image registry allow lists;
    • default ingress certificate validation or fix;
    • basic namespace baseline where appropriate.
  2. Logging and file integrity:
    • cluster logging or selected logging stack;
    • TLS audit forwarding;
    • File Integrity Operator and notification path.
  3. Manual attestation pack:
    • RBAC exports;
    • SCC exception catalogue;
    • ServiceAccount and secrets posture;
    • namespace inventory;
    • alert receiver inventory.
  4. Node hardening batches:
    • auditd rule families;
    • sysctl and kernel module hardening;
    • SSH policy;
    • USB policy;
    • host banner, core dump, and logrotate settings.
  5. Design decisions:
    • disk encryption posture for VM masters and physical workers;
    • whether SSH remains enabled for operations;
    • whether TLS Modern is required or Intermediate remains accepted;
    • whether cluster-wide proxy is required.

Do not bulk-apply all generated remediations. The first run produced hundreds of generated remediation objects. Apply MachineConfig-backed changes in narrow batches and validate MachineConfigPool rollout after every batch.
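A narrow batch can be selected from the remediation inventory and reviewed before anything is applied. The sketch below prints one `oc patch` command per unapplied remediation whose name contains a given substring. It relies on the Compliance Operator convention of setting `spec.apply` to `true` on a ComplianceRemediation; `apply_batch`, the `APPLY` switch, and matching by name substring are assumptions made for illustration.

```shell
# apply_batch PATTERN < remediations.tsv
# Columns: name, rule, scan-name, applicationState.
# Prints the patch commands by default; set APPLY=1 to execute them.
apply_batch() {
  awk -F'\t' -v p="$1" '$4 != "Applied" && index($1, p) > 0 { print $1 }' |
  while IFS= read -r name; do
    set -- oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-compliance \
      patch complianceremediation "$name" --type merge -p '{"spec":{"apply":true}}'
    if [ "${APPLY:-0}" = "1" ]; then
      "$@" </dev/null
    else
      echo "$*"
    fi
  done
}
```

For example, `apply_batch -sysctl- < remediations.tsv` lists the sysctl batch for review. The printed form drops shell quoting around the JSON patch, so it is meant for inspection, not for copy and paste. Check `oc get mcp` after each applied batch before selecting the next one.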

Validation

Reconfirm cluster health before and after evidence collection:

oc --kubeconfig "$SPOKE_KUBECONFIG" get co
oc --kubeconfig "$SPOKE_KUBECONFIG" get mcp
oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-compliance get compliancescan

Expected:

  • every ClusterOperator Available, not Progressing, and not Degraded;
  • master and worker MCPs updated, not updating, and not degraded;
  • all expected compliance scans in DONE state.
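The scan check can be turned into a pass/fail gate instead of a visual inspection. The sketch below reads `name<TAB>phase` lines and fails if any scan is not `DONE`; `scans_not_done` is a hypothetical helper name, and the jsonpath expression in the comment assumes the phase is exposed at `.status.phase`.

```shell
# scans_not_done: read "name<TAB>phase" lines on stdin, print every scan whose
# phase is not DONE, and return non-zero if at least one was printed.
#
# Possible input (assumes the phase lives at .status.phase):
#   oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-compliance get compliancescan \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
scans_not_done() {
  awk -F'\t' 'NF && $2 != "DONE" { print; bad = 1 } END { exit bad + 0 }'
}
```

Piping the `oc` command from the comment into `scans_not_done` prints nothing and exits 0 when every scan is complete, which makes it usable as a guard in a wrapper script.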

The next execution gate should remediate only the low-risk platform configuration group, then rerun the affected scans or perform a focused evidence recheck.
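To find which scans need a rerun after a platform-config batch, the non-pass rule list can be filtered by rule name. The sketch below prints one rescan command per affected scan for review. It relies on the Compliance Operator `compliance.openshift.io/rescan` annotation; `affected_scans` and `rescan_commands` are hypothetical helper names, and matching on a name substring is an assumption.

```shell
# affected_scans PATTERN < nonpass-checkresults.tsv
# Columns: name, status, rule, scan-name.
# Prints the unique scan names whose check-result name contains PATTERN.
affected_scans() {
  awk -F'\t' -v p="$1" 'index($1, p) > 0 && $4 != "" { print $4 }' | sort -u
}

# rescan_commands: read scan names on stdin and print one annotate command each.
rescan_commands() {
  while IFS= read -r scan; do
    if [ -n "$scan" ]; then
      echo "oc --kubeconfig \"\$SPOKE_KUBECONFIG\" -n openshift-compliance annotate compliancescan/$scan compliance.openshift.io/rescan="
    fi
  done
}
```

For example, `affected_scans audit-profile-set < nonpass-checkresults.tsv | rescan_commands` lists the commands for the scans that reported that rule. After the rescans finish, repeat the collection steps to capture fresh evidence.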

Last reviewed: 2026-05-16