Installation Manual - 41 Spoke worker disable users coredumps preflight
How the next spoke-dc-v7 worker coredump-family MachineConfig control was selected and preflighted without applying it.
This chapter records the preflight gate for the next worker
coredump-family hardening control on spoke-dc-v7.
No MachineConfig was applied in this gate. The outcome is a selected candidate and a rollout warning.
Target State
| Item | Value |
|---|---|
| Governance issue | OP-GF-SPOKEDCV7-28, issue #378 |
| Cluster | spoke-dc-v7 |
| Selected rule | rhcos4-high-worker-disable-users-coredumps |
| Prospective MachineConfig | 75-worker-disable-users-coredumps |
| Evidence report | reports/compliance/spoke-dc-v7/20260517/worker-disable-users-coredumps-preflight.md |
| Rollout status | Not applied |
Access Path
Run operational commands from the bootstrap VM through dl385-2.
ssh ze@dl385-2
ssh gf-ocp-bootstrap-01
export HUB_KUBECONFIG=/home/ze/ocp-greenfield-deployment/artifacts/openshift/hub-dc-v7/auth/kubeconfig
export SPOKE_KUBECONFIG=/home/ze/ocp-greenfield-deployment/artifacts/openshift/spoke-dc-v7/auth/kubeconfig
Do not print kubeconfigs, kubeadmin passwords, pull secrets, PAT values, repository private keys, Secret data, or full Secret manifests.
Starting Health
Validate cluster and storage health before preflighting a worker MCP change.
oc --kubeconfig "$HUB_KUBECONFIG" -n openshift-gitops \
get applications.argoproj.io spoke-dc-v7-cluster-config \
-o custom-columns=NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status,REV:.status.sync.revision
oc --kubeconfig "$SPOKE_KUBECONFIG" get clusterversion version
oc --kubeconfig "$SPOKE_KUBECONFIG" get nodes
oc --kubeconfig "$SPOKE_KUBECONFIG" get mcp
oc --kubeconfig "$SPOKE_KUBECONFIG" get co --no-headers \
| awk '$3!="True" || $4!="False" || $5!="False" {print}'
oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-storage \
get noobaa noobaa storagecluster ocs-storagecluster cephcluster ocs-storagecluster-cephcluster
oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-storage \
get cluster noobaa-db-pg-cluster \
-o jsonpath='ready={.status.readyInstances}/{.status.instances} primary={.status.currentPrimary}{"\n"}'
oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-storage \
get pods -l cnpg.io/cluster=noobaa-db-pg-cluster -o wide
Observed:
spoke-dc-v7-cluster-config Synced/Healthy at 8175ed896909906e8317a6c1f9514c4ce4bf942a
OpenShift 4.20.18 Available=True Progressing=False Failing=False
all six nodes Ready
master MCP Updated=True Updating=False Degraded=False 3/3
worker MCP Updated=True Updating=False Degraded=False 3/3
worker render rendered-worker-430d044e4d36ecc194bdcd0b451ca322
NooBaa=Ready
StorageCluster=Ready
CephCluster=Ready HEALTH_OK
CNPG=2/2 primary=noobaa-db-pg-cluster-2
NooBaa DB primary pod on spoke-dc-v7-worker-2
NooBaa DB replica pod on spoke-dc-v7-worker-1
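The MCP lines in the observed output can also be gated mechanically, in the same way the cluster operator check above filters with awk. The following is a minimal sketch, assuming the default column order of `oc get mcp --no-headers` (NAME, CONFIG, UPDATED, UPDATING, DEGRADED, ...). The function name `mcp_steady` is hypothetical and was not part of the recorded gate.

```shell
# Sketch: fail when any MachineConfigPool row is not steady.
# Assumes rows shaped like `oc get mcp --no-headers` output:
#   NAME CONFIG UPDATED UPDATING DEGRADED ...
mcp_steady() {
  awk '
    # A steady pool is Updated=True, Updating=False, Degraded=False.
    $3 != "True" || $4 != "False" || $5 != "False" {
      bad++
      print "not steady: " $1
    }
    # Exit 1 if any pool was flagged, 0 otherwise.
    END { exit (bad > 0) }
  '
}

# Hypothetical usage on the bootstrap VM:
#   oc --kubeconfig "$SPOKE_KUBECONFIG" get mcp --no-headers | mcp_steady \
#     || echo "stop: an MCP is not steady"
```

A non-zero exit status would be a reason to stop before any further preflight work.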
Candidate Selection
The fresh worker high scan from the previous chapter left these coredump-family failures:
rhcos4-high-worker-disable-users-coredumps FAIL medium
rhcos4-high-worker-service-systemd-coredump-disabled FAIL medium
rhcos4-high-worker-sysctl-kernel-core-pattern FAIL medium
Inspect the generated remediation shape before choosing a control.
for remediation in \
rhcos4-high-worker-disable-users-coredumps \
rhcos4-high-worker-service-systemd-coredump-disabled \
rhcos4-high-worker-sysctl-kernel-core-pattern; do
oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-compliance \
get complianceremediation "$remediation" -o json \
| jq -r '{name:.metadata.name, type:.spec.type, apply:.spec.apply, state:.status.applicationState, files:[.spec.current.object.spec.config.storage.files[]?.path], units:[.spec.current.object.spec.config.systemd.units[]?.name], kernelArguments:.spec.current.object.spec.kernelArguments}'
done
Observed remediation shapes:
| Candidate | Generated remediation | Decision |
|---|---|---|
| rhcos4-high-worker-disable-users-coredumps | one file under /etc/security/limits.d/ | selected |
| rhcos4-high-worker-service-systemd-coredump-disabled | masks systemd-coredump.socket and systemd-coredump.service | defer |
| rhcos4-high-worker-sysctl-kernel-core-pattern | writes `kernel.core_pattern = \|/bin/false` | defer |
The selected candidate is the smallest next step because it writes only one file:
/etc/security/limits.d/75-disable_users_coredumps.conf
That file contains a single line:
* hard core 0
Prospective MachineConfig
Use this as the prospective MachineConfig shape for a later GitOps rollout.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 75-worker-disable-users-coredumps
  labels:
    machineconfiguration.openshift.io/role: worker
    compliance.comptech-lab.com/gate: OP-GF-SPOKEDCV7-28
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
        - path: /etc/security/limits.d/75-disable_users_coredumps.conf
          mode: 420
          overwrite: true
          contents:
            source: "data:,%2A%20%20%20%20%20hard%20%20%20core%20%20%20%200"
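Two values in this manifest are easy to misread: the percent-encoded `source` and the decimal `mode`. The sketch below decodes both so they can be compared against the intended `limits.d` line and the usual octal notation. The helper names are hypothetical, and the decoder handles only the two escapes that appear in this data URL (`%2A` and `%20`). It is not a general URL decoder.

```shell
# Decode the data URL used in the prospective MachineConfig.
# Handles only %2A (asterisk) and %20 (space), which are the only
# percent-escapes present in this manifest.
decode_limits_source() {
  printf '%s\n' "$1" | sed -e 's/^data:,//' -e 's/%20/ /g' -e 's/%2A/*/g'
}

# Ignition file modes are written as decimal integers;
# print the octal equivalent for comparison with chmod-style notation.
mode_to_octal() {
  printf '%o\n' "$1"
}

decode_limits_source 'data:,%2A%20%20%20%20%20hard%20%20%20core%20%20%20%200'
mode_to_octal 420
```

The first call prints the `* hard core 0` limits line with its original column spacing, and the second prints 644, so the file is written as a world-readable, owner-writable configuration file.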
Server-Side Dry Run
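Confirm that neither the current worker render nor any worker node already contains the target file. The jq check prints true if the render includes the file and false otherwise.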
worker_render=$(oc --kubeconfig "$SPOKE_KUBECONFIG" \
get mcp worker -o jsonpath='{.status.configuration.name}')
oc --kubeconfig "$SPOKE_KUBECONFIG" get machineconfig "$worker_render" -o json \
| jq -r 'any(.spec.config.storage.files[]?; .path == "/etc/security/limits.d/75-disable_users_coredumps.conf")'
for node in spoke-dc-v7-worker-0 spoke-dc-v7-worker-1 spoke-dc-v7-worker-2; do
oc --kubeconfig "$SPOKE_KUBECONFIG" debug "node/$node" --quiet -- \
chroot /host sh -c \
"test -f /etc/security/limits.d/75-disable_users_coredumps.conf && echo present || echo absent"
done
Observed:
render_has_disable_users_coredumps=false
spoke-dc-v7-worker-0 absent
spoke-dc-v7-worker-1 absent
spoke-dc-v7-worker-2 absent
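The per-node result can be turned into a pass/fail gate. The following is a minimal sketch, assuming each input line has the form `<node> <present|absent>` as in the observed output above. The function name `all_nodes_absent` is hypothetical. Empty input is treated as a failure so that a broken loop does not pass silently.

```shell
# Sketch: succeed only if every node line reports "absent".
# Input lines are expected as "<node> <present|absent>".
all_nodes_absent() {
  awk '
    $2 != "absent" { bad++; print "unexpected: " $0 }
    # Fail on any unexpected line, and also when no lines were read.
    END { exit (bad > 0 || NR == 0) }
  '
}

# Hypothetical usage, prefixing each result with the node name:
#   for node in spoke-dc-v7-worker-0 spoke-dc-v7-worker-1 spoke-dc-v7-worker-2; do
#     printf '%s ' "$node"
#     oc --kubeconfig "$SPOKE_KUBECONFIG" debug "node/$node" --quiet -- \
#       chroot /host sh -c \
#       "test -f /etc/security/limits.d/75-disable_users_coredumps.conf && echo present || echo absent"
#   done | all_nodes_absent
```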
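Save the prospective MachineConfig above as /tmp/75-worker-disable-users-coredumps.yaml, then run a server-side dry-run apply against it. The dry run validates the object with the API server without persisting it.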
oc --kubeconfig "$SPOKE_KUBECONFIG" apply --dry-run=server \
-f /tmp/75-worker-disable-users-coredumps.yaml
Observed:
machineconfig.machineconfiguration.openshift.io/75-worker-disable-users-coredumps created (server dry run)
Drainability Check
Run only server-side dry-run drain checks during this gate.
for node in spoke-dc-v7-worker-0 spoke-dc-v7-worker-1 spoke-dc-v7-worker-2; do
oc --kubeconfig "$SPOKE_KUBECONFIG" adm drain "$node" \
--ignore-daemonsets \
--delete-emptydir-data \
--dry-run=server \
--timeout=20s
done
Observed:
| Worker | Result | Notes |
|---|---|---|
| spoke-dc-v7-worker-0 | pass | no NooBaa DB primary |
| spoke-dc-v7-worker-1 | pass | hosts NooBaa DB replica |
| spoke-dc-v7-worker-2 | fail | hosts protected NooBaa DB primary |
spoke-dc-v7-worker-2 failed because evicting the NooBaa DB primary pod would violate its PodDisruptionBudget (PDB):
error when evicting pods/"noobaa-db-pg-cluster-2" -n "openshift-storage": Cannot evict pod as it would violate the pod's disruption budget.
error when evicting pods/"noobaa-db-pg-cluster-2" -n "openshift-storage": global timeout reached: 20s
PDB noobaa-db-primary allowed=0 currentHealthy=1 desiredHealthy=1
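The `allowed=0` value follows from the other two numbers: a PDB permits as many disruptions as there are healthy pods above the desired minimum, and never fewer than zero. The sketch below works through that arithmetic and shows one way the three fields could be read from the live object. Both helper names are hypothetical, the PDB name and namespace are taken from the recorded output above, and the `oc` call was not part of the recorded gate.

```shell
# disruptionsAllowed as reported in PDB status:
# healthy pods above the desired minimum, floored at zero.
disruptions_allowed() {
  current=$1
  desired=$2
  allowed=$((current - desired))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

# Hypothetical helper: read the same three fields from the cluster.
# Defined here but not invoked; it needs cluster access.
show_primary_pdb() {
  oc --kubeconfig "$SPOKE_KUBECONFIG" -n openshift-storage \
    get pdb noobaa-db-primary \
    -o jsonpath='allowed={.status.disruptionsAllowed} currentHealthy={.status.currentHealthy} desiredHealthy={.status.desiredHealthy}{"\n"}'
}

# The recorded worker-2 case: currentHealthy=1, desiredHealthy=1.
disruptions_allowed 1 1
```

With one healthy primary and one required, the result is 0, which is why the eviction on spoke-dc-v7-worker-2 is refused regardless of the drain timeout.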
Final State
spoke-dc-v7-cluster-config Synced/Healthy at 8175ed896909906e8317a6c1f9514c4ce4bf942a
OpenShift 4.20.18 Available=True Progressing=False Failing=False
master MCP rendered-master-394597acba416ab151cf83289fece615 Updated=True Updating=False Degraded=False 3/3
worker MCP rendered-worker-430d044e4d36ecc194bdcd0b451ca322 Updated=True Updating=False Degraded=False 3/3
all six nodes Ready
nonsteady-co-count=0
NooBaa=True/SystemPhaseReady
StorageCluster=Ready
CephCluster=Ready health=HEALTH_OK
CNPG ready=2/2 currentPrimary=noobaa-db-pg-cluster-2 targetPrimary=noobaa-db-pg-cluster-2
noobaa-db-pg-cluster-1 replica on spoke-dc-v7-worker-1
noobaa-db-pg-cluster-2 primary on spoke-dc-v7-worker-2
PDB noobaa-db-primary allowed=0 currentHealthy=1 desiredHealthy=1
Next Step
The recommended next control is
rhcos4-high-worker-disable-users-coredumps.
Do not apply it from this preflight alone. A later rollout needs explicit approval and must account for worker-2 not being drainable while it hosts the NooBaa DB primary.