CI/CD with Gitea Actions
Turn every project on the platform into a CI/CD-driven artifact: lint and test on every push, train and register on a tagged commit, deploy on a green tag — using Gitea's GitHub-Actions-compatible runner.
By the end of this module you will have:
- A Gitea Actions runner registered on the GPU server with a gpu label.
- A standard CI workflow for every project: lint → test → build container → push to Gitea’s registry.
- A CD workflow triggered by git tags: train, evaluate, promote in MLflow, redeploy the serving container.
- A clean separation between shared platform credentials (Gitea secrets) and per-project secrets.
- An ADR pinning the CI choice and the tag-driven promotion pattern.
This is the module that turns “I shipped a capstone” into “the platform ships every capstone, repeatedly, without me at the keyboard.”
Why Gitea Actions, not GitHub Actions or Jenkins
Gitea Actions is designed to be compatible with GitHub Actions: the same .yml workflow syntax, and most marketplace actions (actions/checkout@v4, actions/setup-python@v5, etc.) work unchanged. The compatibility means:
- Students learn the GitHub Actions skill they’ll use at any future job.
- You can lift workflows from public repos without rewriting them.
- The runner runs on the GPU server, with access to MinIO, MLflow, the model registry, and (crucially) Slurm.
Jenkins still works; it just requires more babysitting per pipeline than is justified for a six-student cohort.
Step 1 — Install the runner
sudo useradd -m -s /bin/bash gitea-runner
sudo usermod -a -G docker gitea-runner
sudo -u gitea-runner mkdir -p /home/gitea-runner/runner
cd /home/gitea-runner/runner
sudo -u gitea-runner curl -fsSLo act_runner https://gitea.com/gitea/act_runner/releases/download/v0.2.11/act_runner-0.2.11-linux-amd64
sudo -u gitea-runner chmod +x act_runner
Register against the Gitea server. From the Gitea UI: Site Administration → Actions → Runners → Create new Runner — copy the token. Then:
sudo -u gitea-runner /home/gitea-runner/runner/act_runner register \
--no-interactive \
--instance http://localhost:3000 \
--token <registration-token> \
--name gpu-runner-01 \
--labels gpu,cpu,docker
The gpu label is the one workflows will target with runs-on: gpu to land jobs on this host. The cpu and docker labels are aliases for the same runner so generic CI jobs also land here.
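To make the label routing concrete, here is a minimal Python sketch of the matching rule: a job is eligible for a runner when every label in its runs-on value is among the runner's labels. The function name `runner_accepts` is hypothetical, and this is a model of the behaviour for illustration, not Gitea's scheduler code.

```python
def runner_accepts(runs_on, runner_labels):
    """Return True if a job with this runs-on value may land on the runner."""
    # In workflow YAML, runs-on is either a single label or a list of labels.
    wanted = [runs_on] if isinstance(runs_on, str) else list(runs_on)
    # Every requested label must be present on the runner.
    return set(wanted) <= set(runner_labels)


# Labels registered above for gpu-runner-01.
GPU_RUNNER = ["gpu", "cpu", "docker"]

if __name__ == "__main__":
    for request in ("gpu", "docker", ["gpu", "docker"], "ubuntu-latest"):
        print(request, "->", runner_accepts(request, GPU_RUNNER))
```

Under this model, a workflow lifted from a public repo with runs-on: ubuntu-latest would stay queued until it is changed to one of the three labels registered here.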
Install as a systemd unit:
sudo tee /etc/systemd/system/act-runner.service > /dev/null <<'UNIT'
[Unit]
Description=Gitea Actions Runner
After=network.target
[Service]
User=gitea-runner
WorkingDirectory=/home/gitea-runner/runner
ExecStart=/home/gitea-runner/runner/act_runner daemon
Restart=on-failure
[Install]
WantedBy=multi-user.target
UNIT
sudo systemctl enable --now act-runner
sudo systemctl status act-runner
The runner now appears in the Gitea UI as Online.
Step 2 — A standard ci.yml
Every project on the platform forks the cookiecutter from module 03 and gets .gitea/workflows/ci.yml:
name: CI
on:
  push:
    branches: [main]
  pull_request:
jobs:
  lint-and-test:
    runs-on: docker
    container:
      image: ghcr.io/astral-sh/uv:python3.12-bookworm-slim
    steps:
      - uses: actions/checkout@v4
      - name: Install deps
        run: uv sync --frozen --all-extras
      - name: Lint
        run: |
          uv run ruff check .
          uv run ruff format --check .
      - name: Type check
        run: uv run mypy src/
      - name: Tests
        run: uv run pytest -x --cov=src --cov-report=term-missing
      - name: dbt build (if project has analytics/)
        if: hashFiles('analytics/dbt_project.yml') != ''
        env:
          DBT_PG_PASSWORD: ${{ secrets.DBT_PG_PASSWORD }}
        run: |
          cd analytics
          uv run dbt build --profiles-dir ../.dbt
  build-image:
    needs: lint-and-test
    runs-on: docker
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - name: Log in to Gitea registry
        run: echo "${{ secrets.GITEA_TOKEN }}" | docker login http://gitea.example.com:3000 -u ${{ github.actor }} --password-stdin
      - name: Build and push
        run: |
          IMAGE="gitea.example.com:3000/${{ github.repository }}:${{ github.sha }}"
          docker build -t $IMAGE .
          docker push $IMAGE
          docker tag $IMAGE "gitea.example.com:3000/${{ github.repository }}:latest"
          docker push "gitea.example.com:3000/${{ github.repository }}:latest"
Two patterns worth understanding:
- The image gets the commit SHA as a tag, not just latest. This is what makes “roll back to last week” a docker command, not a re-build.
- dbt build runs in CI if the project has an analytics/ directory. The same workflow handles tabular ML projects, RAG projects, and DE projects, with no per-project drift.
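The SHA-tagging pattern can be sketched in a few lines of Python. The helper names `image_refs` and `rollback_commands`, the repository `team/churn-model`, and the SHA `abc123` are all hypothetical. The sketch only builds the image references and the docker commands a rollback would need. It does not run them.

```python
def image_refs(registry, repository, sha):
    """Build the immutable SHA-tagged reference and the moving latest reference."""
    # Docker requires lowercase repository names; an owner/repo slug may contain capitals.
    base = f"{registry}/{repository.lower()}"
    return f"{base}:{sha}", f"{base}:latest"


def rollback_commands(registry, repository, old_sha):
    """Commands that repoint latest at an image already stored in the registry."""
    pinned, latest = image_refs(registry, repository, old_sha)
    return [
        ["docker", "pull", pinned],
        ["docker", "tag", pinned, latest],
        ["docker", "push", latest],
    ]


if __name__ == "__main__":
    for cmd in rollback_commands("gitea.example.com:3000", "team/churn-model", "abc123"):
        print(" ".join(cmd))
```

No build step appears in the rollback: the image for the old commit is still in the registry under its SHA tag, so only the latest pointer moves.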
Step 3 — A release.yml triggered by tags
Tagged releases do the heavy work: training, eval, registry promotion, deploy. This is the promote-by-tag pattern.
name: Release
on:
  push:
    tags:
      - "v*.*.*"
jobs:
  train-and-promote:
    runs-on: gpu # the runner with GPU access
    steps:
      - uses: actions/checkout@v4
      - name: Install deps
        run: uv sync --frozen
      - name: dbt build prod
        env:
          DBT_PG_PASSWORD: ${{ secrets.DBT_PG_PASSWORD }}
        run: cd analytics && uv run dbt build --target prod
      - name: Submit training to Slurm
        env:
          MLFLOW_TRACKING_URI: ${{ secrets.MLFLOW_TRACKING_URI }}
        run: |
          sbatch --wait scripts/train.sh
      - name: Evaluate against hold-out
        env:
          MLFLOW_TRACKING_URI: ${{ secrets.MLFLOW_TRACKING_URI }}
        run: uv run python -m src.eval --threshold reports/release_threshold.json
      - name: Promote to Production
        env:
          MLFLOW_TRACKING_URI: ${{ secrets.MLFLOW_TRACKING_URI }}
        run: |
          uv run python -c "
          import mlflow
          from mlflow.tracking import MlflowClient
          client = MlflowClient()
          mv = client.search_model_versions(\"name='${{ vars.MODEL_NAME }}'\")
          latest = max(mv, key=lambda v: int(v.version))
          client.set_registered_model_alias(
              name='${{ vars.MODEL_NAME }}', alias='production', version=latest.version)
          "
      - name: Restart serving container
        run: ssh deployer@<gpu-server> 'docker compose -f /opt/${{ vars.SERVICE_NAME }}/docker-compose.yml pull && docker compose -f /opt/${{ vars.SERVICE_NAME }}/docker-compose.yml up -d'
The sbatch --wait is what makes Slurm “real” in CI: the workflow blocks on the queued GPU job and continues only if it succeeds, because sbatch --wait exits with the job's exit code. Evaluation has to pass a threshold before the promotion step runs; a regression on the hold-out set fails the workflow, the promotion step never runs, and the production alias is left untouched.
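The module does not show what src.eval does internally, so the following is only a sketch of how such a gate could work. It assumes, hypothetically, that reports/release_threshold.json maps metric names to minimum acceptable values (for example {"auc": 0.85}) and that higher is better for every metric. The names `failed_metrics` and `gate` are illustrative.

```python
import json
import sys


def failed_metrics(metrics, thresholds):
    """Return {name: (observed, minimum)} for every metric below its minimum."""
    failures = {}
    for name, minimum in thresholds.items():
        observed = metrics.get(name)
        # A missing metric counts as a failure rather than a silent pass.
        if observed is None or observed < minimum:
            failures[name] = (observed, minimum)
    return failures


def gate(metrics, threshold_path):
    """Return a process exit code: 0 if all thresholds are met, 1 otherwise."""
    with open(threshold_path) as fh:
        thresholds = json.load(fh)  # assumed shape: {"metric_name": minimum_value}
    failures = failed_metrics(metrics, thresholds)
    for name, (observed, minimum) in failures.items():
        print(f"FAIL {name}: observed {observed}, minimum {minimum}", file=sys.stderr)
    return 1 if failures else 0
```

A real entry point would compute the hold-out metrics and call sys.exit(gate(metrics, path)). A non-zero exit code fails the step, and later steps in the job are skipped by default.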
For a model service, the Restart serving container step is the deploy. For a RAG service, it’s the same. For an embedding model swap, the same. Every deploy is a docker compose pull && up -d after a registry alias flips.
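Because every deploy follows an alias flip, a rollback can be modelled as the same flip in the other direction. The sketch below is an assumption-laden illustration: `previous_version` and `rollback` are hypothetical helpers, and "previous" is taken to mean the highest registered version number below the current one, which may not be the version that was last in production. It uses MlflowClient.get_model_version_by_alias, which requires an MLflow release with alias support.

```python
def previous_version(versions, current):
    """Pick the highest registered version number below the current one."""
    older = [int(v) for v in versions if int(v) < int(current)]
    if not older:
        raise ValueError(f"no version older than {current} to roll back to")
    return str(max(older))


def rollback(model_name, alias="production"):
    """Move the alias back one version; needs mlflow and MLFLOW_TRACKING_URI set."""
    # Imported lazily so previous_version stays usable without mlflow installed.
    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    current = client.get_model_version_by_alias(model_name, alias).version
    versions = [mv.version for mv in client.search_model_versions(f"name='{model_name}'")]
    target = previous_version(versions, current)
    client.set_registered_model_alias(name=model_name, alias=alias, version=target)
    return target
```

After the alias moves, the same docker compose pull and up -d step makes the serving container pick up the older model.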
Step 4 — Shared platform secrets vs. per-project
Gitea supports both. The convention:
| Scope | Lives in | Examples |
|---|---|---|
| Platform-wide | Organization-level secrets (platform org in Gitea) | MLFLOW_TRACKING_URI, MINIO_ENDPOINT, WAREHOUSE_URI, PLATFORM_ACCESS_KEY, PLATFORM_SECRET |
| Per-project | Repo-level secrets | Anything specific to one model or one customer corpus |
| Variables (non-secret) | Org or repo vars | MODEL_NAME, SERVICE_NAME, DEPLOY_HOST |
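Rotate platform credentials every 90 days. Secrets are injected when a job starts, so a job that is already running keeps the values it started with and the next job picks up the rotated ones. Keep the old credential valid until in-flight jobs have finished.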
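The 90-day rule is easy to check mechanically. The sketch below assumes you keep your own record of when each secret was last rotated, since nothing in this module says where those dates come from. The function name `due_for_rotation` and the dates are hypothetical.

```python
from datetime import date, timedelta

ROTATION_PERIOD = timedelta(days=90)


def due_for_rotation(last_rotated, today):
    """Return the names of secrets whose last rotation is 90 or more days old."""
    return sorted(
        name
        for name, rotated_on in last_rotated.items()
        if today - rotated_on >= ROTATION_PERIOD
    )


if __name__ == "__main__":
    # Hypothetical rotation log, keyed by secret name.
    log = {
        "PLATFORM_ACCESS_KEY": date(2026, 1, 1),
        "PLATFORM_SECRET": date(2026, 3, 1),
    }
    print(due_for_rotation(log, today=date(2026, 4, 1)))
```

A scheduled job could run a check like this and open an issue in the platform org when the list is non-empty.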
Step 5 — Branch protection that pairs with CI
In each project’s Gitea settings under Branches → main → Protection:
- Require pull request before merging.
- Require status check: lint-and-test must pass.
- Disallow force-push.
This is what makes “merge to main” mean “passed CI.” Without it, broken main is a recurring problem.
Step 6 — Reusable workflows
Once you’ve written a clean CI workflow you’ll want to reuse it. Gitea supports reusable workflows the same way GitHub does:
.gitea/workflows/reusable-ds-ci.yml in platform/ci-templates:
on:
  workflow_call:
    inputs:
      project-dir:
        type: string
        default: "."
jobs:
  lint-and-test:
    runs-on: docker
    container: { image: ghcr.io/astral-sh/uv:python3.12-bookworm-slim }
    steps:
      - uses: actions/checkout@v4
      - run: uv sync --frozen --all-extras
        working-directory: ${{ inputs.project-dir }}
      # Lint and tests must run in the same directory as the sync step.
      - run: uv run ruff check .
        working-directory: ${{ inputs.project-dir }}
      - run: uv run pytest -x
        working-directory: ${{ inputs.project-dir }}
A project’s own workflow shrinks to:
jobs:
  ci:
    uses: platform/ci-templates/.gitea/workflows/reusable-ds-ci.yml@main
This is the practical end-state — every project is one line of CI configuration. Improvements to the template benefit every project on the next push.
ADR 0015 — Promote-by-tag
/srv/shared/adr/0015-promote-by-tag.md:
# ADR 0015 — Model promotion by git tag
## Status
Accepted, 2026-05-15.
## Context
Manual promotion of models in the MLflow UI is error-prone — there is no
record of *what code* produced the promoted model, only "version 7". This
breaks reproducibility audits.
## Decision
Production promotion happens only on a git tag matching `v*.*.*`. The
release workflow:
1. Runs against the tag's commit.
2. Submits training to Slurm via `sbatch --wait`.
3. Evaluates the resulting model against a hold-out set.
4. Promotes the MLflow Registry model only if eval passes.
5. Triggers a deploy.
A failed eval fails the workflow; the tag remains, the model does not
get promoted.
## Consequences
- Pro: `git tag v1.4.2` is the audit trail. The model in Production is
always traceable to a tagged commit.
- Pro: rollback is "promote the previous version's MLflow alias."
- Con: Slurm cluster health is now in the critical path for releases.
Mitigation: a release that fails because Slurm is down can be re-run
from the Actions UI, or by deleting and re-pushing the tag, once Slurm
recovers. Pushing an unchanged tag a second time does not trigger a run.
## Alternatives considered
- UI-driven promotion. Rejected: no audit trail.
- Auto-promotion on every successful train. Rejected: removes the
human-in-the-loop decision.
Recap
You now have:
- A registered Gitea Actions runner with gpu access.
- A CI workflow that runs lint, test, and dbt build on every PR and main push, and builds and pushes an image on every main push.
- A tag-driven CD workflow that handles train → eval → promote → deploy.
- A clean secret-and-variable convention.
The last module is the look-back and the look-forward — what we deliberately skipped and where to find it.
Next: 16 — What’s next.