Self-hosting Private GitHub Runners on a Kubernetes Cluster

The bill that started it

GitHub-hosted runners are easy until they aren't. The Linux minute is cheap on its own; what gets expensive is the multiplier — every PR, every push, every nightly job, multiplied across every repository, multiplied across every developer. Large jobs that pull big container images, run a matrix build, and push artifacts can eat ten or twenty minutes per run. The monthly invoice was the wake-up call.

The self-hosted runner option exists exactly for this situation. You bring the compute, GitHub orchestrates the work. Done well, the developer experience is unchanged — the same runs-on: label, the same logs in the Actions tab — but the workload runs on hardware you already pay for.

Why Kubernetes was the right target

We already had a Kubernetes platform. Running runners as pods meant the existing autoscaling, observability, secret management, and on-call runbooks applied unchanged. We weren't building a new system; we were giving an existing system one more workload.

The tool we chose is ARC, the Actions Runner Controller. It's the GitHub-supported way to run runners on Kubernetes. The model is straightforward: you tell ARC how many runners you want (or how many it should scale up to under load), and it spins up ephemeral pods that register themselves with GitHub, pick up jobs, run them, and exit. Each job gets a clean pod, so there is no shared state to leak between unrelated workflows.

Two ARC modes worth knowing:

Runner scale sets: runners are scaled based on queued workflow jobs. You define a runner scale set (basically a runner pool with a name), workflows opt in with runs-on: <scale-set-name>, and ARC scales pods up and down to match queue depth. This is the modern recommended pattern.
Legacy repository or organization runners: the older pattern, built on RunnerDeployment resources and still available. Less elastic, more manual capacity planning.

We went with the scale-set model. Under bursty traffic it adds pods as jobs queue up and drains back down when the queue empties, instead of holding a fixed pool.
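To make "scale to match queue depth" concrete, here is a minimal sketch of the arithmetic. It is a simplified model, not ARC's actual controller logic: the ScaleSet class, the desired_runners function, and the k8s-linux name are all hypothetical, and the only assumption taken from the text is one ephemeral pod per job within configured bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleSet:
    """Hypothetical stand-in for a runner scale set and its bounds."""
    name: str
    min_runners: int
    max_runners: int


def desired_runners(scale_set: ScaleSet, queued_jobs: int, busy_runners: int) -> int:
    """Return how many runner pods should exist right now.

    Each queued or in-progress job needs its own ephemeral pod, so the
    target is their sum, clamped to the configured minimum and maximum.
    """
    if queued_jobs < 0 or busy_runners < 0:
        raise ValueError("job counts cannot be negative")
    wanted = queued_jobs + busy_runners
    return max(scale_set.min_runners, min(wanted, scale_set.max_runners))


if __name__ == "__main__":
    pool = ScaleSet(name="k8s-linux", min_runners=0, max_runners=3)
    for queued, busy in [(0, 0), (2, 0), (5, 1), (0, 1)]:
        print(f"queued={queued} busy={busy} -> pods={desired_runners(pool, queued, busy)}")
```

The useful property is at the edges: with a minimum of zero, an idle queue costs nothing, and the maximum caps how much of the cluster a burst of pushes can claim.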

What we actually had to think about

The "spin up pods that run untrusted code" framing should set off security alarms, and it did. A few decisions worth recording:

Ephemeral pods, no persistent volumes. Every job gets a fresh container; nothing survives. This isn't optional — it's the security boundary.
A dedicated namespace with its own resource quotas, network policies blocking lateral movement, and no access to cluster-internal services it didn't need.
No privileged containers in the default runner image. Workflows that genuinely need Docker-in-Docker use a separately configured runner image with documented warnings, not the default.
Secret management — GitHub Actions secrets are still the source of truth and are injected by GitHub into the job's environment. We didn't try to remap that. What we did add was scoped Kubernetes secrets for runner registration tokens, managed by external-secrets-operator.
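The network-policy piece is the easiest of these to show. What follows is a sketch of the shape, not our production policy: the arc-runners namespace, the policy name, the private CIDR ranges, and the assumption that cluster DNS lives in kube-system are all placeholders. It builds the manifest as a Python dict and prints JSON, which kubectl apply -f accepts.

```python
import json

# Assumptions: namespace name and CIDR ranges are placeholders; substitute
# the pod, service, and node ranges your cluster actually uses.
RUNNER_NAMESPACE = "arc-runners"
CLUSTER_INTERNAL_CIDRS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]


def runner_network_policy(namespace: str, blocked_cidrs: list) -> dict:
    """Build a NetworkPolicy that isolates every pod in the namespace.

    Ingress: denied entirely (no ingress rules are listed).
    Egress: DNS to kube-system, plus anything outside the blocked ranges,
    so runners can reach GitHub but not neighbouring workloads.
    """
    dns_rule = {
        "to": [{
            "namespaceSelector": {
                "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
            }
        }],
        "ports": [
            {"protocol": "UDP", "port": 53},
            {"protocol": "TCP", "port": 53},
        ],
    }
    internet_rule = {
        "to": [{"ipBlock": {"cidr": "0.0.0.0/0", "except": list(blocked_cidrs)}}],
    }
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "runner-isolation", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector: applies to all pods here
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [],
            "egress": [dns_rule, internet_rule],
        },
    }


if __name__ == "__main__":
    policy = runner_network_policy(RUNNER_NAMESPACE, CLUSTER_INTERNAL_CIDRS)
    print(json.dumps(policy, indent=2))
```

Anything in-cluster that runners legitimately need, such as a registry or a cache backend, would get its own explicit egress rule on top of this baseline.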

The biggest non-security gotcha was caching. GitHub-hosted runners come with a generous remote cache (the service behind actions/cache) and a fast disk by default. Self-hosted pods get neither for free, and a naive setup is slower than GitHub-hosted because every job starts from cold. We ended up running a dedicated S3-compatible cache backend (MinIO in-cluster), which workflows read from and write to through a custom cache action. With the cache warm, builds were faster than GitHub-hosted; without it, slower.
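The warm-versus-cold difference comes down to whether a job finds a usable cache entry. Here is a rough sketch of the lookup, assuming keys derived from a lockfile hash and a prefix fallback in the spirit of restore keys. The ObjectStoreCache class is an in-memory stand-in for a bucket, and none of these names come from our custom action.

```python
import hashlib
from typing import Dict, Iterable, Optional, Tuple


def cache_key(prefix: str, lockfile_bytes: bytes) -> str:
    """Derive a key that changes whenever the dependency lockfile changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()[:16]
    return f"{prefix}-{digest}"


class ObjectStoreCache:
    """In-memory stand-in for an S3-compatible bucket (hypothetical)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def save(self, key: str, payload: bytes) -> None:
        # Re-inserting moves the key to the end, so dict order tracks recency.
        self._objects.pop(key, None)
        self._objects[key] = payload

    def restore(
        self, key: str, restore_prefixes: Iterable[str] = ()
    ) -> Optional[Tuple[str, bytes]]:
        """Return (matched_key, payload), or None on a cold miss.

        An exact key match wins; otherwise fall back to the most recently
        saved entry whose key starts with one of the given prefixes.
        """
        if key in self._objects:
            return key, self._objects[key]
        for prefix in restore_prefixes:
            for stored_key in reversed(list(self._objects)):
                if stored_key.startswith(prefix):
                    return stored_key, self._objects[stored_key]
        return None


if __name__ == "__main__":
    cache = ObjectStoreCache()
    key = cache_key("linux-deps", b"requests==2.32.0\n")
    print("first run:", cache.restore(key))  # cold: nothing stored yet
    cache.save(key, b"<tarball bytes>")
    print("second run:", cache.restore(key) is not None)  # warm
```

A partial hit from the prefix fallback still beats starting from cold: the job restores a slightly stale dependency tree and only fetches what changed.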

What translated

The biggest wins were the boring ones:

Build times dropped, especially for image-pull-heavy jobs running on the same network as the registry.
Costs dropped significantly — to the point that the cluster overhead of running ARC was a rounding error relative to the savings.
Developers didn't notice. The workflow file changed by one line (runs-on); the rest of the developer experience stayed identical.
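That one-line change is mechanical enough to script when there are many repositories to migrate. A rough sketch, assuming workflows use a plain single-label runs-on value; the retarget_workflow function and the k8s-linux scale-set name are hypothetical, and array, matrix-expression, or quoted forms are deliberately left untouched for manual review.

```python
import re

# Matches lines like "    runs-on: ubuntu-latest" with nothing after the label.
_RUNS_ON = re.compile(
    r"^(?P<indent>[ \t]*)runs-on:[ \t]*(?P<label>[\w.-]+)[ \t]*$",
    re.MULTILINE,
)


def retarget_workflow(text: str, old_label: str, new_label: str) -> str:
    """Swap one runs-on label for another, leaving other jobs alone."""

    def swap(match):
        if match.group("label") != old_label:
            return match.group(0)  # a different runner: keep the line as-is
        return f"{match.group('indent')}runs-on: {new_label}"

    return _RUNS_ON.sub(swap, text)


if __name__ == "__main__":
    workflow = (
        "jobs:\n"
        "  build:\n"
        "    runs-on: ubuntu-latest\n"
        "    steps: []\n"
        "  mac:\n"
        "    runs-on: macos-latest\n"
    )
    print(retarget_workflow(workflow, "ubuntu-latest", "k8s-linux"))
```

Jobs that need something the cluster cannot offer, macOS in this example, keep their GitHub-hosted label.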

What I'm bringing home

The homelab version is the same architecture at smaller scale. I have a Talos cluster already, ARC runs cleanly on it, and the homelab GitHub org has exactly enough Actions activity (this site, a few homelab repos) to make the exercise worth doing — not for cost reasons but for the operational pattern. Running ARC at home means I'll see what happens when the controller restarts, when the runner image gets stale, when a job hangs. Those are the same failures as in production, just lower-stakes.

The plan:

ARC controller deployed via Helm, GitOps-managed through the same ArgoCD setup described here.
One runner scale set, sized for 0–3 pods.
MinIO already runs in the cluster for other things; I'll point actions/cache at it.
Network policies cribbed from the work pattern, adjusted for homelab realities.
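For the scale set itself, the plan translates into a handful of Helm values. Below is a sketch that generates them, assuming the value names used by the gha-runner-scale-set chart (githubConfigUrl, githubConfigSecret, minRunners, maxRunners); check them against the chart version you install. The organization URL and secret name are placeholders.

```python
import json


def scale_set_values(
    config_url: str, secret_name: str, min_runners: int, max_runners: int
) -> dict:
    """Build Helm values for one runner scale set.

    secret_name refers to a pre-created Kubernetes secret holding the
    GitHub credentials; nothing sensitive is embedded in the values.
    """
    if not 0 <= min_runners <= max_runners:
        raise ValueError("need 0 <= min_runners <= max_runners")
    return {
        "githubConfigUrl": config_url,
        "githubConfigSecret": secret_name,
        "minRunners": min_runners,
        "maxRunners": max_runners,
    }


if __name__ == "__main__":
    # Placeholders: swap in the real org URL and secret name.
    values = scale_set_values(
        config_url="https://github.com/example-homelab-org",
        secret_name="arc-github-credentials",
        min_runners=0,
        max_runners=3,
    )
    # JSON is valid YAML, so this output works as a Helm values file.
    print(json.dumps(values, indent=2))
```

Keeping the values as generated data makes the 0–3 sizing a reviewable diff in the GitOps repository rather than a setting that drifts.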

I'll write up the homelab version once it's been chugging for a month. The interesting part won't be the happy path — it'll be whatever breaks first.