Iterating a .NET CI/CD Pipeline From 12 Minutes to Under 5
I run a multi-service .NET backend: four API projects (Admin, Auth, LandLord, Tenant) sharing three class libraries (ApiBase, Core, Models). Each service ships as its own Docker image. GitHub Actions handles build-test-deploy on every push to dev.
The pipeline used to finish in about 90 seconds per service, with builds running in parallel. Then I added automated API snapshot tracking, and build times hit 12+ minutes. What followed were eleven iterations of restructuring. Some made things worse before they got better.
The New Requirement: API Snapshots in Git
I built a .NET tool called SwaggerDiff that captures OpenAPI spec snapshots for each API project. On every deployment, it generates a versioned JSON file (doc_YYYYMMDD.json) and stores it alongside the project source:
```
MyApp.Admin.Api/
  Docs/Versions/
    doc_20250601.json
    doc_20250612.json
MyApp.Auth.Api/
  Docs/Versions/
    doc_20250601.json
```
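The date stamp in those filenames can be derived in a single shell line. This is only a sketch to make the naming scheme concrete; SwaggerDiff generates the names internally:

```shell
# Sketch: reproduce the doc_YYYYMMDD.json naming convention.
# The tool does this itself; shown here only for illustration.
snapshot_name="doc_$(date -u +%Y%m%d).json"
echo "$snapshot_name"
```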
Why git instead of an external service? Zero external dependencies for version tracking. Each project carries its own history. I can always add external storage later, but the baseline should be self-contained.
"Run a tool, commit the output." Simple requirement. (The tool itself has its own story about how it went from an internal feature to a standalone package.) It set off a chain of rearchitecting that touched nearly every layer of my CI/CD setup.
Starting Point: Separate Dockerfiles, Parallel Builds
Before snapshots:
```text
detect-changes ─┬→ docker-admin ────┐
                ├→ docker-auth ─────┤→ deploy
                ├→ docker-landlord ─┤
                └→ docker-tenant ───┘
```
Four Dockerfiles. Parallel builds. Registry cache hits kept each build around 1m 30s. Total pipeline: 3–4 minutes.
Iteration 1: Adding Snapshot Generation
The obvious approach: a new job that runs swaggerdiff snapshot, commits the files, pushes.
```yaml
api-snapshot:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-dotnet@v4
    - run: dotnet restore && dotnet tool restore
    - run: dotnet swaggerdiff snapshot -c Release
    - run: |
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add '**/Docs/Versions/*.json'
        git commit -m "chore: update API snapshots"
        git push
```
Two things broke immediately.
Branch protection blocked the push. The dev branch requires PR reviews, so github-actions[bot] couldn't push directly. I created a Personal Access Token (SNAPSHOT_PAT) with repo write permissions:
```yaml
- uses: actions/checkout@v4
  with:
    token: ${{ secrets.SNAPSHOT_PAT }}
```
Then: infinite CI loops. The snapshot commit triggered the pipeline, which generated the same snapshots, committed, triggered again. Fix was two-pronged: [skip ci] in the commit message, plus path exclusions on the workflow trigger:
```yaml
paths-ignore:
  - '**/Docs/Versions/**'
```

(GitHub Actions rejects a `paths` filter that contains only a negative `!` pattern, so the exclusion goes in `paths-ignore`.)
Any CI job that pushes commits back to the triggering branch needs both a skip mechanism and path-based exclusions. Belt and suspenders.
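Sketched as a workflow trigger, the two safeguards might look like this (branch name and paths follow the setup above; treat it as a template, not the exact file):

```yaml
on:
  push:
    branches: [dev]
    # snapshot-only commits never retrigger the workflow
    paths-ignore:
      - '**/Docs/Versions/**'

# plus the skip marker in the commit itself:
#   git commit -m "chore: update API snapshots [skip ci]"
```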
Iteration 2: One Dockerfile
Four separate Dockerfiles meant four independent NuGet restores, four compilations of shared libraries, four copies of the same source. I consolidated into a single multi-stage Dockerfile:
```dockerfile
FROM mcr.microsoft.com/dotnet/sdk:9.0 AS build
COPY *.csproj files...
RUN dotnet restore my-app.sln
COPY . .
RUN dotnet build my-app.sln -c Release

FROM build AS publish-admin
RUN dotnet publish MyApp.Admin.Api/... --no-restore --no-build

FROM build AS publish-auth
RUN dotnet publish MyApp.Auth.Api/... --no-restore --no-build

# ... same for landlord, tenant

FROM base AS final-admin
COPY --from=publish-admin /app/publish .
```
dotnet build compiles the entire solution once. Each publish-* stage branches from build and uses --no-build, so it's just packaging: copying compiled DLLs to the publish output. No redundant compilation.
Nothing broke structurally. But it introduced a sequencing problem I didn't see coming.
Iteration 3: The Sequential Build Regression
One Dockerfile meant I couldn't run separate Docker build jobs targeting different files anymore. And building all four services in one job meant they ran sequentially.
The pipeline that used to take max(build times) ≈ 1.5 minutes now took sum(build times) ≈ 8–9 minutes. Total pipeline time: 12 minutes 42 seconds.
I'd made things worse. Consolidating Dockerfiles for maintainability is fine, but you have to preserve parallelism at the CI layer. One Dockerfile doesn't have to mean one job.
Iteration 4: Anchor + Parallel Remaining
I needed parallel Docker builds from a single Dockerfile. The problem: Docker registry cache. If all four services build simultaneously on separate runners, the first one to finish writes the shared cache layers. But the others are already running; they can't read cache that hasn't been pushed yet.
So I split the builds into phases:
- Detect changes dynamically builds a JSON matrix of which services need building
- Anchor build takes the first changed service, builds it, pushes shared layers to the registry cache
- Remaining builds run in parallel after the anchor completes, reading from the warm cache
```yaml
detect-changes:
  outputs:
    anchor: '{"service":"admin","target":"final-admin","image":"my-app-admin"}'
    remaining: '[{"service":"auth",...},{"service":"tenant",...}]'

docker-anchor:
  needs: [detect-changes]
  # Builds first service, warms cache

docker-remaining:
  needs: [docker-anchor, detect-changes]
  strategy:
    matrix:
      include: ${{ fromJSON(needs.detect-changes.outputs.remaining) }}
  # Builds remaining services in parallel, reading warm cache
```
detect-changes uses dorny/paths-filter to figure out which projects changed. If shared libraries change, all services rebuild. The matrix is built dynamically in a bash step; no hardcoded service lists.
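That matrix step can be sketched in a few lines of bash. Everything here is illustrative; in the real workflow the changed-service list comes from the dorny/paths-filter outputs, and the echoed values would be written to `$GITHUB_OUTPUT`:

```shell
# Build the anchor + remaining JSON from a dynamic list of changed services.
# Hardcoded here for illustration; in CI this list comes from paths-filter.
changed="admin auth tenant"

set -- $changed        # first changed service becomes the anchor
anchor_svc=$1
shift

entry() {
  printf '{"service":"%s","target":"final-%s","image":"my-app-%s"}' "$1" "$1" "$1"
}

anchor=$(entry "$anchor_svc")
remaining=""
for svc in "$@"; do
  remaining="${remaining:+$remaining,}$(entry "$svc")"
done
remaining="[$remaining]"

echo "anchor=$anchor"
echo "remaining=$remaining"
```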
Pipeline time dropped to 6 minutes 45 seconds. The anchor takes about 4 minutes (cold or near-cold), remaining services finish in 1–2 minutes each by hitting the warm cache. When you can't parallelise everything, parallelise what you can after warming the shared state: time(anchor) + max(remaining) instead of sum(all).
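The payoff is easiest to see as arithmetic. Using rough per-build figures consistent with the runs above (all illustrative, in seconds):

```shell
# Rough per-build figures from the runs above (seconds; illustrative).
per_build=130   # one service compiled inside the single sequential job
anchor=240      # the cold/near-cold anchor build
warm=120        # a remaining build hitting the warm cache

# Iteration 3 layout: one job, four services in a row -> sum(all)
sequential=$((4 * per_build))

# Iteration 4 layout: anchor first, rest in parallel -> anchor + max(remaining)
phased=$((anchor + warm))

echo "sequential: ${sequential}s"
echo "phased:     ${phased}s"
```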
Iteration 5: Snapshots Inside Docker
The pipeline graph at this point:
```text
                 ┌→ docker-anchor → docker-remaining ─┐
detect-changes ──┤                                    ├→ deploy
                 └→ api-snapshot ─────────────────────┘
```
api-snapshot sat on the critical path. deploy waited for both Docker builds AND snapshot generation, even though the two are independent. Could I generate snapshots inside the Docker build?
I added a generate-snapshots stage to the Dockerfile:
```dockerfile
FROM build AS generate-snapshots
RUN dotnet tool restore --verbosity quiet
ENV SWAGGERDIFF_DRYRUN=true
ENV DOTNET_ROLL_FORWARD=LatestMajor
RUN dotnet swaggerdiff snapshot --no-build -c Release
```
Each final image then copies its own snapshots:
```dockerfile
FROM base AS final-admin
COPY --from=publish-admin /app/publish .
COPY --from=generate-snapshots /src/MyApp.Admin.Api/Docs/Versions/ ./Docs/Versions/
```
The api-snapshot CI job still runs (to commit snapshots to git), but it's fully decoupled from Docker builds now. Both run in parallel after detect-changes.
Two gotchas. SwaggerDiff targets net8.0, but the Docker SDK image runs 9.0. The tool spawns subprocesses via dotnet exec against net9.0 dependencies, and .NET refused to load them. DOTNET_ROLL_FORWARD=LatestMajor fixed that.
Second: SwaggerDiff boots each ASP.NET app to extract its OpenAPI spec. Some apps register external services (message queues, payment providers) during startup that fail inside a Docker build context. The SWAGGERDIFF_DRYRUN env var lets apps skip those registrations in snapshot mode.
Iteration 6: The .dockerignore Cache Bug
After implementing the anchor pattern, even workflow-only changes (editing the YAML file) triggered full Docker rebuilds. Four minutes. No source code had changed. The anchor should have hit the cache.
.dockerignore didn't exclude .github/. The COPY . . step copies everything that isn't ignored, including .github/workflows/dev-release.yml. Changing the workflow file changed the copy hash, which invalidated every layer after it.
One line fixed it:
```
# .dockerignore
**/.github
```
Your .dockerignore is a performance-critical file. Every directory not excluded from COPY . . is a potential cache-buster.
Iteration 7: Per-Project Cache Granularity (Failed)
With the consolidated Dockerfile, COPY . . copies ALL source code into the build stage. Change one file in Admin, and every service's publish layer is invalidated because they all branch from build.
I tried to fix this by having each publish stage copy only its own project files:
```dockerfile
FROM restore AS publish-admin
COPY MyApp.ApiBase/ MyApp.ApiBase/
COPY MyApp.Core/ MyApp.Core/
COPY MyApp.Models/ MyApp.Models/
COPY MyApp.Admin.Api/ MyApp.Admin.Api/
RUN dotnet publish ... --no-restore
```
I also added per-service cache refs (cache-admin, cache-auth, etc.) to preserve each service's layers independently.
Build times went from 6m 45s to 8m+ on two consecutive runs.
Without a shared build stage, each publish-* ran its own compilation. Four independent compilations are slower than one shared dotnet build plus four --no-build publish steps. The first run also had zero cache hits because the entire layer structure changed, and the second run was still slow because the per-service cache refs hadn't populated enough shared layers. Each publish stage re-compiled ApiBase, Core, and Models independently, when the old approach compiled them once and every publish stage reused the output.
I reverted. Cache granularity has diminishing returns when it costs you duplicated work. A shared build step that compiles once and lets downstream stages skip compilation is faster than per-project builds, even when one change invalidates all of them.
Iteration 8: Concurrent Snapshot Generation
While reworking the pipeline, I also sped up the tool itself. SwaggerDiff processes four projects. Originally it did them sequentially. Each project involves building/resolving the assembly path (sequential, because dotnet build holds file locks on shared deps) and then launching a subprocess to extract the OpenAPI spec (independent per project).
I split execution into two phases:
```csharp
// Phase 1: sequential build/resolve (dotnet build holds file locks on shared deps)
var resolved = projects.Select(p => BuildAndResolve(p)).ToList();

// Phase 2: concurrent subprocess execution, one buffered result slot per project
var results = new SnapshotResult[resolved.Count];   // result type name illustrative
var tasks = resolved.Select((p, i) => Task.Run(() =>
{
    results[i] = RunSnapshotSubprocessBuffered(p.Assembly, p.OutputDir, docName);
})).ToArray();
Task.WhenAll(tasks).GetAwaiter().GetResult();
```
Each subprocess buffers its own stdout/stderr. Results print sequentially after all processes complete.
Iteration 9: The Cache Tax
The anchor pattern gave me a steady ~6–7 minute pipeline. Then I looked at the actual build logs.
| Stage | Duration |
|---|---|
| Cache export (mode=max) | 101.9s |
| `dotnet build` | 42.4s |
| `generate-snapshots` | 36.2s |
| `dotnet restore` | 24.5s |
| Export image + push | 16.2s |
Cache export alone: 1 minute 42 seconds. Longer than the actual compilation. The remaining jobs that were supposed to benefit from the warm cache? Each spent 45–53 seconds downloading ~2GB of cached layers before doing any work. Actual time saved per remaining service: roughly 17 seconds.
The math was damning:
- Anchor overhead: +102s for cache export
- Each remaining build: +50s download, –17s from cache hit = net +33s per build
- Serial dependency: remaining builds can't start until the anchor finishes exporting
I'd built a caching scheme that made every step slower. Registry cache with mode=max pushes every intermediate layer, and that's expensive on both upload and download. Sometimes the fastest cache is no cache.
Iteration 10: Docker Buildx Bake
The cache existed to solve one problem: sharing the compiled build stage across parallel jobs running on separate runners. Each runner has its own filesystem. Without registry cache, each one would compile the solution independently.
What if the builds weren't on separate runners?
docker buildx bake builds multiple targets from a single Dockerfile in one BuildKit invocation. BuildKit constructs a dependency graph of all stages and deduplicates shared stages automatically. The build stage, which every publish-* and final-* depends on, gets computed once, in memory. No registry push. No pull. No serialisation overhead.
```hcl
variable "REGISTRY"   { default = "ghcr.io/my-org" }
variable "BRANCH_TAG" { default = "dev" }
variable "SHA_TAG"    { default = "" }

group "all" {
  targets = ["final-admin", "final-auth", "final-landlord", "final-tenant"]
}

target "_common" {
  dockerfile = "Dockerfile"
  context    = "."
}

target "final-admin" {
  inherits = ["_common"]
  target   = "final-admin"
  tags = compact([
    "${REGISTRY}/my-app-admin:${BRANCH_TAG}",
    notequal("", SHA_TAG) ? "${REGISTRY}/my-app-admin:${SHA_TAG}" : "",
  ])
}

// ... same pattern for auth, landlord, tenant
```
Replaced the anchor + remaining jobs with a single step:
```yaml
docker-build:
  needs: [detect-changes]
  steps:
    - uses: docker/bake-action@v6
      with:
        files: ./docker-bake.hcl
        targets: ${{ needs.detect-changes.outputs.bake_targets }}
        push: true
      env:
        BRANCH_TAG: dev
        SHA_TAG: "dev-${{ needs.detect-changes.outputs.short_sha }}"
```
detect-changes builds a comma-separated list of bake targets (final-admin,final-auth,...) based on which projects changed. Bake builds only the requested targets and their transitive dependencies.
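A sketch of that target-list step (the per-service flags are assumptions; real values come from the paths-filter outputs, and the result would be written to `$GITHUB_OUTPUT` as `bake_targets`):

```shell
# Assemble the comma-separated bake target list from change flags.
# Hardcoded here for illustration; in CI these come from dorny/paths-filter.
admin=true; auth=false; landlord=false; tenant=true
shared=false   # set when ApiBase/Core/Models change: rebuild everything

targets=""
add() { targets="${targets:+$targets,}$1"; }

if [ "$shared" = true ]; then
  targets="final-admin,final-auth,final-landlord,final-tenant"
else
  [ "$admin" = true ]    && add final-admin
  [ "$auth" = true ]     && add final-auth
  [ "$landlord" = true ] && add final-landlord
  [ "$tenant" = true ]   && add final-tenant
fi

echo "bake_targets=$targets"
```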
One gotcha: HCL variables are overridden by environment variables of the same name. The workflow had a top-level env: REGISTRY: ghcr.io (without the org prefix) for use elsewhere. That silently overrode the HCL default of ghcr.io/my-org, pushing images to ghcr.io/my-app-admin instead of ghcr.io/my-org/my-app-admin. Fix: explicitly compose the full registry path in the bake step's env block.
Pipeline time: ~4 minutes 30 seconds. Close to pre-snapshot times.
Iteration 11: Gating Snapshot Commits on Build Success
One remaining problem. The snapshot job committed to git regardless of whether Docker builds succeeded. If bake failed, the pipeline would still push new snapshot files to the repo, even though no images were deployed.
I split api-snapshot into two phases:
- Generate (`api-snapshot`): build the solution, run SwaggerDiff, upload snapshot files as a GitHub Actions artifact
- Commit (`commit-snapshots`): download the artifact, commit and push to git. Runs only if both `api-snapshot` and `docker-build` succeeded
```text
                 ┌→ api-snapshot (generate + upload artifact) ─┐
detect-changes ──┤                                             ├→ commit-snapshots → deploy
                 └→ docker-build (bake) ───────────────────────┘
```
The generate phase no longer needs contents: write permissions or the SNAPSHOT_PAT. It just produces files. The commit phase is the only job that pushes to git, gated on the full build pipeline passing. Treat git commits from CI as side effects, not intermediate steps; gate them on everything succeeding so repo state stays in sync with what's deployed.
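The split sketched as workflow jobs (action versions, artifact name, and step details are assumptions; the shape follows the description above):

```yaml
api-snapshot:
  needs: [detect-changes]
  steps:
    - uses: actions/checkout@v4            # no PAT: this job only reads
    - run: dotnet tool restore
    - run: dotnet swaggerdiff snapshot -c Release
    - uses: actions/upload-artifact@v4
      with:
        name: api-snapshots
        path: '**/Docs/Versions/*.json'

commit-snapshots:
  needs: [api-snapshot, docker-build]      # runs only if both succeeded
  steps:
    - uses: actions/checkout@v4
      with:
        token: ${{ secrets.SNAPSHOT_PAT }} # the PAT lives only in this job
    - uses: actions/download-artifact@v4
      with:
        name: api-snapshots
    - run: |
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add '**/Docs/Versions/*.json'
        git commit -m "chore: update API snapshots [skip ci]" && git push
```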
The Final Pipeline
```text
                 ┌→ api-snapshot (generate + upload artifact) ─┐
detect-changes ──┤                                             ├→ commit-snapshots → deploy
                 └→ docker-build (bake) ───────────────────────┘
```
Dockerfile (single, multi-stage):
- `build`: restores, copies source, compiles the entire solution once
- `generate-snapshots`: runs SwaggerDiff inside Docker (`FROM build`)
- `publish-*`: one per service, `FROM build`, `--no-build` (packaging only)
- `final-*`: runtime images, copy publish output + snapshots
Bake file (docker-bake.hcl):
- An HCL target per service, all inheriting a `_common` base
- Variables for registry, branch tag, and SHA tag
- BuildKit deduplicates `build` and `generate-snapshots` across all targets
Workflow:
- `detect-changes`: paths-filter + dynamic bake target list
- `api-snapshot`: generates snapshots, uploads as artifact
- `docker-build`: single bake invocation, builds only changed services
- `commit-snapshots`: gated on both upstream jobs, commits to git
- `deploy`: SSH deployment of changed services
Performance: ~4–5 minutes, down from 12+ at the worst. Close to the original ~3 minutes before snapshots.
What I Learned
Registry cache is not free. mode=max pushes every intermediate layer. Cache export cost me 102 seconds; cache pull cost 50 seconds per remaining build. The compile time saved was 17 seconds. Measure actual transfer times before assuming cache helps.
.dockerignore is a performance file, not a hygiene file. Every path not excluded from COPY . . is a potential cache invalidator. I lost 4 minutes per build because I forgot to exclude .github/.
Build-graph sharing beats cache-based sharing. The anchor pattern existed to warm a registry cache for subsequent jobs. Bake eliminated the need entirely by deduplicating shared stages within a single BuildKit invocation. No serialisation, no network transfer, no cold-start penalty. When your builds share expensive stages, put them in the same build graph.
Cache migration has a cold-start cost. Changing your Docker layer structure invalidates all existing cache. If cold-cache performance is worse than your old steady state, the restructure probably isn't worth it.
CI jobs that push commits need three safeguards. A PAT for branch protection bypass, [skip ci] in commit messages, and path-based trigger exclusions. Miss any one and you get failures or infinite loops.
Not every optimisation works, and that's fine. I spent real time on per-project cache granularity, measured it on two consecutive runs, and it was slower both times. The willingness to measure, acknowledge the regression, and revert is more useful than the cleverness of the approach.
I got within a minute of my original build times while adding a feature that touches every stage of the pipeline. The process of getting there (measure, change, measure, revert if worse) mattered more than any single technique. And the most impactful change wasn't adding something clever. It was removing a cache I assumed was helping.