Build deploy binaries on Blacksmith with a Nix sticky disk #3768
Closed
synoet wants to merge 36 commits into
Introduce a workflow_dispatch-only proof of concept that builds the deploy service binaries on Blacksmith runners, caching the Nix store on a sticky disk instead of relying solely on Cachix.
- Expose deployCargoArtifacts as a flake package output so the shared release dependency closure can be built directly.
- Add a setup-nix composite action that installs Nix (or re-initialises the daemon/config when /nix is restored warm from a sticky disk).
- Add the PoC workflow: a single warm-deps job builds and commits the shared deps to the /nix sticky disk, then a per-service build matrix fans out, cloning the warm disk so only each service's own crate compiles.
Cachix stays wired as a fallback substituter so a cold/evicted disk pulls prebuilt artifacts instead of compiling from source. Additive only; the production deploy path is untouched.
Add a warm-deps job that builds the shared release dependency closure (deployCargoArtifacts) and commits it to the /nix sticky disk, then route the build-service-binaries matrix through Blacksmith so each parallel job clones the warm store and only compiles its own service crate. The build/upload step (including the nix-store closure copy the deploy consumes) is unchanged. Cachix stays wired as a fallback substituter, and warm-deps keeps pushing to it during migration. build-lambda-artifacts is left on its existing runner for now. Surfaces warm-deps failures in the deployment summary. Requires the Blacksmith app + runner pool to be provisioned; the runner label may need adjusting to the org's available labels.
Lambdas compile via cargo-lambda inside the dev shell (not crane), so they cannot reuse deployCargoArtifacts. Warm what they actually consume instead:
- warm-deps now also realises the dev shell into the same /nix sticky disk (single committer, so no last-write-wins race), giving the lambda matrix an instant nix develop.
- build-lambda-artifacts runs on Blacksmith, clones the warm /nix disk, and mounts a per-service cargo target sticky disk so compiled lambda deps stay warm across runs. Gated behind warm-deps.
Timing: setup now emits filtered binaries/lambdas matrices so neither build job spins up Nix + a sticky disk for services that produce no such artifact (14 binary services, 12 lambda services vs all 20+). Both matrices fan out in parallel after the single warm-deps gate.
workflow_dispatch only works once a workflow is on the default branch, so add a push trigger scoped to the feature branch so the PoC runs directly from the branch. To be removed before merge.
build-cloud-storage-lambdas.sh ran the recipe via 'nix develop -c bash -lc'. The login shell re-sources /etc/profile and resets PATH, dropping the dev-shell tools (just, cargo-lambda) on runners without a system-wide install such as fresh Blacksmith images, causing 'just: command not found'. Use a non-login 'bash -c' so the nix develop environment carries through.
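The PATH behaviour described above can be checked without Nix. The following is a minimal sketch, not the repository's script: `/opt/devshell-tools/bin` is a hypothetical stand-in for the directory `nix develop` prepends to PATH, and `run_non_login` is an illustrative name.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the tool directory that `nix develop` puts on PATH.
DEVSHELL_BIN="/opt/devshell-tools/bin"

# Run a command string in a non-login shell, so PATH is inherited unchanged.
run_non_login() {
  PATH="$DEVSHELL_BIN:$PATH" bash -c "$1"
}

# Reports whether the dev-shell directory is still first on PATH.
path_check='case "$PATH" in /opt/devshell-tools/bin:*) echo kept ;; *) echo dropped ;; esac'

run_non_login "$path_check"
```

Swapping `bash -c` for `bash -lc` in the same function makes the result depend on what the runner's `/etc/profile` does to PATH, which is the failure mode the commit works around.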
The warm-deps run built the deps + dev shell fine, but the Blacksmith sticky-disk commit failed with 'umount: /nix: target is busy': the Determinate Nix daemon runs out of /nix and holds the mount open, so the warmed store was never persisted (every run stayed cold). Add a teardown-nix composite action that stops nix-daemon and determinate-nixd (plus a fuser backstop) and run it as the final always() step of every Blacksmith job that mounts the /nix sticky disk, so the disk can unmount and commit the warmed store.
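A teardown along these lines might look like the sketch below. It is not the actual teardown-nix action: the systemd unit names are assumptions about a Determinate Nix install, and the function defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Assumed unit names; adjust to whatever the installer actually registers.
NIX_UNITS="nix-daemon.socket nix-daemon.service determinate-nixd.socket determinate-nixd.service"

# Print (DRY_RUN=1, the default here) or run the commands that release the mount.
teardown_nix() {
  local mount="${1:-/nix}" run="echo" unit
  [ "${DRY_RUN:-1}" = "0" ] && run=""
  for unit in $NIX_UNITS; do
    # Stop sockets as well as services so socket activation cannot restart a daemon.
    $run sudo systemctl stop "$unit" || true
  done
  # Backstop: kill any process that still holds files open on the mount.
  $run sudo fuser -km "$mount" || true
}

teardown_nix /nix
```

Every command is followed by `|| true` so that a unit that is already stopped, or a mount with no remaining users, does not fail the final step of the job.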
setup-cachix installed the cachix CLI via 'nix profile add nixpkgs#cachix', which Determinate resolves to an unpinned nixpkgs-weekly and re-fetches + re-evaluates (~20s) on every job, even when /nix is warm on the sticky disk. Install from the repo flake's pinned nixpkgs (already on the sticky disk) via --inputs-from, with a fallback to the registry path if that fails. Keeps all substituter/auth/push semantics identical; just removes the redundant nixpkgs-weekly fetch.
The PoC was triggered on push to the feature branch for validation. Drop it so the workflow is workflow_dispatch-only again and won't auto-run.
…deploys
1) deploy-all-services: drop max-parallel:8 on build-service-binaries so all binary services build concurrently instead of batching 8 (the lower half was waiting on the first 8). Each job only clones the warm /nix disk and compiles its own crate, so full fan-out is fine.
2) serviceLoadBalancer: the shared target group only set health-check path + protocol, inheriting AWS ALB defaults (interval 30s, healthyThreshold 5), so a new ECS task needed ~5x30s = 120-150s to register healthy, the dominant cost in the ~136s rollout. Tune to interval 10s / healthyThreshold 2 / timeout 5 / matcher 200 (~20s registration), parametrized with these as defaults so a service with an expensive /health endpoint can override them.
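The registration estimate in item 2 is plain arithmetic: the number of consecutive passing checks required times the seconds between checks. A small sketch, with `registration_seconds` as an illustrative helper name, treating the product as an upper-bound estimate:

```shell
#!/usr/bin/env bash
# Upper-bound seconds for a new target to be marked healthy:
# consecutive passing checks required x seconds between checks.
registration_seconds() {
  local interval="$1" healthy_threshold="$2"
  echo $(( interval * healthy_threshold ))
}

echo "ALB defaults (30s x 5): $(registration_seconds 30 5)s"
echo "tuned        (10s x 2): $(registration_seconds 10 2)s"
```

The same function can be used to sanity-check an override for a service with an expensive /health endpoint before changing its defaults.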
Blacksmith autoscales runners, so remove the concurrency cap on the lambda build matrix and let all lambda services build concurrently like the binary matrix.
deploy-services authenticates to AWS via explicit static keys (aws-actions/configure-aws-credentials with secrets.AWS_*), not the RunsOn EC2 instance role, so it has no ambient-credential dependency to lose. Swap the RunsOn runs-on array (runner/spot/hdd/run-id) for a Blacksmith label; Pulumi token, Datadog keys and AWS creds all come from GitHub secrets and are runner-independent. Keeps the max-parallel:20 deploy concurrency cap.
Dockerfiles: replace 'COPY <bin> /app/svc' + 'RUN chmod +x' with a single 'COPY --chmod=755 …'. The old form wrote the binary into two layers (one without +x, one with), doubling its on-disk footprint in the image; --chmod sets perms during the copy in one layer. Applied to the prebuilt deploy images and the builder-stage production images (Dockerfile, Dockerfile.convert_service, Dockerfile.search_processing_service + their .prebuilt variants).
Lambdas: set CARGO_PROFILE_RELEASE_OPT_LEVEL=2 (vs release default 3) for the cargo-lambda build only, trimming leaf-crate codegen time. Service binaries build via crane and are unaffected.
Let all services deploy concurrently like the build matrices.
Replace the GitHub Actions artifact upload/download between the build and
deploy matrices with a Blacksmith sticky-disk handoff. The ~96MB prebuilt
closure was moving at ~1.3MB/s over the GitHub artifact API (~70s per deploy
job); routing it over Blacksmith's co-located NVMe snapshot fabric drops that
to seconds.
- build-service-binaries / build-lambda-artifacts: mount a per-service,
per-SHA handoff disk and write the tar straight onto it; drop upload-artifact.
- deploy-services: clone the matching handoff disk (gated on the artifact
flags) and feed the on-disk tar to the deploy action.
- deploy-cloud-storage-pulumi: add optional prebuilt-binaries-tar /
lambda-artifacts-tar inputs that take precedence over (and skip) the artifact
download. The artifact-name inputs are untouched, so the other callers
(deploy-cloud-storage-on-push, reusable-deploy-service, deploy-pulumi-stack)
keep working unchanged.
Keys are <repo>-handoff-{binaries,lambdas}-<service>-<sha>: the Nix build is
deploy-env-independent so same-SHA dev/prod runs are byte-identical (safe
last-write-wins), different SHAs get distinct keys, and unused snapshots
auto-evict after 7 days. A guarded chown makes the fresh ext4 mount writable
on non-root runners.
https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
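The key scheme and the guarded chown can be sketched as two small shell helpers. This is illustrative only: `handoff_key` and `ensure_writable` are hypothetical names, and the repo, service and SHA values in the example call are placeholders rather than values taken from the workflow.

```shell
#!/usr/bin/env bash
# Build the sticky-disk key for a build-to-deploy handoff.
# kind must be "binaries" or "lambdas"; the other fields are caller-supplied.
handoff_key() {
  local repo="$1" kind="$2" service="$3" sha="$4"
  case "$kind" in
    binaries|lambdas) ;;
    *) echo "unknown handoff kind: $kind" >&2; return 1 ;;
  esac
  printf '%s-handoff-%s-%s-%s\n' "$repo" "$kind" "$service" "$sha"
}

# Make a freshly mounted disk writable for the current user, only when needed.
ensure_writable() {
  local dir="$1"
  if [ ! -w "$dir" ]; then
    sudo chown "$(id -u):$(id -g)" "$dir"
  fi
}

# Placeholder values, for illustration only.
handoff_key macro lambdas email-service 0123abc
```

Because the SHA is part of the key, a build job and its matching deploy job compute the same key independently, while a different commit never reuses the disk.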
Each deploy job re-downloaded the AWS/Docker provider plugins into $PULUMI_HOME/plugins (~45s). Pin PULUMI_HOME to /pulumi (outside the workspace the deploy action's checkout cleans) and back its plugins subdir with a single stable-keyed sticky disk shared by every deploy job: first run downloads + commits, later runs clone it warm and skip the pull. Plugins are version-pinned by infra/ and identical across services, so the shared key with last-write-wins is safe. Kept in this workflow (not the shared composite) so non-Blacksmith callers are unaffected. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
Multi-handler services (document-storage-service has 3; email-service, bulk-upload, organization-retention have 2) were building their handlers serially -- the script looped `just <lambda>/build`, each a separate `cargo lambda build --bin <name>`. Backgrounding them wouldn't help: cargo holds an exclusive target-dir lock, so concurrent invocations just serialize. Build all of a service's handlers in a single `cargo lambda build --bin a --bin b ...` so cargo compiles the shared workspace deps once and parallelizes the leaf handler crates across the runner's cores. Single-handler services are unchanged (one --bin flag). Falls back to the per-lambda `just` recipe if the combined build fails, so it can only improve build time, never break a deploy. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
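The combined build with its fallback could be sketched as below. This is not the repository's script: `bin_flags` and `build_handlers` are hypothetical names, the `--release` flag is an assumption about how the script invokes cargo-lambda, and the `CARGO_LAMBDA` / `JUST` overrides exist only so the flow can be dry-run with `echo`.

```shell
#!/usr/bin/env bash
# Turn a list of handler names into repeated --bin flags (one per handler).
bin_flags() {
  local name
  for name in "$@"; do
    printf -- '--bin %s ' "$name"
  done
}

# Build all handlers in one cargo invocation; on failure, fall back to the
# per-handler `just <name>/build` recipes, one at a time.
build_handlers() {
  local cargo_lambda="${CARGO_LAMBDA:-cargo lambda}" just_cmd="${JUST:-just}" name
  # Word splitting of the generated flags is intended here.
  if $cargo_lambda build --release $(bin_flags "$@"); then
    return 0
  fi
  for name in "$@"; do
    $just_cmd "$name/build" || return 1
  done
}

# Dry run: print the combined command instead of executing it.
CARGO_LAMBDA="echo cargo lambda" build_handlers handler_a handler_b
```

A single-handler service passes one name and gets exactly one `--bin` flag, which matches the unchanged behaviour described above.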
The check/test jobs run sccache (S3-backed) + rust-cache and monitor `sccache --show-stats`; the deploy lambda build wires the same sccache bucket and RUSTC_WRAPPER but never reports hit/miss, so we can't tell if it's caching the cargo-lambda/zigbuild compiles. Add a show-stats step (querying the same dev-shell sccache server, with AWS creds) so the next deploy reveals whether sccache is actually doing its job for lambdas. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
First step toward giving lambdas the same content-addressed nix cache the service binaries already get (instead of cargo-in-a-fresh-checkout, which recompiles the workspace every run because cargo keys path crates by mtime). Adds, modeled on deployServiceBinaryPackage:
- lambdaCommonArgs: cargo-zigbuild as the builder, zig as the linker (drops the host-only mold arg), opt-level 2, and a preBuild that points zig's cache at $TMPDIR so it works in the read-only-$HOME sandbox.
- lambdaDeployCargoArtifacts: a cached dep closure for the Lambda target, scoped with --package so the C-heavy service deps (pdfium, libreoffice) stay out. glibc is pinned purely via the target suffix (x86_64-unknown-linux-gnu.2.26, AL2; forward-compatible with al2023) -- host triple == lambda triple, so no extra rust-std.
- deployLambdaPackage: builds one handler and emits the custom-runtime bootstrap.zip (zip whose single entry is `bootstrap`), mirroring cargo-lambda.
Scoped to one lean handler (user_link_cleanup_handler) to validate the zig-in-sandbox + crane interaction before rolling out to all handlers and wiring CI to `nix build` it. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
workflow_dispatch-only job that runs `nix build .#deploy-lambda-<handler>` on a Blacksmith runner (x86_64 Linux, so the lambda triple == host triple — no cross-std needed, which is why this can't run on a macOS dev box). Mounts the warm /nix sticky disk, keeps Cachix as the fallback substituter, inspects the bootstrap.zip (contents + max glibc symbol), and uploads it. Additive; does not touch the deploy path. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
Temporary push trigger (scoped to this branch + the relevant paths) so the dispatch-only spike can actually run without first living on the default branch. Remove once we have a green run. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
… at link
The cold run failed because crane's buildDepsOnly runs plain `cargo check`, and
it was handed `--target x86_64-unknown-linux-gnu.2.26` — the `.2.26` glibc
suffix is a cargo-zigbuild-only concept, so rustc rejected it ("could not find
specification for target").
Correct split: the glibc pin is a link-time concern. The dep closure now builds
for the plain triple with ordinary cargo; only the final binary link uses
`cargo zigbuild --target x86_64-unknown-linux-gnu.2.26`. cargo sees the same
plain triple in both (zigbuild strips the suffix), so the closure's rlibs are
reused and only the leaf crate compiles + zig-links.
https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
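The split between the plain triple (used for the dependency closure) and the suffixed target (used only for the final link) can be illustrated with shell parameter expansion. A minimal sketch with hypothetical helper names; it assumes the triple itself contains no ".", which holds for x86_64-unknown-linux-gnu but not for every Rust target.

```shell
#!/usr/bin/env bash
# Plain Rust triple: drop everything from the first "." onward.
# Assumption: the triple itself contains no ".".
plain_triple() {
  printf '%s\n' "${1%%.*}"
}

# glibc pin: the text after the first ".", or empty when no pin is present.
glibc_pin() {
  case "$1" in
    *.*) printf '%s\n' "${1#*.}" ;;
    *)   printf '\n' ;;
  esac
}

target="x86_64-unknown-linux-gnu.2.26"
echo "deps build target: $(plain_triple "$target")"
echo "final link target: $target (glibc $(glibc_pin "$target"))"
```

Passing only the output of `plain_triple` to ordinary cargo avoids the "could not find specification for target" error quoted above, since rustc never sees the suffix.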
zig works in the sandbox and the dep closure cached, but the final zigbuild
link hit aws-lc-sys's cc-builder guard ("COMPILER BUG DETECTED ... zigcc not
supported"). aws-lc-rs is rustls's default crypto provider (pulled via aws-sdk
+ sqlx). Force aws-lc-sys onto its cmake builder (AWS_LC_SYS_CMAKE_BUILDER=1,
+ cmake/nasm), which lacks that guard and still compiles the C against the
zig-pinned glibc.
If this chains into more build deps, the cleaner alternative is dropping
aws-lc-rs for the ring backend.
https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
flake.nix: derive the lambda package set from services-config.json (crate ==
dir == deploy_lambdas entry for all 17 handlers), so the flake never drifts
from the deploy config. One shared dep closure now spans every lambda package;
deployLambdaPackages exposes deploy-lambda-<name> for each.
Spike workflow becomes an all-handlers validation matrix on a DEDICATED lambda
sticky disk (${repo}-nix-lambdas, separate from the binaries' /nix-store disk):
setup -> warm-lambdas (shared closure) -> per-handler build matrix with a glibc
check. This proves every handler compiles + links against the pinned Lambda
glibc before we flip the production deploy path, and prototypes the two-disk
lambda topology to be lifted into deploy-all-services.
https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
Split the single warm-deps job into parallel warm-binaries (deployCargoArtifacts on <repo>-nix-store) and warm-lambdas (lambdaDeployCargoArtifacts on <repo>-nix-lambdas): separate sticky disks so the two chains never collide on a last-write-wins commit, and neither warm blocks the other's build matrix. Dropped the now-obsolete dev-shell warm (nothing uses `nix develop` anymore).
build-service-binaries now needs warm-binaries; build-lambda-artifacts needs warm-lambdas, mounts the lambda /nix disk (not the binary one), drops the per-service cargo-target disk and the sccache-stats step, and builds via `nix build .#deploy-lambda-<name>` (new build-cloud-storage-lambdas-nix.sh) instead of cargo-lambda-in-a-checkout. Same target/lambda/<name>/bootstrap.zip layout, so the handoff + deploy action are unchanged.
Result: unchanged handlers are pure nix cache hits (no mtime recompile), and a service's handlers build in parallel within one nix invocation. The old cargo-lambda script stays for the inline/other-workflow paths. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
… workflow The lambda crane build is validated (all 17 handlers green) and the deploy path is wired, so the feature-branch push trigger has served its purpose. Back to workflow_dispatch-only. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
The ~15s/job was the `cachix` CLI install (nix profile add), needed only to push via watch-store. Remove the setup-cachix step from warm-binaries, warm-lambdas, build-service-binaries and build-lambda-artifacts so they rely on the /nix sticky disks alone. setup-nix still puts nix on PATH, and nixpkgs deps still substitute from cache.nixos.org; only our own artifacts depend on the sticky disk now. Tradeoff (accepted for now): a cold/evicted sticky disk has no Cachix fallback, so it rebuilds from source. The leftover `cachix watch-store` guards are no-ops without the CLI. Re-enabling a cheap pull-only fallback later is just an extra-substituters line in setup-nix (no CLI install). https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
With setup-cachix gone, the cachix CLI isn't installed, so the watch-store push guards were permanent no-ops (and build-service-binaries logged a cosmetic "Cachix is unavailable" warning every run). Strip them from the warm jobs, the binary build step, and the lambda nix build script. The CACHIX_AUTH_TOKEN workflow_call secret is kept declared so existing callers don't error. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
The deploy build used a fresh docker/setup-buildx-action builder each run, so heavy base layers (convert-service's LibreOffice + Collabora, ~780MB) were re-pulled from ECR every deploy (~130s at ~4MB/s on a cold builder). Add an opt-in `use-blacksmith-builder` input to the shared deploy composite that swaps in useblacksmith/setup-docker-builder -- a buildkitd builder whose /var/lib/buildkit cache lives on a per-Dockerfile sticky disk, set as the default builder that Pulumi's docker-build provider uses. deploy-all-services opts in; other callers (which may run on non-Blacksmith runners) keep the stock buildx builder. https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ
…son-LXyqN # Conflicts: # .github/scripts/build-cloud-storage-lambdas.sh # infra/packages/resources/src/resources/load_balancer.ts
…into claude/gracious-thompson-LXyqN
What
Move the deploy service-binary builds onto Blacksmith runners and cache the Nix store on a Blacksmith sticky disk instead of relying solely on Cachix.
Design: a single `warm-deps` job builds the shared release dependency closure (`deployCargoArtifacts`) and commits it to the `/nix` sticky disk. The per-service build matrix then fans out, each job cloning the warm disk so the expensive shared deps are already in `/nix/store` and only the service's own crate compiles. This maps onto crane's existing layering: `deployCargoArtifacts` (deps-only) is already separate from the per-service `buildPackage`.
Changes
Production cutover (`deploy-all-services.yml`):
- `warm-deps` job → builds + commits shared deps to the sticky disk.
- `build-service-binaries` matrix now runs on Blacksmith, mounts the same sticky disk (clones the warm snapshot), and builds per service. The build/upload step, including the `nix-store` closure copy the deploy consumes, is unchanged.
- `warm-deps` failures surfaced in the deployment summary.
- `build-lambda-artifacts` left on its existing runner for now (different cargo-lambda build path); can follow in a separate change.
Supporting:
- `flake.nix`: expose `deployCargoArtifacts` as a package output (`nix build .#deployCargoArtifacts`).
- `.github/actions/setup-nix/action.yml`: install Nix on a runner that doesn't ship it (Blacksmith), or re-initialise just the daemon/config/`nixbld` users when `/nix` is restored warm from a sticky disk.
- `.github/workflows/deploy-binaries-blacksmith-poc.yml`: a standalone, non-deploying `workflow_dispatch` harness to validate the build/cache path in isolation, with a `cold-disk` toggle to A/B cold vs. warm builds. Safe to delete once confident.
Cachix kept as fallback
The sticky disk is the primary (L1) cache; Cachix stays wired as the fallback substituter (via `setup-cachix`), so a cold or evicted disk pulls prebuilt artifacts instead of compiling from source. `warm-deps` also still pushes to Cachix during migration. Dropping Cachix can be a later step.
I could not execute this (no Nix/Blacksmith in my environment). Before this deploy path can run green:
1. macro-inc/macro, and the runner label `blacksmith-8vcpu-ubuntu-2404` matches a provisioned pool (adjust if not).
2. `useblacksmith/stickydisk` allowed by org Actions policy.
3. `setup-nix` is the fiddly bit: the store persists on the disk but the systemd unit, `/etc/nix`, and `nixbld` users don't, so the action recreates them. Most likely the step to need a tweak after the first real run.
Suggested rollout: run the standalone PoC workflow first to shake out 1–3 above without risking a deploy; then exercise `deploy-all-services` against dev. Because this is a direct cutover, if Blacksmith isn't fully wired up the dev/prod binary builds will fail until reverted. Happy to add a toggle to fall back to the old `linux-extra-beefy` + Cachix path if you'd prefer a safety valve.
How to test
- `[PoC] Deploy Binaries on Blacksmith` (workflow_dispatch): first run = cold disk, re-run = warm; `cold-disk` input forces a comparison. Watch per-service build times to confirm deps aren't recompiling.
- dev.
https://claude.ai/code/session_01R2zCM4cvNDRHPkN93Fw3DJ