
chore: bump llama.cpp to b9404; 1.0.9265:0 → 1.0.9404:0 #2

Merged: MattDHill merged 2 commits into master from next on May 29, 2026

@helix-nine

Collaborator

Summary

  • Bumps the wrapped llama-server images from upstream build b9265 → b9404 (upstreamBuild in startos/manifest/index.ts). All four variants — generic, nvidia (cuda), rocm, vulkan — are published for b9404 on ghcr.io/ggml-org/llama.cpp and bump together.
  • StartOS version 1.0.9265:0 → 1.0.9404:0, with one-line release notes in all five locales (startos/versions/current.ts).
  • npm update refreshed transitive deps. @start9labs/start-sdk is unchanged at 1.5.3 (already the latest on npm), so no SDK bump and no migration.
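The version bump above follows a visible pattern: build b9265 maps to 1.0.9265:0 and b9404 maps to 1.0.9404:0. A minimal sketch of that mapping, assuming the `1.0.<build>:<revision>` shape holds in general; `versionForBuild` is a hypothetical helper for illustration, not something in this package or in `@start9labs/start-sdk`:

```typescript
// Hypothetical helper (assumption): derives the StartOS package version
// string from an upstream llama.cpp build tag such as "b9404".
// The "1.0." prefix and ":0" revision are inferred from the two versions
// named in this PR, not from any documented rule.
function versionForBuild(upstreamBuild: string, revision: number = 0): string {
  const match = /^b(\d+)$/.exec(upstreamBuild);
  if (match === null) {
    throw new Error(`unexpected upstream build tag: ${upstreamBuild}`);
  }
  return `1.0.${match[1]}:${revision}`;
}

console.log(versionForBuild("b9265")); // old version string
console.log(versionForBuild("b9404")); // new version string
```

Keeping the build number inside the version string means a reader can tell which upstream build a package wraps without opening the manifest.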

Notes

  • GitHub's latest release is b9410, but its server images aren't published on ghcr.io yet. Per UPDATING.md, b9404 is the highest build with all four variants resolving, so that's the target.
  • The launch flags this package relies on (--host, --port, --api-key, and -hf / --hf-file via the model presets) are core, stable llama-server flags — unchanged across the b9265→b9404 range, so the presets in startos/actions/presets.ts remain valid. No breaking flag/server-API changes apply.
  • No user-visible behavior change beyond the version bump, so README.md and instructions.md need no edits (neither references a version/build number).
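The target-selection rule in the first note (take the highest build for which all four variants resolve) can be sketched as below. This is only an illustration under stated assumptions: the `PublishedBuild` shape, the variant labels, and the `observed` sample are hypothetical stand-ins, and the sketch does not query ghcr.io or reproduce whatever procedure UPDATING.md actually prescribes.

```typescript
// Variant labels as named in this PR; treated here as plain strings (assumption).
const REQUIRED_VARIANTS: readonly string[] = ["generic", "cuda", "rocm", "vulkan"];

// Hypothetical record: which variant images were found for one upstream build.
interface PublishedBuild {
  build: number; // numeric part of the tag, e.g. 9404 for b9404
  variants: readonly string[];
}

// Returns the highest build number that has every required variant,
// or undefined when no build is complete.
function highestCompleteBuild(
  published: readonly PublishedBuild[],
): number | undefined {
  const complete = published
    .filter((p) => REQUIRED_VARIANTS.every((v) => p.variants.indexOf(v) !== -1))
    .map((p) => p.build);
  return complete.length > 0 ? Math.max(...complete) : undefined;
}

// Illustrative input mirroring the situation described in the notes:
// b9410 is released but has no images yet, b9404 has all four.
const observed: PublishedBuild[] = [
  { build: 9410, variants: [] },
  { build: 9404, variants: ["generic", "cuda", "rocm", "vulkan"] },
  { build: 9265, variants: ["generic", "cuda", "rocm", "vulkan"] },
];

console.log(highestCompleteBuild(observed));
```

Requiring every variant before bumping keeps the four package flavors on the same upstream build, which matches the "bump together" statement in the summary.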

Test plan

  • npm run check (tsc) is green.
  • make generic (and ideally all four variants) builds the .s9pk — verified in review per the monitor cycle.

Bump the wrapped llama.cpp server images from upstream build b9265 to
b9404 (highest build with all four variants — generic, cuda, rocm,
vulkan — published on ghcr.io; the latest GitHub release b9410 has no
images published yet). npm update refreshed transitive deps;
@start9labs/start-sdk stays at 1.5.3 (already latest).
MattDHill previously approved these changes May 29, 2026
The rocm variant exhausts runner disk (no space left on device) while
pulling the GPU image. Enable the shared-workflow disk-cleanup step, as
the comparably large ollama package already does.
@MattDHill MattDHill merged commit cc06c16 into master May 29, 2026
1 check passed
2 participants