Add --gpu (or similar) flag to nemoclaw onboard #1751

@jolle-ag

Problem Statement

Problem

nemoclaw onboard omits --gpu on both openshell gateway start and openshell sandbox create. The source comments in bin/lib/onboard.js explain the reasoning:

// Do NOT pass --gpu here. On DGX Spark (and most GPU hosts), inference is
// routed through a host-side provider (Ollama, vLLM, or cloud API) — the
// sandbox itself does not need direct GPU access.

This is correct for the typical use case. However, there are use cases where the agent's task (not its inference) requires GPU access inside the sandbox — for example, running GPU-accelerated numerical computations, scientific simulations, or matrix operations as part of an autonomous workflow.

In these cases, the agent needs both:

  1. Inference routed through the gateway (cloud API) — already handled by onboard
  2. GPU hardware accessible inside the sandbox container — not currently possible through onboard

Proposed Design

Request

CLI change: accept --gpu in bin/nemoclaw.js and forward it to both openshell gateway start and openshell sandbox create. As far as I know, the existing --gpu handling in those two commands already does the heavy lifting (NVIDIA device plugin, k8s resource requests), so no changes should be needed to the Dockerfile, blueprint, or policies.
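A minimal sketch of the flag plumbing, for illustration only. The helper names (wantsGpu, gatewayStartArgs, sandboxCreateArgs) are hypothetical and are not taken from bin/nemoclaw.js or bin/lib/onboard.js. The only openshell flags used are the ones that already appear in this issue.

```javascript
// Hypothetical helpers; the real onboard code is structured differently.

// True when the user passed --gpu to `nemoclaw onboard`.
function wantsGpu(argv) {
  return argv.includes("--gpu");
}

// Argument list for `openshell gateway start`; --gpu is appended only on request.
function gatewayStartArgs(gatewayName, gpu) {
  const args = ["gateway", "start", "--gateway", gatewayName];
  if (gpu) args.push("--gpu");
  return args;
}

// Argument list for `openshell sandbox create`; --gpu is appended only on request.
function sandboxCreateArgs(dockerfile, policy, gpu) {
  const args = ["sandbox", "create", "--from", dockerfile, "--policy", policy];
  if (gpu) args.push("--gpu");
  return args;
}
```

Without --gpu the generated arguments are unchanged, so the default behavior described in the source comments is preserved.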

Usage

nemoclaw onboard --gpu

Behaves identically to nemoclaw onboard except the gateway and sandbox are created with GPU passthrough enabled. Requires NVIDIA drivers and the NVIDIA Container Toolkit on the host (same prereqs as openshell gateway start --gpu).
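Since --gpu adds host prerequisites, onboard could fail early with a clear message when they are missing. The sketch below is one possible preflight check, not something openshell or nemoclaw is known to do. It assumes that finding nvidia-smi on PATH indicates the drivers are installed and that finding nvidia-ctk indicates the NVIDIA Container Toolkit is installed. Both function names are hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// True if an executable called `name` exists in one of the PATH directories.
function isOnPath(name, envPath = process.env.PATH || "") {
  return envPath
    .split(path.delimiter)
    .filter(Boolean)
    .some((dir) => {
      try {
        fs.accessSync(path.join(dir, name), fs.constants.X_OK);
        return true;
      } catch {
        return false;
      }
    });
}

// Assumption: these two tools are a heuristic stand-in for the real prerequisites.
// Returns the names of the tools that could not be found.
function missingGpuPrereqs(lookup = isOnPath) {
  return ["nvidia-smi", "nvidia-ctk"].filter((tool) => !lookup(tool));
}
```

If missingGpuPrereqs() returns a non-empty list, onboard could print the missing tools and exit before creating the gateway.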

Alternatives Considered

Current workaround

After a successful nemoclaw onboard, I found a workaround, at the expense of tearing down and rebuilding what onboard just created:

  1. openshell gateway stop --gateway nemoclaw
  2. openshell gateway start --gateway nemoclaw --gpu --recreate
  3. openshell sandbox delete <name>
  4. Locate the nemoclaw Dockerfile inside the npm package, stage the build context manually
  5. openshell sandbox create --from <Dockerfile> --gpu --policy <base-policy>
  6. openshell provider create (provider config lost with gateway recreate)
  7. openshell inference set (inference route lost with gateway recreate)
  8. Reapply policy presets
  9. Reinstall sandbox dependencies

This requires knowledge of nemoclaw internals and is fragile across versions. It is probably also not the best solution, since I'm not an openshell expert.
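For reference, the openshell commands from steps 1, 2, 3 and 5 can be collected in one place, as sketched below. The function name gpuWorkaroundCommands and its parameters are hypothetical. sandboxName, dockerfile and basePolicy stand for the <name>, <Dockerfile> and <base-policy> placeholders, whose real values depend on the installation. Step 4 is manual, and steps 6 to 9 are left out because their arguments are not listed here.

```javascript
// Builds the openshell invocations for workaround steps 1, 2, 3 and 5.
// Each entry is [command, ...args], which avoids shell quoting issues.
function gpuWorkaroundCommands({ sandboxName, dockerfile, basePolicy, gateway = "nemoclaw" }) {
  return [
    // Step 1: stop the gateway that onboard created.
    ["openshell", "gateway", "stop", "--gateway", gateway],
    // Step 2: recreate it with GPU passthrough.
    ["openshell", "gateway", "start", "--gateway", gateway, "--gpu", "--recreate"],
    // Step 3: remove the sandbox that was created without --gpu.
    ["openshell", "sandbox", "delete", sandboxName],
    // Step 5: rebuild the sandbox from the staged Dockerfile with --gpu.
    ["openshell", "sandbox", "create", "--from", dockerfile, "--gpu", "--policy", basePolicy],
    // Steps 6 to 9 (provider create, inference set, policy presets, sandbox
    // dependencies) are omitted: their arguments are not given in this issue.
  ];
}
```

Each entry could then be passed to Node's child_process.execFileSync(cmd[0], cmd.slice(1), { stdio: "inherit" }), which stops at the first failing command.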

Category

enhancement: feature

Checklist

  • I searched existing issues and this is not a duplicate
  • This is a design proposal, not a "please build this" request

Metadata

Labels

NemoClaw CLI, OpenShell, Platform: DGX Spark, bug, enhancement
