
Add databricks-execution-compute skill — serverless, classic cluster, and file execution#278

Open
GeorgeTheo99 wants to merge 5 commits into databricks-solutions:main from GeorgeTheo99:feature/serverless-code-runner

Conversation


@GeorgeTheo99 GeorgeTheo99 commented Mar 9, 2026

Summary

Adds a unified databricks-execution-compute skill covering all three code execution tools, plus new capabilities for multi-language file execution and persistent workspace notebooks.

New skill: databricks-execution-compute

Consolidates what was previously databricks-serverless-compute into a single skill covering:

| Tool | Compute | Languages |
| --- | --- | --- |
| `execute_databricks_command` | Classic cluster | Python, Scala, SQL, R |
| `run_file_on_databricks` (new) | Classic cluster | Auto-detect from extension |
| `run_code_on_serverless` (new) | Serverless | Python, SQL, .ipynb |

New: run_code_on_serverless

  • Execute Python, SQL, or Jupyter notebooks (.ipynb) on serverless compute via Jobs API — no cluster required
  • Auto-detects .ipynb content and uploads via Databricks' native Jupyter import
  • Full error tracebacks captured from run output (not just generic "Workload failed")
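The .ipynb auto-detection described above could be sketched as below. This is a minimal illustration, not the PR's actual code: the function name `looks_like_ipynb` and the exact heuristic are made up, but any detector would hinge on the standard `nbformat`/`cells` top-level keys of the Jupyter notebook format.

```python
import json


def looks_like_ipynb(source: str) -> bool:
    """Heuristic: treat the payload as a Jupyter notebook if it parses
    as JSON and carries the standard nbformat top-level keys."""
    try:
        doc = json.loads(source)
    except (ValueError, TypeError):
        return False
    return isinstance(doc, dict) and "nbformat" in doc and "cells" in doc


# Plain Python source is not mistaken for a notebook:
print(looks_like_ipynb("print('hello')"))  # False
# A minimal notebook document is recognized:
print(looks_like_ipynb('{"nbformat": 4, "nbformat_minor": 5, "cells": []}'))  # True
```

Content sniffing like this lets one entry point accept both raw code strings and whole notebooks without an extra parameter.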

New: run_file_on_databricks (renamed from run_python_file_on_databricks)

  • Auto-detects language from file extension (.py, .scala, .sql, .r)
  • Explicit language parameter for override
  • Old name kept as backwards-compatible alias

New: workspace_path — ephemeral vs persistent mode

Both run_code_on_serverless and run_file_on_databricks support:

  • Ephemeral (default) — no workspace artifact, temp files cleaned up
  • Persistent — pass workspace_path to save the notebook in the Databricks workspace for project work (model training, ETL, etc.)

Cluster management helpers

list_clusters, get_best_cluster, start_cluster, get_cluster_status — with actionable error messages when no cluster is available (suggests startable clusters, serverless alternatives)
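The "actionable error" behavior could look like the following pure-Python sketch. The cluster-dict shape and selection order are assumptions for illustration; the real helpers go through the Databricks clusters API:

```python
def get_best_cluster(clusters: list[dict]) -> dict:
    """Pick a cluster to run on: prefer RUNNING clusters; otherwise fail
    with a message naming startable clusters and the serverless fallback."""
    running = [c for c in clusters if c.get("state") == "RUNNING"]
    if running:
        # Simplified: first RUNNING cluster wins in this sketch.
        return running[0]
    startable = [c for c in clusters if c.get("state") == "TERMINATED"]
    if startable:
        names = ", ".join(c["name"] for c in startable)
        raise RuntimeError(
            f"No running cluster. Startable clusters: {names}. "
            "Or use run_code_on_serverless (no cluster required)."
        )
    raise RuntimeError("No clusters available; consider serverless compute.")


clusters = [
    {"name": "batch-etl", "state": "TERMINATED"},
    {"name": "shared-dev", "state": "RUNNING"},
]
print(get_best_cluster(clusters)["name"])  # shared-dev
```

Naming concrete next steps in the error (which clusters to start, which serverless tool to switch to) is what makes the failure actionable for an agent rather than a dead end.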

Files changed

| File | Change |
| --- | --- |
| `databricks-tools-core/.../compute/serverless.py` | Core serverless implementation (new) |
| `databricks-tools-core/.../compute/execution.py` | Renamed function, language detection, `workspace_path` |
| `databricks-tools-core/.../compute/__init__.py` | Updated exports |
| `databricks-mcp-server/.../tools/compute.py` | MCP tool wrappers for all new params |
| `databricks-skills/databricks-execution-compute/SKILL.md` | New consolidated skill |
| `databricks-skills/databricks-serverless-compute/` | Removed (replaced by above) |
| `databricks-skills/install_skills.sh` | Updated skill list and description |
| `tests/integration/compute/test_execution.py` | Expanded tests for new features |
| `tests/integration/compute/test_serverless.py` | New serverless test suite |

Test plan

  • 34/34 integration tests passing on e2-demo-field-eng
  • Classic cluster: shared context, variable persistence, SQL, Spark, error handling, destroy context
  • run_file_on_databricks: Python/SQL auto-detect, language override, empty file, file not found
  • run_file_on_databricks + workspace_path: notebook uploaded and verified in workspace
  • Serverless Python: dbutils.notebook.exit, computation, error handling, Spark, custom run name
  • Serverless SQL: DDL execution
  • Serverless ephemeral: no workspace_path in result/dict
  • Serverless persistent: notebook saved at workspace_path, verified via workspace API
  • Input validation: empty code, whitespace, unsupported language, return types, to_dict
  • Backwards compatibility: run_python_file_on_databricks alias confirmed
  • Skill validation: validate_skills.py passes (26 skills)

This pull request was AI-assisted by Isaac.

George Theodosopoulos and others added 3 commits March 9, 2026 14:45
Adds a new `run_code_on_serverless()` function that executes Python or SQL
code on Databricks serverless compute using the Jobs API `runs/submit`
endpoint. No interactive cluster is required.

The implementation:
- Uploads code as a temporary notebook to the workspace
- Submits a one-time run with serverless compute (environments + environment_key pattern)
- Waits for completion and retrieves output via get_run_output
- Cleans up temporary workspace files after execution
- Returns a typed ServerlessRunResult with output, error, run_id, run_url, and timing
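The "environments + environment_key pattern" mentioned above refers to how a serverless one-time run is shaped: the task references an `environment_key` instead of a cluster spec, and a matching entry in the top-level `environments` list describes the serverless environment. A sketch of the request body; field names follow the public Jobs API `runs/submit` documentation as I understand it, and the `spec` contents are illustrative:

```python
def build_serverless_submit_payload(notebook_path: str, run_name: str) -> dict:
    """Assemble a one-time runs/submit request body for serverless compute.
    No cluster spec appears anywhere: the task points at an environment_key,
    which the `environments` list resolves to a serverless environment."""
    return {
        "run_name": run_name,
        "tasks": [
            {
                "task_key": "run_code",
                "notebook_task": {"notebook_path": notebook_path},
                "environment_key": "default",
            }
        ],
        "environments": [
            {
                "environment_key": "default",
                "spec": {"client": "1"},  # environment version; illustrative value
            }
        ],
    }


payload = build_serverless_submit_payload("/tmp/serverless_run_abc", "adhoc-run")
print(payload["tasks"][0]["environment_key"])  # default
```

The key point is the absence of any `new_cluster` block, which is what makes the run serverless.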

New files and changes:
- databricks-tools-core: compute/serverless.py (core module)
- databricks-tools-core: compute/__init__.py (exports)
- databricks-mcp-server: tools/compute.py (MCP tool wrapper)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
… improve docs

- Retrieve actual Python traceback on failure instead of generic "Workload
  failed" message by fetching run output in the exception handler
- Fix Optional[str] type annotation for run_name in MCP wrapper
- Document SQL SELECT output limitation in all docstrings
- Reframe tool as Python-first; clarify SQL is niche (DDL/DML only, use
  execute_sql for queries — works with serverless SQL warehouses)
- Add databricks-serverless-compute skill file with decision matrix,
  output capture behavior, limitations, and examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Auto-detects .ipynb JSON content and uploads via Databricks native
Jupyter import (ImportFormat.JUPYTER), enabling users to run local
Jupyter notebooks on serverless compute without conversion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@GeorgeTheo99 GeorgeTheo99 marked this pull request as ready for review March 10, 2026 13:57
@calreynolds calreynolds self-requested a review March 12, 2026 21:05
…ge and workspace_path support

- Rename databricks-serverless-compute → databricks-execution-compute covering all three
  execution tools (execute_databricks_command, run_file_on_databricks, run_code_on_serverless)
- Rename run_python_file_on_databricks → run_file_on_databricks with language auto-detection
  from file extension (.py, .scala, .sql, .r); old name kept as alias
- Add workspace_path param to run_file_on_databricks and run_code_on_serverless for
  persistent mode (saves notebook to workspace) vs ephemeral (default, temp cleanup)
- Add comprehensive integration tests (34 tests) covering classic cluster execution,
  serverless Python/SQL, ephemeral vs persistent modes, input validation, and error handling
- Update MCP tool layer with new params and empty-string-to-None coercion
- Update install_skills.sh with new skill name and description

Co-authored-by: Isaac
@GeorgeTheo99 GeorgeTheo99 changed the title Add serverless code runner tool (Python, SQL, .ipynb — no cluster required) Add databricks-execution-compute skill — serverless, classic cluster, and file execution Mar 13, 2026
…le management

Core functions (manage.py): create/modify/terminate/delete clusters with opinionated
defaults (auto-pick LTS DBR, reasonable node type, SINGLE_USER mode, 120min auto-term),
plus create/modify/delete SQL warehouses. List helpers for node types and spark versions.
MCP tool wrappers with destructive-action warnings in docstrings. SKILL.md with decision
matrix, tool reference tables, and examples. Integration tests validated against e2-demo-field-eng.

Co-authored-by: Isaac
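The "auto-pick LTS DBR" default from the commit above could be sketched as follows. The list shape mirrors what a spark-versions listing typically returns (`key`/`name` pairs); the selection rule shown here is an assumption, not the PR's exact logic:

```python
import re


def pick_latest_lts(spark_versions: list[dict]) -> str:
    """From {"key", "name"} spark-version entries, return the key of the
    highest-numbered runtime whose display name is marked LTS."""
    lts = [v for v in spark_versions if "LTS" in v.get("name", "")]
    if not lts:
        raise ValueError("No LTS runtime found")

    def version_tuple(v: dict) -> tuple[int, int]:
        m = re.match(r"(\d+)\.(\d+)", v["key"])
        return (int(m.group(1)), int(m.group(2))) if m else (0, 0)

    return max(lts, key=version_tuple)["key"]


versions = [
    {"key": "15.4.x-scala2.12", "name": "15.4 LTS (includes Apache Spark 3.5.0)"},
    {"key": "16.1.x-scala2.12", "name": "16.1 (beta)"},
    {"key": "14.3.x-scala2.12", "name": "14.3 LTS"},
]
print(pick_latest_lts(versions))  # 15.4.x-scala2.12 (newest LTS; beta is skipped)
```

Filtering to LTS before taking the max is the safe opinionated default: newer beta runtimes are excluded even when their version numbers are higher.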