@smurching smurching commented Jan 16, 2026

Some improvements from dogfooding the agent app template:

  • Enable deploying the app via DAB (and importing an existing app via DAB) to make it easier to add additional resources to the app
  • Add docs on how to add resources to the app, how to run the quickstart script non-interactively, etc.
  • Add a tool discovery script plus guidance on how to use it to discover relevant tools for agent building

Testing

Tested by asking Cursor and Claude Code to "create and deploy an agent to Databricks". Both were able to mostly one-shot it using Sonnet 4.5.

Claude Output

image

Cursor Output

image

smurching and others added 16 commits January 11, 2026 21:08
Updates from databricks/cli agent template:
- Automatic MLflow experiment creation in databricks.yml
- Port flag support in start_app.py (--port, --host, --workers, --reload)
- Git repository initialization documentation in AGENTS.md
- Non-interactive mode for quickstart.sh (--profile, --host flags)

These changes improve the deployment experience and resolve issues found
during testing.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Updates from databricks/cli agent template:
- Tool discovery script (discover-tools) for finding UC functions, tables, vector search, etc.
- Automatic MLflow experiment creation in databricks.yml
- Port flag support in start_app.py (--port, --host, --workers, --reload)
- Git repository initialization documentation in AGENTS.md
- Non-interactive mode for quickstart.sh (--profile, --host flags)

These changes improve the deployment experience and help developers discover
available tools and data sources before building agents.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Signed-off-by: Sid Murching <[email protected]>
- Flatten template/{{.project_name}}/ structure to root directory
- Remove databricks_template_schema.json
- Convert databricks.yml.tmpl to fixed databricks.yml
- Add discover-tools script entry point in pyproject.toml
- Create streamlined AGENTS.md for Claude Code guidance
- Separate tool discovery as explicit step (not automatic)
- Remove template instantiation dependencies
Add comprehensive sections from CLI template:
- Getting Started intro paragraph with reference to README.md
- Detailed quickstart options (--host, --profile, git init step)
- Extended discover-tools options (--profile, --max-results, --max-schemas)
- Complete 'Running the App Locally' section with server options and troubleshooting
- 'Next Steps' section with setup completion checklist

This provides better onboarding for both users and AI agents helping with setup.
Add argument parsing to start-app script to support --port, --host,
--workers, and --reload flags. Previously these arguments were ignored
and the backend always used default values.

Changes:
- Import argparse module
- Add usage documentation to docstring
- Accept args parameter in ProcessManager.run()
- Build backend command conditionally based on parsed arguments
- Parse CLI arguments in main() before calling run()

Now 'uv run start-app --port 8001' correctly starts backend on port 8001.
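
As a rough illustration of the change described in this commit message, the sketch below parses the four flags and builds the backend command conditionally. It is not the PR's actual code: the function names and the assumption that the backend is launched as `uv run start-server` are illustrative only.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags named in the commit message."""
    parser = argparse.ArgumentParser(prog="start-app")
    parser.add_argument("--port", type=int, default=None)
    parser.add_argument("--host", default=None)
    parser.add_argument("--workers", type=int, default=None)
    parser.add_argument("--reload", action="store_true")
    return parser.parse_args(argv)


def build_backend_command(args: argparse.Namespace) -> List[str]:
    """Append a flag only when the user supplied it, so that the
    backend's own defaults apply otherwise."""
    # Assumption: the backend is started via "uv run start-server".
    cmd = ["uv", "run", "start-server"]
    if args.port is not None:
        cmd += ["--port", str(args.port)]
    if args.host is not None:
        cmd += ["--host", args.host]
    if args.workers is not None:
        cmd += ["--workers", str(args.workers)]
    if args.reload:
        cmd.append("--reload")
    return cmd


# prints ['uv', 'run', 'start-server', '--port', '8001']
print(build_backend_command(parse_args(["--port", "8001"])))
```
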
Replace allowlist approach with parse_known_args() to pass all options
directly to start-server. This eliminates maintenance burden of keeping
start-app's argument list in sync with start-server.

Benefits:
- Any new start-server option automatically works with start-app
- No need to update start-app when start-server adds features
- Simpler, more maintainable code
- Better forward compatibility

Usage remains the same: 'uv run start-app --port 8001 --reload' etc.
Signed-off-by: Sid Murching <[email protected]>
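
The pass-through approach from the commit above can be sketched with `argparse.ArgumentParser.parse_known_args()`, which returns the parsed namespace together with a list of arguments it did not recognize. This is a minimal sketch, not the PR's implementation: `--skip-frontend` is a hypothetical wrapper-only flag added purely to show the split, and `uv run start-server` as the backend command is an assumption.

```python
import argparse
from typing import List, Optional, Tuple


def split_args(
    argv: Optional[List[str]] = None,
) -> Tuple[argparse.Namespace, List[str]]:
    """Separate wrapper-only flags from everything else."""
    parser = argparse.ArgumentParser(prog="start-app")
    # Hypothetical wrapper-only flag, not taken from the PR.
    parser.add_argument("--skip-frontend", action="store_true")
    # Unrecognized arguments are returned in order instead of raising.
    return parser.parse_known_args(argv)


def build_backend_command(passthrough: List[str]) -> List[str]:
    """Forward unknown arguments to the backend untouched."""
    # Assumption: the backend is started via "uv run start-server".
    return ["uv", "run", "start-server", *passthrough]


known, extra = split_args(["--skip-frontend", "--port", "8001", "--reload"])
# prints ['uv', 'run', 'start-server', '--port', '8001', '--reload']
print(build_backend_command(extra))
```

With this shape the wrapper never needs to know which options the backend supports, which is the maintenance benefit the commit message describes.
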
Add command-line argument parsing to support running quickstart
non-interactively with --profile and --host flags.

Features:
- --profile NAME: Use specified Databricks profile without prompting
- --host URL: Provide workspace URL for initial setup
- -h, --help: Show usage information

This enables CI/CD and automation scenarios while maintaining
interactive mode as the default.

Examples:
  ./scripts/quickstart.sh --profile DEFAULT
  ./scripts/quickstart.sh --host https://workspace.cloud.databricks.com
When no existing profiles exist, databricks auth login requires both
--profile and --host flags to work properly. Previously we only passed
--host, which would fail.

Changes:
- Add --profile DEFAULT to databricks auth login command
- Simplify profile name handling (always use DEFAULT for new configs)
- Remove unnecessary profile name extraction logic

Tested with fresh config (no ~/.databrickscfg) and confirmed it works.
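
For automation written in Python rather than shell, the same fix (always passing both `--host` and `--profile` to `databricks auth login`) might look like the sketch below. The helper name and the fallback to `DEFAULT` mirror the commit message but are otherwise assumptions; the command is only constructed here, not executed.

```python
from typing import List, Optional

# The commit message says new configs always use the DEFAULT profile.
DEFAULT_PROFILE = "DEFAULT"


def build_auth_login_command(host: str, profile: Optional[str] = None) -> List[str]:
    """Build a `databricks auth login` invocation that always carries
    both --host and --profile."""
    return [
        "databricks", "auth", "login",
        "--host", host,
        "--profile", profile or DEFAULT_PROFILE,
    ]


cmd = build_auth_login_command("https://workspace.cloud.databricks.com")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```
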
Add detailed guidance on granting app access to agentic resources like
Vector Search, Genie spaces, UC functions, and more.

Content includes:
- Critical resource permissions warning
- Complete workflow example with Genie space
- Resource type examples for all common resources:
  - Unity Catalog functions
  - Unity Catalog connections (for external MCP servers)
  - Vector Search indexes
  - SQL warehouses
  - Model serving endpoints
  - Genie spaces
- Special handling for custom MCP servers (Databricks Apps)
- Important notes about default MLflow experiment access

This helps users avoid common permission errors when integrating
their agents with workspace resources.
@smurching smurching requested a review from bbqiu January 16, 2026 17:39
@@ -0,0 +1,403 @@
#!/usr/bin/env python3
"""
Discover available tools and data sources for Databricks agents.

The docstring is missing external MCP servers here

@bbqiu bbqiu left a comment

overall LGTM with a few small comments. thank you for working on this! it'll be super useful to make the one-shot possibility easier

for future reference, asked claude to make a summary of improvements that were made in this PR so it's easier to apply changes to other directories

AI DevEx Checklist

Documentation (AGENTS.md)

  • Mandatory first action with diagnostic command
  • Error handling playbooks for common failures
  • ⚠️ CRITICAL sections for must-do steps
  • Multi-file example workflows with explicit connections
  • Copy-pasteable templates for each config type
  • Key Files reference table

Scripts

  • CLI flags for all interactive prompts (--profile, --host, --yes)
  • Wrapper scripts pass through unknown arguments
  • Explicit resource naming (not parsed from output)
  • Discovery script for available resources
  • JSON output format for programmatic use

Configuration

  • Infrastructure-as-code for deployment
  • Explicit dev/prod targets
  • Config files tracked in git (not ignored)
  • Named script entry points in pyproject.toml/package.json
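
To make the "JSON output format for programmatic use" item in the checklist above concrete, here is one possible shape for a discovery script's output step. Everything in it is hypothetical: the function name, the `output_format` parameter, and the sample data are illustrative and not taken from the PR.

```python
import json
from typing import Any, Dict, List


def format_results(
    results: Dict[str, List[Dict[str, Any]]], output_format: str = "json"
) -> str:
    """Render discovery results as JSON for tools, or as text for humans."""
    if output_format == "json":
        # Sorted keys keep the output stable across runs.
        return json.dumps(results, indent=2, sort_keys=True)
    lines: List[str] = []
    for kind in sorted(results):
        lines.append(f"{kind} ({len(results[kind])})")
        for item in results[kind]:
            lines.append(f"  - {item.get('name', '<unnamed>')}")
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
sample = {"genie_spaces": [{"name": "My Genie Space"}], "uc_functions": []}
print(format_results(sample, "json"))
print(format_results(sample, "text"))
```

An AI coding agent can parse the JSON form directly instead of scraping human-oriented text, which is the motivation behind that checklist item.
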

genie_space:
  name: 'My Genie Space'
  space_id: '01234567-89ab-cdef'
  permission: 'CAN_RUN'

nit: could we also add an mlflow experiment here

  "databricks-openai>=0.8.0",
  "mlflow>=3.8.0rc0",
- "openai-agents>=0.4.1",
+ "openai-agents>=0.4.1,<0.6.3",

ooc why are we putting an upper bound here? jic future updates break things?

Contributor Author

Good catch. This was just to work around the strict parameter validation issue with the latest OpenAI Agents SDK, which is now fixed in the latest databricks-openai, so we can remove this upper bound.

return spaces


def discover_mcp_servers() -> List[Dict[str, Any]]:

nit: can we rename to be discover_local_mcp_servers

Contributor Author

Good catch, we should just remove this. There's no convention that packages starting with mcp- correspond to MCP servers.

uv run discover-tools --catalog my_catalog --schema my_schema

# Customize search depth for faster execution
uv run discover-tools --max-results 50 --max-schemas 10

nit: do these params get passed to discover tools?
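
One way to verify the reviewer's question is to trace the flags from the parser into the discovery call. The sketch below uses the flag names that appear in this PR (`--catalog`, `--schema`, `--profile`, `--max-results`, `--max-schemas`), but the default values and the `discover_functions` stand-in are assumptions: the real script queries the workspace, which is not reproduced here.

```python
import argparse
from typing import Iterable, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="discover-tools")
    parser.add_argument("--catalog")
    parser.add_argument("--schema")
    parser.add_argument("--profile")
    # Default limits are assumptions, not the script's real defaults.
    parser.add_argument("--max-results", type=int, default=100)
    parser.add_argument("--max-schemas", type=int, default=25)
    return parser.parse_args(argv)


def discover_functions(
    schemas: Iterable[str], max_results: int, max_schemas: int
) -> List[str]:
    """Stand-in for a workspace query: visits at most max_schemas
    schemas and returns at most max_results names."""
    found: List[str] = []
    for schema in list(schemas)[:max_schemas]:
        if len(found) >= max_results:
            break
        # Fake result: one placeholder function per schema.
        found.append(f"{schema}.example_function")
    return found


args = parse_args(["--max-results", "50", "--max-schemas", "10"])
# The parsed limits are passed explicitly rather than dropped.
print(discover_functions(["a", "b", "c"], args.max_results, args.max_schemas))
```
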

# Streaming agent logic here
pass
```bash
uv run start-app

nit: should we document the --reload flag here as well?

Signed-off-by: Sid Murching <[email protected]>
Signed-off-by: Sid Murching <[email protected]>
…start-app --port <port_number>, the frontend targets localhost://<port_number>

Signed-off-by: Sid Murching <[email protected]>
@smurching smurching merged commit a57ec1e into databricks:main Jan 19, 2026