Skip to content

[Feature] New API to discover, fetch, and call tools from server extensions #1521

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Abigayle-Mercer
Copy link

This adds a structured tool discovery interface via ExtensionPoint.tools, aggregated through ExtensionManager.get_tools, and exposed in ServerApp via get_tools. Was not sure how far to 'lift' this functionality.

It also introduces a new REST endpoint at /api/tools, implemented in a new ListToolInfoHandler class under services/contents/handlers.py. I am also unsure if this is the best place to put the ListToolInfoHandler.

Still exploring if we should, and how best we would integrate the "run tool" functionality from structured tool calls — would love guidance on that.

Still needs unit tests, will probably implement those once the full functionality integration is complete.

Feedback welcome and needed!

@Zsailer Would love your thoughts on:

  • Overall structure

  • Handler placement (under services/contents or maybe a new service? services/tools?)

  • Logger usage (especially in ExtensionPoint.tools)

@Zsailer Zsailer changed the title Feature/tool discovery api [Feature] New API to discover, fetch, and call tools from server extensions May 12, 2025
@Zsailer
Copy link
Member

Zsailer commented May 12, 2025

Nice work, @Abigayle-Mercer! I'll leave comments/review inline.

tools[name] = tool
except Exception as e:
# not sure if this should fail quietly, raise an error, or log it?
print(f"[tool-discovery] Failed to load tools from {self.module_name}: {e}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using print, we should make this a self.log.error.

Before we do that, though, we'll need to make ExtensionPoint a LoggingConfigurable and set the parent trait to ExtensionPackage. I can work on this in a separate PR and we can rebase here once that's merged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1522. If/when that gets merged, we can rebase your PR to use the logger created there.

@krassowski
Copy link
Collaborator

Trying to understand what this PR does and how it fits with other discussions, I was able to find:

But I probably missed some discussion or a JEP.

definitions = tools_func()
for name, tool in definitions.items():
# not sure if we should just pick MCP schema or make the schema something the user can pass
jsonschema.validate(instance=tool["metadata"], schema=MCP_TOOL_SCHEMA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, let's not require MCP here. Maybe MCP is the default (for now), but we should let server extensions arrive with their own schema that the server can validate against.

Maybe the returned value of the jupyter_server_extension_tools hook is a tuple with the tool definition and the schema to validate against?

Copy link
Author

@Abigayle-Mercer Abigayle-Mercer May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good — so just to clarify, you're thinking one validation schema per extension (covering all of its exposed tools), rather than per tool?

Also, should we assume each tool will include both metadata and a callable? If so, does that remove the need to build something like a server-exposed endpoint for tool execution? I’m thinking specifically of my run function prototype: https://github.com/Abigayle-Mercer/list_ai_tools/blob/main/list_ai_tools/list_ai_tools.py#L103, which takes structured tool calls and executes them dynamically.

Just wondering if you envision those callables staying internal to the server, or whether we’d eventually want something like POST /api/tools/:name to run them.

@Zsailer
Copy link
Member

Zsailer commented May 12, 2025

Trying to understand what this PR does and how it fits with other discussions, I was able to find:

But I probably missed some discussion or a JEP.

@krassowski 👋 thanks for taking a look!

@Abigayle-Mercer is a Master's student from Cal Poly that's been doing her thesis, exploring ways to integrate models, agents, etc. into Jupyter applications. I encouraged her to open a Draft PR so she can begin collaborating with folks here. This is her first PR to the ecosystem, so just encouraging her to get her work public for discussion earlier in the cycle.

She's got a whole bunch of work and cool demos to share, so she'll likely start socializing some of this over the next few weeks.

Don't you worry; a JEP will be submitted + voted on before considering merging this :)

@krassowski
Copy link
Collaborator

krassowski commented May 12, 2025

At the risk of being off-topic, I love https://github.com/Abigayle-Mercer/test_jupyter_notebook_tools/blob/main/test_jupyter_notebook_tools/ynotebook_tools.py - I have something very similar (unfortunately not public yet) and was recently also exploring how to put the cursor of the backend agent in the editor when streaming and using the collaboration layer which led to opening y-crdt/pycrdt#263 - I would love to see this evolve further.

@Abigayle-Mercer
Copy link
Author

Thanks again everyone for taking a look!

@Zsailer I'll be looking out for #1523 and work on the other edits in the meantime.

@krassowski I’d love to follow the progress on y-crdt/pycrdt#263 — having access to the sticky index for cursor placement would open up really exciting possibilities for streaming agents.

My write_to_cell (https://github.com/Abigayle-Mercer/test_jupyter_notebook_tools/blob/main/test_jupyter_notebook_tools/ynotebook_tools.py#L23) function is currently just faking streaming by applying character-level diffs and replaying them, but it’s purely visual. If the agent could control its cursor in a collaborative session — and users could actually see it working — we could get a lot closer to live agent editing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants