
[Serving] Support tool function calls under strict format constraints #3190


Open · Irfnfnkemed wants to merge 22 commits into main

Conversation


@Irfnfnkemed Irfnfnkemed commented Mar 26, 2025

This PR supports tool function calls under strict format constraints. Specifically, it uses structural tags to constrain the calling format.
It makes the following changes:

  • Add "tool_call_format" attribute in EngineConfig, which determines the tool calls format
tool_call_format : Literal["json", "xml", "python"] = "json"
        The tool function call foramt.
        "json" means model will call tool function in json style format
        '{"name": func_name, "parameters": parameters(JSON dict)}',
        e.g. '{"name": "get_time", "parameters": {"location": "Pittsburgh"}}'.
        "xml" means model will call tool function in xml style format
        '<function=func_name>{parameters(JSON dict)}</function>',
        e.g. '<function=get_time>{"location": "Pittsburgh"}</function>'.
        "python" means model will call tool function in python-style format,
        e.g. 'wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")'.

In most cases, the "json" and "xml" modes meet the requirements. For models specialized in emitting Python-style code calls, the "python" mode can be used, in which case the output is parsed with the Python ast module. For a few special cases, users can use the structural-tag API to customize their own function call format.
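
As a rough illustration of how the three formats map back to a unified JSON-style call, here is a minimal parsing sketch. It is not the PR's implementation; the helper name, regex, and error handling are hypothetical.

import ast
import json
import re

def parse_tool_call(text: str, tool_call_format: str) -> dict:
    """Hypothetical helper: normalize one raw tool call string into the
    unified JSON-style form {"name": ..., "parameters": ...}."""
    if tool_call_format == "json":
        # e.g. '{"name": "get_time", "parameters": {"location": "Pittsburgh"}}'
        return json.loads(text)
    if tool_call_format == "xml":
        # e.g. '<function=get_time>{"location": "Pittsburgh"}</function>'
        match = re.fullmatch(r"<function=([^>]+)>(.*)</function>", text.strip(), re.S)
        return {"name": match.group(1), "parameters": json.loads(match.group(2))}
    if tool_call_format == "python":
        # e.g. 'wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")'
        call = ast.parse(text.strip(), mode="eval").body
        name = ast.unparse(call.func)  # e.g. "wolfram_alpha.call"
        params = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
        return {"name": name, "parameters": params}
    raise ValueError(f"unsupported tool_call_format: {tool_call_format}")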

  • Add "strict" attribute in ChatFunction, which is aligned to OpenAI API
  • Set system prompt according to tool_call_format
  • Set structural tag to ensure strict func calls
  • Parse output to json-style func calls
  • Add Structural-Tag api to RequestResponseFormat [Serving] Add Structural-Tag api to RequestResponseFormat #3187 , including:
    • Upgrade xgrammar to latest version
    • Add Structural-Tag-relevant attributes to RequestResponseFormat and modify corresponding process
    • Align RequestResponseFormat with open-ai protocol
    • Add test script for Structural-Tag
    • Use vocab_size in config.json instead of tokenizer.vocab_size to build xgrammar mask
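
For illustration, a structural-tag request body might look roughly like the sketch below. The field names ("tags", "triggers", "begin", "schema", "end") follow xgrammar's structural-tag concept and are assumptions here; they may not match the exact attributes this PR adds to RequestResponseFormat.

import json

# Hypothetical payload: whenever the output contains the trigger "<function=",
# the span must match one of the tags, i.e. a get_time call whose argument
# object satisfies the given JSON schema. Text outside triggered spans stays
# unconstrained.
response_format = {
    "type": "structural_tag",
    "tags": [
        {
            "begin": "<function=get_time>",
            "schema": json.dumps({
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            }),
            "end": "</function>",
        }
    ],
    "triggers": ["<function="],
}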

Irfnfnkemed and others added 11 commits March 14, 2025 12:22
- upgrade xgrammar calling to latest API
…mmar

- ensure the tool function will be called in expected format using xgrammar
- modify RequestResponseFormat: add structural tag according to the tools when building response format
- the tool function calling is now constrained by format: <function=function_name>parameters</function>
- the tool call list will be parsed according to the calling format when processing the response
- also expose the Structural Tag api of xgrammar to RequestResponseFormat
- Expose the Structural-Tag API, which can be used to standardize the function calling format
- Add test script for Structural-Tag (passed on Llama-2-7b-chat-hf-q0f16-MLC and Llama-3-8B-Instruct-q4f16_1-MLC)
- Add "tool_call_format" attribute in EngineConfig, which determines the tool calls format
- Add "strict" attribute in ChatFunction, which is aligned to OpenAI format
- Set system prompt according to tool_call_format
- Set structural tag to ensure strict func calls
- Parse output to json-style func calls
- TODO: Now only supports format <function=NAME>{PARA}</function>
@Irfnfnkemed Irfnfnkemed marked this pull request as draft March 26, 2025 14:39
@@ -975,13 +977,25 @@ class EngineImpl : public Engine {
* is not JSON, return std::nullopt. */
std::optional<xgrammar::CompiledGrammar> GetGrammarFromResponseFormat(
const ResponseFormat& response_format) {
if (response_format.type != "json_object") {
// TODO: add other grammar type
if (response_format.type == "text") {
Collaborator
align with openai api

@@ -86,12 +86,34 @@ class ModelResponse(BaseModel):


class RequestResponseFormat(BaseModel):
type: Literal["text", "json_object"] = "text"
json_schema: Optional[str] = Field(default=None, alias="schema")
type: Literal["text", "json_object", "structural_tag"] = "text"

@@ -81,6 +81,8 @@ class Conversation(BaseModel):
function_string: str = ""
# whether using function calling or not, helps check for output message format in API call
use_function_calling: bool = False
# Tool function call format mode
tool_call_format: str = "default"
Collaborator

Suggested change
tool_call_format: str = "default"
_tool_call_format: str = "json"

@Ubospica Ubospica marked this pull request as ready for review April 15, 2025 07:24