Support Sp token Function Call Token Implementation #13339
Support <|observation|> for function call behavior by adding it to the EOG (end-of-generation) detection logic at src/llama-vocab.cpp#L1976-L1977
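To illustrate the idea, here is a minimal sketch of text-based EOG detection. It is not the code from `src/llama-vocab.cpp`: the helper names `is_eog_text` and `collect_eog_ids`, the `<|endoftext|>` marker, and the token-to-id map are assumptions made for this example. Only `<|observation|>` comes from this PR.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical helper (not a llama.cpp function): returns true when the token
// text is one of the markers that should stop generation.
// "<|observation|>" is the marker this PR adds; "<|endoftext|>" is only an
// illustrative stand-in for markers that are already recognized.
static bool is_eog_text(const std::string & text) {
    return text == "<|endoftext|>"
        || text == "<|observation|>";
}

// Hypothetical helper: scan a token-text -> token-id map and collect the ids
// of every token whose text is an EOG marker.
static std::set<int> collect_eog_ids(const std::map<std::string, int> & token_to_id) {
    std::set<int> eog_ids;
    for (const auto & t : token_to_id) {
        if (is_eog_text(t.first)) {
            eog_ids.insert(t.second);
        }
    }
    return eog_ids;
}
```

With a check along these lines, generation stops as soon as the model emits `<|observation|>`, so the caller can run the requested function and feed its result back to the model.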
Verifying the PR
checkout Ref: #13058
1. Build
2. Convert HF Weights
3. Run Inference
{
  "name": "C++ Server Launch",
  "type": "cppdbg",
  "request": "launch",
  "program": "${workspaceFolder}/build/bin/llama-server",
  "args": [
    "--jinja",
    "-m",
    "/mnt/ceph/develop/jiawei/model_checkpoint/glm-4-9b-chat-hf.gguf",
    "--port",
    "8000"
  ],
  "stopAtEntry": false,
  "cwd": "${workspaceFolder}",
  "environment": [],
  "externalConsole": false,
  "MIMode": "gdb",
  "setupCommands": [
    {
      "description": "Enable pretty-printing for gdb",
      "text": "-enable-pretty-printing",
      "ignoreFailures": true
    }
  ],
  "miDebuggerPath": "/usr/bin/gdb"
}
4. Function Call test