Feature: Add LlamaCppChatCompletionClient and llama-cpp #5326

Open
wants to merge 4 commits into base: main
Changes from 2 commits
5 changes: 5 additions & 0 deletions python/packages/autogen-ext/pyproject.toml
@@ -31,6 +31,11 @@ file-surfer = [
    "autogen-agentchat==0.4.5",
    "markitdown>=0.0.1a2",
]

llama-cpp = [
    "llama-cpp-python"
]

graphrag = ["graphrag>=1.0.1"]
web-surfer = [
    "autogen-agentchat==0.4.5",
8 changes: 8 additions & 0 deletions python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/__init__.py
@@ -0,0 +1,8 @@
try:
    from ._llama_cpp_completion_client import LlamaCppChatCompletionClient
except ImportError as e:
    raise ImportError(
        "Dependencies for Llama Cpp not found. Please install llama-cpp-python: pip install llama-cpp-python"
    ) from e

__all__ = ["LlamaCppChatCompletionClient"]

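For reference, a minimal usage sketch of the new client (not part of this diff). It assumes the optional extra is installed, e.g. pip install "autogen-ext[llama-cpp]", and the GGUF repo id and filename below are placeholders:

```python
# Hypothetical usage sketch, not part of this PR; repo_id and filename are placeholders.
import asyncio

from autogen_core.models import SystemMessage, UserMessage
from autogen_ext.models.llama_cpp import LlamaCppChatCompletionClient


async def main() -> None:
    client = LlamaCppChatCompletionClient(
        repo_id="unsloth/phi-4-GGUF",  # placeholder Hugging Face repo with a GGUF build
        filename="phi-4-Q4_K_M.gguf",  # placeholder quantized model file
        n_ctx=1000,
        verbose=False,
    )
    result = await client.create(
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="What is the capital of France?", source="user"),
        ]
    )
    print(result.content)  # raw model text when no tool is invoked


asyncio.run(main())
```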
Collaborator: Add unit tests in the python/packages/autogen-ext/tests directory.

Author: Will work on this tomorrow.
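A possible shape for the requested tests (a sketch only; it assumes pytest with pytest-asyncio and stubs out Llama.from_pretrained so no model is downloaded):

```python
# Sketch of a possible unit test for LlamaCppChatCompletionClient (assumes pytest + pytest-asyncio).
from typing import Any, Dict, List

import pytest
from autogen_core.models import UserMessage


class _FakeLlama:
    def create_chat_completion(self, messages: List[Dict[str, Any]], stream: bool = False) -> Dict[str, Any]:
        # Return the minimal response shape that create() reads.
        return {
            "choices": [{"message": {"content": "Paris"}, "finish_reason": "stop"}],
            "usage": {"prompt_tokens": 5, "completion_tokens": 1},
        }


class _FakeLlamaFactory:
    @staticmethod
    def from_pretrained(**kwargs: Any) -> "_FakeLlama":
        return _FakeLlama()


@pytest.mark.asyncio
async def test_create_returns_model_text(monkeypatch: pytest.MonkeyPatch) -> None:
    from autogen_ext.models.llama_cpp import _llama_cpp_completion_client as mod

    # Swap the Llama class used by the module under test for the stub factory.
    monkeypatch.setattr(mod, "Llama", _FakeLlamaFactory)
    client = mod.LlamaCppChatCompletionClient(repo_id="fake/repo", filename="fake.gguf", verbose=False)
    result = await client.create(messages=[UserMessage(content="Capital of France?", source="user")])
    assert result.content == "Paris"
```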

240 changes: 240 additions & 0 deletions python/packages/autogen-ext/src/autogen_ext/models/llama_cpp/_llama_cpp_completion_client.py
@@ -0,0 +1,240 @@
import json
import logging # added import
from typing import Any, AsyncGenerator, Dict, List, Literal, Optional, Sequence, Union

from autogen_core import CancellationToken
from autogen_core.models import AssistantMessage, ChatCompletionClient, CreateResult, SystemMessage, UserMessage
from autogen_core.tools import Tool
from llama_cpp import Llama
from pydantic import BaseModel


class ComponentModel(BaseModel):
    provider: str
    component_type: Optional[Literal["model", "agent", "tool", "termination", "token_provider"]] = None
    version: Optional[int] = None
    component_version: Optional[int] = None
    description: Optional[str] = None
    config: Dict[str, Any]


class LlamaCppChatCompletionClient(ChatCompletionClient):
    def __init__(
        self,
        repo_id: str,
        filename: str,
        n_gpu_layers: int = -1,
        seed: int = 1337,
        n_ctx: int = 1000,
        verbose: bool = True,
    ):
        """
        Initialize the LlamaCpp client.
        """
        self.logger = logging.getLogger(__name__)  # initialize logger
        self.logger.setLevel(logging.DEBUG if verbose else logging.INFO)  # set level based on verbosity
        self.llm = Llama.from_pretrained(
            repo_id=repo_id,
            filename=filename,
            n_gpu_layers=n_gpu_layers,
            seed=seed,
            n_ctx=n_ctx,
            verbose=verbose,
        )
        self._total_usage = {"prompt_tokens": 0, "completion_tokens": 0}

    async def create(self, messages: List[Any], tools: List[Any] = None, **kwargs) -> CreateResult:
        """
        Generate a response using the model, incorporating tool metadata.

        :param messages: A list of message objects to process.
        :param tools: A list of tool objects to register dynamically.
        :param kwargs: Additional arguments for the model.
        :return: A CreateResult object containing the model's response.
        """
        tools = tools or []

        # Convert LLMMessage objects to dictionaries with 'role' and 'content'
        converted_messages = []
        for msg in messages:
            if isinstance(msg, SystemMessage):
                converted_messages.append({"role": "system", "content": msg.content})
            elif isinstance(msg, UserMessage):
                converted_messages.append({"role": "user", "content": msg.content})
            elif isinstance(msg, AssistantMessage):
                converted_messages.append({"role": "assistant", "content": msg.content})
            else:
                raise ValueError(f"Unsupported message type: {type(msg)}")

        # Add tool descriptions to the system message
        tool_descriptions = "\n".join(
            [f"Tool: {i+1}. {tool.name} - {tool.description}" for i, tool in enumerate(tools)]
        )

        few_shot_example = """
        Example tool usage:
        User: Validate this request: {"patient_name": "John Doe", "patient_id": "12345", "procedure": "MRI Knee"}
        Assistant: Calling tool 'validate_request' with arguments: {"patient_name": "John Doe", "patient_id": "12345", "procedure": "MRI Knee"}
        """

        system_message = (
            "You are an assistant with access to tools. "
            "If a user query matches a tool, explicitly invoke it with JSON arguments. "
            "Here are the tools available:\n"
            f"{tool_descriptions}\n"
            f"{few_shot_example}"
        )
        converted_messages.insert(0, {"role": "system", "content": system_message})

        # Debugging outputs
        # print(f"DEBUG: System message: {system_message}")
        # print(f"DEBUG: Converted messages: {converted_messages}")

        # Generate the model response
        response = self.llm.create_chat_completion(messages=converted_messages, stream=False)
        self._total_usage["prompt_tokens"] += response.get("usage", {}).get("prompt_tokens", 0)
        self._total_usage["completion_tokens"] += response.get("usage", {}).get("completion_tokens", 0)

        # Parse the response
        response_text = response["choices"][0]["message"]["content"]
        # print(f"DEBUG: Model response: {response_text}")

        # Detect tool usage in the response
        tool_call = await self._detect_and_execute_tool(response_text, tools)
        if not tool_call:
            self.logger.debug("DEBUG: No tool was invoked. Returning raw model response.")
        else:
            self.logger.debug(f"DEBUG: Tool executed successfully: {tool_call}")

        # Create a CreateResult object
        create_result = CreateResult(
            content=tool_call if tool_call else response_text,
            usage=response.get("usage", {}),
            finish_reason=response["choices"][0].get("finish_reason", "unknown"),
            cached=False,
        )
        return create_result

    async def _detect_and_execute_tool(self, response_text: str, tools: List[Tool]) -> Optional[str]:
        """
        Detect if the model is requesting a tool and execute the tool.

        :param response_text: The raw response text from the model.
        :param tools: A list of available tools.
        :return: The result of the tool execution or None if no tool is called.
        """
        for tool in tools:
            if tool.name.lower() in response_text.lower():  # Case-insensitive matching
                self.logger.debug(f"DEBUG: Detected tool '{tool.name}' in response.")

                # Extract arguments (if any) from the response
                func_args = self._extract_tool_arguments(response_text)
                if func_args:
                    self.logger.debug(f"DEBUG: Extracted arguments for tool '{tool.name}': {func_args}")
                else:
                    self.logger.debug(f"DEBUG: No arguments found for tool '{tool.name}'.")
                    return f"Error: No valid arguments provided for tool '{tool.name}'."

                # Ensure arguments match the tool's args_type
                try:
                    args_model = tool.args_type()
                    if "request" in args_model.__fields__:  # Handle nested arguments
                        func_args = {"request": func_args}
                    args_instance = args_model(**func_args)
                except Exception as e:
                    return f"Error parsing arguments for tool '{tool.name}': {e}"

                # Execute the tool
                try:
                    result = await tool.run(args=args_instance, cancellation_token=CancellationToken())
                    if isinstance(result, dict):
                        return json.dumps(result)
                    elif hasattr(result, "model_dump"):  # If it's a Pydantic model
                        return json.dumps(result.model_dump())
                    else:
                        return str(result)
                except Exception as e:
                    return f"Error executing tool '{tool.name}': {e}"

        return None

    def _extract_tool_arguments(self, response_text: str) -> Dict[str, Any]:
        """
        Extract tool arguments from the response text.

        :param response_text: The raw response text.
        :return: A dictionary of extracted arguments.
        """
        try:
            args_start = response_text.find("{")
            args_end = response_text.find("}")
            if args_start != -1 and args_end != -1:
                args_str = response_text[args_start : args_end + 1]
                return json.loads(args_str)
        except json.JSONDecodeError as e:
            self.logger.debug(f"DEBUG: Failed to parse arguments: {e}")
        return {}

    async def create_stream(self, messages: List[Any], tools: List[Any] = None, **kwargs) -> AsyncGenerator[str, None]:
        """
        Generate a streaming response using the model.

        :param messages: A list of messages to process.
        :param tools: A list of tool objects to register dynamically.
        :param kwargs: Additional arguments for the model.
        :return: An asynchronous generator yielding the response stream.
        """
        tools = tools or []

        # Convert LLMMessage objects to dictionaries with 'role' and 'content'
        converted_messages = []
        for msg in messages:
            if isinstance(msg, SystemMessage):
                converted_messages.append({"role": "system", "content": msg.content})
            elif isinstance(msg, UserMessage):
                converted_messages.append({"role": "user", "content": msg.content})
            elif isinstance(msg, AssistantMessage):
                converted_messages.append({"role": "assistant", "content": msg.content})
            else:
                raise ValueError(f"Unsupported message type: {type(msg)}")

        # Add tool descriptions to the system message
        tool_descriptions = "\n".join([f"Tool: {tool.name} - {tool.description}" for tool in tools])
        if tool_descriptions:
            converted_messages.insert(
                0, {"role": "system", "content": f"The following tools are available:\n{tool_descriptions}"}
            )

        # Convert messages into a plain string prompt
        prompt = "\n".join(f"{msg['role']}: {msg['content']}" for msg in converted_messages)
        # Call the model with streaming enabled
        response_generator = self.llm(prompt=prompt, stream=True)

        for token in response_generator:
            yield token["choices"][0]["text"]

    # Implement abstract methods
    def actual_usage(self) -> Dict[str, int]:
        return self._total_usage

    @property
    def capabilities(self) -> Dict[str, bool]:
        return {"chat": True, "stream": True}

    def count_tokens(self, messages: Sequence[Dict[str, Any]], **kwargs) -> int:
        return sum(len(msg["content"].split()) for msg in messages)

    @property
    def model_info(self) -> Dict[str, Any]:
        return {
            "name": "llama-cpp",
            "capabilities": {"chat": True, "stream": True},
            "context_window": self.llm.n_ctx(),  # n_ctx() returns the context window size as an int
            "function_calling": True,
        }

    def remaining_tokens(self, messages: Sequence[Dict[str, Any]], **kwargs) -> int:
        used_tokens = self.count_tokens(messages)
        return max(self.llm.n_ctx() - used_tokens, 0)  # n_ctx is a method on Llama, so it must be called

    def total_usage(self) -> Dict[str, int]:
        return self._total_usage

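Because tool handling in create() is prompt-based (tool descriptions are injected into the system message and the reply is scanned for a tool name plus a JSON argument block), a caller-side sketch might look like the following. FunctionTool comes from autogen_core.tools; wiring it to this client, the placeholder model, and the get_weather function are illustrative assumptions, not part of this PR:

```python
# Illustrative only: exercises the prompt-based tool path in LlamaCppChatCompletionClient.create().
import asyncio

from autogen_core.models import UserMessage
from autogen_core.tools import FunctionTool
from autogen_ext.models.llama_cpp import LlamaCppChatCompletionClient


async def get_weather(city: str) -> str:
    """Stubbed weather lookup used only for this example."""
    return f"Sunny in {city}"


async def main() -> None:
    client = LlamaCppChatCompletionClient(
        repo_id="unsloth/phi-4-GGUF",  # placeholder model
        filename="phi-4-Q4_K_M.gguf",
        verbose=False,
    )
    weather_tool = FunctionTool(get_weather, description="Get the current weather for a city.")
    result = await client.create(
        messages=[UserMessage(content='Use get_weather with {"city": "Paris"}', source="user")],
        tools=[weather_tool],
    )
    # If the model names the tool and emits JSON arguments, create() runs the tool
    # and returns its stringified result as the content; otherwise the raw text is returned.
    print(result.content)


asyncio.run(main())
```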
26 changes: 16 additions & 10 deletions python/uv.lock

Some generated files are not rendered by default.
