Bug description
I have an endpoint written in Flask to mimic an OpenAI endpoint. When I try to use this endpoint in Hugging Face Chat UI I get an error:
Looking in the debugger, the content is empty compared to other requests. My Flask reply looks like one from a working LLM.
Steps to reproduce
The Flask Endpoint:
from flask import Flask, request, jsonify
from openai import OpenAI

app = Flask(__name__)

@app.route('/v1/chat/completions', methods=['POST'])
def chat_completion():
    try:
        # Get the JSON data from the request
        data = request.json
        print(data)
        # Here you would typically process the request and interact with your LLM
        # For this example, we'll just echo back a simple response
        client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
        # Creating a mock response
        response = ai_response("test", "", client)
        #return jsonify(data), 200
        tmp = openai_object_to_dict(response)
        return jsonify(
            {
                "id": tmp["id"],
                "choices": [
                    {
                        "finish_reason": tmp["choices"][0]["finish_reason"],
                        "logprobs": None,
                        "index": tmp["choices"][0]["index"],
                        "message": {
                            "content": tmp["choices"][0]["message"]["content"],
                            "role": tmp["choices"][0]["message"]["role"],
                        }
                    }
                ],
                "created": tmp["created"],
                "model": tmp["model"],
                "object": tmp["object"],
                "system_fingerprint": tmp["system_fingerprint"],
                "usage": {
                    "completion_tokens": tmp["usage"]["completion_tokens"],
                    "prompt_tokens": tmp["usage"]["prompt_tokens"],
                    "total_tokens": tmp["usage"]["total_tokens"]
                },
            }), 200
    except Exception as e:
        # Report failures as a JSON error payload instead of a bare 500
        return jsonify({"error": str(e)}), 500
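One thing worth checking (an assumption here, since the failing request is not shown in this issue): Chat UI's openai endpoint type typically sends "stream": true, and a route that replies with a plain JSON body instead of server-sent events can show up as empty content in the UI. A minimal sketch of the streaming chunk format, as a generator a Flask route could wrap in Response(..., mimetype="text/event-stream") when request.json.get("stream") is truthy (the completion id and the single-chunk split are illustrative choices, not something Chat UI mandates):

```python
import json
import time

def sse_chunks(text, model="gemma-3-12b-it", completion_id="chatcmpl-demo"):
    """Yield the `data:` events of an OpenAI-style streaming reply."""
    base = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
    }
    # First chunk carries the role and the content delta.
    first = dict(base, choices=[{
        "index": 0,
        "delta": {"role": "assistant", "content": text},
        "finish_reason": None,
    }])
    yield f"data: {json.dumps(first)}\n\n"
    # Final chunk signals completion, followed by the [DONE] sentinel.
    last = dict(base, choices=[{"index": 0, "delta": {}, "finish_reason": "stop"}])
    yield f"data: {json.dumps(last)}\n\n"
    yield "data: [DONE]\n\n"
```

In the route above this generator would replace the jsonify(...) branch whenever the incoming request asks for streaming.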
The code to serialize the OpenAI chat completion:
def openai_object_to_dict(obj, visited=None):
    """
    Recursively converts an OpenAI response/model object into a JSON-serializable
    dictionary by calling __dict__ on each nested object, if available.

    :param obj: The initial object (e.g., a ChatCompletion response from OpenAI).
    :param visited: A set tracking visited objects to avoid infinite recursion
                    on circular references.
    :return: A dictionary (or basic Python object) that can be serialized to JSON.
    """
    # Base types can be returned as-is. This check must come before the
    # visited-set bookkeeping: CPython interns small ints and short strings,
    # so repeated values share an id() and would otherwise be replaced by
    # None on their second appearance.
    if isinstance(obj, (str, int, float, bool, type(None))):
        return obj
    if visited is None:
        visited = set()
    # Avoid infinite recursion in case of circular references
    obj_id = id(obj)
    if obj_id in visited:
        return None
    visited.add(obj_id)
    # If it's a list or tuple, process each item
    if isinstance(obj, (list, tuple)):
        return [openai_object_to_dict(item, visited) for item in obj]
    # If it's a dictionary, process each key/value
    if isinstance(obj, dict):
        return {k: openai_object_to_dict(v, visited) for k, v in obj.items()}
    # If the object has __dict__, convert it recursively,
    # skipping attributes starting with '__' by convention
    if hasattr(obj, '__dict__'):
        return {
            k: openai_object_to_dict(v, visited)
            for k, v in obj.__dict__.items()
            if not k.startswith('__')
        }
    # Fallback: convert to string if we have no better way
    return str(obj)
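The serializer can be sanity-checked without the OpenAI SDK by running it over plain dummy classes. In this snippet the function is repeated so it runs standalone, and Message, Choice and Completion are hypothetical stand-ins for the real SDK types, not part of openai-python:

```python
def openai_object_to_dict(obj, visited=None):
    """Recursively convert an object tree into JSON-serializable data."""
    if isinstance(obj, (str, int, float, bool, type(None))):
        return obj  # base types pass through unchanged
    if visited is None:
        visited = set()
    if id(obj) in visited:
        return None  # break circular references
    visited.add(id(obj))
    if isinstance(obj, (list, tuple)):
        return [openai_object_to_dict(i, visited) for i in obj]
    if isinstance(obj, dict):
        return {k: openai_object_to_dict(v, visited) for k, v in obj.items()}
    if hasattr(obj, '__dict__'):
        return {k: openai_object_to_dict(v, visited)
                for k, v in obj.__dict__.items() if not k.startswith('__')}
    return str(obj)

# Dummy stand-ins for the SDK's ChatCompletion object tree.
class Message:
    def __init__(self, role, content):
        self.role = role
        self.content = content

class Choice:
    def __init__(self, index, message):
        self.index = index
        self.message = message
        self.finish_reason = "stop"

class Completion:
    def __init__(self):
        self.id = "chatcmpl-test"
        self.choices = [Choice(0, Message("assistant", "hello"))]

d = openai_object_to_dict(Completion())
# d["choices"][0]["message"]["content"] == "hello"
```

As an aside, with openai-python v1 the response objects are Pydantic models, so response.model_dump() should produce an equivalent dict without a hand-rolled serializer.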
When I access the endpoint with Postman I get this result (the content is not empty):
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "Ich kann auf Ihre Anfrage leider keine qualifizierte Antwort liefern. Versuchen Sie eine andere Anfrage. Vielen Dank!",
"role": "assistant"
}
}
],
"created": 1749719897,
"id": "chatcmpl-9n3g3ffhyean0ha9wf94k",
"model": "gemma-3-12b-it",
"object": "chat.completion",
"system_fingerprint": "gemma-3-12b-it",
"usage": {
"completion_tokens": 23,
"prompt_tokens": 396,
"total_tokens": 419
}
}
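For completeness, that payload can be checked standalone against the fields a non-streaming OpenAI chat completion is expected to carry (the fields Chat UI would read for a non-streamed reply):

```python
import json

# The Postman response from above, verbatim.
payload = json.loads("""
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Ich kann auf Ihre Anfrage leider keine qualifizierte Antwort liefern. Versuchen Sie eine andere Anfrage. Vielen Dank!",
        "role": "assistant"
      }
    }
  ],
  "created": 1749719897,
  "id": "chatcmpl-9n3g3ffhyean0ha9wf94k",
  "model": "gemma-3-12b-it",
  "object": "chat.completion",
  "system_fingerprint": "gemma-3-12b-it",
  "usage": {
    "completion_tokens": 23,
    "prompt_tokens": 396,
    "total_tokens": 419
  }
}
""")
msg = payload["choices"][0]["message"]
# The shape looks right for a non-streaming completion:
# object type, assistant role, and non-empty content.
```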
As some background, I host gemma-3-12b-it on LM Studio. Accessing this model directly from the LM Studio server works with Chat UI.
However, I want to implement a simple RAG pipeline, and for this I need to wrap the model in a custom Flask endpoint.
I access both endpoints the same way in the Chat UI .env.local:
{
"name": "Local Gemma 12b",
"description": "LLM",
"promptExamples": [
{
"title": "What is a LLM?",
"prompt": "What is a LLM?"
}
],
"endpoints": [
{
"type": "openai",
"model": "gemma-3-12b-it",
"baseURL": "http://localhost:1234/v1",
}
],
},
{
"name": "RAG",
"description": "RAG",
"promptExamples": [
{
"title": "What is a LLM?",
"prompt": "What is a LLM?"
}
],
"endpoints": [
{
"type": "openai",
"model": "gemma-3-12b-it",
"baseURL": "http://localhost:5001/v1",
}
],
} ...