Hey @nchammas, good questions here. There isn't really a clean solution inside of guidance today, but this is actually the focus of our current work cycle. Ideally, you'd be able to set up a guidance "server" and connect multiple clients to it, either from the same process (using async and/or threads) or from entirely separate processes and/or over HTTP. You may be able to hack something together in the meantime, though.
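In the spirit of the "hack something together" suggestion, one interim pattern is to serialize access to a single shared model behind a lock while each client thread keeps its own conversation history. This is a minimal sketch of that pattern, not guidance's actual API: `SharedEngine`, its `generate` method, and `ClientSession` are all hypothetical stand-ins for whatever backend you actually use.

```python
import threading

class SharedEngine:
    """Stand-in for a single expensive model instance (hypothetical).
    In a real app this would wrap your actual LLM backend."""
    def __init__(self):
        self._lock = threading.Lock()

    def generate(self, history):
        # Serialize access: only one client talks to the model at a time.
        with self._lock:
            # Placeholder "inference": echo the last user message.
            return f"reply to: {history[-1]}"

class ClientSession:
    """Each client keeps its own conversation state; the engine is shared."""
    def __init__(self, engine, system_prompt):
        self.engine = engine
        self.history = [system_prompt]

    def ask(self, message):
        self.history.append(message)
        reply = self.engine.generate(self.history)
        self.history.append(reply)
        return reply

engine = SharedEngine()  # loaded once for the whole process
alice = ClientSession(engine, "You are helpful.")
bob = ClientSession(engine, "You are helpful.")

alice.ask("hello from alice")
bob.ask("hello from bob")
# Both sessions share one engine, but their histories stay independent.
```

The lock is the crude part: it means one request at a time. A real server would batch or queue requests instead, which is presumably what the planned guidance server work addresses.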
I am using Guidance in the context of a Streamlit web application. I assume it would be a mistake to have every web session get its own instance of the LLM, as that would both be a large waste of memory and severely limit the number of concurrent users I can support.
What I want to do is load a model once, set its system prompt, and then allow multiple users to have independent conversations against that single model instance. So the model itself would be shared, as would the system prompt, but each web session would track its own conversation state.
I see that `Model` has a `state` attribute. To track independent conversations against a common backend, should I basically save this attribute to my web session state and load it from there? Or am I thinking about this the wrong way? I am aware of related projects like vLLM that are meant for model serving and support structured outputs, but I don't know whether I need to leverage a tool like that or can just use Guidance directly.
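For what it's worth, the shape being described, one shared backend plus per-session conversation state, can be sketched without either framework. Below, a plain dict stands in for Streamlit's `st.session_state` (one per user session), and `FakeSharedModel` is a hypothetical placeholder for the single loaded model; none of this is guidance's or Streamlit's actual API.

```python
# Hypothetical sketch: one shared model, per-session conversation state.
# A plain dict stands in for Streamlit's st.session_state (one per user).

SYSTEM_PROMPT = "You are a helpful assistant."

class FakeSharedModel:
    """Loaded once at startup; stateless with respect to conversations."""
    def complete(self, transcript):
        # Placeholder inference: report how many turns exist so far.
        return f"assistant turn {len(transcript)}"

shared_model = FakeSharedModel()  # one instance for all sessions

def get_transcript(session_state):
    # Initialize this session's conversation on first use, sharing
    # the system prompt but nothing else across sessions.
    if "transcript" not in session_state:
        session_state["transcript"] = [("system", SYSTEM_PROMPT)]
    return session_state["transcript"]

def chat(session_state, user_message):
    transcript = get_transcript(session_state)
    transcript.append(("user", user_message))
    reply = shared_model.complete(transcript)
    transcript.append(("assistant", reply))
    return reply

# Two independent "web sessions" against the same model instance:
session_a, session_b = {}, {}
chat(session_a, "hi from A")
chat(session_b, "hi from B")
```

In an actual Streamlit app, the equivalent move would be caching the expensive model load once (e.g. with `st.cache_resource`) and keeping only the lightweight per-conversation state in `st.session_state`, which is exactly the split the question is asking about.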