
Conversation

@samfundev

@samfundev samfundev commented Jul 13, 2025

This is a very rough version of a proxy for KoboldCpp so that it can swap models for each request.

  • Specify models by name
  • Multimodal support
  • Try using reload_config instead of subprocessing
  • Possibly split out into a separate file?

Only text models are supported, but that'll be fixed as well. First step towards #1623.

This can currently be used with tools like Open WebUI to chat with multiple models.

@samfundev samfundev marked this pull request as draft July 13, 2025 20:12
@LostRuins
Owner

Hmm, I think:

  1. A proxy, if added, should be in a separate external file, instead of directly in KoboldCpp.py
  2. Why use subprocesses to open/close every time? Just use the admin API, which is already designed for switching models.

@henk717
Collaborator

henk717 commented Jul 14, 2025

Would a separate file play nice with the PyInstaller builds?

@LostRuins
Owner

LostRuins commented Jul 15, 2025

If communicating solely through the API, I don't see why not.

  1. Proxy accepts request, stalls the user
  2. Proxy calls /api/admin/reload_config to switch model and waits until model switched
  3. Proxy calls /api/v1/generate with request (or v1/chat/completions for openai mode)
  4. Proxy receives reply from KoboldCpp
  5. Proxy sends reply back to original requestor transparently.

This should be doable cleanly as an entirely separate program. However, SSE streaming will be more challenging.
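The five-step flow above might look roughly like this in Python. The `/api/admin/reload_config` route comes from this thread; the default port, the `filename` field in the request body, and the helper names are assumptions for illustration, not a documented contract:

```python
import json
import urllib.request

# Assumption: KoboldCpp's default port; adjust to the admin instance's address.
KOBOLD_URL = "http://localhost:5001"

def generate_endpoint(openai_mode):
    """Step 3: pick the native route, or the OpenAI-compatible one."""
    return "/v1/chat/completions" if openai_mode else "/api/v1/generate"

def _post_json(path, payload):
    """POST a JSON body and decode the JSON reply."""
    req = urllib.request.Request(
        KOBOLD_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def proxy_request(payload, config_name, openai_mode=False):
    """Steps 1-5: stall the caller, switch configs, forward, relay the reply.
    The {"filename": ...} body shape is an assumption about the admin API."""
    _post_json("/api/admin/reload_config", {"filename": config_name})  # step 2
    return _post_json(generate_endpoint(openai_mode), payload)         # steps 3-5
```

A real proxy would also need to wait for the switch to complete before step 3 (the race discussed later in this thread), and handle SSE streaming separately.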

@henk717
Collaborator

henk717 commented Jul 16, 2025

Either way it should be the regular koboldcpp setting it up. I don't want the mess of users having to start separate things for this feature. However we do it, it should be something the main koboldcpp launcher / CLI starts when admin mode is in use.

Personally I do think integrating it into koboldcpp.py makes sense.

@samfundev
Author

I've updated this to use the model name. I also implemented support for getting the list of models instead of just the currently active model. This was enough to get it working in Open WebUI. I've updated the original comment with a checklist.

@henk717
Collaborator

henk717 commented Aug 7, 2025

The idea is that it would show the configs the admin API already shows. You get multimodal support for free since it accepts kcpps files.

@samfundev
Author

I could be missing something, but I don't think config files give us multimodal support for free. If the goal is to keep at most one model loaded per modality, a config file either comes with a drawback (implementation 1) or requires splitting configs per modality (implementation 2).

There are two ways I can think of to implement multimodality:

  1. Have one server, and if that server doesn't have the right model, swap to one that does. The drawback: if you have multiple modalities loaded, you have to load/unload multiple models on every swap of the server.
  2. Split each modality into a separate server, so each modality can be swapped without affecting the others. The drawback: more overhead, since multiple servers are running, but it avoids the drawback of implementation 1.
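Implementation 2 could be sketched as a small routing table that maps each modality to its own server; the ports and modality names below are hypothetical:

```python
# Hypothetical registry for implementation 2: one KoboldCpp instance per
# modality, so swapping the text model never unloads the image model.
# Ports and modality names are made up for illustration.
SERVERS = {
    "text": "http://localhost:5001",
    "image": "http://localhost:5002",
}

def route(modality):
    """Return the base URL of the server that owns this modality."""
    try:
        return SERVERS[modality]
    except KeyError:
        raise ValueError(f"no server registered for modality {modality!r}") from None
```

The proxy would then only reload the config on the one server that owns the requested modality, leaving the others untouched.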

@samfundev
Author

> @LostRuins: Why use subprocesses to open/close every time? Just use the admin API, which is already designed for switching models.

I've just tried to implement it that way, but I ran into a problem: the API responds before the server has actually switched over. If I connect to the server right after the response, the old server is still active, so the request errors out when the connection gets closed. I added a sleep to wait until the old server closes, but this method feels unreliable.

@pqnet

pqnet commented Sep 13, 2025

> @LostRuins: Why use subprocesses to open/close every time? Just use the admin API, which is already designed for switching models.
>
> I've just tried to implement it that way, but I ran into a problem: the API responds before the server has actually switched over. If I connect to the server right after the response, the old server is still active, so the request errors out when the connection gets closed. I added a sleep to wait until the old server closes, but this method feels unreliable.

For this particular feature you could poll the model endpoint until it matches the model you want (or until you get "no model loaded" in case of an error).
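A minimal sketch of that polling idea, assuming the model endpoint is `/api/v1/model` returning `{"result": "<name>"}` (the endpoint path, response shape, and timeout values are assumptions). The poll loop takes the reader as a callable so it can tolerate the restart window where neither server accepts connections:

```python
import json
import time
import urllib.request

def fetch_current_model(base_url="http://localhost:5001"):
    """Read the active model name. Assumption: /api/v1/model
    returns a JSON body of the form {"result": "<model name>"}."""
    with urllib.request.urlopen(base_url + "/api/v1/model") as resp:
        return json.loads(resp.read())["result"]

def wait_for_model(fetch, target, timeout=60.0, interval=0.25):
    """Poll until fetch() reports `target`, or give up after `timeout` seconds.
    urllib's URLError subclasses OSError, so connection errors during the
    restart window are swallowed and the loop simply retries."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if fetch() == target:
                return True
        except OSError:
            pass  # old server shutting down, or new one not up yet
        time.sleep(interval)
    return False
```

Usage would be something like `wait_for_model(fetch_current_model, "my-model")` after calling `reload_config`, replacing the fixed sleep with a bounded retry loop.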

@samfundev
Author

Based on what @pqnet suggested, I've swapped over to the admin API.

@LostRuins LostRuins added the enhancement New feature or request label Oct 31, 2025