How to tell for sure that some layers have been offloaded to CPU/RAM? #15323
remon-nashid asked this question in Q&A
... and how many of them.
With a model larger than VRAM, and with -ngl 99 specified, the llama-server logs report that all layers were offloaded to the GPU, even though they clearly were not.
Currently I watch for a clear increase in RAM usage to infer that some layers ended up in system memory, but that hardly seems like the smartest approach.
Any clues? Are there additional indicative log messages I'm missing? Ultimately, a display similar to ollama's GPU/CPU percentage split would be ideal.
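For what it's worth, the RAM-watching workaround described above can be made a bit more precise by reading the server process's resident set size directly instead of eyeballing a system monitor. This is only a minimal sketch, assuming Linux and the /proc filesystem; it simulates the "before/after load" comparison with an ordinary allocation rather than an actual model load:

```python
import os

def rss_mib(pid: int = None) -> float:
    """Return the resident set size (RAM actually used) of a process in MiB,
    read from /proc/<pid>/status on Linux."""
    if pid is None:
        pid = os.getpid()
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # VmRSS is reported in kB, e.g. "VmRSS:   123456 kB"
                return int(line.split()[1]) / 1024
    raise RuntimeError("VmRSS not found in /proc status")

# Snapshot RSS before and after an allocation. Against a real llama-server
# process (pass its PID), a jump of several GiB between startup and the end
# of model loading suggests weights landed in system RAM rather than VRAM.
before = rss_mib()
buf = bytearray(64 * 1024 * 1024)  # stand-in for a model load
after = rss_mib()
print(f"RSS before: {before:.1f} MiB, after: {after:.1f} MiB")
```

Comparing this against the GPU side (e.g. per-process memory from nvidia-smi) would give a rough GPU/CPU split, though a proper answer from the llama.cpp logs themselves would obviously be cleaner.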
Thank you.