Replies: 1 comment
-
You just use the -hf switch and it will check the cache and load the downloaded file.
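For example, a minimal sketch assuming the same repo and quant tag quoted in the question below; llama-server accepts the same -hf option as llama-cli, so if the file is already in the cache it should be loaded from there rather than downloaded again:

# reuses the .gguf already pulled into the cache by llama-cli -hf
llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL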
-
Hello 👋🏻 Just started with llama.cpp and have a couple of questions. I downloaded a model using llama-cli -hf like this, following the instructions on the unsloth site. It runs and I can interact with it. But now I want to run it with llama-server so I can use it over HTTP requests. I tried a couple of options for the model name, but none of them worked:

llama-server -m unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL
llama-server -m unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf  # model name from the .cache/llama.cpp folder

Getting this error:

Here are the .cache/llama.cpp/ contents:

How do I find the proper model name/path for the llama-server -m command?
Can I change the default model folder?
Can I list model names using llama-cli to get the name for the llama-server -m command?
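A minimal sketch of the invocations being asked about, assuming the default cache location under ~/.cache/llama.cpp/ and reusing the file name and repo tag quoted in the question; the LLAMA_CACHE variable is an assumption to verify against llama-server --help:

# list the cached downloads to find the exact .gguf file name
ls ~/.cache/llama.cpp/

# pass the full path to the cached file, not just the bare file name
llama-server -m ~/.cache/llama.cpp/unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf

# or let -hf resolve the cached download directly, as the reply above suggests
llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL

# assumption: the cache directory can be relocated via the LLAMA_CACHE environment variable
LLAMA_CACHE=/path/to/models llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL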