
[FR] Local inference for mobile app using llama.cpp #7644

Open
rampa3 opened this issue Mar 29, 2025 · 0 comments
rampa3 commented Mar 29, 2025

Description

I would like to suggest adding an option for local LLM inference on mobile devices using the llama.cpp library, with a quantized GGUF model that is either supplied by the user or downloaded by the app. I believe this is feasible: most mid-tier phones today can run the Q4_K_M quantization (the balanced quality/speed option) of 7B variants of many models at speeds that are slower than on a PC, but still bearable.
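For illustration, here is a rough sketch of what the core generation loop could look like on top of the llama.cpp C API. It follows roughly the API as of late 2024; newer releases have moved several of these calls to `llama_vocab_*` variants, so the names should be checked against the `llama.h` of whatever version gets pinned. The model filename and prompt are placeholders.

```cpp
// Minimal on-device generation sketch against the llama.cpp C API
// (function names shift between releases; verify against the pinned llama.h).
#include "llama.h"
#include <cstdio>
#include <string>
#include <vector>

int main() {
    const std::string model_path = "model-7b-q4_k_m.gguf"; // placeholder: user-provided or app-downloaded GGUF
    const std::string prompt     = "Summarize my notes:";  // placeholder prompt

    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file(model_path.c_str(), mparams);
    if (!model) { fprintf(stderr, "failed to load model\n"); return 1; }

    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 2048; // keep the context modest to fit phone RAM
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // tokenize the prompt
    std::vector<llama_token> tokens(prompt.size() + 8);
    int n = llama_tokenize(model, prompt.c_str(), (int32_t) prompt.size(),
                           tokens.data(), (int32_t) tokens.size(),
                           /*add_special=*/true, /*parse_special=*/false);
    if (n < 0) { fprintf(stderr, "tokenization failed\n"); return 1; }
    tokens.resize(n);

    // greedy sampler chain (temperature/top-p samplers could be added here)
    llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    // feed the prompt, then generate token by token
    llama_batch batch = llama_batch_get_one(tokens.data(), (int32_t) tokens.size());
    for (int i = 0; i < 128; ++i) {
        if (llama_decode(ctx, batch) != 0) break;
        llama_token tok = llama_sampler_sample(smpl, ctx, -1);
        if (llama_token_is_eog(model, tok)) break;

        char buf[128];
        int len = llama_token_to_piece(model, tok, buf, sizeof(buf), 0, true);
        if (len > 0) fwrite(buf, 1, len, stdout);
        batch = llama_batch_get_one(&tok, 1);
    }

    llama_sampler_free(smpl);
    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

In the app this loop would run off the UI thread and stream tokens back to the editor; on Android/iOS the same C API can be reached through JNI or a Swift bridge, or through whatever binding layer the project already uses.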

Impact

Implementing this would benefit users who cannot always access the internet on their mobile devices, as well as those who want the privacy of a local LLM on the go.

Additional Context

Inspired by the addition of a local inference option to the desktop version of AppFlowy.
