
[FR] Local inference for mobile app using llama.cpp #7644

Open
rampa3 opened this issue Mar 29, 2025 · 0 comments
rampa3 commented Mar 29, 2025

Description

I would like to suggest adding an option for local LLM inference on mobile devices using the llama.cpp library, with a quantized GGUF model that is either supplied by the user or downloaded by the app. I believe this is feasible: most mid-tier phones today can run the Q4_K_M quantization (the balanced quality/speed option) of 7B variants of many models at speeds that are slower than on a PC, but still bearable.
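For illustration, here is a rough sketch of what the core generation loop could look like on top of the llama.cpp C API. It follows roughly the API as of late 2024; newer releases have moved several of these calls to `llama_vocab_*` variants, so the names should be checked against the `llama.h` of whatever version gets pinned. The model filename and prompt are placeholders.

```cpp
// Minimal on-device generation sketch against the llama.cpp C API
// (function names shift between releases; verify against the pinned llama.h).
#include "llama.h"
#include <cstdio>
#include <string>
#include <vector>

int main() {
    const std::string model_path = "model-7b-q4_k_m.gguf"; // placeholder: user-provided or app-downloaded GGUF
    const std::string prompt     = "Summarize my notes:";  // placeholder prompt

    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file(model_path.c_str(), mparams);
    if (!model) { fprintf(stderr, "failed to load model\n"); return 1; }

    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 2048; // keep the context modest to fit phone RAM
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // tokenize the prompt
    std::vector<llama_token> tokens(prompt.size() + 8);
    int n = llama_tokenize(model, prompt.c_str(), (int32_t) prompt.size(),
                           tokens.data(), (int32_t) tokens.size(),
                           /*add_special=*/true, /*parse_special=*/false);
    if (n < 0) { fprintf(stderr, "tokenization failed\n"); return 1; }
    tokens.resize(n);

    // greedy sampler chain (temperature/top-p samplers could be added here)
    llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    // feed the prompt, then generate token by token
    llama_batch batch = llama_batch_get_one(tokens.data(), (int32_t) tokens.size());
    for (int i = 0; i < 128; ++i) {
        if (llama_decode(ctx, batch) != 0) break;
        llama_token tok = llama_sampler_sample(smpl, ctx, -1);
        if (llama_token_is_eog(model, tok)) break;

        char buf[128];
        int len = llama_token_to_piece(model, tok, buf, sizeof(buf), 0, true);
        if (len > 0) fwrite(buf, 1, len, stdout);
        batch = llama_batch_get_one(&tok, 1);
    }

    llama_sampler_free(smpl);
    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

In the app this loop would run off the UI thread and stream tokens back to the editor; on Android/iOS the same C API can be reached through JNI or a Swift bridge, or through whatever binding layer the project already uses.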

Impact

Implementing this would benefit users who cannot always access the internet on their mobile devices, as well as those who want the privacy of a local LLM on the go.

Additional Context

Inspired by the addition of a local inference option to the desktop version of AppFlowy.
