Description
I would like to suggest adding an option for local LLM inference on mobile devices, using the llama.cpp library and a quantized GGUF model that is either provided by the user or downloaded by the app. I believe such a feature is feasible: most mid-range phones today can run a Q4_K_M quantization (the medium, balanced quality/speed option) of many 7B models at speeds that are slower than a PC but still bearable.
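
To illustrate the intended flow, here is a minimal sketch using the llama-cpp-python bindings. A real mobile integration would presumably call llama.cpp's C API through FFI from the app instead; the model filename, context size, and thread count below are placeholder assumptions, not part of this proposal.

```python
# Minimal sketch of the load-and-generate flow with a Q4_K_M GGUF model,
# shown via llama-cpp-python for illustration only.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # example model file (assumption)
    n_ctx=2048,    # context window; smaller values reduce RAM use on phones
    n_threads=4,   # roughly the big-core count of a mid-range phone (assumption)
)

output = llm(
    "Summarize the following note in one sentence:\n...",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```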
Impact
Implementing this would benefit users who do not always have internet access on their mobile devices, as well as those who want the privacy of a local LLM on the go.
Additional Context
Inspired by the addition of a local inference option to the desktop version of AppFlowy.