Currently, if one tries to run a model that requires more than the default memory allocation (e.g., 7GB), GPULlama3.java throws a TornadoVM out-of-memory exception.
The error message suggests increasing the GPU heap size through a TornadoVM flag, but the llama repo's wrapper expects the option as --gpu-memory XGB.
Solution:
Let's catch the exception in the TornadoMasterPlan class and rethrow it with an appropriate message about increasing the heap size, phrased in terms of the llama implementation's --gpu-memory flag.
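A minimal sketch of the idea: wrap the plan execution call, catch the out-of-memory failure, and rethrow it with a message that names the wrapper's --gpu-memory option instead of the internal TornadoVM flag. The TornadoOutOfMemoryException class below is a stand-in (the real TornadoVM exception type and the exact call site in TornadoMasterPlan may differ), so treat this as illustrative only.

```java
public class GpuMemoryErrorTranslator {

    // Stand-in for TornadoVM's out-of-memory exception (assumed name).
    static class TornadoOutOfMemoryException extends RuntimeException {
        TornadoOutOfMemoryException(String msg) { super(msg); }
    }

    /**
     * Runs the given execution step and translates a device OOM failure
     * into advice that matches GPULlama3.java's CLI.
     */
    static void runWithFriendlyOom(Runnable executePlan, int suggestedGb) {
        try {
            executePlan.run();
        } catch (TornadoOutOfMemoryException e) {
            // Rethrow with llama-specific guidance, keeping the original cause.
            throw new IllegalStateException(
                "Not enough GPU memory for this model. "
                + "Re-run with a larger heap, e.g.: --gpu-memory "
                + suggestedGb + "GB", e);
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate a plan execution that exhausts device memory.
            runWithFriendlyOom(() -> {
                throw new TornadoOutOfMemoryException("device allocation failed");
            }, 8);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The original TornadoVM exception is kept as the cause so the full stack trace is still available for debugging, while the user-facing message points at the flag they can actually use.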