feat: Parallelise Model Loading #360

vovw · 2024-10-17T21:18:51Z

Test using:

exo --preload-models llama-3.2-1b,llama-3.1-8b

AlexCheema · 2024-10-18T01:09:21Z

Almost what I envisioned - only thing I would change is to preload after the preemptive download. We don't want to download all possible model shards, only the relevant one.
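A minimal sketch of the ordering suggested above, assuming an asyncio-based flow: resolve only the shard this node is responsible for, download those shards, and only then load them in parallel. Every name here (`download_shard`, `load_shard`, `preload_models`, `shard_for_node`) is hypothetical and is not exo's actual API; the sleeps stand in for real network and disk work.

```python
import asyncio

# Hypothetical stand-ins for illustration only; not exo's real functions.
async def download_shard(shard: str, log: list) -> str:
    await asyncio.sleep(0)  # placeholder for network I/O
    log.append(("download", shard))
    return f"/tmp/{shard}"  # assumed local path of the downloaded shard

async def load_shard(path: str, log: list) -> None:
    await asyncio.sleep(0)  # placeholder for reading weights into memory
    log.append(("load", path))

async def preload_models(model_ids, shard_for_node, log):
    # Pick only the shard relevant to this node for each requested model,
    # rather than every possible shard of every model.
    shards = [shard_for_node[m] for m in model_ids]
    # Step 1: download the relevant shards concurrently.
    paths = await asyncio.gather(*(download_shard(s, log) for s in shards))
    # Step 2: preload runs only after all downloads have finished.
    await asyncio.gather(*(load_shard(p, log) for p in paths))
    return list(paths)

def run_preload(model_ids, shard_for_node):
    log = []
    paths = asyncio.run(preload_models(model_ids, shard_for_node, log))
    return paths, log
```

Calling `run_preload(["llama-3.2-1b", "llama-3.1-8b"], shard_map)` with an assumed `shard_map` from model id to this node's shard returns the loaded paths plus an event log in which every download precedes every load.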

vovw · 2024-10-18T09:51:48Z

@AlexCheema I think I got it. Can you review the changes?

exo/main.py

vovw · 2024-10-19T23:08:37Z

Tested on an M3 Pro.

vovw added 2 commits October 18, 2024 02:47

feat: Parallelise Model Loading (41494f8)

cleanup (a7418f5)

vovw mentioned this pull request Oct 17, 2024

[BOUNTY - $100] Parallelise Model Loading #202

Open

preload after the preemptive download (8fba922)

AlexCheema requested changes Oct 18, 2024

exo/main.py (outdated, resolved)

move preload after (e914db5)

vovw requested a review from AlexCheema October 19, 2024 23:08
