Why a locked model? #20
Replies: 3 comments 2 replies
The 2am LoRA tune was an old technique we have since moved away from (it was in v1). We moved to the geometric lens because it introduces new ways to work with the data the model already has rather than requiring an entire retrain, which is a lot more efficient. I also chose a frozen model because it demonstrates how much the model already knows without any fine-tuning or retraining, and it allows for solid ablation and benchmarking across each infra version. But feel free to make it your own! I will mention that it's not completely model-agnostic yet: it's straightforward with the llama++ server, but you would still need to retrain c(x) and g(x) to match the correct dimensions of your model, and adjust the config and any templates.
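To make the dimension-matching point concrete, here's a minimal sketch (all names and sizes are hypothetical, not the repo's actual c(x)/g(x) code): a head trained against one base model's hidden size simply cannot consume hidden states of another size, which is why swapping base models means retraining those heads.

```python
# Hypothetical illustration of why c(x)/g(x) are tied to a base model's
# hidden size. Nothing here is the project's real implementation.

def make_head(hidden_dim, out_dim):
    """Return a toy linear head: a (hidden_dim x out_dim) weight matrix."""
    return [[0.0] * out_dim for _ in range(hidden_dim)]

def apply_head(weight, x):
    """Compute y = x @ W; fail loudly if x's size doesn't match the head."""
    hidden_dim, out_dim = len(weight), len(weight[0])
    if len(x) != hidden_dim:
        raise ValueError(
            f"head expects {hidden_dim}-dim hidden states, got {len(x)}; "
            "retrain c(x)/g(x) when you swap base models"
        )
    return [sum(x[i] * weight[i][j] for i in range(hidden_dim))
            for j in range(out_dim)]

# A head sized for a hypothetical 4096-dim model works on 4096-dim states...
c = make_head(4096, 64)
print(len(apply_head(c, [0.0] * 4096)))   # 64

# ...but rejects hidden states from a hypothetical 3072-dim model:
try:
    apply_head(c, [0.0] * 3072)
except ValueError as e:
    print("mismatch:", e)
```

The same logic is why the quantization level alone doesn't matter here: quantization changes weight precision, not tensor shapes, so the heads only break when the base model's hidden size changes.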
You know, I thought I had it working, but seeing your post I realized I didn't retrain anything... I think q4_k_m uses the same embedding dimensions.
I'm reasonably sure that my quant has the same weights and doesn't require training, but I'm having an issue related to either token length or Atlas' ability to save files, where the software never 'passes'. It does present code, but I suspect it just does a normal qwen build and picks the best result without the layers working properly. Still working on diagnostics at the moment; the benchmark is some time off. Fun project though.
On Thu, Apr 9, 2026, 8:02 PM Johnathon Isaac Tigges <***@***.***> wrote:
I have yet to bench the 9B, and would love to bench larger models to see
what type of scaling laws are associated with the infrastructure. I doubt
it's linear, but I am curious!
Keep me updated with results!
I've been running some tweaks to get it running on a 5070, like using a slightly quantized model and reducing token length.
Can I ask why you use a locked model? I thought I saw a reference to retraining at 2am; is that something automatic or user-initiated?