[feat] LoRA #149
Hey @jlamypoirier, checking in on LoRA progress. This was assigned last Tuesday, but I haven't seen updates yet. LoRA is a blocker for multiple upcoming projects, so we need execution now.
I had to address outstanding bugs and maintenance, so I could only start today. I am working on a prototype for linear layers, following https://pytorch.org/torchtune/0.3/tutorials/lora_finetune.html. With a bit of luck I'll have something this week. But:
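A rough sketch of the direction, following the torchtune recipe: a frozen base `nn.Linear` plus a trainable low-rank branch. Class and attribute names below are placeholders, not the final Fast-LLM interface.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (prototype sketch)."""

    def __init__(self, base: nn.Linear, rank: int, alpha: float, dropout: float = 0.0):
        super().__init__()
        self.base = base
        # Freeze the pretrained weights; only the LoRA factors are trained.
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.kaiming_uniform_(self.lora_a.weight, a=math.sqrt(5))
        # B starts at zero so the wrapped layer initially matches the base model.
        nn.init.zeros_(self.lora_b.weight)
        self.dropout = nn.Dropout(dropout)
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(self.dropout(x)))
```

Wrapping an existing projection would then look something like `layer.self_attn.q_proj = LoRALinear(layer.self_attn.q_proj, rank=8, alpha=16)` (the attribute path is illustrative only); merging the update back into the base weight for inference can come later.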
🎯 Goal (What & Why)
Add LoRA (Low-Rank Adaptation) support to Fast-LLM for flexible and memory-efficient fine-tuning.
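For context (standard LoRA background, not specific to Fast-LLM): a frozen pretrained weight gets a trainable low-rank update, which is what makes fine-tuning memory-efficient:

$$
W' = W + \frac{\alpha}{r} B A, \qquad W \in \mathbb{R}^{d \times k},\; B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Only $A$ and $B$ are trained, so the trainable parameter count per adapted weight drops from $d \cdot k$ to $r(d + k)$, and optimizer state shrinks accordingly.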
Motivations:
🚀 Execution Plan
Step 1: What is the smallest working version?
- Apply LoRA to the `Wq` and `Wv` projection weights of each transformer layer in Fast-LLM.
- A `LoraConfig` similar to PEFT's LoraConfig, focusing only on the essential parameters (see the sketch after this list):
  - `r` (`int`): Lora attention dimension (the "rank").
  - `lora_alpha` (`int`): The alpha parameter for Lora scaling.
  - `lora_dropout`, defaults to `0`.
- Compatibility with `PeftModel.from_pretrained`, see https://huggingface.co/docs/peft/en/tutorial/peft_model_config#peft-models.
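A minimal sketch of what that config could look like. The field names mirror PEFT's `LoraConfig`; the class below and how it hooks into Fast-LLM's config system are assumptions, not the actual implementation:

```python
from dataclasses import dataclass


@dataclass
class LoraConfig:
    """Essential LoRA settings, mirroring the corresponding PEFT fields (sketch only)."""

    r: int = 8                 # LoRA rank of the low-rank update matrices
    lora_alpha: int = 16       # scaling numerator; the LoRA branch is scaled by lora_alpha / r
    lora_dropout: float = 0.0  # dropout applied to the input of the LoRA branch
```

For the `PeftModel.from_pretrained` point, the practical check is that an adapter exported from Fast-LLM loads through PEFT's documented entry point, e.g. `PeftModel.from_pretrained(base_model, adapter_path)`; how the adapter gets exported on the Fast-LLM side is still to be defined.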
Step 2: What additional optimizations are possible (later, out-of-scope for now)?
- Extend `LoraConfig`, e.g. to define which weights get LoRA adapters.

📌 Acceptance Criteria (Must-Haves for Completion)
🛠️ Project Management
- Set the `Estimate` field (in days) in the GitHub project.
- Use the `Size` field to categorize the PR size (Small/Medium/Large).