support for heterogeneous types for modules_to_save
#2136
Comments
Hmm, I see the problem but I'm not sure if I understand that for some tasks, we just randomly initialize the classifier head, so it doesn't really matter if For this reason, I feel like It is a bit wasteful since we create a copy for each of these heads, so if memory is very tight, this would not be an optimal solution. Given that you probably don't need the weights of the randomly initialized heads, you could probably delete those to reclaim the memory. It would just mean that you can't use the model without the adapter (e.g. disabling adapters would not work) but there would not really be any reason to do that.
@BenjaminBossan Thank you for your comment and for proposing a solution. Having multiple classifier heads in the base model based on the LoRA requirements would indeed solve my issue. As you suggested, I could remove the base heads and keep only those needed for the LoRAs. Thank you! Do you think this is something worth exploring as a feature for the main library? For example, instead of modifying the
Good question. I'm a little bit torn about the suggestion, since at first glance, it has nothing really to do with parameter-efficient fine-tuning and thus would not fit the spirit of the PEFT library. However, having a convenient way to switch out specific modules can be handy in a few situations when working with these models. PEFT already implements something like that with the option to load multiple adapters and switch between them, so there is a clear connection in the implementation itself, even if not in spirit. Therefore, if you can come up with a very nice API to achieve this, I would be open to the possibility. So I would start from the API and only when we can agree on something nice, start working on the implementation. As to the implementation, I would probably still use
Yes, that makes sense, especially in scenarios where PEFT is used as a repository for different adapters with a similar base. However, unless there's a clean way to implement this, it might be out of scope for the current framework. I'll give it some thought and see if I can find a straightforward solution when I get the chance.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Feature request
From my understanding of the current implementation, the modules_to_save wrappers are limited to copying only one specific layer of the model (reference: peft/src/peft/utils/other.py, line 192 in 859fd88).
Motivation
This feature is particularly useful for the final classifier layer. Currently, I have a model with multiple LoRAs attached for a classification task, but the classifier layers are not all the same size. As a result, I need to maintain several models, grouping LoRAs with the same classifier size into the same base model. However, since the core model remains identical, it should be possible to use a single base model for all of them, especially since we are training the classifier layers from scratch. A potential solution could be to introduce an additional option, allowing users to specify the modules_to_save class for the classifier layer, instead of simply copying the existing layer.
Your contribution
I'm happy to explore possible solutions and potentially contribute a PR if this is considered a valuable addition to the library.
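To make the request in the Motivation section more concrete, here is a minimal sketch of what such an option might look like. It is not PEFT's API: `HeterogeneousModulesToSave`, its `factory` argument, and the adapter names are all hypothetical, and `Linear` is a plain-Python stand-in for `torch.nn.Linear` so the example runs without any dependencies. The only behavior taken from the existing implementation is the default path, where the per-adapter module is a copy of the original layer.

```python
import copy
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Linear:
    """Stand-in for torch.nn.Linear; only the shape matters in this sketch."""
    in_features: int
    out_features: int


class HeterogeneousModulesToSave:
    """Hypothetical wrapper (not part of PEFT): one module per adapter,
    optionally built by a user-supplied factory instead of a copy."""

    def __init__(self, original_module: Linear) -> None:
        self.original_module = original_module
        self.modules_to_save: Dict[str, Linear] = {}
        self.active_adapter: Optional[str] = None

    def update(self, adapter_name: str,
               factory: Optional[Callable[[], Linear]] = None) -> None:
        if factory is None:
            # Copy-based path: the new head inherits the base head's shape.
            module = copy.deepcopy(self.original_module)
        else:
            # Proposed option (assumption): build a fresh head of any size.
            module = factory()
        self.modules_to_save[adapter_name] = module

    def set_adapter(self, adapter_name: str) -> None:
        if adapter_name not in self.modules_to_save:
            raise KeyError(f"unknown adapter: {adapter_name}")
        self.active_adapter = adapter_name

    def active_module(self) -> Linear:
        # With no adapter selected, fall back to the base model's head.
        if self.active_adapter is None:
            return self.original_module
        return self.modules_to_save[self.active_adapter]


base_head = Linear(in_features=768, out_features=2)
wrapper = HeterogeneousModulesToSave(base_head)
wrapper.update("sentiment")                                  # copied: 2 classes
wrapper.update("topics", factory=lambda: Linear(768, 20))    # fresh: 20 classes
wrapper.set_adapter("topics")
print(wrapper.active_module().out_features)
```

With a factory per adapter, a single base model could host heads of different sizes, which is the grouping problem described above. Whether this belongs in the adapter config or in a separate API is left open here, in line with the suggestion in the comments to agree on the API first.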