Currently I'm doing a full copy of the entire model instead of just the LoRA weights. It would be great to not have to do this. Maybe there is a way to easily get only the LoRA weights in a form that can be handed to the EMA? I normally use lucidrains' EMA implementation for this. Is there an intended way of doing this that I've just missed?
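For reference, the current (wasteful) setup looks roughly like this. This is only a sketch assuming lucidrains' ema-pytorch; `peft_model` is a placeholder for the PEFT-wrapped model:

```python
from ema_pytorch import EMA

# EMA deep-copies the wrapped module internally, so passing the full
# PEFT model means the shadow copy duplicates every base weight,
# not just the (small) LoRA parameters.
ema = EMA(peft_model, beta=0.999, update_every=1)

# training loop:
# loss.backward(); optimizer.step(); optimizer.zero_grad()
ema.update()
```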
  
Just did a quick glance. I think the issue is this: in the EMA class, all parameters are used and there is no way on the PEFT side that we could prevent this. There is a filter for the parameter dtype, but nothing that filters by parameter name. Not sure if lucidrains accepts PRs, but you could try to suggest adding a `filter_fn` argument, e.g.:

```diff
- self.parameter_names = {name for name, param in self.ema_model.named_parameters() if param.dtype in [torch.float, torch.float16]}
+ self.parameter_names = {name for name, param in self.ema_model.named_parameters() if param.dtype in [torch.float, torch.float16] and filter_fn(name)}
```
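If such a `filter_fn` argument were added, usage could look something like the sketch below. Note this is purely hypothetical: `filter_fn` is not an existing ema-pytorch argument, and the `lora_` substring check assumes PEFT's default LoRA parameter naming.

```python
from ema_pytorch import EMA

# Hypothetical: only track parameters whose names mark them as LoRA weights.
ema = EMA(
    peft_model,
    beta=0.999,
    filter_fn=lambda name: "lora_" in name,
)
```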
  
I ended up using a wrapper that keeps track of all the trained parameters. This wrapper can then just be given to the EMA (a rough sketch of what I mean is below), and you can load the trainables' state dict back into the original model because it has the same structure. More generally, I feel like I've run into problems related to not being able to access the PEFT part of the model on its own twice in a short time, because it is always injected into, or part of, the larger model.
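A minimal sketch of such a wrapper, assuming lucidrains' ema-pytorch; the `TrainableParams` name, the `model_state_dict` helper, and the key mangling (ParameterDict keys may not contain dots) are illustrative, not the exact code used:

```python
import torch
from torch import nn
from ema_pytorch import EMA


class TrainableParams(nn.Module):
    """Exposes only the trainable (e.g. LoRA) parameters of a model."""

    def __init__(self, model: nn.Module):
        super().__init__()
        # Store references (not copies) to the trainable parameters under
        # their original names; dots are replaced because ParameterDict
        # keys may not contain them.
        self.params = nn.ParameterDict({
            name.replace(".", "/"): param
            for name, param in model.named_parameters()
            if param.requires_grad
        })

    def model_state_dict(self) -> dict:
        # Restore the original parameter names so the result can be loaded
        # back into the full model with strict=False.
        return {name.replace("/", "."): p for name, p in self.params.items()}


trainables = TrainableParams(peft_model)           # peft_model: the PEFT-wrapped model
ema = EMA(trainables, beta=0.999, update_every=1)  # EMA now only copies the LoRA weights

# after each optimizer step:
ema.update()

# to evaluate with the averaged weights, load them back into the model:
peft_model.load_state_dict(ema.ema_model.model_state_dict(), strict=False)
```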
  