Serialization with activation layer does not work #21088

Open
MathiesW opened this issue Mar 25, 2025 · 2 comments
Assignees: VarunS1997
Labels: keras-team-review-pending (Pending review by a Keras team member), type:Bug

Comments


MathiesW commented Mar 25, 2025

Most Keras layers support an activation function. While it is possible to use string identifiers like "relu", we can also pass actual activation layers like layers.ReLU().

When a layer instance is used, deserialization is broken, since activations.deserialize(activation) in the from_config() method does not support an instance of layers.Layer. This significantly reduces flexibility, because using, e.g., LeakyReLU with a negative slope of 0.1 is not possible when you rely on loading the trained model later on.

An easy fix would be to use saving.serialize_keras_object(self.activation) in get_config() and saving.deserialize_keras_object(activation_cfg) in the from_config() method.

from keras import layers

layer = layers.Conv1D(filters=1, kernel_size=1, activation=layers.ReLU())  # works flawlessly
layer_from_config = layers.Conv1D.from_config(layer.get_config())  # fails, see exception below

The last line throws the following exception (tested on Keras 3.6.0 and 3.9.0):
"Exception encountered: Could not interpret activation function identifier: {'module': 'keras.layers', 'class_name': 'ReLU', 'config': {'name': 're_lu', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'max_value': None, 'negative_slope': 0.0, 'threshold': 0.0}, 'registered_name': None}"


MathiesW commented Mar 25, 2025

I wanted to add that the following code runs without problems and allows serialization/deserialization with a layer class as the activation. The simple fix, as hinted in my initial post, is to use saving.serialize_keras_object() and saving.deserialize_keras_object() in the get_config() and from_config() methods of BaseConv. Since the __init__() of BaseConv calls self.activation = activations.get(activation), the code below remains compatible with string identifiers such as "relu" as the activation function.

from keras import saving, layers
from keras.src.layers.convolutional.base_conv import BaseConv

class MyBaseConv(BaseConv):
    def get_config(self):
        config: dict = super().get_config()
        # Serialize the activation as a generic Keras object so that Layer
        # instances (e.g. layers.ReLU()) survive the round trip.
        config.update({"activation": saving.serialize_keras_object(self.activation)})

        return config

    @classmethod
    def from_config(cls, config: dict):
        activation_cfg = config.pop("activation")
        # Rebuild the activation (string, function, or Layer) before passing it to __init__.
        config.update({"activation": saving.deserialize_keras_object(activation_cfg)})

        return cls(**config)


class MyConv3D(MyBaseConv):
    def __init__(
        self,
        filters,
        kernel_size,
        strides=(1, 1, 1),
        padding="valid",
        data_format=None,
        dilation_rate=(1, 1, 1),
        groups=1,
        activation=None,
        use_bias=True,
        kernel_initializer="glorot_uniform",
        bias_initializer="zeros",
        kernel_regularizer=None,
        bias_regularizer=None,
        activity_regularizer=None,
        kernel_constraint=None,
        bias_constraint=None,
        **kwargs
    ):
        super().__init__(
            rank=3,
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding=padding,
            data_format=data_format,
            dilation_rate=dilation_rate,
            groups=groups,
            activation=activation,
            use_bias=use_bias,
            kernel_initializer=kernel_initializer,
            bias_initializer=bias_initializer,
            kernel_regularizer=kernel_regularizer,
            bias_regularizer=bias_regularizer,
            activity_regularizer=activity_regularizer,
            kernel_constraint=kernel_constraint,
            bias_constraint=bias_constraint,
            **kwargs
        )


layer = MyConv3D(filters=1, kernel_size=1, activation=layers.ReLU(negative_slope=0.1))
layer_from_config = MyConv3D.from_config(layer.get_config())
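
As a quick sanity check (assuming the patched from_config() hands the rebuilt ReLU instance straight to __init__), the restored layer should keep the configured negative slope:

restored_activation = layer_from_config.activation
print(type(restored_activation).__name__, restored_activation.negative_slope)  # expected: ReLU 0.1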


dhantule commented Mar 25, 2025

Hi @MathiesW, thanks for reporting this.
I have tested your code with Keras 3.9.0 and I'm facing the same error in this gist; we'll look into this issue and update you.

dhantule added the keras-team-review-pending (Pending review by a Keras team member) label on Mar 26, 2025
VarunS1997 self-assigned this on Mar 27, 2025