Replies: 1 comment
-
Your non-nnx.vmap train_step is missing the 0.5 factor, and you didn't say what actually goes wrong when you run the code (only that you're blocked, not the error itself). I believe the problem is that under nnx.vmap each mapped call receives only one item, so BatchNorm has a single element to compute its mean and variance over (the variance will be zero). NB: when sharing your code, use the <> Code option (with 'python' at the start) so it renders properly; also, the change for the nnx.vmap variant occurs only in train_step, so you can omit the rest of the code.
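The variance issue can be illustrated in plain JAX, without nnx (a minimal sketch, assuming the batch axis is the one being vmapped over): once jax.vmap strips the batch axis, any per-call "batch" statistic is computed over a singleton axis, so the variance collapses to zero.

```python
import jax
import jax.numpy as jnp

# A toy batch-norm-style statistic: mean and variance over axis 0.
def batch_stats(x):
    return jnp.mean(x, axis=0), jnp.var(x, axis=0)

batch = jnp.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 samples, 2 features

# Applied to the whole batch, variance is taken across the 3 samples.
mean_full, var_full = batch_stats(batch)

# vmapped over the batch axis, each call sees a single sample of shape (2,);
# re-adding a length-1 batch axis makes the per-call variance identically zero.
mean_v, var_v = jax.vmap(lambda x: batch_stats(x[None, :]))(batch)

print(var_full)  # non-zero per-feature variance over the batch
print(var_v)     # all zeros: each mapped call only ever saw one sample
```

This mirrors what happens to BatchNorm's batch statistics inside a naively vmapped train_step.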
-
Update of question:
Dear Community,
I encountered a small challenge while using a batch normalization layer with nnx.vmap. To illustrate the issue, I have created a minimal example code snippet. Based on my reading of the flax nnx.vmap documentation, the issue seems to stem from the handling of BatchStat, which requires special consideration under vmap.
Currently, I am struggling to make the final example in the attached code work. Specifically, the first train_step I define, which uses nnx.vmap, fails to execute. Does anyone have suggestions on how to adjust the loss function so it works correctly with nnx.vmap?
Thank you very much!
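One common workaround for the loss function (a sketch in plain JAX, not the nnx API; the function names here are illustrative) is to compute the batch statistics once over the full batch, outside the vmapped function, and pass them in as broadcast arguments, so each mapped call normalizes a single sample with shared statistics instead of a degenerate length-1 batch:

```python
import jax
import jax.numpy as jnp

def normalize_sample(x, mean, var, eps=1e-5):
    # Per-sample normalization using statistics shared across the batch.
    return (x - mean) / jnp.sqrt(var + eps)

def loss_fn(batch, targets):
    # Batch statistics are computed once, over the whole batch...
    mean = jnp.mean(batch, axis=0)
    var = jnp.var(batch, axis=0)
    # ...and broadcast (in_axes=None) into the vmapped per-sample call,
    # so the mapped function never sees a length-1 "batch".
    normalized = jax.vmap(normalize_sample, in_axes=(0, None, None))(batch, mean, var)
    return jnp.mean((normalized - targets) ** 2)

batch = jnp.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
targets = jnp.zeros_like(batch)
print(loss_fn(batch, targets))
```

The same idea applies under nnx: keep the BatchNorm computation over the full batch and restrict vmap to the genuinely per-sample parts of the loss.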