Python Dictionary lookups in the context of jax. Deeper reasons as to why array-based code is faster than dictionary-based code. #33402

fernando-garcia-cortez · 2025-11-19T01:22:38Z

fernando-garcia-cortez
Nov 19, 2025

Classically speaking (in the absence of JAX), it is known that passing parameters in a dictionary can be slower than passing them as a vector (array). Still, when function inputs (parameters) have names (e.g., in codes for physics or other natural sciences), it is often handy to take a dictionary as input. That way, the code becomes more readable and the parameters are easier to keep track of. For example:

def density_dict(parameters):
    # If parameters is a dictionary, we could have:
    return parameters["mass"] / parameters["volume"]

def density_array(parameters):
    # Or if it is an array (index 0 is mass, index 1 is volume), then:
    return parameters[0] / parameters[1]

Recently I changed my entire code to be purely (JAX numpy) array-based, without any use of dictionaries, and this drastically improved the run time of my code.

Is there a reason (in the context of JAX, so for gradients, tracing, JIT, etc.) for my code to run much, much faster with just arrays and no dictionaries? What can be said about dictionary lookups in the context of JAX?
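One thing that can be checked directly: under `jax.jit`, the Python-level dictionary lookup runs while the function is being traced, not on every call. A minimal sketch, assuming a jitted version of the `density` example above; the `trace_count` counter is a hypothetical name added only for illustration:

```python
import jax
import jax.numpy as jnp

trace_count = 0  # hypothetical counter, used only to observe tracing


@jax.jit
def density(parameters):
    global trace_count
    # Python side effects like this run only while JAX traces the function.
    trace_count += 1
    # The key lookups below are also plain Python and happen at trace time.
    return parameters["mass"] / parameters["volume"]


p = {"mass": jnp.asarray(2.0), "volume": jnp.asarray(4.0)}

for _ in range(5):
    result = density(p)

print("number of traces:", trace_count)
```

If the counter stays at 1 after repeated calls, the key lookups themselves are not a per-call cost; any remaining overhead has to come from how the dictionary's values are handed to the compiled function.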

I decided to test whether the dictionary lookup was the biggest slowdown, so I wrote a very artificial-looking function to stress-test this:

import jax.numpy as jnp

N_PARAM = 64
N_OUT = 128

# Create a dictionary
keys = [f"p{i}" for i in range(N_PARAM)]
values = [0.01 * (i + 1) for i in range(N_PARAM)]
params_dict = {k: v for k, v in zip(keys, values)}
# And the corresponding array
params_array = jnp.array(values)

def f_dict(x, parameters):
    vectorized_parameters = jnp.array([parameters[k] for k in keys])  # Unpack dictionary

    i = jnp.arange(N_OUT)  # shape (N_OUT,)

    out = (
        jnp.dot(vectorized_parameters, jnp.sin(x * i)[:N_PARAM]) * jnp.sin(x * i)
        + jnp.dot(vectorized_parameters, jnp.cos(x + i)[:N_PARAM]) * jnp.cos(x + i)
        + jnp.sum(vectorized_parameters) * (x + i)**2
        + jnp.sum(vectorized_parameters**2)
        + jnp.sum(vectorized_parameters**3) * jnp.exp(-0.005 * i)
    )
    return out  # shape (N_OUT,)

def f_array(x, parameters):
    i = jnp.arange(N_OUT)  # shape (N_OUT,)

    out = (
        jnp.dot(parameters, jnp.sin(x * i)[:N_PARAM]) * jnp.sin(x * i)
        + jnp.dot(parameters, jnp.cos(x + i)[:N_PARAM]) * jnp.cos(x + i)
        + jnp.sum(parameters) * (x + i)**2
        + jnp.sum(parameters**2)
        + jnp.sum(parameters**3) * jnp.exp(-0.005 * i)
    )
    return out

Observe that the functions differ in that the dictionary version first unpacks the dictionary into an array (in other words, the dictionary lookup occurs once per call). I found that after running them 1000 times (both were jitted and burned in before the benchmark), the running times were:

0.3704 seconds for dictionary-based
0.0735 seconds for array-based

Fantastic! But now I'm puzzled about why this is the case.

Answered by jakevdp

Nov 19, 2025

jakevdp · 2025-11-19T01:48:52Z

jakevdp
Nov 19, 2025
Maintainer

Can you share your benchmarking code? (in particular, I'm not sure what x should be). Did you account for data transfer, asynchronous dispatch, etc. as mentioned in Benchmarking JAX Code?
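For reference, a timing loop that accounts for those points might look like the sketch below. It is only an illustration: `g` is a hypothetical stand-in for the function under test, and `time_per_call` is a helper name invented here, not part of JAX.

```python
import time

import jax
import jax.numpy as jnp


@jax.jit
def g(x):
    # Hypothetical stand-in for the function being benchmarked.
    return jnp.sin(x) * 2.0


def time_per_call(fn, arg, n=1000):
    # Put the input on the device up front so the transfer is not timed.
    arg = jax.device_put(arg)
    # Warm-up call: tracing and compilation happen here, outside the timer.
    fn(arg).block_until_ready()
    t0 = time.perf_counter()
    for _ in range(n):
        # Dispatch is asynchronous, so wait for the result of each call.
        fn(arg).block_until_ready()
    return (time.perf_counter() - t0) / n


seconds_per_call = time_per_call(g, jnp.arange(64.0))
print(f"{seconds_per_call * 1e6:.1f} µs per call")
```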

1 reply

fernando-garcia-cortez Nov 19, 2025
Author

Of course, find the code below. x is just a scalar value. I also misdescribed the benchmark above: it measures the total time for running the function a given number of times, rather than the average time for a single run.

import time
import jax
import jax.numpy as jnp

"""
Functions to benchmark
"""

N_PARAM = 64
N_OUT = 128 # This should be larger than N_PARAM

# Create a dictionary
keys = [f"p{i}" for i in range(N_PARAM)]
values = [0.01 * (i + 1) for i in range(N_PARAM)]
params_dict = {k: v for k, v in zip(keys, values)}

# And the corresponding array (i.e., the dictionary in array format)
params_array = jnp.array(values)

def f_dict(x, parameters):
    vectorized_parameters = jnp.array([parameters[k] for k in keys])  # Unpack dictionary

    i = jnp.arange(N_OUT)  # shape (N_OUT,)

    out = (
        jnp.dot(vectorized_parameters, jnp.sin(x * i)[:N_PARAM]) * jnp.sin(x * i)
        + jnp.dot(vectorized_parameters, jnp.cos(x + i)[:N_PARAM]) * jnp.cos(x + i)
        + jnp.sum(vectorized_parameters) * (x + i)**2
        + jnp.sum(vectorized_parameters**2)
        + jnp.sum(vectorized_parameters**3) * jnp.exp(-0.005 * i)
    )
    return out  # shape (N_OUT,)

# Same as above, now assuming that "parameters" is a vector array, not a dictionary.
def f_array(x, parameters):
    i = jnp.arange(N_OUT)  # shape (N_OUT,)

    out = (
        jnp.dot(parameters, jnp.sin(x * i)[:N_PARAM]) * jnp.sin(x * i)
        + jnp.dot(parameters, jnp.cos(x + i)[:N_PARAM]) * jnp.cos(x + i)
        + jnp.sum(parameters) * (x + i)**2
        + jnp.sum(parameters**2)
        + jnp.sum(parameters**3) * jnp.exp(-0.005 * i)
    )
    return out

# JIT

f_dict_jit = jax.jit(f_dict)
f_array_jit = jax.jit(f_array)

# Benchmark function

def benchmark(fn, p, label, N=10000):
    fn(1.0, p).block_until_ready()  # compile JIT (burn in)

    t0 = time.perf_counter()  # Start time (monotonic, high-resolution clock)
    for _ in range(N):  # Run the function N times, waiting for each call to finish.
        fn(1.0, p).block_until_ready()
    t1 = time.perf_counter()  # End time

    print(f"{label} {t1 - t0:.4f} s for {N} calls")  # label already ends with a colon


# Run benchmarks

benchmark(f_dict_jit, params_dict,   "Using dictionary: ")
benchmark(f_array_jit, params_array, "Using array:      ")

jakevdp · 2025-11-19T17:52:12Z

jakevdp
Nov 19, 2025
Maintainer

The operative difference here is that in the dictionary-based code, you're constructing 64 array buffers on the device, then concatenating them with jnp.array([parameters[k] for k in keys]). In the array-based code, you're constructing a single array on the device. This relative lack of data movement makes the array version of the code faster.
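The count of 64 separate inputs can be checked by flattening both containers the way `jax.jit` does for its arguments. A minimal sketch, assuming the `keys` and `values` definitions from the benchmark; `unpack_and_sum` is a hypothetical helper used only to inspect the traced inputs:

```python
import jax
import jax.numpy as jnp

N_PARAM = 64
keys = [f"p{i}" for i in range(N_PARAM)]
values = [0.01 * (i + 1) for i in range(N_PARAM)]
params_dict = dict(zip(keys, values))
params_array = jnp.array(values)

# A dict is a pytree with one leaf per key; an array is a single leaf.
n_leaves_dict = len(jax.tree_util.tree_leaves(params_dict))
n_leaves_array = len(jax.tree_util.tree_leaves(params_array))


def unpack_and_sum(parameters):
    # Hypothetical helper mirroring the unpacking step in f_dict.
    return jnp.array([parameters[k] for k in keys]).sum()


# Each leaf becomes its own input variable of the traced computation.
n_inputs_dict = len(jax.make_jaxpr(unpack_and_sum)(params_dict).jaxpr.invars)
n_inputs_array = len(jax.make_jaxpr(jnp.sum)(params_array).jaxpr.invars)

print(n_leaves_dict, n_leaves_array)
print(n_inputs_dict, n_inputs_array)
```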

Here's a simpler demonstration of the same thing, where we avoid the dict question and just compare passing 64 individual values vs. those same 64 values in an array:

import jax
import jax.numpy as jnp

@jax.jit
def f1(x_array):
  return x_array * 2

@jax.jit
def f2(x_list):
  return jnp.array(x_list) * 2

x_list = list(range(64))
x_array = jnp.array(x_list)

_ = f1(x_array).block_until_ready()
%timeit f1(x_array).block_until_ready()
# 12.6 µs ± 3.08 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

_ = f2(x_list).block_until_ready()
%timeit f2(x_list).block_until_ready()
# 284 µs ± 18 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
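One way to keep named parameters while still passing a single buffer is to pack the dictionary into an array once, outside the jitted function, and look names up through a fixed index map. This is a sketch under assumptions rather than a recommended API: `params_dict`, `names`, `index`, and `density` are hypothetical names chosen for illustration.

```python
import jax
import jax.numpy as jnp

# Hypothetical named parameters, for illustration only.
params_dict = {"mass": 3.0, "volume": 1.5}

# Fix an ordering once and remember the position of each name.
names = tuple(params_dict)
index = {name: i for i, name in enumerate(names)}

# Pack outside jit: the jitted function then receives a single array.
params_array = jnp.array([params_dict[name] for name in names])


@jax.jit
def density(parameters):
    # index[...] is a plain Python int, resolved while tracing.
    return parameters[index["mass"]] / parameters[index["volume"]]


rho = density(params_array)
print(rho)
```

The function body stays readable because it still refers to `"mass"` and `"volume"`, while each call passes one array instead of one value per key.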

0 replies
