
Loading a netCDF file with multiple variables is very slow #6223

Open
schlunma opened this issue Nov 8, 2024 · 6 comments

@schlunma
Contributor

schlunma commented Nov 8, 2024

📰 Custom Issue

Hi! While evaluating a large number of files with multiple variables each I noticed that ESMValTool is much slower when files contain a lot of variables. I could trace that back to Iris' load function. Here is an example of loading files with 1 and 61 variables:

import iris

one_path = "data/one_cube.nc"  # file with 1 variable
multi_path = "data/multiple_cubes.nc"  # file with 61 variables

%%timeit
iris.load(one_path)  # 13.2 ms ± 136 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
iris.load(multi_path)  # 673 ms ± 984 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
constraint = iris.Constraint("zonal stress from subgrid scale orographic drag")
iris.load(multi_path, constraint)  # 611 ms ± 1.72 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As you can see, loading the file with 61 variables takes ~51 times as long as loading the file with 1 variable. Using a constraint does not help.

Doing the same with xarray gives:

import xarray as xr

one_path = "data/one_cube.nc"  # file with 1 variable
multi_path = "data/multiple_cubes.nc"  # file with 61 variables

%%timeit
xr.open_dataset(one_path, chunks='auto')  # 7.75 ms ± 164 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
xr.open_dataset(multi_path, chunks='auto')  # 54.6 ms ± 241 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Here, the difference between 1 and 61 variables is only a factor of ~7.

If only a single file needs to be loaded, this is not a problem, but this quickly adds up to a lot of time if 100s or even 1000s of files need to be read (which can be the case for climate models that write one file with many variables per time step).

Have you ever encountered this problem? Are there any tricks to make loading faster? As mentioned, I tried with a constraint, but that didn't work.

Thanks for your help!

Sample data:

@trexfeathers
Contributor

Have you ever encountered this problem?

Yes. But this is a downside of deliberate choices in Iris' data model. None of the dimensional metadata (coordinates, cell measures, etcetera) is shared between Cubes, so in your example there are 61 copies of each. This makes each Cube an entirely independent entity, allowing different workflows to be written compared to libraries such as Xarray, where each variable belongs to a larger Dataset. But it does make many-variable files difficult to work with.
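
A rough sketch of how to see that independence in practice (assuming the first two cubes loaded from the multi-variable file above share a coordinate of the same name):

import iris

cubes = iris.load(multi_path)  # multi_path: the 61-variable file from above

# Pick any coordinate of the first cube and look it up on the second cube.
# Equal coordinates are still distinct objects: one copy is built per cube.
name = cubes[0].coords()[0].name()
coord_a = cubes[0].coord(name)
coord_b = cubes[1].coord(name)
print(coord_a == coord_b)  # True: same metadata and values
print(coord_a is coord_b)  # False: independent copies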

Are there any tricks to make loading faster?

We have tried. We implemented #5229 for truly absurd cases where tiny files were taking a long time to load. And we have a benchmark to make sure it doesn't get even worse.

There are ongoing discussions about opt-in sharing in some form (e.g. #3172), but we have nothing concrete at the moment.

I tried with a constraint, but that didn't work.

This is presumably because the constraint gets applied after the Cube has been generated. Could you try including ncdata as a pre-loading step?

@schlunma
Contributor Author

Thanks for your insight @trexfeathers, that all makes sense!

Could you try including ncdata as a pre-loading step?

This reduces the runtime by more than 35%!

# Note: this example is run on another machine;
# that's why the numbers differ from those given in the issue description
import ncdata.iris_xarray

%%timeit
iris.load(multi_path)  # 362 ms ± 18.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
ds = xr.open_dataset(multi_path, chunks='auto')
ncdata.iris_xarray.cubes_from_xarray(ds)  # 224 ms ± 4.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@trexfeathers
Contributor

@schlunma good to hear about the speedup. I was actually imagining modifying the NetCDF dataset to remove the variables you are not interested in, rather than going via Xarray. You might get even more speedups that way.

@schlunma
Contributor Author

You're right, extracting the variable in xarray and then using ncdata is almost 10x faster than loading the cube with a constraint:

%%timeit
ds = xr.open_dataset(multi_path, chunks='auto')[["tauu_sso", "clat_bnds", "clon_bnds"]]
ncdata.iris_xarray.cubes_from_xarray(ds)  # 35.6 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

I didn't know how to extract variables from an NcData object.

What I also found is that by bypassing xarray and only using ncdata, the load times are much worse:

import ncdata.iris
import ncdata.netcdf4

%%timeit
ncd = ncdata.netcdf4.from_nc4(multi_path)
ncdata.iris.to_iris(ncd)  # 643 ms ± 23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This is almost twice as long as using iris.load directly.

@pp-mo
Member

pp-mo commented Nov 11, 2024

I didn't know how to extract a variables from an NcData object.

Thanks for looking!
The problem, I think, is that you want to select 3 particular "data variables", which you know the variable names of, but you must also work out what other (non-data) variables are needed.
Unfortunately that isn't easy, because it means interpreting the data in CF terms, which is frustratingly non-trivial.

What you can do fairly easily is to remove unwanted variables, using code like del ncdata.variables[varname] or ncdata.variables.pop(varname), though this obviously relies on you identifying what to remove.
(And I am working on better docs, honest 😉)
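
For what it's worth, a minimal sketch of that pruning approach before converting to cubes, using the variable names from earlier in this thread purely as an example (in general you would also have to keep whatever other variables the data variable references via CF attributes, which is the non-trivial part above):

import ncdata.iris
import ncdata.netcdf4

ncd = ncdata.netcdf4.from_nc4(multi_path)

# Variables to keep: the wanted data variable plus its known bounds variables.
keep = {"tauu_sso", "clat_bnds", "clon_bnds"}
# Dimension-coordinate variables share their dimension's name, so keep those too.
keep |= set(ncd.dimensions)

# Drop everything else before handing the dataset to Iris.
for varname in list(ncd.variables):
    if varname not in keep:
        del ncd.variables[varname]

cubes = ncdata.iris.to_iris(ncd)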


What I also found is that by bypassing xarray and only using ncdata, the load times are much worse:

So, I presume in that case you are back to loading all the variables again?
Even so, I'm not clear why initial loading should be substantially slower than the direct load.
Maybe it is just a "more code layers" thing, or it could be interesting to see if it is possibly due to different chunking: Ncdata doesn't have the more intelligent chunking schemes built into Iris (and Xarray, I think), so for large files the dask "auto" decisions could be more costly. Dask can be slow when managing large task graphs (hundreds of tasks).
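
If it helps to check the chunking idea, a rough sketch along these lines (using the long name from the first post, and assuming to_iris returns a CubeList) would show whether the two routes end up with different dask chunks:

import iris
import ncdata.iris
import ncdata.netcdf4

name = "zonal stress from subgrid scale orographic drag"

# Chunking of the lazy data when loading directly with Iris ...
cube_direct = iris.load_cube(multi_path, name)
print(cube_direct.lazy_data().chunks)

# ... versus when going through ncdata.
ncd = ncdata.netcdf4.from_nc4(multi_path)
cube_via_ncdata = ncdata.iris.to_iris(ncd).extract_cube(name)
print(cube_via_ncdata.lazy_data().chunks)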

So, I think Xarray is helping here because it analyses the file and grabs the 'other' variables as coords, without making a big deal of it.
Whereas Iris proceeds to create complex objects for everything, so I'm imagining that the slow part is maybe the CF interpretation, or more likely just building all the cubes + coords.


FWIW Iris can also skip building cubes for unwanted data variables, but only in the rather limited case where a single NameConstraint is provided, which matches just one data variable. See here, and the call to it here.
Unfortunately this "shortcut" approach is rather obscure + limited, and remains largely undocumented and unused AFAIK. The approach is limited by the opaque nature of Constraint objects, and it doesn't extend to an efficient way of selecting 'N' data-variables, since each requires its own 'load' operation.
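
For reference, that limited case looks something like this (a sketch only; tauu_sso is just the example variable from above, and whether the shortcut actually kicks in may depend on the Iris version):

import iris

# A single NameConstraint matching one data variable by var_name: the case in
# which Iris can skip building cubes for the other data variables.
cube = iris.load_cube(multi_path, iris.NameConstraint(var_name="tauu_sso"))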

However, if this would be of practical use, we could possibly revisit that approach + extend the cases it can handle?

It would certainly make sense to be able to say something like iris.load_cubes(file, ['var1', 'var2', 'var3']) and have it ignore all the other data variables. But it cannot do this at present.

@schlunma
Contributor Author

Thanks for all the details @pp-mo, this really helps a lot to understand what's going on here.

The problem, I think, is that you want to select 3 particular "data variables", which you know the variable names of, but you must also work out what other (non-data) variables are needed. Unfortunately that isn't easy, because it means interpreting the data in CF terms, which is frustratingly non-trivial.

Yes, I already tried that and fully agree!

Even so, I'm not clear why initial loading should be substantially slower than the direct load.

I was just surprised that using xarray as an additional layer is faster than not using it. From what I understand, this effectively does xarray.Dataset -> NcData -> Cubes. So it's a bit non-intuitive that NcData -> Cubes is slower than that.

However, if this would be of practical use, we could possibly revisit that approach + extend the cases it can handle ?

What we currently do in ESMValTool is load all cubes without any constraint and then extract one variable after some preprocessing. For data that contain just one variable (like CMIP model data) this is trivial (and here, loading all cubes is as fast as loading just the one variable).

This really only becomes a problem for "raw" climate model data where 10s or even 100s of variables are stored in one netCDF file. Here, the aforementioned preprocessing extracts the desired variable, but this is very slow since we load ALL variables initially. For some cases, we even need to extract more than one variable (to derive another quantity), so the workaround you mentioned above would not help in general.

So yes, being able to do something like iris.load_cubes(file, ['var1', 'var2', 'var3']) would really help. In the meantime, I would like to load these data with xarray or ncdata first, extract all the necessary variables, and then use ncdata to convert to iris cubes (see ESMValGroup/ESMValCore#2129 (comment)).

It's really great that we can do that now with ncdata! Thanks for all your work on that!!
