Skip to content

Switch docs to jupyter-execute sphinx extension for HTML reprs #10383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

scottyhq
Copy link
Contributor

@scottyhq scottyhq commented Jun 1, 2025

  • Closes HTML repr in the online docs #3893
  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

As a follow-on to #8708 (review), and now that HTML repr is looking really nice in both light and dark mode (#10353), this switches the docs to use https://jupyter-sphinx.readthedocs.io so that the HTML reprs are used throughout.

An added benefit to doing this is that the code block execution now clearly raises errors where previously they may have been obscured (#10032 (comment))

Still need to review the rendered docs and could completely remove the ipython extension as a dependency, but figured I'd open this up now to test on RTD

RTD preview is up! I recommend starting here, the main changes are under the 'user guide' and 'internals' sections:

User Guide:
https://xray--10383.org.readthedocs.build/en/10383/user-guide/data-structures.html#creating-a-dataarray
vs
https://docs.xarray.dev/en/v2025.04.0/user-guide/data-structures.html

Comment on lines 674 to +677
"z": ("x", list("abcd")),
},
)
ds.to_zarr("path/to/directory.zarr")
ds.to_zarr("path/to/directory.zarr", zarr_format=2, consolidated=False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs environment is unpinned, so zarr-python>=3 is installed. This current code emits lots of warnings if using Zarr v3 (which I think will just go away when zarr-python 3.1 is released #10077 ). In the meantime I thought it might also be good to have an example in the docs that explicitly uses zarr_format=2, which also circumvents the warnings :)

By using jupyter-sphinx we have to handle warnings either by 'catching' them with the :stderr: marker (less intuitively named than the old :okwarning:) or modifying the code to avoid the warning in the first place. This is because of the current RTD build -W (--warning-is-error) option set here

$(SPHINXBUILD) -T -j auto -E -W --keep-going -b html -d $(BUILDDIR)/doctrees -D language=en . $(BUILDDIR)/html

Comment on lines 1135 to 1143
.. ipython:: python
:okwarning:
.. jupyter-execute::
:stderr:

ds = xr.tutorial.open_dataset("air_temperature_gradient")
cubes = ncdata.iris_xarray.cubes_from_xarray(ds)
print(cubes)
print(cubes[1])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code currently raises the following warning, which with :stderr: gets rendered in the docs with a red background (https://xray--10383.org.readthedocs.build/en/10383/user-guide/io.html#ncdata)

SerializationWarning: saving variable Tair with floating point data as an integer dtype without any _FillValue to use for NaNs
  ncdata = from_xarray(xrds, **xr_save_kwargs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also noting that these warnings don't look great in dark mode. Printing the iris cubes seems necessary as their HTML repr messes with the page theme CSS.

Comment on lines 963 to 964
encoding={"xc": {"chunks": (-1, -1)}, "yc": {"chunks": (-1, -1)}},
encoding={"xc": {"chunks": None}, "yc": {"chunks": None}},
consolidated=False,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to avoid ValueError: Expected all values to be non-negative. Got (-1, -1) instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the only way I could get this to actually save single coordinate arrays ways to use encoding={"xc": {"chunks": ds.xc.shape}, "yc": {"chunks": ds.yc.shape}}, So I removed the comment about the -1 shortcut. Probably deserves a separate issue, as I'd expect "chunks": None to work here, but it doesn't raise any error or warning and the on-disk arrays are still chunked...

Comment on lines -30 to +46
.. ipython:: python
.. jupyter-execute::

da = xr.DataArray(
np.arange(6).reshape(2, 3), [("x", ["a", "b"]), ("y", [10, 20, 30])]
)
da.isel(y=slice(0, 1)) # same as da[:, :1]

.. jupyter-execute::

# This resembles how you would use np.concatenate:
xr.concat([da[:, :1], da[:, 1:]], dim="y")

.. jupyter-execute::

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a good example of how the change to HTML reprs results in "longer" form docs if previously multiple outputs were contained in a single ipython code block:

While I find the 'new' version an improvement, I think there are places throughout the docs where things could be even better by using smaller subsection headers ### or #### to make navigating and reading easier. I'd advocated for reviewing content to move code comments into these smaller markdown headings and/or collapsing sections of the HTML if they aren't relevant to the example. But I'd prefer to tackle those changes in a follow-on PR

@scottyhq
Copy link
Contributor Author

scottyhq commented Jun 3, 2025

Not going to lie, that revamp took me a bit longer than I expected, but I am very happy about the results :) I think having HTML reprs is a huge improvement both aesthetically and also cognitively for users trying to relate narrative and visuals. There is also a much cleaner separation of 'input' cells and 'output' cells throughout. I left some open questions as review comments, but would welcome any feedback.

@scottyhq scottyhq requested review from benbovy and dcherian June 3, 2025 14:12
@dcherian
Copy link
Contributor

dcherian commented Jun 3, 2025

Very nice Scott. Thanks for taking this on!


We have some weird padding in these reprs now. I think it's fine to ignore for now and adjust it later.
image

FWIW I added some extra styling to cf-xarray docs (example) to distinguish the reprs from the surrounding code and text.


@benbovy
Copy link
Member

benbovy commented Jun 3, 2025

Thanks @scottyhq, I agree this looks very nice!

We have some weird padding in these reprs now.

Yes this is because of some 3rd-party styling defined for <pre> html elements within jupyter cells.

Unfortunately this also affects text reprs in dark mode, at least for datatrees:

Screenshot 2025-06-03 at 16 35 14

(BTW as written in the screenshot, the text repr is used for showing Datatree in a compact way... with #10139 maybe we can now use the html repr?)

A lot of changes are introduced in pydata/pydata-sphinx-theme#2187 so maybe those issues will be fixed upstream?

Copy link
Contributor

@kmuehlbauer kmuehlbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @scottyhq! I've reviewed time-coding.rst and added a couple comments and suggestions.

Nice work with jupyter-execute!

@scottyhq
Copy link
Contributor Author

scottyhq commented Jun 4, 2025

Yes this is because of some 3rd-party styling defined for <pre> html elements within jupyter cells. Unfortunately this also affects text reprs in dark mode, at least for datatrees. (BTW as written in the screenshot, the text repr is used for showing Datatree in a compact way... with #10139 maybe we can now use the html repr?).

@benbovy see the comparison below. I assumed xr.set_options(display_style="text") would be equivalent to just calling print(dt) instead of display(dt) but I guess not? For many of these examples, the goal is to show the "tree" structure, which requires toggling the groups in HTML (and there's a lot of extraneous rows for these basic cases). I noticed these open issues for ideas on more succinct groups #9343 or 'summaries' #9271 to tackle at some point.

image

A lot of changes are introduced in pydata/pydata-sphinx-theme#2187 so maybe those issues will be fixed upstream?

Perhaps, I'll plan to run in an environment against that branch at some point. But it's slated for the release after next, so could be a while. For now I'll stick with print(dt) in the docs where it seems appropriate otherwise the HTML repr

We have some weird padding in these reprs now. I think it's fine to ignore for now and adjust it later. FWIW I added some extra styling to cf-xarray docs (example) to distinguish the reprs from the surrounding code and text.

Looks nice! I do prefer clearly isolating cell outputs from page background, but I'll leave custom styling for a follow up or fixing upstream in pydata-sphinx-theme.

@benbovy
Copy link
Member

benbovy commented Jun 4, 2025

I assumed xr.set_options(display_style="text") would be equivalent to just calling print(dt) instead of display(dt) but I guess not?

The difference between the two is that display(dt) calls DataTree._repr_html_(), while print(dt) calls DataTree.__repr__().

xr.set_options(display_style="text") embeds the text repr in an html <pre> container to mimic the classic repr. This is what causes the styling issue.

Sticking with print(dt) is better for now I agree.

@scottyhq
Copy link
Contributor Author

scottyhq commented Jun 4, 2025

Thanks all for the reviews! I think this is ready to go. I left unresolved comments above because I think there are some notes there worth keeping visible for future reference.

Copy link
Member

@spencerkclark spencerkclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @scottyhq! The updates to time-coding.rst look good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants