Skip to content

Documentation overhaul #444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Documentation overhaul #444

wants to merge 5 commits into from

Conversation

dmdunla
Copy link
Collaborator

@dmdunla dmdunla commented Jul 10, 2025

This PR is aimed at creating a consistent documentation across the pyttb classes and algorithms.

To that end, here are some tasks to complete:

  • Evaluate the use of autosummary in the Sphinx documentation
  • Create consistent documentation across the classes
  • Create consistent documentation across the algorithms

📚 Documentation preview 📚: https://pyttb--444.org.readthedocs.build/en/444/

@dmdunla dmdunla linked an issue Jul 10, 2025 that may be closed by this pull request
@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

First attempt at using autosummary ...

Considerations:

  • I followed the pydata.sparse documentation approach to start
    • We can consider others, if desired. This was just one we discussed and is pretty straightforward.
  • For the Reference documentation, I opted to let autosummary create all *.rst stubs for all classes and methods
    • Current: When using class templates (docs/source/_templates/autosummary), there is no customization allowed across classes (e.g., no class name customization, like "Dense Tensor (:class:tensor)"
    • Option: Create class RST files (e.g., tensor.rst) manually for each class and then create the attribute and method stubs using autosummary. Some manual maintenance would then be needed.
  • There are currently a lot of warnings -- we would need to address the path issues.
  • Attributes that are defined in __slots__ are not documented, which creates inconsistency in the Attributes tables.
    • Option: We can define internal attributes in __slots__ (e.g., _data) and then create an associated property (e.g., data) with getter/setter that can be documented. This is one approach that is recommended online.

@tgkolda
Copy link
Collaborator

tgkolda commented Jul 10, 2025

I notice the side menus are missing. I'm also not sure where functions like tenones are documented. Maybe they aren't. They don't seem to be linked. I also don't seem to be able to click through on the functions in the tables to their detailed documentation. I guess it's all there, but it's hard to get to.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

It seems like something is different on RTD. Locally, the nav bar (toctree) is available. Just not online. I'll work to fix that.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

Getting closer on the docs with the nav bar (toctree) being populated now. But there are still warnings in doc generation and all of the method documentation is still appearing in the class documentation (as opposed to just the autosummaries).

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

One other issue is that the autosummary tables for methods do not pick up the special members, so dunder methods will not appear in the table.

@dmdunla dmdunla requested review from tgkolda and ntjohnson1 July 10, 2025 22:45
@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

In Reference in the docs, just look at pyttb.tensor for now, as that is the main class I've been working on with the switch to autosummary.

@ntjohnson1
Copy link
Collaborator

My primary concern is on the level of upkeep. I'm fine on whatever aesthetics people agree on as long as there is some semi-automated way to enforce it.

For upkeep this looks great. If we wanted to go one level deeper (specify an initial page for each class so we can get the nested methods), that is where I had an issue with our file/class naming indirection issues.

I generated a tensor.rst to mirror https://github.com/pydata/sparse/blob/0.15.1/docs/generated/sparse.COO.rst and couldn't figure out how to convince autosummary to parse the class vs the module.

If we think the current level of nesting is fine then no concern from me.

@tgkolda
Copy link
Collaborator

tgkolda commented Jul 11, 2025

I'm not sure the advantage of autosummary. I see several disadvantages.

The main advantage I anticipated is that it would put each function on its own page, but it didn't do that.

The disadvantage is that you've now lost the ability to easily document the functions that are in the file but not in the class (like tenones).

It looks also like like the init help has been moved up to the class level. That shouldn't be necessary. We talked about just having the one-liner at the class level and then the rest at the function level. Why the change?

Additionally, the 'attributes' section was removed and replaced by 'parameters', but these are 2 different things. Now the attributes of data and shape have no description in the attribute table.

Given that there are only a handful of classes, I'm not sure the autosummary is worth the extra effort. This is assuming I'm understand what it's doing.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 11, 2025

Suggestion: Do no use autosummary in our docs.

I'd rather maintain control over what we want in our docs, and I think we can more easily accomplish this currently without the use of autosummary.

@ntjohnson1 I tried many combinations of the autosummary options, and I was unable to get exactly what I had wanted. I'm assuming some level of user minunderstanding/error on my part, but my assessment is that it is a challenge to support our current implementation nuances with autosummary.

@tgkolda I moved the all docs from class/__init__ to the class level, as accessing help on the class in the REPL and Jupyter notebooks pulls from the class docs. I checked several other packages and see that this is a common approach, with no docstrings for the __init__ methods. I think this is a better approach moving forward.

@tgkolda I agree about the Attributes documentation. autosummary gathered all attributes/properties into a single table (automiatically), which I like, but the lack of documentation of attributes that are defined as __slots__ is unfortunate. The workaround I suggested above is not satisfying, as it requires a lot more manual coding. I think we need to decide how we want to handle properties in the code (where should they be placed) and attributes/properties in the documentation (e.g., directly after the Parameters, after the Examples, etc.).

I plan to tag this branch as the end of the autosummary assessment and provide an updated version of our old approach that we can discuss next.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 11, 2025

As I move away from autosummary here, I suggest the following flow of class docs:

  1. All class docs are included in one place -- at the class level. No docstrings will be provided for __init__.
  2. The order of the elements of the class docs are as follows:
    • Short, single-line class description (required), followed by a blank line
    • Longer class description (optional)
    • Parameters
    • Examples
    • Note listing other ways to construct instances of the class
    • Note with link to tutorial (jupyter notebook)
    • Methods
  3. The nav bar (toctree) includes the following (as ordered in the source code):
    • Attributes/properties
    • All methods, including dunder methods

If we use the alphabetical ordering of methods in the toctree, attributes/properties are mixed in with methods and dunder methods appear first (which I do not like). I think that determining the flow of the toctree layout via the layout of the source code is a better option.

@tgkolda, @ntjohnson1 Please review and comment on the suggested layout

@tgkolda
Copy link
Collaborator

tgkolda commented Jul 11, 2025

Hey Danny, Great progress. I know that sphinx is painful. Here are problems I notice...

  • Documentation for tenones, etc. is still MIA
  • The main Reference page is funny. It has these arguments for the functions but they are inside parentheses and square brackets. I would vote for no arguments at all.
  • Attribute descriptions are still missing in the table for data and shape. Not sure why some appear and others don't.
  • Under Methods, init is showing.
  • Under Methods, none of the functions is clickable.
  • Under Methods, some functions have the same parentheses plus square brackets argument listing. I find it odd. I'm starting to infer that the brackets indicate optional??
  • In your item 2, it's missing attributes before methods
  • In your item 3, it's missing other methods defined in the same file (like tenones)
  • A more general question, why are ndims and nnz properties rather than functions?
  • I'm a bit confused that the autosummary is still there.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 11, 2025

@tgkolda Sorry, it may not have been clear in my previous message ... I have not yet changed anything from the autosummary version, which I plan to drop in the next commit. I plan to address your suggestions and others in the next commit, and I'll send a new review request when that lands on this branch.

@tgkolda As for property vs. method, we use properties to define class methods that return values and act like data members but are derived/computed values. So, for ndims and nnz methods, we decorate these as properties, since they act like data members (e.g., data or shape), which are real class variables. Moreover, we can avoid users trying to set these values (when needed) by decorating a method as a property but not defining the setter for that property. We could just define these as data members and manage their values in the methods that impact them, but then users would be able to change them directly (which we do not want).

@ntjohnson1
Copy link
Collaborator

Suggested layout seems fine to me.

Once we have a proof of concept that works probably worth documenting in the contributing docs, and see if there's an easy way to enforce it (or else checking just falls to you manually in review) but those are future steps

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Documentation overhaul
3 participants