Skip to content

Change storagetype to UInt8 Dynkin labels to reduce memory footprint#46

Open
lkdvos wants to merge 7 commits intomainfrom
ld-storage
Open

Change storagetype to UInt8 Dynkin labels to reduce memory footprint#46
lkdvos wants to merge 7 commits intomainfrom
ld-storage

Conversation

@lkdvos
Copy link
Copy Markdown
Member

@lkdvos lkdvos commented Mar 19, 2026

This change fixes #45, storing the weights as UInt8 values rather than Int.
This does limit the maximal weight to 255, although that seems reasonable for now?

This being said, there are 2 further changes we should discuss, before tagging this (Breaking) release.

The first is whether we want to reduce the storage slightly more, by getting rid of the last entry of a.I which necessarily has to be 0. This is another 1 / N reduction in required storage, but would incur another type parameter to fix the length of the tuple to M = N - 1.

The second, which is slightly related is whether we want to instead store Dynkin labels, and compute the weights dynamically from that.
IIRC, this is simply diff(weight(a)), which therefore also can be stored as an NTuple{N - 1, UInt8}, but with the difference that these no longer have to form a non-increasing sequence.
This slightly expands the range of possible sectors that can be represented within the UInt8 tuples, but otherwise has the same drawback of needing another type parameter M = N - 1

Keep in mind that all of this concerns only the internal storage types, and accessor functions would still be available for either information.

@lkdvos lkdvos requested a review from Jutho March 19, 2026 17:45
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 96.15385% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/sunirrep.jl 94.25% 5 Missing ⚠️
src/gtpatterns.jl 96.87% 1 Missing ⚠️
Files with missing lines Coverage Δ
src/SUNRepresentations.jl 100.00% <ø> (ø)
src/clebschgordan.jl 97.32% <100.00%> (ø)
src/naming.jl 98.46% <100.00%> (-0.43%) ⬇️
src/sector.jl 98.61% <100.00%> (+0.03%) ⬆️
src/gtpatterns.jl 94.40% <96.87%> (-2.40%) ⬇️
src/sunirrep.jl 94.25% <94.25%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Jutho
Copy link
Copy Markdown
Member

Jutho commented Mar 19, 2026

Both changes would definitely be ok with me; in particular I think Dynkin labels make sense if we anyway go through the effort of having tuples of length N-1.

But you quoted the benefit of currently still having the same hash, which would be advantageous for disk-based caches, whereas the hash would definitely change with either of those two changes?

Jutho
Jutho previously approved these changes Mar 19, 2026
@lkdvos
Copy link
Copy Markdown
Member Author

lkdvos commented Mar 19, 2026

Well, technically I could force the hash to remain the same, but I would indeed feel way less comfortable about that since we do actually call the hash function quite a lot in TensorKit as well. On the flipside of that though, changing to the N-1 tuple means the hash will be slightly faster as well, which might actually be a good reason to go and make the breaking change.
In any case, I wouldn't feel comfortable releasing this as a non-breaking change since stored tensors still would have some issues, and while I don't usually think that is a good enough reason not to do this, since people should use long-term storage methods if they want this kind of stability, it might then be easiest to just change the storage to the dynkin labels and really make a breaking change.

I do think though that the files on disk should remain valid, since that is all stored with actual text files (the key there being string(weight(a))), so that would remain valid.

@Jutho
Copy link
Copy Markdown
Member

Jutho commented Mar 19, 2026

Well, another issue might be that your current convention is that Dynkin labels come in a vector, whereas weights come in a tuple (of length N).

@Jutho
Copy link
Copy Markdown
Member

Jutho commented Mar 19, 2026

Hence, if you want to construct both tuple weights and tuple dynkin labels in the constructor, you can no longer infer N from the tuple length, so the SUNIrrep(tuple) constructor has to go. While maybe conceptually not that bad; users should be explicit, it is very much a breaking change.

@lkdvos lkdvos changed the title Change storagetype to UInt8 to reduce memory footprint Change storagetype to UInt8 Dynkin labels to reduce memory footprint Mar 31, 2026
lkdvos and others added 7 commits March 31, 2026 17:43
Replace all direct s.I field accesses in internal code with weight(s),
and update Base.getproperty to route s.I through weight(s) for backward
compatibility. Pure refactor — no behavioral change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8× memory reduction per irrep. The weight() accessor now promotes UInt8
to Int on access, so UInt8 never leaks into arithmetic or public API.
Constructors validate that all weight components fit in [0, 255].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lkdvos
Copy link
Copy Markdown
Member Author

lkdvos commented Mar 31, 2026

@Jutho apologies for exploding this PR, but I didn't really see an easy way of incrementally doing this change. I ended up making this breaking as discussed, and now store the N-1 dynkin labels instead. Most usage should be completely unbreaking, the only weird edge cases are presumably related to SUNIrrep{N} now being an abstract type, but I don't think I can really avoid that.

I also took the time to clean up a bunch of the file organization since I wasn't a big fan of having (some of) the code in the SUNRepresentations.jl file and some code in the other files, and I needed to migrate things around anyways.

I think I'm okay with merging this and tagging this as a breaking release in its current state, so let me know what you think and feel free to do so :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[optimization] Reduce memory footprint by using UInt8

2 participants