Description
Is your feature request related to a problem? Please describe.
Recent libcuopt-cu13 wheels package more than 19,000 files.
And it looks like there are many things in there that I wouldn't expect to find in a wheel for a C++ shared library, like 50MB of HTML and 27MB of PNG files, .bat files, etc.
file size
* compressed size: 0.738G
* uncompressed size: 1.006G
* compression space saving: 26.6%
contents
* directories: 2465
* files: 19408 (41 compiled)
size by extension
* .so - 0.438G (43.5%)
* .pack - 0.226G (22.5%)
* .html - 50.317M (4.9%)
* .hpp - 43.327M (4.2%)
* .h - 39.824M (3.9%)
* .a - 37.721M (3.7%)
* .3 - 31.537M (3.1%)
* .png - 27.33M (2.7%)
* .cu - 22.121M (2.1%)
* .cuh - 18.394M (1.8%)
* .o - 16.885M (1.6%)
* .idx - 9.856M (1.0%)
* .cpp - 9.027M (0.9%)
...
Describe the solution you'd like
Please scrutinize the contents of the wheels being produced here and try to identify some files that could be omitted.
Summaries like the one above can be found in the logs of the wheel-build-* CI jobs, or produced directly by fetching wheels with pip download and running pydistcheck --inspect on them.
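For a quick look without pydistcheck, a per-extension breakdown can be approximated with the standard library alone. The sketch below is not how pydistcheck computes its report; the function names `size_by_extension` and `print_summary` are made up for illustration, and the output format only loosely imitates the listing above.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def size_by_extension(wheel_path):
    """Map each file extension in a wheel to its total uncompressed bytes."""
    sizes = Counter()
    # A wheel is a zip archive, so zipfile can read its table of contents.
    with zipfile.ZipFile(wheel_path) as whl:
        for info in whl.infolist():
            if info.is_dir():
                continue
            ext = PurePosixPath(info.filename).suffix or "(no extension)"
            sizes[ext] += info.file_size
    return sizes


def print_summary(wheel_path, top=10):
    """Print the largest extensions by uncompressed size, biggest first."""
    sizes = size_by_extension(wheel_path)
    total = sum(sizes.values()) or 1  # avoid dividing by zero on empty wheels
    for ext, nbytes in sizes.most_common(top):
        print(f"* {ext} - {nbytes / 1e6:.3f}M ({100 * nbytes / total:.1f}%)")
```

Calling `print_summary` on a downloaded wheel path should list the heaviest extensions, which is usually enough to see whether documentation or static-library files dominate.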
Some places to look:
- use of install(DIRECTORY), install(FILES), or similar in CMake code (install(TARGETS) and dependency-tracking should be preferred)
- MANIFEST.in rules
- package_data configuration in pyproject.toml / setup.py
- third-party dependencies being vendored instead of re-used from wheels (e.g., probably do not need to vendor the raft/ headers, a runtime dependency on libraft-cu{12,13} might be enough)
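To trace unexpected files back to the install rule or vendored dependency that produced them, it can help to group them by directory prefix inside the wheel. The following is a minimal sketch under stated assumptions: the extension set `SUSPICIOUS_EXTS` and the function `suspicious_bytes_by_dir` are hypothetical, and which file types are truly unnecessary for libcuopt is a judgment for the maintainers.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: file types that a runtime wheel for a shared library rarely needs.
SUSPICIOUS_EXTS = {".html", ".png", ".bat", ".a", ".o"}


def suspicious_bytes_by_dir(wheel_path, exts=SUSPICIOUS_EXTS, depth=3):
    """Sum uncompressed bytes of suspicious files per directory prefix."""
    totals = Counter()
    with zipfile.ZipFile(wheel_path) as whl:
        for info in whl.infolist():
            path = PurePosixPath(info.filename)
            if info.is_dir() or path.suffix.lower() not in exts:
                continue
            # Keep only the first `depth` directory components as the key.
            prefix = "/".join(path.parts[:-1][:depth]) or "."
            totals[prefix] += info.file_size
    return totals
```

Sorting the result with `most_common()` puts the directories carrying the most questionable data first, which narrows down which CMake install rule or packaging setting to inspect.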
After eliminating as much data as possible, look at the reported compressed sizes and reduce thresholds like this:
cuopt/python/libcuopt/pyproject.toml, lines 64 to 68 in 933d810:

    [tool.pydistcheck]
    select = [
        "distro-too-large-compressed",
    ]
    max_allowed_size_compressed = '900M'
Describe alternatives you've considered
N/A
Additional context
Smaller wheels, with fewer individual files, would mean:
- faster builds
- faster installation (in CI and for users)
- smaller disk footprint for environments