|
| 1 | +.. _package-formats: |
| 2 | + |
| 3 | +=============== |
| 4 | +Package Formats |
| 5 | +=============== |
| 6 | + |
| 7 | +This page discusses the file formats that are used to distribute Python packages |
| 8 | +and the differences between them. |
| 9 | + |
| 10 | +You will find files in two formats on package indices such as PyPI_: **source |
| 11 | +distributions**, or **sdists** for short, and **binary distributions**, commonly |
| 12 | +called **wheels**. For example, the `PyPI page for pip 23.3.1 <pip-pypi_>`_ |
| 13 | +lets you download two files, ``pip-23.3.1.tar.gz`` and |
| 14 | +``pip-23.3.1-py3-none-any.whl``. The former is an sdist, the latter is a |
| 15 | +wheel. As explained below, these serve different purposes. When publishing a |
| 16 | +package on PyPI (or elsewhere), you should always upload both an sdist and one |
| 17 | +or more wheels. |
| 18 | + |
| 19 | + |
| 20 | +What is a source distribution? |
| 21 | +============================== |
| 22 | + |
| 23 | +Conceptually, a source distribution is an archive of the source code in raw |
| 24 | +form. Concretely, an sdist is a ``.tar.gz`` archive containing the source code |
| 25 | +plus an additional special file called ``PKG-INFO``, which holds the project |
| 26 | +metadata. The presence of this file helps packaging tools to be more efficient |
| 27 | +by not needing to compute the metadata themselves. The ``PKG-INFO`` file follows |
| 28 | +the format specified in :ref:`core-metadata` and is not intended to be written |
| 29 | +by hand [#core-metadata-format]_. |
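
Because the core metadata format is based on email headers, a ``PKG-INFO`` file can be read with Python's standard ``email`` package. The following is a minimal sketch, not how any particular packaging tool does it; the metadata text and the project name ``demo`` are made up for illustration.

```python
from email.parser import HeaderParser

# Hypothetical PKG-INFO content for an imaginary project called "demo".
PKG_INFO_TEXT = """\
Metadata-Version: 2.1
Name: demo
Version: 1.0
Summary: A hypothetical project used only for illustration
Requires-Python: >=3.8
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
"""


def parse_pkg_info(text):
    """Parse email-style metadata headers into a Message object."""
    return HeaderParser().parsestr(text)


metadata = parse_pkg_info(PKG_INFO_TEXT)
# Single-valued fields can be read like dictionary keys.
print(metadata["Name"], metadata["Version"])
# Fields that may repeat (such as Classifier) are read with get_all().
print(metadata.get_all("Classifier"))
```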
| 30 | + |
| 31 | +You can thus inspect the contents of an sdist by unpacking it using standard |
| 32 | +tools to work with tar archives, such as ``tar -xvf`` on UNIX platforms (like |
| 33 | +Linux and macOS), or :ref:`the command line interface of Python's tarfile module |
| 34 | +<python:tarfile-commandline>` on any platform. |
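
The same inspection can be done programmatically with the ``tarfile`` module. This sketch builds a tiny in-memory archive as a stand-in for a real sdist so that it runs without downloading anything; the project name ``demo`` and its file layout are assumptions for illustration only.

```python
import io
import tarfile


def list_sdist_members(fileobj):
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()


def make_demo_sdist():
    """Build a tiny in-memory stand-in for an sdist (hypothetical project)."""
    files = [
        ("demo-1.0/PKG-INFO", "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"),
        ("demo-1.0/pyproject.toml", "[project]\nname = 'demo'\nversion = '1.0'\n"),
        ("demo-1.0/src/demo/__init__.py", ""),
    ]
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, text in files:
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


for member in list_sdist_members(make_demo_sdist()):
    print(member)
```

To inspect a real sdist, you could pass an opened file instead, for example ``open("pip-23.3.1.tar.gz", "rb")``.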
| 35 | + |
| 36 | +Sdists serve several purposes in the packaging ecosystem. When :ref:`pip`, the |
| 37 | +standard Python package installer, cannot find a wheel to install, it will fall |
| 38 | +back on downloading a source distribution, compiling a wheel from it, and |
| 39 | +installing the wheel. Furthermore, sdists are often used as the package source |
| 40 | +by downstream packagers (such as Linux distributions, Conda, Homebrew and |
| 41 | +MacPorts on macOS, ...), who, for various reasons, may prefer them over, e.g., |
| 42 | +pulling from a Git repository. |
| 43 | + |
| 44 | +A source distribution is recognized by its file name, which has the form |
| 45 | +:samp:`{package_name}-{version}.tar.gz`, e.g., ``pip-23.3.1.tar.gz``. |
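
As an illustration of this naming scheme, the sketch below splits an sdist file name into its two components. The helper ``split_sdist_filename`` is hypothetical and does no validation of the name or version; for real use, the ``packaging`` library offers ``packaging.utils.parse_sdist_filename``.

```python
def split_sdist_filename(filename):
    """Split '{package_name}-{version}.tar.gz' into (package_name, version)."""
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"not an sdist file name: {filename!r}")
    stem = filename[: -len(suffix)]
    # The version is whatever follows the last dash in the stem.
    name, dash, version = stem.rpartition("-")
    if not dash or not name or not version:
        raise ValueError(f"cannot split name and version: {filename!r}")
    return name, version


print(split_sdist_filename("pip-23.3.1.tar.gz"))
```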
| 46 | + |
| 47 | +.. TODO: provide clear guidance on whether sdists should contain docs and tests. |
| 48 | + Discussion: https://discuss.python.org/t/should-sdists-include-docs-and-tests/14578 |
| 49 | +
|
| 50 | +If you want technical details on the sdist format, read the :ref:`sdist |
| 51 | +specification <source-distribution-format>`. |
| 52 | + |
| 53 | + |
| 54 | +What is a wheel? |
| 55 | +================ |
| 56 | + |
| 57 | +Conceptually, a wheel contains exactly the files that need to be copied when |
| 58 | +installing the package. |
| 59 | + |
| 60 | +There is a big difference between sdists and wheels for packages with |
| 61 | +:term:`extension modules <extension module>`, written in compiled languages like |
| 62 | +C, C++ and Rust, which need to be compiled into platform-dependent machine code. |
| 63 | +With these packages, wheels do not contain source code (like C source files) but |
| 64 | +compiled, executable code (like ``.so`` files on Linux or DLLs on Windows). |
| 65 | + |
| 66 | +Furthermore, while there is only one sdist per version of a project, there may |
| 67 | +be many wheels. Again, this is most relevant in the context of extension |
| 68 | +modules. The compiled code of an extension module is tied to an operating system |
| 69 | +and processor architecture, and often also to the version of the Python |
| 70 | +interpreter (unless the :ref:`Python stable ABI <cpython-stable-abi>` is used). |
| 71 | + |
| 72 | +For pure-Python packages, the difference between sdists and wheels is less |
| 73 | +marked. There is normally one single wheel, for all platforms and Python |
| 74 | +versions. Python is an interpreted language, which does not need ahead-of-time |
| 75 | +compilation, so wheels contain ``.py`` files just like sdists. |
| 76 | + |
| 77 | +If you are wondering about ``.pyc`` bytecode files: they are not included in |
| 78 | +wheels, since they are cheap to generate, and including them would unnecessarily |
| 79 | +force a huge number of packages to distribute one wheel per Python version |
| 80 | +instead of one single wheel. Instead, installers like :ref:`pip` generate them |
| 81 | +while installing the package. |
| 82 | + |
| 83 | +With that being said, there are still important differences between sdists and |
| 84 | +wheels, even for pure Python projects. Wheels are meant to contain exactly what |
| 85 | +is to be installed, and nothing more. In particular, wheels should never include |
| 86 | +tests and documentation, while sdists commonly do. Also, the wheel format is |
| 87 | +more complex than the sdist format. For example, it includes a special file -- called |
| 88 | +``RECORD`` -- that lists all files in the wheel along with a hash of their |
| 89 | +content, as a safety check of the download's integrity. |
| 90 | + |
| 91 | +At first glance, you might wonder if wheels are really needed for "plain and basic" |
| 92 | +pure Python projects. Keep in mind that due to the flexibility of sdists, |
| 93 | +installers like pip cannot install from sdists directly -- they need to first |
| 94 | +build a wheel, by invoking the :term:`build backend` that the sdist specifies |
| 95 | +(the build backend may do all sorts of transformations while building the wheel, |
| 96 | +such as compiling C extensions). For this reason, even for a pure Python |
| 97 | +project, you should always upload *both* an sdist and a wheel to PyPI or other |
| 98 | +package indices. This makes installation much faster for your users, since a |
| 99 | +wheel is directly installable. By only including files that must be installed, |
| 100 | +wheels also make for smaller downloads. |
| 101 | + |
| 102 | +On the technical level, a wheel is a ZIP archive (unlike sdists which are TAR |
| 103 | +archives). You can inspect its contents by unpacking it as a normal ZIP archive, |
| 104 | +e.g., using ``unzip`` on UNIX platforms like Linux and macOS, ``Expand-Archive`` |
| 105 | +in PowerShell on Windows, or :ref:`the command line interface of Python's |
| 106 | +zipfile module <python:zipfile-commandline>`. This can be very useful to check |
| 107 | +that the wheel includes all the files you need it to. |
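
A wheel's contents can also be listed from Python with the ``zipfile`` module. This is a minimal sketch that builds a small in-memory ZIP archive as a stand-in for a real wheel; the project name ``demo`` and its files are made up for illustration.

```python
import io
import zipfile


def list_wheel_members(fileobj):
    """Return the names of all files stored in a ZIP archive."""
    with zipfile.ZipFile(fileobj) as wheel:
        return wheel.namelist()


def make_demo_wheel():
    """Build a tiny in-memory stand-in for a wheel (hypothetical project)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("demo/__init__.py", "__version__ = '1.0'\n")
        wheel.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
        )
    buffer.seek(0)
    return buffer


for member in list_wheel_members(make_demo_wheel()):
    print(member)
```

For a real wheel, ``zipfile.ZipFile`` also accepts a path such as ``"pip-23.3.1-py3-none-any.whl"``.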
| 108 | + |
| 109 | +Inside a wheel, you will find the package's files, plus an additional directory |
| 110 | +called :samp:`{package_name}-{version}.dist-info`. This directory contains |
| 111 | +various files, including a ``METADATA`` file which is the equivalent of |
| 112 | +``PKG-INFO`` in sdists, as well as ``RECORD``. The ``RECORD`` listing can be useful to ensure no |
| 113 | +files are missing from your wheels. |
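
To show how ``RECORD`` can act as an integrity check, the sketch below recomputes the hash of every file in a wheel and compares it with the recorded value. It assumes the usual ``RECORD`` layout of CSV rows ``path,sha256=<digest>,size`` with an unpadded URL-safe base64 digest, and it assumes SHA-256 throughout. The demo wheel and the helper names are hypothetical; this is not how pip itself is implemented.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_hash(data):
    """Format a SHA-256 digest as unpadded URL-safe base64, RECORD style."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=")
    return "sha256=" + encoded.decode("ascii")


def make_demo_wheel():
    """Build a tiny in-memory stand-in for a wheel, including a RECORD file."""
    files = {
        "demo/__init__.py": b"__version__ = '1.0'\n",
        "demo-1.0.dist-info/METADATA": b"Name: demo\nVersion: 1.0\n",
    }
    record_name = "demo-1.0.dist-info/RECORD"
    rows = [(name, record_hash(data), len(data)) for name, data in files.items()]
    rows.append((record_name, "", ""))  # RECORD lists itself without a hash
    text = io.StringIO()
    csv.writer(text, lineterminator="\n").writerows(rows)
    files[record_name] = text.getvalue().encode("utf-8")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        for name, data in files.items():
            wheel.writestr(name, data)
    buffer.seek(0)
    return buffer


def find_record_mismatches(fileobj):
    """Return the paths whose content does not match the hash in RECORD."""
    mismatches = []
    with zipfile.ZipFile(fileobj) as wheel:
        record_name = next(
            name for name in wheel.namelist() if name.endswith(".dist-info/RECORD")
        )
        record_text = wheel.read(record_name).decode("utf-8")
        for row in csv.reader(io.StringIO(record_text)):
            if not row or not row[1]:
                continue  # skip blank rows and entries without a hash
            path, expected = row[0], row[1]
            if record_hash(wheel.read(path)) != expected:
                mismatches.append(path)
    return mismatches


print(find_record_mismatches(make_demo_wheel()))
```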
| 114 | + |
| 115 | +The file name of a wheel (ignoring some rarely used features) looks like this: |
| 116 | +:samp:`{package_name}-{version}-{python_tag}-{abi_tag}-{platform_tag}.whl`. |
| 117 | +This naming convention identifies which platforms and Python versions the wheel |
| 118 | +is compatible with. For example, the name ``pip-23.3.1-py3-none-any.whl`` means |
| 119 | +that: |
| 120 | + |
| 121 | +- (``py3``) This wheel can be installed on any implementation of Python 3, |
| 122 | + whether CPython, the most widely used Python implementation, or an alternative |
| 123 | + implementation like PyPy_; |
| 124 | +- (``none``) It does not require a particular Python ABI (application binary interface); |
| 125 | +- (``any``) It does not depend on the platform. |
| 126 | + |
| 127 | +The pattern ``py3-none-any`` is common for pure Python projects. Packages with |
| 128 | +extension modules typically ship multiple wheels with more complex tags. |
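
The naming convention above can be sketched as a small parser. Like the pattern shown earlier, it ignores the rarely used features and expects exactly five dash-separated components. ``WheelName`` and ``split_wheel_filename`` are hypothetical names for this illustration; for real use, the ``packaging`` library offers ``packaging.utils.parse_wheel_filename``.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    """The five components of a simple wheel file name."""

    package_name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def split_wheel_filename(filename):
    """Split a wheel file name into its five dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected five components, got {len(parts)}")
    return WheelName(*parts)


name = split_wheel_filename("pip-23.3.1-py3-none-any.whl")
print(name.python_tag, name.abi_tag, name.platform_tag)
```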
| 129 | + |
| 130 | +All technical details on the wheel format can be found in the :ref:`wheel |
| 131 | +specification <binary-distribution-format>`. |
| 132 | + |
| 133 | + |
| 134 | +.. _egg-format: |
| 135 | +.. _`Wheel vs Egg`: |
| 136 | + |
| 137 | +What about eggs? |
| 138 | +================ |
| 139 | + |
| 140 | +"Egg" is an old package format that has been replaced with the wheel format. It |
| 141 | +should not be used anymore. Since August 2023, PyPI `rejects egg uploads |
| 142 | +<pypi-eggs-deprecation_>`_. |
| 143 | + |
| 144 | +Here's a breakdown of the important differences between wheel and egg. |
| 145 | + |
| 146 | +* The egg format was introduced by :ref:`setuptools` in 2004, whereas the wheel |
| 147 | + format was introduced by :pep:`427` in 2012. |
| 148 | + |
| 149 | +* Wheel has an :doc:`official standard specification |
| 150 | + </specifications/binary-distribution-format>`. Egg did not. |
| 151 | + |
| 152 | +* Wheel is a :term:`distribution <Distribution Package>` format, i.e., a packaging |
| 153 | + format. [#wheel-importable]_ Egg was both a distribution format and a runtime |
| 154 | + installation format (if left zipped), and was designed to be importable. |
| 155 | + |
| 156 | +* Wheel archives do not include ``.pyc`` files. Therefore, when the distribution |
| 157 | + only contains Python files (i.e. no compiled extensions), and is compatible |
| 158 | + with Python 2 and 3, it's possible for a wheel to be "universal", similar to |
| 159 | + an :term:`sdist <Source Distribution (or "sdist")>`. |
| 160 | + |
| 161 | +* Wheel uses standard :ref:`.dist-info directories |
| 162 | + <recording-installed-packages>`. Egg used ``.egg-info``. |
| 163 | + |
| 164 | +* Wheel has a :ref:`richer file naming convention <wheel-file-name-spec>`. A |
| 165 | + single wheel archive can indicate its compatibility with a number of Python |
| 166 | + language versions and implementations, ABIs, and system architectures. |
| 167 | + |
| 168 | +* Wheel is versioned. Every wheel file contains the version of the wheel |
| 169 | + specification and the implementation that packaged it. |
| 170 | + |
| 171 | +* Wheel is internally organized by `sysconfig path type |
| 172 | + <https://docs.python.org/2/library/sysconfig.html#installation-paths>`_, |
| 173 | + therefore making it easier to convert to other formats. |
| 174 | + |
| 175 | +-------------------------------------------------------------------------------- |
| 176 | + |
| 177 | +.. [#core-metadata-format] This format is email-based. Although this would |
| 178 | + be unlikely to be chosen today, backwards compatibility considerations lead to |
| 179 | + it being kept as the canonical format. From the user's point of view, this |
| 180 | + is mostly invisible, since the metadata is specified by the user in a way |
| 181 | + understood by the build backend, typically ``[project]`` in ``pyproject.toml``, |
| 182 | + and translated by the build backend into ``PKG-INFO``. |
| 183 | +
|
| 184 | +.. [#wheel-importable] Circumstantially, in some cases, wheels can be used |
| 185 | + as an importable runtime format, although :ref:`this is not officially supported |
| 186 | + at this time <binary-distribution-format-import-wheel>`. |
| 187 | +
|
| 188 | +
|
| 189 | +
|
| 190 | +.. _pip-pypi: https://pypi.org/project/pip/23.3.1/#files |
| 191 | +.. _pypi: https://pypi.org |
| 192 | +.. _pypi-eggs-deprecation: https://blog.pypi.org/posts/2023-06-26-deprecate-egg-uploads/ |
| 193 | +.. _pypy: https://pypy.org |