Skip to content

Start article about the history of conda-forge #2298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 35 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
333e9fb
Start article about the history of conda-forge
jaimergp Sep 14, 2024
9bebd8b
Merge branch 'main' of github.com:conda-forge/conda-forge.github.io i…
jaimergp Apr 11, 2025
bea0898
pre-commit
jaimergp Apr 11, 2025
4e22a38
Address some feedback about the early history
jaimergp Apr 11, 2025
2af512e
Rebrand was in 2017
jaimergp Apr 11, 2025
d164985
Clarify Scitools
jaimergp Apr 11, 2025
a3c6f5f
add quote from Travis
jaimergp Apr 11, 2025
a58f100
Add link to binstar announcement in scipy
jaimergp Apr 11, 2025
352d669
More content in "how conda-forge came to be"
jaimergp Apr 11, 2025
96d4d5c
More details about the state of wheels in 2013
jaimergp Apr 11, 2025
17e5982
Mention Gohlke and Cournapeau
jaimergp Apr 11, 2025
ca6a3b4
Add link to SciTool's .travis.yml
jaimergp Apr 11, 2025
6e8b282
typo
jaimergp Apr 11, 2025
1583e8b
Improve language and grammar
jaimergp Apr 11, 2025
d881fe5
add manylinux link
jaimergp Apr 11, 2025
5cf119e
flesh out continuum history (#1)
msarahan Apr 14, 2025
271b95e
cosmetic adjustments
jaimergp Apr 14, 2025
8a9e5b9
pre-commit
jaimergp Apr 14, 2025
9930e5b
fix lost save
msarahan Apr 14, 2025
9c36534
Merge branch 'history-conda-forge' of github.com:jaimergp/conda-forge…
jaimergp Apr 14, 2025
d0a4839
add link to talk python episode #94
jaimergp Apr 14, 2025
b0135c7
reapply cosmetic changes
jaimergp Apr 14, 2025
e43258b
pre-commit
jaimergp Apr 14, 2025
2f8aaa3
add github users
jaimergp Apr 14, 2025
a0c439c
Update community/history.md
jaimergp Apr 15, 2025
734d2b6
link to pypackaging-native for abi explanation
jaimergp Apr 15, 2025
65f25b7
mention CB3 and gcc 7
jaimergp Apr 15, 2025
9f1463b
break up long paragraphs a bit and add more sections
jaimergp Apr 15, 2025
cdb662a
add reference to Anaconda 5.0 release
jaimergp Apr 15, 2025
063b395
Line breaks and a couple of new links (#2)
msarahan Apr 15, 2025
16cc130
fix links
jaimergp Apr 15, 2025
6eaed62
fold at 90
jaimergp Apr 15, 2025
2d96e5a
mention build/host/run split
jaimergp Apr 15, 2025
a3430e3
pre-commit
jaimergp Apr 15, 2025
25cde72
one more link to conda-build docs
jaimergp Apr 15, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions community/_sidebar.json
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
[
"index",
"history",
"getting-in-touch",
{
"type": "category",
Expand Down
348 changes: 348 additions & 0 deletions community/history.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,348 @@
---
title: History
---

# History of conda-forge

<!-- TODO: Add a little introduction -->

## Context of binary packaging for Python

conda-forge's origins are best understood in the context of Python packaging back in the
early 2010s. Back then, the installation of Python packages across operating systems was
very challenging, especially on Windows, as it often meant compiling dependencies from
source.

Python 2.x was the norm. To install it, you'd get the official installers from
Python.org, use the system-provided interpreter in Linux, or resort to options like
Python(x,y) [^pythonxy], ActiveState ActivePython [^activepython] or Enthought's
distributions (EPD, later Canopy) [^enthought] in macOS and Windows
[^legacy-python-downloads].

If you wanted to install additional packages, the community was transitioning from
`easy_install` to `pip`, and there was no easy way to ship or install pre-compiled
Python packages. An alternative to the Python eggs [^eggs] used by `easy_install`
wouldn't emerge until 2013 with the formalization of wheels [^wheels]. These were useful
for Windows, where Christoph Gohlke's exes and wheels
[^cgohlke]<sup>,</sup>[^cgohlke-shutdown] were your only choice.

However, for Linux, you would have to wait until 2016, when [`manylinux` wheels were
introduced](https://peps.python.org/pep-0513/). Before then, PyPI wouldn't even allow
compiled Linux wheels and your only alternative was to compile every package from
source.

As an example, take a look at the [PyPI download page for `numpy`
1.7.0](https://pypi.org/project/numpy/1.7.0/#files), released in Feb 2013. The "Built
Distributions" section only shows a few `.exe` files for Windows (!), and some
`manylinux1` wheels. However, the `manylinux1` wheels were not uploaded until April 2016.
There was no mention whatsoever of macOS. Now compare it to [`numpy`
1.11.0](https://pypi.org/project/numpy/1.11.0/#files), released in March 2016: wheels
for all platforms!

The reason why it is hard to find packages for a specific system, and why compilation
was the preferred option for many, is [binary compatibility][abi]. Binary compatibility
is a window of compatibility where each combination of compiler version, core libraries
such as `glibc`, and dependency libraries present on the build machine are compatible on
destination systems.

Linux distributions achieve this by freezing compiler versions and library versions for
a particular release cycle. Windows achieves this relatively easily because Python
standardized on particular Visual Studio compiler versions for each Python release.
Where a Windows package executable was reliably redistributable across versions of
Windows, so long as Python version was the same, Linux presented a more difficult target
because it was (and is) so much harder to account for all of the little details that
must line up.

## The origins of `conda`

In 2012, Continuum Analytics announced Anaconda 0.8 at the SciPy conference
[^anaconda-history]. Anaconda was a distribution of scientifically-oriented packages,
but did not yet have tools for managing individual packages. Later that year, in
September, Continuum released `conda` 1.0, the cross-platform, language-agnostic package
manager for pre-compiled artifacts [^conda-changelog-1.0]. The motivation behind these
efforts was to provide an easy way to ship all the compiled libraries and Python
packages that users of the SciPy and NumPy stacks needed
[^packaging-and-deployment-with-conda]<sup>,</sup>[^lex-fridman-podcast].

Travis Oliphant, on [Why I promote
conda](https://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html) (2013):

> [...] at the first PyData meetup at Google HQ, where several of us asked Guido what we
> can do to fix Python packaging for the NumPy stack. Guido's answer was to "solve the
> problem ourselves". We at Continuum took him at his word. We looked at dpkg, rpm,
> pip/virtualenv, brew, nixos, and 0installer, and used our past experience with EPD
> [Enthought Python Distribution]. We thought hard about the fundamental issues, and
> created the conda package manager and conda environments.

Conda packages could not only ship pre-compiled Python packages across platforms but
were also agnostic enough to ship Python itself, as well as the underlying shared
libraries without having to statically vendor them. This was particularly convenient for
projects that relied on both compiled dependencies (e.g. C++ or Fortran libraries) and
Python "glue code".

By June 2013, conda was using a SAT solver and included the `conda build` tool
[^new-advances-in-conda] for community users outside of Continuum to build their own
conda packages. This is also when the first Miniconda release and Binstar.org
[^binstar], a site for hosting arbitrary user-built conda packages, were announced.
Miniconda provided a minimal base environment that users could populate themselves, and
Binstar.org gave any user an easy platform for redistributing their packages.

:::note

Did you know All of the conda tools have been free (BSD 3-clause) and
Binstar/Anaconda.org was also free (as in beer, not open source), with some paid options
on Binstar/Anaconda.org for more storage.

:::

With `conda build` came along the concept of recipes [^early-conda-build-docs]. The
[`ContinuumIO/conda-recipes`](https://github.com/conda-archive/conda-recipes) repository
became _the_ central place where people would contribute their conda recipes. This was
separate from Anaconda's package recipes, which were private at this point. While
successful, the recipes varied in quality, and typically only worked on one or two
platforms. There was no CI for any recipes to help keep them working. It was common to
find recipes that would no longer build, and you had to tweak it to get it to work.

In 2015, Binstar.org became Anaconda.org, and in 2017 Continuum Analytics rebranded as
Anaconda Inc [^anaconda-rebrand].

## How conda-forge came to be

By 2015, several institutes and groups were using Binstar/Anaconda.org to distribute
software packages they used daily: the [Omnia Molecular
Dynamics](https://github.com/omnia-md) project started as early as March 2014
[^binstar-omnia], the UK Met Office supported [SciTools
project](https://scitools.org.uk/) joined in June 2014 [^binstar-scitools], the [US
Integrated Ocean Observing System (IOOS)](http://www.ioos.noaa.gov/) started using it in
July 2014 [^binstar-ioos]. Although each channel was building conda packages, the binary
compatibility between channels was unpredictable.

In 2014, Filipe Fernandes ([@ocefpaf](https://github.com/ocefpaf)) and Phil Elson
([@pelson](https://github.com/pelson)) were maintaining the Binstar channels for IOOS
and SciTools, respectively. Phil had [implemented CI
pipelines](https://github.com/SciTools/conda-recipes-scitools/blob/995fc231967719db0dd6321ba8a502390a2f192c/.travis.yml)
and [special tooling](https://github.com/conda-tools/conda-build-all) to build conda
packages for SciTools efficiently, and Filipe borrowed it for IOOS. There was also a
healthy exchange of recipes between the two groups, often assisted by members of other
communities. For example, Christophe Gohlke and David Cournapeau were instrumental in
getting Windows builds of the whole SciPy stack to work on AppVeyor.

There was a lot of cross-pollination between projects/channels, but projects were still
working in separate repos, duplicating recipes, and with differing build toolchains (so
mixing channels was unpredictable). Given the success of the `ContinuumIO/conda-recipes`
repository, it became clear there was a demand for high quality conda recipes and more
efficient community-driven collaboration under a single umbrella [^talkpython-conda]. On
April 11th, 2015, `conda-forge` was registered as a Github organization
[^github-api-conda-forge] and an Anaconda.org channel [^binstar-conda-forge].

## Meanwhile at Continuum

:::note

Reminder Continuum Analytics is now Anaconda, but this article tries to keep the
company name contemporaneous with the state of the world.

:::

It's a little strange to describe Continuum's history here, but the company history is
so deeply intertwined with conda-forge that it is essential for a complete story. During
this time, Continuum (especially Ilan Schnell
([@ilanschnell](https://github.com/ilanschnell))) was developing its own internal
recipes for packages. Continuum's Linux toolchain at the time was based on CentOS 5 and
GCC 4.8. These details matter, because they effectively set the compatibility bounds of
the entire conda package ecosystem.

The packages made from these internal recipes were available on the `free` channel,
which in turn was part of a metachannel named `defaults`. The `defaults` channel made up
the initial channel configuration for the Miniconda and Anaconda installers.
Concurrently, Aaron Meurer ([@asmeurer](https://github.com/asmeurer)) led the `conda`
and `conda-build` projects, contributed many recipes to the `conda-recipes` repository
and built many packages on his `asmeurer` binstar.org channel.

Aaron left Continuum in late 2015, leaving the community side of the projects in need of
new leadership. Continuum hired Kale Franz ([@kalefranz](https://github.com/kalefranz))
to fill this role. Kale had huge ambitions for conda, but `conda-build` was not as much
of a priority for him. Michael Sarahan ([@msarahan](https://github.com/msarahan))
stepped in to maintain `conda-build`.

In 2016, Rich Signell at USGS connected Filipe and Phil with Travis Oliphant at
Continuum, who assigned Michael Sarahan to be Continuum's representative in
`conda-forge`. Ray Donnelly ([@mingwandroid](https://github.com/mingwandroid)) joined
the team at Continuum soon afterwards, bringing extensive experience in package managers
and toolchains from his involvement in the MSYS2 project.

## conda-build 3 and the new compiler toolchain

There was a period of time where conda-forge and Continuum worked together closely, with
conda-forge relying on Continuum to supply several core libraries. In its infancy, the
`conda-forge` channel had far fewer packages than the `defaults` channel. conda-forge's
reliance on `defaults` was partly to lower conda-forge's maintenance burden and reduce
duplicate work, but it also helped keep mixtures of conda-forge and `defaults` channel
packages working by reducing possibility of divergence. Just as there were binary
compatibility issues with mixing packages from among the many Binstar channels, mixing
packages from `defaults` with `conda-forge` could be fragile and frustrating.

Around this point in time, [GCC 5 arrived][gcc-5] with a breaking change in `libstdc++`.
These changes, among other compiler updates, began to make the CentOS 5 toolchain
troublesome. Cutting edge packages, such as the nascent TensorFlow project, required
cumbersome patching to work with the older toolchain, if they worked at all.

There was strong pressure from the community to update the ecosystem (i.e. the
toolchain, and implicitly everything built with it). There were two prevailing options.
One was Red Hat's `devtoolset`. This used an older GCC version which statically linked
the newer `libstdc++` parts into binaries, so that `libstdc++` updates were not
necessary on end user systems. The other was to build GCC ourselves, and to ship the
newer `libstdc++` library as a conda package. This was a community decision, and it was
split roughly down the middle.

In the end, the community decided to take the latter route, for the sake of greater
control over updating to the latest toolchains, instead of having to rely on Red Hat.
One major advantage of providing our own toolchain was that we could provide the
toolchain as a conda package instead of a system dependency, so we could now express
toolchain requirements in our recipes and have better control over compiler flags and
behavior.

The result of this overhaul crystallized in the `compiler(...)` Jinja function and the
`build`/`host`/`run` dependencies split in `conda-build` 3.x [^conda-build-3], and the
publication of the GCC 7 toolchain built from source in `defaults`
[^conda-build-compiler-docs]<sup>,</sup>[^anaconda-compilers]. `conda-build` 3.x also
introduced dynamic pinning expressions that made it easier to maintain compatibility
boundaries. ABI documentation from [^abilab] helped establish whether a given package
should be pinned to major, minor, or bugfix versions.

## From `free` to `main`

Here around 2017, Continuum renamed itself to Anaconda, so let's switch those names from
here out.

As more and more conflicts with `free` channel packages occurred, conda-forge gradually
added more and more of their own core dependency packages to avoid those breakages. At
the same time, Anaconda was working on two contracts that would prove revolutionary.

- Samsung wanted to use conda packages to manage their internal toolchains, and Ray
suggested that this was complementary to our own internal needs to update our
toolchain. Samsung's contract supported development to conda-build that greatly
expanded its ability to support explicit variants of recipes. This became the major
new feature set released in conda-build 3.x.
- Intel was working on developing their own Python distribution at the time, which they
based on Anaconda and added their accelerated math libraries and patches to. Part of
the Intel contract was that Anaconda would move all of their internal recipes into
public-facing GitHub repositories.

Rather than putting another set of repositories (another set of changes to merge) in
between internal and external sources, such as `conda-forge`, Michael and Ray pushed for
a design where conda-forge would be the reference source of recipes. Anaconda would only
carry local changes if they were not able to be incorporated into the conda-forge recipe
for social, licensing, or technical reasons. The combination of these conda-forge based
recipes and the new toolchain are what made up the `main` channel [^anaconda-5], which
was also part of `defaults`.

The `main` channel represented a major step forward in keeping conda-forge and Anaconda
aligned, which equates to smooth operation and happy users. The joined recipe base and
toolchain has sometimes been contentious, with conda-forge wanting to move faster than
Anaconda or vice-versa. The end result has been a compromise between cutting-edge
development and slower enterprise-focused development.

<!-- miniforge -->

<!-- autotick bot -->

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think there is a lot Filipe and I can add here 😉

We should give some context of the geospatial world in particular (hi gdal).

I would also get to the point of acknowledging both Christophe Gohlke and David Cournapeau, who especially helped me with the Windows builds of the whole SciPy stack (a topic on which I had no knowledge at all, yet needed to get building in a CI context on appveyor. In those days you had to pick particular compilers for particular Python versions, and this was a bit of a dark art to me).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the acknowledgements, but would love to hear more about the geospatial world if you want to add something!

<!-- to be continued -->

## References

[^abilab]: [ABI Laboratory](https://abi-laboratory.pro/)

[^activepython]: https://www.activestate.com/platform/supported-languages/python/

[^anaconda-5]: https://www.anaconda.com/blog/announcing-the-release-of-anaconda-distribution-5-0

[^anaconda-compilers]: https://www.anaconda.com/blog/utilizing-the-new-compilers-in-anaconda-distribution-5

[^anaconda-history]:
[The Early History of the Anaconda
Distribution](http://ilan.schnell-web.net/prog/anaconda-history/), Ilan Schnell, 2018.

[^anaconda-rebrand]: https://www.anaconda.com/blog/continuum-analytics-officially-becomes-anaconda, 2017.

[^binstar-conda-forge]: https://anaconda.org/conda-forge, 2015.

[^binstar-ioos]: https://anaconda.org/ioos, 2014.

[^binstar-omnia]: https://anaconda.org/omnia, 2014.

[^binstar-scitools]: https://anaconda.org/scitools, 2014.

[^binstar]:
[SciPy 2013 Lightning Talks, Thu June
27](https://youtu.be/ywHqIEv3xXg?list=PLYx7XA2nY5GeTWcUQTbXVdllyp-Ie3r-y&t=850).

[^cgohlke-shutdown]:
[What to do when Gohlke's python wheel service shuts
down?](https://stackoverflow.com/questions/72581592/what-to-do-when-gohlkes-python-wheel-service-shuts-down), 2022.

[^cgohlke]: https://www.cgohlke.com/, 2025.

[^chatting-ocefpaf]:
[Filipe Fernandes on the Evolution of
conda-forge](https://www.youtube.com/watch?v=U2oa_RLbTVA), Chatting with the Conda
Community #1, 2024.

[^conda-build-3]: [`conda-build` 3](https://github.com/conda/conda-build/tree/3.0.0)

[^conda-changelog-1.0]:
[`conda` 1.0 release
notes](https://github.com/conda/conda/blob/24.7.1/CHANGELOG.md#100-2012-09-06), 2012.

[^conda-build-compiler-docs]: https://docs.conda.io/projects/conda-build/en/latest/resources/compiler-tools.html#anaconda-compiler-tools

[^conda-recipes-repo]: [ContinuumIO/conda-recipes](https://github.com/conda-archive/conda-recipes)

[^early-conda-build-docs]:
[Conda build framework
documentation](https://web.archive.org/web/20141006141927/http://conda.pydata.org/docs/build.html), 2014.

[^eggs]:
[The Internal Structure of Python
Eggs](https://setuptools.pypa.io/en/latest/deprecated/python_eggs.html).

[^enthought]: https://docs.enthought.com/canopy/

[^github-api-conda-forge]: https://api.github.com/orgs/conda-forge

[^legacy-python-downloads]:
[Download Python for Windows (legacy
docs)](https://legacy.python.org/download/windows/).

[^lex-fridman-podcast]:
[Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific
Programming](https://www.youtube.com/watch?v=gFEE3w7F0ww&t=7596s), Lex Fridman
Podcast #224, 2022.

[^new-advances-in-conda]:
[New Advances in
Conda](https://web.archive.org/web/20140331190645/http://continuum.io/blog/new-advances-in-conda),
Ilan Schnell, 2013.

[^packaging-and-deployment-with-conda]:
[Packaging and deployment with
conda](https://speakerdeck.com/teoliphant/packaging-and-deployment-with-conda),
Travis Oliphant, 2013.

[^pythonxy]: https://python-xy.github.io/, 2015.

[^talkpython-conda]:
[Guaranteed packages via Conda and
Conda-Forge](https://talkpython.fm/episodes/show/94/guarenteed-packages-via-conda-and-conda-forge), 2016.

[^technical-discovery]: https://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html, 2013.

[^wheels]:
[PEP 427 – The Wheel Binary Package Format
1.0](https://peps.python.org/pep-0427/)

<!-- links -->

[abi]: https://pypackaging-native.github.io/background/binary_interface/
[free-channel]: https://anaconda.org/free
[gcc-5]: https://gcc.gnu.org/gcc-5/changes.html
2 changes: 1 addition & 1 deletion community/index.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: 'conda-forge community'
title: 'The conda-forge community'
---

# Community
Expand Down
Loading