Skip to content

feat: Add support for REPLs #2723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

philsc
Copy link
Contributor

@philsc philsc commented Apr 1, 2025

This patch adds a new target that lets users invoke a REPL for a given
PyInfo target.

For example, the following command will spawn a REPL for any target
that provides PyInfo:

$ bazel run --//python/config_settings:bootstrap_impl=script //python/bin:repl --//python/bin:repl_dep=//tools:wheelmaker
Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import tools.wheelmaker
>>>

If the user wants an IPython shell instead, they can create a file like this:

import IPython
IPython.start_ipython()

Then they can set this up in their .bazelrc file:

# Allow the REPL stub to import ipython. In this case, @my_deps is the name
# of the pip.parse() repository.
build --@rules_python//python/bin:repl_stub_dep=@my_deps//ipython
# Point the REPL at the stub created above.
build --@rules_python//python/bin:repl_stub=//path/to:ipython_stub.py

This patch adds three different ways to invoke a REPL for a given
target, each with slightly unique use cases.

Deployed binaries: Sometimes it's really useful to start the REPL
for a binary that's already deployed in a docker container. You can do
this with the `RULES_PYTHON_BOOTSTRAP_REPL` environment variable. For
example:

    $ RULES_PYTHON_BOOTSTRAP_REPL=1 bazel run --//python/config_settings:bootstrap_impl=script //tools:wheelmaker
    Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import tools.wheelmaker
    >>>

`py_{library,binary,test}` targets: These targets will now
auto-generate additional `<name>.repl` targets.

    $ bazel run --//python/config_settings:bootstrap_impl=script //python/runfiles:runfiles.repl
    Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import python.runfiles
    >>>

Arbitrary `PyInfo` providers: Spawn a REPL for any target that
provides `PyInfo` like this:

    $ bazel run --//python/config_settings:bootstrap_impl=script //python/bin:repl --//python/bin:repl_dep=//tools:wheelmaker
    Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import tools.wheelmaker
    >>>
@arrdem
Copy link
Contributor

arrdem commented Apr 1, 2025

Broadly in favor of this, had a similar macro feature aligned with the verbs pattern at a former employer.

I don't love having magical behavior that runs ipython if it's on the path, IMO an ipython shell or any other notebook-like thing is a different kind of target than a vanilla shell and we shouldn't overload the one with try-import behavior which is generally an antipattern. IMO the kind of shell to run should be explicitly selected either by having separate targets or by having a feature flag we can select() on.

While REPL is the more familiar term to me from the Lisp world, the Python ecosystem appears to prefer the words "shell" or "console". My previous macros emitted .shell and .ishell verbs which I would suggest is slightly better aligned.

@aignas
Copy link
Collaborator

aignas commented Apr 1, 2025

What is the difference between 2 and 3? Could we just get away with 3?

@philsc
Copy link
Contributor Author

philsc commented Apr 4, 2025

I don't love having magical behavior that runs ipython if it's on the path, IMO an ipython shell or any other notebook-like thing is a different kind of target than a vanilla shell

I generally agree, but everywhere I've implemented this (3 places), one of the first questions is "how do I get ipython?". So I've just given up on that front and made these automatically use that. I can be convinced to drop it, but I suspect without it, folks are just going to implement it themselves which reduces the value of having it in a shared place.

IMO the kind of shell to run should be explicitly selected either by having separate targets or by having a feature flag we can select() on.

Perhaps we could add an additional label_flag to let the user specify a .py file that invokes the actual shell (to replace code.interact()) in addition to the dep that pulls in ipython.

@philsc
Copy link
Contributor Author

philsc commented Apr 4, 2025

What is the difference between 2 and 3? Could we just get away with 3?

The difference is convenience. I have found that folks picked up the "bazel run this library with .repl at the end" a whole lot easier and they even taught it to other folks.

@philsc philsc requested a review from rickeylev April 4, 2025 05:42
@philsc
Copy link
Contributor Author

philsc commented Apr 4, 2025

While REPL is the more familiar term to me from the Lisp world, the Python ecosystem appears to prefer the words "shell" or "console". My previous macros emitted .shell and .ishell verbs which I would suggest is slightly better aligned.

Hmm. Perhaps I'm old 😛

REPL does appear in the docs: https://docs.python.org/3/tutorial/appendix.html#tut-interac

There are two variants of the interactive REPL. The classic basic interpreter is supported on all platforms with minimal line control capabilities.

And there's even an environment variable for configuration:
https://docs.python.org/3/using/cmdline.html#envvar-PYTHON_BASIC_REPL

I'm happy to change it to ".shell" or similar if you all want, but I don't think REPL is un-Pythonic so to speak.

@aignas
Copy link
Collaborator

aignas commented Apr 8, 2025

One extra question about what is the difference between your //python/bin:repl target and the existing //python/bin:python target? I have a usecase where I need a handle to the python interpreter with all of the deps available and importable. The current //python/bin:python only gives me a clean interpreter but the deps are not be found in my PYTHONPATH.

Should we have the approach where we:

  1. Fix the //python/bin:python to have the deps?
  2. If not, have a //python/bin:repl target to start with that just unblocks us from having a working interpreter.
  3. Add the env var to the bootstrap (which could be super useful but the API might need to be discussed further.
  4. Discuss the macro documentation, etc later.

@philsc
Copy link
Contributor Author

philsc commented Apr 8, 2025

One extra question about what is the difference between your //python/bin:repl target and the existing //python/bin:python target? I have a usecase where I need a handle to the python interpreter with all of the deps available and importable. The current //python/bin:python only gives me a clean interpreter but the deps are not be found in my PYTHONPATH.

Can you elaborate on this a bit? Is the use case to do something like subprocess.run([sys.executable, "path/to/file.py"])?

The goal with this PR here is to create a REPL in an environment identical to what you would see when running your code in a py_binary. I.e. all the things that the first and second stage bootstrap script do. The three that I can think of:

  1. venv creation
  2. safe path handling
  3. setting up runfiles env vars

You don't get that with only the interpreter, even if it could import the deps.

I think if we can figure out how to move all those things into a pre-generated venv, then that could work. But it's not obvious to me that we can do that at this time.

Should we have the approach where we:

1. Fix the `//python/bin:python` to have the deps?

2. If not, have a `//python/bin:repl` target to start with that just unblocks us from having a working interpreter.

3. Add the `env var` to the bootstrap (which could be super useful but the API might need to be discussed further.

4. Discuss the macro documentation, etc later.

Removed the env var in the bootstrap script and the macro stuff for future PRs.

@aignas
Copy link
Collaborator

aignas commented Apr 9, 2025

The concrete usecase is that I am using dagster and dagster allows users to include user code deployments via specifying in a YAML directory a list of venv with a path to the python interpreter.

Right now I am hacking this around by having a py_library with all the deps and then a py_binary that has the main as the following file:

import subprocess
import sys

if __name__ == "__main__":
    subprocess.run([sys.executable] + sys.argv[1:])  # noqa: S603

This basically re-execs the interpreter from the environment with any parameters that may be passed to the target. If I run this target I get the REPL. And I can do import dagster in the said vanila python REPL.

If on the other hand I pass the py_library to the @rules_python//python/bin:python target via a the label flag and try doing import dagster it fails because the deps are not wired in.

My question here is:

  • Why do we have @rules_python//python/bin:python in addition to this new REPL target?
  • If there is a reason to have both, where should the user use one over the other?

@groodt
Copy link
Collaborator

groodt commented Apr 9, 2025

My 2c here. I would prefer to have this in user-space. We have a similar need at work, and we added a small macro to create an ipython py_console_script_binary with the dependencies. (It occurs to me with the new main_module support from #2671 there might also be an additional way to do it... )

This deals with most of the concerns listed above?

  • venv creation ✅ - I presume you really mean the runfiles or PYTHONPATH? But broadly, when the rules support site-packages layout, you can get this for free with a user-space solution
  • safe path handling ✅ - A user-space solution is launched with the same SAFEPATH handling
  • setting up runfiles env vars ✅ - A user-space solution sets up runfiles

Im worried about adding more complexity into the toolchains and bootstrapping areas at the moment.

If your main motivation is convenience for users, it's unclear why you couldn't create some user facing macros in your repository (automated or not via gazelle) that wrap the native py_library and py_binary targets with an ipython repl and targetnames of your choosing?

@aignas
Copy link
Collaborator

aignas commented Apr 9, 2025

I am curious if having the main_module with a select statement something like:

py_binary(
    name = "my_py_binary",
    srcs = ...,
    deps = ...,
    main_module = select({
        "@rules_python//python/config_settings:launch_repl": "$(PYTHON3)",
        "//conditions:default": "",
    }),
)

I agree with @groodt here that running IPython should be solved in the user space, but launching vanilla python should be a part of the core rules IMHO, but I am not sure exactly how.

I personally liked the already existing target @rules_python//python/bin:python but currently it is not setting up the PYTHONPATH yet.

@philsc
Copy link
Contributor Author

philsc commented Apr 20, 2025

My question here is:

* Why do we have `@rules_python//python/bin:python` in addition to this new REPL target?

The goal with this PR here is to create a REPL in an environment identical to what you would see when running your code in a py_binary. I.e. all the things that the first and second stage bootstrap script do. The three that I can think of:

  1. venv creation
  2. safe path handling
  3. setting up runfiles env vars

You don't get that with only the interpreter, even if it could import the deps.

If we want to to make @rules_python//python/bin:python capable of importing arbitrary py_library targets, then we'd have to teach python/private/interpreter_tmpl.sh about runfiles and set up the PYTHONPATH. I suspect it would look a lot like what python/private/python_bootstrap_template.txt looks like right now (maybe minus the zipfile stuff? 🤔 ). If that's what we want to do that, that's okay, but that's not the goal of this target here. The goal of the python/bin:python target was to just give access to the interpreter

* If there is a reason to have both, where should the user use one over the other?

When you just want the interpreter (for deploying, bootstrapping, etc.) you use @rules_python//python/bin:python. For development, when you want a REPL in an environment identical to what you see in a py_binary, then you use python/bin:repl.

Your work around with a py_binary target sounds reasonable to me. If we turn that into a macro, then we can avoid introducing a transition for targets that want the interpreter plus some libraries.

My 2c here. I would prefer to have this in user-space. ... Im worried about adding more complexity into the toolchains and bootstrapping areas at the moment.

I removed all the changes to IPython, the bootstrapping scripts, and there are no changes to the toolchain stuff. Could I ask you to double check that you're still interested in seeing some changes?

If your main motivation is convenience for users, it's unclear why you couldn't create some user facing macros in your repository (automated or not via gazelle) that wrap the native py_library and py_binary targets with an ipython repl and targetnames of your choosing?

I removed all the IPython-related changes. The user can create their own macro or substitute their own stub if they like.

I am curious if having the main_module with a select statement something like:

py_binary(
    name = "my_py_binary",
    srcs = ...,
    deps = ...,
    main_module = select({
        "@rules_python//python/config_settings:launch_repl": "$(PYTHON3)",
        "//conditions:default": "",
    }),
)```

That's actually pretty cool! I think that could work, but it would be limited to py_binary targets. It's not immediately obvious to me how to extend that to py_library without doing, say, a user-space wrapping macro that automatically adds a py_binary wrapper for the target (similar to what Greg was saying).

Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as the REPL implementation goes, I don't have major issues with this, but love to discuss on the comments I left.

# can point this at their version of ipython.
label_flag(
name = "repl_stub_dep",
build_setting_default = "//python/private:empty",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we just use //python:none as the default? I think it is working above in setting the python_src?

If it does not work, maybe we just need to add an empty PyInfo provider in the sentinel target.

# The user can modify this flag to make arbitrary PyInfo targets available for
# import on the REPL.
label_flag(
name = "repl_dep",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just use python_src from above instead?

},
)

def py_repl_binary(name, stub, deps = [], data = [], **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document the values with the usual docs.

Comment on lines +37 to +39
"stub": attr.label(
mandatory = True,
allow_single_file = True,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is stub?

Comment on lines +1 to +14
# Copyright 2025 The Bazel Authors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Copyright can be removed from the PR, bazel-contrib does not require that.

Comment on lines +1 to +3
import code

code.interact()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is code and what is code.interact? Is it something special? Could you please add a comment?

Comment on lines +44 to +46
# The user can modify this flag to make an interpreter shell library available
# for the stub. E.g. if they switch the stub for an ipython-based one, then they
# can point this at their version of ipython.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our docs we have documentation for targets, e.g. the config settings, so these comments could be written there so that the user can easily find the docs.

@aignas aignas marked this pull request as ready for review April 23, 2025 08:59
@aignas
Copy link
Collaborator

aignas commented Apr 23, 2025

Changed to Ready for Review since we are actively reviewing the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants