Protocol buffers are not correctly built during installation #188

Open
peytondmurray opened this issue Oct 22, 2024 · 0 comments
Labels
bug Something isn't working type:bug


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow Model Analysis): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux 6.6.57-1-lts x86_64
  • TensorFlow Model Analysis installed from (source or binary): Source
  • TensorFlow Model Analysis version (use command below): 7a068ec
  • Python version: 3.11.9
  • Jupyter Notebook version: N/A
  • Exact command to reproduce: N/A

Describe the problem

This issue is a catch-all for a number of problems with the current build system. Starting from the beginning:

  1. Hardcoded protobuf v3.21.9 dependency: Although the WORKSPACE file defines targets from com_google_protobuf for v3.21.9, the only place it actually uses _PROTOBUF_COMMIT is in stripping output; the archive name hardcodes the version instead. It should use _PROTOBUF_COMMIT both in the archive name and for stripping output.
  2. Version inconsistencies: setup.py requires protobuf>=3.20.3 for python<3.11, which doesn't match the version grabbed by bazel. For python>=3.11, protobuf>=4.25.2 is required, which is a full major version ahead of what bazel grabs and even more likely to be incompatible.
  3. Bazel doesn't build the protocol buffers: setup.py does an ad-hoc, platform-dependent search for protoc, meaning that the version of protobuf downloaded by bazel never gets used. If the build environment already contains any version of protobuf, setup.py will happily use it, which can produce generated files that are incompatible with the rest of the code.
  4. com_google_protobuf gets clobbered by rules_rust transitive dependency: Bazel never downloads the version of protobuf that you request in the WORKSPACE file because rules_rust has a transitive dependency on com_google_protobuf that takes precedence. Bazel silently builds the protocol buffers using a much older version of protobuf as a result, again leading to library incompatibilities at runtime.
  5. bazel is never invoked from setup.py: Although there is a BUILD file for generating python code from the protobufs, bazel is never called from setup.py.
  6. tensorflow_model_analysis/proto/BUILD points to the wrong protobuf dependency: Needs to point to the explicit protobuf dependency in WORKSPACE.
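
To make the mismatch in item 2 concrete, here is a minimal sketch that compares the protobuf version pinned in WORKSPACE against the lower bounds quoted from setup.py. The function names are made up for illustration, and it assumes the version numbers can be compared directly as dotted integers, which ignores any differences between protoc and Python package numbering.

```python
# Sketch only: helper names are hypothetical, version numbers are the ones quoted above.

def parse_version(text):
    """Turn a dotted version string such as '3.21.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Version of com_google_protobuf that WORKSPACE asks bazel to fetch.
BAZEL_PROTOBUF = parse_version("3.21.9")


def required_minimum(python_version):
    """Lower bound on protobuf that setup.py declares for a (major, minor) Python."""
    if python_version >= (3, 11):
        return parse_version("4.25.2")
    return parse_version("3.20.3")


def pin_satisfies_requirement(python_version):
    """True if the bazel pin is at least the minimum setup.py asks for."""
    return BAZEL_PROTOBUF >= required_minimum(python_version)


if __name__ == "__main__":
    for python_version in [(3, 10), (3, 11)]:
        print(python_version, pin_satisfies_requirement(python_version))
```

Under these assumptions the pin clears the lower bound for python<3.11 (without matching it exactly) and falls below it for python>=3.11.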

Even if bazel is made to build the protocol buffers, setup.py will need to be modified to grab the generated sources from bazel-bin/ when the wheel is being built. IMO this could be done much more easily with meson-python, which is already a first-class build backend for Python. It would allow robust version control for external tooling, with fallback options if the host doesn't have the right version of protoc, and we'd also avoid problems with transitive dependencies clobbering our actual dependencies. If this is something folks are interested in, I'm happy to write the meson.build. Otherwise we can stick with bazel and call it by hand in setup.py.
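
For the "call bazel by hand" option, a rough sketch of what setup.py might do is below. This is not the project's current code: the target label `//tensorflow_model_analysis/proto:all` and all function names are assumptions, and it assumes the generated Python files follow the usual `*_pb2.py` naming and mirror the source layout under bazel-bin/.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical target pattern; the real label depends on the BUILD file.
PROTO_TARGET = "//tensorflow_model_analysis/proto:all"


def bazel_build_command(target=PROTO_TARGET):
    """Return the command line used to build the proto targets."""
    return ["bazel", "build", target]


def copy_generated_protos(bazel_bin, source_root):
    """Copy every *_pb2.py under bazel_bin into the matching path under source_root."""
    bazel_bin = Path(bazel_bin)
    source_root = Path(source_root)
    copied = []
    for generated in sorted(bazel_bin.rglob("*_pb2.py")):
        destination = source_root / generated.relative_to(bazel_bin)
        destination.parent.mkdir(parents=True, exist_ok=True)
        # copyfile copies contents only, so read-only bazel outputs stay writable here.
        shutil.copyfile(generated, destination)
        copied.append(destination)
    return copied


def build_protos(repo_root):
    """Run bazel, then pull the generated modules back into the source tree."""
    subprocess.run(bazel_build_command(), cwd=repo_root, check=True)
    return copy_generated_protos(Path(repo_root) / "bazel-bin", repo_root)
```

A custom build command in setup.py could call `build_protos` before packaging, so the wheel only ever contains files generated by the protobuf version bazel resolved.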

On my system I'm unable to run tests because of this, but because bazel provides partial build isolation, whether you are affected by this or not really depends on the build environment.
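
To check whether a given build environment is likely to be affected, something like the following could report which protoc would be found on PATH and which protobuf runtime is installed. It is only a diagnostic sketch with a made-up function name; it does not replicate the search that setup.py actually performs.

```python
import shutil
import subprocess
from importlib import metadata


def describe_protobuf_environment():
    """Report the protoc on PATH (if any) and the installed protobuf runtime (if any)."""
    protoc_path = shutil.which("protoc")
    protoc_version = None
    if protoc_path is not None:
        try:
            result = subprocess.run(
                [protoc_path, "--version"],
                capture_output=True,
                text=True,
                check=False,
                timeout=10,
            )
            protoc_version = result.stdout.strip() or None
        except (OSError, subprocess.TimeoutExpired):
            protoc_version = None

    try:
        runtime_version = metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        runtime_version = None

    return {
        "protoc_path": protoc_path,
        "protoc_version": protoc_version,
        "runtime_version": runtime_version,
    }


if __name__ == "__main__":
    for key, value in describe_protobuf_environment().items():
        print(f"{key}: {value}")
```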

cc @smokestacklightnin
