Kokoro cmake_bazel_linux_x86-* builds are failing to link iree-lld #10097

Closed
ScottTodd opened this issue Aug 15, 2022 · 8 comments
Labels
bug 🐞 Something isn't working

Comments

@ScottTodd
Member

This started when #10087 was merged (we don't test Kokoro on presubmit and will soon be turning it down). We should see how those Kokoro builds differ from the GitHub Actions pipelines, and whether this indicates some missing test/build coverage in the new pipelines. I think the build_tf_integrations / test_tf_integrations pipelines should be testing roughly the same things as the Kokoro builds.

https://source.cloud.google.com/results/invocations/49ffbbca-a5ab-47d9-913a-2277ba46ee83/targets/iree%2Fgcp_ubuntu%2Fcmake-bazel%2Flinux%2Fx86-swiftshader%2Fmain/log

[4580/4784] Linking C executable compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld
FAILED: compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld
: && /usr/bin/clang-9 -O2 -g -DNDEBUG  compiler/src/iree/compiler/API/python/CMakeFiles/IREECompilerLldTool.dir/LldTool.c.o -o compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld  -Wl,-rpath,"\$ORIGIN:/home/kbuilder/iree/build/tf/compiler/bindings/python/iree/compiler/_mlir_libs:"  compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so && :
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::GatherOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::Value)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::DynamicBroadcastInDimOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::detail::DenseArrayAttr<long>, mlir::detail::DenseArrayAttr<long>, mlir::detail::DenseArrayAttr<long>)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::ConcatenateOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::Type, mlir::ValueRange, mlir::Value, unsigned long)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::GatherOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::ScatterOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::DynamicBroadcastInDimOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::ConcatenateOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::THLODialect, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::ScatterOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::Value)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::THLODialect::THLODialect(mlir::MLIRContext*)'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
@ScottTodd ScottTodd added the bug 🐞 Something isn't working label Aug 15, 2022
@benvanik
Collaborator

@stellaraccident
Collaborator

There is some sketchy layering in MHLO that I have already talked to them about fixing. It isn't terribly surprising that a slightly different setup uncovered more issues, and I don't think it is particularly actionable. I'm unlikely to go track down the specifics in these Kokoro builds when the root cause is getting fixed.

@ScottTodd
Member Author

I'm less concerned with the particular build failure and more wondering why the pipelines/environments are different.

build_all is at least linking that executable successfully: [4795/4836] Linking C executable compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld (https://github.com/iree-org/iree/runs/7839905117?check_suite_focus=true#logs). It would be bad if the new pipelines weren't building the Python bindings, for example.

@stellaraccident
Collaborator

Looks like the release failed. Probably on the same thing, but I can't see the logs yet. I'll debug that since it is actually... debuggable. I'm spending no time on Kokoro failures.

@stellaraccident
Collaborator

It's a missing dep in a shared library. OS/linker nonsense is the reason.
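One way to sanity-check the "missing dep" reading is to pull the symbols out of the linker output and see which namespace they belong to. The snippet below is only a rough sketch: the three log lines are copied from the failure quoted earlier in this issue, while the helper names (`undefined_symbols`, `thlo_classes`) are made up for illustration and are not part of IREE's tooling.

```python
import re
from collections import Counter

# Three lines copied from the linker output quoted earlier in this issue.
LOG = """\
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::GatherOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::Value)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::ScatterOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::THLODialect::THLODialect(mlir::MLIRContext*)'
"""

# Matches the demangled symbol between the backtick and the closing quote.
UNDEFINED_RE = re.compile(r"undefined reference to `(?P<symbol>.+)'$")
# Matches the first class name that lives in the mlir::thlo namespace.
THLO_RE = re.compile(r"mlir::thlo::(\w+)")


def undefined_symbols(log):
    """Return the symbol from every 'undefined reference' line (hypothetical helper)."""
    symbols = []
    for line in log.splitlines():
        match = UNDEFINED_RE.search(line.strip())
        if match:
            symbols.append(match.group("symbol"))
    return symbols


def thlo_classes(symbols):
    """Count the mlir::thlo class each symbol mentions; anything else is '<other>'."""
    counts = Counter()
    for symbol in symbols:
        match = THLO_RE.search(symbol)
        counts[match.group(1) if match else "<other>"] += 1
    return counts


symbols = undefined_symbols(LOG)
classes = thlo_classes(symbols)
print(classes)
```

If every entry lands on a `mlir::thlo` class and nothing falls into `<other>`, that is at least consistent with a single THLO library not being linked into libIREECompilerAggregateCAPI.so, though it does not by itself explain why only some configurations fail.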

@GMNGeoffrey
Contributor

Cross-posting my comment from the PR:

So this broke a Kokoro postsubmit build that builds the Python bindings linking iree-lld. Somehow it didn't break any of the GitHub Actions builds, so there's something subtly different about how they're building it.

[4546/4749] Linking C executable compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld
FAILED: compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld 
: && /usr/bin/clang-9 -O2 -g -DNDEBUG  compiler/src/iree/compiler/API/python/CMakeFiles/IREECompilerLldTool.dir/LldTool.c.o -o compiler/bindings/python/iree/compiler/_mlir_libs/iree-lld  -Wl,-rpath,"\$ORIGIN:/home/kbuilder/iree/build/tf/compiler/bindings/python/iree/compiler/_mlir_libs:"  compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so && :
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::GatherOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::Value)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::DynamicBroadcastInDimOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::detail::DenseArrayAttr<long>, mlir::detail::DenseArrayAttr<long>, mlir::detail::DenseArrayAttr<long>)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::ConcatenateOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::Type, mlir::ValueRange, mlir::Value, unsigned long)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::GatherOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::ScatterOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::DynamicBroadcastInDimOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::ConcatenateOp, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::detail::TypeIDResolver<mlir::thlo::THLODialect, void>::id'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::ScatterOp::build(mlir::OpBuilder&, mlir::OperationState&, mlir::TypeRange, mlir::Value, mlir::Value, mlir::Value)'
compiler/bindings/python/iree/compiler/_mlir_libs/libIREECompilerAggregateCAPI.so: undefined reference to `mlir::thlo::THLODialect::THLODialect(mlir::MLIRContext*)'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

https://source.cloud.google.com/results/invocations/4ecedaea-bfcd-4575-8ed7-9b8175621e92/targets/iree%2Fgcp_ubuntu%2Fcmake-bazel%2Flinux%2Fx86-swiftshader%2Fmain/log

Contrasting the configure step between the Kokoro and GitHub Actions builds (copied from the logs, with extra line breaks and reordering):

# Kokoro
# docker_run.sh gcr.io/iree-oss/frontends-swiftshader@sha256:3090418a8d8a64c356d35eff285af32570a72f41127aa123209c1562f57abb01 build_tools/kokoro/gcp_ubuntu/cmake-bazel/linux/x86-swiftshader/build.sh
/usr/bin/cmake \
  -G Ninja \
  -B /home/kbuilder/iree/build/tf \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DIREE_BUILD_PYTHON_BINDINGS=ON \
  -DIREE_BUILD_COMPILER=ON \
  -DIREE_BUILD_TESTS=ON \
  -DIREE_BUILD_SAMPLES=OFF \
  -DIREE_HAL_DRIVER_CUDA=ON \
  -DIREE_TARGET_BACKEND_CUDA=ON \
  .

# GitHub Actions
# docker_run.sh gcr.io/iree-oss/base@sha256:5d43683c6b50aebe1fca6c85f2012f3b0fa153bf4dd268e8767b619b1891423a ./build_tools/cmake/build_all.sh ${BUILD_DIR}
/usr/bin/cmake \
  -G Ninja \
  -B full-build-dir \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DIREE_BUILD_PYTHON_BINDINGS=ON \
  -DIREE_ENABLE_LLD=ON \
  -DIREE_BUILD_DOCS=ON \
  -DIREE_ENABLE_ASSERTIONS=ON \
  -DIREE_ENABLE_CCACHE=OFF \
  -DIREE_HAL_DRIVER_CUDA=ON \
  -DIREE_TARGET_BACKEND_CUDA=ON \
  /home/runner/actions-runner/_work/iree/iree

Maybe the error only happens when building them without assertions? Or only when linking without lld?
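To make the comparison less eyeball-driven, here is a small sketch that diffs the `-D` options from the two configure commands above. The flag lists are copied from those commands; the helper names (`parse_defines`, `diff_defines`) are hypothetical. It only sees flags that were passed explicitly, so anything left unset falls back to the project's CMake defaults, which this script knows nothing about. The two builds also use different Docker images, which is outside what it compares.

```python
# Flags copied from the two configure commands quoted above.
KOKORO_FLAGS = [
    "-DCMAKE_BUILD_TYPE=RelWithDebInfo",
    "-DIREE_BUILD_PYTHON_BINDINGS=ON",
    "-DIREE_BUILD_COMPILER=ON",
    "-DIREE_BUILD_TESTS=ON",
    "-DIREE_BUILD_SAMPLES=OFF",
    "-DIREE_HAL_DRIVER_CUDA=ON",
    "-DIREE_TARGET_BACKEND_CUDA=ON",
]
GITHUB_ACTIONS_FLAGS = [
    "-DCMAKE_BUILD_TYPE=RelWithDebInfo",
    "-DIREE_BUILD_PYTHON_BINDINGS=ON",
    "-DIREE_ENABLE_LLD=ON",
    "-DIREE_BUILD_DOCS=ON",
    "-DIREE_ENABLE_ASSERTIONS=ON",
    "-DIREE_ENABLE_CCACHE=OFF",
    "-DIREE_HAL_DRIVER_CUDA=ON",
    "-DIREE_TARGET_BACKEND_CUDA=ON",
]


def parse_defines(flags):
    """Turn ['-DNAME=VALUE', ...] into {'NAME': 'VALUE', ...} (hypothetical helper)."""
    defines = {}
    for flag in flags:
        if flag.startswith("-D") and "=" in flag:
            name, value = flag[2:].split("=", 1)
            defines[name] = value
    return defines


def diff_defines(left, right):
    """Return (only_left, only_right, changed) for two define dicts."""
    only_left = {k: v for k, v in left.items() if k not in right}
    only_right = {k: v for k, v in right.items() if k not in left}
    changed = {
        k: (left[k], right[k])
        for k in left.keys() & right.keys()
        if left[k] != right[k]
    }
    return only_left, only_right, changed


kokoro = parse_defines(KOKORO_FLAGS)
actions = parse_defines(GITHUB_ACTIONS_FLAGS)
only_kokoro, only_actions, changed = diff_defines(kokoro, actions)

print("Only in Kokoro:", sorted(only_kokoro))
print("Only in GitHub Actions:", sorted(only_actions))
print("Set differently:", changed)
```

No shared flag is set to a different value in the two commands; the differences are all flags that appear on one side only, including `IREE_ENABLE_LLD` and `IREE_ENABLE_ASSERTIONS` on the GitHub Actions side, which are the two suspects raised above.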

@allieculp

Looks like this still needs some triaging. Any further updates, @GMNGeoffrey @stellaraccident @ScottTodd @benvanik?

@ScottTodd
Member Author

Stella fixed this in 61e0e46

@ScottTodd ScottTodd moved this to Done in IREE Aug 17, 2022