Changes in v1.3.1 cause inconsistent MPI errors. #17

Open
JuanPedroGHM opened this issue Apr 14, 2025 · 1 comment
Comments

@JuanPedroGHM
Hi,

For the heat library, we use both mpi4py and the setup-mpi action. First of all, thanks for the great work!

We use setup-mpi to run our tests on GitHub, but we have not been able to update the action to v1.3.1 because our tests fail after the update, as you can see in this PR. As far as we can tell, the MPI version does not change between the runs using v1.2.0 and v1.3.1. Both use OpenMPI 1.4.6, and the output of ompi_info --all is identical.

We have not been able to reproduce the errors on our own systems, even after installing the same MPI version and the other dependencies and running our tests, so we have not been able to debug them properly.

Here are links to two pipelines, one running on 1.3.1 and one running on 1.2.0.

Failing CI with 1.3.1

Working CI with 1.2.0

We would really appreciate your input into solving this issue. Let us know if you need any further information.

Best,
Juan

@dalcinl
Member

dalcinl commented Apr 14, 2025

I have no clue what's going on. This does not seem to be related to the mpi4py/setup-mpi action, but rather to one of those nasty bugs in Open MPI v4.x that lead to non-reproducible failures.

The only suggestion I have is to add env: {OMPI_MCA_pml: ob1} to your test step, or to add a preceding step with run: echo OMPI_MCA_pml=ob1 >> "$GITHUB_ENV". That's what I usually do for my own CI runs, for example here.
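For illustration, the two options could look roughly like this in a workflow file. This is only a sketch: the job name, runner, `mpi: openmpi` input, and test command are placeholders and are not taken from the heat repository, so adapt them to the actual workflow.

```yaml
# Hypothetical workflow excerpt; job name, runner, and test command are assumptions.
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: mpi4py/setup-mpi@v1
        with:
          mpi: openmpi

      # Option 1: export the variable for every subsequent step in the job.
      - name: Select the ob1 PML for Open MPI
        run: echo "OMPI_MCA_pml=ob1" >> "$GITHUB_ENV"

      # Option 2: set the variable only on the step that runs the tests.
      - name: Run tests
        env:
          OMPI_MCA_pml: ob1
        run: mpirun -n 2 python -m pytest  # placeholder test command
```

Either option on its own should be enough. Both are shown together here only for comparison.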
