Defect: Cafrun on single processor (-np 1) #778

Open
@jmag722

Description

  • I am reporting a bug others will be able to reproduce and not asking a question or requesting a new feature.

System information including:

  • OpenCoarrays Version: 2.10.1
  • Fortran Compiler: gfortran gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)
  • C compiler used for building lib: gcc version 11.3.0
  • Installation method: CMake from source using git clone (followed the instructions in Modern Fortran by Milan Curcic)
cd OpenCoarrays
mkdir build
cd build
FC=gfortran CC=gcc cmake ..
make
make install
  • All flags & options passed to the installer: FC=gfortran, CC=gcc
  • Output of uname -a: Linux DESKTOP-OB8O7DS 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • MPI library being used: Debian OpenMPI v.4.1.2
  • Machine architecture and number of physical cores: x86_64, 4 physical cores, 2 threads/core
  • Version of CMake: 3.22.1
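
For anyone comparing their setup against the list above, the version details can be gathered in one pass. This is only a sketch: the helper name tool_version is made up for illustration, and the tool list is an assumption based on the tools named in this report.

```shell
#!/bin/sh
# Sketch: print the first line of `<tool> --version` for each tool,
# or note that the tool is not on PATH.
# tool_version is a hypothetical helper name, not part of OpenCoarrays.
tool_version() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | head -n 1)"
  else
    printf '%s: not found\n' "$1"
  fi
}

# Kernel and architecture, as requested by the issue template.
uname -a

# Tools named in this report; extend the list as needed.
for tool in gfortran gcc cmake mpirun; do
  tool_version "$tool"
done
```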

To help us debug your issue please explain:

What you were trying to do (and why)

I'm trying to run cafrun in serial. The bug occurred for the tally test from https://github.com/sourceryinstitute/OpenCoarrays/blob/main/GETTING_STARTED.md. However, the same bug also occurred running the serial program ./weather_stats (my original goal), compiled from https://github.com/modern-fortran/weather-buoys.

What happened (include command output, screenshots, logs, etc.)

Running cafrun -np 4 tally printed "Test passed", as expected.
However, running ./tally directly or cafrun -np 1 tally yields the following:

[DESKTOP-OB8O7DS:10249] *** An error occurred in MPI_Win_create
[DESKTOP-OB8O7DS:10249] *** reported by process [304545793,0]
[DESKTOP-OB8O7DS:10249] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
[DESKTOP-OB8O7DS:10249] *** MPI_ERR_WIN: invalid window
[DESKTOP-OB8O7DS:10249] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[DESKTOP-OB8O7DS:10249] ***    and potentially your MPI job)

After cloning the Modern Fortran weather-buoys repo and compiling weather_stats, the same thing occurred: the program ran with 2 processors but not with 1. When run with 1 processor, I get the error above.

What you expected to happen

I expected "Test passed", as in the -np 4 run.

Step-by-step reproduction instructions to reproduce the error/bug

$ cat tally.f90
      program main
        use iso_c_binding, only : c_int
        use iso_fortran_env, only : error_unit
        implicit none
        integer(c_int) :: tally
        tally = this_image() ! this image's contribution
        call co_sum(tally)
        verify: block
          integer(c_int) :: image
          if (tally/=sum([(image,image=1,num_images())])) then
             write(error_unit,'(a,i5)') "Incorrect tally on image ",this_image()
             error stop
          end if
        end block verify
        ! Wait for all images to pass the test
        sync all
        if (this_image()==1) print *,"Test passed"
      end program
$ caf tally.f90 -o tally
$ cafrun -np 1 ./tally
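
To confirm that only the single-image case fails, the same executable can be run with several image counts in one go. The loop below is a rough sketch, not part of the original report: the variable names LAUNCHER, PROGRAM and LOGDIR and the helper run_with_images are made up for illustration, and the counts 1, 2 and 4 are just the ones mentioned in this issue.

```shell
#!/bin/sh
# Sketch: run one executable with several image counts and report each result.
# LAUNCHER, PROGRAM, LOGDIR and run_with_images are hypothetical names;
# adjust them to your setup.
LAUNCHER="${LAUNCHER:-cafrun}"
PROGRAM="${PROGRAM:-./tally}"
LOGDIR="${LOGDIR:-.}"

run_with_images() {
  # $1 = number of images passed to the launcher via -np
  log="$LOGDIR/np_$1.log"
  "$LAUNCHER" -np "$1" "$PROGRAM" >"$log" 2>&1
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "np=$1: ok"
  else
    echo "np=$1: FAILED (exit $status), see $log"
  fi
}

for n in 1 2 4; do
  run_with_images "$n"
done
```

Each run's combined stdout and stderr ends up in its own log file, so the MPI error text for the failing count can be attached as is.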

Or, for the weather-buoys example:

git clone https://github.com/modern-fortran/weather-buoys.git
cd weather-buoys
make weather_stats
./weather_stats
