bugfix: Setting OMPI_MPI_THREAD_LEVEL to a value different than requested crashes #13211

Open
abouteiller wants to merge 4 commits into main from bugfix/env-thread-level-ignored

abouteiller
Member

Setting OMPI_MPI_THREAD_LEVEL to a value different from the one requested in MPI_Init_thread would invoke the error handler, even though it is a useful override in some threaded library use cases.
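
As a rough illustration of the override behavior described above, the sketch below lets a numeric environment value take precedence over the requested level instead of treating the mismatch as an error. The helper name `effective_thread_level` and its return conventions are hypothetical and are not taken from the Open MPI sources; only the 0-3 numeric range comes from this discussion.

```c
#include <stdlib.h>  /* strtol */

/* Hypothetical helper, not Open MPI's implementation.
 * Returns the thread level to use:
 *   - `required` if env_value is unset or empty (no override),
 *   - the numeric override if env_value is a plain integer in 0..3,
 *   - -1 if env_value is set but malformed or out of range. */
int effective_thread_level(int required, const char *env_value)
{
    if (NULL == env_value || '\0' == env_value[0]) {
        return required;   /* no override requested */
    }
    char *end = NULL;
    long v = strtol(env_value, &end, 10);
    if (end == env_value || '\0' != *end || v < 0 || v > 3) {
        return -1;         /* set, but not a plain integer in 0..3 */
    }
    return (int) v;        /* the override wins, even if it differs from `required` */
}
```

A caller would typically pass `getenv("OMPI_MPI_THREAD_LEVEL")` as the second argument and treat a `-1` result as a bad-parameter error.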

@abouteiller abouteiller removed the bug label Apr 25, 2025
@abouteiller abouteiller force-pushed the bugfix/env-thread-level-ignored branch 4 times, most recently from b71461f to b481893 Compare April 25, 2025 16:29
@abouteiller abouteiller requested review from jsquyres and bosilca May 5, 2025 15:33
@abouteiller abouteiller force-pushed the bugfix/env-thread-level-ignored branch from b481893 to 1a0d2f7 Compare May 12, 2025 17:42
Hello! The Git Commit Checker CI bot found a few problems with this PR:

1a0d2f7: review comments

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@abouteiller abouteiller force-pushed the bugfix/env-thread-level-ignored branch 2 times, most recently from 8ad7579 to fe5d20e Compare May 15, 2025 15:55
Hello! The Git Commit Checker CI bot found a few problems with this PR:

b056c23: OMPI_MPI_THREAD_LEVEL can now take 'multiple' 'MPI...

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@abouteiller abouteiller force-pushed the bugfix/env-thread-level-ignored branch from b056c23 to 5059134 Compare June 13, 2025 03:57
@bosilca
Member

bosilca commented Jun 13, 2025

Hold on a little on this one: I have a fancier solution that I just haven't had the time to integrate. The code is attached below, and it allows partial matching as long as the match is unique. For example, "s" will not be accepted because it matches both single and serialized, but "si" will be.

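#include <stdlib.h>   /* strtol */
#include <string.h>   /* strlen */
#include <strings.h>  /* strncasecmp */

/* Returns the parsed integer if value is a plain number, the index of the
 * unique keyword that value is a prefix of (after stripping an optional
 * preposition), or -1 if there is no match or the match is ambiguous. */
int check_env_value(const char** valid_prepositions, const char** keywords, int nb_keywords, const char* value)
{
    char *end = NULL;
    const char *prep, *token = value;  /* full match by default */
    int pidx = 0, v = (int)strtol(value, &end, 10), found = -1;
    if ((end != value) && ('\0' == *end))  /* the whole string is a number */
        return v;
    while(NULL != (prep = valid_prepositions[pidx])) {
        if( 0 == strncasecmp(prep, value, strlen(prep)) ) {
            token = value + strlen(prep);
            break;  /* got a token let's find a match */
        }
        pidx++;
    }
    if ('\0' == *token)  /* an empty token would prefix-match every keyword */
        return -1;

    for(int i = 0; i < nb_keywords; i++) {
        if( 0 == strncasecmp(keywords[i], token, strlen(token)) ) {
            if( -1 != found ) {  /* not the first match, bail out */
                return -1;
            }
            found = i;
        }
    }
    return found;
}

/* Ordered so that the index equals the numeric level 0-3 (assuming the index is used as the level). */
static const char* keywords[] = {"single", "funneled", "serialized", "multiple"};
static const char* prepositions[] = {"mpi_thread_", "thread_", NULL};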

@abouteiller
Member Author

I think you have perms to push in my branch if you want to do that

Member

@jsquyres jsquyres left a comment

I think we're going to need to get this documented, too. It's cool new functionality, but if it doesn't appear in a man page and/or some other docs, no one will ever use it.

@jsquyres jsquyres mentioned this pull request Jun 21, 2025
`requested` in `MPI_Init_thread` would invoke the error handler, even
though it is a useful override in some threaded library use cases.

Signed-off-by: Aurelien Bouteiller <[email protected]>
(single,etc) in addition to numeric 0-3 values

Signed-off-by: Aurelien Bouteiller <[email protected]>
@jsquyres jsquyres force-pushed the bugfix/env-thread-level-ignored branch from 54682e4 to 0f4673c Compare June 21, 2025 19:39
jsquyres
jsquyres previously approved these changes Jun 21, 2025
Member

@jsquyres jsquyres left a comment

I took the liberty of updating the MPI_Init* man pages. You're welcome. 😇

I made minor changes to some MPI_Session_* man pages, too, but most changes were to

@jsquyres jsquyres force-pushed the bugfix/env-thread-level-ignored branch from 0f4673c to 027fda5 Compare June 21, 2025 19:48
jsquyres
jsquyres previously approved these changes Jun 21, 2025
@jsquyres jsquyres force-pushed the bugfix/env-thread-level-ignored branch from 027fda5 to 1e66d4c Compare June 22, 2025 12:45
@jsquyres
Member

jsquyres commented Jun 22, 2025

It occurred to me overnight that the updates I made in MPI_Init*(3) and MPI_Finalize(3) were indicative of the fact that none of our man pages had been updated to reflect the fact that the world model now exists (and is distinct from the MPI session model). So I updated even more text this morning to describe and clarify the MPI world model vs. the MPI session model.

I.e., in documenting the MPI_THREAD_* constant-setting functionality via the OMPI_MPI_THREAD_LEVEL env variable, this PR became an excuse to update the man pages about the MPI world model vs. the MPI session model.

Reviewers should read these man pages:

jsquyres
jsquyres previously approved these changes Jun 22, 2025
@bosilca bosilca force-pushed the bugfix/env-thread-level-ignored branch 2 times, most recently from 046983c to b3e2f68 Compare June 25, 2025 09:39
@bosilca
Member

bosilca commented Jun 25, 2025

I made all changes I wanted to the documentation. As far as I am concerned this is ready to go.

@abouteiller
Member Author

LGTM as well

bosilca
bosilca previously approved these changes Jun 27, 2025
Comment on lines +311 to +317
/* did we find a valid integer value? */
const int nb_keywords = sizeof(ompi_thread_level_values)/sizeof(ompi_thread_level_values[0]);
if (found >= nb_keywords) {
    return OMPI_ERR_BAD_PARAM;
}
Member

I don't quite grok this -- this works for the integer values 0, 1, 2, 3, but won't work if -- when ABI is merged in -- the user specifies the ABI integer value for MPI_THREAD_MULTIPLE (4096), for example.

Wouldn't it be better to iterate through the values of the ompi_thread_level_values and see if the integer found value matches one of them?

Member

The documentation states 0 to 3, the MCA help string states 0 to 3, and the previous OMPI versions were also based on 0 to 3 values. If we add the fact that the internal values, ABI or not, are hard for users to know and are not necessarily monotonically increasing, I don't think that accepting the internal values makes sense or is useful for users (except maybe for nitpicking).

Member

We've previously allowed 0-3, which is probably "well known" (or at least easy to guess).

The ABI values are published in the MPI-5 document, so those fit at least some definition of "well known", too.

Meaning: if we're going to accept integer values, we should probably accept both 0-3 and the ABI values that are published in the MPI spec.
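
The suggestion to iterate over a table of accepted values, rather than bounds-check against the keyword count, could look roughly like the sketch below. The struct, table, and function names are hypothetical. Only the value 4096 for MPI_THREAD_MULTIPLE is quoted in this thread; the other ABI numbers are placeholders that would need to be checked against the MPI-5 document.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical table, not Open MPI's ompi_thread_level_values.
 * `legacy` holds the historical 0-3 values; `abi` holds assumed ABI values.
 * Only 4096 (multiple) is taken from the discussion; the rest are placeholders. */
struct thread_level_entry {
    const char *name;
    int legacy;
    int abi;
};

static const struct thread_level_entry thread_levels[] = {
    {"single",     0, 0},
    {"funneled",   1, 1024},   /* placeholder ABI value */
    {"serialized", 2, 2048},   /* placeholder ABI value */
    {"multiple",   3, 4096},   /* value mentioned in the review comment */
};

/* Returns the table index (0..3) whose legacy or ABI value equals `value`,
 * or -1 if the integer matches neither representation. */
int lookup_thread_level(int value)
{
    const size_t n = sizeof(thread_levels) / sizeof(thread_levels[0]);
    for (size_t i = 0; i < n; i++) {
        if (value == thread_levels[i].legacy || value == thread_levels[i].abi) {
            return (int) i;
        }
    }
    return -1;
}
```

With this shape, accepting a further integer encoding only requires another column in the table, and the validity check no longer assumes that accepted values are contiguous.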

bosilca and others added 2 commits July 1, 2025 06:47
This function supports prepositions (such as mpi_thread_) and partial
matching (such as "fun" for funneled).

Signed-off-by: George Bosilca <[email protected]>
…ages

Including, but not limited to:

* Added much more description of and distinction between the MPI world
  model and the MPI session model.  Updated a lot of old,
  pre-MPI-world-model/pre-MPI-session-model text that was now stale /
  outdated, especially in the following pages:
  * MPI_Init(3), MPI_Init_thread(3)
  * MPI_Initialized(3)
  * MPI_Finalize(3)
  * MPI_Finalized(3)
  * MPI_Session_init(3)
  * MPI_Session_finalize(3)
* Numerous formatting updates
* Slightly improve the C code examples
* Describe the mathematical relationship between the various
  MPI_THREAD_* constants in MPI_Init_thread(3)
  * Note that the mathematical relationships render nicely in HTML,
    but don't render entirely properly in nroff.  This commit author
    is of the opinion that the nroff rendering is currently "good
    enough", and some Sphinx maintainer will fix it someday.
* Add descriptions about the $OMPI_MPI_THREAD_LEVEL env variable and
  how it is used in MPI_Init_thread(3)
* Added more seealso links

Signed-off-by: Jeff Squyres <[email protected]>
@bosilca bosilca force-pushed the bugfix/env-thread-level-ignored branch from b3e2f68 to 47c8adc Compare July 1, 2025 10:48

The ``OMPI_MPI_THREAD_LEVEL`` environment variable can be set to one
of several different values, or in case of a string value to any unique
initial substring (identical to regex ^) of these values:
Member

How about:

The ``OMPI_MPI_THREAD_LEVEL`` environment variable can be set to any
of the values listed below.  If using one of the string values, any unique 
prefix of those values is sufficient (e.g., both ``F`` and ``FUNN`` will 
uniquely identify ``FUNNELED``, which is short for the 
``MPI_THREAD_FUNNELED`` value).
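
The unique-prefix rule in this wording can be illustrated with a small, self-contained sketch. The function name `resolve_unique_prefix` and the name table are hypothetical, and the ordering assumes that the index doubles as the numeric 0-3 level; this is not the code merged in the PR.

```c
#include <string.h>   /* strlen */
#include <strings.h>  /* strncasecmp (POSIX) */

/* Short names, ordered so that the index is the assumed 0-3 level. */
static const char *level_names[] = {"SINGLE", "FUNNELED", "SERIALIZED", "MULTIPLE"};

/* Returns the index of the only name that starts with `prefix`
 * (case-insensitive), or -1 if no name or more than one name matches. */
int resolve_unique_prefix(const char *prefix)
{
    const size_t len = strlen(prefix);
    int found = -1;
    if (0 == len) {
        return -1;  /* an empty prefix would match every name */
    }
    for (int i = 0; i < 4; i++) {
        if (0 == strncasecmp(level_names[i], prefix, len)) {
            if (-1 != found) {
                return -1;  /* second match: the prefix is ambiguous */
            }
            found = i;
        }
    }
    return found;
}
```

Under these assumptions, `F` and `FUNN` both resolve to FUNNELED, while `S` is rejected because it is a prefix of both SINGLE and SERIALIZED.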

Successfully merging this pull request may close these issues.

OMPI_MPI_THREAD_LEVEL cannot override 'required' in MPI_Init_thread
3 participants