XiaohongGong commented Jan 6, 2026

Problem:

Test compiler/vectorapi/VectorMaskToLongTest.java crashes intermittently (approximately once per 200+ runs) with stress VM options such as -XX:+StressIGVN:

// A fatal error has been detected by the Java Runtime Environment:
//
// Internal Error (jdk/src/hotspot/share/opto/type.hpp:2287), pid=69056, tid=28419
// assert(_base >= VectorMask && _base <= VectorZ) failed: Not a Vector
// ...

The crash occurs in the following code, when is_vect() is called inside the assertion added by JDK-8367292 [1]:

if (in1->Opcode() == Op_VectorStoreMask) {
  Node* mask = in1->in(1);
  assert(!Matcher::mask_op_prefers_predicate(Opcode(), mask->bottom_type()->is_vect()), "sanity");
  in1 = mask;
}

Root Cause:

The mask's type becomes TOP (unreachable) during compiler optimizations when the mask node is marked as dead before all its users are removed from the ideal graph. If Ideal() is subsequently called on a user node, it may access the TOP type, triggering the assertion.

Here is the simplified ideal graph showing the crash scenario:

      Con #top
       |       ConI
         \      /
           \  /
     VectorStoreMask
             |
         VectorMaskToLong  # !jvms: IntMaxVector$IntMaxMask::toLong

Detailed Scenario:

The following method in the test case hits the assertion:

public static void testMaskAllToLong(VectorSpecies<?> species) {
    int vlen = species.length();
    long inputLong = 0L;
    // fromLong is expected to be converted to maskAll.
    long got = VectorMask.fromLong(species, inputLong).toLong();
    verifyMaskToLong(species, inputLong, got);
}

This method accepts a VectorSpecies<?> parameter and calls the Vector API methods VectorMask.fromLong() and toLong(). It is called with species ranging from ByteVector.SPECIES_MAX to DoubleVector.SPECIES_MAX. During compilation, C2 speculatively generates fast paths for toLong() for all possible species.

When compiling a specific test case such as:

public static void testMaskAllToLongDouble() {
    testMaskAllToLong(D_SPECIES);
}

the compiler inlines the method and attempts to optimize away unreachable branches. The following graph shows the situation before the mask becomes TOP:

                     VectorBox # DoubleMaxMask, generated by VectorMask.fromLong()
                       /    \
                     AddP     \
                      |         \
                  LoadNClass      \
   ConP #IntMaxMask    |            |
      \                |             |
        \        DecodeNClass       |
          \       /                |
            \   /                 |
             CmpP                |
              |                 |
             Bool #ne          |
              |              /
             If            /
              |          /
           IfFalse     /
              |      /
              |    /
          CheckCastPP  # IntMaxMask
              |
         VectorUnbox  # Start of inlining IntMaxMask::toLong()
              |
               \     ConI
                \    /
           VectorStoreMask
                   |
            VectorMaskToLong

The generated mask (VectorBox) is a DoubleMaxMask, but the code path expects an IntMaxMask for IntMaxMask::toLong(). Since this is an unreachable branch, the control input of CheckCastPP becomes TOP during IGVN, propagating the TOP type to subsequent data nodes until reaching VectorStoreMask. VectorStoreMask has another non-TOP input (ConI), which stops further TOP propagation.

With stress VM options, the IGVN worklist order is shuffled, causing VectorMaskToLongNode::Ideal() to be invoked before dead path cleanup completes, which triggers the assertion failure.
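The order dependence can be illustrated with a toy model. This is only a sketch, not C2 code: `ToyStep` and `ideal_sees_top` are hypothetical names, and the two-step worklist stands in for the real IGVN worklist. It assumes the mask is already TOP and that VectorStoreMask is still in the graph until its own cleanup step runs.

```cpp
#include <cassert>
#include <vector>

// Toy model (not HotSpot code): the two worklist entries that matter here.
enum ToyStep {
  CleanupStoreMask,  // dead-path cleanup removes VectorStoreMask
  IdealMaskToLong    // VectorMaskToLong's Ideal() looks through VectorStoreMask
};

// Returns true if the Ideal() step would observe a TOP mask type,
// assuming the mask feeding VectorStoreMask is already TOP.
inline bool ideal_sees_top(const std::vector<ToyStep>& worklist_order) {
  bool store_mask_removed = false;
  for (ToyStep step : worklist_order) {
    if (step == CleanupStoreMask) {
      store_mask_removed = true;   // the user of the dead mask is gone
    } else if (step == IdealMaskToLong && !store_mask_removed) {
      return true;                 // Ideal() ran first and found the TOP input
    }
  }
  return false;
}
```

In this model, only the order in which Ideal() runs before the cleanup exposes the TOP type, which mirrors why the failure appears only under some shuffled worklist orders.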

Solution:

Replace is_vect() with the safer isa_vect(), which checks whether the type is a vector type before casting and returns nullptr if it is not. Additionally, check for nullptr and skip the optimization if the type check fails.
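The difference between the two casts can be sketched with a simplified stand-in for the type hierarchy. This is an illustrative model under stated assumptions, not the HotSpot classes: the `Base` enum, the `length` field, and the helper `can_look_through_store_mask` are hypothetical, and only the assert-versus-nullptr behavior is taken from the description above.

```cpp
#include <cassert>

// Simplified stand-in for C2's type kinds; not the real HotSpot enum.
enum class Base { Top, Int, VectorMask, VectorA, VectorZ };

struct TypeVect;

struct Type {
  Base base;
  explicit Type(Base b) : base(b) {}

  bool is_vector_base() const {
    return base >= Base::VectorMask && base <= Base::VectorZ;
  }

  // Checked cast: asserts that the type is a vector type.
  const TypeVect* is_vect() const;

  // Query cast: returns nullptr when the type is not a vector type.
  const TypeVect* isa_vect() const;
};

struct TypeVect : Type {
  int length;  // hypothetical field, for illustration only
  TypeVect(Base b, int len) : Type(b), length(len) {}
};

inline const TypeVect* Type::is_vect() const {
  assert(is_vector_base() && "Not a Vector");
  return static_cast<const TypeVect*>(this);
}

inline const TypeVect* Type::isa_vect() const {
  return is_vector_base() ? static_cast<const TypeVect*>(this) : nullptr;
}

// Hypothetical guard in the spirit of the fix: skip the optimization
// when the mask's type is not a vector type (for example TOP on a dead path).
inline bool can_look_through_store_mask(const Type* mask_type) {
  const TypeVect* vt = mask_type->isa_vect();
  if (vt == nullptr) {
    return false;  // dead input not yet cleaned up; do nothing
  }
  return vt->length > 0;
}
```

With a TOP type, `isa_vect()` yields nullptr and the guard declines to optimize, whereas `is_vect()` would trip the "Not a Vector" assertion in a debug build.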

An alternative solution would be to detect top inputs during IGVN for the relevant vector nodes and skip certain optimizations when such inputs are encountered. That is probably the right long-term direction. However, because this handling is currently missing for all vector nodes, I'd like to leave it as a separate follow-up topic for discussion.
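A rough sketch of what such a top-input check could look like, using a toy node type rather than C2's Node class. `ToyNode`, `has_top_input`, and `toy_ideal` are hypothetical names introduced only for illustration; how the real check would be wired into the vector nodes is left open here.

```cpp
#include <cassert>
#include <vector>

// Toy node (not C2's Node): a dead flag plus a list of inputs.
struct ToyNode {
  bool is_top = false;
  std::vector<const ToyNode*> inputs;
};

// True if any present input of the node is TOP.
inline bool has_top_input(const ToyNode& n) {
  for (const ToyNode* in : n.inputs) {
    if (in != nullptr && in->is_top) {
      return true;
    }
  }
  return false;
}

// Hypothetical Ideal-like step: returns nullptr ("no change") when a TOP
// input is seen, otherwise returns the first input as a stand-in result.
inline const ToyNode* toy_ideal(const ToyNode& n) {
  if (has_top_input(n)) {
    return nullptr;  // leave the node for dead-path cleanup
  }
  return n.inputs.empty() ? nullptr : n.inputs.front();
}
```

The point of the design is that the early return happens before any type is inspected, so no per-optimization type check would be needed.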

Testing:

Ran the test 800 times on SVE/NEON/AVX2 systems with no failures observed.

Note that no new test case was added because I was unable to reproduce this issue reliably. The issue depends on a specific IGVN optimization sequence that occurs non-deterministically due to the worklist shuffling behavior under stress VM options.

[1] https://bugs.openjdk.org/browse/JDK-8367292


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8374043: C2: assert(_base >= VectorMask && _base <= VectorZ) failed: Not a Vector (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/29057/head:pull/29057
$ git checkout pull/29057

Update a local copy of the PR:
$ git checkout pull/29057
$ git pull https://git.openjdk.org/jdk.git pull/29057/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 29057

View PR using the GUI difftool:
$ git pr show -t 29057

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/29057.diff

bridgekeeper bot commented Jan 6, 2026

👋 Welcome back xgong! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Jan 6, 2026

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title 8374043: C2: assert(_base >= VectorMask && _base <= VectorZ) failed: Not a Vector 8374043: C2: assert(_base >= VectorMask && _base <= VectorZ) failed: Not a Vector Jan 6, 2026

openjdk bot commented Jan 6, 2026

@XiaohongGong The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 6, 2026

mlbridge bot commented Jan 6, 2026

Webrevs

mhaessig (Contributor) commented Jan 8, 2026

A drive-by comment on the reproducibility:

  • Does this only reproduce for specific hardware features or on all relatively new vector instruction sets?
  • Have you tried to reproduce this using the StressSeed flag? In the hs-error file you should find it with all the hotspot flags and rerunning the test with that seed often leads to a reproducible failure.

XiaohongGong (Author) commented

A drive-by comment on the reproducibility:

  • Does this only reproduce for specific hardware features or on all relatively new vector instruction sets?

Thanks for looking at this PR! This can be reproduced on hardware that 1) has good backend support for the Vector API, and 2) does not support predicate features such as those of AVX-512 and RVV.

  • Have you tried to reproduce this using the StressSeed flag? In the hs-error file you should find it with all the hotspot flags and rerunning the test with that seed often leads to a reproducible failure.

Yes, I can reproduce this issue with all the stress flags reported in the hs-error file, but only with the existing test case, and the failure still occurs randomly. I also tried rerunning with the seed from the failing run, but could not reproduce the failure with it.
