Skip to content

8385328: [lworld] C2: assert(phi->_idx >= nodes_size()) failed: only new Phi per instance memory slice still happens after JDK-8374742#2489

Open
rwestrel wants to merge 8 commits into
openjdk:lworldfrom
rwestrel:JDK-8385328
Open

8385328: [lworld] C2: assert(phi->_idx >= nodes_size()) failed: only new Phi per instance memory slice still happens after JDK-8374742#2489
rwestrel wants to merge 8 commits into
openjdk:lworldfrom
rwestrel:JDK-8385328

Conversation

@rwestrel
Copy link
Copy Markdown
Collaborator

@rwestrel rwestrel commented May 29, 2026

The bug is caused by a mismatch in the way LoadFlat and StoreFlat
are handled in ConnectionGraph::find_inst_mem() and
MemNode::optimize_simple_memory_chain(). Once EA has run a first
time, found some value array allocations as non escaping, and removed
their LoadFlat/StoreFlat, igvn runs and is able to further
transform the memory subgraph which introduces inconsistencies that
are then caught the next time EA runs.

The fix tries to make sure ConnectionGraph::find_inst_mem() and
MemNode::optimize_simple_memory_chain() handle LoadFlat and
StoreFlat similarly.



Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Issue

  • JDK-8385328: [lworld] C2: assert(phi->_idx >= nodes_size()) failed: only new Phi per instance memory slice still happens after JDK-8374742 (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/2489/head:pull/2489
$ git checkout pull/2489

Update a local copy of the PR:
$ git checkout pull/2489
$ git pull https://git.openjdk.org/valhalla.git pull/2489/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 2489

View PR using the GUI difftool:
$ git pr show -t 2489

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/2489.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link
Copy Markdown

bridgekeeper Bot commented May 29, 2026

👋 Welcome back roland! A progress list of the required criteria for merging this PR into lworld will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link
Copy Markdown

openjdk Bot commented May 29, 2026

@rwestrel This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8385328: [lworld] C2: assert(phi->_idx >= nodes_size()) failed: only new Phi per instance memory slice still happens after JDK-8374742

Reviewed-by: thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 16 new commits pushed to the lworld branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the lworld branch, type /integrate in a new comment.

@rwestrel
Copy link
Copy Markdown
Collaborator Author

/template append

@openjdk
Copy link
Copy Markdown

openjdk Bot commented May 29, 2026

@rwestrel The pull request template has been appended to the pull request body

@openjdk
Copy link
Copy Markdown

openjdk Bot commented May 29, 2026

@rwestrel Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@openjdk openjdk Bot added the rfr Pull request is ready for review label May 29, 2026
@mlbridge
Copy link
Copy Markdown

mlbridge Bot commented May 29, 2026

Webrevs

Comment thread src/hotspot/share/opto/escape.cpp Outdated
} else if (proj_in->is_StoreFlat()) {
Node* base = proj_in->as_StoreFlat()->base();
const TypeOopPtr* t_store_base =igvn->type(base)->is_oopptr();
if (t_store_base->instance_id() != toop->instance_id()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment explaining what is happening here would be helpful. Why does StoreFlat need this check but LoadFlat does not?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments. Do they look ok to you?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the comments help, thanks!

Copy link
Copy Markdown
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@openjdk openjdk Bot added the ready Pull request is ready to be integrated label Jun 1, 2026
}
assert(is_strict_final_load || is_known_instance, "tested above");
// LoadFlat and StoreFlat cannot happen to strict final fields
// LoadFlat and StoreFlat to known instances are removed at the end of EA: this one is unrelated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked a particular LLM about this, and it thinks there could be a problem here because a C2_MISMATCHED StoreFlat is not removed, and EA only marks the fields as global escape and not the based object. Do you agree?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A StoreFlat is not a regular Store so it can't be C2_MISMATCHED.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case then we can add assert to check that in the StoreFlatNode ctor, and remove the check for C2_MISMATCHED at the beginning of StoreFlatNode::expand_non_atomic.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I missed that. It seems C2_MISMATCHED accesses come from unsafe.
Not sure why C2_MISMATCHED would be a problem with StoreFlat but for LoadFlat, I suppose it's safer to not step over a C2_MISMATCHED LoadFlat for the same allocation as alias_idx as that one is not going away.

Comment thread src/hotspot/share/opto/escape.cpp Outdated
// - or it is a StoreFlat for some object unrelated to alias_idx that can't modify the memory state for alias_idx
Node* base = proj_in->as_StoreFlat()->base();
const TypeOopPtr* t_store_base =igvn->type(base)->is_oopptr();
if (t_store_base->instance_id() != toop->instance_id()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it impossible to encounter InstanceBot here? This compare is only valid if both sides are known instances, right?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can nbe InstanceBot on the left hand side. AFAIU, InstanceBot doesn't mean any instance, it means any instance other than the ones that are known. So comparing InstanceBot with a known instance returns false as expected.
This said, one problem with this logic is that it gets types from igvn and they may not have propagated yet. I reworked it to not use the types from igvn.

@openjdk
Copy link
Copy Markdown

openjdk Bot commented Jun 2, 2026

@rwestrel this pull request can not be integrated into lworld due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8385328
git fetch https://git.openjdk.org/valhalla.git lworld
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge lworld"
git push

@openjdk openjdk Bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels Jun 2, 2026
@openjdk openjdk Bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Jun 2, 2026
Comment thread src/hotspot/share/opto/escape.cpp Outdated
return true;
}
}
assert(t_store_base->instance_id() != toop->instance_id(), "should not alias");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an EA expert, but I'm still not convinced that we can compare instance_id() her directly here without checking for is_known_instance() or unknown edge pointing to phantom_obj. Also, I think such a test would be better as TypeOopPtr helper method. (It is a pet peeve of mine that we expose the implementation detail of how instance IDs and alias IDs are integers.)
Maybe an EA expert like @vnkozlov could check this too.

Copy link
Copy Markdown
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few comments/questions.

Comment thread src/hotspot/share/opto/escape.cpp Outdated
//
#define FIND_INST_MEM_RECURSION_DEPTH_LIMIT 1000

bool ConnectionGraph::flat_access_aliases_with(Node* flat_access, const TypeOopPtr *toop) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment about what this method does.

#define FIND_INST_MEM_RECURSION_DEPTH_LIMIT 1000

bool ConnectionGraph::flat_access_aliases_with(Node* flat_access, const TypeOopPtr *toop) {
Node* base = flat_access->is_StoreFlat() ? flat_access->as_StoreFlat()->base() : flat_access->as_LoadFlat()->base();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What type of base node could be here?

Copy link
Copy Markdown
Collaborator Author

@rwestrel rwestrel Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base can be anything that's a pointer (at offset 0) but not an AddP AFAICT

Comment thread src/hotspot/share/opto/escape.cpp Outdated
// - this is a LoadFlat for alias_idx's non escaping allocation: it will get removed by
// ConnectionGraph::optimize_flat_accesses() and has no effect on the memory state
// - or it is a LoadFlat for some object unrelated to alias_idx
// In both cases, it's safe to assume this LoadFlat doesn't modify the memory for alias_idx
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which flat load can modify memory? I surprise that this node has Memory projection.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whey they are expanded they need memory barriers which is why I think they have a memory projection. If the LoadFlat is from a non escaping object, then it doesn't need memory barriers when it's expanded AFAIU.

Comment on lines +4652 to +4654
// If the LoadFlat is mismatched, it's not removed so don't step over it if it's a flat access to the alias_idx
// known instance
if (!proj_in->as_LoadFlat()->is_mismatched() || !flat_access_aliases_with(proj_in, toop)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it can be mismatched? Your whole comment above just confuse me.

What I see is in normal case (not mismatched) we always step over loads regardless memory slice it is on.
In mismatching case we skip if it is on different memory slice. That is it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mismatched when it comes from an unsafe call, for instance, if a flat field is read with unsafe at an offset that's not a constant so we don't really know what field is being read.

PointsToNode::EscapeState es = ptn->escape_state();
if (es >= PointsToNode::GlobalEscape) {
return false;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not check here? Asserts will not be needed then.

  const TypeOopPtr* t_base = _igvn->type(base)->is_oopptr();
  if (t_base->instance_id() == toop->instance_id()) {
    return true;
  }

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That method should return the same as:

t_base->instance_id() == toop->instance_id()

The problem is that igvn hasn't run so types haven't propagated and t_base has a good chance to be a value that's not up to date and can't be relied on.

So yes, what you suggest would work but if feels safer to me to checks that the types and the result of the method are consistent.

Comment thread src/hotspot/share/opto/escape.cpp Outdated

bool ConnectionGraph::flat_access_aliases_with(Node* flat_access, const TypeOopPtr *toop) {
Node* base = flat_access->is_StoreFlat() ? flat_access->as_StoreFlat()->base() : flat_access->as_LoadFlat()->base();
const TypeOopPtr* t_base = _igvn->type(base)->is_oopptr();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement is too early - far before uses.

@rwestrel
Copy link
Copy Markdown
Collaborator Author

rwestrel commented Jun 4, 2026

Few comments/questions.

Thanks for looking at this. I pushed a small update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready Pull request is ready to be integrated rfr Pull request is ready for review

Development

Successfully merging this pull request may close these issues.

4 participants