Skip to content

Conversation

@rbrchen
Copy link
Contributor

@rbrchen rbrchen commented Oct 27, 2025

Preliminary Float4 support for arrayViews. Currently only supported for F32ArrayPadded buffers, which can be accessed like a Float4[] as shown below:

Float4[] arr = buffer.float4ArrayView(); // for a F32ArrayPadded buffer

At the moment, to use an element in the array, the Float4 must be loaded into a separate variable first, i.e.

Float4 a = arr[index * 4];

A value can be stored into the array through the following syntax:

arr[index * 4] = ...

Progress

  • Change must not contain extraneous whitespace

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/babylon.git pull/646/head:pull/646
$ git checkout pull/646

Update a local copy of the PR:
$ git checkout pull/646
$ git pull https://git.openjdk.org/babylon.git pull/646/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 646

View PR using the GUI difftool:
$ git pr show -t 646

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/babylon/pull/646.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 27, 2025

👋 Welcome back rbrchen! A progress list of the required criteria for merging this PR into code-reflection will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 27, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

Copy link
Member

@jjfumero jjfumero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work. I have a few comments/questions.

All test passing also for my setup (OpenCL backend).

.space()
.oparen();
// if the value to be stored is an operation, recurse on the operation
if (hatVectorStoreView.operands().get(1) instanceof Op.Result r && r.op() instanceof HATVectorBinaryOp) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be automatically visited when HATVectorBinaryOp is found? Do you have the use case in which this is needed? The reason I am asking is that this also works with the prev. proposal for single Float4 loads./stores . So I wonder which pattern triggers the new check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to propagate this change also for the CUDA codegen.

Copy link
Contributor Author

@rbrchen rbrchen Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point; currently, a line like vC[index * 4] = Float4.add(vA[index * 4], vB[index * 4]) won't be handled properly (the add operation won't be recursed over, and the resulting code will look like vstore4(vA, 0, &c->array[index*4]) instead of storing the add operation's result).

(Edit: I originally thought that if we extracted the array accesses vA[index * 4] and vB[index * 4] into Float4 variables first before operating on them, the original code would work, but it doesn't seem to be the case)

There might also be something I'm missing that causes a case like vC[index * 4] = Float4.add(vA[index * 4], vB[index * 4]) to fail with the original code; I'll give it another pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, got it. Thanks for the clarification. So it looks to me that possibly there are some missing HATVectorVarLoadOps in this case.

For example:

Float4.add(vA[index * 4], vB[index * 4]), would have two HATVectorVarLoadOp loading a float4. But I get now, with the array views, we will see actually two plain arrays being loaded. So if that's the case, it makes sense to invoke recurse.

I am not suggesting to use HATVectorVarLoadOp, but it is good we have the cases identified.

// }
switch (op) {
case JavaOp.InvokeOp iop -> {
if (isVectorOperation(iop)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This HAT Phase is invoked after all vector operations have been inserted in the new code-model. So I wonder if this is actually triggered. My take is that, at this point in the compilation pipeline, all VectoLoads, Stores and VSelect are already in place. Is this correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct; this is leftover from when I was testing earlier haha. Good catch!

}
}
// handles only 1D and 2D arrays
case JavaOp.ArrayAccessOp.ArrayLoadOp alop -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take is that this phase is hitting this case for all the actual array view. If the prev. block still insert the dialect ops for vectors, we could substitute the prev.. pipeline of hat phases with this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting, are you proposing combining some of the hat phases into one larger overall phase using this case logic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the whole logic works, then yes, we can condense it in one phase. I personally prefer different compiler phases, so if something goes wrong, or we find a new use-case that does not match the patterns for one of the phases, we can easily detect phases involved. But not strong opinion.

vop.resultType(),
4,
HATVectorViewOp.VectorType.FLOAT4,
false, //TODO: fix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag was to indentify if the store is set to a private or shared array. In this case, we don't have a pointer.

Using this search will probably work:

    private boolean findIsSharedOrPrivateSpace(CoreOp.VarAccessOp.VarLoadOp varLoadOp) {
        return findIsSharedOrPrivateSpace(varLoadOp.operands().get(0));
    }

    private boolean findIsSharedOrPrivateSpace(Value v) {
        if (v instanceof Op.Result r && r.op() instanceof CoreOp.VarAccessOp.VarLoadOp varLoadOp) {
            return findIsSharedOrPrivateSpace(varLoadOp);
        } else  if (v instanceof CoreOp.Result r && (r.op() instanceof HATLocalVarOp || r.op() instanceof HATPrivateVarOp)) {
            return true;
        }else{
            return false;
        }
    }

I think in the future we can add a new Op for expressing global array accesses too (from parameters), so we can simplify how we handle different levels of memory within the code gen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants