Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flang][MLIR][OpenMP] - Add support for firstprivate when translating omp.target ops from MLIR to LLVMIR #131213

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

bhandarkar-pranav
Copy link
Contributor

This patch adds support to translate firstprivate clauses on omp.target ops when translating from MLIR to LLVMIR. Presently, this PR is restricted to supporting only included tasks, i.e #omp target nowait firstprivate(some_variable) will likely not work correctly even if it produces object code.

@llvmbot
Copy link
Member

llvmbot commented Mar 13, 2025

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: Pranav Bhandarkar (bhandarkar-pranav)

Changes

This patch adds support to translate firstprivate clauses on omp.target ops when translating from MLIR to LLVMIR. Presently, this PR is restricted to supporting only included tasks, i.e #omp target nowait firstprivate(some_variable) will likely not work correctly even if it produces object code.


Full diff: https://github.com/llvm/llvm-project/pull/131213.diff

5 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/DataSharingProcessor.cpp (+1-1)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+9-3)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+8-7)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+8-13)
  • (modified) mlir/test/Target/LLVMIR/openmp-target-private.mlir (+92-20)
diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
index 781b0dfceff9e..59d4a5563d3de 100644
--- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
@@ -25,6 +25,7 @@
 #include "flang/Semantics/attr.h"
 #include "flang/Semantics/tools.h"
 
+#define DEBUG_TYPE "flang-data-sharing"
 namespace Fortran {
 namespace lower {
 namespace omp {
@@ -548,7 +549,6 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym,
   }
 
   mlir::Type argType = privVal.getType();
-
   mlir::omp::PrivateClauseOp privatizerOp = [&]() {
     auto moduleOp = firOpBuilder.getModule();
     auto uniquePrivatizerName = fir::getTypeAsString(
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index b1568cc12a05a..0094abe044b86 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1666,13 +1666,19 @@ static void genTargetClauses(
   cp.processNowait(clauseOps);
   cp.processThreadLimit(stmtCtx, clauseOps);
 
-  cp.processTODO<clause::Allocate, clause::Defaultmap, clause::Firstprivate,
-                 clause::InReduction, clause::UsesAllocators>(
-      loc, llvm::omp::Directive::OMPD_target);
+  cp.processTODO<clause::Allocate, clause::Defaultmap, clause::InReduction,
+                 clause::UsesAllocators>(loc,
+                                         llvm::omp::Directive::OMPD_target);
 
   // `target private(..)` is only supported in delayed privatization mode.
   if (!enableDelayedPrivatizationStaging)
     cp.processTODO<clause::Private>(loc, llvm::omp::Directive::OMPD_target);
+
+  // We do not yet have MLIR to LLVMIR translation for privatization in
+  // for deferred target tasks.
+  if (clauseOps.nowait)
+    cp.processTODO<clause::Private, clause::Firstprivate>(
+        loc, llvm::omp::Directive::OMPD_target);
 }
 
 static void genTargetDataClauses(
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index f5a8a7ba04375..bac62da067db5 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -81,7 +81,7 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]>
 
     There are no restrictions on the body except for:
     - The `dealloc` regions has a single argument.
-    - The `init & `copy` regions have 2 arguments.
+    - The `init` & `copy` regions have 2 arguments.
     - All three regions are terminated by `omp.yield` ops.
     The above restrictions and other obvious restrictions (e.g. verifying the
     type of yielded values) are verified by the custom op verifier. The actual
@@ -90,15 +90,15 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]>
     Instances of this op would then be used by ops that model directives that
     accept data-sharing attribute clauses.
 
-    The $sym_name attribute provides a symbol by which the privatizer op can be
+    The `sym_name` attribute provides a symbol by which the privatizer op can be
     referenced by other dialect ops.
 
-    The $type attribute is the type of the value being privatized. This type
+    The `type` attribute is the type of the value being privatized. This type
     will be implicitly allocated in MLIR->LLVMIR conversion and passed as the
     second argument to the init region. Therefore the type of arguments to
-    the regions should be a type which represents a pointer to $type.
+    the regions should be a type which represents a pointer to `type`.
 
-    The $data_sharing_type attribute specifies whether privatizer corresponds
+    The `data_sharing_type` attribute specifies whether privatizer corresponds
     to a `private` or a `firstprivate` clause.
   }];
 
@@ -161,9 +161,10 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]>
     /// needsMap returns true if the value being privatized should additionally
     /// be mapped to the target region using a MapInfoOp. This is most common
     /// when an allocatable is privatized. In such cases, the descriptor is used
-    /// in privatization and needs to be mapped on to the device.
+    /// in privatization and needs to be mapped on to the device. The use of
+    /// firstprivate also causes the need to map the host variable to the device.
     bool needsMap() {
-      return initReadsFromMold();
+      return readsFromMold();
     }
 
     /// Get the type for arguments to nested regions. This should
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 32c7c501d03c3..87b7ab9b6b0a3 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -44,6 +44,7 @@
 #include <optional>
 #include <utility>
 
+#define DEBUG_TYPE "openmp-to-llvmir"
 using namespace mlir;
 
 namespace {
@@ -211,19 +212,9 @@ static LogicalResult checkImplementationStatus(Operation &op) {
   };
   auto checkPrivate = [&todo](auto op, LogicalResult &result) {
     if constexpr (std::is_same_v<std::decay_t<decltype(op)>, omp::TargetOp>) {
-      // Privatization clauses are supported, except on some situations, so we
-      // need to check here whether any of these unsupported cases are being
-      // translated.
-      if (std::optional<ArrayAttr> privateSyms = op.getPrivateSyms()) {
-        for (Attribute privatizerNameAttr : *privateSyms) {
-          omp::PrivateClauseOp privatizer = findPrivatizer(
-              op.getOperation(), cast<SymbolRefAttr>(privatizerNameAttr));
-
-          if (privatizer.getDataSharingType() ==
-              omp::DataSharingClauseType::FirstPrivate)
-            result = todo("firstprivate");
-        }
-      }
+      // Privatization is supported only for include target tasks.
+      if (!op.getPrivateVars().empty() && op.getNowait())
+        result = todo("privatization for deferred target tasks");
     } else {
       if (!op.getPrivateVars().empty() || op.getPrivateSyms())
         result = todo("privatization");
@@ -4810,6 +4801,10 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
             .failed())
       return llvm::make_error<PreviouslyReportedError>();
 
+    if (failed(copyFirstPrivateVars(builder, moduleTranslation, mlirPrivateVars,
+                                    llvmPrivateVars, privateDecls)))
+      return llvm::make_error<PreviouslyReportedError>();
+
     SmallVector<Region *> privateCleanupRegions;
     llvm::transform(privateDecls, std::back_inserter(privateCleanupRegions),
                     [](omp::PrivateClauseOp privatizer) {
diff --git a/mlir/test/Target/LLVMIR/openmp-target-private.mlir b/mlir/test/Target/LLVMIR/openmp-target-private.mlir
index f97360e2c6e84..0e651482647f0 100644
--- a/mlir/test/Target/LLVMIR/openmp-target-private.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-target-private.mlir
@@ -18,11 +18,6 @@ llvm.func @target_map_single_private() attributes {fir.internal_name = "_QPtarge
   }
   llvm.return
 }
-// CHECK: define internal void @__omp_offloading_
-// CHECK-NOT: define {{.*}}
-// CHECK: %[[PRIV_ALLOC:.*]] = alloca i32, align 4
-// CHECK: %[[ADD:.*]] = add i32 {{.*}}, 10
-// CHECK: store i32 %[[ADD]], ptr %[[PRIV_ALLOC]], align 4
 
 omp.private {type = private} @n.privatizer : f32
 
@@ -50,16 +45,6 @@ llvm.func @target_map_2_privates() attributes {fir.internal_name = "_QPtarget_ma
 }
 
 
-// CHECK: define internal void @__omp_offloading_
-// CHECK: %[[PRIV_I32_ALLOC:.*]] = alloca i32, align 4
-// CHECK: %[[PRIV_FLOAT_ALLOC:.*]] = alloca float, align 4
-// CHECK: %[[ADD_I32:.*]] = add i32 {{.*}}, 10
-// CHECK: store i32 %[[ADD_I32]], ptr %[[PRIV_I32_ALLOC]], align 4
-// CHECK: %[[LOAD_I32_AGAIN:.*]] = load i32, ptr %[[PRIV_I32_ALLOC]], align 4
-// CHECK: %[[CAST_TO_FLOAT:.*]] = sitofp i32 %[[LOAD_I32_AGAIN]] to float
-// CHECK: %[[ADD_FLOAT:.*]] = fadd contract float %[[CAST_TO_FLOAT]], 1.100000e+01
-// CHECK: store float %[[ADD_FLOAT]], ptr %[[PRIV_FLOAT_ALLOC]], align 4
-
 // An entirely artifical privatizer that is meant to check multi-block
 // privatizers. The idea here is to prove that we set the correct
 // insertion points for the builder when generating, first, LLVM IR for the
@@ -79,10 +64,6 @@ llvm.func @target_op_private_multi_block(%arg0: !llvm.ptr) {
   }
   llvm.return
 }
-// CHECK: define internal void @__omp_offloading_
-// CHECK: %[[PRIV_ALLOC:.*]] = alloca float, align 4
-// CHECK: %[[PHI_ALLOCA:.*]]  = phi ptr [ %[[PRIV_ALLOC]], {{.*}} ]
-// CHECK: %[[RESULT:.*]] = load float, ptr %[[PHI_ALLOCA]], align 4
 
 // Descriptors are needed for CHARACTER arrays and their type is
 // !fir.boxchar<KIND>. When such arrays are used in the private construct, the
@@ -165,10 +146,101 @@ llvm.func @llvm.memmove.p0.p0.i64(!llvm.ptr, !llvm.ptr, i64, i1) attributes {sym
 
 
 
-// CHECK: define internal void @__omp_offloading_{{.*}}(ptr %{{[^,]+}}, ptr %[[MAPPED_ARG:.*]]) {
+omp.private {type = firstprivate} @sf.firstprivate : f32 copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+  %0 = llvm.load %arg0 : !llvm.ptr -> f32
+  llvm.store %0, %arg1 : f32, !llvm.ptr
+  omp.yield(%arg1 : !llvm.ptr)
+}
+omp.private {type = firstprivate} @sv.firstprivate : i32 copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+  %0 = llvm.load %arg0 : !llvm.ptr -> i32
+  llvm.store %0, %arg1 : i32, !llvm.ptr
+  omp.yield(%arg1 : !llvm.ptr)
+}
+llvm.func @target_simple_() attributes {fir.internal_name = "_QPtarget_simple"} {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %sv = llvm.alloca %0 x i32 {bindc_name = "sv"} : (i64) -> !llvm.ptr
+  %sf = llvm.alloca %0 x f32 {bindc_name = "sf"} : (i64) -> !llvm.ptr
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  %5 = llvm.mlir.constant(1 : i64) : i64
+  %6 = omp.map.info var_ptr(%sv : !llvm.ptr, i32) map_clauses(to) capture(ByRef) -> !llvm.ptr
+  %7 = omp.map.info var_ptr(%sf : !llvm.ptr, f32) map_clauses(to) capture(ByRef) -> !llvm.ptr
+  omp.target map_entries(%6 -> %arg0, %7 -> %arg1 : !llvm.ptr, !llvm.ptr) private(@sv.firstprivate %sv -> %arg2 [map_idx=0], @sf.firstprivate %sf -> %arg3 [map_idx=1] : !llvm.ptr, !llvm.ptr) {
+    %8 = llvm.mlir.constant(2.000000e+00 : f64) : f64
+    %9 = llvm.mlir.constant(10 : i32) : i32
+    %10 = llvm.load %arg2 : !llvm.ptr -> i32
+    %11 = llvm.add %10, %9 : i32
+    llvm.store %11, %arg2 : i32, !llvm.ptr
+    %12 = llvm.load %arg3 : !llvm.ptr -> f32
+    %13 = llvm.fpext %12 : f32 to f64
+    %14 = llvm.fadd %13, %8 {fastmathFlags = #llvm.fastmath<contract>} : f64
+    %15 = llvm.fptrunc %14 : f64 to f32
+    llvm.store %15, %arg3 : f32, !llvm.ptr
+    omp.terminator
+  }
+  llvm.return
+}
+// CHECK: define void @target_map_single_private() {
+// CHECK: call void @__omp_offloading_[[MAP_SINGLE_PRIVATE_OFFLOADED_FUNCTION:.*]](ptr {{.*}})
+// CHECK: define void @target_map_2_privates() {
+// CHECK: call void @__omp_offloading_[[MAP_2_PRIVATES_OFFLOADED_FUNCTION:.*]](ptr {{.*}})
+// CHECK: define void @target_op_private_multi_block
+// CHECK: call void @__omp_offloading_[[PRIVATE_MULTI_BLOCK_OFFLOADED_FUNCTION:.*]]()
+// CHECK: define void @target_boxchar_
+// CHECK: call void @__omp_offloading_[[BOXCHAR_OFFLOADED_FUNCTION:.*]](ptr {{.*}}, ptr {{.*}})
+// CHECK: define void @target_simple_()
+// CHECK: call void @__omp_offloading_[[SIMPLE_OFFLOADED_FUNCTION:.*]](ptr {{.*}}, ptr {{.*}})
+
+// CHECK: define internal void @__omp_offloading_[[MAP_SINGLE_PRIVATE_OFFLOADED_FUNCTION]]
+// CHECK: %[[PRIV_ALLOC:.*]] = alloca i32, align 4
+// CHECK: %[[ADD:.*]] = add i32 {{.*}}, 10
+// CHECK: store i32 %[[ADD]], ptr %[[PRIV_ALLOC]], align 4
+
+
+
+
+// CHECK: define internal void @__omp_offloading_[[MAP_2_PRIVATES_OFFLOADED_FUNCTION]]
+// CHECK: %[[PRIV_I32_ALLOC:.*]] = alloca i32, align 4
+// CHECK: %[[PRIV_FLOAT_ALLOC:.*]] = alloca float, align 4
+// CHECK: %[[ADD_I32:.*]] = add i32 {{.*}}, 10
+// CHECK: store i32 %[[ADD_I32]], ptr %[[PRIV_I32_ALLOC]], align 4
+// CHECK: %[[LOAD_I32_AGAIN:.*]] = load i32, ptr %[[PRIV_I32_ALLOC]], align 4
+// CHECK: %[[CAST_TO_FLOAT:.*]] = sitofp i32 %[[LOAD_I32_AGAIN]] to float
+// CHECK: %[[ADD_FLOAT:.*]] = fadd contract float %[[CAST_TO_FLOAT]], 1.100000e+01
+// CHECK: store float %[[ADD_FLOAT]], ptr %[[PRIV_FLOAT_ALLOC]], align 4
+
+// CHECK: define internal void @__omp_offloading_[[PRIVATE_MULTI_BLOCK_OFFLOADED_FUNCTION]]
+// CHECK: %[[PRIV_ALLOC:.*]] = alloca float, align 4
+// CHECK: %[[PHI_ALLOCA:.*]]  = phi ptr [ %[[PRIV_ALLOC]], {{.*}} ]
+// CHECK: %[[RESULT:.*]] = load float, ptr %[[PHI_ALLOCA]], align 4
+
+
+// CHECK: define internal void @__omp_offloading_[[BOXCHAR_OFFLOADED_FUNCTION]](ptr %{{[^,]+}}, ptr %[[MAPPED_ARG:.*]]) {
 // CHECK: %[[BOXCHAR:.*]] = load { ptr, i64 }, ptr %[[MAPPED_ARG]]
 // CHECK: %[[BOXCHAR_PTR:.*]] = extractvalue { ptr, i64 } %[[BOXCHAR]], 0
 // CHECK: %[[BOXCHAR_i64:.*]] = extractvalue { ptr, i64 } %[[BOXCHAR]], 1
 // CHECK: %[[MEM_ALLOC:.*]] = alloca i8, i64 %[[BOXCHAR_i64]]
 // CHECK: %[[PRIV_BOXCHAR0:.*]] = insertvalue { ptr, i64 } undef, ptr %[[MEM_ALLOC]], 0
 // CHECK: %[[PRIV_BOXCHAR1:.*]] = insertvalue { ptr, i64 } %[[PRIV_BOXCHAR0]], i64 %[[BOXCHAR_i64]], 1
+
+
+// CHECK: define internal void @__omp_offloading_[[SIMPLE_OFFLOADED_FUNCTION]](ptr %[[SV:.*]], ptr %[[SF:.*]])
+// CHECK: entry:
+// CHECK-NEXT: %[[SV_PRIV_ALLOCA:.*]] = alloca i32, align 4
+// CHECK-NEXT: %[[SF_PRIV_ALLOCA:.*]] = alloca float, align 4
+// CHECK: omp.private.copy:
+// CHECK-NEXT: %[[INIT_SV:.*]] = load i32, ptr %[[SV]], align 4
+// CHECK-NEXT: store i32 %[[INIT_SV]], ptr %[[SV_PRIV_ALLOCA]], align 4
+// CHECK:  %[[INIT_SF:.*]] = load float, ptr %[[SF]], align 4
+// CHECK-NEXT  store float %[[INIT_SF]], ptr %[[SF_PRIV_ALLOCA]], align 4
+// CHECK: omp.target
+// CHECK: %[[LOAD_SV:.*]] = load i32, ptr %[[SV_PRIV_ALLOCA]], align 4
+// CHECK-NEXT: %[[ADD_SV:.*]] = add i32 %[[LOAD_SV]], 10
+// CHECK-NEXT: store i32 %[[ADD_SV]], ptr %[[SV_PRIV_ALLOCA]], align 4
+// CHECK: %[[LOAD_SF:.*]] = load float, ptr %[[SF_PRIV_ALLOCA]], align 4
+// CHECK-NEXT: %[[SF_EXT:.*]] = fpext float %[[LOAD_SF]] to double
+// CHECK-NEXT: %[[ADD_SF:.*]] = fadd contract double %[[SF_EXT]], 2.000000e+00
+// CHECK-NEXT: %[[TRUNC_SF:.*]] = fptrunc double %[[ADD_SF]] to float
+// CHECK-NEXT:  store float %[[TRUNC_SF]], ptr %[[SF_PRIV_ALLOCA]], align 4
+

Copy link
Member

@ergawy ergawy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Pranav! Just a few small comments from my side.

loc, llvm::omp::Directive::OMPD_target);
cp.processTODO<clause::Allocate, clause::Defaultmap, clause::InReduction,
clause::UsesAllocators>(loc,
llvm::omp::Directive::OMPD_target);

// `target private(..)` is only supported in delayed privatization mode.
if (!enableDelayedPrivatizationStaging)
cp.processTODO<clause::Private>(loc, llvm::omp::Directive::OMPD_target);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until we flip the switch and enable delayed privatization for target by default, we need to list firstprivate here otherwise we will enable it accidently while disabling private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work so far on this

@bhandarkar-pranav
Copy link
Contributor Author

My end-to-end tests exposed some problems in firstprivate lowering (not just the changes in this PR). I have fixed all but one of them. Once I have fixed the last of these problems, I'll update this PR.

Copy link
Member

@ergawy ergawy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks Pranav.

@bhandarkar-pranav bhandarkar-pranav force-pushed the flang/target_firstprivate branch from d13edc9 to 9c10941 Compare March 27, 2025 22:22
Copy link

github-actions bot commented Mar 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@bhandarkar-pranav
Copy link
Contributor Author

@tblah and @ergawy - I have updated my PR with fixes for problems that some of my synthetic end-to-end tests had exposed. Please take a look at this again. Thank you.

@@ -92,7 +103,7 @@ class MapsForPrivatizedSymbolsPass
/*varPtrPtr=*/Value{},
/*members=*/SmallVector<Value>{},
/*member_index=*/mlir::ArrayAttr{},
/*bounds=*/ValueRange{},
/*bounds=*/boundsOps.empty() ? SmallVector<Value>{} : boundsOps,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary. An empty boundsOps is already an empty SmallVector<Value>.

Comment on lines +185 to +186
}
void genBoundsOps(fir::FirOpBuilder &builder, mlir::Value var,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ultra-nit

Suggested change
}
void genBoundsOps(fir::FirOpBuilder &builder, mlir::Value var,
}
void genBoundsOps(fir::FirOpBuilder &builder, mlir::Value var,

Comment on lines +202 to +203
auto normalizedLB = builder.create<mlir::arith::ConstantOp>(
loc, idxTy, builder.getIntegerAttr(idxTy, 0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is constant across iterations so you could move it outside of the loop and generate fewer constants like you did with one.

mlir::Value lb = dimInfo.getLowerBound();
mlir::Value extent = dimInfo.getExtent();
mlir::Value byteStride = dimInfo.getByteStride();
mlir::Value ub = builder.create<mlir::arith::SubIOp>(loc, extent, one);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you subtract 1 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the extent is the number of elements. So, the upper_bound is (extent -1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that depend upon what the lower bound is?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants