Commit 4f06546

[0035] Fix various VectorAccumlator details (#875)
Fixes the following issues with the VectorAccumulate spec that were discovered during implementation:

- ~~Parameter order in the header was inconsistent~~ (this change was reverted based on internal discussion)
- DXIL op name was inconsistent
- DXIL op needed an `align` parameter

Fixes #868
1 parent 70c317a commit 4f06546

1 file changed

Lines changed: 14 additions & 12 deletions

proposals/0035-linalg-matrix.md

@@ -287,7 +287,7 @@ OuterProduct(vector<InputElTy, M> VecA, vector<InputElTy, N> VecB);
 template <typename InputElTy, SIZE_TYPE M>
 typename hlsl::enable_if<hlsl::is_arithmetic<InputElTy>::value, void>::type
 InterlockedAccumulate(vector<InputElTy, M> Vec, RWByteAddressBuffer Res,
-                      uint StartOffset);
+                      uint StartOffset, uint Align = 64);
 
 } // namespace linalg
 } // namespace dx
@@ -1082,7 +1082,7 @@ provided matrix argument into the accumulator matrix.
 template <typename InputElTy, SIZE_TYPE M>
 typename hlsl::enable_if<hlsl::is_arithmetic<InputElTy>::value, void>::type
 InterlockedAccumulate(vector<InputElTy, M> Vec, RWByteAddressBuffer Res,
-                      uint StartOffset);
+                      uint StartOffset, uint Align = 64);
 ```
 
 Atomically adds the vector data of `Vec` to the `RWByteAddressBuffer` target
@@ -1583,7 +1583,7 @@ declare <[NUMo] x [TYo]> @dx.op.linAlgMatVecMul.v[NUMo][TYo].[MatTy].v[NUMi][TYi
   immarg i1, ; is output signed
   <[NUMi] x [TYi]>, ; input vector
   immarg i32 ; input interpretation type (DXIL::ComponentType)
-)
+)
 ```
 
 This operation implements a column-vector multiplication against an `A` matrix
@@ -1607,7 +1607,7 @@ declare <[NUMo] x [TYo]> @dx.op.linAlgMatVecMulAdd.v[NUMo][TYo].[MatTy].v[NUMi][
   immarg i32, ; input interpretation type (DXIL::ComponentType)
   <[NUMo] x [TYb]>, ; bias vector
   immarg i32 ; bias interpretation type (DXIL::ComponentType)
-)
+)
 ```
 
 This operation implements a column-vector multiplication against an `A` matrix
@@ -1695,7 +1695,7 @@ declare <[NUMo] x [TYo]> @dx.op.linAlgConvert.v[NUMo][TYo].v[NUMi][TYi](
   <[NUMi] x [TYi]>, ; input vector
   immarg i32, ; input interpretation type (DXIL::ComponentType)
   immarg i32 ; output interpretation type (DXIL::ComponentType)
-)
+)
 ```
 
 Converts an input vector containing data of the input interpretation type to a
@@ -1776,11 +1776,13 @@ represent all values of the format used in the shader's DXIL.
 > FP type, this may cause expected behavior differences.
 
 ``` llvm
-declare void @dx.op.vectorAccumulateToDescriptor.v[NUM][TY](
-  immarg i32, ; opcode
-  <[NUM] x [TY]>, ; input vector
-  %dx.types.Handle, ; destination RWByteAddressBuffer
-  i32) ; buffer offset
+declare void @dx.op.linAlgVectorAccumulateToDescriptor.v[NUM][TY](
+  immarg i32, ; opcode
+  <[NUM] x [TY]>, ; input vector
+  %dx.types.Handle, ; destination RWByteAddressBuffer
+  i32, ; buffer offset
+  i32 ; vector element alignment
+)
 ```
 
 Accumulates a vector to a RWByteAddressBuffer at a specified offset. Each
@@ -1800,7 +1802,7 @@ elements to the default value.
 
 The `@dx.op.linAlgMatrixStoreToDescriptor`,
 `@dx.op.linAlgMatrixAccumulateToDescriptor`, and
-`@dx.op.vectorAccumulateToDescriptor` operations write data to a
+`@dx.op.linAlgVectorAccumulateToDescriptor` operations write data to a
 descriptor. Writes to out of bounds memory are a no-op. An implementation may
 either perform bounds checking on the full bounds of the store converting the
 whole store to a no-op if any elelemt is out of bounds, or it may perform
@@ -2266,7 +2268,7 @@ OuterProduct(vector<InputElTy, M> VecA, vector<InputElTy, N> VecB);
 template <typename InputElTy, SIZE_TYPE M>
 typename hlsl::enable_if<hlsl::is_arithmetic<InputElTy>::value, void>::type
 InterlockedAccumulate(vector<InputElTy, M> Vec, RWByteAddressBuffer Res,
-                      uint StartOffset);
+                      uint StartOffset, uint Align = 64);
 
 } // namespace linalg
 } // namespace dx
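
To make the shape of the change concrete, here is a minimal host-side C++ sketch of the updated signature. It is not the HLSL implementation and not part of this commit. `ByteBuffer` and `AccumulateModel` are hypothetical names introduced for illustration. The sketch shows two things taken from the diff: the trailing `Align` parameter defaults to 64, so existing three-argument calls keep compiling, and a store that would run past the end of the buffer can be dropped as a whole. The sketch is single-threaded and does not model atomicity. It also carries `Align` without acting on it, since the diff shows only that the parameter was added.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for an RWByteAddressBuffer.
struct ByteBuffer {
    std::vector<std::uint8_t> bytes;
};

// Adds each element of `vec` to the values stored at `startOffset`.
// Returns true if the accumulate was performed, false if it was dropped.
// Assumption: models only the "whole store becomes a no-op" bounds strategy.
template <typename T, std::size_t M>
bool AccumulateModel(const T (&vec)[M], ByteBuffer &res,
                     std::uint32_t startOffset, std::uint32_t align = 64) {
    (void)align; // kept for signature parity; its semantics are not modeled

    const std::size_t size = sizeof(T) * M;
    if (startOffset > res.bytes.size() ||
        size > res.bytes.size() - startOffset) {
        return false; // any out-of-bounds element drops the whole store
    }

    for (std::size_t i = 0; i < M; ++i) {
        std::uint8_t *p = res.bytes.data() + startOffset + i * sizeof(T);
        T current;
        std::memcpy(&current, p, sizeof(T)); // read without aliasing issues
        current += vec[i];
        std::memcpy(p, &current, sizeof(T));
    }
    return true;
}
```

A caller written against the old three-argument form, such as `AccumulateModel(v, buf, 8)`, compiles unchanged, while `AccumulateModel(v, buf, 8, 4)` passes an explicit alignment.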
