
Introduce QuantizedType #1352

Merged (16 commits, Apr 14, 2023)
Conversation

@sdasgup3 sdasgup3 (Member) commented Mar 24, 2023

StableHLO dialect currently supports quantization via:

  1. Supporting quant.uniform element types.
  2. Having dedicated ops like uniform_quantize / uniform_dequantize.
  3. Allowing regular ops like add / convolution to take quantized tensors.

This support was inherited from MHLO when StableHLO was bootstrapped, and
MHLO support was motivated by mobile use cases and inherited from TFLite.
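
For readers new to the topic, here is a minimal Python sketch of the uniform quantization relationship that quant.uniform element types encode (the helper names are illustrative, not StableHLO ops):

```python
# Illustrative sketch of uniform quantization, not normative spec text.
# A quantized value stores round(x / scale) + zero_point, clamped to the
# storage range; dequantization recovers an approximation of x.

def quantize(x: float, scale: float, zero_point: int,
             storage_min: int = -128, storage_max: int = 127) -> int:
    q = round(x / scale) + zero_point
    return max(storage_min, min(storage_max, q))

def dequantize(q: int, scale: float, zero_point: int) -> float:
    return (q - zero_point) * scale

# Example: si8 storage, f32 expressed type, scale 0.1, zero point 3.
q = quantize(2.5, scale=0.1, zero_point=3)   # 25 + 3 = 28
x = dequantize(q, scale=0.1, zero_point=3)   # (28 - 3) * 0.1 ≈ 2.5
```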

As pointed out in #1149, the StableHLO specification doesn't support quantization
at the moment, and this is an important gap that we would like to fix before
StableHLO v1.0 (see #588).

To continue the discussion started in #1149 and to make progress towards v1.0,
this pull request:
A) Adds QuantizedType to the StableHLO specification, modelled after the
TFLite quantization spec.
B) Proposes semantics for quantized add, to start a conversation about the
applications of QuantizedType and the semantics of quantized ops.

The TFLite quantization spec doesn't cover everything. It specifies constraints on
types (which we have captured accordingly in this pull request), but it doesn't
describe the semantics of quantized ops.

As a result, the proposed semantics for quantized add is intentionally naive
compared with the much more involved implementations in the TensorFlow
repository.
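
To make "naive" concrete, the general shape of such a semantics is dequantize → add in the expressed type → requantize. The sketch below is an illustration of that idea in Python, not the exact wording proposed in spec.md:

```python
# Naive quantized add: dequantize both operands to the expressed type, add
# in floating point, then requantize into the result's parameters.
# Parameter names mirror QuantizedType, but this is illustrative code only.

def naive_quantized_add(lhs_q, rhs_q, lhs, rhs, res):
    """lhs/rhs/res are dicts like {"scale": 0.5, "zp": 0, "min": -128, "max": 127}."""
    out = []
    for ql, qr in zip(lhs_q, rhs_q):
        x = (ql - lhs["zp"]) * lhs["scale"]            # dequantize lhs element
        y = (qr - rhs["zp"]) * rhs["scale"]            # dequantize rhs element
        z = round((x + y) / res["scale"]) + res["zp"]  # requantize the sum
        out.append(max(res["min"], min(res["max"], z)))
    return out

p = {"scale": 0.5, "zp": 0, "min": -128, "max": 127}
print(naive_quantized_add([10, 20], [4, 6], p, p, p))  # [14, 26]
```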

In this pull request, we are looking for feedback from the community to
determine the right level of detail for speccing quantized types and quantized
ops. Please let us know!

@sdasgup3 sdasgup3 added the Spec label Mar 24, 2023
@burmako burmako self-requested a review March 24, 2023 02:34
@burmako burmako self-assigned this Mar 24, 2023
docs/spec.md Outdated
@@ -507,23 +560,35 @@ Performs element-wise addition of two tensors `lhs` and `rhs` and produces a
* For integers: integer addition.
* For floats: `addition` from IEEE-754.
* For complex numbers: complex addition.
* For quantized types:

Hi. Relating to the discussion in PR #1149, defining the operation in terms of a floating-point scale parameter can be problematic for integer-only processing. An approach that would be consistent with PR #1149 is to define the addition with the restriction that the lhs, rhs and result scales are equal. This would remove the need for scaling in the add. If the scales are not equal, separate scaling operations (before or after the add) would change the scale, and these could be defined using an integer scaling parameter.
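
For illustration, a sketch of how such an equal-scale add could be written (my reading of the suggestion, not part of the PR):

```python
def same_scale_add(q_lhs, zp_lhs, q_rhs, zp_rhs, zp_res,
                   storage_min=-128, storage_max=127):
    # With equal scales, (q_lhs - zp_lhs)*s + (q_rhs - zp_rhs)*s = (q_res - zp_res)*s,
    # so the scale cancels and only integer arithmetic plus a clamp remain.
    q_res = (q_lhs - zp_lhs) + (q_rhs - zp_rhs) + zp_res
    return max(storage_min, min(storage_max, q_res))

print(same_scale_add(10, 0, 4, 0, 0))  # 14
```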

Contributor

In the current version of the PR, we started with the TFLite quantization spec and are looking for feedback from users on what they need. Thank you for providing yours! We've also pinged a few other groups to share their thoughts, so how about we keep this thread open for now and then decide?

Comment

There are some implementations with efficient different-scale quantized add as well. Imposing a same-scale requirement and adding extra rescale ops would actually mean slower code in those implementations. Also, such a rescale op either requires a larger intermediate type or risks losing some LSBs for the operand with the smaller scale (i.e. rounding happens twice).

Contributor

It would be preferable if the semantics of the computation here are compatible with TFLite, since that is widely used across mobile hardware. The existing floating-point type allows for this compatibility. Breaking out the rescaling limits optimizations.

Also, some of our implementations indeed use int-based computation for the scale as an implementation detail, to avoid float computations. But I don't think it needs to show up in the spec.

Comment

Thanks for the comments. A different-scale quantized add implementation could still be used by combining the rescale operations with the add. The rescale would need a larger type to avoid loss of precision, but quantized add implementations generally need larger precision in the calculation for the same reason anyway. With reference to TFLite, one issue is that, since a TFLite flatbuffer contains floating-point scale values, it is hard to read such a flatbuffer on integer-only devices. Defining the operation at the higher, floating-point level does allow for more optimizations, but it also means less consistency of results.

A way forward could be the additional quantized type suggested by @sngyhan and @burmako in the thread below. This would keep the floating-point-scale quantized type, allowing wider optimizations, but also provide a mapping to pure integer behavior. This could ensure that at least a base behavior is defined consistently in a pure integer way when using this quantization type. As discussed in the thread below, there can be a pass to convert to this fully integer type, and implementations that benefit from consistency can use it.

Contributor

Thank you for the feedback! It looks like there's broad consensus among reviewers that exploring an integer-based QuantizedTensorType makes sense, so we opened #1404 and will follow up on it shortly in a separate PR.

docs/spec.md Outdated
* `storage_min = 0`, `storage_max = 2^d - 1`.
* (C5) For all `i`, `type(scales[i]) = expressed_type`.
* (C6) For all `i`, `scales[i] > 0`.
* (C7) For all `i`, `type(zero_points[i]) = i64`.

Comment

Hi. i64 seems a large type for the zero point. Is there a separate constraint limiting the zero-point range, for example to the quantized data type, to restrict the range of (value - zero_point) operations? Additionally, I think zero points are mostly used for small data types, such as 8-bit and below. Can zero-point usage be restricted to 8-bit and lower?

Member Author

Thanks @dominicsymes for your valuable feedback.

Since zero_point can fall outside the storage value range, the proposal uses i64, a type wider than the storage type, whose maximum width is 32 bits.

Additionally, I think zero points are most used for small data types – such 8-bit and below.

That is pretty useful information and can help in restricting the zero_point type. I will explore it further and welcome feedback from other reviewers.

Comment

I agree that adding smaller types like 8-bit can be useful for optimized kernels. Maybe the constraint can be modified to type(zero_points[i]) = i64 or type(zero_points[i]) = storage_type?

Comment

Thanks @sdasgup3 and @sngyhan for your reply.

I think the first point is resolved by the comment from @loganchien who points out that zero-point range must be restricted to storage range so the value 0 can be represented. From this I understand that zero-point will be limited to storage range?

On the second point, I’d like to expand a bit more the problems of having zero points on larger data types (greater than 8 bit). For example, an si16 data type with an si16 zero point can give data range of -65535 to +65535. Multiplying two such values will overflow a 32-bit result type. This then has an implementation cost. Since zero_point seems most effective on small data types which have limited range and precision, I’d suggest not to allow on larger data types unless there is a clear use case. I think also in TFLite the zero-point value for integer 16-bit quantized type is set to zero in usage.
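
A quick arithmetic check of the overflow concern (worst-case values chosen for illustration):

```python
diff = 32767 - (-32768)        # |value - zero_point| can reach 65535 for si16
product = diff * diff          # 4294836225
print(product > 2**31 - 1)     # True: the product no longer fits in int32
```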

Member Author

Thanks for the inputs!

In sync with @loganchien's comment that zero_point should lie within storage_min and storage_max, I proposed the following edit to the specification:

(C7) For all i, storage_min <= zero_points[i] <= storage_max.
(C8) For all i, type(zero_points[i]) = storage_type.

cc @sngyhan

Comment

I can't agree that we should treat StableHLO as a compiler IR and optimize for this use-case. StableHLO is a model representation schema that is to be shared by many consumers and producers, most of them not compilers (e.g. NN authoring frameworks, visualization and analysis tools, on-device interpreters). Thus, we should optimize StableHLO for the semantics of the models, and strive to make it impossible to have inconsistent or invalid data in the schema.

In TFLite quantization schema, we have the following zero-point cases:

  1. Unsigned 8-bit quantization, both weights and activations: zero point is in [0, 255] range.
  2. Signed 8-bit quantization for activations: zero point is in [-128, 127] range.
  3. Signed 8-bit quantization for weights: zero point is 0.
  4. Signed 16-bit quantization for activations: zero point is 0.
  5. Signed 32-bit quantization for biases: zero point is 0.
  6. Signed 64-bit quantization for biases: zero point is 0.

Thus, the maximum range for zero point is [-128, 255] and it could fit into a 16-bit signed integer. I'd be ok with extending it to 32-bit signed integer, but no further (64-bit signed integer has no use and makes no sense). Alternatively, we can constrain zero point based on the storage type, similarly to how @sdasgup3 suggested:

  1. For unsigned 8-bit quantized tensors, zero point is unsigned 8-bit.
  2. For signed 8-bit quantized tensors, zero point is signed 8-bit (must be 0 for Convolution/FullyConnected weights, but it is up to the operators to validate it).
  3. For signed 16-/32-/40-/48-/64-bit quantized tensors, zero point is 0 (implicit, not stored in the schema).
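
For concreteness, a sketch of how this storage-type-based alternative could be expressed as a validation rule (illustrative only, not the wording in spec.md):

```python
# Zero-point ranges implied by the storage-type-based proposal above
# (illustrative mapping only; wider 40-/48-bit types omitted for brevity).
ZERO_POINT_RANGE = {
    "ui8": (0, 255),
    "si8": (-128, 127),
    "si16": (0, 0),   # zero point must be 0
    "si32": (0, 0),
    "si64": (0, 0),
}

def zero_point_ok(storage_type: str, zero_point: int) -> bool:
    lo, hi = ZERO_POINT_RANGE[storage_type]
    return lo <= zero_point <= hi

print(zero_point_ok("si8", -5))    # True
print(zero_point_ok("si16", 12))   # False
```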

@loganchien loganchien Apr 11, 2023

IMO, an authoring tool/framework doesn't have to use a feature just because the spec can support/express such a feature. The authoring tool (producer) can still limit itself to zero points in [-128, 127] for i8/i16 if the narrower zero-point range accommodates its needs. Interpreters/validators may have a valid case because they are on the implementor side (consumer). But, like the other parts of the spec, I don't think an interpreter can efficiently cover all features. And we will still have per-op constraints which can be used to narrow down the scope as needed (all multiplicative ops may need a narrower range, which I agree with). For visualization, I don't think it adds constraints to the value range.

My point is not to enact a constraint too early. Maybe someday we will evolve to a stage with a real need in a real model.

My proposed wording is to limit the zero point to the storage type, i.e. i8 gets [-128, 127], u8 gets [0, 255], i16 gets [-32768, 32767], etc. Thus, only the i64 storage type can get an i64 zero point.

But I am fine with adding constraints (albeit not my preference). Actually, the implementation I am working on probably can't support all zero points in all operations either (in case the product overflows, as pointed out by @dominicsymes earlier). My comments are more about design symmetry and my personal view on aesthetics.

Member Author

Thanks everybody for the comments!
@sngyhan, can you please add your opinion as well?

Comment

I also prefer not to impose specific constraints based on what we've observed from the TFLite product. IMO, it might limit the adoption of new quantization schemes/kernels in the future, e.g., experimental quantization schemes.

Numerically, to cover all the possible cases, "original value range ⊂ zp range" might be the right constraint.

Contributor

This is a very lively discussion! Given the variety of proposals, I think that this conversation would benefit from a video meeting. I'll work with @theadactyl to set up a community meeting about quantization, and this will be one of the topics. We've opened #1405 to keep track.

@sdasgup3 sdasgup3 force-pushed the spec-quant-types branch 2 times, most recently from 2bbd784 to 1b6f292 Compare March 27, 2023 23:12
docs/spec.md Outdated
QuantizationScalesAndZeroPoints ::= (QuantizationScaleAndZero
| '{' QuantizationScaleAndZero {',' QuantizationScaleAndZero} '}')
QuantizationScaleAndZero ::= QuantizationScale ':' QuantizationZeroPoint
QuantizationScale ::= FloatConstant
@burmako burmako (Contributor) Mar 28, 2023

@dominicsymes Following up on integer only processing, what if we had two different versions of QuantizedType - one as proposed in the PR, and another one where a floating-point scale is replaced with integer multiplier and shift. Would that work for you with respect to #1149?

This idea was proposed by @sngyhan in an internal discussion, and I think that it's pretty neat. I think that it allows implementations to more easily detect different styles of quantization, which is nice. Also, through FP8, we already have a precedent for providing multiple ways of doing similar things through similar types, and this proposal goes along the same lines. What do you think?
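
To illustrate what the integer variant could encode, here is the commonly used way (e.g. in the gemmlowp/TFLite lineage) of approximating a floating-point scale by a normalized integer multiplier and a shift; this is a sketch, not a proposed spec addition:

```python
import math

def quantize_scale(scale: float):
    """Approximate a positive float scale as multiplier * 2**(shift - 31),
    with the multiplier a 31-bit fixed-point integer in [2**30, 2**31)."""
    assert scale > 0.0
    mantissa, shift = math.frexp(scale)       # scale = mantissa * 2**shift, 0.5 <= mantissa < 1
    multiplier = round(mantissa * (1 << 31))  # Q31 fixed-point mantissa
    if multiplier == (1 << 31):               # rounding pushed it up to 2**31
        multiplier //= 2
        shift += 1
    return multiplier, shift

m, s = quantize_scale(0.0037)
print(m, s, m * 2.0 ** (s - 31))  # reconstruction is close to 0.0037
```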

Comment

@dominicsymes is out of the office at the moment. I won't speak for him, but I'll add a bit here, and he can add more when he's back.
In my view, it's not a completely bad option. If we implemented this, there should be a pass that can convert between the two formats with a fully defined result. If there is agreement on that pass, then all integer-based implementations would agree, and an integer-based quantized tensor could be converted to a floating-point version.

Comment

Hi. Maybe in the future we can extend QuantizationScale to also accept a multiplier/shift type, with a defined conversion between the two formats.

Comment

Hi @burmako, @sngyhan,

Thanks for your comments. Having a quantization type with integer multiplier and shift does get part of the way there, provided (as @eric-k256 mentions) that there is a defined conversion between the two formats.
For implementations to agree, it also needs to be defined how to convert these into the actual scales to be applied.
In an element-wise multiply operation, the scale applied after the multiply is the ratio of the product of the input scales to the output scale. Similarly, for the element-wise add in this PR, it needs to be defined how the integer scales apply as input and output scales.
An advantage of separating out these scale changes is that they can then be handled in a standard way rather than being defined as part of each operation.
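
Writing out the rescaling factors implied by that description (a sketch of the relationship, not spec text):

```python
def multiply_rescale(s_lhs: float, s_rhs: float, s_out: float) -> float:
    # Single factor applied after the integer multiply of (q - zp) terms.
    return (s_lhs * s_rhs) / s_out

def add_rescales(s_lhs: float, s_rhs: float, s_out: float):
    # For add, each operand is brought into the output scale before summing.
    return s_lhs / s_out, s_rhs / s_out

print(multiply_rescale(0.02, 0.5, 0.1))  # ≈ 0.1
print(add_rescales(0.02, 0.05, 0.1))     # ≈ (0.2, 0.5)
```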

Comment

I agree that the spec should facilitate both implementations using floating-point scales and implementations using integer multiplier + shift representations of the scales. IMO, it is easier to achieve this using floating-point scales, because integer multiplier + shift may represent values outside of the single-precision floating-point range (e.g. an integer multiplier has 31/32 bits of precision vs 24 in single precision), and can be non-unique, while floating-point values are normalized. That said, we need to impose additional constraints on the floating-point scales:

  1. They must be strictly positive (>0.0).
  2. Scales can't take infinity or NaN values.
  3. Scales can't be subnormal floating-point numbers. There are two motivations for this constraint: it simplifies conversion between the floating-point and multiplier + shift representations, and it assures compatibility with systems which don't support subnormal numbers, where they will be treated as 0.
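
A sketch of what these checks could look like when the expressed type is f32 (constants and names are illustrative):

```python
import math

F32_SMALLEST_NORMAL = 2.0 ** -126   # ~1.18e-38, smallest normal f32 value

def scale_is_valid(scale: float) -> bool:
    # Checks constraints 1-3 above, treating the expressed type as f32.
    return (scale > 0.0
            and math.isfinite(scale)
            and scale >= F32_SMALLEST_NORMAL)

print(scale_is_valid(0.02))           # True
print(scale_is_valid(1e-45))          # False: subnormal as f32
print(scale_is_valid(float("inf")))   # False
```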

Member Author

Hi @Maratyszcza,
Thanks for the comments!

We already have (1), and I have added (2):

* (C6) For all `i`, `scales[i] > 0`.
* (C7) For all `i`, `is_finite(scales[i]) = true`.

About (3), "... assures compatibility with systems which don't support subnormal numbers, where they will be treated as 0":

Maybe I did not get this, but wouldn't the constraint be restrictive for the systems that do support subnormals?
If this check is relevant when converting a combined multiplier (e.g. S1*S2/S3 in eq. (5) of https://arxiv.org/pdf/1712.05877.pdf) to an integer multiplier/shift, then does it matter whether we add constraints to the individual scales? That is, the combined multiplier can still be subnormal even if the individual scales are not.

On a side note: for (3), can we explore this constraint when we deal with the "integer based quantized type" in a separate PR (as suggested in #1352 (review))?
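
A small numeric illustration of that last point, using artificial double-precision values chosen only to make the effect visible:

```python
import sys

s1 = s2 = 1e-160   # perfectly normal double-precision scales
s3 = 1.0
combined = s1 * s2 / s3
print(combined)                        # 1e-320
print(combined < sys.float_info.min)   # True: the combined multiplier is subnormal
```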

Comment

+1 for opening the possibility of both scaling mechanisms.

Contributor

As discussed above, let's move the work on integer-based QuantizedTensorType to a separate PR, given that there's broad interest in exploring this. We opened #1404 to track this.

@sdasgup3 sdasgup3 requested a review from burmako March 30, 2023 01:26
@sngyhan sngyhan left a comment

LGTM overall.

@burmako burmako assigned sdasgup3 and unassigned burmako Apr 6, 2023
@nutsiepully nutsiepully (Contributor) left a comment

LGTM, thanks.

@sdasgup3 sdasgup3 requested a review from burmako April 6, 2023 02:38
@sdasgup3 sdasgup3 force-pushed the spec-quant-types branch 2 times, most recently from 1ab4a44 to a8f3ef1 Compare April 8, 2023 02:07
@burmako burmako (Contributor) left a comment

Thank you, everyone, for the discussion! It looks like we have alignment on quite a few topics already, so I think it would be beneficial to merge this pull request and continue working out the rest in follow-up PRs.

More specifically, let's keep the definition of QuantizedTensorType, remove the definition of quantized AddOp for now and open tickets to follow up on the topics where we haven't reached consensus yet - #1404, #1405, #1406 and #1407.

From here, we'll be able to explore multiple directions simultaneously, including speccing integer-based QuantizedType, as well as discussing the semantics of quantized elementwise ops and convolutions.

@burmako burmako merged commit e83c5e0 into openxla:main Apr 14, 2023
burmako pushed a commit that referenced this pull request Apr 18, 2023
Following up on #1352, this pull request documents the unresolved
conversations from that PR review.
sdasgup3 added a commit that referenced this pull request May 10, 2023
## Summary 

This PR proposes the specification for the quantized add op.

## A few details

At some point we decided (#1352 (comment)) to drop the specification of this op,
mainly because we were unsure about the fate of #1406.
 
Please have a look at my revised proposal on
#1406 and let me know if I am
missing something. Otherwise, let us review this op and let me know your
feedback.

Side note: for those who are already aware of the context of the prior
introduction of this op, please note that the current proposal is almost the
same as before, except that it does not have any additional constraints
imposed by the op's semantics on `storage_min` or `storage_max`.