Dequantize#
The Dequantize operation converts a quantized (u8 or s8) tensor to an f32 tensor. It supports both per-tensor and per-channel asymmetric linear de-quantization. The rounding mode is library-implementation defined.
For per-tensor de-quantization:

\(dst_{f32} = (src_{s8/u8} - zps) \cdot scales\)

For per-channel de-quantization, taking channel axis = 1 as an example:

\(dst_{f32}[:, i, \ldots] = (src_{s8/u8}[:, i, \ldots] - zps[i]) \cdot scales[i], \quad i \in [0, ic-1]\)

where \(ic\) is the number of channels.
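To make the formulas concrete, below is a minimal reference sketch in plain C++ (not the library's actual kernel). It assumes a dense, row-major src layout and a non-negative axis value; the function names are illustrative only.

```cpp
#include <cstdint>
#include <vector>

// Per-tensor reference: dst = (src - zps) * scales.
std::vector<float> dequantize_per_tensor(const std::vector<uint8_t> &src,
                                         float scale, int64_t zp) {
    std::vector<float> dst(src.size());
    for (size_t i = 0; i < src.size(); ++i)
        dst[i] = (static_cast<float>(src[i]) - static_cast<float>(zp)) * scale;
    return dst;
}

// Per-channel reference along `axis`:
// dst[..., c, ...] = (src[..., c, ...] - zps[c]) * scales[c].
// `dims` describes a dense row-major tensor; `axis` must be non-negative
// (a negative axis would first be normalized as axis + rank).
std::vector<float> dequantize_per_channel(const std::vector<uint8_t> &src,
                                          const std::vector<int64_t> &dims,
                                          int64_t axis,
                                          const std::vector<float> &scales,
                                          const std::vector<int64_t> &zps) {
    int64_t inner = 1; // elements covered by one step along `axis`
    for (size_t d = static_cast<size_t>(axis) + 1; d < dims.size(); ++d)
        inner *= dims[d];
    const int64_t ic = dims[static_cast<size_t>(axis)]; // number of channels

    std::vector<float> dst(src.size());
    for (size_t e = 0; e < src.size(); ++e) {
        const int64_t c = (static_cast<int64_t>(e) / inner) % ic; // channel of element e
        dst[e] = (static_cast<float>(src[e]) - static_cast<float>(zps[c]))
                * scales[c];
    }
    return dst;
}
```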
Operation Attributes#
| Attribute Name | Description | Value Type | Supported Values | Required or Optional |
|---|---|---|---|---|
| qtype | Specifies which de-quantization type is used | string | per_tensor (default), per_channel | Optional |
| axis | Specifies the dimension on which per-channel de-quantization is applied | s64 | An s64 value in the range of [-r, r-1] where r = rank(src); 1 by default | Optional |
| scales | Scalings applied on the src data | f32 | An f32 list (contains only one element if qtype is per_tensor) | Required |
| zps | Offset values that map to float zero | s64 | An s64 list (contains only one element if qtype is per_tensor) | Required |
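As a sketch of how these attributes might be set through the oneDNN Graph C++ API, the snippet below configures a Dequantize op for per-channel de-quantization along axis 1. The header path and attribute enum names (op::attr::qtype, op::attr::axis, op::attr::scales, op::attr::zps) follow recent oneDNN releases and should be checked against your installed version; the numeric values are arbitrary examples.

```cpp
#include <cstdint>
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    // Dequantize op (id 0) configured for per-channel de-quantization
    // along axis 1 with two channels.
    op deq(0, op::kind::Dequantize, "deq");
    deq.set_attr<std::string>(op::attr::qtype, "per_channel");
    deq.set_attr<int64_t>(op::attr::axis, 1);
    deq.set_attr<std::vector<float>>(op::attr::scales, {0.5f, 0.25f});
    deq.set_attr<std::vector<int64_t>>(op::attr::zps, {128, 120});
    return 0;
}
```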
Execution Arguments#
The inputs and outputs must be provided in the index order below when constructing the operation.
Inputs#
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | src | Required |
Outputs#
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | dst | Required |
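Continuing the sketch above, the snippet below wires the single src input (index 0) and dst output (index 0) as logical tensors and adds the op to a graph. The tensor ids, shapes, and CPU engine kind are arbitrary example choices, and the exact API should be verified against your oneDNN version.

```cpp
#include <cstdint>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    // Quantized u8 source and f32 destination with the same shape.
    logical_tensor src(0, logical_tensor::data_type::u8, {1, 2, 4, 4},
                       logical_tensor::layout_type::strided);
    logical_tensor dst(1, logical_tensor::data_type::f32, {1, 2, 4, 4},
                       logical_tensor::layout_type::strided);

    // Per-tensor de-quantization: one scale and one zero point.
    op deq(0, op::kind::Dequantize, "deq");
    deq.set_attr<std::vector<float>>(op::attr::scales, {0.1f});
    deq.set_attr<std::vector<int64_t>>(op::attr::zps, {128});
    deq.add_input(src);   // input index 0: src (required)
    deq.add_output(dst);  // output index 0: dst (required)

    // Add the op to a graph for later partitioning and compilation.
    graph g(dnnl::engine::kind::cpu);
    g.add_op(deq);
    g.finalize();
    return 0;
}
```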
Supported Data Types#
The Dequantize operation supports the following data type combinations.
| Src | Dst |
|---|---|
| s8, u8 | f32 |
@note This operation is intended to support int8 quantization models.