MatMul#

The MatMul operation computes the product of two tensors with optional bias addition. The variable names follow the standard Conventions. Taking 2D input tensors as an example, the formula is:

\[\dst(m, n) = \sum_{k=0}^{K - 1} \left( \src(m, k) \cdot \weights(k, n) \right) + \bias(m, n)\]
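As an illustration of the formula above, the following is a minimal pure-Python sketch for the 2D case. The function name `matmul_2d` and the nested-list representation are assumptions made for this example only; they are not part of the library API.

```python
def matmul_2d(src, weights, bias=None):
    """Reference 2D MatMul: dst(m, n) = sum_k src(m, k) * weights(k, n) + bias(m, n).

    src is an M x K nested list, weights is K x N, and bias (optional) is M x N.
    """
    M, K = len(src), len(src[0])
    if len(weights) != K:
        raise ValueError("inner dimensions of src and weights must match")
    N = len(weights[0])

    dst = [[0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            # Reduction over the shared K dimension.
            acc = sum(src[m][k] * weights[k][n] for k in range(K))
            # Bias is optional, matching the optional third input.
            if bias is not None:
                acc += bias[m][n]
            dst[m][n] = acc
    return dst


src = [[1, 2], [3, 4]]
weights = [[5, 6], [7, 8]]
print(matmul_2d(src, weights))
print(matmul_2d(src, weights, bias=[[1, 1], [1, 1]]))
```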

In the shape of a tensor, the two right-most axes are interpreted as the row and column dimensions of a matrix, while all remaining axes to their left (if present) are interpreted as batch dimensions. The operation supports broadcasting semantics for those batch dimensions. For example, src can be broadcast to weights if the corresponding dimension in src is 1 (and vice versa). Additionally, if the ranks of src and weights differ, the tensor with the smaller rank is unsqueezed on the left (dimensions of size 1 are inserted) so that the two ranks match.
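The rank alignment and batch broadcasting rules described above can be sketched as a small shape-inference helper. This is only an illustration under stated assumptions: the name `infer_matmul_shape` is hypothetical, the transpose attributes are ignored, and only inputs of rank 2 or higher are handled.

```python
def infer_matmul_shape(src_shape, weights_shape):
    """Return the dst shape for src @ weights with batch broadcasting.

    Hypothetical helper for illustration; assumes rank >= 2 and no transposition.
    """
    src, wei = list(src_shape), list(weights_shape)
    if len(src) < 2 or len(wei) < 2:
        raise ValueError("this sketch only covers inputs of rank 2 or higher")

    # Unsqueeze the lower-rank tensor on the left by inserting 1s.
    rank = max(len(src), len(wei))
    src = [1] * (rank - len(src)) + src
    wei = [1] * (rank - len(wei)) + wei

    # The two right-most axes are the matrix dimensions: (M, K) x (K, N).
    if src[-1] != wei[-2]:
        raise ValueError("K dimension of src and weights must match")

    # The remaining axes are batch dimensions; a size of 1 broadcasts.
    batch = []
    for s, w in zip(src[:-2], wei[:-2]):
        if s != w and 1 not in (s, w):
            raise ValueError(f"batch dimensions {s} and {w} cannot be broadcast")
        batch.append(max(s, w))

    return tuple(batch + [src[-2], wei[-1]])


print(infer_matmul_shape((3, 1, 4, 5), (2, 5, 6)))
```

In this example weights is first treated as `(1, 2, 5, 6)`, and the batch dimensions `(3, 1)` and `(1, 2)` then broadcast to `(3, 2)`.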

Operation Attributes#

| Attribute Name | Description | Value Type | Supported Values | Required or Optional |
|---|---|---|---|---|
| transpose_a | Controls whether to transpose the last two dimensions of src | bool | True, False (default) | Optional |
| transpose_b | Controls whether to transpose the last two dimensions of weights | bool | True, False (default) | Optional |

The transpose attributes above have no effect when the rank of an input tensor is less than 2. For example, in the library implementation a 1D tensor is first unsqueezed before compilation. The unsqueeze rule is applied to each input independently:

  • For src tensor, the rule is defined like: [d] -> [1, d].

  • For weights tensor, the rule is defined like: [d] -> [d, 1].
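The two unsqueeze rules and their interaction with the transpose attributes can be sketched as follows. The helper name `effective_shape` and its parameters are assumptions made for illustration; the sketch only computes the shape each input is treated as and says nothing about the resulting dst shape.

```python
def effective_shape(shape, is_src, transpose=False):
    """Shape an input is treated as after the 1D unsqueeze rule or transposition.

    Hypothetical helper: is_src selects the src rule ([d] -> [1, d]) or the
    weights rule ([d] -> [d, 1]); transpose stands for transpose_a / transpose_b.
    """
    dims = list(shape)
    if len(dims) == 0:
        raise ValueError("rank-0 inputs are outside the scope of this sketch")

    if len(dims) == 1:
        # Rank below 2: the transpose attribute has no effect,
        # and the tensor is unsqueezed instead.
        return (1, dims[0]) if is_src else (dims[0], 1)

    if transpose:
        # Swap only the last two dimensions; batch dimensions are untouched.
        dims[-2], dims[-1] = dims[-1], dims[-2]
    return tuple(dims)


print(effective_shape((4,), is_src=True, transpose=True))
print(effective_shape((4,), is_src=False, transpose=True))
print(effective_shape((2, 3, 4), is_src=True, transpose=True))
```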

Execution Arguments#

When constructing an operation, the inputs and outputs must be provided in the index order given below.

Inputs#

| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | src | Required |
| 1 | weights | Required |
| 2 | bias | Optional |

Outputs#

| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | dst | Required |

Supported Data Types#

MatMul operation supports the following data type combinations.

| Src | Weights | Bias | Dst |
|---|---|---|---|
| f32 | f32 | f32 | f32 |
| bf16 | bf16 | bf16 | bf16 |
| f16 | f16 | f16 | f16 |
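A check against the combinations listed above could be sketched like this. The names `SUPPORTED_COMBINATIONS` and `is_supported` are hypothetical, and the handling of a missing bias (matching on src, weights, and dst only) is an assumption based on bias being an optional input.

```python
# (src, weights, bias, dst) combinations taken from the table above.
SUPPORTED_COMBINATIONS = {
    ("f32", "f32", "f32", "f32"),
    ("bf16", "bf16", "bf16", "bf16"),
    ("f16", "f16", "f16", "f16"),
}


def is_supported(src, weights, dst, bias=None):
    """Return True if the data type combination appears in the table.

    Assumption: when bias is omitted, only src, weights, and dst are compared.
    """
    if bias is None:
        return any(
            (s, w, d) == (src, weights, dst)
            for s, w, _, d in SUPPORTED_COMBINATIONS
        )
    return (src, weights, bias, dst) in SUPPORTED_COMBINATIONS


print(is_supported("bf16", "bf16", "bf16", bias="bf16"))
print(is_supported("f32", "bf16", "f32"))
```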