Conventions

The oneDNN specification relies on a set of standard naming conventions for variables. This section describes these conventions.

Variable (Tensor) Names

Neural network models consist of operations of the following form:

\[
\mathrm{dst} = f(\mathrm{src}, \mathrm{weights}),
\]

where dst and src are activation tensors, and weights are learnable tensors.

The backward propagation therefore consists of computing the gradients with respect to src and weights, respectively:

\[
\mathrm{diff\_src} = df_{\mathrm{src}}(\mathrm{diff\_dst}, \mathrm{src}, \mathrm{weights}, \mathrm{dst}),
\]

and

\[
\mathrm{diff\_weights} = df_{\mathrm{weights}}(\mathrm{diff\_dst}, \mathrm{src}, \mathrm{weights}, \mathrm{dst}).
\]

While oneDNN uses src, dst, and weights as generic names for the activations and learnable tensors, for a specific operation there might be commonly used and widely known specific names for these tensors. For instance, the convolution operation has a learnable tensor called bias. For usability reasons, oneDNN primitives use such names in initialization and other functions.
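For example, when a primitive is executed, these tensor names appear directly as execution-argument tags. The following is a minimal sketch assuming the oneDNN C++ API (dnnl.hpp); the helper function run_conv_forward and the primitive and memory objects it receives are hypothetical and assumed to have been created elsewhere:

```cpp
#include <unordered_map>
#include "dnnl.hpp"

// Hypothetical helper: feeds an already-created convolution forward primitive.
// The generic tensor names (src, weights, dst) and the convolution-specific
// bias map one-to-one to DNNL_ARG_* execution-argument tags.
void run_conv_forward(dnnl::primitive &conv, dnnl::stream &strm,
                      dnnl::memory &src_mem, dnnl::memory &weights_mem,
                      dnnl::memory &bias_mem, dnnl::memory &dst_mem) {
    std::unordered_map<int, dnnl::memory> args{
            {DNNL_ARG_SRC, src_mem},         // src: source activations
            {DNNL_ARG_WEIGHTS, weights_mem}, // weights: learnable tensor
            {DNNL_ARG_BIAS, bias_mem},       // bias: learnable tensor specific to convolution
            {DNNL_ARG_DST, dst_mem}};        // dst: destination activations
    conv.execute(strm, args);
    strm.wait();
}
```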

oneDNN uses the following commonly used notations for tensors:

Name            Meaning
--------------  -------------------------------------------------------------
src             Source tensor
dst             Destination tensor
weights         Weights tensor
bias            Bias tensor (used in convolution, inner product, and other primitives)
scale_shift     Scale and shift tensors (used in batch normalization and layer normalization primitives)
workspace       Workspace tensor that carries additional information from the forward propagation to the backward propagation
scratchpad      Temporary tensor that is required to store intermediate results
diff_src        Gradient tensor with respect to the source
diff_dst        Gradient tensor with respect to the destination
diff_weights    Gradient tensor with respect to the weights
diff_bias       Gradient tensor with respect to the bias
diff_scale      Gradient tensor with respect to the scale
diff_shift      Gradient tensor with respect to the shift
*_layer         RNN layer data or weights tensors
*_iter          RNN recurrent data or weights tensors
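The diff_* and auxiliary names follow the same pattern on the backward pass. As a second sketch under the same assumptions (a hypothetical helper around an already-created convolution backward-weights primitive):

```cpp
#include <unordered_map>
#include "dnnl.hpp"

// Hypothetical helper: feeds an already-created convolution backward-weights
// primitive. The diff_* gradients and the scratchpad temporary from the table
// above again appear as DNNL_ARG_* execution-argument tags.
void run_conv_backward_weights(dnnl::primitive &conv_bwd, dnnl::stream &strm,
                               dnnl::memory &src_mem, dnnl::memory &diff_dst_mem,
                               dnnl::memory &diff_weights_mem,
                               dnnl::memory &diff_bias_mem,
                               dnnl::memory &scratchpad_mem) {
    std::unordered_map<int, dnnl::memory> args{
            {DNNL_ARG_SRC, src_mem},                   // src saved from the forward pass
            {DNNL_ARG_DIFF_DST, diff_dst_mem},         // gradient with respect to dst
            {DNNL_ARG_DIFF_WEIGHTS, diff_weights_mem}, // gradient with respect to weights
            {DNNL_ARG_DIFF_BIAS, diff_bias_mem},       // gradient with respect to bias
            {DNNL_ARG_SCRATCHPAD, scratchpad_mem}};    // temporary storage (user-managed scratchpad)
    conv_bwd.execute(strm, args);
    strm.wait();
}
```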

RNN-Specific Notation

The following notations are used when describing RNN primitives.

Name           Semantics
-------------  ------------------------------------
·              matrix multiply operator
*              elementwise multiplication operator
W              input weights
U              recurrent weights
T              transposition
B              bias
h              hidden state
a              intermediate value
x              input
t              timestamp index
l              layer index
activation     tanh, relu, logistic
c              cell state
c~             candidate state
i              input gate
f              forget gate
o              output gate
u              update gate
r              reset gate
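As an illustration of how this notation composes, the update performed by a vanilla RNN cell at timestamp t can be sketched as follows (a simplified form that omits layer indices and layout details):

\[
a_t = W \cdot x_t + U \cdot h_{t-1} + B, \qquad h_t = \mathrm{activation}(a_t),
\]

where W multiplies the layer input, U multiplies the previous hidden state, and activation is one of tanh, relu, or logistic.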