Conventions
The oneDNN specification relies on a set of standard naming conventions for variables. This section describes these conventions.
Variable (Tensor) Names
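Neural network models consist of operations of the following form:
\[\dst = f(\src, \weights),\]
where \(\dst\) and \(\src\) are activation tensors, and \(\weights\) are learnable tensors.
The backward propagation therefore consists of computing the gradients with respect to \(\src\) and \(\weights\), respectively:
\[\diffsrc = df_{\src}(\diffdst, \src, \weights, \dst),\]
and
\[\diffweights = df_{\weights}(\diffdst, \src, \weights, \dst).\]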
While oneDNN uses src, dst, and weights as generic names for the activation and learnable tensors, a specific operation may have its own widely used names for these tensors. For instance, the convolution operation has a learnable tensor called bias. For usability, oneDNN primitives use such names in initialization and other functions.
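As an illustration of the naming scheme, the sketch below implements a tiny inner product (fully connected) operation in pure Python and names every tensor according to the convention. It is not the oneDNN API: the function names `inner_product_forward` and `inner_product_backward`, the `[output][input]` weights layout, and the sample values are assumptions made for this example only.

```python
# Illustrative only: pure-Python inner product, not the oneDNN API.
# Only the tensor names (src, dst, weights, bias, diff_*) follow the convention.

def inner_product_forward(src, weights, bias):
    """dst[n][o] = sum_i src[n][i] * weights[o][i] + bias[o]"""
    return [
        [sum(s * w for s, w in zip(src_row, w_row)) + b
         for w_row, b in zip(weights, bias)]
        for src_row in src
    ]

def inner_product_backward(diff_dst, src, weights):
    """Return (diff_src, diff_weights, diff_bias) for the forward pass above."""
    n_batch, n_in, n_out = len(src), len(src[0]), len(weights)
    # diff_src[n][i] = sum_o diff_dst[n][o] * weights[o][i]
    diff_src = [
        [sum(diff_dst[n][o] * weights[o][i] for o in range(n_out))
         for i in range(n_in)]
        for n in range(n_batch)
    ]
    # diff_weights[o][i] = sum_n diff_dst[n][o] * src[n][i]
    diff_weights = [
        [sum(diff_dst[n][o] * src[n][i] for n in range(n_batch))
         for i in range(n_in)]
        for o in range(n_out)
    ]
    # diff_bias[o] = sum_n diff_dst[n][o]
    diff_bias = [sum(diff_dst[n][o] for n in range(n_batch))
                 for o in range(n_out)]
    return diff_src, diff_weights, diff_bias

src = [[1.0, 2.0], [3.0, 4.0]]                  # 2 samples x 2 input features
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 outputs x 2 inputs
bias = [0.5, -0.5, 0.0]

dst = inner_product_forward(src, weights, bias)
diff_dst = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]   # gradient arriving from the next layer
diff_src, diff_weights, diff_bias = inner_product_backward(diff_dst, src, weights)
```

Note that each gradient tensor has the same shape as the tensor it is taken with respect to: `diff_src` matches `src`, `diff_weights` matches `weights`, and `diff_bias` matches `bias`.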
oneDNN uses the following common notations for tensors:

| Name | Meaning |
|---|---|
| `src` | Source tensor |
| `dst` | Destination tensor |
| `weights` | Weights tensor |
| `bias` | Bias tensor (used in convolution, inner product, and other primitives) |
| `scale`, `shift` | Scale and shift tensors (used in batch normalization and layer normalization primitives) |
| `workspace` | Workspace tensor that carries additional information from the forward propagation to the backward propagation |
| `scratchpad` | Temporary tensor that is required to store the intermediate results |
| `diff_src` | Gradient tensor with respect to the source |
| `diff_dst` | Gradient tensor with respect to the destination |
| `diff_weights` | Gradient tensor with respect to the weights |
| `diff_bias` | Gradient tensor with respect to the bias |
| `diff_scale` | Gradient tensor with respect to the scale |
| `diff_shift` | Gradient tensor with respect to the shift |
| `*_layer` | RNN layer data or weights tensors |
| `*_iter` | RNN recurrent data or weights tensors |
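The role of a workspace tensor can be sketched with a one-dimensional max pooling operation: the forward pass records which source element produced each maximum, and the backward pass uses that record to route the gradient. This is a minimal pure-Python illustration, not oneDNN code; the function names, the non-overlapping window, and the choice to store flat indices in `workspace` are assumptions of this example.

```python
# Illustrative only: 1D max pooling with non-overlapping windows.
# The workspace layout (flat indices into src) is an assumption of this sketch.

def max_pool1d_forward(src, window):
    """Return (dst, workspace); workspace holds the index of each maximum."""
    dst, workspace = [], []
    for start in range(0, len(src) - window + 1, window):
        block = src[start:start + window]
        # Position of the largest element inside the current window.
        offset = max(range(window), key=block.__getitem__)
        dst.append(block[offset])
        workspace.append(start + offset)
    return dst, workspace

def max_pool1d_backward(diff_dst, workspace, src_len):
    """Route each incoming gradient to the source element that won the max."""
    diff_src = [0.0] * src_len
    for grad, index in zip(diff_dst, workspace):
        diff_src[index] += grad
    return diff_src

src = [1.0, 5.0, 2.0, 0.0, 3.0, 4.0]
dst, workspace = max_pool1d_forward(src, window=2)

diff_dst = [10.0, 20.0, 30.0]
diff_src = max_pool1d_backward(diff_dst, workspace, len(src))
```

Without the workspace, the backward pass would need `src` again and would have to recompute every maximum; carrying the indices forward avoids that work.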
RNN-Specific Notation
The following notations are used when describing RNN primitives.
| Name | Semantics |
|---|---|
| \(\cdot\) | matrix multiply operator |
| \(*\) | elementwise multiplication operator |
| W | input weights |
| U | recurrent weights |
| \(\Box^T\) | transposition |
| B | bias |
| h | hidden state |
| a | intermediate value |
| x | input |
| \(\Box_t\) | time step index |
| \(\Box_l\) | layer index |
| activation | tanh, relu, logistic |
| c | cell state |
| \(\tilde{c}\) | candidate state |
| i | input gate |
| f | forget gate |
| o | output gate |
| u | update gate |
| r | reset gate |
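To show how these symbols combine, the following pure-Python sketch computes one time step of a vanilla RNN cell and of an LSTM cell. It uses the common textbook formulations, which are an assumption here; the exact formulas, gate ordering, and memory layouts used by oneDNN are defined in the sections on the RNN primitives. The helper names (`matvec`, `add`, `rnn_step`, `lstm_step`) and the per-gate dictionaries are hypothetical and chosen only for readability.

```python
import math

# Illustrative only: textbook RNN and LSTM steps written with the notation
# above. Not the oneDNN API; weights are plain nested lists.

def matvec(matrix, vector):
    """The matrix multiply operator applied to a matrix and a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def add(*vectors):
    """Elementwise sum of equally sized vectors."""
    return [sum(items) for items in zip(*vectors)]

def logistic(value):
    return 1.0 / (1.0 + math.exp(-value))

def rnn_step(x, h_prev, W, U, B, activation=math.tanh):
    """a = W . x + U . h_prev + B, then h = activation(a)."""
    a = add(matvec(W, x), matvec(U, h_prev), B)
    return [activation(v) for v in a]

def lstm_step(x, h_prev, c_prev, W, U, B):
    """W, U, and B are dicts keyed by gate: 'i', 'f', 'o', and 'c' (candidate)."""
    def pre_activation(gate):
        return add(matvec(W[gate], x), matvec(U[gate], h_prev), B[gate])

    i = [logistic(v) for v in pre_activation("i")]         # input gate
    f = [logistic(v) for v in pre_activation("f")]         # forget gate
    o = [logistic(v) for v in pre_activation("o")]         # output gate
    c_tilde = [math.tanh(v) for v in pre_activation("c")]  # candidate state
    # c = f * c_prev + i * c_tilde, where * is elementwise multiplication
    c = [fk * cp + ik * ck for fk, cp, ik, ck in zip(f, c_prev, i, c_tilde)]
    # h = o * tanh(c)
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# One vanilla RNN step with a relu activation: 2 inputs, 1 hidden unit.
x, h_prev = [1.0, 2.0], [0.0]
W, U, B = [[0.5, 0.25]], [[1.0]], [0.0]
h = rnn_step(x, h_prev, W, U, B, activation=lambda v: max(v, 0.0))
```

A GRU cell would follow the same pattern with the update gate u and the reset gate r in place of the LSTM gates.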