Primitives
Primitives are functor objects that encapsulate a particular computation such as forward convolution, backward LSTM computations, or a data transformation operation. A single primitive can sometimes represent more complex fused computations such as a forward convolution followed by a ReLU.
The most important difference between a primitive and a pure function is that a primitive can store state.
One part of the primitive’s state is immutable. For example, convolution primitives store parameters like tensor shapes and can pre-compute other dependent parameters like cache blocking. This approach allows oneDNN primitives to pre-generate code specifically tailored for the operation to be performed. The oneDNN programming model assumes that the time it takes to perform the pre-computations is amortized by reusing the same primitive to perform computations multiple times.
The mutable part of the primitive’s state is referred to as a scratchpad. It is a memory buffer that a primitive may use for temporary storage only during computations. The scratchpad can either be owned by a primitive object (which makes that object non-thread-safe) or be an execution-time parameter.
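The two kinds of state can be illustrated with a small self-contained sketch. It does not use oneDNN: toy_window_sum, scratchpad_mode, and scratchpad_size() are hypothetical names chosen only for illustration. The window length and output size stand in for the immutable, pre-computed state, and the prefix-sum buffer stands in for the scratchpad, which is either owned by the object or passed in at execution time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy "primitive" (not oneDNN API): sums every window of
// `window` consecutive elements of an input of length `n`.
class toy_window_sum {
public:
    enum class scratchpad_mode { library, user };

    toy_window_sum(std::size_t n, std::size_t window, scratchpad_mode mode)
        : n_(n), window_(window), out_size_(n - window + 1), mode_(mode) {
        assert(window >= 1 && window <= n);
        // Immutable state (n_, window_, out_size_) is fixed here, once.
        // In library mode the object also owns its scratchpad.
        if (mode_ == scratchpad_mode::library) owned_.resize(scratchpad_size());
    }

    // Number of floats needed for temporary storage during execution.
    std::size_t scratchpad_size() const { return n_ + 1; }

    // Library-owned scratchpad: execution mutates the object, so this
    // overload is non-const and the object must not be shared across threads.
    std::vector<float> execute(const std::vector<float> &src) {
        assert(mode_ == scratchpad_mode::library);
        return run(src, owned_.data());
    }

    // User-provided scratchpad: the object itself is not modified.
    std::vector<float> execute(const std::vector<float> &src,
                               std::vector<float> &scratchpad) const {
        assert(scratchpad.size() >= scratchpad_size());
        return run(src, scratchpad.data());
    }

private:
    std::vector<float> run(const std::vector<float> &src, float *tmp) const {
        assert(src.size() == n_);
        // The scratchpad holds prefix sums; its contents are only
        // meaningful during this call.
        tmp[0] = 0.0f;
        for (std::size_t i = 0; i < n_; ++i) tmp[i + 1] = tmp[i] + src[i];
        std::vector<float> dst(out_size_);
        for (std::size_t i = 0; i < out_size_; ++i)
            dst[i] = tmp[i + window_] - tmp[i];
        return dst;
    }

    std::size_t n_, window_, out_size_;
    scratchpad_mode mode_;
    std::vector<float> owned_;
};
```

With a caller-supplied scratchpad, execute is const, so in this sketch one object can be shared between threads as long as each thread passes its own buffer.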
Conceptually, oneDNN describes a computation at several levels, from more abstract to more concrete:
- Operation descriptors (one for each supported primitive) describe an operation’s most basic properties without specifying, for example, which engine will be used to compute them. For example, a convolution descriptor describes the shapes of the source, destination, and weights tensors, the propagation kind (forward, backward with respect to data or weights), and other implementation-independent parameters. The shapes are usually described as memory descriptors (dnnl::memory::desc).
- Primitive descriptors are at the abstraction level between operation descriptors and primitives. They combine an operation descriptor with primitive attributes. Primitive descriptors can be used to query various primitive implementation details and, for example, to implement memory format propagation by inspecting the expected memory formats via queries, without having to fully instantiate a primitive. oneDNN may contain multiple implementations of the same primitive that can perform the same computation. Primitive descriptors support one-way iteration over these implementations, which makes it possible to inspect each of them. The library is expected to order the implementations from most to least preferred, so it should always be safe to use the one that is chosen by default.
- Primitives, which are the most concrete, embody actual computations that can be executed.
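As a rough illustration of the three layers, the following self-contained sketch models them with hypothetical types in a toy namespace. None of these names are oneDNN API, and the implementation names ("blocked_8", "reference") are made up. The sketch assumes, as described above, that candidate implementations are ordered from most to least preferred and that iteration only moves forward.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace toy {

enum class engine_kind { cpu, gpu };

// Layer 1: operation descriptor. States what to compute (C = A * B with
// A of size m x k and B of size k x n), but not where or how.
struct matmul_desc {
    std::size_t m, k, n;
};

// Layer 2: primitive descriptor. Binds the operation to an engine and
// exposes the candidate implementations without running anything.
class matmul_primitive_desc {
public:
    matmul_primitive_desc(const matmul_desc &d, engine_kind eng) : desc_(d) {
        // Most preferred first; the selection rule is invented for the sketch.
        if (eng == engine_kind::cpu && d.k % 8 == 0)
            impls_.push_back("blocked_8");
        impls_.push_back("reference");
    }

    const matmul_desc &desc() const { return desc_; }
    const std::string &impl_name() const { return impls_[current_]; }

    // One-way iteration: advances to the next implementation, or returns
    // false and stays put if the current one is the last.
    bool next_impl() {
        if (current_ + 1 >= impls_.size()) return false;
        ++current_;
        return true;
    }

private:
    matmul_desc desc_;
    std::vector<std::string> impls_;
    std::size_t current_ = 0;
};

// Layer 3: primitive. Created from a primitive descriptor and executable.
// For brevity every implementation name maps to the same naive kernel.
class matmul {
public:
    explicit matmul(const matmul_primitive_desc &pd)
        : d_(pd.desc()), impl_(pd.impl_name()) {}

    const std::string &impl_name() const { return impl_; }

    std::vector<float> execute(const std::vector<float> &a,
                               const std::vector<float> &b) const {
        assert(a.size() == d_.m * d_.k && b.size() == d_.k * d_.n);
        std::vector<float> c(d_.m * d_.n, 0.0f);
        for (std::size_t i = 0; i < d_.m; ++i)
            for (std::size_t p = 0; p < d_.k; ++p)
                for (std::size_t j = 0; j < d_.n; ++j)
                    c[i * d_.n + j] += a[i * d_.k + p] * b[p * d_.n + j];
        return c;
    }

private:
    matmul_desc d_;
    std::string impl_;
};

} // namespace toy
```

Because the first candidate is the most preferred one, code that never calls next_impl() simply gets the default choice.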
On the API level:
- Primitives are represented as classes at the top level of the dnnl namespace that have dnnl::primitive as their base class, for example dnnl::convolution_forward.
- Operation descriptors are represented as classes named desc nested within the corresponding primitive classes, for example dnnl::convolution_forward::desc.
- Primitive descriptors are represented as classes named primitive_desc nested within the corresponding primitive classes. They have dnnl::primitive_desc_base as their base class (except for RNN primitives, which derive from dnnl::rnn_primitive_desc_base), for example dnnl::convolution_forward::primitive_desc. The dnnl::primitive_desc::next_impl() member function provides a way to iterate over implementations.
namespace dnnl {
    struct something_forward : public primitive {
        struct desc {
            // Primitive-specific constructors.
        };
        struct primitive_desc : public primitive_desc_base {
            // Constructors and primitive-specific memory descriptor queries.
        };
    };
}
The sequence of actions to create a primitive is:
1. Create an operation descriptor via, for example, dnnl::convolution_forward::desc. The operation descriptor can contain memory descriptors with placeholder dnnl::memory::format_tag::any memory formats if the primitive supports it.
2. Create a primitive descriptor based on the operation descriptor, engine, and attributes.
3. Create a primitive based on the primitive descriptor obtained in step 2.
Note
Strictly speaking, not all the primitives follow this sequence. For example, the reorder primitive does not have an operation descriptor and thus does not require step 1 above.
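The three steps can be walked through with a self-contained mock that follows the something_forward skeleton. Everything in the mock namespace is hypothetical and only mirrors the shape of the API; it is not the oneDNN implementation. In particular, the rule that resolves the any placeholder (nhwc on CPU, nchw otherwise) and the output_scale attribute are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace mock {

enum class format_tag { any, nchw, nhwc };
enum class engine_kind { cpu, gpu };

struct memory_desc {
    std::vector<std::size_t> dims;
    format_tag tag;
};

// Hypothetical attribute set; a single scale factor is enough here.
struct primitive_attr {
    float output_scale = 1.0f;
};

struct something_forward {
    // Operation descriptor: shapes only, format may be left as `any`.
    struct desc {
        memory_desc src_md;
    };

    // Primitive descriptor: operation descriptor + engine + attributes.
    class primitive_desc {
    public:
        primitive_desc(const desc &d, engine_kind eng,
                       const primitive_attr &attr)
            : src_md_(d.src_md), attr_(attr) {
            assert(!src_md_.dims.empty());
            // Replace the placeholder with a concrete format (invented rule).
            if (src_md_.tag == format_tag::any)
                src_md_.tag = (eng == engine_kind::cpu) ? format_tag::nhwc
                                                        : format_tag::nchw;
        }
        // Query used for memory format propagation.
        const memory_desc &src_desc() const { return src_md_; }
        const primitive_attr &attr() const { return attr_; }

    private:
        memory_desc src_md_;
        primitive_attr attr_;
    };

    explicit something_forward(const primitive_desc &pd) : pd_(pd) {}

    // Toy computation: scales every element by the attribute value.
    std::vector<float> execute(const std::vector<float> &src) const {
        std::vector<float> dst(src.size());
        for (std::size_t i = 0; i < src.size(); ++i)
            dst[i] = pd_.attr().output_scale * src[i];
        return dst;
    }

private:
    primitive_desc pd_;
};

// Runs steps 1-3 and reports which source format the descriptor chose.
inline format_tag create_and_query(engine_kind eng) {
    // Step 1: operation descriptor with a placeholder format.
    memory_desc src_md{{1, 3, 4, 4}, format_tag::any};
    something_forward::desc d{src_md};
    // Step 2: primitive descriptor from descriptor, engine, and attributes.
    primitive_attr attr;
    something_forward::primitive_desc pd(d, eng, attr);
    // Step 3: primitive from the primitive descriptor.
    something_forward prim(pd);
    (void)prim;
    return pd.src_desc().tag;
}

} // namespace mock
```

In this mock, a caller would compare the format returned by src_desc() with the format of its own data and convert the data only if the two differ, which is the idea behind memory format propagation.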