Post-ops#

Post-ops are operations that are appended after a primitive. They are implemented using the Attributes mechanism. If there are multiple post-ops, they are executed in the order in which they were appended, as follows:

\[\dst = po[n](po[n-1] (...(po[0](OP()))))\]
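The composition above can be modeled in plain C++ without oneDNN. The sketch below is only an illustration of the ordering rule: `PostOp`, `apply_post_ops`, `add_then_double`, and `double_then_add` are hypothetical names that are not part of the library, and each post-op is reduced to a function of a single value.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a post-op: a function applied to one value.
// This is NOT a oneDNN type; it only models the ordering semantics.
using PostOp = std::function<float(float)>;

// Applies po[0] first, then po[1], and so on, mirroring
// dst = po[n](po[n-1](...(po[0](OP())))).
inline float apply_post_ops(float op_result, const std::vector<PostOp> &po) {
    float value = op_result;
    for (const PostOp &f : po) value = f(value);
    return value;
}

// The same two post-ops appended in opposite orders.
inline float add_then_double(float x) {
    return apply_post_ops(x, {[](float v) { return v + 1.f; },
                              [](float v) { return v * 2.f; }});
}

inline float double_then_add(float x) {
    return apply_post_ops(x, {[](float v) { return v * 2.f; },
                              [](float v) { return v + 1.f; }});
}
```

For an input of 3, the first chain computes (3 + 1) * 2 and the second computes 3 * 2 + 1, so swapping the append order changes the result.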

Note

Post-ops do not preserve intermediate data during computation. This typically makes them suitable for inference only.

The post-ops are represented by dnnl::post_ops, which is copied once it is attached to the attributes using the dnnl::primitive_attr::set_post_ops() function. The attributes then need to be passed to a primitive descriptor creation function to take effect. Below is a simple sketch:

dnnl::post_ops po; // default empty post-ops
assert(po.len() == 0); // no post-ops attached

po.append_SOMETHING(params); // append some particular post-op
po.append_SOMETHING_ELSE(other_params); // append one more post-op

// (!) Note that the order in which post-ops are appended matters!
assert(po.len() == 2);

dnnl::primitive_attr attr; // default attributes
attr.set_post_ops(po); // attach the post-ops to the attr
// any changes to po after this point don't affect the value stored in attr

primitive::primitive_desc op_pd(params, attr); // create a pd with the attr

Note

Different primitives may support different post-ops. Moreover, support may also depend on the actual implementation of a primitive, so robust code should handle errors accordingly. See Attribute Related Error Handling.

Note

Post-ops do not change memory format of the operation destination memory object.

The post-op objects can be inspected using the dnnl::post_ops::kind() function that takes an index of the post-op to inspect (that must be less than the value returned by dnnl::post_ops::len()), and returns its kind.

Supported Post-ops#

Eltwise Post-op#

The eltwise post-op is appended using the dnnl::post_ops::append_eltwise() function. For such a post-op, dnnl::post_ops::kind() returns dnnl::primitive::kind::eltwise.

The eltwise post-op replaces:

\[\dst[:] = \operatorname{Op}(...)\]

with

\[\dst[:] = scale \cdot \operatorname{eltwise}(\operatorname{Op}(...))\]

The intermediate result of the \(\operatorname{Op}(...)\) is not preserved.

The \(scale\) factor is supported in int8 inference only; in all other cases it must be \(1.0\), which is the default value. The scale can be set using the dnnl::primitive_attr::set_scales_mask() attribute for the argument DNNL_ARG_ATTR_MULTIPLE_POST_OP.
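As a reference for the formula above, the following plain C++ sketch models the eltwise post-op with ReLU as the elementwise function. It does not call oneDNN; `relu` and `eltwise_post_op` are hypothetical helper names, and the sketch assumes that the ReLU parameter `alpha` acts as the negative slope.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Reference ReLU with negative slope `alpha` (alpha = 0 gives plain ReLU).
inline float relu(float x, float alpha) { return x > 0.f ? x : alpha * x; }

// Models dst[:] = scale * eltwise(Op(...)) for an already computed Op(...).
// Only the final values are returned; the intermediate Op(...) result is
// not kept, matching the note that it is not preserved.
inline std::vector<float> eltwise_post_op(const std::vector<float> &op_result,
        float alpha, float scale = 1.f) {
    std::vector<float> dst(op_result.size());
    std::transform(op_result.begin(), op_result.end(), dst.begin(),
            [=](float v) { return scale * relu(v, alpha); });
    return dst;
}
```

With `alpha = 0` and the default scale, negative inputs become zero and positive inputs pass through unchanged.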

Sum Post-op#

The sum post-op accumulates the result of a primitive with the existing data and is appended using the dnnl::post_ops::append_sum() function. For such a post-op, dnnl::post_ops::kind() returns dnnl::primitive::kind::sum.

Prior to accumulating the result, the existing value is multiplied by scale. The \(scale\) factor is supported in int8 inference only and should be used only when the result and the existing data have different magnitudes. In all other cases it must be \(1.0\), which is the default value. The scale can be set using the dnnl::primitive_attr::set_scales_mask() attribute for the argument DNNL_ARG_ATTR_MULTIPLE_POST_OP.

Additionally, the sum post-op can reinterpret the destination values as a different data type of the same size. This may be used to, for example, reinterpret 8-bit signed data as unsigned or vice versa (which requires that values fall within a common range to work).

The sum post-op replaces

\[\dst[:] = \operatorname{Op}(...)\]

with

\[\dst[:] = scale \cdot as_data_type(\dst[:]) + \operatorname{Op}(...)\]
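The reinterpretation step can be illustrated with a small plain C++ model. It is a sketch under stated assumptions, not library code: `as_u8` and `sum_post_op_u8` are hypothetical names, the existing destination is assumed to hold signed 8-bit data that is reinterpreted as unsigned 8-bit, and the result is kept in float without modeling any conversion back to the destination type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterprets the bits of a signed 8-bit value as an unsigned 8-bit value.
inline std::uint8_t as_u8(std::int8_t v) {
    std::uint8_t out;
    std::memcpy(&out, &v, sizeof(out));
    return out;
}

// Models dst = scale * as_data_type(dst) + Op(...) for a single element,
// with as_data_type being the s8 -> u8 reinterpretation above.
inline float sum_post_op_u8(std::int8_t existing_dst, float op_result,
        float scale = 1.f) {
    return scale * static_cast<float>(as_u8(existing_dst)) + op_result;
}
```

A stored value of 100 reads as 100 under both types, while -1 reads as 255 once reinterpreted, which is why the values need to fall within the range common to both types for the result to be meaningful.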

Binary post-ops#

The binary post-op replaces:

\[\dst[:] = \operatorname{Op}(...)\]

with

\[\dst[:] = \operatorname{binary}(\operatorname{Op}(...), scale[:] \cdot Source\_1[:])\]

The binary post-op supports the same algorithms and broadcast semantics as the binary primitive.

Furthermore, the binary post-op scale parameter is set to \(1.0\) by default, and can be set using the dnnl::primitive_attr::set_scales_mask() attribute for the argument DNNL_ARG_ATTR_MULTIPLE_POST_OP(idx) | DNNL_ARG_SRC_1, where idx is the position of the binary post-op in the chain. For example:

primitive_attr attr;
post_ops p_ops;
p_ops.append_binary(algorithm::binary_add, summand_md);

attr.set_post_ops(p_ops);
attr.set_scales_mask(DNNL_ARG_ATTR_MULTIPLE_POST_OP(0) | DNNL_ARG_SRC_1,
        /* mask */ 0);
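To make the formula and the broadcast behavior concrete, here is a plain C++ reference model of a binary add post-op. It is only a sketch: `binary_add_post_op` is a hypothetical name, the tensor is assumed to be laid out as channels times spatial in row-major order, the second operand is assumed to hold one value per channel, and a single common scale is assumed (as with a mask of 0).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models dst = binary_add(Op(...), scale * src1), where src1 has one value
// per channel and is broadcast over the `spatial` elements of that channel.
inline std::vector<float> binary_add_post_op(
        const std::vector<float> &op_result,
        const std::vector<float> &src1_per_channel, std::size_t spatial,
        float scale = 1.f) {
    assert(spatial > 0);
    assert(op_result.size() == src1_per_channel.size() * spatial);
    std::vector<float> dst(op_result.size());
    for (std::size_t i = 0; i < op_result.size(); ++i) {
        const std::size_t channel = i / spatial; // row-major channel index
        dst[i] = op_result[i] + scale * src1_per_channel[channel];
    }
    return dst;
}
```

For a result of {1, 2, 3, 4} with two channels and a per-channel operand of {10, 20}, the first two elements receive 10 and the last two receive 20.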

Examples of Chained Post-ops#

Post-ops can be chained together by appending one after another. Note that the order matters: the post-ops are executed in the order they have been appended.

Sum -> ReLU#

This pattern is common in CNN topologies of the ResNet family.

dnnl::post_ops po;
po.append_sum();
po.append_eltwise(
        /* algorithm = */ dnnl::algorithm::eltwise_relu,
        /* neg slope = */ 0.f,
        /* unused for ReLU */ 0.f);

dnnl::primitive_attr attr;
attr.set_post_ops(po);

convolution_forward::primitive_desc(conv_d, attr, engine);

This will lead to the following computations:

\[\dst[:] = \operatorname{ReLU}(\dst[:] + \operatorname{conv}(\src[:], \weights[:]))\]
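As a worked illustration of why the append order matters here, take a single element whose existing destination value is \(-3\) and whose convolution result is \(1\) (illustrative values chosen for this example). With Sum -> ReLU the element becomes:

\[\operatorname{ReLU}(-3 + 1) = \operatorname{ReLU}(-2) = 0\]

If the two post-ops were appended in the opposite order (ReLU -> Sum), the same element would instead become:

\[-3 + \operatorname{ReLU}(1) = -3 + 1 = -2\]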

API#

struct post_ops#

Post-ops.

Post-ops are computations executed after the main primitive computations and are attached to the primitive via primitive attributes.

Public Functions

post_ops()#

Constructs an empty sequence of post-ops.

int len() const#

Returns the number of post-ops entries.

primitive::kind kind(int index) const#

Returns the primitive kind of the post-op at the given index.

Parameters:

index – Index of the post-op to return the kind for.

Returns:

Primitive kind of the post-op at the specified index.

void append_sum(memory::data_type data_type = memory::data_type::undef)#

Appends an accumulation (sum) post-op. Prior to accumulating the result, the previous value is multiplied by the scaling factor scale, which is provided as an execution argument.

The kind of this post-op is dnnl::primitive::kind::sum.

This feature may improve performance for cases like residual learning blocks, where the result of convolution is accumulated to the previously computed activations. The parameter scale may be used for the integer-based computations when the result and previous activations have different logical scaling factors.

In the simplest case, when the accumulation is the only post-op, the computations would be dst[:] <- scale * dst[:] + op(...) instead of dst[:] <- op(...).

If data_type is specified, the original dst tensor will be reinterpreted as a tensor with the provided data type. Because it is a reinterpretation, data_type and dst data type should have the same size. As a result, computations would be dst[:] <- scale * as_data_type(dst[:]) + op(...) instead of dst[:] <- op(...).

Note

This post-op executes in-place and does not change the destination layout.

Parameters:

data_type – Data type that the destination is reinterpreted as for the accumulation.

void get_params_sum(int index, float &scale) const#

Returns the parameters of an accumulation (sum) post-op.

Parameters:
  • index – Index of the sum post-op.

  • scale – Scaling factor of the sum post-op.

void get_params_sum(int index, float &scale, memory::data_type &data_type) const#

Returns the parameters of an accumulation (sum) post-op.

Parameters:
  • index – Index of the sum post-op.

  • scale – Scaling factor of the sum post-op.

  • data_type – Data type of the sum post-op.

void append_eltwise(algorithm aalgorithm, float alpha, float beta)#

Appends an elementwise post-op.

The kind of this post-op is dnnl::primitive::kind::eltwise.

In the simplest case, when the elementwise is the only post-op, the computations would be dst[:] <- scale * eltwise_op(op(...)) instead of dst[:] <- op(...), where eltwise_op is configured with the given parameters.

Parameters:
  • aalgorithm – Elementwise algorithm.

  • alpha – Alpha parameter for the elementwise algorithm.

  • beta – Beta parameter for the elementwise algorithm.

void get_params_eltwise(int index, algorithm &aalgorithm, float &alpha, float &beta) const#

Returns the parameters of an elementwise post-op.

Parameters:
  • index – Index of the post-op.

  • aalgorithm – Output elementwise algorithm kind.

  • alpha – Output alpha parameter for the elementwise algorithm.

  • beta – Output beta parameter for the elementwise algorithm.

void append_binary(algorithm aalgorithm, const memory::desc &src1_desc)#

Appends a binary post-op.

The kind of this post-op is dnnl::primitive::kind::binary.

In the simplest case, when the binary is the only post-op, the computations would be:

dst[:] <- binary_op (dst[:], another_input[:])
where binary_op is configured with the given parameters. binary_op supports broadcast semantics for a second operand.

Parameters:
  • aalgorithm – Binary algorithm for the post-op.

  • src1_desc – Memory descriptor of a second operand.

void get_params_binary(int index, algorithm &aalgorithm, memory::desc &src1_desc) const#

Returns the parameters of a binary post-op.

Parameters:
  • index – Index of the binary post-op.

  • aalgorithm – Output binary algorithm kind.

  • src1_desc – Output memory descriptor of a second operand.