Shuffle

The shuffle primitive shuffles data along the shuffle axis (designated here as \(C\)) with the group parameter \(G\). Namely, the shuffle axis is viewed as a 2D tensor of size \((\frac{C}{G} \times G)\) that is transposed to \((G \times \frac{C}{G})\). Variable names follow the standard Conventions.
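As a small worked illustration of the transpose view (the sizes \(C = 6\) and \(G = 3\) are chosen here only for the example), the six channel indices of the source are arranged row by row as a \(2 \times 3\) matrix and then transposed:

\[
\begin{pmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \end{pmatrix}
\;\longrightarrow\;
\begin{pmatrix} 0 & 3 \\ 1 & 4 \\ 2 & 5 \end{pmatrix}
\]

Reading the transposed matrix row by row gives the order in which source channels appear in the destination: 0, 3, 1, 4, 2, 5.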

The formal definition is shown below:

Forward

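\[
dst(\overline{ou}, c, \overline{in}) = src(\overline{ou}, c', \overline{in})
\]

where

  • \(c\) dimension is called a shuffle axis,

  • \(G\) is a group_size,

  • \(\overline{ou}\) denotes the outermost indices (to the left of the shuffle axis),

  • \(\overline{in}\) denotes the innermost indices (to the right of the shuffle axis), and

  • \(c'\) and \(c\) relate to each other as defined by the system:

\[
\begin{cases}
c  &= u + v\frac{C}{G}, \\
c' &= uG + v,
\end{cases}
\]

Here, \(0 \leq u < \frac{C}{G}\) and \(0 \leq v < G\).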
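The definition above can be sketched as a plain reference loop. This is only an illustration, not the library's implementation: the helper name shuffle_ref is hypothetical, and the tensor is assumed to be dense and stored contiguously as (outer, C, inner) with the inner indices varying fastest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference shuffle over a dense tensor viewed as (outer, C, inner).
// Hypothetical helper for illustration; not part of the oneDNN API.
std::vector<float> shuffle_ref(const std::vector<float> &src,
        std::size_t outer, std::size_t C, std::size_t inner, std::size_t G) {
    assert(G > 0 && C % G == 0); // the axis must split into whole groups
    assert(src.size() == outer * C * inner);

    const std::size_t CG = C / G;
    std::vector<float> dst(src.size());
    for (std::size_t ou = 0; ou < outer; ++ou)
        for (std::size_t u = 0; u < CG; ++u)
            for (std::size_t v = 0; v < G; ++v) {
                const std::size_t c = u + v * CG;    // destination index on the axis
                const std::size_t c_src = u * G + v; // source index on the axis
                for (std::size_t in = 0; in < inner; ++in)
                    dst[(ou * C + c) * inner + in]
                            = src[(ou * C + c_src) * inner + in];
            }
    return dst;
}
```

For example, with a single outer and inner index, C = 6 and G = 3, the input {0, 1, 2, 3, 4, 5} is reordered to {0, 3, 1, 4, 2, 5}.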

Difference Between Forward Training and Forward Inference

There is no difference between the forward_training and forward_inference propagation kinds.

Backward

The backward propagation computes diff_src(ou,c,in), based on diff_dst(ou,c,in).

Essentially, backward propagation is the same as forward propagation with \(G\) replaced by \(C / G\).
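A minimal sketch of this relationship on a single 1D axis follows. The function names shuffle_axis and shuffle_axis_backward are hypothetical and only illustrate the index arithmetic: applying the forward rule with the group size replaced by the axis length divided by the group size undoes the forward permutation, which is how the gradient is routed back to the source positions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forward shuffle of a 1D axis of length C = x.size() with group size G:
// y[u + v * (C / G)] = x[u * G + v].
std::vector<float> shuffle_axis(const std::vector<float> &x, std::size_t G) {
    const std::size_t C = x.size();
    assert(G > 0 && C % G == 0);
    const std::size_t CG = C / G;
    std::vector<float> y(C);
    for (std::size_t u = 0; u < CG; ++u)
        for (std::size_t v = 0; v < G; ++v)
            y[u + v * CG] = x[u * G + v];
    return y;
}

// Backward pass: compute diff_src from diff_dst by applying the forward
// rule with the group size G replaced by C / G.
std::vector<float> shuffle_axis_backward(
        const std::vector<float> &diff_dst, std::size_t G) {
    const std::size_t C = diff_dst.size();
    assert(C > 0 && G > 0 && C % G == 0);
    return shuffle_axis(diff_dst, C / G);
}
```

Calling shuffle_axis_backward on the output of shuffle_axis with the same group size returns the original ordering.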

Execution Arguments

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.

| Primitive input/output | Execution argument index |
|------------------------|--------------------------|
| src                    | DNNL_ARG_SRC             |
| dst                    | DNNL_ARG_DST             |
| diff_src               | DNNL_ARG_DIFF_SRC        |
| diff_dst               | DNNL_ARG_DIFF_DST        |

Operation Details

  1. The memory format and data type for src and dst are assumed to be the same, and in the API they are typically referred to as data (e.g., see data_desc in dnnl::shuffle_forward::desc::desc()). The same holds for diff_src and diff_dst. The corresponding memory descriptors are referred to as diff_data_desc.

Data Types Support

The shuffle primitive supports the following combinations of data types:

Note

Here we abbreviate data type names for readability. For example, dnnl::memory::data_type::f32 is abbreviated to f32.

| Propagation        | Source / Destination |
|--------------------|----------------------|
| forward / backward | f32, bf16            |
| forward            | s32, s8, u8          |

Data Layouts

The shuffle primitive works with arbitrary data tensors. There is no special meaning associated with any logical dimensions. However, the shuffle axis is typically referred to as channels (hence in formulas we use c).

The shuffle operation typically appears in CNN topologies. Hence, in the library the shuffle primitive is optimized for the corresponding memory formats:

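| Spatial | Logical tensor | Shuffle Axis | Implementations optimized for memory formats |
|---------|----------------|--------------|----------------------------------------------|
| 2D      | NCHW           | 1 (C)        | nchw (abcd), nhwc (acdb), optimized^         |
| 3D      | NCDHW          | 1 (C)        | ncdhw (abcde), ndhwc (acdeb), optimized^     |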

Here optimized^ means the format that comes out of any preceding compute-intensive primitive.
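To illustrate that the shuffle is defined on logical indices while the memory format only changes where each element lives, the following sketch shuffles axis 1 (C) of a 4D tensor through explicit strides. All names here (dims4, nchw_strides, nhwc_strides, shuffle_c) are hypothetical, the strides assume dense plain layouts, and blocked optimized^ formats are not covered.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using dims4 = std::array<std::size_t, 4>; // logical order: N, C, H, W

// Dense strides (in elements) for the two plain 2D-spatial layouts.
dims4 nchw_strides(const dims4 &d) {
    return {d[1] * d[2] * d[3], d[2] * d[3], d[3], 1};
}
dims4 nhwc_strides(const dims4 &d) {
    return {d[2] * d[3] * d[1], 1, d[3] * d[1], d[1]};
}

// Shuffle along logical axis 1 (C); src and dst share the same strides s.
std::vector<float> shuffle_c(const std::vector<float> &src, const dims4 &d,
        const dims4 &s, std::size_t G) {
    assert(G > 0 && d[1] % G == 0);
    assert(src.size() == d[0] * d[1] * d[2] * d[3]);
    const std::size_t CG = d[1] / G;
    std::vector<float> dst(src.size());
    for (std::size_t n = 0; n < d[0]; ++n)
        for (std::size_t c = 0; c < d[1]; ++c) {
            // Split c = u + v * (C / G), then form the source index u * G + v.
            const std::size_t u = c % CG, v = c / CG;
            const std::size_t c_src = u * G + v;
            for (std::size_t h = 0; h < d[2]; ++h)
                for (std::size_t w = 0; w < d[3]; ++w)
                    dst[n * s[0] + c * s[1] + h * s[2] + w * s[3]]
                            = src[n * s[0] + c_src * s[1] + h * s[2] + w * s[3]];
        }
    return dst;
}
```

The same function handles both layouts: only the stride array passed in differs, and the logical result is identical.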

Post-ops and Attributes

The shuffle primitive does not have to support any post-ops or attributes.

API

struct dnnl::shuffle_forward : public dnnl::primitive

Shuffle forward propagation primitive.

Public Functions

shuffle_forward()

Default constructor. Produces an empty object.

shuffle_forward(const primitive_desc &pd)

Constructs a shuffle forward propagation primitive.

Parameters
  • pd: Primitive descriptor for a shuffle forward propagation primitive.

struct desc

Descriptor for a shuffle forward propagation primitive.

Public Functions

desc(prop_kind aprop_kind, const memory::desc &data_desc, int axis, int group_size)

Constructs a descriptor for a shuffle forward propagation primitive.

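Parameters
  • aprop_kind: Propagation kind. Possible values are dnnl::prop_kind::forward_training and dnnl::prop_kind::forward_inference.

  • data_desc: Source and destination memory descriptor.

  • axis: The axis along which the data is shuffled.

  • group_size: Shuffle group size.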

struct primitive_desc : public dnnl::primitive_desc

Primitive descriptor for a shuffle forward propagation primitive.

Public Functions

primitive_desc()

Default constructor. Produces an empty object.

primitive_desc(const desc &adesc, const engine &aengine, const primitive_attr &attr = primitive_attr(), bool allow_empty = false)

Constructs a primitive descriptor for a shuffle forward propagation primitive.

Parameters
  • adesc: Descriptor for a shuffle forward propagation primitive.

  • aengine: Engine to use.

  • attr: Primitive attributes to use.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case an empty object will be produced. This flag is optional and defaults to false.

memory::desc src_desc() const

Returns a source memory descriptor.

Return

Source memory descriptor, or a zero memory descriptor if the primitive does not have a source parameter.

memory::desc dst_desc() const

Returns a destination memory descriptor.

Return

Destination memory descriptor, or a zero memory descriptor if the primitive does not have a destination parameter.

struct dnnl::shuffle_backward : public dnnl::primitive

Shuffle backward propagation primitive.

Public Functions

shuffle_backward()

Default constructor. Produces an empty object.

shuffle_backward(const primitive_desc &pd)

Constructs a shuffle backward propagation primitive.

Parameters
  • pd: Primitive descriptor for a shuffle backward propagation primitive.

struct desc

Descriptor for a shuffle backward propagation primitive.

Public Functions

desc(const memory::desc &diff_data_desc, int axis, int group_size)

Constructs a descriptor for a shuffle backward propagation primitive.

Parameters
  • diff_data_desc: Diff source and diff destination memory descriptor.

  • axis: The axis along which the data is shuffled.

  • group_size: Shuffle group size.

struct primitive_desc : public dnnl::primitive_desc

Primitive descriptor for a shuffle backward propagation primitive.

Public Functions

primitive_desc()

Default constructor. Produces an empty object.

primitive_desc(const desc &adesc, const engine &aengine, const shuffle_forward::primitive_desc &hint_fwd_pd, const primitive_attr &attr = primitive_attr(), bool allow_empty = false)

Constructs a primitive descriptor for a shuffle backward propagation primitive.

Parameters
  • adesc: Descriptor for a shuffle backward propagation primitive.

  • aengine: Engine to use.

  • hint_fwd_pd: Primitive descriptor for a shuffle forward propagation primitive. It is used as a hint for deciding which memory format to use.

  • attr: Primitive attributes to use.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case an empty object will be produced. This flag is optional and defaults to false.

memory::desc diff_src_desc() const

Returns a diff source memory descriptor.

Return

Diff source memory descriptor, or a zero memory descriptor if the primitive does not have a diff source parameter.

memory::desc diff_dst_desc() const

Returns a diff destination memory descriptor.

Return

Diff destination memory descriptor, or a zero memory descriptor if the primitive does not have a diff destination parameter.