Data storage#
The data storage convention observed by a
descriptor object depends on whether it is a real
or complex descriptor and, in case of complex descriptors, on the configuration
value associated with configuration parameter config_param::COMPLEX_STORAGE
.
Complex descriptors#
For a complex descriptor, the configuration parameter
config_param::COMPLEX_STORAGE
specifies how the entries of the complex data
sequences it consumes and produces are stored. If that configuration parameter is
associated with a configuration value config_value::COMPLEX_COMPLEX
(default
behavior), those entries are accessed and stored as std::complex<float>
(resp. std::complex<double>
) elements of a single data container
(device-accessible USM allocation or sycl::buffer
object) if the
descriptor object is a single-precision (resp.
double-precision) descriptor. If the configuration value
config_value::REAL_REAL
is used instead, the real and imaginary parts of
those entries are accessed and stored as float
(resp. double
) elements
of two separate, non-overlapping data containers (device-accessible USM
allocations or sycl::buffer
objects) if the
descriptor object is a single-precision (resp.
double-precision) descriptor.
These two behaviors are further specified and illustrated below.
config_value::COMPLEX_COMPLEX
for config_param::COMPLEX_STORAGE
For complex descriptors with parameter config_param::COMPLEX_STORAGE
set to
config_value::COMPLEX_COMPLEX
, each of forward- and backward-domain data
sequences must belong to a single data container (device-accessible USM
allocation or sycl::buffer
object). Any relevant entry
\(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) is accessed/stored from/in
a data container provided at compute time at the index value expressed in eq.
(1) (from this page)
of that data container, whose elementary data type is (possibly implicitly
re-interpreted as) std::complex<float>
(resp. std::complex<double>
) for
single-precision (resp. double-precision) descriptors.
The same unique data container is to be used for forward- and backward-domain
data sequences for in-place transforms (for
descriptor objects with configuration value
config_value::INPLACE
for configuration parameter
config_param::PLACEMENT
). Two separate data containers sharing no common
elements are to be used for out-of-place transforms (for
descriptor objects with configuration value
config_value::NOT_INPLACE
for configuration parameter
config_param::PLACEMENT
).
The following snippet illustrates the usage of config_value::COMPLEX_COMPLEX
for configuration parameter config_param::COMPLEX_STORAGE
, in the
context of in-place, single-precision (fp32) calculations of \(M\)
three-dimensional \(n_1 \times n_2 \times n_3\) complex transforms, using
identical (default) strides and distances in forward and backward domains, with
USM allocations.
namespace dft = oneapi::mkl::dft;
dft::descriptor<dft::precision::SINGLE, dft::domain::COMPLEX> desc({n1, n2, n3});
std::vector<std::int64_t> strides({0, n2*n3, n3, 1});
std::int64_t dist = n1*n2*n3;
std::complex<float> *Z = (std::complex<float> *) malloc_device(2*sizeof(float)*n1*n2*n3*M, queue);
desc.set_value(dft::config_param::FWD_STRIDES, strides);
desc.set_value(dft::config_param::BWD_STRIDES, strides);
desc.set_value(dft::config_param::FWD_DISTANCE, dist);
desc.set_value(dft::config_param::BWD_DISTANCE, dist);
desc.set_value(dft::config_param::NUMBER_OF_TRANSFORMS, M);
desc.set_value(dft::config_param::COMPLEX_STORAGE, dft::config_value::COMPLEX_COMPLEX);
desc.commit(queue);
// initialize forward-domain data such that entry {m;k1,k2,k3}
// = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
compute_forward(desc, Z); // complex-to-complex in-place DFT
// in backward domain: entry {m;k1,k2,k3}
// = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
config_value::REAL_REAL
for config_param::COMPLEX_STORAGE
For complex descriptors with parameter config_param::COMPLEX_STORAGE
set to
config_value::REAL_REAL
, forward- and backward-domain data sequences are
read/stored from/in two different, non-overlapping data containers
(device-accessible USM allocations or sycl::buffer
objects) encapsulating
the real and imaginary parts of the relevant entries separately. The real and
imaginary parts of any relevant complex entry
\(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) are both stored at the index value
expressed in eq. (1) (from this
page) of their respective data containers, whose elementary
data type is (possibly implicitly re-interpreted as) float
(resp.
double
) for single-precision (resp. double-precision) descriptors.
The same two data containers are to be used for real and imaginary parts of
forward- and backward-domain data sequences for in-place transforms (for
descriptor objects with configuration value
config_value::INPLACE
for configuration parameter
config_param::PLACEMENT
). Four separate data containers sharing no common
elements are to be used for out-of-place transforms (for
descriptor objects with configuration value
config_value::NOT_INPLACE
for configuration parameter
config_param::PLACEMENT
).
The following snippet illustrates the usage of config_value::REAL_REAL
set for configuration parameter config_param::COMPLEX_STORAGE
, in the
context of in-place, single-precision (fp32) calculation of \(M\)
three-dimensional \(n_1 \times n_2 \times n_3\) complex transforms, using
identical (default) strides and distances in forward and backward domains, with
USM allocations.
namespace dft = oneapi::mkl::dft;
dft::descriptor<dft::precision::SINGLE, dft::domain::COMPLEX> desc({n1, n2, n3});
std::vector<std::int64_t> strides({0, n2*n3, n3, 1});
std::int64_t dist = n1*n2*n3;
float *ZR = (float *) malloc_device(sizeof(float)*n1*n2*n3*M, queue); // data container for real parts
float *ZI = (float *) malloc_device(sizeof(float)*n1*n2*n3*M, queue); // data container for imaginary parts
desc.set_value(dft::config_param::FWD_STRIDES, strides);
desc.set_value(dft::config_param::BWD_STRIDES, strides);
desc.set_value(dft::config_param::FWD_DISTANCE, dist);
desc.set_value(dft::config_param::BWD_DISTANCE, dist);
desc.set_value(dft::config_param::NUMBER_OF_TRANSFORMS, M);
desc.set_value(dft::config_param::COMPLEX_STORAGE, dft::config_value::REAL_REAL);
desc.commit(queue);
// initialize forward-domain data such that the real part of entry {m;k1,k2,k3}
// = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
// and the imaginary part of entry {m;k1,k2,k3}
// = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
compute_forward<decltype(desc), float>(desc, ZR, ZI); // complex-to-complex in-place DFT
// in backward domain: the real part of entry {m;k1,k2,k3}
// = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
// and the imaginary part of entry {m;k1,k2,k3}
// = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
Real descriptors#
Real descriptors observe only one type of data storage. Any relevant (real)
entry \(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) of a data sequence
in forward domain is accessed and stored as a float
(resp. double
)
element of a single data container (device-accessible USM allocation or
sycl::buffer
object) if the descriptor object
is a single-precision (resp. double-precision) descriptor. Any relevant
(complex) entry \(\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}\) of a data
sequence in backward domain is accessed and stored as a std::complex<float>
(resp. std::complex<double>
) element of a single data container
(device-accessible USM allocation or sycl::buffer
object) if the
descriptor object is a single-precision (resp.
double-precision) descriptor.
The following snippet illustrates the usage of a real, single-precision descriptor (and the corresponding data storage) for the in-place, single-precision (fp32), calculation of \(M\) three-dimensional \(n_1 \times n_2 \times n_3\) real transforms, using default strides in forward and backward domains, with USM allocations.
namespace dft = oneapi::mkl::dft;
dft::descriptor<dft::precision::SINGLE, dft::domain::REAL> desc({n1, n2, n3});
// Note: integer divisions here below
std::vector<std::int64_t> fwd_strides({0, 2*n2*(n3/2 + 1), 2*(n3/2 + 1), 1});
std::vector<std::int64_t> bwd_strides({0, n2*(n3/2 + 1), (n3/2 + 1), 1});
std::int64_t fwd_dist = 2*n1*n2*(n3/2 + 1);
std::int64_t bwd_dist = n1*n2*(n3/2 + 1);
float *data = (float *) malloc_device(sizeof(float)*fwd_dist*M, queue); // data container
desc.set_value(dft::config_param::FWD_STRIDES, fwd_strides);
desc.set_value(dft::config_param::BWD_STRIDES, bwd_strides);
desc.set_value(dft::config_param::FWD_DISTANCE, fwd_dist);
desc.set_value(dft::config_param::BWD_DISTANCE, bwd_dist);
desc.set_value(dft::config_param::NUMBER_OF_TRANSFORMS, M);
desc.commit(queue);
// initialize forward-domain data such that real entry {m;k1,k2,k3}
// = data[ fwd_strides[0] + k1*fwd_strides[1] + k2*fwd_strides[2] + k3*fwd_strides[3] + m*fwd_dist ]
compute_forward(desc, data); // real-to-complex in-place DFT
// in backward domain, the implicitly-assumed type is complex so, considering
// std::complex<float>* complex_data = static_cast<std::complex<float>*>(data);
// we have entry {m;k1,k2,k3}
// = complex_data[ bwd_strides[0] + k1*bwd_strides[1] + k2*bwd_strides[2] + k3*bwd_strides[3] + m*bwd_dist ]
// for 0 <= k3 <= n3/2.
// Note: if n3/2 < k3 < n3, entry {m;k1,k2,k3} = std::conj(entry {m;n1-k1,n2-k2,n3-k3})
Parent topic DFT-related scoped enumeration types