copy_batch
Computes a group of copy operations.
Description
The copy_batch routines are batched versions of copy, performing multiple copy operations in a single call. Each copy operation copies one vector to another.
copy_batch supports the following precisions for the data type T:
- float
- double
- std::complex<float>
- std::complex<double>
copy_batch (Buffer Version)
Description
The buffer version of copy_batch supports only the strided API.
The strided API operation is defined as:
for i = 0 … batch_size – 1
    X and Y are vectors at offset i * stridex, i * stridey in x and y
    Y := X
end for
where:
X and Y are vectors.
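The strided definition above can be read as a pair of nested loops. The following is a minimal host-side sketch of that semantics in plain C++, not the oneMKL implementation: `copy_batch_strided_ref` is a hypothetical helper, it uses `std::vector` in place of SYCL buffers, and it assumes positive `incx` and `incy`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference for the strided semantics only.
// Assumption: incx and incy are positive; negative increments are not handled.
template <typename T>
void copy_batch_strided_ref(std::int64_t n,
                            const std::vector<T> &x, std::int64_t incx, std::int64_t stridex,
                            std::vector<T> &y, std::int64_t incy, std::int64_t stridey,
                            std::int64_t batch_size) {
    assert(n >= 0 && batch_size >= 0 && incx > 0 && incy > 0);
    for (std::int64_t i = 0; i < batch_size; ++i) {
        // X and Y are the vectors at offsets i * stridex and i * stridey.
        const T *X = x.data() + i * stridex;
        T *Y = y.data() + i * stridey;
        // Y := X, honoring the per-vector increments.
        for (std::int64_t k = 0; k < n; ++k) {
            Y[k * incy] = X[k * incx];
        }
    }
}
```

With n = 3, stridex = 3 and stridey = 4, for instance, two tightly packed input vectors are written at offsets 0 and 4 of y, and the padding elements y[3] and y[7] are left untouched.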
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
    void copy_batch(sycl::queue &queue,
                    std::int64_t n,
                    sycl::buffer<T, 1> &x,
                    std::int64_t incx,
                    std::int64_t stridex,
                    sycl::buffer<T, 1> &y,
                    std::int64_t incy,
                    std::int64_t stridey,
                    std::int64_t batch_size)
}
namespace oneapi::mkl::blas::row_major {
    void copy_batch(sycl::queue &queue,
                    std::int64_t n,
                    sycl::buffer<T, 1> &x,
                    std::int64_t incx,
                    std::int64_t stridex,
                    sycl::buffer<T, 1> &y,
                    std::int64_t incy,
                    std::int64_t stridey,
                    std::int64_t batch_size)
}
Input Parameters
- queue
The queue where the routine should be executed.
- n
Number of elements in X and Y.
- x
Buffer holding input vectors X with size stridex * batch_size.
- incx
Stride of vector X.
- stridex
Stride between different X vectors.
- y
Buffer holding input/output vectors Y with size stridey * batch_size.
- incy
Stride of vector Y.
- stridey
Stride between different Y vectors.
- batch_size
Specifies the number of copy operations to perform.
Output Parameters
- y
Output buffer, overwritten by batch_size copy operations.
Throws
This routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here.
copy_batch (USM Version)
Description
The USM version of copy_batch supports the group API and strided API.
The group API operation is defined as:
idx = 0
for i = 0 … group_count – 1
    for j = 0 … group_size[i] – 1
        X and Y are vectors in x[idx] and y[idx]
        Y := X
        idx := idx + 1
    end for
end for
The strided API operation is defined as:
for i = 0 … batch_size – 1
    X and Y are vectors at offset i * stridex, i * stridey in x and y
    Y := X
end for
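where:
X and Y are vectors.
For the group API, the x and y arrays contain the pointers to all the input and output vectors. The total number of vectors in x and y is total_batch_count, the sum of the group sizes over all groups:
$$\text{total\_batch\_count} = \sum_{i=0}^{\text{group\_count}-1} \text{group\_size}[i]$$
For the strided API, the x and y arrays contain all the input and output vectors. The total number of vectors in x and y is given by the batch_size parameter.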
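The group API loop can be illustrated on the host in the same spirit. The sketch below is not the oneMKL implementation: `total_batch_count` and `copy_batch_group_ref` are hypothetical helpers, plain host pointers stand in for USM pointers, increments are assumed positive, and the total vector count is taken to be the sum of the `group_size` entries, as the nested loop implies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of vectors across all groups,
// assumed to be the sum of group_size[i].
inline std::int64_t total_batch_count(std::int64_t group_count,
                                      const std::int64_t *group_size) {
    std::int64_t total = 0;
    for (std::int64_t i = 0; i < group_count; ++i) {
        total += group_size[i];
    }
    return total;
}

// Hypothetical host-side reference for the group semantics only.
// Assumption: every incx[i] and incy[i] is positive.
template <typename T>
void copy_batch_group_ref(const std::int64_t *n, const T **x, const std::int64_t *incx,
                          T **y, const std::int64_t *incy,
                          std::int64_t group_count, const std::int64_t *group_size) {
    std::int64_t idx = 0;  // running index into the pointer arrays x and y
    for (std::int64_t i = 0; i < group_count; ++i) {
        assert(n[i] >= 0 && incx[i] > 0 && incy[i] > 0 && group_size[i] >= 0);
        for (std::int64_t j = 0; j < group_size[i]; ++j) {
            const T *X = x[idx];
            T *Y = y[idx];
            // Y := X using the parameters shared by group i.
            for (std::int64_t k = 0; k < n[i]; ++k) {
                Y[k * incy[i]] = X[k * incx[i]];
            }
            ++idx;
        }
    }
}
```

All vectors in a group share `n[i]`, `incx[i]` and `incy[i]`, which is why those arrays have `group_count` entries while `x` and `y` have one pointer per vector.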
Group API
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event copy_batch(sycl::queue &queue,
                           const std::int64_t *n,
                           const T **x,
                           const std::int64_t *incx,
                           T **y,
                           const std::int64_t *incy,
                           std::int64_t group_count,
                           const std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
    sycl::event copy_batch(sycl::queue &queue,
                           const std::int64_t *n,
                           const T **x,
                           const std::int64_t *incx,
                           T **y,
                           const std::int64_t *incy,
                           std::int64_t group_count,
                           const std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- n
Array of group_count integers. n[i] specifies the number of elements in vectors X and Y for every vector in group i.
- x
Array of pointers to input vectors X with size total_batch_count. The size of the array allocated for the X vector of group i must be at least (1 + (n[i] – 1)*abs(incx[i])). See Matrix Storage for more details.
- incx
Array of group_count integers. incx[i] specifies the stride of vector X in group i.
- y
Array of pointers to input/output vectors Y with size total_batch_count. The size of the array allocated for the Y vector of group i must be at least (1 + (n[i] – 1)*abs(incy[i])). See Matrix Storage for more details.
- incy
Array of group_count integers. incy[i] specifies the stride of vector Y in group i.
- group_count
Number of groups. Must be at least 0.
- group_size
Array of group_count integers. group_size[i] specifies the number of copy operations in group i. Each element in group_size must be at least 0.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- y
Array of pointers holding the Y vectors, overwritten by total_batch_count copy operations.
Return Values
Output event to wait on to ensure computation is complete.
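The minimum allocation size quoted for each X and Y vector in the group API is easy to get wrong when increments are larger than 1 or negative. As a small sketch, the hypothetical helper `min_vector_length` below evaluates the stated bound 1 + (n – 1)*abs(inc) literally; it is an illustration of the formula, not a oneMKL function, and it assumes n is at least 1.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: smallest number of elements that must be allocated
// for a vector of n logical elements accessed with increment inc.
// Assumption: n >= 1 (the bound is applied exactly as written in the text).
inline std::int64_t min_vector_length(std::int64_t n, std::int64_t inc) {
    assert(n >= 1);
    return 1 + (n - 1) * std::abs(inc);
}
```

A vector of 4 elements with an increment of 2 or -2 therefore needs at least 7 allocated elements, while a contiguous vector (increment 1) needs exactly n.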
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event copy_batch(sycl::queue &queue,
                           std::int64_t n,
                           const T *x,
                           std::int64_t incx,
                           std::int64_t stridex,
                           T *y,
                           std::int64_t incy,
                           std::int64_t stridey,
                           std::int64_t batch_size,
                           const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
    sycl::event copy_batch(sycl::queue &queue,
                           std::int64_t n,
                           const T *x,
                           std::int64_t incx,
                           std::int64_t stridex,
                           T *y,
                           std::int64_t incy,
                           std::int64_t stridey,
                           std::int64_t batch_size,
                           const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- n
Number of elements in X and Y.
- x
Pointer to input vectors X with size stridex * batch_size.
- incx
Stride of vector X.
- stridex
Stride between different X vectors.
- y
Pointer to input/output vectors Y with size stridey * batch_size.
- incy
Stride of vector Y.
- stridey
Stride between different Y vectors.
- batch_size
Specifies the number of copy operations to perform.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- y
Output vectors, overwritten by batch_size copy operations.
Return Values
Output event to wait on to ensure computation is complete.
Throws
This routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here.
oneapi::mkl::unsupported_device