syrk_batch
Computes a group of syrk operations.
Description
The syrk_batch routines are batched versions of syrk, performing multiple syrk operations in a single call. Each syrk operation performs a rank-k update of a symmetric matrix C using a general matrix A.
syrk_batch supports the following precisions.
T
float
double
std::complex<float>
std::complex<double>
syrk_batch (Buffer Version)
Description
The buffer version of syrk_batch supports only the strided API.
The strided API operation is defined as:
for i = 0 … batch_size – 1
A and C are matrices at offset i * stridea, i * stridec in a and c.
C := alpha * op(A) * op(A)^T + beta * C
end for
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A and C are matrices,
op(A) is n x k and C is n x n.
The a and c buffers contain all the input matrices. The stride between matrices is given by the stride parameters. The total number of matrices in the a and c buffers is given by the batch_size parameter.
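The strided loop above can be sketched on the host in plain C++. This is only an illustrative reference under stated assumptions, not the library implementation: it covers column-major storage, skips conjugation (matching the note on trans), and touches only the triangle of C selected by the uplo argument. The names `Uplo`, `Trans`, and `syrk_batch_strided_ref` are hypothetical and do not exist in oneMKL.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the oneMKL uplo / transpose enums.
enum class Uplo { upper, lower };
enum class Trans { nontrans, trans };

// Host reference for the strided operation, column-major storage:
//   for each i: C_i := alpha * op(A_i) * op(A_i)^T + beta * C_i
// Only the triangle of C_i selected by `uplo` is read and written.
template <typename T>
void syrk_batch_strided_ref(Uplo uplo, Trans trans,
                            std::int64_t n, std::int64_t k, T alpha,
                            const std::vector<T>& a, std::int64_t lda, std::int64_t stridea,
                            T beta,
                            std::vector<T>& c, std::int64_t ldc, std::int64_t stridec,
                            std::int64_t batch_size) {
    assert(n >= 0 && k >= 0 && lda >= 1 && ldc >= 1);
    for (std::int64_t i = 0; i < batch_size; ++i) {
        const T* A = a.data() + i * stridea;  // i-th input matrix
        T* C = c.data() + i * stridec;        // i-th input/output matrix
        // Element (r, p) of op(A); A itself is n x k (nontrans) or k x n (trans).
        auto opA = [&](std::int64_t r, std::int64_t p) {
            return trans == Trans::nontrans ? A[r + p * lda] : A[p + r * lda];
        };
        for (std::int64_t col = 0; col < n; ++col) {
            const std::int64_t row_begin = (uplo == Uplo::upper) ? 0 : col;
            const std::int64_t row_end = (uplo == Uplo::upper) ? col + 1 : n;
            for (std::int64_t row = row_begin; row < row_end; ++row) {
                T sum = T(0);
                for (std::int64_t p = 0; p < k; ++p) sum += opA(row, p) * opA(col, p);
                C[row + col * ldc] = alpha * sum + beta * C[row + col * ldc];
            }
        }
    }
}
```

With two 2 x 1 matrices packed back to back in a (stridea = 2) and two 2 x 2 matrices in c (stridec = 4), a single call updates the lower triangle of both C matrices and leaves the strictly upper entries untouched.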
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
    void syrk_batch(sycl::queue &queue,
                    onemkl::uplo upper_lower,
                    onemkl::transpose trans,
                    std::int64_t n,
                    std::int64_t k,
                    T alpha,
                    sycl::buffer<T,1> &a,
                    std::int64_t lda,
                    std::int64_t stridea,
                    T beta,
                    sycl::buffer<T,1> &c,
                    std::int64_t ldc,
                    std::int64_t stridec,
                    std::int64_t batch_size);
}
namespace oneapi::mkl::blas::row_major {
    void syrk_batch(sycl::queue &queue,
                    onemkl::uplo upper_lower,
                    onemkl::transpose trans,
                    std::int64_t n,
                    std::int64_t k,
                    T alpha,
                    sycl::buffer<T,1> &a,
                    std::int64_t lda,
                    std::int64_t stridea,
                    T beta,
                    sycl::buffer<T,1> &c,
                    std::int64_t ldc,
                    std::int64_t stridec,
                    std::int64_t batch_size);
}
Input Parameters
- queue
The queue where the routine should be executed.
- upper_lower
Specifies whether data in C is stored in its upper or lower triangle. For more details, see oneMKL defined datatypes.
- trans
Specifies op(A), the transposition operation applied to the matrix A. Conjugation is never performed, even if trans = transpose::conjtrans. See oneMKL defined datatypes for more details.
- n
Number of rows and columns of C. Must be at least zero.
- k
Number of columns of op(A). Must be at least zero.
- alpha
Scaling factor for the rank-k update.
- a
Buffer holding the input matrices A with size stridea * batch_size.
- lda
The leading dimension of the matrices A. It must be positive.
             | A not transposed        | A transposed
Column major | lda must be at least n. | lda must be at least k.
Row major    | lda must be at least k. | lda must be at least n.
- stridea
Stride between different A matrices.
- beta
Scaling factor for the matrices C.
- c
Buffer holding input/output matrices C with size stridec * batch_size.
- ldc
The leading dimension of the matrices C. It must be positive and at least n.
- stridec
Stride between different C matrices. Must be at least ldc * n.
- batch_size
Specifies the number of rank-k update operations to perform.
Output Parameters
- c
Output buffer, overwritten by batch_size rank-k update operations of the form alpha * op(A) * op(A)^T + beta * C.
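The size constraints listed in the input parameters can be checked before a call. The sketch below encodes only the conditions stated on this page (n and k at least zero, the lda minimum per layout and transposition, ldc positive and at least n, stridec at least ldc * n). The names `Layout`, `Trans`, `min_lda`, `strided_args_valid`, and `c_buffer_elements` are hypothetical helpers, not oneMKL functions, and an implementation may enforce further conditions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the library selects the layout through its namespace.
enum class Layout { column_major, row_major };
enum class Trans { nontrans, trans };

// Smallest lda permitted by the lda requirements listed for the parameter.
constexpr std::int64_t min_lda(Layout layout, Trans trans, std::int64_t n, std::int64_t k) {
    // Column major + not transposed, or row major + transposed: at least n.
    // The two remaining combinations: at least k.
    const bool not_transposed = (trans == Trans::nontrans);
    const std::int64_t extent = ((layout == Layout::column_major) == not_transposed) ? n : k;
    return extent < 1 ? 1 : extent;  // lda must also be positive
}

// True when the stated constraints on the strided arguments hold.
inline bool strided_args_valid(Layout layout, Trans trans,
                               std::int64_t n, std::int64_t k,
                               std::int64_t lda, std::int64_t ldc, std::int64_t stridec) {
    return n >= 0 && k >= 0
        && lda >= min_lda(layout, trans, n, k)
        && ldc >= 1 && ldc >= n
        && stridec >= ldc * n;
}

// Number of elements the c buffer is documented to hold: stridec * batch_size.
inline std::int64_t c_buffer_elements(std::int64_t stridec, std::int64_t batch_size) {
    assert(stridec >= 0 && batch_size >= 0);
    return stridec * batch_size;
}
```

For example, with n = 4 and k = 2 in column-major layout, a non-transposed A needs lda of at least 4, while a transposed A needs only 2.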
Throws
This routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here.
syrk_batch (USM Version)
Description
The USM version of syrk_batch supports the group API and strided API.
The group API operation is defined as:
idx = 0
for i = 0 … group_count – 1
for j = 0 … group_size – 1
A and C are matrices in a[idx] and c[idx]
C := alpha[i] * op(A) * op(A)^T + beta[i] * C
idx = idx + 1
end for
end for
The strided API operation is defined as:
for i = 0 … batch_size – 1
A and C are matrices at offset i * stridea, i * stridec in a and c.
C := alpha * op(A) * op(A)^T + beta * C
end for
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A and C are matrices,
op(A) is n x k and C is n x n.
For the group API, the a and c arrays contain the pointers for all the input matrices. The total number of matrices in a and c is given by:
total_batch_count = group_size[0] + group_size[1] + … + group_size[group_count – 1]
For the strided API, the a and c arrays contain all the input matrices. The total number of matrices in a and c is given by the batch_size parameter.
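In the group API loop, a single running index idx walks through the a and c pointer arrays while the per-group parameters are indexed by i. The sketch below makes that mapping explicit. It assumes, as the loop implies, that total_batch_count is the sum of the group_size entries. The function names are hypothetical helpers, not part of oneMKL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Total number of matrices across all groups (assumed: sum of group_size[i]).
std::int64_t total_batch_count(const std::vector<std::int64_t>& group_size) {
    return std::accumulate(group_size.begin(), group_size.end(), std::int64_t{0});
}

// group_of[idx] is the group i whose n[i], k[i], alpha[i], ... apply to
// the matrices a[idx] and c[idx].
std::vector<std::int64_t> group_of_each_matrix(const std::vector<std::int64_t>& group_size) {
    std::vector<std::int64_t> group_of;
    group_of.reserve(static_cast<std::size_t>(total_batch_count(group_size)));
    for (std::size_t i = 0; i < group_size.size(); ++i) {
        assert(group_size[i] >= 0);
        // Append group_size[i] copies of the group index i.
        group_of.insert(group_of.end(),
                        static_cast<std::size_t>(group_size[i]),
                        static_cast<std::int64_t>(i));
    }
    return group_of;
}
```

A group with size zero contributes no entries, so the groups that follow it start at the same idx where the empty group would have started.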
Group API
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event syrk_batch(sycl::queue &queue,
                           const uplo *upper_lower,
                           const transpose *trans,
                           const std::int64_t *n,
                           const std::int64_t *k,
                           const T *alpha,
                           const T **a,
                           const std::int64_t *lda,
                           const T *beta,
                           T **c,
                           const std::int64_t *ldc,
                           std::int64_t group_count,
                           const std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
    sycl::event syrk_batch(sycl::queue &queue,
                           const uplo *upper_lower,
                           const transpose *trans,
                           const std::int64_t *n,
                           const std::int64_t *k,
                           const T *alpha,
                           const T **a,
                           const std::int64_t *lda,
                           const T *beta,
                           T **c,
                           const std::int64_t *ldc,
                           std::int64_t group_count,
                           const std::int64_t *group_size,
                           const std::vector<sycl::event> &dependencies = {});
}
Input Parameters
- queue
The queue where the routine should be executed.
- upper_lower
Array of group_count onemkl::uplo values. upper_lower[i] specifies whether data in C for every matrix in group i is in the upper or lower triangle.
- trans
Array of group_count onemkl::transpose values. trans[i] specifies the form of op(A) used in the rank-k update in group i. See oneMKL defined datatypes for more details.
- n
Array of group_count integers. n[i] specifies the number of rows and columns of C for every matrix in group i. All entries must be at least zero.
- k
Array of group_count integers. k[i] specifies the number of columns of op(A) for every matrix in group i. All entries must be at least zero.
- alpha
Array of group_count scalar elements. alpha[i] specifies the scaling factor for every rank-k update in group i.
- a
Array of pointers to input matrices A with size total_batch_count.
See Matrix Storage for more details.
- lda
Array of group_count integers. lda[i] specifies the leading dimension of A for every matrix in group i. All entries must be positive.
             | A not transposed              | A transposed
Column major | lda[i] must be at least n[i]. | lda[i] must be at least k[i].
Row major    | lda[i] must be at least k[i]. | lda[i] must be at least n[i].
- beta
Array of group_count scalar elements. beta[i] specifies the scaling factor for matrix C for every matrix in group i.
- c
Array of pointers to input/output matrices C with size total_batch_count.
See Matrix Storage for more details.
- ldc
Array of group_count integers. ldc[i] specifies the leading dimension of C for every matrix in group i. All entries must be positive and ldc[i] must be at least n[i].
- group_count
Specifies the number of groups. Must be at least 0.
- group_size
Array of group_count integers. group_size[i] specifies the number of rank-k update operations in group i. All entries must be at least 0.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- c
Overwritten by the n[i]-by-n[i] matrix calculated by (alpha[i] * op(A) * op(A)^T + beta[i] * C) for group i.
Return Values
Output event to wait on to ensure computation is complete.
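The group API semantics can be sketched on the host with plain pointers. This is an illustrative reference only, not the library implementation: to stay short it assumes column-major storage, op(A) = A, and the lower triangle for every group, so the upper_lower and trans arrays are omitted. `syrk_lower_nontrans` and `syrk_batch_group_ref` are hypothetical names, and the sketch runs synchronously instead of returning an event.

```cpp
#include <cassert>
#include <cstdint>

// One rank-k update, column-major, op(A) = A, lower triangle of C only.
template <typename T>
void syrk_lower_nontrans(std::int64_t n, std::int64_t k, T alpha,
                         const T* A, std::int64_t lda,
                         T beta, T* C, std::int64_t ldc) {
    for (std::int64_t col = 0; col < n; ++col) {
        for (std::int64_t row = col; row < n; ++row) {
            T sum = T(0);
            for (std::int64_t p = 0; p < k; ++p) sum += A[row + p * lda] * A[col + p * lda];
            C[row + col * ldc] = alpha * sum + beta * C[row + col * ldc];
        }
    }
}

// Group loop: parameters are per group (index i), matrices are per idx.
template <typename T>
void syrk_batch_group_ref(const std::int64_t* n, const std::int64_t* k,
                          const T* alpha, const T** a, const std::int64_t* lda,
                          const T* beta, T** c, const std::int64_t* ldc,
                          std::int64_t group_count, const std::int64_t* group_size) {
    assert(group_count >= 0);
    std::int64_t idx = 0;  // running index into the a and c pointer arrays
    for (std::int64_t i = 0; i < group_count; ++i) {
        assert(group_size[i] >= 0);
        for (std::int64_t j = 0; j < group_size[i]; ++j) {
            syrk_lower_nontrans(n[i], k[i], alpha[i], a[idx], lda[i],
                                beta[i], c[idx], ldc[i]);
            ++idx;
        }
    }
}
```

Every matrix in a group shares that group's n, k, lda, ldc, alpha, and beta, which is why those arrays have group_count entries while a and c have one pointer per matrix.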
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event syrk_batch(sycl::queue &queue,
                           uplo upper_lower,
                           transpose trans,
                           std::int64_t n,
                           std::int64_t k,
                           value_or_pointer<T> alpha,
                           const T *a,
                           std::int64_t lda,
                           std::int64_t stride_a,
                           value_or_pointer<T> beta,
                           T *c,
                           std::int64_t ldc,
                           std::int64_t stride_c,
                           std::int64_t batch_size,
                           const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
    sycl::event syrk_batch(sycl::queue &queue,
                           uplo upper_lower,
                           transpose trans,
                           std::int64_t n,
                           std::int64_t k,
                           value_or_pointer<T> alpha,
                           const T *a,
                           std::int64_t lda,
                           std::int64_t stride_a,
                           value_or_pointer<T> beta,
                           T *c,
                           std::int64_t ldc,
                           std::int64_t stride_c,
                           std::int64_t batch_size,
                           const std::vector<sycl::event> &dependencies = {});
}
Input Parameters
- queue
The queue where the routine should be executed.
- upper_lower
Specifies whether data in C is stored in its upper or lower triangle. For more details, see oneMKL defined datatypes.
- trans
Specifies op(A), the transposition operation applied to the matrices A. Conjugation is never performed, even if trans = transpose::conjtrans. See oneMKL defined datatypes for more details.
- n
Number of rows and columns of C. Must be at least zero.
- k
Number of columns of op(A). Must be at least zero.
- alpha
Scaling factor for the rank-k updates. See Scalar Arguments in BLAS for more details.
- a
Pointer to input matrices A with size stridea * batch_size.
- lda
The leading dimension of the matrices A. It must be positive.
             | A not transposed        | A transposed
Column major | lda must be at least n. | lda must be at least k.
Row major    | lda must be at least k. | lda must be at least n.
- stridea
Stride between different A matrices.
- beta
Scaling factor for the matrices C. See Scalar Arguments in BLAS for more details.
- c
Pointer to input/output matrices C with size stridec * batch_size.
- ldc
The leading dimension of the matrices C. It must be positive and at least n.
- stridec
Stride between different C matrices.
- batch_size
Specifies the number of rank-k update operations to perform.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- c
Output matrices, overwritten by batch_size rank-k update operations of the form alpha * op(A) * op(A)^T + beta * C.
Return Values
Output event to wait on to ensure computation is complete.
Throws
This routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here.
oneapi::mkl::unsupported_device