gemv_batch
Computes a group of gemv operations.
Description
The gemv_batch routines are batched versions of gemv, performing multiple gemv operations in a single call. Each gemv operation performs a scalar-matrix-vector product and adds the result to a scalar-vector product.
gemv_batch supports the following precisions.
T
float
double
std::complex<float>
std::complex<double>
gemv_batch (Buffer Version)
Description
The buffer version of gemv_batch supports only the strided API.
The strided API operation is defined as:
for i = 0 … batch_size – 1
A is a matrix at offset i * stridea in a.
X and Y are vectors at offset i * stridex, i * stridey in x and y.
Y := alpha * op(A) * X + beta * Y
end for
where:
op(A) is one of op(A) = A, op(A) = A^T, or op(A) = A^H,
alpha and beta are scalars,
A is a matrix, and X and Y are vectors.
The x and y buffers contain all the input vectors. The stride between consecutive vectors is given by the stridex and stridey parameters. The total number of vectors in the x and y buffers is given by the batch_size parameter.
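The strided loop above can be mirrored on the host as a small reference sketch. This is not the library's implementation: the function name gemv_batch_strided_ref is hypothetical, plain std::vector storage stands in for SYCL buffers, and the sketch assumes column major layout, op(A) = A, and strictly positive incx and incy so that the dimension and addressing conventions stay unambiguous.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference for the strided semantics described above.
// Assumptions (not taken from the specification): column major storage,
// op(A) = A, and strictly positive incx / incy.
template <typename T>
void gemv_batch_strided_ref(std::int64_t m, std::int64_t n, T alpha,
                            const std::vector<T>& a, std::int64_t lda, std::int64_t stridea,
                            const std::vector<T>& x, std::int64_t incx, std::int64_t stridex,
                            T beta,
                            std::vector<T>& y, std::int64_t incy, std::int64_t stridey,
                            std::int64_t batch_size) {
    assert(incx > 0 && incy > 0 && lda >= m);
    for (std::int64_t i = 0; i < batch_size; ++i) {
        const T* A = a.data() + i * stridea;  // matrix i
        const T* X = x.data() + i * stridex;  // input vector i
        T* Y = y.data() + i * stridey;        // input/output vector i
        for (std::int64_t r = 0; r < m; ++r) {
            T acc = T(0);
            for (std::int64_t c = 0; c < n; ++c) {
                acc += A[r + c * lda] * X[c * incx];  // column major element (r, c)
            }
            Y[r * incy] = alpha * acc + beta * Y[r * incy];
        }
    }
}
```

A reference of this kind is mainly useful for checking the results of a batched call on small inputs.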
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
void gemv_batch(sycl::queue &queue,
onemkl::transpose trans,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
sycl::buffer<T,1> &x,
std::int64_t incx,
std::int64_t stridex,
T beta,
sycl::buffer<T,1> &y,
std::int64_t incy,
std::int64_t stridey,
std::int64_t batch_size)
}
namespace oneapi::mkl::blas::row_major {
void gemv_batch(sycl::queue &queue,
onemkl::transpose trans,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
sycl::buffer<T,1> &x,
std::int64_t incx,
std::int64_t stridex,
T beta,
sycl::buffer<T,1> &y,
std::int64_t incy,
std::int64_t stridey,
std::int64_t batch_size)
}
Input Parameters
- queue
The queue where the routine should be executed.
- trans
Specifies op(A), the transposition operation applied to the matrices A. See oneMKL defined datatypes for more details.
- m
Number of rows of op(A). Must be at least zero.
- n
Number of columns of op(A). Must be at least zero.
- alpha
Scaling factor for the matrix-vector products.
- a
Buffer holding the input matrices A with size stridea * batch_size.
- lda
The leading dimension of the matrices A. It must be positive and at least m if column major layout is used or at least n if row major layout is used.
- stridea
Stride between different A matrices. Must be at least zero.
- x
Buffer holding the input vectors X with size stridex * batch_size.
- incx
The stride of the vector X. Must not be zero.
- stridex
Stride between different consecutive X vectors. Must be at least zero.
- beta
Scaling factor for the vector Y.
- y
Buffer holding input/output vectors Y with size stridey * batch_size.
- incy
Stride between two consecutive elements of the Y vectors. Must not be zero.
- stridey
Stride between two consecutive Y vectors. Must be at least (1 + (m - 1)*abs(incy)) if column major layout is used or (1 + (n - 1)*abs(incy)) if row major layout is used.
- batch_size
Specifies the number of matrix-vector operations to perform.
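The constraints listed above can be collected into a pre-call check. The following is a minimal sketch for the column major case only; the struct StridedGemvArgs and both function names are hypothetical and are not part of the library, and the check covers only the conditions stated in the parameter list.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical bundle of the integer arguments of a strided call.
struct StridedGemvArgs {
    std::int64_t m, n, lda, stridea, incx, stridex, incy, stridey;
};

// Smallest stridey allowed by the column major rule: 1 + (m - 1) * abs(incy).
inline std::int64_t min_stridey_column_major(std::int64_t m, std::int64_t incy) {
    return 1 + (m - 1) * std::abs(incy);
}

// Returns true when every constraint stated in the parameter list holds
// (column major layout assumed).
inline bool strided_args_valid_column_major(const StridedGemvArgs& p) {
    return p.m >= 0 && p.n >= 0 &&
           p.lda > 0 && p.lda >= p.m &&
           p.stridea >= 0 &&
           p.incx != 0 && p.stridex >= 0 &&
           p.incy != 0 &&
           p.stridey >= min_stridey_column_major(p.m, p.incy);
}
```

For example, with m = 4 and incy = -2 the smallest permitted stridey is 1 + 3 * 2 = 7.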
Output Parameters
- y
Output overwritten by batch_size matrix-vector product operations of the form alpha * op(A) * X + beta * Y.
Throws
This routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here.
gemv_batch (USM Version)
Description
The USM version of gemv_batch supports the group API and strided API.
The group API operation is defined as:
idx = 0
for i = 0 … group_count – 1
for j = 0 … group_size – 1
A is an m x n matrix in a[idx]
X and Y are vectors in x[idx] and y[idx]
Y := alpha[i] * op(A) * X + beta[i] * Y
idx = idx + 1
end for
end for
The strided API operation is defined as:
for i = 0 … batch_size – 1
A is a matrix at offset i * stridea in a.
X and Y are vectors at offset i * stridex, i * stridey in x and y.
Y := alpha * op(A) * X + beta * Y
end for
where:
op(A) is one of op(A) = A, op(A) = A^T, or op(A) = A^H,
alpha and beta are scalars,
A is a matrix, and X and Y are vectors.
For group API, the x and y arrays contain the pointers to all the input vectors. The a array contains the pointers to all input matrices. The total number of vectors in x and y and of matrices in a is given by:
total_batch_count = group_size[0] + group_size[1] + … + group_size[group_count – 1]
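The total number of problems across all groups, referred to as total_batch_count in the parameter descriptions, is the sum of the group sizes, as the idx counter in the group loop shows. A minimal host-side sketch of that bookkeeping follows; the helper names total_batch_count and group_offsets are hypothetical and are not library functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: number of pointers that a, x and y must each hold.
inline std::int64_t total_batch_count(const std::vector<std::int64_t>& group_size) {
    return std::accumulate(group_size.begin(), group_size.end(), std::int64_t{0});
}

// Hypothetical helper: value of idx at which each group starts in a, x and y.
inline std::vector<std::int64_t> group_offsets(const std::vector<std::int64_t>& group_size) {
    std::vector<std::int64_t> offsets(group_size.size());
    std::int64_t idx = 0;
    for (std::size_t i = 0; i < group_size.size(); ++i) {
        offsets[i] = idx;      // first problem of group i
        idx += group_size[i];  // skip over all problems of group i
    }
    return offsets;
}
```

With group sizes {2, 0, 3}, the pointer arrays hold 5 entries and the groups start at indices 0, 2, and 2.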
For strided API, the x and y arrays contain all the input vectors. The a array contains all the input matrices. The total number of vectors in x and y and of matrices in a is given by the batch_size parameter.
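The three forms of op(A) differ only in which stored element is read and whether it is conjugated. The sketch below illustrates this for one column major matrix of std::complex<double>. The enumeration Op and the function op_element are hypothetical stand-ins introduced for illustration, not the library's transpose type or any library routine. For the real precisions, A^H and A^T coincide.

```cpp
#include <complex>
#include <cstdint>

// Hypothetical stand-in for the three transposition choices.
enum class Op { NoTrans, Trans, ConjTrans };

// Element (r, c) of op(A) for a column major matrix A with leading dimension lda.
inline std::complex<double> op_element(const std::complex<double>* A, std::int64_t lda,
                                       Op op, std::int64_t r, std::int64_t c) {
    switch (op) {
        case Op::NoTrans:   return A[r + c * lda];             // A(r, c)
        case Op::Trans:     return A[c + r * lda];             // A(c, r)
        case Op::ConjTrans: return std::conj(A[c + r * lda]);  // conj(A(c, r))
    }
    return std::complex<double>{};  // not reached for valid Op values
}
```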
Group API
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event gemv_batch(sycl::queue &queue,
const onemkl::transpose *trans,
const std::int64_t *m,
const std::int64_t *n,
const T *alpha,
const T **a,
const std::int64_t *lda,
const T **x,
const std::int64_t *incx,
const T *beta,
T **y,
const std::int64_t *incy,
std::int64_t group_count,
const std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event gemv_batch(sycl::queue &queue,
const onemkl::transpose *trans,
const std::int64_t *m,
const std::int64_t *n,
const T *alpha,
const T **a,
const std::int64_t *lda,
const T **x,
const std::int64_t *incx,
const T *beta,
T **y,
const std::int64_t *incy,
std::int64_t group_count,
const std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- trans
Array of group_count onemkl::transpose values. trans[i] specifies the form of op(A) used in the matrix-vector product in group i. See oneMKL defined datatypes for more details.
- m
Array of group_count integers. m[i] specifies the number of rows of op(A) for every matrix in group i. All entries must be at least zero.
- n
Array of group_count integers. n[i] specifies the number of columns of op(A) for every matrix in group i. All entries must be at least zero.
- alpha
Array of group_count scalar elements. alpha[i] specifies the scaling factor for every matrix-vector product in group i.
- a
Array of pointers to input matrices A with size total_batch_count. See Matrix Storage for more details.
- lda
Array of group_count integers. lda[i] specifies the leading dimension of A for every matrix in group i. All entries must be positive and at least m if column major layout is used or at least n if row major layout is used.
- x
Array of pointers to input vectors X with size total_batch_count. See Matrix Storage for more details.
- incx
Array of group_count integers. incx[i] specifies the stride of X for every vector in group i. Must not be zero.
- beta
Array of group_count scalar elements. beta[i] specifies the scaling factor for vector Y for every vector in group i.
- y
Array of pointers to input/output vectors Y with size total_batch_count. See Matrix Storage for more details.
- incy
Array of group_count integers. incy[i] specifies the stride of Y for every vector in group i. Must not be zero.
- group_count
Specifies the number of groups. Must be at least 0.
- group_size
Array of group_count integers. group_size[i] specifies the number of matrix-vector products in group i. All entries must be at least 0.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
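The way the per-group arrays and the per-problem pointer arrays fit together can be sketched with a host-side reference loop. This is an illustration only, not the library's implementation: gemv_batch_group_ref is a hypothetical name, ordinary host pointers stand in for USM allocations, there is no queue or event handling, and the sketch assumes column major layout, op(A) = A for every group, and strictly positive incx[i] and incy[i].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side reference for the group semantics.
// Assumptions (not taken from the specification): column major storage,
// op(A) = A in every group, strictly positive increments.
template <typename T>
void gemv_batch_group_ref(const std::int64_t* m, const std::int64_t* n, const T* alpha,
                          const T** a, const std::int64_t* lda,
                          const T** x, const std::int64_t* incx,
                          const T* beta, T** y, const std::int64_t* incy,
                          std::int64_t group_count, const std::int64_t* group_size) {
    std::int64_t idx = 0;  // running index into a, x and y
    for (std::int64_t i = 0; i < group_count; ++i) {
        assert(incx[i] > 0 && incy[i] > 0 && lda[i] >= m[i]);
        for (std::int64_t j = 0; j < group_size[i]; ++j) {
            const T* A = a[idx];
            const T* X = x[idx];
            T* Y = y[idx];
            for (std::int64_t r = 0; r < m[i]; ++r) {
                T acc = T(0);
                for (std::int64_t c = 0; c < n[i]; ++c) {
                    acc += A[r + c * lda[i]] * X[c * incx[i]];
                }
                // Scalars and sizes are shared by every problem in group i.
                Y[r * incy[i]] = alpha[i] * acc + beta[i] * Y[r * incy[i]];
            }
            ++idx;
        }
    }
}
```

Parameters such as m, alpha, and incx are indexed by group, while a, x, and y are indexed by the running problem counter idx.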
Output Parameters
- y
Overwritten by the vectors calculated by (alpha[i] * op(A) * X + beta[i] * Y) for group i.
Return Values
Output event to wait on to ensure computation is complete.
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event gemv_batch(sycl::queue &queue,
onemkl::transpose trans,
std::int64_t m,
std::int64_t n,
value_or_pointer<T> alpha,
const T *a,
std::int64_t lda,
std::int64_t stridea,
const T *x,
std::int64_t incx,
std::int64_t stridex,
value_or_pointer<T> beta,
T *y,
std::int64_t incy,
std::int64_t stridey,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event gemv_batch(sycl::queue &queue,
onemkl::transpose trans,
std::int64_t m,
std::int64_t n,
value_or_pointer<T> alpha,
const T *a,
std::int64_t lda,
std::int64_t stridea,
const T *x,
std::int64_t incx,
std::int64_t stridex,
value_or_pointer<T> beta,
T *y,
std::int64_t incy,
std::int64_t stridey,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- trans
Specifies op(A), the transposition operation applied to the matrices A. See oneMKL defined datatypes for more details.
- m
Number of rows of op(A). Must be at least zero.
- n
Number of columns of op(A). Must be at least zero.
- alpha
Scaling factor for the matrix-vector products. See Scalar Arguments in BLAS for more details.
- a
Pointer to the input matrices A with size stridea * batch_size.
- lda
The leading dimension of the matrices A. It must be positive and at least m if column major layout is used or at least n if row major layout is used.
- stridea
Stride between different A matrices. Must be at least zero.
- x
Pointer to the input vectors X with size stridex * batch_size.
- incx
Stride of the vector X. Must not be zero.
- stridex
Stride between different consecutive X vectors. Must be at least zero.
- beta
Scaling factor for the vector Y. See Scalar Arguments in BLAS for more details.
- y
Pointer to the input/output vectors Y with size stridey * batch_size.
- incy
Stride between two consecutive elements of the Y vectors. Must not be zero.
- stridey
Stride between two consecutive Y vectors. Must be at least (1 + (m - 1)*abs(incy)) if column major layout is used or (1 + (n - 1)*abs(incy)) if row major layout is used.
- batch_size
Specifies the number of matrix-vector operations to perform.
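The lda and stridea descriptions above determine where a given element of a given matrix sits inside a. The following sketch assumes the conventional dense layouts, in which column major storage places element (r, c) at r + c * lda and row major storage places it at r * lda + c. The helper names are hypothetical and are not part of the library.

```cpp
#include <cstdint>

// Linear index of element (r, c) within one column major matrix.
inline std::int64_t index_column_major(std::int64_t r, std::int64_t c, std::int64_t lda) {
    return r + c * lda;
}

// Linear index of element (r, c) within one row major matrix.
inline std::int64_t index_row_major(std::int64_t r, std::int64_t c, std::int64_t lda) {
    return r * lda + c;
}

// Position in a of element (r, c) of matrix i for the column major strided layout.
inline std::int64_t batch_index_column_major(std::int64_t i, std::int64_t stridea,
                                             std::int64_t r, std::int64_t c,
                                             std::int64_t lda) {
    return i * stridea + index_column_major(r, c, lda);
}
```

This also shows why lda is bounded below by m in column major layout and by n in row major layout: a smaller value would make consecutive columns (or rows) overlap.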
Output Parameters
- y
Output overwritten by batch_size matrix-vector product operations of the form alpha * op(A) * X + beta * Y.
Return Values
Output event to wait on to ensure computation is complete.
Throws
This routine shall throw the following exceptions if the associated condition is detected. An implementation may throw additional implementation-specific exception(s) in case of error conditions not covered here.
oneapi::mkl::unsupported_device