Inner Product, Flat Parallelism and the CPU, with Views
This is Exercise 02 from the Kokkos tutorial. The initial code file to complete can be found at this link.
1. Problem statement
The code provided in the exercise is a simple matrix-vector multiplication:
\[y = A * x\]
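As a point of comparison, the product stated above can be sketched serially in plain C++ without Kokkos. This is only a minimal reference, assuming A is stored row-major in a flat array of N * M entries (the same indexing as the `A[ j * M + i ]` used in the original exercise code); the function name `matvec` is hypothetical and does not come from the exercise.

```cpp
#include <cstddef>
#include <vector>

// Serial reference for y = A * x (hypothetical helper, not part of the exercise).
// A is a flat, row-major array of N * M entries: element (j, i) is A[j * M + i].
std::vector<double> matvec(const std::vector<double>& A,
                           const std::vector<double>& x,
                           std::size_t N, std::size_t M) {
  std::vector<double> y(N, 0.0);
  for (std::size_t j = 0; j < N; ++j) {
    double sum = 0.0;  // accumulates row j of A times x
    for (std::size_t i = 0; i < M; ++i) {
      sum += A[j * M + i] * x[i];
    }
    y[j] = sum;
  }
  return y;
}
```

With A and x filled with ones, as in the exercise's initialization, every entry of the returned vector equals M, which gives a simple check for a parallel version.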
Run the exercise
./04_kokkos_exercise_views -S 26
The solutions are given just below, so if you want to try it yourself, stop reading this page now!
2. Solutions
Allocating memory
- double * const y = new double[ N ];
+ ViewVectorType y( "y", N );
- double * const x = new double[ M ];
+ ViewVectorType x( "x", M );
- double * const A = new double[ N * M ];
+ ViewMatrixType A( "A", N, M );
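To illustrate what changes when the flat array becomes a two-dimensional view, here is a minimal plain C++ sketch of a container indexed as `A( j, i )` that hides the offset computation. The class `FlatMatrix` is hypothetical and is not the Kokkos implementation: it always uses a row-major layout, whereas a Kokkos view chooses its default layout from the memory space (row-major on the host, column-major for CUDA).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D view (not Kokkos code): owns n * m doubles
// and maps the index pair (j, i) to the row-major offset j * m + i.
class FlatMatrix {
 public:
  FlatMatrix(std::size_t n, std::size_t m) : m_(m), data_(n * m, 0.0) {}

  // Element access with the same call syntax as the view in the solution.
  double& operator()(std::size_t j, std::size_t i) { return data_[j * m_ + i]; }
  double operator()(std::size_t j, std::size_t i) const { return data_[j * m_ + i]; }

  // Raw pointer to the underlying flat storage.
  const double* data() const { return data_.data(); }

 private:
  std::size_t m_;             // number of columns
  std::vector<double> data_;  // flat storage, zero-initialized
};
```

Writing `A( 1, 2 ) = 7.0` on a 2 x 3 `FlatMatrix` stores the value at flat offset 1 * 3 + 2, which is exactly the index the original code computed by hand.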
Initialize data
- for ( int i = 0; i < N; ++i ) {
- y[ i ] = 1;
- }
+ Kokkos::parallel_for( N, KOKKOS_LAMBDA ( int i ) {
+ y( i ) = 1;
+ });
- for ( int i = 0; i < M; ++i ) {
- x[ i ] = 1;
- }
+ Kokkos::parallel_for( M, KOKKOS_LAMBDA ( int i ) {
+ x( i ) = 1;
+ });
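The loops above follow a single pattern: the loop body becomes a function of the index, and the loop bounds become a count. The plain C++ sketch below mimics that pattern serially, as an illustration only. The helper names `for_each_index` and `make_ones` are hypothetical, and the lambda here captures by reference, whereas `KOKKOS_LAMBDA` captures by value (views are copied as lightweight handles).

```cpp
#include <vector>

// Hypothetical serial stand-in for a flat parallel loop (not Kokkos code):
// calls body(i) once for every i in [0, n). A parallel backend may run
// these calls concurrently and in any order, so iterations must be independent.
template <class Body>
void for_each_index(int n, Body body) {
  for (int i = 0; i < n; ++i) {
    body(i);
  }
}

// Fills a vector of length n with ones, one independent write per index.
std::vector<double> make_ones(int n) {
  std::vector<double> v(n, 0.0);
  for_each_index(n, [&v](int i) { v[i] = 1.0; });
  return v;
}
```

Because each iteration writes only to its own element, the result does not depend on the order in which the indices are visited, which is the property that makes these initialization loops safe to parallelize.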
Initialize the matrix
- for ( int j = 0; j < N; ++j ) {
- for ( int i = 0; i < M; ++i ) {
- A[ j * M + i ] = 1;
- }
- }
+ Kokkos::parallel_for( N, KOKKOS_LAMBDA ( int j ) {
+ for ( int i = 0; i < M; ++i ) {
+ A( j, i ) = 1;
+ }
+ });
Deallocate memory
The explicit delete [] calls are removed: Kokkos deallocates a view's memory automatically once the last view referring to it goes out of scope.
- delete [] y;
- delete [] x;
- delete [] A;
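Kokkos views are reference-counted handles, so copying a view shares the allocation rather than duplicating it. The sketch below imitates that behavior with `std::shared_ptr` from the standard library. It is an analogy under stated assumptions, not Kokkos code: the names `Buffer`, `make_buffer`, `Counts` and `count_owners` are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a reference-counted view handle (not Kokkos code).
using Buffer = std::shared_ptr<std::vector<double>>;

Buffer make_buffer(std::size_t n) {
  return std::make_shared<std::vector<double>>(n, 0.0);
}

// Owner counts observed inside an inner scope and after it closes.
struct Counts {
  long inside;
  long after;
};

Counts count_owners(std::size_t n) {
  Buffer outer = make_buffer(n);
  Counts c{0, 0};
  {
    Buffer alias = outer;          // shallow copy: shares the same storage
    (*alias)[0] = 1.0;             // the write is visible through outer too
    c.inside = outer.use_count();  // two handles own the storage here
  }
  c.after = outer.use_count();     // alias is gone, one owner remains
  return c;                        // storage is freed when outer is destroyed
}
```

No explicit `delete` appears anywhere: the storage is released when the last handle is destroyed. In a Kokkos program the same applies to views, with the additional requirement that all views are destroyed before `Kokkos::finalize()` is called.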