Kokkos Memory Spaces
1. Introduction
Kokkos, a performance portability programming model, introduces the concept of Memory Spaces as a fundamental abstraction to address the complexities of heterogeneous computing environments. Memory Spaces [1] in Kokkos represent distinct memory areas with specific characteristics and accessibility patterns. These abstractions enable programmers to express algorithms independently of hardware specifics while maintaining control over data placement and movement. The Kokkos machine model envisions future computing nodes as complex systems with multiple execution units and memory hierarchies.
In heterogeneous nodes, Kokkos' space abstractions prove particularly valuable. Such nodes may include CPU cores, GPU accelerators, and other specialized processing units, each with access to different memory types like host memory, device memory, or high-bandwidth memory. Kokkos abstracts these hardware-specific details, allowing developers to focus on algorithm structure rather than platform intricacies [2][3].
To control data residence, Kokkos provides mechanisms to specify the desired Memory Space when creating Views. A View in Kokkos is a multidimensional array abstraction encapsulating both data and layout. By specifying the appropriate Memory Space template parameter, developers can dictate where data should reside, enabling optimizations based on access patterns and hardware characteristics.
Kokkos annotation macros such as KOKKOS_FUNCTION, KOKKOS_INLINE_FUNCTION, and KOKKOS_LAMBDA play a vital role in achieving portability across architectures. They mark functions and lambdas so that the compiler can build them for both host and device back ends, which matters most in performance-critical code that is called from inside parallel kernels.
Taken together, Kokkos' Memory Spaces, initialization/finalization procedures, and annotation macros form a cohesive framework for developing portable, high-performance code for heterogeneous computing environments: diverse hardware resources can be used efficiently from a single codebase.
2. Instances of Kokkos Memory Spaces
Kokkos lets programmers allocate data in different kinds of memory, including on-package memory, DRAM, and non-volatile memory. As with execution spaces, each memory space has instances through which storage is requested, so developers can choose where each data structure lives. The main concepts are:
- Memory Spaces:
  - Memory spaces, like execution spaces, have specific instances.
  - An instance of a memory space allows the programmer to request data storage allocations.
  - Different types of memory are available, such as on-package memory, slower DRAM, and non-volatile memories.
  - GPUs may have their own local memory space.
- Memory Allocation:
  - The programmer can choose where to allocate each data structure.
  - Kokkos provides abstraction for allocation routines and memory management operations.
- Atomic Accesses:
  - Used to prevent race conditions when multiple threads access the same memory address.
  - Atomic operations ensure that a read, simple computation, and write to memory are completed as a single unit.
- Memory Consistency:
  - Kokkos assumes a very weak memory consistency model.
  - Programmers should not assume any specific ordering of memory operations within a kernel.
  - Kokkos provides a fence operation to ensure completion of memory operations.
3. Illustration of some memory space concepts
- Memory Spaces and Allocation
  - Allocating in default memory space:
    Kokkos::View<double*> defaultView("defaultView", 1000);
  - Allocating in CUDA memory space (available only in CUDA-enabled builds):
    Kokkos::View<double*, Kokkos::CudaSpace> cudaView("cudaView", 1000);
  - Allocating in host memory space:
    Kokkos::View<double*, Kokkos::HostSpace> hostView("hostView", 1000);
- Atomic Accesses
  - Example of atomic addition:
    // Named add_atomically to avoid clashing with CUDA's built-in atomicAdd.
    KOKKOS_INLINE_FUNCTION
    void add_atomically(int* addr, int val) {
      Kokkos::atomic_add(addr, val);
    }
- Memory Consistency
  - Using fence operation:
    Kokkos::fence();
- Memory Space Instances
  - Creating a memory pool for custom allocation:
    Kokkos::CudaSpace cudaSpace;  // memory space instance that backs the pool
    // totalSize and allocSize are byte counts chosen by the caller.
    Kokkos::MemoryPool<Kokkos::CudaSpace> memoryPool(cudaSpace, totalSize);
    void* ptr = memoryPool.allocate(allocSize);
These examples demonstrate how Kokkos allows flexible memory allocation across different memory spaces, provides atomic operations for thread-safe memory access, and offers memory consistency control through fencing. The MemoryPool example shows how specific instances of memory spaces can be used for custom allocation strategies [5].
…