The performance gap between CPU and memory widens continuously. Choosing the best memory layout for each hardware architecture is increasingly important as more and more programs become memory bound. For portable codes that run across heterogeneous hardware architectures, the choice of memory layout for data structures is ideally decoupled from the rest of the program. This can be accomplished via a zero-runtime-overhead abstraction layer, underneath which memory layouts can be freely exchanged. We present the Low-Level Abstraction of Memory Access (LLAMA), a C++ library that provides such a data structure abstraction layer, with example implementations for multidimensional arrays of nested, structured data. LLAMA provides fully C++-compliant methods for defining and switching custom memory layouts for user-defined data types. The library is extensible with third-party allocators. Using two close-to-life examples, we show that the LLAMA-generated array-of-structs and struct-of-arrays layouts produce code identical to, and with the same performance characteristics as, manually written data structures. Integrations into the SPEC CPU® lbm benchmark and the particle-in-cell simulation PIConGPU demonstrate LLAMA's abilities in real-world applications.
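To make the idea of exchanging memory layouts underneath an abstraction layer concrete, the following is a minimal sketch in plain C++. It does not use LLAMA's API: the names `Field`, `AoSMapping`, `SoAMapping`, `View`, `makeParticles`, and `totalMass` are hypothetical and chosen only for illustration. The sketch assumes a flat record of three `float` fields and shows how a compile-time mapping can translate a (record, field) pair into a storage offset, so that the same algorithm runs unchanged on an array-of-structs or a struct-of-arrays layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record description with three float fields (not LLAMA's API).
enum Field : std::size_t { X = 0, Y = 1, Mass = 2, FieldCount = 3 };

// Array of structs: all fields of one record are adjacent in memory.
struct AoSMapping {
    static constexpr std::size_t offset(std::size_t record, std::size_t field,
                                        std::size_t /*recordCount*/) {
        return record * FieldCount + field;
    }
};

// Struct of arrays: all values of one field are adjacent in memory.
struct SoAMapping {
    static constexpr std::size_t offset(std::size_t record, std::size_t field,
                                        std::size_t recordCount) {
        return field * recordCount + record;
    }
};

// A view owns flat storage and delegates address computation to the mapping.
// The mapping is a template parameter, so the offset arithmetic is resolved
// at compile time and can be inlined by the compiler.
template <typename Mapping>
class View {
public:
    explicit View(std::size_t recordCount)
        : recordCount_(recordCount), storage_(recordCount * FieldCount) {}

    float& operator()(std::size_t record, Field field) {
        assert(record < recordCount_ && field < FieldCount);
        return storage_[Mapping::offset(record, field, recordCount_)];
    }

    std::size_t size() const { return recordCount_; }
    const std::vector<float>& storage() const { return storage_; }

private:
    std::size_t recordCount_;
    std::vector<float> storage_;
};

// Fills a view with sample data: record i gets x = i, y = 2i, mass = i + 1.
template <typename Mapping>
View<Mapping> makeParticles(std::size_t recordCount) {
    View<Mapping> view(recordCount);
    for (std::size_t i = 0; i < recordCount; ++i) {
        view(i, X) = static_cast<float>(i);
        view(i, Y) = static_cast<float>(2 * i);
        view(i, Mass) = static_cast<float>(i + 1);
    }
    return view;
}

// Layout-agnostic algorithm: it never mentions how the data is stored.
template <typename Mapping>
float totalMass(View<Mapping>& view) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < view.size(); ++i) {
        sum += view(i, Mass);
    }
    return sum;
}
```

Switching from `View<AoSMapping>` to `View<SoAMapping>` changes only the type at the point of construction; `makeParticles` and `totalMass` stay untouched. This is the decoupling the abstract describes, reduced to a toy case without nested records, multidimensional indexing, or custom allocators.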