NUMA.jl
NUMA.jl provides tools for querying and controlling NUMA (non-uniform memory access) policies in Julia applications. It is based on libnuma.
Install
NUMA.jl is registered in Julia's General package registry. You can add it to your Julia environment by executing
using Pkg
Pkg.add("NUMA")
Why care?
Because ignoring NUMA can hurt performance, particularly for computations that are limited by memory bandwidth. As a simple example, consider a DAXPY kernel, which operates on two Julia arrays. We benchmark the memory bandwidth (i.e. how fast we can read and write data) of this kernel in two cases: the arrays are allocated either in the local NUMA node or in a distant NUMA node. By "local" we mean local to the CPU core hosting the Julia thread that performs the computation. The benchmark results, on a system with 2x AMD Milan 7763 CPUs and 8 NUMA domains, are:
Array in local NUMA node: 37.19 GB/s
Array in distant NUMA node: 23.24 GB/s
Note that the memory bandwidth is 60% higher for the local case. This NUMA effect is even more pronounced when using multithreading (example code).
Arrays in thread-local NUMA nodes: 286.08 GB/s
Arrays all in first NUMA node (naive): 38.7 GB/s
Here, we see a ~7.4x improvement in memory bandwidth, i.e. roughly proportional to the number of NUMA domains (8).
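For readers unfamiliar with the kernel, here is a minimal sketch of a DAXPY kernel and of how a memory bandwidth figure can be derived from its runtime. It is not the benchmark code behind the numbers above. The function names `daxpy!` and `daxpy_bandwidth_gbs`, the array size, and the assumption of three memory transfers per element (read `x`, read `y`, write `y`) are illustrative choices. The sketch uses plain Julia only and does not control where the arrays are placed.

```julia
# DAXPY: y = a * x + y, element-wise, updating y in place.
function daxpy!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    @inbounds @simd for i in eachindex(x, y)
        y[i] = a * x[i] + y[i]
    end
    return y
end

# Estimate the memory bandwidth in GB/s (1 GB = 1e9 bytes).
# Assumption: each element causes 3 transfers of 8 bytes (read x, read y, write y).
function daxpy_bandwidth_gbs(n::Integer; nrepeat::Integer = 5)
    x = rand(n)
    y = rand(n)
    a = 2.0
    daxpy!(y, a, x)                      # warm-up: triggers compilation
    tmin = Inf
    for _ in 1:nrepeat
        t = @elapsed daxpy!(y, a, x)     # wall-clock seconds for one pass
        tmin = min(tmin, t)              # keep the fastest run
    end
    bytes = 3 * sizeof(Float64) * n
    return bytes / tmin / 1e9
end

# The arrays should be much larger than the CPU caches (here 2 x 128 MiB).
println("DAXPY bandwidth: ", round(daxpy_bandwidth_gbs(2^24); digits = 2), " GB/s")
```

The reported value depends on which NUMA node holds `x` and `y` relative to the core running the Julia thread, which is exactly the placement that this sketch leaves to the operating system.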
Useful background information
- NUMA (Non-Uniform Memory Access): An Overview (by Christoph Lameter)
- What is NUMA? (from the Linux kernel documentation)
- NUMA (from the HPC wiki)
Acknowledgements
- ArrayAllocators.jl, and specifically NumaAllocators.jl, served as an inspiration and provide similar functionality.