Example: Allocate on NUMA nodes
Specific node(s)
In the following example, we explicitly allocate an array on a specific NUMA node.
julia> using NUMA, Random
julia> x = Vector{Float64}(numanode(3), 10); rand!(x);
julia> which_numa_node(x)
3
Below we do the same for several arrays, one per NUMA node, and then fill them from threads pinned to random cores:
# Run with as many threads as there are NUMA domains, e.g.:
# julia --project -t 8 numa_node_alloc.jl [power]
using NUMA
using ThreadPinning
using Base.Threads
using Random
pinthreads(:random) # random pinning, to show that where the threads run does not affect the allocation
N = length(ARGS) != 0 ? 10^(parse(Int, first(ARGS))) : 10^7 # optional cmdline arg [power]: array length is 10^power (default 10^7)
xs = Vector{Vector{Float64}}(undef, nthreads());
@assert nthreads() == nnumanodes() # one array (and one thread) per NUMA domain
targets = 1:nnumanodes()
for i in 1:nthreads()
    xs[i] = Vector{Float64}(numanode(targets[i]), N)
end
first_touch_from = current_numa_nodes()
@threads :static for i in 1:nthreads()
    rand!(xs[i])
end
println("Size of each array: ", Base.format_bytes(sizeof(Float64) * N))
println("Requested memory for arrays from nodes:\t", collect(targets))
println("Filled the arrays from (random) nodes:\t", first_touch_from)
println("Queried locations of memory pages:\t", which_numa_node.(xs))
$ julia --project -t 8 numa_node_alloc.jl
Size of each array: 76.294 MiB
Requested memory for arrays from nodes: [1, 2, 3, 4, 5, 6, 7, 8]
Filled the arrays from (random) nodes: [7, 2, 8, 6, 3, 6, 1, 1]
Queried locations of memory pages: [1, 2, 3, 4, 5, 6, 7, 8]
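The reported array size follows directly from the array length, and this can be checked without any NUMA hardware. The sketch below mirrors the script's optional command-line argument; the helper name `array_length` is hypothetical and introduced here only for illustration.

```julia
# Hypothetical helper mirroring the script's optional command-line argument:
# an argument p selects N = 10^p, otherwise N defaults to 10^7.
array_length(args) = isempty(args) ? 10^7 : 10^parse(Int, first(args))

# No argument given, as in the run shown above.
N = array_length(String[])

# Each Float64 occupies 8 bytes; Base.format_bytes reports binary units
# (1 MiB = 2^20 bytes).
size_str = Base.format_bytes(sizeof(Float64) * N)
println(size_str)  # prints "76.294 MiB"
```

With the default N = 10^7, each array occupies 8 * 10^7 bytes, which is the 76.294 MiB shown in the output above.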
Local node(s)
We can also allocate on the local NUMA node, that is, the node closest to the CPU thread (core) that the calling Julia thread is currently running on.
julia> using NUMA, ThreadPinning, Random
julia> numa_node_of_cpu(32)
2
julia> pinthread(32);
julia> current_cpu()
32
julia> current_numa_node()
2
julia> x = Vector{Float64}(numalocal(), 10); rand!(x);
julia> which_numa_node(x)
2
The same works for multiple threads, each pinned to a separate NUMA domain (in random order):
# Run with as many threads as there are NUMA domains, e.g.:
# julia --project -t 8 numa_node_alloc_local.jl [power]
using NUMA
using ThreadPinning
using Base.Threads
using Random
@assert nthreads() == nnumanodes()
# pin each thread to a different, randomly chosen NUMA domain
# (first CPU ID of every domain, in shuffled order)
pinthreads(shuffle!(first.(cpuids_per_numa())))
N = length(ARGS) != 0 ? 10^(parse(Int, first(ARGS))) : 10^7 # optional cmdline arg [power]: array length is 10^power (default 10^7)
xs = Vector{Vector{Float64}}(undef, nthreads());
@threads :static for i in 1:nthreads()
    xs[i] = Vector{Float64}(numanode(current_numa_node()), N) # explicit node ID of the current thread: works
    # xs[i] = Vector{Float64}(numalocal(), N) # did not behave as expected here (cause unclear)
end
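first_touch_from = shuffle(current_numa_nodes()) # random permutation of the node IDs
@threads :static for i in 1:nthreads()
    # thread i fills array first_touch_from[i], usually not the one it allocated
    rand!(xs[first_touch_from[i]])
end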
println("Size of each array: ", Base.format_bytes(sizeof(Float64) * N))
println("Requested memory for arrays from nodes:\t", current_numa_nodes())
println("Filled the arrays from (random) nodes:\t", first_touch_from)
println("Queried locations of memory pages:\t", which_numa_node.(xs))
$ julia --project -t 8 numa_node_alloc_local.jl
Size of each array: 76.294 MiB
Requested memory for arrays from nodes: [1, 3, 5, 8, 7, 6, 2, 4]
Filled the arrays from (random) nodes: [8, 5, 7, 6, 1, 3, 2, 4]
Queried locations of memory pages: [1, 3, 5, 8, 7, 6, 2, 4]
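The pinning pattern used in the script, `shuffle!(first.(cpuids_per_numa()))`, can be illustrated without NUMA hardware. The following is a minimal sketch that assumes a made-up topology of four domains with four CPU IDs each; `mock_cpuids_per_numa` is a hypothetical stand-in for the result of `cpuids_per_numa()`, not real output.

```julia
using Random

# Hypothetical topology: 4 NUMA domains with 4 CPU IDs each. On a real
# machine this list would come from cpuids_per_numa() instead.
mock_cpuids_per_numa = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

# Take the first CPU ID of every domain: one candidate CPU per NUMA domain.
one_cpu_per_domain = first.(mock_cpuids_per_numa)

# Shuffle a copy, so that thread i lands in a random domain while no two
# threads share a domain.
pin_order = shuffle(one_cpu_per_domain)

# Every domain still appears exactly once; only the order has changed.
@assert sort(pin_order) == one_cpu_per_domain
println(pin_order)
```

Because exactly one CPU ID is taken per domain before shuffling, the threads end up spread over all domains, which is why the requested nodes in the output above form a permutation of 1 to 8.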