How to pin Julia threads

Below, we demonstrate how to use LIKWID.jl to pin Julia threads to specific cores. However, before we do that, let us note two things.

Note

Instead of LIKWID's pinning features, we generally strongly recommend to use ThreadPinning.jl to pin Julia threads to cores, as it provides many more options and visualizations!

Note

Note that Julia implements task-based multithreading where N tasks get mapped onto M OS threads (M:N hybrid threading). We will pin the Julia (p)threads and not the tasks. Depending on how the latter are started/configured, tasks may migrate between Julia threads!

Dynamic pinning

First, make sure to start Julia in multithreaded mode, i.e. julia -t N where N is the desired number of Julia threads (below I'll use N=10).

N = Threads.nthreads()
10

Let's find out on which cores the Julia threads are currently running (before we've pinned them). We will use LIKWID.get_processor_id in combination with Threads.@threads :static for, which guarantees that the tasks associated with different instances of the loop body get executed on different Julia threads (ThreadPinning.jl provides @tspawnat as a nice(r) alternative).

using LIKWID
coreids = zeros(Int, N)
Threads.@threads :static for i in 1:N
    coreids[i] = LIKWID.get_processor_id()
end
println("Cores: ", coreids)
Cores: [37, 40, 38, 41, 39, 42, 32, 33, 47, 34]

Since querying all core ids is a common operation, we provide LIKWID.get_processor_ids which returns all core ids right away.

println("Cores: ", LIKWID.get_processor_ids())
Cores: [37, 40, 38, 41, 39, 42, 32, 33, 47, 34]

To pin a thread to a specific core, there is LIKWID.pinthread. Using Threads.@threads :static for like above, we can, for example, pin the N Julia threads to the first N cores.

cores_firstN = 0:N-1
Threads.@threads :static for i in 1:N
    LIKWID.pinthread(cores_firstN[i])
end
println("Cores: ", LIKWID.get_processor_ids())
Cores: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

To avoid the explicit for-loop, we can directly use LIKWID.pinthreads to pin all Julia threads. Let's realize a less trivial shuffled mapping.

using Random
cores_firstN_shuffeled = shuffle(cores_firstN)
LIKWID.pinthreads(cores_firstN_shuffeled)
println("Cores: ", LIKWID.get_processor_ids())
LIKWID.get_processor_ids() == cores_firstN_shuffeled
true

likwid-pin

Command-line interface tool for pinning (p)threads. For details, check out the official documentation.

Important: Mask

(See this discussion on the Julia discourse.)

In general, likwid-pin pins all pthreads. However, julia involves more than the "Julia user threads" specified via the -t option. For example, it create an additional unix signal thread (in src/signals-unix.c) and - unless OPENBLAS_NUM_THREADS=1 - the OpenBLAS related threads (blas_thread_init () in [..]/lib/julia/libopenblas64_.so). Hence, when you run likwid-pin -c 0-3 julia -t 4 the four cores (0-3) are actually oversubscribed and multiple "Julia user threads" get pinned to the same core.

To work around this, we need to provide a mask to likwid-pin via the -s option. To compute an appropriate mask for N "Julia user threads" you may use the helper function LIKWID.pinmask(N):

LIKWID.pinmask(4)
"0xffffffffffffffe1"

Example

(See https://github.com/JuliaPerf/LIKWID.jl/tree/main/examples/cli/likwid-pin/.)

# pin.jl
using Base.Threads

glibc_coreid() = @ccall sched_getcpu()::Cint

@threads :static for i in 1:nthreads()
    println("Thread: $(i), CPU: $(glibc_coreid())")
end

Running this file with e.g. likwid-pin -s 0xffffffffffffffe1 -c 1,3,5,7 julia -t 4 pin.jl one obtains

[pthread wrapper]
[pthread wrapper] MAIN -> 1
[pthread wrapper] PIN_MASK: 0->3  1->5  2->7
[pthread wrapper] SKIP MASK: 0xFFFFFFFFFFFFFFE1
	threadid 140576878921280 -> SKIP
	threadid 140576612378176 -> hwthread 3 - OK
	threadid 140576590759488 -> hwthread 5 - OK
	threadid 140576494188096 -> hwthread 7 - OK
Thread: 1, CPU: 1
Thread: 2, CPU: 3
Thread: 3, CPU: 5
Thread: 4, CPU: 7

If you're wondering about the -s 0xffffffffffffffe1 option, see Mask above.


This page was generated using Literate.jl.