Marker API (CPU)
Example
(See https://github.com/JuliaPerf/LIKWID.jl/tree/main/examples/perfctr.)
# perfctr.jl
using LIKWID
using LinearAlgebra
Marker.init()
A = rand(128, 64)
B = rand(64, 128)
C = zeros(128, 128)
@marker "matmul" for _ in 1:100
    mul!(C, A, B)
end
Marker.close()
Manual
# perfctr.jl
using LIKWID
using LinearAlgebra
Marker.init()
A = rand(128, 64)
B = rand(64, 128)
C = zeros(128, 128)
Marker.registerregion("matmul") # optional
Marker.startregion("matmul")
for _ in 1:100
    mul!(C, A, B)
end
Marker.stopregion("matmul")
Marker.close()
Index
LIKWID.Marker.close
LIKWID.Marker.getregion
LIKWID.Marker.init
LIKWID.Marker.init_dynamic
LIKWID.Marker.init_nothreads
LIKWID.Marker.isactive
LIKWID.Marker.marker
LIKWID.Marker.nextgroup
LIKWID.Marker.perfmon_marker
LIKWID.Marker.registerregion
LIKWID.Marker.resetregion
LIKWID.Marker.startregion
LIKWID.Marker.stopregion
LIKWID.Marker.threadinit
LIKWID.Marker.@marker
LIKWID.Marker.@parallelmarker
LIKWID.Marker.@perfmon_marker
API
LIKWID.Marker.close — Method
Close the connection to the LIKWID Marker API and write out measurement data to file. This file will be evaluated by likwid-perfctr.
LIKWID.Marker.getregion — Method
getregion(regiontag::AbstractString, [num_events]) -> nevents, events, time, count
Get the intermediate results of the region identified by regiontag. On success, it returns
* nevents: the number of events in the current group,
* events: a list with all the aggregated event results,
* time: the measurement time for the region, and
* count: the number of calls.
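As a rough illustration of the return values listed above, the following sketch queries a region right after stopping it. It assumes the script runs under likwid-perfctr with the Marker API enabled; the region name "matmul" and the variable names `elapsed` and `ncalls` are chosen here for illustration only, and the values describe the calling thread.

```julia
using LIKWID
using LinearAlgebra

Marker.init()

A = rand(128, 64)
B = rand(64, 128)
C = zeros(128, 128)

Marker.startregion("matmul")
for _ in 1:100
    mul!(C, A, B)
end
Marker.stopregion("matmul")

# Query the intermediate results while the Marker API is still open.
# The four return values follow the signature documented above.
nevents, events, elapsed, ncalls = Marker.getregion("matmul")

println("events in current group: ", nevents)
println("measurement time:        ", elapsed)
println("number of calls:         ", ncalls)
for (i, value) in enumerate(events)
    println("event ", i, ": ", value)
end

Marker.close()
```

Since the region is started and stopped once, `ncalls` would be expected to be 1 in this sketch.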
LIKWID.Marker.init — Method
Initialize the Marker API, assuming that Julia is running under likwid-perfctr. Must be called before all other functions.
LIKWID.Marker.init_dynamic — Method
init_dynamic(group_or_groups; kwargs...)
Initialize the full Marker API from within the current Julia session (i.e. no likwid-perfctr necessary). A performance group, e.g. "FLOPS_DP", must be provided as the first argument.
LIKWID.Marker.init_nothreads — Method
Initialize the Marker API only on the main thread (assuming that Julia is running under likwid-perfctr). LIKWID.Marker.threadinit() must be called manually.
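One possible way to do the manual thread registration is sketched below. It assumes Julia was started with several threads under likwid-perfctr, and it relies on the `:static` schedule handing exactly one loop iteration to each Julia thread, so that every thread calls Marker.threadinit() once. The region name "work" is hypothetical.

```julia
using LIKWID

# Initialize on the main thread only.
Marker.init_nothreads()

# Register every Julia thread with the Marker API.
# Assumption: with :static and nthreads() iterations, each thread runs one iteration.
Threads.@threads :static for _ in 1:Threads.nthreads()
    Marker.threadinit()
end

# Measure a region on each thread separately.
Threads.@threads :static for _ in 1:Threads.nthreads()
    Marker.startregion("work")
    sum(rand(10^6))  # thread-local computation
    Marker.stopregion("work")
end

Marker.close()
```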
LIKWID.Marker.isactive — Method
Checks whether the Marker API is active (by checking if the LIKWID_MODE environment variable has been set).
LIKWID.Marker.marker — Method
marker(f, regiontag::AbstractString)
Adds a LIKWID marker region around the execution of the given function f using Marker.startregion and Marker.stopregion under the hood. Note that LIKWID.Marker.init() and LIKWID.Marker.close() must be called before and after, respectively.
Examples
julia> using LIKWID
julia> Marker.init()
julia> marker("sleeping...") do
           sleep(1)
       end
true
julia> marker(()->rand(100), "create rand vec")
true
julia> Marker.close()
LIKWID.Marker.nextgroup — Method
Switch to the next event set in a round-robin fashion. If you have set only one event set on the command line, this function performs no operation.
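The round-robin switching can be sketched as follows. This is a minimal example that assumes the script is started under likwid-perfctr with two event sets given on the command line; the iteration count of 2 and the region name "sum" are illustrative choices, not requirements of the API.

```julia
using LIKWID

Marker.init()

x = rand(10^6)

# Assumption: two event sets were configured, so the same region
# is measured once per event set.
for _ in 1:2
    Marker.startregion("sum")
    sum(x)
    Marker.stopregion("sum")

    # Advance to the next event set (no operation if only one set is configured).
    Marker.nextgroup()
end

Marker.close()
```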
LIKWID.Marker.perfmon_marker — Method
perfmon_marker(f, group_or_groups[; kwargs...])
Monitor performance groups in marked areas (see @marker) while executing the given function f on one or multiple Julia threads.
This is an experimental feature!
Note that
* Marker.init_dynamic, Marker.init, Marker.close, and PerfMon.finalize are called automatically
* the measurement of multiple performance groups is sequential and requires multiple executions of f!
Keyword arguments:
* cpuids (default: currently used CPU threads): specify the CPU threads (~ cores) to be monitored
* autopin (default: true): automatically pin Julia threads to the CPU threads (~ cores) they are currently running on (to avoid migration and wrong results).
* keep (default: false): keep the temporarily created marker file
Example
julia> using LIKWID
julia> perfmon_marker("FLOPS_DP") do
           # only the marked regions are monitored!
           NUM_FLOPS = 100_000_000
           a = 1.8
           b = 3.2
           c = 1.3
           @marker "calc_flops" for _ in 1:NUM_FLOPS
               c = a * b + c
           end
           z = a*b+c
           @marker "exponential" exp(z)
           sin(c)
       end
Region: calc_flops, Group: FLOPS_DP
┌───────────────────────────┬───────────┐
│                     Event │  Thread 1 │
├───────────────────────────┼───────────┤
│          ACTUAL_CPU_CLOCK │ 3.00577e8 │
│             MAX_CPU_CLOCK │ 2.08917e8 │
│      RETIRED_INSTRUCTIONS │ 3.00005e8 │
│       CPU_CLOCKS_UNHALTED │ 3.00067e8 │
│ RETIRED_SSE_AVX_FLOPS_ALL │     1.0e8 │
│                     MERGE │       0.0 │
└───────────────────────────┴───────────┘
┌──────────────────────┬───────────┐
│               Metric │  Thread 1 │
├──────────────────────┼───────────┤
│  Runtime (RDTSC) [s] │ 0.0852431 │
│ Runtime unhalted [s] │  0.122687 │
│          Clock [MHz] │   3524.84 │
│                  CPI │   1.00021 │
│         DP [MFLOP/s] │   1173.12 │
└──────────────────────┴───────────┘
Region: exponential, Group: FLOPS_DP
┌───────────────────────────┬──────────┐
│                     Event │ Thread 1 │
├───────────────────────────┼──────────┤
│          ACTUAL_CPU_CLOCK │  85696.0 │
│             MAX_CPU_CLOCK │  59192.0 │
│      RETIRED_INSTRUCTIONS │   5072.0 │
│       CPU_CLOCKS_UNHALTED │   6013.0 │
│ RETIRED_SSE_AVX_FLOPS_ALL │     27.0 │
│                     MERGE │      0.0 │
└───────────────────────────┴──────────┘
┌──────────────────────┬────────────┐
│               Metric │   Thread 1 │
├──────────────────────┼────────────┤
│  Runtime (RDTSC) [s] │ 2.60005e-7 │
│ Runtime unhalted [s] │ 3.49786e-5 │
│          Clock [MHz] │    3546.95 │
│                  CPI │    1.18553 │
│         DP [MFLOP/s] │    103.844 │
└──────────────────────┴────────────┘
LIKWID.Marker.registerregion — Method
Register a region with name regiontag to the Marker API. On success, true is returned.
This is an optional function to reduce the overhead of region registration at Marker.startregion. If you don't call registerregion, the registration is done at startregion.
LIKWID.Marker.resetregion — Method
Reset the values stored using the region name regiontag. On success, true is returned.
LIKWID.Marker.startregion — Method
Start measurements under the name regiontag. On success, true is returned.
LIKWID.Marker.stopregion — Method
Stop measurements under the name regiontag. On success, true is returned.
LIKWID.Marker.threadinit — Method
Add the current thread to the Marker API.
LIKWID.Marker.@marker — Macro
Convenience macro for flanking code with Marker.startregion and Marker.stopregion.
Examples
julia> using LIKWID
julia> Marker.init()
julia> @marker "sleeping..." sleep(1)
true
julia> @marker "create rand vec" rand(100)
true
julia> Marker.close()
LIKWID.Marker.@parallelmarker — Macro
Convenience macro for flanking code with Marker.startregion and Marker.stopregion on all threads separately.
Examples
julia> using LIKWID
julia> Marker.init()
julia> @parallelmarker begin
           Threads.@threads :static for i in 1:Threads.nthreads()
               # thread-local computation
           end
       end
julia> Marker.close()
LIKWID.Marker.@perfmon_marker — Macro
@perfmon_marker group_or_groups codeblock
This is an experimental feature!
See also: perfmon_marker
Example
julia> using LIKWID
julia> @perfmon_marker "FLOPS_DP" begin
           @marker "exponential" exp(3.141)
       end
Region: exponential, Group: FLOPS_DP
┌───────────────────────────┬──────────┐
│                     Event │ Thread 1 │
├───────────────────────────┼──────────┤
│          ACTUAL_CPU_CLOCK │ 115146.0 │
│             MAX_CPU_CLOCK │  78547.0 │
│      RETIRED_INSTRUCTIONS │   4208.0 │
│       CPU_CLOCKS_UNHALTED │   7112.0 │
│ RETIRED_SSE_AVX_FLOPS_ALL │     10.0 │
│                     MERGE │      0.0 │
└───────────────────────────┴──────────┘
┌──────────────────────┬────────────┐
│               Metric │   Thread 1 │
├──────────────────────┼────────────┤
│  Runtime (RDTSC) [s] │ 3.02056e-8 │
│ Runtime unhalted [s] │ 4.70008e-5 │
│          Clock [MHz] │     3591.4 │
│                  CPI │    1.69011 │
│         DP [MFLOP/s] │    331.064 │
└──────────────────────┴────────────┘