NVIDIA Monitoring (NvMon)
Note: This is a maturing feature. Only NVIDIA GPUs are supported.
Index
LIKWID.NvMon.add_event_set
LIKWID.NvMon.get_event_results
LIKWID.NvMon.get_id_of_active_group
LIKWID.NvMon.get_last_metric
LIKWID.NvMon.get_last_result
LIKWID.NvMon.get_longinfo_of_group
LIKWID.NvMon.get_metric
LIKWID.NvMon.get_metric_results
LIKWID.NvMon.get_metric_results
LIKWID.NvMon.get_name_of_counter
LIKWID.NvMon.get_name_of_event
LIKWID.NvMon.get_name_of_group
LIKWID.NvMon.get_name_of_metric
LIKWID.NvMon.get_number_of_events
LIKWID.NvMon.get_number_of_gpus
LIKWID.NvMon.get_number_of_groups
LIKWID.NvMon.get_number_of_metrics
LIKWID.NvMon.get_result
LIKWID.NvMon.get_shortinfo_of_group
LIKWID.NvMon.get_time_of_group
LIKWID.NvMon.init
LIKWID.NvMon.isgroupsupported
LIKWID.NvMon.nvmon
LIKWID.NvMon.read_counters
LIKWID.NvMon.setup_counters
LIKWID.NvMon.start_counters
LIKWID.NvMon.stop_counters
LIKWID.NvMon.supported_groups
LIKWID.NvMon.switch_group
LIKWID.NvMon.@nvmon
API
LIKWID.NvMon.add_event_set — Method
add_event_set(estr) -> groupid
Add a performance group or a custom event set to the nvmon module. Returns a groupid (starting at 1), which is required later to refer to the event set.
LIKWID.NvMon.get_event_results — Method
get_event_results([groupid_or_groupname, eventid_or_eventname, gpuid::Integer])
Retrieve the results of monitored events. Same as get_metric_results, but for raw events.
LIKWID.NvMon.get_id_of_active_group — Method
Return the groupid of the currently active group.
LIKWID.NvMon.get_last_metric — Method
Return the derived metric result of the last measurement cycle, identified by group groupid and the indices for metric metricidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_last_result — Method
Return the raw counter register result of the last measurement cycle, identified by group groupid and the indices for event eventidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_longinfo_of_group — Method
Return the (long) description of a performance group with id groupid (starts at 1).
LIKWID.NvMon.get_metric — Method
Return the derived metric result of all measurements, identified by group groupid and the indices for metric metricidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_metric_results — Method
get_metric_results([groupid_or_groupname, metricid_or_metricname, gpuid::Integer])
Retrieve the results of monitored metrics.
Optionally, a group, metric, and gpuid can be provided to select a subset of metrics or a single metric. If given as integers, note that groupid, metricid, and gpuid all start at 1, and that gpuid enumerates the monitored GPUs.
If no arguments are provided, a nested data structure is returned in which the levels correspond to performance groups, GPUs, and metrics (in this order).
LIKWID.NvMon.get_metric_results — Method
get_metric_results()
Get the metric results for all performance groups and all monitored GPUs (see NvMon.init).
Returns an OrderedDict whose keys correspond to the performance groups and whose values hold the results for all monitored GPUs.
LIKWID.NvMon.get_name_of_counter — Method
Return the name of the counter register identified by groupid and eventidx (both start at 1).
LIKWID.NvMon.get_name_of_event — Method
Return the name of the event identified by groupid and eventidx (both start at 1).
LIKWID.NvMon.get_name_of_group — Method
Return the name of the group identified by groupid (starts at 1). If it is a custom event set, the name is set to Custom.
LIKWID.NvMon.get_name_of_metric — Method
Return the name of a derived metric identified by groupid and metricidx (both start at 1).
LIKWID.NvMon.get_number_of_events — Method
Return the number of events in the group with id groupid (starts at 1).
LIKWID.NvMon.get_number_of_gpus — Method
Return the number of GPUs initialized in the nvmon module.
LIKWID.NvMon.get_number_of_groups — Method
Return the number of groups currently registered in the nvmon module.
LIKWID.NvMon.get_number_of_metrics — Method
Return the number of metrics in the group with id groupid (starts at 1). Always zero for custom event sets.
LIKWID.NvMon.get_result — Method
Return the raw counter register result of all measurements, identified by group groupid and the indices for event eventidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_shortinfo_of_group — Method
Return the short description of a performance group with id groupid (starts at 1).
LIKWID.NvMon.get_time_of_group — Method
Return the measurement time for the group identified by groupid (starts at 1).
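The group query functions above can be combined to inspect what a registered group contains. The following is a minimal sketch, not a definitive usage pattern: it assumes an NVIDIA GPU with ID 0, that the FLOPS_DP group is available on it, that the functions take their arguments in the order in which the descriptions name them (groupid first, then the index), and that NvMon.finalize takes no arguments.

```julia
using LIKWID

# GPU IDs passed to init start at 0.
NvMon.init(0)

# Group IDs returned by add_event_set start at 1.
gid = NvMon.add_event_set("FLOPS_DP")

println("Group: ", NvMon.get_name_of_group(gid))

# Event and metric indices start at 1.
# Assumption: argument order is (groupid, index).
for i in 1:NvMon.get_number_of_events(gid)
    println("  event ", i, ": ", NvMon.get_name_of_event(gid, i))
end
for i in 1:NvMon.get_number_of_metrics(gid)
    println("  metric ", i, ": ", NvMon.get_name_of_metric(gid, i))
end

# Assumption: finalize takes no arguments.
NvMon.finalize()
```

For a custom event set, the metric loop would not run, since get_number_of_metrics is always zero in that case.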
LIKWID.NvMon.init — Function
init(gpuid_or_gpuids)
Initialize LIKWID's NvMon module for the GPU(s) with the given GPU id(s) (starting at 0!).
LIKWID.NvMon.isgroupsupported — Function
Check whether the given performance group is available on the given GPU (defaults to the first GPU).
LIKWID.NvMon.nvmon — Method
nvmon(f, group_or_groups[; gpuids])
Monitor performance groups while executing the given function f on one or multiple GPUs. Note that
- NvMon.init and NvMon.finalize are called automatically
- the measurement of multiple performance groups is sequential and requires multiple executions of f!
Keyword arguments:
- gpuids (default: first GPU): specify the GPUs to be monitored
Note: This is an experimental feature and might change or be dropped at any time!
Example
julia> using LIKWID, CUDA
julia> x = CUDA.rand(1000); y = CUDA.rand(1000);
julia> metrics, events = nvmon("FLOPS_DP") do
           CUDA.@sync x .+ y;
       end;
LIKWID.NvMon.read_counters — Method
Read the counter registers. To be executed after start_counters and before stop_counters. Returns true on success.
LIKWID.NvMon.setup_counters — Method
Program the counter registers to measure all events in group groupid (starts at 1). Returns true on success.
LIKWID.NvMon.start_counters — Method
Start the counter registers. Returns true on success.
LIKWID.NvMon.stop_counters — Method
Stop the counter registers. Returns true on success.
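The counter functions can be chained into a manual measurement, which is roughly what nvmon automates. This is a hedged sketch rather than the library's own implementation: it assumes an NVIDIA GPU with ID 0, CUDA.jl for the workload, that the FLOPS_DP group is supported, and that NvMon.finalize takes no arguments.

```julia
using LIKWID
using CUDA  # assumption: CUDA.jl provides the GPU workload

# Initialize NvMon for the first GPU (GPU IDs passed to init start at 0).
NvMon.init(0)

# Register a performance group; the returned group ID starts at 1.
gid = NvMon.add_event_set("FLOPS_DP")

# Program the counters for that group, then start measuring.
# setup_counters and start_counters return true on success.
NvMon.setup_counters(gid) || error("setup_counters failed")
NvMon.start_counters() || error("start_counters failed")

# The workload to be measured.
x = CUDA.rand(Float64, 1000); y = CUDA.rand(Float64, 1000)
CUDA.@sync x .+ y

NvMon.stop_counters() || error("stop_counters failed")

# Nested results: performance groups, then GPUs, then metrics.
results = NvMon.get_metric_results()

# Assumption: finalize takes no arguments.
NvMon.finalize()
```

Checking the Boolean return values makes a failed setup visible immediately instead of yielding empty results later.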
LIKWID.NvMon.supported_groups — Function
Return a dictionary of all available nvmon groups for the GPU identified by gpu (starts at 0).
Examples
julia> NvMon.supported_groups()
Dict{String, LIKWID.GroupInfoCompact} with 4 entries:
"DATA" => DATA => Load to store ratio
"FLOPS_SP" => FLOPS_SP => Single-precision floating point
"FLOPS_HP" => FLOPS_HP => Half-precision floating point
"FLOPS_DP" => FLOPS_DP => Double-precision floating point
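A group can be checked for availability before it is measured. The sketch below assumes that isgroupsupported accepts the group name as its first argument and falls back to the first GPU when no GPU is given; the exact signature is not shown on this page, so treat the call as an assumption.

```julia
using LIKWID

group = "FLOPS_DP"

# Assumption: isgroupsupported(group) checks the first GPU by default.
if NvMon.isgroupsupported(group)
    println(group, " is available on the first GPU")
else
    # supported_groups returns a dictionary keyed by group name.
    available = join(keys(NvMon.supported_groups()), ", ")
    println(group, " is not available; supported groups: ", available)
end
```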
LIKWID.NvMon.switch_group — Method
Switch the currently active group to groupid (starts at 1). Returns true on success.
LIKWID.NvMon.@nvmon — Macro
@nvmon group_or_groups codeblock
See also: nvmon
Note: This is an experimental feature and might change or be dropped at any time!
Example
julia> using LIKWID, CUDA
julia> x = CUDA.rand(1000); y = CUDA.rand(1000);
julia> metrics, events = @nvmon "FLOPS_DP" x .+ y;