NVIDIA Monitoring (NvMon)
Note: This is a maturing feature. Only NVIDIA GPUs are supported.
Index
LIKWID.NvMon.add_event_set
LIKWID.NvMon.get_event_results
LIKWID.NvMon.get_id_of_active_group
LIKWID.NvMon.get_last_metric
LIKWID.NvMon.get_last_result
LIKWID.NvMon.get_longinfo_of_group
LIKWID.NvMon.get_metric
LIKWID.NvMon.get_metric_results
LIKWID.NvMon.get_metric_results
LIKWID.NvMon.get_name_of_counter
LIKWID.NvMon.get_name_of_event
LIKWID.NvMon.get_name_of_group
LIKWID.NvMon.get_name_of_metric
LIKWID.NvMon.get_number_of_events
LIKWID.NvMon.get_number_of_gpus
LIKWID.NvMon.get_number_of_groups
LIKWID.NvMon.get_number_of_metrics
LIKWID.NvMon.get_result
LIKWID.NvMon.get_shortinfo_of_group
LIKWID.NvMon.get_time_of_group
LIKWID.NvMon.init
LIKWID.NvMon.isgroupsupported
LIKWID.NvMon.nvmon
LIKWID.NvMon.read_counters
LIKWID.NvMon.setup_counters
LIKWID.NvMon.start_counters
LIKWID.NvMon.stop_counters
LIKWID.NvMon.supported_groups
LIKWID.NvMon.switch_group
LIKWID.NvMon.@nvmon
API
LIKWID.NvMon.add_event_set — Method
add_event_set(estr) -> groupid
Add a performance group or a custom event set to the nvmon module. Returns a groupid (starting at 1), which is required to refer to the event set later.
LIKWID.NvMon.get_event_results — Method
get_event_results([groupid_or_groupname, eventid_or_eventname, gpuid::Integer])
Retrieve the results of monitored events. Same as get_metric_results but for raw events.
LIKWID.NvMon.get_id_of_active_group — Method
Return the groupid of the currently active group.
LIKWID.NvMon.get_last_metric — Method
Return the derived metric result of the last measurement cycle identified by group groupid and the indices for metric metricidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_last_result — Method
Return the raw counter register result of the last measurement cycle identified by group groupid and the indices for event eventidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_longinfo_of_group — Method
Return the (long) description of a performance group with id groupid (starts at 1).
LIKWID.NvMon.get_metric — Method
Return the derived metric result of all measurements identified by group groupid and the indices for metric metricidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_metric_results — Method
get_metric_results([groupid_or_groupname, metricid_or_metricname, gpuid::Integer])
Retrieve the results of monitored metrics.
Optionally, a group, metric, and gpuid can be provided to select a subset of metrics or a single metric. If given as integers, note that groupid, metricid, and gpuid all start at 1, and that gpuid enumerates the monitored gpus.
If no arguments are provided, a nested data structure is returned in which different levels correspond to performance groups, gpus, and metrics (in this order).
LIKWID.NvMon.get_metric_results — Method
get_metric_results()
Get the metric results for all performance groups and all monitored (NvMon.init) gpus.
Returns an OrderedDict whose keys correspond to the performance groups and whose values hold the results for all monitored gpus.
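As a rough illustration of the nesting described above (performance groups, then gpus, then metrics), the following sketch flattens such a structure into rows. The helper flatten_results is hypothetical and not part of LIKWID.jl. The data in mock_results and its metric names are placeholders, not real NvMon output. The sketch assumes that each group's value can be iterated gpu by gpu and that each gpu entry maps metric names to numbers.

```julia
# Hypothetical helper (not part of LIKWID.jl): walk the nested result
# structure in the order groups -> gpus -> metrics and collect flat rows.
function flatten_results(results)
    rows = Tuple{String,Int,String,Float64}[]
    for (group, per_gpu) in results
        for (gpu_idx, metrics) in enumerate(per_gpu)   # gpu index starts at 1
            for (name, value) in metrics
                push!(rows, (group, gpu_idx, name, value))
            end
        end
    end
    return sort(rows)  # sort for a deterministic order
end

# Placeholder data standing in for the return value of get_metric_results();
# the metric names and values are made up for illustration.
mock_results = Dict(
    "FLOPS_DP" => [
        Dict("metric_a" => 1.0, "metric_b" => 2.0),  # first monitored gpu
        Dict("metric_a" => 3.0, "metric_b" => 4.0),  # second monitored gpu
    ],
)

for (group, gpu_idx, name, value) in flatten_results(mock_results)
    println(group, " | gpu ", gpu_idx, " | ", name, " = ", value)
end
```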
LIKWID.NvMon.get_name_of_counter — Method
Return the name of the counter register identified by groupid and eventidx (both start at 1).
LIKWID.NvMon.get_name_of_event — Method
Return the name of the event identified by groupid and eventidx (both start at 1).
LIKWID.NvMon.get_name_of_group — Method
Return the name of the group identified by groupid (starts at 1). If it is a custom event set, the name is set to Custom.
LIKWID.NvMon.get_name_of_metric — Method
Return the name of a derived metric identified by groupid and metricidx (both start at 1).
LIKWID.NvMon.get_number_of_events — Method
Return the number of events in the group with id groupid (starts at 1).
LIKWID.NvMon.get_number_of_gpus — Method
Return the number of GPUs initialized in the nvmon module.
LIKWID.NvMon.get_number_of_groups — Method
Return the number of groups currently registered in the nvmon module.
LIKWID.NvMon.get_number_of_metrics — Method
Return the number of metrics in the group with id groupid (starts at 1). Always zero for custom event sets.
LIKWID.NvMon.get_result — Method
Return the raw counter register result of all measurements identified by group groupid and the indices for event eventidx and gpu gpuid (all starting at 1).
LIKWID.NvMon.get_shortinfo_of_group — Method
Return the short information about a performance group with id groupid (starts at 1).
LIKWID.NvMon.get_time_of_group — Method
Return the measurement time for the group identified by groupid (starts at 1).
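The introspection functions above can be combined to list what is currently registered. The following is a minimal sketch under stated assumptions, not a definitive implementation. The helper describe_groups is hypothetical. It assumes that NvMon.init and NvMon.add_event_set have already been called, and that the two-argument functions take the groupid first and the event or metric index second.

```julia
using LIKWID

# Hypothetical helper (not part of LIKWID.jl). Assumes the nvmon module has
# been initialized and at least one event set has been added beforehand.
function describe_groups()
    # All ids and indices used here start at 1.
    for gid in 1:LIKWID.NvMon.get_number_of_groups()
        println("Group ", gid, ": ", LIKWID.NvMon.get_name_of_group(gid))
        for eidx in 1:LIKWID.NvMon.get_number_of_events(gid)
            # Assumed argument order: (groupid, eventidx)
            println("  event  ", eidx, ": ", LIKWID.NvMon.get_name_of_event(gid, eidx))
        end
        # For custom event sets the metric count is zero, so this loop is skipped.
        for midx in 1:LIKWID.NvMon.get_number_of_metrics(gid)
            # Assumed argument order: (groupid, metricidx)
            println("  metric ", midx, ": ", LIKWID.NvMon.get_name_of_metric(gid, midx))
        end
    end
end
```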
LIKWID.NvMon.init — Function
init(gpuid_or_gpuids)
Initialize LIKWID's NvMon module for the gpu(s) with the given gpu id(s) (starting at 0!).
LIKWID.NvMon.isgroupsupported — Function
Checks if the given performance group is available on the given GPU (defaults to the first).
LIKWID.NvMon.nvmon — Method
nvmon(f, group_or_groups[; gpuids])
Monitor performance groups while executing the given function f on one or multiple GPUs. Note that
- NvMon.init and NvMon.finalize are called automatically
- the measurement of multiple performance groups is sequential and requires multiple executions of f!
Keyword arguments:
- gpuids (default: first GPU): specify the GPUs to be monitored
Note: This is an experimental feature and might change or be dropped any time!
Example
julia> using LIKWID, CUDA
julia> x = CUDA.rand(1000); y = CUDA.rand(1000);
julia> metrics, events = nvmon("FLOPS_DP") do
CUDA.@sync x .+ y;
end;
LIKWID.NvMon.read_counters — Method
Read the counter registers. To be executed after start_counters and before stop_counters. Returns true on success.
LIKWID.NvMon.setup_counters — Method
Program the counter registers to measure all events in group groupid (starts at 1). Returns true on success.
LIKWID.NvMon.start_counters — Method
Start the counter registers. Returns true on success.
LIKWID.NvMon.stop_counters — Method
Stop the counter registers. Returns true on success.
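The counter functions above are typically used together in a fixed order. The sketch below shows one plausible sequence built only from functions named on this page: init, add_event_set, setup_counters, start_counters, stop_counters, get_metric_results, and finalize. The helper measure_group is hypothetical and not part of LIKWID.jl, and the error handling is an illustrative choice. Running it requires an NVIDIA GPU and a LIKWID installation with nvmon support.

```julia
using LIKWID

# Hypothetical helper (not part of LIKWID.jl): measure one performance group
# around a single execution of `f` on one GPU.
function measure_group(f, group::AbstractString; gpuid::Integer=0)
    LIKWID.NvMon.init(gpuid)                      # gpu ids passed to init start at 0
    try
        gid = LIKWID.NvMon.add_event_set(group)   # returned groupid starts at 1
        LIKWID.NvMon.setup_counters(gid) || error("setup_counters failed")
        LIKWID.NvMon.start_counters() || error("start_counters failed")
        f()                                       # the code to be measured
        LIKWID.NvMon.stop_counters() || error("stop_counters failed")
        # No arguments: nested results for all groups and all monitored gpus.
        return LIKWID.NvMon.get_metric_results()
    finally
        LIKWID.NvMon.finalize()                   # always release the nvmon module
    end
end
```

A call could look like `measure_group(() -> CUDA.@sync(x .+ y), "FLOPS_DP")`, assuming CUDA.jl is loaded and x and y are GPU arrays as in the nvmon example.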
LIKWID.NvMon.supported_groups — Function
Return a dictionary of all available nvmon groups for the GPU identified by gpu (starts at 0).
Examples
julia> NvMon.supported_groups()
Dict{String, LIKWID.GroupInfoCompact} with 4 entries:
"DATA" => DATA => Load to store ratio
"FLOPS_SP" => FLOPS_SP => Single-precision floating point
"FLOPS_HP" => FLOPS_HP => Half-precision floating point
"FLOPS_DP" => FLOPS_DP => Double-precision floating point
LIKWID.NvMon.switch_group — Method
Switch currently active group to groupid (starts at 1). Returns true on success.
LIKWID.NvMon.@nvmon — Macro
@nvmon group_or_groups codeblock
See also: nvmon
Note: This is an experimental feature and might change or be dropped any time!
Example
julia> using LIKWID, CUDA
julia> x = CUDA.rand(1000); y = CUDA.rand(1000);
julia> metrics, events = @nvmon "FLOPS_DP" x .+ y;