NVIDIA Monitoring (NvMon)

Note: This is a maturing feature. Only NVIDIA GPUs are supported.

API

LIKWID.NvMon.add_event_set - Method
add_event_set(estr) -> groupid

Add a performance group or a custom event set to the nvmon module. Returns a groupid (starting at 1) that identifies the event set in later calls.
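
To show where the returned groupid is used, here is a minimal sketch of a manual measurement cycle. It assumes an NVIDIA GPU with id 0 and an installed CUDA.jl. The calls NvMon.setup_counters, NvMon.start_counters, and NvMon.stop_counters are not described on this page and are assumptions, as is the argument order (groupid, metricidx, gpuid) of get_last_metric, which is inferred from its description.

```julia
using LIKWID
using CUDA  # assumption: CUDA.jl is installed and provides the GPU workload

x = CUDA.rand(1000); y = CUDA.rand(1000)

NvMon.init(0)                          # GPU ids passed to init start at 0
gid = NvMon.add_event_set("FLOPS_DP")  # returned group ids start at 1

# Assumed counter-control calls (not documented in this section).
NvMon.setup_counters(gid)
NvMon.start_counters()
CUDA.@sync x .+ y
NvMon.stop_counters()

# Indices start at 1: first metric of group `gid` on the first monitored GPU.
# The argument order is an assumption based on the docstring wording.
value = NvMon.get_last_metric(gid, 1, 1)

NvMon.finalize()
```

If these assumptions do not hold for the installed version, nvmon and @nvmon wrap the same cycle and handle NvMon.init and NvMon.finalize automatically.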

LIKWID.NvMon.get_last_metric - Method

Return the derived metric result of the last measurement cycle identified by group groupid and the indices for metric metricidx and gpu gpuid (all starting at 1).

LIKWID.NvMon.get_last_result - Method

Return the raw counter register result of the last measurement cycle identified by group groupid and the indices for event eventidx and gpu gpuid (all starting at 1).

LIKWID.NvMon.get_metric - Method

Return the derived metric result of all measurements identified by group groupid and the indices for metric metricidx and gpu gpuid (all starting at 1).

LIKWID.NvMon.get_metric_results - Method

get_metric_results([groupid_or_groupname, metricid_or_metricname, gpuid::Integer])

Retrieve the results of monitored metrics.

Optionally, a group, metric, and gpuid can be provided to select a subset of metrics or a single metric. If given as integers, note that groupid, metricid, and gpuid all start at 1 and the latter enumerates the monitored gpus.

If no arguments are provided, a nested data structure is returned in which different levels correspond to performance groups, gpus, and metrics (in this order).
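
The nesting order described above (group, then gpu, then metric) can be walked with three loops. The sketch below flattens such a structure into rows. The helper name flatten_metric_results is hypothetical, and the sketch assumes the gpu level is an indexable collection and the metric level maps names to numbers. The mock input, including its metric names and values, is a made-up placeholder and not real measurement output.

```julia
# Hypothetical helper: flatten a group -> gpu -> metric nesting into rows of
# (group name, gpu index, metric name, value).
function flatten_metric_results(results)
    rows = Tuple{String,Int,String,Float64}[]
    for (group, per_gpu) in pairs(results)
        for (gpuidx, gpu_metrics) in pairs(per_gpu)
            for (metric, value) in pairs(gpu_metrics)
                push!(rows, (String(group), Int(gpuidx), String(metric), Float64(value)))
            end
        end
    end
    return rows
end

# Placeholder stand-in for the return value of NvMon.get_metric_results();
# the metric names and numbers are invented for illustration only.
mock = Dict("FLOPS_DP" => [Dict("Runtime [s]" => 0.5, "DP [MFLOP/s]" => 12.0)])

rows = flatten_metric_results(mock)
for row in rows
    println(row)
end
```

With real results, the same function would be called as flatten_metric_results(NvMon.get_metric_results()), provided the assumed container types match.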

LIKWID.NvMon.get_metric_results - Method

get_metric_results()

Get the metric results for all performance groups and all monitored (NvMon.init) gpus.

Returns an OrderedDict whose keys correspond to the performance groups and whose values hold the results for all monitored gpus.

LIKWID.NvMon.get_result - Method

Return the raw counter register result of all measurements identified by group groupid and the indices for event eventidx and gpu gpuid (all starting at 1).

LIKWID.NvMon.init - Function
init(gpuid_or_gpuids)

Initialize LIKWID's NvMon module for the gpu(s) with the given gpu id(s) (starting at 0!).

LIKWID.NvMon.nvmon - Method
nvmon(f, group_or_groups[; gpuids])

Monitor performance groups while executing the given function f on one or multiple GPUs. Note that

  • NvMon.init and NvMon.finalize are called automatically
  • the measurement of multiple performance groups is sequential and requires multiple executions of f!

Keyword arguments:

  • gpuids (default: first GPU): specify the GPUs to be monitored

Note: This is an experimental feature and might change or be dropped at any time!

Example

julia> using LIKWID, CUDA

julia> x = CUDA.rand(1000); y = CUDA.rand(1000);

julia> metrics, events = nvmon("FLOPS_DP") do
           CUDA.@sync x .+ y;
       end;
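
As a further sketch, several groups can be measured in one call and the monitored GPU can be chosen with the gpuids keyword. Passing the groups as a vector of names and gpuids as a single integer are assumptions based on the signature nvmon(f, group_or_groups[; gpuids]); the group names are taken from the supported_groups example on this page.

```julia
using LIKWID
using CUDA  # assumption: CUDA.jl is installed

x = CUDA.rand(1000); y = CUDA.rand(1000)

# Groups are measured sequentially, so the do-block runs once per group
# (twice here). Avoid side effects that must not be repeated.
metrics, events = nvmon(["FLOPS_DP", "DATA"]; gpuids=0) do
    CUDA.@sync x .+ y
end
```

Because f is executed once per group, long-running workloads are cheaper to profile one group at a time.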

LIKWID.NvMon.supported_groups - Function

Return a dictionary of all available nvmon groups for the GPU identified by gpu (starts at 0).

Examples

julia> NvMon.supported_groups()
Dict{String, LIKWID.GroupInfoCompact} with 4 entries:
  "DATA"     => DATA => Load to store ratio
  "FLOPS_SP" => FLOPS_SP => Single-precision floating point
  "FLOPS_HP" => FLOPS_HP => Half-precision floating point
  "FLOPS_DP" => FLOPS_DP => Double-precision floating point
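
Since the return value is a dictionary keyed by group name, it can be used to check for a group before starting a measurement. A minimal sketch, assuming an NVIDIA GPU is present and that the no-argument call shown above is valid:

```julia
using LIKWID

groups = NvMon.supported_groups()  # Dict keyed by group name
wanted = "FLOPS_DP"

if haskey(groups, wanted)
    println("Group ", wanted, " is available: ", groups[wanted])
else
    available = join(sort(collect(keys(groups))), ", ")
    println("Group ", wanted, " is not available; choose one of: ", available)
end
```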

LIKWID.NvMon.@nvmon - Macro
@nvmon group_or_groups codeblock

See also: nvmon

Note: This is an experimental feature and might change or be dropped at any time!

Example

julia> using LIKWID, CUDA

julia> x = CUDA.rand(1000); y = CUDA.rand(1000);

julia> metrics, events = @nvmon "FLOPS_DP" x .+ y;