PerfTest macros quick reference

The following are the main macros used to define performance test suites. They must always be used inside a test set (see the Test standard library package). Combining the macros listed in this section gives access to the full feature set of the package.
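As a point of reference, here is a minimal sketch of how these macros sit inside a test set; the suite name and target expression are placeholders, not part of the package API:

    using Test, PerfTest

    @testset "Example suite" begin
        # The wrapped call is the performance test target; it runs normally
        # here and is benchmarked once the script is transformed into a suite.
        @perftest sum(rand(1000))
    end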

Declaring test targets

PerfTest.@perftest (Macro)

This macro is used to signal that the wrapped expression is a performance test target, and therefore its performance will be sampled and then evaluated following the current suite configuration.

Evaluating the macro directly does not modify the target at all. Its effects only appear when the script is transformed into a performance testing suite.

This macro is context-sensitive: adjacent macros can change how the target will be evaluated.

@perftest expression [parameters...]

Run a performance test on a given target expression.

Basic usage

The simplest usage is to place @perftest in front of the expression you want to test:

julia> @perftest sin(1)

Additional parameters

You can pass the following keyword arguments to configure the execution process:

  • setup: An expression that is run once per sample, before the benchmarked expression; it is not included in the timing results. Note that each sample can require multiple evaluations.

  • teardown: An expression that is run once per sample after the benchmarked expression.

  • samples: The number of samples to take. Execution will end if this many samples have been collected. Defaults to 10000.

  • seconds: The number of seconds budgeted for the benchmarking process. The trial will terminate if this time is exceeded (regardless of samples), but at least one sample will always be taken. In practice, actual runtime can overshoot the budget by the duration of a sample.

  • evals: The number of evaluations per sample. For best results, this should be kept consistent between trials. A good guess for this value can be automatically set on a benchmark via tune!, but using tune! can be less consistent than setting evals manually (which bypasses tuning).

  • gctrial: If true, run gc() before executing this benchmark's trial. Defaults to true.

  • gcsample: If true, run gc() before each sample. Defaults to false.

  • time_tolerance: The noise tolerance for the benchmark's time estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to 0.05.

Examples

Basic performance test

 @perftest sin(1)

With setup and teardown

 @perftest sort!(data) setup=(data=rand(100)) teardown=(data=nothing)

With custom parameters

 # Run with a 3-second time budget
 @perftest sin(x) setup=(x=rand()) seconds=3

 # Limit to 100 samples with 10 evaluations each
 @perftest myfunction(data) samples=100 evals=10

 # Skip forced garbage collection both before the trial and before each sample
 @perftest allocating_function() gcsample=false gctrial=false


Declaring metrics

PerfTest.@define_metric (Macro)

This macro is used to define a new custom metric.

Arguments

  • name : the name of the metric for identification purposes.
  • units : the unit space that the metric values will be in.
  • formula block : an expression that returns a single value, the metric value. The formula can contain any Julia expression, and some special symbols are supported in addition. The formula may be evaluated several times: it is applied to every target in every test set, or, if it is defined inside a test set, only to the targets of that test set. NOTE: if a formula block needs to refer to a variable, that variable first needs to be exported using the macro @export_vars, otherwise an error will occur.

Special symbols:

  • :median_time : will be substituted by the median time the target took to execute in the benchmark.
  • :minimum_time : will be substituted by the minimum time the target took to execute in the benchmark.
  • :ret_value : will be substituted by the return value of the target.
  • :autoflop : will be substituted by the FLOP count of the target.
  • :printed_output : will be substituted by the standard output stream of the target.
  • :iterator : will be substituted by the current iterator value in a loop test set.
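
For illustration, a hedged sketch of a metric definition: it assumes the name and units are passed as strings before the formula block, and the metric itself is a placeholder:

    # Hypothetical metric: attained FLOP rate, built from the special symbols above
    @define_metric "FLOPS" "flop/s" begin
        :autoflop / :median_time
    end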
PerfTest.@auxiliary_metric (Macro)

Defines a custom metric for informational purposes that will not be used for testing but will be printed as output.

Arguments

  • name : the name of the metric for identification purposes.
  • units : the unit space that the metric values will be in.
  • formula block : an expression that returns a single value, the metric value. The formula can contain any Julia expression, and some special symbols are supported in addition. The formula may be evaluated several times: it is applied to every target in every test set, or, if it is defined inside a test set, only to the targets of that test set.

Special symbols:

  • :median_time : will be substituted by the median time the target took to execute in the benchmark.
  • :minimum_time : will be substituted by the minimum time the target took to execute in the benchmark.
  • :ret_value : will be substituted by the return value of the target.
  • :autoflop : will be substituted by the FLOP count of the target.
  • :printed_output : will be substituted by the standard output stream of the target.
  • :iterator : will be substituted by the current iterator value in a loop test set.
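
A sketch under the same assumed name/units/formula argument order; the metric is only printed, never tested:

    # Hypothetical informational metric: median runtime in milliseconds,
    # assuming :median_time is reported in seconds
    @auxiliary_metric "Median runtime" "ms" begin
        :median_time * 1000
    end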

Declaring methodologies

PerfTest.@perfcompare (Macro)

This macro is used to manually declare performance test conditions.

Arguments

  • An expression that must evaluate to a boolean, being true if the comparison amounts to a successful performance test. Special symbols can be used.

Special symbols:

  • :median_time : will be substituted by the median time the target took to execute in the benchmark.
  • :minimum_time : will be substituted by the minimum time the target took to execute in the benchmark.
  • :ret_value : will be substituted by the return value of the target.
  • :autoflop : will be substituted by the FLOP count of the target.
  • :printed_output : will be substituted by the standard output stream of the target.
  • :iterator : will be substituted by the current iterator value in a loop test set.

Example:

    @perfcompare :median_time < 0.05
PerfTest.@define_eff_memory_throughput (Macro)

This macro is used to define the memory bandwidth of a target in order to execute the effective memory throughput methodology.

Arguments

  • formula block : an expression that returns a single value, the measured memory bandwidth. The formula can contain any Julia expression, and some special symbols are supported in addition. The formula may be evaluated several times: it is applied to every target in every test set, or, if it is defined inside a test set, only to the targets of that test set.
  • ratio : the minimum percentage of the maximum attainable bandwidth that the target must reach to pass the test; it can be a number or a Julia expression that evaluates to a number.
  • membenchmark : which STREAM kernel benchmark to use (e.g. :MEMSTREAMCOPY for transfer operations, :MEMSTREAM_ADD for transfer plus computation).
  • custom_benchmark : when using a custom benchmark, the symbol that identifies the chosen benchmark (it must have been defined beforehand).

Special symbols:

  • :median_time : will be substituted by the median time the target took to execute in the benchmark.
  • :minimum_time : will be substituted by the minimum time the target took to execute in the benchmark.
  • :ret_value : will be substituted by the return value of the target.
  • :autoflop : will be substituted by the FLOP count of the target.
  • :printed_output : will be substituted by the standard output stream of the target.
  • :iterator : will be substituted by the current iterator value in a loop test set.

Example:

The following definition assumes that each execution of the target expression involves transacting 1000 bytes. Therefore the bandwidth is 1000 bytes divided by the execution time.

    @define_eff_memory_throughput begin
        1000 / :median_time
    end
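
A variant that also sets the pass threshold; the keyword syntax below mirrors the other macros in this reference and the 10% ratio is a placeholder assumption:

    # Hypothetical: pass if at least 10% of the attainable bandwidth is reached
    @define_eff_memory_throughput ratio=0.1 begin
        1000 / :median_time
    end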
PerfTest.@roofline (Macro)

This macro enables roofline modelling. If placed just before a target declaration (@perftest), the target will be evaluated using a roofline model.

Mandatory arguments

  • formula block : the macro has to wrap a block that holds a formula for obtaining the operational intensity of the target algorithm.

Optional arguments

  • cpu_peak : a manually supplied value for the maximum attainable FLOPS; it overrides the empirical runtime benchmark.
  • membw_peak : a manually supplied value for the maximum memory bandwidth; it overrides the empirical runtime benchmark.
  • target_opint : a desired operational intensity for the target; providing it turns operational intensity into a test metric.
  • actual_flops : another formula, defining the actual performance of the test.
  • target_ratio : the acceptable ratio between the actual performance and the performance projected by the roofline; providing it turns actual performance into a test metric.

Special symbols:

  • :median_time : will be substituted by the median time the target took to execute in the benchmark.
  • :minimum_time : will be substituted by the minimum time the target took to execute in the benchmark.
  • :ret_value : will be substituted by the return value of the target.
  • :autoflop : will be substituted by the FLOP count of the target.
  • :printed_output : will be substituted by the standard output stream of the target.
  • :iterator : will be substituted by the current iterator value in a loop test set.

Any formula block specified in this macro supports these symbols.

Example

    @roofline actual_flops=:autoflop target_ratio=0.05 begin
        mem = ((:iterator + 1) * :iterator)
        :autoflop / mem
    end

The code block defines the operational intensity, whilst the other arguments define how to measure the actual performance and compare it with the performance projected by the roofline. If the ratio of actual to projected performance falls below the target, the test fails.


Structure and configuration

PerfTest.@perftest_config (Macro)

Captures a set of configuration parameters that override the default configuration. The parameters shall be written in TOML syntax, as a subset of the complete configuration (see the config.toml generated by executing the transformation, or transform/configuration.jl, for more information). Order is irrelevant. This macro should be placed as high as possible in the test file: code above it will be transformed using the default configuration.

Recursive transformation:

This macro will set the new configuration keys for the current file and any other included files. If the included files have the macro as well, those macros will override the configuration locally for each file.

Arguments

  • A String, with the TOML declaration of configuration keys

Example

    @perftest_config """
    [roofline]
    enabled = false

    [general]
    maxsaved_results = 1
    recursive = false
    """

PerfTest.@on_perftest_exec (Macro)

The expression given to this macro will only be executed in the generated suite, and will be removed when the source code is executed as-is.
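
A hedged sketch (the printed message is a placeholder):

    @on_perftest_exec begin
        # Runs only inside the generated performance suite
        println("Preparing suite-only state")
    end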

PerfTest.@on_perftest_ignore (Macro)

The expression given to this macro will only be executed in the source code, and will be removed from the generated performance test suite.
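
Conversely, a sketch of code that survives only in the plain script run (placeholder body):

    @on_perftest_ignore begin
        # Runs only when the source file is executed as-is
        println("Plain script run")
    end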

PerfTest.@export_vars (Macro)

@export_vars vars...

Exports the specified symbols, along with the values they hold at the moment of the call, to the scope of metric definitions. Any variable used in the definition of a metric must first be exported with this macro.
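
A sketch combining @export_vars with a metric formula; the variable name and the @define_metric argument order are assumptions for illustration:

    n = 1024              # hypothetical problem size
    @export_vars n        # make `n` visible to metric formulas

    @define_metric "Elements per second" "elem/s" begin
        # `n` is usable here because it was exported above
        n / :median_time
    end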
