Profiling
OpenCL.jl applications can be profiled using opencl-kernel-profiler, which is based on Perfetto. The most convenient way to use it is through opencl_kernel_profiler_jll
by setting the OPENCL_LAYERS
environment variable before initializing OpenCL as follows:
using opencl_kernel_profiler_jll
ENV["OPENCL_LAYERS"] = opencl_kernel_profiler_jll.libopencl_kernel_profiler
By default, traces are limited to 1024 KB. To increase this limit, set CLKP_TRACE_MAX_SIZE
to a larger value, e.g. ENV["CLKP_TRACE_MAX_SIZE"] = "100 * 1024"
for 100 MB.
After the Julia session exits, traces will be written to opencl-kernel-profiler.trace
in the current directory, which can be changed using the CLKP_TRACE_DEST
environment variable. These traces can then be visualized by going to https://ui.perfetto.dev/ and opening the trace file.
opencl-kernel-profiler
works by intercepting calls to the OpenCL API and recording the execution time of kernels as well as events on the host side. It will also log the OpenCL/SPIR-V source code of the kernels alongside the traced calls to clEnqueueNDRangekernel
, which can be useful for debugging.