Runs your expression
ex while activating the CUDA profiler upon first kernel launch. This makes it easier to profile accurately, without the overhead of initial compilation, memory transfers, ...
Note that this API is used to programmatically control the profiling granularity by allowing profiling to be done only on selective pieces of code. It does not perform any profiling on itself, you need external tools for that.