API Reference
This page provides an overview of the oneAPI.jl API. For detailed documentation, see the specific API reference pages:
- Context & Device Management - Managing drivers, devices, and contexts
- Array Operations - Working with GPU arrays
- Kernel Programming - Writing and launching custom kernels
- Memory Management - Memory allocation and transfer
- Compiler & Reflection - Code generation and introspection
Core Functions
oneAPI.context! — Method
context!(ctx::ZeContext)
Set the current Level Zero context for the calling task.
The context selection is task-local, allowing different Julia tasks to use different contexts.
Arguments
ctx::ZeContext: The context to use for subsequent operations.
Examples
ctx = ZeContext(driver())
context!(ctx)
See also: context, ZeContext
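The task-local behaviour described above can be pictured with Julia's built-in `task_local_storage`. The following is only a sketch of the idea, not how oneAPI.jl is implemented: `FakeContext`, `set_context!`, `current_context`, and `DEFAULT_CONTEXT` are hypothetical names introduced for illustration.

```julia
# Hypothetical stand-in for a context handle; not an oneAPI.jl type.
struct FakeContext
    name::String
end

# Shared fallback used by tasks that never made a selection (assumption).
const DEFAULT_CONTEXT = FakeContext("default")

# Setter: store the selection in the current task's local storage.
set_context!(ctx::FakeContext) = (task_local_storage(:ctx, ctx); ctx)

# Getter: fall back to the shared default if this task never chose one.
current_context() = get(task_local_storage(), :ctx, DEFAULT_CONTEXT)

set_context!(FakeContext("main"))

# A new task starts with empty task-local storage, so it sees the default
# until it makes its own selection.
seen_before, seen_after = fetch(@async begin
    before = current_context().name
    set_context!(FakeContext("worker"))
    (before, current_context().name)
end)

# The selection made inside the other task does not leak back here.
main_name = current_context().name
```

In this sketch each task keeps its own selection while the fallback object is shared, which mirrors the "task-local selection, globally cached context" split described for context and context!.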
oneAPI.context — Method
context() -> ZeContext
Get the current Level Zero context for the calling task. If no context has been explicitly set with context!, returns a global context for the current driver.
Contexts manage the lifetime of resources like memory allocations and command queues. The context selection is task-local, but contexts themselves are cached globally per driver.
Examples
ctx = context()
println("Using context: ", ctx)
oneAPI.device! — Method
device!(dev::ZeDevice)
device!(i::Int)
Set the current Level Zero device for the calling task.
The device selection is task-local, allowing different Julia tasks to use different devices.
Arguments
dev::ZeDevice: The device to use for subsequent operations.
i::Int: Device index (1-based) from the list of available devices for the current driver.
Examples
# Select by device object
dev = devices()[2]
device!(dev)
# Select by index
device!(2) # Select second device
oneAPI.device — Method
device() -> ZeDevice
Get the current Level Zero device for the calling task. If no device has been explicitly set with device!, returns the first available device for the current driver.
The device selection is task-local, allowing different Julia tasks to use different devices.
Examples
dev = device()
println("Using device: ", dev)
See also: device!, devices, driver
oneAPI.driver! — Method
driver!(drv::ZeDriver)
Set the current Level Zero driver for the calling task. This also clears the current device selection, as devices are associated with specific drivers.
The driver selection is task-local, allowing different Julia tasks to use different drivers.
Arguments
drv::ZeDriver: The driver to use for subsequent operations.
Examples
drv = drivers()[2] # Select second available driver
driver!(drv)
See also: driver, drivers
oneAPI.driver — Method
driver() -> ZeDriver
Get the current Level Zero driver for the calling task. If no driver has been explicitly set with driver!, returns the first available driver.
The driver selection is task-local, allowing different Julia tasks to use different drivers.
Examples
drv = driver()
println("Using driver: ", drv)
See also: driver!, drivers
oneAPI.global_queue — Method
global_queue(ctx::ZeContext, dev::ZeDevice) -> ZeCommandQueue
Get the global command queue for the given context and device. This queue is used as the default queue for executing operations, guaranteeing expected semantics when using a device on a Julia task.
The queue is created with in-order execution flags, meaning commands are executed in the order they are submitted. Queues are cached per task and (context, device) pair.
Arguments
ctx::ZeContext: The context for the command queue.
dev::ZeDevice: The device for the command queue.
Returns
ZeCommandQueue: A cached command queue with in-order execution.
Examples
ctx = context()
dev = device()
queue = global_queue(ctx, dev)
See also: context, device, synchronize
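The caching rule stated above (one queue per task and per (context, device) pair) can be sketched in plain Julia as a dictionary kept in task-local storage. This is an illustration under assumptions, not the package's actual code: `FakeQueue` and `cached_queue` are hypothetical, and symbols stand in for real context and device handles.

```julia
# Hypothetical stand-in for a command queue; not an oneAPI.jl type.
# It is mutable so that `===` compares object identity.
mutable struct FakeQueue
    ctx::Symbol
    dev::Symbol
end

# Look up (or lazily create) the queue for a (context, device) pair in the
# current task's local storage.
function cached_queue(ctx::Symbol, dev::Symbol)
    cache = get!(task_local_storage(), :queues) do
        Dict{Tuple{Symbol,Symbol},FakeQueue}()
    end
    return get!(() -> FakeQueue(ctx, dev), cache, (ctx, dev))
end

q1 = cached_queue(:ctx1, :gpu0)
q2 = cached_queue(:ctx1, :gpu0)                # same task, same pair: cache hit
q3 = cached_queue(:ctx1, :gpu1)                # different device: new queue
q4 = fetch(@async cached_queue(:ctx1, :gpu0))  # different task: new queue
```

Under this model, repeated calls from one task return the same queue object, while another task asking for the same pair gets its own.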
oneAPI.oneL0.devices — Method
devices() -> Vector{ZeDevice}
devices(drv::ZeDriver) -> Vector{ZeDevice}
Return a list of available Level Zero devices. Without arguments, returns devices for the current driver. With a driver argument, returns devices for that specific driver.
Examples
# Get devices for current driver
devs = devices()
println("Found ", length(devs), " devices")
# Get devices for specific driver
drv = drivers()[1]
devs = devices(drv)
See also: device, device!, drivers
oneAPI.@sync — Macro
@sync ex
Run expression ex and synchronize the GPU afterwards.
See also: synchronize.
Compiler Functions
oneAPI.kernel_convert — Method
kernel_convert(x)
This function is called for every argument passed to a kernel, allowing it to be converted to a GPU-friendly format. By default, the function does nothing and returns the input object x as-is.
Do not add methods to this function; instead, extend the underlying Adapt.jl package and register methods for the oneAPI.KernelAdaptor type.
oneAPI.@oneapi — Macro
@oneapi [kwargs...] kernel(args...)
High-level interface for launching Julia kernels on Intel GPUs using oneAPI.
This macro compiles a Julia function to SPIR-V, prepares the arguments, and optionally launches the kernel on the GPU.
Keyword Arguments
Macro Keywords (compile-time)
launch::Bool=true: Whether to launch the kernel immediately. If false, returns the compiled kernel object without executing it.
Compiler Keywords
kernel::Bool=false: Whether to compile as a kernel (true) or device function (false).
name::Union{String,Nothing}=nothing: Explicit name for the kernel.
always_inline::Bool=false: Whether to always inline device functions.
Launch Keywords (runtime)
groups: Number of workgroups (required). Can be an integer or tuple.
items: Number of work-items per workgroup (required). Can be an integer or tuple.
queue::ZeCommandQueue=global_queue(...): Command queue to submit to.
Examples
# Simple vector addition kernel
function vadd(a, b, c)
    i = get_global_id()
    @inbounds c[i] = a[i] + b[i]
    return
end
a = oneArray(rand(Float32, 1024))
b = oneArray(rand(Float32, 1024))
c = similar(a)
# Launch with 4 workgroups of 256 items each
@oneapi groups=4 items=256 vadd(a, b, c)
# Compile without launching
kernel = @oneapi launch=false vadd(a, b, c)
kernel(a, b, c; groups=4, items=256) # Launch later
See also: zefunction, kernel_convert
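The example above uses 4 groups of 256 items because 1024 divides evenly. For other lengths, one common approach is to round the group count up with ceiling division. The helper below is a hypothetical sketch in plain Julia (`launch_config` is not part of oneAPI.jl) and runs without a GPU.

```julia
# Hypothetical helper, not part of oneAPI.jl: choose a launch configuration
# that covers n elements with a fixed work-group size.
function launch_config(n::Integer; items::Integer=256)
    groups = cld(n, items)   # ceiling division: round up to whole groups
    return (groups=groups, items=items)
end

cfg_exact = launch_config(1024)    # 1024 is a multiple of 256
cfg_ragged = launch_config(1000)   # the last group is only partly used

# Number of surplus work-items that a bounds check inside the kernel
# would have to skip when n is not a multiple of the group size.
surplus = cfg_ragged.groups * cfg_ragged.items - 1000
```

The result could then be passed along the lines of `@oneapi groups=cfg.groups items=cfg.items vadd(a, b, c)`. When the length is not a multiple of `items`, a kernel like `vadd` would also need a guard on `i` before indexing, since `@inbounds` disables the bounds check.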
oneAPI.return_type — Method
return_type(f, tt) -> r::Type
Return a type r such that f(args...)::r where args::tt.
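To illustrate the contract of such a function (a type r that bounds the result of f for arguments of type tt), here is a host-side analogue using Julia's `Base.return_types` reflection function. This is only an analogy: it runs CPU inference, and what oneAPI.return_type infers for the GPU target may differ. `halve` and `inferred` are hypothetical names.

```julia
# Hypothetical example function.
halve(x) = x / 2

# Host-side inference for a concrete argument tuple type. `only` asserts
# that exactly one result was returned for this signature.
inferred = only(Base.return_types(halve, Tuple{Int}))

# The contract: the actual result is an instance of the inferred type.
result = halve(3)
```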
oneL0 (Level Zero)
Low-level bindings to the Level Zero API. See the Level Zero page for details.
oneAPI.oneL0.DeviceBuffer — Type
DeviceBuffer
A buffer of device memory, owned by a specific device. Generally, may only be accessed by the device that owns it.
oneAPI.oneL0.HostBuffer — Type
HostBuffer
A buffer of memory on the host. May be accessed by the host, and all devices within the host driver. Frequently used as staging areas to transfer data to or from devices.
Note that these buffers need to be made resident to the device, e.g., by using the ZE_KERNEL_FLAG_FORCE_RESIDENCY module flag, the ZE_KERNEL_SET_ATTR_INDIRECT_HOST_ACCESS kernel attribute, or by calling zeDeviceMakeMemoryResident.
oneAPI.oneL0.OutOfGPUMemoryError — Type
OutOfGPUMemoryError(sz::Integer=0, dev::ZeDevice)
An operation allocated too much GPU memory.
oneAPI.oneL0.PtrOrZePtr — Type
PtrOrZePtr{T}
A special pointer type, ABI-compatible with both Ptr and ZePtr, for use in ccall expressions to convert values to either a device or a host type (in that order). This is required for APIs which accept pointers that either point to host or device memory.
oneAPI.oneL0.SharedBuffer — Type
SharedBuffer
A managed buffer that is shared between the host and one or more devices.
oneAPI.oneL0.ZeCommandList — Method
ZeCommandList(dev::ZeDevice, ...) do list
    append_...!(list)
end
Create a command list for device dev, passing in a do block that appends operations. The list is then closed and can be used immediately, e.g. for execution.
oneAPI.oneL0.ZeDim3 — Type
ZeDim3(x)
ZeDim3((x,))
ZeDim3((x, y))
ZeDim3((x, y, z))
A type used to specify dimensions, consisting of 3 integers for the x, y and z dimensions respectively. Unspecified dimensions default to 1.
It is often accepted as an argument through the ZeDim type alias, which lets callers pass dimensions as a plain integer or a tuple without constructing an explicit ZeDim3 object.
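The "unspecified dimensions default to 1" rule can be sketched with a small stand-in type. `Dim3Sketch` is hypothetical; its field names and constructors are assumptions made for illustration and are not taken from the oneL0 sources.

```julia
# Hypothetical stand-in for ZeDim3; field names and constructors are assumptions.
struct Dim3Sketch
    x::Int
    y::Int
    z::Int
end

# Accept 1 to 3 integers in a tuple and pad missing trailing dimensions with 1.
function Dim3Sketch(dims::NTuple{N,Integer}) where {N}
    1 <= N <= 3 || throw(ArgumentError("expected 1 to 3 dimensions, got $N"))
    padded = ntuple(i -> i <= N ? Int(dims[i]) : 1, 3)
    return Dim3Sketch(padded...)
end

# A plain integer is treated as a one-dimensional size.
Dim3Sketch(x::Integer) = Dim3Sketch((x,))

d1 = Dim3Sketch(256)          # y and z padded with 1
d2 = Dim3Sketch((16, 16))     # z padded with 1
d3 = Dim3Sketch((8, 8, 4))    # fully specified
```

Accepting either an integer or a tuple at the call site is what allows launch keywords such as groups and items to be written as `groups=4` or `groups=(2, 2)`.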
oneAPI.oneL0.ZePtr — Type
ZePtr{T}
A memory address that refers to data of type T that is accessible from a device. A ZePtr is ABI compatible with regular Ptr objects, e.g. it can be used to ccall a function that expects a Ptr to device memory, but it prevents erroneous conversions between the two.
oneAPI.oneL0.ZE_MAKE_VERSION — Method
ze_make_version(major::Integer, minor::Integer) -> UInt32
32-bit unsigned integer version number from major and minor components. This should be the Julia equivalent of the C macro: #define ZE_MAKE_VERSION( _major, _minor ) (( _major << 16 )|( _minor & 0x0000ffff))
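The bit layout implied by the C macro can be checked with a few lines of plain Julia. The function names below are hypothetical and chosen so they do not clash with the package; this is a sketch of the arithmetic, not the package's own definition.

```julia
# Hypothetical name; mirrors the C macro ZE_MAKE_VERSION quoted above:
# major goes in the upper 16 bits, minor in the lower 16 bits.
make_version_sketch(major::Integer, minor::Integer) =
    (UInt32(major) << 16) | (UInt32(minor) & 0x0000ffff)

# Inverse helpers, also hypothetical, to show how the two halves are packed.
version_major(v::UInt32) = Int(v >> 16)
version_minor(v::UInt32) = Int(v & 0x0000ffff)

v = make_version_sketch(1, 2)
```

Here `v` has 1 in its upper half and 2 in its lower half, and the two helpers recover the components.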
oneAPI.oneL0.execute! — Function
execute!(queue::ZeCommandQueue, ...) do list
    append_...!(list)
end
Create a command list for the device that owns queue, passing in a do block that appends operations. The list is then closed and executed on the queue.
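The do-block pattern used here (build a list, let the caller append to it, close it, then submit it) can be sketched in plain Julia. Every name below (`ListSketch`, `QueueSketch`, `append_op!`, `execute_sketch!`) is a hypothetical stand-in; none is an oneAPI.jl type or function.

```julia
# Hypothetical stand-ins; none of these are oneAPI.jl types or functions.
mutable struct ListSketch
    ops::Vector{Symbol}
    closed::Bool
end

struct QueueSketch
    executed::Vector{Vector{Symbol}}
end

append_op!(list::ListSketch, op::Symbol) = (push!(list.ops, op); list)

# The function argument comes first so callers can use `do` syntax.
function execute_sketch!(f::Function, queue::QueueSketch)
    list = ListSketch(Symbol[], false)
    f(list)               # the do block appends operations
    list.closed = true    # close the list before submitting it
    push!(queue.executed, list.ops)
    return queue
end

queue = QueueSketch(Vector{Symbol}[])
execute_sketch!(queue) do list
    append_op!(list, :copy)
    append_op!(list, :barrier)
end
```

Placing the function first in the argument list is what makes the `do list ... end` form possible, and it keeps list creation, closing, and submission in one place.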
oneMKL
Intel oneAPI Math Kernel Library bindings. See the oneMKL page for details.