CUDAnative provides a primitive, lightweight array type to manage GPU data organized in a plain, dense fashion. It is the device-side counterpart to CUDAdrv's CuArray, and implements (part of) the array interface as well as other functionality for use on the GPU:

CuDeviceArray(dims, ptr)
CuDeviceArray{T}(dims, ptr)
CuDeviceArray{T,A}(dims, ptr)
CuDeviceArray{T,A,N}(dims, ptr)

Construct an N-dimensional dense CUDA device array with element type T wrapping the pointer ptr. When they are not given explicitly, N is determined from the length of dims and T from the type of ptr. dims may be a single scalar, or a tuple of integers giving the length of each dimension. If the rank N is supplied explicitly, as in Array{T,N}(dims), it must match the length of dims. The same applies to the element type T, which must match the element type of the pointer ptr.
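The constructor rules above can be illustrated with a small CPU-only sketch. ToyPtrArray is a hypothetical stand-in written for this example, not the CUDAnative implementation: it wraps an ordinary host Ptr{T}, omits the address-space parameter A, and only mirrors how N and T are inferred from dims and ptr and how explicit parameters are checked against them.

```julia
# Hypothetical stand-in for illustration only; NOT CUDAnative's CuDeviceArray.
struct ToyPtrArray{T,N} <: AbstractArray{T,N}
    dims::NTuple{N,Int}
    ptr::Ptr{T}
    # Accept only a dims tuple of length N and a pointer with element type T.
    function ToyPtrArray{T,N}(dims::NTuple{N,Integer}, ptr::Ptr{T}) where {T,N}
        new{T,N}(dims, ptr)
    end
end

# N inferred from length(dims), T inferred from the pointer type.
ToyPtrArray(dims::NTuple{N,Integer}, ptr::Ptr{T}) where {T,N} = ToyPtrArray{T,N}(dims, ptr)
# Explicit T, inferred N.
ToyPtrArray{T}(dims::NTuple{N,Integer}, ptr::Ptr{T}) where {T,N} = ToyPtrArray{T,N}(dims, ptr)
# A single scalar length yields a one-dimensional array.
ToyPtrArray(len::Integer, ptr::Ptr{T}) where {T} = ToyPtrArray{T,1}((len,), ptr)

# Minimal array interface: dense, column-major, linearly indexed.
Base.size(a::ToyPtrArray) = a.dims
Base.IndexStyle(::Type{<:ToyPtrArray}) = IndexLinear()
Base.getindex(a::ToyPtrArray, i::Int) = unsafe_load(a.ptr, i)

buf = Float32[1, 2, 3, 4, 5, 6]
GC.@preserve buf begin
    m = ToyPtrArray((2, 3), pointer(buf))   # ToyPtrArray{Float32,2}
    v = ToyPtrArray(6, pointer(buf))        # ToyPtrArray{Float32,1}
    @assert size(m) == (2, 3)
    @assert m[2, 3] == buf[6]               # linear index 2 + (3 - 1) * 2 = 6
    @assert size(v) == (6,)
end
```

In this sketch, a mismatched explicit parameter such as ToyPtrArray{Float32,3}((2, 3), pointer(buf)) fails with a MethodError because no constructor method accepts it. How CuDeviceArray itself reports such a mismatch is not specified here.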

CUDAnative.ldg - Function.
ldg(A, i)

Index the array A with the linear index i, loading the value through the read-only texture cache for improved cache behavior. You must make sure that the array A, or any aliased instance of it, is not written to for the duration of the current kernel.

This function can only be used on devices with compute capability 3.5 or higher.

See also: Base.getindex
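As a minimal sketch of how ldg might be used, the kernel below copies one array into another and reads the source through ldg. The kernel name copy_ldg, the array sizes, and the launch configuration are assumptions made for this example. The host-side calls (CuArray upload, the @cuda keyword syntax, Array download) follow one CUDAdrv/CUDAnative convention and may need adjusting for the installed version. Running it requires a CUDA device of compute capability 3.5 or higher.

```julia
using CUDAdrv, CUDAnative

# Hypothetical example kernel: copy `src` into `dst`.
# `src` is only read during the kernel, so loading it via ldg is permitted.
function copy_ldg(dst, src)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(src)
        # Linear-index load through the read-only cache.
        dst[i] = CUDAnative.ldg(src, i)
    end
    return nothing
end

n = 1024
src = CuArray(rand(Float32, n))    # upload input data (assumed constructor)
dst = CuArray(zeros(Float32, n))   # separate output buffer, not aliased with src

# Assumed launch syntax; older releases use a different @cuda form.
@cuda threads=256 blocks=cld(n, 256) copy_ldg(dst, src)

# If the launch succeeded, both arrays should hold the same values on the host.
println(Array(dst) == Array(src))
```

Note that dst and src are distinct buffers: writing through dst while reading an aliased src via ldg would break the read-only requirement stated above.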