Find the indices where some elements x should be inserted into a sorted sequence v to maintain the sorted order. Effectively applying the Julia.Base functions in parallel on a GPU using foreachindex.

  • searchsortedfirst! (in-place), searchsortedfirst (allocating): index of first element in v >= x[j].
  • searchsortedlast!, searchsortedlast: index of last element in v <= x[j].
  • Other names: thrust::upper_bound, std::lower_bound.

Example:

import AcceleratedKernels as AK
using Metal

# Sorted array
v = MtlArray(rand(Float32, 100_000))
AK.merge_sort!(v)

# Elements `x` to place within `v` at indices `ix`
x = MtlArray(rand(Float32, 10_000))
ix = MtlArray{Int}(undef, 10_000)

AK.searchsortedfirst!(ix, v, x)
AcceleratedKernels.searchsortedfirst!Function
searchsortedfirst!(
    ix::AbstractVector,
    v::AbstractVector,
    x::AbstractVector,
    backend::Backend=get_backend(x);

    by=identity, lt=isless, rev::Bool=false,

    # CPU settings
    scheduler=:threads,
    max_tasks::Int=Threads.nthreads(),
    min_elems::Int=1000,

    # GPU settings
    block_size::Int=256,
)

Equivalent to applying searchsortedfirst! element-wise to each element of x. The CPU and GPU settings are the same as for foreachindex.

source
AcceleratedKernels.searchsortedfirstFunction
searchsortedfirst(
    v::AbstractVector,
    x::AbstractVector,
    backend::Backend=get_backend(x);

    by=identity, lt=isless, rev::Bool=false,

    # CPU settings
    scheduler=:threads,
    max_tasks::Int=Threads.nthreads(),
    min_elems::Int=1000,

    # GPU settings
    block_size::Int=256,
)

Equivalent to applying searchsortedfirst element-wise to each element of x. The CPU and GPU settings are the same as for foreachindex.

source
AcceleratedKernels.searchsortedlast!Function
searchsortedlast!(
    ix::AbstractVector,
    v::AbstractVector,
    x::AbstractVector,
    backend::Backend=get_backend(x);

    by=identity, lt=isless, rev::Bool=false,

    # CPU settings
    scheduler=:threads,
    max_tasks::Int=Threads.nthreads(),
    min_elems::Int=1000,

    # GPU settings
    block_size::Int=256,
)

Equivalent to applying searchsortedlast! element-wise to each element of x. The CPU and GPU settings are the same as for foreachindex.

source
AcceleratedKernels.searchsortedlastFunction
searchsortedlast(
    v::AbstractVector,
    x::AbstractVector,
    backend::Backend=get_backend(x);

    by=identity, lt=isless, rev::Bool=false,

    # CPU settings
    scheduler=:threads,
    max_tasks::Int=Threads.nthreads(),
    min_elems::Int=1000,

    # GPU settings
    block_size::Int=256,
)

Equivalent to applying searchsortedlast element-wise to each element of x. The CPU and GPU settings are the same as for foreachindex.

source