CUDA.jl 5.8: CuSparseVector broadcasting, CUDA 12.9, and more


Tim Besard

CUDA.jl v5.8 brings several enhancements, most notably the introduction of broadcasting support for CuSparseVector. The release also adds support for CUDA 12.9 and updates key CUDA libraries such as cuTENSOR, cuQuantum, and cuDNN.
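
After upgrading, you may want to verify which CUDA toolkit your installation is using. A minimal sketch (the version shown is illustrative and will reflect your local setup):

julia> using CUDA

julia> CUDA.runtime_version()
v"12.9.0"

julia> # optionally pin a specific toolkit; takes effect after restarting Julia
       CUDA.set_runtime_version!(v"12.9")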

Broadcasting for CuSparseVector

A significant enhancement in CUDA.jl v5.8 is support for broadcasting with CuSparseVector. Thanks to @kshyatt, sparse GPU vectors can now be used in broadcast expressions, just as was already possible with sparse matrices:

julia> using CUDA, CUDA.CUSPARSE, SparseArrays

julia> x = cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
  [2]  =  0.459139
  [3]  =  0.964073
  [8]  =  0.904363
  [9]  =  0.721723

julia> # a zero-preserving elementwise operation
       x .* 2
10-element CuSparseVector{Float32, Int32} with 4 stored entries:
  [2]  =  0.918278
  [3]  =  1.928146
  [8]  =  1.808726
  [9]  =  1.443446

julia> # a non-zero-preserving elementwise operation
       x .+ 1
10-element CuArray{Float32, 1, CUDA.DeviceMemory}:
 1.0
 1.4591388
 1.9640732
 1.0
 1.0
 1.0
 1.0
 1.9043632
 1.7217231
 1.0

julia> # combining multiple sparse inputs
       x .+ cu(sprand(Float32, 10, 0.3))
10-element CuSparseVector{Float32, Int32} with 6 stored entries:
  [1]  =  0.906
  [2]  =  0.583197
  [3]  =  0.964073
  [4]  =  0.259103
  [8]  =  0.904363
  [9]  =  0.935917
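
As these examples suggest, the output type depends on whether the broadcasted function preserves zeros: when f(0) == 0, the stored-entry structure can be kept, otherwise the result is materialized densely. A sketch following the same pattern (the types shown here are what that rule implies, continuing with the x defined above):

julia> # sin(0) == 0, so the result can stay sparse
       sin.(x) isa CuSparseVector
true

julia> # exp(0) == 1, so every element becomes nonzero and the result is dense
       exp.(x) isa CuArray
true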

Minor Changes

CUDA.jl 5.8 also includes several other useful updates, most notably support for CUDA 12.9 and updated versions of the cuTENSOR, cuQuantum, and cuDNN libraries.

As always, we encourage users to update to the latest version to benefit from these improvements and bug fixes. Check out the changelog for a full list of changes.
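
Updating is the usual package-manager operation, for example:

julia> using Pkg

julia> Pkg.update("CUDA")

or simply `up CUDA` from the Pkg REPL.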