# CUDA.jl 2.1

Tim Besard

CUDA.jl v2.1 is a bug-fix release, with one new feature: support for cubic texture interpolations. The release also partly reverts a change from v2.0: `reshape`

, `reinterpret`

and contiguous `view`

s now return a `CuArray`

again.

## Generalized texture interpolations

CUDA's texture hardware only supports nearest-neighbour and linear interpolation, for other modes one is required to perform the interpolation by hand. In CUDA.jl v2.1 we are generalizing the texture interpolation API so that it is possible to use both hardware-backed and software-implemented interpolation modes in exactly the same way:

```
# N is the dimensionality (1, 2 or 3)
# T is the element type (needs to be supported by the texture hardware)
# source array
src = rand(T, fill(10, N)...)
# indices we want to interpolate
idx = [tuple(rand(1:0.1:10, N)...) for _ in 1:10]
# upload to the GPU
gpu_src = CuArray(src)
gpu_idx = CuArray(idx)
# create a texture array for optimized fetching
# this is required for N=1, optional for N=2 and N=3
gpu_src = CuTextureArray(gpu_src)
# interpolate using a texture
gpu_dst = CuArray{T}(undef, size(gpu_idx))
gpu_tex = CuTexture(gpu_src; interpolation=CUDA.NearestNeighbour())
broadcast!(gpu_dst, gpu_idx, Ref(gpu_tex)) do idx, tex
tex[idx...]
end
# back to the CPU
dst = Array(gpu_dst)
```

Here, we can change the `interpolation`

argument to `CuTexture`

to either `NearestNeighbour`

or `LinearInterpolation`

, both supported by the hardware, or `CubicInterpolation`

which is implemented in software (building on the hardware-supported linear interpolation).

## Partial revert of array wrapper changes

In CUDA.jl v2.0, we changed the behavior of several important array operations to reuse available wrappers in Base: `reshape`

started returning a `ReshapedArray`

, `view`

now returned a `SubArray`

, and `reinterpret`

was reworked to use `ReinterpretArray`

. These changes were made to ensure maximal compatibility with Base's array type, and to simplify the implementation in CUDA.jl and GPUArrays.jl.

However, this change turned out to regress the time to precompile and load CUDA.jl. Consequently, the change has been reverted, and these wrappers are now implemented as part of the `CuArray`

type again. Note however that we intend to revisit this change in the future. It is therefore recommended to use the `DenseCuArray`

type alias for methods that need a `CuArray`

backed by contiguous GPU memory. For strided `CuArray`

s, i.e. non-contiguous views, you should use the `StridedCuArray`

alias.