Learn

Currently, the Julia CUDA stack is the most mature, the easiest to install, and the most full-featured of the GPU back-ends. The CUDA.jl documentation is the central place for information on all relevant packages. Start with the instructions on how to install the stack, then follow the introductory tutorial. There is also a series of notebooks on more advanced uses of CUDA.jl, covering application and kernel optimization as well as advanced memory management and concurrent programming concepts (which apply to other back-ends as well).

If you prefer video material, there are plenty of talks and workshops on GPU programming in Julia on YouTube. For example:

GPU programming in Julia

3-hour workshop covering various aspects of the toolchain:


Concurrent GPU computing in CUDA.jl 3.0

Introduction to concurrent GPU computing: