Currently, the Julia CUDA stack is the most mature, the easiest to install, and the most full-featured. The CUDA.jl repository is the central place for the documentation of all relevant packages. Start with the instructions on how to install the stack, then follow up with the introductory tutorial.
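As a rough idea of what installation and a first check might look like, here is a minimal sketch. It assumes a machine with a supported NVIDIA GPU and driver plus network access for the package manager; the array size and values are arbitrary illustrations, and the official installation instructions remain the authoritative reference.

```julia
using Pkg
Pkg.add("CUDA")          # install CUDA.jl from the General registry

using CUDA
CUDA.versioninfo()       # print details about the toolkit, driver, and visible devices

# Only touch the GPU if the stack reports that it is usable on this machine.
if CUDA.functional()
    a = CUDA.fill(1.0f0, 1024)       # allocate a Float32 array on the GPU
    b = a .+ 2f0                     # broadcasting runs as a GPU kernel
    @assert all(Array(b) .== 3f0)    # copy back to the CPU and verify
else
    @warn "CUDA.jl is installed, but no functional GPU setup was detected"
end
```

The `CUDA.functional()` guard keeps the script from erroring on machines without a working GPU, which can be handy when the same code also runs on CPU-only systems.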
If you prefer videos, the presentations below highlight different aspects of the toolchain.
Effective CUDA GPU computing in Julia
- Design and benefits of the Julia GPU stack
- Composability with existing (non-GPU) software
- Performance killers and tools for optimization
How Julia is compiled to CUDA GPUs
- Design and implementation of the Julia language
- Retargeting the language to GPUs
- Use of LLVM with LLVM.jl
- Benefits of a high-level language for GPU programming