Automata and Compilers 11 - Intermediate Representations and Optimization
Why compilers use intermediate representations and how optimization makes programs faster