GPU Systems 08 - Profiling and the Roofline View
A practical way to use profiling and roofline thinking to understand kernel bottlenecks
Closing the GPU Systems series by connecting profiling, Triton experimentation, and FlashAttention-style thinking