Jae's Tech Blog

Posts tagged "cuda"

January 28, 2026

GPU Systems 00 - What You Should Know Before Starting This Series

The background knowledge that makes the GPU Systems series much easier to study properly

Lectures
January 30, 2026

GPU Systems 01 - Roadmap to GPU Kernel Engineering

A practical study order from GPU architecture to CUDA, Triton, and kernel optimization

Lectures
February 1, 2026

GPU Systems 02 - The Thread, Warp, and Block Execution Model

What threads, warps, blocks, and grids mean in actual GPU execution

Lectures
February 3, 2026

GPU Systems 03 - Memory Hierarchy and Bandwidth

How to think about the GPU memory hierarchy and bandwidth bottlenecks

Lectures
February 5, 2026

GPU Systems 04 - Writing CUDA Kernels and Choosing Launch Configuration

How to think about indexing and launch configuration when writing CUDA kernels

Lectures
February 7, 2026

GPU Systems 05 - Coalescing, Shared Memory, and Reduction Patterns

The optimization patterns that keep showing up in CUDA kernels

Lectures
February 11, 2026

GPU Systems 07 - Occupancy and Latency Hiding

Understanding occupancy as a latency-hiding concept instead of just a percentage

Lectures
February 15, 2026

GPU Systems 09 - Why Naive Matrix Multiplication Is Slow

Using naive matrix multiplication to see memory reuse and traffic problems clearly

Lectures
February 17, 2026

GPU Systems 10 - Tiled Matrix Multiplication and Shared Memory

Why tiled matrix multiplication and shared memory create such a big performance difference

Lectures
โ† Previous
1 2 3
Next โ†’

© 2025 Jae · Notes on systems, software, and building things carefully.
