Jae's Tech Blog

Posts tagged "gpu"

January 28, 2026

GPU Systems 00 - What You Should Know Before Starting This Series

The background knowledge that makes the GPU Systems series much easier to study properly

Lectures

January 30, 2026

GPU Systems 01 - Roadmap to GPU Kernel Engineering

A practical study order from GPU architecture to CUDA, Triton, and kernel optimization

Lectures

February 1, 2026

GPU Systems 02 - The Thread, Warp, and Block Execution Model

What threads, warps, blocks, and grids mean in actual GPU execution

Lectures

February 3, 2026

GPU Systems 03 - Memory Hierarchy and Bandwidth

How to think about the GPU memory hierarchy and bandwidth bottlenecks

Lectures

February 5, 2026

GPU Systems 04 - Writing CUDA Kernels and Choosing Launch Configuration

How to think about indexing and launch configuration when writing CUDA kernels

Lectures

February 9, 2026

GPU Systems 06 - Triton and the Practical Shape of Kernel Optimization

How Triton fits into real kernel optimization work, especially for LLM-style workloads

Lectures

February 11, 2026

GPU Systems 07 - Occupancy and Latency Hiding

Understanding occupancy as a latency-hiding concept instead of just a percentage

Lectures

February 13, 2026

GPU Systems 08 - Profiling and the Roofline View

A practical way to use profiling and roofline thinking to understand kernel bottlenecks

Lectures

February 15, 2026

GPU Systems 09 - Why Naive Matrix Multiplication Is Slow

Using naive matrix multiplication to see memory reuse and traffic problems clearly

Lectures

© 2025 Jae · Notes on systems, software, and building things carefully.
