Jae's Tech Blog

January 15, 2026

Distributed LLM Training 04 - What PyTorch DDP Actually Does Internally

DDP is not just a wrapper around your model; it is a runtime that coordinates autograd hooks, gradient buckets, and synchronization timing

Lectures

February 25, 2026

DDP and FSDP are not external magic; they depend directly on autograd timing and tensor-state management inside the runtime

Lectures