Distributed LLM Training 04 - What PyTorch DDP Actually Does Internally
DDP is not just a wrapper around your model; it is a runtime that coordinates autograd hooks, gradient buckets, and synchronization timing
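A minimal sketch of the gradient-bucketing idea, in plain Python rather than torch internals: DDP groups parameters into fixed-capacity buckets, roughly in reverse registration order (since gradients tend to become ready in reverse order during backward), so one all-reduce fires per bucket instead of per tensor. The cap mirrors DDP's default `bucket_cap_mb=25`; the helper name is illustrative, not a real PyTorch API.

```python
# Illustrative sketch of DDP-style gradient bucketing (not PyTorch's code).
BUCKET_CAP_BYTES = 25 * 1024 * 1024  # mirrors DDP's default bucket_cap_mb=25

def build_buckets(param_sizes_bytes, cap=BUCKET_CAP_BYTES):
    """Greedily pack parameter indices into buckets, in reverse order,
    because backward produces gradients roughly last-parameter-first."""
    buckets, current, current_size = [], [], 0
    for idx, size in reversed(list(enumerate(param_sizes_bytes))):
        if current and current_size + size > cap:
            buckets.append(current)          # bucket full: start a new one
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        buckets.append(current)
    return buckets

# Example: three 12 MB parameters and one 4 MB parameter
sizes = [12 * 2**20, 12 * 2**20, 12 * 2**20, 4 * 2**20]
print(build_buckets(sizes))  # → [[3, 2], [1, 0]]
```

Each inner list is one all-reduce launch; the reverse ordering is what lets communication of early buckets overlap with the rest of backward.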
To reason well about performance, custom operators, and distributed runtime behavior, you have to understand PyTorch as a runtime, not just a Python library
Autograd is not just automatic differentiation; it is a graph-construction and backward-execution runtime
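The "graph-construction plus backward-execution" claim can be made concrete with a toy reverse-mode autograd in pure Python (this is not PyTorch's implementation, just the shape of the mechanism): forward ops record nodes with local gradients, and `backward()` walks the recorded graph in reverse, much as autograd replays `grad_fn`s.

```python
# Toy reverse-mode autograd: the graph is built during forward,
# then executed in reverse by backward(). Illustrative only.
class Var:
    def __init__(self, value):
        self.value, self.parents, self.grad = value, (), 0.0

    def __mul__(self, other):
        out = Var(self.value * other.value)
        # Each parent entry pairs a node with its local gradient.
        out.parents = ((self, other.value), (other, self.value))
        return out

    def __add__(self, other):
        out = Var(self.value + other.value)
        out.parents = ((self, 1.0), (other, 1.0))
        return out

    def backward(self, upstream=1.0):
        self.grad += upstream                 # accumulate, like .grad
        for parent, local_grad in self.parents:
            parent.backward(upstream * local_grad)

x, y = Var(2.0), Var(3.0)
z = x * y + x          # graph is recorded here, during the forward pass
z.backward()
print(x.grad, y.grad)  # → 4.0 2.0
```

Note that nothing differentiates anything symbolically: forward execution leaves behind a graph, and backward is just a second program that runs over it.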
Custom autograd functions are a practical place to define forward-backward contracts before dropping to lower-level extensions
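The forward-backward contract looks like the following, here as a pure-Python mimic of the `torch.autograd.Function` shape (`forward(ctx, ...)`, `ctx.save_for_backward`, `backward(ctx, grad_output)`); the `Ctx` class is a stand-in for illustration, not torch's context object.

```python
import math

# Pure-Python mimic of the torch.autograd.Function contract. Illustrative.
class Ctx:
    def __init__(self):
        self.saved = ()
    def save_for_backward(self, *values):
        self.saved = values

class Exp:
    """Forward computes e**x; backward uses the saved result: d/dx e**x = e**x."""
    @staticmethod
    def forward(ctx, x):
        out = math.exp(x)
        ctx.save_for_backward(out)   # saving the output here is cheaper
        return out                   # than recomputing exp in backward

    @staticmethod
    def backward(ctx, grad_output):
        (out,) = ctx.saved
        return grad_output * out

ctx = Ctx()
y = Exp.forward(ctx, 1.0)
gx = Exp.backward(ctx, 1.0)
print(y == gx)  # → True, since the derivative of e**x is e**x
```

The contract is the point: forward decides what state backward will need, and backward must produce gradients consistent with forward, with no other channel between them.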
Backward design is really a question of what to save, what to recompute, and how to preserve correct semantics
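The save-versus-recompute trade-off can be shown on a one-operation example (a hypothetical illustration, using y = sin(x)): one backward stores cos(x) at forward time (more memory, less backward compute), the other stores only x and recomputes cos(x) during backward, which is the essence of activation checkpointing. Correctness requires both to yield identical gradients.

```python
import math

# Two backward strategies for y = sin(x). Illustrative sketch.
def forward_save(x):
    return math.sin(x), {"cos_x": math.cos(x)}   # save the activation

def backward_save(saved, grad_out):
    return grad_out * saved["cos_x"]

def forward_recompute(x):
    return math.sin(x), {"x": x}                 # save the input only

def backward_recompute(saved, grad_out):
    return grad_out * math.cos(saved["x"])       # recompute in backward

x = 0.5
_, s1 = forward_save(x)
_, s2 = forward_recompute(x)
print(backward_save(s1, 1.0) == backward_recompute(s2, 1.0))  # → True
```

The semantics are fixed; only the memory/compute split between forward and backward changes, which is exactly the knob that checkpointing and recomputation strategies turn.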
DDP and FSDP are not external magic; they depend directly on autograd timing and tensor-state management inside the runtime
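The dependence on autograd timing can be sketched as follows (all names illustrative, not real PyTorch APIs): a per-parameter hook fires as each gradient becomes ready during backward, and a communication call, mocked here, launches the moment a whole bucket is ready, so communication overlaps with the remainder of backward rather than waiting for it to finish.

```python
# Hedged sketch of the DDP hook-and-bucket pattern. Illustrative only.
class BucketReducer:
    def __init__(self, buckets):
        self.pending = [set(b) for b in buckets]  # params still awaited per bucket
        self.launched = []                        # order of mock all-reduce launches

    def on_grad_ready(self, param_idx):
        """The 'autograd hook': called once per parameter during backward."""
        for i, waiting in enumerate(self.pending):
            if param_idx in waiting:
                waiting.discard(param_idx)
                if not waiting:                   # bucket complete:
                    self.launched.append(i)       # launch mock async all-reduce
                return

reducer = BucketReducer(buckets=[[3, 2], [1, 0]])
for idx in [3, 2, 1, 0]:          # gradients become ready in reverse order
    reducer.on_grad_ready(idx)
print(reducer.launched)  # → [0, 1]
```

This is why hook timing matters: if backward reorders or skips a parameter (unused parameters, control flow), the bucket never fills and the collective never launches, which is the class of hang that DDP's `find_unused_parameters` option exists to handle.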