Jae's Tech Blog

Posts tagged "fsdp"

February 17, 2026

Distributed LLM Training 15 - How FSDP Differs from DDP and When It Helps

FSDP keeps parameters sharded and only gathers them when needed, making it a direct answer to parameter-replication pressure

Lectures

March 1, 2026

Distributed LLM Training 19 - How to Read Megatron-LM and DeepSpeed Structurally

Frameworks are easier to understand when you read them as bundles of parallelization and state-management choices rather than as giant feature lists

Lectures

February 25, 2026

PyTorch Internals 18 - Where Autograd Meets Distributed Runtime

DDP and FSDP are not external magic; they depend directly on autograd timing and tensor-state management inside the runtime

Lectures

© 2025 Jae · Notes on systems, software, and building things carefully.