Jae's Tech Blog

Posts tagged "roadmap"

January 6, 2026

Distributed LLM Training 01 - Why LLM Training Becomes a Distributed Systems Problem

Once LLM training leaves a single GPU, it stops being only a modeling problem and becomes a systems problem of memory, communication, and recovery.

Lectures
March 3, 2026

PyTorch Internals 20 - A Practical Path from Internals Knowledge to Real Engineering Work

The goal of studying PyTorch internals is not trivia but the ability to connect custom operators, kernel work, profiling, and distributed runtime behavior.

Lectures

© 2025 Jae · Notes on systems, software, and building things carefully.
