Jae's Tech Blog

Posts tagged "attention"

January 30, 2026

Distributed LLM Training 09 - Where Tensor Parallelism Actually Lives Inside a Transformer

Tensor parallelism becomes concrete when you map it onto the QKV projections, the attention output projection, and the two large MLP projections inside a transformer block.
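The split the summary describes can be sketched numerically: shard the first (up) projection column-wise and the second (down) projection row-wise, so each rank computes a partial output and a single sum plays the role of the all-reduce. This is a minimal numpy sketch, not the post's actual code; all shapes and names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, tp = 8, 32, 2          # toy sizes; tp = tensor-parallel degree

x = rng.standard_normal((4, d_model))  # (tokens, d_model) activations
W1 = rng.standard_normal((d_model, d_ff))   # up-projection (e.g. MLP in)
W2 = rng.standard_normal((d_ff, d_model))   # down-projection (e.g. MLP out)

# Reference: unsharded forward pass through the two projections
ref = np.maximum(x @ W1, 0) @ W2

# Tensor parallelism: W1 split column-wise, W2 split row-wise across ranks
W1_shards = np.split(W1, tp, axis=1)
W2_shards = np.split(W2, tp, axis=0)

# Each "rank" computes a partial result; summing them is the all-reduce
partials = [np.maximum(x @ a, 0) @ b for a, b in zip(W1_shards, W2_shards)]
out = sum(partials)

assert np.allclose(out, ref)  # sharded result matches the unsharded one
```

The same column-then-row pairing applies to the attention path: QKV projections are split column-wise and the output projection row-wise, so each block needs only one all-reduce per path.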

Lectures

© 2025 Jae · Notes on systems, software, and building things carefully.
