Jae's Tech Blog

Posts tagged "batch-size"

January 18, 2026

Distributed LLM Training 05 - Global Batch Size, Gradient Accumulation, and Learning Rate Scaling

Adding more GPUs changes optimizer semantics as well as throughput, so batch size and learning rate need to be reasoned about together.

Lectures

© 2025 Jae · Notes on systems, software, and building things carefully.
