Distributed LLM Training 07 - NCCL and Topology: Why the Same GPU Count Can Behave Very Differently
In distributed training, performance is often shaped more by how the GPUs are connected than by the raw number of GPUs.
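To make that concrete before diving in, here is a minimal sketch (not from the original post) that times an NCCL all-reduce with PyTorch and reports an effective "bus bandwidth", the number where topology shows up. The buffer size, iteration counts, and the `bench.py` filename are illustrative assumptions; the busbw formula follows the same ring all-reduce convention NVIDIA's nccl-tests uses. Two nodes with the same GPU count can report very different numbers depending on whether those GPUs talk over NVLink or PCIe.

```python
"""bench.py -- a minimal sketch, assuming PyTorch with the NCCL backend on a multi-GPU node."""
import os
import time

import torch
import torch.distributed as dist


def main():
    dist.init_process_group(backend="nccl")  # torchrun supplies rank/world size via env vars
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # 256 MiB of fp32: large enough to be bandwidth-bound rather than latency-bound.
    x = torch.randn(64 * 1024 * 1024, device="cuda")

    for _ in range(5):  # warm-up so one-time CUDA/NCCL setup cost is excluded
        dist.all_reduce(x)
    torch.cuda.synchronize()

    iters = 20
    t0 = time.perf_counter()
    for _ in range(iters):
        dist.all_reduce(x)
    torch.cuda.synchronize()
    avg_s = (time.perf_counter() - t0) / iters

    n = dist.get_world_size()
    # A ring all-reduce moves ~2*(n-1)/n bytes over the wire per byte of payload,
    # so bus bandwidth = bytes * 2*(n-1)/n / time (same convention as nccl-tests).
    bus_gbps = x.numel() * 4 * 2 * (n - 1) / n / avg_s / 1e9
    if dist.get_rank() == 0:
        print(f"world_size={n}  avg={avg_s * 1e3:.2f} ms  busbw~{bus_gbps:.1f} GB/s")
    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Run it with, for example, `torchrun --nproc_per_node=8 bench.py`. Because the busbw convention matches nccl-tests, the reported numbers are directly comparable across machines, which is exactly how "the same GPU count" reveals very different interconnects.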