This page is for readers who do not want to work through the full roadmap first.

If the Start Here page is the structured entry point, this page is the curated one. These are the posts and series that best represent what the blog does well.

Best Series To Read Straight Through

GPU Systems

One of the strongest long-form tracks on the blog right now. It connects GPU architecture, CUDA thinking, Triton, and optimization work into one technical arc instead of treating them as disconnected topics.

Best if you want:

  • GPU kernel engineering skills
  • better hardware-level performance intuition
  • a stronger bridge into large-model training systems

PyTorch Internals

A good bridge series between model code and systems work. It is useful if you already use PyTorch but want to understand tensors, autograd, extensions, and how custom kernels fit into real training code.

Distributed LLM Training

This is the systems-heavy LLM training track. It is less about API usage and more about understanding how memory, communication, topology, and framework structure interact.

Compiler

A good theory-to-implementation series. It is strongest when read as a way to connect formal language ideas to ASTs, IR, optimization, and code generation.

Best Individual Starting Points

GPU Systems 00 - What You Need Before Studying GPU Systems

This is a strong entry point if you want to study the GPU track seriously instead of sampling random posts.

Distributed LLM Training 01 - Why LLM Training Becomes a Distributed Systems Problem

This is one of the better “why this topic matters” posts on the blog. It sets the framing correctly before the framework names start appearing.

PyTorch Internals 01 - Why You Need to Understand the Internals

A useful bridge post if you are already working with models and want to move closer to kernels, operators, and runtime behavior.

Automata and Compilers 01 - Finite Automata

Good if you want to start the compiler track from the actual theoretical base instead of jumping into parser implementation without context.

Linux 01 - What the Kernel Is Actually Doing

Good if you want stronger systems intuition from the operating-system side first.

Best Paths By Goal

If you want to become stronger at systems work

Read:

  1. Computer Architecture
  2. Linux
  3. Compiler

If you want the GPU / LLM stack

Read:

  1. GPU Systems
  2. PyTorch Internals
  3. Distributed LLM Training

If you want production-facing ML systems

Read:

  1. MLOps
  2. Platform Engineering
  3. Distributed LLM Training

If You Want One Recommendation

If you want the most representative current path on the blog, start with GPU Systems, then move to PyTorch Internals, then Distributed LLM Training.

That path captures the blog at its strongest: systems thinking, runtime detail, and serious technical sequencing.