GPU Systems
From GPU architecture and CUDA kernels to Triton and real kernel optimization work
Engineers who want to understand how GPUs actually execute work and eventually write and optimize their own kernels.
Thoughts on code, technology, and everything in between
Long-form posts on platform engineering, Linux, compilers, MLOps, and computer architecture, written to help you build stronger intuition instead of just memorizing terms.
A few strong entry points if you are new here.
Fresh writing, updates, and ongoing series entries.
Building reliable ML systems from data pipelines to production monitoring
ML engineers, data scientists, and backend engineers moving from model experiments to production operations.
From finite automata and formal languages to building a compiler from scratch
Readers who want both the theory behind language processing and the bridge to real compiler construction.
Why developer experience should guide every platform decision, how to measure it, and the anti-patterns that kill it
If you think of a tensor only as an n-dimensional array, you will misunderstand views, layouts, and hidden copies
What computational power a stack adds to finite automata
Once LLM training leaves a single GPU, it stops being only a modeling problem and becomes a systems problem around memory, communication, and recovery
How processes are created and managed in Linux, and how threads relate to them
To reason well about performance, custom operators, and distributed runtime behavior, PyTorch has to be understood as a runtime, not just a Python library
How the ALU, control unit, and datapath work together to execute instructions
What actually goes inside an Internal Developer Platform: the five planes, how they connect, and the build vs buy decision
Why MLOps is necessary and what the real challenges of the ML lifecycle are