PyTorch Internals 05 - How the Autograd Graph and Engine Work
Autograd is not just automatic differentiation; it is a graph-construction and backward-execution runtime.
Backward is a runtime process
loss.backward() looks simple from the outside, but internally it relies on:
- graph construction during forward
- saved tensors and metadata
- dependency-aware backward scheduling
That is why autograd is best thought of as an execution engine, not just a differentiation feature.
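The graph construction described above is easy to see directly. The sketch below (tensor values are illustrative, not from the post) inspects the grad_fn nodes that the forward pass records, then runs backward over them:

```python
import torch

# Forward pass builds the graph: each differentiable op attaches a grad_fn
# node to its output and saves the tensors its backward formula needs.
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = x * x            # creates a MulBackward0 node; saves x for backward
loss = y.sum()       # creates a SumBackward0 node

print(loss.grad_fn)                  # <SumBackward0 object at ...>
print(loss.grad_fn.next_functions)   # edge back to the MulBackward0 node

# backward() walks this graph from loss to the leaves, scheduling each
# node once all of its dependencies (incoming gradients) are ready.
loss.backward()
print(x.grad)        # d(sum(x*x))/dx = 2x -> tensor([4., 6.])
```

Note that the graph is rebuilt on every forward pass, which is what makes control flow in Python (loops, conditionals) work transparently with autograd.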
Why this matters
You need this model to understand:
- why gradients disappear (for example, .grad is None on non-leaf tensors)
- why in-place operations are dangerous
- why some custom operators need explicit backward logic
- why memory usage can rise due to saved activations
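The first two points above can be demonstrated in a few lines. This sketch (the tensors are illustrative) shows a gradient that "disappears" because it belongs to a non-leaf tensor, and an in-place write that invalidates a tensor the engine saved for backward:

```python
import torch

# 1. Gradients disappear: by default only leaf tensors retain .grad.
x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 2                  # y is a non-leaf (an intermediate result)
y.sum().backward()
print(x.grad)              # tensor([2., 2.])
print(y.grad)              # None -- call y.retain_grad() before backward to keep it

# 2. In-place ops are dangerous: each tensor carries a version counter,
#    and backward checks that saved tensors were not modified since forward.
a = torch.tensor([1.0, 2.0], requires_grad=True)
b = a * 2
c = b * b                  # MulBackward0 saves b for its backward formula
b.add_(1)                  # in-place write bumps b's version counter
try:
    c.sum().backward()
except RuntimeError as err:
    print("autograd rejected the stale saved tensor:", err)
```

The version-counter check is also why saved activations drive memory usage: every saved tensor stays alive until backward consumes it (or the graph is freed).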
The next post shows where custom autograd functions fit into this picture.