PyTorch Internals 12 - Backward Implementation Patterns and Saved-State Strategy
Backward design is really a question about what to save, what to recompute, and how to preserve correct semantics.
The real backward design question
When you build a custom operator, you have to decide:
- what to save during forward
- what to recompute in backward
- where numerical stability needs extra care
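Those choices control memory, performance, and implementation complexity at the same time: every tensor you save stays alive until backward runs, and every value you recompute costs extra work in the backward pass.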
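To make the save-versus-recompute choice concrete, here is a minimal sketch using `torch.autograd.Function`. The class names `SigmoidSaveOutput` and `SigmoidRecompute` are hypothetical, chosen for illustration only, and sigmoid is just a convenient stand-in for a custom operator. Both variants produce the same gradient; they differ only in which tensor they keep alive between forward and backward.

```python
import torch


class SigmoidSaveOutput(torch.autograd.Function):
    """Hypothetical variant: save the forward output, recompute nothing."""

    @staticmethod
    def forward(ctx, x):
        y = torch.sigmoid(x)
        # The output is kept alive until backward runs.
        ctx.save_for_backward(y)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (y,) = ctx.saved_tensors
        # d/dx sigmoid(x) = y * (1 - y), expressed using the saved output.
        return grad_out * y * (1 - y)


class SigmoidRecompute(torch.autograd.Function):
    """Hypothetical variant: save only the input, recompute in backward."""

    @staticmethod
    def forward(ctx, x):
        # Only the input is saved; the output is not retained by this node.
        ctx.save_for_backward(x)
        return torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Extra compute in backward instead of extra saved state.
        y = torch.sigmoid(x)
        return grad_out * y * (1 - y)


# Double precision keeps the finite-difference check in gradcheck reliable.
x = torch.randn(8, dtype=torch.double, requires_grad=True)
ok_save = torch.autograd.gradcheck(SigmoidSaveOutput.apply, (x,))
ok_recompute = torch.autograd.gradcheck(SigmoidRecompute.apply, (x,))
```

For a single elementwise op the difference is small, but the same pattern scales: when the intermediate is large and cheap to rebuild, recomputing tends to be the better trade, and when it is expensive to rebuild, saving it usually wins. Running `torch.autograd.gradcheck` on each variant is a cheap way to confirm that changing the saved state did not change the semantics.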
The next post moves to fused operators, where those tradeoffs become even more visible.