A tensor is more than a shape

A PyTorch tensor is best understood as a combination of:

  • storage: the underlying memory buffer
  • size: the shape along each dimension
  • stride: how far to move in storage for the next element along each dimension
  • offset: where this tensor begins inside storage

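These four pieces of metadata can be inspected directly. A minimal sketch (the tensor shapes here are illustrative, not from the original text):

```python
import torch

t = torch.arange(12).reshape(3, 4)  # 3x4 view over a 12-element buffer

print(t.size())            # torch.Size([3, 4])
print(t.stride())          # (4, 1): step 4 elements per row, 1 per column
print(t.storage_offset())  # 0: this tensor starts at the beginning of storage

# A slice reuses the same storage; only the metadata changes.
row = t[1]
print(row.storage_offset())  # 4: row 1 begins 4 elements into the buffer
```

Note how the slice produces a new tensor object but no new buffer: only its offset differs.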
This is why views, permutations, slices, and transposes can often be created without copying data: they share the same storage and differ only in metadata.

Why stride matters

Two tensors can have the same shape and values but different memory access patterns. Reordering dimensions (for example, with a transpose) changes the stride even though the numerical result looks identical. Many performance and correctness issues begin there.
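A quick way to see this is to compare a transposed tensor with a freshly laid-out copy of it (a small illustrative sketch, not from the original text):

```python
import torch

a = torch.arange(6).reshape(2, 3)
b = a.t().contiguous()  # same values as a.t(), but freshly laid out in memory

# a.t() and b print identically, yet they traverse memory differently.
print(a.t().stride())  # (1, 3): column elements of `a` are 3 apart in storage
print(b.stride())      # (2, 1): row-major, adjacent in memory

print(torch.equal(a.t(), b))  # True: identical shape and values
```

Equal values, different strides: any kernel that iterates over memory sees two very different access patterns.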

Why views feel cheap

When possible, view-like operations reinterpret existing storage by changing metadata. That is much cheaper than allocating and copying. But if the layout does not support the requested interpretation, PyTorch may need a contiguous copy path.
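Both cases can be demonstrated in a few lines (an illustrative sketch; the tensor sizes are assumptions, not from the original text):

```python
import torch

x = torch.arange(8)
v = x.view(2, 4)                      # metadata-only: no allocation
print(v.data_ptr() == x.data_ptr())   # True: same underlying buffer

# A transpose is non-contiguous; view() cannot reinterpret it in place.
y = v.t()
try:
    y.view(8)
except RuntimeError:
    print("view() failed: layout not compatible")

# reshape() succeeds by falling back to a copy when the layout requires it.
z = y.reshape(8)
```

This is the hidden copy path: `reshape` quietly materializes new memory exactly where `view` refuses.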

The next post focuses directly on contiguous layout, memory formats, and hidden copy behavior.