Why Intermediate Representations Exist

Could we not translate the syntax tree directly to machine code once semantic analysis is complete? In theory, yes, but in practice this creates a serious problem. Suppose the frontend supports three languages (C, Java, and Python) and the backend must target three architectures (x86, ARM, and RISC-V). With direct connections between frontends and backends, 3 × 3 = 9 translation paths are needed. In general, M languages and N architectures require M × N translators, so every addition multiplies the work.

An intermediate representation (IR) solves this problem. If every frontend translates to a common IR and every backend translates from that IR to machine code, only 3 + 3 = 6 translations are required; in general, M + N instead of M × N. Adding a new language means implementing only the translation from that language to the IR, and adding a new architecture means implementing only the translation from the IR to that architecture.

Direct approach:                  IR approach:
C ──→ x86                         C ─────┐
C ──→ ARM                         Java ──┼──→ IR ──→ x86
C ──→ RISC-V                      Python ┘       ├──→ ARM
Java ──→ x86                                     └──→ RISC-V
Java ──→ ARM
Java ──→ RISC-V                   Translations: 3 + 3 = 6
Python ──→ x86
Python ──→ ARM
Python ──→ RISC-V

Translations: 3 × 3 = 9

Another advantage of an IR is that optimization can be performed independently of both the source language and the target architecture. Implementing an optimization once at the IR level benefits every language and every architecture. This is why virtually every modern production compiler uses an IR.

Types of Intermediate Representation

IRs are broadly classified into three categories based on their level of abstraction.

Tree-based IRs closely resemble syntax trees. The abstract syntax tree (AST) is the primary example; because it preserves the structure of the source code well, it is well suited to semantic analysis and high-level optimization. However, it is too far removed from machine code to be practical for direct code generation.

Linear IRs arrange instructions sequentially. Three-address code is the canonical example, where each instruction has at most three operands. Its structure resembles machine code while remaining hardware-independent, making it the bridge between optimization and code generation.

Source code:       a = b * c + d * e

Three-address code:
  t1 = b * c
  t2 = d * e
  t3 = t1 + t2
  a  = t3
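The translation above can be sketched in a few lines. The following is a toy lowering routine under assumed conventions (nested tuples for the expression tree, `tN` names for temporaries); a real compiler would work from a typed AST.

```python
# Toy lowering of an expression tree to three-address code.
# Trees are nested tuples like ('+', left, right); leaves are variable names.

counter = 0

def new_temp():
    global counter
    counter += 1
    return f"t{counter}"

def lower(node, code):
    """Return the name holding node's value, appending instructions to code."""
    if isinstance(node, str):            # a leaf: just a variable reference
        return node
    op, left, right = node
    l = lower(left, code)                # operands are lowered first,
    r = lower(right, code)               # so at most three addresses remain
    t = new_temp()
    code.append(f"{t} = {l} {op} {r}")
    return t

code = []
result = lower(('+', ('*', 'b', 'c'), ('*', 'd', 'e')), code)
code.append(f"a = {result}")
print("\n".join(code))                   # same four instructions as above
```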

Graph-based IRs represent data flow and control flow as graphs. The most important form in this category is Static Single Assignment (SSA) form.

SSA Form

SSA form is the foundation of optimization in modern compilers. Its core rule is simple: every variable is defined exactly once.

Regular code:           SSA form:
x = 1                   x₁ = 1
x = 2                   x₂ = 2
y = x + 3               y₁ = x₂ + 3

Because each variable is defined only once, it is always clear where any given value came from at any point of use. This property dramatically simplifies many optimizations. Constant propagation on a conventional IR requires reaching-definitions analysis, because a variable may be redefined many times; in SSA there is only one definition, so the relationship between definitions and uses is self-evident.
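For straight-line code, the renaming itself is simple bookkeeping. A minimal sketch, assuming instructions are (destination, token list) pairs and versions are written as plain digit suffixes:

```python
# Toy SSA renaming for straight-line code (no branches, so no phi needed).
# Instructions are (destination, tokens); each assignment gets a new version.

def to_ssa(instrs):
    version = {}                                  # variable -> latest version
    out = []
    for dest, tokens in instrs:
        # uses are renamed to the current version first ...
        renamed = [f"{t}{version[t]}" if t in version else t for t in tokens]
        # ... and only then does the destination get a fresh version
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}{version[dest]}", renamed))
    return out

prog = [("x", ["1"]), ("x", ["2"]), ("y", ["x", "+", "3"])]
for dest, tokens in to_ssa(prog):
    print(dest, "=", " ".join(tokens))   # x1 = 1 / x2 = 2 / y1 = x2 + 3
```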

But what happens in conditional branches where a variable receives different values depending on the path taken? This is handled by introducing the phi function.

if (cond)
  x₁ = 1
else
  x₂ = 2
x₃ = φ(x₁, x₂)     ← selects x₁ or x₂ depending on which branch was taken
y₁ = x₃ + 3

The phi function is not an actual executable instruction but a notational device to maintain the SSA invariant. During code generation, it is replaced with appropriate copy instructions or eliminated during register allocation.
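A minimal sketch of that replacement, under an assumed toy block layout in which each phi records which value to take from which predecessor:

```python
# Toy phi elimination: each phi becomes a copy at the end of the predecessor
# block that supplies the value. Blocks map a name to their phis and code.

def eliminate_phis(blocks):
    for block in blocks.values():
        for dest, sources in block["phis"]:
            for pred, src in sources.items():
                blocks[pred]["code"].append(f"{dest} = {src}")  # insert copy
        block["phis"] = []
    return blocks

blocks = {
    "then": {"phis": [], "code": ["x1 = 1"]},
    "else": {"phis": [], "code": ["x2 = 2"]},
    "join": {"phis": [("x3", {"then": "x1", "else": "x2"})],
             "code": ["y1 = x3 + 3"]},
}
eliminate_phis(blocks)
print(blocks["then"]["code"])    # ['x1 = 1', 'x3 = x1']
print(blocks["else"]["code"])    # ['x2 = 2', 'x3 = x2']
```

Real compilers must additionally order the inserted copies carefully when several phis read each other's old values (the classic "swap problem"); this sketch ignores that.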

Key Optimization Techniques

Once the IR is in place, the compiler performs a variety of optimizations to improve execution speed and reduce code size. An optimization is a transformation that improves performance without altering the program's meaning.

Constant folding evaluates expressions that can be computed at compile time. x = 3 + 5 becomes x = 8. Simple yet effective, it is particularly powerful when it cascades with other optimizations.
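A toy folder over nested expression tuples illustrates the idea, including how folding a subtree can cascade upward:

```python
# Toy constant folder over nested tuples like ('+', 3, 5).

import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    if not isinstance(node, tuple):
        return node                          # a constant or a variable name
    op, left, right = node
    l, r = fold(left), fold(right)           # fold children first (cascade)
    if isinstance(l, int) and isinstance(r, int):
        return OPS[op](l, r)                 # both sides known: compute now
    return (op, l, r)                        # otherwise keep the operation

print(fold(("+", 3, 5)))                     # 8
print(fold(("+", "x", ("*", 2, 4))))         # ('+', 'x', 8)
```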

Dead code elimination removes code that has no effect on the program's output. If a variable is defined but never used afterward, its definition contributes nothing to the result and can be safely removed. In SSA, where every definition's uses are explicit, identifying dead code becomes straightforward.

Before optimization:        After optimization:
t1 = a + b                  t1 = a + b
t2 = c * d                  (removed: t2 is never used)
t3 = t1 + 1                 t3 = t1 + 1
return t3                   return t3
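The example above can be reproduced with a short backward pass. A sketch assuming SSA-style single definitions and side-effect-free instructions:

```python
# Toy dead code elimination: walk backwards from the values the program
# actually returns, keeping only the definitions that feed them.

def eliminate_dead(instrs, live):
    live = set(live)                         # names still needed
    kept = []
    for dest, operands in reversed(instrs):
        if dest in live:                     # this definition is used later
            kept.append((dest, operands))
            live.discard(dest)
            live.update(o for o in operands if not o.isdigit())
    return list(reversed(kept))

prog = [("t1", ["a", "b"]), ("t2", ["c", "d"]), ("t3", ["t1", "1"])]
print(eliminate_dead(prog, live=["t3"]))     # t2 is gone, t1 and t3 remain
```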

Common subexpression elimination ensures that identical computations are performed only once. In a = b + c; d = b + c;, the expression b + c does not need to be computed twice, so d = a can replace the second computation.
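A minimal local version, assuming SSA so that operands are never redefined (outside SSA, a sound CSE must also invalidate remembered expressions whenever one of their operands is reassigned):

```python
# Toy local common subexpression elimination: an expression already computed
# is replaced by a copy of the name that holds its value.

def eliminate_common(instrs):
    seen = {}                                # expression -> name holding it
    out = []
    for dest, expr in instrs:
        if expr in seen:
            out.append((dest, seen[expr]))   # d = a instead of recomputing
        else:
            seen[expr] = dest
            out.append((dest, expr))
    return out

prog = [("a", ("+", "b", "c")), ("d", ("+", "b", "c"))]
print(eliminate_common(prog))    # [('a', ('+', 'b', 'c')), ('d', 'a')]
```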

Loop optimizations target loops, which typically account for the majority of a program's execution time. Loop-invariant code motion moves computations whose values do not change across iterations outside the loop body. Loop unrolling replicates the loop body multiple times to reduce branch overhead. Induction variable optimization simplifies calculations tied to loop counters.

Loop-invariant code motion example:

Before optimization:                After optimization:
for (i = 0; i < n; i++)            t = x * y           ← moved outside the loop
  a[i] = x * y + i                 for (i = 0; i < n; i++)
                                     a[i] = t + i
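A sketch of the invariance test behind that transformation, assuming instructions are (destination, operand list) pairs and that loop-varying names such as the counter are supplied explicitly; a real pass must additionally prove the hoisted instruction is safe to execute unconditionally:

```python
# Toy loop-invariant code motion: an instruction none of whose operands are
# assigned inside the loop (or vary with it) can be computed once outside.

def hoist_invariants(body, varying):
    """body: (dest, operands) instructions run every iteration;
    varying: names that change each iteration, e.g. the loop counter."""
    defined = {dest for dest, _ in body} | set(varying)
    hoisted, remaining = [], []
    for dest, operands in body:
        if any(op in defined for op in operands):
            remaining.append((dest, operands))   # depends on the loop: keep
        else:
            hoisted.append((dest, operands))     # invariant: move outside
            defined.discard(dest)                # may unlock later hoists
    return hoisted, remaining

loop = [("t", ["x", "y"]), ("a_i", ["t", "i"])]  # a[i] = x * y + i
print(hoist_invariants(loop, varying={"i"}))
```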

LLVM IR

LLVM is the most successful realization of the IR concept in modern compiler infrastructure. LLVM IR exists in three forms: a human-readable text format (.ll), an in-memory data structure, and a bitcode format (.bc).

define i32 @add(i32 %a, i32 %b) {
entry:
  %result = add i32 %a, %b
  ret i32 %result
}

The above is the LLVM IR for a function that adds two integers. Type information is explicit, SSA form is enforced, and nothing is hardware-specific. Clang (C/C++), rustc (Rust), and the Swift compiler all translate to LLVM IR, after which the LLVM middle end and backend handle optimization and code generation. The vision of multiple languages sharing a single IR and a single optimization pipeline has been realized.

Optimization Passes

In real compilers, optimization is not a single monolithic step but a sequence of small passes applied in order. Each pass performs one specific optimization, and the ordering and combination of passes affect the final performance.

IR → [Constant propagation] → [Dead code elimination] → [Common subexpression elimination]
  → [Loop-invariant code motion] → [Register promotion] → ... → Optimized IR

The pass-based design allows each pass to be developed and tested independently. It also enables different sets of passes for different optimization levels (-O0, -O1, -O2, -O3), balancing compile time against code quality. Because one pass's transformations can create new optimization opportunities for another, it is sometimes effective to apply the same pass multiple times.
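The pipeline idea itself is tiny: each pass is a function from IR to IR, and an optimization level is just a list of passes. A toy sketch with a string-based "IR" (the pass names and levels here are illustrative, not LLVM's actual pass set):

```python
# Toy pass pipeline: a pass is a function from IR to IR, and an optimization
# level simply selects which passes run and in what order.

def constant_fold(ir):                       # deliberately simplistic passes
    return [line.replace("3 + 5", "8") for line in ir]

def dead_code(ir):
    return [line for line in ir if not line.startswith("unused")]

PIPELINES = {
    "-O0": [],                               # no optimization at all
    "-O1": [constant_fold, dead_code],       # a small, fast selection
}

def optimize(ir, level):
    for run_pass in PIPELINES[level]:        # ordering matters in practice
        ir = run_pass(ir)
    return ir

print(optimize(["x = 3 + 5", "unused = x * 2"], "-O1"))   # ['x = 8']
```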

Intermediate representations and optimization form the heart of the compiler. By transforming the program into a more efficient form while preserving its meaning, this stage is what allows the high-level code we write to execute quickly on hardware. In the next post, we'll look at how the optimized IR is translated into actual machine code: code generation.