What Happens Without Interrupts?

How does a CPU receive keyboard input? The simplest approach is for the CPU to periodically check the keyboard controller's status register. This is called polling. The CPU runs in a loop, repeatedly asking "is there input?"

Is this approach efficient? Not at all. The interval between keystrokes ranges from tens of milliseconds to several seconds, but the CPU operates on a nanosecond scale. While waiting for a single keystroke, the CPU would perform millions of pointless checks. If the CPU had to poll not just the keyboard but also the disk, network, timer, and every other device, most of its time would be consumed by status checks, leaving almost none for actual computation.
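The waste is easy to see in a toy busy-wait loop. This is a sketch, not real driver code: the "status register" is just a counter standing in for the hardware, and the check count models the reads the CPU would burn before a key arrives.

```python
def poll_until_ready(ready_at_check):
    """Count how many status-register reads a busy-wait burns before
    input shows up. `ready_at_check` is a stand-in for the (much later)
    moment a key is actually pressed."""
    checks = 0
    while True:
        checks += 1                   # one read of the status register
        if checks >= ready_at_check:  # input finally present
            return checks
```

Using the article's own numbers: if one check takes on the order of a nanosecond and keystrokes are a millisecond or more apart, the loop performs around a million reads per keystroke, all of them wasted.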

Interrupts solve this problem at its root. The device actively sends a signal to the CPU. The CPU continues its own work without needing to check the device, and only when a device signals "there is something to process" does it momentarily pause its current task to handle the event. By analogy with a telephone, polling is picking up the receiver and saying "hello?" repeatedly, while interrupts mean answering only when the phone rings.

Three Types of Interrupts

Events that disrupt the CPU's normal execution flow fall into three broad categories.

Hardware interrupts are notifications sent from external devices to the CPU through physical signal lines. Keyboard input, timer expiration, disk I/O completion, and network packet arrival all fall into this category. Because they originate outside the CPU and occur asynchronously, they can arrive at any point during instruction execution.

Software interrupts are intentionally triggered by programs. On x86, they are initiated via the INT instruction, with the system call (INT 0x80) discussed in the previous post being a prime example. Unlike hardware interrupts, they occur synchronously, at predictable points within the program's execution flow.

Exceptions are events generated inside the CPU during instruction execution. Division by zero, accessing an invalid memory address, and executing a privileged instruction without sufficient privilege all cause exceptions. Exceptions are further divided into three subtypes. Faults are exceptions where the faulting instruction can be re-executed: the page fault is the classic example, where the operating system loads the missing page and then re-executes the instruction. Traps resume execution at the instruction following the one that caused the exception; the debugger breakpoint (INT3) is the typical example. Aborts are unrecoverable severe errors that typically terminate the process or the entire system.

The Interrupt Descriptor Table

When the CPU receives an interrupt, how does it know which code to execute? On x86, the Interrupt Descriptor Table (IDT) provides this mapping.

The IDT is a table with up to 256 entries, each containing a gate descriptor corresponding to one interrupt vector (number). The CPU's IDTR register points to this table's base address and size.

IDTR ──▢ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         β”‚ Vector 0:   #DE (Divide Error)         β”‚ ──▢ Divide Error Handler
         β”‚ Vector 1:   #DB (Debug)                β”‚ ──▢ Debug Handler
         β”‚ Vector 2:   NMI                        β”‚ ──▢ NMI Handler
         β”‚ ...                                    β”‚
         β”‚ Vector 13:  #GP (General Protection)   β”‚ ──▢ GP Fault Handler
         β”‚ Vector 14:  #PF (Page Fault)           β”‚ ──▢ Page Fault Handler
         β”‚ ...                                    β”‚
         β”‚ Vector 32:  Timer Interrupt            β”‚ ──▢ Timer ISR
         β”‚ Vector 33:  Keyboard Interrupt         β”‚ ──▢ Keyboard ISR
         β”‚ ...                                    β”‚
         β”‚ Vector 255                             β”‚
         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Vectors 0 through 31 are reserved for CPU-defined exceptions. Vectors 32 through 255 are available for the operating system and devices. When an interrupt occurs, the CPU indexes into the IDT using the vector number and transfers execution to the segment and offset recorded in the gate descriptor. As explained in the previous post, privilege level transitions and stack switches occur as part of this process.
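In spirit, the dispatch is just a table lookup by vector number. The following is a toy Python sketch, not real descriptor handling: the handler bodies and their return strings are invented for illustration.

```python
IDT_SIZE = 256  # x86 allows up to 256 vectors

def make_idt():
    """Build a toy IDT: a 256-slot table mapping vector -> handler."""
    idt = [None] * IDT_SIZE
    idt[0]  = lambda: "divide error handled"    # #DE
    idt[14] = lambda: "page fault handled"      # #PF
    idt[32] = lambda: "timer tick handled"      # timer IRQ
    idt[33] = lambda: "keyboard input handled"  # keyboard IRQ
    return idt

def dispatch(idt, vector):
    """What the CPU does in spirit: index the table by vector number
    and transfer control to the recorded handler."""
    handler = idt[vector]
    if handler is None:
        raise RuntimeError(f"no handler installed for vector {vector}")
    return handler()
```

The real table holds gate descriptors (segment selector, offset, privilege level, gate type) rather than function references, but the indexing step is the same.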

How Interrupt Service Routines Work

An Interrupt Service Routine (ISR) is the code that handles a specific interrupt. By the time the ISR begins executing, the CPU has already performed critical steps automatically: the current instruction pointer (RIP), code segment (CS), flags register (RFLAGS), stack pointer (RSP), and stack segment (SS) have been saved to the kernel stack.

The ISR must additionally preserve general-purpose registers. Since interrupts can occur at any point, if the ISR modifies registers, the interrupted code would encounter unexpected values when it resumes. This is why prologue/epilogue code that pushes registers onto the stack at the start and restores them before returning is essential.
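The save/restore discipline can be modeled as a snapshot taken around the handler body. This is a simulation, not kernel code: the register names are illustrative, and a real prologue pushes each register individually in assembly.

```python
def run_isr(cpu_regs, isr_body, stack):
    """Prologue/epilogue sketch: snapshot the general-purpose registers,
    run the handler body (which may clobber them), then restore the
    snapshot before returning to the interrupted code."""
    stack.append(dict(cpu_regs))   # prologue: push registers
    isr_body(cpu_regs)             # handler may scribble on registers
    cpu_regs.update(stack.pop())   # epilogue: pop registers before IRET
    return cpu_regs
```

From the interrupted code's point of view, the registers look untouched even though the handler used them freely.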

ISRs must execute as quickly as possible. When entered through an interrupt gate, the CPU clears the interrupt flag (IF), disabling further maskable interrupts, so a long-running ISR delays the processing of other interrupts. The Linux kernel addresses this by splitting interrupt handling into two stages. The top half performs only immediate processing such as reading hardware registers and acknowledging the interrupt, while time-consuming work is deferred to the bottom half.
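A minimal sketch of the two-stage split, with a plain queue standing in for the kernel's softirq/tasklet machinery (the function names and data are invented for illustration):

```python
from collections import deque

pending_work = deque()   # stands in for a softirq/tasklet queue

def top_half(device_data):
    """Runs with interrupts disabled: do only the minimum, i.e.
    acknowledge the device and queue the heavy lifting for later."""
    pending_work.append(device_data)
    return "acked"

def bottom_half():
    """Runs later, with interrupts enabled: drain the deferred work."""
    results = []
    while pending_work:
        results.append(f"processed {pending_work.popleft()}")
    return results
```

The top half stays short so new interrupts are re-enabled quickly; the bottom half can take its time because it no longer blocks other devices.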

Interrupt Priority and Nesting

Not all interrupts carry the same urgency. Timer interrupts must be processed quickly for system time accuracy, while keyboard interrupts can tolerate slight delays. To reflect these differences, interrupt controllers provide a priority scheme.

When a higher-priority interrupt occurs while a lower-priority ISR is executing, the current ISR can be suspended to handle the higher-priority interrupt first. This is called interrupt nesting. Conversely, interrupts of equal or lower priority wait until the current ISR completes.

The NMI (Non-Maskable Interrupt) is a special interrupt with the highest priority. As the name implies, it cannot be masked, meaning it cannot be disabled by software. It occurs in critical situations like hardware failures and memory parity errors, and must be handled regardless of the system's current state. Ordinary hardware interrupts, by contrast, are maskable and can be temporarily disabled with the CLI instruction and re-enabled with STI. When the kernel disables interrupts in a critical section, it uses this mechanism.
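The masking behavior can be modeled in a few lines. Here `cli`/`sti` mimic the instructions of the same name, and the NMI path bypasses the mask; this is a simulation of the observable behavior, not of how the hardware is actually wired.

```python
class InterruptLine:
    """Toy model of maskable vs. non-maskable interrupt delivery."""

    def __init__(self):
        self.masked = False   # CLI sets this, STI clears it
        self.pending = []     # interrupts held while masked
        self.handled = []     # interrupts actually delivered

    def cli(self):
        self.masked = True

    def sti(self):
        self.masked = False
        while self.pending:                  # deliver what arrived
            self.handled.append(self.pending.pop(0))

    def raise_irq(self, name, nmi=False):
        if nmi or not self.masked:
            self.handled.append(name)        # NMI ignores the mask
        else:
            self.pending.append(name)        # held until STI
```

Note that a masked hardware interrupt is not lost, only deferred: the controller holds it pending until interrupts are re-enabled.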

From PIC to APIC

In early IBM PCs, the 8259A PIC (Programmable Interrupt Controller) managed interrupts. A single PIC provided 8 IRQ (Interrupt Request) lines, and cascading two together allowed handling a total of 15 external interrupts.

                   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
IRQ 0 (Timer)    ─▢│          β”‚
IRQ 1 (Keyboard) ─▢│  Master  β”‚
IRQ 2 (Cascade)  ─▢│  PIC     │──▢ CPU INTR Pin
IRQ 3            ─▢│          β”‚
IRQ 4            ─▢│          β”‚
...                β”‚          β”‚
                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                        β–²
                   β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”
IRQ 8            ─▢│          β”‚
IRQ 9            ─▢│  Slave   β”‚
...                β”‚  PIC     β”‚
IRQ 15           ─▢│          β”‚
                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
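One detail worth noting alongside the diagram: because vectors 0 through 31 are reserved for CPU exceptions, operating systems reprogram the PIC so that IRQ 0 is delivered at vector 32, IRQ 1 at vector 33, and so on. The offset arithmetic is trivial; the offset value below matches the common x86 convention.

```python
PIC_VECTOR_OFFSET = 32   # IRQ 0 lands on vector 32, clear of exceptions

def irq_to_vector(irq):
    """Map a legacy PIC IRQ line (0-15) to its remapped IDT vector."""
    if not 0 <= irq <= 15:
        raise ValueError("legacy PIC handles IRQ 0-15 only")
    return PIC_VECTOR_OFFSET + irq
```

This is why the IDT shown earlier lists the timer (IRQ 0) at vector 32 and the keyboard (IRQ 1) at vector 33.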

The PIC architecture was sufficient for single-processor systems, but its limitations became clear in multiprocessor environments. It could only deliver interrupts to a single CPU, and the number of IRQ lines was insufficient.

The APIC (Advanced Programmable Interrupt Controller) was introduced as a replacement. APIC consists of a local APIC present in each CPU core and an I/O APIC that collects external interrupts system-wide. The I/O APIC receives external interrupts and routes each one either to a designated CPU or, based on priority, to whichever CPU is best suited to handle it. This architecture enables distributing interrupt processing across multiple cores in multicore environments.

Modern systems also widely use MSI (Message Signaled Interrupts). Instead of physical interrupt lines, interrupts are delivered as memory write transactions, allowing hundreds of interrupt vectors without pin count constraints.

Interrupt Latency

Interrupt latency is the time from when an interrupt signal is raised to when the ISR's first instruction executes. This latency is determined by the sum of several factors: completing the current instruction, saving CPU state, IDT lookup, stack switching, and potential cache misses.

In general-purpose operating systems, interrupt latency is on the order of a few microseconds, which is not an issue for typical use. However, in real-time systems such as industrial control, audio processing, and autonomous driving, guaranteed maximum latency matters. This is why real-time extensions like the PREEMPT_RT patch have been introduced to the Linux kernel, providing optimizations that reduce the upper bound on interrupt latency.

Interrupts and Multitasking

Interrupts are also the foundation of multitasking. Preemptive multitasking is possible because of timer interrupts. The operating system configures a hardware timer to generate interrupts at regular intervals (typically 1 to 10 milliseconds). The scheduler is invoked in the timer ISR, determines which process to run next, and performs a context switch.
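The tick-driven scheduling loop can be sketched as a round-robin simulation. One loop iteration stands in for one timer interrupt; real schedulers weigh priorities, sleep states, and fairness, none of which is modeled here.

```python
from collections import deque

def run_with_timer(processes, ticks):
    """Preemption sketch: each timer interrupt invokes the scheduler,
    which switches to the next runnable process (round-robin)."""
    runqueue = deque(processes)
    timeline = []
    for _ in range(ticks):        # each iteration = one timer IRQ
        current = runqueue[0]
        timeline.append(current)  # current process runs this time slice
        runqueue.rotate(-1)       # context switch on the tick
    return timeline
```

No process can monopolize the CPU: even one that never yields is preempted at the next tick.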

Without this mechanism, each program would have to voluntarily yield the CPU for other programs to run. Early Windows (3.1) and classic Mac OS actually used this cooperative multitasking model, where a single program refusing to yield would freeze the entire system. Preemptive multitasking based on timer interrupts solves this problem at the hardware level.

DMA and Interrupts Working Together

When transferring large volumes of data from disk or network to memory, having the CPU copy data byte by byte is extremely inefficient. A DMA (Direct Memory Access) controller transfers data directly between devices and memory without CPU intervention.

DMA and interrupts work in concert. The CPU configures the DMA controller with transfer parameters (source address, destination address, size), and the DMA controller transfers data independently. The CPU can perform other work in the meantime. When the transfer completes, the DMA controller raises an interrupt to notify the CPU. Only upon receiving the interrupt does the CPU check the transfer result and perform follow-up processing.
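The sequence reads naturally as a callback: the "controller" performs the bulk copy and then signals completion. This is a simulation only; a real DMA engine is programmed through device registers and raises a hardware interrupt, not a Python function call.

```python
def dma_transfer(src, dst, on_complete):
    """DMA sketch: copy src into dst without a per-byte 'CPU' loop,
    then raise a completion interrupt (modeled as a callback)."""
    dst[:] = src                  # bulk transfer, CPU not involved
    on_complete(len(src))         # completion interrupt fires

# Usage: the 'CPU' sets up the transfer, then is free until the 'IRQ'.
events = []
buffer = [0] * 4
dma_transfer([1, 2, 3, 4], buffer,
             lambda n: events.append(f"irq: {n} bytes done"))
```

Between setup and the completion callback, the CPU in a real system runs other work; only the final interrupt pulls it back to inspect the result.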

This pattern is used throughout I/O in modern systems. Disk reads, network packet reception, and data exchange with GPUs all operate through the combination of DMA transfers and completion interrupts. Without interrupts, the CPU would have to poll until the transfer finished, unable to perform any other useful work during that time.

In the next post, we'll look at the memory hierarchy.