Computer Architecture 05 - CPU Privilege Levels and Protection
Why CPUs distinguish privilege levels, and how x86 protection rings and ARM exception levels protect the system
Why Does a CPU Need Privilege Levels?
Early computers had no concept of privilege. Programs could freely read and write any memory address and directly access any hardware port. This was not a problem when only a single program ran at a time, but the situation changed fundamentally with the advent of multiprogramming, where several programs execute concurrently.
Is it safe for every program to have equal privileges? It is not. If a user program could overwrite another program's memory, directly manipulate the disk's file system, or execute instructions that halt the entire system, stable operation would be impossible. For the operating system to manage resources and guarantee isolation between programs, the hardware must be able to enforce what each piece of code can and cannot do.
This is why CPU privilege levels exist. Privilege levels are not a software convention but a constraint enforced by hardware. When unprivileged code attempts a forbidden operation, the CPU raises an exception, and the operation is not executed.
From Real Mode to Protected Mode
The history of the x86 processor illustrates how the need for privilege was recognized. The 8086 processor, released in 1978, operated exclusively in Real Mode. In Real Mode, all code could directly access physical memory and execute any instruction without restriction. There was no memory protection, no privilege separation.
The 80286 processor, introduced in 1982, brought Protected Mode. In Protected Mode, memory access is controlled through segment descriptors, and each segment is assigned access permissions. Whenever code loads a segment selector, the CPU checks whether the current privilege level qualifies it to use that segment, and every subsequent access is checked against the segment's limit and type.
Modern x86 processors still start in Real Mode when powered on, maintaining backward compatibility. After the bootloader completes initial setup and switches to Protected Mode (or 64-bit Long Mode), hardware-level privilege protection becomes active.
x86 Protection Rings
The x86 architecture defines four privilege levels, called protection rings, numbered Ring 0 through Ring 3.
┌──────────────────────────────────────┐
│ Ring 3                               │
│ User Applications                    │
│  ┌────────────────────────────────┐  │
│  │ Ring 2                         │  │
│  │ Device Drivers (rarely used)   │  │
│  │  ┌──────────────────────────┐  │  │
│  │  │ Ring 1                   │  │  │
│  │  │ OS Services (rarely      │  │  │
│  │  │ used)                    │  │  │
│  │  │  ┌────────────────────┐  │  │  │
│  │  │  │ Ring 0             │  │  │  │
│  │  │  │ Kernel             │  │  │  │
│  │  │  └────────────────────┘  │  │  │
│  │  └──────────────────────────┘  │  │
│  └────────────────────────────────┘  │
└──────────────────────────────────────┘
Ring 0 holds the highest privilege. It can access all memory, execute any CPU instruction, and modify hardware settings. The operating system kernel runs at Ring 0. Ring 3 is the lowest privilege level, where ordinary user programs execute. Code at Ring 3 can only access memory allocated to it and cannot execute privileged instructions.
Ring 1 and Ring 2 were originally designed for OS services and device drivers, but in practice they are rarely used. Most operating systems adopted a two-level model using only Ring 0 and Ring 3. Some virtualization environments did make use of these intermediate rings, however: before hardware virtualization extensions were available, 32-bit paravirtualized Xen, for example, ran guest kernels in Ring 1.
The Current Privilege Level (CPL) is stored in the lower two bits of the CS register, which holds the code segment selector. The CPU consults the CPL whenever an instruction or memory access is subject to a privilege check.
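To make the bit layout concrete, here is a minimal sketch that splits a 16-bit segment selector into its fields. The function names and the two sample selector values are illustrative assumptions, not something taken from a particular operating system.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "index": selector >> 3,                         # bits 15..3: descriptor table index
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2: table indicator
        "rpl": selector & 0b11,                         # bits 1..0: privilege level
    }


def current_privilege_level(cs: int) -> int:
    """The CPL is the low two bits of the selector held in CS."""
    return cs & 0b11


# Two illustrative CS values: one kernel-style, one user-style.
for cs in (0x10, 0x33):
    print(hex(cs), decode_selector(cs), "CPL =", current_privilege_level(cs))
```

A selector of 0x33 decodes to GDT index 6 with privilege level 3, while 0x10 decodes to GDT index 2 with privilege level 0.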
Privileged Instructions
Instructions that can only execute at Ring 0 are called privileged instructions. If Ring 3 code attempts to execute one, the CPU raises a General Protection Fault (#GP).
| Instruction | Operation | Why Privileged |
|---|---|---|
| HLT | Halt CPU | Affects entire system |
| LGDT / LIDT | Load GDT/IDT register | Modifies memory protection scheme |
| MOV CR0, ... | Modify control register | Changes protected mode/paging settings |
| IN / OUT | I/O port access | Direct hardware control (IOPL-sensitive: allowed below Ring 0 only if the I/O privilege level or the TSS I/O permission bitmap grants access) |
| WRMSR | Write model-specific register | Alters CPU behavior |
This distinction is not a mere convention but is implemented at the silicon level. The processor checks the current CPL as part of executing the instruction, and if privilege is insufficient, it raises an exception before the instruction has any architectural effect. Software running at Ring 3 cannot disable or skip this check.
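The check can be pictured with a toy model. This is only a sketch of the rule, not of how a real processor implements it: the `execute` function and the `GeneralProtectionFault` class are hypothetical stand-ins, and IN/OUT are left out of the set because their legality depends on the I/O privilege level rather than on CPL 0 alone.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the #GP exception the CPU would raise."""


# Instructions from the table above that require CPL 0.
PRIVILEGED = {"HLT", "LGDT", "LIDT", "MOV CR0", "WRMSR"}


def execute(mnemonic: str, cpl: int) -> str:
    """Refuse a privileged instruction unless the caller runs at CPL 0."""
    if mnemonic in PRIVILEGED and cpl != 0:
        # The fault is raised instead of performing the operation.
        raise GeneralProtectionFault(f"{mnemonic} attempted at CPL {cpl}")
    return f"{mnemonic} executed at CPL {cpl}"


print(execute("HLT", cpl=0))
print(execute("ADD", cpl=3))
try:
    execute("HLT", cpl=3)
except GeneralProtectionFault as fault:
    print("#GP:", fault)
```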
Privilege Level Transitions: Gate Descriptors
When a Ring 3 user program needs to read a file or send a network packet, it requires the kernel's assistance. But kernel code must execute at Ring 0. So how does the transition from Ring 3 to Ring 0 occur?
Allowing jumps to arbitrary code would render protection meaningless. If Ring 3 code could jump to any kernel address, it would be possible to enter code paths that bypass security checks. Therefore, x86 uses gate descriptors to restrict transitions to precisely the permitted entry points.
There are three main types of gate descriptors. Call gates provide an entry path to higher-privilege code via the CALL FAR instruction. Interrupt gates transfer control to an interrupt handler when a hardware interrupt or INT instruction occurs, automatically disabling maskable interrupts (by clearing the IF flag) during the transition. Trap gates are similar to interrupt gates but differ in that they do not disable interrupts.
Each gate records a target segment, an offset, and the minimum privilege level required to use that gate. When the CPU passes through a gate to a more privileged ring, it also performs a stack switch. If the kernel kept running on the Ring 3 stack, it would depend on memory that the user program controls, creating security vulnerabilities. The TSS (Task State Segment) stores stack pointers for Rings 0 through 2, and during a privilege transition the CPU automatically switches to the target ring's stack.
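The gate check and the stack switch can be sketched as a small simulation. Everything here is a simplified model under stated assumptions: `Gate`, `Cpu`, and `enter_through_gate` are hypothetical names, the addresses are made up, and the target ring is stored directly on the gate even though real hardware derives it from the target code segment's descriptor.

```python
from dataclasses import dataclass


class GeneralProtectionFault(Exception):
    """Stands in for the #GP exception."""


@dataclass
class Gate:
    target_offset: int  # entry point inside the target code segment
    target_ring: int    # simplification: hardware reads this from the target segment
    dpl: int            # numerically highest CPL allowed to invoke the gate


@dataclass
class Cpu:
    cpl: int
    ip: int
    sp: int


def enter_through_gate(cpu: Cpu, gate: Gate, tss_stacks: dict) -> tuple:
    """Model a software-initiated transfer (INT n or CALL FAR) through a gate."""
    if cpu.cpl > gate.dpl:
        raise GeneralProtectionFault(
            f"CPL {cpu.cpl} may not use a gate with DPL {gate.dpl}")
    saved = (cpu.cpl, cpu.ip, cpu.sp)          # old context, kept for the return
    if gate.target_ring < cpu.cpl:             # moving to a more privileged ring
        cpu.sp = tss_stacks[gate.target_ring]  # stack comes from the TSS, not the caller
    cpu.cpl = gate.target_ring
    cpu.ip = gate.target_offset                # only the gate's entry point is reachable
    return saved


# Illustrative values only.
tss_stacks = {0: 0xFFFF_8000_0000_F000, 1: 0, 2: 0}
syscall_gate = Gate(target_offset=0xFFFF_8000_0010_0000, target_ring=0, dpl=3)
kernel_only_gate = Gate(target_offset=0xFFFF_8000_0020_0000, target_ring=0, dpl=0)

cpu = Cpu(cpl=3, ip=0x40_1000, sp=0x7FFF_F000)
saved = enter_through_gate(cpu, syscall_gate, tss_stacks)
print("now at ring", cpu.cpl, "stack", hex(cpu.sp), "saved", saved)
```

Note that the caller never supplies the destination address or the new stack pointer; both come from tables that only Ring 0 code can write, which is what makes the entry point trustworthy.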
System Calls: Modern Ring Transitions
In modern operating systems, the most frequent Ring 3 to Ring 0 transition is the system call. Early implementations used software interrupts like INT 0x80, but interrupt-based transitions incurred significant overhead from IDT lookups, stack switches, and pipeline flushes.
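To address this, x86 introduced the SYSENTER/SYSEXIT (Intel) and SYSCALL/SYSRET (AMD, later adopted by Intel for 64-bit mode) instructions. These bypass gate descriptors and transition directly to a kernel entry point pre-configured in MSRs (Model Specific Registers). Segment selectors are replaced according to hardwired rules, minimizing memory lookups and greatly reducing transition cost. Stack handling differs between the two: SYSENTER loads the kernel stack pointer from an MSR, whereas SYSCALL leaves the stack pointer unchanged, so the kernel's entry code switches to the kernel stack itself.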
User Program (Ring 3)
                │
                │ SYSCALL
                ▼
┌─────────────────────────────────┐
│ Load kernel entry from MSR      │
│ Change CPL to 0                 │
│ Switch to kernel stack          │
└───────────────┬─────────────────┘
                ▼
Kernel Syscall Handler (Ring 0)
                │
                │ Process request
                │
                │ SYSRET
                ▼
User Program (Ring 3)
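The register-level effect of the 64-bit SYSCALL/SYSRET pair can be sketched as follows. This is a simplified model, not a complete description: the `Cpu` class and the addresses are hypothetical, segment selector loading is omitted, and `LSTAR` and `FMASK` stand for the IA32_LSTAR and IA32_FMASK registers that hold the entry point and the flag mask.

```python
from dataclasses import dataclass, field

IF_FLAG = 1 << 9  # interrupt enable flag in RFLAGS


@dataclass
class Cpu:
    rip: int   # address of the next instruction to execute
    rsp: int
    rflags: int
    cpl: int = 3
    rcx: int = 0
    r11: int = 0
    msr: dict = field(default_factory=dict)


def syscall(cpu: Cpu) -> None:
    """Enter the kernel at the address configured in the LSTAR MSR."""
    cpu.rcx = cpu.rip                  # return address is saved in RCX
    cpu.r11 = cpu.rflags               # flags are saved in R11
    cpu.rflags &= ~cpu.msr["FMASK"]    # clear the flags selected by FMASK
    cpu.rip = cpu.msr["LSTAR"]         # fixed entry point, no IDT lookup
    cpu.cpl = 0
    # RSP is intentionally untouched: kernel entry code switches stacks itself.


def sysret(cpu: Cpu) -> None:
    """Return to user mode using the values saved by syscall()."""
    cpu.rip = cpu.rcx
    cpu.rflags = cpu.r11
    cpu.cpl = 3


# Illustrative addresses and MSR contents.
cpu = Cpu(rip=0x40_1002, rsp=0x7FFF_F000, rflags=0x202,
          msr={"LSTAR": 0xFFFF_FFFF_8100_0000, "FMASK": IF_FLAG})
syscall(cpu)
print("in kernel:", hex(cpu.rip), "CPL", cpu.cpl, "IF set:", bool(cpu.rflags & IF_FLAG))
sysret(cpu)
print("back in user:", hex(cpu.rip), "CPL", cpu.cpl)
```

Because the return address and flags travel in registers rather than on a stack, the transition needs no memory accesses at all, which is where most of the saving over an interrupt gate comes from.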
Linux further reduces this cost with the vDSO (virtual Dynamic Shared Object). System calls that only need to read kernel data, such as gettimeofday, execute directly from a read-only page that the kernel maps into user space, eliminating the ring transition entirely.
Segment Protection and Page-Level Protection
x86 protection operates at two layers: segment-level protection and page-level protection.
Each segment descriptor specifies a DPL (Descriptor Privilege Level). When the CPU accesses a segment, it compares the current CPL with the DPL. If the CPL number is greater than the DPL (meaning lower privilege), access is denied. However, modern 64-bit operating systems configure segments in a flat model, greatly reducing the role of segment protection. Practical memory protection is handled by page tables.
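The comparison is small enough to write out. The sketch below assumes a data segment and adds one detail not covered in the text: the selector's requested privilege level (RPL) also takes part, so the effective privilege is the numerically larger of CPL and RPL. The function name is hypothetical.

```python
def data_segment_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """Privilege check applied when a data segment selector is loaded.

    Larger numbers mean less privilege, so access is allowed only when
    the weaker of CPL and RPL is still at least as privileged as the DPL.
    """
    return max(cpl, rpl) <= dpl


for cpl, rpl, dpl in [(0, 0, 0), (3, 3, 3), (3, 3, 0), (0, 3, 0)]:
    verdict = "allowed" if data_segment_access_allowed(cpl, rpl, dpl) else "denied"
    print(f"CPL={cpl} RPL={rpl} DPL={dpl}: {verdict}")
```

The last case shows why the RPL matters: even Ring 0 code is denied when it uses a selector tagged with privilege level 3, which lets a kernel act on behalf of a caller without exceeding that caller's rights.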
Each page table entry contains a User/Supervisor bit. Pages marked as Supervisor are accessible only in supervisor mode (CPL 0 through 2, which in practice means Ring 0). Additionally, the Read/Write bit and the No-Execute (NX) bit provide fine-grained per-page access control. These bits are precisely why kernel memory is unreadable from user space.
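A rough sketch of how these bits combine into an access decision follows. The bit positions match the x86-64 page table entry layout, but the model is deliberately simplified: it looks at a single entry rather than every level of the page table hierarchy, it assumes write protection applies to supervisor code as well (CR0.WP set), and it ignores features such as SMEP and SMAP. The function name is hypothetical.

```python
PTE_PRESENT = 1 << 0    # page is mapped
PTE_WRITABLE = 1 << 1   # Read/Write bit
PTE_USER = 1 << 2       # User/Supervisor bit (set = user-accessible)
PTE_NX = 1 << 63        # No-Execute bit


def access_allowed(pte: int, *, user_mode: bool,
                   write: bool = False, execute: bool = False) -> bool:
    """Decide whether one access passes the page-level checks."""
    if not pte & PTE_PRESENT:
        return False                      # unmapped page: page fault
    if user_mode and not pte & PTE_USER:
        return False                      # supervisor page touched from Ring 3
    if write and not pte & PTE_WRITABLE:
        return False                      # read-only page (assumes CR0.WP = 1)
    if execute and pte & PTE_NX:
        return False                      # instruction fetch from a no-execute page
    return True


kernel_data = PTE_PRESENT | PTE_WRITABLE | PTE_NX
user_code = PTE_PRESENT | PTE_USER

print(access_allowed(kernel_data, user_mode=True))                 # user read of kernel page
print(access_allowed(kernel_data, user_mode=False, write=True))    # kernel write
print(access_allowed(user_code, user_mode=True, execute=True))     # user runs its own code
print(access_allowed(user_code, user_mode=True, write=True))       # user writes its code page
```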
Following the Meltdown vulnerability, KPTI (Kernel Page Table Isolation) was introduced, which unmaps most kernel page table entries when executing in user mode, further strengthening protection.
ARM Exception Levels
The ARM architecture separates privilege differently from x86's protection rings. ARM (AArch64) uses the concept of exception levels, with four stages from EL0 to EL3.
| Level | Purpose | x86 Equivalent |
|---|---|---|
| EL0 | User applications | Ring 3 |
| EL1 | OS kernel | Ring 0 |
| EL2 | Hypervisor | - |
| EL3 | Secure monitor (TrustZone) | - |
The most significant difference from x86 is that EL2 and EL3 are officially defined at the architecture level. In x86, hypervisors operate in a separate root/non-root mode through extensions like VT-x, whereas in ARM they are naturally integrated as part of the exception level hierarchy. EL3 is where the ARM TrustZone secure monitor executes, managing transitions between the Secure World and the Non-secure World.
Privilege transitions in ARM occur through exceptions. A transition from a lower level to a higher level can only happen by taking an exception, such as an interrupt, a fault, or a system call instruction (SVC to reach the kernel, HVC the hypervisor, SMC the secure monitor), and the return from a higher level to a lower level uses the ERET instruction. This structure achieves the same purpose as x86's gate descriptors while providing a more streamlined model.
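The up-by-exception, down-by-ERET rule can be modeled in a few lines. This is a toy sketch under simplifying assumptions: the `Core` class is hypothetical, the saved-level list stands in for the per-level saved state registers, and routing controls that real systems configure are ignored. The comments mention SVC and HVC, the AArch64 instructions that request an exception to EL1 and EL2 respectively.

```python
class Core:
    """Toy model of AArch64 exception level changes."""

    def __init__(self) -> None:
        self.el = 0        # start in user mode (EL0)
        self._saved = []   # stands in for the saved state of interrupted levels

    def take_exception(self, target_el: int) -> None:
        """Exceptions go to the same or a higher level, never to EL0."""
        if not max(self.el, 1) <= target_el <= 3:
            raise ValueError(f"cannot take an exception from EL{self.el} to EL{target_el}")
        self._saved.append(self.el)
        self.el = target_el

    def eret(self) -> None:
        """Return to the level that was interrupted."""
        if not self._saved:
            raise RuntimeError("ERET without a pending exception return")
        self.el = self._saved.pop()


core = Core()
core.take_exception(1)   # e.g. an application executes SVC
core.take_exception(2)   # e.g. the kernel executes HVC
print("running at EL", core.el)
core.eret()
core.eret()
print("back at EL", core.el)
```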
The Significance of Hardware Enforcement
The fact that privilege levels are enforced by hardware rather than software carries decisive significance. Isolation does not depend on every program behaving well: provided the kernel has set up the page tables and descriptors correctly, the hardware applies its checks to every instruction and memory access. No matter how malicious a user program may be, as long as it runs at Ring 3, it cannot directly read another process's memory or manipulate hardware.
Of course, perfect protection does not exist. As microarchitectural vulnerabilities like Spectre and Meltdown have shown, hardware protection can have gaps in its implementation details. But the very fact that these vulnerabilities made major headlines demonstrates how fundamental and important hardware-level protection is as a security boundary.
In the next post, we'll look at interrupts and exceptions.