dd86k's blog

Machine code enthusiast

The Legacy of Paging in x86

Author: dd
Published: August 13, 2020
Last modified: December 25, 2022 at 15h02

I recently had a talk with someone about paging and Intel’s 5th paging level, then grew curious about what happened over the years.

2 levels

The paging mechanism was first introduced in the Intel 80386 and included two levels: a page directory and page tables. This, alongside protected mode, lets the operating system split memory into regions with finer-grained access control, which is useful when spawning new processes.

Address translation using 4 KiB pages

Address widths grew along the way: the 8086 is limited to 20 bits (which is also what a VM86 task gets to work with) and the 80286 to 24 bits. The 80386 doesn’t truncate anything and goes up to 32 bits.

For an example of a linear address translation, imagine the address 0x00402010.

Let’s open it up! The top 10 bits (the address shifted right by 22) select directory entry 1 (zero-based): that’s the 0x00400000 part. The next 10 bits (shifted right by 12, then masked to 10 bits) select page table entry 2 (zero-based): that’s the 0x2000 part. The low 12 bits, 0x010, are the offset within the page. Neat, isn’t it?
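To make the shifting and masking concrete, here is a minimal Python sketch of the split for classic 32-bit paging with 4 KiB pages. The function and constant names are mine, purely for illustration; a real page walk would also read the directory and table entries from memory, which this doesn't do.

```python
# Field widths for classic 32-bit paging with 4 KiB pages:
# 10-bit directory index, 10-bit table index, 12-bit page offset.
OFFSET_BITS = 12
INDEX_BITS = 10

def split_linear_32(addr):
    """Split a 32-bit linear address into (directory, table, offset)."""
    index_mask = (1 << INDEX_BITS) - 1
    offset = addr & ((1 << OFFSET_BITS) - 1)
    table = (addr >> OFFSET_BITS) & index_mask
    directory = (addr >> (OFFSET_BITS + INDEX_BITS)) & index_mask
    return directory, table, offset

print(split_linear_32(0x00402010))  # (1, 2, 16)
```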

3 levels

The Physical Address Extension (PAE) adds PAE paging, which squeezes in another level called the Directory Pointer (the page-directory-pointer table). In simple terms, PAE makes it possible to reach physical memory above 4 GiB while still operating in 32-bit protected mode (IA-32, x86-32, name it as you wish).

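Now you may ask, why another level? PAE was around well before AMD64 and Intel 64 (it showed up with the Pentium Pro). Since its entries are 64 bits wide, a 4 KiB table only holds 512 of them instead of 1024, so each level indexes 9 bits instead of 10. Two levels plus the 12-bit offset only cover 30 bits of the linear address, and the directory pointer picks up the remaining 2.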
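The need for a third level falls out of some simple arithmetic on entry sizes. Here is a small sketch of it, assuming 4 KiB paging structures with 4-byte entries for legacy paging and 8-byte entries for PAE (the names are made up for this example):

```python
PAGE_SIZE = 4096  # bytes in one paging structure

def index_bits(entry_size):
    """Address bits needed to index a table of entry_size-byte entries."""
    entries = PAGE_SIZE // entry_size
    return entries.bit_length() - 1  # log2, valid for powers of two

legacy = index_bits(4)         # 1024 entries per table
pae = index_bits(8)            # 512 entries per table
leftover = 32 - 12 - 2 * pae   # linear address bits two PAE levels can't cover
print(legacy, pae, leftover)   # 10 9 2
```

Those 2 leftover bits are what the extra level indexes, which is why it only has four entries.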

Address translation using 4 KiB pages and the Directory Pointer

PAE paging uses 64-bit entries, but still translates 32-bit linear addresses, now to physical addresses of up to 52 bits, using four (4) PDPTE registers to select the page directory.
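As a sketch of how the 32-bit linear address gets carved up under PAE with 4 KiB pages (2 bits for the directory pointer, 9 for the directory, 9 for the table, 12 for the offset), here is the same kind of split as before. The function name is hypothetical:

```python
def split_linear_pae(addr):
    """Split a 32-bit linear address into (pdpte, directory, table, offset)."""
    offset = addr & 0xFFF            # low 12 bits
    table = (addr >> 12) & 0x1FF     # 9 bits
    directory = (addr >> 21) & 0x1FF # 9 bits
    pdpte = (addr >> 30) & 0x3       # 2 bits, one of the four PDPTEs
    return pdpte, directory, table, offset

print(split_linear_pae(0x00402010))  # (0, 2, 2, 16)
```

Note that the same address that landed in directory 1 with 2-level paging now lands in directory 2, since each directory entry covers 2 MiB instead of 4 MiB.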

PAE also adds support for 2 MiB pages, which get rid of the page selector when translating addresses: the directory entry maps the page directly and the low 21 bits become the offset.
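With a 2 MiB page the table index disappears and its 9 bits join the offset. A minimal sketch, again with a made-up function name:

```python
def split_linear_pae_2m(addr):
    """Split a 32-bit linear address into (pdpte, directory, offset) for a 2 MiB page."""
    offset = addr & 0x1FFFFF          # low 21 bits: 12 + the 9 former table bits
    directory = (addr >> 21) & 0x1FF  # 9 bits
    pdpte = (addr >> 30) & 0x3        # 2 bits
    return pdpte, directory, offset

pdpte, directory, offset = split_linear_pae_2m(0x00402010)
print(pdpte, directory, hex(offset))  # 0 2 0x2010
```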

Address translation using 2 MiB pages

4 levels

With the introduction of the 64-bit instruction set, it was obvious that 32 bits were no longer the limit, and so AMD introduced the 4th level. This widens the directory pointer index from 2 to 9 bits and adds the 9-bit PML4 field, for 48-bit linear addresses.
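Here is a short sketch of the 4-level split with 4 KiB pages: four 9-bit indexes (PML4, directory pointer, directory, table) followed by the 12-bit offset. The function name is an assumption for illustration, and the sketch ignores the canonical-address check on the upper 16 bits.

```python
def split_linear_4level(addr):
    """Split a 48-bit linear address into (pml4, pdpt, directory, table, offset)."""
    offset = addr & 0xFFF
    # Each level indexes 9 bits; PML4 starts at bit 39.
    indexes = [(addr >> shift) & 0x1FF for shift in (39, 30, 21, 12)]
    return (*indexes, offset)

print(split_linear_4level(0x00007FFFFFFFF010))  # (255, 511, 511, 511, 16)
```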

Address translation using 4 KiB pages and 4 levels

The 4th level still supports 2 MiB pages.

Address translation using 2 MiB pages

It also adds 1 GiB pages, which get rid of the directory field as well.
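The three page sizes only differ in where the walk stops and how many low bits are left for the offset. A rough sketch, with a hypothetical helper and page-size labels of my own choosing:

```python
# Offset width for each page size under 4-level paging.
PAGE_SHIFT = {"4K": 12, "2M": 21, "1G": 30}

def split_by_page_size(addr, page="4K"):
    """Return ([indexes from PML4 downward], offset) for the given page size."""
    shift = PAGE_SHIFT[page]
    offset = addr & ((1 << shift) - 1)
    # Keep only the levels above the one that maps the page.
    indexes = [(addr >> s) & 0x1FF for s in (39, 30, 21, 12) if s >= shift]
    return indexes, offset

for size in ("4K", "2M", "1G"):
    indexes, offset = split_by_page_size(0x40402010, size)
    print(size, indexes, hex(offset))
```

For 0x40402010 this walks four indexes with 4 KiB pages, three with 2 MiB pages and two with 1 GiB pages, with the offset growing by 9 bits each time.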

Address translation using 1 GiB pages

5 levels

Since recent processors currently implement 46 bits of physical address space and the full 48 bits of linear address space, Intel has been keen on adding a 5th level.

In a 2017 whitepaper, Intel proposed the 5th level, which extends linear addresses up to 57 bits. That sounds scary (since it’s yet another level to traverse), but Intel pretty much has to do it (since current processors have kinda reached that limit already). Besides, this level is mostly welcomed by the Linux kernel (trust me, it *wants* levels, otherwise it has to emulate dummy levels), and by VMMs, to avoid generating #PF exceptions when the guest wants to poke higher than 48 bits of the space.
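The 57 comes straight from stacking one more 9-bit index on top of the 48-bit layout. A quick sketch of the arithmetic and of the 5-level split with 4 KiB pages (function names are mine, not from the whitepaper):

```python
def linear_bits(levels):
    """Linear address width for `levels` 9-bit levels plus a 12-bit offset."""
    return 12 + 9 * levels

def split_linear_5level(addr):
    """Split a 57-bit linear address into (pml5, pml4, pdpt, directory, table, offset)."""
    offset = addr & 0xFFF
    indexes = [(addr >> shift) & 0x1FF for shift in (48, 39, 30, 21, 12)]
    return (*indexes, offset)

print(linear_bits(4), linear_bits(5))     # 48 57
print(2 ** linear_bits(4) // 2 ** 40)     # 256 (TiB)
print(2 ** linear_bits(5) // 2 ** 50)     # 128 (PiB)
print(split_linear_5level(1 << 48))       # (1, 0, 0, 0, 0, 0)
```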

Address translation using 4 KiB pages and 5 levels of paging

I’m currently unaware of pages bigger than 1 GiB being supported, but I wouldn’t be surprised if they show up eventually!

6 levels?

With the remaining bits available, it is possible that in the distant future Intel, or AMD, may use the rest of the bits to introduce another level, or possibly even several narrower ones, since only 7 of the 64 bits are left and a full 9-bit level no longer fits. I guess we’ll have to see where x86 goes in that future (possibly in a dark corner, but who knows).