386 Paging Basics

We have seen how segmentation works in the 386. Now let's examine paging. For our purposes, segmentation on the 386 is defeated by running in "flat" mode. We can then treat intrasegment addresses as if they were linear addresses.
Paging uses a two-level scheme that permits sparse allocation of the address space, so that neither the whole address space nor all of its mapping information need be present. Otherwise, a process with a 4-gigabyte address space would require more than 4 Mbyte of page tables, even though only a few thousand pages might be active at any time. Typically, for our purposes, only three pages of paging structures are allocated per process (the page directory plus the page tables for the top and bottom of the address space). This is sufficient to run a 4-Mbyte process (instruction plus data size) with 4 Mbyte of stack. (Note that all processes run with a full-sized address space and can dynamically grow to use it.) This mechanism is quite successful in reducing memory-management overhead.
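The arithmetic behind these figures can be checked with a short C sketch. It assumes 4-Kbyte pages and 4-byte entries, as on the 386; the macro and function names are illustrative and do not come from the 386BSD sources.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        4096u                     /* bytes per page (12 offset bits) */
#define ENTRY_SIZE       4u                        /* bytes per PDE or PTE */
#define ENTRIES_PER_PAGE (PAGE_SIZE / ENTRY_SIZE)  /* 1024 entries per page */

/* Bytes of linear address space mapped by one page of PTEs. */
static uint64_t bytes_mapped_per_table(void)
{
    return (uint64_t)ENTRIES_PER_PAGE * PAGE_SIZE;
}

/* Bytes of paging structures: n_tables page tables plus one page directory. */
static uint64_t paging_overhead(uint32_t n_tables)
{
    assert(n_tables <= ENTRIES_PER_PAGE);  /* a directory holds at most 1024 PDEs */
    return ((uint64_t)n_tables + 1u) * PAGE_SIZE;
}
```

With these definitions, paging_overhead(1024) gives 4 Mbyte plus one 4-Kbyte directory page for a fully mapped 4-gigabyte space, while paging_overhead(2) gives the three pages (12 Kbyte) described above, each page table covering 4 Mbyte.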
The two-level scheme splits the incoming virtual address into three parts: 10 bits of page table directory index, 10 bits of page table index, and 12 bits of offset within a page. The page table directory is a single page of physical memory that simplifies the allocation of page table space by dividing the linear address space into 4-Mbyte chunks, one for each of its 1024 PDEs (Page Directory Entries). Each PDE gives the location of an underlying page table in physical memory.
Each page of a page table addressed by a PDE contains 1024 PTEs (Page Table Entries). A PTE is similar in form and function to a PDE. The major difference between a PDE and a PTE is that a PTE selects the physical page frame for the desired reference. The 12 least-significant bits of the address are then appended to the frame address as the offset within the frame, which gives the final physical address. This method is identical to that used in many other common microprocessors (the MC68030, Clipper, and NS32532, among others).
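The lookup described above can be sketched in C as a software walk over a simulated physical memory. This is only an illustration of the 10/10/12 split: the names (PG_V, PG_FRAME, translate) and the word-array model of memory are assumptions made for the example, and the real translation is done by the 386 hardware, not by code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PG_V     0x001u        /* entry is valid (present) */
#define PG_FRAME 0xFFFFF000u   /* upper 20 bits: physical frame address */

/* Split a 32-bit linear address into its three fields. */
static uint32_t pd_index(uint32_t la)    { return la >> 22; }            /* top 10 bits */
static uint32_t pt_index(uint32_t la)    { return (la >> 12) & 0x3FFu; } /* middle 10 bits */
static uint32_t page_offset(uint32_t la) { return la & 0xFFFu; }         /* low 12 bits */

/*
 * Translate linear address la to a physical address.
 * mem      simulated physical memory as 32-bit words (hypothetical model)
 * dir_base physical byte address of the page directory
 * Returns 0 and stores the result in *pa, or -1 if either level is invalid.
 */
static int translate(const uint32_t *mem, uint32_t dir_base,
                     uint32_t la, uint32_t *pa)
{
    uint32_t pde, pte;

    assert(mem != NULL && pa != NULL);

    pde = mem[(dir_base >> 2) + pd_index(la)];          /* first level: PDE */
    if (!(pde & PG_V))
        return -1;

    pte = mem[((pde & PG_FRAME) >> 2) + pt_index(la)];  /* second level: PTE */
    if (!(pte & PG_V))
        return -1;

    *pa = (pte & PG_FRAME) | page_offset(la);           /* frame plus offset */
    return 0;
}
```

For example, the linear address 0x00403ABC splits into directory index 1, table index 3, and offset 0xABC. If PDE 1 points at a page table whose PTE 3 names frame 0x00055000, the resulting physical address is 0x00055ABC.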
Each PDE and PTE may be marked either "invalid" (not currently used) or "valid" (the underlying page of physical memory is present). In addition, other attribute bits mark entries as "read only" or "read-write" and "supervisor" or "user." Because segmentation is not used to control memory protection, we keep processes honest by relying entirely on the paging mechanism's attributes for protection as well as for the allocation of memory.
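As a rough sketch of how these attribute bits combine, the following C fragment models the check for a single access. The bit positions (present in bit 0, read-write in bit 1, user in bit 2) are those of the 386 entry format; the macro names and the access_ok function are illustrative only, and the sketch assumes the 386 behavior in which supervisor-mode accesses are not restricted by the read-only or supervisor bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PG_V  0x001u  /* bit 0: valid (present) */
#define PG_RW 0x002u  /* bit 1: read-write when set, read only when clear */
#define PG_U  0x004u  /* bit 2: user when set, supervisor only when clear */

/* Would an access through this PDE/PTE pair be permitted? */
static bool access_ok(uint32_t pde, uint32_t pte, bool user, bool write)
{
    uint32_t both = pde & pte;  /* a permission must be granted at both levels */

    if (!(pde & PG_V) || !(pte & PG_V))
        return false;           /* invalid at either level: page fault */
    if (!user)
        return true;            /* supervisor: any valid page is accessible */
    if (!(both & PG_U))
        return false;           /* supervisor-only page touched from user mode */
    if (write && !(both & PG_RW))
        return false;           /* user write to a read-only page */
    return true;
}

/* Usage example: a user-readable, read-only page under a permissive PDE. */
static void protection_examples(void)
{
    uint32_t pde = PG_V | PG_RW | PG_U;
    uint32_t pte = PG_V | PG_U;

    assert(access_ok(pde, pte, true, false));   /* user read allowed */
    assert(!access_ok(pde, pte, true, true));   /* user write refused */
    assert(access_ok(pde, pte, false, true));   /* supervisor write allowed */
}
```

Because a user-mode permission has to be present in both the PDE and the PTE, a single PDE can restrict an entire 4-Mbyte region, while individual PTEs refine the protection page by page.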
The current Berkeley UNIX virtual memory management subsystem was originally designed for use with a VAX, and as such has no support for page directories. For that matter, the 386 doesn't know of such VAX concepts as P0 and P1 address spaces for instruction/data and stack nor of page table-length registers. Currently, these are simulated in 386BSD. However, work is underway to revise the entire virtual memory system to permit more generalized operation over all supported Berkeley UNIX platforms, now that the demands of each platform have been made obvious.
Portions of the VAX model were simulated using code written by Mike Hibler at the University of Utah to support the 68030's paged memory management. Because the 386 code is so similar, we used conditional compilation so that the 68030 and 386 versions could share it -- an odd couple indeed.