Memory Management Unit (MMU) Demystified For Embedded System
by cawan (cawan[at]ieee.org or chuiyewleong[at]hotmail.com)
In embedded systems, RISC architecture is the most common choice of system
processor. In order to support multitasking, each running process should have its
own memory space. In other words, the memory spaces of the running processes should
be isolated from each other. In addition, the kernel space should be
separated from the user space, in order to ensure the stability of the system.
Besides, it is also important to map contiguous virtual memory blocks onto
non-contiguous physical memory pages. So, some kind of memory mapping or
translation mechanism is needed for memory management to support multitasking
properly. In x86 architecture, CR3 always points to the page directory of the
current context. With classic 32-bit paging and 4kB pages, a virtual address is
segmented into 3 portions, where the first portion (10 bits) is the index into the
page directory, the second portion (10 bits) is the index into the page table pointed
to by that page directory entry, and the third portion (12 bits) is the offset within
the physical memory page. Yes, it is about the translation process from virtual
address to physical address. But how about the translation process of the RISC
architectures which are commonly used in embedded systems? The answer is simple: it
is almost the same.
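To make the three portions concrete, here is a minimal C sketch that splits a 32-bit
virtual address the way classic (non-PAE) x86 paging with 4kB pages does. The struct
and function names are invented for this illustration and do not come from any real
kernel or library.

```c
#include <stdint.h>

/* Hypothetical helper type: the three portions of a 32-bit virtual address
 * under classic (non-PAE) x86 paging with 4 kB pages. */
struct x86_va_parts {
    uint32_t dir_index;   /* bits 31..22: entry in the page directory */
    uint32_t table_index; /* bits 21..12: entry in the page table     */
    uint32_t offset;      /* bits 11..0 : byte within the 4 kB page   */
};

/* Split a virtual address into directory index, table index and offset. */
static struct x86_va_parts x86_split_va(uint32_t va)
{
    struct x86_va_parts p;
    p.dir_index   = (va >> 22) & 0x3FFu;
    p.table_index = (va >> 12) & 0x3FFu;
    p.offset      = va & 0xFFFu;
    return p;
}
```

For example, x86_split_va(0xC0101234) gives directory index 0x300, table index 0x101
and offset 0x234.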
In RISC architecture, the translation process is usually handled automatically
by the Memory Management Unit, or MMU. The MMU is optional, and it is activated through
a register setting in the appropriate coprocessor of the respective RISC processor. On
the ARM platform, memory management is controlled by the system control coprocessor,
also well known as coprocessor 15, or CP15 in short. In fact, the ARM architecture
defines 16 coprocessors, numbered from 0 to 15, where coprocessor 15 (CP15) is
reserved for system control purposes. In CP15, there are 16 primary registers (C0 to
C15) available to control the behavior of the MMU, such as the handling of the TLB or
the translation process from virtual address to physical address, as mentioned earlier.
However, in order to enable the MMU, it is only necessary to set bit 0 of
register 1 (C1) of CP15. Yes, it is straightforward. Well, once the MMU is activated,
as with CR3 in x86, a register which always points to the page directory of the
current context is needed to start the translation process. Is there anything
similar in ARM architecture? Yes, it is register 2 (C2) in CP15. In ARM terminology,
C2 holds the physical address of the Translation Table Base (TTB). So,
much like x86, a virtual address on the ARM platform is segmented into 3
portions, and after 2 levels of page table lookup, the final memory page is found
and the last portion of the virtual address gives the exact location within that
memory page. The final memory page can be a tiny page (1kB), small page (4kB),
large page (64kB), or section (1MB). Most of the time, the small page is the favorite.
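The two-level lookup for a small page can be sketched in plain C against a simulated
memory, so that it runs on any host. This is only an illustration under stated
assumptions: it uses the ARMv4/ARMv5 descriptor encodings (level-1 type 01 for a
coarse page table, level-2 type 10 for a small page), it ignores permissions, domains
and every other page size, and all names (sim_mem, sim_map_small_page,
arm_sim_translate) are made up for this example.

```c
#include <stdint.h>

#define SIM_MEM_BYTES 0x10000u               /* hypothetical 64 kB of simulated RAM */
static uint32_t sim_mem[SIM_MEM_BYTES / 4];  /* word-addressed backing store        */

static uint32_t sim_read32(uint32_t pa)              { return sim_mem[pa / 4]; }
static void     sim_write32(uint32_t pa, uint32_t v) { sim_mem[pa / 4] = v; }

/* Install one small-page mapping va -> pa_page.
 * ttb:     base of the level-1 table (4096 entries, indexed by va[31:20])
 * l2_base: base of a coarse level-2 table (256 entries, indexed by va[19:12]) */
static void sim_map_small_page(uint32_t ttb, uint32_t l2_base,
                               uint32_t va, uint32_t pa_page)
{
    /* Level-1 descriptor: coarse table base in bits 31..10, type bits = 01. */
    sim_write32(ttb + ((va >> 20) << 2), (l2_base & 0xFFFFFC00u) | 0x1u);
    /* Level-2 descriptor: page base in bits 31..12, type bits = 10. */
    sim_write32(l2_base + (((va >> 12) & 0xFFu) << 2),
                (pa_page & 0xFFFFF000u) | 0x2u);
}

/* Walk the tables for va. Returns 0 and stores the physical address in *pa,
 * or returns -1 if either level is not a coarse-table / small-page entry. */
static int arm_sim_translate(uint32_t ttb, uint32_t va, uint32_t *pa)
{
    uint32_t l1 = sim_read32(ttb + ((va >> 20) << 2));
    if ((l1 & 0x3u) != 0x1u)
        return -1;                            /* fault or unsupported type */

    uint32_t l2 = sim_read32((l1 & 0xFFFFFC00u) + (((va >> 12) & 0xFFu) << 2));
    if ((l2 & 0x3u) != 0x2u)
        return -1;                            /* fault or unsupported type */

    *pa = (l2 & 0xFFFFF000u) | (va & 0xFFFu); /* page base + 12-bit offset */
    return 0;
}
```

With the level-1 table placed at 0x4000 and a coarse table at 0x8000 in the simulated
memory, mapping virtual address 0x00123456 to the physical page 0xABCDE000 makes
arm_sim_translate return 0xABCDE456, while an unmapped address is reported as a fault.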
As an exception, a section involves only 1 level of page table lookup. In other
words, a virtual address that refers to a section is segmented into only 2
portions, instead of the 3 portions used for a small page. As additional info,
the registers of a coprocessor cannot be accessed directly: every read or write
has to go through a general purpose register. The related instructions are mrc
(read from a coprocessor register) and mcr (write to a coprocessor register).
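The section case and the coprocessor access can be sketched as follows. The first two
functions are plain C and run anywhere. The mrc/mcr wrappers are guarded so that they
are only compiled by GCC-style compilers for 32-bit ARM targets, and they must be
executed in a privileged mode. All function names are invented for this example, and
the section descriptor layout assumed is the classic one with the 1MB-aligned base in
bits 31..20.

```c
#include <stdint.h>

/* Section lookup: the level-1 descriptor already holds the 1 MB-aligned
 * physical base, so the virtual address has only two portions:
 * va[31:20] selects the descriptor, va[19:0] is the offset. */
static uint32_t arm_section_pa(uint32_t l1_desc, uint32_t va)
{
    return (l1_desc & 0xFFF00000u) | (va & 0x000FFFFFu);
}

/* Pure helper: a C1 (control register) value with the MMU enable bit set. */
static uint32_t cp15_c1_with_mmu_on(uint32_t c1)
{
    return c1 | 0x1u;                        /* bit 0 enables the MMU */
}

#if defined(__arm__) && defined(__GNUC__)
/* Read CP15 register 1 (control) into a general purpose register. */
static inline uint32_t cp15_read_c1(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
    return v;
}

/* Write CP15 register 1 (control) from a general purpose register. */
static inline void cp15_write_c1(uint32_t v)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" : : "r"(v) : "memory");
}

/* Write CP15 register 2 (translation table base). */
static inline void cp15_write_ttb(uint32_t ttb_pa)
{
    __asm__ volatile("mcr p15, 0, %0, c2, c0, 0" : : "r"(ttb_pa) : "memory");
}
#endif
```

On a real target the enabling step would look roughly like cp15_write_ttb(table_pa)
followed by cp15_write_c1(cp15_c1_with_mmu_on(cp15_read_c1())). Real start-up code
has more to do before that point, for example building the tables and setting up
domain access, so treat this only as a sketch of the register access itself.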
On the other hand, in MIPS processors, the MMU is controlled by coprocessor 0, which is
reserved for system control. In coprocessor 0, 32 registers are available to control
the behavior of memory management of the system. Among the 32 registers, register
16 is defined as the configuration register, where bits 7 to 9 (3 bits) indicate
the MMU mode of the system. If the value is 0, no MMU is available; if it is 1, the
MMU works in standard TLB mode. For values 2 and 3, the MMU performs some kind of
direct mapping from virtual address to physical address. Values 4 to 7 are reserved.
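Decoding that 3-bit field is a simple shift and mask. The sketch below follows the
value meanings given in the text above. The function names are hypothetical, and the
configuration value is passed in as a plain integer so that the code runs on any host.

```c
#include <stdint.h>

/* Extract bits 9..7 of the configuration register (register 16, selection 0). */
static unsigned mips_config_mmu_mode(uint32_t config)
{
    return (unsigned)((config >> 7) & 0x7u);
}

/* Human-readable name for the MMU mode, following the values in the text. */
static const char *mips_mmu_mode_name(unsigned mode)
{
    switch (mode) {
    case 0:  return "none";
    case 1:  return "standard TLB";
    case 2:                                  /* fall through */
    case 3:  return "direct mapping";
    default: return "reserved";              /* values 4 to 7 */
    }
}
```

For example, a configuration value with only bit 7 set (0x00000080) decodes to mode 1,
which is the standard TLB mode.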
It is important to note that register 16 has 4 selections, numbered from 0 to 3.
Bits 7 to 9 as mentioned earlier are located in selection 0. In order to point to
the current context, register 4, which is named the context register, will do the
job. As usual, a virtual address is segmented, a page table lookup finds the final
memory page, and the offset portion of the virtual address gives the exact location
within the memory page, nothing special. Note, however, that on classic MIPS cores
the lookup is done by a software TLB refill handler rather than by hardware, and the
context register is there to help that handler locate the page table entry. To read
and write the registers of the coprocessor, the related instructions are mfc0 and mtc0.
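As a rough sketch of how the context register helps, the code below composes the
value that register would hold after a TLB miss, assuming the MIPS32 layout: the page
table base in bits 31..23 (written by software) and bits 31..13 of the faulting
virtual address placed in bits 22..4 (filled in by hardware). The mfc0/mtc0 wrappers
are guarded so that they are only compiled by GCC-style compilers for MIPS targets.
All function names are invented for this example.

```c
#include <stdint.h>

/* Compose the context register value for a faulting address, assuming the
 * MIPS32 layout: page table base in bits 31..23, va[31:13] in bits 22..4. */
static uint32_t mips_context_value(uint32_t pte_base, uint32_t bad_va)
{
    uint32_t bad_vpn2 = (bad_va >> 13) & 0x7FFFFu;   /* 19 bits: va[31:13] */
    return (pte_base & 0xFF800000u) | (bad_vpn2 << 4);
}

#if defined(__mips__) && defined(__GNUC__)
/* Read coprocessor 0 register 4 (context) into a general purpose register. */
static inline uint32_t cp0_read_context(void)
{
    uint32_t v;
    __asm__ volatile("mfc0 %0, $4" : "=r"(v));
    return v;
}

/* Write coprocessor 0 register 4 (context) from a general purpose register. */
static inline void cp0_write_context(uint32_t v)
{
    __asm__ volatile("mtc0 %0, $4" : : "r"(v));
}
#endif
```

With a page table base of 0x80800000 and a faulting address of 0x00402000, the
composed value is 0x80802010, which a refill handler can use as a pointer into the
page table.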
Before we end this paper, it is worth taking note of the TLB. TLB stands for
translation lookaside buffer, and it caches the results of recent translations.
In other words, virtual addresses which have been translated into physical
addresses before are kept in the TLB, as far as its limited number of entries
allows. This mechanism improves the performance of memory management. So, instead
of walking the page tables again and again, the virtual address is looked up in
the TLB first. If a matching entry is found in the TLB, then the respective
physical address is returned and used directly, and the complicated translation
process can be skipped.
The TLB mechanism is implemented on both ARM and MIPS platforms and, of course, it
is controlled through CP15 and coprocessor 0, respectively.
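The idea of checking the TLB before walking the page tables can be modelled in a few
lines of C. This is a toy model and not how any particular core implements its TLB:
it assumes 4kB pages, 4 fully associative entries and round-robin replacement, and
the slow_walk function is a made-up stand-in for a real page table walk.

```c
#include <stdint.h>

#define TLB_ENTRIES 4u                       /* assumed size for the toy model */
#define PAGE_SHIFT  12u                      /* 4 kB pages                     */

struct tlb_entry { int valid; uint32_t vpn; uint32_t pfn; };

struct tlb {
    struct tlb_entry e[TLB_ENTRIES];
    unsigned next;                           /* round-robin replacement index  */
    unsigned hits, misses;                   /* counters to observe behavior   */
};

/* Stand-in for the slow page table walk: a hypothetical fixed mapping. */
static uint32_t slow_walk(uint32_t vpn)
{
    return vpn + 0x100u;
}

/* Translate va, consulting the TLB first and walking only on a miss. */
static uint32_t tlb_translate(struct tlb *t, uint32_t va)
{
    uint32_t vpn = va >> PAGE_SHIFT;
    uint32_t off = va & ((1u << PAGE_SHIFT) - 1u);

    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].vpn == vpn) {
            t->hits++;                       /* hit: the walk is skipped */
            return (t->e[i].pfn << PAGE_SHIFT) | off;
        }
    }

    t->misses++;                             /* miss: walk, then cache the result */
    uint32_t pfn = slow_walk(vpn);
    t->e[t->next].valid = 1;
    t->e[t->next].vpn   = vpn;
    t->e[t->next].pfn   = pfn;
    t->next = (t->next + 1u) % TLB_ENTRIES;
    return (pfn << PAGE_SHIFT) | off;
}
```

Starting from a zero-initialised struct tlb, translating 0x1234 misses and returns
0x101234, and a following translation of 0x1FFF hits the same entry and returns
0x101FFF without calling slow_walk again.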