Related Content: CS604 - VU Lectures, Handouts, PPT Slides, Assignments, Quizzes, Papers & Books of Operating Systems
We discussed logical to physical address translation in the real mode operation of the Intel 80386 processor in the last lecture. Here we discuss address translation in the protected mode.
The following figure shows the hardware support needed for this translation.
An examination of real programs shows that in many cases the existence of the entire program in memory is not necessary:
Even in cases where the entire program is needed, it may not be all needed at the same time. The ability to execute a program that is only partially in memory confers many benefits.
Thus running a program that is not entirely in memory would benefit both the system and
the user.
Virtual Memory is the separation of user logical memory from physical memory.
This separation allows an extremely large virtual memory to be provided for
programmers when only a smaller physical memory is available. Virtual memory makes
the task of programming easier because the programmer need not worry about the
amount of physical memory, or about what code can be placed in overlays; she can
concentrate instead on the problem to be programmed.
In addition to separating logical memory from physical memory, virtual memory also
allows files and memory to be shared by several different processes through page sharing.
The sharing of pages further allows performance improvements during process creation.
Virtual memory is commonly implemented as demand paging. It can also be
implemented in a segmentation system. One benefit of virtual memory is efficient
process creation. Yet another is the concept of memory mapped files. We will discuss
these topics in subsequent lectures.
A demand paging system is similar to a paging system with swapping. Processes reside on secondary memory (which is usually a disk). When we want to execute a process, we swap it into memory. Rather than swapping the entire process into memory, however we use a lazy swapper. A lazy swapper never swaps a page into memory unless that page will be needed. Since we are now viewing a process as a sequence of pages rather than as one large contiguous address space, use of swap is technically incorrect. A swapper manipulates entire processes, whereas a pager is concerned with the individual pages of a process. Thus the term pager is used in connection with demand paging.
When a process is to be swapped in, the paging software guesses which pages would be used before the process is swapped out again. Instead of swapping in a whole process, the pager brings only those necessary pages into memory. Thus it avoids reading into memory pages that will not be used anyway, decreasing the swap time and the amount of physical memory needed.
With this scheme, we need some form of hardware support to distinguish which pages are in
memory and which are on disk. The valid-invalid bit scheme described in previous lectures
can be used. This time however when the bit is set to valid, this value indicates that the
associated page is both legal and in memory. If the bit is set to invalid this value indicates
that the page either is invalid or valid but currently on the disk. The page table entry for a
page that is brought into memory is set as usual but the page table entry for a page that is
currently not in memory is simply marked invalid or contains the address of the page on disk.
Notice that marking a page invalid will have no effect if the process never attempts to
access that page. Hence if we guess right and page in all and only those pages that are
actually needed, the process will run exactly as though we had brought in all pages. Wile the
process executes and accesses pages that are memory resident, execution proceeds normally.
But what happens if the process tries to access a page that was not brought into memory?
Access to a page marked invalid causes a page fault trap. The paging hardware in
translating the address through the page table will notice that the invalid bit is set, causing
a trap to the operating system. This trap is the result of the operating system’s failure to
bring the desired page into memory (in an attempt to minimize disk transfer overhead and
memory requirements) rather than an invalid address error as a result of an attempt to use
an illegal memory address. The procedure for handling a page fault is straightforward:
Since we save the state (registers, condition code, instruction counter) of the
interrupted process when the page fault occurs, we can restart the process in exactly the
same place and state except that the desired page is now in memory and is accessible. In
this way we are able to execute a process even though portions of it are not yet in
memory. When the process tries to access locations that are not in memory, the hardware
traps the operating system (page fault). The operating system reads the desired into
memory and restarts the process as though the page had always been in memory.
In the extreme case, we could start executing a process with no pages in memory.
When the operating system sets the instruction pointer to the first instruction of the
process, which is on a non memory resident page, the process immediately faults for the
page. After this page is brought into memory, the process continues to execute faulting as
necessary until every page that it needs is in memory. At that point, it can execute with
no more faults. This scheme is called pure demand paging: never bring a page into
memory until it is required.
The hardware needed to support demand paging is the same as the hardware for
paging and swapping:
In addition to this hardware, additional architectural constraints must be imposed. A crucial one is the need to be able to restart any instruction after a page fault. In most cases this is easy to meet, a page fault occurs while we are fetching an operand, we must fetch and decode the instruction again, and then fetch the operand. A similar problem occurs in machines that use special addressing modes, including auto increment and auto decrement modes. These addressing modes use a register as a pointer and automatically increment or decrement the register. Auto decrement automatically decrements the register before using its contents as the operand address; auto increment increments the register after using its contents. Thus the instruction
MOV (R2) +, -(R3)
Copies the contents of the location pointed to by register2 into that pointed to by register3. Now consider what will happen if we get a fault when trying to store into the location pointed to by register3. To restart the instruction we must reset the two registers to the values they had before we started the execution of the instruction.
Execution of a block (string) move instruction causing part of the source to be overwritten before a page fault occurs
Another problem occurs during the execution of a block (string) move instruction. If either source or destination straddles a page boundary a page fault might occur after the move is partially done. In addition if the source and destination blocks overlap the source block may have been modified in which case we cannot simply restart the instruction, as shown in the diagram on the previous page.
Demand paging can have a significant effect on the performance of a computer system.
To see why, let us compute the effective access time for a demand paged memory. For
most computer systems, the memory access time, denoted ma now ranges from 10 to 200
nanoseconds. As long as we have no page faults, the effective access time is equal to the
memory access time. If, however a page fault occurs, we must first read the relevant page
from disk, and then access the desired word.
Let p be the probability of a page fault (0 ≤ p ≤ 1). We would expect p to be close to
zero, that is, there will be only a few page faults. The effective access time is then:
Effective access time = (1-p) * ma + p * page fault time
To compute the effective access time, we must know how much time is needed to service a page fault. A page fault causes the following sequence to occur:
Not all these steps are necessary in every case. For example we are assuming that in
step 6, the CPU is allocated to another process while the I/O occurs. This arrangement
allows multiprogramming to maintain CPU utilization, but requires additional time to
resume the page fault service routine when the I/O transfer is complete.
In any case we are faced with three major components of the page fault service time:
The first and third tasks may be reduced, with careful coding, to several hundred
instructions. These tasks may take from 1 to 100 microseconds each. The page switch
time, on the other hand, will probably be close to 24 milliseconds. A typical hard disk has
an average latency of 8 milliseconds, a seek of 15 milliseconds, and a transfer time of 1
millisecond. Thus, the total paging time would be close to 25 milliseconds, including
hardware and software time. Remember that we are looking at only the device service
time. If a queue of processes is waiting for the device we have to add device queuing time
as we wait for the paging device to be free to service our request, increasing even more
the time to swap.
If we take an average page fault service time of 25 milliseconds and a memory access
time of 100 nanoseconds, then the effective access time in nanoseconds is
Effective access time = (1-p) * (100) + p (25 milliseconds)
= (1-p) * 100 + p * 25,000,000
= 100 + 24,999,900 * p
We see then that the effective access time is directly proportional to the page fault rate. If one access out of 1,000 causes a page fault, the effective access time is 25 microseconds. The computer would be slowed down by a factor of 250 because of demand paging! If we want less than 10 percent degradation, we need:
110 > 100 + 25,000,000 * p
10 > 25,000,000 * p
p < 0.0000004
That is, to keep the slowdown due to paging to a reasonable level, we can allow only
less than one memory access out of 2,500,000 to page fault.
It is important to keep the slowdown due to paging to a reasonable level, we can
allow only less than one memory access out of 2,500,000 to page fault.
It is important to keep the page fault rate low in a demand-paging system. Otherwise
the effective access time increases, slowing process execution dramatically.
One additional aspect of demand paging is the handling and overall use of swap
space. Disk I/O to swap space is generally faster than that to the file system. It is faster
because swap space is allocated in much larger blocks, and file lookups and indirect
allocation methods are not used. It is therefore possible for the system to gain better
paging throughput by copying an entire file image into the swap space at process startup, and then performing demand paging from the swap space. Another option is to demand
pages from the from the file system initially, but to write the pages to swap space as they
are replaced. This approach will ensure that only needed pages are ever read from the file
system, but all subsequent paging is done from swap space.
Some systems attempt to limit the amount of swap space when binary files are used.
Demand pages for such files are brought directly from the file system. However, when
page replacement is called for, these pages can simply be overwritten and read in from
the file system again if ever needed. Using this approach, the file system itself serves as
the backing store. However swap space must still be used for pages not associated with a
file; these pages include the stack and heap for a process. This technique is used in
several systems including Solaris.