Homework 3
CS W4118 Fall 1999
DUE: Wednesday, 11/17/99 at the beginning of class

CS4118 - Homework 2


Individual problems are due at the beginning of class on 11/17/99, with the usual homework deadline rules for CVN students in effect. Group programming problems are to be electronically submitted by 1:00am on 11/17/99, with hardcopies writeups due at the beginning of class on 11/17/99. The deadline for group programming problems applies to both CVN and non-CVN students. Refer to the homework policy page on the class web site for further details.

Individual Problems:

Individual problems are to be done individually. Exercise numbers refer to the Silverschatz and Galvin textbook. Each problem is worth 3 points unless otherwise noted.

  1. Exercise 8.1
  2. Exercise 8.5
  3. Exercise 8.10
  4. Exercise 8.11
  5. Exercise 8.16
  6. Exercise 8.18
  7. Exercise 9.3
  8. Exercise 9.5
  9. Exercise 9.9
  10. Exercise 9.10
  11. Exercise 9.11
  12. Exercise 9.16 (4 pts)
  13. Exercise 9.18

Group Programming Problems:

Programming problems are to be done in your assigned groups using the VM that has been assigned to your group. For all programming problems you will be required to submit source code, a README file documenting your files and code, and a test run of your programs. For source code submissions, you only need to submit new source code files that you created and kernel source code files that you changed. You should clearly indicate your names, email addresses, and assigned group number on your submission. Each group is required to submit one writeup for the programming assignment using the SUBMIT program. Refer to the homework submission page on the class web site for additional submission instructions.

The image of the kernel you build for this assignment should be vmlinuz.hmwk3. Grading for this assignment will be done based on vmlinuz.hmwk3.

Background

The purpose of this assignment is to gain better understanding of Linux's virtual memory management subsystem, to extend this subsystem with new memory management policies, and to evaluate your implementation.

The basic operation of the Linux virtual memory management subsystem is as follows. A process runs in a virtual address space. Virtual addresses are converted into physical memory addresses by using components of the virtual address as indices into the page directory and page table. This mapping allows the mapping of physical memory into a much larger virtual memory space. This mapping is detailed in the optional textbook "Linux Kernel Internals".

The virtual address space of a process consists of a kernel segment and a user segment. Since the page tables of the kernel segment are identical for all processes, this exercise is concerned only with the memory allocated for the user segment of a process.

In addition to mapping physical memory into a much larger virtual memory space, the indirection provided through the page directories and page tables allows multiple processes to share the specific memory pages. This allows, for example, 2 users to run the /bin/bash program, with both processes referring to the same code pages in physical memory (although data pages would differ). When a process is created (forked), the parent's page directories and tables are copied for the child process. After forking, if either process modifies data, then the pages holding the memory being modified will be copied. You will see this referred to throughout the memory management code as copy-on-write or COW.

When a process begins to execute, not all of the image is pulled into memory. Instead, Linux waits until the process attempts to reference the page in virtual memory. If the page is not in memory, a page fault occurs signalling Linux to bring the page into physical memory and update the process's page tables. To ensure there are physical memory pages available to bring in requested pages, Linux attempts to maintain a minimum of free_pages_low pages of memory. This maintainence is performed by kswapd which we will describe in more detail later.

Another way of looking at this is, when processes are loaded, only a minimal amount of memory is allocated. The scheduler decides which process has the right to run. Once this decision is made, the process's actions will determine how much additional physical memory space it is given. Since physical memory is finite, the memory management system must ensure it can give a process the memory space it needs, when it needs it. The Linux memory management subsystem does this by keeping a pool of free pages. It attempts to maintain this pool by reclaiming pages which it estimates are not being used. This results in physical memory being shared by all loaded processes, but with a very high preference to current working set.

As described earlier, kswapd continuously sweeps memory space to determine which pages are most likely to hold data which is not part of the current working set, and frees those pages by swapping out or discarding their contents. The operation of kswapd is detailed in Chapter 3 of "The Linux Kernel". Kswapd will first attempt to free pages in the buffer and page caches and then proceed to examining the shared memory pages. Finally, it will consider each process. It will iteratively select the process with the most resident memory as a 'victim.' It does not attempt to swap out the entire 'victim' process, but instead looks for pages of that 'victim' which are not being referenced (or at least are not frequently referenced). So for a given process, the space is examined on a page-by-page basis. In a given sweep, kswapd may retrieve enough pages simply by examining the page and buffer caches. If so, then the next time it runs, it will start its sweep where it left off (with the shared memory pages in this example).

Additional Hints

Now we get to the fun part!

  1. (10 pts) Describe how to extend the Linux memory management system to proportionally share physical memory space among processes. The proportion given to a process should be based on a weight assigned to that process. Valid weights are in the range 1-5, with 5 receiving the largest proportion. If a weight is not assigned to a process, a default weight of 3 should be used.

    Your management scheme need not enforce proportional sharing when physical memory is not constrained. For example, suppose only 2 processes with the following properties are running:

    	Process		Memory Requirements		Weight
    	   A			20MB			  1
    	   B			12MB			  3
    
    If the system has 64MB of available physical memory, then process A and process B should receive 20MB and 12MB respectively. If however, the system has only 16MB of available physical memory, then process A should receive only 4MB and process B should receive 12MB.

  2. (25 pts) Implement the proportional memory management scheme described in part 1.

    You may extend the task_struct to include a weight field, but be sure to make it the last field of the task_struct and to update the hard coded definition of the init task to reflect the change. You will need to add a system call to set this value for a process. The prototype for this call should be:

    	int set_weight(int weight)
    
    You should implement set_weight() by updating the syscall table (as you did for pinfo) and use the __syscall1() macro to invoke it. The return value is 0 if successful and -1 if not. The only cause for a -1 return code is an invalid weight.

  3. (10 pts) Write a simple program which incrementally allocates and touches memory to test your policy. Your test program should account for the fact that if memory is not touched, physical memory space may not be allocated. And if this memory is not being referenced, the physical memory space allocated for this data may be freed. Test your implementation by running 5 different copies of the program concurrently, with respective weights 1, 2, 3, 4, 5.

  4. (15 pts) Show that the physical memory space allocated to each process is in proportion to its weight. Since utilities such as rusage may not provide the detail you require, you should extend your pinfo system call to include the amount resident memory your program has allocated. You may use your own code for the basis of pinfo system call, or the code posted in the solutions for Homework 1 or 2.

    For your final report, use statistics collected by a small program utilizing your extended pinfo system call to retrieve the resident memory allocation of each of your weighted test processes at 1 second intervals. Plot the resident space allocated for each of the 5 processes over a period of at least 20 sample points. Your writeup must describe how the statistics and graph verify that proportional sharing has been achieved.

    The solutions for HW2 provide an example of how to format the data. You may use a spreadsheet application to plot the graphs, or you may plot the graphs by hand.