CS202: Lab 4: WeensyOS
Introduction
In this lab, you will implement process memory isolation, virtual memory, and a system call (fork()) in a tiny (but real!) operating system, called WeensyOS.
This will introduce you to virtual memory and reinforce some of the concepts that we have covered this semester.
The WeensyOS kernel runs on x86-64 CPUs. Because the OS kernel runs on the “bare” hardware, debugging kernel code can be tough: if a bug causes misconfiguration of the hardware, the usual result is a crash of the entire kernel (and all the applications running on top of it). And because the kernel itself provides the most basic system services (for example, causing the display hardware to display error messages), deducing what led to a kernel crash can be particularly challenging. In the old days, the usual way to develop code for an OS (whether as part of a class, in a research lab, or in industry) was to boot it on a physical CPU. The lives of kernel developers have gotten much better since. You will run WeensyOS in QEMU.
QEMU is a software-based x86-64 emulator: it “looks” to WeensyOS just like a physical x86-64 CPU with a particular hardware configuration. However, if your WeensyOS code-in-progress wedges the (virtual) hardware, QEMU itself and the whole OS that is running on the “real” hardware (that is, the “real” Linux OS that QEMU is running on) survive unscathed (“real” is in quotation marks for reasons that will be unpacked in the next paragraph). So, for example, your last few debugging printf()s before a kernel crash will still get logged to disk (by QEMU running on Linux), and “rebooting” the kernel you’re developing amounts to re-running the QEMU emulator application.
What is the actual software/hardware stack here? The answer is different for students with x86-64 computers (for example, Windows machines and older Macs) and ARMs. All students are running a host OS (on your computer) on top of either x86-64 or ARM hardware (ARM being the architecture for so-called Apple silicon, namely M1 and M2 chips). Then, the Docker containerization environment runs on top of the host OS (as a process). That environment, loosely speaking, emulates either an x86 or an ARM CPU, and running on top of that emulated CPU is Ubuntu Linux, targeted to x86-64 or ARM. Running on top of Ubuntu is QEMU. QEMU presents an emulated x86-64 interface, and QEMU itself is either an x86-64 or ARM binary, again depending on the underlying hardware. Finally, WeensyOS is exclusively an x86-64 binary, and that of course runs on QEMU (though if you have some x86-64 hardware sitting around, you can try installing WeensyOS and running it “bare”). Taking that same progression, now top-down: if you have an ARM CPU, that means you are running the WeensyOS kernel’s x86-64 instructions in QEMU, a software-emulated x86-64 CPU that is an ARM binary, on top of Linux (targeted to ARM), running in the Docker containerization environment (also itself an ARM binary), on macOS, running on an ARM hardware CPU.
Heads up. As always, it’s important to start on time. In this case, on time means 3 weeks before the assignment is due, as you will almost certainly need all of the allotted time to complete the lab. Kernel development is less forgiving than developing user-level applications; tiny deviations in the configuration of hardware (such as the MMU) by the OS tend to bring the whole (emulated) machine to a halt.
To save yourself headaches later, read this lab writeup in its entirety before you begin.
Resources.
https://cs.nyu.edu/~mwalfish/classes/24sp/labs/lab4.html 1/19
2024/11/15 18:17 CS202: Lab 4: WeensyOS
You may want to look at Chapter 9 of CS:APP3e (from which our x86-64 virtual memory handout is borrowed). The book is on reserve at the Courant library. Section 9.7 in particular describes the 64-bit virtual memory architecture of the x86-64 CPU. Figure 9.23 and Section 9.7.1 show and discuss the PTE_P, PTE_W, and PTE_U bits; these are flags in the x86-64 hardware’s page table entries that play a central role in this lab.
You may find yourself during the lab wanting to understand particular assembly instructions. Here are two guides to x86-64 instructions, from Brown and CMU. The former is more digestible; the latter is more comprehensive. The supplied code also uses certain assembly instructions like iret; see here for a reference.
Getting Started
You’ll be working in the Docker container as usual. We assume that you have set up the upstream as described in the lab setup. Then run the following on your local machine (Mac users can do this on their local machine or within the Docker container; Windows and CIMS users should do this from outside the container):
$ cd ~/cs202
$ git fetch upstream
$ git merge upstream/main
This lab’s files are located in the lab4 subdirectory.
If you have any “conflicts” from lab 3, resolve them before continuing further. Run git push to save your work back to your personal repository.
Another heads up. Given the complexity of this lab, and the possibility of breaking the functionality of the kernel if you code in some errors, make sure to commit and push your code often! It's very important that your commits have working versions of the code, so if something goes wrong, you can always go back to a previous commit and get back a working copy! At the very least, for this lab, you should be committing once per step (and probably more often), so you can go back to the last step if necessary.
Goal
You will implement complete and correct memory isolation for WeensyOS processes. Then you'll implement full virtual memory, which will improve utilization. You'll implement fork() (creating new processes at runtime) and for extra credit, you’ll implement exit() (destroying processes at runtime).
We’ve provided you with a lot of support code for this assignment; the code you will need to write is in fact limited in extent. Our complete solution (for all 5 stages) consists of well under 300 lines of code beyond what we initially hand out to you. All the code you write will go in kernel.c (except for part of step 6).
Testing, checking, and validation
For this assignment, your primary checking method will be to run your instance of WeensyOS and visually compare it to the images you see below in the assignment.
Studying these graphical memory maps carefully is the best way to determine whether your WeensyOS code for each stage is working correctly. Therefore, you will definitely want to make sure you understand how to read these maps before you start to code.
We supply some grading scripts, outlined at the end of the lab, but those will not be your principal source of feedback. For the most part, they indicate only whether a given step is passing or failing; look to the memory maps to understand why.
Initial state
Enter the Docker environment:
$ ./cs202-run-docker
cs202-user@172b6e333e91:~/cs202-labs$ cd lab4/
cs202-user@172b6e333e91:~/cs202-labs/lab4$ make run
The rest of these instructions presume that you are in the Docker environment. We omit the cs202-user@172b6e333e91:~/cs202-labs part of the prompt.
make run should cause you to see something like the below, which shows four processes running in parallel, each running a version of the program in p-allocator:
This image loops forever; in an actual run, the bars will move to the right and stay there. Don't worry if your image has different numbers of K's or otherwise has different details.
If your bars run painfully slowly, edit the p-allocator.c file and reduce the ALLOC_SLOWDOWN constant.
Stop now to read and understand p-allocator.c. Here’s how to interpret the memory map display:
WeensyOS displays the current state of physical and virtual memory. Each character represents 4 KB of memory: a single page. There are 2 MB of physical memory in total. (Ask yourself: how many pages is this?)
WeensyOS runs four processes, 1 through 4. Each process is compiled from the same source code (p-allocator.c), but linked to use a different region of memory.
Each process asks the kernel for more heap memory, one page at a time, until it runs out of room. As usual, each process's heap begins just above its code and global data, and ends just below its stack. The processes allocate heap memory at different rates: compared to Process 1, Process 2 allocates twice as quickly, Process 3 goes three times faster, and Process 4 goes four times faster. (A random number generator is used, so the exact rates may vary.) The marching rows of numbers show how quickly the heap spaces for processes 1, 2, 3, and 4 are allocated.
Here are two labeled memory diagrams, showing what the characters mean and how memory is arranged.
The virtual memory display is similar.
The virtual memory display cycles successively among the four processes’ address spaces. In the base version of the WeensyOS code we give you to start from, all four processes’ address spaces are the same (your job will be to change that!).
Blank spaces in the virtual memory display correspond to unmapped addresses. If a process (or the kernel) tries to access such an address, the processor will page fault.
The character shown at address X in the virtual memory display identifies the owner of the corresponding physical page.
In the virtual memory display, a character is reverse video if an application process is allowed to access the corresponding address. Initially, any process can modify all of physical memory, including the kernel. Memory is not properly isolated.
Running WeensyOS
Read the README-OS.md file for information on how to run WeensyOS.
There are several ways to debug WeensyOS. We recommend adding log_printf statements to your code. The output of log_printf is written to the file /tmp/log.txt outside QEMU. We also recommend that you use assertions (of which we saw a few in lab 1) to catch problems early. For example, call the helper functions we’ve provided, check_page_table_mappings and check_page_table_ownership, to test a page table for obvious errors.
Finally, you can and should use gdb, which we cover at the end of this section.
Memory system layout
The WeensyOS memory system layout is defined by several constants:
Constant | Meaning
KERNEL_START_ADDR | Start of kernel code.
console | Address of CGA console memory.
KERNEL_STACK_TOP | Top of kernel stack. The kernel stack is one page long.
PROC_START_ADDR | Start of application code. Applications should not be able to access memory below this address, except for the single page at console.
MEMSIZE_PHYSICAL | Size of physical memory in bytes. WeensyOS does not support physical addresses ≥ this value. Defined as 0x200000 (2MB).
MEMSIZE_VIRTUAL | Size of virtual memory. WeensyOS does not support virtual addresses ≥ this value. Defined as 0x300000 (3MB).
Writing expressions for addresses
WeensyOS uses several C macros to handle addresses. They are defined at the top of x86-64.h. The most important include:
Macro | Meaning
PAGESIZE | Size of a memory page. Equals 4096 (or, equivalently, 1 << 12).
PAGENUMBER(addr) | Page number for the page containing addr. Expands to an expression analogous to addr / PAGESIZE.
PAGEADDRESS(pn) | The initial address (zeroth byte) in page number pn. Expands to an expression analogous to pn * PAGESIZE.
PAGEINDEX(addr, level) | The index in the levelth page table for addr. level must be between 0 and 3; 0 returns the level-1 page table index (address bits 39–47), 1 returns the level-2 index (bits 30–38), 2 returns the level-3 index (bits 21–29), and 3 returns the level-4 index (bits 12–20).
PTE_ADDR(pe) | The physical address contained in page table entry pe. Obtained by masking off the flag bits (setting the low-order 12 bits to zero).
Before you begin coding, you should both understand what these macros represent and be able to derive values for them if you were given a different page size.
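To make the macro descriptions concrete, here is a minimal sketch of the same address arithmetic written as plain C functions. The lower-case names (page_number, page_address, page_index, pte_addr) are hypothetical stand-ins chosen for this illustration; the real macros live in x86-64.h and may be written differently.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE         4096u      /* 1 << 12 bytes per page */
#define MEMSIZE_PHYSICAL 0x200000u  /* 2 MB, as stated in the constants table */

/* Page number of the page containing addr (analogous to PAGENUMBER). */
static inline uint64_t page_number(uint64_t addr) {
    return addr / PAGESIZE;
}

/* Address of the zeroth byte of page pn (analogous to PAGEADDRESS). */
static inline uint64_t page_address(uint64_t pn) {
    return pn * PAGESIZE;
}

/* Index into the page table at `level` (0..3) for addr (analogous to PAGEINDEX).
   Each level consumes 9 address bits; level 3 uses bits 12-20. */
static inline unsigned page_index(uint64_t addr, int level) {
    assert(level >= 0 && level <= 3);
    int shift = 12 + 9 * (3 - level);
    return (unsigned) ((addr >> shift) & 0x1FF);
}

/* Physical address held in a page table entry: clear the low 12 flag bits
   (analogous to PTE_ADDR as described above). */
static inline uint64_t pte_addr(uint64_t pe) {
    return pe & ~(uint64_t) 0xFFF;
}
```

With these definitions, page_number(MEMSIZE_PHYSICAL) is 512, page_index(0xB8000, 3) is 0xB8, and pte_addr(0xB8007) is 0xB8000. If the page size were different, the shift of 12 and the divisor would change together.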
Kernel and process address spaces
The version of WeensyOS you receive at the start of lab4 places the kernel and all processes in a single, shared address space. This address space is defined by the kernel_pagetable page table. kernel_pagetable is initialized to the identity mapping: virtual address X maps to physical address X.
As you work through the lab, you will shift processes to using their own independent address spaces, where each process can access only a subset of physical memory.
The kernel, though, must remain able to access any location in physical memory. Therefore, all kernel functions run using the kernel_pagetable page table. Thus, in kernel functions, each virtual address maps to the physical address with the same number. The exception() function explicitly installs kernel_pagetable when it begins.
WeensyOS system calls are more expensive than they need to be, since every system call switches address spaces twice (once to kernel_pagetable and once back to the process’s page table). Real-world operating systems avoid this overhead. To do so, real-world kernels access memory using process page tables, rather than a kernel-specific kernel_pagetable. This makes a kernel’s code more complicated, since kernels can’t always access all of physical memory directly under that design.
Using tmux
It will be handy to be able to “see” multiple sessions within Docker at the same time. A good tool for this is called tmux.
We suggest reading, and typing along with, this excellent tmux tutorial. It should take no more than 10 minutes and will be well worth it. Our debugging instructions below will assume that you have done so. Other tmux resources:
MIT’s missing semester: Search for the section called “Terminal Multiplexers”.
Cheatsheet: This is a more comprehensive list of commands, though the formatting is not the best, being interspersed with ads.
If you find yourself needing to exit tmux, either exit all of the panes in the current window, or do: C-b :kill-session. (The C-b is the usual Ctrl-b, and then you type :kill-session and press return or enter.)
Using gdb
The debugger that we have seen, gdb, can be used to debug an already running process, even one over a network. QEMU supports this facility (see here). As a result, you can use gdb to single-step the software that is running on top of the emulated processor created by QEMU.
Here are the steps. These steps assume that (1) you have taken the 10 minutes to work through the tmux tutorial above, and (2) you are working within the directory lab4 underneath cs202-labs inside the Docker container.
We will start by creating two side-by-side panes in tmux. The C-b % means “type Ctrl-b together, let go, and then type the % key”. Please see the tmux tutorial above for more.
$ tmux
C-b %
At this point, you should have two side-by-side panes, with the active one being the one on the right. Go back to the one on the left.
C-b ← # refers to the left arrow key
If you’re having trouble with C-b (which, again, refers to Ctrl-b), please again see the tmux tutorial in the prior section. At this point you should be in the left pane. The next command runs QEMU in a mode where it expects a debugger to attach:
$ make run-gdb
You will see “VGA Blank mode”. Now, get back to the right-hand pane:
C-b → # refers to the right arrow key
In that terminal, invoke gdb. Do this via the script gdb-wrapper.sh, which will invoke the correct version of gdb (handling the case of M1/M2 hardware). You don’t need to tell gdb what you are debugging and how to connect over the network, because the .gdbinit file in that directory tells gdb about these things for you. So you type:
$ ./gdb-wrapper.sh
Make sure you see:
...
The target architecture is set to "i386:x86-64".
add symbol table from file "obj/bootsector.full" at
.text_addr = 0x7c00
add symbol table from file "obj/p-allocator.full" at
.text_addr = 0x100000
add symbol table from file "obj/p-allocator2.full" at
.text_addr = 0x140000
add symbol table from file "obj/p-allocator3.full" at
.text_addr = 0x180000
add symbol table from file "obj/p-allocator4.full" at
.text_addr = 0x1c0000
add symbol table from file "obj/p-fork.full" at
.text_addr = 0x100000
add symbol table from file "obj/p-forkexit.full" at
.text_addr = 0x100000
If you do not see something like that, then it means that you did not load the appropriate debugging information into gdb; likely, you are not running from within the lab4 directory.
Now, set a breakpoint, for example at the function kernel() (or whatever function you want to break at):
(gdb) break kernel
Breakpoint 1 at 0x40167: file kernel.c, line 86.
Now run the “remote” software (really, the WeensyOS kernel in the left-hand pane). Do this by telling gdb to continue, via the c command:
(gdb) c
You should see in the right-hand pane:
Continuing.
Breakpoint 1, kernel (command=0x0) at kernel.c:86
86 void kernel(const char* command) {
1: x/5i $pc
=> 0x40167: endbr64
   0x4016b: push %rbp
   0x4016c: mov %rsp,%rbp
   0x4016f: sub $0x20,%rsp
   0x40173: mov %rdi,-0x18(%rbp)
(gdb)
You will also see the kernel begin to execute in the left-hand pane.
Now, you can and should use the existing facilities of gdb to poke around. gdb understands the hardware very well. So you can, for example, ask it to print out the value of %cr3:
(gdb) info registers cr3
cr3 0x8000 [ PDBR=8 PCID=0 ]
You are encouraged to use gdb’s facilities. Type help at the (gdb) prompt to get a menu. See the tmux section above for how to exit tmux.
Step 1: Kernel isolation
In the starting code we’ve given you, WeensyOS processes could stomp all over the kernel’s memory if they wanted to. Better prevent that. Change kernel(), the kernel initialization function, so that kernel memory is inaccessible to applications, except for the memory holding the CGA console (the single page at (uintptr_t) console == 0xB8000).1
When you are done, WeensyOS should look like the below. In the virtual map, kernel memory is no longer reverse-video, since the user can’t access it. Note the lonely CGA console memory block in reverse video in the virtual address space.
Hints:
Use virtual_memory_map. A description of this function is in kernel.h. You will benefit from reading all the function descriptions in kernel.h. You can supply NULL for the allocator argument for now.
If you really want to look at the code for virtual_memory_map, it is in k-hardware.c, along with many other hardware-related functions.
The perm argument to virtual_memory_map is a bitwise-or of zero or more PTE flags, PTE_P, PTE_W, and PTE_U. PTE_P marks Present pages (pages that are mapped). PTE_W marks Writable pages. PTE_U marks User-accessible pages—pages accessible to applications. You want kernel memory to be mapped with permissions PTE_P|PTE_W, which will prevent applications from reading or writing the memory, while allowing the kernel to both read and write.
Make sure that your sys_page_alloc system call preserves kernel isolation: Applications shouldn’t be able to use sys_page_alloc to screw up the kernel.
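The permission policy described in these hints can be sketched as a small, self-contained decision function. This is only an illustration under stated assumptions: kernel_isolation_perm and page_alloc_request_ok are hypothetical helper names, PROC_START_ADDR is assumed to be 0x100000, and the flag values are the standard x86-64 bit positions. In the lab itself, the resulting value would be passed as the perm argument of virtual_memory_map.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE        4096u
#define PTE_P           1          /* present */
#define PTE_W           2          /* writable */
#define PTE_U           4          /* user-accessible */
#define CONSOLE_ADDR    0xB8000u   /* (uintptr_t) console, from the text */
#define PROC_START_ADDR 0x100000u  /* assumed value for this sketch */
#define MEMSIZE_VIRTUAL 0x300000u

/* Permission bits this sketch would choose for the page starting at addr. */
int kernel_isolation_perm(uintptr_t addr) {
    assert(addr % PAGESIZE == 0);
    if (addr == CONSOLE_ADDR) {
        return PTE_P | PTE_W | PTE_U;   /* the console page stays user-visible */
    }
    if (addr < PROC_START_ADDR) {
        return PTE_P | PTE_W;           /* kernel memory: no PTE_U */
    }
    return PTE_P | PTE_W | PTE_U;       /* application memory */
}

/* Should a sys_page_alloc request for addr be accepted?
   Rejecting unaligned or kernel-range addresses keeps isolation intact. */
int page_alloc_request_ok(uintptr_t addr) {
    return addr % PAGESIZE == 0
        && addr >= PROC_START_ADDR
        && addr < MEMSIZE_VIRTUAL;
}
```

The second function reflects the last hint: a system call that maps memory on behalf of an application needs to validate the requested address, or the application could ask the kernel to remap kernel pages.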
When you're done with this step, make sure to commit and push your code!
Step 2: Isolated address spaces
Implement process isolation by giving each process its own independent page table. Your OS memory map should look something like this when you’re done:
(Yours won’t look exactly like that; in the first line of physical and virtual memory, instead of having the pattern R11223344, yours will probably have a pattern like R1111222233334444. This is because the gif is from a 32-bit architecture; recall that on a 64-bit architecture, there are four levels of page table required.)
That is, each process only has permission to access its own pages. You can tell this because only its own pages are shown in reverse video.
What goes in per-process page tables:
The initial mappings for addresses less than PROC_START_ADDR should be copied from those in kernel_pagetable. You can use a loop with virtual_memory_lookup and virtual_memory_map to copy them. Alternatively, you can copy the mappings from the kernel’s page table into the new page tables; this is faster, but make sure you copy the right data!
The initial mappings for the user area—addresses greater than or equal to PROC_START_ADDR—should be inaccessible to user processes (that is, PTE_U should not be set for these PTEs). In our solution (shown above), these addresses are totally inaccessible (so they show as blank), but you can also change this so that the mappings are still there, but accessible only to the kernel, as in this diagram:
The reverse video shows that this OS also implements process isolation correctly.
[Note: This second approach will pass the automated tests for step 2 but not for steps 3 and beyond. Thus, we recommend taking the first approach, namely total inaccessibility.]
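The loop-based approach can be sketched on a toy model in which a page table is a flat array with one slot per virtual page. Everything prefixed with toy_ is a hypothetical stand-in for illustration (the real virtual_memory_lookup and virtual_memory_map operate on four-level x86-64 page tables and have different signatures), and PROC_START_ADDR is assumed to be 0x100000.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGESIZE        4096u
#define MEMSIZE_VIRTUAL 0x300000u
#define PROC_START_ADDR 0x100000u   /* assumed value for this sketch */
#define NVPAGES         (MEMSIZE_VIRTUAL / PAGESIZE)
#define PTE_P 1
#define PTE_W 2
#define PTE_U 4

/* Toy mapping: one slot per virtual page; perm == 0 means "unmapped". */
typedef struct { uintptr_t pa; int perm; } toy_mapping;
typedef struct { toy_mapping slot[NVPAGES]; } toy_pagetable;

toy_mapping toy_lookup(const toy_pagetable *pt, uintptr_t va) {
    return pt->slot[va / PAGESIZE];
}

void toy_map(toy_pagetable *pt, uintptr_t va, uintptr_t pa, int perm) {
    pt->slot[va / PAGESIZE] = (toy_mapping){ pa, perm };
}

/* Fill `proc` so that addresses below PROC_START_ADDR mirror `kernel`
   and everything at or above PROC_START_ADDR starts out unmapped. */
void toy_init_process_pagetable(toy_pagetable *proc, const toy_pagetable *kernel) {
    memset(proc, 0, sizeof *proc);
    for (uintptr_t va = 0; va < PROC_START_ADDR; va += PAGESIZE) {
        toy_mapping m = toy_lookup(kernel, va);
        if (m.perm & PTE_P) {
            toy_map(proc, va, m.pa, m.perm);
        }
    }
}
```

This follows the recommended first approach (total inaccessibility of the user area); pages for the process's own code, data, and stack would be mapped afterwards by the loader and by process setup.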
How to implement per-process page tables:
Change process_setup to create per-process page tables.
We suggest you write a copy_pagetable(x86_64_pagetable* pagetable, int8_t owner) function that allocates and returns a new page table, initialized as a full copy of pagetable (including all mappings from pagetable). This function will be useful in Step 5. In process_setup you can modify the page table returned by copy_pagetable according to the requirements above. Your function can use pageinfo to find free pages to use for page tables. Read about pageinfo at the top of kernel.c.
Remember that the x86-64 architecture uses four-level page tables.
The easiest way to copy page tables involves an allocator function suitable for passing to virtual_memory_map.
You’ll need at least to allocate a level-1 page table and initialize it to zero. You can also set up the whole four-level page table skeleton (for addresses 0...MEMSIZE_VIRTUAL - 1) yourself; then you don’t need an allocator function.
A physical page is free if pageinfo[PAGENUMBER].refcount == 0. Look at the other code in kernel.c for some hints on how to examine the pageinfo[] array.
All of process P’s page table pages must have pageinfo[...].owner == P or WeensyOS’s consistency-checking functions will fail. This will affect your allocator function. (Hint: Don’t forget that global variables are allowed in your code!)
If you create an incorrect page table, WeensyOS might crazily reboot. Don’t panic! Add log_printf statements. Another useful technique that may at first seem counterintuitive: add infinite loops to your kernel to track down exactly where a fault occurs. (If the OS hangs without crashing once you’ve added an infinite loop, then the crash you’re debugging must occur after the infinite loop.)
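The hints about pageinfo[], refcount, owner, and global variables fit together roughly as follows. This is a sketch on a toy copy of the bookkeeping array, not the lab's actual code: toy_pages, toy_alloc_owner, and toy_alloc_page are hypothetical names, and the field types are an assumption modeled on the description of pageinfo above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE         4096u
#define MEMSIZE_PHYSICAL 0x200000u
#define NPAGES           (MEMSIZE_PHYSICAL / PAGESIZE)

/* Toy per-page bookkeeping, modeled on the pageinfo[] description. */
typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;
toy_pageinfo toy_pages[NPAGES];

/* Owner to stamp on pages handed out by the allocator. An allocator
   callback takes no arguments, so the owner is passed through a global. */
int8_t toy_alloc_owner;

/* Return the physical address of a free page, marking it as owned by
   toy_alloc_owner, or 0 if no page is free. Page 0 is never handed out
   in this sketch so that 0 can signal failure. */
uintptr_t toy_alloc_page(void) {
    for (uintptr_t pn = 1; pn < NPAGES; ++pn) {
        if (toy_pages[pn].refcount == 0) {
            toy_pages[pn].refcount = 1;
            toy_pages[pn].owner = toy_alloc_owner;
            return pn * PAGESIZE;
        }
    }
    return 0;
}
```

A caller would set toy_alloc_owner to the process ID before building that process's page table. A real allocator for page table pages would also need to zero the page before using it.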
Again, once finished with step 2, commit and push!
Step 3: Virtual page allocation
Up to this point in the lab, WeensyOS processes have used physical page allocation: the page with physical address X is used to satisfy the sys_page_alloc(X) allocation request for virtual address X. This strategy is inflexible and limits utilization. Change the implementation of the INT_SYS_PAGE_ALLOC system call so that it can use any free physical page to satisfy a sys_page_alloc(X) request.
Your new INT_SYS_PAGE_ALLOC code must perform the following tasks.
Find a free physical page using the pageinfo[] array. Return -1 to the application if you can’t find one. Use any algorithm you like to find a free physical page; our solution just returns the first one we find.
Record the physical page’s allocation in pageinfo[].
Map that physical page at the requested virtual address.
Don’t modify the assign_physical_page helper function, which is also used by the program loader. You can write a new function if you need to.
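The three tasks above (find, record, map) can be sketched on a toy model where the process page table is a flat array of slots. The toy_ names are hypothetical, PROC_START_ADDR is assumed to be 0x100000, and the first-fit search is just one possible choice of algorithm.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE         4096u
#define MEMSIZE_PHYSICAL 0x200000u
#define MEMSIZE_VIRTUAL  0x300000u
#define PROC_START_ADDR  0x100000u  /* assumed value for this sketch */
#define NPAGES           (MEMSIZE_PHYSICAL / PAGESIZE)
#define NVPAGES          (MEMSIZE_VIRTUAL / PAGESIZE)
#define PTE_P 1
#define PTE_W 2
#define PTE_U 4

typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;
typedef struct { uintptr_t pa; int perm; } toy_mapping;

toy_pageinfo toy_pages[NPAGES];

/* Toy version of the system call: `pt` has one slot per virtual page.
   Returns 0 on success, -1 on a bad address or when memory is exhausted. */
int toy_sys_page_alloc(toy_mapping pt[NVPAGES], int8_t pid, uintptr_t va) {
    if (va % PAGESIZE != 0 || va < PROC_START_ADDR || va >= MEMSIZE_VIRTUAL) {
        return -1;
    }
    for (uintptr_t pn = 0; pn < NPAGES; ++pn) {
        if (toy_pages[pn].refcount == 0) {
            /* 1. found a free physical page; 2. record the allocation */
            toy_pages[pn].owner = pid;
            toy_pages[pn].refcount = 1;
            /* 3. map it at the requested virtual address */
            pt[va / PAGESIZE] =
                (toy_mapping){ pn * PAGESIZE, PTE_P | PTE_W | PTE_U };
            return 0;
        }
    }
    return -1;
}
```

The key difference from physical page allocation is that the physical page number chosen in the loop is independent of va.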
Here’s how our OS looks after this step.
Now commit and push your code before moving on to step 4!
Step 4: Overlapping address spaces
Now the processes are isolated, which is awesome. But they’re still not taking full advantage of virtual memory. Isolated address spaces can use the same virtual addresses for different physical memory. There’s no need to keep the four process address spaces disjoint.
In this step, change each process’s stack to start from address 0x300000 == MEMSIZE_VIRTUAL. Now the processes have enough heap room to use up all of physical memory! Here’s how the memory map will look after you’ve done it successfully:
Notice the single reverse video page in the bottom right, for all processes. This is their stack page: each process has the same virtual address for its stack page, but (if you’ve implemented it correctly) different physical pages.
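As a small worked example of the arithmetic involved, the sketch below computes where the shared virtual stack page sits. The function names are hypothetical; the only facts used are MEMSIZE_VIRTUAL == 0x300000, the 4096-byte page size, and the x86-64 convention that the stack grows toward lower addresses.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE        4096u
#define MEMSIZE_VIRTUAL 0x300000u

/* Initial stack pointer: the stack grows down from the top of virtual memory. */
uintptr_t stack_top(void) {
    return MEMSIZE_VIRTUAL;
}

/* Virtual address of the single page that backs the initial stack:
   the last page below MEMSIZE_VIRTUAL. */
uintptr_t stack_page_va(void) {
    return MEMSIZE_VIRTUAL - PAGESIZE;
}
```

Here stack_page_va() is 0x2FF000, which is virtual page 767, the last slot in the virtual memory display. Every process uses this same virtual address, while each page table maps it to a different physical page.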
If there’s no physical memory available, sys_page_alloc should return an error to the caller (by returning -1). Our solution additionally prints “Out of physical memory!” to the console when this happens; you don’t need to.
As always, make sure to commit and push after finishing this step!
Step 5: Fork
The fork() system call is one of Unix’s great ideas. It starts a new process as a copy of an existing one. The fork() system call appears to return twice, once to each process. To the child process, it returns 0. To the parent process, it returns the child’s process ID.
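For readers who have not used fork() from the application side, here is a minimal sketch of the "returns twice" behavior using the standard POSIX fork() on a Linux machine (not WeensyOS's own system call). The helper name fork_and_wait is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once. The child exits immediately with `code`; the parent waits
   and returns the child's exit status, or -1 on error. */
int fork_and_wait(int code) {
    fflush(stdout);              /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        return -1;               /* fork failed: no child was created */
    }
    if (pid == 0) {
        _exit(code);             /* child: fork() returned 0 */
    }
    /* parent: fork() returned the child's process ID */
    assert(pid > 0);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both branches run after a single call: the same line of code observes 0 in one process and a positive process ID in the other.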
Run WeensyOS with make run or make run-console. At any time, press the ‘f’ key. This will soft-reboot WeensyOS and ask it to run a single process from the p-fork application, rather than the gang of allocator processes. You should see something like this in the memory map:
That’s because you haven’t implemented fork() yet. How to implement fork():
When a process calls fork(), look for a free process slot in the processes[] array. Don’t use slot 0. If no free slot exists, return -1 to the caller.
If a free slot is found, make a copy of current->p_pagetable, the forking process’s page table, using your function from earlier.
But you must also copy the process data in every application page shared by the two processes. The processes should not share any writable memory except the console (otherwise they wouldn’t be isolated). So fork() must examine every virtual address in the old page table. Whenever the parent process has an application-writable page at virtual address V, then fork() must allocate a new physical page P; copy the data from the parent’s page into P, using memcpy(); and finally map page P at address V in the child process’s page table. (memcpy() works like the one installed on your Linux dev box; use the man pages for reference.)
The child process’s registers are initialized as a copy of the parent process’s registers, except for reg_rax.
Use virtual_memory_lookup to query the mapping between virtual and physical addresses in a page table.
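The copying rule above can be sketched on a deliberately tiny model: 16 physical pages, 8 virtual pages per process, and a flat array instead of a four-level page table. All toy_ names and TOY_CONSOLE_VPN are hypothetical; the point is only the decision of which pages get a private copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGESIZE 4096u
#define NPHYS    16          /* toy physical memory: 16 pages */
#define NVIRT    8           /* toy address space: 8 virtual pages */
#define TOY_CONSOLE_VPN 0    /* the one writable page that stays shared */
#define PTE_P 1
#define PTE_W 2
#define PTE_U 4

typedef struct { int pn; int perm; } toy_mapping;    /* perm == 0: unmapped */
typedef struct { toy_mapping slot[NVIRT]; uint64_t reg_rax; } toy_proc;

uint8_t toy_memory[NPHYS][PAGESIZE];
int8_t  toy_refcount[NPHYS];

int toy_find_free_page(void) {
    for (int pn = 0; pn < NPHYS; ++pn) {
        if (toy_refcount[pn] == 0) return pn;
    }
    return -1;
}

/* Copy `parent` into `child`. Application-writable pages (other than the
   console) get a private copy; all other mappings are duplicated as-is.
   Returns child_pid (the parent's return value), or -1 if out of memory. */
int toy_fork(const toy_proc *parent, toy_proc *child, int child_pid) {
    const int uw = PTE_P | PTE_W | PTE_U;
    *child = *parent;                       /* registers and mappings start as a copy */
    for (int vpn = 0; vpn < NVIRT; ++vpn) {
        if (vpn == TOY_CONSOLE_VPN) continue;
        if ((parent->slot[vpn].perm & uw) != uw) continue;
        int pn = toy_find_free_page();
        if (pn < 0) return -1;              /* a full version would undo partial work */
        toy_refcount[pn] = 1;
        memcpy(toy_memory[pn], toy_memory[parent->slot[vpn].pn], PAGESIZE);
        child->slot[vpn].pn = pn;
    }
    child->reg_rax = 0;                     /* fork() returns 0 in the child */
    return child_pid;
}
```

After toy_fork, writing to the child's copy of a data page leaves the parent's page untouched, which is the isolation property the memory map should show.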
When you’re done, you should see something like the below after pressing ‘f’.
An image like the below, however, means that you forgot to copy the data for some pages, so the processes are actually sharing stack and/or data pages when they should not:
Other hints.
Make sure you’re setting the owner correctly when allocating new page tables.
Failing this step of the lab does not mean that the bug is actually in this step. It’s very common that a student’s step 5 code fails because of errors made in any of the earlier steps.
Don't forget to commit and push after finishing fork!
(Extra credit) Step 6: Shared read-only memory
This extra credit and the next are challenging, and the point values will not be commensurate with the extra effort. We supply these for completeness, and for those who want to go deeper into the material.
It’s wasteful for fork() to copy all of a process’s memory. For example, most processes, including p-fork, never change their code. So what if we shared the memory containing the code? That’d be fine for process isolation, as long as neither process could write the code.
Step A: change the process loader in k-loader.c to detect read-only program segments and map them as read-only for applications (PTE_P|PTE_U). A program segment ph is read-only iff (ph->p_flags & ELF_PFLAG_WRITE) == 0.
Step B: From step 5, your fork() code already shouldn’t copy shareable pages. But make sure in this step that your code keeps track accurately of the number of active references to each user page. Specifically, if pageinfo[pn].refcount > 0 and pageinfo[pn].owner > 0, then pageinfo[pn].refcount should equal the number of times pn is mapped in process page tables.
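The reference-count invariant in Step B can be phrased as an executable check. The sketch below uses a toy model (a few flat page tables and a small bookkeeping array); the toy_ names and sizes are hypothetical and only the invariant itself comes from the text.

```c
#include <assert.h>
#include <stdint.h>

#define NPHYS 16   /* toy physical pages */
#define NVIRT 8    /* toy virtual pages per process */
#define NPROC 4    /* toy process page tables */
#define PTE_P 1

typedef struct { int pn; int perm; } toy_mapping;
typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;

/* Count how many times physical page pn is mapped across all toy page tables. */
int toy_count_mappings(toy_mapping tables[NPROC][NVIRT], int pn) {
    int count = 0;
    for (int p = 0; p < NPROC; ++p) {
        for (int vpn = 0; vpn < NVIRT; ++vpn) {
            if ((tables[p][vpn].perm & PTE_P) && tables[p][vpn].pn == pn) {
                ++count;
            }
        }
    }
    return count;
}

/* Return 1 if every in-use, process-owned page has a refcount equal to
   the number of times it is mapped; return 0 otherwise. */
int toy_refcounts_consistent(toy_mapping tables[NPROC][NVIRT],
                             const toy_pageinfo info[NPHYS]) {
    for (int pn = 0; pn < NPHYS; ++pn) {
        if (info[pn].refcount > 0 && info[pn].owner > 0
            && info[pn].refcount != toy_count_mappings(tables, pn)) {
            return 0;
        }
    }
    return 1;
}
```

A check of this shape is useful after every fork: sharing a read-only page means one more mapping, so its refcount has to go up by one at the same time.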
When you’re done, running p-fork should look like this:
Hint:
Mark a program segment read-only after the memcpy and memset operations that add data to the segment. Otherwise you’ll get a fault.
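The ordering in this hint, together with the read-only test from Step A, can be sketched as follows. This is a toy model, not the loader in k-loader.c: toy_page, segment_perm, and toy_load_segment are hypothetical names, and ELF_PFLAG_WRITE is assumed to equal 2, matching the standard ELF PF_W flag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTE_P 1
#define PTE_W 2
#define PTE_U 4
#define ELF_PFLAG_WRITE 2   /* assumed to match the ELF PF_W bit */

/* Final page permissions for a program segment with the given p_flags. */
int segment_perm(uint32_t p_flags) {
    if ((p_flags & ELF_PFLAG_WRITE) == 0) {
        return PTE_P | PTE_U;           /* read-only for applications */
    }
    return PTE_P | PTE_W | PTE_U;
}

/* Toy page: a small buffer plus its current permissions. */
typedef struct { uint8_t bytes[64]; int perm; } toy_page;

/* Load n bytes into the page while it is still writable, then apply
   the final permissions. Returns 0 on success, -1 if n is too large. */
int toy_load_segment(toy_page *pg, const uint8_t *src, size_t n, uint32_t p_flags) {
    if (n > sizeof pg->bytes) return -1;
    pg->perm = PTE_P | PTE_W | PTE_U;       /* writable during loading */
    memset(pg->bytes, 0, sizeof pg->bytes);
    memcpy(pg->bytes, src, n);
    pg->perm = segment_perm(p_flags);       /* tighten only after the copies */
    return 0;
}
```

Reversing the last two steps is the mistake the hint warns about: the copy would then target a page that is no longer writable.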
Again, commit and push!
(Extra credit) Step 7: Freeing memory
So far none of your test programs have ever freed memory or exited. Memory allocation’s pretty easy until you add free! So let’s do that, by allowing applications to exit. In this exercise you’ll implement the sys_exit() system call, which exits the current process.
This exercise is challenging: freeing memory will tend to expose weaknesses and problems in your other code. To test your work, use make run and then type ‘e’. This reboots WeensyOS to run the p-forkexit program. (Initially it’ll crash because sys_exit() isn’t implemented yet.)
p-forkexit combines two types of behavior:
Process 1 forks children indefinitely.
The child processes, #2 and up, are memory allocators, as in the previous parts of the lab. But with small probability at each step, each child process either exits or attempts to fork a new child.
The result is that once your code is correct, p-forkexit makes crazy patterns forever. An example:
Your picture might look a little different; for example, thanks to Step 6, your processes should share a code page, which would appear as a darker-colored “1”.
Here’s your task.
sys_exit() should mark a process as free and free all of its memory. This includes the process’s code, data, heap, and stack pages, as well as the pages used for its paging structures.
In p-forkexit, unlike in previous parts of the lab, sys_fork() can run when there isn’t quite enough memory to create a new process. Your code should handle this case. If there isn’t enough free memory to allocate a process, fork() should clean up after itself (i.e., free any memory that was allocated for the new process before memory ran out, including pages that were allocated as part of the paging structures), and then return -1 to the caller. There should be no memory leaks.
The check_virtual_memory function, which runs
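The freeing rule for sys_exit() can be sketched on a toy model with reference counts. The toy_ names are hypothetical, and using owner 0 to mean "free" and a negative owner to mean "kernel-owned" are assumptions of this sketch. The same release routine is what a failed fork() would call to clean up pages it had already allocated.

```c
#include <assert.h>
#include <stdint.h>

#define NPHYS 16   /* toy physical pages */
#define NVIRT 8    /* toy virtual pages per process */
#define PTE_P 1

typedef struct { int pn; int perm; } toy_mapping;    /* perm == 0: unmapped */
typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;

toy_pageinfo toy_pages[NPHYS];

/* Drop one reference to page pn; the page becomes free when the count hits 0. */
void toy_page_release(int pn) {
    assert(toy_pages[pn].refcount > 0);
    if (--toy_pages[pn].refcount == 0) {
        toy_pages[pn].owner = 0;            /* 0 means "free" in this sketch */
    }
}

/* Release every process-owned page mapped by an exiting process and
   clear its mappings. Kernel-owned pages (owner < 0) are left alone. */
void toy_exit(toy_mapping slot[NVIRT]) {
    for (int vpn = 0; vpn < NVIRT; ++vpn) {
        if (!(slot[vpn].perm & PTE_P)) continue;
        int pn = slot[vpn].pn;
        if (toy_pages[pn].owner > 0) {
            toy_page_release(pn);
        }
        slot[vpn] = (toy_mapping){ 0, 0 };
    }
}
```

A shared page (for example, a code page with refcount 2) survives the first exit with refcount 1 and is freed only when its last user exits. The pages holding the paging structures themselves are not modeled here and would need to be released as well.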
Heads up. As always, it’s important to start on time. In this case, on time means 3 weeks before the assignment is due, as you will almost certainly need all of the allotted time to complete the lab. Kernel development is less forgiving than developing user-level applications; tiny deviations in the configuration of hardware (such as the MMU) by the OS tend to bring the whole (emulated) machine to a halt.
To save yourself headaches later, read this lab writeup in its entirety before you begin.
Resources.
You may want to look at Chapter 9 of CS:APP3e (from which our x86-64 virtual memory handout is borrowed). The book is on reserve at the Courant library. Section 9.7 in particular describes the 64-bit virtual memory architecture of the x86-64 CPU. Figure 9.23 and Section 9.7.1 show and discuss the PTE_P, PTE_W, and PTE_U bits; these are flags in the x86-64 hardware’s page table entries that play a central role in this lab.
You may find yourself during the lab wanting to understand particular assembly instructions. Here are two guides to x86-64 instructions, from Brown and CMU. The former is more digestible; the latter is more comprehensive. The supplied code also uses certain assembly instructions like iret; see here for a reference.
Getting Started
You’ll be working in the Docker container as usual. We assume that you have set up the upstream as described in the lab setup. Then run the following on your local machine (Mac users can do this on their local machine or within the Docker container; Windows and CIMS users should do this from outside the container):
$ cd ~/cs202
$ git fetch upstream
$ git merge upstream/main
This lab’s files are located in the lab4 subdirectory.
If you have any “conflicts” from lab 3, resolve them before continuing further. Run git push to save your work back to your personal repository.
Another heads up. Given the complexity of this lab, and the possibility of breaking the functionality of the kernel if you code in some errors, make sure to commit and push your code often! It's very important that your commits have working versions of the code, so if something goes wrong, you can always go back to a previous commit and get back a working copy! At the very least, for this lab, you should be committing once per step (and probably more often), so you can go back to the last step if necessary.
Goal
You will implement complete and correct memory isolation for WeensyOS processes. Then you'll implement full virtual memory, which will improve utilization. You'll implement fork() (creating new processes at runtime) and, for extra credit, you’ll implement exit() (destroying processes at runtime).
We’ve provided you with a lot of support code for this assignment; the code you will need to write is in fact limited in extent. Our complete solution (for all 5 stages) consists of well under 300 lines of code beyond what we initially hand out to you. All the code you write will go in kernel.c (except for part of step 6).
Testing, checking, and validation
For this assignment, your primary checking method will be to run your instance of WeensyOS and visually compare it to the images you see below in the assignment.
Studying these graphical memory maps carefully is the best way to determine whether your WeensyOS code for each stage is working correctly. Therefore, you will definitely want to make sure you understand how to read these maps before you start to code.
We supply some grading scripts, outlined at the end of the lab, but those will not be your principal source of feedback. For the most part, they indicate only whether a given step is passing or failing; look to the memory maps to understand why.
Initial state
Enter the Docker environment:
$ ./cs202-run-docker
cs202-user@172b6e333e91:~/cs202-labs$ cd lab4/
cs202-user@172b6e333e91:~/cs202-labs/lab4$ make run
The rest of these instructions presume that you are in the Docker environment. We omit the cs202-user@172b6e333e91:~/cs202-labs part of the prompt.
make run should cause you to see something like the below, which shows four processes running in parallel, each running a version of the program in p-allocator:
This image loops forever; in an actual run, the bars will move to the right and stay there. Don't worry if your image has different numbers of K's or otherwise has different details.
If your bars run painfully slowly, edit the p-allocator.c file and reduce the ALLOC_SLOWDOWN constant.
Stop now to read and understand p-allocator.c. Here’s how to interpret the memory map display:
WeensyOS displays the current state of physical and virtual memory. Each character represents 4 KB of memory: a single page. There are 2 MB of physical memory in total. (Ask yourself: how many pages is this?)
WeensyOS runs four processes, 1 through 4. Each process is compiled from the same source code (p-allocator.c), but linked to use a different region of memory.
Each process asks the kernel for more heap memory, one page at a time, until it runs out of room. As usual, each process's heap begins just above its code and global data, and ends just below its stack. The processes allocate heap memory at different rates: compared to Process 1, Process 2 allocates twice as quickly, Process 3 goes three times faster, and Process 4 goes four times faster. (A random number generator is used, so the exact rates may vary.) The marching rows of numbers show how quickly the heap spaces for processes 1, 2, 3, and 4 are allocated.
Here are two labeled memory diagrams, showing what the characters mean and how memory is arranged.
The virtual memory display is similar.
The virtual memory display cycles successively among the four processes’ address spaces. In the base version of the WeensyOS code we give you to start from, all four processes’ address spaces are the same (your job will be to change that!).
Blank spaces in the virtual memory display correspond to unmapped addresses. If a process (or the kernel) tries to access such an address, the processor will page fault.
The character shown at address X in the virtual memory display identifies the owner of the corresponding physical page.
In the virtual memory display, a character is reverse video if an application process is allowed to access the corresponding address. Initially, any process can modify all of physical memory, including the kernel. Memory is not properly isolated.
Running WeensyOS
Read the README-OS.md file for information on how to run WeensyOS.
There are several ways to debug WeensyOS. We recommend adding log_printf statements to your code. The output of log_printf is written to the file /tmp/log.txt outside QEMU. We also recommend that you use assertions (of which we saw a few in lab 1) to catch problems early. For example, call the helper functions we’ve provided, check_page_table_mappings and check_page_table_ownership, to test a page table for obvious errors.
Finally, you can and should use gdb, which we cover at the end of this section.
Memory system layout
The WeensyOS memory system layout is defined by several constants:
Constant | Meaning
KERNEL_START_ADDR | Start of kernel code.
KERNEL_STACK_TOP | Top of kernel stack. The kernel stack is one page long.
console | Address of CGA console memory.
PROC_START_ADDR | Start of application code. Applications should not be able to access memory below this address, except for the single page at console.
MEMSIZE_PHYSICAL | Size of physical memory in bytes. WeensyOS does not support physical addresses ≥ this value. Defined as 0x200000 (2MB).
MEMSIZE_VIRTUAL | Size of virtual memory. WeensyOS does not support virtual addresses ≥ this value. Defined as 0x300000 (3MB).
Writing expressions for addresses
WeensyOS uses several C macros to handle addresses. They are defined at the top of x86-64.h. The most important include:
Macro | Meaning
PAGESIZE | Size of a memory page. Equals 4096 (or, equivalently, 1 << 12).
PAGENUMBER(addr) | Page number for the page containing addr. Expands to an expression analogous to addr / PAGESIZE.
PAGEADDRESS(pn) | The initial address (zeroth byte) in page number pn. Expands to an expression analogous to pn * PAGESIZE.
PAGEINDEX(addr, level) | The index in the levelth page table for addr. level must be between 0 and 3; 0 returns the level-1 page table index (address bits 39–47), 1 returns the level-2 index (bits 30–38), 2 returns the level-3 index (bits 21–29), and 3 returns the level-4 index (bits 12–20).
PTE_ADDR(pe) | The physical address contained in page table entry pe. Obtained by masking off the flag bits (setting the low-order 12 bits to zero).
Before you begin coding, you should both understand what these macros represent and be able to derive values for them if you were given a different page size.
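One way to check your understanding of these macros is to re-derive them from the page-size exponent. The sketch below does that with ordinary C functions. All of the toy_ and TOY_ names are hypothetical stand-ins written for this illustration; they are not the definitions in x86-64.h, and the 9-bits-per-level assumption is simply the bit ranges given for PAGEINDEX above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical re-derivations for study; the real macros live in x86-64.h.
#define TOY_PAGEOFFBITS 12                                // log2(page size)
#define TOY_PAGESIZE    (UINT64_C(1) << TOY_PAGEOFFBITS)  // 4096 bytes
#define TOY_INDEXBITS   9                                 // 512 entries per table

// Page number of the page containing addr (addr / page size).
uint64_t toy_pagenumber(uint64_t addr) {
    return addr >> TOY_PAGEOFFBITS;
}

// Address of the zeroth byte of page pn (pn * page size).
uint64_t toy_pageaddress(uint64_t pn) {
    return pn << TOY_PAGEOFFBITS;
}

// Index into the page table at `level` (0 = top-level table, 3 = last level).
unsigned toy_pageindex(uint64_t addr, int level) {
    assert(level >= 0 && level <= 3);
    int shift = TOY_PAGEOFFBITS + (3 - level) * TOY_INDEXBITS;
    return (unsigned) ((addr >> shift) & ((1U << TOY_INDEXBITS) - 1));
}

// Physical address held in a page table entry: clear the low 12 flag bits.
uint64_t toy_pte_addr(uint64_t pte) {
    return pte & ~(TOY_PAGESIZE - 1);
}
```

With a different page size, only TOY_PAGEOFFBITS (and, on real hardware, the number of index bits per level) would change; the shifts and masks follow from those two numbers.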
Kernel and process address spaces
The version of WeensyOS you receive at the start of lab4 places the kernel and all processes in a single, shared address space. This address space is defined by the kernel_pagetable page table. kernel_pagetable is initialized to the identity mapping: virtual address X maps to physical address X.
As you work through the lab, you will shift processes to using their own independent address spaces, where each process can access only a subset of physical memory.
The kernel, though, must remain able to access any location in physical memory. Therefore, all kernel functions run using the kernel_pagetable page table. Thus, in kernel functions, each virtual address maps to the physical address with the same number. The exception() function explicitly installs kernel_pagetable when it begins.
WeensyOS system calls are more expensive than they need to be, since every system call switches address spaces twice (once to kernel_pagetable and once back to the process’s page table). Real-world operating systems avoid this overhead. To do so, real-world kernels access memory using process page tables, rather than a kernel-specific kernel_pagetable. This makes a kernel’s code more complicated, since kernels can’t always access all of physical memory directly under that design.
Using tmux
It will be handy to be able to “see” multiple sessions within Docker at the same time. A good tool for this is called tmux.
We suggest reading, and typing along with, this excellent tmux tutorial. It should take no more than 10 minutes and will be well worth it. Our debugging instructions below will assume that you have done so. Other tmux resources:
MIT’s missing semester: Search for the section called “Terminal Multiplexers”.
Cheatsheet: This is a more comprehensive list of commands, though the formatting is not the best, being interspersed with ads.
If you find yourself needing to exit tmux, either exit all of the panes in the current window, or do: C-b :kill-session
Using gdb
The debugger that we have seen, gdb, can be used to debug an already running process, even one over a network. QEMU supports this facility (see here). As a result, you can use gdb to single-step the software that is running on top of the emulated processor created by QEMU.
Here are the steps. These steps assume that (1) you have taken the 10 minutes to work through the tmux tutorial above, and (2) you are working within the directory lab4 underneath cs202-labs inside the Docker container.
We will start by creating two side-by-side panes in tmux. The C-b % means “type Ctrl-b together, let go, and then type the % key”. Please see the tmux tutorial above for more.
$ tmux
C-b %
At this point, you should have two side-by-side panes, with the active one being the one on the right. Go back to the one on the left:
C-b <left arrow>
If you’re having trouble with C-b (which, again, refers to Ctrl-b), please again see the tmux tutorial in the prior section. At this point you should be in the left pane. The next command runs QEMU in a mode where it expects a debugger to attach:
$ make run-gdb
You will see “VGA Blank mode”. Now, get back to the right-hand pane:
C-b <right arrow>
In that terminal, invoke gdb. Do this via the script gdb-wrapper.sh, which will invoke the correct version of gdb (handling the case of M1/M2 hardware). You don’t need to tell gdb what you are debugging and how to connect over the network, because the .gdbinit file in that directory tells gdb about these things for you. So you type:
$ ./gdb-wrapper.sh
Make sure you see:
...
The target architecture is set to "i386:x86-64".
add symbol table from file "obj/bootsector.full" at
.text_addr = 0x7c00
add symbol table from file "obj/p-allocator.full" at
.text_addr = 0x100000
add symbol table from file "obj/p-allocator2.full" at
.text_addr = 0x140000
add symbol table from file "obj/p-allocator3.full" at
.text_addr = 0x180000
add symbol table from file "obj/p-allocator4.full" at
.text_addr = 0x1c0000
add symbol table from file "obj/p-fork.full" at
.text_addr = 0x100000
add symbol table from file "obj/p-forkexit.full" at
.text_addr = 0x100000
If you do not see something like that, then it means that you did not load the appropriate debugging information into gdb; likely, you are not running from within the lab4 directory.
Now, set a breakpoint, for example at the function kernel() (or whatever function you want to break at):
(gdb) break kernel
Breakpoint 1 at 0x40167: file kernel.c, line 86.
Now run the “remote” software (really, the WeensyOS kernel in the left-hand pane). Do this by telling gdb to continue, via the c command:
(gdb) c
You should see in the right-hand pane:
Continuing.
Breakpoint 1, kernel (command=0x0) at kernel.c:86
86 void kernel(const char* command) {
1: x/5i $pc
=> 0x40167    endbr64
   0x4016b    push %rbp
   ...        ... %rsp,%rbp
   ...        ... $0x20,%rsp
   ...        ... %rdi,-0x18(%rbp)
(gdb)
You will also see the kernel begin to execute in the left-hand pane.
Now, you can and should use the existing facilities of gdb to poke around. gdb understands the hardware very well. So you can, for example, ask it to print out the value of %cr3:
(gdb) info registers cr3
cr3 0x8000 [ PDBR=8 PCID=0 ]
You are encouraged to use gdb’s facilities. Type help at the (gdb) prompt to get a menu. See the tmux section above for how to exit tmux.
Step 1: Kernel isolation
In the starting code we’ve given you, WeensyOS processes could stomp all over the kernel’s memory if they wanted to. Better prevent that. Change kernel(), the kernel initialization function, so that kernel memory is inaccessible to applications, except for the memory holding the CGA console (the single page at (uintptr_t) console == 0xB8000).1
When you are done, WeensyOS should look like the below. In the virtual map, kernel memory is no longer reverse-video, since the user can’t access it. Note the lonely CGA console memory block in reverse video in the virtual address space.
Hints:
Use virtual_memory_map. A description of this function is in kernel.h. You will benefit from reading all the function descriptions in kernel.h. You can supply NULL for the allocator argument for now.
If you really want to look at the code for virtual_memory_map, it is in k-hardware.c, along with many other hardware-related functions.
The perm argument to virtual_memory_map is a bitwise-or of zero or more PTE flags, PTE_P, PTE_W, and PTE_U. PTE_P marks Present pages (pages that are mapped). PTE_W marks Writable pages. PTE_U marks User-accessible pages—pages accessible to applications. You want kernel memory to be mapped with permissions PTE_P|PTE_W, which will prevent applications from reading or writing the memory, while allowing the kernel to both read and write.
Make sure that your sys_page_alloc system call preserves kernel isolation: Applications shouldn’t be able to use sys_page_alloc to screw up the kernel.
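The permission choice described in these hints can be sketched as a small helper that picks the perm value for one page below the application area. This is only an illustration: the PTE_* values are defined locally so that the block stands alone (they follow the x86-64 page table entry bit layout), and TOY_PROC_START_ADDR is an assumed value taken from the 0x100000 load address that gdb prints for p-allocator. Check kernel.h for the real constants.

```c
#include <assert.h>
#include <stdint.h>

// Flag values follow the x86-64 page table entry layout.
#define PTE_P 1   // present
#define PTE_W 2   // writable
#define PTE_U 4   // user-accessible

// Assumed values for illustration; kernel.h has the real definitions.
#define TOY_PAGESIZE        4096UL
#define TOY_CONSOLE_ADDR    0xB8000UL
#define TOY_PROC_START_ADDR 0x100000UL

// Permissions for the page containing addr, where addr lies below the
// application area: kernel-only everywhere except the console page.
int toy_kernel_region_perm(uintptr_t addr) {
    assert(addr < TOY_PROC_START_ADDR);
    uintptr_t page = addr & ~(TOY_PAGESIZE - 1);
    if (page == TOY_CONSOLE_ADDR) {
        return PTE_P | PTE_W | PTE_U;   // applications may write the console
    }
    return PTE_P | PTE_W;               // present and writable, kernel only
}
```

In your kernel, a loop over the pages below PROC_START_ADDR could compute a value like this for each page and pass it as the perm argument to virtual_memory_map.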
When you're done with this step, make sure to commit and push your code!
Step 2: Isolated address spaces
Implement process isolation by giving each process its own independent page table. Your OS memory map should look something like this when you’re done:
(Yours won’t look exactly like that; in the first line of physical and virtual memory, instead of having the pattern R11223344, yours will probably have a pattern like R1111222233334444. This is because the gif is from a 32-bit architecture; recall that on a 64-bit architecture, there are four levels of page table required.)
That is, each process only has permission to access its own pages. You can tell this because only its own pages are shown in reverse video.
What goes in per-process page tables:
The initial mappings for addresses less than PROC_START_ADDR should be copied from those in kernel_pagetable. You can use a loop with virtual_memory_lookup and virtual_memory_map to copy them. Alternately, you can copy the mappings from the kernel’s page table into the new page tables; this is faster, but make sure you copy the right data!
The initial mappings for the user area—addresses greater than or equal to PROC_START_ADDR—should be inaccessible to user processes (that is, PTE_U should not be set for these PTEs). In our solution (shown above), these addresses are totally inaccessible (so they show as blank), but you can also change this so that the mappings are still there, but accessible only to the kernel, as in this diagram:
The reverse video shows that this OS also implements process isolation correctly.
[Note: This second approach will pass the automated tests for step 2 but not for steps 3 and beyond. Thus, we recommend taking the first approach, namely total inaccessibility.]
How to implement per-process page tables:
Change process_setup to create per-process page tables.
We suggest you write a copy_pagetable(x86_64_pagetable* pagetable, int8_t owner) function that allocates and returns a new page table, initialized as a full copy of pagetable (including all mappings from pagetable). This function will be useful in Step 5. In process_setup you can modify the page table returned by copy_pagetable according to the requirements above. Your function can use pageinfo to find free pages to use for page tables. Read about pageinfo at the top of kernel.c.
Remember that the x86-64 architecture uses four-level page tables.
The easiest way to copy page tables involves an allocator function suitable for passing to virtual_memory_map.
You’ll need at least to allocate a level-1 page table and initialize it to zero. You can also set up the whole four-level page table skeleton (for addresses 0...MEMSIZE_VIRTUAL - 1) yourself; then you don’t need an allocator function.
A physical page is free if pageinfo[PAGENUMBER].refcount == 0. Look at the other code in kernel.c for some hints on how to examine the pageinfo[] array.
All of process P’s page table pages must have pageinfo[...].owner == P or WeensyOS’s consistency-checking functions will fail. This will affect your allocator function. (Hint: Don’t forget that global variables are allowed in your code!)
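The allocator hints above can be sketched on a toy model. The sketch assumes the allocator callback takes no arguments, which is why the owner is passed through a global variable. The toy_pageinfo struct, the toy_pages[] array, and the 16-page toy memory are hypothetical simplifications of the structures described at the top of kernel.c, not the real ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGESIZE 4096
#define TOY_NPAGES   16          // toy machine; WeensyOS has more pages

// Simplified stand-in for the pageinfo[] array described in kernel.c.
typedef struct {
    int8_t owner;
    int8_t refcount;
} toy_pageinfo;

toy_pageinfo toy_pages[TOY_NPAGES];
unsigned char toy_memory[TOY_NPAGES * TOY_PAGESIZE];

// Set this before building a page table so that the allocator knows which
// process the new page-table pages belong to.
int8_t toy_alloc_owner;

// Returns a zeroed free page owned by toy_alloc_owner, or NULL if none is left.
void* toy_pagetable_allocator(void) {
    for (int pn = 0; pn < TOY_NPAGES; ++pn) {
        if (toy_pages[pn].refcount == 0) {
            toy_pages[pn].refcount = 1;
            toy_pages[pn].owner = toy_alloc_owner;
            void* page = &toy_memory[(size_t) pn * TOY_PAGESIZE];
            memset(page, 0, TOY_PAGESIZE);   // page tables must start empty
            return page;
        }
    }
    return NULL;
}
```

Zeroing the page matters: a page table page that still holds old data would contain garbage entries, some of which might look like present mappings.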
If you create an incorrect page table, WeensyOS might crazily reboot. Don’t panic! Add log_printf statements. Another useful technique that may at first seem counterintuitive: add infinite loops to your kernel to track down exactly where a fault occurs. (If the OS hangs without crashing once you’ve added an infinite loop, then the crash you’re debugging must occur after the infinite loop.)
Again, once finished with step 2, commit and push!
Step 3: Virtual page allocation
Up to this point in the lab, WeensyOS processes have used physical page allocation: the page with physical address X is used to satisfy the sys_page_alloc(X) allocation request for virtual address X. This strategy is inflexible and limits utilization. Change the implementation of the INT_SYS_PAGE_ALLOC system call so that it can use any free physical page to satisfy a sys_page_alloc(X) request.
Your new INT_SYS_PAGE_ALLOC code must perform the following tasks.
Find a free physical page using the pageinfo[] array. Return -1 to the application if you can’t find one. Use any algorithm you like to find a free physical page; our solution just returns the first one we find.
Record the physical page’s allocation in pageinfo[].
Map that physical page at the requested virtual address.
Don’t modify the assign_physical_page helper function, which is also used by the program loader. You can write a new function if you need to.
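The three tasks above (find, record, map) can be sketched on a toy model in which a flat array stands in for one process's page table. Everything named toy_ is hypothetical and exists only for this illustration; in your kernel the mapping step would go through virtual_memory_map instead of an array store.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGESIZE 4096
#define TOY_NPAGES   8     // physical pages in the toy machine
#define TOY_NVPAGES  12    // virtual pages per address space

typedef struct {
    int8_t owner;
    int8_t refcount;
} toy_pageinfo;

toy_pageinfo toy_pages[TOY_NPAGES];

// Flat stand-in for one process's page table: virtual page number to
// physical page number, or -1 if unmapped.
typedef struct {
    int phys[TOY_NVPAGES];
} toy_vmap;

void toy_vmap_init(toy_vmap* vm) {
    for (int i = 0; i < TOY_NVPAGES; ++i) {
        vm->phys[i] = -1;
    }
}

// Returns the first free physical page number, or -1 if memory is full.
int toy_find_free_page(void) {
    for (int pn = 0; pn < TOY_NPAGES; ++pn) {
        if (toy_pages[pn].refcount == 0) {
            return pn;
        }
    }
    return -1;
}

// Model of the page-allocation system call: 0 on success, -1 on failure.
// This sketch does not handle a va that is already mapped.
int toy_sys_page_alloc(toy_vmap* vm, int8_t pid, uintptr_t va) {
    if (va % TOY_PAGESIZE != 0 || va / TOY_PAGESIZE >= TOY_NVPAGES) {
        return -1;                        // misaligned or out of range
    }
    int pn = toy_find_free_page();
    if (pn < 0) {
        return -1;                        // out of physical memory
    }
    toy_pages[pn].owner = pid;            // record the allocation
    toy_pages[pn].refcount = 1;
    vm->phys[va / TOY_PAGESIZE] = pn;     // map it at the requested address
    return 0;
}
```

Note that the physical page number no longer has any relationship to the virtual address: the first request, wherever it lands in virtual memory, simply gets the first free physical page.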
Here’s how our OS looks after this step.
Now commit and push your code before moving on to step 4!
Step 4: Overlapping address spaces
Now the processes are isolated, which is awesome. But they’re still not taking full advantage of virtual memory. Isolated address spaces can use the same virtual addresses for different physical memory. There’s no need to keep the four process address spaces disjoint.
In this step, change each process’s stack to start from address 0x300000 == MEMSIZE_VIRTUAL. Now the processes have enough heap room to use up all of physical memory! Here’s how the memory map will look after you’ve done it successfully:
Notice the single reverse video page in the bottom right, for all processes. This is their stack page: each process has the same virtual address for its stack page, but (if you’ve implemented it correctly) different physical pages.
If there’s no physical memory available, sys_page_alloc should return an error to the caller (by returning -1). Our solution additionally prints “Out of physical memory!” to the console when this happens; you don’t need to.
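As a quick check on the address arithmetic in this step: if the stack starts at MEMSIZE_VIRTUAL and grows downward, then the single stack page is the last page below that address. The sketch below assumes the 0x300000 value from the layout table; the TOY_ and toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGESIZE        4096UL
#define TOY_MEMSIZE_VIRTUAL 0x300000UL   // 3 MB, per the layout table

// Initial stack pointer: the very top of the virtual address space.
uintptr_t toy_initial_stack_pointer(void) {
    return TOY_MEMSIZE_VIRTUAL;
}

// The stack grows down, so the page to map is the last page below the top.
uintptr_t toy_stack_page(void) {
    return TOY_MEMSIZE_VIRTUAL - TOY_PAGESIZE;
}
```

Every process would use this same virtual address for its stack page, with each one backed by a different physical page.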
As always, make sure to commit and push after finishing this step!
Step 5: Fork
The fork() system call is one of Unix’s great ideas. It starts a new process as a copy of an existing one. The fork() system call appears to return twice, once to each process. To the child process, it returns 0. To the parent process, it returns the child’s process ID.
Run WeensyOS with make run or make run-console. At any time, press the ‘f’ key. This will soft-reboot WeensyOS and ask it to run a single process from the p-fork application, rather than the gang of allocator processes. You should see something like this in the memory map:
That’s because you haven’t implemented fork() yet. How to implement fork():
When a process calls fork(), look for a free process slot in the processes[] array. Don’t use slot 0. If no free slot exists, return -1 to the caller.
If a free slot is found, make a copy of current->p_pagetable, the forking process’s page table, using your function from earlier.
But you must also copy the process data in every application page shared by the two processes. The processes should not share any writable memory except the console (otherwise they wouldn’t be isolated). So fork() must examine every virtual address in the old page table. Whenever the parent process has an application-writable page at virtual address V, then fork() must allocate a new physical page P; copy the data from the parent’s page into P, using memcpy(); and finally map page P at address V in the child process’s page table. (memcpy() works like the one installed on your Linux dev box; use the man pages for reference.)
The child process’s registers are initialized as a copy of the parent process’s registers, except for reg_rax.
Use virtual_memory_lookup to query the mapping between virtual and physical addresses in a page table.
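The copying rule above can be sketched on a toy model: a flat array of (page, permission) entries stands in for a page table, and pages are only 64 bytes so the example stays small. All toy_ names, the struct layouts, and the choice of virtual page 0 as the console are hypothetical. The sketch also assumes that the child's reg_rax should hold 0, since that is the value fork() returns to the child.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTE_P 1
#define PTE_W 2
#define PTE_U 4

#define TOY_PAGESIZE      64   // tiny pages keep the toy small
#define TOY_NPAGES        8    // physical pages
#define TOY_NVPAGES       4    // virtual pages per address space
#define TOY_CONSOLE_VPAGE 0    // the one writable page processes may share

typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;
typedef struct { int pn; int perm; } toy_pte;   // perm == 0 means unmapped
typedef struct { uint64_t reg_rax; uint64_t reg_rip; } toy_registers;

toy_pageinfo toy_pages[TOY_NPAGES];
unsigned char toy_memory[TOY_NPAGES][TOY_PAGESIZE];

// Returns the first free physical page number, or -1 if memory is full.
int toy_find_free_page(void) {
    for (int pn = 0; pn < TOY_NPAGES; ++pn) {
        if (toy_pages[pn].refcount == 0) {
            return pn;
        }
    }
    return -1;
}

// Fills child[] from parent[]. Application-writable pages get a private
// copy; everything else keeps the parent's mapping. Reference counting for
// shared pages is left out. Returns 0, or -1 if memory runs out (a real
// implementation must then undo its partial work).
int toy_fork_copy(const toy_pte* parent, toy_pte* child, int8_t child_pid) {
    const int app_writable = PTE_P | PTE_W | PTE_U;
    for (int v = 0; v < TOY_NVPAGES; ++v) {
        child[v] = parent[v];
        if ((parent[v].perm & app_writable) != app_writable
            || v == TOY_CONSOLE_VPAGE) {
            continue;
        }
        int pn = toy_find_free_page();
        if (pn < 0) {
            return -1;
        }
        toy_pages[pn].owner = child_pid;
        toy_pages[pn].refcount = 1;
        memcpy(toy_memory[pn], toy_memory[parent[v].pn], TOY_PAGESIZE);
        child[v].pn = pn;
    }
    return 0;
}

// Child registers: a copy of the parent's, except that reg_rax holds the
// child's return value from fork().
toy_registers toy_child_registers(toy_registers parent_regs) {
    parent_regs.reg_rax = 0;
    return parent_regs;
}
```

The parent's own reg_rax would be set separately, to the child's process ID.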
When you’re done, you should see something like the below after pressing ‘f’.
An image like the below, however, means that you forgot to copy the data for some pages, so the processes are actually sharing stack and/or data pages when they should not:
Other hints.
Make sure you’re setting the owner correctly when allocating new page tables.
Failing this step of the lab does not mean that the bug is actually in this step. It’s very common that a student’s step 5 code fails because of errors made in any of the earlier steps.
Don't forget to commit and push after finishing fork!
(Extra credit) Step 6: Shared read-only memory
This extra credit and the next are challenging, and the point values will not be commensurate with the extra effort. We supply these for completeness, and for those who want to go deeper into the material.
It’s wasteful for fork() to copy all of a process’s memory. For example, most processes, including p-fork, never change their code. So what if we shared the memory containing the code? That’d be fine for process isolation, as long as neither process could write the code.
Step A: change the process loader in k-loader.c to detect read-only program segments and map them as read-only for applications (PTE_P|PTE_U). A program segment ph is read-only iff (ph->p_flags & ELF_PFLAG_WRITE) == 0.
Step B: From step 5, your fork() code already shouldn’t copy shareable pages. But make sure in this step that your code accurately tracks the number of active references to each user page. Specifically, if pageinfo[pn].refcount > 0 and pageinfo[pn].owner > 0, then pageinfo[pn].refcount should equal the number of times pn is mapped in process page tables.
When you’re done, running p-fork should look like this:
Hint:
Mark a program segment read-only after the memcpy and memset operations that add data to the segment. Otherwise you’ll get a fault.
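Both parts of this step can be sketched on a toy model. The first function picks page permissions from a segment's flags, assuming the write flag has the value 2 (the ELF PF_W bit); the second counts how often a physical page is mapped across several flat toy page tables, which is the number that refcount should match. The toy_ names and the flat toy_pte tables are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P 1
#define PTE_W 2
#define PTE_U 4
#define TOY_ELF_PFLAG_WRITE 2   // assumed to match ELF's PF_W bit

#define TOY_NVPAGES 4           // virtual pages per toy address space

typedef struct { int pn; int perm; } toy_pte;   // perm == 0 means unmapped

// Permissions for a loaded segment: read-only segments are mapped
// without PTE_W.
int toy_segment_perm(uint32_t p_flags) {
    if ((p_flags & TOY_ELF_PFLAG_WRITE) == 0) {
        return PTE_P | PTE_U;
    }
    return PTE_P | PTE_W | PTE_U;
}

// Counts how many present mappings of physical page pn exist across the
// first nproc toy page tables.
int toy_count_mappings(toy_pte tables[][TOY_NVPAGES], int nproc, int pn) {
    int count = 0;
    for (int p = 0; p < nproc; ++p) {
        for (int v = 0; v < TOY_NVPAGES; ++v) {
            if ((tables[p][v].perm & PTE_P) && tables[p][v].pn == pn) {
                ++count;
            }
        }
    }
    return count;
}
```

A checker in this spirit could compare the count for each process-owned page against its recorded refcount and assert that they agree.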
Again, commit and push!
(Extra credit) Step 7: Freeing memory
So far none of your test programs have ever freed memory or exited. Memory allocation’s pretty easy until you add free! So let’s do that, by allowing applications to exit. In this exercise you’ll implement the sys_exit() system call, which exits the current process.
This exercise is challenging: freeing memory will tend to expose weaknesses and problems in your other code. To test your work, use make run and then type ‘e’. This reboots WeensyOS to run the p-forkexit program. (Initially it’ll crash because sys_exit() isn’t implemented yet.)
p-forkexit combines two types of behavior:
Process 1 forks children indefinitely.
The child processes, #2 and up, are memory allocators, as in the previous parts of the lab. But with small probability at each step, each child process either exits or attempts to fork a new child.
The result is that once your code is correct, p-forkexit makes crazy patterns forever. An example:
Your picture might look a little different; for example, thanks to Step 6, your processes should share a code page, which would appear as a darker-colored “1”.
Here’s your task.
sys_exit() should mark a process as free and free all of its memory. This includes the process’s code, data, heap, and stack pages, as well as the pages used for its paging structures.
In p-forkexit, unlike in previous parts of the lab, sys_fork() can run when there isn’t quite enough memory to create a new process. Your code should handle this case. If there isn’t enough free memory to allocate a process, fork() should clean up after itself (i.e., free any memory that was allocated for the new process before memory ran out, including pages that were allocated as part of the paging structures), and then return -1 to the caller. There should be no memory leaks.
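The freeing logic can be sketched on a toy model with reference counts. The sketch assumes that owner values greater than 0 denote processes, that 0 means free, and that negative values mark reserved pages such as the console. The toy_ names and the flat toy_pte table are hypothetical. A real implementation must also release the pages that hold the paging structures themselves, which this flat model does not have.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P 1
#define PTE_W 2
#define PTE_U 4

#define TOY_NPAGES  8
#define TOY_NVPAGES 4

typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;
typedef struct { int pn; int perm; } toy_pte;   // perm == 0 means unmapped

toy_pageinfo toy_pages[TOY_NPAGES];

// Drops one reference to page pn; the page becomes free at refcount 0.
void toy_release_page(int pn) {
    assert(toy_pages[pn].refcount > 0);
    if (--toy_pages[pn].refcount == 0) {
        toy_pages[pn].owner = 0;      // 0 stands for "free" in this toy
    }
}

// Releases every process-owned, user-accessible page mapped in table[] and
// clears all mappings. It also works on a partially filled table, so a
// fork that runs out of memory could call it to clean up.
void toy_free_address_space(toy_pte* table) {
    const int user_mapped = PTE_P | PTE_U;
    for (int v = 0; v < TOY_NVPAGES; ++v) {
        int pn = table[v].pn;
        if ((table[v].perm & user_mapped) == user_mapped
            && toy_pages[pn].owner > 0) {
            toy_release_page(pn);
        }
        table[v].pn = 0;
        table[v].perm = 0;
    }
}
```

Because shared pages only lose one reference per exiting process, a code page shared with another process stays allocated until the last process using it exits.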
The check_virtual_memory function, which runs