os_kernel_lab/related_info/ostep/ostep14-afs.md
2015-03-15 16:54:19 +08:00

This program, afs.py, allows you to experiment with the cache consistency behavior of the Andrew File System (AFS). The program generates random client traces (of file opens, reads, writes, and closes), enabling the user to see if they can predict what values end up in various files.

Here is an example run:

prompt> ./afs.py -C 2 -n 1 -s 12

  Server                         c0                          c1

file:a contains:0
                         open:a [fd:0]
                         write:0 value? -> 1
                         close:0
                                                      open:a [fd:0]
                                                      read:0 -> value?
                                                      close:0
file:a contains:?
prompt>

The trace is fairly simple to read. On the left is the server, and each column shows actions being taken on each of two clients (use -C to specify a different number). Each client generates one random action (-n 1), which is either the open/read/close of a file or the open/write/close of a file. The contents of a file, for simplicity, are always just a single number.

To generate different traces, use '-s' (for a random seed), as always. Here we set it to 12 to get this specific trace.

In the trace, the server shows the initial contents of all the files in the system:

file:a contains:0

As you can see in this trace, there is just one file (a) and it contains the value 0.

Time increases downwards. The first thing that happens is client 0 (c0) opening the file 'a' (which returns a file descriptor, 0 in this case), writing to that descriptor, and then closing the file.

Immediately you see the first question posed to you:

                         write:0 value? -> 1

When writing to descriptor 0, you are overwriting an existing value with the new value of 1. What was that old value? (pretty easy in this case: 0).

Then client 1 begins doing some work (c1). It opens the file, reads it, and closes it. Again, we have a question to answer:

                                                      read:0 -> value?

When reading from this file, what value should client 1 see? Again, given AFS consistency, the answer is straightforward: 1 (the value placed in the file when c0 closed the file and updated the server).

The final question in the trace is the final value of the file on the server:

file:a contains:?

Again, the answer here is easy: 1 (as generated by c0).
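The reasoning behind these three answers can be sketched in a few lines of Python. This is only an illustrative model, not the code in afs.py: the `Server` and `Client` classes and the `run_example` helper are hypothetical names, and the model assumes a client fetches the file on every open and writes it back on close if it was modified (it ignores caching across opens and callbacks, which are covered later).

```python
class Server:
    """Holds the authoritative copy of each file (one number per file)."""
    def __init__(self, files):
        self.files = dict(files)


class Client:
    """Simplified client: fetch whole file on open, write back on close."""
    def __init__(self, server):
        self.server = server
        self.open_files = {}  # fd -> [file name, local value, dirty flag]

    def open(self, name, fd):
        # Assumption: every open fetches the current server copy.
        self.open_files[fd] = [name, self.server.files[name], False]

    def read(self, fd):
        return self.open_files[fd][1]

    def write(self, fd, value):
        entry = self.open_files[fd]
        old = entry[1]
        entry[1] = value
        entry[2] = True
        return old  # the "value?" asked about in the trace

    def close(self, fd):
        name, value, dirty = self.open_files.pop(fd)
        if dirty:
            # Modified files are sent back to the server at close time.
            self.server.files[name] = value


def run_example():
    """Replay the seed-12 trace: c0 writes 1 to a, then c1 reads a."""
    server = Server({"a": 0})
    c0, c1 = Client(server), Client(server)

    c0.open("a", 0)
    overwritten = c0.write(0, 1)
    c0.close(0)

    c1.open("a", 0)
    seen = c1.read(0)
    c1.close(0)

    return overwritten, seen, server.files["a"]


print(run_example())
```

The three values returned correspond, in order, to the overwritten value, the value read by c1, and the final server contents.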

To see if you have answered these questions correctly, run with the '-c' flag (or '--compute'), as follows:

prompt> ./afs.py -C 2 -n 1 -s 12 -c

  Server                         c0                          c1

file:a contains:0
                         open:a [fd:0]
                         write:0 0 -> 1
                         close:0
                                                      open:a [fd:0]
                                                      read:0 -> 1
                                                      close:0
file:a contains:1
prompt>

From this trace, you can see that all the question marks have been filled in with answers.

More detail about what happened is available with the '-d' ('--detail') flag. Here is an example that shows when each client issued a get or put of a file to the server:

prompt> ./afs.py -C 2 -n 1 -s 12 -c -d 1

  Server                         c0                          c1

file:a contains:0
                         open:a [fd:0]
getfile:a c:c0 [0]

                         write:0 0 -> 1

                         close:0
putfile:a c:c0 [1]

                                                      open:a [fd:0]
getfile:a c:c1 [1]

                                                      read:0 -> 1

                                                      close:0

file:a contains:1
prompt>

You can show more with higher levels of detail, including cache invalidations, the exact client cache state after each step, and extra diagnostic information. We'll show these in one more example below.

Random client actions are useful to generate new problems and try to solve them; however, in some cases it is useful to control exactly what each client does in order to see specific AFS behaviors. To do this, you can use the '-A' and '-S' flags (either separately or together).

The '-S' flag lets you control the exact schedule of client actions. Consider the example above. Let's say we wish to run client 1 in its entirety first; to achieve this end, we simply run the following:

prompt> ./afs.py -C 2 -n 1 -s 12 -S 111000

  Server                         c0                          c1

file:a contains:0
                                                      open:a [fd:0]
                                                      read:0 -> value?
                                                      close:0
                         open:a [fd:0]
                         write:0 value? -> 1
                         close:0
file:a contains:?
prompt>

The -S flag here is passed "111000" which means "run client 1, then client 1, then 1 again, then 0, 0, 0, and then repeat (if need be)". The result in this case is client 1 reading file a before client 0 writes it.
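One way to picture how a schedule string is consumed is the sketch below. It is a guess at the idea, not the simulator's own code: `expand_schedule` is a hypothetical helper, and it assumes each character of the string is a single-digit client id and that the string simply wraps around when more steps are needed.

```python
from itertools import cycle, islice


def expand_schedule(schedule, total_steps):
    """Return the client id chosen at each step, repeating the string as needed."""
    # Assumption: one character per step, each a single-digit client id.
    return [int(ch) for ch in islice(cycle(schedule), total_steps)]


print(expand_schedule("111000", 6))
print(expand_schedule("01", 5))
```

With "111000" and six steps, client 1 takes the first three steps and client 0 the last three; with "01" the two clients alternate.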

The '-A' flag gives exact control over which actions the clients take. Here is an example:

prompt> ./afs.py -s 12 -S 011100 -A oa1:r1:c1,oa1:w1:c1

  Server                         c0                          c1

file:a contains:0
                         open:a [fd:1]
                                                      open:a [fd:1]
                                                      write:1 value? -> 1
                                                      close:1
                         read:1 -> value?
                         close:1
file:a contains:?
prompt>

In this example, we have specified the following via -A "oa1:r1:c1,oa1:w1:c1". The list splits each client's actions by a comma; thus, client 0 should do whatever "oa1:r1:c1" indicates, whereas client 1 should do whatever the string "oa1:w1:c1" indicates. Parsing each command string is straightforward: "oa1" means open file 'a' and assign it file descriptor 1; "r1" or "w1" means read or write file descriptor 1; "c1" means close file descriptor 1.
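The format just described can be parsed along the following lines. This is a minimal sketch rather than the parser inside afs.py: `parse_actions` is a hypothetical name, and it assumes file names are a single character and that everything after the file name (or after "r", "w", "c") is the descriptor number.

```python
def parse_actions(spec):
    """Split an -A style string into per-client lists of (op, file, fd) tuples."""
    names = {"r": "read", "w": "write", "c": "close"}
    clients = []
    for client_spec in spec.split(","):      # one comma-separated entry per client
        actions = []
        for token in client_spec.split(":"):  # one colon-separated action per step
            kind = token[0]
            if kind == "o":
                # "oa1": open file 'a' (assumed single character) as descriptor 1
                actions.append(("open", token[1], int(token[2:])))
            elif kind in names:
                # "r1", "w1", "c1": operate on descriptor 1
                actions.append((names[kind], None, int(token[1:])))
            else:
                raise ValueError(f"unknown action token: {token!r}")
        clients.append(actions)
    return clients


for client_id, actions in enumerate(parse_actions("oa1:r1:c1,oa1:w1:c1")):
    print(client_id, actions)
```

The first list returned holds client 0's actions (open, read, close) and the second holds client 1's (open, write, close).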

So what value will the read on client 0 return?

We can also see the cache state, callbacks, and invalidations with a higher detail level (-d 7, which ORs together levels 1, 2, and 4):

prompt> ./afs.py -s 12 -S 011100 -A oa1:r1:c1,oa1:w1:c1 -c -d 7

  Server                         c0                          c1

file:a contains:0
                         open:a [fd:1]
getfile:a c:c0 [0]
                         [a: 0 (v=1,d=0,r=1)]

                                                      open:a [fd:1]
getfile:a c:c1 [0]
                                                      [a: 0 (v=1,d=0,r=1)]

                                                      write:1 0 -> 1
                                                      [a: 1 (v=1,d=1,r=1)]

                                                      close:1
putfile:a c:c1 [1]
callback: c:c0 file:a
                         invalidate a
                         [a: 0 (v=0,d=0,r=1)]
                                                      [a: 1 (v=1,d=0,r=0)]

                         read:1 -> 0
                         [a: 0 (v=0,d=0,r=1)]

                         close:1

file:a contains:1
prompt>

From this trace, we can see what happens when client 1 closes the (modified) file. At that point, c1 puts the file to the server. The server knows that c0 has the file cached, and thus sends an invalidation to c0. However, c0 already has the file open; as a result, the cache keeps the old contents until the file is closed.

You can see this in tracking the cache contents throughout the trace (available with the correct -d flag, in particular any value which sets the 3rd least significant bit to 1, such as -d 4, -d 5, -d 6, -d 7, etc.). When client 0 opens the file, you see the following cache state after the open is finished:

                         [a: 0 (v=1,d=0,r=1)]

This means file 'a' is in the cache with value '0', and has three pieces of state associated with it: v (valid), d (dirty), and r (reference count). The valid bit tracks whether the cached contents are valid; at this point they are, because the cache has not (yet) been invalidated by a callback. The dirty bit is set when the file has been written to and must be flushed back to the server when closed. Finally, the reference count tracks how many times the file has been opened (but not yet closed); this is used to ensure the client gets the old value of the file until it's been closed by all readers and then re-opened.
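To make the v, d, and r fields concrete, the sketch below replays the cache-state changes from the -d 7 trace by hand. It is not taken from afs.py: `CacheEntry`, `show`, and `replay` are hypothetical names, and each state transition is simply written out to match what the trace prints.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    value: int
    valid: bool = True    # v: cleared when a callback invalidates the entry
    dirty: bool = False   # d: set on write, cleared once flushed at close
    refcount: int = 0     # r: number of opens not yet matched by a close


def show(name, e):
    """Format an entry the way the trace prints it."""
    return f"[{name}: {e.value} (v={int(e.valid)},d={int(e.dirty)},r={e.refcount})]"


def replay():
    states = []
    c0 = CacheEntry(value=0, refcount=1)   # c0 opens a, fetching 0
    states.append(show("a", c0))
    c1 = CacheEntry(value=0, refcount=1)   # c1 opens a, fetching 0
    c1.value, c1.dirty = 1, True           # c1 writes 1 locally
    states.append(show("a", c1))
    c1.dirty = False                       # c1 closes: file put to the server
    c1.refcount -= 1
    c0.valid = False                       # server callback invalidates c0's copy
    states.append(show("a", c0))
    states.append(show("a", c1))
    return states, c0.value                # c0 still reads its old value


states, c0_read = replay()
for line in states:
    print(line)
print("c0 reads", c0_read)
```

Because c0 still holds the file open (r=1) when the callback arrives, its entry is marked invalid but keeps the value 0, which is why its read returns 0 rather than 1.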

The full list of options is available here:

Options:
  -h, --help            show this help message and exit
  -s SEED, --seed=SEED  the random seed
  -C NUMCLIENTS, --clients=NUMCLIENTS
                        number of clients
  -n NUMSTEPS, --numsteps=NUMSTEPS
                        ops each client will do
  -f NUMFILES, --numfiles=NUMFILES
                        number of files in server
  -r READRATIO, --readratio=READRATIO
                        ratio of reads/writes
  -A ACTIONS, --actions=ACTIONS
                        client actions exactly specified, e.g.,
                        oa1:r1:c1,oa1:w1:c1 specifies two clients; each opens
                        the file a, client 0 reads it whereas client 1 writes
                        it, and then each closes it
  -S SCHEDULE, --schedule=SCHEDULE
                        exact schedule to run; 01 alternates round robin
                        between clients 0 and 1. Left unspecified leads to
                        random scheduling
  -p, --printstats      print extra stats
  -c, --compute         compute answers for me
  -d DETAIL, --detail=DETAIL
                        detail level when giving answers (1:server
                        actions,2:invalidations,4:client cache,8:extra
                        labels); OR together for multiple
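Since the -d levels are meant to be ORed together, a given value can be read as a set of bit flags. The sketch below shows that decoding; `DETAIL_FLAGS` and `decode_detail` are hypothetical names used only for illustration, with the bit meanings copied from the help text above.

```python
# Bit values and labels as listed in the -d help text.
DETAIL_FLAGS = {
    1: "server actions",
    2: "invalidations",
    4: "client cache",
    8: "extra labels",
}


def decode_detail(level):
    """List which kinds of detail a given -d value turns on."""
    return [name for bit, name in DETAIL_FLAGS.items() if level & bit]


print(decode_detail(7))
print(decode_detail(5))
```

So -d 7 turns on server actions, invalidations, and client cache state, while -d 5 shows server actions and client cache state without invalidations.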

Read the AFS chapter, and answer the questions at the back, or just explore this simulator more on your own to increase your understanding of AFS.