initial SSD sim commit
parent
533129b0a9
commit
b42f17f580
|
@ -0,0 +1,451 @@
|
|||
|
||||
# Overview
|
||||
|
||||
Welcome to `ssd.py`, yet another wonderful simulator provided to you,
|
||||
for free, by the authors of OSTEP, which is also free. Pretty soon,
|
||||
you're going to think that everything important in life is free! And,
|
||||
it turns out, it kind of is: the air you breathe, the love you give
|
||||
and receive, and a book about operating systems. What else do you
|
||||
need?
|
||||
|
||||
To run the simulator, you just do the usual:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py
|
||||
```
|
||||
|
||||
The simulator models a few different types of SSDs. The first is what we'll
call an "ideal" SSD, which actually isn't much of an SSD at all; it's more like a
perfect memory. To simulate this SSD, type:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal
|
||||
```
|
||||
|
||||
To see how this one works, let's create a little workload. A workload, for an
SSD, is just a series of low-level I/O operations issued to the device. There
are three operations supported by `ssd.py`: read (which takes an address to
read, and returns the data), write (which takes an address and a piece of data
to write, in this case, a single letter), and trim (which takes an
address). The trim operation is used to indicate that a previously written block
is no longer live (e.g., the file it was in was deleted); this is particularly
useful for a log-based SSD, which can reclaim the block's space during garbage
collection and free up space in the FTL. Let's run a simple workload
consisting of just one write:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal -L w10:a -l 30 -B 3 -p 10
|
||||
```
|
||||
|
||||
The `-L` flag allows us to specify a comma-separated list of commands. Here, to
|
||||
write to logical page 10, we include the command "w10:a" which means "write"
|
||||
to logical page "10" the data of "a". We also include a few other specifics
|
||||
about the size of the SSD with the flags `-l 30 -B 3 -p 10`, but let's
|
||||
ignore those for now.
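
To make the command syntax concrete, here is a small sketch of how such a list could be decoded. It is not the parsing code `ssd.py` itself uses; the function name `parse_cmd_list` and the tuple layout are assumptions made purely for illustration.

```python
def parse_cmd_list(cmd_list):
    """Decode a string such as 'w10:a,r10,t10' into (op, address, data) tuples."""
    commands = []
    for cmd in cmd_list.split(','):
        op, rest = cmd[0], cmd[1:]
        if op == 'w':
            # a write carries "address:data"
            address, data = rest.split(':')
            commands.append((op, int(address), data))
        elif op in ('r', 't'):
            # reads and trims carry only an address
            commands.append((op, int(rest), None))
        else:
            raise ValueError('unknown command: %s' % cmd)
    return commands


print(parse_cmd_list('w10:a,r10,t10'))
```

Each tuple holds the operation letter, the logical page, and (for writes only) the single character of data.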
|
||||
|
||||
What you should see on the screen, after running the above:
|
||||
|
||||
```sh
|
||||
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

FTL    10: 10
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live             +
|
||||
```
|
||||
|
||||
The first chunk of information shows the initial state of the SSD, and the
|
||||
second chunk shows its final state. Let's walk through each piece to make sure
|
||||
you understand what they mean.
|
||||
|
||||
The first line of each chunk of output shows the contents of the FTL. This
|
||||
simulator only models a simple page-mapped FTL; thus, each entry within it
|
||||
shows the logical-to-physical page mapping for any live data items.
|
||||
|
||||
In the initial state, the FTL is empty:
|
||||
|
||||
```sh
|
||||
FTL   (empty)
|
||||
```
|
||||
|
||||
However, in the final state, you can see that the FTL maps logical page 10 to
|
||||
physical page 10:
|
||||
|
||||
```sh
|
||||
FTL    10: 10
|
||||
```
|
||||
|
||||
The reason for this simple mapping is that we are running the "ideal" SSD,
|
||||
which really just acts like a memory; if you write to logical page X, this SSD
|
||||
will just (magically) write the data to physical page X (indeed, you don't
|
||||
even really need the FTL for this; we'll just use the ideal SSD to show how
|
||||
much extra work a real SSD does, in terms of erases and data copying, as
|
||||
compared to an ideal memory).
|
||||
|
||||
The next lines of output just label the blocks and physical pages of the
|
||||
underlying Flash the simulator is modeling:
|
||||
|
||||
```sh
|
||||
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
|
||||
```
|
||||
|
||||
In this simulation, you can see that the SSD has 3 physical Flash blocks, and
|
||||
that each block has 10 physical pages. Each block is numbered (0, 1, and 2),
|
||||
as is each page (from 00 to 29); to keep the display compact (width-wise), the
|
||||
page numbering is shown across two lines. Thus, physical page "10" is labeled
|
||||
with a "1" on the first line, and a "0" on the second.
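
The block/page geometry described above boils down to integer division. The following is a minimal sketch, assuming 10 pages per block as in this example; the helper names `locate` and `page_label_rows` are hypothetical and do not appear in the simulator.

```python
def locate(physical_page, pages_per_block=10):
    """Return (block number, offset within that block) for a physical page."""
    return divmod(physical_page, pages_per_block)


def page_label_rows(num_pages):
    """Build the stacked digit rows that label each page (tens row, then ones row)."""
    width = len(str(num_pages - 1))
    labels = [str(p).zfill(width) for p in range(num_pages)]
    return [''.join(label[d] for label in labels) for d in range(width)]


print(locate(10))            # physical page 10 is the first page of block 1
for row in page_label_rows(30):
    print(row)
```

The second helper omits the spaces that the simulator prints between blocks; it only shows how a page number is split into one digit per row.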
|
||||
|
||||
The next line shows the state of each page, i.e., whether it is INVALID (i),
|
||||
ERASED (E), or VALID (v), as per the chapter:
|
||||
|
||||
```sh
|
||||
State iiiiiiiiii viiiiiiiii iiiiiiiiii
|
||||
```
|
||||
|
||||
The states for the "ideal" SSD are a bit weird, in that you can have "v" and
|
||||
"i" mixed in a block, and that the block is never "E" for erased. Below, with
|
||||
the more realistic "direct" and "log" SSDs, you'll see "E" too.
|
||||
|
||||
The final two lines show the "contents" of any written-to pages (on the "Data"
|
||||
row) and whether that data is currently live (that is, referred to in the
|
||||
FTL) in the "Live" row:
|
||||
|
||||
```sh
|
||||
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
Data             a
Live             +
|
||||
```
|
||||
|
||||
Here, we can see that in Block 1 (i.e., at physical page 10), there is the data
"a", and it is indeed live (shown by the "+" symbol).
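
"Live" has a precise meaning here: a physical page is live only if the FTL entry for its logical page still points back at it. The sketch below illustrates that test with two plain dictionaries. The names `forward_map` and `reverse_map` mirror the ones used in `ssd.py`, but the helper `is_live` is hypothetical and written only for this example.

```python
def is_live(physical_page, forward_map, reverse_map):
    """True if the FTL entry of this page's logical page points back at the page."""
    logical_page = reverse_map.get(physical_page, -1)
    return logical_page != -1 and forward_map.get(logical_page, -1) == physical_page


forward_map = {10: 10}   # FTL: logical page 10 -> physical page 10
reverse_map = {10: 10}   # physical page 10 was last written for logical page 10
print(is_live(10, forward_map, reverse_map))   # True

forward_map[10] = -1     # a trim drops the FTL entry; the data stays in Flash
print(is_live(10, forward_map, reverse_map))   # False
```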
|
||||
|
||||
Let's expand our workload a little bit, before getting to the more realistic
|
||||
types of SSDs. After writing the data, let's read it, and then let's use trim
|
||||
to delete it:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal -L w10:a,r10,t10 -l 30 -B 3 -p 10
|
||||
```
|
||||
|
||||
If you run this, you'll see two nearly identical states: the initial (empty)
state, and a final state whose FTL is also empty! (The trimmed page still holds
its old data, but nothing refers to it anymore.) Not too exciting! To see more
of what is going on, you'll have to use some more flags. Yes, this SSD simulator
uses a lot of flags; sorry, all lovers of parsimony! But alas, there is some
complexity here we must explore.
|
||||
|
||||
One useful flag is `-C`, which just shows every command that was issued, and
whether it succeeded or not:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal -L w10:a,r10,t10 -l 30 -B 3 -p 10 -C

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd 0:: write(10, a) -> success
cmd 1:: read(10) -> a
cmd 2:: trim(10) -> success

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live

prompt>
|
||||
```
|
||||
|
||||
Here, you can see the write, read, and trim, and you can also see what each
|
||||
command returned: success, the data read, and success, respectively. This will
|
||||
be more interesting later, when the simulator generates the operations
|
||||
randomly.
|
||||
|
||||
Similarly, the `-F` flag shows the state of the Flash between each operation,
|
||||
instead of just at the end. Note the subtle changes at each step:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal -L w10:a,r10,t10 -l 30 -B 3 -p 10 -F

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

FTL    10: 10
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live             +

FTL    10: 10
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live             +

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live

prompt>
|
||||
```
|
||||
|
||||
Of course, you can use `-C` and `-F` in concert to show everything (an exercise
|
||||
left to the reader).
|
||||
|
||||
The simulator also lets you generate random workloads, instead of specifying
|
||||
operations yourself. Use the "-n" flag for this, with an associated number (we
|
||||
also specify a random seed with "-s" to get a particular workload):
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal -l 30 -B 3 -p 10 -n 5 -s 10
|
||||
```
|
||||
|
||||
If you run this with `-C`, `-F`, or both, you'll see either the exact commands,
|
||||
the intermediate states of the Flash, or both. However, you can also use the
|
||||
"-q" flag to quiz yourself on what you think the commands are. Thus, run the
|
||||
following:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T ideal -l 30 -B 3 -p 10 -n 5 -s 10 -q
|
||||
(output omitted for brevity)
|
||||
```
|
||||
|
||||
Now, by examining the intermediate states, see if you can discern what the
|
||||
commands must have been (writes and trims are left completely unspecified,
|
||||
whereas reads just ask you to figure out which data was returned).
|
||||
|
||||
You can then either manually use `-C -F` to show everything, or just add the
|
||||
`-c` flag to "solve" the problem for you, to check your answers.
|
||||
|
||||
Let's now do the same thing (a random workload of five operations) but use
different, more realistic SSDs. The first is the "direct" SSD mentioned in the
chapter. This too isn't particularly realistic, but at least it uses erases and
programs to update the Flash. Specifically, when a logical page is written, it
is mapped directly to the physical page of the same number. This mapping
necessitates first a read of all the live data in that block, then an erase of
the block, and then a series of programs to restore all previously live data
as well as write the new data to Flash. Let's run it, showing the commands (`-C`)
but not the intermediate states (no `-F`):
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T direct -l 30 -B 3 -p 10 -n 5 -s 10 -C

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd 0:: write(12, z) -> success
cmd 1:: write(19, 9) -> success
cmd 2:: write(9, f) -> success
cmd 3:: trim(9) -> success
cmd 4:: read(19) -> 9

FTL    12: 12  19: 19
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State EEEEEEEEEv EEvEEEEEEv iiiiiiiiii
Data           f   z      9
Live               +      +

prompt>
|
||||
```
|
||||
|
||||
As you can see from the final state, the FTL contains two live mappings:
logical page 12 refers to physical page 12, and 19 to 19 (remember, this is
the direct mapping). You can also see three data pages with information within
them: physical page 9 contains "f", 12 contains "z", and 19 contains "9" (data
can be letters or numbers or really any single character). However, also note
that logical page 9 has been trimmed; this removes its entry from the FTL, but
the data lies there dormant (for now). If you then tried to read logical page 9,
it would no longer succeed:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T direct -l 30 -B 3 -p 10 -C -L w9:f,t9,r9

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd 0:: write(9, f) -> success
cmd 1:: trim(9) -> success
cmd 2:: read(9) -> fail: uninitialized read

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State EEEEEEEEEv iiiiiiiiii iiiiiiiiii
Data           f
Live

prompt>
|
||||
```
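
The read-erase-program cycle that the direct SSD performs on every write can be sketched in a few lines. This is a simplified model, not the simulator's own code: the function `direct_write` and the dictionary-based `flash` are assumptions made for illustration, and only valid pages are tracked.

```python
def direct_write(flash, page, data, pages_per_block=10):
    """Write `data` to physical page `page` with a read-erase-program cycle.

    `flash` maps physical page -> data for valid pages only.
    Returns how many physical operations the write needed.
    """
    block = page // pages_per_block
    first = block * pages_per_block
    block_pages = range(first, first + pages_per_block)

    # 1. read every valid page currently in the block
    saved = {p: flash[p] for p in block_pages if p in flash}
    reads = len(saved)

    # 2. erase the whole block
    for p in block_pages:
        flash.pop(p, None)

    # 3. re-program the old pages (except the one being overwritten) ...
    saved.pop(page, None)
    flash.update(saved)
    # ... and finally program the new data
    flash[page] = data
    return {'reads': reads, 'erases': 1, 'programs': len(saved) + 1}


flash = {}
print(direct_write(flash, 12, 'z'))   # first write into block 1
print(direct_write(flash, 19, '9'))   # second write into block 1: must restore page 12
```

Even this tiny workload shows the cost: the second write to the same block needs an extra read and an extra program just to preserve the page that was already there.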
|
||||
|
||||
One last SSD we should pay attention to is actually the most realistic one,
which uses log-structuring (as do most real SSDs). To use it, just change the
SSD type to "log" (we'll again turn on `-C` so we can see which operations
took place, instead of quizzing ourselves):
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T log -l 30 -B 3 -p 10 -s 10 -n 5 -C

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd 0:: write(12, z) -> success
cmd 1:: write(19, 9) -> success
cmd 2:: write(9, f) -> success
cmd 3:: trim(9) -> success
cmd 4:: read(19) -> 9

FTL    12:  0  19:  1
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State vvvEEEEEEE iiiiiiiiii iiiiiiiiii
Data  z9f
Live  ++

prompt>
|
||||
```
|
||||
|
||||
Note how the log-structured SSD writes data to the Flash. First, the current
|
||||
log block (Block 0, in this case) is erased. Then, the pages are programmed in
|
||||
order. Use the -F flag to see each step for more detail.
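
The append-only behavior of the log-structured SSD can be captured in a tiny model. The class below is a sketch under stated assumptions, not the simulator's implementation: the name `LogFTL` is hypothetical, and block erases and garbage collection are left out entirely so that only the logical-to-physical remapping is visible.

```python
class LogFTL:
    """Toy log-structured FTL: every write goes to the next free physical page."""

    def __init__(self, num_blocks=3, pages_per_block=10):
        self.num_pages = num_blocks * pages_per_block
        self.next_page = 0   # next free physical page in the log
        self.ftl = {}        # logical page -> physical page
        self.flash = {}      # physical page -> data

    def write(self, logical_page, data):
        if self.next_page >= self.num_pages:
            return 'failure: device full'
        self.flash[self.next_page] = data
        self.ftl[logical_page] = self.next_page   # an overwrite simply remaps
        self.next_page += 1
        return 'success'

    def trim(self, logical_page):
        # drop the mapping; the old data stays in flash until reclaimed
        self.ftl.pop(logical_page, None)

    def read(self, logical_page):
        if logical_page not in self.ftl:
            return 'fail: uninitialized read'
        return self.flash[self.ftl[logical_page]]


ssd = LogFTL()
ssd.write(12, 'z')
ssd.write(19, '9')
ssd.write(9, 'f')
ssd.trim(9)
print(ssd.ftl)        # {12: 0, 19: 1}
print(ssd.read(19))   # 9
```

Replaying the five commands from the run above yields the same FTL contents: logical page 12 at physical page 0, and 19 at 1.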
|
||||
|
||||
The simulator can also show more statistics, including operation counts and
|
||||
the estimated time that the modeled SSD would take to complete the given
|
||||
workload. To see these, use the -S flag:
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T log -l 30 -B 3 -p 10 -s 10 -n 5 -S
|
||||
|
||||
(stuff omitted)
|
||||
|
||||
Physical Operations Per Block
|
||||
Erases 1 0 0 Sum: 1
|
||||
Writes 3 0 0 Sum: 3
|
||||
Reads 1 0 0 Sum: 1
|
||||
|
||||
Logical Operation Sums
|
||||
Write count 3 (0 failed)
|
||||
Read count 1 (0 failed)
|
||||
Trim count 1 (0 failed)
|
||||
|
||||
Times
|
||||
Erase time 1000.00
|
||||
Write time 120.00
|
||||
Read time 10.00
|
||||
Total time 1130.00
|
||||
```
|
||||
|
||||
Here you can see the physical erases, writes, and reads per block as well as a
|
||||
sum of each, then the number of logical writes, reads, and trims issued to the
|
||||
device, and finally the estimated times. You can change the costs of low-level
|
||||
operations such as read, program, and erase, with the -R, -W, and -E flags,
|
||||
respectively.
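
The time estimate is simple arithmetic over the physical operation counts. The sketch below reproduces the total from the run above; the per-operation costs (1000 per erase, 40 per program, 10 per read) are inferred by dividing the reported times by the reported counts, and the function name `total_time` is made up for this example.

```python
def total_time(erases, programs, reads,
               erase_time=1000, program_time=40, read_time=10):
    """Estimated device time: each physical operation type has a fixed cost."""
    return erases * erase_time + programs * program_time + reads * read_time


# 1 erase, 3 programs, 1 read, as in the statistics shown above
print(total_time(erases=1, programs=3, reads=1))   # 1130
```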
|
||||
|
||||
Finally, with the SSD in log-structured mode, there is a garbage collector (GC)
that can be configured to run periodically. This behavior is controlled by the
`-G` and `-g` flags, which set the high and low watermarks for determining whether
the garbage collector should run. Setting the high watermark to a value N
(i.e., `-G N`) means that when the GC notices that N blocks are in use, it
should run. Setting the low watermark to M (i.e., `-g M`) means that the GC
should run until only M blocks are in use.
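
The watermark logic can be sketched as a small control loop. This is a simplification rather than the simulator's code: `maybe_collect` and the `clean_one_block` callback are hypothetical names, and the sketch assumes each cleaned block lowers the in-use count by exactly one (in the real simulator, copying live pages may itself start filling a new block).

```python
def maybe_collect(blocks_in_use, high_water_mark, low_water_mark, clean_one_block):
    """Run GC once the high watermark is reached; stop at the low watermark.

    `clean_one_block` reclaims one block and returns True, or returns False
    when nothing more can be reclaimed. Returns the resulting blocks in use.
    """
    if blocks_in_use < high_water_mark:
        return blocks_in_use          # not enough pressure yet: do nothing
    while blocks_in_use > low_water_mark:
        if not clean_one_block():
            break                     # no reclaimable block left
        blocks_in_use -= 1
    return blocks_in_use


# with -G 3 -g 2: GC starts at 3 blocks in use and stops once 2 remain
print(maybe_collect(3, 3, 2, lambda: True))
```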
|
||||
|
||||
The -J flag is also useful here: it shows which low-level commands the GC
|
||||
issues (reads and writes of live data, followed by erases of reclaimed
|
||||
blocks). The following issues 60 operations, and sets the high and low
|
||||
watermarks to 3 and 2, respectively.
|
||||
|
||||
```sh
|
||||
prompt> ./ssd.py -T log -l 30 -B 3 -p 10 -s 10 -n 60 -G 3 -g 2 -C -F -J
|
||||
```
|
||||
|
||||
Using `-C`, `-F`, and `-J` lets you really see what is happening, step
|
||||
by step, inside the log-structured simulation.
|
||||
|
||||
There are a few other flags worth knowing. The entire time, we've been using
|
||||
the following three flags to control the size of the simulated SSD:
|
||||
|
||||
```sh
|
||||
-l NUM_LOGICAL_PAGES, --logical_pages=NUM_LOGICAL_PAGES number of logical pages in interface
|
||||
-B NUM_BLOCKS, --num_blocks=NUM_BLOCKS number of physical blocks in SSD
|
||||
-p PAGES_PER_BLOCK, --pages_per_block=PAGES_PER_BLOCK pages per physical block
|
||||
```
|
||||
|
||||
You can change these values to simulate larger or smaller SSDs than the simple
|
||||
one we've been simulating so far.
|
||||
|
||||
One other set of controls lets you control randomly generated
|
||||
workloads a bit more precisely. The `-P` flag lets you control how
|
||||
many reads/writes/trims show up (probabilistically). For example,
|
||||
using `-P 30/35/35` means that roughly 30% of operations will be
|
||||
reads, 35% writes, and 35% trims.
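
One simple way to pick operations according to such percentages is to roll a number between 0 and 100 and see which band it falls into. The sketch below shows the idea; it is an assumption about how this could be done, not a copy of the simulator's generator, and `pick_operation` is a hypothetical name.

```python
import random


def pick_operation(percentages, rng=random):
    """Choose 'read', 'write', or 'trim' given a 'reads/writes/trims' string."""
    reads, writes, trims = [int(x) for x in percentages.split('/')]
    if reads + writes + trims != 100:
        raise ValueError('percentages must sum to 100')
    roll = rng.random() * 100
    if roll < reads:
        return 'read'
    elif roll < reads + writes:
        return 'write'
    return 'trim'


rng = random.Random(10)
counts = {'read': 0, 'write': 0, 'trim': 0}
for _ in range(10000):
    counts[pick_operation('30/35/35', rng)] += 1
print(counts)
```

Over many draws the counts should land close to the requested 30/35/35 split, though any short workload can deviate noticeably.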
|
||||
|
||||
The `-r` flag allows reads to be issued to non-live addresses (the default only
|
||||
issues reads to live data). Thus, `-r 10` means roughly 10% of reads will
|
||||
fail.
|
||||
|
||||
Finally, the `-K` and `-k` flags let you add some "skew" to a workload. A skew is
|
||||
first specified by `-K`, e.g., `-K 80/20` makes 80% of writes target 20% of
|
||||
the logical space (a hot/cold kind of workload). Skew is common in real
|
||||
workloads, and has different effects on garbage collection, etc., so it is
|
||||
good to be able to model. The related `-k` flag lets you specify when the skew
|
||||
starts; specifically, `-k 50` means that after 50 writes, start doing the skew
|
||||
(before then, the writes will be chosen at random from all possible pages in
|
||||
the logical space).
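
A hot/cold skew like this can be sketched as a two-step choice: first decide whether a write is "hot", then pick an address from the matching region. The code below is only an illustration. The function `pick_write_address` is hypothetical, and placing the hot region at the start of the logical space is an assumption; the simulator may choose differently.

```python
import random


def pick_write_address(num_logical_pages, writes_so_far,
                       skew='80/20', skew_start=0, rng=random):
    """Pick a logical page for a write, applying hot/cold skew after `skew_start` writes."""
    hot_pct, space_pct = [int(x) for x in skew.split('/')]
    hot_pages = max(1, num_logical_pages * space_pct // 100)

    if writes_so_far < skew_start:
        # before the skew kicks in: uniform over the whole logical space
        return int(rng.random() * num_logical_pages)

    is_hot = rng.random() * 100 < hot_pct
    if is_hot or hot_pages >= num_logical_pages:
        # hot write: lands in the first `hot_pages` logical pages (assumed layout)
        return int(rng.random() * hot_pages)
    # cold write: lands somewhere in the remaining pages
    return hot_pages + int(rng.random() * (num_logical_pages - hot_pages))


rng = random.Random(0)
addresses = [pick_write_address(50, n, '80/20', skew_start=0, rng=rng) for n in range(1000)]
hot = sum(1 for a in addresses if a < 10)
print('fraction of writes to the hot 20%%: %.2f' % (hot / 1000.0))
```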
|
||||
|
||||
Wow, have you gotten this far? You are some impressive person! We suspect you
|
||||
will go far in life. Or, we suspect that you typed "cat README" and not "more
|
||||
README" or "less README", in which case we suspect you are just learning about
|
||||
"more" or "less", more or less.
|
||||
|
||||
|
|
@ -0,0 +1,622 @@
|
|||
#! /usr/bin/env python

from __future__ import print_function
from collections import *
from optparse import OptionParser
import random
import string

# to make Python2 and Python3 act the same -- how dumb
def random_seed(seed):
    try:
        random.seed(seed, version=1)
    except:
        random.seed(seed)
    return

def random_randint(low, hi):
    return int(low + random.random() * (hi - low + 1))

def random_choice(L):
    return L[random_randint(0, len(L)-1)]
|
||||
|
||||
|
||||
|
||||
|
||||
class ssd:
    def __init__(self, ssd_type, num_logical_pages, num_blocks, pages_per_block,
                 block_erase_time, page_program_time, page_read_time,
                 high_water_mark, low_water_mark, trace_gc, show_state):
        # type
        self.TYPE_DIRECT = 1
        self.TYPE_LOGGING = 2
        self.TYPE_IDEAL = 3

        if ssd_type == 'direct':
            self.ssd_type = self.TYPE_DIRECT
        elif ssd_type == 'log':
            self.ssd_type = self.TYPE_LOGGING
        elif ssd_type == 'ideal':
            self.ssd_type = self.TYPE_IDEAL
        else:
            print('bad SSD type (%s)' % ssd_type)
            exit(1)

        # size
        self.num_logical_pages = num_logical_pages
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block

        # parameters
        self.block_erase_time = block_erase_time
        self.page_program_time = page_program_time
        self.page_read_time = page_read_time

        # init each page of each block to INVALID
        self.STATE_INVALID = 1
        self.STATE_ERASED = 2
        self.STATE_VALID = 3

        self.num_pages = self.num_blocks * self.pages_per_block
        self.state = {}
        for i in range(self.num_pages):
            self.state[i] = self.STATE_INVALID

        # data itself
        self.data = {}
        for i in range(self.num_pages):
            self.data[i] = ' '

        # LOGGING stuff
        # reverse map: for each physical page, what LOGICAL page refers to it?
        # which page to write to right now?
        self.current_page = -1
        self.current_block = 0

        # gc counts
        self.gc_count = 0
        self.gc_current_block = 0
        self.gc_high_water_mark = high_water_mark
        self.gc_low_water_mark = low_water_mark

        self.gc_trace = trace_gc
        self.show_state = show_state

        # can use this as a log block
        self.gc_used_blocks = {}
        for i in range(self.num_blocks):
            self.gc_used_blocks[i] = 0

        # counts so as to help the GC
        self.live_count = {}
        for i in range(self.num_blocks):
            self.live_count[i] = 0

        # FTL
        self.forward_map = {}
        for i in range(self.num_logical_pages):
            self.forward_map[i] = -1

        self.reverse_map = {}
        for i in range(self.num_pages):
            self.reverse_map[i] = -1

        # stats
        self.physical_erase_count = {}
        self.physical_read_count = {}
        self.physical_write_count = {}

        for i in range(self.num_blocks):
            self.physical_erase_count[i] = 0
            self.physical_read_count[i] = 0
            self.physical_write_count[i] = 0

        self.physical_erase_sum = 0
        self.physical_write_sum = 0
        self.physical_read_sum = 0

        self.logical_trim_sum = 0
        self.logical_write_sum = 0
        self.logical_read_sum = 0

        self.logical_trim_fail_sum = 0
        self.logical_write_fail_sum = 0
        self.logical_read_fail_sum = 0
        return
|
||||
|
||||
    def blocks_in_use(self):
        used = 0
        for i in range(self.num_blocks):
            used += self.gc_used_blocks[i]
        return used
|
||||
|
||||
    def physical_erase(self, block_address):
        page_begin = block_address * self.pages_per_block
        page_end = page_begin + self.pages_per_block - 1

        for page in range(page_begin, page_end + 1):
            self.data[page] = ' '
            self.state[page] = self.STATE_ERASED

        # now, definitely NOT in use
        self.gc_used_blocks[block_address] = 0

        # STATS
        self.physical_erase_count[block_address] += 1
        self.physical_erase_sum += 1
        return
|
||||
|
||||
    def physical_program(self, page_address, data):
        self.data[page_address] = data
        self.state[page_address] = self.STATE_VALID
        # STATS
        self.physical_write_count[int(page_address / self.pages_per_block)] += 1
        self.physical_write_sum += 1
        return

    def physical_read(self, page_address):
        # STATS
        self.physical_read_count[int(page_address / self.pages_per_block)] += 1
        self.physical_read_sum += 1
        return self.data[page_address]

    def read_direct(self, address):
        return self.physical_read(address)
|
||||
|
||||
    def write_direct(self, page_address, data):
        block_address = int(page_address / self.pages_per_block)
        page_begin = block_address * self.pages_per_block
        page_end = page_begin + self.pages_per_block - 1

        old_list = []
        for old_page in range(page_begin, page_end + 1):
            if self.state[old_page] == self.STATE_VALID:
                old_data = self.physical_read(old_page)
                old_list.append((old_page, old_data))

        self.physical_erase(block_address)
        for (old_page, old_data) in old_list:
            if old_page == page_address:
                continue
            self.physical_program(old_page, old_data)

        self.physical_program(page_address, data)
        self.forward_map[page_address] = page_address
        self.reverse_map[page_address] = page_address
        return 'success'

    def write_ideal(self, page_address, data):
        self.physical_program(page_address, data)
        self.forward_map[page_address] = page_address
        self.reverse_map[page_address] = page_address
        return 'success'
|
||||
|
||||
    def is_block_free(self, block):
        first_page = block * self.pages_per_block
        if self.state[first_page] == self.STATE_INVALID or self.state[first_page] == self.STATE_ERASED:
            if self.state[first_page] == self.STATE_INVALID:
                self.physical_erase(block)
            self.current_block = block
            self.current_page = first_page
            self.gc_used_blocks[block] = 1
            return True
        return False

    def get_cursor(self):
        if self.current_page == -1:
            for block in range(self.current_block, self.num_blocks):
                if self.is_block_free(block):
                    return 0
            for block in range(0, self.current_block):
                if self.is_block_free(block):
                    return 0
            return -1
        return 0

    def update_cursor(self):
        self.current_page += 1
        if self.current_page % self.pages_per_block == 0:
            self.current_page = -1
        return
|
||||
|
||||
    def write_logging(self, page_address, data, is_gc_write=False):
        if self.get_cursor() == -1:
            self.logical_write_fail_sum += 1
            return 'failure: device full'
        # NORMAL MODE writing
        assert(self.state[self.current_page] == self.STATE_ERASED)
        self.physical_program(self.current_page, data)
        self.forward_map[page_address] = self.current_page
        self.reverse_map[self.current_page] = page_address
        self.update_cursor()
        return 'success'
|
||||
|
||||
    def garbage_collect(self):
        blocks_cleaned = 0
        # for block in range(self.gc_current_block, self.num_blocks) + range(0, self.gc_current_block):
        # tricky flattening generator expression (https://stackoverflow.com/questions/18317913/how-can-i-combine-range-functions)
        for block in (x for y in (range(self.gc_current_block, self.num_blocks), range(0, self.gc_current_block)) for x in y):
            # don't GC the block currently being written to
            if block == self.current_block:
                continue

            # page to start looking for live blocks
            page_start = block * self.pages_per_block

            # is this page (and hence block) already erased? then don't bother
            if self.state[page_start] == self.STATE_ERASED:
                continue

            # collect list of live physical pages in this block
            live_pages = []
            for page in range(page_start, page_start + self.pages_per_block):
                logical_page = self.reverse_map[page]
                if logical_page != -1 and self.forward_map[logical_page] == page:
                    live_pages.append(page)

            # if ONLY live blocks, don't clean it! (why bother with move?)
            if len(live_pages) == self.pages_per_block:
                continue

            # live pages should be copied to current writing location
            for page in live_pages:
                # live: so copy it someplace new
                if self.gc_trace:
                    print('gc %d:: read(physical_page=%d)' % (self.gc_count, page))
                    print('gc %d:: write()' % self.gc_count)
                data = self.physical_read(page)
                self.write(self.reverse_map[page], data)

            # finally, erase the block and see if we're done
            blocks_cleaned += 1
            self.physical_erase(block)

            if self.gc_trace:
                print('gc %d:: erase(block=%d)' % (self.gc_count, block))
                if self.show_state:
                    print('')
                    self.dump()
                    print('')

            if self.blocks_in_use() <= self.gc_low_water_mark:
                # done! record where we stopped and return
                self.gc_current_block = block
                self.gc_count += 1
                return

        # END: block iteration
        return
|
||||
|
||||
    def upkeep(self):
        # GARBAGE COLLECTION
        if self.blocks_in_use() >= self.gc_high_water_mark:
            self.garbage_collect()
        # WEAR LEVELING: for future
        return

    def trim(self, address):
        self.logical_trim_sum += 1
        if address < 0 or address >= self.num_logical_pages:
            self.logical_trim_fail_sum += 1
            return 'fail: illegal trim address'
        if self.forward_map[address] == -1:
            self.logical_trim_fail_sum += 1
            return 'fail: uninitialized trim'
        self.forward_map[address] = -1
        return 'success'
|
||||
|
||||
    def read(self, address):
        self.logical_read_sum += 1
        if address < 0 or address >= self.num_logical_pages:
            self.logical_read_fail_sum += 1
            return 'fail: illegal read address'
        if self.forward_map[address] == -1:
            self.logical_read_fail_sum += 1
            return 'fail: uninitialized read'
        # USED for DIRECT and LOGGING and IDEAL
        return self.read_direct(self.forward_map[address])

    def write(self, address, data):
        self.logical_write_sum += 1
        if address < 0 or address >= self.num_logical_pages:
            self.logical_write_fail_sum += 1
            return 'fail: illegal write address'
        if self.ssd_type == self.TYPE_DIRECT:
            return self.write_direct(address, data)
        elif self.ssd_type == self.TYPE_IDEAL:
            return self.write_ideal(address, data)
        else:
            return self.write_logging(address, data)
|
||||
|
||||
    def printable_state(self, s):
        if s == self.STATE_INVALID:
            return 'i'
        elif s == self.STATE_ERASED:
            return 'E'
        elif s == self.STATE_VALID:
            return 'v'
        else:
            print('bad state %d' % s)
            exit(1)
|
||||
|
||||
    def stats(self):
        print('Physical Operations Per Block')
        print('Erases ', end='')
        for i in range(self.num_blocks):
            print('%3d ' % self.physical_erase_count[i], end='')
        print(' Sum: %d' % self.physical_erase_sum)

        print('Writes ', end='')
        for i in range(self.num_blocks):
            print('%3d ' % self.physical_write_count[i], end='')
        print(' Sum: %d' % self.physical_write_sum)

        print('Reads ', end='')
        for i in range(self.num_blocks):
            print('%3d ' % self.physical_read_count[i], end='')
        print(' Sum: %d' % self.physical_read_sum)
        print('')
        print('Logical Operation Sums')
        print(' Write count %d (%d failed)' % (self.logical_write_sum, self.logical_write_fail_sum))
        print(' Read count %d (%d failed)' % (self.logical_read_sum, self.logical_read_fail_sum))
        print(' Trim count %d (%d failed)' % (self.logical_trim_sum, self.logical_trim_fail_sum))
        print('')
        print('Times')
        print(' Erase time %.2f' % (self.physical_erase_sum * self.block_erase_time))
        print(' Write time %.2f' % (self.physical_write_sum * self.page_program_time))
        print(' Read time %.2f' % (self.physical_read_sum * self.page_read_time))
        total_time = self.physical_erase_sum * self.block_erase_time + self.physical_write_sum * self.page_program_time + self.physical_read_sum * self.page_read_time
        print(' Total time %.2f' % total_time)
        return
|
||||
|
||||
    def dump(self):
        # FTL: print every mapped logical page as logical:physical
        print('FTL   ', end='')
        count = 0
        # at least one column, so tiny devices (fewer than 7 pages) do not divide by zero below
        ftl_columns = max(1, int((self.pages_per_block * self.num_blocks) / 7))
        for i in range(self.num_logical_pages):
            if self.forward_map[i] == -1:
                continue
            count += 1
            print('%3d:%3d ' % (i, self.forward_map[i]), end='')
            if count > 0 and count % ftl_columns == 0:
                print('\n      ', end='')
        if count == 0:
            print('(empty)', end='')
        print('')

        # FLASH?
        print('Block ', end='')
        for i in range(self.num_blocks):
            out_str = '%d' % i
            print(out_str + ' ' * (self.pages_per_block - len(out_str) + 1), end='')
        print('')

        # page numbers are printed vertically, one row per decimal digit
        max_len = len(str(self.num_pages))
        for n in range(max_len, 0, -1):
            if n == max_len:
                print('Page  ', end='')
            else:
                print('      ', end='')
            for i in range(self.num_pages):
                out_str = str(i).zfill(max_len)[max_len - n]
                print(out_str, end='')
                if i > 0 and (i+1) % 10 == 0:
                    print(end=' ')
            print('')

        print('State ', end='')
        for i in range(self.num_pages):
            print('%s' % self.printable_state(self.state[i]), end='')
            if i > 0 and (i+1) % 10 == 0:
                print(end=' ')
        print('')

        # DATA
        print('Data  ', end='')
        for i in range(self.num_pages):
            if self.state[i] == self.STATE_VALID:
                print('%s' % self.data[i], end='')
            else:
                print(' ', end='')
            if i > 0 and (i+1) % 10 == 0:
                print(end=' ')
        print('')

        # LIVE: a page is live only if the forward map still points at it
        print('Live  ', end='')
        for i in range(self.num_pages):
            if self.state[i] == self.STATE_VALID and self.forward_map[self.reverse_map[i]] == i:
                print('+', end='')
            else:
                print(' ', end='')
            if i > 0 and (i+1) % 10 == 0:
                print(end=' ')
        print('')
        return
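The multi-row `Page` header printed by `dump()` can be easier to follow in isolation. The following is a minimal sketch of the same digit-by-digit idea; the helper name `page_header_rows` is hypothetical (it is not part of `ssd.py`), and it returns the rows as strings rather than printing them.

```python
def page_header_rows(num_pages):
    """Return the vertical page-number header as a list of strings.

    Hypothetical helper for illustration only; it mirrors the loop in dump().
    """
    width = len(str(num_pages))          # number of digit rows to emit
    rows = []
    for n in range(width, 0, -1):        # most significant digit first
        row = ''
        for i in range(num_pages):
            # zero-pad the page number, then pick the digit for this row
            row += str(i).zfill(width)[width - n]
            if (i + 1) % 10 == 0:        # visual gap after every 10 pages
                row += ' '
        rows.append(row)
    return rows


for row in page_header_rows(12):
    print(row)
```

Reading the rows top to bottom in any column gives that column's page number.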
#
# MAIN PROGRAM
#
parser = OptionParser()
parser.add_option('-s', '--seed', default=0, help='the random seed', action='store', type='int', dest='seed')
parser.add_option('-n', '--num_cmds', default=10, help='number of commands to randomly generate', action='store', type='int', dest='num_cmds')
parser.add_option('-P', '--op_percentages', default='40/50/10', help='if rand, percent of reads/writes/trims', action='store', type='string', dest='op_percentages')
parser.add_option('-K', '--skew', default='', help='if non-empty, skew, e.g., 80/20: 80% of ops to 20% of blocks', action='store', type='string', dest='skew')
parser.add_option('-k', '--skew_start', default=0, help='if --skew, skew after this many writes', action='store', type='int', dest='skew_start')
parser.add_option('-r', '--read_fails', default=0, help='if rand, percent of reads that can fail', action='store', type='int', dest='read_fail')
parser.add_option('-L', '--cmd_list', default='', help='comma-separated list of commands (e.g., r10,w20:a)', action='store', type='string', dest='cmd_list')
parser.add_option('-T', '--ssd_type', default='direct', help='SSD type: ideal, direct, log', action='store', type='string', dest='ssd_type')
parser.add_option('-l', '--logical_pages', default=50, help='number of logical pages in interface', action='store', type='int', dest='num_logical_pages')
parser.add_option('-B', '--num_blocks', default=7, help='number of physical blocks in SSD', action='store', type='int', dest='num_blocks')
parser.add_option('-p', '--pages_per_block', default=10, help='pages per physical block', action='store', type='int', dest='pages_per_block')
parser.add_option('-G', '--high_water_mark', default=10, help='blocks used before gc trigger', action='store', type='int', dest='high_water_mark')
parser.add_option('-g', '--low_water_mark', default=8, help='gc target before stopping gc', action='store', type='int', dest='low_water_mark')
parser.add_option('-R', '--read_time', default=10, help='page read time (usecs)', action='store', type='int', dest='read_time')
parser.add_option('-W', '--program_time', default=40, help='page program time (usecs)', action='store', type='int', dest='program_time')
parser.add_option('-E', '--erase_time', default=1000, help='block erase time (usecs)', action='store', type='int', dest='erase_time')
parser.add_option('-J', '--show_gc', default=False, help='show garbage collector behavior', action='store_true', dest='show_gc')
parser.add_option('-F', '--show_state', default=False, help='show flash state', action='store_true', dest='show_state')
parser.add_option('-C', '--show_cmds', default=False, help='show commands', action='store_true', dest='show_cmds')
parser.add_option('-q', '--quiz_cmds', default=False, help='quiz commands', action='store_true', dest='quiz_cmds')
parser.add_option('-S', '--show_stats', default=False, help='show statistics', action='store_true', dest='show_stats')
parser.add_option('-c', '--compute', default=False, help='compute answers for me', action='store_true', dest='solve')

(options, args) = parser.parse_args()

random_seed(options.seed)

print('ARG seed %s' % options.seed)
print('ARG num_cmds %s' % options.num_cmds)
print('ARG op_percentages %s' % options.op_percentages)
print('ARG skew %s' % options.skew)
print('ARG skew_start %s' % options.skew_start)
print('ARG read_fail %s' % options.read_fail)
print('ARG cmd_list %s' % options.cmd_list)
print('ARG ssd_type %s' % options.ssd_type)
print('ARG num_logical_pages %s' % options.num_logical_pages)
print('ARG num_blocks %s' % options.num_blocks)
print('ARG pages_per_block %s' % options.pages_per_block)
print('ARG high_water_mark %s' % options.high_water_mark)
print('ARG low_water_mark %s' % options.low_water_mark)
print('ARG erase_time %s' % options.erase_time)
print('ARG program_time %s' % options.program_time)
print('ARG read_time %s' % options.read_time)
print('ARG show_gc %s' % options.show_gc)
print('ARG show_state %s' % options.show_state)
print('ARG show_cmds %s' % options.show_cmds)
print('ARG quiz_cmds %s' % options.quiz_cmds)
print('ARG show_stats %s' % options.show_stats)
print('ARG compute %s' % options.solve)
print('')

s = ssd(ssd_type=options.ssd_type,
        num_logical_pages=options.num_logical_pages, num_blocks=options.num_blocks, pages_per_block=options.pages_per_block,
        block_erase_time=float(options.erase_time), page_program_time=float(options.program_time), page_read_time=float(options.read_time),
        high_water_mark=options.high_water_mark, low_water_mark=options.low_water_mark, trace_gc=options.show_gc, show_state=options.show_state)

#
# generate cmds (if not passed in by cmd_list)
#
hot_cold = False
skew_start = options.skew_start
if options.skew != '':
    hot_cold = True
    skew = options.skew.split('/')
    if len(skew) != 2:
        print('bad skew specification; should be 80/20 or something like that')
        exit(1)
    hot_percent = int(skew[0])/100.0
    hot_target = int(skew[1])/100.0

if options.cmd_list == '':
    max_page_addr = int(options.num_logical_pages)

    num_cmds = int(options.num_cmds)
    p = options.op_percentages.split('/')
    assert(len(p) == 3)
    percent_reads, percent_writes, percent_trims = int(p[0]), int(p[1]), int(p[2])
    if percent_writes <= 0:
        print('must have some writes, otherwise nothing in the SSD!')
        exit(1)

    printable = string.digits + string.ascii_lowercase + string.ascii_uppercase

    cmd_list = []
    valid_addresses = []
    while len(cmd_list) < num_cmds:
        which_cmd = int(random.random() * 100.0)
        if which_cmd < percent_reads:
            # READ
            if random_randint(0, 99) < int(options.read_fail):
                address = random_randint(0, max_page_addr - 1)
            else:
                if len(valid_addresses) < 2:
                    continue
                address = random_choice(valid_addresses)
            cmd_list.append('r%d' % address)
        elif which_cmd < percent_reads + percent_writes:
            # WRITE
            if skew_start == 0 and hot_cold and random.random() < hot_percent:
                address = random_randint(0, int(hot_target * (max_page_addr - 1)))
            else:
                address = random_randint(0, max_page_addr - 1)
            if address not in valid_addresses:
                valid_addresses.append(address)
            data = random_choice(list(printable))
            cmd_list.append('w%d:%s' % (address, data))
            if skew_start > 0:
                skew_start -= 1
        else:
            # TRIM
            if len(valid_addresses) < 1:
                continue
            address = random_choice(valid_addresses)
            cmd_list.append('t%d' % address)
            valid_addresses.remove(address)

else:
    cmd_list = options.cmd_list.split(',')

s.dump()
print('')

show_state = options.show_state
show_cmds = options.show_cmds
quiz_cmds = options.quiz_cmds

# quiz mode always shows the flash state after each command
if quiz_cmds:
    show_state = True

op = 0
for cmd in cmd_list:
    if cmd == '':
        break
    if cmd[0] == 'r':
        # r10
        address = int(cmd.split('r')[1])
        data = s.read(address)
        if show_cmds or (quiz_cmds and options.solve):
            print('cmd %3d:: read(%d) -> %s' % (op, address, data))
        elif quiz_cmds:
            print('cmd %3d:: read(%d) -> ??' % (op, address))
        op += 1
    elif cmd[0] == 'w':
        # w80:b
        parts = cmd.split(':')
        address = int(parts[0].split('w')[1])
        data = parts[1]
        rc = s.write(address, data)
        if show_cmds or (quiz_cmds and options.solve):
            print('cmd %3d:: write(%d, %s) -> %s' % (op, address, data, rc))
        elif quiz_cmds:
            print('cmd %3d:: command(??) -> ??' % op)
        op += 1
    elif cmd[0] == 't':
        address = int(cmd.split('t')[1])
        rc = s.trim(address)
        if show_cmds or (quiz_cmds and options.solve):
            print('cmd %3d:: trim(%d) -> %s' % (op, address, rc))
        elif quiz_cmds:
            print('cmd %3d:: command(??) -> ??' % op)
        op += 1

    if show_state:
        print('')
        s.dump()
        print('')

    # Do GC?
    s.upkeep()

if not show_state:
    print('')
    s.dump()
    print('')
if options.show_stats:
    s.stats()
    print('')
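The command loop above decodes strings such as `r10`, `w80:b`, and `t5` inline. As a rough sketch of that format, the same decoding could be factored into one function; the name `parse_cmd` and its tuple return value are assumptions made for illustration and do not appear in `ssd.py`.

```python
def parse_cmd(cmd):
    """Split one command string into (op, address, data).

    Hypothetical helper, not part of ssd.py. Reads and trims carry no
    data, so data is None for them.
    """
    op = cmd[0]
    if op == 'w':
        # 'w80:b' -> address 80, data 'b'
        addr_part, data = cmd[1:].split(':', 1)
        return op, int(addr_part), data
    if op in ('r', 't'):
        # 'r10' -> address 10; 't5' -> address 5
        return op, int(cmd[1:]), None
    raise ValueError('unknown command: %r' % cmd)


for c in 'w80:b,r80,t80'.split(','):
    print(parse_cmd(c))
```

Unlike the loop above, which silently skips unrecognized commands, this sketch raises `ValueError` on them; that difference is a design choice of the example, not the simulator's behavior.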