initial SSD sim commit

Remzi Arpaci-Dusseau 2020-05-30 15:39:32 -05:00
parent 533129b0a9
commit b42f17f580
2 changed files with 1073 additions and 0 deletions

file-ssd/README.md Normal file

@ -0,0 +1,451 @@
# Overview
Welcome to `ssd.py`, yet another wonderful simulator provided to you,
for free, by the authors of OSTEP, which is also free. Pretty soon,
you're going to think that everything important in life is free! And,
it turns out, it kind of is: the air you breathe, the love you give
and receive, and a book about operating systems. What else do you
need?
To run the simulator, you just do the usual:
```sh
prompt> ./ssd.py
```
The simulator models a few different types of SSDs. The first is what we'll
call an "ideal" SSD, which actually isn't much of an SSD at all; it's more like a
perfect memory. To simulate this SSD, type:
```sh
prompt> ./ssd.py -T ideal
```
To see how this one works, let's create a little workload. A workload, for an
SSD, is just a series of low-level I/O operations issued to the device. There
are three operations supported by `ssd.py`: read (which takes an address to
read, and returns the data), write (which takes an address and a piece of data
to write, in this case, a single letter), and trim (which takes an
address). The trim operation is used to indicate a previously written block is
no longer live (e.g., the file it was in was deleted); this is particularly
useful for a log-based SSD, which can reclaim the block's space during garbage
collection and free up space in the FTL. Let's run a simple workload
consisting of just one write:
```sh
prompt> ./ssd.py -T ideal -L w10:a -l 30 -B 3 -p 10
```
The `-L` flag allows us to specify a comma-separated list of commands. Here, to
write to logical page 10, we include the command "w10:a" which means "write"
to logical page "10" the data of "a". We also include a few other specifics
about the size of the SSD with the flags `-l 30 -B 3 -p 10`, but let's
ignore those for now.
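If the command format feels a bit terse, here is a small sketch of how such a list could be pulled apart. The helper name `parse_commands` is made up for illustration (the simulator does its parsing inline), and the tuple layout is an assumption of this example.

```python
def parse_commands(cmd_list):
    """Split a -L style string into (operation, address, data) tuples.

    Hypothetical helper for illustration; not part of ssd.py.
    """
    ops = []
    for cmd in cmd_list.split(','):
        if cmd == '':
            continue
        kind, rest = cmd[0], cmd[1:]
        if kind == 'w':
            # writes look like w<address>:<data>
            address, data = rest.split(':', 1)
            ops.append(('write', int(address), data))
        elif kind == 'r':
            ops.append(('read', int(rest), None))
        elif kind == 't':
            ops.append(('trim', int(rest), None))
        else:
            raise ValueError('unknown command: %r' % cmd)
    return ops

print(parse_commands('w10:a,r10,t10'))
# [('write', 10, 'a'), ('read', 10, None), ('trim', 10, None)]
```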
What you should see on the screen, after running the above:
```sh
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

FTL    10: 10
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live             +
```
The first chunk of information shows the initial state of the SSD, and the
second chunk shows its final state. Let's walk through each piece to make sure
you understand what they mean.
The first line of each chunk of output shows the contents of the FTL. This
simulator only models a simple page-mapped FTL; thus, each entry within it
shows the logical-to-physical page mapping for any live data items.
In the initial state, the FTL is empty:
```sh
FTL   (empty)
```
However, in the final state, you can see that the FTL maps logical page 10 to
physical page 10:
```sh
FTL    10: 10
```
The reason for this simple mapping is that we are running the "ideal" SSD,
which really just acts like a memory; if you write to logical page X, this SSD
will just (magically) write the data to physical page X (indeed, you don't
even really need the FTL for this; we'll just use the ideal SSD to show how
much extra work a real SSD does, in terms of erases and data copying, as
compared to an ideal memory).
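To make the "ideal" behavior concrete, here is a minimal sketch of a page-mapped FTL in which every logical page maps to the physical page of the same number. The names (`ftl`, `flash`, `ideal_write`, and friends) are invented for this example, and the sketch skips the page states and statistics that the simulator tracks.

```python
ftl = {}               # logical page -> physical page (only live entries)
flash = [' '] * 30     # 3 blocks x 10 pages, as in the example above

def ideal_write(logical_page, data):
    physical_page = logical_page     # identity mapping: the "magic" part
    flash[physical_page] = data
    ftl[logical_page] = physical_page

def ideal_read(logical_page):
    if logical_page not in ftl:
        return 'fail: uninitialized read'
    return flash[ftl[logical_page]]

def ideal_trim(logical_page):
    # the mapping goes away, but the data stays where it was written
    ftl.pop(logical_page, None)

ideal_write(10, 'a')
print(ftl)               # {10: 10}
print(ideal_read(10))    # a
```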
The next lines of output just label the blocks and physical pages of the
underlying Flash the simulator is modeling:
```sh
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
```
In this simulation, you can see that the SSD has 3 physical Flash blocks, and
that each block has 10 physical pages. Each block is numbered (0, 1, and 2),
as is each page (from 00 to 29); to keep the display compact (width-wise), the
page numbering is shown across two lines. Thus, physical page "10" is labeled
with a "1" on the first line, and a "0" on the second.
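The numbering scheme is easy to reproduce. The sketch below (with made-up helper names `block_of` and `page_label_rows`) computes which block a physical page lives in and builds the stacked page labels, one row per digit, assuming pages are grouped ten per column as in the display above.

```python
PAGES_PER_BLOCK = 10

def block_of(page):
    # physical pages are numbered consecutively across blocks
    return page // PAGES_PER_BLOCK

def page_label_rows(num_pages, group=10):
    # one output row per digit of the zero-padded page number
    width = len(str(num_pages))
    rows = []
    for digit in range(width):
        labels = [str(p).zfill(width)[digit] for p in range(num_pages)]
        groups = [''.join(labels[i:i + group]) for i in range(0, num_pages, group)]
        rows.append(' '.join(groups))
    return rows

print(block_of(10))      # 1
for row in page_label_rows(30):
    print(row)
# 0000000000 1111111111 2222222222
# 0123456789 0123456789 0123456789
```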
The next line shows the state of each page, i.e., whether it is INVALID (i),
ERASED (E), or VALID (v), as per the chapter:
```sh
State iiiiiiiiii viiiiiiiii iiiiiiiiii
```
The states for the "ideal" SSD are a bit weird, in that you can have "v" and
"i" mixed in a block, and that the block is never "E" for erased. Below, with
the more realistic "direct" and "log" SSDs, you'll see "E" too.
The final two lines show the "contents" of any written-to pages (on the "Data"
row) and whether that data is currently live (that is, referred to in the
FTL) in the "Live" row:
```sh
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
Data             a
Live             +
```
Here, we can see that on Block 1 (at physical page 10), there is the data "a", and
it is indeed live (shown by the "+" symbol).
Let's expand our workload a little bit, before getting to the more realistic
types of SSDs. After writing the data, let's read it, and then let's use trim
to delete it:
```sh
prompt> ./ssd.py -T ideal -L w10:a,r10,t10 -l 30 -B 3 -p 10
```
If you run this, you'll see only the initial and final states, and the FTL is
empty in both: the write added a mapping, and the trim removed it again. Not
too exciting! To see more of what is going on, you'll have to use some more
flags. Yes, this SSD simulator uses a lot of flags; sorry, all lovers of
parsimony! But alas, there is some complexity here we must explore.
One useful flag is `-C`, which just shows every command that was issued, and
whether it succeeded or not.
```sh
prompt> ./ssd.py -T ideal -L w10:a,r10,t10 -l 30 -B 3 -p 10 -C
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd   0:: write(10, a) -> success
cmd   1:: read(10) -> a
cmd   2:: trim(10) -> success

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live
prompt>
```
Here, you can see the write, read, and trim, and you can also see what each
command returned: success, the data read, and success, respectively. This will
be more interesting later, when the simulator generates the operations
randomly.
Similarly, the `-F` flag shows the state of the Flash between each operation,
instead of just at the end. Note the subtle changes at each step:
```sh
prompt> ./ssd.py -T ideal -L w10:a,r10,t10 -l 30 -B 3 -p 10 -F
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

FTL    10: 10
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live             +

FTL    10: 10
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live             +

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii viiiiiiiii iiiiiiiiii
Data             a
Live
prompt>
```
Of course, you can use `-C` and `-F` in concert to show everything (an exercise
left to the reader).
The simulator also lets you generate random workloads, instead of specifying
operations yourself. Use the "-n" flag for this, with an associated number (we
also specify a random seed with "-s" to get a particular workload):
```sh
prompt> ./ssd.py -T ideal -l 30 -B 3 -p 10 -n 5 -s 10
```
If you run this with `-C`, `-F`, or both, you'll see either the exact commands,
the intermediate states of the Flash, or both. However, you can also use the
"-q" flag to quiz yourself on what you think the commands are. Thus, run the
following:
```sh
prompt> ./ssd.py -T ideal -l 30 -B 3 -p 10 -n 5 -s 10 -q
(output omitted for brevity)
```
Now, by examining the intermediate states, see if you can discern what the
commands must have been (writes and trims are left completely unspecified,
whereas reads just ask you to figure out which data was returned).
You can then either manually use `-C -F` to show everything, or just add the
`-c` flag to "solve" the problem for you, to check your answers.
Let's now do the same thing (a random workload of five operations) but use
different, more realistic SSDs. The first is the "direct" SSD mentioned in the
chapter. This too isn't particularly realistic, but it at least uses erases and
programs to update the Flash. Specifically, when a logical page is written, it
is mapped directly to the physical page of the same number. This mapping
necessitates first a read of all the live data in that block, then an erase of
the block, and then a series of programs to restore all previously live data
as well as write the new data to Flash. Let's run it, showing the commands
(`-C`) but not the intermediate states (no `-F`):
```sh
prompt> ./ssd.py -T direct -l 30 -B 3 -p 10 -n 5 -s 10 -C
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd   0:: write(12, z) -> success
cmd   1:: write(19, 9) -> success
cmd   2:: write(9, f) -> success
cmd   3:: trim(9) -> success
cmd   4:: read(19) -> 9

FTL    12: 12  19: 19
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State EEEEEEEEEv EEvEEEEEEv iiiiiiiiii
Data           f   z      9
Live               +      +
prompt>
```
As you can see from the final state, the FTL contains two live mappings:
logical page 12 refers to physical page 12, and 19 to 19 (remember, this is
the direct mapping). You can also see three data pages with information within
them: physical page 9 contains "f", 12 contains "z", and 19 contains "9" (data
can be letters or numbers or really any single character). However, also note
that logical page 9 has been trimmed; this removes its entry from the FTL, but
the data lies there dormant (for now). If you then tried to read logical page
9, the read would no longer succeed:
```sh
prompt> ./ssd.py -T direct -l 30 -B 3 -p 10 -C -L w9:f,t9,r9
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd   0:: write(9, f) -> success
cmd   1:: trim(9) -> success
cmd   2:: read(9) -> fail: uninitialized read

FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State EEEEEEEEEv iiiiiiiiii iiiiiiiiii
Data           f
Live
prompt>
```
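The read, erase, and program cycle of the "direct" SSD can be sketched in a few lines. This is a simplified stand-in, not the simulator's code: `direct_write` is a made-up name, the state letters simply reuse the ones from the display, and the function returns the number of physical reads, erases, and programs so you can see the extra work each write causes.

```python
PAGES_PER_BLOCK = 10
INVALID, ERASED, VALID = 'i', 'E', 'v'

def direct_write(state, data, page, value):
    first = (page // PAGES_PER_BLOCK) * PAGES_PER_BLOCK
    block_pages = range(first, first + PAGES_PER_BLOCK)
    # 1. read every valid page in the target block
    saved = [(p, data[p]) for p in block_pages if state[p] == VALID]
    # 2. erase the whole block
    for p in block_pages:
        state[p], data[p] = ERASED, ' '
    # 3. program the old contents back, skipping the page being overwritten
    programs = 0
    for p, old in saved:
        if p != page:
            state[p], data[p] = VALID, old
            programs += 1
    # 4. program the new data
    state[page], data[page] = VALID, value
    programs += 1
    return len(saved), 1, programs    # (reads, erases, programs)

state = [INVALID] * 30
data = [' '] * 30
print(direct_write(state, data, 12, 'z'))   # (0, 1, 1)
print(direct_write(state, data, 19, '9'))   # (1, 1, 2)
print(''.join(state[10:20]))                # EEvEEEEEEv
```

Note how the second write to the same block already costs one extra read and one extra program; the more valid pages a block holds, the worse this gets.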
One last SSD we should pay attention to is the most realistic one, which uses
log-structuring (as do most real SSDs). To use it, just change the SSD type to
"log" (we'll again turn on `-C` so we can see which operations took place,
instead of quizzing ourselves):
```sh
prompt> ./ssd.py -T log -l 30 -B 3 -p 10 -s 10 -n 5 -C
FTL   (empty)
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State iiiiiiiiii iiiiiiiiii iiiiiiiiii
Data
Live

cmd   0:: write(12, z) -> success
cmd   1:: write(19, 9) -> success
cmd   2:: write(9, f) -> success
cmd   3:: trim(9) -> success
cmd   4:: read(19) -> 9

FTL    12:  0  19:  1
Block 0          1          2
Page  0000000000 1111111111 2222222222
      0123456789 0123456789 0123456789
State vvvEEEEEEE iiiiiiiiii iiiiiiiiii
Data  z9f
Live  ++
prompt>
```
Note how the log-structured SSD writes data to the Flash. First, the current
log block (Block 0, in this case) is erased. Then, the pages are programmed in
order. Use the -F flag to see each step for more detail.
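Here is a stripped-down sketch of the logging idea: every write goes to the next free physical page, and the FTL remembers where each logical page ended up. The class name `LogFTL` is invented for this example, and it leaves out erases, garbage collection, and reuse of cleaned blocks, all of which the simulator does model.

```python
PAGES_PER_BLOCK = 10

class LogFTL:
    def __init__(self, num_blocks):
        self.flash = [None] * (num_blocks * PAGES_PER_BLOCK)
        self.ftl = {}          # logical page -> physical page
        self.next_page = 0     # the log's write cursor

    def write(self, logical_page, data):
        if self.next_page >= len(self.flash):
            return 'failure: device full'
        self.flash[self.next_page] = data
        self.ftl[logical_page] = self.next_page   # any older copy becomes garbage
        self.next_page += 1
        return 'success'

    def trim(self, logical_page):
        self.ftl.pop(logical_page, None)

ssd = LogFTL(num_blocks=3)
ssd.write(12, 'z')
ssd.write(19, '9')
ssd.write(9, 'f')
ssd.trim(9)
print(ssd.ftl)          # {12: 0, 19: 1}
print(ssd.flash[:3])    # ['z', '9', 'f']
```

The resulting mapping matches the final FTL shown above: logical page 12 lives in physical page 0, and 19 in physical page 1.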
The simulator can also show more statistics, including operation counts and
the estimated time that the modeled SSD would take to complete the given
workload. To see these, use the -S flag:
```sh
prompt> ./ssd.py -T log -l 30 -B 3 -p 10 -s 10 -n 5 -S
(stuff omitted)
Physical Operations Per Block
Erases 1 0 0 Sum: 1
Writes 3 0 0 Sum: 3
Reads 1 0 0 Sum: 1
Logical Operation Sums
Write count 3 (0 failed)
Read count 1 (0 failed)
Trim count 1 (0 failed)
Times
Erase time 1000.00
Write time 120.00
Read time 10.00
Total time 1130.00
```
Here you can see the physical erases, writes, and reads per block as well as a
sum of each, then the number of logical writes, reads, and trims issued to the
device, and finally the estimated times. You can change the costs of low-level
operations such as read, program, and erase, with the -R, -W, and -E flags,
respectively.
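The time estimate is plain arithmetic: each physical operation count is multiplied by its cost and the results are summed. A small sketch, using a made-up function name `total_time` and the script's default costs (1000 per erase, 40 per program, 10 per read):

```python
def total_time(erases, programs, reads,
               erase_time=1000.0, program_time=40.0, read_time=10.0):
    # defaults mirror the simulator's -E, -W, and -R defaults
    return erases * erase_time + programs * program_time + reads * read_time

# the example above: 1 erase, 3 programs, 1 read
print('%.2f' % total_time(1, 3, 1))    # 1130.00
```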
Finally, with the SSD in log-structured mode, there is a garbage collector (GC)
that can be configured to run periodically. This behavior is controlled by the
`-G` and `-g` flags, which set the high and low watermarks for determining whether
the garbage collector should run. Setting the high watermark to a value N
(i.e., `-G N`) means that when the GC notices that N blocks are in use, it
should run. Setting the low watermark to M (i.e., `-g M`) means that the GC
should run until only M blocks are in use.
The -J flag is also useful here: it shows which low-level commands the GC
issues (reads and writes of live data, followed by erases of reclaimed
blocks). The following issues 60 operations, and sets the high and low
watermarks to 3 and 2, respectively.
```sh
prompt> ./ssd.py -T log -l 30 -B 3 -p 10 -s 10 -n 60 -G 3 -g 2 -C -F -J
```
Using `-C`, `-F`, and `-J` lets you really see what is happening, step
by step, inside the log-structured simulation.
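The watermark logic boils down to "start when usage reaches the high mark, stop when it drops to the low mark". Below is a simplified sketch with invented names (`run_gc`, `clean_one_block`); it assumes each cleaning step frees exactly one block, whereas in the simulator copying live pages can itself consume space in the log.

```python
def run_gc(blocks_in_use, high_water_mark, low_water_mark, clean_one_block):
    """Return (blocks still in use, blocks cleaned) after one GC pass."""
    cleaned = 0
    if blocks_in_use < high_water_mark:
        return blocks_in_use, cleaned      # below the trigger: nothing to do
    while blocks_in_use > low_water_mark:
        if not clean_one_block():          # no block left that can be cleaned
            break
        blocks_in_use -= 1
        cleaned += 1
    return blocks_in_use, cleaned

# -G 3 -g 2: with 3 blocks in use, clean until only 2 remain
print(run_gc(3, 3, 2, lambda: True))    # (2, 1)
# with only 2 blocks in use, the collector stays idle
print(run_gc(2, 3, 2, lambda: True))    # (2, 0)
```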
There are a few other flags worth knowing. The entire time, we've been using
the following three flags to control the size of the simulated SSD:
```sh
-l NUM_LOGICAL_PAGES, --logical_pages=NUM_LOGICAL_PAGES number of logical pages in interface
-B NUM_BLOCKS, --num_blocks=NUM_BLOCKS number of physical blocks in SSD
-p PAGES_PER_BLOCK, --pages_per_block=PAGES_PER_BLOCK pages per physical block
```
You can change these values to simulate larger or smaller SSDs than the simple
one we've been simulating so far.
One other set of controls lets you control randomly generated
workloads a bit more precisely. The `-P` flag lets you control how
many reads/writes/trims show up (probabilistically). For example,
using `-P 30/35/35` means that roughly 30% of operations will be
reads, 35% writes, and 35% trims.
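One way to picture the `-P` split is as a single roll between 0 and 99 that falls into one of three ranges. The sketch below uses a made-up function name `pick_op`; the `rng` parameter is an addition of this example so that the roll can be controlled. Keep in mind that the simulator also skips reads and trims when there is nothing suitable to read or trim, so the mix you observe can drift from the requested percentages.

```python
import random

def pick_op(percent_reads, percent_writes, rng=random):
    roll = int(rng.random() * 100.0)       # 0..99
    if roll < percent_reads:
        return 'read'
    if roll < percent_reads + percent_writes:
        return 'write'
    return 'trim'                          # whatever is left over

counts = {'read': 0, 'write': 0, 'trim': 0}
rng = random.Random(10)
for _ in range(10000):
    counts[pick_op(30, 35, rng)] += 1
print(counts)    # roughly 3000 reads, 3500 writes, 3500 trims
```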
The `-r` flag allows reads to be issued to non-live addresses (the default only
issues reads to live data). Thus, `-r 10` means roughly 10% of reads go to a
randomly chosen address, and will fail if that address holds no live data.
Finally, the `-K` and `-k` flags let you add some "skew" to a workload. A skew is
first specified by `-K`, e.g., `-K 80/20` makes 80% of writes target 20% of
the logical space (a hot/cold kind of workload). Skew is common in real
workloads, and has different effects on garbage collection, etc., so it is
good to be able to model. The related `-k` flag lets you specify when the skew
starts; specifically, `-k 50` means that after 50 writes, start doing the skew
(before then, the writes will be chosen at random from all possible pages in
the logical space).
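To see what a skewed workload looks like, here is a small sketch of hot/cold address selection. The function name `pick_write_address` and the `rng` parameter are assumptions of this example; it ignores the `-k` delay and simply sends a fraction `hot_percent` of writes to the first `hot_target` fraction of the logical space.

```python
import random

def pick_write_address(num_logical_pages, hot_percent, hot_target, rng=random):
    if rng.random() < hot_percent:
        # "hot" write: confined to the low end of the logical space
        highest = int(hot_target * (num_logical_pages - 1))
    else:
        # "cold" write: anywhere in the logical space
        highest = num_logical_pages - 1
    return rng.randint(0, highest)

rng = random.Random(10)
addresses = [pick_write_address(50, 0.8, 0.2, rng) for _ in range(10000)]
hot = sum(1 for a in addresses if a <= 9)
print(hot / 10000.0)    # well above 0.8: cold writes sometimes land in the hot range too
```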
Wow, have you gotten this far? You are some impressive person! We suspect you
will go far in life. Or, we suspect that you typed "cat README" and not "more
README" or "less README", in which case we suspect you are just learning about
"more" or "less", more or less.

file-ssd/ssd.py Executable file

@ -0,0 +1,622 @@
#! /usr/bin/env python

from __future__ import print_function
from collections import *
from optparse import OptionParser
import random
import string

# to make Python2 and Python3 act the same -- how dumb
def random_seed(seed):
    try:
        random.seed(seed, version=1)
    except TypeError:
        # Python 2: random.seed() does not accept a 'version' argument
        random.seed(seed)
    return

def random_randint(low, hi):
    # integer in [low, hi], built directly on random.random()
    return int(low + random.random() * (hi - low + 1))

def random_choice(L):
    return L[random_randint(0, len(L)-1)]
class ssd:
    def __init__(self, ssd_type, num_logical_pages, num_blocks, pages_per_block,
                 block_erase_time, page_program_time, page_read_time,
                 high_water_mark, low_water_mark, trace_gc, show_state):
        # type
        self.TYPE_DIRECT = 1
        self.TYPE_LOGGING = 2
        self.TYPE_IDEAL = 3

        if ssd_type == 'direct':
            self.ssd_type = self.TYPE_DIRECT
        elif ssd_type == 'log':
            self.ssd_type = self.TYPE_LOGGING
        elif ssd_type == 'ideal':
            self.ssd_type = self.TYPE_IDEAL
        else:
            print('bad SSD type (%s)' % ssd_type)
            exit(1)

        # size
        self.num_logical_pages = num_logical_pages
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block

        # parameters
        self.block_erase_time = block_erase_time
        self.page_program_time = page_program_time
        self.page_read_time = page_read_time

        # init each page of each block to INVALID
        self.STATE_INVALID = 1
        self.STATE_ERASED = 2
        self.STATE_VALID = 3

        self.num_pages = self.num_blocks * self.pages_per_block

        self.state = {}
        for i in range(self.num_pages):
            self.state[i] = self.STATE_INVALID

        # data itself
        self.data = {}
        for i in range(self.num_pages):
            self.data[i] = ' '

        # LOGGING stuff
        # which page to write to right now? (-1 means: no log block chosen yet)
        self.current_page = -1
        self.current_block = 0

        # gc counts
        self.gc_count = 0
        self.gc_current_block = 0
        self.gc_high_water_mark = high_water_mark
        self.gc_low_water_mark = low_water_mark

        self.gc_trace = trace_gc
        self.show_state = show_state

        # per block: 1 if it has been handed out as a log block, 0 otherwise
        self.gc_used_blocks = {}
        for i in range(self.num_blocks):
            self.gc_used_blocks[i] = 0

        # counts so as to help the GC
        self.live_count = {}
        for i in range(self.num_blocks):
            self.live_count[i] = 0

        # FTL: logical page -> physical page (-1 means unmapped)
        self.forward_map = {}
        for i in range(self.num_logical_pages):
            self.forward_map[i] = -1

        # reverse map: for each physical page, what LOGICAL page refers to it?
        self.reverse_map = {}
        for i in range(self.num_pages):
            self.reverse_map[i] = -1

        # stats
        self.physical_erase_count = {}
        self.physical_read_count = {}
        self.physical_write_count = {}
        for i in range(self.num_blocks):
            self.physical_erase_count[i] = 0
            self.physical_read_count[i] = 0
            self.physical_write_count[i] = 0

        self.physical_erase_sum = 0
        self.physical_write_sum = 0
        self.physical_read_sum = 0

        self.logical_trim_sum = 0
        self.logical_write_sum = 0
        self.logical_read_sum = 0

        self.logical_trim_fail_sum = 0
        self.logical_write_fail_sum = 0
        self.logical_read_fail_sum = 0
        return

    def blocks_in_use(self):
        used = 0
        for i in range(self.num_blocks):
            used += self.gc_used_blocks[i]
        return used

    def physical_erase(self, block_address):
        page_begin = block_address * self.pages_per_block
        page_end = page_begin + self.pages_per_block - 1
        for page in range(page_begin, page_end + 1):
            self.data[page] = ' '
            self.state[page] = self.STATE_ERASED
        # now, definitely NOT in use
        self.gc_used_blocks[block_address] = 0
        # STATS
        self.physical_erase_count[block_address] += 1
        self.physical_erase_sum += 1
        return

    def physical_program(self, page_address, data):
        self.data[page_address] = data
        self.state[page_address] = self.STATE_VALID
        # STATS
        self.physical_write_count[page_address // self.pages_per_block] += 1
        self.physical_write_sum += 1
        return

    def physical_read(self, page_address):
        # STATS
        self.physical_read_count[page_address // self.pages_per_block] += 1
        self.physical_read_sum += 1
        return self.data[page_address]

    def read_direct(self, address):
        return self.physical_read(address)

    def write_direct(self, page_address, data):
        block_address = page_address // self.pages_per_block
        page_begin = block_address * self.pages_per_block
        page_end = page_begin + self.pages_per_block - 1

        # read every valid page in the block ...
        old_list = []
        for old_page in range(page_begin, page_end + 1):
            if self.state[old_page] == self.STATE_VALID:
                old_data = self.physical_read(old_page)
                old_list.append((old_page, old_data))

        # ... erase the block ...
        self.physical_erase(block_address)

        # ... and program the old data back, except for the page being overwritten
        for (old_page, old_data) in old_list:
            if old_page == page_address:
                continue
            self.physical_program(old_page, old_data)

        self.physical_program(page_address, data)
        self.forward_map[page_address] = page_address
        self.reverse_map[page_address] = page_address
        return 'success'

    def write_ideal(self, page_address, data):
        self.physical_program(page_address, data)
        self.forward_map[page_address] = page_address
        self.reverse_map[page_address] = page_address
        return 'success'

    def is_block_free(self, block):
        first_page = block * self.pages_per_block
        if self.state[first_page] == self.STATE_INVALID or self.state[first_page] == self.STATE_ERASED:
            if self.state[first_page] == self.STATE_INVALID:
                self.physical_erase(block)
            # side effect: this block becomes the current log block
            self.current_block = block
            self.current_page = first_page
            self.gc_used_blocks[block] = 1
            return True
        return False

    def get_cursor(self):
        # returns 0 if there is a page to write to, -1 if the device is full
        if self.current_page == -1:
            for block in range(self.current_block, self.num_blocks):
                if self.is_block_free(block):
                    return 0
            for block in range(0, self.current_block):
                if self.is_block_free(block):
                    return 0
            return -1
        return 0

    def update_cursor(self):
        self.current_page += 1
        if self.current_page % self.pages_per_block == 0:
            # ran off the end of the current log block
            self.current_page = -1
        return

    def write_logging(self, page_address, data, is_gc_write=False):
        if self.get_cursor() == -1:
            self.logical_write_fail_sum += 1
            return 'failure: device full'
        # NORMAL MODE writing
        assert(self.state[self.current_page] == self.STATE_ERASED)
        self.physical_program(self.current_page, data)
        self.forward_map[page_address] = self.current_page
        self.reverse_map[self.current_page] = page_address
        self.update_cursor()
        return 'success'

    def garbage_collect(self):
        blocks_cleaned = 0
        # for block in range(self.gc_current_block, self.num_blocks) + range(0, self.gc_current_block):
        # tricky flattening generator expression (https://stackoverflow.com/questions/18317913/how-can-i-combine-range-functions)
        for block in (x for y in (range(self.gc_current_block, self.num_blocks), range(0, self.gc_current_block)) for x in y):
            # don't GC the block currently being written to
            if block == self.current_block:
                continue

            # page to start looking for live blocks
            page_start = block * self.pages_per_block

            # is this page (and hence block) already erased? then don't bother
            if self.state[page_start] == self.STATE_ERASED:
                continue

            # collect list of live physical pages in this block
            live_pages = []
            for page in range(page_start, page_start + self.pages_per_block):
                logical_page = self.reverse_map[page]
                if logical_page != -1 and self.forward_map[logical_page] == page:
                    live_pages.append(page)

            # if ONLY live blocks, don't clean it! (why bother with move?)
            if len(live_pages) == self.pages_per_block:
                continue

            # live pages should be copied to current writing location
            for page in live_pages:
                # live: so copy it someplace new
                if self.gc_trace:
                    print('gc %d:: read(physical_page=%d)' % (self.gc_count, page))
                    print('gc %d:: write()' % self.gc_count)
                data = self.physical_read(page)
                self.write(self.reverse_map[page], data)

            # finally, erase the block and see if we're done
            blocks_cleaned += 1
            self.physical_erase(block)

            if self.gc_trace:
                print('gc %d:: erase(block=%d)' % (self.gc_count, block))
                if self.show_state:
                    print('')
                    self.dump()
                    print('')

            if self.blocks_in_use() <= self.gc_low_water_mark:
                # done! record where we stopped and return
                self.gc_current_block = block
                self.gc_count += 1
                return

        # END: block iteration
        return

    def upkeep(self):
        # GARBAGE COLLECTION
        if self.blocks_in_use() >= self.gc_high_water_mark:
            self.garbage_collect()
        # WEAR LEVELING: for future
        return

    def trim(self, address):
        self.logical_trim_sum += 1
        if address < 0 or address >= self.num_logical_pages:
            self.logical_trim_fail_sum += 1
            return 'fail: illegal trim address'
        if self.forward_map[address] == -1:
            self.logical_trim_fail_sum += 1
            return 'fail: uninitialized trim'
        self.forward_map[address] = -1
        return 'success'

    def read(self, address):
        self.logical_read_sum += 1
        if address < 0 or address >= self.num_logical_pages:
            self.logical_read_fail_sum += 1
            return 'fail: illegal read address'
        if self.forward_map[address] == -1:
            self.logical_read_fail_sum += 1
            return 'fail: uninitialized read'
        # USED for DIRECT and LOGGING and IDEAL
        return self.read_direct(self.forward_map[address])

    def write(self, address, data):
        self.logical_write_sum += 1
        if address < 0 or address >= self.num_logical_pages:
            self.logical_write_fail_sum += 1
            return 'fail: illegal write address'
        if self.ssd_type == self.TYPE_DIRECT:
            return self.write_direct(address, data)
        elif self.ssd_type == self.TYPE_IDEAL:
            return self.write_ideal(address, data)
        else:
            return self.write_logging(address, data)

    def printable_state(self, s):
        if s == self.STATE_INVALID:
            return 'i'
        elif s == self.STATE_ERASED:
            return 'E'
        elif s == self.STATE_VALID:
            return 'v'
        else:
            print('bad state %d' % s)
            exit(1)

    def stats(self):
        print('Physical Operations Per Block')
        print('Erases ', end='')
        for i in range(self.num_blocks):
            print('%3d ' % self.physical_erase_count[i], end='')
        print(' Sum: %d' % self.physical_erase_sum)

        print('Writes ', end='')
        for i in range(self.num_blocks):
            print('%3d ' % self.physical_write_count[i], end='')
        print(' Sum: %d' % self.physical_write_sum)

        print('Reads ', end='')
        for i in range(self.num_blocks):
            print('%3d ' % self.physical_read_count[i], end='')
        print(' Sum: %d' % self.physical_read_sum)
        print('')

        print('Logical Operation Sums')
        print(' Write count %d (%d failed)' % (self.logical_write_sum, self.logical_write_fail_sum))
        print(' Read count %d (%d failed)' % (self.logical_read_sum, self.logical_read_fail_sum))
        print(' Trim count %d (%d failed)' % (self.logical_trim_sum, self.logical_trim_fail_sum))
        print('')

        print('Times')
        print(' Erase time %.2f' % (self.physical_erase_sum * self.block_erase_time))
        print(' Write time %.2f' % (self.physical_write_sum * self.page_program_time))
        print(' Read time %.2f' % (self.physical_read_sum * self.page_read_time))
        total_time = self.physical_erase_sum * self.block_erase_time + self.physical_write_sum * self.page_program_time + self.physical_read_sum * self.page_read_time
        print(' Total time %.2f' % total_time)
        return

    def dump(self):
        # FTL (row labels are padded to six characters so the columns line up)
        print('FTL   ', end='')
        count = 0
        # at least one column, so tiny devices do not cause a modulo by zero
        ftl_columns = max(1, int((self.pages_per_block * self.num_blocks) / 7))
        for i in range(self.num_logical_pages):
            if self.forward_map[i] == -1:
                continue
            count += 1
            print('%3d:%3d ' % (i, self.forward_map[i]), end='')
            if count > 0 and count % ftl_columns == 0:
                print('\n      ', end='')
        if count == 0:
            print('(empty)', end='')
        print('')

        # FLASH?
        print('Block ', end='')
        for i in range(self.num_blocks):
            out_str = '%d' % i
            print(out_str + ' ' * (self.pages_per_block - len(out_str) + 1), end='')
        print('')

        # page numbers, one row per digit
        max_len = len(str(self.num_pages))
        for n in range(max_len, 0, -1):
            if n == max_len:
                print('Page  ', end='')
            else:
                print('      ', end='')
            for i in range(self.num_pages):
                out_str = str(i).zfill(max_len)[max_len - n]
                print(out_str, end='')
                if i > 0 and (i+1) % 10 == 0:
                    print(end=' ')
            print('')

        print('State ', end='')
        for i in range(self.num_pages):
            print('%s' % self.printable_state(self.state[i]), end='')
            if i > 0 and (i+1) % 10 == 0:
                print(end=' ')
        print('')

        # DATA
        print('Data  ', end='')
        for i in range(self.num_pages):
            if self.state[i] == self.STATE_VALID:
                print('%s' % self.data[i], end='')
            else:
                print(' ', end='')
            if i > 0 and (i+1) % 10 == 0:
                print(end=' ')
        print('')

        # LIVE
        print('Live  ', end='')
        for i in range(self.num_pages):
            if self.state[i] == self.STATE_VALID and self.forward_map[self.reverse_map[i]] == i:
                print('+', end='')
            else:
                print(' ', end='')
            if i > 0 and (i+1) % 10 == 0:
                print(end=' ')
        print('')
        return

#
# MAIN PROGRAM
#
parser = OptionParser()
parser.add_option('-s', '--seed', default=0, help='the random seed', action='store', type='int', dest='seed')
parser.add_option('-n', '--num_cmds', default=10, help='number of commands to randomly generate', action='store', type='int', dest='num_cmds')
parser.add_option('-P', '--op_percentages', default='40/50/10', help='if rand, percent of reads/writes/trims', action='store', type='string', dest='op_percentages')
parser.add_option('-K', '--skew', default='', help='if non-empty, skew, e.g., 80/20: 80% of writes to 20% of logical pages', action='store', type='string', dest='skew')
parser.add_option('-k', '--skew_start', default=0, help='if --skew, skew after this many writes', action='store', type='int', dest='skew_start')
parser.add_option('-r', '--read_fails', default=0, help='if rand, percent of reads that can fail', action='store', type='int', dest='read_fail')
parser.add_option('-L', '--cmd_list', default='', help='comma-separated list of commands (e.g., r10,w20:a)', action='store', type='string', dest='cmd_list')
parser.add_option('-T', '--ssd_type', default='direct', help='SSD type: ideal, direct, log', action='store', type='string', dest='ssd_type')
parser.add_option('-l', '--logical_pages', default=50, help='number of logical pages in interface', action='store', type='int', dest='num_logical_pages')
parser.add_option('-B', '--num_blocks', default=7, help='number of physical blocks in SSD', action='store', type='int', dest='num_blocks')
parser.add_option('-p', '--pages_per_block', default=10, help='pages per physical block', action='store', type='int', dest='pages_per_block')
parser.add_option('-G', '--high_water_mark', default=10, help='blocks used before gc trigger', action='store', type='int', dest='high_water_mark')
parser.add_option('-g', '--low_water_mark', default=8, help='gc target before stopping gc', action='store', type='int', dest='low_water_mark')
parser.add_option('-R', '--read_time', default=10, help='page read time (usecs)', action='store', type='int', dest='read_time')
parser.add_option('-W', '--program_time', default=40, help='page program time (usecs)', action='store', type='int', dest='program_time')
parser.add_option('-E', '--erase_time', default=1000, help='block erase time (usecs)', action='store', type='int', dest='erase_time')
parser.add_option('-J', '--show_gc', default=False, help='show garbage collector behavior', action='store_true', dest='show_gc')
parser.add_option('-F', '--show_state', default=False, help='show flash state', action='store_true', dest='show_state')
parser.add_option('-C', '--show_cmds', default=False, help='show commands', action='store_true', dest='show_cmds')
parser.add_option('-q', '--quiz_cmds', default=False, help='quiz commands', action='store_true', dest='quiz_cmds')
parser.add_option('-S', '--show_stats', default=False, help='show statistics', action='store_true', dest='show_stats')
parser.add_option('-c', '--compute', default=False, help='compute answers for me', action='store_true', dest='solve')
(options, args) = parser.parse_args()
random_seed(options.seed)
print('ARG seed %s' % options.seed)
print('ARG num_cmds %s' % options.num_cmds)
print('ARG op_percentages %s' % options.op_percentages)
print('ARG skew %s' % options.skew)
print('ARG skew_start %s' % options.skew_start)
print('ARG read_fail %s' % options.read_fail)
print('ARG cmd_list %s' % options.cmd_list)
print('ARG ssd_type %s' % options.ssd_type)
print('ARG num_logical_pages %s' % options.num_logical_pages)
print('ARG num_blocks %s' % options.num_blocks)
print('ARG pages_per_block %s' % options.pages_per_block)
print('ARG high_water_mark %s' % options.high_water_mark)
print('ARG low_water_mark %s' % options.low_water_mark)
print('ARG erase_time %s' % options.erase_time)
print('ARG program_time %s' % options.program_time)
print('ARG read_time %s' % options.read_time)
print('ARG show_gc %s' % options.show_gc)
print('ARG show_state %s' % options.show_state)
print('ARG show_cmds %s' % options.show_cmds)
print('ARG quiz_cmds %s' % options.quiz_cmds)
print('ARG show_stats %s' % options.show_stats)
print('ARG compute %s' % options.solve)
print('')
s = ssd(ssd_type=options.ssd_type,
num_logical_pages=options.num_logical_pages, num_blocks=options.num_blocks, pages_per_block=options.pages_per_block,
block_erase_time=float(options.erase_time), page_program_time=float(options.program_time), page_read_time=float(options.read_time),
high_water_mark=options.high_water_mark, low_water_mark=options.low_water_mark, trace_gc=options.show_gc, show_state=options.show_state)
#
# generate cmds (if not passed in by cmd_list)
#
hot_cold = False
skew_start = options.skew_start
if options.skew != '':
    hot_cold = True
    skew = options.skew.split('/')
    if len(skew) != 2:
        print('bad skew specification; should be 80/20 or something like that')
        exit(1)
    hot_percent = int(skew[0])/100.0
    hot_target = int(skew[1])/100.0

if options.cmd_list == '':
    max_page_addr = int(options.num_logical_pages)
    num_cmds = int(options.num_cmds)
    p = options.op_percentages.split('/')
    assert(len(p) == 3)
    percent_reads, percent_writes, percent_trims = int(p[0]), int(p[1]), int(p[2])
    if percent_writes <= 0:
        print('must have some writes, otherwise nothing in the SSD!')
        exit(1)

    printable = string.digits + string.ascii_lowercase + string.ascii_uppercase
    cmd_list = []
    valid_addresses = []
    while len(cmd_list) < num_cmds:
        which_cmd = int(random.random() * 100.0)
        if which_cmd < percent_reads:
            # READ
            if random_randint(0, 99) < int(options.read_fail):
                # read from anywhere, live or not
                address = random_randint(0, max_page_addr - 1)
            else:
                if len(valid_addresses) < 2:
                    continue
                address = random_choice(valid_addresses)
            cmd_list.append('r%d' % address)
        elif which_cmd < percent_reads + percent_writes:
            # WRITE
            if skew_start == 0 and hot_cold and random.random() < hot_percent:
                address = random_randint(0, int(hot_target * (max_page_addr - 1)))
            else:
                address = random_randint(0, max_page_addr - 1)
            if address not in valid_addresses:
                valid_addresses.append(address)
            data = random_choice(list(printable))
            cmd_list.append('w%d:%s' % (address, data))
            if skew_start > 0:
                skew_start -= 1
        else:
            # TRIM
            if len(valid_addresses) < 1:
                continue
            address = random_choice(valid_addresses)
            cmd_list.append('t%d' % address)
            valid_addresses.remove(address)
else:
    cmd_list = options.cmd_list.split(',')

s.dump()
print('')

show_state = options.show_state
show_cmds = options.show_cmds
quiz_cmds = options.quiz_cmds

# quizzing only makes sense if the intermediate states are visible
if quiz_cmds:
    show_state = True

op = 0
for cmd in cmd_list:
    if cmd == '':
        break
    if cmd[0] == 'r':
        # r10
        address = int(cmd.split('r')[1])
        data = s.read(address)
        if show_cmds or (quiz_cmds and options.solve):
            print('cmd %3d:: read(%d) -> %s' % (op, address, data))
        elif quiz_cmds:
            print('cmd %3d:: read(%d) -> ??' % (op, address))
        op += 1
    elif cmd[0] == 'w':
        # w80:b
        parts = cmd.split(':')
        address = int(parts[0].split('w')[1])
        data = parts[1]
        rc = s.write(address, data)
        if show_cmds or (quiz_cmds and options.solve):
            print('cmd %3d:: write(%d, %s) -> %s' % (op, address, data, rc))
        elif quiz_cmds:
            print('cmd %3d:: command(??) -> ??' % op)
        op += 1
    elif cmd[0] == 't':
        # t10
        address = int(cmd.split('t')[1])
        rc = s.trim(address)
        if show_cmds or (quiz_cmds and options.solve):
            print('cmd %3d:: trim(%d) -> %s' % (op, address, rc))
        elif quiz_cmds:
            print('cmd %3d:: command(??) -> ??' % op)
        op += 1

    if show_state:
        print('')
        s.dump()
        print('')

    # Do GC?
    s.upkeep()

if not show_state:
    print('')
    s.dump()
    print('')

if options.show_stats:
    s.stats()
    print('')