jmroot/iomkc
I/O Markov chain tools
by Joshua Root <[email protected]>

To build a Markov chain:

You'll need some blktrace output, which comes in the form of one file per CPU, normally named {device}.blktrace.{cpu}. First do: "./prep.sh {device} OPFILE". Then run "./buildChain.py -i OPFILE -o CHAINFILE" and a Markov chain built from the trace will be written to CHAINFILE.

Options -s, -k and -d set the size of the buckets into which the transfer sizes (in bytes), seek lengths (in sectors) and inter-op delays (in seconds) will be divided, respectively. Smaller buckets mean longer build times and a larger chain file, in exchange for a more accurate distribution of the values.

To run an I/O workload based on your Markov chain:

First build the C module with "python setup.py build". Then move it where Python will find it with "mv build/lib.*/*.so ." (assuming you're in the directory containing runChain.py). Now run: "./runChain.py -i CHAINFILE -d TARGET".

TARGET will be opened with O_DIRECT if available, so you really want it to be a device, e.g. /dev/sdb. The script will keep reading from and writing to TARGET until you stop it, e.g. with ctrl-c. Alternatively, you can specify a maximum number of ops to perform and/or a maximum time for which to run, using the -n and -t options respectively.

Adding "-p OUT" to the command line will cause runChain to not perform the I/O operations itself, but instead write a description of them to OUT in btrecord format. The idea is that you can then use OUT as input for btreplay.

Why the C extension exists:

Reads initially always threw exceptions. Apparently this was because Python doesn't bother allocating blocksize-aligned buffers when reading from a file that has been opened with O_DIRECT. I wrote a C extension module that does it right.

Random implementation-related notes I wrote while developing this stuff:

Use min/max data from trace to determine appropriate buckets into which to divide transfer sizes, inter-op delays, and seek distance.
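The bucketing idea can be sketched roughly as follows. This is not taken from buildChain.py: the function names are hypothetical, and fixed-width buckets computed by floor division are an assumption about how the -s, -k and -d sizes might be applied.

```python
def bucket_index(value, bucket_size):
    """Index of the fixed-width bucket containing value (floor division)."""
    return int(value // bucket_size)


def bucket_range(values, bucket_size):
    """Lowest and highest bucket index needed to cover a trace's min/max."""
    return (bucket_index(min(values), bucket_size),
            bucket_index(max(values), bucket_size))


def quantize_op(size_bytes, seek_sectors, delay_seconds,
                size_bucket, seek_bucket, delay_bucket):
    """Reduce one I/O op to a (size, seek, delay) triplet of bucket indices.

    Seeks may be negative (backwards on the device); floor division keeps
    those in their own negative-indexed buckets.
    """
    return (bucket_index(size_bytes, size_bucket),
            bucket_index(seek_sectors, seek_bucket),
            bucket_index(delay_seconds, delay_bucket))
```

With smaller bucket sizes, more distinct triplets appear in the trace, which is where the longer build time and larger chain file come from.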
Count up the number of transitions between each pair of triplets. Store them as a weighted adjacency matrix (sparse, so use hashing).

Replay by generating I/Os matching the current triplet, then moving to the next triplet based on which column in the current matrix row is matched by random(). Store rows as lists of cumulative probabilities, e.g. [0.1, 0.3, 0.8, 1.0].
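A minimal sketch of the counting and replay steps in these notes, assuming each state is any hashable value such as a (size, seek, delay) bucket triplet. The function names are hypothetical and the real chain file format is not shown here.

```python
import bisect
import random
from collections import defaultdict


def count_transitions(states):
    """Count how often each state is immediately followed by each other state.

    Nested dicts act as the sparse weighted adjacency matrix.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    return counts


def to_cumulative_rows(counts):
    """Turn each row of counts into (targets, cumulative probabilities)."""
    rows = {}
    for cur, successors in counts.items():
        total = sum(successors.values())
        targets, cumulative, running = [], [], 0
        for nxt, n in successors.items():
            running += n
            targets.append(nxt)
            cumulative.append(running / total)  # last entry is exactly 1.0
        rows[cur] = (targets, cumulative)
    return rows


def next_state(rows, current, rng=random):
    """Pick the successor whose cumulative-probability slot contains random().

    Raises KeyError if current never had a successor in the trace
    (for example, the final op of the trace).
    """
    targets, cumulative = rows[current]
    return targets[bisect.bisect_right(cumulative, rng.random())]
```

Storing cumulative probabilities means each step needs only one random number and a binary search over the row, rather than a pass that sums weights every time.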