Collect blktrace
(Caution: it will increase load on the system)
mount -t debugfs debugfs /sys/kernel/debug
df /sys/kernel/debug
cd /var/tmp
mkdir blktrace_data
cd blktrace_data
# It will create one file per device per CPU
blktrace -d /dev/sdbs /dev/sdq # dump binary traces for one or more disks
blkrawverify sdbs # it will create sdbs.verify.out file
less sdbs.verify.out
grep invalid sdbs.verify.out
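The blktrace invocation above keeps tracing until it is interrupted with Ctrl-C. A time-boxed variant, sketched here with the same device names, avoids leaving the trace running by accident:
# capture for about 30 seconds, then stop automatically
blktrace -w 30 -d /dev/sdbs -d /dev/sdq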
For many devices, use blkparse to combine the blktrace data and then use btt to create a report.
# combine all the files into one binary time-ordered stream of traces
blkparse -i sdbs -d bp.sdbs.bin # .blktrace.* not required
btt -A -i bp.sdbs.bin > bp.sdbs.txt
less bp.sdbs.txt
blkparse sdbs > bp.sdbs.txt # text file
less bp.sdbs.txt
# iostat-like output
btt -I bp.sdbs.iostat -i bp.sdbs.bin
less bp.sdbs.iostat
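btt can also emit per-I/O latency streams that are convenient for plotting. A sketch follows; the d2c_lat/q2c_lat prefixes are arbitrary, and the exact .dat file names it produces depend on the btt version:
# per-I/O D2C and Q2C latencies, one .dat file per traced device
btt -i bp.sdbs.bin -l d2c_lat -q q2c_lat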
Interpreting Information
Note: all times are in milliseconds
Q2Q — time between requests sent to the block layer
Q2G — how long it takes from the time a block I/O is queued to the time it gets a request allocated for it
G2I — how long it takes from the time a request is allocated to the time it is inserted into the device's queue
Q2M — how long it takes from the time a block I/O is queued to the time it gets merged with an existing request
I2D — how long it takes from the time a request is inserted into the device's queue to the time it is actually issued to the device
M2D — how long it takes from the time a block I/O is merged with an existing request until the request is issued to the device
D2C — service time of the request by the device
Q2C — total time spent in the block layer for a request
Q------->G------------>I--------->M------------------->D----------------------------->C
|-Q time-|-Insert time-|
|--------- merge time ------------|-merge with other IO|
|------------------scheduler time-----------------------|--driver, adapter, storage time|
|----------------------- await time in iostat output ----------------------------------|
If Q2Q is much larger than Q2C, that means the application is not issuing I/O in rapid succession. Thus, any performance problems you have may not be related to the I/O subsystem at all.
If D2C is very high, the device is taking a long time to service requests. This can indicate that the device is simply overloaded (which may be because it is a shared resource), or that the workload sent down to the device is sub-optimal.
If Q2G is very high, it means that there are a lot of requests queued concurrently. This could indicate that the storage is unable to keep up with the I/O load.
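One thing worth checking in that case, sketched here for the sdbs device used above, is how many requests the block layer may queue for the device and which scheduler is sorting them:
cat /sys/block/sdbs/queue/nr_requests # how many requests the block layer may queue
cat /sys/block/sdbs/queue/scheduler # active I/O scheduler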
await in iostat output = Q2C = Q2I + I2D + D2C (where Q2I = Q2G + G2I)
Q2I + I2D == scheduler time
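As a rough cross-check, assuming the same device and measurement window, the await column that iostat reports for the device should be in the same ballpark as btt's average Q2C:
iostat -dx sdbs 1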
The I2D time can include a lot of apparent extra time due to plug and unplug events (not shown above), which are used to improve merging of I/O within the scheduler sort queue.
D2C time covers driver time, adapter time, transport time, and storage service time (and back)
So if D2C/Q2C approaches 1, the percentage of time spent in the storage components is high.
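As a worked example with made-up numbers: if Q2G = 0.05 ms, G2I = 0.01 ms, I2D = 0.30 ms and D2C = 5.0 ms for an unmerged request, then Q2C = 0.05 + 0.01 + 0.30 + 5.0 = 5.36 ms and D2C/Q2C = 5.0/5.36 ≈ 0.93, i.e. roughly 93% of the per-request latency is below the block layer, in the driver/adapter/storage path.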
For high D2C times, the underlying storage transport needs to be examined, for example switch counters or the maintenance interfaces of the storage boxes themselves.