This document describes the use and implementation of a new fio histogram-based latency
percentile measurement tool.
History
Why the old tool wasn’t good enough
Design and implementation
Histogram log parsing
Histogram buckets
Histogram time alignment
Histogram addition
Percentile calculation from a histogram
References
History
As a result of the need to measure I/O latency percentiles for a cluster with hundreds or even
thousands of storage users, Karl Cronberg and I worked on the fiologparser_hist.py tool 2 years
ago [1]. First he added the capability for fio to emit to a log file the periodic latency
histograms it had already been maintaining in memory. Once he wrote the tool to post-process
this data, we could calculate latency percentiles as a function of time for a large
distributed-storage cluster, so that we could understand how cluster events, such as node
failures, impact the response time of real applications. The results were shocking: maximum
latencies as high as hundreds of seconds during an OSD node failure. fio's traditional latency
percentile output might not reveal this, because fio averages across the entire test run and
cannot combine percentiles from multiple processes. This evidence, along with
customer-experience reports, helped convince the Ceph team to introduce features aimed at
reducing these high latencies. We are now about to see whether the upstream Ceph Luminous
release (RHCS 3) behaves better in this area than the upstream Ceph Jewel release (RHCS 2)
did.
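For reference, the histogram-logging capability described above is enabled through fio job
options (write_hist_log and log_hist_msec, per the fio documentation). The job below is an
illustrative sketch, not the configuration used in the original tests; the workload parameters
and target device are assumptions:

```ini
; Sketch of a fio job that emits periodic latency histogram logs.
; Option names are from the fio documentation; workload values and the
; target device are illustrative only.
[global]
ioengine=libaio
direct=1
time_based=1
runtime=300

[randread-hist]
rw=randread
bs=4k
iodepth=16
filename=/dev/sdb        ; illustrative target device
write_hist_log=rr        ; emit histogram logs named rr_clat_hist.<jobnum>.log
log_hist_msec=1000       ; dump the in-memory histogram every 1000 ms
```

The resulting per-interval histogram log files are what a post-processing tool such as
fiologparser_hist.py consumes to compute percentiles as a function of time.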
Why the old tool wasn’t good enough
I got frustrated with fiologparser_hist.py, the tool for calculating fio latency percentiles as a
function of time, because of: