187 lines
6.5 KiB
Plaintext
187 lines
6.5 KiB
Plaintext
Using the Linux Kernel Latency Histograms
|
|
|
|
|
|
This document gives a short explanation how to enable, configure and use
|
|
latency histograms. Latency histograms are primarily relevant in the
|
|
context of real-time enabled kernels (CONFIG_PREEMPT/CONFIG_PREEMPT_RT)
|
|
and are used in the quality management of the Linux real-time
|
|
capabilities.
|
|
|
|
|
|
* Purpose of latency histograms
|
|
|
|
A latency histogram continuously accumulates the frequencies of latency
|
|
data. There are two types of histograms
|
|
- potential sources of latencies
|
|
- effective latencies
|
|
|
|
|
|
* Potential sources of latencies
|
|
|
|
Potential sources of latencies are code segments where interrupts,
|
|
preemption or both are disabled (aka critical sections). To create
|
|
histograms of potential sources of latency, the kernel stores the time
|
|
stamp at the start of a critical section, determines the time elapsed
|
|
when the end of the section is reached, and increments the frequency
|
|
counter of that latency value - irrespective of whether any concurrently
|
|
running process is affected by latency or not.
|
|
- Configuration items (in the Kernel hacking/Tracers submenu)
|
|
CONFIG_INTERRUPT_OFF_LATENCY
|
|
CONFIG_PREEMPT_OFF_LATENCY
|
|
|
|
|
|
* Effective latencies
|
|
|
|
Effective latencies are actually occuring during wakeup of a process. To
|
|
determine effective latencies, the kernel stores the time stamp when a
|
|
process is scheduled to be woken up, and determines the duration of the
|
|
wakeup time shortly before control is passed over to this process. Note
|
|
that the apparent latency in user space may be somewhat longer, since the
|
|
process may be interrupted after control is passed over to it but before
|
|
the execution in user space takes place. Simply measuring the interval
|
|
between enqueuing and wakeup may also not appropriate in cases when a
|
|
process is scheduled as a result of a timer expiration. The timer may have
|
|
missed its deadline, e.g. due to disabled interrupts, but this latency
|
|
would not be registered. Therefore, the offsets of missed timers are
|
|
recorded in a separate histogram. If both wakeup latency and missed timer
|
|
offsets are configured and enabled, a third histogram may be enabled that
|
|
records the overall latency as a sum of the timer latency, if any, and the
|
|
wakeup latency. This histogram is called "timerandwakeup".
|
|
- Configuration items (in the Kernel hacking/Tracers submenu)
|
|
CONFIG_WAKEUP_LATENCY
|
|
CONFIG_MISSED_TIMER_OFSETS
|
|
|
|
|
|
* Usage
|
|
|
|
The interface to the administration of the latency histograms is located
|
|
in the debugfs file system. To mount it, either enter
|
|
|
|
mount -t sysfs nodev /sys
|
|
mount -t debugfs nodev /sys/kernel/debug
|
|
|
|
from shell command line level, or add
|
|
|
|
nodev /sys sysfs defaults 0 0
|
|
nodev /sys/kernel/debug debugfs defaults 0 0
|
|
|
|
to the file /etc/fstab. All latency histogram related files are then
|
|
available in the directory /sys/kernel/debug/tracing/latency_hist. A
|
|
particular histogram type is enabled by writing non-zero to the related
|
|
variable in the /sys/kernel/debug/tracing/latency_hist/enable directory.
|
|
Select "preemptirqsoff" for the histograms of potential sources of
|
|
latencies and "wakeup" for histograms of effective latencies etc. The
|
|
histogram data - one per CPU - are available in the files
|
|
|
|
/sys/kernel/debug/tracing/latency_hist/preemptoff/CPUx
|
|
/sys/kernel/debug/tracing/latency_hist/irqsoff/CPUx
|
|
/sys/kernel/debug/tracing/latency_hist/preemptirqsoff/CPUx
|
|
/sys/kernel/debug/tracing/latency_hist/wakeup/CPUx
|
|
/sys/kernel/debug/tracing/latency_hist/wakeup/sharedprio/CPUx
|
|
/sys/kernel/debug/tracing/latency_hist/missed_timer_offsets/CPUx
|
|
/sys/kernel/debug/tracing/latency_hist/timerandwakeup/CPUx
|
|
|
|
The histograms are reset by writing non-zero to the file "reset" in a
|
|
particular latency directory. To reset all latency data, use
|
|
|
|
#!/bin/sh
|
|
|
|
TRACINGDIR=/sys/kernel/debug/tracing
|
|
HISTDIR=$TRACINGDIR/latency_hist
|
|
|
|
if test -d $HISTDIR
|
|
then
|
|
cd $HISTDIR
|
|
for i in `find . | grep /reset$`
|
|
do
|
|
echo 1 >$i
|
|
done
|
|
fi
|
|
|
|
|
|
* Data format
|
|
|
|
Latency data are stored with a resolution of one microsecond. The
|
|
maximum latency is 10,240 microseconds. The data are only valid, if the
|
|
overflow register is empty. Every output line contains the latency in
|
|
microseconds in the first row and the number of samples in the second
|
|
row. To display only lines with a positive latency count, use, for
|
|
example,
|
|
|
|
grep -v " 0$" /sys/kernel/debug/tracing/latency_hist/preemptoff/CPU0
|
|
|
|
#Minimum latency: 0 microseconds.
|
|
#Average latency: 0 microseconds.
|
|
#Maximum latency: 25 microseconds.
|
|
#Total samples: 3104770694
|
|
#There are 0 samples greater or equal than 10240 microseconds
|
|
#usecs samples
|
|
0 2984486876
|
|
1 49843506
|
|
2 58219047
|
|
3 5348126
|
|
4 2187960
|
|
5 3388262
|
|
6 959289
|
|
7 208294
|
|
8 40420
|
|
9 4485
|
|
10 14918
|
|
11 18340
|
|
12 25052
|
|
13 19455
|
|
14 5602
|
|
15 969
|
|
16 47
|
|
17 18
|
|
18 14
|
|
19 1
|
|
20 3
|
|
21 2
|
|
22 5
|
|
23 2
|
|
25 1
|
|
|
|
|
|
* Wakeup latency of a selected process
|
|
|
|
To only collect wakeup latency data of a particular process, write the
|
|
PID of the requested process to
|
|
|
|
/sys/kernel/debug/tracing/latency_hist/wakeup/pid
|
|
|
|
PIDs are not considered, if this variable is set to 0.
|
|
|
|
|
|
* Details of the process with the highest wakeup latency so far
|
|
|
|
Selected data of the process that suffered from the highest wakeup
|
|
latency that occurred in a particular CPU are available in the file
|
|
|
|
/sys/kernel/debug/tracing/latency_hist/wakeup/max_latency-CPUx.
|
|
|
|
In addition, other relevant system data at the time when the
|
|
latency occurred are given.
|
|
|
|
The format of the data is (all in one line):
|
|
<PID> <Priority> <Latency> (<Timeroffset>) <Command> \
|
|
<- <PID> <Priority> <Command> <Timestamp>
|
|
|
|
The value of <Timeroffset> is only relevant in the combined timer
|
|
and wakeup latency recording. In the wakeup recording, it is
|
|
always 0, in the missed_timer_offsets recording, it is the same
|
|
as <Latency>.
|
|
|
|
When retrospectively searching for the origin of a latency and
|
|
tracing was not enabled, it may be helpful to know the name and
|
|
some basic data of the task that (finally) was switching to the
|
|
late real-tlme task. In addition to the victim's data, also the
|
|
data of the possible culprit are therefore displayed after the
|
|
"<-" symbol.
|
|
|
|
Finally, the timestamp of the time when the latency occurred
|
|
in <seconds>.<microseconds> after the most recent system boot
|
|
is provided.
|
|
|
|
These data are also reset when the wakeup histogram is reset.
|