docs: Document the throttling infrastructure
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
parent f5a845fdb4
commit 1ffad77cde
252
docs/throttle.txt
Normal file
@@ -0,0 +1,252 @@
The QEMU throttling infrastructure
==================================
Copyright (C) 2016 Igalia, S.L.
Author: Alberto Garcia <berto@igalia.com>

This work is licensed under the terms of the GNU GPL, version 2 or
later. See the COPYING file in the top-level directory.

Introduction
------------
QEMU includes a throttling module that can be used to set limits on
I/O operations. The code itself is generic and independent of the I/O
units, but it is currently used to limit the number of bytes per second
and operations per second (IOPS) when performing disk I/O.

This document explains how to use the throttling code in QEMU, and how
it works internally. The implementation is in throttle.c.


Using throttling to limit disk I/O
----------------------------------
Two aspects of the disk I/O can be limited: the number of bytes per
second and the number of operations per second (IOPS). For each one of
them the user can set a global limit or separate limits for read and
write operations. This gives us a total of six different parameters.

I/O limits can be set using the throttling.* parameters of -drive, or
using the QMP 'block_set_io_throttle' command. These are the names of
the parameters for both cases:

   |-----------------------+-----------------------|
   | -drive                | block_set_io_throttle |
   |-----------------------+-----------------------|
   | throttling.iops-total | iops                  |
   | throttling.iops-read  | iops_rd               |
   | throttling.iops-write | iops_wr               |
   | throttling.bps-total  | bps                   |
   | throttling.bps-read   | bps_rd                |
   | throttling.bps-write  | bps_wr                |
   |-----------------------+-----------------------|

It is possible to set limits for both IOPS and bps at the same time,
and for each case we can decide whether to have separate read and
write limits or not, but note that if iops-total is set then neither
iops-read nor iops-write can be set. The same applies to bps-total and
bps-read/write.

The default value of these parameters is 0, and it means 'unlimited'.

In its most basic usage, the user can add a drive to QEMU with a limit
of 100 IOPS with the following -drive line:

   -drive file=hd0.qcow2,throttling.iops-total=100

We can do the same using QMP. In this case all these parameters are
mandatory, so we must set to 0 the ones that we don't want to limit:

   { "execute": "block_set_io_throttle",
     "arguments": {
        "device": "virtio0",
        "iops": 100,
        "iops_rd": 0,
        "iops_wr": 0,
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0
     }
   }
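
Since all six parameters are mandatory in QMP, it can be convenient
to generate the command instead of writing it by hand. The following
is a minimal sketch in Python, not part of QEMU: the helper name
build_throttle_command is hypothetical, and it only fills in the
missing parameters with 0 and checks the rule that a total limit
cannot be combined with a read or write limit.

```python
import json

# Names of the six basic limits accepted by block_set_io_throttle,
# as listed in the table earlier in this section.
BASIC_LIMITS = ("iops", "iops_rd", "iops_wr", "bps", "bps_rd", "bps_wr")


def build_throttle_command(device, **limits):
    """Return a block_set_io_throttle command as a dict.

    Hypothetical helper for illustration; it is not a QEMU API.
    """
    unknown = set(limits) - set(BASIC_LIMITS)
    if unknown:
        raise ValueError("unknown limits: " + ", ".join(sorted(unknown)))

    # All six parameters are mandatory in QMP; 0 means 'unlimited'.
    args = {name: limits.get(name, 0) for name in BASIC_LIMITS}

    # A total limit cannot be combined with a read or write limit.
    for kind in ("iops", "bps"):
        if args[kind] and (args[kind + "_rd"] or args[kind + "_wr"]):
            raise ValueError(
                f"{kind} cannot be combined with {kind}_rd or {kind}_wr")

    return {"execute": "block_set_io_throttle",
            "arguments": {"device": device, **args}}


# Same command as the QMP example above: 100 IOPS on 'virtio0'.
print(json.dumps(build_throttle_command("virtio0", iops=100), indent=2))
```

The resulting text still has to be sent over a QMP connection, which
is outside the scope of this sketch.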


I/O bursts
----------
In addition to the basic limits we have just seen, QEMU allows the
user to do bursts of I/O for a configurable amount of time. A burst is
an amount of I/O that can exceed the basic limit. Bursts are useful to
allow better performance when there are peaks of activity (the OS
boots, a service needs to be restarted) while keeping the average
limits lower the rest of the time.

Two parameters control bursts: their length and the maximum amount of
I/O they allow. These two can be configured separately for each one of
the six basic parameters described in the previous section, but in
this section we'll use 'iops-total' as an example.

The I/O limit during bursts is set using 'iops-total-max', and the
maximum length (in seconds) is set with 'iops-total-max-length'. So if
we want to configure a drive with a basic limit of 100 IOPS and allow
bursts of 2000 IOPS for 60 seconds, we would do it like this (the line
is split for clarity):

   -drive file=hd0.qcow2,
          throttling.iops-total=100,
          throttling.iops-total-max=2000,
          throttling.iops-total-max-length=60

Or, with QMP:

   { "execute": "block_set_io_throttle",
     "arguments": {
        "device": "virtio0",
        "iops": 100,
        "iops_rd": 0,
        "iops_wr": 0,
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0,
        "iops_max": 2000,
        "iops_max_length": 60
     }
   }

With this, the user can perform I/O on hd0.qcow2 at a rate of 2000
IOPS for 1 minute before it's throttled down to 100 IOPS.

The user will be able to do bursts again if there's a sufficiently
long period of time with unused I/O (see below for details).

The default value for 'iops-total-max' is 0 and it means that bursts
are not allowed. 'iops-total-max-length' can only be set if
'iops-total-max' is set as well, and its default value is 1 second.
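
The dependency between the two burst parameters can be checked before
QEMU is started. The following Python sketch is only an illustration:
the function burst_drive_options is hypothetical, it handles the
'iops-total' case only, and it produces the -drive value on a single
line.

```python
def burst_drive_options(filename, iops_total, iops_total_max=0,
                        iops_total_max_length=None):
    """Build the value of a -drive option with 'iops-total' burst settings.

    Hypothetical helper for illustration; it is not part of QEMU.
    """
    # 'iops-total-max-length' can only be set if 'iops-total-max' is set.
    if iops_total_max_length is not None and not iops_total_max:
        raise ValueError("iops-total-max-length requires iops-total-max")

    opts = [f"file={filename}", f"throttling.iops-total={iops_total}"]
    if iops_total_max:
        opts.append(f"throttling.iops-total-max={iops_total_max}")
    if iops_total_max_length is not None:
        opts.append(
            f"throttling.iops-total-max-length={iops_total_max_length}")
    return ",".join(opts)


# The burst example from this section: 100 IOPS, bursts of 2000 IOPS
# for 60 seconds.
print("-drive " + burst_drive_options("hd0.qcow2", 100, 2000, 60))
```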

Here's the complete list of parameters for configuring bursts:

   |----------------------------------+-----------------------|
   | -drive                           | block_set_io_throttle |
   |----------------------------------+-----------------------|
   | throttling.iops-total-max        | iops_max              |
   | throttling.iops-total-max-length | iops_max_length       |
   | throttling.iops-read-max         | iops_rd_max           |
   | throttling.iops-read-max-length  | iops_rd_max_length    |
   | throttling.iops-write-max        | iops_wr_max           |
   | throttling.iops-write-max-length | iops_wr_max_length    |
   | throttling.bps-total-max         | bps_max               |
   | throttling.bps-total-max-length  | bps_max_length        |
   | throttling.bps-read-max          | bps_rd_max            |
   | throttling.bps-read-max-length   | bps_rd_max_length     |
   | throttling.bps-write-max         | bps_wr_max            |
   | throttling.bps-write-max-length  | bps_wr_max_length     |
   |----------------------------------+-----------------------|


Controlling the size of I/O operations
--------------------------------------
When applying IOPS limits all I/O operations are treated equally
regardless of their size. This means that the user can take advantage
of this in order to circumvent the limits and submit one huge I/O
request instead of several smaller ones.

QEMU provides a setting called throttling.iops-size to prevent this
from happening. This setting specifies the size (in bytes) of an I/O
request for accounting purposes. Larger requests will be counted
proportionally to this size.

For example, if iops-size is set to 4096 then an 8KB request will be
counted as two, and a 6KB request will be counted as one and a
half. This only applies to requests larger than iops-size: smaller
requests will always be counted as one, no matter their size.

The default value of iops-size is 0 and it means that the size of the
requests is never taken into account when applying IOPS limits.
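
The accounting rule described in this section can be written down in
a few lines. This is a sketch of the rule as stated here, not the code
from throttle.c; the function name accounted_ops is hypothetical and
1KB is assumed to be 1024 bytes.

```python
def accounted_ops(request_bytes, iops_size=0):
    """Number of operations a single request counts as for IOPS limits."""
    # iops-size=0 (the default) means the request size is ignored, and
    # requests no larger than iops-size always count as one operation.
    if iops_size == 0 or request_bytes <= iops_size:
        return 1.0
    # Larger requests are counted proportionally to iops-size.
    return request_bytes / iops_size


KB = 1024  # assumption: the 'KB' used in the text means 1024 bytes

for size in (512, 4 * KB, 6 * KB, 8 * KB):
    print(size, accounted_ops(size, iops_size=4096))
```

With iops-size=4096 the 6KB and 8KB requests are counted as 1.5 and 2
operations respectively, and the two smaller ones as 1.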


Applying I/O limits to groups of disks
--------------------------------------
In all the examples so far we have seen how to apply limits to the I/O
performed on individual drives, but QEMU allows grouping drives so
they all share the same limits.

The way it works is that each drive with I/O limits is assigned to a
group named using the throttling.group parameter. If this parameter is
not specified, then the device name (e.g. 'virtio0', 'ide0-hd0') will
be used as the group name.

Limits set using the throttling.* parameters discussed earlier in this
document apply to the combined I/O of all members of a group.

Consider this example:

   -drive file=hd1.qcow2,throttling.iops-total=6000,throttling.group=foo
   -drive file=hd2.qcow2,throttling.iops-total=6000,throttling.group=foo
   -drive file=hd3.qcow2,throttling.iops-total=3000,throttling.group=bar
   -drive file=hd4.qcow2,throttling.iops-total=6000,throttling.group=foo
   -drive file=hd5.qcow2,throttling.iops-total=3000,throttling.group=bar
   -drive file=hd6.qcow2,throttling.iops-total=5000

Here hd1, hd2 and hd4 are all members of a group named 'foo' with a
combined IOPS limit of 6000, and hd3 and hd5 are members of 'bar'. hd6
is left alone (technically it is part of a 1-member group).

Limits are applied in a round-robin fashion, so if there are concurrent
I/O requests on several drives of the same group they will be
distributed evenly.

When I/O limits are applied to an existing drive using the QMP command
'block_set_io_throttle', the following things need to be taken into
account:

   - I/O limits are shared within the same group, so new values will
     affect all members and overwrite the previous settings. In other
     words: if different limits are applied to members of the same
     group, the last one wins.

   - If 'group' is unset it is assumed to be the current group of that
     drive. If the drive is not in a group yet, it will be added to a
     group named after the device name.

   - If 'group' is set then the drive will be moved to that group if
     it was a member of a different one. In this case the limits
     specified in the parameters will be applied to the new group
     only.

   - I/O limits can be disabled by setting all of them to 0. In this
     case the device will be removed from its group and the rest of
     its members will not be affected. The 'group' parameter is
     ignored.
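
The four rules above can be summarised with a toy model of group
membership. This Python sketch only tracks which group each device
belongs to and which limits each group has; the class and method names
are hypothetical, and forgetting the limits of a group once it has no
members left is an assumption of the model, not something stated in
this document.

```python
class ThrottleGroups:
    """Toy model of throttle group membership (not QEMU's implementation)."""

    def __init__(self):
        self.group_of = {}  # device name -> group name
        self.limits = {}    # group name -> limits shared by its members

    def _leave(self, device):
        old = self.group_of.pop(device, None)
        # Assumption of this model: a group with no members is forgotten.
        if old is not None and old not in self.group_of.values():
            self.limits.pop(old, None)

    def set_io_throttle(self, device, limits, group=None):
        if not any(limits.values()):
            # All limits are 0: throttling is disabled, the device leaves
            # its group and the 'group' argument is ignored.
            self._leave(device)
            return
        current = self.group_of.get(device)
        if group is None:
            # Keep the current group, or use the device name if there is none.
            group = current if current is not None else device
        if group != current:
            self._leave(device)
        self.group_of[device] = group
        # Limits are shared within the group: the last values set win.
        self.limits[group] = dict(limits)


groups = ThrottleGroups()
groups.set_io_throttle("virtio0", {"iops": 100})               # group 'virtio0'
groups.set_io_throttle("virtio1", {"iops": 200}, group="foo")
groups.set_io_throttle("virtio0", {"iops": 300}, group="foo")  # moved to 'foo'
print(groups.group_of, groups.limits)
```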


The Leaky Bucket algorithm
--------------------------
I/O limits in QEMU are implemented using the leaky bucket algorithm
(specifically the "Leaky bucket as a meter" variant).

This algorithm uses the analogy of a bucket that leaks water
constantly. The water that gets into the bucket represents the I/O
that has been performed, and no more I/O is allowed once the bucket is
full.

To see the way this corresponds to the throttling parameters in QEMU,
consider the following values:

   iops-total=100
   iops-total-max=2000
   iops-total-max-length=60

   - Water leaks from the bucket at a rate of 100 IOPS.
   - Water can be added to the bucket at a rate of 2000 IOPS.
   - The size of the bucket is 2000 x 60 = 120000.
   - If 'iops-total-max-length' is unset then it defaults to 1 second,
     so the bucket size is 2000 x 1 = 2000.

The bucket is initially empty, therefore water can be added until it's
full at a rate of 2000 IOPS (the burst rate). Once the bucket is full
we can only add as much water as it leaks, therefore the I/O rate is
reduced to 100 IOPS. If we add less water than it leaks then the
bucket will start to empty, allowing for bursts again.
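
The behaviour described above can be reproduced with a very small
simulation. This is a sketch of the bucket analogy only, with a
granularity of one second and with burst parameters always set; it is
not the algorithm from throttle.c, and the function name simulate is
hypothetical.

```python
def simulate(avg, burst, length, demand, seconds):
    """Return the I/O allowed in each second for a constant demand.

    avg:    leak rate of the bucket (e.g. iops-total)
    burst:  maximum rate at which water can be added (e.g. iops-total-max)
    length: burst length in seconds (e.g. iops-total-max-length)
    demand: I/O operations the guest tries to perform every second
    """
    size = burst * length  # capacity of the bucket
    level = 0              # the bucket is initially empty
    allowed_per_second = []
    for _ in range(seconds):
        # Room for this second: the free space plus what leaks meanwhile.
        room = size - level + avg
        allowed = min(demand, burst, room)
        level = max(0, level + allowed - avg)
        allowed_per_second.append(allowed)
    return allowed_per_second


rates = simulate(avg=100, burst=2000, length=60, demand=2000, seconds=70)
print("seconds at the burst rate:", rates.count(2000))
print("rate once the bucket is full:", rates[-1])
```

With the values used in this section the simulated guest runs at 2000
IOPS for 63 seconds and ends up limited to 100 IOPS.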

Note that since water is leaking from the bucket even during bursts,
it will take a bit more than 60 seconds at 2000 IOPS to fill it
up. After those 60 seconds the bucket will have leaked 60 x 100 =
6000, allowing for 3 more seconds of I/O at 2000 IOPS.

Also, due to the way the algorithm works, longer bursts can be done at
a lower I/O rate, e.g. 1000 IOPS for 120 seconds.
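
The length of a burst at a given rate follows directly from the bucket
analogy: an empty bucket fills at the I/O rate minus the leak rate. The
Python sketch below does that arithmetic; the function name
burst_duration is hypothetical, and the result describes the idealised
bucket, not measurements of QEMU.

```python
def burst_duration(avg, burst, length, rate):
    """Seconds an initially empty bucket takes to fill at a sustained rate."""
    if rate > burst:
        raise ValueError("rate cannot exceed the burst limit")
    if rate <= avg:
        # The bucket leaks at least as fast as it fills: it never gets full.
        return float("inf")
    return (burst * length) / (rate - avg)


# iops-total=100, iops-total-max=2000, iops-total-max-length=60
print(round(burst_duration(100, 2000, 60, rate=2000), 1))
print(round(burst_duration(100, 2000, 60, rate=1000), 1))
```

Because the leak is taken into account, both results are somewhat
larger than the bucket size divided by the I/O rate (60 and 120
seconds for these two cases).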