qcow2 L2/refcount cache configuration
=====================================
Copyright (C) 2015 Igalia, S.L.
Author: Alberto Garcia <[email protected]>

This work is licensed under the terms of the GNU GPL, version 2 or
later. See the COPYING file in the top-level directory.

Introduction
------------
The QEMU qcow2 driver has two caches that can improve the I/O
performance significantly. However, setting the right cache sizes is
not a straightforward operation.

This document attempts to give an overview of the L2 and refcount
caches, and how to configure them.

Please refer to the docs/specs/qcow2.txt file for an in-depth
technical description of the qcow2 file format.


Clusters
--------
A qcow2 file is organized in units of constant size called clusters.

The cluster size is configurable, but it must be a power of two and
at least 512 bytes. QEMU currently defaults to 64 KB clusters and
does not support sizes larger than 2 MB.

The 'qemu-img create' command supports specifying the cluster size
using the cluster_size option:

   qemu-img create -f qcow2 -o cluster_size=128K hd.qcow2 4G
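
The constraints above can be checked before creating an image. The
following is a minimal sketch; the function and constant names are
illustrative and are not part of QEMU:

```python
MIN_CLUSTER_SIZE = 512              # lower bound stated above
MAX_CLUSTER_SIZE = 2 * 1024 * 1024  # 2 MB upper bound stated above


def is_valid_cluster_size(size: int) -> bool:
    """Return True if size is a power of two within the supported range."""
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and MIN_CLUSTER_SIZE <= size <= MAX_CLUSTER_SIZE
```

For instance, 131072 (the 128K used in the command above) passes the
check, while 100000 does not because it is not a power of two.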


The L2 tables
-------------
The qcow2 format uses a two-level structure to map the virtual disk as
seen by the guest to the disk image in the host. These structures are
called the L1 and L2 tables.

There is one single L1 table per disk image. The table is small and is
always kept in memory.

There can be many L2 tables, depending on how much space has been
allocated in the image. Each table is one cluster in size. In order to
read or write data from the virtual disk, QEMU needs to read its
corresponding L2 table to find out where that data is located. Since
reading the table for each I/O operation can be expensive, QEMU keeps
an L2 cache in memory to speed up disk access.
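
To make the lookup concrete, the sketch below computes which L1 entry
and which L2 entry describe a given guest offset. It assumes 8-byte L2
entries, as in the qcow2 specification; the function name is
illustrative and this is not QEMU's own code:

```python
L2_ENTRY_SIZE = 8  # bytes per L2 entry (assumed, per the qcow2 spec)


def locate_l2_entry(guest_offset: int, cluster_size: int = 65536):
    """Return (l1_index, l2_index) for a guest offset in bytes."""
    entries_per_table = cluster_size // L2_ENTRY_SIZE
    cluster_index = guest_offset // cluster_size
    # The L1 entry selects the L2 table, the L2 entry selects the cluster.
    l1_index = cluster_index // entries_per_table
    l2_index = cluster_index % entries_per_table
    return l1_index, l2_index
```

With 64 KB clusters each L2 table holds 8192 entries and therefore
maps 512 MB of guest data, so every access that lands in a different
512 MB region needs a different L2 table.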

The size of the L2 cache can be configured, and setting the right
value can improve the I/O performance significantly.


The refcount blocks
-------------------
The qcow2 format also maintains a reference count for each cluster.
Reference counts are used for cluster allocation and internal
snapshots. The data is stored in a two-level structure similar to the
L1/L2 tables described above.

The second-level structures are called refcount blocks. Like the L2
tables, each one is one cluster in size, and their number varies with
the amount of allocated space.

Each block contains a number of refcount entries. Their size (in bits)
is a power of two and must not be higher than 64. It defaults to 16
bits, but a different value can be set using the refcount_bits option:

   qemu-img create -f qcow2 -o refcount_bits=8 hd.qcow2 4G

QEMU keeps a refcount cache to speed up I/O much like the
aforementioned L2 cache, and its size can also be configured.
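
As a rough illustration of how refcount_bits affects a refcount block,
the sketch below computes how many entries fit in one block and how
much of the image one block describes. The function names are
illustrative only:

```python
def refcount_entries_per_block(cluster_size: int = 65536,
                               refcount_bits: int = 16) -> int:
    """Number of refcount entries that fit in one refcount block."""
    is_power_of_two = refcount_bits > 0 and (refcount_bits & (refcount_bits - 1)) == 0
    if not is_power_of_two or refcount_bits > 64:
        raise ValueError("refcount_bits must be a power of two, at most 64")
    return cluster_size * 8 // refcount_bits


def bytes_covered_per_block(cluster_size: int = 65536,
                            refcount_bits: int = 16) -> int:
    """Bytes of image described by one block (one entry per cluster)."""
    return refcount_entries_per_block(cluster_size, refcount_bits) * cluster_size
```

With the defaults, one block holds 32768 entries and covers 2 GB.
Halving refcount_bits to 8 doubles both figures.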


Choosing the right cache sizes
------------------------------
In order to choose the cache sizes we need to know how they relate to
the amount of allocated space.

The amount of virtual disk that can be mapped by the L2 and refcount
caches (in bytes) is:

   disk_size = l2_cache_size * cluster_size / 8
   disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits

With the default values for cluster_size (64KB) and refcount_bits
(16), that is

   disk_size = l2_cache_size * 8192
   disk_size = refcount_cache_size * 32768
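
The two formulas translate directly into code. This is a small sketch
with illustrative function names; all sizes are in bytes:

```python
def l2_cache_coverage(l2_cache_size: int, cluster_size: int = 65536) -> int:
    """Bytes of virtual disk that an L2 cache of the given size can map."""
    return l2_cache_size * cluster_size // 8


def refcount_cache_coverage(refcount_cache_size: int,
                            cluster_size: int = 65536,
                            refcount_bits: int = 16) -> int:
    """Bytes of disk that a refcount cache of the given size can cover."""
    return refcount_cache_size * cluster_size * 8 // refcount_bits
```

Passing a cache size of 1 returns the per-byte multipliers 8192 and
32768 that appear in the default-value formulas.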

So in order to cover disk_size_GB gigabytes of disk space with the
default values we need:

   l2_cache_size = disk_size_GB * 131072
   refcount_cache_size = disk_size_GB * 32768

QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
cache of 256KB (262144 bytes), so using the formulas we've just seen
we have

   1048576 / 131072 = 8 GB of virtual disk covered by that cache
    262144 /  32768 = 8 GB
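
Going the other way, the sketch below computes the cache sizes needed
to cover a given amount of disk space. It also rounds each result up
to a multiple of the cluster size, which is an added step on top of
the formulas above. The helper names are illustrative:

```python
GIB = 1024 ** 3


def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def required_cache_sizes(disk_size: int,
                         cluster_size: int = 65536,
                         refcount_bits: int = 16):
    """Return (l2_cache_size, refcount_cache_size) in bytes."""
    l2 = ceil_div(disk_size * 8, cluster_size)
    refcount = ceil_div(disk_size * refcount_bits, cluster_size * 8)
    # Round both up to a whole number of clusters.
    l2 = ceil_div(l2, cluster_size) * cluster_size
    refcount = ceil_div(refcount, cluster_size) * cluster_size
    return l2, refcount
```

For 8 GB this returns (1048576, 262144), matching the defaults. For
1 GB the refcount cache comes out at 65536 rather than 32768 because
of the rounding to a full cluster.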


How to configure the cache sizes
--------------------------------
Cache sizes can be configured using the -drive option on the
command line, or the 'blockdev-add' QMP command.

There are three options available, and all of them take a size in
bytes:

"l2-cache-size":         maximum size of the L2 table cache
"refcount-cache-size":   maximum size of the refcount block cache
"cache-size":            maximum size of both caches combined

There are two things that need to be taken into account:

 - Both caches must have a size that is a multiple of the cluster
   size.

 - If you only set one of the options above, QEMU will automatically
   adjust the others so that the L2 cache is 4 times the size of the
   refcount cache.

This means that these options are equivalent:

   -drive file=hd.qcow2,l2-cache-size=2097152
   -drive file=hd.qcow2,refcount-cache-size=524288
   -drive file=hd.qcow2,cache-size=2621440
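
The equivalence can be verified with a short model of the 4:1 rule.
This is only a sketch of the rule as described here, not QEMU's actual
code, and it leaves out any rounding to the cluster size:

```python
L2_TO_REFCOUNT_RATIO = 4  # ratio described in this section


def derive_cache_sizes(l2_cache_size=None, refcount_cache_size=None,
                       cache_size=None):
    """Given exactly one option, return (l2, refcount, combined) in bytes."""
    given = [v for v in (l2_cache_size, refcount_cache_size, cache_size)
             if v is not None]
    if len(given) != 1:
        raise ValueError("set exactly one of the three options")
    if l2_cache_size is not None:
        refcount_cache_size = l2_cache_size // L2_TO_REFCOUNT_RATIO
    elif refcount_cache_size is not None:
        l2_cache_size = refcount_cache_size * L2_TO_REFCOUNT_RATIO
    else:
        # Split the combined size into 4 parts L2 and 1 part refcount.
        refcount_cache_size = cache_size // (L2_TO_REFCOUNT_RATIO + 1)
        l2_cache_size = cache_size - refcount_cache_size
    return l2_cache_size, refcount_cache_size, l2_cache_size + refcount_cache_size
```

Each of the three settings listed above yields the same triple,
(2097152, 524288, 2621440).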

The reason for this 1/4 ratio is to ensure that both caches cover the
same amount of disk space. Note however that this is only valid with
the default value of refcount_bits (16). If you are using a different
value you might want to calculate both cache sizes yourself since QEMU
will always use the same 1/4 ratio.
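
Equating the two coverage formulas from the previous section gives
l2_cache_size / refcount_cache_size = 64 / refcount_bits, which is 4
only when refcount_bits is 16. The sketch below uses that relation to
size the refcount cache for other values. The function names are
illustrative:

```python
from fractions import Fraction


def balanced_ratio(refcount_bits: int) -> Fraction:
    """L2-to-refcount size ratio at which both caches cover the same space."""
    # l2 * cluster / 8 == rc * cluster * 8 / bits  =>  l2 / rc == 64 / bits
    return Fraction(64, refcount_bits)


def refcount_cache_for_l2(l2_cache_size: int, refcount_bits: int) -> int:
    """Refcount cache size (bytes) covering the same space as the L2 cache."""
    return l2_cache_size * refcount_bits // 64
```

For example, with refcount_bits=8 a 2097152-byte L2 cache is matched
by a 262144-byte refcount cache, half of what the fixed ratio would
assign.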

It's also worth mentioning that there's no strict need for both caches
to cover the same amount of disk space. The refcount cache is used
much less often than the L2 cache, so it's perfectly reasonable to
keep it small.
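
For the 'blockdev-add' route mentioned at the start of this section,
the sketch below builds the QMP command as a Python dictionary. The
argument layout is an assumption based on the form used by recent
QEMU releases (older releases used a different layout), and the node
name is a placeholder:

```python
import json


def blockdev_add_command(filename: str, node_name: str,
                         l2_cache_size: int) -> dict:
    """Build a blockdev-add command for a qcow2 image (assumed layout)."""
    return {
        "execute": "blockdev-add",
        "arguments": {
            "driver": "qcow2",
            "node-name": node_name,          # placeholder name
            "l2-cache-size": l2_cache_size,  # bytes
            "file": {"driver": "file", "filename": filename},
        },
    }
```

Calling json.dumps() on the result of
blockdev_add_command("hd.qcow2", "hd0", 2097152) produces the text to
send over the QMP socket.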


Reducing the memory usage
-------------------------
It is possible to clean unused cache entries in order to reduce the
memory usage during periods of low I/O activity.

The parameter "cache-clean-interval" defines an interval (in seconds).
All cache entries that haven't been accessed during that interval are
removed from memory.

This example removes all unused cache entries every 15 minutes:

   -drive file=hd.qcow2,cache-clean-interval=900

If unset, this parameter defaults to 0, which disables the feature.

Note that this functionality currently relies on the MADV_DONTNEED
argument for madvise() to actually free the memory, so it is not
useful on systems where MADV_DONTNEED does not behave that way.
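
The cleaning rule described in this section can be pictured with a toy
model: every entry remembers whether it was touched since the last
cleaning pass, and a pass that runs once per interval drops the ones
that were not. This is only a sketch of the idea, with a made-up class
name, and not how QEMU implements it:

```python
class CleanableCache:
    """Toy cache that drops entries not accessed since the last clean."""

    def __init__(self):
        self._entries = {}  # key -> [value, accessed_since_last_clean]

    def put(self, key, value):
        self._entries[key] = [value, True]

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[1] = True  # mark as recently used
        return entry[0]

    def clean(self) -> int:
        """Run once per interval; return the number of removed entries."""
        idle = [k for k, e in self._entries.items() if not e[1]]
        for key in idle:
            del self._entries[key]
        for entry in self._entries.values():
            entry[1] = False  # start a new interval
        return len(idle)
```

An entry survives a pass only if it was read or written since the
previous one, so a mostly idle guest gradually gives its cache memory
back.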