qcow2 L2/refcount cache configuration
=====================================
Copyright (C) 2015, 2018 Igalia, S.L.
Author: Alberto Garcia <[email protected]>

This work is licensed under the terms of the GNU GPL, version 2 or
later. See the COPYING file in the top-level directory.

Introduction
------------
The QEMU qcow2 driver has two caches that can improve the I/O
performance significantly. However, setting the right cache sizes is
not a straightforward operation.

This document attempts to give an overview of the L2 and refcount
caches, and how to configure them.

Please refer to the docs/interop/qcow2.txt file for an in-depth
technical description of the qcow2 file format.


Clusters
--------
A qcow2 file is organized in units of constant size called clusters.

The cluster size is configurable, but it must be a power of two and
at least 512 bytes. QEMU currently defaults to 64 KB clusters, and it
does not support sizes larger than 2 MB.

The 'qemu-img create' command lets you set the cluster size with the
cluster_size option:

   qemu-img create -f qcow2 -o cluster_size=128K hd.qcow2 4G


The L2 tables
-------------
The qcow2 format uses a two-level structure to map the virtual disk as
seen by the guest to the disk image in the host. These structures are
called the L1 and L2 tables.

There is one single L1 table per disk image. The table is small and is
always kept in memory.

There can be many L2 tables, depending on how much space has been
allocated in the image. Each table is one cluster in size. In order to
read or write data from the virtual disk, QEMU needs to read its
corresponding L2 table to find out where that data is located. Since
reading the table for each I/O operation can be expensive, QEMU keeps
an L2 cache in memory to speed up disk access.

The size of the L2 cache can be configured, and setting the right
value can improve the I/O performance significantly.
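
As a rough illustration of the two-level lookup described above, the
following Python sketch splits a guest offset into an L1 index, an L2
index and an offset inside the cluster. It assumes 8-byte L2 entries
(the same assumption behind the "/ 8" in the cache size formulas in
this document). The name split_guest_offset is made up for this example
and is not a QEMU function.

```python
def split_guest_offset(offset, cluster_size=65536, l2_entry_size=8):
    """Split a guest offset into (l1_index, l2_index, offset_in_cluster).

    Hypothetical helper: assumes each L2 table is one cluster in size
    and therefore holds cluster_size // l2_entry_size entries.
    """
    l2_entries = cluster_size // l2_entry_size  # entries per L2 table
    cluster_index = offset // cluster_size      # which guest cluster
    l1_index = cluster_index // l2_entries      # which L2 table to load
    l2_index = cluster_index % l2_entries       # entry inside that table
    return l1_index, l2_index, offset % cluster_size


# With 64 KB clusters one L2 table maps 8192 * 64 KB = 512 MB, so an
# offset of 1 GB + 4096 bytes falls in the third L2 table (L1 index 2).
print(split_guest_offset(2**30 + 4096))
```

Every guest access whose L1 index differs from the tables currently
cached would need another L2 table read from disk, which is the cost
the L2 cache is meant to avoid.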


The refcount blocks
-------------------
The qcow2 format also maintains a reference count for each cluster.
Reference counts are used for cluster allocation and internal
snapshots. The data is stored in a two-level structure similar to the
L1/L2 tables described above.

The second-level structures are called refcount blocks. Like L2
tables, each one is one cluster in size, and their number varies with
the amount of allocated space.

Each block contains a number of refcount entries. Their size (in bits)
is a power of two and must not be higher than 64. It defaults to 16
bits, but a different value can be set using the refcount_bits option:

   qemu-img create -f qcow2 -o refcount_bits=8 hd.qcow2 4G

QEMU keeps a refcount cache to speed up I/O much like the
aforementioned L2 cache, and its size can also be configured.
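
To make the relation between cluster size and refcount entry size
concrete, here is a small Python sketch that computes how many entries
fit in one refcount block and how much image data those entries
describe. It assumes one refcount entry per cluster. The function name
refcount_block_coverage is invented for this example.

```python
def refcount_block_coverage(cluster_size=65536, refcount_bits=16):
    """Return (entries_per_block, bytes_covered) for one refcount block.

    Hypothetical helper: a block is one cluster in size and every
    entry is assumed to track exactly one cluster.
    """
    if refcount_bits not in (1, 2, 4, 8, 16, 32, 64):
        raise ValueError("refcount_bits must be a power of two, at most 64")
    entries = cluster_size * 8 // refcount_bits  # bits in block / bits per entry
    return entries, entries * cluster_size


print(refcount_block_coverage())                 # default 16-bit entries
print(refcount_block_coverage(refcount_bits=8))  # smaller entries, more per block
```

With the defaults a single block holds 32768 entries and covers 2 GB.
Halving refcount_bits doubles both numbers.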


Choosing the right cache sizes
------------------------------
In order to choose the cache sizes we need to know how they relate to
the amount of allocated space.

The part of the virtual disk that can be mapped by the L2 and refcount
caches (in bytes) is:

   disk_size = l2_cache_size * cluster_size / 8
   disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits

The 8 in the first formula is the size in bytes of an L2 entry, each
of which maps one cluster.

With the default values for cluster_size (64 KB) and refcount_bits
(16), this becomes:

   disk_size = l2_cache_size * 8192
   disk_size = refcount_cache_size * 32768

So in order to cover disk_size_GB gigabytes of disk space with the
default values we need (in bytes):

   l2_cache_size = disk_size_GB * 131072
   refcount_cache_size = disk_size_GB * 32768

For example, 1 MB of L2 cache is needed to cover every 8 GB of the
virtual image size (given that the default cluster size is used):

   8 GB / 8192 = 1 MB

The refcount cache is 4 times the cluster size by default. With the
default cluster size of 64 KB, it is 256 KB (262144 bytes). This is
sufficient for 8 GB of image size:

   262144 * 32768 = 8 GB
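
The formulas above can be turned into a short Python sketch that
returns the cache sizes needed to cover a given disk size. The function
names are invented for this example. Results are rounded up to a
multiple of the cluster size, which is an assumption made here so that
the values are directly usable as cache sizes.

```python
GB = 1024**3


def round_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def l2_cache_bytes(disk_size, cluster_size=65536):
    """L2 cache size (bytes) covering disk_size bytes: disk_size * 8 / cluster_size."""
    needed = -(-disk_size * 8 // cluster_size)  # ceiling division
    return round_up(needed, cluster_size)


def refcount_cache_bytes(disk_size, cluster_size=65536, refcount_bits=16):
    """Refcount cache size (bytes) covering disk_size bytes."""
    needed = -(-disk_size * refcount_bits // (8 * cluster_size))
    return round_up(needed, cluster_size)


for size_gb in (8, 100):
    print(size_gb, "GB:",
          l2_cache_bytes(size_gb * GB), "bytes of L2 cache,",
          refcount_cache_bytes(size_gb * GB), "bytes of refcount cache")
```

For 8 GB this gives 1048576 and 262144 bytes, matching the two worked
examples above.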


How to configure the cache sizes
--------------------------------
Cache sizes can be configured using the -drive option on the
command line, or the 'blockdev-add' QMP command.

There are three options available, and all of them take bytes:

   "l2-cache-size":         maximum size of the L2 table cache
   "refcount-cache-size":   maximum size of the refcount block cache
   "cache-size":            maximum size of both caches combined

There are a few things that need to be taken into account:

- Both caches must have a size that is a multiple of the cluster size
  (or the cache entry size: see "Using smaller cache entries" below).

- The maximum L2 cache size is 1 MB by default (enough for full coverage
  of 8 GB images, with the default cluster size). This value can be
  modified using the "l2-cache-size" option. QEMU will not use more memory
  than needed to hold all of the image's L2 tables, regardless of this
  maximum value. The minimum L2 cache size is 2 clusters (or 2 cache
  entries, see below).

- The default (and minimum) refcount cache size is 4 clusters.

- If only "cache-size" is specified then QEMU will assign as much
  memory as possible to the L2 cache before increasing the refcount
  cache size.

- At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
  can be set simultaneously.

Unlike L2 tables, refcount blocks are not used during normal I/O but
only during allocations and internal snapshots. In most cases they are
accessed sequentially (even during random guest I/O) so increasing the
refcount cache size won't have any measurable effect on performance
(this can change if you are using internal snapshots, so you may want
to think about increasing the cache size if you use them heavily).

Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
L2 cache size. This resulted in unnecessarily large caches, so now the
refcount cache is as small as possible unless overridden by the user.
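
The rules listed in this section can be checked before launching QEMU.
The Python sketch below builds the cache part of a -drive option string
and rejects combinations that break those rules. The helper
qcow2_cache_options is hypothetical, and it only encodes the rules as
stated in this document. QEMU itself may react differently to invalid
values (for example by adjusting them instead of failing).

```python
def qcow2_cache_options(cluster_size=65536, **sizes):
    """Return e.g. "l2-cache-size=...,refcount-cache-size=...".

    Hypothetical helper. Keyword names use underscores in place of the
    dashes in the real option names (l2_cache_size -> "l2-cache-size").
    """
    allowed = ("l2_cache_size", "refcount_cache_size", "cache_size")
    unknown = set(sizes) - set(allowed)
    if unknown:
        raise ValueError("unknown option(s): %s" % ", ".join(sorted(unknown)))
    if len(sizes) > 2:
        raise ValueError("at most two of the three options can be set")
    # Each individual cache must be a multiple of the cluster size.
    for name in ("l2_cache_size", "refcount_cache_size"):
        if name in sizes and sizes[name] % cluster_size:
            raise ValueError("%s must be a multiple of the cluster size" % name)
    # Minimum sizes: 2 clusters for L2, 4 clusters for refcounts.
    if sizes.get("l2_cache_size", 2 * cluster_size) < 2 * cluster_size:
        raise ValueError("L2 cache needs at least 2 clusters")
    if sizes.get("refcount_cache_size", 4 * cluster_size) < 4 * cluster_size:
        raise ValueError("refcount cache needs at least 4 clusters")
    return ",".join("%s=%d" % (name.replace("_", "-"), sizes[name])
                    for name in allowed if name in sizes)


print("-drive file=hd.qcow2," +
      qcow2_cache_options(l2_cache_size=2097152, refcount_cache_size=524288))
```

The printed line sets a 2 MB L2 cache and a 512 KB refcount cache for
an image with the default 64 KB cluster size.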


Using smaller cache entries
---------------------------
The qcow2 L2 cache stores complete tables by default. This means that
if QEMU needs an entry from an L2 table then the whole table is read
from disk and is kept in the cache. If the cache is full then a
complete table needs to be evicted first.

This can be inefficient with large cluster sizes since it results in
more disk I/O and wastes more cache memory.

Since QEMU 2.12 you can change the size of the L2 cache entry and make
it smaller than the cluster size. This can be configured using the
"l2-cache-entry-size" parameter:

   -drive file=hd.qcow2,l2-cache-size=2097152,l2-cache-entry-size=4096

Some things to take into account:

- The L2 cache entry size has the same restrictions as the cluster
  size (power of two, at least 512 bytes).

- Smaller entry sizes generally improve the cache efficiency and make
  disk I/O faster. This is particularly true with solid state drives
  so it's a good idea to reduce the entry size in those cases. With
  rotating hard drives the situation is a bit more complicated so you
  should test it first and stay with the default size if unsure.

- Try different entry sizes to see which one gives faster performance
  in your case. The block size of the host filesystem is generally a
  good default (usually 4096 bytes in the case of ext4).

- Only the L2 cache can be configured this way. The refcount cache
  always uses the cluster size as the entry size.

- If the L2 cache is big enough to hold all of the image's L2 tables
  (as explained in the "Choosing the right cache sizes" and "How to
  configure the cache sizes" sections in this document) then none of
  this is necessary and you can omit the "l2-cache-entry-size"
  parameter altogether.
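
To see what a smaller entry size changes, the Python sketch below
computes how much guest data one cache entry maps and how many entries
fit in the cache. It assumes 8-byte L2 entries and an entry size no
larger than the cluster size. The name l2_entry_coverage is made up for
this example.

```python
def l2_entry_coverage(entry_size=4096, cluster_size=65536,
                      l2_cache_size=2097152):
    """Return (bytes_mapped_per_entry, entries_in_cache).

    Hypothetical helper: assumes each L2 entry is 8 bytes and maps one
    cluster of guest data.
    """
    if entry_size < 512 or entry_size & (entry_size - 1):
        raise ValueError("entry size must be a power of two, at least 512")
    if entry_size > cluster_size:
        raise ValueError("entry size must not exceed the cluster size")
    mapped = entry_size // 8 * cluster_size
    return mapped, l2_cache_size // entry_size


print(l2_entry_coverage(entry_size=65536))  # whole L2 tables (the default)
print(l2_entry_coverage(entry_size=4096))   # 4 KB slices of a table
```

With a 2 MB cache and 64 KB clusters, whole tables give 32 entries of
512 MB each, while 4 KB entries give 512 entries of 32 MB each. The
total coverage is the same, but each miss reads and evicts far less
data.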


Reducing the memory usage
-------------------------
It is possible to clean unused cache entries in order to reduce the
memory usage during periods of low I/O activity.

The parameter "cache-clean-interval" defines an interval (in seconds).
All cache entries that haven't been accessed during that interval are
removed from memory.

This example removes all unused cache entries every 15 minutes:

   -drive file=hd.qcow2,cache-clean-interval=900

If unset, this parameter defaults to 0, which disables the feature.

Note that this functionality currently relies on the MADV_DONTNEED
argument for madvise() to actually free the memory. This is a
Linux-specific feature, so cache-clean-interval is not supported on
other systems.
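
The same setting can also be passed through the 'blockdev-add' QMP
command mentioned earlier in this document. The Python sketch below
only builds the JSON message and does not talk to QEMU. The helper
name blockdev_add_qcow2 and the node name "hd0" are placeholders chosen
for this example.

```python
import json


def blockdev_add_qcow2(filename, node_name, **qcow2_options):
    """Build a 'blockdev-add' QMP message for a qcow2 image on a local file.

    Hypothetical helper: qcow2_options are passed through unchanged, so
    the caller is responsible for using valid option names.
    """
    return {
        "execute": "blockdev-add",
        "arguments": {
            "driver": "qcow2",
            "node-name": node_name,
            "file": {"driver": "file", "filename": filename},
            **qcow2_options,
        },
    }


# Option names contain dashes, so they are passed as a dictionary.
cmd = blockdev_add_qcow2("hd.qcow2", "hd0",
                         **{"l2-cache-size": 2097152,
                            "cache-clean-interval": 900})
print(json.dumps(cmd, indent=2))
```

The resulting message asks for a 2 MB L2 cache whose unused entries
are dropped every 15 minutes.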