Commit | Line | Data |
---|---|---|
7f65ce83 AG |
1 | qcow2 L2/refcount cache configuration |
2 | ===================================== | |
3 | Copyright (C) 2015 Igalia, S.L. | |
4 | Author: Alberto Garcia <[email protected]> | |
5 | ||
6 | This work is licensed under the terms of the GNU GPL, version 2 or | |
7 | later. See the COPYING file in the top-level directory. | |
8 | ||
9 | Introduction | |
10 | ------------ | |
11 | The QEMU qcow2 driver has two caches that can improve the I/O | |
12 | performance significantly. However, setting the right cache sizes is | |
13 | not a straightforward operation. | |
14 | ||
15 | This document attempts to give an overview of the L2 and refcount | |
16 | caches, and how to configure them. | |
17 | ||
18 | Please refer to the docs/specs/qcow2.txt file for an in-depth | |
19 | technical description of the qcow2 file format. | |
20 | ||
21 | ||
22 | Clusters | |
23 | -------- | |
24 | A qcow2 file is organized in units of constant size called clusters. | |
25 | ||
26 | The cluster size is configurable, but it must be a power of two and | |
27 | at least 512 bytes. QEMU currently defaults to 64 KB clusters, and | |
28 | it does not support sizes larger than 2 MB. | |
29 | ||
30 | The 'qemu-img create' command supports specifying the cluster size | |
31 | using the cluster_size option: | |
32 | ||
33 | qemu-img create -f qcow2 -o cluster_size=128K hd.qcow2 4G | |
34 | ||
35 | ||
36 | The L2 tables | |
37 | ------------- | |
38 | The qcow2 format uses a two-level structure to map the virtual disk as | |
39 | seen by the guest to the disk image in the host. These structures are | |
40 | called the L1 and L2 tables. | |
41 | ||
42 | There is one single L1 table per disk image. The table is small and is | |
43 | always kept in memory. | |
44 | ||
45 | There can be many L2 tables, depending on how much space has been | |
46 | allocated in the image. Each table is one cluster in size. In order to | |
47 | read or write data from the virtual disk, QEMU needs to read its | |
48 | corresponding L2 table to find out where that data is located. Since | |
49 | reading the table for each I/O operation can be expensive, QEMU keeps | |
50 | an L2 cache in memory to speed up disk access. | |
51 | ||
52 | The size of the L2 cache can be configured, and setting the right | |
53 | value can improve the I/O performance significantly. | |
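
To make the two-level lookup more concrete, here is a minimal sketch of how a guest offset could be split into an L1 index, an L2 index and an offset inside the cluster. It assumes 8-byte L2 entries, as described in the qcow2 specification; the function name locate_l2_entry is hypothetical and this is not QEMU's actual code.

```python
L2_ENTRY_SIZE = 8  # bytes per L2 entry (assumption taken from the qcow2 spec)


def locate_l2_entry(guest_offset, cluster_size=64 * 1024):
    """Return (l1_index, l2_index, offset_in_cluster) for a guest offset."""
    # Each L2 table is one cluster in size, so it holds this many entries.
    entries_per_l2_table = cluster_size // L2_ENTRY_SIZE
    cluster_number = guest_offset // cluster_size
    l1_index = cluster_number // entries_per_l2_table
    l2_index = cluster_number % entries_per_l2_table
    return l1_index, l2_index, guest_offset % cluster_size


# With 64 KB clusters one L2 table maps 8192 clusters (512 MB of guest data),
# so the first byte after 512 MB lands in the second L2 table.
print(locate_l2_entry(512 * 1024 * 1024))  # (1, 0, 0)
```

Every access whose L2 table is not already in the cache costs an extra read from the image file, which is why the cache size matters.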
54 | ||
55 | ||
56 | The refcount blocks | |
57 | ------------------- | |
58 | The qcow2 format also maintains a reference count for each cluster. | |
59 | Reference counts are used for cluster allocation and internal | |
60 | snapshots. The data is stored in a two-level structure similar to the | |
61 | L1/L2 tables described above. | |
62 | ||
63 | The second-level structures are called refcount blocks. Like L2 | |
64 | tables, they are one cluster in size, and their number varies with | |
65 | the amount of allocated space. | |
66 | ||
67 | Each block contains a number of refcount entries. Their size (in bits) | |
68 | is a power of two and must not be higher than 64. It defaults to 16 | |
69 | bits, but a different value can be set using the refcount_bits option: | |
70 | ||
71 | qemu-img create -f qcow2 -o refcount_bits=8 hd.qcow2 4G | |
72 | ||
73 | QEMU keeps a refcount cache to speed up I/O much like the | |
74 | aforementioned L2 cache, and its size can also be configured. | |
75 | ||
76 | ||
77 | Choosing the right cache sizes | |
78 | ------------------------------ | |
79 | In order to choose the cache sizes we need to know how they relate to | |
80 | the amount of allocated space. | |
81 | ||
82 | The amount of virtual disk that can be mapped by the L2 and refcount | |
83 | caches (in bytes) is: | |
84 | ||
85 | disk_size = l2_cache_size * cluster_size / 8 | |
86 | disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits | |
87 | ||
88 | With the default values for cluster_size (64KB) and refcount_bits | |
89 | (16), that is | |
90 | ||
91 | disk_size = l2_cache_size * 8192 | |
92 | disk_size = refcount_cache_size * 32768 | |
93 | ||
94 | So in order to cover disk_size_GB gigabytes of disk space with the | |
95 | default values we need (in bytes): | |
96 | ||
97 | l2_cache_size = disk_size_GB * 131072 | |
98 | refcount_cache_size = disk_size_GB * 32768 | |
99 | ||
100 | QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount | |
101 | cache of 256KB (262144 bytes), so using the formulas we've just seen | |
102 | we have | |
103 | ||
104 | 1048576 / 131072 = 8 GB of virtual disk covered by that cache | |
105 | 262144 / 32768 = 8 GB | |
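
The formulas above can be turned around to compute the cache sizes needed for a given disk size. The following is a small sketch under the same assumptions as the text; the function names are hypothetical, and results are rounded up when the division is not exact.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def l2_cache_size(disk_size, cluster_size=64 * 1024):
    """Bytes of L2 cache needed to map disk_size bytes of virtual disk."""
    # Inverse of: disk_size = l2_cache_size * cluster_size / 8
    return ceil_div(disk_size * 8, cluster_size)


def refcount_cache_size(disk_size, cluster_size=64 * 1024, refcount_bits=16):
    """Bytes of refcount cache needed to cover disk_size bytes."""
    # Inverse of: disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
    return ceil_div(disk_size * refcount_bits, cluster_size * 8)


GB = 1024 ** 3
print(l2_cache_size(8 * GB))        # 1048576
print(refcount_cache_size(8 * GB))  # 262144
```

These values do not yet take into account that each cache size must be a multiple of the cluster size.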
106 | ||
107 | ||
108 | How to configure the cache sizes | |
109 | -------------------------------- | |
110 | Cache sizes can be configured using the -drive option in the | |
111 | command-line, or the 'blockdev-add' QMP command. | |
112 | ||
113 | There are three options available, and all of them take a size in bytes: | |
114 | ||
115 | "l2-cache-size": maximum size of the L2 table cache | |
116 | "refcount-cache-size": maximum size of the refcount block cache | |
117 | "cache-size": maximum size of both caches combined | |
118 | ||
119 | There are two things that need to be taken into account: | |
120 | ||
121 | - Both caches must have a size that is a multiple of the cluster | |
122 | size. | |
123 | ||
124 | - If you only set one of the options above, QEMU will automatically | |
125 | adjust the others so that the L2 cache is 4 times bigger than the | |
126 | refcount cache. | |
127 | ||
128 | This means that these options are equivalent: | |
129 | ||
130 | -drive file=hd.qcow2,l2-cache-size=2097152 | |
131 | -drive file=hd.qcow2,refcount-cache-size=524288 | |
132 | -drive file=hd.qcow2,cache-size=2621440 | |
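
The arithmetic behind that equivalence can be sketched as follows. This only illustrates the 4:1 split described above and is not QEMU's actual implementation; the function name split_caches is hypothetical, and the alignment to the cluster size is ignored here.

```python
def split_caches(l2_cache_size=None, refcount_cache_size=None, cache_size=None):
    """Given exactly one of the three options, derive (l2, refcount) sizes."""
    given = [v for v in (l2_cache_size, refcount_cache_size, cache_size)
             if v is not None]
    if len(given) != 1:
        raise ValueError("set exactly one option")
    if l2_cache_size is not None:
        refcount_cache_size = l2_cache_size // 4
    elif refcount_cache_size is not None:
        l2_cache_size = refcount_cache_size * 4
    else:
        # The combined size is split 4:1, so the refcount cache gets 1/5.
        refcount_cache_size = cache_size // 5
        l2_cache_size = cache_size - refcount_cache_size
    return l2_cache_size, refcount_cache_size


print(split_caches(l2_cache_size=2097152))       # (2097152, 524288)
print(split_caches(refcount_cache_size=524288))  # (2097152, 524288)
print(split_caches(cache_size=2621440))          # (2097152, 524288)
```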
133 | ||
134 | The reason for this 1/4 ratio is to ensure that both caches cover the | |
135 | same amount of disk space. Note however that this is only valid with | |
136 | the default value of refcount_bits (16). If you are using a different | |
137 | value you might want to calculate both cache sizes yourself since QEMU | |
138 | will always use the same 1/4 ratio. | |
139 | ||
140 | It's also worth mentioning that there's no strict need for both caches | |
141 | to cover the same amount of disk space. The refcount cache is used | |
142 | much less often than the L2 cache, so it's perfectly reasonable to | |
143 | keep it small. | |
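
When refcount_bits is not 16, both sizes can be computed explicitly and passed together. The sketch below builds such a -drive argument from the formulas given earlier, rounding each size up to a multiple of the cluster size. The helper names are hypothetical, and the file name hd.qcow2 is just the one used in the examples above.

```python
def round_up(value, multiple):
    """Round value up to the next multiple of 'multiple'."""
    return -(-value // multiple) * multiple


def drive_cache_options(disk_size, cluster_size=64 * 1024, refcount_bits=16,
                        filename="hd.qcow2"):
    """Build a -drive argument whose caches cover disk_size bytes."""
    l2 = -(-disk_size * 8 // cluster_size)
    refcount = -(-disk_size * refcount_bits // (cluster_size * 8))
    # Both caches must be a multiple of the cluster size.
    l2 = round_up(l2, cluster_size)
    refcount = round_up(refcount, cluster_size)
    return "file=%s,l2-cache-size=%d,refcount-cache-size=%d" % (
        filename, l2, refcount)


GB = 1024 ** 3
# 16 GB disk, 64 KB clusters, 8-bit refcounts: the ratio is 8:1, not 4:1.
print(drive_cache_options(16 * GB, refcount_bits=8))
# file=hd.qcow2,l2-cache-size=2097152,refcount-cache-size=262144
```

Setting both options explicitly avoids relying on the automatic 1/4 ratio.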
144 | ||
145 | ||
146 | Reducing the memory usage | |
147 | ------------------------- | |
148 | It is possible to clean unused cache entries in order to reduce the | |
149 | memory usage during periods of low I/O activity. | |
150 | ||
151 | The parameter "cache-clean-interval" defines an interval (in seconds). | |
152 | All cache entries that haven't been accessed during that interval are | |
153 | removed from memory. | |
154 | ||
155 | This example removes all unused cache entries every 15 minutes: | |
156 | ||
157 | -drive file=hd.qcow2,cache-clean-interval=900 | |
158 | ||
159 | If unset, this parameter defaults to 0, which disables the | |
160 | feature. | |
161 | ||
162 | Note that this functionality currently relies on the MADV_DONTNEED | |
163 | argument for madvise() to actually free the memory, so it is not | |
164 | useful on systems where madvise() does not behave that way. |