The refcount blocks
-------------------
-The qcow2 format also mantains a reference count for each cluster.
+The qcow2 format also maintains a reference count for each cluster.
Reference counts are used for cluster allocation and internal
snapshots. The data is stored in a two-level structure similar to the
L1/L2 tables described above.
In order to choose the cache sizes we need to know how they relate to
the amount of allocated space.
-The amount of virtual disk that can be mapped by the L2 and refcount
+The part of the virtual disk that can be mapped by the L2 and refcount
caches (in bytes) is:
disk_size = l2_cache_size * cluster_size / 8
disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
With the default values for cluster_size (64KB) and refcount_bits
-(16), that is
+(16), this becomes:
disk_size = l2_cache_size * 8192
disk_size = refcount_cache_size * 32768
l2_cache_size = disk_size_GB * 131072
refcount_cache_size = disk_size_GB * 32768
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
+image size (given that the default cluster size is used):
- 1048576 / 131072 = 8 GB of virtual disk covered by that cache
- 262144 / 32768 = 8 GB
+ 8 GB / 8192 = 1 MB
+
+The refcount cache is 4 times the cluster size by default. With the default
+cluster size of 64 KB, it is 256 KB (262144 bytes). This is sufficient for
+8 GB of image size:
+
+ 262144 * 32768 = 8 GB
How to configure the cache sizes
- Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
- - The default L2 cache size is 8 clusters or 1MB (whichever is more),
- and the minimum is 2 clusters (or 2 cache entries, see below).
+ - The maximum L2 cache size is 32 MB by default on Linux platforms (enough
+ for full coverage of 256 GB images, with the default cluster size). This
+ value can be modified using the "l2-cache-size" option. QEMU will not use
+ more memory than needed to hold all of the image's L2 tables, regardless
+ of this max. value.
+ On non-Linux platforms the maximal value is smaller by default (8 MB) and
+ this difference stems from the fact that on Linux the cache can be cleared
+ periodically if needed, using the "cache-clean-interval" option (see below).
+ The minimal L2 cache size is 2 clusters (or 2 cache entries, see below).
- The default (and minimum) refcount cache size is 4 clusters.
memory as possible to the L2 cache before increasing the refcount
cache size.
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+ can be set simultaneously.
+
Unlike L2 tables, refcount blocks are not used during normal I/O but
only during allocations and internal snapshots. In most cases they are
accessed sequentially (even during random guest I/O) so increasing the
Using smaller cache entries
---------------------------
-The qcow2 L2 cache stores complete tables by default. This means that
-if QEMU needs an entry from an L2 table then the whole table is read
-from disk and is kept in the cache. If the cache is full then a
-complete table needs to be evicted first.
+The qcow2 L2 cache can store complete tables. This means that if QEMU
+needs an entry from an L2 table then the whole table is read from disk
+and is kept in the cache. If the cache is full then a complete table
+needs to be evicted first.
This can be inefficient with large cluster sizes since it results in
more disk I/O and wastes more cache memory.
-drive file=hd.qcow2,l2-cache-size=2097152,l2-cache-entry-size=4096
+Since QEMU 4.0 the value of l2-cache-entry-size defaults to 4KB (or
+the cluster size if it's smaller).
+
Some things to take into account:
- The L2 cache entry size has the same restrictions as the cluster
- Try different entry sizes to see which one gives faster performance
in your case. The block size of the host filesystem is generally a
- good default (usually 4096 bytes in the case of ext4).
+ good default (usually 4096 bytes in the case of ext4, hence the
+ default).
- Only the L2 cache can be configured this way. The refcount cache
always uses the cluster size as the entry size.
- If the L2 cache is big enough to hold all of the image's L2 tables
- (as explained in the "Choosing the right cache sizes" section
- earlier in this document) then none of this is necessary and you
- can omit the "l2-cache-entry-size" parameter altogether.
+ (as explained in the "Choosing the right cache sizes" and "How to
+ configure the cache sizes" sections in this document) then none of
+ this is necessary and you can omit the "l2-cache-entry-size"
+ parameter altogether. In this case QEMU makes the entry size
+ equal to the cluster size by default.
Reducing the memory usage
It is possible to clean unused cache entries in order to reduce the
memory usage during periods of low I/O activity.
-The parameter "cache-clean-interval" defines an interval (in seconds).
-All cache entries that haven't been accessed during that interval are
-removed from memory.
+The parameter "cache-clean-interval" defines an interval (in seconds),
+after which all the cache entries that haven't been accessed during the
+interval are removed from memory. Setting this parameter to 0 disables this
+feature.
-This example removes all unused cache entries every 15 minutes:
+The following example removes all unused cache entries every 15 minutes:
-drive file=hd.qcow2,cache-clean-interval=900
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600 on platforms which
+support this functionality, and is 0 (disabled) on other platforms.
-Note that this functionality currently relies on the MADV_DONTNEED
-argument for madvise() to actually free the memory. This is a
-Linux-specific feature, so cache-clean-interval is not supported in
-other systems.
+This functionality currently relies on the MADV_DONTNEED argument for
+madvise() to actually free the memory. This is a Linux-specific feature,
+so cache-clean-interval is not supported on other systems.