* remotes/kraxel/tags/ui-20180518-pull-request:
sdl: Move use of surface pointer below check for whether it is NULL
ui: add x_keymap.o to modules
console: Avoid segfault in screendump
Peter Maydell [Fri, 18 May 2018 09:16:24 +0000 (10:16 +0100)]
Merge remote-tracking branch 'remotes/rth/tags/pull-fpu-20180517' into staging
Roundup of softfloat patches
# gpg: Signature made Thu 17 May 2018 23:44:04 BST
# gpg: using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <[email protected]>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-fpu-20180517: (28 commits)
fpu/softfloat: Define floatN_silence_nan in terms of parts_silence_nan
fpu/softfloat: Clean up parts_default_nan
fpu/softfloat: Define floatN_default_nan in terms of parts_default_nan
fpu/softfloat: Pass FloatClass to pickNaNMulAdd
fpu/softfloat: Pass FloatClass to pickNaN
fpu/softfloat: Make is_nan et al available to softfloat-specialize.h
fpu/softfloat: Specialize on snan_bit_is_one
fpu/softfloat: Remove floatX_maybe_silence_nan
fpu/softfloat: Use float*_silence_nan in propagateFloat*NaN
target/s390x: Remove floatX_maybe_silence_nan from conversions
target/riscv: Remove floatX_maybe_silence_nan from conversions
target/mips: Remove floatX_maybe_silence_nan from conversions
target/m68k: Use floatX_silence_nan when we have already checked for SNaN
target/hppa: Remove floatX_maybe_silence_nan from conversions
target/arm: Remove floatX_maybe_silence_nan from conversions
target/arm: Use floatX_silence_nan when we have already checked for SNaN
fpu/softfloat: re-factor float to float conversions
fpu/softfloat: Partial support for ARM Alternative half-precision
target/arm: squash FZ16 behaviour for conversions
target/arm: convert conversion helpers to fpst/ahp_flag
...
Jakub Jelen [Wed, 16 May 2018 11:55:44 +0000 (13:55 +0200)]
hw/usb/dev-smartcard-reader: Handle 64 B USB packets
The current code was not correctly handling 64 B (Max USB 1.1 payload size)
packets and therefore preventing some of the messages from smart card to
pass through to the guest.
If the smart card in host responded with 34 B of data in APDU layer, the
CCID headers added up to 64 B. The packet was send, but not correctly
committed per USB specification (8.5.3.2 Variable-length Data Stage):
> When all of the data structure is returned to the host, the function
> should indicate that the Data stage is ended by returning a packet
> that is shorter than the MaxPacketSize for the pipe. If the data
> structure is an exact multiple of wMaxPacketSize for the pipe, the
> function will return a zero-length packet to indicate the end of the
> Data stage.
This lead the guest applications to timeout while waiting for the rest
of data (the emulation layer is answering with NAK until the timeout).
This patch is checking the current maximum packet size and if the
payload of this size is detected, the message buffer is not yet released.
With the next call, the empty buffer is sent and the message buffer
is finally released.
Since cc847bfd16d894fd8c1a2ce25f31772f6cdbbc74, CCID card-passthru
fails to intialize, because it changed a debug line to an error,
probably by mistake. Change it back to a DPRINTF debug.
(solves Boxes creating VM with smartcard passthru failing to start)
Peter Maydell [Tue, 15 May 2018 18:58:14 +0000 (19:58 +0100)]
sdl: Move use of surface pointer below check for whether it is NULL
In commit 2ab858c6c38ee1 we added a use of the 'surf' variable
in sdl2_2d_update() that was unfortunately placed above the
early-exit-if-NULL check. Move it to where it ought to be.
Paolo Bonzini [Thu, 17 May 2018 12:39:42 +0000 (14:39 +0200)]
ui: add x_keymap.o to modules
x_keymap.o is common to the SDL and GTK+ modules, and it causes the
QEMU binary to link to the X11 libraries. Add it separately to the
modules to keep the main QEMU binary smaller.
Michal Privoznik [Thu, 17 May 2018 15:00:11 +0000 (17:00 +0200)]
console: Avoid segfault in screendump
After f771c5440e04626f1 it is possible to select device and
head which to take screendump from. And even though we check if
provided head number falls within range, it may still happen that
the console has no surface yet leading to SIGSEGV:
fpu/softfloat: Define floatN_silence_nan in terms of parts_silence_nan
Isolate the target-specific choice to 3 functions instead of 6.
The code in floatx80_default_nan tried to be over-general. There are
only two targets that support this format: x86 and m68k. Thus there
is no point in inventing a mechanism for snan_bit_is_one.
Move routines that no longer have ifdefs out of softfloat-specialize.h.
fpu/softfloat: Define floatN_default_nan in terms of parts_default_nan
Isolate the target-specific choice to 2 functions instead of 6.
The code in float16_default_nan was only correct for ARM, MIPS, and X86.
Though float16 support is rare among our targets.
The code in float128_default_nan was arguably wrong for Sparc. While
QEMU supports the Sparc 128-bit insns, no real cpu enables it.
The code in floatx80_default_nan tried to be over-general. There are
only two targets that support this format: x86 and m68k. Thus there
is no point in inventing a value for snan_bit_is_one.
Move routines that no longer have ifdefs out of softfloat-specialize.h.
For each operand, pass a single enumeration instead of a pair of booleans.
The commit also merges multiple different ifdef-selected implementations
of pickNaNMulAdd into a single function whose body is ifdef-selected.
For each operand, pass a single enumeration instead of a pair of booleans.
The commit also merges multiple different ifdef-selected implementations
of pickNaN into a single function whose body is ifdef-selected.
fpu/softfloat: Make is_nan et al available to softfloat-specialize.h
We will need these helpers within softfloat-specialize.h, so move
the definitions above the include. After specialization, they will
not always be used so mark them to avoid the Werror.
Alex Bennée [Wed, 2 May 2018 14:58:31 +0000 (15:58 +0100)]
fpu/softfloat: Partial support for ARM Alternative half-precision
For float16 ARM supports an alternative half-precision format which
sacrifices the ability to represent NaN/Inf in return for a higher
dynamic range. The new FloatFmt flag, arm_althp, is then used to
modify the behaviour of canonicalize and round_canonical with respect
to representation and exception raising.
Usage of this new flag waits until we re-factor float-to-float conversions.
Alex Bennée [Mon, 7 May 2018 12:57:39 +0000 (13:57 +0100)]
target/arm: squash FZ16 behaviour for conversions
The ARM ARM specifies FZ16 is suppressed for conversions. Rather than
pushing this logic into the softfloat code we can simply save the FZ
state and temporarily disable it for the softfloat call.
Alex Bennée [Mon, 7 May 2018 12:17:16 +0000 (13:17 +0100)]
target/arm: convert conversion helpers to fpst/ahp_flag
Instead of passing env and leaving it up to the helper to get the
right fpstatus we pass it explicitly. There was already a get_fpstatus
helper for neon for the 32 bit code. We also add an get_ahp_flag() for
passing the state of the alternative FP16 format flag. This leaves
scope for later tracking the AHP state in translation flags.
Shift the NaN fraction to a canonical position, much like we
do for the fraction of normal numbers. This will facilitate
manipulation of NaNs within the shared code paths.
Petr Tesarik [Fri, 11 May 2018 07:10:52 +0000 (09:10 +0200)]
fpu/softfloat: Fix conversion from uint64 to float128
The significand is passed to normalizeRoundAndPackFloat128() as high
first, low second. The current code passes the integer first, so the
result is incorrectly shifted left by 64 bits.
This bug affects the emulation of s390x instruction CXLGBR (convert
from logical 64-bit binary-integer operand to extended BFP result).
* remotes/cody/tags/block-pull-request:
nfs: Remove processed options from QDict
nfs: Fix error path in nfs_options_qdict_to_qapi()
blockjob: do not cancel timer in resume
qemu-iotests: reduce chance of races in 185
# gpg: Signature made Tue 15 May 2018 22:54:38 BST
# gpg: using RSA key F487EF185872D723
# gpg: Good signature from "Juan Quintela <[email protected]>"
# gpg: aka "Juan Quintela <[email protected]>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20180515: (40 commits)
Migration+TLS: Fix crash due to double cleanup
migration: Textual fixups for blocktime
migration: update index field when delete or qsort RDMALocalBlock
migration: update docs
migration/hmp: add migrate_pause command
migration/qmp: add command migrate-pause
migration: introduce lock for to_dst_file
hmp/migration: add migrate_recover command
qmp/migration: new command migrate-recover
migration: init dst in migration_object_init too
migration: final handshake for the resume
migration: setup ramstate for resume
migration: synchronize dirty bitmap for resume
migration: introduce SaveVMHandlers.resume_prepare
migration: new message MIG_RP_MSG_RESUME_ACK
migration: new cmd MIG_CMD_POSTCOPY_RESUME
migration: new message MIG_RP_MSG_RECV_BITMAP
migration: new cmd MIG_CMD_RECV_BITMAP
migration: wakeup dst ram-load-thread for recover
migration: new state "postcopy-recover"
...
Peter Maydell [Thu, 17 May 2018 08:57:55 +0000 (09:57 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2018-05-15
* KnightsMill CPU model
* CLDEMOTE(Demote Cache Line) cpu feature
* pc-i440fx-2.13 and pc-q35-2.13 machine-types
* Add model-specific cache information to EPYC CPU model
# gpg: Signature made Tue 15 May 2018 22:53:12 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-next-pull-request:
i386: Add new property to control cache info
pc: add 2.13 machine types
i386: Initialize cache information for EPYC family processors
i386: Add cache information in X86CPUDefinition
i386: Helpers to encode cache information consistently
x86/cpu: Enable CLDEMOTE(Demote Cache Line) cpu feature
i386: add KnightsMill cpu model
Kevin Wolf [Wed, 16 May 2018 16:08:16 +0000 (18:08 +0200)]
nfs: Remove processed options from QDict
Commit c22a03454 QAPIfied option parsing in the NFS block driver, but
forgot to remove all the options we processed. Therefore, we get an
error in bdrv_open_inherit(), which thinks the remaining options are
invalid. Trying to open an NFS image will result in an error like this:
Block protocol 'nfs' doesn't support the option 'server.host'
Remove all options from the QDict to make the NFS driver work again.
Stefan Hajnoczi [Tue, 8 May 2018 13:54:36 +0000 (14:54 +0100)]
blockjob: do not cancel timer in resume
Currently the timer is cancelled and the block job is entered by
block_job_resume(). This behavior causes drain to run extra blockjob
iterations when the job was sleeping due to the ratelimit.
This patch leaves the job asleep when block_job_resume() is called.
Jobs can still be forcibly woken up using block_job_enter(), which is
used to cancel jobs.
After this patch drain no longer runs extra blockjob iterations. This
is the expected behavior that qemu-iotests 185 used to rely on. We
temporarily changed the 185 test output to make it pass for the QEMU
2.12 release but now it's time to address this issue.
Similar issues also affect the other sub-tests. If disk I/O completes
quickly, it races with the QMP 'quit' command. This causes spurious
test failures because QMP events are emitted in an unpredictable order.
This test relies on QEMU internals and there is no QMP API for getting
deterministic behavior needed to make this test 100% reliable. At the
same time, the test is useful and it would be a shame to remove it.
Add sleep 0.5 to reduce the chance of races. This is not a real fix but
appears to reduce spurious failures in practice.
During a TLS connect we see:
migration_channel_connect calls
migration_tls_channel_connect
(calls after TLS setup)
migration_channel_connect
My previous error handling fix made migration_channel_connect
call migrate_fd_connect in all cases; unfortunately the above
means it gets called twice and crashes doing double cleanup.
Lidong Chen [Sun, 6 May 2018 14:54:58 +0000 (22:54 +0800)]
migration: update index field when delete or qsort RDMALocalBlock
rdma_delete_block function deletes RDMALocalBlock base on index field,
but not update the index field. So when next time invoke rdma_delete_block,
it will not work correctly.
If start and cancel migration repeatedly, some RDMALocalBlock not invoke
ibv_dereg_mr to decrease kernel mm_struct vmpin. When vmpin is large than
max locked memory limitation, ibv_reg_mr will failed, and migration can not
start successfully again.
Among other changes:
* Added a general list of advice for device authors
* Reordered the section on conditional state (subsections etc)
into the order we prefer.
* Add a note about firmware
Peter Xu [Wed, 2 May 2018 10:47:39 +0000 (18:47 +0800)]
migration/qmp: add command migrate-pause
It pauses an ongoing migration. Currently it only supports postcopy.
Note that this command will work on either side of the migration.
Basically when we trigger this on one side, it'll interrupt the other
side as well since the other side will get notified on the disconnect
event.
However, it's still possible that the other side is not notified, for
example, when the network is totally broken, or due to some firewall
configuration changes. In that case, we will also need to run the same
command on the other side so both sides will go into the paused state.
Peter Xu [Wed, 2 May 2018 10:47:36 +0000 (18:47 +0800)]
qmp/migration: new command migrate-recover
The first allow-oob=true command. It's used on destination side when
the postcopy migration is paused and ready for a recovery. After
execution, a new migration channel will be established for postcopy to
continue.
Peter Xu [Wed, 2 May 2018 10:47:35 +0000 (18:47 +0800)]
migration: init dst in migration_object_init too
Though we may not need it, now we init both the src/dst migration
objects in migration_object_init() so that even incoming migration
object would be thread safe (it was not).
Peter Xu [Wed, 2 May 2018 10:47:32 +0000 (18:47 +0800)]
migration: synchronize dirty bitmap for resume
This patch implements the first part of core RAM resume logic for
postcopy. ram_resume_prepare() is provided for the work.
When the migration is interrupted by network failure, the dirty bitmap
on the source side will be meaningless, because even the dirty bit is
cleared, it is still possible that the sent page was lost along the way
to destination. Here instead of continue the migration with the old
dirty bitmap on source, we ask the destination side to send back its
received bitmap, then invert it to be our initial dirty bitmap.
The source side send thread will issue the MIG_CMD_RECV_BITMAP requests,
once per ramblock, to ask for the received bitmap. On destination side,
MIG_RP_MSG_RECV_BITMAP will be issued, along with the requested bitmap.
Data will be received on the return-path thread of source, and the main
migration thread will be notified when all the ramblock bitmaps are
synchronized.
This is hook function to be called when a postcopy migration wants to
resume from a failure. For each module, it should provide its own
recovery logic before we switch to the postcopy-active state.
Peter Xu [Wed, 2 May 2018 10:47:30 +0000 (18:47 +0800)]
migration: new message MIG_RP_MSG_RESUME_ACK
Creating new message to reply for MIG_CMD_POSTCOPY_RESUME. One uint32_t
is used as payload to let the source know whether destination is ready
to continue the migration.
Peter Xu [Wed, 2 May 2018 10:47:29 +0000 (18:47 +0800)]
migration: new cmd MIG_CMD_POSTCOPY_RESUME
Introducing this new command to be sent when the source VM is ready to
resume the paused migration. What the destination does here is
basically release the fault thread to continue service page faults.
Peter Xu [Wed, 2 May 2018 10:47:28 +0000 (18:47 +0800)]
migration: new message MIG_RP_MSG_RECV_BITMAP
Introducing new return path message MIG_RP_MSG_RECV_BITMAP to send
received bitmap of ramblock back to source.
This is the reply message of MIG_CMD_RECV_BITMAP, it contains not only
the header (including the ramblock name), and it was appended with the
whole ramblock received bitmap on the destination side.
When the source receives such a reply message (MIG_RP_MSG_RECV_BITMAP),
it parses it, convert it to the dirty bitmap by inverting the bits.
One thing to mention is that, when we send the recv bitmap, we are doing
these things in extra:
- converting the bitmap to little endian, to support when hosts are
using different endianess on src/dst.
- do proper alignment for 8 bytes, to support when hosts are using
different word size (32/64 bits) on src/dst.
Peter Xu [Wed, 2 May 2018 10:47:26 +0000 (18:47 +0800)]
migration: wakeup dst ram-load-thread for recover
On the destination side, we cannot wake up all the threads when we got
reconnected. The first thing to do is to wake up the main load thread,
so that we can continue to receive valid messages from source again and
reply when needed.
At this point, we switch the destination VM state from postcopy-paused
back to postcopy-recover.
Peter Xu [Wed, 2 May 2018 10:47:25 +0000 (18:47 +0800)]
migration: new state "postcopy-recover"
Introducing new migration state "postcopy-recover". If a migration
procedure is paused and the connection is rebuilt afterward
successfully, we'll switch the source VM state from "postcopy-paused" to
the new state "postcopy-recover", then we'll do the resume logic in the
migration thread (along with the return path thread).
This patch only do the state switch on source side. Another following up
patch will handle the state switching on destination side using the same
status bit.
Peter Xu [Wed, 2 May 2018 10:47:22 +0000 (18:47 +0800)]
migration: allow fault thread to pause
Allows the fault thread to stop handling page faults temporarily. When
network failure happened (and if we expect a recovery afterwards), we
should not allow the fault thread to continue sending things to source,
instead, it should halt for a while until the connection is rebuilt.
When the dest main thread noticed the failure, it kicks the fault thread
to switch to pause state.
Peter Xu [Wed, 2 May 2018 10:47:20 +0000 (18:47 +0800)]
migration: allow dst vm pause on postcopy
When there is IO error on the incoming channel (e.g., network down),
instead of bailing out immediately, we allow the dst vm to switch to the
new POSTCOPY_PAUSE state. Currently it is still simple - it waits the
new semaphore, until someone poke it for another attempt.
One note is that here on ram loading thread we cannot detect the
POSTCOPY_ACTIVE state, but we need to detect the more specific
POSTCOPY_INCOMING_RUNNING state, to make sure we have already loaded all
the device states.
Peter Xu [Wed, 2 May 2018 10:47:19 +0000 (18:47 +0800)]
migration: implement "postcopy-pause" src logic
Now when network down for postcopy, the source side will not fail the
migration. Instead we convert the status into this new paused state, and
we will try to wait for a rescue in the future.
If a recovery is detected, migration_thread() will reset its local
variables to prepare for that.
Peter Xu [Wed, 2 May 2018 10:47:18 +0000 (18:47 +0800)]
migration: new postcopy-pause state
Introducing a new state "postcopy-paused", which can be used when the
postcopy migration is paused. It is targeted for postcopy network
failure recovery.
Peter Xu [Wed, 2 May 2018 10:47:17 +0000 (18:47 +0800)]
migration: let incoming side use thread context
The old incoming migration is running in main thread and default
gcontext. With the new qio_channel_add_watch_full() we can now let it
run in the thread's own gcontext (if there is one).
Currently this patch does nothing alone. But when any of the incoming
migration is run in another iothread (e.g., the upcoming migrate-recover
command), this patch will bind the incoming logic to the iothread
instead of the main thread (which may already get page faulted and
hanged).
RDMA is not considered for now since it's not even using the QIO watch
framework at all.
Juan Quintela [Mon, 19 Feb 2018 18:01:45 +0000 (19:01 +0100)]
migration: terminate_* can be called for other threads
Once there, make count field to always be accessed with atomic
operations. To make blocking operations, we need to know that the
thread is running, so create a bool to indicate that.
Peter Maydell [Tue, 15 May 2018 16:02:00 +0000 (17:02 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- Switch AIO/callback based block drivers to a byte-based interface
- Block jobs: Expose error string via query-block-jobs
- Block job cleanups and fixes
- hmp: Allow using a qdev id in block_set_io_throttle
# gpg: Signature made Tue 15 May 2018 16:33:10 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (37 commits)
iotests: Add test for -U/force-share conflicts
qemu-img: Use only string options in img_open_opts
qemu-io: Use purely string blockdev options
block: Document BDRV_REQ_WRITE_UNCHANGED support
qemu-img: Check post-truncation size
iotests: Add test for COR across nodes
iotests: Copy 197 for COR filter driver
iotests: Clean up wrap image in 197
block: Support BDRV_REQ_WRITE_UNCHANGED in filters
block/quorum: Support BDRV_REQ_WRITE_UNCHANGED
block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes
block: Add BDRV_REQ_WRITE_UNCHANGED flag
block: BLK_PERM_WRITE includes ..._UNCHANGED
block: Add COR filter driver
iotests: Skip 181 and 201 without userfaultfd
iotests: Add failure matching to common.qemu
docs: Document the new default sizes of the qcow2 caches
qcow2: Give the refcount cache the minimum possible size by default
specs/qcow2: Clarify that compressed clusters have the COPIED bit reset
Fix error message about compressed clusters with OFLAG_COPIED
...
Babu Moger [Mon, 14 May 2018 16:41:51 +0000 (11:41 -0500)]
i386: Add new property to control cache info
The property legacy-cache will be used to control the cache information.
If user passes "-cpu legacy-cache" then older information will
be displayed even if the hardware supports new information. Otherwise
use the statically loaded cache definitions if available.
Renamed the previous cache structures to legacy_*. If there is any change in
the cache information, then it needs to be initialized in builtin_x86_defs.
Eduardo Habkost [Thu, 10 May 2018 20:41:41 +0000 (15:41 -0500)]
i386: Helpers to encode cache information consistently
Instead of having a collection of macros that need to be used in
complex expressions to build CPUID data, define a CPUCacheInfo
struct that can hold information about a given cache. Helper
functions will take a CPUCacheInfo struct as input to encode
CPUID leaves for a cache.
This will help us ensure consistency between cache information
CPUID leaves, and make the existing inconsistencies in CPUID info
more visible.
Jingqi Liu [Fri, 4 May 2018 03:57:33 +0000 (11:57 +0800)]
x86/cpu: Enable CLDEMOTE(Demote Cache Line) cpu feature
The CLDEMOTE instruction hints to hardware that the cache line that
contains the linear address should be moved("demoted") from
the cache(s) closest to the processor core to a level more distant
from the processor core. This may accelerate subsequent accesses
to the line by other cores in the same coherence domain,
especially if the line was written by the core that demotes the line.
Intel Snow Ridge has added new cpu feature, CLDEMOTE.
The new cpu feature needs to be exposed to guest VM.
The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 25] CLDEMOTE
The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Boqun Feng [Tue, 20 Mar 2018 00:08:15 +0000 (08:08 +0800)]
i386: add KnightsMill cpu model
A new cpu model called "KnightsMill" is added to model Knights Mill
processors. Compared to "Skylake-Server" cpu model, the following
features are added:
Kevin Wolf [Tue, 15 May 2018 14:19:53 +0000 (16:19 +0200)]
Merge remote-tracking branch 'mreitz/tags/pull-block-2018-05-15' into queue-block
- Copy-on-read block driver
- The qcow2 default refcount cache size has been decreased
- Various bug fixes
# gpg: Signature made Tue May 15 16:18:25 2018 CEST
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2018-05-15: (21 commits)
iotests: Add test for -U/force-share conflicts
qemu-img: Use only string options in img_open_opts
qemu-io: Use purely string blockdev options
block: Document BDRV_REQ_WRITE_UNCHANGED support
qemu-img: Check post-truncation size
iotests: Add test for COR across nodes
iotests: Copy 197 for COR filter driver
iotests: Clean up wrap image in 197
block: Support BDRV_REQ_WRITE_UNCHANGED in filters
block/quorum: Support BDRV_REQ_WRITE_UNCHANGED
block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes
block: Add BDRV_REQ_WRITE_UNCHANGED flag
block: BLK_PERM_WRITE includes ..._UNCHANGED
block: Add COR filter driver
iotests: Skip 181 and 201 without userfaultfd
iotests: Add failure matching to common.qemu
docs: Document the new default sizes of the qcow2 caches
qcow2: Give the refcount cache the minimum possible size by default
specs/qcow2: Clarify that compressed clusters have the COPIED bit reset
Fix error message about compressed clusters with OFLAG_COPIED
...
Max Reitz [Wed, 2 May 2018 20:20:50 +0000 (22:20 +0200)]
qemu-img: Use only string options in img_open_opts
img_open_opts() takes a QemuOpts and converts them to a QDict, so all
values therein are strings. Then it may try to call qdict_get_bool(),
however, which will fail with a segmentation fault every time:
$ ./qemu-img info -U --image-opts \
driver=file,filename=/dev/null,force-share=off
[1] 27869 segmentation fault (core dumped) ./qemu-img info -U
--image-opts driver=file,filename=/dev/null,force-share=off
Fix this by using qdict_get_str() and comparing the value as a string.
Also, when adding a force-share value to the QDict, add it as a string
so it fits the rest of the dict.
Max Reitz [Wed, 2 May 2018 20:20:49 +0000 (22:20 +0200)]
qemu-io: Use purely string blockdev options
Currently, qemu-io only uses string-valued blockdev options (as all are
converted directly from QemuOpts) -- with one exception: -U adds the
force-share option as a boolean. This in itself is already a bit
questionable, but a real issue is that it also assumes the value already
existing in the options QDict would be a boolean, which is wrong.
Since @opts is converted from QemuOpts, the value must be a string, and
we have to compare it as such. Consequently, it makes sense to also set
it as a string instead of a boolean.
Max Reitz [Wed, 2 May 2018 14:03:59 +0000 (16:03 +0200)]
block: Document BDRV_REQ_WRITE_UNCHANGED support
Add BDRV_REQ_WRITE_UNCHANGED to the list of flags honored during pwrite
and pwrite_zeroes, and also add a note on when you absolutely need to
support it.
Max Reitz [Sat, 21 Apr 2018 16:39:57 +0000 (18:39 +0200)]
qemu-img: Check post-truncation size
Some block drivers (iscsi and file-posix when dealing with device files)
do not actually support truncation, even though they provide a
.bdrv_truncate() method and will happily return success when providing a
new size that does not exceed the current size. This is because these
drivers expect the user to resize the image outside of qemu and then
provide qemu with that information through the block_resize command
(compare cb1b83e740384b4e0d950f3d7c81c02b8ce86c2e).
Of course, anyone using qemu-img resize will find that behavior useless.
So we should check the actual size of the image after the supposedly
successful truncation took place, emit an error if nothing changed and
emit a warning if the target size was not met.
Max Reitz [Sat, 21 Apr 2018 13:29:29 +0000 (15:29 +0200)]
iotests: Add test for COR across nodes
COR across nodes (that is, you have some filter node between the
actually COR target and the node that performs the COR) cannot reliably
work together with the permission system when there is no explicit COR
node that can request the WRITE_UNCHANGED permission for its child.
This is because COR (currently) sneaks its requests by the usual
permission checks, so it can work without a WRITE* permission; but if
there is a filter node in between, that will re-issue the request, which
then passes through the usual check -- and if nobody has requested a
WRITE_UNCHANGED permission, that check will fail.
There is no real direct fix apart from hoping that there is someone who
has requested that permission; in case of just the qemu-io HMP command
(and no guest device), however, that is not the case. The real real fix
is to implement the copy-on-read flag through an implicitly added COR
node. Such a node can request the necessary permissions as shown in
this test.