Peter Maydell [Tue, 9 Jan 2018 15:22:47 +0000 (15:22 +0000)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-01-08' into staging
nbd patches for 2018-01-08
- Eric Blake: 0/2 Optimize sparse reads over NBD
- Murilo Opsfelder Araujo: block/nbd: fix segmentation fault when .desc is not null-terminated
# gpg: Signature made Mon 08 Jan 2018 15:21:19 GMT
# gpg: using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg: aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2018-01-08:
block/nbd: fix segmentation fault when .desc is not null-terminated
nbd/server: Optimize final chunk of sparse read
nbd/server: Implement sparse reads atop structured reply
Peter Maydell [Mon, 8 Jan 2018 22:14:24 +0000 (22:14 +0000)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
- Aneesh no longer listed in MAINTAINERS,
- deprecation of the handle backend,
- improved error reporting, especially when the local backend fails to
open the VirtFS root,
- virtio-9p-test to behave more like a real virtio guest driver: set
DRIVER_OK when ready to use the device and process the used ring
for completed requests,
- cosmetic fixes (mostly coding style related).
# gpg: Signature made Mon 08 Jan 2018 10:19:18 GMT
# gpg: using RSA key 0x71D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz <[email protected]>"
# gpg: aka "[jpeg image of size 3330]"
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/for-upstream:
MAINTAINERS: Drop Aneesh as 9pfs maintainer
9pfs: deprecate handle backend
fsdev: improve error handling of backend init
fsdev: improve error handling of backend opts parsing
tests: virtio-9p: set DRIVER_OK before using the device
tests: virtio-9p: fix ISR dependence
9pfs: make pdu_marshal() and pdu_unmarshal() static functions
9pfs: fix error path in pdu_submit()
9pfs: fix type in *_parse_opts declarations
9pfs: handle: fix type definition
9pfs: fix some type definitions
fsdev: fix some type definitions
9pfs: fix XattrOperations typedef
virtio-9p: move unrealize/realize after virtio_9p_transport definition
In commit c97d6d2cdf97ed we accidentally added code to configure
that uses '==' for string equality testing. This is a bashism --
the portable way to write this is '='.
This fixes the "Unexpected operator error" complaint produced
if the system /bin/sh is dash.
block/nbd: fix segmentation fault when .desc is not null-terminated
The find_desc_by_name() from util/qemu-option.c relies on the .name not being
NULL to call strcmp(). This check becomes unsafe when the list is not
NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can
result in segmentation fault when strcmp() tries to access an invalid memory:
#0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6
#1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166
#2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026
#3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406
#4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135
#5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395
>From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an
invalid memory:
>>> p desc[5]
$8 = {
name = 0x1037f098 "R27A",
type = 1561964883,
help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>,
def_value_str = 0x2 <error: Cannot access memory at address 0x2>
}
>>> p desc[6]
$9 = {
name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001",
type = 272101528,
help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>,
def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9>
}
This patch fixes the segmentation fault in strcmp() by adding a NULL element at
the end of nbd_runtime_opts.desc list, which is the common practice to most of
other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL
check becomes safe because it will not evaluate to true when .desc list reached
its end.
Eric Blake [Tue, 7 Nov 2017 03:09:12 +0000 (21:09 -0600)]
nbd/server: Optimize final chunk of sparse read
If we are careful to handle 0-length read requests correctly,
we can optimize our sparse read to send the NBD_REPLY_FLAG_DONE
bit on our last OFFSET_DATA or OFFSET_HOLE chunk rather than
needing a separate chunk.
The reason that NBD added structured reply in the first place was
to allow for efficient reads of sparse files, by allowing the
reply to include chunks to quickly communicate holes to the client
without sending lots of zeroes over the wire. Time to implement
this in the server; our client can already read such data.
We can only skip holes insofar as the block layer can query them;
and only if the client is okay with a fragmented request (if a
client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
that's a fringe case not worth catering to here). Sadly, the
control flow is a bit wonkier than I would have preferred, but
it was minimally invasive to have a split in the action between
a fragmented read (handled directly where we recognize
NBD_CMD_READ with the right conditions, and sending multiple
chunks) vs. a single read (handled at the end of nbd_trip, for
both simple and structured replies, when we know there is only
one thing being read). Likewise, I didn't make any effort to
optimize the final chunk of a fragmented read to set the
NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
NBD_REPLY_TYPE_NONE.
Peter Maydell [Mon, 8 Jan 2018 13:44:01 +0000 (13:44 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 22 Dec 2017 14:09:01 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (35 commits)
block: Keep nodes drained between reopen_queue/multiple
commit: Simplify reopen of base
test-bdrv-drain: Test graph changes in drained section
block: Allow graph changes in subtree drained section
test-bdrv-drain: Recursive draining with multiple parents
test-bdrv-drain: Test behaviour in coroutine context
test-bdrv-drain: Tests for bdrv_subtree_drain
block: Add bdrv_subtree_drained_begin/end()
block: Don't notify parents in drain call chain
test-bdrv-drain: Test nested drain sections
block: Nested drain_end must still call callbacks
block: Don't block_job_pause_all() in bdrv_drain_all()
test-bdrv-drain: Test drain vs. block jobs
blockjob: Pause job on draining any job BDS
test-bdrv-drain: Test bs->quiesce_counter
test-bdrv-drain: Test callback for bdrv_drain
block: Make bdrv_drain() driver callbacks non-recursive
block: Assert drain_all is only called from main AioContext
block: Remove unused bdrv_requests_pending
block: Mention -drive cyls/heads/secs/trans/serial/addr in deprecation chapter
...
Peter Maydell [Mon, 8 Jan 2018 11:39:50 +0000 (11:39 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-hvf' into staging
Initial support for the HVF accelerator
# gpg: Signature made Sat 23 Dec 2017 07:51:18 GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream-hvf:
i386: hvf: cleanup x86_gen.h
i386: hvf: remove VM_PANIC from "in"
i386: hvf: remove addr_t
i386: hvf: simplify flag handling
i386: hvf: abort on decoding error
i386: hvf: remove ZERO_INIT macro
i386: hvf: remove more dead emulator code
i386: hvf: unify register enums between HVF and the rest
i386: hvf: header cleanup
i386: hvf: move all hvf files in the same directory
i386: hvf: inject General Protection Fault when vmexit through vmcall
i386: hvf: refactor event injection code for hvf
i386: hvf: implement vga dirty page tracking
i386: refactor KVM cpuid code so that it applies to hvf as well
i386: hvf: implement hvf_get_supported_cpuid
i386: hvf: use new helper functions for put/get xsave
i386: hvf: fix licensing issues; isolate task handling code (GPL v2-only)
i386: hvf: add code base from Google's QEMU repository
apic: add function to apic that will be used by hvf
Greg Kurz [Mon, 8 Jan 2018 10:18:23 +0000 (11:18 +0100)]
9pfs: deprecate handle backend
This backend raise some concerns:
- doesn't support symlinks
- fails +100 tests in the PJD POSIX file system test suite [1]
- requires the QEMU process to run with the CAP_DAC_READ_SEARCH
capability, which isn't recommended for security reasons
This backend should not be used and wil be removed. The 'local'
backend is the recommended alternative.
Greg Kurz [Mon, 8 Jan 2018 10:18:23 +0000 (11:18 +0100)]
fsdev: improve error handling of backend init
This patch changes some error messages in the backend init code and
convert backends to propagate QEMU Error objects instead of calling
error_report().
One notable improvement is that the local backend now provides a more
detailed error report when it fails to open the shared directory.
Greg Kurz [Mon, 8 Jan 2018 10:18:23 +0000 (11:18 +0100)]
fsdev: improve error handling of backend opts parsing
This patch changes some error messages in the backend opts parsing
code and convert backends to propagate QEMU Error objects instead
of calling error_report().
Greg Kurz [Mon, 8 Jan 2018 10:18:22 +0000 (11:18 +0100)]
9pfs: fix error path in pdu_submit()
If we receive an unsupported request id, we first decide to
return -ENOTSUPP to the client, but since the request id
causes is_read_only_op() to return false, we change the
error to be -EROFS if the fsdev is read-only. This doesn't
make sense since we don't know what the client asked for.
This patch ensures that -EROFS can only be returned if the
request id is supported.
Peter Maydell [Mon, 8 Jan 2018 10:16:40 +0000 (10:16 +0000)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2017-12-22-1' into staging
Merge tpm 2017/12/22 v1
# gpg: Signature made Fri 22 Dec 2017 20:03:37 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-12-22-1:
acpi: Update TPM2 ACPI table to more recent specs
tpm: Implement tpm_sized_buffer_reset
tpm_tis: merge r/w_offset into rw_offset
tpm_tis: move r/w_offsets to TPMState
tpm_tis: merge read and write buffer into single buffer
tpm_tis: move buffers from localities into common location
tpm_tis: remove TPMSizeBuffer usage
tpm_tis: limit size of buffer from backend
tpm_tis: convert uint32_t to size_t
tpm_emulator: Add a caching layer for the TPM Established flag
Peter Maydell [Mon, 8 Jan 2018 09:15:42 +0000 (09:15 +0000)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 22 Dec 2017 02:12:29 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
qemu-doc: Update the deprecation information of -tftp, -bootp, -redir and -smb
qemu-doc: The "-net nic" option can be used with "netdev=...", too
net: Remove the legacy "-net channel" parameter
net: remove unused compute_mcast_idx() function
rtl8139: use inline net_crc32() and bitshift instead of compute_mcast_idx()
ne2000: use inline net_crc32() and bitshift instead of compute_mcast_idx()
ftgmac100: use inline net_crc32() and bitshift instead of compute_mcast_idx()
lan9118: use inline net_crc32() and bitshift instead of compute_mcast_idx()
opencores_eth: use inline net_crc32() and bitshift instead of compute_mcast_idx()
eepro100: use inline net_crc32() and bitshift instead of compute_mcast_idx()
sungem: fix multicast filter CRC calculation
sunhme: switch sunhme over to use net_crc32_le()
eepro100: switch eepro100 e100_compute_mcast_idx() over to use net_crc32()
pcnet: switch pcnet over to use net_crc32_le()
net: introduce net_crc32_le() function
net: move CRC32 calculation from compute_mcast_idx() into its own net_crc32() function
e1000: Separate TSO and non-TSO contexts, fixing UDP TX corruption
e1000, e1000e: Move per-packet TX offload flags out of context state
Laurent Vivier [Thu, 4 Jan 2018 01:29:10 +0000 (02:29 +0100)]
target/m68k: add 680x0 "move to SR" instruction
Some cleanup, and allows SR to be moved from any addressing mode.
Previous code was wrong for coldfire: coldfire also allows to
use addressing mode to set SR/CCR. It only supports Data register
to get SR/CCR (move from)
Laurent Vivier [Thu, 4 Jan 2018 01:29:07 +0000 (02:29 +0100)]
target/m68k: add reset
The instruction traps if the CPU is not in
Supervisor state but the helper is empty because
there is no easy way to reset all the peripherals
without resetting the CPU itself.
Laurent Vivier [Thu, 4 Jan 2018 01:28:58 +0000 (02:28 +0100)]
target/m68k: fix gen_get_ccr()
As gen_helper_get_ccr() is able to compute CCR from cc_op and
flags, we don't need to flush flags before to call it.
flush_flags() and get_ccr() use COMPUTE_CCR() to compute
flags. get_ccr() computes CCR value,
whereas flush_flags update live cc_op and flags.
Use the function argument "name" instead of hardcoded
"VMCOREINFO". All callers use "VMCOREINFO" as argument, so this isn't
an exposed bug, thankfully.
Simplify a little bit the code while touching this.
With no fixed array allocation, we can't overflow a buffer.
This will be important as optimizations related to host vectors
may expand the number of ops used.
Alex Bennée [Tue, 14 Nov 2017 10:25:35 +0000 (11:25 +0100)]
target/*helper: don't check retaddr before calling cpu_restore_state
cpu_restore_state officially supports being passed an address it can't
resolve the state for. As a result the checks in the helpers are
superfluous and can be removed. This makes the code consistent with
other users of cpu_restore_state.
Of course this does nothing to address what to do if cpu_restore_state
can't resolve the state but so far it seems this is handled elsewhere.
The change was made with included coccinelle script.
Signed-off-by: Alex Bennée <[email protected]>
[rth: Fixed up comment indentation. Added second hunk to script to
combine cpu_restore_state and cpu_loop_exit.] Signed-off-by: Richard Henderson <[email protected]>
Stefan Berger [Tue, 14 Nov 2017 18:42:42 +0000 (13:42 -0500)]
acpi: Update TPM2 ACPI table to more recent specs
More recent specs of the TPM2 ACPI table add fields for the log area
start address and the log area minimum size, which we already use
for the TCPA table.
Kevin Wolf [Wed, 6 Dec 2017 19:24:44 +0000 (20:24 +0100)]
block: Keep nodes drained between reopen_queue/multiple
The bdrv_reopen*() implementation doesn't like it if the graph is
changed between queuing nodes for reopen and actually reopening them
(one of the reasons is that queuing can be recursive).
So instead of draining the device only in bdrv_reopen_multiple(),
require that callers already drained all affected nodes, and assert this
in bdrv_reopen_queue().
Kevin Wolf [Wed, 6 Dec 2017 12:53:36 +0000 (13:53 +0100)]
commit: Simplify reopen of base
Since commit bde70715, base is the only node that is reopened in
commit_start(). This means that the code, which still involves an
explicit BlockReopenQueue, can now be simplified by using bdrv_reopen().
Kevin Wolf [Mon, 18 Dec 2017 15:05:48 +0000 (16:05 +0100)]
block: Allow graph changes in subtree drained section
We need to remember how many of the drain sections in which a node is
were recursive (i.e. subtree drain rather than node drain), so that they
can be correctly applied when children are added or removed during the
drained section.
With this change, it is safe to modify the graph even inside a
bdrv_subtree_drained_begin/end() section.
Kevin Wolf [Fri, 8 Dec 2017 17:51:16 +0000 (18:51 +0100)]
test-bdrv-drain: Test behaviour in coroutine context
If bdrv_do_drained_begin/end() are called in coroutine context, they
first use a BH to get out of the coroutine context. Call some existing
tests again from a coroutine to cover this code path.
Kevin Wolf [Wed, 6 Dec 2017 16:05:44 +0000 (17:05 +0100)]
block: Add bdrv_subtree_drained_begin/end()
bdrv_drained_begin() waits for the completion of requests in the whole
subtree, but it only actually keeps its immediate bs parameter quiesced
until bdrv_drained_end().
Add a version that keeps the whole subtree drained. As of this commit,
graph changes cannot be allowed during a subtree drained section, but
this will be fixed soon.
Kevin Wolf [Thu, 7 Dec 2017 12:03:13 +0000 (13:03 +0100)]
block: Don't notify parents in drain call chain
This is in preparation for subtree drains, i.e. drained sections that
affect not only a single node, but recursively all child nodes, too.
Calling the parent callbacks for drain is pointless when we just came
from that parent node recursively and leads to multiple increases of
bs->quiesce_counter in a single drain call. Don't do it.
In order for this to work correctly, the parent callback must be called
for every bdrv_drain_begin/end() call, not only for the outermost one:
If we have a node N with two parents A and B, recursive draining of A
should cause the quiesce_counter of B to increase because its child N is
drained independently of B. If now B is recursively drained, too, A must
increase its quiesce_counter because N is drained independently of A
only now, even if N is going from quiesce_counter 1 to 2.
Kevin Wolf [Wed, 13 Dec 2017 17:14:18 +0000 (18:14 +0100)]
block: Nested drain_end must still call callbacks
bdrv_do_drained_begin() restricts the call of parent callbacks and
aio_disable_external() to the outermost drain section, but the block
driver callbacks are always called. bdrv_do_drained_end() must match
this behaviour, otherwise nodes stay drained even if begin/end calls
were balanced.
Kevin Wolf [Tue, 12 Dec 2017 18:04:28 +0000 (19:04 +0100)]
blockjob: Pause job on draining any job BDS
Block jobs already paused themselves when their main BlockBackend
entered a drained section. This is not good enough: We also want to
pause a block job and may not submit new requests if, for example, the
mirror target node should be drained.
This implements .drained_begin/end callbacks in child_job in order to
consider all block nodes related to the job, and removes the
BlockBackend callbacks which are unnecessary now because the root of the
job main BlockBackend is always referenced with a child_job, too.
Kevin Wolf [Wed, 6 Dec 2017 17:13:53 +0000 (18:13 +0100)]
test-bdrv-drain: Test callback for bdrv_drain
The existing test is for bdrv_drain_all_begin/end() only. Generalise the
test case so that it can be run for the other variants as well. At the
moment this is only bdrv_drain_begin/end(), but in a while, we'll add
another one.
Also, add a backing file to the test node to test whether the operations
work recursively.
Kevin Wolf [Thu, 7 Dec 2017 11:20:10 +0000 (12:20 +0100)]
block: Make bdrv_drain() driver callbacks non-recursive
bdrv_drained_begin() doesn't increase bs->quiesce_counter recursively
and also doesn't notify other parent nodes of children, which both means
that the child nodes are not actually drained, and bdrv_drained_begin()
is providing useful functionality only on a single node.
To keep things consistent, we also shouldn't call the block driver
callbacks recursively.
A proper recursive drain version that provides an actually working
drained section for child nodes will be introduced later.
Thomas Huth [Mon, 18 Dec 2017 17:14:32 +0000 (18:14 +0100)]
block: Remove the deprecated -hdachs option
It's been marked as deprecated since QEMU v2.10.0, and so far nobody
complained that we should keep it, so let's remove this legacy option
now to simplify the code quite a bit.
but this doesn't work anymore due to image locking:
qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock
Is another process using the image?
Could not open backing image to determine size.
Use the force share option to allow this use case again.
Kevin Wolf [Fri, 15 Dec 2017 10:54:22 +0000 (11:54 +0100)]
block: Document that x-blockdev-change breaks quorum children list
Removing a quorum child node with x-blockdev-change results in a quorum
driver state that cannot be recreated with create options because it
would require a list with gaps. This causes trouble in at least
.bdrv_refresh_filename().
Document this problem so that we won't accidentally mark the command
stable without having addressed it.
Since bdrv_co_preadv does all neccessary checks including
reading after the end of the backing file, avoid duplication
of verification before bdrv_co_preadv call.
Kevin Wolf [Mon, 11 Dec 2017 14:33:17 +0000 (15:33 +0100)]
block: Don't acquire AioContext in hmp_qemu_io()
Commit 15afd94a047 added code to acquire and release the AioContext in
qemuio_command(). This means that the lock is taken twice now in the
call path from hmp_qemu_io(). This causes BDRV_POLL_WHILE() to hang for
any requests issued to nodes in a non-mainloop AioContext.
Dropping the first locking from hmp_qemu_io() fixes the problem.
Kevin Wolf [Wed, 6 Dec 2017 10:00:59 +0000 (11:00 +0100)]
block: Unify order in drain functions
Drain requests are propagated to child nodes, parent nodes and directly
to the AioContext. The order in which this happened was different
between all combinations of drain/drain_all and begin/end.
The correct order is to keep children only drained when their parents
are also drained. This means that at the start of a drained section, the
AioContext needs to be drained first, the parents second and only then
the children. The correct order for the end of a drained section is the
opposite.
This patch changes the three other functions to follow the example of
bdrv_drained_begin(), which is the only one that got it right.
Kevin Wolf [Wed, 6 Dec 2017 09:45:27 +0000 (10:45 +0100)]
block: Don't wait for requests in bdrv_drain*_end()
The device is drained, so there is no point in waiting for requests at
the end of the drained section. Remove the bdrv_drain_recurse() calls
there.
The bdrv_drain_recurse() calls were introduced in commit 481cad48e5e
in order to call the .bdrv_co_drain_end() driver callback. This is now
done by a separate bdrv_drain_invoke() call.
Kevin Wolf [Tue, 5 Dec 2017 13:05:02 +0000 (14:05 +0100)]
test-bdrv-drain: Test BlockDriver callbacks for drain
This adds a test case that the BlockDriver callbacks for drain are
called in bdrv_drained_all_begin/end(), and that both of them are called
exactly once.
Kevin Wolf [Tue, 5 Dec 2017 12:53:35 +0000 (13:53 +0100)]
block: Call .drain_begin only once in bdrv_drain_all_begin()
bdrv_drain_all_begin() used to call the .bdrv_co_drain_begin() driver
callback inside its polling loop. This means that how many times it got
called for each node depended on long it had to poll the event loop.
This is obviously not right and results in nodes that stay drained even
after bdrv_drain_all_end(), which calls .bdrv_co_drain_begin() once per
node.
Fix bdrv_drain_all_begin() to call the callback only once, too.
Kevin Wolf [Tue, 5 Dec 2017 11:52:09 +0000 (12:52 +0100)]
block: Make bdrv_drain_invoke() recursive
This change separates bdrv_drain_invoke(), which calls the BlockDriver
drain callbacks, from bdrv_drain_recurse(). Instead, the function
performs its own recursion now.
One reason for this is that bdrv_drain_recurse() can be called multiple
times by bdrv_drain_all_begin(), but the callbacks may only be called
once. The separation is necessary to fix this bug.
The other reason is that we intend to go to a model where we call all
driver callbacks first, and only then start polling. This is not fully
achieved yet with this patch, as bdrv_drain_invoke() contains a
BDRV_POLL_WHILE() loop for the block driver callbacks, which can still
call callbacks for any unrelated event. It's a step in this direction
anyway.
John Snow [Tue, 5 Dec 2017 01:08:20 +0000 (20:08 -0500)]
iotests: fix 197 for vpc
VPC has some difficulty creating geometries of particular size.
However, we can indeed force it to use a literal one, so let's
do that for the sake of test 197, which is testing some specific
offsets.
Kevin Wolf [Thu, 30 Nov 2017 16:38:43 +0000 (17:38 +0100)]
block: Formats don't need CONSISTENT_READ with NO_IO
Commit 1f4ad7d fixed 'qemu-img info' for raw images that are currently
in use as a mirror target. It is not enough for image formats, though,
as these still unconditionally request BLK_PERM_CONSISTENT_READ.
As this permission is geared towards whether the guest-visible data is
consistent, and has no impact on whether the metadata is sane, and
'qemu-img info' does not read guest-visible data (except for the raw
format), it makes sense to not require BLK_PERM_CONSISTENT_READ if there
is not going to be any guest I/O performed, regardless of image format.