Peter Maydell [Tue, 24 May 2016 11:21:07 +0000 (12:21 +0100)]
Merge remote-tracking branch 'remotes/amit-migration/tags/migration-2.7-1' into staging
migration fixes:
- ensure src block devices continue fine after a failed migration
- fail on migration blockers; helps 9p savevm/loadvm
- move autoconverge commands out of experimental state
- move the migration-specific qjson in migration/
* remotes/amit-migration/tags/migration-2.7-1:
migration: regain control of images when migration fails to complete
savevm: fail if migration blockers are present
migration: Promote improved autoconverge commands out of experimental state
migration/qjson: Drop gratuitous use of QOM
migration: Move qjson.[ch] to migration/
Zhou Jie [Wed, 27 Apr 2016 02:07:48 +0000 (10:07 +0800)]
hw/net/opencores_eth: Allocating Large sized arrays to heap
open_eth_start_xmit has a huge stack usage of 65536 bytes approx.
Moving large arrays to heap to reduce stack usage.
Reduce size of a buffer allocated on stack to 0x600 bytes, which is the
maximal frame length when HUGEN bit is not set in MODER, only allocate
buffer on heap when that is too small. Thus heap is not used in typical
use case.
If we try postcopy with a similar scenario, we also get the writev error
message but QEMU leaves the guest paused because entered_postcopy is true.
We could possibly do the same with precopy and leave the guest paused.
But since the historical default for migration errors is to restart the
source, this patch adds a call to bdrv_invalidate_cache_all() instead.
Greg Kurz [Wed, 4 May 2016 19:44:19 +0000 (21:44 +0200)]
savevm: fail if migration blockers are present
QEMU has currently two ways to prevent migration to occur:
- migration blocker when it depends on runtime state
- VMStateDescription.unmigratable when migration is not supported at all
This patch gathers all the logic into a single function to be called from
both the savevm and the migrate paths.
This fixes a bug with 9p, at least, where savevm would succeed and the
following would happen in the guest after loadvm:
$ ls /host
ls: cannot access /host: Protocol error
With this patch:
(qemu) savevm foo
Migration is disabled when VirtFS export path '/' is mounted in the guest
using mount_tag 'host'
Peter Maydell [Mon, 23 May 2016 15:15:51 +0000 (16:15 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NMI cleanups (Bandan)
* RAMBlock/Memory cleanups and fixes (Dominik, Gonglei, Fam, me)
* first part of linuxboot support for fw_cfg DMA (Richard)
* IOAPIC fix (Peter Xu)
* iSCSI SG_IO fix (Vadim)
* Various infrastructure bug fixes (Zhijian, Peter M., Stefan)
* CVE fixes (Prasad)
# gpg: Signature made Mon 23 May 2016 16:06:18 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
* remotes/bonzini/tags/for-upstream: (24 commits)
cpus: call the core nmi injection function
nmi: remove x86 specific nmi handling
target-i386: add a generic x86 nmi handler
coccinelle: add g_assert_cmp* to macro file
iscsi: pass SCSI status back for SG_IO
esp: check dma length before reading scsi command(CVE-2016-4441)
esp: check command buffer length before write(CVE-2016-4439)
scripts/signrom.py: Check for magic in option ROMs.
scripts/signrom.py: Allow option ROM checksum script to write the size header.
Remove config-devices.mak on 'make clean'
cpus.c: Use pthread_sigmask() rather than sigprocmask()
memory: remove unnecessary masking of MemoryRegion ram_addr
memory: Drop FlatRange.romd_mode
memory: Remove code for mr->may_overlap
exec: adjust rcu_read_lock requirement
memory: drop find_ram_block()
vl: change runstate only if new state is different from current state
ioapic: clear remote irr bit for edge-triggered interrupts
ioapic: keep RO bits for IOAPIC entry
target-i386: key sfence availability on CPUID_SSE, not CPUID_SSE2
...
esp: check dma length before reading scsi command(CVE-2016-4441)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
Routine get_cmd() uses DMA to read scsi commands into this buffer.
Add check to validate DMA length against buffer size to avoid any
overrun.
esp: check command buffer length before write(CVE-2016-4439)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer. While
writing to this command buffer 's->cmdbuf[TI_BUFSZ=16]', a check
was missing to validate input length. Add check to avoid OOB write
access.
scripts/signrom.py: Check for magic in option ROMs.
Because of the risk that compilers might not emit the asm() block at
the beginning of the option ROM, check that the ROM contains the
required magic signature.
scripts/signrom.py: Allow option ROM checksum script to write the size header.
Modify the signrom.py script so that if the size byte in the header is
0 (ie. not set) then the script will set the size. If the size byte
is non-zero then we do the same as before, so this doesn't require
changes to any existing ROM sourcecode.
Peter Maydell [Tue, 17 May 2016 11:27:31 +0000 (12:27 +0100)]
Remove config-devices.mak on 'make clean'
Our dependency mechanism works like this:
* on first build there is neither a .o nor a .d
* we create the .d as a side effect of creating the .o
* for rebuilds we know when we need to update the .o,
which also updates the .d
This system requires that you're never in a situation where there is
a .o file but no .d (because then we will never realise we need to
build the .d, and we will not have the dependency information about
when to rebuild the .o).
This is working fine for our object files, but we also try to use it
for $TARGET/config-devices.mak (where the dependency file is
in $TARGET-config-devices.mak.d). Unfortunately "make clean" doesn't
remove config-devices.mak, which means that it puts us in the
forbidden situation of "object file exists but not its .d file".
This in turn means that we will fail to notice when we need to rebuild:
mkdir build/depbug
(cd build/depbug && '../../configure')
make -C build/depbug -j8
make -C build/depbug clean
echo "CONFIG_CANARY = y" >> default-configs/arm-softmmu.mak
make -C build/depbug
grep CANARY build/depbug/aarch64-softmmu/config-devices.mak
The CANARY token should show up in config-devices.mak but does not.
Fix this bug by making "make clean" delete the config-devices.mak files.
config-all-devices.mak doesn't have the same problem since it has
no .d file, but delete it too, since it is created by "make" and
logically should be removed by "make clean".
(Note that it is important not to remove config-devices.mak until
after we have recursively run 'make clean' in the subdirectories.)
Peter Maydell [Mon, 16 May 2016 17:33:59 +0000 (18:33 +0100)]
cpus.c: Use pthread_sigmask() rather than sigprocmask()
On Linux, sigprocmask() and pthread_sigmask() are in practice the
same thing (they only set the signal mask for the calling thread),
but the documentation states that the behaviour of sigprocmask() in a
multithreaded process is undefined. Use pthread_sigmask() instead
(which is what we do in almost all places in QEMU that alter the
signal mask already).
Gonglei [Tue, 10 May 2016 02:04:59 +0000 (10:04 +0800)]
memory: drop find_ram_block()
On the one hand, we have already qemu_get_ram_block() whose function
is similar. On the other hand, we can directly use mr->ram_block but
searching RAMblock by ram_addr which is a kind of waste.
Peter Xu [Tue, 10 May 2016 10:21:22 +0000 (18:21 +0800)]
ioapic: clear remote irr bit for edge-triggered interrupts
This is to better emulate IOAPIC version 0x1X hardware. Linux kernel
leveraged this "feature" to do explicit EOI since EOI register is still
not introduced at that time. This will also fix the issue that level
triggered interrupts failed to work when IR enabled (tested with Linux
kernel version 4.5).
Stefan Weil [Thu, 28 Apr 2016 21:33:41 +0000 (23:33 +0200)]
configure: Allow builds with extra warnings
The clang compiler supports a useful compiler option -Weverything,
and GCC also has other warnings not enabled by -Wall.
If glib header files trigger a warning, however, testing glib with
-Werror will always fail. A size mismatch is also detected without
-Werror, so simply remove it.
When processing Task Priorty Register(TPR) access, it could leak
automatic stack variable 'imm32' in patch_instruction().
Initialise the variable to avoid it.
Peter Maydell [Mon, 23 May 2016 14:53:02 +0000 (15:53 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160523-1' into staging
usb: add xen pvUSB backend, add num-ports check to ohci.
# gpg: Signature made Mon 23 May 2016 14:02:25 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg: aka "Gerd Hoffmann <[email protected]>"
# gpg: aka "Gerd Hoffmann (private) <[email protected]>"
* remotes/kraxel/tags/pull-usb-20160523-1:
usb/ohci: Fix crash with when specifying too many num-ports
xen: add pvUSB backend
xen: write information about supported backends
xen: introduce dummy system device
# gpg: Signature made Mon 23 May 2016 13:30:26 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg: aka "Gerd Hoffmann <[email protected]>"
# gpg: aka "Gerd Hoffmann (private) <[email protected]>"
* remotes/kraxel/tags/pull-vga-20160523-1:
vga: add sr_vbe register set
virtio-gpu: fix ui idx check
virtio-gpu: use VIRTIO_GPU_MAX_SCANOUTS
virtio-gpu: check max_outputs only
virtio-gpu: check max_outputs value
virtio-vga: propagate on gpu realized error
virtio-gpu: check early scanout id
Thomas Huth [Mon, 23 May 2016 09:23:07 +0000 (11:23 +0200)]
usb/ohci: Fix crash with when specifying too many num-ports
QEMU currently crashes when an OHCI controller is instantiated with
too many ports, e.g. "-device pci-ohci,num-ports=100,masterbus=1".
Thus add a proper check in usb_ohci_init() to make sure that we
do not use more than OHCI_MAX_PORTS = 15 ports here.
Gerd Hoffmann [Tue, 17 May 2016 08:54:54 +0000 (10:54 +0200)]
vga: add sr_vbe register set
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression. The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.
This patch introduces a new sr_vbe register set. The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[]. Normal vga register reads and
writes go to sr[]. Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.
This way we can allow guests update sr[] registers as they want, without
allowing them disrupt vbe video modes that way.
Juergen Gross [Thu, 12 May 2016 14:13:40 +0000 (16:13 +0200)]
xen: write information about supported backends
Add a Xenstore directory for each supported pv backend. This will allow
Xen tools to decide which backend type to use in case there are
multiple possibilities.
The information is added under
/local/domain/<backend-domid>/device-model/<domid>/backends
before the "running" state is written to Xenstore. Using a directory
for each backend enables us to add parameters for specific backends
in the future.
This interface is documented in the Xen source repository in the file
docs/misc/qemu-backends.txt
In order to reuse the Xenstore directory creation already present in
hw/xen/xen_devconfig.c move the related functions to
hw/xen/xen_backend.c where they fit better.
Juergen Gross [Thu, 12 May 2016 14:13:39 +0000 (16:13 +0200)]
xen: introduce dummy system device
Introduce a new dummy system device serving as parent for virtual
buses. This will enable new pv backends to introduce virtual buses
which are removable again opposed to system buses which are meant
to stay once added.
Peter Maydell [Mon, 23 May 2016 09:30:41 +0000 (10:30 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine Core queue, 2016-05-20
# gpg: Signature made Fri 20 May 2016 21:26:49 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
* remotes/ehabkost/tags/machine-pull-request: (21 commits)
Use &error_fatal when initializing crypto on qemu-{img,io,nbd}
vl: Use &error_fatal when parsing monitor options
vl: Use &error_fatal when parsing VNC options
machine: add properties to compat_props incrementaly
vl: Simplify global property registration
vl: Make display_remote a local variable
vl: Move DisplayType typedef to vl.c
vl: Make display_type a local variable
vl: Replace DT_NOGRAPHIC with machine option
milkymist: Move DT_NOGRAPHIC check outside milkymist_tmu2_create()
spice: Initialization stubs on qemu-spice.h
gtk: Initialization stubs
cocoa: cocoa_display_init() stub
sdl: Initialization stubs
curses: curses_display_init() stub
vnc: Initialization stubs
vl: Add DT_COCOA DisplayType value
vl: Replace *_vga_available() functions with class_names field
vl: Table-based select_vgahw()
vl: Use exit(1) when requested VGA interface is unavailable
...
Type QJSON lets you build JSON text. Its interface mirrors (a subset
of) abstract JSON syntax.
QAPI output visitors also produce JSON text. They assert their
preconditions and invariants, and therefore abort on incorrect use.
Contrastingly, QJSON does *not* detect incorrect use. It happily
produces invalid JSON then. This is what migration wants.
QJSON was designed for migration, and migration is its only user.
Move it to migration/ for proper coverage by MAINTAINERS, and to deter
accidental use outside migration.
[Pointed out by Eric: QJSON was added in commits 0457d07..b174257
-- Amit]
Usually, Random Number Generator is abbreviated to RNG/rng.
so replacing RndRandom with RngRandom seems more reasonable
and keep consistent with RngBackend.
Eduardo Habkost [Thu, 12 May 2016 14:10:04 +0000 (11:10 -0300)]
Use &error_fatal when initializing crypto on qemu-{img,io,nbd}
In addition to making the code simpler, this will replace the
long error messages:
cannot initialize crypto: Unable to initialize GNUTLS library: [...]
cannot initialize crypto: Unable to initialize gcrypt
with shorter messages:
Unable to initialize GNUTLS library: [...]
Unable to initialize gcrypt
Igor Mammedov [Thu, 28 Jan 2016 10:58:08 +0000 (11:58 +0100)]
machine: add properties to compat_props incrementaly
Switch to adding compat properties incrementaly instead of
completly overwriting compat_props per machine type.
That removes data duplication which we have due to nested
[PC|SPAPR]_COMPAT_* macros.
It also allows to set default device properties from
default foo_machine_options() hook, which will be used
in following patch for putting VMGENID device as
a function if ISA bridge on pc/q35 machines.
Eduardo Habkost [Thu, 28 Jan 2016 15:11:04 +0000 (13:11 -0200)]
vl: Simplify global property registration
There's no need to use qdev_prop_register_global_list() and an
array, if we are registering a single GlobalProperty struct. Use
qdev_prop_register_global() instead.
All DisplayType values are just UI options that don't affect any
hardware emulation code, except for DT_NOGRAPHIC. Replace
DT_NOGRAPHIC with DT_NONE plus a new "-machine graphics=on|off"
option, so hardware emulation code don't need to use the
display_type variable.
This reduces the number of CONFIG_VNC #ifdefs in the vl.c code.
The only user-visible difference is that this will make QEMU
complain about syntax when using "-display vnc" ("VNC requires a
display argument vnc=<display>") even if CONFIG_VNC is disabled.
Instead of reusing DT_SDL for Cocoa, use DT_COCOA to indicate
that a Cocoa display was requested.
configure already ensures CONFIG_COCOA and CONFIG_SDL are never
set at the same time. The only case where DT_SDL is used outside
a #ifdef CONFIG_SDL block is in the no_frame/alt_grab/ctrl_grab
check. That means the only user-visible change is that we will
start printing a warning if the SDL-specific options are used in
Cocoa mode. This is a bugfix, because no_frame/alt_grab/ctrl_grab
are not used by Cocoa code.
Paolo Bonzini [Fri, 20 May 2016 11:57:31 +0000 (13:57 +0200)]
tci: do not include exec/exec-all.h
TCI does not need the runtime definition in exec-all.h. It only needs the
host-side definitions in tcg/tcg.h. Now that cpu.h is not included
everywhere, this caused a failure because exec-all.h does need cpu.h
but does not include it itself.
Paolo Bonzini [Fri, 20 May 2016 11:57:32 +0000 (13:57 +0200)]
aspeed: include qemu/log.h
This is not visible with the default "log" trace backend. With other
backends however trace.h does not include qemu/log.h, resulting in
build failures.
Peter Maydell [Thu, 19 May 2016 15:54:12 +0000 (16:54 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Thu 19 May 2016 16:09:27 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
* remotes/kevin/tags/for-upstream: (31 commits)
qemu-iotests: Fix regression in 136 on aio_read invalid
qemu-iotests: Simplify 109 with unaligned qemu-img compare
qemu-io: Fix recent UI updates
block: clarify error message for qmp-eject
qemu-iotests: Some more write_zeroes tests
qcow2: Fix write_zeroes with partially allocated backing file cluster
qcow2: fix condition in is_zero_cluster
block: Propagate AioContext change to all children
block: Remove BlockDriverState.blk
block: Don't return throttling info in query-named-block-nodes
block: Avoid bs->blk in bdrv_next()
block: Add bdrv_has_blk()
block: Remove bdrv_aio_multiwrite()
blockjob: Don't touch BDS iostatus
blockjob: Don't set iostatus of target
block: User BdrvChild callback for device name
block: Use BdrvChild callbacks for change_media/resize
block: Don't check throttled reqs in bdrv_requests_pending()
Revert "block: Forbid I/O throttling on nodes with multiple parents for 2.6"
block: Remove bdrv_move_feature_fields()
...
Eric Blake [Mon, 16 May 2016 16:43:03 +0000 (10:43 -0600)]
qemu-iotests: Fix regression in 136 on aio_read invalid
Commit 093ea232 removed the ability for aio_read and aio_write
to artificially inflate the invalid statistics counters for
block devices, since it no longer flags unaligned offset or
length. Add 'aio_read -i' and 'aio_write -i' to restore
the ability, and update test 136 to use it.
Eric Blake [Mon, 16 May 2016 16:43:02 +0000 (10:43 -0600)]
qemu-iotests: Simplify 109 with unaligned qemu-img compare
For some time now, qemu-img compare has been able to compare
unaligned images. So we no longer need test 109's hack of
resizing to sector boundaries before invoking compare.
Eric Blake [Mon, 16 May 2016 16:43:01 +0000 (10:43 -0600)]
qemu-io: Fix recent UI updates
Commit 770e0e0e [*] tried to add 'writev -f', but didn't tweak
the getopt() call to actually let it work. Likewise, commit c2e001c missed implementing 'aio_write -u -z'. The latter commit
also introduced a leak of ctx.
Peter Maydell [Thu, 19 May 2016 14:55:08 +0000 (15:55 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
NEED_CPU_H cleanups, big enough to deserve their own pull request.
# gpg: Signature made Thu 19 May 2016 15:42:37 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
* remotes/bonzini/tags/for-upstream: (52 commits)
hw: clean up hw/hw.h includes
hw: remove pio_addr_t
cpu: move exec-all.h inclusion out of cpu.h
exec: extract exec/tb-context.h
hw: explicitly include qemu/log.h
mips: move CP0 functions out of cpu.h
arm: move arm_log_exception into .c file
qemu-common: push cpu.h inclusion out of qemu-common.h
acpi: do not use TARGET_PAGE_SIZE
s390x: reorganize CSS bits between cpu.h and other headers
dma: do not depend on kvm_enabled()
gdbstub: remove unnecessary includes from gdbstub-xml.c
qemu-common: stop including qemu/host-utils.h from qemu-common.h
qemu-common: stop including qemu/bswap.h from qemu-common.h
cpu: move endian-dependent load/store functions to cpu-all.h
hw: cannot include hw/hw.h from user emulation
hw: move CPU state serialization to migration/cpu.h
hw: do not use VMSTATE_*TL
include: poison symbols in osdep.h
apic: move target-dependent definitions to cpu.h
...
Kevin Wolf [Tue, 17 May 2016 16:37:39 +0000 (18:37 +0200)]
qemu-iotests: Some more write_zeroes tests
This covers some more write_zeroes cases which are relevant for the
recent qcow2 optimisations that check the allocation status of the
backing file for partial cluster write_zeroes requests.
This needs to be separate from 034 because we can only support qcow2 in
this test case for multiple reasons: We check the allocation status
after write_zeroes with 'qemu-img map' and the optimised behaviour that
produces zero clusters is only implemented in qcow2; second, the map
command returns offsets that are qcow2 specific; and finally, we also
use 512 byte clusters which aren't supported for formats like qed.
Kevin Wolf [Tue, 17 May 2016 16:42:23 +0000 (18:42 +0200)]
qcow2: Fix write_zeroes with partially allocated backing file cluster
In order to correctly check whether a given cluster is read as zero, we
don't only need to check whether bdrv_get_block_status_above() sets
BDRV_BLOCK_ZERO, but also if all sectors for the whole cluster have the
same status.
Denis V. Lunev [Tue, 17 May 2016 09:15:42 +0000 (12:15 +0300)]
qcow2: fix condition in is_zero_cluster
We should check for (res & BDRV_BLOCK_ZERO) only. The situation when we
will have !(res & BDRV_BLOCK_DATA) and will not have BDRV_BLOCK_ZERO is
not possible for images with bdi.unallocated_blocks_are_zero == true.
For those images where it's false, however, it can happen and we must
not consider the data zeroed then or we would corrupt the image.
Max Reitz [Tue, 17 May 2016 11:38:04 +0000 (13:38 +0200)]
block: Propagate AioContext change to all children
Instead of propagating any change of a BDS's AioContext only to its file
and backing children and letting driver-specific code do the rest, just
propagate it to all and drop the thus superfluous implementations of
bdrv_{at,de}tach_aio_context() in Quorum, blkverify and VMDK.
Kevin Wolf [Tue, 22 Mar 2016 17:38:44 +0000 (18:38 +0100)]
block: Remove BlockDriverState.blk
This patch removes the remaining users of bs->blk, which will allow us
to have multiple BBs on top of a single BDS. In the meantime, all checks
that are currently in place to prevent the user from creating such
setups can be switched to bdrv_has_blk() instead of accessing BDS.blk.
Future patches can allow them and e.g. enable users to mirror to a block
device that already has a BlockBackend on it.
Kevin Wolf [Tue, 22 Mar 2016 19:20:29 +0000 (20:20 +0100)]
block: Don't return throttling info in query-named-block-nodes
query-named-block-nodes should not return information that is related
to the attached BlockBackend rather than the node itself, so throttling
information needs to be removed from it.
Kevin Wolf [Tue, 22 Mar 2016 17:58:50 +0000 (18:58 +0100)]
block: Avoid bs->blk in bdrv_next()
We need to introduce a separate BdrvNextIterator struct that can keep
more state than just the current BDS in order to avoid using the bs->blk
pointer.
Kevin Wolf [Mon, 29 Feb 2016 09:50:38 +0000 (10:50 +0100)]
block: Add bdrv_has_blk()
In many cases we just want to know whether a BDS has at least one BB
attached, without needing to know the exact BB that is attached. In
contrast to bs->blk, this is still a valid question when more than one
BB can be attached, so just answer it by checking the parents list.
Kevin Wolf [Fri, 26 Feb 2016 12:50:43 +0000 (13:50 +0100)]
block: Remove bdrv_aio_multiwrite()
Since virtio-blk implements request merging itself these days, the only
remaining users are test cases for the function. That doesn't make the
function exactly useful any more.
Kevin Wolf [Mon, 18 Apr 2016 13:14:11 +0000 (15:14 +0200)]
blockjob: Don't touch BDS iostatus
Block jobs don't actually make use of the iostatus for their BDSes, but
they manage a separate block job iostatus. Still, they require that it
is enabled for the source BDS and they enable it automatically for the
target and set the error handling mode - which ends up never being used
by the job.
This patch removes all of the BDS iostatus handling from the block job,
which removes another few bs->blk accesses.
Kevin Wolf [Mon, 18 Apr 2016 09:36:38 +0000 (11:36 +0200)]
blockjob: Don't set iostatus of target
When block job errors were introduced, we assigned the iostatus of the
target BDS "just in case". The field has never been accessible for the
user because the target isn't listed in query-block.
Before we can allow the user to have a second BlockBackend on the
target, we need to clean this up. If anything, we would want to set the
iostatus for the internal BB of the job (which we can always do later),
but certainly not for a separate BB which the job doesn't even use.
As a nice side effect, this gets us rid of another bs->blk use.
Kevin Wolf [Fri, 26 Feb 2016 09:22:16 +0000 (10:22 +0100)]
block: User BdrvChild callback for device name
In order to get rid of bs->blk for bdrv_get_device_name() and
bdrv_get_device_or_node_name(), ask all parents for their name and
simply pick the first one.
Kevin Wolf [Wed, 24 Feb 2016 14:13:35 +0000 (15:13 +0100)]
block: Use BdrvChild callbacks for change_media/resize
We want to get rid of BlockDriverState.blk in order to allow multiple
BlockBackends per BDS. Converting the device callbacks in block.c (which
assume a single BlockBackend) to per-child callbacks gets us rid of the
first few instances.
Kevin Wolf [Tue, 22 Mar 2016 15:11:33 +0000 (16:11 +0100)]
block: Don't check throttled reqs in bdrv_requests_pending()
Checking whether there are throttled requests requires going to the
associated BlockBackend, which we want to avoid.
All users of bdrv_requests_pending() in block/io.c already call
bdrv_parent_drained_begin() first, which restarts all throttled
requests, so no throttled requests can be left here and this is removal
of dead code.
The remaining users (assertions during graph manipulation in block.c)
don't care about requests that are still queued in the BlockBackend and
haven't been issued for a BlockDriverState yet.
Now that I/O throttling is fully done on the BlockBackend level, there
is no reason any more to block I/O throttling for nodes with multiple
parents as the parents don't influence each other any more.
Kevin Wolf [Tue, 22 Mar 2016 12:00:08 +0000 (13:00 +0100)]
block: Decouple throttling from BlockDriverState
This moves the throttling related part of the BDS life cycle management
to BlockBackend. The throttling group reference is now kept even when no
medium is inserted.
With this commit, throttling isn't disabled and then re-enabled any more
during graph reconfiguration. This fixes the temporary breakage of I/O
throttling when used with live snapshots or block jobs that manipulate
the graph.
Kevin Wolf [Wed, 11 May 2016 12:57:23 +0000 (14:57 +0200)]
block/io: Quiesce parents between drained_begin/end
So far, bdrv_parent_drained_begin/end() was called for the duration of
the actual bdrv_drain() at the beginning of a drained section, but we
really should keep parents quiesced until the end of the drained
section.
This does not actually change behaviour at this point because the only
user of the .drained_begin/end BdrvChildRole callback is I/O throttling,
which already doesn't send any new requests after flushing its queue in
.drained_begin. The patch merely removes a trap for future users.
Kevin Wolf [Tue, 22 Mar 2016 11:05:35 +0000 (12:05 +0100)]
block: Drain throttling queue with BdrvChild callback
This removes the last part of I/O throttling from block/io.c and moves
it to the BlockBackend.
Instead of having knowledge about throttling inside io.c, we can call a
BdrvChild callback .drained_begin/end, which happens to drain the
throttled requests for BlockBackend parents.