However, if you specify a binary without a path, for example with
QTEST_QEMU_BINARY=qemu-system-x86_64 if the QEMU binary is in your
$PATH, then the test currently simply crashes.
Let's try a little bit smarter here by looking for the final '-'
instead of the slash.
bdrv_is_allocated_above wrongly handles short backing files: it reports
after-EOF space as UNALLOCATED which is wrong, as on read the data is
generated on the level of short backing file (if all overlays have
unallocated areas at that place).
Reusing bdrv_common_block_status_above fixes the issue and unifies code
path.
block/io: bdrv_common_block_status_above: support bs == base
We are going to reuse bdrv_common_block_status_above in
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
include_base == false and still bs == base (for ex. from img_rebase()).
bdrv_co_block_status_above has several design problems with handling
short backing files:
1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but
without BDRV_BLOCK_ALLOCATED flag, when actually short backing file
which produces these after-EOF zeros is inside requested backing
sequence.
2. With want_zero=false, it may return pnum=0 prior to actual EOF,
because of EOF of short backing file.
Fix these things, making logic about short backing files clearer.
With fixed bdrv_block_status_above we also have to improve is_zero in
qcow2 code, otherwise iotest 154 will fail, because with this patch we
stop to merge zeros of different types (produced by fully unallocated
in the whole backing chain regions vs produced by short backing files).
Note also, that this patch leaves for another day the general problem
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
vs go-to-backing.
Stefan Hajnoczi [Thu, 1 Oct 2020 14:46:03 +0000 (15:46 +0100)]
block/export: add vhost-user-blk multi-queue support
Allow the number of queues to be configured using --export
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
its own value if the vhost-user-blk backend offers fewer queues than
QEMU.
The vhost-user-blk-server.c code is already capable of multi-queue. All
virtqueue processing runs in the same AioContext. No new locking is
needed.
Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
Note that the feature bit only announces the presence of the num_queues
configuration space field. It does not promise that there is more than 1
virtqueue, so we can set it unconditionally.
I tested multi-queue by running a random read fio test with numjobs=4 on
an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/
directory shows that Linux blk-mq has 4 queues configured.
Stefan Hajnoczi [Tue, 29 Sep 2020 12:55:16 +0000 (13:55 +0100)]
block/export: add iothread and fixed-iothread options
Make it possible to specify the iothread where the export will run. By
default the block node can be moved to other AioContexts later and the
export will follow. The fixed-iothread option forces strict behavior
that prevents changing AioContext while the export is active. See the
QAPI docs for details.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-id: 20200929125516[email protected]
[Fix stray '#' character in block-export.json and add missing "(since:
5.2)" as suggested by Eric Blake.
--Stefan] Signed-off-by: Stefan Hajnoczi <[email protected]>
Stefan Hajnoczi [Tue, 29 Sep 2020 12:55:15 +0000 (13:55 +0100)]
block: move block exports to libblockdev
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
They are not used by other programs and are not otherwise needed in
libblock.
Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
(libblockdev) a stub function is required..
Make qemu-nbd.c use signal handling utility functions instead of
duplicating the code. This helps because os-posix.c is in libblockdev
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
Once we use the signal handling utility functions we also end up
providing the necessary symbol.
Stefan Hajnoczi [Thu, 24 Sep 2020 15:15:47 +0000 (16:15 +0100)]
block/export: convert vhost-user-blk server to block export API
Use the new QAPI block exports API instead of defining our own QOM
objects.
This is a large change because the lifecycle of VuBlockDev needs to
follow BlockExportDriver. QOM properties are replaced by QAPI options
objects.
VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
Several fields can be dropped since BlockExport already has equivalents.
The file names and meson build integration will be adjusted in a future
patch. libvhost-user should probably be built as a static library that
is linked into QEMU instead of as a .c file that results in duplicate
compilation.
Note that unix-socket is optional because we may wish to accept chardevs
too in the future.
Markus noted that supported address families are not explicit in the
QAPI schema. It is unlikely that support for more address families will
be added since file descriptor passing is required and few address
families support it. If a new address family needs to be added, then the
QAPI 'features' syntax can be used to advertize them.
Signed-off-by: Stefan Hajnoczi <[email protected]> Acked-by: Markus Armbruster <[email protected]>
Message-id: 20200924151549[email protected]
[Skip test on big-endian host architectures because this device doesn't
support them yet (as already mentioned in a code comment).
--Stefan] Signed-off-by: Stefan Hajnoczi <[email protected]>
The vu_client_trip() coroutine is leaked during AioContext switching. It
is also unsafe to destroy the vu_dev in panic_cb() since its callers
still access it in some cases.
Rework the lifecycle to solve these safety issues.
Stefan Hajnoczi [Thu, 24 Sep 2020 15:15:43 +0000 (16:15 +0100)]
util/vhost-user-server: fix memory leak in vu_message_read()
fds[] is leaked when qio_channel_readv_full() fails.
Use vmsg->fds[] instead of keeping a local fds[] array. Then we can
reuse goto fail to clean up fds. vmsg->fd_num must be zeroed before the
loop to make this safe.
block/export: vhost-user block device backend server
By making use of libvhost-user, block device drive can be shared to
the connected vhost-user client. Only one client can connect to the
server one time.
Since vhost-user-server needs a block drive to be created first, delay
the creation of this object.
Suggested-by: Kevin Wolf <[email protected]> Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Coiby Xu <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Reviewed-by: Marc-André Lureau <[email protected]>
Message-id: 20200918080912[email protected]
[Shorten "vhost_user_blk_server" string to "vhost_user_blk" to avoid the
following compiler warning:
../block/export/vhost-user-blk-server.c:178:50: error: ‘%s’ directive output truncated writing 21 bytes into a region of size 20 [-Werror=format-truncation=]
and fix "Invalid size %ld ..." ssize_t format string arguments for
32-bit hosts.
--Stefan] Signed-off-by: Stefan Hajnoczi <[email protected]>
libvhost-user: remove watch for kick_fd when de-initialize vu-dev
When the client is running in gdb and quit command is run in gdb,
QEMU will still dispatch the event which will cause segment fault in
the callback function.
libvhost-user: Allow vu_message_read to be replaced
Allow vu_message_read to be replaced by one which will make use of the
QIOChannel functions. Thus reading vhost-user message won't stall the
guest. For slave channel, we still use the default vu_message_read.
Green Wan [Tue, 20 Oct 2020 03:37:31 +0000 (11:37 +0800)]
hw/misc/sifive_u_otp: Add write function and write-once protection
- Add write operation to update fuse data bit when PWE bit is on.
- Add array, fuse_wo, to store the 'written' status for all bits
of OTP to block the write operation.
Yifei Jiang [Wed, 14 Oct 2020 10:17:28 +0000 (18:17 +0800)]
target/riscv: raise exception to HS-mode at get_physical_address
VS-stage translation at get_physical_address needs to translate pte
address by G-stage translation. But the G-stage translation error
can not be distinguished from VS-stage translation error in
riscv_cpu_tlb_fill. On migration, destination needs to rebuild pte,
and this G-stage translation error must be handled by HS-mode. So
introduce TRANSLATE_STAGE2_FAIL so that riscv_cpu_tlb_fill could
distinguish and raise it to HS-mode.
Alistair Francis [Wed, 14 Oct 2020 00:17:28 +0000 (17:17 -0700)]
hw/riscv: Return the end address of the loaded firmware
Instead of returning the unused entry address from riscv_load_firmware()
instead return the end address. Also return the end address from
riscv_find_and_load_firmware().
This tells the caller if a firmware was loaded and how big it is. This
can be used to determine the load address of the next image (usually the
kernel).
Georg Kotheimer [Tue, 13 Oct 2020 15:10:54 +0000 (17:10 +0200)]
target/riscv: Fix update of hstatus.SPVP
When trapping from virt into HS mode, hstatus.SPVP was set to
the value of sstatus.SPP, as according to the specification both
flags should be set to the same value.
However, the assignment of SPVP takes place before SPP itself is
updated, which results in SPVP having an outdated value.
riscv: Convert interrupt logs to use qemu_log_mask()
Currently we log interrupts and exceptions using the trace backend in
riscv_cpu_do_interrupt(). We also log exceptions using the interrupt log
mask (-d int) in riscv_raise_exception().
This patch converts riscv_cpu_do_interrupt() to log both interrupts and
exceptions with the interrupt log mask, so that both are printed when a
user runs QEMU with -d int.
Thomas Huth [Tue, 20 Oct 2020 16:05:04 +0000 (18:05 +0200)]
Remove deprecated -no-kvm option
The option has never been mentioned in our documentation, it's been
deprecated since years, it's marked with QEMU_ARCH_I386 (which does
not make sense anymore since KVM is available on other architectures,
too), it does not do anything by default in upstream QEMU (since TCG
is the default here anyway), and we're spending too much precious time
each year discussing whether it makes sense to keep this option as a
nice suger or not... let's finally put an end on this and remove it.
Claudio Fontana [Tue, 13 Oct 2020 19:21:23 +0000 (21:21 +0200)]
replay: do not build if TCG is not available
this fixes non-TCG builds broken recently by replay reverse debugging.
Stub the needed functions in stub/, splitting roughly between functions
needed only by system emulation, by system emulation and tools,
and by everyone. This includes duplicating some code in replay/, and
puts the logic for non-replay related events in the replay/ module (+
the stubs), so this should be revisited in the future.
Surprisingly, only _one_ qtest was affected by this, ide-test.c, which
resulted in a buzz as the bh events were never delivered, and the bh
never executed.
Many other subsystems _should_ have been affected.
This fixes the immediate issue, however a better way to group replay
functionality to TCG-only code could be developed in the long term.
Luc Michel [Tue, 20 Oct 2020 09:10:24 +0000 (11:10 +0200)]
hw/core/qdev-clock: add a reference on aliased clocks
When aliasing a clock with the qdev_alias_clock() function, a new link
property is created on the device aliasing the clock. The link points
to the aliased clock and use the OBJ_PROP_LINK_STRONG flag. This
property is read only since it does not provide a check callback for
modifications.
The object_property_add_link() documentation stats that with
OBJ_PROP_LINK_STRONG properties, the linked object reference count get
decremented when the property is deleted. But it is _not_ incremented on
creation (object_property_add_link() does not actually know the link).
This commit increments the reference count on the aliased clock to
ensure the aliased clock stays alive during the property lifetime, and
to avoid a double-free memory error when the property gets deleted.
Paolo Bonzini [Mon, 19 Oct 2020 08:42:11 +0000 (04:42 -0400)]
meson: rewrite curses/iconv test
Redo the curses test to do the same tests that the configure
check used to do. OpenBSD triggers the warning because
it does not support NCURSES_WIDECHAR and thus the cc.links
test fails.
Paolo Bonzini [Tue, 20 Oct 2020 09:18:17 +0000 (05:18 -0400)]
build: fix macOS --enable-modules build
Apple's nm implementation includes empty lines in the output that are not
found in GNU binutils. This confuses scripts/undefsym.py, though it did
not confuse the scripts/undefsym.sh script that it replaced. To fix
this, ignore lines that do not have two fields.
Reported-by: Emmanuel Blot <[email protected]> Tested-by: Emmanuel Blot <[email protected]> Fixes: 604f3e4e90 ("meson: Convert undefsym.sh to undefsym.py", 2020-09-08) Signed-off-by: Paolo Bonzini <[email protected]>
Janosch Frank [Thu, 22 Oct 2020 10:31:34 +0000 (06:31 -0400)]
s390x: pv: Remove sclp boundary checks
The SCLP boundary cross check is done by the Ultravisor for a
protected guest, hence we don't need to do it. As QEMU doesn't get a
valid SCCB address in protected mode this is even problematic and can
lead to QEMU reporting a false boundary cross error.
Matthew Rosato [Thu, 15 Oct 2020 13:16:07 +0000 (09:16 -0400)]
s390x/s390-virtio-ccw: Reset PCI devices during subsystem reset
Currently, a subsystem reset event leaves PCI devices enabled, causing
issues post-reset in the guest (an example would be after a kexec). These
devices need to be reset during a subsystem reset, allowing them to be
properly re-enabled afterwards. Add the S390 PCI host bridge to the list
of qdevs to be reset during subsystem reset.
Peter Maydell [Thu, 22 Oct 2020 10:13:24 +0000 (11:13 +0100)]
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sd-next-20201021' into staging
SD/MMC patches
Fix two heap-overflow reported by Alexander Bulekov while fuzzing:
- https://bugs.launchpad.net/qemu/+bug/1892960
- https://bugs.launchpad.net/qemu/+bug/1895310
CI jobs results:
. https://cirrus-ci.com/build/6399328187056128
. https://gitlab.com/philmd/qemu/-/pipelines/205701966
. https://travis-ci.org/github/philmd/qemu/builds/737708930
# gpg: Signature made Wed 21 Oct 2020 18:33:08 BST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <[email protected]>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd-gitlab/tags/sd-next-20201021:
hw/sd/sdcard: Assert if accessing an illegal group
hw/sd/sdcard: Do not attempt to erase out of range addresses
hw/sd/sdcard: Reset both start/end addresses on error
hw/sd/sdcard: Do not use legal address '0' for INVALID_ADDRESS
hw/sd/sdcard: Introduce the INVALID_ADDRESS definition
hw/sd/sdcard: Add trace event for ERASE command (CMD38)
hw/sd/sdhci: Yield if interrupt delivered during multiple transfer
hw/sd/sdhci: Let sdhci_update_irq() return if IRQ was delivered
hw/sd/sdhci: Resume pending DMA transfers on MMIO accesses
hw/sd/sdhci: Stop multiple transfers when block count is cleared
hw/sd/sdhci: Fix DMA Transfer Block Size field
hw/sd/sdhci: Document the datasheet used
hw/sd/sdhci: Fix qemu_log_mask() format string
Gerd Hoffmann [Mon, 19 Oct 2020 07:52:22 +0000 (09:52 +0200)]
spice: flip modules switch
Build spice core code as module. This removes libspice-server and a
handful of indirect dependencies from core qemu. The number of shared
libraries for qemu-system-x86_64 goes down from 73 to 66 on my system.
Gerd Hoffmann [Mon, 19 Oct 2020 07:52:20 +0000 (09:52 +0200)]
modules: dependencies infrastructure
Allow modules depending on other modules.
module_load_file() gets the option to export symbols (by not adding the
G_MODULE_BIND_LOCAL flag).
module_load_one() will check the module dependency list to figure (a)
whenever are other modules must be loaded first, or (b) the module
should export the symbols.
The dependencies are specificed as static list in the source code for
now as I expect the list will stay small.
Gerd Hoffmann [Mon, 19 Oct 2020 07:52:12 +0000 (09:52 +0200)]
spice: add QemuSpiceOps, move migrate_info
Add QemuSpiceOps struct. This struct holds function pointers to the
spice functions. It will be initialized with pointers to the stub
functions. When spice gets initialized the function pointers will
be re-written to the real functions.
The spice stubs will move from qemu-spice.h to spice-module.c for that,
because they will be needed for both "CONFIG_SPICE=n" and "CONFIG_SPICE=y
but spice module not loaded" cases.
This patch adds the infrastructure and starts with moving
qemu_spice_migrate_info() to QemuSpiceOps.
Gerd Hoffmann [Mon, 19 Oct 2020 07:52:11 +0000 (09:52 +0200)]
spice: add module helpers
Add new spice-module.c + qemu-spice-module.h files. The code needed to
support modular spice will be there. For starters this will be only the
using_spice variable, more will follow ...
hw/sd/sdcard: Reset both start/end addresses on error
From the Spec "4.3.5 Erase":
The host should adhere to the following command
sequence: ERASE_WR_BLK_START, ERASE_WR_BLK_END and
ERASE (CMD38).
If an erase (CMD38) or address setting (CMD32, 33)
command is received out of sequence, the card shall
set the ERASE_SEQ_ERROR bit in the status register
and reset the whole sequence.
Reset both addresses if the ERASE command occured
out of sequence (one of the start/end address is
not set).
hw/sd/sdcard: Do not use legal address '0' for INVALID_ADDRESS
As it is legal to WRITE/ERASE the address/block 0,
change the value of this definition to an illegal
address: UINT32_MAX.
Unfortunately this break the migration stream, so
bump the VMState version number. This affects some
ARM boards and the SDHCI_PCI device (which is only
used for testing).
hw/sd/sdhci: Yield if interrupt delivered during multiple transfer
The Descriptor Table has a bit to allow the DMA to generates
Interrupt when the operation of the descriptor line is completed
(see "1.13.4. Descriptor Table" of 'SD Host Controller Simplified
Specification Version 2.00').
If we have pending interrupt and the descriptor requires it
to be generated as soon as it is completed, reschedule pending
transfers and yield to the CPU.
See section '2.2.2. Block Size Register (Offset 004h)' in datasheet.
Two different bug reproducer available:
- https://bugs.launchpad.net/qemu/+bug/1892960
- https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Fsdhci_oob_write1