Peter Maydell [Mon, 27 Jul 2020 19:34:58 +0000 (20:34 +0100)]
hw/arm/nrf51_soc: Set system_clock_scale
The nrf51 SoC model wasn't setting the system_clock_scale
global.which meant that if guest code used the systick timer in "use
the processor clock" mode it would hang because time never advances.
Set the global to match the documented CPU clock speed for this SoC.
This SoC in fact doesn't have a SysTick timer (which is the only thing
currently that cares about the system_clock_scale), because it's
a configurable option in the Cortex-M0. However our Cortex-M0 and
thus our nrf51 and our micro:bit board do provide a SysTick, so
we ought to provide a functional one rather than a broken one.
Kaige Li [Mon, 3 Aug 2020 16:55:04 +0000 (17:55 +0100)]
target/arm: Avoid maybe-uninitialized warning with gcc 4.9
GCC version 4.9.4 isn't clever enough to figure out that all
execution paths in disas_ldst() that use 'fn' will have initialized
it first, and so it warns:
/home/LiKaige/qemu/target/arm/translate-a64.c: In function ‘disas_ldst’:
/home/LiKaige/qemu/target/arm/translate-a64.c:3392:5: error: ‘fn’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
fn(cpu_reg(s, rt), clean_addr, tcg_rs, get_mem_index(s),
^
/home/LiKaige/qemu/target/arm/translate-a64.c:3318:22: note: ‘fn’ was declared here
AtomicThreeOpFn *fn;
^
Make it happy by initializing the variable to NULL.
The definition of top_bit used in this function is one higher
than that used in the Arm ARM psuedo-code, which put the error
indication at top_bit - 1 at the wrong place, which meant that
it wasn't visible to Auth.
Fixing the definition of top_bit requires more changes, because
its most common use is for the count of bits in top_bit:bot_bit,
which would then need to be computed as top_bit - bot_bit + 1.
For now, prefer the minimal fix to the error indication alone.
Peter Maydell [Mon, 3 Aug 2020 16:55:03 +0000 (17:55 +0100)]
msf2-soc, stellaris: Don't wire up SYSRESETREQ
The MSF2 SoC model and the Stellaris board code both wire
SYSRESETREQ up to a function that just invokes
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
This is now the default action that the NVIC does if the line is
not connected, so we can delete the handling code.
Peter Maydell [Mon, 3 Aug 2020 16:55:03 +0000 (17:55 +0100)]
hw/intc/armv7m_nvic: Provide default "reset the system" behaviour for SYSRESETREQ
The NVIC provides an outbound qemu_irq "SYSRESETREQ" which it signals
when the guest sets the SYSRESETREQ bit in the AIRCR register. This
matches the hardware design (where the CPU has a signal of this name
and it is up to the SoC to connect that up to an actual reset
mechanism), but in QEMU it mostly results in duplicated code in SoC
objects and bugs where SoC model implementors forget to wire up the
SYSRESETREQ line.
Provide a default behaviour for the case where SYSRESETREQ is not
actually connected to anything: use qemu_system_reset_request() to
perform a system reset. This will allow us to remove the
implementations of SYSRESETREQ handling from the boards where that's
exactly what it does, and also fixes the bugs in the board models
which forgot to wire up the signal:
We still allow the board to wire up the signal if it needs to, in case
we need to model more complicated reset controller logic or to model
buggy SoC hardware which forgot to wire up the line itself. But
defaulting to "reset the system" is more often going to be correct
than defaulting to "do nothing".
Peter Maydell [Mon, 3 Aug 2020 16:55:03 +0000 (17:55 +0100)]
include/hw/irq.h: New function qemu_irq_is_connected()
Mostly devices don't need to care whether one of their output
qemu_irq lines is connected, because functions like qemu_set_irq()
silently do nothing if there is nothing on the other end. However
sometimes a device might want to implement default behaviour for the
case where the machine hasn't wired the line up to anywhere.
Provide a function qemu_irq_is_connected() that devices can use for
this purpose. (The test is trivial but encapsulating it in a
function makes it easier to see where we're doing it in case we need
to change the implementation later.)
Peter Maydell [Mon, 3 Aug 2020 16:55:03 +0000 (17:55 +0100)]
hw/arm/netduino2, netduinoplus2: Set system_clock_scale
The netduino2 and netduinoplus2 boards forgot to set the system_clock_scale
global, which meant that if guest code used the systick timer in "use
the processor clock" mode it would hang because time never advances.
Set the global to match the documented CPU clock speed of these boards.
Judging by the data sheet this is slightly simplistic because the
SoC allows configuration of the SYSCLK source and frequency via the
RCC (reset and clock control) module, but we don't model that.
Max Reitz [Thu, 30 Jul 2020 12:02:33 +0000 (14:02 +0200)]
qcow2: Release read-only bitmaps when inactivated
During migration, we release all bitmaps after storing them on disk, as
long as they are (1) stored on disk, (2) not read-only, and (3)
consistent.
(2) seems arbitrary, though. The reason we do not release them is
because we do not write them, as there is no need to; and then we just
forget about all bitmaps that we have not written to the file. However,
read-only persistent bitmaps are still in the file and in sync with
their in-memory representation, so we may as well release them just like
any R/W bitmap that we have updated.
It leads to actual problems, too: After migration, letting the source
continue may result in an error if there were any bitmaps on read-only
nodes (such as backing images), because those have not been released by
bdrv_inactive_all(), but bdrv_invalidate_cache_all() attempts to reload
them (which fails, because they are still present in memory).
Andrea Bolognani [Wed, 29 Jul 2020 18:50:24 +0000 (20:50 +0200)]
schemas: Add vim modeline
The various schemas included in QEMU use a JSON-based format which
is, however, strictly speaking not valid JSON.
As a consequence, when vim tries to apply syntax highlight rules
for JSON (as guessed from the file name), the result is an unreadable
mess which mostly consist of red markers pointing out supposed errors
in, well, pretty much everything.
Using Python syntax highlighting produces much better results, and
in fact these files already start with specially-formatted comments
that instruct Emacs to process them as if they were Python files.
This commit adds the equivalent special comments for vim.
Peter Maydell [Wed, 29 Jul 2020 19:10:19 +0000 (20:10 +0100)]
qapi/machine.json: Fix missing newline in doc comment
In commit 176d2cda0dee9f4 we added the @die-id field
to the CpuInstanceProperties struct, but in the process
accidentally removed the newline between the doc-comment
lines for @core-id and @thread-id.
Put the newline back in; this fixes a misformatting in the
generated HTML QMP reference manual.
Stefan Hajnoczi [Wed, 29 Jul 2020 15:39:26 +0000 (16:39 +0100)]
tracetool: carefully define SDT_USE_VARIADIC
The dtrace backend defines SDT_USE_VARIADIC as a workaround for a
conflict with a LTTng UST header file, which requires SDT_USE_VARIADIC
to be defined.
LTTng UST <lttng/tracepoint.h> breaks if included after generated dtrace
headers because SDT_USE_VARIADIC will already be defined:
Halil Pasic [Thu, 30 Jul 2020 13:01:56 +0000 (15:01 +0200)]
s390x/s390-virtio-ccw: fix off-by-one in loadparm getter
As pointed out by Peter, g_memdup(ms->loadparm, sizeof(ms->loadparm) + 1)
reads one past of the end of ms->loadparm, so g_memdup() can not be used
here.
trace/simple: Allow enabling simple traces from command line
The simple trace backend is enabled / disabled with a call
to st_set_trace_file_enabled(). When initializing tracing
from the command-line, this must be enabled on startup.
(Prior to db25d56c014aa1a9, command-line initialization of
simple trace worked because every call to st_set_trace_file
enabled tracing.)
Peter Maydell [Tue, 28 Jul 2020 19:43:03 +0000 (20:43 +0100)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2020-07-28' into staging
nbd patches for 2020-07-28
- fix NBD handling of trim/zero requests larger than 2G
- allow no-op resizes on NBD (in turn fixing qemu-img convert -c into NBD)
- several deadlock fixes when using NBD reconnect
* remotes/ericb/tags/pull-nbd-2020-07-28:
block/nbd: nbd_co_reconnect_loop(): don't sleep if drained
block/nbd: on shutdown terminate connection attempt
block/nbd: allow drain during reconnect attempt
block/nbd: split nbd_establish_connection out of nbd_client_connect
iotests: Test convert to qcow2 compressed to NBD
iotests: Add more qemu_img helpers
iotests: Make qemu_nbd_popen() a contextmanager
block: nbd: Fix convert qcow2 compressed to nbd
nbd: Fix large trim/zero requests
Peter Maydell [Tue, 28 Jul 2020 17:43:48 +0000 (18:43 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20200727' into staging
target-arm queue:
* ACPI: Assert that we don't run out of the preallocated memory
* hw/misc/aspeed_sdmc: Fix incorrect memory size
* target/arm: Always pass cacheattr in S1_ptw_translate
* docs/system/arm/virt: Document 'mte' machine option
* hw/arm/boot: Fix PAUTH, MTE for EL3 direct kernel boot
* target/arm: Improve IMPDEF algorithm for IRG
* remotes/pmaydell/tags/pull-target-arm-20200727:
target/arm: Improve IMPDEF algorithm for IRG
hw/arm/boot: Fix MTE for EL3 direct kernel boot
hw/arm/boot: Fix PAUTH for EL3 direct kernel boot
docs/system/arm/virt: Document 'mte' machine option
target/arm: Always pass cacheattr in S1_ptw_translate
hw/misc/aspeed_sdmc: Fix incorrect memory size
ACPI: Assert that we don't run out of the preallocated memory
* remotes/maxreitz/tags/pull-block-2020-07-28:
iotests/197: Fix for non-qcow2 formats
iotests/028: Add test for cross-base-EOF reads
block: Fix bdrv_aligned_p*v() for qiov_offset != 0
Peter Maydell [Tue, 28 Jul 2020 15:28:22 +0000 (16:28 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
Want to send earlier but most patches just come.
- fix vhost-vdpa issues when no peer
- fix virtio-pci queue enabling index value
- forbid reentrant RX
Changes from V1:
- drop the patch that has been merged
# gpg: Signature made Tue 28 Jul 2020 09:59:41 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net: forbid the reentrant RX
virtio-net: check the existence of peer before accessing vDPA config
virtio-pci: fix wrong index in virtio_pci_queue_enabled
block/nbd: nbd_co_reconnect_loop(): don't sleep if drained
We try to go to wakeable sleep, so that, if drain begins it will break
the sleep. But what if nbd_client_co_drain_begin() already called and
s->drained is already true? We'll go to sleep, and drain will have to
wait for the whole timeout. Let's improve it.
block/nbd: on shutdown terminate connection attempt
On shutdown nbd driver may be in a connecting state. We should shutdown
it as well, otherwise we may hang in
nbd_teardown_connection, waiting for conneciton_co to finish in
BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down.
How to reproduce the dead lock:
1. Create nbd-fault-injector.conf with the following contents:
2. In one terminal run nbd-fault-injector in a loop, like this:
n=1; while true; do
echo $n; ((n++));
./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf;
done
3. In another terminal run qemu-io in a loop, like this:
n=1; while true; do
echo $n; ((n++));
./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000;
done
After some time, qemu-io will hang. Note, that this hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: allow drain during reconnect attempt".
It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.
How to reproduce the dead lock:
1. Create nbd-fault-injector.conf with the following contents:
2. In one terminal run nbd-fault-injector in a loop, like this:
n=1; while true; do
echo $n; ((n++));
./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf;
done
3. In another terminal run qemu-io in a loop, like this:
n=1; while true; do
echo $n; ((n++));
./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000;
done
After some time, qemu-io will hang trying to drain, for example, like
this:
#3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
util/aio-posix.c:600
#4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
#5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
#6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
#7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
#8 bdrv_open_inherit (filename=0x7fffba1563bc
"nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0,
flags=24578, parent=0x0, child_class=0x0, child_role=0,
errp=0x7fffba154620) at block.c:3491
#9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000",
reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
block.c:3513
#10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000",
reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
block/block-backend.c:421
And connection_co stack like this:
#0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
#1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
#2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
io/channel.c:472
#3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
#4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
#5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
"\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
io/channel.c:247
#6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
/work/src/qemu/master/include/block/nbd.h:365
#7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
/work/src/qemu/master/include/block/nbd.h:391
#8 nbd_start_negotiate (aio_context=0x55f006bdd890,
ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
outioc=0x55f006bf19f8, structured_reply=true,
zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
#9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
nbd/client.c:1032
#10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
block/nbd.c:1460
#11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
#12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
#13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
#14 coroutine_trampoline (i0=113190480, i1=22000) at
util/coroutine-ucontext.c:173
Note, that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".
block/nbd: split nbd_establish_connection out of nbd_client_connect
We are going to implement non-blocking version of
nbd_establish_connection, which for a while will be used only for
nbd_reconnect_attempt, not for nbd_open, so we need to call it
separately.
Refactor nbd_reconnect_attempt in a way which makes next commit
simpler.
Add test for "qemu-img convert -O qcow2 -c" to NBD target. The tests
create a OVA file and write compressed qcow2 disk content directly into
the OVA file via qemu-nbd.
Add 2 helpers for measuring and checking images:
- qemu_img_measure()
- qemu_img_check()
Both use --output-json and parse the returned json to make easy to use
in other tests. I'm going to use them in a new test, and I hope they
will be useful in may other tests.
Instead of duplicating the code to wait until the server is ready and
remember to terminate the server and wait for it, make it possible to
use like this:
with qemu_nbd_popen('-k', sock, image):
# Access image via qemu-nbd socket...
Only test 264 used this helper, but I had to modify the output since it
did not consistently when starting and stopping qemu-nbd.
When converting to qcow2 compressed format, the last step is a special
zero length compressed write, ending in a call to bdrv_co_truncate(). This
call always fails for the nbd driver since it does not implement
bdrv_co_truncate().
For block devices, which have the same limits, the call succeeds since
the file driver implements bdrv_co_truncate(). If the caller asked to
truncate to the same or smaller size with exact=false, the truncate
succeeds. Implement the same logic for nbd.
$ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters
Image end offset: 393216
$ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock
Images are identical.
Dr. David Alan Gilbert (1):
ip_stripoptions use memmove
Jindrich Novy (4):
Fix possible infinite loops and use-after-free
Use secure string copy to avoid overflow
Be sure to initialize sockaddr structure
Check lseek() for failure
Marc-André Lureau (2):
util: do not silently truncate
Merge branch 'stable-4.2' into 'stable-4.2'
Philippe Mathieu-Daudé (3):
Fix win32 builds by using the SLIRP_PACKED definition
Fix constness warnings
Remove unnecessary break
Ralf Haferkamp (2):
Drop bogus IPv6 messages
Fix MTU check
We are having issues debugging and bisecting this issue that happen
mostly on patchew. Let's make it abort where it failed to gather some
new informations.
Peter Maydell [Tue, 28 Jul 2020 14:24:31 +0000 (15:24 +0100)]
Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2020-07-27-tag' into staging
qemu-ga patch queue for hard-freeze
* document use of -1 when pci_controller field can't be retrieved for
guest-get-fsinfo
* fix incorrect filesystem type reporting on w32 for guest-get-fsinfo
when a volume is not mounted
Eric Blake [Wed, 22 Jul 2020 21:22:31 +0000 (16:22 -0500)]
nbd: Fix large trim/zero requests
Although qemu as NBD client limits requests to <2G, the NBD protocol
allows clients to send requests almost all the way up to 4G. But
because our block layer is not yet 64-bit clean, we accidentally wrap
such requests into a negative size, and fail with EIO instead of
performing the intended operation.
The bug is visible in modern systems with something as simple as:
Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0
on success, this is also a good time to fix our code to a more robust
paradigm that treats all non-negative values as success.
Alas, our iotests do not currently make it easy to add external
dependencies on blkdiscard or nbdsh, so we have to rely on manual
testing for now.
This patch can be reverted when we later improve the overall block
layer to be 64-bit clean, but for now, a minimal fix was deemed less
risky prior to release.
Peter Maydell [Tue, 28 Jul 2020 13:38:17 +0000 (14:38 +0100)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-bitmaps-2020-07-27' into staging
bitmaps patches for 2020-07-27
- Improve handling of various post-copy bitmap migration scenarios. A lost
bitmap should merely mean that the next backup must be full rather than
incremental, rather than abruptly breaking the entire guest migration.
- Associated iotest improvements
* remotes/ericb/tags/pull-bitmaps-2020-07-27: (24 commits)
migration: Fix typos in bitmap migration comments
iotests: Adjust which migration tests are quick
qemu-iotests/199: add source-killed case to bitmaps postcopy
qemu-iotests/199: add early shutdown case to bitmaps postcopy
qemu-iotests/199: check persistent bitmaps
qemu-iotests/199: prepare for new test-cases addition
migration/savevm: don't worry if bitmap migration postcopy failed
migration/block-dirty-bitmap: cancel migration on shutdown
migration/block-dirty-bitmap: relax error handling in incoming part
migration/block-dirty-bitmap: keep bitmap state for all bitmaps
migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
migration/block-dirty-bitmap: rename finish_lock to just lock
migration/block-dirty-bitmap: refactor state global variables
migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
migration/block-dirty-bitmap: rename state structure types
migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
qemu-iotests/199: increase postcopy period
qemu-iotests/199: change discard patterns
qemu-iotests/199: improve performance: set bitmap by discard
...
Max Reitz [Tue, 28 Jul 2020 13:11:34 +0000 (15:11 +0200)]
iotests/197: Fix for non-qcow2 formats
While 197 is very much a qcow2 test, and it looks like the partial
cluster case at the end (introduced in b0ddcbbb36a66a6) is specifically
a qcow2 case, the whole test scripts actually marks itself to work with
generic formats (and generic protocols, even).
Said partial cluster case happened to work with non-qcow2 formats as
well (mostly by accident), but 1855536256 broke that, because it sets
the compat option, which does not work for non-qcow2 formats.
So go the whole way and force IMGFMT=qcow2 and IMGPROTO=file, as done in
other places in this test.
Max Reitz [Tue, 28 Jul 2020 12:08:04 +0000 (14:08 +0200)]
block: Fix bdrv_aligned_p*v() for qiov_offset != 0
Since these functions take a @qiov_offset, they must always take it into
account when working with @qiov. There are a couple of places where
they do not, but they should.
Jason Wang [Wed, 22 Jul 2020 08:57:46 +0000 (16:57 +0800)]
net: forbid the reentrant RX
The memory API allows DMA into NIC's MMIO area. This means the NIC's
RX routine must be reentrant. Instead of auditing all the NIC, we can
simply detect the reentrancy and return early. The queue->delivering
is set and cleared by qemu_net_queue_deliver() for other queue helpers
to know whether the delivering in on going (NIC's receive is being
called). We can check it and return early in qemu_net_queue_flush() to
forbid reentrant RX.
Jason Wang [Sat, 25 Jul 2020 00:13:17 +0000 (08:13 +0800)]
virtio-net: check the existence of peer before accessing vDPA config
We try to check whether a peer is VDPA in order to get config from
there - with no peer, this leads to a NULL
pointer dereference. Add a check before trying to access the peer
type. No peer means not VDPA.
* remotes/maxreitz/tags/pull-block-2020-07-27:
iotests/197: Fix for compat=0.10
iotests: Select a default machine for the rx and avr targets
block/amend: Check whether the node exists
Thomas Huth [Wed, 22 Jul 2020 04:40:25 +0000 (06:40 +0200)]
qga/qapi-schema: Document -1 for invalid PCI address fields
The "guest-get-fsinfo" could also be used for non-PCI devices in the
future. And the code in GuestPCIAddress() in qga/commands-win32.c seems
to be using "-1" for fields that it can not determine already. Thus
let's properly document "-1" as value for invalid PCI address fields.
qga-win: fix "guest-get-fsinfo" wrong filesystem type
This patch handles the case where unmounted volumes exist,
where in that case GetVolumePathNamesForVolumeName returns
empty path, GetVolumeInformation will use the current working
directory instead.
This patch fixes the issue by opening a handle to the volumes,
and using GetVolumeInformationByHandleW instead.
Eric Blake [Mon, 27 Jul 2020 19:51:17 +0000 (14:51 -0500)]
iotests: Adjust which migration tests are quick
A quick run of './check -qcow2 -g migration' shows that test 169 is
NOT quick, but meanwhile several other tests ARE quick. Let's adjust
the test designations accordingly.
qemu-iotests/199: add source-killed case to bitmaps postcopy
Previous patches fixes behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps, and not fail or try to
recover postcopy migration. Add corresponding test.
migration/savevm: don't worry if bitmap migration postcopy failed
First, if only bitmaps postcopy is enabled (and not ram postcopy)
postcopy_pause_incoming crashes on an assertion
assert(mis->to_src_file).
And anyway, bitmaps postcopy is not prepared to be somehow recovered.
The original idea instead is that if bitmaps postcopy failed, we just
lose some bitmaps, which is not critical. So, on failure we just need
to remove unfinished bitmaps and guest should continue execution on
destination.
migration/block-dirty-bitmap: cancel migration on shutdown
If target is turned off prior to postcopy finished, target crashes
because busy bitmaps are found at shutdown.
Canceling incoming migration helps, as it removes all unfinished (and
therefore busy) bitmaps.
Similarly on source we crash in bdrv_close_all which asserts that all
bdrv states are removed, because bdrv states involved into dirty bitmap
migration are referenced by it. So, we need to cancel outgoing
migration as well.
migration/block-dirty-bitmap: relax error handling in incoming part
Bitmaps data is not critical, and we should not fail the migration (or
use postcopy recovering) because of dirty-bitmaps migration failure.
Instead we should just lose unfinished bitmaps.
Still we have to report io stream violation errors, as they affect the
whole migration stream.
While touching this, tighten code that was previously blindly calling
malloc on a size read from the migration stream, as a corrupted stream
(perhaps from a malicious user) should not be able to convince us to
allocate an inordinate amount of memory.
migration/block-dirty-bitmap: keep bitmap state for all bitmaps
Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.
To clean-up the new list the following logic is used:
We need two events to consider bitmap migration finished:
1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received
2. dirty_bitmap_mig_before_vm_start should be called
These two events may come in any order, so we understand which one is
last, and on the last of them we remove bitmap migration state from the
list.
bdrv_enable_dirty_bitmap_locked() call does nothing, as if we are in
postcopy, bitmap successor must be enabled, and reclaim operation will
enable the bitmap.
So, actually we need just call _reclaim_ in both if branches, and
making differences only to add an assertion seems not really good. The
logic becomes simple: on load complete we do reclaim and that's all.
Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking
is wrong as we do not already own the mutex. Moreover, the adjacent
call to bdrv_dirty_bitmap_enable_successor grabs the mutex.
The test wants to force a bitmap postcopy. Still, the resulting
postcopy period is very small. Let's increase it by adding more
bitmaps to migrate. Also, test disabled bitmaps migration.
The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.
Test considers migrate status=complete event on source as start of
postcopy. This is completely wrong, completion is completion of the
whole migration process. Let's instead consider destination start as
start of postcopy, and use RESUME event for it.
Next, as migration finish, let's use migration status=complete event on
target, as such method is closer to what libvirt or another user will
do, than tracking number of dirty-bitmaps.
Finally, add a possibility to dump events for debug. And if
set debug to True, we see, that actual postcopy period is very small
relatively to the whole test duration time (~0.2 seconds to >40 seconds
for me). This means, that test is very inefficient in what it supposed
to do. Let's improve it in following commits.
We don't need any specific format constraints here. Still keep qcow2
for two reasons:
1. No extra calls of format-unrelated test
2. Add some check around persistent bitmap in future (require qcow2)
Andreas Schwab [Thu, 23 Jul 2020 10:27:13 +0000 (12:27 +0200)]
linux-user: Use getcwd syscall directly
The glibc getcwd function returns different errors than the getcwd
syscall, which triggers an assertion failure in the glibc getcwd function
when running under the emulation.
When the syscall returns ENAMETOOLONG, the glibc wrapper uses a fallback
implementation that potentially handles an unlimited path length, and
returns with ERANGE if the provided buffer is too small. The qemu
emulation cannot distinguish the two cases, and thus always returns ERANGE.
This is unexpected by the glibc wrapper.
Implementation of 'rt_sigtimedwait()' in 'syscall.c' uses the
function 'target_to_host_timespec()' to transfer the value of
'struct timespec' from target to host. However, the implementation
doesn't check whether this conversion succeeds and thus can cause
an unaproppriate error instead of the 'EFAULT (Bad address)' which
is supposed to be set if the conversion from target to host fails.
This was confirmed with the LTP test for rt_sigtimedwait:
"/testcases/kernel/syscalls/rt_sigtimedwait/rt_sigtimedwait01.c"
which causes an unapropriate error in test case "test_bad_adress3"
which is run with a bad adress for the 'struct timespec' argument:
When the chroot does not have /proc mounted, we can read neither
/proc/sys/vm/mmap_min_addr nor /proc/sys/maps.
The enforcement of mmap_min_addr in the host kernel is done by
the security module, and so does not apply to processes owned
by root. Which leads pgd_find_hole_fallback to succeed in probing
a reservation at address 0. Which confuses pgb_reserved_va to
believe that guest_base has not actually been initialized.
We don't actually want NULL addresses to become accessible, so
make sure that mmap_min_addr is initialized with a non-zero value.
Peter Maydell [Mon, 27 Jul 2020 20:00:01 +0000 (21:00 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,pci: bugfixes
Minor bugfixes all over the places, including one CVE.
Additionally, a fix for an ancient bug in migration -
one has to wonder how come no one noticed.
The fix is also non-trivial since we dare not break all
existing machine types with pci - we have a work around
in the works, for now we just skip the work-around for
old machine types.
Great job by Hogan Wang noticing, debugging and fixing it,
and thanks to Dr. David Alan Gilbert for reviewing the patches.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Mon 27 Jul 2020 16:34:58 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "[email protected]"
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>" [full]
# gpg: aka "Michael S. Tsirkin <[email protected]>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio-pci: fix virtio_pci_queue_enabled()
MAINTAINERS: Cover the firmware JSON schema
vhost-vdpa :Fix Coverity CID 1430270 / CID 1420267
libvhost-user: Report descriptor index on panic
Fix vhost-user buffer over-read on ram hot-unplug
hw/pci-host: save/restore pci host config register
virtio-mem-pci: force virtio version 1
In legacy mode, virtio_pci_queue_enabled() falls back to
virtio_queue_enabled() to know if the queue is enabled.
But virtio_queue_enabled() calls again virtio_pci_queue_enabled()
if k->queue_enabled is set. This ends in a crash after a stack
overflow.
The problem can be reproduced with
"-device virtio-net-pci,disable-legacy=off,disable-modern=true
-net tap,vhost=on"
And a look to the backtrace is very explicit:
...
#4 0x000000010029a438 in virtio_queue_enabled ()
#5 0x0000000100497a9c in virtio_pci_queue_enabled ()
...
#130902 0x000000010029a460 in virtio_queue_enabled ()
#130903 0x0000000100497a9c in virtio_pci_queue_enabled ()
#130904 0x000000010029a460 in virtio_queue_enabled ()
#130905 0x0000000100454a20 in vhost_net_start ()
...
This patch fixes the problem by introducing a new function
for the legacy case and calls it from virtio_pci_queue_enabled().
It also calls it from virtio_queue_enabled() to avoid code duplication.
When GCR_EL1.RRND==1, the choosing of the random value is IMPDEF,
and the kernel is not expected to have set RGSR_EL1. Force a
non-zero value into SEED, so that we do not continually return
the same tag.
When booting an EL3 cpu with -kernel, we set up EL3 and then
drop down to EL2. We need to enable access to v8.5-MemTag
tag allocation at EL3 before doing so.
When booting an EL3 cpu with -kernel, we set up EL3 and then
drop down to EL2. We need to enable access to v8.3-PAuth
keys and instructions at EL3 before doing so.
Commit 6a0b7505f1fd6769c which added documentation of the virt board
crossed in the post with commit 6f4e1405b91da0d0 which added a new
'mte' machine option. Update the docs to include the new option.
target/arm: Always pass cacheattr in S1_ptw_translate
When we changed the interface of get_phys_addr_lpae to require
the cacheattr parameter, this spot was missed. The compiler is
unable to detect the use of NULL vs the nonnull attribute here.
The SDRAM Memory Controller has a 32-bit address bus, thus
supports up to 4 GiB of DRAM. There is a signed to unsigned
conversion error with the AST2600 maximum memory size:
Peter Maydell [Mon, 27 Jul 2020 14:55:56 +0000 (15:55 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-fixes-for-rc2-270720-1' into staging
Various fixes for rc2:
- get shippable working again
- semihosting bug fixes
- tweak tb-size handling for low memory machines
- i386 compound literal float fix
- linux-user MAP_FIXED->MAP_NOREPLACE on fallback
- docker binfmt_misc fixes
- linux-user nanosleep fix
- tests/vm drain console fixes
# gpg: Signature made Mon 27 Jul 2020 09:45:31 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-fixes-for-rc2-270720-1:
tests/vm: add shutdown timeout in basevm.py
python/qemu: Change ConsoleSocket to optionally drain socket.
python/qemu: Cleanup changes to ConsoleSocket
linux-user, ppc: fix clock_nanosleep() for linux-user-ppc
linux-user: fix clock_nanosleep()
tests/docker: add support for DEB_KEYRING
tests/docker: fix binfmt_misc image building
tests/docker: fix update command due to python3 str/bytes distinction
linux-user: don't use MAP_FIXED in pgd_find_hole_fallback
target/i386: floatx80: avoid compound literals in static initializers
accel/tcg: better handle memory constrained systems
util/oslib-win32: add qemu_get_host_physmem implementation
util: add qemu_get_host_physmem utility function
semihosting: don't send the trailing '\0'
semihosting: defer connect_chardevs a little more to use serialx
shippable: add one more qemu to registry url
Max Reitz [Mon, 27 Jul 2020 13:52:37 +0000 (15:52 +0200)]
iotests/197: Fix for compat=0.10
Writing zeroes to a qcow2 v2 images without a backing file results in an
unallocated cluster as of 61b3043965. 197 has a test for COR-ing a
cluster on an image without a backing file, which means that the data
will be zero, so now on a v2 image that cluster will just stay
unallocated, and so the test fails. Just force compat=1.1 for that
particular case to enforce the cluster to get allocated.
The VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS vhost-user protocol
feature introduced a shadow-table, used by the backend to dynamically
determine how a vdev's memory regions have changed since the last
vhost_user_set_mem_table() call. On hot-remove, a memmove() operation
is used to overwrite the removed shadow region descriptor(s). The size
parameter of this memmove was off by 1 such that if a VM with a backend
supporting the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS filled it's
shadow-table (by performing the maximum number of supported hot-add
operatons) and attempted to remove the last region, Qemu would read an
out of bounds value and potentially crash.
This change fixes the memmove() bounds such that this erroneous read can
never happen.
The pci host config register is used to save PCI address for
read/write config data. If guest writes a value to config register,
and then QEMU pauses the vcpu to migrate, after the migration, the guest
will continue to write pci config data, and the write data will be ignored
because of new qemu process losing the config register state.
To trigger the bug:
1. guest is booting in seabios.
2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
expects to disable the SMRAM by pci_config_writeb.
3. after guest writes the pci host config register, QEMU pauses vcpu
to finish migration.
4. guest write of config data(0x0A) fails to disable the SMRAM because
the config register state is lost.
5. guest continues to boot and crashes in ipxe option ROM due to SMRAM
in enabled state.
Example Reproducer:
step 1. Make modifications to seabios and qemu for increase reproduction
efficiency, write 0xf0 to 0x402 port notify qemu to stop vcpu after
0x0cf8 port wrote i440 configure register. qemu stop vcpu when catch
0x402 port wrote 0xf0.
+ if (ch == 0xf0) {
+ vm_stop(RUN_STATE_PAUSED);
+ }
/* XXX this blocks entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(&s->chr, &ch, 1);
step 2. start vm1 by the following command line, and then vm stopped.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio
step 4. execute the following qmp command in vm1 to migrate.
(qemu) migrate tcp:127.0.0.1:8000
step 5. execute the following qmp command in vm2 to resume vcpu.
(qemu) cont
Before this patch, we get KVM "emulation failure" error on vm2.
This patch fixes it.
Trying to run simple virtio-mem-pci examples currently fails with
qemu-system-x86_64: -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,
requested-size=300M: device is modern-only, use disable-legacy=on
due to the added safety checks in 9b3a35ec8236 ("virtio: verify that legacy
support is not accidentally on").
As noted by Conny, we have to force virtio version 1. While at it, use
qdev_realize() to set the parent bus and realize - like most other
virtio-*-pci implementations.
Thomas Huth [Wed, 22 Jul 2020 16:19:08 +0000 (18:19 +0200)]
iotests: Select a default machine for the rx and avr targets
If you are building only with either the new rx-softmmu or avr-softmmu
target, "make check-block" fails a couple of tests since there is no
default machine defined in these new targets. We have to select a machine
in the "check" script for these, just like we already do for the arm- and
tricore-softmmu targets.
Max Reitz [Fri, 10 Jul 2020 09:50:37 +0000 (11:50 +0200)]
block/amend: Check whether the node exists
We should check whether the user-specified node-name actually refers to
a node. The simplest way to do that is to use bdrv_lookup_bs() instead
of bdrv_find_node() (the former wraps the latter, and produces an error
message if necessary).