Kevin Wolf [Wed, 13 Jun 2018 09:01:30 +0000 (11:01 +0200)]
block: Remove deprecated -drive option addr
This reinstates commit eae3bd1eb7c6b105d30ec06008b3bc3dfc5f45bb,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.
The -drive option addr was deprecated in QEMU 2.10. It's time to remove
it.
Kevin Wolf [Wed, 13 Jun 2018 09:01:30 +0000 (11:01 +0200)]
block: Remove deprecated -drive geometry options
This reinstates commit a7aff6dd10b16b67e8b142d0c94c5d92c3fe88f6,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.
The -drive options cyls, heads, secs and trans were deprecated in
QEMU 2.10. It's time to remove them.
hd-geo-test tested both the old version with geometry options in -drive
and the new one with -device. Therefore the code using -drive doesn't
have to be replaced there, we just need to remove the -drive test cases.
This in turn allows some simplification of the code.
Fam Zheng [Tue, 14 Aug 2018 07:25:51 +0000 (15:25 +0800)]
luks: Allow share-rw=on
Format drivers such as qcow2 don't allow sharing the same image between
two QEMU instances in order to prevent image corruptions, because of
metadata cache. LUKS driver don't modify metadata except for when
creating image, so it is safe to relax the permission. This makes
share-rw=on property work on virtual devices.
Alberto Garcia [Thu, 2 Aug 2018 14:50:26 +0000 (17:50 +0300)]
throttle-groups: Don't allow timers without throttled requests
Commit 6fccbb475bc6effc313ee9481726a1748b6dae57 fixed a bug caused by
QEMU attempting to remove a throttle group member with no pending
requests but an active timer set. This was the result of a previous
bdrv_drained_begin() call processing the throttled requests but
leaving the timer untouched.
Although the commit does solve the problem, the situation shouldn't
happen in the first place. If we try to drain a throttle group member
which has a timer set, we should cancel the timer instead of ignoring
it.
Alberto Garcia [Thu, 2 Aug 2018 14:50:25 +0000 (17:50 +0300)]
qemu-iotests: Update 093 to improve the draining test
The previous patch fixes a problem in which draining a block device
with more than one throttled request can make it wait first for the
completion of requests in other members of the same group.
This patch updates test_remove_group_member() in iotest 093 to
reproduce that scenario. This updated test would hang QEMU without the
fix from the previous patch.
Alberto Garcia [Thu, 2 Aug 2018 14:50:24 +0000 (17:50 +0300)]
throttle-groups: Skip the round-robin if a member is being drained
In the throttling code after an I/O request has been completed the
next one is selected from a different member using a round-robin
algorithm. This ensures that all members get a chance to finish their
pending I/O requests.
However, if a group member has its I/O limits disabled (because it's
being drained) then we should always give it priority in order to have
all its pending requests finished as soon as possible.
If we don't do this we could have a member in the process of being
drained waiting for the throttled requests of other members, for which
the I/O limits still apply.
This can have additional consequences: if we're running in qtest mode
(with QEMU_CLOCK_VIRTUAL) then timers can only fire if we advance the
clock manually, so attempting to drain a block device can hang QEMU in
the BDRV_POLL_WHILE() loop at the end of bdrv_do_drained_begin().
Alberto Garcia [Thu, 2 Aug 2018 14:50:23 +0000 (17:50 +0300)]
qemu-iotests: Test removing a throttle group member with a pending timer
A throttle group can have several members, and each one of them can
have several pending requests in the queue.
The requests are processed in a round-robin fashion, so the algorithm
decides the drive that is going to run the next request and sets a
timer in it. Once the timer fires and the throttled request is run
then the next drive from the group is selected and a new timer is set.
If the user tried to remove a drive from a group and that drive had a
timer set then the code was not taking care of setting up a new timer
in one of the remaining members of the group, freezing their I/O.
virtio-gpu: fix crashes upon warm reboot with vga mode
With vga=775 on the Linux command line a first boot of the VM running
Linux works fine. After a warm reboot it crashes during Linux boot.
Before that, valgrind points out bad memory write to console
surface. The VGA code is not aware that virtio-gpu got a message
surface scanout when the display is disabled. Let's reset VGA graphic
mode when it is the case, so that a new display surface is created
when doing further VGA operations.
Peter Maydell [Tue, 7 Aug 2018 11:45:01 +0000 (12:45 +0100)]
slirp: Correct size check in m_inc()
The data in an mbuf buffer is not necessarily at the start of the
allocated buffer. (For instance m_adj() allows data to be trimmed
from the start by just advancing the pointer and reducing the length.)
This means that the allocated buffer size (m->m_size) and the
amount of space from the m_data pointer to the end of the
buffer (M_ROOM(m)) are not necessarily the same.
Commit 864036e251f54c9 tried to change the m_inc() function from
taking the new allocated-buffer-size to taking the new room-size,
but forgot to change the initial "do we already have enough space"
check. This meant that if we were trying to extend a buffer which
had a leading gap between the buffer start and the data, we might
incorrectly decide it didn't need to be extended, and then
overrun the end of the buffer, causing memory corruption and
an eventual crash.
Change the "already big enough?" condition from checking the
argument against m->m_size to checking against M_ROOM().
This only makes a difference for the callsite in m_cat();
the other three callsites all start with a freshly allocated
mbuf from m_get(), which will have m->m_size == M_ROOM(m).
Thomas Huth [Thu, 19 Jul 2018 13:02:00 +0000 (15:02 +0200)]
target/xtensa/cpu: Set owner of memory region in xtensa_cpu_initfn
The instance_init function of the xtensa CPUs creates a memory region,
but does not set an owner, so the memory region is not destroyed
correctly when the CPU object is removed. This can happen when
introspecting the CPU devices, so introspecting the CPU device will
leave a dangling memory region object in the QOM tree. Make sure to
set the right owner here to fix this issue.
Peter Maydell [Mon, 6 Aug 2018 12:34:45 +0000 (13:34 +0100)]
hw/intc/arm_gicv3_common: Move gicd shift bug handling to gicv3_post_load
The code currently in gicv3_gicd_no_migration_shift_bug_post_load()
that handles migration from older QEMU versions with a particular
bug is misplaced. We need to run this after migration in all cases,
not just the cases where the "arm_gicv3/gicd_no_migration_shift_bug"
subsection is present, so it must go in a post_load hook for the
top level VMSD, not for the subsection. Move it.
Peter Maydell [Mon, 6 Aug 2018 12:34:44 +0000 (13:34 +0100)]
hw/intc/arm_gicv3_common: Move post_load hooks to top-level VMSD
Contrary to the the impression given in docs/devel/migration.rst,
the migration code does not run the pre_load hook for a
subsection unless the subsection appears on the wire, and so
this is not a place where you can set the default value for
state for the "subsection not present" case. Instead this needs
to be done in a pre_load hook for whatever is the parent VMSD
of the subsection.
We got this wrong in two of the subsection definitions in
the GICv3 migration structs; fix this.
Peter Maydell [Mon, 6 Aug 2018 12:34:43 +0000 (13:34 +0100)]
target/arm: Add dummy needed functions to M profile vmstate subsections
Currently the migration code incorrectly treats a subsection with
no .needed function pointer as if it was the subsection list
terminator -- it is ignored and so is everything after it.
Work around this by giving various M profile vmstate structs
a 'needed' function that always returns true.
We reuse m_needed() for this, since it's always true here.
Peter Maydell [Mon, 6 Aug 2018 12:34:42 +0000 (13:34 +0100)]
hw/intc/arm_gicv3_common: Combine duplicate .subsections in vmstate_gicv3_cpu
Commit 6692aac411199064 accidentally introduced a second initialization
of the .subsections field of vmstate_gicv3_cpu, instead of adding
the new subsection to the existing list. The effect of this was
probably that migration of GICv3 with virtualization enabled was
broken (or alternatively that migration of ICC_SRE_EL1 was broken,
depending on which of the two initializers the compiler used).
Combine the two into a single list.
Peter Maydell [Mon, 6 Aug 2018 12:34:41 +0000 (13:34 +0100)]
hw/intc/arm_gicv3_common: Give no-migration-shift-bug subsection a needed function
Currently the migration code incorrectly treats a subsection with
no .needed function pointer as if it was the subsection list
terminator -- it is ignored and so is everything after it.
Work around this by giving vmstate_gicv3_gicd_no_migration_shift_bug
a 'needed' function that always returns true.
Peter Maydell [Mon, 6 Aug 2018 08:59:05 +0000 (09:59 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, virtio: fixes
A couple of last minute fixes.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Fri 03 Aug 2018 09:35:54 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
tests/acpi: update tables after memory hotplug changes
pc: acpi: fix memory hotplug regression by reducing stub SRAT entry size
tests/acpi-test: update ACPI tables test blobs
hw/acpi-build: Add a check for memory-less NUMA nodes
vhost: check region type before casting
Check region type first before casting the memory region
to IOMMUMemoryRegion. Otherwise QEMU will abort with below
error message when casting non-IOMMU memory region:
vhost_iommu_region_add: Object 0x561f28bce4f0 is not an
instance of type qemu:iommu-memory-region
sam460ex: Fix PCI interrupts with multiple devices
The four interrupts of the PCI bus are connected to the same UIC pin
on the real Sam460ex. Evidence for this can be found in the UBoot
source for the Sam460ex in the Sam460ex.c file where
PCI_INTERRUPT_LINE is written. Change the ppc440_pcix model to behave
more like this.
This fixes the problem that can be observed when adding further PCI
cards that got their interrupt rotated to other interrupts than PCI
INT A. In particular, the bug was observed with an additional OHCI PCI
card or an ES1370 sound device.
Use the new function sysbus_init_child_obj() to initialize the objects
here, to get the reference counting of the objects right, so that they
are cleaned up correctly when the parent gets removed.
monitor: temporary fix for dead-lock on event recursion
With a Spice port chardev, it is possible to reenter
monitor_qapi_event_queue() (when the client disconnects for
example). This will dead-lock on monitor_lock.
Instead, use some TLS variables to check for recursion and queue the
events.
Fixes:
(gdb) bt
#0 0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2 0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
#3 0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645
#4 0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149
#5 0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235
#6 0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316
#7 0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197
#8 0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197
#9 0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414
#10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388
#11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347
#12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
#13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341
#14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
#15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
#16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353
#17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199
#18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112
#19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147
#20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42
#21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425
#22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468
#23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517
#24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538
#25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624
#26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649
#27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58
#28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822
#29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862
#30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644
Note that error report is now moved to the first caller, which may
receive an error for a recursed event. This is probably fine (95% of
callers use &error_abort, the rest have NULL error and ignore it)
* remotes/bonzini/tags/for-upstream:
backends/cryptodev: remove dead code
timer: remove replay clock probe in deadline calculation
i386: implement MSR_SMI_COUNT for TCG
i386: do not migrate MSR_SMI_COUNT on machine types <2.12
linux-user: ppc64: don't use volatile register during safe_syscall
r11 is a volatile register on PPC as per calling conventions.
The safe_syscall code uses it to check if the signal_pending
is set during the safe_syscall. When a syscall is interrupted
on return from signal handling, the r11 might be corrupted
before we retry the syscall leading to a crash. The registers
r0-r13 are not to be used here as they have
volatile/designated/reserved usages.
Change the code to use r14 which is non-volatile.
Use SP+16 which is a slot for LR, for save/restore of previous value
of r14. SP+16 can be used, as LR is preserved across the syscall.
Steps to reproduce:
On PPC host, issue `qemu-x86_64 /usr/bin/cc -E -`
Attempt Ctrl-C, the issue is reproduced.
Alex Bennée [Mon, 30 Jul 2018 13:43:20 +0000 (14:43 +0100)]
linux-user/mmap.c: handle invalid len maps correctly
I've slightly re-organised the check to more closely match the
sequence that the kernel uses in do_mmap(). We check for both the zero
case (EINVAL) and the overflow length case (ENOMEM).
Peter Maydell [Mon, 30 Jul 2018 18:11:57 +0000 (19:11 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- qemu-img convert -C is now required to enable copy offloading
- file-posix: Fix write_zeroes with unmap on block devices (would fall
back to explicit writes on recent kernels)
- Fix query-blockstats interface for use with -blockdev
- Minor fixes and documentation updates
# gpg: Signature made Mon 30 Jul 2018 16:08:14 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
qemu-iotests: Test query-blockstats with -drive and -blockdev
block/qapi: Include anonymous BBs in query-blockstats
block/qapi: Add 'qdev' field to query-blockstats result
file-posix: Fix write_zeroes with unmap on block devices
block: Fix documentation for BDRV_REQ_MAY_UNMAP
iotests: Add test for 'qemu-img convert -C' compatibility
qemu-img: Add -C option for convert with copy offloading
Revert "qemu-img: Document copy offloading implications with -S and -c"
iotests: Don't lock /dev/null in 226
docs: Describe using images in writing iotests
file-posix: Handle EINTR in preallocation=full write
qcow2: A grammar fix in conflicting cache sizing error message
qcow: fix a reference leak
Peter Maydell [Mon, 30 Jul 2018 16:27:54 +0000 (17:27 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180730' into staging
target-arm queue:
* arm/smmuv3: Fix broken VM state migration
* armv7m_nvic: Fix broken VM state migration
* hw/arm/sysbus-fdt: Fix assertion in copy_properties_from_host()
* hw/arm/iotkit: Fix IRQ number for timer1
* hw/misc/tz-mpc: Zero the LUT on initialization, not just reset
* target/arm: Remove duplicate 'host' entry in '-cpu ?' output
* remotes/pmaydell/tags/pull-target-arm-20180730:
target/arm: Remove duplicate 'host' entry in '-cpu ?' output
hw/misc/tz-mpc: Zero the LUT on initialization, not just reset
hw/arm/iotkit: Fix IRQ number for timer1
armv7m_nvic: Fix m-security subsection name
hw/arm/sysbus-fdt: Fix assertion in copy_properties_from_host()
arm/smmuv3: Fix missing VMSD terminator
We clamp down ram_size to match the sclp increment size. We do
not do the same for maxram_size, which means for large guests
with some sizes (e.g. -m 50000) maxram_size differs from ram_size.
This can break other code (e.g. CMMA migration) which uses maxram_size
to calculate the number of pages and then throws some errors.
Peter Maydell [Tue, 24 Jul 2018 15:36:16 +0000 (16:36 +0100)]
hw/misc/tz-mpc: Zero the LUT on initialization, not just reset
In the tz-mpc device we allocate a data block for the LUT,
which we then clear to zero in the device's reset method.
This is conceptually fine, but unfortunately results in a
valgrind complaint about use of uninitialized data on startup:
==30906== Conditional jump or move depends on uninitialised value(s)
==30906== at 0x503609: tz_mpc_translate (tz-mpc.c:439)
==30906== by 0x3F3D90: address_space_translate_iommu (exec.c:511)
==30906== by 0x3F3FF8: flatview_do_translate (exec.c:584)
==30906== by 0x3F4292: flatview_translate (exec.c:644)
==30906== by 0x3F2120: address_space_translate (memory.h:1962)
==30906== by 0x3FB753: address_space_ldl_internal (memory_ldst.inc.c:36)
==30906== by 0x3FB8A6: address_space_ldl (memory_ldst.inc.c:80)
==30906== by 0x619037: ldl_phys (memory_ldst_phys.inc.h:25)
==30906== by 0x61985D: arm_cpu_reset (cpu.c:255)
==30906== by 0x98791B: cpu_reset (cpu.c:249)
==30906== by 0x57FFDB: armv7m_reset (armv7m.c:265)
==30906== by 0x7B1775: qemu_devices_reset (reset.c:69)
This is because of a reset ordering problem -- the TZ MPC
resets after the CPU, but an M-profile CPU's reset function
includes memory loads to get the initial PC and SP, which
then go through an MPC that hasn't yet been reset.
The simplest fix for this is to zero the LUT when we
initialize the data, which will result in the MPC's
translate function giving the right answers for these
early memory accesses.
Peter Maydell [Fri, 27 Jul 2018 11:38:54 +0000 (12:38 +0100)]
hw/arm/iotkit: Fix IRQ number for timer1
A cut-and-paste error meant we were incorrectly wiring up the timer1
IRQ to IRQ3. IRQ3 is the interrupt for timer0 -- move timer0 to
IRQ4 where it belongs.
Peter Maydell [Fri, 27 Jul 2018 11:38:53 +0000 (12:38 +0100)]
armv7m_nvic: Fix m-security subsection name
The vmstate save/load code insists that subsections of a VMState must
have names which include their parent VMState's name as a leading
substring. Unfortunately it neither documents this nor checks it on
device init or state save, but instead fails state load with a
confusing error message ("Missing section footer for armv7m_nvic").
Fix the name of the m-security subsection of the NVIC, so that
state save/load works correctly for the security-enabled NVIC.
Kevin Wolf [Fri, 27 Jul 2018 14:09:25 +0000 (16:09 +0200)]
block/qapi: Include anonymous BBs in query-blockstats
Consistent with query-block, query-blockstats should not only include
named BlockBackends, but also those that are anonymous, but belong to a
device model.
Kevin Wolf [Fri, 27 Jul 2018 14:07:07 +0000 (16:07 +0200)]
block/qapi: Add 'qdev' field to query-blockstats result
Like for query-block, the client needs to identify which BlockBackend
the returned data is for. Anonymous BlockBackends are identified by the
device model they are attached to. Add a 'qdev' field that contains the
qdev ID or QOM path of the attached device model.
Kevin Wolf [Thu, 26 Jul 2018 09:28:30 +0000 (11:28 +0200)]
file-posix: Fix write_zeroes with unmap on block devices
The BLKDISCARD ioctl doesn't guarantee that the discarded blocks read as
all-zero afterwards, so don't try to abuse it for zero writing. We try
to only use this if BLKDISCARDZEROES tells us that it is safe, but this
is unreliable on older kernels and a constant 0 in newer kernels. In
other words, this code path is never actually used with newer kernels,
so we don't even try to unmap while writing zeros.
This patch removes the abuse of discard for writing zeroes from
file-posix and instead adds a new function that uses interfaces that are
actually meant to deallocate and zero out at the same time. Only if
those fail, it falls back to zeroing out without unmap. We never fall
back to a discard operation any more that may or may not result in
zeros.
Kevin Wolf [Wed, 25 Jul 2018 11:20:32 +0000 (13:20 +0200)]
block: Fix documentation for BDRV_REQ_MAY_UNMAP
BDRV_REQ_MAY_UNMAP in a write_zeroes request does not only allow the
driver to unmap the blocks, but it actively requests that the blocks be
unmapped afterwards if at all possible.
On my system (Fedora 28), this script reports a 'failed to get
"consistent read" lock' error. Following docs/devel/testing.rst, it's
better to add locking=off here.
This is because we reference bs twice in qcow_co_create(..) one time in
bdrv_open_blockdev_ref(..) and in blk_insert_bs(..) but we unref it only once
in blk_unref which leads to the reference leak.
Note that I didn't tested much QCOW after this change as I don't use it much.
Pavel Dovgalyuk [Wed, 25 Jul 2018 12:15:26 +0000 (15:15 +0300)]
timer: remove replay clock probe in deadline calculation
Ciro Santilli reported that commit a5ed352596a8b7eb2f9acce34371b944ac3056c4
breaks the execution replay. It happens due to the probing the clock
for the new instances of iothread.
However, this probing was made in replay mode for the timer lists that
are empty.
This patch removes clock probing in replay mode.
It is an artifact of the old version with another thread model.
Paolo Bonzini [Tue, 24 Jul 2018 11:59:21 +0000 (13:59 +0200)]
i386: do not migrate MSR_SMI_COUNT on machine types <2.12
MSR_SMI_COUNT started being migrated in QEMU 2.12. Do not migrate it
on older machine types, or the subsection causes a load failure for
guests that use SMM.
Peter Maydell [Mon, 30 Jul 2018 08:55:47 +0000 (09:55 +0100)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qobject-2018-07-27-v2' into staging
QObject patches for 2018-07-27 (3.0.0-rc3)
# gpg: Signature made Sat 28 Jul 2018 08:10:39 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qobject-2018-07-27-v2:
qstring: Move qstring_from_substr()'s @end one to the right
qstring: Assert size calculations don't overflow
qstring: Fix qstring_from_substr() not to provoke int overflow
qstring: Move qstring_from_substr()'s @end one to the right
qstring_from_substr() takes the index of the substring's first and
last character. qstring_from_substr(s, 0, SIZE_MAX) denotes an empty
substring. Awkward.
Shift the end index one to the right. This simplifies both
qstring_from_substr() and its callers.
qstring: Fix qstring_from_substr() not to provoke int overflow
qstring_from_substr() parameters @start and @end are of type int.
blkdebug_parse_filename(), blkverify_parse_filename(), nbd_parse_uri(),
and qstring_from_str() pass @end values of type size_t or ptrdiff_t.
Values exceeding INT_MAX get truncated, with possibly disastrous
results.
Such huge substrings seem unlikely, but we found one in a core dump,
where "info tlb" executed via QMP's human-monitor-command apparently
produced 35 GiB of output.
Peter Maydell [Tue, 24 Jul 2018 19:16:31 +0000 (20:16 +0100)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180724a' into staging
Migration pull for 3.0
Fixes only
# gpg: Signature made Tue 24 Jul 2018 19:31:39 BST
# gpg: using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20180724a:
migration: fix duplicate initialization for expected_downtime and cleanup_bh
tests: only update last_byte when at the edge
migration: disallow recovery for release-ram
migration: update recv bitmap only on dest vm
audio/hda: Fix migration
migrate: Fix cancelling state warning
migration: fix potential overflow in multifd send
When gnutls negotiates TLS 1.3 instead of 1.2, the order of messages
sent by the handshake changes. This exposed a logic bug in the test
suite which caused us to wait for the server to see handshake
completion, but not wait for the client to see completion. The result
was the client didn't receive the certificate for verification and the
test failed.
This is exposed in Fedora 29 rawhide which has just enabled TLS 1.3 in
its GNUTLS builds.
Most of the TLS related tests are passing an in a "Error" object to
methods that are expected to fail, but then ignoring any error that is
set and instead asserting on a return value. This means that when an
error is unexpectedly raised, no information about it is printed out,
making failures hard to diagnose. Changing these tests to pass in
&error_abort will make unexpected failures print messages to stderr.
tests: don't silence error reporting for all tests
The test-vmstate test is a bit chatty because it triggers various
expected failure scenarios and the code in question uses error_report
instead of accepting 'Error **errp' parameters. To silence this test the
stubs for error_vprintf() were changed to send errors via
g_test_message() instead of stderr:
Unfortunately this change has global impact across the entire test suite
and means that when tests fail for unexpected reasons, the message is
not displayed on stderr. eg when using &error_abort in a call the test
merely prints
Unexpected error in qcrypto_tls_session_check_certificate() at crypto/tlssession.c:280:
and the actual error message is hidden, making it impossible to diagnose
the failure. This is especially problematic in CI or build systems where
it isn't possible to easily pass the --debug-log flag to tests and
re-run with the test log visible.
This change makes the previous big hammer much more nuanced, providing a
flag in the stub error_vprintf() that can used on a per-test basis to
silence the errors. Only the test-vmstate silences errors initially.
Peter Xu [Mon, 23 Jul 2018 12:33:04 +0000 (20:33 +0800)]
tests: only update last_byte when at the edge
The only possible change of last_byte is when it reaches the edge.
Setting it every time might let last_byte contain an invalid data when
memory corruption is detected, then the check of the next byte will be
incorrect. For example, a single page corruption at address 0x14ad000
will also lead to a "fake" corruption at 0x14ae000:
Memory content inconsistency at 14ad000 first_byte = 44 last_byte = 44 current = ef hit_edge = 0
Memory content inconsistency at 14ae000 first_byte = 44 last_byte = ef current = 44 hit_edge = 0
After the patch, it'll only report the corrputed page:
Memory content inconsistency at 14ad000 first_byte = 44 last_byte = 44 current = ef hit_edge = 0
Peter Xu [Mon, 23 Jul 2018 12:33:03 +0000 (20:33 +0800)]
migration: disallow recovery for release-ram
Postcopy recovery won't work well with release-ram capability since
release-ram will drop the page buffer as long as the page is put into
the send buffer. So if there is a network failure happened, any page
buffers that have not yet reached the destination VM but have already
been sent from the source VM will be lost forever. Let's refuse the
client from resuming such a postcopy migration. Luckily release-ram was
designed to only be used when src and destination VMs are on the same
host, so it should be fine.
Fix outgoing migration which was crashing in
vmstate_hda_audio_stream_buf_needed, I think the problem
is that we have room for upto 4 streams in the array but only
use 2, when we come to try and save the state of the unused
streams we hit st->state == NULL.
migration_iteration_finish: Unknown ending state 2
on a cancel.
I think that's originally due to 39b9e17905c; although
I've only seen the warning, I think that in some cases
that we could find the VM stays paused after a cancel where
it should restart.
Peter Xu [Fri, 20 Jul 2018 03:47:13 +0000 (11:47 +0800)]
migration: fix potential overflow in multifd send
I would guess it won't happen normally, but this should ease Coverity.
>>> CID 1394385: Integer handling issues (OVERFLOW_BEFORE_WIDEN)
>>> Potentially overflowing expression "pages->used * 8192U" with type "unsigned int" (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type "uint64_t" (64 bits, unsigned).
854 transferred = pages->used * TARGET_PAGE_SIZE + p->packet_len;
block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom
In ed6e2161 ("linux-aio: properly bubble up errors from initialzation"),
I only added a bdrv_attach_aio_context callback for the bdrv_file
driver. There are several other drivers that use the shared
aio_plug callback, though, and they will trip the assertion added to
aio_get_linux_aio because they did not call aio_setup_linux_aio first.
Add the appropriate callback definition to the affected driver
definitions.
Peter Maydell [Tue, 24 Jul 2018 12:30:12 +0000 (13:30 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-docker-fixes-for-3.0-240718-1' into staging
docker fixes & tcg test tweak
- graceful handling of testing under cross-compile
- fixes for debootstrap handling
- more helpful errors (binfmt_misc/EXECUTABLE missing)
- drop runcom TCG test
# gpg: Signature made Tue 24 Jul 2018 11:48:32 BST
# gpg: using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-docker-fixes-for-3.0-240718-1:
tests/tcg: remove runcom test
docker: perform basic binfmt_misc validation in docker.py
docker: ignore distro versioning of debootstrap
docker: add commentary to debian-bootstrap.docker
docker: Update debootstrap script after Debian migration from Alioth to Salsa
docker: report hint when docker.py check fails
docker: drop QEMU_TARGET check, fallback in EXECUTABLE not set
docker: add expansion for docker-test-FOO to Makefile.include
docker: add test-unit runner
docker: Makefile.include don't include partial images
docker: gracefully skip check_qemu
docker: move make check into check_qemu helper
docker: split configure_qemu from build_qemu
docker: fail more gracefully on docker.py check
docker: par down QEMU_CONFIGURE_OPTS in debian-tricore-cross
docker: base debian-tricore on qemu:debian9
tests/.gitignore: don't ignore docker tests
Alex Bennée [Tue, 17 Jul 2018 16:17:53 +0000 (17:17 +0100)]
tests/tcg: remove runcom test
The combination of being rather esoteric and needing to support mmap @
0 means this only ever worked under translation. It has now regressed
even further and is no longer useful. Kill it.
Alex Bennée [Tue, 17 Jul 2018 16:11:26 +0000 (17:11 +0100)]
docker: perform basic binfmt_misc validation in docker.py
Setting up binfmt_misc is outside of the scope of the docker.py script
but we can at least validate it with any given executable so we have a
more useful error message than the sed line of deboostrap failing
cryptically.
Alex Bennée [Fri, 13 Jul 2018 11:23:14 +0000 (12:23 +0100)]
docker: ignore distro versioning of debootstrap
We do a minimum version check for the debootstrap but if the distro
has added their own minor version tick it would fail and fall-back to
the SCM version. This is sub-optimal as the latest/greatest version
may be broken at any one particular time. We fix that with a little
sed magic on the version string before passing to our ugly shell
versioning check.
Alex Bennée [Thu, 12 Jul 2018 13:36:18 +0000 (14:36 +0100)]
docker: report hint when docker.py check fails
When a check fails we currently just report why we failed. This is not
totally helpful to people who want to boot-strap a new image. Report a
hint as to why it failed.
Alex Bennée [Thu, 12 Jul 2018 09:35:54 +0000 (10:35 +0100)]
docker: drop QEMU_TARGET check, fallback in EXECUTABLE not set
The addition of QEMU_TARGET was intended to ensure we fall back to
checking for the existence of an image if the build system was not
currently configured to build it. However this breaks the direct use
of the rule for building custom binfmt_misc images. We already check
for EXECUTABLE so let us just use that as a proxy for deciding if we
are just going to check the image exits.
Alex Bennée [Mon, 9 Jul 2018 10:51:57 +0000 (11:51 +0100)]
docker: Makefile.include don't include partial images
Rename DOCKER_INTERMEDIATE_IMAGES to DOCKER_PARTIAL_IMAGES and add the
incomplete cross compiler images that can build tests but can't build
QEMU itself. We also add debian, debian-bootstrap and the tricode
images to the list.
Alex Bennée [Mon, 9 Jul 2018 13:27:47 +0000 (14:27 +0100)]
docker: gracefully skip check_qemu
Not all our images are able to run the tests. Rather than use features
we can just check for the existence and run-ability of gtester. If the
image has been setup for binfmt_misc it will be able to run anyway.
Alex Bennée [Mon, 9 Jul 2018 13:08:25 +0000 (14:08 +0100)]
docker: fail more gracefully on docker.py check
As this is called directly from the Makefile while determining
dependencies and it is possible the user was configured in one window
but not have credentials in the other. Let's catch the Exceptions and
deal with it quietly.
Alex Bennée [Fri, 13 Jul 2018 15:02:31 +0000 (16:02 +0100)]
docker: par down QEMU_CONFIGURE_OPTS in debian-tricore-cross
This image isn't going to build anything significant as it is just
intended for building test cases. In case it does end up getting
inadvertently included in a build lets aim for the minimal possible
product.