Li Qiang [Sun, 16 Aug 2020 14:22:45 +0000 (07:22 -0700)]
virtio-mem: detach the element from the virtqueue when error occurs
If error occurs while processing the virtio request we should call
'virtqueue_detach_element' to detach the element from the virtqueue
before freeing the elem.
Jason Wang [Mon, 7 Sep 2020 10:49:03 +0000 (18:49 +0800)]
vhost-vdpa: batch updating IOTLB mappings
To speed up the memory mapping updating between vhost-vDPA and vDPA
device driver, this patch passes the IOTLB batching flags via IOTLB
API. Two new flags were introduced. VHOST_IOTLB_BATCH_BEGIN is a hint
that a batched IOTLB update may be initiated from
userspace. VHOST_IOTLB_BATCH_END is a hint that userspace has finished
the updating:
Jason Wang [Mon, 7 Sep 2020 10:49:02 +0000 (18:49 +0800)]
vhost: switch to use IOTLB v2 format
This patch switches to the new kernel IOTLB format V2. The previous
version may have an inconsistent ABI between 32-bit and 64-bit machines
because of the hole after the type field. Refer to kernel commit
("429711aec282 vhost: switch to use new message format") for more
information.
To enable this feature, QEMU needs to use a new ioctl,
VHOST_SET_BACKEND_FEATURE, with the VHOST_BACKEND_F_IOTLB_MSG_V2 bit. A new
vhost op for setting backend features was introduced. When we try to
set features for a vhost dev, we examine whether the new IOTLB
format is supported and enable it. This process is totally transparent to the guest,
which means we can have different IOTLB message types in src and dst
during migration.
The conversion of IOTLB message is straightforward, just check the
type and behave accordingly.
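The hole after the type field mentioned above can be illustrated with a small alignment calculation. This is only a sketch: the 4-byte type field and the 4-byte versus 8-byte alignment of the following member are assumptions about typical 32-bit and 64-bit ABIs, not details taken from this commit, and align_up() and member_offset() are hypothetical helpers.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def member_offset(type_size: int, member_alignment: int) -> int:
    """Offset of the member that follows a type field of type_size bytes."""
    return align_up(type_size, member_alignment)


# Assumed: a 4-byte type field, followed by a member whose 64-bit fields are
# aligned to 4 bytes on a typical 32-bit ABI and 8 bytes on a 64-bit ABI.
offset_32bit = member_offset(4, 4)  # member starts right after the type field
offset_64bit = member_offset(4, 8)  # member is pushed out to offset 8
hole_64bit = offset_64bit - 4       # padding bytes after the type field
```

Under these assumptions the same message has two different layouts, which is the kind of inconsistency a fixed-layout V2 format avoids.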
* remotes/kraxel/tags/usb-20200928-pull-request:
hw/usb: Use lock guard macros
usb: hcd-xhci-sysbus: Attach xhci to sysbus device
usb/hcd-xhci: Split pci wrapper for xhci base model
usb/hcd-xhci: Move qemu-xhci device to hcd-xhci-pci.c
usb/hcd-xhci: Make dma read/writes hooks pci free
Peter Maydell [Mon, 28 Sep 2020 15:49:10 +0000 (16:49 +0100)]
Merge remote-tracking branch 'remotes/alistair/tags/pull-register-20200927' into staging
Two small patches. One with a fix for the register API instance_size
and one for removing unused address variables from load_elf.
# gpg: Signature made Sun 27 Sep 2020 14:45:06 BST
# gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <[email protected]>" [full]
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8 CE8F 21E1 0D29 DF97 7054
* remotes/alistair/tags/pull-register-20200927:
core/register: Specify instance_size in the TypeInfo
load_elf: Remove unused address variables from callers
* remotes/vivier2/tags/trivial-branch-for-5.2-pull-request:
docs/system/deprecated: Move lm32 and unicore32 to the right section
migration/multifd: Remove superfluous semicolons
timer: Fix timer_mod_anticipate() documentation
vhost-vdpa: remove useless variable
Add *.pyc back to the .gitignore file
virtio: vdpa: omit check return of g_malloc
meson: fix static flag summary
vhost-vdpa: fix indentation in vdpa_ops
usb/hcd-xhci: Split pci wrapper for xhci base model
This patch sets up the base to use xhci as a sysbus model, for which the PCI
specific hooks are moved to hcd-xhci-pci.c. As part of this requirement,
msi/msix interrupt handling is moved under XHCIPCIState. The required
changes were made for qemu-xhci-nec.
load_elf: Remove unused address variables from callers
Several callers of load_elf() pass pointers for lowaddr and highaddr
parameters which are then not used for anything. This may stem from a
misunderstanding that load_elf needs a value here but in fact it can
take NULL to ignore these values. Remove such unused variables and
pass NULL instead from callers that don't need these.
Peter Maydell [Fri, 25 Sep 2020 13:46:18 +0000 (14:46 +0100)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20200925a' into staging
Migration and virtiofsd pull
Chuan Zheng's Dirtyrate and TLS changes, with small fixes from Dov and
Laurent.
Small virtiofs changes from Harry, Stefan, Vivek and Jiachen.
One HMP/monitor rework from me.
# gpg: Signature made Fri 25 Sep 2020 13:03:50 BST
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20200925a: (26 commits)
virtiofsd: Add -o allow_direct_io|no_allow_direct_io options
virtiofsd: Used glib "shared" thread pool
virtiofsd: document cache=auto default
monitor: Use LOCK_GUARD macros
migration/tls: add trace points for multifd-tls
migration/tls: add support for multifd tls-handshake
migration/tls: extract cleanup function for common-use
migration/tls: add tls_hostname into MultiFDSendParams
migration/tls: extract migration_tls_client_create for common-use
migration/tls: save hostname into MigrationState
migration: increase max-bandwidth to 128 MiB/s (1 Gib/s)
migration: Truncate state file in xen-save-devices-state
migration/dirtyrate: Add trace_calls to make it easier to debug
migration/dirtyrate: Implement qmp_cal_dirty_rate()/qmp_get_dirty_rate() function
migration/dirtyrate: Implement calculate_dirtyrate() function
migration/dirtyrate: Implement set_sample_page_period() and is_sample_period_valid()
migration/dirtyrate: skip sampling ramblock with size below MIN_RAMBLOCK_SIZE
migration/dirtyrate: Compare page hash results for recorded sampled page
migration/dirtyrate: Record hash results for each sampled page
migration/dirtyrate: move RAMBLOCK_FOREACH_MIGRATABLE into ram.h
...
Due to the commit 65da4539803373ec4eec97ffc49ee90083e56efd, the O_DIRECT
open flag of guest applications will be discarded by virtiofsd. While
this behavior makes it consistent with the virtio-9p scheme when guest
applications use direct I/O, we no longer have any chance to bypass the
host page cache.
Therefore, we add a flag 'allow_direct_io' to lo_data. If '-o
no_allow_direct_io' option is added, or none of '-o allow_direct_io' or
'-o no_allow_direct_io' is added, the 'allow_direct_io' will be set to
0, and virtiofsd discards O_DIRECT as before. If '-o allow_direct_io'
is added to the starting command-line, 'allow_direct_io' will be set to
1, so that the O_DIRECT flags will be retained and host page cache can
be bypassed.
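The flag handling described above can be sketched as follows. This is a model of the behaviour only, not virtiofsd's code: effective_open_flags() is a hypothetical helper, and the fallback value for O_DIRECT is an assumption used when the platform's os module does not define it.

```python
import os

# os.O_DIRECT is not defined on every platform. The fallback is the x86 Linux
# value and is an assumption made purely for illustration.
O_DIRECT = getattr(os, "O_DIRECT", 0o40000)


def effective_open_flags(guest_flags: int, allow_direct_io: bool) -> int:
    """Return the open flags to apply on the host for a guest open request."""
    if not allow_direct_io:
        # Default, and '-o no_allow_direct_io': discard O_DIRECT as before.
        return guest_flags & ~O_DIRECT
    # '-o allow_direct_io': keep O_DIRECT so the host page cache is bypassed.
    return guest_flags
```

For example, a guest request with O_RDWR | O_DIRECT becomes plain O_RDWR unless allow_direct_io is set.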
Currently we use "exclusive" thread pools but their performance seems to be
poor. I tried using "shared" thread pools and performance seems much
better. I posted performance results here.
So let's switch to shared thread pools. We can think of making it optional
once somebody can show in what cases exclusive thread pools offer better
results. For now, my simple performance tests across the board see
better results with shared thread pools.
migration/dirtyrate: Implement set_sample_page_period() and is_sample_period_valid()
Implement is_sample_period_valid() to check if the sample period is valid and
set_sample_page_period() to sleep for a specific time between sample actions.
Peter Xu [Tue, 8 Sep 2020 20:30:18 +0000 (16:30 -0400)]
migration: Rework migrate_send_rp_req_pages() function
We duplicated the logic of maintaining the last_rb variable at both callers of
this function. Pass *rb pointer into the function so that we can avoid
duplicating the logic. Also, when we have the rb pointer, it's also easier to
remove the original 2nd & 4th parameters, because both of them (name of the
ramblock when needed, or the page size) can be fetched from the ramblock
pointer too.
* remotes/ehabkost/tags/machine-next-pull-request:
sifive_u: Register "start-in-flash" as class property
sifive_e: Register "revb" as class property
i440fx: Register i440FX-pcihost properties as class properties
machine: Register "memory-backend" as class property
xlnx-zcu102: Register properties as class properties
cpu/core: Register core-id and nr-threads as class properties
s390x: Register all CPU properties as class properties
cryptodev-backend: Register "chardev" as class property
cryptodev-vhost-user: Register "chardev" as class property
smp: drop support for deprecated (invalid topologies)
qom: simplify object_find_property / object_class_find_property
Stefan Hajnoczi [Wed, 23 Sep 2020 10:56:46 +0000 (11:56 +0100)]
qemu/atomic.h: rename atomic_ to qatomic_
clang's C11 atomic_fetch_*() functions only take a C11 atomic type
pointer argument. QEMU uses direct types (int, etc) and this causes a
compiler error when QEMU code calls these functions in a source file
that also includes <stdatomic.h> via a system header file:
$ CC=clang CXX=clang++ ./configure ... && make
../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid)
Avoid using atomic_*() names in QEMU's atomic.h since that namespace is
used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h
and <stdatomic.h> can co-exist. I checked /usr/include on my machine and
searched GitHub for existing "qatomic_" users but there seem to be none.
This patch was generated using:
$ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \
sort -u >/tmp/changed_identifiers
$ for identifier in $(</tmp/changed_identifiers); do
sed -i "s%\<$identifier\>%q$identifier%g" \
$(git grep -I -l "\<$identifier\>")
done
I manually fixed line-wrap issues and misaligned rST tables.
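The whole-word rename performed by the sed loop above can be sketched in Python. This is an illustration of the matching rule only, not the tooling used for the patch; prefix_identifiers() and the sample string are hypothetical.

```python
import re


def prefix_identifiers(source: str, identifiers: set[str]) -> str:
    """Prefix each whole-word occurrence of the given identifiers with 'q'."""
    if not identifiers:
        return source
    # Longest names first so the alternation never settles on a shorter prefix.
    alternatives = "|".join(
        re.escape(name) for name in sorted(identifiers, key=len, reverse=True)
    )
    # \b plays the role of sed's \< and \>: only whole identifiers match.
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: "q" + match.group(1), source)


sample = "atomic_read(&x); atomic_set(&y, 1); my_atomic_read(); atomic_readers++;"
renamed = prefix_identifiers(sample, {"atomic_read", "atomic_set"})
```

Because only whole words match, names such as my_atomic_read and atomic_readers in the sample are left untouched.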
Stefan Hajnoczi [Tue, 15 Sep 2020 12:03:39 +0000 (13:03 +0100)]
tests: add test-fdmon-epoll
Test aio_disable_external(), which switches from fdmon-epoll back to
fdmon-poll. This resulted in an assertion failure that was fixed in the
previous patch.
Stefan Hajnoczi [Wed, 9 Sep 2020 10:09:37 +0000 (11:09 +0100)]
gitmodules: add qemu.org vbootrom submodule
The vbootrom module is needed for the new NPCM7xx ARM SoCs. The
vbootrom.git repo is now mirrored on qemu.org. QEMU mirrors third-party
code to ensure that users can always build QEMU even if the dependency
goes offline and so QEMU meets its responsibilities to provide full
source code under software licenses.
Stefan Hajnoczi [Tue, 15 Sep 2020 13:08:33 +0000 (14:08 +0100)]
gitmodules: switch to qemu.org meson mirror
QEMU now hosts a mirror of meson.git. QEMU mirrors third-party code to
ensure that users can always build QEMU even if the dependency goes
offline and so QEMU meets its responsibilities to provide full source
code under software licenses.
Stefan Hajnoczi [Tue, 15 Sep 2020 13:08:32 +0000 (14:08 +0100)]
gitmodules: switch to qemu.org qboot mirror
QEMU now hosts a mirror of qboot.git. QEMU mirrors third-party code to
ensure that users can always build QEMU even if the dependency goes
offline and so QEMU meets its responsibilities to provide full source
code under software licenses.
Stefan Hajnoczi [Tue, 15 Sep 2020 15:07:34 +0000 (16:07 +0100)]
docs/system: clarify deprecation schedule
The sentence explaining the deprecation schedule is ambiguous. Make it
clear that a feature deprecated in the Nth release is guaranteed to
remain available in the N+1th release. Removal can occur in the N+2nd
release or later.
As an example of this in action, see commit 25956af3fe5dd0385ad8017bc768a6afe41e2a74 ("block: Finish deprecation of
'qemu-img convert -n -o'"). The feature was deprecated in QEMU 4.2.0. It
was present in the 5.0.0 release and removed in the 5.1.0 release.
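The schedule can be expressed as simple index arithmetic over the list of releases. The sketch below uses only the three releases named in the example above; earliest_removal_index() is a hypothetical helper.

```python
def earliest_removal_index(deprecated_index: int) -> int:
    """Index of the first release in which a deprecated feature may be removed."""
    # Deprecated in release N, still guaranteed in N+1, removable from N+2.
    return deprecated_index + 2


releases = ["4.2.0", "5.0.0", "5.1.0"]
deprecated_in = releases.index("4.2.0")
guaranteed_until = releases[deprecated_in + 1]
removable_from = releases[earliest_removal_index(deprecated_in)]
```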
Stefan Hajnoczi [Thu, 17 Sep 2020 09:44:55 +0000 (10:44 +0100)]
virtio-crypto: don't modify elem->in/out_sg
A number of iov_discard_front/back() operations are made by
virtio-crypto. The elem->in/out_sg iovec arrays are modified by these
operations, resulting in virtqueue_unmap_sg() calls on different addresses
than were originally mapped.
This is problematic because dirty memory may not be logged correctly,
MemoryRegion refcounts may be leaked, and the non-RAM bounce buffer can
be leaked.
Take a copy of the elem->in/out_sg arrays so that the originals are
preserved. The iov_discard_undo() API could be used instead (with better
performance) but requires careful auditing of the code, so do the simple
thing instead.
Fuzzing discovered that virtqueue_unmap_sg() is being called on modified
req->in/out_sg iovecs. This means dma_memory_map() and
dma_memory_unmap() calls do not have matching memory addresses.
Fuzzing discovered that non-RAM addresses trigger a bug:
Stefan Hajnoczi [Thu, 17 Sep 2020 09:44:53 +0000 (10:44 +0100)]
util/iov: add iov_discard_undo()
The iov_discard_front/back() operations are useful for parsing iovecs
but they modify the array elements. If the original array is needed
after parsing finishes there is currently no way to restore it.
Although g_memdup() can be used before performing destructive
iov_discard_front/back() operations, this is inefficient.
Introduce iov_discard_undo() to restore the array to the state prior to
an iov_discard_front/back() operation.
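The idea of a destructive front discard followed by an undo can be modelled as below. This is a simplified Python sketch, not QEMU's implementation or its C signatures: Segment, Undo, discard_front() and discard_undo() are hypothetical names, and only the front case is modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    base: int    # start address of the buffer segment
    length: int  # number of bytes in the segment


@dataclass
class Undo:
    index: Optional[int] = None      # which element was modified, if any
    saved: Optional[Segment] = None  # its original contents


def discard_front(segments: List[Segment], nbytes: int) -> Tuple[int, Undo]:
    """Drop nbytes from the front, modifying at most one element in place.

    Returns the index of the first remaining segment and an undo record.
    """
    undo = Undo()
    first = 0
    while first < len(segments) and nbytes > 0:
        seg = segments[first]
        if seg.length > nbytes:
            # Partially consumed: remember the original before shrinking it.
            undo.index, undo.saved = first, Segment(seg.base, seg.length)
            seg.base += nbytes
            seg.length -= nbytes
            nbytes = 0
        else:
            # Fully consumed: skip it without touching its contents.
            nbytes -= seg.length
            first += 1
    return first, undo


def discard_undo(segments: List[Segment], undo: Undo) -> None:
    """Restore the element modified by discard_front(), if there was one."""
    if undo.index is not None:
        segments[undo.index] = undo.saved


segments = [Segment(0x1000, 16), Segment(0x2000, 32)]
first, undo = discard_front(segments, 20)
after_discard = (first, segments[1].base, segments[1].length)
discard_undo(segments, undo)
```

Saving one element is cheaper than duplicating the whole array, which is the inefficiency the commit message attributes to the g_memdup() approach.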
Marc Hartmayer [Tue, 1 Sep 2020 15:00:19 +0000 (17:00 +0200)]
libvhost-user: handle endianness as mandated by the spec
Since virtio existed even before it got standardized, the virtio
standard defines the following types of virtio devices:
+ legacy device (pre-virtio 1.0)
+ non-legacy or VIRTIO 1.0 device
+ transitional device (which can act both as legacy and non-legacy)
Virtio 1.0 defines the fields of the virtqueues as little endian,
while legacy uses guest's native endian [1]. Currently libvhost-user
does not handle virtio endianness at all, i.e. it works only if the
native endianness matches with whatever is actually needed. That means
things break spectacularly on big-endian targets. Let us handle virtio
endianness for non-legacy as required by the virtio specification [1]
and fence legacy virtio, as there is no safe way to figure out the
needed endianness conversions for all cases. The fencing of legacy
virtio devices is done in `vu_set_features_exec`.
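The difference between a fixed little-endian field and a native-endian read can be shown with Python's struct module. This sketch is illustrative only and does not use libvhost-user; the two-byte sample value is made up.

```python
import struct
import sys

raw = bytes([0x34, 0x12])  # a 16-bit field as it sits in shared memory

# VIRTIO 1.0 layout: always little endian, whatever the host is.
as_virtio_1_0 = struct.unpack("<H", raw)[0]

# What a big-endian host would get from a plain native read.
as_big_endian = struct.unpack(">H", raw)[0]

# A native-endian read only agrees with the spec on little-endian hosts.
as_native = struct.unpack("=H", raw)[0]
native_matches_spec = as_native == as_virtio_1_0
```

On a little-endian host the native read happens to be correct, which is why the missing conversion only shows up on big-endian targets.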
Stefan Hajnoczi [Mon, 7 Sep 2020 11:16:32 +0000 (12:16 +0100)]
MAINTAINERS: add Stefan Hajnoczi as block/nvme.c maintainer
Development of the userspace NVMe block driver picked up again recently.
After talking with Fam I am stepping up as block/nvme.c maintainer.
Patches will be merged through my 'block' tree.
audio: align audio_generic_write with audio_pcm_hw_run_out
The function audio_generic_write should work exactly like
audio_pcm_hw_run_out. It's a very similar function working on a
different buffer.
This patch significantly reduces the number of drop-outs with
the DirectSound backend. To hear the difference start qemu with
-audiodev dsound,id=audio0,out.mixing-engine=off and play a
song in the guest with and without this patch.
This patch removes unnecessary calls to the pcm_ops function
put_buffer_in(). No audio backend needs this call if the
returned length of pcm_ops function get_buffer_in() is zero.
For the DirectSound backend this prevents a call to
dsound_unlock_in() without a preceding call to dsound_lock_in().
While Windows doesn't complain it seems wrong anyway.
The playback rate with the spiceaudio backend is currently too
fast if there's no spice client connected or the spice client
can't play audio. Rate limit the audio playback stream in all
cases. To calculate the rate correctly the limiter has to know
the maximum buffer size.
The new rules for the variables buf and size returned by
get_buffer_out() are:
size == 0: Downstream playback buffer is full. Retry later.
size > 0, buf != NULL: Copy size bytes to buf for playback.
size > 0, buf == NULL: Drop size bytes.
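The three rules above can be restated as a small decision function. This is a sketch of the rules as written, not code from the audio subsystem; playback_action() and its string results are hypothetical, with None standing in for a NULL buffer.

```python
from typing import Optional


def playback_action(buf: Optional[bytearray], size: int) -> str:
    """Map the (buf, size) pair returned by get_buffer_out() to an action."""
    if size == 0:
        # Downstream playback buffer is full, whatever buf is.
        return "retry"
    if buf is not None:
        # size > 0 and a real buffer: copy size bytes into it for playback.
        return "copy"
    # size > 0 but no buffer: the caller should drop size bytes.
    return "drop"
```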
The audio playback rate with spiceaudio for the no audio case is
too fast, but that's what we had before commit fb35c2cec5
"audio/dsound: fix invalid parameters error". The complete fix
comes with the next patch.
With the next patch all audio backends put_buffer_out() functions
have to handle the buf == NULL case, provided the get_buffer_out()
function may return buf = NULL and size > 0.
It turns out that all audio backends get_buffer_out() functions
either can't return buf = NULL or return buf = NULL and size = 0
at the same time. The only exception is the spiceaudio backend
where size may be uninitialized.
Igor Mammedov [Fri, 11 Sep 2020 13:32:02 +0000 (09:32 -0400)]
smp: drop support for deprecated (invalid topologies)
It has been deprecated since 3.1.
Support for invalid topologies is removed, the user must ensure
that topologies described with -smp include all possible cpus,
i.e. (sockets * cores * threads) == maxcpus or QEMU will
exit with error.
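The condition quoted above amounts to a single equality check. The sketch below restates it in Python; topology_is_valid() is a hypothetical helper and not part of QEMU.

```python
def topology_is_valid(sockets: int, cores: int, threads: int, maxcpus: int) -> bool:
    """True if the -smp topology accounts for every possible CPU."""
    return sockets * cores * threads == maxcpus
```

For instance, sockets=2, cores=4, threads=2 is valid with maxcpus=16 but not with maxcpus=8.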
When debugging QEMU it is often useful to put a breakpoint on the
error_setg_internal method impl.
Unfortunately the object_property_add / object_class_property_add
methods call object_property_find / object_class_property_find methods
to check if a property exists already before adding the new property.
As a result there are a huge number of calls to error_setg_internal
on startup of most QEMU commands, making it very painful to set a
breakpoint on this method.
Most callers of object_find_property and object_class_find_property,
however, pass in a NULL for the Error parameter. This simplifies the
methods to remove the Error parameter entirely, and then adds some
new wrapper methods that are able to raise an Error when needed.
Eric Blake [Mon, 14 Sep 2020 19:10:09 +0000 (14:10 -0500)]
qemu-img: Support bitmap --merge into backing image
If you have the chain 'base.qcow2 <- top.qcow2' and want to merge a
bitmap from top into base, qemu-img was failing with:
qemu-img: Could not open 'top.qcow2': Could not open backing file: Failed to get shared "write" lock
Is another process using the image [base.qcow2]?
The easiest fix is to not open the entire backing chain of either
image (source or destination); after all, the point of 'qemu-img
bitmap' is solely to manipulate bitmaps directly within a single qcow2
image, and this is made more precise if we don't pay attention to
other images in the chain that may happen to have a bitmap by the same
name.
However, note that on a case-by-case analysis, there _are_ times where
we treat it as a feature that we can access a bitmap from a backing
layer in association with an overlay BDS. A demonstration of this is
using NBD to expose both an overlay BDS (for constant contents) and a
bitmap (for learning which blocks are interesting) during an
incremental backup:
Base <- Active <- Temporary
\--block job ->/
where Temporary is being fed by a backup 'sync=none' job. When
exposing Temporary over NBD, referring to a bitmap that lives only in
Active is less effort than having to copy a bitmap into Temporary [1].
So the testsuite additions in this patch check both where bitmaps get
allocated (the qemu-img info output), and that qemu-nbd is indeed able
to access a bitmap inherited from the backing chain since it is a
different use case than 'qemu-img bitmap'.
[1] Full disclosure: prior to the recent commit 374eedd1c4 and
friends, we were NOT able to see bitmaps through filters, which meant
that we actually did not have nice clean semantics for uniformly being
able to pick up bitmaps from anywhere in the backing chain (seen as a
change in behavior between qemu 4.1 and 4.2 at commit 00e30f05de, when
block-copy swapped from a one-off to a filter). Which means libvirt
was already coded to copy bitmaps around for the sake of older qemu,
even though modern qemu no longer needs it. Oh well.
* remotes/ehabkost/tags/machine-next-pull-request:
Use OBJECT_DECLARE_SIMPLE_TYPE when possible
Use OBJECT_DECLARE_TYPE when possible
qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros
qom: Remove ParentClassType argument from OBJECT_DECLARE_SIMPLE_TYPE
scripts/codeconverter: Update to latest version
target/s390x: Set instance_align on S390CPU TypeInfo
target/riscv: Set instance_align on RISCVCPU TypeInfo
target/ppc: Set instance_align on PowerPCCPU TypeInfo
target/arm: Set instance_align on CPUARM TypeInfo
qom: Allow objects to be allocated with increased alignment
qom: Correct error values in two contracts
qom: Clean up object_property_get_enum()'s error value
qom: Correct object_class_dynamic_cast_assert() documentation
sifive: Use DECLARE_*CHECKER* macros
sifive: Move QOM typedefs and add missing includes
sifive_u: Rename memmap enum constants
sifive_e: Rename memmap enum constants
Peter Maydell [Mon, 21 Sep 2020 16:41:32 +0000 (17:41 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2020-09-18
Cleanups:
* Correct the meaning of '0xffffffff' value for hv-spinlocks (Vitaly Kuznetsov)
* vmport: Drop superfluous parenthesis (Philippe Mathieu-Daudé)
Fixes:
* Use generic APIC ID encoding code for EPYC (Babu Moger)
* remotes/ehabkost/tags/x86-next-pull-request:
i386: Simplify CPUID_8000_001E for AMD
i386: Simplify CPUID_8000_001d for AMD
hw/i386/vmport: Drop superfluous parenthesis around function typedef
i386/kvm: correct the meaning of '0xffffffff' value for hv-spinlocks
Commit a5d7eb6534a ("Add TSC2301 touchscreen & keypad controller")
added the MouseTransformInfo declaration in "ui/console.h",
however it is only used in "hw/input/tsc2xxx.h".
Reduce the structure exposure by moving it to the single include
where it is used.
This should fix a build failure on OpenBSD:
In file included from hw/arm/nseries.c:30:
In file included from include/hw/arm/omap.h:24:
In file included from include/hw/input/tsc2xxx.h:14:
include/ui/console.h:11:11: fatal error: 'epoxy/gl.h' file not found
# include <epoxy/gl.h>
^~~~~~~~~~~~
1 error generated.
gmake: *** [Makefile.ninja:1735:
libqemu-aarch64-softmmu.fa.p/hw_arm_nseries.c.o] Error 1
hw: usb: hcd-ohci: check for processed TD before retire
While servicing OHCI transfer descriptors (TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check if
the TD was already processed once and holds an error code in TD_CC.
This may happen if the TD list has a loop. Add a check to avoid an
infinite loop condition.
hw: usb: hcd-ohci: check len and frame_number variables
While servicing the OHCI transfer descriptors (TD), the OHCI host
controller derives variables 'start_addr', 'end_addr', 'len'
etc. from values supplied by the host controller driver.
The host controller driver may supply values such that using
the above variables leads to out-of-bounds access issues.
Add checks to avoid them.