Peter Maydell [Sun, 4 Apr 2021 20:48:45 +0000 (21:48 +0100)]
Merge remote-tracking branch 'remotes/xtensa/tags/20210403-xtensa' into staging
target/xtensa fixes for v6.0:
- make meson.build pick up all available xtensa core definitions;
- don't modify Makefile.objs in import_core.sh;
- add sed rule to import_core.sh to make xtensa_modules variable static.
Max Filippov [Tue, 30 Mar 2021 07:25:24 +0000 (00:25 -0700)]
target/xtensa: fix meson.build rule for xtensa cores
import_core.sh tries to change Makefile.objs when importing a new xtensa
core, but that file no longer exists. Rewrite the meson.build rule to pick
up all source files that match the core-*.c pattern and drop the commands that
change Makefile.objs.
Peter Maydell [Fri, 2 Apr 2021 10:53:18 +0000 (11:53 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,virtio,pci: bugfixes
Fixes all over the place.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 01 Apr 2021 17:22:03 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "[email protected]"
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>" [full]
# gpg: aka "Michael S. Tsirkin <[email protected]>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
pci: sprinkle assert in PCI pin number
isa/v582c686: Reinitialize ACPI PM device on reset
vt82c686.c: don't raise SCI when PCI_INTERRUPT_PIN isn't setup
acpi/piix4: reinitialize acpi PM device on reset
virtio-pci: remove explicit initialization of val
virtio-pci: add check for vdev in virtio_pci_isr_read
vhost-user-blk: add immediate cleanup on shutdown
vhost-user-blk: perform immediate cleanup if disconnect on initialization
vhost-user-blk: use different event handlers on initialization
* remotes/thuth-gitlab/tags/pull-request-2021-04-01:
device-crash-test: Ignore errors about a bus not being available
docs: Fix typo in the default name of the qemu-system-x86_64 binary
docs: Remove obsolete paragraph about config-target.mak
util/compatfd.c: Fixed style issues
qom: Fix default values in help
MAINTAINERS: Mark SH-4 hardware emulation orphan
MAINTAINERS: Mark RX hardware emulation orphan
MAINTAINERS: add virtio-fs mailing list
MAINTAINERS: Drop the line with Xiang Zheng
MAINTAINERS: replace Huawei's email to personal one
MAINTAINERS: Drop the lines with Sarah Harris
MAINTAINERS: add/replace backups for some s390 areas
MAINTAINERS: Fix tests/migration maintainers
Isaku Yamahata [Tue, 23 Mar 2021 20:52:27 +0000 (13:52 -0700)]
pci: sprinkle assert in PCI pin number
If a device model
(a) doesn't set the value to a correct interrupt number and then
(b) triggers an interrupt for itself,
it is a device model bug. Add an assert on the interrupt pin number to catch
this kind of bug more obviously.
Isaku Yamahata [Tue, 23 Mar 2021 20:52:26 +0000 (13:52 -0700)]
isa/v582c686: Reinitialize ACPI PM device on reset
Commit 6be8cf56bc8b made sure that SCI is enabled in PM1.CNT
on reset in acpi_only mode by modifying acpi_pm1_cnt_reset() and
that worked for q35 as expected.
This patch resets the ACPI PM related registers at vt82c686 reset time
and de-asserts SCI.
via_pm_realize() initializes acpi pm tmr, evt, cnt and gpe.
Reset them on device reset.
Isaku Yamahata [Tue, 23 Mar 2021 20:52:25 +0000 (13:52 -0700)]
vt82c686.c: don't raise SCI when PCI_INTERRUPT_PIN isn't setup
Without this patch, the following patch will trigger clang runtime
sanitizer warnings. This patch proactively works around them.
I leave a correct fix to the vt82c686.c maintainer as I'm not sure
about the fuloong2e device model.
Isaku Yamahata [Tue, 23 Mar 2021 20:52:24 +0000 (13:52 -0700)]
acpi/piix4: reinitialize acpi PM device on reset
Commit 6be8cf56bc8b made sure that SCI is enabled in PM1.CNT
on reset in acpi_only mode by modifying acpi_pm1_cnt_reset() and
that worked for q35 as expected.
The function was introduced by commit eaba51c573a (acpi, acpi_piix, vt82c686: factor out PM1_CNT logic)
that forgot to actually call it at piix4 reset time and as a result
SCI_EN wasn't set as was expected by 6be8cf56bc8b in acpi_only mode.
So Windows crashes when it notices that SCI_EN is not set and FADT is
not providing information about how to enable it anymore.
Reproducer:
qemu-system-x86_64 -enable-kvm -M pc-i440fx-6.0,smm=off -cdrom any_windows_10x64.iso
Fix it by calling acpi_pm1_cnt_reset() at piix4 reset time.
Additionally, this patch resets the ACPI PM related registers at
piix4 reset time and de-asserts SCI.
piix4_pm_realize() initializes acpi pm tmr, evt, cnt and gpe.
Reset them on device reset. pm_reset() in ich9.c correctly calls
corresponding reset functions.
* remotes/marcandre/tags/for-6.0-pull-request:
tests: Add tests for yank with the chardev-change case
chardev: Fix yank with the chardev-change case
chardev/char.c: Always pass id to chardev_new
chardev/char.c: Move object_property_try_add_child out of chardev_new
yank: Always link full yank code
yank: Remove dependency on qiochannel
docs: simplify each section title
dbus-vmstate: Increase the size of input stream buffer used during load
util: fix use-after-free in module_load_one
Yuri Benditovich [Mon, 15 Mar 2021 11:59:36 +0000 (13:59 +0200)]
virtio-pci: add check for vdev in virtio_pci_isr_read
https://bugzilla.redhat.com/show_bug.cgi?id=1743098
This commit completes the fix for the segfault in the hot-unplug flow
(started by commit ccec7e9603f446fe75c6c563ba335c00cfda6a06).
Add the missing check for vdev in virtio_pci_isr_read.
Typical stack of crash:
virtio_pci_isr_read ../hw/virtio/virtio-pci.c:1365 with proxy-vdev = 0
memory_region_read_accessor at ../softmmu/memory.c:442
access_with_adjusted_size at ../softmmu/memory.c:552
memory_region_dispatch_read1 at ../softmmu/memory.c:1420
memory_region_dispatch_read at ../softmmu/memory.c:1449
flatview_read_continue at ../softmmu/physmem.c:2822
flatview_read at ../softmmu/physmem.c:2862
address_space_read_full at ../softmmu/physmem.c:2875
Denis Plotnikov [Thu, 25 Mar 2021 15:12:17 +0000 (18:12 +0300)]
vhost-user-blk: add immediate cleanup on shutdown
QEMU crashes on shutdown if the chardev used by vhost-user-blk has been
finalized before the vhost-user-blk device.
This happens with a char-socket chardev operating in listening (server) mode.
The char-socket chardev emits a "close" event at the end of finalization, when
its internal data is already destroyed. This calls the vhost-user-blk event handler,
which in turn tries to manipulate the destroyed chardev by setting an empty
event handler in order to postpone the vhost-user-blk cleanup.
This patch separates the shutdown case from the postponed cleanup, removing
the need to set an event handler.
Denis Plotnikov [Thu, 25 Mar 2021 15:12:16 +0000 (18:12 +0300)]
vhost-user-blk: perform immediate cleanup if disconnect on initialization
Commit 4bcad76f4c39 ("vhost-user-blk: delay vhost_user_blk_disconnect")
postponed the vhost_dev cleanup, aiming to eliminate QEMU aborts
caused by connection problems with the vhost-blk daemon.
However, it introduces a new problem. Now, any communication error
during execution of vhost_dev_init() called by vhost_user_blk_device_realize()
leads to a QEMU abort on an assert in vhost_dev_get_config().
This happens because vhost_user_blk_disconnect() is postponed but
it should have dropped s->connected flag by the time
vhost_user_blk_device_realize() performs a new connection opening.
When the connection is opened, vhost_dev initialization in
vhost_user_blk_connect() relies on the s->connected flag and,
if it's not dropped, it skips vhost_dev initialization and returns
with success. Then, vhost_user_blk_device_realize()'s execution flow
goes to vhost_dev_get_config() where it's aborted on the assert.
To fix the problem, this patch adds immediate cleanup on device
initialization (in vhost_user_blk_device_realize()) using the different
event handlers for initialization and operation introduced in the
previous patch.
On initialization (in vhost_user_blk_device_realize()) we fully
control the initialization process. At that point, nobody can use the
device since it isn't initialized and we don't need to postpone any
cleanups, so we can do the cleanup right away when there is a communication
problem with the vhost-blk daemon.
On operation we leave it as is, since the disconnect may happen when
the device is in use, so the device users may want to use vhost_dev's data
to do rollback before vhost_dev is re-initialized (e.g. in vhost_dev_set_log()).
Denis Plotnikov [Thu, 25 Mar 2021 15:12:15 +0000 (18:12 +0300)]
vhost-user-blk: use different event handlers on initialization
It is useful to use different connect/disconnect event handlers
on device initialization and operation, as the following commit,
which fixes a bug on device initialization, shows.
This patch refactors the code to make use of them: we no longer rely
on the VM state for choosing how to clean up the device; instead
we explicitly use the proper event handler depending on whether
the device has been initialized.
* remotes/bonzini-gitlab/tags/for-upstream:
docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document
hexagon: do not specify Python scripts as inputs
hexagon: do not specify executables as inputs
configure: Do not use default_feature for EXESUF
target/openrisc: fix icount handling for timer instructions
replay: notify CPU on event
icount: get rid of static variable
Revert "qom: use qemu_printf to print help for user-creatable objects"
replay: fix recursive checkpoints
qapi: qom: do not use target-specific conditionals
target/i386: Verify memory operand for lcall and ljmp
meson: Propagate gnutls dependency to migration
Thomas Huth [Tue, 23 Mar 2021 16:47:18 +0000 (17:47 +0100)]
device-crash-test: Ignore errors about a bus not being available
Recent QEMU versions now sometimes exit cleanly with an error message
that a bus is not available for a specified device. Don't flag those
as an error in the device-crash-test script.
Output of default values in device help is broken:
$ ./qemu-system-x86_64 -S -display none -monitor stdio
QEMU 5.2.50 monitor - type 'help' for more information
(qemu) device_add pvpanic,help
pvpanic options:
events=<uint8> - (default: (null))
ioport=<uint16> - (default: (null))
pvpanic[0]=<child<qemu:memory-region>>
The "(null)" is glibc printing a null pointer. Other systems crash
instead. Having a help request crash a running VM can really spoil
your day.
Root cause is a botched replacement of qstring_free() by
g_string_free(): to get the string back, we need to pass true to the
former, but false to the latter. Fix the argument.
Yoshinori Sato doesn't have time to manage QEMU reviews.
The code is in good shape and hasn't started to bitrot,
so mark the SH-4 hardware as orphan to give the possibility
to any contributor to step in and fill the gap.
Yoshinori Sato doesn't have time to manage QEMU reviews.
The code is in good shape and hasn't started to bitrot,
so mark the RX target and hardware as orphan to give the
possibility to any contributor to step in and fill the gap.
Thomas Huth [Thu, 1 Apr 2021 06:24:26 +0000 (08:24 +0200)]
MAINTAINERS: Drop the lines with Sarah Harris
In a mail to the qemu-devel mailing list, Sarah wrote:
"I was added as a reviewer (in MAINTAINERS) for the AVR target for the
duration of my research work using it.
The funding for my project expires in the middle of April, so I will not be
able to provide time for reviewing patches from that point."
Thus let's remove the corresponding lines in the MAINTAINERS file.
Lukas Straub [Tue, 30 Mar 2021 18:13:31 +0000 (20:13 +0200)]
chardev: Fix yank with the chardev-change case
When changing from chardev-socket (which supports yank) to
chardev-socket again, it fails, because the new chardev attempts
to register a new yank instance. This in turn fails, as there
still is the yank instance from the current chardev. Also,
the old chardev shouldn't unregister the yank instance when it
is freed.
To fix this, now the new chardev only registers a yank instance if
the current chardev doesn't support yank and thus hasn't registered
one already. Also, when the old chardev is freed, it now only
unregisters the yank instance if the new chardev doesn't need it.
If the initialization of the new chardev fails, it still has
chr->handover_yank_instance set and won't unregister the yank
instance when it is freed.
s->registered_yank is always true here, as chardev-change only works
on user-visible chardevs and those are guaranteed to register a
yank instance as they are initialized via
chardev_new()
qemu_char_open()
cc->open() (qmp_chardev_open_socket()).
Lukas Straub [Tue, 30 Mar 2021 18:13:28 +0000 (20:13 +0200)]
chardev/char.c: Always pass id to chardev_new
Always pass the id to chardev_new, since it is needed to register
the yank instance for the chardev. Also, after checking that
nothing calls chardev_new with id=NULL, assert() that id!=NULL.
This fixes a crash when using chardev-change to change a chardev
to chardev-socket, which attempts to register a yank instance.
This in turn tries to dereference the NULL-pointer.
Lukas Straub [Tue, 30 Mar 2021 18:13:25 +0000 (20:13 +0200)]
chardev/char.c: Move object_property_try_add_child out of chardev_new
Move object_property_try_add_child out of chardev_new into its
callers. This is a preparation for the next patches to fix yank
with the chardev-change case.
Priyankar Jain [Tue, 2 Feb 2021 13:54:20 +0000 (13:54 +0000)]
dbus-vmstate: Increase the size of input stream buffer used during load
This commit fixes an issue where migration is failing in the load phase
because of a false alarm about data unavailability.
The following error is received when the amount of data to be transferred
exceeds the default buffer size set up by G_BUFFERED_INPUT_STREAM (4KiB),
even when the maximum data size supported by this backend is 1MiB
(DBUS_VMSTATE_SIZE_LIMIT):
dbus_vmstate_post_load: Invalid vmstate size: 4364
qemu-kvm: error while loading state for instance 0x0 of device 'dbus-vmstate/dbus-vmstate'
This commit sets the size of the input stream buffer used during load to
DBUS_VMSTATE_SIZE_LIMIT which is the maximum amount of data a helper can
send during save phase.
Secondly, this commit makes sure that the input stream buffer is loaded before
checking the size of the data available in it, rectifying the false alarm about
data unavailability.
g_hash_table_add always retains ownership of the pointer passed in as
the key. Its return status merely indicates whether the added entry was
new, or replaced an existing entry. Thus key must never be freed after
this method returns.
Spotted by ASAN:
==2407186==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ac4f0 at pc 0x7ffff766659c bp 0x7fffffffd1d0 sp 0x7fffffffc980
READ of size 1 at 0x6020003ac4f0 thread T0
#0 0x7ffff766659b (/lib64/libasan.so.6+0x8a59b)
#1 0x7ffff6bfa843 in g_str_equal ../glib/ghash.c:2303
#2 0x7ffff6bf8167 in g_hash_table_lookup_node ../glib/ghash.c:493
#3 0x7ffff6bf9b78 in g_hash_table_insert_internal ../glib/ghash.c:1598
#4 0x7ffff6bf9c32 in g_hash_table_add ../glib/ghash.c:1689
#5 0x5555596caad4 in module_load_one ../util/module.c:233
#6 0x5555596ca949 in module_load_one ../util/module.c:225
#7 0x5555596ca949 in module_load_one ../util/module.c:225
#8 0x5555596cbdf4 in module_load_qom_all ../util/module.c:349
Paolo Bonzini [Wed, 31 Mar 2021 14:35:27 +0000 (16:35 +0200)]
docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document
In an ideal world, we would all get along together very well, always be
polite and never end up in huge conflicts. And even if there are conflicts,
we would always treat each other fairly and respectfully. Unfortunately,
this is not an ideal world and sometimes people forget how to interact with
each other in a professional and respectful way. Fortunately, this seldom
happens in the QEMU community, but for such rare cases it is preferable
to have a basic code of conduct document available to show to people
who are misbehaving. In case that does not help yet, we should also have
a conflict resolution policy ready that can be applied in the worst case.
The Code of Conduct document tries to be short and to the point while
trying to remain friendly and welcoming; it is based on the Fedora Code
of Conduct[1] with extra detail added based on the Contributor Covenant
1.3.0[2]. Other proposals included the Contributor Covenant 1.3.0 itself
or the Django Code of Conduct[3] (which is also a derivative of Fedora's)
but, in any case, there was agreement on keeping the conflict resolution
policy separate from the CoC itself.
An important point is whether to apply the code of conduct to violations
that occur outside public spaces. The text herein restricts that to
individuals acting as a representative or a member of the project or
its community. This is intermediate between the Contributor Covenant
(which only mentions representatives of the community, for example using
an official project e-mail address or posting via an official social media
account), and the Django Code of Conduct, which says that violations of
this code outside these spaces "may" be considered but otherwise applies
no limit.
The conflict resolution policy is based on the Drupal Conflict Resolution
Policy[4] and its derivative, the Mozilla Consequence Ladder[5].
Paolo Bonzini [Tue, 9 Mar 2021 15:15:30 +0000 (16:15 +0100)]
hexagon: do not specify Python scripts as inputs
Python scripts are not inputs, yet they are currently passed via @INPUT@.
This puts requirements on the command line format, keeping all inputs
close to the name of the script. Avoid that by including the
script in the command and not in the inputs.
Also wrap "PYTHONPATH" usage with "env", since setting the environment
this way is not valid under Windows.
Paolo Bonzini [Tue, 9 Mar 2021 15:15:30 +0000 (16:15 +0100)]
hexagon: do not specify executables as inputs
gen_semantics is an executable, not an input. Meson 0.57 special cases
the first argument and @INPUT@ is not expanded there. Fix that by
not including it in the input, only in the command.
Commit "c87ea11631 configure: add --without-default-features" uses
default_feature to set default values for configure options. This value
is used for EXESUF too.
However, EXESUF is not an option to be tested; it is just appended to any
binary name, so using --without-default-features sets EXESUF to "no" and
all binaries using it have the form <name>no (e.g. qemu-imgno).
This is not expected behavior, as disabling features should not cause
different binary names to be generated.
Revert to setting EXESUF to an empty value unless needed otherwise.
Pavel Dovgalyuk [Thu, 1 Apr 2021 08:19:51 +0000 (11:19 +0300)]
replay: notify CPU on event
This patch enables vCPU notification to wake it up
when a new async event arrives in replay mode.
The motivation for this patch is the following.
Consider a recorded block async event. It is saved into the log
with one of the checkpoints. This checkpoint may be passed in
the vCPU loop. In replay mode, when this async event is read from
the log and the block thread task is not finished yet, the vCPU thread
goes to sleep. That is why this patch adds waking up the vCPU
to process this finished event.
The real code change had already been added by Kevin's commit da0a932bbf
("hmp: QAPIfy object_add") and commit 6d9abb6d just added a duplicated
include statement as a left-over of a rebase.
Pavel Dovgalyuk [Mon, 29 Mar 2021 07:59:25 +0000 (10:59 +0300)]
replay: fix recursive checkpoints
Record/replay uses checkpoints to synchronize the execution
of the threads and timers. Hardware events such as BH are
processed at the checkpoints too.
Event processing can cause the virtual timers to be refreshed
and the icount-related functions to be called, which also use checkpoints.
This patch prevents recursive processing of such checkpoints,
because they have their own records in the log and should be
processed later.
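The recursion guard described above can be illustrated with a small Python model. This is only a sketch of the general re-entrancy pattern, not the QEMU code: the class, its fields and the idea of recording skipped checkpoints are hypothetical names introduced for illustration.

```python
class CheckpointModel:
    """Toy model of a checkpoint handler that refuses to recurse."""

    def __init__(self):
        self.in_checkpoint = False
        self.processed = []   # checkpoints whose events were processed
        self.skipped = []     # nested checkpoints left for later

    def checkpoint(self, name, events=()):
        if self.in_checkpoint:
            # Nested call: it has its own log record, so do not process it now.
            self.skipped.append(name)
            return
        self.in_checkpoint = True
        try:
            for event in events:
                event()       # an event may call checkpoint() again
            self.processed.append(name)
        finally:
            self.in_checkpoint = False
```

An event that itself reaches a checkpoint is recorded as skipped in this model, and the guard is cleared again once the outer checkpoint finishes.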
Paolo Bonzini [Fri, 26 Mar 2021 08:48:39 +0000 (04:48 -0400)]
qapi: qom: do not use target-specific conditionals
ObjectType and ObjectOptions are defined in a target-independent file,
therefore they do not have access to target-specific configuration
symbols such as CONFIG_PSERIES or CONFIG_SEV. For this reason,
pef-guest and sev-guest are currently omitted when compiling the
generated QAPI files. In addition, this causes ObjectType to have
different definitions depending on the file that is including
qapi-types-qom.h (currently this is not causing any issues, but it
is wrong).
Define the two enum entries and the SevGuestProperties type
unconditionally to avoid the issue. We do not expect to have
many target-dependent user-creatable classes, so it is not
particularly problematic.
This fixes the following compilation failure on Arm-based Macs:
In file included from migration/multifd.c:23:
In file included from migration/tls.h:25:
In file included from include/io/channel-tls.h:26:
In file included from include/crypto/tlssession.h:24:
include/crypto/tlscreds.h:28:10: fatal error: 'gnutls/gnutls.h' file not found
#include <gnutls/gnutls.h>
^~~~~~~~~~~~~~~~~
1 error generated.
Hyman Huang(黄勇) [Fri, 19 Mar 2021 08:07:57 +0000 (16:07 +0800)]
MAINTAINERS: Fix tests/migration maintainers
When executing the following script, it prints an error message:
$ ./scripts/get_maintainer.pl -f tests/migration/guestperf.py
get_maintainer.pl: No maintainers found, printing recent contributors.
get_maintainer.pl: Do not blindly cc: them on patches! Use common sense.
Add tests/migration to the "Migration" section of MAINTAINERS.
Peter Maydell [Wed, 31 Mar 2021 15:38:49 +0000 (16:38 +0100)]
Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging
Pull request
A fix for VDI image files and more generally for CoRwlock.
# gpg: Signature made Wed 31 Mar 2021 10:50:39 BST
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>" [full]
# gpg: aka "Stefan Hajnoczi <[email protected]>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha-gitlab/tags/block-pull-request:
test-coroutine: Add rwlock downgrade test
test-coroutine: Add rwlock upgrade test
coroutine-lock: Reimplement CoRwlock to fix downgrade bug
coroutine-lock: Store the coroutine in the CoWaitRecord only once
block/vdi: Don't assume that blocks are larger than VdiHeader
block/vdi: When writing new bmap entry fails, don't leak the buffer
Peter Maydell [Wed, 31 Mar 2021 12:14:18 +0000 (13:14 +0100)]
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210331' into staging
ppc patch queue for 2021-03-31
Here's another set of patches for the ppc target and associated
machine types. I'd hoped to send this closer to the hard freeze, but
got caught up for some time chasing what looked like a strange
regression, before finally concluding it was due to unrelated failures
on the CI.
This is just a handful of fairly straightforward fixes, plus one
performance improvement that's simple and beneficial enough that I'm
considering it a "performance bug fix".
* remotes/dg-gitlab/tags/ppc-for-6.0-20210331:
hw/net: fsl_etsec: Tx padding length should exclude CRC
spapr: Fix typo in the patb_entry comment
spapr: Assert DIMM unplug state in spapr_memory_unplug()
target/ppc/kvm: Cache timebase frequency
hw/ppc: e500: Add missing #address-cells and #size-cells in the eTSEC node
Paolo Bonzini [Thu, 25 Mar 2021 11:29:39 +0000 (12:29 +0100)]
coroutine-lock: Reimplement CoRwlock to fix downgrade bug
An invariant of the current rwlock is that if multiple coroutines hold a
reader lock, all must be runnable. The unlock implementation relies on
this, choosing to wake a single coroutine when the final read lock
holder exits the critical section, assuming that it will wake a
coroutine attempting to acquire a write lock.
The downgrade implementation violates this assumption by creating a
read lock owning coroutine that is exclusively runnable - any other
coroutines that are waiting to acquire a read lock are *not* made
runnable when the write lock holder converts its ownership to read
only.
More generally, the old implementation had lots of other fairness bugs.
The root cause of the bugs was that CoQueue would wake up readers even
if there were pending writers, and would wake up writers even if there
were readers. In that case, the coroutine would go back to sleep *at
the end* of the CoQueue, losing its place at the head of the line.
To fix this, keep the queue of waiters explicitly in the CoRwlock
instead of using CoQueue, and store for each waiter whether it is a
potential reader or a writer. This way, downgrade can look at the
first queued coroutine and wake it only if it is a reader, causing
all other readers in line to be released in turn.
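The queueing scheme described above can be modelled in a few lines of Python. This is a simplified, single-threaded sketch under stated assumptions, not the CoRwlock implementation: there are no coroutines, "waking" is just recording a name, and the class and method names are hypothetical.

```python
from collections import deque


class RwLockModel:
    """Toy model of a rwlock with one explicit FIFO of waiters."""

    def __init__(self):
        self.owners = 0          # -1: one writer, 0: free, n > 0: n readers
        self.waiters = deque()   # (name, is_writer) in arrival order
        self.woken = []          # names in the order they were granted the lock

    def rdlock(self, name):
        # A reader enters directly only if no writer holds the lock
        # and nobody is already queued ahead of it.
        if self.owners >= 0 and not self.waiters:
            self.owners += 1
            return True
        self.waiters.append((name, False))
        return False

    def wrlock(self, name):
        if self.owners == 0 and not self.waiters:
            self.owners = -1
            return True
        self.waiters.append((name, True))
        return False

    def _wake_from_head(self):
        # Only the head of the queue is considered, so no waiter loses its place.
        while self.waiters:
            name, is_writer = self.waiters[0]
            if is_writer:
                if self.owners != 0:
                    break
                self.owners = -1
            else:
                if self.owners < 0:
                    break
                self.owners += 1
            self.waiters.popleft()
            self.woken.append(name)

    def unlock(self):
        self.owners = 0 if self.owners == -1 else self.owners - 1
        self._wake_from_head()

    def downgrade(self):
        # The writer becomes the only reader, then lets queued readers in.
        assert self.owners == -1
        self.owners = 1
        self._wake_from_head()
```

In this model, a downgrade with readers r1 and r2 queued ahead of a writer releases both readers in turn and stops at the writer, which keeps its position until the last reader unlocks.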
David Edmondson [Thu, 25 Mar 2021 11:29:38 +0000 (12:29 +0100)]
coroutine-lock: Store the coroutine in the CoWaitRecord only once
When taking the slow path for mutex acquisition, set the coroutine
value in the CoWaitRecord in push_waiter(), rather than both there and
in the caller.
David Edmondson [Thu, 25 Mar 2021 11:29:37 +0000 (12:29 +0100)]
block/vdi: Don't assume that blocks are larger than VdiHeader
Given that the block size is read from the header of the VDI file, a
wide variety of sizes might be seen. Rather than re-using a block
sized memory region when writing the VDI header, allocate an
appropriately sized buffer.
David Edmondson [Thu, 25 Mar 2021 11:29:36 +0000 (12:29 +0100)]
block/vdi: When writing new bmap entry fails, don't leak the buffer
If a new bitmap entry is allocated, requiring the entire block to be
written, avoid leaking the buffer allocated for the block should
the write fail.
Greg Kurz [Wed, 17 Mar 2021 17:57:07 +0000 (18:57 +0100)]
target/ppc/kvm: Cache timebase frequency
Each vCPU core exposes its timebase frequency in the DT. When running
under KVM, this means parsing /proc/cpuinfo in order to get the timebase
frequency of the host CPU.
The parsing appears to slow down the boot quite a bit with a higher number
of cores.
The timebase frequency of the host CPU is identical for all
cores and it is an invariant for the VM lifetime. Cache it
instead of doing the same expensive parsing again and again.
Rename kvmppc_get_tbfreq() to kvmppc_get_tbfreq_procfs() and
rename the 'retval' variable to make it clear it is used as
fallback only. Come up with a new version of kvmppc_get_tbfreq()
that calls kvmppc_get_tbfreq_procfs() only once and keeps the
value in a static variable.
Zero is certainly not a valid value for the timebase frequency.
Treat atoi() returning zero as another parsing error and return
the fallback value instead. This allows kvmppc_get_tbfreq() to
use zero as an indicator that kvmppc_get_tbfreq_procfs() hasn't
been called yet.
Bin Meng [Thu, 11 Mar 2021 08:16:08 +0000 (16:16 +0800)]
hw/ppc: e500: Add missing #address-cells and #size-cells in the eTSEC node
Per devicetree spec v0.3 [1] chapter 2.3.5:
The #address-cells and #size-cells properties are not inherited
from ancestors in the devicetree. They shall be explicitly defined.
If missing, a client program should assume a default value of 2
for #address-cells, and a value of 1 for #size-cells.
These properties are currently missing, causing the <reg> property
of the queue-group subnode to be incorrectly parsed using default
values.
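To illustrate why missing #address-cells and #size-cells lead to a wrong parse, here is a small Python sketch that splits a reg property into (address, size) pairs. It is not a devicetree parser; the function name and the example cell values are hypothetical, and only the defaults of 2 and 1 come from the spec text quoted above.

```python
DEFAULT_ADDRESS_CELLS = 2  # defaults quoted from the devicetree spec text above
DEFAULT_SIZE_CELLS = 1


def parse_reg(cells, address_cells=DEFAULT_ADDRESS_CELLS,
              size_cells=DEFAULT_SIZE_CELLS):
    """Split a flat list of 32-bit cells into (address, size) pairs."""
    stride = address_cells + size_cells
    if len(cells) % stride != 0:
        raise ValueError("reg length does not match #address-cells/#size-cells")

    def join(chunk):
        # Combine consecutive 32-bit cells into one big-endian number.
        value = 0
        for cell in chunk:
            value = (value << 32) | cell
        return value

    pairs = []
    for i in range(0, len(cells), stride):
        addr = join(cells[i:i + address_cells])
        size = join(cells[i + address_cells:i + stride])
        pairs.append((addr, size))
    return pairs
```

With explicit values of 1 and 1, a two-cell reg such as [0xB0000, 0x1000] parses as one (address, size) pair; with the defaults, the same two cells do not even form a complete entry and the sketch raises ValueError.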
Peter Maydell [Tue, 30 Mar 2021 15:37:15 +0000 (16:37 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210330' into staging
* net/npcm7xx_emc.c: Fix handling of receiving packets when RSDR not set
* hw/display/xlnx_dp: Free FIFOs adding xlnx_dp_finalize()
* hw/arm/smmuv3: Drop unused CDM_VALID() and is_cd_valid()
* target/arm: Make number of counters in PMCR follow the CPU
* hw/timer/renesas_tmr: Add default-case asserts in read_tcnt()
* remotes/pmaydell/tags/pull-target-arm-20210330:
hw/timer/renesas_tmr: Add default-case asserts in read_tcnt()
target/arm: Make number of counters in PMCR follow the CPU
hw/arm/smmuv3: Drop unused CDM_VALID() and is_cd_valid()
hw/display/xlnx_dp: Free FIFOs adding xlnx_dp_finalize()
net/npcm7xx_emc.c: Fix handling of receiving packets when RSDR not set
Peter Maydell [Tue, 30 Mar 2021 13:06:54 +0000 (14:06 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2021-03-30' into staging
Block patches for 6.0-rc1:
- Mark the qcow2 cache clean timer as external to fix record/replay
- Fix the mirror filter node's permissions so that an external process
cannot grab an image while it is used as the mirror source
- Add documentation about FUSE exports to the storage daemon
- When creating a qcow2 image with the data-file-raw option, all
metadata structures should be preallocated
- iotest fixes
Peter Maydell [Tue, 30 Mar 2021 13:05:34 +0000 (14:05 +0100)]
hw/timer/renesas_tmr: Add default-case asserts in read_tcnt()
In commit 81b3ddaf8772ec we fixed a use of uninitialized data
in read_tcnt(). However this change wasn't enough to placate
Coverity, which is not smart enough to see that if we read a
2 bit field and then handle cases 0, 1, 2 and 3 then there cannot
be a flow of execution through the switch default. Add explicit
default cases which assert that they can't be reached, which
should help silence Coverity.
Peter Maydell [Tue, 30 Mar 2021 13:05:33 +0000 (14:05 +0100)]
target/arm: Make number of counters in PMCR follow the CPU
Currently we give all the v7-and-up CPUs a PMU with 4 counters. This
means that we don't provide the 6 counters that are required by the
Arm BSA (Base System Architecture) specification if the CPU supports
the Virtualization extensions.
Instead of having a single PMCR_NUM_COUNTERS, make each CPU type
specify the PMCR reset value (obtained from the appropriate TRM), and
use the 'N' field of that value to define the number of counters
provided.
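As a sketch of how the counter count falls out of the reset value: the N field occupies bits [15:11] of PMCR, so it can be extracted with a shift and a mask. The Python below is illustrative only; the helper name is hypothetical and the reset values are made-up examples, not values taken from any TRM.

```python
PMCR_N_SHIFT = 11   # PMCR.N occupies bits [15:11]
PMCR_N_MASK = 0x1F


def pmu_num_counters(pmcr_reset_value):
    """Extract the number of event counters from a PMCR reset value."""
    return (pmcr_reset_value >> PMCR_N_SHIFT) & PMCR_N_MASK


# Illustrative reset values only; real ones come from each CPU's TRM.
EXAMPLE_RESET_VALUES = {
    "six-counter core": 0x00003000,
    "four-counter core": 0x00002000,
    "three-counter core": 0x00001800,
}
```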
This means that we now supply 6 counters for Cortex-A53, A57, A72,
A15 and A9 as well as '-cpu max'; Cortex-A7 and A8 stay at 4; and
Cortex-R5 goes down to 3.
Note that because we now use the PMCR reset value of the specific
implementation, we no longer set the LC bit out of reset. This has
an UNKNOWN value out of reset for all cores with any AArch32 support,
so guest software should be setting it anyway if it wants it.
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
#2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5
#3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
#4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
#5 0x56184a5a24d5 in object_initialize qom/object.c:536:5
#6 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
#7 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
#8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
#9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5
The RX/TX FIFOs are created in xlnx_dp_init(); add xlnx_dp_finalize()
to destroy them.
Max Reitz [Fri, 26 Mar 2021 14:55:09 +0000 (15:55 +0100)]
iotests/244: Test preallocation for data-file-raw
Three test cases:
(1) Adding a qcow2 (metadata) file to an existing data file, see whether
we can read the existing data through the qcow2 image.
(2) Append data to the data file, grow the qcow2 image accordingly, see
whether we can read the new data through the qcow2 image.
(3) At runtime, add a backing image to a freshly created qcow2 image
with an external data file (with data-file-raw). Reading data from
the qcow2 image must return the same result as reading data from the
data file, so everything in the backing image must be ignored.
(This did not use to be the case, because without the L2 tables
preallocated, all clusters would appear as unallocated, and so the
qcow2 driver would fall through to the backing file.)
Max Reitz [Fri, 26 Mar 2021 14:55:08 +0000 (15:55 +0100)]
qcow2: Force preallocation with data-file-raw
Setting the qcow2 data-file-raw bit means that you can ignore the
qcow2 metadata when reading from the external data file. It does not
mean that you have to ignore it, though. Therefore, the data read must
be the same regardless of whether you interpret the metadata or whether
you ignore it, and thus the L1/L2 tables must all be present and give a
1:1 mapping.
This patch changes 244's output: First, the qcow2 file is larger right
after creation, because of metadata preallocation. Second, the qemu-img
map output changes: Everything that was not explicitly discarded or
zeroed is now a data area.
Frédéric Fortier [Sun, 28 Mar 2021 18:01:35 +0000 (14:01 -0400)]
linux-user: NETLINK_LIST_MEMBERSHIPS: Allow bad ptr if its length is 0
getsockopt(fd, SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS, *optval, *optlen)
syscall allows optval to be NULL/invalid if optlen points to a size of
zero. This allows userspace to query the length of the array they should
use to get the full membership list before allocating memory for said
list, then re-calling getsockopt with proper optval/optlen arguments.
Notable users of this pattern include systemd-networkd, which in the
(albeit old) version 237 tested, cannot start without this fix.
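The two-call pattern described above might look roughly like this from the guest program's side. It is a sketch under assumptions: list_memberships() is a hypothetical helper name, the fallback #define values are only used if the system headers lack the constants, and error handling is reduced to returning NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Fallbacks for older headers (values as in the Linux UAPI headers). */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_LIST_MEMBERSHIPS
#define NETLINK_LIST_MEMBERSHIPS 9
#endif

/*
 * Hypothetical helper: return a malloc'ed membership bitmap for the
 * netlink socket fd and store its size in bytes in *len_out.
 * Returns NULL on error; the caller frees the result.
 */
static uint32_t *list_memberships(int fd, socklen_t *len_out)
{
    socklen_t len = 0;
    uint32_t *groups;

    assert(len_out != NULL);

    /* First call: optval is NULL and *optlen is 0, so only the
     * required length is reported back through len. */
    if (getsockopt(fd, SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS,
                   NULL, &len) < 0) {
        return NULL;
    }

    /* Allocate at least one word so a zero length still yields a
     * valid pointer. */
    groups = calloc(1, len > 0 ? len : sizeof(*groups));
    if (groups == NULL) {
        return NULL;
    }

    /* Second call: now with a buffer of the size reported above. */
    if (len > 0 &&
        getsockopt(fd, SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS,
                   groups, &len) < 0) {
        free(groups);
        return NULL;
    }

    *len_out = len;
    return groups;
}
```

A translator that rejects the NULL optval of the first call with EFAULT breaks this sequence before the buffer is ever allocated.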
Klaus Jensen [Mon, 22 Mar 2021 06:10:24 +0000 (07:10 +0100)]
hw/block/nvme: fix ref counting in nvme_format_ns
Max noticed that since blk_aio_pwrite_zeroes() may invoke the callback
before returning, the callbacks will never see *count == 0 and thus
never free the count variable or decrement num_formats causing a CQE to
never be posted.
Coverity (CID 1451082) also picked up on the fact that count would not
be freed if the namespace was of zero size.
Fix both of these issues by explicitly checking *count and finalizing
the given namespace if --(*count) is zero. Enqueuing a CQE if there are
no AIOs outstanding after this case is already handled by nvme_format()
by inspecting *num_formats.
Reported-by: Max Reitz <[email protected]>
Reported-by: Coverity (CID 1451082)
Fixes: dc04d25e2f3f ("hw/block/nvme: add support for the format nvm command")
Signed-off-by: Klaus Jensen <[email protected]>
Reviewed-by: Gollu Appalanaidu <[email protected]>
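The reference-counting idea in this fix can be illustrated with a small standalone sketch. Nothing below is the actual hw/block/nvme code: struct format_op, submit_zeroes(), run_pending() and format_ns() are hypothetical names, and the "AIO" layer is a toy queue that can either complete inline or defer. The assumption being illustrated is that the issuer holds its own reference while submitting, so the count cannot reach zero early, and whoever drops the last reference finalizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef void completion_fn(void *opaque);

/* Toy deferred-completion queue standing in for an AIO layer. */
struct pending { completion_fn *cb; void *opaque; };
static struct pending queue[16];
static int queued;

/* Hypothetical submit: may invoke the callback before returning. */
static void submit_zeroes(bool completes_inline, completion_fn *cb,
                          void *opaque)
{
    if (completes_inline) {
        cb(opaque);
    } else {
        assert(queued < 16);
        queue[queued].cb = cb;
        queue[queued].opaque = opaque;
        queued++;
    }
}

/* Run all deferred completions. */
static void run_pending(void)
{
    for (int i = 0; i < queued; i++) {
        queue[i].cb(queue[i].opaque);
    }
    queued = 0;
}

/* Hypothetical per-namespace format state. */
struct format_op {
    int *count;        /* issuer reference plus outstanding requests */
    int *num_formats;  /* namespaces still being formatted */
};

/* Drop one reference; the last one frees count and finalizes. */
static void format_op_unref(struct format_op *op)
{
    if (--(*op->count) == 0) {
        free(op->count);
        op->count = NULL;
        (*op->num_formats)--;
    }
}

static void format_cb(void *opaque)
{
    format_op_unref(opaque);
}

static void format_ns(struct format_op *op, int *num_formats,
                      int nr_chunks, bool completes_inline)
{
    op->count = malloc(sizeof(*op->count));
    if (op->count == NULL) {
        abort();
    }
    *op->count = 1;               /* the issuer's own reference */
    op->num_formats = num_formats;
    (*num_formats)++;

    for (int i = 0; i < nr_chunks; i++) {
        (*op->count)++;           /* one reference per request */
        submit_zeroes(completes_inline, format_cb, op);
    }

    /* Covers inline completion and the zero-size case alike. */
    format_op_unref(op);
}
```

With nr_chunks equal to zero, or with every request completing inline, the final format_op_unref() in format_ns() is the one that frees count and decrements *num_formats; with deferred requests, the last callback does it instead.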
Max Reitz [Wed, 17 Feb 2021 11:58:44 +0000 (12:58 +0100)]
qsd: Document FUSE exports
Implementing FUSE exports required no changes to the storage daemon, so
we forgot to document them there. Considering that both NBD and
vhost-user-blk exports are documented in its man page (and NBD exports
in its --help text), we should probably do the same for FUSE.
Max Reitz [Thu, 11 Feb 2021 17:22:41 +0000 (18:22 +0100)]
block/mirror: Fix mirror_top's permissions
mirror_top currently shares all permissions, and takes only the WRITE
permission (if some parent has taken that permission, too).
That is wrong, though; mirror_top is a filter, so it should take
permissions like any other filter does. For example, if the parent
needs CONSISTENT_READ, we need to take that, too, and if it cannot share
the WRITE permission, we cannot share it either.
The exception is when mirror_top is used for active commit, where we
cannot take CONSISTENT_READ (because it is deliberately unshared above
the base node) and where we must share WRITE (so that it is shared for
all images in the backing chain, so the mirror job can take it for the
target BB).
Max Reitz [Fri, 18 Sep 2020 15:33:23 +0000 (17:33 +0200)]
iotests/046: Filter request length
For its concurrent requests, 046 has always filtered the offset,
probably because concurrent requests may settle in any order. However,
it did not filter the request length, and so if requests with different
lengths settle in an unexpected order (notably the longer request before
the shorter request), the test fails (for no good reason).
Pavel Dovgalyuk [Mon, 29 Mar 2021 08:06:03 +0000 (11:06 +0300)]
qcow2: use external virtual timers
Regular virtual timers are used to emulate timings
related to vCPU and peripheral states. QCOW2 uses timers
to clean the cache. These timers should have external
flag. In the opposite case they affect the execution
and it can't be recorded and replayed.
This patch adds external flag to the timer for qcow2
cache clean.
Max Reitz [Fri, 26 Mar 2021 14:14:19 +0000 (15:14 +0100)]
iotests/116: Fix reference output
15ce94a68ca ("block/qed: bdrv_qed_do_open: deal with errp") has improved
the qed driver's error reporting, though sadly did not add a test for
it.
The good news is: There already is such a test, namely 116.
The bad news is: Its reference output was not adjusted, and so now it
fails.
Let's fix the reference output, which has the nice side effect of
demonstrating 15ce94a68ca's improvements.
Connor Kuehl [Thu, 18 Mar 2021 20:09:49 +0000 (15:09 -0500)]
iotests: fix 051.out expected output after error text touchups
A patch was recently applied that touched up some error messages that
pertained to key names like 'node-name'. The trouble is it only updated
tests/qemu-iotests/051.pc.out and not tests/qemu-iotests/051.out as
well.
* remotes/vivier2/tags/linux-user-for-6.0-pull-request:
linux-user: allow NULL msg in recvfrom
linux-user/s390x: Use the guest pointer for the sigreturn stub
Zach Reizner [Sat, 27 Mar 2021 02:11:16 +0000 (22:11 -0400)]
linux-user: allow NULL msg in recvfrom
The kernel allows a NULL msg in recvfrom so that the size of the next
message may be queried before allocating a correctly sized buffer. This
change allows the syscall translator to pass along the NULL msg pointer
instead of returning early with EFAULT.
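As an illustration of the kind of guest code this enables, here is a minimal sketch of a size query with a NULL buffer. The helper name next_datagram_size() is hypothetical, and the use of MSG_PEEK | MSG_TRUNC is an assumption about how a caller would typically ask for the size on Linux datagram sockets; the commit itself only states that a NULL msg must be passed through.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: report the size of the next datagram queued on
 * fd without consuming it, or -1 on error. No buffer is supplied, so
 * the msg pointer handed to recvfrom() is NULL.
 */
static ssize_t next_datagram_size(int fd)
{
    assert(fd >= 0);
    /* MSG_PEEK leaves the datagram queued; MSG_TRUNC makes the return
     * value the real datagram length rather than the bytes copied. */
    return recvfrom(fd, NULL, 0, MSG_PEEK | MSG_TRUNC, NULL, NULL);
}
```

A caller would then allocate a buffer of the reported size and issue a normal recvfrom() to consume the message.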
This happens because the device is doing things at "instance_init" time
that should be done at "realize" time. So move the related code
to the realize() function instead. (NB: This now also matches the
memory_region_del_subregion() calls which are done in usb_ehci_unrealize(),
and not during finalize()).
Gerd Hoffmann [Wed, 17 Mar 2021 09:56:22 +0000 (10:56 +0100)]
s390x: modularize virtio-gpu-ccw
Since the virtio-gpu-ccw device depends on the hw-display-virtio-gpu
module, which provides the type virtio-gpu-device, packaging the
hw-display-virtio-gpu module as a separate package that may or may not
be installed along with the qemu package leads to problems. Namely if
the hw-display-virtio-gpu is absent, qemu continues to advertise
virtio-gpu-ccw, but it aborts not only when one attempts using
virtio-gpu-ccw, but also when libvirtd's capability probing tries
to instantiate the type to introspect it.
Let us thus introduce a module named hw-s390x-virtio-gpu-ccw that
is going to provide the virtio-gpu-ccw device. The hw-s390x prefix
was chosen because it is not a portable device.
With virtio-gpu-ccw built as a module, the correct way to package a
modularized qemu is to require that hw-display-virtio-gpu must be
installed whenever the module hw-s390x-virtio-gpu-ccw is installed.
Gerd Hoffmann [Wed, 17 Mar 2021 09:56:20 +0000 (10:56 +0100)]
s390x: move S390_ADAPTER_SUPPRESSIBLE
The definition S390_ADAPTER_SUPPRESSIBLE was moved to "cpu.h", per
suggestion of Thomas Huth. From an interface design perspective, IMHO,
this is not a good thing, as it belongs to the public interface of
css_register_io_adapters(). We did this because CONFIG_KVM requires
NEED_CPU_H, and Thomas and other commenters did not like the
consequences of that.
Moving the interrupt related declarations to s390_flic.h was suggested
by Cornelia Huck.
hw/usb/hcd-ehci-sysbus: Free USBPacket on instance finalize()
When building with --enable-sanitizers we get:
Direct leak of 32 byte(s) in 2 object(s) allocated from:
#0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
#2 0x561847f02ca2 in usb_packet_init hw/usb/core.c:531:5
#3 0x561848df4df4 in usb_ehci_init hw/usb/hcd-ehci.c:2575:5
#4 0x561847c119ac in ehci_sysbus_init hw/usb/hcd-ehci-sysbus.c:73:5
#5 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
#6 0x56184a5bd955 in object_init_with_type qom/object.c:371:9
#7 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
#8 0x56184a5a24d5 in object_initialize qom/object.c:536:5
#9 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
#10 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
#11 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
#12 0x561849542d18 in npcm7xx_init hw/arm/npcm7xx.c:427:5
Similarly to commit d710e1e7bd3 ("usb: ehci: fix memory leak in
ehci"), fix by calling usb_ehci_finalize() to free the USBPacket.
vugbm implements GBM device wrapping, udmabuf and memory fallback.
However, the fallback/detection logic is flawed: if "/dev/udmabuf"
cannot be opened, vugbm is left uninitialized and crashes later.
Rework the vugbm_device_init() logic to initialize correctly in all
cases.
For similar reasons as commit 3af1671852 ("spice: flush on GL update
before notifying client"), vhost-user-gpu must ensure the GL state is
flushed before sharing its rendering result.
Thomas Huth [Thu, 11 Mar 2021 09:28:29 +0000 (10:28 +0100)]
usb: Remove "-usbdevice ccid"
"-usbdevice ccid" was not documented and -usbdevice itself was marked
as deprecated before QEMU v6.0. Searching for "-usbdevice ccid"
on the internet does not show any useful results either, so likely
nobody was using the ccid device via the -usbdevice option. Remove it now.
Andreas Krebbel [Wed, 24 Mar 2021 18:51:28 +0000 (19:51 +0100)]
linux-user/s390x: Use the guest pointer for the sigreturn stub
When setting up the pointer for the sigreturn stub in the return
address register (r14) we currently use the host frame address instead
of the guest frame address.
Note: This only caused problems if QEMU has been built with
--disable-pie (as it is in distros nowadays). Otherwise guest_base
defaults to 0, hiding the actual problem.
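Why a guest_base of 0 hides the bug can be shown with a toy model of the address translation. This is not the linux-user code: the toy_address_space type, the guest_addr_t typedef and all function names are hypothetical, and the model simply assumes that a guest address is the host address minus guest_base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t guest_addr_t;  /* hypothetical guest address type */

/* Toy model: guest address 0 lives at host address guest_base. */
struct toy_address_space {
    uintptr_t guest_base;
};

static guest_addr_t host_to_guest(const struct toy_address_space *as,
                                  const void *host)
{
    return (guest_addr_t)((uintptr_t)host - as->guest_base);
}

static void *guest_to_host(const struct toy_address_space *as,
                           guest_addr_t guest)
{
    return (void *)(as->guest_base + (uintptr_t)guest);
}

/*
 * Address to place in the guest's return-address register: it must be
 * a guest address, so the host pointer into the signal frame has to be
 * translated first.
 */
static guest_addr_t sigreturn_stub_addr(const struct toy_address_space *as,
                                        const unsigned char *host_frame,
                                        size_t retcode_offset)
{
    assert(as != NULL && host_frame != NULL);
    return host_to_guest(as, host_frame + retcode_offset);
}
```

When guest_base is 0 the translated value and the raw host pointer are numerically identical, so handing the host pointer to the guest happens to work; with any non-zero guest_base the two differ and the guest jumps to the wrong place.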