Git Repo - qemu.git/log

vhost: use wfd on functions setting vring call fd

When ioeventfd is emulated using qemu_pipe(), only EventNotifier's wfd
can be used for writing.

Use the recently introduced event_notifier_get_wfd() function to
obtain the fd that our peer must use to signal the vring.

Signed-off-by: Sergio Lopez <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Message-Id: <20220304100854 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

event_notifier: add event_notifier_get_wfd()

event_notifier_get_fd(const EventNotifier *e) always returns
EventNotifier's read file descriptor (rfd). This is not a problem when
the EventNotifier is backed by a an eventfd, as a single file
descriptor is used both for reading and triggering events (rfd ==
wfd).

But, when EventNotifier is backed by a pipe pair, we have two file
descriptors, one that can only be used for reads (rfd), and the other
only for writes (wfd).

There's, at least, one known situation in which we need to obtain wfd
instead of rfd, which is when setting up the file that's going to be
sent to the peer in vhost's SET_VRING_CALL.

Add a new event_notifier_get_wfd(const EventNotifier *e) that can be
used to obtain wfd where needed.

Signed-off-by: Sergio Lopez <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Message-Id: <20220304100854 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pci: drop COMPAT_PROP_PCP for 2.0 machine types

COMPAT_PROP_PCP is 'on' by default and it's used for turning
off PCP capability on PCIe slots for 2.0 machine types using
compat machinery.
Drop not needed compat glue as Q35 supports migration starting
from 2.4 machine types.

Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <20220222102504.3080104 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/smbios: Add table 4 parameter, "processor-id"

This parameter is to be used in the processor_id entry in the type 4
table.

This parameter is set as optional and if left will use the values from
the CPU model.

This enables hiding the host information from the guest and allowing AMD
VMs to run pretending to be Intel for some userspace software concerns.

Reviewed-by: Peter Foley <[email protected]>
Reviewed-by: Titus Rwantare <[email protected]>
Signed-off-by: Patrick Venture <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Message-Id: <20220125163118.1011809 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

x86: cleanup unused compat_apic_id_mode

commit
f862ddbb1a4 (hw/i386: Remove the deprecated pc-1.x machine types)
removed the last user of broken APIC ID compat knob,
but compat_apic_id_mode itself was forgotten.
Clean it up and simplify x86_cpu_apic_id_from_index()

Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <20220228131634.3389805 [email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost-vsock: detach the virqueue element in case of error

In vhost_vsock_common_send_transport_reset(), if an element popped from
the virtqueue is invalid, we should call virtqueue_detach_element() to
detach it from the virtqueue before freeing its memory.

Fixes: fc0b9b0e1c ("vhost-vsock: add virtio sockets device")
Fixes: CVE-2022-26354
Cc: [email protected]
Reported-by: VictorV <[email protected]>
Signed-off-by: Stefano Garzarella <[email protected]>
Message-Id: <20220228095058 [email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pc: add option to disable PS/2 mouse/keyboard

On some older software like Windows 7 installer, having both a PS/2
mouse and USB mouse results in only one device working property (which
might be a different device each boot). While the workaround to not use
a USB mouse with such software is valid, it creates an inconsistent
experience if the user wishes to always use a USB mouse.

This introduces a new machine property to inhibit the creation of the
i8042 PS/2 controller.

Signed-off-by: Joelle van Dyne <[email protected]>
Message-Id: <20220227210655 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

acpi: pcihp: pcie: set power on cap on parent slot

on creation a PCIDevice has power turned on at the end of pci_qdev_realize()
however later on if PCIe slot isn't populated with any children
it's power is turned off. It's fine if native hotplug is used
as plug callback will power slot on among other things.
However when ACPI hotplug is enabled it replaces native PCIe plug
callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and
as result slot stays powered off. It works fine as ACPI hotplug
on guest side takes care of enumerating/initializing hotplugged
device. But when later guest is migrated, call chain introduced by]
commit d5daff7d312 (pcie: implement slot power control for pcie root ports)

   pcie_cap_slot_post_load()
       -> pcie_cap_update_power()
           -> pcie_set_power_device()
               -> pci_set_power()
                   -> pci_update_mappings()

will disable earlier initialized BARs for the hotplugged device
in powered off slot due to commit 23786d13441 (pci: implement power state)
which disables BARs if power is off.

Fix it by setting PCI_EXP_SLTCTL_PCC to PCI_EXP_SLTCTL_PWR_ON
on slot (root port/downstream port) at the time a device
hotplugged into it. As result PCI_EXP_SLTCTL_PWR_ON is migrated
to target and above call chain keeps device plugged into it
powered on.

Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports")
Fixes: 23786d13441 ("pci: implement power state")
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584
Suggested-by: "Michael S. Tsirkin" <[email protected]>
Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <20220301151200.3507298 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pci: expose TYPE_XIO3130_DOWNSTREAM name

Type name will be used in followup patch for cast check
in pcihp code.

Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <20220301151200.3507298 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pci: show id info when pci BDF conflict

During qemu init stage, when there is pci BDF conflicts, qemu print
a warning but not showing which device the BDF is occupied by. E.x:

"PCI: slot 2 function 0 not available for virtio-scsi-pci, in use by virtio-scsi-pci"

To facilitate user knowing the offending device and fixing it, showing
the id info in the warning.

Signed-off-by: Zhenzhong Duan <[email protected]>
Message-Id: <20220223094435 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/misc/pvpanic: Use standard headers instead

QEMU side has already imported pvpanic.h from linux, remove bit
definitions from include/hw/misc/pvpanic.h, and use
include/standard-headers/linux/pvpanic.h instead.
Also minor changes for PVPANIC_CRASHLOADED -> PVPANIC_CRASH_LOADED.

Signed-off-by: zhenwei pi <[email protected]>
Message-Id: <20220221122717.1371010 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

headers: Add pvpanic.h

Since 2020, linux kernel started to export pvpanic.h. Import the
latest version from linux into QEMU.

Signed-off-by: zhenwei pi <[email protected]>
Message-Id: <20220221122717.1371010 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

pci-bridge/xio3130_downstream: Fix error handling

Wrong goto label, so msi cleanup would not occur if there is
an error in the ssvid initialization.

Signed-off-by: Jonathan Cameron <[email protected]>
Message-Id: <20220218102303 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pci-bridge/xio3130_upstream: Fix error handling

Goto label is incorrect so msi cleanup would not occur if there is
an error in the ssvid initialization.

Signed-off-by: Jonathan Cameron <[email protected]>
Message-Id: <20220218102303 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pcie: Add 1.2 version token for the Power Management Capability

Signed-off-by: Łukasz Gieryk <[email protected]>
Message-Id: <20220217174504.1051716 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pcie: Add a helper to the SR/IOV API

Convenience function for retrieving the PCIDevice object of the N-th VF.

Signed-off-by: Łukasz Gieryk <[email protected]>
Reviewed-by: Knut Omang <[email protected]>
Message-Id: <20220217174504.1051716 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt

Add a small intro + minimal documentation for how to
implement SR/IOV support for an emulated device.

Signed-off-by: Knut Omang <[email protected]>
Message-Id: <20220217174504.1051716 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pcie: Add support for Single Root I/O Virtualization (SR/IOV)

This patch provides the building blocks for creating an SR/IOV
PCIe Extended Capability header and register/unregister
SR/IOV Virtual Functions.

Signed-off-by: Knut Omang <[email protected]>
Message-Id: <20220217174504.1051716 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio-net: Unlimit tx queue size if peer is vdpa

The code used to limit the maximum size of tx queue for others backends
than vhost_user since the introduction of configurable tx queue size in
9b02e1618cf2 ("virtio-net: enable configurable tx queue size").

As vhost_user, vhost_vdpa devices should deal with memory region
crosses already, so let's use the full tx size.

Signed-off-by: Eugenio Pérez <[email protected]>
Message-Id: <20220217175029.2517071 [email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/pci-bridge/pxb: Fix missing swizzle

pxb_map_irq_fn() handled the necessary removal of the swizzle
applied to the PXB interrupts by the bus to which it was attached
but neglected to apply the normal swizzle for PCI root ports
on the expander bridge.

Result of this was on ARM virt, the PME interrupts for a second
RP on a PXB instance were miss-routed to #45 rather than #46.

Tested with a selection of different configurations with 1 to 5
RP per PXB instance. Note on my x86 test setup the PME interrupts
are not triggered so I haven't been able to test this.

Signed-off-by: Jonathan Cameron <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Marcel Apfelbaum <[email protected]>
Message-Id: <20220118174855 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/i386/pc_piix: Mark the machine types from version 1.4 to 1.7 as deprecated

The list of machine types grows larger and larger each release ... and
it is unlikely that many people still use the very old ones for live
migration. QEMU v1.7 has been released more than 8 years ago, so most
people should have updated their machines to a newer version in those
8 years at least once. Thus let's mark the very old 1.x machine types
as deprecated now.

Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <20220117191639 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

tests/qtest/virtio-iommu-test: Check bypass config

The bypass config field should be initialized to 1 by default.

Reviewed-by: Eric Auger <[email protected]>
Signed-off-by: Jean-Philippe Brucker <[email protected]>
Message-Id: <20220214124356 [email protected]>
Acked-by: Cornelia Huck <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Thomas Huth <[email protected]>

virtio-iommu: Support bypass domain

The driver can create a bypass domain by passing the
VIRTIO_IOMMU_ATTACH_F_BYPASS flag on the ATTACH request. Bypass domains
perform slightly better than domains with identity mappings since they
skip translation.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Message-Id: <20220214124356 [email protected]>
Acked-by: Cornelia Huck <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio-iommu: Default to bypass during boot

Currently the virtio-iommu device must be programmed before it allows
DMA from any PCI device. This can make the VM entirely unusable when a
virtio-iommu driver isn't present, for example in a bootloader that
loads the OS from storage.

Similarly to the other vIOMMU implementations, default to DMA bypassing
the IOMMU during boot. Add a "boot-bypass" property, defaulting to true,
that lets users change this behavior.

Replace the VIRTIO_IOMMU_F_BYPASS feature, which didn't support bypass
before feature negotiation, with VIRTIO_IOMMU_F_BYPASS_CONFIG.

We add the bypass field to the migration stream without introducing
subsections, based on the assumption that this virtio-iommu device isn't
being used in production enough to require cross-version migration at
the moment (all previous version required workarounds since they didn't
support ACPI and boot-bypass).

Reviewed-by: Eric Auger <[email protected]>
Signed-off-by: Jean-Philippe Brucker <[email protected]>
Message-Id: <20220214124356 [email protected]>
Acked-by: Cornelia Huck <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/i386: Replace magic number with field length calculation

Replce the literal magic number 48 with length calculation (32 bytes at
the end of the firmware after the table footer + 16 bytes of the OVMF
table footer GUID).

No functional change intended.

Signed-off-by: Dov Murik <[email protected]>
Message-Id: <20220222071906.2632426 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>

hw/i386: Improve bounds checking in OVMF table parsing

When pc_system_parse_ovmf_flash() parses the optional GUIDed table in
the end of the OVMF flash memory area, the table length field is checked
for sizes that are too small, but doesn't error on sizes that are too
big (bigger than the flash content itself).

Add a check for maximal size of the OVMF table, and add an error report
in case the size is invalid. In such a case, an error like this will be
displayed during launch:

qemu-system-x86_64: OVMF table has invalid size 4047

and the table parsing is skipped.

Signed-off-by: Dov Murik <[email protected]>
Message-Id: <20220222071906.2632426 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

intel_iommu: support snoop control

SC is required for some kernel features like vhost-vDPA. So this patch
implements basic SC feature. The idea is pretty simple, for software
emulated DMA it would be always coherent. In this case we can simple
advertise ECAP_SC bit. For VFIO and vhost, thing will be more much
complicated, so this patch simply fail the IOMMU notifier
registration.

In the future, we may want to have a dedicated notifiers flag or
similar mechanism to demonstrate the coherency so VFIO could advertise
that if it has VFIO_DMA_CC_IOMMU, for vhost kernel backend we don't
need that since it's a software backend.

Signed-off-by: Jason Wang <[email protected]>
Message-Id: <20220214060346 [email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost-vdpa: make notifiers _init()/_uninit() symmetric

vhost_vdpa_host_notifiers_init() initializes queue notifiers
for queues "dev->vq_index" to queue "dev->vq_index + dev->nvqs",
whereas vhost_vdpa_host_notifiers_uninit() uninitializes the
same notifiers for queue "0" to queue "dev->nvqs".

This asymmetry seems buggy, fix that by using dev->vq_index
as the base for both.

Fixes: d0416d487bd5 ("vhost-vdpa: map virtqueue notification area if possible")
Cc: [email protected]
Signed-off-by: Laurent Vivier <[email protected]>
Message-Id: <20220211161309.1385839 [email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/virtio: vdpa: Fix leak of host-notifier memory-region

If call virtio_queue_set_host_notifier_mr fails, should free
host-notifier memory-region.

This problem can trigger a coredump with some vDPA drivers (mlx5,
but not with the vdpasim), if we unplug the virtio-net card from
the guest after a stop/start.

The same fix has been done for vhost-user:
1f89d3b91e3e ("hw/virtio: Fix leak of host-notifier memory-region")

Fixes: d0416d487bd5 ("vhost-vdpa: map virtqueue notification area if possible")
Cc: [email protected]
Resolves: https://bugzilla.redhat.com/2027208
Signed-off-by: Laurent Vivier <[email protected]>
Message-Id: <20220211170259.1388734 [email protected]>
Cc: [email protected]
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/vhost-user-i2c: Add support for VIRTIO_I2C_F_ZERO_LENGTH_REQUEST

VIRTIO_I2C_F_ZERO_LENGTH_REQUEST is a mandatory feature, that must be
implemented by everyone. Add its support.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Message-Id: <fc47ab63b1cd414319c9201e8d6c7705b5ec3bd9.1644490993 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: fix the condition for iommu_platform not supported

The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
unsupported") claims to fail the device hotplug when iommu_platform
is requested, but not supported by the (vhost) device. On the first
glance the condition for detecting that situation looks perfect, but
because a certain peculiarity of virtio_platform it ain't.

In fact the aforementioned commit introduces a regression. It breaks
virtio-fs support for Secure Execution, and most likely also for AMD SEV
or any other confidential guest scenario that relies encrypted guest
memory. The same also applies to any other vhost device that does not
support _F_ACCESS_PLATFORM.

The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates
"device can not access all of the guest RAM" and "iova != gpa, thus
device needs to translate iova".

Confidential guest technologies currently rely on the device/hypervisor
offering _F_ACCESS_PLATFORM, so that, after the feature has been
negotiated, the guest grants access to the portions of memory the
device needs to see. So in for confidential guests, generally,
_F_ACCESS_PLATFORM is about the restricted access to memory, but not
about the addresses used being something else than guest physical
addresses.

This is the very reason for which commit f7ef7e6e3b ("vhost: correctly
turn on VIRTIO_F_IOMMU_PLATFORM") fences _F_ACCESS_PLATFORM from the
vhost device that does not need it, because on the vhost interface it
only means "I/O address translation is needed".

This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on
VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the
situation when _F_ACCESS_PLATFORM is requested, but no I/O translation
by the device, and thus no device capability is needed. In this
situation claiming that the device does not support iommu_plattform=on
is counter-productive. So let us stop doing that!

Signed-off-by: Halil Pasic <[email protected]>
Reported-by: Jakob Naucke <[email protected]>
Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
unsupported")
Acked-by: Cornelia Huck <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Tested-by: Daniel Henrique Barboza <[email protected]>
Cc: Kevin Wolf <[email protected]>
Cc: [email protected]
Message-Id: <20220207112857 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>

vhost-user: fix VirtQ notifier cleanup

When vhost-user device cleanup, remove notifier MR and munmaps notifier
address in the event-handling thread, VM CPU thread writing the notifier
in concurrent fails with an error of accessing invalid address. It
happens because MR is still being referenced and accessed in another
thread while the underlying notifier mmap address is being freed and
becomes invalid.

This patch calls RCU and munmap notifiers in the callback after the
memory flatview update finish.

Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
Cc: [email protected]
Signed-off-by: Xueming Li <[email protected]>
Message-Id: <20220207071929 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost-user: remove VirtQ notifier restore

Notifier set when vhost-user backend asks qemu to mmap an FD and
offset. When vhost-user backend restart or getting killed, VQ notifier
FD and mmap addresses become invalid. After backend restart, MR contains
the invalid address will be restored and fail on notifier access.

On the other hand, qemu should munmap the notifier, release underlying
hardware resources to enable backend restart and allocate hardware
notifier resources correctly.

Qemu shouldn't reference and use resources of disconnected backend.

This patch removes VQ notifier restore, uses the default vhost-user
notifier to avoid invalid address access.

After backend restart, the backend should ask qemu to install a hardware
notifier if needed.

Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
Cc: [email protected]
Signed-off-by: Xueming Li <[email protected]>
Message-Id: <20220207071929 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/smbios: add assertion to ensure handles of tables 19 and 32 do not collide

Since change dcf359832eec02 ("hw/smbios: fix table memory corruption with large memory vms")
we reserve additional space between handle numbers of tables 17 and 19 for
large VMs. This may cause table 19 to collide with table 32 in their handle
numbers for those large VMs. This change adds an assertion to ensure numbers
do not collide. If they do, qemu crashes with useful debug information for
taking additional steps.

Signed-off-by: Ani Sinha <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/smbios: fix overlapping table handle numbers with large memory vms

The current smbios table implementation splits the main memory in 16 GiB
(DIMM like) chunks. With the current smbios table assignment code, we can have
only 512 such chunks before the 16 bit handle numbers in the header for tables
17 and 19 conflict. A guest with more than 8 TiB of memory will hit this
limitation and would fail with the following assertion in isa-debugcon:

ASSERT_EFI_ERROR (Status = Already started)
ASSERT /builddir/build/BUILD/edk2-ca407c7246bf/OvmfPkg/SmbiosPlatformDxe/SmbiosPlatformDxe.c(125): !EFI_ERROR (Status)

This change adds an additional offset between tables 17 and 19 handle numbers
when configuring VMs larger than 8 TiB of memory. The value of the offset is
calculated to be equal to the additional space required to be reserved
in order to accomodate more DIMM entries without the table handles colliding.
In normal cases where the VM memory is smaller or equal to 8 TiB, this offset
value is 0. Hence in this case, no additional handle numbers are reserved and
table handle values remain as before.

Since smbios memory is not transmitted over the wire during migration,
this change can break migration for large memory vms if the guest is in the
middle of generating the tables during migration. However, in those
situations, qemu generates invalid table handles anyway with or without this
fix. Hence, we do not preserve the old bug by introducing compat knobs/machine
types.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2023977

Signed-off-by: Ani Sinha <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/smbios: code cleanup - use macro definitions for table header handles

This is a minor cleanup. Using macro definitions makes the code more
readable. It is at once clear which tables use which handle numbers in their
header. It also makes it easy to calculate the gaps between the numbers and
update them if needed.

Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: Ani Sinha <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

hw/acpi/erst: clean up unused IS_UEFI_CPER_RECORD macro

This change is cosmetic. IS_UEFI_CPER_RECORD macro definition that was added
as a part of the ERST implementation seems to be unused. Remove it.

CC: Eric DeVolder <[email protected]>
Reviewed-by: Eric DeVolder <[email protected]>
Signed-off-by: Ani Sinha <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

docs/acpi/erst: add device id for ACPI ERST device in pci-ids.txt

Adding device ID for ERST device in pci-ids.txt. It was missed when ERST
related patches were reviewed.

CC: Eric DeVolder <[email protected]>
Reviewed-by: Eric DeVolder <[email protected]>
Signed-off-by: Ani Sinha <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

MAINTAINERS: no need to add my name explicitly as a reviewer for VIOT tables

I am already listed as a reviewer for ACPI/SMBIOS subsystem. There is no need to
again add me as a reviewer for ACPI/VIOT.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Ani Sinha <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ACPI ERST: specification for ERST support

Information on the implementation of the ACPI ERST support.

Signed-off-by: Eric DeVolder <[email protected]>
Acked-by: Ani Sinha <[email protected]>
Message-Id: <20220223143322 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

qom: assert integer does not overflow

QOM reference counting is not designed with an infinite amount of
references in mind, trying to take a reference in a loop without
dropping a reference will overflow the integer.

It is generally a symptom of a reference leak (a missing deref, commonly
as part of error handling - such as one fixed here:
https://lore.kernel.org/r/20220228095058.27899-1-sgarzare%40redhat.com ).

All this can lead to either freeing the object too early (memory
corruption) or never freeing it (memory leak).

If we happen to dereference at just the right time (when it's wrapping
around to 0), we might eventually assert when dereferencing, but the
real problem is an extra object_ref so let's assert there to make such
issues cleaner and easier to debug.

Some micro-benchmarking shows using fetch and add this is essentially
free on x86.

Since multiple threads could be incrementing in parallel, we assert
around INT_MAX to make sure none of these approach the wrap around
point: this way we get a memory leak and not a memory corruption, the
former is generally easier to debug.

Signed-off-by: Michael S. Tsirkin <[email protected]>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20220302' into staging

target-arm queue:
* mps3-an547: Add missing user ahb interfaces
* hw/arm/mps2-tz.c: Update AN547 documentation URL
* hw/input/tsc210x: Don't abort on bad SPI word widths
* hw/i2c: flatten pca954x mux device
* target/arm: Support PSCI 1.1 and SMCCC 1.0
* target/arm: Fix early free of TCG temp in handle_simd_shift_fpint_conv()
* tests/qtest: add qtests for npcm7xx sdhci
* Implement FEAT_LVA
* Implement FEAT_LPA
* Implement FEAT_LPA2 (but do not enable it yet)
* Report KVM's actual PSCI version to guest in dtb
* ui/cocoa.m: Fix updateUIInfo threading issues
* ui/cocoa.m: Remove unnecessary NSAutoreleasePools

# gpg: Signature made Wed 02 Mar 2022 20:52:06 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Peter Maydell <[email protected]>" [ultimate]
# gpg:                 aka "Peter Maydell <[email protected]>" [ultimate]
# gpg:                 aka "Peter Maydell <[email protected]>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20220302: (26 commits)
  ui/cocoa.m: Remove unnecessary NSAutoreleasePools
  ui/cocoa.m: Fix updateUIInfo threading issues
  target/arm: Report KVM's actual PSCI version to guest in dtb
  target/arm: Implement FEAT_LPA2
  target/arm: Advertise all page sizes for -cpu max
  target/arm: Validate tlbi TG matches translation granule in use
  target/arm: Fix TLBIRange.base for 16k and 64k pages
  target/arm: Introduce tlbi_aa64_get_range
  target/arm: Extend arm_fi_to_lfsc to level -1
  target/arm: Implement FEAT_LPA
  target/arm: Implement FEAT_LVA
  target/arm: Prepare DBGBVR and DBGWVR for FEAT_LVA
  target/arm: Honor TCR_ELx.{I}PS
  target/arm: Use MAKE_64BIT_MASK to compute indexmask
  target/arm: Pass outputsize down to check_s2_mmu_setup
  target/arm: Move arm_pamax out of line
  target/arm: Fault on invalid TCR_ELx.TxSZ
  target/arm: Set TCR_EL1.TSZ for user-only
  hw/registerfields: Add FIELD_SEX<N> and FIELD_SDP<N>
  tests/qtest: add qtests for npcm7xx sdhci
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-migration-20220302b' into staging

Migration/HMP/Virtio pull 2022-03-02

A bit of a mix this time:
  * Minor fixes from myself, Hanna, and Jack
  * VNC password rework by Stefan and Fabian
  * Postcopy changes from Peter X that are
    the start of a larger series to come
  * Removing the prehistoic load_state_old
    code from Peter M

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
# gpg: Signature made Wed 02 Mar 2022 18:25:12 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-migration-20220302b:
  migration: Remove load_state_old and minimum_version_id_old
  tests: Pass in MigrateStart** into test_migrate_start()
  migration: Add migration_incoming_transport_cleanup()
  migration: postcopy_pause_fault_thread() never fails
  migration: Enlarge postcopy recovery to capture !-EIO too
  migration: Move static var in ram_block_from_stream() into global
  migration: Add postcopy_thread_create()
  migration: Dump ramblock and offset too when non-same-page detected
  migration: Introduce postcopy channels on dest node
  migration: Tracepoint change in postcopy-run bottom half
  migration: Finer grained tracepoints for POSTCOPY_LISTEN
  migration: Dump sub-cmd name in loadvm_process_command tp
  migration/rdma: set the REUSEADDR option for destination
  qapi/monitor: allow VNC display id in set/expire_password
  qapi/monitor: refactor set/expire_password with enums
  monitor/hmp: add support for flag argument with value
  virtiofsd: Let meson check for statx.stx_mnt_id
  clock-vmstate: Add missing END_OF_LIST

Signed-off-by: Peter Maydell <[email protected]>

ui/cocoa.m: Remove unnecessary NSAutoreleasePools

In commit 6e657e64cdc478 in 2013 we added some autorelease pools to
deal with complaints from macOS when we made calls into Cocoa from
threads that didn't have automatically created autorelease pools.
Later on, macOS got stricter about forbidding cross-thread Cocoa
calls, and in commit 5588840ff77800e839d8 we restructured the code to
avoid them. This left the autorelease pool creation in several
functions without any purpose; delete it.

We still need the pool in cocoa_refresh() for the clipboard related
code which is called directly there.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Tested-by: Akihiko Odaki <[email protected]>
Message-id: 20220224101330 [email protected]

ui/cocoa.m: Fix updateUIInfo threading issues

The updateUIInfo method makes Cocoa API calls.  It also calls back
into QEMU functions like dpy_set_ui_info().  To do this safely, we
need to follow two rules:
* Cocoa API calls are made on the Cocoa UI thread
* When calling back into QEMU we must hold the iothread lock

Fix the places where we got this wrong, by taking the iothread lock
while executing updateUIInfo, and moving the call in cocoa_switch()
inside the dispatch_async block.

Some of the Cocoa UI methods which call updateUIInfo are invoked as
part of the initial application startup, while we're still doing the
little cross-thread dance described in the comment just above
call_qemu_main().  This meant they were calling back into the QEMU UI
layer before we'd actually finished initializing our display and
registered the DisplayChangeListener, which isn't really valid.  Once
updateUIInfo takes the iothread lock, we no longer get away with
this, because during this startup phase the iothread lock is held by
the QEMU main-loop thread which is waiting for us to finish our
display initialization.  So we must suppress updateUIInfo until
applicationDidFinishLaunching allows the QEMU main-loop thread to
continue.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Tested-by: Akihiko Odaki <[email protected]>
Message-id: 20220224101330 [email protected]

target/arm: Report KVM's actual PSCI version to guest in dtb

When we're using KVM, the PSCI implementation is provided by the
kernel, but QEMU has to tell the guest about it via the device tree.
Currently we look at the KVM_CAP_ARM_PSCI_0_2 capability to determine
if the kernel is providing at least PSCI 0.2, but if the kernel
provides a newer version than that we will still only tell the guest
it has PSCI 0.2. (This is fairly harmless; it just means the guest
won't use newer parts of the PSCI API.)

The kernel exposes the specific PSCI version it is implementing via
the ONE_REG API; use this to report in the dtb that the PSCI
implementation is 1.0-compatible if appropriate. (The device tree
binding currently only distinguishes "pre-0.2", "0.2-compatible" and
"1.0-compatible".)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Message-id: 20220224134655.1207865 [email protected]

target/arm: Implement FEAT_LPA2

This feature widens physical addresses (and intermediate physical
addresses for 2-stage translation) from 48 to 52 bits, when using
4k or 16k pages.

This introduces the DS bit to TCR_ELx, which is RES0 unless the
page size is enabled and supports LPA2, resulting in the effective
value of DS for a given table walk. The DS bit changes the format
of the page table descriptor slightly, moving the PS field out to
TCR so that all pages have the same sharability and repurposing
those bits of the page table descriptor for the highest bits of
the output address.

Do not yet enable FEAT_LPA2; we need extra plumbing to avoid
tickling an old kernel bug.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Advertise all page sizes for -cpu max

We support 16k pages, but do not advertize that in ID_AA64MMFR0.

The value 0 in the TGRAN*_2 fields indicates that stage2 lookups defer
to the same support as stage1 lookups. This setting is deprecated, so
indicate support for all stage2 page sizes directly.

Signed-off-by: Richard Henderson <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Validate tlbi TG matches translation granule in use

For FEAT_LPA2, we will need other ARMVAParameters, which themselves
depend on the translation granule in use. We might as well validate
that the given TG matches; the architecture "does not require that
the instruction invalidates any entries" if this is not true.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Fix TLBIRange.base for 16k and 64k pages

The shift of the BaseADDR field depends on the translation
granule in use.

Fixes: 84940ed8255 ("target/arm: Add support for FEAT_TLBIRANGE")
Reported-by: Peter Maydell <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Introduce tlbi_aa64_get_range

Merge tlbi_aa64_range_get_length and tlbi_aa64_range_get_base,
returning a structure containing both results. Pass in the
ARMMMUIdx, rather than the digested two_ranges boolean.

This is in preparation for FEAT_LPA2, where the interpretation
of 'value' depends on the effective value of DS for the regime.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Extend arm_fi_to_lfsc to level -1

With FEAT_LPA2, rather than introducing translation level 4,
we introduce level -1, below the current level 0. Extend
arm_fi_to_lfsc to handle these faults.

Assert that this new translation level does not leak into
fault types for which it is not defined, which allows some
masking of fi->level to be removed.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement FEAT_LPA

This feature widens physical addresses (and intermediate physical
addresses for 2-stage translation) from 48 to 52 bits, when using
64k pages. The only thing left at this point is to handle the
extra bits in the TTBR and in the table descriptors.

Note that PAR_EL1 and HPFAR_EL2 are nominally extended, but we don't
mask out the high bits when writing to those registers, so no changes
are required there.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement FEAT_LVA

This feature is relatively small, as it applies only to
64k pages and thus requires no additional changes to the
table descriptor walking algorithm, only a change to the
minimum TSZ (which is the inverse of the maximum virtual
address space size).

Note that this feature widens VBAR_ELx, but we already
treat the register as being 64 bits wide.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Prepare DBGBVR and DBGWVR for FEAT_LVA

The original A.a revision of the AArch64 ARM required that we
force-extend the addresses in these registers from 49 bits.
This language has been loosened via a combination of IMPLEMENTATION
DEFINED and CONSTRAINTED UNPREDICTABLE to allow consideration of
the entire aligned address.

This means that we do not have to consider whether or not FEAT_LVA
is enabled, and decide from which bit an address might need to be
extended.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Honor TCR_ELx.{I}PS

This field controls the output (intermediate) physical address size
of the translation process. V8 requires to raise an AddressSize
fault if the page tables are programmed incorrectly, such that any
intermediate descriptor address, or the final translated address,
is out of range.

Add a PS field to ARMVAParameters, and properly compute outputsize
in get_phys_addr_lpae. Test the descaddr as extracted from TTBR
and from page table entries.

Restrict descaddrmask so that we won't raise the fault for v7.

Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Use MAKE_64BIT_MASK to compute indexmask

The macro is a bit more readable than the inlined computation.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Pass outputsize down to check_s2_mmu_setup

Pass down the width of the output address from translation.
For now this is still just PAMax, but a subsequent patch will
compute the correct value from TCR_ELx.{I}PS.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Move arm_pamax out of line

We will shortly share parts of this function with other portions
of address translation.

Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Fault on invalid TCR_ELx.TxSZ

Without FEAT_LVA, the behaviour of programming an invalid value
is IMPLEMENTATION DEFINED. With FEAT_LVA, programming an invalid
minimum value requires a Translation fault.

It is most self-consistent to choose to generate the fault always.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Set TCR_EL1.TSZ for user-only

Set this as the kernel would, to 48 bits, to keep the computation
of the address space correct for PAuth.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/registerfields: Add FIELD_SEX<N> and FIELD_SDP<N>

Add new macros to manipulate signed fields within the register.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20220301215958 [email protected]
Suggested-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

tests/qtest: add qtests for npcm7xx sdhci

Reviewed-by: Hao Wu <[email protected]>
Reviewed-by: Chris Rauer <[email protected]>
Signed-off-by: Shengtan Mao <[email protected]>
Signed-off-by: Patrick Venture <[email protected]>
Message-id: 20220225174451 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Fix early free of TCG temp in handle_simd_shift_fpint_conv()

handle_simd_shift_fpint_conv() was accidentally freeing the TCG
temporary tcg_fpstatus too early, before the last use of it. Move
the free down to where it belongs.

Signed-off-by: Wentao_Liang <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
[PMM: cleaned up commit message]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Support PSCI 1.1 and SMCCC 1.0

Support the latest PSCI on TCG and HVF. A 64-bit function called from
AArch32 now returns NOT_SUPPORTED, which is necessary to adhere to SMC
Calling Convention 1.0. It is still not compliant with SMCCC 1.3 since
they do not implement mandatory functions.

Signed-off-by: Akihiko Odaki <[email protected]>
Message-id: 20220213035753 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
[PMM: update MISMATCH_CHECK checks on PSCI_VERSION macros to match]
Signed-off-by: Peter Maydell <[email protected]>

hw/i2c: flatten pca954x mux device

Previously this device created N subdevices which each owned an i2c bus.
Now this device simply owns the N i2c busses directly.

Tested: Verified devices behind mux are still accessible via qmp and i2c
from within an arm32 SoC.

Reviewed-by: Hao Wu <[email protected]>
Signed-off-by: Patrick Venture <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Tested-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20220202164533.1283668 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/input/tsc210x: Don't abort on bad SPI word widths

The tsc210x doesn't support anything other than 16-bit reads on the
SPI bus, but the guest can program the SPI controller to attempt
them anyway. If this happens, don't abort QEMU, just log this as
a guest error.

This fixes our machine_arm_n8x0.py:N8x0Machine.test_n800
acceptance test, which hits this assertion.

The reason we hit the assertion is because the guest kernel thinks
there is a TSC2005 on this SPI bus address, not a TSC210x.  (The n810
*does* have a TSC2005 at this address.) The TSC2005 supports the
24-bit accesses which the guest driver makes, and the TSC210x does
not (that is, our TSC210x emulation is not missing support for a word
width the hardware can handle).  It's not clear whether the problem
here is that the guest kernel incorrectly thinks the n800 has the
same device at this SPI bus address as the n810, or that QEMU's n810
board model doesn't get the SPI devices right.  At this late date
there no longer appears to be any reliable information on the web
about the hardware behaviour, but I am inclined to think this is a
guest kernel bug.  In any case, we prefer not to abort QEMU for
guest-triggerable conditions, so logging the error is the right thing
to do.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/736
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Message-id: 20220221140750 [email protected]

hw/arm/mps2-tz.c: Update AN547 documentation URL

The AN547 application note URL has changed: update our comment
accordingly. (Rev B is still downloadable from the old URL,
but there is a new Rev C of the document now.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Tested-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20220221094144 [email protected]

mps3-an547: Add missing user ahb interfaces

With these interfaces missing, TFM would delegate peripherals 0, 1,
2, 3 and 8, and qemu would ignore the delegation of interface 8, as
it thought interface 4 was eth & USB.

This patch corrects this behavior and allows TFM to delegate the
eth & USB peripheral to NS mode.

(The old QEMU behaviour was based on revision B of the AN547
appnote; revision C corrects this error in the documentation,
and this commit brings QEMU in to line with how the FPGA
image really behaves.)

Signed-off-by: Jimmy Brisson <[email protected]>
Message-id: 20220210210227.3203883 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
[PMM: added commit message note clarifying that the old behaviour
was a docs issue, not because there were two different versions
of the FPGA image]
Signed-off-by: Peter Maydell <[email protected]>

migration: Remove load_state_old and minimum_version_id_old

There are no longer any VMStateDescription structs in the tree which
use the load_state_old support for custom handling of incoming
migration from very old QEMU. Remove the mechanism entirely.

This includes removing one stray useless setting of
minimum_version_id_old in a VMStateDescription with no load_state_old
function, which crept in after the global weeding-out of them in
commit 17e313406126.

Signed-off-by: Peter Maydell <[email protected]>
Message-Id: <20220215175705.3846411 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Francisco Iglesias <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

tests: Pass in MigrateStart** into test_migrate_start()

test_migrate_start() will release the MigrateStart structure that passed
in, however that's not super clear to the caller because after the call
returned the pointer can still be referenced by the callers. It can easily
be a source of use-after-free.

Let's pass in a double pointer of that, then we can safely clear the
pointer for the caller after the struct is released.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
dgilbert: Fixup apply since I didn't take 24/25

migration: Add migration_incoming_transport_cleanup()

Add a helper to cleanup the transport listener.

When do it, we should also null-ify the cleanup hook and the data, then it's
even safe to call it multiple times.

Move the socket_address_list cleanup altogether, because that's a mirror of the
listener channels and only for the purpose of query-migrate. Hence when
someone wants to cleanup the listener transport, it should also want to cleanup
the socket list too, always.

No functional change intended.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: postcopy_pause_fault_thread() never fails

Per the title, remove the return code and simplify the callers as the errors
will never be triggered. No functional change intended.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Enlarge postcopy recovery to capture !-EIO too

We used to have quite a few places making sure -EIO happened and that's the
only way to trigger postcopy recovery. That's based on the assumption that
we'll only return -EIO for channel issues.

It'll work in 99.99% cases but logically that won't cover some corner cases.
One example is e.g. ram_block_from_stream() could fail with an interrupted
network, then -EINVAL will be returned instead of -EIO.

I remembered Dave Gilbert pointed that out before, but somehow this is
overlooked. Neither did I encounter anything outside the -EIO error.

However we'd better touch that up before it triggers a rare VM data loss during
live migrating.

To cover as much those cases as possible, remove the -EIO restriction on
triggering the postcopy recovery, because even if it's not a channel failure,
we can't do anything better than halting QEMU anyway - the corpse of the
process may even be used by a good hand to dig out useful memory regions, or
the admin could simply kill the process later on.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Move static var in ram_block_from_stream() into global

Static variable is very unfriendly to threading of ram_block_from_stream().
Move it into MigrationIncomingState.

Make the incoming state pointer to be passed over to ram_block_from_stream() on
both caller sites.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Add postcopy_thread_create()

Postcopy create threads. A common manner is we init a sem and use it to sync
with the thread. Namely, we have fault_thread_sem and listen_thread_sem and
they're only used for this.

Make it a shared infrastructure so it's easier to create yet another thread.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Dump ramblock and offset too when non-same-page detected

In ram_load_postcopy() we'll try to detect non-same-page case and dump error.
This error is very helpful for debugging. Adding ramblock & offset into the
error log too.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
dgilbert: Fix up long line

migration: Introduce postcopy channels on dest node

Postcopy handles huge pages in a special way that currently we can only have
one "channel" to transfer the page.

It's because when we install pages using UFFDIO_COPY, we need to have the whole
huge page ready, it also means we need to have a temp huge page when trying to
receive the whole content of the page.

Currently all maintainance around this tmp page is global: firstly we'll
allocate a temp huge page, then we maintain its status mostly within
ram_load_postcopy().

To enable multiple channels for postcopy, the first thing we need to do is to
prepare N temp huge pages as caching, one for each channel.

Meanwhile we need to maintain the tmp huge page status per-channel too.

To give some example, some local variables maintained in ram_load_postcopy()
are listed; they are responsible for maintaining temp huge page status:

  - all_zero:     this keeps whether this huge page contains all zeros
  - target_pages: this counts how many target pages have been copied
  - host_page:    this keeps the host ptr for the page to install

Move all these fields to be together with the temp huge pages to form a new
structure called PostcopyTmpPage.  Then for each (future) postcopy channel, we
need one structure to keep the state around.

For vanilla postcopy, obviously there's only one channel.  It contains both
precopy and postcopy pages.

This patch teaches the dest migration node to start realize the possible number
of postcopy channels by introducing the "postcopy_channels" variable.  Its
value is calculated when setup postcopy on dest node (during POSTCOPY_LISTEN
phase).

Vanilla postcopy will have channels=1, but when postcopy-preempt capability is
enabled (in the future), we will boost it to 2 because even during partial
sending of a precopy huge page we still want to preempt it and start sending
the postcopy requested page right away (so we start to keep two temp huge
pages; more if we want to enable multifd).  In this patch there's a TODO marked
for that; so far the channels is always set to 1.

We need to send one "host huge page" on one channel only and we cannot split
them, because otherwise the data upon the same huge page can locate on more
than one channel so we need more complicated logic to manage.  One temp host
huge page for each channel will be enough for us for now.

Postcopy will still always use the index=0 huge page even after this patch.
However it prepares for the latter patches where it can start to use multiple
channels (which needs src intervention, because only src knows which channel we
should use).

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
  dgilbert: Fixed up long line

migration: Tracepoint change in postcopy-run bottom half

Remove the old two tracepoints and they're even near each other:

trace_loadvm_postcopy_handle_run_cpu_sync()
trace_loadvm_postcopy_handle_run_vmstart()

Add trace_loadvm_postcopy_handle_run_bh() with a finer granule trace.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Finer grained tracepoints for POSTCOPY_LISTEN

The enablement of postcopy listening has a few steps, add a few tracepoints to
be there ready for some basic measurements for them.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Dump sub-cmd name in loadvm_process_command tp

It'll be easier to read the name rather than index of sub-cmd when debugging.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20220301083925 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration/rdma: set the REUSEADDR option for destination

We hit following error during testing RDMA transport:
in case of migration error, mgmt daemon pick one migration port,
incoming rdma:[::]:8089: RDMA ERROR: Error: could not rdma_bind_addr

Then try another -incoming rdma:[::]:8103, sometime it worked,
sometimes need another try with other ports number.

Set the REUSEADDR option for destination, This allow address could
be reused to avoid rdma_bind_addr error out.

Signed-off-by: Jack Wang <[email protected]>
Message-Id: <20220208085640 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
dgilbert: Fixed up some tabs

qapi/monitor: allow VNC display id in set/expire_password

It is possible to specify more than one VNC server on the command line,
either with an explicit ID or the auto-generated ones à la "default",
"vnc2", "vnc3", ...

It is not possible to change the password on one of these extra VNC
displays though. Fix this by adding a "display" parameter to the
"set_password" and "expire_password" QMP and HMP commands.

For HMP, the display is specified using the "-d" value flag.

For QMP, the schema is updated to explicitly express the supported
variants of the commands with protocol-discriminated unions.

Signed-off-by: Stefan Reiter <[email protected]>
[FE: update "Since: " from 6.2 to 7.0
make @connected a common member of @SetPasswordOptions]
Signed-off-by: Fabian Ebner <[email protected]>
Message-Id: <20220225084949 [email protected]>
Acked-by: Markus Armbruster <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

qapi/monitor: refactor set/expire_password with enums

'protocol' and 'connected' are better suited as enums than as strings,
make use of that. No functional change intended.

Suggested-by: Markus Armbruster <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Stefan Reiter <[email protected]>
[FE: update "Since: " from 6.2 to 7.0
put 'keep' first in enum to ease use as a default]
Signed-off-by: Fabian Ebner <[email protected]>
Message-Id: <20220225084949 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

monitor/hmp: add support for flag argument with value

Adds support for the "-xs" parameter type, where "-x" denotes a flag
name and the "s" suffix indicates that this flag is supposed to take
an arbitrary string parameter.

These parameters are always optional, the entry in the qdict will be
omitted if the flag is not given.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Stefan Reiter <[email protected]>
[FE: fixed typo pointed out by Eric Blake
use s instead of V to indicate string parameter]
Signed-off-by: Fabian Ebner <[email protected]>
Message-Id: <20220225084949 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: Let meson check for statx.stx_mnt_id

In virtiofsd, we assume that the presence of the STATX_MNT_ID macro
implies existence of the statx.stx_mnt_id field. Unfortunately, that is
not necessarily the case: glibc has introduced the macro in its commit
88a2cf6c4bab6e94a65e9c0db8813709372e9180, but the statx.stx_mnt_id field
is still missing from its own headers.

Let meson.build actually chek for both STATX_MNT_ID and
statx.stx_mnt_id, and set CONFIG_STATX_MNT_ID if both are present.
Then, use this config macro in virtiofsd.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/882
Signed-off-by: Hanna Reitz <[email protected]>
Message-Id: <20220223092340 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

clock-vmstate: Add missing END_OF_LIST

Add the missing VMSTATE_END_OF_LIST to vmstate_muldiv

Fixes: 99abcbc7600 ("clock: Provide builtin multiplier/divider")
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Message-Id: <20220111101934 [email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Luc Michel <[email protected]>
Cc: [email protected]
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

Merge remote-tracking branch 'remotes/legoater/tags/pull-ppc-20220302' into staging

ppc-7.0 queue

* ppc/pnv fixes
* PMU EBB support
* target/ppc: PowerISA Vector/VSX instruction batch
* ppc/pnv: Extension of the powernv10 machine with XIVE2 ans PHB5 models
* spapr allocation cleanups

# gpg: Signature made Wed 02 Mar 2022 11:00:42 GMT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <[email protected]>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* remotes/legoater/tags/pull-ppc-20220302: (87 commits)
  hw/ppc/spapr_vio.c: use g_autofree in spapr_dt_vdevice()
  hw/ppc/spapr_rtas.c: use g_autofree in rtas_ibm_get_system_parameter()
  spapr_pci_nvlink2.c: use g_autofree in spapr_phb_nvgpu_ram_populate_dt()
  hw/ppc/spapr_numa.c: simplify spapr_numa_write_assoc_lookup_arrays()
  hw/ppc/spapr_drc.c: use g_autofree in spapr_drc_by_index()
  hw/ppc/spapr_drc.c: use g_autofree in spapr_dr_connector_new()
  hw/ppc/spapr_drc.c: use g_autofree in drc_unrealize()
  hw/ppc/spapr_drc.c: use g_autofree in drc_realize()
  hw/ppc/spapr_drc.c: use g_auto in spapr_dt_drc()
  hw/ppc/spapr_caps.c: use g_autofree in spapr_caps_add_properties()
  hw/ppc/spapr_caps.c: use g_autofree in spapr_cap_get_string()
  hw/ppc/spapr_caps.c: use g_autofree in spapr_cap_set_string()
  hw/ppc/spapr.c: fail early if no firmware found in machine_init()
  hw/ppc/spapr.c: use g_autofree in spapr_dt_chosen()
  pnv/xive2: Add support for 8bits thread id
  pnv/xive2: Add support for automatic save&restore
  xive2: Add a get_config() handler for the router configuration
  pnv/xive2: Add support XIVE2 P9-compat mode (or Gen1)
  ppc/pnv: add XIVE Gen2 TIMA support
  pnv/xive2: Introduce new capability bits
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-semihosting-280222-1' into staging

Testing and semihosting updates:

  - restore TESTS/IMAGES filtering to docker tests
  - add NOUSER to alpine image
  - bump lcitool version
  - move arm64/s390x cross build images to lcitool
  - add aarch32 runner CI scripts
  - expand testing to more vectors
  - update s390x jobs to focal for gitlab/travis
  - disable threadcount for all sh4
  - fix semihosting SYS_HEAPINFO and test

# gpg: Signature made Mon 28 Feb 2022 18:46:41 GMT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-testing-and-semihosting-280222-1:
  tests/tcg: port SYS_HEAPINFO to a system test
  semihosting/arm-compat: replace heuristic for softmmu SYS_HEAPINFO
  tests/tcg: completely disable threadcount for sh4
  gitlab: upgrade the job definition for s390x to 20.04
  travis.yml: Update the s390x jobs to Ubuntu Focal
  tests/tcg: add vectorised sha512 versions
  tests/tcg: add sha512 test
  tests/tcg: build sha1-vector with O3 and compare
  tests/tcg/ppc64: clean-up handling of byte-reverse
  gitlab: add a new aarch32 custom runner definition
  scripts/ci: allow for a secondary runner
  scripts/ci: add build env rules for aarch32 on aarch64
  tests/docker: introduce debian-riscv64-test-cross
  tests/docker: update debian-s390x-cross with lcitool
  tests/docker: update debian-arm64-cross with lcitool
  tests/lcitool: update to latest version
  tests/docker: add NOUSER for alpine image
  tests/docker: restore TESTS/IMAGES filtering

Signed-off-by: Peter Maydell <[email protected]>

hw/ppc/spapr_vio.c: use g_autofree in spapr_dt_vdevice()

And return the result of g_strdup_printf() directly instead of using the
'path' var.

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_rtas.c: use g_autofree in rtas_ibm_get_system_parameter()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

spapr_pci_nvlink2.c: use g_autofree in spapr_phb_nvgpu_ram_populate_dt()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_numa.c: simplify spapr_numa_write_assoc_lookup_arrays()

We can get the job done in spapr_numa_write_assoc_lookup_arrays() a bit
cleaner:

- 'cur_index = int_buf = g_malloc0(..)' is doing a g_malloc0() in the
'int_buf' pointer and making 'cur_index' point to 'int_buf' all in a
single line. No problem with that, but splitting into 2 lines is clearer
to follow

- use g_autofree in 'int_buf' to avoid a g_free() call later on

- 'buf_len' is only being used to store the size of 'int_buf' malloc.
Remove the var and just use the value in g_malloc0() directly

- remove the 'ret' var and just return the result of fdt_setprop()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_drc.c: use g_autofree in spapr_drc_by_index()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_drc.c: use g_autofree in spapr_dr_connector_new()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_drc.c: use g_autofree in drc_unrealize()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_drc.c: use g_autofree in drc_realize()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_drc.c: use g_auto in spapr_dt_drc()

Use g_autoptr() with GArray* and GString* pointers to avoid calling
g_free() and the need for the 'out' label.

'drc_name' can also be g_autofreed to avoid a g_free() call at the end
of the while() loop.

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_caps.c: use g_autofree in spapr_caps_add_properties()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>

hw/ppc/spapr_caps.c: use g_autofree in spapr_cap_get_string()

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20220228175004 [email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>