migrate: remove QMP/HMP commands for speed, downtime and cache size
The generic 'migrate_set_parameters' command handle all types of param.
Only the QMP commands were documented in the deprecations page, but the
rationale for deprecating applies equally to HMP, and the replacements
exist. Furthermore the HMP commands are just shims to the QMP commands,
so removing the latter breaks the former unless they get re-implemented.
The code comment suggests removing QAPIEvent_(str|lookup) symbols too,
however, these are both auto-generated as standard for any enum in
QAPI. As such it they'll exist whether we use them or not.
* remotes/awilliam/tags/vfio-update-20210316.0:
vfio/migrate: Move switch of dirty tracking into vfio_memory_listener
vfio: Support host translation granule size
vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration
vfio: Set the priority of the VFIO VM state change handler explicitly
vfio: Move the saving of the config space to the right place in VFIO migration
spapr_iommu: Fix vhost integration regression
vfio: Do not register any IOMMU_NOTIFIER_DEVIOTLB_UNMAP notifier
MAINTAINERS: Cover docs/igd-assign.txt in VFIO section
hw/vfio/pci-quirks: Replace the word 'blacklist'
vfio: Fix vfio_listener_log_sync function name typo
Peter Maydell [Wed, 17 Mar 2021 18:28:03 +0000 (18:28 +0000)]
Merge remote-tracking branch 'remotes/cschoenebeck/tags/pull-9p-20210316' into staging
9pfs: code cleanup
* Use lock-guard design pattern instead of manual lock/unlock.
# gpg: Signature made Tue 16 Mar 2021 10:49:09 GMT
# gpg: using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395
# gpg: issuer "[email protected]"
# gpg: Good signature from "Christian Schoenebeck <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: ECAB 1A45 4014 1413 BA38 4926 30DB 47C3 A012 D5F4
# Subkey fingerprint: 96D8 D110 CF7A F808 4F88 5901 34C2 B587 65A4 7395
* remotes/cschoenebeck/tags/pull-9p-20210316:
hw/9pfs/9p-synth: Replaced qemu_mutex_lock with QEMU_LOCK_GUARD
* remotes/kraxel/tags/audio-20210316-pull-request:
coreaudio: Handle output device change
coreaudio: Extract device operations
coreaudio: Drop support for macOS older than 10.6
Peter Maydell [Wed, 17 Mar 2021 15:01:09 +0000 (15:01 +0000)]
Merge remote-tracking branch 'remotes/cohuck-gitlab/tags/s390x-20210316' into staging
s390x updates:
- get rid of legacy_s390_alloc() and phys_mem_set_alloc()
- tcg: implement the MVPG condition-code-option bit
- fix g_autofree variable handing in the pci vfio code
- use official z15 names in the cpu model definitions
* remotes/cohuck-gitlab/tags/s390x-20210316:
s390x/pci: Add missing initialization for g_autofree variables
target/s390x: Store r1/r2 for page-translation exceptions during MVPG
target/s390x: Implement the MVPG condition-code-option bit
s390x/cpu_model: use official name for 8562
exec: Get rid of phys_mem_set_alloc()
s390x/kvm: Get rid of legacy_s390_alloc()
* remotes/kraxel/tags/ui-20210316-pull-request:
ui/cocoa: Comment about modifier key input quirks
ui: fold qemu_alloc_display in only caller
ui: honour the actual guest display dimensions without rounding
ui: use client width/height in WMVi message
ui: avoid sending framebuffer updates outside client desktop bounds
ui: add more trace points for VNC client/server messages
ui/cocoa: Do not exit immediately after shutdown
opengl: Do not convert format with glTexImage2D on OpenGL ES
ui: deprecate "password" option for SPICE server
ui: introduce "password-secret" option for SPICE server
ui: introduce "password-secret" option for VNC servers
Peter Maydell [Wed, 17 Mar 2021 09:07:28 +0000 (09:07 +0000)]
Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20210315' into staging
virtiofs and migration pull 2021-03-15
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
# gpg: Signature made Mon 15 Mar 2021 20:03:03 GMT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert-gitlab/tags/pull-virtiofs-20210315:
migration: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD
monitor: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD
migration/tls: add error handling in multifd_tls_handshake_thread
migration/tls: fix inverted semantics in multifd_channel_connect
virtiofsd: Convert some functions to return bool
virtiofsd: Don't allow empty paths in lookup_name()
virtiofsd: Don't allow empty filenames
virtiofsd: Add qemu version and copyright info
virtiofsd: Release vu_dispatch_lock when stopping queue
Keqian Zhu [Tue, 9 Mar 2021 03:19:13 +0000 (11:19 +0800)]
vfio/migrate: Move switch of dirty tracking into vfio_memory_listener
For now the switch of vfio dirty page tracking is integrated into
@vfio_save_handler. The reason is that some PCI vendor driver may
start to track dirty base on _SAVING state of device, so if dirty
tracking is started before setting device state, vfio will report
full-dirty to QEMU.
However, the dirty bmap of all ramblocks are fully set when setup
ram saving, so it's not matter whether the device is in _SAVING
state when start vfio dirty tracking.
Moreover, this logic causes some problems [1]. The object of dirty
tracking is guest memory, but the object of @vfio_save_handler is
device state, which produces unnecessary coupling and conflicts:
1. Coupling: Their saving granule is different (perVM vs perDevice).
vfio will enable dirty_page_tracking for each devices, actually
once is enough.
2. Conflicts: The ram_save_setup() traverses all memory_listeners
to execute their log_start() and log_sync() hooks to get the
first round dirty bitmap, which is used by the bulk stage of
ram saving. However, as vfio dirty tracking is not yet started,
it can't get dirty bitmap from vfio. Then we give up the chance
to handle vfio dirty page at bulk stage.
Move the switch of vfio dirty_page_tracking into vfio_memory_listener
can solve above problems. Besides, Do not require devices in SAVING
state for vfio_sync_dirty_bitmap().
Kunkun Jiang [Thu, 4 Mar 2021 13:34:46 +0000 (21:34 +0800)]
vfio: Support host translation granule size
The cpu_physical_memory_set_dirty_lebitmap() can quickly deal with
the dirty pages of memory by bitmap-traveling, regardless of whether
the bitmap is aligned correctly or not.
cpu_physical_memory_set_dirty_lebitmap() supports pages in bitmap of
host page size. So it'd better to set bitmap_pgsize to host page size
to support more translation granule sizes.
[aw: The Fixes commit below introduced code to restrict migration
support to configurations where the target page size intersects the
host dirty page support. For example, a 4K guest on a 4K host.
Due to the above flexibility in bitmap handling, this restriction
unnecessarily prevents mixed target/host pages size that could
otherwise be supported. Use host page size for dirty bitmap.]
Shenming Lu [Wed, 10 Mar 2021 03:02:33 +0000 (11:02 +0800)]
vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration
In VFIO migration resume phase and some guest startups, there are
already unmasked vectors in the vector table when calling
vfio_msix_enable(). So in order to avoid inefficiently disabling
and enabling vectors repeatedly, let's allocate all needed vectors
first and then enable these unmasked vectors one by one without
disabling.
Shenming Lu [Wed, 10 Mar 2021 03:02:32 +0000 (11:02 +0800)]
vfio: Set the priority of the VFIO VM state change handler explicitly
In the VFIO VM state change handler when stopping the VM, the _RUNNING
bit in device_state is cleared which makes the VFIO device stop, including
no longer generating interrupts. Then we can save the pending states of
all interrupts in the GIC VM state change handler (on ARM).
So we have to set the priority of the VFIO VM state change handler
explicitly (like virtio devices) to ensure it is called before the
GIC's in saving.
Shenming Lu [Wed, 10 Mar 2021 03:02:31 +0000 (11:02 +0800)]
vfio: Move the saving of the config space to the right place in VFIO migration
On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
setup, if the restoring of the VFIO PCI device config space is
before the VGIC, an error might occur in the kernel.
So we move the saving of the config space to the non-iterable
process, thus it will be called after the VGIC according to
their priorities.
As for the possible dependence of the device specific migration
data on it's config space, we can let the vendor driver to
include any config info it needs in its own data stream.
Eric Auger [Tue, 9 Feb 2021 21:32:33 +0000 (22:32 +0100)]
spapr_iommu: Fix vhost integration regression
Previous work on dev-iotlb message broke spapr_iommu/vhost integration
as it did for SMMU and virtio-iommu. The spapr_iommu currently
only sends IOMMU_NOTIFIER_UNMAP notifications. Since commit 958ec334bca3 ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support"),
VHOST first tries to register IOMMU_NOTIFIER_DEVIOTLB_UNMAP notifier
and if it fails, falls back to legacy IOMMU_NOTIFIER_UNMAP. So
spapr_iommu must fail on the IOMMU_NOTIFIER_DEVIOTLB_UNMAP
registration.
Eric Auger [Tue, 9 Feb 2021 21:32:32 +0000 (22:32 +0100)]
vfio: Do not register any IOMMU_NOTIFIER_DEVIOTLB_UNMAP notifier
In an attempt to fix smmu/virtio-iommu - vhost regression, commit 958ec334bca3 ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support")
broke virtio-iommu integration. This is due to the fact VFIO registers
IOMMU_NOTIFIER_ALL notifiers, which includes IOMMU_NOTIFIER_DEVIOTLB_UNMAP
and this latter now is rejected by the virtio-iommu. As a consequence,
the registration fails. VHOST behaves like a device with an ATC cache. The
VFIO device does not support this scheme yet.
Let's register only legacy MAP and UNMAP notifiers.
Follow the inclusive terminology from the "Conscious Language in your
Open Source Projects" guidelines [*] and replace the word "blacklist"
appropriately.
* remotes/kraxel/tags/usb-20210315-pull-request:
usb/storage: clear csw on reset
usb/storage: add kconfig symbols
usb/storage move usb-storage device to separate source file
usb/storage: move usb-bot device to separate source file
usb/storage: move declarations to usb/msd.h header
hw/usb: Extract VT82C686 UHCI PCI function into a new unit
hw/usb/hcd-uhci: Expose generic prototypes to local header
hw/southbridge: Add missing Kconfig dependency VT82C686 on USB_UHCI
usb: Document the missing -usbdevice options
usb: Un-deprecate -usbdevice (except for -usbdevice audio which gets removed)
usb: remove '-usbdevice u2f-key'
usb: remove support for -usbdevice parameters
hw/usb/bus: Remove the "full-path" property
Peter Maydell [Tue, 16 Mar 2021 10:53:47 +0000 (10:53 +0000)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 15 Mar 2021 08:42:25 GMT
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net: Do not fill legacy info_str for backends
hmp: Use QAPI NetdevInfo in hmp_info_network
net: Move NetClientState.info_str to dynamic allocations
tests: Add tests for query-netdev command
qapi: net: Add query-netdev command
pvrdma: wean code off pvrdma_ring.h kernel header
lan9118: switch to use qemu_receive_packet() for loopback
cadence_gem: switch to use qemu_receive_packet() for loopback
pcnet: switch to use qemu_receive_packet() for loopback
rtl8139: switch to use qemu_receive_packet() for loopback
tx_pkt: switch to use qemu_receive_packet_iov() for loopback
sungem: switch to use qemu_receive_packet() for loopback
msf2-mac: switch to use qemu_receive_packet() for loopback
dp8393x: switch to use qemu_receive_packet() for loopback packet
e1000: switch to use qemu_receive_packet() for loopback
net: introduce qemu_receive_packet()
e1000: fail early for evil descriptor
net: validate that ids are well formed
net: Fix build error when DEBUG_NET is on
virtio-net: calculating proper msix vectors on init
Signed-off-by: Peter Maydell <[email protected]>
# Conflicts:
# hw/core/machine.c
Mahmoud Mandour [Thu, 11 Mar 2021 03:15:37 +0000 (05:15 +0200)]
hw/9pfs/9p-synth: Replaced qemu_mutex_lock with QEMU_LOCK_GUARD
Replaced a call to qemu_mutex_lock and its respective call to
qemu_mutex_unlock and used QEMU_LOCK_GUARD macro in their place.
This simplifies the code by removing the call required to unlock
and also eliminates goto paths.
Laurent Vivier [Fri, 12 Mar 2021 21:41:45 +0000 (22:41 +0100)]
m68k: add Virtual M68k Machine
The machine is based on Goldfish interfaces defined by Google
for Android simulator. It uses Goldfish-rtc (timer and RTC),
Goldfish-pic (PIC) and Goldfish-tty (for serial port and early tty).
The machine is created with 128 virtio-mmio bus, and they can
be used to use serial console, GPU, disk, NIC, HID, ...
Laurent Vivier [Fri, 12 Mar 2021 21:41:43 +0000 (22:41 +0100)]
m68k: add an interrupt controller
A (generic) copy of the GLUE device we already have for q800 to use with
the m68k-virt machine.
The q800 one would disappear in the future as q800 uses actually the djMEMC
controller.
Mahmoud Mandour [Thu, 11 Mar 2021 03:15:35 +0000 (05:15 +0200)]
migration: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD
Replaced various qemu_mutex_lock calls and their respective
qemu_mutex_unlock calls with QEMU_LOCK_GUARD macro. This simplifies
the code by eliminating the respective qemu_mutex_unlock calls.
Mahmoud Mandour [Thu, 11 Mar 2021 03:15:34 +0000 (05:15 +0200)]
monitor: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD
Removed various qemu_mutex_lock and their respective qemu_mutex_unlock
calls and used lock guard macros (QEMU_LOCK_GUARD and
WITH_QEMU_LOCK_GUARD). This simplifies the code by
eliminating qemu_mutex_unlock calls.
Hao Wang [Tue, 9 Feb 2021 10:42:37 +0000 (18:42 +0800)]
migration/tls: add error handling in multifd_tls_handshake_thread
If any error happens during multifd send thread creating (e.g. channel broke
because new domain is destroyed by the dst), multifd_tls_handshake_thread
may exit silently, leaving main migration thread hanging (ram_save_setup ->
multifd_send_sync_main -> qemu_sem_wait(&p->sem_sync)).
Fix that by adding error handling in multifd_tls_handshake_thread.
Greg Kurz [Fri, 12 Mar 2021 14:10:01 +0000 (15:10 +0100)]
virtiofsd: Don't allow empty paths in lookup_name()
When passed an empty filename, lookup_name() returns the inode of
the parent directory, unless the parent is the root in which case
the st_dev doesn't match and lo_find() returns NULL. This is
because lookup_name() passes AT_EMPTY_PATH down to fstatat() or
statx().
This behavior doesn't quite make sense because users of lookup_name()
then pass the name to unlinkat(), renameat() or renameat2(), all of
which will always fail on empty names.
Drop AT_EMPTY_PATH from the flags in lookup_name() so that it has
the consistent behavior of "returning an existing child inode or
NULL" for all directories.
Greg Kurz [Fri, 12 Mar 2021 14:10:03 +0000 (15:10 +0100)]
virtiofsd: Don't allow empty filenames
POSIX.1-2017 clearly stipulates that empty filenames aren't
allowed ([1] and [2]). Since virtiofsd is supposed to mirror
the host file system hierarchy and the host can be assumed to
be linux, we don't really expect clients to pass requests with
an empty path in it. If they do so anyway, this would eventually
cause an error when trying to create/lookup the actual inode
on the underlying POSIX filesystem. But this could still confuse
some code that wouldn't be ready to cope with this.
Filter out empty names coming from the client at the top level,
so that the rest doesn't have to care about it. This is done
everywhere we already call is_safe_path_component(), but
in a separate helper since the usual error for empty path
names is ENOENT instead of EINVAL.
Vivek Goyal [Wed, 3 Mar 2021 19:53:39 +0000 (14:53 -0500)]
virtiofsd: Add qemu version and copyright info
Option "-V" currently displays the fuse protocol version virtiofsd is
using. For example, I see this.
$ ./virtiofsd -V
"using FUSE kernel interface version 7.33"
People also want to know software version of virtiofsd so that they can
figure out if a certain fix is part of currently running virtiofsd or
not. Eric Ernst ran into this issue.
David Gilbert thinks that it probably is best that we simply carry the
qemu version and display that information given we are part of qemu
tree.
So this patch enhances version information and also adds qemu version
and copyright info. Not sure if copyright information is supposed
to be displayed along with version info. Given qemu-storage-daemon
and other utilities are doing it, so I continued with same pattern.
This is how now output looks like.
$ ./virtiofsd -V
virtiofsd version 5.2.50 (v5.2.0-2357-gcbcf09872a-dirty)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
using FUSE kernel interface version 7.33
Greg Kurz [Fri, 12 Mar 2021 09:22:12 +0000 (10:22 +0100)]
virtiofsd: Release vu_dispatch_lock when stopping queue
QEMU can stop a virtqueue by sending a VHOST_USER_GET_VRING_BASE request
to virtiofsd. As with all other vhost-user protocol messages, the thread
that runs the main event loop in virtiofsd takes the vu_dispatch lock in
write mode. This ensures that no other thread can access virtqueues or
memory tables at the same time.
In the case of VHOST_USER_GET_VRING_BASE, the main thread basically
notifies the queue thread that it should terminate and waits for its
termination:
Simply have the main thread to release the lock before going to
sleep and take it back afterwards. A very similar patch was already
sent by Vivek Goyal sometime back:
The only difference here is that this done in fv_queue_set_started()
because fv_queue_cleanup_thread() can also be called from virtio_loop()
without the lock being held.
Once we've parsed the fractional value, extract it into an integral
64-bit fraction. Perform the scaling with integer arithmetic, and
simplify the overflow detection.
One of the implications of the fix was that the VNC server would have a
thin black bad down the right hand side if the guest desktop width was
not a multiple of 16. In practice this was a non-issue since the VNC
server was always honouring a guest specified resolution and guests
essentially always pick from a small set of sane resolutions likely in
real world hardware.
We recently introduced support for the extended desktop resize extension
and as a result the VNC client has ability to specify an arbitrary
desktop size and the guest OS may well honour it exactly. As a result we
no longer have any guarantee that the width will be a multiple of 16,
and so when resizing the desktop we have a 93% chance of getting the
black bar on the right hand size.
The VNC server maintains three different desktop dimensions
1. The guest surface
2. The server surface
3. The client desktop
The requirement for the width to be a multiple of 16 only applies to
item 2, the server surface, for the purpose of doing dirty bitmap
tracking.
Normally we will set the client desktop size to always match the server
surface size, but that's not a strict requirement. In order to cope with
clients that don't support the desktop size encoding, we already allow
for the client desktop to be a different size that the server surface.
Thus we can trivially eliminate the black bar, but setting the client
desktop size to be the un-rounded server surface size - the so called
"true width".
The WMVi message is supposed to provide the same width/height
information as the regular desktop resize and extended desktop
resize messages. There can be times where the client width and
height are different from the pixman surface dimensions.
We plan framebuffer update rects based on the VNC server surface. If the
client doesn't support desktop resize, then the client bounds may differ
from the server surface bounds. VNC clients may become upset if we then
send an update message outside the bounds of the client desktop.
This takes the approach of clamping the rectangles from the worker
thread immediately before sending them. This may sometimes results in
sending a framebuffer update message with zero rectangles.
Akihiko Odaki [Fri, 19 Feb 2021 11:16:52 +0000 (20:16 +0900)]
ui/cocoa: Do not exit immediately after shutdown
ui/cocoa used to call exit immediately after calling
qemu_system_shutdown_request, which prevents QEMU from actually
perfoming system shutdown. Just sleep forever, and wait QEMU to call
exit and kill the Cocoa thread.
Akihiko Odaki [Fri, 19 Feb 2021 09:48:03 +0000 (18:48 +0900)]
opengl: Do not convert format with glTexImage2D on OpenGL ES
OpenGL ES does not support conversion from the given data format
to the internal format with glTexImage2D.
Use the given data format as the internal format, and ignore
the given alpha channels with GL_TEXTURE_SWIZZLE_A in case the
format contains alpha channels.
ui: introduce "password-secret" option for SPICE server
Currently when using SPICE the "password" option provides the password
in plain text on the command line. This is insecure as it is visible
to all processes on the host. As an alternative, the password can be
provided separately via the monitor.
This introduces a "password-secret" option which lets the password be
provided up front.
ui: introduce "password-secret" option for VNC servers
Currently when using VNC the "password" flag turns on password based
authentication. The actual password has to be provided separately via
the monitor.
This introduces a "password-secret" option which lets the password be
provided up front.
Gerd Hoffmann [Fri, 12 Mar 2021 09:49:54 +0000 (10:49 +0100)]
usb/storage: clear csw on reset
Stale data in csw (specifically residue) can confuse the state machine
and allows the guest trigger an assert(). So clear csw on reset to
avoid this happening in case the guest resets the device in the middle
of a request.
Thomas Huth [Wed, 10 Mar 2021 17:33:23 +0000 (18:33 +0100)]
usb: Document the missing -usbdevice options
There are some more -usbdevice options that have never been mentioned
in the documentation. Now that we removed -usbdevice from the list
of deprecated features again, we should document them properly.
Thomas Huth [Wed, 10 Mar 2021 17:33:22 +0000 (18:33 +0100)]
usb: Un-deprecate -usbdevice (except for -usbdevice audio which gets removed)
When trying to remove the -usbdevice option, there were complaints that
"-usbdevice braille" is still a very useful shortcut for some people.
Thus we never remove this option. Since it's not such a big burden to
keep it around, and it's also convenient in the sense that you don't
have to worry to enable a host controller explicitly with this option,
we should remove it from he deprecation list again.
However, there is one exception: "-usbdevice audio" should go away, since
audio devices without "audiodev=..." parameter are also on the deprecation
list and you cannot use "-usbdevice audio" with "audiodev".
Paolo Bonzini [Wed, 10 Mar 2021 17:33:20 +0000 (18:33 +0100)]
usb: remove support for -usbdevice parameters
No device needs them anymore and in fact they're undocumented.
Remove the code. The only change in behavior is that "-usbdevice
braille:hello" now reports an error, which is a bugfix.
s390x/pci: Add missing initialization for g_autofree variables
When declaring g_autofree variable without initialization, compiler
will raise "may be used uninitialized in this function" warning due
to automatic free handling.
This is mentioned in docs/devel/style.rst (quote from section
"Automatic memory deallocation"):
* Variables declared with g_auto* MUST always be initialized,
otherwise the cleanup function will use uninitialized stack memory
Add initialization for these declarations to prevent the warning and
comply with coding style.
target/s390x: Store r1/r2 for page-translation exceptions during MVPG
The PoP states:
When EDAT-1 does not apply, and a program interruption due to a
page-translation exception is recognized by the MOVE PAGE
instruction, the contents of the R1 field of the instruction are
stored in bit positions 0-3 of location 162, and the contents of
the R2 field are stored in bit positions 4-7.
If [...] an ASCE-type, region-first-translation,
region-second-translation, region-third-translation, or
segment-translation exception was recognized, the contents of
location 162 are unpredictable.
So we have to write r1/r2 into the lowcore on page-translation
exceptions. Simply handle all exceptions inside our mvpg helper now.
legacy_s390_alloc() was required for dealing with the absence of the ESOP
feature -- on old HW (< gen 10) and old z/VM versions (< 6.3).
As z/VM v6.2 (and even v6.3) is no longer supported since 2017 [1]
and we don't expect to have real users on such old hardware, let's drop
legacy_s390_alloc().
Still check+report an error just in case someone still runs on
such old z/VM environments, or someone runs under weird nested KVM
setups (where we can manually disable ESOP via the CPU model).
No need to check for KVM_CAP_GMAP - that should always be around on
kernels that also have KVM_CAP_DEVICE_CTRL (>= v3.15).
Akihiko Odaki [Thu, 25 Feb 2021 00:12:39 +0000 (09:12 +0900)]
virtio-blk: Respect discard granularity
Report the configured granularity for discard operation to the
guest. If this is not set use the block size.
Since until now we have ignored the configured discard granularity
and always reported the block size, let's add
'report-discard-granularity' property and disable it for older
machine types to avoid migration issues.
Alexey Kirillov [Wed, 3 Mar 2021 09:59:10 +0000 (12:59 +0300)]
net: Do not fill legacy info_str for backends
As we use QAPI NetClientState->stored_config to store and get information
about backend network devices, we can drop fill of legacy field info_str
for them.
We still use info_str field for NIC and hubports, so we can not completely
remove it.
Alexey Kirillov [Wed, 3 Mar 2021 09:59:09 +0000 (12:59 +0300)]
hmp: Use QAPI NetdevInfo in hmp_info_network
Replace usage of legacy field info_str of NetClientState for backend
network devices with QAPI NetdevInfo stored_config that already used
in QMP query-netdev.
This change increases the detail of the "info network" output and takes
a more general approach to composing the output.
Alexey Kirillov [Wed, 3 Mar 2021 09:59:08 +0000 (12:59 +0300)]
net: Move NetClientState.info_str to dynamic allocations
The info_str field of the NetClientState structure is static and has a size
of 256 bytes. This amount is often unclaimed, and the field itself is used
exclusively for HMP "info network".
The patch translates info_str to dynamic memory allocation.
This action is also allows us to painlessly discard usage of this field
for backend devices.
Alexey Kirillov [Wed, 3 Mar 2021 09:59:06 +0000 (12:59 +0300)]
qapi: net: Add query-netdev command
The query-netdev command is used to get the configuration of the current
network device backends (netdevs).
This is the QMP analog of the HMP command "info network" but only for
netdevs (i.e. excluding NIC and hubports).
The query-netdev command returns an array of objects of the NetdevInfo
type, which are an extension of Netdev type. It means that response can
be used for netdev-add after small modification. This can be useful for
recreate the same netdev configuration.
Information about the network device is filled in when it is created or
modified and is available through the NetClientState->stored_config.
Cornelia Huck [Fri, 22 Jan 2021 18:00:29 +0000 (19:00 +0100)]
pvrdma: wean code off pvrdma_ring.h kernel header
The pvrdma code relies on the pvrdma_ring.h kernel header for some
basic ring buffer handling. The content of that header isn't very
exciting, but contains some (q)atomic_*() invocations that (a)
cause manual massaging when doing a headers update, and (b) are
an indication that we probably should not be importing that header
at all.
Let's reimplement the ring buffer handling directly in the pvrdma
code instead. This arguably also improves readability of the code.
Jason Wang [Wed, 24 Feb 2021 03:44:36 +0000 (11:44 +0800)]
net: introduce qemu_receive_packet()
Some NIC supports loopback mode and this is done by calling
nc->info->receive() directly which in fact suppresses the effort of
reentrancy check that is done in qemu_net_queue_send().
Unfortunately we can't use qemu_net_queue_send() here since for
loopback there's no sender as peer, so this patch introduce a
qemu_receive_packet() which is used for implementing loopback mode
for a NIC with this check.
NIC that supports loopback mode will be converted to this helper.
Jason Wang [Wed, 24 Feb 2021 05:45:28 +0000 (13:45 +0800)]
e1000: fail early for evil descriptor
During procss_tx_desc(), driver can try to chain data descriptor with
legacy descriptor, when will lead underflow for the following
calculation in process_tx_desc() for bytes:
Paolo Bonzini [Fri, 12 Mar 2021 14:51:38 +0000 (09:51 -0500)]
net: validate that ids are well formed
When a network or network device is created from the command line or HMP,
QemuOpts ensures that the id passes the id_wellformed check. However,
QMP skips this:
$ qemu-system-x86_64 -qmp stdio -S -nic user,id=123/456
qemu-system-x86_64: -nic user,id=123/456: Parameter id expects an identifier
Identifiers consist of letters, digits, -, ., _, starting with a letter.
Validity checks should be performed always at the bottom of the call chain,
because QMP skips all the steps above. At the same time we know that every
call chain should go through either QMP or (for legacy) through QemuOpts.
Because the id for -net and -nic is automatically generated and not
well-formed by design, just add the check to QMP.
Jason Wang [Mon, 8 Mar 2021 04:49:19 +0000 (12:49 +0800)]
virtio-net: calculating proper msix vectors on init
Currently, the default msix vectors for virtio-net-pci is 3 which is
obvious not suitable for multiqueue guest, so we depends on the user
or management tools to pass a correct vectors parameter. In fact, we
can simplifying this by calculating the number of vectors on realize.
Consider we have N queues, the number of vectors needed is 2*N + 2
(#queue pairs + plus one config interrupt and control vq). We didn't
check whether or not host support control vq because it was added
unconditionally by qemu to avoid breaking legacy guests such as Minix.