Alon Levy [Tue, 26 Mar 2013 10:08:02 +0000 (11:08 +0100)]
virtio-serial: propagate guest_connected to the port on post_load
When migrating a host with a spice agent running the mouse becomes
non operational after the migration due to the agent state being
inconsistent between the guest and the client.
After migration the spicevmc backend on the destination has never been notified
of the (non 0) guest_connected state. Virtio-serial holds this state
information and migrates it; this patch properly propagates this information
to virtio-console and through that to interested chardev backends.
Hans de Goede [Tue, 26 Mar 2013 10:07:56 +0000 (11:07 +0100)]
qemu-char: Automatically do fe_open / fe_close on qemu_chr_add_handlers
Most frontends can't really determine if the guest actually has the frontend
side open. So let's automatically generate fe_open / fe_close as soon as a
frontend becomes ready (as signalled by calling qemu_chr_add_handlers) /
becomes non ready (as signalled by setting all handlers to NULL).
And allow frontends which can actually determine if the guest is listening to
opt-out of this.
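A minimal sketch of the behaviour described above, not the actual qemu-char code. The struct, the explicit_fe_open opt-out flag and all function names are hypothetical, and a single read handler stands in for the full handler set.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*read_fn)(void *opaque, const unsigned char *buf, int len);

/* Hypothetical, simplified character device state. */
struct chardev {
    read_fn on_read;        /* NULL means the frontend is not ready */
    void *opaque;
    bool explicit_fe_open;  /* frontend tracks the guest itself (opt-out) */
    bool fe_open;           /* what the backend has been told */
};

/* A real backend would be notified here; the sketch only records the state. */
void chardev_set_fe_open(struct chardev *c, bool open)
{
    c->fe_open = open;
}

/* Installing a handler implies fe_open, clearing it implies fe_close,
 * unless the frontend opted out and reports the state on its own. */
void chardev_add_handlers(struct chardev *c, read_fn on_read, void *opaque)
{
    c->on_read = on_read;
    c->opaque = opaque;
    if (!c->explicit_fe_open) {
        chardev_set_fe_open(c, on_read != NULL);
    }
}

/* Example handler that ignores its input. */
void discard_input(void *opaque, const unsigned char *buf, int len)
{
    (void)opaque;
    (void)buf;
    (void)len;
}
```

A frontend such as virtio-console, which knows whether the guest has the port open, would set the opt-out flag and call the open/close notification itself.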
Paolo Bonzini [Wed, 27 Mar 2013 13:34:32 +0000 (14:34 +0100)]
compiler: fix warning with GCC 4.8.0
GCC 4.8.0 introduces a new warning:
block/qcow2-snapshot.c: In function 'qcow2_write_snapshots':
block/qcow2-snapshot.c:252:18: error: typedef 'qemu_build_bug_on__253'
locally defined but not used [-Werror=unused-local-typedefs]
QEMU_BUILD_BUG_ON(offsetof(QCowHeader, snapshots_offset) !=
^
cc1: all warnings being treated as errors
(Caret diagnostics aren't perfect yet with macros... :)) Work around it
with __attribute__((unused)).
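For illustration, a typedef-based build-time assertion with the workaround applied might look like the sketch below. The macro and struct names are made up for this example and the real QEMU_BUILD_BUG_ON definition may differ in detail.

```c
#include <stddef.h>

#define BUILD_BUG_CAT_(a, b) a##b
#define BUILD_BUG_CAT(a, b)  BUILD_BUG_CAT_(a, b)

/* A true condition yields a negative array size and thus a compile error.
 * __attribute__((unused)) keeps GCC 4.8 from warning that the local
 * typedef is never referenced (-Wunused-local-typedefs). */
#define MY_BUILD_BUG_ON(cond) \
    typedef char BUILD_BUG_CAT(my_build_bug_on__, __LINE__) \
        [(cond) ? -1 : 1] __attribute__((unused))

/* Hypothetical on-disk header used only to have something to check. */
struct example_header {
    unsigned int magic;
    unsigned long long snapshots_offset;
};

int example_header_layout_ok(void)
{
    /* Compiles only if 'magic' is the first field of the header. */
    MY_BUILD_BUG_ON(offsetof(struct example_header, magic) != 0);
    return 1;
}
```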
Anthony Liguori [Tue, 26 Mar 2013 21:16:43 +0000 (16:16 -0500)]
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
virtio,pci,qom
Work by Alex to support VGA assignment,
pci and virtio fixes by Stefan, Jason and myself, and a
new qmp event for hotplug support by myself.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Tue 26 Mar 2013 02:02:24 PM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Alex Williamson (13) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony: (23 commits)
pcie: Add endpoint capability initialization wrapper
roms: switch oldnoconfig to olddefconfig
pcie: Mangle types to match topology
pci: Create and use API to determine root buses
pci: Create pci_bus_is_express helper
pci: Q35, Root Ports, and Switches create PCI Express buses
pci: Allow PCI bus creation interfaces to specify the type of bus
pci: Move PCI and PCIE type defines
pci: Create and register a new PCI Express TypeInfo
exec: assert that RAMBlock size is non-zero
pci: refuse empty ROM files
pci_bridge: Remove duplicate IRQ swizzle function
pci_bridge: Use a default map_irq function
pci: Fix INTx routing notifier recursion
pci_bridge: drop formatting from source
pci_bridge: factor out common code
pci: Teach PCI Bridges about VGA routing
pci: Add PCI VGA helpers
virtio-pci: guest notifier mask without non-irqfd
virtio-net: remove layout assumptions for mq ctrl
...
Fix the awkward API of mangling the caller specified PCIe type and
just provide an interface to initialize an endpoint device. This
will pick either a regular endpoint or integrated endpoint based on
the bus and return pcie_cap_init to doing exactly what is asked.
Alex Williamson [Thu, 14 Mar 2013 22:01:35 +0000 (16:01 -0600)]
pcie: Mangle types to match topology
Windows will fail to start drivers for devices with an Endpoint type
PCIe capability attached to a Root Complex (code 10 - Device cannot
start). The proper type for such a device is Root Complex Integrated
Endpoint. Devices don't care which they are, so do this conversion
automatically.
This allows the Windows driver to load for nec-usb-xhci when attached
to pcie.0 of a q35 machine.
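The conversion can be sketched roughly as follows. The helper name and the on_root_bus parameter are assumptions made for this example; the two type codes are the Device/Port Type values from the PCI Express capability register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device/Port Type codes from the PCI Express capabilities register. */
#define EXP_TYPE_ENDPOINT 0x0  /* PCI Express Endpoint */
#define EXP_TYPE_RC_END   0x9  /* Root Complex Integrated Endpoint */

/* Hypothetical helper: an Endpoint attached directly to the root complex
 * is advertised as a Root Complex Integrated Endpoint instead. Every
 * other requested type is passed through unchanged. */
uint8_t pcie_fixup_type(uint8_t requested, bool on_root_bus)
{
    if (on_root_bus && requested == EXP_TYPE_ENDPOINT) {
        return EXP_TYPE_RC_END;
    }
    return requested;
}
```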
Stefan Hajnoczi [Mon, 11 Mar 2013 09:20:20 +0000 (10:20 +0100)]
pci: refuse empty ROM files
A zero size ROM file is invalid and should produce a warning.
Attempting to use a zero size file ends up hitting an assertion in
qemu_ram_set_idstr() because RAMBlocks with duplicate addresses are
allocated - due to zero size the allocator doesn't increment the next
available RAMBlock offset.
Also convert __FUNCTION__ to __func__ while we're touching this code.
There are no other __FUNCTION__ instances in pci.c anymore.
Alex Williamson [Thu, 7 Mar 2013 23:17:00 +0000 (16:17 -0700)]
pci_bridge: Remove duplicate IRQ swizzle function
pci_bridge_dev_map_irq_fn() is identical to pci_swizzle_map_irq_fn(),
which is now the default for all PCI bridges. We can therefore remove
this function and the pci_bridge_map_irq() call that used it.
Alex Williamson [Thu, 7 Mar 2013 23:16:54 +0000 (16:16 -0700)]
pci_bridge: Use a default map_irq function
The PCI bridge spec defines a default swizzle for translating INTx
IRQs from secondary bus to primary. Use this by default for any
bridge that doesn't set a function.
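As a reminder of what the default swizzle computes, here is a small standalone sketch. The function name is illustrative and is not the QEMU symbol; pins are numbered 0..3 for INTA..INTD.

```c
/* Standard PCI-to-PCI bridge swizzle: the pin seen on the primary bus is
 * the device (slot) number on the secondary bus plus the pin index,
 * modulo the four INTx pins. */
int example_swizzle_map_irq(int slot, int pin)
{
    return (slot + pin) % 4;
}
```

With this mapping a device in slot 1 raising INTA (pin 0) shows up as INTB (pin 1) on the primary side of the bridge.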
Alex Williamson [Thu, 7 Mar 2013 18:29:19 +0000 (11:29 -0700)]
pci: Fix INTx routing notifier recursion
For some reason we recurse to fire the INTx routing notifier for each
child of a bus, for each possible device of a bus. That means that if
we add a root port, the notifier gets called for that bridge 256
times. If we add an upstream switch behind that root port, 256^2. But
of course we need a downstream switch, 256^3. This starts to be
noticeable. Stop the insanity.
Alex Williamson [Sun, 3 Mar 2013 17:21:32 +0000 (10:21 -0700)]
pci: Teach PCI Bridges about VGA routing
Each PCI Bridge has a set of implied VGA regions that are enabled when
the VGA bit is set in the bridge control register. This allows VGA
devices behind bridges. Unfortunately with VGA Enable, which we
formerly allowed but didn't back, comes along some required VGA
baggage. VGA Palette Snooping is required, along with VGA 16-bit
decoding. We don't yet have support for palette snooping.
We also don't have support for 10-bit VGA aliases, the default mode, but
we enable the register, even on root ports, to avoid confusing guests.
Fortunately there's likely nothing from this century that requires these
features, so the missing bits are noted with TODOs.
Alex Williamson [Sun, 3 Mar 2013 17:21:26 +0000 (10:21 -0700)]
pci: Add PCI VGA helpers
Allow devices to register VGA memory regions for handling PCI spec
defined VGA I/O port and MMIO areas. PCI will attach these to the
bus address spaces and enable them according to the device command
register value.
non-irqfd setups are currently broken with vhost:
we start up masked and nothing unmasks the interrupts.
Fix by using mask notifiers, same as the irqfd path.
Sharing irqchip/non irqchip code is always a good thing,
in this case it will help non irqchip benefit
from backend masking optimization.
Jason Wang [Wed, 6 Mar 2013 05:50:27 +0000 (13:50 +0800)]
virtio-net: remove layout assumptions for mq ctrl
Following commit 921ac5d0f3a0df869db5ce4edf752f51d8b1596a (virtio-net:
remove layout assumptions for ctrl vq), this patch makes multiqueue ctrl
handling not rely on the layout of descriptors.
It seems more logical to have destruction flow start with the subclass
and move up to the base class. This ensures object has a valid
canonical path when destructor is called.
Anthony Liguori [Tue, 26 Mar 2013 18:38:00 +0000 (13:38 -0500)]
Merge remote-tracking branch 'quintela/migration.next' into staging
# By Peter Lieven (9) and others
# Via Juan Quintela
* quintela/migration.next: (22 commits)
Use qemu_put_buffer_async for guest memory pages
Add qemu_put_buffer_async
Use writev ops if available
Store the data to send also in iovec
Update bytes_xfer in qemu_put_byte
Add socket_writev_buffer function
Add QemuFileWritevBuffer QemuFileOps
migration: use XBZRLE only after bulk stage
migration: do not search dirty pages in bulk stage
migration: do not sent zero pages in bulk stage
migration: add an indicator for bulk state of ram migration
migration: search for zero instead of dup pages
bitops: unroll while loop in find_next_bit()
buffer_is_zero: use vector optimizations if possible
cutils: add a function to find non-zero content in a buffer
move vector definitions to qemu-common.h
savevm: Fix bugs in the VMSTATE_VBUFFER_MULTIPLY definition
savevm: Add VMSTATE_STRUCT_VARRAY_POINTER_UINT32
savevm: Add VMSTATE_FLOAT64 helpers
savevm: Add VMSTATE_UINTTL_EQUAL helper
...
Peter Maydell [Mon, 25 Mar 2013 13:15:14 +0000 (13:15 +0000)]
hw/qdev: Abort rather than ignoring errors adding device properties
Instead of ignoring any errors that occur when adding properties
to a new device in device_initfn(), check for them and abort if any
occur. The most likely cause is accidentally adding a duplicate
property, which is a programming error by the device author.
Peter Maydell [Mon, 25 Mar 2013 13:40:44 +0000 (13:40 +0000)]
hw/qdev-properties.c: Improve diagnostic for setting property after realize
Now we have error_setg() we can improve the error message emitted if
you attempt to set a property of a device after the device is realized
(the previous message was "permission denied" which was not very
informative).
KONRAD Frederic [Thu, 21 Mar 2013 14:15:17 +0000 (15:15 +0100)]
virtio-scsi-ccw: switch to new API
Here the virtio-scsi-ccw is modified for the new API. The device
virtio-scsi-ccw extends virtio-ccw-device as before. It creates and
connects a virtio-scsi during the init. The properties are not modified.
KONRAD Frederic [Thu, 21 Mar 2013 14:15:16 +0000 (15:15 +0100)]
virtio-scsi-s390: switch to the new API.
Here the virtio-scsi-s390 is modified for the new API. The device
virtio-scsi-s390 extends virtio-s390-device as before. It creates and
connects a virtio-scsi during the init. The properties are not modified.
KONRAD Frederic [Thu, 21 Mar 2013 14:15:15 +0000 (15:15 +0100)]
virtio-scsi-pci: switch to new API.
Here the virtio-scsi-pci is modified for the new API. The device virtio-scsi-pci
extends virtio-pci. It creates and connects a virtio-scsi during the init.
Anthony Liguori [Tue, 26 Mar 2013 14:25:45 +0000 (09:25 -0500)]
Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Corey Bryant (2) and others
# Via Luiz Capitulino
* luiz/queue/qmp:
New QMP command query-cpu-max and HMP command cpu_max
qmp: fix handling of boolean values in qmp-shell
QMP: TPM QMP and man page documentation updates
QMP: Remove duplicate TPM type from query-tpm
Orit Wasserman [Fri, 22 Mar 2013 14:48:03 +0000 (16:48 +0200)]
Use qemu_put_buffer_async for guest memory pages
This will remove an unneeded copy of guest memory pages.
For the page header and device state we still copy the data to the
static buffer; the other option is to allocate the memory on demand,
which is more expensive.
Orit Wasserman [Fri, 22 Mar 2013 14:48:02 +0000 (16:48 +0200)]
Add qemu_put_buffer_async
This allows us to add a buffer to the iovec to send without copying it
into the static buffer; the buffer will be sent later when qemu_fflush is called.
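The difference between the copying and the non-copying path can be sketched with plain POSIX writev(). Everything here (struct out_file, the out_* functions, the size limits) is invented for the example and is not the QEMUFile implementation; short writes are ignored to keep the sketch small.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define OUT_MAX_IOV  64
#define OUT_BUF_SIZE 4096

struct out_file {
    int fd;
    struct iovec iov[OUT_MAX_IOV];
    int iovcnt;
    unsigned char buf[OUT_BUF_SIZE];  /* static buffer for copied data */
    size_t buf_used;
};

/* Send everything queued so far with a single writev() call. */
int out_flush(struct out_file *f)
{
    ssize_t n;

    if (f->iovcnt == 0) {
        return 0;
    }
    n = writev(f->fd, f->iov, f->iovcnt);
    f->iovcnt = 0;
    f->buf_used = 0;
    return n < 0 ? -1 : 0;
}

/* Copying path: small items (headers, device state) are copied into the
 * static buffer and the iovec entry points at the copy. */
int out_put_buffer(struct out_file *f, const void *p, size_t len)
{
    if (len > OUT_BUF_SIZE) {
        return -1;
    }
    if ((f->buf_used + len > OUT_BUF_SIZE || f->iovcnt == OUT_MAX_IOV) &&
        out_flush(f) < 0) {
        return -1;
    }
    memcpy(f->buf + f->buf_used, p, len);
    f->iov[f->iovcnt].iov_base = f->buf + f->buf_used;
    f->iov[f->iovcnt].iov_len = len;
    f->iovcnt++;
    f->buf_used += len;
    return 0;
}

/* Non-copying path: the iovec entry points at the caller's memory, which
 * must stay valid and unchanged until out_flush() has run. */
int out_put_buffer_async(struct out_file *f, const void *p, size_t len)
{
    if (f->iovcnt == OUT_MAX_IOV && out_flush(f) < 0) {
        return -1;
    }
    f->iov[f->iovcnt].iov_base = (void *)p;
    f->iov[f->iovcnt].iov_len = len;
    f->iovcnt++;
    return 0;
}
```

The non-copying path only makes sense for data whose lifetime outlasts the flush, which is the case for guest memory pages but not for temporaries built on the stack.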
Peter Lieven [Tue, 26 Mar 2013 09:58:39 +0000 (10:58 +0100)]
migration: use XBZRLE only after bulk stage
at the beginning of migration all pages are marked dirty and
in the first round a bulk migration of all pages is performed.
currently all these pages are copied to the page cache regardless
of whether they are frequently updated or not. this doesn't make sense
since most of these pages are never transferred again.
this patch changes the XBZRLE transfer to only be used after
the bulk stage has been completed. that means a page is added
to the page cache the second time it is transferred and XBZRLE
can benefit from the third time of transfer.
since the page cache is likely smaller than the number of pages
it's also likely that in the second round the page is missing in the
cache due to collisions in the bulk phase.
on the other hand a lot of unnecessary mallocs, memdups and frees
are saved.
the following results have been taken earlier while executing
the test program from docs/xbzrle.txt. (+) with the patch and (-)
without. (thanks to Eric Blake for reformatting and comments)
+ total time: 22185 milliseconds
- total time: 22410 milliseconds
Peter Lieven [Tue, 26 Mar 2013 09:58:37 +0000 (10:58 +0100)]
migration: do not sent zero pages in bulk stage
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.
even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.
this patch also updates QMP to return the number of
skipped pages in MigrationStats.
Peter Lieven [Tue, 26 Mar 2013 09:58:36 +0000 (10:58 +0100)]
migration: add an indicator for bulk state of ram migration
the first round of ram transfer is special since all pages
are dirty and thus all memory pages are transferred to
the target. this patch adds a boolean variable to track
this stage.
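Taken together, the bulk stage patches above amount to a per-page decision along these lines. This is a simplified sketch under stated assumptions: the enum, struct and function names are hypothetical, and the handling of zero pages after the bulk stage (sending a marker) is an assumption of the example.

```c
#include <stdbool.h>
#include <stddef.h>

enum page_action {
    PAGE_SKIP,         /* bulk stage zero page: send nothing */
    PAGE_SEND_ZERO,    /* later rounds: tell the target the page is zero */
    PAGE_SEND_XBZRLE,  /* later rounds: go through the XBZRLE page cache */
    PAGE_SEND_RAW      /* plain transfer of the page contents */
};

struct ram_state {
    bool bulk_stage;             /* true during the first pass over RAM */
    bool xbzrle_enabled;
    unsigned long skipped_pages; /* reported through the migration stats */
};

/* Naive zero check; the optimized variant is a separate patch. */
bool page_is_zero(const unsigned char *p, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

enum page_action classify_page(struct ram_state *s,
                               const unsigned char *page, size_t len)
{
    if (page_is_zero(page, len)) {
        if (s->bulk_stage) {
            /* Destination memory already reads as zero. */
            s->skipped_pages++;
            return PAGE_SKIP;
        }
        return PAGE_SEND_ZERO;
    }
    /* The page cache is not populated during the bulk stage. */
    if (!s->bulk_stage && s->xbzrle_enabled) {
        return PAGE_SEND_XBZRLE;
    }
    return PAGE_SEND_RAW;
}
```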
Peter Lieven [Tue, 26 Mar 2013 09:58:32 +0000 (10:58 +0100)]
cutils: add a function to find non-zero content in a buffer
this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.
the function starts full unrolling only after the first few chunks have
been checked one by one. analyzing real memory page data has revealed
that non-zero pages are non-zero within the first 256-512 bits in
most cases. as this function is also heavily used to check for zero memory
pages this tweak has been made to avoid the high setup costs of the fully
unrolled check for non-zero pages.
due to the optimizations used in the function there are restrictions
on buffer address and search length. the function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.
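The idea can be shown with a portable scalar sketch that uses unsigned long chunks instead of SSE2/Altivec vectors. The names, the chunk type and the unroll factor of 8 are choices made for this example, not the values used by the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned long chunk_t;
#define FIND_UNROLL 8  /* chunks combined per unrolled step */

/* Restrictions of the sketch: the buffer must be chunk aligned and the
 * length a multiple of one unrolled step. */
bool can_use_find_nonzero(const void *buf, size_t len)
{
    return len % (FIND_UNROLL * sizeof(chunk_t)) == 0 &&
           (uintptr_t)buf % sizeof(chunk_t) == 0;
}

/* Returns len if the buffer is all zero. Otherwise returns the offset of
 * the chunk (early phase) or of the block of chunks (unrolled phase) in
 * which non-zero data was found. */
size_t find_nonzero_offset(const void *buf, size_t len)
{
    const chunk_t *p = buf;
    size_t n = len / sizeof(chunk_t);
    size_t i;

    if (n == 0) {
        return 0;
    }
    /* Cheap early exit: most non-zero pages differ near the start. */
    for (i = 0; i < FIND_UNROLL; i++) {
        if (p[i]) {
            return i * sizeof(chunk_t);
        }
    }
    /* Unrolled phase: OR a whole block together, test once per block. */
    for (i = FIND_UNROLL; i < n; i += FIND_UNROLL) {
        chunk_t acc = p[i] | p[i + 1] | p[i + 2] | p[i + 3] |
                      p[i + 4] | p[i + 5] | p[i + 6] | p[i + 7];
        if (acc) {
            break;
        }
    }
    return i * sizeof(chunk_t);
}
```

A zero-page test then reduces to checking that the returned offset equals the buffer length.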
David Gibson [Tue, 12 Mar 2013 03:06:04 +0000 (14:06 +1100)]
savevm: Fix bugs in the VMSTATE_VBUFFER_MULTIPLY definition
The VMSTATE_BUFFER_MULTIPLY macro is misnamed - it actually specifies
a variably sized buffer with VMS_VBUFFER, so should be named
VMSTATE_VBUFFER_MULTIPLY. This patch fixes this (the macro had no current
users under either name).
In addition, unlike the other VMSTATE_VBUFFER variants, this macro did not
specify VMS_POINTER. This patch fixes this bug as well.
David Gibson [Tue, 12 Mar 2013 03:06:03 +0000 (14:06 +1100)]
savevm: Add VMSTATE_STRUCT_VARRAY_POINTER_UINT32
Currently the savevm code contains a VMSTATE_STRUCT_VARRAY_POINTER_INT32
helper (a variably sized array with the number of elements in an int32_t),
but not VMSTATE_STRUCT_VARRAY_POINTER_UINT32 (... with the number of
elements in a uint32_t). This patch (trivially) fixes the deficiency.
David Gibson [Tue, 12 Mar 2013 03:06:02 +0000 (14:06 +1100)]
savevm: Add VMSTATE_FLOAT64 helpers
The current savevm code includes VMSTATE helpers for a number of commonly
used data types, but not for the float64 type used by the internal floating
point emulation code. This patch fixes the deficiency.
David Gibson [Tue, 12 Mar 2013 03:06:00 +0000 (14:06 +1100)]
savevm: Add VMSTATE_UINT64_EQUAL helpers
The savevm code already includes a number of *_EQUAL helpers which act as
sanity checks verifying that the configuration of the saved state matches
that of the machine we're loading into. Variants already exist
for 8 bit, 16 bit and 32 bit integers, but not 64 bit integers. This patch
fills that hole, adding a UINT64 version.
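What such an *_EQUAL helper checks at load time can be sketched as below. The function names are illustrative, and the stream is modelled as a plain byte array holding a big-endian value rather than a QEMUFile.

```c
#include <errno.h>
#include <stdint.h>

/* Decode a big-endian 64 bit value from the saved stream. */
uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Sanity check on load: the saved value must match what the destination
 * machine was configured with, otherwise loading fails. The field itself
 * is never overwritten. */
int get_uint64_equal_sketch(const uint8_t *stream, const uint64_t *field)
{
    return load_be64(stream) == *field ? 0 : -EINVAL;
}
```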
Igor Mammedov [Mon, 25 Mar 2013 14:48:46 +0000 (15:48 +0100)]
qmp: fix handling of boolean values in qmp-shell
qmp-shell converts only integer arguments and the rest
is assumed to be strings which are faithfully sent as
quoted strings by json. But QEMU refuses to accept qmp
command with boolean argument whose value is escaped
as string.
Fix it by special-casing true/false keywords and store
value as corresponding boolean.
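qmp-shell itself is a Python script; the following is only a C rendering of the classification rule described above, with made-up type and function names. Integers are tried first, then the true/false keywords, and everything else stays a string.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum arg_kind { ARG_INT, ARG_BOOL, ARG_STRING };

struct arg_value {
    enum arg_kind kind;
    long i;         /* valid when kind == ARG_INT */
    bool b;         /* valid when kind == ARG_BOOL */
    const char *s;  /* original text, always set */
};

struct arg_value parse_arg(const char *text)
{
    struct arg_value v = { ARG_STRING, 0, false, text };
    char *end;
    long n;

    errno = 0;
    n = strtol(text, &end, 10);
    if (errno == 0 && end != text && *end == '\0') {
        /* The whole token is a decimal integer. */
        v.kind = ARG_INT;
        v.i = n;
    } else if (strcmp(text, "true") == 0) {
        v.kind = ARG_BOOL;
        v.b = true;
    } else if (strcmp(text, "false") == 0) {
        v.kind = ARG_BOOL;
        v.b = false;
    }
    return v;
}
```

Only values classified as strings would then be emitted as quoted JSON strings; booleans and integers go out as bare JSON literals.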
Anthony Liguori [Mon, 25 Mar 2013 18:14:26 +0000 (13:14 -0500)]
Merge remote-tracking branch 'stefanha/net' into staging
# By Dmitry Fleytman (5) and others
# Via Stefan Hajnoczi
* stefanha/net:
net: increase buffer size to accommodate Jumbo frame pkts
VMXNET3 device implementation
Packet abstraction for VMWARE network devices
Common definitions for VMWARE devices
net: iovec checksum calculator
Checksum-related utility functions
net: use socket_set_nodelay() for -netdev socket
Anthony Liguori [Mon, 25 Mar 2013 18:14:20 +0000 (13:14 -0500)]
Merge remote-tracking branch 'stefanha/block' into staging
# By Liu Yuan (1) and Stefan Weil (1)
# Via Stefan Hajnoczi
* stefanha/block:
block: Add options QDict to bdrv_file_open() prototypes (fix MinGW build)
rbd: fix compile error
Scott Feldman [Mon, 18 Mar 2013 18:43:44 +0000 (11:43 -0700)]
net: increase buffer size to accommodate Jumbo frame pkts
Socket buffer sizes were hard-coded to 4K for VDE and socket netdevs. Bump this
up to 68K (à la tap netdev) to handle maximum GSO packet size (64k) plus plenty
of room for the ethernet and virtio_net headers.
Originally, ran into this limitation when using -netdev UDP sockets to connect
VM-to-VM, where VM interface is configured with MTU=9000. (Using virtio_net
NIC model). Test is simple: ping -M do -s 8500 <target>. This test will
attempt to ping with unfragmented packet of given size. Without patch, size
is limited to < 4K (minus protocol hdrs). With patch, ping test works with pkt
size up to 9000 (again, minus protocol hdrs).
v2: per Stefan, increase buf size to (4096+65536) as done in tap and apply
to vde and socket netdevs.
v1: increase buf size to 12K just for -netdev UDP sockets
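The sizes quoted above work out as follows; the macro and function names are placeholders for this example and are not the identifiers used in the net code.

```c
#include <stddef.h>

#define OLD_NET_BUFSIZE 4096            /* previous hard-coded 4K limit */
#define NEW_NET_BUFSIZE (4096 + 65536)  /* 69632 bytes, i.e. 68K */

/* Whether a frame of the given length fits in a receive buffer. */
int frame_fits(size_t frame_len, size_t bufsize)
{
    return frame_len <= bufsize;
}
```

An MTU=9000 frame with its Ethernet header exceeds the old limit but fits easily in the new one, as does a 64k GSO packet plus headers.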
Stefan Hajnoczi [Wed, 27 Feb 2013 14:05:47 +0000 (15:05 +0100)]
net: use socket_set_nodelay() for -netdev socket
Reduce -netdev socket latency by disabling the Nagle algorithm on
SOCK_STREAM sockets in net/socket.c. Since we are tunnelling Ethernet
over TCP we shouldn't artificially delay outgoing packets; let the guest
decide packet scheduling.
I already get sub-millisecond -netdev socket ping times on localhost, so
there was no measurable difference in my testing. This won't hurt
though and may improve remote socket performance.
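On POSIX systems, disabling the Nagle algorithm comes down to one setsockopt() call. The wrapper below is a sketch with an illustrative name and is not the QEMU socket_set_nodelay() source.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle on a TCP socket so small frames are sent immediately.
 * Returns 0 on success and -1 on error, as setsockopt() does. */
int example_set_nodelay(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```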