While sending 'SetPixelFormat' messages to a VNC server,
the client could set the 'red-max', 'green-max' and 'blue-max'
values to be zero. This leads to a floating point exception in
write_png_palette while doing frame buffer updates.
Peter Maydell [Thu, 3 Dec 2015 11:08:43 +0000 (11:08 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Thu 03 Dec 2015 04:59:48 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/block-pull-request:
iotests: Add regresion test case for write notifier assertion failure
iotests: Add "add_drive_raw" method
block: Don't wait serialising for non-COR read requests
iothread: include id in thread name
Fam Zheng [Tue, 1 Dec 2015 09:36:30 +0000 (17:36 +0800)]
iotests: Add regresion test case for write notifier assertion failure
The idea is to let the top level bs have a big request alignment with
blkdebug, so that the aio_write request issued from monitor will be
serialised. This tests that QEMU doesn't crash upon the read request
from the backup job's write notifier, which is a very special case of
"reentrant" request.
Fam Zheng [Tue, 1 Dec 2015 09:36:28 +0000 (17:36 +0800)]
block: Don't wait serialising for non-COR read requests
The assertion problem was noticed in 06c3916b35a, but it wasn't
completely fixed, because even though the req is not marked as
serialising, it still gets serialised by wait_serialising_requests
against other serialising requests, which could lead to the same
assertion failure.
Fix it by even more explicitly skipping the serialising for this
specific case.
Paolo Bonzini [Tue, 24 Nov 2015 13:46:44 +0000 (14:46 +0100)]
iothread: include id in thread name
This makes it easier to find the desired thread. Use "IO" plus the id;
even with the 14 character limit on the thread name, enough of the id should
be readable (e.g. "IO iothreadNNN" with three characters for the number).
Peter Maydell [Wed, 2 Dec 2015 23:11:24 +0000 (23:11 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,vhost,mmap fixes for 2.5
vhost test patches to fix the travis build
virtio ccw patch to fix virtio 1
virtio pci patch to fix pci express
vhost user bridge patch to fix fd leaks
mmap-alloc patch to fix hugetlbfs on ppc64
remove dead code for vhost (trivial)
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 02 Dec 2015 20:38:41 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
* remotes/mst/tags/for_upstream:
util/mmap-alloc: fix hugetlb support on ppc64
virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method
virtio: handle non-virtio-1-capable backend for ccw
tests/vhost-user-bridge.c: fix fd leakage
vhost: drop dead code
vhost-user: verify that number of queues is non-zero
vhost-user-test: fix crash with glib < 2.36
vhost-user-test: use unix port for migration
vhost-user-test: fix chardriver race
Paolo Bonzini [Mon, 26 Jan 2015 11:12:27 +0000 (12:12 +0100)]
migration: do floating-point division
Dividing integer expressions transferred_bytes and time_spent, and then converting
the integer quotient to type double. Any remainder, or fractional part of the
quotient, is ignored. Fix this.
migration: Clean up use of g_poll() in socket_writev_buffer()
socket_writev_buffer() writes in a loop, using g_poll() to block. If
g_poll() fails, it tries to write more before the file descriptor is
ready. In theory, this could go into a tight loop. In practice,
errors other than EINTR are really unlikely, and when they happen,
we're probably screwed anyway, so we can just as well loop.
Clean it up a bit: retry poll on EINTR, keep ignoring other errors.
Since commit 8561c9244ddf1122d "exec: allocate PROT_NONE pages on top of
RAM", it is no longer possible to back guest RAM with hugepages on ppc64
hosts:
This is because on ppc64, Linux fixes a page size for a virtual address
at mmap time, so we can't switch a range of memory from anonymous
small pages to hugetlbs with MAP_FIXED.
Shmulik Ladkani [Wed, 2 Dec 2015 17:49:07 +0000 (19:49 +0200)]
virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method
In 1811e64 'hw/virtio: Add PCIe capability to virtio devices', the
QEMU_PCI_CAP_EXPRESS capability was added to virtio's pci_dev, within
'virtio_pci_realize' - the pci device object realization method.
This occurs to late, as 'pci_qdev_realize' (DeviceClass.realize of
TYPE_PCI_DEVICE) has already been called, without knowing that the
device instance is indeed an "express" instance, thus allocating
insufficient pci config space.
As a result, device may crash upon attempt to write to the PCIE config
space.
Fix, by arming the QEMU_PCI_CAP_EXPRESS capability early in virtio-pci's
own DeviceClass realize method.
This also makes code cleaner, as 'virtio_pci_realize' may now access the
'pci_is_express' predicate when needed.
Cornelia Huck [Wed, 2 Dec 2015 17:31:57 +0000 (18:31 +0100)]
virtio: handle non-virtio-1-capable backend for ccw
If you run a qemu advertising VERSION_1 with an old kernel where
vhost did not yet support VERSION_1, you'll end up with a device
that is {modern pci|ccw revision 1} but does not advertise VERSION_1.
This is not a sensible configuration and is rejected by the Linux
guest drivers.
To fix this, add a ->post_plugged() callback invoked after features
have been queried that can handle the VERSION_1 bit being withdrawn
and change ccw to fall back to revision 0 if VERSION_1 is gone.
Note that pci is _not_ fixed; we'll need to rethink the approach
for the next release but at least for pci it's not a regression.
This fixes file descriptor leakage in vhost-user-bridge
application. Whenever a new callfd or kickfd is set, the previous
one should be explicitly closed. File descriptors used to map
guest's memory are closed immediately after mmap call.
Fam Zheng [Mon, 23 Nov 2015 02:28:04 +0000 (10:28 +0800)]
mirror: Quiesce source during "mirror_exit"
With dataplane, the ioeventfd events could be dispatched after
mirror_run releases the dirty bitmap, but before mirror_exit actually
does the device switch, because the iothread will still be running, and
it will cause silent data loss.
Fix this by adding a bdrv_drained_begin/end pair around the window, so
that no new external request will be handled.
Peter Maydell [Wed, 2 Dec 2015 15:41:38 +0000 (15:41 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* exec.c use after free
* Xen 32-on-64 breakage
* missing EINTR
* naughty warning under qtest
# gpg: Signature made Wed 02 Dec 2015 12:13:55 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
* remotes/bonzini/tags/for-upstream:
translate-all: ensure host page mask is always extended with 1's
main-loop: suppress warnings under qtest
qemu-char: retry g_poll on EINTR
exec: Stop using memory after free
The prepare callback needs to be implemented with glib < 2.36,
quoting glib documentation:
"Since 2.36 this may be NULL, in which case the effect is as if the
function always returns FALSE with a timeout of -1."
vhost-user-tests uses a helper thread to dispatch the vhost-user servers
sources. However the CharDriverState is not thread-safe. Therefore, when
it's given to the thread, it shouldn't be manipulated concurrently.
We dispatch cleaning the server in an idle source. By the end of the
test, we ensure not to leave anything behind by joining the thread and
finishing the sources dispatch.
Kevin Wolf [Tue, 1 Dec 2015 14:16:49 +0000 (15:16 +0100)]
qcow2: Fix potential qemu-img check crash on 32 bit hosts
This crash was caught with qemu-iotests test case 138.
Commit b6d36de already fixed a few 32 bit truncation bugs that could
cause qemu-img check to allocate too little memory and consequently
it would segfault. On 32 bit hosts, there is one more place that needs
to be fixed because size_t was involved in the calculation and is a
32 bit type there.
Paolo Bonzini [Wed, 2 Dec 2015 12:00:54 +0000 (13:00 +0100)]
translate-all: ensure host page mask is always extended with 1's
Anthony reported that >4GB guests on Xen with 32bit QEMU broke after
commit 4ed023c ("Round up RAMBlock sizes to host page sizes", 2015-11-05).
In that patch sizes are masked against qemu_host_page_size/mask which
are uintptr_t, and thus 32bit on a 32bit QEMU, even though the ram space
might be bigger than 4GB on Xen.
Since ram_addr_t is not available on user-mode emulation targets, ensure
that we get a sign extension when masking away the low bits of the address.
Remove the ~10 year old scary comment that the type of these variables
is probably wrong, with another equally scary comment. The new comment
however does not have "???" in it, which is arguably an improvement.
For completeness use the alignment macros in linux-user and bsd-user
instead of manually doing an &. linux-user and bsd-user are not affected
by the Xen issue, however.
commit 01c22f2cdd4fcf02276ea10f48253850a5fd7259 ("main-loop: Suppress
"I/O thread spun" warnings for qtest") doesn't actually disable the
warning for everyone since some tests don't run under the qtest
accelerator.
Paolo Bonzini [Tue, 1 Dec 2015 10:27:00 +0000 (11:27 +0100)]
qemu-char: retry g_poll on EINTR
This is a case where pty_chr_update_read_handler_locked's lack
of error checking can produce incorrect values. We are not using
SIGUSR1 anymore, so this is quite theoretical, but easy to fix.
Don Slutz [Mon, 30 Nov 2015 22:11:04 +0000 (17:11 -0500)]
exec: Stop using memory after free
memory_region_unref(mr) can free memory.
For example I got:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f43280d4700 (LWP 4462)]
0x00007f43323283c0 in phys_section_destroy (mr=0x7f43259468b0)
at /home/don/xen/tools/qemu-xen-dir/exec.c:1023
1023 if (mr->subpage) {
(gdb) bt
at /home/don/xen/tools/qemu-xen-dir/exec.c:1023
at /home/don/xen/tools/qemu-xen-dir/exec.c:1034
at /home/don/xen/tools/qemu-xen-dir/exec.c:2205
(gdb) p mr
$1 = (MemoryRegion *) 0x7f43259468b0
If there are a lot of guest memory ops in the TB, the amount of
code generated by tcg_out_tb_finalize could be well more than 1k.
In the short term, increase the reservation larger than any TB
seen in practice.
Peter Maydell [Thu, 26 Nov 2015 15:19:28 +0000 (15:19 +0000)]
ui/cocoa.m: Prevent activation clicks from going to guest
When QEMU is brought to the foreground, the click event that activates QEMU
should not go to the guest. Accidents happen when they do go to the guest
without giving the user a chance to handle them. In particular, if the
guest input device is not an absolute-position one then the location of
the guest cursor (and thus the click) will likely not be the location of
the host cursor when it is clicked, and could be completely obscured
below another window. Don't send mouse clicks to QEMU unless the
window either has focus or has grabbed mouse events.
Peter Maydell [Tue, 1 Dec 2015 16:30:27 +0000 (16:30 +0000)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20151201' into staging
Last round of s390x fixes for 2.5:
- The bios should be built for the first z machine, so that newer
instructions don't creep in.
- Silence annoying message when running make check.
- Fix a problem with the pci iommu exposed by recent changes.
# gpg: Signature made Tue 01 Dec 2015 08:59:42 GMT using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <[email protected]>"
# gpg: aka "Cornelia Huck <[email protected]>"
* remotes/cohuck/tags/s390x-20151201:
s390x/pci: fix up IOMMU size
s390x: no deprecation warning while testing
pc-bios/s390-ccw: rebuild image
pc-bios/s390-ccw: build for z900
Yi Min Zhao [Wed, 4 Nov 2015 07:50:45 +0000 (15:50 +0800)]
s390x/pci: fix up IOMMU size
Present code uses @size==UINT64_MAX to initialize IOMMU. It infers that it
can map any 64-bit IOVA whatsoever. But in fact, the largest DMA range for
each PCI Device on s390x is from ZPCI_SDMA_ADDR to ZPCI_EDMA_ADDR. The largest
value is returned from hardware, which is to indicate the largest range
hardware can support. But the real IOMMU size for specific PCI Device is
obtained once qemu intercepts mpcifc instruction that guest is requesting a
DMA range for that PCI Device. Therefore, before intercepting mpcifc instruction,
qemu cannot be aware of the size of IOMMU region that guest will use.
Moreover, iommu replay during device initialization for the whole region in
4k steps takes a very long time.
In conclusion, this patch intializes IOMMU region for each PCI Device when
intercept mpcifc instruction which is to register DMA range for the PCI Device.
And then, destroy IOMMU region when guest wants to deregister IOAT.
Cornelia Huck [Thu, 12 Nov 2015 15:46:09 +0000 (16:46 +0100)]
s390x: no deprecation warning while testing
'make check' tries to start all available machines; the deprecation
message for the s390-virtio machine is both useless and annoying
there. Silence it while testing.
Peter Maydell [Mon, 30 Nov 2015 21:59:22 +0000 (21:59 +0000)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Two fixes for virtfs/9p from Paolo.
# gpg: Signature made Mon 30 Nov 2015 14:10:47 GMT using DSA key ID 0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "Gregory Kurz (Cimai Technology) <[email protected]>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
virtio-9p: use QEMU thread pool
fsdev-proxy-helper: avoid TOC/TOU race
target-ppc and related bugfix patches for qemu-2.5
I don't have the facilities to test the Macintosh and BookE related
patches. I've sanity checked them (inspection + make check), but I'm
otherwise relying on the submitters.
# gpg: Signature made Mon 30 Nov 2015 08:42:01 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.5-20151130:
target-ppc/fpu_helper: fix FPSCR_FX bit shift operation
target-ppc: Move the FPSCR bit update macros to cpu.h
hw/ppc/ppc405_boards: Fix infinite recursion by converting taihu_cpld from old_mmio
hw/ppc/spapr: Remove duplicated "pseries" alias
mac_dbdma: always initialize channel field in DBDMA_channel
Peter Maydell [Mon, 30 Nov 2015 15:35:20 +0000 (15:35 +0000)]
Merge remote-tracking branch 'remotes/weil/tags/pull-wxx-20151130' into staging
wxx patch queue
# gpg: Signature made Mon 30 Nov 2015 05:48:33 GMT using RSA key ID 677450AD
# gpg: Good signature from "Stefan Weil <[email protected]>"
# gpg: aka "Stefan Weil <[email protected]>"
# gpg: aka "Stefan Weil <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2 B78A E08C 21D5 6774 50AD
* remotes/weil/tags/pull-wxx-20151130:
w32: Use gcc option -mthreads
oslib-win32: Change return type of function getpagesize
trace/simple: Fix warning and wrong trace file name for MinGW
Paolo Bonzini [Fri, 27 Nov 2015 11:43:06 +0000 (12:43 +0100)]
virtio-9p: use QEMU thread pool
The QEMU thread pool already has a mechanism to invoke callbacks in the main
thread. It does not need an EventNotifier and it is more efficient too.
Use it instead of GAsyncQueue + GThreadPool + glue.
As a side effect, it silences Coverity's complaint about an unchecked
return value for event_notifier_init.
target-ppc/fpu_helper: fix FPSCR_FX bit shift operation
Currently in TCG mode, updating floating exception
summary bit (FPSCR_FX) in fpscr also updates
the upper 32bits of fpscr with all 1s.
Modify the bit shift operation statement to use
1ULL instead.
Peter Maydell [Mon, 16 Nov 2015 14:57:50 +0000 (14:57 +0000)]
hw/ppc/ppc405_boards: Fix infinite recursion by converting taihu_cpld from old_mmio
The taihu_cpld_writel() function had an obvious typo that meant that
if it was ever called it would go into an infinite recursion. Newer
versions of clang will detect and warn about this:
hw/ppc/ppc405_boards.c:481:1: warning: all paths through this function will call itself [-Winfinite-recursion]
Fix this by converting taihu_cpld from the legacy old_mmio accessors
to new-style ones, with an impl {} declaration to cause the core
memory code to do the splitting of 16 bit and 32 bit accesses into
multiple 8-bit accesses.
Thomas Huth [Mon, 23 Nov 2015 16:13:37 +0000 (17:13 +0100)]
hw/ppc/spapr: Remove duplicated "pseries" alias
The "pseries" alias is currently set twice, one time for the
pseries-2.4 machine and one time for the "pseries-2.5" machine.
To avoid confusion with the alias, let's remove the one from
the older machine class. And while we're at it, also remove
the "is_default = 0" there since the is_default variable
should be set to zero by default already.
Hervé Poussineau [Thu, 12 Nov 2015 21:24:08 +0000 (22:24 +0100)]
mac_dbdma: always initialize channel field in DBDMA_channel
dbdma_from_ch() uses channel field to return the right DBDMA object.
Previous code was working if guest OS was only using registered DMA channels.
However, it lead to QEMU crashes if guest OS was using unregistered DMA channels.
Stefan Weil [Wed, 11 Mar 2015 21:08:56 +0000 (22:08 +0100)]
trace/simple: Fix warning and wrong trace file name for MinGW
On Windows, getpid() always returns an int value, but pid_t (which is
expected by the format string) is either a 32 bit or a 64 bit value.
Without a type cast (or a modified format string), the compiler prints
a warning when building for 64 bit Windows and the resulting trace_file_name
will include a wrong pid:
trace/simple.c:332:9: warning:
format ‘%lld’ expects argument of type ‘long long int’,
but argument 2 has type ‘int’ [-Wformat=]
Peter Maydell [Fri, 27 Nov 2015 10:44:42 +0000 (10:44 +0000)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 27 Nov 2015 02:42:02 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
tap-win32: disable broken async write path
tap-win32: skip unexpected nodes during registry enumeration
eepro100: Prevent two endless loops
Andrew Baumann [Wed, 18 Nov 2015 19:45:09 +0000 (11:45 -0800)]
tap-win32: disable broken async write path
The code under the TUN_ASYNCHRONOUS_WRITES path makes two incorrect
assumptions about the behaviour of the WriteFile API for overlapped
file handles. First, WriteFile does not update the
lpNumberOfBytesWritten parameter when the write completes
asynchronously (the number of bytes written is known only when the
operation completes). Second, the buffer shouldn't be touched (or
freed) until the operation completes. This led to at least one bug
where tap_win32_write returned zero bytes written, which in turn
caused further writes ("receives") to be disabled for that device.
This change disables the asynchronous write path, while keeping most
of the code around in case someone sees value in resurrecting it. It
also adds some conditional debug output, similar to the read path.
Andrew Baumann [Wed, 18 Nov 2015 19:45:08 +0000 (11:45 -0800)]
tap-win32: skip unexpected nodes during registry enumeration
In order to find a named tap device, get_device_guid() enumerates children of
HKLM\SYSTEM\CCS\Control\Network\{4D36E972-E325-11CE-BFC1-08002BE10318}
(aka NETWORK_CONNECTIONS_KEY). For each child, it then looks for a
"Connection" subkey, but if this key doesn't exist, it aborts the
entire search. This was observed to fail on at least one Windows 10
machine, where there is an additional child of NETWORK_CONNECTIONS_KEY
(named "Descriptions"). Since registry enumeration doesn't guarantee
any particular sort order, we should continue to search for matching
children rather than aborting the search.
Peter Maydell [Thu, 26 Nov 2015 16:50:59 +0000 (16:50 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
vhost, pc: fixes for 2.5
Minor vhost fixes. HW version tweak for PC.
Documentation and test updates.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 26 Nov 2015 16:40:25 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
* remotes/mst/tags/for_upstream:
vhost-user-test: fix migration overlap test
Fix memory leak on error
Revert "vhost: send SET_VRING_ENABLE at start/stop"
tests/vhost-user-bridge: read command line arguments
tests/vhost-user-bridge: propose GUEST_ANNOUNCE feature
vhost-user: clarify start and enable
vhost-user: set link down when the char device is closed
pc: Don't set hw_version on pc-*-2.5
osdep: Change default value of qemu_hw_version() to "2.5+"
During migration, source does GET_BASE, destination does SET_BASE.
Use that as opposed to fds being configured to detect
vhost user running on both source and destination.
Peter Maydell [Thu, 26 Nov 2015 16:27:26 +0000 (16:27 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2015-11-26' into staging
QMP and QObject patches
# gpg: Signature made Thu 26 Nov 2015 09:07:18 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
* remotes/armbru/tags/pull-monitor-2015-11-26:
qjson: Limit number of tokens in addition to total size
qjson: surprise, allocating 6 QObjects per token is expensive
qjson: store tokens in a GQueue
qjson: Convert to parser to recursive descent
qjson: replace QString in JSONLexer with GString
qjson: Inline token_is_escape() and simplify
qjson: Inline token_is_keyword() and simplify
qjson: Give each of the six structural chars its own token type
qjson: Spell out some silent assumptions
check-qjson: Add test for JSON nesting depth limit
qjson: Don't crash when input exceeds nesting limit
qjson: Apply nesting limit more sanely
monitor: Plug memory leak on QMP error
Peter Maydell [Thu, 26 Nov 2015 15:56:53 +0000 (15:56 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Small patches, without the one that introduces -fwrapv.
# gpg: Signature made Thu 26 Nov 2015 15:48:53 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
* remotes/bonzini/tags/for-upstream:
target-i386: kvm: Print warning when clearing mcg_cap bits
target-i386: kvm: Use env->mcg_cap when setting up MCE
target-i386: kvm: Abort if MCE bank count is not supported by host
virtio-scsi: don't crash without a valid device
target-sparc: fix 32-bit truncation in fpackfix
exec: remove warning about mempath and hugetlbfs
Revert "exec: silence hugetlbfs warning under qtest"
call bdrv_drain_all() even if the vm is stopped
MAINTAINERS: Update TCG CPU cores section
Eduardo Habkost [Wed, 25 Nov 2015 17:19:15 +0000 (18:19 +0100)]
target-i386: kvm: Use env->mcg_cap when setting up MCE
When setting up MCE, instead of using the MCE_*_DEF macros
directly, just filter the existing env->mcg_cap value.
As env->mcg_cap is already initialized as
MCE_CAP_DEF|MCE_BANKS_DEF at target-i386/cpu.c:mce_init(), this
doesn't change any behavior. But it will allow us to change
mce_init() in the future, to implement different defaults
depending on CPU model, machine-type or command-line parameters.
Eduardo Habkost [Wed, 25 Nov 2015 17:19:14 +0000 (18:19 +0100)]
target-i386: kvm: Abort if MCE bank count is not supported by host
Instead of silently changing the number of banks in mcg_cap based
on kvm_get_mce_cap_supported(), abort initialization if the host
doesn't support MCE_BANKS_DEF banks.
Note that MCE_BANKS_DEF was always 10 since it was introduced in
QEMU, and Linux always returned 32 at KVM_CAP_MCE since
KVM_CAP_MCE was introduced, so no behavior is being changed and
the error can't be triggered by any Linux version. The point of
the new check is to ensure we won't silently change the bank
count if we change MCE_BANKS_DEF or make the bank count
configurable in the future.
Paolo Bonzini [Mon, 2 Nov 2015 14:05:34 +0000 (15:05 +0100)]
target-sparc: fix 32-bit truncation in fpackfix
This is reported by Coverity. The algorithm description at
ftp://ftp.icm.edu.pl/packages/ggi/doc/hw/sparc/Sparc.pdf suggests
that the 32-bit parts of rs2, after the left shift, is treated
as a 64-bit integer. Bits 32 and above are used to do the
saturating truncation.
The gethugepagesize() method in exec.c printed a warning if
the file path for "-mem-path" or "-object memory-backend-file"
was not on a hugetlbfs filesystem. This warning is bogus, because
QEMU functions perfectly well with the path on a regular tmpfs
filesystem. Use of hugetlbfs vs tmpfs is a choice for the management
application or end user to make as best fits their needs. As such it
is inappropriate for QEMU to have an opinion on whether the user's
choice is right or wrong in this case.
That commit changed QEMU initialization order from
- object-initial, chardev, qtest, object-late
to
- chardev, qtest, object-initial, object-late
This breaks chardev setups which need to rely on objects
having been created. For example, when chardevs use TLS
encryption in the future, they need to have tls credential
objects created first.
Wen Congyang [Fri, 20 Nov 2015 09:34:38 +0000 (17:34 +0800)]
call bdrv_drain_all() even if the vm is stopped
There are still I/O operations when the vm is stopped. For example,
stop the vm, and do block migration. In this case, we don't drain all
I/O operation, and may meet the following problem:
hw/ppc/spapr.c: Fix memory leak on error, it was introduced in bc09e0611
hw/acpi/memory_hotplug.c: Fix memory leak on error, it was introduced in 34f2af3d
Peter Maydell [Thu, 26 Nov 2015 10:24:18 +0000 (10:24 +0000)]
Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2015-11-25-v2-tag' into staging
qemu-ga patch queue for 2.5
* include additional w32 MSI install components needed for
guest-exec
* fix 'make install' when compiling with --disable-tools
* fix potential data corruption/loss when accessing files
bi-directionally via guest-file-{read,write}
* explicitly document how integer args for guest-file-seek map to
SEEK_SET/SEEK_CUR/etc to avoid platform-specific differences
* remotes/mdroth/tags/qga-pull-2015-11-25-v2-tag:
qga: added another non-interactive gspawn() helper file.
qga: Better mapping of SEEK_* in guest-file-seek
tests: add file-write-read test
qga: flush explicitly when needed
qga: gspawn() console helper to Windows guest agent msi build
makefile: fix qemu-ga make install for --disable-tools
In case of live migration several queues can be enabled and not only the
first one. So informing backend that only the first queue is enabled is
wrong.
qjson: Limit number of tokens in addition to total size
Commit 29c75dd "json-streamer: limit the maximum recursion depth and
maximum token count" attempts to guard against excessive heap usage by
limiting total token size (it says "token count", but that's a lie).
Total token size is a rather imprecise predictor of heap usage: many
small tokens use more space than few large tokens with the same input
size, because there's a constant per-token overhead: 37 bytes on my
system.
Tighten this up: limit the token count to 2Mi. Chosen to roughly
match the 64MiB total token size limit.
Paolo Bonzini [Wed, 25 Nov 2015 21:23:32 +0000 (22:23 +0100)]
qjson: surprise, allocating 6 QObjects per token is expensive
Replace the contents of the tokens GQueue with a simple struct. This cuts
the amount of memory allocated by tests/check-qjson from ~500MB to ~20MB,
and the execution time from 600ms to 80ms on my laptop. Still a lot (some
could be saved by using an intrusive list, such as QSIMPLEQ, instead of
the GQueue), but the savings are already massive and the right thing to
do would probably be to get rid of json-streamer completely.
Paolo Bonzini [Wed, 25 Nov 2015 21:23:31 +0000 (22:23 +0100)]
qjson: store tokens in a GQueue
Even though we still have the "streamer" concept, the tokens can now
be deleted as they are read. While doing so convert from QList to
GQueue, since the next step will make tokens not a QObject and we
will have to do the conversion anyway.
Paolo Bonzini [Wed, 25 Nov 2015 21:23:29 +0000 (22:23 +0100)]
qjson: replace QString in JSONLexer with GString
JSONLexer only needs a simple resizable buffer. json-streamer.c
can allocate memory for each token instead of relying on reference
counting of QStrings.
qjson: Don't crash when input exceeds nesting limit
We limit nesting depth and input size to defend against input
triggering excessive heap or stack memory use (commit 29c75dd
json-streamer: limit the maximum recursion depth and maximum token
count). However, when the nesting limit is exceeded,
parser_context_peek_token()'s assertion fails.
Broken in commit 65c0f1e "json-parser: don't replicate tokens at each
level of recursion".
To reproduce stuff 1025 open braces or brackets into QMP.
Fix by taking the error exit instead of the normal one.
The nesting limit from commit 29c75dd "json-streamer: limit the
maximum recursion depth and maximum token count" applies separately to
braces and brackets. This makes no sense. Apply it to their sum,
because that's actually a measure of recursion depth.
Gerd Hoffmann [Wed, 25 Nov 2015 07:04:05 +0000 (08:04 +0100)]
vnc: fix segfault
Commit "c7628bf vnc: only alloc server surface with clients connected"
missed one rarely used codepath (cirrus with guest drivers using 2d
accel) where we have to check for the server surface being present,
to avoid qemu crashing with a NULL pointer dereference. Add the check.
qga: added another non-interactive gspawn() helper file.
With previous commit we added gspawn-win64-helper-console.exe,
required for gspawn() mingw implementation.
Unfortunatly when running as a service without interactive
desktop, gspawn() also requires another helper app.
Added gspawn-win64-helper.exe and gspawn-win32-helper.exe
for corresponding architectures.
Eric Blake [Wed, 25 Nov 2015 17:37:15 +0000 (10:37 -0700)]
qga: Better mapping of SEEK_* in guest-file-seek
Exposing OS-specific SEEK_ constants in our qapi was a mistake
(if the host has SEEK_CUR as 1, but the guest has it as 2, then
the semantics are unclear what should happen); if we had a time
machine, we would instead expose only a symbolic enum. It's too
late to change the fact that we have an integer in qapi, but we
can at least document what mapping we want to enforce for all
qga clients (and luckily, it happens to be the mapping that both
Linux and Windows use); then fix the code to match that mapping.
It also helps us filter out unsupported SEEK_DATA and SEEK_HOLE.
In the future, we may wish to move our QGA_SEEK_* constants into
qga/qapi-schema.json, along with updating the schema to take an
alternate type (either the integer, or the string value of the
enum name) - but that's too much risk during hard freeze.
This test exhibits a POSIX behaviour regarding switching between write
and read. It's undefined result if the application doesn't ensure a
flush between the two operations (with glibc, the flush can be implicit
when the buffer size is relatively small). The previous commit fixes
this test.
Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1210246
According to the specification:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/fopen.html
"the application shall ensure that output is not directly followed by
input without an intervening call to fflush() or to a file positioning
function (fseek(), fsetpos(), or rewind()), and input is not directly
followed by output without an intervening call to a file positioning
function, unless the input operation encounters end-of-file."
Without this change, an fwrite() followed by an fread() may lose the
previously written content, as shown in the following test.
qga: gspawn() console helper to Windows guest agent msi build
This helper, gspawn-win64-helper-console.exe for 64-bit and
gspawn-win32-helper-console.exe for 32-bit environment,
is needed for gspawn() mingw implementation, used by guest-exec command.
Without these files guest-exec command on Windows will not
work with "file not found" diagnostic message.
Michael Roth [Mon, 23 Nov 2015 21:48:58 +0000 (15:48 -0600)]
makefile: fix qemu-ga make install for --disable-tools
ab59e3e introduced a fix for `make install` on w32 that involved
filtering out qemu-ga from $TOOLS install recipe so that we could
append $(EXESUF) to it before attempting to install the binary
via install-prog function.
install-prog takes a list of binaries to install to a particular
directory. If the list is empty it breaks. We guard against this
by ensuring $TOOLS is not empty prior to calling.
However, ab59e3e introduces extra filtering after this check which
can still result on us attempting to call install-prog with an
empty list of binaries. In particular, this occurs if we
build with the --disable-tools configure option, which results
in qemu-ga being the only member of $TOOLS.
Fix this by doing a simple s/qemu-ga/qemu-ga$(EXESUF)/ pass through
$TOOLS instead of filtering out qemu-ga to handle it seperately.
Peter Maydell [Wed, 25 Nov 2015 14:47:06 +0000 (14:47 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Wed 25 Nov 2015 13:33:14 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
* remotes/kevin/tags/for-upstream:
qemu-iotests: Add -nographic when starting QEMU in 119 and 120
block/qapi: Plug memory leak on query-block error path
raw-posix.c: Make GetBSDPath() handle caching options
nand: fix flash erase when oob is in memory
test-aio: Fix event notifier cleanup
tests/Makefile: Add more dependencies for test-timed-average
Wen Congyang [Fri, 20 Nov 2015 09:37:13 +0000 (17:37 +0800)]
block-migration: limit the memory usage
If we set migration speed in a very large value, block-migration will try to read
all data to the memory. Because
(block_mig_state.submitted + block_mig_state.read_done) * BLOCK_SIZE
will be overflow, and it will be always less than rate limit.
There is no need to read too many data into memory when the rate limit is very large.
So limit the memory usage can fix the overflow problem.
madvise() returns EINVAL in the case of many failures, but also
returns it in cases where the host kernel doesn't have THP enabled.
Postcopy only really cares that THP is off before it detects faults,
and turns it back on afterwards; so we're going to have
to assume that if the madvise fails then the host just doesn't do
THP and we can carry on with the postcopy.