Isaac Lozano [Sun, 17 Apr 2016 11:29:53 +0000 (04:29 -0700)]
usb-mtp: fix usb_mtp_get_device_info so that libmtp on the guest doesn't complain
If an application uses libmtp on the guest system,
it will complain with the warning message:
LIBMTP WARNING: VendorExtensionID: ffffffff
LIBMTP WARNING: VendorExtensionDesc: (null)
LIBMTP WARNING: this typically means the device is PTP (i.e. a camera) but
not a MTP device at all. Trying to continue anyway.
This is because libmtp expects a MTP Vendor Extension ID of 0x00000006 and a
MTP Version of 0x0064. These numbers are taken from Microsoft's MTP Vendor
Extension Identification Message page and are what most physical devices
show.
QEMU overwrites guest memory using stale guest addresses.
This doesn't happen when the guest (firmware) driver sets up xhci for
the first time as there are no slots configured yet. However when the
firmware hands over the control to the OS some slots and endpoints are
already set up with their context in the guest RAM. Now the OS' driver
resets the controller again and xhci_set_ep_state then reads and writes
that memory which is now owned by the OS.
As a quick fix, skip calling xhci_set_ep_state in xhci_disable_ep if the
device context base address array pointer is zero (indicating we're in
the HC reset and no DMA is possible).
Commit message claims locking is not needed, but that appears
to not be true, seabios ehci driver runs into timekeeping problems
with this, see
https://bugzilla.redhat.com/show_bug.cgi?id=1322713
vga: make sure vga register setup for vbe stays intact (CVE-2016-3712).
Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode. This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.
Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.
Which is good for one or another buffer overflow. Not that
critical as they typically read overflows happening somewhere
in the display code. So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.
When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer. Move that code to a separate function so we can call it
from other places too.
vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.
The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register. The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.
Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.
Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.
Jan Vesely [Fri, 29 Apr 2016 17:15:23 +0000 (13:15 -0400)]
configure: Check if struct fsxattr is available from linux header
Fixes build failure with --enable-xfsctl and
new linux headers (>=4.5) and older xfsprogs(<4.5):
In file included from /usr/include/xfs/xfs.h:38:0,
from /var/tmp/portage/app-emulation/qemu-2.5.0-r1/work/qemu-2.5.0/block/raw-posix.c:97:
/usr/include/xfs/xfs_fs.h:42:8: error: redefinition of ‘struct fsxattr’
struct fsxattr {
^
In file included from /var/tmp/portage/app-emulation/qemu-2.5.0-r1/work/qemu-2.5.0/block/raw-posix.c:60:0:
/usr/include/linux/fs.h:155:8: note: originally defined here
struct fsxattr {
This is really a bug in the system headers, but we can work around it
by defining HAVE_FSXATTR in the QEMU headers if linux/fs.h provides
the struct, so that xfs_fs.h doesn't try to define it as well.
Peter Maydell [Sun, 1 May 2016 21:52:47 +0000 (22:52 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi: last minute fix for 2.6
Minor, obvious fix only affecting BE hosts.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Sun 01 May 2016 13:43:28 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
Assert happens because qemu-system-x86_64 generates
SSDT table and test looks for a corresponding expected
table to compare with.
However there is no expected SSDT blob anymore, since
QEMU souldn't generate one. As it happens BIOS is not
able to read ACPI tables from QEMU and fallbacks to
embeded legacy ACPI codepath, which generates SSDT.
That happens due to wrongly sized endiannes conversion
which makes
uint8_t BiosLinkerLoaderEntry.alloc.zone
end up with 0 due to truncation of 32 bit integer
which on host is 1 or 2.
Fix it by dropping invalid cpu_to_le32() as uint8_t
doesn't require any conversion.
Kevin Wolf [Wed, 27 Apr 2016 12:18:16 +0000 (14:18 +0200)]
vvfat: Fix default volume label
Commit d5941dd documented that it leaves the default volume name as it
was ("QEMU VVFAT"), but it doesn't actually implement this. You get an
empty name (eleven space characters) instead.
This fixes the implementation to apply the advertised default.
Kevin Wolf [Wed, 27 Apr 2016 12:11:38 +0000 (14:11 +0200)]
vvfat: Fix volume name assertion
Commit d5941dd made the volume name configurable, but it didn't consider
that the rw code compares the volume name string to assert that the
first directory entry is the volume name. This made vvfat crash in rw
mode.
This fixes the assertion to compare with the configured volume name
instead of a literal string.
Eric Blake [Thu, 28 Apr 2016 21:45:28 +0000 (15:45 -0600)]
qapi: Don't pass NULL to printf in string input visitor
Make sure the error message for visit_type_uint64() gracefully
handles a NULL 'name' when called from the top level or a list
context, as not all the world behaves like glibc in allowing
NULL through a printf-family %s.
Peter Maydell [Thu, 28 Apr 2016 09:25:26 +0000 (10:25 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160426' into staging
ppc patch queue for 2016-04-26 (last minute qemu-2.6 fix)
This just has one, last-minute, fix for a serious regression of memory
hotplug.
Patch author's comment:
Really sorry for the way last-minute fix, but without this memory
hotplug is totally broken :( Hoping to get this in for Wednesday's
RC4, which I think will be the final before release.
# gpg: Signature made Tue 26 Apr 2016 03:52:20 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160426:
spapr_drc: fix aborts during DRC-count based hotplug
James Hogan [Wed, 27 Apr 2016 22:21:06 +0000 (23:21 +0100)]
target-mips: Fix RDHWR exception host PC
Commit b00c72180c36 ("target-mips: add PC, XNP reg numbers to RDHWR")
changed the rdhwr helpers to use check_hwrena() to check the register
being accessed is enabled in CP0_HWREna when used from user mode. If
that check fails an EXCP_RI exception is raised at the host PC
calculated with GETPC().
However check_hwrena() may not be fully inlined as the
do_raise_exception() part of it is common regardless of the arguments.
This causes GETPC() to calculate the address in the call in the helper
instead of the generated code calling the helper. No TB will be found
and the EPC reported with the resulting guest RI exception points to the
beginning of the TB instead of the RDHWR instruction.
We can't reliably force check_hwrena() to be inlined, and converting it
to a macro would be ugly, so instead pass the host PC in as an argument,
with each rdhwr helper passing GETPC(). This should avoid any dependence
on compiler behaviour, and in practice seems to ensure the full inlining
of check_hwrena() on x86_64.
This issue causes failures when running a MIPS KVM (trap & emulate)
guest in a MIPS QEMU TCG guest, as the inner guest kernel will do a
RDHWR of counter, which is disabled in the outer guest's CP0_HWREna by
KVM so it can emulate the inner guest's counter. The emulation fails and
the RI exception is passed to the inner guest.
qom: -object error messages lost location, restore it
qemu_opts_foreach() runs its callback with the error location set to
the option's location. Any errors the callback reports use the
option's location automatically.
Commit 90998d5 moved the actual error reporting from "inside"
qemu_opts_foreach() to after it. Here's a typical hunk:
Before, object_create() reports from within qemu_opts_foreach(), using
the option's location. Afterwards, we do it after
qemu_opts_foreach(), using whatever location happens to be current
there. Commonly a "none" location.
This is because Error objects don't have location information.
Problematic.
Reproducer:
$ qemu-system-x86_64 -nodefaults -display none -object secret,id=foo,foo=bar
qemu-system-x86_64: Property '.foo' not found
Note no location. This commit restores it:
qemu-system-x86_64: -object secret,id=foo,foo=bar: Property '.foo' not found
Note that the qemu_opts_foreach() bug just fixed could mask the bug
here: if the location it leaves dangling hasn't been clobbered, yet,
it's the correct one.
replay: Fix dangling location bug in replay_configure()
replay_configure() pushes and pops a Location with automatic storage
duration. Except it fails to pop when -icount parameter "rr" isn't
given. cur_loc then points to unused stack space, and will most
likely get clobbered in short order.
Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
report bogus locations.
qemu_opts_foreach() pushes and pops a Location with automatic storage
duration. Except it fails to pop when @func() returns non-zero.
cur_loc then points to unused stack space, and will most likely get
clobbered in short order.
Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
report bogus locations.
Affects several qemu command line options as well as qemu-img,
qemu-io, qemu-nbd -object, and blkdebug's configuration file.
main() reports "Property '.foo' not found" like this:
if (qemu_opts_foreach(qemu_find_opts("object"),
user_creatable_add_opts_foreach,
object_create_delayed, &err)) {
error_report_err(err);
exit(1);
}
cur_loc then points to where qemu_opts_foreach()'s Location used to
be, i.e. unused stack space. With optimization, this Location doesn't
get clobbered for me, and also happens to be the correct location.
Without optimization, it does get clobbered in a way that makes
error_report_err() report no location.
Michael Roth [Mon, 25 Apr 2016 22:24:25 +0000 (17:24 -0500)]
spapr_drc: fix aborts during DRC-count based hotplug
CPU/memory resources can be signalled en-masse via
spapr_hotplug_req_add_by_count(), and when doing so, actually change
the meaning of the 'drc' parameter passed to
spapr_hotplug_req_event() to be a count rather than an index.
f40eb92 added a hook in spapr_hotplug_req_event() to record when a
device had been 'signalled' to the guest, but that code assumes that
drc is always an index. In cases where it's a count, such as memory
hotplug, the DRC lookup will fail, leading to an assert.
Fix this by only explicitly setting the signalled state for cases where
we are doing PCI hotplug.
For other resources types, since we cannot selectively track whether a
resource has been signalled in cases where we signal attach as a count,
set the 'signalled' state to true immediately upon making the
resource available via drck->attach().
commit "5f77e06 usb: add pid check at the first of uhci_handle_td()"
moved the pid verification to the start of the uhci_handle_td function,
to simplify the error handling (we don't have to free stuff which we
didn't allocate in the first place ...).
Problem is now the check fires too often, it raises error IRQs even for
TDs which we are not going to process because they are not set active.
So, lets move down the check a bit, so it is done only for active TDs,
but still before we are going to allocate stuff to process the requested
transfer.
Peter Maydell [Mon, 25 Apr 2016 10:15:53 +0000 (11:15 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160423' into staging
ppc patch queue for 2016-03-23
A single fix for a bug in parameter handling for the spapr PCI host
bridge.
# gpg: Signature made Sat 23 Apr 2016 07:55:29 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160423:
hw/ppc/spapr: Fix crash when specifying bad parameters to spapr-pci-host-bridge
The problem is that spapr_tce_find_by_liobn() might return NULL, but
the code in spapr_populate_pci_dt() does not check for this condition
and then tries to dereference this NULL pointer.
Apart from that, the return value of spapr_populate_pci_dt() also
has to be checked for all PCI buses, not only for the last one, to
make sure we catch all errors.
mirror: Workaround for unexpected iohandler events during completion
Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any
protection against new requests that could sneak in just before the
BH is dispatched. For example (assuming a code base at that commit):
At (a) we know the BDS has no pending request. However, the same
main_loop_wait call is going to dispatch iohandlers (EventNotifier
events), which may lead to a new I/O from guest. So the invariant is
already broken at (c). Data loss.
Commit f3926945c8 made iohandler to use aio API. The order of
virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
a main_loop_wait becomes unpredictable, and even worse, if the host
notifier event arrives at the next main_loop_wait call, the
unpredictable order between mirror_exit and
virtio_queue_host_notifier_read is also a trouble. As shown below, this
commit made the bug easier to trigger:
In both cases, (b) breaks the invariant wanted by (a) and (c).
Until then, the request loss has been silent. Later, 3f09bfbc7be added
asserts at (c) to check the invariant (in
bdrv_replace_in_backing_chain), and Max reported an assertion failure
first visible there, by doing active committing while the guest is
running bonnie++.
2.5 added bdrv_drained_begin at (a) to protect the dataplane case from
similar problems, but we never realize the main loop bug until now.
As a bandage, this patch disables iohandler's external events
temporarily together with bs->ctx.
aio_poll doesn't poll the external nodes so this should never be true,
but aio_ctx_dispatch may get notified by the events from GSource. To
make bdrv_drained_begin effective in main loop, we should check the
is_external flag here too.
Also do the check in aio_pending so aio_dispatch is not called
superfluously, when there is no events other than external ones.
All callers pass "false" keeping the old semantics. The windows
implementation doesn't distinguish the flag yet. On posix, it is passed
down to the underlying aio context.
Christoffer Dall [Fri, 22 Apr 2016 11:12:09 +0000 (13:12 +0200)]
util: align memory allocations to 2M on AArch64
For KVM to use Transparent Huge Pages (THP) we have to ensure that the
alignment of the userspace address of the KVM memory slot and the IPA
that the guest sees for a memory region have the same offset from the 2M
huge page size boundary.
One way to achieve this is to always align the IPA region at a 2M
boundary and ensure that the mmap alignment is also at 2M.
Unfortunately, we were only doing this for __arm__, not for __aarch64__,
so add this simple condition.
This fixes a performance regression using KVM/ARM on AArch64 platforms
that showed a performance penalty of more than 50%, introduced by the
following commit:
9fac18f (oslib: allocate PROT_NONE pages on top of RAM, 2015-09-10)
We were only lucky before the above commit, because we were allocating
large regions and naturally getting a 2M alignment on those allocations
then.
Eric Blake [Thu, 21 Apr 2016 14:42:30 +0000 (08:42 -0600)]
nbd: Don't mishandle unaligned client requests
The NBD protocol does not (yet) force any alignment constraints
on clients. Even though qemu NBD clients always send requests
that are aligned to 512 bytes, we must be prepared for non-qemu
clients that don't care about alignment (even if it means they
are less efficient). Our use of blk_read() and blk_write() was
silently operating on the wrong file offsets when the client
made an unaligned request, corrupting the client's data (but
as the client already has control over the file we are serving,
I don't think it is a security hole, per se, just a data
corruption bug).
Note that in the case of NBD_CMD_READ, an unaligned length could
cause us to return up to 511 bytes of uninitialized trailing
garbage from blk_try_blockalign() - hopefully nothing sensitive
from the heap's prior usage is ever leaked in that manner.
tcg: use tcg_debug_assert instead of assert (fix performance regression)
The TCG code is quite performance sensitive, but at the same time can
also be quite tricky. That is why asserts that can be enabled with the
--enable-debug-tcg configure option.
Since commit 757e725b (tcg: Clean up includes) "config.h" as been
replaced by "qemu/osdep.h" which itself includes <assert.h>. As a
consequence the assertions are always enabled, even when using
--disable-debug-tcg, causing a performance regression, especially on
targets with many registers. For instance on qemu-system-ppc the
speed difference is about 15%.
tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already
uses in some places. This patch replaces all the calls to assert into
calss to tcg_debug_assert.
The 32-bit ARM Linux kernel booting ABI requires that r0 is 0
when calling the kernel image. A bug in commit 10b8ec73e610e01
meant that for boards which use the write_board_setup hook (which
means "highbank", "midway", "raspi2" and "xilinx-zynq-a9") we
were incorrectly skipping the "clear r0" instruction in the
mini-bootloader. Use the right offset in the "add lr, pc, #n"
instruction so that we return from the board-setup code to the
correct place.
When using K: in MAINTAINERS, false positives makes
get_maintainer.pl not use git history to find contributors. As
those patterns cause lots of false positives they are causing
more harm than good, so remove them.
Peter Maydell [Wed, 20 Apr 2016 15:43:53 +0000 (16:43 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Mirror block job fixes for 2.6.0-rc3
# gpg: Signature made Wed 20 Apr 2016 15:56:43 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
* remotes/kevin/tags/for-upstream:
iotests: Test case for drive-mirror with unaligned image size
iotests: Add iotests.image_size
mirror: Don't extend the last sub-chunk
block/mirror: Refresh stale bitmap iterator cache
block/mirror: Revive dead yielding code
Max Reitz [Tue, 19 Apr 2016 22:59:48 +0000 (00:59 +0200)]
block/mirror: Refresh stale bitmap iterator cache
If the drive's dirty bitmap is dirtied while the mirror operation is
running, the cache of the iterator used by the mirror code may become
stale and not contain all dirty bits.
This only becomes an issue if we are looking for contiguously dirty
chunks on the drive. In that case, we can easily detect the discrepancy
and just refresh the iterator if one occurs.
Max Reitz [Tue, 19 Apr 2016 22:59:47 +0000 (00:59 +0200)]
block/mirror: Revive dead yielding code
mirror_iteration() is supposed to wait if the current chunk is subject
to a still in-flight mirroring operation. However, it mixed checking
this conflict situation with checking the dirty status of a chunk. A
simplification for the latter condition (the first chunk encountered is
always dirty) led to neglecting the former: We just skip the first chunk
and thus never test whether it conflicts with an in-flight operation.
To fix this, pull out the code which waits for in-flight operations on
the first chunk of the range to be mirrored to settle.
Peter Maydell [Wed, 20 Apr 2016 14:05:19 +0000 (15:05 +0100)]
Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2016-04-19-tag' into staging
qemu-ga patch queue for 2.6
* fixes inadvertant change that unconditionally disables qemu-ga unit test
* fixes make check failures when building with --disable-guest-agent that
were present visible before the unit test was inadvertantly disabled.
* remotes/cody/tags/block-pull-request:
block/gluster: prevent data loss after i/o error
block/gluster: code movement of qemu_gluster_close()
block/gluster: return correct error value
Yang Hongyang [Tue, 19 Apr 2016 07:39:13 +0000 (15:39 +0800)]
qemu-ga: do not run qga test when guest agent disabled
When configure with --disable-guest-agent, make check will fail with:
ERROR:tests/test-qga.c:74:fixture_setup: assertion failed (error == NULL):
Failed to execute child process "/home/xx/qemu/qemu-ga" (No such file or
directory) (g-exec-error-quark, 8)
make: *** [check-tests/test-qga] Error 1
This check was commented out by bab47d9a75a. I think that was by
mistake, because the commit message of that commit didn't mention
this change.
Jeff Cody [Tue, 5 Apr 2016 14:40:09 +0000 (10:40 -0400)]
block/gluster: prevent data loss after i/o error
Upon receiving an I/O error after an fsync, by default gluster will
dump its cache. However, QEMU will retry the fsync, which is especially
useful when encountering errors such as ENOSPC when using the werror=stop
option. When using caching with gluster, however, the last written data
will be lost upon encountering ENOSPC. Using the write-behind-cache
xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
to retain the cached data after a failed fsync, so that ENOSPC and other
transient errors are recoverable.
Unfortunately, we have no way of knowing if the
'resync-failed-syncs-after-fsync' xlator option is supported, so for now
close the fd and set the BDS driver to NULL upon fsync error.
Jeff Cody [Wed, 6 Apr 2016 03:11:34 +0000 (23:11 -0400)]
block/gluster: return correct error value
Upon error, gluster will call the aio callback function with a
ret value of -1, with errno set to the proper error value. If
we set the acb->ret value to the return value in the callback,
that results in every error being EPERM (i.e. 1). Instead, set
it to the proper error result.
FW CFG's primary user is QEMU, which uses it to expose configuration
information (in the widest sense) to Firmware. Thus the name FW CFG.
FW CFG can also be used by others for their own purposes. QEMU is
merely acting as transport then. Names starting with opt/ are
reserved for such uses. There is no provision, however, to guide safe
sharing among different such users.
Fix that, loosely following QMP precedence: names should start with
opt/RFQDN/, where RFQDN is a reverse fully qualified domain name you
control.
Based on a more ambitious patch from Michael Tsirkin.
Peter Maydell [Tue, 19 Apr 2016 10:15:32 +0000 (11:15 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160419' into staging
ppc patch queueu for 2016-04-19
A single fix for a regression since 2.5. This should be the last ppc
pull request for 2.6.
# gpg: Signature made Tue 19 Apr 2016 02:48:30 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160419:
cuda: fix off-by-one error in SET_TIME command
cadence_uart_init() initializes an I/O memory region of size 0x1000
bytes. However in uart_write(), the 'offset' parameter (offset within
region) is divided by 4 and then used to index the array 'r' of size
CADENCE_UART_R_MAX which is much smaller: (0x48/4). If 'offset>>=2'
exceeds CADENCE_UART_R_MAX, this will cause an out-of-bounds memory
write where the offset and the value are controlled by guest.
This will corrupt QEMU memory, in most situations this causes the vm to
crash.
Fix by checking the offset against the array size.
Commit "156a2e4 ehci: make idt processing more robust" tries to avoid a
DoS by the guest (create a circular iTD queue and let qemu ehci
emulation run in circles forever). Unfortunately this has two problems:
First it misses the case of siTDs, and second it reportedly breaks
FreeBSD.
So lets go for a different approach: just count the number of iTDs and
siTDs we have seen per frame and apply a limit. That should really
catch all cases now.
With the new framework the cuda_cmd_set_time command directly receive
the data, without the command byte. Therefore the time is stored at
in_data[0], not at in_data[1].
This fixes the "hwclock --systohc" command in a guest.
target-i386: Set AMD alias bits after filtering CPUID data
QEMU complains about -cpu host on an AMD machine:
warning: host doesn't support requested feature: CPUID.80000001H:EDX [bit 0]
For bits 0,1,3,4,5,6,7,8,9,12,13,14,15,16,17,23,24.
KVM_GET_SUPPORTED_CPUID and and x86_cpu_get_migratable_flags()
don't handle the AMD CPUID aliases bits, making
x86_cpu_filter_features() print warnings and clear those CPUID
bits incorrectly.
To avoid hacking x86_cpu_get_migratable_flags() to handle
CPUID_EXT2_AMD_ALIASES (just like the existing hack inside
kvm_arch_get_supported_cpuid()), simply move the
CPUID_EXT2_AMD_ALIASES code in x86_cpu_realizefn() after the
x86_cpu_filter_features() call.
This will probably make the CPUID_EXT2_AMD_ALIASES hack in
kvm_arch_get_supported_cpuid() unnecessary, too. The hack will be
removed in a follow-up patch after v2.6.0.
Andreas Färber [Fri, 12 Feb 2016 14:50:58 +0000 (15:50 +0100)]
MAINTAINERS: Drop target-i386 from CPU subsystem
X86CPU QOM type is in good hands and actively maintained these days, so
drop it from the generic QOM CPU subsystem.
Some refactorings and design questions will still intersect, but review
and discussions of individual series can still take place while opting out
of general X86CPU patch review.
Peter Maydell [Mon, 18 Apr 2016 10:11:45 +0000 (11:11 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160418' into staging
ppc patch queue for 2-16-04-18
Three bugfixe patches for 2.6 here.
* Two for bad implementation of some of the strong load/store
instructions
* One for bad migration of the XER register. This is a regression
from 2.5, cause by a change in the way we represent at XER during
runtime.
# gpg: Signature made Mon 18 Apr 2016 06:17:03 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160418:
ppc: Fix migration of the XER register
ppc: Fix the bad exception NIP value and the range check in LSWX
ppc: Fix the range check in the LSWI instruction
Peter Maydell [Mon, 18 Apr 2016 09:22:43 +0000 (10:22 +0100)]
Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20160416' into staging
seccomp branch queue
# gpg: Signature made Sat 16 Apr 2016 19:58:46 BST using RSA key ID 12F8BD2F
# gpg: Good signature from "Eduardo Otubo (Software Engineer @ ProfitBricks) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1C96 46B6 E1D1 C38A F2EC 3FDE FD0C FF5B 12F8 BD2F
* remotes/otubo/tags/pull-seccomp-20160416:
seccomp: adding sysinfo system call to whitelist
seccomp: Whitelist cacheflush since 2.2.0 not 2.2.3
configure: Enable seccomp sandbox for MIPS
Peter Maydell [Mon, 18 Apr 2016 08:55:16 +0000 (09:55 +0100)]
Merge remote-tracking branch 'remotes/weil/tags/pull-wxx-20160415' into staging
wxx patch queue
# gpg: Signature made Fri 15 Apr 2016 18:36:41 BST using RSA key ID 677450AD
# gpg: Good signature from "Stefan Weil <[email protected]>"
# gpg: aka "Stefan Weil <[email protected]>"
# gpg: aka "Stefan Weil <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2 B78A E08C 21D5 6774 50AD
Thomas Huth [Fri, 15 Apr 2016 09:03:00 +0000 (11:03 +0200)]
ppc: Fix migration of the XER register
env->xer only holds the lower bits of the XER register nowadays, the
SO, OV and CA bits are stored in separate variables (see the function
cpu_write_xer() for details). Since the migration code currently only
reads the "xer" variable, the upper bits are lost during migration.
Fix it by using cpu_read_xer() instead.
Thomas Huth [Thu, 14 Apr 2016 15:14:53 +0000 (17:14 +0200)]
ppc: Fix the bad exception NIP value and the range check in LSWX
The range checks in the LSWX instruction are completely insufficient:
They do not take the wrap-around case into account, and the check
"reg < rx" should be "reg <= rx" instead. Fix it by using the new
lsw_reg_in_range() helper function that is already used for LSWI, too.
Then there is a second problem: In case the INVAL exception is generated,
the NIP value is wrong, it currently points to the instruction before
the LSWX instruction. This is because gen_lswx() already decreases the
NIP value by 4 (to be prepared for page fault exceptions), and
powerpc_excp() later decreases it again by 4 while handling the program
exception. So to get this right, we've got to undo the "- 4" from
gen_lswx() here before calling helper_raise_exception_err().
Thomas Huth [Thu, 14 Apr 2016 15:14:52 +0000 (17:14 +0200)]
ppc: Fix the range check in the LSWI instruction
There are two issues: First, the number of registers that are used has
to be calculated with "(nb + 3) / 4" (i.e. round always up, not down).
Second, the "start <= ra && (start + nr - 32) > ra" condition for the
wrap-around case is wrong: It has to be tested with "||" instead of "&&".
Since we can reuse this check later for the LSWX instruction, let's
place the fixed code into a helper function, too.
James Hogan [Fri, 8 Apr 2016 13:16:33 +0000 (14:16 +0100)]
seccomp: Whitelist cacheflush since 2.2.0 not 2.2.3
The cacheflush system call (found on MIPS and ARM) has been included in
the libseccomp header since 2.2.0, so include it back to that version.
Previously it was only enabled since 2.2.3 since that is when it was
enabled properly for ARM.
This will allow seccomp support to be enabled for MIPS back to
libseccomp 2.2.0.
Peter Maydell [Fri, 15 Apr 2016 17:26:49 +0000 (18:26 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches for 2.6.0-rc3
# gpg: Signature made Fri 15 Apr 2016 17:02:23 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
* remotes/kevin/tags/for-upstream:
nbd: Don't kill server on client that doesn't request TLS
nbd: fix assert() on qemu-nbd stop
nbd: Don't fail handshake on NBD_OPT_LIST descriptions
qemu-iotests: 041: More robust assertion on quorum node
qemu-iotests: place valgrind log file in scratch dir
qemu-iotests: tests: do not set unused tmp variable
qemu-iotests: common.rc: drop unused _do()
qemu-iotests: drop unused _within_tolerance() filter
Fix pflash migration
block: Don't ignore flags in blk_{,co,aio}_write_zeroes()
block/vpc: update comments to be compliant w/coding guidelines
block/vpc: set errp in vpc_open
block/vpc: make checks on max table size a bit more lax
block/vpc: Use the correct max sector count for VHD images
block/vpc: use current_size field for XenConverter VHD images
vpc: use current_size field for XenServer VHD images
block/vpc: set errp in vpc_create
block: Fix blk_aio_write_zeroes()
qemu-io: Support 'aio_write -z'
Kevin Wolf [Fri, 15 Apr 2016 15:59:42 +0000 (17:59 +0200)]
Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-04-15' into queue-block
Block patches for 2.6.0-rc3.
# gpg: Signature made Fri Apr 15 17:57:30 2016 CEST using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <[email protected]>"
* mreitz/tags/pull-block-for-kevin-2016-04-15:
nbd: Don't kill server on client that doesn't request TLS
nbd: fix assert() on qemu-nbd stop
nbd: Don't fail handshake on NBD_OPT_LIST descriptions
qemu-iotests: 041: More robust assertion on quorum node
qemu-iotests: place valgrind log file in scratch dir
qemu-iotests: tests: do not set unused tmp variable
qemu-iotests: common.rc: drop unused _do()
qemu-iotests: drop unused _within_tolerance() filter
Eric Blake [Thu, 14 Apr 2016 22:02:23 +0000 (16:02 -0600)]
nbd: Don't kill server on client that doesn't request TLS
Upstream NBD documents (as of commit 4feebc95) that servers MAY
choose to operate in a conditional mode, where it is up to the
client whether to use TLS. For qemu's case, we want to always be
in FORCEDTLS mode, because of the risk of man-in-the-middle
attacks, and since we never export more than one device; likewise,
the qemu client will ALWAYS send NBD_OPT_STARTTLS as its first
option. But now that SELECTIVETLS servers exist, it is feasible
to encounter a (non-qemu) client that is programmed to talk to
such a server, and does not do NBD_OPT_STARTTLS first, but rather
wants to probe if it can use a non-encrypted export.
The NBD protocol documents that we should let such a client
continue trying, on the grounds that maybe the client will get the
hint to send NBD_OPT_STARTTLS, rather than immediately dropping
the connection.
Note that NBD_OPT_EXPORT_NAME is a special case: since it is the
only option request that can't have an error return, we have to
(continue to) drop the connection on that one; rather, what we are
fixing here is that all other replies prior to TLS initiation tell
the client NBD_REP_ERR_TLS_REQD, but keep the connection alive.
Pavel Butsykin [Thu, 14 Apr 2016 10:20:15 +0000 (13:20 +0300)]
nbd: fix assert() on qemu-nbd stop
From time to time qemu-nbd is crashing on the following assert:
assert(state == TERMINATING);
nbd_export_closed
nbd_export_put
main
and the state at the moment of the crash is evaluated to TERMINATE.
During shutdown process of the client the nbd_client_thread thread sends
SIGTERM signal and the main thread calls the nbd_client_closed callback.
If the SIGTERM callback will be executed after change the state to
TERMINATING, then the state will once again be TERMINATE.
To solve the issue, we must change the state to TERMINATE only if the state
is RUNNING. In the other case we are shutting down already.
Eric Blake [Fri, 8 Apr 2016 01:09:37 +0000 (19:09 -0600)]
nbd: Don't fail handshake on NBD_OPT_LIST descriptions
The NBD Protocol states that NBD_REP_SERVER may set
'length > sizeof(namelen) + namelen'; in which case the rest
of the packet is a UTF-8 description of the export. While we
don't know of any NBD servers that send this description yet,
we had better consume the data so we don't choke when we start
to talk to such a server.
Also, a (buggy/malicious) server that replies with length <
sizeof(namelen) would cause us to block waiting for bytes that
the server is not sending, and one that replies with super-huge
lengths could cause us to temporarily allocate up to 4G memory.
Sanity check things before blindly reading incorrectly.
qemu-iotests: 041: More robust assertion on quorum node
Block nodes are now assigned names automatically, therefore the test
case is fragile in using fixed indices in result. Introduce a method in
iotests.py and do the matching more sensibly.
qemu-iotests: tests: do not set unused tmp variable
The previous commit removed the last usage of ${tmp} inside the tests
themselves; the only remaining users are sourced by check. So we can now
drop this variable from the tests.
Kevin Wolf [Fri, 15 Apr 2016 08:21:04 +0000 (10:21 +0200)]
block: Don't ignore flags in blk_{,co,aio}_write_zeroes()
Commit 57d6a428 neglected to pass the given flags to blk_aio_prwv(),
which broke discard by WRITE SAME for scsi-disk (the UNMAP bit would be
ignored).
Commit fc1453cd introduced the same bug for blk_write_zeroes(). This is
used for 'qemu-img convert' without has_zero_init (e.g. on a block
device) and for preallocation=falloc in parallels.
Commit 8896e088 is the version for blk_co_write_zeroes(). This function
is only used in qemu-io.
Jeff Cody [Wed, 23 Mar 2016 03:33:42 +0000 (23:33 -0400)]
block/vpc: make checks on max table size a bit more lax
The check on the max_table_size field not being larger than required is
valid, and in accordance with the VHD spec. However, there have been
VHD images encountered in the wild that have an out-of-spec max table
size that is technically too large.
There is no issue in allowing this larger table size, as we also
later verify that the computed size (used for the pagetable) is
large enough to fit all sectors. In addition, max_table_entries
is bounds checked against SIZE_MAX and INT_MAX.
Remove the strict check, so that we can accomodate these sorts of
images that are benignly out of spec.
Jeff Cody [Wed, 23 Mar 2016 03:33:41 +0000 (23:33 -0400)]
block/vpc: Use the correct max sector count for VHD images
The old VHD_MAX_SECTORS value is incorrect, and is a throwback
to the CHS calculations. The VHD specification allows images up to 2040
GiB, which (using 512 byte sectors) corresponds to a maximum number of
sectors of 0xff000000, rather than the old value of 0xfe0001ff.
Update VHD_MAX_SECTORS to reflect the correct value.
Also, update comment references to the actual size limit, and correct
one compare so that we can have sizes up to the limit.