Git Repo - qemu.git/log

usb-mtp: fix usb_mtp_get_device_info so that libmtp on the guest doesn't complain

If an application uses libmtp on the guest system,
it will complain with the warning message:
LIBMTP WARNING: VendorExtensionID: ffffffff
LIBMTP WARNING: VendorExtensionDesc: (null)
LIBMTP WARNING: this typically means the device is PTP (i.e. a camera) but
not a MTP device at all. Trying to continue anyway.

This is because libmtp expects a MTP Vendor Extension ID of 0x00000006 and a
MTP Version of 0x0064. These numbers are taken from Microsoft's MTP Vendor
Extension Identification Message page and are what most physical devices
show.

Signed-off-by: Isaac Lozano <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Message-id: 1460892593 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

usb:xhci: no DMA on HC reset

This patch is a rough fix to a memory corruption we are observing when
running VMs with xhci USB controller and OVMF firmware.

Specifically, on the following call chain

xhci_reset
  xhci_disable_slot
    xhci_disable_ep
      xhci_set_ep_state

QEMU overwrites guest memory using stale guest addresses.

This doesn't happen when the guest (firmware) driver sets up xhci for
the first time as there are no slots configured yet.  However when the
firmware hands over the control to the OS some slots and endpoints are
already set up with their context in the guest RAM.  Now the OS' driver
resets the controller again and xhci_set_ep_state then reads and writes
that memory which is now owned by the OS.

As a quick fix, skip calling xhci_set_ep_state in xhci_disable_ep if the
device context base address array pointer is zero (indicating we're in
the HC reset and no DMA is possible).

Cc: [email protected]
Signed-off-by: Roman Kagan <[email protected]>
Message-id: 1462384435 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

Update version for v2.6.0-rc5 release

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20160509-1' into staging

vga security fixes (CVE-2016-3710, CVE-2016-3712)

# gpg: Signature made Mon 09 May 2016 13:39:30 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"

* remotes/kraxel/tags/pull-vga-20160509-1:
  vga: make sure vga register setup for vbe stays intact (CVE-2016-3712).
  vga: update vga register setup on vbe changes
  vga: factor out vga register setup
  vga: add vbe_enabled() helper
  vga: fix banked access bounds checking (CVE-2016-3710)

Signed-off-by: Peter Maydell <[email protected]>

Update version for v2.6.0-rc4 release

Signed-off-by: Peter Maydell <[email protected]>

Revert "acpi: mark PMTIMER as unlocked"

This reverts commit 7070e085d490c396f9237c8f10bf8b6e69cd0066.

Commit message claims locking is not needed, but that appears
to not be true, seabios ehci driver runs into timekeeping problems
with this, see
https://bugzilla.redhat.com/show_bug.cgi?id=1322713

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 1460702609 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

vga: make sure vga register setup for vbe stays intact (CVE-2016-3712).

Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode.  This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.

Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.

Which is good for one or another buffer overflow.  Not that
critical as they typically read overflows happening somewhere
in the display code.  So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.

Fixes: CVE-2016-3712
Reported-by: Zuozhi Fzz <[email protected]>
Reported-by: P J P <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>

vga: update vga register setup on vbe changes

Call the new vbe_update_vgaregs() function on vbe configuration
changes, to make sure vga registers are up-to-date.

Signed-off-by: Gerd Hoffmann <[email protected]>

vga: factor out vga register setup

When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer. Move that code to a separate function so we can call it
from other places too.

Signed-off-by: Gerd Hoffmann <[email protected]>

vga: add vbe_enabled() helper

Makes code a bit easier to read.

Signed-off-by: Gerd Hoffmann <[email protected]>

vga: fix banked access bounds checking (CVE-2016-3710)

vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.

The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register. The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.

Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.

Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.

Fixes: CVE-2016-3710
Reported-by: Qinghao Tang <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>

configure: Check if struct fsxattr is available from linux header

Fixes build failure with --enable-xfsctl and
new linux headers (>=4.5) and older xfsprogs(<4.5):
In file included from /usr/include/xfs/xfs.h:38:0,
from /var/tmp/portage/app-emulation/qemu-2.5.0-r1/work/qemu-2.5.0/block/raw-posix.c:97:
/usr/include/xfs/xfs_fs.h:42:8: error: redefinition of ‘struct fsxattr’
struct fsxattr {
^
In file included from /var/tmp/portage/app-emulation/qemu-2.5.0-r1/work/qemu-2.5.0/block/raw-posix.c:60:0:
/usr/include/linux/fs.h:155:8: note: originally defined here
struct fsxattr {

This is really a bug in the system headers, but we can work around it
by defining HAVE_FSXATTR in the QEMU headers if linux/fs.h provides
the struct, so that xfs_fs.h doesn't try to define it as well.

CC: [email protected]
CC: Markus Armbruster <[email protected]>
CC: Peter Maydell <[email protected]>
CC: Stefan Weil <[email protected]>
Tested-by: Stefan Weil <[email protected]>
Signed-off-by: Jan Vesely <[email protected]>
[PMM: adjusted commit message, comments]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

acpi: last minute fix for 2.6

Minor, obvious fix only affecting BE hosts.

Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Sun 01 May 2016 13:43:28 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"

* remotes/mst/tags/for_upstream:
acpi: fix bios linker loadder COMMAND_ALLOCATE on bigendian host

Signed-off-by: Peter Maydell <[email protected]>

acpi: fix bios linker loadder COMMAND_ALLOCATE on bigendian host

'make check' fails with:

ERROR:tests/bios-tables-test.c:493:load_expected_aml:
assertion failed: (g_file_test(aml_file, G_FILE_TEST_EXISTS))

since commit:
caf50c7166a6ed96c462ab5db4b495e1234e4cc6
tests: pc: acpi: drop not needed 'expected SSDT' blobs

Assert happens because qemu-system-x86_64 generates
SSDT table and test looks for a corresponding expected
table to compare with.

However there is no expected SSDT blob anymore, since
QEMU souldn't generate one. As it happens BIOS is not
able to read ACPI tables from QEMU and fallbacks to
embeded legacy ACPI codepath, which generates SSDT.
That happens due to wrongly sized endiannes conversion
which makes
uint8_t BiosLinkerLoaderEntry.alloc.zone
end up with 0 due to truncation of 32 bit integer
which on host is 1 or 2.

Fix it by dropping invalid cpu_to_le32() as uint8_t
doesn't require any conversion.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1330174

Signed-off-by: Igor Mammedov <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Tested-by: Laurent Vivier <[email protected]>
Reviewed-by: Marcel Apfelbaum <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

vvfat fixes for 2.6.0-rc4

# gpg: Signature made Fri 29 Apr 2016 10:52:13 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"

* remotes/kevin/tags/for-upstream:
vvfat: Fix default volume label
vvfat: Fix volume name assertion

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-04-29' into staging

QAPI patches for 2016-04-29

# gpg: Signature made Fri 29 Apr 2016 10:13:08 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"

* remotes/armbru/tags/pull-qapi-2016-04-29:
qapi: Don't pass NULL to printf in string input visitor

Signed-off-by: Peter Maydell <[email protected]>

vvfat: Fix default volume label

Commit d5941dd documented that it leaves the default volume name as it
was ("QEMU VVFAT"), but it doesn't actually implement this. You get an
empty name (eleven space characters) instead.

This fixes the implementation to apply the advertised default.

Cc: [email protected]
Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>

vvfat: Fix volume name assertion

Commit d5941dd made the volume name configurable, but it didn't consider
that the rw code compares the volume name string to assert that the
first directory entry is the volume name. This made vvfat crash in rw
mode.

This fixes the assertion to compare with the configured volume name
instead of a literal string.

Cc: [email protected]
Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>

qapi: Don't pass NULL to printf in string input visitor

Make sure the error message for visit_type_uint64() gracefully
handles a NULL 'name' when called from the top level or a list
context, as not all the world behaves like glibc in allowing
NULL through a printf-family %s.

Signed-off-by: Eric Blake <[email protected]>
Message-Id: <1461879932 [email protected]>
Signed-off-by: Markus Armbruster <[email protected]>

slirp: fix guest network access with darwin host

On Darwin, connect, sendto and friends want the exact size of the sockaddr,
not more (and in particular, not sizeof(struct sockaddr_storaget))

This commit adds the sockaddr_size helper to be used when passing a sockaddr
size to such function, and makes use of it int sendto and connect calls.

Signed-off-by: Samuel Thibault <[email protected]>
Reviewed-by: John Arbuckle <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160428' into staging

MIPS patches 2016-04-28

Changes:
* fixed RDHWR exception host PC

# gpg: Signature made Thu 28 Apr 2016 10:11:18 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <[email protected]>"

* remotes/lalrae/tags/mips-20160428:
target-mips: Fix RDHWR exception host PC

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-04-28' into staging

Fix dangling pointers and error message regressions

# gpg: Signature made Thu 28 Apr 2016 07:25:51 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg:                 aka "Markus Armbruster <[email protected]>"

* remotes/armbru/tags/pull-error-2016-04-28:
  qom: -object error messages lost location, restore it
  replay: Fix dangling location bug in replay_configure()
  QemuOpts: Fix qemu_opts_foreach() dangling location regression

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160426' into staging

ppc patch queue for 2016-04-26 (last minute qemu-2.6 fix)

This just has one, last-minute, fix for a serious regression of memory
hotplug.

Patch author's comment:
    Really sorry for the way last-minute fix, but without this memory
    hotplug is totally broken :( Hoping to get this in for Wednesday's
    RC4, which I think will be the final before release.

# gpg: Signature made Tue 26 Apr 2016 03:52:20 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160426:
  spapr_drc: fix aborts during DRC-count based hotplug

Signed-off-by: Peter Maydell <[email protected]>

target-mips: Fix RDHWR exception host PC

Commit b00c72180c36 ("target-mips: add PC, XNP reg numbers to RDHWR")
changed the rdhwr helpers to use check_hwrena() to check the register
being accessed is enabled in CP0_HWREna when used from user mode. If
that check fails an EXCP_RI exception is raised at the host PC
calculated with GETPC().

However check_hwrena() may not be fully inlined as the
do_raise_exception() part of it is common regardless of the arguments.
This causes GETPC() to calculate the address in the call in the helper
instead of the generated code calling the helper. No TB will be found
and the EPC reported with the resulting guest RI exception points to the
beginning of the TB instead of the RDHWR instruction.

We can't reliably force check_hwrena() to be inlined, and converting it
to a macro would be ugly, so instead pass the host PC in as an argument,
with each rdhwr helper passing GETPC(). This should avoid any dependence
on compiler behaviour, and in practice seems to ensure the full inlining
of check_hwrena() on x86_64.

This issue causes failures when running a MIPS KVM (trap & emulate)
guest in a MIPS QEMU TCG guest, as the inner guest kernel will do a
RDHWR of counter, which is disabled in the outer guest's CP0_HWREna by
KVM so it can emulate the inner guest's counter. The emulation fails and
the RI exception is passed to the inner guest.

Fixes: b00c72180c36 ("target-mips: add PC, XNP reg numbers to RDHWR")
Signed-off-by: James Hogan <[email protected]>
Cc: Leon Alrae <[email protected]>
Cc: Yongbok Kim <[email protected]>
Cc: Aurelien Jarno <[email protected]>
Reviewed-by: Aurelien Jarno <[email protected]>
Reviewed-by: Leon Alrae <[email protected]>
Signed-off-by: Leon Alrae <[email protected]>

qom: -object error messages lost location, restore it

qemu_opts_foreach() runs its callback with the error location set to
the option's location.  Any errors the callback reports use the
option's location automatically.

Commit 90998d5 moved the actual error reporting from "inside"
qemu_opts_foreach() to after it.  Here's a typical hunk:

if (qemu_opts_foreach(qemu_find_opts("object"),
    -                          object_create,
    -                          object_create_initial, NULL)) {
    +                          user_creatable_add_opts_foreach,
    +                          object_create_initial, &err)) {
    +        error_report_err(err);
     exit(1);
}

Before, object_create() reports from within qemu_opts_foreach(), using
the option's location.  Afterwards, we do it after
qemu_opts_foreach(), using whatever location happens to be current
there.  Commonly a "none" location.

This is because Error objects don't have location information.
Problematic.

Reproducer:

    $ qemu-system-x86_64 -nodefaults -display none -object secret,id=foo,foo=bar
    qemu-system-x86_64: Property '.foo' not found

Note no location.  This commit restores it:

    qemu-system-x86_64: -object secret,id=foo,foo=bar: Property '.foo' not found

Note that the qemu_opts_foreach() bug just fixed could mask the bug
here: if the location it leaves dangling hasn't been clobbered, yet,
it's the correct one.

Reported-by: Eric Blake <[email protected]>
Cc: Daniel P. Berrange <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1461767349 [email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
[Paragraph on Error added to commit message]

replay: Fix dangling location bug in replay_configure()

replay_configure() pushes and pops a Location with automatic storage
duration. Except it fails to pop when -icount parameter "rr" isn't
given. cur_loc then points to unused stack space, and will most
likely get clobbered in short order.

Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
report bogus locations.

Broken in commit 890ad55.

I didn't take the time to find a reproducer.

Cc: Eduardo Habkost <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1461767349 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>

QemuOpts: Fix qemu_opts_foreach() dangling location regression

qemu_opts_foreach() pushes and pops a Location with automatic storage
duration.  Except it fails to pop when @func() returns non-zero.
cur_loc then points to unused stack space, and will most likely get
clobbered in short order.

Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
report bogus locations.

Affects several qemu command line options as well as qemu-img,
qemu-io, qemu-nbd -object, and blkdebug's configuration file.

Broken in commit a4c7367, v2.4.0.

Reproducer:
    $ qemu-system-x86_64 -nodefaults -display none -object secret,id=foo,foo=bar

main() reports "Property '.foo' not found" like this:

    if (qemu_opts_foreach(qemu_find_opts("object"),
                          user_creatable_add_opts_foreach,
                          object_create_delayed, &err)) {
        error_report_err(err);
        exit(1);
    }

cur_loc then points to where qemu_opts_foreach()'s Location used to
be, i.e. unused stack space.  With optimization, this Location doesn't
get clobbered for me, and also happens to be the correct location.
Without optimization, it does get clobbered in a way that makes
error_report_err() report no location.

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1461767349 [email protected]>
Reviewed-by: Eric Blake <[email protected]>

spapr_drc: fix aborts during DRC-count based hotplug

CPU/memory resources can be signalled en-masse via
spapr_hotplug_req_add_by_count(), and when doing so, actually change
the meaning of the 'drc' parameter passed to
spapr_hotplug_req_event() to be a count rather than an index.

f40eb92 added a hook in spapr_hotplug_req_event() to record when a
device had been 'signalled' to the guest, but that code assumes that
drc is always an index. In cases where it's a count, such as memory
hotplug, the DRC lookup will fail, leading to an assert.

Fix this by only explicitly setting the signalled state for cases where
we are doing PCI hotplug.

For other resources types, since we cannot selectively track whether a
resource has been signalled in cases where we signal attach as a count,
set the 'signalled' state to true immediately upon making the
resource available via drck->attach().

Reported-by: Bharata B Rao <[email protected]>
Cc: Bharata B Rao <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

usb/uhci: move pid check

commit "5f77e06 usb: add pid check at the first of uhci_handle_td()"
moved the pid verification to the start of the uhci_handle_td function,
to simplify the error handling (we don't have to free stuff which we
didn't allocate in the first place ...).

Problem is now the check fires too often, it raises error IRQs even for
TDs which we are not going to process because they are not set active.

So, lets move down the check a bit, so it is done only for active TDs,
but still before we are going to allocate stuff to process the requested
transfer.

Reported-by: Joe Clifford <[email protected]>
Tested-by: Joe Clifford <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 1461321893 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160423' into staging

ppc patch queue for 2016-03-23

A single fix for a bug in parameter handling for the spapr PCI host
bridge.

# gpg: Signature made Sat 23 Apr 2016 07:55:29 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160423:
  hw/ppc/spapr: Fix crash when specifying bad parameters to spapr-pci-host-bridge

Signed-off-by: Peter Maydell <[email protected]>

hw/ppc/spapr: Fix crash when specifying bad parameters to spapr-pci-host-bridge

QEMU currently crashes when using bad parameters for the
spapr-pci-host-bridge device:

$ qemu-system-ppc64 -device spapr-pci-host-bridge,buid=0x123,liobn=0x321,mem_win_addr=0x1,io_win_addr=0x10
Segmentation fault

The problem is that spapr_tce_find_by_liobn() might return NULL, but
the code in spapr_populate_pci_dt() does not check for this condition
and then tries to dereference this NULL pointer.
Apart from that, the return value of spapr_populate_pci_dt() also
has to be checked for all PCI buses, not only for the last one, to
make sure we catch all errors.

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Mirror block job fixes for 2.6.0-rc4

# gpg: Signature made Fri 22 Apr 2016 15:46:41 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"

* remotes/kevin/tags/for-upstream:
  mirror: Workaround for unexpected iohandler events during completion
  aio-posix: Skip external nodes in aio_dispatch
  virtio: Mark host notifiers as external
  event-notifier: Add "is_external" parameter
  iohandler: Introduce iohandler_get_aio_context

Signed-off-by: Peter Maydell <[email protected]>

mirror: Workaround for unexpected iohandler events during completion

Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any
protection against new requests that could sneak in just before the
BH is dispatched. For example (assuming a code base at that commit):

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch
                aio_dispatch
                  ...
                    mirror_run
                      bdrv_drain
    (a)               block_job_defer_to_main_loop
          qemu_iohandler_poll
            virtio_queue_host_notifier_read
              ...
                virtio_submit_multiwrite
    (b)           blk_aio_multiwrite

        main_loop_wait # 2
          <snip>
                aio_dispatch
                  aio_bh_poll
    (c)             mirror_exit

At (a) we know the BDS has no pending request. However, the same
main_loop_wait call is going to dispatch iohandlers (EventNotifier
events), which may lead to a new I/O from guest. So the invariant is
already broken at (c). Data loss.

Commit f3926945c8 made iohandler to use aio API.  The order of
virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
a main_loop_wait becomes unpredictable, and even worse, if the host
notifier event arrives at the next main_loop_wait call, the
unpredictable order between mirror_exit and
virtio_queue_host_notifier_read is also a trouble. As shown below, this
commit made the bug easier to trigger:

    - Bug case 1:

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch (qemu_aio_context)
                ...
                  mirror_run
                    bdrv_drain
    (a)             block_job_defer_to_main_loop
              aio_ctx_dispatch (iohandler_ctx)
                virtio_queue_host_notifier_read
                  ...
                    virtio_submit_multiwrite
    (b)               blk_aio_multiwrite

        main_loop_wait # 2
          ...
                aio_dispatch
                  aio_bh_poll
    (c)             mirror_exit

    - Bug case 2:

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch (qemu_aio_context)
                ...
                  mirror_run
                    bdrv_drain
    (a)             block_job_defer_to_main_loop

        main_loop_wait # 2
          ...
            aio_ctx_dispatch (iohandler_ctx)
              virtio_queue_host_notifier_read
                ...
                  virtio_submit_multiwrite
    (b)             blk_aio_multiwrite
              aio_dispatch
                aio_bh_poll
    (c)           mirror_exit

In both cases, (b) breaks the invariant wanted by (a) and (c).

Until then, the request loss has been silent. Later, 3f09bfbc7be added
asserts at (c) to check the invariant (in
bdrv_replace_in_backing_chain), and Max reported an assertion failure
first visible there, by doing active committing while the guest is
running bonnie++.

2.5 added bdrv_drained_begin at (a) to protect the dataplane case from
similar problems, but we never realize the main loop bug until now.

As a bandage, this patch disables iohandler's external events
temporarily together with bs->ctx.

Launchpad Bug: 1570134

Cc: [email protected]
Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

aio-posix: Skip external nodes in aio_dispatch

aio_poll doesn't poll the external nodes so this should never be true,
but aio_ctx_dispatch may get notified by the events from GSource. To
make bdrv_drained_begin effective in main loop, we should check the
is_external flag here too.

Also do the check in aio_pending so aio_dispatch is not called
superfluously, when there is no events other than external ones.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

virtio: Mark host notifiers as external

The effect of this change is the block layer drained section can work,
for example when mirror job is being completed.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

event-notifier: Add "is_external" parameter

All callers pass "false" keeping the old semantics. The windows
implementation doesn't distinguish the flag yet. On posix, it is passed
down to the underlying aio context.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

iohandler: Introduce iohandler_get_aio_context

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

util: align memory allocations to 2M on AArch64

For KVM to use Transparent Huge Pages (THP) we have to ensure that the
alignment of the userspace address of the KVM memory slot and the IPA
that the guest sees for a memory region have the same offset from the 2M
huge page size boundary.

One way to achieve this is to always align the IPA region at a 2M
boundary and ensure that the mmap alignment is also at 2M.

Unfortunately, we were only doing this for __arm__, not for __aarch64__,
so add this simple condition.

This fixes a performance regression using KVM/ARM on AArch64 platforms
that showed a performance penalty of more than 50%, introduced by the
following commit:

9fac18f (oslib: allocate PROT_NONE pages on top of RAM, 2015-09-10)

We were only lucky before the above commit, because we were allocating
large regions and naturally getting a 2M alignment on those allocations
then.

Cc: [email protected]
Reported-by: Shih-Wei Li <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
[PMM: wrapped long line]
Signed-off-by: Peter Maydell <[email protected]>

nbd: Don't mishandle unaligned client requests

The NBD protocol does not (yet) force any alignment constraints
on clients. Even though qemu NBD clients always send requests
that are aligned to 512 bytes, we must be prepared for non-qemu
clients that don't care about alignment (even if it means they
are less efficient). Our use of blk_read() and blk_write() was
silently operating on the wrong file offsets when the client
made an unaligned request, corrupting the client's data (but
as the client already has control over the file we are serving,
I don't think it is a security hole, per se, just a data
corruption bug).

Note that in the case of NBD_CMD_READ, an unaligned length could
cause us to return up to 511 bytes of uninitialized trailing
garbage from blk_try_blockalign() - hopefully nothing sensitive
from the heap's prior usage is ever leaked in that manner.

Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Tested-by: Kevin Wolf <[email protected]>
Message-id: 1461249750 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

Update version for v2.6.0-rc3 release

Signed-off-by: Peter Maydell <[email protected]>

tcg: check for CONFIG_DEBUG_TCG instead of NDEBUG

Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code.

Cc: Richard Henderson <[email protected]>
Signed-off-by: Aurelien Jarno <[email protected]>
Message-id: 1461228530 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

tcg: use tcg_debug_assert instead of assert (fix performance regression)

The TCG code is quite performance sensitive, but at the same time can
also be quite tricky. That is why asserts that can be enabled with the
--enable-debug-tcg configure option.

This used to work the following way:

| #include "config.h"
|
| ...
|
| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
| /* define it to suppress various consistency checks (faster) */
| #define NDEBUG
| #endif
|
| ...
|
| #include <assert.h>

Since commit 757e725b (tcg: Clean up includes) "config.h" as been
replaced by "qemu/osdep.h" which itself includes <assert.h>. As a
consequence the assertions are always enabled, even when using
--disable-debug-tcg, causing a performance regression, especially on
targets with many registers. For instance on qemu-system-ppc the
speed difference is about 15%.

tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already
uses in some places. This patch replaces all the calls to assert into
calss to tcg_debug_assert.

Cc: Peter Maydell <[email protected]>
Cc: Richard Henderson <[email protected]>
Signed-off-by: Aurelien Jarno <[email protected]>
Message-id: 1461228530 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/boot: always clear r0 when booting kernels

The 32-bit ARM Linux kernel booting ABI requires that r0 is 0
when calling the kernel image. A bug in commit 10b8ec73e610e01
meant that for boards which use the write_board_setup hook (which
means "highbank", "midway", "raspi2" and "xilinx-zynq-a9") we
were incorrectly skipping the "clear r0" instruction in the
mini-bootloader. Use the right offset in the "add lr, pc, #n"
instruction so that we return from the board-setup code to the
correct place.

Signed-off-by: Sylvain Garrigues <[email protected]>
[PMM: Expanded commit message]
Cc: [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

MAINTAINERS: Avoid using K: for NUMA section

When using K: in MAINTAINERS, false positives makes
get_maintainer.pl not use git history to find contributors. As
those patterns cause lots of false positives they are causing
more harm than good, so remove them.

Reported-by: Markus Armbruster <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>
Message-id: 1461164130 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Mirror block job fixes for 2.6.0-rc3

# gpg: Signature made Wed 20 Apr 2016 15:56:43 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"

* remotes/kevin/tags/for-upstream:
  iotests: Test case for drive-mirror with unaligned image size
  iotests: Add iotests.image_size
  mirror: Don't extend the last sub-chunk
  block/mirror: Refresh stale bitmap iterator cache
  block/mirror: Revive dead yielding code

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/sstabellini/tags/xen-2016-04-20' into staging

Xen 2016/04/20

# gpg: Signature made Wed 20 Apr 2016 12:08:56 BST using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"

* remotes/sstabellini/tags/xen-2016-04-20:
xenfb: use the correct condition to avoid excessive looping

Signed-off-by: Peter Maydell <[email protected]>

iotests: Test case for drive-mirror with unaligned image size

This is the regression test for the virtual size mismatch issue between
target and source images.

[ kwolf: Added test_unaligned_with_update ]

Signed-off-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>

iotests: Add iotests.image_size

This retrieves the virtual size of the image out of qemu-img info.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

mirror: Don't extend the last sub-chunk

The last sub-chunk is rounded up to the copy granularity in the target
image, resulting in a larger size than the source.

Add a function to clip the copied sectors to the end.

This undoes the "wrong" changes to tests/qemu-iotests/109.out in
e5b43573e28. The remaining two offset changes are okay.

[ kwolf: Use DIV_ROUND_UP to calculate nb_chunks now ]

Reported-by: Kevin Wolf <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>

block/mirror: Refresh stale bitmap iterator cache

If the drive's dirty bitmap is dirtied while the mirror operation is
running, the cache of the iterator used by the mirror code may become
stale and not contain all dirty bits.

This only becomes an issue if we are looking for contiguously dirty
chunks on the drive. In that case, we can easily detect the discrepancy
and just refresh the iterator if one occurs.

Signed-off-by: Max Reitz <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block/mirror: Revive dead yielding code

mirror_iteration() is supposed to wait if the current chunk is subject
to a still in-flight mirroring operation. However, it mixed checking
this conflict situation with checking the dirty status of a chunk. A
simplification for the latter condition (the first chunk encountered is
always dirty) led to neglecting the former: We just skip the first chunk
and thus never test whether it conflicts with an in-flight operation.

To fix this, pull out the code which waits for in-flight operations on
the first chunk of the range to be mirrored to settle.

Signed-off-by: Max Reitz <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2016-04-19-tag' into staging

qemu-ga patch queue for 2.6

* fixes inadvertant change that unconditionally disables qemu-ga unit test
* fixes make check failures when building with --disable-guest-agent that
  were present visible before the unit test was inadvertantly disabled.

# gpg: Signature made Tue 19 Apr 2016 23:30:09 BST using RSA key ID F108B584
# gpg: Good signature from "Michael Roth <[email protected]>"
# gpg:                 aka "Michael Roth <[email protected]>"
# gpg:                 aka "Michael Roth <[email protected]>"

* remotes/mdroth/tags/qga-pull-2016-04-19-tag:
  qemu-ga: do not run qga test when guest agent disabled

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Tue 19 Apr 2016 17:28:01 BST using RSA key ID C0DE3057
# gpg: Good signature from "Jeffrey Cody <[email protected]>"
# gpg:                 aka "Jeffrey Cody <[email protected]>"
# gpg:                 aka "Jeffrey Cody <[email protected]>"

* remotes/cody/tags/block-pull-request:
  block/gluster: prevent data loss after i/o error
  block/gluster: code movement of qemu_gluster_close()
  block/gluster: return correct error value

Signed-off-by: Peter Maydell <[email protected]>

qemu-ga: do not run qga test when guest agent disabled

When configure with --disable-guest-agent, make check will fail with:
ERROR:tests/test-qga.c:74:fixture_setup: assertion failed (error == NULL):
Failed to execute child process "/home/xx/qemu/qemu-ga" (No such file or
directory) (g-exec-error-quark, 8)
make: *** [check-tests/test-qga] Error 1

This check was commented out by bab47d9a75a. I think that was by
mistake, because the commit message of that commit didn't mention
this change.

Signed-off-by: Yang Hongyang <[email protected]>
Cc: Gerd Hoffmann <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michael Roth <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Cc: [email protected]

Update language files for QEMU 2.6.0

Update translation files (change created via 'make -C po update').

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1461059023 [email protected]
Reviewed-by: Stefan Weil <[email protected]>

block/gluster: prevent data loss after i/o error

Upon receiving an I/O error after an fsync, by default gluster will
dump its cache.  However, QEMU will retry the fsync, which is especially
useful when encountering errors such as ENOSPC when using the werror=stop
option.  When using caching with gluster, however, the last written data
will be lost upon encountering ENOSPC.  Using the write-behind-cache
xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
to retain the cached data after a failed fsync, so that ENOSPC and other
transient errors are recoverable.

Unfortunately, we have no way of knowing if the
'resync-failed-syncs-after-fsync' xlator option is supported, so for now
close the fd and set the BDS driver to NULL upon fsync error.

Signed-off-by: Jeff Cody <[email protected]>

block/gluster: code movement of qemu_gluster_close()

Move qemu_gluster_close() further up in the file, in preparation
for the next patch, to avoid a forward declaration.

Signed-off-by: Jeff Cody <[email protected]>

block/gluster: return correct error value

Upon error, gluster will call the aio callback function with a
ret value of -1, with errno set to the proper error value. If
we set the acb->ret value to the return value in the callback,
that results in every error being EPERM (i.e. 1). Instead, set
it to the proper error result.

Reviewed-by: Niels de Vos <[email protected]>
Signed-off-by: Jeff Cody <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-fw_cfg-2016-04-19' into staging

fw_cfg: Adopt /opt/RFQDN convention

# gpg: Signature made Tue 19 Apr 2016 15:14:20 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"

* remotes/armbru/tags/pull-fw_cfg-2016-04-19:
fw_cfg: Adopt /opt/RFQDN convention

Signed-off-by: Peter Maydell <[email protected]>

fw_cfg: Adopt /opt/RFQDN convention

FW CFG's primary user is QEMU, which uses it to expose configuration
information (in the widest sense) to Firmware.  Thus the name FW CFG.

FW CFG can also be used by others for their own purposes.  QEMU is
merely acting as transport then.  Names starting with opt/ are
reserved for such uses.  There is no provision, however, to guide safe
sharing among different such users.

Fix that, loosely following QMP precedence: names should start with
opt/RFQDN/, where RFQDN is a reverse fully qualified domain name you
control.

Based on a more ambitious patch from Michael Tsirkin.

Cc: Gerd Hoffmann <[email protected]>
Cc: Gabriel L. Somlo <[email protected]>
Cc: Laszlo Ersek <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Gabriel Somlo <[email protected]>
Reviewed-by: Laszlo Ersek <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160419-1' into staging

ehci: fix (s)iTD looping issue (CVE-2015-8558) in a different way.

# gpg: Signature made Tue 19 Apr 2016 07:22:22 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"

* remotes/kraxel/tags/pull-usb-20160419-1:
  Revert "ehci: make idt processing more robust"
  ehci: apply limit to iTD/sidt descriptors

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160419' into staging

ppc patch queueu for 2016-04-19

A single fix for a regression since 2.5.  This should be the last ppc
pull request for 2.6.

# gpg: Signature made Tue 19 Apr 2016 02:48:30 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160419:
  cuda: fix off-by-one error in SET_TIME command

Signed-off-by: Peter Maydell <[email protected]>

cadence_uart: bounds check write offset

cadence_uart_init() initializes an I/O memory region of size 0x1000
bytes. However in uart_write(), the 'offset' parameter (offset within
region) is divided by 4 and then used to index the array 'r' of size
CADENCE_UART_R_MAX which is much smaller: (0x48/4). If 'offset>>=2'
exceeds CADENCE_UART_R_MAX, this will cause an out-of-bounds memory
write where the offset and the value are controlled by guest.

This will corrupt QEMU memory, in most situations this causes the vm to
crash.

Fix by checking the offset against the array size.

Cc: [email protected]
Reported-by: 李强 <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Message-id: 20160418100735 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

X86 fix for 2.6.0-rc3

# gpg: Signature made Mon 18 Apr 2016 20:02:15 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"

* remotes/ehabkost/tags/x86-pull-request:
target-i386: Set AMD alias bits after filtering CPUID data

Signed-off-by: Peter Maydell <[email protected]>

Revert "ehci: make idt processing more robust"

This reverts commit 156a2e4dbffa85997636a7a39ef12da6f1b40254.

Breaks FreeBSD.

Signed-off-by: Gerd Hoffmann <[email protected]>

ehci: apply limit to iTD/sidt descriptors

Commit "156a2e4 ehci: make idt processing more robust" tries to avoid a
DoS by the guest (create a circular iTD queue and let qemu ehci
emulation run in circles forever). Unfortunately this has two problems:
First it misses the case of siTDs, and second it reportedly breaks
FreeBSD.

So lets go for a different approach: just count the number of iTDs and
siTDs we have seen per frame and apply a limit. That should really
catch all cases now.

Reported-by: 杜少博 <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>

cuda: fix off-by-one error in SET_TIME command

With the new framework the cuda_cmd_set_time command directly receive
the data, without the command byte. Therefore the time is stored at
in_data[0], not at in_data[1].

This fixes the "hwclock --systohc" command in a guest.

Cc: Hervé Poussineau <[email protected]>
Cc: David Gibson <[email protected]>
Signed-off-by: Aurelien Jarno <[email protected]>
Reviewed-by: Hervé Poussineau <[email protected]>
[this fixes a regression introduced by e647317 "cuda: port SET_TIME
command to new framework"]
Signed-off-by: David Gibson <[email protected]>

target-i386: Set AMD alias bits after filtering CPUID data

QEMU complains about -cpu host on an AMD machine:
warning: host doesn't support requested feature: CPUID.80000001H:EDX [bit 0]
For bits 0,1,3,4,5,6,7,8,9,12,13,14,15,16,17,23,24.

KVM_GET_SUPPORTED_CPUID and and x86_cpu_get_migratable_flags()
don't handle the AMD CPUID aliases bits, making
x86_cpu_filter_features() print warnings and clear those CPUID
bits incorrectly.

To avoid hacking x86_cpu_get_migratable_flags() to handle
CPUID_EXT2_AMD_ALIASES (just like the existing hack inside
kvm_arch_get_supported_cpuid()), simply move the
CPUID_EXT2_AMD_ALIASES code in x86_cpu_realizefn() after the
x86_cpu_filter_features() call.

This will probably make the CPUID_EXT2_AMD_ALIASES hack in
kvm_arch_get_supported_cpuid() unnecessary, too. The hack will be
removed in a follow-up patch after v2.6.0.

Reported-by: Radim Krčmář <[email protected]>
Tested-by: Radim Krčmář <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>

Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-peter' into staging

QOM CPUState and X86CPU

* MAINTAINERS cleanup

# gpg: Signature made Mon 18 Apr 2016 17:23:16 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <[email protected]>"
# gpg: aka "Andreas Färber <[email protected]>"

* remotes/afaerber/tags/qom-cpu-for-peter:
MAINTAINERS: Drop target-i386 from CPU subsystem

Signed-off-by: Peter Maydell <[email protected]>

MAINTAINERS: Drop target-i386 from CPU subsystem

X86CPU QOM type is in good hands and actively maintained these days, so
drop it from the generic QOM CPU subsystem.

Some refactorings and design questions will still intersect, but review
and discussions of individual series can still take place while opting out
of general X86CPU patch review.

Acked-by: Eduardo Habkost <[email protected]>
Signed-off-by: Andreas Färber <[email protected]>

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging

Update OpenBIOS images

# gpg: Signature made Mon 18 Apr 2016 09:39:31 BST using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <[email protected]>"

* remotes/mcayland/tags/qemu-openbios-signed:
Update OpenBIOS images

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160418' into staging

ppc patch queue for 2-16-04-18

Three bugfixe patches for 2.6 here.
* Two for bad implementation of some of the strong load/store
  instructions

* One for bad migration of the XER register.  This is a regression
  from 2.5, cause by a change in the way we represent at XER during
  runtime.

# gpg: Signature made Mon 18 Apr 2016 06:17:03 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160418:
  ppc: Fix migration of the XER register
  ppc: Fix the bad exception NIP value and the range check in LSWX
  ppc: Fix the range check in the LSWI instruction

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20160416' into staging

seccomp branch queue

# gpg: Signature made Sat 16 Apr 2016 19:58:46 BST using RSA key ID 12F8BD2F
# gpg: Good signature from "Eduardo Otubo (Software Engineer @ ProfitBricks) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1C96 46B6 E1D1 C38A F2EC  3FDE FD0C FF5B 12F8 BD2F

* remotes/otubo/tags/pull-seccomp-20160416:
  seccomp: adding sysinfo system call to whitelist
  seccomp: Whitelist cacheflush since 2.2.0 not 2.2.3
  configure: Enable seccomp sandbox for MIPS

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/weil/tags/pull-wxx-20160415' into staging

wxx patch queue

# gpg: Signature made Fri 15 Apr 2016 18:36:41 BST using RSA key ID 677450AD
# gpg: Good signature from "Stefan Weil <[email protected]>"
# gpg:                 aka "Stefan Weil <[email protected]>"
# gpg:                 aka "Stefan Weil <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2  B78A E08C 21D5 6774 50AD

* remotes/weil/tags/pull-wxx-20160415:
  wxx: Fix broken TCP networking (regression)

Signed-off-by: Peter Maydell <[email protected]>

Update OpenBIOS images

Update OpenBIOS images to SVN r1395 built from submodule.

Signed-off-by: Mark Cave-Ayland <[email protected]>

ppc: Fix migration of the XER register

env->xer only holds the lower bits of the XER register nowadays, the
SO, OV and CA bits are stored in separate variables (see the function
cpu_write_xer() for details). Since the migration code currently only
reads the "xer" variable, the upper bits are lost during migration.
Fix it by using cpu_read_xer() instead.

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Fix the bad exception NIP value and the range check in LSWX

The range checks in the LSWX instruction are completely insufficient:
They do not take the wrap-around case into account, and the check
"reg < rx" should be "reg <= rx" instead. Fix it by using the new
lsw_reg_in_range() helper function that is already used for LSWI, too.

Then there is a second problem: In case the INVAL exception is generated,
the NIP value is wrong, it currently points to the instruction before
the LSWX instruction. This is because gen_lswx() already decreases the
NIP value by 4 (to be prepared for page fault exceptions), and
powerpc_excp() later decreases it again by 4 while handling the program
exception. So to get this right, we've got to undo the "- 4" from
gen_lswx() here before calling helper_raise_exception_err().

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Fix the range check in the LSWI instruction

There are two issues: First, the number of registers that are used has
to be calculated with "(nb + 3) / 4" (i.e. round always up, not down).
Second, the "start <= ra && (start + nr - 32) > ra" condition for the
wrap-around case is wrong: It has to be tested with "||" instead of "&&".
Since we can reuse this check later for the LSWX instruction, let's
place the fixed code into a helper function, too.

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

seccomp: adding sysinfo system call to whitelist

Newer version of nss-softokn libraries (> 3.16.2.3) use sysinfo call
so qemu using rbd image hang after start when run in sandbox mode.

To allow using rbd images in sandbox mode we have to whitelist it.

Signed-off-by: Miroslav Rezanina <[email protected]>
Acked-by: Eduardo Otubo <[email protected]>

seccomp: Whitelist cacheflush since 2.2.0 not 2.2.3

The cacheflush system call (found on MIPS and ARM) has been included in
the libseccomp header since 2.2.0, so include it back to that version.
Previously it was only enabled since 2.2.3 since that is when it was
enabled properly for ARM.

This will allow seccomp support to be enabled for MIPS back to
libseccomp 2.2.0.

Signed-off-by: James Hogan <[email protected]>
Reviewed-By: Andrew Jones <[email protected]>
Acked-by: Eduardo Otubo <[email protected]>

configure: Enable seccomp sandbox for MIPS

Enable seccomp on MIPS since libseccomp version 2.2.0 when MIPS support
was first added.

Signed-off-by: James Hogan <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Acked-by: Eduardo Otubo <[email protected]>

wxx: Fix broken TCP networking (regression)

It is broken since commit c619644067f98098dcdbc951e2dda79e97560afa.

Reported-by: Michael Fritscher <[email protected]>
Tested-by: Michael Fritscher <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Signed-off-by: Stefan Weil <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches for 2.6.0-rc3

# gpg: Signature made Fri 15 Apr 2016 17:02:23 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"

* remotes/kevin/tags/for-upstream:
  nbd: Don't kill server on client that doesn't request TLS
  nbd: fix assert() on qemu-nbd stop
  nbd: Don't fail handshake on NBD_OPT_LIST descriptions
  qemu-iotests: 041: More robust assertion on quorum node
  qemu-iotests: place valgrind log file in scratch dir
  qemu-iotests: tests: do not set unused tmp variable
  qemu-iotests: common.rc: drop unused _do()
  qemu-iotests: drop unused _within_tolerance() filter
  Fix pflash migration
  block: Don't ignore flags in blk_{,co,aio}_write_zeroes()
  block/vpc: update comments to be compliant w/coding guidelines
  block/vpc: set errp in vpc_open
  block/vpc: make checks on max table size a bit more lax
  block/vpc: Use the correct max sector count for VHD images
  block/vpc: use current_size field for XenConverter VHD images
  vpc: use current_size field for XenServer VHD images
  block/vpc: set errp in vpc_create
  block: Fix blk_aio_write_zeroes()
  qemu-io: Support 'aio_write -z'

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-backends-2016-04-15' into staging

hostmem-file: plug a small leak

# gpg: Signature made Fri 15 Apr 2016 17:30:42 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"

* remotes/armbru/tags/pull-backends-2016-04-15:
hostmem-file: plug a small leak

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-04-15' into queue-block

Block patches for 2.6.0-rc3.

# gpg: Signature made Fri Apr 15 17:57:30 2016 CEST using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <[email protected]>"

* mreitz/tags/pull-block-for-kevin-2016-04-15:
  nbd: Don't kill server on client that doesn't request TLS
  nbd: fix assert() on qemu-nbd stop
  nbd: Don't fail handshake on NBD_OPT_LIST descriptions
  qemu-iotests: 041: More robust assertion on quorum node
  qemu-iotests: place valgrind log file in scratch dir
  qemu-iotests: tests: do not set unused tmp variable
  qemu-iotests: common.rc: drop unused _do()
  qemu-iotests: drop unused _within_tolerance() filter

Signed-off-by: Kevin Wolf <[email protected]>

nbd: Don't kill server on client that doesn't request TLS

Upstream NBD documents (as of commit 4feebc95) that servers MAY
choose to operate in a conditional mode, where it is up to the
client whether to use TLS. For qemu's case, we want to always be
in FORCEDTLS mode, because of the risk of man-in-the-middle
attacks, and since we never export more than one device; likewise,
the qemu client will ALWAYS send NBD_OPT_STARTTLS as its first
option. But now that SELECTIVETLS servers exist, it is feasible
to encounter a (non-qemu) client that is programmed to talk to
such a server, and does not do NBD_OPT_STARTTLS first, but rather
wants to probe if it can use a non-encrypted export.

The NBD protocol documents that we should let such a client
continue trying, on the grounds that maybe the client will get the
hint to send NBD_OPT_STARTTLS, rather than immediately dropping
the connection.

Note that NBD_OPT_EXPORT_NAME is a special case: since it is the
only option request that can't have an error return, we have to
(continue to) drop the connection on that one; rather, what we are
fixing here is that all other replies prior to TLS initiation tell
the client NBD_REP_ERR_TLS_REQD, but keep the connection alive.

Signed-off-by: Eric Blake <[email protected]>
Message-id: 1460671343 [email protected]
Signed-off-by: Max Reitz <[email protected]>

nbd: fix assert() on qemu-nbd stop

From time to time qemu-nbd is crashing on the following assert:
    assert(state == TERMINATING);
    nbd_export_closed
    nbd_export_put
    main
and the state at the moment of the crash is evaluated to TERMINATE.

During shutdown process of the client the nbd_client_thread thread sends
SIGTERM signal and the main thread calls the nbd_client_closed callback.
If the SIGTERM callback will be executed after change the state to
TERMINATING, then the state will once again be TERMINATE.

To solve the issue, we must change the state to TERMINATE only if the state
is RUNNING. In the other case we are shutting down already.

Signed-off-by: Pavel Butsykin <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
CC: Paolo Bonzini <[email protected]>
Message-id: 1460629215 [email protected]
Signed-off-by: Max Reitz <[email protected]>

nbd: Don't fail handshake on NBD_OPT_LIST descriptions

The NBD Protocol states that NBD_REP_SERVER may set
'length > sizeof(namelen) + namelen'; in which case the rest
of the packet is a UTF-8 description of the export. While we
don't know of any NBD servers that send this description yet,
we had better consume the data so we don't choke when we start
to talk to such a server.

Also, a (buggy/malicious) server that replies with length <
sizeof(namelen) would cause us to block waiting for bytes that
the server is not sending, and one that replies with super-huge
lengths could cause us to temporarily allocate up to 4G memory.
Sanity check things before blindly reading incorrectly.

Signed-off-by: Eric Blake <[email protected]>
Message-id: 1460077777 [email protected]
Reviewed-by: Alex Bligh <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-iotests: 041: More robust assertion on quorum node

Block nodes are now assigned names automatically, therefore the test
case is fragile in using fixed indices in result. Introduce a method in
iotests.py and do the matching more sensibly.

Signed-off-by: Fam Zheng <[email protected]>
Message-id: 1460518995 [email protected]
Signed-off-by: Max Reitz <[email protected]>

qemu-iotests: place valgrind log file in scratch dir

Do not place the valgrind log file at a predictable path in a
world-writable location. Use the common scratch directory (${TEST_DIR})
instead.

Signed-off-by: Sascha Silbe <[email protected]>
Reviewed-by: Bo Tu <[email protected]>
Message-id: 1460472980 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-iotests: tests: do not set unused tmp variable

The previous commit removed the last usage of ${tmp} inside the tests
themselves; the only remaining users are sourced by check. So we can now
drop this variable from the tests.

Signed-off-by: Sascha Silbe <[email protected]>
Reviewed-by: Bo Tu <[email protected]>
Message-id: 1460472980 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-iotests: common.rc: drop unused _do()

_do() was never used and possibly creates temporary files at
predictable, world-writable locations. Get rid of it.

Signed-off-by: Sascha Silbe <[email protected]>
Reviewed-by: Bo Tu <[email protected]>
Message-id: 1460472980 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-iotests: drop unused _within_tolerance() filter

_within_tolerance() isn't used anymore and possibly creates temporary
files at predictable, world-writable locations. Get rid of it.

If it's needed again in the future it can be revived easily and fixed up
to use TEST_DIR and / or safely created temporary files.

Signed-off-by: Sascha Silbe <[email protected]>
Reviewed-by: Bo Tu <[email protected]>
Message-id: 1460472980 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

hostmem-file: plug a small leak

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <1460566660 [email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>

Fix pflash migration

Pflash migration (e.g. q35 + EFI variable storage) fails
with the assert:

bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.

This avoids the problem by delaying the pflash update until after
the device loads complete.

Tested by:
  Migrating Q35/EFI vm.
  Changing efi variable content (with efiboot in the guest)
  md5sum'ing the variable file before migration and after.

This is a fix that Paolo posted in the message
  570244B3.4070105@redhat.com

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Acked-by: Laszlo Ersek <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Don't ignore flags in blk_{,co,aio}_write_zeroes()

Commit 57d6a428 neglected to pass the given flags to blk_aio_prwv(),
which broke discard by WRITE SAME for scsi-disk (the UNMAP bit would be
ignored).

Commit fc1453cd introduced the same bug for blk_write_zeroes(). This is
used for 'qemu-img convert' without has_zero_init (e.g. on a block
device) and for preallocation=falloc in parallels.

Commit 8896e088 is the version for blk_co_write_zeroes(). This function
is only used in qemu-io.

Reported-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block/vpc: update comments to be compliant w/coding guidelines

Signed-off-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block/vpc: set errp in vpc_open

Add more useful error information to failure paths in vpc_open

Signed-off-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block/vpc: make checks on max table size a bit more lax

The check on the max_table_size field not being larger than required is
valid, and in accordance with the VHD spec. However, there have been
VHD images encountered in the wild that have an out-of-spec max table
size that is technically too large.

There is no issue in allowing this larger table size, as we also
later verify that the computed size (used for the pagetable) is
large enough to fit all sectors. In addition, max_table_entries
is bounds checked against SIZE_MAX and INT_MAX.

Remove the strict check, so that we can accomodate these sorts of
images that are benignly out of spec.

Reported-by: Stefan Hajnoczi <[email protected]>
Reported-by: Grant Wu <[email protected]>
Signed-off-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block/vpc: Use the correct max sector count for VHD images

The old VHD_MAX_SECTORS value is incorrect, and is a throwback
to the CHS calculations. The VHD specification allows images up to 2040
GiB, which (using 512 byte sectors) corresponds to a maximum number of
sectors of 0xff000000, rather than the old value of 0xfe0001ff.

Update VHD_MAX_SECTORS to reflect the correct value.

Also, update comment references to the actual size limit, and correct
one compare so that we can have sizes up to the limit.

Signed-off-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>