Git Repo - qemu.git/log

qcow2: Allow 'cache-clean-interval' in Linux only

The cache-clean-interval option of qcow2 only works on Linux. However
we allow setting it in other systems regardless of whether it works or
not.

In those systems this option is not simply a no-op: it actually
invalidates perfectly valid cache tables for no good reason without
freeing their memory.

This patch forbids using that option in non-Linux systems.

Signed-off-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Make qcow2_cache_table_release() work only in Linux

We are using QEMU_MADV_DONTNEED to discard the memory of individual L2
cache tables. The problem with this is that those semantics are
specific to the Linux madvise() system call. Other implementations of
madvise() (including Linux's own implementation of posix_madvise())
don't do that, so we cannot use them for the same purpose.

This patch makes the code Linux-specific and uses madvise() directly
since there's no point in going through qemu_madvise() for this.
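
A minimal sketch of the Linux-only behaviour described above, assuming a
hypothetical helper name (table_release_sketch) that is not the QEMU
function. It discards only the whole pages inside the range and reports
failure on systems where MADV_DONTNEED does not guarantee these semantics.

```c
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the QEMU implementation: discard the whole
 * pages inside [table, table + size). Returns 0 on success, -1 otherwise.
 */
static int table_release_sketch(void *table, size_t size)
{
#ifdef __linux__
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);
    uintptr_t mask = (uintptr_t)pagesz - 1;
    uintptr_t start = ((uintptr_t)table + mask) & ~mask; /* round up */
    uintptr_t end = ((uintptr_t)table + size) & ~mask;   /* round down */

    if (end <= start) {
        return 0; /* no whole page lies inside the range */
    }
    /* On Linux, private anonymous pages read back as zero afterwards. */
    return madvise((void *)start, end - start, MADV_DONTNEED);
#else
    (void)table;
    (void)size;
    return -1; /* other systems do not guarantee these semantics */
#endif
}
```

On Linux, calling this on a private anonymous mapping drops the pages, and
the next read sees zero-filled memory.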

Signed-off-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Merge remote-tracking branch 'vivier-m68k/tags/m68k-for-2.8-pull-request' into staging

# gpg: Signature made Thu 24 Nov 2016 03:25:39 PM GMT
# gpg:                using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier (Red Hat) <[email protected]>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* vivier-m68k/tags/m68k-for-2.8-pull-request:
  target-m68k: fix muluw/mulsw
  target-m68k: Fix cmpa operand size
  target-m68k: fix EXG instruction

Message-id: 1480001287 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'mcayland/tags/qemu-openbios-signed' into staging

Update OpenBIOS images

# gpg: Signature made Thu 24 Nov 2016 09:29:40 PM GMT
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <[email protected]>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to ef8a14e built from submodule.

Message-id: 20161124214109 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Update OpenBIOS images to ef8a14e built from submodule.

Signed-off-by: Mark Cave-Ayland <[email protected]>

target-m68k: fix muluw/mulsw

"The multiplier and multiplicand are both word operands, and the result
is a long-word operand."

So compute flags on a long-word result, not on a word result.

Signed-off-by: Laurent Vivier <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>

Merge remote-tracking branch 'gkurz/tags/for-upstream' into staging

This pull request fixes some leaks (memory, fd) in the handle and proxy
backends.

# gpg: Signature made Wed 23 Nov 2016 12:53:41 PM GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Gregory Kurz (Groug) <[email protected]>"
# gpg:                 aka "Gregory Kurz (Cimai Technology) <[email protected]>"
# gpg:                 aka "Gregory Kurz (Meiosys Technology) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* gkurz/tags/for-upstream:
  9pfs: add cleanup operation for proxy backend driver
  9pfs: add cleanup operation for handle backend driver
  9pfs: add cleanup operation in FileOperations
  9pfs: adjust the order of resource cleanup in device unrealize

Message-id: 1479920298 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'rth/tags/pull-axp-20161123' into staging

Fix alpha smp interrupt masking

# gpg: Signature made Wed 23 Nov 2016 12:42:45 PM GMT
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <[email protected]>"
# gpg:                 aka "Richard Henderson <[email protected]>"
# gpg:                 aka "Richard Henderson <[email protected]>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* rth/tags/pull-axp-20161123:
  target-alpha: Fix interrupt mask for cpu1

Message-id: 1479905195 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

target-m68k: Fix cmpa operand size

"The size of the operation can be specified as word or long.
Word length source operands are sign-extended to 32 bits for
comparison."

So comparison is always done using OS_LONG.
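
The quoted rule can be sketched in plain C as follows. The helper names
are hypothetical and only the equality (Z flag) case is shown; this is
not the translator code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a word operand to 32 bits without relying on casts. */
static uint32_t sign_extend16_sketch(uint16_t src)
{
    uint32_t ext = (src & 0x8000u) ? (0xFFFF0000u | src) : src;

    /* The upper half is either all zeros or all ones. */
    assert((ext >> 16) == 0 || (ext >> 16) == 0xFFFFu);
    return ext;
}

/* Word-sized CMPA: compare all 32 bits of An with the extended source. */
static int cmpa_w_is_equal_sketch(uint32_t an, uint16_t src)
{
    return an == sign_extend16_sketch(src);
}
```

With An = 0xFFFFFFFF and a word source of 0xFFFF the operands compare
equal; a 16-bit comparison would also report equality for An = 0x0000FFFF,
which is wrong.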

Signed-off-by: Laurent Vivier <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>

target-m68k: fix EXG instruction

opcodes of "EXG Ax,Ay" and "EXG Dx,Dy" have been swapped

Signed-off-by: Laurent Vivier <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>

xen_disk: split discard input to match internal representation

The guest sends discard requests as u64 sector/count pairs, but the
block layer operates internally with s64/s32 pairs. The conversion
leads to IO errors in the guest: the discard request is not processed.

  domU.cfg:
  'vdev=xvda, format=qcow2, backendtype=qdisk, target=/x.qcow2'
  domU:
  mkfs.ext4 -F /dev/xvda
  Discarding device blocks: failed - Input/output error

Fix this by splitting the request into chunks of BDRV_REQUEST_MAX_SECTORS.
Add input range checking to avoid overflow.

Fixes f313520 ("xen_disk: add discard support")

Signed-off-by: Olaf Hering <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>

9pfs: add cleanup operation for proxy backend driver

The init operation of the proxy backend driver allocates a
V9fsProxy struct and some other resources. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.

Signed-off-by: Li Qiang <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>

9pfs: add cleanup operation for handle backend driver

The init operation of the handle backend driver allocates a
handle_data struct and opens a mount file. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.

Signed-off-by: Li Qiang <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>

9pfs: add cleanup operation in FileOperations

Currently, the backend of VirtFS doesn't have a cleanup
function. This will lead to resource leaks if the backend
driver allocates resources. This patch addresses this issue.

Signed-off-by: Li Qiang <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>

9pfs: adjust the order of resource cleanup in device unrealize

Unrealize should undo things that were set during realize in
reverse order. The error path in realize should do the same.

Signed-off-by: Li Qiang <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>

Merge remote-tracking branch 'dgibson/tags/ppc-for-2.8-20161123' into staging

ppc patch queue 2016-11-23

Here's the first set of 2.8 hard freeze bugfixes for ppc.

The biggest thing here is a batch of fixes for migration breakages in
both 2.7 and current 2.8.  Alas, there is at least one more migration
problem, which prevents memory unplug after a migration.  I hoped to
include a fix for that here, but it turned out to have some problems
bigger than those it was solving.  So, I expect at least one more hard
freeze pull request.

There are also a few other assorted bug fixes.

# gpg: Signature made Wed 23 Nov 2016 02:25:42 AM GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* dgibson/tags/ppc-for-2.8-20161123:
  spapr: Fix 2.7<->2.8 migration of PCI host bridge
  Revert "spapr: Fix migration of PCI host bridges from qemu-2.7"
  target-ppc: Allow eventual removal of old migration mistakes
  migration: Add VMSTATE_UINTTL_TEST()
  target-ppc: Fix CPU migration from qemu-2.6 <-> later versions
  ppc: Make uninorth interrupt swizzling identical to Grackle
  target-ppc: fix index array of national digits
  hw/char/spapr_vty: Return amount of free buffer entries in vty_can_receive()
  ppc: BOOK3E: nothing should be done when MSR:PR is set
  spapr: migration support for CAS-negotiated option vectors
  tests/postcopy: Use KVM on ppc64 only if it is KVM-HV

Message-id: 1479869383 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'bonzini/tags/for-upstream' into staging

Small fixes for rc1.

# gpg: Signature made Tue 22 Nov 2016 10:26:56 PM GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg:                 aka "Paolo Bonzini <[email protected]>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* bonzini/tags/for-upstream:
  scsi/esp: do not raise an interrupt when reading the FIFO register
  nbd: Allow unmap and fua during write zeroes
  cpu_ldst.h: use correct guest address parameter

Message-id: 1479853676 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

spapr: Fix 2.7<->2.8 migration of PCI host bridge

daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
from qemu-2.7 to the current version.  It split the device's MMIO
window into two pieces for 32-bit and 64-bit MMIO.

The patch included backwards compatibility code to convert the old
property into the new format.  However, the property value was also
transferred in the migration stream and compared with a (probably
unwise) VMSTATE_EQUAL.  So, the "raw" value from 2.7 is compared to
the new style converted value from (pre-)2.8 giving a mismatch and
migration failure.

Along with the actual field that caused the breakage, there are
several other ill-advised VMSTATE_EQUAL()s.  To fix forwards
migration, we read the values in the stream into scratch variables and
ignore them, instead of comparing for equality.  To fix backwards
migration, we populate those scratch variables in pre_save() with
adjusted values to match the old behaviour.

To permit the eventual possibility of removing this cruft from the
stream, we only include these compatibility fields if a new
'pre-2.8-migration' property is set.  We clear it on the pseries-2.8
machine type, which obviously can't be migrated backwards, but set it
on earlier machine type versions.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>

Revert "spapr: Fix migration of PCI host bridges from qemu-2.7"

This reverts commit 9b54ca0ba781012eeea4237b7c4832ba2ea81d89.

The commit above corrected a migration breakage between qemu-2.7 and
qemu-2.8. However it did so by advancing the migration version for
the PCI host bridge, which obviously breaks migration backwards to
earlier qemu versions.

Although it's not totally essential, we'd like to maintain the
possibility for backwards migration, so revert the change in
preparation for a better fix.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>

target-ppc: Allow eventual removal of old migration mistakes

Until very recently, the vmstate for ppc cpus included some poorly
thought out VMSTATE_EQUAL() components, that can easily break
migration compatibility, and did so between qemu-2.6 and later
versions. A hack was recently added which fixes this migration
breakage, but it leaves the unhelpful cruft of these fields in the
migration stream.

This patch adds a new cpu property allowing these fields to be removed
from the stream entirely. For the pseries-2.8 machine type - which
comes after the fix - and for all non-pseries machine types - which
aren't mature enough to care about cross-version migration - we remove
the fields from the stream.

For pseries-2.7 and earlier, the migration hack remains in place,
allowing backwards and forwards migration with the older machine
types.

This restricts the migration compatibility cruft to older machine
types, and at least opens the possibility of eventually deprecating
and removing it entirely.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>

migration: Add VMSTATE_UINTTL_TEST()

include/migration/cpu.h defines VMSTATE_UINTTL() and several variants
for migrating target_ulong fields. It's defined in terms of
VMSTATE_UINT32() or VMSTATE_UINT64() as appropriate.

It doesn't, however, include a VMSTATE_UINTTL_TEST() variant, which
I'm going to need shortly. So, add it.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>

target-ppc: Fix CPU migration from qemu-2.6 <-> later versions

When migration for target-ppc was converted to vmstate, several
VMSTATE_EQUAL() checks were foolishly included of things that really
should be internal state.  Specifically we verified equality of the
insns_flags and insns_flags2 fields, which are used within TCG to
determine which groups of instructions are available on this cpu
model.  Between qemu-2.6 and qemu-2.7 we made some changes to these
classes which broke migration.

This patch fixes migration both forwards and backwards.  On migration
from 2.6 to later versions we import the fields into temporary
variables, which we then ignore.  In migration backwards, we populate
the temporary fields from the runtime fields, but mask out the bits
which were added after qemu-2.6, allowing the VMSTATE_EQUAL in
qemu-2.6 to accept the stream.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>

ppc: Make uninorth interrupt swizzling identical to Grackle

It's currently broken because it uses an incorrect shift: it tries
to use the slot number but uses the top bits of the bus number
instead.

Note: Neither implementation matches what OpenBIOS ends up putting
in the device-tree either, which will have to be fixed separately.

This is not quite correct for modelling a real Mac since Apple
tend to tie all 4 interrupt lines of a slot together and have
separate interrupts for every slot and every motherboard device
going straight to the PIC but we'll sort that out later.
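
For illustration only, a conventional PCI interrupt swizzle based on the
slot number looks like the sketch below. It assumes the usual devfn
encoding (slot in bits 7..3, function in bits 2..0); the exact expression
used by the Grackle and uninorth code is not quoted in this message, and
the function name is hypothetical.

```c
#include <assert.h>

/* Rotate the interrupt pin index (0..3) by the slot number. */
static int swizzle_sketch(int devfn, int irq_num)
{
    assert(devfn >= 0 && devfn < 256);
    assert(irq_num >= 0 && irq_num < 4);

    int slot = devfn >> 3; /* the slot, not bits of the bus number */
    return (irq_num + slot) & 3;
}
```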

Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

target-ppc: fix index array of national digits

Fixes the big endian array access of national digits, from commits
b815587 and e2106d7.

Signed-off-by: Jose Ricardo Ziviani <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

hw/char/spapr_vty: Return amount of free buffer entries in vty_can_receive()

The can_receive() callbacks of the character devices should return
the number of characters that can be accepted at once, not just a
boolean value (which rather means only one character at a time).
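
A small ring-buffer model of the idea, with hypothetical names and a
hypothetical buffer size: can_receive reports the free space, so the
caller may hand over several characters in one call.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_SIZE_SKETCH 16 /* hypothetical size, a power of two */

struct vty_sketch {
    uint32_t in, out; /* free-running producer and consumer counters */
    uint8_t buf[BUF_SIZE_SKETCH];
};

/* Number of characters that can be accepted right now. */
static int can_receive_sketch(const struct vty_sketch *dev)
{
    uint32_t used = dev->in - dev->out;

    assert(used <= BUF_SIZE_SKETCH);
    return (int)(BUF_SIZE_SKETCH - used);
}

/* Store up to can_receive_sketch() characters in one call. */
static void put_chars_sketch(struct vty_sketch *dev,
                             const uint8_t *data, int len)
{
    assert(len <= can_receive_sketch(dev));
    for (int i = 0; i < len; i++) {
        dev->buf[dev->in++ % BUF_SIZE_SKETCH] = data[i];
    }
}
```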

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: BOOK3E: nothing should be done when MSR:PR is set

The server architecture (BOOK3S) specifies that any instruction that
sets MSR:PR will also set MSR:EE, IR and DR.
However there is no such behavior specification for the embedded
architecture (BOOK3E).

Signed-off-by: Vladimir Svoboda <[email protected]>
Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>

spapr: migration support for CAS-negotiated option vectors

With the addition of the OV5_HP_EVT option vector, we now have
certain functionality (namely, memory unplug) that checks at run-time
for whether or not the guest negotiated the option via CAS. Because
we don't currently migrate these negotiated values, we are unable
to unplug memory from a guest after it's been migrated until after
the guest is rebooted and CAS-negotiation is repeated.

This patch fixes this by adding CAS-negotiated options to the
migration stream. We do this using a subsection, since the
negotiated value of OV5_HP_EVT is the only option currently needed
to maintain proper functionality for a running guest.

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

tests/postcopy: Use KVM on ppc64 only if it is KVM-HV

The ppc64 postcopy test does not work with KVM-PR, and it is also
causing annoying warning messages when run on an x86 host. So let's
use KVM here only if we know that we're running with KVM-HV (which
automatically also means that we're running on a ppc64 host), and
fall back to TCG otherwise.

Signed-off-by: Thomas Huth <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: David Gibson <[email protected]>

Update version for v2.8.0-rc1 release

Signed-off-by: Stefan Hajnoczi <[email protected]>

scsi/esp: do not raise an interrupt when reading the FIFO register

This fixes SCSI adapter self-tests done in MIPS Jazz emulation,
broken since ff589551c8e8e9e95e211b9d8daafb4ed39f1aec.

Signed-off-by: Hervé Poussineau <[email protected]>
Message-Id: <1479508397 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nbd: Allow unmap and fua during write zeroes

Commit fa778fff wired up support to send the NBD_CMD_WRITE_ZEROES,
but forgot to inform the block layer that FUA and unmapping of zeroes are
supported. Without BDRV_REQ_MAY_UNMAP listed as a supported flag,
the block layer will always insist on the NBD layer passing
NBD_CMD_FLAG_NO_HOLE, resulting in the server always allocating
things even when it was desired to let the server punch holes.
Similarly, failing to set BDRV_REQ_FUA means that the client may
send unnecessary NBD_CMD_FLUSH when it could have instead used the
NBD_CMD_FLAG_FUA bit.

CC: [email protected]
Signed-off-by: Eric Blake <[email protected]>
Message-Id: <1479413642 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

cpu_ldst.h: use correct guest address parameter

In the user emulation code path, tlb_vaddr_to_host erroneously passed
vaddr as the guest address to be translated, instead of addr, the parameter
which actually contained the guest address.

This resulted in incorrect addresses being used when emulating block copy
(mvc/mvpg) and block clear (xc) instructions for the s390x target.

Signed-off-by: Bobby Bingham <[email protected]>
Message-Id: <20161113050523 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'sstabellini/tags/xen-20161122-tag' into staging

Xen 2016/11/22

# gpg: Signature made Tue 22 Nov 2016 06:41:23 PM GMT
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"
# gpg:                 aka "Stefano Stabellini <[email protected]>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* sstabellini/tags/xen-20161122-tag:
  xen: attach pvusb usb bus to backend qdev
  xen: create qdev for each backend device
  qdev: add function qdev_set_id()
  xen: add an own bus for xen backend devices
  xen: fix ioreq handling

Message-id: alpine.DEB.2.10.1611221037010.21858@sstabellini-ThinkPad-X260
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging

Block layer patches for 2.8.0-rc1

# gpg: Signature made Tue 22 Nov 2016 03:55:38 PM GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* kwolf/tags/for-upstream:
  block: Pass unaligned discard requests to drivers
  block: Return -ENOTSUP rather than assert on unaligned discards
  block: Let write zeroes fallback work even with small max_transfer
  qcow2: Inform block layer about discard boundaries

Message-id: 1479830693 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'kraxel/tags/pull-seabios-20161122-1' into staging

seabios: update to 1.10.1 stable release

# gpg: Signature made Tue 22 Nov 2016 09:12:39 AM GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-seabios-20161122-1:
  seabios: update to 1.10.1 stable release

Message-id: 1479806144 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

xen: attach pvusb usb bus to backend qdev

Attach the usb bus of a new pvusb controller to the qdev associated
with the Xen backend. Any device connected to that controller can now
specify the bus and port directly via its properties.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>

xen: create qdev for each backend device

Create a qdev plugged to the xen-sysbus for each new backend device.
This device can be used as a parent for all needed devices of that
backend. The id of the new device will be "xen-<type>-<dev>" with
<type> being the xen backend type (e.g. "qdisk") and <dev> the xen
backend number of the type under which it is to be found in xenstore.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>

qdev: add function qdev_set_id()

In order to have an easy way to add a new qdev with a specific id,
carve out the needed functionality from qdev_device_add() into a new
function qdev_set_id().

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>

xen: add an own bus for xen backend devices

Add a bus for Xen backend devices in order to be able to establish a
dedicated device path for pluggable devices.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>

xen: fix ioreq handling

Avoid double fetches and bounds check size to avoid overflowing
internal variables.

This is CVE-2016-9381 / XSA-197.

Reported-by: yanghongke <[email protected]>
Signed-off-by: Jan Beulich <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>

target-alpha: Fix interrupt mask for cpu1

A typo prevents ISA interrupts from being recognized on cpu1,
which is where the smp kernel normally wants to see them.

Signed-off-by: Richard Henderson <[email protected]>

block: Pass unaligned discard requests to drivers

Discard is advisory, so rounding the requests to alignment
boundaries is never semantically wrong from the data that
the guest sees.  But at least the Dell Equallogic iSCSI SANs
have an interesting property: their advertised discard
alignment is 15M, yet they document that discarding a sequence
of 1M slices will eventually result in the 15M page being
marked as discarded, and it is possible to observe which
pages have been discarded.

Between commits 9f1963b and b8d0a980, we converted the block
layer to a byte-based interface that ultimately ignores any
unaligned head or tail based on the driver's advertised
discard granularity, which means that qemu 2.7 refuses to
pass any discard request smaller than 15M down to the Dell
Equallogic hardware.  This is a slight regression in behavior
compared to earlier qemu, where a guest executing discards
in power-of-2 chunks used to be able to get every page
discarded, but is now left with various pages still allocated
because the guest requests did not align with the hardware's
15M pages.

Since the SCSI specification says nothing about a minimum
discard granularity, and only documents the preferred
alignment, it is best if the block layer gives the driver
every bit of information about discard requests, rather than
rounding it to alignment boundaries early.

Rework the block layer discard algorithm to mirror the write
zero algorithm: always peel off any unaligned head or tail
and manage that in isolation, then do the bulk of the request
on an aligned boundary.  The fallback when the driver returns
-ENOTSUP for an unaligned request is to silently ignore that
portion of the discard request; but for devices that can pass
the partial request all the way down to hardware, this can
result in the hardware coalescing requests and discarding
aligned pages after all.
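
The peeling step can be sketched as a pure function that splits a request
into an unaligned head, an aligned middle and an unaligned tail. The
names are hypothetical, and the real block layer also honours other
limits that are left out here.

```c
#include <assert.h>
#include <stdint.h>

struct split_sketch {
    uint64_t head;   /* bytes before the first alignment boundary */
    uint64_t middle; /* whole aligned units */
    uint64_t tail;   /* bytes after the last alignment boundary */
};

static struct split_sketch split_request_sketch(uint64_t offset,
                                                uint64_t bytes,
                                                uint64_t align)
{
    struct split_sketch s = { 0, 0, 0 };

    assert(align > 0);
    uint64_t misalign = offset % align;

    if (misalign) {
        s.head = align - misalign;
        if (s.head > bytes) {
            s.head = bytes; /* request ends before the boundary */
        }
    }
    /* Past the head, the start is aligned or nothing is left. */
    uint64_t rest = bytes - s.head;

    s.tail = rest % align;
    s.middle = rest - s.tail;
    return s;
}
```

With a 15M alignment, a 1M discard at offset 1M is all head, so it is
passed to the driver as an unaligned piece instead of being dropped.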

Reported-by: Peter Lieven <[email protected]>
CC: [email protected]
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Return -ENOTSUP rather than assert on unaligned discards

Right now, the block layer rounds discard requests, so that
individual drivers are able to assert that discard requests
will never be unaligned. But there are some iSCSI devices
that track and coalesce multiple unaligned requests, turning them
into an actual discard if the requests eventually cover an
entire page, which implies that it is better to always pass
discard requests as low down the stack as possible.

In isolation, this patch has no semantic effect, since the
block layer currently never passes an unaligned request through.
The block layer already has code that silently ignores
drivers that return -ENOTSUP for a discard request that cannot
be honored (as well as drivers that return 0 even when nothing
was done). The next patch will update the block layer to
fragment discard requests, so that clients are guaranteed that
they are either dealing with an unaligned head or tail, or an
aligned core, making it similar to the block layer semantics of
write zero fragmentation.

CC: [email protected]
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Let write zeroes fallback work even with small max_transfer

Commit 443668ca rewrote the write_zeroes logic to guarantee that
an unaligned request never crosses a cluster boundary.  But
in the rewrite, the new code assumed that at most one iteration
would be needed to get to an alignment boundary.

However, it is easy to trigger an assertion failure: the Linux
kernel limits loopback devices to advertise a max_transfer of
only 64k.  Any operation that requires falling back to writes
rather than more efficient zeroing must obey max_transfer during
that fallback, which means an unaligned head may require multiple
iterations of the write fallbacks before reaching the aligned
boundaries, when layering a format with clusters larger than 64k
atop the protocol of file access to a loopback device.

Test case:

$ qemu-img create -f qcow2 -o cluster_size=1M file 10M
$ losetup /dev/loop2 /path/to/file
$ qemu-io -f qcow2 /dev/loop2
qemu-io> w 7m 1k
qemu-io> w -z 8003584 2093056

In fairness to Denis (as the original listed author of the culprit
commit), the faulty logic for at most one iteration is probably all
my fault in reworking his idea.  But the solution is to restore what
was in place prior to that commit: when dealing with an unaligned
head or tail, iterate as many times as necessary while fragmenting
the operation at max_transfer boundaries.
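
The loop below is a sketch of that iteration, not the block layer code.
It counts how many fallback writes an unaligned head needs when every
write is capped at max_transfer. The function name is hypothetical, and
the values in the usage note assume a 1M cluster and a max_transfer of
exactly 64k, as in the test case.

```c
#include <assert.h>
#include <stdint.h>

/* Count the write fallbacks needed to reach the next aligned boundary. */
static unsigned head_iterations_sketch(uint64_t offset, uint64_t bytes,
                                       uint64_t align, uint64_t max_transfer)
{
    unsigned iterations = 0;

    assert(align > 0 && max_transfer > 0);
    uint64_t head = offset % align;

    while (bytes > 0 && head != 0) {
        uint64_t num = align - head; /* distance to the boundary */

        if (num > bytes) {
            num = bytes;
        }
        if (num > max_transfer) {
            num = max_transfer; /* fragment at max_transfer */
        }
        /* A real implementation would issue the write fallback here. */
        offset += num;
        bytes -= num;
        head = offset % align;
        iterations++;
    }
    return iterations;
}
```

For the request from the test case (offset 8003584, length 2093056), the
head is 376k long, so the loop runs six times: five 64k writes and one
56k write.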

Reported-by: Ed Swierk <[email protected]>
CC: [email protected]
CC: Denis V. Lunev <[email protected]>
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Inform block layer about discard boundaries

At the qcow2 layer, discard is only possible on a per-cluster
basis; at the moment, qcow2 silently rounds any unaligned
requests to this granularity. However, an upcoming patch will
fix a regression in the block layer ignoring too much of an
unaligned discard request, by changing the block layer to
break up a discard request at alignment boundaries; for that
to work, the block layer must know about our limits.

However, we can't go one step further by changing
qcow2_discard_clusters() to assert that requests are always
aligned, since that helper function is reached on paths
outside of the block layer.

CC: [email protected]
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Fix FreeBSD (10.x) build after 7dc9ae43

Include sys/user.h for declaration of 'struct kinfo_proc'.
Add -lutil to qemu-ga link for kinfo_getproc.

Signed-off-by: Ed Maste <[email protected]>
Message-id: 1479778365 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'jtc/tags/block-pull-request' into staging

# gpg: Signature made Mon 21 Nov 2016 10:12:43 PM GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <[email protected]>"
# gpg:                 aka "Jeffrey Cody <[email protected]>"
# gpg:                 aka "Jeffrey Cody <[email protected]>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* jtc/tags/block-pull-request:
  gluster: Fix use after free in glfs_clear_preopened()

Message-id: 1479766499 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

seabios: update to 1.10.1 stable release

git shortlog rel-1.10.0..rel-1.10.1
===================================

Igor Mammedov (1):
drop "etc/boot-cpus" fw_cfg file and reuse legacy QEMU_CFG_NB_CPUS

Signed-off-by: Gerd Hoffmann <[email protected]>

gluster: Fix use after free in glfs_clear_preopened()

This fixes a use-after-free bug introduced in commit 6349c154. We need
to use QLIST_FOREACH_SAFE() when freeing elements in the loop. Spotted
by Coverity.
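
The pattern behind QLIST_FOREACH_SAFE(), shown on a hand-rolled singly
linked list with hypothetical names rather than QEMU's queue macros: read
the link to the next element before freeing the current one.

```c
#include <assert.h>
#include <stdlib.h>

struct entry_sketch {
    struct entry_sketch *next;
};

/* Prepend a new element and return the new list head. */
static struct entry_sketch *push_sketch(struct entry_sketch *head)
{
    struct entry_sketch *e = malloc(sizeof(*e));

    assert(e != NULL);
    e->next = head;
    return e;
}

/* Free every element; returns how many were freed. */
static unsigned clear_sketch(struct entry_sketch **head)
{
    unsigned freed = 0;
    struct entry_sketch *e = *head;

    while (e != NULL) {
        struct entry_sketch *next = e->next; /* saved before free(e) */

        free(e);
        e = next; /* never touches the freed element again */
        freed++;
    }
    *head = NULL;
    return freed;
}
```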

Signed-off-by: Kevin Wolf <[email protected]>
Message-id: 1479378608 [email protected]
Signed-off-by: Jeff Cody <[email protected]>

Merge remote-tracking branch 'sstabellini/tags/xen-20161108-tag' into staging

Xen 2016/11/08

# gpg: Signature made Tue 08 Nov 2016 07:48:12 PM GMT
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"
# gpg:                 aka "Stefano Stabellini <[email protected]>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* sstabellini/tags/xen-20161108-tag:
  xen: Fix xenpv machine initialisation

Message-id: alpine.DEB.2.10.1611081150170.3491@sstabellini-ThinkPad-X260
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'mst/tags/for_upstream' into staging

virtio, vhost, pc: fixes

Most notably this fixes a regression with vhost introduced by the pull before
last.

Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Fri 18 Nov 2016 03:51:55 PM GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg:                 aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  acpi: Use apic_id_limit when calculating legacy ACPI table size
  ipmi: fix qemu crash while migrating with ipmi
  ivshmem: Fix 64 bit memory bar configuration
  virtio: set ISR on dataplane notifications
  virtio: access ISR atomically
  virtio: introduce grab/release_ioeventfd to fix vhost
  virtio-crypto: fix virtio_queue_set_notification() race

Message-id: 1479484366 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

acpi: Use apic_id_limit when calculating legacy ACPI table size

The code that calculates the legacy ACPI table size for migration
compatibility uses max_cpus when calculating legacy_aml_len (the size of
the DSDT and SSDT tables). However, the SSDT grows according to APIC ID
limit, not max_cpus.

The bug is not triggered very often because of the 4k alignment on the
table size. But it can be triggered if you are unlucky enough to cross a
4k boundary.

Change the legacy_aml_len calculation to use apic_id_limit, to calculate
the right size.
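Why the 4k alignment hides the bug most of the time can be sketched as
follows. The fixed and per-entry lengths passed to legacy_table_size() are
made-up numbers for illustration, not the real DSDT/SSDT sizes.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_ALIGN 4096u  /* the 4k alignment mentioned above */

/* Round size up to the next multiple of TABLE_ALIGN. */
static size_t align_up_4k(size_t size)
{
    return (size + TABLE_ALIGN - 1) / TABLE_ALIGN * TABLE_ALIGN;
}

/* Hypothetical model: a fixed part plus a per-entry cost, then aligned. */
static size_t legacy_table_size(size_t base_len, size_t per_entry_len,
                                unsigned entries)
{
    assert(entries > 0);
    return align_up_4k(base_len + per_entry_len * entries);
}
```

With a base of 1000 bytes and 100 bytes per entry, 8 entries (max_cpus) and
12 entries (apic_id_limit) both round up to 4096, so the wrong input goes
unnoticed. With 28 versus 32 entries the results are 4096 and 8192, and the
two sides disagree.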

Signed-off-by: Eduardo Habkost <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ipmi: fix qemu crash while migrating with ipmi

QEMU crashes on the source side while migrating, after the IPMI service has been started inside the VM.

./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 \
-drive file=/work/suse/suse11_sp3_64_vt,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \
-vnc :99 -monitor vc -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-kcs,bmc=bmc0,ioport=0xca2

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffec4268700 (LWP 7657)]
__memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
(gdb) bt
#0  __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
#1  0x00005555559ef775 in memcpy (__len=3, __src=0xc1421c, __dest=<optimized out>)
     at /usr/include/bits/string3.h:51
#2  qemu_put_buffer (f=0x555557a97690, buf=0xc1421c <Address 0xc1421c out of bounds>, size=3)
     at migration/qemu-file.c:346
#3  0x00005555559eef66 in vmstate_save_state (f=f@entry=0x555557a97690,
     vmsd=0x555555f8a5a0 <vmstate_ISAIPMIKCSDevice>, opaque=0x555557231160,
     vmdesc=vmdesc@entry=0x55555798cc40) at migration/vmstate.c:333
#4  0x00005555557cfe45 in vmstate_save (f=f@entry=0x555557a97690, se=se@entry=0x555557231de0,
     vmdesc=vmdesc@entry=0x55555798cc40) at /mnt/sdb/zyy/qemu/migration/savevm.c:720
#5  0x00005555557d2be7 in qemu_savevm_state_complete_precopy (f=0x555557a97690,
     iterable_only=iterable_only@entry=false) at /mnt/sdb/zyy/qemu/migration/savevm.c:1128
#6  0x00005555559ea102 in migration_completion (start_time=<synthetic pointer>,
     old_vm_running=<synthetic pointer>, current_active_state=<optimized out>,
     s=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1707
#7  migration_thread (opaque=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1855
#8  0x00007ffff3900dc5 in start_thread (arg=0x7ffec4268700) at pthread_create.c:308
#9  0x00007fffefc6c71d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Signed-off-by: Zhuang Yanying <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ivshmem: Fix 64 bit memory bar configuration

Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one.  The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR.  A 32 bit BAR can support only up to 1 GiB of shared memory.

This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa.  Worse, the default got flipped as well.  Devices
ivshmem-plain and ivshmem-doorbell are not affected.

Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02.  Also update its
initialization for devices ivshmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.

Cc: [email protected]
Signed-off-by: Zhuang Yanying <[email protected]>
Reviewed-by: Gonglei <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>

virtio: set ISR on dataplane notifications

Dataplane has always omitted the step of setting ISR when
an interrupt is raised.  This caused little breakage, because the
specification actually says that ISR may not be updated in MSI mode.

Some versions of the Windows drivers however didn't clear MSI mode
correctly, and proceeded using polling mode (using ISR, not the used
ring index!) for crashdump and hibernation.  If it were just crashdump
and hibernation it would not be a big deal, but recent releases of
Windows do not really shut down, but rather log out and hibernate to
make the next startup faster.  Hence, this manifested as a more serious
hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs.
Newer versions fixed this, while older versions do not use MSI at all.

The failure has always been there for virtio dataplane, but it became
visible after commits 9ffe337 ("virtio-blk: always use dataplane path
if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always
use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk
and virtio-scsi always use the dataplane code under KVM.  The good news
therefore is that it was not a bug in the patches: they were doing
exactly what they were meant to do, i.e. shake out remaining dataplane bugs.

The fix is not hard, so it is worth accommodating the broken drivers.
The virtio_should_notify+event_notifier_set pair that is common to
virtio-blk and virtio-scsi dataplane is replaced with a new public
function virtio_notify_irqfd that also sets ISR.  The irqfd emulation
code now need not set ISR anymore, so virtio_irq is removed.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Tested-by: Farhan Ali <[email protected]>
Tested-by: Alex Williamson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: access ISR atomically

This will be needed once dataplane is able to set it outside
the big QEMU lock.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Tested-by: Farhan Ali <[email protected]>
Tested-by: Alex Williamson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: introduce grab/release_ioeventfd to fix vhost

Following the recent refactoring of virtio notifiers [1], more specifically
the patch ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to
start/stop ioeventfd") that uses virtio_bus_set_host_notifier [2]
by default, core virtio code requires 'ioeventfd_started' to be set
to true/false when the host notifiers are configured.

When vhost is stopped and started, however, there is a stop followed by
another start. Since ioeventfd_started was never set to true, the 'stop'
operation triggered by virtio_bus_set_host_notifier() will not result
in a call to virtio_pci_ioeventfd_assign(assign=false). This leaves
the memory regions with stale notifiers and results in the next start
triggering the following assertion:

kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
Aborted

This patch reintroduces (hopefully in a cleaner way) the concept
that was present with ioeventfd_disabled before the refactoring.
When ioeventfd_grabbed>0, ioeventfd_started tracks whether ioeventfd
should be enabled or not, but ioeventfd is actually not started at
all until vhost releases the host notifiers.
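The relationship between the two fields can be modelled with a small sketch.
This is a simplified model with hypothetical names (bus_sketch, assigned), not
the actual virtio bus code: 'started' records the wish, 'grabbed' counts
owners such as vhost, and the notifiers are only really assigned when nobody
has grabbed them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state described above. */
struct bus_sketch {
    bool started;      /* whether ioeventfd should be enabled */
    unsigned grabbed;  /* >0 while vhost owns the host notifiers */
    bool assigned;     /* whether ioeventfd is actually active */
};

/* Recompute the real state from the wish and the grab count. */
static void sync_assignment(struct bus_sketch *b)
{
    b->assigned = b->started && b->grabbed == 0;
}

static void ioeventfd_start(struct bus_sketch *b)
{
    b->started = true;
    sync_assignment(b);
}

static void ioeventfd_stop(struct bus_sketch *b)
{
    b->started = false;
    sync_assignment(b);
}

static void ioeventfd_grab(struct bus_sketch *b)
{
    b->grabbed++;
    sync_assignment(b);
}

static void ioeventfd_release(struct bus_sketch *b)
{
    assert(b->grabbed > 0);
    b->grabbed--;
    sync_assignment(b);
}
```

In this model a start issued while vhost holds the notifiers is remembered
and takes effect on release, so stop and start always stay paired.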

[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07748.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07760.html

Reported-by: Felipe Franciosi <[email protected]>
Reported-by: Christian Borntraeger <[email protected]>
Reported-by: Alex Williamson <[email protected]>
Fixes: ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd")
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Tested-by: Alexey Kardashevskiy <[email protected]>
Tested-by: Farhan Ali <[email protected]>
Tested-by: Alex Williamson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

Merge remote-tracking branch 'public/tags/tracing-pull-request' into staging

# gpg: Signature made Fri 18 Nov 2016 03:01:22 PM GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg:                 aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* public/tags/tracing-pull-request:
  trace: fix generated code build break

Message-id: 1479481289 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

virtio-crypto: fix virtio_queue_set_notification() race

We must check for new virtqueue buffers after re-enabling notifications.
This prevents the race condition where the guest added buffers just
after we stopped popping the virtqueue but before we re-enabled
notifications.

I think the virtio-crypto code was based on virtio-net but this crucial
detail was missed. virtio-net does not have the race condition because
it processes the virtqueue one more time after re-enabling
notifications.

Cc: Gonglei <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Tested-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Gonglei <[email protected]>

Merge remote-tracking branch 'remotes/elmarco/tags/ivshmem-pull-request' into staging

* remotes/elmarco/tags/ivshmem-pull-request:
  ivshmem: Fix 64 bit memory bar configuration

Message-id: 20161117152613 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'rth/tags/pull-axp-20161117' into staging

Update alpha palcode for smp

# gpg: Signature made Thu 17 Nov 2016 02:57:29 PM GMT
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <[email protected]>"
# gpg:                 aka "Richard Henderson <[email protected]>"
# gpg:                 aka "Richard Henderson <[email protected]>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* rth/tags/pull-axp-20161117:
  target-alpha: Log cpuid with -d int
  target-alpha: Update palcode for smp

Message-id: 1479394965 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

trace: fix generated code build break

If the QEMU source dir is

    /var/tmp/aaa-qemu-clone

and the build dir is

    /var/tmp/qemu-aio-poll-v2

then I get the following error:

trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events"
on integer constant
TraceEvent *2_trace_events[] = {
             ^
trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before
numeric constant
trace/generated-tracers.c: In function ‘trace_2_register_events’:
trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on
integer constant
     trace_event_register_group(2_trace_events);
                                ^
make: *** [trace/generated-tracers.o] Error 1

This patch fixes the issue.

Reported-by: Fam Zheng <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Tested-by: Fam Zheng <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'vivier/tags/trivial-patches-pull-request' into staging

# gpg: Signature made Thu 17 Nov 2016 10:18:58 AM GMT
# gpg:                using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier (Red Hat) <[email protected]>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* vivier/tags/trivial-patches-pull-request:
  qapi-schema: clarify 'colo' state for MigrationStatus

Message-id: 1479378016 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

target-alpha: Log cpuid with -d int

Signed-off-by: Richard Henderson <[email protected]>

target-alpha: Update palcode for smp

Signed-off-by: Richard Henderson <[email protected]>

ivshmem: Fix 64 bit memory bar configuration

Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one.  The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR.  A 32 bit BAR can support only up to 1 GiB of shared memory.

This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa.  Worse, the default got flipped as well.  Devices
ivshmem-plain and ivshmem-doorbell are not affected.

Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02.  Also update its
initialization for devices ivshmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.

Signed-off-by: Zhuang Yanying <[email protected]>
Reviewed-by: Gonglei <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <1479385863 [email protected]>

qapi-schema: clarify 'colo' state for MigrationStatus

A VM cannot get into the colo state unless the user enables the
'x-colo' migration capability. It is necessary to clarify
this here.

Suggested-by: Eric Blake <[email protected]>
Signed-off-by: zhanghailiang <[email protected]>
Message-Id: <1478072652 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

pc: fix FW_CFG_NB_CPUS to account for -device added CPUs

Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <1479301481 [email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>

fw_cfg: move FW_CFG_NB_CPUS out of fw_cfg_init1()

PC will use this field in a different way, so move it out of the common
code so that PC can set a different value, i.e. all CPUs
regardless of where they come from (-smp X | -device cpu...).

It is a quick and dirty hack, as it could be implemented in a more generic
way in MachineClass. But do it the simple way, since only PC is affected
so far.

Later we can generalize it when another affected target gets support
for -device cpu.

Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <1479212236 [email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>

Revert "pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs"

This reverts commit 080ac219cc7d9c55adf925c3545b7450055ad625.

Legacy FW_CFG_NB_CPUS will be reused instead of the 'etc/boot-cpus'
fw_cfg file since it does the same thing and there is no point
in maintaining a duplicate guest ABI if it can be helped.

Signed-off-by: Igor Mammedov <[email protected]>
Message-Id: <1479212236 [email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>

Update version for v2.8.0-rc0 release

Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio, vhost, pc, pci: documentation, fixes and cleanups

Lots of fixes all over the place.

Unfortunately, this does not yet fix a regression with vhost
introduced by the last pull, the issue is typically this error:
    kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
followed by QEMU aborting.

Signed-off-by: Michael S. Tsirkin <[email protected]>
* remotes/mst/tags/for_upstream: (28 commits)
  docs: add PCIe devices placement guidelines
  virtio: drop virtio_queue_get_ring_{size,addr}()
  vhost: drop legacy vring layout bits
  vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
  nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE
  nvdimm acpi: use aml_name_decl to define named object
  nvdimm acpi: rename nvdimm_dsm_reserved_root
  nvdimm acpi: fix two comments
  nvdimm acpi: define DSM return codes
  nvdimm acpi: rename nvdimm_acpi_hotplug
  nvdimm acpi: cleanup nvdimm_build_fit
  nvdimm acpi: rename nvdimm_plugged_device_list
  docs: improve the doc of Read FIT method
  nvdimm acpi: clean up nvdimm_build_acpi
  pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug
  pc: memhp: move nvdimm hotplug out of memory hotplug
  nvdimm acpi: drop the lock of fit buffer
  qdev: hotplug: drop HotplugHandler.post_plug callback
  vhost: migration blocker only if shared log is used
  virtio-net: mark VIRTIO_NET_F_GSO as legacy
  ...

Message-id: 1479237527 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'ehabkost/tags/machine-pull-request' into staging

qdev: Fix assert in PCI address property when used by vfio-pci

# gpg: Signature made Tue 15 Nov 2016 06:27:18 PM GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* ehabkost/tags/machine-pull-request:
  qdev: Fix assert in PCI address property when used by vfio-pci

Message-id: 1479234540 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

qdev: Fix assert in PCI address property when used by vfio-pci

Allow the PCIHostDeviceAddress structure to work as the host property
in vfio-pci when it has its default value of all fields set to ~0. In
this form the property indicates a non-existent device, but given the
field bit sizes an assertion fires because the excess (and invalid)
precision overflows the string buffer. The BDF of an invalid device
"FFFF:FF:FF.F" is returned instead.

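A minimal sketch of the idea, assuming a hypothetical address struct and
formatter (pci_addr_sketch, format_bdf) rather than the real property getter:
an all-ones address is reported as the fixed invalid BDF, and any other value
that does not fit the buffer is rejected instead of asserting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a host PCI address; unset fields are ~0. */
struct pci_addr_sketch {
    unsigned domain, bus, slot, function;
};

#define BDF_LEN sizeof("0000:00:00.0")  /* 12 characters plus NUL */

/* Format 'a' into buf; returns 0 on success, -1 if it does not fit. */
static int format_bdf(const struct pci_addr_sketch *a, char *buf, size_t len)
{
    int n;

    assert(buf != NULL && len > 0);
    if (a->domain == ~0u && a->bus == ~0u &&
        a->slot == ~0u && a->function == ~0u) {
        /* Unset address: report the invalid-device BDF. */
        n = snprintf(buf, len, "FFFF:FF:FF.F");
    } else {
        n = snprintf(buf, len, "%04x:%02x:%02x.%x",
                     a->domain, a->bus, a->slot, a->function);
    }
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Formatting the ~0 fields directly would need far more than the 12 characters
the buffer is sized for, which is the overflow described above.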
Signed-off-by: Daniel Oram <[email protected]>
Reviewed-by: Alex Williamson <[email protected]>
Message-Id: <71f06765c4ba16dcd71cbf78e877619948f04ed9.1478777270 [email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>

Merge remote-tracking branch 'public/tags/block-pull-request' into staging

# gpg: Signature made Tue 15 Nov 2016 03:42:29 PM GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg:                 aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* public/tags/block-pull-request:
  test-replication: fix leaks

Message-id: 1479224556 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

test-replication: fix leaks

ASAN spotted:
SUMMARY: AddressSanitizer: 301990288 byte(s) leaked in 33 allocation(s).

Signed-off-by: Marc-André Lureau <[email protected]>
Message-id: 20161109104547 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

docs: add PCIe devices placement guidelines

Propose best practices on how to use PCI Express/PCI devices
in PCI Express based machines and explain the reasoning behind them.

Reviewed-by: Laszlo Ersek <[email protected]>
Signed-off-by: Marcel Apfelbaum <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: drop virtio_queue_get_ring_{size,addr}()

These are not used anymore.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost: drop legacy vring layout bits

The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.

This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:

hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
{
    return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
           virtio_queue_get_used_size(vdev, n);
}

and the mapping fails.
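The wrap-around is easy to reproduce in isolation. The sketch below uses a
plain uint64_t as a stand-in for the address type and made-up addresses; it
performs the same subtraction as the function quoted above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hwaddr_sketch;  /* stand-in for an unsigned address type */

/* Same arithmetic as the removed helper: used - desc + used_size. */
static hwaddr_sketch ring_span(hwaddr_sketch desc, hwaddr_sketch used,
                               hwaddr_sketch used_size)
{
    assert(used_size > 0);
    return used - desc + used_size;
}
```

With desc at 0x1000 and used at 0x3000 the result is 0x2100. Swap the two
addresses and the unsigned subtraction wraps, giving 0xFFFFFFFFFFFFE100,
which no mapping request can satisfy.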

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout

With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed to legacy virtio which uses a single region.

In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.

This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.

This works for legacy mappings as well.

Cc: [email protected]
Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE

and use it to replace the raw number

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: use aml_name_decl to define named object

to make the code clearer

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: rename nvdimm_dsm_reserved_root

Rename it to nvdimm_dsm_handle_reserved_root_method

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: fix two comments

fix the English and code-style issues

Suggested-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: define DSM return codes

and use these codes to refine the code

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: rename nvdimm_acpi_hotplug

Rename it to nvdimm_plug()

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: cleanup nvdimm_build_fit

inline buf_size to refine the code a bit

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: rename nvdimm_plugged_device_list

Its behavior has been changed as the nvdimm device which is being
realized also will be handled in this function, so rename it to
reflect the fact

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

docs: improve the doc of Read FIT method

Improve the description and clearly document the length field

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: clean up nvdimm_build_acpi

To make the code clearer, we
1) check ram_slots first, and build ssdt & nfit only when it is available
2) use nvdimm_get_plugged_device_list() to check if there is nvdimm device
plugged

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug

as it is never called when nvdimm hotplug happens

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

pc: memhp: move nvdimm hotplug out of memory hotplug

as they use completely different ways to handle hotplug events

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

nvdimm acpi: drop the lock of fit buffer

as there is a global lock to protect vm-exit handlers and
QMP/monitor, this lock can be dropped

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

qdev: hotplug: drop HotplugHandler.post_plug callback

as it is fine for nvdimm acpi to build the FIT when the nvdimm device
has not been 'realized'

Suggested-by: Igor Mammedov <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>

vhost: migration blocker only if shared log is used

Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).

Signed-off-by: Rafael David Tinoco <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio-net: mark VIRTIO_NET_F_GSO as legacy

virtio 1.0 spec says this is a legacy feature bit,
hide it from guests in modern mode.

Note: for cross-version migration compatibility,
we keep the bit set in host_features.
The result will be that a guest migrating cross-version
will see host features change under it.
As guests only seem to read it once, this should
not be an issue. Meanwhile, will work to fix guests to
ignore this bit in virtio1 mode, too.

Cc: [email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>

virtio: allow per-device-class legacy features

Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.

Cc: [email protected] # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>

acpi: fix DMAR device scope for IOAPIC

We should not use cpu_to_le16() here; instead, each of the device/function
values is stored in an 8-bit field.
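A minimal sketch, assuming each path entry holds the device and function
numbers as single bytes. put_path_entry() is a hypothetical helper, not the
QEMU ACPI builder API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append one (device, function) path entry; returns the new offset. */
static size_t put_path_entry(uint8_t *buf, size_t off,
                             uint8_t device, uint8_t function)
{
    assert(buf != NULL);
    buf[off++] = device;    /* one byte: no byte swapping applies */
    buf[off++] = function;  /* one byte */
    return off;
}
```

Writing the two values byte by byte yields the same table contents on little
and big endian hosts, whereas a 16-bit swap of a combined value would not.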

Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

intel_iommu: fix incorrect assert

Reported-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

intel_iommu: fix several incorrect endianness and bit fields

Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>