Git Repo - qemu.git/log

block: Factor out bdrv_replace_child_noperm()

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Factor out should_update_child()

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

block: Fix blockdev-snapshot error handling

For blockdev-snapshot, external_snapshot_prepare() accepts an arbitrary
node reference at first and only checks later whether it already has a
backing file. Between those places, other errors can occur.

Therefore checking in external_snapshot_abort() whether state->new_bs
has a backing file is not sufficient to tell whether bdrv_append() was
already completed or not. Trying to undo the bdrv_append() when it
wasn't even executed is wrong.

Introduce a new boolean flag in the state to fix this.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

mirror: Fix error path for dirty bitmap creation

mirror_top_bs must be removed from the graph again when creating the
dirty bitmap fails.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

mirror: Fix permissions for removing mirror_top_bs

mirror_top_bs takes write permissions on its backing file, which can
make it impossible to attach that backing file node to another parent.
However, this is exactly what needs to be done in order to remove
mirror_top_bs from the backing chain. So give up the write permission
first.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

mirror: Fix permission problem with 'replaces'

The 'replaces' option of drive-mirror can be used to mirror a Quorum
node to a new image and then let the target image replace one of the
Quorum children. In order for this graph modification to succeed, the
mirror job needs to lift its restrictions on the target node first
before actually replacing the child.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

commit: Fix error handling

Apparently some kind of mismerge happened in commit 8dfba279, which
broke the error handling without any real reason by removing the
assignment of the return value to ret in a blk_insert_bs() call.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

Merge remote-tracking branch 'remotes/xtensa/tags/20170306-xtensa' into staging

target/xtensa updates:

- instantiate local memories in xtensa sim machine;
- add two missing include files to xtensa core importing script.

# gpg: Signature made Mon 06 Mar 2017 22:32:45 GMT
# gpg:                using RSA key 0x51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <[email protected]>"
# gpg:                 aka "Max Filippov <[email protected]>"
# gpg:                 aka "Max Filippov <[email protected]>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20170306-xtensa:
  target/xtensa: add two missing headers to core import script
  target/xtensa: sim: instantiate local memories

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/gkurz/tags/fixes-for-2.9' into staging

Fixes issues that got merged with the latest pull request:
- missing O_NOFOLLOW flag for CVE-2016-960
- build break with older glibc that don't have O_PATH and AT_EMPTY_PATH
- various bugs reported by Coverity

# gpg: Signature made Mon 06 Mar 2017 17:51:29 GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Gregory Kurz (Groug) <[email protected]>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/fixes-for-2.9:
  9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()
  9pfs: fix O_PATH build break with older glibc versions
  9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()
  9pfs: fail local_statfs() earlier
  9pfs: fix fd leak in local_opendir()
  9pfs: fix bogus fd check in local_remove()

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2017-03-06-tag' into staging

qemu-ga patch queue for 2.9

* fix fsfreeze for filesystems mounted in multiple locations
* fix test failure when running in a chroot
* support for socket-based activation

# gpg: Signature made Mon 06 Mar 2017 07:54:17 GMT
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <[email protected]>"
# gpg:                 aka "Michael Roth <[email protected]>"
# gpg:                 aka "Michael Roth <[email protected]>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* remotes/mdroth/tags/qga-pull-2017-03-06-tag:
  tests: check path to avoid a failing qga/get-vcpus test
  qga: ignore EBUSY when freezing a filesystem
  qga: add systemd socket activation support

Signed-off-by: Peter Maydell <[email protected]>

9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()

We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.

While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).

This fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

9pfs: fix O_PATH build break with older glibc versions

When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.

On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. The is done
with a dedicated macro.

Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <[email protected]>

9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()

The name argument can never be an empty string, and dirfd always point to
the containing directory of the file name. AT_EMPTY_PATH is hence useless
here. Also it breaks build with glibc version 2.13 and older.

It is actually an oversight of a previous tentative patch to implement this
function. We can safely drop it.

Reported-by: Mark Cave-Ayland <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Tested-by: Mark Cave-Ayland <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

9pfs: fail local_statfs() earlier

If we cannot open the given path, we can return right away instead of
passing -1 to fstatfs() and close(). This will make Coverity happy.

(Coverity issue CID1371729)

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

9pfs: fix fd leak in local_opendir()

Coverity issue CID1371731

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

9pfs: fix bogus fd check in local_remove()

This was spotted by Coverity as a fd leak. This is certainly true, but also
local_remove() would always return without doing anything, unless the fd is
zero, which is very unlikely.

(Coverity issue CID1371732)

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Mon 06 Mar 2017 04:15:17 GMT
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/filter-mirror: Follow CODING_STYLE
  COLO-compare: Fix icmp and udp compare different packet always dump bug
  COLO-compare: Optimize compare_common and compare_tcp
  COLO-compare: Rename compare function and remove duplicate codes
  filter-rewriter: skip net_checksum_calculate() while offset = 0
  net/colo: fix memory double free error
  vmxnet3: VMStatify rx/tx q_descr and int_state
  vmxnet3: Convert ring values to uint32_t's
  net/colo-compare: Fix memory free error
  colo-compare: Fix removing fds been watched incorrectly in finalization
  char: remove the right fd been watched in qemu_chr_fe_set_handlers()
  colo-compare: kick compare thread to exit after some cleanup in finalization
  colo-compare: use g_timeout_source_new() to process the stale packets
  NetRxPkt: Remove code duplication in net_rx_pkt_pull_data()
  NetRxPkt: Account buffer with ETH header in IOV length
  NetRxPkt: Do not try to pull more data than present
  NetRxPkt: Fix memory corruption on VLAN header stripping
  eth: Extend vlan stripping functions
  net: Remove useless local var pkt

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170306' into staging

ppc patch queue for 2017-03-06

Looks like my previous batch wasn't quite the last before hard freeze.
This has a handful of bugfixes to go in.  They're all genuine
bugfixes, though not regressions in some cases.

# gpg: Signature made Mon 06 Mar 2017 04:07:48 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170306:
  target/ppc: use helper for excp handling
  target/ppc: fmadd: add macro for updating flags
  target/ppc: fmadd check for excp independently
  spapr: ensure that all threads within core are on the same NUMA node
  ppc/xics: register reset handlers for the ICP and ICS objects

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-02-28' into staging

QAPI patches for 2017-02-28

# gpg: Signature made Sun 05 Mar 2017 08:21:51 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg:                 aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-02-28: (27 commits)
  qapi: Improve qobject visitor documentation
  qapi: Fix object input visit beyond end of list
  tests: Cover input visit beyond end of list
  qapi: Make input visitors detect unvisited list tails
  test-qobject-input-visitor: Cover missing nested struct member
  tests: Cover partial input visit of list
  test-string-input-visitor: Improve list coverage
  test-string-input-visitor: Tear down existing test automatically
  tests-qobject-input-strict: Merge into test-qobject-input-visitor
  qapi: Drop unused non-strict qobject input visitor
  test-qobject-input-visitor: Use strict visitor
  qom: Make object_property_set_qobject()'s input visitor strict
  qapi: Make string input and opts visitor require non-null input
  qapi: Drop string input visitor method optional()
  qapi: Improve qobject input visitor error reporting
  qapi: Make QObject input visitor set *list reliably
  qapi: Clean up after commit 3d344c2
  qapi: Improve a QObject input visitor error message
  qmp: Eliminate silly QERR_QMP_* macros
  qmp: Drop duplicated QMP command object checks
  ...

Signed-off-by: Peter Maydell <[email protected]>

tests: check path to avoid a failing qga/get-vcpus test

The qga/get-vcpus test fails in a simple chroot environment, as
used in an openSUSE Build Service local build, so first check
that the sysfs based path exists in order to avoid calling this
test in an environment where it won't work right.

Signed-off-by: Bruce Rogers <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Michael Roth <[email protected]>

qga: ignore EBUSY when freezing a filesystem

the current implementation fails if we try to freeze an
already frozen filesystem. This can happen if a filesystem
is mounted more than once (e.g. with a bind mount).

Suggested-by: Christian Theune <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Lieven <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Michael Roth <[email protected]>

qga: add systemd socket activation support

AF_UNIX and AF_VSOCK listen sockets can be passed in by systemd on
startup.  This allows systemd to manage the listen socket until the
first client connects and between restarts.  Advantages of socket
activation are that parallel startup of network services becomes
possible and that unused daemons do not consume memory.

The key to achieving this is the LISTEN_FDS environment variable, which
is a stable ABI as shown here:
https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/

We could link against libsystemd and use sd_listen_fds(3) but it's easy
to implement the tiny LISTEN_FDS ABI so that qemu-ga does not depend on
libsystemd.  Some systems may not have systemd installed and wish to
avoid the dependency.  Other init systems or socket activation servers
may implement the same ABI without systemd involvement.

Test as follows:

  $ cat ~/.config/systemd/user/qga.service
  [Unit]
  Description=qga

  [Service]
  WorkingDirectory=/tmp
  ExecStart=/path/to/qemu-ga --logfile=/tmp/qga.log --pidfile=/tmp/qga.pid --statedir=/tmp

  $ cat ~/.config/systemd/user/qga.socket
  [Socket]
  ListenStream=/tmp/qga.sock

  [Install]
  WantedBy=default.target

  $ systemctl --user daemon-reload
  $ systemctl --user start qga.socket
  $ nc -U /tmp/qga.sock

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Signed-off-by: Michael Roth <[email protected]>

net/filter-mirror: Follow CODING_STYLE

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

COLO-compare: Fix icmp and udp compare different packet always dump bug

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

COLO-compare: Optimize compare_common and compare_tcp

Add offset args for colo_packet_compare_common, optimize
colo_packet_compare_icmp() and colo_packet_compare_udp()
just compare the IP payload. Before compare all tcp packet,
we compare tcp checksum firstly, this function can get
better performance.

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

COLO-compare: Rename compare function and remove duplicate codes

Rename colo_packet_compare() to colo_packet_compare_common() that
make tcp_compare udp_compare icmp_compare reuse this function.
Remove minimum packet size check in icmp_compare, because we have
check this in parse_packet_early().

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

filter-rewriter: skip net_checksum_calculate() while offset = 0

While the offset of packets's sequence for primary side and
secondary side is zero, it is unnecessary to call net_checksum_calculate()
to recalculate the checksume value of packets.

Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net/colo: fix memory double free error

The 'primary_list' and 'secondary_list' members of struct Connection
is not allocated through dynamically g_queue_new(), but we free it by using
g_queue_free(), which will lead to a double-free bug.

Reviewed-by: Zhang Chen <[email protected]>
Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

vmxnet3: VMStatify rx/tx q_descr and int_state

Fairly simple mechanical conversion of all fields.

TODO!!!!
The problem is vmxnet3-ring size/cell_size/next are declared as size_t
but written as 32bit.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Acked-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

vmxnet3: Convert ring values to uint32_t's

The index's in the Vmxnet3Ring were migrated as 32bit ints
yet are declared as size_t's. They appear to be derived
from 32bit values loaded from guest memory, so actually
store them as that.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Acked-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net/colo-compare: Fix memory free error

We use g_queue_init() to init s->conn_list, so we should use g_queue_clear()
to instead of g_queue_free().

Signed-off-by: Zhang Chen <[email protected]>
Reviewed-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: Fix removing fds been watched incorrectly in finalization

We will catch the bellow error report while try to delete compare object
by qmp command:
chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.

This is caused by failing to remove the right fd been watched while
call qemu_chr_fe_set_handlers();

Fix it by pass the worker_context parameter to qemu_chr_fe_set_handlers().

Signed-off-by: zhanghailiang <[email protected]>
Reviewed-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

char: remove the right fd been watched in qemu_chr_fe_set_handlers()

We can call qemu_chr_fe_set_handlers() to add/remove fd been watched
in 'context' which can be either default main context or other explicit
context. But the original logic is not correct, we didn't remove
the right fd because we call g_main_context_find_source_by_id(NULL, tag)
which always try to find the Gsource from default context.

Fix it by passing the right context to g_main_context_find_source_by_id().

Cc: Paolo Bonzini <[email protected]>
Cc: Marc-André Lureau <[email protected]>
Signed-off-by: zhanghailiang <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: kick compare thread to exit after some cleanup in finalization

We should call g_main_loop_quit() to notify colo compare thread to
exit, Or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in context of colo thread,
it is reasonable to remove the 'if (qemu_thread_is_self(&s->thread))'
branch.

Before compare thead exits, some cleanup works need to be
done, All unhandled packets need to be released and connection_track_table
needs to be freed, or there will be memory leak.

Signed-off-by: zhanghailiang <[email protected]>
Reviewed-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: use g_timeout_source_new() to process the stale packets

Instead of using qemu timer to process the stale packets,
We re-use the colo compare thread to process these packets
by creating a new timeout coroutine.

Besides, since we process all the same vNIC's net connection/packets
in one thread, it is safe to remove the timer_check_lock.

Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Remove code duplication in net_rx_pkt_pull_data()

This is a refactoring commit that does not change behavior.

Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Account buffer with ETH header in IOV length

In case of VLAN stripping ETH header is stored in a
separate chunk and length of IOV should take this into
account.

This patch fixes checksum validation for RX packets
with VLAN header.

Devices affected by this problem: e1000e and vmxnet3.

Cc: [email protected]
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Do not try to pull more data than present

In case of VLAN stripping, ETH header put into a
separate buffer, therefore amont of data copied
from original IOV should be smaller.

Cc: [email protected]
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Fix memory corruption on VLAN header stripping

This patch fixed a problem that was introduced in commit eb700029.

When net_rx_pkt_attach_iovec() calls eth_strip_vlan()
this can result in pkt->ehdr_buf being overflowed, because
ehdr_buf is only sizeof(struct eth_header) bytes large
but eth_strip_vlan() can write
sizeof(struct eth_header) + sizeof(struct vlan_header)
bytes into it.

Devices affected by this problem: vmxnet3.

Cc: [email protected]
Reported-by: Peter Maydell <[email protected]>
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

eth: Extend vlan stripping functions

Make VLAN stripping functions return number of bytes
copied to given Ethernet header buffer.

This information should be used to re-compose
packet IOV after VLAN stripping.

Cc: [email protected]
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net: Remove useless local var pkt

This has been pointless since commit 605d52e62, which was a
search-and-replace, overlooked the redundancy.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

target/ppc: use helper for excp handling

Use the helper routine float[32,64]_maddsub_update_excp() in VSX_MADD
macro.

Signed-off-by: Nikunj A Dadhania <[email protected]>
Signed-off-by: David Gibson <[email protected]>

target/ppc: fmadd: add macro for updating flags

Adds FPU_MADDSUB_UPDATE macro, this will be used for other routines
having float32/16

Signed-off-by: Nikunj A Dadhania <[email protected]>
Signed-off-by: David Gibson <[email protected]>

target/ppc: fmadd check for excp independently

Current order of checking does not confirm with the spec
(ISA 3.0: MultiplyAddDP page-469). Change the order and make them
independent of each other.

For example: a = infinity, b = zero, c = SNaN, this should set both
VXIMZ and VXNAN

Signed-off-by: Nikunj A Dadhania <[email protected]>
Signed-off-by: David Gibson <[email protected]>

spapr: ensure that all threads within core are on the same NUMA node

Threads within a core shouldn't be on different
NUMA nodes, so if user has misconfgured command
line, fail QEMU at start up to force user fix it.

For now use the first thread on the core as source
of core's node-id. Later when cpu-numa refactoring
lands it will be switched to core's node-id from
possible_cpus[].

This prevents the same problems as commit 20bb648d
"spapr: Fix default NUMA node allocation for threads",
but for the case of manually configured NUMA node
mappings, instead of just the default case.

Signed-off-by: Igor Mammedov <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xics: register reset handlers for the ICP and ICS objects

The recent changes on the XICS layer removed the XICSState object to
let the sPAPR machine handle the ICP and ICS directly. The reset of
these objects was previously handled by XICSState, which was a SysBus
device, and to keep the same behavior, the ICP and ICS were assigned
to SysbBus.

But that broke the 'info qtree' command in the monitor. 'qtree'
performs a loop on the children of a bus to print their properties and
SysBus devices are expected to be found under SysBus, which is not the
case anymore.

The fix for this problem is to register reset handlers for the ICP and
ICS objects and stop using SysBus for such devices.

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Tested-by: Thomas Huth <[email protected]>
Signed-off-by: David Gibson <[email protected]>

qapi: Improve qobject visitor documentation

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Fix object input visit beyond end of list

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

tests: Cover input visit beyond end of list

When you try to visit beyond the end of a list, the qobject input
visitor crashes, and the string visitor screws returns garbage. The
generated list visits never go beyond the list end, but manual visits
could.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Make input visitors detect unvisited list tails

Fix the design flaw demonstrated in the previous commit: new method
check_list() lets input visitors report that unvisited input remains
for a list, exactly like check_struct() lets them report that
unvisited input remains for a struct or union.

Implement the method for the qobject input visitor (straightforward),
and the string input visitor (less so, due to the magic list syntax
there). The opts visitor's list magic is even more impenetrable, and
all I can do there today is a stub with a FIXME comment. No worse
than before.

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488544368 [email protected]>
Reviewed-by: Eric Blake <[email protected]>

test-qobject-input-visitor: Cover missing nested struct member

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488544368 [email protected]>
Reviewed-by: Eric Blake <[email protected]>

tests: Cover partial input visit of list

Demonstrates a design flaw: there is no way to for input visitors to
report that a list visit didn't visit the complete input list. The
generated list visits always do, but manual visits needn't.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

test-string-input-visitor: Improve list coverage

Lists with elements above INT64_MAX don't work (known bug). Empty
lists don't work (weird).

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

test-string-input-visitor: Tear down existing test automatically

Call visitor_input_teardown() from visitor_input_test_init(), so you
don't have to call it from the actual tests.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

tests-qobject-input-strict: Merge into test-qobject-input-visitor

Much of test-qobject-input-strict.c duplicates
test-qobject-input-strict.c, but with less assertions on expected
output:

* test_validate_struct() duplicates test_visitor_in_struct()

* test_validate_struct_nested() duplicates
  test_visitor_in_struct_nested()

* test_validate_list() duplicates the first half of
  test_visitor_in_list()

* test_validate_union_native_list() duplicates
  test_visitor_in_native_list_int()

* test_validate_union_flat() duplicates test_visitor_in_union_flat()

* test_validate_alternate() duplicates the first part of
  test_visitor_in_alternate()

Merge the remaining test cases into test-qobject-input-visitor.c, and
drop the now redundant test-qobject-input-strict.c.

Test case "/visitor/input-strict/fail/list" isn't really about lists,
it's about a bad struct nested in a list.  Rename accordingly.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Drop unused non-strict qobject input visitor

The split between tests/test-qobject-input-visitor.c and
tests/test-qobject-input-strict.c now makes less sense than ever. The
next commit will take care of that.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

test-qobject-input-visitor: Use strict visitor

The qobject input visitor comes in a strict and a non-strict variant.
This test is the non-strict variant's last user. Turns out it relies
on non-strict only in test_visitor_in_null(), and just out of
laziness. We don't actually test the non-strict behavior.

Clean up test_visitor_in_null(), and switch to the strict variant.
The next commit will drop the non-strict variant.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qom: Make object_property_set_qobject()'s input visitor strict

Commit 240f64b made all qobject input visitors created outside tests
strict, except for the one in object_property_set_qobject().  That one
was left behind only because Eric couldn't spare the time to figure
out whether making it strict would break anything, with a TODO
comment.  Time to resolve it.

Strict makes a difference only for otherwise successful visits of QAPI
structs or unions.  Let's examine what the callers of
object_property_set_qobject() visit:

* object_property_set_str(), object_property_set_bool(),
  object_property_set_int() visit a QString, QBool, QInt,
  respectively.  Strictness can't matter.

* qmp_qom_set visits its @value argument.  Comes straight from QMP and
  can be anything ('any' in the QAPI schema).  Strictness matters when
  the property's set() method visits a struct or union QAPI type.

  No such methods exist, thus switching to strict can't break
  anything.

  If we acquire such methods in the future, we'll *want* the visitor
  to be strict, so that unexpected members get rejected as they should
  be.

Switch to strict.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Make string input and opts visitor require non-null input

The string input visitor tries to cope with null input.  Null input
isn't used anywhere, and isn't covered by tests.  Unsurprisingly, it
doesn't fully work: start_list() crashes because it passes the input
via parse_str() to strtoll() unchecked.

Make string_input_visitor_new() assert its argument isn't null, and
drop the code trying to deal with null input.

The opts visitor crashes when you try to actually visit something with
null input.  Make opts_visitor_new() assert its argument isn't null,
mostly for clarity.

qobject_input_visitor_new() already asserts its argument isn't null.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Drop string input visitor method optional()

visit_optional() is to be called only between visit_start_struct() and
visit_end_struct().  Visitors that don't support struct visits,
i.e. don't implement start_struct(), end_struct(), have no use for it.
Clarify documentation.

The string input visitor doesn't support struct visits.  Its
parse_optional() is therefore useless.  Drop it.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Improve qobject input visitor error reporting

Error messages refer to nodes of the QObject being visited by name.
Trouble is the names are sometimes less than helpful:

* The name of the root QObject is whatever @name argument got passed
  to the visitor, except NULL gets mapped to "null".  We commonly pass
  NULL.  Not good.

  Avoiding errors "at the root" mitigates.  For instance,
  visit_start_struct() can only fail when the visited object is not a
  dictionary, and we commonly ensure it is beforehand.

* The name of a QDict's member is the member key.  Good enough only
  when this happens to be unique.

* The name of a QList's member is "null".  Not good.

Improve error messages by referring to nodes by path instead, as
follows:

* The path of the root QObject is whatever @name argument got passed
  to the visitor, except NULL gets mapped to "<anonymous>".

* The path of a root QDict's member is the member key.

* The path of a root QList's member is "[%u]", where %u is the list
  index, starting at zero.

* The path of a non-root QDict's member is the path of the QDict
  concatenated with "." and the member key.

* The path of a non-root QList's member is the path of the QList
  concatenated with "[%u]", where %u is the list index.

For example, the incorrect QMP command

    { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "raw", "file": {"driver": "file" } } }

now fails with

    {"error": {"class": "GenericError", "desc": "Parameter 'file.filename' is missing"}}

instead of

    {"error": {"class": "GenericError", "desc": "Parameter 'filename' is missing"}}

and

    { "execute": "input-send-event", "arguments": { "device": "bar", "events": [ [] ] } }

now fails with

    {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'events[0]', expected: object"}}

instead of

    {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'null', expected: QDict"}}

Aside: calling the thing "parameter" is suboptimal for QMP, because
the root object is "arguments" there.

The qobject output visitor doesn't have this problem because it should
not fail.  Same for dealloc and clone visitors.

The string visitors don't have this problem because they visit just
one value, whose name needs to be passed to the visitor as @name.  The
string output visitor shouldn't fail anyway.

The options visitor uses QemuOpts names.  Their name space is flat, so
the use of QDict member keys as names is fine.  NULL names used with
roots and lists could conceivably result in bad error messages.  Left
for another day.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Make QObject input visitor set *list reliably

qobject_input_start_struct() sets *list, except when it fails because
qobject_input_get_object() fails, i.e. the input object doesn't exist.

All the other input visitor start_struct(), start_list(),
start_alternate() always set *obj / *list.

Change qobject_input_start_struct() to match.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

qapi: Clean up after commit 3d344c2

Drop unused QIV_STACK_SIZE and unused qobject_input_start_struct()
parameter errp.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qapi: Improve a QObject input visitor error message

The QObject input visitor has three error message formats:

* Parameter '%s' is missing
* "Invalid parameter type for '%s', expected: %s"
* "QMP input object member '%s' is unexpected"

The '%s' are member names (or "null", but I'll fix that later).

The last error message calls the thing "QMP input object member"
instead of "parameter". Misleading when the visitor is used on
QObjects that don't come from QMP. Change it to "Parameter '%s' is
unexpected".

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qmp: Eliminate silly QERR_QMP_* macros

The QERR_ macros are leftovers from the days of "rich" error objects.

QERR_QMP_BAD_INPUT_OBJECT, QERR_QMP_BAD_INPUT_OBJECT_MEMBER,
QERR_QMP_EXTRA_MEMBER are used in just one place now, except for one
use that has crept into qobject-input-visitor.c.

Drop these macros, to make the (bad) error messages more visible.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qmp: Drop duplicated QMP command object checks

qmp_check_input_obj() duplicates qmp_dispatch_check_obj(), except the
latter screws up an error message.  handle_qmp_command() runs first
the former, then the latter via qmp_dispatch(), masking the screwup.

qemu-ga also masks the screwup, because it also duplicates checks,
just differently.

qmp_check_input_obj() exists because handle_qmp_command() needs to
examine the command before dispatching it.  The previous commit got
rid of this need, except for a tracepoint, and a bit of "id" code that
relies on qdict not being null.

Fix up the error message in qmp_dispatch_check_obj(), drop
qmp_check_input_obj() and the tracepoint.  Protect the "id" code with
a conditional.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qmp: Clean up how we enforce capability negotiation

To enforce capability negotiation before normal operation,
handle_qmp_command() inspects every command before it's handed off to
qmp_dispatch().  This is a bit of a layering violation, and results in
duplicated code.

Before capability negotiation (!cur_mon->in_command_mode), we fail
commands other than "qmp_capabilities".  This is what enforces
capability negotiation.

Afterwards, we fail command "qmp_capabilities".

Clean this up as follows.

The obvious place to fail a command is the command itself, so move the
"afterwards" check to qmp_qmp_capabilities().

We do the "before" check in every other command, but that would be
bothersome.  Instead, start with an alternate list of commands that
contains only "qmp_capabilities".  Switch to the full list in
qmp_qmp_capabilities().

Additionally, replace the generic human-readable error message for
CommandNotFound by one that reminds the user to run qmp_capabilities.
Without that, we'd regress commit 2d5a834.

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488544368 [email protected]>
[Mirco-optimization squashed in, commit message typo fixed]
Reviewed-by: Eric Blake <[email protected]>

qapi-introspect: Mangle --prefix argument properly for C

qapi-introspect.py --prefix hasn't been used so far, but fix it anyway.

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488544368 [email protected]>
[Commit message improved]
Reviewed-by: Eric Blake <[email protected]>

qapi: Support multiple command registries per program

The command registry encapsulates a single command list. Give the
functions using it a parameter instead. Define suitable command lists
in monitor, guest agent and test-qmp-commands.

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488544368 [email protected]>
[Debugging turds buried]
Reviewed-by: Eric Blake <[email protected]>

qmp: Dumb down how we run QMP command registration

The way we get QMP commands registered is high tech:

* qapi-commands.py generates qmp_init_marshal() that does the actual work

* it also generates the magic to register it as a MODULE_INIT_QAPI
  function, so it runs when someone calls
  module_call_init(MODULE_INIT_QAPI)

* main() calls module_call_init()

QEMU needs to register a few non-qapified commands.  Same high tech
works: monitor.c has its own qmp_init_marshal() along with the magic
to make it run in module_call_init(MODULE_INIT_QAPI).

QEMU also needs to unregister commands that are not wanted in this
build's configuration (commit 5032a16).  Simple enough:
qmp_unregister_commands_hack().  The difficulty is to make it run
after the generated qmp_init_marshal().  We can't simply run it in
monitor.c's qmp_init_marshal(), because the order in which the
registered functions run is indeterminate.  So qmp_init_marshal()
registers qmp_unregister_commands_hack() separately.  Since
registering *appends* to the list of registered functions, this will
make it run after all the functions that have been registered already.

I suspect it takes a long and expensive computer science education to
not find this silly.

Dumb it down as follows:

* Drop MODULE_INIT_QAPI entirely

* Give the generated qmp_init_marshal() external linkage.

* Call it instead of module_call_init(MODULE_INIT_QAPI)

* Except in QEMU proper, call new monitor_init_qmp_commands() that in
  turn calls the generated qmp_init_marshal(), registers the
  additional commands and unregisters the unwanted ones.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qmp-test: New, covering basic QMP protocol

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

libqtest: Work around a "QMP wants a newline" bug

The next commit is going to add a test that calls qmp("null").
Curiously, this hangs.  Here's why.

qmp_fd_sendv() doesn't send newlines.  Not even when @fmt contains
some.  At first glance, the QMP parser seems to be fine with that.
However, it turns out that it fails to react to input until it sees
either a newline, an object or an array.  To reproduce, feed to a QMP
monitor like this:

    $ echo -n 'null' | socat UNIX:/work/armbru/images/test-qmp STDIO
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}

No output after the greeting.

Add a newline:

    $ echo 'null' | socat UNIX:/work/armbru/images/test-qmp STDIO
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}
    {"error": {"class": "GenericError", "desc": "Expected 'object' in QMP input"}}

Correct output for input 'null'.

Add an object instead:

    $ echo -n 'null { "execute": "qmp_capabilities" }' | socat UNIX:qmp-socket STDIO
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}
    {"error": {"class": "GenericError", "desc": "Expected 'object' in QMP input"}}
    {"return": {}}

Also correct output.

Work around this QMP bug by having qmp_fd_sendv() append a newline.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

qga: Fix crash on non-dictionary QMP argument

The value of key 'arguments' must be a JSON object.  qemu-ga neglects
to check, and crashes.  To reproduce, send

    { 'execute': 'guest-sync', 'arguments': [] }

to qemu-ga.

do_qmp_dispatch() uses qdict_get_qdict() to get the arguments.  When
not a JSON object, this gets a null pointer, which flows through the
generated marshalling function to qobject_input_visitor_new(), where
it fails the assertion.  qmp_dispatch_check_obj() needs to catch this
error.

QEMU isn't affected, because it runs qmp_check_input_obj() first,
which basically duplicates qmp_dispatch_check_obj()'s checks, plus the
missing one.

Fix by copying the missing one from qmp_check_input_obj() to
qmp_dispatch_check_obj().

Signed-off-by: Markus Armbruster <[email protected]>
Cc: Michael Roth <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488544368 [email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging

ppc patch queuye for 2017-03-03

This will probably be my last pull request before the hard freeze.  It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.

This batch has:
    * A substantial amount of POWER9 work
        * Implements the legacy (hash) MMU for POWER9
* Some more preliminaries for implementing the POWER9 radix
          MMU
* POWER9 has_work
* Basic POWER9 compatibility mode handling
* Removal of some premature tests
    * Some cleanups and fixes to the existing MMU code to make the
      POWER9 work simpler
    * A bugfix for TCG multiply adds on power
    * Allow pseries guests to access PCIe extended config space

This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c.  This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.

# gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170303:
  target/ppc: rewrite f[n]m[add,sub] using float64_muladd
  spapr: Small cleanup of PPC MMU enums
  spapr_pci: Advertise access to PCIe extended config space
  target/ppc: Rework hash mmu page fault code and add defines for clarity
  target/ppc: Move no-execute and guarded page checking into new function
  target/ppc: Add execute permission checking to access authority check
  target/ppc: Add Instruction Authority Mask Register Check
  hw/ppc/spapr: Add POWER9 to pseries cpu models
  target/ppc/POWER9: Add cpu_has_work function for POWER9
  target/ppc/POWER9: Add POWER9 pa-features definition
  target/ppc/POWER9: Add POWER9 mmu fault handler
  target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
  target/ppc: Add patb_entry to sPAPRMachineState
  target/ppc/POWER9: Add POWERPC_MMU_V3 bit
  powernv: Don't test POWER9 CPU yet
  exec, kvm, target-ppc: Move getrampagesize() to common code
  target/ppc: Add POWER9/ISAv3.00 to compat_table

Signed-off-by: Peter Maydell <[email protected]>

ppc: avoid typedef redefinitions

These cause compilation failures on CentOS 6 or other operating
systems with older GCCs.

Cc: David Gibson <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Message-id: 1488558530 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

nios2: avoid anonymous unions in designated initializers.

These cause compilation failures on CentOS 6 or other operating
systems with older GCCs.

Cc: Richard Henderson <[email protected]>
Cc: Peter Maydell <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hppa: avoid anonymous unions in designated initializers.

These cause compilation failures on CentOS 6 or other operating
systems with older GCCs.

Cc: Richard Henderson <[email protected]>
Cc: Peter Maydell <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1488558530 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* kernel header update (requested by David and Vijay)
* GuestPanicInformation fixups (Anton)
* record/replay icount fixes (Pavel)
* cpu-exec cleanup, unification of icount_decr with tcg_exit_req (me)
* KVM_CAP_IMMEDIATE_EXIT support (me)
* vmxcap update (me)
* iscsi locking fix (me)
* VFIO ram device fix (Yongji)
* scsi-hd vs. default CD-ROM (Hervé)
* SMI migration fix (Dave)
* spice-char segfault (Li Qiang)
* improved "info mtree -f" (me)

# gpg: Signature made Fri 03 Mar 2017 15:43:04 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg:                 aka "Paolo Bonzini <[email protected]>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (21 commits)
  iscsi: fix missing unlock
  memory: show region offset and ROM/RAM type in "info mtree -f"
  x86: Work around SMI migration breakages
  spice-char: fix segfault in char_spice_finalize
  vl: disable default cdrom when using explicitely scsi-hd
  memory: Introduce DEVICE_HOST_ENDIAN for ram device
  qmp-events: fix GUEST_PANICKED description formatting
  qapi: flatten GuestPanicInformation union
  vmxcap: update for September 2016 SDM
  vmxcap: port to Python 3
  KVM: use KVM_CAP_IMMEDIATE_EXIT
  kvm: use atomic_read/atomic_set to access cpu->exit_request
  KVM: move SIG_IPI handling to kvm-all.c
  KVM: do not use sigtimedwait to catch SIGBUS
  KVM: remove kvm_arch_on_sigbus
  cpus: reorganize signal handling code
  KVM: x86: cleanup SIGBUS handlers
  cpus: remove ugly cast on sigbus_handler
  cpu-exec: remove unnecessary check of cpu->exit_request
  replay: check icount in cpu exec loop
  ...

Signed-off-by: Peter Maydell <[email protected]>

iscsi: fix missing unlock

Reported by Coverity.

Signed-off-by: Paolo Bonzini <[email protected]>

memory: show region offset and ROM/RAM type in "info mtree -f"

"info mtree -f" output is currently hard to use for large RAM regions, because
there is no hint as to what part of the region is being mapped.  Add the offset
if it is nonzero.

Secondly, FlatView has a readonly field, that can override the MemoryRegion
in the presence of aliases.  Take it into account.

Together, with this patch this:

address-space (flat view): KVM-SMRAM
  0000000000000000-00000000000bffff (prio 0, ram): pc.ram
  00000000000c0000-00000000000c9fff (prio 0, ram): pc.ram
  00000000000ca000-00000000000ccfff (prio 0, ram): pc.ram
  00000000000cd000-00000000000ebfff (prio 0, ram): pc.ram
  00000000000ec000-00000000000effff (prio 0, ram): pc.ram
  00000000000f0000-00000000000fffff (prio 0, ram): pc.ram
  0000000000100000-00000000bfffffff (prio 0, ram): pc.ram
  00000000fd000000-00000000fdffffff (prio 1, ram): vga.vram
  00000000febc0000-00000000febdffff (prio 1, i/o): e1000-mmio
  00000000febf0400-00000000febf041f (prio 0, i/o): vga ioports remapped
  00000000febf0500-00000000febf0515 (prio 0, i/o): bochs dispi interface
  00000000febf0600-00000000febf0607 (prio 0, i/o): qemu extended regs
  00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic
  00000000fed00000-00000000fed003ff (prio 0, i/o): hpet
  00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi
  00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios
  0000000100000000-000000013fffffff (prio 0, ram): pc.ram

becomes this:

address-space (flat view): KVM-SMRAM
  0000000000000000-00000000000bffff (prio 0, ram): pc.ram
  00000000000c0000-00000000000c9fff (prio 0, rom): pc.ram @00000000000c0000
  00000000000ca000-00000000000ccfff (prio 0, ram): pc.ram @00000000000ca000
  00000000000cd000-00000000000ebfff (prio 0, rom): pc.ram @00000000000cd000
  00000000000ec000-00000000000effff (prio 0, ram): pc.ram @00000000000ec000
  00000000000f0000-00000000000fffff (prio 0, rom): pc.ram @00000000000f0000
  0000000000100000-00000000bfffffff (prio 0, ram): pc.ram @0000000000100000
  00000000fd000000-00000000fdffffff (prio 1, ram): vga.vram
  00000000febc0000-00000000febdffff (prio 1, i/o): e1000-mmio
  00000000febf0400-00000000febf041f (prio 0, i/o): vga ioports remapped
  00000000febf0500-00000000febf0515 (prio 0, i/o): bochs dispi interface
  00000000febf0600-00000000febf0607 (prio 0, i/o): qemu extended regs
  00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic
  00000000fed00000-00000000fed003ff (prio 0, i/o): hpet
  00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi
  00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios
  0000000100000000-000000013fffffff (prio 0, ram): pc.ram @00000000c0000000

This should make it easier to understand what's going on.

Cc: Peter Xu <[email protected]>
Cc: "William Tambe" <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

x86: Work around SMI migration breakages

Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU
due to a disagreement about SM (System management) interrupts.

2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI
and this gets into the migration stream, but on 2.3.0 it
never got delivered.

~2.4.0 SMI interrupt support was added but was broken - so
that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI
but never actually caused an interrupt.

The SMI delivery was recently fixed by 68c6efe07a, but the
effect now is that an incoming 2.3.0 stream takes the interrupt it
had flagged but it's bios can't actually handle it(I think
partly due to the original interrupt not being taken during boot?).
The consequence is a triple(?) fault and a reboot.

Tested from:
  2.3.1 -M 2.3.0
  2.7.0 -M 2.3.0
  2.8.0 -M 2.3.0
  2.8.0 -M 2.8.0

This corresponds to RH bugzilla entry 1420679.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Message-Id: <20170223133441 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

spice-char: fix segfault in char_spice_finalize

In 'qemu_chr_open_spice_vmc' if the 'psubtype' is NULL, it will
call 'char_spice_finalize'. But as the SpiceChardev is not inserted
in the 'spice_chars' list, the 'QLIST_REMOVE' will cause a segfault.
Add a detect to avoid it.

Signed-off-by: Li Qiang <[email protected]>
Message-Id: <1487665107 [email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Li Qiang <[email protected]>

vl: disable default cdrom when using explicitely scsi-hd

In commit af6bf1328ef90fae617857c02697e0174b84d596 (May 2011),
ide-hd, ide-cd and scsi-cd have been added to disable default cdrom,
"or else you can't put one on secondary master without -nodefaults".

Make it the same for scsi-hd, so you can put one on scsi-id 2 without
using -nodefaults.
scsi-hd has probably been forgotten, as it has been added in the
preceding commit (b443ae67130d32ad06b06fc9aa6d04d05ccd93ce).

Affected users are the ones using a machine with SCSI devices and start QEMU
with -device scsi-hd but without -device scsi-cd or -cdrom
In that case, the default cdrom device will disappear instead of being empty.

Signed-off-by: Hervé Poussineau <[email protected]>
Message-Id: <1487623279 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

memory: Introduce DEVICE_HOST_ENDIAN for ram device

At the moment ram device's memory regions are DEVICE_NATIVE_ENDIAN. It's
incorrect. This memory region is backed by a MMIO area in host, so the
uint64_t data that MemoryRegionOps read from/write to this area should be
host-endian rather than target-endian. Hence, current code does not work
when target and host endianness are different which is the most common case
on PPC64. To fix it, this introduces DEVICE_HOST_ENDIAN for the ram device.

This has been tested on PPC64 BE/LE host/guest in all possible combinations
including TCG.

Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Yongji Xie <[email protected]>
Message-Id: <1488171164 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qmp-events: fix GUEST_PANICKED description formatting

Signed-off-by: Anton Nefedov <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
CC: Paolo Bonzini <[email protected]>
CC: Eric Blake <[email protected]>
Message-Id: <1487614915 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qapi: flatten GuestPanicInformation union

Signed-off-by: Anton Nefedov <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
CC: Paolo Bonzini <[email protected]>
CC: Eric Blake <[email protected]>
Message-Id: <1487614915 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

vmxcap: update for September 2016 SDM

Signed-off-by: Paolo Bonzini <[email protected]>

vmxcap: port to Python 3

Signed-off-by: Paolo Bonzini <[email protected]>

KVM: use KVM_CAP_IMMEDIATE_EXIT

The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
a VCPU out of KVM_RUN through a POSIX signal.  A signal is attached
to a dummy signal handler; by blocking the signal outside KVM_RUN and
unblocking it inside, this possible race is closed:

          VCPU thread                     service thread
   --------------------------------------------------------------
        check flag
                                          set flag
                                          raise signal
        (signal handler does nothing)
        KVM_RUN

However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
tsk->sighand->siglock on every KVM_RUN.  This lock is often on a
remote NUMA node, because it is on the node of a thread's creator.
Taking this lock can be very expensive if there are many userspace
exits (as is the case for SMP Windows VMs without Hyper-V reference
time counter).

KVM_CAP_IMMEDIATE_EXIT provides an alternative, where the flag is
placed directly in kvm_run so that KVM can see it:

          VCPU thread                     service thread
   --------------------------------------------------------------
                                          raise signal
        signal handler
          set run->immediate_exit
        KVM_RUN
          check run->immediate_exit

The previous patches changed QEMU so that the only blocked signal is
SIG_IPI, so we can now stop using KVM_SET_SIGNAL_MASK and sigtimedwait
if KVM_CAP_IMMEDIATE_EXIT is available.

On a 14-VCPU guest, an "inl" operation goes down from 30k to 6k on
an unlocked (no BQL) MemoryRegion, or from 30k to 15k if the BQL
is involved.

Signed-off-by: Paolo Bonzini <[email protected]>

kvm: use atomic_read/atomic_set to access cpu->exit_request

Signed-off-by: Paolo Bonzini <[email protected]>

KVM: move SIG_IPI handling to kvm-all.c

This lets us remove a bunch of CONFIG_LINUX defines.

Signed-off-by: Paolo Bonzini <[email protected]>

KVM: do not use sigtimedwait to catch SIGBUS

Call kvm_on_sigbus_vcpu asynchronously from the VCPU thread.
Information for the SIGBUS can be stored in thread-local variables
and processed later in kvm_cpu_exec.

Signed-off-by: Paolo Bonzini <[email protected]>

KVM: remove kvm_arch_on_sigbus

Build it on kvm_arch_on_sigbus_vcpu instead. They do the same
for "action optional" SIGBUSes, and the main thread should never get
"action required" SIGBUSes because it blocks the signal.

Signed-off-by: Paolo Bonzini <[email protected]>

cpus: reorganize signal handling code

Move the KVM "eat signals" code under CONFIG_LINUX, in preparation
for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately,
without passing it to KVM.

Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: cleanup SIGBUS handlers

This patch should have no semantic change.

Signed-off-by: Paolo Bonzini <[email protected]>

cpus: remove ugly cast on sigbus_handler

The cast is there because sigbus_handler is invoked via sigfd_handler.
But it feels just wrong to use struct qemu_signalfd_siginfo in the
prototype of a function that is passed to sigaction.

Instead, do a simple-minded conversion of qemu_signalfd_siginfo to
siginfo_t.

Signed-off-by: Paolo Bonzini <[email protected]>

Merge branch 'icount-update' into HEAD

Merge the original development branch due to breakage caused by the
MTTCG merge.

Conflicts:
cpu-exec.c
translate-common.c

Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging

NUMA documentation update

# gpg: Signature made Fri 03 Mar 2017 13:11:25 GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/numa-pull-request:
  qemu-options: Rewrite -numa documentation

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/submodule-update-20170303' into staging

submodule updates (SLOF & dtc) 2017-03-03

This set of patches updates the SLOF and dtc submodules for qemu-2.9.

The SLOF update could have gone in my ppc pull request earlier today,
but I forgot it.  It should be safe to apply in either order with that
set though.

The dtc (and libfdt) update brings us up to dtc 1.4.3 which includes
some things that will be useful in future.

# gpg: Signature made Fri 03 Mar 2017 06:29:31 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/submodule-update-20170303:
  Update dtc submodule to v1.4.3
  pseries: Update SLOF firmware image

Signed-off-by: Peter Maydell <[email protected]>

qemu-options: Rewrite -numa documentation

Rewrite the -numa documentation to clarify what exactly it does.

Signed-off-by: Eduardo Habkost <[email protected]>
Message-Id: <20170123180632 [email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>