Git Repo - qemu.git/log

hw/intc/arm_gicv3: Fix secure-GIC NS ICC_PMR and ICC_RPR accesses

If the GIC has the security extension support enabled, then a
non-secure access to ICC_PMR must take account of the non-secure
view of interrupt priorities, where real priorities 0x00..0x7f
are secure-only and not visible to the non-secure guest, and
priorities 0x80..0xff are shown to the guest as if they were
0x00..0xff. We had the logic here wrong:
* on reads, the priority is in the secure range if bit 7
is clear, not if it is set
* on writes, we want to set bit 7, not mask everything else

Our ICC_RPR read code had the same error as ICC_PMR.

(Compare the GICv3 spec pseudocode functions ICC_RPR_EL1
and ICC_PMR_EL1.)
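
The corrected mapping can be sketched in a few lines. This is a minimal model of the non-secure view described above, not the QEMU code: the names `ns_view_of_priority` and `ns_write_to_priority` are hypothetical, and any further special cases handled by the device model or the spec pseudocode are left out.

```python
def ns_view_of_priority(real):
    """Value a non-secure read would see for a real 8-bit priority."""
    if (real & 0x80) == 0:
        # Bit 7 clear: the priority is in the secure-only range 0x00..0x7f,
        # which is not visible to the non-secure guest.
        return 0
    # Bit 7 set: shift left so 0x80..0xff spans the guest-visible range.
    return (real << 1) & 0xFF


def ns_write_to_priority(value):
    """Real 8-bit priority stored for a non-secure write (simplified)."""
    # Shift into the lower half of the range, then set bit 7 so the
    # result always lands in the non-secure range 0x80..0xff.
    return (value >> 1) | 0x80
```

In this model a non-secure write followed by a non-secure read returns the written value with bit 0 cleared, and a write can never produce a priority in the secure range.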

Fixes: https://bugs.launchpad.net/qemu/+bug/1748434
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Message-id: 20180315133441 [email protected]

sdhci: fix incorrect use of Error *

Detected by Coverity (CID 1386072, 1386073, 1386076, 1386077). local_err
was unused, and this made the static analyzer unhappy.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-id: 20180320151355 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

arm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT

A boot hang of the 4.15 Linux kernel was observed in the OE project
under single-CPU aarch64 qemu. Kernel code was in a loop waiting for
the vtimer to arrive, spinning in TCG-generated blocks, while the
interrupt was pending unprocessed. This happened because when qemu
tried to handle the vtimer interrupt the target had interrupts
disabled; as a result the flag indicating TCG exit,
cpu->icount_decr.u16.high, was cleared but the arm_cpu_exec_interrupt
function did not call arm_cpu_do_interrupt to process the interrupt.
Later, when the target re-enabled interrupts, it did so without an
exit into the main loop, so the following code that waited for the
result of interrupt execution ran in an infinite loop.

To solve the problem, instructions that operate on CPU system state
(e.g. enabling/disabling interrupts) and are marked as DISAS_UPDATE
should be treated as a DISAS_EXIT variant and forced to exit back
to the main loop, so that qemu has a chance to process pending CPU
state updates, including pending interrupts.

This change brings consistency with how DISAS_UPDATE is treated
in the aarch32 case.
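
The hang and the fix can be illustrated with a toy model. Everything here is an assumption made for illustration (the `ToyCPU` fields, `main_loop_check`, and the two-block guest program are hypothetical and do not correspond to QEMU structures); it only shows why a state-changing instruction that stays inside chained blocks can leave a pending interrupt undelivered.

```python
from dataclasses import dataclass


@dataclass
class ToyCPU:
    irq_pending: bool = True    # the vtimer interrupt has been raised
    irqs_masked: bool = True    # the guest starts with interrupts disabled
    exit_request: bool = True   # stand-in for the "leave TCG" flag
    irq_handled: bool = False


def main_loop_check(cpu):
    """Work that only happens when control returns to the main loop."""
    cpu.exit_request = False            # the exit request is consumed
    if cpu.irq_pending and not cpu.irqs_masked:
        cpu.irq_pending = False         # interrupt is delivered
        cpu.irq_handled = True


def run(update_exits_to_main_loop, max_steps=100):
    cpu = ToyCPU()
    main_loop_check(cpu)   # interrupt arrives while masked: not delivered

    # Block 1: the guest unmasks interrupts (a DISAS_UPDATE-style insn).
    cpu.irqs_masked = False
    if update_exits_to_main_loop:
        main_loop_check(cpu)

    # Block 2: the guest spins waiting for the handler to have run.
    for step in range(max_steps):
        if cpu.irq_handled:
            return step
        if cpu.exit_request:            # chained blocks only leave on request
            main_loop_check(cpu)
    return None                         # still spinning: the observed hang
```

With `run(False)` the model never delivers the interrupt within the step budget, while `run(True)` delivers it as soon as interrupts are unmasked.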

CC: Peter Maydell <[email protected]>
CC: Alex Bennée <[email protected]>
CC: [email protected]
Suggested-by: Peter Maydell <[email protected]>
Signed-off-by: Victor Kamensky <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1521526368 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20180323' into staging

s390x: Fixes for 2.12

- Fix for the s390 cpumodel
- Forbid multifunction PCI devices

# gpg: Signature made Fri 23 Mar 2018 09:06:31 GMT
# gpg:                using RSA key 117BBC80B5A61C7C
# gpg: Good signature from "Christian Borntraeger (IBM) <[email protected]>"
# Primary key fingerprint: F922 9381 A334 08F9 DBAB  FBCA 117B BC80 B5A6 1C7C

* remotes/borntraeger/tags/s390x-20180323:
  s390x/cpumodel: fix feature groups and breakage of MSA8
  s390x/pci: forbid multifunction pci device

Signed-off-by: Peter Maydell <[email protected]>

s390x/cpumodel: fix feature groups and breakage of MSA8

Since commit 46a99c9f73c7 ("s390x/cpumodel: model PTFF subfunctions
for Multiple-epoch facility") -cpu help no longer shows the MSA8
feature group. Turns out that we forgot to add the new MEPOCH_PTFF
group enum.

Fixes: 46a99c9f73c7 ("s390x/cpumodel: model PTFF subfunctions for Multiple-epoch facility")
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>

s390x/pci: forbid multifunction pci device

Currently we don't support PCI multifunction. If a multifunction PCI
device is plugged in, the guest will spin forever. This patch fixes
this by rejecting such devices.

Signed-off-by: Yi Min Zhao <[email protected]>
Reviewed-by: Pierre Morel <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>

gitmodules: Use the QEMU mirror of qemu-palcode

We have a mirror of the qemu-palcode repository on
git.qemu.org; use that instead of the upstream github,
in line with our general policy of keeping and using
a mirror for submodules.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 20180319131743 [email protected]

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Multiboot patches

# gpg: Signature made Wed 21 Mar 2018 14:38:36 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  tests/multiboot: Add .gitignore
  tests/multiboot: Add tests for the a.out kludge
  tests/multiboot: Test exit code for every qemu run
  multiboot: Check validity of mh_header_addr
  multiboot: Reject kernels exceeding the address space

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging

Pull request

# gpg: Signature made Wed 21 Mar 2018 14:37:05 GMT
# gpg:                using RSA key DAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <[email protected]>"
# gpg:                 aka "Marc-André Lureau <[email protected]>"
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/dump-pull-request:
  dump-guest-memory: more descriptive lookup_type failure
  dump.c: allow fd_write_vmcore to return errno on failure

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-03-21-1' into staging

Merge tpm 2018/03/21 v1

# gpg: Signature made Wed 21 Mar 2018 12:02:06 GMT
# gpg:                using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE  C66B 75AD 6580 2A0B 4211

* remotes/stefanberger/tags/pull-tpm-2018-03-21-1:
  tpm: CRB: query backend for TPM established flag
  tpm: CRB: reset locAssigned upon relinquishing locality
  tpm: CRB: set registers to 0 by default
  tpm: CRB: Set tpmRegValidSts flag to '1' in device reset

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-2.12-pull-request' into staging

# gpg: Signature made Tue 20 Mar 2018 20:43:37 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier (Red Hat) <[email protected]>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-2.12-pull-request:
  linux-user: init_guest_space: Try to make ARM space+commpage continuous

Signed-off-by: Peter Maydell <[email protected]>

tests/multiboot: Add .gitignore

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Jack Schwartz <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

tests/multiboot: Add tests for the a.out kludge

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Jack Schwartz <[email protected]>

tests/multiboot: Test exit code for every qemu run

Testing the exit code only once after a whole group of tests has
completed is not enough: it catches errors only in the very last qemu
invocation. We need to have the check after each qemu run.

The logging and diff with the reference output is still done once per
group to keep things more manageable. This is not a problem because the
log file accumulates the output of all runs.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Jack Schwartz <[email protected]>

multiboot: Check validity of mh_header_addr

I couldn't find a case where this prevents something bad from happening
that isn't already caught by other checks, but let's err on the safe
side and check that mh_header_addr is as expected.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Jack Schwartz <[email protected]>

multiboot: Reject kernels exceeding the address space

The code path where mh_load_end_addr is non-zero in the Multiboot
header checks that mh_load_end_addr >= mh_load_addr and so
mb_load_size is checked. However, mb_load_size is not checked when
calculated from the file size, when mh_load_end_addr is 0.

If the kernel binary size is larger than can fit in the address space
after load_addr, we ended up with a kernel_size that is smaller than
load_size, which means that we read the file into a too small buffer.

Add a check to reject kernel files with such Multiboot headers.
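
A minimal sketch of the size computation with the added check, assuming 32-bit Multiboot addresses. The function name, parameter names and the exact inclusive bound are illustrative assumptions, not the code in the loader.

```python
UINT32_MAX = 0xFFFFFFFF  # Multiboot header addresses are 32-bit


def multiboot_load_size(file_size, text_offset, load_addr, load_end_addr=0):
    """Return the number of bytes to load, or raise if the header is bad."""
    if load_end_addr != 0:
        # Path that was already checked: explicit end address in the header.
        if load_end_addr < load_addr:
            raise ValueError("invalid load_end_addr")
        return load_end_addr - load_addr

    # Path that was previously unchecked: size derived from the file size.
    size = file_size - text_offset
    if size > UINT32_MAX - load_addr:
        raise ValueError("kernel does not fit in address space")
    return size
```

Rejecting the file at this point means a later buffer allocation can never be smaller than the amount of data read into it.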

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Jack Schwartz <[email protected]>

dump-guest-memory: more descriptive lookup_type failure

We've seen a few reports of

(gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
Traceback (most recent call last):
File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in <module>
UINTPTR_T = gdb.lookup_type("uintptr_t")
gdb.error: No type named uintptr_t.

This occurs when symbols haven't been loaded first, i.e. neither a
QEMU binary was loaded nor a QEMU process was attached first. Let's
better inform the user of how to fix the issue themselves in order
to avoid more reports.

Acked-by: Janosch Frank <[email protected]>
Signed-off-by: Andrew Jones <[email protected]>
Message-Id: <20180314153820 [email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Tested-by: Fam Zheng <[email protected]>
Reviewed-by: Laszlo Ersek <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Marc-André Lureau <[email protected]>

dump.c: allow fd_write_vmcore to return errno on failure

fd_write_vmcore can fail to execute for a lot of reasons that can be
retrieved via errno, but it only returns -1. This makes it difficult for
the caller to know what happened, and only a generic error message is
propagated back to the user. This is an example using dump-guest-memory:

(qemu) dump-guest-memory /home/yasmin/mnt/test.dump
dump: failed to save memory

All callers of fd_write_vmcore in dump.c do error handling via
error_setg(), so at first it seems feasible to add the Error pointer as
an argument of fd_write_vmcore. This proved to be more complex than it
first looked. fd_write_vmcore is used by write_elf64_notes and
write_elf32_notes as a WriteCoreDumpFunction prototype. WriteCoreDumpFunction
is declared in include/qom/cpu.h and is used all around the code. This
leaves us with a few alternatives:

- change the WriteCoreDumpFunction prototype to include an error pointer.
This would require changing all functions that implement this prototype
to also receive an Error pointer;

- change both write_elf64_notes and write_elf32_notes to not use the
WriteCoreDumpFunction. These functions use not only fd_write_vmcore
but also buf_write_note, so this would require changing buf_write_note
to handle an Error pointer. Considerably easier than the alternative
above, but it's still a lot of code just for the benefit of the callers
of fd_write_vmcore.

This patch presents an easier solution that benefits all fd_write_vmcore
callers:

- instead of returning -1 on error, return -errno. All existing callers
already check for ret < 0 so there is no need to change the caller's
logic too much. This also allows the retrieval of the errno.

- all callers were updated to use error_setg_errno instead of just
error_setg. Now that fd_write_vmcore can return an errno, let's update
all callers so they can benefit from a more detailed error message.

This is the same dump-guest-memory example with this patch applied:

(qemu) dump-guest-memory /home/yasmin/mnt/test.dump
dump: failed to save memory: No space left on device
(qemu)

This example illustrates an error of fd_write_vmcore when called
from write_data. All other callers will benefit from better
error messages as well.
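
The return convention can be sketched outside of QEMU as follows. This is only an illustration of the "-errno on failure, checked with ret < 0" pattern; `fd_write_all` and `describe_failure` are hypothetical helpers, not functions from dump.c.

```python
import errno
import os


def fd_write_all(fd, buf):
    """Write all of buf to fd; return 0 on success or -errno on failure."""
    written = 0
    try:
        while written < len(buf):
            written += os.write(fd, buf[written:])
    except OSError as e:
        return -e.errno          # negative errno instead of a bare -1
    return 0


def describe_failure(ret, what):
    """Build a detailed message the way a caller checking ret < 0 could."""
    if ret < 0:
        return f"{what}: {os.strerror(-ret)}"
    return None
```

A caller that previously tested only `ret < 0` keeps working unchanged, and can now pass `-ret` to a strerror-style formatter to get text such as the "No space left on device" suffix shown above.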

Reported-by: [email protected]
Cc: Jose Ricardo Ziviani <[email protected]>
Signed-off-by: Yasmin Beatriz <[email protected]>
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Message-Id: <20180212142506 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Marc-André Lureau <[email protected]>

tpm: CRB: query backend for TPM established flag

Signed-off-by: Stefan Berger <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>

tpm: CRB: reset locAssigned upon relinquishing locality

Signed-off-by: Stefan Berger <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>

tpm: CRB: set registers to 0 by default

Initialize all registers of the CRB device to 0. This clears a few
flags upon a reset.

Signed-off-by: Stefan Berger <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>

tpm: CRB: Set tpmRegValidSts flag to '1' in device reset

Fix the initialization of the tpmRegValidSts flag and set it to '1'
during device reset without expecting a write to another register.
This seems to also be the default behavior of real hardware.

Signed-off-by: Stefan Berger <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>

Update version for v2.12.0-rc0 release

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20180320' into staging

HMP fixes for 2.12

# gpg: Signature made Tue 20 Mar 2018 12:39:24 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20180320:
  hmp: free sev info
  HMP: Initialize err before using

Signed-off-by: Peter Maydell <[email protected]>

linux-user: init_guest_space: Try to make ARM space+commpage continuous

At a fixed distance after the usable memory that init_guest_space maps, for
32-bit ARM targets we also need to map a commpage.  The normal
init_guest_space logic doesn't keep this in mind when searching for an
address range.

If !host_start, then try to find a big continuous segment where we can put
both the usable memory and the commpage; we then munmap that segment and
set current_start to that address; and let the normal code mmap the usable
memory and the commpage separately.  That is: if we don't have a hint of
where to start looking for memory, come up with one that is better than
NULL.  Depending on host_size and guest_start, there may or may not be a
gap between the usable memory and the commpage, so this is slightly more
restrictive than it needs to be; but it's only a hint, so that's OK.

We only do that for !host_start, because if host_start, then either:
- we got an address passed in with -B, in which case we don't want to
   interfere with what the user said;
- or host_start is based off of the ELF image's loaddr.  The check "if
   (host_start && real_start != current_start)" suggests that we really
   want the lowest available address that is >= loaddr.  I don't know why
   that is, but I'm trusting that Paul Brook knew what he was doing when he
   wrote the original version of that check in
   c581deda322080e8beb88b2e468d4af54454e4b3 way back in 2010.

Signed-off-by: Luke Shumaker <[email protected]>
Message-Id: <20171228180814 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pci,pc: features, cleanups

SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci

Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg:                 aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (51 commits)
  postcopy shared docs
  libvhost-user: Claim support for postcopy
  postcopy: Allow shared memory
  vhost: Huge page align and merge
  vhost+postcopy: Wire up POSTCOPY_END notify
  vhost-user: Add VHOST_USER_POSTCOPY_END message
  libvhost-user: mprotect & madvises for postcopy
  vhost+postcopy: Call wakeups
  vhost+postcopy: Add vhost waker
  postcopy: postcopy_notify_shared_wake
  postcopy: helper for waking shared
  vhost+postcopy: Resolve client address
  postcopy-ram: add a stub for postcopy_request_shared_page
  vhost+postcopy: Helper to send requests to source for shared pages
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send address back to qemu
  libvhost-user+postcopy: Register new regions with the ufd
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy+vhost-user: Split set_mem_table for postcopy
  vhost+postcopy: Transmit 'listen' to slave
  ...

Signed-off-by: Peter Maydell <[email protected]>
# Conflicts:
# scripts/update-linux-headers.sh

postcopy shared docs

Add some notes to the migration documentation for shared memory
postcopy.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

libvhost-user: Claim support for postcopy

Tell QEMU we understand the protocol features needed for postcopy.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: Allow shared memory

Now that we have the mechanisms in here, allow shared memory in a
postcopy.

Note that QEMU can't tell who all the users of shared regions are
and thus can't tell whether all the users of the shared regions
have appropriate support for postcopy. Those devices that explicitly
support shared memory (e.g. vhost-user) must check, but it doesn't
stop weirder configurations causing problems.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost: Huge page align and merge

Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.

This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.
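
The align-then-merge idea can be sketched as below. This is an assumption-laden model, not the vhost code: regions are plain `(start, size, page_size)` tuples in one address space, and the other conditions a real merge would have to check are ignored.

```python
def align_region(start, size, page_size):
    """Expand [start, start + size) outwards to page_size boundaries."""
    mask = page_size - 1                      # page_size must be a power of 2
    aligned_start = start & ~mask
    aligned_end = (start + size + mask) & ~mask
    return aligned_start, aligned_end


def align_and_merge(regions):
    """Align each region to its own page size, then merge any overlaps."""
    spans = sorted(align_region(s, n, p) for s, n, p in regions)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Alignment made this span touch or overlap the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Two regions that were disjoint before alignment can overlap afterwards, which is why the merge step has to cope with partial overlap.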

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Wire up POSTCOPY_END notify

Wire up a call to the VHOST_USER_POSTCOPY_END message to the vhost clients
right before we ask the listener thread to shut down.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost-user: Add VHOST_USER_POSTCOPY_END message

This message is sent just before the end of postcopy to get the
client to stop using userfault since we won't respond to any more
requests. It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

libvhost-user: mprotect & madvises for postcopy

Clear the area and turn off THP.
PROT_NONE the area until after we've userfault advised it
to catch any unexpected changes.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Call wakeups

Cause the vhost-user client to be woken up whenever:
a) We place a page in postcopy mode
b) We get a fault and the page has already been received

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Add vhost waker

Register a waker function in vhost-user code to be notified when
pages arrive or when previously mapped pages are requested.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: postcopy_notify_shared_wake

Add a hook to allow a client userfaultfd to be 'woken'
when a page arrives, and a walker that calls that
hook for relevant clients given a RAMBlock and offset.
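
A rough sketch of the hook-plus-walker shape, with hypothetical names (`SharedClient`, `notify_shared_wake`). It assumes that "relevant" simply means "has a waker registered", which may be narrower or wider than what the real code does.

```python
class SharedClient:
    """A registered client userfaultfd with an optional wake hook."""

    def __init__(self, name, waker=None):
        self.name = name
        self.waker = waker      # callable(client, ramblock, offset) or None


def notify_shared_wake(clients, ramblock, offset):
    """Walk the clients and call each registered hook; return the count."""
    called = 0
    for client in clients:
        if client.waker is None:
            continue            # this client did not ask to be woken
        client.waker(client, ramblock, offset)
        called += 1
    return called
```

The walker knows nothing about how a client is woken; that stays inside the hook each client registers.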

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: helper for waking shared

Provide a helper to send a 'wake' request on a userfaultfd for
a shared process.
The address in the client's address space is specified together
with the RAMBlock it was resolved to.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Resolve client address

Resolve fault addresses read off the client's UFD into RAMBlock
and offset, and call back to the postcopy code to ask for the page.
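
The resolution step amounts to a range lookup. In this sketch the `Region` fields and the function name are illustrative assumptions; it only shows how a client-side fault address maps to a (RAMBlock, offset) pair.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    client_addr: int   # where the client mapped the region
    size: int          # length of the mapping in bytes
    ramblock: str      # name of the RAMBlock backing it
    rb_offset: int     # offset of the mapping's start within the RAMBlock


def resolve_client_address(
    regions: List[Region], fault_addr: int
) -> Optional[Tuple[str, int]]:
    """Translate a client fault address into (ramblock, offset), if known."""
    for r in regions:
        if r.client_addr <= fault_addr < r.client_addr + r.size:
            return r.ramblock, r.rb_offset + (fault_addr - r.client_addr)
    return None        # address does not belong to any known region
```

The returned pair is what the postcopy code would need in order to request the page.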

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy-ram: add a stub for postcopy_request_shared_page

This fixes the build on systems without userfaultfd.

Signed-off-by: Michael S. Tsirkin <[email protected]>

Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.12-pull-request' into staging

# gpg: Signature made Tue 20 Mar 2018 09:07:55 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier (Red Hat) <[email protected]>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.12-pull-request:
  target/m68k: add a mechanism to automatically free TCGv
  target/m68k: add DisasContext parameter to gen_extend()

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging

Machine and x86 queue, 2018-03-19

* cpu_model/cpu_type cleanups
* x86: Fix on Intel Processor Trace CPUID checks

# gpg: Signature made Mon 19 Mar 2018 20:07:14 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-next-pull-request:
  i386: Disable Intel PT if packets IP payloads have LIP values
  cpu: drop unnecessary NULL check and cpu_common_class_by_name()
  cpu: get rid of unused cpu_init() defines
  Use cpu_create(type) instead of cpu_init(cpu_model)
  cpu: add CPU_RESOLVING_TYPE macro
  tests: add machine 'none' with -cpu test
  nios2: 10m50_devboard: replace cpu_model with cpu_type

Signed-off-by: Peter Maydell <[email protected]>

hmp: free sev info

Found thanks to ASAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124
    #2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355
    #3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414
    #4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684
    #5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333

Fixes: 63036314
Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20180319175823 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

HMP: Initialize err before using

When bdrv_snapshot_delete fails, errp is not assigned a valid value
in error_propagate because it was not initialized in hmp_delvm.
error_reportf_err (called by hmp_delvm) then uses an uninitialized
value, and qemu crashes.

Signed-off-by: zhangjixiang <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

RISC-V: Fix riscv_isa_string memory size bug

This version uses a constant size memory buffer sized for
the maximum possible ISA string length. It also uses g_new
instead of g_new0, uses more efficient logic to append
extensions and adds manual zero termination of the string.
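
The approach can be mimicked with a fixed-size buffer, as in the sketch below. The "rv128" worst-case prefix, the alphabetical extension order and the `isa_string` name are assumptions for illustration and may differ from the actual function.

```python
import string

# Longest assumed prefix + one letter per possible extension + terminator.
MAX_LEN = len("rv128") + len(string.ascii_lowercase) + 1


def isa_string(xlen, misa):
    """Build an ISA string from extension bits (bit 0 = 'a', bit 1 = 'b', ...)."""
    buf = bytearray(MAX_LEN)                 # constant-size buffer
    prefix = f"rv{xlen}".encode("ascii")
    buf[: len(prefix)] = prefix
    pos = len(prefix)
    for bit, letter in enumerate(string.ascii_lowercase):
        if misa & (1 << bit):
            buf[pos] = ord(letter)           # append in place, no reallocation
            pos += 1
    buf[pos] = 0                             # explicit terminator
    return buf[:pos].decode("ascii")
```

Because the buffer is sized for every extension bit being set, the append loop can never run past the end, whatever the value of `misa`.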

Cc: Palmer Dabbelt <[email protected]>
Cc: Peter Maydell <[email protected]>
Signed-off-by: Michael Clark <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
[PMM: Use qemu_tolower() rather than tolower()]
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-03-12-v4' into staging

qapi patches for 2018-03-12, 2.12 softfreeze

- Marc-André Lureau: 0/4 qapi: generate a literal qobject for introspection
- Max Reitz: 0/7 block: Handle null backing link
- Daniel P. Berrange: chardev: tcp: postpone TLS work until machine done
- Peter Xu: 00/23 QMP: out-of-band (OOB) execution support
- Vladimir Sementsov-Ogievskiy: 0/2 block latency histogram
- Eric Blake: qapi: Pass '-u' when doing non-silent diff

# gpg: Signature made Mon 19 Mar 2018 19:59:04 GMT
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-qapi-2018-03-12-v4: (38 commits)
  qapi: Pass '-u' when doing non-silent diff
  qapi: add block latency histogram interface
  block/accounting: introduce latency histogram
  tests: qmp-test: add oob test
  tests: qmp-test: verify command batching
  qmp: add command "x-oob-test"
  monitor: enable IO thread for (qmp & !mux) typed
  qmp: isolate responses into io thread
  qmp: support out-of-band (oob) execution
  qapi: introduce new cmd option "allow-oob"
  monitor: send event when command queue full
  qmp: add new event "command-dropped"
  monitor: separate QMP parser and dispatcher
  monitor: let suspend/resume work even with QMPs
  monitor: let suspend_cnt be thread safe
  monitor: introduce monitor_qmp_respond()
  qmp: introduce QMPCapability
  monitor: allow using IO thread for parsing
  monitor: let mon_list be tail queue
  monitor: unify global init
  ...

Signed-off-by: Peter Maydell <[email protected]>

target/m68k: add a mechanism to automatically free TCGv

SRC_EA() and gen_extend() can return either a temporary
TCGv or a memory allocated one. Mark them when they are
allocated, and free them automatically at end of the
instruction translation.

We want to free locally allocated TCGv to avoid
overflow in sequences like:

  0xc00ae406:  movel %fp@(-132),%fp@(-268)
  0xc00ae40c:  movel %fp@(-128),%fp@(-264)
  0xc00ae412:  movel %fp@(-20),%fp@(-212)
  0xc00ae418:  movel %fp@(-16),%fp@(-208)
  0xc00ae41e:  movel %fp@(-60),%fp@(-220)
  0xc00ae424:  movel %fp@(-56),%fp@(-216)
  0xc00ae42a:  movel %fp@(-124),%fp@(-252)
  0xc00ae430:  movel %fp@(-120),%fp@(-248)
  0xc00ae436:  movel %fp@(-12),%fp@(-260)
  0xc00ae43c:  movel %fp@(-8),%fp@(-256)
  0xc00ae442:  movel %fp@(-52),%fp@(-276)
  0xc00ae448:  movel %fp@(-48),%fp@(-272)
  ...

That can fill a lot of TCGv entries in a sequence,
especially since 15fa08f845 ("tcg: Dynamically allocate TCGOps"):
we no longer have a limit on filling the TCGOps cache, so we can
fill the entire TCG variables array and overflow it.
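
The mark-and-release mechanism can be sketched with a toy pool. `TempPool`, `ToyDisasContext` and `translate` are hypothetical stand-ins, not the m68k translator; the pool capacity is arbitrary.

```python
class TempPool:
    """Fixed-size stand-in for the TCG variables array."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.live = 0

    def new_temp(self):
        if self.live >= self.capacity:
            raise OverflowError("temporary pool exhausted")
        self.live += 1
        return object()

    def free_temp(self, temp):
        self.live -= 1


class ToyDisasContext:
    def __init__(self, pool):
        self.pool = pool
        self.to_release = []            # temporaries marked for auto-free

    def new_marked_temp(self):
        temp = self.pool.new_temp()
        self.to_release.append(temp)    # mark at allocation time
        return temp

    def end_of_instruction(self):
        for temp in self.to_release:    # free everything that was marked
            self.pool.free_temp(temp)
        self.to_release.clear()


def translate(num_insns, pool, auto_free):
    """Translate num_insns memory-to-memory moves, two temporaries each."""
    ctx = ToyDisasContext(pool)
    for _ in range(num_insns):
        ctx.new_marked_temp()           # source operand
        ctx.new_marked_temp()           # destination operand
        if auto_free:
            ctx.end_of_instruction()
    return pool.live
```

With `auto_free=True` a long run of such instructions never holds more than two temporaries at once; with `auto_free=False` the same run exhausts a small pool.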

Suggested-by: Richard Henderson <[email protected]>
Signed-off-by: Laurent Vivier <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20180319113544 [email protected]>

target/m68k: add DisasContext parameter to gen_extend()

This parameter will be needed to manage automatic release
of temporarily allocated TCG variables.

Signed-off-by: Laurent Vivier <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20180319113544 [email protected]>

vhost+postcopy: Helper to send requests to source for shared pages

Provide a helper to be used by shared waker functions to request
shared pages from the source.
The last_rb pointer is moved into the incoming state since this
helper can update it as well as the main fault thread function.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Stash RAMBlock and offset

Stash the RAMBlock and offset for later use looking up
addresses.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Send address back to qemu

We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.

This is done as a 3 stage set:
   QEMU -> client
      set_mem_table

   mmap stuff, get addresses

   client -> qemu
       here are the addresses

   qemu -> client
       OK - now you can use them

That ensures that qemu has registered the new addresses in its
userfault code before the client starts accessing them.

Note: We don't ask for the default 'ack' reply since we've got our own.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

libvhost-user+postcopy: Register new regions with the ufd

When new regions are sent to the client using SET_MEM_TABLE, register
them with the userfaultfd.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

migration/ram: ramblock_recv_bitmap_test_byte_offset

Utility for testing the map when you already know the offset
in the RAMBlock.
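
Such a helper boils down to converting a byte offset into a page index and testing one bit. The sketch below assumes 4 KiB pages and a `bytearray` bitmap; the names and the bit layout are illustrative only.

```python
PAGE_BITS = 12  # assumed 4 KiB target pages


def recv_bitmap_set_byte_offset(bitmap, byte_offset, page_bits=PAGE_BITS):
    """Mark the page containing byte_offset as received."""
    page = byte_offset >> page_bits
    bitmap[page // 8] |= 1 << (page % 8)


def recv_bitmap_test_byte_offset(bitmap, byte_offset, page_bits=PAGE_BITS):
    """Return True if the page containing byte_offset has been received."""
    page = byte_offset >> page_bits
    return bool(bitmap[page // 8] & (1 << (page % 8)))
```

Any offset inside a page maps to the same bit, so the caller does not need to align the offset first.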

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy+vhost-user: Split set_mem_table for postcopy

Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Transmit 'listen' to slave

Notify the vhost-user slave on reception of the 'postcopy-listen'
event from the source.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost+postcopy: Register shared ufd with postcopy

Register the UFD that comes in as the response to the 'advise' method
with the postcopy code.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: Allow registering of fd handler

Allow other userfaultfds to be registered with the fault thread
so that handlers for shared memory can get responses.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

libvhost-user: Open userfaultfd

Open a userfaultfd (on a postcopy_advise) and send it back in
the reply to QEMU for it to monitor.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

libvhost-user: Support sending fds back to qemu

Allow replies with fds (for postcopy)

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message

Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
message on an incoming advise.

Later patches will fill in the behaviour/contents of the
message.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: Add vhost-user flag for postcopy and check it

Add a vhost feature flag for postcopy support, and
use the postcopy notifier to check it before allowing postcopy.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: Add notifier chain

Add a notifier chain for postcopy with a 'reason' flag
and an opportunity for a notifier member to return an error.

Call it when enabling postcopy.

This will initially be used to let devices declare that they're unable
to postcopy, and later to notify devices of stages within postcopy.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

postcopy: use UFFDIO_ZEROPAGE only when available

Use a flag on the RAMBlock to state whether it has the
UFFDIO_ZEROPAGE capability, and use it only when it's available.

This allows the use of postcopy on tmpfs as well as hugepage
backed files.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

qemu_ram_block_host_offset

Utility to give the offset of a host pointer within a RAMBlock
(assuming we already know it's in that RAMBlock)
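
A minimal sketch of the idea, assuming a simplified block type (the
Block struct and the function name below are hypothetical stand-ins,
not the QEMU definitions): the offset is the pointer difference from
the block's host base address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block: host base address and length. */
typedef struct {
    uint8_t *host;
    size_t used_length;
} Block;

/*
 * Offset of host_ptr within the block. The caller must already know
 * that host_ptr points inside the block; this is not checked here.
 */
static size_t block_host_offset(const Block *b, const void *host_ptr)
{
    return (size_t)((const uint8_t *)host_ptr - b->host);
}
```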

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

migrate: Update ram_block_discard_range for shared

The choice of call to discard a block is getting more complicated
as more cases are supported. We use fallocate PUNCH_HOLE for all
file-backed cases; it works for both hugepage and tmpfs.
We use DONTNEED for non-hugepage cases that are either anonymous
or private.

Care should be taken when trying other backing files.
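
The selection rule described above can be sketched as follows. This is
an illustration under stated assumptions, not the patch's code: the
BackingInfo struct, the flag names and discard_methods() are
hypothetical, and the sketch only returns which calls would apply
rather than issuing fallocate() or madvise() itself.

```c
#include <stdbool.h>

/* Hypothetical description of a block's backing; not QEMU's RAMBlock. */
typedef struct {
    bool file_backed;  /* mapped from a file (hugetlbfs, tmpfs, ...) */
    bool shared;       /* mapped shared rather than private */
    bool hugepage;     /* backed by pages larger than the host page size */
} BackingInfo;

enum {
    DISCARD_PUNCH_HOLE = 1 << 0,  /* fallocate() hole punching */
    DISCARD_DONTNEED   = 1 << 1   /* madvise() DONTNEED */
};

/* Return the set of discard calls that the rule above selects. */
static unsigned discard_methods(const BackingInfo *b)
{
    unsigned methods = 0;

    if (b->file_backed) {
        methods |= DISCARD_PUNCH_HOLE;
    }
    /* Non-hugepage, and either anonymous or privately mapped. */
    if (!b->hugepage && (!b->file_backed || !b->shared)) {
        methods |= DISCARD_DONTNEED;
    }
    return methods;
}
```

Note that a privately mapped, non-hugepage file matches both rules in
this reading, so both calls are selected for it.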

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

Makefile: add target to print generated files

This is helpful for automatic code analysis.

Signed-off-by: Michael S. Tsirkin <[email protected]>

test/acpi-test-data: add ACPI tables for dimmpxm test

Reviewers can use ACPI tables in this patch to run
test_acpi_{piix4,q35}_tcg_dimm_pxm cases.

Signed-off-by: Haozhong Zhang <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

tests/bios-tables-test: add test cases for DIMM proximity

QEMU now builds one SRAT memory affinity structure for each PC-DIMM
and NVDIMM device presented at boot time with the proximity domain
specified in the device option 'node', rather than only one SRAT
memory affinity structure covering the entire hotpluggable address
space with the proximity domain of the last node.

Add test cases on PC and Q35 machines with 4 proximity domains, and
one PC-DIMM and one NVDIMM attached to the 2nd and 3rd proximity
domains respectively. Check whether the QEMU-built SRAT tables match
the expected ones.

The following ACPI tables need to be added for this test:
  tests/acpi-test-data/pc/APIC.dimmpxm
  tests/acpi-test-data/pc/DSDT.dimmpxm
  tests/acpi-test-data/pc/NFIT.dimmpxm
  tests/acpi-test-data/pc/SRAT.dimmpxm
  tests/acpi-test-data/pc/SSDT.dimmpxm
  tests/acpi-test-data/q35/APIC.dimmpxm
  tests/acpi-test-data/q35/DSDT.dimmpxm
  tests/acpi-test-data/q35/NFIT.dimmpxm
  tests/acpi-test-data/q35/SRAT.dimmpxm
  tests/acpi-test-data/q35/SSDT.dimmpxm
New APIC and DSDT are needed because of the multi-processor
configuration. New NFIT and SSDT are needed because of NVDIMM.

Signed-off-by: Haozhong Zhang <[email protected]>
Suggested-by: Igor Mammedov <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/acpi-build: build SRAT memory affinity structures for DIMM devices

ACPI 6.2A Table 5-129 "SPA Range Structure" requires that the proximity
domain of an NVDIMM SPA range match the corresponding entry in the
SRAT table.

The address ranges of vNVDIMM in QEMU are allocated from the
hot-pluggable address space, which is entirely covered by one SRAT
memory affinity structure. However, users can set the vNVDIMM
proximity domain in NFIT SPA range structure by the 'node' property of
'-device nvdimm' to a value different than the one in the above SRAT
memory affinity structure.

In order to solve such proximity domain mismatch, this patch builds
one SRAT memory affinity structure for each DIMM device present at
boot time, including both PC-DIMM and NVDIMM, with the proximity
domain specified in '-device pc-dimm' or '-device nvdimm'.

The remaining hot-pluggable address space is covered by one or multiple
SRAT memory affinity structures with the proximity domain of the last
node as before.

Signed-off-by: Haozhong Zhang <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList

Callers may need to treat PC-DIMM and NVDIMM differently, e.g., when
deciding whether the non-volatile flag bit is needed in SRAT memory
affinity structures.

A new field 'nvdimm' is added to the union type MemoryDeviceInfo for
such purpose. Its type is currently PCDIMMDeviceInfo and will be
updated when necessary in the future.

It also fixes "info memory-devices"/query-memory-devices, which
currently show nvdimm devices as dimm devices because
object_dynamic_cast(obj, TYPE_PC_DIMM) happily casts nvdimm to
TYPE_PC_DIMM, from which it inherits.

Signed-off-by: Haozhong Zhang <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pc-dimm: make qmp_pc_dimm_device_list() sort devices by address

Make qmp_pc_dimm_device_list() return a list of devices sorted by
start address so that it can be reused in places that need a
sorted list*. Reuse the existing pc_dimm_built_list() to get the
sorted list.

While at it, hide the recursive callbacks from callers, so that:

qmp_pc_dimm_device_list(qdev_get_machine(), &list);

could be replaced with simpler:

list = qmp_pc_dimm_device_list();

* a follow-up patch will use it in build_srat()
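
For illustration, ordering entries by start address can be sketched
with qsort() on an array. This is an assumption-laden sketch: DimmEntry
and both functions are hypothetical, and it does not reproduce how
pc_dimm_built_list() builds its list.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of the device list. */
typedef struct {
    const char *id;
    uint64_t addr;   /* start address of the device */
} DimmEntry;

/* qsort() comparator: ascending start address, safe against overflow. */
static int dimm_addr_cmp(const void *a, const void *b)
{
    uint64_t x = ((const DimmEntry *)a)->addr;
    uint64_t y = ((const DimmEntry *)b)->addr;

    return (x > y) - (x < y);
}

/* Sort the entries in place by start address. */
static void dimm_sort_by_addr(DimmEntry *entries, size_t n)
{
    qsort(entries, n, sizeof(entries[0]), dimm_addr_cmp);
}
```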

Signed-off-by: Haozhong Zhang <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Acked-by: David Gibson <[email protected]> for ppc part
Reviewed-by: Bharata B Rao <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hw/pci: remove obsolete PCIDevice->init()

All PCI devices are now QOM'ified.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Marcel Apfelbaum <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

standard-headers: update virtio_net.h

include speed/duplex fields

Signed-off-by: Michael S. Tsirkin <[email protected]>

i386: Disable Intel PT if packets IP payloads have LIP values

Intel Processor Trace should be disabled when
CPUID.(EAX=14H,ECX=0H).ECX.[bit31] is set.
When this bit is set, generated packets that contain IP payloads
carry LIP values; otherwise they carry RIP values.
Currently, the CPUID 14H information exposed to the guest is constant
to keep live migration safe, and this bit is always 0 in the guest
even if the host supports LIP values.
A guest that sees the bit as 0 will expect IP payloads with RIP
values, but the host CPU will generate IP payloads with LIP values
if this bit is set in hardware.
To make sure the guest interprets IP payloads correctly, Intel PT
should be disabled when bit[31] is set.
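
A minimal sketch of the check described above, assuming the caller has
already read the host's CPUID.(EAX=14H,ECX=0H) ECX value. The macro and
function names are hypothetical and are not the ones used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of CPUID.(EAX=14H,ECX=0H):ECX, as named in the text above. */
#define PT_LIP_BIT 31u

/* True if IP payloads in generated packets carry LIP values. */
static bool pt_ip_payloads_are_lip(uint32_t cpuid_14_0_ecx)
{
    return (cpuid_14_0_ecx >> PT_LIP_BIT) & 1u;
}

/*
 * The guest always sees the bit as 0 and so expects RIP values;
 * Intel PT can only be offered when the host bit is clear as well.
 */
static bool pt_can_be_exposed(uint32_t host_cpuid_14_0_ecx)
{
    return !pt_ip_payloads_are_lip(host_cpuid_14_0_ecx);
}
```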

Signed-off-by: Luwei Kang <[email protected]>
Message-Id: <1520969191 [email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>

qapi: Pass '-u' when doing non-silent diff

Ed-script diffs are awful compared to context diffs. Fix another
'diff -q' while in the area (if the files are different, being
noisy makes it easier to diagnose why).

While at it, diff .err before .out, because if a test fails, .err
is more likely to contain the most important information for
fixing the failure.

Fixes: 46ec4fce
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20180315125116 [email protected]>

qapi: add block latency histogram interface

Set (and clear) histograms through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.

For now, the command is marked experimental with prefix 'x-',
to gain experience with the interface without being stuck
with design decisions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Message-Id: <20180309165212 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
[eblake: fix typos, mention x- prefix in commit message]
Signed-off-by: Eric Blake <[email protected]>

block/accounting: introduce latency histogram

Introduce latency histogram statistics for block devices.
For each accounted operation type, the latency region [0, +inf) is
divided into subregions by several points. Then, calculate
hits for each subregion.
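
The binning scheme can be sketched roughly as below. The LatencyHist
layout and hist_account() are hypothetical, and treating each subregion
as half-open, [bounds[i-1], bounds[i]), is an assumption of this sketch
rather than something stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: nb_bounds sorted boundary points split
 * [0, +inf) into nb_bounds + 1 subregions, and bins[] holds one
 * hit counter per subregion.
 */
typedef struct {
    const uint64_t *bounds;
    size_t nb_bounds;
    uint64_t *bins;
} LatencyHist;

/* Count one hit in the subregion that contains the given latency. */
static void hist_account(LatencyHist *h, uint64_t latency)
{
    size_t lo = 0;
    size_t hi = h->nb_bounds;

    /* Binary search for the number of boundaries <= latency. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (h->bounds[mid] <= latency) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    h->bins[lo]++;
}
```

With boundaries {10, 50, 100}, a latency of 5 lands in the first bin,
10 and 49 in the second, and anything from 100 upwards in the last.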

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Message-Id: <20180309165212 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Eric Blake <[email protected]>

tests: qmp-test: add oob test

Test the new OOB capability. Here we use the new "x-oob-test" command.
First, we send a lock=true and oob=false command to hang the main
thread. Then we send another lock=false and oob=true command (which
will be run inside the parser this time) to release the hung command.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <[email protected]>

tests: qmp-test: verify command batching

OOB introduced DROP event for flow control. This should not affect old
QMP clients. Add a command batching check to make sure of it.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Eric Blake <[email protected]>

qmp: add command "x-oob-test"

This command is only used to test OOB functionality. It should not be
used for any other purposes.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <[email protected]>

monitor: enable IO thread for (qmp & !mux) typed

Start using a dedicated IO thread for QMP monitors that are not using
a MUXed chardev.

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

qmp: isolate responses into io thread

For monitors that have the IO thread enabled, we offload the response
path to the IO thread.  The main reason is that chardev is not thread
safe, and we need to do all the read/write IOs in the same thread.
For use_io_thr=true monitors, that thread is the IO thread.

We do this isolation following the same pattern as for the request
queue: we first create one response queue for each monitor, then
instead of replying directly in the main thread, we queue the responses
and kick the IO thread to do the rest of the job for us.

A funny thing after doing this is that, when the QMP clients send "quit"
to QEMU, it's possible that we close the IOThread even earlier than
replying to that "quit".  So another thing we need to do before cleaning
up the monitors is that we need to flush the response queue (we don't
need to do that for command queue; after all we are quitting) to make
sure replies for handled commands are always flushed back to clients.

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

qmp: support out-of-band (oob) execution

Having "allow-oob":true for a command does not mean that this command
will always be run in out-of-band mode.  The out-of-band quick path will
only be executed if we specify the extra "run-oob" flag when sending the
QMP request:

    { "execute":   "command-that-allows-oob",
      "arguments": { ... },
      "control":   { "run-oob": true } }

The "control" key is introduced to store this extra flag.  The "control"
field holds arguments that are shared by all commands, rather than
command-specific arguments.  "run-oob" is the first of them.

Note that this patch exports qmp_dispatch_check_obj() so the request
can be checked earlier, and at the same time allows the "id" field to
be there, since in practice we have always allowed it.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: rebase to qobject_to(), spelling fix]
Signed-off-by: Eric Blake <[email protected]>

qapi: introduce new cmd option "allow-oob"

Here "oob" stands for "Out-Of-Band".  When "allow-oob" is set, it means
the command allows out-of-band execution.

The "oob" idea was proposed by Markus Armbruster in the following thread:

  https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02057.html

This new "allow-oob" boolean will be exposed by "query-qmp-schema" as
well for command entries, so that QMP clients can know which commands
can be used in out-of-band calls. For example the command "migrate"
originally looks like:

  {"name": "migrate", "ret-type": "17", "meta-type": "command",
   "arg-type": "86"}

And it'll be changed into:

  {"name": "migrate", "ret-type": "17", "allow-oob": false,
   "meta-type": "command", "arg-type": "86"}

This patch only provides the QMP interface level changes.  It does not
contain the real out-of-band execution implementation yet.

Suggested-by: Markus Armbruster <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: rebase on introspection done by qlit]
Signed-off-by: Eric Blake <[email protected]>

monitor: send event when command queue full

Set maximum QMP command queue length to 8. If the queue is full,
instead of queuing the command, we directly return a "command-dropped"
event, telling the client that a specific command is dropped.

Note that this flow control mechanism is only valid if OOB is enabled.
If it's not, the effective queue length will always be 1, which strictly
follows original behavior of QMP command handling (which never drops
messages).
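
A rough sketch of the flow control described above. The CmdQueue type
and both functions are hypothetical; only the limit of 8 and the
drop-when-full behaviour come from the text. Without OOB the sketch
assumes the caller stops reading after each command, which is why
nothing is dropped on that path.

```c
#include <stdbool.h>
#include <stddef.h>

#define QMP_QUEUE_LEN_MAX 8   /* limit stated in the commit message */

/* Hypothetical per-monitor queue state; only the length is modelled. */
typedef struct {
    size_t len;
    bool oob_enabled;
} CmdQueue;

typedef enum { CMD_QUEUED, CMD_DROPPED } EnqueueResult;

/*
 * Queue one command. When OOB is enabled and the queue is full, the
 * command is not queued and the caller should emit "command-dropped".
 */
static EnqueueResult cmd_enqueue(CmdQueue *q)
{
    if (q->oob_enabled && q->len >= QMP_QUEUE_LEN_MAX) {
        return CMD_DROPPED;
    }
    q->len++;
    return CMD_QUEUED;
}

/* Called once the dispatcher has taken a command off the queue. */
static void cmd_dequeue(CmdQueue *q)
{
    if (q->len > 0) {
        q->len--;
    }
}
```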

Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: commit message grammar, abort on failure to send event]
Signed-off-by: Eric Blake <[email protected]>

qmp: add new event "command-dropped"

This event will be emitted if one QMP command is dropped. Also,
declare an enum for the reasons.

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: rebase to master]
Signed-off-by: Eric Blake <[email protected]>

monitor: separate QMP parser and dispatcher

Originally QMP goes through these steps:

  JSON Parser --> QMP Dispatcher --> Respond
      /|\    (2)                (3)     |
   (1) |                               \|/ (4)
       +---------  main thread  --------+

This patch does this:

  JSON Parser     QMP Dispatcher --> Respond
      /|\ |           /|\       (4)     |
       |  | (2)        | (3)            |  (5)
   (1) |  +----->      |               \|/
       +---------  main thread  <-------+

So the parsing job and the dispatching job are now isolated.  It gives
us a chance in follow-up patches to move the parser outside entirely.

The isolation is done using one QEMUBH. Only one dispatcher QEMUBH is
used for all the monitors.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: grammar tweaks, rebase to qobject_to()]
Signed-off-by: Eric Blake <[email protected]>

monitor: let suspend/resume work even with QMPs

This patch allows QMP monitors to be suspended/resumed.

One thing to mention is that for QMPs that are using IOThreads, we need
an explicit kick for the IOThread in case it is sleeping.

Meanwhile, we need to take special care with non-interactive HMPs.
Currently only gdbserver is using that. For these monitors, we still
don't allow suspend/resume operations.

While at it, add traces for the operations.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Eric Blake <[email protected]>

monitor: let suspend_cnt be thread safe

Monitor code can now run in more than one thread. Make accesses to
the suspend_cnt counter thread safe.
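
As a minimal sketch of a thread-safe counter, the example below uses
C11 <stdatomic.h>. This is an assumption made for illustration: the
Mon struct and the function names are hypothetical, and the patch
itself may use different atomic helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the monitor; only the counter is modelled. */
typedef struct {
    atomic_int suspend_cnt;
} Mon;

/* Each suspend increments the counter atomically. */
static void mon_suspend(Mon *m)
{
    atomic_fetch_add(&m->suspend_cnt, 1);
}

/* Each resume undoes one earlier suspend. */
static void mon_resume(Mon *m)
{
    atomic_fetch_sub(&m->suspend_cnt, 1);
}

/* The monitor stays suspended while any suspend is outstanding. */
static bool mon_is_suspended(Mon *m)
{
    return atomic_load(&m->suspend_cnt) > 0;
}
```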

Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

monitor: introduce monitor_qmp_respond()

A tiny refactoring, preparing to split the QMP dispatcher away.

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: rebase to qobject_to() usage]
Signed-off-by: Eric Blake <[email protected]>

qmp: introduce QMPCapability

There were no QMP capabilities defined.  Define the first capability,
"oob", to allow out-of-band messages.

After this patch, we will allow QMP clients to enable QMP capabilities
when sending the first "qmp_capabilities" command.  Previously, a QMP
session was started with no arguments, like:

  { "execute": "qmp_capabilities" }

Now capabilities can be enabled as follows (taking OOB, the only
capability currently supported, as an example):

  { "execute": "qmp_capabilities",
    "arguments": { "enable": [ "oob" ] } }

When the "arguments" key is not provided, no capability is enabled.

For capability "oob", the monitor needs to be run on a dedicated IO
thread, otherwise the command will fail.  For example, trying to enable
OOB on a MUXed typed QMP monitor will fail.

One thing to mention is that QMP capabilities are per-monitor, and also
when the connection is closed due to some reason, the capabilities will
be reset.

Also, touch up qmp-test.c to test the new bits.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: touch up commit message]
Signed-off-by: Eric Blake <[email protected]>

monitor: allow using IO thread for parsing

For each Monitor, add one field "use_io_thr" to show whether it will be
using the dedicated monitor IO thread to handle input/output.  When set,
monitor IO parsing work will be offloaded to the dedicated monitor IO
thread, rather than the original main loop thread.

This only works for QMP.  HMP will always be run on the main loop
thread.

For now, use_io_thr is always kept off.  It will be turned on in a
later patch.

One thing to mention is that we cannot set use_io_thr for every QMP
monitor.  The problem is that MUXed chardevs may not work well with it
for now. When MUX is used, the frontend of the chardev can be the
monitor plus something else.  The only thing we know to be safe to run
outside the main thread so far is the monitor frontend. All the other
frontends should still be run in the main thread only.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: squash in Peter's followup patch to avoid test failures]
Signed-off-by: Eric Blake <[email protected]>

monitor: let mon_list be tail queue

It was a QLIST. I want to use this list for monitor priority work
later, which needs tail insertion. So switch to a tail queue.

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

monitor: unify global init

There are many places where the monitor initializes its globals:

- monitor_init_qmp_commands() at the very beginning
- single function to init monitor_lock
- in the first entry of monitor_init() using "is_first_init"

Unify them a bit.

monitor_lock is not used before monitor_init() (as confirmed by code
analysis and gdb watchpoints); so we are safe delaying what was a
constructor-time initialization of the mutex into the later first call
to monitor_init().

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

monitor: move the cur_mon hack deeper for QMP

In monitor_qmp_read(), we have the hack to temporarily replace the
cur_mon pointer. Now we move this hack deeper inside the QMP dispatcher
routine since the Monitor pointer can be actually obtained using
container_of() upon the parser object, just like most of the other JSON
parser users do.

This does not make much sense as a single patch. However, this will be
a big step for the next patch, when the QMP dispatcher routine will be
split from the QMP parser.

Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
[eblake: rebase context of qobject_to() macro]
Signed-off-by: Eric Blake <[email protected]>

monitor: move skip_flush into monitor_data_init

It's part of the data init. Collect it.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

qobject: let object_property_get_str() use new API

We can simplify object_property_get_str() using the new
qobject_get_try_str().

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
[eblake: rebase context of qobject_to() macro]
Signed-off-by: Eric Blake <[email protected]>

qobject: introduce qobject_get_try_str()

A quick way to fetch a string from a qobject when it's a QString.

Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: rebase to qobject_to() macro]
Signed-off-by: Eric Blake <[email protected]>

qobject: introduce qstring_get_try_str()

The only difference from qstring_get_str() is that it allows the qstring
to be NULL. If so, NULL is returned.
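
The NULL-tolerant variant can be sketched as below. QStr and both
accessors are hypothetical stand-ins written for this illustration;
they are not the QEMU definitions of QString or its helpers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a string object. */
typedef struct {
    const char *string;
} QStr;

/* Strict accessor: the caller must pass a non-NULL object. */
static const char *qstr_get_str(const QStr *s)
{
    return s->string;
}

/* Tolerant accessor: a NULL object yields NULL instead of crashing. */
static const char *qstr_get_try_str(const QStr *s)
{
    return s ? qstr_get_str(s) : NULL;
}
```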

CC: Eric Blake <[email protected]>
CC: Markus Armbruster <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Signed-off-by: Eric Blake <[email protected]>

docs: update QMP documents for OOB commands

Update both the developer documentation and the spec for the new QMP
OOB (Out-Of-Band) command.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180309090006 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <[email protected]>

chardev: tcp: postpone TLS work until machine done

The TLS handshake may create background GSource tasks, but we won't
know the correct GMainContext until the whole chardev (including the
frontend) is initialized. Let's postpone the initial TLS handshake
until machine done.

For dynamically created tcp chardevs, we don't postpone the handshake;
we detect that case by checking the init_machine_done variable.

Signed-off-by: Daniel P. Berrange <[email protected]>
[peterx: add missing include line, do unit test]
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180308140714 [email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Eric Blake <[email protected]>