Peter Maydell [Fri, 23 Mar 2018 18:26:45 +0000 (18:26 +0000)]
hw/intc/arm_gicv3: Fix secure-GIC NS ICC_PMR and ICC_RPR accesses
If the GIC has the security extension support enabled, then a
non-secure access to ICC_PMR must take account of the non-secure
view of interrupt priorities, where real priorities 0x00..0x7f
are secure-only and not visible to the non-secure guest, and
priorities 0x80..0xff are shown to the guest as if they were
0x00..0xff. We had the logic here wrong:
* on reads, the priority is in the secure range if bit 7
is clear, not if it is set
* on writes, we want to set bit 7, not mask everything else
Our ICC_RPR read code had the same error as ICC_PMR.
(Compare the GICv3 spec pseudocode functions ICC_RPR_EL1
and ICC_PMR_EL1.)
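The corrected mapping can be sketched in a few lines of Python. This is only an illustrative model of the rule described above (bit 7 clear means the secure range; the non-secure range is shown shifted left by one bit); the function names are hypothetical and the sketch ignores every other condition the real register accessors check.

```python
def ns_view_of_priority(real):
    """Non-secure read: translate a real 8-bit priority into the NS view."""
    if (real & 0x80) == 0:
        # Bit 7 clear: the priority is in the secure-only range,
        # which is not visible to the non-secure guest.
        return 0x00
    # Real 0x80..0xff is presented to the guest shifted up by one bit.
    return (real << 1) & 0xff


def real_priority_from_ns_write(ns_value):
    """Non-secure write: translate an NS-view priority into a real one."""
    # Shift down and set bit 7 (rather than masking everything else),
    # so the stored value always lands in the non-secure range.
    return ((ns_value & 0xff) >> 1) | 0x80
```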
Victor Kamensky [Fri, 23 Mar 2018 18:26:45 +0000 (18:26 +0000)]
arm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT
In the OE project, a 4.15 Linux kernel boot hang was observed under
single-CPU aarch64 QEMU. Kernel code was in a loop waiting for the
vtimer interrupt, spinning in TCG-generated blocks, while the interrupt
was pending and unprocessed. This happened because when QEMU tried to
handle the vtimer interrupt the target had interrupts disabled; as a
result the flag indicating TCG exit, cpu->icount_decr.u16.high,
was cleared but the arm_cpu_exec_interrupt function did not call
arm_cpu_do_interrupt to process the interrupt. Later, when the target
re-enabled interrupts, it did so without exiting to the main loop, so
the following code that waited for the result of interrupt execution
ran in an infinite loop.
To solve the problem, instructions that operate on CPU sys state
(i.e. enable/disable interrupts), and are marked as DISAS_UPDATE,
should be considered a DISAS_EXIT variant, and should be
forced to exit back to the main loop so that QEMU has a chance
to process pending CPU state updates, including pending
interrupts.
This change brings consistency with how DISAS_UPDATE is treated
in aarch32 case.
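The hang can be reproduced in a toy model. The Python sketch below is not QEMU code; it assumes a drastically simplified world in which a pending interrupt is only delivered when execution returns to the main loop, and all names are hypothetical.

```python
def blocks_until_irq_handled(update_exits_to_main_loop, max_blocks=50):
    """Return how many blocks ran before the IRQ was handled, or None on a hang."""
    irq_pending = True   # vtimer interrupt raised while the guest had IRQs masked
    irq_masked = True
    handled = False
    # Guest program: one block that unmasks interrupts, then a spin loop.
    program = ["enable_irq"] + ["spin"] * max_blocks
    for count, block in enumerate(program):
        if block == "spin" and handled:
            return count
        if block == "enable_irq":
            irq_masked = False
            # The behaviour under discussion: does this block exit to the main loop?
            in_main_loop = update_exits_to_main_loop
        else:
            in_main_loop = False  # the spin block chains straight back to itself
        # Only the main loop gets a chance to deliver a pending interrupt.
        if in_main_loop and irq_pending and not irq_masked:
            irq_pending = False
            handled = True
    return None
```

With `update_exits_to_main_loop=False` the model never delivers the interrupt within the step budget, which mirrors the observed hang; with `True` the spin loop ends on its first iteration.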
s390x/cpumodel: fix feature groups and breakage of MSA8
Since commit 46a99c9f73c7 ("s390x/cpumodel: model PTFF subfunctions
for Multiple-epoch facility") -cpu help no longer shows the MSA8
feature group. Turns out that we forgot to add the new MEPOCH_PTFF
group enum.
Fixes: 46a99c9f73c7 ("s390x/cpumodel: model PTFF subfunctions for Multiple-epoch facility")
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Peter Maydell [Mon, 19 Mar 2018 13:17:43 +0000 (13:17 +0000)]
gitmodules: Use the QEMU mirror of qemu-palcode
We have a mirror of the qemu-palcode repository on
git.qemu.org; use that instead of the upstream github,
in line with our general policy of keeping and using
a mirror for submodules.
Peter Maydell [Thu, 22 Mar 2018 14:01:29 +0000 (14:01 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Multiboot patches
# gpg: Signature made Wed 21 Mar 2018 14:38:36 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
tests/multiboot: Add .gitignore
tests/multiboot: Add tests for the a.out kludge
tests/multiboot: Test exit code for every qemu run
multiboot: Check validity of mh_header_addr
multiboot: Reject kernels exceeding the address space
Peter Maydell [Thu, 22 Mar 2018 13:15:52 +0000 (13:15 +0000)]
Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging
Pull request
# gpg: Signature made Wed 21 Mar 2018 14:37:05 GMT
# gpg: using RSA key DAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <[email protected]>"
# gpg: aka "Marc-André Lureau <[email protected]>"
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/dump-pull-request:
dump-guest-memory: more descriptive lookup_type failure
dump.c: allow fd_write_vmcore to return errno on failure
Peter Maydell [Thu, 22 Mar 2018 12:13:43 +0000 (12:13 +0000)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-03-21-1' into staging
Merge tpm 2018/03/21 v1
# gpg: Signature made Wed 21 Mar 2018 12:02:06 GMT
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-03-21-1:
tpm: CRB: query backend for TPM established flag
tpm: CRB: reset locAssigned upon relinquishing locality
tpm: CRB: set registers to 0 by default
tpm: CRB: Set tpmRegValidSts flag to '1' in device reset
Kevin Wolf [Wed, 14 Mar 2018 12:29:46 +0000 (13:29 +0100)]
tests/multiboot: Test exit code for every qemu run
Testing the exit code only once after a whole group of tests has
completed is not enough: it catches errors only in the very last qemu
invocation. We need to have the check after each qemu run.
The logging and diff with the reference output is still done once per
group to keep things more manageable. This is not a problem because the
log file accumulates the output of all runs.
Kevin Wolf [Wed, 14 Mar 2018 16:57:45 +0000 (17:57 +0100)]
multiboot: Check validity of mh_header_addr
I couldn't find a case where this prevents something bad from happening
that isn't already caught by other checks, but let's err on the safe
side and check that mh_header_addr is as expected.
Kevin Wolf [Wed, 14 Mar 2018 16:46:38 +0000 (17:46 +0100)]
multiboot: Reject kernels exceeding the address space
The code path where mh_load_end_addr is non-zero in the Multiboot
header checks that mh_load_end_addr >= mh_load_addr and so
mb_load_size is checked. However, mb_load_size is not checked when
calculated from the file size, when mh_load_end_addr is 0.
If the kernel binary size is larger than can fit in the address space
after load_addr, we ended up with a kernel_size that is smaller than
load_size, which means that we read the file into a too small buffer.
Add a check to reject kernel files with such Multiboot headers.
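A minimal sketch of such a check, written in Python for brevity. The function and parameter names are hypothetical; the only assumption is that the load address and load size must together fit in a 32-bit address space.

```python
UINT32_MAX = 0xFFFFFFFF


def load_size_from_file(kernel_file_size, text_offset, mh_load_addr):
    """Compute the load size when mh_load_end_addr is 0, rejecting bad headers."""
    if kernel_file_size < text_offset:
        raise ValueError("text offset lies beyond the end of the file")
    load_size = kernel_file_size - text_offset
    # Written as a subtraction so the comparison itself cannot wrap around
    # in a fixed-width implementation.
    if load_size > UINT32_MAX - mh_load_addr:
        raise ValueError("kernel does not fit in address space")
    return load_size
```

Without the second check, a 32-bit sum of `mh_load_addr + load_size` would wrap around and yield a kernel size smaller than the load size.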
Andrew Jones [Wed, 14 Mar 2018 15:38:20 +0000 (16:38 +0100)]
dump-guest-memory: more descriptive lookup_type failure
We've seen a few reports of
(gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
Traceback (most recent call last):
File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in <module>
UINTPTR_T = gdb.lookup_type("uintptr_t")
gdb.error: No type named uintptr_t.
This occurs when symbols haven't been loaded first, i.e. neither a
QEMU binary was loaded nor a QEMU process was attached first. Let's
better inform the user of how to fix the issue themselves in order
to avoid more reports.
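One way to produce a more helpful message is to wrap the lookup, roughly as sketched below. The wrapper and exception names are hypothetical and the message wording is only an example; inside gdb the `lookup_type` argument would be `gdb.lookup_type`, which is passed in here so the sketch can run outside gdb.

```python
class SymbolsNotLoaded(Exception):
    """Hypothetical error type standing in for a user-facing gdb error."""


def lookup_type_or_explain(lookup_type, name):
    """Look up a type, turning any failure into an actionable message."""
    try:
        return lookup_type(name)
    except Exception as exc:
        raise SymbolsNotLoaded(
            "Could not look up type '%s'. Load symbols first, either by "
            "loading a QEMU binary or by attaching to a QEMU process." % name
        ) from exc
```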
Yasmin Beatriz [Mon, 12 Feb 2018 14:25:06 +0000 (12:25 -0200)]
dump.c: allow fd_write_vmcore to return errno on failure
fd_write_vmcore can fail for many reasons that can be
retrieved from errno, but it only returns -1. This makes it difficult for
the caller to know what happened, and only a generic error message is
propagated back to the user. This is an example using dump-guest-memory:
(qemu) dump-guest-memory /home/yasmin/mnt/test.dump
dump: failed to save memory
All callers of fd_write_vmcore in dump.c do error handling via
error_setg(), so at first it seems feasible to add the Error pointer as
an argument of fd_write_vmcore. This proved to be more complex than it
first looked. fd_write_vmcore is used by write_elf64_notes and
write_elf32_notes as a WriteCoreDumpFunction prototype. WriteCoreDumpFunction
is declared in include/qom/cpu.h and is used all around the code. This
leaves us with few alternatives:
- change the WriteCoreDumpFunction prototype to include an error pointer.
This would require changing all functions that implement this prototype
to also receive an Error pointer;
- change both write_elf64_notes and write_elf32_notes to not use the
WriteCoreDumpFunction. These functions use not only fd_write_vmcore
but also buf_write_note, so this would require changing buf_write_note
to handle an Error pointer. Considerably easier than the alternative
above, but it's still a lot of code just for the benefit of the callers
of fd_write_vmcore.
This patch presents an easier solution that benefits all fd_write_vmcore
callers:
- instead of returning -1 on error, return -errno. All existing callers
already check for ret < 0 so there is no need to change the caller's
logic too much. This also allows the retrieval of the errno.
- all callers were updated to use error_setg_errno instead of just
error_setg. Now that fd_write_vmcore can return an errno, let's update
all callers so they can benefit from a more detailed error message.
This is the same dump-guest-memory example with this patch applied:
(qemu) dump-guest-memory /home/yasmin/mnt/test.dump
dump: failed to save memory: No space left on device
(qemu)
This example illustrates an error of fd_write_vmcore when called
from write_data. All other callers will benefit from better
error messages as well.
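The "return -errno" convention is easy to demonstrate outside QEMU. The following Python sketch is an analogy rather than the patch itself: `fd_write_all` and `describe` are hypothetical helpers that show how a caller can keep its existing `ret < 0` check and still recover the reason for the failure.

```python
import errno
import os


def fd_write_all(fd, data):
    """Write all of data to fd; return 0 on success or -errno on failure."""
    view = memoryview(data)
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except OSError as exc:
            # Report the specific reason instead of a bare -1.
            return -exc.errno
        view = view[written:]
    return 0


def describe(ret, what="failed to save memory"):
    """Caller side: the 'ret < 0' check is unchanged, the message is richer."""
    if ret < 0:
        return "%s: %s" % (what, os.strerror(-ret))
    return "ok"
```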
Stefan Berger [Mon, 19 Mar 2018 16:13:14 +0000 (12:13 -0400)]
tpm: CRB: Set tpmRegValidSts flag to '1' in device reset
Fix the initialization of the tpmRegValidSts flag and set it to '1'
during device reset without expecting a write to another register.
This seems to also be the default behavior of real hardware.
Luke Shumaker [Thu, 28 Dec 2017 18:08:13 +0000 (13:08 -0500)]
linux-user: init_guest_space: Try to make ARM space+commpage continuous
At a fixed distance after the usable memory that init_guest_space maps, for
32-bit ARM targets we also need to map a commpage. The normal
init_guest_space logic doesn't keep this in mind when searching for an
address range.
If !host_start, then try to find a big continuous segment where we can put
both the usable memory and the commpage; we then munmap that segment and
set current_start to that address; and let the normal code mmap the usable
memory and the commpage separately. That is: if we don't have a hint of
where to start looking for memory, come up with one that is better than
NULL. Depending on host_size and guest_start, there may or may not be a
gap between the usable memory and the commpage, so this is slightly more
restrictive than it needs to be; but it's only a hint, so that's OK.
We only do that for !host_start, because if host_start, then either:
- we got an address passed in with -B, in which case we don't want to
interfere with what the user said;
- or host_start is based off of the ELF image's loaddr. The check "if
(host_start && real_start != current_start)" suggests that we really
want lowest available address that is >= loaddr. I don't know why that
is, but I'm trusting that Paul Brook knew what he was doing when he
wrote the original version of that check in c581deda322080e8beb88b2e468d4af54454e4b3 way back in 2010.
Now that we have the mechanisms in here, allow shared memory in a
postcopy.
Note that QEMU can't tell who all the users of shared regions are
and thus can't tell whether all the users of the shared regions
have appropriate support for postcopy. Those devices that explicitly
support shared memory (e.g. vhost-user) must check, but it doesn't
stop weirder configurations causing problems.
This message is sent just before the end of postcopy to get the
client to stop using userfault since we won't respond to any more
requests. It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.
Add a hook to allow a client userfaultfd to be 'woken'
when a page arrives, and a walker that calls that
hook for relevant clients given a RAMBlock and offset.
Provide a helper to send a 'wake' request on a userfaultfd for
a shared process.
The address in the client's address space is specified together
with the RAMBlock it was resolved to.
# gpg: Signature made Mon 19 Mar 2018 20:07:14 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
i386: Disable Intel PT if packets IP payloads have LIP values
cpu: drop unnecessary NULL check and cpu_common_class_by_name()
cpu: get rid of unused cpu_init() defines
Use cpu_create(type) instead of cpu_init(cpu_model)
cpu: add CPU_RESOLVING_TYPE macro
tests: add machine 'none' with -cpu test
nios2: 10m50_devboard: replace cpu_model with cpu_type
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
#1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124
#2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355
#3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414
#4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684
#5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333
zhangjixiang [Sun, 25 Feb 2018 01:47:51 +0000 (09:47 +0800)]
HMP: Initialize err before using
When bdrv_snapshot_delete fails, errp is not assigned a valid
value in error_propagate because it was not initialized in
hmp_delvm; error_reportf_err (called by hmp_delvm) then uses an
uninitialized value and qemu crashes.
Michael Clark [Mon, 19 Mar 2018 21:18:49 +0000 (14:18 -0700)]
RISC-V: Fix riscv_isa_string memory size bug
This version uses a constant size memory buffer sized for
the maximum possible ISA string length. It also uses g_new
instead of g_new0, uses more efficient logic to append
extensions and adds manual zero termination of the string.
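The approach can be illustrated with a fixed-size buffer in Python. This is a sketch under assumptions: the extension table below is an illustrative subset and not QEMU's actual list or ordering, and the function names are hypothetical. The one fact relied on is that the misa bit for an extension letter is its offset from 'A'.

```python
EXTENSIONS = "IMAFDC"  # illustrative subset and order, not QEMU's actual table


def misa_bit(letter):
    """misa encodes extension 'A' in bit 0, 'B' in bit 1, and so on."""
    return 1 << (ord(letter) - ord("A"))


def isa_string(misa, xlen=64):
    # Constant-size buffer for the longest possible string: the widest
    # prefix, one letter per known extension, and a terminator.
    maxlen = len("rv128") + len(EXTENSIONS) + 1
    buf = bytearray(maxlen)
    prefix = ("rv%d" % xlen).encode("ascii")
    buf[:len(prefix)] = prefix
    pos = len(prefix)
    # Append one lower-case letter per enabled extension.
    for letter in EXTENSIONS:
        if misa & misa_bit(letter):
            buf[pos] = ord(letter.lower())
            pos += 1
    buf[pos] = 0  # explicit terminator, as a C buffer would need
    return buf[:pos].decode("ascii")
```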
Peter Maydell [Tue, 20 Mar 2018 09:51:49 +0000 (09:51 +0000)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-03-12-v4' into staging
qapi patches for 2018-03-12, 2.12 softfreeze
- Marc-André Lureau: 0/4 qapi: generate a literal qobject for introspection
- Max Reitz: 0/7 block: Handle null backing link
- Daniel P. Berrange: chardev: tcp: postpone TLS work until machine done
- Peter Xu: 00/23 QMP: out-of-band (OOB) execution support
- Vladimir Sementsov-Ogievskiy: 0/2 block latency histogram
- Eric Blake: qapi: Pass '-u' when doing non-silent diff
# gpg: Signature made Mon 19 Mar 2018 19:59:04 GMT
# gpg: using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg: aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-qapi-2018-03-12-v4: (38 commits)
qapi: Pass '-u' when doing non-silent diff
qapi: add block latency histogram interface
block/accounting: introduce latency histogram
tests: qmp-test: add oob test
tests: qmp-test: verify command batching
qmp: add command "x-oob-test"
monitor: enable IO thread for (qmp & !mux) typed
qmp: isolate responses into io thread
qmp: support out-of-band (oob) execution
qapi: introduce new cmd option "allow-oob"
monitor: send event when command queue full
qmp: add new event "command-dropped"
monitor: separate QMP parser and dispatcher
monitor: let suspend/resume work even with QMPs
monitor: let suspend_cnt be thread safe
monitor: introduce monitor_qmp_respond()
qmp: introduce QMPCapability
monitor: allow using IO thread for parsing
monitor: let mon_list be tail queue
monitor: unify global init
...
Laurent Vivier [Mon, 19 Mar 2018 11:35:44 +0000 (12:35 +0100)]
target/m68k: add a mechanism to automatically free TCGv
SRC_EA() and gen_extend() can return either a temporary
TCGv or a memory allocated one. Mark them when they are
allocated, and free them automatically at end of the
instruction translation.
We want to free locally allocated TCGv to avoid
overflow in sequence like:
That can fill a lot of TCGv entries in a sequence,
especially since 15fa08f845 ("tcg: Dynamically allocate TCGOps")
we have no limit to fill the TCGOps cache and we can fill
the entire TCG variables array and overflow it.
vhost+postcopy: Helper to send requests to source for shared pages
Provide a helper to be used by shared waker functions to request
shared pages from the source.
The last_rb pointer is moved into the incoming state since this
helper can update it as well as the main fault thread function.
We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.
This is done as a 3 stage set:
QEMU -> client
set_mem_table
mmap stuff, get addresses
client -> qemu
here are the addresses
qemu -> client
OK - now you can use them
That ensures that qemu has registered the new addresses in its
userfault code before the client starts accessing them.
Note: We don't ask for the default 'ack' reply since we've got our own.
postcopy+vhost-user: Split set_mem_table for postcopy
Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.
migrate: Update ram_block_discard_range for shared
The choice of call to discard a block is getting more complicated
for other cases. We use fallocate PUNCH_HOLE in all file-backed cases;
it works for both hugepage and tmpfs.
We use DONTNEED for non-hugepage cases that are either
anonymous or private.
Care should be taken when trying other backing files.
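The selection rule described above can be written down as a small decision function. This Python sketch only restates the rule from the text; the function name and the string labels are hypothetical, and combinations the text does not mention (such as anonymous hugepages) are not modelled.

```python
def discard_methods(file_backed, hugepage, shared):
    """Return the discard calls implied by the rule, in the order applied."""
    methods = []
    if file_backed:
        # Any file-backed block: punch a hole (works for hugepage and tmpfs).
        methods.append("fallocate(PUNCH_HOLE)")
    if not hugepage and (not file_backed or not shared):
        # Non-hugepage memory that is anonymous or privately mapped.
        methods.append("madvise(DONTNEED)")
    return methods
```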
Haozhong Zhang [Sun, 11 Mar 2018 03:02:14 +0000 (11:02 +0800)]
tests/bios-tables-test: add test cases for DIMM proximity
QEMU now builds one SRAT memory affinity structure for each PC-DIMM
and NVDIMM device presented at boot time with the proximity domain
specified in the device option 'node', rather than only one SRAT
memory affinity structure covering the entire hotpluggable address
space with the proximity domain of the last node.
Add test cases on PC and Q35 machines with 4 proximity domains, and
one PC-DIMM and one NVDIMM attached to the 2nd and 3rd proximity
domains respectively. Check whether the QEMU-built SRAT tables match
with the expected ones.
The following ACPI tables need to be added for this test:
tests/acpi-test-data/pc/APIC.dimmpxm
tests/acpi-test-data/pc/DSDT.dimmpxm
tests/acpi-test-data/pc/NFIT.dimmpxm
tests/acpi-test-data/pc/SRAT.dimmpxm
tests/acpi-test-data/pc/SSDT.dimmpxm
tests/acpi-test-data/q35/APIC.dimmpxm
tests/acpi-test-data/q35/DSDT.dimmpxm
tests/acpi-test-data/q35/NFIT.dimmpxm
tests/acpi-test-data/q35/SRAT.dimmpxm
tests/acpi-test-data/q35/SSDT.dimmpxm
New APIC and DSDT are needed because of the multiple processors
configuration. New NFIT and SSDT are needed because of NVDIMM.
Haozhong Zhang [Sun, 11 Mar 2018 03:02:13 +0000 (11:02 +0800)]
hw/acpi-build: build SRAT memory affinity structures for DIMM devices
ACPI 6.2A Table 5-129 "SPA Range Structure" requires that the proximity
domain of an NVDIMM SPA range match the corresponding entry in the
SRAT table.
The address ranges of vNVDIMM in QEMU are allocated from the
hot-pluggable address space, which is entirely covered by one SRAT
memory affinity structure. However, users can set the vNVDIMM
proximity domain in NFIT SPA range structure by the 'node' property of
'-device nvdimm' to a value different than the one in the above SRAT
memory affinity structure.
In order to solve such proximity domain mismatch, this patch builds
one SRAT memory affinity structure for each DIMM device present at
boot time, including both PC-DIMM and NVDIMM, with the proximity
domain specified in '-device pc-dimm' or '-device nvdimm'.
The remaining hot-pluggable address space is covered by one or multiple
SRAT memory affinity structures with the proximity domain of the last
node as before.
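The resulting layout can be sketched as follows. This Python model is an assumption-laden illustration, not the ACPI builder: it takes the hot-pluggable region and a list of boot-time DIMMs as (address, size, node) tuples, assumes the DIMMs do not overlap, and returns (base, length, proximity domain) entries. All names are hypothetical.

```python
def srat_hotplug_entries(hotplug_base, hotplug_size, dimms, last_node):
    """Build one entry per DIMM and cover the gaps with last_node."""
    entries = []
    cur = hotplug_base
    end = hotplug_base + hotplug_size
    for addr, size, node in sorted(dimms):
        if cur < addr:
            # Unpopulated space before this DIMM keeps the old behaviour.
            entries.append((cur, addr - cur, last_node))
        # The DIMM itself uses the proximity domain given on the command line.
        entries.append((addr, size, node))
        cur = addr + size
    if cur < end:
        # Remaining hot-pluggable space after the last DIMM.
        entries.append((cur, end - cur, last_node))
    return entries
```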
Haozhong Zhang [Sun, 11 Mar 2018 03:02:12 +0000 (11:02 +0800)]
qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList
PC-DIMM and NVDIMM may need to be treated differently, e.g., when
deciding the necessity of the non-volatile flag bit in SRAT memory
affinity structures.
A new field 'nvdimm' is added to the union type MemoryDeviceInfo for
such purpose. Its type is currently PCDIMMDeviceInfo and will be
updated when necessary in the future.
It also fixes "info memory-devices"/query-memory-devices, which
currently show nvdimm devices as dimm devices since
object_dynamic_cast(obj, TYPE_PC_DIMM) happily casts nvdimm to
TYPE_PC_DIMM, from which it inherits.
Haozhong Zhang [Sun, 11 Mar 2018 03:02:11 +0000 (11:02 +0800)]
pc-dimm: make qmp_pc_dimm_device_list() sort devices by address
Make qmp_pc_dimm_device_list() return a list of devices sorted
by start address so that it can be reused in places that
need a sorted list*. Reuse the existing pc_dimm_built_list()
to get the sorted list.
While at it hide recursive callbacks from callers, so that:
Luwei Kang [Tue, 13 Mar 2018 19:26:31 +0000 (03:26 +0800)]
i386: Disable Intel PT if packets IP payloads have LIP values
Intel processor trace should be disabled when
CPUID.(EAX=14H,ECX=0H).ECX.[bit31] is set.
Generated packets which contain IP payloads will have LIP
values when this bit is set; otherwise IP payloads will have RIP
values.
Currently, the information in CPUID 14H is constant to make
live migration safe, and this bit is always 0 in the guest even
if the host supports LIP values.
A guest that sees the bit as 0 will expect IP payloads with RIP
values, but the host CPU will generate IP payloads with
LIP values if this bit is set in HW.
To make sure the values of IP payloads are correct, Intel PT
should be disabled when bit[31] is set.
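The test itself is a single bit check. A minimal Python sketch, with hypothetical names, taking the ECX value returned by CPUID leaf 14H, sub-leaf 0 on the host:

```python
LIP_BIT = 1 << 31  # CPUID.(EAX=14H,ECX=0H).ECX[bit 31]


def intel_pt_usable(host_cpuid_14h_0_ecx):
    """Intel PT may be exposed only if the host does not use LIP values."""
    return (host_cpuid_14h_0_ecx & LIP_BIT) == 0
```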
Eric Blake [Thu, 15 Mar 2018 12:51:16 +0000 (07:51 -0500)]
qapi: Pass '-u' when doing non-silent diff
Ed-script diffs are awful compared to context diffs. Fix another
'diff -q' while in the area (if the files are different, being
noisy makes it easier to diagnose why).
While at it, diff .err before .out, because if a test fails, .err
is more likely to contain the most important information for
fixing the failure.
Introduce latency histogram statistics for block devices.
For each accounted operation type, the latency region [0, +inf) is
divided into subregions by several points. Then, calculate
hits for each subregion.
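As an illustration of the counting scheme, the Python sketch below divides [0, +inf) at a list of boundary points and counts hits per subregion. The class is hypothetical and assumes half-open subregions, i.e. a latency equal to a boundary falls into the subregion that starts at that boundary.

```python
from bisect import bisect_right


class LatencyHistogram:
    def __init__(self, boundaries):
        # N boundary points split [0, +inf) into N + 1 subregions.
        self.boundaries = sorted(boundaries)
        self.bins = [0] * (len(self.boundaries) + 1)

    def account(self, latency):
        # bisect_right returns the index of the subregion containing latency.
        self.bins[bisect_right(self.boundaries, latency)] += 1
```

For boundaries 10, 50 and 100 the subregions are [0, 10), [10, 50), [50, 100) and [100, +inf).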
Peter Xu [Fri, 9 Mar 2018 09:00:06 +0000 (17:00 +0800)]
tests: qmp-test: add oob test
Test the new OOB capability. Here we used the new "x-oob-test" command.
First, we send a lock=true and oob=false command to hang the main
thread. Then send another lock=false and oob=true command (which will
be run inside the parser this time) to free that hung command.
Peter Xu [Fri, 9 Mar 2018 09:00:02 +0000 (17:00 +0800)]
qmp: isolate responses into io thread
For monitors that have enabled the IO thread, we'll offload the
responding procedure to the IO thread. The main reason is that chardev is
not thread safe, and we need to do all the read/write IOs in the same
thread. For use_io_thr=true monitors, that thread is the IO thread.
We do this isolation in a similar pattern to what we have done for the
request queue: we first create one response queue for each monitor, then
instead of replying directly in the main thread, we queue the responses
and kick the IO thread to do the rest of the job for us.
A funny thing after doing this is that, when the QMP clients send "quit"
to QEMU, it's possible that we close the IOThread even earlier than
replying to that "quit". So another thing we need to do before cleaning
up the monitors is that we need to flush the response queue (we don't
need to do that for command queue; after all we are quitting) to make
sure replies for handled commands are always flushed back to clients.
Peter Xu [Sun, 11 Mar 2018 02:38:05 +0000 (20:38 -0600)]
qmp: support out-of-band (oob) execution
Having "allow-oob":true for a command does not mean that this command
will always be run in out-of-band mode. The out-of-band quick path will
only be executed if we specify the extra "run-oob" flag when sending the
QMP request:
The "control" key is introduced to store this extra flag. "control"
field is used to store arguments that are shared by all the commands,
rather than command specific arguments. Let "run-oob" be the first.
Note that in the patch I exported qmp_dispatch_check_obj() to be used to
check the request earlier, and at the same time allowed "id" field to be
there since actually we always allow that.
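To make the role of the "control" key concrete, the Python sketch below tags an existing request dictionary with the flag named above. The helper and the command name used in the usage line are hypothetical; only the "control" key and the "run-oob" flag come from the description.

```python
def with_run_oob(request):
    """Return a copy of a QMP request that asks for out-of-band execution."""
    tagged = dict(request)
    # "control" carries arguments shared by all commands, kept separate
    # from the command-specific "arguments".
    control = dict(tagged.get("control", {}))
    control["run-oob"] = True
    tagged["control"] = control
    return tagged
```

For example, `with_run_oob({"execute": "some-command", "id": 1})` keeps "execute" and "id" and adds `"control": {"run-oob": True}`.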
This new "allow-oob" boolean will be exposed by "query-qmp-schema" as
well for command entries, so that QMP clients can know which commands
can be used in out-of-band calls. For example the command "migrate"
originally looks like:
Peter Xu [Fri, 9 Mar 2018 08:59:59 +0000 (16:59 +0800)]
monitor: send event when command queue full
Set maximum QMP command queue length to 8. If the queue is full,
instead of queuing the command, we directly return a "command-dropped"
event, telling the client that a specific command is dropped.
Note that this flow control mechanism is only valid if OOB is enabled.
If it's not, the effective queue length will always be 1, which strictly
follows original behavior of QMP command handling (which never drops
messages).
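The flow control for the OOB-enabled case can be modelled with a bounded queue. In this Python sketch the class and the shape of the returned event dictionary are assumptions made for illustration; only the limit of 8 and the "command-dropped" event name come from the description.

```python
from collections import deque

QUEUE_LIMIT = 8  # maximum QMP command queue length


class CommandQueue:
    def __init__(self):
        self.pending = deque()

    def submit(self, request):
        """Queue a request, or return a drop event if the queue is full."""
        if len(self.pending) >= QUEUE_LIMIT:
            # Hypothetical event payload; the real wire format may differ.
            return {"event": "command-dropped", "id": request.get("id")}
        self.pending.append(request)
        return None
```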
Peter Xu [Fri, 9 Mar 2018 08:59:56 +0000 (16:59 +0800)]
monitor: let suspend/resume work even with QMPs
This patch allows QMP monitors to be suspended/resumed.
One thing to mention is that for QMPs that are using IOThreads, we need
an explicit kick for the IOThread in case it is sleeping.
Meanwhile, we need to take special care with non-interactive HMPs.
Currently only gdbserver is using that. For these monitors, we still
don't allow suspend/resume operations.
Peter Xu [Fri, 9 Mar 2018 08:59:53 +0000 (16:59 +0800)]
qmp: introduce QMPCapability
There were no QMP capabilities defined. Define the first capability,
"oob", to allow out-of-band messages.
After this patch, we will allow QMP clients to enable QMP capabilities
when sending the first "qmp_capabilities" command. Originally we
started a QMP session with no arguments, like:
{ "execute": "qmp_capabilities" }
Now we can enable some QMP capabilities using (take OOB as example,
which is the only capability that we support):
When the "arguments" key is not provided, no capability is enabled.
For capability "oob", the monitor needs to be run on a dedicated IO
thread, otherwise the command will fail. For example, trying to enable
OOB on a MUXed typed QMP monitor will fail.
One thing to mention is that QMP capabilities are per-monitor, and also
when the connection is closed due to some reason, the capabilities will
be reset.
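A client-side helper for the negotiation might look like the Python sketch below. The helper name is hypothetical, and the use of an "enable" list inside "arguments" is an assumption about the wire format, so it should be checked against the QMP specification for the QEMU version in use.

```python
import json


def qmp_capabilities_request(enable=None):
    """Build the first command of a QMP session as a JSON string."""
    request = {"execute": "qmp_capabilities"}
    if enable:
        # Assumed format: capabilities are listed under "arguments".
        request["arguments"] = {"enable": list(enable)}
    # Without "arguments", no capability is enabled.
    return json.dumps(request)
```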
Peter Xu [Fri, 9 Mar 2018 08:59:52 +0000 (16:59 +0800)]
monitor: allow using IO thread for parsing
For each Monitor, add one field "use_io_thr" to show whether it will be
using the dedicated monitor IO thread to handle input/output. When set,
monitor IO parsing work will be offloaded to the dedicated monitor IO
thread, rather than the original main loop thread.
This only works for QMP. HMP will always be run on the main loop
thread.
Currently we're still keeping use_io_thr off always. Will turn it on
later at some point.
One thing to mention is that we cannot set use_io_thr for every QMP
monitor. The problem is that MUXed typed chardevs may not work well
with it now. When MUX is used, frontend of chardev can be the monitor
plus something else. The only thing we know would be safe to be run
outside main thread so far is the monitor frontend. All the rest of the
frontends should still be run in main thread only.
Peter Xu [Fri, 9 Mar 2018 08:59:50 +0000 (16:59 +0800)]
monitor: unify global init
There are many places where the monitor initializes its globals:
- monitor_init_qmp_commands() at the very beginning
- single function to init monitor_lock
- in the first entry of monitor_init() using "is_first_init"
Unify them a bit.
monitor_lock is not used before monitor_init() (as confirmed by code
analysis and gdb watchpoints); so we are safe delaying what was a
constructor-time initialization of the mutex into the later first call
to monitor_init().
Peter Xu [Fri, 9 Mar 2018 08:59:49 +0000 (16:59 +0800)]
monitor: move the cur_mon hack deeper for QMP
In monitor_qmp_read(), we have the hack to temporarily replace the
cur_mon pointer. Now we move this hack deeper inside the QMP dispatcher
routine since the Monitor pointer can be actually obtained using
container_of() upon the parser object, just like most of the other JSON
parser users do.
This does not make much sense as a single patch. However, this will be
a big step for the next patch, when the QMP dispatcher routine will be
split from the QMP parser.
chardev: tcp: postpone TLS work until machine done
The TLS handshake may create background GSource tasks, while we won't know
the correct GMainContext until the whole chardev (including the frontend)
is initialized. Let's postpone the initial TLS handshake until machine done.
For dynamically created tcp chardev, we don't postpone that by checking
the init_machine_done variable.