* remotes/cody/tags/block-pull-request:
rbd: Fix bugs around -drive parameter "server"
rbd: Revert -blockdev parameter password-secret
rbd: Revert -blockdev and -drive parameter auth-supported
rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
rbd: Clean up runtime_opts, fix -drive to reject filename
rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
rbd: Clean up after the previous commit
rbd: Don't limit length of parameter values
rbd: Fix to cleanly reject -drive without pool or image
rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.
qemu_rbd_array_opts() extracts these values as follows. First, it
calls qdict_array_entries() to find the list's length. For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.
If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.
The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.
The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.
Rewrite to simply get the values straight from the options QDict.
Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.
Fixes -drive to reject invalid server.*.*.
Permits cleaning up runtime_opts. Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.
This reverts a part of commit 8a47e8e. We're having second thoughts
on the QAPI schema (and thus the external interface), and haven't
reached consensus, yet. Issues include:
* BlockdevOptionsRbd member @password-secret isn't actually a
password, it's a key generated by Ceph.
* We're not sure where member @password-secret belongs (see the
previous commit).
* How @password-secret interacts with settings from a configuration
file specified with @conf is undocumented.
Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.
Note that users can still configure an authentication key with a
configuration file. They probably do that anyway if they use Ceph
outside QEMU as well.
rbd: Revert -blockdev and -drive parameter auth-supported
This reverts half of commit 0a55679. We're having second thoughts on
the QAPI schema (and thus the external interface), and haven't reached
consensus, yet. Issues include:
* The implementation uses deprecated rados_conf_set() key
"auth_supported". No biggie.
* The implementation makes -drive silently ignore invalid parameters
"auth" and "auth-supported.*.X" where X isn't "auth". Fixable (in
fact I'm going to fix similar bugs around parameter server), so
again no biggie.
* BlockdevOptionsRbd member @password-secret applies only to
authentication method cephx. Should it be a variant member of
RbdAuthMethod?
* BlockdevOptionsRbd member @user could apply to both methods cephx
and none, but I'm not sure it's actually used with none. If it
isn't, should it be a variant member of RbdAuthMethod?
* The client offers a *set* of authentication methods, not a list.
Should the methods be optional members of BlockdevOptionsRbd instead
of members of list @auth-supported? The latter begs the question
what multiple entries for the same method mean. Trivial question
now that RbdAuthMethod contains nothing but @type, but less so when
RbdAuthMethod acquires other members, such the ones discussed above.
* How BlockdevOptionsRbd member @auth-supported interacts with
settings from a configuration file specified with @conf is
undocumented. I suspect it's untested, too.
Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.
Note that users can still configure authentication methods with a
configuration file. They probably do that anyway if they use Ceph
outside QEMU as well.
Further note that this doesn't affect use of key "auth-supported" in
-drive file=rbd:...:key=value.
qemu_rbd_array_opts()'s parameter @type now must be RBD_MON_HOST,
which is silly. This will be cleaned up shortly.
rbd: Clean up runtime_opts, fix -drive to reject filename
runtime_opts is used for three different purposes:
* qemu_rbd_open() uses it to accept options it recognizes, such as
"pool" and "image". Other .bdrv_open() methods do it similarly.
* qemu_rbd_open() accepts additional list-valued options
auth-supported and server, with the help of qemu_rbd_array_opts().
The list elements are again dictionaries. qemu_rbd_array_opts()
uses runtime_opts to accept their members. Thus, runtime_opts
contains recognized sub-sub-options "auth", "host", "port" in
addition to recognized options. No other block driver does that.
* qemu_rbd_create() uses it to convert the QDict produced by
qemu_rbd_parse_filename() to QemuOpts. No other block driver does
that. The keys produced by qemu_rbd_parse_filename() are "pool",
"image", "snapshot", "conf", "user" and "keyvalue-pairs".
qemu_rbd_open() accepts these, so no additional ones here.
This is a confusing mess. Dates back to commit 0f9d252. First step
to clean it up is documenting runtime_opts.desc[]:
* Reorder entries to match the QAPI schema, like we do in other block
drivers.
* Document why the schema's "server" and "auth-supported" aren't in
.desc[].
* Document why "keyvalue-pairs", "host", "port" and "auth" are in
.desc[], but not the schema.
* Delete "filename", because none of the three users actually uses it.
This fixes -drive to reject parameter filename instead of silently
ignoring it.
The way we communicate extra key-value pairs from
qemu_rbd_parse_filename() to qemu_rbd_open() exposes option parameter
"keyvalue-pairs" on the command line. It's not wanted there. Hack:
rename the parameter to "=keyvalue-pairs" to make it inaccessible.
We laboriously enforce that parameter values are between one and some
arbitrary limit in length. Only RBD_MAX_IMAGE_NAME_SIZE comes from
librbd.h, and I'm not sure it applies. Where the other limits come
from is unclear.
Drop the length checking. The limits librbd actually imposes must be
checked by librbd anyway.
There's one minor complication: BDRVRBDState member name is a
fixed-size array. Depends on the length limit. Make it a pointer to
a dynamically allocated string.
rbd: Fix to cleanly reject -drive without pool or image
qemu_rbd_open() neglects to check pool and image are present. Missing
image is caught by rbd_open(), but missing pool crashes. Reproducer:
$ qemu-system-x86_64 -nodefaults -drive driver=rbd,id=rbd,image=i,...
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
Aborted (core dumped)
where ... is a working server.0.{host,port} configuration.
Doesn't affect -drive with file=..., because qemu_rbd_parse_filename()
always sets both pool and image.
Doesn't affect -blockdev, because pool and image are mandatory in the
QAPI schema.
rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}
We use InetSocketAddress in the QAPI schema. However, the code
doesn't use inet_connect_saddr(), but formats "host" and "port" into a
configuration string for rados_conf_set(). Thus, members "numeric",
"to", "ipv4" and "ipv6" are silently ignored. Not nice. Example:
Peter Maydell [Tue, 28 Mar 2017 11:34:23 +0000 (12:34 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1' into staging
MTTCG regression fixes for rc2
# gpg: Signature made Tue 28 Mar 2017 10:54:38 BST
# gpg: using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1:
replay/replay.c: bump REPLAY_VERSION
tcg: Add a new line after incompatibility warning
ui/console: use exclusive mechanism directly
ui/console: ensure do_safe_dpy_refresh holds BQL
bsd-user: align use of mmap_lock to that of linux-user
user-exec: handle synchronous signals from QEMU gracefully
Denis V. Lunev [Mon, 27 Mar 2017 14:38:08 +0000 (17:38 +0300)]
parallels: wrong call to bdrv_truncate
Parallels driver should not call bdrv_truncate if the image was opened
in the read-only mode. Without the patch
qemu-img check harddisk.hds
asserts with
bdrv_truncate: Assertion `child->perm & BLK_PERM_RESIZE' failed.
Parameters used on the write path are not needed if the image is opened
in the read-only mode.
Alex Bennée [Fri, 24 Mar 2017 15:21:55 +0000 (15:21 +0000)]
replay/replay.c: bump REPLAY_VERSION
A previous commit (3d4d16f4) added support for audio record/playback.
However this breaks the logfile ABI due to the re-ordering of the
ReplayEvents enum. The REPLAY_VERSION check is meant to prevent you
from using old log files in newer QEMUs but this is currently broken.
Alex Bennée [Fri, 24 Mar 2017 15:39:05 +0000 (15:39 +0000)]
ui/console: use exclusive mechanism directly
The previous commit (8bb93c6f99) using async_safe_run_on_cpu() doesn't
work on graphics sub-system which restrict which threads can do GUI
updates. Rather the special casing MacOS we just directly call the
helper and move all the exclusive handling into do_dafe_dpy_refresh().
The unfortunate bouncing of the BQL is to ensure there is no deadlock
as vCPUs waiting on the BQL are kicked into their quiescent state.
Alex Bennée [Wed, 22 Mar 2017 16:11:13 +0000 (16:11 +0000)]
ui/console: ensure do_safe_dpy_refresh holds BQL
I missed the fact that when an exclusive work item runs it drops the
BQL to ensure all no vCPUs are stuck waiting for it, hence causing a
deadlock. However the actual helper needs to take the BQL especially
as we'll be messing with device emulation bits during the update which
all assume BQL is held.
We make a minor cpu_reloading_memory_map which must try and unlock the
RCU if we are actually outside the running context.
Alex Bennée [Mon, 20 Mar 2017 14:36:47 +0000 (14:36 +0000)]
bsd-user: align use of mmap_lock to that of linux-user
The introduction of stricter mmap_lock checking in translate-all broke
the BSD user build. The working mmap_lock functions were hidden behind
CONFIG_USE_NPTL which is never defined. This patch brings them inline
with linux-user.
Despite the disapearence of the comment "We aren't threadsafe to start
with..." this doesn't make bsd-user so. It will still need the rest of
the fixes that have been done in linux-user ported over.
Alex Bennée [Mon, 20 Mar 2017 11:31:44 +0000 (11:31 +0000)]
user-exec: handle synchronous signals from QEMU gracefully
When "tcg: enable thread-per-vCPU" (commit 3725794) was merged the
lifetime of current_cpu was changed. Previously a broken linux-user
call might abort() which can eventually escalate into a SIGSEGV which
would then crash qemu as it attempted to deref a NULL current_cpu.
After commit 3725794 it would attempt to fixup state and re-start the
run-loop and much hilarity (i.e. a looping lockup) would ensue from
jumping into a stale jmp_env.
As we can actually tell if we are in the run-loop from looking at the
cpu->running flag we should catch this badness first and abort()
cleanly rather than try to soldier on. There is a theoretical race
between the flag being set and sigsetjmp refreshing the jump buffer
but we can try really hard to not introduce crashes into that code.
Peter Maydell [Tue, 28 Mar 2017 08:48:23 +0000 (09:48 +0100)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This series fixes potential memory/fd leaks in 9pfs and a crash when
running tests/virtio-9p-test on SPARC hosts.
# gpg: Signature made Tue 28 Mar 2017 09:44:05 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
9pfs: fix file descriptor leak
Peter Maydell [Mon, 27 Mar 2017 17:59:04 +0000 (18:59 +0100)]
tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
For a packed struct like 'P9Hdr' the fields within it may not be
aligned as much as the natural alignment for their types. This means
it is not valid to pass the address of such a field to a function
like le32_to_cpus() which operate on uint32_t* and assume alignment.
Doing this results in a SIGBUS on hosts like SPARC which have strict
alignment requirements.
Use ldl_le_p() instead, which is specified to correctly handle
unaligned pointers.
Li Qiang [Mon, 27 Mar 2017 19:13:19 +0000 (21:13 +0200)]
9pfs: fix file descriptor leak
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.
This patch ensures that the fid isn't already associated to anything
before using it.
Peter Maydell [Mon, 27 Mar 2017 16:34:50 +0000 (17:34 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* MTTCG fix for win32
* virtio-scsi assertion failure
* mem-prealloc coverity fix
* x86 migration revert which requires more thought
* x86 instruction limit (avoids >2 page translation blocks)
* nbd dead code cleanup
* small memory.c logic fix
* remotes/bonzini/tags/for-upstream:
scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
Revert "apic: save apic_delivered flag"
nbd: drop unused NBDClientSession.is_unix field
win32: replace custom mutex and condition variable with native primitives
mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
tcg/i386: Check the size of instruction being translated
virtio-scsi: Fix acquire/release in dataplane handlers
virtio-scsi: Make virtio_scsi_acquire/release public
clear pending status before calling memory commit
Peter Maydell [Mon, 27 Mar 2017 15:56:31 +0000 (16:56 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-03-27' into staging
Block patches for 2.9-rc2.
# gpg: Signature made Mon 27 Mar 2017 16:47:54 BST
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2017-03-27:
block/file-posix.c: Fix unused variable warning on OpenBSD
file-posix: Make bdrv_flush() failure permanent without O_DIRECT
nbd-client: fix handling of hungup connections
qemu-img: print short help on getopt failure
qemu-img: fix switch indentation in img_amend()
qemu-img: show help for invalid global options
Peter Maydell [Thu, 23 Mar 2017 14:36:28 +0000 (14:36 +0000)]
block/file-posix.c: Fix unused variable warning on OpenBSD
On OpenBSD none of the ioctls probe_logical_blocksize() tries
exist, so the variable sector_size is unused. Refactor the
code to avoid this (and reduce the duplicated code).
Kevin Wolf [Wed, 22 Mar 2017 21:00:05 +0000 (22:00 +0100)]
file-posix: Make bdrv_flush() failure permanent without O_DIRECT
Success for bdrv_flush() means that all previously written data is safe
on disk. For fdatasync(), the best semantics we can hope for on Linux
(without O_DIRECT) is that all data that was written since the last call
was successfully written back. Therefore, and because we can't redo all
writes after a flush failure, we have to give up after a single
fdatasync() failure. After this failure, we would never be able to make
the promise that a successful bdrv_flush() makes.
Paolo Bonzini [Tue, 14 Mar 2017 11:11:56 +0000 (12:11 +0100)]
nbd-client: fix handling of hungup connections
After the switch to reading replies in a coroutine, nothing is
reentering pending receive coroutines if the connection hangs.
Move nbd_recv_coroutines_enter_all to the reply read coroutine,
which is the place where hangups are detected. nbd_teardown_connection
can simply wait for the reply read coroutine to detect the hangup
and clean up after itself.
This wouldn't be enough though because nbd_receive_reply returns 0
(rather than -EPIPE or similar) when reading from a hung connection.
Fix the return value check in nbd_read_reply_entry.
Stefan Hajnoczi [Fri, 17 Mar 2017 10:45:41 +0000 (18:45 +0800)]
qemu-img: print short help on getopt failure
Printing the full help output obscures the error message for an invalid
command-line option or missing argument.
Before this patch:
$ ./qemu-img --foo
...pages of output...
After this patch:
$ ./qemu-img --foo
qemu-img: unrecognized option '--foo'
Try 'qemu-img --help' for more information
This patch adds the getopt ':' character so that it can distinguish
between missing arguments and unrecognized options. This helps provide
more detailed error messages.
Andrey Shedel [Fri, 24 Mar 2017 22:01:41 +0000 (15:01 -0700)]
win32: replace custom mutex and condition variable with native primitives
The multithreaded TCG implementation exposed deadlocks in the win32
condition variables: as implemented, qemu_cond_broadcast waited on
receivers, whereas the pthreads API it was intended to emulate does
not. This was causing a deadlock because broadcast was called while
holding the IO lock, as well as all possible waiters blocked on the
same lock.
This patch replaces all the custom synchronisation code for mutexes
and condition variables with native Windows primitives (SRWlocks and
condition variables) with the same semantics as their POSIX
equivalents. To enable that, it requires a Windows Vista or newer host
OS.
Gerd Hoffmann [Tue, 14 Mar 2017 08:26:58 +0000 (09:26 +0100)]
vnc: fix reverse mode
vnc server in reverse mode (qemu -vnc localhost:$nr,reverse) interprets
$nr as display number (i.e. with 5900 offset) in recent qemu versions.
Historical and documented behavior is interpreting $nr as port number
though. So we should bring code and documentation in line.
Given that default listening port for viewers is 5500 the 5900 offset is
pretty inconvinient, because it is simply impossible to connect to port
5500. So, lets fix the code not the docs.
Gerd Hoffmann [Mon, 20 Mar 2017 08:04:02 +0000 (09:04 +0100)]
ui/egl-helpers: fix egl 1.5 display init
Unfortunaly switching to getPlatformDisplayEXT isn't as easy as
implemented by 0ea1523fb6703aa0dcd65e66b59e96fec028e60a. See the
longish comment for the complete story.
Ladi Prosek [Fri, 24 Mar 2017 14:24:50 +0000 (15:24 +0100)]
virtio-input: fix eventq batching
virtio_input_send buffers input events until it sees a SYNC. Then it
either sends or drops the entire batch, depending on whether eventq
has enough space available. The case to avoid here is partial sends
where only part of the batch would get to the guest.
Using virtqueue_get_avail_bytes to check the state of eventq was not
correct. The queue may have a smaller number of larger buffers
available so bytes may be enough but the batch would still not be
possible to send, leading to the "Huh? No vq elem available" error.
Instead of checking available bytes, this patch optimistically pops
buffers from the queue and puts them back in case it runs out of
space and the batch needs to be dropped.
Ladi Prosek [Fri, 24 Mar 2017 14:24:49 +0000 (15:24 +0100)]
virtio-input: free event queue when finalizing
VirtIOInput.queue was never freed. This commit adds an explicit
g_free to virtio_input_finalize and switches the allocation
function from realloc to g_realloc in virtio_input_send.
a qemu with an empty s390 guest will exit very quickly. This races
against the testsuite reading from the console pipe leading to
intermittent test suite failures. Using -no-shutdown will keep
the guest running.
This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN)
fails and returns -1. This results in memset_num_threads getting set to -1.
Which we then pass to g_new0().
The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call
get_memset_num_threads() to handle sysconf() failure gracefully. In case
sysconf() fails, we fall back to single threaded.
Pranith Kumar [Thu, 23 Mar 2017 17:58:51 +0000 (13:58 -0400)]
tcg/i386: Check the size of instruction being translated
This fixes the bug: 'user-to-root privesc inside VM via bad translation
caching' reported by Jann Horn here:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1122
Fam Zheng [Fri, 17 Mar 2017 06:14:47 +0000 (14:14 +0800)]
virtio-scsi: Fix acquire/release in dataplane handlers
After the AioContext lock push down, there is a race between
virtio_scsi_dataplane_start and those "assert(s->ctx &&
s->dataplane_started)", because the latter doesn't isn't wrapped in
aio_context_acquire.
Reproducer is simply booting a Fedora guest with an empty
virtio-scsi-dataplane controller:
Xu, Anthony [Wed, 22 Mar 2017 17:53:35 +0000 (17:53 +0000)]
clear pending status before calling memory commit
clear pending status before calling memory commit.
Otherwise when memory_region_finalize is called,
memory_region_transaction_depth is 0 and
memory_region_update_pending is true.
That's wrong.
In file included from /usr/include/signal.h:326:0,
from /home/pm215/qemu/include/qemu/osdep.h:86,
from /home/pm215/qemu/disas/microblaze.c:36:
/usr/include/sparc64-linux-gnu/sys/ucontext.h:96:0: note: this is the location of the previous definition
#define REG_PC (1)
Since the code doesn't actually use the REG_PC define
anywhere, the simplest fix is just to remove it.
Eric Blake [Mon, 13 Mar 2017 19:55:20 +0000 (14:55 -0500)]
trace: Avoid abuse of amdvi_mmio_read
hw/i386/trace-events has an amdvi_mmio_read trace that is used for
both normal reads (listing the register name, address, size, and
offset) and for an error case (abusing the register name to show
an error message, the address to show the maximum value supported,
then shoehorning address and size into the size and offset
parameters). The change from a wide address to a narrower size
parameter could truncate a (rather-large) bogus read attempt, so
it's better to create a separate dedicated trace with correct types,
rather than abusing the trace mechanism. Broken since its
introduction in commit d29a09c.
[Change trace event argument type from hwaddr to uint64_t since
user-defined types should not be used for trace events. This fixes a
build failure with LTTng UST.
--Stefan]
Eric Blake [Mon, 13 Mar 2017 19:55:19 +0000 (14:55 -0500)]
trace: Fix incorrect megasas trace parameters
hw/scsi/trace-events lists cmd as the first parameter for both
megasas_iovec_overflow and megasas_iovec_underflow, but the caller
was mistakenly passing cmd->iov_size twice instead of the command
index. Also, trace_megasas_abort_invalid is called with parameters
in the wrong order. Broken since its introduction in commit e8f943c3.
Eric Blake [Mon, 13 Mar 2017 19:55:18 +0000 (14:55 -0500)]
trace: Fix backwards mirror_yield parameters
block/trace-events lists the parameters for mirror_yield
consistently with other mirror events (cnt just after s, like in
mirror_before_sleep; in_flight last, like in mirror_yield_in_flight).
But the callers were passing parameters in the wrong order, leading
to poor trace messages, including type truncation when there are
more than 4G dirty sectors involved. Broken since its introduction
in commit bd48bde.
While touching this, ensure that all callers use the same type
(uint64_t) for cnt, as a later patch will enable the compiler to do
stricter type-checking.
Peter Maydell [Thu, 23 Mar 2017 15:21:28 +0000 (15:21 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170323' into staging
ppc patch queue for 2017-03-23
Just a single bugfix in this batch. It's not strictly in ppc code,
though it's for the pseries machine's benefit. Eduardo suggested it
go through my tree however.
Peter Maydell [Thu, 23 Mar 2017 13:43:32 +0000 (13:43 +0000)]
Merge remote-tracking branch 'remotes/gonglei/tags/cryptodev-next-20170323' into staging
cryptodev fixes
# gpg: Signature made Thu 23 Mar 2017 09:22:44 GMT
# gpg: using RSA key 0x2ED7FDE9063C864D
# gpg: Good signature from "Gonglei <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3EF1 8E53 3459 E6D1 963A 3C05 2ED7 FDE9 063C 864D
* remotes/gonglei/tags/cryptodev-next-20170323:
cryptodev: fix asserting single queue
cryptodev: setiv only when really need
Peter Maydell [Thu, 23 Mar 2017 12:31:52 +0000 (12:31 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-03-22-v3' into staging
QAPI patches for 2017-03-22
# gpg: Signature made Wed 22 Mar 2017 18:25:15 GMT
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-03-22-v3:
qapi: Fix QemuOpts visitor regression on unvisited input
qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
tests: Expose regression in QemuOpts visitor
test-qobject-input-visitor: Cover visit_type_uint64()
Revert "hostmem: fix QEMU crash by 'info memdev'"
qapi: Fix string input visitor regression for empty lists
qapi2texi: Fix translation of *strong* and _emphasized_
tests/qapi-schema: Systematic positive doc comment tests
tests/qapi-schema: Make test-qapi.py print docs again
qapi: Drop unused QAPIDoc member optional
qapi2texi: Fix to actually fail when 'doc-required' is false
qapi: Drop excessive Make dependencies on qapi2texi.py
MAINTAINERS: Add myself for files I touched recently
keyval: Document issues with 'any' and alternate types
test-keyval: Cover alternate and 'any' type
keyval: Improve some comments
test-keyval: Tweaks to improve list coverage
Peter Maydell [Thu, 23 Mar 2017 11:04:56 +0000 (11:04 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes
virtio and misc fixes for 2.9.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 22 Mar 2017 16:29:50 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
hw/acpi/vmgenid: prevent more than one vmgenid device
hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
virtio: always use handle_aio_output if registered
virtio: Fix error handling in virtio_bus_device_plugged
Halil Pasic [Wed, 22 Mar 2017 12:36:55 +0000 (13:36 +0100)]
cryptodev: fix asserting single queue
We already check for queues == 1 in cryptodev_builtin_init and when that
is not true raise an error. But before that error is reported the
assertion in cryptodev_builtin_cleanup kicks in (because object is being
finalized and freed).
Let's remove assert(queues == 1) form cryptodev_builtin_cleanup as it
does only harm and no good.
Longpeng(Mike) [Sat, 14 Jan 2017 06:14:27 +0000 (14:14 +0800)]
cryptodev: setiv only when really need
ECB mode cipher doesn't need IV, if we setiv for it then qemu
crypto API would report "Expected IV size 0 not **", so we should
setiv only when the cipher algos really need.
Eric Blake [Wed, 22 Mar 2017 14:45:25 +0000 (09:45 -0500)]
qapi: Fix QemuOpts visitor regression on unvisited input
An off-by-one in commit 15c2f669e meant that we were failing to
check for unparsed input in all QemuOpts visitors. Recent testsuite
additions show that fixing the obvious bug with bogus fields will
also fix the case of an incomplete list visit; update the tests to
match the new behavior.
Eric Blake [Wed, 22 Mar 2017 14:45:24 +0000 (09:45 -0500)]
qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
A regression in commit 15c2f669e caused us to silently ignore
excess input to the QemuOpts visitor. Later, commit ea4641
accidentally abused that situation, by removing "qom-type" and
"id" from the corresponding QDict but leaving them defined in
the QemuOpts, when using the pair of containers to create a
user-defined object. Note that since we are already traversing
two separate items (a QDict and a QemuOpts), we are already
able to flag bogus arguments, as in:
$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -object memory-backend-ram,id=mem1,size=4k,bogus=huh
qemu-system-x86_64: -object memory-backend-ram,id=mem1,size=4k,bogus=huh: Property '.bogus' not found
So the only real concern is that when we re-enable strict checking
in the QemuOpts visitor, we do not want to start flagging the two
leftover keys as unvisited. Rearrange the code to clean out the
QemuOpts listing in advance, rather than removing items from the
QDict. Since "qom-type" is usually an automatic implicit default,
we don't have to restore it (this does mean that once instantiated,
QemuOpts is not necessarily an accurate representation of the
original command line - but this is not the first place to do that);
however "id" has to be put back (requiring us to cast away a const).
[As a side note, hmp_object_add() turns a QDict into a QemuOpts,
then calls user_creatable_add_opts() which converts QemuOpts into
a new QDict. There are probably a lot of wasteful conversions like
this, but cleaning them up is a much bigger task than the immediate
regression fix.]
John Snow [Thu, 16 Mar 2017 21:23:51 +0000 (17:23 -0400)]
blockjob: add devops to blockjob backends
This lets us hook into drained_begin and drained_end requests from the
backend level, which is particularly useful for making sure that all
jobs associated with a particular node (whether the source or the target)
receive a drain request.
Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.
John Snow [Thu, 16 Mar 2017 21:23:49 +0000 (17:23 -0400)]
blockjob: add block_job_start_shim
The purpose of this shim is to allow us to pause pre-started jobs.
The purpose of *that* is to allow us to buffer a pause request that
will be able to take effect before the job ever does any work, allowing
us to create jobs during a quiescent state (under which they will be
automatically paused), then resuming the jobs after the critical section
in any order, either:
(1) -block_job_start
-block_job_resume (via e.g. drained_end)
(2) -block_job_resume (via e.g. drained_end)
-block_job_start
The problem that requires a startup wrapper is the idea that a job must
start in the busy=true state only its first time-- all subsequent entries
require busy to be false, and the toggling of this state is otherwise
handled during existing pause and yield points.
The wrapper simply allows us to mandate that a job can "start," set busy
to true, then immediately pause only if necessary. We could avoid
requiring a wrapper, but all jobs would need to do it, so it's been
factored out here.
Paolo Bonzini [Tue, 21 Mar 2017 17:48:10 +0000 (18:48 +0100)]
blockjob: avoid recursive AioContext locking
Streaming or any other block job hangs when performed on a block device
that has a non-default iothread. This happens because the AioContext
is acquired twice by block_job_defer_to_main_loop_bh and then released
only once by BDRV_POLL_WHILE. (Insert rants on recursive mutexes, which
unfortunately are a temporary but necessary evil for iothreads at the
moment).
Luckily, the reason for the double acquisition is simple; the function
acquires the AioContext for both the job iothread and the BDS iothread,
in case the BDS iothread was changed while the job was running. It
is therefore enough to skip the second acquisition when the two
AioContexts are one and the same.
Laszlo Ersek [Mon, 20 Mar 2017 17:05:56 +0000 (18:05 +0100)]
hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
The WRITE_POINTER linker/loader command that underlies VMGENID depends on
commit baf2d5bfbac0 ("fw-cfg: support writeable blobs", 2017-01-12), which
in turn depends on fw_cfg DMA.
DMA for fw_cfg is enabled in 2.5+ machine types only (see commit e6915b5f3a87, "fw_cfg: unbreak migration compatibility for 2.4 and earlier
machines", 2016-02-18).
Paolo Bonzini [Tue, 28 Feb 2017 13:21:32 +0000 (14:21 +0100)]
virtio: always use handle_aio_output if registered
Commit ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is
active", 2016-10-30) and 9ffe337 ("virtio-blk: always use dataplane
path if ioeventfd is active", 2016-10-30) broke the virtio 1.0
indirect access registers.
The indirect access registers bypass the ioeventfd, so that virtio-blk
and virtio-scsi now repeatedly try to initialize dataplane instead of
triggering the guest->host EventNotifier. Detect the situation by
checking vq->handle_aio_output; if it is not NULL, trigger the
EventNotifier, which is how the device expects to get notifications
and in fact the only thread-safe manner to deliver them.
Eric Blake [Wed, 22 Mar 2017 14:45:23 +0000 (09:45 -0500)]
tests: Expose regression in QemuOpts visitor
Commit 15c2f669e broke the ability of the QemuOpts visitor to
flag extra input parameters, but the regression went unnoticed
because of missing testsuite coverage. Add a test to cover this;
take the approach already used in 9cb8ef3 of adding a test that
passes (to avoid breaking bisection) but marks with BUG the
behavior that we don't like, so that the actual impact of the
fix in a later patch is easier to see.
Fam Zheng [Fri, 17 Mar 2017 12:32:42 +0000 (20:32 +0800)]
virtio: Fix error handling in virtio_bus_device_plugged
For one thing we shouldn't continue if an error happened, for the other
two steps failing can cause an abort() in error_setg because we reuse
the same errp blindly.
Laurent Vivier [Tue, 21 Mar 2017 10:25:42 +0000 (11:25 +0100)]
numa,spapr: align default numa node memory size to 256MB
Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
But when "-numa" option is provided without "mem" parameter,
the memory is equally divided between nodes, but 8MB aligned.
This can be not valid for pseries.
In that case we can have:
$ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
With this patch, we have:
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 1280 MB
node 1 cpus:
node 1 size: 1280 MB
node 2 cpus:
node 2 size: 1536 MB
The new test demonstrates known bugs: integers between INT64_MAX+1 and
UINT64_MAX rejected, and integers between INT64_MIN and -1 are
accepted modulo 2^64.
Peter Maydell [Tue, 21 Mar 2017 14:31:57 +0000 (14:31 +0000)]
configure: Warn about deprecated hosts
We plan to drop support in a future QEMU release for host OSes
and host architectures for which we have no test machine where
we can build and run tests. For the 2.9 release, make configure
print a warning if it is run on such a host, so that the user
has some warning of the plans and can volunteer to help us
maintain the port if they need it to continue to function.
This commit flags up as deprecated the CPU architectures:
* ia64
* sparc
* anything which we don't have a TCG port for
(and which was presumably using TCI)
and the OSes:
* GNU/kFreeBSD
* DragonFly BSD
* NetBSD
* OpenBSD
* Solaris
* AIX
* Haiku
It also makes entirely unrecognized host OS strings be
rejected rather than treated as if they were Linux (which
likely never worked).
Peter Maydell [Tue, 21 Mar 2017 14:32:51 +0000 (14:32 +0000)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This pull request fixes a potential QEMU hang in 9pfs and two issues
reported by Coverity.
# gpg: Signature made Tue 21 Mar 2017 09:57:58 GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: proxy: assert if unmarshal fails
9pfs: don't try to flush self and avoid QEMU hang on reset
parallels block driver is completely broken since commit
commit 75cdcd1553e74b5edc58aed23e3b2da8dabb1876
Author: Markus Armbruster <[email protected]>
Date: Tue Feb 21 21:14:08 2017 +0100
option: Fix checking of sizes for overflow and trailing crap
Right now even simple
qemu-io -c "read 512 64k" 1.hds
ends up with
Unexpected error in parse_option_size() at util/qemu-option.c:188:
Parameter 'prealloc-size' expects a non-negative number below 2^64
Aborted (core dumped)
The cure is simple - we should use 'M' as a suffix in default option value
instead of 'MiB'.
The string input visitor regression fixed in the previous commit made
visit_type_uint16List() fail on empty input. query_memdev() calls it
via object_property_get_uint16List(). Because it doesn't expect it to
fail, it passes &error_abort, and duly crashes.
Commit 1454d33 "fixes" this crash by making
host_memory_backend_get_host_nodes() return a list containing just
MAX_NODES instead of the empty list. Papers over the regression, and
leads to bogus "info memdev" output, as shown below; revert.
I suspect that if we had bisected the crash back then, we would have
found and fixed the actual bug instead of papering over it.
Between commit 74f24cb and 1454d33, it crashes like this:
Unexpected error in parse_str() at /work/armbru/tmp/qemu/qapi/string-input-visitor.c:126:
Parameter 'null' expects an int64 value or range
Aborted (core dumped)
qapi: Fix string input visitor regression for empty lists
Visiting a list when input is the empty string should result in an
empty list, not an error. Noticed when commit 3d089ce belatedly added
tests, but simply accepted as weird then. It's actually a regression:
broken in commit 74f24cb, v2.7.0. Fix it, and throw in another test
case for empty string.
tests/qapi-schema: Make test-qapi.py print docs again
test-qapi.py used to print the internal representation of doc comments
(commit 3313b61). This went away when we dropped the doc comments in
positive tests (commit 87c16dc). Bring it back, because I'm going to
add real positive doc comment tests.
Greg Kurz [Tue, 21 Mar 2017 08:12:47 +0000 (09:12 +0100)]
9pfs: proxy: assert if unmarshal fails
Replies from the virtfs proxy are made up of a fixed-size header (8 bytes)
and a payload of variable size (maximum 64kb). When receiving a reply,
the proxy backend first reads the whole header and then unmarshals it.
If the header is okay, it then does the same operation with the payload.
Since the proxy backend uses a pre-allocated buffer which has enough room
for a header and the maximum payload size, marshalling should never fail
with fixed size arguments. Any error here is likely to result from a more
serious corruption in QEMU and we'd better dump core right away.
This patch adds error checks where they are missing and converts the
associated error paths into assertions.
This should also address Coverity's complaints CID 1348519 and CID 1348520,
about not always checking the return value of proxy_unmarshal().
Greg Kurz [Tue, 21 Mar 2017 08:12:47 +0000 (09:12 +0100)]
9pfs: don't try to flush self and avoid QEMU hang on reset
According to the 9P spec [*], when a client wants to cancel a pending I/O
request identified by a given tag (uint16), it must send a Tflush message
and wait for the server to respond with a Rflush message before reusing this
tag for another I/O. The server may still send a completion message for the
I/O if it wasn't actually cancelled but the Rflush message must arrive after
that.
QEMU hence waits for the flushed PDU to complete before sending the Rflush
message back to the client.
If a client sends 'Tflush tag oldtag' and tag == oldtag, QEMU will then
allocate a PDU identified by tag, find it in the PDU list and wait for
this same PDU to complete... i.e. wait for a completion that will never
happen. This causes a tag and ring slot leak in the guest, and a PDU
leak in QEMU, all of them limited by the maximal number of PDUs (128).
But, worse, this causes QEMU to hang on device reset since v9fs_reset()
wants to drain all pending I/O.
This insane behavior is likely to denote a bug in the client, and it would
deserve an Rerror message to be sent back. Unfortunately, the protocol
allows it and requires all flush requests to suceed (only a Tflush response
is expected).
The only option is to detect when we have to handle a self-referencing
flush request and report success to the client right away.
* remotes/bonzini/tags/for-upstream:
hax: fix breakage in locking
configure: remove Cygwin
xen: do not build backends for targets that do not support xen
qemu-ga: obey LISTEN_PID when using systemd socket activation