Eric Auger [Tue, 28 Mar 2017 17:20:40 +0000 (19:20 +0200)]
hw/intc/arm_gicv3_kvm: Check KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS in reset
KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS needs to be checked before
attempting to read ICC_CTLR_EL1; otherwise kernel versions not
exposing this kvm device group will be incompatible with qemu 2.9.
Peter Maydell [Fri, 31 Mar 2017 10:09:51 +0000 (11:09 +0100)]
Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2017-03-30-tag' into staging
qemu-ga patch queue for 2.9
* fix make check failure of guest-get-fsinfo when nested virtual block
device partitions are mounted in the test environment
* fix static compilation for mingw builds
Peter Maydell [Fri, 31 Mar 2017 09:09:42 +0000 (10:09 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 31 Mar 2017 01:50:55 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
e1000: disable debug by default
virtio-net: avoid call tap_enable when there's only one queue
Jason Wang [Wed, 29 Mar 2017 02:41:23 +0000 (10:41 +0800)]
virtio-net: avoid call tap_enable when there's only one queue
We call tap_enable() even if for multiqueue is not enabled. This is
wrong since it should be used for multiqueue codes to enable a
disabled queue. Fixing this by only calling this when multiqueue is
used.
Michael Roth [Thu, 23 Mar 2017 20:24:32 +0000 (15:24 -0500)]
qga: don't fail if mount doesn't have slave devices
In some cases the slave devices of a virtual block device are tracked
by the parent in the corresponding sysfs node. For instance, if we
have a loop-back mount of the form:
/dev/loop3p1 on /home/mdroth/mnt type ext4 (rw,relatime,data=ordered)
The current code however assumes the mounted virtual block device,
loop3p1 in this case, contains the slaves directory, and reports an
error otherwise. This breaks 'make check' in certain environments.
Fix this by simply skipping attempts to generate disk topology
information in these cases. Since this information is documented
in QAPI as optionally-reported, this should be ok from an API
perspective.
In the future, this can possibly be improved upon by collecting
topology information from the parent in these cases.
There's no reason to pack structures where we don't care about size or
padding, this applies to AcpiStdTable in tests/acpi-utils.h.
OTOH bios-tables-test happens to be passing the address of a field in
this struct to a function that expects a pointer to normally aligned
data which results in a SIGBUS on architectures like SPARC that have
strict alignment requirements.
Fixes: 9e8458c02 ("acpi unit-test: compare DSDT and SSDT tables against expected values") Reported-by: Peter Maydell <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Peter Maydell <[email protected]>
Jason Wang [Wed, 29 Mar 2017 04:10:04 +0000 (12:10 +0800)]
vhost: generalize iommu memory region
We assumes the iommu_ops were attached to the root region of address
space. This may not be true for all kinds of IOMMU implementation and
especially after commit 3716d5902d74 ("pci: introduce a bus master
container"). So fix this by not assuming as->root has iommu_ops,
instead depending on the regions reported by memory listener through:
- register a memory listener to dma_as
- during region_add, if it's a region of IOMMU, register a specific
IOMMU notifier, and store all notifiers in a list.
- during region_del, compare and delete the IOMMU notifier from the list
This is also a must for making vhost device IOTLB works for all types
of IOMMUs. Note, since we register one notifier during each
.region_add, the IOTLB may be flushed more than one times, this is
suboptimal and could be optimized in the future.
Peter Maydell [Thu, 30 Mar 2017 14:28:19 +0000 (15:28 +0100)]
Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates
# gpg: Signature made Tue 28 Mar 2017 23:51:51 BST
# gpg: using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82 304B D017 8C76 7D06 9EE6
# Subkey fingerprint: AEBF 7448 FAB9 453A 4552 390E B0A5 1BF5 8C91 79C5
* remotes/thibault/tags/samuel-thibault:
slirp: Send RDNSS in RA only if host has an IPv6 DNS server
slirp: Make RA build more flexible
slirp: fix compilation errors with DEBUG set
Peter Maydell [Thu, 30 Mar 2017 12:55:40 +0000 (13:55 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci: fixes
More fixes for 2.9.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 29 Mar 2017 00:35:49 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio: fix vring_align() on 64-bit windows
pci: Add missing drop of bus master AS reference
event_notifier: prevent accidental use after close
Peter Maydell [Tue, 28 Mar 2017 13:01:52 +0000 (14:01 +0100)]
configure: Don't claim 'unsupported host OS' when better message available
The change in commit 898be3e0415c6d which made completely
unrecognized OSes cause an error_exit "Unsupported host OS"
has some unfortunate unintended effects:
* if you run 'configure --help' on an unsupported host OS
(eg if intending to use it as a build machine for a
cross compile to a supported host) then the message
is printed instead of --help
* if the C compiler doesn't work or is missing (eg if
you passed an incorrect --cross-prefix by mistake)
the message is printed instead of the more useful
'compiler does not exist or does not work' message
Fix this by postponing the error_exit in this situation
until later, when we have already identified the more
useful cases for this.
The long term fix for this would be to move handling
of --help much further up in the configure script,
and make its output not dependent on checks that configure
runs. However for 2.9 this would be too invasive.
Laurent Vivier [Tue, 28 Mar 2017 12:09:34 +0000 (14:09 +0200)]
spapr: fix memory hot-unplugging
If, once the kernel has booted, we try to remove a memory
hotplugged while the kernel was not started, QEMU crashes on
an assert:
qemu-system-ppc64: hw/virtio/vhost.c:651:
vhost_commit: Assertion `r >= 0' failed.
...
#4 in vhost_commit
#5 in memory_region_transaction_commit
#6 in pc_dimm_memory_unplug
#7 in spapr_memory_unplug
#8 spapr_machine_device_unplug
#9 in hotplug_handler_unplug
#10 in spapr_lmb_release
#11 in detach
#12 in set_allocation_state
#13 in rtas_set_indicator
...
If we take a closer look to the guest kernel log, we can see when
we try to unplug the memory:
pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)
What happens:
1- The kernel has ignored the memory hotplug event because
it was not started when it was generated.
2- When we hot-unplug the memory,
QEMU starts to remove the memory,
generates an hot-unplug event,
and signals the kernel of the incoming new event
3- as the kernel is started, on the QEMU signal, it reads
the event list, decodes the hotplug event and tries to
finish the hotplugging.
4- QEMU receive the the hotplug notification while it
is trying to hot-unplug the memory. This moves the memory
DRC to an invalid state
This patch prevents this by not allowing to set the allocation
state to USABLE while the DRC is awaiting release.
Running postcopy-test with ASAN produces the following error:
QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 tests/postcopy-test
...
=================================================================
==23641==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1556600000 at pc 0x55b8e9d28208 bp 0x7f1555f4d3c0 sp 0x7f1555f4d3b0
READ of size 8 at 0x7f1556600000 thread T6
#0 0x55b8e9d28207 in htab_save_first_pass /home/elmarco/src/qq/hw/ppc/spapr.c:1528
#1 0x55b8e9d2939c in htab_save_iterate /home/elmarco/src/qq/hw/ppc/spapr.c:1665
#2 0x55b8e9beae3a in qemu_savevm_state_iterate /home/elmarco/src/qq/migration/savevm.c:1044
#3 0x55b8ea677733 in migration_thread /home/elmarco/src/qq/migration/migration.c:1976
#4 0x7f15845f46c9 in start_thread (/lib64/libpthread.so.0+0x76c9)
#5 0x7f157d9d0f7e in clone (/lib64/libc.so.6+0x107f7e)
0x7f1556600000 is located 0 bytes to the right of 2097152-byte region [0x7f1556400000,0x7f1556600000)
allocated by thread T0 here:
#0 0x7f159bb76980 in posix_memalign (/lib64/libasan.so.3+0xc7980)
#1 0x55b8eab185b2 in qemu_try_memalign /home/elmarco/src/qq/util/oslib-posix.c:106
#2 0x55b8eab186c8 in qemu_memalign /home/elmarco/src/qq/util/oslib-posix.c:122
#3 0x55b8e9d268a8 in spapr_reallocate_hpt /home/elmarco/src/qq/hw/ppc/spapr.c:1214
#4 0x55b8e9d26e04 in ppc_spapr_reset /home/elmarco/src/qq/hw/ppc/spapr.c:1261
#5 0x55b8ea12e913 in qemu_system_reset /home/elmarco/src/qq/vl.c:1697
#6 0x55b8ea13fa40 in main /home/elmarco/src/qq/vl.c:4679
#7 0x7f157d8e9400 in __libc_start_main (/lib64/libc.so.6+0x20400)
Thread T6 created by T0 here:
#0 0x7f159bae0488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
#1 0x55b8eab1d9cb in qemu_thread_create /home/elmarco/src/qq/util/qemu-thread-posix.c:465
#2 0x55b8ea67874c in migrate_fd_connect /home/elmarco/src/qq/migration/migration.c:2096
#3 0x55b8ea66cbb0 in migration_channel_connect /home/elmarco/src/qq/migration/migration.c:500
#4 0x55b8ea678f38 in socket_outgoing_migration /home/elmarco/src/qq/migration/socket.c:87
#5 0x55b8eaa5a03a in qio_task_complete /home/elmarco/src/qq/io/task.c:142
#6 0x55b8eaa599cc in gio_task_thread_result /home/elmarco/src/qq/io/task.c:88
#7 0x7f15823e38e6 (/lib64/libglib-2.0.so.0+0x468e6)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/elmarco/src/qq/hw/ppc/spapr.c:1528 in htab_save_first_pass
index seems to be wrongly incremented, unless I miss something that
would be worth a comment.
Andrew Baumann [Fri, 24 Mar 2017 23:19:43 +0000 (16:19 -0700)]
virtio: fix vring_align() on 64-bit windows
long is 32-bits on 64-bit windows, which caused the top half of the
address to be truncated; this patch changes it to use the
QEMU_ALIGN_UP macro which does not suffer the same problem
The recent introduction of a bus master container added
memory_region_add_subregion() into the PCI device registering path but
missed memory_region_del_subregion() in the unregistering path leaving
a reference to the root memory region of the new container.
Halil Pasic [Thu, 2 Mar 2017 18:13:08 +0000 (19:13 +0100)]
event_notifier: prevent accidental use after close
Let's set the handles to the underlying facilities to their extremal
value so no accidental misuse can happen, and to make it obvious that the
notifier is dysfunctional. E.g. if we just close an fd but do not touch
the int holding the fd eventually a read/write could succeed again when
the fd gets reused, and corrupt the file addressed by the fd.
Samuel Thibault [Sun, 26 Mar 2017 18:46:34 +0000 (20:46 +0200)]
slirp: Send RDNSS in RA only if host has an IPv6 DNS server
Previously we would always send an RDNSS option in the RA, making the guest
try to resolve DNS through IPv6, even if the host does not actually have
and IPv6 DNS server available.
This makes the RDNSS option enabled only when an IPv6 DNS server is
available.
Eduardo Habkost [Mon, 27 Mar 2017 14:48:15 +0000 (11:48 -0300)]
i386: Don't override -cpu options on -cpu host/max
The existing code for "host" and "max" CPU models overrides every
single feature in the CPU object at realize time, even the ones
that were explicitly enabled or disabled by the user using
"feat=on" or "feat=off", while features set using +feat/-feat are
kept.
This means "-cpu host,+invtsc" works as expected, while
"-cpu host,invtsc=on" doesn't.
This was a known bug, already documented in a comment inside
x86_cpu_expand_features(). What makes this bug worse now is that
libvirt 3.0.0 and newer now use "feat=on|off" instead of
+feat/-feat when it detects a QEMU version that supports it (see
libvirt commit d47db7b16dd5422c7e487c8c8ee5b181a2f9cd66).
Change the feature property getter/setter to set a
env->user_features field, to keep track of features that were
explicitly changed using QOM properties. Then make the
max_features code not override user features when handling "-cpu
host" and "-cpu max".
This will also allow us to remove the plus_features/minus_features
hack in the future, but I plan to do that after 2.9.0 is
released.
Eduardo Habkost [Mon, 27 Mar 2017 14:48:14 +0000 (11:48 -0300)]
i386: Replace uint32_t* with FeatureWord on feature getter/setter
Instead of passing a pointer to the feature property getter and
setter functions, pass a FeatureWord enum so they can perform
other actions related to the feature flag.
This will be used to add a new "user_features" field to keep
track of features that were explicitly set by the user.
We first snprintf() to a fixed buffer, then g_strdup() the result
*boggle*.
Worse, the size of the fixed buffer INET6_ADDRSTRLEN + 5 + 4 is bogus:
the 4 correctly accounts for '[', ']', ':' and '\0', but
INET6_ADDRSTRLEN is not a suitable limit for inet->host, and 5 is not
one for inet->port! They are for host and port in *numeric* form
(exploiting that INET6_ADDRSTRLEN > INET_ADDRSTRLEN), but inet->host
can also be a hostname, and inet->port can be a service name, to be
resolved with getaddrinfo().
Fortunately, the only user so far is the "socket" network backend's
net_socket_connected(), which uses it to initialize a NetSocketState's
info_str[]. info_str[] has considerable more space: 256 instead of
55. So the bug's impact appears to be limited to truncated "info
networks" with the "socket" network backend.
* remotes/cody/tags/block-pull-request:
rbd: Fix bugs around -drive parameter "server"
rbd: Revert -blockdev parameter password-secret
rbd: Revert -blockdev and -drive parameter auth-supported
rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
rbd: Clean up runtime_opts, fix -drive to reject filename
rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
rbd: Clean up after the previous commit
rbd: Don't limit length of parameter values
rbd: Fix to cleanly reject -drive without pool or image
rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.
qemu_rbd_array_opts() extracts these values as follows. First, it
calls qdict_array_entries() to find the list's length. For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.
If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.
The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.
The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.
Rewrite to simply get the values straight from the options QDict.
Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.
Fixes -drive to reject invalid server.*.*.
Permits cleaning up runtime_opts. Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.
This reverts a part of commit 8a47e8e. We're having second thoughts
on the QAPI schema (and thus the external interface), and haven't
reached consensus, yet. Issues include:
* BlockdevOptionsRbd member @password-secret isn't actually a
password, it's a key generated by Ceph.
* We're not sure where member @password-secret belongs (see the
previous commit).
* How @password-secret interacts with settings from a configuration
file specified with @conf is undocumented.
Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.
Note that users can still configure an authentication key with a
configuration file. They probably do that anyway if they use Ceph
outside QEMU as well.
rbd: Revert -blockdev and -drive parameter auth-supported
This reverts half of commit 0a55679. We're having second thoughts on
the QAPI schema (and thus the external interface), and haven't reached
consensus, yet. Issues include:
* The implementation uses deprecated rados_conf_set() key
"auth_supported". No biggie.
* The implementation makes -drive silently ignore invalid parameters
"auth" and "auth-supported.*.X" where X isn't "auth". Fixable (in
fact I'm going to fix similar bugs around parameter server), so
again no biggie.
* BlockdevOptionsRbd member @password-secret applies only to
authentication method cephx. Should it be a variant member of
RbdAuthMethod?
* BlockdevOptionsRbd member @user could apply to both methods cephx
and none, but I'm not sure it's actually used with none. If it
isn't, should it be a variant member of RbdAuthMethod?
* The client offers a *set* of authentication methods, not a list.
Should the methods be optional members of BlockdevOptionsRbd instead
of members of list @auth-supported? The latter begs the question
what multiple entries for the same method mean. Trivial question
now that RbdAuthMethod contains nothing but @type, but less so when
RbdAuthMethod acquires other members, such the ones discussed above.
* How BlockdevOptionsRbd member @auth-supported interacts with
settings from a configuration file specified with @conf is
undocumented. I suspect it's untested, too.
Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.
Note that users can still configure authentication methods with a
configuration file. They probably do that anyway if they use Ceph
outside QEMU as well.
Further note that this doesn't affect use of key "auth-supported" in
-drive file=rbd:...:key=value.
qemu_rbd_array_opts()'s parameter @type now must be RBD_MON_HOST,
which is silly. This will be cleaned up shortly.
rbd: Clean up runtime_opts, fix -drive to reject filename
runtime_opts is used for three different purposes:
* qemu_rbd_open() uses it to accept options it recognizes, such as
"pool" and "image". Other .bdrv_open() methods do it similarly.
* qemu_rbd_open() accepts additional list-valued options
auth-supported and server, with the help of qemu_rbd_array_opts().
The list elements are again dictionaries. qemu_rbd_array_opts()
uses runtime_opts to accept their members. Thus, runtime_opts
contains recognized sub-sub-options "auth", "host", "port" in
addition to recognized options. No other block driver does that.
* qemu_rbd_create() uses it to convert the QDict produced by
qemu_rbd_parse_filename() to QemuOpts. No other block driver does
that. The keys produced by qemu_rbd_parse_filename() are "pool",
"image", "snapshot", "conf", "user" and "keyvalue-pairs".
qemu_rbd_open() accepts these, so no additional ones here.
This is a confusing mess. Dates back to commit 0f9d252. First step
to clean it up is documenting runtime_opts.desc[]:
* Reorder entries to match the QAPI schema, like we do in other block
drivers.
* Document why the schema's "server" and "auth-supported" aren't in
.desc[].
* Document why "keyvalue-pairs", "host", "port" and "auth" are in
.desc[], but not the schema.
* Delete "filename", because none of the three users actually uses it.
This fixes -drive to reject parameter filename instead of silently
ignoring it.
The way we communicate extra key-value pairs from
qemu_rbd_parse_filename() to qemu_rbd_open() exposes option parameter
"keyvalue-pairs" on the command line. It's not wanted there. Hack:
rename the parameter to "=keyvalue-pairs" to make it inaccessible.
We laboriously enforce that parameter values are between one and some
arbitrary limit in length. Only RBD_MAX_IMAGE_NAME_SIZE comes from
librbd.h, and I'm not sure it applies. Where the other limits come
from is unclear.
Drop the length checking. The limits librbd actually imposes must be
checked by librbd anyway.
There's one minor complication: BDRVRBDState member name is a
fixed-size array. Depends on the length limit. Make it a pointer to
a dynamically allocated string.
rbd: Fix to cleanly reject -drive without pool or image
qemu_rbd_open() neglects to check pool and image are present. Missing
image is caught by rbd_open(), but missing pool crashes. Reproducer:
$ qemu-system-x86_64 -nodefaults -drive driver=rbd,id=rbd,image=i,...
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
Aborted (core dumped)
where ... is a working server.0.{host,port} configuration.
Doesn't affect -drive with file=..., because qemu_rbd_parse_filename()
always sets both pool and image.
Doesn't affect -blockdev, because pool and image are mandatory in the
QAPI schema.
rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}
We use InetSocketAddress in the QAPI schema. However, the code
doesn't use inet_connect_saddr(), but formats "host" and "port" into a
configuration string for rados_conf_set(). Thus, members "numeric",
"to", "ipv4" and "ipv6" are silently ignored. Not nice. Example:
Peter Maydell [Tue, 28 Mar 2017 11:34:23 +0000 (12:34 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1' into staging
MTTCG regression fixes for rc2
# gpg: Signature made Tue 28 Mar 2017 10:54:38 BST
# gpg: using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1:
replay/replay.c: bump REPLAY_VERSION
tcg: Add a new line after incompatibility warning
ui/console: use exclusive mechanism directly
ui/console: ensure do_safe_dpy_refresh holds BQL
bsd-user: align use of mmap_lock to that of linux-user
user-exec: handle synchronous signals from QEMU gracefully
Stefan Hajnoczi [Mon, 27 Mar 2017 13:17:18 +0000 (14:17 +0100)]
trace: fix tcg tracing build breakage
Commit 0ab8ed18a6fe98bfc82705b0f041fbf2a8ca5b60 ("trace: switch to
modular code generation for sub-directories") forgot to convert "tcg"
trace events to the modular code generation approach where each
sub-directory has its own trace-events file.
This patch fixes compilation for "tcg" trace events. Currently they are
only used in the root ./trace-events file.
"tcg" trace events can only be used in the root ./trace-events file for
the time being.
Denis V. Lunev [Mon, 27 Mar 2017 14:38:08 +0000 (17:38 +0300)]
parallels: wrong call to bdrv_truncate
Parallels driver should not call bdrv_truncate if the image was opened
in the read-only mode. Without the patch
qemu-img check harddisk.hds
asserts with
bdrv_truncate: Assertion `child->perm & BLK_PERM_RESIZE' failed.
Parameters used on the write path are not needed if the image is opened
in the read-only mode.
Alex Bennée [Fri, 24 Mar 2017 15:21:55 +0000 (15:21 +0000)]
replay/replay.c: bump REPLAY_VERSION
A previous commit (3d4d16f4) added support for audio record/playback.
However this breaks the logfile ABI due to the re-ordering of the
ReplayEvents enum. The REPLAY_VERSION check is meant to prevent you
from using old log files in newer QEMUs but this is currently broken.
Alex Bennée [Fri, 24 Mar 2017 15:39:05 +0000 (15:39 +0000)]
ui/console: use exclusive mechanism directly
The previous commit (8bb93c6f99) using async_safe_run_on_cpu() doesn't
work on graphics sub-system which restrict which threads can do GUI
updates. Rather the special casing MacOS we just directly call the
helper and move all the exclusive handling into do_dafe_dpy_refresh().
The unfortunate bouncing of the BQL is to ensure there is no deadlock
as vCPUs waiting on the BQL are kicked into their quiescent state.
Alex Bennée [Wed, 22 Mar 2017 16:11:13 +0000 (16:11 +0000)]
ui/console: ensure do_safe_dpy_refresh holds BQL
I missed the fact that when an exclusive work item runs it drops the
BQL to ensure all no vCPUs are stuck waiting for it, hence causing a
deadlock. However the actual helper needs to take the BQL especially
as we'll be messing with device emulation bits during the update which
all assume BQL is held.
We make a minor cpu_reloading_memory_map which must try and unlock the
RCU if we are actually outside the running context.
Alex Bennée [Mon, 20 Mar 2017 14:36:47 +0000 (14:36 +0000)]
bsd-user: align use of mmap_lock to that of linux-user
The introduction of stricter mmap_lock checking in translate-all broke
the BSD user build. The working mmap_lock functions were hidden behind
CONFIG_USE_NPTL which is never defined. This patch brings them inline
with linux-user.
Despite the disapearence of the comment "We aren't threadsafe to start
with..." this doesn't make bsd-user so. It will still need the rest of
the fixes that have been done in linux-user ported over.
Alex Bennée [Mon, 20 Mar 2017 11:31:44 +0000 (11:31 +0000)]
user-exec: handle synchronous signals from QEMU gracefully
When "tcg: enable thread-per-vCPU" (commit 3725794) was merged the
lifetime of current_cpu was changed. Previously a broken linux-user
call might abort() which can eventually escalate into a SIGSEGV which
would then crash qemu as it attempted to deref a NULL current_cpu.
After commit 3725794 it would attempt to fixup state and re-start the
run-loop and much hilarity (i.e. a looping lockup) would ensue from
jumping into a stale jmp_env.
As we can actually tell if we are in the run-loop from looking at the
cpu->running flag we should catch this badness first and abort()
cleanly rather than try to soldier on. There is a theoretical race
between the flag being set and sigsetjmp refreshing the jump buffer
but we can try really hard to not introduce crashes into that code.
Peter Maydell [Tue, 28 Mar 2017 08:48:23 +0000 (09:48 +0100)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This series fixes potential memory/fd leaks in 9pfs and a crash when
running tests/virtio-9p-test on SPARC hosts.
# gpg: Signature made Tue 28 Mar 2017 09:44:05 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
9pfs: fix file descriptor leak
Peter Maydell [Mon, 27 Mar 2017 17:59:04 +0000 (18:59 +0100)]
tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
For a packed struct like 'P9Hdr' the fields within it may not be
aligned as much as the natural alignment for their types. This means
it is not valid to pass the address of such a field to a function
like le32_to_cpus() which operate on uint32_t* and assume alignment.
Doing this results in a SIGBUS on hosts like SPARC which have strict
alignment requirements.
Use ldl_le_p() instead, which is specified to correctly handle
unaligned pointers.
Li Qiang [Mon, 27 Mar 2017 19:13:19 +0000 (21:13 +0200)]
9pfs: fix file descriptor leak
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.
This patch ensures that the fid isn't already associated to anything
before using it.
Peter Maydell [Mon, 27 Mar 2017 16:34:50 +0000 (17:34 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* MTTCG fix for win32
* virtio-scsi assertion failure
* mem-prealloc coverity fix
* x86 migration revert which requires more thought
* x86 instruction limit (avoids >2 page translation blocks)
* nbd dead code cleanup
* small memory.c logic fix
* remotes/bonzini/tags/for-upstream:
scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
Revert "apic: save apic_delivered flag"
nbd: drop unused NBDClientSession.is_unix field
win32: replace custom mutex and condition variable with native primitives
mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
tcg/i386: Check the size of instruction being translated
virtio-scsi: Fix acquire/release in dataplane handlers
virtio-scsi: Make virtio_scsi_acquire/release public
clear pending status before calling memory commit
Peter Maydell [Mon, 27 Mar 2017 15:56:31 +0000 (16:56 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-03-27' into staging
Block patches for 2.9-rc2.
# gpg: Signature made Mon 27 Mar 2017 16:47:54 BST
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2017-03-27:
block/file-posix.c: Fix unused variable warning on OpenBSD
file-posix: Make bdrv_flush() failure permanent without O_DIRECT
nbd-client: fix handling of hungup connections
qemu-img: print short help on getopt failure
qemu-img: fix switch indentation in img_amend()
qemu-img: show help for invalid global options
Peter Maydell [Thu, 23 Mar 2017 14:36:28 +0000 (14:36 +0000)]
block/file-posix.c: Fix unused variable warning on OpenBSD
On OpenBSD none of the ioctls probe_logical_blocksize() tries
exist, so the variable sector_size is unused. Refactor the
code to avoid this (and reduce the duplicated code).
Kevin Wolf [Wed, 22 Mar 2017 21:00:05 +0000 (22:00 +0100)]
file-posix: Make bdrv_flush() failure permanent without O_DIRECT
Success for bdrv_flush() means that all previously written data is safe
on disk. For fdatasync(), the best semantics we can hope for on Linux
(without O_DIRECT) is that all data that was written since the last call
was successfully written back. Therefore, and because we can't redo all
writes after a flush failure, we have to give up after a single
fdatasync() failure. After this failure, we would never be able to make
the promise that a successful bdrv_flush() makes.
Paolo Bonzini [Tue, 14 Mar 2017 11:11:56 +0000 (12:11 +0100)]
nbd-client: fix handling of hungup connections
After the switch to reading replies in a coroutine, nothing is
reentering pending receive coroutines if the connection hangs.
Move nbd_recv_coroutines_enter_all to the reply read coroutine,
which is the place where hangups are detected. nbd_teardown_connection
can simply wait for the reply read coroutine to detect the hangup
and clean up after itself.
This wouldn't be enough though because nbd_receive_reply returns 0
(rather than -EPIPE or similar) when reading from a hung connection.
Fix the return value check in nbd_read_reply_entry.
Stefan Hajnoczi [Fri, 17 Mar 2017 10:45:41 +0000 (18:45 +0800)]
qemu-img: print short help on getopt failure
Printing the full help output obscures the error message for an invalid
command-line option or missing argument.
Before this patch:
$ ./qemu-img --foo
...pages of output...
After this patch:
$ ./qemu-img --foo
qemu-img: unrecognized option '--foo'
Try 'qemu-img --help' for more information
This patch adds the getopt ':' character so that it can distinguish
between missing arguments and unrecognized options. This helps provide
more detailed error messages.
Andrey Shedel [Fri, 24 Mar 2017 22:01:41 +0000 (15:01 -0700)]
win32: replace custom mutex and condition variable with native primitives
The multithreaded TCG implementation exposed deadlocks in the win32
condition variables: as implemented, qemu_cond_broadcast waited on
receivers, whereas the pthreads API it was intended to emulate does
not. This was causing a deadlock because broadcast was called while
holding the IO lock, as well as all possible waiters blocked on the
same lock.
This patch replaces all the custom synchronisation code for mutexes
and condition variables with native Windows primitives (SRWlocks and
condition variables) with the same semantics as their POSIX
equivalents. To enable that, it requires a Windows Vista or newer host
OS.
Gerd Hoffmann [Tue, 14 Mar 2017 08:26:58 +0000 (09:26 +0100)]
vnc: fix reverse mode
vnc server in reverse mode (qemu -vnc localhost:$nr,reverse) interprets
$nr as display number (i.e. with 5900 offset) in recent qemu versions.
Historical and documented behavior is interpreting $nr as port number
though. So we should bring code and documentation in line.
Given that default listening port for viewers is 5500 the 5900 offset is
pretty inconvinient, because it is simply impossible to connect to port
5500. So, lets fix the code not the docs.
Gerd Hoffmann [Mon, 20 Mar 2017 08:04:02 +0000 (09:04 +0100)]
ui/egl-helpers: fix egl 1.5 display init
Unfortunaly switching to getPlatformDisplayEXT isn't as easy as
implemented by 0ea1523fb6703aa0dcd65e66b59e96fec028e60a. See the
longish comment for the complete story.
Ladi Prosek [Fri, 24 Mar 2017 14:24:50 +0000 (15:24 +0100)]
virtio-input: fix eventq batching
virtio_input_send buffers input events until it sees a SYNC. Then it
either sends or drops the entire batch, depending on whether eventq
has enough space available. The case to avoid here is partial sends
where only part of the batch would get to the guest.
Using virtqueue_get_avail_bytes to check the state of eventq was not
correct. The queue may have a smaller number of larger buffers
available so bytes may be enough but the batch would still not be
possible to send, leading to the "Huh? No vq elem available" error.
Instead of checking available bytes, this patch optimistically pops
buffers from the queue and puts them back in case it runs out of
space and the batch needs to be dropped.
Ladi Prosek [Fri, 24 Mar 2017 14:24:49 +0000 (15:24 +0100)]
virtio-input: free event queue when finalizing
VirtIOInput.queue was never freed. This commit adds an explicit
g_free to virtio_input_finalize and switches the allocation
function from realloc to g_realloc in virtio_input_send.
a qemu with an empty s390 guest will exit very quickly. This races
against the testsuite reading from the console pipe leading to
intermittent test suite failures. Using -no-shutdown will keep
the guest running.
This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN)
fails and returns -1. This results in memset_num_threads getting set to -1.
Which we then pass to g_new0().
The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call
get_memset_num_threads() to handle sysconf() failure gracefully. In case
sysconf() fails, we fall back to single threaded.
Pranith Kumar [Thu, 23 Mar 2017 17:58:51 +0000 (13:58 -0400)]
tcg/i386: Check the size of instruction being translated
This fixes the bug: 'user-to-root privesc inside VM via bad translation
caching' reported by Jann Horn here:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1122
Fam Zheng [Fri, 17 Mar 2017 06:14:47 +0000 (14:14 +0800)]
virtio-scsi: Fix acquire/release in dataplane handlers
After the AioContext lock push down, there is a race between
virtio_scsi_dataplane_start and those "assert(s->ctx &&
s->dataplane_started)", because the latter doesn't isn't wrapped in
aio_context_acquire.
Reproducer is simply booting a Fedora guest with an empty
virtio-scsi-dataplane controller:
Xu, Anthony [Wed, 22 Mar 2017 17:53:35 +0000 (17:53 +0000)]
clear pending status before calling memory commit
clear pending status before calling memory commit.
Otherwise when memory_region_finalize is called,
memory_region_transaction_depth is 0 and
memory_region_update_pending is true.
That's wrong.
In file included from /usr/include/signal.h:326:0,
from /home/pm215/qemu/include/qemu/osdep.h:86,
from /home/pm215/qemu/disas/microblaze.c:36:
/usr/include/sparc64-linux-gnu/sys/ucontext.h:96:0: note: this is the location of the previous definition
#define REG_PC (1)
Since the code doesn't actually use the REG_PC define
anywhere, the simplest fix is just to remove it.
Eric Blake [Mon, 13 Mar 2017 19:55:20 +0000 (14:55 -0500)]
trace: Avoid abuse of amdvi_mmio_read
hw/i386/trace-events has an amdvi_mmio_read trace that is used for
both normal reads (listing the register name, address, size, and
offset) and for an error case (abusing the register name to show
an error message, the address to show the maximum value supported,
then shoehorning address and size into the size and offset
parameters). The change from a wide address to a narrower size
parameter could truncate a (rather-large) bogus read attempt, so
it's better to create a separate dedicated trace with correct types,
rather than abusing the trace mechanism. Broken since its
introduction in commit d29a09c.
[Change trace event argument type from hwaddr to uint64_t since
user-defined types should not be used for trace events. This fixes a
build failure with LTTng UST.
--Stefan]
Eric Blake [Mon, 13 Mar 2017 19:55:19 +0000 (14:55 -0500)]
trace: Fix incorrect megasas trace parameters
hw/scsi/trace-events lists cmd as the first parameter for both
megasas_iovec_overflow and megasas_iovec_underflow, but the caller
was mistakenly passing cmd->iov_size twice instead of the command
index. Also, trace_megasas_abort_invalid is called with parameters
in the wrong order. Broken since its introduction in commit e8f943c3.
Eric Blake [Mon, 13 Mar 2017 19:55:18 +0000 (14:55 -0500)]
trace: Fix backwards mirror_yield parameters
block/trace-events lists the parameters for mirror_yield
consistently with other mirror events (cnt just after s, like in
mirror_before_sleep; in_flight last, like in mirror_yield_in_flight).
But the callers were passing parameters in the wrong order, leading
to poor trace messages, including type truncation when there are
more than 4G dirty sectors involved. Broken since its introduction
in commit bd48bde.
While touching this, ensure that all callers use the same type
(uint64_t) for cnt, as a later patch will enable the compiler to do
stricter type-checking.
Peter Maydell [Thu, 23 Mar 2017 15:21:28 +0000 (15:21 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170323' into staging
ppc patch queue for 2017-03-23
Just a single bugfix in this batch. It's not strictly in ppc code,
though it's for the pseries machine's benefit. Eduardo suggested it
go through my tree however.
Peter Maydell [Thu, 23 Mar 2017 13:43:32 +0000 (13:43 +0000)]
Merge remote-tracking branch 'remotes/gonglei/tags/cryptodev-next-20170323' into staging
cryptodev fixes
# gpg: Signature made Thu 23 Mar 2017 09:22:44 GMT
# gpg: using RSA key 0x2ED7FDE9063C864D
# gpg: Good signature from "Gonglei <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3EF1 8E53 3459 E6D1 963A 3C05 2ED7 FDE9 063C 864D
* remotes/gonglei/tags/cryptodev-next-20170323:
cryptodev: fix asserting single queue
cryptodev: setiv only when really need
Peter Maydell [Thu, 23 Mar 2017 12:31:52 +0000 (12:31 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-03-22-v3' into staging
QAPI patches for 2017-03-22
# gpg: Signature made Wed 22 Mar 2017 18:25:15 GMT
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-03-22-v3:
qapi: Fix QemuOpts visitor regression on unvisited input
qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
tests: Expose regression in QemuOpts visitor
test-qobject-input-visitor: Cover visit_type_uint64()
Revert "hostmem: fix QEMU crash by 'info memdev'"
qapi: Fix string input visitor regression for empty lists
qapi2texi: Fix translation of *strong* and _emphasized_
tests/qapi-schema: Systematic positive doc comment tests
tests/qapi-schema: Make test-qapi.py print docs again
qapi: Drop unused QAPIDoc member optional
qapi2texi: Fix to actually fail when 'doc-required' is false
qapi: Drop excessive Make dependencies on qapi2texi.py
MAINTAINERS: Add myself for files I touched recently
keyval: Document issues with 'any' and alternate types
test-keyval: Cover alternate and 'any' type
keyval: Improve some comments
test-keyval: Tweaks to improve list coverage
Peter Maydell [Thu, 23 Mar 2017 11:04:56 +0000 (11:04 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes
virtio and misc fixes for 2.9.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 22 Mar 2017 16:29:50 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
hw/acpi/vmgenid: prevent more than one vmgenid device
hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
virtio: always use handle_aio_output if registered
virtio: Fix error handling in virtio_bus_device_plugged
Halil Pasic [Wed, 22 Mar 2017 12:36:55 +0000 (13:36 +0100)]
cryptodev: fix asserting single queue
We already check for queues == 1 in cryptodev_builtin_init and when that
is not true raise an error. But before that error is reported the
assertion in cryptodev_builtin_cleanup kicks in (because object is being
finalized and freed).
Let's remove assert(queues == 1) form cryptodev_builtin_cleanup as it
does only harm and no good.
Longpeng(Mike) [Sat, 14 Jan 2017 06:14:27 +0000 (14:14 +0800)]
cryptodev: setiv only when really need
ECB mode cipher doesn't need IV, if we setiv for it then qemu
crypto API would report "Expected IV size 0 not **", so we should
setiv only when the cipher algos really need.
Eric Blake [Wed, 22 Mar 2017 14:45:25 +0000 (09:45 -0500)]
qapi: Fix QemuOpts visitor regression on unvisited input
An off-by-one in commit 15c2f669e meant that we were failing to
check for unparsed input in all QemuOpts visitors. Recent testsuite
additions show that fixing the obvious bug with bogus fields will
also fix the case of an incomplete list visit; update the tests to
match the new behavior.
Eric Blake [Wed, 22 Mar 2017 14:45:24 +0000 (09:45 -0500)]
qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
A regression in commit 15c2f669e caused us to silently ignore
excess input to the QemuOpts visitor. Later, commit ea4641
accidentally abused that situation, by removing "qom-type" and
"id" from the corresponding QDict but leaving them defined in
the QemuOpts, when using the pair of containers to create a
user-defined object. Note that since we are already traversing
two separate items (a QDict and a QemuOpts), we are already
able to flag bogus arguments, as in:
$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -object memory-backend-ram,id=mem1,size=4k,bogus=huh
qemu-system-x86_64: -object memory-backend-ram,id=mem1,size=4k,bogus=huh: Property '.bogus' not found
So the only real concern is that when we re-enable strict checking
in the QemuOpts visitor, we do not want to start flagging the two
leftover keys as unvisited. Rearrange the code to clean out the
QemuOpts listing in advance, rather than removing items from the
QDict. Since "qom-type" is usually an automatic implicit default,
we don't have to restore it (this does mean that once instantiated,
QemuOpts is not necessarily an accurate representation of the
original command line - but this is not the first place to do that);
however "id" has to be put back (requiring us to cast away a const).
[As a side note, hmp_object_add() turns a QDict into a QemuOpts,
then calls user_creatable_add_opts() which converts QemuOpts into
a new QDict. There are probably a lot of wasteful conversions like
this, but cleaning them up is a much bigger task than the immediate
regression fix.]
John Snow [Thu, 16 Mar 2017 21:23:51 +0000 (17:23 -0400)]
blockjob: add devops to blockjob backends
This lets us hook into drained_begin and drained_end requests from the
backend level, which is particularly useful for making sure that all
jobs associated with a particular node (whether the source or the target)
receive a drain request.
Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.