Eduardo Habkost [Fri, 18 Aug 2017 19:09:43 +0000 (16:09 -0300)]
numa: Move numa_legacy_auto_assign_ram to pc-i440fx-2.9
The 'm->numa_auto_assign_ram = numa_legacy_auto_assign_ram;' line
was supposed to be in pc_i440fx_2_9_machine_options() (see commit 3bfe5716 "numa: equally distribute memory on nodes"), but the
merge commit adb354dd ("Merge remote-tracking branch
'mst/tags/for_upstream' into staging") moved it to the
pc_i440fx_2_10_machine_options().
Move the line back to pc_i440fx_2_9_machine_options().
Igor Mammedov [Thu, 17 Aug 2017 16:14:13 +0000 (18:14 +0200)]
fix build failure in nbd_read_reply_entry()
travis builds fail at HEAD at rc3 master with
block/nbd-client.c: In function ‘nbd_read_reply_entry’:
block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized]
Peter Maydell [Wed, 23 Aug 2017 08:04:20 +0000 (09:04 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170823' into staging
ppc patch queue 2017-08-23
This is identical to the pull request from yesterday (20180822),
except that a bug in one patch is fixed so that it doesn't break TCG
on a ppc host.
Last minute ppc related fixes for qemu-2.10. I'm not sure if these
are critical enough to prompt another rc, but I'm submitting them for
consideration.
First, is Cornelia's fix for 480bc11e6 which meant "make check" would
always fail on a ppc host. Tracking that down delayed submission of
the rest of these patches, sorry.
The rest are all fairly important bugfixes for qemu crashes or guest
behaviour regression on ppc. Patches 2-4 specifically are fixes for
regressions from qemu-2.9, caused by the compatibility mode and
hotplug handling cleanups for the pseries machine type.
* remotes/dgibson/tags/ppc-for-2.10-20170823:
hw/ppc/spapr_iommu: Fix crash when removing the "spapr-tce-table" device
hw/ppc/spapr_rtc: Mark the RTC device with user_creatable = false
hw/ppc/spapr: Fix segfault when instantiating a 'pc-dimm' without 'memdev'
spapr: Allow configure-connector to be called multiple times
ppc: fix ppc_set_compat() with KVM PR
target/ppc: 'PVR != host PVR' in KVM_SET_SREGS workaround
boot-serial-test: prefer tcg accelerator
Thomas Huth [Thu, 17 Aug 2017 05:15:10 +0000 (07:15 +0200)]
hw/ppc/spapr_rtc: Mark the RTC device with user_creatable = false
QEMU currently aborts unexpectedly when a user tries to do something
like this:
$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-rtc,id=spapr-rtc
(qemu) device_del spapr-rtc
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)
The RTC device is not meant to be hot-pluggable - it's an internal
device only and it even should not be possible to create it a
second time with the "-device" parameter, so let's mark this
with "user_creatable = false".
Thomas Huth [Mon, 21 Aug 2017 06:30:29 +0000 (08:30 +0200)]
hw/ppc/spapr: Fix segfault when instantiating a 'pc-dimm' without 'memdev'
QEMU currently crashes when trying to use a 'pc-dimm' on the pseries
machine without specifying its 'memdev' property. This happens because
pc_dimm_get_memory_region() does not check whether the 'memdev' property
has properly been set by the user. Looking closer at this function, it's
also obvious that it is using &error_abort to call another function - and
this is bad in a function that is used in the hot-plugging calling chain
since this can also cause QEMU to exit unexpectedly.
So let's fix these issues in a proper way now: Add a "Error **errp"
parameter to pc_dimm_get_memory_region() which we use in case the 'memdev'
property has not been set by the user, and which we can use instead of
the &error_abort, and change the callers of get_memory_region() to make
use of this "errp" parameter for proper error checking.
Bharata B Rao [Thu, 17 Aug 2017 05:16:42 +0000 (10:46 +0530)]
spapr: Allow configure-connector to be called multiple times
In case of in-kernel memory hot unplug, when the guest is not able
to remove all the LMBs that are requested for removal, it will add back
any LMBs that have been successfully removed. The DR Connectors of
these LMBs wouldn't have been unconfigured and hence the addition of
these LMBs will result in configure-connector call being issued on
LMB DR connectors that are already in configured state. Such
configure-connector calls will fail resulting in a DIMM which is
partially unplugged.
This however worked till recently before we overhauled the DRC
implementation in QEMU. Commit 9d4c0f4f0a71e: "spapr: Consolidate
DRC state variables" is the first commit where this problem shows up
as per git bisect.
Ideally guest shouldn't be issuing configure-connector call on an
already configured DR connector. However for now, work around this in
QEMU by allowing configure-connector to be called multiple times for
all types of DR connectors.
Signed-off-by: Bharata B Rao <[email protected]>
[dwg: Corrected buglet that would have initialized fdt pointers ready
for reading on a device not present at reset] Signed-off-by: David Gibson <[email protected]>
Greg Kurz [Mon, 14 Aug 2017 17:49:16 +0000 (19:49 +0200)]
ppc: fix ppc_set_compat() with KVM PR
When running in KVM PR mode, kvmppc_set_compat() always fail because the
current PR implementation doesn't handle KVM_REG_PPC_ARCH_COMPAT. Now that
the machine code inconditionally calls ppc_set_compat_all() at reset time
to restore the compat mode default value (commit 66d5c492dd3a9), it is
impossible to start a guest with PR:
qemu-system-ppc64: Unable to set CPU compatibility mode in KVM:
Invalid argument
A tentative patch [1] was recently sent by Suraj to address the issue, but
it would prevent the compat mode to be turned off on reset. And we really
don't want to explicitely check for KVM PR. During the patch's review,
David suggested that we should only call the KVM ioctl() if the compat
PVR changes. This allows at least to run with KVM PR, provided no compat
mode is requested from the command line (which should be the case when
running PR nested). This is what this patch does.
While here, we also fix the side effect where KVM would fail but we would
change the CPU state in QEMU anyway.
target/ppc: 'PVR != host PVR' in KVM_SET_SREGS workaround
Commit d5fc133eed ("ppc: Rework CPU compatibility testing
across migration") changed the way cpu_post_load behaves with
the PVR setting, causing an unexpected bug in KVM-HV migrations
between hosts that are compatible (POWER8 and POWER8E, for example).
Even with pvr_match() returning true, the guest freezes right after
cpu_post_load. The reason is that the guest kernel can't handle a
different PVR value other that the running host in KVM_SET_SREGS.
In [1] it was discussed the possibility of a new KVM capability
that would indicate that the guest kernel can handle a different
PVR in KVM_SET_SREGS. Even if such feature is implemented, there is
still the problem with older kernels that will not have this capability
and will fail to migrate.
This patch implements a workaround for that scenario. If running
with KVM, check if the guest kernel does not have the capability
(named here as 'cap_ppc_pvr_compat'). If it doesn't, calls
kvmppc_is_pr() to see if the guest is running in KVM-HV. If all this
happens, set env->spr[SPR_PVR] to the same value as the current
host PVR. This ensures that we allow migrations with 'close enough'
PVRs to still work in KVM-HV but also makes the code ready for
this new KVM capability when it is done.
A new function called 'kvmppc_pvr_workaround_required' was created
to encapsulate the conditions said above and to avoid calling too
many kvm.c internals inside cpu_post_load.
Signed-off-by: Daniel Henrique Barboza <[email protected]>
[dwg: Fix for the case of using TCG on a PPC host] Signed-off-by: David Gibson <[email protected]>
Peter Maydell [Tue, 15 Aug 2017 17:17:02 +0000 (18:17 +0100)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-08-15' into staging
nbd patches for 2017-08-15
- Eric Blake: nbd: Fix trace message for disconnect
- Stefan Hajnoczi: qemu-iotests: step clock after each test iteration
- Fam Zheng: 0/4 block: Fix non-shared storage migration
- Eric Blake: nbd-client: Fix regression when server sends garbage
# gpg: Signature made Tue 15 Aug 2017 16:06:02 BST
# gpg: using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg: aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2017-08-15:
nbd-client: Fix regression when server sends garbage
iotests: Add non-shared storage migration case 192
block-backend: Defer shared_perm tightening migration completion
nbd: Fix order of bdrv_set_perm and bdrv_invalidate_cache
stubs: Add vm state change handler stubs
qemu-iotests: step clock after each test iteration
nbd: Fix trace message for disconnect
Peter Maydell [Tue, 15 Aug 2017 14:30:18 +0000 (15:30 +0100)]
mmio-interface: Mark as not user creatable
The mmio-interface device is not something we want to allow
users to create on the command line:
* it is intended as an implementation detail of the memory
subsystem, which gets created and deleted by that
subsystem on demand; it makes no sense to create it
by hand on the command line
* it uses a pointer property 'host_ptr' which can't be
set on the command line
Mark the device as not user_creatable to avoid confusion.
Alistair Francis [Tue, 15 Aug 2017 14:57:14 +0000 (07:57 -0700)]
target/arm: Require alignment for load exclusive
According to the ARM ARM exclusive loads require the same alignment as
exclusive stores. Let's update the memops used for the load to match
that of the store. This adds the alignment requirement to the memops.
We are not providing the required single-copy atomic semantics for
the 64-bit operation that is the 32-bit paired load.
At the same time, leave the entire 64-bit value in cpu_exclusive_val
and stop writing to cpu_exclusive_high. This means that we do not
have to re-assemble the 64-bit quantity when it comes time to store.
At the same time, drop a redundant temporary and perform all loads
directly into the cpu_exclusive_* globals.
Alistair Francis [Tue, 15 Aug 2017 14:57:12 +0000 (07:57 -0700)]
target/arm: Correct exclusive store cmpxchg memop mask
When we perform the atomic_cmpxchg operation we want to perform the
operation on a pair of 32-bit registers. Previously we were just passing
the register size in which was set to MO_32. This would result in the
high register to be ignored. To fix this issue we hardcode the size to
be 64-bits long when operating on 32-bit pairs.
Eric Blake [Mon, 14 Aug 2017 21:34:26 +0000 (16:34 -0500)]
nbd-client: Fix regression when server sends garbage
When we switched NBD to use coroutines for qemu 2.9 (in particular,
commit a12a712a), we introduced a regression: if a server sends us
garbage (such as a corrupted magic number), we quit the read loop
but do not stop sending further queued commands, resulting in the
client hanging when it never reads the response to those additional
commands. In qemu 2.8, we properly detected that the server is no
longer reliable, and cancelled all existing pending commands with
EIO, then tore down the socket so that all further command attempts
get EPIPE.
Restore the proper behavior of quitting (almost) all communication
with a broken server: Once we know we are out of sync or otherwise
can't trust the server, we must assume that any further incoming
data is unreliable and therefore end all pending commands with EIO,
and quit trying to send any further commands. As an exception, we
still (try to) send NBD_CMD_DISC to let the server know we are going
away (in part, because it is easier to do that than to further
refactor nbd_teardown_connection, and in part because it is the
only command where we do not have to wait for a reply).
Based on a patch by Vladimir Sementsov-Ogievskiy.
A malicious server can be created with the following hack,
followed by setting NBD_SERVER_DEBUG to a non-zero value in the
environment when running qemu-nbd:
As in the case of nbd_export_new(), bdrv_invalidate_cache() can be
called when migration is still in progress. In this case we are not
ready to tighten the shared permissions fenced by blk->disable_perm.
Kevin Wolf [Tue, 15 Aug 2017 13:07:38 +0000 (21:07 +0800)]
nbd: Fix order of bdrv_set_perm and bdrv_invalidate_cache
The "inactive" state of BDS affects whether the permissions can be
granted, we must call bdrv_invalidate_cache before bdrv_set_perm to
support "-incoming defer" case.
Stefan Hajnoczi [Tue, 15 Aug 2017 13:05:02 +0000 (14:05 +0100)]
qemu-iotests: step clock after each test iteration
The 093 throttling test submits twice as many requests as the throttle
limit in order to ensure that we reach the limit. The remaining
requests are left in-flight at the end of each test iteration.
Commit 452589b6b47e8dc6353df257fc803dfc1383bed8 ("vl.c/exit: pause cpus
before closing block devices") exposed a hang in 093. This happens
because requests are still in flight when QEMU terminates but
QEMU_CLOCK_VIRTUAL time is frozen. bdrv_drain_all() hangs forever since
throttled requests cannot complete.
Step the clock at the end of each test iteration so in-flight requests
actually finish. This solves the hang and is cleaner than leaving tests
in-flight.
Note that this could also be "fixed" by disabling throttling when drives
are closed in QEMU. That approach has two issues:
1. We must drain requests before disabling throttling, so the hang
cannot be easily avoided!
2. Any time QEMU disables throttling internally there is a chance that
malicious users can abuse the code path to bypass throttling limits.
Therefore it makes more sense to fix the test case than to modify QEMU.
The simpletrace.py script can pretty-print flight recorder ring buffers.
These are not full simpletrace binary trace files but just the end of a
trace file. There is no header and the event ID mapping information is
often unavailable since the ring buffer may have filled up and discarded
event ID mapping records.
The simpletrace.stp script that generates ring buffer traces uses the
same trace-events-all input file as simpletrace.py. Therefore both
scripts have the same global ordering of trace events. A dynamic event
ID mapping isn't necessary: just use the trace-events-all file as the
reference for how event IDs are numbered.
It is now possible to analyze simpletrace.stp ring buffers again using:
Stefan Hajnoczi [Tue, 15 Aug 2017 08:44:29 +0000 (09:44 +0100)]
trace: use static event ID mapping in simpletrace.stp
This is a partial revert of commit 7f1b588f20d027730676e627713ae3bbf6baab04 ("trace: emit name <-> ID
mapping in simpletrace header"), which broke the SystemTap flight
recorder because event mapping records may not be present in the ring
buffer when the trace is analyzed. This means simpletrace.py
--no-header does not know the event ID mapping needed to pretty-print
the trace.
Instead of numbering events dynamically, use a static event ID mapping
as dictated by the event order in the trace-events-all file.
The simpletrace.py script also uses trace-events-all so the next patch
will fix the simpletrace.py --no-header option to take advantage of this
knowledge.
Peter Maydell [Tue, 15 Aug 2017 11:10:50 +0000 (12:10 +0100)]
Merge remote-tracking branch 'remotes/famz/tags/build-and-test-pull-request' into staging
# gpg: Signature made Tue 15 Aug 2017 11:50:36 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/build-and-test-pull-request:
docker: add centos7 image
docker: install more packages on CentOS to extend code coverage
docker: add Xen libs to centos6 image
docker: use one package per line in CentOS config
Makefile: Let "make check-help" work without running ./configure
Eric Farman [Mon, 14 Aug 2017 20:44:50 +0000 (22:44 +0200)]
pc-bios/s390-ccw: Use rm command during make clean
This reverts a change that replaced the "rm -f" command with the
undefined variable RM (expected to be set by make), and causes the
"make clean" command to fail for a s390 target:
Peter Maydell [Mon, 14 Aug 2017 12:35:33 +0000 (13:35 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 14 Aug 2017 13:32:10 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
qemu-doc: Mention host_net_add/-remove in the deprecation chapter
Thomas Huth [Thu, 10 Aug 2017 08:00:17 +0000 (10:00 +0200)]
qemu-doc: Mention host_net_add/-remove in the deprecation chapter
The two HMP commands host_net_add and -remove have recently been
marked as deprecated, too, so we should now mention them in the
chapter of deprecated features.
* remotes/mjt/tags/trivial-patches-fetch:
hw/misc/mmio_interface: Return after error_setg() to avoid crash
qemu-iotests: remove comment about root privileges requirement
qemu-iotests: remove commented out variables
qemu-iotests: get rid of _full_imgproto_details()
qemu-doc: Fix "-net van" typo
libqtest: Fix typo in comments
unicore32: abort when entering "x 0" on the monitor
This happens because the realize function is trying to set the errp
twice in this case. After setting an error, the realize function
should immediately return instead.
Cleber Rosa [Thu, 27 Jul 2017 12:02:09 +0000 (08:02 -0400)]
qemu-iotests: remove comment about root privileges requirement
The check script contains a commented out root user requirement,
probably because of its xfstests heritage. This requirement doesn't
apply to qemu-iotests, so it better be gone.
Cleber Rosa [Thu, 27 Jul 2017 12:02:08 +0000 (08:02 -0400)]
qemu-iotests: remove commented out variables
The variables FULL_MKFS_OPTIONS and FULL_MOUNT_OPTIONS are commented
out, never used, and even refer to functions that do exist. The last
time these were touched was around 8 years ago, so I guess it's safe
to assume outputting such information on test execution is still on the
radar.
Cleber Rosa [Thu, 27 Jul 2017 12:02:07 +0000 (08:02 -0400)]
qemu-iotests: get rid of _full_imgproto_details()
Although this function is used, its implementation does nothing
besides echoing a variable name. There's no need to wrap this
functionality in a function, and based on the one usage it has, it's
not even required to adhere to a convention or code style.
Thomas Huth [Thu, 10 Aug 2017 11:44:26 +0000 (13:44 +0200)]
qemu-doc: Fix "-net van" typo
While Andrew S. Tanenbaum has a point by saying "Never underestimate the
bandwidth of a station wagon full of tapes hurtling down the highway",
we don't support that way of transportation in QEMU yet, so replace the
typo with the correct word "vlan".
unicore32: abort when entering "x 0" on the monitor
Starting Qemu with "qemu-system-unicore32 -M puv3,accel=qtest -S -nographic"
and entering "x 0 " at the monitor prompt leads to abort():
$ ./unicore32-softmmu/qemu-system-unicore32 -M puv3,accel=qtest -S -nographic
QEMU 2.9.90 monitor - type 'help' for more information
(qemu) x 0
qemu: fatal: uc32_cpu_get_phys_page_debug not supported yet
Thomas Huth [Thu, 10 Aug 2017 11:44:26 +0000 (13:44 +0200)]
qemu-doc: Fix "-net van" typo
While Andrew S. Tanenbaum has a point by saying "Never underestimate the
bandwidth of a station wagon full of tapes hurtling down the highway",
we don't support that way of transportation in QEMU yet, so replace the
typo with the correct word "vlan".
Cornelia Huck [Fri, 11 Aug 2017 11:03:31 +0000 (13:03 +0200)]
boot-serial-test: fallback to kvm accelerator
Currently, at least x86_64 and s390x support building with --disable-tcg.
Instead of forcing tcg (which causes the test to fail on such builds),
allow to use kvm as well.
This is because, under heavy load, the quit can happen before the first
iteration of the mirror request has occurred. To make sure we've had
time to iterate, let's just add a sleep for 0.5 seconds before quitting.
Fam Zheng [Fri, 11 Aug 2017 11:44:47 +0000 (19:44 +0800)]
file-posix: Do runtime check for ofd lock API
It is reported that on Windows Subsystem for Linux, ofd operations fail
with -EINVAL. In other words, QEMU binary built with system headers that
exports F_OFD_SETLK doesn't necessarily run in an environment that
actually supports it:
As a matter of fact this is not WSL specific. It can happen when running
a QEMU compiled against a newer glibc on an older kernel, such as in
a containerized environment.
Eric Blake [Wed, 9 Aug 2017 20:38:07 +0000 (15:38 -0500)]
qcow2: Drop debugging dump_refcounts()
It's been #if 0'd since its introduction in 2006, commit 585f8587.
We can revive dead code if we need it, but in the meantime, it has
bit-rotted (for example, not checking for failure in bdrv_getlength()).
Eric Blake [Tue, 8 Aug 2017 14:34:16 +0000 (09:34 -0500)]
tests/multiboot: Fix whitespace failure
Commit b43671f8 accidentally broke run_test.sh within tests/multiboot;
due to a subtle change in whitespace.
These two commands produce theh same output (at least, for sane $IFS
of space-tab-newline):
echo -e "...$@..."
echo -e "...$*..."
But that's only because echo inserts spaces between multiple arguments
(the $@ case), while the $* form gives a single argument to echo with
the spaces already present.
But when converting to printf %b, there are no automatic spaces between
multiple arguments, so we HAVE to use $*.
It doesn't help that run_test.sh isn't part of 'make check'.
Peter Maydell [Thu, 10 Aug 2017 17:53:39 +0000 (18:53 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Thu 10 Aug 2017 18:48:13 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
virtio-blk: handle blk_getlength() errors
IDE: test flush on empty CDROM
IDE: Do not flush empty CDROM drives
Peter Maydell [Thu, 10 Aug 2017 16:50:55 +0000 (17:50 +0100)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Just a single fix for an annoying regression introduced in 2.9 when fixing
CVE-2016-9602.
# gpg: Signature made Thu 10 Aug 2017 13:40:28 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
Stefan Hajnoczi [Wed, 9 Aug 2017 16:02:11 +0000 (17:02 +0100)]
IDE: Do not flush empty CDROM drives
The block backend changed in a way that flushing empty CDROM drives now
crashes. Amend IDE to avoid doing so until the root problem can be
addressed for 2.11.
Greg Kurz [Thu, 10 Aug 2017 12:21:04 +0000 (14:21 +0200)]
9pfs: local: fix fchmodat_nofollow() limitations
This function has to ensure it doesn't follow a symlink that could be used
to escape the virtfs directory. This could be easily achieved if fchmodat()
on linux honored the AT_SYMLINK_NOFOLLOW flag as described in POSIX, but
it doesn't. There was a tentative to implement a new fchmodat2() syscall
with the correct semantics:
but it didn't gain much momentum. Also it was suggested to look at an O_PATH
based solution in the first place.
The current implementation covers most use-cases, but it notably fails if:
- the target path has access rights equal to 0000 (openat() returns EPERM),
=> once you've done chmod(0000) on a file, you can never chmod() again
- the target path is UNIX domain socket (openat() returns ENXIO)
=> bind() of UNIX domain sockets fails if the file is on 9pfs
The solution is to use O_PATH: openat() now succeeds in both cases, and we
can ensure the path isn't a symlink with fstat(). The associated entry in
"/proc/self/fd" can hence be safely passed to the regular chmod() syscall.
The previous behavior is kept for older systems that don't have O_PATH.
Peter Maydell [Thu, 10 Aug 2017 10:12:36 +0000 (11:12 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170809' into staging
ppc patch queue 2017-08-09
This series contains a number of bugfixes for ppc and related
machines, for the qemu-2.10.release. Some are true regressions,
others are serious enough and non-invasive enough to fix that it's
worth putting in 2.10 this late.
* remotes/dgibson/tags/ppc-for-2.10-20170809:
spapr: Fix bug in h_signal_sys_reset()
spapr_drc: abort if object_property_add_child() fails
target/ppc: Add stub implementation of the PSSCR
target/ppc: Implement TIDR
ppc: fix double-free in cpu_post_load()
booke206: fix MAS update on tlb miss
Peter Maydell [Thu, 10 Aug 2017 09:05:29 +0000 (10:05 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, vhost: fixes for rc3
Fix up bugs and warnings in tests. Revert an experimental commit that I
put in by mistake: harmless but useless.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 09 Aug 2017 02:23:17 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
libqtest: always set up signal handler for SIGABRT
libvhost-user: quit when no more data received
net: fix -netdev socket,fd= for UDP sockets
Revert "cpu: add APIs to allocate/free CPU environment"
acpi-test: update expected DSDT files
Sam Bobroff [Thu, 3 Aug 2017 06:28:27 +0000 (16:28 +1000)]
spapr: Fix bug in h_signal_sys_reset()
The unicast case in h_signal_sys_reset() seems to be broken:
rather than selecting the target CPU, it looks like it will pick
either the first CPU or fail to find one at all.
Fix it by using the search function rather than open coding the
search.
This was found by inspection; the code appears to be unused because
the Linux kernel only uses the broadcast target.
Greg Kurz [Mon, 7 Aug 2017 17:24:39 +0000 (19:24 +0200)]
spapr_drc: abort if object_property_add_child() fails
object_property_add_child() can only fail in two cases:
- the child already has a parent, which shouldn't happen since the DRC was
allocated a few lines above
- the parent already has a child with the same name, which would mean the
caller tries to create a DRC that already exists
In both case, this is a QEMU bug and we should abort.
David Gibson [Tue, 8 Aug 2017 05:09:35 +0000 (15:09 +1000)]
target/ppc: Add stub implementation of the PSSCR
The PSSCR register added in POWER9 controls certain power saving mode
behaviours. Mostly, it's not relevant to TCG, however because qemu
doesn't know about it yet, it doesn't synchronize the state with KVM,
and thus it doesn't get migrated.
To fix that, this adds a minimal stub implementation of the register.
This isn't complete, even to the extent that an implementation is
possible in TCG, just enough to get migration working. We need to
come back later and at least properly filter the various fields in the
register based on privilege level.
David Gibson [Tue, 8 Aug 2017 03:42:53 +0000 (13:42 +1000)]
target/ppc: Implement TIDR
This adds a trivial implementation of the TIDR register added in
POWER9. This isn't particularly important to qemu directly - it's
used by accelerator modules that we don't emulate.
However, since qemu isn't aware of it, its state is not synchronized
with KVM and therefore not migrated, which can be a problem.
Greg Kurz [Wed, 2 Aug 2017 17:34:16 +0000 (19:34 +0200)]
ppc: fix double-free in cpu_post_load()
When running nested with KVM PR, ppc_set_compat() fails and QEMU crashes
because of "double free or corruption (!prev)". The crash happens because
error_report_err() has already called error_free().
KONRAD Frederic [Tue, 1 Aug 2017 08:44:57 +0000 (10:44 +0200)]
booke206: fix MAS update on tlb miss
When a tlb instruction miss happen, rw is set to 0 at the bottom
of cpu_ppc_handle_mmu_fault which cause the MAS update function to miss
the SAS and TS bit in MAS6, MAS1 in booke206_update_mas_tlb_miss.
Just calling booke206_update_mas_tlb_miss with rw = 2 solve the issue.
Jens Freimann [Tue, 8 Aug 2017 20:38:59 +0000 (22:38 +0200)]
libqtest: always set up signal handler for SIGABRT
Currently abort handlers only work for the first test function
in a testcase, because the list of abort handlers is not properly
cleared when qtest_quit() is called.
qtest_quit() only deletes the kill_qemu_hook but doesn't completely
clear the abrt_hooks list. The effect is that abrt_hooks.is_setup is
never set to false and in a following test the abrt_hooks list is not
initialized and setup_sigabrt_handler() is not called.
One way to solve this is to clear the list in qtest_quit(), but
that means only asserts between qtest_start and qtest_quit will
be catched by the abort handler.
We can make abort handlers work in all cases if we always setup the
signal handler for SIGABRT in qtest_init.
Jens Freimann [Tue, 8 Aug 2017 20:38:57 +0000 (22:38 +0200)]
net: fix -netdev socket,fd= for UDP sockets
This patch fixes -netdev socket,fd= for UDP sockets
Currently -netdev socket,fd=<...> results in
qemu: error: specified mcastaddr "127.0.0.1" (0x7f000001) does not
contain a multicast address
qemu-system-x86_64: -netdev
socket,id=n1,fd=3: Device 'socket' could not be initialized
To fix these we need to allow specifying multicast and fd arguments
for the same netdev. With this the user can specify "-netdev
fd=3,mcast=<IP:port>"
* remotes/bonzini/tags/for-upstream:
maint: Include bug-reporting info in --help output
qga: Give more --version information
qemu-io: Give more --version information
qemu-img: Sort sub-command names in --help
target/i386: set rip_offset for some SSE4.1 instructions
scsi: clarify sense codes for LUN0 emulation
kvm: workaround build break on gcc-7.1.1 / fedora26
Revert "rcu: do not create thread in pthread_atfork callback"
rcu: completely disable pthread_atfork callbacks as soon as possible
Eric Blake [Thu, 3 Aug 2017 16:33:53 +0000 (11:33 -0500)]
maint: Include bug-reporting info in --help output
These days, many programs are including a bug-reporting address,
or better yet, a link to the project web site, at the tail of
their --help output. However, we were not very consistent at
doing so: only qemu-nbd and qemu-qa mentioned anything, with the
latter pointing to an individual person instead of the project.
Add a new #define that sets up a uniform string, mentioning both
bug reporting instructions and overall project details, and which
a downstream vendor could tweak if they want bugs to go to a
downstream database. Then use it in all of our binaries which
have --help output.
The canned text intentionally references http:// instead of https://
because our https website currently causes certificate errors in
some browsers. That can be tweaked later once we have resolved the
web site issued.
Eric Blake [Thu, 3 Aug 2017 16:33:50 +0000 (11:33 -0500)]
qemu-img: Sort sub-command names in --help
'amend' and 'create' were not listed alphabetically; hoist them
earlier. Separate the @end table block to make it easier to
copy-and-paste the addition of future sub-commands.
Peter Maydell [Tue, 8 Aug 2017 14:23:21 +0000 (15:23 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches for 2.10.0-rc2
# gpg: Signature made Tue 08 Aug 2017 14:56:15 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
block/nfs: fix mutex assertion in nfs_file_close()
qemu-iotests: Test reopen between read-only and read-write
qemu-io: Allow reopen read-write
block: Set BDRV_O_ALLOW_RDWR during rw reopen
block: Allow reopen rw without BDRV_O_ALLOW_RDWR
block: Fix order in bdrv_replace_child()
parallels: drop check that bdrv_truncate() is working
parallels: respect error code of bdrv_getlength() in allocate_clusters()
block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes
vmdk: Fix error handling/reporting of vmdk_check
block/null: Remove 'filename' option
block: drop bdrv_set_key from BlockDriver
block/vhdx: check error return of bdrv_truncate()
block/vhdx: check error return of bdrv_flush()
block/vhdx: check for offset overflow to bdrv_truncate()
block/vhdx: check error return of bdrv_getlength()
quorum: Set sectors-count to 0 when reporting a flush error
qemu-iotests/109: Fix lock race condition
Jeff Cody [Mon, 7 Aug 2017 22:29:09 +0000 (18:29 -0400)]
block/nfs: fix mutex assertion in nfs_file_close()
Commit c096358e747e88fc7364e40e3c354ee0bb683960 introduced assertion
checks for when qemu_mutex() functions are called without the
corresponding qemu_mutex_init() having initialized the mutex.
This uncovered a latent bug in qemu's nfs driver - in
nfs_client_close(), the NFSClient structure is overwritten with zeros,
prior to the mutex being destroyed.
Go ahead and destroy the mutex in nfs_client_close(), and change where
we call qemu_mutex_init() so that it is correctly balanced.
There are also a couple of memory leaks obscured by the memset, so this
fixes those as well.
Finally, we should be able to get rid of the memset(), as it isn't
necessary.
Kevin Wolf [Thu, 3 Aug 2017 15:02:59 +0000 (17:02 +0200)]
block: Set BDRV_O_ALLOW_RDWR during rw reopen
Reopening an image should be consistent with opening it, so we should
set BDRV_O_ALLOW_RDWR for any image that is reopened read-write like in
bdrv_open_inherit().
Kevin Wolf [Thu, 3 Aug 2017 15:02:58 +0000 (17:02 +0200)]
block: Allow reopen rw without BDRV_O_ALLOW_RDWR
BDRV_O_ALLOW_RDWR is a flag that tells whether qemu can internally
reopen a node read-write temporarily because the user requested
read-write for the top-level image, but qemu decided that read-only is
enough for this node (a backing file).
bdrv_reopen() is different, it is also used for cases where the user
changed their mind and wants to update the options. There is no reason
to forbid making a node read-write in that case.
Kevin Wolf [Thu, 3 Aug 2017 15:02:57 +0000 (17:02 +0200)]
block: Fix order in bdrv_replace_child()
Commit 8ee03995 refactored the code incorrectly and broke the release of
permissions on the old BDS. Instead of changing the permissions to the
new required values after removing the old BDS from the list of
children, it only re-obtains the permissions it already had.
Change the order of operations so that the old BDS is removed again
before calculating the new required permissions.
Denis V. Lunev [Fri, 4 Aug 2017 15:10:13 +0000 (18:10 +0300)]
parallels: drop check that bdrv_truncate() is working
This would be actually strange and error prone. If truncate() nowadays
will fail, there is something fatally wrong. Let's check for that during
the actual work.
The only fallback case is when the file is not zero initialized. In this
case we should switch to preallocation via fallocate().
Denis V. Lunev [Fri, 4 Aug 2017 15:10:11 +0000 (18:10 +0300)]
block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes
Original idea beyond the code in question was the following: we have failed
to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest
approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the
only chance now: if the request comes beyond end of the file. Thus we
should calculate file length and respect the error code from that op.
Kevin Wolf [Fri, 4 Aug 2017 10:44:22 +0000 (12:44 +0200)]
block/null: Remove 'filename' option
This option was only added to allow 'null-co://' and 'null-aio://' as
filenames, its value never served any actual purpose and was ignored.
Nevertheless it was accepted as '-drive driver=null,filename=foo'.
The correct way to enable the protocol prefixes (and that without adding
a useless -drive option) is implementing .bdrv_parse_filename. This is
what this patch does.
Technically, this is an incompatible change, but the null block driver
is only used for benchmarking, testing and debugging, and an option
without effect isn't likely to be used by anyone anyway, so no bad
effects are to be expected.
Jeff Cody [Mon, 7 Aug 2017 12:38:20 +0000 (08:38 -0400)]
block/vhdx: check for offset overflow to bdrv_truncate()
VHDX uses uint64_t types for most offsets, following the VHDX spec.
However, bdrv_truncate() takes an int64_t value for the truncating
offset. Check for overflow before calling bdrv_truncate().
While we are here, replace the bit shifting with QEMU_ALIGN_UP as well.
N.B.: For a compliant image this is not an issue, as the maximum VHDX
image size is defined per the spec to be 64TB.
Jeff Cody [Mon, 7 Aug 2017 12:38:19 +0000 (08:38 -0400)]
block/vhdx: check error return of bdrv_getlength()
Calls to bdrv_getlength() were not checking for error. In vhdx.c, this
can lead to truncating an image file, so it is a definite bug. In
vhdx-log.c, the path for improper behavior is less clear, but it is best
to check in any case.
Some minor code movement of the log_guid intialization, as well.