s390x/cpumodel: expose features and feature groups as properties
Let's add all features and feature groups as properties to all CPU models.
If the "host" CPU model is unknown, we can neither query nor change
features. KVM will just continue to work like it did until now.
We will not allow to enable features that were not part of the original
CPU model, because that could collide with the IBC in KVM.
s390x/cpumodel: store the CPU model in the CPU instance
A CPU model consists of a CPU definition, to which delta changes are
applied - features added or removed (e.g. z13-base,vx=on). In addition,
certain properties (e.g. cpu id) can later on change during migration
but belong into the CPU model. This data will later be filled from the
host model in the KVM case.
Therefore, store the configured CPU model inside the CPU instance, so
we can later on perform delta changes using properties.
For the "qemu" model, we emulate in TCG a z900. "host" will be
uninitialized (cpu->model == NULL) unless we have CPU model support in KVM
later on. The other models are all initialized from their definitions.
Only the "host" model can have a cpu->model == NULL.
s390x/cpumodel: register defined CPU models as subclasses
This patch adds the CPU model definitions that are known on s390x -
like z900, zBC12 or z13. For each definition, introduce two CPU models:
1. Base model (e.g. z13-base): Minimum feature set we expect to be around
on all z13 systems. These models are migration-safe and will never
change.
2. Flexible models (e.g. z13): Models that can change between QEMU versions
and will be extended over time as we implement further features that
are already part of such a model in real hardware of certain
configurations.
We want to work on features using ordinary bitmap operations, however we
can't initialize a bitmap statically (unsigned long[] ...). Therefore we
store the generated feature lists in separate arrays and convert them to
proper bitmaps before registering all our CPU model classes.
s390x/cpumodel: introduce CPU feature group definitions
Let's use the generated groups to create feature group representations for
the user. These groups can later be used to enable/disable multiple
features in one shot and will be used to reduce the amount of reported
features to the user if all subfeatures are in place.
We want to work on features using ordinary bitmap operations, however we
can't initialize a bitmap statically (unsigned long[] ...). Therefore
we store the generated feature lists in separate arrays and convert
them to a proper bitmaps before they will ever be used.
Feature groups will be very helpful to reduce the amount of features
typically available in sane configurations. E.g. the MSA facilities
introduced loads of subfunctions, which could - in theory - go away
in the future, but we want to avoid reporting hundrets of features to
the user if usually all of them are in place.
Groups only contain features that were introduced in one shot, not just
random features. Therefore, groups can never change. This is an important
property regarding migration.
Michael Mueller [Mon, 5 Sep 2016 08:52:19 +0000 (10:52 +0200)]
s390x/cpumodel: generate CPU feature lists for CPU models
This patch introduces the helper "gen-features" which allows to generate
feature list definitions at compile time. Its flexibility is better and the
error-proneness is lower when compared to static programming time added
statements.
The helper includes "target-s390x/cpu_features.h" to be able to use named
facility bits instead of numbers. The generated defines will be used for
the definition of CPU models.
We generate feature lists for each HW generation and GA for EC models. BC
models are always based on a EC version and have no separate definitions.
Base features: Features we expect to be always available in sane setups.
Migration safe - will never change. Can be seen as "minimum features
required for a CPU model".
Default features: Features we expect to be stable and around in latest
setups (e.g. having KVM support) - not migration safe.
Max features: All supported features that are theoretically allowed for a
CPU model. Exceeding these features could otherwise produce problems with
IBC (instruction blocking controls) in KVM.
Michael Mueller [Mon, 5 Sep 2016 08:52:18 +0000 (10:52 +0200)]
s390x/cpumodel: introduce CPU features
The patch introduces s390x CPU features (most of them refered to as
facilities) along with their discription and some functions that will be
helpful when working with the features later on.
Please note that we don't introduce all known CPU features, only the
ones currently supported by KVM + QEMU. We don't want to enable later
on blindly any facilities, for which we don't know yet if we need QEMU
support to properly support them (e.g. migrate additional state when
active). We can update QEMU later on.
s390x/cpumodel: "host" and "qemu" as CPU subclasses
This patch introduces two CPU models, "host" and "qemu".
"qemu" is used as default when running under TCG. "host" is used
as default when running under KVM. "host" cannot be used without KVM.
"host" is not migration-safe. They both inherit from the base s390x CPU,
which is turned into an abstract class.
This patch also changes CPU creation to take care of the passed CPU string
and reuses common code parse_features() function for that purpose. Unknown
CPU definitions are now reported. The "-cpu ?" and "query-cpu-definition"
commands are changed to list all CPU subclasses automatically, including
migration-safety and whether static.
qmp: details about CPU definitions in query-cpu-definitions
It might be of interest for tooling whether a CPU definition can be safely
used when migrating, or if e.g. CPU features might get lost during
migration when migrationg from/to a different QEMU version or host, even if
the same compatibility machine is used.
Also, we want to know if a CPU definition is static and will never change.
Beause these definitions can then be used independantly of a compatibility
machine and will always have the same feature set, they can e.g. be used
to indicate the "host" model in libvirt later on.
Let's add two return values to query-cpu-definitions, stating for each
returned CPU definition, if it is migration-safe and if it is static.
While "migration-safe" is optional, "static" will be set to "false"
automatically by all implementing architectures. If a model really was
static all the time and will be in the future, this can simply be changed
later.
Diag 501 (4 bytes) was used until now for software breakpoints on s390.
As instructions on s390 might be 2 bytes long, temporarily overwriting them
with 4 bytes is evil and can result in very strange guest behaviour.
We make use of invalid instruction 0x0000 as new sw breakpoint instruction.
We have to enable interception of that instruction in KVM using a
capability.
If no software breakpoint has been inserted at the reported position, an
operation exception has to be injected into the guest. Otherwise a
breakpoint has been hit and the pc has to be rewound.
If KVM doesn't yet support interception of instruction 0x0000 the
existing mechanism exploiting diag 501 is used. To keep overhead low,
interception of instruction 0x0000 will only be enabled if sw breakpoints
are really used.
Cornelia Huck [Mon, 15 Aug 2016 09:10:28 +0000 (11:10 +0200)]
s390x/css: handle cssid 255 correctly
The cssid 255 is reserved but still valid from an architectural
point of view. However, feeding a bogus schid of 0xffffffff into
the virtio hypercall will lead to a crash:
With the current code a simple sclp command takes about 13000 ns
The biggest part seems to be the resolver of the object model. By
caching the sclp device the time for an sclp command goes down to
2500ns. Talking about real life scenarios, this change doubles
the speed of the sclp console when sending single bytes outputs
to /dev/console.
Yi Min Zhao [Wed, 10 Aug 2016 05:35:00 +0000 (13:35 +0800)]
s390x/pci: assert zpci always existing
If one pci device is plugged successfully, there must be a zpci device
existing. This means that during hot-unplugging a pci device, its
corresponding zpci device must be found. Therefore we use an assert to
replace current code.
Yi Min Zhao [Wed, 10 Aug 2016 08:10:08 +0000 (16:10 +0800)]
s390x/pci: return directly if create zpci failed
In the case that zpci is automatically created, we did not return
immediately on failure, which would lead to NULL pointer dereferencing.
Let's fix it.
Greg Kurz [Tue, 30 Aug 2016 15:02:27 +0000 (17:02 +0200)]
9pfs: handle walk of ".." in the root directory
The 9P spec at http://man.cat-v.org/plan_9/5/intro says:
All directories must support walks to the directory .. (dot-dot) meaning
parent directory, although by convention directories contain no explicit
entry for .. or . (dot). The parent of the root directory of a server's
tree is itself.
This means that a client cannot walk further than the root directory
exported by the server. In other words, if the client wants to walk
"/.." or "/foo/../..", the server should answer like the request was
to walk "/".
This patch just does that:
- we cache the QID of the root directory at attach time
- during the walk we compare the QID of each path component with the root
QID to detect if we're in a "/.." situation
- if so, we skip the current component and go to the next one
Greg Kurz [Tue, 30 Aug 2016 17:11:05 +0000 (19:11 +0200)]
9pfs: forbid illegal path names
Empty path components don't make sense for most commands and may cause
undefined behavior, depending on the backend.
Also, the walk request described in the 9P spec [1] clearly shows that
the client is supposed to send individual path components: the official
linux client never sends portions of path containing the / character for
example.
Moreover, the 9P spec [2] also states that a system can decide to restrict
the set of supported characters used in path components, with an explicit
mention "to remove slashes from name components".
This patch introduces a new name_is_illegal() helper that checks the
names sent by the client are not empty and don't contain unwanted chars.
Since 9pfs is only supported on linux hosts, only the / character is
checked at the moment. When support for other hosts (AKA. win32) is added,
other chars may need to be blacklisted as well.
If a client sends an illegal path component, the request will fail and
ENOENT is returned to the client.
Paolo Bonzini [Tue, 30 Aug 2016 12:04:12 +0000 (14:04 +0200)]
Revert "Change net/socket.c to use socket_*() functions"
Since commit 7e8449594c929, the socket connect code is blocking, because
calling socket_connect() without callback is blocking. This reverts the
commit.
translate: early exit in tb_flush if there is no tcg
tb_flush does all kind of things, which are very tcg specific. As it
is called from some places even for KVM (e.g. gdb server) it is better
to detect these cases and do an early exit.
This also fixes a crash in the gdb server that was triggered by
commit 909eaac9bbc2 ("tb hash: track translated blocks with qht").
vnc: only alloc server surface with clients connected
the VNC server was changed so that the 'vd->server' pixman
image was only allocated when a client is connected.
Since then if a client disconnects and then reconnects to
the VNC server all they will see is a black screen until
they do something that triggers a refresh. On a graphical
desktop this is not often noticed since there's many things
going on which cause a refresh. On a plain text console it
is really obvious since nothing refreshes frequently.
The problem is that the VNC server didn't update the guest
dirty bitmap, so still believes its server image is in sync
with the guest contents.
To fix this we must explicitly mark the entire guest desktop
as dirty after re-creating the server surface. Move this
logic into vnc_update_server_surface() so it is guaranteed
to be call in all code paths that re-create the surface
instead of only in vnc_dpy_switch()
Stefan Hajnoczi [Mon, 15 Aug 2016 12:54:16 +0000 (13:54 +0100)]
virtio: decrement vq->inuse in virtqueue_discard()
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again. It's necessary to decrement vq->inuse to avoid "leaking"
the element count.
Stefan Hajnoczi [Mon, 15 Aug 2016 12:54:15 +0000 (13:54 +0100)]
virtio: recalculate vq->inuse after migration
The vq->inuse field is not migrated. Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.
At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements. For these devices we need to recalculate
vq->inuse upon load so the value is correct.
Peter Maydell [Mon, 22 Aug 2016 09:02:28 +0000 (10:02 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 22 Aug 2016 09:06:32 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
e1000e: remove internal interrupt flag
slirp: fix segv when init failed
Cao jin [Thu, 18 Aug 2016 14:15:54 +0000 (22:15 +0800)]
e1000e: remove internal interrupt flag
Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, E1000E_USE_MSIX
is not necessary too, remove it now. And interrupt flag field intr_state also
can be removed now.
Since commit f6c2e66ae8c8a, slirp uses an exit notifier to call
slirp_smb_cleanup. However, if init() failed, the notifier isn't added,
and removing it will fail:
==18447== Invalid write of size 8
==18447== at 0x7EF2B5: notifier_remove (notify.c:32)
==18447== by 0x48E80C: qemu_remove_exit_notifier (vl.c:2661)
==18447== by 0x6A2187: net_slirp_cleanup (slirp.c:134)
==18447== by 0x69419D: qemu_cleanup_net_client (net.c:338)
==18447== by 0x69445B: qemu_del_net_client (net.c:401)
==18447== by 0x6A2B81: net_slirp_init (slirp.c:366)
==18447== by 0x6A4241: net_init_slirp (slirp.c:865)
==18447== by 0x695C6D: net_client_init1 (net.c:1051)
==18447== by 0x695F6E: net_client_init (net.c:1108)
==18447== by 0x696DBA: net_init_netdev (net.c:1498)
==18447== by 0x7F1F99: qemu_opts_foreach (qemu-option.c:1116)
==18447== by 0x696E60: net_init_clients (net.c:1516)
==18447== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Sascha Silbe [Thu, 18 Aug 2016 18:46:03 +0000 (20:46 +0200)]
test-logging: don't hard-code paths in /tmp
Since f6880b7f [qemu-log: support simple pid substitution for logs],
test-logging creates files with hard-coded names in /tmp. In the best
case, this prevents multiple developers from running "make check" on
the same machine. In the worst case, it allows for symlink attacks,
enabling an attacker to overwrite files that are writable to the
developer running "make check".
Instead of hard-coding the paths, create a temporary directory using
g_dir_make_tmp() and clean it up afterwards.
Sascha Silbe [Thu, 18 Aug 2016 18:46:02 +0000 (20:46 +0200)]
glib: add compatibility implementation for g_dir_make_tmp()
We're going to make use of g_dir_make_tmp() in test-logging. Provide a
compatibility implementation of it for glib < 2.30.
May behave differently in some edge cases (e.g. pattern only at the
end of the template, the file name is not part of the error message),
but good enough in practice.
Michal Privoznik [Fri, 19 Aug 2016 08:06:40 +0000 (10:06 +0200)]
syscall.c: Redefine IFLA_* enums
In 9c37146782 I've tried to fix a broken build with older
linux-headers. However, I didn't do it properly. The solution
implemented here is to grab the enums that caused the problem
initially, and rename their values so that they are "QEMU_"
prefixed. In order to guarantee matching values with actual
enums from linux-headers, the enums are seeded with starting
values from the original enums.
Denis V. Lunev [Wed, 17 Aug 2016 18:06:54 +0000 (21:06 +0300)]
block: fix possible reorder of flush operations
This patch reduce CPU usage of flush operations a bit. When we have one
flush completed we should kick only next operation. We should not start
all pending operations in the hope that they will go back to wait on
wait_queue.
Also there is a technical possibility that requests will get reordered
with the previous approach. After wakeup all requests are removed from
the wait queue. They become active and they are processed one-by-one
adding to the wait queue in the same order. Though new flush can arrive
while all requests are not put into the queue.
Evgeny Yakovlev [Wed, 17 Aug 2016 18:06:53 +0000 (21:06 +0300)]
block: fix deadlock in bdrv_co_flush
The following commit
commit 3ff2f67a7c24183fcbcfe1332e5223ac6f96438c
Author: Evgeny Yakovlev <[email protected]>
Date: Mon Jul 18 22:39:52 2016 +0300
block: ignore flush requests when storage is clean
has introduced a regression.
There is a problem that it is still possible for 2 requests to execute
in non sequential fashion and sometimes this results in a deadlock
when bdrv_drain_one/all are called for BDS with such stalled requests.
1. Current flushed_gen and flush_started_gen is 1.
2. Request 1 enters bdrv_co_flush to with write_gen 1 (i.e. the same
as flushed_gen). It gets past flushed_gen != flush_started_gen and
sets flush_started_gen to 1 (again, the same it was before).
3. Request 1 yields somewhere before exiting bdrv_co_flush
4. Request 2 enters bdrv_co_flush with write_gen 2. It gets past
flushed_gen != flush_started_gen and sets flush_started_gen to 2.
5. Request 2 runs to completion and sets flushed_gen to 2
6. Request 1 is resumed, runs to completion and sets flushed_gen to 1.
However flush_started_gen is now 2.
From here on out flushed_gen is always != to flush_started_gen and all
further requests will wait on flush_queue. This change replaces
flush_started_gen with an explicitly tracked active flush request.
Peter Maydell [Thu, 18 Aug 2016 09:56:40 +0000 (10:56 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Thu 18 Aug 2016 06:36:16 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net/net: properly handle multiple packets in net_fill_rstate()
net: vmxnet: use g_new for pkt initialisation
Peter Maydell [Thu, 18 Aug 2016 09:27:57 +0000 (10:27 +0100)]
Merge remote-tracking branch 'remotes/famz/tags/docker-pull-request' into staging
Fix 'make docker-test-mingw@fedora'
Peter,
This is the single patch that stalls patchew's mingw testing. Since it
is small and trivial, let's have it in 2.7.
Fam
# gpg: Signature made Wed 17 Aug 2016 13:13:53 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/docker-pull-request:
curl: Cast fd to int for DPRINTF
Zhang Chen [Thu, 18 Aug 2016 03:23:25 +0000 (11:23 +0800)]
net/net: properly handle multiple packets in net_fill_rstate()
When network is busy, we will receive multiple packets at one time. In
that situation, we should keep trying to do the receiving instead of
finalizing only the first packet.
Li Qiang [Tue, 16 Aug 2016 11:28:01 +0000 (16:58 +0530)]
net: vmxnet: use g_new for pkt initialisation
When network transport abstraction layer initialises pkt, the maximum
fragmentation count is not checked. This could lead to an integer
overflow causing a NULL pointer dereference. Replace g_malloc() with
g_new() to catch the multiplication overflow.
Fam Zheng [Mon, 1 Aug 2016 05:04:48 +0000 (13:04 +0800)]
curl: Cast fd to int for DPRINTF
Currently "make docker-test-mingw@fedora" has a warning like:
/tmp/qemu-test/src/block/curl.c: In function 'curl_sock_cb':
/tmp/qemu-test/src/block/curl.c:172:6: warning: format '%d' expects
argument of type 'int', but argument 4 has type 'curl_socket_t {aka long
long unsigned int}'
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
^
cc1: all warnings being treated as errors
Peter Maydell [Thu, 11 Aug 2016 17:59:39 +0000 (18:59 +0100)]
linux-user: Fix llseek with high bit of offset_low set
The llseek syscall takes two 32-bit arguments, offset_high
and offset_low, which must be combined to form a single
64-bit offset. Unfortunately we were combining them with
(uint64_t)arg2 << 32) | arg3
and arg3 is a signed type; this meant that when promoting
arg3 to a 64-bit type it would be sign-extended. The effect
was that if the offset happened to have bit 31 set then
this bit would get sign-extended into all of bits 63..32.
Explicitly cast arg3 to abi_ulong to avoid the erroneous
sign extension.
Michal Privoznik [Tue, 16 Aug 2016 09:47:43 +0000 (11:47 +0200)]
syscall.c: Fix build with older linux-headers
In c5dff280 we tried to make us understand netlink messages more.
So we've added a code that does some translation. However, the
code assumed linux-headers to be at least version 4.4 of it
because most of the symbols there (if not all of them) were added
in just that release. This, however, breaks build on systems with
older versions of the package.
Thomas Huth [Mon, 15 Aug 2016 08:24:54 +0000 (10:24 +0200)]
slirp: Rename "struct arphdr" to "struct slirp_arphdr"
struct arphdr is already used by the system headers on OpenBSD
and thus QEMU does not compile here anymore. Fix it by renaming
our struct to slirp_arphdr instead.
Since commit d7a04fd7d5008, tcp_chr_wait_connected() was introduced,
so vhost-user could wait until a backend started successfully. In
vhost-user case, the chr socket must be plain unix, and the chr+vhost
setup happens synchronously during qemu startup.
However, with TLS and telnet socket, initial socket setup happens
asynchronously, and s->connected is not set after the socket is
accepted. In order for tcp_chr_wait_connected() to not keep accepting
new connections and proceed with the last accepted socket, it can
check for s->ioc instead.
The virtio-gpu.h file defines a macro VIRTIO_GPU_FILL_CMD
which includes a call to qemu_log_mask, but does not
include qemu/log.h. In a default configure, it is lucky
and gets qemu/log.h indirectly due to the 'log' trace
backend being enabled. If that trace backend is disabled
though, eg
./configure --enable-trace-backends=nop
Then the build will fail:
In file included from /home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:19:0:
/home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c: In function ‘virgl_cmd_create_resource_2d’:
/home/berrange/src/virt/qemu/include/hw/virtio/virtio-gpu.h:138:13: error: implicit declaration of function ‘qemu_log_mask’ [-Werror=implicit-function-declaration]
qemu_log_mask(LOG_GUEST_ERROR, \
^
/home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:34:5: note: in expansion of macro ‘VIRTIO_GPU_FILL_CMD’
VIRTIO_GPU_FILL_CMD(c2d);
^~~~~~~~~~~~~~~~~~~
/home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:34:5: error: nested extern declaration of ‘qemu_log_mask’ [-Werror=nested-externs]
In file included from /home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:19:0:
/home/berrange/src/virt/qemu/include/hw/virtio/virtio-gpu.h:138:27: error: ‘LOG_GUEST_ERROR’ undeclared (first use in this function)
qemu_log_mask(LOG_GUEST_ERROR, \
Peter Maydell [Tue, 16 Aug 2016 08:32:40 +0000 (09:32 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches for 2.7.0-rc3
# gpg: Signature made Mon 15 Aug 2016 14:55:46 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
iotests: Test case for wrong runtime option types
block/nbd: Store runtime option values
block/blkdebug: Store config filename
block/nbd: Use QemuOpts for runtime options
block/ssh: Use QemuOpts for runtime options
Peter Maydell [Mon, 15 Aug 2016 20:48:03 +0000 (21:48 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160815' into staging
ppc patch queue for 2016-08-15
Just a single patch here, I hope this is the last ppc / spapr fix to
squeeze into qemu-2.7.
# gpg: Signature made Mon 15 Aug 2016 07:46:36 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160815:
ppc: parse cpu features once
Peter Maydell [Mon, 15 Aug 2016 18:04:51 +0000 (19:04 +0100)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160812-tag-2' into staging
Xen 2016/08/12, fixed commit message
# gpg: Signature made Sat 13 Aug 2016 00:39:09 BST
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20160812-tag-2:
xen: handle inbound migration of VMs without ioreq server pages
Xen: fix converity warning of xen_pt_config_init()
Peter Maydell [Mon, 8 Aug 2016 16:11:28 +0000 (17:11 +0100)]
pc-bios/optionrom: Fix OpenBSD build with better detection of linker emulation
The various host OSes are irritatingly variable about the name
of the linker emulation we need to pass to ld's -m option to
build the i386 option ROMs. Instead of doing this via a
CONFIG ifdef, check in configure whether any of the emulation
names we know about will work and pass the right answer through
to the makefile. If we can't find one, we fall back to not trying
to build the option ROMs, in the same way we would for a non-x86
host platform.
This is in particular necessary to unbreak the build on OpenBSD,
since it wants a different answer to FreeBSD and we don't have
an existing CONFIG_ variable that distinguishes the two.
Max Reitz [Mon, 15 Aug 2016 13:29:26 +0000 (15:29 +0200)]
block/nbd: Store runtime option values
Store the runtime option values in the BDRVNBDState so they can later be
used in nbd_refresh_filename() without having to directly access the
options QDict which may contain values of non-string types.
Max Reitz [Mon, 15 Aug 2016 13:29:25 +0000 (15:29 +0200)]
block/blkdebug: Store config filename
Store the configuration file's filename so it can later be used in
bdrv_refresh_filename() without having to directly access the options
QDict which may contain a value of a non-string type.
Max Reitz [Mon, 15 Aug 2016 13:29:24 +0000 (15:29 +0200)]
block/nbd: Use QemuOpts for runtime options
Using QemuOpts will prevent qemu from crashing if the input options have
not been validated (which is the case when they are specified on the
command line or in a json: filename) and some have the wrong type.
Max Reitz [Mon, 15 Aug 2016 13:29:23 +0000 (15:29 +0200)]
block/ssh: Use QemuOpts for runtime options
Using QemuOpts will prevent qemu from crashing if the input options have
not been validated (which is the case when they are specified on the
command line or in a json: filename) and some have the wrong type.
Greg Kurz [Wed, 10 Aug 2016 19:08:01 +0000 (21:08 +0200)]
ppc: parse cpu features once
Considering that features are converted to global properties and
global properties are automatically applied to every new instance
of created CPU (at object_new() time), there is no point in
parsing cpu_model string every time a CPU created. So move
parsing outside CPU creation loop and do it only once.
Parsing also should be done before any CPU is created so that
features would affect the first CPU a well.
This patch does that for all PowerPC machine types.
Paul Durrant [Mon, 1 Aug 2016 09:16:25 +0000 (10:16 +0100)]
xen: handle inbound migration of VMs without ioreq server pages
VMs created on older versions on Xen will not have been provisioned with
pages to support creation of non-default ioreq servers. In this case
the ioreq server API is not supported and QEMU's only option is to fall
back to using the default ioreq server pages as it did prior to
commit 3996e85c ("Xen: Use the ioreq-server API when available").
This patch therefore changes the code in xen_common.h to stop considering
a failure of xc_hvm_create_ioreq_server() as a hard failure but simply
as an indication that the guest is too old to support the ioreq server
API. Instead a boolean is set to cause reversion to old behaviour such
that the default ioreq server is then used.
Cao jin [Mon, 25 Jan 2016 12:16:03 +0000 (20:16 +0800)]
Xen: fix converity warning of xen_pt_config_init()
emu_regs is a pointer, ARRAY_SIZE doesn't return what we expect.
Since the remaining message is enough for debugging, so just remove it.
Also tweaked the message a little.
Peter Maydell [Thu, 4 Aug 2016 11:14:36 +0000 (12:14 +0100)]
Update ancient copyright string in -version output
Currently the -version command line argument prints a string ending
with "Copyright (c) 2003-2008 Fabrice Bellard". This is now some
eight years out of date; abstract it out of the several places that
print the string and update it to:
Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers
to reflect the work by all the QEMU Project contributors over the
last decade.
Liang Li [Tue, 9 Aug 2016 00:22:26 +0000 (08:22 +0800)]
migration: fix live migration failure with compression
Because of commit 11808bb0c422, which remove some condition checks
of 'f->ops->writev_buffer', 'qemu_put_qemu_file' should be enhanced
to clear the 'f_src->iovcnt', or 'f_src->iovcnt' may exceed the
MAX_IOV_SIZE which will break live migration. This should be fixed.
mmap man page:
"On success, mmap() returns a pointer to the mapped area. On error, the
value MAP_FAILED (that is, (void *) -1) is returned, and errno is set
to indicate the cause of the error."
The check in postcopy_get_tmp_page is definitely wrong and should be
fixed.
virtio-console: set frontend open permanently for console devs
The virtio-console.c file handles both serial consoles
and interactive consoles, since they're backed by the
same device model.
Since serial devices are expected to be reliable and
need to notify the guest when the backend is opened
or closed, the virtio-console.c file wires up support
for chardev events. This affects both serial consoles
and interactive consoles, using a network connection
based chardev backend such as 'socket', but not when
using a PTY based backend or plain 'file' backends.
When the host side is not connected the handle_output()
method in virtio-serial-bus.c will drop any data sent
by the guest, before it even reaches the virtio-console.c
code. This means that if the chardev has a logfile
configured, the data will never get logged.
Consider for example, configuring a x86_64 guest with a
plain UART serial port
The isa-serial one gets data written to the log regardless
of whether a client is connected, while the virtioconsole
one only gets data written to the log when a client is
connected.
There is no need for virtio-serial-bus.c to aggressively
drop the data for console devices, as the chardev code is
prefectly capable of discarding the data itself.
So this patch changes virtconsole devices so that they
are always marked as having the host side open. This
ensures that the guest OS will always send any data it
has (Linux virtio-console hvc driver actually ignores
the host open state and sends data regardless, but we
should not rely on that), and also prevents the
virtio-serial-bus code prematurely discarding data.
The behaviour of virtserialport devices is *not* changed,
only virtconsole, because for the former, it is important
that the guest OSknow exactly when the host side is opened
/ closed so it can do any protocol re-negotiation that may
be required.
Kevin Wolf [Tue, 9 Aug 2016 11:20:19 +0000 (13:20 +0200)]
linux-aio: Handle io_submit() failure gracefully
It is generally not expected that io_submit() fails other than with
-EAGAIN, but corner cases like SELinux refusing I/O when permissions are
revoked are still possible. In this case, we shouldn't abort, but just
return an I/O error for the request.
Peter Maydell [Wed, 10 Aug 2016 16:14:35 +0000 (17:14 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio/vhost: fixes
some bugfixes for virtio/vhost
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 10 Aug 2016 16:16:22 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
vhost-user: Attempt to fix a race with set_mem_table.
vhost-user: Introduce a new protocol feature REPLY_ACK.
vhost: check for vhost_ops before using.
Peter Maydell [Wed, 10 Aug 2016 14:59:08 +0000 (15:59 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* pc-bios/optionrom/Makefile fixes
* warning fixes for __atomic_load and -1 << x in clang
* missed interrupt fix from Gonglei
* checkpatch fix from Radim and myself
* remotes/bonzini/tags/for-upstream:
checkpatch: default to success if only warnings
checkpatch: bump most warnings to errors
CODING_STYLE, checkpatch: update line length rules
checkpatch: check for CVS keywords on all sources
checkpatch: tweak the files in which TABs are checked
timer: set vm_clock disabled default
checkpatch: ignore automatically imported Linux headers
clang: Fix warning reg. expansion to 'defined'
Disable warn about left shifts of negative values
atomic: strip "const" from variables declared with typeof
optionrom: fix compilation with mingw docker target
optionrom: add -fno-stack-protector
build-sys: fix building with make CFLAGS=.. argument
linuxboot_dma: avoid guest ABI breakage on gcc vs. clang compilation
Prerna Saxena [Fri, 5 Aug 2016 10:53:51 +0000 (03:53 -0700)]
vhost-user: Attempt to fix a race with set_mem_table.
The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify to QEMU when it finished
setting up memory, or if there were errors doing so.
As an example:
(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured on (1).
(4) The application has not yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about those GPAs.
While a guaranteed fix would require a protocol extension (committed separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before replying.
Prerna Saxena [Fri, 5 Aug 2016 10:53:50 +0000 (03:53 -0700)]
vhost-user: Introduce a new protocol feature REPLY_ACK.
This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.
If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_reply" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.
Ilya Maximets [Wed, 3 Aug 2016 05:22:49 +0000 (08:22 +0300)]
vhost: check for vhost_ops before using.
'vhost_set_vring_enable()' tries to call function using pointer to
'vhost_ops' which can be already zeroized in 'vhost_dev_cleanup()'
while vhost disconnection.
Fix that by checking 'vhost_ops' before using. This fixes QEMU crash
on calling 'ethtool -L eth0 combined 2' if vhost disconnected.
Peter Maydell [Wed, 10 Aug 2016 14:13:30 +0000 (15:13 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160810' into staging
ppc patch queue for 2016-08-10
Here are some more last minute PAPR and ppc related fixes for
qemu-2.7. One patch makes compressed memory dumps work with guest
kernels using page sizes up to 64KiB. This is important since most
current pseries guests use a 64KiB default page size. The remainder
fix a regression with handling of CPU aliases which causes serious
problem for libvirt.
# gpg: Signature made Wed 10 Aug 2016 06:44:27 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160810:
ppc/kvm: Register also a generic spapr CPU core family type
ppc/kvm: Do not mess up the generic CPU family registration
hw/ppc/spapr: Look up CPU alias names instead of hard-coding the aliases
ppc: Introduce a function to look up CPU alias strings
spapr: remove extra type variable
ppc64: fix compressed dump with pseries kernel
Paolo Bonzini [Tue, 9 Aug 2016 15:47:44 +0000 (17:47 +0200)]
checkpatch: default to success if only warnings
CHK-level checks have been removed from checkpatch or bumped to
errors, so there is no effect anymore for --strict/--subjective.
Furthermore, even most WARNs have been bumped to errors, with
WARN only reserved to things that patchew probably ought not
to complain about (and that maintainers probably will notice
anyway during review if they are extreme).
Default to exiting with success even if there are WARN-level
failures, and cause --strict to fail for warnings. Maintainers
that want to have a strict 80-character limit for their subsystem
can add it to a commit hook for example.
Paolo Bonzini [Mon, 7 Sep 2015 09:53:02 +0000 (11:53 +0200)]
CODING_STYLE, checkpatch: update line length rules
Line lengths above 80 characters do exist. They are rare, but
they happen from time to time. An ignored rule is worse than an
exception to the rule, so do the latter.
Some on the list expressed their preference for a soft limit that
is slightly lower than 80 characters, to account for extra characters
in unified diffs (including three-way diffs) and for email quoting.
However, there was no consensus on this so keep the 80-character
soft limit and add a hard limit at 90.
Paolo Bonzini [Wed, 10 Aug 2016 08:05:03 +0000 (10:05 +0200)]
checkpatch: check for CVS keywords on all sources
These should apply to all files, not just C/C++. Tweak the regular
expression to check for whole words, to avoid false positives on Perl
variables starting with "Id".
Thomas Huth [Tue, 9 Aug 2016 17:00:01 +0000 (19:00 +0200)]
ppc/kvm: Register also a generic spapr CPU core family type
There is a regression with the "-cpu" parameter introduced by
the spapr CPU hotplug code: We used to allow to specify a
"CPU family" name with the "-cpu" parameter when running on KVM so
that the user does not need to know the gory details of the exact
CPU version of the host CPU. For example, it was possible to
use "-cpu POWER8" on a POWER8E host CPU. This behavior does not
work anymore with the new hot-pluggable spapr-cpu-core types.
Since libvirt already heavily depends on the old behavior, this
is quite a severe regression in the QEMU parameter interface.
Let's fix it by supporting a CPU family type for the spapr-cpu-core
on KVM, too.