Kevin Wolf [Wed, 6 Jul 2016 09:22:39 +0000 (11:22 +0200)]
nbd-server: Use a separate BlockBackend
The builtin NBD server uses its own BlockBackend now instead of reusing
the monitor/guest device one.
This means that it has its own writethrough setting now. The builtin
NBD server always uses writeback caching now regardless of whether the
guest device has WCE enabled. qemu-nbd respects the cache mode given on
the command line.
We still need to keep a reference to the monitor BB because we put an
eject notifier on it, but we don't use it for any I/O.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for drive-mirror
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
drive-mirror to accept a node-name without lifting the restriction that
we're operating at a root node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for drive-backup
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
drive-backup and the corresponding transaction action to accept a
node-name without lifting the restriction that we're operating at a root
node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for change-backing-file
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
change-backing-file to accept a node-name without lifting the
restriction that we're operating at a root node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for blockdev-snapshot-internal-sync
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
blockdev-snapshot-internal-sync to accept a node-name without lifting
the restriction that we're operating at a root node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for blockdev-snapshot-delete-internal-sync
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
blockdev-snapshot-delete-internal-sync to accept a node-name without
lifting the restriction that we're operating at a root node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for blockdev-mirror
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
blockdev-mirror to accept a node-name without lifting the restriction
that we're operating at a root node.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for blockdev-backup
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
blockdev-backup and the corresponding transaction action to accept a
node-name without lifting the restriction that we're operating at a root
node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for block-commit
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
block-commit to accept a node-name without lifting the restriction that
we're operating at a root node.
As libvirt makes use of the DeviceNotFound error class, we must add
explicit code to retain this behaviour because qmp_get_root_bs() only
returns GenericErrors.
Kevin Wolf [Thu, 23 Jun 2016 12:20:24 +0000 (14:20 +0200)]
block: Accept node-name for block-stream
In order to remove the necessity to use BlockBackend names in the
external API, we want to allow node-names everywhere. This converts
block-stream to accept a node-name without lifting the restriction that
we're operating at a root node.
In case of an invalid device name, the command returns the GenericError
error class now instead of DeviceNotFound, because this is what
qmp_get_root_bs() returns.
Greg Kurz [Tue, 30 Aug 2016 15:02:27 +0000 (17:02 +0200)]
9pfs: handle walk of ".." in the root directory
The 9P spec at http://man.cat-v.org/plan_9/5/intro says:
All directories must support walks to the directory .. (dot-dot) meaning
parent directory, although by convention directories contain no explicit
entry for .. or . (dot). The parent of the root directory of a server's
tree is itself.
This means that a client cannot walk further than the root directory
exported by the server. In other words, if the client wants to walk
"/.." or "/foo/../..", the server should answer like the request was
to walk "/".
This patch just does that:
- we cache the QID of the root directory at attach time
- during the walk we compare the QID of each path component with the root
QID to detect if we're in a "/.." situation
- if so, we skip the current component and go to the next one
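The steps above can be sketched on a toy tree. This is only an illustration, not the 9pfs code: struct node, walk_one() and the integer "qid" are hypothetical stand-ins for the real fid/QID handling.

```c
#include <stddef.h>
#include <string.h>

/* Toy directory entry; "qid" stands in for the 9P QID of the directory. */
struct node {
    int qid;
    const struct node *parent;  /* for the export root this points outside */
    const char *name;
};

/*
 * Walk one path component starting at cur. A ".." walk from the directory
 * whose QID equals the cached root QID is skipped, so the walk can never
 * leave the exported tree. Returns NULL if the name does not exist.
 */
static const struct node *walk_one(const struct node *cur, int root_qid,
                                   const struct node *entries, size_t n,
                                   const char *name)
{
    if (strcmp(name, "..") == 0) {
        return cur->qid == root_qid ? cur : cur->parent;
    }
    for (size_t i = 0; i < n; i++) {
        if (entries[i].parent == cur && strcmp(entries[i].name, name) == 0) {
            return &entries[i];
        }
    }
    return NULL;
}
```

With this sketch, walking "foo", "..", ".." from the root ends at the root again, which is the "/foo/../.." case described above.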
Greg Kurz [Tue, 30 Aug 2016 17:11:05 +0000 (19:11 +0200)]
9pfs: forbid illegal path names
Empty path components don't make sense for most commands and may cause
undefined behavior, depending on the backend.
Also, the walk request described in the 9P spec [1] clearly shows that
the client is supposed to send individual path components: the official
Linux client never sends portions of path containing the / character, for
example.
Moreover, the 9P spec [2] also states that a system can decide to restrict
the set of supported characters used in path components, with an explicit
mention "to remove slashes from name components".
This patch introduces a new name_is_illegal() helper that checks the
names sent by the client are not empty and don't contain unwanted chars.
Since 9pfs is only supported on Linux hosts, only the / character is
checked at the moment. When support for other hosts (e.g. win32) is added,
other characters may need to be blacklisted as well.
If a client sends an illegal path component, the request will fail and
ENOENT is returned to the client.
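A minimal sketch of such a check, based only on the description above (empty names and names containing / are rejected). The body of name_is_illegal() and the check_component() caller are assumptions for illustration and may differ from the actual patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* A name is illegal if it is empty or contains the path separator. */
static bool name_is_illegal(const char *name)
{
    return name[0] == '\0' || strchr(name, '/') != NULL;
}

/* Hypothetical caller: 0 if the component is accepted, -ENOENT otherwise. */
static int check_component(const char *name)
{
    return name_is_illegal(name) ? -ENOENT : 0;
}
```

Note that "." and ".." pass this check; they are ordinary components as far as this helper is concerned.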
Paolo Bonzini [Tue, 30 Aug 2016 12:04:12 +0000 (14:04 +0200)]
Revert "Change net/socket.c to use socket_*() functions"
Since commit 7e8449594c929, the socket connect code is blocking, because
calling socket_connect() without callback is blocking. This reverts the
commit.
translate: early exit in tb_flush if there is no tcg
tb_flush does all kinds of things, which are very tcg specific. As it
is called from some places even for KVM (e.g. gdb server) it is better
to detect these cases and do an early exit.
This also fixes a crash in the gdb server that was triggered by
commit 909eaac9bbc2 ("tb hash: track translated blocks with qht").
vnc: only alloc server surface with clients connected
The VNC server was changed so that the 'vd->server' pixman
image is only allocated when a client is connected.
Since then if a client disconnects and then reconnects to
the VNC server all they will see is a black screen until
they do something that triggers a refresh. On a graphical
desktop this is not often noticed since there's many things
going on which cause a refresh. On a plain text console it
is really obvious since nothing refreshes frequently.
The problem is that the VNC server didn't update the guest
dirty bitmap, so it still believes its server image is in sync
with the guest contents.
To fix this we must explicitly mark the entire guest desktop
as dirty after re-creating the server surface. Move this
logic into vnc_update_server_surface() so it is guaranteed
to be called in all code paths that re-create the surface
instead of only in vnc_dpy_switch().
Stefan Hajnoczi [Mon, 15 Aug 2016 12:54:16 +0000 (13:54 +0100)]
virtio: decrement vq->inuse in virtqueue_discard()
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again. It's necessary to decrement vq->inuse to avoid "leaking"
the element count.
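The bookkeeping can be illustrated with a toy queue. struct toy_vq, toy_pop() and toy_discard() are hypothetical names; the real VirtQueue has many more fields and the real discard path also unmaps the element.

```c
#include <stdint.h>

struct toy_vq {
    uint16_t last_avail_idx;  /* next available-ring entry to pop */
    unsigned int inuse;       /* elements popped but not yet returned */
};

/* Popping an element advances the index and counts it as in use. */
static void toy_pop(struct toy_vq *vq)
{
    vq->last_avail_idx++;
    vq->inuse++;
}

/* Undo the most recent pop so the same element can be popped again. */
static void toy_discard(struct toy_vq *vq)
{
    vq->last_avail_idx--;
    vq->inuse--;  /* without this, inuse would "leak" one element */
}
```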
Stefan Hajnoczi [Mon, 15 Aug 2016 12:54:15 +0000 (13:54 +0100)]
virtio: recalculate vq->inuse after migration
The vq->inuse field is not migrated. Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.
At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements. For these devices we need to recalculate
vq->inuse upon load so the value is correct.
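The message does not say how the value is recomputed. One plausible sketch, assuming it is derived from the free-running 16-bit ring indices (elements popped minus elements already returned), is shown here; toy_recalc_inuse() is a hypothetical helper, not the function from the patch.

```c
#include <stdint.h>

/*
 * Elements in flight = entries popped so far minus entries already pushed
 * to the used ring. Both indices wrap at 2^16, so the subtraction is done
 * in 16-bit arithmetic.
 */
static uint16_t toy_recalc_inuse(uint16_t last_avail_idx, uint16_t used_idx)
{
    return (uint16_t)(last_avail_idx - used_idx);
}
```

Because of the 16-bit cast, the result stays correct when last_avail_idx has wrapped around and used_idx has not.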
Peter Maydell [Mon, 22 Aug 2016 09:02:28 +0000 (10:02 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 22 Aug 2016 09:06:32 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
e1000e: remove internal interrupt flag
slirp: fix segv when init failed
Cao jin [Thu, 18 Aug 2016 14:15:54 +0000 (22:15 +0800)]
e1000e: remove internal interrupt flag
Commit 66bf7d58 removed the internal MSI state flag E1000E_USE_MSI.
E1000E_USE_MSIX is not necessary either, so remove it now. The interrupt
flag field intr_state can also be removed.
Since commit f6c2e66ae8c8a, slirp uses an exit notifier to call
slirp_smb_cleanup. However, if init() failed, the notifier isn't added,
and removing it will fail:
==18447== Invalid write of size 8
==18447== at 0x7EF2B5: notifier_remove (notify.c:32)
==18447== by 0x48E80C: qemu_remove_exit_notifier (vl.c:2661)
==18447== by 0x6A2187: net_slirp_cleanup (slirp.c:134)
==18447== by 0x69419D: qemu_cleanup_net_client (net.c:338)
==18447== by 0x69445B: qemu_del_net_client (net.c:401)
==18447== by 0x6A2B81: net_slirp_init (slirp.c:366)
==18447== by 0x6A4241: net_init_slirp (slirp.c:865)
==18447== by 0x695C6D: net_client_init1 (net.c:1051)
==18447== by 0x695F6E: net_client_init (net.c:1108)
==18447== by 0x696DBA: net_init_netdev (net.c:1498)
==18447== by 0x7F1F99: qemu_opts_foreach (qemu-option.c:1116)
==18447== by 0x696E60: net_init_clients (net.c:1516)
==18447== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Sascha Silbe [Thu, 18 Aug 2016 18:46:03 +0000 (20:46 +0200)]
test-logging: don't hard-code paths in /tmp
Since f6880b7f [qemu-log: support simple pid substitution for logs],
test-logging creates files with hard-coded names in /tmp. In the best
case, this prevents multiple developers from running "make check" on
the same machine. In the worst case, it allows for symlink attacks,
enabling an attacker to overwrite files that are writable to the
developer running "make check".
Instead of hard-coding the paths, create a temporary directory using
g_dir_make_tmp() and clean it up afterwards.
Sascha Silbe [Thu, 18 Aug 2016 18:46:02 +0000 (20:46 +0200)]
glib: add compatibility implementation for g_dir_make_tmp()
We're going to make use of g_dir_make_tmp() in test-logging. Provide a
compatibility implementation of it for glib < 2.30.
It may behave differently in some edge cases (e.g. pattern only at the
end of the template, the file name is not part of the error message),
but good enough in practice.
Michal Privoznik [Fri, 19 Aug 2016 08:06:40 +0000 (10:06 +0200)]
syscall.c: Redefine IFLA_* enums
In 9c37146782 I've tried to fix a broken build with older
linux-headers. However, I didn't do it properly. The solution
implemented here is to grab the enums that caused the problem
initially, and rename their values so that they are "QEMU_"
prefixed. In order to guarantee matching values with actual
enums from linux-headers, the enums are seeded with starting
values from the original enums.
Denis V. Lunev [Wed, 17 Aug 2016 18:06:54 +0000 (21:06 +0300)]
block: fix possible reorder of flush operations
This patch reduces the CPU usage of flush operations a bit. When one
flush has completed we should kick only the next operation. We should not
start all pending operations in the hope that they will go back to wait on
wait_queue.
Also there is a technical possibility that requests will get reordered
with the previous approach. After wakeup all requests are removed from
the wait queue. They become active and they are processed one-by-one,
being added to the wait queue in the same order. However, a new flush can
arrive before all requests have been put back into the queue.
Evgeny Yakovlev [Wed, 17 Aug 2016 18:06:53 +0000 (21:06 +0300)]
block: fix deadlock in bdrv_co_flush
The following commit
commit 3ff2f67a7c24183fcbcfe1332e5223ac6f96438c
Author: Evgeny Yakovlev <[email protected]>
Date: Mon Jul 18 22:39:52 2016 +0300
block: ignore flush requests when storage is clean
has introduced a regression.
The problem is that it is still possible for 2 requests to execute
in a non-sequential fashion, and sometimes this results in a deadlock
when bdrv_drain_one/all are called for a BDS with such stalled requests.
1. Current flushed_gen and flush_started_gen are both 1.
2. Request 1 enters bdrv_co_flush with write_gen 1 (i.e. the same
as flushed_gen). It gets past flushed_gen != flush_started_gen and
sets flush_started_gen to 1 (again, the same as it was before).
3. Request 1 yields somewhere before exiting bdrv_co_flush
4. Request 2 enters bdrv_co_flush with write_gen 2. It gets past
flushed_gen != flush_started_gen and sets flush_started_gen to 2.
5. Request 2 runs to completion and sets flushed_gen to 2
6. Request 1 is resumed, runs to completion and sets flushed_gen to 1.
However flush_started_gen is now 2.
From here on out flushed_gen is always != to flush_started_gen and all
further requests will wait on flush_queue. This change replaces
flush_started_gen with an explicitly tracked active flush request.
Peter Maydell [Thu, 18 Aug 2016 09:56:40 +0000 (10:56 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Thu 18 Aug 2016 06:36:16 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net/net: properly handle multiple packets in net_fill_rstate()
net: vmxnet: use g_new for pkt initialisation
Peter Maydell [Thu, 18 Aug 2016 09:27:57 +0000 (10:27 +0100)]
Merge remote-tracking branch 'remotes/famz/tags/docker-pull-request' into staging
Fix 'make docker-test-mingw@fedora'
Peter,
This is the single patch that stalls patchew's mingw testing. Since it
is small and trivial, let's have it in 2.7.
Fam
# gpg: Signature made Wed 17 Aug 2016 13:13:53 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/docker-pull-request:
curl: Cast fd to int for DPRINTF
Zhang Chen [Thu, 18 Aug 2016 03:23:25 +0000 (11:23 +0800)]
net/net: properly handle multiple packets in net_fill_rstate()
When the network is busy, we will receive multiple packets at one time. In
that situation, we should keep trying to do the receiving instead of
finalizing only the first packet.
Li Qiang [Tue, 16 Aug 2016 11:28:01 +0000 (16:58 +0530)]
net: vmxnet: use g_new for pkt initialisation
When the network transport abstraction layer initialises pkt, the maximum
fragmentation count is not checked. This could lead to an integer
overflow causing a NULL pointer dereference. Replace g_malloc() with
g_new() to catch the multiplication overflow.
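The kind of check that an overflow-aware array allocator performs can be sketched in plain C. alloc_array_checked() is a hypothetical helper written for illustration; it is not glib's implementation, and it returns NULL where glib may react differently.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of elem_size bytes each. Returns NULL if
 * n * elem_size would not fit in size_t, instead of silently allocating
 * a too-small buffer.
 */
static void *alloc_array_checked(size_t n, size_t elem_size)
{
    if (elem_size != 0 && n > SIZE_MAX / elem_size) {
        return NULL;  /* the multiplication would wrap around */
    }
    return malloc(n * elem_size);
}
```

A plain malloc(n * elem_size) has no such guard: the product wraps, the allocation succeeds with a small size, and later accesses run past the buffer.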
Fam Zheng [Mon, 1 Aug 2016 05:04:48 +0000 (13:04 +0800)]
curl: Cast fd to int for DPRINTF
Currently "make docker-test-mingw@fedora" has a warning like:
/tmp/qemu-test/src/block/curl.c: In function 'curl_sock_cb':
/tmp/qemu-test/src/block/curl.c:172:6: warning: format '%d' expects
argument of type 'int', but argument 4 has type 'curl_socket_t {aka long
long unsigned int}'
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
^
cc1: all warnings being treated as errors
Peter Maydell [Thu, 11 Aug 2016 17:59:39 +0000 (18:59 +0100)]
linux-user: Fix llseek with high bit of offset_low set
The llseek syscall takes two 32-bit arguments, offset_high
and offset_low, which must be combined to form a single
64-bit offset. Unfortunately we were combining them with
((uint64_t)arg2 << 32) | arg3
and arg3 is a signed type; this meant that when promoting
arg3 to a 64-bit type it would be sign-extended. The effect
was that if the offset happened to have bit 31 set then
this bit would get sign-extended into all of bits 63..32.
Explicitly cast arg3 to abi_ulong to avoid the erroneous
sign extension.
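The effect is easy to reproduce in isolation. In this sketch int32_t and uint32_t stand in for a 32-bit abi_long and abi_ulong; combine_buggy() and combine_fixed() are illustrative names, not functions from linux-user.

```c
#include <stdint.h>

/* Buggy form: the signed low word is sign-extended to 64 bits before the OR. */
static uint64_t combine_buggy(uint32_t high, int32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Fixed form: convert the low word to an unsigned 32-bit type first. */
static uint64_t combine_fixed(uint32_t high, int32_t low)
{
    return ((uint64_t)high << 32) | (uint32_t)low;
}
```

For high = 1 and a low word with only bit 31 set, the buggy form yields 0xFFFFFFFF80000000 while the fixed form yields the intended 0x0000000180000000.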
Michal Privoznik [Tue, 16 Aug 2016 09:47:43 +0000 (11:47 +0200)]
syscall.c: Fix build with older linux-headers
In c5dff280 we tried to improve our understanding of netlink messages
by adding code that does some translation. However, that code assumed
linux-headers to be at least version 4.4, because most of the symbols
it uses (if not all of them) were added in just that release. This
breaks the build on systems with older versions of the package.
Thomas Huth [Mon, 15 Aug 2016 08:24:54 +0000 (10:24 +0200)]
slirp: Rename "struct arphdr" to "struct slirp_arphdr"
struct arphdr is already used by the system headers on OpenBSD
and thus QEMU does not compile here anymore. Fix it by renaming
our struct to slirp_arphdr instead.
Since commit d7a04fd7d5008, tcp_chr_wait_connected() was introduced,
so vhost-user could wait until a backend started successfully. In
vhost-user case, the chr socket must be plain unix, and the chr+vhost
setup happens synchronously during qemu startup.
However, with TLS and telnet socket, initial socket setup happens
asynchronously, and s->connected is not set after the socket is
accepted. In order for tcp_chr_wait_connected() to not keep accepting
new connections and proceed with the last accepted socket, it can
check for s->ioc instead.
The virtio-gpu.h file defines a macro VIRTIO_GPU_FILL_CMD
which includes a call to qemu_log_mask, but does not
include qemu/log.h. In a default configure, it is lucky
and gets qemu/log.h indirectly due to the 'log' trace
backend being enabled. If that trace backend is disabled
though, eg
./configure --enable-trace-backends=nop
Then the build will fail:
In file included from /home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:19:0:
/home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c: In function ‘virgl_cmd_create_resource_2d’:
/home/berrange/src/virt/qemu/include/hw/virtio/virtio-gpu.h:138:13: error: implicit declaration of function ‘qemu_log_mask’ [-Werror=implicit-function-declaration]
qemu_log_mask(LOG_GUEST_ERROR, \
^
/home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:34:5: note: in expansion of macro ‘VIRTIO_GPU_FILL_CMD’
VIRTIO_GPU_FILL_CMD(c2d);
^~~~~~~~~~~~~~~~~~~
/home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:34:5: error: nested extern declaration of ‘qemu_log_mask’ [-Werror=nested-externs]
In file included from /home/berrange/src/virt/qemu/hw/display/virtio-gpu-3d.c:19:0:
/home/berrange/src/virt/qemu/include/hw/virtio/virtio-gpu.h:138:27: error: ‘LOG_GUEST_ERROR’ undeclared (first use in this function)
qemu_log_mask(LOG_GUEST_ERROR, \
Peter Maydell [Tue, 16 Aug 2016 08:32:40 +0000 (09:32 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches for 2.7.0-rc3
# gpg: Signature made Mon 15 Aug 2016 14:55:46 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
iotests: Test case for wrong runtime option types
block/nbd: Store runtime option values
block/blkdebug: Store config filename
block/nbd: Use QemuOpts for runtime options
block/ssh: Use QemuOpts for runtime options
Peter Maydell [Mon, 15 Aug 2016 20:48:03 +0000 (21:48 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160815' into staging
ppc patch queue for 2016-08-15
Just a single patch here, I hope this is the last ppc / spapr fix to
squeeze into qemu-2.7.
# gpg: Signature made Mon 15 Aug 2016 07:46:36 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160815:
ppc: parse cpu features once
Peter Maydell [Mon, 15 Aug 2016 18:04:51 +0000 (19:04 +0100)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160812-tag-2' into staging
Xen 2016/08/12, fixed commit message
# gpg: Signature made Sat 13 Aug 2016 00:39:09 BST
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20160812-tag-2:
xen: handle inbound migration of VMs without ioreq server pages
Xen: fix Coverity warning of xen_pt_config_init()
Peter Maydell [Mon, 8 Aug 2016 16:11:28 +0000 (17:11 +0100)]
pc-bios/optionrom: Fix OpenBSD build with better detection of linker emulation
The various host OSes are irritatingly variable about the name
of the linker emulation we need to pass to ld's -m option to
build the i386 option ROMs. Instead of doing this via a
CONFIG ifdef, check in configure whether any of the emulation
names we know about will work and pass the right answer through
to the makefile. If we can't find one, we fall back to not trying
to build the option ROMs, in the same way we would for a non-x86
host platform.
This is in particular necessary to unbreak the build on OpenBSD,
since it wants a different answer to FreeBSD and we don't have
an existing CONFIG_ variable that distinguishes the two.
Max Reitz [Mon, 15 Aug 2016 13:29:26 +0000 (15:29 +0200)]
block/nbd: Store runtime option values
Store the runtime option values in the BDRVNBDState so they can later be
used in nbd_refresh_filename() without having to directly access the
options QDict which may contain values of non-string types.
Max Reitz [Mon, 15 Aug 2016 13:29:25 +0000 (15:29 +0200)]
block/blkdebug: Store config filename
Store the configuration file's filename so it can later be used in
bdrv_refresh_filename() without having to directly access the options
QDict which may contain a value of a non-string type.
Max Reitz [Mon, 15 Aug 2016 13:29:24 +0000 (15:29 +0200)]
block/nbd: Use QemuOpts for runtime options
Using QemuOpts will prevent qemu from crashing if the input options have
not been validated (which is the case when they are specified on the
command line or in a json: filename) and some have the wrong type.
Max Reitz [Mon, 15 Aug 2016 13:29:23 +0000 (15:29 +0200)]
block/ssh: Use QemuOpts for runtime options
Using QemuOpts will prevent qemu from crashing if the input options have
not been validated (which is the case when they are specified on the
command line or in a json: filename) and some have the wrong type.
Greg Kurz [Wed, 10 Aug 2016 19:08:01 +0000 (21:08 +0200)]
ppc: parse cpu features once
Considering that features are converted to global properties and
global properties are automatically applied to every new instance
of created CPU (at object_new() time), there is no point in
parsing the cpu_model string every time a CPU is created. So move
parsing outside the CPU creation loop and do it only once.
Parsing also should be done before any CPU is created so that
features would affect the first CPU as well.
This patch does that for all PowerPC machine types.
Signed-off-by: Greg Kurz <[email protected]>
[clg: only kept the fix for the spapr platform. support for other
platform will be added in 2.8 ]
Signed-off-by: Cédric Le Goater <[email protected]>
Tested-by: Bharata B Rao <[email protected]>
Signed-off-by: David Gibson <[email protected]>
Paul Durrant [Mon, 1 Aug 2016 09:16:25 +0000 (10:16 +0100)]
xen: handle inbound migration of VMs without ioreq server pages
VMs created on older versions of Xen will not have been provisioned with
pages to support creation of non-default ioreq servers. In this case
the ioreq server API is not supported and QEMU's only option is to fall
back to using the default ioreq server pages as it did prior to
commit 3996e85c ("Xen: Use the ioreq-server API when available").
This patch therefore changes the code in xen_common.h to stop considering
a failure of xc_hvm_create_ioreq_server() as a hard failure but simply
as an indication that the guest is too old to support the ioreq server
API. Instead a boolean is set to cause reversion to old behaviour such
that the default ioreq server is then used.
Cao jin [Mon, 25 Jan 2016 12:16:03 +0000 (20:16 +0800)]
Xen: fix Coverity warning of xen_pt_config_init()
emu_regs is a pointer, so ARRAY_SIZE doesn't return what we expect.
Since the remaining message is enough for debugging, just remove it.
Also tweaked the message a little.
Peter Maydell [Thu, 4 Aug 2016 11:14:36 +0000 (12:14 +0100)]
Update ancient copyright string in -version output
Currently the -version command line argument prints a string ending
with "Copyright (c) 2003-2008 Fabrice Bellard". This is now some
eight years out of date; abstract it out of the several places that
print the string and update it to:
Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers
to reflect the work by all the QEMU Project contributors over the
last decade.
Liang Li [Tue, 9 Aug 2016 00:22:26 +0000 (08:22 +0800)]
migration: fix live migration failure with compression
Commit 11808bb0c422 removed some condition checks on
'f->ops->writev_buffer'. As a result, 'qemu_put_qemu_file' must now
clear 'f_src->iovcnt'; otherwise 'f_src->iovcnt' may exceed
MAX_IOV_SIZE, which breaks live migration.
mmap man page:
"On success, mmap() returns a pointer to the mapped area. On error, the
value MAP_FAILED (that is, (void *) -1) is returned, and errno is set
to indicate the cause of the error."
The check in postcopy_get_tmp_page is definitely wrong and should be
fixed.
virtio-console: set frontend open permanently for console devs
The virtio-console.c file handles both serial consoles
and interactive consoles, since they're backed by the
same device model.
Since serial devices are expected to be reliable and
need to notify the guest when the backend is opened
or closed, the virtio-console.c file wires up support
for chardev events. This affects both serial consoles
and interactive consoles, using a network connection
based chardev backend such as 'socket', but not when
using a PTY based backend or plain 'file' backends.
When the host side is not connected the handle_output()
method in virtio-serial-bus.c will drop any data sent
by the guest, before it even reaches the virtio-console.c
code. This means that if the chardev has a logfile
configured, the data will never get logged.
Consider, for example, configuring an x86_64 guest with a
plain UART serial port
The isa-serial one gets data written to the log regardless
of whether a client is connected, while the virtioconsole
one only gets data written to the log when a client is
connected.
There is no need for virtio-serial-bus.c to aggressively
drop the data for console devices, as the chardev code is
perfectly capable of discarding the data itself.
So this patch changes virtconsole devices so that they
are always marked as having the host side open. This
ensures that the guest OS will always send any data it
has (Linux virtio-console hvc driver actually ignores
the host open state and sends data regardless, but we
should not rely on that), and also prevents the
virtio-serial-bus code prematurely discarding data.
The behaviour of virtserialport devices is *not* changed,
only virtconsole, because for the former, it is important
that the guest OS knows exactly when the host side is opened
/ closed so it can do any protocol re-negotiation that may
be required.
Kevin Wolf [Tue, 9 Aug 2016 11:20:19 +0000 (13:20 +0200)]
linux-aio: Handle io_submit() failure gracefully
It is generally not expected that io_submit() fails other than with
-EAGAIN, but corner cases like SELinux refusing I/O when permissions are
revoked are still possible. In this case, we shouldn't abort, but just
return an I/O error for the request.
Peter Maydell [Wed, 10 Aug 2016 16:14:35 +0000 (17:14 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio/vhost: fixes
some bugfixes for virtio/vhost
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Wed 10 Aug 2016 16:16:22 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
vhost-user: Attempt to fix a race with set_mem_table.
vhost-user: Introduce a new protocol feature REPLY_ACK.
vhost: check for vhost_ops before using.
Peter Maydell [Wed, 10 Aug 2016 14:59:08 +0000 (15:59 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* pc-bios/optionrom/Makefile fixes
* warning fixes for __atomic_load and -1 << x in clang
* missed interrupt fix from Gonglei
* checkpatch fix from Radim and myself
* remotes/bonzini/tags/for-upstream:
checkpatch: default to success if only warnings
checkpatch: bump most warnings to errors
CODING_STYLE, checkpatch: update line length rules
checkpatch: check for CVS keywords on all sources
checkpatch: tweak the files in which TABs are checked
timer: set vm_clock disabled default
checkpatch: ignore automatically imported Linux headers
clang: Fix warning reg. expansion to 'defined'
Disable warn about left shifts of negative values
atomic: strip "const" from variables declared with typeof
optionrom: fix compilation with mingw docker target
optionrom: add -fno-stack-protector
build-sys: fix building with make CFLAGS=.. argument
linuxboot_dma: avoid guest ABI breakage on gcc vs. clang compilation
Prerna Saxena [Fri, 5 Aug 2016 10:53:51 +0000 (03:53 -0700)]
vhost-user: Attempt to fix a race with set_mem_table.
The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up memory, or if there were errors doing so.
As an example:
(1) QEMU sends a SET_MEM_TABLE to the backend (e.g. a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) QEMU commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured on (1).
(4) The application has not yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about those GPAs.
While a guaranteed fix would require a protocol extension (committed separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before replying.
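The ordering argument above can be sketched as a toy model. This is not QEMU code: struct backend, backend_handle() and set_mem_table_with_barrier() are hypothetical names, and the backend is assumed to process messages strictly in arrival order, which is exactly the condition under which the workaround helps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message types, named after the vhost-user requests. */
enum msg_type { MSG_SET_MEM_TABLE, MSG_GET_FEATURES };

/* Toy backend that handles messages strictly in arrival order. */
struct backend {
    bool mem_table_mapped;  /* set once SET_MEM_TABLE has been processed */
};

/* Handle one message; returns true if the backend sends a reply. */
static bool backend_handle(struct backend *b, enum msg_type type)
{
    assert(b != NULL);
    switch (type) {
    case MSG_SET_MEM_TABLE:
        b->mem_table_mapped = true;  /* no reply is sent for this message */
        return false;
    case MSG_GET_FEATURES:
        return true;                 /* this message is always answered */
    }
    return false;
}

/*
 * Best-effort barrier: send SET_MEM_TABLE, then GET_FEATURES, and treat the
 * GET_FEATURES reply as a hint that the earlier message has been processed.
 * This only holds for backends that process messages synchronously.
 */
static bool set_mem_table_with_barrier(struct backend *b)
{
    backend_handle(b, MSG_SET_MEM_TABLE);
    bool replied = backend_handle(b, MSG_GET_FEATURES);
    return replied && b->mem_table_mapped;
}
```

A backend that queues SET_MEM_TABLE and answers GET_FEATURES from another thread would defeat this barrier, which is why the message calls it best-effort.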
Prerna Saxena [Fri, 5 Aug 2016 10:53:50 +0000 (03:53 -0700)]
vhost-user: Introduce a new protocol feature REPLY_ACK.
This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.
If negotiated, client applications should send a u64 payload in
response to any message that has the "need_reply" bit set
in the message flags. Setting the payload to zero indicates that the
command finished successfully. Likewise, setting it to a non-zero value
indicates an error.
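A minimal sketch of the acknowledgement convention described above. The header layout, the helper names and the bit position chosen for NEED_REPLY_MASK are assumptions made for illustration; the vhost-user specification is the authority on the real values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the "need_reply" bit in the message flags. */
#define NEED_REPLY_MASK (UINT32_C(1) << 3)

/* Hypothetical, simplified message header. */
struct msg_hdr {
    uint32_t request;
    uint32_t flags;
    uint32_t size;
};

/* Does the sender expect a u64 acknowledgement for this message? */
static bool msg_needs_reply(const struct msg_hdr *hdr)
{
    assert(hdr != NULL);
    return (hdr->flags & NEED_REPLY_MASK) != 0;
}

/* Build the u64 payload: zero on success, non-zero on any error. */
static uint64_t make_ack_payload(int err)
{
    return err == 0 ? 0 : 1;
}

/* Interpret a received acknowledgement payload. */
static bool ack_is_success(uint64_t payload)
{
    return payload == 0;
}
```

The receiver only tests for zero versus non-zero, so the sketch does not try to encode a specific error code in the payload.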
Ilya Maximets [Wed, 3 Aug 2016 05:22:49 +0000 (08:22 +0300)]
vhost: check for vhost_ops before using.
'vhost_set_vring_enable()' tries to call a function through the
'vhost_ops' pointer, which may already have been zeroed in
'vhost_dev_cleanup()' during vhost disconnection.
Fix that by checking 'vhost_ops' before using it. This fixes a QEMU crash
when calling 'ethtool -L eth0 combined 2' after vhost has disconnected.
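The guard can be illustrated with a small stand-alone sketch. The toy_* types and functions are hypothetical stand-ins rather than the QEMU structures, and returning 0 when the ops table is gone is an assumption about the desired behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_dev;

/* Hypothetical ops table, standing in for the backend callbacks. */
struct toy_ops {
    int (*set_vring_enable)(struct toy_dev *dev, int enable);
};

struct toy_dev {
    const struct toy_ops *ops;  /* NULL once the device has been cleaned up */
    int enabled;
};

static int toy_backend_enable(struct toy_dev *dev, int enable)
{
    dev->enabled = enable;
    return 0;
}

static const struct toy_ops toy_backend_ops = { toy_backend_enable };

/* Cleanup zeroes the whole device, including the ops pointer. */
static void toy_dev_cleanup(struct toy_dev *dev)
{
    assert(dev != NULL);
    memset(dev, 0, sizeof(*dev));
}

/* Check the ops table before calling through it. */
static int toy_set_vring_enable(struct toy_dev *dev, int enable)
{
    assert(dev != NULL);
    if (dev->ops && dev->ops->set_vring_enable) {
        return dev->ops->set_vring_enable(dev, enable);
    }
    return 0;  /* assumed: treat a disconnected backend as a no-op */
}
```

Without the check, the call after toy_dev_cleanup() would dereference a NULL pointer, which corresponds to the crash described above.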
Peter Maydell [Wed, 10 Aug 2016 14:13:30 +0000 (15:13 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160810' into staging
ppc patch queue for 2016-08-10
Here are some more last minute PAPR and ppc related fixes for
qemu-2.7. One patch makes compressed memory dumps work with guest
kernels using page sizes up to 64KiB. This is important since most
current pseries guests use a 64KiB default page size. The remainder
fix a regression in the handling of CPU aliases, which causes serious
problems for libvirt.
# gpg: Signature made Wed 10 Aug 2016 06:44:27 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160810:
ppc/kvm: Register also a generic spapr CPU core family type
ppc/kvm: Do not mess up the generic CPU family registration
hw/ppc/spapr: Look up CPU alias names instead of hard-coding the aliases
ppc: Introduce a function to look up CPU alias strings
spapr: remove extra type variable
ppc64: fix compressed dump with pseries kernel
Paolo Bonzini [Tue, 9 Aug 2016 15:47:44 +0000 (17:47 +0200)]
checkpatch: default to success if only warnings
CHK-level checks have been removed from checkpatch or bumped to
errors, so there is no effect anymore for --strict/--subjective.
Furthermore, even most WARNs have been bumped to errors, with
WARN reserved only for things that patchew probably ought not
to complain about (and that maintainers probably will notice
anyway during review if they are extreme).
Default to exiting with success even if there are WARN-level
failures, and cause --strict to fail for warnings. Maintainers
that want to have a strict 80-character limit for their subsystem
can add it to a commit hook for example.
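The resulting exit-status policy can be written down as a tiny function. checkpatch itself is a Perl script; this C sketch only restates the rule, and check_exit_status() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns the process exit status: 0 for success, 1 for failure.
 * Errors always fail; warnings fail only when strict mode is requested.
 */
static int check_exit_status(int errors, int warnings, bool strict)
{
    assert(errors >= 0 && warnings >= 0);
    if (errors > 0) {
        return 1;
    }
    if (strict && warnings > 0) {
        return 1;
    }
    return 0;
}
```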
Paolo Bonzini [Mon, 7 Sep 2015 09:53:02 +0000 (11:53 +0200)]
CODING_STYLE, checkpatch: update line length rules
Line lengths above 80 characters do exist. They are rare, but
they happen from time to time. An ignored rule is worse than an
exception to the rule, so do the latter.
Some on the list expressed their preference for a soft limit that
is slightly lower than 80 characters, to account for extra characters
in unified diffs (including three-way diffs) and for email quoting.
However, there was no consensus on this so keep the 80-character
soft limit and add a hard limit at 90.
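As a sketch of the two-level rule, the function below classifies a line by length. It is an illustration in C, not the checkpatch implementation; classify_line_length() is a hypothetical name, and it assumes that lengths strictly above each limit trigger the diagnostic and that one byte equals one column (no tabs or multi-byte characters).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum line_verdict { LINE_OK, LINE_WARN, LINE_ERROR };

#define SOFT_LIMIT 80  /* exceeding this is a warning */
#define HARD_LIMIT 90  /* exceeding this is an error */

/* Classify one line (without its trailing newline) by its length. */
static enum line_verdict classify_line_length(const char *line)
{
    assert(line != NULL);
    size_t columns = strlen(line);

    if (columns > HARD_LIMIT) {
        return LINE_ERROR;
    }
    if (columns > SOFT_LIMIT) {
        return LINE_WARN;
    }
    return LINE_OK;
}
```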
Paolo Bonzini [Wed, 10 Aug 2016 08:05:03 +0000 (10:05 +0200)]
checkpatch: check for CVS keywords on all sources
These should apply to all files, not just C/C++. Tweak the regular
expression to check for whole words, to avoid false positives on Perl
variables starting with "Id".
Thomas Huth [Tue, 9 Aug 2016 17:00:01 +0000 (19:00 +0200)]
ppc/kvm: Register also a generic spapr CPU core family type
There is a regression with the "-cpu" parameter introduced by
the spapr CPU hotplug code: We used to allow specifying a
"CPU family" name with the "-cpu" parameter when running on KVM so
that the user does not need to know the gory details of the exact
CPU version of the host CPU. For example, it was possible to
use "-cpu POWER8" on a POWER8E host CPU. This behavior no longer
works with the new hot-pluggable spapr-cpu-core types.
Since libvirt already heavily depends on the old behavior, this
is quite a severe regression in the QEMU parameter interface.
Let's fix it by supporting a CPU family type for the spapr-cpu-core
on KVM, too.
Thomas Huth [Tue, 9 Aug 2016 17:00:00 +0000 (19:00 +0200)]
ppc/kvm: Do not mess up the generic CPU family registration
The code for registering the sPAPR CPU host core type has been
added in between the generic CPU host core type and the generic
CPU family type. That way the instance_init and the class_init
information got lost when registering the generic CPU family
type. Fix it by moving the generic family registration before
the spapr cpu core registration code.
Thomas Huth [Tue, 9 Aug 2016 16:59:59 +0000 (18:59 +0200)]
hw/ppc/spapr: Look up CPU alias names instead of hard-coding the aliases
Hard-coding the CPU alias names in the spapr_cores[] array has
two big disadvantages:
1) We register a real type with the CPU alias name in
spapr_cpu_core_register_types() - this prevents us from registering
a CPU family name in kvm_ppc_register_host_cpu_type() with the same
name (as we do it for the non-hotpluggable CPU types).
2) It's quite cumbersome to maintain the aliases here in sync with the
ppc_cpu_aliases list from target-ppc/cpu-models.c.
So let's simply add proper alias lookup to the spapr cpu core code,
too (by checking whether the given model can be used directly, and
if not by trying to look up the given model as an alias name instead).
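The two-step lookup described above (try the name as a model first, then as an alias) might look roughly like this. The tables, their sample entries and the function names are illustrative assumptions, not the contents of ppc_cpu_aliases, and the comparison here is case-sensitive for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct cpu_alias {
    const char *alias;
    const char *model;
};

/* Illustrative tables only; the entries are sample values. */
static const char *const known_models[] = {
    "POWER7_v2.3", "POWER8_v2.0", NULL
};
static const struct cpu_alias alias_table[] = {
    { "POWER7", "POWER7_v2.3" },
    { "POWER8", "POWER8_v2.0" },
    { NULL, NULL }
};

/* Can this name be used directly as a model? */
static bool model_is_known(const char *name)
{
    for (size_t i = 0; known_models[i] != NULL; i++) {
        if (strcmp(known_models[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Resolve a user-supplied name to a model name, or NULL if unknown. */
static const char *resolve_cpu_model(const char *name)
{
    assert(name != NULL);
    if (model_is_known(name)) {
        return name;  /* step 1: usable directly */
    }
    for (size_t i = 0; alias_table[i].alias != NULL; i++) {
        if (strcmp(alias_table[i].alias, name) == 0 &&
            model_is_known(alias_table[i].model)) {
            return alias_table[i].model;  /* step 2: alias lookup */
        }
    }
    return NULL;
}
```

Because aliases are resolved at lookup time rather than registered as types, the alias name stays free for a later CPU family registration.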
Laurent Vivier [Mon, 8 Aug 2016 13:08:53 +0000 (15:08 +0200)]
ppc64: fix compressed dump with pseries kernel
If we don't provide the page size in target-ppc:cpu_get_dump_info(),
the default one (TARGET_PAGE_SIZE, 4KB) is used to create
the compressed dump. It works fine with Macintosh, but not with
pseries as the kernel default page size is 64KB.
Without this patch, if we generate a compressed dump in the QEMU monitor:
page_size is used to determine the dump file's block size. The
block size needs to be at least the page size, but a multiple of the page
size works fine too. For PPC64, Linux supports either a 4KB or a 64KB software
page size, so we define page_size as 64KB.
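The constraint on the block size can be checked with a few lines of arithmetic. This is a sketch of the rule as stated above, with a hypothetical helper name; it is not taken from the dump code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KiB ((size_t)1024)

/*
 * A dump block size works for a guest page size if it is at least
 * one page and a whole multiple of the page size.
 */
static bool block_size_is_valid(size_t block_size, size_t guest_page_size)
{
    assert(guest_page_size > 0);
    return block_size >= guest_page_size &&
           block_size % guest_page_size == 0;
}
```

With a 64KB block size both supported guest page sizes pass the check, while the 4KB default fails for a 64KB-page guest.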
Gonglei [Tue, 9 Aug 2016 07:49:15 +0000 (15:49 +0800)]
timer: set vm_clock disabled default
(commit 80dcfb8532ae76343109a48f12ba8ca1c505c179)
Upon migration, the code uses a timer based on vm_clock, set 1ns
in the future from post_load, to send the event in case host_connected
differs between migration source and target.
However, it's not guaranteed that the APIC is ready to inject IRQs into
the guest, so the IRQ line can remain high, resulting in any future interrupts
going unnoticed by the guest as well.
That's because 1) the migration coroutine is not blocked when it gets EAGAIN
while reading QEMUFile, and 2) vm_clock is currently enabled by default and
doesn't rely on vm_start() being called, which means vm_clock timers can run
before VCPUs are running.
So, let's disable vm_clock by default, keeping the original design
intention for vm_clock timers.
Meanwhile, change the test-aio use case to use QEMU_CLOCK_REALTIME instead of
QEMU_CLOCK_VIRTUAL, as the block code does.
Radim Krčmář [Tue, 9 Aug 2016 17:38:41 +0000 (19:38 +0200)]
checkpatch: ignore automatically imported Linux headers
Linux uses tabs for indentation and checkpatch always complained about
automatically imported headers. update-linux-headers.sh could be modified to
expand tabs, but there is no real reason to complain about any ugly code in
Linux headers, so skip all hunk-related checks.