Use a single allocation for CharDriverState, this avoids extra
allocations & pointers, and is a step towards more object-oriented
CharDriver.
Gtk console is a bit peculiar, gd_vc_chr_set_echo() used to have a
temporary VirtualConsole to save the echo bit. Instead now, we consider
whether vcd->console is set or not, and restore the echo bit saved in
VCDriverState when calling gd_vc_vte_init().
The casts added are temporary, they are replaced with QOM type-safe
macros in a later patch in this series.
test.char.exe fails to link:
qemu-char.o: In function `win_chr_free':
/home/elmarco/src/qemu/qemu-char.c:2149: undefined reference to `qemu_del_polling_cb'
/home/elmarco/src/qemu/qemu-char.c:2151: undefined reference to `qemu_del_polling_cb'
qemu-char.o: In function `win_stdio_thread':
/home/elmarco/src/qemu/qemu-char.c:2568: undefined reference to `qemu_del_wait_object'
qemu-char.o: In function `qemu_chr_open_stdio':
/home/elmarco/src/qemu/qemu-char.c:2661: undefined reference to `qemu_add_wait_object'
/home/elmarco/src/qemu/qemu-char.c:2646: undefined reference to
`qemu_add_wait_object'
...
It needs main-loop.o symbols, among others. Linking with
$(test-block-obj-y) brings what's necessary. We could try to eventually
strip to the minimum if needed.
x86-KVM: Supply TSC and APIC clock rates to guest like VMWare
This fixes timekeeping of x86-64 Darwin/OS X/macOS guests when using KVM.
Darwin/OS X/macOS for x86-64 uses the TSC for timekeeping; it normally calibrates this by querying various clock frequency scaling MSRs. Details depend on the exact CPU model detected. The local APIC timer frequency is extracted from (EFI) firmware.
This is problematic in the presence of virtualisation, as the MSRs in question are typically not handled by the hypervisor. VMWare (Fusion) advertises TSC and APIC frequency via a custom 0x40000010 CPUID leaf, in the eax and ebx registers respectively. This is documented at https://lwn.net/Articles/301888/ among other places.
Darwin/OS X/macOS looks for the generic 0x40000000 hypervisor leaf, and if this indicates via eax that leaf 0x40000010 might be available, that is in turn queried for the two frequencies.
This adds a CPU option "vmware-cpuid-freq" to enable the same behaviour when running Qemu with KVM acceleration, if the KVM TSC frequency can be determined, and it is stable. (invtsc or user-specified) The virtualised APIC bus cycle is hardcoded to 1GHz in KVM, so ebx of the CPUID leaf is also hardcoded to this value.
Eric Farman [Fri, 20 Jan 2017 16:25:27 +0000 (17:25 +0100)]
block: get max_transfer limit for char (scsi-generic) devices
We can get the maximum number of bytes for a single I/O transfer
from the BLKSECTGET ioctl, but we only perform this for block
devices. scsi-generic devices are represented as character devices,
and so do not issue this today. Update this, so that virtio-scsi
devices using the scsi-generic interface can return the same data.
Eric Farman [Fri, 20 Jan 2017 16:25:26 +0000 (17:25 +0100)]
block: Fix target variable of BLKSECTGET ioctl
Commit 6f6071745bd0 ("raw-posix: Fetch max sectors for host block device")
introduced a routine to call the kernel BLKSECTGET ioctl, which stores the
result back to user space. However, the size of the data returned depends
on the routine handling the ioctl. The (compat_)blkdev_ioctl returns a
short, while sg_ioctl returns an int. Thus, on big-endian systems, we can
find ourselves accidentally shifting the result to a much larger value.
(On s390x, a short is 16 bits while an int is 32 bits.)
Also, the two ioctl handlers return values in different scales (block
returns sectors, while sg returns bytes), so some tweaking of the outputs
is required such that hdev_get_max_transfer_length returns a value in a
consistent set of units.
Eric Farman [Fri, 20 Jan 2017 16:25:25 +0000 (17:25 +0100)]
hw/scsi: Fix debug message of cdb structure in scsi-generic
When running with debug enabled, the scsi-generic cdb that is
dumped skips byte 0 of the command, which is the opcode. This
makes identifying which command is being issued/completed a
little difficult. Example:
Improve this by adding a message prior to the loop, similar to
what exists for scsi-disk. Clean up a few other messages to be
more explicit of what is being represented. Example:
Peter Lieven [Mon, 16 Jan 2017 15:17:12 +0000 (16:17 +0100)]
block/iscsi: avoid data corruption with cache=writeback
nb_cls_shrunk in iscsi_allocmap_update can become -1 if the
request starts and ends within the same cluster. This results
in passing -1 to bitmap_set and bitmap_clear and they don't
handle negative values properly. In the end this leads to data
corruption.
Laszlo Ersek [Thu, 26 Jan 2017 01:44:15 +0000 (02:44 +0100)]
hw/isa/lpc_ich9: add broadcast SMI feature
The generic edk2 SMM infrastructure prefers
EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. If
Trigger() only brings the current processor into SMM, then edk2 handles it
in the following ways:
(1) If Trigger() is executed by the BSP (which is guaranteed before
ExitBootServices(), but is not necessarily true at runtime), then:
(a) If edk2 has been configured for "traditional" SMM synchronization,
then the BSP sends directed SMIs to the APs with APIC delivery,
bringing them into SMM individually. Then the BSP runs the SMI
handler / dispatcher.
(b) If edk2 has been configured for "relaxed" SMM synchronization,
then the APs that are not already in SMM are not brought in, and
the BSP runs the SMI handler / dispatcher.
(2) If Trigger() is executed by an AP (which is possible after
ExitBootServices(), and can be forced e.g. by "taskset -c 1
efibootmgr"), then the AP in question brings in the BSP with a
directed SMI, and the BSP runs the SMI handler / dispatcher.
The smaller problem with (1a) and (2) is that the BSP and AP
synchronization is slow. For example, the "taskset -c 1 efibootmgr"
command from (2) can take more than 3 seconds to complete, because
efibootmgr accesses non-volatile UEFI variables intensively.
The larger problem is that QEMU's current behavior diverges from the
behavior usually seen on physical hardware, and that keeps exposing
obscure corner cases, race conditions and other instabilities in edk2,
which generally expects / prefers a software SMI to affect all CPUs at
once.
Therefore introduce the "broadcast SMI" feature that causes QEMU to inject
the SMI on all VCPUs.
While the original posting of this patch
<http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html>
only intended to speed up (2), based on our recent "stress testing" of SMM
this patch actually provides functional improvements.
Laszlo Ersek [Thu, 26 Jan 2017 01:44:14 +0000 (02:44 +0100)]
hw/isa/lpc_ich9: add SMI feature negotiation via fw_cfg
Introduce the following fw_cfg files:
- "etc/smi/supported-features": a little endian uint64_t feature bitmap,
presenting the features known by the host to the guest. Read-only for
the guest.
The content of this file will be determined via bit-granularity ICH9-LPC
device properties, to be introduced later. For now, the bitmask is left
zeroed. The bits will be set from machine type compat properties and on
the QEMU command line, hence this file is not migrated.
- "etc/smi/requested-features": a little endian uint64_t feature bitmap,
representing the features the guest would like to request. Read-write
for the guest.
The guest can freely (re)write this file, it has no direct consequence.
Initial value is zero. A nonzero value causes the SMI-related fw_cfg
files and fields that are under guest influence to be migrated.
- "etc/smi/features-ok": contains a uint8_t value, and it is read-only for
the guest. When the guest selects the associated fw_cfg key, the guest
features are validated against the host features. In case of error, the
negotiation doesn't proceed, and the "features-ok" file remains zero. In
case of success, the "features-ok" file becomes (uint8_t)1, and the
negotiated features are locked down internally (to which no further
changes are possible until reset).
The initial value is zero. A nonzero value causes the SMI-related
fw_cfg files and fields that are under guest influence to be migrated.
The C-language fields backing the "supported-features" and
"requested-features" files are uint8_t arrays. This is because they carry
guest-side representation (our choice is little endian), while
VMSTATE_UINT64() assumes / implies host-side endianness for any uint64_t
fields. If we migrate a guest between hosts with different endiannesses
(which is possible with TCG), then the host-side value is preserved, and
the host-side representation is translated. This would be visible to the
guest through fw_cfg, unless we used plain byte arrays. So we do.
Peter Xu [Mon, 16 Jan 2017 08:40:04 +0000 (16:40 +0800)]
memory: tune mtree_print_mr() to dump mr type
We were dumping RW bits for each memory region, that might be confusing.
It'll make more sense to dump the memory region type directly rather
than the RW bits since that's how the bits are derived.
Meanwhile, with some slight cleanup in the function.
Pavel Dovgalyuk [Thu, 26 Jan 2017 12:34:18 +0000 (15:34 +0300)]
replay: exception replay fix
This patch fixes replaying the exception when TB cache is full.
It breaks cpu loop execution through setting exception_index
to process such queued work as TB flush.
v8: moved setting of exeption_index to tb_gen_code
Pavel Dovgalyuk [Tue, 24 Jan 2017 07:17:30 +0000 (10:17 +0300)]
replay: don't use rtc clock on loadvm phase
This patch disables the update of the periodic timer of mc146818rtc
in record/replay mode. State of this timer is saved and therefore does
not need to be updated in record/replay mode.
Read of RTC breaks the replay because all rtc reads have to be the same
as in record mode.
Pavel Dovgalyuk [Tue, 24 Jan 2017 07:17:08 +0000 (10:17 +0300)]
replay: improve interrupt handling
This patch improves interrupt handling in record/replay mode.
Now "interrupt" event is saved only when cc->cpu_exec_interrupt returns true.
This patch also adds missing return to cpu_exec_interrupt function.
Pavel Dovgalyuk [Tue, 24 Jan 2017 07:17:02 +0000 (10:17 +0300)]
icount: update instruction counter on apic patching
kvmvapic patches the code when some instructions are executed.
E.g. mov 0xff, 0xfffe0080 is interpreted as push 0xff/call ...
This patching is also followed by some side effects (changing apic
and guest memory state). Therefore deterministic execution should take
this operation into account. This patch decreases icount when original
mov instruction is trying to execute. Therefore patching becomes
deterministic and can be replayed correctly.
Peter Maydell [Fri, 27 Jan 2017 15:20:08 +0000 (15:20 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-01-27' into staging
QAPI/QMP patches for 2017-01-27
# gpg: Signature made Fri 27 Jan 2017 07:24:02 GMT
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-01-27:
qmp: Fix argument name in error message of device-list-properties
qapi: Remove unwanted commas after #optional keyword
build-sys: Minor qapi doc generation target cleanups
Peter Maydell [Fri, 27 Jan 2017 10:14:56 +0000 (10:14 +0000)]
Merge remote-tracking branch 'remotes/famz/tags/for-upstream' into staging
# gpg: Signature made Thu 26 Jan 2017 02:44:47 GMT
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
If a QIOTask has an error set and the calling code uses
qio_task_propagate_error() to steal the reference to
that Error object, the task would not clear its own
reference. This would lead to a double-free when
qio_task_free runs, if the caller had (correctly) freed
the Error object they now owned.
Stefan Hajnoczi [Tue, 24 Jan 2017 09:53:50 +0000 (09:53 +0000)]
aio-posix: honor is_external in AioContext polling
AioHandlers marked ->is_external must be skipped when aio_node_check()
fails. bdrv_drained_begin() needs this to prevent dataplane from
submitting new I/O requests while another thread accesses the device and
relies on it being quiesced.
This patch fixes the following segfault:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
2650 bdrv_io_plug(child->bs);
[Current thread is 1 (Thread 0x7ff5c4bd1c80 (LWP 10917))]
(gdb) bt
#0 0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
#1 0x00005577f6114363 in blk_io_plug (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:1561
#2 0x00005577f5d4091d in virtio_blk_handle_vq (s=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/virtio-blk.c:589
#3 0x00005577f5d4240d in virtio_blk_data_plane_handle_output (vdev=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/dataplane/virtio-blk.c:158
#4 0x00005577f5d88acd in virtio_queue_notify_aio_vq (vq=0x5577f9b3d2a0) at qemu/hw/virtio/virtio.c:1304
#5 0x00005577f5d8aaaf in virtio_queue_host_notifier_aio_poll (opaque=0x5577f9b3d308) at qemu/hw/virtio/virtio.c:2134
#6 0x00005577f60ca077 in run_poll_handlers_once (ctx=0x5577f79ddbb0) at qemu/aio-posix.c:493
#7 0x00005577f60ca268 in try_poll_mode (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:569
#8 0x00005577f60ca331 in aio_poll (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:601
#9 0x00005577f612722a in bdrv_flush (bs=0x5577f7c20970) at qemu/block/io.c:2403
#10 0x00005577f60c1b2d in bdrv_close (bs=0x5577f7c20970) at qemu/block.c:2322
#11 0x00005577f60c20e7 in bdrv_delete (bs=0x5577f7c20970) at qemu/block.c:2465
#12 0x00005577f60c3ecf in bdrv_unref (bs=0x5577f7c20970) at qemu/block.c:3425
#13 0x00005577f60bf951 in bdrv_root_unref_child (child=0x5577f7a2de70) at qemu/block.c:1361
#14 0x00005577f6112162 in blk_remove_bs (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:491
#15 0x00005577f6111b1b in blk_remove_all_bs () at qemu/block/block-backend.c:245
#16 0x00005577f60c1db6 in bdrv_close_all () at qemu/block.c:2382
#17 0x00005577f5e60cca in main (argc=20, argv=0x7ffea6eb8398, envp=0x7ffea6eb8440) at qemu/vl.c:4684
The key thing is that bdrv_close() uses bdrv_drained_begin() and
virtio_queue_host_notifier_aio_poll() must not be called.
Thanks to Fam Zheng <[email protected]> for identifying the root cause of
this crash.
Max Reitz [Tue, 15 Nov 2016 22:57:46 +0000 (23:57 +0100)]
test-hbitmap: Add hbitmap_is_serializable() calls
Add calls to hbitmap_is_serializable() (asserting that it returns true)
where necessary (i.e. before every series of (de-)serialization function
invocations).
Max Reitz [Tue, 15 Nov 2016 22:57:45 +0000 (23:57 +0100)]
hbitmap: Add hbitmap_is_serializable()
Bitmaps with a granularity of 58 or above can be neither serialized nor
deserialized (see the comment in the function added in this series for
an explanation). This patch adds a function so that we can check whether
a bitmap actually can be (de-)serialized at all, thus avoiding failing
the necessary assertion in hbitmap_serialization_granularity().
Peter Maydell [Wed, 25 Jan 2017 17:54:14 +0000 (17:54 +0000)]
Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This pull request fixes a 2.9 regression and a long standing bug that can
cause 9p clients to hang. Other patches are minor enhancements.
# gpg: Signature made Wed 25 Jan 2017 10:12:27 GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "Gregory Kurz (Cimai Technology) <[email protected]>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: fix offset error in v9fs_xattr_read()
9pfs: local: trivial cosmetic fix in pwritev op
9pfs: fix off-by-one error in PDU free list
tests: virtio-9p: improve error reporting
9pfs: add missing coroutine_fn annotations
* remotes/mjt/tags/trivial-patches-fetch: (31 commits)
hw/isa/isa-bus: Set category of the "isabus-bridge" device
usb: Set category and description of the MTP device
gdbstub.c: update old error report statements
gdbstub.c: fix GDB connection segfault caused by empty machines
scsi-disk: add 'fall through' comment to switch VERIFY cases
Drop duplicate display option documentation
hw/display/framebuffer.c: Avoid overflow for framebuffers > 4GB
win32: use glib gpoll if glib >= 2.50
util/mmap-alloc: refactor a little bit for readability
util/mmap-alloc: check parameter before using
vfio: remove a duplicated word in comments
docs: sync pci-ids.txt
disas/cris.c: Fix Coverity warning about unchecked NULL
lm32: milkymist-tmu2: fix another integer overflow
hw/i386/kvmvapic: Remove dead code in patch_hypercalls()
doc/usb2: fix typo
qga: fix erroneous argument to strerror
block: remove dead check
pci-assign: avoid pointless stat
qemu-img: remove dead check
...
Greg Kurz [Tue, 24 Jan 2017 23:23:49 +0000 (00:23 +0100)]
9pfs: fix offset error in v9fs_xattr_read()
The current code tries to copy `read_count' bytes starting at offset
`offset' from a `read_count`-sized iovec. This causes v9fs_pack() to
fail with ENOBUFS.
Since the PDU iovec is already partially filled with `offset' bytes,
let's skip them when creating `qiov_full' and have v9fs_pack() to
copy the whole of it. Moreover, this is consistent with the other
places where v9fs_init_qiov_from_pdu() is called.
This fixes commit "bcb8998fac16 9pfs: call v9fs_init_qiov_from_pdu
before v9fs_pack".
Greg Kurz [Fri, 13 Jan 2017 17:18:20 +0000 (18:18 +0100)]
9pfs: fix off-by-one error in PDU free list
The server can handle MAX_REQ - 1 PDUs at a time and the virtio-9p
device has a MAX_REQ sized virtqueue. If the client manages to fill
up the virtqueue, pdu_alloc() will fail and the request won't be
processed without any notice to the client (it actually causes the
linux 9p client to hang).
This has been there since the beginning (commit 9f10751365b2 "virtio-9p:
Add a virtio 9p device to qemu"), but it needs an agressive workload to
run in the guest to show up.
We actually allocate MAX_REQ PDUs and I see no reason not to link them
all into the free list, so let's fix the init loop.
Marek Vasut [Wed, 18 Jan 2017 22:01:46 +0000 (23:01 +0100)]
nios2: Add support for Nios-II R1
Add remaining bits of the Altera NiosII R1 support into qemu, which
is documentation, MAINTAINERS file entry, configure bits, arch_init
and configuration files for both linux-user (userland binaries) and
softmmu (hardware emulation).
Marek Vasut [Wed, 18 Jan 2017 22:01:45 +0000 (23:01 +0100)]
nios2: Add Altera 10M50 GHRD emulation
Add the Altera 10M50 Nios2 GHRD model. This allows emulating the
10M50 development kit with the Nios2 GHRD loaded in the FPGA. It
is possible to boot Linux kernel and run userspace, thus far only
from initrd as storage support is not yet implemented.
Marek Vasut [Wed, 18 Jan 2017 22:01:40 +0000 (23:01 +0100)]
nios2: Add disas entries
Add nios2 disassembler support. This patch is composed from binutils files
from commit "Opcodes and assembler support for Nios II R2". The files from
binutils used in this patch are:
Checkpatch says total: 114 errors, 0 warnings, 3609 lines checked , which
is caused by a different coding style in those files. These warnings and
errors are not addressed To let these files be easily synchronized between
binutils and qemu.
Chris Wulff [Wed, 18 Jan 2017 22:01:41 +0000 (23:01 +0100)]
nios2: Add architecture emulation support
Add support for emulating Altera NiosII R1 architecture into qemu.
This patch is based on previous work by Chris Wulff from 2012 and
updated to latest mainline QEMU.
Thomas Huth [Fri, 20 Jan 2017 13:11:04 +0000 (14:11 +0100)]
usb: Set category and description of the MTP device
It's a storage device, so let's classify it accordingly. And
while we're at it, also add a short description for people who
do not know what MTP means.
Ziyue Yang [Wed, 18 Jan 2017 08:02:41 +0000 (16:02 +0800)]
gdbstub.c: fix GDB connection segfault caused by empty machines
This patch is to fix the segmentation fault caused by attaching
GDB to a QEMU instance initialized with "-M none" option.
The bug can be reproduced by
> ./qemu-system-x86_64 -M none -nographic -S -s
and attach a GDB to it by
> gdb -ex 'target remote :1234
The segmentation fault was originally caused by trying to read
the information about CPU when communicating with GDB. However,
it's impossible for any control flow to exist on an empty machine,
nor can CPU's be hot plugged to an empty machine later by QOM
commands. So I think simply disabling GDB connections on empty
machines makes sense.
Peter Maydell [Mon, 16 Jan 2017 18:46:00 +0000 (18:46 +0000)]
scsi-disk: add 'fall through' comment to switch VERIFY cases
Commit 166dbda7e131 added some extra cases to a switch() such
that the existing code is intended to fall through the new
case statements. It's clear from the commit that this is
intentional, but less clear to subsequent readers of the
code, and not clear at all to static analysis tools like
Coverity. Add a /* fall through */ comment to indicate the
intent. (Fixes CID 1368287.)
Peter Maydell [Mon, 9 Jan 2017 16:45:09 +0000 (16:45 +0000)]
hw/display/framebuffer.c: Avoid overflow for framebuffers > 4GB
Coverity points out that calculating src_len by multiplying
src_width by rows could overflow. This can only happen in
the implausible case of a framebuffer larger than 4GB, but
we may as well fix it, placating Coverity. (CID1005515)
A fix has been committed in upstream glib commit 210a9796f78eb90f76f1bd6a304e9fea05e97617.
(See also related bug https://bugzilla.gnome.org/show_bug.cgi?id=764415)
It is desirable to use the glib version instead of qemu copy, since it
provides more debugging facilities (G_MAIN_POLL_DEBUG etc), and
hopefully has a better maintainance. Hopefully, we can drop the qemu
copy in a few years.
Peter Maydell [Mon, 9 Jan 2017 19:05:59 +0000 (19:05 +0000)]
disas/cris.c: Fix Coverity warning about unchecked NULL
Coverity (CID 1005689) warns that we don't check that
spec_reg_info() returned non-NULL before dereferencing.
Add the check, though as the comment notes this is
a can't-really-happen case because the earlier constraint
matching should have ruled out the "unknown reg" case.
Peter Maydell [Mon, 9 Jan 2017 17:05:21 +0000 (17:05 +0000)]
hw/i386/kvmvapic: Remove dead code in patch_hypercalls()
The patch_hypercalls() function sets up a 'patches'
variable and checks it at the end of the function, but
never modifies it in the middle. Remove this dead code,
which seems to have been present since the function was
added in commit e5ad936b0fd7 in 2012.
Peter Maydell [Tue, 24 Jan 2017 19:25:19 +0000 (19:25 +0000)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170124b' into staging
Migration
1 My maintainer change
2 Jianjun's qtailq
3 Ashijeet's only-migratable
4 Zhanghailiang's re-active images
5 Pankaj's change name of migration thread
6 My PCI migration merge
7 Juan's debug to tracing
8 My tracing on save
# gpg: Signature made Tue 24 Jan 2017 18:39:35 GMT
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170124b:
migration/tracing: Add tracing on save
migration: transform remaining DPRINTF into trace_
PCI/migration merge vmstate_pci_device and vmstate_pcie_device
migration: Change name of live migration thread
migration: re-active images while migration been canceled after inactive them
migration: Fail migration blocker for --only-migratable
migration: disallow migrate_add_blocker during migration
migration: Allow "device add" options to only add migratable devices
migration: Add a new option to enable only-migratable
block/vvfat: Remove the undesirable comment
migration: add error_report
tests/migration: Add test for QTAILQ migration
migration: migrate QTAILQ
migration: extend VMStateInfo
MAINTAINERS: Add myself as a migration submaintainer
Add some tracing to vmstate_subsection_save and vmstate_save_state
to help in debugging when you're not sure if a conditional piece
of data is being saved.
In vmstate_subsection_save I renamed the inner vmsd to avoid the aliasing
and be able to print both names.
PCI/migration merge vmstate_pci_device and vmstate_pcie_device
The vmstate_pci_device and vmstate_pcie_devices differ
just in the size of one buffer; combine the two using a _TEST
macro.
I think this is safe as long as everywhere which currently
uses either of these two uses the right type.
One thing that concerns me is that some places use pci_device_load/save
which does some irq mangling, but others just use the VMSTATE_PCI_DEVICE
macro - how are they getting the same irq mangling?
This passes a smoke test migrate of:
./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -m 1024
./littlefed20.img -device e1000e -device virtio-net -device
e1000 -device virtio-rng -device megasas -device megasas-gen2 -device
ioh3420 -device nec-usb-xhci
Pankaj Gupta [Mon, 23 Jan 2017 13:42:56 +0000 (19:12 +0530)]
migration: Change name of live migration thread
Change the name of live migration thread from 'migration'
to 'live_migration' to identify it clearly. 'migration'
is a generic word and kernel also has tasks for process
migration with the name 'migration/cpu#'.
zhanghailiang [Tue, 24 Jan 2017 07:59:52 +0000 (15:59 +0800)]
migration: re-active images while migration been canceled after inactive them
commit fe904ea8242cbae2d7e69c052c754b8f5f1ba1d6 fixed a case
which migration aborted QEMU because it didn't regain the control
of images while some errors happened.
Actually, there are another two cases can trigger the same error reports:
" bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed",
Case 1, codes path:
migration_thread()
migration_completion()
bdrv_inactivate_all() ----------------> inactivate images
qemu_savevm_state_complete_precopy()
socket_writev_buffer() --------> error because destination fails
qemu_fflush() ----------------> set error on migration stream
-> qmp_migrate_cancel() ----------------> user cancelled migration concurrently
-> migrate_set_state() ------------------> set migrate CANCELLIN
migration_completion() -----------------> go on to fail_invalidate
if (s->state == MIGRATION_STATUS_ACTIVE) -> Jump this branch
As we can see from above, qmp_migrate_cancel can slip in whenever
migration_thread does not hold the global lock. If this happens after
bdrv_inactive_all() been called, the above error reports will appear.
To prevent this, we can call bdrv_invalidate_cache_all() in qmp_migrate_cancel()
directly if we find images become inactive.
Besides, bdrv_invalidate_cache_all() in migration_completion() doesn't have the
protection of big lock, fix it by add the missing qemu_mutex_lock_iothread();