Eric Blake [Sun, 12 Nov 2017 01:39:36 +0000 (19:39 -0600)]
nbd: Don't crash when server reports NBD_CMD_READ failure
If a server fails a read, for example with EIO, but the connection
is still live, then we would crash trying to print a non-existent
error message in nbd_client_co_preadv(). For consistency, also
change the error printout in nbd_read_reply_entry(), although that
instance does not crash. Bug introduced in commit f140e300.
* remotes/kraxel/tags/ui-20171117-pull-request:
sdl2: Fix broken display updating after the window is hidden
sdl2: Do not leave grab when fullscreen
sdl2: Fix dead keyboard after fullsceen
sdl2: Use the same pointer show/hide logic for absolute and relative mode
sdl2: Do not quit the emulator when an auxilliary window is closed
Peter Maydell [Thu, 16 Nov 2017 19:06:07 +0000 (19:06 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: fixes for rc1
A bunch of fixes all over the place.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 16 Nov 2017 16:37:21 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
tests/bios-tables-test: Fix endianess problems when passing data to iasl
build-sys: restrict vmcoreinfo to fw_cfg+dma capable targets
vmcoreinfo: put it in the 'misc' device category
NUMA: Enable adding NUMA node implicitly
tests/acpi-test-data: update _CRS in DSDT
hw/pcie-pci-bridge: restrict to X86 and ARM
hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
pci: Initialize pci_dev->name before use
fix: unrealize virtio device if we fail to hotplug it
Thomas Huth [Thu, 16 Nov 2017 12:17:02 +0000 (13:17 +0100)]
tests/bios-tables-test: Fix endianess problems when passing data to iasl
The bios-tables-test was writing out the files that we pass to iasl
with the wrong endianness in the header when running on a big-endian
host. So instead of storing mixed endian information in our structures,
let's keep everything in little endian and byte-swap it only when we
need a value in the code.
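As a rough illustration of "keep everything little-endian, swap only on use" (a sketch, not the code from this patch; the struct and function names below are hypothetical), the stored bytes stay in file order and are decoded only where a numeric value is needed:

```c
#include <stdint.h>

/* Hypothetical table header: fields always hold little-endian bytes,
 * exactly as they would appear in the file handed to iasl. */
typedef struct {
    uint8_t signature[4];
    uint8_t length_le[4];   /* little-endian on every host */
} ExampleTableHeader;

/* Decode a little-endian 32-bit field into a host-order value.
 * Assembling the value byte by byte gives the same result on
 * big-endian and little-endian hosts. */
static uint32_t example_le32_to_cpu(const uint8_t b[4])
{
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Only code that needs the numeric value performs the conversion;
 * the struct itself can be written out to a file unchanged. */
static uint32_t example_table_length(const ExampleTableHeader *h)
{
    return example_le32_to_cpu(h->length_le);
}
```

Because the in-memory bytes are never swapped in place, dumping the structure to disk yields the same file on any host.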
build-sys: restrict vmcoreinfo to fw_cfg+dma capable targets
vmcoreinfo is built for all targets. However, it requires fw_cfg with
DMA operations support (write operation). Restrict vmcoreinfo exposure
to architectures that support FW_CFG_DMA, which at the moment are only
arm-virt and x86.
Dou Liyang [Tue, 14 Nov 2017 02:34:01 +0000 (10:34 +0800)]
NUMA: Enable adding NUMA node implicitly
Linux and Windows need the ACPI SRAT table to make memory hotplug work properly,
but QEMU currently doesn't create an SRAT table if no NUMA options are present
on the CLI.
This breaks both Linux and Windows guests under certain conditions:
* Windows: won't enable memory hotplug at all without an SRAT table
* Linux: if QEMU is started with all initial memory below 4GB and no SRAT table
present, the guest kernel will use nommu DMA ops, which breaks 32-bit hw drivers
when memory is hotplugged and the guest tries to use it with those drivers.
Fix the above issues by automatically creating a NUMA node when QEMU is started with
memory hotplug enabled but without '-numa' options on the CLI.
(PS: the NUMA node is auto-created only for new machine types so as not to break migration.)
This provides an SRAT table to guests without explicit -numa options on the CLI
and allows:
* Windows: to enable memory hotplug
* Linux: to switch to SWIOTLB DMA ops, which bounce DMA transfers to 32-bit allocated
buffers that legacy drivers/hw can handle.
Marcel Apfelbaum [Sat, 11 Nov 2017 15:25:00 +0000 (17:25 +0200)]
hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
Currently there is no MMIO range over 4G
reserved for PCI hotplug. Since the 32bit PCI hole
depends on the number of cold-plugged PCI devices
and other factors, it may well be too small
to hotplug PCI devices with large BARs.
Fix it by reserving 2G for the i440FX chipset
in order to comply with older Win32 guest OSes
and 32G for the Q35 chipset.
Even if the new defaults of pci-hole64-size will appear in
"info qtree" also for older machines, the property was
not implemented so no changes will be visible to guests.
Note this is a regression since prev QEMU versions had
some range reserved for 64bit PCI hotplug.
linzhecheng [Tue, 31 Oct 2017 08:03:03 +0000 (16:03 +0800)]
fix: unrealize virtio device if we fail to hotplug it
If we fail to hotplug a virtio-blk device and then suspend
or shut down the VM, QEMU is likely to crash.
Reproduction steps:
1. Run VM named vm001
2. Create a virtio-blk.xml which contains wrong configurations:
<disk device="lun" rawio="yes" type="block">
<driver cache="none" io="native" name="qemu" type="raw" />
<source dev="/dev/mapper/11-dm" />
<target bus="virtio" dev="vdx" />
</disk>
3. Run command : virsh attach-device vm001 virtio-blk.xml
error: Failed to attach device from blk-scsi.xml
error: internal error: unable to execute QEMU command 'device_add': Please set scsi=off for virtio-blk devices in order to use virtio 1.0
This means that hotplugging the virtio-blk device failed.
4. Suspending or shutting down the VM then crashes QEMU.
The problem happens in virtio_vmstate_change, which is called by
vm_state_notify:
vdev’s parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash.
virtio_vmstate_change is added to the list vm_change_state_head in virtio_blk_device_realize (via virtio_init),
but after hotplugging virtio-blk fails, virtio_vmstate_change is not removed from vm_change_state_head.
Adding an unrealize function for the virtio-blk device solves this problem.
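The failure pattern can be modelled in a few lines of C. This is a toy sketch only: the handler table and every example_* name are hypothetical stand-ins for vm_change_state_head and the virtio code, not QEMU's implementation. The point is that a registration made early in realize must be undone on the error path:

```c
#include <stddef.h>

#define EXAMPLE_MAX_HANDLERS 8

typedef struct {
    int has_parent_bus;       /* 0 models parent_bus == NULL */
    int handler_registered;
} ExampleDev;

/* Toy stand-in for the list of VM state change handlers. */
static ExampleDev *example_handlers[EXAMPLE_MAX_HANDLERS];

static void example_add_handler(ExampleDev *dev)
{
    for (size_t i = 0; i < EXAMPLE_MAX_HANDLERS; i++) {
        if (!example_handlers[i]) {
            example_handlers[i] = dev;
            dev->handler_registered = 1;
            return;
        }
    }
}

static void example_del_handler(ExampleDev *dev)
{
    for (size_t i = 0; i < EXAMPLE_MAX_HANDLERS; i++) {
        if (example_handlers[i] == dev) {
            example_handlers[i] = NULL;
        }
    }
    dev->handler_registered = 0;
}

/* Number of registered handlers whose device has no parent bus;
 * each of these would crash when the VM state changes. */
static int example_count_dangling(void)
{
    int n = 0;
    for (size_t i = 0; i < EXAMPLE_MAX_HANDLERS; i++) {
        if (example_handlers[i] && !example_handlers[i]->has_parent_bus) {
            n++;
        }
    }
    return n;
}

/* Hotplug attempt: the handler is registered first, then a later
 * step may fail. The error path performs the "unrealize" cleanup. */
static int example_hotplug(ExampleDev *dev, int later_step_fails)
{
    example_add_handler(dev);
    if (later_step_fails) {
        example_del_handler(dev);   /* without this, the handler dangles */
        return -1;
    }
    dev->has_parent_bus = 1;
    return 0;
}
```

Dropping the example_del_handler() call from the error path leaves one dangling entry behind, which corresponds to the crash seen on suspend or shutdown.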
Stefan Hajnoczi [Thu, 16 Nov 2017 11:21:50 +0000 (11:21 +0000)]
throttle-groups: forget timer and schedule next TGM on detach
tg->any_timer_armed[] must be cleared when detaching pending timers from
the AioContext. Failure to do so leads to hung I/O because it looks
like there are still timers pending when in fact they have been removed.
Other ThrottleGroupMembers might have requests pending too so it's
necessary to schedule the next TGM so it can set a timer.
This patch fixes hung I/O when QEMU is launched with drives that are in
the same throttling group:
Peter Maydell [Thu, 16 Nov 2017 12:45:14 +0000 (12:45 +0000)]
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20171115' into staging
User-mode memory helper fixes
# gpg: Signature made Wed 15 Nov 2017 12:32:33 GMT
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <[email protected]>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20171115:
target/arm: Fix GETPC usage in do_paired_cmpxchg64_l/be
target/arm: Use helper_retaddr in stxp helpers
tcg: Record code_gen_buffer address for user-only memory helpers
Peter Maydell [Thu, 16 Nov 2017 11:34:24 +0000 (11:34 +0000)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2017-11-15-1' into staging
Merge tpm 2017/11/15 v1
# gpg: Signature made Wed 15 Nov 2017 11:51:47 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-11-15-1:
tpm_tis: Return 0 for every register in case of failure mode
tpm_tis: Return TPM_VERSION_UNSPEC in case of BE failure
tpm-emulator: protect concurrent ctrl_chr access
specs: Extend TPM spec with TPM emulator description
This is a partial revert of d3f3a0f453ea590be529079ae214c200bb5ecc1a,
which in turn is a workaround for a SDL bug. The bug is fixed in 2.0.6,
see https://bugzilla.libsdl.org/show_bug.cgi?id=3410
Paolo Bonzini [Wed, 15 Nov 2017 14:11:03 +0000 (15:11 +0100)]
exec: Do not resolve subpage in mru_section
This fixes a crash caused by picking the wrong memory region in
address_space_lookup_region seen with client code accessing a device
model that uses alias memory regions. The expensive part of
address_space_lookup_region anyway is phys_page_find; performance-wise
it is okay to repeat the subsequent subpage lookup.
Stefan Berger [Sat, 11 Nov 2017 03:33:14 +0000 (22:33 -0500)]
tpm_tis: Return 0 for every register in case of failure mode
Rather than returning ~0, return 0 for every register in case of failure
mode. The '0' is better to indicate that there's no device there. It avoids
SeaBIOS detecting a device and getting stuck on it trying to read and write
its registers.
Stefan Berger [Sat, 11 Nov 2017 03:07:35 +0000 (22:07 -0500)]
tpm_tis: Return TPM_VERSION_UNSPEC in case of BE failure
In case the backend has a failure, such as the tpm_emulator's CMD_INIT
failing, the TIS goes into failure mode and does not respond to reads
or writes to MMIO registers. In this case we need to prevent the ACPI
table from being added and the straight-forward way is to indicate that
there's no known TPM version being used.
The control chardev is being used from the data thread to set the
locality of the next request. Although the chr has a write mutex, we
may potentially read the reply from another thread request.
Add a mutex to protect from concurrent control commands.
We use raw memory primitives along the !parallel_cpus paths in order to
simplify the endianness handling. Because of that, we did not benefit
from the generic changes to cpu_ldst_user_only_template.h.
The simplest fix is to manipulate helper_retaddr here.
tcg: Record code_gen_buffer address for user-only memory helpers
When we handle a signal from a fault within a user-only memory helper,
we cannot cpu_restore_state with the PC found within the signal frame.
Use a TLS variable, helper_retaddr, to record the unwind start point
to find the faulting guest insn.
Peter Maydell [Tue, 14 Nov 2017 17:35:41 +0000 (17:35 +0000)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-11-14' into staging
Block patches for 2.11.0-rc1
# gpg: Signature made Tue 14 Nov 2017 17:22:17 GMT
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2017-11-14:
qemu-iotests: update unsupported image formats in 194
block/parallels: add migration blocker
block/parallels: Do not update header or truncate image when INMIGRATE
block/vhdx.c: Don't blindly update the header
iotests: 077: Filter out 'resume' lines
block/snapshot: dirty all dirty bitmaps on snapshot-switch
qcow2: Check that corrupted images can be repaired in iotest 060
iotests: Use new-style NBD connections
iotests: Make 136 less flaky
iotests: Make 083 less flaky
iotests: Make 055 less flaky
iotests: Add missing 'blkdebug::' in 040
iotests: Make 030 less flaky
qcow2: Assert that the crypto header does not overlap other metadata
qcow2: Add iotest for an empty refcount table
qcow2: Add iotest for an image with header.refcount_table_offset == 0
qcow2: Don't open images with header.refcount_table_clusters == 0
qcow2: Prevent allocating compressed clusters at offset 0
qcow2: Prevent allocating L2 tables at offset 0
qcow2: Prevent allocating refcount blocks at offset 0
Jeff Cody [Tue, 7 Nov 2017 13:10:35 +0000 (08:10 -0500)]
block/parallels: add migration blocker
Migration does not work for parallels, and has been broken for a while
(see patch 'block/parallels: Do not update header or truncate image when
INMIGRATE'). The bdrv_invalidate_cache() method needs to be added for
migration to be supported. Until this is done, prohibit migration.
Jeff Cody [Tue, 7 Nov 2017 13:10:34 +0000 (08:10 -0500)]
block/parallels: Do not update header or truncate image when INMIGRATE
If we write or modify the image file while the QEMU run state is
INMIGRATE, then the BDRV_O_INACTIVE BDS flag is set. This will cause
an assert, since the image is marked inactive. Make sure we obey this
flag.
Jeff Cody [Tue, 7 Nov 2017 13:10:33 +0000 (08:10 -0500)]
block/vhdx.c: Don't blindly update the header
The VHDX specification requires that before user data modification of
the vhdx image, the VHDX header file and data GUIDs need to be updated.
In vhdx_open(), if the image is set to RDWR, we go ahead and update the
header.
However, just because the image is set to RDWR does not mean we can go
ahead and write at this point - specifically, if the QEMU run state is
INMIGRATE, the underlying file BS may be set to inactive via the BDS
open flag of BDRV_O_INACTIVE. Attempting to write under this condition
will cause an assert in bdrv_co_pwritev().
We can alternatively latch the first time the image is written. And lo
and behold, we do just that, via vhdx_user_visible_write() in
vhdx_co_writev(). This means the call to vhdx_update_headers() in
vhdx_open() is likely just vestigial, and can be removed.
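A minimal sketch of the "latch on first write" idea follows. The struct and function names are hypothetical and the header update is reduced to a counter; this is not the vhdx driver code. Opening never writes, and the header is updated lazily exactly once, when the first user-visible write is allowed to proceed:

```c
#include <stdbool.h>

typedef struct {
    bool first_visible_write;   /* true until the first user data write */
    bool inactive;              /* models BDRV_O_INACTIVE being set */
    int header_updates;         /* how many times the header was rewritten */
} ExampleImage;

static void example_open(ExampleImage *img, bool inactive)
{
    img->first_visible_write = true;
    img->inactive = inactive;
    img->header_updates = 0;    /* note: no header write at open time */
}

/* Returns 0 on success, -1 if the image may not be written yet. */
static int example_write(ExampleImage *img)
{
    if (img->inactive) {
        return -1;              /* e.g. run state is still INMIGRATE */
    }
    if (img->first_visible_write) {
        img->first_visible_write = false;
        img->header_updates++;  /* update the header GUIDs once, lazily */
    }
    /* ... the user data would be written here ... */
    return 0;
}
```

With this shape, an image opened read-write while inactive is never touched until a write is actually permitted.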
Fam Zheng [Mon, 13 Nov 2017 15:00:26 +0000 (23:00 +0800)]
iotests: 077: Filter out 'resume' lines
In the "Overlapping multiple requests" cases, the 3rd request (break
point B) doesn't wait for the 2nd, and once resumed the I/O will just
continue. This is because the 2nd is already waiting for the 1st, and
in wait_serialising_requests() there is:
/* If the request is already (indirectly) waiting for us, or
* will wait for us as soon as it wakes up, then just go on
* (instead of producing a deadlock in the former case). */
if (!req->waiting_for) {
/* actually break */
...
}
Consequently, the following "sleep 100; resume A" command races with the
completion of that request, and sometimes results in an unexpected
order of output:
> @@ -56,9 +56,9 @@
> wrote XXX/XXX bytes at offset XXX
> XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> blkdebug: Resuming request 'B'
> +blkdebug: Resuming request 'A'
> wrote XXX/XXX bytes at offset XXX
> XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -blkdebug: Resuming request 'A'
> wrote XXX/XXX bytes at offset XXX
> XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> wrote XXX/XXX bytes at offset XXX
Filter out the "Resuming request" lines to make the output
deterministic.
block/snapshot: dirty all dirty bitmaps on snapshot-switch
A snapshot switch changes the active state of the disk, so it must be
reflected in the dirty bitmaps. Otherwise the next incremental backup
using these bitmaps will be invalid.
Alberto Garcia [Wed, 8 Nov 2017 12:13:06 +0000 (14:13 +0200)]
qcow2: Check that corrupted images can be repaired in iotest 060
We just fixed a few bugs that caused QEMU to crash when trying to
write to corrupted qcow2 images, and iotest 060 was expanded to test
all those scenarios.
In almost all cases the corrupted images can be repaired using
qemu-img, so this patch verifies that.
Eric Blake [Thu, 9 Nov 2017 22:12:16 +0000 (16:12 -0600)]
iotests: Use new-style NBD connections
Old-style NBD is deprecated upstream (it is documented, but no
longer implemented in the reference implementation), and it is
severely limited (it cannot support structured replies, which
means it cannot support efficient handling of zeroes), when
compared to new-style NBD. We are better off having our iotests
favor new-style everywhere (although some explicit tests,
particularly 83, still cover old-style for back-compat reasons);
this is as simple as supplying the empty string as the default
export name, as it does not change the URI needed to connect a
client to the server. This also gives us more coverage of the
just-added structured reply code, when not overriding $QEMU_NBD
to intentionally point to an older server.
Max Reitz [Thu, 9 Nov 2017 20:30:25 +0000 (21:30 +0100)]
iotests: Make 136 less flaky
136 executes some AIO requests without a final aio_flush; then it
advances the virtual clock and thus expects the last access time of the
device to be less than the current time when queried (i.e. idle_time_ns
to be greater than 0). However, without the aio_flush, some requests
may be settled after the clock_step invocation. In that case,
idle_time_ns would be 0 and the test fails.
Fix this by adding an aio_flush if any AIO request other than some other
aio_flush has been executed.
Max Reitz [Thu, 9 Nov 2017 20:30:24 +0000 (21:30 +0100)]
iotests: Make 083 less flaky
083 has (at least) two issues:
1. By launching the nbd-fault-injector in background, it may not be
scheduled until the first grep on its output file is executed.
However, until then, that file may not have been created yet -- so it
either does not exist yet (thus making the grep emit an error), or it
does exist but contains stale data (thus making the rest of the test
case connect to a wrong address).
Fix this by explicitly overwriting the output file before executing
nbd-fault-injector.
2. The nbd-fault-injector prints things other than "Listening on...".
It also prints a "Closing connection" message from time to time. We
currently invoke sed on the whole file in the hope of it only
containing the "Listening on..." line yet. That hope is sometimes
shattered by the brutal reality of race conditions, so make the sed
script more robust.
Max Reitz [Thu, 9 Nov 2017 20:30:23 +0000 (21:30 +0100)]
iotests: Make 055 less flaky
First of all, test 055 does a valiant job of invoking pause_drive()
sometimes, but that is worth nothing without blkdebug. So the first
thing to do is to sprinkle a couple of "blkdebug::" in there -- with the
exception of the transaction tests, because the blkdebug break points
make the transaction QMP command hang (which is bad). In that case, we
can get away with throttling the block job so that it is effectively
paused.
Then, 055 usually does not pause the drive before starting a block job
that should be cancelled. This means that the backup job might be
completed already before block-job-cancel is invoked; thus making the
test either fail (currently) or moot if cancel_and_wait() ignored this
condition. Fix this by pausing the drive before starting the job.
Max Reitz [Thu, 9 Nov 2017 20:30:21 +0000 (21:30 +0100)]
iotests: Make 030 less flaky
This patch fixes two race conditions in 030:
1. The first is in TestENOSPC.test_enospc(). After resuming the job,
querying it to confirm it is no longer paused may fail because in the
meantime it might have completed already. The same was fixed in
TestEIO.test_ignore() already (in commit 2c3b44da07d341557a8203cc509ea07fe3605e11).
2. The second is in TestSetSpeed.test_set_speed_invalid(): Here, a
stream job is started on a drive without any break points, with a
block-job-set-speed invoked subsequently. However, without any break
points, the job might have completed in the meantime (on tmpfs at
least); or it might complete before cancel_and_wait() which expects
the job to still exist. This can be fixed like everywhere else by
pausing the drive (installing break points) before starting the job
and letting cancel_and_wait() resume it.
Alberto Garcia [Fri, 3 Nov 2017 14:18:56 +0000 (16:18 +0200)]
qcow2: Assert that the crypto header does not overlap other metadata
The crypto header is initialized only when QEMU is creating a new
image, so there's no chance of this happening on a corrupted image.
If QEMU is really trying to allocate the header overlapping other
existing metadata sections then this is a serious bug in QEMU itself
so let's add an assertion.
Alberto Garcia [Fri, 3 Nov 2017 14:18:53 +0000 (16:18 +0200)]
qcow2: Don't open images with header.refcount_table_clusters == 0
qcow2_do_open() is checking that header.refcount_table_clusters is not
too large, but it doesn't check that it's greater than zero. Apart
from the fact that an image like that is obviously corrupted, trying
to use it crashes QEMU since we end up with a null s->refcount_table
after qcow2_refcount_init().
These images can however be repaired, so allow opening them if the
BDRV_O_CHECK flag is set.
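The shape of such a check might look like the sketch below. The flag value, the upper bound and the error code are assumptions chosen for illustration, and example_validate_reftable() is a hypothetical name; none of them are taken from qcow2_do_open():

```c
#include <errno.h>
#include <stdint.h>

#define EXAMPLE_O_CHECK 0x1                   /* hypothetical stand-in for BDRV_O_CHECK */
#define EXAMPLE_MAX_REFTABLE_CLUSTERS 1024u   /* hypothetical upper bound */

/* Returns 0 if the header field is acceptable, a negative errno otherwise. */
static int example_validate_reftable(uint32_t refcount_table_clusters,
                                     int open_flags)
{
    if (refcount_table_clusters > EXAMPLE_MAX_REFTABLE_CLUSTERS) {
        return -EINVAL;         /* the pre-existing "too large" check */
    }
    if (refcount_table_clusters == 0 && !(open_flags & EXAMPLE_O_CHECK)) {
        return -EINVAL;         /* new: an empty table is only OK for repair */
    }
    return 0;
}
```

Opening for repair passes the check flag and is let through; any other open of an image with a zero-sized refcount table is rejected up front.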
Alberto Garcia [Fri, 3 Nov 2017 14:18:52 +0000 (16:18 +0200)]
qcow2: Prevent allocating compressed clusters at offset 0
If the refcount data is corrupted then we can end up trying to
allocate a new compressed cluster at offset 0 in the image, triggering
an assertion in qcow2_alloc_bytes() that would crash QEMU:
qcow2_alloc_bytes: Assertion `offset' failed.
This patch adds an explicit check for this scenario and a new test
case.
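For illustration, an explicit check of this kind turns the failed assertion into an ordinary error return. The function name and the choice of -EIO are assumptions made for this sketch, not details taken from the patch:

```c
#include <errno.h>
#include <stdint.h>

/* Validate an offset handed back by an allocator that may be working
 * from corrupted refcount data. Negative values are passed through
 * unchanged as errors. */
static int64_t example_checked_alloc(int64_t allocator_result)
{
    if (allocator_result < 0) {
        return allocator_result;
    }
    if (allocator_result == 0) {
        /* Offset 0 holds the image header, so a valid allocation
         * can never start there: report corruption, do not assert. */
        return -EIO;
    }
    return allocator_result;
}
```

The caller then sees a failed write on a corrupted image, and the process keeps running.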
Alberto Garcia [Fri, 3 Nov 2017 14:18:51 +0000 (16:18 +0200)]
qcow2: Prevent allocating L2 tables at offset 0
If the refcount data is corrupted then we can end up trying to
allocate a new L2 table at offset 0 in the image, triggering an
assertion in the qcow2 cache that would crash QEMU:
Alberto Garcia [Fri, 3 Nov 2017 14:18:50 +0000 (16:18 +0200)]
qcow2: Prevent allocating refcount blocks at offset 0
Each entry in the qcow2 cache contains an offset field indicating the
location of the data in the qcow2 image. If the offset is 0 then it
means that the entry contains no data and is available to be used when
needed.
Because of that it is not possible to store in the cache the first
cluster of the qcow2 image (offset = 0). This is not a problem because
that cluster always contains the qcow2 header and we're not using this
cache for that.
However, if the qcow2 image is corrupted it can happen that we try to
allocate a new refcount block at offset 0, triggering this assertion
and crashing QEMU:
Peter Maydell [Tue, 14 Nov 2017 16:11:19 +0000 (16:11 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
The following disk I/O throttling fixes solve recent bugs.
# gpg: Signature made Tue 14 Nov 2017 10:37:12 GMT
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
qemu-iotests: Test I/O limits with removable media
block: Leave valid throttle timers when removing a BDS from a backend
block: Check for inserted BlockDriverState in blk_io_limits_disable()
throttle-groups: drain before detaching ThrottleState
block: all I/O should be completed before removing throttle timers.
Peter Maydell [Tue, 14 Nov 2017 13:53:00 +0000 (13:53 +0000)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 14 Nov 2017 02:05:34 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net/socket: fix coverity issue
Add new PCI ID for i82559a
Fix eepro100 simple transmission mode
colo: Consolidate the duplicate code chunk into a routine
colo-compare: Fix comments
colo-compare: compare the packet in a specified Connection
colo-compare: Insert packet into the suitable position of packet queue directly
net: fix check for number of parameters to -netdev socket
Pavel Dovgalyuk [Tue, 14 Nov 2017 08:18:18 +0000 (11:18 +0300)]
cpu-exec: avoid cpu_exec_nocache infinite loop with record/replay
This patch ensures that icount_decr.u32.high is clear before calling
cpu_exec_nocache when an exception is pending. This is needed because the
exception is caused by the first instruction in the block, and that
instruction cannot be executed without resetting the flag.
There are two parts in the fix. First, clear icount_decr.u32.high in
cpu_handle_interrupt (just before processing the "dependent" request,
stored in cpu->interrupt_request or cpu->exit_request) rather than
cpu_loop_exec_tb; this ensures that cpu_handle_exception is always
reached with zero icount_decr.u32.high unless another interrupt has
happened in the meanwhile.
Second, try to cause the exception at the beginning of
cpu_handle_exception, and exit immediately if the TB cannot
execute. With this change, interrupts are processed and
cpu_exec_nocache can make progress.
Commit 5c0919d0 [1] introduced the virtqueue_size parameter
for the common virtio-scsi path without updating the vhost-user-scsi
code. vhost-user-scsi devices right now report size 0 for each vq.
This patch introduces the virtqueue_size param to vhost-user-scsi,
which can now be set by the user. Most importantly, it
now has a default value of 128 (same as QEMU's virtio-scsi).
[1] 5c0919d0 ("virtio-scsi: Add virtqueue_size parameter
allowing virtqueue size to be set.")
Using obscure black magic introduced in eaa2ddbb767 :)
In an out-of-tree directory, running "../configure && make help" will generate
some required files (.mak), then clone some submodules, compile at least
the capstone submodule, generate QMP and Trace files, and finally display
the help.
On an outdated computer (Sun Blade workstation), running "make help" took
more than 5h :) With this patch it took roughly 37min.
Remove the last few DPRINTFs from hw/intc/ioapic.c and turn
them into tracing. In one case it's a new trace, in the others
it's just adding a parameter to the existing traces.
Peter Maydell [Tue, 14 Nov 2017 10:26:08 +0000 (10:26 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20171113' into staging
target-arm queue:
* translate-a64.c: silence gcc5 warning
* highbank: validate register offset before access
* MAINTAINERS: Add entries for Smartfusion2
* accel/tcg/translate-all: expand cpu_restore_state addr check
(so usermode insn aborts don't crash with an assertion failure)
* fix TCG initialization of some Arm boards by allowing them
to specify min/default number of CPUs to create
* remotes/pmaydell/tags/pull-target-arm-20171113:
accel/tcg/translate-all: expand cpu_restore_state addr check
hw: add .min_cpus and .default_cpus fields to machine_class
xlnx-zcu102: Specify the max number of CPUs for the EP108
xlnx-zcu102: Add an info message deprecating the EP108
xlnx-zynqmp: Properly support the smp command line option
qom: move CPUClass.tcg_initialize to a global
MAINTAINERS: Add entries for Smartfusion2
highbank: validate register offset before access
arm/translate-a64: mark path as unreachable to eliminate warning
ie, all irqs are masked and XIRR is null, while we should get the
same output as with the emulated XICS.
If the guest is then migrated, 'info pic' shows the expected values
on both source and destination.
The problem is that QEMU doesn't synchronize with KVM before printing
the XICS state. Migration happens to fix the output because it enforces
synchronization with KVM.
To fix the invalid output of 'info pic', this patch introduces a new
synchronize_state operation for both ICPStateClass and ICSStateClass.
The ICP operation relies on run_on_cpu() in order to kick the vCPU
and avoid sleeping on KVM_GET_ONE_REG.
Sam Bobroff [Mon, 6 Nov 2017 03:14:35 +0000 (14:14 +1100)]
target/ppc: correct htab shift for hash on radix
KVM HV will soon support running a guest in hash mode on a POWER9 host
running in radix mode (see [1]), however the guest currently fails to
boot.
This is because the "htab_shift" value (the size of the MMU's hash
table) is added to the device tree before KVM has had a chance to
change it. If the host is in hash mode, KVM does not need to change it
and so the problem is not seen, but when the host is in radix mode a
change is required and we see a problem.
To fix this, move the call spapr_setup_hpt_and_vrma() (where
htab_shift could be changed) up a little so that it's called before
spapr_h_cas_compose_response() (where htab_shift is added to the
device tree).
Signed-off-by: Sam Bobroff <[email protected]>
[1] See http://www.spinics.net/lists/kvm-ppc/msg13057.html
Signed-off-by: David Gibson <[email protected]>
Alberto Garcia [Fri, 10 Nov 2017 18:54:48 +0000 (20:54 +0200)]
qemu-iotests: Test I/O limits with removable media
This test hotplugs a CD drive to a VM and checks that I/O limits can
be set only when the drive has media inserted and that they are kept
when the media is replaced.
This also tests the removal of a device with valid I/O limits set but
no media inserted. This involves deleting and disabling the limits
of a BlockBackend without BlockDriverState, a scenario that has been
crashing until the fixes from the last couple of patches.
[Python PEP8 fixup: "Don't use spaces around the = sign when used to
indicate a keyword argument or a default parameter value"
--Stefan]
Alberto Garcia [Fri, 10 Nov 2017 18:54:47 +0000 (20:54 +0200)]
block: Leave valid throttle timers when removing a BDS from a backend
If a BlockBackend has I/O limits set then its ThrottleGroupMember
structure uses the AioContext from its attached BlockDriverState.
Those two contexts must be kept in sync manually. This is not
ideal and will be fixed in the future by removing the throttling
configuration from the BlockBackend and storing it in an implicit
filter node instead, but for now we have to live with this.
When you remove the BlockDriverState from the backend then the
throttle timers are destroyed. If a new BlockDriverState is later
inserted then they are created again using the new AioContext.
There are a couple of problems with this:
a) The code manipulates the timers directly, leaving the
ThrottleGroupMember.aio_context field in an inconsistent state.
b) If you remove the I/O limits (e.g. by destroying the backend)
when the timers are gone then throttle_group_unregister_tgm()
will attempt to destroy them again, crashing QEMU.
While b) could be fixed easily by allowing the timers to be freed
twice, this would result in a situation in which we can no longer
guarantee that a valid ThrottleState has a valid AioContext and
timers.
This patch ensures that the timers and AioContext are always valid
when I/O limits are set, regardless of whether the BlockBackend has a
BlockDriverState inserted or not.
[Fixed "There'a" typo as suggested by Max Reitz <[email protected]>
--Stefan]
Alberto Garcia [Fri, 10 Nov 2017 18:54:46 +0000 (20:54 +0200)]
block: Check for inserted BlockDriverState in blk_io_limits_disable()
When you set I/O limits using block_set_io_throttle or the command
line throttling.* options they are kept in the BlockBackend regardless
of whether a BlockDriverState is attached to the backend or not.
Therefore when removing the limits using blk_io_limits_disable() we
need to check if there's a BDS before attempting to drain it, else it
will crash QEMU. This can be reproduced very easily using HMP:
* remotes/kraxel/tags/vga-20171110-pull-request:
vmsvga: use ARRAY_SIZE macro
vga: fix region checks in wraparound case
virtio-gpu: fix bug in host memory calculation.
This happens because blk_set_aio_context() detaches the ThrottleState
while requests may still be in flight:
if (tgm->throttle_state) {
throttle_group_detach_aio_context(tgm);
throttle_group_attach_aio_context(tgm, new_context);
}
This patch encloses the detach/attach calls in a drained region so no
I/O request is left hanging. Also add assertions so we don't make the
same mistake again in the future.
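A toy model of the drained region follows. All names are hypothetical, the AioContext is reduced to an integer, and "draining" simply completes pending requests, whereas a real drain waits for them. It shows where the assertions sit relative to the detach/attach pair:

```c
#include <assert.h>

typedef struct {
    int in_flight;          /* requests currently pending */
    int quiesce_counter;    /* > 0 while inside a drained region */
    int timers_context;     /* context the timers live in, -1 if detached */
} ExampleMember;

static void example_drained_begin(ExampleMember *m)
{
    m->quiesce_counter++;
    m->in_flight = 0;       /* toy drain: everything pending completes */
}

static void example_drained_end(ExampleMember *m)
{
    assert(m->quiesce_counter > 0);
    m->quiesce_counter--;
}

static void example_detach(ExampleMember *m)
{
    /* Detaching timers while requests are pending would strand them. */
    assert(m->in_flight == 0);
    m->timers_context = -1;
}

static void example_attach(ExampleMember *m, int new_context)
{
    m->timers_context = new_context;
}

/* Move the member to a new context inside a drained region. */
static void example_set_context(ExampleMember *m, int new_context)
{
    example_drained_begin(m);
    example_detach(m);
    example_attach(m, new_context);
    example_drained_end(m);
}
```

Calling example_detach() directly while in_flight is non-zero trips the assertion, which is the mistake the added assertions are meant to catch early.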
Zhengui [Sat, 21 Oct 2017 05:34:00 +0000 (13:34 +0800)]
block: all I/O should be completed before removing throttle timers.
In blk_remove_bs, all I/O should be completed before removing throttle
timers. If there is in-flight I/O, removing the throttle timers here will
cause that I/O to never return.
This patch adds bdrv_drained_begin before throttle_timers_detach_aio_context
so that all I/O completes before the throttle timers are removed.
[Moved declaration of bs as suggested by Alberto Garcia
<[email protected]>.
--Stefan]
We are still seeing signals during translation time when we walk over
a page protection boundary. This expands the check to ensure the host
PC is inside the code generation buffer. The original suggestion was
to check versus tcg_ctx.code_gen_ptr but as we now segment the
translation buffer we have to settle for just a general check for
being inside.
I've also fixed up the declaration to make it clear it can deal with
invalid addresses. A later patch will fix up the call sites.
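A general "is this host PC inside the buffer" test might look like the sketch below. CodeBuffer and pc_in_code_buffer() are hypothetical names chosen for illustration; the actual QEMU check and its data structures may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a code generation buffer. */
typedef struct {
    uintptr_t start; /* address of the first byte of the buffer */
    size_t size;     /* length of the buffer in bytes */
} CodeBuffer;

/* True if host_pc lies in [start, start + size). Comparing the distance
 * from start against size avoids computing start + size, which could
 * overflow for a buffer near the top of the address space. */
static bool pc_in_code_buffer(const CodeBuffer *buf, uintptr_t host_pc)
{
    return host_pc >= buf->start && host_pc - buf->start < buf->size;
}
```

Any address outside the range, including 0 or another invalid value, is simply reported as not inside, so a caller can pass addresses it has not validated.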
Emilio G. Cota [Mon, 13 Nov 2017 13:55:27 +0000 (13:55 +0000)]
hw: add .min_cpus and .default_cpus fields to machine_class
max_cpus needs to be an upper bound on the number of vCPUs
initialized; otherwise TCG region initialization breaks.
Some boards initialize a hard-coded number of vCPUs, which is not
captured by the global max_cpus and therefore breaks TCG initialization.
Fix it by adding the .min_cpus field to machine_class.
This commit also changes some user-facing behaviour: we now die if
-smp is below this hard-coded vCPU minimum instead of silently
ignoring the passed -smp value (sometimes announcing this by printing
a warning). However, the introduction of .default_cpus lessens the
likelihood that users will notice this: if -smp isn't set, we now
assign the value in .default_cpus to both smp_cpus and max_cpus. IOW,
if a user does not set -smp, they always get a correct number of vCPUs.
This change fixes 3468b59 ("tcg: enable multiple TCG contexts in
softmmu", 2017-10-24), which broke TCG initialization for some
ARM boards.
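The behaviour described above can be modelled roughly as follows. This is a sketch under assumptions: ToyMachineClass, ToySmpConfig and toy_resolve_smp() are hypothetical, the maxcpus= suboption is ignored, and a false return stands in for the point where QEMU now exits with an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-board limits, mirroring .min_cpus and .default_cpus. */
typedef struct {
    int min_cpus;
    int default_cpus;
} ToyMachineClass;

/* Resolved values, mirroring the smp_cpus and max_cpus globals. */
typedef struct {
    int smp_cpus;
    int max_cpus;
} ToySmpConfig;

/* Resolve the vCPU count. Returns false when -smp was given but is below
 * the board's hard-coded minimum. */
static bool toy_resolve_smp(const ToyMachineClass *mc, bool smp_given,
                            int requested, ToySmpConfig *out)
{
    if (!smp_given) {
        /* No -smp: use the board default for both values. */
        out->smp_cpus = mc->default_cpus;
        out->max_cpus = mc->default_cpus;
        return true;
    }
    if (requested < mc->min_cpus) {
        return false;
    }
    /* Simplification: treat the requested count as the upper bound too. */
    out->smp_cpus = requested;
    out->max_cpus = requested;
    return true;
}
```

For a board with min_cpus and default_cpus both set to 4, omitting -smp yields 4 vCPUs, while an explicit request for 2 is rejected rather than silently ignored.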
Alistair Francis [Mon, 13 Nov 2017 13:55:26 +0000 (13:55 +0000)]
xlnx-zcu102: Add an info message deprecating the EP108
The EP108 was an early access development board that is no longer used.
Add an info message to direct any users to the ZCU102 instead. In QEMU
the two boards are identical.
This patch also updates the qemu-doc.texi file to indicate that the
EP108 has been deprecated.
Emilio G. Cota [Mon, 13 Nov 2017 13:55:25 +0000 (13:55 +0000)]
qom: move CPUClass.tcg_initialize to a global
55c3cee ("qom: Introduce CPUClass.tcg_initialize", 2017-10-24)
introduces a per-CPUClass bool that we check so that the target CPU
is initialized for TCG only once. This works well except when
we end up creating more than one CPUClass, in which case we end
up incorrectly initializing TCG more than once, i.e. once for
each CPUClass.
This can be replicated with:
$ aarch64-softmmu/qemu-system-aarch64 -machine xlnx-zcu102 -smp 6 \
-global driver=xlnx,,zynqmp,property=has_rpu,value=on
In this case the class name of the "RPUs" is prefixed by "cortex-r5-",
whereas the "regular" CPUs are prefixed by "cortex-a53-". This
results in two CPUClass instances being created.
Fix it by introducing a static variable, so that only the first
target CPU being initialized will initialize the target-dependent
part of TCG, regardless of CPUClass instances.
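A minimal sketch of the "initialize once, regardless of class" pattern follows. ToyCPUClass, toy_cpu_realize() and toy_target_tcg_init() are hypothetical names, and the class names in the usage note are made up; only the use of a single static flag reflects the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the target-dependent initialization actually ran. */
static int toy_tcg_init_count;

static void toy_target_tcg_init(void)
{
    toy_tcg_init_count++;
}

/* Hypothetical CPU class; a per-class flag here would be set once per
 * class, which is the behaviour the commit moves away from. */
typedef struct {
    const char *name;
} ToyCPUClass;

static void toy_cpu_realize(const ToyCPUClass *cc)
{
    /* One flag for the whole process, shared by every CPU class. */
    static bool tcg_initialized;

    (void)cc; /* the class no longer influences whether we initialize */
    if (!tcg_initialized) {
        tcg_initialized = true;
        toy_target_tcg_init();
    }
}
```

Realizing CPUs from two different classes, for example one named "cortex-a53-example" and one named "cortex-r5-example", still runs the initialization exactly once. The sketch is not thread-safe; it assumes CPUs are realized from a single thread.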
An 'offset' parameter sent to the highbank register r/w functions
could be greater than the number (NUM_REGS=0x200) of hb registers,
leading to an out-of-bounds (OOB) access. Add a check to avoid it.
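A bounds check of this kind might look like the sketch below. It assumes a byte offset into an array of 32-bit registers, which is an assumption and not something the message states; ToyRegsState and the toy_regs_* functions are hypothetical, and only NUM_REGS=0x200 comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 0x200

/* Hypothetical register file with NUM_REGS 32-bit registers. */
typedef struct {
    uint32_t regs[NUM_REGS];
} ToyRegsState;

/* Read a register; out-of-range offsets return 0 without touching memory. */
static uint64_t toy_regs_read(const ToyRegsState *s, uint64_t offset)
{
    uint64_t index = offset / sizeof(uint32_t);

    if (index >= NUM_REGS) {
        return 0;
    }
    return s->regs[index];
}

/* Write a register; returns false and ignores out-of-range offsets. */
static bool toy_regs_write(ToyRegsState *s, uint64_t offset, uint32_t value)
{
    uint64_t index = offset / sizeof(uint32_t);

    if (index >= NUM_REGS) {
        return false;
    }
    s->regs[index] = value;
    return true;
}
```

The check compares the computed index, not the raw offset, against NUM_REGS, so the last valid byte offset is NUM_REGS * 4 - 4.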
Emilio G. Cota [Mon, 13 Nov 2017 13:55:24 +0000 (13:55 +0000)]
arm/translate-a64: mark path as unreachable to eliminate warning
Fixes the following warning when compiling with gcc 5.4.0 with -O1
optimizations and --enable-debug:
target/arm/translate-a64.c: In function ‘aarch64_tr_translate_insn’:
target/arm/translate-a64.c:2361:8: error: ‘post_index’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (!post_index) {
^
target/arm/translate-a64.c:2307:10: note: ‘post_index’ was declared here
bool post_index;
^
target/arm/translate-a64.c:2386:8: error: ‘writeback’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (writeback) {
^
target/arm/translate-a64.c:2308:10: note: ‘writeback’ was declared here
bool writeback;
^
Note that idx comes from selecting 2 bits, and therefore its value
can be at most 3.
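The shape of such a fix can be sketched as a switch over a 2-bit field whose default branch is marked unreachable. The bit position, the mapping from idx to the two flags, and the name toy_decode_index() are illustrative assumptions; abort() stands in for whatever unreachable marker the code base uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Decode a 2-bit index field into two flags. Every value of idx is
 * handled explicitly, so the default branch can never run. */
static void toy_decode_index(unsigned insn, bool *post_index, bool *writeback)
{
    unsigned idx = (insn >> 10) & 0x3; /* 2 bits: value is at most 3 */

    switch (idx) {
    case 0:
        *post_index = false;
        *writeback = false;
        break;
    case 1:
        *post_index = true;
        *writeback = true;
        break;
    case 2:
        *post_index = false;
        *writeback = false;
        break;
    case 3:
        *post_index = false;
        *writeback = true;
        break;
    default:
        /* Unreachable; tells the compiler no path leaves the outputs unset. */
        abort();
    }
}
```

Because the default branch never returns, the compiler can see that both flags are assigned on every path that reaches the code after the switch.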
Mike Nawrocki [Tue, 7 Nov 2017 18:35:03 +0000 (13:35 -0500)]
Add new PCI ID for i82559a
Adds a new PCI ID for the i82559a (0x8086 0x1030) interface. The
"x-use-alt-device-id" property controls whether this new ID is used;
it is true by default and set to false in a compat entry.
Mike Nawrocki [Tue, 7 Nov 2017 18:35:02 +0000 (13:35 -0500)]
Fix eepro100 simple transmission mode
The simple transmission mode was treating the area immediately after the
transmit command block (TCB) as if it were a transmit buffer descriptor,
when in reality it is simply the packet data. This change simply copies
the data following the TCB into the packet buffer.
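The corrected behaviour amounts to a bounded copy of the bytes that follow the TCB. In this sketch the header size TOY_TCB_SIZE, the flat memory model and the function toy_simple_tx() are all hypothetical; they do not describe the real eepro100 device model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_TCB_SIZE 16 /* hypothetical size of the transmit command block */

/* Copy byte_count bytes of packet data that directly follow the TCB at
 * tcb_addr into pkt. Returns the number of bytes copied, or 0 if the
 * request does not fit in guest memory or in the packet buffer. */
static size_t toy_simple_tx(const uint8_t *mem, size_t mem_len,
                            size_t tcb_addr, size_t byte_count,
                            uint8_t *pkt, size_t pkt_cap)
{
    if (byte_count > pkt_cap) {
        return 0;
    }
    if (tcb_addr > mem_len || TOY_TCB_SIZE > mem_len - tcb_addr) {
        return 0;
    }
    if (byte_count > mem_len - tcb_addr - TOY_TCB_SIZE) {
        return 0;
    }
    /* The area after the TCB is packet data, not a buffer descriptor. */
    memcpy(pkt, mem + tcb_addr + TOY_TCB_SIZE, byte_count);
    return byte_count;
}
```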
Mao Zhongyi [Fri, 13 Oct 2017 06:32:09 +0000 (14:32 +0800)]
colo: Consolidate the duplicate code chunk into a routine
Consolidate the code that extracts the IP addresses (src, dst) and
port numbers (src, dst) of the packet into a separate routine,
extract_ip_and_port(), since the same chunk of code is called
from two places.
Mao Zhongyi [Fri, 13 Oct 2017 06:32:07 +0000 (14:32 +0800)]
colo-compare: compare the packet in a specified Connection
A packet from pri_indev or sec_indev belongs to only one particular
Connection, so we only need to compare the packets in that
Connection's primary_list and secondary_list, rather than walking
the whole Connection list and comparing every Connection, which is
time-consuming and unnecessary.
Mao Zhongyi [Fri, 13 Oct 2017 06:32:06 +0000 (14:32 +0800)]
colo-compare: Insert packet into the suitable position of packet queue directly
Currently, a packet from pri_dev or sec_dev is first pushed onto the
tail of the primary or secondary packet queue, and the queue is then
sorted by TCP sequence number.
This patch uses g_queue_insert_sorted to insert the packet directly
into the suitable position, which avoids re-sorting all packets each
time a new packet arrives and thereby increases efficiency.
In addition, consolidate the code that adds a packet to the list of a
Connection (primary or secondary) into a separate routine, colo_insert_packet(),
since the same chunk of code is called from two places.
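The difference between "append, then sort" and "insert in place" can be shown with a small linked list. This sketch deliberately avoids GLib so it stays self-contained; ToyPacket and toy_insert_sorted() are hypothetical, and the plain numeric comparison ignores sequence-number wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet node ordered by TCP sequence number. */
typedef struct ToyPacket {
    uint32_t seq;
    struct ToyPacket *next;
} ToyPacket;

/* Insert pkt so the list stays sorted by ascending seq. One linear walk
 * replaces a full re-sort of the queue. Packets with equal seq keep
 * their arrival order. */
static void toy_insert_sorted(ToyPacket **head, ToyPacket *pkt)
{
    ToyPacket **link = head;

    /* Advance past every node that should stay in front of pkt. */
    while (*link != NULL && (*link)->seq <= pkt->seq) {
        link = &(*link)->next;
    }
    pkt->next = *link;
    *link = pkt;
}
```

Inserting packets with sequence numbers 30, 10 and 20 in that order leaves the list as 10, 20, 30 without any separate sorting pass.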
net: fix check for number of parameters to -netdev socket
Since commit 0f8c289ad "net: fix -netdev socket,fd= for UDP sockets"
we allow more than one parameter for -netdev socket. But now
we run into an assert when no parameter at all is specified
Peter Maydell [Fri, 10 Nov 2017 16:01:35 +0000 (16:01 +0000)]
Merge remote-tracking branch 'remotes/berrange/tags/pull-qcrypto-2017-11-08-1' into staging
Merge qcrypto 2017/11/08 v1
# gpg: Signature made Wed 08 Nov 2017 11:06:38 GMT
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <[email protected]>"
# gpg: aka "Daniel P. Berrange <[email protected]>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qcrypto-2017-11-08-1:
crypto: afalg: fix a NULL pointer dereference
tests: Run the luks tests in test-crypto-block only if encryption is available