Kevin Wolf [Fri, 15 Apr 2016 08:21:04 +0000 (10:21 +0200)]
block: Don't ignore flags in blk_{,co,aio}_write_zeroes()
Commit 57d6a428 neglected to pass the given flags to blk_aio_prwv(),
which broke discard by WRITE SAME for scsi-disk (the UNMAP bit would be
ignored).
Commit fc1453cd introduced the same bug for blk_write_zeroes(). This is
used for 'qemu-img convert' without has_zero_init (e.g. on a block
device) and for preallocation=falloc in parallels.
Commit 8896e088 is the version for blk_co_write_zeroes(). This function
is only used in qemu-io.
Jeff Cody [Wed, 23 Mar 2016 03:33:42 +0000 (23:33 -0400)]
block/vpc: make checks on max table size a bit more lax
The check on the max_table_size field not being larger than required is
valid, and in accordance with the VHD spec. However, there have been
VHD images encountered in the wild that have an out-of-spec max table
size that is technically too large.
There is no issue in allowing this larger table size, as we also
later verify that the computed size (used for the pagetable) is
large enough to fit all sectors. In addition, max_table_entries
is bounds checked against SIZE_MAX and INT_MAX.
Remove the strict check, so that we can accomodate these sorts of
images that are benignly out of spec.
Jeff Cody [Wed, 23 Mar 2016 03:33:41 +0000 (23:33 -0400)]
block/vpc: Use the correct max sector count for VHD images
The old VHD_MAX_SECTORS value is incorrect, and is a throwback
to the CHS calculations. The VHD specification allows images up to 2040
GiB, which (using 512 byte sectors) corresponds to a maximum number of
sectors of 0xff000000, rather than the old value of 0xfe0001ff.
Update VHD_MAX_SECTORS to reflect the correct value.
Also, update comment references to the actual size limit, and correct
one compare so that we can have sizes up to the limit.
Jeff Cody [Wed, 23 Mar 2016 03:33:40 +0000 (23:33 -0400)]
block/vpc: use current_size field for XenConverter VHD images
XenConverter VHD images are another VHD image where current_size is
different from the CHS values in the the format header. Use
current_size as the default, by looking at the creator_app signature
field.
Stefan Hajnoczi [Wed, 23 Mar 2016 03:33:39 +0000 (23:33 -0400)]
vpc: use current_size field for XenServer VHD images
The vpc driver has two methods of determining virtual disk size. The
correct one to use depends on the software that generated the image
file. Add the XenServer creator_app signature so that image size is
correctly detected for those images.
Kevin Wolf [Wed, 13 Apr 2016 10:47:08 +0000 (12:47 +0200)]
block: Fix blk_aio_write_zeroes()
Commit 57d6a428 broke blk_aio_write_zeroes() because in some write
functions in the call path don't have an explicit length argument but
reuse qiov->size instead. Which is great, except that write_zeroes
doesn't have a qiov, which this commit interprets as 0 bytes.
Consequently, blk_aio_write_zeroes() didn't effectively do anything.
This patch introduces an explicit acb->bytes in BlkAioEmAIOCB and uses
that instead of acb->rwco.size.
The synchronous version of the function is okay because it does pass a
qiov (with the right size and a NULL pointer as its base).
Peter Maydell [Thu, 14 Apr 2016 13:55:24 +0000 (14:55 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
tpm, vhost, virtio: fixes for 2.6
Minor fixes all over the place.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 14 Apr 2016 14:45:55 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
* remotes/mst/tags/for_upstream:
hw/virtio/balloon: Replace TARGET_PAGE_SIZE with BALLOON_PAGE_SIZE
tpm: Fix write to file descriptor function
tpm: acpi: remove IRQ from TPM's CRS to make Windows not see conflict
pc: acpi: tpm: add missing MMIO resource to PCI0._CRS
specs/vhost-user: spelling fix
specs/vhost-user: improve VHOST_SET_VRING_NUM documentation
Thomas Huth [Thu, 14 Apr 2016 08:50:07 +0000 (10:50 +0200)]
hw/virtio/balloon: Replace TARGET_PAGE_SIZE with BALLOON_PAGE_SIZE
The balloon code currently calls madvise() with TARGET_PAGE_SIZE as
length parameter. Since the virtio-balloon protocol is always based
on 4k pages, no matter what the host and guest are using as page size,
this could cause problems: If TARGET_PAGE_SIZE is bigger than 4k, the
madvise call also destroys the 4k areas after the current one - which
might be wrong since the guest did not want free that area yet (in
case the guest used as smaller MMU page size than the hard-coded
TARGET_PAGE_SIZE). So to fix this issue, introduce a proper define
called BALLOON_PAGE_SIZE (which is 4096) to use this as the size
parameter for the madvise() call instead.
Peter Maydell [Wed, 13 Apr 2016 17:48:28 +0000 (18:48 +0100)]
Merge remote-tracking branch 'remotes/elmarco/tags/ivshmem-fix-pull-request' into staging
# gpg: Signature made Wed 13 Apr 2016 11:04:51 BST using RSA key ID 75969CE5
# gpg: Good signature from "Marc-André Lureau <[email protected]>"
# gpg: aka "Marc-André Lureau <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/ivshmem-fix-pull-request:
ivshmem: fix ivshmem-{plain,doorbell} crash without arg
Igor Mammedov [Fri, 8 Apr 2016 11:23:14 +0000 (13:23 +0200)]
tpm: acpi: remove IRQ from TPM's CRS to make Windows not see conflict
IRQ 5 used by TPM conflicts with PNP0C0F IRQs,
as result Windows fails driver initialization with reason
'device cannot find enough free resources'
But if TPM._CRS.IRQ entry is commented out, Windows
seems to initialize driver without errors as it doesn't
notice possible conflict and it seems to work
probably due to a link with IRQ 5 being unused/disabled.
So temporary comment out TPM._CRS.IRQ to 'fix'
regression in TPM, with intent to fix it correctly
later i.e.:
1. pick unused IRQ as default one for TPM
2. fetch IRQ value from device model so that user
could override default one if it conflicts with
some other device.
Igor Mammedov [Fri, 8 Apr 2016 11:23:13 +0000 (13:23 +0200)]
pc: acpi: tpm: add missing MMIO resource to PCI0._CRS
Windows will fail initialize TMP driver with the reason:
'device cannot find enough free resources'
That happens because parent BUS doesn't describe
MMIO resources used by TPM child device.
Fix it by describing it in top-most parent bus scope PCI0.
It was 'regressed' by commit 5cb18b3d TPM2 ACPI table support
with following fixup 9e472263 acpi: add missing ssdt
which did the right thing by moving TPM to BUS
it belongs to but lacked a proper resource declaration.
"number of vrings" doesn't help me understand the purpose of this
message. My understanding is that it is rather the size of the queue (in
modern terms).
virtio-input: support absolute axis config in pass-through
VIRTIO_INPUT_CFG_ABS_INFO was not implemented for pass-through input
devices. This patch follows the existing design and pre-fetches the
config for all absolute axes using EVIOCGABS at realize time.
virtio-input is simple enough that it doesn't need to xfer any state.
Still we have to wire up savevm manually, so the generic pci and virtio
are saved correctly.
Additionally we need to do some post-load processing to figure whenever
the guest uses the device or not, so we can give input routing hints to
the qemu input layer using qemu_input_handler_{activate,deactivate}.
The write path for pass-through devices, commonly used for controlling
keyboard LEDs via EV_LED, was not implemented. This commit adds the
necessary plumbing to connect the status virtio queue to the host evdev
file descriptor.
Ladi Prosek [Wed, 30 Mar 2016 13:07:20 +0000 (15:07 +0200)]
virtio-input: add missing key mappings
KEY_PAUSE is flat out missing. KEY_SYSRQ already has a keycode
assigned but it's not what I'm seeing on my system. The mapping
doesn't appear to have to be unique so both keycodes now map to
KEY_SYSRQ which is what the "Keyboard PrintScreen", HID usage ID
0x46, translates to.
ivshmem: fix ivshmem-{plain,doorbell} crash without arg
"qemu -device ivshmem-{plain,doorbell}" will crash, because the device
doesn't check that the required argument is provided. (screwed up in
commit 5400c02)
Pavel Butsykin [Tue, 12 Apr 2016 22:48:15 +0000 (18:48 -0400)]
ide: really restart pending and in-flight atapi dma
Restart of ATAPI DMA used to be unreachable, because the request to do
so wasn't indicated in bus->error_status due to the lack of spare bits, and
ide_restart_bh() would return early doing nothing.
This patch makes use of the observation that not all bit combinations were
possible in ->error_status. In particular, IDE_RETRY_READ only made sense
together with IDE_RETRY_DMA or IDE_RETRY_PIO. This allows to re-use
IDE_RETRY_READ alone as an indicator of ATAPI DMA restart request.
To makes things more uniform, ATAPI DMA gets its own value for ->dma_cmd.
As a means against confusion, macros are added to test the state of
->error_status.
The patch fixes the restart of both in-flight and pending ATAPI DMA,
following the scheme similar to that of IDE DMA.
Pavel Butsykin [Tue, 12 Apr 2016 20:47:52 +0000 (16:47 -0400)]
ide: don't lose pending dma state
If the migration occurs after the IDE DMA has been set up but before it
has been initiated, the state gets lost upon save/restore. Specifically,
->dma_cb callback gets cleared, so, when the guest eventually starts bus
mastering, the DMA never completes, causing the guest to time out the
operation.
OTOH all the infrastructure is already in place to restart the DMA if
the migration happens while the DMA is in progress.
So reuse that infrastructure, by setting bus->error_status based on
->dma_cmd in pre_save if ->dma_cb callback is already set but DMAING is
clear. This will indicate the need for restart and make sure ->dma_cb
is restored in ide_restart_bh(); howeover since DMAING is clear the state
upon restore will be exactly "ready for DMA" as before the save.
Anthony PERARD [Tue, 12 Apr 2016 20:47:52 +0000 (16:47 -0400)]
xen: Fix IDE unplug
After commit e5e7855 (blockdev: Separate BB name management), starting a
guest with PVHVM support result in this assert:
qemu-system-i386: block/block-backend.c:173: blk_delete: Assertion `!blk->name' failed.
A backtrace show that a caller is pci_piix3_xen_ide_unplug().
Peter Maydell [Tue, 12 Apr 2016 16:47:15 +0000 (17:47 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches for 2.6
# gpg: Signature made Tue 12 Apr 2016 17:10:29 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
* remotes/kevin/tags/for-upstream:
qemu-iotests: iotests.py: get rid of __all__
qemu-iotests: 068: don't require KVM
qemu-iotests: 148: properly skip test if quorum support is missing
qemu-iotests: iotests.VM: remove qtest socket on error
qemu-iotests: fix 051 on non-PC architectures
qemu-iotests: check: don't place files with predictable names in /tmp
MAINTAINERS: Block layer core, qcow2 and blkdebug
qcow2: Prevent backing file names longer than 1023
vpc: fix return value check for blk_pwrite
iotests: Make 150 use qemu-img map instead of du
block: initialize qcrypto API at startup
qemu-img: fix formatting of error message
iotests: fix the broken 026.nocache output
Kevin Wolf [Tue, 12 Apr 2016 16:09:16 +0000 (18:09 +0200)]
Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-04-12' into queue-block
Block patches for 2.6-rc2.
# gpg: Signature made Tue Apr 12 18:08:20 2016 CEST using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <[email protected]>"
* mreitz/tags/pull-block-for-kevin-2016-04-12:
qemu-iotests: iotests.py: get rid of __all__
qemu-iotests: 068: don't require KVM
qemu-iotests: 148: properly skip test if quorum support is missing
qemu-iotests: iotests.VM: remove qtest socket on error
qemu-iotests: fix 051 on non-PC architectures
qemu-iotests: check: don't place files with predictable names in /tmp
The __all__ list contained a typo for as long as the iotests module
existed. That typo prevented "from iotests import *" (which is the
only case where iotests.__all__ is used at all) from ever working.
The names used by iotests are highly prone to name collisions, so
importing them all unconditionally is a bad idea anyway. Since __all__
is not adding any value, let's just get rid of it.
None of the other test cases explicitly enable KVM and there's no
obvious reason for 068 to require it. Drop this so all test cases can be
executed in environments where KVM is not available (e.g. because the
user doesn't have sufficient permissions to access /dev/kvm).
qemu-iotests: 148: properly skip test if quorum support is missing
qemu-iotests test case 148 already had some code for skipping the test
if quorum support is missing, but it didn't work in all
cases. TestQuorumEvents.setUp() gets run before the actual test class
(which contains the skipping code) and tries to start qemu with a drive
using the quorum driver. For some reason this works fine when using
qcow2, but fails for raw.
As the entire test case requires quorum, just check for availability
before even starting the test suite. Introduce a verify_quorum()
function in iotests.py for this purpose so future test cases can make
use of it.
qemu-iotests: iotests.VM: remove qtest socket on error
On error, VM.launch() cleaned up the monitor unix socket, but left the
qtest unix socket behind. This caused the remaining sub-tests to fail
with EADDRINUSE:
+======================================================================
+ERROR: testQuorum (__main__.TestFifoQuorumEvents)
+----------------------------------------------------------------------
+Traceback (most recent call last):
+ File "148", line 63, in setUp
+ self.vm.launch()
+ File "/home6/silbe/qemu/tests/qemu-iotests/iotests.py", line 247, in launch
+ self._qmp.accept()
+ File "/home6/silbe/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 141, in accept
+ return self.__negotiate_capabilities()
+ File "/home6/silbe/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 57, in __negotiate_capabilities
+ raise QMPConnectError
+QMPConnectError
+
+======================================================================
+ERROR: testQuorum (__main__.TestQuorumEvents)
+----------------------------------------------------------------------
+Traceback (most recent call last):
+ File "148", line 63, in setUp
+ self.vm.launch()
+ File "/home6/silbe/qemu/tests/qemu-iotests/iotests.py", line 244, in launch
+ self._qtest = qtest.QEMUQtestProtocol(self._qtest_path, server=True)
+ File "/home6/silbe/qemu/tests/qemu-iotests/../../scripts/qtest.py", line 33, in __init__
+ self._sock.bind(self._address)
+ File "/usr/lib64/python2.7/socket.py", line 224, in meth
+ return getattr(self._sock,name)(*args)
+error: [Errno 98] Address already in use
Fix this by cleaning up both the monitor socket and the qtest socket iff
they exist.
Commit 61de4c68 [block: Remove BDRV_O_CACHE_WB] updated the reference
output for PCs, but neglected to do the same for the generic reference
output file. Fix 051 on all non-PC architectures by applying the same
change to the generic output file.
qemu-iotests: check: don't place files with predictable names in /tmp
Placing files with predictable or even hard-coded names in /tmp is a
security risk and can prevent or disturb operation on a multi-user
machine. Place them inside the "scratch" directory instead, as we
already do for most other test-related files.
Paolo Bonzini [Thu, 7 Apr 2016 14:52:34 +0000 (16:52 +0200)]
vpc: fix return value check for blk_pwrite
bdrv_pwrite_sync used to return zero or negative error, while blk_pwrite returns
the number of written bytes when successful. This caused VPC image creation
to fail spectacularly: it wrote the first 512 bytes, and then exited immediately
because of the non-zero answer from blk_pwrite. But the truly spectacular part
is that it returns a positive value (the 512 that blk_pwrite returned) causing
everyone to believe that it succeeded.
This fixes qemu-iotests with vpc format.
Fixes: b8f45cdf7827e39f9a1e6cc446f5972cc6144237 Signed-off-by: Paolo Bonzini <[email protected]> Signed-off-by: Kevin Wolf <[email protected]>
Max Reitz [Tue, 29 Mar 2016 16:24:17 +0000 (18:24 +0200)]
iotests: Make 150 use qemu-img map instead of du
The actual on-disk size of a file does not only depend on factors qemu
can control. Thus, we should not depend on this to determine whether a
file has indeed been fully allocated. Instead, use qemu-img map and hope
that if an area is referenced, it is indeed allocated, too.
Also, limit the supported image formats to raw and qcow2 because the
actual qemu-img map output may depend on the image format.
Any programs which call the qcrypto APIs should ensure that
qcrypto_init() has been called before anything else which
can use crypto. Essentially this means right at the start
of the main method before initializing anything else.
This is important because some versions of gnutls/gcrypt
require explicit initialization before use.
The error_reportf_err() will not automatically append a
': ' before adding its suffix, so we must include that
in the message we pass it, otherwise we get a badly
formatted message lacking whitespace:
qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=6666,tls-creds=tls0'Failed to connect socket: Connection refused
Pavel Butsykin [Wed, 6 Apr 2016 06:08:32 +0000 (09:08 +0300)]
iotests: fix the broken 026.nocache output
This patch fixes longstanding issue with 026 iotest. Unfortunately,
this test contains 2 versions of the correct output, one for cached
writes and one for non-cached ones. People tends to fix only one
version of output of the test and thus noncached version becomes
broken. Unfortunately, it is default in tests/check-block.sh
Peter Maydell [Tue, 12 Apr 2016 08:34:52 +0000 (09:34 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 12 Apr 2016 09:29:54 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/block-pull-request:
MAINTAINERS: Add Fam Zheng as a co-maintainer of block I/O path
mirror: Replace bdrv_drain(bs) with bdrv_co_drain(bs)
block: Fix bdrv_drain in coroutine
Using the nested aio_poll() in coroutine is a bad idea. This patch
replaces the aio_poll loop in bdrv_drain with a BH, if called in
coroutine.
For example, the bdrv_drain() in mirror.c can hang when a guest issued
request is pending on it in qemu_co_mutex_lock().
Mirror coroutine in this case has just finished a request, and the block
job is about to complete. It calls bdrv_drain() which waits for the
other coroutine to complete. The other coroutine is a scsi-disk request.
The deadlock happens when the latter is in turn pending on the former to
yield/terminate, in qemu_co_mutex_lock(). The state flow is as below
(assuming a qcow2 image):
mirror coroutine scsi-disk coroutine
-------------------------------------------------------------
do last write
qcow2:qemu_co_mutex_lock()
...
scsi disk read
tracked request begin
qcow2:qemu_co_mutex_lock.enter
qcow2:qemu_co_mutex_unlock()
bdrv_drain
while (has tracked request)
aio_poll()
In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
because the mirror coroutine is blocked in the aio_poll(blocking=true).
With this patch, the added qemu_coroutine_yield() allows the scsi-disk
coroutine to make progress as expected:
mirror coroutine scsi-disk coroutine
-------------------------------------------------------------
do last write
qcow2:qemu_co_mutex_lock()
...
scsi disk read
tracked request begin
qcow2:qemu_co_mutex_lock.enter
qcow2:qemu_co_mutex_unlock()
bdrv_drain.enter
> schedule BH
> qemu_coroutine_yield()
> qcow2:qemu_co_mutex_lock.return
> ...
tracked request end
...
(resumed from BH callback)
bdrv_drain.return
...
ldstub [addr], reg incorrectly reads a signed byte from memory which causes
problems in the 32-bit Solaris mutex code. Here the byte value being read is
0xff which is incorrectly sign-extended to 0xffffffff before being written back
to the target register causing lock detection to behave incorrectly.
This fixes the intermittent hangs and MUTEX_HELD warnings issued to the
console when running 32-bit Solaris images under qemu-system-sparc.
With thanks to Joseph Dery for providing a condensed test image to consistently
reproduce the problem on demand, and Martin Husemann for allowing me access to
real hardware for comparison.
net: stellaris_enet: check packet length against receive buffer
When receiving packets over Stellaris ethernet controller, it
uses receive buffer of size 2048 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.
Feeling a bit nervous putting the full live migration support
patch (https://patchwork.ozlabs.org/patch/606902/) in that
late in the 2.6 devel cycle as it carries some non-trivial
changes. So disable migration in case virtio-gpu is present
for now.
ui/virtio-gpu: add and use qemu_create_displaysurface_pixman
Add a the new qemu_create_displaysurface_pixman function, to create
a DisplaySurface backed by an existing pixman image. In that case
there is no need to create a new pixman image pointing to the same
backing storage. We can just use the existing image directly.
This does not only simplify things a bit, but most importantly it
gets the reference counting right, so the backing storage for the
pixman image wouldn't be released underneath us.
Use new function in virtio-gpu, where using it actually fixes
use-after-free crashes.
Peter Maydell [Fri, 8 Apr 2016 11:45:53 +0000 (12:45 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, virtio, acpi: fixes for 2.6
Fixes all over the place. Most notably, fixes migration
for systems with pci express bridges, and random crashes
observed with virtio blk and scsi dataplane.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Fri 08 Apr 2016 08:53:46 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
* remotes/mst/tags/for_upstream:
hw/pci-bridge: Add missing unref in case register-bus fails
virtio: merge virtio_queue_aio_set_host_notifier_handler with virtio_queue_set_aio
virtio-scsi: use aio handler for data plane
virtio-blk: use aio handler for data plane
virtio: add aio handler
virtio-scsi: fix disabled mode
virtio-blk: fix disabled mode
virtio: make virtio_queue_notify_vq static
tests/bios-tables-test: fix assert
virtio-balloon: reset the statistic timer to load device
Migration: Add i82801b11 migration data
Sort the fw_cfg file list
xen: piix reuse pci generic class init function
pci-testdev: fast mmio support
acpi: Add missing GCC_FMT_ATTR
Peter Maydell [Fri, 8 Apr 2016 10:54:18 +0000 (11:54 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160408' into staging
ppc patch queue for 2016-04-08
Just a single bugfix for spapr in this batch, but I want to make sure
it gets in for 2.6.
# gpg: Signature made Fri 08 Apr 2016 06:02:45 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
Peter Maydell [Fri, 8 Apr 2016 09:51:45 +0000 (10:51 +0100)]
Merge remote-tracking branch 'remotes/weil/tags/pull-tci-20160407' into staging
tci patch queue
# gpg: Signature made Thu 07 Apr 2016 18:01:55 BST using RSA key ID 677450AD
# gpg: Good signature from "Stefan Weil <[email protected]>"
# gpg: aka "Stefan Weil <[email protected]>"
# gpg: aka "Stefan Weil <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2 B78A E08C 21D5 6774 50AD
Peter Maydell [Fri, 8 Apr 2016 09:25:22 +0000 (10:25 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NBD fixes from Alex and Eric
* Debug code bitrot from Emilio
* HPET fix from Bill
* ps2kbd fix from Hervé
* PKU fix from myself
* Coverity fixes from Gonglei
* More memory.txt update from Jiangang
* .gitignore maintenance from Changlong
# gpg: Signature made Thu 07 Apr 2016 23:08:12 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
* remotes/bonzini/tags/for-upstream:
target-i386: check for PKU even for non-writable pages
tests: ignore test-logging
translate-all: add missing fold of tb_ctx into tcg_ctx
hostmem-file: fix memory leak
spapr: fix possible Negative array index read
nbd: do not hang nbd_wr_syncv if outside a coroutine and no available data
nbd: Don't kill server when client requests unknown option
nbd: Fix NBD unsupported options
qemu-nbd: Document -x option
nbd: Improve debug traces on little-endian
nbd: Avoid bitrot in TRACE() usage
nbd: Return correct error for write to read-only export
docs: fix typo in memory.txt
hw/timer: Revert "hpet: inverse polarity when pin above ISA_NUM_IRQS"
ps2kbd: default to scancode_set 2, as with KBD_CMD_RESET
ibm,lrdr-capacity has a field to describe the maximum address in bytes
and therefore, the most memory that can be allocated to this guest. We
are using maxmem for this field, but instead should use the actual RAM
address corresponding to the end of hotplug region.
(All failures are combinations of "pde.user pde.p pkru.wd pkey=1",
plus either "pde.pse" or "pte.p pte.user", plus one of "user cr0.wp",
"cr0.wp" or "user", plus unimportant bits such as accessed/dirty or
efer.nx).
So PFEC.PKEY is set even if the ordinary check failed (which it did
because pde.w is zero). Adjust QEMU to match behavior of silicon.
translate-all: add missing fold of tb_ctx into tcg_ctx
Since 5e5f07e08 "TCG: Move translation block variables
to new context inside tcg_ctx: tb_ctx" on Feb 1 2013, compilation
of usermode + TB_DEBUG_CHECK has been broken. Fix it.
Paolo Bonzini [Thu, 7 Apr 2016 11:25:08 +0000 (13:25 +0200)]
nbd: do not hang nbd_wr_syncv if outside a coroutine and no available data
Until commit 1c778ef7 ("nbd: convert to using I/O channels for actual
socket I/O", 2016-02-16), nbd_wr_sync returned -EAGAIN this scenario.
nbd_reply_ready required these semantics because it has two conflicting
requirements:
1) if a reply can be received on the socket, nbd_reply_ready needs
to read the header outside coroutine context to identify _which_
coroutine to enter to process the rest of the reply
2) on the other hand, nbd_reply_ready can find a false positive if
another thread (e.g. a VCPU thread running aio_poll) sneaks in and
calls nbd_reply_ready too. In this case nbd_reply_ready does nothing
and expects nbd_wr_syncv to return -EAGAIN.
Currently, the solution to the first requirement is to wait in the very
rare case of a read() that doesn't retrieve the reply header in its
entirety; this is what nbd_wr_syncv does by calling qio_channel_wait().
However, the unconditional call to qio_channel_wait() breaks the second
requirement. To fix this, the patch makes nbd_wr_syncv return -EAGAIN
if done is zero, similar to the code before commit 1c778ef7.
This is okay because NBD client-side negotiation is the only other case
that calls nbd_wr_syncv outside a coroutine, and it places the socket
in blocking mode. On the other hand, it is a bit unpleasant to put
this in nbd_wr_syncv(), because the function is used by both client
and server.
The full fix would be to add a counter to NbdClientSession for how
many bytes have been filled in s->reply. Then a reply can be filled
by multiple separate invocations of nbd_reply_ready and the
qio_channel_wait() call can be removed completely. Something to
consider for 2.7...
Eric Blake [Wed, 6 Apr 2016 22:48:38 +0000 (16:48 -0600)]
nbd: Don't kill server when client requests unknown option
nbd-server.c currently fails to handle unsupported options properly.
If during option haggling the client sends an unknown request, the
server kills the connection instead of letting the client try to
fall back to something older. This is precisely what advertising
NBD_FLAG_FIXED_NEWSTYLE was supposed to fix.
Alex Bligh [Wed, 6 Apr 2016 16:59:22 +0000 (10:59 -0600)]
nbd: Fix NBD unsupported options
nbd-client.c currently fails to handle unsupported options properly.
If during option haggling the server finds an option that is
unsupported, it returns an NBD_REP_ERR_UNSUP reply.
According to nbd's proto.md, the format for such a reply
should be:
S: 64 bits, 0x3e889045565a9 (magic number for replies)
S: 32 bits, the option as sent by the client to which this is a reply
S: 32 bits, reply type (e.g., NBD_REP_ACK for successful completion,
or NBD_REP_ERR_UNSUP to mark use of an option not known by this server
S: 32 bits, length of the reply. This may be zero for some replies,
in which case the next field is not sent
S: any data as required by the reply (e.g., an export name in the case
of NBD_REP_SERVER, or optional UTF-8 message for NBD_REP_ERR_*)
However, in nbd-client.c, the reply type was being read, and if it
contained an error, it was bailing out and issuing the next option
request without first reading the length. This meant that the
next option / handshake read had an extra 4 or more bytes of data in it.
In practice, this makes Qemu incompatible with servers that do not
support NBD_OPT_LIST.
To verify this isn't an error in the specification or my reading of
it, replies are sent by the reference implementation here:
https://github.com/yoe/nbd/blob/66dfb35/nbd-server.c#L1232
and as is evident it always sends a 'datasize' (aka length) 32 bit
word. Unsupported elements are replied to here:
https://github.com/yoe/nbd/blob/66dfb35/nbd-server.c#L1371
Eric Blake [Wed, 6 Apr 2016 02:02:08 +0000 (20:02 -0600)]
qemu-nbd: Document -x option
Commit 3d4b2f9c added -x to force qemu-nbd to use new-style
negotiation, but while it documented it in the man page, it
omitted docs in the --help output.
Eric Blake [Wed, 6 Apr 2016 03:35:04 +0000 (21:35 -0600)]
nbd: Improve debug traces on little-endian
Print debug tracing messages while data is still in native
ordering, rather than after we've potentially swapped it into
network order for transmission. Also, it's nice if the server
mentions what it is replying, to correlate it to with what the
client says it is receiving.
Eric Blake [Wed, 6 Apr 2016 03:35:02 +0000 (21:35 -0600)]
nbd: Return correct error for write to read-only export
The NBD Protocol requires that servers should send EPERM for
attempts to write (or trim) a read-only export. We were
correct for TRIM (blk_co_discard() gave EPERM); but were
manually setting EROFS which then got mapped to EINVAL over
the wire on writes.
This change was originally intended to correct the HPET behavior
in conjunction with Linux, however the behavior that it actually creates
is not compatible with the ioapic.c implementation; it used to be
compatible with KVM's own IOAPIC but it is not anymore.
Hervé Poussineau [Wed, 23 Mar 2016 06:21:40 +0000 (07:21 +0100)]
ps2kbd: default to scancode_set 2, as with KBD_CMD_RESET
This line has been added in commit ef74679a810fe6858f625b9d52b68cc3fc61eb3d with
other initializations. However, scancode set 0 doesn't exist (only 1, 2, 3).
This works well as long as operating system is resetting keyboard, or overwriting
the current scancode set with the one it wants.
This fixes IBM 40p firmware, which doesn't bother sending KBD_CMD_RESET or KBD_CMD_SCANCODE.
* remotes/mdroth/tags/qga-pull-2016-04-07-tag:
qga: Workaround for console redirection from non-interactive qemu-ga service
qga: fix fd leak with guest-exec i/o channels
Stefan Weil [Tue, 5 Apr 2016 20:24:51 +0000 (22:24 +0200)]
tci: Fix build regression
Commit d38ea87ac54af64ef611de434d07c12dc0399216 cleaned the include
statements which resulted in a wrong order of assert.h and the definition
of NDEBUG in tci.c. Normally NDEBUG modifies the definition of the assert
macro, but here this definition comes too late which results in a failing
build.
To fix this, a new macro tci_assert which depends on CONFIG_DEBUG_TCG
is introduced. Only builds with CONFIG_DEBUG_TCG will use assertions.
Even in this case, it is still possible to disable assertions by
defining NDEBUG via compiler settings.
Wei Jiangang [Wed, 23 Mar 2016 07:26:19 +0000 (15:26 +0800)]
hw/pci-bridge: Add missing unref in case register-bus fails
The error paths after a successful qdev_create/pci_bus_new
should contain a object_unref/object_unparent.
pxb_dev_init_common() did not yet, so add it.
Paolo Bonzini [Wed, 6 Apr 2016 10:16:28 +0000 (12:16 +0200)]
virtio: merge virtio_queue_aio_set_host_notifier_handler with virtio_queue_set_aio
Eliminating the reentrancy is actually a nice thing that we can do
with the API that Michael proposed, so let's make it first class.
This also hides the complex assign/set_handler conventions from
callers of virtio_queue_aio_set_host_notifier_handler, which in
fact was always called with assign=true.
Paolo Bonzini [Wed, 6 Apr 2016 10:16:27 +0000 (12:16 +0200)]
virtio-scsi: use aio handler for data plane
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
In addition to handling IO in vcpu thread and in io thread, blk dataplane
introduces yet another mode: handling it by AioContext.
Currently, this reuses the same handler as previous modes,
which triggers races as these were not designed to be reentrant.
Add instead a separate handler just for aio; this will make
it possible to disable regular handlers when dataplane is active.
Paolo Bonzini [Wed, 6 Apr 2016 10:16:24 +0000 (12:16 +0200)]
virtio-scsi: fix disabled mode
Add two missing checks for s->dataplane_fenced. In one case, QEMU
would skip injecting an IRQ due to a write to an uninitialized
EventNotifier's file descriptor.
In the second case, the dataplane_disabled field was used by mistake;
in fact after fixing this occurrence it is completely unused.
Paolo Bonzini [Wed, 6 Apr 2016 10:16:23 +0000 (12:16 +0200)]
virtio-blk: fix disabled mode
We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.
Newer iasl does not add the aml file name to the Definition Block.
See acpica tools commit 1ecbb3d5:
"Emit the AMLFilename as a zero-length string. Allows the compiler to create
the name later -- making it easier to rename the parent ASL (DSL) file."
That causes an assert in acpi tests:
tests/bios-tables-test.c:455:normalize_asl: assertion failed: (block_name)
Fix it by striping the start of the definition block line until the first comma.
The block name is always the first parameter and
the grammar does not allow comma in between, so it is safe.