Stefan Hajnoczi [Thu, 5 Mar 2015 21:38:18 +0000 (15:38 -0600)]
iotests: add O_DIRECT alignment probing test
This test case checks that image files can be opened even if I/O
produces EIO errors. QEMU should not refuse opening failed disks since
the guest may be configured for multipath I/O where accessing failed
disks is expected.
Previously, QEMU would launch successfully and the guest would see the
errors when attempting I/O.
This is a regression and may prevent multipath I/O inside the guest,
where QEMU must launch and let the guest figure out by itself which
disks are online.
Tweak the alignment probing code in raw-posix.c to explicitly look for
EINVAL on Linux instead of bailing. The kernel refuses misaligned
requests with this error code and other error codes can be ignored.
John Snow [Thu, 5 Mar 2015 04:37:55 +0000 (23:37 -0500)]
MAINTAINERS: Add jsnow as IDE maintainer
It has been proposed that the block layer be split up into smaller,
more manageable portions to help speed up the review and merging of
block layer patches.
As part of this process, I propose that I take over the IDE, ATA, ATAPI
and FD devices.
As we split out the block layer, we will begin using the qemu-block
mailing list as a catchall for all of the block layer subcomponents.
Please CC [email protected] for all block layer patches, including
any that touch the IDE/Floppy devices.
Commit c53659f0 ("BlockConf: Call backend functions to detect geometry
and blocksizes") causes a segmentation fault on the invalid
configuration of a scsi device without a drive.
Let's check for conf.blk before calling blkconf_blocksizes. The error
will be handled later on in scsi_realize anyway.
Max Reitz [Fri, 27 Feb 2015 19:54:39 +0000 (14:54 -0500)]
block/vdi: Add locking for parallel requests
When allocating a new cluster, the first write to it must be the one
doing the allocation, because that one pads its write request to the
cluster size; if another write to that cluster is executed before it,
that write will be overwritten due to the padding.
See https://bugs.launchpad.net/qemu/+bug/1422307 for what can go wrong
without this patch.
Max Reitz [Tue, 3 Mar 2015 21:22:39 +0000 (16:22 -0500)]
iotests: Drop vpc from 004's and 104's format list
Both tests require the test image to have a specific size; this cannot
be guaranteed by vpc (unless tuning the test specifically for that
format).
It is safe to exclude vpc from 004 because what is tested there is
implemented in a generic part in the block layer and not
format-specific.
It is safe to exclude vpc from 104 because for vpc basically every image
size is "unaligned", so if that would break at some point in time, we
would quickly notice just by running the generic tests.
Fam Zheng [Mon, 16 Feb 2015 04:09:40 +0000 (12:09 +0800)]
virtio-blk: Remove the stale FIXME comment
By default, we have ioeventfd enabled, so the IO request processing is
in IO thread; in the vcpu thread, guest mode is returned to as quickly
as possible, and completion is delivered via irqfd. Therefore this
comment from the initial implementation is barely relevant.
Marc Marí [Tue, 24 Feb 2015 16:34:14 +0000 (17:34 +0100)]
libqos: Solve bug in interrupt checking when using MSIX in virtio-pci.c
The MSIX interrupt was always acked without checking its value, which caused a
race condition. If the ISR was raised between the read and the acking, the ISR
was never detected and it timed out.
John Snow [Wed, 25 Feb 2015 23:06:39 +0000 (18:06 -0500)]
qtest/ahci: Add PIO and LBA48 tests
In addition to DMA tests, test PIO and LBA48 command pathways in AHCI.
To accomplish this, a primitive multiplexer for gtest is added.
Though guests may prefer not to issue PIO commands directly except
for single sector cases during early boot and shutdown, these pathways
are still used for the transfer of ATAPI commands as well, and should
be behaving well.
John Snow [Wed, 25 Feb 2015 23:06:38 +0000 (18:06 -0500)]
qtest/ahci: Add DMA test variants
These test a few different pathways in the AHCI code.
short: Test the minimum transfer size, exactly one sector.
simple: Test a transfer using a single PRD, in this case, 4K.
double: Test transferring 8K, which we will split up as two PRDs.
long: Test transferring a lot of data using many PRDs, 256K. Signed-off-by: John Snow <[email protected]>
Message-id: 1424905602[email protected] Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Kevin Wolf <[email protected]>
John Snow [Wed, 25 Feb 2015 23:06:37 +0000 (18:06 -0500)]
libqos/ahci: add ahci command helpers
ahci_command_set_flags: Set additional flags in the command header.
ahci_command_clr_flags: Clear flags from the command header.
ahci_command_set_offset: Change the IO sector from 0.
ahci_command_adjust: Adjust many values simultaneously.
To be used to adjust the command header if the default values/guesses
were incorrect or undesirable.
John Snow [Wed, 25 Feb 2015 23:06:36 +0000 (18:06 -0500)]
qtest/ahci: Add a macro bootup routine
Add a routine that can be used to engage the AHCI
device at a not-granular level so that bringing up
the functionality of the HBA is easy in future tests
that are not concerned with testing the bring-up process.
John Snow [Mon, 23 Feb 2015 16:18:06 +0000 (11:18 -0500)]
qtest/ide: Test flush / retry for ISA and PCI
This patch adds tests for werror and rerror functionality
for the PCI and ISA ide buses.
Tests for the AHCI device are to be included at a later
date after requisite patches have been merged upstream
to support needed functionality by the tests.
John Snow [Mon, 23 Feb 2015 16:18:05 +0000 (11:18 -0500)]
ahci: Recompute cur_cmd on migrate post load
When the AHCI HBA device is migrated, all of the information that
led to the request being created is stored in the AHCIDevice
structures, except for pointers into guest data where return
information needs to be stored.
The "cur_cmd" field is usually responsible for this.
To rebuild the cur_cmd pointer post-migration, we can utilize
the busy_slot index to figure out where the command header
we are still processing is.
This allows a machine in a halted state from rerror=stop or
werror=stop to be migrated and resume operations without issue.
Paolo Bonzini [Mon, 23 Feb 2015 16:18:04 +0000 (11:18 -0500)]
ahci: add support for restarting non-queued commands
This is easy, since start_dma already restarts processing from the
beginning of the PRDT.
Migration is also easy to cover; the comment about busy_slot is
wrong, busy_slot will only be set if there is an error. In this
case we have nothing to do really. The core IDE code will restart
the operation and command list processing will proceed after the
erroring command has been completed.
John Snow [Mon, 23 Feb 2015 16:18:03 +0000 (11:18 -0500)]
ahci: Migrate IDEStatus
Amazingly, we weren't doing this before.
Make sure we migrate the IDEState structure that belongs to
the AHCIDevice.IDEBus structure during migrations.
No version numbering changes because AHCI is not officially
migratable (and we can all see with good reason why) so we
do not impact any official builds by altering the stream and
leaving it at version 1.
This fixes the rerror=stop/werror=stop test case where we wish
to migrate a halted job. Previously, the error code would not
migrate, so even if the job completed successfully, AHCI would
report an error because it would still have the placeholder
error code from initialization time.
Paolo Bonzini [Mon, 23 Feb 2015 16:18:00 +0000 (11:18 -0500)]
ide: commonize io_buffer_index initialization
Resetting the io_buffer_index to 0 is commonized,
with the exception of the case within ide_atapi_cmd_reply,
where we need to reset this index to 0 prior to the
ide_atapi_cmd_reply_end call.
Note that not all calls to ide_atapi_cmd_reply_end
expect the index to be 0, so setting it there is
not appropriate.
Paolo Bonzini [Mon, 23 Feb 2015 16:17:59 +0000 (11:17 -0500)]
ide: migrate initial request state via IDEBus
This only breaks backwards migration compatibility if the bus is in
an error state. It is in principle possible to avoid this by making
two subsections (one for version 1, and one for version 2, but with
the same name) with different "_needed" callbacks. The v1 callback would
return true if error_status != 0 and the bus is PATA; the v2 callback
would return true if error_status != 0 and the bus is AHCI.
Paolo Bonzini [Mon, 23 Feb 2015 16:17:55 +0000 (11:17 -0500)]
ide: move restart callback to common code
With BMDMA specific excised from the restart functions,
create a HBA-agnostic restart callback to be shared
between the different HBAs.
Change the callback registered with the vmstate_change
handler to always point to ide_restart_cb instead of
relying on the IDEDMAOps.restart_cb() member.
Paolo Bonzini [Mon, 23 Feb 2015 16:17:52 +0000 (11:17 -0500)]
ide: introduce ide_register_restart_cb
A helper is added that registers the IDEDMAOp .restart_cb()
via qemu_add_vm_change_state_handler instead of requiring
each HBA to register the callback themselves.
Paolo Bonzini [Mon, 23 Feb 2015 16:17:51 +0000 (11:17 -0500)]
ide: prepare to move restart to common code
This patch adds the restart_dma callback and adjusts
the ide_restart_dma function to utilize this callback
to call the BMDMA-specific restart code instead of statically
executing BMDMA-specific code.
BlockConf: Call backend functions to detect geometry and blocksizes
geometry: hd_geometry_guess function autodetects the drive geometry.
This patch adds a block backend call, that probes the backing device
geometry. If the inner driver method is implemented and succeeds
(currently only for DASDs), the blkconf_geometry will pass-through
the backing device geometry. Otherwise will fallback to old logic.
blocksize: This patch initializes blocksize properties to 0.
In order to set the property a blkconf_blocksizes was introduced.
If user didn't set physical or logical blocksize, it will
retrieve its value from a driver (only succeeds for DASD), otherwise
it will set default 512 value.
The blkconf_blocksizes call was added to all users of BlkConf.
block: Add driver methods to probe blocksizes and geometry
Introduce driver methods of defining disk blocksizes (physical and
logical) and hard drive geometry.
Methods are only implemented for "host_device". For "raw" devices
driver calls child's method.
For now geometry detection will only work for DASD devices. To check
that a local check_for_dasd function was introduced. It calls BIODASDINFO2
ioctl and returns its rc.
Blocksizes detection function will probe sizes for DASD devices.
John Snow [Fri, 6 Feb 2015 21:26:17 +0000 (16:26 -0500)]
blkdebug: fix "once" rule
Background:
The blkdebug scripts are currently engineered so that when a debug
event occurs, a prefilter browses a master list of parsed rules for a
certain event and adds them to an "active list" of rules to be used for
the forthcoming action, provided the events and state numbers match.
Then, once the request is received, the last active rule is used to
inject an error if certain parameters match.
This active list is cleared every time the prefilter injects a new
rule for the first time during a debug event.
The "once" rule currently causes the error injection, if it is
triggered, to only clear the active list. This is insufficient for
preventing future injections of the same rule.
Remedy:
This patch /deletes/ the rule from the list that the prefilter
browses, so it is gone for good. In V2, we remove only the rule of
interest from the active list instead of allowing the "once" rule to
clear the entire list of active rules.
Impact:
This affects iotests 026. Several ENOSPC tests that used "once" can
be seen to have output that shows multiple failure messages. After
this patch, the error messages tend to be smaller and less severe, but
the injection can still be seen to be working. I have patched the
expected output to expect the smaller error messages.
Max Reitz [Wed, 18 Feb 2015 22:40:48 +0000 (17:40 -0500)]
iotests: Prepare for refcount_bits option
Some tests do not work well with certain refcount widths (i.e. you
cannot create internal snapshots with refcount_bits=1), so make those
widths unsupported.
Furthermore, add another filter to _filter_img_create in common.filter
which filters out the refcount_bits value.
This is necessary for test 079, which does actually work with any
refcount width, but invoking qemu-img directly leads to the
refcount_bits value being visible in the output; use _make_test_img
instead which will filter it out.
Max Reitz [Wed, 18 Feb 2015 22:40:47 +0000 (17:40 -0500)]
qcow2: Use symbolic macros in qcow2_amend_options
qcow2_amend_options() should not compare options against some inline
strings but rather use the symbolic macros available for each of the
creation options.
Max Reitz [Wed, 18 Feb 2015 22:40:46 +0000 (17:40 -0500)]
qcow2: refcount_order parameter for qcow2_create2
Add a refcount_order parameter to qcow2_create2(), use that value for
the image header and for calculating the size required for
preallocation.
For now, always pass 4.
This addition requires changes to the calculation of the file size for
the "full" and "falloc" preallocation modes. That in turn is a nice
opportunity to add a comment about that calculation not necessarily
being exact (and that being intentional).
Max Reitz [Tue, 10 Feb 2015 20:28:52 +0000 (15:28 -0500)]
qcow2: Open images with refcount order != 4
No longer refuse to open images with a different refcount entry width
than 16 bits; only reject images with a refcount width larger than 64
bits (which is prohibited by the specification).
Max Reitz [Tue, 10 Feb 2015 20:28:51 +0000 (15:28 -0500)]
qcow2: More helpers for refcount modification
Add helper functions for getting and setting refcounts in a refcount
array for any possible refcount order, and choose the correct one during
refcount initialization.
Max Reitz [Tue, 10 Feb 2015 20:28:50 +0000 (15:28 -0500)]
qcow2: Helper function for refcount modification
Since refcounts do not always have to be a uint16_t, all refcount blocks
and arrays in memory should not have a specific type (thus they become
pointers to void) and for accessing them, two helper functions are used
(a getter and a setter). Those functions are called indirectly through
function pointers in the BDRVQcowState so they may later be exchanged
for different refcount orders.
With the check and repair functions using this function, the refcount
array they are creating will be in big endian byte order; additionally,
using realloc_refcount_array() makes the size of this refcount array
always cluster-aligned. Both combined allow rebuild_refcount_structure()
to drop the bounce buffer which was used to convert parts of the
refcount array to big endian byte order and store them on disk. Instead,
those parts can now be written directly.
[ kwolf: Fixed a build failure on 32 bit and another with old glib ]
Max Reitz [Tue, 10 Feb 2015 20:28:49 +0000 (15:28 -0500)]
qcow2: Helper for refcount array reallocation
Add a helper function for reallocating a refcount array, independent of
the refcount order. The newly allocated space is zeroed and the function
handles failed reallocations gracefully.
The helper function will always align the buffer size to a cluster
boundary; if storing the refcounts in such an array in big endian byte
order, this makes it possible to write parts of the array directly as
refcount blocks into the image file.
Max Reitz [Tue, 10 Feb 2015 20:28:47 +0000 (15:28 -0500)]
qcow2: Use unsigned addend for update_refcount()
update_refcount() and qcow2_update_cluster_refcount() currently take a
signed addend. At least one caller passes a value directly derived from
an absolute refcount that should be reached ("l2_refcount - 1" in
expand_zero_clusters_in_l1()). Therefore, the addend should be unsigned
as well; this will be especially important for 64 bit refcounts.
Because update_refcount() then no longer knows whether the refcount
should be increased or decreased, it now requires an additional flag
which specified exactly that. The same applies to
qcow2_update_cluster_refcount().
Max Reitz [Tue, 10 Feb 2015 20:28:46 +0000 (15:28 -0500)]
qcow2: Only return status from qcow2_get_refcount
Refcounts can theoretically be of type uint64_t; in order to be able to
represent the full range, qcow2_get_refcount() cannot use a single
variable to represent both all refcount values and also keep some values
reserved for errors.
One solution would be to add an Error pointer parameter to
qcow2_get_refcount(); however, no caller could (currently) pass that
error message, so it would have to be emitted immediately and be
passed to the next caller by returning -EIO or something similar.
Therefore, an Error parameter does not offer any advantages here.
The solution applied by this patch is simpler to use. Because no caller
would be able to pass the error message, they would have to print it and
free it, whereas with this patch the caller only needs to pass the
returned integer (which is often a no-op from the code perspective,
because that integer will be stored in a variable "ret" which will be
returned by the fail path of many callers).
Max Reitz [Tue, 10 Feb 2015 20:28:45 +0000 (15:28 -0500)]
qcow2: Do not return new value after refcount update
qcow2_update_cluster_refcount() does not have any quick access to the
new refcount value, it has to call qcow2_get_refcount(). Some callers do
not need that new value at all, others call qcow2_get_refcount()
themselves anyway (albeit in a different code path, which can however be
easily changed), therefore there is no advantage in making
qcow2_update_cluster_refcount() return the new value. Drop it.
Max Reitz [Tue, 10 Feb 2015 20:28:44 +0000 (15:28 -0500)]
qcow2: Add refcount_bits to format-specific info
Add the bit width of every refcount entry to the format-specific
information.
In contrast to lazy_refcounts and the corrupt flag, this should be
always emitted, even for compat=0.10 although it does not support any
refcount width other than 16 bits. This is because if a boolean is
optional, one normally assumes it to be false when omitted; but if an
integer is not specified, it is rather difficult to guess its value.
This new field breaks some test outputs, fix them.
Marc Marí [Tue, 24 Feb 2015 21:21:54 +0000 (22:21 +0100)]
libqos: Add malloc generic
This malloc is a basic interface implementation that works for any platform.
It should be replaced in the future for a real malloc implementation for each
of the platforms.
This variable is used only when on of the following macros are defined
CONFIG_XFS, CONFIG_FALLOCATE, CONFIG_FALLOCATE_PUNCH_HOLE or
CONFIG_FALLOCATE_ZERO_RANGE. Fortunately, CONFIG_FALLOCATE_PUNCH_HOLE
and CONFIG_FALLOCATE_ZERO_RANGE could be defined only along with
CONFIG_FALLOCATE. Therefore checking for CONFIG_XFS or CONFIG_FALLOCATE
would be enough.
Kevin Wolf [Wed, 11 Feb 2015 14:56:01 +0000 (15:56 +0100)]
vpc: Implement bdrv_co_get_block_status()
This implements bdrv_co_get_block_status() for VHD images. This can
significantly speed up qemu-img convert operation because only with this
function implemented sparseness can be considered. (Before, converting a
1 TB empty image took several minutes for me, now it's instantaneous.)
Kevin Wolf [Wed, 11 Feb 2015 16:19:57 +0000 (17:19 +0100)]
vpc: Fix size in fixed image creation
If total_sectors is rounded to match the geometry, total_size needs to
be changed as well. Otherwise we end up with an image whose geometry
describes a disk larger than the image file, which doesn't end well.
Kevin Wolf [Tue, 10 Feb 2015 10:31:52 +0000 (11:31 +0100)]
coroutine: Clean up qemu_coroutine_enter()
qemu_coroutine_enter() is now the only user of coroutine_swap(). Both
functions are short, so inline it.
Also, using COROUTINE_YIELD is now even more confusing because this code
is never called during qemu_coroutine_yield() any more. In fact, this
value is never read back, so we can just introduce a new COROUTINE_ENTER
which documents the purpose of the task switch better.
Kevin Wolf [Tue, 10 Feb 2015 10:17:53 +0000 (11:17 +0100)]
coroutine: Fix use after free with qemu_coroutine_yield()
Instead of using the same function for entering and exiting coroutines,
and hoping that it doesn't add any functionality that hurts with the
parameters used for exiting, we can just directly call into the real
task switch in qemu_coroutine_switch().
This fixes a use-after-free scenario where reentering a coroutine that
has yielded still accesses the old parent coroutine (which may have
meanwhile terminated) in the part of coroutine_swap() that follows
qemu_coroutine_switch().
# gpg: Signature made Sat Mar 7 12:35:05 2015 GMT using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <[email protected]>"
# gpg: aka "Max Filippov <[email protected]>"
* remotes/xtensa/tags/20150307-xtensa:
target-xtensa: xtfpga: fix ml605 flash size
target-xtensa: implement do_unassigned_access callback
hw/xtensa: allow reads/writes in the system I/O region
Peter Maydell [Sun, 8 Mar 2015 12:47:13 +0000 (12:47 +0000)]
Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
docs: add memory-hotplug.txt
qemu-options.hx: improve -m description
virtio-balloon: Add some trace events
virtio-balloon: Fix balloon not working correctly when hotplug memory
pc-dimm: add a function to calculate VM's current RAM size
Peter Maydell [Sun, 8 Mar 2015 09:47:55 +0000 (09:47 +0000)]
Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20150304-1' into staging
misc spice/qxl fixes.
# gpg: Signature made Wed Mar 4 13:57:42 2015 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg: aka "Gerd Hoffmann <[email protected]>"
# gpg: aka "Gerd Hoffmann (private) <[email protected]>"
* remotes/spice/tags/pull-spice-20150304-1:
hmp: info spice: take out webdav
hmp: info spice: Show string channel name
qxl: drop update_displaychangelistener call for secondary qxl devices
vga: refactor vram_size clamping and rounding
qxl: refactor rounding up to a nearest power of 2
spice: fix invalid memory access to vga.vram
qxl: document minimal video memory for new modes
Peter Maydell [Sun, 8 Mar 2015 06:43:32 +0000 (06:43 +0000)]
Merge remote-tracking branch 'remotes/gonglei/tags/bootdevice-next-20150303' into staging
bootdevice: bug fixes
# gpg: Signature made Tue Mar 3 05:18:39 2015 GMT using RSA key ID DDE30FBB
# gpg: Good signature from "Gonglei <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5178 9C82 617F 2F58 8693 63B1 BA7A 65B0 DDE3 0FBB
* remotes/gonglei/tags/bootdevice-next-20150303:
bootdevice: add check in restore_boot_order()
bootdevice: check boot order argument validation before vm running
Peter Maydell [Sun, 8 Mar 2015 00:16:27 +0000 (00:16 +0000)]
Merge remote-tracking branch 'remotes/bkoppelmann/tags/pull-tricore-20150303' into staging
TriCore RRR1, RRR2 instructions and bugfixes
# gpg: Signature made Tue Mar 3 01:12:02 2015 GMT using RSA key ID 6B69CA14
# gpg: Good signature from "Bastian Koppelmann <[email protected]>"
* remotes/bkoppelmann/tags/pull-tricore-20150303:
target-tricore: Add instructions of RRR1 opcode format, which have 0xc3 as first opcode
target-tricore: Add instructions of RRR1 opcode format, which have 0x43 as first opcode
target-tricore: Add instructions of RRR1 opcode format, which have 0x83 as first opcode
target-tricore: Add instructions of RRR2 opcode format
target-tricore: fix msub32_suov return wrong results
target-tricore: Fix RLC_ADDI, RLC_ADDIH using wrong microcode helper
Max Filippov [Mon, 17 Feb 2014 16:57:45 +0000 (20:57 +0400)]
hw/xtensa: allow reads/writes in the system I/O region
Ignore writes to unassigned areas of system I/O regison and return 0 for
reads. This makes drivers for unimportant unimplemented hardware blocks
happy.
zhanghailiang [Mon, 17 Nov 2014 05:11:09 +0000 (13:11 +0800)]
virtio-balloon: Fix balloon not working correctly when hotplug memory
When do memory balloon, it takes the 'ram_size' as the VM's current ram size,
But 'ram_size' is the startup configured ram size, it does not take into
account the hotplugged memory.
As a result, the balloon result will be confused.
Steps to reproduce:
(1)Start VM: qemu -m size=1024,slots=4,maxmem=8G
(2)In VM: #free -m : 1024M
(3)qmp balloon 512M
(4)In VM: #free -m : 512M
(5)hotplug pc-dimm 1G
(6)In VM: #free -m : 1512M
(7)qmp balloon 256M
(8)In VM: #free -m :1256M
We expect the VM's available ram size to be 256M after 'qmp balloon 256M'
command, but VM's real available ram size is 1256M.
For "qmp balloon" is not performance critical code, we use function
'get_current_ram_size' to get VM's current ram size.
* remotes/awilliam/tags/vfio-update-20150302.0:
vfio-pci: Enable device request notification support
vfio: allow to disable MMAP per device with -x-mmap=off option
vfio: Make type1 listener symbols static
vfio: Add ioctl number to error report
Peter Maydell [Tue, 3 Mar 2015 12:07:47 +0000 (12:07 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- more config options
- bootdevice, iscsi, virtio-scsi fixes
- build system patches for MinGW and config-devices.mak
- qemu_mutex_lock_iothread deadlock fixes
- another tiny patch from the record/replay series
# gpg: Signature made Mon Mar 2 09:59:14 2015 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
cpus: be more paranoid in avoiding deadlocks
cpus: fix deadlock and segfault in qemu_mutex_lock_iothread
virtio-scsi: Allocate op blocker reason before blocking
Makefile.target: binary depends on config-devices
Makefile: don't silence mak file test with V=1
Makefile: fix up parallel building under MSYS+MinGW
iscsi: Handle write protected case in reopen
Give ivshmem its own config option
Create specific config option for "platform-bus"
Add specific config options for PCI-E bridges
bootdevice: fix segment fault when booting guest with '-kernel' and '-initrd'
timer: replace time() with QEMU_CLOCK_HOST
virtio-scsi-dataplane: Call blk_set_aio_context within BQL
block: Forbid bdrv_set_aio_context outside BQL
scsi: give device a parent before setting properties
UsbEnumerateNewDev() in the USB bus driver issues a GET_DESCRIPTOR
request, in order to determine the number of configurations that the
endpoint supports. The requests consists of three stages (three TRBs),
setup, data, and status. The length of the response is determined in [1],
namely from the transfer event that the host controller generates in
response to the request's middle stage (ie. the data stage).
If the length of the answer is correct (a full GET_DESCRIPTOR request
takes 18 bytes), then the XHCI driver that underlies the USB bus driver
"snoops" (caches) the descriptor data for later [2].
Later, the USB bus driver sends a SET_CONFIG request. The underlying XHCI
driver allocates a transfer ring for the endpoint, relying on the data
snooped and cached in step [2].
Finally, the USB keyboard driver submits an asynchronous interrupt
transfer to manage the keyboard. As part of this it asserts [4] that the
ring has been allocated in step [3].
And this ASSERT() fires. The root cause can be found in the way QEMU
handles the initial GET_DESCRIPTOR request.
Again, that request consists of three stages (TRBs, Transfer Request
Blocks), "setup", "data", and "status". The XhcCreateTransferTrb()
function sets the IOC ("Interrupt on Completion") flag in each of these
TRBs.
According to the XHCI specification, the host controller shall generate a
Transfer Event in response to *each* individual TRB of the request that
had the IOC flag set. This means that QEMU should queue three events:
setup, data, and status, for edk2's XHCI driver.
However, QEMU only generates two events:
- one for the setup (ie. 1st) stage,
- another for the status (ie. 3rd) stage.
No event is generated for the middle (ie. data) stage. The loop in QEMU's
xhci_xfer_report() function runs three times, but due to the "reported"
variable, only the first and the last TRBs elicit events, the middle (data
stage) results in no event queued.
As a consequence:
- When handling the GET_DESCRIPTOR request, XhcCheckUrbResult() in [1]
does not update the response length from zero.
- XhcControlTransfer() thinks that the response is invalid (it has zero
length payload instead of 18 bytes), hence [2] is not reached; the
device descriptor is not stashed for later, and the number of possible
configurations is left at zero.
- When handling the SET_CONFIG request, (NumConfigurations == 0) from
above prevents the allocation of the endpoint's transfer ring.
- When the keyboard driver tries to use the endpoint, the ASSERT() blows
up.
The solution is to correct the emulation in QEMU, and to generate a
transfer event whenever IOC is set in a TRB.
Radim Krčmář [Tue, 17 Feb 2015 16:30:51 +0000 (17:30 +0100)]
spice: fix invalid memory access to vga.vram
vga_common_init() doesn't allow more than 256 MiB vram size and silently
shrinks any larger value. qxl_dirty_surfaces() used the unshrinked size
via qxl->shadow_rom.surface0_area_size when accessing the memory, which
resulted in segfault.
Add a workaround for this case and an assert if it happens again.
We have to bump the vga memory limit too, because 256 MiB wouldn't have
allowed 8k (it requires more than 128 MiB).
1024 MiB doesn't work, but 512 MiB seems fine.
Gonglei [Tue, 3 Feb 2015 11:31:09 +0000 (11:31 +0000)]
bootdevice: check boot order argument validation before vm running
Either 'once' option or 'order' option can take effect for -boot at
the same time, that is say initial startup processing can check only
one. And pc.c's set_boot_dev() fails when its boot order argument
is invalid. This patch provide a solution fix this problem:
1. If "once" is given, register reset handler to restore boot order.
2. Pass the normal boot order to machine creation. Should fail when
the normal boot order is invalid.
3. If "once" is given, set it with qemu_boot_set(). Fails when the
once boot order is invalid.
4. Start the machine.
5. On reset, the reset handler calls qemu_boot_set() to restore boot
order. Should never fail.