Anthony Liguori [Mon, 23 Sep 2013 16:52:49 +0000 (11:52 -0500)]
Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
# By Alexey Kardashevskiy (3) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
target-i386: add feature kvm_pv_unhalt
linux-headers: update to 3.12-rc1
target-i386: forward CPUID cache leaves when -cpu host is used
linux-headers: update to 3.11
kvm: fix traces to use %x instead of %d
kvmvapic: Clear also physical ROM address when entering INACTIVE state
kvmvapic: Enter inactive state on hardware reset
kvmvapic: Catch invalid ROM size
kvm irqfd: support direct msimessage to irq translation
fix steal time MSR vmsd callback to proper opaque type
kvm: warn if num cpus is greater than num recommended
cpu: Move cpu state syncs up into cpu_dump_state()
exec: always use MADV_DONTFORK
Anthony Liguori [Mon, 23 Sep 2013 16:52:32 +0000 (11:52 -0500)]
Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Hervé Poussineau (5) and Stefan Weil (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
lsi: add 53C810 variant
lsi: remove todo
lsi: ignore write accesses to CTEST0 registers
lsi: check ssid versus sdid only if ssid is valid
lsi: use constant name instead of its value
Anthony Liguori [Fri, 20 Sep 2013 13:08:08 +0000 (08:08 -0500)]
Merge remote-tracking branch 'kraxel/usb.90' into staging
# By Hans de Goede (6) and Gerd Hoffmann (1)
# Via Gerd Hoffmann
* kraxel/usb.90:
usb: Fix iovec memleak on combined-packet free
usb: Also reset max_packet_size on ep_reset
xhci: Fix memory leak on xhci_disable_ep
xhci: Add xhci_epid_to_usbep helper function
xhci: Init a transfers xhci, slotid and epid member on epctx alloc
xhci: Fix number of streams allocated when using streams
usb: remove old usb-host code
target-i386: forward CPUID cache leaves when -cpu host is used
Some users running cpu intensive tasks checking the cache CPUID leaves at
startup and making decisions based on the result reported that the guest was
not reflecting the host CPUID leaves when -cpu host is used.
This patch fix this.
Signed-off-by: Benoît Canet <[email protected]>
[Rename new field to cache_info_passthrough - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
Jan Kiszka [Tue, 3 Sep 2013 16:08:51 +0000 (18:08 +0200)]
kvmvapic: Enter inactive state on hardware reset
ROM layout may change after reset of devices are hotplugged, so we have
to pick up the physical address again when the ROM is initialized. This
is best achieved by resetting the state to INACTIVE.
Jan Kiszka [Tue, 3 Sep 2013 16:08:50 +0000 (18:08 +0200)]
kvmvapic: Catch invalid ROM size
If not caught early, a zero-length ROM will cause a NULL-pointer access
later on in patch_hypercalls when allocating a zero-length ROM copy and
trying to read from it.
kvm irqfd: support direct msimessage to irq translation
On PPC64 systems MSI Messages are translated to system IRQ in a PCI
host bridge. This is already supported for emulated MSI/MSIX but
not for irqfd where the current QEMU allocates IRQ numbers from
irqchip and maps MSIMessages to IRQ in the host kernel.
This adds a new direct mapping flag which tells
the kvm_irqchip_add_msi_route() function that a new VIRQ
should not be allocated, instead the value from MSIMessage::data
should be used. It is up to the platform code to make sure that
this contains a valid IRQ number as sPAPR does in spapr_pci.c.
Andrew Jones [Fri, 23 Aug 2013 13:24:37 +0000 (15:24 +0200)]
kvm: warn if num cpus is greater than num recommended
The comment in kvm_max_vcpus() states that it's using the recommended
procedure from the kernel API documentation to get the max number
of vcpus that kvm supports. It is, but by always returning the
maximum number supported. The maximum number should only be used
for development purposes. qemu should check KVM_CAP_NR_VCPUS for
the recommended number of vcpus. This patch adds a warning if a user
specifies a number of cpus between the recommended and max.
James Hogan [Tue, 27 Aug 2013 11:19:10 +0000 (12:19 +0100)]
cpu: Move cpu state syncs up into cpu_dump_state()
The x86 and ppc targets call cpu_synchronize_state() from their
*_cpu_dump_state() callbacks to ensure that up to date state is dumped
when KVM is enabled (for example when a KVM internal error occurs).
Move this call up into the generic cpu_dump_state() function so that
other KVM targets (namely MIPS) can take advantage of it.
This requires kvm_cpu_synchronize_state() and cpu_synchronize_state() to
be moved out of the #ifdef NEED_CPU_H in <sysemu/kvm.h> so that they're
accessible to qom/cpu.c.
Andrea Arcangeli [Thu, 25 Jul 2013 10:11:15 +0000 (12:11 +0200)]
exec: always use MADV_DONTFORK
MADV_DONTFORK prevents fork to fail with -ENOMEM if the default
overcommit heuristics decides there's too much anonymous virtual
memory allocated. If the KVM secondary MMU is synchronized with MMU
notifiers or not, doesn't make a difference in that regard.
Secondly it's always more efficient to avoid copying the guest
physical address space in the fork child (so we avoid to mark all the
guest memory readonly in the parent and so we skip the establishment
and teardown of lots of pagetables in the child).
In the common case we can ignore the error if MADV_DONTFORK is not
available. Leave a second invocation that errors out in the KVM path
if MMU notifiers are missing and KVM is enabled, to abort in such
case.
Hans de Goede [Tue, 17 Sep 2013 19:44:49 +0000 (21:44 +0200)]
xhci: Init a transfers xhci, slotid and epid member on epctx alloc
Transfers are part of an epctx, which is part of a slot, which is part of
a xhci. Transfers cannot dynamically be moved from one epctx to another,
so once created their xhci, slotid and epid are constant, so lets set these
up at creation time, rather then re-initializing them with the same
value each time a transfer gets submitted.
The usb-host code has been rewritten for qemu 1.5 to use libusb,
the old code has been left in as temporary fallback. Now we are
two releases further out, targeting the 1.7 release. No major
issues with the new code poped up until now. Time to remove it
from tre tree. Should we ever need it again for some reason --
git has a copy for us in the history.
Nowdays rom size is fixed at 8192 for live migration compat reasons.
So we can ditch the pointless math trying to calculate the size needed.
Also make the size sanity check fail at compile time not runtime.
Stefan Weil [Tue, 17 Sep 2013 17:33:49 +0000 (19:33 +0200)]
block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
Debian wheezy includes libiscsi-dev 1.4.0 which does not provide
SCSI_PROVISIONING_TYPE_DEALLOCATED. Drop iscsi_co_get_block_status
in this case to allow compilation without errors.
Anthony Liguori [Tue, 17 Sep 2013 15:01:24 +0000 (10:01 -0500)]
Merge remote-tracking branch 'kiszka/queues/slirp' into staging
# By Liu Ping Fan (3) and Jan Kiszka (1)
# Via Jan Kiszka
* kiszka/queues/slirp:
slirp: clean up slirp_update_timeout
slirp: set mainloop timeout with more precise value
slirp: define timeout as macro
slirp: make timeout local
Anthony Liguori [Tue, 17 Sep 2013 14:51:40 +0000 (09:51 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (16) and others
# Via Kevin Wolf
* kwolf/for-anthony: (33 commits)
qemu-iotests: Fix test 038
block: Assert validity of BdrvActionOps
qemu-iotests: Cleanup test image in test number 007
qemu-img: fix invalid JSON
coroutine: add ./configure --disable-coroutine-pool
qemu-iotests: Adjustments due to error propagation
qcow2: Use Error parameter
qemu-img create: Emit filename on error
block: Error parameter for create functions
block: Error parameter for open functions
bdrv: Use "Error" for creating images
bdrv: Use "Error" for opening images
qemu-iotests: add 057 internal snapshot for block device test case
hmp: add interface hmp_snapshot_delete_blkdev_internal
hmp: add interface hmp_snapshot_blkdev_internal
qmp: add interface blockdev-snapshot-delete-internal-sync
qmp: add interface blockdev-snapshot-internal-sync
qmp: add internal snapshot support in qmp_transaction
snapshot: distinguish id and name in snapshot delete
snapshot: new function bdrv_snapshot_find_by_id_and_name()
...
Anthony Liguori [Tue, 17 Sep 2013 14:51:23 +0000 (09:51 -0500)]
Merge remote-tracking branch 'rth/tgt-i386' into staging
# By Paolo Bonzini (1) and Peter Maydell (1)
# Via Richard Henderson
* rth/tgt-i386:
target-i386: Only provide CMOV and friends if feature bit set
target-i386: fix disassembly with PAE=1, PG=0
Anthony Liguori [Tue, 17 Sep 2013 14:50:23 +0000 (09:50 -0500)]
Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Peter Lieven (3) and others
# Via Paolo Bonzini
* bonzini/scsi-next:
spapr-vscsi: Report error on unsupported MAD requests
spapr-vscsi: Adding VSCSI capabilities
iscsi: split discard requests in multiple parts
iscsi: add .bdrv_get_block_status
iscsi: add logical block provisioning information to iscsilun
hw/scsi/lsi53c895a: Use deposit32 rather than handcoded shift/mask
hw/scsi/lsi53c895a: Use sextract32 for sign-extension
scsi: Fix scsi_bus_legacy_add_drive() scsi-generic with serial
virtio-scsi: Make type virtio-scsi-common abstract
spapr-vscsi: add task management
scsi: prefer UUID to VM name for the initiator name
Liu Ping Fan [Sun, 25 Aug 2013 02:01:21 +0000 (10:01 +0800)]
slirp: set mainloop timeout with more precise value
If slirp needs to emulate tcp timeout, then the timeout value
for mainloop should be more precise, which is determined by
slirp's fasttimo or slowtimo. Achieve this by swap the logic
sequence of slirp_pollfds_fill and slirp_update_timeout.
Currently, treat it exactly as a 53C895A. 53C895A is a 53C810 with more capabilities, so this should work.
However, this lets us test different code paths on Linux, which
don't use lastest features if it detect a 810, or on some OSes
which only support 810 and not 895A (like very old Windows NT
versions).
53C895A datasheet says that this register is read/write, and that the value
returned on read access is dependant of DMA FIFO state. However, nothing is
said for written value.
53C810A datasheet gives more insight about this register:
"This was a general purpose read/write register in previous SYM53C8XX
family chips. Although it is still a read/write register, Symbios reserves
the right to use these bits for future 53C8XX family enhancements."
This prevents going to the default case, which prints an error message.
Max Reitz [Fri, 13 Sep 2013 08:37:12 +0000 (10:37 +0200)]
qemu-iotests: Fix test 038
Test 038 uses asynchronous I/O, resulting (potentially) in a different
output for every run (regarding the order of the I/O accesses). This can
be fixed by simply sorting the I/O access messages, since their order is
irrelevant anyway (for this asynchonous I/O).
Peter Maydell [Mon, 15 Jul 2013 17:21:40 +0000 (18:21 +0100)]
target-i386: Only provide CMOV and friends if feature bit set
The instructions CMOVcc, FCMOVcc and F[U]COMI[P] should only be
present if the CMOV feature bit is set. Add missing feature bit
checks so we correctly fault if emulating a 486 or 586.
This fixes bug LP:1201446.
exec: Don't abort when we can't allocate guest memory
We abort() on memory allocation failure. abort() is appropriate for
programming errors. Maybe most memory allocation failures are
programming errors, maybe not. But guest memory allocation failure
isn't, and aborting when the user asks for more memory than we can
provide is not nice. exit(1) instead, and do it in just one place, so
the error message is consistent.
Another issue missed in commit fdec991 is -mem-path: it needs to be
rejected only for old S390 KVM, not for any S390. Not that I
personally care, but the ifdeffery in qemu_ram_alloc_from_ptr() annoys
me.
Note that this doesn't actually make -mem-path work, as the kernel
doesn't (yet?) support large pages in the host for KVM guests. Clean
it up anyway.
Thanks to Christian Borntraeger for pointing out the S390 kernel
limitations.
exec: Drop incorrect & dead S390 code in qemu_ram_remap()
Old S390 KVM wants guest RAM mapped in a peculiar way. Commit 6b02494
implemented that.
When qemu_ram_remap() got added in commit cd19cfa, its code carefully
mimicked the allocation code: peculiar way if defined(TARGET_S390X) &&
defined(CONFIG_KVM), else normal way.
For new S390 KVM, we actually want the normal way. Commit fdec991
changed qemu_ram_alloc_from_ptr() accordingly, but forgot to update
qemu_ram_remap(). If qemu_ram_alloc_from_ptr() maps RAM the normal
way, but qemu_ram_remap() remaps it the peculiar way, remapping
changes protection and flags, which it shouldn't.
Fortunately, this can't happen, as we never remap on S390.
Replace the incorrect code with an assertion.
Thanks to Christian Borntraeger for help with assessing the bug's
(non-)impact.
Instead of spreading its ifdeffery everywhere, confine it to
qemu_ram_alloc_from_ptr(). Everywhere else, simply test block->fd,
which is non-negative exactly when block uses -mem-path.
exec: Clean up fall back when -mem-path allocation fails
With -mem-path, qemu_ram_alloc_from_ptr() first tries to allocate
accordingly, but when it fails, it falls back to normal allocation.
The fall back allocation code used to be effectively identical to the
"-mem-path not given" code, until it started to diverge in commit 432d268. I believe the code still works, but clean it up anyway: drop
the special fall back allocation code, and fall back to the ordinary
"-mem-path not given" code instead.
Max Reitz [Thu, 12 Sep 2013 12:57:27 +0000 (14:57 +0200)]
block: Assert validity of BdrvActionOps
In qmp_transaction, assert that the BdrvActionOps to be used is actually
valid.
This assertion failing is very improbable, however, it might happen, if
a new TransactionActionKind is introduced "out of order" and the
actions[] array is not updated.
qemu-iotests: Cleanup test image in test number 007
qemu-iotests number 007 doesn't do test image cleanup. This will affect
those protocols that expect a clean state before every test. Hence
ensure that test image is cleaned up in this test.
This implements capabilities exchange between vscsi host and client. As
at the moment no capability is supported, put zero flags everywhere and
return.
The 'gthread' coroutine backend was written before the freelist (aka
pool) existed in qemu-coroutine.c.
This means that every thread is expected to exit when its coroutine
terminates. It is not possible to reuse threads from a pool.
This patch automatically disables the pool when 'gthread' is used. This
allows the 'gthread' backend to work again (for example,
tests/test-coroutine completes successfully instead of hanging).
I considered implementing thread reuse but I don't want quirks like CPU
affinity differences due to coroutine threads being recycled. The
'gthread' backend is a reference backend and it's therefore okay to skip
the pool optimization.
Note this patch also makes it easy to toggle the pool for benchmarking
purposes:
Max Reitz [Tue, 10 Sep 2013 11:59:43 +0000 (13:59 +0200)]
qemu-iotests: Adjustments due to error propagation
When opening/creating images, propagating errors instead of immediately
emitting them on occurrence results in errors generally being printed on
a single line rather than being split up into multiple ones. This in
turn requires adjustments to some test results.
Also, test 060 used a sed to filter out the test image directory and
format by removing everything from the affected line after a certain
keyword; this now also removes the error message itself, which can be
fixed by using _filter_testdir and _filter_imgfmt.
Finally, _make_test_img in common.rc did not filter out the test image
directory etc. from stderr. This has been fixed through a redirection of
stderr to stdout (which is already done in _check_test_img and
_img_info).
Max Reitz [Fri, 6 Sep 2013 14:51:03 +0000 (16:51 +0200)]
qemu-img create: Emit filename on error
bdrv_img_create generally does not emit the target filename, although
this is pretty important information. Therefore, prepend its error
message with the output filename (if an error occurs).
This interface use id and name as optional parameters, to handle the
case that one image contain multiple snapshots with same name which
may be '', but with different id.
Adding parameter id is for historical compatiability reason, and
that case is not possible in qemu's new interface for internal
snapshot at block device level, but still possible in qemu-img.
qmp: add internal snapshot support in qmp_transaction
Unlike savevm, the qmp_transaction interface will not generate
snapshot name automatically, saving trouble to return information
of the new created snapshot.
Although qcow2 support storing multiple snapshots with same name
but different ID, here it will fail when an snapshot with that name
already exist before the operation. Format such as rbd do not support
ID at all, and in most case, it means trouble to user when he faces
multiple snapshots with same name, so ban that case. Request with
empty name will be rejected.
snapshot: distinguish id and name in snapshot delete
Snapshot creation actually already distinguish id and name since it take
a structured parameter *sn, but delete can't. Later an accurate delete
is needed in qmp_transaction abort and blockdev-snapshot-delete-sync,
so change its prototype. Also *errp is added to tip error, but return
value is kepted to let caller check what kind of error happens. Existing
caller for it are savevm, delvm and qemu-img, they are not impacted by
introducing a new function bdrv_snapshot_delete_by_id_or_name(), which
check the return value and do the operation again.
Before this patch:
For qcow2, it search id first then name to find the one to delete.
For rbd, it search name.
For sheepdog, it does nothing.
After this patch:
For qcow2, logic is the same by call it twice in caller.
For rbd, it always fails in delete with id, but still search for name
in second try, no change to user.
snapshot: new function bdrv_snapshot_find_by_id_and_name()
To make it clear about id and name in searching, add this API
to distinguish them. Caller can choose to search by id or name,
*errp will be set only for exception.
Max Reitz [Thu, 5 Sep 2013 08:55:54 +0000 (10:55 +0200)]
qemu-iotests: New test case in 061
Add one test case for zero cluster expansion on qcow2 version downgrade
in shared L2 tables (i.e., L2 tables with a refcount > 1) and one for
zero expansion on backed clusters in shared L2 tables.
qemu-iotests: add tests for runtime fd passing via SCM rights
This case will test whether the monitor can receive fd at runtime.
To verify better, additional monitor is created to see if qemu
can handler two monitor instances correctly.
Max Reitz [Tue, 3 Sep 2013 08:09:54 +0000 (10:09 +0200)]
qcow2: Implement bdrv_amend_options
Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.
Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.
Max Reitz [Tue, 3 Sep 2013 08:09:53 +0000 (10:09 +0200)]
qcow2: Save refcount order in BDRVQcowState
Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.
Max Reitz [Tue, 3 Sep 2013 08:09:50 +0000 (10:09 +0200)]
block: Image file option amendment
This patch adds the "amend" option to qemu-img which allows changing
image options on existing image files. It also adds the generic bdrv
implementation which is basically just a wrapper for the image format
specific function.
Tal Kain [Mon, 9 Sep 2013 09:14:55 +0000 (11:14 +0200)]
raw-win32.c: Fix incorrect handling behaviour of small block files
It is a valid case that the read data's size is smaller than the
requested size since there could be files that are smaller than
the minimum block size (For ex. when a VMDK disk descriptor file)
Kevin Wolf [Fri, 6 Sep 2013 10:20:08 +0000 (12:20 +0200)]
qcow2: Discard VM state in active L1 after creating snapshot
During savevm, the VM state is written to the active L1 of the image and
then a snapshot is taken. After that, the VM state isn't needed any more
in the active L1 and should be discarded. This is implemented by this
patch.
The impact of not discarding the VM state is that a snapshot can never
become smaller than any previous snapshot (because it would be padded
with old VM state), and more importantly that future savevm operations
cause unnecessary COWs (with associated flushes), which makes subsequent
snapshots much slower.
Gerd Hoffmann [Thu, 22 Aug 2013 09:43:58 +0000 (11:43 +0200)]
chardev: fix pty_chr_timer
pty_chr_timer first calls pty_chr_update_read_handler(), then clears
timer_tag (because it is a one-shot timer). This is the wrong order
though. pty_chr_update_read_handler might re-arm time timer, and the
new timer_tag gets overwitten in that case.
This leads to crashes when unplugging a pty chardev: pty_chr_close
thinks no timer is running -> timer isn't canceled -> pty_chr_timer gets
called with stale CharDevState -> BOOM.
This patch fixes the ordering.
Kill the pointless goto while being at it.