Peter Maydell [Wed, 14 Jan 2015 18:02:47 +0000 (18:02 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Mostly bugfixes and cleanups from qemu-devel. Yet another small patch from
the record/replay series, and a few SCSI and i386 patches as well.
# gpg: Signature made Wed 14 Jan 2015 09:39:14 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
cpus: consistently use QEMU_CLOCK_VIRTUAL_RT for icount_warp_rt timer
qemu-timer: rename timer_init to timer_init_tl
scsi: fix cancellation when I/O was completed but DMA was not.
rules.mak: Fix module build
hw/scsi/lsi53c895a: add support for additional diag / debug registers
qemu-common.h: optimise muldiv64 if int128 is available
target-i386: do not memcpy in and out of xmm_regs
target-i386: fix movntsd on big-endian hosts
vl.c: fix regression when reading memory size from config file
vl: Don't silently change topology when all -smp options were set
vl: fix max_cpus check
vl: Avoid unnecessary 'if' nesting
9pfs: changed to use event_notifier instead of qemu_pipe
vl.c: fix regression when reading machine type from config file
char: restore stdio echo on resume from suspend.
Paolo Bonzini [Mon, 12 Jan 2015 10:47:30 +0000 (11:47 +0100)]
scsi: fix cancellation when I/O was completed but DMA was not.
Commit d577646 (scsi: Introduce scsi_req_cancel_complete, 2014-09-25)
was supposed to have no semantic change, but it missed a case. When
r->aiocb has already been NULLed, but DMA was not complete and the
SCSI layer was waiting for scsi_req_continue, after the patch the
SCSI layer will not call the .cancel callback of SCSIBusInfo.
Fam Zheng [Mon, 12 Jan 2015 04:43:09 +0000 (12:43 +0800)]
rules.mak: Fix module build
Module build is broken since commit c261d774fb ( rules.mak: Fix DSO
build by pulling in archive symbols). That commit added .mo placeholders
of DSO to -y variables, in order to pull stub symbols to executable. But
the placeholders are unintentionally expanded in -y, rather than
filtered out while linking.
Fix it by moving the -objs expanding to before inserting .mo
placeholders. Note that passing -cflags and -libs to member objects are
also moved to keep it happening before object expanding.
Peter Lieven [Mon, 12 Jan 2015 09:45:17 +0000 (10:45 +0100)]
hw/scsi/lsi53c895a: add support for additional diag / debug registers
Some ancient Linux kernels read from registers 0x09 and 0x3c-3f during
boot. According to the spec these registers are for diag and debug
purposes only. If they are absend qemu aborts on read.
Paolo Bonzini [Fri, 24 Oct 2014 07:44:38 +0000 (09:44 +0200)]
target-i386: do not memcpy in and out of xmm_regs
After the next patch, we will move the high parts of AVX and AVX512 registers
in the same array as the SSE registers. This will make it impossible to
memcpy an array of 128-bit values in and out of xmm_regs in one swoop.
Use a for loop instead.
Similarly, always use XMM_Q in translate.c. This avoids introducing bugs
such as the one fixed in the previous patch.
xen-hvm: increase maxmem before calling xc_domain_populate_physmap
Increase maxmem before calling xc_domain_populate_physmap_exact to
avoid the risk of running out of guest memory. This way we can also
avoid complex memory calculations in libxl at domain construction
time.
This patch fixes an abort() when assigning more than 4 NICs to a VM.
Peter Maydell [Tue, 13 Jan 2015 13:49:18 +0000 (13:49 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 13 Jan 2015 13:48:06 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/block-pull-request: (38 commits)
NVMe: Set correct VS Value for 1.1 Compliant Controllers
MAINTAINERS: Add migration/block* to block subsystem
MAINTAINERS: Update email addresses for Chrysostomos Nanakos
nvme: Fix get/set number of queues feature
ide: Implement VPD response for ATAPI
block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}
block: limited request size in write zeroes unsupported path
coroutine: try harder not to delete coroutines
coroutine: drop qemu_coroutine_adjust_pool_size
coroutine: rewrite pool to avoid mutex
QSLIST: add lock-free operations
test-coroutine: avoid overflow on 32-bit systems
qemu-thread: add per-thread atexit functions
coroutine-ucontext: use __thread
qemu-iotests: Add supported os parameter for python tests
qemu-iotests: Add "_supported_os Linux" to 058
qemu-iotests: Replace "/bin/true" with "true"
.gitignore: Ignore generated "common.env"
libqos: Convert malloc-pc allocator to a generic allocator
migration/block: fix pending() return value
...
Alex Friedman [Fri, 5 Dec 2014 12:40:24 +0000 (14:40 +0200)]
nvme: Fix get/set number of queues feature
According to the specification, the low 16 bits should contain the number of
I/O submission queues, and the high 16 bits should contain the number of
I/O completion queues.
John Snow [Wed, 10 Dec 2014 18:17:07 +0000 (13:17 -0500)]
ide: Implement VPD response for ATAPI
SCSI devices have multiple kinds of queries they need to respond
to, as defined in the "cmd inquiry" section in MMC-6 and SPC-3.
Relevent sections:
MMC-6 revision 2g:
Non-VPD response data and pointer to SPC-3;
Section 6.8 "Inquiry Command"
SPC-3 revision 23:
Inquiry command and error handling:
Section 6.4 "INQUIRY command"
VPD data pages format:
Section 7.6 "Vital product data parameters"
We implement these Vital Product Data queries for SCSI, but not for
ATAPI through IDE. The result is that if you are looking for the WWN
identifier via tools such as sg3_utils, you will be unable to query
our CD/DVD rom device to obtain it.
This patch adds the minimum number of mandatory responses as defined
by SPC-3, which include the "supported pages" response (page 0x00)
and the "Device Identification" response (page 0x83). It also correctly
responds when it receives a request for an illegal page to improve
error output from related tools.
The Device ID page contains an arbitrary list of identification
strings of various formats; the ID strings included in this patch
were chosen to mimic those provided by the libata driver when
emulating this SCSI query (model, serial, and wwn when present.)
Example:
# libata emulated response
[root@localhost ~]# sg_inq --id /dev/sda
VPD INQUIRY: Device Identification page
Designation descriptor number 1, descriptor length: 24
designator_type: vendor specific [0x0], code_set: ASCII
associated with the addressed logical unit
vendor specific: QM00001
Designation descriptor number 2, descriptor length: 72
designator_type: T10 vendor identification, code_set: ASCII
associated with the addressed logical unit
vendor id: ATA
vendor specific: QEMU HARDDISK QM00001
# QEMU generated ATAPI response, with WWN
[root@localhost ~]# sg_inq --id /dev/sr0
VPD INQUIRY: Device Identification page
Designation descriptor number 1, descriptor length: 24
designator_type: vendor specific [0x0], code_set: ASCII
associated with the addressed logical unit
vendor specific: QM00005
Designation descriptor number 2, descriptor length: 72
designator_type: T10 vendor identification, code_set: ASCII
associated with the addressed logical unit
vendor id: ATA
vendor specific: QEMU DVD-ROM QM00005
Designation descriptor number 3, descriptor length: 12
designator_type: NAA, code_set: Binary
associated with the addressed logical unit
NAA 5, IEEE Company_id: 0xc50
Vendor Specific Identifier: 0x15ea71bb
[0x5000c50015ea71bb]
See also: hw/scsi/scsi-disk.c, scsi_disk_emulate_inquiry()
block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}
Like BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET,
block-commit involves two asymmetric devices.
This change is not user-visible (yet), because commit only works with
device names.
But once we enable backing reference in blockdev-add, or specifying
node-name in block-commit command, we don't want the user to start two
commit jobs on the same backing chain, which will corrupt things because
of the final bdrv_swap.
Before we have per category blockers, splitting this type is still
better.
[Resolved virtio-blk dataplane conflict by replacing
BLOCK_OP_TYPE_COMMIT with both BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}.
They are safe since the block job runs in the same AioContext as the
dataplane IOThread.
--Stefan]
Peter Lieven [Mon, 5 Jan 2015 11:29:49 +0000 (12:29 +0100)]
block: limited request size in write zeroes unsupported path
If bs->bl.max_write_zeroes is large and we end up in the unsupported
path we might allocate a lot of memory for the iovector and/or even
generate an oversized requests.
Fix this by limiting the request by the minimum of the reported
maximum transfer size or 16MB (32768 sectors).
Peter Lieven [Tue, 2 Dec 2014 11:05:50 +0000 (12:05 +0100)]
coroutine: try harder not to delete coroutines
Placing coroutines on the global pool should be preferrable, because it
can help all threads. But if the global pool is full, we can still
try to save some allocations by stashing completed coroutines on the
local pool. This is quite cheap too, because it does not require
atomic operations, and provides a gain of 15% in the best case.
Paolo Bonzini [Tue, 2 Dec 2014 11:05:48 +0000 (12:05 +0100)]
coroutine: rewrite pool to avoid mutex
This patch removes the mutex by using fancy lock-free manipulation of
the pool. Lock-free stacks and queues are not hard, but they can suffer
from the ABA problem so they are better avoided unless you have some
deferred reclamation scheme like RCU. Otherwise you have to stick
with adding to a list, and emptying it completely. This is what this
patch does, by coupling a lock-free global list of available coroutines
with per-CPU lists that are actually used on coroutine creation.
Whenever the destruction pool is big enough, the next thread that runs
out of coroutines will steal the whole destruction pool. This is positive
in two ways:
1) the allocation does not have to do any atomic operation in the fast
path, it's entirely using thread-local storage. Once every POOL_BATCH_SIZE
allocations it will do a single atomic_xchg. Release does an atomic_cmpxchg
loop, that hopefully doesn't cause any starvation, and an atomic_inc.
A later patch will also remove atomic operations from the release path,
and try to avoid the atomic_xchg altogether---succeeding in doing so if
all devices either use ioeventfd or are not submitting requests actively.
2) in theory this should be completely adaptive. The number of coroutines
around should be a little more than POOL_BATCH_SIZE * number of allocating
threads; so this also empties qemu_coroutine_adjust_pool_size. (The previous
pool size was POOL_BATCH_SIZE * number of block backends, so it was a bit
more generous. But if you actually have many high-iodepth disks, it's better
to put them in different iothreads, which will also use separate thread
pools and aio=native file descriptors).
This speeds up perf/cost (in tests/test-coroutine) by a factor of ~1.33.
No matter if we end with some kind of coroutine bypass scheme or not,
it cannot hurt to optimize hot code.
Paolo Bonzini [Tue, 2 Dec 2014 11:05:47 +0000 (12:05 +0100)]
QSLIST: add lock-free operations
These operations are trivial to implement and do not have ABA problems.
They are enough to implement simple multiple-producer, single consumer
lock-free lists or, as in the next patch, the multiple consumers can
steal a whole batch of elements and process them at their leisure.
Paolo Bonzini [Tue, 2 Dec 2014 11:05:45 +0000 (12:05 +0100)]
qemu-thread: add per-thread atexit functions
Destructors are the main additional feature of pthread TLS compared
to __thread. If we were using C++ (hint, hint!) we could have used
thread-local objects with a destructor. Since we are not, instead,
we add a simple Notifier-based API.
Note that the notifier must be per-thread as well. We can add a
global list as well later, perhaps.
The Win32 implementation has some complications because a) detached
threads used not to have a QemuThreadData; b) the main thread does
not go through win32_start_routine, so we have to use atexit too.
Paolo Bonzini [Tue, 2 Dec 2014 11:05:44 +0000 (12:05 +0100)]
coroutine-ucontext: use __thread
ELF thread local storage is about 10% faster on tests/test-coroutine's
perf/cost test. The timing on my machine is 190ns per iteration with
pthread TLS, 170 with ELF TLS.
Based on a patch by Kevin Wolf and Peter Lieven, but redone to follow
the model of coroutine-win32.c (including the important "noinline"
attribute!).
Platforms without thread-local storage (OpenBSD probably?) will need
a new-enough GCC for this to compile, in order to use the same emutls
support that Windows already relies on.
Fam Zheng [Sun, 4 Jan 2015 01:53:52 +0000 (09:53 +0800)]
qemu-iotests: Add supported os parameter for python tests
If I understand correctly, qemu-iotests never meant to be portable. We
only support Linux for all the shell cases, but didn't specify it for
python tests. Now add this and default all the python tests as Linux
only. If we cares enough later, we can override the parameter in
individual cases.
Liang Li [Tue, 13 Jan 2015 02:40:53 +0000 (10:40 +0800)]
xen-pt: Fix PCI devices re-attach failed
Use the 'xl pci-attach $DomU $BDF' command to attach more than
one PCI devices to the guest, then detach the devices with
'xl pci-detach $DomU $BDF', after that, re-attach these PCI
devices again, an error message will be reported like following:
libxl: error: libxl_qmp.c:287:qmp_handle_error_response: receive
an error message from QMP server: Duplicate ID 'pci-pt-03_10.1'
for device.
If using the 'address_space_memory' as the parameter of
'memory_listener_register', 'xen_pt_region_del' will not be called
if the memory region's name is not 'xen-pci-pt-*' when the devices
is detached. This will cause the device's related QemuOpts object
not be released properly.
Using the device's address space can avoid such issue, because the
calling count of 'xen_pt_region_add' when attaching and the calling
count of 'xen_pt_region_del' when detaching is the same, so all the
memory region ref and unref by the 'xen_pt_region_add' and
'xen_pt_region_del' can be released properly.
Marc Marà [Thu, 23 Oct 2014 08:12:42 +0000 (10:12 +0200)]
libqos: Convert malloc-pc allocator to a generic allocator
The allocator in malloc-pc has been extracted, so it can be used in every arch.
This operation showed that both the alloc and free functions can be also
generic.
Because of this, the QGuestAllocator has been removed from is function to wrap
the alloc and free function, and now just contains the allocator parameters.
As a result, only the allocator initalizer and unitializer are arch dependent.
Because of wrong return value of .save_live_pending() in
migration/block.c, migration finishes before the whole disk is
transferred. Such situation occurs when the migration process is fast
enough, for example when source and dest are on the same host.
If in the bulk phase we return something < max_size, we will skip
transferring the tail of the device. Currently we have "set pending to
BLOCK_SIZE if it is zero" for bulk phase, but there no guarantee, that
it will be < max_size.
True approach is to return, for example, max_size+1 when we are in the
bulk phase.
block: fix spoiling all dirty bitmaps by mirror and migration
Mirror and migration use dirty bitmaps for their purposes, and since
commit [block: per caller dirty bitmap] they use their own bitmaps, not
the global one. But they use old functions bdrv_set_dirty and
bdrv_reset_dirty, which change all dirty bitmaps.
Named dirty bitmaps series by Fam and Snow are affected: mirroring and
migration will spoil all (not related to this mirroring or migration)
named dirty bitmaps.
This patch fixes this by adding bdrv_set_dirty_bitmap and
bdrv_reset_dirty_bitmap, which change concrete bitmap. Also, to prevent
such mistakes in future, old functions bdrv_(set,reset)_dirty are made
static, for internal block usage.
Max Reitz [Wed, 26 Nov 2014 16:20:29 +0000 (17:20 +0100)]
iotests: Add test for relative backing file names
Sometimes, qemu does not have a filename to work with, so it does not
know which directory to use for a backing file specified by a relative
filename. Add a test which tests that qemu exits with an appropriate
error message.
Additionally, add a test for qemu-img create with a backing filename
relative to the backed image's base directory while omitting the image
size.
Max Reitz [Wed, 26 Nov 2014 16:20:28 +0000 (17:20 +0100)]
block/vmdk: Relative backing file for creation
When a vmdk image is created with a backing file, it is opened to check
whether it is indeed a vmdk file by letting qemu probe it. When doing
so, the backing filename is relative to the image's base directory so it
should be interpreted accordingly.
Max Reitz [Wed, 26 Nov 2014 16:20:27 +0000 (17:20 +0100)]
block: Relative backing file for image creation
Relative backing filenames are always relative to the backed image's
directory; the same applies to image creation. Therefore, if the backing
file has to be opened for determining its size (in case the size has not
been explicitly specified) its filename should be interpreted relative
to the new image's base directory and not relative to qemu's working
directory.
Max Reitz [Wed, 26 Nov 2014 16:20:26 +0000 (17:20 +0100)]
block: JSON filenames and relative backing files
When using a relative backing file name, qemu needs to know the
directory of the top image file. For JSON filenames, such a directory
cannot be easily determined (e.g. how do you determine the directory of
a qcow2 BDS directly on top of a quorum BDS?). Therefore, do not allow
relative filenames for the backing file of BDSs only having a JSON
filename.
Furthermore, BDS::exact_filename should be used whenever possible. If
BDS::filename is not equal to BDS::exact_filename, the former will
always be a JSON object.
Max Reitz [Wed, 26 Nov 2014 16:20:25 +0000 (17:20 +0100)]
block: Get full backing filename from string
Introduce bdrv_get_full_backing_filename_from_filename(), a function
which takes the name of the backed file and a potentially relative
backing filename to produce the full (absolute) backing filename.
Use this function from bdrv_get_full_backing_filename().
Max Reitz [Wed, 26 Nov 2014 16:20:24 +0000 (17:20 +0100)]
checkpatch: Brace handling on multi-line condition
CODING_STYLE states the following about braces around blocks:
> The opening brace is on the line that contains the control flow
> statement that introduces the new block; [...]
This is obviously impossible with multi-line conditions. Therefore,
CODING_STYLE does not make any clear statement about where to put the
opening brace after a multi-line condition.
There is a reason to prefer to place the opening brace on an own line
after such a condition while still placing it on the same line as the
"control flow statement" if possible; that reason is that the last line
of a multi-line condition is indented, in the case of "if", it is often
indented by four spaces, just as much as the first statement in the
block will be indented. This is hard to read as there is no clearly
visible distinction between condition and block. Placing the opening
brace on a separate line solves this issue.
Also, there are cases where placing the opening brace on a separate line
is the only viable option; if the previous line had nearly 80 characters
and splitting it is not desirable, the opening brace is naturally placed
on an own line.
This patch fixes checkpatch.pl to not complain about braces on own lines
if the condition introducing the block spanned more than one line, or if
the previous line had 79 or 80 characters.
Furthermore, the warning about not having braces around a block is fixed
to mind braces not being on the last line of the condition.
Paolo Bonzini [Wed, 17 Dec 2014 15:10:00 +0000 (16:10 +0100)]
block: replace g_new0 with g_new for bottom half allocation.
This saves about 15% of the clock cycles spent on allocation. Using the
slice allocator does not add a visible improvement; allocation is faster
than malloc, while freeing seems to be slower.
Paolo Bonzini [Wed, 17 Dec 2014 15:09:59 +0000 (16:09 +0100)]
block: do not allocate an iovec per read of a growable/zero_after_eof BDS
Most reads do not go past the end of the file, and they can use the
input QEMUIOVector instead of creating one. This removes the
qemu_iovec_* functions from the profile.
Paolo Bonzini [Wed, 17 Dec 2014 15:09:58 +0000 (16:09 +0100)]
block: mark AioContext as recursive
AioContext can be accessed recursively, in fact that's what we do with
aio_poll. Marking the GSource as recursive avoids that GLib blocks it
and unblocks it around every call to aio_dispatch, which is a pretty
expensive operation.
Peter Maydell [Mon, 12 Jan 2015 11:13:24 +0000 (11:13 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Mon 12 Jan 2015 10:27:41 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/net-pull-request:
hw/net/xen_nic.c: Set 'netdev->mac' to NULL after free it
hw/net/xen_nic.c: Need free 'netdev->nic' in net_free() instead of net_disconnect()
hw/net/xen_nic.c: Free 'netdev->txs' when map 'netdev->rxs' fails
net: remove all cleanup methods from NIC NetClientInfos
Chen Gang [Tue, 16 Dec 2014 20:52:16 +0000 (04:52 +0800)]
hw/net/xen_nic.c: Need free 'netdev->nic' in net_free() instead of net_disconnect()
net_init() and net_free() are pairs, net_connect() and net_disconnect()
are pairs. net_init() creates 'netdev->nic', so also need free it in
net_free().
Paolo Bonzini [Tue, 23 Dec 2014 16:53:19 +0000 (17:53 +0100)]
net: remove all cleanup methods from NIC NetClientInfos
All NICs have a cleanup function that, in most cases, zeroes the pointer
to the NICState. In some cases, it frees data belonging to the NIC.
However, this function is never called except when exiting from QEMU.
It is not necessary to NULL pointers and free data here; the right place
to do that would be in the device's unrealize function, after calling
qemu_del_nic. Zeroing the NIC multiple times is also wrong for multiqueue
devices.
This cleanup function gets in the way of making the NetClientStates for
the NIC hold an object_ref reference to the object, so get rid of it.
Peter Maydell [Mon, 12 Jan 2015 10:09:41 +0000 (10:09 +0000)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150112-v3' into staging
s390x patches for 2.3.
Highlight is support for PCI devices on s390x. Otherwise, performance
improvements (register sync) and small cleanups.
# gpg: Signature made Mon 12 Jan 2015 09:49:31 GMT using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <[email protected]>"
# gpg: aka "Cornelia Huck <[email protected]>"
* remotes/cohuck/tags/s390x-20150112-v3:
kvm: extend kvm_irqchip_add_msi_route to work on s390
s390: implement pci instructions
s390: Add PCI bus support
s390x/kvm: avoid syscalls by syncing registers with kvm_run
s390x/kvm: sync register support helper function
s390x/css: Clean up unnecessary CONFIG_USER_ONLY wrappers
s390x/ccw: fix oddity in machine class init
Frank Blaschka [Fri, 9 Jan 2015 08:04:40 +0000 (09:04 +0100)]
kvm: extend kvm_irqchip_add_msi_route to work on s390
on s390 MSI-X irqs are presented as thin or adapter interrupts
for this we have to reorganize the routing entry to contain
valid information for the adapter interrupt code on s390.
To minimize impact on existing code we introduce an architecture
function to fixup the routing entry.
Frank Blaschka [Fri, 9 Jan 2015 08:04:39 +0000 (09:04 +0100)]
s390: implement pci instructions
This patch implements the s390 pci instructions in qemu. It allows
to access and drive pci devices attached to the s390 pci bus.
Because of platform constrains devices using IO BARs are not
supported. Also a device has to support MSI/MSI-X to run on s390.
Frank Blaschka [Fri, 9 Jan 2015 08:04:38 +0000 (09:04 +0100)]
s390: Add PCI bus support
This patch implements a pci bus for s390x together with infrastructure
to generate and handle hotplug events, to configure/unconfigure via
sclp instruction, to do iommu translations and provide s390 support for
MSI/MSI-X notification processing.
s390x/kvm: avoid syscalls by syncing registers with kvm_run
We can avoid loads of syscalls when dropping to user space by storing the values
of more registers directly within kvm_run.
Support is added for:
- ARCH0: CPU timer, clock comparator, TOD programmable register,
guest breaking-event register, program parameter
- PFAULT: pfault parameters (token, select, compare)
Thomas Huth [Wed, 3 Dec 2014 14:38:29 +0000 (15:38 +0100)]
s390x/css: Clean up unnecessary CONFIG_USER_ONLY wrappers
The css functions are only used from ioinst.c and other files that are
only built for CONFIG_SOFTMMU. So we do not need the dummy wrappers for
the CONFIG_USER_ONLY target in the cpu.h header.
Cornelia Huck [Wed, 3 Dec 2014 14:38:28 +0000 (15:38 +0100)]
s390x/ccw: fix oddity in machine class init
ccw_machine_class_init() uses ',' instead of ';' while initializing
the class' fields. This is almost certainly a copy/paste error and,
while legal C, rather on the unusual side. Just use ';' everywhere.
Peter Maydell [Sat, 10 Jan 2015 21:02:23 +0000 (21:02 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc: resizeable ROM blocks
This makes ROM blocks resizeable. This infrastructure is required for other
functionality we have queued.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 08 Jan 2015 11:19:24 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
* remotes/mst/tags/for_upstream:
acpi-build: make ROMs RAM blocks resizeable
memory: API to allocate resizeable RAM MR
arch_init: support resizing on incoming migration
exec: qemu_ram_alloc_resizeable, qemu_ram_resize
exec: split length -> used_length/max_length
exec: cpu_physical_memory_set/clear_dirty_range
memory: add memory_region_set_size
Peter Maydell [Sat, 10 Jan 2015 19:50:21 +0000 (19:50 +0000)]
Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2015-01-07
New year's release. This time's highlights:
- E500: More RAM support
- pseries: New SLOF release
- Migration fixes
- Simplify USB spawning logic, removes support for explicit usb=off
- TCG: Simple untansactional TM emulation
# gpg: Signature made Wed 07 Jan 2015 15:19:37 GMT using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <[email protected]>"
# gpg: aka "Alexander Graf <[email protected]>"
* remotes/agraf/tags/signed-ppc-for-upstream: (37 commits)
hw/ppc/mac_newworld: simplify usb controller creation logic
hw/ppc/spapr: simplify usb controller creation logic
hw/ppc/mac_newworld: QOMified mac99 machines
hw/usb: simplified usb_enabled
hw/machine: added machine_usb wrapper
hw/ppc: modified the condition for usb controllers to be created for some ppc machines
target-ppc: Cast ssize_t to size_t before printing with %zx
target-ppc: Mark SR() and gen_sync_exception() as !CONFIG_USER_ONLY
PPC: e500: Fix GPIO controller interrupt number
target-ppc: Introduce Privileged TM Noops
target-ppc: Introduce tcheck
target-ppc: Introduce TM Noops
target-ppc: Introduce tbegin
target-ppc: Introduce TEXASRU Bit Fields
target-ppc: Power8 Supports Transactional Memory
target-ppc: Introduce tm_enabled Bit to CPU State
target-ppc: Introduce Feature Flag for Transactional Memory
target-ppc: Introduce Instruction Type for Transactional Memory
pseries: Update SLOF firmware image to 20141202
PPC: Fix crash on spapr_tce_table_finalize()
...
Eduardo Habkost [Fri, 19 Dec 2014 02:59:47 +0000 (00:59 -0200)]
vl: Don't silently change topology when all -smp options were set
QEMU tries to change the "threads" option even if it was explicitly set
in the command-line, and it shouldn't do that.
The right thing to do when all options (cpus, sockets, cores, threds)
are explicitly set is to sanity check them and abort in case they don't
make sense (i.e. when sockets*cores*threads < cpus).
vl.c: fix regression when reading machine type from config file
After 'Machine as QOM' series the machine type input triggers
the creation of the machine class.
If the machine type is set in the configuration file, the machine
class is not updated accordingly and remains the default.
Fixed that by querying the machine options after the configuration
file is loaded.
Peter Maydell [Fri, 9 Jan 2015 16:29:36 +0000 (16:29 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
More migration fixes and more record/replay preparations. Also moves
the sdhci-pci device id to make space for the rocker device.
# gpg: Signature made Sat 03 Jan 2015 08:22:36 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
pci: move REDHAT_SDHCI device ID to make room for Rocker
block/iscsi: fix uninitialized variable
pckbd: set bits 2-3-6-7 of the output port by default
serial: refine serial_thr_ipending_needed
gen-icount: check cflags instead of use_icount global
translate: check cflags instead of use_icount global
cpu-exec: add a new CF_USE_ICOUNT cflag
target-ppc: pass DisasContext to SPR generator functions
atomic: fix position of volatile qualifier
Alex Williamson [Fri, 9 Jan 2015 15:50:53 +0000 (08:50 -0700)]
vfio-pci: Fix interrupt disabling
When disabling MSI/X interrupts the disable functions will leave the
device in INTx mode (when available). This matches how hardware
operates, INTx is enabled unless MSI/X is enabled (DisINTx is handled
separately). Therefore when we really want to disable all interrupts,
such as when removing the device, and we start with the device in
MSI/X mode, we need to pass through INTx on our way to being
completely quiesced.
In well behaved situations, the guest driver will have shutdown the
device and it will start vfio_exitfn() in INTx mode, producing the
desired result. If hot-unplug causes the guest to crash, we may get
the device in MSI/X state, which will leave QEMU with a bogus handler
installed.
Fix this by re-ordering our disable routine so that it should always
finish in VFIO_INT_NONE state, which is what all callers expect.
Alex Williamson [Fri, 9 Jan 2015 15:50:53 +0000 (08:50 -0700)]
vfio-pci: Fix BAR size overflow
We use an unsigned int when working with the PCI BAR size, which can
obviously overflow if the BAR is 4GB or larger. This needs to change
to a fixed length uint64_t. A similar issue is possible, though even
more unlikely, when mapping the region above an MSI-X table. The
start of the MSI-X vector table must be below 4GB, but the end, and
therefore the start of the next mapping region, could still land at
4GB.
hw/ppc: modified the condition for usb controllers to be created for some ppc machines
Some ppc machines create a default usb controller based on a 'machine condition'.
Until now the logic was: create the usb controller if:
- the usb option was supplied in cli and value is true or
- the usb option was absent and both set_defaults and the machine
condition were true.
Modified the logic to:
Create the usb controller if:
- the machine condition is true and defaults are enabled or
- the usb option is supplied and true.
The main for this is to simplify the usb_enabled method.
Use resizeable ram API so we can painlessly extend ROMs in the
future. Note: migration is not affected, as we are
not actually changing the used length for RAM, which
is the part that's migrated.
This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.
This used_length size can change across reboots.
Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.
Device is notified on resize, so it can adjust if necessary.
Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.
Add API to allocate "resizeable" RAM.
This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.
This used_length size can change across reboots.
Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.
Device is notified on resize, so it can adjust if necessary.
qemu_ram_alloc_resizeable allocates this memory, qemu_ram_resize resizes
it.
Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.
This patch allows us to distinguish between two
length values for each block:
max_length - length of memory block that was allocated
used_length - length of block used by QEMU/guest
Currently, we set used_length - max_length, unconditionally.
Follow-up patches allow used_length <= max_length.
hw/ppc: modified the condition for usb controllers to be created for some ppc machines
Some ppc machines create a default usb controller based on a 'machine condition'.
Until now the logic was: create the usb controller if:
- the usb option was supplied in cli and value is true or
- the usb option was absent and both set_defaults and the machine
condition were true.
Modified the logic to:
Create the usb controller if:
- the machine condition is true and defaults are enabled or
- the usb option is supplied and true.
The main for this is to simplify the usb_enabled method.