Some guests do a large number of mask/unmask
calls which currently trigger expensive route update
system calls.
Detect that route in unchanged and skip the system call.
kvm_add_routing_entry makes an attempt to
zero-initialize any new routing entry.
However, it fails to initialize padding
within the u field of the structure
kvm_irq_routing_entry.
Other functions like kvm_irqchip_update_msi_route
also fail to initialize the padding field in
kvm_irq_routing_entry.
It's better to just make sure all input is initialized.
Once it is, we can also drop complex field by field assignment and just
do the simple *a = *b to update a route entry.
Amos Kong [Wed, 22 May 2013 04:57:35 +0000 (12:57 +0800)]
kvm: add detail error message when fail to add ioeventfd
I try to hotplug 28 * 8 multiple-function devices to guest with
old host kernel, ioeventfds in host kernel will be exhausted, then
qemu fails to allocate ioeventfds for blk/nic devices.
Anthony Liguori [Mon, 1 Jul 2013 14:03:04 +0000 (09:03 -0500)]
Merge remote-tracking branch 'agraf/ppc-for-upstream' into staging
# By Alexander Graf (12) and others
# Via Alexander Graf
* agraf/ppc-for-upstream: (32 commits)
PPC: Ignore writes to L2CR
mac-io: Add escc-legacy memory alias region
PPC: Newworld: Add second uninorth control register set
PPC: Newworld: Add uninorth token register
PPC: Add clock-frequency export for Mac machines
PPC: Introduce an alias cache for faster lookups
PPC: Fix GDB read on code area for PPC6xx
PPC: Add dump_mmu() for 6xx
target-ppc: Introduce unrealizefn for PowerPCCPU
booke_ppc: limit booke timer to max when timeout overflow
Graphics: Switch to 800x600x32 as default mode
pseries: Update MAINTAINERS information
target-ppc kvm: save cr register
pseries: Fix compiler warning (conversion of pointer to integral value)
spapr-rtas: add CPU argument to RTAS calls
target-ppc: Change default machine for 64-bit
ppc: do not register IABR SPR twice for 603e
target-ppc: Drop redundant flags assignments from CPU families
mpc8544_guts: Turn qdev initfn into instance_init
mpc8544_guts: QOM'ify
...
Alexander Graf [Wed, 26 Jun 2013 22:31:42 +0000 (00:31 +0200)]
PPC: Ignore writes to L2CR
The L2CR register contains a number of bits that either impose configuration
which we can't deal with or mean "something is in progress until the bit is
0 again".
Since we don't model the former and we do want to accomodate guests using the
latter semantics, let's just ignore writes to L2CR. That way guests always read
back 0 and are usually happy with that.
Alexander Graf [Wed, 26 Jun 2013 11:58:31 +0000 (13:58 +0200)]
mac-io: Add escc-legacy memory alias region
Mac OS X's debugging serial driver accesses the ESCC through a different
register layout, called "escc-legacy". This layout differs from the normal
escc register layout purely by the location of the respective registers.
This patch adds a memory alias region that takes normal escc registers and
maps them into the escc-legacy register space.
With this patch applied, a Mac OS X guest successfully emits debug output
on the serial port when run with debug parameters set, for example by running:
Alexander Graf [Sat, 22 Jun 2013 01:53:35 +0000 (03:53 +0200)]
PPC: Introduce an alias cache for faster lookups
When running QEMU with "-cpu ?" we walk through every alias for every
target CPU we know about. This takes several seconds on my very fast
host system.
Let's introduce a class object cache in the alias table. Using that we
don't have to go through the tedious work of finding our target class.
Instead, we can just go directly from the alias name to the target class
pointer.
This patch brings -cpu "?" to reasonable times again.
Bharat Bhushan [Wed, 12 Jun 2013 12:30:50 +0000 (18:00 +0530)]
booke_ppc: limit booke timer to max when timeout overflow
Limit watchdog and fit timer to maximum timeout value which
qemu timer can support (INT64_MAX). This maximum timeout will be
hundreds of years, so limiting to max timeout is pretty safe.
David Gibson [Sat, 15 Jun 2013 01:51:52 +0000 (11:51 +1000)]
pseries: Update MAINTAINERS information
I'm no longer at IBM, and therefore no long actively working on the pseries
(aka sPAPR) qemu machine type. This patch removes my information in the
MAINTAINERS file.
While we're at it, I've added some extra file patterns for pseries specific
files that weren't included in the existing pattern.
Signed-off-by: David Gibson <[email protected]>
[agraf: Remove new maintainer addition] Signed-off-by: Alexander Graf <[email protected]>
Anthony Liguori [Wed, 19 Jun 2013 20:40:30 +0000 (15:40 -0500)]
spapr-rtas: add CPU argument to RTAS calls
RTAS is a hypervisor provided binary blob that a guest loads and
calls into to execute certain functions. It's similar to the
vsyscall page in Linux or the short lived VMCI paravirt interface
from VMware.
The QEMU implementation of the RTAS blob is simply a passthrough
that proxies all RTAS calls to the hypervisor via an hypercall.
While we pass a CPU argument for hypercall handling in QEMU, we
don't pass it for RTAS calls. Since some RTAs calls require
making hypercalls (normally RTAS is implemented as guest code) we
have nasty hacks to allow that.
Add a CPU argument to RTAS call handling so we can more easily
invoke hypercalls just as guest code would.
David Gibson [Sat, 15 Jun 2013 01:51:50 +0000 (11:51 +1000)]
target-ppc: Change default machine for 64-bit
Currently, for qemu-system-ppc64, the default machine type is 'mac99'.
The mac99 machine is not being actively maintained, and represents a
bizarre hybrid of components that never actually existed as a real system.
This patch changes the default machine to 'pseries', which is actively
maintained and works well with most modern ppc64 Linux distributions as a
guest.
Scott Wood [Wed, 12 Jun 2013 20:32:51 +0000 (15:32 -0500)]
kvm/openpic: in-kernel mpic support
Enables support for the in-kernel MPIC that thas been merged into the
KVM next branch. This includes irqfd/KVM_IRQ_LINE support from Alex
Graf (along with some other improvements).
Note from Alex regarding kvm_irqchip_create():
On x86, one would call kvm_irqchip_create() to initialize an
in-kernel interrupt controller. That function then goes ahead and
initializes global capability variables as well as the default irq
routing table.
On ppc, we can't call kvm_irqchip_create() because we can have
different types of interrupt controllers. So we want to do all the
things that function would do for us in the in-kernel device init
handler.
Signed-off-by: Scott Wood <[email protected]>
[agraf: squash in kvm_irqchip_commit_routes patch, fix non-kvm build,
fix ppcemb] Signed-off-by: Alexander Graf <[email protected]>
Alexander Graf [Fri, 28 Jun 2013 11:47:15 +0000 (13:47 +0200)]
PPC: Add non-kvm stub file
There are cases where a kvm provided function is called from generic
hw code that doesn't know whether kvm is available or not. Provide
a stub file which can provide simple replacement functions for those
cases.
Alexander Graf [Tue, 16 Apr 2013 23:11:55 +0000 (01:11 +0200)]
KVM: PIC: Only commit irq routing when necessary
The current logic updates KVM's view of our interrupt map every time we
change it. While this is nice and bullet proof, it slows things down
badly for me. QEMU spends about 3 seconds on every start telling KVM what
news it has on its routing maps.
Instead, let's just synchronize the whole irq routing map as a whole when
we're done constructing it. For things that change during runtime, we can
still update the routing table on demand.
Alexander Graf [Tue, 16 Apr 2013 13:05:22 +0000 (15:05 +0200)]
KVM: MSI: Swap payload to native endianness
The usual MSI injection mechanism writes msi.data into memory using an
le32 wrapper. So on big endian guests, this swaps msg.data into the
expected byte order.
For irqfd however, we don't swap the payload right now, rendering
in-kernel MPIC emulation broken on PowerPC.
Swap msg.data to the correct endianness whenever we touch it.
Anthony Liguori [Fri, 28 Jun 2013 20:48:35 +0000 (15:48 -0500)]
Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Gerd Hoffmann (13) and Michael Tokarev (1)
# Via Michael Tokarev
* mjt/trivial-patches:
doc: we use seabios, not bochs bios
qemu-socket: don't leak opts on error
qemu-char: report udp backend errors
qemu-char: add -chardev mux support
qemu-char: minor mux chardev fixes
qemu-char: use ChardevBackendKind in CharDriver
qemu-char: don't leak opts on error
qemu-char: fix documentation for telnet+wait socket flags
qemu-char: print notification to stderr
qemu-char: use more specific error_setg_* variants
qemu-char: check optional fields using has_*
qemu-socket: catch monitor_get_fd failures
qemu-socket: drop pointless allocation
qemu-socket: zero-initialize SocketAddress
Kevin Wolf [Wed, 19 Jun 2013 14:10:55 +0000 (16:10 +0200)]
hmp: Make "info block" output more readable
HMP is meant for humans and you should notice it.
This changes the output format to use a bit more space to display the
information more readable and leaves out irrelevant information (e.g.
mention only that an image is encrypted, but not when it's not; display
I/O limits only if throttling is in effect; ...)
qemu-char: Fix ID reuse after chardev-remove for qapi-based init
Commit 2c5f488 introduced qapi-based character device initialization
as a new code path in qemu_chr_new_from_opts(). Unfortunately, it
failed to store parameter opts in the new chardev. Therefore,
qemu_chr_delete() doesn't delete it. Even though the device is gone,
its options linger, and any attempt to create another one with the
same ID fails.
Gerd Hoffmann [Tue, 25 Jun 2013 08:48:54 +0000 (10:48 +0200)]
gtk: add support for surface conversion
Also use CAIRO_FORMAT_RGB24 unconditionally. DisplaySurfaces will never
ever see 8bpp surfaces. And using CAIRO_FORMAT_RGB16_565 for the 16bpp
case doesn't seem to be a good idea too.
<quote src="/usr/include/cairo/cairo.h">
* @CAIRO_FORMAT_RGB16_565: This format value is deprecated. It has
* never been properly implemented in cairo and should not be used
* by applications. (since 1.2)
</quote>
Kevin Wolf [Sun, 23 Jun 2013 20:07:45 +0000 (22:07 +0200)]
multiboot: Calculate upper_mem in the ROM
The upper_mem field of the Multiboot information struct doesn't really
contain the RAM size - 1 MB like we used to calculate it, but only the
memory from 1 MB up to the first (upper) memory hole.
In order to correctly retrieve this information, the multiboot ROM now
looks at the mmap it creates anyway and tries to find the size of
contiguous usable memory from 1 MB.
Drop the multiboot.c definition of lower_mem and upper_mem because both
are queried at runtime now.
Kevin Wolf [Sun, 23 Jun 2013 20:07:44 +0000 (22:07 +0200)]
multiboot: Don't forget last mmap entry
When the BIOS returns ebx = 0, the current entry is still valid and
needs to be included in the Multiboot memory map.
Fixing this meant that using bx as the entry index doesn't work any
more because it's 0 on the last entry (and it was SeaBIOS-specific
anyway), so the whole loop had to change a bit and should be more
generic as a result (ebx can be an arbitrary continuation number now,
and the entry size returned by the BIOS is used instead of hard-coding
20 bytes).
Stefan Weil [Thu, 27 Jun 2013 19:00:06 +0000 (21:00 +0200)]
arch_init: Fix format string by using RAM_ADDR_FMT
length is a ram_addr_t, so RAM_ADDR_FMT must be used instead of %ld.
This fixes a recently introduced regression for w64 builds.
Using RAM_ADDR_FMT also changes decimal output to sedecimal.
This is good here because length and block->length should both
use the same base in the error message.
Gerd Hoffmann [Mon, 24 Jun 2013 06:39:53 +0000 (08:39 +0200)]
qemu-char: minor mux chardev fixes
mux failure path has a memory leak. creating a mux chardev can't
fail though, so just assert() that instead of fixing an error path
which never ever runs anyway ...
Anthony Liguori [Fri, 28 Jun 2013 15:37:33 +0000 (10:37 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Stefan Hajnoczi (11) and others
# Via Kevin Wolf
* kwolf/for-anthony:
cmd646: fix build when DEBUG_IDE is enabled.
block: change default of .has_zero_init to 0
vpc: Implement .bdrv_has_zero_init
vmdk: remove wrong calculation of relative path
gluster: Return bdrv_has_zero_init = 0
block/ssh: Set bdrv_has_zero_init according to the file type.
block: Make BlockJobTypes const
qemu-iotests: add 055 drive-backup test case
qemu-iotests: extract wait_until_completed() into iotests.py
blockdev: add Abort transaction
blockdev: add DriveBackup transaction
blockdev: allow BdrvActionOps->commit() to be NULL
blockdev: rename BlkTransactionStates to singular
block: add drive-backup QMP command
blockdev: use bdrv_getlength() in qmp_drive_mirror()
blockdev: drop redundant proto_drv check
block: add basic backup support to block driver
block: add bdrv_add_before_write_notifier()
notify: add NotiferWithReturn so notifier list can abort
raw-posix: Fix /dev/cdrom magic on OS X
Peter Lieven [Fri, 28 Jun 2013 10:47:42 +0000 (12:47 +0200)]
block: change default of .has_zero_init to 0
.has_zero_init defaults to 1 for all formats and protocols.
this is a dangerous default since this means that all
new added drivers need to manually overwrite it to 0 if
they do not ensure that a device is zero initialized
after bdrv_create().
if a driver needs to explicitly set this value to
1 its easier to verify the correctness in the review process.
during review of the existing drivers it turned out
that ssh and gluster had a wrong default of 1.
both protocols support host_devices as backend
which are not by default zero initialized. this
wrong assumption will lead to possible corruption
if qemu-img convert is used to write to such a backend.
vpc and vmdk also defaulted to 1 altough they support
fixed respectively flat extends. this has to be addresses
in separate patches. both formats as well as the mentioned
ssh and gluster are turned to the default of 0 with this
patch for safety.
a similar problem with the wrong default existed for
iscsi most likely because the driver developer did
oversee the default value of 1.
Andreas Färber [Sat, 2 Feb 2013 12:59:05 +0000 (13:59 +0100)]
target-openrisc: Register VMStateDescription for OpenRISCCPU
Since commit e67db06e9f6d7e514ee2a9b9b769ecd42977f6fb (target-or32: Add
target stubs and QOM cpu) a VMStateDescription existed, but
CPU_SAVE_VERSION was not set, so it was never registered.
Drop cpu_{save,load}() and register VMStateDescription via DeviceState.
Use a version_id of 1 and specify minimum versions as well.
Andreas Färber [Sun, 20 Jan 2013 23:27:16 +0000 (00:27 +0100)]
target-alpha: Register VMStateDescription for AlphaCPU
Commit b758aca1f6cdb175634812b79f5560c36c902d00 (target-alpha: Enable
the alpha-softmmu target.) introduced cpu_{save,load}() functions but
didn't define CPU_SAVE_VERSION, so they were never registered.
Drop cpu_{save,load}() and register the VMStateDescription via DeviceClass.
This operates on the AlphaCPU object instead of CPUAlphaState.
Andreas Färber [Tue, 18 Jun 2013 00:23:36 +0000 (02:23 +0200)]
cpu: Introduce device_class_set_vmsd() helper
It's the equivalent to cpu_class_set_vmsd(), to assign
DeviceClass::vmsd. It wasn't needed before since only static,
unmigratable VMStateDescriptions were assigned so far.
Kevin Wolf [Fri, 28 Jun 2013 08:21:00 +0000 (10:21 +0200)]
vpc: Implement .bdrv_has_zero_init
Depending on the subformat, has_zero_init on VHD must behave like raw
and query the underlying storage (fixed) or like other sparse formats
that can always return 1 (dynamic, differencing).
Fam Zheng [Wed, 26 Jun 2013 09:24:32 +0000 (17:24 +0800)]
vmdk: remove wrong calculation of relative path
When creating image with backing file, the driver tries to calculate the
relative path from created image file to backing file, but the path
computation is incorrect. e.g.:
The common part in file names, "vmdk-data-", is incorrectly forgotten by
relative_path(). As the VMDK specification has no restriction on
parentNameHint to be relative path, we simply remove this by using the
backing_file option.
Kevin Wolf [Wed, 26 Jun 2013 07:41:57 +0000 (09:41 +0200)]
gluster: Return bdrv_has_zero_init = 0
GlusterFS volumes can be backed by block devices, in which case
bdrv_create() doesn't make sure that the image is zeroed out. It is
currently not possibly to detect whether a given image is backed by a
file or a block device, and incorrectly assuming that it is zeroed
corrupts images during qemu-img convert, so let's err on the side of
caution and always return 0.
block/ssh: Set bdrv_has_zero_init according to the file type.
If the remote is a regular file, set it to true (ie. reads of
uninitialized areas in a newly created file will return zeroes).
If we can't prove that, return false (a safe default).
Tested by adding a debugging print statement [not part of this commit]
and creating a remote file and a remote block device:
Stefan Hajnoczi [Mon, 24 Jun 2013 15:13:19 +0000 (17:13 +0200)]
qemu-iotests: extract wait_until_completed() into iotests.py
The 'drive-mirror' tests often issue 'block-job-complete' and wait for
the QMP completion event. Other types of block jobs also want to wait
for completion but they may not need to issue 'block-job-complete'.
Extract wait_until_completed() from 041 and put it into iotests.py.
Return the QMP event object so the caller can make additional
assertions, if necessary.
Stefan Hajnoczi [Mon, 24 Jun 2013 15:13:18 +0000 (17:13 +0200)]
blockdev: add Abort transaction
The Abort action can be used to test QMP 'transaction' failure. Add it
as the last action to exercise the .abort() and .cleanup() code paths
for all previous actions.
Stefan Hajnoczi [Mon, 24 Jun 2013 15:13:17 +0000 (17:13 +0200)]
blockdev: add DriveBackup transaction
This patch adds a transactional version of the drive-backup QMP command.
It allows atomic snapshots of multiple drives along with automatic
cleanup if there is a failure to start one of the backup jobs.
Note that QMP events are emitted for block job completion/cancellation
and the block job will be listed by query-block-jobs.
@device: the name of the device whose writes should be mirrored.
@target: the target of the new image. If the file exists, or if it
is a device, the existing file/device will be used as the new
destination. If it does not exist, a new file will be created.
@format: #optional the format of the new destination, default is to
probe if @mode is 'existing', else the format of the source
@mode: #optional whether and how QEMU should create a new image, default is
'absolute-paths'.
@speed: #optional the maximum speed, in bytes per second
@on-source-error: #optional the action to take on an error on the source,
default 'report'. 'stop' and 'enospc' can only be used
if the block device supports io-status (see BlockInfo).
@on-target-error: #optional the action to take on an error on the target,
default 'report' (no limitations, since this applies to
a different block device than @device).