Extract the mish block decoder such that this can be used for other
formats in the future. A new DmgHeaderState struct is introduced to
share state while decoding.
The code is kept unchanged as much as possible, a "fail" label is added
for example where a simple return would probably do. In dmg_open, the
variable "tmp" is renamed to "rsrc_data_offset" for clarity and comments
have been added explaining various data.
Note that this patch has one subtle difference with the previous
version which should not affect functionality. In the previous code,
the end of a resource was inferred from the mish block (the offsets
would be increased by the fields). In this patch, the resource length
is used instead to avoid the need to rely on the previous offsets.
Peter Wu [Tue, 6 Jan 2015 17:48:04 +0000 (18:48 +0100)]
block/dmg: properly detect the UDIF trailer
DMG files have a variable length with a UDIF trailer at the end of a
file. This UDIF trailer is essential as it describes the contents of
the image. At the moment however, the start of this trailer is almost
always incorrect as bdrv_getlength() returns a multiple of the block
size (rounded up). This results in a failure to recognize DMG files,
resulting in Invalid argument (EINVAL) errors.
As there is no API to retrieve the real file size, look for the magic
header in the last two sectors to find the start of this 512-byte UDIF
trailer (the "koly" block).
The resource fork offset ("info_begin") has its offset adjusted as the
initial value of offset does not mean "end of file" anymore, but "begin
of UDIF trailer".
[Replaced error_set(errp, ERROR_CLASS_GENERIC_ERROR, ...) with
error_setg(errp, ...) as discussed with Peter.
--Stefan]
Francesco Romani [Mon, 12 Jan 2015 13:11:13 +0000 (14:11 +0100)]
block: add event when disk usage exceeds threshold
Managing applications, like oVirt (http://www.ovirt.org), make extensive
use of thin-provisioned disk images.
To let the guest run smoothly and be not unnecessarily paused, oVirt sets
a disk usage threshold (so called 'high water mark') based on the occupation
of the device, and automatically extends the image once the threshold
is reached or exceeded.
In order to detect the crossing of the threshold, oVirt has no choice but
aggressively polling the QEMU monitor using the query-blockstats command.
This lead to unnecessary system load, and is made even worse under scale:
deployments with hundreds of VMs are no longer rare.
To fix this, this patch adds:
* A new monitor command `block-set-write-threshold', to set a mark for
a given block device.
* A new event `BLOCK_WRITE_THRESHOLD', to report if a block device
usage exceeds the threshold.
* A new `write_threshold' field into the `BlockDeviceInfo' structure,
to report the configured threshold.
This will allow the managing application to use smarter and more
efficient monitoring, greatly reducing the need of polling.
[Updated qemu-iotests 067 output to add the new 'write_threshold'
property. --Stefan]
[Changed g_assert_false() to !g_assert() to fix the build on older glib
versions. --Kevin]
Fam Zheng [Fri, 16 Jan 2015 01:38:42 +0000 (09:38 +0800)]
qemu-iotests: Fix supported_oses check
There is a bug in the recently added sys.platform test, and we no longer
run python tests, because "linux2" is the value to compare here. So do a
prefix match. According to python doc [1], the way to use sys.platform
is "unless you want to test for a specific system version, it is
therefore recommended to use the following idiom":
Peter Lieven [Mon, 2 Feb 2015 13:52:21 +0000 (14:52 +0100)]
virtio-blk: introduce multiread
this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.
The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.
The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.
Peter Lieven [Mon, 2 Feb 2015 14:48:34 +0000 (15:48 +0100)]
block: change default for discard and write zeroes to INT_MAX
do not trim requests if the driver does not supply a limit
through BlockLimits. For write zeroes we still keep a limit
for the unsupported path to avoid allocating a big bounce buffer.
Denis V. Lunev [Fri, 30 Jan 2015 08:42:16 +0000 (11:42 +0300)]
block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes
This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.
Unfortunately, FALLOC_FL_ZERO_RANGE is supported on really modern systems
and only for a couple of filesystems. FALLOC_FL_PUNCH_HOLE is much more
mature.
The sequence of 2 operations FALLOC_FL_PUNCH_HOLE and 0 is necessary due
to the following reasons:
- FALLOC_FL_PUNCH_HOLE creates a hole in the file, the file becomes
sparse. In order to retain original functionality we must allocate
disk space afterwards. This is done using fallocate(0) call
- fallocate(0) without preceeding FALLOC_FL_PUNCH_HOLE will do nothing
if called above already allocated areas of the file, i.e. the content
will not be zeroed
This should increase the performance a bit for not-so-modern kernels.
Denis V. Lunev [Fri, 30 Jan 2015 08:42:15 +0000 (11:42 +0300)]
block/raw-posix: call plain fallocate in handle_aiocb_write_zeroes
There is a possibility that we are extending our image and thus writing
zeroes beyond the end of the file. In this case we do not need to care
about the hole to make sure that there is no data in the file under
this offset (pre-condition to fallocate(0) to work). We could simply call
fallocate(0).
This improves the performance of writing zeroes even on really old
platforms which do not have even FALLOC_FL_PUNCH_HOLE.
Before the patch do_fallocate was used when either
CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
Now the story is different. CONFIG_FALLOCATE is defined when Linux
fallocate is defined, posix_fallocate is completely different story
(CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is mandatory prerequite
for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
thus we are on the safe side.
Denis V. Lunev [Fri, 30 Jan 2015 08:42:14 +0000 (11:42 +0300)]
block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes
This efficiently writes zeroes on Linux if the kernel is capable enough.
FALLOC_FL_ZERO_RANGE correctly handles all cases, including and not
including file expansion.
Denis V. Lunev [Fri, 30 Jan 2015 08:42:13 +0000 (11:42 +0300)]
block/raw-posix: refactor handle_aiocb_write_zeroes a bit
move code dealing with a block device to a separate function. This will
allow to implement additional processing for ordinary files.
Please note, that xfs_code has been moved before checking for
s->has_write_zeroes as xfs_write_zeroes does not touch this flag inside.
This makes code a bit more consistent.
Denis V. Lunev [Fri, 30 Jan 2015 08:42:12 +0000 (11:42 +0300)]
block/raw-posix: create do_fallocate helper
The pattern
do {
if (fallocate(s->fd, mode, offset, len) == 0) {
return 0;
}
} while (errno == EINTR);
ret = translate_err(-errno);
will be commonly useful in next patches. Create helper for it.
Denis V. Lunev [Fri, 30 Jan 2015 08:42:11 +0000 (11:42 +0300)]
block/raw-posix: create translate_err helper to merge errno values
actually the code
if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
ret == -ENOTTY) {
ret = -ENOTSUP;
}
is present twice and will be added a couple more times. Create helper
for this.
atapi migration: Throw recoverable error to avoid recovery
(With the previous atapi_dma flag recovery)
If migration happens between the ATAPI command being written and the
bmdma being started, the DMA is dropped. Eventually the guest times
out and recovers, but that can take many seconds.
(This is rare, on a pingpong reading the CD continuously I hit
this about ~1/30-1/50 migrates)
I don't think we've got enough state to be able to recover safely
at this point, so I throw a 'medium error, no seek complete'
that I'm assuming guests will try and recover from an apparently
dirty CD.
OK, it's a hack, the real solution is probably to push a lot of
ATAPI state into the migration stream, but this is a fix that
works with no stream changes. Tested only on Linux (both RHEL5
(pre-libata) and RHEL7).
If a migration happens just after the guest has kicked
off an ATAPI command and kicked off DMA, we lose the atapi_dma
flag, and the destination tries to complete the command as PIO
rather than DMA. This upsets Linux; modern libata based kernels
stumble and recover OK, older kernels end up passing bad data
to userspace.
Peter Maydell [Fri, 6 Feb 2015 14:35:52 +0000 (14:35 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
# gpg: Signature made Fri 06 Feb 2015 14:10:40 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/net-pull-request:
monitor: more accurate completion for host_net_remove()
net: del hub port when peer is deleted
net: remove the wrong comment in net_init_hubport()
monitor: print hub port name during info network
rtl8139: simplify timer logic
MAINTAINERS: add Jason Wang as net subsystem maintainer
Paolo Bonzini [Tue, 20 Jan 2015 14:44:59 +0000 (15:44 +0100)]
rtl8139: simplify timer logic
Pavel Dovgalyuk reports that TimerExpire and the timer are not restored
correctly on the receiving end of migration.
It is not clear to me whether this is really the case, but we can take
the occasion to get rid of the complicated code that computes PCSTimeout
on the fly upon changes to IntrStatus/IntrMask. Just always keep a
timer running, it will fire every ~130 seconds at most if the interrupt
is masked with TimerInt != 0.
This makes rtl8139_set_next_tctr_time idempotent (when the virtual clock
is stopped between two calls, as is the case during migration).
Peter Maydell [Thu, 5 Feb 2015 16:40:00 +0000 (16:40 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-cov-model-2015-02-05' into staging
coverity: Improve and extend model
# gpg: Signature made Thu 05 Feb 2015 16:20:49 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
* remotes/armbru/tags/pull-cov-model-2015-02-05:
MAINTAINERS: Add myself as Coverity model maintainer
coverity: Model g_free() isn't necessarily free()
coverity: Model GLib string allocation partially
coverity: Improve model for GLib memory allocation
Alexander Graf [Thu, 22 Jan 2015 14:01:40 +0000 (15:01 +0100)]
Add migration stream analyzation script
This patch adds a python tool to the scripts directory that can read
a dumped migration stream if it contains the JSON description of the
device states. I constructs a human readable JSON stream out of it.
Alexander Graf [Thu, 22 Jan 2015 14:01:39 +0000 (15:01 +0100)]
migration: Append JSON description of migration stream
One of the annoyances of the current migration format is the fact that
it's not self-describing. In fact, it's not properly describing at all.
Some code randomly scattered throughout QEMU elaborates roughly how to
read and write a stream of bytes.
We discussed an idea during KVM Forum 2013 to add a JSON description of
the migration protocol itself to the migration stream. This patch
adds a section after the VM_END migration end marker that contains
description data on what the device sections of the stream are composed of.
This approach is backwards compatible with any QEMU version reading the
stream, because QEMU just stops reading after the VM_END marker and ignores
any data following it.
With an additional external program this allows us to decipher the
contents of any migration stream and hopefully make migration bugs easier
to track down.
Alexander Graf [Thu, 22 Jan 2015 14:01:38 +0000 (15:01 +0100)]
qemu-file: Add fast ftell code path
For ftell we flush the output buffer to ensure that we don't have anything
lingering in our internal buffers. This is a very safe thing to do.
However, with the dynamic size measurement that the dynamic vmstate
description will bring this would turn out quite slow.
Instead, we can fast path this specific measurement and just take the
internal buffers into account when telling the kernel our position.
I'm sure I overlooked some corner cases where this doesn't work, so
instead of tuning the safe, existing version, this patch adds a fast
variant of ftell that gets used by the dynamic vmstate description code
which isn't critical when it fails.
Alexander Graf [Thu, 22 Jan 2015 14:01:37 +0000 (15:01 +0100)]
QJSON: Add JSON writer
To support programmatic JSON assembly while keeping the code that generates it
readable, this patch introduces a simple JSON writer. It emits JSON serially
into a buffer in memory.
The nice thing about this writer is its simplicity and low memory overhead.
Unlike the QMP JSON writer, this one does not need to spawn QObjects for every
element it wants to represent.
This is a prerequisite for the migration stream format description generator.
Amit Shah [Wed, 21 Jan 2015 13:05:33 +0000 (18:35 +0530)]
vmstate-static-checker: update whitelist
Commit 22382bb96c8bd88370c1ff0cb28c3ee6bee79ed3 renamed the
'hw_cursor_x' and 'hw_cursor_y' fields in cirrus_vga. Update the static
checker's whitelist to allow matching against the old and new names.
Memory allocated with GLib needs to be freed with GLib. Freeing it
with free() instead of g_free() is a common error. Harmless when
g_free() is a trivial wrapper around free(), which is commonly the
case. But model the difference anyway.
In a local scan, this flags four ALLOC_FREE_MISMATCH. Requires
--enable ALLOC_FREE_MISMATCH, because the checker is still preview.
Without a model, Coverity can't know that the result of g_strdup()
needs to be fed to g_free().
One way to get such a model is to scan GLib, build a derived model
file with cov-collect-models, and use that when scanning QEMU.
Unfortunately, the Coverity Scan service we use doesn't support that.
Thus, we're stuck with the other way: write a user model. Doing that
for all of GLib is hardly practical. I'm doing it for the "String
Utility Functions" we actually use that return dynamically allocated
strings.
In a local scan, this flags 20 additional RESOURCE_LEAKs. The ones I
checked look genuine.
It also loses a NULL_RETURNS about ppce500_init() using
qemu_find_file() without error checking. I don't understand why.
coverity: Improve model for GLib memory allocation
In current versions of GLib, g_new() may expand into g_malloc_n().
When it does, Coverity can't see the memory allocation, because we
don't model g_malloc_n(). Similarly for g_new0(), g_renew(),
g_try_new(), g_try_new0(), g_try_renew().
Model g_malloc_n(), g_malloc0_n(), g_realloc_n(). Model
g_try_malloc_n(), g_try_malloc0_n(), g_try_realloc_n() by adding
indeterminate out of memory conditions on top.
To avoid undue duplication, replace the existing models for g_malloc()
& friends by trivial wrappers around g_malloc_n() & friends.
In a local scan, this flags four additional RESOURCE_LEAKs and one
NULL_RETURNS.
The NULL_RETURNS is a false positive: Coverity can now see that
g_try_malloc(l1_sz * sizeof(uint64_t)) in
qcow2_check_metadata_overlap() may return NULL, but is too stupid to
recognize that a loop executing l1_sz times won't be entered then.
Three out of the four RESOURCE_LEAKs appear genuine. The false
positive is in ppce500_prep_device_tree(): the pointer dies, but a
pointer to a struct member escapes, and we get the pointer back for
freeing with container_of(). Too funky for Coverity.
Peter Maydell [Thu, 5 Feb 2015 14:22:51 +0000 (14:22 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20150205' into staging
target-arm queue:
* refactor/clean up armv7m_init()
* some initial cleanup in the direction of supporting 64-bit EL3
* fix broken synchronization of registers between QEMU and KVM
for 32-bit ARM hosts (which among other things broke memory
access via gdbstub)
* fix flush-to-zero handling in FMULX, FRECPS, FRSQRTS and FRECPE
* don't crash QEMU for UNPREDICTABLE BFI insns in A32 encoding
* explain why virt board's device-to-transport mapping code is
the way it is
* implement mmu_idx values which match the architectural
distinctions, and introduce the concept of a translation
regime to get_phys_addr() rather than incorrectly looking
at the current CPU state
* update to upstream VIXL 1.7 (gives us correct code addresses
when dissassembling pc-relative references)
* sync system register state between KVM and QEMU for 64-bit ARM
* support virtio on big-endian guests by implementing the
"which endian is the guest now?" CPU method
# gpg: Signature made Thu 05 Feb 2015 14:02:16 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
* remotes/pmaydell/tags/pull-target-arm-20150205: (28 commits)
target-arm: fix for exponent comparison in recpe_f64
target-arm: Guest cpu endianness determination for virtio KVM ARM/ARM64
target-arm: KVM64: Get and Sync up guest register state like kvm32.
disas/arm-a64.cc: Tell libvixl correct code addresses
disas/libvixl: Update to upstream VIXL 1.7
target-arm: Fix brace style in reindented code
target-arm: Reindent ancient page-table-walk code
target-arm: Use mmu_idx in get_phys_addr()
target-arm: Pass mmu_idx to get_phys_addr()
target-arm: Split AArch64 cases out of ats_write()
target-arm: Don't define any MMU_MODE*_SUFFIXes
target-arm: Use correct mmu_idx for unprivileged loads and stores
target-arm: Define correct mmu_idx values and pass them in TB flags
target-arm/translate-a64: Fix wrong mmu_idx usage for LDT/STT
target-arm: Make arm_current_el() return sensible values for M profile
cpu_ldst.h: Allow NB_MMU_MODES to be 7
hw/arm/virt: explain device-to-transport mapping in create_virtio_devices()
target-arm: check that LSB <= MSB in BFI instruction
target-arm: Squash input denormals in FRECPS and FRSQRTS
Fix FMULX not squashing denormalized inputs when FZ is set.
...
Ildar Isaev [Thu, 5 Feb 2015 13:37:25 +0000 (13:37 +0000)]
target-arm: fix for exponent comparison in recpe_f64
f64 exponent in HELPER(recpe_f64) should be compared to 2045 rather than 1023
(FPRecipEstimate in ARMV8 spec). This fixes incorrect underflow handling when
flushing denormals to zero in the FRECPE instructions operating on 64-bit
values.
target-arm: Guest cpu endianness determination for virtio KVM ARM/ARM64
This patch implements a fucntion pointer "virtio_is_big_endian"
from "CPUClass" structure for arm/arm64.
Function arm_cpu_is_big_endian() is added to determine and
return the guest cpu endianness to virtio.
This is required for running cross endian guests with virtio on ARM/ARM64.
target-arm: KVM64: Get and Sync up guest register state like kvm32.
This patch adds:
1. Call write_kvmstate_to_list() and write_list_to_cpustate()
in kvm_arch_get_registers() to sync guest register state.
2. Call write_list_to_kvmstate() in kvm_arch_put_registers()
to sync guest register state.
These changes are already there for kvm32 in target-arm/kvm32.c.
disassembling relative branches in code which doesn't reside at
what the guest CPU would think its execution address is. Use
the new MapCodeAddress() API to tell libvixl where the code is
from the guest CPU's point of view so it can get the target
addresses right.
Peter Maydell [Thu, 5 Feb 2015 13:37:24 +0000 (13:37 +0000)]
target-arm: Use mmu_idx in get_phys_addr()
Now we have the mmu_idx in get_phys_addr(), use it correctly to
determine the behaviour of virtual to physical address translations,
rather than using just an is_user flag and the current CPU state.
Some TODO comments have been added to indicate where changes will
need to be made to add EL2 and 64-bit EL3 support.
Peter Maydell [Thu, 5 Feb 2015 13:37:24 +0000 (13:37 +0000)]
target-arm: Pass mmu_idx to get_phys_addr()
Make all the callers of get_phys_addr() pass it the correct
mmu_idx rather than just a simple "is_user" flag. This includes
properly decoding the AT/ATS system instructions; we include the
logic for handling all the opc1/opc2 cases because we'll need
them later for supporting EL2/EL3, even if we don't have the
regdef stanzas yet.
Peter Maydell [Thu, 5 Feb 2015 13:37:24 +0000 (13:37 +0000)]
target-arm: Split AArch64 cases out of ats_write()
Instead of simply reusing ats_write() as the handler for both AArch32
and AArch64 address translation operations, use a different function
for each with the common code in a third function. This is necessary
because the semantics for selecting the right translation regime are
different; we are only getting away with sharing currently because
we don't support EL2 and only support EL3 in AArch32.
Peter Maydell [Thu, 5 Feb 2015 13:37:24 +0000 (13:37 +0000)]
target-arm: Don't define any MMU_MODE*_SUFFIXes
target-arm doesn't use any of the MMU-mode specific cpu ldst
accessor functions. Suppress their generation by not defining
any of the MMU_MODE*_SUFFIX macros. ("user" and "kernel" are
too simplistic as descriptions of indexes 0 and 1 anyway.)
Peter Maydell [Thu, 5 Feb 2015 13:37:23 +0000 (13:37 +0000)]
target-arm: Use correct mmu_idx for unprivileged loads and stores
The MMU index to use for unprivileged loads and stores is more
complicated than we currently implement:
* for A64, it should be "if at EL1, access as if EL0; otherwise
access at current EL"
* for A32/T32, it should be "if EL2, UNPREDICTABLE; otherwise
access as if at EL0".
In both cases, if we want to make the access for Secure EL0
this is not the same mmu_idx as for Non-Secure EL0.
Peter Maydell [Thu, 5 Feb 2015 13:37:23 +0000 (13:37 +0000)]
target-arm: Define correct mmu_idx values and pass them in TB flags
We currently claim that for ARM the mmu_idx should simply be the current
exception level. However this isn't actually correct -- secure EL0 and EL1
should have separate indexes from non-secure EL0 and EL1 since their
VA->PA mappings may differ. We also will want an index for stage 2
translations when we properly support EL2.
Define and document all seven mmu index values that we require, and
pass the mmu index in the TB flags rather than exception level or
priv/user bit.
This change doesn't update the get_phys_addr() code, so our page
table walking still assumes a simplistic "user or priv?" model for
the moment.
Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Greg Bellows <[email protected]>
---
This leaves some odd gaps in the TB flags usage. I will circle
back and clean this up later (including moving the other common
flags like the singlestep ones to the top of the flags word),
but I didn't want to bloat this patchseries further.
Peter Maydell [Thu, 5 Feb 2015 13:37:23 +0000 (13:37 +0000)]
target-arm/translate-a64: Fix wrong mmu_idx usage for LDT/STT
The LDT/STT (load/store unprivileged) instruction decode was using
the wrong MMU index value. This meant that instead of these insns
being "always access as if user-mode regardless of current privilege"
they were "always access as if kernel-mode regardless of current
privilege". This went unnoticed because AArch64 Linux doesn't use
these instructions.
Cc: [email protected] Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Greg Bellows <[email protected]> Reviewed-by: Edgar E. Iglesias <[email protected]>
---
I'm not counting this as a security issue because I'm assuming
nobody treats TCG guests as a security boundary (certainly I
would not recommend doing so...)
Peter Maydell [Thu, 5 Feb 2015 13:37:23 +0000 (13:37 +0000)]
target-arm: Make arm_current_el() return sensible values for M profile
Although M profile doesn't have the same concept of exception level
as A profile, it does have a notion of privileged versus not, which
we currently track in the privmode TB flag. Support returning this
information if arm_current_el() is called on an M profile core, so
that we can identify the correct MMU index to use (and put the MMU
index in the TB flags) without having to special-case M profile.
Peter Maydell [Thu, 5 Feb 2015 13:37:23 +0000 (13:37 +0000)]
cpu_ldst.h: Allow NB_MMU_MODES to be 7
Support guest CPUs which need 7 MMU index values.
Add a comment about what would be required to raise the limit
further (trivial for 8, TCG backend rework for 9 or more).
Kirill Batuzov [Thu, 5 Feb 2015 13:37:22 +0000 (13:37 +0000)]
target-arm: check that LSB <= MSB in BFI instruction
The documentation states that if LSB > MSB in BFI instruction behaviour
is unpredictable. Currently QEMU crashes because of assertion failure in
this case:
While assertion failure may meet the "unpredictable" definition this
behaviour is undesirable because it allows an unprivileged guest program
to crash the emulator with the OS and other programs.
This patch addresses the issue by throwing illegal instruction exception
if LSB > MSB. Only ARM decoder is affected because Thumb decoder already
has this check in place.
Peter Maydell [Thu, 5 Feb 2015 13:37:22 +0000 (13:37 +0000)]
target-arm: Squash input denormals in FRECPS and FRSQRTS
The helper functions for FRECPS and FRSQRTS have special case
handling that includes checks for zero inputs, so squash input
denormals if necessary before those checks. This fixes incorrect
output when the FPCR DZ bit is set to enable squashing of input
denormals.
Xiangyu Hu [Thu, 5 Feb 2015 13:37:22 +0000 (13:37 +0000)]
Fix FMULX not squashing denormalized inputs when FZ is set.
While FMULX returns a 2.0f float when two operators are infinity and
zero, those operators should be unpacked from raw inputs first. Inconsistent
cases would occur when operators are denormalized floats in flush-to-zero
mode. A wrong codepath will be entered and 2.0f will not be returned
without this patch.
Fix by checking whether inputs need to be flushed before running into
different codepaths.
Peter Maydell [Thu, 5 Feb 2015 13:37:22 +0000 (13:37 +0000)]
target-arm: Add checks that cpreg raw accesses are handled
Add assertion checking when cpreg structures are registered that they
either forbid raw-access attempts or at least make an attempt at
handling them. Also add an assert in the raw-accessor-of-last-resort,
to avoid silently doing a read or write from offset zero, which is
actually AArch32 CPU register r0.
Peter Maydell [Thu, 5 Feb 2015 13:37:22 +0000 (13:37 +0000)]
target-arm: Split NO_MIGRATE into ALIAS and NO_RAW
We currently mark ARM coprocessor/system register definitions with
the flag ARM_CP_NO_MIGRATE for two different reasons:
1) register is an alias on to state that's also visible via
some other register, and that other register is the one
responsible for migrating the state
2) register is not actually state at all (for instance the TLB
or cache maintenance operation "registers") and it makes no
sense to attempt to migrate it or otherwise access the raw state
This works fine for identifying which registers should be ignored
when performing migration, but we also use the same functions for
synchronizing system register state between QEMU and the kernel
when using KVM. In this case we don't want to try to sync state
into registers in category 2, but we do want to sync into registers
in category 1, because the kernel might have picked a different
one of the aliases as its choice for which one to expose for
migration. (In particular, on 32 bit hosts the kernel will
expose the state in the AArch32 version of the register, but
TCG's convention is to mark the AArch64 version as the version
to migrate, even if the CPU being emulated happens to be 32 bit,
so almost all system registers will hit this issue now that we've
added AArch64 system emulation.)
Fix this by splitting the NO_MIGRATE flag in two (ALIAS and NO_RAW)
corresponding to the two different reasons we might not want to
migrate a register. When setting up the TCG list of registers to
migrate we honour both flags; when populating the list from KVM,
only ignore registers which are NO_RAW.
Signed-off-by: Peter Maydell <[email protected]> Reviewed-by: Greg Bellows <[email protected]>
Message-id: 1422282372[email protected]
[PMM: changed ARM_CP_NO_MIGRATE to ARM_CP_ALIAS on new SP_EL1 and
SP_EL2 reginfo stanzas since there was a (semantic) merge conflict
with the patchset that added those]
Greg Bellows [Thu, 5 Feb 2015 13:37:22 +0000 (13:37 +0000)]
target-arm: Add extended RVBAR support
Added RVBAR_EL2 and RVBAR_EL3 CP register support. All RVBAR_EL# registers
point to the same location and only the highest EL version exists at any one
time.
Peter Maydell [Thu, 5 Feb 2015 11:11:56 +0000 (11:11 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2015-02-05' into staging
qmp hmp balloon: Cleanups around error reporting
# gpg: Signature made Thu 05 Feb 2015 07:15:11 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
* remotes/armbru/tags/pull-error-2015-02-05:
balloon: Eliminate silly QERR_ macros
balloon: Factor out common "is balloon active" test
balloon: Inline qemu_balloon(), qemu_balloon_status()
qmp: Eliminate silly QERR_COMMAND_NOT_FOUND macro
qmp: Simplify recognition of capability negotiation command
qmp: Clean up qmp_query_spice() #ifndef !CONFIG_SPICE dummy
hmp: Compile hmp_info_spice() only with CONFIG_SPICE
qmp hmp: Improve error messages when SPICE is not in use
qmp hmp: Factor out common "using spice" test
Stefan Hajnoczi [Tue, 20 Jan 2015 15:40:38 +0000 (15:40 +0000)]
MAINTAINERS: add Jason Wang as net subsystem maintainer
Jason Wang will be co-maintaining the QEMU net subsystem with me. He
has contributed improvements and reviewed patches over the past years as
part of working on virtio-net and virtualized networking.
Jason has already been backing me up with patch reviews. For the time
being I will continue to submit pull requests.
Alex Williamson [Wed, 4 Feb 2015 18:45:32 +0000 (11:45 -0700)]
vfio-pci: Fix missing unparent of dynamically allocated MemoryRegion
Commit d8d95814609e added explicit object_unparent() calls for
dynamically allocated MemoryRegions. The VFIOMSIXInfo structure also
contains such a MemoryRegion, covering the mmap'd region of a PCI BAR
above the MSI-X table. This structure is freed as part of the class
exit function and therefore also needs an explicit object_unparent().
Failing to do this results in random segfaults due to fields within
the structure, often the class pointer, being reclaimed and corrupted
by the time object_finalize_child_property() is called for the object.
Peter Maydell [Tue, 3 Feb 2015 21:37:16 +0000 (21:37 +0000)]
Merge remote-tracking branch 'remotes/rth/tags/pull-tg-s390-20150203' into staging
s390 translator bug fixes
# gpg: Signature made Tue 03 Feb 2015 20:39:15 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <[email protected]>"
# gpg: aka "Richard Henderson <[email protected]>"
# gpg: aka "Richard Henderson <[email protected]>"
* remotes/rth/tags/pull-tg-s390-20150203:
target-s390x: fix and optimize slb* and slbg* computation of carry/borrow flag
target-s390x: support OC and NC in the EX instruction
disas/s390.c: Remove unused variables
target-s390x: Mark check_privileged() as !CONFIG_USER_ONLY
target-s390: Implement ECAG
target-s390: Implement LURA, LURAG, STURG
target-s390: Fix STURA
target-s390: Fix STIDP
target-s390: Implement EPSW
target-s390: Implement SAM specification exception
target-s390x: fix and optimize slb* and slbg* computation of carry/borrow flag
This patch fixes the bug with borrow_in being set incorrectly, but it
also simplifies the logic to be much more plain, improving speed. It
fixes both the 32-bit SLB* and 64-bit SLBG*.
The SLBG* change has been well-tested. I haven't tested the SLB* change
explicitly, but the code was copy-pasted from the tested code.
The error of these functions' current implementations would not likely
be triggered by compiler-generated code, since the only error was in the
state of the carry/borrow flag. Compilers rarely generate an
instruction sequence such as carry-set -> carry-set-and-use ->
carry-use.
(With Paolo's fix and mine, there are still a couple of failures from
GMP's testsuite, but they are almost surely due to incorrect code
generation from gcc 4.9. But since this gcc is running under qemu, it
might be qemu bugs. I intend to investigate this.)
Peter Maydell [Mon, 26 Jan 2015 16:28:31 +0000 (16:28 +0000)]
disas/s390.c: Remove unused variables
The variables s390_opformats and s390_num_opformats are unused and
provoke clang warnings:
disas/s390.c:849:33: warning: variable 's390_opformats' is not needed and will not be emitted [-Wunneeded-internal-declaration]
static const struct s390_opcode s390_opformats[] =
^
disas/s390.c:875:18: warning: unused variable 's390_num_opformats' [-Wunused-const-variable]
static const int s390_num_opformats =
^
Peter Maydell [Mon, 26 Jan 2015 16:28:30 +0000 (16:28 +0000)]
target-s390x: Mark check_privileged() as !CONFIG_USER_ONLY
The function check_privileged() is only used in the softmmu configs;
wrap it in an #ifndef CONFIG_USER_ONLY to avoid clang warnings on the
linux-user builds.
[rth: Remove inline marker too; it was only there to prevent exactly
this warning in GCC.]
The implementation had been incomplete, as we did not store the
machine type. Note that the machine_type member is still unset
during initialization, so this has no effect yet.
s390x/kvm: unknown DIAGNOSE code should give a specification exception
As described in CP programming services an unimplemented DIAGNOSE
function should return a specification exception. Today we give the
guest an operation exception.
As both exception types are suppressing and Linux as a guest does not
care about the type of program check in its exception table handler
as long as both types have the same kind of error handling (nullifying,
terminating, suppressing etc.) this was unnoticed.
Yi Min Zhao [Mon, 19 Jan 2015 07:15:56 +0000 (15:15 +0800)]
s390x/pci: fix dma notifications in rpcit instruction
The virtual I/O address range passed to rpcit instruction might not
map to consecutive physical guest pages. For this we have to translate
and create mapping notifications for each vioa page separately.
Frank Blaschka [Fri, 16 Jan 2015 13:55:21 +0000 (14:55 +0100)]
s390x/pci: check for invalid function handle
broken guest may provide 0 (invalid) function handle to zpci
instructions. Since we use function handle 0 to indicate an empty
slot in the PHB we have to add an additional check to spot this
kind of error.