Peter Maydell [Wed, 7 May 2014 15:06:38 +0000 (16:06 +0100)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20140507' into staging
Some improvements for s390.
Two patches deal with address translation, one fixes a problem in the
channel subsystem code.
# gpg: Signature made Wed 07 May 2014 09:29:30 BST using RSA key ID C6F02FAF
# gpg: Can't check signature: public key not found
* remotes/cohuck/tags/s390x-20140507:
s390x/css: Don't save orb in subchannel.
s390x/helper: Added format control bit to MMU translation
s390x/helper: Fixed real-to-absolute address translation
# gpg: Signature made Mon 05 May 2014 21:27:24 BST using RSA key ID 5872D723
# gpg: Can't check signature: public key not found
* remotes/juanquintela/tags/migration/20140505: (36 commits)
migration: expose xbzrle cache miss rate
migration: expose the bitmap_sync_count to the end
migration: Add counts of updating the dirty bitmap
XBZRLE: Fix one XBZRLE corruption issues
migration: remove duplicate code
Coverity: Fix failure path for qemu_accept in migration
Init the XBZRLE.lock in ram_mig_init
Provide init function for ram migration
Count used RAMBlock pages for migration_dirty_pages
Make qemu_peek_buffer loop until it gets it's data
Disallow outward migration while awaiting incoming migration
virtio: validate config_len on load
virtio-net: out-of-bounds buffer write on load
openpic: avoid buffer overrun on incoming migration
ssi-sd: fix buffer overrun on invalid state load
savevm: Ignore minimum_version_id_old if there is no load_state_old
usb: sanity check setup_index+setup_len in post_load
vmstate: s/VMSTATE_INT32_LE/VMSTATE_INT32_POSITIVE_LE/
virtio-scsi: fix buffer overrun on invalid state load
zaurus: fix buffer overrun on invalid state load
...
Peter Maydell [Wed, 7 May 2014 12:47:25 +0000 (13:47 +0100)]
Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-peter' into staging
QOM/QTest infrastructure fixes and device conversions
* -device / device_add assertion fix
* QEMUMachine conversion to MachineClass
* Device error handling improvements
* QTest cleanups and test cases for some more PCI devices
* PortIO memory leak fixes
# gpg: Signature made Mon 05 May 2014 19:59:16 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <[email protected]>"
# gpg: aka "Andreas Färber <[email protected]>"
* remotes/afaerber/tags/qom-devices-for-peter:
PortioList: Store PortioList in device state
tests: Add EHCI qtest
tests: Add ioh3420 qtest
tests: Add intel-hda qtests
tests: Add es1370 qtest
tests: Add ac97 qtest
qtest: Be paranoid about accept() addrlen argument
qtest: Add error reporting to socket_accept()
qtest: Assure that init_socket()'s listen() does not fail
MAINTAINERS: Document QOM
arm: Clean up fragile use of error_is_set() in realize() methods
qom: Clean up fragile use of error_is_set() in set() methods
hw: Consistently name Error ** objects errp, and not err
hw: Consistently name Error * objects err, and not errp
machine: Remove QEMUMachine indirection from MachineClass
machine: Replace QEMUMachine by MachineClass in accelerator configuration
vl.c: Replace QEMUMachine with MachineClass in QEMUMachineInitArgs
machine: Copy QEMUMachine's fields to MachineClass
machine: Remove obsoleted field from QEMUMachine
qdev: Fix crash by validating the object type
Cornelia Huck [Tue, 11 Feb 2014 15:23:32 +0000 (16:23 +0100)]
s390x/css: Don't save orb in subchannel.
Current css code saves the operation request block (orb) in the
subchannel structure for later consumption by the start function
handler. This might make sense for asynchronous execution of the
start function (which qemu doesn't support), but not in our case;
it would even be wrong since orb contains a reference to a local
variable in the base ssch handler.
Let's just pass the orb through the start function call chain for
ssch; for rsch, we can pass NULL as the backend function does not
use any information passed via the orb there.
Thomas Huth [Fri, 25 Apr 2014 13:37:19 +0000 (15:37 +0200)]
s390x/helper: Added format control bit to MMU translation
With the EDAT-1 facility, the MMU translation can stop at the
segment table already, pointing to a 1 MB block. And while we're
at it, move the page table entry handling to a separate function,
too, as suggested by Alexander Graf.
The real-to-absolute address translation in mmu_translate() was
missing the second part for translating the page at the prefix
address back to the 0 page. And while we're at it, also moved the
code into a separate helper function since this might come in
handy for other parts of the code, too.
Peter Maydell [Tue, 6 May 2014 11:23:05 +0000 (12:23 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-smbios-2' into staging
smbios: make qemu generate smbios tables.
# gpg: Signature made Mon 05 May 2014 12:20:27 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg: aka "Gerd Hoffmann <[email protected]>"
# gpg: aka "Gerd Hoffmann (private) <[email protected]>"
* remotes/kraxel/tags/pull-smbios-2:
SMBIOS: Build aggregate smbios tables and entry point
SMBIOS: Use bitmaps to prevent incompatible comand line options
SMBIOS: Use macro to set smbios defaults
SMBIOS: Update header file definitions
SMBIOS: Rename symbols to better reflect future use
E820: Add interface for accessing e820 table
pc: add 2.1 machine type
Peter Maydell [Tue, 6 May 2014 09:56:38 +0000 (10:56 +0100)]
Merge remote-tracking branch 'remotes/riku/linux-user-for-upstream' into staging
* remotes/riku/linux-user-for-upstream:
linux-user: fix getrusage and wait4 failures with invalid rusage struct
linux-user/elfload.c: Support ARM HWCAP2 flags
linux-user/elfload.c: Fix A64 code which was incorrectly acting like A32
linux-user/elfload.c: Update ARM HWCAP bits
linux-user/elfload.c: Fix incorrect ARM HWCAP bits
linux-user: remove configure option for setting uname release
linux-user: move uname functions to uname.c
linux-user: rename cpu-uname -> uname
linux-user/signal.c: Set fault address in AArch64 signal info
linux-user: avoid using glibc internals in _syscall5 and in definition of target_sigevent struct
linux-user: Handle arches with llseek instead of _llseek
linux-user: Add support for SCM_CREDENTIALS.
linux-user: Move if-elses to a switch statement.
linux-user: Assert stack used for auxvec, envp, argv
linux-user: Add /proc/self/exe open forwarding
The page may not be inserted into cache after executing save_xbzrle_page.
In case of failure to insert, the original page should be sent rather
than the page in the cache.
Count used RAMBlock pages for migration_dirty_pages
This is a fix for a bug* triggered by a migration after hot unplugging
a few virtio-net NICs, that caused migration never to converge, because
'migration_dirty_pages' is incorrectly initialised.
'migration_dirty_pages' is used as a tally of the number of outstanding
dirty pages, to give the migration code an idea of how much more data
will need to be transferred, and thus whether it can end the iterative
phase.
It was initialised to the total size of the RAMBlock address space,
however hotunplug can leave this space sparse, and hence
migration_dirty_pages ended up too large.
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
(* https://bugzilla.redhat.com/show_bug.cgi?id=1074913 )
Make qemu_peek_buffer loop until it gets it's data
Make qemu_peek_buffer repeatedly call fill_buffer until it gets
all the data it requires, or until there is an error.
At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there
isn't enough data waiting, however the kernel is entitled to return
just a few bytes, and still leave qemu_peek_buffer with less bytes
than it needed. I've seen this fail in a dev world, and I think it
could theoretically fail in the peeking of the subsection headers in
the current world.
Comment qemu_peek_byte to point out it's not guaranteed to work for
non-continuous peeks
Disallow outward migration while awaiting incoming migration
QEMU will assert if you attempt to start an outgoing migration on
a QEMU that's sitting waiting for an incoming migration (started
with -incoming), so disallow it with a proper error.
(This is a fix for https://bugzilla.redhat.com/show_bug.cgi?id=1086987 )
Michael Roth [Mon, 28 Apr 2014 13:08:17 +0000 (16:08 +0300)]
openpic: avoid buffer overrun on incoming migration
CVE-2013-4534
opp->nb_cpus is read from the wire and used to determine how many
IRQDest elements to read into opp->dst[]. If the value exceeds the
length of opp->dst[], MAX_CPU, opp->dst[] can be overrun with arbitrary
data from the wire.
Fix this by failing migration if the value read from the wire exceeds
MAX_CPU.
Peter Maydell [Thu, 3 Apr 2014 16:52:28 +0000 (19:52 +0300)]
savevm: Ignore minimum_version_id_old if there is no load_state_old
At the moment we require vmstate definitions to set minimum_version_id_old
to the same value as minimum_version_id if they do not provide a
load_state_old handler. Since the load_state_old functionality is
required only for a handful of devices that need to retain migration
compatibility with a pre-vmstate implementation, this means the bulk
of devices have pointless boilerplate. Relax the definition so that
minimum_version_id_old is ignored if there is no load_state_old handler.
Note that under the old scheme we would segfault if the vmstate
specified a minimum_version_id_old that was less than minimum_version_id
but did not provide a load_state_old function, and the incoming state
specified a version number between minimum_version_id_old and
minimum_version_id. Under the new scheme this will just result in
our failing the migration.
usb: sanity check setup_index+setup_len in post_load
CVE-2013-4541
s->setup_len and s->setup_index are fed into usb_packet_copy as
size/offset into s->data_buf, it's possible for invalid state to exploit
this to load arbitrary data.
setup_len and setup_index should be checked to make sure
they are not negative.
Note: this adds security checks within assert calls since
SCSIBusInfo's load_request cannot fail.
For now simply disable builds with NDEBUG - there seems
to be little value in supporting these.
s->cmd_len used as index in ssd0323_transfer() to store 32-bit field.
Possible this field might then be supplied by guest to overwrite a
return addr somewhere. Same for row/col fields, which are indicies into
framebuffer array.
To fix validate after load.
Additionally, validate that the row/col_start/end are within bounds;
otherwise the guest can provoke an overrun by either setting the _end
field so large that the row++ increments just walk off the end of the
array, or by setting the _start value to something bogus and then
letting the "we hit end of row" logic reset row to row_start.
pxa2xx: avoid buffer overrun on incoming migration
CVE-2013-4533
s->rx_level is read from the wire and used to determine how many bytes
to subsequently read into s->rx_fifo[]. If s->rx_level exceeds the
length of s->rx_fifo[] the buffer can be overrun with arbitrary data
from the wire.
Fix this by validating rx_level against the size of s->rx_fifo.
Both virtio-block and virtio-serial read,
VirtQueueElements are read in as buffers, and passed to
virtqueue_map_sg(), where num_sg is taken from the wire and can force
writes to indicies beyond VIRTQUEUE_MAX_SIZE.
Michael Roth [Thu, 3 Apr 2014 16:51:46 +0000 (19:51 +0300)]
virtio: avoid buffer overrun on incoming migration
CVE-2013-6399
vdev->queue_sel is read from the wire, and later used in the
emulation code as an index into vdev->vq[]. If the value of
vdev->queue_sel exceeds the length of vdev->vq[], currently
allocated to be VIRTIO_PCI_QUEUE_MAX elements, subsequent PIO
operations such as VIRTIO_PCI_QUEUE_PFN can be used to overrun
the buffer with arbitrary data originating from the source.
Fix this by failing migration if the value from the wire exceeds
VIRTIO_PCI_QUEUE_MAX.
hw/pci/pcie_aer.c: fix buffer overruns on invalid state load
4) CVE-2013-4529
hw/pci/pcie_aer.c pcie aer log can overrun the buffer if log_num is
too large
There are two issues in this file:
1. log_max from remote can be larger than on local
then buffer will overrun with data coming from state file.
2. log_num can be larger then we get data corruption
again with an overflow but not adversary controlled.
Within hw/ide/ahci.c, VARRAY refers to ports which is also loaded. So
we use the old version of ports to read the array but then allow any
value for ports. This can cause the code to overflow.
There's no reason to migrate ports - it never changes.
So just make sure it matches.
PortioList is an abstraction used for construction of MemoryRegionPortioList
from MemoryRegionPortio. It can be used later to unmap created memory regions.
It also requires proper cleanup because some of the memory inside is allocated
dynamically.
By moving PortioList ot device state we make it possible to cleanup later and
avoid leaking memory.
This change spans several target platforms. The following testcases cover all
changed lines:
qemu-system-ppc -M prep
qemu-system-i386 -vga qxl
qemu-system-i386 -M isapc -soundhw adlib -device ib700,id=watchdog0,bus=isa.0
Andreas Färber [Thu, 17 Apr 2014 17:21:12 +0000 (19:21 +0200)]
qtest: Be paranoid about accept() addrlen argument
POSIX specifies that address_len shall on output specify the length of
the stored address; it does not however specify whether it may get
updated on failure as well to, e.g., zero.
In case EINTR occurs, re-initialize the variable to the desired value.
arm: Clean up fragile use of error_is_set() in realize() methods
Using error_is_set(ERRP) to find out whether a function failed is
either wrong, fragile, or unnecessarily opaque. It's wrong when ERRP
may be null, because errors go undetected when it is. It's fragile
when proving ERRP non-null involves a non-local argument. Else, it's
unnecessarily opaque (see commit 84d18f0).
I guess the error_is_set(errp) in the DeviceClass realize() methods
are merely fragile right now, because I can't find a call chain that
passes a null errp argument.
Make the code more robust and more obviously correct: receive the
error in a local variable, then propagate it through the parameter.
qom: Clean up fragile use of error_is_set() in set() methods
Using error_is_set(ERRP) to find out whether a function failed is
either wrong, fragile, or unnecessarily opaque. It's wrong when ERRP
may be null, because errors go undetected when it is. It's fragile
when proving ERRP non-null involves a non-local argument. Else, it's
unnecessarily opaque (see commit 84d18f0).
I guess the error_is_set(errp) in the ObjectProperty set() methods are
merely fragile right now, because I can't find a call chain that
passes a null errp argument.
Make the code more robust and more obviously correct: receive the
error in a local variable, then propagate it through the parameter.
vl.c: Replace QEMUMachine with MachineClass in QEMUMachineInitArgs
QEMUMachine's fields are already in MachineClass. We can safely
make the switch because we copy them in machine_class_init() and
spapr_machine_class_init().
machine: Copy QEMUMachine's fields to MachineClass
In order to eliminate the QEMUMachine indirection,
add its fields directly to MachineClass.
Do not yet remove qemu_machine field because it is
still in use by sPAPR.
Signed-off-by: Marcel Apfelbaum <[email protected]>
[AF: Copied fields for sPAPR, too] Signed-off-by: Andreas Färber <[email protected]>
The machine registration flow is refactored to use the QOM functionality.
Instead of linking the machines into a list, each machine has a type
and the types can be traversed in the QOM way.
Amos Kong [Wed, 16 Apr 2014 01:57:14 +0000 (09:57 +0800)]
qdev: Fix crash by validating the object type
QEMU crashed when I try to list device parameters and the driver name is
actually an available bus name.
# qemu -device virtio-pci-bus,?
# qemu -device virtio-bus,?
# qemu -device virtio-serial-bus,?
qdev-monitor.c:212:qdev_device_help: Object 0x7fd932f50620 is not an
instance of type device
Aborted (core dumped)
We can also reproduce this bug by adding device from monitor, so it's
worth to fix the crash.
(qemu) device_add virtio-serial-bus
qdev-monitor.c:491:qdev_device_add: Object 0x7f5e89530920 is not an
instance of type device
Aborted (core dumped)
Petar Jovanovic [Tue, 8 Apr 2014 17:24:30 +0000 (19:24 +0200)]
linux-user: fix getrusage and wait4 failures with invalid rusage struct
Implementations of system calls getrusage and wait4 have not previously
handled correctly cases when incorrect address of struct rusage is
passed.
This change makes sure return values are correctly set for these cases.
virtio-net: out-of-bounds buffer write on invalid state load
CVE-2013-4150 QEMU 1.5.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c
This code is in hw/net/virtio-net.c:
if (n->max_queues > 1) {
if (n->max_queues != qemu_get_be16(f)) {
error_report("virtio-net: different max_queues ");
return -1;
}
n->curr_queues = qemu_get_be16(f);
for (i = 1; i < n->curr_queues; i++) {
n->vqs[i].tx_waiting = qemu_get_be32(f);
}
}
Number of vqs is max_queues, so if we get invalid input here,
for example if max_queues = 2, curr_queues = 3, we get
write beyond end of the buffer, with data that comes from
wire.
This might be used to corrupt qemu memory in hard to predict ways.
Since we have lots of function pointers around, RCE might be possible.
with good in_use value, "n->mac_table.in_use * ETH_ALEN" can get
positive and bigger than mac_table.macs. For example 0x81000000
satisfies this condition when ETH_ALEN is 6.
Fix it by making the value unsigned.
For consistency, change first_multi as well.
Note: all call sites were audited to confirm that
making them unsigned didn't cause any issues:
it turns out we actually never do math on them,
so it's easy to validate because both values are
always <= MAC_TABLE_ENTRIES.
Gabriel L. Somlo [Wed, 23 Apr 2014 13:42:42 +0000 (09:42 -0400)]
SMBIOS: Build aggregate smbios tables and entry point
Build an aggregate set of smbios tables and an entry point structure.
Insert tables and entry point into fw_cfg respectively under
"etc/smbios/smbios-tables" and "etc/smbios/smbios-anchor".
Machine types <= 2.0 will for now continue using field-by-field
overrides to SeaBIOS defaults, but for machine types 2.1 and up we
expect the BIOS to look for and use the aggregate tables generated
by this patch.
This defines a descriptor for OHCIState.
This changes some OHCIState field types to be migration compatible.
This adds a descriptor for OHCIPort.
This migrates the EOF timer if the USB was started at the time of
migration.
Gabriel L. Somlo [Wed, 23 Apr 2014 13:42:41 +0000 (09:42 -0400)]
SMBIOS: Use bitmaps to prevent incompatible comand line options
Replace existing smbios_check_collision() functionality with
a pair of bitmaps: have_binfile_bitmap and have_fields_bitmap.
Bits corresponding to each smbios type are set by smbios_entry_add(),
which also uses the bitmaps to ensure that binary blobs and field
values are never accepted for the same type.
These bitmaps will also be used in the future to decide whether
or not to build a full table for a given smbios type.
Gabriel L. Somlo [Wed, 23 Apr 2014 13:42:38 +0000 (09:42 -0400)]
SMBIOS: Rename symbols to better reflect future use
Rename the following symbols:
- smbios_set_type1_defaults() to the more general smbios_set_defaults();
- bool smbios_type1_defaults to the more general smbios_defaults;
- smbios_get_table() to smbios_get_table_legacy();
At the moment, 2.1 and 2.0 machines are identical.
As several people are working on incompatible changes
to the PC machine, collaboration will be made easier
by merging this place-holder.
Peter Maydell [Fri, 2 May 2014 13:45:15 +0000 (14:45 +0100)]
linux-user/elfload.c: Support ARM HWCAP2 flags
The ARM kernel has chosen to spill into the HWCAP2 ELF feature bit flags
early, even though it hasn't yet exhausted all 32 bits of the HWCAP word.
Add support for setting this in the same way we do for HWCAP.
Peter Maydell [Fri, 2 May 2014 13:45:14 +0000 (14:45 +0100)]
linux-user/elfload.c: Fix A64 code which was incorrectly acting like A32
The ARM target-specific code in elfload.c was incorrectly allowing
the 64-bit ARM target to use most of the existing 32-bit definitions:
most noticably this meant that our HWCAP bits passed to the guest
were wrong, and register handling when dumping core was totally
broken. Fix this by properly separating the 64 and 32 bit code,
since they have more differences than similarities.
Peter Maydell [Fri, 2 May 2014 13:45:13 +0000 (14:45 +0100)]
linux-user/elfload.c: Update ARM HWCAP bits
The kernel has added support for a number of new ARM HWCAP bits;
add them to QEMU, including support for setting them where we have
a corresponding CPU feature bit.
We were also incorrectly setting the VFPv3D16 HWCAP -- this means
"only 16 D registers", not "supports 16-bit floating point format";
since QEMU always has 32 D registers for VFPv3, we can just remove
the line that incorrectly set this bit.
The kernel does not set the HWCAP_FPA even if it is providing FPA
emulation via nwfpe, so don't set this bit in QEMU either.
Peter Maydell [Fri, 2 May 2014 13:45:12 +0000 (14:45 +0100)]
linux-user/elfload.c: Fix incorrect ARM HWCAP bits
The ELF HWCAP bits for ARM features THUMBEE, NEON, VFPv3 and VFPv3D16 are
all off by one compared to the kernel definitions. Fix this discrepancy
and add in the missing CRUNCH bit which was the cause of the off-by-one
error. (We don't emulate any of the CPUs which have that weird hardware,
so it's otherwise uninteresting to us.)
Riku Voipio [Tue, 4 Mar 2014 02:28:43 +0000 (04:28 +0200)]
linux-user: remove configure option for setting uname release
--enable-uname-release was a rather heavyweight hammer, as it allows
providing values less that UNAME_MINIMUM_RELEASE. Also, it affects
all built linux-user targets, which in most cases is not what user
wants.
Now that we have UNAME_MINIMUM_RELEASE for all linux-user platforms,
we can drop --enable-uname-release and the related CONFIG_UNAME_RELEASE
define.
Users can still override the variable with QEMU_UNAME=2.6.32 or -r
command line option. If distributors need to update a minimum version
for a specific target, it can be done by updating UNAME_MINIMUM_RELEASE.
James Hogan [Tue, 25 Mar 2014 21:51:08 +0000 (21:51 +0000)]
linux-user: Handle arches with llseek instead of _llseek
Recently merged kernel ports (such as OpenRISC and Meta) have an llseek
system call instead of _llseek. This is handled for the host
architecture by defining __NR__llseek as __NR_llseek, but not for the
target architecture.
Handle it in the same way for these architectures, defining
TARGET_NR__llseek as TARGET_NR_llseek.
James Hogan [Tue, 25 Mar 2014 23:21:50 +0000 (23:21 +0000)]
linux-user: Assert stack used for auxvec, envp, argv
Assert that the amount of stack space used for auxvec, envp & argv
exactly matches the amount allocated. This catches if DLINFO_ITEMS isn't
updated when another NEW_AUX_ENT is added.
Peter Maydell [Fri, 2 May 2014 10:32:00 +0000 (11:32 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140501' into staging
target-arm queue:
* implement XScale cache lockdown cp15 ops
* fix v7M CPUID base register
* implement WFE and YIELD as yields for A64
* fix A64 "BLR LR"
* support Cortex-A57 in virt machine model
* a few other minor AArch64 bugfixes
# gpg: Signature made Thu 01 May 2014 15:42:17 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
* remotes/pmaydell/tags/pull-target-arm-20140501:
hw/arm/virt: Add support for Cortex-A57
hw/arm/virt: Put GIC register banks on 64K boundaries
hw/arm/virt: Create the GIC ourselves rather than (ab)using a15mpcore_priv
target-arm: Correct a comment refering to EL0
target-arm: A64: Fix a typo when declaring TLBI ops
target-arm: A64: Handle blr lr
target-arm: Make vbar_write 64bit friendly on 32bit hosts
target-arm: implement WFE/YIELD as a yield for AArch64
armv7m_nvic: fix CPUID Base Register
target-arm: Implement XScale cache lockdown operations as NOPs