Tom Musta [Tue, 12 Aug 2014 18:53:43 +0000 (13:53 -0500)]
linux-user: writev Partial Writes
Although not technically not required by POSIX, the writev system call will
typically write out its buffers individually. That is, if the first buffer
is written successfully, but the second buffer pointer is invalid, then
the first chuck will be written and its size is returned.
Tom Musta [Tue, 12 Aug 2014 18:53:42 +0000 (13:53 -0500)]
linux-user: Support target-to-host translation of mlockall argument
The argument to the mlockall system call is not necessarily the same on
all platforms and thus may require translation prior to passing to the
host.
For example, PowerPC 64 bit platforms define values for MCL_CURRENT
(0x2000) and MCL_FUTURE (0x4000) which are different from Intel platforms
(0x1 and 0x2, respectively)
Tom Musta [Tue, 12 Aug 2014 18:53:41 +0000 (13:53 -0500)]
linux-user: clock_nanosleep errno Handling on PPC
The clock_nanosleep syscall is unusual in that it returns positive
numbers in error handling situations, versus returning -1 and setting
errno, or returning a negative errno value. On POWER, the kernel will
set the SO bit of CR0 to indicate failure in a syscall. QEMU has
generic handling to do this for syscalls with standard return values.
Add special case code for clock_nanosleep to handle CR0 properly.
Tom Musta [Wed, 13 Aug 2014 19:04:44 +0000 (14:04 -0500)]
linux-user: Move get_ppc64_abi
The get_ppc64_abi is used to determine the ELF ABI (i.e. V1 or V2). This
routine is currently implemented in the linux-user/elfload.c file but
is useful in other scenarios. Move the routine to a more generally
available location (linux-user/ppc/target_cpu.h).
Tom Musta [Tue, 12 Aug 2014 18:53:38 +0000 (13:53 -0500)]
linux-user: Handle NULL sched_param argument to sched_*
The sched_getparam, sched_setparam and sched_setscheduler system
calls take a pointer argument to a sched_param structure. When
this pointer is null, errno should be set to EINVAL.
Tom Musta [Tue, 12 Aug 2014 18:53:37 +0000 (13:53 -0500)]
linux-user: Detect Negative Message Sizes in msgsnd System Call
The msgsnd system call takes an argument that describes the message
size (msgsz) and is of type size_t. The system call should set
errno to EINVAL in the event that a negative message size is passed.
Tom Musta [Tue, 12 Aug 2014 18:53:36 +0000 (13:53 -0500)]
linux-user: Conditionally Pass Attribute Pointer to mq_open()
The mq_open system call takes an optional struct mq_attr pointer
argument in the fourth position. This pointer is used when O_CREAT
is specified in the flags (second) argument. It may be NULL, in
which case the queue is created with implementation defined attributes.
Change the code to properly handle the case when NULL is passed in the
arg4 position.
Tom Musta [Tue, 12 Aug 2014 18:53:35 +0000 (13:53 -0500)]
linux-user: Make ipc syscall's third argument an abi_long
For those target ABIs that use the ipc system call (e.g. POWER),
the third argument is used in the shmat path as a pointer. It
therefore must be declared as an abi_long (versus int) so that
the address bits are not lost in truncation. In fact, all arguments
to do_ipc should be declared as abit_long.
In fact, it makes more sense for all of the arguments to be declaried
as abi_long (except call).
Tom Musta [Tue, 12 Aug 2014 18:53:34 +0000 (13:53 -0500)]
linux-user: Properly Handle semun Structure In Cross-Endian Situations
The semun union used in the semctl system call contains both an int (val) and
pointers. In cross-endian situations on 64 bit targets, the value passed to
semctl is an 8 byte (abi_long) value and thus does not have the 4-byte val
field in the correct location. In order to rectify this, the other half
of the union must be accessed. This is achieved in code by performing
a byte swap on the entire 8 byte union, followed by a 4-byte swap of the
first half.
Also, eliminate an extraneous (dead) line of code that sets target_su.val in
the IPC_SET/IPC_GET case.
Tom Musta [Tue, 12 Aug 2014 18:53:33 +0000 (13:53 -0500)]
linux-user: Dereference Pointer Argument to ipc/semctl Sys Call
When the ipc system call is used to wrap a semctl system call,
the ptr argument to ipc needs to be dereferenced prior to passing
it to the semctl handler. This is because the fourth argument to
semctl is a union and not a pointer to a union.
Tom Musta [Tue, 12 Aug 2014 18:53:32 +0000 (13:53 -0500)]
linux-user: PPC64 semid_ds Doesnt Include _unused1 and _unused2
The 64 bit PowerPC platforms eliminate the _unused1 and _unused2
elements of the semid_ds structure from <sys/sem.h>. So eliminate
these from the target_semid_ds structure.
Peter Maydell [Sat, 9 Aug 2014 14:42:32 +0000 (15:42 +0100)]
linux-user: Fix conversion of sigevent argument to timer_create
There were a number of bugs in the conversion of the sigevent
argument to timer_create from target to host format:
* signal number not converted from target to host
* thread ID not copied across
* sigev_value not copied across
* we never unlocked the struct when we were done
Between them, these problems meant that SIGEV_THREAD_ID
timers (and the glibc-implemented SIGEV_THREAD timers which
depend on them) didn't work.
Fix these problems and clean up the code a little by pulling
the struct conversion out into its own function, in line with
how we convert various other structs. This allows the test
program in bug LP:1042388 to run.
Jincheng Miao [Fri, 8 Aug 2014 03:56:54 +0000 (11:56 +0800)]
linux-user: Fix syscall instruction usermode emulation on X86_64
Currently syscall instruction is buggy on user mode X86_64,
the EIP is updated after do_syscall(), that is too late for
clone(). Because clone() will create a thread at the env->EIP
(the address of syscall insn), and then child thread enters
do_syscall() again, that is not expected. Sometimes it is tragic.
User mode syscall insn emulation is not used MSR, so the
action should be same to INT 0x80. INT 0x80 will update EIP in
do_interrupt(), ditto for syscall() for consistency.
Riku Voipio [Wed, 6 Aug 2014 07:36:37 +0000 (10:36 +0300)]
linux-user: redirect openat calls
While Mikhail fixed /proc/self/maps, it was noticed openat calls are
not redirected currently. Some archs don't have open at all, so
openat needs to be redirected.
Fix this by consolidating open/openat code to do_openat - open
is implemented using openat(AT_FDCWD, ... ), which according
to open(2) man page is identical.
Since all targets now have openat, remove the ifdef around sys_openat
and openat: case in do_syscall.
Peter Maydell [Wed, 20 Aug 2014 08:55:42 +0000 (09:55 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20140819' into staging
target-arm:
* fix preferred return address for A64 BRK insn
* implement AArch64 single-stepping
* support loading gzip compressed AArch64 kernels
* use correct PSCI function IDs in the DT when KVM uses PSCI 0.2
* minor cleanups
# gpg: Signature made Tue 19 Aug 2014 19:04:09 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
* remotes/pmaydell/tags/pull-target-arm-20140819:
arm: stellaris: Remove misleading address_space_mem var
arm: armv7m: Rename address_space_mem -> system_memory
aarch64: Allow -kernel option to take a gzip-compressed kernel.
loader: Add load_image_gzipped function.
arm: cortex-a9: Fix cache-line size and associativity
arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
target-arm: Rename QEMU PSCI v0.1 definitions
target-arm: Implement MDSCR_EL1 as having state
target-arm: Implement ARMv8 single-stepping for AArch32 code
target-arm: Implement ARMv8 single-step handling for A64 code
target-arm: A64: Avoid duplicate exit_tb(0) in non-linked goto_tb
target-arm: Set PSTATE.SS correctly on exception return from AArch64
target-arm: Correctly handle PSTATE.SS when taking exception to AArch32
target-arm: Don't allow AArch32 to access RES0 CPSR bits
target-arm: Adjust debug ID registers per-CPU
target-arm: Provide both 32 and 64 bit versions of debug registers
target-arm: Allow STATE_BOTH reginfo descriptions for more than cp14
target-arm: Collect up the debug cp register definitions
target-arm: Fix return address for A64 BRK instructions
arm: cortex-a9: Fix cache-line size and associativity
For A9, The cache associativity is 4 and the lines size is 32B.
Self identify in CCSIDR accordingly. Cache size remains at 16k.
QEMU doesn't emulate caches, but we should still report the correct
cache-line size to the guest. Some guests (like u-boot) complain if
the cache-line size mismatches a requested flush or invalidate
operation.
Christoffer Dall [Tue, 19 Aug 2014 17:56:27 +0000 (18:56 +0100)]
arm/virt: Use PSCI v0.2 function IDs in the DT when KVM uses PSCI v0.2
The current code supplies the PSCI v0.1 function IDs in the DT even when
KVM uses PSCI v0.2.
This will break guest kernels that only support PSCI v0.1 as they will
use the IDs provided in the DT. Guest kernels with PSCI v0.2 support
are not affected by this patch, because they ignore the function IDs in
the device tree and rely on the architecture definition.
Define QEMU versions of the constants and check that they correspond to
the Linux defines on Linux build hosts. After this patch, both guest
kernels with PSCI v0.1 support and guest kernels with PSCI v0.2 should
work.
Tested on TC2 for 32-bit and APM Mustang for 64-bit (aarch64 guest
only). Both cases tested with 3.14 and linus/master and verified I
could bring up 2 cpus with both guest kernels. Also tested 32-bit with
a 3.14 host kernel with only PSCI v0.1 and both guests booted here as
well.
Christoffer Dall [Tue, 19 Aug 2014 17:56:27 +0000 (18:56 +0100)]
target-arm: Rename QEMU PSCI v0.1 definitions
The function IDs for PSCI v0.1 are exported by KVM and defined as
KVM_PSCI_FN_<something>. To build using these defines in non-KVM code,
QEMU defines these IDs locally and check their correctness against the
KVM headers when those are available.
However, the naming scheme used for QEMU (almost) clashes with the PSCI
v0.2 definitions from Linux so to avoid unfortunate naming when we
introduce local PSCI v0.2 defines, rename the current local defines with
QEMU_ prependend and clearly identify the PSCI version as v0.1 in the
defines.
Peter Maydell [Tue, 19 Aug 2014 17:56:27 +0000 (18:56 +0100)]
target-arm: Implement ARMv8 single-stepping for AArch32 code
ARMv8 single-stepping requires the exception level that controls
the single-stepping to be in AArch64 execution state, but the
code being stepped may be in AArch64 or AArch32. Implement the
necessary support code for single-stepping AArch32 code.
Peter Maydell [Tue, 19 Aug 2014 17:56:26 +0000 (18:56 +0100)]
target-arm: Implement ARMv8 single-step handling for A64 code
Implement ARMv8 software single-step handling for A64 code:
correctly update the single-step state machine and generate
debug exceptions when stepping A64 code.
This patch has no behavioural change since MDSCR_EL1.SS can't
be set by the guest yet.
Peter Maydell [Tue, 19 Aug 2014 17:56:26 +0000 (18:56 +0100)]
target-arm: A64: Avoid duplicate exit_tb(0) in non-linked goto_tb
If gen_goto_tb() decides not to link the two TBs, then the
fallback path generates unnecessary code:
* if singlestep is enabled then we generate unreachable code
after the gen_exception_internal(EXCP_DEBUG)
* if singlestep is disabled then we will generate exit_tb(0)
twice, once in gen_goto_tb() and once coming out of the
main loop with is_jmp set to DISAS_JUMP
Correct these deficiencies by only emitting exit_tb() in the
non-singlestep case, in which case we can use DISAS_TB_JUMP
to suppress the main-loop exit_tb().
Peter Maydell [Tue, 19 Aug 2014 17:56:26 +0000 (18:56 +0100)]
target-arm: Correctly handle PSTATE.SS when taking exception to AArch32
When an exception is taken to AArch32, we must clear the PSTATE.SS
bit for the exception handler, and must also ensure that the SS bit
is not set in the value saved to SPSR_<mode>. Achieve both of these
aims by clearing the bit in uncached_cpsr before saving it to the SPSR.
Peter Maydell [Tue, 19 Aug 2014 17:56:26 +0000 (18:56 +0100)]
target-arm: Don't allow AArch32 to access RES0 CPSR bits
The CPSR has a new-in-v8 execution state bit (IL), and
also some state which has effects in AArch32 but appears
only in the SPSR format (SS) but is RES0 in the CPSR.
Add the IL bit to CPSR_EXEC, and enforce that guest direct
reads and writes to CPSR can't read or write the RES0
bits, so the guest can't get at the SS bit which we store
in uncached_cpsr. This includes not permitting exception
returns to copy reserved bits from an SPSR into CPSR.
Peter Maydell [Tue, 19 Aug 2014 17:56:25 +0000 (18:56 +0100)]
target-arm: Adjust debug ID registers per-CPU
Allow each CPU type to specify the value for the debug ID
registers, by putting them in the ARMCPU struct, and use
the resulting information to only expose the correct number
of watchpoint and breakpoint registers for the CPU.
Peter Maydell [Tue, 19 Aug 2014 17:56:25 +0000 (18:56 +0100)]
target-arm: Provide both 32 and 64 bit versions of debug registers
Bring the 32 bit and 64 bit views of the debug registers into
line by providing the same set of registers in both cases.
(This still isn't a complete set, but it is consistent.)
Peter Maydell [Tue, 19 Aug 2014 17:56:25 +0000 (18:56 +0100)]
target-arm: Allow STATE_BOTH reginfo descriptions for more than cp14
Currently the STATE_BOTH shorthand for allowing a single reginfo struct
to define handling for both AArch32 and AArch64 views of a register
only permits this where the AArch32 view is in cp15. It turns out that
the debug registers in cp14 also have neatly lined up encodings;
allow these also to share reginfo structs by permitting a STATE_BOTH
reginfo to specify the .cp field (and continue to default to 15 if
it is not specified).
Peter Maydell [Tue, 19 Aug 2014 17:56:25 +0000 (18:56 +0100)]
target-arm: Collect up the debug cp register definitions
At the moment we have a mixed set of mostly dummy register
definitions for various debug related registers which have
been added piecemeal in order to get Linux kernels to boot.
In preparation for actually implementing debug support,
bring them all together into one place.
This commit doesn't change behaviour: we still expose
exactly the same registers and behaviour to the guest
in all configurations.
Peter Maydell [Tue, 19 Aug 2014 17:56:24 +0000 (18:56 +0100)]
target-arm: Fix return address for A64 BRK instructions
When we take an exception resulting from a BRK instruction,
the architecture requires that the "preferred return address"
reported to the exception handler is the address of the BRK
itself, not the following instruction (like undefined
insns, and in contrast with SVC, HVC and SMC). Follow this,
rather than incorrectly reporting the address of the following
insn.
(We do get this correct for the A32/T32 BKPT insns.)
Peter Maydell [Tue, 19 Aug 2014 12:00:57 +0000 (13:00 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
* remotes/bonzini/tags/for-upstream:
mtree: remove write-only field
memory: Use canonical path component as the name
memory: Use memory_region_name for name access
memory: constify memory_region_name
exec: Abstract away ref to memory region names
loader: Abstract away ref to memory region names
tpm_tis: remove instance_finalize callback
memory: remove memory_region_destroy
memory: convert memory_region_destroy to object_unparent
ioport: split deletion and destruction
nic: do not destroy memory regions in cleanup functions
vga: do not dynamically allocate chain4_alias
sysbus: remove unused function sysbus_del_io
qom: object: move unparenting to the child property's release callback
qom: object: delete properties before calling instance_finalize
virtio-scsi: implement parse_cdb
scsi-block, scsi-generic: implement parse_cdb
scsi-block: extract scsi_block_is_passthrough
scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
scsi-bus: prepare scsi_req_new for introduction of parse_cdb
The function monitor_fdset_dup_fd_find_remove() references member of
'mon_fdset' which - when remove flag is set - may be freed in function
monitor_fdset_cleanup().
remove is set by monitor_fdset_dup_fd_remove which in practice
does not need the returned value, so make it void,
and return -1 from monitor_fdset_dup_fd_find_remove.
Peter Maydell [Mon, 18 Aug 2014 17:24:38 +0000 (18:24 +0100)]
Merge remote-tracking branch 'remotes/amit/for-2.2' into staging
* remotes/amit/for-2.2:
virtio-serial: search for duplicate port names before adding new ports
virtio-serial: create a linked list of all active devices
Amit Shah [Tue, 15 Jul 2014 04:47:02 +0000 (10:17 +0530)]
virtio-serial: search for duplicate port names before adding new ports
Before adding new ports to VirtIOSerial devices, check if there's a
conflict in the 'name' parameter. This ensures two virtserialports with
identical names are not initialized.
Amit Shah [Wed, 16 Jul 2014 11:08:50 +0000 (16:38 +0530)]
virtio-serial: create a linked list of all active devices
To ensure two virtserialports don't get added to the system with the
same 'name' parameter, we need to access all the ports on all the
devices added, and compare the names.
We currently don't have a list of all VirtIOSerial devices added to the
system. This commit adds a simple linked list in which devices are put
when they're initialized, and removed when they go away.
Peter Maydell [Mon, 18 Aug 2014 11:55:02 +0000 (12:55 +0100)]
Merge remote-tracking branch 'remotes/mcayland/qemu-sparc' into staging
* remotes/mcayland/qemu-sparc:
target-sparc64: implement Short Floating-Point Store Instructions
apb: add IOMMU flush register implementation
sun4u: switch second PCI-ebus bridge BAR over to PCI IO space
Peter Maydell [Mon, 18 Aug 2014 10:59:26 +0000 (11:59 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request
# gpg: Signature made Fri 15 Aug 2014 18:04:23 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/block-pull-request: (55 commits)
qcow2: fix new_blocks double-free in alloc_refcount_block()
image-fuzzer: Reduce number of generator functions in __init__
image-fuzzer: Add generators of L1/L2 tables
image-fuzzer: Add fuzzing functions for L1/L2 table entries
docs: Expand the list of supported image elements with L1/L2 tables
image-fuzzer: Public API for image-fuzzer/runner/runner.py
image-fuzzer: Generator of fuzzed qcow2 images
image-fuzzer: Fuzzing functions for qcow2 images
image-fuzzer: Tool for fuzz tests execution
docs: Specification for the image fuzzer
ide: only constrain read/write requests to drive size, not other types
virtio-blk: Correct bug in support for flexible descriptor layout
libqos: Change free function called in malloc
libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
libqtest: add QTEST_LOG for debugging qtest testcases
ide: Fix segfault when flushing a device that doesn't exist
qemu-options: add missing -drive discard option to cmdline help
parallels: 2TB+ parallels images support
parallels: split check for parallels format in parallels_open
parallels: replace tabs with spaces in block/parallels.c
...
Rather than having the name as separate state. This prepares support
for creating a MemoryRegion dynamically (i.e. without
memory_region_init() and friends) and the MemoryRegion still getting
a usable name.
Despite being local to memory.c, use the helper function. This prepares
support for fully QOMifiying the name field of MR (which will remove
this state from MR completely).
Paolo Bonzini [Wed, 11 Jun 2014 10:42:01 +0000 (12:42 +0200)]
memory: convert memory_region_destroy to object_unparent
Explicitly call object_unparent in the few places where we
will re-create the memory region. If the memory region is
simply being destroyed as part of device teardown, let QOM
handle it.
Paolo Bonzini [Wed, 11 Jun 2014 11:02:51 +0000 (13:02 +0200)]
ioport: split deletion and destruction
Of the two functions portio_list_del and portio_list_destroy,
the latter is just freeing a memory area. However, portio_list_del
is the logical equivalent of memory_region_del_subregion so
destruction of memory regions does not belong there.
Actually, neither of these APIs are in use; portio is mostly used by
ISA devices or VGAs, and neither of these is currently hot-unpluggable.
Paolo Bonzini [Wed, 11 Jun 2014 10:23:03 +0000 (12:23 +0200)]
nic: do not destroy memory regions in cleanup functions
The memory regions should be destroyed in the unrealize function;
since these NICs are not even qdev-ified, they cannot be unplugged
and they do not have to do anything to destroy their memory regions.
Paolo Bonzini [Wed, 11 Jun 2014 10:19:25 +0000 (12:19 +0200)]
vga: do not dynamically allocate chain4_alias
Instead, add a boolean variable to indicate the presence of the region.
This avoids a repeated malloc/free (later we can also avoid the
add_child/unparent by changing the offset/size of the alias).
Paolo Bonzini [Wed, 11 Jun 2014 09:57:38 +0000 (11:57 +0200)]
qom: object: move unparenting to the child property's release callback
This ensures that the unparent callback is called automatically
when the parent object is finalized.
Note that there's no need to keep a reference neither in
object_unparent nor in object_finalize_child_property. The
reference held by the child property itself will do.
Mark Cave-Ayland [Mon, 11 Aug 2014 11:22:52 +0000 (12:22 +0100)]
apb: add IOMMU flush register implementation
The IOMMU flush register is a write-only register used to remove entries from the
hardware TLB. Allow guest writes to this register as a no-op, and return a value
of 0 for reads.
This fixes IOMMU DMA operations under NetBSD SPARC64.
sun4u: switch second PCI-ebus bridge BAR over to PCI IO space
The ebus is the sun4u equivalent of the old ISA bus which is already mapped at
the beginning of PCI IO space within QEMU. NetBSD attempts to find the physical
addresses of devices connected to the ebus by parsing the BARs of the PCI-ebus
bridge and using the base address found by matching both the address space
type and range for a particular ebus address.
Since the second PCI-ebus bridge BAR is already aliased onto IO space, switch
the BAR over to match and reduce the size to 0x1000 which is enough to cover
all the legacy ioport devices whilst leaving the remaining IO space for other
PCI devices. This allows NetBSD SPARC64 to correctly detect and access devices
on the ebus.
Peter Maydell [Fri, 15 Aug 2014 17:44:47 +0000 (18:44 +0100)]
Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-15' into staging
trivial patches for 2014-08-15
# gpg: Signature made Fri 15 Aug 2014 16:13:03 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <[email protected]>"
# gpg: aka "Michael Tokarev <[email protected]>"
# gpg: aka "Michael Tokarev <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 6F67 E18E 7C91 C5B1 5514 66A7 BEE5 9D74 A4C3 D7DB
* remotes/mjt/tags/trivial-patches-2014-08-15:
ivshmem: check the value returned by fstat()
l2cap: fix access to freed memory
intc: i8259: Convert Array allocation to g_new0
ppc: convert g_new(qemu_irq usages to g_new0
ssi: xilinx_spi: Initialise CS GPIOs as NULL
vl: free err
qemu-options.hx: fix typo about l2tpv3
vmxnet3: don't use 'Yoda conditions'
vl: don't use 'Yoda conditions'
spice: don't use 'Yoda conditions'
don't use 'Yoda conditions'
isa-bus: don't use 'Yoda conditions'
audio: don't use 'Yoda conditions'
usb: don't use 'Yoda conditions'
CODING_STYLE: Section about conditional statement
pci-host: update uncorresponding description
pci-host: update obsolete reference about piix_pci.c
qemu-options.hx: fix a typo of chardev
memory: Update obsolete comment about AddrRange field type
apic: Fix reported DFR content
Stefan Hajnoczi [Fri, 15 Aug 2014 16:59:54 +0000 (17:59 +0100)]
qcow2: fix new_blocks double-free in alloc_refcount_block()
Commit de82815db1c89da058b7fb941dab137d6d9ab738 ("qcow2: Handle failure
for potentially large allocations") introduced a double-free of
new_blocks in the alloc_refcount_block() error path.
The qemu-iotests qcow2 026 test case was failing because qemu-io
segfaulted.
Make sure new_blocks is NULL after we free it the first time.
Maria Kustova [Mon, 11 Aug 2014 11:27:46 +0000 (15:27 +0400)]
image-fuzzer: Reduce number of generator functions in __init__
Some issues can be found only when a fuzzed image has a partial structure,
e.g. has L1/L2 tables but no refcount ones. Generation of an entirely
defined image limits these cases. Now the Image constructor creates only
a header and a backing file name (if any), other image elements are generated
in the 'create_image' API.
Maria Kustova [Mon, 11 Aug 2014 11:01:10 +0000 (15:01 +0400)]
image-fuzzer: Add generators of L1/L2 tables
Entries in L1/L2 entries are based on a portion of random guest clusters.
L2 entries contain offsets to host image clusters filled with random data.
Clusters for L1/L2 tables and guest data are selected randomly.
Maria Kustova [Mon, 11 Aug 2014 10:34:01 +0000 (14:34 +0400)]
image-fuzzer: Generator of fuzzed qcow2 images
The layout submodule of the qcow2 package creates a random valid image,
randomly selects some amount of its fields, fuzzes them and write the fuzzed
image to the file. Fuzzing process can be controlled by an external
configuration.
Maria Kustova [Mon, 11 Aug 2014 10:34:00 +0000 (14:34 +0400)]
image-fuzzer: Fuzzing functions for qcow2 images
The fuzz submodule of the qcow2 image generator contains fuzzing functions for
image fields.
Each fuzzing function contains a list of constraints and a call of a helper
function that randomly selects a fuzzed value satisfied to one of constraints.
For now constraints include only known as invalid or potentially dangerous
values. But after investigation of code coverage by fuzz tests they will be
expanded by heuristic values based on inner checks and flows of a program
under test.
Now fuzzing of a header, header extensions and a backing file name is
supported.
Maria Kustova [Mon, 11 Aug 2014 10:33:59 +0000 (14:33 +0400)]
image-fuzzer: Tool for fuzz tests execution
The purpose of the test runner is to prepare the test environment (e.g. create
a work directory, a test image, etc), execute a program under test with
parameters, indicate a test failure if the program was killed during the test
execution and collect core dumps, logs and other test artifacts.
The test runner doesn't depend on an image format, so it can be used with any
external image generator.
[Fixed path to qcow2 format module "qcow2" instead of "../qcow2" since
runner.py is no longer in a sub-directory.
--Stefan]
Michael Tokarev [Wed, 13 Aug 2014 07:23:31 +0000 (11:23 +0400)]
ide: only constrain read/write requests to drive size, not other types
Commit 58ac321135a introduced a check to ide dma processing which
constrains all requests to drive size. However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1. So check the range only for reads and writes.
Marc Marí [Tue, 12 Aug 2014 11:41:51 +0000 (13:41 +0200)]
virtio-blk: Correct bug in support for flexible descriptor layout
Without this correction, only a three descriptor layout is accepted, and
requests with just two descriptors are not completed and no error message is
displayed.
Parallels has released in the recent updates of Parallels Server 5/6
new addition to his image format. Images with signature WithouFreSpacExt
have offsets in the catalog coded not as offsets in sectors (multiple
of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512)
In this case all 64 bits of header->nb_sectors are used for image size.
This patch implements support of this for qemu-img and also adds specific
check for an incorrect image. Images with block size greater than
INT_MAX/513 are not supported. The biggest available Parallels image
cluster size in the field is 1 Mb. Thus this limit will not hurt
anyone.
parallels: split check for parallels format in parallels_open
and rework error path a bit. There is no difference at the moment, but
the code will be definitely shorter when additional processing will
be required for WithouFreSpacExt
parallels: extend parallels format header with actual data values
Parallels image format has several additional fields inside:
- nb_sectors is actually 64 bit wide. Upper 32bits are not used for
images with signature "WithoutFreeSpace" and must be explicitly
zeroed according to Parallels. They will be used for images with
signature "WithouFreSpacExt"
- inuse is magic which means that the image is currently opened for
read/write or was not closed correctly, the magic is 0x746f6e59
- data_off is the location of the first data block. It can be zero
and in this case data starts just beyond the header aligned to
512 bytes. Though this field does not matter for read-only driver
This patch adds these values to struct parallels_header and adds
proper handling of nb_sectors for currently supported WithoutFreeSpace
images.
The dataplane code is currently doing a hard exit if it fails to set
up either guest or host notifiers. In practice, this may mean that a
guest suddenly dies after a dataplane device failed to come up (e.g.,
when a file descriptor limit is hit for tne nth device).
Let's just try to unwind the setup instead and return.
Gonglei [Mon, 11 Aug 2014 09:34:21 +0000 (17:34 +0800)]
channel-posix: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.
Gonglei [Mon, 11 Aug 2014 09:34:20 +0000 (17:34 +0800)]
qemu-char: using qemu_set_nonblock() instead of fcntl(O_NONBLOCK)
Technically, fcntl(soc, F_SETFL, O_NONBLOCK)
is incorrect since it clobbers all other file flags.
We can use F_GETFL to get the current flags, set or
clear the O_NONBLOCK flag, then use F_SETFL to set the flags.