Git Repo - qemu.git/log

target-arm: Implement FPEXC32_EL2 system register

The AArch64 FPEXC32_EL2 system register is visible at EL2 and EL3,
and allows those exception levels to read and write the FPEXC
register for a lower exception level that is using AArch32.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Sergey Fedorov <[email protected]>
Message-id: 1453132414 [email protected]

target-arm: ignore ELR_ELx[1] for exception return to 32-bit ARM mode

The architecture requires that for an exception return to AArch32 the
low bits of ELR_ELx are ignored when the PC is set from them:
* if returning to Thumb mode, ignore ELR_ELx[0]
* if returning to ARM mode, ignore ELR_ELx[1:0]

We were only squashing bit 0; also squash bit 1 if the SPSR T bit
indicates this is a return to ARM code.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

target-arm: Implement remaining illegal return event checks

We already implement almost all the checks for the illegal
return events from AArch64 state described in the ARM ARM section
D1.11.2. Add the two missing ones:
* return to EL2 when EL3 is implemented and SCR_EL3.NS is 0
* return to Non-secure EL1 when EL2 is implemented and HCR_EL2.TGE is 1

(We don't implement external debug, so the case of "debug state exit
from EL0 using AArch64 state to EL0 using AArch32 state" doesn't apply
for QEMU.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

target-arm: Handle exception return from AArch64 to non-EL0 AArch32

Remove the assumptions that the AArch64 exception return code was
making about a return to AArch32 always being a return to EL0.
This includes pulling out the illegal-SPSR checks so we can apply
them for return to 32 bit as well as return to 64-bit.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

target-arm: Fix wrong AArch64 entry offset for EL2/EL3 target

The entry offset when taking an exception to AArch64 from a lower
exception level may be 0x400 or 0x600. 0x400 is used if the
implemented exception level immediately lower than the target level
is using AArch64, and 0x600 if it is using AArch32. We were
incorrectly implementing this as checking the exception level
that the exception was taken from. (The two can be different if
for example we take an exception from EL0 to AArch64 EL3; we should
in this case be checking EL2 if EL2 is implemented, and EL1 if
EL2 is not implemented.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

target-arm: Pull semihosting handling out to arm_cpu_do_interrupt()

Handling of semihosting calls should depend on the register width
of the calling code, not on that of any higher exception level,
so we need to identify and handle semihosting calls before we
decide whether to deliver the exception as an entry to AArch32
or AArch64. (EXCP_SEMIHOST is also an "internal exception" so
it has no target exception level in the first place.)

This will allow AArch32 EL1 code to use semihosting calls when
running under an AArch64 EL3.

Signed-off-by: Peter Maydell <[email protected]>

target-arm: Use a single entry point for AArch64 and AArch32 exceptions

If EL2 or EL3 is present on an AArch64 CPU, then exceptions can be
taken to an exception level which is running AArch32 (if only EL0
and EL1 are present then EL1 must be AArch64 and all exceptions are
taken to AArch64). To support this we need to have a single
implementation of the CPU do_interrupt() method which can handle both
32 and 64 bit exception entry.

Pull the common parts of aarch64_cpu_do_interrupt() and
arm_cpu_do_interrupt() out into a new function which calls
either the AArch32 or AArch64 specific entry code once it has
worked out which one is needed.

We temporarily special-case the handling of EXCP_SEMIHOST to
avoid an assertion in arm_el_is_aa64(); the next patch will
pull all the semihosting handling out to the arm_cpu_do_interrupt()
level (since semihosting semantics depend on the register width
of the calling code, not on that of any higher EL).

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

target-arm: Move aarch64_cpu_do_interrupt() to helper.c

Move the aarch64_cpu_do_interrupt() function to helper.c. We want
to be able to call this from code that isn't AArch64-only, and
the move allows us to avoid awkward #ifdeffery at the callsite.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

target-arm: Properly support EL2 and EL3 in arm_el_is_aa64()

Support EL2 and EL3 in arm_el_is_aa64() by implementing the
logic for checking the SCR_EL3 and HCR_EL2 register-width bits
as appropriate to determine the register width of lower exception
levels.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>

arm_gic: Update ID registers based on revision

Update the GIC ID registers (registers above 0xfe0) based on the GIC
revision instead of using the sames values for all GIC implementations.

Signed-off-by: Alistair Francis <[email protected]>
Tested-by: Sören Brinkmann <[email protected]>
Message-id: 629e7fa5d47f2800e51cc1f18d12635f1eece349.1453333840 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/virt: Add always-on property to the virt board timer

The virt board has an arch timer, which is always on. Emit the
"always-on" property to indicate to Linux that it can switch off the
periodic timer and reduces the amount of interrupts injected into a
guest.

Signed-off-by: Christoffer Dall <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Message-id: 1453204158 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/virt: add secure memory region and UART

Add a secure memory region to the virt board, which is the
same as the nonsecure memory region except that it also has
a secure-only UART in it. This is only created if the
board is started with the '-machine secure=on' property.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

hw/arm/virt: Wire up memory region to CPUs explicitly

Wire up the system memory region to the CPUs explicitly
by setting the QOM property. This doesn't change anything
over letting it default, but will be needed for adding
a secure memory region later.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

target-arm: Support multiple address spaces in page table walks

If we have a secure address space, use it in page table walks:
when doing the physical accesses to read descriptors, make them
through the correct address space.

(The descriptor reads are the only direct physical accesses
made in target-arm/ for CPUs which might have TrustZone.)

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

target-arm: Implement cpu_get_phys_page_attrs_debug

Implement cpu_get_phys_page_attrs_debug instead of cpu_get_phys_page_debug.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

target-arm: Implement asidx_from_attrs

Implement the asidx_from_attrs CPU method to return the
Secure or NonSecure address space as appropriate.

(The function is inline so we can use it directly in target-arm
code to be added in later patches.)

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

target-arm: Add QOM property for Secure memory region

Add QOM property to the ARM CPU which boards can use to tell us what
memory region to use for secure accesses. Nonsecure accesses
go via the memory region specified with the base CPU class 'memory'
property.

By default, if no secure region is specified it is the same as the
nonsecure region, and if no nonsecure region is specified we will use
address_space_memory.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

qom/cpu: Add MemoryRegion property

Add a MemoryRegion property, which if set is used to construct
the CPU's initial (default) AddressSpace.

Signed-off-by: Peter Crosthwaite <[email protected]>
[PMM: code is moved from qom/cpu.c to exec.c to avoid having to
make qom/cpu.o be a non-common object file; code to use the
MemoryRegion and to default it to system_memory added.]
Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

memory: Add address_space_init_shareable()

This will either create a new AS or return a pointer to an
already existing equivalent one, if we have already created
an AS for the specified root memory region.

The motivation is to reuse address spaces as much as possible.
It's going to be quite common that bus masters out in device land
have pointers to the same memory region for their mastering yet
each will need to create its own address space. Let the memory
API implement sharing for them.

Aside from the perf optimisations, this should reduce the amount
of redundant output on info mtree as well.

Thee returned value will be malloced, but the malloc will be
automatically freed when the AS runs out of refs.

Signed-off-by: Peter Crosthwaite <[email protected]>
[PMM: dropped check for NULL root as unused; added doc-comment;
squashed Peter C's reference-counting patch into this one;
don't compare name string when deciding if we can share ASes;
read as->malloced before the unref of as->root to avoid possible
read-after-free if as->root was the owner of as]
Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec.c: Use correct AddressSpace in watch_mem_read and watch_mem_write

In the watchpoint access routines watch_mem_read and watch_mem_write,
find the correct AddressSpace to use from current_cpu and the memory
transaction attributes, rather than always assuming address_space_memory.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec.c: Use cpu_get_phys_page_attrs_debug

Use cpu_get_phys_page_attrs_debug() when doing virtual-to-physical
conversions in debug related code, so that we can obtain the right
address space index and thus select the correct AddressSpace,
rather than always using cpu->as.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec.c: Add cpu_get_address_space()

Add a function to return the AddressSpace for a CPU based on
its numerical index. (Callers outside exec.c don't have access
to the CPUAddressSpace struct so can't just fish it out of the
CPUState struct directly.)

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS

Pass the MemTxAttrs for the memory access to iotlb_to_region(); this
allows it to determine the correct AddressSpace to use for the lookup.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

cputlb.c: Use correct address space when looking up MemoryRegionSection

When looking up the MemoryRegionSection for the new TLB entry in
tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine
the correct address space index for the lookup, and pass it into
address_space_translate_for_iotlb().

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

cpu: Add new asidx_from_attrs() method

Add a new method to CPUClass which the memory system core can
use to obtain the correct address space index to use for a memory
access with a given set of transaction attributes, together
with the wrapper function cpu_asidx_from_attrs() which implements
the default behaviour ("always use asidx 0") for CPU classes
which don't provide the method.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

cpu: Add new get_phys_page_attrs_debug() method

Add a new optional method get_phys_page_attrs_debug() to CPUClass.
This is like the existing get_phys_page_debug(), but also returns
the memory transaction attributes to use for the access.
This will be necessary for CPUs which have multiple address
spaces and use the attributes to select the correct address
space.

We provide a wrapper function cpu_get_phys_page_attrs_debug()
which falls back to the existing get_phys_page_debug(), so we
don't need to change every target CPU.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec-all.h: Document tlb_set_page_with_attrs, tlb_set_page

Add documentation comments for tlb_set_page_with_attrs()
and tlb_set_page().

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec.c: Allow target CPUs to define multiple AddressSpaces

Allow multiple calls to cpu_address_space_init(); each
call adds an entry to the cpu->ases array at the specified
index. It is up to the target-specific CPU code to actually use
these extra address spaces.

Since this multiple AddressSpace support won't work with
KVM, add an assertion to avoid confusing failures.

Signed-off-by: Peter Maydell <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

exec.c: Don't set cpu->as until cpu_address_space_init

Rather than setting cpu->as unconditionally in cpu_exec_init
(and then having target-i386 override this later), don't set
it until the first call to cpu_address_space_init.

This requires us to initialise the address space for
both TCG and KVM (KVM doesn't need the AS listener but
it does require cpu->as to be set).

For target CPUs which don't set up any address spaces (currently
everything except i386), add the default address_space_memory
in qemu_init_vcpu().

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Acked-by: Edgar E. Iglesias <[email protected]>

misc: zynq-xadc: Fix off-by-one

This bounds check was off-by-one. Fix.

Reported-by: Paolo Bonzini <[email protected]>
Signed-off-by: Peter Crosthwaite <[email protected]>
Message-id: 1453101737 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

xlnx-ep108: Connect the SPI Flash

Connect the sst25wf080 SPI flash to the EP108 board.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Crosthwaite <[email protected]>
[PMM: free string when finished with it]
Signed-off-by: Peter Maydell <[email protected]>

xlnx-zynqmp: Connect the SPI devices

Connect the Xilinx SPI devices to the ZynqMP model.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: Peter Crosthwaite <[email protected]>
[ PC changes
* Use QOM alias for bus connectivity on SoC level
]
Signed-off-by: Peter Crosthwaite <[email protected]>
[PMM: free the g_strdup_printf() string when finished with it]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Separate the state struct into a header

Separate out the XilinxSPIPS struct into a separate header
file.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

ssi: Move ssi.h into a separate directory

Move the ssi.h include file into the ssi directory.

While touching the code also fix the typdef lines as
checkpatch complains.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

m25p80.c: Add sst25wf080 SPI flash device

Add the sst25wf080 SPI flash device.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Crosthwaite <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

qdev: get_child_bus(): Use QOM lookup if available

qbus_realize() adds busses as a QOM child of the device in addition to
adding it to the qdev bus list. Change get_child_bus() to use the QOM
child if it is available. This takes priority over the bus-list, but
the child object is checked for type correctness.

This prepares support for aliasing of buses. The use case is SoCs,
where a SoC container needs to present buses to the board level, but
the buses are implemented by controller IP we already model as self
contained qbus-containing devices.

Signed-off-by: Peter Crosthwaite <[email protected]>
Acked-by: Alistair Francis <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Wed 20 Jan 2016 15:37:57 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"

* remotes/kevin/tags/for-upstream:
  iotests: Test that throttle values ranges
  blockdev: Error out on negative throttling option values
  vmdk: Create streamOptimized as version 3
  qcow2: Make image inaccessible after failed qcow2_invalidate_cache()
  qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache()
  qcow2: Implement .bdrv_inactivate
  block: Inactivate BDS when migration completes
  block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE
  block: Fix error path in bdrv_invalidate_cache()
  block: Assert no write requests under BDRV_O_INCOMING
  qcow2: Write full header on image creation
  qcow2: Write feature table only for v3 images
  block: Clean up includes
  qemu-iotests: Reduce racy output in 028
  qemu-img: Speed up comparing empty/zero images
  block/raw-posix: avoid bogus fixup for cylinders on DASD disks
  block: Fix .bdrv_open flags

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/berrange/tags/pull-io-next-2016-01-20-1' into staging

I/O channels fixes 2016/01/20 v1

# gpg: Signature made Wed 20 Jan 2016 11:31:47 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <[email protected]>"
# gpg:                 aka "Daniel P. Berrange <[email protected]>"

* remotes/berrange/tags/pull-io-next-2016-01-20-1:
  io: use memset instead of { 0 } for initializing array
  io: fix description of @errp parameter initialization
  io: some fixes to handling of /dev/null when running commands
  io: increment counter when killing off subcommand
  io: fix sign of errno value passed to error report

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-socket-20160120-1' into staging

Convert qemu-socket to use QAPI exclusively, update MAINTAINERS.

# gpg: Signature made Wed 20 Jan 2016 06:49:07 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"

* remotes/kraxel/tags/pull-socket-20160120-1:
  vnc: distiguish between ipv4/ipv6 omitted vs set to off
  sockets: remove use of QemuOpts from socket_dgram
  sockets: remove use of QemuOpts from socket_connect
  sockets: remove use of QemuOpts from socket_listen
  sockets: remove use of QemuOpts from header file
  add MAINTAINERS entry for qemu socket code

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160119.0' into staging

VFIO updates 2016-01-19

- Performance fix for devices with poorly placed MSI-X PBA regions
- Quirk fix for hosts with broken MMCONFIG access

# gpg: Signature made Tue 19 Jan 2016 19:00:21 GMT using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"

* remotes/awilliam/tags/vfio-update-20160119.0:
  vfio/pci: Lazy PBA emulation
  vfio/pci-quirks: Only quirk to size of PCI config space

Signed-off-by: Peter Maydell <[email protected]>

iotests: Test that throttle values ranges

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

blockdev: Error out on negative throttling option values

extract_common_blockdev_options() uses qemu_opt_get_number() to parse
the bps/iops numbers to uint64_t, then converts to double and stores in
ThrottleConfig.  The actual parsing is done by strtoull() in
parse_option_number().  Negative numbers are wrapped to large positive
ones, and stored.

We used to reject negative numbers since 7d81c1413c9, but this regressed
when the option parsing code was changed later. Now fix this again.

This time, define an arbitrary large upper limit (1e15),  and check the
values so both negative and impractically big numbers are caught and
reported.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

vmdk: Create streamOptimized as version 3

VMware products accept only version 3 for streamOptimized, let's bump
the version.

Reported-by: Radoslav Gerganov <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Make image inaccessible after failed qcow2_invalidate_cache()

If qcow2_invalidate_cache() fails, we are in a state where qcow2_close()
has already been completed, but the image hasn't been reopened yet.
Calling into any qcow2 function for an image in this state will cause
crashes.

The real solution would be to get rid of the close/open pair and instead
do an atomic reset of the involved data structures, but this isn't
trivial, so let's just make the image inaccessible for now.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache()

What qcow2_invalidate_cache() should do is close the image with
BDRV_O_INACTIVE set and reopen it with the flag cleared. In fact, it
used to do exactly the opposite: qcow2_close() relied on bs->open_flags,
which is already updated to have cleared BDRV_O_INACTIVE at this point,
whereas qcow2_open() was called with s->flags, which has the flag still
set. Fix this.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

qcow2: Implement .bdrv_inactivate

The callback has to ensure that closing or flushing the image afterwards
wouldn't cause a write access to the image files. This means that just
the caches have to be written out, which is part of the existing
.bdrv_close implementation.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Inactivate BDS when migration completes

So far, live migration with shared storage meant that the image is in a
not-really-ready don't-touch-me state on the destination while the
source is still actively using it, but after completing the migration,
the image was fully opened on both sides. This is bad.

This patch adds a block driver callback to inactivate images on the
source before completing the migration. Inactivation means that it goes
to a state as if it was just live migrated to the qemu instance on the
source (i.e. BDRV_O_INACTIVE is set). You're then supposed to continue
either on the source or on the destination, which takes ownership of the
image.

A typical migration looks like this now with respect to disk images:

1. Destination qemu is started, the image is opened with
   BDRV_O_INACTIVE. The image is fully opened on the source.

2. Migration is about to complete. The source flushes the image and
   inactivates it. Now both sides have the image opened with
   BDRV_O_INACTIVE and are expecting the other side to still modify it.

3. One side (the destination on success) continues and calls
   bdrv_invalidate_all() in order to take ownership of the image again.
   This removes BDRV_O_INACTIVE on the resuming side; the flag remains
   set on the other side.

This ensures that the same image isn't written to by both instances
(unless both are resumed, but then you get what you deserve). This is
important because .bdrv_close for non-BDRV_O_INACTIVE images could write
to the image file, which is definitely forbidden while another host is
using the image.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: John Snow <[email protected]>

block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE

Instead of covering only the state of images on the migration
destination before the migration is completed, the flag will also cover
the state of images on the migration source after completion. This
common state implies that the image is technically still open, but no
writes will happen and any cached contents will be reloaded from disk if
and when the image leaves this state.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Fix error path in bdrv_invalidate_cache()

We can only clear BDRV_O_INCOMING if the caches were actually
invalidated.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Assert no write requests under BDRV_O_INCOMING

As long as BDRV_O_INCOMING is set, the image file is only opened so we
have a file descriptor for it. We're definitely not supposed to modify
the image, it's still owned by the migration source.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

qcow2: Write full header on image creation

When creating a qcow2 image, we didn't necessarily call
qcow2_update_header(), but could end up with the basic header that
qcow2_create2() created manually. One thing that this basic header
lacks is the feature table. Let's make sure that it's always present.

This requires a few updates to test cases as well.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

qcow2: Write feature table only for v3 images

Version 2 images don't have feature bits, so writing a feature table to
those images is kind of pointless.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Clean up includes

Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qemu-iotests: Reduce racy output in 028

On my machine, './check -qcow2 028' was failing about 80% of the
time, due to a race in how many times the repeated attempts
to run 'info block-jobs' could occur before the job was done,
showing up as a failure of fewer '(qemu) ' prompts than in the
expected output. Silence the output during the repetitions, then
add a final clean command to keep the expected output useful;
once patched, I was finally able to run the test 20 times in a
row with no failures.

Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: John Snow <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qemu-img: Speed up comparing empty/zero images

Two empty raw files are always compared by actually reading data even if
there is no data, because BDRV_BLOCK_ZERO is considered "allocated" in
bdrv_is_allocated_above().  That is inefficient.

Use bdrv_get_block_status_above() for more information, and skip the
consecutive zero sectors.

This brings a huge speed up in comparing sparse/empty raw images:

    $ qemu-img create a 1G

    $ time ~/build/master/bin/qemu-img compare a a
    Images are identical.

    real    0m6.583s
    user    0m0.191s
    sys     0m6.367s

    $ time qemu-img compare a a
    Images are identical.

    real    0m0.033s
    user    0m0.003s
    sys     0m0.031s

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

io: use memset instead of { 0 } for initializing array

Some versions of GCC on OS-X complain about CMSG_SPACE
not being constant size, which prevents use of { 0 }

io/channel-socket.c: In function 'qio_channel_socket_writev':
io/channel-socket.c:497:18: error: variable-sized object may not be initialized
char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)] = { 0 };

The compiler is at fault here, but it is nicer to avoid
tickling this compiler bug by using memset instead.

Reviewed-By: John Arbuckle <[email protected]>
Signed-off-by: Daniel P. Berrange <[email protected]>

io: fix description of @errp parameter initialization

The "Error **errp" parameters must be NULL initialized
not uninitialized.

Signed-off-by: Daniel P. Berrange <[email protected]>

io: some fixes to handling of /dev/null when running commands

The /dev/null file handle was leaked in a couple of places.
There is also the possibility that both readfd and writefd
point to the same /dev/null file handle, so care must be
taken not to close the same file handle twice.

Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Daniel P. Berrange <[email protected]>

vfio/pci: Lazy PBA emulation

The PCI spec recommends devices use additional alignment for MSI-X
data structures to allow software to map them to separate processor
pages.  One advantage of doing this is that we can emulate those data
structures without a significant performance impact to the operation
of the device.  Some devices fail to implement that suggestion and
assigned device performance suffers.

One such case of this is a Mellanox MT27500 series, ConnectX-3 VF,
where the MSI-X vector table and PBA are aligned on separate 4K
pages.  If PBA emulation is enabled, performance suffers.  It's not
clear how much value we get from PBA emulation, but the solution here
is to only lazily enable the emulated PBA when a masked MSI-X vector
fires.  We then attempt to more aggresively disable the PBA memory
region any time a vector is unmasked.  The expectation is then that
a typical VM will run entirely with PBA emulation disabled, and only
when used is that emulation re-enabled.

Reported-by: Shyam Kaushik <[email protected]>
Tested-by: Shyam Kaushik <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

vfio/pci-quirks: Only quirk to size of PCI config space

For quirks that support the full PCIe extended config space, limit the
quirk to only the size of config space available through vfio.  This
allows host systems with broken MMCONFIG regions to still make use of
these quirks without generating bad address faults trying to access
beyond the end of config space exposed through vfio.  This may expose
direct access to the mirror of extended config space, only trapping
the sub-range of standard config space, but allowing this makes the
quirk, and thus the device, functional.  We expect that only device
specific accesses make use of the mirror, not general extended PCI
capability accesses, so any virtualization in this space is likely
unnecessary anyway, and the device is still IOMMU isolated, so it
should only be able to hurt itself through any bogus configurations
enabled by this space.

Link: https://www.redhat.com/archives/vfio-users/2015-November/msg00192.html
Reported-by: Ronnie Swanink <[email protected]>
Reviewed-by: Laszlo Ersek <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

block/raw-posix: avoid bogus fixup for cylinders on DASD disks

large volume DASD that have > 64k cylinders do claim to have
0xFFFE cylinders as special value in the old 16 bit field. We
want to pass this "token" along to the guest, instead of
calculating the real number. Otherwise qemu might fail with
"cyls must be between 1 and 65535"

Cc: [email protected]
Acked-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Fix .bdrv_open flags

bdrv_common_open() modified bs->open_flags after inferring the set of
options to pass to the driver's .bdrv_open callback. This means that the
cache options were correctly set in bs->open_flags (and therefore
correctly displayed in 'info block'), but the image would actually be
opened with the default cache mode instead.

This patch removes the flags parameter to bdrv_common_open() (except for
BDRV_O_NO_BACKING it's the same as bs->open_flags anyway, and having two
names for the same thing is confusing), and moves the assignment of
open_flags down to immediately before calling into the block drivers. In
all other places, bs->open_flags is now used consistently.

Signed-off-by: Kevin Wolf <[email protected]>
Tested-by: Christian Borntraeger <[email protected]>
Reviewed-by: Denis V. Lunev <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>

vnc: distiguish between ipv4/ipv6 omitted vs set to off

The VNC code for interpreting QemuOpts does not currently
distinguish between ipv4/ipv6 being omitted, and being
set to 'off', because historically the 'ipv4' and 'ipv6'
options were just flags which did not accept a value.

The upshot is that if someone runs

  $QEMU -vnc localhost:1,ipv6=off

QEMU still uses PF_UNSPEC and thus may still bind to IPv6,
when it should use PF_INET.

This is another instance of the problem previously fixed
for chardevs in

  commit b77e7c8e99f9ac726c4eaa2fc3461fd886017dc0
  Author: Paolo Bonzini <[email protected]>
  Date:   Mon Oct 12 15:35:16 2015 +0200

    qemu-sockets: fix conversion of ipv4/ipv6 JSON to QemuOpts

Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrange <[email protected]>
Message-id: 1452518225 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

sockets: remove use of QemuOpts from socket_dgram

The socket_dgram method accepts a QAPI SocketAddress object
which it then turns into QemuOpts before calling the
inet_dgram_opts helper method. By converting the latter to
use QAPI SocketAddress directly, the QemuOpts conversion
step can be eliminated.

This removes the very last use of QemuOpts from the
sockets code, so the socket_optslist[] array is also
removed.

Signed-off-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-id: 1452518225 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

sockets: remove use of QemuOpts from socket_connect

The socket_connect method accepts a QAPI SocketAddress object
which it then turns into QemuOpts before calling the
inet_connect_opts/unix_connect_opts helper methods. By
converting the latter to use QAPI SocketAddress directly,
the QemuOpts conversion step can be eliminated

Signed-off-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-id: 1452518225 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

sockets: remove use of QemuOpts from socket_listen

The socket_listen method accepts a QAPI SocketAddress object
which it then turns into QemuOpts before calling the
inet_listen_opts/unix_listen_opts helper methods. By
converting the latter to use QAPI SocketAddress directly,
the QemuOpts conversion step can be eliminated

Signed-off-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-id: 1452518225 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

sockets: remove use of QemuOpts from header file

There are no callers of the sockets methods which accept
QemuOpts any more. Make all the QemuOpts related functions
static to avoid new callers being added, in preparation
for removal of all QemuOpts usage, in favour of QAPI
SocketAddress.

Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrange <[email protected]>
Message-id: 1452518225 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

add MAINTAINERS entry for qemu socket code

Cc: Daniel P. Berrange <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Message-id: 1453129403 [email protected]

io: increment counter when killing off subcommand

When killing the subcommand, it is intended to first send
SIGTERM, then SIGKILL and only report an error if it still
doesn't die after SIGKILL. The 'step' counter was not
being incremented though, so the code never got past the
SIGTERM stage.

Signed-off-by: Daniel P. Berrange <[email protected]>

io: fix sign of errno value passed to error report

When reporting the number of FDs has been exceeded, pass
EINVAL to error_setg_errno, rather than -EINVAL.

Signed-off-by: Daniel P. Berrange <[email protected]>

Merge remote-tracking branch 'remotes/afaerber/tags/qom-devices-for-peter' into staging

QOM infrastructure fixes and device conversions

* Dynamic class properties
* Property iterator cleanup
* Device hot-unplug ID race fix

# gpg: Signature made Mon 18 Jan 2016 17:27:01 GMT using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <[email protected]>"
# gpg:                 aka "Andreas Färber <[email protected]>"

* remotes/afaerber/tags/qom-devices-for-peter:
  MAINTAINERS: Fix sPAPR entry heading
  qdev: Free QemuOpts when the QOM path goes away
  qom: Change object property iterator API contract
  qom: Allow properties to be registered against classes

Signed-off-by: Peter Maydell <[email protected]>

MAINTAINERS: Fix sPAPR entry heading

get_maintainers.pl does not handle parenthesis in maintenance areas well
in connection with list emails (here: [email protected]).

Resolve a recurring CC issue breaking git-send-email by reverting part
of commit 085eb217dfb3ee12e7985c11f71f8a038394735a ("Add David Gibson
for sPAPR in MAINTAINERS file").

Cc: David Gibson <[email protected]>
Signed-off-by: Andreas Färber <[email protected]>

qdev: Free QemuOpts when the QOM path goes away

Otherwise there is a race where the DEVICE_DELETED event has been sent but
attempts to reuse the ID will fail.

Note that similar races exist for other QemuOpts, which this patch
does not attempt to fix.

For example, if the device is a block device, then unplugging it also
deletes its backend.  However, this backend's get deleted in
drive_info_del(), which is only called when properties are
destroyed.  Just like device_finalize(), drive_info_del() is called
some time after DEVICE_DELETED is sent.  A separate patch series has
been sent to plug this other bug.  Character devices also have yet to
be fixed.

Reported-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Andreas Färber <[email protected]>

qom: Change object property iterator API contract

Currently the ObjectProperty iterator API works as follows:

  ObjectPropertyIterator *iter;

  iter = object_property_iter_init(obj);
  while ((prop = object_property_iter_next(iter))) {
     ...
  }
  object_property_iter_free(iter);

This has the benefit that the ObjectPropertyIterator struct
can be opaque, but has the downside that callers need to
explicitly call a free function. It is also not in keeping
with iterator style used elsewhere in QEMU/GLib2.

This patch changes the API to use stack allocation instead:

  ObjectPropertyIterator iter;

  object_property_iter_init(&iter, obj);
  while ((prop = object_property_iter_next(&iter))) {
     ...
  }

Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
[AF: Fused ObjectPropertyIterator struct with typedef]
Signed-off-by: Andreas Färber <[email protected]>

qom: Allow properties to be registered against classes

When there are many instances of a given class, registering
properties against the instance is wasteful of resources. The
majority of objects have a statically defined list of possible
properties, so most of the properties are easily registerable
against the class. Only those properties which are conditionally
registered at runtime need be recorded against the klass.

Registering properties against classes also makes it possible
to provide static introspection of QOM - currently introspection
is only possible after creating an instance of a class, which
severely limits its usefulness.

This impl only supports simple scalar properties. It does not
attempt to allow child object / link object properties against
the class. There are ways to support those too, but it would
make this patch more complicated, so it is left as an exercise
for the future.

There is no equivalent to object_property_del() provided, since
classes must be immutable once they are defined.

Signed-off-by: Daniel P. Berrange <[email protected]>
Signed-off-by: Andreas Färber <[email protected]>

hw/arm: Clean up includes

Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1449505425 [email protected]

target-arm: Clean up includes

Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Message-id: 1449505425 [email protected]

scripts: Add new clean-includes script to fix C include directives

Add a new scripts/clean-includes, which can be used to automatically
ensure that a C source file includes qemu/osdep.h first and doesn't
then include any headers which osdep.h provides already.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Message-id: 1449505425 [email protected]

Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160118-1' into staging

ui: misc small gtk/spice/vnc patches.

# gpg: Signature made Mon 18 Jan 2016 15:52:13 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"

* remotes/kraxel/tags/pull-ui-20160118-1:
  vnc: fix tls-creds error message
  Fix corner-case when using VNC+SASL+SPICE
  vnc: clear vs->tlscreds after unparenting it
  gtk: implement set_echo

Signed-off-by: Peter Maydell <[email protected]>

vnc: fix tls-creds error message

The parameter is called 'tls-creds', 'credid' is just the
variable name in the code.

Signed-off-by: Wolfgang Bumiller <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Message-id: 1452681360 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

Fix corner-case when using VNC+SASL+SPICE

Similarly to the commit 764eb39d1b6 fixing VNC+SASL+QXL, when starting
QEMU with SPICE but no SASL, and at the same time VNC with SASL, then
spice_server_init() will get called without a previous call to
spice_server_set_sasl_appname(), which will cause cyrus-sasl to
try to use /etc/sasl2/spice.conf (spice-server uses "spice" as its
default appname) rather than the expected /etc/sasl2/qemu.conf.

This commit unconditionally calls spice_server_set_sasl_appname()
before calling spice_server_init() in order to use the correct appname
even if SPICE without SASL was requested on qemu command line.

Signed-off-by: Christophe Fergeau <[email protected]>
Message-id: 1452607738 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

vnc: clear vs->tlscreds after unparenting it

This pointer should be cleared in vnc_display_close()
otherwise a use-after-free can happen when when using the
old style 'x509' and 'tls' options rather than a persistent
tls-creds -object, by issuing monitor commands to change
the vnc server like so:

Start with: -vnc unix:test.socket,x509,tls
Then use the following monitor command:
change vnc unix:test.socket

After this the pointer is still set but invalid and a crash
can be triggered for instance by issuing the same command a
second time which will try to object_unparent() the same
pointer again.

Signed-off-by: Wolfgang Bumiller <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>

gtk: implement set_echo

Even without line editing, this makes -qmp vc more pleasant with the
GTK+ backend. The only issue is that set_echo is invoked very early,
long before a vc is actually associated with a VirtualConsole. To work
around this, create a temporary VirtualConsole until then.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-id: 1450356422 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging

qemu-sparc update

# gpg: Signature made Sat 16 Jan 2016 12:32:06 GMT using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <[email protected]>"

* remotes/mcayland/tags/qemu-sparc-signed:
  target-sparc: Migrate CWP and PIL for SPARC64
  target-sparc: Use VMState arrays for SPARC64 TLB/MMU state
  target-sparc: Convert to VMStateDescription
  target-sparc: Don't flush TLB in cpu_load function
  target-sparc: Split cpu_put_psr into side-effect and no-side-effect parts
  vmstate: define vmstate_info_uinttl
  vmstate: Introduce VMSTATE_VARRAY_MULTPLY
  vmstate: introduce CPU_DoubleU arrays

Signed-off-by: Peter Maydell <[email protected]>

target-sparc: Migrate CWP and PIL for SPARC64

In SPARC32 the env->cwp and env->psrpil state is part of the PSR
register, and gets migrated as part of that register.
In SPARC64 this state is in separate CWP and PIL registers, but we
were not doing anything to migrate those.

Add the missing fields to the migration vmstate (which is a
migration break, but without these fields migration is completely
broken anyway).

This change means that trying a save/load of a SPARC64 target at
the boot rom prompt now produces a system which at least responds
to keyboard input after the restore.

Reported-by: Paolo Bonzini <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

target-sparc: Use VMState arrays for SPARC64 TLB/MMU state

Use VMState arrays for SPARC64 TLB/MMU state. This is
a migration-break for SPARC64 (but not for SPARC32),
which is acceptable because currently migration does not
work for any SPARC64 machines due to the lack of any migration
of interrupt controller state.

Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

target-sparc: Convert to VMStateDescription

Convert the SPARC CPU from cpu_load/save functions to VMStateDescription.
We preserve migration compatibility with the previous version
(required for SPARC32 but not necessarily for SPARC64).

Signed-off-by: Juan Quintela <[email protected]>
[PMM:
* Rebase and update to apply to master
* VMSTATE_STRUCT_POINTER now takes type, not pointer-to-type
* QEMUTimer* are migrated via VMSTATE_TIMER_PTR
* Put CPUTimer vmstate struct inside TARGET_SPARC64 ifdef
* Convert handling of PSR to use a vmstate_psr, like Alpha and ARM
]
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

target-sparc: Don't flush TLB in cpu_load function

There's no need to flush the TLB in the SPARC cpu_load function: we're
guaranteed to be loading state into a fresh clean configuration.

Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

target-sparc: Split cpu_put_psr into side-effect and no-side-effect parts

For inbound migration we really want to be able to set the PSR without
having any side effects, but cpu_put_psr() calls cpu_check_irqs() which
might try to deliver CPU interrupts. Split cpu_put_psr() into the
no-side-effect and side-effect parts.

This includes reordering the cpu_check_irqs() to the end of cpu_put_psr(),
because that function may actually end up calling cpu_interrupt(), which
does not seem like a good thing to happen in the middle of updating the PSR.

Suggested-by: Blue Swirl <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

vmstate: define vmstate_info_uinttl

We are going to define arrays of this type, so we need the integer type.

Signed-off-by: Juan Quintela <[email protected]>
[PMM: updated to apply on current QEMU; renamed to 'uinttl'
rather than 'uinttls' to match other vmstate naming]
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

vmstate: Introduce VMSTATE_VARRAY_MULTPLY

This allows to send a partial array where the size is another
structure field multiplied by a constant.

Signed-off-by: Juan Quintela <[email protected]>
[PMM: updated to current master]
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

vmstate: introduce CPU_DoubleU arrays

Add vmstate support for migrating arrays of CPU_DoubleU via
VMSTATE_CPUDOUBLE_ARRAY.

Signed-off-by: Juan Quintela <[email protected]>
[PMM: rebased, since files have all moved since 2012;
added VMSTATE_CPUDOUBLE_ARRAY_V for consistency with FLOAT64]
Signed-off-by: Peter Maydell <[email protected]>
Signed-off-by: Mark Cave-Ayland <[email protected]>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* qemu-char logfile facility
* NBD coroutine based negotiation
* bugfixes

# gpg: Signature made Fri 15 Jan 2016 17:58:28 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg:                 aka "Paolo Bonzini <[email protected]>"

* remotes/bonzini/tags/for-upstream:
  qemu-char: do not leak QemuMutex when freeing a character device
  qemu-char: add logfile facility to all chardev backends
  nbd-server: do not exit on failed memory allocation
  nbd-server: do not check request length except for reads and writes
  nbd-server: Coroutine based negotiation
  nbd: Split nbd.c
  nbd: Always call "close_fn" in nbd_client_new
  SCSI device: fix to incomplete QOMify
  iscsi: send readcapacity10 when readcapacity16 failed
  qemu-char: delete send_all/recv_all helper methods
  vmw_pvscsi: x-disable-pcie, x-old-pci-configuration back-compat props are 2.5 specific
  scsi: initialise info object with appropriate size
  i386: avoid null pointer dereference
  target-i386: do not duplicate page protection checks
  scsi: revert change to scsi_req_cancel_async and add assertions

Signed-off-by: Peter Maydell <[email protected]>

qemu-char: do not leak QemuMutex when freeing a character device

The leak is only apparent on Win32. On POSIX platforms destroying a
mutex is not necessary.

Reported-by: Eric Blake <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qemu-char: add logfile facility to all chardev backends

Typically a UNIX guest OS will log boot messages to a serial
port in addition to any graphical console. An admin user
may also wish to use the serial port for an interactive
console. A virtualization management system may wish to
collect system boot messages by logging the serial port,
but also wish to allow admins interactive access.

Currently providing such a feature forces the mgmt app
to either provide 2 separate serial ports, one for
logging boot messages and one for interactive console
login, or to proxy all output via a separate service
that can multiplex the two needs onto one serial port.
While both are valid approaches, they each have their
own downsides. The former causes confusion and extra
setup work for VM admins creating disk images. The latter
places an extra burden to re-implement much of the QEMU
chardev backends logic in libvirt or even higher level
mgmt apps and adds extra hops in the data transfer path.

A simpler approach that is satisfactory for many use
cases is to allow the QEMU chardev backends to have a
"logfile" property associated with them.

$QEMU -chardev socket,host=localhost,port=9000,\
server=on,nowait,id-charserial0,\
logfile=/var/log/libvirt/qemu/test-serial0.log
-device isa-serial,chardev=charserial0,id=serial0

This patch introduces a 'ChardevCommon' struct which
is setup as a base for all the ChardevBackend types.
Ideally this would be registered directly as a base
against ChardevBackend, rather than each type, but
the QAPI generator doesn't allow that since the
ChardevBackend is a non-discriminated union. The
ChardevCommon struct provides the optional 'logfile'
parameter, as well as 'logappend' which controls
whether QEMU truncates or appends (default truncate).

Signed-off-by: Daniel P. Berrange <[email protected]>
Message-Id: <1452516281 [email protected]>
[Call qemu_chr_parse_common if cd->parse is NULL. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>

nbd-server: do not exit on failed memory allocation

The amount of memory allocated in nbd_co_receive_request is driven by the
NBD client (possibly a virtual machine). Parallel I/O can cause the
server to allocate a large amount of memory; check for failures and
return ENOMEM in that case.

Cc: [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nbd-server: do not check request length except for reads and writes

Only reads and writes need to allocate memory correspondent to the
request length. Other requests can be sent to the storage without
allocating any memory, and thus any request length is acceptable.

Reported-by: Sitsofe Wheeler <[email protected]>
Cc: [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nbd-server: Coroutine based negotiation

Create a coroutine in nbd_client_new, so that nbd_send_negotiate doesn't
need qemu_set_block().

Handlers need to be set temporarily for csock fd in case the coroutine
yields during I/O.

With this, if the other end disappears in the middle of the negotiation,
we don't block the whole event loop.

To make the code clearer, unify all function names that belong to
negotiate, so they are less likely to be misused. This is important
because we rely on negotiation staying in main loop, as commented in
nbd_negotiate_read/write().

Signed-off-by: Fam Zheng <[email protected]>
Message-Id: <1452760863 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nbd: Split nbd.c

We have NBD server code and client code, all mixed in a file. Now split
them into separate files under nbd/, and update MAINTAINERS.

filter_nbd for iotest 083 is updated to keep the log filtered out.

Signed-off-by: Fam Zheng <[email protected]>
Message-Id: <1452760863 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nbd: Always call "close_fn" in nbd_client_new

Rename the parameter "close" to "close_fn" to disambiguous with
close(2).

This unifies error handling paths of NBDClient allocation:
nbd_client_new will shutdown the socket and call the "close_fn" callback
if negotiation failed, so the caller don't need a different path than
the normal close.

The returned pointer is never used, make it void in preparation for the
next patch.

Signed-off-by: Fam Zheng <[email protected]>
Message-Id: <1452760863 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>