Cédric Le Goater [Thu, 17 Jan 2019 07:53:24 +0000 (08:53 +0100)]
xive: add a get_tctx() method to the XiveRouter
It provides a mean to retrieve the XiveTCTX of a CPU. This will become
necessary with future changes which move the interrupt presenter
object pointers under the PowerPCCPU machine_data.
The PowerNV machine has an extra requirement on TIMA accesses that
this new method addresses. The machine can perform indirect loads and
stores on the TIMA on behalf of another CPU. The PIR being defined in
the controller registers, we need a way to peek in the controller
model to find the PIR value.
The XiveTCTX is moved above the XiveRouter definition to avoid forward
typedef declarations.
While looking at the s390x implementation, looks like spapr has a
similar BUG when building the topology.
The primary bus number corresponds always to the bus number of the
bus the bridge is attached to.
Right now, if we have two bridges attached to the same bus (e.g. root
bus) this is however not the case. The first bridge will have primary
bus 0, the second bridge primary bus 1, which is wrong. Fix the assignment.
While at it, drop setting the PCI_SUBORDINATE_BUS temporarily to 0xff.
Setting it temporarily to that value (as discussed e.g. in [1]), is
only relevant for a running system that probes the buses. The value is
effectively unused for us just doing a DFS.
Thomas Huth [Wed, 9 Jan 2019 14:03:23 +0000 (15:03 +0100)]
hw/ppc/spapr: Encode the SCSI channel (bus) in the SRP LUNs
In hw/scsi/spapr_vio.c we declare that the controller supports multiple
buses by specifying "max_channel = 7" there. So in the code that fixes
up the device tree nodes, we must encode the channel number (a.k.a. bus
number in the "Logical unit addressing format" table of SAM5) into the
64-bit LUN, too.
commit efe2add7cb7f ("spapr/vio: deprecate the "irq" property") was
merged in QEMU version 3.0. The "irq" property" can be removed for
QEMU version 4.0.
BALATON Zoltan [Wed, 9 Jan 2019 22:37:33 +0000 (23:37 +0100)]
ppc440: Avoid reporting error when reading non-existent RAM slot
When reading base register of RAM slot with no RAM we should not try
to calculate register value because that will result printing an error
due to invalid RAM size. Just return 0 without the error in this case.
Greg Kurz [Thu, 10 Jan 2019 14:23:58 +0000 (15:23 +0100)]
target/ppc/kvm: Drop useless include directive
It has been there since the enablement of PR KVM for PAPR, ie, commit f61b4bedaf35 in 2011. Not sure why at that time, but it is definitely
not needed with the current code.
BALATON Zoltan [Thu, 3 Jan 2019 16:27:24 +0000 (17:27 +0100)]
sam460ex: Fix support for memory larger than 1GB
Fix the encoding of larger memory modules in the SoC registers which
allows specifying more than 1GB memory for sam460ex. Well, only 2GB
due to SoC and firmware restrictions which was the only missing value
compared to what the real hardware supports. The SoC should support up
to 4GB but when setting that the firmware hangs during memory test.
This may be an overflow bug in the firmware which I did not try to
debug but this may affect real hardware as well.
BALATON Zoltan [Thu, 3 Jan 2019 16:27:24 +0000 (17:27 +0100)]
ppc4xx: Pass array index to function instead of pointer into the array
The sdram_set_bcr() function in ppc440_uc.c takes a pointer into an
array then calculates its index from that. It's simpler and easier to
just pass the index which simplifies both the function and its callers.
Do similar cleanup in ppc4xx_devs.c to similar function.
BALATON Zoltan [Thu, 3 Jan 2019 16:27:24 +0000 (17:27 +0100)]
ppc4xx: Rename ppc4xx_sdram_t in ppc440_uc.c to ppc440_sdram_t
There's already a struct with the same name in ppc4xx_devs.c. They are
not used outside their files so don't clash but they are also not
identical so rename the ppc440 specific one to distinguish them.
BALATON Zoltan [Thu, 3 Jan 2019 16:27:24 +0000 (17:27 +0100)]
sam460ex: Clean up SPD EEPROM creation
Get rid of code from MIPS Malta board used to create SPD EEPROM data
(parts of which was not even needed for sam460ex) and use the generic
spd_data_generate() function to simplify this.
BALATON Zoltan [Thu, 3 Jan 2019 16:27:24 +0000 (17:27 +0100)]
smbus: Add a helper to generate SPD EEPROM data
There are several boards with SPD EEPROMs that are now using
duplicated or slightly different hard coded data. Add a helper to
generate SPD data for a memory module of given type and size that
could be used by these boards (either as is or with further changes if
needed) which should help cleaning this up and avoid further duplication.
This includes spapr-vio and usb-storage fixes, phandles fix for NVLink2
pass through support and other compile improvements.
The full list of changes is:
* vio-vscsi: Support multiple channels / buses
* board-qemu/slof/vio-vscsi: Scan up to 64 SCSI IDs
* usb/storage: Implement block write support
* usb/storage: Invert the logic of the IF-statements
* fdt: Fix phandles for NVLink/NVLink2
* fdt: Factor out code to replace a phandle in place
* pci: use appropriate base class ids
* Makefile: Set a proper DRIVER_NAME when building from a git tree
* romfs/tools: Silence more compiler warnings with GCC 8.1
* romfs/tools: Silence GCC 8.1 compiler warning with FLASHFS_MAGIC
* romfs/tools: Remove superfluous union around the rom header struct
* make.rules: Compile SLOF with -fno-asynchronous-unwind-tables
Peter Maydell [Fri, 1 Feb 2019 17:58:27 +0000 (17:58 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- vmdk: Support for blockdev-create
- block: Apply auto-read-only for ro-whitelist drivers
- virtio-scsi: Fixes related to attaching/detaching iothreads
- scsi-disk: Fixed erroneously detected multipath setup with multiple
disks created with node-names. Added device_id property.
- block: Fix hangs in synchronous APIs with iothreads
- block: Fix invalidate_cache error path for parent activation
- block-backend, mirror, qcow2, vpc, vdi, qemu-iotests:
Minor fixes and code improvements
# gpg: Signature made Fri 01 Feb 2019 15:23:10 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (27 commits)
scsi-disk: Add device_id property
scsi-disk: Don't use empty string as device id
qtest.py: Wait for the result of qtest commands
block: Fix invalidate_cache error path for parent activation
iotests/236: fix transaction kwarg order
iotests: Filter second BLOCK_JOB_ERROR from 229
virtio-scsi: Forbid devices with different iothreads sharing a blockdev
scsi-disk: Acquire the AioContext in scsi_*_realize()
virtio-scsi: Move BlockBackend back to the main AioContext on unplug
block: Eliminate the S_1KiB, S_2KiB, ... macros
block: Remove blk_attach_dev_legacy() / legacy_dev code
block: Apply auto-read-only for ro-whitelist drivers
uuid: Make qemu_uuid_bswap() take and return a QemuUUID
block/vdi: Don't take address of fields in packed structs
block/vpc: Don't take address of fields in packed structs
vmdk: Reject excess extents in blockdev-create
iotests: Add VMDK tests for blockdev-create
iotests: Filter cid numbers in VMDK extent info
vmdk: Implement .bdrv_co_create callback
vmdk: Refactor vmdk_create_extent
...
Peter Maydell [Fri, 1 Feb 2019 16:39:17 +0000 (16:39 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190201' into staging
target-arm queue:
* New machine mps2-an521 -- this is a model of the AN521 FPGA image for the MPS2 devboard
* Fix various places where we failed to UNDEF invalid A64 instructions
* Don't UNDEF a valid FCMLA on 32-bit inputs
* Fix some bugs in the newly-added PAuth implementation
* microbit: Implement NVMC non-volatile memory controller
target/arm: fix AArch64 virtual address space size
Since QEMU does not support the ARMv8.2-LVA, Large Virtual Address,
extension (yet), the VA address space is 48-bits plus a sign bit. User
mode can only handle the positive half of the address space, so that
makes a limit of 48 bits.
(With LVA, it would be 53 and 52 bits respectively.)
The incorrectly large address space conflicts with PAuth instructions,
which use bits 48-54 and 56-63 for the pointer authentication code. This
also conflicts with (as yet unsupported by QEMU) data tagging and with
the ARMv8.5-MTE extension.
Drop the pac properties. This approach cannot work as written
because the properties are applied before arm_cpu_reset, which
zeros SCTLR_EL1 (amongst everything else).
We can re-introduce the properties if they turn out to be useful.
But since linux 5.0 enables all of the keys, they may not be.
Julia Suvorova [Fri, 1 Feb 2019 14:55:46 +0000 (14:55 +0000)]
arm: Clarify the logic of set_pc()
Until now, the set_pc logic was unclear, which raised questions about
whether it should be used directly, applying a value to PC or adding
additional checks, for example, set the Thumb bit in Arm cpu. Let's set
the set_pc logic for “Configure the PC, as was done in the ELF file”
and implement synchronize_with_tb hook for preserving PC to cpu_tb_exec.
target/arm: Add a timer to predict PMU counter overflow
Make PMU overflow interrupts more accurate by using a timer to predict
when they will overflow rather than waiting for an event to occur which
allows us to otherwise check them.
target/arm: Send interrupts on PMU counter overflow
Whenever we notice that a counter overflow has occurred, send an
interrupt. This is made more reliable with the addition of a timer in a
follow-on commit.
Peter Maydell [Fri, 1 Feb 2019 14:55:45 +0000 (14:55 +0000)]
target/arm/translate-a64: Fix mishandling of size in FCMLA decode
In disas_simd_indexed(), for the case of "complex fp", each indexable
element is a complex pair, so the total size is twice that indicated
in the 'size' field in the encoding. We were trying to do this
"double the size" operation with a left shift by 1, but this is
incorrect because the 'size' field is a MO_8/MO_16/MO_32/MO_64
value, and doubling the size should be done by a simple increment.
This meant we were mishandling FCMLA (by element) of values where
the real and imaginary parts are 32-bit floats, and would incorrectly
UNDEF this encoding. (No other insns take this code path, and for
16-bit floats it happens that 1 << 1 and 1 + 1 are both the same).
The FCMLA (by element) instruction exists in the
"vector x indexed element" encoding group, but not in
the "scalar x indexed element" group. Correctly UNDEF
the unallocated encodings.
Peter Maydell [Fri, 1 Feb 2019 14:55:45 +0000 (14:55 +0000)]
exec.c: Don't reallocate IOMMUNotifiers that are in use
The tcg_register_iommu_notifier() code has a GArray of
TCGIOMMUNotifier structs which it has registered by passing
memory_region_register_iommu_notifier() a pointer to the embedded
IOMMUNotifier field. Unfortunately, if we need to enlarge the
array via g_array_set_size() this can cause a realloc(), which
invalidates the pointer that memory_region_register_iommu_notifier()
put into the MemoryRegion's iommu_notify list. This can result
in segfaults.
Switch the GArray to holding pointers to the TCGIOMMUNotifier
structs, so that we can individually allocate and free them.
Peter Maydell [Fri, 1 Feb 2019 14:55:45 +0000 (14:55 +0000)]
target/arm/translate-a64: Don't underdecode SDOT and UDOT
In the AdvSIMD scalar x indexed element and vector x indexed element
encoding group, the SDOT and UDOT instructions are vector only,
and their opcode is unallocated in the scalar group. Correctly
UNDEF this unallocated encoding.
bit 31 is M and bit 29 is S (and bit 30 is 0, already checked at
this point in the decode). None of these groups allocate any
encoding for M=1 or S=1. We checked this in disas_fp_compare(),
disas_fp_ccomp() and disas_fp_csel(), but missed it in disas_fp_1src(),
disas_fp_2src(), disas_fp_3src() and disas_fp_imm().
We also missed that in the fp immediate encoding the imm5 field
must be all zeroes.
In the "add/subtract (extended register)" encoding group, the "opt"
field in bits [23:22] must be zero. Correctly UNDEF the unallocated
encodings where this field is not zero.
Peter Maydell [Fri, 1 Feb 2019 14:55:44 +0000 (14:55 +0000)]
target/arm/translate-a64: Don't underdecode SIMD ld/st single
In the AdvSIMD load/store single structure encodings, the
non-post-indexed case should have zeroes in [20:16] (which is the
Rm field for the post-indexed case). Bit 31 must also be zero
(a check we got right in ldst_multiple but not here). Correctly
UNDEF these unallocated encodings.
In the AdvSIMD load/store multiple structures encodings,
the non-post-indexed case should have zeroes in [20:16]
(which is the Rm field for the post-indexed case).
Correctly UNDEF the currently unallocated encodings which
have non-zeroes in those bits.
Peter Maydell [Fri, 1 Feb 2019 14:55:44 +0000 (14:55 +0000)]
target/arm/translate-a64: Don't underdecode PRFM
The PRFM prefetch insn in the load/store with imm9 encodings
requires idx field 0b00; we were underdecoding this by
only checking !is_unpriv (which is equivalent to idx != 2).
Correctly UNDEF the unallocated encodings where idx == 0b01
and 0b11 as well as 0b10.
Peter Maydell [Fri, 1 Feb 2019 14:55:44 +0000 (14:55 +0000)]
target/arm/translate-a64: Don't underdecode system instructions
The "system instructions" and "system register move" subcategories
of "branches, exception generating and system instructions" for A64
only apply if bits [23:22] are zero; other values are currently
unallocated. Correctly UNDEF these unallocated encodings.
Peter Maydell [Fri, 1 Feb 2019 14:55:44 +0000 (14:55 +0000)]
hw/arm/mps2-tz: Add mps2-an521 model
Add a model of the MPS2 FPGA image described in Application Note
AN521. This is identical to the AN505 image, except that it uses
the SSE-200 rather than the IoTKit and so has two Cortex-M33 CPUs.
Peter Maydell [Fri, 1 Feb 2019 14:55:43 +0000 (14:55 +0000)]
hw/arm/mps2-tz: Add IRQ infrastructure to support SSE-200
In preparation for adding support for the AN521 MPS2 image, we need
to handle wiring up the MPS2 device interrupt lines to both CPUs in
the SSE-200, rather than just the one that the IoTKit has.
Abstract out a "connect to the IoTKit interrupt line" function
and make it connect to a splitter which feeds both sets of inputs
for the SSE-200 case.
The SSE-200 has a CPU_IDENTITY register block, which is a set of
read-only registers. As well as the usual PID/CID registers, there
is a single CPUID register which indicates whether the CPU is CPU 0
or CPU 1. Implement a model of this register block.
Peter Maydell [Fri, 1 Feb 2019 14:55:43 +0000 (14:55 +0000)]
hw/arm/armsse: Add unimplemented-device stub for CPU local control registers
The SSE-200 has a "CPU local security control" register bank; add an
unimplemented-device stub for it. (The register bank has only one
interesting register, which allows the guest to lock down changes
to various CPU registers so they cannot be modified further. We
don't support that in our Cortex-M33 model anyway.)
Peter Maydell [Fri, 1 Feb 2019 14:55:43 +0000 (14:55 +0000)]
hw/arm/armsse: Add unimplemented-device stubs for MHUs
The SSE-200 has two Message Handling Units (MHUs), which sit behind
the APB PPC0. Wire up some unimplemented-device stubs for these,
since we don't yet implement a real model of this device.
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
iotkit-sysinfo: Make SYS_VERSION and SYS_CONFIG configurable
The SYS_VERSION and SYS_CONFIG register values differ between the
IoTKit and SSE-200. Make them configurable via QOM properties rather
than hard-coded, and set them appropriately in the ARMSSE code that
instantiates the IOTKIT_SYSINFO device.
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
hw/arm/armsse: Put each CPU in its own cluster object
Create a cluster object to hold each CPU in the SSE. They are
logically distinct and may be configured differently (for instance
one may not have an FPU where the other does).
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
hw/arm/armsse: Give each CPU its own view of memory
Give each CPU its own container memory region. This is necessary
for two reasons:
* some devices are instantiated one per CPU and the CPU sees only
its own device
* since a memory region can only be put into one container, we must
give each armv7m object a different MemoryRegion as its 'memory'
property, or a dual-CPU configuration will assert on realize when
the second armv7m object tries to put the MR into a container when
it is already in the first armv7m object's container
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
hw/arm/armsse: Support dual-CPU configuration
The SSE-200 has two Cortex-M33 CPUs. These see the same view
of memory, with the exception of the "private CPU region" which
has per-CPU devices. Internal device interrupts for SSE-200
devices are mostly wired up to both CPUs, with the exception of
a few per-CPU devices. External GPIO inputs on the SSE-200
device are provided for the second CPU's interrupts above 32,
as is already the case for the first CPU.
Refactor the code to support creation of multiple CPUs.
For the moment we leave all CPUs with the same view of
memory: this will not work in the multiple-CPU case, but
we will fix this in the following commit.
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
hw/arm/armsse: Make SRAM bank size configurable
For the IoTKit the SRAM bank size is always 32K (15 bits); for the
SSE-200 this is a configurable parameter, which defaults to 32K but
can be changed when it is built into a particular SoC. For instance
the Musca-B1 board sets it to 128K (17 bits).
Make the bank size a QOM property. We follow the SSE-200 hardware in
naming the parameter SRAM_ADDR_WIDTH, which specifies the number of
address bits of a single SRAM bank.
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
hw/arm/armsse: Make number of SRAM banks parameterised
The SSE-200 has four banks of SRAM, each with its own
Memory Protection Controller, where the IoTKit has only one.
Make the number of SRAM banks a field in ARMSSEInfo.
Peter Maydell [Fri, 1 Feb 2019 14:55:42 +0000 (14:55 +0000)]
hw/misc/iotkit-secctl: Support 4 internal MPCs
The SSE-200 has 4 banks of SRAM, each with its own internal
Memory Protection Controller. The interrupt status for these
extra MPCs appears in the same security controller SECMPCINTSTATUS
register as the MPC for the IoTKit's single SRAM bank. Enhance the
iotkit-secctl device to allow 4 MPCs. (If the particular IoTKit/SSE
variant in use does not have all 4 MPCs then the unused inputs will
simply result in the SECMPCINTSTATUS bits being zero as required.)
The hardcoded constant "1"s in armsse.c indicate the actual number
of SRAM MPCs the IoTKit has, and will be replaced in the following
commit.
Peter Maydell [Fri, 1 Feb 2019 14:55:41 +0000 (14:55 +0000)]
hw/arm/iotkit: Rename 'iotkit' local variables and functions
Rename various internal uses of 'iotkit' in hw/arm/iotkit.c to
'armsse', for consistency. The remaining occurences are:
* related to the devices TYPE_IOTKIT_SYSCTL, TYPE_IOTKIT_SYSINFO,
etc, which this refactor is not touching
* references that apply specifically to the IoTKit (like
the lack of a private CPU region)
* the vmstate, which keeps its old "iotkit" name for
migration compatibility reasons
Peter Maydell [Fri, 1 Feb 2019 14:55:41 +0000 (14:55 +0000)]
hw/arm/iotkit: Refactor into abstract base class and subclass
The Arm SSE-200 Subsystem for Embedded is a revised and
extended version of the older IoTKit SoC. Prepare for
adding a model of it by refactoring the IoTKit code into
an abstract base class which contains the functionality,
driven by a class data block specific to each subclass.
(This is the same approach used by the existing bcm283x
SoC family implementation.)
Peter Maydell [Fri, 1 Feb 2019 14:55:41 +0000 (14:55 +0000)]
hw/arm/iotkit: Rename IoTKit to ARMSSE
The Arm IoTKit was effectively the forerunner of a series of
subsystems for embedded SoCs, named the SSE-050, SSE-100 and SSE-200:
https://developer.arm.com/products/system-design/subsystems
These are generally quite similar, though later iterations have
extra devices that earlier ones do not.
We want to add a model of the SSE-200, which means refactoring the
IoTKit code into an abstract base class and subclasses (using the
same design that the bcm283x SoC and Aspeed SoC family
implementations do). As a first step, rename the IoTKit struct and
QOM macros to ARMSSE, which is what we're going to name the base
class. We temporarily retain TYPE_IOTKIT to avoid changing the
code that instantiates a TYPE_IOTKIT device here and then changing
it back again when it is re-introduced as a subclass.
Peter Maydell [Fri, 1 Feb 2019 14:55:41 +0000 (14:55 +0000)]
armv7m: Pass through start-powered-off CPU property
Expose "start-powered-off" as a property of the ARMv7M container,
which we just pass through to the CPU object in the same way that we
do for "init-svtor" and "idau". (We want this for the SSE-200, which
powers up only the first CPU at reset and leaves the second powered
down.)
As with the other CPU properties here, we can't just use alias
properties, because the CPU QOM object is not created until armv7m
realize time.
Peter Maydell [Fri, 1 Feb 2019 14:55:41 +0000 (14:55 +0000)]
armv7m: Make cpu object a child of the armv7m container
Rather than just creating the CPUs with object_new, make them child
objects of the armv7m container. This will allow the cluster code to
find the CPUs if an armv7m object is made a child of a cluster object.
object_new_with_props() will do the parenting for us.
Peter Maydell [Fri, 1 Feb 2019 14:55:41 +0000 (14:55 +0000)]
armv7m: Don't assume the NVIC's CPU is CPU 0
Currently the ARMv7M NVIC object's realize method assumes that the
CPU the NVIC is attached to is CPU 0, because it thinks there can
only ever be one CPU in the system. To allow a dual-Cortex-M33
setup we need to remove this assumption; instead the armv7m
wrapper object tells the NVIC its CPU, in the same way that it
already tells the CPU what the NVIC is.
* remotes/kraxel/tags/ui-20190201-pull-request:
ui: remove support for SDL1.2 in favour of SDL2
hw/display/milkymist-tmu2: Move inlined code from header to source
hw/display/milkymist-tmu2: Explicit the dependency to both X11 / OpenGL
configure: LM32 Milkymist Texture Mapping Unit (tmu2) also depends of X11
hw/display: Move Milkymist specific hardware out of common-obj list
Kevin Wolf [Fri, 25 Jan 2019 17:20:27 +0000 (18:20 +0100)]
scsi-disk: Add device_id property
The new device_id property specifies which value to use for the vendor
specific designator in the Device Identification VPD page.
In particular, this is necessary for libvirt to maintain guest ABI
compatibility when no serial number is given and a VM is switched from
-drive (where the BlockBackend name is used) to -blockdev (where the
vendor specific designator is left out by default).
Kevin Wolf [Fri, 25 Jan 2019 16:30:28 +0000 (17:30 +0100)]
scsi-disk: Don't use empty string as device id
scsi-disk includes in the Device Identification VPD page, depending on
configuration amongst others, a vendor specific designator that consists
either of the serial number if given or the BlockBackend name (which is
a host detail that better shouldn't have been leaked to the guest, but
now we have to maintain it for compatibility).
With anonymous BlockBackends, i.e. scsi-disk devices constructed with
drive=<node-name>, and no serial number explicitly specified, this ends
up as an empty string. If this happens to more than one disk, we have
accidentally signalled to the OS that this is a multipath setup, which
is obviously not what was intended.
Instead of using an empty string for the vendor specific designator,
simply leave out that designator, which makes Linux detect such setups
as separate disks again.
Alberto Garcia [Thu, 31 Jan 2019 12:38:10 +0000 (14:38 +0200)]
qtest.py: Wait for the result of qtest commands
The cmd() method of the QEMUQtestProtocol class sends a qtest command
to QEMU but doesn't wait for the return message ("OK", "FAIL", "ERR").
Because of this, it can return control to the caller before the
command has actually finished.
In cases like clock_step or clock_set this means that cmd() can return
before all the timers triggered by the clock change have been fired.
This can be fixed by making cmd() wait for the output of the qtest
command.
This fixes iotests 093 and 136, which are flaky since commit 8258292e18c39480b64eba9f3551 when the machine is under heavy workload.
Kevin Wolf [Thu, 31 Jan 2019 14:16:10 +0000 (15:16 +0100)]
block: Fix invalidate_cache error path for parent activation
bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
actually activating a node so that the correct permissions etc. are
taken. In case of errors, the flag must be restored so that the next
call to bdrv_co_invalidate_cache() retries activation.
Restoring the flag was missing in the error path for a failed
parent->role->activate() call. The consequence is that this attempt to
activate all images correctly fails because we still set errp, however
on the next attempt BDRV_O_INACTIVE is already clear, so we return
success without actually retrying the failed action.
An example where this is observable in practice is migration to a QEMU
instance that has a raw format block node attached to a guest device
with share-rw=off (the default) while another process holds
BLK_PERM_WRITE for the same image. In this case, all activation steps
before parent->role->activate() succeed because raw can tolerate other
writers to the image. Only the parent callback (in particular
blk_root_activate()) tries to implement the share-rw=on property and
requests exclusive write permissions. This fails when the migration
completes and correctly displays an error. However, a manual 'cont' will
incorrectly resume the VM without calling blk_root_activate() again.
This case is described in more detail in the following bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1531888
Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
path.
We define 54 macros for the powers of two >= 1024. We use six, in six
macro definitions. Four of them could just as well use the common MiB
macro, so do that. The remaining two can't, because they get passed
to stringify. Replace the macro by the literal number there.
Slightly harder to read in one instance (1048576 vs. S_1MiB), so add a
comment there. The other instance is a wash: 65536 vs S_64KiB. 65536
has been good enough for more than seven years there.
The last user of blk_attach_dev_legacy() was the code in xen_disk which
has recently been reworked. Now there is no user for this legacy function
anymore. Thus we can finally remove all code related to the "legacy_dev"
flag, too, and turn the related "void *" in block-backend.c into proper
"DeviceState *" to fix some of the remaining TODOs there.
Kevin Wolf [Tue, 22 Jan 2019 12:15:31 +0000 (13:15 +0100)]
block: Apply auto-read-only for ro-whitelist drivers
If QEMU was configured with a driver in --block-drv-ro-whitelist, trying
to use that driver read-write resulted in an error message even if
auto-read-only=on was set.
Consider auto-read-only=on for the whitelist checking and use it to
automatically degrade to read-only for block drivers on the read-only
whitelist.
Peter Maydell [Mon, 10 Dec 2018 11:26:49 +0000 (11:26 +0000)]
uuid: Make qemu_uuid_bswap() take and return a QemuUUID
Currently qemu_uuid_bswap() takes a pointer to the QemuUUID to
be byte-swapped. This means it can't be used when the UUID
to be swapped is in a packed member of a struct. It's also
out of line with the general bswap*() functions we provide
in bswap.h, which take the value to be swapped and return it.
Make qemu_uuid_bswap() take a QemuUUID and return the swapped version.
This fixes some clang warnings about taking the address of
a packed struct member in block/vdi.c.
Peter Maydell [Mon, 10 Dec 2018 11:26:48 +0000 (11:26 +0000)]
block/vdi: Don't take address of fields in packed structs
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this.
Instead of passing UUID related functions the address of a possibly
unaligned QemuUUID struct, use local variables and then copy to/from
the struct field as appropriate.
Peter Maydell [Mon, 10 Dec 2018 11:26:47 +0000 (11:26 +0000)]
block/vpc: Don't take address of fields in packed structs
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this. Avoid the bug by generating the
UUID into a local variable which is definitely safely aligned and
then copying it into place.
Kevin Wolf [Fri, 7 Dec 2018 11:42:19 +0000 (12:42 +0100)]
vmdk: Reject excess extents in blockdev-create
Clarify that the number of extents provided in BlockdevCreateOptionsVmdk
must match the number of extents that will actually be used. Providing
more extents will result in an error now.
This requires adapting the test case to provide the right number of
extents.
Fam Zheng [Tue, 15 May 2018 15:36:32 +0000 (23:36 +0800)]
vmdk: Implement .bdrv_co_create callback
This makes VMDK support blockdev-create. The implementation reuses the
image creation code in vmdk_co_create_opts which now acceptes a callback
pointer to "retrieve" BlockBackend pointers from the caller. This way we
separate the logic between file/extent acquisition and initialization.
The QAPI command parameters are mostly the same as the old create_opts
except the dropped legacy @compat6 switch, which is redundant with
@hwversion.
Fam Zheng [Tue, 15 May 2018 15:36:31 +0000 (23:36 +0800)]
vmdk: Refactor vmdk_create_extent
The extracted vmdk_init_extent takes a BlockBackend object and
initializes the format metadata. It is the common part between "qemu-img
create" and "blockdev-create".
Add a "BlockBackend *pbb" parameter to vmdk_create_extent, to return the
opened BB to the caller in the next patch.
Max Reitz [Fri, 21 Dec 2018 18:42:26 +0000 (19:42 +0100)]
iotests: Make 234 stable
This test waits for a MIGRATION event with status=completed on the
source VM before querying the migration status on both source and
destination. However, just because the source says migration has
completed does not mean the destination thinks the same. Therefore, in
some cases, the destination VM may still report "active" instead of
"completed" when asked for its migration status.
Fix this by enabling migration events on both VMs and waiting until both
source and destination emit a status=completed MIGRATION event.
Kevin Wolf [Mon, 7 Jan 2019 12:02:48 +0000 (13:02 +0100)]
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
yuchenlin [Sat, 5 Jan 2019 08:42:43 +0000 (16:42 +0800)]
qemu-iotests: add test case for dmg
Recently, some bugs in dmg file have been fixed. To prevent reading dmg
is broken someday in the future, add a simple test which ensures the
conversion from dmg to raw should not hang or face any I/O error.
Alberto Garcia [Wed, 14 Nov 2018 14:58:57 +0000 (16:58 +0200)]
qcow2: Assert that refcount block offsets fit in the refcount table
Refcount table entries have a field to store the offset of the
refcount block. The rest of the bits of the entry are currently
reserved.
The offset is always taken from the entry using REFT_OFFSET_MASK to
ensure that we only use the bits that belong to that field.
While that mask is used every time we read from the refcount table, it
is never used when we write to it. Due to the other constraints of the
qcow2 format QEMU can never produce refcount block offsets that don't
fit in that field so any such offset when allocating a refcount block
would indicate a bug in QEMU.
Alberto Garcia [Thu, 22 Nov 2018 15:00:27 +0000 (17:00 +0200)]
mirror: Block the source BlockDriverState in mirror_start_job()
The mirror_start_job() function used for the commit-active job blocks
the source, target and all intermediate nodes for the duration of the
job.
target <- intermediate <- source
Since 4ef85a9c2339 this function creates a dummy mirror_top_bs that
goes on top of the source node, and it is this dummy node that gets
blocked instead. The source node is never blocked or added to the
job's list of nodes.
target <- intermediate <- source <- mirror_top
At the moment I don't think it is possible to exploit this problem
because any additional job on 'source' would either be forbidden for
other reasons or it would need to involve an additional node that is
blocked, causing an error.
This can be seen in the error messages, however, because they never
refer to the source node being blocked:
Alberto Garcia [Thu, 22 Nov 2018 15:00:26 +0000 (17:00 +0200)]
mirror: Release the dirty bitmap if mirror_start_job() fails
At the moment I don't see how to make this function fail after the
dirty bitmap has been created, but if that was possible then we would
hit the assert(QLIST_EMPTY(&bs->dirty_bitmaps)) in bdrv_close().
ui: deprecate use of SDL 1.2 in favour of 2.0 series
The SDL 2.0 release was made in Aug, 2013:
https://www.libsdl.org/release/
That will soon be 4 + 1/2 years ago, which is enough time to consider
the 2.0 series widely supported.
Thus we deprecate the SDL 1.2 support, which will allow us to delete it
in the last release of 2018. By this time, SDL 2.0 will be more than 5
years old.
hw/display/milkymist-tmu2: Move inlined code from header to source
Move the complexity of milkymist_tmu2_create() into the
source file. Doing so we avoid to include the X11/OpenGL
headers in all LM32 devices, and we also avoid the duplicate
declaration of glx_fbconfig_attr[] (it is already declared
in hw/display/milkymist-tmu2.c).
Since TYPE_MILKYMIST_TMU2 is now accessible, use it.
configure: LM32 Milkymist Texture Mapping Unit (tmu2) also depends of X11
Commit 5f9b1e35060b8 remove the dependency between OpenGL and X11.
However the milkymist-tmu2 device do require X11.
When using SDL, the configure script sets need_x11=yes, so the X11
flags are populated to the makefiles.
When building without SDL, X11 is not pulled and populated, leading
to a link failure:
LINK lm32-softmmu/qemu-system-lm32
hw/lm32/milkymist.o: In function `milkymist_tmu2_create':
hw/lm32/milkymist-hw.h:114: undefined reference to `XOpenDisplay'
hw/lm32/milkymist-hw.h:140: undefined reference to `XFree'
hw/lm32/milkymist-hw.h:141: undefined reference to `XCloseDisplay'
hw/lm32/milkymist-hw.h:130: undefined reference to `XCloseDisplay'
../hw/display/milkymist-tmu2.o: In function `tmu2_glx_init':
hw/display/milkymist-tmu2.c:112: undefined reference to `XOpenDisplay'
hw/display/milkymist-tmu2.c:123: undefined reference to `XFree'
collect2: error: ld returned 1 exit status
gmake[1]: *** [Makefile:199: qemu-system-lm32] Error 1
Enforce the X11 dependency when the LM32 target is built.
This will allow us to build QEMU without SDL.