The existing QemuOpts parsing code uses a fixed size 1024 byte buffer
for storing the option values. If a value exceeded this size it was
silently truncated and no error reported to the user. Long option values
is not a common scenario, but it is conceivable that they will happen.
eg if the user has a very deeply nested filesystem it would be possible
to come up with a disk path that was > 1024 bytes. Most of the time if
such data was silently truncated, the user would get an error about
opening a non-existant disk. If they're unlucky though, QEMU might use a
completely different disk image from another VM, which could be
considered a security issue. Another example program was in using the
-smbios command line arg with very large data blobs. In this case the
silent truncation will be providing semantically incorrect data to the
guest OS for SMBIOS tables.
If the operating system didn't limit the user's argv when spawning QEMU,
the code should honour whatever length arguments were given without
imposing its own length restrictions. This patch thus changes the code
to use a heap allocated buffer for storing the values during parsing,
lifting the arbitrary length restriction.
The existing QemuOpts parsing code uses a fixed size 128 byte buffer
for storing the parameter keys. If a key exceeded this size it was
silently truncate and no error reported to the user. This behaviour was
reasonable & harmless because traditionally the key names are all
statically declared, and it was known that no code was declaring a key
longer than 127 bytes. This assumption, however, ceased to be valid once
the block layer added support for dot-separate compound keys. This
syntax allows for keys that can be arbitrarily long, limited only by the
number of block drivers you can stack up. With this usage, silently
truncating the key name can never lead to correct behaviour.
Hopefully such truncation would turn into an error, when the block code
then tried to extract options later, but there's no guarantee that will
happen. It is conceivable that an option specified by the user may be
truncated and then ignored. This could have serious consequences,
possibly even leading to security problems if the ignored option set a
security relevant parameter.
If the operating system didn't limit the user's argv when spawning QEMU,
the code should honour whatever length arguments were given without
imposing its own length restrictions. This patch thus changes the code
to use a heap allocated buffer for storing the keys during parsing,
lifting the arbitrary length restriction.
Roman Kagan [Fri, 13 Apr 2018 14:33:54 +0000 (17:33 +0300)]
update-linux-headers: drop hyperv.h
As of mainline linux commit 5a485803221777013944cbd1a7cd5c62efba3ffa
"x86/hyper-v: move hyperv.h out of uapi" by Vitaly Kuznetsov, no linux
uapi header includes it, so we no longer need to create a stub for it.
Peter Xu [Thu, 12 Apr 2018 05:34:44 +0000 (13:34 +0800)]
qemu-thread: always keep the posix wrapper layer
We will conditionally have a wrapper layer depending on whether the host
has the PTHREAD_SETNAME capability. It complicates stuff. Let's keep
the wrapper there; we opt out the pthread_setname_np() call only.
Paolo Bonzini [Sun, 18 Mar 2018 17:26:36 +0000 (18:26 +0100)]
exec: reintroduce MemoryRegion caching
MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs. Reinstate the
optimizations, caching only the IOMMU translation at address_cache_init
but not the IOMMU lookup and target AddressSpace translation are not
cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely
applicable too.
The inlined fast path is defined in memory_ldst_cached.inc.h, while the
slow path uses memory_ldst.inc.c as before. The smaller fast path causes
a little code size reduction in MemoryRegionCache users:
hw/virtio/virtio.o text size before: 32373
hw/virtio/virtio.o text size after: 31941
Paolo Bonzini [Sat, 3 Mar 2018 16:24:04 +0000 (17:24 +0100)]
exec: extract address_space_translate_iommu, fix page_mask corner case
This will be used to process IOMMUs in a MemoryRegionCache. This
includes a small bugfix, in that the returned page_mask is now
correctly -1 if the IOMMU memory region maps the entire address
space directly. Previously, address_space_get_iotlb_entry would
return ~TARGET_PAGE_MASK.
Paolo Bonzini [Tue, 17 Apr 2018 09:39:35 +0000 (11:39 +0200)]
exec: small changes to flatview_do_translate
Prepare for extracting the IOMMU part to a separate function. Mostly
cosmetic; the only semantic change is that, if there is more than one
cascaded IOMMU and the second one fails to translate, *plen_out is now
adjusted according to the page mask of the first IOMMU.
Paolo Bonzini [Mon, 30 Apr 2018 09:48:18 +0000 (11:48 +0200)]
memdev: remove "id" property
The "id" property is unnecessary and can be replaced simply with
object_get_canonical_path_component. This patch mostly undoes commit e1ff3c67e8 ("monitor: fix qmp/hmp query-memdev not reporting IDs of
memory backends", 2017-01-12).
Commit 9b0605f9837b ("cpus: tcg: unregister thread with RCU, fix
exiting of loop on unplug") changed the exit condition of the loop in
the vCPU thread function but forgot to remove the beginning 'while (1)'
statement. The resulting code :
while (1) {
...
} while (!cpu->unplug || cpu_can_run(cpu));
is a sequence of two distinct two while() loops, the first not exiting
in case of an unplug event.
3. Run a guest with an additional scratch disk, i.e. with additional
arguments
-drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
-device virtio-blk-pci,id=scratch,drive=scratch-drive
The blkdebug part makes all writes to the scratch drive fail with
EIO. The werror=stop pauses the guest on write errors.
4. Connect to the QMP socket e.g. like this:
$ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event. Good.
After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP. Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".
The culprit is vm_prepare_start().
/* Ensure that a STOP/RESUME pair of events is emitted if a
* vmstop request was pending. The BLOCK_IO_ERROR event, for
* example, according to documentation is always followed by
* the STOP event.
*/
if (runstate_is_running()) {
qapi_event_send_stop(&error_abort);
res = -1;
} else {
replay_enable_events();
cpu_enable_ticks();
runstate_set(RUN_STATE_RUNNING);
vm_state_notify(1, RUN_STATE_RUNNING);
}
/* We are sending this now, but the CPUs will be resumed shortly later */
qapi_event_send_resume(&error_abort);
return res;
When resuming a stopped guest, we take the else branch before we get
to sending RESUME. vm_state_notify() runs virtio_vmstate_change(),
among other things. This restarts I/O, triggering the BLOCK_IO_ERROR
event.
Reshuffle vm_prepare_start() to send the RESUME event earlier.
Peter Maydell [Tue, 8 May 2018 14:25:17 +0000 (15:25 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine queue, 2018-05-07
* pc-dimm: factor out MemoryDevice
(virtio-pmem and virtio-mem will make use of the new abstraction later)
* scripts/device-crash-test: Removed fixed CAN entries
# gpg: Signature made Mon 07 May 2018 18:01:42 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
scripts/device-crash-test: Removed fixed CAN entries
vl: allow 'maxmem' without 'slot'
spapr: rename "hotplug memory" terminology to "device memory"
pc: rename "hotplug memory" terminology to "device memory"
machine: rename MemoryHotplugState to DeviceMemoryState
pc-dimm: move actual plug/unplug of a memory region to MemoryDevice
pc-dimm: factor out capacity and slot checks into MemoryDevice
pc-dimm: factor out address search into MemoryDevice code
pc-dimm: pass in the machine and to the MemoryHotplugState
pc-dimm: no need to pass the memory region
machine: make MemoryHotplugState accessible via the machine
pc-dimm: factor out MemoryDevice interface
Peter Maydell [Tue, 8 May 2018 12:34:03 +0000 (13:34 +0100)]
Merge remote-tracking branch 'remotes/riscv/tags/riscv-qemu-2.13-pull-20180506' into staging
RISC-V: QEMU 2.13 Privileged ISA emulation updates
Several code cleanups, minor specification conformance changes,
fixes to make ROM read-only and add device-tree size checks.
* Honour privileged ISA v1.10 counter enable CSRs.
* Implements WARL behavior for CSRs that don't support writes
* Past behavior of raising traps was non-conformant
with the RISC-V Privileged ISA Specification v1.10.
* Allow S-mode access to sstatus.MXR when priv ISA >= v1.10
* Sets mtval/stval to zero on exceptions without addresses
* Past behavior of leaving the last value was non-conformant
with the RISC-V Privileged ISA Specition v1.10. mtval/stval
must be set on all exceptions; to zero if not supported.
* Make ROMs read-only and implement device-tree size checks
* Uses memory_region_init_rom and rom_add_blob_fixed_as
* Adds hexidecimal instruction bytes to disassembly output.
* Fixes missing break statement for rv128 disassembly.
* Several code cleanups
* Replacing hard-coded constants with enums
* Dead-code elimination
This is an incremental pull that contains 20 reviewed changes out
of 38 changes currently queued in the qemu-2.13-for-upstream branch.
# gpg: Signature made Sun 06 May 2018 00:27:37 BST
# gpg: using DSA key 6BF1D7B357EF3E4F
# gpg: Good signature from "Michael Clark <[email protected]>"
# gpg: aka "Michael Clark <[email protected]>"
# gpg: aka "Michael Clark <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7C99 930E B17C D8BA 073D 5EFA 6BF1 D7B3 57EF 3E4F
* remotes/riscv/tags/riscv-qemu-2.13-pull-20180506:
RISC-V: Mark ROM read-only after copying in code
RISC-V: No traps on writes to misa,minstret,mcycle
RISC-V: Make mtvec/stvec ignore vectored traps
RISC-V: Add mcycle/minstret support for -icount auto
RISC-V: Use [ms]counteren CSRs when priv ISA >= v1.10
RISC-V: Allow S-mode mxr access when priv ISA >= v1.10
RISC-V: Clear mtval/stval on exceptions without info
RISC-V: Hardwire satp to 0 for no-mmu case
RISC-V: Update E and I extension order
RISC-V: Remove erroneous comment from translate.c
RISC-V: Remove EM_RISCV ELF_MACHINE indirection
RISC-V: Make virt header comment title consistent
RISC-V: Make some header guards more specific
RISC-V: Fix missing break statement in disassembler
RISC-V: Include instruction hex in disassembly
RISC-V: Remove unused class definitions
RISC-V: Remove identity_translate from load_elf
RISC-V: Use ROM base address and size from memmap
RISC-V: Make virt board description match spike
RISC-V: Replace hardcoded constants with enum values
* remotes/kraxel/tags/usb-20180507-pull-request:
usb-host: skip open on pending postload bh
usb-mtp: Unconditionally check for the readonly bit
usb-mtp: Add some NULL checks for issues pointed out by coverity
Greg Kurz [Mon, 7 May 2018 09:02:09 +0000 (11:02 +0200)]
ppc: e500: use g_strdup_printf() instead of snprintf()
qemu-system-ppc fails to build with GCC 8.0.1:
/home/hsp/src/qemu-master/hw/ppc/e500.c: In function ‘ppce500_load_device_tree’:
/home/hsp/src/qemu-master/hw/ppc/e500.c:442:37: error: ‘/pic@’
directive output may be truncated writing 5 bytes into a region of
size between 1 and 128 [-Werror=format-truncation=]
snprintf(mpic, sizeof(mpic), "%s/pic@%llx", soc, MPC8544_MPIC_REGS_OFFSET);
^~~~~
In file included from /usr/include/stdio.h:862,
from /home/hsp/src/qemu-master/include/qemu/osdep.h:68,
from /home/hsp/src/qemu-master/hw/ppc/e500.c:17:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’
output between 11 and 138 bytes into a destination of size 128
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/hsp/src/qemu-master/hw/ppc/e500.c:470:39: error:
‘/global-utilities@’ directive output may be truncated writing 18
bytes into a region of size between 1 and 128
[-Werror=format-truncation=]
snprintf(gutil, sizeof(gutil), "%s/global-utilities@%llx", soc,
^~~~~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:862,
from /home/hsp/src/qemu-master/include/qemu/osdep.h:68,
from /home/hsp/src/qemu-master/hw/ppc/e500.c:17:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’
output between 24 and 151 bytes into a destination of size 128
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/hsp/src/qemu-master/hw/ppc/e500.c:477:36: error: ‘/msi@’
directive output may be truncated writing 5 bytes into a region of
size between 0 and 127 [-Werror=format-truncation=]
snprintf(msi, sizeof(msi), "/%s/msi@%llx", soc, MPC8544_MSI_REGS_OFFSET);
^~~~~
In file included from /usr/include/stdio.h:862,
from /home/hsp/src/qemu-master/include/qemu/osdep.h:68,
from /home/hsp/src/qemu-master/hw/ppc/e500.c:17:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’
output between 12 and 139 bytes into a destination of size 128
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by converting e500 to use g_strdup_printf()+g_free() instead
of snprintf(). This is done globally, even for call sites that don't
break build, since this is the preferred practice in QEMU.
We will be able to have memory devices (e.g. virtio) not requiring the
slot parameter (e.g. not exposed via ACPI). We still need the maxmem
parameter to setup a proper memory region for device memory. And some
architectures (e.g. s390x) will have to set up the maximum possible guest
address space size based on the maxmem parameter.
As far as I can see, all code (pc.c,spapr.c,ACPI code) should handle
!slots just fine, even though maxmem is set.
pc-dimm: factor out capacity and slot checks into MemoryDevice
Move the checks into memory_device_get_free_addr(). This will check
before doing any calculations if we have KVM/vhost slots left and if
the total region size would be exceeded.
Of course, while at it, make it independent of pc-dimm code.
pc-dimm: factor out address search into MemoryDevice code
This mainly moves code, but does a handfull of optimizations:
- We pass the machine instead of the address space properties
- We check the hinted address directly and handle fragmented memory
better
- We make the search independent of pc-dimm
We can just query it ourselves. When unplugging, we should always be
able to the region (as it was previously plugged). E.g. PPC already
assumed that and used &error_abort.
machine: make MemoryHotplugState accessible via the machine
Let's allow to query the MemoryHotplugState directly from the machine.
If the pointer is NULL, the machine does not support memory devices. If
the pointer is !NULL, the machine supports memory devices and the
data structure contains information about the applicable physical
guest address space region.
This allows us to generically detect if a certain machine has support
for memory devices, and to generically manage it (find free address
range, plug/unplug a memory region).
We will rename "MemoryHotplugState" to something more meaningful
("DeviceMemory") after we completed factoring out the pc-dimm code into
MemoryDevice code.
On the qmp level, we already have the concept of memory devices:
"query-memory-devices"
Right now, we only support NVDIMM and PCDIMM.
We want to map other devices later into the address space of the guest.
Such device could e.g. be virtio devices. These devices will have a
guest memory range assigned but won't be exposed via e.g. ACPI. We want
to make them look like memory device, but not glued to pc-dimm.
Especially, it will not always be possible to have TYPE_PC_DIMM as a parent
class (e.g. virtio devices). Let's use an interface instead. As a first
part, convert handling of
- qmp_pc_dimm_device_list
- get_plugged_memory_size
to our new model. plug/unplug stuff etc. will follow later.
A memory device will have to provide the following functions:
- get_addr(): Necessary, as the property "addr" can e.g. not be used for
virtio devices (already defined).
- get_plugged_size(): The amount this device offers to the guest as of
now.
- get_region_size(): Because this can later on be bigger than the
plugged size.
- fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.
Make sure we only ask the spice local renderer for display updates in
case we have a valid primary surface. Without that spice is confused
and throws errors in case a display update request (triggered by
screendump for example) happens in parallel to a mode switch and hits
the race window where the old primary surface is gone and the new isn't
establisted yet.
Gerd Hoffmann [Thu, 3 May 2018 06:29:32 +0000 (08:29 +0200)]
usb-host: skip open on pending postload bh
usb-host emulates a device unplug after live migration, because the
device state is unknown and unplug/replug makes sure the guest
re-initializes the device into a working state. This can't be done in
post-load though, so post-load just schedules a bottom half which
executes after vmload is complete.
It can happen that the device autoscan timer hits the race window
between scheduling and running the bottom half, which in turn can
triggers an assert().
Fix that issue by just ignoring the usb_host_open() call in case the
bottom half didn't execute yet.
Bandan Das [Thu, 3 May 2018 19:20:27 +0000 (15:20 -0400)]
usb-mtp: Add some NULL checks for issues pointed out by coverity
CID 1390578: In usb_mtp_write_metadata, parent can never be NULL but
just in case, add an assert
CID 1390592: Check for o->format only if o !=NULL
CID 1390604: Check s->data_out != NULL in usb_mtp_handle_data
Michael Clark [Sat, 3 Mar 2018 22:52:13 +0000 (11:52 +1300)]
RISC-V: Mark ROM read-only after copying in code
The sifive_u machine already marks its ROM readonly however
it has the wrong base address for its mask ROM. This patch
fixes the sifive_u mask ROM base address.
This commit makes all other boards consistently use mask_rom
as the variable name for their ROMs. Boards that use device
tree now check that that the device tree fits in the assigned
ROM space using the new qemu_fdt_totalsize(void *fdt)
interface, adding a bounds check and error message. This
can detect truncation.
Michael Clark [Mon, 5 Mar 2018 21:33:31 +0000 (10:33 +1300)]
RISC-V: No traps on writes to misa,minstret,mcycle
These fields are marked WARL (Write Any Values, Reads
Legal Values) in the RISC-V Privileged Architecture
Specification so instead of raising exceptions,
illegal writes are silently dropped.
Michael Clark [Mon, 5 Mar 2018 21:17:11 +0000 (10:17 +1300)]
RISC-V: Make mtvec/stvec ignore vectored traps
Vectored traps for asynchrounous interrupts are optional.
The mtvec/stvec mode field is WARL and hence does not trap
if an illegal value is written. Illegal values are ignored.
Later we can add RISCV_FEATURE_VECTORED_TRAPS however
until then the correct behavior for WARL (Write Any, Read
Legal) fields is to drop writes to unsupported bits.
Michael Clark [Fri, 6 Apr 2018 00:46:19 +0000 (12:46 +1200)]
RISC-V: Add mcycle/minstret support for -icount auto
Previously the mycycle/minstret CSRs and rdcycle/rdinstret
psuedo instructions would return the time as a proxy for an
increasing instruction counter in the absence of having a
precise instruction count. If QEMU is invoked with -icount,
the mcycle/minstret CSRs and rdcycle/rdinstret psuedo
instructions will return the instruction count.
Michael Clark [Sun, 8 Apr 2018 23:33:05 +0000 (11:33 +1200)]
RISC-V: Use [ms]counteren CSRs when priv ISA >= v1.10
Privileged ISA v1.9.1 defines mscounteren and mucounteren:
* mscounteren contains a mask of counters available to S-mode
* mucounteren contains a mask of counters available to U-mode
Privileged ISA v1.10 defines mcounteren and scounteren:
* mcounteren contains a mask of counters available to S-mode
* scounteren contains a mask of counters available to U-mode
mcounteren and scounteren CSR registers were implemented
however they were not honoured for counter accesses when
the privilege ISA was >= v1.10. This fix solves the issue
by coalescing the counter enable registers. In addition
the code now generates illegal instruction exceptions
for accesses to the counter enabled registers depending
on the privileged ISA version.
- Coalesce mscounteren and mcounteren into one variable
- Coalesce mucounteren and scounteren into one variable
- Makes mcounteren and scounteren CSR accesses generate
illegal instructions when the privileged ISA <= v1.9.1
- Makes mscounteren and mucounteren CSR accesses generate
illegal instructions when the privileged ISA >= v1.10
Michael Clark [Mon, 9 Apr 2018 00:06:30 +0000 (12:06 +1200)]
RISC-V: Allow S-mode mxr access when priv ISA >= v1.10
The mstatus.MXR alias in sstatus should only be writable
by S-mode if the privileged ISA version >= v1.10. Also MXR
was masked in sstatus CSR read but not sstatus CSR writes.
Now we correctly mask sstatus.mxr in both read and write.
Michael Clark [Fri, 16 Mar 2018 19:12:00 +0000 (12:12 -0700)]
RISC-V: Clear mtval/stval on exceptions without info
mtval/stval must be set on all exceptions but zero is
a legal value if there is no exception specific info.
Placing the instruction bytes for illegal instruction
exceptions in mtval/stval is an optional feature and
is currently not supported by QEMU RISC-V.
Michael Clark [Mon, 5 Mar 2018 20:48:41 +0000 (09:48 +1300)]
RISC-V: Hardwire satp to 0 for no-mmu case
satp is WARL so it should not trap on illegal writes, rather
it can be hardwired to zero and silently ignore illegal writes.
It seems the RISC-V WARL behaviour is preferred to having to
trap overhead versus simply reading back the value and checking
if the write took (saves hundreds of cycles and more complex
trap handling code).
Michael Clark [Mon, 5 Mar 2018 00:28:00 +0000 (13:28 +1300)]
RISC-V: Update E and I extension order
Section 22.8 Subset Naming Convention of the RISC-V ISA Specification
defines the canonical order for extensions in the ISA string. It is
silent on the position of the E extension however E is a substitute
for I so it must come early in the extension list order. A comment
is added to state E and I are mutually exclusive, as the E extension
will be added to the RISC-V port in the future.
Michael Clark [Sun, 4 Mar 2018 00:50:12 +0000 (13:50 +1300)]
RISC-V: Include instruction hex in disassembly
This was added to help debug issues using -d in_asm. It is
useful to see the instruction bytes, as one can detect if
one is trying to execute ASCII or device-tree magic.
Michael Clark [Sun, 4 Mar 2018 00:27:37 +0000 (13:27 +1300)]
RISC-V: Remove unused class definitions
Removes a whole lot of unnecessary boilerplate code. Machines
don't need to be objects. The expansion of the SOC object model
for the RISC-V machines will happen in the future as SiFive
plans to add their FE310 and FU540 SOCs to QEMU. However, it
seems that this present boilerplate is complete unnecessary.
Michael Clark [Sat, 3 Mar 2018 22:32:17 +0000 (11:32 +1300)]
RISC-V: Remove identity_translate from load_elf
When load_elf is called with NULL as an argument to the
address translate callback, it does an identity translation.
This commit removes the redundant identity_translate callback.
Peter Maydell [Fri, 4 May 2018 17:58:39 +0000 (18:58 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180504-1' into staging
target-arm queue:
* Emulate the SMMUv3 (IOMMU); one will be created in the 'virt' board
if the commandline includes "-machine iommu=smmuv3"
* target/arm: Implement v8M VLLDM and VLSTM
* hw/arm: Don't fail qtest due to missing SD card in -nodefaults mode
* Some fixes to silence Coverity false-positives
* arm: boot: set boot_info starting from first_cpu
(fixes a technical bug not visible in practice)
* hw/net/smc91c111: Convert away from old_mmio
* hw/usb/tusb6010: Convert away from old_mmio
* hw/char/cmsdk-apb-uart.c: Accept more input after character read
* target/arm: Make MPUIR write-ignored on OMAP, StrongARM
* hw/arm/virt: Add linux,pci-domain property
* remotes/pmaydell/tags/pull-target-arm-20180504-1: (24 commits)
hw/arm/virt: Introduce the iommu option
hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
hw/arm/virt: Add SMMUv3 to the virt board
target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
hw/arm/smmuv3: Abort on vfio or vhost case
hw/arm/smmuv3: Implement translate callback
hw/arm/smmuv3: Event queue recording helper
hw/arm/smmuv3: Implement MMIO write operations
hw/arm/smmuv3: Queue helpers
hw/arm/smmuv3: Wired IRQ and GERROR helpers
hw/arm/smmuv3: Skeleton
hw/arm/smmu-common: VMSAv8-64 page table walk
hw/arm/smmu-common: IOMMU memory region and address space setup
hw/arm/smmu-common: smmu base device and datatypes
target/arm: Implement v8M VLLDM and VLSTM
hw/arm: Don't fail qtest due to missing SD card in -nodefaults mode
target/arm: Tidy condition in disas_simd_two_reg_misc
target/arm: Tidy conditions in handle_vec_simd_shri
arm: boot: set boot_info starting from first_cpu
hw/net/smc91c111: Convert away from old_mmio
...
Prem Mallappa [Fri, 4 May 2018 17:05:52 +0000 (18:05 +0100)]
hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
This patch builds the smmuv3 node in the ACPI IORT table.
The RID space of the root complex, which spans 0x0-0x10000
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.
The guest must feature the IOMMU probe deferral series
(https://lkml.org/lkml/2017/4/10/214) which fixes streamid
multiple lookup. This bug is not related to the SMMU emulation.
Eric Auger [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
hw/arm/smmuv3: Implement translate callback
This patch implements the IOMMU Memory Region translate()
callback. Most of the code relates to the translation
configuration decoding and check (STE, CD).
Eric Auger [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
hw/arm/smmuv3: Wired IRQ and GERROR helpers
We introduce some helpers to handle wired IRQs and especially
GERROR interrupt. SMMU writes GERROR register on GERROR event
and SW acks GERROR interrupts by setting GERRORn.
The Wired interrupts are edge sensitive hence the pulse usage.
Prem Mallappa [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
hw/arm/smmuv3: Skeleton
This patch implements a skeleton for the smmuv3 device.
Datatypes and register definitions are introduced. The MMIO
region, the interrupts and the queue are initialized.
Eric Auger [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
hw/arm/smmu-common: IOMMU memory region and address space setup
We set up the infrastructure to enumerate all the PCI devices
attached to the SMMU and create an associated IOMMU memory
region and address space.
Those info are stored in SMMUDevice objects. The devices are
grouped according to the PCIBus they belong to. A hash table
indexed by the PCIBus pointer is used. Also an array indexed by
the bus number allows to find the list of SMMUDevices.
Peter Maydell [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
target/arm: Implement v8M VLLDM and VLSTM
For v8M the instructions VLLDM and VLSTM support lazy saving
and restoring of the secure floating-point registers. Even
if the floating point extension is not implemented, these
instructions must act as NOPs in Secure state, so they can
be used as part of the secure-to-nonsecure call sequence.
Thomas Huth [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
hw/arm: Don't fail qtest due to missing SD card in -nodefaults mode
When running omap1/2 or pxa2xx based ARM machines with -nodefaults,
they bail out immediately complaining about a "missing SecureDigital
device". That's not how the "default" devices in vl.c are meant to
work - it should be possible for a board to also start up without
default devices. So let's turn the error message and exit() into
a warning instead.
target/arm: Tidy conditions in handle_vec_simd_shri
The (size > 3 && !is_q) condition is identical to the preceeding test
of bit 3 in immh; eliminate it. For the benefit of Coverity, assert
that size is within the bounds we expect.
Igor Mammedov [Fri, 4 May 2018 17:05:51 +0000 (18:05 +0100)]
arm: boot: set boot_info starting from first_cpu
Even though nothing is currently broken (since all boards
use first_cpu as boot cpu), make sure that boot_info is set
on all CPUs.
If some board would like support heterogenuos setup (i.e.
init boot_info on subset of CPUs) in future, it should add
a reasonable API to do it, instead of starting assigning
boot_info from some CPU and till the end of present CPUs
list.
Jan Kiszka [Fri, 4 May 2018 17:05:50 +0000 (18:05 +0100)]
hw/arm/virt: Add linux,pci-domain property
This allows to pin the host controller in the Linux PCI domain space.
Linux requires that property to be available consistently or not at all,
in which case the domain number becomes unstable on additions/removals.
Adding it here won't make a difference in practice for most setups as we
only expose one controller.
However, enabling Jailhouse on top may introduce another controller, and
that one would like to have stable address as well. So the property is
needed for the first controller as well.
Peter Maydell [Fri, 4 May 2018 13:42:46 +0000 (14:42 +0100)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-05-04' into staging
nbd patches for 2018-05-04
- Vladimir Sementsov-Ogievskiy: 0/2 fix coverity bugs
- Eric Blake: nbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE
- Eric Blake: nbd/client: Relax handling of large NBD_CMD_BLOCK_STATUS reply
# gpg: Signature made Fri 04 May 2018 14:25:55 BST
# gpg: using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg: aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2018-05-04:
nbd/client: Relax handling of large NBD_CMD_BLOCK_STATUS reply
nbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE
migration/block-dirty-bitmap: fix memory leak in dirty_bitmap_load_bits
nbd/client: fix nbd_negotiate_simple_meta_context
Eric Blake [Thu, 3 May 2018 22:26:26 +0000 (17:26 -0500)]
nbd/client: Relax handling of large NBD_CMD_BLOCK_STATUS reply
The NBD spec is proposing a relaxation of NBD_CMD_BLOCK_STATUS
where a server may have the final extent per context give a
length beyond the original request, if it can easily prove that
subsequent bytes have the same status, on the grounds that a
client can take advantage of this information for fewer block
status requests. Since qemu 2.12 as a client always sends
NBD_CMD_FLAG_REQ_ONE, and rejects a server that sends extra
length, the upstream NBD spec will probably limit this behavior
to clients that don't request REQ_ONE semantics; but it doesn't
hurt to relax qemu to always be permissive of this server
behavior, even if it continues to use REQ_ONE.
Eric Blake [Tue, 1 May 2018 15:46:53 +0000 (10:46 -0500)]
nbd/client: Fix error messages during NBD_INFO_BLOCK_SIZE
A missing space makes for poor error messages, and sizes can't
go negative. Also, we missed diagnosing a server that sends
a maximum block size less than the minimum.
Initialize received variable. Otherwise, is is possible for server to
answer without any contexts, but we will set context_id to something
random (received_id is not initialized too) and return 1, which is
wrong.
To solve it, just initialize received to false. Initialize received_id
too, just to make all possible checkers happy.
Bug was introduced in 78a33ab58782efdb206de14 "nbd: BLOCK_STATUS for
standard get_block_status function: client part" with the whole
function.
Peter Maydell [Fri, 4 May 2018 12:49:08 +0000 (13:49 +0100)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2018-05-04' into staging
QAPI patches for 2018-05-04
# gpg: Signature made Fri 04 May 2018 08:59:16 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2018-05-04:
qapi: deprecate CpuInfoFast.arch
qapi: discriminate CpuInfoFast on SysEmuTarget, not CpuInfoArch
qapi: change the type of TargetInfo.arch from string to enum SysEmuTarget
qapi: add SysEmuTarget to "common.json"
qapi: fill in CpuInfoFast.arch in query-cpus-fast
qobject: Modify qobject_ref() to return obj
qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREF
qobject: use a QObjectBase_ struct
qobject: Ensure base is at offset 0
qobject: Use qobject_to() instead of type cast
Peter Maydell [Fri, 4 May 2018 10:53:58 +0000 (11:53 +0100)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180504' into staging
First s390x pull request for 2.13.
- new machine type
- extend SCLP event masks
- support configuration of consoles via -serial
- firmware improvements: non-sequential entries in boot menu, support
for indirect loading via .INS files in s390-netboot
- bugfixes and cleanups
* remotes/cohuck/tags/s390x-20180504:
pc-bios/s390: Update firmware images
s390-ccw: force diag 308 subcode to unsigned long
pc-bios/s390-ccw/net: Add support for .INS config files
pc-bios/s390-ccw/net: Use diag308 to reset machine before jumping to the OS
pc-bios/s390-ccw/net: Split up net_load() into init, load and release parts
pc-bios/s390-ccw: fix non-sequential boot entries (enum)
pc-bios/s390-ccw: fix non-sequential boot entries (eckd)
pc-bios/s390-ccw: fix loadparm initialization and int conversion
pc-bios/s390-ccw: rename MAX_TABLE_ENTRIES to MAX_BOOT_ENTRIES
pc-bios/s390-ccw: size_t should be unsigned
hw/s390x: Allow to configure the consoles with the "-serial" parameter
s390x/kvm: cleanup calls to cpu_synchronize_state()
vfio-ccw: introduce vfio_ccw_get_device()
s390x/sclp: extend SCLP event masks to 64 bits
s390x: introduce 2.13 compat machine
Peter Maydell [Fri, 4 May 2018 09:13:13 +0000 (10:13 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.13-20180504' into staging
ppc patch queue 2018-05-04
Second patch of patches for qemu-2.13 (or whatever the version ends up
being called). Highlights are:
* Preliminary patches for POWER9 hash MMU support for powernv
* A number of cleanups fo pseries startup and LPCR handling
* Remove support for explicitly allocated RMAs (which require kernel
support that's been gone for 3+ years)
* Some mac_newworld cleanups
* A few bugfixes
* remotes/dgibson/tags/ppc-for-2.13-20180504:
spapr: don't advertise radix GTSE if max-compat-cpu < power9
spapr: don't migrate "spapr_option_vector_ov5_cas" to pre 2.8 machines
target/ppc: always set PPC_MEM_TLBIE in pre 2.8 migration hack
mac_newworld: move wiring of macio IRQs to macio_newworld_realize()
mac_newworld: remove pics IRQ array and wire up macio to OpenPIC directly
uninorth: create new uninorth device
spapr: Clean up handling of LPCR power-saving exit bits
spapr: Move PAPR mode cpu setup fully to spapr code
target/ppc: Delay initialization of LPCR_UPRT for secondary cpus
spapr: Clean up LPCR updates from hypercalls
spapr: Make a helper to set up cpu entry point state
spapr: Remove unhelpful helpers from rtas_start_cpu()
spapr: Clean up rtas_start_cpu() & rtas_stop_self()
target/ppc: Add ppc_store_lpcr() helper
spapr: Remove support for explicitly allocated RMAs
target/ppc: add basic support for PTCR on POWER9
target/ppc: return a nil HPT base address on sPAPR machines
* remotes/vivier2/tags/linux-user-for-2.13-pull-request:
linux-user: remove useless padding in flock64 structure
linux-user: introduce target_sigsp() and target_save_altstack()
linux-user: ARM-FDPIC: Add support for signals for FDPIC targets
linux-user: ARM-FDPIC: Add support of FDPIC for ARM.
linux-user: ARM-FDPIC: Identify ARM FDPIC binaries
Remove CONFIG_USE_FDPIC.
The TARGET_BASE_ARCH values from "configure" don't all map to the
@CpuInfoArch enum constants; in particular "s390x" from the former does
not match @s390 in the latter. Clients are known to rely on the @s390
constant specifically, so we can't change it silently. Instead, deprecate
the @CpuInfoFast.@arch member (in favor of @CpuInfoFast.@target) using the
regular deprecation process.
(No deprecation reminder is added to sysemu_target_to_cpuinfo_arch(): once
@CpuInfoFast.@arch is removed, the assignment expression that calls
sysemu_target_to_cpuinfo_arch() from qmp_query_cpus_fast() will have to
disappear; in turn the static function left without callers will also
break the build, thus it'll have to go.)
qapi: discriminate CpuInfoFast on SysEmuTarget, not CpuInfoArch
Add a new field @target (of type @SysEmuTarget) to the output of the
@query-cpus-fast command, which provides more information about the
emulation target than the field @arch (of type @CpuInfoArch). Make @target
the new discriminator for the @CpuInfoFast return structure. Keep @arch
for compatibility.
We'll soon need an enumeration type that lists all the softmmu targets
that QEMU (the project) supports. Introduce @SysEmuTarget to
"common.json".
The enum constant @x86_64 doesn't match the QAPI convention of preferring
hyphen ("-") over underscore ("_"). This is intentional; the @SysEmuTarget
constants are supposed to produce QEMU executable names when stringified
and appended to the "qemu-system-" prefix. Put differently, the
replacement text of the TARGET_NAME preprocessor macro must be possible to
look up in the list of (stringified) enum constants.
Like other enum types, @SysEmuTarget too can be used for discriminator
fields in unions. For the @i386 constant, a C-language union member called
"i386" would be generated. On mingw build hosts, "i386" is a macro
however. Add "i386" to "polluted_words" at once.
* Commit ca230ff33f89 added the @arch field to @CpuInfoFast, but it failed
to set the new field in qmp_query_cpus_fast(), when TARGET_S390X was not
defined. The updated @query-cpus-fast example in "qapi-schema.json"
showed "arch":"x86" only because qmp_query_cpus_fast() calls g_malloc0()
to allocate @CpuInfoFast, and the CPU_INFO_ARCH_X86 enum constant is
generated with value 0.
All @arch values other than @s390 implied the @CpuInfoOther sub-struct
for @CpuInfoFast -- at the time of writing the patch --, thus no fields
other than @arch needed to be set when TARGET_S390X was not defined. Set
@arch now, by copying the corresponding assignments from
qmp_query_cpus().
* Commit 25fa194b7b11 added the @riscv enum constant to @CpuInfoArch (used
in both @CpuInfo and @CpuInfoFast -- the return types of the @query-cpus
and @query-cpus-fast commands, respectively), and assigned, in both
return structures, the @CpuInfoRISCV sub-structure to the new enum
value.
However, qmp_query_cpus_fast() would not populate either the @arch field
or the @CpuInfoRISCV sub-structure, when TARGET_RISCV was defined; only
qmp_query_cpus() would.
Assign @CpuInfoOther to the @riscv enum constant in @CpuInfoFast, and
populate only the @arch field in qmp_query_cpus_fast(). Getting CPU
state without interrupting KVM is an exceptional thing that only S390X
does currently. Quoting Cornelia Huck <[email protected]>, "s390x is
exceptional in that it has state in QEMU that is actually interesting
for upper layers and can be retrieved without performance penalty". See
also
<https://www.redhat.com/archives/libvir-list/2018-February/msg00121.html>.
For convenience and clarity, make it possible to call qobject_ref() at
the time when the reference is associated with a variable, or
argument, by making qobject_ref() return the same pointer as given.
Use that to simplify the callers.
Now that we can safely call QOBJECT() on QObject * as well as its
subtypes, we can have macros qobject_ref() / qobject_unref() that work
everywhere instead of having to use QINCREF() / QDECREF() for QObject
and qobject_incref() / qobject_decref() for its subtypes.
The replacement is mechanical, except I broke a long line, and added a
cast in monitor_qmp_cleanup_req_queue_locked(). Unlike
qobject_decref(), qobject_unref() doesn't accept void *.
Note that the new macros evaluate their argument exactly once, thus no
need to shout them.
By moving the base fields to a QObjectBase_, QObject can be a type
which also has a 'base' field. This allows writing a generic QOBJECT()
macro that will work with any QObject type, including QObject
itself. The container_of() macro ensures that the object to cast has a
QObjectBase_ base field, giving some type safety guarantees. QObject
must have no members but QObjectBase_ base, or else QOBJECT() breaks.
QObjectBase_ is not a typedef and uses a trailing underscore to make
it obvious it is not for normal use and to avoid potential abuse.