target/ppc: Don't check UPRT in radix mode when in HV real mode
It appears that during kexec, we run for a while in hypervisor
real mode with LPCR:HR set and LPCR:UPRT clear, which trips
the assertion in ppc_radix64_handle_mmu_fault().
First, this shouldn't be an assertion: it's a guest error.
Second, we shouldn't be checking these things in hypervisor real
mode (or in virtual hypervisor guest real mode, which is similar),
as the real HW won't use those LPCR bits in those cases anyway,
so technically it's OK to have this discrepancy.
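A minimal sketch of the intended behaviour, not the actual patch: the
function name, its parameters and the LPCR bit values below are all
hypothetical, chosen only to show that the HR/UPRT consistency check is
skipped in (virtual) hypervisor real mode and otherwise yields an error
return rather than an assertion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; these are not the real LPCR bit positions. */
#define SKETCH_LPCR_HR   (UINT64_C(1) << 0)
#define SKETCH_LPCR_UPRT (UINT64_C(1) << 1)

/*
 * Hypothetical helper: returns 0 when translation may proceed and -1 when
 * the guest programmed an inconsistent LPCR (to be logged, not asserted).
 */
int sketch_radix_lpcr_check(uint64_t lpcr, bool relocation_on,
                            bool hv_mode, bool virtual_hv)
{
    /* HV real mode and virtual-hypervisor guest real mode skip the check. */
    if (!relocation_on && (hv_mode || virtual_hv)) {
        return 0;
    }
    /* Radix (HR set) with UPRT clear is treated as a guest error. */
    if ((lpcr & SKETCH_LPCR_HR) && !(lpcr & SKETCH_LPCR_UPRT)) {
        return -1;
    }
    return 0;
}
```

With this shape, the kexec case described above (HR set, UPRT clear,
hypervisor real mode) returns 0 instead of aborting.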
Greg Kurz [Fri, 5 Apr 2019 16:30:48 +0000 (18:30 +0200)]
spapr: Drop duplicate PCI swizzle code
LSI mapping in spapr currently open-codes standard PCI swizzling. It thus
duplicates the code of pci_swizzle_map_irq_fn().
Expose the swizzling formula so that it can be used with a slot number
when building the device tree. Simply drop pci_spapr_map_irq() and call
pci_swizzle_map_irq_fn() instead.
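For reference, the standard PCI swizzle rotates the interrupt pin by the
slot number. The sketch below shows the formula on plain integers; the
function and macro names are hypothetical and it is not the QEMU helper
itself.

```c
#include <assert.h>

#define SKETCH_PCI_NUM_PINS 4   /* INTA..INTD, numbered 0..3 */

/* Standard PCI swizzle: rotate the pin number by the slot number. */
int sketch_pci_swizzle(int slot, int pin)
{
    assert(slot >= 0);
    assert(pin >= 0 && pin < SKETCH_PCI_NUM_PINS);
    return (slot + pin) % SKETCH_PCI_NUM_PINS;
}
```

Taking a bare slot number rather than a device is what lets the same
formula be reused while building the device tree.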
Greg Kurz [Fri, 5 Apr 2019 16:30:43 +0000 (18:30 +0200)]
spapr_pci: Get rid of duplicate code for node name creation
According to the changelog of 298a971024534, SpaprPhbState::dtbusname was
introduced to "make it easier to relate the guest and qemu views of memory
to each other", hence its name.
Use it when creating the PHB node to avoid code duplication.
hw/ppc/prep: Drop useless inclusion of "hw/input/i8042.h"
In commit 47973a2dbf we split the last generic chipset out of
the PC board, but forgot to remove the i8042 keyboard controller.
This omission was later fixed in commit 7cb00357c1, but there we
neglected to remove the "i8042.h" include. Do it now.
NVIDIA V100 GPUs have on-board RAM which is mapped into the host memory
space and accessible as normal RAM via an NVLink bus. The VFIO-PCI driver
implements special regions for such GPUs and emulates an NVLink bridge.
NVLink2-enabled POWER9 CPUs also provide address translation services
which include an ATS shootdown (ATSD) register exported via the NVLink
bridge device.
This adds a quirk to VFIO to map the GPU memory and create an MR;
the new MR is stored in a PCI device as a QOM link. The sPAPR PCI uses
this to get the MR and map it to the system address space.
Another quirk does the same for ATSD.
This adds additional steps to sPAPR PHB setup:
1. Search for specific GPUs and NPUs, collect findings in
   sPAPRPHBState::nvgpus, manage system address space mappings;
2. Add device-specific properties such as "ibm,npu", "ibm,gpu",
   "memory-block", "link-speed" to advertise the NVLink2 function to
   the guest;
3. Add "mmio-atsd" to vPHB to advertise the ATSD capability;
4. Add new memory blocks (with extra "linux,memory-usable" to prevent
   the guest OS from accessing the new memory until it is onlined) and
   npuphb# nodes representing an NPU unit for every vPHB as the GPU driver
   uses it for link discovery.
This allocates space for GPU RAM and ATSD like we do for MMIOs by
adding 2 new parameters to the phb_placement() hook. Older machine types
set these to zero.
This puts new memory nodes in a separate NUMA node, as the GPU RAM
needs to be configured equally distant from any other node in the system.
Unlike the host setup which assigns numa ids from 255 downwards, this
adds new NUMA nodes after the user configures nodes or from 1 if none
were configured.
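The numbering rule can be sketched as follows. This assumes user-configured
nodes are numbered from 0 and that each GPU RAM region gets the next free
id; both function names are hypothetical.

```c
/* First NUMA id handed to GPU RAM: after the user's nodes, or 1 if none. */
int sketch_first_gpu_numa_id(int nb_user_nodes)
{
    return nb_user_nodes > 0 ? nb_user_nodes : 1;
}

/* Id for the n-th GPU RAM node (n counts from 0), assuming one id each. */
int sketch_gpu_numa_id(int nb_user_nodes, int n)
{
    return sketch_first_gpu_numa_id(nb_user_nodes) + n;
}
```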
This adds a requirement similar to EEH - one IOMMU group per vPHB.
The reason for this is that ATSD registers belong to a physical NPU
so they cannot invalidate translations on GPUs attached to another NPU.
This is guaranteed by the host platform as it does not mix NVLink bridges
or GPUs from different NPUs in the same IOMMU group. If more than one
IOMMU group is detected on a vPHB, this disables ATSD support for that
vPHB and prints a warning.
Pu Wen [Tue, 16 Apr 2019 12:06:13 +0000 (20:06 +0800)]
i386: Add new Hygon 'Dhyana' CPU model
Add a new base CPU model called 'Dhyana' to model processors from Hygon
Dhyana (family 18h), which is derived from AMD EPYC (family 17h).
The following feature bits have been removed compared to AMD EPYC:
aes, pclmulqdq, sha_ni
Hygon Dhyana support for KVM in Linux has already been accepted upstream[1].
Adding Hygon Dhyana support to QEMU is therefore necessary to create
Hygon's own CPU model.
This change adapts io_readx() to its input access_type. Currently
io_readx() treats any memory access as a read, although it has an
input argument "MMUAccessType access_type". This results in:
1) Calling the tlb_fill() only with MMU_DATA_LOAD
2) Considering only entry->addr_read as the tlb_addr
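A rough sketch of the idea, using hypothetical stand-in types rather than
the real softmmu structures: the comparator field is picked from the access
type instead of always being the read address. It assumes the read path
only ever sees data loads and instruction fetches.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the access type and TLB entry. */
typedef enum {
    SKETCH_MMU_DATA_LOAD,
    SKETCH_MMU_INST_FETCH
} SketchAccessType;

typedef struct {
    uint64_t addr_read;
    uint64_t addr_code;
} SketchTLBEntry;

/* Pick the TLB comparator that matches the kind of read being performed. */
uint64_t sketch_tlb_addr(const SketchTLBEntry *entry, SketchAccessType type)
{
    return type == SKETCH_MMU_INST_FETCH ? entry->addr_code
                                         : entry->addr_read;
}
```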
tcg/arm: Restrict constant pool displacement to 12 bits
This will not necessarily restrict the size of the TB, since for v7
the majority of constant pool usage is for calls from the out-of-line
ldst code, which is already at the end of the TB. But this does
allow us to save one insn per reference on the off-chance.
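As an illustration of the 12-bit restriction: an ARM load from a literal
pool encodes a 12-bit offset magnitude plus an add/subtract bit, so a single
instruction reaches displacements of at most 4095 bytes in either direction.
The helper name below is hypothetical and the sketch ignores how the
displacement is measured relative to the PC.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the displacement fits a 12-bit magnitude with a sign (U) bit. */
bool sketch_fits_imm12(int32_t disp)
{
    return disp >= -0xfff && disp <= 0xfff;
}
```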
Zhang Yi [Mon, 22 Apr 2019 00:48:48 +0000 (08:48 +0800)]
util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
When a file supporting DAX is used as vNVDIMM backend, mmap it with
the MAP_SYNC flag as well. This ensures file system metadata is synced
on each guest write to the backend file, without other QEMU actions
(e.g., periodic fsync() by QEMU).
Currently, we have the following possible use cases:
1. pmem=on is set, shared=on is set, MAP_SYNC supported:
   a: backend is a file supporting DAX.
      - MAP_SYNC will be active.
   b: backend is not a file supporting DAX.
      - mmap will trigger a warning, then the MAP_SYNC flag will be ignored.
2. The rest of the cases:
   - we will never pass MAP_SYNC to mmap2
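The use cases above can be summarised as a small decision function. This is
only a sketch of the case analysis; the enum, the function and its boolean
parameters are hypothetical and no actual mmap() call is made.

```c
#include <stdbool.h>

typedef enum {
    SKETCH_SYNC_NOT_REQUESTED,   /* case 2: MAP_SYNC never passed */
    SKETCH_SYNC_ACTIVE,          /* case 1a: DAX backend, MAP_SYNC active */
    SKETCH_SYNC_IGNORED_WARN     /* case 1b: warning, flag ignored */
} SketchSyncOutcome;

SketchSyncOutcome sketch_map_sync_outcome(bool pmem, bool shared,
                                          bool map_sync_supported,
                                          bool backend_is_dax)
{
    /* MAP_SYNC is only attempted when all three preconditions hold. */
    if (!(pmem && shared && map_sync_supported)) {
        return SKETCH_SYNC_NOT_REQUESTED;
    }
    return backend_is_dax ? SKETCH_SYNC_ACTIVE : SKETCH_SYNC_IGNORED_WARN;
}
```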
cpu: Rename parse_cpu_model() to parse_cpu_option()
The "model[,option...]" string parsed by the function is not just
a CPU model. Rename the function and its argument to indicate it
expects the full "-cpu" option to be provided.
Exploit that argument @name is never null. Check is_help_option()
first, because that's what we do elsewhere. If we (foolishly!)
defined a machine named "help", -machine help would now print help
instead of selecting the machine named "help".
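The ordering can be sketched like this, assuming a help option is the string
"help" or "?". The lookup over a plain string array and all names here are
hypothetical stand-ins for the real machine selection code.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_PRINT_HELP (-2)
#define SKETCH_NOT_FOUND  (-1)

/* Assumed definition of a help request: "help" or "?". */
bool sketch_is_help_option(const char *s)
{
    return !strcmp(s, "?") || !strcmp(s, "help");
}

/* Returns the machine index, or one of the negative codes above. */
int sketch_select_machine(const char *name,
                          const char *const *machines, int count)
{
    int i;

    /* Help is checked first, so it wins over a machine named "help". */
    if (sketch_is_help_option(name)) {
        return SKETCH_PRINT_HELP;
    }
    for (i = 0; i < count; i++) {
        if (!strcmp(name, machines[i])) {
            return i;
        }
    }
    return SKETCH_NOT_FOUND;
}
```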
All these related functions need a GSList of TYPE_MACHINE.
Currently we allocate this list each time we use it, which is
unnecessary because we never modify it.
This patch allocates the TYPE_MACHINE list in select_machine() and
passes it to the functions it calls.
Wei Yang [Fri, 5 Apr 2019 06:41:18 +0000 (14:41 +0800)]
vl.c: make find_default_machine() local
Function find_default_machine() is introduced by commit 2c8cffa599b7
"vl: make find_default_machine externally visible", and it was used
outside of vl.c until commit a904410af5f1 "pc_sysfw: remove the rom_only
property".
Commit a904410af5f1 "pc_sysfw: remove the rom_only property" removed the
only user of find_default_machine() outside vl.c, but neglected to make
it static. Do that now.
* tag 's390-ccw-bios-2019-04-12':
pc-bios/s390: Update firmware images
s390-bios: Use control unit type to find bootable devices
s390-bios: Support booting from real dasd device
s390-bios: Add channel command codes/structs needed for dasd-ipl
s390-bios: Use control unit type to determine boot method
s390-bios: Refactor virtio to run channel programs via cio
s390-bios: Factor finding boot device out of virtio code path
s390-bios: Extend find_dev() for non-virtio devices
s390-bios: cio error handling
s390-bios: Support for running format-0/1 channel programs
s390-bios: ptr2u32 and u32toptr
s390-bios: Map low core memory
s390-bios: Decouple channel i/o logic from virtio
s390-bios: Clean up cio.h
s390-bios: decouple common boot logic from virtio
s390-bios: decouple cio setup from virtio
s390 vfio-ccw: Add bootindex property and IPLB data
s390x/kvm: Configure page size after memory has actually been initialized
Right now we configure the pagesize quite early, when initializing KVM.
This is long before system memory is actually allocated via
memory_region_allocate_system_memory(), and therefore memory backends
marked as mapped.
Instead, let's configure the maximum page size after initializing
memory in s390_memory_init(). cap_hpage_1m is still properly
configured before creating any CPUs, and therefore before configuring
the CPU model and eventually enabling CMMA.
This is not a fix but rather a preparation for the future, when initial
memory might reside on memory backends (not the case for s390x right now).
We will replace qemu_getrampagesize() soon by a function that will always
return the maximum page size (not the minimum page size, which only
works by pure luck so far, as there are no memory backends).
tcg/ppc: Allow the constant pool to overflow at 32k
There is no point in coding for a 2GB offset when the max TB size
is already limited to 64k. If we further restrict to 32k then we
can eliminate the extra ADDIS instruction.
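To illustrate why 32k is the threshold: a PowerPC D-form access carries a
signed 16-bit displacement, so any offset in [-32768, 32767] needs no high
part. The sketch below splits a displacement into the low field and the
amount an ADDIS would have to supply; the helper names are hypothetical.

```c
#include <stdint.h>

/* Low half: the sign-extended 16-bit displacement field. */
int32_t sketch_disp_lo(int32_t disp)
{
    return ((disp & 0xffff) ^ 0x8000) - 0x8000;
}

/* High half, in units of 64k: what an extra ADDIS would have to add. */
int32_t sketch_disp_hi(int32_t disp)
{
    /* Widen first so the subtraction cannot overflow. */
    return (int32_t)(((int64_t)disp - sketch_disp_lo(disp)) / 65536);
}
```

When sketch_disp_hi() is zero for every reference in the TB, the ADDIS can
be dropped.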
tcg: Restart TB generation after relocation overflow
If the TB generates too much code, such that backend relocations
overflow, try again with a smaller TB. In support of this, move
relocation processing from a random place within tcg_out_op, in
the handling of branch opcodes, to a new function at the end of
tcg_gen_code.
This is not a complete solution, as there are additional relocs
generated for out-of-line ldst handling and constant pools.
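The retry strategy might look roughly like the loop below. Halving the
instruction budget is an assumption of this sketch, and both functions are
hypothetical; the fake code generator merely pretends that relocations
overflow above 100 guest instructions.

```c
/* Hypothetical code generator: 0 on success, -1 on relocation overflow. */
int sketch_fake_codegen(int max_insns)
{
    return max_insns > 100 ? -1 : 0;
}

/* Returns the instruction budget that succeeded, or -1 if none did. */
int sketch_gen_with_retry(int max_insns, int (*codegen)(int))
{
    while (max_insns > 0) {
        if (codegen(max_insns) == 0) {
            return max_insns;
        }
        max_insns /= 2;   /* overflow: try again with a smaller TB */
    }
    return -1;
}
```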
In order to handle TB's that translate to too much code, we
need to place the control of the length of the translation
in the hands of the code gen master loop.
Peter Maydell [Wed, 24 Apr 2019 12:19:41 +0000 (13:19 +0100)]
Merge remote-tracking branch 'remotes/lersek/tags/edk2-pull-2019-04-22' into staging
Advance the roms/edk2 submodule to the "edk2-stable201903" release, and
build and capture platform firmware binaries from that release. The
binaries are meant to be used by both end-users and by the "BIOS tables"
unit tests in qtest ("make check").
# gpg: Signature made Mon 22 Apr 2019 19:20:08 BST
# gpg: using RSA key D39DA71E0D496CFA
# gpg: Good signature from "Laszlo Ersek <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: F5D9 660F 1BA5 F310 A95A C5E0 466A EAE0 6125 3988
# Subkey fingerprint: B3A5 5D3F 88A8 90ED 2E63 3E8D D39D A71E 0D49 6CFA
* remotes/lersek/tags/edk2-pull-2019-04-22:
MAINTAINERS: add the "EDK2 Firmware" subsystem
Makefile: install the edk2 firmware images and their descriptors
tests: add missing dependency to build QTEST_QEMU_BINARY, round 2
pc-bios: document the edk2 firmware images; add firmware descriptors
pc-bios: add edk2 firmware binaries and variable store templates
roms: build edk2 firmware binaries and variable store templates
roms/Makefile: replace the $(EDK2_EFIROM) target with "edk2-basetools"
roms/edk2-funcs.sh: add the qemu_edk2_get_thread_count() function
roms/edk2: advance to tag edk2-stable201903
tests/uefi-test-tools/build.sh: work around TianoCore#1607
roms/edk2-funcs.sh: require gcc-4.8+ for building i386 and x86_64
roms: lift "edk2-funcs.sh" from "tests/uefi-test-tools/build.sh"
* remotes/armbru/tags/pull-error-monitor-2019-04-18: (36 commits)
include: Move fprintf_function to disas/
disas: Rename include/disas/bfd.h back to include/disas/dis-asm.h
monitor: Clean up how monitor_disas() funnels output to monitor
qom/cpu: Simplify how CPUClass:cpu_dump_state() prints
qemu-print: New qemu_fprintf(), qemu_vfprintf()
qom/cpu: Simplify how CPUClass::dump_statistics() prints
target/i386: Simplify how x86_cpu_dump_local_apic_state() prints
target: Clean up how the dump_mmu() print
target: Simplify how the TARGET_cpu_list() print
memory: Clean up how mtree_info() prints
block/qapi: Clean up how we print to monitor or stdout
qsp: Simplify how qsp_report() prints
tcg: Simplify how dump_drift_info() prints
tcg: Simplify how dump_exec_info() prints
tcg: Simplify how dump_opcount_info() prints
trace: Simplify how st_print_trace_file_status() prints
include: Include fprintf-fn.h only where needed
monitor: Simplify how -device/device_add print help
char-pty: Print "char device redirected" message to stdout
char: Make -chardev help print to stdout
...
The previous commits have eliminated fprintf_function outside
disassemblers, simplifying code and cleaning up the ugly type-punning
fprintf_function seems to attract. Move fprintf_function to
include/disas/dis-asm.h to reduce the temptation to abuse it.
I considered renaming it to fprintf_ftype (reverting that part of
commit 6e2d864edf5, v0.14.0) to get us closer to binutils, but I
figure the fork is too distant to make this worthwhile.
disas: Rename include/disas/bfd.h back to include/disas/dis-asm.h
Commit dc99065b5f9 (v0.1.0) added dis-asm.h from binutils.
Commit 43d4145a986 (v0.1.5) inlined bfd.h into dis-asm.h to remove the
dependency on binutils.
Commit 76cad71136b (v1.4.0) moved dis-asm.h to include/disas/bfd.h.
The new name is confusing when you try to match against (pre GPLv3+)
binutils. Rename it back. Keep it in the same directory, of course.
monitor: Clean up how monitor_disas() funnels output to monitor
INIT_DISASSEMBLE_INFO() takes an fprintf()-like callback and a FILE *
to pass to it. monitor_disas() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
qom/cpu: Simplify how CPUClass:cpu_dump_state() prints
CPUClass method dump_state() takes an fprintf()-like callback and
a FILE * to pass to it. Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().
The callback gets passed around a lot, which is tiresome. The
type-punning around monitor_fprintf() is ugly.
Drop the callback, and call qemu_fprintf() instead. Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.
Code that doesn't want to know about current monitor vs. stdout
vs. stderr takes an fprintf_function callback and a FILE * argument to
pass to it. Actual arguments are either fprintf() and stdout or
stderr, or monitor_fprintf() and the current monitor cast to FILE *.
monitor_fprintf() casts it right back, and is otherwise identical to
monitor_printf(). The type-punning is ugly.
New qemu_fprintf() and qemu_vfprintf() address this need without type
punning: they are like fprintf() and vfprintf(), except they print to
the current monitor when passed a null FILE *. The next commits will
put them to use.
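The null-stream convention can be sketched as below. This is not the QEMU
implementation: the "monitor" is a hypothetical text buffer and every name
is invented for illustration. A NULL FILE * selects the monitor, anything
else goes through vfprintf(), and no pointer is ever cast to a different
type.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the current monitor: a small text buffer. */
char sketch_monitor_buf[256];
size_t sketch_monitor_len;

int sketch_monitor_vprintf(const char *fmt, va_list ap)
{
    size_t room = sizeof(sketch_monitor_buf) - sketch_monitor_len;
    int n = vsnprintf(sketch_monitor_buf + sketch_monitor_len, room, fmt, ap);

    if (n > 0) {
        /* Advance by what was stored; truncated output fills room - 1. */
        sketch_monitor_len += (size_t)n < room ? (size_t)n : room - 1;
    }
    return n;
}

/* Like vfprintf(), except a null stream means "the current monitor". */
int sketch_vfprintf(FILE *stream, const char *fmt, va_list ap)
{
    if (!stream) {
        return sketch_monitor_vprintf(fmt, ap);
    }
    return vfprintf(stream, fmt, ap);
}

int sketch_fprintf(FILE *stream, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = sketch_vfprintf(stream, fmt, ap);
    va_end(ap);
    return ret;
}
```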
qom/cpu: Simplify how CPUClass::dump_statistics() prints
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it.
Its only caller hmp_info_cpustats() (via cpu_dump_statistics()) passes
monitor_fprintf() and the current monitor cast to FILE *.
monitor_fprintf() casts it right back, and is otherwise identical to
monitor_printf(). The type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
target/i386: Simplify how x86_cpu_dump_local_apic_state() prints
x86_cpu_dump_local_apic_state() takes an fprintf()-like callback and a
FILE * to pass to it, and so do its helper functions.
Its only caller hmp_info_local_apic() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
The various dump_mmu() take an fprintf()-like callback and a FILE * to
pass to it, and so do their helper functions. Passing around callback
and argument is rather tiresome.
Most dump_mmu() are called only by the target's hmp_info_tlb(). These
all pass monitor_printf() cast to fprintf_function and the current
monitor cast to FILE *.
SPARC's dump_mmu() gets also called from target/sparc/ldst_helper.c a
few times #ifdef DEBUG_MMU. These calls pass fprintf() and stdout.
The type-punning is technically undefined behaviour, but works in
practice. Clean up: drop the callback, and call qemu_printf()
instead.
The various TARGET_cpu_list() take an fprintf()-like callback and a
FILE * to pass to it. Their callers (vl.c's main() via list_cpus(),
bsd-user/main.c's main(), linux-user/main.c's main()) all pass
fprintf() and stdout. Thus, the flexibility provided by the (rather
tiresome) indirection isn't actually used.
Drop the callback, and call qemu_printf() instead.
Calling printf() would also work, but would make the code unsuitable
for monitor context without making it simpler.
mtree_info() takes an fprintf()-like callback and a FILE * to pass to
it, and so do its helper functions. Passing around callback and
argument is rather tiresome.
Its only caller hmp_info_mtree() passes monitor_printf() cast to
fprintf_function and the current monitor cast to FILE *.
The type-punning is technically undefined behaviour, but works in
practice. Clean up: drop the callback, and call qemu_printf()
instead.
block/qapi: Clean up how we print to monitor or stdout
bdrv_snapshot_dump(), bdrv_image_info_specific_dump(),
bdrv_image_info_dump() and their helpers take an fprintf()-like
callback and a FILE * to pass to it.
hmp.c passes monitor_printf() cast to fprintf_function and the current
monitor cast to FILE *.
qemu-img.c and qemu-io-cmds.c pass fprintf and stdout.
The type-punning is technically undefined behaviour, but works in
practice. Clean up: drop the callback, and call qemu_printf()
instead.
qsp_report() takes an fprintf()-like callback and a FILE * to pass to
it.
Its only caller hmp_sync_profile() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
dump_drift_info() takes an fprintf()-like callback and a FILE * to pass
to it.
Its only caller hmp_info_jit() passes monitor_fprintf() and a Monitor
* cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf(). The type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
dump_exec_info() takes an fprintf()-like callback and a FILE * to pass
to it.
Its only caller hmp_info_jit() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
dump_opcount_info() takes an fprintf()-like callback and a FILE * to
pass to it.
Its only caller hmp_info_opcount() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
trace: Simplify how st_print_trace_file_status() prints
st_print_trace_file_status() takes an fprintf()-like callback and a
FILE * to pass to it.
Its only caller hmp_trace_file() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
monitor: Simplify how -device/device_add print help
Commit a95db58f210 added monitor_vfprintf() as an error_printf()
generalized from stderr to arbitrary streams, then used it wrapped in
helper out_printf() to print -device/device_add help to stdout. Use
qemu_printf() instead, and delete monitor_vfprintf() and out_printf().
char-pty: Print "char device redirected" message to stdout
char_pty_open() prints a "char device redirected to PTY_NAME (label
LABEL)" message to the current monitor or else to stderr. This is not
an error, so it shouldn't go to stderr. Print it to stdout instead.
Why is it even printed? No other ChardevClass::open() prints anything
on success. It's because you need to know PTY_NAME to actually use
this char device, e.g. like "socat STDIO,cfmakeraw FILE:PTY_NAME"
to use the monitor's readline interface. You can get PTY_NAME with
"info chardev" (a.k.a. query-chardev for QMP), but only if you already
have a monitor.
Command line help explicitly requested by the user should be printed
to stdout, not stderr. We do elsewhere. Adjust -chardev to match:
use qemu_printf() instead of error_printf(). Plain printf() would be
wrong because we need to print to the current monitor for "chardev-add
help".
Command line help explicitly requested by the user should be printed
to stdout, not stderr. We do elsewhere. Adjust -drive to match: use
qemu_printf() instead of error_printf(). Plain printf() would be
wrong because we need to print to the current monitor for "drive_add
... format=help".
qemu-print: New qemu_printf(), qemu_vprintf() etc.
We commonly want to print to the current monitor if we have one, else
to stdout/stderr. For stderr, we have error_printf(). For stdout, all
we have is monitor_vfprintf(), which is rather unwieldy. We often
print to stderr just because error_printf() is easier.
New qemu_printf() and qemu_vprintf() do exactly what's needed. The
next commits will put them to use.
monitor error: Make printf()-like functions return a value
printf() & friends return the number of characters written on success,
negative value on error.
monitor_printf(), monitor_vfprintf(), monitor_vprintf(),
error_printf(), error_printf_unless_qmp(), error_vprintf(), and
error_vprintf_unless_qmp() return void. Some of them carry a TODO
comment asking for int instead.
Improve them to return int like printf() does.
This makes our use of monitor_printf() as fprintf_function slightly
less dirty: the function cast no longer adds a return value that isn't
there. It still changes a parameter's pointer type. That will be
addressed in a future commit.
monitor_vfprintf() always returns zero. Improve it to return the
proper value.
vl: Make -machine $TYPE,help and -accel help print to stdout
Command line help explicitly requested by the user should be
printed to stdout, not stderr. We do elsewhere. Adjust -machine
$TYPE,help and -accel help to match: use printf() instead of
error_printf().
s390x/kvm: Report warnings with warn_report(), not error_printf()
kvm_s390_mem_op() can fail in two ways: when !cap_mem_op, it returns
-ENOSYS, and when kvm_vcpu_ioctl() fails, it returns -errno set by
ioctl(). Its caller s390_cpu_virt_mem_rw() recovers from both
failures.
kvm_s390_mem_op() prints "KVM_S390_MEM_OP failed" with error_printf()
in the latter failure mode. Since this is obviously a warning, use
warn_report().
Perhaps the reporting should be left to the caller. It could warn on
failure other than -ENOSYS.
load_fit() reports errors with error_printf() instead of
error_report(). Worse, it even reports errors it actually recovers
from, in fit_cfg_compatible() and fit_load_fdt(). Messed up in
initial commit 51b58561c1d.
Convert the helper functions for load_fit() to Error. Make sure each
failure path sets an error.
Fix fit_cfg_compatible() and fit_load_fdt() not to report errors they
actually recover from.