Peter Maydell [Fri, 8 Dec 2017 16:57:28 +0000 (16:57 +0000)]
sparc: Make sure we mmap at SHMLBA alignment
SPARC Linux has an oddity that it insists that mmap()
of MAP_FIXED memory must be at an alignment defined by
SHMLBA, which is more aligned than the page size
(typically, SHMLBA alignment is to 16K, and pages are 8K).
This is a relic of ancient hardware that had cache
aliasing constraints, but even on modern hardware the
kernel still insists on the alignment.
To ensure that we get mmap() alignment sufficient to
make the kernel happy, change QEMU_VMALLOC_ALIGN,
qemu_fd_getpagesize() and qemu_mempath_getpagesize()
to use the maximum of getpagesize() and SHMLBA.
In particular, this allows 'make check' to pass on Sparc:
we were previously failing the ivshmem tests.
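The alignment rule described above can be sketched as follows. This is a minimal illustration, not the QEMU code: the helper names mmap_alignment() and align_up() are made up for the example, and the page size and SHMLBA are passed in as plain parameters instead of being read from the host.

```c
#include <assert.h>
#include <stddef.h>

/* Alignment to use for mmap(): the larger of the page size and SHMLBA. */
static size_t mmap_alignment(size_t page_size, size_t shmlba)
{
    return page_size > shmlba ? page_size : shmlba;
}

/* Round addr up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With the typical values quoted above (8K pages, 16K SHMLBA), mmap_alignment(8192, 16384) picks 16K, so an address that is only page aligned gets rounded up to the next 16K boundary.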
The existing QIOChannelSocket class provides the ability to
listen on a single socket at a time. This patch introduces
a QIONetListener class that provides a higher level API
concept around listening for network services, allowing
for listening on multiple sockets.
Peter Maydell [Fri, 15 Dec 2017 12:58:17 +0000 (12:58 +0000)]
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20171215-v2' into staging
s390x changes for 2.12:
- Lots of tcg improvements: ccw hotplug is now working and we can run
a Linux kernel built for z12 under tcg
- zPCI improvements to get virtio-pci working
- get rid of the cssid restrictions for virtual and non-virtual channel
devices
- we now support 8TB+ systems
- 2.12 compat machine
- fixes and cleanups
* remotes/cohuck/tags/s390x-20171215-v2: (46 commits)
s390-ccw-virtio: allow for systems larger that 7.999TB
s390x: change the QEMU cpu model to a stripped down z12
s390x/tcg: we already implement the Set-Program-Parameter facility
s390x/tcg: implement extract-CPU-time facility
s390x/tcg: Implement SIGNAL ADAPTER instruction
s390x/tcg: Implement STORE CHANNEL PATH STATUS
s390x/tcg: wire up SET CHANNEL MONITOR
s390x/tcg: wire up SET ADDRESS LIMIT
s390x/tcg: implement Interlocked-Access Facility 2
s390x/tcg: ASI/ASGI/ALSI/ALSGI are atomic with Interlocked-acccess facility 1
s390x/tcg: wire up STORE CHANNEL REPORT WORD
s390x/tcg: indicate value of TODPR in STCKE
s390x/tcg: implement SET CLOCK PROGRAMMABLE FIELD
s390x/tcg: fix and cleanup mcck injection
s390x/kvm: factor out build_channel_report_mcic() into cpu.h
s390x/css: attach css bridge
s390x: deprecate s390-squash-mcss machine prop
s390x/css: unrestrict cssids
s390x/pci: search for subregion inside the BARs
s390x/pci: move the memory region write from pcistg
...
Peter Maydell [Fri, 15 Dec 2017 11:13:43 +0000 (11:13 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.12-20171215' into staging
ppc patch queue 2017-12-15
First pull request for qemu-2.12. This has quite a bit of stuff
accumulated while 2.11 was finalizing. Highlights are:
* Some preliminary work towards implementing the "XIVE" POWER9
interrupt controller
* Some fixes for problems during reboot with MTTCG
* A substantial TCG performance improvement via
tcg_get_lookup_and_goto_ptr
* Numerous assorted cleanups and bugfixes that weren't urgent enough
for 2.11
* remotes/dgibson/tags/ppc-for-2.12-20171215: (24 commits)
spapr: don't initialize PATB entry if max-cpu-compat < power9
spapr: Assume msi_nonbroken
spapr: Rename machine init functions for clarity
target/ppc: introduce the PPC_BIT() macro
spapr_events: drop bogus cell from "interrupt-ranges" property
spapr: fix LSI interrupt specifiers in the device tree
spapr: replace numa_get_node() with lookup in pc-dimm list
spapr: introduce a spapr_qirq() helper
spapr: introduce a spapr_irq_set_lsi() helper
spapr: move the IRQ allocation routines under the machine
ppc/xics: assign of the CPU 'intc' pointer under the core
ppc/xics: introduce an icp_create() helper
spapr/rtas: do not reset the MSR in stop-self command
spapr/rtas: fix reboot of a a SMP TCG guest
spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
e500: fix pci host bridge class/type
openpic: debug w/ info_report()
pcc: define the Power-saving mode Exit Cause Enable bits in PowerPCCPUClass
nvram: add AT24Cx i2c eeprom
e500: name openpic and pci host bridge
...
s390-ccw-virtio: allow for systems larger that 7.999TB
KVM does not allow memory regions > KVM_MEM_MAX_NR_PAGES, basically
limiting the memory per slot to 8TB-4k. As memory slots on s390/kvm must
be a multiple of 1MB, we need to start a new memory region if we cross
8TB-1M.
With that (and optimistic overcommitment in the kernel) I was able to
start a 24TB guest on a 1TB system.
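The arithmetic behind the split can be sketched like this. It only assumes what the text states (a per-slot limit just below 8TB and 1MB granularity); MAX_CHUNK_SIZE and count_ram_chunks() are illustrative names, not the identifiers used by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)
#define TiB (UINT64_C(1) << 40)

/* Largest 1MB-aligned chunk that stays below the 8TB-4k per-slot limit. */
#define MAX_CHUNK_SIZE (8 * TiB - MiB)

/* Number of memory regions needed to back ram_size bytes of guest RAM. */
static unsigned count_ram_chunks(uint64_t ram_size)
{
    assert(ram_size % MiB == 0);
    return (unsigned)((ram_size + MAX_CHUNK_SIZE - 1) / MAX_CHUNK_SIZE);
}
```

Under these assumptions a 24TB guest needs four regions: three of 8TB-1M each, plus a final 3MB region for the remainder.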
Peter Maydell [Fri, 15 Dec 2017 09:52:07 +0000 (09:52 +0000)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20171214-tag' into staging
Xen 2017/12/14
# gpg: Signature made Fri 15 Dec 2017 00:26:26 GMT
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"
# gpg: aka "Stefano Stabellini <[email protected]>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20171214-tag:
xen/pt: Set is_express to avoid out-of-bounds write
xenfb: activate input handlers for raw pointer devices
xenfb: Add [feature|request]-raw-pointer
xenfb: Use Input Handlers directly
ui: generate qcode to linux mappings
xen-disk: use an IOThread per instance
Simon Gaiser [Sat, 28 Oct 2017 02:53:15 +0000 (04:53 +0200)]
xen/pt: Set is_express to avoid out-of-bounds write
The passed-through device might be an express device. In this case the
old code allocated an emulated config space that was too small in
pci_config_alloc(), since pci_config_size() returned the size for a
non-express device. This leads to an out-of-bounds write in
xen_pt_config_reg_init(), which sometimes results in crashes. So set
is_express as already done for KVM in vfio-pci.
Shortened ASan report:
==17512==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000041648 at pc 0x55e0fdac51ff bp 0x7ffe4af07410 sp 0x7ffe4af07408
WRITE of size 2 at 0x611000041648 thread T0
#0 0x55e0fdac51fe in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
#1 0x55e0fdac51fe in stw_he_p include/qemu/bswap.h:330
#2 0x55e0fdac51fe in stw_le_p include/qemu/bswap.h:379
#3 0x55e0fdac51fe in pci_set_word include/hw/pci/pci.h:490
#4 0x55e0fdac51fe in xen_pt_config_reg_init hw/xen/xen_pt_config_init.c:1991
#5 0x55e0fdac51fe in xen_pt_config_init hw/xen/xen_pt_config_init.c:2067
#6 0x55e0fdabcf4d in xen_pt_realize hw/xen/xen_pt.c:830
#7 0x55e0fdf59666 in pci_qdev_realize hw/pci/pci.c:2034
#8 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914
[...]
0x611000041648 is located 8 bytes to the right of 256-byte region [0x611000041540,0x611000041640)
allocated by thread T0 here:
#0 0x7ff596a94bb8 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd9bb8)
#1 0x7ff57da66580 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x50580)
#2 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914
[...]
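The ASan report above shows a 2-byte write at offset 0x108 into a 256-byte allocation. A minimal sketch of why is_express matters, assuming the usual config space sizes of 256 bytes for conventional PCI and 4096 bytes for PCI Express; the macro and function names are local to this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CONFIG_SIZE_PCI  0x100   /* 256 bytes, conventional PCI */
#define CONFIG_SIZE_PCIE 0x1000  /* 4096 bytes, PCI Express */

/* Size of the emulated config space for a device. */
static size_t config_size(bool is_express)
{
    return is_express ? CONFIG_SIZE_PCIE : CONFIG_SIZE_PCI;
}

/* True if a write of len bytes at offset stays inside the config space. */
static bool config_write_fits(size_t offset, size_t len, bool is_express)
{
    size_t size = config_size(is_express);

    assert(len > 0);
    return len <= size && offset <= size - len;
}
```

config_write_fits(0x108, 2, false) is false, which corresponds to the overflow in the report, while the same write fits once the device is flagged as express.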
Owen Smith [Fri, 3 Nov 2017 11:56:31 +0000 (11:56 +0000)]
xenfb: activate input handlers for raw pointer devices
If the frontend requests raw pointers, the input handlers must be
activated to have the input events delivered to the xenfb backend.
Without activation, the input events are delivered to handlers
registered earlier, which would be the emulated USB tablet or
emulated PS/2 mouse.
HVM xen_kbdfront can incorrectly scale absolute coordinates when
the display resolution is not 800x600.
Owen Smith [Fri, 3 Nov 2017 11:56:30 +0000 (11:56 +0000)]
xenfb: Add [feature|request]-raw-pointer
Writes "feature-raw-pointer" during init to indicate the backend
can pass raw unscaled values for absolute axes to the frontend.
Frontends set "request-raw-pointer" to indicate the backend should
not attempt to scale absolute values to console size.
"request-raw-pointer" is only valid if "request-abs-pointer" is
also set. Raw unscaled pointer values are in the range [0, 0x7fff].
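For comparison, this is roughly what scaling a raw value to the console size looks like when the frontend does not set "request-raw-pointer". The formula and the name scale_abs_to_console() are assumptions made for illustration; the commit message does not spell out the exact computation used by the backend.

```c
#include <assert.h>
#include <stdint.h>

#define RAW_ABS_MAX 0x7fff  /* upper bound of the raw range given above */

/* Map a raw value in [0, 0x7fff] onto a coordinate in [0, size - 1]. */
static uint32_t scale_abs_to_console(uint32_t raw, uint32_t size)
{
    assert(raw <= RAW_ABS_MAX && size > 0);
    /* 64-bit intermediate so that large console sizes cannot overflow. */
    return (uint32_t)((uint64_t)raw * (size - 1) / RAW_ABS_MAX);
}
```

With raw pointers requested, the backend would skip this step and pass the value in [0, 0x7fff] through unchanged.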
Owen Smith [Fri, 3 Nov 2017 11:56:29 +0000 (11:56 +0000)]
xenfb: Use Input Handlers directly
Avoid the unnecessary calls through the input-legacy.c file by
using the qemu_input_handler_*() calls directly. This did require
reworking the event and sync handlers to use the reverse mapping
from qcode to linux using qemu_input_qcode_to_linux().
Removes the scancode2linux mapping, and supporting documentation.
Paul Durrant [Tue, 7 Nov 2017 10:46:53 +0000 (05:46 -0500)]
xen-disk: use an IOThread per instance
This patch allocates an IOThread object for each xen_disk instance and
sets the AIO context appropriately on connect. This allows processing
of I/O to proceed in parallel.
The patch also adds tracepoints into xen_disk to make it possible to
follow the state transitions of an instance in the log.
Laurent Vivier [Thu, 14 Dec 2017 18:09:48 +0000 (19:09 +0100)]
spapr: don't initialize PATB entry if max-cpu-compat < power9
If KVM is enabled and the KVM radix MMU capability is available,
the partition table entry (patb_entry) for the radix mode is
initialized by default in ppc_spapr_reset().
This is a problem if we want to migrate the guest to a POWER8 host
before the guest kernel has started and set the value to the one
expected for a POWER8 CPU.
"-machine max-cpu-compat=power8" should allow migrating a guest
from a POWER9 KVM host to a POWER8 KVM host, but because patb_entry
is set, the destination QEMU tries to enable radix mode on the
POWER8 host. This fails and cancels the migration:
Process table config unsupported by the host
error while loading state for instance 0x0 of device 'spapr'
load of migration failed: Invalid argument
This patch doesn't set the PATB entry if the user provides
a CPU compatibility mode that doesn't support radix mode.
David Gibson [Fri, 8 Dec 2017 03:11:49 +0000 (14:11 +1100)]
spapr: Assume msi_nonbroken
We conditionally adjust part of the guest device tree based on the
global msi_nonbroken flag. However, the main machine type code
initializes msi_nonbroken to true and there's nothing that would set
it to false again.
David Gibson [Fri, 8 Dec 2017 01:47:34 +0000 (12:47 +1100)]
spapr: Rename machine init functions for clarity
Machine objects have two init functions - the generic QOM level
instance_init which should only do static object initialization, and
the Machine specific MachineClass::init which does the actual
construction of the machine.
In spapr the functions implementing these two have names -
ppc_machine_initfn() and ppc_spapr_init() - which don't correspond closely
to either of those. To prevent people (read, me) from confusing the
two, rename them spapr_instance_init() and spapr_machine_init() to
make it clearer which is which.
While we're there rename ppc_spapr_reset() to spapr_machine_reset() to
match.
Greg Kurz [Wed, 6 Dec 2017 08:16:52 +0000 (09:16 +0100)]
spapr_events: drop bogus cell from "interrupt-ranges" property
According to LoPAPR 1.1 B.6.12, the "/event-sources" node has an
"interrupt-ranges" property, the format of which is described in
B.6.9.1.2 as follows:
“interrupt-ranges”
Standard property name that defines the interrupt number(s) and range(s)
handled by this unit.
prop-encoded-array: List of (int-number, range) specifications.
Int-number is encoded as with encode-int.
Range is encoded as with encode-int.
The first entry in this list shall contain the int-number associated with
the first “reg” property entry. The int-number is the value representing
the interrupt source as would appear in the PowerPC External Interrupt
Architecture XISR. The range shall be the number of sequential interrupt
numbers which this unit can generate.
There's no such thing as a cell count at the end of the array, like the
one introduced by commit ffbb1705a33d in QEMU 2.8. It doesn't seem to
have had any impact on existing guests and I couldn't find any related
workaround in Linux. So, let's just drop the bogus lines.
Greg Kurz [Wed, 6 Dec 2017 08:13:16 +0000 (09:13 +0100)]
spapr: fix LSI interrupt specifiers in the device tree
LoPAPR 1.1 B.6.9.1.2 describes the "#interrupt-cells" property of the
PowerPC External Interrupt Source Controller node as follows:
“#interrupt-cells”
Standard property name to define the number of cells in an interrupt-
specifier within an interrupt domain.
prop-encoded-array: An integer, encoded as with encode-int, that denotes
the number of cells required to represent an interrupt specifier in its
child nodes.
The value of this property for the PowerPC External Interrupt option shall
be 2. Thus all interrupt specifiers (as used in the standard “interrupts”
property) shall consist of two cells, each containing an integer encoded
as with encode-int. The first integer represents the interrupt number the
second integer is the trigger code: 0 for edge triggered, 1 for level
triggered.
This patch fixes the interrupt specifiers in the "interrupt-map" property
of the PHB node, that were setting the second cell to 8 (confusion with
IRQ_TYPE_LEVEL_LOW ?) instead of 1.
VIO devices and RTAS event sources use the same format for interrupt
specifiers: while here, we introduce a common helper to handle the
encoding details.
Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Tested-by: Cédric Le Goater <[email protected]>
--
v3: - reference public LoPAPR instead of internal PAPR+ in changelog
- change helper name to spapr_dt_xics_irq()
v2: - drop the erroneous changes to the "interrupts" prop in PCI device nodes
- introduce a common helper to encode interrupt specifiers
Signed-off-by: David Gibson <[email protected]>
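The two-cell interrupt specifier quoted from LoPAPR can be sketched as below. This is an illustration of the encoding only, not the spapr_dt_xics_irq() helper itself: put_be32() and encode_irq_specifier() are names invented for the example, and each cell is written as a 32-bit big-endian integer, which is what encode-int produces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value big-endian, as device tree cells are encoded. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Fill a two-cell specifier: cell 0 = interrupt number, cell 1 = trigger. */
static void encode_irq_specifier(uint8_t out[8], uint32_t irq, bool level)
{
    assert(out != NULL);
    put_be32(out, irq);
    put_be32(out + 4, level ? 1 : 0);  /* 0 = edge, 1 = level */
}
```

A level-triggered (LSI) interrupt therefore ends with a second cell of 1, not 8 as the old "interrupt-map" entries had.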
Igor Mammedov [Tue, 5 Dec 2017 15:41:17 +0000 (16:41 +0100)]
spapr: replace numa_get_node() with lookup in pc-dimm list
SPAPR is the last user of numa_get_node() and of a bunch of
supporting code that maintains the numa_info[x].addr list.
Get the LMB node id from the pc-dimm list instead, which allows
removing ~80LOC that maintain the dynamic address range
lookup list.
It also removes the pc-dimm dependency on numa_[un]set_mem_node_id(),
makes pc-dimms the sole source of information about which
node they belong to, and removes duplicate data from the global
numa_info.
xics_get_qirq() is only used by the sPAPR machine. Let's move it there
and change its name to reflect its scope. It will be useful for XIVE
support which will use its own set of qirqs.
spapr: move the IRQ allocation routines under the machine
Also change the prototype to use a sPAPRMachineState and prefix them
with spapr_irq_. It will let us synchronise the IRQ allocation with
the XIVE interrupt mode when available.
ppc/xics: assign of the CPU 'intc' pointer under the core
The 'intc' pointer of the CPU references the interrupt presenter in
the XICS interrupt mode. When the XIVE interrupt mode is available and
activated, the machine will need to reassign this pointer to reflect
the change.
Moving this assignment under the realize routine of the CPU will ease
the process when the interrupt mode is toggled.
The sPAPR and the PowerNV core objects create the interrupt presenter
object of the CPUs in a very similar way. Let's provide a common
routine in which we use the presenter 'type' as a child identifier.
Cédric Le Goater [Fri, 24 Nov 2017 07:05:50 +0000 (08:05 +0100)]
spapr/rtas: do not reset the MSR in stop-self command
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
The CPU is now also protected from the decrementer interrupt by the
LPCR:PECE* bits which are disabled in the 'stop-self' RTAS
call. Resetting the MSR is pointless.
Cédric Le Goater [Fri, 24 Nov 2017 07:05:49 +0000 (08:05 +0100)]
spapr/rtas: fix reboot of a a SMP TCG guest
Just like for hot-unplugged CPUs, when a guest is rebooted, the secondary
CPUs can be awakened by the decrementer and start entering SLOF at the
same time as the boot CPU.
To be safe, let's disable on the secondaries all the exceptions which
can cause an exit while the CPU is in power-saving mode.
Cédric Le Goater [Fri, 24 Nov 2017 07:05:48 +0000 (08:05 +0100)]
spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
If the DECR timer fires after 'stop-self' is called and before the CPU
'stop' state is reached, the nearly-dead CPU will have some work to do
and the guest will crash. This case happens very frequently with the
not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
occasionally fired but after 'stop' state, so no work is to be done
and the guest survives.
I suspect there is a race between the QEMU mainloop triggering the
timers and the TCG CPU thread but I could not quite identify the root
cause. To be safe, let's disable in the LPCR all the exceptions which
can cause an exit while the CPU is in power-saving mode and reenable
them when the CPU is started.
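The effect of clearing the exit-cause enable bits can be modelled with a few lines. This is a simplified sketch: EXAMPLE_PECE_MASK and the bit used for the decrementer below are placeholder values chosen for the example, since the real LPCR:PECE* positions depend on the CPU model, and the function names are not taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder mask standing in for the LPCR:PECE* enable bits. */
#define EXAMPLE_PECE_MASK UINT64_C(0x0001f000)

/* Called on 'stop-self': no exception may wake the CPU any more. */
static uint64_t lpcr_disable_wakeups(uint64_t lpcr)
{
    return lpcr & ~EXAMPLE_PECE_MASK;
}

/* Called when the CPU is started again. */
static uint64_t lpcr_enable_wakeups(uint64_t lpcr)
{
    return lpcr | EXAMPLE_PECE_MASK;
}

/* A halted CPU only has work if a pending cause is enabled in the LPCR. */
static bool halted_cpu_has_work(uint64_t lpcr, uint64_t pending_causes)
{
    assert((pending_causes & ~EXAMPLE_PECE_MASK) == 0);
    return (lpcr & pending_causes) != 0;
}
```

In this model a decrementer that fires after 'stop-self' finds its enable bit cleared, so the nearly-dead CPU is never considered to have work.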
Correct some confusion wrt. the PCI facing
side of the PCI host bridge (not PCIe root complex).
The ref. manual for the mpc8533 (as well as
mpc8540 and mpc8540) gives the class code as
PCI_CLASS_PROCESSOR_POWERPC.
While the PCI_HEADER_TYPE field is oddly omitted,
the tables in the "PCI Configuration Header"
section show a type 0 layout using all 6 BAR
registers (as 2x 32, and 2x 64 bit regions).
So 997505065dc92e533debf5cb23012ba4e673d387
seems to be in error. Although there was
perhaps some confusion as the mpc8533
has a separate PCIe root complex.
With PCIe, a root complex has PCI_HEADER_TYPE=1.
Neither the PCI host bridge, nor the PCIe
root complex advertise class PCI_CLASS_BRIDGE_PCI.
This was confusing Linux guests, which try
to interpret the host bridge as a pci-pci
bridge, but get confused and re-enumerate
the bus when the primary/secondary/subordinate
bus registers don't have valid values.
Greg Kurz [Mon, 20 Nov 2017 09:19:54 +0000 (10:19 +0100)]
spapr_cpu_core: instantiate CPUs separately
The current code assumes that only the CPU core object holds a
reference on each individual CPU object, and happily frees their
allocated memory when the core is unrealized. This is dangerous
as some other code can legitimately keep a pointer to a CPU if it
calls object_ref(), but it would end up with a dangling pointer.
Let's allocate all CPUs with object_new() and let QOM free them
when their reference count reaches zero. This greatly simplifies the
code as we don't have to fiddle with the instance size anymore.
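The lifetime rule this relies on can be shown with a toy reference-counted object. This is not QOM: Obj, obj_new(), obj_ref() and obj_unref() are stand-ins written for the example, and live_objects exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Obj {
    unsigned refcount;
} Obj;

static int live_objects;  /* number of objects not yet freed */

/* Allocate an object holding one reference, owned by the caller. */
static Obj *obj_new(void)
{
    Obj *o = malloc(sizeof(*o));

    if (!o) {
        abort();
    }
    o->refcount = 1;
    live_objects++;
    return o;
}

/* Take an additional reference. */
static void obj_ref(Obj *o)
{
    assert(o->refcount > 0);
    o->refcount++;
}

/* Drop a reference; the object is freed only when the last one goes. */
static void obj_unref(Obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        live_objects--;
        free(o);
    }
}
```

If the core drops its reference while some other code still holds one, the CPU object stays valid until that second reference is released, so nobody is left with a dangling pointer.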
s390x: change the QEMU cpu model to a stripped down z12
We are good enough to boot upstream Linux kernels / Fedora 26/27. That
should be sufficient for now.
As the QEMU CPU model is migration safe, let's add compatibility code.
Generate the feature list to reduce the chance of messing things up in the
future.
s390x/tcg: we already implement the Set-Program-Parameter facility
The Set-Program-Parameter facility (also known as Load-Program-Parameter
facility) provides the LPP instruction used to load the program
parameter. We already implement that instruction in TCG, so add it to our
list.
Note: Not documented in the PoP but in the "The Load-Program-Parameter and
CPU-Measurement Facilities" (SA23-2260-05) document.
While at it, make the whole list ordered (according to cpu_features_def.h).
It only provides the EXTRACT CPU TIME instruction. We can reuse the stpt
helper, which calculates the CPU timer value.
As the instruction is not privileged, but we don't have a CPU timer
value in case of linux user, we simply reuse cpu_get_host_ticks() to
produce some descending value.
Just like KVM does, we should suppress this instruction:
When this instruction is not provided, it is
checked for privileged operation exception and the
instruction is suppressed by the machine
Let's handle it just like KVM:
Depending on the model, this instruction may not be
provided. When this instruction is not provided, it is
checked for operand exception and privileged-operation
exception, and then is suppressed.
The architecture mode indication wasn't stored. The split of certain
64bit fields was unnecessary. Also, the complete clock comparator, not
just bit 0-55 (starting at byte 1) was stored.
We now generate a proper MCIC via the same helper we use for KVM.
There is more to clean up, but we will change the other parts later on
either way.
s390x/kvm: factor out build_channel_report_mcic() into cpu.h
We'll need it later on in two places. Refactor it to just indicate the
validity bits. While at it, introduce a define for the used CR14 bit (which
we'll also need later on).
Halil Pasic [Wed, 6 Dec 2017 14:44:38 +0000 (15:44 +0100)]
s390x: deprecate s390-squash-mcss machine prop
With the cssids unrestricted (commit "s390x/css: unrestrict cssids") the
s390-squash-mcss machine property should not be used. Actually Libvirt
never supported this, so the expectation is that removing it should be
pretty painless. But let's play nice and deprecate it first.
Halil Pasic [Wed, 6 Dec 2017 14:44:37 +0000 (15:44 +0100)]
s390x/css: unrestrict cssids
The default css 0xfe is currently restricted to virtual subchannel
devices. When that decision was made, the hope was that non-virtual
subchannel devices would come around once guests could exploit multiple
channel subsystems. Since guests generally don't, the pain
of the partitioned (cssid) namespace outweighs the gain.
Let us remove the corresponding restrictions (virtual devices
can be put only in 0xfe and non-virtual devices in any css except
the 0xfe -- while s390-squash-mcss then remaps everything to cssid 0).
At the same time, change our schema for generating css bus ids to put
both virtual and non-virtual devices into the default css (spilling over
into other css images, if needed). The intention is to deprecate
s390-squash-mcss. With this change devices without a specified devno
won't end up hidden to guests not supporting multiple channel subsystems,
unless this can not be avoided (default css full).
Let us also advertise the changes to the management software (so it can
tell whether cssids are unrestricted or restricted).
The adverse effect of getting rid of the restriction on migration should
not be too severe. Vfio-ccw devices are not live-migratable yet, and for
virtual devices using the extra freedom would only make sense with the
aforementioned guest support in place.
The auto-generated bus ids are affected by both changes. We hope to not
encounter any auto-generated bus ids in production as Libvirt is always
explicit about the bus id. Since 8ed179c937 ("s390x/css: catch section
mismatch on load", 2017-05-18) the worst that can happen because the same
device ended up having a different bus id is a cleanly failed migration.
I find it hard to reason about the impact of changed auto-generated bus
ids on migration for command line users as I don't know which rules
such a user is supposed to follow.
Another pain-point is down- or upgrade of QEMU for command line users.
The old way and the new way of doing vfio-ccw are mutually incompatible.
Libvirt is only going to support the new way, so for libvirt users, the
possible problems at QEMU downgrade are the following. If a domain
contains virtual devices placed into a css other than 0xfe, the domain
will refuse to start with a QEMU not having this patch. Putting devices
into a css other than 0xfe however won't make much sense in the near
future (guest support). Libvirt will refuse to do vfio-ccw with a QEMU
not having this patch. This is business as usual.
Pierre Morel [Thu, 30 Nov 2017 12:55:30 +0000 (13:55 +0100)]
s390x/pci: search for subregion inside the BARs
When dispatching memory access to PCI BAR region, we must
look for possible subregions, used by the PCI device to map
different memory areas inside the same PCI BAR.
Since the data offset we received is calculated starting at the
region start address we need to adjust the offset for the subregion.
The data offset inside the subregion is calculated by subtracting
the subregion's starting address from the data offset in the region.
The access to the MSIX region is now handled in a generic way,
we do not need the specific trap_msix() function anymore.
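The offset adjustment can be sketched as follows. SubRegion and find_subregion() are hypothetical names for this example; the real code walks QEMU MemoryRegion subregions, which is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A subregion described by its start offset inside the BAR and its size. */
typedef struct SubRegion {
    uint64_t start;
    uint64_t size;
} SubRegion;

/*
 * Find the subregion containing 'offset' (relative to the BAR start).
 * On success, return its index and store the offset relative to the
 * subregion start in *sub_offset; return -1 if no subregion matches.
 */
static int find_subregion(const SubRegion *subs, int n, uint64_t offset,
                          uint64_t *sub_offset)
{
    assert(subs != NULL && sub_offset != NULL);
    for (int i = 0; i < n; i++) {
        if (offset >= subs[i].start && offset - subs[i].start < subs[i].size) {
            *sub_offset = offset - subs[i].start;
            return i;
        }
    }
    return -1;
}
```

An access at BAR offset 0x2010 that falls into a subregion starting at 0x2000 is dispatched to that subregion with offset 0x10.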
Pierre Morel [Thu, 30 Nov 2017 12:55:29 +0000 (13:55 +0100)]
s390x/pci: move the memory region write from pcistg
Let's move the memory region write from pcistg into a dedicated
function.
This allows us to prepare a later patch searching for subregions
inside of the memory region.
Pierre Morel [Thu, 30 Nov 2017 12:55:24 +0000 (13:55 +0100)]
s390x/pci: factor out endianess conversion
There are two places where the same endianness conversion
is done.
Let's factor this out into a static function.
Note that the conversion must always be done for data in a register:
The S390 BE guest converted the data to LE before issuing the instruction.
After interception in a BE host:
ZPCI VFIO using pwrite must make the conversion back for the BE kernel.
The kernel will do the BE to LE translation when loading the register for
the real instruction.
After interception in an LE host:
TCG stores a BE register in LE, swapping bytes.
But since the data in the register was already LE, it is now BE.
ZPCI VFIO must convert it to LE before writing to the PCI memory.
In both cases ZPCI VFIO must swap the bytes from the register.
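A byte swap over the low bytes of a register value, for the access lengths 1, 2, 4 and 8, might look like this. The function name swap_register_bytes() is made up for the example and is not claimed to match the static function the patch introduces.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the order of the low 'len' bytes of 'value'. */
static uint64_t swap_register_bytes(uint64_t value, unsigned len)
{
    uint64_t out = 0;

    assert(len == 1 || len == 2 || len == 4 || len == 8);
    for (unsigned i = 0; i < len; i++) {
        out = (out << 8) | (value & 0xff);  /* lowest byte becomes highest */
        value >>= 8;
    }
    return out;
}
```

Applying the swap twice with the same length gives back the original value, which is why the same helper serves both directions.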
All users are gone, we can finally drop it and make sure that all new
program interrupt injections are reminded of the retaddr - as they have to
use s390_program_interrupt() now.
s390x: handle exceptions during s390_cpu_virt_mem_rw() correctly (TCG)
s390_cpu_virt_mem_rw() must always return, so callers can react on
an exception (e.g. see ioinst_handle_stcrw()).
However, for TCG we always have to exit the cpu loop (and restore the
cpu state before that) if we injected a program interrupt. So let's
introduce and use s390_cpu_virt_mem_handle_exc() in code that is not
purely KVM.
Directly pass the retaddr we already have available in these functions.
s390x/pci: pass the retaddr to all PCI instructions
Once we wire up TCG, we will need the retaddr to correctly inject
program interrupts. As we want to get rid of the function
program_interrupt(), convert PCI code too.
For KVM, we can simply use RA_IGNORED.
Convert program_interrupt() to s390_program_interrupt() directly, making
use of the passed address.
valgrind pointed out that we call KVM_S390_GET_IRQ_STATE with an
undefined value for flags. Kernels prior to 4.15 did not use that
field, and later kernels ignore it for compatibility reasons, but we
better play safe.
The same is true for SET_IRQ_STATE. We should make sure to not use the
flag field, either.
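One way to make sure no field is left undefined is to build the ioctl argument with a designated initializer, so every member that is not named is zeroed. The struct below is a stand-in defined for this example, loosely modelled on an "IRQ state" argument; it is not copied from the kernel headers, and make_irq_state_arg() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the ioctl argument; layout is an assumption of this sketch. */
struct irq_state_arg {
    uint64_t buf;
    uint32_t flags;
    uint32_t len;
    uint32_t reserved[4];
};

/* Build the argument; members not listed (flags, reserved) are zeroed. */
static struct irq_state_arg make_irq_state_arg(void *buf, uint32_t len)
{
    assert(buf != NULL);
    struct irq_state_arg arg = {
        .buf = (uint64_t)(uintptr_t)buf,
        .len = len,
    };
    return arg;
}
```

Compared with declaring the struct and assigning only buf and len, this leaves nothing for valgrind to flag.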
Peter Maydell [Thu, 14 Dec 2017 14:22:17 +0000 (14:22 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20171213' into staging
target-arm queue:
* xilinx_spips: set reset values correctly
* MAINTAINERS: fix an email address
* hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
* nvic: Make systick banked for v8M
* refactor get_phys_addr() so we can return the right format PAR
for ATS operations
* implement v8M TT instruction
* fix some minor v8M bugs
* Implement reset for GICv3 ITS
* xlnx-zcu102: Add support for the ZynqMP QSPI
* remotes/pmaydell/tags/pull-target-arm-20171213: (43 commits)
xilinx_spips: Use memset instead of a for loop to zero registers
xilinx_spips: Set all of the reset values
xilinx_spips: Update the QSPI Mod ID reset value
MAINTAINERS: replace the unavailable email address
hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
nvic: Make systick banked
nvic: Make nvic_sysreg_ns_ops work with any MemoryRegion
target/arm: Extend PAR format determination
target/arm: Remove fsr argument from get_phys_addr() and arm_tlb_fill()
target/arm: Ignore fsr from get_phys_addr() in do_ats_write()
target/arm: Use ARMMMUFaultInfo in deliver_fault()
target/arm: Convert get_phys_addr_pmsav8() to not return FSC values
target/arm: Convert get_phys_addr_pmsav7() to not return FSC values
target/arm: Convert get_phys_addr_pmsav5() to not return FSC values
target/arm: Convert get_phys_addr_lpae() to not return FSC values
target/arm: Convert get_phys_addr_v6() to not return FSC values
target/arm: Convert get_phys_addr_v5() to not return FSC values
target/arm: Remove fsr argument from arm_ld*_ptw()
target/arm: Provide fault type enum and FSR conversion functions
target/arm: Implement TT instruction
...
Vadim Galitsyn [Mon, 23 Oct 2017 15:13:10 +0000 (17:13 +0200)]
tests: test-hmp: print command execution result
Provide the HMP monitor command execution result as it would be seen
by a user who established an HMP monitor session.
Currently many commands may silently fail without any sign of that.
This patch lets this information be printed when the test is running in
verbose mode.
In the future it might be useful to fail the test if a command has
failed; however, that would require a bit of rework inside the test
engine itself.
A simple example of a silent failure is to
add some non-existent HMP command to the 'hmp_cmds' list. In this case
the test will report that it passed without error.
Thomas Huth [Thu, 30 Nov 2017 20:19:00 +0000 (21:19 +0100)]
hmp-commands: Remove the deprecated usb_add and usb_del
It's easy to use device_add and device_del as replacements instead.
The usb_add and usb_del commands have been deprecated since QEMU 2.10,
and nobody complained that they are still needed, so let's get rid
of them now to make the HMP interface a little bit less overloaded.
hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
The ctz32() routine could return a value greater than
TC6393XB_GPIOS=16, because the device has 24 GPIO level
bits but we only implement 16 outgoing lines. This could
lead to an OOB array access. Mask 'level' to avoid it.
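The masking fix can be sketched like this: only the low 16 bits of 'level' are considered before each set bit is turned into a line index. NUM_GPIO_LINES mirrors TC6393XB_GPIOS=16 from the text; count_trailing_zeros32() and collect_gpio_lines() are names local to this example rather than the device model's own functions.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_GPIO_LINES 16  /* matches TC6393XB_GPIOS=16 from the text */

/* Index of the lowest set bit, or 32 if no bit is set. */
static int count_trailing_zeros32(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Store the index of every set bit below NUM_GPIO_LINES; return the count. */
static int collect_gpio_lines(uint32_t level, int lines[NUM_GPIO_LINES])
{
    int count = 0;

    level &= (UINT32_C(1) << NUM_GPIO_LINES) - 1;  /* drop bits 16..23 */
    while (level) {
        int bit = count_trailing_zeros32(level);

        assert(bit < NUM_GPIO_LINES);
        lines[count++] = bit;
        level &= level - 1;  /* clear the lowest set bit */
    }
    return count;
}
```

Without the mask, a level value with only bit 16 or above set would produce an index outside the 16-entry handler array.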
Peter Maydell [Wed, 13 Dec 2017 17:59:26 +0000 (17:59 +0000)]
nvic: Make systick banked
For the v8M security extension, there should be two systick
devices, which use separate banked systick exceptions. The
register interface is banked in the same way as for other
banked registers, including the existence of an NS alias
region for secure code to access the nonsecure timer.
Peter Maydell [Wed, 13 Dec 2017 17:59:26 +0000 (17:59 +0000)]
nvic: Make nvic_sysreg_ns_ops work with any MemoryRegion
Generalize nvic_sysreg_ns_ops so that we can pass it an
arbitrary MemoryRegion which it will use as the underlying
register implementation to apply the NS-alias behaviour
to. We'll want this so we can do the same with systick.
Now that do_ats_write() is entirely in control of whether to
generate a 32-bit PAR or a 64-bit PAR, we can make it use the
correct (complicated) condition for doing so.
Signed-off-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1512503192[email protected]
[PMM: Rebased Edgar's patch on top of get_phys_addr() refactoring;
use arm_s1_regime_using_lpae_format() rather than
regime_using_lpae_format() because the latter will assert
if passed ARMMMUIdx_S12NSE0 or ARMMMUIdx_S12NSE1;
updated commit message appropriately]
Signed-off-by: Peter Maydell <[email protected]>
Peter Maydell [Wed, 13 Dec 2017 17:59:25 +0000 (17:59 +0000)]
target/arm: Ignore fsr from get_phys_addr() in do_ats_write()
In do_ats_write(), rather than using the FSR value from get_phys_addr(),
construct the PAR values using the information in the ARMMMUFaultInfo
struct. This allows us to create a PAR of the correct format regardless
of what the translation table format is.
For the moment we leave the condition for "when should this be a
64 bit PAR" as it was previously; this will need to be fixed to
properly support AArch32 Hyp mode.
Peter Maydell [Wed, 13 Dec 2017 17:59:25 +0000 (17:59 +0000)]
target/arm: Use ARMMMUFaultInfo in deliver_fault()
Now that ARMMMUFaultInfo is guaranteed to have enough information
to construct a fault status code, we can pass it in to the
deliver_fault() function and let it generate the correct type
of FSR for the destination, rather than relying on the value
provided by get_phys_addr().
I don't think there are any cases the old code was getting
wrong, but this is more obviously correct.