Git Repo - qemu.git/log

dump: add Windows dump format to dump-guest-memory

This patch adds a Windows crashdump feature. QEMU can now produce an ELF dump
containing a Windows crashdump header, which helps convert it to a valid
crashdump file that WinDbg understands, or create such a file directly.
The crashdump is obtained by joining the physical memory dump with the 8K header
exposed through the vmcoreinfo/fw_cfg device by a guest driver at BSOD time. The
'-w' option was added to the dump-guest-memory command. At the moment, only the
x64 configuration is supported.
A suitable driver can be found at
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/fwcfg64

Signed-off-by: Viktor Prutyanov <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Message-Id: <20180517162342 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

i386/cpu: make -cpu host support monitor/mwait

When guest CPU PM is enabled, and with -cpu host, expose the host CPU
MWAIT leaf in the CPUID so guest can make good PM decisions.

Note: the result is 100% CPU utilization reported by host as host
no longer knows that the CPU is halted.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>
Message-Id: <20180622192148 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: support -overcommit cpu-pm=on|off

With this flag, kvm allows the guest to control the host CPU power state. This
increases latency for other processes using the same host CPU in an
unpredictable way, but it decreases idle entry/exit times for the
running VCPU. To use it, QEMU needs a hint about whether the host CPU is
overcommitted, hence the flag name.

Follow-up patches will expose this capability to guest
(using mwait leaf).

Based on a patch by Wanpeng Li <[email protected]> .

Signed-off-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20180622192148 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hmp: obsolete "info ioapic"

Let's start to use "info pic" just like other platforms. For now we
keep the old command for a while so that existing users can learn which
new command to use.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20171229073104 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

ioapic: support "info irq"

This includes both the userspace and the in-kernel ioapic.  Note that the
numbers can be inaccurate for kvm-ioapic.  One reason is the same as for
kvm-i8259: when irqfd is used, irqs can be delivered entirely inside the
kernel without our notice.  In addition, kvm-ioapic is treated specially:
irqs with numbers < ISA_NUM_IRQS are also delivered in the kernel, via
kvm-i8259 (please refer to kvm_pc_gsi_handler).

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20171229073104 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

ioapic: some proper indents when dump info

So that it now looks better alongside other irqchips.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20171229073104 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

ioapic: support "info pic"

People start to use "info pic" for all kinds of irqchip dumps. Let x86
ioapic join the family. It dumps the same thing as "info ioapic".

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20171229073104 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

doc: another fix to "info pic"

Something that commit 254316fa1f ("intc: make HMP 'info irq' and 'info
pic' commands available on all targets", 2016-10-04) forgot to touch up.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20171229073104 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target-i386: Mark cpu_vmexit noreturn

It calls cpu_loop_exit in system emulation mode (and should never be
called in user emulation mode).

Signed-off-by: Jan Kiszka <[email protected]>
Message-Id: <6f4d44ffde55d074cbceb48309c1678600abad2f.1522769774 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target-i386: Allow interrupt injection after STGI

We need to terminate the translation block after STGI so that pending
interrupts can be injected.

This fixes pending NMI injection for Jailhouse which uses "stgi; clgi"
to open a brief injection window.

Signed-off-by: Jan Kiszka <[email protected]>
Message-Id: <37939b244dda0e9cccf96ce50f2b15df1e48315d.1522769774 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target-i386: Add NMI interception to SVM

Check for SVM interception prior to injecting an NMI. Tested via the
Jailhouse hypervisor.

Signed-off-by: Jan Kiszka <[email protected]>
Message-Id: <c65877e9a011ee4962931287e59f502c482b8d0b.1522769774 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

memory/hmp: Print owners/parents in "info mtree"

This adds printing of owners/parents (which are the same, just occasionally
owner==NULL) for memory regions; a new '-o' flag
enables the new output.

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Message-Id: <20180604032511 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

WHPX: register for unrecognized MSR exits

Some variations of Linux kernels end up accessing MSRs that the Windows
Hypervisor doesn't implement, which causes a GP to be returned to the guest.
This fix registers QEMU for unimplemented MSR access and globally returns 0 on
reads and ignores writes. This behavior allows the Linux kernel to probe the
MSR with the write/read/check sequence it often uses without failing the access.

Signed-off-by: Justin Terry (VM) <[email protected]>
Message-Id: <20180605221500 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

WHPX workaround bug in OSVW handling

Adds a workaround for an incorrect value setting
CPUID Fn8000_0001_ECX[bit 9 OSVW] = 1. This can cause a guest Linux kernel
to panic when a rdmsr of C001_0140h returns 0. Disabling this feature
allows the guest to boot correctly without accessing the OSVW workarounds.

Signed-off-by: Justin Terry (VM) <[email protected]>
Message-Id: <20180605221500 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

esp: remove legacy esp_init() function

Remove the legacy esp_init() function now that there are no more remaining
users.

Signed-off-by: Mark Cave-Ayland <[email protected]>
Message-Id: <20180613094727 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Tested-by: Hervé Poussineau <[email protected]>

hw/mips/jazz: create ESP device directly via qdev

MIPS jazz is the last user of the legacy esp_init() function so move creation
of the ESP device over to use qdev.

Note that the esp_reset and dma_enable qemu_irqs are currently unused and so
we do not wire these up and instead remove the variables to prevent the
compiler emitting unused variable warnings.

Signed-off-by: Mark Cave-Ayland <[email protected]>
Message-Id: <20180613094727 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Tested-by: Hervé Poussineau <[email protected]>

pr-manager-helper: report event on connection/disconnection

Let management know if there were any problems communicating with
qemu-pr-helper. The event is edge-triggered, and is sent every time
the connection status of the pr-manager-helper object changes.

Signed-off-by: Paolo Bonzini <[email protected]>

pr-manager: add query-pr-managers QMP command

This command lets you query the connection status of each pr-manager-helper
object.

Signed-off-by: Paolo Bonzini <[email protected]>

pr-manager: put stubs in .c file

Signed-off-by: Paolo Bonzini <[email protected]>

pr-manager-helper: avoid SIGSEGV when writing to the socket fail

When writing to the qemu-pr-helper socket failed, the persistent
reservation manager was correctly disconnecting the socket, but it
did not clear pr_mgr->ioc. So the rest of the code did not know
that the socket had been disconnected, accessed pr_mgr->ioc and
happily caused a crash.

To reproduce, it is enough to stop qemu-pr-helper between QEMU
startup and executing e.g. sg_persist -k /dev/sdb.

Reviewed-by: Michal Privoznik <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
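
The fix described in the pr-manager-helper commit above follows a common
pattern: once a connection is torn down, the pointer to it is cleared so that
later code can see the disconnected state. A minimal sketch of that pattern,
not the code from the patch; struct helper_state, helper_disconnect(),
helper_write() and the simulate_failure parameter are hypothetical names used
only for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the I/O channel object. */
struct channel {
    int fd;
};

/* Hypothetical helper state; ioc == NULL means "not connected". */
struct helper_state {
    struct channel *ioc;
};

static bool helper_is_connected(const struct helper_state *s)
{
    return s->ioc != NULL;
}

static void helper_disconnect(struct helper_state *s)
{
    /* Real code would also close and release the channel here. */
    s->ioc = NULL;
}

/*
 * Returns 0 on success and -1 on failure. On a failed write the state is
 * marked as disconnected, so later calls fail cleanly instead of
 * dereferencing a stale channel pointer.
 */
static int helper_write(struct helper_state *s, bool simulate_failure)
{
    if (!helper_is_connected(s)) {
        return -1;
    }
    if (simulate_failure) {
        helper_disconnect(s);
        return -1;
    }
    return 0;
}
```

The point of the sketch is only that the "disconnected" state is recorded in
one place that every later access checks.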

pr-helper: fix assertion failure on failed multipath PERSISTENT RESERVE IN

The response size is expected to be zero if the SCSI status is not
"GOOD", but nothing was resetting it.

This can be reproduced simply by "sg_persist -s /dev/sdb" where /dev/sdb
in the guest is a scsi-block device corresponding to a multipath device
on the host.

Before:

  PR in (Read full status): Aborted command

and on the host:

  prh_write_response: Assertion `resp->sz == 0' failed.

After:

  PR in (Read full status): bad field in cdb or parameter list
  (perhaps unsupported service action)

Reported-by: Jiri Belka <[email protected]>
Reviewed-by: Michal Privoznik <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
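
The invariant from the pr-helper commit above (the response size is zero
whenever the SCSI status is not "GOOD") can be sketched as below. This is an
illustration under assumptions, not the patch itself: struct pr_response and
pr_response_finish() are hypothetical names; only the GOOD status code 0x00
comes from the SCSI specification.

```c
#include <stddef.h>
#include <stdint.h>

#define SCSI_STATUS_GOOD 0x00 /* SCSI "GOOD" status code */

/* Hypothetical response record; field names are illustrative. */
struct pr_response {
    uint8_t status; /* SCSI status byte */
    size_t sz;      /* number of payload bytes to send back */
};

/*
 * Record the final status and enforce the invariant: a response that
 * did not end in GOOD carries no payload.
 */
static void pr_response_finish(struct pr_response *resp, uint8_t status)
{
    resp->status = status;
    if (status != SCSI_STATUS_GOOD) {
        resp->sz = 0;
    }
}
```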

pr-helper: fix --socket-path default in help

Currently --help shows "(default '(null)')" for the -k/--socket-path
option. Fix it by getting the default path in /var/run.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

Deprecate the -enable-hax option

We currently have got three ways of turning on the HAX accelerator:
"-machine accel=hax", "-accel hax" and "-enable-hax". That's really
confusing and overloaded. Since "-accel" is our preferred way to enable
an accelerator nowadays, and "-accel hax" is even less to type than
"-enable-hax", let's deprecate the "-enable-hax" option now.

Note: While "-enable-kvm" has been available for a long time and can hardly be
removed since it is used in a lot of upper layer tools and scripts, the
"-enable-hax" option is still rather new and not very widespread yet, so
I think that it should be OK if we remove it again in a couple of releases
(we'll see whether someone complains after seeing the deprecation message;
then we could still reconsider keeping it if there are well-founded reasons).

Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <1529950933 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

osdep: work around Coverity parsing errors

Coverity does not like the new _Float* types that are used by
recent glibc, and croaks on every single file that includes
stdlib.h. Add dummy typedefs to please it.

Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

numa: report all DIMM/NVDIMMs as plugged memory

Right now, there is some inconsistency between hotplugged and
coldplugged memory. DIMMs added via "-device" result in different stats
than DIMMs added using "device_add".

E.g.
    [...]
    -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 \
    -m 4G,maxmem=20G,slots=2 \
    -object memory-backend-ram,id=mem0,size=8G \
    -device pc-dimm,id=dimm0,memdev=mem0 \
    -object memory-backend-ram,id=mem1,size=8G \
    -device nvdimm,id=dimm1,memdev=mem1,node=1

Results in NUMA info
    (qemu) info numa
    info numa
    2 nodes
    node 0 cpus: 0 1
    node 0 size: 10240 MB
    node 0 plugged: 0 MB
    node 1 cpus: 2 3
    node 1 size: 10240 MB
    node 1 plugged: 0 MB

But in memory size summary:
    (qemu) info memory_size_summary
    info memory_size_summary
    base memory: 4294967296
    plugged memory: 17179869184

Make this consistent by reporting all hot and coldplugged
memory a.k.a. DIMM and NVDIMM as "plugged".

Fixes: 31959e82fb0 ("hmp: extend "info numa" with hotplugged memory information")
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180622144045 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc-dimm: get_memory_region() will not fail after realize

Let's try to reduce error handling a bit. In the plug/unplug case, the
device was realized and therefore we can assume that getting access to
the memory region will not fail.

For get_vmstate_memory_region() this is already handled that way.
Document both cases.

Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nvdimm: make get_memory_region() perform checks and initialization

We might get a call to get_memory_region() before the device has been
realized. We should return a consistent value, as the return value
will e.g. later on be used in the pre_plug handler.

To avoid duplicating too much code, factor the initialization and checks
out into a helper function.

Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nvdimm: convert nvdimm_mr into a pointer

This way we can easily check if the region has already been initialized
without having to rely on the size of an uninitialized region being 0.

Free the region in nvdimm_finalize() and not in unrealize(), as following
patches will allow creating the region before realization.

Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nvdimm: convert "unarmed" into a static property

We don't allow modifying it after realization, so we can simply turn
it into a static property.

Reviewed-by: David Gibson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc-dimm: merge get_(vmstate_)memory_region()

Importantly, get_vmstate_memory_region() should also fail with a proper
error if called before the device is realized. For a PCDIMM, both functions
are to return the same thing, so share the implementation.

All current users are called after the device has been realized, so we
can expect the calls to succeed.

Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hostmem: drop error variable from host_memory_backend_get_memory()

Unused, so let's remove it.

Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

nvdimm: no need to overwrite get_vmstate_memory_region()

Our parent class (PC_DIMM) provides exactly the same function.

Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc: factor out pc specific dimm checks into pc_memory_pre_plug()

We can perform these checks before the device is actually realized.

Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc-dimm: remove pc_dimm_get_free_slot() from header

Not used outside of pc-dimm.c and there shouldn't be other users. If
other devices (e.g. memory devices) ever have to also use slots, then we
will have to factor this out.

Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc-dimm: rename pc_dimm_memory_* to pc_dimm_*

Let's rename it to make it look more consistent.

Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc: rename pc_dimm_(plug|unplug|...)* into pc_memory_(plug|unplug|...)*

Use a similar naming scheme as spapr. This way, we can go ahead and
rename e.g. pc_dimm_memory_plug to pc_dimm_plug, which avoids
confusion.

Reviewed-by: Igor Mammedov <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pc-dimm: remove leftover "struct pc_dimms_capacity"

Not needed anymore, let's drop it.

Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180619134141 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qemu-options: Add missing newline to -accel help text

The newline was removed by commit c97d6d2c, and broke -help output:

Before this patch:

  $ qemu-system-x86_64 -help | grep smp
                  thread=single|multi (enable multi-threaded TCG)-smp [...]

After this patch:

  $ qemu-system-x86_64 -help  | grep smp
  -smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]

Fixes: c97d6d2cdf97edb4aebe832fdba65d701ad7bcb6
Cc: Sergio Andres Gomez Del Real <[email protected]>
Signed-off-by: Eduardo Habkost <[email protected]>
Message-Id: <20180611195607 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Replace '-enable-kvm' with '-accel kvm' in docs and help texts

The preferred way to select the KVM accelerator is to use "-accel kvm"
these days, so let's be consistent in our documentation and help texts.

Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <1528866321 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

configure: enable debug-mutex if debug enabled

Reviewed-by: Emilio G. Cota <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180425025459 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

QemuMutex: support --enable-debug-mutex

We have had some tracing tools for mutexes, but they are not easy to use
for e.g. deadlocks. Let's provide a "--enable-debug-mutex" configure
parameter that allows QemuMutex to store the last owner that took a
specific lock. This makes it easy to debug deadlocks, since we can
directly see who took the lock as long as we can have
a debugger attached to the process.

Reviewed-by: Emilio G. Cota <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180425025459 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qemu-thread: introduce qemu-thread-common.h

Introduce some hooks for the part of qemu thread shared between the POSIX
and Windows implementations. Note that in qemu_mutex_unlock_impl() we
moved the call before the unlock operation, which should make more sense,
and we don't need a qemu_mutex_post_unlock() hook.

Put all these shared hooks into the header file. It should be internal
to qemu-thread and not visible to qemu-thread users, hence it is put into
the util/ directory.

Reviewed-by: Emilio G. Cota <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180425025459 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

tests/atomic_add-bench: add -m option to use mutexes

This allows us to use atomic-add-bench as a microbenchmark
for evaluating qemu_mutex_lock's performance.

Signed-off-by: Emilio G. Cota <[email protected]>
[cherry picked from https://github.com/cota/qemu/commit/f04f34df]
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <20180425025459 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: Delete the slot if and only if the KVM_MEM_READONLY flag is changed

According to KVM commit 75d61fbc, the slot needs to be deleted before
changing the KVM_MEM_READONLY flag. But QEMU commit 235e8982 only checks
whether the KVM_MEM_READONLY flag is set, not whether it changes. There is
no need to delete the slot if the KVM_MEM_READONLY flag is not changed.

This fixes an issue when migrating a VM during the OVMF startup stage while
the VM is executing code in ROM. Between deleting and re-adding the
slot in kvm_set_user_memory_region, there is a chance that the guest accesses
ROM and traps to KVM, and KVM then can't find the corresponding memslot.
KVM (on ARM) then injects an abort into the guest due to the broken hva, and
the guest gets stuck.

Signed-off-by: Shannon Zhao <[email protected]>
Message-Id: <1526462314 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
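
The "changed" versus "set" distinction in the kvm commit above can be sketched
with an XOR of the old and new flag words. This is a minimal illustration, not
the patch: slot_needs_delete() is a hypothetical helper, and the flag values
are local copies of the Linux KVM UAPI bits so that the sketch is
self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the KVM memslot flag bits from the Linux UAPI. */
#define SKETCH_KVM_MEM_LOG_DIRTY_PAGES (1u << 0)
#define SKETCH_KVM_MEM_READONLY        (1u << 1)

/*
 * Hypothetical helper: the slot only has to be deleted and re-added
 * when the read-only bit differs between the old and the new flags.
 * XOR leaves a bit set exactly where the two flag words disagree.
 */
static bool slot_needs_delete(uint32_t old_flags, uint32_t new_flags)
{
    return ((old_flags ^ new_flags) & SKETCH_KVM_MEM_READONLY) != 0;
}
```

A slot that stays read-only while dirty logging is toggled, which is the
migration case described above, does not trigger a delete under this rule.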

exec: check that alignment is a power of two

Right now we can crash QEMU using e.g.

qemu-system-x86_64 -m 256M,maxmem=20G,slots=2 \
-object memory-backend-file,id=mem0,size=12288,mem-path=/dev/zero,align=12288 \
-device pc-dimm,id=dimm1,memdev=mem0

qemu-system-x86_64: util/mmap-alloc.c:115:
qemu_ram_mmap: Assertion `is_power_of_2(align)' failed

Fix this by adding a proper check.

Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180607154705 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
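
The kind of check the exec commit above refers to can be sketched as follows.
This is a minimal illustration, not the code from the patch:
check_alignment() is a hypothetical helper, and treating an alignment of 0 as
"use the default" is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit is set. Zero is not a power of two. */
static bool is_power_of_2(uint64_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

/*
 * Hypothetical validation helper: returns 0 if align is acceptable and
 * -1 otherwise, so that the caller can report an error to the user
 * instead of hitting an assertion later on.
 * Assumption: align == 0 means "no explicit alignment requested".
 */
static int check_alignment(uint64_t align)
{
    if (align != 0 && !is_power_of_2(align)) {
        return -1;
    }
    return 0;
}
```

With the value from the reproducer above, check_alignment(12288) rejects the
request, because 12288 = 8192 + 4096 has two bits set.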

memory-device: turn alignment assert into check

The start of the address space indicates which maximum alignment is
supported by our machine (e.g. ppc, x86 1GB). This is helpful to
catch fragmenting guest physical memory in strange fashions.

Right now we can crash QEMU by e.g. (there might be easier examples)

qemu-system-x86_64 -m 256M,maxmem=20G,slots=2 \
-object memory-backend-file,id=mem0,size=8192M,mem-path=/dev/zero,align=8192M \
-device pc-dimm,id=dimm1,memdev=mem0

Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180607154705 [email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Igor Mammedov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

whpx: commit missing file

Not included by mistake in commit 327fccb288976f95808efa968082fc9d4a9ced84.

Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: Fix BLSR and BLSI

The implementation of these two instructions was swapped.
At the same time, unify the setup of eflags for the insn group.

Reported-by: Ricardo Ribalda Delgado <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-Id: <20170712192902 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
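
For reference, the two BMI1 operations named in the commit above compute the
following results. The sketch below shows only the architectural result
values on 64-bit operands; it is not the TCG code from the patch, it does not
model eflags, and the function names blsr() and blsi() are chosen for
illustration.

```c
#include <stdint.h>

/* BLSR: reset (clear) the lowest set bit of src. */
static uint64_t blsr(uint64_t src)
{
    return src & (src - 1);
}

/* BLSI: isolate the lowest set bit of src (all other bits cleared). */
static uint64_t blsi(uint64_t src)
{
    /* 0 - src is the two's complement negation, well defined for unsigned. */
    return src & (0 - src);
}
```

For example, with src = 0x16 (binary 10110), blsr() yields 0x14 (binary 10100)
and blsi() yields 0x02 (binary 00010), so swapping the two is easy to detect.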

hw/char/serial: Only retry if qemu_chr_fe_write returns 0

Only retry on serial_xmit if qemu_chr_fe_write returns 0, as this is the
only recoverable error.

Retrying in any other scenario, in addition to being a waste of CPU
cycles, can compromise guest stability if the vCPU issuing the
write and the main loop thread are, by chance or explicit pinning,
running on the same pCPU.

Previous discussion:

https://lists.nongnu.org/archive/html/qemu-devel/2018-05/msg06998.html

Signed-off-by: Sergio Lopez <[email protected]>
Message-Id: <1528185295 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
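
The retry rule from the serial commit above can be sketched as a small
predicate. This is an illustration under assumptions, not the patch:
should_retry_xmit() is a hypothetical name, and the tries_so_far/max_tries
cap is an assumed detail added so that the sketch cannot retry forever.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate following the rule stated in the commit
 * message: a write result of 0 is the only recoverable outcome.
 * Any other result (an error, or bytes actually written) means that
 * retrying the same write makes no sense.
 * Assumption: the caller also bounds the number of attempts.
 */
static bool should_retry_xmit(int write_ret, int tries_so_far, int max_tries)
{
    return write_ret == 0 && tries_so_far < max_tries;
}
```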

main-loop: document IOCanReadHandler

Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-Id: <20180602085259 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

chardev: don't splatter terminal settings on exit if not previously set

The stdio chardev finalize method calls term_exit() to restore the
original terminal settings that were saved in the "oldtty" global. If
the qemu_chr_open_stdio() method exited with an error, we might not have
any original terminal settings saved in "oldtty" yet.

e.g.

$ qemu-system-x86_64 -monitor stdio -daemonize
qemu-system-x86_64: -monitor stdio: cannot use stdio with -daemonize

will cause QEMU to splatter the terminal settings with an all-zeros
"struct termios", with predictably unpleasant results. Fortunately the
existing "stdio_in_use" flag is a suitable witness for whether "oldtty"
contains settings that need restoring.

Signed-off-by: Daniel P. Berrangé <[email protected]>
Message-Id: <20180604123043 [email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

move public invalidate APIs out of translate-all.{c,h}, clean up

Place them in exec.c, exec-all.h and ram_addr.h. This removes
knowledge of translate-all.h (which is an internal header) from
several files outside accel/tcg and removes knowledge of
AddressSpace from translate-all.c (as it only operates on ram_addr_t).

Signed-off-by: Paolo Bonzini <[email protected]>

exec: Fix MAP_RAM for cached access

When an IOMMUMemoryRegion is in front of a virtio device,
address_space_cache_init does not set cache->ptr as the memory
region is not RAM. However when the device performs an access,
we end up in glue() which performs the translation and then uses
MAP_RAM. The latter uses the unset ptr and returns a wrong value
which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
for instance.

In slow path cache->ptr is NULL and MAP_RAM must redirect to
qemu_map_ram_ptr((mr)->ram_block, ofs).

As MAP_RAM, IS_DIRECT and INVALIDATE are the same in _cached_slow
and non cached mode, let's remove those macros.

This fixes the use cases featuring vIOMMU (Intel and ARM SMMU)
which lead to a SIGSEGV.

Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
Signed-off-by: Eric Auger <[email protected]>
Message-Id: <1528895946 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20180627' into staging

migration/next for 20180627

# gpg: Signature made Wed 27 Jun 2018 13:53:53 BST
# gpg:                using RSA key F487EF185872D723
# gpg: Good signature from "Juan Quintela <[email protected]>"
# gpg:                 aka "Juan Quintela <[email protected]>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20180627:
  migration: fix crash in when incoming client channel setup fails
  postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
  migration: Stop sending whole pages through main channel
  migration: Remove not needed semaphore and quit
  migration: Wait for blocking IO
  migration: Start sending messages
  migration: Create ram_save_multifd_page
  migration: Create multifd_bytes ram_counter
  migration: Synchronize multifd threads with main thread
  migration: Add block where to send/receive packets
  migration: Multifd channels always wait on the sem
  migration: Add multifd traces for start/end thread
  migration: Abstract the number of bytes sent
  migration: Calculate mbps only during transfer time
  migration: Create multifd packet
  migration: Create multipage support

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-2018-06-27' into staging

MIPS queue

# gpg: Signature made Wed 27 Jun 2018 19:16:23 BST
# gpg:                using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01  DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-2018-06-27:
  target/mips: Fix gdbstub to read/write 64 bit FP registers
  target/mips: Fix data type for offset
  target/mips: Update gen_flt_ldst()
  target/mips: Fix microMIPS on reset
  target/mips: Raise a RI when given fs is n/a from CTC1
  hw/pci-host/xilinx-pcie: don't make "io" region be RAM
  hw/mips/mips_malta: don't make bios region 'nomigrate'
  hw/mips/boston: don't make flash region 'nomigrate'
  MAINTAINERS: update target-mips maintainers

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Pull request

* Trace TCG atomic memory accesses
* Document that trace event arguments cannot be floating point

# gpg: Signature made Wed 27 Jun 2018 13:57:40 BST
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg:                 aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: forbid floating point types
  trace: enable tracing of TCG atomics
  trace: add trace_mem_build_info_no_se_be/le
  trace: expand mem_info:size_shift to 3 bits
  trace: simplify trace_mem functions
  trace: fix misreporting of TCG access sizes for user-space

Signed-off-by: Peter Maydell <[email protected]>

target/mips: Fix gdbstub to read/write 64 bit FP registers

Fix gdbstub to read/write 64 bit FP registers

Signed-off-by: Yongbok Kim <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

target/mips: Fix data type for offset

The offset can be larger than 16 bits for nanoMIPS,
and the immediate field can be larger than 16 bits as well.

Signed-off-by: Yongbok Kim <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

target/mips: Update gen_flt_ldst()

Update gen_flt_ldst() in order to reuse the functions for nanoMIPS

Signed-off-by: Yongbok Kim <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

target/mips: Fix microMIPS on reset

Fix to activate microMIPS on reset when Config3.ISA == {1, 3}

Signed-off-by: Yongbok Kim <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

target/mips: Raise a RI when given fs is n/a from CTC1

Fix to raise a Reserved Instruction exception when given fs is not
available from CTC1.

Signed-off-by: Yongbok Kim <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

hw/pci-host/xilinx-pcie: don't make "io" region be RAM

Currently we use memory_region_init_rom_nomigrate() to create
the "io" memory region to pass to pci_register_root_bus().
This is a dummy region, because this PCI controller doesn't
support accesses to PCI IO space.

There is no reason for the dummy region to be a RAM region;
it is only used as a place where PCI BARs can be mapped,
and if you could get a PCI card to do a bus master access
to the IO space it should not get acts-like-RAM behaviour.
Use a simple container memory region instead. (We do have
one PCI card model which can do bus master accesses to IO
space -- the LSI53C895A SCSI adaptor.)

This avoids the oddity of having a memory region which is
RAM but where the RAM is not migrated.

Note that the size of the region we use here has no
effect on behaviour.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

hw/mips/mips_malta: don't make bios region 'nomigrate'

Currently we use memory_region_init_rom_nomigrate() to create
the "bios.1fc" memory region, and we don't manually register
it with vmstate_register_ram(). This currently means that its
contents are migrated but as a ram block whose name is the empty
string; in future it may mean they are not migrated at all. Use
memory_region_init_ram() instead.

Note that this is a cross-version migration compatibility break
for the "malta" machine.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Paul Burton <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

hw/mips/boston: don't make flash region 'nomigrate'

Currently we use memory_region_init_rom_nomigrate() to create
the "boston.flash" memory region, and we don't manually register
it with vmstate_register_ram(). This currently means that its
contents are migrated but as a ram block whose name is the empty
string; in future it may mean they are not migrated at all. Use
memory_region_init_ram() instead.

Note that this is a cross-version migration compatibility break
for the "boston" machine.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Paul Burton <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>

MAINTAINERS: update target-mips maintainers

Yongbok Kim transfers the duties of QEMU target MIPS maintainer to
me as he leaves MIPS. Many thanks to Yongbok for his substantial
contributions to QEMU for MIPS over many years and for taking care of
its maintenance for almost two years.

Signed-off-by: Aleksandar Markovic <[email protected]>
Acked-by: Yongbok Kim <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>

migration: fix crash when incoming client channel setup fails

The way we determine if we can start the incoming migration was
changed to use migration_has_all_channels() in:

  commit 428d89084c709e568f9cd301c2f6416a54c53d6d
  Author: Juan Quintela <[email protected]>
  Date:   Mon Jul 24 13:06:25 2017 +0200

    migration: Create migration_has_all_channels

This method in turn calls multifd_recv_all_channels_created()
which is hardcoded to always return 'true' when multifd is
not in use. This is a latent bug...

...activated in a following commit where that return result
ends up acting as the flag to indicate whether it is possible
to start processing the migration:

  commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
  Author: Juan Quintela <[email protected]>
  Date:   Wed Mar 7 08:40:52 2018 +0100

    migration: Delay start of migration main routines

This means that if channel initialization fails with normal
migration, QEMU never notices, attempts to start the incoming
migration regardless, and crashes on a NULL pointer.

This can be seen, for example, if a client connects to a server
requiring TLS, but has an invalid x509 certificate:

qemu-system-x86_64: The certificate hasn't got a known issuer
qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.

#0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
#1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
#2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
#4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
#5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
#6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
#7  0x0000000000000000 in  ()

To handle the non-multifd case, we check whether mis->from_src_file
is non-NULL. With this in place, the migration server drops the
rejected client and stays around waiting for another, hopefully
valid, client to arrive.
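
The check described above can be modelled roughly as follows. This is
a Python sketch, not the C code in migration/migration.c: the
IncomingState class and its fields other than from_src_file are
illustrative assumptions, while the two function names are taken from
this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncomingState:
    """Hypothetical stand-in for the incoming migration state."""
    from_src_file: Optional[object] = None  # main channel, None until set up
    use_multifd: bool = False
    multifd_created: int = 0
    multifd_expected: int = 0


def multifd_recv_all_channels_created(mis: IncomingState) -> bool:
    # Behaviour described above: always True when multifd is not in use.
    if not mis.use_multifd:
        return True
    return mis.multifd_created == mis.multifd_expected


def migration_has_all_channels(mis: IncomingState) -> bool:
    # The extra from_src_file test covers the non-multifd case, where a
    # failed channel setup would otherwise go unnoticed.
    return (multifd_recv_all_channels_created(mis)
            and mis.from_src_file is not None)
```

With the extra test, a client whose channel setup failed leaves
from_src_file unset, so the incoming migration is not started.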

Signed-off-by: Daniel P. Berrangé <[email protected]>
Message-Id: <20180619163552 [email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>

postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()

Not needed. Don't expose last_ram_page().

Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20180620202736 [email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>

migration: Stop sending whole pages through main channel

We have to flush() the QEMUFile because we now send very little data
through that channel.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Remove not needed semaphore and quit

We now quit with shutdown in the QIO.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
--
Add comment
Use shutdown() instead of unref()

migration: Wait for blocking IO

We have three conditions here:
- channel fails -> error
- we have to quit: we close the channel and the read fails
- normal read that succeeds: we are in business

So forget the complications of waiting in a semaphore.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Start sending messages

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Create ram_save_multifd_page

The function still doesn't use multifd, but we have simplified
ram_save_page: the xbzrle and RDMA stuff is gone. We have added a new
counter.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
--
Add last_page parameter
Add comments for done and address
Remove multifd field, it is the same as normal pages
Merge next patch, now we send multiple pages at a time
Remove counter for multifd pages, it is identical to normal pages
Use iovec's instead of creating the equivalent.
Clear memory used by pages (dave)
Use g_new0(danp)
define MULTIFD_CONTINUE
now pages member is a pointer
Fix off-by-one in number of pages in one packet
Remove RAM_SAVE_FLAG_MULTIFD_PAGE
s/multifd_pages_t/MultiFDPages_t/
add comment explaining what it means

migration: Create multifd_bytes ram_counter

This will count how many bytes are sent through multifd.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Synchronize multifd threads with main thread

We synchronize all threads at each RAM_SAVE_FLAG_EOS. Bitmap
synchronizations don't happen inside a ram section, so we are safe
about two channels trying to overwrite the same memory.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
--
seq needs to be atomic now, will also be accessed from main thread.
Fix the if (true || ...) leftover
We are back to non-atomics

migration: Add block where to send/receive packets

While there, add tracepoints.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Multifd channels always wait on the sem

Either for quit, sync or packet, we first wake them.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Add multifd traces for start/end thread

We want to know how many pages/packets each channel has sent. Add
counters for those.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
--
sort trace-events (dave)

migration: Abstract the number of bytes sent

Right now we use the "position" inside the QEMUFile, but things like
RDMA already do weird things to be able to maintain that counter
right, and multifd will have some similar problems.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Calculate mbps only during transfer time

We used to include in this calculation the setup time, but that can be
quite big in rdma or multifd.
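
A small worked example of the effect, as a Python sketch. The byte
count and the two durations are hypothetical values chosen for easy
arithmetic, and mbps() is an illustrative helper, not the QEMU
function.

```python
def mbps(bytes_sent: int, elapsed_ms: float) -> float:
    """Throughput in megabits per second over the given interval."""
    if elapsed_ms <= 0:
        return 0.0
    bits = bytes_sent * 8
    return bits / (elapsed_ms / 1000.0) / 1_000_000


# Hypothetical numbers: 1 GB moved in 8 s of transfer after 2 s of setup.
BYTES_SENT = 1_000_000_000
SETUP_MS = 2_000.0
TRANSFER_MS = 8_000.0

with_setup = mbps(BYTES_SENT, SETUP_MS + TRANSFER_MS)  # setup dilutes the rate
transfer_only = mbps(BYTES_SENT, TRANSFER_MS)          # rate while data flowed
```

The longer the setup phase, the further the first figure falls below
the rate actually achieved while data was moving.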

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: Create multifd packet

We still don't put anything there.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
--
fix magic (dave)
check offset/ramblock (dave)
s/seq/packet_num/ and make it 64bit

migration: Create multipage support

We only create/destroy the page list here. We will use it later.

Signed-off-by: Juan Quintela <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

trace: forbid floating point types

Only one existing trace event uses a floating point type. Unfortunately
float and double cannot be supported since SystemTap does not have
floating point types.

Remove float and double from the whitelist and document this limitation.
Update the migrate_transferred trace event to use uint64_t instead of
double.

Cc: Dr. David Alan Gilbert <[email protected]>
Cc: Daniel P. Berrangé <[email protected]>
Cc: Peter Maydell <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Message-id: 20180621150254 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

trace: enable tracing of TCG atomics

We do not trace guest atomic accesses. Fix it.

Tested with a modified atomic_add-bench so that it executes
a deterministic number of instructions, i.e. fixed seeding,
no threading and fixed number of loop iterations instead
of running for a certain time.

Before:
- With parallel_cpus = false (no clone syscall so it is never set to true):
  220070 memory accesses
- With parallel_cpus = true (hard-coded):
  212105 memory accesses <-- we're not tracing the atomics!

After:
  220070 memory accesses regardless of parallel_cpus.

Signed-off-by: Emilio G. Cota <[email protected]>
Message-id: 1527028012 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

trace: add trace_mem_build_info_no_se_be/le

These will be used by the following commit.

Signed-off-by: Emilio G. Cota <[email protected]>
Message-id: 1527028012 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

trace: expand mem_info:size_shift to 3 bits

This will allow us to trace 16B-long memory accesses.
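
To see why the extra bit is needed, here is a Python sketch that
assumes the field stores log2 of the access size in bytes, with the
field being 2 bits wide before this change. largest_access() is an
illustrative helper, not part of the trace code.

```python
def largest_access(bits: int) -> int:
    """Largest access size in bytes encodable by a size_shift of `bits` bits."""
    max_shift = (1 << bits) - 1   # largest value the field can hold
    return 1 << max_shift         # size is 2 ** size_shift


before = largest_access(2)  # shifts 0..3 cover 1, 2, 4 and 8 bytes
after = largest_access(3)   # shifts 0..7 include shift 4, i.e. 16 bytes
```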

Signed-off-by: Emilio G. Cota <[email protected]>
Message-id: 1527028012 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

trace: simplify trace_mem functions

Add some defines for the mem_info bits, simplify
trace_mem_build_info, and also simplify trace_mem_get_info
by making it a wrapper around trace_mem_build_info.

This paves the way for increasing size_shift by one bit.

Signed-off-by: Emilio G. Cota <[email protected]>
Message-id: 1527028012 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

trace: fix misreporting of TCG access sizes for user-space

trace_mem_build_info expects a size_shift for its first argument. Fix it.
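
As an illustration of the misreporting, here is a Python sketch that
assumes the caller passed the access size in bytes where log2 of the
size was expected. Both helpers are hypothetical models of the
encoding, not the QEMU functions.

```python
def size_to_shift(size_bytes: int) -> int:
    """Convert a power-of-two access size in bytes to its log2."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two")
    return size_bytes.bit_length() - 1


def decoded_size(size_shift: int) -> int:
    """Size a trace consumer would report for a given size_shift value."""
    return 1 << size_shift


correct = decoded_size(size_to_shift(4))  # 4-byte access, shift 2
wrong = decoded_size(4)                   # byte count used as the shift
```

Under that assumption a 4-byte access would be reported as 16 bytes.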

Signed-off-by: Emilio G. Cota <[email protected]>
Message-id: 1527028012 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180626' into staging

target-arm queue:
* aspeed: set APB clocks correctly (fixes slowdown on palmetto)
* smmuv3: cache config data and TLB entries
* v7m/v8m: support read/write from MPU regions smaller than 1K
* various: clean up logging/debug messages
* xilinx_spips: Make dma transactions as per dma_burst_size

# gpg: Signature made Tue 26 Jun 2018 17:55:46 BST
# gpg:                using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
# gpg:                 aka "Peter Maydell <[email protected]>"
# gpg:                 aka "Peter Maydell <[email protected]>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180626: (32 commits)
  aspeed/timer: use the APB frequency from the SCU
  aspeed: initialize the SCU controller first
  aspeed/scu: introduce clock frequencies
  hw/arm/smmuv3: Add notifications on invalidation
  hw/arm/smmuv3: IOTLB emulation
  hw/arm/smmuv3: Cache/invalidate config data
  hw/arm/smmuv3: Fix translate error handling
  target/arm: Handle small regions in get_phys_addr_pmsav8()
  target/arm: Set page (region) size in get_phys_addr_pmsav7()
  tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE
  hw/arm/stellaris: Use HWADDR_PRIx to display register address
  hw/arm/stellaris: Fix gptm_write() error message
  hw/net/smc91c111: Use qemu_log_mask(UNIMP) instead of fprintf
  hw/net/smc91c111: Use qemu_log_mask(GUEST_ERROR) instead of hw_error
  hw/net/stellaris_enet: Use qemu_log_mask(GUEST_ERROR) instead of hw_error
  hw/net/stellaris_enet: Fix a typo
  hw/arm/stellaris: Use qemu_log_mask(UNIMP) instead of fprintf
  hw/arm/omap: Use qemu_log_mask(GUEST_ERROR) instead of fprintf
  hw/arm/omap1: Use qemu_log_mask(GUEST_ERROR) instead of fprintf
  hw/i2c/omap_i2c: Use qemu_log_mask(UNIMP) instead of fprintf
  ...

Signed-off-by: Peter Maydell <[email protected]>

aspeed/timer: use the APB frequency from the SCU

The timer controller can be driven by either an external 1MHz clock or
by the APB clock. Today, the model makes the assumption that the APB
frequency is always set to 24MHz but this is incorrect.

The AST2400 SoC on the palmetto machines uses a 48MHz input clock
source and the APB can be set to 48MHz. The consequence is a general
system slowdown. The QEMU machines using the AST2500 SoC do not seem
impacted today because the APB frequency is still set to 24MHz.

We fix the timer frequency for all SoCs by linking the Timer model to
the SCU model. The APB frequency driving the timers is now the one
configured for the SoC.
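
The slowdown can be illustrated with a short Python sketch. The reload
value is a hypothetical one a guest might program for a 10 ms period
at 48MHz, and ticks_to_ns() is an illustrative helper, not the model's
code.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def ticks_to_ns(ticks: int, freq_hz: int) -> int:
    """Time a down-counter needs to count `ticks` at `freq_hz`."""
    return ticks * NANOSECONDS_PER_SECOND // freq_hz


ASSUMED_APB_HZ = 24_000_000  # what the model used to assume
ACTUAL_APB_HZ = 48_000_000   # APB setting possible on the AST2400

reload_value = 480_000       # hypothetical guest-programmed reload

expected_ns = ticks_to_ns(reload_value, ACTUAL_APB_HZ)  # period the guest wants
modelled_ns = ticks_to_ns(reload_value, ASSUMED_APB_HZ) # period the old model gave
```

With the frequency assumed at half its real value, every timer period
takes twice as long as the guest expects.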

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Reviewed-by: Andrew Jeffery <[email protected]>
Message-id: 20180622075700 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

aspeed: initialize the SCU controller first

The System Control Unit should be initialized first as it drives all
the configuration of the SoC and other device models.

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Acked-by: Andrew Jeffery <[email protected]>
Message-id: 20180622075700 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

aspeed/scu: introduce clock frequencies

All Aspeed SoC clocks are driven by an input source clock which can
have different frequencies: 24MHz or 25MHz, and also, on the Aspeed
AST2400 SoC, 48MHz. The H-PLL (CPU) clock is defined from a
calculation using parameters in the H-PLL Parameter register or from a
predefined set of frequencies if the setting is strapped by hardware
(Aspeed AST2400 SoC). The other clocks of the SoC are then defined
from the H-PLL using dividers.

We introduce first the APB clock because it should be used to drive
the Aspeed timer model.

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Andrew Jeffery <[email protected]>
Message-id: 20180622075700 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/smmuv3: Add notifications on invalidation

On TLB invalidation commands, let's call registered
IOMMU notifiers. Those can only be UNMAP notifiers.
SMMUv3 does not support notification on MAP (VFIO).

This patch enables the vhost use case, where the IOTLB API is notified
on each guest IOTLB invalidation.

Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 1529653501 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/smmuv3: IOTLB emulation

We emulate a TLB cache of size SMMU_IOTLB_MAX_SIZE=256.
It is implemented as a hash table whose key is a combination
of the 16b asid and 48b IOVA (Jenkins hash).

Entries are invalidated on TLB invalidation commands, either
globally, or per asid, or per asid/iova.
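
A minimal Python model of such a cache follows. It is a sketch under
assumptions: Python's built-in tuple hashing stands in for the Jenkins
hash, the IOVA is used as given, and flushing the whole cache when it
is full is an assumed policy that this message does not state.

```python
from typing import Dict, Optional, Tuple

SMMU_IOTLB_MAX_SIZE = 256  # value quoted in this message

Key = Tuple[int, int]      # (asid, iova)


class IOTLB:
    def __init__(self) -> None:
        self.entries: Dict[Key, int] = {}

    @staticmethod
    def key(asid: int, iova: int) -> Key:
        # Mask to the widths named above: 16-bit ASID, 48-bit IOVA.
        return (asid & 0xFFFF, iova & ((1 << 48) - 1))

    def lookup(self, asid: int, iova: int) -> Optional[int]:
        return self.entries.get(self.key(asid, iova))

    def insert(self, asid: int, iova: int, translated: int) -> None:
        # Assumed policy: drop everything once the cache is full.
        if len(self.entries) >= SMMU_IOTLB_MAX_SIZE:
            self.inv_all()
        self.entries[self.key(asid, iova)] = translated

    def inv_all(self) -> None:
        self.entries.clear()

    def inv_asid(self, asid: int) -> None:
        asid &= 0xFFFF
        self.entries = {k: v for k, v in self.entries.items() if k[0] != asid}

    def inv_iova(self, asid: int, iova: int) -> None:
        self.entries.pop(self.key(asid, iova), None)
```

The three inv_* methods correspond to the global, per-asid and
per-asid/iova invalidation granularities.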

Signed-off-by: Eric Auger <[email protected]>
Message-id: 1529653501 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/smmuv3: Cache/invalidate config data

Let's cache config data to avoid fetching and parsing STE/CD
structures on each translation. We invalidate them on data structure
invalidation commands.

We put in place a per-smmu mutex to protect the config cache. This
will be useful too to protect the IOTLB cache. The caches can be
accessed without the BQL, i.e. in the IO dataplane. The same kind of mutex was
put in place in the intel viommu.
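
The idea can be sketched in Python as below. ConfigCache, the
stream-ID key and the decode callback are hypothetical names used for
illustration; the callback stands in for fetching and parsing the
STE/CD structures.

```python
import threading
from typing import Callable, Dict


class ConfigCache:
    def __init__(self, decode: Callable[[int], dict]) -> None:
        self._decode = decode          # stands in for STE/CD fetch and parse
        self._lock = threading.Lock()  # per-SMMU mutex
        self._configs: Dict[int, dict] = {}

    def get(self, sid: int) -> dict:
        # Safe to call from any thread: the lock, not the BQL, guards the cache.
        with self._lock:
            cfg = self._configs.get(sid)
            if cfg is None:
                cfg = self._decode(sid)
                self._configs[sid] = cfg
            return cfg

    def invalidate(self, sid: int) -> None:
        # Called on a data structure invalidation command.
        with self._lock:
            self._configs.pop(sid, None)
```

After the first get() for a stream, later translations reuse the
cached result until an invalidation removes it.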

Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 1529653501 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/smmuv3: Fix translate error handling

In case the STE's config is "Bypass" we currently don't set the
IOMMUTLBEntry perm flags and the access does not succeed. Also
if the config is 0b0xx (Aborted/Reserved), decode_ste and
smmuv3_decode_config currently return -EINVAL and we don't enter
the expected code path: we record an event whereas we should not.

This patch fixes those bugs and simplifies the error handling.
decode_ste and smmuv3_decode_config now return 0 if aborted or
bypassed config was found. Only bad config info produces negative
error values. In smmuv3_translate we more clearly differentiate
errors, bypass/smmu disabled, aborted and success cases. Also
trace points are differentiated.
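
The resulting convention can be sketched in Python as follows. The
Config and Result types are illustrative assumptions, and recording an
event only in the error case is inferred from the description above
rather than copied from the code.

```python
import errno
from enum import Enum
from typing import NamedTuple


class Config(Enum):
    ABORT = "abort"
    BYPASS = "bypass"
    TRANSLATE = "translate"
    BAD = "bad"  # malformed config info


class Result(NamedTuple):
    status: str           # "error", "abort", "bypass" or "success"
    access_allowed: bool
    event_recorded: bool


def decode_config(cfg: Config) -> int:
    # Only bad config info yields a negative value; abort and bypass give 0.
    return -errno.EINVAL if cfg is Config.BAD else 0


def translate(cfg: Config, smmu_enabled: bool = True) -> Result:
    if not smmu_enabled:
        return Result("bypass", True, False)
    if decode_config(cfg) < 0:
        return Result("error", False, True)
    if cfg is Config.ABORT:
        return Result("abort", False, False)  # no event for an aborted config
    if cfg is Config.BYPASS:
        return Result("bypass", True, False)  # permissions set, access succeeds
    return Result("success", True, False)
```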

Fixes: 9bde7f0674fe ("hw/arm/smmuv3: Implement translate callback")
Reported-by: [email protected]
Signed-off-by: [email protected]
Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 1529653501 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Handle small regions in get_phys_addr_pmsav8()

Allow ARMv8M to handle small MPU and SAU region sizes, by making
get_phys_addr_pmsav8() set the page size to 1 if the MPU or
SAU region covers less than a TARGET_PAGE_SIZE.

We choose to use a size of 1 because it makes no difference to
the core code, and avoids having to track both the base and
limit for SAU and MPU and then convert into an artificially
restricted "page size" that the core code will then ignore.

Since the core TCG code can't handle execution from small
MPU regions, we strip the exec permission from them so that
any execution attempts will cause an MPU exception, rather
than allowing it to end up with a cpu_abort() in
get_page_addr_code().

(The previous code's intention was to make any small page be
treated as having no permissions, but unfortunately errors
in the implementation meant that it didn't behave that way.
It's possible that some binaries using small regions were
accidentally working with our old behaviour and won't now.)

We also retain an existing bug, where we ignored the possibility
that the SAU region might not cover the entire page, in the
case of executable regions. This is necessary because some
currently-working guest code images rely on being able to
execute from addresses which are covered by a page-sized
MPU region but a smaller SAU region. We can remove this
workaround if we ever support execution from small regions.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 20180620130619 [email protected]

target/arm: Set page (region) size in get_phys_addr_pmsav7()

We want to handle small MPU region sizes for ARMv7M. To do this,
make get_phys_addr_pmsav7() set the page size to the region
size if it is less than TARGET_PAGE_SIZE, rather than working
only in TARGET_PAGE_SIZE chunks.

Since the core TCG code can't handle execution from small
MPU regions, we strip the exec permission from them so that
any execution attempts will cause an MPU exception, rather
than allowing it to end up with a cpu_abort() in
get_page_addr_code().
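
The two rules above (report the region size as the page size, and
strip exec permission for small regions) can be modelled in Python
like this. The 1K page size and the PROT_* bit values are assumptions
for the sketch, and finalize_region() is a hypothetical helper, not
part of get_phys_addr_pmsav7().

```python
TARGET_PAGE_SIZE = 1024  # assumed 1K page for this sketch

PROT_READ = 1
PROT_WRITE = 2
PROT_EXEC = 4


def finalize_region(region_size: int, prot: int) -> tuple:
    """Return (page_size, prot) for a matched MPU region."""
    if region_size < TARGET_PAGE_SIZE:
        # Small region: report its real size and forbid execution, so an
        # execution attempt raises an MPU exception instead of aborting.
        return region_size, prot & ~PROT_EXEC
    return TARGET_PAGE_SIZE, prot
```

For example, a 32-byte region with read, write and exec permissions
comes back as a 32-byte page with read and write only.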

(The previous code's intention was to make any small page be
treated as having no permissions, but unfortunately errors
in the implementation meant that it didn't behave that way.
It's possible that some binaries using small regions were
accidentally working with our old behaviour and won't now.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 20180620130619 [email protected]

tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE

Add support for MMU protection regions that are smaller than
TARGET_PAGE_SIZE. We do this by marking the TLB entry for those
pages with a flag TLB_RECHECK. This flag causes us to always
take the slow-path for accesses. In the slow path we can then
special case them to always call tlb_fill() again, so we have
the correct information for the exact address being accessed.

This change allows us to handle reading and writing from small
regions; we cannot deal with execution from the small region.
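
A toy Python model of the mechanism is given below. It is a sketch
only: the page size, the bit chosen for TLB_RECHECK, and the SoftTLB
class with its region list are assumptions made for illustration and
do not reflect the layout of the real softmmu TLB.

```python
TARGET_PAGE_SIZE = 1024  # assumed for this sketch
TLB_RECHECK = 1 << 0     # hypothetical bit position


class Fault(Exception):
    pass


class SoftTLB:
    def __init__(self, regions):
        self.regions = regions  # list of (start, end_exclusive) accessible ranges
        self.entries = {}       # page base -> flags
        self.fills = 0

    def tlb_fill(self, addr: int) -> None:
        """Check the exact address and cache an entry for its page."""
        self.fills += 1
        for start, end in self.regions:
            if start <= addr < end:
                page = addr & ~(TARGET_PAGE_SIZE - 1)
                small = (end - start) < TARGET_PAGE_SIZE
                self.entries[page] = TLB_RECHECK if small else 0
                return
        raise Fault(hex(addr))

    def access(self, addr: int) -> None:
        page = addr & ~(TARGET_PAGE_SIZE - 1)
        flags = self.entries.get(page)
        # Miss, or entry marked TLB_RECHECK: take the slow path and
        # re-run tlb_fill() for this exact address.
        if flags is None or flags & TLB_RECHECK:
            self.tlb_fill(addr)
```

Accesses to a page backed by a full-size region hit the cached entry
after the first fill, while every access to a page containing a small
region is rechecked, so an address just outside the region faults even
though its page has a TLB entry.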

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 20180620130619 [email protected]

hw/arm/stellaris: Use HWADDR_PRIx to display register address

Suggested-by: Thomas Huth <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20180624040609 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hw/arm/stellaris: Fix gptm_write() error message

Missed in df3692e04b2.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20180624040609 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>