Git Repo - qemu.git/log

linux-user: log page table changes under -d page

The CPU_LOG_PAGE flag is woefully underused and could stand to do
extra duty tracking page changes. If the user doesn't want to see the
details as things change they still have the tracepoints available.

We push the locking into log_page_dump and pass a reason for the
banner text.

Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20191205122518 [email protected]>

linux-user: add target_mmap_complete tracepoint

For full details we also want to see where the mmaps end up.

Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20191205122518 [email protected]>

linux-user: convert target_mmap debug to tracepoint

It is a pain to re-compile when you need to debug and tracepoints are
a fairly low impact way to instrument QEMU.

Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20191205122518 [email protected]>

linux-user: convert target_mprotect debug to tracepoint

It is a pain to re-compile when you need to debug and tracepoints are
a fairly low impact way to instrument QEMU.

Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20191205122518 [email protected]>

travis.yml: Remove the redundant clang-with-MAIN_SOFTMMU_TARGETS entry

We test clang with the MAIN_SOFTMMU_TARGETS twice, once without
sanitizers and once with sanitizers enabled. That's somewhat redundant
since if compilation and tests succeeded with sanitizers enabled, it
should also work fine without sanitizers. Thus remove the clang entry
without sanitizers to speed up the CI testing a little bit.

Signed-off-by: Thomas Huth <[email protected]>
Reviewed-by: Wainer dos Santos Moschetta <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191119092147 [email protected]>

docker: gtester is no longer used

We are using tap-driver.pl, do not require anymore gtester to be installed
to run the testsuite in docker-based tests.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <1576632611 [email protected]>

Added tests for close and change of logfile.

One test ensures that the logfile handle is still valid even if
the logfile is changed during logging.
The other test validates that the logfile handle remains valid under
the logfile lock even if the logfile is closed.

Signed-off-by: Robert Foley <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191118211528 [email protected]>

Add use of RCU for qemu_logfile.

This now allows changing the logfile while logging is active,
and also solves the issue of a seg fault while changing the logfile.

Any read access to the qemu_logfile handle will use
the rcu_read_lock()/unlock() around the use of the handle.
To fetch the handle we will use atomic_rcu_read().
We also in many cases do a check for validity of the
logfile handle before using it to deal with the case where the
file is closed and set to NULL.

The cases where we write to the qemu_logfile will use atomic_rcu_set().
Writers will also use call_rcu() with a newly added qemu_logfile_free
function for freeing/closing when readers have finished.

Signed-off-by: Robert Foley <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191118211528 [email protected]>

qemu_log_lock/unlock now preserves the qemu_logfile handle.

qemu_log_lock() now returns a handle and qemu_log_unlock() receives a
handle to unlock. This allows for changing the handle during logging
and ensures the lock() and unlock() are for the same file.

Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock()
calls (and the log("\n")), since the translator can longjmp out of the
loop if it attempts to translate an instruction in an inaccessible page.

Signed-off-by: Robert Foley <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191118211528 [email protected]>

Add a mutex to guarantee single writer to qemu_logfile handle.

Also added qemu_logfile_init() for initializing the logfile mutex.

Note that inside qemu_set_log() we needed to add a pair of
qemu_mutex_unlock() calls in order to avoid a double lock in
qemu_log_close(). This unavoidable temporary ugliness will be
cleaned up in a later patch in this series.

Signed-off-by: Robert Foley <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191118211528 [email protected]>

Cleaned up flow of code in qemu_set_log(), to simplify and clarify.

Also added some explanation of the reasoning behind the branches.

Signed-off-by: Robert Foley <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191118211528 [email protected]>

Fix double free issue in qemu_set_log_filename().

After freeing the logfilename, we set logfilename to NULL, in case of an
error which returns without setting logfilename.

Signed-off-by: Robert Foley <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191118211528 [email protected]>

ci: build out-of-tree

Most developers are using out-of-tree builds and it was discussed in the past
to only allow those. To prepare for the transition, use out-of-tree builds
in all continuous integration jobs.

Based on a patch by Marc-André Lureau.

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Li-Wen Hsu <[email protected]>
Message-Id: <1576074829 [email protected]>

travis.yml: Enable builds on arm64, ppc64le and s390x

Travis recently added the possibility to test on these architectures,
too, so let's enable them in our travis.yml file to extend our test
coverage.

Unfortunately, the libssh in this Ubuntu version (bionic) is in a pretty
unusable Frankenstein state and libspice-server-dev is not available here,
so we can not use the global list of packages to install, but have to
provide individual package lists instead.

Also, some of the iotests crash when using "dist: bionic" on arm64
and ppc64le, thus these two builders have to use "dist: xenial" until
the problem is understood / fixed.

Signed-off-by: Thomas Huth <[email protected]>
Acked-by: David Gibson <[email protected]>
Message-Id: <20191204154618 [email protected]>
Signed-off-by: Alex Bennée <[email protected]>

tests/test-util-filemonitor: Skip test on non-x86 Travis containers

test-util-filemonitor fails in restricted non-x86 Travis containers
since they apparently blacklisted some required system calls there.
Let's simply skip the test if we detect such an environment.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Cleber Rosa <[email protected]>
Tested-by: Cleber Rosa <[email protected]>
Message-Id: <20191204154618 [email protected]>

tests/hd-geo-test: Skip test when images can not be created

In certain environments like restricted containers, we can not create
huge test images. To be able to use "make check" in such container
environments, too, let's skip the hd-geo-test instead of failing when
the test images could not be created.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Message-Id: <20191204154618 [email protected]>

iotests: Skip test 079 if it is not possible to create large files

Test 079 fails in the arm64, s390x and ppc64le LXD containers on Travis
(which we will hopefully enable in our CI soon). These containers
apparently do not allow large files to be created. Test 079 tries to
create a 4G sparse file, which is apparently already too big for these
containers, so check first whether we can really create such files before
executing the test.

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20191204154618 [email protected]>

iotests: Skip test 060 if it is not possible to create large files

Test 060 fails in the arm64, s390x and ppc64le LXD containers on Travis
(which we will hopefully enable in our CI soon). These containers
apparently do not allow large files to be created. The repair process
in test 060 creates a file of 64 GiB, so test first whether such large
files are possible and skip the test if that's not the case.

Signed-off-by: Thomas Huth <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20191204154618 [email protected]>
Signed-off-by: Alex Bennée <[email protected]>

iotests: Provide a function for checking the creation of huge files

Some tests create huge (but sparse) files, and to be able to run those
tests in certain limited environments (like CI containers), we have to
check for the possibility to create such files first. Thus let's introduce
a common function to check for large files, and replace the already
existing checks in the iotests 005 and 220 with this function.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Reviewed-by: Cleber Rosa <[email protected]>
Tested-by: Cleber Rosa <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20191204154618 [email protected]>
Signed-off-by: Alex Bennée <[email protected]>

travis.yml: Run tcg tests with tci

So far we only have compile coverage for tci. But since commit
2f160e0f9797c7522bfd0d09218d0c9340a5137c ("tci: Add implementation
for INDEX_op_ld16u_i64") has been included now, we can also run the
"tcg" and "qtest" tests with tci, so let's enable them in Travis now.
Since we don't gain much additional test coverage by compiling all
targets, and TCI is broken e.g. with the Sparc targets, we also limit
the target list to a reasonable subset now (which should still get us
test coverage by tests/boot-serial-test for example).

Tested-by: Stefan Weil <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <20191204083133 [email protected]>
[AJB: just --enable-debug-tcg]
Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>

tests/vm: Allow to set qemu-img path

By default VM build test use qemu-img from system's PATH to
create the image disk. Due the lack of qemu-img on the system
or the desire to simply use a version built with QEMU, it would
be nice to allow one to set its path. So this patch makes that
possible by reading the path to qemu-img from QEMU_IMG if set,
otherwise it fallback to default behavior.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Message-Id: <20191114134246 [email protected]>
Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

configure: allow disable of cross compilation containers

Our docker infrastructure isn't quite as multiarch as we would wish so
lets allow the user to disable it if they want. This will allow us to
use still run check-tcg on non-x86 CI setups.

Signed-off-by: Alex Bennée <[email protected]>
Reviewed-by: Stefan Weil <[email protected]>
Reviewed-by: Cleber Rosa <[email protected]>
Tested-by: Cleber Rosa <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Tested-by: Richard Henderson <[email protected]>

Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2019-12-17' into staging

* Removal of the deprecated bluetooth code
* Some qtest and misc patches

# gpg: Signature made Tue 17 Dec 2019 08:09:08 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Thomas Huth <[email protected]>" [full]
# gpg:                 aka "Thomas Huth <[email protected]>" [full]
# gpg:                 aka "Thomas Huth <[email protected]>" [full]
# gpg:                 aka "Thomas Huth <[email protected]>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/huth-gitlab/tags/pull-request-2019-12-17:
  tests: use g_test_rand_int
  tests/Makefile: Fix check-report.* targets shown in check-help
  glib: use portable g_setenv()
  hw/misc/ivshmem: Bury dead legacy INTx code
  pseries: disable migration-test if /dev/kvm cannot be used
  tests: fix modules-test 'duplicate test case' error
  Remove libbluetooth / bluez from the CI tests
  Remove the core bluetooth code
  hw/usb: Remove the USB bluetooth dongle device
  hw/arm/nseries: Replace the bluetooth chardev with a "null" chardev

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/cleber/tags/python-next-pull-request' into staging

Python queue 2019-12-17

# gpg: Signature made Tue 17 Dec 2019 05:12:43 GMT
# gpg:                using RSA key 7ABB96EB8B46B94D5E0FE9BB657E8D33A5F209F3
# gpg: Good signature from "Cleber Rosa <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3

* remotes/cleber/tags/python-next-pull-request:
  python/qemu: Remove unneeded imports in __init__
  python/qemu: accel: Add tcg_available() method
  python/qemu: accel: Strengthen kvm_available() checks
  python/qemu: accel: Add list_accel() method
  python/qemu: Move kvm_available() to its own module
  Acceptance tests: use relative location for tests
  Acceptance tests: use avocado tags for machine type
  Acceptance tests: introduce utility method for tags unique vals
  Acceptance test x86_cpu_model_versions: use default vm
  tests/acceptance: Makes linux_initrd and empty_cpu_model use QEMUMachine
  python/qemu: Add set_qmp_monitor() to QEMUMachine
  analyze-migration.py: replace numpy with python 3.2
  analyze-migration.py: fix find() type error
  Revert "Acceptance test: cancel test if m68k kernel packages goes missing"
  tests/boot_linux_console: Fetch assets from Debian snapshot archives

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.0-20191217' into staging

ppc patch queue 2019-12-17

This is the first pull request for the qemu-5.0 branch.  It has a lot
of accumulated changes, including:

    * SLOF update to support boot using the IOMMU (will become
      necessary for secure guests)

    * Clean ups to pnv handling of chip models

    * A number of extensions to the powernv machine model

    * TCG extensions to allow powernv emulated systems to run KVM guests

    * Outline support for POWER10 chips in powernv

    * Cleanups to the ibm,client-architecture-support feature negotiation path

    * XIVE reworks to better handle the powernv machine

    * Improvements to not waste interrupt queues and other semi-scarce
      resources when using XIVE under KVM

# gpg: Signature made Tue 17 Dec 2019 04:42:20 GMT
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>" [full]
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>" [full]
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-5.0-20191217: (88 commits)
  pseries: Update SLOF firmware image
  ppc/pnv: Drop PnvChipClass::type
  ppc/pnv: Introduce PnvChipClass::xscom_pcba() method
  ppc/pnv: Drop pnv_chip_is_power9() and pnv_chip_is_power10() helpers
  ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom()
  ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom()
  ppc/pnv: Introduce PnvChipClass::xscom_core_base() method
  ppc/pnv: Introduce PnvChipClass::intc_print_info() method
  ppc/pnv: Drop pnv_is_power9() and pnv_is_power10() helpers
  ppc/pnv: Introduce PnvMachineClass::dt_power_mgt()
  ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat
  ppc/pnv: Drop PnvPsiClass::chip_type
  ppc/pnv: Introduce PnvPsiClass::compat
  ppc: Drop useless extern annotation for functions
  ppc/pnv: Fix OCC common area region mapping
  ppc/pnv: Introduce PBA registers
  ppc/pnv: Make PnvXScomInterface an incomplete type
  ppc/pnv: populate the DT with realized XSCOM devices
  ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes
  target/ppc: Add SPR TBU40
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging

x86 queue, 2019-12-16

Feature:
* Cooperlake CPU model

Cleanups:
* Use g_autofree in a few places

# gpg: Signature made Mon 16 Dec 2019 19:36:51 GMT
# gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Eduardo Habkost <[email protected]>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-next-pull-request:
  i386: Use g_autofree in a few places
  i386: Add new CPU model Cooperlake
  i386: Add macro for stibp
  i386: Add MSR feature bit for MDS-NO

Signed-off-by: Peter Maydell <[email protected]>

tests: use g_test_rand_int

g_test_rand_int provides a reproducible random integer number, using a
different number seed every time but allowing reproduction using the
--seed command line option. It is thus better suited to tests than
g_random_int or random.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <1576113478 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

tests/Makefile: Fix check-report.* targets shown in check-help

The check-report.html and check-report.xml targets were replaced
with check-report.tap in commit 9df43317b82 but the check-help
text was not updated so it still lists check-report.html.

Fixes: 9df43317b82
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Message-Id: <20191211204427 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

glib: use portable g_setenv()

We have a setenv() wrapper in os-win32.c that no one is actually using.
Drop it and change to g_setenv() uniformly.

Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <1576074210 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

hw/misc/ivshmem: Bury dead legacy INTx code

Devices "ivshmem-plain" and "ivshmem-doorbell" support only MSI-X.
Config space register Interrupt Pin is zero.  Device "ivshmem"
additionally supported legacy INTx, but it was removed in commit
5a0e75f0a9 "hw/misc/ivshmem: Remove deprecated "ivshmem" legacy
device".  The commit left ivshmem_update_irq() behind.  Since the
Interrupt Pin register is zero, the function does nothing.  Remove it.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <20191205203557 [email protected]>
Signed-off-by: Thomas Huth <[email protected]>

pseries: disable migration-test if /dev/kvm cannot be used

On ppc64, migration-test only works with kvm_hv, and we already
have a check to verify the module is loaded.

kvm_hv module can be loaded in memory and /sys/module/kvm_hv exists,
but on some systems (like build systems) /dev/kvm can be missing
(by administrators choice).

And as kvm_hv exists test-migration is started but QEMU falls back to
TCG because it cannot be used:

    Could not access KVM kernel module: No such file or directory
    failed to initialize KVM: No such file or directory
    Back to tcg accelerator

And as the test is done with TCG, it fails.

As for s390x, we must check for the existence and the access rights
of /dev/kvm.

Reported-by: Cole Robinson <[email protected]>
Signed-off-by: Laurent Vivier <[email protected]>
Message-Id: <20191120170955 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

tests: fix modules-test 'duplicate test case' error

./configure --enable-sdl --audio-drv-list=sdl --enable-modules

Will generate two identical test names: /$arch/module/load/sdl
Which generates an error like:

(tests/modules-test:23814): GLib-ERROR **: 18:23:06.359: duplicate test case path: /aarch64//module/load/sdl

Add the subsystem prefix in the name as well, so instead we get:

/$arch/module/load/audio-sdl
/$arch/module/load/ui-sdl

Signed-off-by: Cole Robinson <[email protected]>
Message-Id: <d64c9aa098cc6e5c0b638438c4959eddfa7e24e2.1573679311 [email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

Remove libbluetooth / bluez from the CI tests

Since the bluetooth code has been removed, we don't need to test
with this library anymore.

Message-Id: <20191120091014 [email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

Remove the core bluetooth code

It's been deprecated since QEMU v3.1. We've explicitly asked in the
deprecation message that people should speak up on qemu-devel in case
they are still actively using the bluetooth part of QEMU, but nobody
ever replied that they are really still using it.

I've tried it on my own to use this bluetooth subsystem for one of my
guests, but I was also not able to get it running anymore: When I was
trying to pass-through a real bluetooth device, either the guest did
not see the device at all, or the guest crashed.

Even worse for the emulated device: When running

qemu-system-x86_64 -bt device:keyboard

QEMU crashes once you hit a key.

So it seems like the bluetooth stack is not only neglected, it is
completely bitrotten, as far as I can tell. The only attention that
this code got during the past years were some CVEs that have been
spotted there. So this code is a burden for the developers, without
any real benefit anymore. Time to remove it.

Note: hw/bt/Kconfig only gets cleared but not removed here yet.
Otherwise there is a problem with the *-softmmu/config-devices.mak.d
dependency files - they still contain a reference to this file which
gets evaluated first on some build hosts, before the file gets
properly recreated. To avoid breaking these builders, we still need
the file around for some time. It will get removed in a couple of
weeks instead.

Message-Id: <20191120091014 [email protected]>
Reviewed-by: Ján Tomko <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>

pseries: Update SLOF firmware image

This fixes PCI bridges support regression.

This enables IOMMU support in virtio drivers.

The full list of changes is:

Alexey Kardashevskiy (12):
      allocator: Fix format strings for DEBUG
      virtio: Make virtio_set_qaddr static
      client: Load initramdisk location
      sloffs: Fix -Wunused-result gcc warnings in read/write
      pci-phb: Reimplement dma-map-in/out
      virtio: Store queue descriptors in virtio_device
      virtio-net: Init queues after features negotiation
      virtio: Enable IOMMU
      ibm,client-architecture-support: Fix stack handling
      fdt: Fix updating the tree at H_CAS
      version: update to 20191206
      version: update to 20191217

Michael Roth (1):
      dma: Define default dma methods for using by client/package instances

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Drop PnvChipClass::type

It isn't used anymore.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623844102.360005.12070225703151669294 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PnvChipClass::xscom_pcba() method

The XSCOM bus is implemented with a QOM interface, which is mostly
generic from a CPU type standpoint, except for the computation of
addresses on the Pervasive Connect Bus (PCB) network. This is handled
by the pnv_xscom_pcba() function with a switch statement based on
the chip_type class level attribute of the CPU chip.

This can be achieved using QOM. Also the address argument is masked with
PNV_XSCOM_SIZE - 1, which is for POWER8 only. Addresses may have different
sizes with other CPU types. Have each CPU chip type handle the appropriate
computation with a QOM xscom_pcba() method.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623843543.360005.13996472463887521794 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Drop pnv_chip_is_power9() and pnv_chip_is_power10() helpers

They aren't used anymore.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623842986.360005.1787401623906380181 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom()

Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the
"compatible" property. Just pass the compat string and its size as
arguments.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623842430.360005.9513965612524265862 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom()

Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the "reg"
property. Just pass the base address and address size as arguments.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623841868.360005.17577624823547136435 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PnvChipClass::xscom_core_base() method

The pnv_chip_core_realize() function configures the XSCOM MMIO subregion
for each core of a single chip. The base address of the subregion depends
on the CPU type. Its computation is currently open-code using the
pnv_chip_is_powerXX() helpers. This can be achieved with QOM. Introduce
a method for this in the base chip class and implement it in child classes.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623841311.360005.4705705734873339545 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PnvChipClass::intc_print_info() method

The pnv_pic_print_info() callback checks the type of the chip in order
to forward to the request appropriate interrupt controller. This can
be achieved with QOM. Introduce a method for this in the base chip class
and implement it in child classes.

This also prepares ground for the upcoming interrupt controller of POWER10
chips.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623840755.360005.5002022339473369934 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Drop pnv_is_power9() and pnv_is_power10() helpers

They aren't used anymore.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623840200.360005.1300941274565357363 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PnvMachineClass::dt_power_mgt()

We add an extra node to advertise power management on some machines,
namely powernv9 and powernv10. This is achieved by using the
pnv_is_power9() and pnv_is_power10() helpers.

This can be achieved with QOM. Add a method to the base class for
powernv machines and have it implemented by machine types that
support power management instead.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623839642.360005.9243510140436689941 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat

The pnv_dt_create() function generates different contents for the
"compatible" property of the root node in the DT, depending on the
CPU type. This is open coded with multiple ifs using pnv_is_powerXX()
helpers.

It seems cleaner to achieve with QOM. Introduce a base class for the
powernv machine and a compat attribute that each child class can use
to provide the value for the "compatible" property.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623839085.360005.4046508784077843216 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
[dwg: Folded in small fix Greg spotted after posting]
Signed-off-by: David Gibson <[email protected]>

python/qemu: Remove unneeded imports in __init__

__init_.py import some sub-modules unnecessarily. So let's
clean it up.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Suggested-by: Cleber Rosa <[email protected]>
Reviewed-by: Cleber Rosa <[email protected]>
Tested-by: Cleber Rosa <[email protected]>
Message-Id: <20191216191438 [email protected]>
Signed-off-by: Cleber Rosa <[email protected]>

ppc/pnv: Drop PnvPsiClass::chip_type

It isn't used anymore.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623838530.360005.15470128760871845396 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PnvPsiClass::compat

The Processor Service Interface (PSI) model has a chip_type class level
attribute, which is used to generate the content of the "compatible" DT
property according to the CPU type.

Since the PSI model already has specialized classes for each supported
CPU type, it seems cleaner to achieve this with QOM. Provide the content
of the "compatible" property with a new class level attribute.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157623837974.360005.14706607446188964477 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Drop useless extern annotation for functions

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <157623837421.360005.412120366652768311 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Fix OCC common area region mapping

The OCC common area is mapped at a unique address on the system and
each OCC is assigned a segment to expose its sensor data :

  -------------------------------------------------------------------------
  | Start (Offset from | End           | Size     |Description            |
  | BAR2 base address) |               |          |                       |
  -------------------------------------------------------------------------
  |    0x00580000      |  0x005A57FF   |150kB     |OCC 0 Sensor Data Block|
  |    0x005A5800      |  0x005CAFFF   |150kB     |OCC 1 Sensor Data Block|
  |        :           |       :       |  :       |            :          |
  |    0x00686800      |  0x006ABFFF   |150kB     |OCC 7 Sensor Data Block|
  |    0x006AC000      |  0x006FFFFF   |336kB     |Reserved               |
  -------------------------------------------------------------------------

Maximum size is 1.5MB.

We could define a "OCC common area" memory region at the machine level
and sub regions for each OCC. But it adds some extra complexity to the
models. Fix the current layout with a simpler model.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191211082912 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce PBA registers

The PBA bridge unit (Power Bus Access) connects the OCC (On Chip
Controller) to the Power bus and System Memory. The PBA is used to
gather sensor data, for power management, for sleep states, for
initial boot, among other things.

The PBA logic provides a set of four registers PowerBus Access Base
Address Registers (PBABAR0..3) which map the OCC address space to the
PowerBus space. These registers are setup by the initial FW and define
the PowerBus Range of system memory that can be accessed by PBA.

The current modeling of the PBABAR registers is done under the common
XSCOM handlers. We introduce a specific XSCOM regions for these
registers and fix :

- BAR sizes and BAR masks
- The mapping of the OCC common area. It is common to all chips and
should be mapped once. We will address per-OCC area in the next
change.
- OCC common area is in BAR 3 on P8

Inspired by previous work of Balamuruhan S <[email protected]>

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191211082912 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Make PnvXScomInterface an incomplete type

PnvXScomInterface is an interface instance. It should never be
dereferenced. Drop the dummy type definition for extra safety,
which is the common practice with QOM interfaces.

While here also convert the bogus OBJECT_CHECK() to INTERFACE_CHECK().

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157608025541.186670.1577861507610404326 [email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: populate the DT with realized XSCOM devices

Some devices could be initialized in the instance_init handler but not
realized for configuration reasons. Nodes should not be added in the DT
for such devices.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191210135845 [email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Loop on the whole hierarchy to populate the DT with the XSCOM nodes

Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
the first layer, so we need to loop on the whole object hierarchy to
catch them.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191210135845 [email protected]>
Reviewed-by: Greg Kurz <[email protected]>
[dwg: Corrected error in comment]
Signed-off-by: David Gibson <[email protected]>

target/ppc: Add SPR TBU40

The spr TBU40 is used to set the upper 40 bits of the timebase
register, present on POWER5+ and later processors.

This register can only be written by the hypervisor, and cannot be read.

Signed-off-by: Suraj Jitindar Singh <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191128134700 [email protected]>
Signed-off-by: David Gibson <[email protected]>

target/ppc: Add SPR ASDR

The Access Segment Descriptor Register (ASDR) provides information about
the storage element when taking a hypervisor storage interrupt. When
performing nested radix address translation, this is normally the guest
real address. This register is present on POWER9 processors and later.

Implement the ADSR, note read and write access is limited to the
hypervisor.

Signed-off-by: Suraj Jitindar Singh <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191128134700 [email protected]>
Signed-off-by: David Gibson <[email protected]>

target/ppc: Work [S]PURR implementation and add HV support

The Processor Utilisation of Resources Register (PURR) and Scaled
Processor Utilisation of Resources Register (SPURR) provide an estimate
of the resources used by the thread, present on POWER7 and later
processors.

Currently the [S]PURR registers simply count at the rate of the
timebase.

Preserve this behaviour but rework the implementation to store an offset
like the timebase rather than doing the calculation manually. Also allow
hypervisor write access to the register along with the currently
available read access.

Signed-off-by: Suraj Jitindar Singh <[email protected]>
Reviewed-by: David Gibson <[email protected]>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191128134700 [email protected]>
Signed-off-by: David Gibson <[email protected]>

target/ppc: Implement the VTB for HV access

The virtual timebase register (VTB) is a 64-bit register which
increments at the same rate as the timebase register, present on POWER8
and later processors.

The register is able to be read/written by the hypervisor and read by
the supervisor. All other accesses are illegal.

Currently the VTB is just an alias for the timebase (TB) register.

Implement the VTB so that is can be read/written independent of the TB.
Make use of the existing method for accessing timebase facilities where
by the compensation is stored and used to compute the value on reads/is
updated on writes.

Signed-off-by: Suraj Jitindar Singh <[email protected]>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191128134700 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: add a LPC Controller model for POWER10

Same a POWER9, only the MMIO window changes.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191205184454 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: add a PSI bridge model for POWER10

The POWER10 PSIHB controller is very similar to the one on POWER9. We
should probably introduce a common PnvPsiXive object.

The ESB page size should be changed to 64k when P10 support is ready.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191205184454 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/psi: cleanup definitions

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191205184454 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce a POWER10 PnvChip and a powernv10 machine

This is an empty shell with the XSCOM bus and cores. The chip controllers
will come later.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191205184454 [email protected]>
Signed-off-by: David Gibson <[email protected]>

target/ppc: Add POWER10 DD1.0 model information

This includes in QEMU a new CPU model for the POWER10 processor with
the same capabilities of a POWER9 process. The model will be extended
when support is completed.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191205184454 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Make PPCVirtualHypervisor an incomplete type

PPCVirtualHypervisor is an interface instance. It should never be
dereferenced. Drop the dummy type definition for extra safety, which
is the common practice with QOM interfaces.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157589808041.21182.18121655959115011353 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Ignore the CPU_INTERRUPT_EXITTB interrupt with KVM

This only makes sense with an emulated CPU. Don't set the bit in
CPUState::interrupt_request when using KVM to avoid confusions.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157548863423.3650476.16424649423510075159 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Don't use CPUPPCState::irq_input_state with modern Book3s CPU models

The power7_set_irq() and power9_set_irq() functions set this but it is
never used actually. Modern Book3s compatible CPUs are only supported
by the pnv and spapr machines. They have an interrupt controller, XICS
for POWER7/8 and XIVE for POWER9, whose models don't require to track
IRQ input states at the CPU level.

Drop these lines to avoid confusion.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157548862861.3650476.16622818876928044450 [email protected]>
Signed-off-by: David Gibson <[email protected]>

xics: Don't deassert outputs

The correct way to do this is to deassert the input pins on the CPU side.
This is the case since a previous change.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157548862298.3650476.1228720391270249433 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Deassert the external interrupt pin in KVM on reset

When a CPU is reset, QEMU makes sure no interrupt is pending by clearing
CPUPPCstate::pending_interrupts in ppc_cpu_reset(). In the case of a
complete machine emulation, eg. a sPAPR machine, an external interrupt
request could still be pending in KVM though, eg. an IPI. It will be
eventually presented to the guest, which is supposed to acknowledge it at
the interrupt controller. If the interrupt controller is emulated in QEMU,
either XICS or XIVE, ppc_set_irq() won't deassert the external interrupt
pin in KVM since it isn't pending anymore for QEMU. When the vCPU re-enters
the guest, the interrupt request is still pending and the vCPU will try
again to acknowledge it. This causes an infinite loop and eventually hangs
the guest.

The code has been broken since the beginning. The issue wasn't hit before
because accel=kvm,kernel-irqchip=off is an awkward setup that never got
used until recently with the LC92x IBM systems (aka, Boston).

Add a ppc_irq_reset() function to do the necessary cleanup, ie. deassert
the IRQ pins of the CPU in QEMU and most importantly the external interrupt
pin for this vCPU in KVM.

Reported-by: Satheesh Rajendran <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157548861740.3650476.16879693165328764758 [email protected]>
Signed-off-by: David Gibson <[email protected]>

spapr: Simplify ovec diff

spapr_ovec_diff(ov, old, new) has somewhat complex semantics. ov is set
to those bits which are in new but not old, and it returns as a boolean
whether or not there are any bits in old but not new.

It turns out that both callers only care about the second, not the first.
This is basically equivalent to a bitmap subset operation, which is easier
to understand and implement. So replace spapr_ovec_diff() with
spapr_ovec_subset().

Cc: Mike Roth <[email protected]>
Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Cedric Le Goater <[email protected]>

spapr: Fold h_cas_compose_response() into h_client_architecture_support()

spapr_h_cas_compose_response() handles the last piece of the PAPR feature
negotiation process invoked via the ibm,client-architecture-support OF
call. Its only caller is h_client_architecture_support() which handles
most of the rest of that process.

I believe it was placed in a separate file originally to handle some
fiddly dependencies between functions, but mostly it's just confusing
to have the CAS process split into two pieces like this. Now that
compose response is simplified (by just generating the whole device
tree anew), it's cleaner to just fold it into
h_client_architecture_support().

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Cedric Le Goater <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>

spapr: Improve handling of fdt buffer size

Previously, spapr_build_fdt() constructed the device tree in a fixed
buffer of size FDT_MAX_SIZE. This is a bit inflexible, but more
importantly it's awkward for the case where we use it during CAS. In
that case the guest firmware supplies a buffer and we have to
awkwardly check that what we generated fits into it afterwards, after
doing a lot of size checks during spapr_build_fdt().

Simplify this by having spapr_build_fdt() take a 'space' parameter.
For the CAS case, we pass in the buffer size provided by SLOF, for the
machine init case, we continue to pass FDT_MAX_SIZE.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Cedric Le Goater <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>

spapr: Don't trigger a CAS reboot for XICS/XIVE mode changeover

PAPR allows the interrupt controller used on a POWER9 machine (XICS or
XIVE) to be selected by the guest operating system, by using the
ibm,client-architecture-support (CAS) feature negotiation call.

Currently, if the guest selects an interrupt controller different from the
one selected at initial boot, this causes the system to be reset with the
new model and the boot starts again.  This means we run through the SLOF
boot process twice, as well as any other bootloader (e.g. grub) in use
before the OS calls CAS.  This can be confusing and/or inconvenient for
users.

Thanks to two fairly recent changes, we no longer need this reboot.  1) we
now completely regenerate the device tree when CAS is called (meaning we
don't need special case updates for all the device tree changes caused by
the interrupt controller mode change),  2) we now have explicit code paths
to activate and deactivate the different interrupt controllers, rather than
just implicitly calling those at machine reset time.

We can therefore eliminate the reboot for changing irq mode, simply by
putting a call to spapr_irq_update_active_intc() before we call
spapr_h_cas_compose_response() (which gives the updated device tree to
the guest firmware and OS).

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Cedric Le Goater <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>

ppc: well form kvmppc_hint_smt_possible error hint helper

Make kvmppc_hint_smt_possible hint append helper well formed:
rename errp to errp_in, as it is IN-parameter here (which is unusual
for errp), rename function to be kvmppc_error_append_*_hint.

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Message-Id: <20191127191434 [email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Dump the XIVE NVT table

This is useful to dump the saved contexts of the vCPUs : configuration
of the base END index of the vCPU and the Interrupt Pending Buffer
register, which is updated when an interrupt can not be presented.

When dumping the NVT table, we skip empty indirect pages which are not
necessarily allocated.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Extend XiveRouter with a get_block_id() handler

When doing CAM line compares, fetch the block id from the interrupt
controller which can have set the PC_TCTXT_CHIPID field.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce a pnv_xive_block_id() helper

When PC_TCTXT_CHIPID_OVERRIDE is configured, the PC_TCTXT_CHIPID field
overrides the hardwired chip ID in the Powerbus operations and for CAM
compares. This is typically used in the one block-per-chip configuration
to associate a unique block id number to each IC of the system.

Simplify the model with a pnv_xive_block_id() helper and remove
'tctx_chipid' which becomes useless.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Synthesize interrupt from the saved IPB in the NVT

When a vCPU is dispatched on a HW thread, its context is pushed in the
thread registers and it is activated by setting the VO bit in the CAM
line word2. The HW grabs the associated NVT, pulls the IPB bits and
merges them with the IPB of the new context. If interrupts were missed
while the vCPU was not dispatched, these are synthesized in this
sequence.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Introduce a xive_tctx_ipb_update() helper

We will use it to resend missed interrupts when a vCPU context is
pushed on a HW thread.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Remove the get_tctx() XiveRouter handler

It is now unused.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Move the TIMA operations to the controller model

On the P9 Processor, the thread interrupt context registers of a CPU
can be accessed "directly" when by load/store from the CPU or
"indirectly" by the IC through an indirect TIMA page. This requires to
configure first the PC_TCTXT_INDIRx registers.

Today, we rely on the get_tctx() handler to deduce from the CPU PIR
the chip from which the TIMA access is being done. By handling the
TIMA memory ops under the interrupt controller model of each machine,
we can uniformize the TIMA direct and indirect ops under PowerNV. We
can also check that the CPUs have been enabled in the XIVE controller.

This prepares ground for the future versions of XIVE.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Clarify how the TIMA is accessed on a multichip system

The TIMA region gives access to the thread interrupt context registers
of a CPU. It is mapped at the same address on all chips and can be
accessed by any CPU of the system. To identify the chip from which the
access is being done, the PowerBUS uses a 'chip' field in the
load/store messages. QEMU does not model these messages, instead, we
extract the chip id from the CPU PIR and do a lookup at the machine
level to fetch the targeted interrupt controller.

Introduce pnv_get_chip() and pnv_xive_tm_get_xive() helpers to clarify
this process in pnv_xive_get_tctx(). The latter will be removed in the
subsequent patches but the same principle will be kept.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

spapr/xive: Configure number of servers in KVM

The XIVE KVM devices now has an attribute to configure the number of
interrupt servers. This allows to greatly optimize the usage of the VP
space in the XIVE HW, and thus to start a lot more VMs.

Only set this attribute if available in order to support older POWER9
KVM.

The XIVE KVM device now reports the exhaustion of VPs upon the
connection of the first VCPU. Check that in order to have a chance
to provide a hint to the user.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157478679392.67101.7843580591407950866 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

spapr/xics: Configure number of servers in KVM

The XICS-on-XIVE KVM devices now has an attribute to configure the number
of interrupt servers. This allows to greatly optimize the usage of the VP
space in the XIVE HW, and thus to start a lot more VMs.

Only set this attribute if available in order to support older POWER9 KVM
and pre-POWER9 XICS KVM devices.

The XICS-on-XIVE KVM device now reports the exhaustion of VPs upon the
connection of the first VCPU. Check that in order to have a chance to
provide a hint to the user.
`
Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157478678846.67101.9660531022460517710 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

spapr: Pass the maximum number of vCPUs to the KVM interrupt controller

The XIVE and XICS-on-XIVE KVM devices on POWER9 hosts can greatly reduce
their consumption of some scarce HW resources, namely Virtual Presenter
identifiers, if they know the maximum number of vCPUs that may run in the
VM.

Prepare ground for this by passing the value down to xics_kvm_connect()
and kvmppc_xive_connect(). This is purely mechanical, no functional
change.

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157478678301.67101.2717368060417156338 [email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: David Gibson <[email protected]>

linux-headers: Update

Update to mainline commit be2eca94d144 ("Merge tag 'for-linus-5.5-1'`
of git://github.com/cminyard/linux-ipmi")

Signed-off-by: Greg Kurz <[email protected]>
Message-Id: <157478677756.67101.11558821804418331832 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Extend the TIMA operation with a XivePresenter parameter

The TIMA operations are performed on behalf of the XIVE IVPE sub-engine
(Presenter) on the thread interrupt context registers. The current
operations supported by the model are simple and do not require access
to the controller but more complex operations will need access to the
controller NVT table and to its configuration.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Use the XiveFabric and XivePresenter interfaces

Now that the machines have handlers implementing the XiveFabric and
XivePresenter interfaces, remove xive_presenter_match() and make use
of the 'match_nvt' handler of the machine.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/spapr: Implement the XiveFabric interface

The CAM line matching sequence in the pseries machine does not change
much apart from the use of the new QOM interfaces. There is an extra
indirection because of the sPAPR IRQ backend of the machine. Only the
XIVE backend implements the new 'match_nvt' handler.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Implement the XiveFabric interface

The CAM line matching on the PowerNV machine now scans all chips of
the system and all CPUs of a chip to find a dispatched NVT in the
thread contexts.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Introduce a XiveFabric interface

The XiveFabric QOM interface acts as the PowerBUS interface between
the interrupt controller and the system and should be implemented by
the QEMU machine. On HW, the XIVE sub-engine is responsible for the
communication with the other chip is the Common Queue (CQ) bridge
unit.

This interface offers a 'match_nvt' handler to perform the CAM line
matching when looking for a XIVE Presenter with a dispatched NVT.

Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Fix TIMA indirect access

When the TIMA of a CPU needs to be accessed from the indirect page,
the thread id of the target CPU is first stored in the PC_TCTXT_INDIR0
register. This thread id is relative to the chip and not to the system.

Introduce a helper routine to look for a CPU of a given PIR and fix
pnv_xive_get_indirect_tctx() to scan only the threads of the local
chip and not the whole machine.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Introduce a pnv_xive_is_cpu_enabled() helper

and use this helper to exclude CPUs which are not enabled in the XIVE
controller.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc: Introduce a ppc_cpu_pir() helper

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Loop on the threads of the chip to find a matching NVT

CPU_FOREACH() loops on all the CPUs of the machine which is incorrect.
Each XIVE Presenter should scan only the HW threads of the chip it
belongs to.

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Instantiate cores separately

Allocating a big void * array to store multiple objects isn't a
recommended practice for various reasons:
- no compile time type checking
- potential dangling pointers if a reference on an individual is
taken and the array is freed later on
- duplicate boiler plate everywhere the array is browsed through

Allocate an array of pointers and populate it instead.

Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Implement the XivePresenter interface

Each XIVE Router model, sPAPR and PowerNV, now implements the 'match_nvt'
handler of the XivePresenter QOM interface. This is simply moving code
and taking into account the new API.

To be noted that the xive_router_get_tctx() helper is not used anymore
when doing CAM matching and will be removed later on after other changes.

The XIVE presenter model is still too simple for the PowerNV machine
and the CAM matching algo is not correct on multichip system. Subsequent
patches will introduce more changes to scan all chips of the system.

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/xive: Introduce a XivePresenter interface

When the XIVE IVRE sub-engine (XiveRouter) looks for a Notification
Virtual Target (NVT) to notify, it broadcasts a message on the
PowerBUS to find an XIVE IVPE sub-engine (Presenter) with the NVT
dispatched on one of its HW threads, and then forwards the
notification if any response was received.

The current XIVE presenter model is sufficient for the pseries machine
because it has a single interrupt controller device, but the PowerNV
machine can have multiple chips each having its own interrupt
controller. In this case, the XIVE presenter model is too simple and
the CAM line matching should scan all chips of the system.

To start fixing this issue, we first extend the XIVE Router model with
a new XivePresenter QOM interface representing the XIVE IVPE
sub-engine. This interface exposes a 'match_nvt' handler which the
sPAPR and PowerNV XIVE Router models will need to implement to perform
the CAM line matching.

Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191125065820 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Create BMC devices at machine init

The BMC of the OpenPOWER systems monitors the machine state using
sensors, controls the power and controls the access to the PNOR flash
device containing the firmware image required to boot the host.

QEMU models the power cycle process, access to the sensors and access
to the PNOR device. But, for these features to be available, the QEMU
PowerNV machine needs two extras devices on the command line, an IPMI
BT device for communication and a BMC backend device:

  -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10

The BMC properties are then defined accordingly in the device tree and
OPAL self adapts. If a BMC device and an IPMI BT device are not
available, OPAL does not try to communicate with the BMC in any
manner. This is not how real systems behave.

To be closer to the default behavior, create an IPMI BMC simulator
device and an IPMI BT device at machine initialization time. We loose
the ability to define an external BMC device but there are benefits:

  - a better match with real systems,
  - a better test coverage of the OPAL code,
  - system powerdown and reset commands that work,
  - a QEMU device tree compliant with the specifications (*).

(*) Still needs a MBOX device.

Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191121162340 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ppc/pnv: Add HIOMAP commands

This activates HIOMAP support on the QEMU PowerNV machine. The PnvPnor
model is used to access the flash contents. The model simply maps the
contents at a fix offset and enables or disables the mapping.

HIOMAP Protocol description :

https://github.com/openbmc/hiomapd/blob/master/Documentation/protocol.md

Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191028070027 [email protected]>
Signed-off-by: David Gibson <[email protected]>

ipmi: Add support to customize OEM functions

The routine ipmi_register_oem_netfn() lets external modules register
command handlers for OEM functions. Required for the PowerNV machine.

Cc: Corey Minyard <[email protected]>
Reviewed-by: Corey Minyard <[email protected]>
Signed-off-by: Cédric Le Goater <[email protected]>
Message-Id: <20191028070027 [email protected]>
Acked-by: Corey Minyard <[email protected]>
Signed-off-by: David Gibson <[email protected]>