s390x/arch_dump: use proper note name and note size
binutils/libbfd (bfd/elf.c) enforces that all s390-specific ELF
notes, e.g. NT_S390_PREFIX or NT_S390_CTRS, carry "LINUX" as the
note name and a namesz of 6. Otherwise the notes are ignored.
QEMU currently uses "CORE" for these notes. Up to now this has
not been a real problem because the dump analysis tool "crash"
handles that, but it will break all programs that use libbfd
for processing ELF notes.
So fix this and use "LINUX" for all s390 specific notes to comply
with libbfd. Also set the correct namesz.
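For illustration, a minimal sketch of the note layout this implies (not
the actual QEMU dump code; the helper name and buffer handling are made
up, and the caller is assumed to have zeroed the buffer so the padding
bytes end up zero):

    #include <elf.h>
    #include <string.h>

    /* Lay out one s390 ELF note with name "LINUX" and n_namesz = 6,
     * which is what libbfd expects for e.g. NT_S390_PREFIX. */
    static size_t write_s390_note(char *buf, Elf64_Word type,
                                  const void *desc, Elf64_Word desc_size)
    {
        Elf64_Nhdr *nhdr = (Elf64_Nhdr *)buf;
        char *p = buf + sizeof(*nhdr);

        nhdr->n_namesz = 6;            /* strlen("LINUX") + trailing NUL */
        nhdr->n_descsz = desc_size;
        nhdr->n_type = type;           /* e.g. NT_S390_PREFIX, NT_S390_CTRS */

        memcpy(p, "LINUX", 6);
        p += 8;                        /* name field padded to 4-byte alignment */
        memcpy(p, desc, desc_size);

        /* total size, with the descriptor also padded to 4 bytes */
        return sizeof(*nhdr) + 8 + ((desc_size + 3) & ~3u);
    }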
Halil Pasic [Mon, 7 Dec 2015 15:45:17 +0000 (16:45 +0100)]
virtio-ccw: support VIRTIO_QUEUE_MAX virtqueues
The maximal number of virtqueues per device can be limited on a per
transport basis. For virtio-ccw this limit is defined by
VIRTIO_CCW_QUEUE_MAX; however, the limitation used to come from the
number of adapter routes supported by the flic (via notifiers).
Recently the limitation of the flic was adjusted so that it can
accommodate VIRTIO_QUEUE_MAX queues, and it is now also checked
separately.
Let us remove the transport specific limitation of virtio-ccw by
dropping VIRTIO_CCW_QUEUE_MAX and using VIRTIO_QUEUE_MAX instead.
Halil Pasic [Fri, 9 Dec 2016 19:00:21 +0000 (20:00 +0100)]
s390x: bump ADAPTER_ROUTES_MAX_GSI
Let's increase ADAPTER_ROUTES_MAX_GSI to VIRTIO_QUEUE_MAX which is the
largest demand foreseeable at the moment. Let us add a compatibility
macro for the previous machines so client code can maintain backward
migration compatibility.
To avoid breaking migration compatibility for virtio-ccw,
VIRTIO_CCW_QUEUE_MAX is left at its current value and will be dropped
when virtio-ccw is converted to use the flic capability introduced by
this patch.
Halil Pasic [Fri, 9 Dec 2016 18:58:10 +0000 (19:58 +0100)]
virtio-ccw: check flic->adapter_routes_max_batch
Currently VIRTIO_CCW_QUEUE_MAX is defined as ADAPTER_ROUTES_MAX_GSI.
That is, when checking the queue maximum we implicitly check the
constraint on the number of adapter routes. This won't be satisfactory
any more (due to backward migration considerations) if
ADAPTER_ROUTES_MAX_GSI changes (ADAPTER_ROUTES_MAX_GSI is going to
change because we want to support up to VIRTIO_QUEUE_MAX queues per
virtio-ccw device).
Let us introduce a check on the recently introduced flic property which
gives us the compatibility-machine-aware limit on adapter routes.
Halil Pasic [Fri, 9 Dec 2016 12:51:46 +0000 (13:51 +0100)]
s390x: add property adapter_routes_max_batch
To make virtio-ccw support more than 64 virtqueues we will have to
increase ADAPTER_ROUTES_MAX_GSI, which currently limits the number of
possible adapter routes. Of course, increasing the number of supported
routes can break backward migration.
Let us introduce a compatibility property adapter_routes_max_batch so
client code can use the old limit when in compatibility mode and
retain migration compatibility.
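As a rough sketch, such a property could be declared on the flic device
along these lines (the field name, type and default value are
assumptions made for illustration, not the exact QEMU definitions):

    static Property s390_flic_props[] = {
        DEFINE_PROP_UINT32("adapter_routes_max_batch", S390FLICState,
                           adapter_routes_max_batch, ADAPTER_ROUTES_MAX_GSI),
        DEFINE_PROP_END_OF_LIST(),
    };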
Halil Pasic [Thu, 10 Dec 2015 11:55:06 +0000 (12:55 +0100)]
virtio-ccw: Check the number of vqs in CCW_CMD_SET_IND
We cannot support more than 64 virtqueues with the 64 bits provided by
classic indicators. If a driver tries to set up classic indicators
(which it is free to do even for virtio-1 devices) for a device with
more than 64 virtqueues, we should reject the attempt so that the
driver does not end up with an unusable device.
This is in preparation for bumping the number of supported virtqueues
on the ccw transport.
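The constraint itself is small enough to state as a self-contained
check (the helper and constant names here are made up; the real test
lives in the CCW_CMD_SET_IND handler):

    #include <stdbool.h>

    /* Classic indicators are a single 64-bit bitmap, one bit per
     * virtqueue, so a device with more than 64 queues cannot use them. */
    #define CLASSIC_INDICATOR_BITS 64

    static bool classic_indicators_usable(unsigned int num_queues)
    {
        return num_queues <= CLASSIC_INDICATOR_BITS;
    }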
Halil Pasic [Tue, 14 Feb 2017 13:06:07 +0000 (14:06 +0100)]
virtio-ccw: handle virtio 1 only devices
As a preparation for wiring-up virtio-crypto, the first non-transitional
virtio device on the ccw transport, let us introduce a mechanism for
disabling revision 0. This is more or less equivalent to disabling
legacy, as revision 0 is legacy-only and legacy drivers use revision 0
exclusively.
Cornelia Huck [Wed, 25 Jan 2017 15:01:03 +0000 (16:01 +0100)]
s390x/flic: fail migration on source already
Current code puts a 'FLIC_FAILED' marker into the migration stream
to indicate something went wrong while saving flic state and fails
load if it encounters that marker. VMState's put routine recently
gained the ability to return error codes (but this is not wired up
yet).
In order to be able to reap the benefits of returning an error and
failing migration on the source already once this gets wired up
in core, return an error in addition to storing 'FLIC_FAILED'.
Sometimes (e.g. early boot) a guest is broken in such ways that it loops
100% delivering operation exceptions (illegal operation) but the pgm new
PSW is not set properly. This will result in code being read from
address zero, which usually contains another illegal op. Let's detect
this case and put the guest into the crashed state. Instead of only
detecting this for address zero, apply a heuristic that works for any
program check new PSW, so that the guest also reaches the crashed state
if you provide some random ELF file to the -kernel option.
We do not want guest problem state to be able to trigger a guest panic,
e.g. by faulting on an address that is the same as the program check
new PSW, so we check for the problem state bit being off.
With this we
a: get rid of the CPU consumption of such broken guests
b: keep the program old PSW. This allows finding out the original
illegal operation, making debugging such early boot issues much easier
than with single stepping.
This relies on the kernel using a similar heuristic and passing such
operation exceptions to user space.
Cornelia Huck [Thu, 26 Jan 2017 16:47:40 +0000 (17:47 +0100)]
s390x/s390-virtio: get rid of DPRINTF
The DPRINTF approach is likely to introduce bitrot, and the preferred
way for debugging is tracing anyway. Fortunately, there are no users
(left), so nuke it.
Alex Bennée [Mon, 20 Feb 2017 10:51:39 +0000 (10:51 +0000)]
MAINTAINERS: merge Build and test automation with Docker tests
The docker framework is really just another piece in the build
automation puzzle, so let's merge them together. As an added bonus I've
also included the Travis and Patchew status links. The Shippable links
will be added later once mainline tests have been configured and set
up.
Alex Bennée [Mon, 20 Feb 2017 10:51:38 +0000 (10:51 +0000)]
.shippable.yml: new CI provider
Ostensibly Shippable offers a similar set of services to Travis.
However, they are focused on Docker container based workflows, so we
can use our existing containers to run a few extra builds - in this
case a bunch of cross-compiled targets on a Debian multiarch system.
Alex Bennée [Mon, 20 Feb 2017 10:51:37 +0000 (10:51 +0000)]
new: debian docker targets for cross-compiling
This provides a basic Debian install with access to the emdebian cross
compilers. The debian-armhf-cross and debian-arm64-cross targets build
on the basic Debian image to allow cross compiling to those targets.
A new environment variable (QEMU_CONFIGURE_OPTS) is set as part of the
docker container and passed to the build to specify the
--cross-prefix. The user still calls the build in the usual way, for
example:
    make docker-test-build@debian-arm64-cross \
        TARGET_LIST="aarch64-softmmu,aarch64-linux-user"
Alex Bennée [Mon, 20 Feb 2017 10:51:36 +0000 (10:51 +0000)]
tests/docker: add basic user mapping support
Currently all docker builds are done by exporting a tarball to the
docker container and running the build as the container's root user.
Other use cases are possible, however, and it is possible to map part
of the user's file system to the container. This is useful, for
example, for doing cross-builds of arbitrary source trees. For this to
work smoothly the container needs to have a user created that maps
cleanly to the host system.
This adds a -u option to the docker script so that:
util/cutils: Let qemu_strtosz*() optionally reject trailing crap
Change the qemu_strtosz() & friends to return -EINVAL when @endptr is
null and the conversion doesn't consume the string completely.
Matches how qemu_strtol() & friends work.
Only test_qemu_strtosz_simple() passes a null @endptr. No functional
change there, because its conversion consumes the string.
Simplify callers that use @endptr only to fail when it doesn't point
to '\0' to pass a null @endptr instead.
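The convention can be sketched with a hypothetical parse_size() helper
(this is not the qemu_strtosz() prototype, and suffix handling is left
out; it only illustrates the @endptr behaviour described above):

    #include <errno.h>
    #include <stdlib.h>

    static int parse_size(const char *str, const char **endptr,
                          unsigned long long *result)
    {
        char *end;

        errno = 0;
        *result = strtoull(str, &end, 10);
        if (errno) {
            return -errno;
        }
        if (end == str) {
            return -EINVAL;        /* no digits at all */
        }
        if (endptr) {
            *endptr = end;         /* caller inspects what follows */
        } else if (*end != '\0') {
            return -EINVAL;        /* null @endptr: reject trailing junk */
        }
        return 0;
    }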
util/cutils: Rename qemu_strtosz() to qemu_strtosz_MiB()
With qemu_strtosz(), no suffix means mebibytes. It's used rarely.
I'm going to add a similar function where no suffix means bytes.
Rename qemu_strtosz() to qemu_strtosz_MiB() to make the name
qemu_strtosz() available for the new function.
Plenty of code relies on QemuOpt member @str not being null, including
qemu_opts_print(), qemu_opts_to_qdict(), and callbacks passed to
qemu_opt_foreach().
Begs the question whether it can be null. Only opt_set() creates
QemuOpt. It sets member @str to its argument @value. Passing null
for @value would plant a time bomb. Callers:
* opts_do_parse() can't pass null.
* qemu_opt_set() passes its argument @value. Callers:
- qemu_opts_from_qdict_1() can't pass null
- qemu_opts_set() passes its argument @value, but none of its
callers pass null.
- Many more outside qemu-option.c, but they shouldn't pass null,
either.
Assert member @str isn't null, so that misuse is caught right away.
Simplify parse_option_bool(), parse_option_number() and
parse_option_size() accordingly. Best viewed with whitespace changes
ignored.
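As a hedged sketch of what the simplification buys: once the value is
guaranteed non-null, a parser such as the boolean one no longer needs a
null branch (names and error handling are illustrative, not the exact
qemu-option code):

    #include <assert.h>
    #include <stdbool.h>
    #include <string.h>

    static bool parse_bool_value(const char *value, bool *out)
    {
        assert(value);             /* guaranteed where QemuOpt is created */
        if (!strcmp(value, "on")) {
            *out = true;
            return true;
        }
        if (!strcmp(value, "off")) {
            *out = false;
            return true;
        }
        return false;              /* caller reports the invalid parameter */
    }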
This commit creates a board which defaults to having 2GB of RAM.
Unfortunately on 32-bit hosts we can't create boards with 2GB of RAM,
and so 'make check' fails. I missed this during testing of the
merge, unfortunately. Luckily the offending commit is the last
one in the merge request, so we can just revert it for now.
Curiously, unrealize() is not being used, but it seems more
appropriate than handle_destroy() together with realize(): it is the
more common teardown name in the QEMU code base and it can report
errors.
Peter Maydell [Thu, 23 Feb 2017 09:59:40 +0000 (09:59 +0000)]
Merge remote-tracking branch 'remotes/yongbok/tags/mips-20170222' into staging
MIPS patches 2017-02-22
Changes:
* Add MIPS Boston board support
# gpg: Signature made Wed 22 Feb 2017 00:08:00 GMT
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20170222:
hw/mips: MIPS Boston board support
hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller
loader: Support Flattened Image Trees (FIT images)
dtc: Update requirement to v1.4.2
target-mips: Provide function to test if a CPU supports an ISA
hw/mips_gic: Update pin state on mask changes
hw/mips_gictimer: provide API for retrieving frequency
hw/mips_cmgcr: allow GCR base to be moved
XiongZhang [Wed, 22 Feb 2017 20:19:59 +0000 (13:19 -0700)]
vfio/pci-quirks.c: Disable stolen memory for igd VFIO
Regardless of running in UPT or legacy mode, the guest igd
drivers may attempt to use stolen memory; however, only legacy
mode has BIOS support for reserving stolen memory in the guest
VM. We zero out the stolen memory size in all cases, so that the
guest igd driver won't use stolen memory.
In legacy mode, the user can use the x-igd-gms option to specify
the amount of stolen memory that will be pre-allocated and
reserved by the BIOS for igd use.
Since commit 4bb571d857d9 ("pci/pcie: don't assume cap id 0 is
reserved") removes the internal use of extended capability ID 0, the
comment here becomes invalid. However, peeling back the onion, the
code is still correct and we still can't seed the capability chain
with ID 0, unless we want to muck with using the version number to
force the header to be non-zero, which is much uglier to deal with.
The comment also now covers some of the subtleties of using cap ID 0,
such as transparently indicating absence of capabilities if none are
added. This doesn't detract from the correctness of the referenced
commit as vfio in the kernel also uses capability ID zero to mask
capabilities. In fact, we should skip zero capabilities precisely
because the kernel might also expose such a capability at the head
position and re-introduce the problem.
block: Don't bother asserting type of output visitor's output
After a visit of a complex QAPI type FOO
    ov = qobject_output_visitor_new(&foo);
    visit_type_FOO(ov, NULL, expr, &error_abort);
    visit_complete(ov, &foo);
we can safely assume qobject_type(foo) is QTYPE_QDICT. We do in many
places, but occasionally assert qobject_type(obj) == QTYPE_QDICT.
Don't. The appropriate place to check such fundamental properties of
QAPI visitors is the test suite.
check-qdict: Tighten qdict_crumple_test_recursive() some
Consistently check for unexpected QDict entries, and qdict_get_qdict()
success. The latter doesn't tighten the test; it only makes it fail
more nicely.
qdict: Make qdict_get_qlist() safe like qdict_get_qdict()
Commit 89cad9f changed qdict_get_qdict() to return NULL instead of
crash when the key doesn't exist or its value isn't a QDict.
Commit 2d6421a neglected to do the same for qdict_get_qlist().
Correct that, and update the function comments.
Simple unions are simpler than flat unions in the schema, but more
complicated in C and on the QMP wire: there's extra indirection in C
and extra nesting on the wire, both pointless. They're best avoided
in new code.
NetLegacyOptions isn't new, but it's only used internally, not in QMP.
Convert it to a flat union.
Simple unions are simpler than flat unions in the schema, but more
complicated in C and on the QMP wire: there's extra indirection in C
and extra nesting on the wire, both pointless. They're best avoided
in new code.
NumaOptions isn't new, but it's only used internally, not in QMP.
Convert it to a flat union.
Peter Maydell [Tue, 21 Feb 2017 13:33:41 +0000 (13:33 +0000)]
hw/ppc/ppc405_uc.c: Avoid integer overflows
When performing clock calculations, the ppc405_uc code
has several places where it multiplies together two
32-bit variables and assigns the result to a 64-bit
variable. This doesn't quite do what is intended because
C will compute a 32-bit multiply result. Add casts to
ensure we don't truncate the result.
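The pattern being fixed looks like this in isolation (the function name
is made up for illustration):

    #include <stdint.h>

    static uint64_t clock_product(uint32_t freq, uint32_t mult)
    {
        /* uint64_t r = freq * mult;   32-bit multiply, high bits lost */
        uint64_t r = (uint64_t)freq * mult;   /* 64-bit multiply */
        return r;
    }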
Thomas Huth [Wed, 15 Feb 2017 09:21:44 +0000 (10:21 +0100)]
hw/ppc/spapr: Check for valid page size when hot plugging memory
On POWER, the valid page sizes that the guest can use are bound
to the CPU and not to the memory region. QEMU already has some
fancy logic to find out the right maximum memory size to tell
it to the guest during boot (see getrampagesize() in the file
target/ppc/kvm.c for more information).
However, once we're booted and the guest is using huge pages
already, it is currently still possible to hot-plug memory regions
that do not support huge pages - which of course does not work
on POWER, since the guest thinks that it is possible to use huge
pages everywhere. The KVM_RUN ioctl will then abort with -EFAULT,
QEMU spills out a not very helpful error message together with
a register dump and the user is annoyed that the VM unexpectedly
died.
To avoid this situation, we should check the page size of hot-plugged
DIMMs to see whether it is possible to use it in the current VM.
If it does not fit, we can print out a better error message and
refuse to add it, so that the VM does not die unexpectedly and the
user has a second chance to plug a DIMM with a matching memory
backend instead.
Alex Zuepke [Tue, 14 Feb 2017 11:54:29 +0000 (12:54 +0100)]
target-ppc: fix Book-E TLB matching
The Book-E TLB matching process should bail out early when a TLB
entry matches, but the access permissions are wrong. The CPU
will then raise a DSI error instead of a Data TLB error, as
described for TLB matching in Freescale and IBM documents.
Sam Bobroff [Mon, 21 Nov 2016 23:19:38 +0000 (10:19 +1100)]
hw/net/spapr_llan: 6 byte mac address device tree entry
The spapr-vlan device in QEMU has always presented its MAC address in
the device tree as an 8 byte value, even though PAPR requires it to be
6 bytes. This is because, at the time, AIX required the value to be 8
bytes. However, modern versions of AIX support the (correct) 6
byte value so they no longer require the workaround.
It would be neatest to always provide a 6 byte value but that would
cause a problem with old Linux kernel ibmveth drivers, so the old 8
byte value is still presented when necessary.
Since commit 13f85203e (3.10, May 2013) the driver has been able to
handle 6 or 8 byte addresses so versions after that don't need to be
considered specially.
Drivers from kernels before that can also handle either type of
address, but not always:
* If the first byte's lowest bits are 10, the address must be 6 bytes.
* Otherwise, the address must be 8 bytes.
(The two bits in question are significant in a MAC address: they
indicate a locally-administered unicast address.)
So to maintain compatibility the old 8 byte value is presented when
the lowest two bits of the first byte are not 10.
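Expressed as a check (an illustrative helper, not the spapr_llan code):

    #include <stdbool.h>
    #include <stdint.h>

    /* Binary 10 in the two low-order bits of the first octet marks a
     * locally administered unicast address; only then is the 6-byte
     * form presented. */
    static bool use_six_byte_mac(const uint8_t *mac)
    {
        return (mac[0] & 0x03) == 0x02;
    }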
Igor Mammedov [Fri, 10 Feb 2017 10:20:57 +0000 (11:20 +0100)]
machine: replace query_hotpluggable_cpus() callback with has_hotpluggable_cpus flag
The generic helper machine_query_hotpluggable_cpus() replaced the
target-specific query_hotpluggable_cpus() callbacks, so there is no
need for the callback anymore. However, a non-NULL callback value is
used to detect/report hotpluggable CPU support, so the callback cannot
simply be removed.
Replace it with a MachineClass.has_hotpluggable_cpus boolean,
which is sufficient for the task.
All the FOO_query_hotpluggable_cpus() callbacks are practically
the same except for setting vcpus_count to different values.
Convert them to a generic machine_query_hotpluggable_cpus()
callback by moving the vcpus_count initialization into the
machine-specific possible_cpu_arch_ids() callback.
Igor Mammedov [Fri, 10 Feb 2017 10:18:49 +0000 (11:18 +0100)]
spapr: reuse machine->possible_cpus instead of cores[]
Replace SPAPR specific cores[] array with generic
machine->possible_cpus and store core objects there.
This makes core bookkeeping similar to x86 CPUs and
will allow unifying similar code.
It would also allow replacing the cpu_index based NUMA node
mapping with a property based one (for -device created
cores), since possible_cpus carries the board-defined
topology/layout.
Igor Mammedov [Thu, 9 Feb 2017 11:08:34 +0000 (12:08 +0100)]
pc: calculate topology only once when possible_cpus is initialised
Fill in CpuInstanceProperties once at board init time and
just copy them whenever query_hotpluggable_cpus() is called.
This keeps the topology info always available without the need
to recalculate it every time it's used.
Since it contains the NUMA node id, it will be used to keep the
NUMA-node-to-CPU mapping instead of the numa_info[i].node_cpu
bitmasks.
Igor Mammedov [Thu, 9 Feb 2017 11:08:33 +0000 (12:08 +0100)]
pc: move pcms->possible_cpus init out of pc_cpus_init()
possible_cpus can be initialized earlier than the CPU objects,
i.e. when -smp is parsed, so move the init code to the
possible_cpu_arch_ids() interface function and do the initialization
on the first call.
This should help later with making -numa cpu/-smp parsing machine
state properties.
Thomas Huth [Thu, 9 Feb 2017 11:14:41 +0000 (12:14 +0100)]
hw/pci-host/prep: Do not use hw_error() in realize function
hw_error() is for CPU related errors only (it prints out a
register dump and calls abort()), so we should not use it
if we just failed to load the bios image. Apart from that,
realize() functions should not exit directly but always set
the errp with error_setg() in case of errors instead.
Additionally, move some code around and delete the bios memory
subregion again in case of such an error, so that we leave a
clean state when returning to the caller.
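A sketch of the convention being applied (device and image names are
made up; only the errp/error_setg() pattern is the point):

    static void mydev_realize(DeviceState *dev, Error **errp)
    {
        int bios_size = get_image_size("mydev.bin");   /* illustrative */

        if (bios_size < 0) {
            /* undo any partial setup (e.g. delete the subregion), then: */
            error_setg(errp, "Could not load bios image 'mydev.bin'");
            return;               /* report via errp, never exit() here */
        }
        /* ... continue with normal realize ... */
    }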
target/ppc/POWER9: Direct all instr and data storage interrupts to the hypv
The vpm0 bit was removed from the LPCR in POWER9; this bit controlled
whether ISI and DSI interrupts were directed to the hypervisor or the
partition. These interrupts now go to the hypervisor irrespective, thus
it is no longer necessary to check the vpm0 bit in the LPCR.
The logical partitioning control register controls a thread's operation
based on the partition it is currently executing in. Add new
definitions and update the mask used when writing to the LPCR based on
the POWER9 spec.
The DPFD field in the LPCR is 3 bits wide. It has always been defined
as 0x3 << shift, which only covers a 2-bit field and is therefore
incorrect. Correct this.
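For reference, the arithmetic behind the DPFD fix (the shift value is
illustrative and not checked against the actual header):

    #define LPCR_DPFD_SHIFT  52                          /* assumed shift */
    #define LPCR_DPFD_OLD    (0x3ULL << LPCR_DPFD_SHIFT) /* covers 2 bits */
    #define LPCR_DPFD        (0x7ULL << LPCR_DPFD_SHIFT) /* full 3-bit field */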
Bharata B Rao [Fri, 10 Feb 2017 07:23:09 +0000 (12:53 +0530)]
target-ppc: Add xscvqpudz and xscvqpuwz instructions
xscvqpudz: VSX Scalar truncate & Convert Quad-Precision format to
Unsigned Doubleword format
xscvqpuwz: VSX Scalar truncate & Convert Quad-Precision format to
Unsigned Word format
Bharata B Rao [Fri, 10 Feb 2017 07:23:08 +0000 (12:53 +0530)]
target-ppc: Implement round to odd variants of quad FP instructions
xsaddqpo: VSX Scalar Add Quad-Precision using round to Odd
xsmulqpo: VSX Scalar Multiply Quad-Precision using round to Odd
xsdivqpo: VSX Scalar Divide Quad-Precision using round to Odd
xscvqpdpo: VSX Scalar round & Convert Quad-Precision format to
Double-Precision format using round to Odd
xssqrtqpo: VSX Scalar Square Root Quad-Precision using round to Odd
xssubqpo: VSX Scalar Subtract Quad-Precision using round to Odd
In addition, fix the invalid bitmask in the instruction encoding
of xssqrtqp[o].
Bharata B Rao [Fri, 10 Feb 2017 07:23:05 +0000 (12:53 +0530)]
softfloat: Add round-to-odd rounding mode
Power ISA 3.0 introduces a few quadruple precision floating point
instructions that support the round-to-odd rounding mode. The
round-to-odd mode is explained as follows:
Let Z be the intermediate arithmetic result or the operand of a convert
operation. If Z can be represented exactly in the target format, the
result is Z. Otherwise the result is either Z1 or Z2 whichever is odd.
Here Z1 and Z2 are the next larger and smaller numbers representable
in the target format respectively.
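A self-contained sketch of the idea on a fixed-point fraction being
narrowed (this is not the softfloat implementation; it just shows the
"truncate, then set the low bit if anything was lost" behaviour, for
0 < shift < 64):

    #include <stdint.h>

    static uint64_t round_to_odd(uint64_t frac, unsigned int shift)
    {
        uint64_t lost = frac & ((1ULL << shift) - 1);  /* discarded bits */
        uint64_t result = frac >> shift;

        if (lost) {
            result |= 1;        /* inexact: force the result to be odd */
        }
        return result;
    }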
Sam Bobroff [Tue, 7 Feb 2017 03:21:39 +0000 (14:21 +1100)]
target-ppc, tcg: fix usermode segfault with pthread_create()
Programs run under qemu-ppc64 on an x86_64 host currently segfault
if they use pthread_create() due to the adjustment made to the NIP in
commit bd6fefe71cec5a0c7d2be4ac96307f25db56abf9.
This patch changes cpu_loop() to set the NIP back to the
pre-incremented value before calling do_syscall(), which causes the
correct address to be used for the new thread and corrects the fault.
Sam Bobroff [Tue, 7 Feb 2017 02:56:44 +0000 (13:56 +1100)]
spapr: fix off-by-one error in spapr_ovec_populate_dt()
The last byte of the option vector was missing due to an off-by-one
error. Without this fix, client architecture support negotiation will
fail because the last byte of option vector 5, which contains the MMU
support, will be missed.