The USB_EHCI entry currently include PCI code. Since the EHCI
implementation is already split in sysbus/PCI, add a new
USB_EHCI_PCI. There are no logical changes, but the Kconfig
dependencies tree is cleaner.
Paolo Bonzini [Fri, 17 May 2019 13:19:00 +0000 (15:19 +0200)]
checkpatch: detect doubly-encoded UTF-8
Copy and pasting from Thunderbird's "view source" window results in double
encoding of multibyte UTF-8 sequences. The appearance of those sequences is
very peculiar, so detect it and give an error despite the (low) possibility
of false positives.
As the major offender, I am also adding the same check to my applypatch-msg
and commit-msg hooks, but this will also cause patchew to croak loudly when
this mistake happens.
Paolo Bonzini [Fri, 12 Jul 2019 17:34:35 +0000 (19:34 +0200)]
util: merge main-loop.c and iohandler.c
main-loop.c has a dependency on iohandler.c, and everything breaks
if that dependency is instead satisfied by stubs/iohandler.c.
Just put everything in the same file to avoid strange dependencies
on the order of files in util-obj-y.
King Wang [Fri, 12 Jul 2019 06:52:41 +0000 (14:52 +0800)]
memory: unref the memory region in simplify flatview
The memory region reference is increased when insert a range
into flatview range array, then decreased by destroy flatview.
If some flat range merged by flatview_simplify, the memory region
reference can not be decreased by destroy flatview any more.
Paolo Bonzini [Tue, 2 Jul 2019 09:40:41 +0000 (11:40 +0200)]
iscsi: base all handling of check condition on scsi_sense_to_errno
Now that scsi-disk is not using scsi_sense_to_errno to separate guest-recoverable
sense codes, we can modify it to simplify iscsi's own sense handling.
Paolo Bonzini [Tue, 2 Jul 2019 08:01:03 +0000 (10:01 +0200)]
scsi: add guest-recoverable ZBC errors
When running basic operations on zoned storage from the guest via
scsi-block, the following ASCs are reported for write or read commands
due to unexpected zone status or write pointer status:
21h 04h: UNALIGNED WRITE COMMAND
21h 05h: WRITE BOUNDARY VIOLATION
21h 06h: ATTEMPT TO READ INVALID DATA
55h 0Eh: INSUFFICIENT ZONE RESOURCES
Reporting these ASCs to the guest, the user applications can handle
them to manage zone/write pointer status, or help the user application
developers to understand the failure reason and fix bugs.
Paolo Bonzini [Tue, 2 Jul 2019 08:23:20 +0000 (10:23 +0200)]
scsi: explicitly list guest-recoverable sense codes
It's not really possible to fit all sense codes into errno codes,
especially in such a way that sense codes can be properly categorized as
either guest-recoverable or host-handled. Create a new function that
checks for guest recoverable sense, then scsi_sense_buf_to_errno only
needs to be called for host handled sense codes.
scsi-disk: pass sense correctly for guest-recoverable errors
When an error was passed down to the guest because it was recoverable,
the sense length was not copied from the SG_IO data. As a result,
the guest saw the CHECK CONDITION status but not the sense data.
Peter Maydell [Fri, 12 Jul 2019 16:34:13 +0000 (17:34 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc, pci: fixes, cleanups, tests
A bunch of fixes all over the place.
ACPI tests will now run on more systems: might
introduce new failure reports but that's for
the best, isn't it?
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Fri 12 Jul 2019 15:57:40 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>" [full]
# gpg: aka "Michael S. Tsirkin <[email protected]>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio pmem: remove transitional names
virtio pmem: remove memdev null check
virtio pmem: fix wrong mem region condition
tests: acpi: do not skip tests when IASL is not installed
tests: acpi: do not require IASL for dumping AML blobs
virtio-balloon: fix QEMU 4.0 config size migration incompatibility
pcie: consistent names for function args
xio3130_downstream: typo fix
Coverity reports that when we're assigning vi->size we handle the
"pmem->memdev is NULL" case; but we then pass it into
object_get_canonical_path(), which unconditionally dereferences it
and will crash if it is NULL. If this pointer can be NULL then we
need to do something else here.
We are removing 'pmem->memdev' null check here as memdev will never
be null in this function.
Igor Mammedov [Mon, 8 Jul 2019 09:24:10 +0000 (05:24 -0400)]
tests: acpi: do not skip tests when IASL is not installed
tests do binary comparision so we can check tables without
IASL. Move IASL condition right before decompilation step
and skip it if IASL is not installed.
The virtio-balloon config size changed in QEMU 4.0 even for existing
machine types. Migration from QEMU 3.1 to 4.0 can fail in some
circumstances with the following error:
This happens because the virtio-balloon config size affects the VIRTIO
Legacy I/O Memory PCI BAR size.
Introduce a qdev property called "qemu-4-0-config-size" and enable it
only for the QEMU 4.0 machine types. This way <4.0 machine types use
the old size, 4.0 uses the larger size, and >4.0 machine types use the
appropriate size depending on enabled virtio-balloon features.
Live migration to and from old QEMUs to QEMU 4.1 works again as long as
a versioned machine type is specified (do not use just "pc"!).
The function declarations for pci_cap_slot_get and
pci_cap_slot_write_config call the argument "slot_ctl", but the function
definitions and all the call sites drop the 'o' and call it "slt_ctl".
Let's be consistent.
file-posix: Use max transfer length/segment count only for SCSI passthrough
Regular kernel block devices (/dev/sda*, /dev/nvme*, etc) don't have
max segment size/max segment count hardware requirements exposed
to the userspace, but rather the kernel block layer
takes care to split the incoming requests that
violate these requirements.
Allowing the kernel to do the splitting allows qemu to avoid
various overheads that arise otherwise from this.
This is especially visible in nbd server,
exposing as a raw file, a mostly empty qcow2 image over the net.
In this case most of the reads by the remote user
won't even hit the underlying kernel block device,
and therefore most of the overhead will be in the
nbd traffic which increases significantly with lower max transfer size.
In addition to that even for local block device
access the peformance improves a bit due to less
traffic between qemu and the kernel when large
transfer sizes are used (e.g for image conversion)
More info can be found at:
https://bugzilla.redhat.com/show_bug.cgi?id=1647104
Peter Maydell [Fri, 12 Jul 2019 10:06:48 +0000 (11:06 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190712' into staging
ppc patch queue for 2019-07-12
First 4.1 hard freeze pull request. Not much here, just a bug fix for
the XICS interrupt controller and a SLOF firmware update to fix a bug
with IP discovery when there are multiple NICs.
Greg Kurz [Wed, 3 Jul 2019 17:22:20 +0000 (19:22 +0200)]
xics/kvm: Always set the MASKED bit if interrupt is masked
The ics_set_kvm_state_one() function is called either to restore the
state of an interrupt source during migration or to set the interrupt
source to a default state during reset.
Since always, ie. 2013, the code only sets the MASKED bit if the 'current
priority' and the 'saved priority' are different. This is likely true
when restoring an interrupt that had been previously masked with the
ibm,int-off RTAS call. However this is always false in the case of
reset since both 'current priority' and 'saved priority' are equal to
0xff, and the MASKED bit is never set.
The legacy KVM XICS device gets away with that because it ends updating
its internal structure the same way, whether the MASKED bit is set or
the priority is 0xff.
The XICS-on-XIVE device for POWER9 is different. It sticks to the KVM
documentation [1] and _really_ relies on the MASKED bit to correctly
set. If not, it will configure the interrupt source in the XIVE HW, even
though the guest hasn't configured the interrupt yet. This disturbs the
complex logic implemented in XICS-on-XIVE and may result in the loss of
subsequent queued events.
Always set the MASKED bit if interrupt is masked as expected by the KVM
XICS-on-XIVE device. This has no impact on the legacy KVM XICS.
Peter Maydell [Thu, 11 Jul 2019 09:03:42 +0000 (10:03 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-gdbstub-100719-1' into staging
Testing and gdbstub fixes:
- fix diff-out pass in check-tcg
- ensure generation of fprem reference
- fix gdb set_reg fallback
# gpg: Signature made Wed 10 Jul 2019 11:24:28 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-and-gdbstub-100719-1:
gdbstub: revert to previous set_reg behaviour
gdbstub: add some notes to the header comment
tests/tcg: fix diff-out pass to properly report failure
tests/tcg: fix up test-i386-fprem.ref generation
John Snow [Wed, 10 Jul 2019 19:08:07 +0000 (15:08 -0400)]
docs/bitmaps: use QMP lexer instead of json
The annotated style json we use in QMP documentation is not strict json
and depending on the version of Sphinx (2.0+) or Pygments installed,
might cause the build to fail.
Use the new QMP lexer.
Further, some versions of Sphinx can not apply custom lexers to "code"
directives and require the use of "code-block" directives instead, so
make that change at this time as well.
Tested under:
- Sphinx 1.3.6 and Pygments 2.4
- Sphinx 1.7.6 and Pygments 2.2 (Fedora 29 packages)
- Sphinx 2.0.1 and Pygments 2.4
- Sphinx 3.0.0+/f396b3a783 and Pygments 2.4 (From Sphinx git c4f44bdd)
John Snow [Wed, 10 Jul 2019 19:08:06 +0000 (15:08 -0400)]
sphinx: add qmp_lexer
Sphinx, through Pygments, does not like annotated json examples very
much. In some versions of Sphinx (1.7), it will render the non-json
portions of code blocks in red, but in newer versions (2.0) it will
throw an exception and not highlight the block at all. Though we can
suppress this warning, it doesn't bring back highlighting on non-strict
json blocks.
We can alleviate this by creating a custom lexer for QMP examples that
allows us to properly highlight these examples in a robust way, keeping
our directionality and elision notations.
Alex Bennée [Fri, 5 Jul 2019 13:23:07 +0000 (14:23 +0100)]
gdbstub: revert to previous set_reg behaviour
The refactoring of handle_set_reg missed the fact we previously had
responded with an empty packet when we were not using XML based
protocols. This broke the fallback behaviour for architectures that
don't have registers defined in QEMU's gdb-xml directory.
Revert to the previous behaviour and clean up the commentary for what
is going on.
Alex Bennée [Fri, 5 Jul 2019 11:56:35 +0000 (12:56 +0100)]
tests/tcg: fix diff-out pass to properly report failure
A side effect of piping the output to head is squash the exit status
of the diff command. Fix this by only doing the pipe if the diff
failed and then ensuring the status is non-zero.
Alex Bennée [Fri, 5 Jul 2019 10:48:02 +0000 (11:48 +0100)]
tests/tcg: fix up test-i386-fprem.ref generation
We never shipped the reference data in the source tree because it's
quite big (64M). As a result the only option is to generate it
locally. Although we have a rule to generate the reference file we
missed the dependency and location changes, probably because it's only
run for SLOW test runs.
Makefile: Fix "make clean" in "unconfigured" source directory
Recent commit "Makefile: Reuse all's recursion machinery for clean and
install" broke targets clean and distclean in the source directory
before running configure:
$ make clean
LD recurse-clean.mo
cc: fatal error: no input files
compilation terminated.
make: *** [rules.mak:118: recurse-clean.mo] Error 1
Stephen Checkoway noticed commit 3ae0343db69 is incorrect.
This commit state all parallel flashes are limited to 16-bit
accesses, however the x32 configuration exists in some models,
such the Cypress S29CL032J, which CFI Device Geometry Definition
announces:
CFI ADDR DATA
0x28,0x29 = 0x0003 (x32-only asynchronous interface)
Guests should not be affected by the previous change, because
QEMU does not announce itself as x32 capable:
Commit 3ae0343db69 does not restrict the bus to 16-bit accesses,
but restrict the implementation as 16-bit access max, so a guest
32-bit access will result in 2x 16-bit calls.
Now, we have 2 boards that register the flash device in 32-bit
access:
- PPC: taihu_405ep
The CFI id matches the S29AL008J that is a 1MB in x16, while
the code QEMU forces it to be 2MB, and checking Linux it expects
a 4MB flash.
- ARM: Digic4
While the comment says "Samsung K8P3215UQB 64M Bit (4Mx16)",
this flash is 32Mb (2MB). Also note the CFI id does not match
the comment.
To avoid unexpected side effect, we revert commit 3ae0343db69,
and will clean the board code later.
* remotes/cohuck/tags/s390x-20190709:
s390x/tcg: move fallthrough annotation
s390: cpumodel: fix description for the new vector facility
s390x/cpumodel: Set up CPU model for AQIC interception
vfio-ccw: Test vfio_set_irq_signaling() return value
s390: cpumodel: fix description for the new vector facility
The new facility is called "Vector-Packed-Decimal-Enhancement Facility"
and not "Vector BCD enhancements facility 1". As the shortname might
have already found its way into some backports, let's keep vxbeh.
Alistair Francis [Thu, 20 Jun 2019 14:04:18 +0000 (07:04 -0700)]
tcg/riscv: Fix RISC-VH host build failure
Commit 269bd5d8 "cpu: Move the softmmu tlb to CPUNegativeOffsetState'
broke the RISC-V host build as there are two variables that are used but
not defined.
This patch renames the undefined variables mask_off and table_off to the
existing (but unused) mask_ofs and table_ofs variables.
Peter Maydell [Mon, 8 Jul 2019 16:40:05 +0000 (17:40 +0100)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2019-07-08-1' into staging
Merge tpm 2019/07/08 v1
# gpg: Signature made Mon 08 Jul 2019 15:04:46 BST
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2019-07-08-1:
hw/tpm: Only build tpm_ppi.o if any of TPM_TIS/TPM_CRB is built
Peter Maydell [Mon, 8 Jul 2019 14:21:20 +0000 (15:21 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- virtio-scsi: Fix request resubmission after I/O error with iothreads
- qcow2: Fix missing v2/v3 subformat aliases for amend
- qcow(1): More specific error message for wrong format version
- MAINTAINERS: update RBD block maintainer
# gpg: Signature made Mon 08 Jul 2019 15:17:27 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
qcow2: Allow -o compat=v3 during qemu-img amend
MAINTAINERS: update RBD block maintainer
block/qcow: Improve error when opening qcow2 files as qcow
virtio-scsi: restart DMA after iothread
qdev: add qdev_add_vm_change_state_handler()
vl: add qemu_add_vm_change_state_handler_prio()
Eric Blake [Fri, 5 Jul 2019 15:28:12 +0000 (10:28 -0500)]
qcow2: Allow -o compat=v3 during qemu-img amend
Commit b76b4f60 allowed '-o compat=v3' as an alias for the
less-appealing '-o compat=1.1' for 'qemu-img create' since we want to
use the QMP form as much as possible, but forgot to do likewise for
qemu-img amend. Also, it doesn't help that '-o help' doesn't list our
new preferred spellings.
Stefan Hajnoczi [Thu, 20 Jun 2019 17:37:09 +0000 (18:37 +0100)]
virtio-scsi: restart DMA after iothread
When the 'cont' command resumes guest execution the vm change state
handlers are invoked. Unfortunately there is no explicit ordering
between classic qemu_add_vm_change_state_handler() callbacks. When two
layers of code both use vm change state handlers, we don't control which
handler runs first.
virtio-scsi with iothreads hits a deadlock when a failed SCSI command is
restarted and completes before the iothread is re-initialized.
This patch uses the new qdev_add_vm_change_state_handler() API to
guarantee that virtio-scsi's virtio change state handler executes before
the SCSI bus children. This way DMA is restarted after the iothread has
re-initialized.
Stefan Hajnoczi [Thu, 20 Jun 2019 17:37:08 +0000 (18:37 +0100)]
qdev: add qdev_add_vm_change_state_handler()
Children sometimes depend on their parent's vm change state handler
having completed. Add a vm change state handler API for devices that
guarantees tree depth ordering.
* remotes/pmaydell/tags/pull-target-arm-20190708:
target/arm/vfp_helper: Call set_fpscr_to_host before updating to FPSCR
hw/arm/sbsa-ref: Remove unnecessary check for secure_sysmem == NULL
tests/migration-test: Fix read off end of aarch64_kernel array
target/arm: Fix sve_zcr_len_for_el
target/arm/vfp_helper: Call set_fpscr_to_host before updating to FPSCR
In commit e9d652824b0 we extracted the vfp_set_fpscr_to_host()
function but failed at calling it in the correct place, we call
it after xregs[ARM_VFP_FPSCR] is modified.
Fix by calling this function before we update FPSCR.
Peter Maydell [Mon, 8 Jul 2019 13:11:31 +0000 (14:11 +0100)]
hw/arm/sbsa-ref: Remove unnecessary check for secure_sysmem == NULL
In the virt machine, we support TrustZone being either present or
absent, and so the code must deal with the secure_sysmem pointer
possibly being NULL. In the sbsa-ref machine, TrustZone is always
present, but some code and comments copied from virt still treat
it as possibly not being present.
This causes Coverity to complain (CID 1407287) that we check
secure_sysmem for being NULL after an unconditional dereference.
Simplify the code so that instead of initializing the variable
to NULL, unconditionally assigning it, and then testing it for NULL,
we just initialize it correctly in the variable declaration and
then assume it to be non-NULL. We also delete a comment which
only applied to the non-TrustZone config.
Peter Maydell [Mon, 8 Jul 2019 13:11:31 +0000 (14:11 +0100)]
tests/migration-test: Fix read off end of aarch64_kernel array
The test aarch64 kernel is in an array defined with
unsigned char aarch64_kernel[] = { [...] }
which means it could be any size; currently it's quite small.
However we write it to a file using init_bootfile(), which
writes exactly 512 bytes to the file. This will break if
we ever end up with a kernel larger than that, and will
read garbage off the end of the array in the current setup
where the kernel is smaller.
Make init_bootfile() take an argument giving the length of
the data to write. This allows us to use it for all architectures
(previously s390 had a special-purpose init_bootfile_s390x
which hardcoded the file to write so it could write the
correct length). We assert that the x86 bootfile really is
exactly 512 bytes as it should be (and as we were previously
just assuming it was).
This was detected by the clang-7 asan:
==15607==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a796f51d20 at pc 0x55a796b89c2f bp 0x7ffc58e89160 sp 0x7ffc58e88908
READ of size 512 at 0x55a796f51d20 thread T0
#0 0x55a796b89c2e in fwrite (/home/petmay01/linaro/qemu-from-laptop/qemu/build/sanitizers/tests/migration-test+0xb0c2e)
#1 0x55a796c46492 in init_bootfile /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:99:5
#2 0x55a796c46492 in test_migrate_start /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:593
#3 0x55a796c44101 in test_baddest /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:854:9
#4 0x7f906ffd3cc9 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72cc9)
#5 0x7f906ffd3bfa (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72bfa)
#6 0x7f906ffd3bfa (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72bfa)
#7 0x7f906ffd3ea1 in g_test_run_suite (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72ea1)
#8 0x7f906ffd3ec0 in g_test_run (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72ec0)
#9 0x55a796c43707 in main /home/petmay01/linaro/qemu-from-laptop/qemu/tests/migration-test.c:1187:11
#10 0x7f906e9abb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
#11 0x55a796b6c2d9 in _start (/home/petmay01/linaro/qemu-from-laptop/qemu/build/sanitizers/tests/migration-test+0x932d9)
Pierre Morel [Fri, 5 Jul 2019 15:32:49 +0000 (17:32 +0200)]
s390x/cpumodel: Set up CPU model for AQIC interception
Let's add support for the AP-Queue interruption facility to the CPU
model.
The S390_FEAT_AP_QUEUE_INTERRUPT_CONTROL, CPU facility indicates
whether the PQAP instruction with the AQIC command is available
to the guest.
This feature will be enabled only if the AP instructions are
available on the linux host and AQIC facility is installed on
the host.
This feature must be turned on from userspace to intercept AP
instructions on the KVM guest. The QEMU command line to turn
this feature on looks something like this:
qemu-system-s390x ... -cpu xxx,apqi=on ...
or
... -cpu host
Right now AP pass-through devices do not support migration,
which means that we do not have to take care of migrating
the interrupt data:
virsh migrate apguest --live qemu+ssh://[email protected]/system
error: Requested operation is not valid: domain has assigned non-USB host devices
Alex Williamson [Tue, 2 Jul 2019 19:41:34 +0000 (13:41 -0600)]
vfio-ccw: Test vfio_set_irq_signaling() return value
Coverity doesn't like that most callers of vfio_set_irq_signaling() check
the return value and doesn't understand the equivalence of testing the
error pointer instead. Test the return value consistently.
* remotes/bonzini/tags/for-upstream:
ioapic: use irq number instead of vector in ioapic_eoi_broadcast
hw/i386: Fix linker error when ISAPC is disabled
Makefile: generate header file with the list of devices enabled
target/i386: kvm: Fix when nested state is needed for migration
minikconf: do not include variables from MINIKCONF_ARGS in config-all-devices.mak
target/i386: fix feature check in hyperv-stub.c
ioapic: clear irq_eoi when updating the ioapic redirect table entry
intel_iommu: Fix unexpected unmaps during global unmap
intel_iommu: Fix incorrect "end" for vtd_address_space_unmap
i386/kvm: Fix build with -m32
checkpatch: do not warn for multiline parenthesized returned value
pc: fix possible NULL pointer dereference in pc_machine_get_device_memory_region_size()
Peter Maydell [Mon, 8 Jul 2019 08:46:19 +0000 (09:46 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine and x86 queue, 2019-07-05
* CPU die topology support (Like Xu)
* Deprecation of features (Igor Mammedov):
* 'mem' parameter of '-numa node' option
* implict memory distribution between NUMA nodes
* deprecate -mem-path fallback to anonymous RAM
* x86 versioned CPU models (Eduardo Habkost)
* SnowRidge CPU model (Paul Lai)
* Add deprecation information to query-machines (Eduardo Habkost)
* Other i386 fixes
* remotes/ehabkost/tags/machine-next-pull-request: (42 commits)
tests: use -numa memdev option in tests instead of legacy 'mem' option
numa: allow memory-less nodes when using memdev as backend
numa: Make deprecation warnings conditional on !qtest_enabled()
i386: Add Cascadelake-Server-v2 CPU model
docs: Deprecate CPU model runnability guarantees
i386: Make unversioned CPU models be aliases
i386: Replace -noTSX, -IBRS, -IBPB CPU models with aliases
i386: Define -IBRS, -noTSX, -IBRS versions of CPU models
i386: Register versioned CPU models
i386: Get model-id from CPU object on "-cpu help"
i386: Add x-force-features option for testing
qmp: Add "alias-of" field to query-cpu-definitions
i386: Introduce SnowRidge CPU model
qmp: Add deprecation information to query-machines
vl.c: Add -smp, dies=* command line support and update doc
machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
target/i386: Add CPUID.1F generation support for multi-dies PCMachine
i386: Remove unused host_cpudef variable
x86/cpu: use FeatureWordArray to define filtered_features
i386: make 'hv-spinlocks' a regular uint32 property
...
(qemu) /home/test/qemu5/qemu/hw/intc/ioapic.c:266:27: runtime error: index 35 out of bounds for type 'int [24]'
=================================================================
==113504==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61b000003114 at pc 0x5579e3c7a80f bp 0x7fd004bf8c10 sp 0x7fd004bf8c00
WRITE of size 4 at 0x61b000003114 thread T4
#0 0x5579e3c7a80e in ioapic_eoi_broadcast /home/test/qemu5/qemu/hw/intc/ioapic.c:266
#1 0x5579e3c6f480 in apic_eoi /home/test/qemu5/qemu/hw/intc/apic.c:428
#2 0x5579e3c720a7 in apic_mem_write /home/test/qemu5/qemu/hw/intc/apic.c:802
#3 0x5579e3b1e31a in memory_region_write_accessor /home/test/qemu5/qemu/memory.c:503
#4 0x5579e3b1e6a2 in access_with_adjusted_size /home/test/qemu5/qemu/memory.c:569
#5 0x5579e3b28d77 in memory_region_dispatch_write /home/test/qemu5/qemu/memory.c:1497
#6 0x5579e3a1b36b in flatview_write_continue /home/test/qemu5/qemu/exec.c:3323
#7 0x5579e3a1b633 in flatview_write /home/test/qemu5/qemu/exec.c:3362
#8 0x5579e3a1bcb1 in address_space_write /home/test/qemu5/qemu/exec.c:3452
#9 0x5579e3a1bd03 in address_space_rw /home/test/qemu5/qemu/exec.c:3463
#10 0x5579e3b8b979 in kvm_cpu_exec /home/test/qemu5/qemu/accel/kvm/kvm-all.c:2045
#11 0x5579e3ae4499 in qemu_kvm_cpu_thread_fn /home/test/qemu5/qemu/cpus.c:1287
#12 0x5579e4cbdb9f in qemu_thread_start util/qemu-thread-posix.c:502
#13 0x7fd0146376da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
#14 0x7fd01436088e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e
This is because in ioapic_eoi_broadcast function, we uses 'vector' to
index the 's->irq_eoi'. To fix this, we should uses the irq number.
Liran Alon [Mon, 24 Jun 2019 23:05:14 +0000 (02:05 +0300)]
target/i386: kvm: Fix when nested state is needed for migration
When vCPU is in VMX operation and enters SMM mode,
it temporarily exits VMX operation but KVM maintained nested-state
still stores the VMXON region physical address, i.e. even when the
vCPU is in SMM mode then (nested_state->hdr.vmx.vmxon_pa != -1ull).
Therefore, there is no need to explicitly check for
KVM_STATE_NESTED_SMM_VMXON to determine if it is necessary
to save nested-state as part of migration stream.
Paolo Bonzini [Mon, 24 Jun 2019 18:18:46 +0000 (20:18 +0200)]
minikconf: do not include variables from MINIKCONF_ARGS in config-all-devices.mak
When minikconf writes config-devices.mak, it includes all variables including
those from MINIKCONF_ARGS. This causes values from config-host.mak to "stick" to
the ones used in generating config-devices.mak, because config-devices.mak is
included after config-host.mak. Avoid this by omitting assignments coming
from the command line in the output of minikconf.
The reason was the conversion of cpu->hyperv_synic to
cpu->hyperv_synic_kvm_only although the rest of the patch introduces a
feature checking mechanism. So I've fixed the KVM_EXIT_HYPERV_SYNIC in
hyperv-stub to do the same feature check as in the real hyperv.c
vtd_address_space_unmap() will do proper page mask alignment to make
sure each IOTLB message will have correct masks for notification
messages (2^N-1), but sometimes it can be expanded to even supercede
the registered range. That could lead to unexpected UNMAP of already
mapped regions in some other notifiers.
Instead of doing mindless expension of the start address and address
mask, we split the range into smaller ones and guarantee that each
small range will have correct masks (2^N-1) and at the same time we
should also try our best to generate as less IOTLB messages as
possible.
Max Reitz [Mon, 24 Jun 2019 19:39:13 +0000 (21:39 +0200)]
i386/kvm: Fix build with -m32
find_next_bit() takes a pointer of type "const unsigned long *", but the
first argument passed here is a "uint64_t *". These types are
incompatible when compiling qemu with -m32.
Igor Mammedov [Tue, 2 Jul 2019 14:07:44 +0000 (10:07 -0400)]
numa: allow memory-less nodes when using memdev as backend
QEMU fails to start if memory-less node is present when memdev
is used
qemu-system-x86_64 -object memory-backend-ram,id=ram0,size=128M \
-numa node -numa node,memdev=ram0
with error:
"memdev option must be specified for either all or no nodes"
which works as expected if legacy 'mem' is used.
Fix check to make memory-less nodes valid when memdev option is used
but still disallow mix of mem and memdev options.
numa: Make deprecation warnings conditional on !qtest_enabled()
This will help us avoid spurious warnings during "make check".
Note that this will silence the warnings generated by
tests/numa-test, but not the ones generated by
tests/bios-tables-test. We still need to change
tests/bios-tables-test to use "-numa ...,memdev=" to silence
these warnings.
Eduardo Habkost [Fri, 28 Jun 2019 00:28:44 +0000 (21:28 -0300)]
i386: Add Cascadelake-Server-v2 CPU model
Add new version of Cascadelake-Server CPU model, setting
stepping=5 and enabling the IA32_ARCH_CAPABILITIES MSR
with some flags.
The new feature will introduce a new host software requirement,
breaking our CPU model runnability promises. This means we can't
enable the new CPU model version by default in QEMU 4.1, because
management software isn't ready yet to resolve CPU model aliases.
This is why "pc-*-4.1" will keep returning Cascadelake-Server-v1
if "-cpu Cascadelake-Server" is specified.
Includes a test case to ensure the right combinations of
machine-type + CPU model + command-line feature flags will work
as expected.
Eduardo Habkost [Fri, 28 Jun 2019 00:28:42 +0000 (21:28 -0300)]
i386: Make unversioned CPU models be aliases
This will make unversioned CPU models behavior depend on the
machine type:
* "pc-*-4.0" and older will not report them as aliases.
This is done to keep compatibility with older QEMU versions
after management software starts translating aliases.
* "pc-*-4.1" will translate unversioned CPU models to -v1.
This is done to keep compatibility with existing management
software, that still relies on CPU model runnability promises.
* "none" will translate unversioned CPU models to their latest
version. This is planned become the default in future machine
types (probably in pc-*-4.3).
Eduardo Habkost [Fri, 28 Jun 2019 00:28:39 +0000 (21:28 -0300)]
i386: Register versioned CPU models
Add support for registration of multiple versions of CPU models.
The existing CPU models will be registered with a "-v1" suffix.
The -noTSX, -IBRS, and -IBPB CPU model variants will become
versions of the original models in a separate patch, so
make sure we register no versions for them.
Eduardo Habkost [Fri, 28 Jun 2019 00:28:38 +0000 (21:28 -0300)]
i386: Get model-id from CPU object on "-cpu help"
When introducing versioned CPU models, the string at
X86CPUDefinition::model_id might not be the model-id we'll really
use. Instantiate a CPU object and check the model-id property on
"-cpu help"
Eduardo Habkost [Fri, 28 Jun 2019 00:28:37 +0000 (21:28 -0300)]
i386: Add x-force-features option for testing
Add a new option that can be used to disable feature flag
filtering. This will allow CPU model compatibility test cases to
work without host hardware dependencies.
Paul Lai [Wed, 26 Jun 2019 16:21:29 +0000 (09:21 -0700)]
i386: Introduce SnowRidge CPU model
SnowRidge CPU supports Accelerator Infrastrcture Architecture (MOVDIRI,
MOVDIR64B), CLDEMOTE and SPLIT_LOCK_DISABLE.
MOVDIRI, MOVDIR64B, and CLDEMOTE are found via CPUID.
The availability of SPLIT_LOCK_DISABLE is check via msr access
References can be found in either:
https://software.intel.com/en-us/articles/intel-sdm
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-and-future-features-programming-reference
Eduardo Habkost [Sat, 8 Jun 2019 23:34:47 +0000 (20:34 -0300)]
qmp: Add deprecation information to query-machines
Export machine type deprecation status through the query-machines
QMP command. With this, libvirt and management software will be
able to show this information to users and/or suggest changes to
VM configuration to avoid deprecated machines.
Like Xu [Thu, 20 Jun 2019 05:45:25 +0000 (13:45 +0800)]
vl.c: Add -smp, dies=* command line support and update doc
For PC target, users could configure the number of dies per one package
via command line with this patch, such as "-smp dies=2,cores=4".
The parsing rules of new cpu-topology model obey the same restrictions/logic
as the legacy socket/core/thread model especially on missing values computing.
Like Xu [Thu, 20 Jun 2019 05:45:24 +0000 (13:45 +0800)]
machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
To make smp_parse() more flexible and expansive, a smp_parse function
pointer is added to MachineClass that machine types could override.
The generic smp_parse() code in vl.c is moved to hw/core/machine.c, and
become the default implementation of MachineClass::smp_parse. A PC-specific
function called pc_smp_parse() has been added to hw/i386/pc.c, which in
this patch changes nothing against the default one .
Like Xu [Thu, 20 Jun 2019 05:45:23 +0000 (13:45 +0800)]
target/i386: Add CPUID.1F generation support for multi-dies PCMachine
The CPUID.1F as Intel V2 Extended Topology Enumeration Leaf would be
exposed if guests want to emulate multiple software-visible die within
each package. Per Intel's SDM, the 0x1f is a superset of 0xb, thus they
can be generated by almost same code as 0xb except die_offset setting.
If the number of dies per package is greater than 1, the cpuid_min_level
would be adjusted to 0x1f regardless of whether the host supports CPUID.1F.
Likewise, the CPUID.1F wouldn't be exposed if env->nr_dies < 2.
Roman Kagan [Tue, 18 Jun 2019 11:07:06 +0000 (11:07 +0000)]
i386: make 'hv-spinlocks' a regular uint32 property
X86CPU.hv-spinlocks is a uint32 property that has a special setter
validating the value to be no less than 0xFFF and no bigger than
UINT_MAX. The latter check is redundant; as for the former, there
appears to be no reason to prohibit the user from setting it to a lower
value.
So nuke the dedicated getter/setter pair and convert 'hv-spinlocks' to a
regular uint32 property.
Eduardo Habkost [Sat, 15 Jun 2019 20:05:05 +0000 (17:05 -0300)]
i386: Fix signedness of hyperv_spinlock_attempts
The current default value for hv-spinlocks is 0xFFFFFFFF (meaning
"never retry"). However, the value is stored as a signed
integer, making the getter of the hv-spinlocks QOM property
return -1 instead of 0xFFFFFFFF.
Fix this by changing the type of X86CPU::hyperv_spinlock_attempts
to uint32_t. This has no visible effect to guest operating
systems, affecting just the behavior of the QOM getter.