yuchenlin [Fri, 12 Oct 2018 09:07:52 +0000 (17:07 +0800)]
vhost-scsi: prevent using uninitialized vqs
There are 3 virtqueues (ctrl, event and cmd) for virtio scsi device,
but seabios will only set the physical address for the 3rd one (cmd).
Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr()
will be 0 for ctrl and event vq.
In this case, ctrl and event vq are not initialized.
vhost_verify_ring_mappings may use uninitialized vhost_virtqueue
such that vhost_verify_ring_part_mapping returns ENOMEM.
When encountered this problem, we got the following logs:
qemu-system-x86_64: Unable to map available ring for ring 0
qemu-system-x86_64: Verify ring failure on region 0
CC mips-softmmu/hw/mips/gt64xxx_pci.o
In file included from include/hw/pci-host/gt64xxx.h:2,
from hw/mips/gt64xxx_pci.c:30:
include/hw/pci/pci_bus.h:23:5: error: unknown type name ‘PCIIOMMUFunc’
PCIIOMMUFunc iommu_fn;
^~~~~~~~~~~~
include/hw/pci/pci_bus.h:27:5: error: unknown type name ‘pci_set_irq_fn’
pci_set_irq_fn set_irq;
^~~~~~~~~~~~~~
include/hw/pci/pci_bus.h:28:5: error: unknown type name ‘pci_map_irq_fn’
pci_map_irq_fn map_irq;
^~~~~~~~~~~~~~
include/hw/pci/pci_bus.h:29:5: error: unknown type name ‘pci_route_irq_fn’
pci_route_irq_fn route_intx_to_irq;
^~~~~~~~~~~~~~~~
include/hw/pci/pci_bus.h:31:24: error: ‘PCI_SLOT_MAX’ undeclared here (not in a function)
PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
^~~~~~~~~~~~
include/hw/pci/pci_bus.h:31:39: error: ‘PCI_FUNC_MAX’ undeclared here (not in a function)
PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
^~~~~~~~~~~~
make[1]: *** [rules.mak:69: hw/mips/gt64xxx_pci.o] Error 1
make: *** [Makefile:482: subdir-mips-softmmu] Error 2
tests/bios-tables-test: add 64-bit PCI MMIO aperture round-up test on Q35
In commit 9fa99d2519cb ("hw/pci-host: Fix x86 Host Bridges 64bit PCI
hole", 2017-11-16), we meant to expose such a 64-bit PCI MMIO aperture in
the ACPI DSDT that would be at least as large as the new "pci-hole64-size"
property (2GB on i440fx, 32GB on q35). The goal was to offer "enough"
64-bit MMIO aperture to the guest OS for hotplug purposes.
Previous patch fixed the issue that the aperture is extended relative to
a possibly incorrect base. This may result in an aperture size that is
smaller than the intent of commit 9fa99d2519cb.
This patch adds a test to make sure it won't happen again.
In the test case being added:
- use 128 MB initial RAM size,
- ask for one DIMM hotplug slot,
- ask for 2 GB maximum RAM size,
- use a pci-testdev with a 64-bit BAR of 2 GB size.
Consequences:
(1) In pc_memory_init() [hw/i386/pc.c], the DIMM hotplug area size is
initially set to 2048-128 = 1920 MB. (Maximum RAM size minus initial
RAM size.)
(2) The DIMM area base is set to 4096 MB (because the initial RAM is only
128 MB -- there is no initial "high RAM").
(3) Due to commit 085f8e88ba73 ("pc: count in 1Gb hugepage alignment when
sizing hotplug-memory container", 2014-11-24), we add 1 GB for the one
DIMM hotplug slot that was specified. This sets the DIMM area size to
1920+1024 = 2944 MB.
(4) The reserved-memory-end address (exclusive) is set to 4096 + 2944 =
7040 MB (DIMM area base plus DIMM area size).
(5) The reserved-memory-end address is rounded up to GB alignment,
yielding 7 GB (7168 MB).
(6) Given the 2 GB BAR size of pci-testdev, SeaBIOS allocates said 64-bit
BAR in 64-bit address space.
(7) Because reserved-memory-end is at 7 GB, it is unaligned for the 2 GB
BAR. Therefore SeaBIOS allocates the BAR at 8 GB. QEMU then
(correctly) assigns the root bridge aperture base this BAR address, to
be exposed in \_SB.PCI0._CRS.
(8) The intent of commit 9fa99d2519cb dictates that QEMU extend the
aperture size to 32 GB, implying a 40 GB end address. However, QEMU
performs the extension relative to reserved-memory-end (7 GB), not
relative to the bridge aperture base that was correctly deduced from
SeaBIOS's BAR programming (8 GB). Therefore we see 39 GB as the
aperture end address in \_SB.PCI0._CRS:
hw/pci-host/x86: extend the 64-bit PCI hole relative to the fw-assigned base
In commit 9fa99d2519cb ("hw/pci-host: Fix x86 Host Bridges 64bit PCI
hole", 2017-11-16), we meant to expose such a 64-bit PCI MMIO aperture in
the ACPI DSDT that would be at least as large as the new "pci-hole64-size"
property (2GB on i440fx, 32GB on q35). The goal was to offer "enough"
64-bit MMIO aperture to the guest OS for hotplug purposes.
In that commit, we added or modified five functions:
- pc_pci_hole64_start(): shared between i440fx and q35. Provides a default
64-bit base, which starts beyond the cold-plugged 64-bit RAM, and skips
the DIMM hotplug area too (if any).
- i440fx_pcihost_get_pci_hole64_start(), q35_host_get_pci_hole64_start():
board-specific 64-bit base property getters called abstractly by the
ACPI generator. Both of these fall back to pc_pci_hole64_start() if the
firmware didn't program any 64-bit hole (i.e. if the firmware didn't
assign a 64-bit GPA to any MMIO BAR on any device). Otherwise, they
honor the firmware's BAR assignments (i.e., they treat the lowest 64-bit
GPA programmed by the firmware as the base address for the aperture).
- i440fx_pcihost_get_pci_hole64_end(), q35_host_get_pci_hole64_end():
these intended to extend the aperture to our size recommendation,
calculated relative to the base of the aperture.
Despite the original intent, i440fx_pcihost_get_pci_hole64_end() and
q35_host_get_pci_hole64_end() currently only extend the aperture relative
to the default base (pc_pci_hole64_start()), ignoring any programming done
by the firmware. This means that our size recommendation may not be met.
Fix it by honoring the firmware's address assignments.
The strange extension sizes were spotted by Alex, in the log of a guest
kernel running on top of OVMF (which prefers to assign 64-bit GPAs to
64-bit BARs).
This change only affects DSDT generation, therefore no new compat property
is being introduced.
Using an i440fx OVMF guest with 5GB RAM, an example _CRS change is:
(On i440fx, the low RAM split is at 3GB, in this case. Therefore, with 5GB
guest RAM and no DIMM hotplug range, pc_pci_hole64_start() returns 4 +
(5-3) = 6 GB. Adding the 2GB extension to that yields 8GB, which is below
the firmware-programmed base of 32GB, before the patch. Therefore, before
the patch, the extension is ineffective. After the patch, we add the 2GB
extension to the firmware-programmed base, namely 32GB.)
Using a q35 OVMF guest with 5GB RAM, an example _CRS change is:
(On Q35, the low RAM split is at 2GB. Therefore, with 5GB guest RAM and no
DIMM hotplug range, pc_pci_hole64_start() returns 4 + (5-2) = 7 GB. Adding
the 32GB extension to that yields 39GB (0x0000_0009_BFFF_FFFF + 1), before
the patch. After the patch, we add the 32GB extension to the
firmware-programmed base, namely 32GB.)
The ACPI test data for the bios-tables-test case that we added earlier in
this series are corrected too, as follows:
Add memory bar to pci-testdev. Size is configurable using the membar
property. Setting the size to zero (default) turns it off. Can be used
to check whether guests handle large pci bars correctly.
MAINTAINERS: list "tests/acpi-test-data" files in ACPI/SMBIOS section
The "tests/acpi-test-data" files are currently not covered by any section
in MAINTAINERS, and "scripts/checkpatch.pl" complains when new data files
are added.
Singh, Brijesh [Mon, 1 Oct 2018 19:44:45 +0000 (19:44 +0000)]
x86_iommu/amd: Enable Guest virtual APIC support
Now that amd-iommu support interrupt remapping, enable the GASup in IVRS
table and GASup in extended feature register to indicate that IOMMU
support guest virtual APIC mode. GASup provides option to guest OS to
make use of 128-bit IRTE.
Note that the GAMSup is set to zero to indicate that amd-iommu does not
support guest virtual APIC mode (aka AVIC) which would be used for the
nested VMs.
See Table 21 from IOMMU spec for interrupt virtualization controls
Singh, Brijesh [Mon, 1 Oct 2018 19:44:32 +0000 (19:44 +0000)]
x86_iommu/amd: remove V=1 check from amdvi_validate_dte()
Currently, the amdvi_validate_dte() assumes that a valid DTE will
always have V=1. This is not true. The V=1 means that bit[127:1] are
valid. A valid DTE can have IV=1 and V=0 (i.e address translation
disabled and interrupt remapping enabled)
Remove the V=1 check from amdvi_validate_dte(), make the caller
responsible to check for V or IV bits.
This also fixes a bug in existing code that when error is
detected during the translation we'll fail the translation
instead of assuming a passthrough mode.
Singh, Brijesh [Mon, 1 Oct 2018 19:44:29 +0000 (19:44 +0000)]
x86_iommu: move vtd_generate_msi_message in common file
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI
Message from IRQ. A similar function will be needed when we add interrupt
remapping support in amd-iommu. Moving the function in common file to
avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message().
There is no logic changes in the code flow.
Yongji Xie [Wed, 6 Jun 2018 13:24:48 +0000 (21:24 +0800)]
vhost-user-blk: start vhost when guest kicks
Some old guests (before commit 7a11370e5: "virtio_blk: enable VQs early")
kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. This violates
the virtio spec. But virtio 1.0 transitional devices support this behaviour.
So we should start vhost when guest kicks in this case.
Peter Xu [Tue, 9 Oct 2018 07:45:43 +0000 (15:45 +0800)]
intel_iommu: handle invalid ce for shadow sync
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing
shadow page tables. Having invalid context entry there is perfectly
valid when we move a device out of an existing domain. When that
happens, instead of posting an error we invalidate the whole region.
Without this patch, QEMU will crash if we do these steps:
(1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
(2) bind the NICs with vfio-pci in the guest
(3) start testpmd with the NICs applied
(4) stop testpmd
(5) rebind the NIC back to ixgbe kernel driver
Peter Xu [Tue, 9 Oct 2018 07:45:42 +0000 (15:45 +0800)]
intel_iommu: move ce fetching out when sync shadow
There are two callers for vtd_sync_shadow_page_table_range(): one
provided a valid context entry and one not. Move that fetching
operation into the caller vtd_sync_shadow_page_table() where we need to
fetch the context entry.
Meanwhile, remove the error_report_once() directly since we're already
tracing all the error cases in the previous call. Instead, return error
number back to caller. This will not change anything functional since
callers are dropping it after all.
We do this move majorly because we want to do something more later in
vtd_sync_shadow_page_table().
Peter Xu [Sat, 29 Sep 2018 03:36:15 +0000 (11:36 +0800)]
intel_iommu: better handling of dmar state switch
QEMU is not handling the global DMAR switch well, especially when from
"on" to "off".
Let's first take the example of system reset.
Assuming that a guest has IOMMU enabled. When it reboots, we will drop
all the existing DMAR mappings to handle the system reset, however we'll
still keep the existing memory layouts which has the IOMMU memory region
enabled. So after the reboot and before the kernel reloads again, there
will be no mapping at all for the host device. That's problematic since
any software (for example, SeaBIOS) that runs earlier than the kernel
after the reboot will assume the IOMMU is disabled, so any DMA from the
software will fail.
For example, a guest that boots on an assigned NVMe device might fail to
find the boot device after a system reboot/reset and we'll be able to
observe SeaBIOS errors if we capture the debugging log:
WARNING - Timeout at nvme_wait:144!
Meanwhile, we should see DMAR errors on the host of that NVMe device.
It's the DMA fault that caused a NVMe driver timeout.
The correct fix should be that we do proper switching of device DMA
address spaces when system resets, which will setup correct memory
regions and notify the backend of the devices. This might not affect
much on non-assigned devices since QEMU VT-d emulation will assume a
default passthrough mapping if DMAR is not enabled in the GCMD
register (please refer to vtd_iommu_translate). However that's required
for an assigned devices, since that'll rebuild the correct GPA to HPA
mapping that is needed for any DMA operation during guest bootstrap.
Besides the system reset, we have some other places that might change
the global DMAR status and we'd better do the same thing there. For
example, when we change the state of GCMD register, or the DMAR root
pointer. Do the same refresh for all these places. For these two
places we'll also need to explicitly invalidate the context entry cache
and iotlb cache.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173 CC: QEMU Stable <[email protected]> Reported-by: Cong Li <[email protected]> Signed-off-by: Peter Xu <[email protected]>
--
v2:
- do the same for GCMD write, or root pointer update [Alex]
- test is carried out by me this time, by observing the
vtd_switch_address_space tracepoint after system reboot
v3:
- rewrite commit message as suggested by Alex Signed-off-by: Peter Xu <[email protected]> Reviewed-by: Eric Auger <[email protected]> Reviewed-by: Jason Wang <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
Peter Maydell [Fri, 2 Nov 2018 11:52:39 +0000 (11:52 +0000)]
configure: Use LINKS loop for all build tree symlinks
A few places in configure were doing ad-hoc calls to
the symlink function to set up symlinks from the build tree
back to the source tree. We have a loop that does this
already for all files and directories listed in the LINKS
environment variable; use that instead.
Peter Maydell [Fri, 2 Nov 2018 11:52:38 +0000 (11:52 +0000)]
configure: Rename FILES variable to LINKS
The FILES variable is used to accumulate a list of things to symlink
from the source tree into the build tree. These don't have to be
individual files; symlinking an entire directory of data files is
also fine. Rename it to something less confusing before we add a few
directories to it.
Improve the comment to clarify what DIRS and LINKS do and why
it's not a good idea to add things to LINKS with wildcarding.
Peter Maydell [Fri, 2 Nov 2018 11:52:37 +0000 (11:52 +0000)]
tests: Move tests/hex-loader-check-data/ to tests/data/hex-loader/
Currently tests/hex-loader-check-data contains data files used
by the hexloader-test, and configure individually symlinks those
data files into the build directory using a wildcard.
Using a wildcard like this is a bad idea, because if a new
data file is added, nothing causes configure to be rerun,
and so no symlink is added for the new file. This can cause
tests to spuriously fail when they can't find their data.
Instead, it's better to symlink an entire directory of
data files. We already have such a directory: tests/data.
Move the data files from tests/hex-loader-check-data/ to
tests/data/hex-loader/, and remove the unnecessary symlinking.
Peter Maydell [Fri, 2 Nov 2018 11:52:36 +0000 (11:52 +0000)]
tests: Move tests/acpi-test-data/ to tests/data/acpi/
Currently tests/acpi-test-data contains data files used by the
bios-tables-test, and configure individually symlinks those
data files into the build directory using a wildcard.
Using a wildcard like this is a bad idea, because if a new
data file is added, nothing causes configure to be rerun,
and so no symlink is added for the new file. This can cause
tests to spuriously fail when they can't find their data.
Instead, it's better to symlink an entire directory of
data files. We already have such a directory: tests/data.
Move the data files from tests/acpi-test-data/ to
tests/data/acpi/, and remove the unnecessary symlinking.
We can remove entirely the note in rebuild-expected-aml.sh
about copying any new data files, because now they will
be in the source directory, not the build directory, and
no copying is required.
(We can't just change the existing tests/acpi-test-data/
to being a symlinked directory, because if we did that and
a developer switched git branches from one after that change
to one before it then configure would end up trashing all
the test files by making them symlinks to themselves.
Changing their path avoids this annoyance.)
Peter Maydell [Fri, 2 Nov 2018 17:17:12 +0000 (17:17 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20181102' into staging
target-arm queue:
* microbit: Add the UART to our nRF51 SoC model
* Add a virtual Xilinx Versal board "xlnx-versal-virt"
* hw/arm/virt: Set VIRT_COMPAT_3_0 compat
* MAINTAINERS: Remove bouncing email in ARM ACPI
* strongarm: mask off high[31:28] bits from dir and state registers
* target/arm: Conditionalize some asserts on aarch32 support
* hw/arm/xilinx_zynq: Use the ARRAY_SIZE macro
* remotes/pmaydell/tags/pull-target-arm-20181102:
hw/arm: versal: Add a virtual Xilinx Versal board
hw/arm: versal: Add a model of Xilinx Versal SoC
target/arm: Conditionalize some asserts on aarch32 support
hw/arm/xilinx_zynq: Use the ARRAY_SIZE macro
strongarm: mask off high[31:28] bits from dir and state registers
MAINTAINERS: Remove bouncing email in ARM ACPI
tests/boot-serial-test: Add microbit board testcase
hw/arm/nrf51_soc: Connect UART to nRF51 SoC
hw/char: Implement nRF51 SoC UART
hw/arm/virt: Set VIRT_COMPAT_3_0 compat
This board is based on the Xilinx Versal SoC. The exact
details of what peripherals are attached to this board
will remain in control of QEMU. QEMU will generate an
FDT on the fly for Linux and other software to auto-discover
peripherals.
strongarm: mask off high[31:28] bits from dir and state registers
The high[31:28] bits of 'direction' and 'state' registers of
SA-1100/SA-1110 device are reserved. Setting them may lead to
OOB 's->handler[]' array access issue. Mask off [31:28] bits to
avoid it.
Shannon Zhao's email at Huawei is bouncing: remove it.
X-Failed-Recipients: [email protected]
** Address not found **
Your message wasn't delivered to [email protected] because the address couldn't be found, or is unable to receive mail.
Note that the section still contains his personal email (see e59f13d76bb).
Peter Maydell [Fri, 2 Nov 2018 13:16:13 +0000 (13:16 +0000)]
Merge remote-tracking branch 'remotes/riscv/tags/riscv-for-master-3.1-sf1' into staging
RISC-V Patches for the 3.1 Soft Freeze, Part 2
This tag contains a few simple patches that I'd like to target for the
QEMU soft freeze. There's only one code change: a fix to our PMP
implementation that avoids an internal truncation while computing a
partial PMP read.
I also have two updates to the MAINTAINERS file: one to add Alistair as
a RISC-V maintainer, and one to add our newly created mailing list.
# gpg: Signature made Tue 30 Oct 2018 18:17:17 GMT
# gpg: using RSA key EF4CA1502CCBAB41
# gpg: Good signature from "Palmer Dabbelt <[email protected]>"
# gpg: aka "Palmer Dabbelt <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
* remotes/riscv/tags/riscv-for-master-3.1-sf1:
Add [email protected] as the RISC-V list
Add Alistair as a RISC-V Maintainer
target/riscv/pmp.c: pmpcfg_csr_read returns bogus value on RV64
Peter Maydell [Fri, 2 Nov 2018 10:53:00 +0000 (10:53 +0000)]
Merge remote-tracking branch 'remotes/elmarco/tags/chrdev-pull-request' into staging
- add websocket support
- socket: make 'fd' incompatible with 'reconnect'
- fix a websocket leak
- unrelated editorconfig patch that missed -trivial (included for
convenience)
- v2: fix commit author field
# gpg: Signature made Thu 01 Nov 2018 08:23:39 GMT
# gpg: using RSA key DAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <[email protected]>"
# gpg: aka "Marc-André Lureau <[email protected]>"
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/chrdev-pull-request:
editorconfig: set emacs mode
tests/test-char: Check websocket chardev functionality
chardev: Add websocket support
chardev/char-socket: Function headers refactoring
char-socket: make 'fd' incompatible with 'reconnect'
char-socket: correctly set has_reconnect when parsing QemuOpts
websock: fix handshake leak
Peter Maydell [Fri, 2 Nov 2018 09:49:35 +0000 (09:49 +0000)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20181031a' into staging
Minor migration fixes 2018-10-31
# gpg: Signature made Wed 31 Oct 2018 16:55:40 GMT
# gpg: using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20181031a:
migration: avoid segmentfault when take a snapshot of a VM which being migrated
qapi: Fix COLOStatus and query-colo-status since version
COLO: Fix Colo doc secondeary should be secondary
Peter Maydell [Thu, 1 Nov 2018 17:26:16 +0000 (17:26 +0000)]
Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2018-10-30-v3-tag' into staging
qemu-ga patch queue for soft-freeze
* support for --retry-path option for recovering from communication
path failures
* support for serial/device name in guest-get-fsinfo for linux/w32
* support for freezing individual mount points in guest-fsfreeze-*
* fixes for unicode paths on w32, not-present vcpus in guest-get-vcpus,
buffer overflow in guest-get-fsinfo for w32, and other minor fixes
v3:
* remove redundant check for --static in configure
* correct authorship on "qga-win: add debugging information"
v2:
* set libudev=off in configure for static builds
* remotes/mdroth/tags/qga-pull-2018-10-30-v3-tag: (24 commits)
qga-win: changing --retry-path option behavior
qga-win: report specific error when failing to open channel
qga-win: install service with --retry-path set by default
qga: add --retry-path option for re-initializing channel on failure
qga: move w32 service handling out of run_agent()
qga: hang GAConfig/socket_activation off of GAState global
qga: group agent init/cleanup init separate routines
qga: fix an off-by-one issue
qga-win: demystify namespace stripping
qga-win: return disk device in guest-get-fsinfo
qga-win: handle multi-disk volumes
qga-win: refactor disk info
qga-win: report disk serial number
qga-win: refactor disk properties (bus)
qga-win: add debugging information
build: rename CONFIG_QGA_NTDDDISK to CONFIG_QGA_NTDDSCSI
qga-win: fsinfo: pci-info: allow partial info
qga-win: prevent crash when executing fsinfo command
qga: linux: return disk device in guest-get-fsinfo
qga: linux: report disk serial number
...
Peter Maydell [Thu, 1 Nov 2018 16:32:54 +0000 (16:32 +0000)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2018-10-30
* MSR-based feature support for
MSR_IA32_ARCH_CAPABILITIES bits (Robert Hoo)
* Cascadelake-Server CPU model (Tao Xu)
* Add PKU on Skylake-Server CPU model (Tao Xu)
* Correct cpu_x86_cpuid(0xd) (Sebastian Andrzej Siewior)
* Remove dead code (Peter Maydell)
# gpg: Signature made Wed 31 Oct 2018 14:05:25 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-next-pull-request:
i386: Add PKU on Skylake-Server CPU model
i386: Add new model of Cascadelake-Server
x86: define a new MSR based feature word -- FEATURE_WORDS_ARCH_CAPABILITIES
x86: Data structure changes to support MSR based features
kvm: Add support to KVM_GET_MSR_FEATURE_INDEX_LIST and KVM_GET_MSRS system ioctl
target/i386: Remove #ifdeffed-out icebp debugging hack
i386: correct cpu_x86_cpuid(0xd)
Peter Maydell [Thu, 1 Nov 2018 14:38:50 +0000 (14:38 +0000)]
Merge remote-tracking branch 'remotes/berrange/tags/misc-next-pull-request' into staging
Merge misc fixes
# gpg: Signature made Wed 31 Oct 2018 11:36:12 GMT
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <[email protected]>"
# gpg: aka "Daniel P. Berrange <[email protected]>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/misc-next-pull-request:
scripts: report on author emails that are mangled by the mailing list
block: drop moderated sheepdog mailing list from MAINTAINERS file
# gpg: Signature made Wed 31 Oct 2018 00:28:39 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/python-next-pull-request:
scripts/qemu.py: use a more consistent docstring style
scripts/decodetree.py: fix reference to attributes
Travis support for the acceptance tests
Acceptance tests: add make rule for running them
Bootstrap Python venv for tests
iotests: Unify log outputs between Python 2 and 3
iotests: Modify imports for Python 3
iotests: 'new' module replacement in 169
iotests: Explicitly bequeath FDs in Python
iotests: Different iterator behavior in Python 3
iotests: Use // for Python integer division
iotests: Use Python byte strings where appropriate
iotests: Flush in iotests.py's QemuIoInteractive
iotests: Make nbd-fault-injector flush
scripts/device-crash-test: Remove devices that are not user_creatable anymore
Peter Maydell [Thu, 1 Nov 2018 12:08:10 +0000 (12:08 +0000)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-10-29-2' into staging
Merge tpm 2018/10/29 v2
# gpg: Signature made Tue 30 Oct 2018 21:40:24 GMT
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-10-29-2:
tpm: Zero-init structure to avoid uninitialized variables in valgrind log
MAINTAINERS: Change my email address to the new domain
docs: tpm: Mention implemented TPM CRB interface emulation and specs
tests/tpm: Display if swtpm is not found or --tpm2 not supported
tests/tpm: fix tpm_util_swtpm_has_tpm2()
Laurent Vivier [Tue, 30 Oct 2018 16:55:54 +0000 (17:55 +0100)]
target/m68k: use EXCP_ILLEGAL instead of EXCP_UNSUPPORTED
Coldfire defines an "Unsupported instruction" exception if execution
of a valid instruction is attempted but the required hardware is not
present in the processor.
We use it with instructions that are in fact undefined or illegal,
and the exception expected in this case by the kernel is the
illegal exception, so this patch fixes that.
Some time ago, I proposed to use an (eval) in .dir-locals.el to set
the mode for all json files and Makefile. Unfortunately, this isn't
safe, and emacs will prompt the user, which isn't very friendly.
Fortunately, editorconfig provides a special config key which does
allow to set the emacs mode. Add a few missing entries and set the
emacs mode.
Update top comment to provide a short summary about the file and the
IDE plugins while at it.
Test order:
Creating server websocket chardev
Creating usual tcp chardev client
Sending handshake message from client
Receiving handshake reply
Sending ping frame with "hello" payload
Receiving pong reply
Sending binary data "world"
Checking the received data on server side
Checking of closing handshake
Julia Suvorova [Thu, 18 Oct 2018 22:34:59 +0000 (01:34 +0300)]
chardev/char-socket: Function headers refactoring
Upcoming websocket support requires additional parameters in function
headers that are already overloaded. This patch replaces the bunch of
parameters with a single structure pointer.
char-socket: make 'fd' incompatible with 'reconnect'
A chardev socket created with the 'fd=' argument is not going to
handle reconnection properly by recycling the same fd (or not in a
supported way). Let's forbid this case.
char-socket: correctly set has_reconnect when parsing QemuOpts
qemu_chr_parse_socket() fills all ChardevSocket fields, but that
doesn't reflect correctly the arguments given with the options / on
the command line. "reconnect" takes a number as argument, and the
default value is 0, which doesn't help to identify the missing
option. The other arguments have default values that are less
problematic, leave them set by default for now.
==955== 217 bytes in 1 blocks are definitely lost in loss record 275 of 321
==955== at 0x483A965: realloc (vg_replace_malloc.c:785)
==955== by 0x50B6839: __vasprintf_chk (in /usr/lib64/libc-2.28.so)
==955== by 0x49AA05C: g_vasprintf (in /usr/lib64/libglib-2.0.so.0.5800.1)
==955== by 0x4983440: g_strdup_vprintf (in /usr/lib64/libglib-2.0.so.0.5800.1)
==955== by 0x126048: qio_channel_websock_handshake_send_res (channel-websock.c:162)
==955== by 0x1266E6: qio_channel_websock_handshake_send_res_ok (channel-websock.c:362)
==955== by 0x126D3E: qio_channel_websock_handshake_process (channel-websock.c:468)
==955== by 0x126EF2: qio_channel_websock_handshake_read (channel-websock.c:511)
==955== by 0x12715B: qio_channel_websock_handshake_io (channel-websock.c:571)
==955== by 0x125027: qio_channel_fd_source_dispatch (channel-watch.c:84)
==955== by 0x496326C: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.5800.1)
==955== by 0x169EC3: glib_pollfds_poll (main-loop.c:215)
While it would be possible to concatenate input files with make,
passing the original input files to decodetree.py allows us to
generate error messages which allows compilation environments
(read: emacs) to next-error to the correct input file.
decodetree: Remove "insn" argument from trans_* expanders
This allows trans_* expanders to be shared between decoders
for 32 and 16-bit insns, by not tying the expander to the
size of the insn that produced it.
This change requires adjusting the two existing users to match.
Allow argument sets to be shared between two decoders by avoiding
a re-declaration error. Make sure that anonymous argument sets
and anonymous formats have unique names.
Currently whenever the qemu-ga's service doesn't find the virtio-serial
the run_agent() loops in a QGA_RETRY_INTERVAL (default 5 seconds)
intervals and try to restart the qemu-ga which causes a synchronous loop.
Changed to wait and listen for the serial events by registering for
notifications a proper serial event handler that deals with events:
DBT_DEVICEARRIVAL indicates that the device has been inserted and
is available
DBT_DEVICEREMOVECOMPLETE indicates that the devive has been removed
Which allow us to determine when the channel path is available for the
qemu-ga to restart.
Michael Roth [Sun, 7 Oct 2018 11:02:21 +0000 (14:02 +0300)]
qga-win: install service with --retry-path set by default
It's nicer from a management perspective that the agent can survive
hotplug/unplug of the channel device, or be started prior to the
installation of the channel device's driver without and still be able
to resume normal function afterward. On linux there are alternatives
like systemd to support this, but on w32 --retry-path is the only
option so it makes sense to set it by default when installed as a
w32 service.
Michael Roth [Sun, 7 Oct 2018 11:02:20 +0000 (14:02 +0300)]
qga: add --retry-path option for re-initializing channel on failure
This adds an option to instruct the agent to periodically attempt
re-opening the communication channel after a channel error has
occurred. The main use-case for this is providing an OS-independent
way of allowing the agent to survive situations like hotplug/unplug of
the communication channel, or initial guest set up where the agent may
be installed/started prior to the installation of the channel device's
driver.
There are nicer ways of implementing this functionality via things
like systemd services, but this option is useful for platforms like
*BSD/w32.
Currently a channel error will result in the GSource for that channel
being removed from the GMainLoop, but the main loop continuing to run.
That behavior results in a dead loop when --retry-path isn't set, and
prevents us from knowing when to attempt re-opening the channel when
it is set, so we also force the loop to exit as part of this patch.
Michael Roth [Sun, 7 Oct 2018 11:02:19 +0000 (14:02 +0300)]
qga: move w32 service handling out of run_agent()
Eventually we want a w32 service to be able to restart the qga main
loop from within service_main(). To allow for this we move service
handling out of run_agent() such that service_main() calls
run_agent() instead of the reverse.
Michael Roth [Sun, 7 Oct 2018 11:02:18 +0000 (14:02 +0300)]
qga: hang GAConfig/socket_activation off of GAState global
For w32 services we rely on the global GAState to access resources
associated with the agent within service_main(). Currently this is
sufficient for starting the agent since we open the channel once prior
to calling service_main(), and simply start the GMainLoop to start the
agent from within service_main().
Eventually we want to be able to also [re-]open the communication
channel from within service_main(), which requires access to
config/socket_activation variables, so we hang them off GAState in
preparation for that.
Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Sameeh Jubran <[email protected]>
*dont move GAConfig struct, just the typedef
*fix build bisect for w32 Signed-off-by: Michael Roth <[email protected]>
Michael Roth [Sun, 7 Oct 2018 11:02:17 +0000 (14:02 +0300)]
qga: group agent init/cleanup init separate routines
This patch better separates the init/cleanup routines out into
separate functions to make the start-up procedure a bit easier to
follow. This will be useful when we eventually break out the actual
start/stop of the agent's main loop into separates routines that
can be called multiple times after the init phase.
Probe the volume for disk extents and return list of all disks.
Originally only first disk of composite volume was returned.
Note that the patch changes get_pci_info() from one state of brokenness
into a different state of brokenness. In other words it still does not do
what it's supposed to do (see comment in code). If anyone knows how to
fix it, please step in.
Refactor building of disk info into a function that builds the list and
a function that returns infor for single disk. This will be used in
future commit that will handle multi-disk volumes.
Signed-off-by: Tomáš Golembiovský <[email protected]>
*coding style fix-ups (declarations at beginning of block)
*improve readability for user-visible errors
*cover additional edge-cases with debug statements Signed-off-by: Michael Roth <[email protected]>
Refactor code that queries bus type to be more generic. The function
get_disk_bus_type() has been renamed to build_guest_disk_info().
Following commit(s) will extend this function.
build: rename CONFIG_QGA_NTDDDISK to CONFIG_QGA_NTDDSCSI
There was inconsistency between commits:
50cbebb9a3 configure: add configure check for ntdddisk.h a3ef3b2272 qga: added bus type and disk location path
The first commit added #define CONFIG_QGA_NTDDDISK but the second commit
expected the name to be CONFIG_QGA_NTDDSCSI. As a result the code in
second patch was never used.
Renaming the option to CONFIG_QGA_NTDDSCSI to match the name of header
file that is being checked for.
Sameeh Jubran [Tue, 23 Oct 2018 11:23:14 +0000 (13:23 +0200)]
qga-win: fsinfo: pci-info: allow partial info
The call to SetupDiGetDeviceRegistryProperty might fail because the
value doesn't exist in the registry, in this case we shouldn't exit from
the loop but instead continue to look for other available values in the
registry and set this value as unavailable (-1).
Signed-off-by: Sameeh Jubran <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Tomáš Golembiovský <[email protected]>
*squash in fix for when get_pci_info() returns NULL pci_controller field
*fix handling for error_set() cases in get_pci_info(), not just NULL return
*force all -1 PCI addr fields if any single one of them isn't found Signed-off-by: Michael Roth <[email protected]>
Sameeh Jubran [Tue, 23 Oct 2018 11:23:13 +0000 (13:23 +0200)]
qga-win: prevent crash when executing fsinfo command
The fsinfo command is currently implemented for Windows only and it's disk
parameter can be enabled by adding the define "CONFIG_QGA_NTDDSCSI" to the qga
code. When enabled and executed the qemu-ga crashed with the following message:
------------------------------------------------
File qapi/qapi-visit-core.c, Line 49
After some digging, turns out that the GuestPCIAddress is null and the
qapi visitor doesn't like that, so we can always allocate it instead and
initiate all it's members to -1.
Especially for guests with large numbers of tlbs, like ARM or PPC,
we may well not use all of them in between flush operations.
Remember which tlbs have been used since the last flush, and
avoid any useless flushing.
Our only statistic so far was "full" tlb flushes, where all mmu_idx
are flushed at the same time.
Now count "partial" tlb flushes where sets of mmu_idx are flushed,
but the set is not maximal. Account one per mmu_idx flushed, as
that is the unit of work performed.
We don't actually count elided flushes yet, but go ahead and change
the interface presented to the monitor all at once.
cputlb: Merge tlb_flush_nocheck into tlb_flush_by_mmuidx_async_work
The difference between the two sets of APIs is now miniscule.
This allows tlb_flush, tlb_flush_all_cpus, and tlb_flush_all_cpus_synced
to be merged with their corresponding by_mmuidx functions as well. For
accounting, consider mmu_idx_bitmask = ALL_MMUIDX_BITS to be a full flush.
The set of large pages in the kernel is probably not the same
as the set of large pages in the application. Forcing one
range to cover both will flush more often than necessary.
This allows tlb_flush_page_async_work to flush just the one
mmu_idx implicated, which in turn allows us to remove
tlb_check_page_and_flush_by_mmuidx_async_work.
cputlb: Move cpu->pending_tlb_flush to env->tlb_c.pending_flush
Protect it with the tlb_lock instead of using atomics.
The move puts it in or near the same cacheline as the lock;
using the lock means we don't need a second atomic operation
in order to perform the update. Which makes it cheap to also
update pending_flush in tlb_flush_by_mmuidx_async_work.
cputlb: Remove tcg_enabled hack from tlb_flush_nocheck
The bugs this was working around were fixed with commits 022d6378c7fd target/unicore32: remove tlb_flush from uc32_init_fn 6e11beecfde0 target/alpha: remove tlb_flush from alpha_cpu_initfn