Peter Lieven [Tue, 31 Jan 2017 15:36:34 +0000 (16:36 +0100)]
qga: ignore EBUSY when freezing a filesystem
the current implementation fails if we try to freeze an
already frozen filesystem. This can happen if a filesystem
is mounted more than once (e.g. with a bind mount).
Stefan Hajnoczi [Fri, 6 Jan 2017 15:29:30 +0000 (15:29 +0000)]
qga: add systemd socket activation support
AF_UNIX and AF_VSOCK listen sockets can be passed in by systemd on
startup. This allows systemd to manage the listen socket until the
first client connects and between restarts. Advantages of socket
activation are that parallel startup of network services becomes
possible and that unused daemons do not consume memory.
The key to achieving this is the LISTEN_FDS environment variable, which
is a stable ABI as shown here:
https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/
We could link against libsystemd and use sd_listen_fds(3) but it's easy
to implement the tiny LISTEN_FDS ABI so that qemu-ga does not depend on
libsystemd. Some systems may not have systemd installed and wish to
avoid the dependency. Other init systems or socket activation servers
may implement the same ABI without systemd involvement.
Peter Maydell [Sat, 4 Mar 2017 16:31:14 +0000 (16:31 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging
ppc patch queuye for 2017-03-03
This will probably be my last pull request before the hard freeze. It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.
This batch has:
* A substantial amount of POWER9 work
* Implements the legacy (hash) MMU for POWER9
* Some more preliminaries for implementing the POWER9 radix
MMU
* POWER9 has_work
* Basic POWER9 compatibility mode handling
* Removal of some premature tests
* Some cleanups and fixes to the existing MMU code to make the
POWER9 work simpler
* A bugfix for TCG multiply adds on power
* Allow pseries guests to access PCIe extended config space
This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c. This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.
* remotes/dgibson/tags/ppc-for-2.9-20170303:
target/ppc: rewrite f[n]m[add,sub] using float64_muladd
spapr: Small cleanup of PPC MMU enums
spapr_pci: Advertise access to PCIe extended config space
target/ppc: Rework hash mmu page fault code and add defines for clarity
target/ppc: Move no-execute and guarded page checking into new function
target/ppc: Add execute permission checking to access authority check
target/ppc: Add Instruction Authority Mask Register Check
hw/ppc/spapr: Add POWER9 to pseries cpu models
target/ppc/POWER9: Add cpu_has_work function for POWER9
target/ppc/POWER9: Add POWER9 pa-features definition
target/ppc/POWER9: Add POWER9 mmu fault handler
target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
target/ppc: Add patb_entry to sPAPRMachineState
target/ppc/POWER9: Add POWERPC_MMU_V3 bit
powernv: Don't test POWER9 CPU yet
exec, kvm, target-ppc: Move getrampagesize() to common code
target/ppc: Add POWER9/ISAv3.00 to compat_table
* remotes/bonzini/tags/for-upstream: (21 commits)
iscsi: fix missing unlock
memory: show region offset and ROM/RAM type in "info mtree -f"
x86: Work around SMI migration breakages
spice-char: fix segfault in char_spice_finalize
vl: disable default cdrom when using explicitely scsi-hd
memory: Introduce DEVICE_HOST_ENDIAN for ram device
qmp-events: fix GUEST_PANICKED description formatting
qapi: flatten GuestPanicInformation union
vmxcap: update for September 2016 SDM
vmxcap: port to Python 3
KVM: use KVM_CAP_IMMEDIATE_EXIT
kvm: use atomic_read/atomic_set to access cpu->exit_request
KVM: move SIG_IPI handling to kvm-all.c
KVM: do not use sigtimedwait to catch SIGBUS
KVM: remove kvm_arch_on_sigbus
cpus: reorganize signal handling code
KVM: x86: cleanup SIGBUS handlers
cpus: remove ugly cast on sigbus_handler
cpu-exec: remove unnecessary check of cpu->exit_request
replay: check icount in cpu exec loop
...
Paolo Bonzini [Thu, 2 Mar 2017 21:49:41 +0000 (22:49 +0100)]
memory: show region offset and ROM/RAM type in "info mtree -f"
"info mtree -f" output is currently hard to use for large RAM regions, because
there is no hint as to what part of the region is being mapped. Add the offset
if it is nonzero.
Secondly, FlatView has a readonly field, that can override the MemoryRegion
in the presence of aliases. Take it into account.
Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU
due to a disagreement about SM (System management) interrupts.
2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI
and this gets into the migration stream, but on 2.3.0 it
never got delivered.
~2.4.0 SMI interrupt support was added but was broken - so
that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI
but never actually caused an interrupt.
The SMI delivery was recently fixed by 68c6efe07a, but the
effect now is that an incoming 2.3.0 stream takes the interrupt it
had flagged but it's bios can't actually handle it(I think
partly due to the original interrupt not being taken during boot?).
The consequence is a triple(?) fault and a reboot.
Li Qiang [Tue, 21 Feb 2017 08:18:27 +0000 (00:18 -0800)]
spice-char: fix segfault in char_spice_finalize
In 'qemu_chr_open_spice_vmc' if the 'psubtype' is NULL, it will
call 'char_spice_finalize'. But as the SpiceChardev is not inserted
in the 'spice_chars' list, the 'QLIST_REMOVE' will cause a segfault.
Add a detect to avoid it.
Hervé Poussineau [Mon, 20 Feb 2017 20:41:19 +0000 (21:41 +0100)]
vl: disable default cdrom when using explicitely scsi-hd
In commit af6bf1328ef90fae617857c02697e0174b84d596 (May 2011),
ide-hd, ide-cd and scsi-cd have been added to disable default cdrom,
"or else you can't put one on secondary master without -nodefaults".
Make it the same for scsi-hd, so you can put one on scsi-id 2 without
using -nodefaults.
scsi-hd has probably been forgotten, as it has been added in the
preceding commit (b443ae67130d32ad06b06fc9aa6d04d05ccd93ce).
Affected users are the ones using a machine with SCSI devices and start QEMU
with -device scsi-hd but without -device scsi-cd or -cdrom
In that case, the default cdrom device will disappear instead of being empty.
Yongji Xie [Mon, 27 Feb 2017 04:52:44 +0000 (12:52 +0800)]
memory: Introduce DEVICE_HOST_ENDIAN for ram device
At the moment ram device's memory regions are DEVICE_NATIVE_ENDIAN. It's
incorrect. This memory region is backed by a MMIO area in host, so the
uint64_t data that MemoryRegionOps read from/write to this area should be
host-endian rather than target-endian. Hence, current code does not work
when target and host endianness are different which is the most common case
on PPC64. To fix it, this introduces DEVICE_HOST_ENDIAN for the ram device.
This has been tested on PPC64 BE/LE host/guest in all possible combinations
including TCG.
Paolo Bonzini [Wed, 8 Feb 2017 12:52:50 +0000 (13:52 +0100)]
KVM: use KVM_CAP_IMMEDIATE_EXIT
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
a VCPU out of KVM_RUN through a POSIX signal. A signal is attached
to a dummy signal handler; by blocking the signal outside KVM_RUN and
unblocking it inside, this possible race is closed:
VCPU thread service thread
--------------------------------------------------------------
check flag
set flag
raise signal
(signal handler does nothing)
KVM_RUN
However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
tsk->sighand->siglock on every KVM_RUN. This lock is often on a
remote NUMA node, because it is on the node of a thread's creator.
Taking this lock can be very expensive if there are many userspace
exits (as is the case for SMP Windows VMs without Hyper-V reference
time counter).
KVM_CAP_IMMEDIATE_EXIT provides an alternative, where the flag is
placed directly in kvm_run so that KVM can see it:
VCPU thread service thread
--------------------------------------------------------------
raise signal
signal handler
set run->immediate_exit
KVM_RUN
check run->immediate_exit
The previous patches changed QEMU so that the only blocked signal is
SIG_IPI, so we can now stop using KVM_SET_SIGNAL_MASK and sigtimedwait
if KVM_CAP_IMMEDIATE_EXIT is available.
On a 14-VCPU guest, an "inl" operation goes down from 30k to 6k on
an unlocked (no BQL) MemoryRegion, or from 30k to 15k if the BQL
is involved.
Paolo Bonzini [Wed, 8 Feb 2017 11:48:54 +0000 (12:48 +0100)]
KVM: do not use sigtimedwait to catch SIGBUS
Call kvm_on_sigbus_vcpu asynchronously from the VCPU thread.
Information for the SIGBUS can be stored in thread-local variables
and processed later in kvm_cpu_exec.
Paolo Bonzini [Thu, 9 Feb 2017 09:04:34 +0000 (10:04 +0100)]
KVM: remove kvm_arch_on_sigbus
Build it on kvm_arch_on_sigbus_vcpu instead. They do the same
for "action optional" SIGBUSes, and the main thread should never get
"action required" SIGBUSes because it blocks the signal.
Paolo Bonzini [Thu, 9 Feb 2017 08:50:02 +0000 (09:50 +0100)]
cpus: reorganize signal handling code
Move the KVM "eat signals" code under CONFIG_LINUX, in preparation
for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately,
without passing it to KVM.
Paolo Bonzini [Wed, 8 Feb 2017 12:22:12 +0000 (13:22 +0100)]
cpus: remove ugly cast on sigbus_handler
The cast is there because sigbus_handler is invoked via sigfd_handler.
But it feels just wrong to use struct qemu_signalfd_siginfo in the
prototype of a function that is passed to sigaction.
Instead, do a simple-minded conversion of qemu_signalfd_siginfo to
siginfo_t.
Peter Maydell [Fri, 3 Mar 2017 14:04:27 +0000 (14:04 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/submodule-update-20170303' into staging
submodule updates (SLOF & dtc) 2017-03-03
This set of patches updates the SLOF and dtc submodules for qemu-2.9.
The SLOF update could have gone in my ppc pull request earlier today,
but I forgot it. It should be safe to apply in either order with that
set though.
The dtc (and libfdt) update brings us up to dtc 1.4.3 which includes
some things that will be useful in future.
Peter Maydell [Fri, 3 Mar 2017 12:48:42 +0000 (12:48 +0000)]
dtc: Revert unintentional submodule downgrade from commit 077dd74239a99
Commit 077dd74239a99 inadvertently downgraded the 'dtc' submodule,
undoing the increment added in commit 6e85fce0225f. Revert this,
returning the submodule state to where we should be.
Peter Maydell [Fri, 3 Mar 2017 10:09:03 +0000 (10:09 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes, features
virtio support for region caches broke a bunch of stuff - fixing most of
it though it's not ideal. Still pondering the right way to fix it.
New: VM gen ID and hotplug for PXB.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 02 Mar 2017 06:19:17 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
hw/pxb-pcie: fix PCI Express hotplug support
tests/acpi: update DSDT after last patch
acpi: simplify _OSC
virtio: unbreak virtio-pci with IOMMU after caching ring translations
virtio: add missing region cache init in virtio_load()
virtio: invalidate memory in vring_set_avail_event()
virtio: guard vring access when setting notification
virtio: check for vring setup in virtio_queue_empty
MAINTAINERS: Add VM Generation ID entries
tests: Move reusable ACPI code into a utility file
qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands
ACPI: Add Virtual Machine Generation ID support
ACPI: Add vmgenid blob storage to the build tables
docs: VM Generation ID device description
linker-loader: Add new 'write pointer' command
David Gibson [Fri, 3 Mar 2017 03:54:38 +0000 (14:54 +1100)]
Update dtc submodule to v1.4.3
Since the last submodule update (which was v1.4.2) dtc and libfdt have
gained some features which would be useful in qemu. There's now a v1.4.3
upstream release, so update our submodule to point to it.
> qemu-bootlist: Take the "-boot strict=off" setting properly into account
> virtio-scsi: initialize vring avail queue buffers
> virtio: Remove global variables in block and 9p driver
> Remove superfluous checkpoints in tree.fs
> Provide "write" function in the disk-label package
> virtio: Implement block write support
> scsi: Add SCSI block write support
> deblocker: Add a 'write' function
> virtio-scsi: Fix descriptor order for SCSI WRITE commands
> board-qemu: Add a possibility to use hvterm input instead of USB keyboard
> Do not try to use virtio-gpu in VGA mode
> virtio: Fix stack comment of virtio-blk-read
> envvar: Do not read default values for /options from the NVRAM anymore
> envvar: Set properties in /options during "(set-defaults)"
Sam Bobroff [Thu, 2 Mar 2017 05:38:56 +0000 (16:38 +1100)]
spapr: Small cleanup of PPC MMU enums
The PPC MMU types are sometimes treated as if they were a bit field
and sometime as if they were an enum which causes maintenance
problems: flipping bits in the MMU type (which is done on both the 1TB
segment and 64K segment bits) currently produces new MMU type
values that are not handled in every "switch" on it, sometimes causing
an abort().
This patch provides some macros that can be used to filter out the
"bit field-like" bits so that the remainder of the value can be
switched on, like an enum. This allows removal of all of the
"degraded" types from the list and should ease maintenance.
David Gibson [Wed, 1 Mar 2017 05:23:12 +0000 (16:23 +1100)]
spapr_pci: Advertise access to PCIe extended config space
The (paravirtual) PCI host bridge on the 'pseries' machine in most
regards acts like a regular PCI bus, rather than a PCIe bus. Despite
this, though, it does allow access to the PCIe extended config space.
We already implemented the RTAS methods to allow this access.. but
forgot to put the markers into the device tree so that guest's know it
is there. This adds them in.
With this, a pseries guest is able to view extended config space on
(for example an e1000e device. This should be enough to allow guests
to use at least some PCIe devices.
target/ppc: Rework hash mmu page fault code and add defines for clarity
The hash mmu page fault handling code is responsible for generating ISIs
and DSIs when access permissions cause an access to fail. Part of this
involves setting the srr1 or dsisr registers to indicate what causes the
access to fail. Add defines for the bit fields of these registers and
rework the code to use these new defines in order to improve readability
and code clarity.
While we're here, update what is logged when an access fails to include
information as to what caused to access to fail for debug purposes.
Signed-off-by: Suraj Jitindar Singh <[email protected]>
[dwg: Moved constants to cpu.h since they're not MMUv3 specific] Signed-off-by: David Gibson <[email protected]>
target/ppc: Move no-execute and guarded page checking into new function
A pte entry has bit fields which can be used to make a page no-execute or
guarded, if either of these bits are set then an instruction access to this
page will fail. Currently these bits are checked with the pp_prot function
however the ISA specifies that the access authority controlled by the
key-pp value pair should only be checked on an instruction access after
the no-execute and guard bits have already been verified to permit the
access.
Move the no-execute and guard bit checking into a new separate function.
Note that we can remove the check for the no-execute bit in the slb entry
since this check was already performed above when we obtained the slb
entry.
In the event that the no-execute or guard bits are set, an ISI should be
generated with the SRR1_NOEXEC_GUARD (0x10000000) bit set in srr1. Add a
define for this for clarity.
Signed-off-by: Suraj Jitindar Singh <[email protected]>
[dwg: Move constants to cpu.h since they're not MMUv3 specific] Signed-off-by: David Gibson <[email protected]>
target/ppc: Add execute permission checking to access authority check
Basic storage protection defines various access authority permissions
based on a slb storage key and pte pp value pair. This access authority
defines read, write and execute permissions however currently we only
use this to control read and write permissions and ignore the execute
control.
Fix the code to allow execute permissions based on the key-pp value pair.
Execute is allowed under the same conditions which enable reads.
(i.e. read permission -> execute permission)
The instruction authority mask register (IAMR) can be used to restrict
permissions for instruction fetch accesses on a per key basis for each
of 32 different key values. Access permissions are derived based on the
specific key value stored in the relevant page table entry.
The IAMR was introduced in, and is present in processors since, POWER8
(ISA v2.07). Thus introduce a function to check access permissions based
on the pte key value and the contents of the IAMR when handling a page
fault to ensure sufficient access permissions for an instruction fetch.
A hash pte contains a key value in bits 2:3|52:54 of the second double word
of the pte, this key value gives an index into the IAMR which contains 32
2-bit access masks. If the least significant bit of the 2-bit access mask
corresponding to the given key value is set (IAMR[key] & 0x1 == 0x1) then
the instruction fetch is not permitted and an ISI is generated accordingly.
While we're here, add defines for the srr1 bits to be set for the ISI for
clarity.
Least significant bit of the access mask is set, thus the instruction fetch
is not permitted. We should generate an instruction storage interrupt (ISI)
with bit 42 of SRR1 set to indicate access precluded by virtual page class
key protection.
Signed-off-by: Suraj Jitindar Singh <[email protected]>
[dwg: Move new constants to cpu.h, since they're not MMUv3 specific] Signed-off-by: David Gibson <[email protected]>
target/ppc/POWER9: Add cpu_has_work function for POWER9
The cpu has work function is used to mask interrupts used to determine
if there is work for the cpu based on the LPCR. Add a function to do this
for POWER9 and add it to the POWER9 cpu definition. This is similar to that
for POWER8 except using the LPCR bits as defined for POWER9.
Add a pa-features definition which includes all of the new fields which
have been added, note we don't claim support for any of these new features
at this stage.
target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
POWER9 doesn't have a storage description register 1 (SDR1) which is used
to store the base and size of the hash table. Thus we don't need to
generate this register on the POWER9 cpu model. While we're here, the
register generation code for 970, POWER5+, POWER<7/8/9> in general is a
mess where we call a generic function from a model specific function which
then attempts to call model specific functions, so rework this for
readability.
We update ppc_cpu_dump_state so that "info registers" will only display
the value of sdr1 if the register has been generated.
As mentioned above the register generation for the pcc->init_proc
function for 970, POWER5+, POWER7, POWER8 and POWER9 has been reworked
for improved clarity. Instead of calling init_proc_book3s_64 which then
attempts to generate the correct registers through a mess of if statements,
we remove this function and instead call the appropriate register
generation functions directly. This follows the register generation model
used for earlier cpu models (pre-970) whereby cpu specific registers are
generated directly in the init_proc function and makes it easier to
add/remove specific registers for new cpu models.
ISA v3.00 adds the idea of a partition table which is used to store the
address translation details for all partitions on the system. The partition
table consists of double word entries indexed by partition id where the second
double word contains the location of the process table in guest memory. The
process table is registered by the guest via a h-call.
We need somewhere to store the address of the process table so we add an entry
to the sPAPRMachineState struct called patb_entry to represent the second
doubleword of a single partition table entry corresponding to the current
guest. We need to store this value so we know if the guest is using radix or
hash translation and the location of the corresponding process table in guest
memory. Since we only have a single guest per qemu instance, we only need one
entry.
Since the partition table is technically a hypervisor resource we require that
access to it is abstracted by the virtual hypervisor through the get_patbe()
call. Currently the value of the entry is never set (and thus
defaults to 0 indicating hash), but it will be required to both implement
POWER9 kvm support and tcg radix support.
We also add this field to be migrated as part of the sPAPRMachineState as we
will need it on the receiving side as the guest will never tell us this
information again and we need it to perform translation.
David Gibson [Thu, 2 Mar 2017 01:13:07 +0000 (12:13 +1100)]
target/ppc/POWER9: Add POWERPC_MMU_V3 bit
For easier handling of future processors using the POWER9 or something
close to it, add a new bit in the MMU model. This was originally from a
revised version of 86cf1e9 "target/ppc/POWER9: Add ISAv3.00 MMU definition"
but the older version of the patch was already merged. This makes the
change on top of the original version.
David Gibson [Thu, 2 Mar 2017 04:29:31 +0000 (15:29 +1100)]
powernv: Don't test POWER9 CPU yet
A couple of tests for the work-in-progress 'powernv' machine type attempt
to test on POWER9 CPUs. However the POWER9 CPU support is incomplete and
this doesn't really work. In particular the firmware image we have
currently assumes the presence of the SDR1 register, which no longer exists
on POWER9. We only got away with this so far, because of a different bug
which added SDR1 to POWER9 even though it shouldn't be there.
For now, remove POWER9 testing of powernv, POWER8 testing will do for now
until the POWER9 support is more complete.
exec, kvm, target-ppc: Move getrampagesize() to common code
getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.
However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.
This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.
compat_table contains the list of logical pvr compat modes which a cpu can
operate in. It is a list of struct CompatInfo which contains the given pvr
value for a compat mode, the pcr bits which should be set to operate in
that compat mode, the pcr level which must be present in pcr_supported for
a processor to support that compat mode and the max threads possible in
that compat mode.
Add an entry for the POWER9/ISAv3.00 logical pvr which represents a
processor running with support for logical pvr 0x0f000005. A processor
running in this mode should have PCR_COMPAT_3_00 set in the pcr (if
available in pcr_mask) and should have PCR_COMPAT_3_00 in pcr_supported
to indicate that it is capable of running in this compat mode.
Also add PCR_COMPAT_3_00 to the bits which must be set for all previous
compat modes. Since no processor models contain this bit yet in pcr_mask
it will never be set, but this ensures we don't forget to in the future.
* remotes/cody/tags/block-pull-request:
block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
block/rbd: add blockdev-add support
block/rbd: parse all options via bdrv_parse_filename
block/rbd: add all the currently supported runtime_opts
block/rbd: don't copy strings in qemu_rbd_next_tok()
Peter Maydell [Thu, 2 Mar 2017 17:39:12 +0000 (17:39 +0000)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170228a' into staging
Migration pull
Note: The 'postcopy: Update userfaultfd.h header' is part of
Paolo's header update and will disappear if applied after it.
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
# gpg: Signature made Tue 28 Feb 2017 12:38:34 GMT
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170228a: (27 commits)
postcopy: Add extra check for COPY function
postcopy: Add doc about hugepages and postcopy
postcopy: Check for userfault+hugepage feature
postcopy: Update userfaultfd.h header
postcopy: Allow hugepages
postcopy: Send whole huge pages
postcopy: Mask fault addresses to huge page boundary
postcopy: Load huge pages in one go
postcopy: Use temporary for placing zero huge pages
postcopy: Plumb pagesize down into place helpers
postcopy: Record largest page size
postcopy: enhance ram_block_discard_range for hugepages
exec: ram_block_discard_range
postcopy: Chunk discards for hugepages
postcopy: Transmit and compare individual page sizes
postcopy: Transmit ram size summary word
migration: fix use-after-free of to_dst_file
migration: Update docs to discourage version bumps
migration: fix id leak regression
migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable
...
Peter Maydell [Thu, 2 Mar 2017 15:25:37 +0000 (15:25 +0000)]
Merge remote-tracking branch 'remotes/elmarco/tags/leak-pull-request' into staging
# gpg: Signature made Wed 01 Mar 2017 09:02:53 GMT
# gpg: using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <[email protected]>"
# gpg: aka "Marc-André Lureau <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/leak-pull-request: (28 commits)
tests: fix virtio-blk-test leaks
tests: add specialized device_find function
tests: fix usb-test leaks
tests: allows to run single test in usb-hcd-ehci-test
usb: release the created buses
bus: do not unref hotplug handler
tests: fix virtio-9p-test leaks
tests: fix virtio-scsi-test leak
tests: fix e1000e leaks
tests: fix i440fx-test leaks
tests: fix e1000-test leak
tests: fix tco-test leaks
tests: fix eepro100-test leak
pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice
tests: fix ipmi-bt-test leak
tests: fix ipmi-kcs-test leak
tests: fix bios-tables-test leak
tests: fix hd-geo-test leaks
tests: fix ide-test leaks
tests: fix vhost-user-test leaks
...
Peter Maydell [Thu, 2 Mar 2017 13:50:54 +0000 (13:50 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170301' into staging
ppc patch queue for 2017-03-01
I was hoping to get this pull request squeezed in before the soft
freeze, but I ran into some difficulties during testing. Everything
here was at least posted before the soft freeze, so I'm hoping we can
still merge it for 2.9.
The biggest things here are:
* Cleanups to handling of hashed page tables, that will make
adding support for the POWER9 MMU easier
* Cleanups to the XICS interrupt controller that will make
implementing the powernv machine easier
* TCG implementation of extended overflow and carry handling for
POWER9
It also includes:
* Increasing the CPU limit for pseries to 1024 vCPUs
* Generating proper OF node names in qemu (making hotplug and
coldplug logic closer together)
* remotes/dgibson/tags/ppc-for-2.9-20170301: (50 commits)
Add PowerPC 32-bit guest memory dump support
ppc/xics: rename 'ICPState *' variables to 'icp'
ppc/xics: move InterruptStatsProvider to the sPAPR machine
ppc/xics: move ics-simple post_load under the machine
ppc/xics: remove the XICSState classes
ppc/xics: export the XICS init routines
ppc/xics: move the ICP array under the sPAPR machine
ppc/xics: register the reset handler of ICP objects
ppc/xics: simplify spapr_dt_xics() interface
ppc/xics: use the QOM interface to grab an ICP
ppc/xics: move the cpu_setup() handler under the ICPState class
ppc/xics: simplify the cpu_setup() handler
ppc/xics: move kernel_xics_fd out of KVMXICSState
ppc/xics: extend the QOM interface to handle ICPs
ppc/xics: remove the XICS list of ICS
ppc/xics: register the reset handler of ICS objects
ppc/xics: remove xics_find_source()
ppc/xics: use the QOM interface to resend irqs
ppc/xics: use the QOM interface to get irqs
ppc/xics: use the QOM interface under the sPAPR machine
...
Peter Maydell [Thu, 2 Mar 2017 11:18:01 +0000 (11:18 +0000)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
x86 queue, 2017-02-27
"-cpu max" and query-cpu-model-expansion support for x86. This
should be the last x86 pull request before 2.9 soft freeze.
# gpg: Signature made Mon 27 Feb 2017 16:24:15 GMT
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
i386: Improve query-cpu-model-expansion full mode
i386: Implement query-cpu-model-expansion QMP command
i386: Define static "base" CPU model
i386: Don't set CPUClass::cpu_def on "max" model
i386: Make "max" model not use any host CPUID info on TCG
i386: Create "max" CPU model
qapi-schema: Comment about full expansion of non-migration-safe models
i386: Reorganize and document CPUID initialization steps
i386: Rename X86CPU::host_features to X86CPU::max_features
i386: Add ordering field to CPUClass
i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
Marcel Apfelbaum [Tue, 28 Feb 2017 14:13:29 +0000 (16:13 +0200)]
hw/pxb-pcie: fix PCI Express hotplug support
Add the missing osc method for pxb-pcie devices as APCI spec recommends,
see 6.2.9.1 OSC Implementation Example for PCI Host Bridge Devices, ACPI 3.0a:
It is recommended that a machine with multiple host bridge devices
should report the same capabilities for all host bridges, and also
negotiate control of the features described in the Control Field in
the same way for all host bridges.
Our _OSC method has a bunch of unused code loading data
into external CTRL and SUPP fields which are then never
used. Drop this in favor of a single local variable.
Jason Wang [Wed, 1 Mar 2017 04:10:40 +0000 (12:10 +0800)]
virtio: unbreak virtio-pci with IOMMU after caching ring translations
Commit c611c76417f5 ("virtio: add MemoryListener to cache ring
translations") registers a memory listener to dma_as. This may not
work when IOMMU is enabled: dma_as(bus_master_as) were initialized in
pcibus_machine_done() after virtio_realize(). This will cause a
segfault. Fixing this by using pci_device_iommu_address_space()
instead to make sure address space were initialized at this time.
With this fix, IOMMU device were required to be initialized before any
virtio-pci devices.
Stefan Hajnoczi [Wed, 22 Feb 2017 16:37:34 +0000 (16:37 +0000)]
virtio: add missing region cache init in virtio_load()
Commit 97cd965c070152bc626c7507df9fb356bbe1cd81 ("virtio: use
VRingMemoryRegionCaches for avail and used rings") switched to a memory
region cache to avoid repeated map/unmap operations.
The virtio_load() process is a little tricky because vring addresses are
serialized in two separate places. VIRTIO 1.0 devices serialize desc
and then a subsection with used and avail. Legacy devices only
serialize desc.
Live migration of VIRTIO 1.0 devices fails on the destination host with:
VQ 0 size 0x80 < last_avail_idx 0x12f8 - used_idx 0x0
Failed to load virtio-blk:virtio
error while loading state for instance 0x0 of device '0000:00:04.0/virtio-blk'
This happens because the memory region cache is only initialized after
desc is loaded and not after the used and avail subsection is loaded.
If the guest chose memory addresses that don't match the legacy ring
layout then the wrong guest memory location is accessed.
Wait until all ring addresses are known before trying to initialize the
region cache. Also clarify the incomplete comment about VIRTIO-1 ring
address subsection.
Cornelia Huck [Wed, 1 Mar 2017 17:58:52 +0000 (18:58 +0100)]
virtio: guard vring access when setting notification
Switching to vring caches exposed an existing bug in
virtio_queue_set_notification(): We can't access vring structures
if they have not been set up yet. This may happen, for example,
for virtio-blk devices with multiple queues: The code will try to
switch notifiers for every queue, but the guest may have only set up
a subset of them.
Fix this by guarding access to the vring memory by checking for
vring.desc. The first aio poll will iron out any remaining
inconsistencies for later-configured queues (buggy legacy drivers).
Paolo Bonzini [Thu, 23 Feb 2017 08:51:30 +0000 (09:51 +0100)]
virtio: check for vring setup in virtio_queue_empty
If the vring has not been set up, there is nothing in the virtqueue.
virtio_queue_host_notifier_aio_poll calls virtio_queue_empty even in
this case; we have to filter it out just like virtio_queue_notify_aio_vq.
Ben Warren [Thu, 16 Feb 2017 23:15:36 +0000 (15:15 -0800)]
ACPI: Add Virtual Machine Generation ID support
This implements the VM Generation ID feature by passing a 128-bit
GUID to the guest via a fw_cfg blob.
Any time the GUID changes, an ACPI notify event is sent to the guest
The user interface is a simple device with one parameter:
- guid (string, must be "auto" or in UUID format
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
Ben Warren [Thu, 16 Feb 2017 23:15:35 +0000 (15:15 -0800)]
ACPI: Add vmgenid blob storage to the build tables
This allows them to be centrally initialized and destroyed
The "AcpiBuildTables.vmgenid" array will be used to construct the
"etc/vmgenid_guid" fw_cfg blob.
Its contents will be linked into fw_cfg after being built on the
pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped
without use on the subsequent, guest triggered, acpi_build_update() ->
acpi_build() call path.
Ben Warren [Thu, 16 Feb 2017 23:15:33 +0000 (15:15 -0800)]
linker-loader: Add new 'write pointer' command
This is similar to the existing 'add pointer' functionality, but instead
of instructing the guest (BIOS or UEFI) to patch memory, it instructs
the guest to write the pointer back to QEMU via a writeable fw_cfg file.
Jeff Cody [Mon, 27 Feb 2017 17:36:46 +0000 (12:36 -0500)]
block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
This adds support for three additional options that may be specified
by QAPI in blockdev-add:
server: host, port
auth method: either 'cephx' or 'none'
The "server" and "auth-supported" QAPI parameters are arrays. To conform
with the rados API, the array items are join as a single string with a ';'
character as a delimiter when setting the configuration values.
Peter Maydell [Wed, 1 Mar 2017 23:09:46 +0000 (23:09 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (46 commits)
block: Add Error parameter to bdrv_append()
block: Add Error parameter to bdrv_set_backing_hd()
block: Assertions for resize permission
block: Assertions for write permissions
block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
tests: Remove FIXME comments
nbd/server: Use real permissions for NBD exports
migration/block: Use real permissions
hmp: Request permissions in qemu-io
commit: Add filter-node-name to block-commit
mirror: Add filter-node-name to blockdev-mirror
stream: Use real permissions in streaming block job
mirror: Use real permissions in mirror/active commit block job
blockjob: Factor out block_job_remove_all_bdrv()
block: Allow backing file links in change_parent_backing_link()
block: BdrvChildRole.attach/detach() callbacks
block: Fix pending requests check in bdrv_append()
backup: Use real permissions in backup block job
commit: Use real permissions for HMP 'commit'
commit: Use real permissions in commit block job
...
Peter Maydell [Wed, 1 Mar 2017 20:33:47 +0000 (20:33 +0000)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170228-tag' into staging
Xen 2017/02/28
# gpg: Signature made Tue 28 Feb 2017 19:13:08 GMT
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <[email protected]>"
# gpg: aka "Stefano Stabellini <[email protected]>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20170228-tag:
Add a new qmp command to do checkpoint, query xen replication status
Add a new qmp command to start/stop replication
Peter Maydell [Wed, 1 Mar 2017 17:58:53 +0000 (17:58 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170228-1' into staging
target-arm queue:
* raspi2: add gpio controller and sdhost controller, with
the wiring so the guest can switch which controller the
SD card is attached to
(this is sufficient to get raspbian kernels to boot)
* GICv3: support state save/restore from KVM
* update Linux headers to 4.11
* refactor and QOMify the ARMv7M container object
* remotes/pmaydell/tags/pull-target-arm-20170228-1: (21 commits)
bcm2835: add sdhost and gpio controllers
bcm2835_gpio: add bcm2835 gpio controller
hw/sd: add card-reparenting function
qdev: Have qdev_set_parent_bus() handle devices already on a bus
hw/intc/arm_gicv3_kvm: Reset GICv3 cpu interface registers
target-arm: Add GICv3CPUState in CPUARMState struct
hw/intc/arm_gicv3_kvm: Implement get/put functions
hw/intc/arm_gicv3_kvm: Add ICC_SRE_EL1 register to vmstate
update Linux headers to 4.11
update-linux-headers: update for 4.11
stm32f205: Rename 'nvic' local to 'armv7m'
stm32f205: Create armv7m object without using armv7m_init()
armv7m: Split systick out from NVIC
armv7m: Don't put core v7M devices under CONFIG_STELLARIS
armv7m: Make bitband device take the address space to access
armv7m: Make NVIC expose a memory region rather than mapping itself
armv7m: Make ARMv7M object take memory region link
armv7m: Use QOMified armv7m object in armv7m_init()
armv7m: QOMify the armv7m container
armv7m: Move NVICState struct definition into header
...
Thomas Huth [Tue, 31 Jan 2017 08:46:38 +0000 (09:46 +0100)]
audio/sdlaudio: Allow audio playback with SDL2
When compiling with SDL2, the semaphore trick used in sdlaudio.c
does not work - QEMU locks up completely in this case. To avoid
the hang and get at least some audio playback up and running (it's
a little bit crackling, but better than nothing), we can use the
SDL locking functions SDL_LockAudio() and SDL_UnlockAudio() to sync
with the sound playback thread instead.
Pavel Dovgalyuk [Tue, 14 Feb 2017 07:15:10 +0000 (10:15 +0300)]
audio: make audio poll timer deterministic
This patch changes resetting strategy of the audio polling timer.
It does not change expiration time if the timer is already set.
This patch is needed to make this timer deterministic and to use execution
record/replay for audio devices.
audio_reset_timer is used in the function audio_vm_change_state_handler.
Therefore every time VM is stopped or restarted the timer will be reset
to new timeout. Virtual clock does not proceed while VM is stopped.
Therefore there is no need in resetting the timeout when VM restarts.
v2: updated commit message
v3: now using timer_mod_anticipate function (as suggested by Yurii Zubrytskyi)
Peter Maydell [Wed, 1 Mar 2017 13:53:20 +0000 (13:53 +0000)]
Merge remote-tracking branch 'remotes/gkurz/tags/cve-2016-9602-for-upstream' into staging
This pull request have all the fixes for CVE-2016-9602, so that it can
be easily picked up by downstreams, as suggested by Michel Tokarev.
# gpg: Signature made Tue 28 Feb 2017 10:21:32 GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Greg Kurz <[email protected]>"
# gpg: aka "Gregory Kurz (Groug) <[email protected]>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
Andrea Bolognani [Fri, 17 Feb 2017 10:14:39 +0000 (11:14 +0100)]
mach-virt: Provide sample configuration files
These are very much like the sample configuration files
for q35, and can be used both as documentation and as
a starting point for creating your own guest.
Two sample configuration files are provided:
* mach-virt-graphical.cfg can be used to start a
fully-featured (USB, graphical console, etc.)
guest that uses VirtIO devices;
* mach-virt-serial.cfg is similar but has a minimal
set of devices and uses the serial console.
All configuration files are fully commented and neatly
organized.
Andrea Bolognani [Fri, 17 Feb 2017 10:14:38 +0000 (11:14 +0100)]
q35: Improve sample configuration files
Instead of having a single sample configuration file,
we now have several:
* q35-emulated.cfg documents the default devices QEMU
adds to a q35 guest and the additional devices that
are pretty much guaranteed to be present in a
physical q35-based machine;
* q35-virtio-graphical.cfg can be used to start a
fully-featured (USB, graphical console, audio, etc.)
guest that uses VirtIO instead of emulated devices;
* q35-virtio-serial.cfg is similar but has a minimal
set of devices and uses the serial console.
All configuration files are fully commented and neatly
organized.
Peter Maydell [Wed, 1 Mar 2017 13:06:00 +0000 (13:06 +0000)]
Merge remote-tracking branch 'remotes/famz/tags/docker-pull-request' into staging
# gpg: Signature made Tue 28 Feb 2017 12:40:00 GMT
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
Apparently, none of the bus owner give a reference to the hotplug
handler property, do not unref it on bus release.
Furthermore, a bus is allowed to be its own hotplug handler, which can
be seen in qbus_set_bus_hotplug_handler() function. However, in this
case, the reference can't be given to the property, or this will create
a cyclic dependency and the bus will never be free.
Each bus owner should manage the lifecycle of the hotplug handler.