I had XBZRLE here also, but it don't need extra resources on
destination, only on source. Additionally libvirt don't enable it on
destination, so don't put it here.
- initialize invalid_flags at declaration time.
- remove extra space (peter)
Peter Xu [Wed, 14 Jun 2017 07:55:58 +0000 (15:55 +0800)]
migration: fix incorrect enable return path
0425dc9 is actually v1 of that patch, but it was accidentally
merged (while there was a v2). That will cause problem when we try to
migrate to some old QEMUs when return path is not really there. Let's
fix it, then squashing this patch with 0425dc9 will be exactly patch
content of v2.
Peter Maydell [Tue, 13 Jun 2017 14:49:07 +0000 (15:49 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170613' into staging
target-arm queue:
* vITS: Support save/restore
* timer/aspeed: Fix timer enablement when reload is not set
* aspped: add temperature sensor device
* timer.h: Provide better monotonic time on ARM hosts
* exynos4210: various cleanups
* exynos4210: support system poweroff
* remotes/pmaydell/tags/pull-target-arm-20170613:
hw/intc/arm_gicv3_its: Allow save/restore
hw/intc/arm_gicv3_kvm: Implement pending table save
hw/intc/arm_gicv3_its: Implement state save/restore
kvm-all: Pass an error object to kvm_device_access
timer/aspeed: fix timer enablement when a reload is not set
aspeed: add a temp sensor device on I2C bus 3
hw/misc: add a TMP42{1, 2, 3} device model
timer.h: Provide better monotonic time
hw/misc/exynos4210_pmu: Add support for system poweroff
hw/intc/exynos4210_gic: Constify array of combiner interrupts
hw/arm/exynos: Use type define instead of hard-coded a9mpcore_priv string
hw/arm/exynos: Declare local variables in some order
hw/arm/exynos: Move DRAM initialization next boards
hw/timer/exynos4210_mct: Remove unused defines
hw/timer/exynos4210_mct: Cleanup indentation and empty new lines
hw/timer/exynos4210_mct: Fix checkpatch style errors
hw/intc/exynos4210_gic: Use more meaningful name for local variable
Eric Auger [Tue, 13 Jun 2017 13:57:01 +0000 (14:57 +0100)]
hw/intc/arm_gicv3_its: Allow save/restore
We change the restoration priority of both the GICv3 and ITS. The
GICv3 must be restored before the ITS and the ITS needs to be restored
before PCIe devices since it translates their MSI transactions.
Eric Auger [Tue, 13 Jun 2017 13:57:00 +0000 (14:57 +0100)]
hw/intc/arm_gicv3_its: Implement state save/restore
We need to handle both registers and ITS tables. While
register handling is standard, ITS table handling is more
challenging since the kernel API is devised so that the
tables are flushed into guest RAM and not in vmstate buffers.
Flushing the ITS tables on device pre_save() is too late
since the guest RAM is already saved at this point.
Table flushing needs to happen when we are sure the vcpus
are stopped and before the last dirty page saving. The
right point is RUN_STATE_FINISH_MIGRATE but sometimes the
VM gets stopped before migration launch so let's simply
flush the tables each time the VM gets stopped.
For regular ITS registers we just can use vmstate pre_save()
and post_load() callbacks.
Eric Auger [Tue, 13 Jun 2017 13:57:00 +0000 (14:57 +0100)]
kvm-all: Pass an error object to kvm_device_access
In some circumstances, we don't want to abort if the
kvm_device_access fails. This will be the case during ITS
migration, in case the ITS table save/restore fails because
the guest did not program the vITS correctly. So let's pass an
error object to the function and return the ioctl value. New
callers will be able to make a decision upon this returned
value.
Existing callers pass &error_abort which will cause the
function to abort on failure.
Cédric Le Goater [Tue, 13 Jun 2017 13:57:00 +0000 (14:57 +0100)]
timer/aspeed: fix timer enablement when a reload is not set
When a timer is enabled before a reload value is set, the controller
waits for a reload value to be set before starting decrementing. This
fix tries to cover that case by changing the timer expiry only when
a reload value is valid.
hw/misc/exynos4210_pmu: Add support for system poweroff
On all Exynos-based boards, the system powers down itself by driving
PS_HOLD signal low - eight bit in PS_HOLD_CONTROL register of PMU.
Handle writing to respective PMU register to fix power off failure:
reboot: Power down
Unable to poweroff system
shutdown: 31 output lines suppressed due to ratelimiting
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
CPU: 0 PID: 1 Comm: shutdown Not tainted 4.11.0-rc8 #846
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c031050c>] (unwind_backtrace) from [<c030ba6c>] (show_stack+0x10/0x14)
[<c030ba6c>] (show_stack) from [<c05b2800>] (dump_stack+0x88/0x9c)
[<c05b2800>] (dump_stack) from [<c03d3140>] (panic+0xdc/0x268)
[<c03d3140>] (panic) from [<c0343614>] (do_exit+0xa90/0xab4)
[<c0343614>] (do_exit) from [<c035f2dc>] (SyS_reboot+0x164/0x1d0)
[<c035f2dc>] (SyS_reboot) from [<c0307c80>] (ret_fast_syscall+0x0/0x3c)
Additionally the initial value of PS_HOLD has to be changed because
recent Linux kernel (v4.12-rc1) uses regmap cache for this access.
When the register is kept at reset value, the kernel will not issue a
write to it. Usually the bootloader sets the eight bit of PS_HOLD high
so mimic its existence here.
hw/timer/exynos4210_mct: Cleanup indentation and empty new lines
Statements under 'case' were in some places wrongly indented bringing
confusion and making the code less readable. Remove also few unneeded
blank lines. No functional changes.
hw/intc/exynos4210_gic: Use more meaningful name for local variable
There are to SysBusDevice variables in exynos4210_gic_realize()
function: one for the device itself and second for arm_gic device. Add
a prefix "gic" to the second one so it will be easier to understand the
code.
While at it, put local uninitialized 'i' variable at the end, next to
other uninitialized ones.
Stefan Hajnoczi [Mon, 5 Jun 2017 10:42:16 +0000 (11:42 +0100)]
monitor: resurrect handle_qmp_command trace event
Commit 104fc3027960dd2aa9d310936a6cb201c60e1088 ("qmp: Drop duplicated
QMP command object checks") removed the call to
trace_handle_qmp_command() while eliminating code duplication.
This patch brings the trace event back so QEMU-internal trace events can
be correlated with the QMP commands that caused them.
# gpg: Signature made Tue 13 Jun 2017 10:01:45 BST
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <[email protected]>"
# gpg: aka "Juan Quintela <[email protected]>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20170613:
migration: Move migration.h to migration/
migration: Move remaining exported functions to migration/misc.h
migration: create global_state.c
migration: ram_control_* are implemented in qemu_file
migration: Commands are only used inside migration.c
migration: Move constants to savevm.h
migration: Move dump_vmsate_json_to_file() to misc.h
migration: Split registration functions from vmstate.h
migration: Move self_announce_delay() to misc.h
migration: Remove MigrationState from migration_channel_incomming()
ram: Now POSTCOPY_ACTIVE is the same that STATUS_ACTIVE
ram: Print block stats also in the complete case
migration: Don't try to set *errp directly
migration: isolate return path on src
Peter Maydell [Tue, 13 Jun 2017 10:56:00 +0000 (11:56 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170609' into staging
ppc patch queue 2017-06-09
This batch contains more patches to rework the pseries machine hotplug
infrastructure, plus an assorted batch of bugfixes.
It contains a start on fixes to restore migration from older machine
types on older versions which was broken by some xics changes. There
are still a few missing pieces here, though.
* remotes/dgibson/tags/ppc-for-2.10-20170609:
Revert "spapr: fix memory hot-unplugging"
xics: drop ICPStateClass::cpu_setup() handler
xics: setup cpu at realize time
xics: pass appropriate types to realize() handlers.
xics: introduce macros for ICP/ICS link properties
hw/cpu: core.c can be compiled as common object
hw/ppc/spapr: Adjust firmware name for PCI bridges
xics: add reset() handler to ICPStateClass
pnv_core: drop reference on ICPState object during CPU realization
spapr: Rework DRC name handling
spapr: Fold spapr_phb_{add,remove}_pci_device() into their only callers
spapr: Change DRC attach & detach methods to functions
spapr: Clean up handling of DR-indicator
spapr: Clean up RTAS set-indicator
spapr: Don't misuse DR-indicator in spapr_recover_pending_dimm_state()
spapr: Clean up DR entity sense handling
pseries: Correct panic behaviour for pseries machine type
spapr: fix memory leak in spapr_memory_pre_plug()
target/ppc: fix memory leak in kvmppc_is_mem_backend_page_size_ok()
target/ppc: pass const string to kvmppc_is_mem_backend_page_size_ok()
Peter Maydell [Tue, 13 Jun 2017 10:14:06 +0000 (11:14 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, vhost: fixes
Some fixes all over the place.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Thu 08 Jun 2017 20:04:24 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
hw/pcie: fix the generic pcie root port to support migration
nvdimm acpi: fix region format interface code
vhost-user-bridge: fix iov_restore_front() warning
Peter Xu [Wed, 31 May 2017 10:35:34 +0000 (18:35 +0800)]
migration: isolate return path on src
There are some places that binded "return path" with postcopy. Let's be
prepared for its usage even without postcopy. This patch mainly did this
on source side.
Peter Maydell [Tue, 13 Jun 2017 08:27:17 +0000 (09:27 +0100)]
Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20170608' into staging
s390x: misc fixes
bunch of fixes
- reject MIDA accesses for CCWs
- cpumodel fixes
- cross-build fix for bios
- migration improvements
# gpg: Signature made Thu 08 Jun 2017 14:10:29 BST
# gpg: using RSA key 0x117BBC80B5A61C7C
# gpg: Good signature from "Christian Borntraeger (IBM) <[email protected]>"
# Primary key fingerprint: F922 9381 A334 08F9 DBAB FBCA 117B BC80 B5A6 1C7C
* remotes/borntraeger/tags/s390x-20170608:
s390x/cpumodel: improve defintion search without an IBC
s390x/cpumodel: take care of the cpuid format bit for KVM
pc-bios/s390-ccw: use STRIP variable in Makefile
s390x/css: fence off MIDA
s390x/css: catch section mismatch on load
Peter Maydell [Mon, 12 Jun 2017 18:26:49 +0000 (19:26 +0100)]
Merge remote-tracking branch 'remotes/elmarco/tags/char-pull-request' into staging
# gpg: Signature made Thu 08 Jun 2017 15:12:11 BST
# gpg: using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <[email protected]>"
# gpg: aka "Marc-André Lureau <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/char-pull-request:
test-char: start a /char/serial test
chardev: don't use alias names in parse_compat()
char: fix alias devices regression
Peter Maydell [Mon, 12 Jun 2017 13:14:42 +0000 (14:14 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Wed 07 Jun 2017 19:06:51 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
configure: split c and cxx extra flags
coroutine-lock: do not touch coroutine after another one has been entered
.gdbinit: load QEMU sub-commands when gdb starts
coccinelle: fix typo in comment
oslib: strip trailing '\n' from error_setg() string argument
Peter Maydell [Mon, 12 Jun 2017 09:43:32 +0000 (10:43 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 09 Jun 2017 12:47:31 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
block: fix external snapshot abort permission error
block/qcow.c: Fix memory leak in qcow_create()
qemu-iotests: Test automatic commit job cancel on hot unplug
commit: Fix use after free in completion
qemu-iotests: Block migration test
migration/block: Clean up BBs in block_save_complete()
migration: Inactivate images after .save_live_complete_precopy()
block: Fix anonymous BBs in blk_root_inactivate()
In qemu_gluster_parse_json(), the call to qdict_array_entries()
could return a negative error code, which we were ignoring
because we assigned the result to an unsigned variable.
Fix this by using the 'int' type instead, which matches the
return type of qdict_array_entries() and also the type
we use for the loop enumeration variable 'i'.
In external_snapshot_abort(), we try to undo what was done in
external_snapshot_prepare() calling bdrv_replace_node() to swap the
nodes back. However, we receive a permissions error as writers are
blocked on the old node, which is now the new node backing file.
An easy fix (initially suggested by Kevin Wolf) is to call
bdrv_set_backing_hd() on the new node, to set the backing node to NULL.
Peter Maydell [Mon, 5 Jun 2017 13:55:54 +0000 (14:55 +0100)]
block/qcow.c: Fix memory leak in qcow_create()
Coverity points out that the code path in qcow_create() for
the magic "fat:" backing file name leaks the memory used to
store the filename (CID 1307771). Free the memory before
we overwrite the pointer.
Kevin Wolf [Fri, 2 Jun 2017 21:04:55 +0000 (23:04 +0200)]
commit: Fix use after free in completion
The final bdrv_set_backing_hd() could be working on already freed nodes
because the commit job drops its references (through BlockBackends) to
both overlay_bs and top already a bit earlier.
One way to trigger the bug is hot unplugging a disk for which
blockdev_mark_auto_del() cancels the block job.
Fix this by taking BDS-level references while we're still using the
nodes.
Kevin Wolf [Mon, 22 May 2017 15:17:49 +0000 (17:17 +0200)]
migration/block: Clean up BBs in block_save_complete()
We need to release any block migrations BlockBackends on the source
before successfully completing the migration because otherwise
inactivating the images will fail (inactivation only tolerates device
BBs).
Kevin Wolf [Mon, 22 May 2017 15:10:38 +0000 (17:10 +0200)]
migration: Inactivate images after .save_live_complete_precopy()
Block migration may still access the image during its
.save_live_complete_precopy() implementation, so we should only
inactivate the image afterwards.
Another reason for the change is that inactivating an image fails when
there is still a non-device BlockBackend using it, which includes the
BBs used by block migration. We want to give block migration a chance to
release the BBs before trying to inactivate the image (this will be done
in another patch).
Conflicts hw/ppc/spapr_drc.c, because get_index() has been renamed
spapr_get_index().
This didn't fix the problem. Once the hotplug has been started
some memory is allocated and some structures are allocated.
We don't free it when we ignore the unplug, and we can't because
they can be in use by the kernel.
Greg Kurz [Thu, 8 Jun 2017 13:43:08 +0000 (15:43 +0200)]
xics: drop ICPStateClass::cpu_setup() handler
The cpu_setup() handler is only implemented by xics_kvm, where it really
does a typical "realize" job. Moreover, the realize() handler is called
shortly after cpu_setup(), on the same path.
This patch converts xics_kvm to implement realize() instead of cpu_setup().
Greg Kurz [Thu, 8 Jun 2017 13:42:59 +0000 (15:42 +0200)]
xics: setup cpu at realize time
Until recently, spapr used to allocate ICPState objects for the lifetime
of the machine. They would only be associated to vCPUs in xics_cpu_setup()
when plugging a CPU core.
Now that ICPState objects have the same lifecycle as vCPUs, it is
possible to associate them during realization.
This patch hence open-codes xics_cpu_setup() in icp_realize(). The vCPU
is passed as a property. Note that vCPU now needs to be realized first
for the IRQs to be allocated. It also needs to resetted before ICPState
realization in order to synchronize with KVM.
Since ICPState objects are freed when unrealized, xics_cpu_destroy() isn't
needed anymore and can be safely dropped.
Greg Kurz [Thu, 8 Jun 2017 13:42:50 +0000 (15:42 +0200)]
xics: pass appropriate types to realize() handlers.
It makes more sense to pass an IPCState * to handlers of ICPStateClass
instead of a DeviceState *, if only to benefit from compile time type
checking. The same goes with ICSStateClass.
While here, we also change the declaration of ICPStateClass in xics.h
for consistency.
Thomas Huth [Thu, 8 Jun 2017 13:18:54 +0000 (15:18 +0200)]
hw/cpu: core.c can be compiled as common object
There does not seem to be any target specific code in core.c, so we can
put it into "common-obj" instead of "obj" to compile it only once for
all targets.
Haozhong Zhang [Wed, 7 Jun 2017 08:06:39 +0000 (16:06 +0800)]
nvdimm acpi: fix region format interface code
Per ACPI 6.2, section 5.2.25.6 and JEDEC Annex L Release 3, the
current region format interface code 0x201 indicates the block
addressed function interface 1, rather than a byte addressable
interface. Fix it by using 0x301 which indicates the byte addressable
no energy backed function interface 1.
CC tests/vhost-user-bridge.o
/home/dgilbert/git/qemu-world3/tests/vhost-user-bridge.c:228:23: warning: variables 'front' and 'iov' used in loop condition not modified in loop body [-Wfor-loop-analysis]
for (cur = front; front != iov; cur++) {
^~~~~ ~~~
1 warning generated.
Fix the loop, document the function, and fix some related assert().
In practice, the loop bug was harmless because the front sg buffer is
enough to discard/restore the header size.
Thomas Huth [Wed, 7 Jun 2017 08:20:27 +0000 (10:20 +0200)]
hw/ppc/spapr: Adjust firmware name for PCI bridges
SLOF uses "pci" as name for PCI bridges nodes in the device tree instead
of "pci-bridges", so booting via bootindex from a device behind a PCI
bridge currently does not work since QEMU passes the wrong name in the
"qemu,boot-list" property. Fix it by changing the name of the PCI bridge
nodes to "pci" instead.
Greg Kurz [Wed, 7 Jun 2017 17:17:00 +0000 (19:17 +0200)]
xics: add reset() handler to ICPStateClass
Taking into account that qemu_set_irq() returns immediatly if its first
argument is NULL, icp_kvm_reset() largely duplicates icp_reset().
This patch introduces a reset() handler, so that the common logic can
be implemented in icp_reset() only.
While there we can also drop icp_kvm_realize() and icp_kvm_unrealize(). This
causes icp-kvm to be realized in icp_realize(), which sets icp->xics, but
it has no impact.
Greg Kurz [Wed, 7 Jun 2017 17:16:52 +0000 (19:16 +0200)]
pnv_core: drop reference on ICPState object during CPU realization
Similarly to what was done to spapr with commit 249127d0dfeb, this patch
ensures that we don't keep an extra reference on the ICPState object. Also
since the object was just created and not reparented yet, the call to
object_property_add_child() should never fail: let's pass &error_abort to
make this clear.
David Gibson [Wed, 7 Jun 2017 02:00:11 +0000 (12:00 +1000)]
spapr: Rework DRC name handling
DRC objects have a get_name method which returns the DRC name generated
when the DRC is created. Replace that with a fixed spapr_drc_name()
function which generates the name on the fly from other information. This
means:
* We get rid of a method with only one implementation, and only local
callers
* We don't have to carry the name string around for the lifetime of the
DRC
* We use information added to the class structure to generate the name
in standard format, so we don't need an explicit switch on drc type
any more
We also eliminate the 'name' property; it's basically useless since the
only information in it can easily be deduced from other things.
David Gibson [Tue, 6 Jun 2017 07:44:11 +0000 (17:44 +1000)]
spapr: Change DRC attach & detach methods to functions
DRC objects have attach & detach methods, but there's only one
implementation. Although there are some differences in its behaviour for
different DRC types, the overall structure is the same, so while we might
want different method implementations for some parts, we're unlikely to
want them for the top-level functions.
David Gibson [Tue, 6 Jun 2017 07:42:26 +0000 (17:42 +1000)]
spapr: Clean up handling of DR-indicator
There are 3 types of "indicator" associated with hotplug in the PAPR spec
the "allocation state", "isolation state" and "DR-indicator". The first
two are intimately tied to the various state transitions associated with
hotplug. The DR-indicator, however, is different and simpler.
It's basically just a guest controlled variable which can be used by the
guest to flag state or problems associated with a device. The idea is that
the hypervisor can use it to present information back on management
consoles (on some machines with PowerVM it may even control physical LEDs
on the machine case associated with the relevant device).
For that reason, there's only ever likely to be a single update
implementation so the set_indicator_state method isn't useful. Replace it
with a direct function call.
While we're there, make some small associated cleanups:
* PAPR doesn't use the term "indicator state", just "DR-indicator" and
the allocation state and isolation state are also considered "indicators".
Rename things to be less confusing
* Fold set_indicator_state() and rtas_set_indicator_state() into a single
rtas_set_dr_indicator() function.
David Gibson [Tue, 6 Jun 2017 07:05:53 +0000 (17:05 +1000)]
spapr: Clean up RTAS set-indicator
In theory the RTAS set-indicator call can be used for a number of
"indicators" defined by PAPR. In practice the only ones we're ever likely
to implement are those used for Dynamic Reconfiguration (i.e. hotplug).
Because of this, the current implementation determines the associated DRC
object, before dispatching based on the type of indicator.
However, this means we also need a check that we're dealing with a DR
related indicator at all, which duplicates some of the logic from the
switch further down.
Even though it means a bit of code duplication, things work out cleaner if
we delegate the DRC lookup to the individual indicator type functions -
and it also allows some further cleanups.
While we're there, remove references to "sensor", a copy/paste artefact
from the related, but distinct "get-sensor" call.
David Gibson [Tue, 6 Jun 2017 07:01:21 +0000 (17:01 +1000)]
spapr: Don't misuse DR-indicator in spapr_recover_pending_dimm_state()
With some combinations of migration and hotplug we can lost temporary state
indicating how many DRCs (guest side hotplug handles) are still connected
to a DIMM object in the process of removal. When we hit that situation
spapr_recover_pending_dimm_state() is used to scan more extensively and
work out the right number.
It does this using drc->indicator state to determine what state of
disconnection the DRC is in. However, this is not safe, because the
indicator state is guest settable - in fact it's more-or-less a purely
guest->host notification mechanism which should have no bearing on the
internals of hotplug state management.
So, replace the test for this with a test on drc->dev, which is a purely
qemu side managed variable, and updated the same BQL critical section as
the indicator state.
This does introduce an off-by-one change, because the indicator state was
updated before the call to spapr_lmb_release() on the current DRC, whereas
drc->dev is updated afterwards. That's corrected by always decrementing
the nr_lmbs value instead of only doing so in the case where we didn't
have to recover information.
David Gibson [Wed, 7 Jun 2017 01:26:52 +0000 (11:26 +1000)]
spapr: Clean up DR entity sense handling
DRC classes have an entity_sense method to determine (in a specific PAPR
sense) the presence or absence of a device plugged into a DRC. However,
we only have one implementation of the method, which explicitly tests for
different DRC types. This changes it to instead have different method
implementations for the two cases: "logical" and "physical" DRCs.
While we're at it, the entity sense method always returns RTAS_OUT_SUCCESS,
and the interesting value is returned via pass-by-reference. Simplify this
to directly return the value we care about
David Gibson [Wed, 7 Jun 2017 07:06:44 +0000 (17:06 +1000)]
pseries: Correct panic behaviour for pseries machine type
The pseries machine type doesn't usually use the 'pvpanic' device as such,
because it has a firmware/hypervisor facility with roughly the same
purpose. The 'ibm,os-term' RTAS call notifies the hypervisor that the
guest has crashed.
Our implementation of this call was sending a GUEST_PANICKED qmp event;
however, it was not doing the other usual panic actions, making its
behaviour different from pvpanic for no good reason.
To correct this, we should call qemu_system_guest_panicked() rather than
directly sending the panic event.
* remotes/bonzini/tags/for-upstream: (31 commits)
docs: create config/, devel/ and spin/ subdirectories
cpus: reset throttle_thread_scheduled after sleep
kvm: don't register smram_listener when smm is off
nbd: make it thread-safe, fix qcow2 over nbd
target/i386: Add GDB XML description for SSE registers
i386/kvm: do not zero out segment flags if segment is unusable or not present
edu: fix memory leak on msi_broken platforms
linuxboot_dma: compile for i486
kvmclock: update system_time_msr address forcibly
nbd: Fully initialize client in case of failed negotiation
sockets: improve error reporting if UNIX socket path is too long
i386: fix read/write cr with icount option
target/i386: use multiple CPU AddressSpaces
target/i386: enable A20 automatically in system management mode
virtio-scsi: Unset hotplug handler when unrealize
exec: simplify phys_page_find() params
nbd/client.c: use errp instead of LOG
nbd: add errp to read_sync, write_sync and drop_sync
nbd: add errp parameter to nbd_wr_syncv()
nbd: read_sync and friends: return 0 on success
...
Paolo Bonzini [Tue, 6 Jun 2017 14:46:26 +0000 (16:46 +0200)]
docs: create config/, devel/ and spin/ subdirectories
Developer documentation should be its own manual. As a start, move all
developer-oriented files to a separate directory.
Also move non-text files to their own directories: docs/config/ for
QEMU -readconfig input, and docs/spin/ for formal models to be used
with the SPIN model checker.
Felipe Franciosi [Fri, 19 May 2017 21:29:50 +0000 (22:29 +0100)]
cpus: reset throttle_thread_scheduled after sleep
Currently, the throttle_thread_scheduled flag is reset back to 0 before
sleeping (as part of the throttling logic). Given that throttle_timer
(well, any timer) may tick with a slight delay, it so happens that under
heavy throttling (ie. close or on CPU_THROTTLE_PCT_MAX) the tick may
schedule a further cpu_throttle_thread() work item after the flag reset,
but before the previous sleep completed. This results on the vCPU thread
sleeping continuously for potentially several seconds in a row.
The chances of that happening can be drastically minimised by resetting
the flag after the sleep.
Gonglei [Thu, 1 Jun 2017 11:35:15 +0000 (19:35 +0800)]
kvm: don't register smram_listener when smm is off
If the user set disable smm by '-machine smm=off', we
should not register smram_listener so that we can
avoid waster memory in kvm since the added sencond
address space.
Meanwhile we should assign value of the global kvm_state
before invoking the kvm_arch_init(), because
pc_machine_is_smm_enabled() may use it by kvm_has_mm().
Paolo Bonzini [Thu, 1 Jun 2017 10:44:56 +0000 (12:44 +0200)]
nbd: make it thread-safe, fix qcow2 over nbd
NBD is not thread safe, because it accesses s->in_flight without
a CoMutex. Fixing this will be required for multiqueue.
CoQueue doesn't have spurious wakeups but, when another coroutine can
run between qemu_co_queue_next's wakeup and qemu_co_queue_wait's
re-locking of the mutex, the wait condition can become false and
a loop is necessary.
In fact, it turns out that the loop is necessary even without this
multi-threaded scenario. A particular sequence of coroutine wakeups
is happening ~80% of the time when starting a guest with qcow2 image
served over NBD (i.e. qemu-nbd --format=raw, and QEMU's -drive option
has -format=qcow2). This patch fixes that issue too.
target/i386: Add GDB XML description for SSE registers
Add an XML description for SSE registers (XMM+MXCSR) for both X86
and X86-64 architectures in the GDB stub:
- configure: Define gdb_xml_files for the X86 targets (32 and 64bit).
- gdb-xml/i386-32bit-sse.xml & gdb-xml/i386-64bit-sse.xml: The XML files
that contain a description of the XMM + MXCSR registers.
- gdb-xml/i386-32bit.xml & gdb-xml/i386-64bit.xml: wrappers that include
the XML file of the core registers and the other XML file of the SSE registers.
- target/i386/cpu.c: Modify the gdb_core_xml_file to the new XML wrapper,
modify the gdb_num_core_regs to fit the registers number defined in each
XML file.
Roman Pen [Thu, 1 Jun 2017 08:56:04 +0000 (10:56 +0200)]
i386/kvm: do not zero out segment flags if segment is unusable or not present
This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt
was taken on userspace stack. The root cause lies in the specific AMD CPU
behaviour which manifests itself as unusable segment attributes on SYSRET[2].
Here in this patch flags are not touched even segment is unusable or is not
present, therefore CPL (which is stored in DPL field) should not be lost and
will be successfully restored on kvm/svm kernel side.
Also current patch should not break desired behavior described in this commit:
4cae9c97967a ("target-i386: kvm: clear unusable segments' flags in migration")
since present bit will be dropped if segment is unusable or is not present.
This is the second part of the whole fix of the corresponding problem [1],
first part is related to kvm/svm kernel side and does exactly the same:
segment attributes are not zeroed out.
Paolo Bonzini [Wed, 31 May 2017 12:56:37 +0000 (14:56 +0200)]
edu: fix memory leak on msi_broken platforms
If msi_init fails, the thread has already been created and the
mutex/condvar are not destroyed. Initialize everything only
after the point where pci_edu_realize cannot fail.
Paolo Bonzini [Wed, 31 May 2017 12:37:15 +0000 (14:37 +0200)]
linuxboot_dma: compile for i486
The ROM uses the cmovne instruction, which is new in Pentium Pro and does not
work when running QEMU with "-cpu 486". Avoid producing that instruction.
Denis Plotnikov [Mon, 29 May 2017 10:49:04 +0000 (13:49 +0300)]
kvmclock: update system_time_msr address forcibly
Do an update of system_time_msr address every time before reading
the value of tsc_timestamp from guest's kvmclock page.
There is no other code paths which ensure that qemu has an up-to-date
value of system_time_msr. So, force this update on guest's tsc_timestamp
reading.
This bug causes effect on those nested setups which turn off TPR access
interception for L2 guests and that access being intercepted by L0 doesn't
show up in L1.
Linux bootstrap initiate kvmclock before APIC initializing causing TPR access.
That's why on L1 guests, having TPR interception turned on for L2, the effect
of the bug is not revealed.
This patch fixes this problem by making sure it knows the correct
system_time_msr address every time it is needed.
Eric Blake [Sat, 27 May 2017 03:04:21 +0000 (22:04 -0500)]
nbd: Fully initialize client in case of failed negotiation
If a non-NBD client connects to qemu-nbd, we would end up with
a SIGSEGV in nbd_client_put() because we were trying to
unregister the client's association to the export, even though
we skipped inserting the client into that list. Easy trigger
in two terminals:
nmap claims that it thinks it connected to a pago-services1
server (which probably means nmap could be updated to learn the
NBD protocol and give a more accurate diagnosis of the open
port - but that's not our problem), then terminates immediately,
so our call to nbd_negotiate() fails. The fix is to reorder
nbd_co_client_start() to ensure that all initialization occurs
before we ever try talking to a client in nbd_negotiate(), so
that the teardown sequence on negotiation failure doesn't fault
while dereferencing a half-initialized object.
While debugging this, I also noticed that nbd_update_server_watch()
called by nbd_client_closed() was still adding a channel to accept
the next client, even when the state was no longer RUNNING. That
is fixed by making nbd_can_accept() pay attention to the current
state.
sockets: improve error reporting if UNIX socket path is too long
The 'struct sockaddr_un' only allows 108 bytes for the socket
path.
If the user supplies a path, QEMU uses snprintf() to silently
truncate it when too long. This is undesirable because the user
will then be unable to connect to the path they asked for.
If the user doesn't supply a path, QEMU builds one based on
TMPDIR, but if that leads to an overlong path, it mistakenly
uses error_setg_errno() with a stale errno value, because
snprintf() does not set errno on truncation.
In solving this the code needed some refactoring to ensure we
don't pass 'un.sun_path' directly to any APIs which expect
NUL-terminated strings, because the path is not required to
be terminated.
Mihail Abakumov [Fri, 19 May 2017 09:36:15 +0000 (12:36 +0300)]
i386: fix read/write cr with icount option
Running Windows with icount causes a crash in instruction of write cr.
This patch fixes it.
Reading and writing cr cause an icount read because there are called
cpu_get_apic_tpr and cpu_set_apic_tpr functions. So, there is need
gen_io_start()/gen_io_end() calls.
Peter Maydell [Wed, 7 Jun 2017 15:29:29 +0000 (16:29 +0100)]
arm_gicv3: Fix ICC_BPR1 reset value when EL3 not implemented
If EL3 is not implemented (ie only one security state) then the
one and only ICC_BPR1 register behaves like the Non-secure
ICC_BPR1 in an EL3-present configuration. In particular, its
reset value is GIC_MIN_BPR_NS, not GIC_MIN_BPR.
Correct the erroneous reset value; this fixes a problem where
we might hit the assert added in commit a89ff39ee901.