blockdev: Refuse to drive_del something added with blockdev-add
For some device models, the guest can prevent unplug. Some users need a
way to forcibly revoke device model access to the block backend then, so
the underlying images can be safely used for something else.
drive_del lets you do that. Unfortunately, it conflates revoking access
with destroying the backend.
Commit 9063f81 made drive_del immediately destroy the root BDS. Nice:
the device name becomes available for reuse immediately. Not so nice:
the device model's pointer to the root BDS dangles, and we're prone to
crash when the memory gets reused.
Commit d22b2f4 fixed that by hiding the root BDS instead of destroying
it. Destruction only happens on unplug. "Hiding" means removing it
from bdrv_states and graph_bdrv_states; see bdrv_make_anon().
This "destroy on revoke" is a misfeature we don't want to carry
forward to blockdev-add, just like "destroy on unplug" (commit 2d246f0). So make drive_del fail on anything added with blockdev-add.
We'll add separate QMP commands to revoke device model access and to
destroy backends.
BLOCK_IO_ERROR events are logged by libvirt, which helps with
post mortem analysis of guests. However, one information that
we miss today is a human readable string describing the cause
of the I/O error.
This commit adds that string it to BLOCK_IO_ERROR. Note that
this string is a debugging aid for humans, meaning that it
should not parsed by applications.
Stefan Hajnoczi [Thu, 11 Sep 2014 12:49:39 +0000 (13:49 +0100)]
dataplane: fix virtio_blk_data_plane_create() op blocker error path
Commit 3718d8ab65f68de2acccbe6a315907805f54e3cc ("block: Replace in_use
with operation blocker") broke the error path because it consumed
local_err instead of propagating it.
The caller has no way to know that the function failed. This caused
virtio-blk to start "successfully" even though there was a fatal
dataplane error.
block: Make the block accounting functions operate on BlockAcctStats
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.
Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.
block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
The middle term goal is to move the BlockAcctStats structure in the device models.
(Capturing I/O accounting statistics in the device models is good for billing)
This patch make a small step in this direction by removing a reference to BDRV.
The plan is to add new accounting metrics (latency, invalid requests, failed
requests, queue depth) and block.c is overpopulated so it will be better to work
in a separate module.
Moreover the long term plan is to have statistics in each of the BDS of the graph
for metrology purpose; this means that the device model statistics must move from
the topmost BDS to the device model.
So we need to decouple the statistic code from BlockDriverState.
This is another argument for the extraction of the code in a separate module.
Valentin Manea [Sun, 31 Aug 2014 08:32:08 +0000 (11:32 +0300)]
IDE: MMIO IDE device control should be little endian
Set the IDE MMIO memory type to little endian. The ATA specs identify
words part of the control commands encoded as little endian.
While this has no impact on little endian systems, it's required for big
endian systems(eg OpenRisc).
Commit 6db9560 split off the growable case so it can use
bdrv_file_open() instead of bdrv_open() then. Growable BDSes become
anonymous. Weird.
Commit 2e40134 folded bdrv_file_open() back into bdrv_open() with new
flag BDRV_O_PROTOCOL. We still have two bdrv_open() calls, and
growable BDSes remain anonymous.
Circle back to before commit 6db9560: just one call, not anonymous.
Luiz Capitulino [Fri, 29 Aug 2014 20:07:27 +0000 (16:07 -0400)]
block: extend BLOCK_IO_ERROR event with nospace indicator
Management software, such as RHEV's vdsm, want to be able to allocate
disk space on demand. The basic use case is to start a VM with a small
disk and then the disk is enlarged when QEMU hits a ENOSPC condition.
To this end, the management software has to be notified when QEMU
encounters ENOSPC. The solution implemented by this commit is simple:
it extends the BLOCK_IO_ERROR with a 'nospace' key, which is true
when QEMU is stopped due to ENOSPC.
Note that support for querying this event is already present in
query-block by means of the 'io-status' key. Also, the new 'nospace'
BLOCK_IO_ERROR field shares the same semantics with 'io-status',
which basically means that werror= has to be set to either
'stop' or 'enospc' to enable 'nospace'.
Finally, this commit also updates the 'io-status' key doc in the
schema with a list of supported device models.
Peter Maydell [Mon, 8 Sep 2014 12:14:41 +0000 (13:14 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request
# gpg: Signature made Mon 08 Sep 2014 11:49:31 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/block-pull-request: (24 commits)
ide: Add resize callback to ide/core
IDE: Fill the IDENTIFY request consistently
vmdk: fix buf leak in vmdk_parse_extents()
vmdk: fix vmdk_parse_extents() extent_file leaks
ide: Add wwn support to IDE-ATAPI drive
qtest/ide: Uninitialize PC allocator
libqos: add a simple first-fit memory allocator
MAINTAINERS: update sheepdog maintainer
qemu-nbd: fix indentation and coding style
qemu-nbd: add option to set detect-zeroes mode
rename parse_enum_option to qapi_enum_parse and make it public
block/archipelago: Use QEMU atomic builtins
qemu-img: fix rebase src_cache option documentation
qemu-img: clarify src_cache option documentation
libqos: Added EVENT_IDX support
libqos: Added MSI-X support
libqos: Added test case for configuration changes in virtio-blk test
libqos: Added indirect descriptor support to virtio implementation
libqos: Added basic virtqueue support to virtio implementation
tests: Add virtio device initialization
...
Peter Maydell [Mon, 8 Sep 2014 11:02:07 +0000 (12:02 +0100)]
Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-09-08
Alexander Graf (11):
PPC: KVM: Fix g3beige and mac99 when HV is loaded
PPC: mac99: Move NVRAM to page boundary when necessary
KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
PPC: KVM: Use vm check_extension for pv hcall
PPC: mac99: Fix core99 timer frequency
PPC: mac_nvram: Remove unused functions
PPC: mac_nvram: Allow 2 and 4 byte accesses
PPC: mac_nvram: Split NVRAM into OF and OSX parts
PPC: Mac: Move tbfreq into local variable
PPC: Cuda: Use cuda timer to expose tbfreq to guest
PPC: Fix default config ordering and add eTSEC for ppc64
Alexey Kardashevskiy (7):
spapr: Move DT memory node rendering to a helper
spapr: Use DT memory node rendering helper for other nodes
spapr: Refactor spapr_populate_memory() to allow memoryless nodes
spapr: Split memory nodes to power-of-two blocks
spapr: Add a helper for node0_size calculation
spapr: Fix ibm, associativity for memory nodes
spapr_pci: Fix config space corruption
Anton Blanchard (2):
spapr-vlan: Don't touch last entry in buffer list
hypervisor property clashes with hypervisor node
Benjamin Herrenschmidt (2):
loader: Add load_image_size() to replace load_image()
spapr: Locate RTAS and device-tree based on real RMA
Bharat Bhushan (4):
ppc: debug stub: Get trap instruction opcode from KVM
ppc: synchronize excp_vectors for injecting exception
ppc: Add software breakpoint support
ppc: Add hw breakpoint watchpoint support
Gonglei (1):
spapr: fix possible memory leak
Greg Kurz (1):
spapr_pci: map the MSI window in each PHB
Nikunj A Dadhania (3):
ppc: spapr-rtas - implement os-term rtas call
spapr: add uuid/host details to device tree
ppc/spapr: Fix MAX_CPUS to 255
Peter Maydell (1):
hw/ppc/spapr_hcall.c: Fix typo in function names
Tom Musta (20):
linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
linux-user: Split PPC Trampoline Encoding from Register Save
linux-user: Enable Signal Handlers on PPC64
linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
linux-user: Implement do_setcontext for PPC64
linux-user: Handle PPC64 ELFv2 Function Pointers
target-ppc: Bug Fix: rlwinm
target-ppc: Bug Fix: rlwnm
target-ppc: Bug Fix: rlwimi
target-ppc: Bug Fix: mullwo
target-ppc: Bug Fix: mullw
target-ppc: Bug Fix: mulldo OV Detection
target-ppc: Bug Fix: srawi
target-ppc: Bug Fix: srad
target-ppc: Special Case of rlwimi Should Use Deposit
target-ppc: Optimize rlwinm MB=0 ME=31
target-ppc: Optimize rlwnm MB=0 ME=31
target-ppc: Clean Up mullw
target-ppc: Clean up mullwo
target-ppc: Implement mulldo with TCG
# gpg: Signature made Mon 08 Sep 2014 11:51:15 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found
* remotes/agraf/tags/signed-ppc-for-upstream: (52 commits)
hypervisor property clashes with hypervisor node
PPC: Fix default config ordering and add eTSEC for ppc64
spapr_pci: map the MSI window in each PHB
target-ppc: Implement mulldo with TCG
target-ppc: Clean up mullwo
target-ppc: Clean Up mullw
target-ppc: Optimize rlwnm MB=0 ME=31
target-ppc: Optimize rlwinm MB=0 ME=31
target-ppc: Special Case of rlwimi Should Use Deposit
spapr-vlan: Don't touch last entry in buffer list
spapr_pci: Fix config space corruption
PPC: Cuda: Use cuda timer to expose tbfreq to guest
PPC: Mac: Move tbfreq into local variable
PPC: mac_nvram: Split NVRAM into OF and OSX parts
PPC: mac_nvram: Allow 2 and 4 byte accesses
PPC: mac_nvram: Remove unused functions
PPC: mac99: Fix core99 timer frequency
PPC: KVM: Use vm check_extension for pv hcall
KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
target-ppc: Bug Fix: srad
...
Alexander Graf [Wed, 2 Jul 2014 17:01:46 +0000 (19:01 +0200)]
PPC: Fix default config ordering and add eTSEC for ppc64
We messed up the ordering in our default configs for PPC. The top entries
are generic entries, then come sections that indicate that features are only
in because of a special feature (such as PReP).
Fix the ordering again and while at it add eTSEC support to the ppc64 target
so that we can spawn eTSEC adapters with qemu-system-ppc64.
Greg Kurz [Wed, 27 Aug 2014 16:17:12 +0000 (18:17 +0200)]
spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36faa192cd4b32af8fe5edb31894017d35 has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.
Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(
This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
- since each PHB instantiates its own IOMMU address space, we
can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
- no real need to keep the MSI window setup in a separate function,
the spapr_pci_msi_init() code moves to spapr_phb_realize().
2) kill the global MSI window as it is not needed in the end
Tom Musta [Mon, 25 Aug 2014 19:25:43 +0000 (14:25 -0500)]
target-ppc: Clean up mullwo
Simplify the implementation of mullwo. For 64 bit CPUs, the result is
the concatenation of the upper and lower parts of the muls2_i32 operation,
which may be slightly better than deposit. For 32 bit CPUs, the lower part
of the muls_i32 operation is moved into the target GPR.
Tom Musta [Mon, 25 Aug 2014 19:25:39 +0000 (14:25 -0500)]
target-ppc: Special Case of rlwimi Should Use Deposit
The special case of rlwimi where MB <= ME and SH = 31-ME can be implemented
with a single TCG deposit operation. This replaces the less general case
of SH = MB = 0 and ME = 31.
Anton Blanchard [Fri, 22 Aug 2014 01:50:57 +0000 (11:50 +1000)]
spapr-vlan: Don't touch last entry in buffer list
The last 8 bytes of the buffer list is defined to contain the number
of dropped frames. At the moment we use it to store rx entries,
which trips up ethtool -S:
When disabling MSI/MSIX via "ibm,change-msi" RTAS call, no check was made
if MSI or MSIX is actually supported and the MSI message was reset
unconditionally. If this happened on a device which does not support MSI
(but does support MSIX, otherwise "ibm,change-msi" would not be called),
this device would have PCIDevice::msi_cap field (MSI capability offset)
set to zero and writing a vector would actually clear PCI status.
This clears MSI message only if MSI or MSIX is present on a device.
Alexander Graf [Sun, 13 Jul 2014 20:31:53 +0000 (22:31 +0200)]
PPC: Cuda: Use cuda timer to expose tbfreq to guest
Mac OS X calibrates a number of frequencies on bootup based on reading
tb values on bootup and comparing them to via cuda timer values.
The only variable we can really steer well (thanks to KVM) is the cuda
frequency. So let's use that one to fake Mac OS X into believing the
bus frequency is tbfreq * 4. That way Mac OS X will automatically
calculate the correct timebase frequency.
With this patch and the patch set I posted earlier I can successfully
run Mac OS X 10.2, 10.3 and 10.4 guests with -M mac99 on TCG and KVM.
Alexander Graf [Sun, 13 Jul 2014 20:29:02 +0000 (22:29 +0200)]
PPC: Mac: Move tbfreq into local variable
We already expose the real CPU's tb frequency to the guest via fw_cfg. Soon
we will need to also expose it to the MacIO, so let's move it to a variable
that we can leverage every time we need the frequency.
Alexander Graf [Sun, 13 Jul 2014 15:09:55 +0000 (17:09 +0200)]
PPC: mac_nvram: Split NVRAM into OF and OSX parts
Mac OS X (at least with -M mac99) searches for a valid NVRAM partition
of a special Apple type. If it can't find that partition in the first
half of NVRAM, it will look at the second half.
There are a few implications from this. The first is that we need to
split NVRAM into 2 halves - one for Open Firmware use, the other one for
Mac OS X. Without this split Mac OS X will just loop endlessly over the
second half trying to find a partition.
The other implication is that we should provide a specially crafted Mac
OS X compatible NVRAM partition on the second half that Mac OS X can
happily use as it sees fit.
Alexander Graf [Mon, 14 Jul 2014 17:17:35 +0000 (19:17 +0200)]
PPC: KVM: Use vm check_extension for pv hcall
To find out whether we support the KVM hypercall interface we need to ask KVM
on the VM level rather than the global KVM level, because Book3S HV KVM does
not support it and we play conservative when both HV and PR are loaded.
So instead, use the VM helper that falls back to global KVM enumeration. That
should cover all cases.
Tom Musta [Tue, 12 Aug 2014 13:45:10 +0000 (08:45 -0500)]
target-ppc: Bug Fix: srad
Fix the check for carry in the srad helper to properly construct
the mask -- a "1ULL" must be used (instead of "1") in order to
get the desired result.
Tom Musta [Tue, 12 Aug 2014 13:45:09 +0000 (08:45 -0500)]
target-ppc: Bug Fix: srawi
For 64 bit implementations, the special case of a shift by zero
should result in the sign extension of the least significant 32 bits
of the source GPR (not a direct copy of the 64 bit source GPR).
Tom Musta [Tue, 12 Aug 2014 13:45:07 +0000 (08:45 -0500)]
target-ppc: Bug Fix: mullwo
On 64-bit implementations, the mullwo result is the 64 bit product of
the signed 32 bit operands. Fix the implementation to properly deposit
the upper 32 bits into the target register.
Tom Musta [Tue, 12 Aug 2014 13:45:05 +0000 (08:45 -0500)]
target-ppc: Bug Fix: rlwimi
The rlwimi specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.
The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.
Fix the code to properly implement this ROTL32 operation.
Also fix the special case of MB=31 and ME=0 to copy the entire contents
of the source GPR.
Tom Musta [Tue, 12 Aug 2014 13:45:04 +0000 (08:45 -0500)]
target-ppc: Bug Fix: rlwnm
The rlwnm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.
The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.
Fix the code to properly implement this ROTL32 operation.
Tom Musta [Tue, 12 Aug 2014 13:45:03 +0000 (08:45 -0500)]
target-ppc: Bug Fix: rlwinm
The rlwinm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.
The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.
Fix the code to properly implement this ROTL32 operation.
This patch adds hardware breakpoint and hardware watchpoint support
for ppc.
On BOOKE architecture we cannot share debug resources between QEMU
and guest because:
When QEMU is using debug resources then debug exception must
be always enabled. To achieve this we set MSR_DE and also set
MSRP_DEP so guest cannot change MSR_DE.
When emulating debug resource for guest we want guest
to control MSR_DE (enable/disable debug interrupt on need).
So above mentioned two configuration cannot be supported
at the same time. So the result is that we cannot share
debug resources between QEMU and Guest on BOOKE architecture.
In the current design QEMU gets priority over guest,
this means that if QEMU is using debug resources then guest
cannot use them and if guest is using debug resource then
qemu can overwrite them.
When QEMU is not able to handle debug exception then we inject program
exception to guest. Yes program exception NOT debug exception and the
reason is:
1) QEMU and guest not sharing debug resources
2) For software breakpoint QEMU uses a ehpriv-1 instruction;
So there cannot be any reason that we are in qemu with exit reason
KVM_EXIT_DEBUG for guest set debug exception, only possibility is
guest executed ehpriv-1 privilege instruction and that's why we are
injecting program exception.
This patch allow insert/remove software breakpoint.
When QEMU is not able to handle debug exception then we inject
program exception to guest because for software breakpoint QEMU
uses a ehpriv-1 instruction;
So there cannot be any reason that we are in qemu with exit reason
KVM_EXIT_DEBUG for guest set debug exception, only possibility is
guest executed ehpriv-1 privilege instruction and that's why we are
injecting program exception.
Signed-off-by: Bharat Bhushan <[email protected]>
[agraf: make deflect comment booke/book3s agnostic] Signed-off-by: Alexander Graf <[email protected]>
spapr: Locate RTAS and device-tree based on real RMA
We currently calculate the final RTAS and FDT location based on
the early estimate of the RMA size, cropped to 256M on KVM since
we only know the real RMA size at reset time which happens much
later in the boot process.
This means the FDT and RTAS end up right below 256M while they
could be much higher, using precious RMA space and limiting
what the OS bootloader can put there which has proved to be
a problem with some OSes (such as when using very large initrd's)
Fortunately, we do the actual copy of the device-tree into guest
memory much later, during reset, late enough to be able to do it
using the final RMA value, we just need to move the calculation
to the right place.
However, RTAS is still loaded too early, so we change the code to
load the tiny blob into qemu memory early on, and then copy it into
guest memory at reset time. It's small enough that the memory usage
doesn't matter.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
[aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR] Signed-off-by: Alexey Kardashevskiy <[email protected]>
[agraf: fix compilation on 32bit hosts] Signed-off-by: Alexander Graf <[email protected]>
loader: Add load_image_size() to replace load_image()
A subsequent patch to ppc/spapr needs to load the RTAS blob into
qemu memory rather than target memory (so it can later be copied
into the right spot at machine reset time).
I would use load_image() but it is marked deprecated because it
doesn't take a buffer size as argument, so let's add load_image_size()
that does.
In multiple places there is a node0_size variable calculation
which assumes that NUMA node #0 and memory node #0 are the same
things which they are not. Since we are going to change it and
do not want to change it in multiple places, let's make a helper.
This adds a spapr_node0_size() helper and makes use of it.
Linux kernel expects nodes to have power-of-two size and
does WARN_ON if this is not the case:
[ 0.041456] WARNING: at drivers/base/memory.c:115
which is:
===
/* Validate blk_sz is a power of 2 and not less than section size */
if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) {
WARN_ON(1);
block_sz = MIN_MEMORY_BLOCK_SIZE;
}
===
This splits memory nodes into set of smaller blocks with
a size which is a power of two. This makes sure the start
address of every node is aligned to the node size.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
[agraf: squash windows compile fix in] Signed-off-by: Alexander Graf <[email protected]>
spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however
actual hardware may have them so it makes sense to have a way
to emulate them in QEMU. This prepares SPAPR for that.
This moves 2 calls of spapr_populate_memory_node() into
the existing loop over numa nodes so first several nodes may
have no memory and this still will work.
If there is no numa configuration, the code assumes there is just
a single node at 0 and it has all the guest memory.
Alexander Graf [Fri, 11 Jul 2014 01:24:39 +0000 (03:24 +0200)]
PPC: mac99: Move NVRAM to page boundary when necessary
When running KVM we have to adhere to host page boundaries for memory slots.
Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash
area.
So if our host is configured with 64k page size, we can't use the mac99 target
with KVM. This is a real shame, as this limitation is not really an issue - we
can easily map NVRAM somewhere else and at least Linux and Mac OS X use it
at their new location.
So in that emergency case when it's about failing to run at all and moving NVRAM
to a place it shouldn't be at, choose the latter.
This patch enables -M mac99 with KVM on 64k page size hosts.
Tom Musta [Mon, 30 Jun 2014 13:13:42 +0000 (08:13 -0500)]
linux-user: Handle PPC64 ELFv2 Function Pointers
Function pointers in the 64-bit ELFv2 PowerPC ABI are actual (internal)
entry point addresses. However, when invoking a function via a function
pointer, GPR 12 must also be set to this address so that the TOC may be
handled properly.
Add this support to the invocation of a signal handler.
Tom Musta [Mon, 30 Jun 2014 13:13:40 +0000 (08:13 -0500)]
linux-user: Implement do_setcontext for PPC64
Eliminate the stub for the do_setcontext() function for TARGET_PPC64. The
implementation re-uses the existing TARGET_PPC32 code with the only change
being the computation of the address of the register save area.
Tom Musta [Mon, 30 Jun 2014 13:13:39 +0000 (08:13 -0500)]
linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
Properly dereference 64-bit PPC ELF V1 ABIT function pointers to signal handlers.
On this platform, function pointers are pointers to structures and the first 64
bits of such a structure contains the function's entry point. The second 64 bits
contains the TOC pointer, which must be placed into GPR 2.
Tom Musta [Mon, 30 Jun 2014 13:13:38 +0000 (08:13 -0500)]
linux-user: Enable Signal Handlers on PPC64
Enable the 64-bit PowerPC signal handling code that was previously
disabled via #ifdefs. Specifically:
- Move the target_mcontext (register save area) structure and
append it to the 64-bit target_sigcontext structure. This
provides the space on the stack for saving and restoring
context.
- Define the target_rt_sigframe for 64-bit.
- Adjust the setup_frame and setup_rt_frame routines to properly
select the target_mcontext area and trampoline within the stack
frame; tthis is different for 32-bit and 64-bit implementations.
- Adjust the do_setcontext stub for 64-bit so that it compiles
without warnings.
The 64-bit signal handling code is still not functional after this
change; but the 32-bit code is. Subsequent changes will address
specific issues with the 64-bit code.
Signed-off-by: Tom Musta <[email protected]>
[agraf: fix build on 32bit hosts, ppc64abi32] Signed-off-by: Alexander Graf <[email protected]>
Tom Musta [Mon, 30 Jun 2014 13:13:37 +0000 (08:13 -0500)]
linux-user: Split PPC Trampoline Encoding from Register Save
Split the encoding of the PowerPC sigreturn trampoline from the saving of
register state onto the signal handler stack. This will make it easier
in subsequent patches to deal with variations in the stack frame layouts between
32 and 64 bit PowerPC.
Tom Musta [Mon, 30 Jun 2014 13:13:36 +0000 (08:13 -0500)]
linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
The code that sets the stack frame back pointer is incorrect for
the setup_rt_frame() code; qemu will abort (SIGSEGV) in some
environments. The setup_frame code was fixed in commit beb526b12134a6b6744125deec5a7fe24a8f92e3 but the setup_rt_frame
code was not.
Make the setup_rt_frame code consistent with the setup_frame
code.
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.
Linux kernel calls ibm,os-term when extended property of os-term is set.
This makes sure that a return to the linux kernel is gauranteed.
Alexander Graf [Thu, 24 Jul 2014 08:46:47 +0000 (10:46 +0200)]
PPC: KVM: Fix g3beige and mac99 when HV is loaded
On PPC we have 2 different styles of KVM: PR and HV. HV can only virtualize
sPAPR guests while PR can virtualize everything that's reasonably close to
the host hardware platform.
As long as only one kernel module (PR or HV) is loaded, the "default" kvm type
is the module that's loaded. So if your hardware only supports PR mode you can
easily spawn a Mac VM.
However, if both HV and PR are loaded we default to HV mode. And in that case
the Mac machines have to explicitly ask for PR mode to get a working VM.
Fix this up by explicitly having the Mac machines ask for PR style KVM. This
fixes bootup of Mac VMs on systems where bot HV and PR kvm modules are loaded
for me.
John Snow [Fri, 5 Sep 2014 03:42:17 +0000 (23:42 -0400)]
ide: Add resize callback to ide/core
Currently, if the block device backing the IDE drive is resized,
the information about the device as cached inside of the IDEState
structure is not updated, thus when a guest OS re-queries the drive,
it is unable to see the expanded size.
This patch adds a resize callback that updates the IDENTIFY data
buffer in order to correct this.
Lastly, a Linux guest as-is cannot resize a libata drive while in-use,
but it can see the expanded size as part of a bus rescan event.
This patch also allows guests such as Linux to see the new drive size
after a soft reboot event, without having to exit the QEMU process.
John Snow [Fri, 5 Sep 2014 03:42:16 +0000 (23:42 -0400)]
IDE: Fill the IDENTIFY request consistently
IDE-HD, IDE-ATAPI and IDE-CFATA all fill the
identify buffer in slightly different ways,
this is a relatively minor patch to make them
uniform, to emphasize that:
(1) We build the s->identify_data cache first, then
(2) We copy it to s->io_buffer to fulfill the request.
John Snow [Tue, 19 Aug 2014 18:57:55 +0000 (14:57 -0400)]
ide: Add wwn support to IDE-ATAPI drive
Although it is possible to specify the wwn
property for cdrom devices on the command line,
the underlying driver fails to relay this information
to the guest operating system via IDENTIFY.
John Snow [Fri, 1 Aug 2014 15:38:58 +0000 (11:38 -0400)]
libqos: add a simple first-fit memory allocator
Implement a simple first-fit memory allocator that
attempts to keep track of leased blocks of memory
in order to be able to re-use blocks.
Additionally, allow the user to specify when
initializing the device that upon cleanup,
we would like to assert that there are no
blocks in use. This may be useful for identifying
problems in qtests that use more complicated
set-up and tear-down routines.
This functionality is used in my upcoming ahci-test v2
patch set, but I didn't see fit to enable it for any
existing tests, which will continue to operate the
same as they have prior.
Stefan Hajnoczi [Tue, 2 Sep 2014 10:01:02 +0000 (11:01 +0100)]
qemu-img: clarify src_cache option documentation
The source cache option takes the same values as the cache option. The
documentation reads a little strange because it starts with "In contrast
the src_cache option ...". The fact that this is comparing with the
previous documented option (the 'cache' option) is implicit. Readers
may be confused, especially if they jump to src_cache without reading
cache documentation first.
Marc Marí [Mon, 1 Sep 2014 10:07:54 +0000 (12:07 +0200)]
tests: Functions bus_foreach and device_find from libqos virtio API
Virtio header has been changed to compile and work with a real device.
Functions bus_foreach and device_find have been implemented for PCI.
Virtio-blk test case now opens a fake device.
Laszlo Ersek [Sat, 23 Aug 2014 10:19:07 +0000 (12:19 +0200)]
pflash_cfi01: write flash contents to bdrv on incoming migration
A drive that backs a pflash device is special:
- it is very small,
- its entire contents are kept in a RAMBlock at all times, covering the
guest-phys address range that provides the guest's view of the emulated
flash chip.
The pflash device model keeps the drive (the host-side file) and the
guest-visible flash contents in sync. When migrating the guest, the
guest-visible flash contents (the RAMBlock) is migrated by default, but on
the target host, the drive (the host-side file) remains in full sync with
the RAMBlock only if:
- the source and target hosts share the storage underlying the pflash
drive,
- or the migration requests full or incremental block migration too, which
then covers all drives.
Due to the special nature of pflash drives, the following scenario makes
sense as well:
- no full nor incremental block migration, covering all drives, alongside
the base migration (justified eg. by shared storage for "normal" (big)
drives),
- non-shared storage for pflash drives.
In this case, currently only those portions of the flash drive are updated
on the target disk that the guest reprograms while running on the target
host.
In order to restore accord, dump the entire flash contents to the bdrv in
a post_load() callback.
- The read-only check follows the other call-sites of pflash_update();
- both "pfl->ro" and pflash_update() reflect / consider the case when
"pfl->bs" is NULL;
- the total size of the flash device is calculated as in
pflash_cfi01_realize().
When using shared storage, or requesting full or incremental block
migration along with the normal migration, the patch should incur a
harmless rewrite from the target side.
It is assumed that, on the target host, RAM is loaded ahead of the call to
pflash_post_load().
Peter Maydell [Fri, 5 Sep 2014 15:03:56 +0000 (16:03 +0100)]
Merge remote-tracking branch 'remotes/afaerber/tags/qom-cpu-for-peter' into staging
QOM CPUState and X86CPU
* Include exception state in CPU VMState
* Fix -cpu *,migratable=foo
* Error out on unknown -cpu *,+foo,-bar
# gpg: Signature made Fri 05 Sep 2014 15:38:14 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <[email protected]>"
# gpg: aka "Andreas Färber <[email protected]>"
* remotes/afaerber/tags/qom-cpu-for-peter:
target-i386: Reject invalid CPU feature names on the command-line
target-i386: Support migratable=no properly
exec: Save CPUState::exception_index field
Eduardo Habkost [Wed, 20 Aug 2014 20:30:12 +0000 (17:30 -0300)]
target-i386: Support migratable=no properly
When the "migratable" property was implemented, the behavior was tested
by changing the default on the code, but actually using the option on
the command-line (e.g. "-cpu host,migratable=false") doesn't work as
expected. This is a regression for a common use case of "-cpu host",
which is to enable features that are supported by the host CPU + kernel
before feature-specific code is added to QEMU.
Fix this by initializing the feature words for "-cpu host" on
x86_cpu_parse_featurestr(), right after parsing the CPU options.