Halil Pasic [Tue, 11 Jul 2017 14:54:39 +0000 (16:54 +0200)]
s390x/css: add ORB to SubchDev
Since we are going to need a migration compatibility breaking change to
activate ChannelSubSys migration let us use the opportunity to introduce
ORB to the SubchDev before that (otherwise we would need separate
handling e.g. a compat property).
The ORB will be useful for implementing IDA, or async handling of
subchannel work.
Halil Pasic [Tue, 11 Jul 2017 14:54:38 +0000 (16:54 +0200)]
s390x/css: add missing css state conditionally
Although we have recently vmstatified the migration of some css
infrastructure, for some css entities there is still state to be
migrated left, because the focus was keeping migration stream
compatibility (that is basically everything as-is).
Let us add vmstate helpers and extend existing vmstate descriptions so
that we have everything we need. Let us guard the added state via
css_migration_enabled, so we keep the compatible behavior if css
migration is disabled.
Let's also annotate the bits which do not need to be migrated for better
readability.
Halil Pasic [Tue, 11 Jul 2017 14:54:37 +0000 (16:54 +0200)]
s390x: add css_migration_enabled to machine class
Currently the migration of the channel subsystem (css) is only partial
and is done by the virtio ccw proxies -- the only migratable css devices
existing at the moment.
With the current work on emulated and passthrough devices we need to
decouple the migration of the channel subsystem state from virtio ccw,
and have a separate section for it. A new section however necessarily
breaks the migration compatibility.
So let us introduce a switch at the machine class, and put it in 'off'
state for now. We will turn the switch 'on' for future machines once all
preparations are met. For compatibility machines the switch will stay
'off'.
Halil Pasic [Tue, 11 Jul 2017 14:54:36 +0000 (16:54 +0200)]
s390x: add helper get_machine_class
We will need the machine class at machine initialization time, so the
usual way via qdev won't do. Let's cache the machine class and also use
the default values of the base machine for capability discovery.
Yi Min Zhao [Fri, 17 Feb 2017 07:26:48 +0000 (15:26 +0800)]
s390x/css: update css_adapter_interrupt
Let's use the new inject_airq callback of flic to inject adapter
interrupts. For kvm case, if the kernel flic doesn't support the new
interface, the irq routine remains unchanged. For non-kvm case,
qemu-flic handles the suppression process.
Yi Min Zhao [Fri, 17 Feb 2017 07:00:59 +0000 (15:00 +0800)]
s390x/flic: introduce inject_airq callback
Let's introduce a specialized way to inject adapter interrupts that,
unlike the common interrupt injection method, allows to take the
characteristics of the adapter into account.
For adapters subject to AIS facility:
- for non-kvm case, we handle the suppression for a given ISC in QEMU.
- for kvm case, we pass adapter id to kvm to do airq injection.
Add add tracepoint for suppressed airq and suppressing airq.
Fei Li [Fri, 17 Feb 2017 06:23:44 +0000 (14:23 +0800)]
s390x/flic: introduce modify_ais_mode callback
In order to emulate the adapter interruption suppression (AIS)
facility properly, the guest needs to be able to modify the AIS mask.
Interrupt suppression will be handled via the flic (for kvm, via a
recently introduced kernel backend; for !kvm, in the flic code), so
let's introduce a method to change the mode via the flic interface.
We introduce the 'simm' and 'nimm' fields to QEMUS390FLICState
to store interruption modes for each ISC. Each bit in 'simm' and
'nimm' targets one ISC, and collaboratively indicate three modes:
ALL-Interruptions, SINGLE-Interruption and NO-Interruptions. This
interface can initiate most transitions between the states; transition
from SINGLE-Interruption to NO-Interruptions via adapter interrupt
injection will be introduced in a following patch. The meaningful
combinations are as follows:
interruption mode | simm bit | nimm bit
------------------|----------|----------
ALL | 0 | 0
SINGLE | 1 | 0
NO | 1 | 1
Fei Li [Tue, 7 Mar 2017 03:07:44 +0000 (04:07 +0100)]
s390x: add flags field for registering I/O adapter
Introduce a new 'flags' field to IoAdapter to contain further
characteristics of the adapter, like whether the adapter is subject to
adapter-interruption suppression.
For the kvm case, pass this value in the 'flags' field when
registering an adapter.
Jason J. Herne [Wed, 22 Feb 2017 15:32:20 +0000 (10:32 -0500)]
s390x/cpumodel: clean up spacing and comments
Clean up spacing and add comments to clarify difference between base, full and
default models.
Not having spacing around the model definitions in gen-features.c is
particularly frustrating as the reader tends to misinterpret which model they
are looking at or editing.
Claudio Imbrenda [Mon, 15 Aug 2016 16:44:04 +0000 (18:44 +0200)]
s390x/migration: Monitor commands for storage attributes
Add an "info" monitor command to non-destructively inspect the state of
the storage attributes of the guest, and a normal command to toggle
migration mode (useful for debugging).
Janosch Frank [Mon, 29 May 2017 12:19:02 +0000 (14:19 +0200)]
s390x/kvm: Rework cmma management
Let's keep track of cmma enablement and move the mem_path check into
the actual enablement. This now also warns users that do not use
cpu-models about disabled cmma when using huge pages.
* remotes/armbru/tags/pull-qapi-2017-07-12:
scripts: use build_ prefix for string not piped through cgen()
qobject: Update coccinelle script to catch Q{INC, DEC}REF
qobject: Catch another straggler for use of qdict_put_str()
* remotes/mjt/tags/trivial-patches-fetch:
include/hw/ptimer.h: Add documentation comments
hxtool: remove dead -q option
qga-win32: Fix memory leak of device information set
hw/core: fix missing return value in load_image_targphys_as()
elf-loader: warn about invalid endianness
configure: Handle having no c++ compiler in FORTIFY_SOURCE check
hw/pci: define msi_nonbroken in pci-stub
hw/misc: add missing includes
configure: Fix build with pkg-config and --static --enable-sdl
util/qemu-sockets: Drop unused helper socket_address_to_string()
target/xtensa: gdbstub: drop dead return statement
Peter Maydell [Thu, 13 Jul 2017 12:38:57 +0000 (13:38 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-07-11' into staging
Block layer patches
# gpg: Signature made Tue 11 Jul 2017 17:05:56 BST
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2017-07-11: (85 commits)
iotests: Add preallocated growth test for qcow2
iotests: Add preallocated resize test for raw
block/qcow2: falloc/full preallocating growth
block/qcow2: Rename "fail_block" to just "fail"
block/qcow2: Add qcow2_refcount_area()
block/qcow2: Metadata preallocation for truncate
block/qcow2: Lock s->lock in preallocate()
block/qcow2: Generalize preallocate()
block/file-posix: Preallocation for truncate
block/file-posix: Generalize raw_regular_truncate
block/file-posix: Extract raw_regular_truncate()
block/file-posix: Small fixes in raw_create()
qemu-img: Expose PreallocMode for resizing
block: Add PreallocMode to blk_truncate()
block: Add PreallocMode to bdrv_truncate()
block: Add PreallocMode to BD.bdrv_truncate()
iotests: add test 178 for qemu-img measure
qemu-iotests: support per-format golden output files
qemu-img: add measure subcommand
qcow2: add bdrv_measure() support
...
Peter Maydell [Thu, 13 Jul 2017 11:48:37 +0000 (12:48 +0100)]
Merge remote-tracking branch 'remotes/yongbok/tags/mips-20170711' into staging
MIPS patches 2017-07-11
Changes:
* Fix MSA copy_[s|u]_df corner case of rd = 0
* Update malta to load the initrd at the end of the low memory
# gpg: Signature made Tue 11 Jul 2017 15:42:20 BST
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20170711:
mips/malta: load the initrd at the end of the low memory
target/mips: fix msa copy_[s|u]_df rd = 0 corner case
Peter Maydell [Thu, 13 Jul 2017 09:47:10 +0000 (10:47 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170711' into staging
target-arm queue:
* v7M: ignore writes to CONTROL.SPSEL from Thread mode
* KVM: Enable in-kernel timers with user space gic
* aspeed: Register all watchdogs
* hw/misc: Add Exynos4210 Pseudo Random Number Generator
* remotes/pmaydell/tags/pull-target-arm-20170711:
target-arm: v7M: ignore writes to CONTROL.SPSEL from Thread mode
ARM: KVM: Enable in-kernel timers with user space gic
aspeed: Register all watchdogs
hw/misc: Add Exynos4210 Pseudo Random Number Generator
scripts: use build_ prefix for string not piped through cgen()
The gen_ prefix is awkward. Generated C should go through cgen()
exactly once (see commit 1f9a7a1). The common way to get this wrong is
passing a foo=gen_foo() keyword argument to mcgen(). I'd like us to
adopt a naming convention where gen_ means "something that's been piped
through cgen(), and thus must not be passed to cgen() or mcgen()".
Requires renaming gen_params(), gen_marshal_proto() and
gen_event_send_proto().
Eric Blake [Sat, 24 Jun 2017 18:10:08 +0000 (12:10 -0600)]
qobject: Update coccinelle script to catch Q{INC, DEC}REF
The recent commit b097efc0 used qobject_decref(QOBJECT(E)), even
though we already have QDECREF(E) for that purpose. We can update
our coccinelle script to catch any future relapses; with that in
place, the rest of the patch is generated with:
spatch --sp-file scripts/coccinelle/qobject.cocci \
--macro-file scripts/cocci-macro-file.h --dir . --in-place
Eric Blake [Sat, 24 Jun 2017 18:10:07 +0000 (12:10 -0600)]
qobject: Catch another straggler for use of qdict_put_str()
Dan's addition of key-secret improvements in commit 29cf9336 was
developed prior to the addition of QDict scalar insertion macros,
but merged after the general cleanup in commit 46f5ac20.
Patch created mechanically by rerunning:
spatch --sp-file scripts/coccinelle/qobject.cocci \
--macro-file scripts/cocci-macro-file.h --dir . --in-place
Max Reitz [Tue, 13 Jun 2017 20:21:03 +0000 (22:21 +0200)]
block/qcow2: Add qcow2_refcount_area()
This function creates a collection of self-describing refcount
structures (including a new refcount table) at the end of a qcow2 image
file. Optionally, these structures can also describe a number of
additional clusters beyond themselves; this will be important for
preallocated truncation, which will place the data clusters and L2
tables there.
For now, we can use this function to replace the part of
alloc_refcount_block() that grows the refcount table (from which it is
actually derived).
Max Reitz [Tue, 13 Jun 2017 20:21:01 +0000 (22:21 +0200)]
block/qcow2: Lock s->lock in preallocate()
preallocate() is and will be called only from places that do not
otherwise need to lock s->lock: Currently that is qcow2_create2(), as of
a future patch it will be called from qcow2_truncate(), too.
It therefore makes sense to move locking that mutex into preallocate()
itself.
Max Reitz [Tue, 13 Jun 2017 20:21:00 +0000 (22:21 +0200)]
block/qcow2: Generalize preallocate()
This patch adds two new parameters to the preallocate() function so we
will be able to use it not just for preallocating a new image but also
for preallocated image growth.
The offset parameter allows the caller to specify a virtual offset from
which to start preallocating. For newly created images this is always 0,
but for preallocating growth this will be the old image length.
The new_length parameter specifies the supposed new length of the image
(basically the "end offset" for preallocation). During image truncation,
bdrv_getlength() will return the old image length so we cannot rely on
its return value then.
Max Reitz [Tue, 13 Jun 2017 20:20:58 +0000 (22:20 +0200)]
block/file-posix: Generalize raw_regular_truncate
Currently, raw_regular_truncate() is intended for setting the size of a
newly created file. However, we also want to use it for truncating an
existing file in which case only the newly added space (when growing)
should be preallocated.
This also means that if resizing failed, we should try to restore the
original file size. This is important when using preallocation.
Max Reitz [Tue, 13 Jun 2017 20:20:56 +0000 (22:20 +0200)]
block/file-posix: Small fixes in raw_create()
Variables should be declared at the start of a block, and if a certain
parameter value is not supported it may be better to return -ENOTSUP
instead of -EINVAL.
Max Reitz [Tue, 13 Jun 2017 20:20:53 +0000 (22:20 +0200)]
block: Add PreallocMode to bdrv_truncate()
For block drivers that just pass a truncate request to the underlying
protocol, we can now pass the preallocation mode instead of aborting if
it is not PREALLOC_MODE_OFF.
Max Reitz [Tue, 13 Jun 2017 20:20:52 +0000 (22:20 +0200)]
block: Add PreallocMode to BD.bdrv_truncate()
Add a PreallocMode parameter to the bdrv_truncate() function implemented
by each block driver. Currently, we always pass PREALLOC_MODE_OFF and no
driver accepts anything else.
Stefan Hajnoczi [Wed, 5 Jul 2017 12:57:37 +0000 (13:57 +0100)]
qemu-iotests: support per-format golden output files
Some tests produce format-dependent output. Either the difference is
filtered out and ignored, or the test case is format-specific so we
don't need to worry about per-format output differences.
There is a third case: the test script is the same for all image formats
and the format-dependent output is relevant. An ugly workaround is to
copy-paste the test into multiple per-format test cases. This
duplicates code and is not maintainable.
This patch allows test cases to add per-format golden output files so a
single test case can work correctly when format-dependent output must be
checked:
123.out.qcow2
123.out.raw
123.out.vmdk
...
This naming scheme is not composable with 123.out.nocache or 123.pc.out,
two other scenarios where output files are split. I don't think it
matters since few test cases need these features.
Stefan Hajnoczi [Wed, 5 Jul 2017 12:57:36 +0000 (13:57 +0100)]
qemu-img: add measure subcommand
The measure subcommand calculates the size required by a new image file.
This can be used by users or management tools that need to allocate
space on an LVM volume, SAN LUN, etc before creating or converting an
image file.
Stefan Hajnoczi [Wed, 5 Jul 2017 12:57:34 +0000 (13:57 +0100)]
qcow2: extract image creation option parsing
The image creation options parsed by qcow2_create() are also needed to
implement .bdrv_measure(). Extract the parsing code, including input
validation.
Stefan Hajnoczi [Wed, 5 Jul 2017 12:57:33 +0000 (13:57 +0100)]
qcow2: make refcount size calculation conservative
The refcount metadata size calculation is inaccurate and can produce
numbers that are too small. This is bad because we should calculate a
conservative number - one that is guaranteed to be large enough.
This patch switches the approach to a fixed point calculation because
the existing equation is hard to solve when inaccuracies are taken care
of.
Stefan Hajnoczi [Wed, 5 Jul 2017 12:57:30 +0000 (13:57 +0100)]
block: add bdrv_measure() API
bdrv_measure() provides a conservative maximum for the size of a new
image. This information is handy if storage needs to be allocated (e.g.
a SAN or an LVM volume) ahead of time.
Eric Blake [Mon, 3 Jul 2017 18:09:50 +0000 (13:09 -0500)]
tests: Avoid non-portable 'echo -ARG'
POSIX says that backslashes in the arguments to 'echo', as well as
any use of 'echo -n' and 'echo -e', are non-portable; it recommends
people should favor 'printf' instead. This is definitely true where
we do not control which shell is running (such as in makefile snippets
or in documentation examples). But even for scripts where we
require bash (and therefore, where echo does what we want by default),
it is still possible to use 'shopt -s xpg_echo' to change bash's
behavior of echo. And setting a good example never hurts when we are
not sure if a snippet will be copied from a bash-only script to a
general shell script (although I don't change the use of non-portable
\e for ESC when we know the running shell is bash).
Replace 'echo -n "..."' with 'printf %s "..."', and 'echo -e "..."'
with 'printf %b "...\n"', with the optimization that the %s/%b
argument can be omitted if the string being printed is a strict
literal with no '%', '$', or '`' (we could technically also make
this optimization when there are $ or `` substitutions but where
we can prove their results will not be problematic, but proving
that such substitutions are safe makes the patch less trivial
compared to just being consistent).
In the qemu-iotests check script, fix unusual shell quoting
that would result in word-splitting if 'date' outputs a space.
In test 051, take an opportunity to shorten the line.
In test 068, get rid of a pointless second invocation of bash.
Max Reitz [Sun, 2 Jul 2017 15:05:09 +0000 (17:05 +0200)]
iotests: Use absolute paths for executables
A user may specify a relative path for accessing qemu, qemu-img, etc.
through environment variables ($QEMU_PROG and friends) or a symlink.
If a test decides to change its working directory, relative paths will
cease to work, however. Work around this by making all of the paths to
programs that should undergo testing absolute. Besides "realpath", we
also have to use "type -p" to support programs in $PATH.
As a side effect, this fixes specifying these programs as symlinks for
out-of-tree builds: Before, you would have to create two symlinks, one
in the build and one in the source tree (the first one for common.config
to find, the second one for the iotest to use). Now it is sufficient to
create one in the build tree because common.config will resolve it.
iotests: chown LUKS device before qemu-io launches
On some distros, whenever you close a block device file
descriptor there is a udev rule that resets the file
permissions. This can race with the test script when
we run qemu-io multiple times against the same block
device. Occasionally the second qemu-io invocation
will find udev has reset the permissions causing failure.
iotests: reduce PBKDF iterations when testing LUKS
By default the PBKDF algorithm used with LUKS is tuned
based on the number of iterations to produce 1 second
of running time. This makes running the I/O test with
the LUKS format orders of magnitude slower than with
qcow2/raw formats.
When creating LUKS images, set the iteration time to
a 10ms to reduce the time overhead for LUKS, since
security does not matter in I/O tests.
Previously a full 'check -luks' would take
$ time ./check -luks
Passed all 22 tests
real 23m9.988s
user 21m46.223s
sys 0m22.841s
Now it takes
$ time ./check -luks
Passed all 22 tests
real 4m39.235s
user 3m29.590s
sys 0m24.234s
Still slow compared to qcow2/raw, but much improved
none the less.
The tests 033, 140, 145 and 157 were all broken
when run with LUKS, since they did not correctly use
the required image opts args syntax to specify the
decryption secret. Further, the 120 test simply does
not make sense to run with luks, as the scenario
exercised is not relevant.
The test 181 was broken when run with LUKS because
it didn't take account of fact that $TEST_IMG was
already in image opts syntax. The launch_qemu
helper also didn't register the secret object
providing the LUKS password.
While the qemu-img dd command does accept --image-opts
this is not sufficient to make it work with the LUKS
image yet. This is because bdrv_create() still always
requires the non-image-opts syntax.
This will be needed to check some restrictions before making bitmap
persistent in qmp-block-dirty-bitmap-add (this functionality will be
added by future patch)
New field BdrvDirtyBitmap.persistent means, that bitmap should be saved
by format driver in .bdrv_close and .bdrv_inactivate. No format driver
supports it for now.
Add format driver handler, which should mark loaded read-only
bitmaps as 'IN_USE' in the image and unset read_only field in
corresponding BdrvDirtyBitmap's.
Auto loading bitmaps are bitmaps in Qcow2, with the AUTO flag set. They
are loaded when the image is opened and become BdrvDirtyBitmaps for the
corresponding drive.
block/dirty-bitmap: add readonly field to BdrvDirtyBitmap
It will be needed in following commits for persistent bitmaps.
If bitmap is loaded from read-only storage (and we can't mark it
"in use" in this storage) corresponding BdrvDirtyBitmap should be
read-only.
Add bitmap extension as specified in docs/specs/qcow2.txt.
For now, just mirror extension header into Qcow2 state and check
constraints. Also, calculate refcounts for qcow2 bitmaps, to not break
qemu-img check.
For now, disable image resize if it has bitmaps. It will be fixed later.
A bitmap directory entry is sometimes called a 'bitmap header'. This
patch leaves only one name - 'bitmap directory entry'. The name 'bitmap
header' creates misunderstandings with 'qcow2 header' and 'qcow2 bitmap
header extension' (which is extension of qcow2 header)
sochin.jiang [Mon, 26 Jun 2017 11:04:24 +0000 (19:04 +0800)]
mirror: Fix inconsistent backing AioContext for after mirroring
mirror_complete opens the backing chain, which should have the same
AioContext as the top when using iothreads. Make the code guarantee
this, which fixes a failed assertion in bdrv_attach_child.
qcow2: report encryption specific image information
Currently 'qemu-img info' reports a simple "encrypted: yes"
field. This is not very useful now that qcow2 can support
multiple encryption formats. Users want to know which format
is in use and some data related to it.
Wire up usage of the qcrypto_block_get_info() method so that
'qemu-img info' can report about the encryption format
and parameters in use
While the crypto layer uses a fixed option name "key-secret",
the upper block layer may have a prefix on the options. e.g.
"encrypt.key-secret", in order to avoid clashes between crypto
option names & other block option names. To ensure the crypto
layer can report accurate error messages, we must tell it what
option name prefix was used.
Now that all encryption keys must be provided upfront via
the QCryptoSecret API and associated block driver properties
there is no need for any explicit encryption handling APIs
in the block layer. Encryption can be handled transparently
within the block driver. We only retain an API for querying
whether an image is encrypted or not, since that is a
potentially useful piece of metadata to report to the user.
Now that qcow & qcow2 are wired up to get encryption keys
via the QCryptoSecret object, nothing is relying on the
interactive prompting for passwords. All the code related
to password prompting can thus be ripped out.
The legacy "encryption=on" parameter still results in
creation of the old qcow2 AES format (and is equivalent
to the new 'encryption-format=aes'). e.g. the following are
equivalent:
With the LUKS format it is necessary to store the LUKS
partition header and key material in the QCow2 file. This
data can be many MB in size, so cannot go into the QCow2
header region directly. Thus the spec defines a FDE
(Full Disk Encryption) header extension that specifies
the offset of a set of clusters to hold the FDE headers,
as well as the length of that region. The LUKS header is
thus stored in these extra allocated clusters before the
main image payload.
Aside from all the cryptographic differences implied by
use of the LUKS format, there is one further key difference
between the use of legacy AES and LUKS encryption in qcow2.
For LUKS, the initialiazation vectors are generated using
the host physical sector as the input, rather than the
guest virtual sector. This guarantees unique initialization
vectors for all sectors when qcow2 internal snapshots are
used, thus giving stronger protection against watermarking
attacks.
qcow2: extend specification to cover LUKS encryption
Update the qcow2 specification to describe how the LUKS header is
placed inside a qcow2 file, when using LUKS encryption for the
qcow2 payload instead of the legacy AES-CBC encryption
qcow2: convert QCow2 to use QCryptoBlock for encryption
This converts the qcow2 driver to make use of the QCryptoBlock
APIs for encrypting image content, using the legacy QCow2 AES
scheme.
With this change it is now required to use the QCryptoSecret
object for providing passwords, instead of the current block
password APIs / interactive prompting.
The test 087 could be simplified since there is no longer a
difference in behaviour when using blockdev_add with encrypted
images for the running vs stopped CPU state.