Max Reitz [Wed, 3 Dec 2014 13:57:22 +0000 (14:57 +0100)]
vmdk: Fix error for JSON descriptor file names
If vmdk blindly tries to use path_combine() using bs->file->filename as
the base file name, this will result in a bad error message for JSON
file names when calling bdrv_open(). It is better to only try
bs->file->exact_filename; if that is empty, bs->file->filename will be
useless for path_combine() and an error should be emitted (containing
bs->file->filename because desc_file_path (which is
bs->file->exact_filename) is empty).
Gary R Hook [Tue, 25 Nov 2014 23:30:02 +0000 (17:30 -0600)]
block migration: fix return value
Modify block_save_iterate() to return positive/zero/negative
(success/not done/failure) return status. The computation of
the blocks transferred (an int64_t) exceeds the size of an
int return value.
Peter Maydell [Thu, 11 Dec 2014 18:27:02 +0000 (18:27 +0000)]
Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2014-12-11' into staging
trivial patches for 2014-12-11
# gpg: Signature made Thu 11 Dec 2014 18:13:58 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <[email protected]>"
# gpg: aka "Michael Tokarev <[email protected]>"
# gpg: aka "Michael Tokarev <[email protected]>"
* remotes/mjt/tags/pull-trivial-patches-2014-12-11:
Sort include/qemu/typedefs.h
hpet: increase spelling precision
pflash_cfi02.c: associate "cfi.pflash02" to "Storage devices" category
vt82c686: fix coverity warning about out-of-bounds write
virtio: remove useless declaration of virtio_net_init()
qapi-schema: fix typo about change-vnc-password
fw_cfg: remove superfluous blank line
get_maintainer.pl: Remove the --git-chief-penguins option
configure: Replace which(1) with "has"
util: Use g_new() & friends where that makes obvious sense
util: Fuse g_malloc(); memset() into g_new0()
util: Drop superfluous conditionals around g_free()
Drop superfluous conditionals around g_strdup()
Drop superfluous conditionals around qemu_opts_del()
usb: delete redundant brackets in usb_host_handle_control()
virtio-bus: avoid breaking build when open DEBUG switch
acpi-build: Make DPRINTF working for acpi-build
acpi-build: adjust indention 8 -> 4 spaces
target-s390x: fix possible out of bounds read
qmp: fix typo in input-send-event examples
Peter Maydell [Thu, 11 Dec 2014 16:47:23 +0000 (16:47 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20141211' into staging
target-arm queue:
* pass semihosting exit code out to system
* more TrustZone support code (still not enabled yet)
* allow user to direct semihosting to gdb or native explicitly
rather than always auto-guessing the destination
* fix memory leak in realview_init
* fix coverity warning in hw/arm/boot
* get state migration working for AArch64 CPUs
* check errors in kvm_arm_reset_vcpu
# gpg: Signature made Thu 11 Dec 2014 12:16:19 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
* remotes/pmaydell/tags/pull-target-arm-20141211: (33 commits)
target-arm: Check error conditions on kvm_arm_reset_vcpu
target-arm: Support save/load for 64 bit CPUs
target-arm/kvm: make reg sync code common between kvm32/64
arm_gic_kvm: Tell kernel about number of IRQs
hw/arm/boot: fix uninitialized scalar variable warning reported by coverity
hw/arm/realview.c: Fix memory leak in realview_init()
target-arm: make MAIR0/1 banked
target-arm: make c13 cp regs banked (FCSEIDR, ...)
target-arm: make VBAR banked
target-arm: make PAR banked
target-arm: make IFAR/DFAR banked
target-arm: make DFSR banked
target-arm: make IFSR banked
target-arm: make DACR banked
target-arm: make TTBCR banked
target-arm: make TTBR0/1 banked
target-arm: make CSSELR banked
target-arm: respect SCR.FW, SCR.AW and SCTLR.NMFI
target-arm: add SCTLR_EL3 and make SCTLR banked
target-arm: add MVBAR support
...
Peter Maydell [Thu, 11 Dec 2014 12:36:32 +0000 (12:36 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.3
# gpg: Signature made Wed 10 Dec 2014 09:31:53 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
* remotes/kevin/tags/for-upstream: (73 commits)
vmdk: Set errp on failures in vmdk_open_vmdk4
vmdk: Remove unnecessary initialization
vmdk: Check descriptor file length when reading it
vmdk: Clean up descriptor file reading
vmdk: Fix comment to match code of extent lines
vmdk: Use g_random_int to generate CID
block: Use g_new0() for a bit of extra type checking
block: remove BLOCK_OPT_NOCOW from vpc_create_opts
block: remove BLOCK_OPT_NOCOW from vdi_create_opts
qemu-iotests: Skip 099 for VMDK subformats with desc file
block/raw-posix: Fix ret in raw_open_common()
qcow2: Respect bdrv_truncate() error
qcow2: Flushing the caches in qcow2_close may fail
qcow2: Prevent numerical overflow
iotests: Add test for unsupported image creation
iotests: Only kill NBD server if it runs
qemu-img: Check create_opts before image amendment
qemu-img: Check create_opts before image creation
block: Check create_opts before image creation
block/nfs: Add create_opts
...
Christoffer Dall [Thu, 11 Dec 2014 12:07:53 +0000 (12:07 +0000)]
target-arm: Check error conditions on kvm_arm_reset_vcpu
When resetting a VCPU we currently call both kvm_arm_vcpu_init() and
write_kvmstate_to_list(), both of which can fail, but we never check the
return value.
The only choice here is to print an error an exit if the calls fail.
Peter Maydell [Thu, 11 Dec 2014 12:07:53 +0000 (12:07 +0000)]
target-arm: Support save/load for 64 bit CPUs
For migration to work on 64 bit CPUs, we need to include both
the 64-bit integer register file and the PSTATE. Everything
else is either stored in the same place as existing 32-bit CPU
state or handled by the generic sysreg mechanism.
Alex Bennée [Thu, 11 Dec 2014 12:07:53 +0000 (12:07 +0000)]
target-arm/kvm: make reg sync code common between kvm32/64
Before we launch a guest we query KVM for the list of "co-processor"
registers it knows about. This is used to synchronize system
register state for the bulk of coprocessor/system registers.
Move this code from the 32-bit specific vcpu init function into
a common routine and call it also from the 64-bit vcpu init.
This allows system registers to migrate correctly when using
KVM, and also permits QEMU code to see the current KVM register
state (which will be needed to support big-endian guests, since
the virtio endianness callback must check for some system register
settings).
Since vcpu reset also has to sync registers, we move the
32 bit kvm_arm_reset_vcpu() into common code as well and
share it with the 64 bit version.
Signed-off-by: Alex Bennée <[email protected]>
[PMM: just copy the 32-bit code rather than improving it along the way;
don't share reg_syncs_via_tuple_list() between 32 and 64 bit;
tweak function names; move reset] Signed-off-by: Peter Maydell <[email protected]>
Peter Maydell [Thu, 11 Dec 2014 12:07:53 +0000 (12:07 +0000)]
arm_gic_kvm: Tell kernel about number of IRQs
Newer kernels support a device attribute on the GIC which allows us to
tell it how many IRQs this GIC instance is configured with; use it, if
it exists.
zhanghailiang [Thu, 11 Dec 2014 12:07:53 +0000 (12:07 +0000)]
hw/arm/boot: fix uninitialized scalar variable warning reported by coverity
Coverity reports the 'size' may be used uninitialized, but that can't happen,
because the caller has checked "if (binfo->dtb_filename || binfo->get_dtb)"
before call 'load_dtb'.
Here we simply remove the 'if (binfo->get_dtb)' to satisfy coverity.
Nikita Belov [Thu, 11 Dec 2014 12:07:52 +0000 (12:07 +0000)]
hw/arm/realview.c: Fix memory leak in realview_init()
Variable 'ram_lo' is allocated unconditionally, but used only in some cases.
When it is unused pointer will be lost at function exit, resulting in a
memory leak. Allocate memory for 'ram_lo' only if it is needed.
Valgrind output:
==16879== 240 bytes in 1 blocks are definitely lost in loss record 6,033 of 7,018
==16879== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16879== by 0x33D2CE: malloc_and_trace (vl.c:2804)
==16879== by 0x509E610: g_malloc (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4000.0)
==16879== by 0x288836: realview_init (realview.c:55)
==16879== by 0x28988C: realview_pb_a8_init (realview.c:375)
==16879== by 0x341426: main (vl.c:4413)
Fabian Aggeler [Thu, 11 Dec 2014 12:07:52 +0000 (12:07 +0000)]
target-arm: make c13 cp regs banked (FCSEIDR, ...)
When EL3 is running in AArch32 (or ARMv7 with Security Extensions)
FCSEIDR, CONTEXTIDR, TPIDRURW, TPIDRURO and TPIDRPRW have a secure
and a non-secure instance.
Greg Bellows [Thu, 11 Dec 2014 12:07:52 +0000 (12:07 +0000)]
target-arm: make VBAR banked
When EL3 is running in Aarch32 (or ARMv7 with Security Extensions)
VBAR has a secure and a non-secure instance, which are mapped to
VBAR_EL1 and VBAR_EL3.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:51 +0000 (12:07 +0000)]
target-arm: make IFSR banked
When EL3 is running in AArch32 (or ARMv7 with Security Extensions)
IFSR has a secure and a non-secure instance. Adds IFSR32_EL2 definition and
storage.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:51 +0000 (12:07 +0000)]
target-arm: make TTBCR banked
Adds secure and non-secure bank register suport for TTBCR.
Added new struct to compartmentalize the TCR data and masks. Removed old
tcr/ttbcr data and added a 4 element array of the new structs in cp15. This
allows for one entry per EL. Added a CP register definition for TCR_EL3.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:51 +0000 (12:07 +0000)]
target-arm: make TTBR0/1 banked
Adds secure and non-secure bank register suport for TTBR0 and TTBR1.
Changes include adding secure and non-secure instances of ttbr0 and ttbr1 as
well as a CP register definition for TTBR0_EL3. Added a union containing
both EL based array fields and secure and non-secure fields mapped to them.
Updated accesses to use A32_BANKED_CURRENT_REG_GET macro.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:50 +0000 (12:07 +0000)]
target-arm: respect SCR.FW, SCR.AW and SCTLR.NMFI
Add checks of SCR AW/FW bits when performing writes of CPSR. These SCR bits
are used to control whether the CPSR masking bits can be adjusted from
non-secure state.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:49 +0000 (12:07 +0000)]
target-arm: implement IRQ/FIQ routing to Monitor mode
SCR.{IRQ/FIQ} bits allow to route IRQ/FIQ exceptions to monitor CPU
mode. When taking IRQ exception to monitor mode FIQ exception is
additionally masked.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:49 +0000 (12:07 +0000)]
target-arm: move AArch32 SCR into security reglist
Define a new ARM CP register info list for the ARMv7 Security Extension
feature. Register that list only for ARM cores with Security Extension/EL3
support. Moving AArch32 SCR into Security Extension register group.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:49 +0000 (12:07 +0000)]
target-arm: add CPREG secure state support
Prepare ARMCPRegInfo to support specifying two fieldoffsets per
register definition. This will allow us to keep one register
definition for banked registers (different offsets for secure/
non-secure world).
Also added secure state tracking field and flags. This allows for
identification of the register info secure state.
Fabian Aggeler [Thu, 11 Dec 2014 12:07:48 +0000 (12:07 +0000)]
target-arm: add banked register accessors
If EL3 is in AArch32 state certain cp registers are banked (secure and
non-secure instance). When reading or writing to coprocessor registers
the following macros can be used.
- A32_BANKED macros are used for choosing the banked register based on provided
input security argument. This macro is used to choose the bank during
translation of MRC/MCR instructions that are dependent on something other
than the current secure state.
- A32_BANKED_CURRENT macros are used for choosing the banked register based on
current secure state. This is NOT to be used for choosing the bank used
during translation as it breaks monitor mode.
If EL3 is operating in AArch64 state coprocessor registers are not
banked anymore. The macros use the non-secure instance (_ns) in this
case, which is architecturally mapped to the AArch64 EL register.
Greg Bellows [Thu, 11 Dec 2014 12:07:48 +0000 (12:07 +0000)]
target-arm: add async excp target_el function
Adds a dedicated function and a lookup table for determining the target
exception level of IRQ and FIQ exceptions. The lookup table is taken from the
ARMv7 and ARMv8 specification exception routing tables.
Greg Bellows [Thu, 11 Dec 2014 12:07:48 +0000 (12:07 +0000)]
target-arm: extend async excp masking
This patch extends arm_excp_unmasked() to use lookup tables for determining
whether IRQ and FIQ exceptions are masked. The lookup tables are based on the
ARMv8 and ARMv7 specification physical interrupt masking tables.
If EL3 is using AArch64 IRQ/FIQ masking is ignored in all exception levels
other than EL3 if SCR.{FIQ|IRQ} is set to 1 (routed to EL3).
Liviu Ionescu [Thu, 11 Dec 2014 12:07:48 +0000 (12:07 +0000)]
Add the "-semihosting-config" option.
The usual semihosting behaviour is to process the system calls locally and
return; unfortuantelly the initial implementation dinamically changed the
target to GDB during debug sessions, which, for the usual arm-none-eabi-gdb,
is not implemented. The result was that during debug sessions the semihosting
calls were discarded.
This patch adds a configuration variable and an option to set it on the
command line:
This option enables semihosting and defines where the semihosting calls will
be addressed, to QEMU ('native') or to GDB ('gdb'). The default is auto, which
means 'gdb' during debug sessions and 'native' otherwise.
Signed-off-by: Liviu Ionescu <[email protected]>
Message-id: 1416341957[email protected]
[PMM: moved declaration and definition of semihosting_target to
gdbstub.h and gdbstub.c to fix build failure on linux-user] Signed-off-by: Peter Maydell <[email protected]>
# gpg: Signature made Wed 10 Dec 2014 11:21:58 GMT using RSA key ID 6B69CA14
# gpg: Good signature from "Bastian Koppelmann <[email protected]>"
* remotes/bkoppelmann/tags/pull-tricore-20141210:
target-tricore: Add instructions of RCR opcode format
target-tricore: Add instructions of RLC opcode format
target-tricore: Add instructions of RCPW, RCRR and RCRW opcode format
target-tricore: Make TRICORE_FEATURES implying others.
target-tricore: Add instructions of RC opcode format
target-tricore: Add instructions of BRR opcode format
target-tricore: Add instructions of BRN opcode format
target-tricore: Add instructions of BRC opcode format
target-tricore: Add instructions of BOL opcode format
target-tricore: Add instructions of RCR opcode format
Add instructions of RCR opcode format.
Add helper for madd32/64_ssov and madd32/64_suov.
Add helper for msub32/64_ssov and msub32/64_suov.
Add microcode generator function madd/msub for 32bit and 64bit, which calculate a mul and a add/sub.
OPC2_32_RCR_MSUB_U_32 -> OPC2_32_RCR_MSUB_U_32.
target-tricore: Add instructions of RLC opcode format
Add instructions of RLC opcode format.
Add helper psw_write/read.
Add microcode generator gen_mtcr/mfcr, which loads/stores a value to a core special function register, which are defined in csfr.def
target-tricore: Make TRICORE_FEATURES implying others.
Since all the TriCore instructionsets are subsets of each other (1.3 C 1.3.1 C 1.6),
make the features implying each other, e.g 1.6 also has 1.3.1 and 1.3. This way
we only need to check our features for the instructionset, where a instruction was first introduced.
target-tricore: Add instructions of RC opcode format
Add instructions of RC opcode format.
Add helper for mul, sha, absdif with signed saturation on overflow.
Add helper for add, sub, mul with unsigned saturation on overflow.
Add microcode generator functions:
* gen_add_CC, which calculates the carry bit.
* gen_addc_CC, which adds the carry bit to the add and calculates the carry bit.
* gen_absdif, which calculates the absolute difference.
* gen_mul_i64s/u, which mul two 32 bits val into one 64bit reg.
* gen_sh_hi, which shifts two 16bit words in one reg.
* gen_sha_hi, which does a arithmetic shift on two 16bit words.
* gen_sh_cond, which shifts left a reg by one and writes the result of cond into the lsb.
* gen_accumulating_cond, which ands/ors/xors the result of cond of the lsbs
with the lsb of the result.
* gen_eqany_bi/hi, which checks ever byte/hword on equality.
Fam Zheng [Wed, 3 Dec 2014 23:28:32 +0000 (07:28 +0800)]
vmdk: Check descriptor file length when reading it
Since a too small file cannot be a valid VMDK image, and also since the
buffer's first 4 bytes will be unconditionally examined by
vmdk_open_sparse, let's error out the small file case to be clear.
Fam Zheng [Wed, 3 Dec 2014 23:28:30 +0000 (07:28 +0800)]
vmdk: Fix comment to match code of extent lines
commit 04d542c8b (vmdk: support vmfs files) added support of VMFS extent
type but the comment above the changed code is left out. Update the
comment so they are consistent.
Fam Zheng [Wed, 3 Dec 2014 23:28:29 +0000 (07:28 +0800)]
vmdk: Use g_random_int to generate CID
This replaces two "time(NULL)" invocations with "g_random_int()".
According to VMDK spec, CID "is a random 32‐bit value updated the first
time the content of the virtual disk is modified after the virtual disk
is opened". Using "seconds since epoch" is just a "lame way" to generate
it, and not completely safe because of the low precision.
Jeff Cody [Wed, 3 Dec 2014 15:30:08 +0000 (10:30 -0500)]
block: remove BLOCK_OPT_NOCOW from vpc_create_opts
In commit fef6070, the need for NOCOW was removed from the vpc driver,
as we removed the the posix calls. However, the BLOCK_OPT_NOCOW was not
removed from vpc_create_opts. This was a mistake - remove the opt from
there as well.
Jeff Cody [Wed, 3 Dec 2014 15:30:07 +0000 (10:30 -0500)]
block: remove BLOCK_OPT_NOCOW from vdi_create_opts
In commit 7074786, the need for NOCOW was removed from the vdi driver,
as we removed the the posix calls. However, the BLOCK_OPT_NOCOW was not
removed from vdi_create_opts. This was a mistake - remove the opt from
there as well.
Max Reitz [Tue, 2 Dec 2014 17:32:51 +0000 (18:32 +0100)]
qcow2: Flushing the caches in qcow2_close may fail
qcow2_cache_flush() may fail; if one of the caches failed to be flushed
successfully to disk in qcow2_close() the image should not be marked
clean, and we should emit a warning.
This breaks the (qcow2-specific) iotests 026, 071 and 089; change their
output accordingly.
Max Reitz [Tue, 2 Dec 2014 17:32:50 +0000 (18:32 +0100)]
qcow2: Prevent numerical overflow
In qcow2_alloc_cluster_offset(), *num is limited to
INT_MAX >> BDRV_SECTOR_BITS by all callers. However, since remaining is
of type uint64_t, we might as well cast *num to that type before
performing the shift.
Max Reitz [Tue, 2 Dec 2014 17:32:49 +0000 (18:32 +0100)]
iotests: Add test for unsupported image creation
Add a test for creating and amending images (amendment uses the creation
options) with formats not supporting creation over protocols not
supporting creation.
Max Reitz [Tue, 2 Dec 2014 17:32:48 +0000 (18:32 +0100)]
iotests: Only kill NBD server if it runs
There may be NBD tests which do not create a sample image and simply
test whether wrong usage of the protocol is rejected as expected. In
this case, there will be no NBD server and trying to kill it during
clean-up will fail.
Max Reitz [Tue, 2 Dec 2014 17:32:47 +0000 (18:32 +0100)]
qemu-img: Check create_opts before image amendment
The image options which can be amended are described by the .create_opts
field for every driver. This field must therefore be non-NULL so that
anything can be amended in the first place. Check that this holds true
before going into qemu_opts_create() (because if .create_opts is NULL,
the create_opts pointer in img_amend() will be NULL after
qemu_opts_append()).
Max Reitz [Tue, 2 Dec 2014 17:32:46 +0000 (18:32 +0100)]
qemu-img: Check create_opts before image creation
If a driver supports image creation, it needs to set the .create_opts
field. We can use that to make sure .create_opts for both drivers
involved is not NULL for the target image in qemu-img convert, which is
important so that the create_opts pointer in img_convert() is not NULL
after the qemu_opts_append() calls and when going into
qemu_opts_create().
Max Reitz [Tue, 2 Dec 2014 17:32:45 +0000 (18:32 +0100)]
block: Check create_opts before image creation
If a driver supports image creation, it needs to set the .create_opts
field. We can use that to make sure .create_opts for both drivers
involved is not NULL in bdrv_img_create(), which is important so that
the create_opts pointer in that function is not NULL after the
qemu_opts_append() calls and when going into qemu_opts_create().
Without this patch, it segfaults. With this patch, it does not. However,
this is not something that should really work; qemu-img should check
whether the parameter for the -f option (and -O for convert) is indeed a
format, and error out if it is not. Therefore, I am not making it an
iotest.
Max Reitz [Tue, 2 Dec 2014 17:32:42 +0000 (18:32 +0100)]
block: Omit bdrv_find_format for essential drivers
We can always assume raw, file and qcow2 being available; so do not use
bdrv_find_format() to locate their BlockDriver objects but statically
reference the respective objects.
Max Reitz [Tue, 2 Dec 2014 17:32:41 +0000 (18:32 +0100)]
block: Make essential BlockDriver objects public
There are some block drivers which are essential to QEMU and may not be
removed: These are raw, file and qcow2 (as the default non-raw format).
Make their BlockDriver objects public so they can be directly referenced
throughout the block layer without needing to call bdrv_find_format()
and having to deal with an error at runtime, while the real problem
occurred during linking (where raw, file or qcow2 were not linked into
qemu).
Max Reitz [Wed, 3 Dec 2014 09:15:04 +0000 (10:15 +0100)]
iotests: Specify qcow2 format for qemu-io in 059
There are two instances of iotest 059 using qemu-io on a qcow2 image. As
of "qemu-iotests: Use qemu-io -f $IMGFMT" the iotests can no longer rely
on $QEMU_IO doing probing, therefore the qcow2 format has to be
specified explicitly here.
Kevin Wolf [Wed, 3 Dec 2014 12:21:32 +0000 (13:21 +0100)]
ide: Check validity of logical block size
Our IDE emulation can't handle logical block sizes other than 512. Check
for it.
The original assumption was that other values would silently be ignored
(which is bad enough), but it's not quite true: The physical block size
is exposed in IDENTIFY DEVICE as a multiple of the logical block size.
Setting a logical block size therefore also corrupts the physical block
size (4096/4096 doesn't silently downgrade to 4096/512, but 512/512).
Paolo Bonzini [Fri, 28 Nov 2014 11:38:03 +0000 (11:38 +0000)]
block: do not use get_clock()
Use the external qemu-timer API instead.
No one else should be calling cpu_get_clock(), get_clock() and
get_clock_realtime() directly; they are internal functions and they
should be confined to qemu-timer.c and cpus.c (where the icount
implementation resides). All accesses should go through
qemu_clock_get_ns.
Kevin Wolf [Tue, 25 Nov 2014 17:12:42 +0000 (18:12 +0100)]
block: Don't probe for unknown backing file format
If a qcow2 image specifies a backing file format that doesn't correspond
to any format driver that qemu knows, we shouldn't fall back to probing,
but simply error out.
Not looking up the backing file driver in bdrv_open_backing_file(), but
just filling in the "driver" option if it isn't there moves us closer to
the goal of having everything in QDict options and gets us the error
handling of bdrv_open(), which correctly refuses unknown drivers.
Kevin Wolf [Tue, 25 Nov 2014 17:12:40 +0000 (18:12 +0100)]
qcow2: Fix header extension size check
After reading the extension header, offset is incremented, but not
checked against end_offset any more. This way an integer overflow could
happen when checking whether the extension end is within the allowed
range, effectively disabling the check.
This patch adds the missing check and a test case for it.
Stefan Hajnoczi [Fri, 21 Nov 2014 10:48:59 +0000 (10:48 +0000)]
blockdev: acquire AioContext in QMP 'transaction' actions
The transaction QMP command performs operations atomically on a group of
drives. This command needs to acquire AioContext in order to work
safely when virtio-blk dataplane IOThreads are accessing drives.
The transactional nature of the command means that actions are split
into prepare, commit, abort, and clean functions. Acquire the
AioContext in prepare and don't release it until one of the other
functions is called. This prevents the IOThread from running the
AioContext before the transaction has completed.
Stefan Hajnoczi [Fri, 21 Nov 2014 10:48:58 +0000 (10:48 +0000)]
blockdev: drop unnecessary DriveBackupState field assignment
drive_backup_prepare() assigns DriveBackupState fields to NULL in the
error path. This is unnecessary because the DriveBackupState is
allocated using g_malloc0() and other functions like
external_snapshot_prepare() already rely on this.
Do not explicitly assign fields to NULL so that the error path is
concise and does not require modification when fields are added to
DriveBackupState.
Kevin Wolf [Thu, 20 Nov 2014 15:27:13 +0000 (16:27 +0100)]
qemu-iotests: Fix stderr handling in common.qemu
The original intention was to pipe stderr of qemu into $fifo_out.
However, the redirections were specified in the wrong order for this.
This patch fixes it.
Now qemu's output on stderr can be retrieved with _send_qemu_cmd, which
applies several useful filters on the output that were missing before.
Kevin Wolf [Thu, 20 Nov 2014 15:27:12 +0000 (16:27 +0100)]
raw: Prohibit dangerous writes for probed images
If the user neglects to specify the image format, QEMU probes the
image to guess it automatically, for convenience.
Relying on format probing is insecure for raw images (CVE-2008-2004).
If the guest writes a suitable header to the device, the next probe
will recognize a format chosen by the guest. A malicious guest can
abuse this to gain access to host files, e.g. by crafting a QCOW2
header with backing file /etc/shadow.
Commit 1e72d3b (April 2008) provided -drive parameter format to let
users disable probing. Commit f965509 (March 2009) extended QCOW2 to
optionally store the backing file format, to let users disable backing
file probing. QED has had a flag to suppress probing since the
beginning (2010), set whenever a raw backing file is assigned.
All of these additions that allow to avoid format probing have to be
specified explicitly. The default still allows the attack.
In order to fix this, commit 79368c8 (July 2010) put probed raw images
in a restricted mode, in which they wouldn't be able to overwrite the
first few bytes of the image so that they would identify as a different
image. If a write to the first sector would write one of the signatures
of another driver, qemu would instead zero out the first four bytes.
This patch was later reverted in commit 8b33d9e (September 2010) because
it didn't get the handling of unaligned qiov members right.
Today's block layer that is based on coroutines and has qiov utility
functions makes it much easier to get this functionality right, so this
patch implements it.
The other differences of this patch to the old one are that it doesn't
silently write something different than the guest requested by zeroing
out some bytes (it fails the request instead) and that it doesn't
maintain a list of signatures in the raw driver (it calls the usual
probe function instead).
Note that this change doesn't introduce new breakage for false positive
cases where the guest legitimately writes data into the first sector
that matches the signatures of an image format (e.g. for nested virt):
These cases were broken before, only the failure mode changes from
corruption after the next restart (when the wrong format is probed) to
failing the problematic write request.
Also note that like in the original patch, the restrictions only apply
if the image format has been guessed by probing. Explicitly specifying a
format allows guests to write anything they like.
Kevin Wolf [Thu, 20 Nov 2014 15:27:11 +0000 (16:27 +0100)]
block: Read only one sector for format probing
The only image format driver that even potentially accesses anything
after 512 bytes in its bdrv_probe() implementation is VMDK, which reads
a plain-text descriptor file. In practice, the field it's looking for
seems to come first and will be well within the first 512 bytes, too.
Kevin Wolf [Thu, 20 Nov 2014 15:27:07 +0000 (16:27 +0100)]
qemu-iotests: Use qemu-io -f $IMGFMT
This patch changes $QEMU_IO so that all tests by default pass a format
argument to qemu-io.
There are a few cases where -f $IMGFMT is not wanted because it selects
the wrong driver or json: filenames including a driver are used. They
are changed to use $QEMU_IO_PROG, which doesn't include any options.
Tests 071 and 081 have output changes because now the actual request
fails instead of reading the 2k probing buffer.
Max Reitz [Tue, 18 Nov 2014 11:21:19 +0000 (12:21 +0100)]
qemu-nbd: Use BlockBackend where reasonable
Because qemu-nbd creates the BlockBackend by itself, it should create
the according BlockDriverState tree by itself as well; that means, it
has call bdrv_open() on its own. This is one of the places where
qemu-nbd still needs to use a BlockDriverState directly (the root BDS
below the BB); other places are the configuration of zero detection
(which may be lifted into the BB eventually, but is not yet) and
temporarily loading a snapshot.
Everywhere else, though, qemu-nbd can and thus should use BlockBackend.