We want to remove knowledge of BLOCK_ERR_STOP_ENOSPC from drivers;
drivers should only be told whether to stop/report/ignore the error.
On the other hand, we want to keep using the nicer BlockErrorAction
name in the drivers. So rename the enums, while leaving aside the
names of the enum values for now.
Paolo Bonzini [Fri, 28 Sep 2012 15:22:52 +0000 (17:22 +0200)]
qemu-iotests: add test for pausing a streaming operation
These check that a paused streaming job does not advance its offset.
Sometimes the new test fails; the map is different between the source
and the destination of the streaming because qemu-io does not always
pack adjacent clusters that have the same allocated/unallocated state.
However, this also happens with the existing test_stream testcase, and
is better fixed in qemu-io.
Paolo Bonzini [Fri, 28 Sep 2012 15:22:51 +0000 (17:22 +0200)]
qmp: add block-job-pause and block-job-resume
Add QMP commands matching the functionality.
Paused jobs cannot be canceled without first resuming them. This
ensures that I/O errors are never missed by management. However, an
optional force argument can be specified to allow that.
Paolo Bonzini [Fri, 28 Sep 2012 15:22:50 +0000 (17:22 +0200)]
block: add support for job pause/resume
Job pausing reuses the existing support for cancellable sleeps. A pause
happens at the next sleeping point and lasts until the coroutine is
re-entered explicitly. Cancellation was already doing a forced resume,
so implement it explicitly in terms of resume.
Paused jobs cannot be canceled without first resuming them. This ensures
that I/O errors are never missed by management.
Paolo Bonzini [Fri, 28 Sep 2012 15:22:49 +0000 (17:22 +0200)]
qmp: add 'busy' member to BlockJobInfo
Because pausing a job is asynchronous, we need to know whether it has
completed. This is described by the "busy" field of BlockJob; copy it
to BlockJobInfo.
Jeff Cody [Thu, 27 Sep 2012 17:29:17 +0000 (13:29 -0400)]
qemu-iotests: add initial tests for live block commit
Derived from the streaming test cases (030), this adds the
following 9 tests:
1. For the following image chain, commit [mid] into [backing],
and use qemu-io to verify [backing] has its original data, as
well as the data from [mid]
[backing] <-- [mid] <-- [test]
2. Verifies that 'block-commit' with the 'speed' parameter sets the
speed parameter, as reported by 'query-block-jobs'
3. Verifies that a bogus 'device' parameter to 'block-commit'
results in error
4-9: Appropriate error values returned for the following argument errors:
* top == base
* top is nonexistent
* base is nonexistent
* top == active layer (this is currently not supported)
* top and base arguments are reversed
* top argument is omitted
Jeff Cody [Thu, 27 Sep 2012 17:29:16 +0000 (13:29 -0400)]
QAPI: add command for live block commit, 'block-commit'
The command for live block commit is added, which has the following
arguments:
device: the block device to perform the commit on (mandatory)
base: the base image to commit into; optional (if not specified,
it is the underlying original image)
top: the top image of the commit - all data from inside top down
to base will be committed into base (mandatory for now; see
note, below)
speed: maximum speed, in bytes/sec
Note: Eventually this command will support merging down the active layer,
but that code is not yet complete. If the active layer is passed
in as top, then an error will be returned. Once merging down the
active layer is supported, the 'top' argument may become optional,
and default to the active layer.
The is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
be emitted.
Jeff Cody [Thu, 27 Sep 2012 17:29:13 +0000 (13:29 -0400)]
block: add live block commit functionality
This adds the live commit coroutine. This iteration focuses on the
commit only below the active layer, and not the active layer itself.
The behaviour is similar to block streaming; the sectors are walked
through, and anything that exists above 'base' is committed back down
into base. At the end, intermediate images are deleted, and the
chain stitched together. Images are restored to their original open
flags upon completion.
Jeff Cody [Thu, 27 Sep 2012 17:29:12 +0000 (13:29 -0400)]
block: add support functions for live commit, to find and delete images.
Add bdrv_find_overlay(), and bdrv_drop_intermediate().
bdrv_find_overlay(): given 'bs' and the active (topmost) BDS of an image chain,
find the image that is the immediate top of 'bs'
bdrv_drop_intermediate():
Given 3 BDS (active, top, base), drop images above
base up to and including top, and set base to be the
backing file of top's overlay node.
This patch adds gluster as the new block backend in QEMU. This gives
QEMU the ability to boot VM images from gluster volumes. Its already
possible to boot from VM images on gluster volumes using FUSE mount, but
this patchset provides the ability to boot VM images from gluster volumes
by by-passing the FUSE layer in gluster. This is made possible by
using libgfapi routines to perform IO on gluster volumes directly.
VM Image on gluster volume is specified like this:
'transport' specifies the transport type used to connect to gluster
management daemon (glusterd). Valid transport types are
tcp, unix and rdma. If a transport type isn't specified, then tcp
type is assumed.
'server' specifies the server where the volume file specification for
the given volume resides. This can be either hostname, ipv4 address
or ipv6 address. ipv6 address needs to be within square brackets [ ].
If transport type is 'unix', then 'server' field should not be specifed.
The 'socket' field needs to be populated with the path to unix domain
socket.
'port' is the port number on which glusterd is listening. This is optional
and if not specified, QEMU will send 0 which will make gluster to use the
default port. If the transport type is unix, then 'port' should not be
specified.
'volname' is the name of the gluster volume which contains the VM image.
'image' is the path to the actual VM image that resides on gluster volume.
Paolo Bonzini [Mon, 24 Sep 2012 09:10:56 +0000 (14:40 +0530)]
aio: Fix qemu_aio_wait() to maintain correct walking_handlers count
Fix qemu_aio_wait() to ensure that registered aio handlers don't get
deleted when they are still active. This is ensured by maintaning the
right count of walking_handlers.
Jeff Cody [Tue, 25 Sep 2012 16:29:39 +0000 (12:29 -0400)]
block: after creating a live snapshot, make old image read-only
Currently, after a live snapshot of a drive, the image that has
been 'demoted' to be below the new active layer remains r/w.
This patch reopens it read-only.
Note that we do not check for error on the reopen(), because we
will not abort the snapshots if the reopen fails.
Kevin Wolf [Tue, 25 Sep 2012 13:47:36 +0000 (15:47 +0200)]
block-migration: Flush requests in blk_mig_cleanup
When cancelling block migration, all in-flight requests of the block
migration must be completed before the data can be freed. This was
visible as failing assertions and segfaults.
Merge branch 'arm-devs.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm
* 'arm-devs.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm:
Versatile Express: Add modelling of NOR flash
Versatile Express: Fix NOR flash 0 address and remove flash alias
hw/armv7m_nvic: Correctly register GIC region when setting up NVIC
pl190: fix read of VECTADDR
The blank lines inside the single dump make it difficult for the
eye to pick out the block. Worse, with interior newlines, but
no blank line following, the PSW line appears to belong to the
next dump block.
Francesco Lavra [Wed, 19 Sep 2012 05:57:21 +0000 (05:57 +0000)]
Versatile Express: Add modelling of NOR flash
This patch adds modelling of the two NOR flash banks found on the
Versatile Express motherboard. Tested with U-Boot running on an emulated
Versatile Express, with either A9 or A15 CoreTile.
Francesco Lavra [Wed, 19 Sep 2012 05:51:58 +0000 (05:51 +0000)]
Versatile Express: Fix NOR flash 0 address and remove flash alias
In the A series memory map (implemented in the Cortex A15 CoreTile), the
first NOR flash bank (flash 0) is mapped to address 0x08000000, while
address 0x00000000 can be configured as alias to either the first or the
second flash bank. This patch fixes the definition of flash 0 address,
and for simplicity removes the alias definition.
hw/armv7m_nvic: Correctly register GIC region when setting up NVIC
When setting up the NVIC memory regions the memory range
0x100..0xcff is aliased to an IO memory region that belongs
to the ARM GIC. This aliased region should be added to the
NVIC memory container, but the actual GIC IO memory region
was being added instead. This mixup was causing the wrong
IO memory access functions to be called when accessing parts
of the NVIC memory.
Reading VECTADDR was causing us to set the current priority to
the wrong value, the most obvious effect of which was that we
would return the vector for the wrong interrupt as the result
of the read.
Amos Kong [Fri, 7 Sep 2012 03:11:03 +0000 (11:11 +0800)]
add a boot parameter to set reboot timeout
Added an option to let qemu transfer a configuration file to bios,
"etc/boot-fail-wait", which could be specified by command
-boot reboot-timeout=T
T have a max value of 0xffff, unit is ms.
With this option, guest will wait for a given time if not find
bootabled device, then reboot. If reboot-timeout is '-1', guest
will not reboot, qemu passes '-1' to bios by default.
This feature need the new seabios's support.
Seabios pulls the value from the fwcfg "file" interface, this
interface is used because SeaBIOS needs a reliable way of
obtaining a name, value size, and value. It in no way requires
that there be a real file on the user's host machine.
getaddrinfo can give us a list of addresses, but we only try to
connect to the first one. If that fails we never proceed to
the next one. This is common on desktop setups that often have ipv6
configured but not actually working.
To fix this make inet_connect_nonblocking retry connection with a different
address.
callers on inet_nonblocking_connect register a callback function that will
be called when connect opertion completes, in case of failure the fd will have
a negative value
Stefan Weil [Fri, 14 Sep 2012 17:02:30 +0000 (19:02 +0200)]
configure: Allow builds without any system or user emulation
The old code aborted configure when no emulation target was selected.
Even after removing the 'exit 1', it tried to read from STDIN
when QEMU was configured with
This patch adds a "use64" property which will make the ivshmem driver
register a 64bit memory bar when set, so you have something to play with
when testing 64bit pci bits. It also allows to have quite big shared
memory regions, like this:
[root@fedora ~]# lspci -vs1:1
01:01.0 RAM memory: Red Hat, Inc Device 1110
Subsystem: Red Hat, Inc Device 1100
Physical Slot: 1-1
Flags: fast devsel
Memory at fd400000 (32-bit, non-prefetchable) [disabled] [size=256]
Memory at 8040000000 (64-bit, prefetchable) [size=1G]
[ v5: rebase, update compat property for post-1.2 merge ]
[ v4: rebase & adapt to latest master again ]
[ v3: rebase & adapt to latest master ]
[ v2: default to on as suggested by avi,
turn off for pc-$old using compat property ]
Igor Mammedov [Wed, 5 Sep 2012 21:06:21 +0000 (23:06 +0200)]
Introduce powerdown_notifiers
Notifier will be used for signaling powerdown request to guest in
a more general way and intended to replace very specific
qemu_irq_rise(qemu_system_powerdown) and will allow to remove global
variable qemu_system_powerdown.
tcg: Streamline movcond_i64 using 32-bit arithmetic
Avoiding 64-bit arithmetic (outside of the compare) reduces the
generated op count from 15 to 12, and the generated code size on
i686 from 105 to 88 bytes.
Checking that we don't try for idx != [01] is trivial. Checking
that we don't issue more than one of any index requires a tad
more data and some ifdefs protecting that new variable.
For tcg_gen_concat_i32_i64 we only use deposit if the host supports it.
For tcg_gen_concat32_i64 even if the host does not, as we get identical
code before and after.
Note that this relies on the ANDI -> EXTU patch for the identity claim.
Anthony Liguori [Tue, 25 Sep 2012 21:06:16 +0000 (16:06 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
* kwolf/for-anthony:
block: remove keep_read_only flag from BlockDriverState struct
block: convert bdrv_commit() to use bdrv_reopen()
block: vpc image file reopen
block: vdi image file reopen
block: vmdk image file reopen
block: qcow image file reopen
block: qcow2 image file reopen
block: qed image file reopen
block: raw image file reopen
block: raw-posix image file reopen
block: purge s->aligned_buf and s->aligned_buf_size from raw-posix.c
block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.c
block: do not parse BDRV_O_CACHE_WB in block drivers
block: move open flag parsing in raw block drivers to helper functions
block: move aio initialization into a helper function
block: Framework for reopening files safely
block: make bdrv_set_enable_write_cache() modify open_flags
block: correctly set the keep_read_only flag
blockdev: preserve readonly and snapshot states across media changes
Anthony Liguori [Tue, 25 Sep 2012 21:06:16 +0000 (16:06 -0500)]
Merge remote-tracking branch 'stefanha/trivial-patches' into staging
* stefanha/trivial-patches:
w32: Always use standard instead of native format strings
net/socket: Fix compiler warning (regression for MinGW)
linux-user: Remove redundant null check and replace free by g_free
qemu-timer: simplify qemu_run_timers
TextConsole: saturate escape parameter in TTY_STATE_CSI
curses: don't initialize curses when qemu is daemonized
dtrace backend: add function to reserved words
pflash_cfi01: Fix warning caused by unreachable code
ioh3420: Remove unreachable code
lm4549: Fix buffer overflow
cadence_uart: Fix buffer overflow
qemu-sockets: Fix potential memory leak
qemu-ga: Remove unreachable code after g_error
target-i386: Allow tsc-frequency to be larger then 2.147G
Anthony Liguori [Tue, 25 Sep 2012 21:06:16 +0000 (16:06 -0500)]
Merge remote-tracking branch 'bonzini/scsi-next' into staging
* bonzini/scsi-next:
SCSI: Standard INQUIRY data should report HiSup flag as set.
scsi-disk: use scsi_data_cdb_length
scsi: introduce scsi_cdb_length and scsi_data_cdb_length
scsi-disk: fix check for out-of-range LBA
scsi-disk: introduce check_lba_range
iSCSI: We dont need to explicitely call qemu_notify_event() any more
iSCSI: We need to support SG_IO also from iscsi_ioctl()
Anthony Liguori [Tue, 25 Sep 2012 21:06:15 +0000 (16:06 -0500)]
Merge remote-tracking branch 'bonzini/nbd-next' into staging
* bonzini/nbd-next:
nbd: add nbd_export_get_blockdev
nbd: negotiate with named exports
nbd: register named exports
qemu-nbd: rewrite termination conditions to use a state machine
nbd: add notification for closing an NBDExport
nbd: track clients into NBDExport
nbd: add reference counting to NBDExport
nbd: do not leak nbd_trip coroutines when a connection is torn down
nbd: make refcount interface public
nbd: do not close BlockDriverState in nbd_export_close
nbd: pass NBDClient to nbd_send_negotiate
nbd: add more constants
Jeff Cody [Thu, 20 Sep 2012 19:13:30 +0000 (15:13 -0400)]
block: vmdk image file reopen
This patch supports reopen for VMDK image files. VMDK extents are added
to the existing reopen queue, so that the transactional model of reopen
is maintained with multiple image files.
Jeff Cody [Thu, 20 Sep 2012 19:13:25 +0000 (15:13 -0400)]
block: raw-posix image file reopen
This is derived from the Supriya Kannery's reopen patches.
This contains the raw-posix driver changes for the bdrv_reopen_*
functions. All changes are staged into a temporary scratch buffer
during the prepare() stage, and copied over to the live structure
during commit(). Upon abort(), all changes are abandoned, and the
live structures are unmodified.
The _prepare() will create an extra fd - either by means of a dup,
if possible, or opening a new fd if not (for instance, access
control changes). Upon _commit(), the original fd is closed and
the new fd is used. Upon _abort(), the duplicate/new fd is closed.
Jeff Cody [Thu, 20 Sep 2012 19:13:23 +0000 (15:13 -0400)]
block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.c
Rather than check for a non-NULL aligned_buf to determine if
raw_aio_submit needs to check for alignment, check for the presence
of BDRV_O_NOCACHE in the bs->open_flags.
Jeff Cody [Thu, 20 Sep 2012 19:13:19 +0000 (15:13 -0400)]
block: Framework for reopening files safely
This is based on Supriya Kannery's bdrv_reopen() patch series.
This provides a transactional method to reopen multiple
images files safely.
Image files are queue for reopen via bdrv_reopen_queue(), and the
reopen occurs when bdrv_reopen_multiple() is called. Changes are
staged in bdrv_reopen_prepare() and in the equivalent driver level
functions. If any of the staged images fails a prepare, then all
of the images left untouched, and the staged changes for each image
abandoned.
Block drivers are passed a reopen state structure, that contains:
* BDS to reopen
* flags for the reopen
* opaque pointer for any driver-specific data that needs to be
persistent from _prepare to _commit/_abort
* reopen queue pointer, if the driver needs to queue additional
BDS for a reopen
Jeff Cody [Thu, 20 Sep 2012 19:13:18 +0000 (15:13 -0400)]
block: make bdrv_set_enable_write_cache() modify open_flags
bdrv_set_enable_write_cache() sets the bs->enable_write_cache flag,
but without the flag recorded in bs->open_flags, then next time
a reopen() is performed the enable_write_cache setting may be
inadvertently lost.
This will set the flag in open_flags, so it is preserved across
reopens.
Jeff Cody [Thu, 20 Sep 2012 19:13:17 +0000 (15:13 -0400)]
block: correctly set the keep_read_only flag
I believe the bs->keep_read_only flag is supposed to reflect
the initial open state of the device. If the device is initially
opened R/O, then commit operations, or reopen operations changing
to R/W, are prohibited.
Currently, the keep_read_only flag is only accurate for the active
layer, and its backing file. Subsequent images end up always having
the keep_read_only flag set.
For instance, what happens now:
[ base ] kro = 1, ro = 1
|
v
[ snap-1 ] kro = 1, ro = 1
|
v
[ snap-2 ] kro = 0, ro = 1
|
v
[ active ] kro = 0, ro = 0
What we want:
[ base ] kro = 0, ro = 1
|
v
[ snap-1 ] kro = 0, ro = 1
|
v
[ snap-2 ] kro = 0, ro = 1
|
v
[ active ] kro = 0, ro = 0
Kevin Shanahan [Thu, 20 Sep 2012 23:20:22 +0000 (08:50 +0930)]
blockdev: preserve readonly and snapshot states across media changes
If readonly=on is given at device creation time, the ->readonly flag
needs to be set in the block driver state for this device so that
readonly-ness is preserved across media changes (qmp change command).
Similarly, to preserve the snapshot property requires ->open_flags to
be correct.
Stefan Weil [Sat, 22 Sep 2012 20:26:19 +0000 (22:26 +0200)]
w32: Add implementation of gmtime_r, localtime_r
Those functions are missing in MinGW.
Some versions of MinGW-w64 include defines for gmtime_r and localtime_r.
Older versions of these macros are buggy (they return a pointer to a
static variable), therefore we don't want them. Newer versions are
similar to the code used here, but without the memset.
The implementation which is used here is not strictly reentrant,
but sufficiently good for QEMU on w32 or w64.
Stefan Weil [Wed, 22 Aug 2012 19:42:32 +0000 (21:42 +0200)]
w32: Always use standard instead of native format strings
GLib 2.0 include files use __printf__ for the format attribute
which resolves to native format strings on w32 hosts.
QEMU wants standard format strings instead of native format
strings, so we simply change any declaration with __printf__
to use __gnu_printf__.
This works because all basic printf functions support both
kinds of format strings.
This fixes a compiler warning:
qapi/string-output-visitor.c: In function ‘print_type_int’:
qapi/string-output-visitor.c:34:5: warning: unknown conversion type character ‘l’ in format [-Wformat]
qapi/string-output-visitor.c:34:5: warning: too many arguments for format [-Wformat-extra-args]
net/socket.c:136: warning:
pointer targets in passing argument 2 of ‘sendto’ differ in signedness
/usr/lib/gcc/amd64-mingw32msvc/4.4.4/../../../../amd64-mingw32msvc/include/winsock2.h:1313: note:
expected ‘const char *’ but argument is of type ‘const uint8_t *’
Add a 'qemu_sendto' macro which provides that type cast where needed
and use the new macro instead of 'sendto'.
curses: don't initialize curses when qemu is daemonized
Current qemu initializes curses even if -daemonize option is
passed. This cause problem because shell prompt appears without
calling endwin().
This patch adds new function, is_daemonized(), to OS dependent
code. With this function, curses_display_init() can check that qemu is
daemonized or not. If daemonized, curses_display_init() isn't called
and the problem is avoided.
Of course, -daemonize && -curses doesn't make sense. Users shouldn't
pass the arguments at the same time. But the problem is very painful
because Ctrl-C cannot be delivered to the terminal.
Stefan Weil [Sat, 1 Sep 2012 11:00:48 +0000 (13:00 +0200)]
pflash_cfi01: Fix warning caused by unreachable code
Report from smatch:
hw/pflash_cfi01.c:431 pflash_write(180) info: ignoring unreachable code.
Instead of removing the return statement after the switch statement,
the patch replaces the return statements in the switch statement by
break statements. Other switch statements in the same code do it also
like that.
There must be enough space to add two entries starting with index
s->buffer_level, therefore the old check was wrong.
[Peter Maydell <[email protected]> clarifies the nature of the
analyser warning:
I don't object to making the change to placate the analyser,
but I don't think this is actually a buffer overrun. We always
add and remove samples from the buffer two at a time, so it's
not possible to get here with s->buffer_level == BUFFER_SIZE-1
(which is the only case where the old and new conditions
give different answers).]