Hans de Goede [Wed, 24 Oct 2012 16:14:01 +0000 (18:14 +0200)]
ehci: Improve latency of interrupt delivery and async schedule scanning
While doing various performance tests of reading from USB mass storage devices
I noticed the following::
1) When an async handled packet completes, we don't immediately report an
interrupt to the guest, instead we wait for the frame-timer to run and
report it from there
2) If 1) has been fixed and an async handled packet takes a while to complete,
then async_stepdown will become a high value, which means that there
will be a large latency before any new packets queued by the guest in
response to the interrupt get seen
1) was done deliberately as part of commit f0ad01f92:
http://www.kraxel.org/cgit/qemu/commit/?h=usb.57&id=f0ad01f92ca02eee7cadbfd225c5de753ebd5fce
Since setting the interrupt immediately on async packet completion was causing
issues with Linux guests, I believe this recently fixed Linux bug explains
why this is happening:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=361aabf395e4a23cf554cf4ec0c0c6963b8beb01
Note that we can *not* count on this fix being present in all Linux guests!
I was hoping that the recently added support for Interrupt Threshold Control
would fix the issues with Linux guests, but adding a simple ehci_commit_irq()
call to ehci_async_bh() still caused problems with Linux guests.
The problem is, that when doing ehci_commit_irq() from ehci_async_bh(),
the "old" frindex value is used to calculate usbsts_frindex, and when
the frame-timer then runs possibly very shortly after ehci_async_bh(),
it increases the frame-timer, and thus any interrupts raised from that
frame-timer run, will also get reported to the guest immediately, rather
then being delayed to the next frame-timer run.
Luckily the solution for this is simple, this means that we need to
increase frindex before calling ehci_commit_irq() from ehci_async_bh(),
which in the end boils down to simple calling ehci_frame_timer() instead
of ehci_async_bh() from the bh.
This may seem like it causes a lot of extra work to be done, but this
is not true. Any work done from the frame-timer processing the periodic
schedule is work which then does not need to be done the next time the
frame timer runs, also the frame-timer will re-arm itself at (possibly)
a later time then it was armed for saving a vmexit at that time.
As an additional advantage moving to simply calling the frame-timer also
fixes 2) as the packet completion will set async_stepdown to 0, and the
re-arming of the timer with an async_stepdown of 0 ensures that any
newly queued up packets get seen in a reasonable amount of time.
This improves the speed (MB/s) of a Linux guest reading from a USB mass
storage device by a factor of 1.5 - 1.7 with input pipelining disabled,
and by a factor of 1.8 with input pipelining enabled.
Hans de Goede [Wed, 24 Oct 2012 16:14:00 +0000 (18:14 +0200)]
ehci: Set int flag on a short input packet
According to 4.15.1.2 an interrupt must be raised when a short packet
is received. If we don't do this it may take a significant time for
the guest to notice a short trasnfer has completed, since only the last td
will have its IOC flag set, and a short transfer may complete in an earlier
packet.
Hans de Goede [Wed, 24 Oct 2012 16:13:59 +0000 (18:13 +0200)]
ehci: Get rid of packet tbytes field
This field is used in some places to track the tbytes field of the token, but
in other places the field is used directly, use it directly everywhere for
consistency.
Hans de Goede [Wed, 24 Oct 2012 16:13:58 +0000 (18:13 +0200)]
uhci: Move checks to continue queuing to uhci_fill_queue()
Rather then having a special check to start queuing after the first packet,
and then another check for the other packets in uhci_fill_queue(), simply
check the previous packet beforehand in uhci_fill_queue()
Luiz Capitulino [Wed, 24 Oct 2012 16:12:15 +0000 (14:12 -0200)]
win32: fix broken build due to missing QEMU_MADV_HUGEPAGE
Commit ad0b5321f1f797274603ebbe20108b0750baee94 forgot to add
QEMU_MADV_HUGEPAGE macros for when CONFIG_MADVISE is not defined.
This broke the build for Windows. Fix it.
Anthony Liguori [Wed, 24 Oct 2012 14:39:49 +0000 (09:39 -0500)]
Merge remote-tracking branch 'bonzini/nbd-next' into staging
* bonzini/nbd-next: (30 commits)
qmp: add NBD server commands
block: add close notifiers
block: prepare code for adding block notifiers
qemu-sockets: add socket_listen, socket_connect, socket_parse
tests: do not include tools-obj-y Signed-off-by: Paolo Bonzini <[email protected]>
qemu-sockets: return InetSocketAddress from inet_parse
qapi: add socket address types
build: add QAPI files to the tools
vnc: drop QERR_VNC_SERVER_FAILED
qemu-sockets: add error propagation to Unix socket functions
qemu-sockets: add error propagation to inet_parse
qemu-sockets: add error propagation to inet_dgram_opts
qemu-sockets: add error propagation to inet_connect_addr
qemu-sockets: include strerror or gai_strerror output in error messages
vnc: add error propagation to vnc_display_open
vnc: reorganize code for reverse mode
vnc: introduce a single label for error returns
vnc: avoid Yoda conditionals
qemu-ga: ask and print error information from qemu-sockets
nbd: ask and print error information from qemu-sockets
...
Paolo Bonzini [Fri, 19 Oct 2012 14:45:24 +0000 (16:45 +0200)]
migration: go to paused state after finishing incoming migration with -S
At the end of migration the machine has started already, and cannot be
destroyed without losing the guest's data. Hence, prelaunch is the
wrong state. Go to the paused state instead. QEMU would reach that
state anyway (after running the guest for the blink of an eye) if the
"stop" command had been received after the start of migration.
Paolo Bonzini [Tue, 23 Oct 2012 12:54:21 +0000 (14:54 +0200)]
qmp: handle stop/cont in INMIGRATE state
Right now, stop followed by an incoming migration will let the
virtual machine start. cont before an incoming migration instead
will fail.
This is bad because the actual behavior is not predictable; it is
racy with respect to the start of the incoming migration. That's
because incoming migration is blocking, and thus will delay the
processing of stop/cont until the end of the migration.
In addition, there's nothing that really prevents the user from
typing the block device's passwords before incoming migration is
done, so returning the DeviceEncrypted error is also helpful in
the QMP case.
Both things can be fixed by just toggling the autostart variable when
stop/cont are called in INMIGRATE state.
Note that libvirt is currently working around the race by looping
if the MigrationExpected answer is returned. After this patch, the
command will return right away without ever raising an error.
Aurelien Jarno [Fri, 19 Oct 2012 21:19:19 +0000 (23:19 +0200)]
hmp: fix info cpus for sparc targets
On sparc targets, info cpus returns this kind of output:
| info cpus
| * CPU #0: pc=0x0000000000424d18pc=0x0000000000424d18npc=0x0000000000424d1c thread_id=19460
pc is printed twice, there is no space between pc, pc and npc.
With this patch, pc is not printed anymore when has_npc is set. In addition
the space is printed before pc/nip/npc/PC instead of after the colon so that
multiple prints are possible. This result on the following kind of input on
sparc targets:
| info cpus
| * CPU #0: pc=0x0000000000424d18 npc=0x0000000000424d1c thread_id=19460
Peter Maydell [Fri, 19 Oct 2012 04:23:05 +0000 (04:23 +0000)]
target-arm: Remove out of date FIXME regarding saturating arithmetic
Remove an out of date FIXME regarding the saturating arithmetic helpers:
we now do pass a pointer to CPUARMState to these helpers, and since
the AREG0 changes went in there is no difference between helper.c
and op_helper.c and therefore no point in moving the functions.
Rework the handling of arguments to ARM semihosting calls so that we
handle a possible failure return from get_user_ual() or put_user_ual().
(This incidentally silences a lot of warnings from clang about
"expression result unused").
Corey Bryant [Thu, 18 Oct 2012 20:41:04 +0000 (16:41 -0400)]
osdep: Less restrictive F_SEFL in qemu_dup_flags()
qemu_dup_flags() currently limits the flags that can be set on the
fcntl() F_SETFL call to those that we currently know can be set with
fcntl() F_SETFL. The problem with this is that it will prevent use
of new flags in the future without a code update.
This patch relaxes the checking and lets fcntl() determine the flags
it can set.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:29 +0000 (16:49 +0200)]
qmp: add pull_event function
This function is unlike get_events in that it makes it easy to process
one event at a time. This is useful in the mirroring test cases, where
we want to process just one event (BLOCK_JOB_ERROR) and leave the others
to a helper function.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:28 +0000 (16:49 +0200)]
mirror: add support for on-source-error/on-target-error
Error management is important for mirroring; otherwise, an error on the
target (even something as "innocent" as ENOSPC) requires to start again
with a full copy. Similar to on_read_error/on_write_error, two separate
knobs are provided for on_source_error (reads) and on_target_error (writes).
The default is 'report' for both.
The 'ignore' policy will leave the sector dirty, so that it will be
retried later. Thus, it will not cause corruption.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:25 +0000 (16:49 +0200)]
mirror: implement completion
Switching to the target of the migration is done mostly asynchronously,
and reported to management via the BLOCK_JOB_COMPLETED event; the only
synchronous phase is opening the backing files. bdrv_open_backing_file
can always be done, even for migration of the full image (aka sync:
'full'). In this case, qmp_drive_mirror will create the target disk
with no backing file at all, and bdrv_open_backing_file will be a no-op.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:23 +0000 (16:49 +0200)]
mirror: introduce mirror job
This patch adds the implementation of a new job that mirrors a disk to
a new image while letting the guest continue using the old image.
The target is treated as a "black box" and data is copied from the
source to the target in the background. This can be used for several
purposes, including storage migration, continuous replication, and
observation of the guest I/O in an external program. It is also a
first step in replacing the inefficient block migration code that is
part of QEMU.
The job is possibly never-ending, but it is logically structured into
two phases: 1) copy all data as fast as possible until the target
first gets in sync with the source; 2) keep target in sync and
ensure that reopening to the target gets a correct (full) copy
of the source data.
The second phase is indicated by the progress in "info block-jobs"
reporting the current offset to be equal to the length of the file.
When the job is cancelled in the second phase, QEMU will run the
job until the source is clean and quiescent, then it will report
successful completion of the job.
In other words, the BLOCK_JOB_CANCELLED event means that the target
may _not_ be consistent with a past state of the source; the
BLOCK_JOB_COMPLETED event means that the target is consistent with
a past state of the source. (Note that it could already happen
that management lost the race against QEMU and got a completion
event instead of cancellation).
It is not yet possible to complete the job and switch over to the target
disk. The next patches will fix this and add many refinements to the
basic idea introduced here. These include improved error management,
some tunable knobs and performance optimizations.
Paolo Bonzini [Mon, 23 Jul 2012 13:15:47 +0000 (15:15 +0200)]
block: introduce BLOCK_JOB_READY event
Even for jobs that need to be manually completed, management may want
to take care itself of the completion, not requiring the user to issue
a command to terminate the job. In this case we want to avoid that
they poll us continuously, waiting for completion to become available.
Thus, add a new event that signals the phase switch and the availability
of the block-job-complete command.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:21 +0000 (16:49 +0200)]
block: add block-job-complete
While streaming can be dropped as soon as it progressed through the whole
image, mirroring needs to be completed manually for two reasons: 1) so that
management knows exactly when the VM switches to the target; 2) because
for other use cases such as replication, we may leave the operation running
for the whole life of the virtual machine.
Add a new block job command that manually completes background operations.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:18 +0000 (16:49 +0200)]
block: introduce new dirty bitmap functionality
Assert that write_compressed is never used with the dirty bitmap.
Setting the bits early is wrong, because a coroutine might concurrently
examine them and copy incomplete data from the source.
Paolo Bonzini [Thu, 18 Oct 2012 14:49:17 +0000 (16:49 +0200)]
block: add bdrv_open_backing_file
Mirroring runs without the backing file so that it can be copied outside
QEMU. However, we need to add it at the time the job is completed and
QEMU switches to the target. Factor out the common bits of opening an
image and completing a mirroring operation.
The new function does not assume that the file is closed immediately after
it returns failure, so it keeps the BDRV_O_NO_BACKING flag up-to-date.
Corey Bryant [Thu, 18 Oct 2012 19:19:34 +0000 (15:19 -0400)]
qemu-config: Add new -add-fd command line option
This option can be used for passing file descriptors on the
command line. It mirrors the existing add-fd QMP command which
allows an fd to be passed to QEMU via SCM_RIGHTS and added to an
fd set.
This can be combined with commands such as -drive to link file
descriptors in an fd set to a drive:
This example adds dups of fds 3 and 4, and the accompanying opaque
strings to the fd set with ID=2. qemu_open() already knows how
to handle a filename of this format. qemu_open() searches the
corresponding fd set for an fd and when it finds a match, QEMU
goes on to use a dup of that fd just like it would have used an
fd that it opened itself.
Corey Bryant [Thu, 18 Oct 2012 19:19:33 +0000 (15:19 -0400)]
monitor: Prevent removing fd from set during init
If an fd is added to an fd set via the command line, and it is not
referenced by another command line option (ie. -drive), then clean
it up after QEMU initialization is complete.
Corey Bryant [Thu, 18 Oct 2012 19:19:32 +0000 (15:19 -0400)]
monitor: Enable adding an inherited fd to an fd set
qmp_add_fd() gets an fd that was received over a socket with
SCM_RIGHTS and adds it to an fd set. This patch adds support
that will enable adding an fd that was inherited on the
command line to an fd set.
Note: All of the code added to monitor_fdset_add_fd(), with the
exception of the error path for non-valid fdset-id, is code motion
from qmp_add_fd().
Corey Bryant [Thu, 18 Oct 2012 19:19:31 +0000 (15:19 -0400)]
monitor: Allow add-fd to any specified fd set
The first call to add an fd to an fd set was previously not
allowed to choose the fd set ID. The ID was generated as
the first available and ensuing calls could add more fds by
specifying the fd set ID. This change allows users to
choose the fd set ID on the first call.
Stefan Hajnoczi [Wed, 17 Oct 2012 12:02:31 +0000 (14:02 +0200)]
qemu-img: Add --backing-chain option to info command
The qemu-img info --backing-chain option enumerates the backing file
chain. For example, for base.qcow2 <- snap1.qcow2 <- snap2.qcow2 the
output becomes:
$ qemu-img info --backing-chain snap2.qcow2
image: snap2.qcow2
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 196K
cluster_size: 65536
backing file: snap1.qcow2
backing file format: qcow2
Jeff Cody [Tue, 16 Oct 2012 19:49:10 +0000 (15:49 -0400)]
block: in commit, determine base image from the top image
This simplifies some code and error checking, and also fixes a bug.
bdrv_find_backing_image() should only be passed absolute filenames,
or filenames relative to the chain. In the QMP message handler for
block commit, when looking up the base do so from the determined top
image, so we know it is reachable from top.
Some of the error messages put out by block-commit have changed
slightly, which causes 2 tests cases for block-commit to fail.
This patch updates the test cases to look for the correct error
output.
Jeff Cody [Tue, 16 Oct 2012 19:49:09 +0000 (15:49 -0400)]
block: make bdrv_find_backing_image compare canonical filenames
Currently, bdrv_find_backing_image compares bs->backing_file with
what is passed in as a backing_file name. Mismatches may occur,
however, when bs->backing_file and backing_file are not both
absolute or relative.
Use path_combine() to make sure any relative backing filenames are
relative to the current image filename being searched, and then use
realpath() to make all comparisons based on absolute filenames.
If either backing_file or bs->backing_file is determine to be a
protocol, then no filename normalization is performed.
This also changes bdrv_find_backing_image to no longer be recursive,
but iterative.
Alex Bligh [Tue, 16 Oct 2012 12:46:18 +0000 (13:46 +0100)]
qemu-img rebase: use empty string to rebase without backing file
This patch allows an empty filename to be passed as the new base image name
for qemu-img rebase to mean base the image on no backing file (i.e.
independent of any backing file). According to Eric Blake, qemu-img rebase
already supports this when '-u' is used; this adds support when -u is not
used.
Jeff Cody [Mon, 15 Oct 2012 20:58:02 +0000 (16:58 -0400)]
qmp: fix __accept() in qmp.py
In QEMUMonitorProtocol, commit e9d17b6 removed the __sockfile creation
from __negotiate_capabilities(), which breaks _accept(). This causes
failures in qemu-io python based tests (i.e. tests 030 and 040).
This patch creates the sockfile in __accept() as well.
Paolo Bonzini [Wed, 22 Aug 2012 14:43:07 +0000 (16:43 +0200)]
qmp: add NBD server commands
Adding an NBD server inside QEMU is trivial, since all the logic is
in nbd.c and can be shared easily between qemu-nbd and QEMU itself.
The main difference is that qemu-nbd serves a single unnamed export,
while QEMU serves named exports.
Paolo Bonzini [Thu, 23 Aug 2012 09:20:36 +0000 (11:20 +0200)]
block: add close notifiers
The first user of close notifiers will be the embedded NBD server.
It would be possible to use them to do some of the ad hoc processing
(e.g. for block jobs and I/O limits) that is currently done by
bdrv_close.
Paolo Bonzini [Fri, 19 Oct 2012 09:36:48 +0000 (11:36 +0200)]
block: prepare code for adding block notifiers
There is no reason in principle to skip job cancellation and draining
of pending I/O when there is no medium in the disk. Do these unconditionally,
which also prepares the code for the next patch.
These are QAPI-friendly versions of the qemu-sockets functions. They
support IP sockets, Unix sockets, and named file descriptors, using a
QAPI union to dispatch to the correct function.
Avi Kivity [Tue, 23 Oct 2012 10:30:10 +0000 (12:30 +0200)]
Rename target_phys_addr_t to hwaddr
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific). Replace it with a finger-friendly,
standards conformant hwaddr.
Outstanding patchsets can be fixed up with the command
Paolo Bonzini [Wed, 19 Sep 2012 11:54:39 +0000 (13:54 +0200)]
qemu-sockets: add error propagation to Unix socket functions
Before:
$ qemu-system-x86_64 -monitor unix:/vvv,server=off
connect(unix:/vvv): No such file or directory
chardev: opening backend "socket" failed
After:
$ x86_64-softmmu/qemu-system-x86_64 -monitor unix:/vvv,server=off
qemu-system-x86_64: -monitor unix:/vvv,server=off: Failed to connect to socket: No such file or directory
chardev: opening backend "socket" failed
Paolo Bonzini [Tue, 2 Oct 2012 07:19:01 +0000 (09:19 +0200)]
qemu-sockets: add error propagation to inet_connect_addr
perror and fprintf can be removed because all clients can now consume
Errors properly. However, we'll need to change the non-blocking connect
handlers to take an Error, in order to improve error handling for
migration with the TCP protocol.
This is a minor degradation in error reporting for outgoing migration.
However, until 1.2 this case just failed without even attempting to
connect, so it is still an improvement as far as overall QoI is
concerned.
Paolo Bonzini [Tue, 2 Oct 2012 08:17:21 +0000 (10:17 +0200)]
vnc: add error propagation to vnc_display_open
Before:
$ qemu-system-x86_64 -vnc foo.bar:12345
getaddrinfo(foo.bar,18245): Name or service not known
Failed to start VNC server on `foo.bar:12345'
$ qemu-system-x86_64 -vnc localhost:12345,reverse=on
inet_connect_opts: connect(ipv4,yakj.usersys.redhat.com,127.0.0.1,12345): Connection refused
Failed to start VNC server on `localhost:12345,reverse=on'
After:
$ x86_64-softmmu/qemu-system-x86_64 -vnc foo.bar:12345
Failed to start VNC server on `foo.bar:12345': address resolution failed for foo.bar:18245: Name or service not known
$ x86_64-softmmu/qemu-system-x86_64 -vnc localhost:12345,reverse=on
Failed to start VNC server on `localhost:12345,reverse=on': Failed to connect to socket: Connection refused
Paolo Bonzini [Tue, 2 Oct 2012 08:07:21 +0000 (10:07 +0200)]
nbd: ask and print error information from qemu-sockets
Before:
$ qemu-system-x86_64 nbd:localhost:12345
inet_connect_opts: connect(ipv4,yakj.usersys.redhat.com,127.0.0.1,12345): Connection refused
qemu-system-x86_64: could not open disk image nbd:localhost:12345: Connection refused
After:
$ x86_64-softmmu/qemu-system-x86_64 nbd:localhost:12345
qemu-system-x86_64: Failed to connect to socket: Connection refused
qemu-system-x86_64: could not open disk image nbd:localhost:12345: Connection refused
$ x86_64-softmmu/qemu-system-x86_64 -monitor tcp:localhost:6000
x86_64-softmmu/qemu-system-x86_64: -monitor tcp:localhost:6000: Failed to connect to socket: Connection refused
chardev: opening backend "socket" failed
$ x86_64-softmmu/qemu-system-x86_64 -monitor tcp:foo.bar:12345
qemu-system-x86_64: -monitor tcp:foo.bar:12345: address resolution failed for foo.bar:12345: Name or service not known
chardev: opening backend "socket" failed
Paolo Bonzini [Tue, 2 Oct 2012 08:02:46 +0000 (10:02 +0200)]
migration (outgoing): add error propagation for all protocols
Error propagation is already there for socket backends. Add it to other
protocols, simplifying code that tests for errors that will never happen.
With all protocols understanding Error, the code can be simplified
further by removing the return value.
Unfortunately, the quality of error messages varies depending
on where the error is detected, because no Error is passed to the
NonBlockingConnectHandler. Thus, the exact error message still cannot
be sent to the user if the OS reports it asynchronously via SO_ERROR.
If NonBlockingConnectHandler received an Error**, we could for
example report the error class and/or message via a new field of the
query-migration command even if it is reported asynchronously.
Before:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
(qemu)
After:
(qemu) migrate fd:ffff
migrate: File descriptor named 'ffff' has not been found
(qemu) info migrate
capabilities: xbzrle: off
Migration status: failed
total time: 0 milliseconds
Paolo Bonzini [Tue, 2 Oct 2012 07:59:38 +0000 (09:59 +0200)]
migration: centralize call to migrate_fd_error()
The call to migrate_fd_error() was missing for non-socket backends, so
centralize it in qmp_migrate().
Before:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
(qemu)
After:
(qemu) migrate fd:ffff
migrate: An undefined error has occurred
(qemu) info migrate
capabilities: xbzrle: off
Migration status: failed
total time: 0 milliseconds
(The awful error message will be fixed later in the series).
Paolo Bonzini [Wed, 3 Oct 2012 12:34:33 +0000 (14:34 +0200)]
migration: avoid using error_is_set and thus relying on errp != NULL
The migration code is using errp to detect "internal" errors, this means
that it relies on errp being non-NULL.
No impact so far because our only QMP clients (the QMP marshaller and HMP)
never pass a NULL Error **. But if we had others, this patch would make
sure that migration can work with a NULL Error **.
Anthony Liguori [Mon, 22 Oct 2012 19:49:18 +0000 (14:49 -0500)]
Merge remote-tracking branch 'qemu-kvm/memory/urgent' into staging
* qemu-kvm/memory/urgent:
memory: abort if a memory region is destroyed during a transaction
i440fx: avoid destroying memory regions within a transaction
memory: Make eventfd adhere to device endianness
Gerd Hoffmann [Wed, 17 Oct 2012 07:54:19 +0000 (09:54 +0200)]
serial: split serial.c
Split serial.c into serial.c, serial.h and serial-isa.c. While being at
creating a serial.h header file move the serial prototypes from pc.h to
the new serial.h. The latter leads to s/pc.h/serial.h/ in tons of
boards which just want the serial bits from pc.h
Luiz Capitulino [Fri, 5 Oct 2012 19:47:57 +0000 (16:47 -0300)]
Call MADV_HUGEPAGE for guest RAM allocations
This makes it possible for QEMU to use transparent huge pages (THP)
when transparent_hugepage/enabled=madvise. Otherwise THP is only
used when it's enabled system wide.
Anthony Liguori [Mon, 22 Oct 2012 18:26:23 +0000 (13:26 -0500)]
Merge remote-tracking branch 'quintela/migration-next-20121017' into staging
* quintela/migration-next-20121017: (41 commits)
cpus: create qemu_in_vcpu_thread()
savevm: make qemu_file_put_notify() return errors
savevm: un-export qemu_file_set_error()
block-migration: handle errors with the return codes correctly
block-migration: Switch meaning of return value
block-migration: make flush_blks() return errors
buffered_file: buffered_put_buffer() don't need to set last_error
savevm: Only qemu_fflush() can generate errors
savevm: make qemu_fill_buffer() be consistent
savevm: unexport qemu_ftell()
savevm: unfold qemu_fclose_internal()
savevm: make qemu_fflush() return an error code
savevm: Remove qemu_fseek()
virtio-net: use qemu_get_buffer() in a temp buffer
savevm: unexport qemu_fflush
migration: make migrate_fd_wait_for_unfreeze() return errors
buffered_file: make buffered_flush return the error code
buffered_file: callers of buffered_flush() already check for errors
buffered_file: We can access directly to bandwidth_limit
buffered_file: unfold migrate_fd_close
...
Anthony Liguori [Mon, 22 Oct 2012 18:26:07 +0000 (13:26 -0500)]
Merge remote-tracking branch 'qemu-kvm/memory/dma' into staging
* qemu-kvm/memory/dma: (23 commits)
pci: honor PCI_COMMAND_MASTER
pci: give each device its own address space
memory: add address_space_destroy()
dma: make dma access its own address space
memory: per-AddressSpace dispatch
s390: avoid reaching into memory core internals
memory: use AddressSpace for MemoryListener filtering
memory: move tcg flush into a tcg memory listener
memory: move address_space_memory and address_space_io out of memory core
memory: manage coalesced mmio via a MemoryListener
xen: drop no-op MemoryListener callbacks
kvm: drop no-op MemoryListener callbacks
xen_pt: drop no-op MemoryListener callbacks
vfio: drop no-op MemoryListener callbacks
memory: drop no-op MemoryListener callbacks
memory: provide defaults for MemoryListener operations
memory: maintain a list of address spaces
memory: export AddressSpace
memory: prepare AddressSpace for exporting
xen_pt: use separate MemoryListeners for memory and I/O
...
Avi Kivity [Wed, 3 Oct 2012 15:17:27 +0000 (17:17 +0200)]
pci: give each device its own address space
Accesses from different devices can resolve differently
(depending on bridge settings, iommus, and PCI_COMMAND_MASTER), so
set up an address space for each device.
Currently iommus are expressed outside the memory API, so this doesn't
work if an iommu is present.
Avi Kivity [Wed, 3 Oct 2012 14:22:53 +0000 (16:22 +0200)]
memory: per-AddressSpace dispatch
Currently we use a global radix tree to dispatch memory access. This only
works with a single address space; to support multiple address spaces we
make the radix tree a member of AddressSpace (via an intermediate structure
AddressSpaceDispatch to avoid exposing too many internals).
A side effect is that address_space_io also gains a dispatch table. When
we remove all the pre-memory-API I/O registrations, we can use that for
dispatching I/O and get rid of the original I/O dispatch.
Avi Kivity [Tue, 2 Oct 2012 16:54:45 +0000 (18:54 +0200)]
memory: move tcg flush into a tcg memory listener
We plan to make the core listener listen to all address spaces; this
will cause many more flushes than necessary. Prepare for that by
moving the flush into a tcg-specific listener.
Later we can avoid registering the listener if tcg is disabled.
Avi Kivity [Tue, 2 Oct 2012 16:21:54 +0000 (18:21 +0200)]
memory: manage coalesced mmio via a MemoryListener
Instead of calling a global function on coalesced mmio changes, which
routes the call to kvm if enabled, add coalesced mmio hooks to
MemoryListener and make kvm use that instead.
The motivation is support for multiple address spaces (which means we
we need to filter the call on the right address space) but the result
is cleaner as well.
Michael Tokarev [Sun, 21 Oct 2012 18:52:54 +0000 (22:52 +0400)]
fix CONFIG_QEMU_HELPERDIR generation again
commit 38f419f35225 fixed a breakage with CONFIG_QEMU_HELPERDIR
which has been introduced by 8bf188aa18ef7a8. But while techinically
that fix has been correct, all other similar variables are handled
differently. Make it consistent, and let scripts/create_config
expand and capitalize the variable properly like for all other
qemu_*dir variables.