Jan Kiszka [Fri, 22 May 2009 21:51:45 +0000 (23:51 +0200)]
kvm: Mark full address range dirty on live migration start
As Avi correctly noted, last_ram_offset does not mark the last physical
RAM address the guest may see (due to non-continuous memory regions).
Ensure that we catch them all by marking the full possible address range
dirty.
Chris Lalancette [Mon, 25 May 2009 14:38:23 +0000 (16:38 +0200)]
Allow monitor interaction when using migrate -exec
All,
I've recently been playing around with migration via exec. Unfortunately,
when starting the incoming qemu process with "-incoming exec:cmd", it suffers
the same problem that -incoming tcp used to suffer; namely, that you can't
interact with the monitor until after the migration has happened. This causes
problems for libvirt usage of -incoming exec, since libvirt expects to be able
to access the monitor ahead of time. This fairly simple patch allows you to
access the monitor both before and after the migration has completed using exec.
(note: developed/tested with qemu-kvm, but applies perfectly fine to qemu)
Now that we have a separate aio pool structure we can remove those
aio pool details from BlockDriver.
Every driver supporting AIO now needs to declare a static AIOPool
with the aiocb size and the cancellation method. This cleans up the
current code considerably and will make it cleaner and more obvious
to support two different aio implementations behind a single
BlockDriver.
[this one is required for [PATCH] fully split aio_pool from BlockDriver,
sorry for not sending it out earlier]
Add a qcow_aio_setup helper to qcow to shared common code between
the aio_readv and aio_writev methods. Based on the function with
the same name in qcow2.
We do need hdev_create unconditionally on all platforms so that qemu-img
create support for host device works on all platforms.
Also relax the check to allow character devices in addition to block
devices. On many Unix platforms block devices have buffered block
nodes and unbuffered character device nodes, and on FreeBSD the block
nodes don't even exist anymore. Also on Linux we do support the
/dev/sgN scsi passthrough devices through the host device driver,
and probably the old-style /dev/raw/rawN raw devices although I haven't
tested that.
raw_pread_aligned currently returns the raw return value from
lseek/read, which is always -1 in case of an error. But the
callers higher up the stack expect it to return the negated
errno just like raw_pwrite_aligned.
Pointer vs addresses a VncDisplay structure,
so it is sufficient to allocate sizeof(VncDisplay)
or sizeof(*vs) bytes instead of the much larger
sizeof(VncState).
Maybe the misleading name should be fixed, too:
the code contains many places where vs is used,
sometimes it is a VncState *, sometimes it is a
VncDisplay *. vd would be a better name.
Kevin Wolf [Tue, 26 May 2009 12:36:03 +0000 (14:36 +0200)]
qcow2: Update multiple refcounts at once
Don't write each single changed refcount block entry to the disk after it is
written, but update all entries of the block and write all of them at once.
Kevin Wolf [Tue, 26 May 2009 12:36:02 +0000 (14:36 +0200)]
qcow2: Refactor update_refcount
This is a preparation patch with no functional changes. It moves the allocation
of new refcounts block to a new function and makes update_cluster_refcount (for
one cluster) call update_refcount (for multiple clusters) instead the other way
round.
Kevin Wolf [Sat, 23 May 2009 09:21:33 +0000 (11:21 +0200)]
e1000: Ignore reset command
When a reset is requested, the current e1000 emulation never clears the
reset bit which may cause a driver to hang. This patch masks the reset
bit out when setting the control registert, so the reset is immediately
completed.
Kevin Wolf [Tue, 19 May 2009 16:51:34 +0000 (18:51 +0200)]
Fix output of uninitialized strings
Commit ffad4116b96e29e0fbe892806f97c0a6c903d30d removed the "scratch buffer"
from check_params, but didn't care for the error messages which actually
included this string to tell the user which option was wrong. Now this string
is uninitialized, so this patch removes it from the message.
This means that the user is only told the whole parameter string and has to
pick the wrong option by himself as the callers of check_params can't know this
value any more. An alternative approach would be to revert that commit and do
whatever is needed to fix the original problem without changing check_params.
Paul Brook [Fri, 22 May 2009 23:05:19 +0000 (00:05 +0100)]
Add common BusState
Implement and use a common device bus state. The main side-effect is
that creating a bus and attaching it to a parent device are no longer
separate operations. For legacy code we allow a NULL parent, but that
should go away eventually.
Also tweak creation code to veriry theat a device in on the right bus.
Alexander Graf [Mon, 11 May 2009 15:41:42 +0000 (17:41 +0200)]
Add HTTP protocol using curl v6
Currently Qemu can read from posix I/O and NBD. This patch adds a
third protocol to the game: HTTP.
In certain situations it can be useful to access HTTP data directly,
for example if you want to try out an http provided OS image, but
don't know if you want to download it yet.
Using this patch you can now try it on on the fly. Just use it like:
Jason Wessel [Mon, 18 May 2009 15:00:28 +0000 (10:00 -0500)]
USB serial device support
Add in a workaround to allow the usb serial devices to work with the
usb pass through mechanism. The ioctl() to request an alternate
interface will always return < 0 for a usb-serial device based on the
kernel driver. This means there is no alternate interface end point.
This was fully tested with a pl2303 usb serial device.
Jason Wessel [Mon, 18 May 2009 15:00:27 +0000 (10:00 -0500)]
serial: fix lost character after sysrq
After creating an automated regression test to test the sysrq
responses while running a linux image in qemu, I found that the
simulated uart was eating the character right after the sysrq about
75% of the time.
The problem is that the qemu sets the LSR_DR (data ready) bit on a
serial break. The automated tests can send a break and the sysrq
character quickly enough that the qemu serial fifo has a real
character available. When there is valid character in the fifo, it
gets consumed by the serial driver in the guest OS.
The real hardware also appears to set the LSR_DR but always appears to
have a null byte in this condition. This patch changes the qemu
behavior to match the tested characteristics of a real 16550 chip.
Jason Wessel [Mon, 18 May 2009 15:00:26 +0000 (10:00 -0500)]
usb-serial: implement break event.
Implement the serial break via usb serial.
The second data byte in ftdi status packet contains the break status.
The values were already defined in usb-serial.c so it was a matter of
making use of the event_trigger to form a urb to send over to the host
controller with the serial break status set.
This was tested against a linux development image which enables sysrq
via a serial break on the ftdi usb console.
Jan Kiszka [Thu, 21 May 2009 20:43:39 +0000 (22:43 +0200)]
slirp: Reassign same address to same DHCP client
In case a client restarts a DHCP recovery without releasing its old
address, reassign the same address to prevent consuming free addresses
and moving away from the standard client address.
Jan Kiszka [Sat, 2 May 2009 00:18:38 +0000 (02:18 +0200)]
kvm: x86: Save/restore KVM-specific CPU states
Save and restore all so far neglected KVM-specific CPU states. Handling
the TSC stabilizes migration in KVM mode. The interrupt_bitmap and
mp_state are currently unused, but will become relevant for in-kernel
irqchip support. By including proper saving/restoring already, we avoid
having to increment CPU_SAVE_VERSION later on once again.
v2:
- initialize mp_state runnable (for the boot CPU)
Jan Kiszka [Fri, 1 May 2009 22:29:37 +0000 (00:29 +0200)]
kvm: Rework VCPU reset
Use standard callback with highest order to synchronize VCPU on reset
after all device callbacks were execute. This allows to remove the
special kvm hook in qemu_system_reset.
Jan Kiszka [Fri, 1 May 2009 22:29:37 +0000 (00:29 +0200)]
Introduce reset notifier order
Add the parameter 'order' to qemu_register_reset and sort callbacks on
registration. On system reset, callbacks with lower order will be
invoked before those with higher order. Update all existing users to the
standard order 0.
Note: At least for x86, the existing users seem to assume that handlers
are called in their registration order. Therefore, the patch preserves
this property. If someone feels bored, (s)he could try to identify this
dependency and express it properly on callback registration.
Jan Kiszka [Fri, 1 May 2009 22:29:37 +0000 (00:29 +0200)]
kvm: Fix framebuffer dirty log sync
kvm_physical_sync_dirty_bitmap() takes the end address as second
argument, not the region size. Moverover, the kvm API should not be used
directly here, but cpu_physical_sync_dirty_bitmap().
Jan Kiszka [Fri, 1 May 2009 22:22:51 +0000 (00:22 +0200)]
kvm: Add missing bits to support live migration
This patch adds the missing hooks to allow live migration in KVM mode.
It adds proper synchronization before/after saving/restoring the VCPU
states (note: PPC is untested), hooks into
cpu_physical_memory_set_dirty_tracking() to enable dirty memory logging
at KVM level, and synchronizes that drity log into QEMU's view before
running ram_live_save().
Jan Kiszka [Fri, 1 May 2009 18:52:47 +0000 (20:52 +0200)]
kvm: Rework dirty bitmap synchronization
Extend kvm_physical_sync_dirty_bitmap() so that is can sync across
multiple slots. Useful for updating the whole dirty log during
migration. Moreover, properly pass down errors the whole call chain.
Jan Kiszka [Fri, 1 May 2009 18:52:47 +0000 (20:52 +0200)]
kvm: Fix dirty log temporary buffer size
The buffer passed to KVM_GET_DIRTY_LOG requires one bit per page. Fix
the size calculation in kvm_physical_sync_dirty_bitmap accordingly,
avoiding allocation of extremly oversized buffers.
Jan Kiszka [Fri, 1 May 2009 18:52:46 +0000 (20:52 +0200)]
kvm: Introduce kvm_set_migration_log
Introduce a global dirty logging flag that enforces logging for all
slots. This can be used by the live migration code to enable/disable
global logging withouth destroying the per-slot setting.
Kevin Wolf [Mon, 18 May 2009 14:42:12 +0000 (16:42 +0200)]
Convert qemu-img convert to new bdrv_create
This is part two of the qemu-img conversion. This really works the same as the
previous conversion of qemu-img create: It introduces a new -o option for the
generic approach and adds the old-style options to this option set.
Kevin Wolf [Mon, 18 May 2009 14:42:11 +0000 (16:42 +0200)]
Convert qemu-img create to new bdrv_create
This patch changes qemu-img to actually use the new bdrv_create interface. It
translates the old-style qemu-img options which have been bdrv_create2
parameters or flags so far to option structures. As the generic approach, it
introduces an -o option which accepts any parameter the driver knows.
Kevin Wolf [Mon, 18 May 2009 14:42:10 +0000 (16:42 +0200)]
Convert all block drivers to new bdrv_create
Now we can make use of the newly introduced option structures. Instead of
having bdrv_create carry more and more parameters (which are format specific in
most cases), just pass a option structure as defined by the driver itself.
bdrv_create2() contains an emulation of the old interface to simplify the
transition.
Kevin Wolf [Mon, 18 May 2009 14:42:09 +0000 (16:42 +0200)]
Create qemu-option.h
This patch creates a new header file and the corresponding implementation file
for parsing of parameter strings for options (like used in -drive). Part of
this is code moved from vl.c (so qemu-img can use it later).
The idea is to have a data structure describing all accepted parameters. When
parsing a parameter string, the structure is copied and filled with the
parameter values.
Glauber Costa [Wed, 20 May 2009 22:26:58 +0000 (18:26 -0400)]
allow changing the speed of a running migration
This patch allow us to call migrate_set_speed on running
migrations. This should allow mgmt tools to increase the allocated
bandwidth of a running migration if there is no progress, and they
really want the migration to succeed.
Glauber Costa [Thu, 21 May 2009 21:38:01 +0000 (17:38 -0400)]
augment info migrate with page status
This patch augments info migrate output with status about:
* ram bytes remaining
* ram bytes transferred
* ram bytes total
This should be enough for management tools to realize
whether or not there is progress in migration. We can
add more information later on, if the need arrives
Anthony Liguori [Fri, 22 May 2009 01:41:01 +0000 (20:41 -0500)]
Introduce is_default field for QEMUMachine
f80f9ec changed the order that machines are registered which had the effect of
changing the default machine. This changeset introduces a new is_default field
so that machine types can declare that they are the default for an architecture.
Anthony Liguori [Thu, 21 May 2009 21:54:00 +0000 (16:54 -0500)]
Refactor how display drivers are selected
My previous commit, f92f8afebe, broke -vnc (spotted by Glauber Costa). This
is because it's necessary to tell when the no special display parameters have
been passed and default to SDL or VNC appropriately.
This refactors the display selection logic to be less complicated which has
the effect of fixing the regression mentioned above.
Anthony Liguori [Wed, 20 May 2009 18:01:02 +0000 (13:01 -0500)]
Eliminate --disable-gfx-check and make VNC default when SDL not available
--disable-gfx-check predates VNC server support. It made sense back then
because the only thing you could do without SDL was use -nographic mode or
similar tricks. Since this is a very advanced mode of operation, gfx-check
provided a good safety net for casual users.
A casual user is very likely to use VNC to interact with a guest. In fact, it's
often frustrating to install QEMU on a server and have to specify
disable-gfx-check when you only want to use VNC.
This patch eliminates disable-gfx-check and makes SDL behave like every other
optional dependency. If SDL is not available, instead of failing ungracefully
if no special options are specified, we default to -vnc localhost:0,to=99.
When we do default to VNC, we also print a message to tell the user that we've
done this include which port we're currently listening on.
When qemu is run under valgrind, valgrind shows the following output
on exit:
==3648== 1 errors in context 2 of 2:
==3648== Syscall param timer_create(evp) points to uninitialised byte(s)
==3648== at 0x54E936A: timer_create (in /lib/librt-2.9.so)
==3648== by 0x405DCF: dynticks_start_timer (vl.c:1549)
==3648== by 0x40A966: main (vl.c:1726)
==3648== Address 0x7fefffb34 is on thread 1's stack
==3648== Uninitialised value was created by a stack allocation
==3648== at 0x405D60: dynticks_start_timer (vl.c:1534)
This patch is a simple fix to remove this potential problem.
==3648== Process terminating with default action of signal 11 (SIGSEGV)
==3648== Access not within mapped region at address 0x8
==3648== at 0x40636B: host_alarm_handler (vl.c:1345)
==3648== by 0x52D807F: (within /lib/libpthread-2.9.so)
==3648== by 0x5C0A12E: tcsetattr (in /lib/libc-2.9.so)
==3648== by 0x4DD601: term_exit (qemu-char.c:700)
==3648== by 0x5B636EC: exit (in /lib/libc-2.9.so)
==3648== by 0x5B4B5AC: (below main) (in /lib/libc-2.9.so)
This simple fix check for a valid pointer as host_alarm_handler is
also called after alarm_timer is released in the exit path.
Mark McLoughlin [Tue, 19 May 2009 17:55:21 +0000 (18:55 +0100)]
kvm: work around supported cpuid ioctl() brokenness
KVM_GET_SUPPORTED_CPUID has been known to fail to return -E2BIG
when it runs out of entries. Detect this by always trying again
with a bigger table if the ioctl() fills the table.
Hollis Blanchard [Tue, 19 May 2009 20:08:25 +0000 (15:08 -0500)]
remove gcc 3.x requirement from documentation
This text is no longer accurate. After the patch is applied, the
generated version at http://www.nongnu.org/qemu/qemu-doc.html should be
regenerated.
This patch is also a candidate for the stable branch. (The URL above is
probably generated from the stable branch anyways, so maybe it goes
without saying.)
Paul Brook [Tue, 19 May 2009 15:17:58 +0000 (16:17 +0100)]
Hardware convenience library
The only target dependency for most hardware is sizeof(target_phys_addr_t).
Build these files into a convenience library, and use that instead of
building for every target.
Remove and poison various target specific macros to avoid bogus target
dependencies creeping back in.
Big/Little endian is not handled because devices should not know or care
about this to start with.