Jan Kiszka [Mon, 25 Jul 2011 16:11:20 +0000 (18:11 +0200)]
Allow to leave type on default in -machine
This allows to specify -machine options without setting an explicit
machine type. We will pick the default machine in this case. Requesting
the list of available machines is still possible via '-machine ?' e.g.
Stefan Berger [Tue, 26 Jul 2011 14:33:11 +0000 (10:33 -0400)]
Fix a compilation error in xen-mapcache.c
This patch fixes a compilation error in xen-mapcache.c .
/home/stefanb/qemu/qemu-git/xen-mapcache.c: In function ‘xen_ram_addr_from_mapcache’:
/home/stefanb/qemu/qemu-git/xen-mapcache.c:240:42: error: variable ‘pentry’ set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Multiboot images can specify a bss segment. The boot loader must clear
the memory of the bss and ensure that no modules or structures are
allocated inside it. Several fields are provided in the Multiboot
header that were previously not used properly. The header is now used
to determine how much data should be read from the image and how much
memory should be reserved to the bss segment.
When writing the last sector of an SD card using WRITE_MULTIPLE_BLOCK
QEmu throws an error saying that we've run off the end, and leaves
itself in the wrong state.
Paolo Bonzini [Thu, 28 Jul 2011 10:10:29 +0000 (12:10 +0200)]
softfloat: change default nan definitions to variables
Most definitions in softfloat.h are really target-independent, but the
file is not because it includes definitions of the default NaN values.
Change those to variables to allow including softfloat.h from files that
are not compiled per-target. By making them const, the compiler is
allowed to optimize them into softfloat functions that use them.
wayne [Wed, 27 Jul 2011 10:04:55 +0000 (18:04 +0800)]
showing a splash picture when start
Added options to let qemu transfer two configuration files to bios:
"bootsplash.bmp" and "etc/boot-menu-wait", which could be specified by command
-boot splash=P,splash-time=T
P is jpg/bmp file name or an absolute path, T have a max value of 0xffff, unit
is ms. With these two options, if user invoke qemu with menu=on option, then
a splash picture would be showed in a given time. For example:
qemu -boot menu=on,splash=/root/boot.bmp,splash-time=5000
would make boot.bmp shown as a brand with 5 seconds in the booting up process.
This feature need the new seabios's support, which could be got from git.
Avi Kivity [Tue, 26 Jul 2011 11:26:21 +0000 (14:26 +0300)]
sysbus: add MemoryRegion based memory management API
Allow registering sysbus device memory using a MemoryRegion. Once all users
are converted, sysbus_init_mmio() and sysbus_init_mmio_cb() will be removed.
Avi Kivity [Tue, 26 Jul 2011 11:26:13 +0000 (14:26 +0300)]
memory: transaction API
Allow changes to the memory hierarchy to be accumulated and
made visible all at once. This reduces computational effort,
especially when an accelerator (e.g. kvm) is involved.
Useful when a single register update causes multiple changes
to an address space.
The single pass algorithm removed a1, added b2, then removed a2 and a3,
which caused the wrong memory map to be built. The two pass algorithm
removes a1, a2, and a3, then adds b1.
Avi Kivity [Tue, 26 Jul 2011 11:26:11 +0000 (14:26 +0300)]
memory: add ioeventfd support
As with the rest of the memory API, the caller associates an eventfd
with an address, and the memory API takes care of registering or
unregistering when the address is made visible or invisible to the
guest.
Avi Kivity [Tue, 26 Jul 2011 11:26:07 +0000 (14:26 +0300)]
memory: late initialization of ram_addr
For non-RAM memory regions, we cannot tell whether this is an I/O region
or an MMIO region. Since the qemu backing registration is different for
the two, we have to defer initialization until we know which address
space we are in.
These shenanigans will be removed once the backing registration is unified
with the memory API.
Avi Kivity [Tue, 26 Jul 2011 11:26:05 +0000 (14:26 +0300)]
memory: abstract address space operations
Prepare for multiple address space support by abstracting away the details
of registering a memory range with qemu's flat representation into an
AddressSpace object.
Note operations which are memory specific are not abstracted, since they will
never be called on I/O address spaces anyway.
Avi Kivity [Tue, 26 Jul 2011 11:26:04 +0000 (14:26 +0300)]
Internal interfaces for memory API
get_system_memory() provides the root of the memory hierarchy.
This interface is intended to be private between memory.c and exec.c.
If this file is included elsewhere, it should be regarded as a bug (or
TODO item). However, it will be temporarily needed for the conversion
to hierarchical memory routing.
Avi Kivity [Tue, 26 Jul 2011 11:26:03 +0000 (14:26 +0300)]
memory: merge adjacent segments of a single memory region
Simple implementations of memory routers, for example the Cirrus VGA memory banks
or the 440FX PAM registers can generate adjacent memory regions which are contiguous.
Detect these and merge them; this saves kvm memory slots and shortens lookup times.
Avi Kivity [Tue, 26 Jul 2011 11:26:01 +0000 (14:26 +0300)]
Hierarchical memory region API
The memory API separates the attributes of a memory region (its size, how
reads or writes are handled, dirty logging, and coalescing) from where it
is mapped and whether it is enabled. This allows a device to configure
a memory region once, then hand it off to its parent bus to map it according
to the bus configuration.
Hierarchical registration also allows a device to compose a region out of
a number of sub-regions with different properties; for example some may be
RAM while others may be MMIO.
Jan Kiszka [Sun, 24 Jul 2011 17:38:36 +0000 (19:38 +0200)]
qdev: Reset hot-plugged devices
Device models rely on the core invoking their reset handlers after init.
We do this in the cold-plug case, but so far we miss this step after
hot-plug.
Create cscope symbols for assembly files in addition to .c/.h files.
Create cscope database with full path instead of relative path so cscope
can be used with CSCOPE_DB in any directory.
The unplug protocol is necessary to support PV drivers in the guest: the
drivers expect to be able to "unplug" emulated disks and nics before
initializing the Xen PV interfaces.
It is responsibility of the guest to make sure that the unplug is done
before the emulated devices or the PV interface start to be used.
We use pci_for_each_device to walk the PCI bus, identify the devices and
disks that we want to disable and dynamically unplug them.
Changes in v2:
- use PCI_CLASS constants;
- replace pci_unplug_device with qdev_unplug;
- do not import hw/ide/internal.h in xen_platform.c;
Changes in v3:
- introduce piix3-ide-xen, that support hot-unplug;
- move the unplug code to hw/ide/piix.c;
- just call qdev_unplug from xen_platform.c to unplug the IDE disks;
Anthony PERARD [Wed, 20 Jul 2011 08:17:43 +0000 (08:17 +0000)]
xen: Fix the memory registration to reflect of what is done by Xen.
A Xen guest memory is allocated by libxc. But this memory is not
allocated continuously, instead, it leaves the VGA IO memory space not
allocated, same for the MMIO space (at HVM_BELOW_4G_MMIO_START of size
HVM_BELOW_4G_MMIO_LENGTH).
So to reflect that, we do not register the physical memory for this two
holes. But we still keep only one RAMBlock for the all RAM as it is more
easier than have two separate blocks (1 above 4G). Also this prevent QEMU
from use the MMIO space for a ROM.
Alexander Graf [Sun, 17 Jul 2011 05:30:29 +0000 (07:30 +0200)]
xen: make xen_enabled even more clever
When using xen_enabled() we're currently only checking if xen is enabled
at all during the build. But what if you want to build multiple targets
out of which only one can potentially run xen code?
That means that for generic code we'll still have to fall back to the
variable and potentially slow the code down, but it's not as important as
that is mostly xen device emulation which is not touched for non-xen targets.
The target specific code however can with this patch see that it's unable to
ever execute xen code. We can thus always return 0 on xen_enabled(), giving
gcc enough hints to evict the mapcache code from the target memory management
code.
Anthony PERARD [Fri, 15 Jul 2011 04:32:53 +0000 (04:32 +0000)]
exec.c: Use ram_addr_t in cpu_physical_memory_rw(...).
As the variable pd and addr1 inside the function cpu_physical_memory_rw
are mean to handle a RAM address, they should be of the ram_addr_t type
instead of unsigned long.
Anthony PERARD [Fri, 15 Jul 2011 00:33:42 +0000 (00:33 +0000)]
xen: introduce xen_change_state_handler
Remove the call to xenstore_record_dm_state from xen_main_loop_prepare
that is HVM specific.
Add a new vm_change_state_handler shared between xen_pv and xen_hvm
machines to record the VM state to xenstore.
Blue Swirl [Sat, 23 Jul 2011 21:21:14 +0000 (21:21 +0000)]
simpletrace: suppress a warning from unused variable
Avoid this warning:
CC simpletrace.o
/src/qemu/simpletrace.c: In function 'writeout_thread':
/src/qemu/simpletrace.c:122:12: error: variable 'unused' set but not used [-Werror=unused-but-set-variable]
by adding GCC attribute unused to the variable.
Blue Swirl [Sat, 23 Jul 2011 20:04:29 +0000 (20:04 +0000)]
Wrap recv to avoid warnings
Avoid warnings like these by wrapping recv():
CC slirp/ip_icmp.o
/src/qemu/slirp/ip_icmp.c: In function 'icmp_receive':
/src/qemu/slirp/ip_icmp.c:418:5: error: passing argument 2 of 'recv' from incompatible pointer type [-Werror]
/usr/local/lib/gcc/i686-mingw32msvc/4.6.0/../../../../i686-mingw32msvc/include/winsock2.h:547:32: note: expected 'char *' but argument is of type 'struct icmp *'
Jan Kiszka [Fri, 17 Jun 2011 09:25:49 +0000 (11:25 +0200)]
Register Linux dyntick timer as per-thread signal
Derived from kvm-tool patch
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/74309
Ingo Molnar pointed out that sending the timer signal to the whole
process, just blocking it everywhere, is suboptimal with an increasing
number of threads. QEMU is also using this pattern so far.
Linux provides a (non-portable) way to restrict the signal to a single
thread: We can use SIGEV_THREAD_ID unless we are forced to emulate
signalfd via an additional thread. That case could theoretically be
optimized as well, but it doesn't look worth bothering.
Jan Kiszka [Mon, 20 Jun 2011 12:06:28 +0000 (14:06 +0200)]
mc146818rtc: Handle host clock resets
Make use of the new clock reset notifier to update the RTC whenever
rtc_clock is the host clock and that happens to jump backward. This
avoids that the RTC stalls for the period the host clock was set back.
Jan Kiszka [Mon, 20 Jun 2011 12:06:27 +0000 (14:06 +0200)]
qemu-timer: Introduce clock reset notifier
QEMU_CLOCK_HOST is based on the system time which may jump backward in
case the admin or NTP adjusts it. RTC emulations and other device models
can suffer in this case as timers will stall for the period the clock
was tuned back.
This adds a detection mechanism that checks on every host clock readout
if the new time is before the last result. If that is the case a
notifier list is informed. Device models interested in this event can
register a notifier with the clock.
Jan Kiszka [Mon, 20 Jun 2011 12:06:26 +0000 (14:06 +0200)]
notifier: Pass data argument to callback
This allows to pass additional information to the notifier callback
which is useful if sender and receiver do not share any other distinct
data structure.
Indentation is off, and "dev-prop-int" suggests these are properties
you can configure with -device, which isn't the case. The other
buses' print_dev() callbacks don't do that. For instance, PCI's
output looks like this:
class Ethernet controller, addr 00:03.0, pci id 1af4:1000 (sub 1af4:0001)
bar 0: i/o at 0xffffffffffffffff [0x1e]
bar 1: mem at 0xffffffffffffffff [0xffe]
bar 6: mem at 0xffffffffffffffff [0xfffe]
Change virtser_bus_dev_print() to that style. Result:
dev: virtconsole, id ""
dev-prop: is_console = 1
dev-prop: nr = 0
dev-prop: chardev = <null>
dev-prop: name = <null>
port 0, guest on, host off, throttle off
Introduce a 'client_add' monitor command accepting an open FD
Allow client connections for VNC and socket based character
devices to be passed in over the monitor using SCM_RIGHTS.
One intended usage scenario is to start QEMU with VNC on a
UNIX domain socket. An unprivileged user which cannot access
the UNIX domain socket, can then connect to QEMU's VNC server
by passing an open FD to libvirt, which passes it onto QEMU.
In this case 'protocol' can be 'vnc' or 'spice', or the name
of a character device (eg from -chardev id=XXXX)
The 'skipauth' parameter can be used to skip any configured
VNC authentication scheme, which is useful if the mgmt layer
talking to the monitor has already authenticated the client
in another way.
Store VNC auth scheme per-client as well as per-server
A future patch will introduce a situation where different
clients may have different authentication schemes set.
When a new client arrives, copy the 'auth' and 'subauth'
fields from VncDisplay into the client's VncState, and
use the latter in all authentication functions.
* ui/vnc.h: Add 'auth' and 'subauth' to VncState
* ui/vnc-auth-sasl.c, ui/vnc-auth-vencrypt.c,
ui/vnc.c: Make auth functions pull auth scheme
from VncState instead of VncDisplay
Wen Congyang [Fri, 17 Jun 2011 02:25:22 +0000 (10:25 +0800)]
do not reset no_shutdown after we shutdown the vm
Daniel P. Berrange sent a libvirt's patch to support
reboots with the QEMU driver. He implements it in
json model like this:
1. add -no-shutdown in the qemu's option:
qemu -no-shutdown xxxx
2. shutdown the vm by monitor command system_powerdown
3. wait for shutdown event
4. reset the vm by monitor command system_reset
no_shutdown will be reset to 0 if the vm is powered down.
We only can reboot the vm once.
If no_shutdown is not reset to 0, we can reboot the vm
many times.
Kevin Wolf [Wed, 1 Jun 2011 11:29:11 +0000 (13:29 +0200)]
qemu-char: Print strerror message on failure
The only way for chardev drivers to communicate an error was to return a NULL
pointer, which resulted in an error message that said _that_ something went
wrong, but not _why_.
This patch changes the interface to return 0/-errno and updates
qemu_chr_open_opts to use strerror to display a more helpful error message.
Paolo Bonzini [Thu, 9 Jun 2011 11:10:25 +0000 (13:10 +0200)]
qemu-timer: change unix timer to dynticks
A timer that wakes up every millisecond puts a lot of stress on the
iothread. The large amount of IPIs causes very high context switch
activity, making emulation slow and the UI unusable. This is by the
way the same reason why the Windows timers were switched to dynticks.
Paolo Bonzini [Thu, 9 Jun 2011 11:10:24 +0000 (13:10 +0200)]
iothread: replace fair_mutex with a condition variable
This conveys the intention better, and scales to more than >1
threads contending the mutex with the iothread (as long as all
of them have a "quiescent point" like the TCG thread has).
Also, on Mac OS X the fair_mutex somehow didn't work as intended
and deadlocked.
Paolo Bonzini [Fri, 15 Jul 2011 15:10:15 +0000 (17:10 +0200)]
report serial devices created with -device in the PIIX4 config space
Serial and parallel devices created with -device are not reported in
the PIIX4 configuration space, and are hence not picked up by the DSDT.
This upsets Windows, which hides them altogether from the guest.
To avoid this, check at the end of machine initialization whether the
corresponding I/O ports have been registered. The new function in
ioport.c does this; this also requires a tweak to isa_unassign_ioport.
I left the comment in piix4_pm_initfn since the registers I moved do
seem to match the 82371AB datasheet. There are some quirks though.
We are setting this bit:
"Device 8 EIO Enable (EIO_EN_DEV8)—R/W. 1=Enable PCI access to the
device 8 enabled I/O ranges to be claimed by PIIX4 and forwarded
to the ISA/EIO bus. 0=Disable. The LPT_MON_EN must be set to enable
the decode."
but not LPT_MON_EN (bit 18 at 50h):
LPT Port Enable (LPT_MON_EN)—R/W. 1=Enable accesses to parallel
port address range (LPT_DEC_SEL) to generate a device 8 (parallel
port) decode event. 0=Disable.
We're also setting the LPT_DEC_SEL field (that's the 0x60 written to
63h) to 11, which means reserved, rather than to 01 (378h-37Fh).
Likewise we're not setting SA_MON_EN, SB_MON_EN (respectively bit 14
and bit 16 at address 50h) for the serial ports. However, we're setting
COMA_DEC_SEL and COMB_DEC_SEL correctly, unlike the corresponding register
for the parallel port.
All these fields are left as they are, since they are probably only
meant to be used in the DSDT.
Jan Kiszka [Wed, 20 Jul 2011 10:20:22 +0000 (12:20 +0200)]
net: Consistently use qemu_macaddr_default_if_unset
Drop the open-coded MAC assignment from net_init_nic and replace it with
standard qemu_macaddr_default_if_unset which is also used by qdev. That
avoid creating colliding MACs when instantiating NICs via different
mechanisms.
This change requires to store the MAC as MACAddr in NICInfo, and the
remaining nd_table users need to be updated.
Jan Kiszka [Wed, 20 Jul 2011 10:20:21 +0000 (12:20 +0200)]
net: Dump client type 'info network'
Include the client type name into the output of 'info network'. The
result looks like this:
(qemu) info network
VLAN 0 devices:
rtl8139.0: type=nic,model=rtl8139,macaddr=52:54:00:12:34:57
Devices not on any VLAN:
virtio-net-pci.0: type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56
\ network1: type=tap,fd=5
Jan Kiszka [Wed, 20 Jul 2011 10:20:20 +0000 (12:20 +0200)]
net: Refactor net_client_types
Position entries of net_client_types according to the corresponding
values of NET_CLIENT_TYPE_*. The array size is now defined by
NET_CLIENT_TYPE_MAX. This will allow to obtain entries based on type
value in later patches.
At this chance rename NET_CLIENT_TYPE_SLIRP to NET_CLIENT_TYPE_USER for
the sake of consistency.
Jan Kiszka [Wed, 20 Jul 2011 10:20:18 +0000 (12:20 +0200)]
slirp: Forward ICMP echo requests via unprivileged sockets
Linux 3.0 gained support for unprivileged ICMP ping sockets. Use this
feature to forward guest pings to the outer world. The host admin has to
set the ping_group_range in order to grant access to those sockets. To
allow ping for the users group (GID 100):
Jan Kiszka [Wed, 20 Jul 2011 10:20:17 +0000 (12:20 +0200)]
slirp: Put forked exec into separate process group
Recent smb daemons tend to terminate themselves via a process group
SIGTERM. If the daemon is still in qemu's group by that time, qemu will
die as well. Avoid this by always pushing fork_exec processes into a
group of their own, not just (unused) type 2 execs.
Jan Kiszka [Wed, 20 Jul 2011 10:20:14 +0000 (12:20 +0200)]
slirp: Canonicalize restrict syntax
All other boolean arguments accept on|off - except for slirp's restrict.
Fix that while still accepting the formerly allowed yes|y|no|n, but
reject everything else. This avoids accidentally allowing external
connections because syntax errors were so far interpreted as
'restrict=no'.
Jan Kiszka [Sat, 23 Jul 2011 10:38:37 +0000 (12:38 +0200)]
Generalize -machine command line option
-machine somehow suggests that it selects the machine, but it doesn't.
Fix that before this command is set in stone.
Actually, -machine should supersede -M and allow to introduce arbitrary
per-machine options to the command line. That will change the internal
realization again, but we will be able to keep the user interface
stable.