Jamie Lentin [Fri, 26 Nov 2010 13:04:08 +0000 (15:04 +0200)]
linux-user: Translate getsockopt level option
n setsockopt, the socket level options are translated to the hosts'
architecture before the real syscall is called, e.g.
TARGET_SO_TYPE -> SO_TYPE. This patch does the same with getsockopt.
Peter Maydell [Mon, 8 Nov 2010 18:13:58 +0000 (18:13 +0000)]
linux-user: remove unnecessary local from __get_user(), __put_user()
Remove an unnecessary local variable from the __get_user() and
__put_user() macros. This avoids confusing compilation failures
if the name of the local variable ('size') happens to be the
same as the variable the macro user is trying to read/write.
Nathan Froyd [Fri, 29 Oct 2010 14:48:57 +0000 (07:48 -0700)]
linux-user: fix memory leaks with NPTL emulation
Running programs that create large numbers of threads, such as this
snippet from libstdc++'s pthread7-rope.cc:
const int max_thread_count = 4;
const int max_loop_count = 10000;
...
for (int j = 0; j < max_loop_count; j++)
{
...
for (int i = 0; i < max_thread_count; i++)
pthread_create (&tid[i], NULL, thread_main, 0);
for (int i = 0; i < max_thread_count; i++)
pthread_join (tid[i], NULL);
}
in user-mode emulation will quickly run out of memory. This is caused
by a failure to free memory in do_syscall prior to thread exit:
/* TODO: Free CPU state. */
pthread_exit(NULL);
The first step in fixing this is to make all TaskStates used by QEMU
dynamically allocated. The TaskState used by the initial thread was
not, as it was allocated on main's stack. So fix that, free the
cpu_env, free the TaskState, and we're home free, right?
If we blindly free the TaskState, then, we yank the current (host)
thread's stack out from underneath it while it still has things to do,
like calling pthread_exit. That causes problems, as you might expect.
The solution adopted here is to let the C library allocate the thread's
stack (so the C library can properly clean it up at pthread_exit) and
provide a hint that we want NEW_STACK_SIZE bytes of stack.
With those two changes, we're done, right? Well, almost. You see,
we're creating all these host threads and their parent threads never
bother to check that their children are finished. There's no good place
for the parent threads to do so. Therefore, we need to create the
threads in a detached state so the parent thread doesn't have to call
pthread_join on the child to release the child's resources; the child
does so automatically.
With those three major changes, we can comfortably run programs like the
above without exhausting memory. We do need to delete 'stack' from the
TaskState structure.
linux-user: mmap_reserve() not controlled by RESERVED_VA
mmap_reserve() should be called only when RESERVED_VA is enabled.
Otherwise, unmaped virtual address space will never be reusable. This
bug will exhaust virtual address space in extreme conditions.
Anthony Liguori [Thu, 9 Sep 2010 19:51:31 +0000 (14:51 -0500)]
Use a Linux-style MAINTAINERS file
I make no claims that this is accurate or exhaustive but I think it's a
reasonable place to start.
As the file mentions, the purpose of this file is to give contributors
information about who they can go to with questions about a particular piece of
code or who they can ask for review.
If you sign up for a piece of code and indicate that it's Maintained or
Supported, please be prepared to be responsive to questions about that
subsystem.
Kevin Wolf [Fri, 26 Nov 2010 15:36:16 +0000 (16:36 +0100)]
ide: Reset current_addr after stopping DMA
Whenever SSBM is reset in the command register all state information is lost.
Restarting DMA means that current_addr must be reset to the base address of the
PRD table. The OS is not required to change the base address register before
starting a DMA operation, it can reuse the value it wrote for an earlier
request.
Paul Brook [Sat, 27 Nov 2010 11:23:34 +0000 (11:23 +0000)]
Split out common pcnet code
The core pcnet emulation code is used by both the PCI "pcnet" device
and the SPARC "lance" device. Split the common code frm the PCI code so
that that can be configures independantly.
Hannes Reinecke [Wed, 24 Nov 2010 11:15:59 +0000 (12:15 +0100)]
scsi: Move sense handling into the driver
The current sense handling in scsi-bus is only used by the
scsi-disk driver; the scsi-generic driver is using its own.
So we should move the current sense handling into the
scsi-disk driver.
Hannes Reinecke [Wed, 24 Nov 2010 11:15:57 +0000 (12:15 +0100)]
scsi: Return SAM status codes
Traditionally, the linux stack is using SCSI status codes
which are shifted by one as compared to those defined in SAM.
A SCSI emulation should naturally return the SAM defined codes,
not the linux ones.
So to avoid any confusion this patch modifies the existing
definitions to match those found in SAM and removes any
(now obsolete) byte-shift from the returned status codes.
Hannes Reinecke [Wed, 24 Nov 2010 11:15:56 +0000 (12:15 +0100)]
scsi: Increase the number of possible devices
The SCSI parallel interface has a limit of 8 devices, but
not the SCSI stack in general. So we should be removing the
hard-coded limit and use MAX_SCSI_DEVS instead.
And we only need to scan those devices which are allocated
by the bus.
Ryan Harper [Fri, 12 Nov 2010 17:07:13 +0000 (11:07 -0600)]
Implement drive_del to decouple block removal from device removal
Currently device hotplug removal code is tied to device removal via
ACPI. All pci devices that are removable via device_del() require the
guest to respond to the request. In some cases the guest may not
respond leaving the device still accessible to the guest. The management
layer doesn't currently have a reliable way to revoke access to host
resource in the presence of an uncooperative guest.
This patch implements a new monitor command, drive_del, which
provides an explicit command to revoke access to a host block device.
drive_del first quiesces the block device (qemu_aio_flush;
bdrv_flush() and bdrv_close()). This prevents further IO from being
submitted against the host device. Finally, drive_del cleans up
pointers between the drive object (host resource) and the device
object (guest resource).
Stefan Hajnoczi [Fri, 12 Nov 2010 09:57:11 +0000 (09:57 +0000)]
scsi-disk: Move active request asserts
SCSI read/write requests should not be re-issued before the current
fragment of I/O completes. There are asserts in scsi-disk.c that guard
this constraint but they trigger on SPARC Linux 2.4. It turns out that
the asserts are too early in the code path and don't allow for read
requests to terminate.
Only the read assert needs to be moved but move the write assert too for
consistency.
Rename the members of target_ucontext so that they don't conflict
with possible host macros for ucontext members. This has already
been done for the other targets.
Anthony Liguori [Fri, 19 Nov 2010 09:55:59 +0000 (18:55 +0900)]
qdev: reset qdev along with qdev tree
This patch changes the reset handling so that qdev has no knowledge of the
global system reset. Instead, a new bus/device level function is introduced
that allows all devices/buses on the bus/device to be reset using a depth
first transversal.
N.B. we have to expose the implicit system bus because we have various hacks
that result in an implicit system bus existing. Instead, we ought to have an
explicitly created system bus that we can trigger reset from. That's a topic
for a future patch though.
Anthony Liguori [Fri, 19 Nov 2010 09:55:58 +0000 (18:55 +0900)]
qbus: add functions to walk both devices and busses
There are some cases where you want to walk the busses, in particular, when
searching for a bus either by name or DeviceInfo.
Paolo suggested that we model the return values on how GCC's walkers work which
allows an actor to skip child transversal, or terminate walking with a positive
value that's returned as the qbus_walk_children's result.
Stefan Weil [Tue, 19 Oct 2010 21:08:21 +0000 (23:08 +0200)]
pci: Automatically patch PCI vendor id and device id in PCI ROM
PCI devices with different vendor or device ids sometimes share
the same rom code. Only the ids and the checksum
differs in a boot rom for such devices.
The i825xx ethernet controller family is a typical example
which is implemented in hw/eepro100.c. It uses at least
3 different device ids, so normally 3 boot roms would be needed.
By automatically patching vendor id and device id (and the checksum)
in qemu, all emulated family members can share the same boot rom.
VGA bios roms are another example with different vendor and device ids.
Only qemu's built-in default rom files will be patched.
v2:
* Patch also the vendor id (and remove the sanity check for vendor id).
v3:
* Don't patch a rom file when its name was set by the user.
Thus we avoid modifications of unknown rom data.
Stefan Weil [Fri, 19 Nov 2010 18:29:07 +0000 (19:29 +0100)]
pci: Replace unneeded type casts in calls of pci_register_bar
There is no need for these type casts (as other existing
code shows). So re-write the first argument without
type cast (and remove a related TODO comment).
Bits 12 to 15 in bridge control register are reserver and must be
read-only zero, curent mask is 0xffff which makes them writeable. Fix
this up by using symbolic bit names for writeable bits instead of a
hardcoded constant.
Open-code functions created in the previous patch,
to make code more compact and clear.
Detcted and documented what looks like a bug in code
that becomes apparent from this refactoring.
Gerd Hoffmann [Wed, 17 Nov 2010 11:06:44 +0000 (12:06 +0100)]
vgabios update: handle compatibility with older qemu versions
As pointed out by avi the vgabios update is guest-visible and thus has
migration implications.
One change is that the vga has a valid pci rom bar now. We already have
a pci bus property to enable/disable the rom bar and we'll load the bios
via fw_cfg as fallback for the no-rom-bar case. So we just have to add
compat properties to handle this case.
A second change is that the magic bochs lfb @ 0xe0000000 is gone. When
live-migrating a guest from a older qemu version it might be using the
lfb though, so we have to keep it for the old machine types. The patch
enables the bochs lfb in case we don't have the pci rom bar enabled
(i.e. we are in 0.13+older compat mode).
This patch depends on these patches which add (and use) the pc-0.13
machine type:
http://patchwork.ozlabs.org/patch/70797/
http://patchwork.ozlabs.org/patch/70798/
Jan Kiszka [Tue, 19 Oct 2010 15:03:24 +0000 (17:03 +0200)]
pcnet: Do not receive external frames in loopback mode
While not explicitly stated in the spec, it was observed on real systems
that enabling loopback testing on the pcnet controller disables
reception of external frames. And some legacy software relies on it, so
provide this behavior.
Avi Kivity [Wed, 17 Nov 2010 09:50:09 +0000 (11:50 +0200)]
Type-safe ioport callbacks
The current ioport callbacks are not type-safe, in that they accept an "opaque"
pointer as an argument whose type must match the argument to the registration
function; this is not checked by the compiler.
This patch adds an alternative that is type-safe. Instead of an opaque
argument, both registation and the callback use a new IOPort type. The
callback then uses container_of() to access its main structures.
Currently the old and new methods exist side by side; once the old way is gone,
we can also save a bunch of memory since the new method requires one pointer
per ioport instead of 6.
Stefan Hajnoczi [Tue, 16 Nov 2010 12:20:25 +0000 (12:20 +0000)]
trace: Trace vm_start()/vm_stop()
VM state change notifications are invoked from vm_start()/vm_stop().
Trace these state changes so we can reason about the state of the VM
from trace output.
Stefan Weil [Mon, 15 Nov 2010 20:15:26 +0000 (21:15 +0100)]
slirp: Remove unused code for bad sprintf
Neither DECLARE_SPRINTF nor BAD_SPRINTF are needed for QEMU.
QEMU won't support systems with missing or bad declarations
for sprintf. The unused code was detected while looking for
functions with missing format checking. Instead of adding
GCC_FMT_ATTR, the unused code was removed.
Avi Kivity [Tue, 16 Nov 2010 14:33:17 +0000 (16:33 +0200)]
optionrom: fix bugs in signrom.sh
signrom.sh has multiple bugs:
- the last byte is considered when calculating the existing checksum, but not
when computing the correction
- apprently the 'expr' expression overflows and produces incorrect results with
larger roms
- if the checksum happened to be zero, we calculated the correction byte to be
256
Instead of rewriting this in half a line of python, this patch fixes the bugs.
Add support for generating a systemtap tapset static probes
This introduces generation of a qemu.stp/qemu-system-XXX.stp
files which provides tapsets with friendly names for static
probes & their arguments. Instead of
There is one tapset defined per target arch, for both
user and system emulators.
* Makefile.target: Generate stp files for each target
* tracetool: Support for generating systemtap tapsets
* configure: Check for whether systemtap is available
with the DTrace backend
Add a DTrace tracing backend targetted for SystemTAP compatability
This introduces a new tracing backend that targets the SystemTAP
implementation of DTrace userspace tracing. The core functionality
should be applicable and standard across any DTrace implementation
on Solaris, OS-X, *BSD, but the Makefile rules will likely need
some small additional changes to cope with OS specific build
requirements.
This backend builds a little differently from the other tracing
backends. Specifically there is no 'trace.c' file, because the
'dtrace' command line tool generates a '.o' file directly from
the dtrace probe definition file. The probe definition is usually
named with a '.d' extension but QEMU uses '.d' files for its
external makefile dependancy tracking, so this uses '.dtrace' as
the extension for the probe definition file.
The 'tracetool' program gains the ability to generate a trace.h
file for DTrace, and also to generate the trace.d file containing
the dtrace probe definition.
Example usage of a dtrace probe in systemtap looks like:
* .gitignore: Ignore trace-dtrace.*
* Makefile: Extra rules for generating DTrace files
* Makefile.obj: Don't build trace.o for DTrace, use
trace-dtrace.o generated by 'dtrace' instead
* tracetool: Support for generating DTrace data files
Luiz Capitulino [Fri, 22 Oct 2010 18:09:05 +0000 (16:09 -0200)]
qemu-char: Introduce Memory driver
This driver handles in-memory chardev operations. That's, all writes
to this driver are stored in an internal buffer and it doesn't talk
to the external world in any way.
Right now it's very simple: it supports only writes. But it can be
easily extended to support more operations.
This is going to be used by the monitor's "HMP passthrough via QMP"
feature, which needs to run monitor handlers without a backing
device.