Naphtali Sprei [Sun, 14 Feb 2010 11:39:18 +0000 (13:39 +0200)]
block: more read-only changes, related to backing files
Open backing file read-only where possible
Upgrade backing file to read-write during commit, back to read-only after commit
If upgrade fail, back to read-only. If also fail, "disconnect" the drive.
Adam Litke [Fri, 12 Feb 2010 20:55:56 +0000 (14:55 -0600)]
Fix hanging user monitor when using balloon command
Arghh... Adding missing S-O-B
Hi Anthony. I wonder if there was a problem when importing my async
command handler patchset. Since the 'balloon' command completes
immediately, it must call the completion callback before returning.
That call was missing but is added by the patch below.
Luiz Capitulino [Thu, 11 Feb 2010 01:50:07 +0000 (23:50 -0200)]
Monitor: Report more than one error in handlers
Handlers can generate only one error in a call, we let the
programmer know if they brake this rule and clients will only
get the first generated error.
Luiz Capitulino [Thu, 11 Feb 2010 01:50:06 +0000 (23:50 -0200)]
Monitor: Debug stray prints the right way
QObject Monitor handlers should not call any Monitor print
function: they should only build objects, printing is done
by common code.
Current QMP code will ignore such calls, as we can't send
garbage to clients, additionally it will also emit an
undefined error on the assumption that print calls usually
report errors.
However, the right way to deal with this is to rely on a
return code. This has been fixed by other commit already.
Now, this commit drops the error from monitor_vprintf() and
adds a better debugging mechanism for those 'stray' prints:
we count them if debug is enabled and let the developer know
if a QObject handler is trying to print anything.
Luiz Capitulino [Thu, 11 Feb 2010 01:49:47 +0000 (23:49 -0200)]
Monitor: Introduce cmd_new_ret()
In order to implement the new error handling and debugging
mechanism for command handlers, we need to change the cmd_new()
callback to return a value.
This commit introduces cmd_new_ret(), which returns a value and
will be used only temporarily to handle the transition from
cmd_new().
That is, as soon as all command handlers are ported to cmd_new_ret(),
it will be renamed back to cmd_new() and the new error handling
and debugging mechanism will be added on top of it.
net: Fix bogus "Warning: vlan 0 with no nics" with -device
net_check_clients() prints this when an VLAN has host devices, but no
guest devices. It uses VLANState members nb_guest_devs and
nb_host_devs to keep track of these devices. However, -device does
not update nb_guest_devs, only net_init_nic() does that, for -net nic.
Check the VLAN clients directly, and remove the counters.
Rabin Vincent [Sun, 14 Feb 2010 18:32:34 +0000 (00:02 +0530)]
target-arm: fix thumb CPS
The Thumb CPS currently does not work correctly: CPSID touches more bits
than the instruction wants to, and CPSIE does nothing. Fix it by
passing the correct mask (the "affect" bits) and value.
Paolo Bonzini [Thu, 18 Feb 2010 20:25:23 +0000 (21:25 +0100)]
get rid of hostregs_helper.h
Since b567b38 (target-arm: remove T0 and T1, 2009-10-16) the only global
register that is used is AREG0, so the complexity of hostregs_helper.h
is unused. Use regular assignments and a compiler optimization barrier.
Artyom Tarasenko [Mon, 15 Feb 2010 17:39:50 +0000 (18:39 +0100)]
sparc32 fix spurious dma interrupts v2
Don't raise irq when not enabled.
Raise irq on enabling if DMA_INTR is set
Don't clear irq unless it was raised by DMA, as there are other irq sources
Don't set DMA_INTR bit spuriously.
Cleanup versatile_pci: no need to re-set fields
to zero (pci core sets 0 already), use set_word
for status field. Compile-tested only, but seems obvious.
Alexander Graf [Tue, 9 Feb 2010 16:37:10 +0000 (17:37 +0100)]
PPC: Add timer when running KVM
For some odd reason we sometimes hang inside KVM forever. I'd guess it's
a race condition where we actually have a level triggered interrupt, but
the infrastructure can't expose that yet, so the guest ACKs it, goes to
sleep and never gets notified that there's still an interrupt pending.
As a quick workaround, let's just wake up every 500 ms. That way we can
assure that we're always reinjecting interrupts in time.
Alexander Graf [Tue, 9 Feb 2010 16:37:09 +0000 (17:37 +0100)]
PPC: Fix large pages
We were masking 1TB SLB entries on the feature bit of 16 MB pages. Obviously
that breaks, so let's just ignore 1TB SLB entries for now and instead do
16MB pages correctly.
Alexander Graf [Tue, 9 Feb 2010 16:37:07 +0000 (17:37 +0100)]
PPC: Get rid of segfaults in DBDMA emulation
While trying to find the right channel number for the DBDMA emulation I
stumbled across segmentation faults that were purely triggered by the guest.
The guest should never have the possiblity to segfault us, so let's check
all indirect function calls on a channel, so the code even works for channels
that have not been reserved.
Alexander Graf [Tue, 9 Feb 2010 16:37:06 +0000 (17:37 +0100)]
PPC: Use macio IDE controller for Newworld
Per default Linux doesn't come with a lot of storage adapters enabled on
Mac configurations. The one that's pretty much always present is the pmac-ide,
while the cmd64x is almost never included in any distribution.
So let's switch to use the MacIO based IDE controller. There is corresponding
OpenBIOS code to get interrupts working properly.
Alexander Graf [Tue, 9 Feb 2010 16:37:05 +0000 (17:37 +0100)]
PPC: tell the guest about the time base frequency
Our guest systems need to know by how much the timebase increases every second,
so there usually is a "timebase-frequency" property in the cpu leaf of the
device tree.
This property is missing in OpenBIOS.
With qemu, Linux's fallback timebase speed and qemu's internal timebase speed
match up. With KVM, that is no longer true. The guest is running at the same
timebase speed as the host.
This leads to massive timing problems. On my test machine, a "sleep 2" takes
about 14 seconds with KVM enabled.
This patch exports the timebase frequency to OpenBIOS, so it can then put them
into the device tree.
Alexander Graf [Tue, 9 Feb 2010 16:37:03 +0000 (17:37 +0100)]
PPC: Include dump of lspci -nn on real G5
To ease debugging and to know what we're lacking, I found it really useful to
have an lspci dump of a real U3 based G5 around. So I added a comment for it.
If people don't think it's important enough to include this information in the
sources, just don't apply this patch.
Alexander Graf [Tue, 9 Feb 2010 16:37:02 +0000 (17:37 +0100)]
PPC: Use Mac99_U3 type on ppc64
The "Mac99" type so far defines a "U2" based configuration. Unfortunately,
there have never been any U2 based PPC64 machines. That's what the U3 was
developed for.
So let's split the Mac99 machine in a PPC64 and a PPC32 machine. The PPC32
machine stays "Mac99", while the PPC64 one becomes "Mac99_U3". All peripherals
stay the same.
Alexander Graf [Tue, 9 Feb 2010 16:37:01 +0000 (17:37 +0100)]
PPC: Uninorth config space accessor
The Uninorth PCI bridge requires different layouts in its PCI config space
accessors.
This patch introduces a conversion function that makes it compatible with
the way Linux accesses it.
I also kept an OpenBIOS compatibility hack in. I think it'd be better to
take small steps here and do the config space access rework in OpenBIOS
later on. When that's done we can remove that hack.
Some users prefer a single callback with length passed as parameter to
using b/w/l callbacks. It would maybe be cleaner to just pass length to
existing callbacks but that's a lot of churn. So for now add a wrapper.
For convenience use pcibus_t for address so a single callback can be
used for pci io and pci memory.
I did have to resort to preprocessor to reduce code duplication. It is
however slightly more straightforward, and better contained than what we
had with pci_host_template.h. Again, it would go away if we just passed
len to existing callbacks.
Isaku Yamahata [Mon, 8 Feb 2010 06:40:38 +0000 (15:40 +0900)]
pci: fix info pci with host bridge.
This patch fixes 525e05147d5a3bdc08caa422d108c1ef71b584b5.
pci host bridge doesn't have header type of bridge.
The check should be by header type, instead of pci class device.
Export the physical block size in the READ CAPACITY (16) command,
and add the new block limits VPD page to export the minimum and
optiomal I/O sizes.
Note that we also need to bump the scsi revision level to SPC-2
as that is the minimum requirement by at least the Linux kernel
to try READ CAPACITY (16) first and look at the block limits VPD
page.
Add three new qdev properties to export block topology information to
the guest. This is needed to get optimal I/O alignment for RAID arrays
or SSDs.
The options are:
- physical_block_size to specify the physical block size of the device,
this is going to increase from 512 bytes to 4096 kilobytes for many
modern storage devices
- min_io_size to specify the minimal I/O size without performance impact,
this is typically set to the RAID chunk size for arrays.
- opt_io_size to specify the optimal sustained I/O size, this is
typically the RAID stripe width for arrays.
I decided to not auto-probe these values from blkid which might easily
be possible as I don't know how to deal with these issues on migration.
Note that we specificly only set the physical_block_size, and not the
logial one which is the unit all I/O is described in. The reason for
that is that IDE does not support increasing the logical block size and
at last for now I want to stick to one meachnisms in queue and allow
for easy switching of transports for a given backing image which would
not be possible if scsi and virtio use real 4k sectors, while ide only
uses the physical block exponent.
To make this more common for the different block drivers introduce a
new BlockConf structure holding all common block properties and a
DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring
what is done for network drivers. Also switch over all block drivers
to use it, except for the floppy driver which has weird driveA/driveB
properties and probably won't require any advanced block options ever.
Example usage for a virtio device with 4k physical block size and
8k optimal I/O size:
The addition of the whole ATA IDENTIY page caused the config space to
go above the allowed size in the PCI spec, and thus the feature was
already reverted in the Linux guest driver and disabled by default in
qemu.
TeLeMan [Mon, 8 Feb 2010 08:20:00 +0000 (16:20 +0800)]
qemu-img: use the heap instead of the huge stack array for win32
The default stack size of PE is 1MB on win32 and IO_BUF_SIZE in
img_convert() & img_rebase() is 2MB, so qemu-img will crash when doing
"convert" & "rebase" on win32.
Although we can improve the stack size of PE to resolve it, I think we
should avoid using the huge stack variables.
Jim Meyering [Mon, 8 Feb 2010 18:28:38 +0000 (19:28 +0100)]
don't dereference NULL after failed strdup
Most of these are obvious NULL-deref bug fixes, for example,
the ones in these files:
block/curl.c
net.c
slirp/misc.c
and the first one in block/vvfat.c.
The others in block/vvfat.c may not lead to an immediate segfault, but I
traced the two schedule_rename(..., strdup(path)) uses, and a failed
strdup would appear to trigger this assertion in handle_renames_and_mkdirs:
assert(commit->path);
The conversion to use qemu_strdup in envlist_to_environ is not technically
needed, but does avoid a theoretical leak in the caller when strdup fails
for one value, but later succeeds in allocating another buffer(plausible,
if one string length is much larger than the others). The caller does
not know the length of the returned list, and as such can only free
pointers until it hits the first NULL. If there are non-NULL pointers
beyond the first, their buffers would be leaked. This one is admittedly
far-fetched.
The two in linux-user/main.c are worth fixing to ensure that an
OOM error is diagnosed up front, rather than letting it provoke some
harder-to-diagnose secondary error, in case of exec failure, or worse, in
case the exec succeeds but with an invalid list of command line options.
However, considering how unlikely it is to encounter a failed strdup early
in main, this isn't a big deal. Note that adding the required uses of
qemu_strdup here and in envlist.c induce link failures because qemu_strdup
is not currently in any library they're linked with. So for now, I've
omitted those changes, as well as the fixes in target-i386/helper.c
and target-sparc/helper.c.
If you'd like to see the above discussion (or anything else)
in the commit log, just let me know and I'll be happy to adjust.
Handle failing strdup by replacing each use with qemu_strdup,
so as not to dereference NULL or trigger a failing assertion.
* block/curl.c (curl_open): s/\bstrdup\b/qemu_strdup/
* block/vvfat.c (init_directories): Likewise.
(get_cluster_count_for_direntry, check_directory_consistency): Likewise.
* net.c (parse_host_src_port): Likewise.
* slirp/misc.c (fork_exec): Likewise.
Luiz Capitulino [Mon, 8 Feb 2010 19:01:30 +0000 (17:01 -0200)]
QMP: Don't leak on connection close
QMP's chardev event callback doesn't call
json_message_parser_destroy() on CHR_EVENT_CLOSED. As the call
to json_message_parser_init() on CHR_EVENT_OPENED allocates memory,
we'are leaking on close.
Fix that by just calling json_message_parser_destroy() on
CHR_EVENT_CLOSED.
Luiz Capitulino [Mon, 8 Feb 2010 19:01:29 +0000 (17:01 -0200)]
QError: Don't abort on multiple faults
Ideally, Monitor code should report an error only once and
return the error information up the call chain.
To assure that this happens as expected and that no error is
lost, we have an assert() in qemu_error_internal().
However, we still have not fully converted handlers using
monitor_printf() to report errors. As there can be multiple
monitor_printf() calls on an error, the assertion is easily
triggered when debugging is enabled; and we will get a memory
leak if it's not.
The solution to this problem is to allow multiple faults by only
reporting the first one, and to release the additional error objects.
A better mechanism to report multiple errors to programmers is
underway.
Tom Lendacky [Mon, 8 Feb 2010 16:10:01 +0000 (10:10 -0600)]
virtio-net: fix network stall under load
Fix a race condition where qemu finds that there are not enough virtio
ring buffers available and the guest make more buffers available before
qemu can enable notifications.
Roy Tam [Thu, 4 Feb 2010 02:30:30 +0000 (10:30 +0800)]
json: fix PRId64 on Win32
OK we are fooled by the json lexer and parser. As we use %I64d to
print 'long long' variables in Win32, but lexer and parser only deal
with %lld but not %I64d, this patch add support for %I64d and solve
'info pci', 'powser_reset' and 'power_powerdown' assert failure in
Win32.
Luiz Capitulino [Thu, 4 Feb 2010 20:10:06 +0000 (18:10 -0200)]
QMP: Enforce capability negotiation rules
With this commit QMP will be started in Capabilities Negotiation
mode, where the only command allowed to run is 'qmp_capabilities'.
All other commands will return CommandNotFound error. Asynchronous
messages are not delivered either.
When 'qmp_capabilities' is successfully executed QMP enters in
Command mode, where all commands (except 'qmp_capabilities') are
allowed to run and asynchronous messages are delivered.
Luiz Capitulino [Thu, 4 Feb 2010 20:10:05 +0000 (18:10 -0200)]
QMP: Introduce the qmp_capabilities command
This command will be used to enable QMP capabilities advertised
by the capabilities array.
Note that it will be mandatory to issue this command in order
to make QMP functional (although this behavior is not being
enforced by this commit).
Also, as we don't have any capabilities yet, the new command
doesn't accept any arguments. I will postpone the decision for
a format for this until we get our first capability.
Finally, this command is visible from the user Monitor too, in
the meaning that you can execute it but it won't do anything.
Making it only visible in QMP is beyond this series' goal, as
it requires changes in unrelated places.
Luiz Capitulino [Thu, 4 Feb 2010 20:10:04 +0000 (18:10 -0200)]
QMP: Add QEMU's version to the greeting message
With capability negotiation support clients will only have a chance
to check QEMU's version (ie. issue 'query-version') after the
negotiation procedure is done.
It might be useful to clients to check QEMU's version before
negotiating features, though.
To allow that, this commit adds the QEMU's version object to the
greeting message.
Not really sure this is needed, but doesn't hurt anyway.
David S. Ahern [Wed, 3 Feb 2010 16:00:54 +0000 (09:00 -0700)]
segfault due to buffer overrun in usb-serial
This fixes a segfault due to buffer overrun in the usb-serial device.
The memcpy was incrementing the start location by recv_used yet, the
computation of first_size (how much to write at the end of the buffer
before wrapping to the front) was not accounting for it. This causes the
next element after the receive buffer (recv_ptr) to get overwritten with
random data.
David S. Ahern [Wed, 3 Feb 2010 15:49:39 +0000 (08:49 -0700)]
audio streaming from usb devices
I have streaming audio devices working within qemu-kvm. This is a port
of the changes to qemu.
Streaming audio generates a series of isochronous requests that are
repetitive and time sensitive. The URBs need to be submitted in
consecutive USB frames and responses need to be handled in a timely manner.
Summary of the changes for isochronous requests:
1. The initial 'valid' value is increased to 32. It needs to be higher
than its current value of 10 since the host adds a 10 frame delay to the
scheduling of the first request; if valid is set to 10 the first
isochronous request times out and qemu cancels it. 32 was chosen as a
nice round number, and it is used in the path where a TD-async pairing
already exists.
2. The token field in the TD is *not* unique for isochronous requests,
so it is not a good choice for finding a matching async request. The
buffer (where to write the guest data) is unique, so use that value instead.
3. TD's for isochronous request need to be completed in the async
completion handler so that data is pushed to the guest as soon as it is
available. The uhci code currently attempts to process complete
isochronous TDs the next time the UHCI frame with the request is
processed. The results in lost data since the async requests will have
long since timed out based on the valid parameter. Increasing the valid
value is not acceptable as it introduces a 1+ second delay in the data
getting pushed to the guest.
4. The frame timer needs to be run on 1 msec intervals. Currently, the
expire time for the processing the next frame is computed after the
processing of each frame. This regularly causes the scheduling of frames
to shift in time. When this happens the periodic scheduling of the
requests is broken and the subsequent request is seen as a new request
by the host resulting in a 10 msec delay (first isochronous request is
scheduled for 10 frames from when the URB is submitted).
[ For what's worth a small change is needed to the guest driver to have
more outstanding URBs (at least 4 URBs with 5 packets per URB).]