Peter Maydell [Wed, 19 Sep 2012 13:51:38 +0000 (14:51 +0100)]
arch_init.c: Improve '-soundhw help' for non-HAS_AUDIO_CHOICE archs
For architectures which don't set HAS_AUDIO_CHOICE, improve the
'-soundhw help' message so that it doesn't simply print an empty
list, implying no sound support at all.
Anthony Liguori [Mon, 17 Sep 2012 15:23:15 +0000 (10:23 -0500)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
* kwolf/for-anthony:
block: Don't forget to delete temporary file
Don't require encryption password for 'qemu-img info' command
qemu-img: Add json output option to the info command.
qapi: Add SnapshotInfo and ImageInfo.
ahci: properly reset PxCMD on HBA reset
block: fix block tray status
vdi: Fix warning from clang
block/curl: Fix wrong free statement
ide: Fix error messages from static code analysis (no real error)
ATAPI: STARTSTOPUNIT only eject/load media if powercondition is 0
sheepdog: fix savevm and loadvm
Anthony Liguori [Mon, 17 Sep 2012 15:21:42 +0000 (10:21 -0500)]
Merge remote-tracking branch 'stefanha/trivial-patches' into staging
* stefanha/trivial-patches:
configure: fix seccomp check
arch_init.c: add missing '%' symbols before PRIu64 in debug printfs
kvm: Fix warning from static code analysis
qapi: Fix enumeration typo error
console: Clean up bytes per pixel calculation
Fix copy&paste typos in documentation comments
linux-user: Remove #if 0'd cpu_get_real_ticks() definition
ui: Fix spelling in comment (ressource -> resource)
Spelling fixes in comments and macro names (ressource -> resource)
Fix spelling (licenced -> licensed) in GPL
Spelling fixes in comments and documentation
srp: Don't use QEMU_PACKED for single elements of a structured type
Anthony Liguori [Mon, 17 Sep 2012 15:20:48 +0000 (10:20 -0500)]
Merge remote-tracking branch 'stefanha/net' into staging
* stefanha/net:
net: EAGAIN handling for net/socket.c TCP
net: EAGAIN handling for net/socket.c UDP
net: asynchronous send/receive infrastructure for net/socket.c
net: broadcast hub packets if at least one port can receive
net: fix usbnet_receive() packet drops
net: clean up usbnet_receive()
net: add -netdev options to man page
net: do not report queued packets as sent
net: add receive_disabled logic to iov delivery path
eepro100: Fix network hang when rx buffers run out
xen: flush queue when getting an event
e1000: flush queue whenever can_receive can go from false to true
net: notify iothread after flushing queue
Anthony Liguori [Mon, 17 Sep 2012 15:20:27 +0000 (10:20 -0500)]
Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
kvm: Rename irqchip_inject_ioctl to irq_set_ioctl
kvm: Stop flushing coalesced MMIO on vmexit
VGA: Flush coalesced MMIO on related MMIO/PIO accesses
memory: Flush coalesced MMIO on mapping and state changes
memory: Fold memory_region_update_topology into memory_region_transaction_commit
memory: Use transaction_begin/commit also for single-step operations
memory: Flush coalesced MMIO on selected region access
kvm-all.c: Move init of irqchip_inject_ioctl out of kvm_irqchip_create()
update-linux-headers.sh: Don't hard code list of architectures
David Gibson [Mon, 10 Sep 2012 02:30:57 +0000 (12:30 +1000)]
cpu_physical_memory_write_rom() needs to do TB invalidates
cpu_physical_memory_write_rom(), despite the name, can also be used to
write images into RAM - and will often be used that way if the machine
uses load_image_targphys() into RAM addresses.
However, cpu_physical_memory_write_rom(), unlike cpu_physical_memory_rw()
doesn't invalidate any cached TBs which might be affected by the region
written.
This was breaking reset (under full emu) on the pseries machine - we loaded
our firmware image into RAM, and while executing it rewrite the code at
the entry point (correctly causing a TB invalidate/refresh). When we
reset the firmware image was reloaded, but the TB from the rewrite was
still active and caused us to get an illegal instruction trap.
This patch fixes the bug by duplicating the tb invalidate code from
cpu_physical_memory_rw() in cpu_physical_memory_write_rom().
David Gibson [Mon, 10 Sep 2012 02:30:56 +0000 (12:30 +1000)]
qemu-char: BUGFIX, don't call FD_ISSET with negative fd
tcp_chr_connect(), unlike for example udp_chr_update_read_handler() does
not check if the fd it is using is valid (>= 0) before passing it to
qemu_set_fd_handler2(). If using e.g. a TCP serial port, which is not
initially connected, this can result in -1 being passed to FD_ISSET, which
has undefined behaviour. On x86 it seems to harmlessly return 0, but on
PowerPC, it causes a fortify buffer overflow error to be thrown.
This patch fixes this by putting an extra test in tcp_chr_connect(), and
also adds an assert qemu_set_fd_handler2() to catch other such errors on
all platforms, rather than just some.
Add an explicit CPUCRISState parameter instead of relying on AREG0, and
use cpu_ld* in translation and interrupt handling. Remove AREG0 swapping
in tlb_fill(). Switch to AREG0 free mode
Natanael Copa [Wed, 12 Sep 2012 09:06:51 +0000 (09:06 +0000)]
configure: properly check if -lrt and -lm is needed
Fixes build against uClibc.
uClibc provides 2 versions of clock_gettime(), one with realtime
support and one without (this is so you can avoid linking in -lrt
unless actually needed). This means that the clock_gettime() don't
need -lrt. We still need it for timer_create() so we check for this
function in addition.
Currently, if libseccomp is missing but the user explicitly requested
seccomp support using --enable-seccomp, configure silently ignores the
situation and disables seccomp support.
This is unlike all other tests that explicitly fail in such situation.
Stefan Hajnoczi [Mon, 20 Aug 2012 09:14:35 +0000 (10:14 +0100)]
net: EAGAIN handling for net/socket.c TCP
Replace spinning send_all() with a proper non-blocking send. When the
socket write buffer limit is reached, we should stop trying to send and
wait for the socket to become writable again.
Non-blocking TCP sockets can return in two different ways when the write
buffer limit is reached:
1. ret = -1 and errno = EAGAIN/EWOULDBLOCK. No data has been written.
2. ret < total_size. Short write, only part of the message was
transmitted.
Handle both cases and keep track of how many bytes have been written in
s->send_index. (This includes the 'length' header before the actual
payload buffer.)
Stefan Hajnoczi [Mon, 20 Aug 2012 09:21:54 +0000 (10:21 +0100)]
net: asynchronous send/receive infrastructure for net/socket.c
The net/socket.c net client is not truly asynchronous. This patch
borrows the qemu_set_fd_handler2() code from net/tap.c as the basis for
proper asynchronous send/receive.
Only read packets from the socket when the peer is able to receive.
This avoids needless queuing.
Stefan Hajnoczi [Fri, 24 Aug 2012 12:50:30 +0000 (13:50 +0100)]
net: broadcast hub packets if at least one port can receive
In commit 60c07d933c66c4b30a83b7ccbc8a0cb3df1b2d0e ("net: fix
qemu_can_send_packet logic") the "VLAN" broadcast behavior was changed
to queue packets if any net client cannot receive. It turns out that
this was not actually the right fix and just hides the real bug that
hw/usb/dev-network.c:usbnet_receive() clobbers its receive buffer when
called multiple times in a row. The commit also introduced a new bug
that "VLAN" packets would not be sent if one of multiple net clients was
down.
The hw/usb/dev-network.c bug has since been fixed, so this patch reverts
broadcast behavior to send packets as long as one net client can
receive. Packets simply get queued for the net clients that are
temporarily unable to receive.
Stefan Hajnoczi [Fri, 24 Aug 2012 12:37:29 +0000 (13:37 +0100)]
net: fix usbnet_receive() packet drops
The USB network interface has a single buffer which the guest reads
from. This patch prevents multiple calls to usbnet_receive() from
clobbering the input buffer. Instead we queue packets until buffer
space becomes available again.
This is inspired by virtio-net and e1000 rxbuf handling.
Stefan Hajnoczi [Fri, 24 Aug 2012 12:32:16 +0000 (13:32 +0100)]
net: clean up usbnet_receive()
The USB network interface has two code paths depending on whether or not
RNDIS mode is enabled. Refactor usbnet_receive() so that there is a
common path throughout the function instead of duplicating everything
across if (is_rndis(s)) ... else ... code paths.
Clean up coding style and 80 character line wrap along the way.
Stefan Hajnoczi [Mon, 20 Aug 2012 12:35:23 +0000 (13:35 +0100)]
net: do not report queued packets as sent
Net send functions have a return value where 0 means the packet has not
been sent and will be queued. A non-zero value means the packet was
sent or an error caused the packet to be dropped.
This patch fixes two instances where packets are queued but we return
their size. This causes callers to believe the packets were sent. When
the caller uses the async send interface this creates a real problem
because the callback will be invoked for a packet that the caller
believed to be already sent. This bug can cause double-frees in the
caller.
Stefan Hajnoczi [Fri, 17 Aug 2012 20:16:42 +0000 (21:16 +0100)]
net: add receive_disabled logic to iov delivery path
This patch adds the missing NetClient->receive_disabled logic in the
sendv delivery code path. It seems that commit 893379efd0e1b84ceb0c42a713293f3dbd27b1bd ("net: disable receiving if
client returns zero") only added the logic to qemu_deliver_packet() and
not qemu_deliver_packet_iov().
The receive_disabled flag should be automatically set when .receive(),
.receive_raw(), or .receive_iov() return 0. No further packets will be
delivered to the NetClient until the receive_disabled flag is cleared
again by calling qemu_flush_queued_packets().
Typically the NetClient will wait until its file descriptor becomes
writable and then invoke qemu_flush_queued_packets() to resume
transmission.
Bo Yang [Wed, 29 Aug 2012 11:26:11 +0000 (19:26 +0800)]
eepro100: Fix network hang when rx buffers run out
This is reported by QA. When installing os with pxe, after the initial
kernel and initrd are loaded, the procedure tries to copy files from install
server to local harddisk, the network becomes stall because of running out of
receive descriptor.
[Whitespace fixes and removed qemu_notify_event() because Paolo's
earlier net patches have moved it into qemu_flush_queued_packets().
Additional info:
I can reproduce the network hang with a tap device doing a iPXE HTTP
boot as follows:
I needed a vanilla iPXE ROM to get to the iPXE prompt. I think the boot
prompt has been disabled in the ROMs that ship with QEMU to reduce boot
time.
During the vmlinuz HTTP download there is a network hang. hw/eepro100.c
has reached the end of the rx descriptor list. When the iPXE driver
replenishes the rx descriptor list we don't kick the QEMU net subsystem
and event loop, thereby leaving the tap netdev without its file
descriptor in select(2).
Paolo Bonzini [Thu, 9 Aug 2012 14:45:57 +0000 (16:45 +0200)]
xen: flush queue when getting an event
xen does not have a register that, when written, will cause can_receive
to go from false to true. However, flushing the queue can be attempted
whenever the front-end raises its side of the Xen event channel. There
is a single event channel for tx and rx.
Paolo Bonzini [Thu, 9 Aug 2012 14:45:56 +0000 (16:45 +0200)]
e1000: flush queue whenever can_receive can go from false to true
When the guests replenish the receive ring buffer, the network device
should flush its queue of pending packets. This is done with
qemu_flush_queued_packets.
e1000's can_receive can go from false to true when RCTL or RDT are
modified.
Paolo Bonzini [Thu, 9 Aug 2012 14:45:55 +0000 (16:45 +0200)]
net: notify iothread after flushing queue
virtio-net has code to flush the queue and notify the iothread
whenever new receive buffers are added by the guest. That is
fine, and indeed we need to do the same in all other drivers.
However, notifying the iothread should be work for the network
subsystem. And since we are at it we can add a little smartness:
if some of the queued packets already could not be delivered,
there is no need to notify the iothread.
Igor Mitsyanko [Wed, 5 Sep 2012 09:04:56 +0000 (13:04 +0400)]
arch_init.c: add missing '%' symbols before PRIu64 in debug printfs
'%' symbols were missing in front of PRIu64 macros in DPRINTF() messages in
arch_init.c, this caused compilation warnings when compiled with DEBUG_ARCH_INIT defined.
BALATON Zoltan [Wed, 22 Aug 2012 15:19:42 +0000 (17:19 +0200)]
console: Clean up bytes per pixel calculation
Division with round up is the correct way to compute this even if the
only case where division with round down gives incorrect result is
probably 15 bpp. This case was explicitely patched up in one of these
functions but was unhandled in the other. (I'm not sure about setting
16 bpp for the 15bpp case either but I left that there for now.)
Remove the cpu_get_real_ticks() definition from linux-user/main.c.
This has been disabled via #if 0 and unused since commit 1dce7c3c22
in 2006; the definitions we actually use are in qemu-timer.h.
This option is described in RFC 1783. As this is only an optional field,
we may ignore it in some situations and handle it in some others.
However, MS Windows 2003 PXE boot client requests a block size of the MTU
(most of the times 1472 bytes), and doesn't work if the option is not
acknowledged (with whatever value).
According to the RFC 1783, we cannot acknowledge the option with a bigger
value than the requested one.
As current implementation is using 512 bytes by block, accept the option
with a value of 512 if the option was specified, and don't acknowledge it
if it is not present or less than 512 bytes.
Alon Levy [Wed, 12 Sep 2012 13:13:28 +0000 (16:13 +0300)]
hw/qxl: support client monitor configuration via device
Until now we used only the agent to change the monitor count and each
monitor resolution. This patch introduces the qemu part of using the
device as the mediator instead of the agent via virtio-serial.
Spice (>=0.11.5) calls the new QXLInterface::client_monitors_config,
which returns wether the interrupt is enabled, and if so and given a non
NULL monitors config will
generate an interrupt QXL_INTERRUPT_CLIENT_MONITORS_CONFIG with crc
checksum for the guest to verify a second call hasn't interfered.
The maximal number of monitors is limited on the QXLRom to 64.
Don't require encryption password for 'qemu-img info' command
The encryption password is only required if I/O is going to be
performed on a disk image. The 'qemu-img info' command merely
reports metadata, so it should not ask for a decryption password
Jason Baron [Tue, 4 Sep 2012 20:08:08 +0000 (16:08 -0400)]
ahci: properly reset PxCMD on HBA reset
While testing q35, I found that windows 7 (specifically, windows 7 ultimate
with sp1 x64), wouldn't install because it can't find the cdrom or disk drive.
The failure message is: 'A required cd/dvd device driver is missing. If you
have a driver floppy disk, CD, DVD, or USB flash drive, please insert it now.'
This can also be reproduced on piix by adding an ahci controller, and
observing that windows 7 does not see any devices behind it.
The problem is that when windows issues a HBA reset, qemu does not reset the
individual ports' PxCMD register. Windows 7 then reads back the PxCMD register
and presumably assumes that the ahci controller has already been initialized.
Windows then never sets up the PxIE register to enable interrupts, and thus it
never gets irqs back when it sends ata device inquiry commands.
This change brings qemu into ahci 1.3 specification compliance.
Section 10.4.3 HBA Reset:
"
When GHC.HR is set to '1', GHC.AE, GHC.IE, the IS register, and all port
register fields (except PxFB/PxFBU/PxCLB/PxCLBU) that are not HwInit in the
HBA's register memory space are reset.
"
I've also re-tested Fedora 16 and 17 to verify that they continue to work with
this change.
The upper limit of 30 was never reached because both for loops terminated
when 'smart_attributes' reached end of list, so there was no real buffer
overflow.
Nevertheless, changing the code not only fixes the error report, but also
reduces the size of smart_attributes and simplifies the for loops.
Ronnie Sahlberg [Tue, 31 Jul 2012 01:28:26 +0000 (11:28 +1000)]
ATAPI: STARTSTOPUNIT only eject/load media if powercondition is 0
The START STOP UNIT command will only eject/load media if
power condition is zero.
If power condition is !0 then LOEJ and START will be ignored.
From MMC (sbc contains similar wordings too)
The Power Conditions field requests the block device to be placed
in the power condition defined in
Table 558. If this field has a value other than 0h then the Start
and LoEj bits shall be ignored.
Uri Lublin [Tue, 11 Sep 2012 07:09:58 +0000 (10:09 +0300)]
qxl: better cleanup for surface destroy
Add back a call to qxl_spice_destroy_surface_wait_complete() in qxl_spice_destroy_surface_wait(),
that was removed by commit c480bb7da465186b84d8427e068ef7502e47ffbf
It is needed to complete surface-removal cleanup, for non async.
For async, qxl_spice_destroy_surface_wait_complete is called upon operation completion.
The recent introduction of set_client_capabilities has broken
(seamless) migration by trying to call qxl_send_events pre (seamless
incoming) and post (*) migration, triggering the following assert:
qxl_send_events: Assertion `qemu_spice_display_is_running(&d->ssd)' failed.
The solution is easy, pre migration the guest will have already received
the client caps on the migration source side, and post migration there no
longer is a guest, so we can simply ignore the set_client_capabilities call
in both those scenarios.
*) Post migration, so not fatal for to the migration itself, but still a crash
spice: send updates only for changed screen content
when creating screen updates go compare the current guest screen
against the mirror (which holds the most recent update sent), then
only create updates for the screen areas which did actually change.
[ v2: drop redundant qemu_spice_create_one_update call ]
Creating one function which creates a single update for a given
rectangle. And one (for now) pretty simple wrapper around it to
queue up screen updates for the dirty region.
Jan Kiszka [Thu, 23 Aug 2012 11:02:33 +0000 (13:02 +0200)]
VGA: Flush coalesced MMIO on related MMIO/PIO accesses
In preparation of stopping to flush coalesced MMIO unconditionally on
vmexits, mark VGA MMIO and PIO regions as synchronous /wrt coalesced
MMIO and flush the buffer explicitly on PIO accesses that do not use
generic memory regions yet.
Jan Kiszka [Thu, 23 Aug 2012 11:02:32 +0000 (13:02 +0200)]
memory: Flush coalesced MMIO on mapping and state changes
Flush pending coalesced MMIO before performing mapping or state changes
that could affect the event orderings or route the buffered requests to
a wrong region.
Jan Kiszka [Thu, 23 Aug 2012 11:02:30 +0000 (13:02 +0200)]
memory: Use transaction_begin/commit also for single-step operations
Wrap also simple operations consisting only of a single step with
memory_region_transaction_begin/commit. This allows to perform
additional steps like coalesced MMIO flushing from a single place.
This requires dropping some micro-optimizations: The skipping of
topology updates after updating disabled or unregistered regions.
Jan Kiszka [Thu, 23 Aug 2012 11:02:29 +0000 (13:02 +0200)]
memory: Flush coalesced MMIO on selected region access
Instead of flushing pending coalesced MMIO requests on every vmexit,
this provides a mechanism to selectively flush when memory regions
related to the coalesced one are accessed. This first of all includes
the coalesced region itself but can also applied to other regions, e.g.
of the same device, by calling memory_region_set_flush_coalesced.
Peter Maydell [Wed, 15 Aug 2012 11:08:13 +0000 (12:08 +0100)]
kvm-all.c: Move init of irqchip_inject_ioctl out of kvm_irqchip_create()
Move the init of the irqchip_inject_ioctl field of KVMState out of
kvm_irqchip_create() and into kvm_init(), so that kvm_set_irq()
can be used even when no irqchip is created (for architectures
that support async interrupt notification even without an in
kernel irqchip).
Peter Maydell [Wed, 18 Jul 2012 10:11:09 +0000 (11:11 +0100)]
update-linux-headers.sh: Don't hard code list of architectures
Rather than hardcoding the list of architectures in the kernel
header update script, just import headers for every architecture
which supports KVM (with a blacklist exception for ia64 which
has KVM headers but is dead). This reduces the number of QEMU
files which need to be updated to add support for a new KVM
architecture.
optimizer.c contains some cases were the break is appearing in both the
if and the else parts. Fix that by moving it to the outer part. Also
move some common code there.