Alexander Graf [Fri, 6 May 2011 08:37:56 +0000 (10:37 +0200)]
PPC MPC7544DS: Use new TLB helper function
Now that we have some nice helpers that can find us a TLB entry, let's
use that on the machine initialization code, so we don't need to know
about the internals of the TLB array.
Alexander Graf [Sat, 30 Apr 2011 21:34:58 +0000 (23:34 +0200)]
PPC: Implement e500 (FSL) MMU
Most of the code to support e500 style MMUs is already in place, but
we're missing on some of the special TLB0-TLB1 handling code and slightly
different TLB modification.
Alexander Graf [Sat, 30 Apr 2011 22:00:58 +0000 (00:00 +0200)]
PPC: Add another 64 bits to instruction feature mask
To enable quick runtime detection of instruction groups to the currently
selected CPU emulation, we have a feature mask of what exactly the respective
instruction supports.
This feature mask is 64 bits long and we just successfully exceeded those 64
bits. To add more features, we need to think of something.
The easiest solution that came to my mind was to simply add another 64 bits
that we can also match on. Since the comparison is only done on start of the
qemu process to generate an internal opcode calling table, we should be fine
on any performance penalties here.
Alexander Graf [Sat, 30 Apr 2011 21:05:03 +0000 (23:05 +0200)]
PPC: Make MPC8544DS obey -cpu switch
The MPC8544DS board emulation code ignored the user defined -cpu switch.
This patch enables it to only provide a sane default, not force an e500v2
CPU inside.
David Gibson [Tue, 10 May 2011 06:06:21 +0000 (16:06 +1000)]
Fix off-by-one error in sizing pSeries hcall table
The pSeries machine uses two tables to look up guest hcalls for emulation.
One of these is exactly one entry too small to hold all the hcalls it needs
to, leading to memory corruption.
This patch fixes the bug, and while we're at it, make both tables 'static'
since they're never used from other modules.
Alexander Graf [Sat, 16 Apr 2011 08:15:11 +0000 (10:15 +0200)]
kvm: ppc: warn user on PAGE_SIZE mismatch
On PPC, the default PAGE_SIZE is 64kb. Unfortunately, the hardware
alignments don't match here: There are RAM and MMIO regions within
a single page when it's 64kb in size.
So the only way out for now is to tell the user that he should use 4k
PAGE_SIZE.
This patch gives the user a hint on that, telling him that failing to
register a prefix slot is most likely to be caused by mismatching PAGE_SIZE.
This way it's also more future-proof, as bigger PAGE_SIZE can easily be
supported by other machines then, as long as they stick to 64kb granularities.
Alexander Graf [Sat, 16 Apr 2011 00:00:36 +0000 (02:00 +0200)]
kvm: ppc: detect old headers
When compiling Qemu with older kernel headers, the PVR setting
mechanism isn't available yet. Unfortunately, back then I didn't add
a capability we could check against, so all we can do is add a configure
test to see if we support PVR setting. For BookE, we don't care yet.
This fixes compilation errors with KVM enabled on older kernel headers
(like 2.6.32).
Scott Wood [Mon, 11 Apr 2011 23:34:34 +0000 (18:34 -0500)]
kvm: ppc: fixes for KVM_SET_SREGS on init
Classic/server ppc has had SREGS for a while now (though I think not
always?), but it's still missing for booke. Check the capability before
calling KVM_SET_SREGS.
Also, don't write random stack state into the non-PVR sregs fields --
have kvm fill it in first.
Eventually booke will have sregs and it will have its own capability to
be tested here. However, we will want a way for platform code to request
to look like the actual CPU we're running on, especially if SoC devices
are being directly assigned.
David Gibson [Tue, 19 Apr 2011 01:54:52 +0000 (11:54 +1000)]
Place pseries vty devices at addresses more similar to existing machines
Currently the qemu pseries machine numbers its virtual serial devices
from 0. However, existing pSeries machines running pHyp number them from
0x30000000.
In theory these indices are arbitrary, since everything necessary for the
kernel to find them is advertised in the device tree. However the debian
installer, at least, incorrectly looks for a device named vty@30... to
determine whether to use the hypervisor console.
Therefore this patch moves the numbers we use to match the existing pHyp
practice, in order to workaround broken userspace apps of this type.
David Gibson [Tue, 19 Apr 2011 01:54:51 +0000 (11:54 +1000)]
Make pSeries 'model' property more closely resemble real hardware
Currently, the qemu emulated pseries machine puts
"qemu,emulated-pSeries-LPAR" in the device tree's root level 'model'
property. Unfortunately this confuses some installers and ybin, which
expect this to start with "IBM" on pSeries machines. This patch addresses
this problem, making the property more closely resemble the pattern of
existing real hardware.
Anton Blanchard [Tue, 19 Apr 2011 01:54:50 +0000 (11:54 +1000)]
pseries: Increase maximum CPUs to 256
The original pSeries machine was limited to 32 CPUs, more or less
arbitrarily. Particularly when we get SMT KVM guests it will be
pretty easy to exceed this. Therefore, raise the max number of CPUs
in a pseries machine guest to 256.
Gerd Hoffmann [Mon, 9 May 2011 07:44:03 +0000 (09:44 +0200)]
usb-musb: uninline functions
Prototype without "inline" keyword breaks the build with some gcc
versions. Noticed by Alexander Graf.
Fix this by removing the inline keywork everywhere. Some functions
can't be inlined anyway as the are referenced using function pointers.
Beside that gcc does a pretty good job on auto-inlining these days.
This mask contains all of the bits that should be ignored while single
stepping in the debugger. The mask contains 2 bits that are not currently
cleared, but are also never set. The bits are included in the mask for
consistency in handling of the CPU_INTERRUPT_TGT_EXT_N bits.
These defines will be place-holders for cpu-specific functionality.
Generic code will, at the end of the patch series, no longer have to
concern itself about how SMI, NMI, etc should be handled. Instead,
generic code will know only that the interrupt is internal or external.
Alex Williamson [Tue, 3 May 2011 18:36:46 +0000 (12:36 -0600)]
CPUPhysMemoryClient: Pass guest physical address not region offset
When we're trying to get a newly registered phys memory client updated
with the current page mappings, we end up passing the region offset
(a ram_addr_t) as the start address rather than the actual guest
physical memory address (target_phys_addr_t). If your guest has less
than 3.5G of memory, these are coincidentally the same thing. If
there's more, the region offset for the memory above 4G starts over
at 0, so the set_memory client will overwrite it's lower memory entries.
Instead, keep track of the guest phsyical address as we're walking the
tables and pass that to the set_memory client.
Alex Williamson [Tue, 3 May 2011 18:36:32 +0000 (12:36 -0600)]
CPUPhysMemoryClient: Fix typo in phys memory client registration
When we register a physical memory client, we try to walk the page
tables, calling the set_memory hook for every entry. Effectively
playing catchup for the client for everything already registered.
With this type, we only walk the 2nd entry of the l1 table,
typically missing all of the registered memory.
Stefan Weil [Sat, 30 Apr 2011 20:40:07 +0000 (22:40 +0200)]
eepro100: Pad received short frames
QEMU sends frames smaller than 60 bytes to ethernet nics.
Such frames are rejected by real NICs and their emulations.
To avoid this behaviour, other NIC emulations pad received
frames. This patch enables this workaround for eepro100, too.
All related code is marked with CONFIG_PAD_RECEIVED_FRAMES,
so we can drop this in case QEMU's networking code is
ever changed.
Stefan Weil [Sat, 30 Apr 2011 20:40:04 +0000 (22:40 +0200)]
eepro100: Avoid duplicate debug messages
When DEBUG_EEPRO100 was enabled, unsupported writes were logged twice.
Now logging in eepro100_write1 and eepro100_write2 is similar to the
logging in eepro100_write4 (which already was correct).
Gerd Hoffmann [Wed, 4 May 2011 14:49:56 +0000 (16:49 +0200)]
usb: mass storage fix
Initialize scsi_len with zero when starting a new request, so any
stuff leftover from the previous request is cleared out. This may
happen in case the data returned by the scsi command doesn't fit
into the buffer provided by the guest.
Hans de Goede [Wed, 2 Feb 2011 16:46:00 +0000 (17:46 +0100)]
usb: control buffer fixes
Windows allows control transfers to pass up to 4k of data, so raise our
control buffer size to 4k. For control out transfers the usb core code copies
the control request data to a buffer before calling the device's handle_control
callback. Add a check for overflowing the buffer before copying the data.
Hans de Goede [Fri, 26 Nov 2010 14:02:16 +0000 (15:02 +0100)]
usb-linux: We only need to keep track of 15 endpoints
Currently we reserve room for endpoint data for 16 endpoints, but given
that we only use endpoint data for endpoints 1-15, and always index the
array with the endpoint-number - 1, 15 is enough.
Hans de Goede [Fri, 26 Nov 2010 13:59:35 +0000 (14:59 +0100)]
usb-linux: Refuse iso packets when max packet size is 0 (alt setting 0)
Refuse iso usb packets when then max packet size for the endpoint is 0,
this avoids an abort in usb_host_alloc_iso() caused by trying to qemu_malloc
a 0 bytes large buffer.
Hans de Goede [Fri, 26 Nov 2010 10:41:08 +0000 (11:41 +0100)]
usb-linux: Add support for buffering iso usb packets
Currently we are submitting iso packets to the host one at a time, as we
receive them from the emulated host controller. This has 2 problems:
1) If we were fast enough to submit every packet in time for the next host host
controller usb frame, we would be generating 1000 hardware interrupts per
second on the host
2) We are not fast enough to submit every packet in time for the next host host
controller usb frame, causing us to not submit iso urbs in some usb frames
which causes devices with an endpoint with an interval of 1 ms (so every
frame) to loose data. This causes for example ubs-1.1 webcams to not work
properly (usb-2.0 is not supported at all atm).
This patch fixes both problems by changing the iso packet pass through handling
to buffer packets. This version only does so for iso input packets (webcams,
audio in) I'm working on a second patch extending this to iso output packets
(audio out).
This patch makes use of the linux batching of iso packets in one urb.
When an iso in packet gets received from the emulated host controller,
it immediately submits 3 urbs with 32 iso in packets each. This causes
the host to only get an hw interrupt every 32 packets dropping the
interrupt rate to 32 interrupts per second and gives it a queue of urbs
to work from once the first 32 iso in packets have been received to make sure
no packets are dropped.
Besides submitting a whole bunch or urbs as soon as the first urb is
received, effectively creating a buffer inside the kernel, this patch also
gets rid of the asynchroneous completion for iso in urbs. Instead they are
only marked as complete in the fd write callback (which usbfs uses to signal
complete urbs). These complete packets then get consumed by returning them
synchroneously to the emulated host controller when it submits an iso in
packet for the ep in question. When no complete packets are ready (which
happens when the stream is starting) a 0 length packet gets returned to
the emulated host controller.
With this patch I've several usb-1.1 webcams working well with usb pass
through, where as without this patch none of them work.
Hans de Goede [Wed, 24 Nov 2010 11:57:59 +0000 (12:57 +0100)]
usb-linux: Get the alt. setting from sysfs rather then asking the dev
At least one device I have lies when receiving a USB_REQ_GET_INTERFACE,
always returning 0 even if the alternate setting is different. This is
likely caused because in practice this control message is never used as
the operating system's usb stack knows which alternate setting it has
told the device to get into, and thus this ctrl message does not get
tested by device manufacturers.
When usb_fs_type == USB_FS_SYS, the active alt. setting can be read directly
from sysfs, which allows using this device through qemu's usb redirection.
More in general it seems a good idea to not send needless control msg's to
devices, esp. as the code in question is called every time a set_interface
is done. Which happens multiple times during virtual machine startup, and
when device drivers are activating the usb device.
Hans de Goede [Wed, 24 Nov 2010 11:50:00 +0000 (12:50 +0100)]
usb-linux: introduce a usb_linux_alt_setting function
The next patch in this series introduces multiple ways to get the
alt setting dependent upon usb_fs_type, it is cleaner to put this
into its own function.
Note that this patch also changes the assumed alt setting in case
of an error getting the alt setting to be 0 (a sane default) rather
then the interface numberwhich makes no sense.
We don't use qemu internals from spice server context any more.
Thus we don't also need to grab the iothread mutex from spice
server context. And we don't have to temporarely release the
lock to avoid deadlocks. Drop all the calls.
spice: don't call displaystate callbacks from spice server context.
This patch moves the displaystate callback calls for setting the cursor
and the mouse pointer from spice server to qemu (iothread) context.
This allows us to simplify locking.
spice: don't create updates in spice server context.
This patch moves the creation of spice screen updates from the spice
server context to qemu iothread context (display refresh timer to be
exact). This way we avoid accessing qemu internals (display surface)
from spice thread context which in turn allows us to simplify locking.
Jes Sorensen [Tue, 1 Feb 2011 14:53:23 +0000 (15:53 +0100)]
Make spice dummy functions inline to fix calls not checking return values
qemu_spice_set_passwd() and qemu_spice_set_pw_expire() dummy functions
needs to be inline, in order to handle the case where they are called
without checking the return value.
Amit Shah [Thu, 28 Apr 2011 14:34:41 +0000 (20:04 +0530)]
atapi: Explain why we need a 'media not present' state
After the re-org of the atapi code, it might not be intuitive for a
reader of the code to understand why we're inserting a 'media not
present' state between cd changes.
Kevin Wolf [Fri, 29 Apr 2011 08:58:12 +0000 (10:58 +0200)]
qemu-img resize: Fix option parsing
For shrinking images, you're supposed to use a negative size. However, the
leading minus makes getopt think that it's an option and so you get the help
text if you don't use -- like in 'qemu-img resize test.img -- -1G'.
This patch handles the size first and removes it from the argument list so that
getopt won't even try to interpret it and you don't need -- any more.
The dirty bitmap copied out to userspace is stored in a long array,
and gets copied out to userspace accordingly. This patch accounts
for that correctly. Currently I'm seeing kvm crashing due to writing
beyond the end of the alloc'd dirty bitmap memory, because the buffer
has the wrong size.
Signed-off-by: Bruce Rogers Signed-off-by: Marcelo Tosatti <[email protected]>
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr,
- buf = qemu_malloc((slots[i].len / 4096 + 7) / 8 + 2);
+ buf = qemu_malloc(BITMAP_SIZE(slots[i].len));
r = kvm_get_map(kvm, KVM_GET_DIRTY_LOG, i, buf);
BITMAP_SIZE is now open-coded in that function, like this:
The problem is that HOST_LONG_BITS in 32bit userspace is 32
but it's 64 in 64bit kernel. So userspace aligns this to
32, and kernel to 64, but since no length is passed from
userspace to kernel on ioctl, kernel uses its size calculation
and copies 4 extra bytes to userspace, corrupting memory.
Here's how it looks like during migrate execution:
(our is userspace size above, kern is the size as calculated
by the kernel).
Fix this by always aligning to 64 in a hope that no platform will
have sizeof(long)>8 any time soon, and add a comment describing it
all. It's a small price to pay for bad kernel design.
Alternatively it's possible to fix that in the kernel by using
different size calculation depending on the current process.
But this becomes quite ugly.
Special thanks goes to Stefan Hajnoczi for spotting the fundamental
cause of the issue, and to Alexander Graf for his support in #qemu.
Jan Kiszka [Tue, 12 Apr 2011 23:32:56 +0000 (01:32 +0200)]
kvm: Install specialized interrupt handler
KVM only requires to set the raised IRQ in CPUState and to kick the
receiving vcpu if it is remote. Installing a specialized handler allows
potential future changes to the TCG code path without risking KVM side
effects.
Glauber Costa [Thu, 17 Mar 2011 22:42:06 +0000 (19:42 -0300)]
kvm: add kvmclock to its second bit
We have two bits that can represent kvmclock in cpuid.
They signal the guest which msr set to use. When we tweak flags
involving this value - specially when we use "-", we have to act on both.
Jan Kiszka [Tue, 19 Apr 2011 11:06:06 +0000 (13:06 +0200)]
x86: Allow multiple cpu feature matches of lookup_feature
kvmclock is represented by two feature bits. Therefore, lookup_feature
needs to continue its search even after the first match. Enhance it
accordingly and switch to a bool return type at this chance.
Blue Swirl [Fri, 29 Apr 2011 20:01:51 +0000 (20:01 +0000)]
Merge branch 'patches' of git://qemu.weilnetz.de/git/qemu
* 'patches' of git://qemu.weilnetz.de/git/qemu:
qemu-timer: Fix timers for w32
qemu-timer: Avoid type casts
qemu-timer: Remove unneeded include statement (w32)
qemu-timer: Add and use new function qemu_timer_expired_ns
virtio-serial: Fix endianness bug in the config space
The virtio serial specification requres that the values in the config
space are encoded in native endian of the guest.
The qemu virtio-serial code did not do conversion to the guest endian
format what caused problems when host and guest use different format.
This patch corrects the qemu side, correctly doing host-native <->
guest-native conversions when accessing the config space. This won't
break any setups that aren't already broken, and fixes the case
of different host and guest endianness.
Hans de Goede [Thu, 24 Mar 2011 10:12:04 +0000 (11:12 +0100)]
spice-chardev: listen to frontend guest open / close
Note the vmc_register_interface() in spice_chr_write is left in place
in case someone uses spice-chardev with a frontend which does not have
guest open / close notification.
Hans de Goede [Thu, 24 Mar 2011 10:12:02 +0000 (11:12 +0100)]
chardev: Allow frontends to notify backends of guest open / close
Some frontends know when the guest has opened the "channel" and is actively
listening to it, for example virtio-serial. This patch adds 2 new qemu-chardev
functions which can be used by frontends to signal guest open / close, and
allows interested backends to listen to this.
target-arm: fix LDMIA bug on page boundary
target-arm: fix LDMIA bug on page boundary
When consecutive memory locations are on page boundary, a base register may be
loaded before page fault occurs. After page fault handling, it losts the memory
location information. To solve this problem, loading a base register has to put back.
Jan Kiszka [Sat, 9 Apr 2011 11:18:59 +0000 (13:18 +0200)]
ioapic: Do not set irr for masked edge IRQs
So far we set IRR for edge IRQs even if the pin is masked. If the guest
later on unmasks and switches the pin to level-triggered mode, irr will
remain set, causing an IRQ storm. The point is that setting IRR is not
correct in this case according to the spec, and avoiding this resolves
the issue.
Stefan Hajnoczi [Wed, 16 Mar 2011 08:31:43 +0000 (08:31 +0000)]
vl.c: Replace -virtfs string manipulation with QemuOpts
The -virtfs option creates an fsdev representing the pass-through file
system and a guest-visible virtio-9p-pci device that can access this
file system. This patch replaces the string manipulation used to build
and reparse option lists with direct QemuOpts calls. Removing the
string manipulation code makes it easier to maintain and less error
prone.
An error message is also updated to use "mount_tag" instead of
"mnt_tag".
v9fs_walk: As per 9p2000 RFC, MAXWELEM >= nwnames >= 0.
The nwnames field in TWALK message is assumed to be >=0 and <= MAXWELEM
which is defined as macro P9_MAXWELEM (16) in virtio-9p.h as per 9p2000
RFC. Appropriate changes are required in V9fsWalkState and v9fs_walk.
hw/virtio-9p-local.c: Remove unnecessary null char in symlink file
This patch removes the addition of null char in symlink file
which is being appended to file in case of mapped security model.
Without this patch, the extra null char causes LTP testcase lstat03
to fail and hence this fix is required.