aliguori [Sat, 13 Dec 2008 20:41:58 +0000 (20:41 +0000)]
Correctly initialize msr list in KVM
I believe this was spotted by Gerd Hoffman but I can't find his patch
now. This will cause very subtle corruption on the heap because we
don't allocate the appropriately sized buffer.
aurel32 [Sat, 13 Dec 2008 18:57:21 +0000 (18:57 +0000)]
TCG x86/x86-64: use move with zero-extend for loads/stores
Starting with version 4.3, gcc returns the result of a function in
rax/eax/ax/al instead of rax/eax, depending of the return type. As
a consequence we should use a zero extend moe in TCG loads/stores.
See http://gcc.gnu.org/ml/gcc/2008-01/msg00052.html for more details.
A big thanks to malc who founds the problem and wrote the x86 patch.
aurel32 [Sat, 13 Dec 2008 12:33:02 +0000 (12:33 +0000)]
target-i386: SVM: acknowledge interrupt only after it is taken
SVM specifies that the V_IRQ mask is only to be removed, if the
interrupt that is to be delivered actually is delivered.
As of the SVM rewrite, this mask is always unmasked when the main cpu
loop is processed, leaving a corner case where calling the interrupt
handler causes a #PF. In that case (booting Linux / starting gfxboot)
the current implementation tells the VMM the interrupt is taken, even
though it is not.
This patch modifies the VIRQ unmasking to occur after do_interrupt,
making gfxboot work again.
blueswir1 [Sat, 13 Dec 2008 11:49:17 +0000 (11:49 +0000)]
x86 cleanup
Remove some unnecessary includes, add needed includes, move prototypes to
cpu.h to suppress missing prototype warnings.
Remove unused functions and prototypes (cpu_x86_flush_tlb, cpu_lock,
cpu_unlock, restore_native_fp_state, save_native_fp_state).
Make some functions and data static (f15rk, parity_table, rclw_table,
rclb_table, raise_interrupt, fpu_raise_exception), they are not used
outside op_helper.c anymore.
Make some x86_64 and user only code conditional to avoid warnings.
Document where each function is implemented in cpu.h and exec.h.
blueswir1 [Sat, 13 Dec 2008 08:16:43 +0000 (08:16 +0000)]
Fix TARGET_LONG_BITS warning in TCG
Looking at tcg/tcg.c:828, the bug that the warning indicated would show up as
incorrect PC shown in log, only on 32 bit big endian host emulating a 64 bit
target, -d op flag enabled. Now that dyngen is gone, the patch can be applied.
aliguori [Fri, 12 Dec 2008 20:02:52 +0000 (20:02 +0000)]
Make sure to link librt if we need to.
This is really a stop-gap. The recent thread pool changes uncovered a
deeper issue with how we use librt. We really should be probing for
timer_create and then conditionally enabling that code.
aliguori [Fri, 12 Dec 2008 16:41:40 +0000 (16:41 +0000)]
Replace posix-aio with custom thread pool
glibc implements posix-aio as a thread pool and imposes a number of limitations.
1) it limits one request per-file descriptor. we hack around this by dup()'ing
file descriptors which is hideously ugly
2) it's impossible to add new interfaces and we need a vectored read/write
operation to properly support a zero-copy API.
What has been suggested to me by glibc folks, is to implement whatever new
interfaces we want and then it can eventually be proposed for standardization.
This requires that we implement our own posix-aio implementation though.
This patch implements posix-aio using pthreads. It immediately eliminates the
need for fd pooling.
It performs at least as well as the current posix-aio code (in some
circumstances, even better).
aliguori [Thu, 11 Dec 2008 21:20:03 +0000 (21:20 +0000)]
pci: virtio: use pci id defines (Gerd Hoffman)
Use the defines added by the previous patch in the virtio drivers.
Also remove the pointless vendor and device args from the
virtio_blk_init() function.
aliguori [Thu, 11 Dec 2008 21:15:42 +0000 (21:15 +0000)]
pci: add default pci subsystem id for all devices (Gerd Hoffman)
This sets a default PCI subsystem ID for all emulated PCI devices. PCI
specs require this, so do it.
In many cases it is enougth to know the PCI ID to handle a device
correctly. Sometimes a device driver must identify the exact piece of
hardware (via PCI Subsystem ID) though.
What does this patch to qemu devices:
Right now the emulated PCI devices have no PCI subsystem ID, only the
PCI ID. The discussed patch sets a default PCI subsystem ID for all
emulated devices. Which will make the qemu devices look pretty much
like in the laptop case: all PCI subsystem IDs will point to qemu by
default.
If a driver emulates a very specific piece of hardware where it has to
emulate more than just the PCI chip, it can overwrite the PCI subsystem
ID without problems. The es1370 driver does that for example.
aliguori [Thu, 11 Dec 2008 21:06:49 +0000 (21:06 +0000)]
Fix handling of disk-only snapshots (Kevin Wolf)
When creating a snapshot with multiple qcow2 disks attached, the current
behaviour is that qemu creates a disk snapshot on all of them and
chooses one to write the VM state to.
Despite having the state only in one image, loadvm tries to restore the
VM state from the middle of nowhere if you run qemu a second time with
only one of the other images attached. In the lucky case it will fail
because there simply is no state, but it also can happen that it loads
the state of a different snapshot (the one this new one is based upon).
The fix is to write a zero VM state size to the images which don't
contain the state, and check this in loadvm.
I agree that you probably have to provoke such things intentionally to
get in a state like this with qemu itself. However, with my second patch
that adds snapshot support to qemu-img it could become a reasonable use
case to have snapshots with and without VM states on the same image.
blueswir1 [Thu, 11 Dec 2008 17:30:50 +0000 (17:30 +0000)]
Allow to register a callback with fw_cfg_add_callback()
fw_cfg_add_callback() checks if key has FW_CFG_WRITE_CHANNEL bit set
after masking the key with FW_CFG_ENTRY_MASK.
But as FW_CFG_ENTRY_MASK is ~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL),
the bit is never set and function exits.
This patch corrects this by checking the bit before masking the value.
aurel32 [Wed, 10 Dec 2008 17:31:51 +0000 (17:31 +0000)]
target-sh4: Add SH bit handling to TLB
This patch adds SH bit handling to sh4's TLB, which is a part of MMU
functionality that had not been implemented in qemu.
Additionally, increment_urc() call in cpu_load_tlb() is deleted, because
the specification explicitly says that URC is not incremented by an LDTLB
instruction (at Section 3 of SH7751 Hardware manual(REJ09B0370-0400)).
Even though URC is not needed to be strictly same as HW because it is a
random number, this condition is not negligible.
This patch adds support for 64-bit Block Move instructions. There are multiple
modes for 64-bit Block moves, direct, indirect, and table indirect. This patch
implements Direct and Table indirect moves which are needed by 64-bit windows
and SYM_CONF_DMA_ADDRESSING_MODE=2 for the Linux sym53c8xx_2 driver respectively.
Two helper functions are included to check which mode the guest is using. For
64-bit direct moves, we fetch a 3rd DWORD and store the value in the DBMS
register. For Table Indirect moves, we look into the table for which register
contains the upper 32-bits of the 64-bit address. This selector value indicates
which register to pull the value from and into dnad64 register.
Finally, lsi_do_dma is updated to use the approriate register to build a 64-bit
DMA address if required.
With this patch, Windows XP x64, 2003 SP2 x64, can now install to scsi devices.
Linux SYM_CONF_DMA_ADDRESSING_MODE=2 need a quirk fixup in Patch 4 to function
properly.
monitor_readline expects buf_size to include the terminating \0, but
do_change_vnc in monitor.c calls it as though it doesn't. The other site
where monitor_readline reads a password (in vl.c) passes the buffer
length
correctly.
aurel32 [Wed, 10 Dec 2008 15:02:16 +0000 (15:02 +0000)]
target-i386: Fix jmp im on x86_64 when executing 32-bit code
When running grub-install (32-bit) on an x86_64 Linux system in qemu, it
hangs on a pagefault forever, because an integer overflow occurs on the
IP on "jmp im". This patch masks overflows for 32 bit IPs on a 64 bit
system, just like it is done for 16 bit IPs already.
Using this patch, x86_64 openSUSE installation works again.
aurel32 [Wed, 10 Dec 2008 15:02:07 +0000 (15:02 +0000)]
MIPS Magnum: fix memory-mapped i8042
Current implementation of memory-mapped i8042 controller is atm
implemented with an interface shift (it_shift) parameter, like most all
memory-mapped devices in Qemu.
However, this isn't suitable for MIPS Magnum, where i8042 controller is at
0x80005000 up to 0x80005fff.
Thomas Bogendoerfer (from #mipslinux) tested the behaviour of a real
machine, and found that odd addresses are for status/command register, and
even addresses for data register.
Attached patch implements this behaviour by replacing the it_shift
parameter by a mask one.
Incidentally, keyboard now works on OpenBSD 2.3, which accesses i8042
controller at 0x80005060 and 0x80005061.
aliguori [Tue, 9 Dec 2008 20:09:57 +0000 (20:09 +0000)]
KVM: Coalesced MMIO support
MMIO exits are more expensive in KVM or Xen than in QEMU because they
involve, at least, privilege transitions. However, MMIO write
operations can be effectively batched if those writes do not have side
effects.
Good examples of this include VGA pixel operations when in a planar
mode. As it turns out, we can get a nice boost in other areas too.
Laurent mentioned a 9.7% performance boost in iperf with the coalesced
MMIO changes for the e1000 when he originally posted this work for KVM.
aliguori [Tue, 9 Dec 2008 19:59:09 +0000 (19:59 +0000)]
Disable KVM support if the kernel modules have broken memory slot handling
Prior to kvm-80, memory slot deletion was broken in the KVM kernel
modules. In kvm-81, a new capability is introduced to signify that this
problem has been fixed.
Since we rely on being able to delete memory slots, refuse to work with
any kernel module that does not have this capability present.
aurel32 [Mon, 8 Dec 2008 18:12:26 +0000 (18:12 +0000)]
linux-user: Fix h2g usage in page_find_alloc
Paul's comment on my first approach to fix the h2g usage in
page_find_alloc finally open my eyes about what the code is actually
supposed to do:
With the help of h2g_valid we can no cleanly check if a freshly allocate
page (for host usage) is guest-reachable and, in case it is, mark it
reserved in the guest's address range.
aurel32 [Mon, 8 Dec 2008 18:12:11 +0000 (18:12 +0000)]
linux-user: Safety belt for h2g
h2g can only work on 64-bit hosts if the provided address is mappable to
the guest range. Neglecting this was already the source for several
bugs. Instrument the macro so that it will trigger earlier in the
future (at least as long as we have this kind of mapping mechanism).
aurel32 [Mon, 8 Dec 2008 18:11:21 +0000 (18:11 +0000)]
target-ppc: memory load/store rework
Rework the memory load/store:
- Unify load/store functions for 32-bit and 64-bit CPU
- Don't swap values twice for bit-reverse load/store functions
in little endian mode.
- On a 64-bit CPU in 32-bit mode, do the address truncation for
address computation instead of every load store. Truncate the
address when incrementing the address (if needed)
- Cache writes to access_types.
- Add a few missing calls to gen_set_access_type()