aliguori [Thu, 29 Jan 2009 19:59:04 +0000 (19:59 +0000)]
check SCSI read/write requests against max LBA (Rik van Riel)
The bdrv layer uses a signed offset. Furthermore, block-raw-posix
only seeks when that offset is positive. Passing a negative offset
to block-raw-posix can result in data being written at the current
seek cursor's position.
It may be possible to exploit this to seek to the end of the disk
and extend the virtual disk by writing data to a negative sector
offset. After a reboot, this could lead to the guest having a
larger disk than it had before.
Close the hole by sanity checking the lba against the size of the
disk.
aliguori [Thu, 29 Jan 2009 17:02:17 +0000 (17:02 +0000)]
MTRR support on x86, part 2 (Carl-Daniel Hailfinger)
Load and save MTRR state together with machine state.
Add support for the MTRRcap MSR which is used by the latest Bochs BIOS
and some operating systems.
Fix a typo in ext2_feature_name.
With this patch, MTRR emulation should be good enough to not trigger any
sanity checks in well behaved BIOS/kernel code.
Some corner cases for BIOS/firmware usage remain to be implemented, but
that can be deferred to another patch.
Also, MTRR accesses on hardware not supporting MTRRs should cause #GP.
That can be enforced by another patch as well.
Currently when qemu_paio_read or qemu_paio_write return an error we call
qemu_aio_release without removing the request from the list.
I know that in the current implementation qemu_paio_write\read don't return
any error, but still the behavior is wrong, especially considering
that the implementation of these two functions is likely to change in is
the future.
This patch fixes the problem adding a raw_aio_remove function that
removes the callback from the queue and also calls qemu_aio_release.
raw_aio_remove is called by raw_aio_read, raw_aio_write and
raw_aio_cancel.
aliguori [Wed, 28 Jan 2009 21:58:29 +0000 (21:58 +0000)]
SCSI divide capacity by s->cluster_size (Rik van Riel)
Paul Brook pointed out that the number of sectors reported
by the SCSI read capacity commands needs to be divided by
s->cluster_size, because bdrv_get_geometry reports the number
of 512 byte sectors, while emulated CDROMs report 2048 byte
sectors back to the guest.
This has no consequences for emulated hard disks, which use
a cluster size of 1.
aliguori [Wed, 28 Jan 2009 21:58:25 +0000 (21:58 +0000)]
support >2TB SCSI disks (Rik van Riel)
Implement SCSI READ(16), WRITE(16) and SAI READ CAPACITY(16) commands,
so SCSI disks larger than 2TB can work with guests that support these
newer SCSI commands.
The cast to (uint64_t) is needed because otherwise gcc will use a
signed int, which gets sign extended into uint64_t lba, resulting
in bad block numbers for READ 10 and READ 16 with block numbers
larger than 2^31.
aliguori [Wed, 28 Jan 2009 21:58:22 +0000 (21:58 +0000)]
fix signed/unsigned overflows in SCSI disk (Rik van Riel)
Sector numbers can overflow on a virtual scsi disk of over 1TB
in size. Qemu's bdrv_read expects an int64_t, so fix the overflow
by going to that data type.
On large disks, we clip the capacity to 2TB instead of returning
"capacity modulo 2TB".
Turn sector_count into an unsigned to prevent a signed/unsigned
overflow with SCSI transfers larger than 2TB. We're unlikely to
ever hit this bug, but fixing it is just one line.
aliguori [Mon, 26 Jan 2009 20:32:22 +0000 (20:32 +0000)]
kvm-x86: Remove eflags conversion into emulator format (Jan Kiszka)
It seems that the conversion of the kernel-delivered eflags state into
qemu's internal split representation was once needed in an older kvm
design (register read-back may have taken place from inside cpu_exec).
Today it is plain wrong and causes incorrect cpu state reporting (gdb,
monitor) and should also corrupt its saving (savevm, migration). Drop
the related lines.
aliguori [Mon, 26 Jan 2009 20:26:49 +0000 (20:26 +0000)]
block-vpc: Adapt header structures to official documentation (Kevin Wolf)
The current definition of the VirtualPC headers is incomplete and partly
even wrong. This patch changes the header structs according to the
official VHD specification.
aliguori [Mon, 26 Jan 2009 20:26:46 +0000 (20:26 +0000)]
block-vpc: Split up struct vpc_subheader (Kevin Wolf)
struct vpc_subheader currently is a union of two completely different
data structures (the Hard Disk Footer and the Dynamic Disk Header).
That doesn't make too much sense, so split them up.
aliguori [Mon, 26 Jan 2009 19:54:36 +0000 (19:54 +0000)]
x86: Issue reset on triple faults (Jan Kiszka)
As discussed a few times on this list: A triple fault causes a system
reset on x86, and some guests make use of this (e.g. 386BSD). To keep
the chance of tracing unexpected resets, log them if CPU_LOG_RESET is
set.
There are three declared copyright holders in slirp that use the 4-clause
BSD license, the Regents of UC Berkley, Danny Gasparovski, and Kelly Price.
Below are the appropriate permissions to remove the advertise clause from slirp
from each party.
Special thanks go to Richard Fontana from Red Hat for contacting all of the
necessary authors to resolve this issue!
Regents of UC Berkley:
From ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change
July 22, 1999
To All Licensees, Distributors of Any Version of BSD:
As you know, certain of the Berkeley Software Distribution ("BSD") source
code files require that further distributions of products containing all or
portions of the software, acknowledge within their advertising materials
that such products contain software developed by UC Berkeley and its
contributors.
Specifically, the provision reads:
" * 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors."
Effective immediately, licensees and distributors are no longer required to
include the acknowledgement within advertising materials. Accordingly, the
foregoing paragraph of those BSD Unix files containing it is hereby deleted
in its entirety.
William Hoskins
Director, Office of Technology Licensing
University of California, Berkeley
Thanks for contacting me, Richard. I'm glad you were able to find
Dan, as I've been "keeping the light on" for Slirp. I have no use for
it now, and I have little time for it (now holding onto Keenspot's
Comic Genesis and having a regular US state government position). If
Dan would like to return to the project, I'd love to give it back to
him.
As for copyright, I don't own all of it. Dan does, so I will defer to
him. Any of my patches I will gladly license to the 3-part BSD
license. My interest in re-licensing was because we didn't have ready
info to contact Dan. If Dan would like to port Slirp back out of
QEMU, a lot of us 64-bit users would be grateful.
Feel free to share this email address with Dan. I will be glad to
effect a transfer of the project to him and Mr. Bellard of the QEMU
project.
aliguori [Mon, 26 Jan 2009 17:53:04 +0000 (17:53 +0000)]
MTRR support on x86 (Carl-Daniel Hailfinger)
The current codebase ignores MTRR (Memory Type Range Register)
configuration writes and reads because Qemu does not implement caching.
All BIOS/firmware in know of for x86 do implement a mode called
Cache-as-RAM (CAR) which locks down the CPU cache lines and uses the CPU
cache like RAM before RAM is enabled. Qemu assumes RAM is accessible
from the start, but it would be nice to be able to run real
BIOS/firmware in Qemu. For that, we need CAR support and for CAR support
we have to support MTRRs.
This patch is a first step in that direction. MTRRs are MSRs supported
by all recent x86 CPUs, even old i586. Besides influencing cache, the
MTRRs can be written and read back, so discarding MTRR writes violates
the expectations of existing code out there.
An added benefit of this patch is that it fixes the following Linux
kernel error message present in recent kernels (provided the BIOS has
the recent MTRR patches applied):
------------[ cut here ]------------
WARNING: at arch/x86/kernel/cpu/mtrr/main.c:1500 mtrr_trim_uncached_memory+0x382/0x384()
WARNING: strange, CPU MTRRs all blank?
Modules linked in:
Supported: Yes
Pid: 0, comm: swapper Not tainted 2.6.27.7-9-default #1
[<c0106570>] dump_trace+0x6b/0x249
[<c01070a5>] show_trace+0x20/0x39
[<c0343c02>] dump_stack+0x71/0x76
[<c012acb2>] warn_slowpath+0x6f/0x90
[<c0542f8f>] mtrr_trim_uncached_memory+0x382/0x384
[<c053f24d>] setup_arch+0x40d/0x639
[<c053a6ac>] start_kernel+0x6b/0x31f
=======================
---[ end trace 4eaa2a86a8e2da22 ]---
Handle common x86 MTRR reads and writes, but don't act on them.
aliguori [Mon, 26 Jan 2009 17:07:46 +0000 (17:07 +0000)]
build system: Further improve quiet mode (Jan Kiszka)
Derived from Stuart Brady's patch: Show the target directory as prefix
to the current module when building in quiet mode. This helps to gain
overview of the current build progress, specifically when running
parallelized builds.
Furthermore, suppress make command echoing when entering subdirs and
replace $(subst subdir-,,$@) with $* in the related rule.
aliguori [Mon, 26 Jan 2009 17:07:42 +0000 (17:07 +0000)]
Move definition of rgb_to_pixel_dup_table (Nathan Froyd)
This fixes the warning:
/scratch/froydnj/qemu.git/hw/vga.c:1515: warning: redundant redeclaration of 'rgb_to_pixel_dup_table'
/scratch/froydnj/qemu.git/hw/vga.c:1248: warning: previous declaration of 'rgb_to_pixel_dup_table' was here
aliguori [Mon, 26 Jan 2009 15:37:40 +0000 (15:37 +0000)]
Enabled building of x86_64 code on Mac OS X (Alexander Graf)
Mac OS X 10.5 supports 64-bit userspace on an x86_64 kernel and
by default uses 32-bit userspace applications, so the detection for
the host architecture fails.
This patch enabled building of x86_64 code on x86_64 capable CPUS
with Mac OS X.
aliguori [Mon, 26 Jan 2009 15:37:30 +0000 (15:37 +0000)]
vnc fixes and improvements (Stefano Stabellini)
this patch fixes a bug and improves the generic pixel conversion
function in vnc.c.
The bug is that when a new vnc client connects we need to reset the flag
has_WMVi but currently we don't.
The generic pixel conversion function is vnc_convert_pixel and currently
is not very efficient since uses the division and multiplication
operators.
To make it more efficient I changed to use bit shift operators instead.
aliguori [Mon, 26 Jan 2009 15:22:57 +0000 (15:22 +0000)]
Use the default subsystem vendor ID for virtio devices (Mark McLoughlin)
A subsystem vendor ID of zero isn't allowed, so we use our
default ID.
Gerd points out that although the PCI subsystem vendor ID is
treated by the guest as the virtio vendor ID:
/* we use the subsystem vendor/device id as the virtio vendor/device
* id. this allows us to use the same PCI vendor/device id for all
* virtio devices and to identify the particular virtio driver by
* the subsytem ids */
vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
vp_dev->vdev.id.device = pci_dev->subsystem_device;
it looks like only the device ID is used right now:
# grep virtio modules.alias
alias virtio:d00000001v* virtio_net
alias virtio:d00000002v* virtio_blk
alias virtio:d00000003v* virtio_console
alias virtio:d00000004v* virtio-rng
alias virtio:d00000005v* virtio_balloon
alias pci:v00001AF4d*sv*sd*bc*sc*i* virtio_pci
alias virtio:d00000009v* 9pnet_virtio
so setting the subsystem vendor id to something != zero shouldn't cause
trouble.
aurel32 [Mon, 26 Jan 2009 10:22:15 +0000 (10:22 +0000)]
target-ppc: always load kernel to KERNEL_LOAD_ADDR
Linux changed its physical address location in the elf header from
0xc0000000 to 0 on 2.6.25, causing later kernels to fail booting
with the -kernel option.
This patch assures that the lowest segment in the elf binary is loaded
to KERNEL_LOAD_ADDR, which is where the firmware expects it.
aurel32 [Sat, 24 Jan 2009 18:21:08 +0000 (18:21 +0000)]
sh4: sh_pci. Register resouces both at A7 and P4.
Add resource registration both for P4 and A7.
This is needed because of #5935 SH4: Eliminate P4 to A7 mangling.
Additionally, {reg,iop,mem}base which is no longer used are removed.
aurel32 [Sat, 24 Jan 2009 18:06:21 +0000 (18:06 +0000)]
Support epoch of 1980 in RTC emulation for MIPS Magnum
On the MIPS Magnum, the time that is held in the RTC's NVRAM should be
relative to midnight on 1980-01-01. This patch adds an extra parameter
to rtc_init(), allowing different epochs to be used. For the Magnum,
1980 is specified, and for all other machines, 2000 is specified.
I've not modified the handling of the century byte, as with an epoch of
1980 and a year of 2009, one could argue that it should hold either
0, 1, 19 or 20. NT 3.50 on MIPS does not read the century byte.
aurel32 [Sat, 24 Jan 2009 15:07:42 +0000 (15:07 +0000)]
target-ppc: Change core powerpc gdbstub bits to be XML-aware
Define GDB_CORE_XML and hack things similarly to ARM so that despite the
FP registers coming in between the GPRs and some status registers,
everything works out OK no matter which kind of GDB we're communicating
with.
It matters whether we're built to target 64-bit or 32-bit cores. I
think there are still problems if we are debugging 32-bit programs on a
built-for-64-bit QEMU (QEMU will always send 64-bit registers), but I
don't know if there's a good way around that at the time being.
aurel32 [Sat, 24 Jan 2009 15:07:34 +0000 (15:07 +0000)]
target-ppc: Add XML files for PowerPC registers
These files are nearly identical to the XML files provided with GDB.
The only difference is that power-{fpu,spe}.xml do not assign register
numbers; the internal QEMU machinery takes care of that.
Define gdb_xml_files for ppc targets in configure as well.
blueswir1 [Sat, 24 Jan 2009 12:09:52 +0000 (12:09 +0000)]
Floppy: Properly handle Sense Interrupt Status after FDC Reset
Original text below.
Attached is a patch that changes how the emulated floppy controller replies to Sense Interrupt Status commands immediately after a controller reset. The specs state that after a Reset the 82078 goes into polling mode which needs four Sense Interrupt Status commands to be issued afterwards to clear the status of each drive. Currently we always respond to Sense Interrupt Status with a SEEK END instead of POLLING. This causes a problem with the SCO Openserver installer which is expects a POLLING state after reset. This patch returns a POLLING status for four Sense Interrupt Status requests immediately after a controller reset. This approach mirrors the way Bochs handles this situation. With the attached patch applied Openserver gets further when trying to load storage drivers from the floppy disk (blocked by another issue, patch on its way). I have successfully tested the floppy drive on the following OSs after applying this patch: Windows 98, Windows XP SP2, Linux x86 (SysRescCD 1.1.3 and Ubuntu 8.10).
Justin
Changelog:
Properly handle Sense Interrupt Status after FDC Reset
malc [Fri, 23 Jan 2009 19:56:19 +0000 (19:56 +0000)]
fix endianness problem sharing the videoram buffer
[ The following text is in the "UTF-8" character set. ]
[ Your display is set for the "koi8-r" character set. ]
[ Some characters may be displayed incorrectly. ]
This patch fixes vga rendering when the guest endianness differs from
the host endianness: in this case we can only share the buffer if the
bpp is 32 and we must change the pixelformat accordingly.
aliguori [Thu, 22 Jan 2009 18:57:34 +0000 (18:57 +0000)]
qcow2 format: keep 'num_free_bytes', and show it upon 'info blockstats' (Uri Lublin)
'num_free_bytes' is the number of non-allocated bytes below highest-allocation.
It's useful, together with the highest-allocation, to figure out how
fragmented the image is, and how likely it will run out-of-space soon.
For example when the highest allocation is high (almost end-of-disk), but
many bytes (clusters) are free, and can be re-allocated when neeeded, than
we know it's probably not going to reach end-of-disk-space soon.
Added bookkeeping to block-qcow2.c
Export it using BlockDeviceInfo
Show it upon 'info blockstats' if BlockDeviceInfo exists
We want to know the highest written offset for qcow2 images.
This gives a pretty good (and easy to calculate) estimation to how
much more allocation can be done for the block device.
It can be usefull for allocating more diskspace for that image
(if possible, e.g. lvm) before we run out-of-disk-space
aliguori [Thu, 22 Jan 2009 17:15:21 +0000 (17:15 +0000)]
install man-pages as non-executables (Andre Przywara)
make install-doc omits an explicit permission mask for the man-pages. This
defaults to have the executable bits set. Adding "-m 644" (for rw-r--r--)
fixes that.
aliguori [Thu, 22 Jan 2009 16:59:24 +0000 (16:59 +0000)]
Vectored block device API (Avi Kivity)
Most devices that are capable of DMA are also capable of scatter-gather.
With the memory mapping API, this means that the device code needs to be
able to access discontiguous host memory regions.
For block devices, this translates to vectored I/O. This patch implements
an aynchronous vectored interface for the qemu block devices. At the moment
all I/O is bounced and submitted through the non-vectored API; in the future
we will convert block devices to natively support vectored I/O wherever
possible.
aliguori [Thu, 22 Jan 2009 16:59:20 +0000 (16:59 +0000)]
I/O vector helpers (Avi Kivity)
In general, it is not possible to predict the size of of an I/O vector since
a contiguous guest region may map to a disconiguous host region. Add some
helpers to manage I/O vector growth.
aliguori [Thu, 22 Jan 2009 16:59:16 +0000 (16:59 +0000)]
Add map client retry notification (Avi Kivity)
The target memory mapping API may fail if the bounce buffer resources
are exhausted. Add a notification mechanism to allow clients to retry
the mapping operation when resources become available again.
aliguori [Thu, 22 Jan 2009 16:59:11 +0000 (16:59 +0000)]
Add target memory mapping API (Avi Kivity)
Devices accessing large amounts of memory (as with DMA) will wish to obtain
a pointer to guest memory rather than access it indirectly via
cpu_physical_memory_rw(). Add a new API to convert target addresses to
host pointers.
In case the target address does not correspond to RAM, a bounce buffer is
allocated. To prevent the guest from causing the host to allocate unbounded
amounts of bounce buffer, this memory is limited (currently to one page).
aliguori [Wed, 21 Jan 2009 19:18:00 +0000 (19:18 +0000)]
re-fix screendump (Stefano Stabellini)
Removing the assumption about a single graphic console made
get_graphic_console return NULL when called by vga_screen_dump.
In this case returning NULL is correct but since NULL is not handled in
qemu_console_resize it causes a segmentation fault.
Just returning immediately from qemu_console_resize is sufficient to fix the
problem.
aliguori [Wed, 21 Jan 2009 18:59:12 +0000 (18:59 +0000)]
fix curses interface (Stefano Stabellini)
Hi all,
this patch fixes the curses interface: when we switch from one console
to another we need to change the displaystate width and height even
though in the curses case the backing buffer remains of the same size.
I am also putting back the call to text_console_resize in
text_console_invalidate so that resizeable text consoles can be properly
handled.
aliguori [Wed, 21 Jan 2009 18:59:04 +0000 (18:59 +0000)]
Stop VM on ENOSPC error. (Gleb Natapov)
This version of the patch adds new option "werror" to -drive flag.
Possible values are:
report - report errors to a guest as IO errors
ignore - continue as if nothing happened
stop - stop VM on any error and retry last command on resume
enospc - stop vm on ENOSPC error and retry last command on resume
all other errors are reported to a guest.
Default is "report" to maintain current behaviour.
aliguori [Wed, 21 Jan 2009 18:58:51 +0000 (18:58 +0000)]
Adds null check for DisplayStatus (Stefano Stabellini)
Allocate a DisplaySurface in dumb_display_init if none else does it.
The DisplaySurface will be used for the qemu monitor, serial and
parallel ports, etc.
aliguori [Wed, 21 Jan 2009 18:31:35 +0000 (18:31 +0000)]
cirrus: cleanup reset handler (Jan Kiszka)
We should not re-register the cirrus io-memory regions on each reset.
Moreover, this patch removes some dead code and pushes other static
field initializations from reset to init_common.
aliguori [Wed, 21 Jan 2009 18:31:16 +0000 (18:31 +0000)]
cirrus: stop dirty logging during remaps (Jan Kiszka)
Cleaned-up port from kvm-userspace: We have to stop any vram logging
while doing remaps. Otherwise the logger gets confused. This reward is
enormously accelerated cirrus vga in kvm mode.
aliguori [Wed, 21 Jan 2009 18:13:09 +0000 (18:13 +0000)]
Make make output quieter (Avi Kivity)
Spew out less noise when compiling. This helps review make output for
information such as compilation warnings, rather than extra long compiler
invocations.
The full output can be generated by supplying a 'V=1' parameter to make.