Anthony Liguori [Sun, 4 Dec 2011 20:37:06 +0000 (14:37 -0600)]
qdev: add class_init to DeviceInfo
Since we are still dynamically creating TypeInfo, we need to chain the
class_init function in order to be able to make use of it within subclasses of
TYPE_DEVICE.
This will disappear once we register TypeInfos directly.
Anthony Liguori [Thu, 15 Dec 2011 20:40:29 +0000 (14:40 -0600)]
qdev: add a interface to register subclasses
In order to introduce inheritance while still using the qdev registration
interfaces, we need to be able to use a parent other than TYPE_DEVICE. Add a
new interface that allows this.
Anthony Liguori [Sun, 4 Dec 2011 17:08:36 +0000 (11:08 -0600)]
qdev: move qdev->info to class
Right now, DeviceInfo acts as the class for qdev. In order to switch to a
proper ObjectClass derivative, we need to ween all of the callers off of
interacting directly with the info pointer.
Anthony Liguori [Fri, 16 Dec 2011 20:34:46 +0000 (14:34 -0600)]
qdev: integrate with QEMU Object Model (v2)
This is a very shallow integration. We register a TYPE_DEVICE but only use
QOM as basically a memory allocator. This will make all devices show up as
QOM objects but they will all carry the TYPE_DEVICE.
Signed-off-by: Anthony Liguori <[email protected]>
---
v1 -> v2
- update for new location of object.h
Anthony Liguori [Sat, 3 Dec 2011 23:10:08 +0000 (17:10 -0600)]
qom: add the base Object class (v2)
This class provides the main building block for QEMU Object Model and is
extensively documented in the header file. It is largely inspired by GObject.
Signed-off-by: Anthony Liguori <[email protected]>
---
v1 -> v2
- remove printf() in type registration
- fix typo in comment (Paolo)
- make Interface private
- move object into a new directory and move header into include/qemu/
- don't make object.h depend on qemu-common.h
- remove Type and replace it with TypeImpl * (Paolo)
- use hash table to store types (Paolo)
- aggressively cache parent type (Paolo)
- make a type_register and use it with interfaces (Paolo)
- fix interface cast comment (Paolo)
- add a few more functions required in later series
Anthony Liguori [Fri, 27 Jan 2012 15:00:03 +0000 (09:00 -0600)]
Merge remote-tracking branch 'pmaydell/arm-devs.for-upstream' into staging
* pmaydell/arm-devs.for-upstream:
arm: SoC model for Calxeda Highbank
arm_boot: support board IDs more than 16 bits wide
arm: add secondary cpu boot callbacks to arm_boot.c
ahci: add support for non-PCI based controllers
Add xgmac ethernet model
Thomas Higdon [Tue, 24 Jan 2012 17:19:44 +0000 (12:19 -0500)]
scsi: Guard against buflen exceeding req->cmd.xfer in scsi_disk_emulate_command
Limit the return value (corresponding to the length of the buffer to be
DMAed back to the intiator) to the value in req->cmd.xfer, which is the
amount of data that the initiator expects. Eliminate now-duplicate code
that does this guarding in the functions for individual commands.
Without this, the SCRIPTS code in the emulated LSI device eventually
raises a DMA interrupt for a data overrun when an INQUIRY command whose
buflen exceeds req->cmd.xfer is processed. It's the responsibility of
the client to provide a request buffer and allocation length that are
large enough for the result of the command.
Li Zhi Hui [Mon, 21 Nov 2011 07:40:39 +0000 (15:40 +0800)]
qcow: Use bdrv functions to replace file operation
Since common file operation functions lack of error detection and use
much more I/O syscalls, so change them to bdrv series functions and
reduce I/O request.
Stefan Weil [Sat, 21 Jan 2012 12:54:24 +0000 (13:54 +0100)]
block/vdi: Zero unused parts when allocating a new block (fix #919242)
The new block was filled with zero when it was allocated by g_malloc0,
but when it was reused later and only partially used, data from the
previously allocated block were still present and written to the new
block.
This caused the problems reported by bug #919242
(https://bugs.launchpad.net/qemu/+bug/919242).
Now the unused parts of the new block which are before and after the data
are always filled with zero, so it is no longer necessary to zero the whole
block with g_malloc0.
There already exists a virtio_blk_handle_write trace event as well as
completion events. Add the virtio_blk_handle_read event so it's easy to
trace virtio-blk requests for both read and write operations.
Stefan Hajnoczi [Wed, 18 Jan 2012 14:40:50 +0000 (14:40 +0000)]
blockdev: make image streaming safe across hotplug
Unplugging a storage interface like virtio-blk causes the host block
device to be deleted too. Long-running operations like block migration
must take a DriveInfo reference to prevent the BlockDriverState from
being freed. For image streaming we can do the same thing.
Note that it is not possible to acquire/release the drive reference in
block.c where the block job functions live because
drive_get_ref()/drive_put_ref() are blockdev.c functions. Calling them
from block.c would be a layering violation - tools like qemu-img don't
even link against blockdev.c.
Stefan Hajnoczi [Wed, 18 Jan 2012 14:40:48 +0000 (14:40 +0000)]
qmp: add block_job_cancel command
Add block_job_cancel, which stops an active block streaming operation.
When the operation has been cancelled the new BLOCK_JOB_CANCELLED event
is emitted.
Stefan Hajnoczi [Wed, 18 Jan 2012 14:40:46 +0000 (14:40 +0000)]
qmp: add block_stream command
Add the block_stream command, which starts copy backing file contents
into the image file. Also add the BLOCK_JOB_COMPLETED QMP event which
is emitted when image streaming completes. Later patches add control
over the background copy speed, cancelation, and querying running
streaming operations.
Peter Maydell [Thu, 26 Jan 2012 11:43:48 +0000 (11:43 +0000)]
arm_boot: support board IDs more than 16 bits wide
Support passing a board ID value to the kernel in r1
that is more than 16 bits wide. This is needed to pass
the '-1 == invalid' value for boards which only support
device tree booting.
Mark Langsdorf [Thu, 26 Jan 2012 11:43:48 +0000 (11:43 +0000)]
arm: add secondary cpu boot callbacks to arm_boot.c
Create two functions, write_secondary_boot() and secondary_cpu_reset_hook(),
to allow platforms more control of how secondary CPUs are brought up. The
new functions default to NULL and aren't called unless they are populated
so there are no changes to existing platform models.
Stefan Hajnoczi [Wed, 18 Jan 2012 14:40:45 +0000 (14:40 +0000)]
block: rate-limit streaming operations
This patch implements rate-limiting for image streaming. If we've
exceeded the bandwidth quota for a 100 ms time slice we sleep the
coroutine until the next slice begins.
Stefan Hajnoczi [Wed, 18 Jan 2012 14:40:42 +0000 (14:40 +0000)]
block: make copy-on-read a per-request flag
Previously copy-on-read could only be enabled for all requests to a
block device. This means requests coming from the guest as well as
QEMU's internal requests would perform copy-on-read when enabled.
For image streaming we want to support finer-grained behavior than just
populating the image file from its backing image. Image streaming
supports partial streaming where a common backing image is preserved.
In this case guest requests should not perform copy-on-read because they
would indiscriminately copy data which should be left in a backing image
from the backing chain.
Introduce a per-request flag for copy-on-read so that a block device can
process both regular and copy-on-read requests. Overlapping reads and
writes still need to be serialized for correctness when copy-on-read is
happening, so add an in-flight reference count to track this.
Stefan Hajnoczi [Wed, 18 Jan 2012 14:40:41 +0000 (14:40 +0000)]
block: check bdrv_in_use() before blockdev operations
Long-running block operations like block migration and image streaming
must have continual access to their block device. It is not safe to
perform operations like hotplug, eject, change, resize, commit, or
external snapshot while a long-running operation is in progress.
This patch adds the missing bdrv_in_use() checks so that block migration
and image streaming never have the rug pulled out from underneath them.
Stefan Hajnoczi [Mon, 16 Jan 2012 09:28:06 +0000 (09:28 +0000)]
block: replace unchecked strdup/malloc/calloc with glib
Most of the codebase as been converted to use glib memory allocation
functions. There are still a few instances of malloc/calloc in the
block layer and qemu-io. Replace them, especially since they do not
check the strdup/malloc/calloc return value.
Anthony Liguori [Mon, 23 Jan 2012 17:00:26 +0000 (11:00 -0600)]
Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
kvm: Activate in-kernel irqchip support
kvm: x86: Add user space part for in-kernel IOAPIC
kvm: x86: Add user space part for in-kernel i8259
kvm: x86: Add user space part for in-kernel APIC
kvm: x86: Establish IRQ0 override control
kvm: Introduce core services for in-kernel irqchip support
memory: Introduce memory_region_init_reservation
ioapic: Factor out base class for KVM reuse
ioapic: Drop post-load irr initialization
i8259: Factor out base class for KVM reuse
i8259: Completely privatize PicState
apic: Open-code timer save/restore
apic: Factor out base class for KVM reuse
apic: Introduce apic_report_irq_delivered
apic: Inject external NMI events via LINT1
apic: Stop timer on reset
kvm: Move kvmclock into hw/kvm folder
msi: Generalize msix_supported to msi_supported
hyper-v: initialize Hyper-V CPUID leaves.
hyper-v: introduce Hyper-V support infrastructure.
Currently on the pseries machine the SLOF firmware is used normally,
but we bypass it when -kernel is specified. Having these two
different boot paths can cause some confusion.
In particular at present we need to "probe" the (emulated) PCI bus and
produce device tree nodes for the PCI devices in qemu, for the -kernel
case. In the SLOF case, it takes the device tree from qemu adds some
stuff to it then passes it on to the kernel.
It's been decided that a better approach is to always boot through
SLOF, even when using -kernel. WIth this approach we can leave PCI
probing and device node creation to SLOF in all cases which removes a
bunch of code in qemu, and avoids iterating the PCI devices from the
machine specific init code which we're not supposed to do.
This patch changes qemu to always boot through SLOF, and not to create
PCI nodes. Simultaneously it updates the included version of SLOF
(submodule and binary image) to one which supports (and requires) the
new approach.
The new SLOF version also includes a number of unrelated enhancements:
support for booting from virtio-pci devices and e1000, greatly
improved FCode support and many bugfixes. It also makes SLOF ready to
be used even when specifying a kernel on the qemu command line.
David Gibson [Wed, 11 Jan 2012 19:46:26 +0000 (19:46 +0000)]
pseries: Use correct dispatcher for PCI config space accesses
The pseries machine expects a para-virtualized guest and so supplies RTAS
functions (via a hypercall) for performing PCI config space access.
Currently the implementation of these calls into
pci_default_{read,write}_config(). However this would be incorrect for
any PCI device which overrides the default config read/write functions.
AFAICT there's only one such device today, but we should still get it
right. In addition the pci_host_config_{read,write}_common() functions
which do correctly do this dispatch, perform bounds checking on the config
space address, lack of which currently leads to an exploitable bug.
pseries: Support PCI extended config space in RTAS calls
On the pseries machine (which expexts a paravirtualized guest), guest
access to PCI config space is via host-provided RTAS functions. This
patch extends these RTAS functions to permit access to PCI extended
config space, as specified in PAPR.
David Gibson [Wed, 11 Jan 2012 19:46:24 +0000 (19:46 +0000)]
Correct types in bmdma_addr_{read,write}
Back when I made patches introducing dma_addr_t and various PCI DMA
wrapper functions, I made a mistake. The bmdma_addr_{read,write} functions
need to take target_phys_addr_t not dma_addr_t, since they are assigned
to MemoryRegionOps callbacks.
Fix dirty logging with 32-bit qemu & 64-bit guests
The kvm_get_dirty_pages_log_range() function uses two address
variables to step through the monitored memory region to update the
dirty log. However, these variables have type unsigned long, which
can overflow if running a 64-bit guest with a 32-bit qemu binary.
This patch changes these to target_phys_addr_t which will have the
correct size.
load_image_targphys() gets passed a max size for the file, but doesn't
enforce it at all. Add a check and return -1 (error) if the file is
too big, without loading it. Fix the bracing style in the function
while we're at it.
Alexander Graf [Tue, 10 Jan 2012 22:33:10 +0000 (23:33 +0100)]
virtio: change memcpy to guest reads
When accessing the device specific virtio config space, we memcpy
the data into a variable in QEMU. At that point we're basically
pulling host endianness into the game which is a really bad idea.
So instead, let's use the target specific load/store helpers for
memory pointers which fetch things in target endianness. The whole
array is already populated in target endianness anyways
(see virtio-blk).
The virtio config area in PIO space is a bit special. The initial
header is little endian but the rest (device specific) is guest
native endian.
The PIO accessors for PCI on machines that don't have native IO ports
assume that all PIO is little endian, which works fine for everything
except the above.
A complicated way to fix it would be to split the BAR into two memory
regions with different endianess settings, but this isn't practical
to do, besides, the PIO code doesn't honor region endianness anyway
(I have a patch for that too but it isn't necessary at this stage).
So I decided to go for the quick fix instead which consists of
reverting the swap in virtio-pci in selected places, hoping that when
we eventually do a "v2" of the virtio protocols, we sort that out once
and for all using a fixed endian setting for everything.
Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
[agraf: keep virtio in libhw and determine endianness through a
helper function in exec.c] Reviewed-by: Anthony Liguori <[email protected]>
Alexander Graf [Tue, 10 Jan 2012 18:39:38 +0000 (19:39 +0100)]
PPC: Bamboo: fold ppc440.c and ppc440_bamboo.c into a single file
The separation of ppc440 and ppc440_bamboo makes some sense, since ppc440
is the SoC while ppc440_bamboo is the actual board. But the separation
makes things harder for us for no good reason, so let's just fold them
in together with each other.
Alexander Graf [Tue, 10 Jan 2012 18:36:26 +0000 (19:36 +0100)]
PPC: 4xx: Qdevify the 440 PCI host controller
Due to popular demand, this qdevifies the PCI host controller of 4xx SoCs
the same way as e500.
We have to introduce a small stub function for pci init that will be
removed in a later patch, once we qdev'ified the board, to keep the build
working.
Alexander Graf [Tue, 3 Jan 2012 18:15:16 +0000 (19:15 +0100)]
PPC: 440: Ignore invalid PCI IRQs
When running a 440 target, we currently get invalid irq_num values (-1)
which completely confuse the IRQ setting code.
This is most likely due to the missing qdev conversion.
While this shouldn't happen in the first place and should really rather
be fixed by converting the target, I dislike segfaults. So for now, let's
just print a warning and ignore invalid irq_num values.
Alexander Graf [Tue, 3 Jan 2012 18:12:47 +0000 (19:12 +0100)]
PPC: Bamboo: Set initial TLB entry
Back in the day when the bamboo target got introduced, the initial TLB was
dictated by KVM. TCG has been missing initial TLB values ever since, rendering
the target unusable for TCG usage.
This patch adds linear TLB maps the way Linux expects them, making the target
work.
Alexander Graf [Tue, 3 Jan 2012 18:10:02 +0000 (19:10 +0100)]
PPC: Bamboo: Register CPU reset
To be able to support CPU reset, we need to put all register initialization
and initial state into a CPU reset hook instead of a function that is only
called once on bootup.
This is a preparation step for the initial TLB setting code and brings bamboo
more in line with what e500 and virtex already do.
Andreas Färber [Mon, 9 Jan 2012 01:04:05 +0000 (02:04 +0100)]
prep: Use i82378 PCI->ISA bridge for 'prep' machine
Speaker I/O, ISA bus, i8259 PIC, RTC and DMA are no longer set up
individually by the machine. Effectively, no-op speaker I/O is replaced
by pcspk; PIT and i82374 DMA are introduced.
Signed-off-by: Hervé Poussineau <[email protected]>
Remove related dead, alternative code.
Wire up PCI host bridge IRQs via GPIO-in IRQs of PCI->ISA bridge.
Andreas Färber [Sat, 25 Dec 2010 05:01:41 +0000 (06:01 +0100)]
prep: Add i82378 PCI-to-ISA bridge emulation
Prepare Intel 82378 emulation for use by PReP platforms.
Signed-off-by: Hervé Poussineau <[email protected]>
Create ISA bus in this device (suggested by Markus).
Rebase onto Memory API, mark memory ops as Little Endian.
Add VMState. Provide access to i8259 IRQs via qdev GPIOs.
Andreas Färber [Tue, 3 Jan 2012 01:42:46 +0000 (02:42 +0100)]
prep: qdev'ify Raven host bridge (SysBus)
Drop pci_prep_init() in favor of extended device state. Inspired by
patches from Hervé and Alex.
Assign the 4 IRQs from the board after device instantiation. This moves
the knowledge out of prep_pci and allows for future machines with
different IRQ wiring (IBM 40P). Suggested by Alex.
Andreas Färber [Fri, 13 Jan 2012 17:03:48 +0000 (18:03 +0100)]
prep: Use ISA m48t59
This simplifies the code later when the i8259 moves to the i82378
PCI->ISA bridge and happens to fix a SysBus m48t59 io_base issue
introduced by commit 0fb56ffc5edd66f12ccfc0d71af5f9c79c0a2612 (m48t59:
drop obsolete address base arithmetic). Suggested by Hervé and Jan.
The BIOS MemoryRegion is created with a fixed size of 1 MiB.
Ensure that the full size can be accessed since the exception
vectors are located at 0xfff00000 and the BIOS may want to use them.
It thereby no longer depends on the actual BIOS binary size.