s390: Rework kernel loading: supports elf and newer kernels
This reworks the image loading on s390.
Newer kernels will not always have a 0dd0 (basr 13,0) at address 0x10000.
We must not rely on specific code at certain addresses. This check was
introduced to warn users that tried to load vmlinux, since ELF loading
was not supported. Let's wire that up. If ELF loading fails, we assume
that this is a standard kernel image and load that via load_image_targphys.
This patch also changes all other users of load_image to
load_image_targphys to be consistent. (The ELF loader registers the kernel
as ROM.)
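The fallback order described above can be sketched roughly as follows. This is an illustrative Python model, not the QEMU C code: load_elf_stub and load_raw_stub are hypothetical stand-ins for the ELF loader and load_image_targphys, and only the ELF magic check is modelled. The point is that the decision depends on the file format, not on specific code at a fixed address.

```python
ELF_MAGIC = b"\x7fELF"  # every ELF file starts with these four bytes


def load_elf_stub(image: bytes) -> int:
    """Hypothetical ELF loader stand-in: returns a size, or -1 if not ELF."""
    if image[:4] != ELF_MAGIC:
        return -1
    return len(image)


def load_raw_stub(image: bytes) -> int:
    """Hypothetical raw-image loader stand-in: accepts any bytes."""
    return len(image)


def load_kernel(image: bytes) -> str:
    """Try ELF first; if that fails, treat the file as a standard kernel image."""
    if load_elf_stub(image) >= 0:
        return "elf"
    load_raw_stub(image)
    return "raw"
```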
Avi Kivity [Mon, 5 Mar 2012 15:40:12 +0000 (17:40 +0200)]
memory: fix I/O port aliases
Commit e58ac72b6a0 ("ioport: change portio_list not to use
memory_region_set_offset()") started using aliases of I/O memory
regions. Since the IORange used for the I/O was contained in the
target region, the alias information (specifically, the offset
into the region) was lost. This broke -vga std.
Fix by allocating an independent object to hold the IORange and
also the new offset.
Note that I/O memory regions were conceptually broken wrt aliases
in a different way: an alias can cause the same region to appear
twice in an address space, but we had just one IORange to service it.
This patch fixes that problem as well, since we can now have multiple
IORange/MemoryRegion associations.
Avi Kivity [Mon, 5 Mar 2012 15:36:19 +0000 (17:36 +0200)]
ioport: add destructor method to IORange
Previously all callers had a containing object with a destructor that
could be used to trigger cleanup of the IORange objects (typically
just freeing the containing object), but a forthcoming memory API
change doesn't fit this pattern. Rather than setting up a new global
table, extend the ioport system to support destructors.
Stefan Weil [Fri, 2 Mar 2012 22:30:04 +0000 (23:30 +0100)]
w64: Fix data type of parameters for flush_icache_range
flush_icache_range takes two address parameters which must be large
enough to address any address of the host.
For hosts with sizeof(unsigned long) == sizeof(void *), this patch
changes nothing. All currently supported hosts fall into this category.
For w64 hosts, sizeof(unsigned long) is 4 while sizeof(void *) is 8,
so the use of tcg_target_ulong is needed for i386 and tci (the tcg
targets which work with w64).
Blue Swirl [Sat, 3 Mar 2012 17:59:06 +0000 (17:59 +0000)]
Merge branch 'upstream' of git://qemu.weilnetz.de/qemu
* 'upstream' of git://qemu.weilnetz.de/qemu:
Move definition of HOST_LONG_BITS to qemu-common.h
target-xtensa: Clean includes
target-unicore32: Clean includes
target-sh4: Clean includes
target-s390x: Clean includes
target-ppc: Clean includes
target-mips: Clean includes
target-microblaze: Clean includes
target-m68k: Clean includes
target-lm32: Clean includes
target-i386: Clean includes
target-cris: Clean includes
target-arm: Clean includes
target-alpha: Clean includes
Remove macro HOST_LONG_SIZE
Blue Swirl [Sat, 3 Mar 2012 17:53:56 +0000 (17:53 +0000)]
Merge branch 'arm-devs.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm
* 'arm-devs.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm:
hw/arm11mpcore: Fix broken realview_mpcore/arm11mpcore_priv properties
arm: add device tree support
arm: make sure that number of irqs can be represented in GICD_TYPER.
arm: clean up GIC constants
Fix confusion in the Property arrays for the "arm11mpcore_priv"
(per-CPU devices for the ARM11MPcore CPU) and "realview_mpcore"
(realview-eb board specific device encapsulating CPU and some
extra interrupt controllers) -- the num-irq property was defined
on the wrong device and the mpcore_rirq_properties were defined
as offsets in the wrong structure. The effect was that the
realview-eb-mpcore machine would abort on startup trying to
allocate an insane amount of memory. (This bug was introduced in
the QOM conversion in commit 999e12bb.)
Grant Likely [Fri, 2 Mar 2012 11:56:38 +0000 (11:56 +0000)]
arm: add device tree support
If compiled with CONFIG_FDT, allow user to specify a device tree file using
the -dtb argument. If the machine supports it then the dtb will be loaded
into memory and passed to the kernel on boot.
Signed-off-by: Jeremy Kerr
Signed-off-by: Grant Likely
[Peter Maydell: Use machine opt rather than global to pass dtb filename]
Signed-off-by: Peter Maydell
Rusty Russell [Fri, 2 Mar 2012 11:56:38 +0000 (11:56 +0000)]
arm: make sure that number of irqs can be represented in GICD_TYPER.
We currently assume that the number of interrupts (ITLinesNumber in
the architecture reference manual) is divisible by 32, since we
present it to the guest when it reads GICD_TYPER (in gic_dist_readb())
as (N / 32) - 1.
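A small sketch of the arithmetic, assuming nothing beyond the formula quoted above. The function name and the choice to raise an error are hypothetical; how the device model reacts to an unrepresentable count is not stated here.

```python
def itlines_number(num_irq: int) -> int:
    """Encode an interrupt count N as (N / 32) - 1.

    A count that is not a positive multiple of 32 cannot be represented
    exactly, so this sketch rejects it instead of silently rounding.
    """
    if num_irq <= 0 or num_irq % 32 != 0:
        raise ValueError("number of irqs must be a positive multiple of 32")
    return num_irq // 32 - 1
```

For example, 96 interrupts encode as 2, while 100 interrupts would be truncated to the same value if the check were missing.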
Anthony Liguori [Thu, 1 Mar 2012 21:26:25 +0000 (15:26 -0600)]
Merge remote-tracking branch 'qemu-kvm/memory/core' into staging
* qemu-kvm/memory/core: (30 commits)
memory: allow phys_map tree paths to terminate early
memory: unify PhysPageEntry::node and ::leaf
memory: change phys_page_set() to set multiple pages
memory: switch phys_page_set() to a recursive implementation
memory: replace phys_page_find_alloc() with phys_page_set()
memory: simplify multipage/subpage registration
memory: give phys_page_find() its own tree search loop
memory: make phys_page_find() return a MemoryRegionSection
memory: move tlb flush to MemoryListener commit callback
memory: unify the two branches of cpu_register_physical_memory_log()
memory: fix RAM subpages in newly initialized pages
memory: compress phys_map node pointers to 16 bits
memory: store MemoryRegionSection pointers in phys_map
memory: unify phys_map last level with intermediate levels
memory: remove first level of l1_phys_map
memory: change memory registration to rebuild the memory map on each change
memory: support stateless memory listeners
memory: split memory listener for the two address spaces
xen: ignore I/O memory regions
memory: allow MemoryListeners to observe a specific address space
...
Anthony Liguori [Thu, 1 Mar 2012 21:26:01 +0000 (15:26 -0600)]
Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
pc-bios: update kvmvapic.bin
kvmvapic: Use optionrom helpers
optionsrom: Reserve space for checksum
kvmvapic: Simplify mp/up_set_tpr
kvmvapic: Introduce TPR access optimization for Windows guests
kvmvapic: Add option ROM
target-i386: Add infrastructure for reporting TPR MMIO accesses
Allow to use pause_all_vcpus from VCPU context
Process pending work while waiting for initial kick-off in TCG mode
Remove useless casts from cpu iterators
kvm: Set cpu_single_env only once
kvm: Synchronize cpu state in kvm_arch_stop_on_emulation_error()
Avi Kivity [Wed, 29 Feb 2012 11:22:12 +0000 (13:22 +0200)]
kvm: fix unaligned slots
kvm_set_phys_mem() may be passed sections that are not aligned to a page
boundary. The current code simply brute-forces the alignment which leads
to an inconsistency and an abort().
Fix by aligning the start and the end of the section correctly, discarding
any unaligned head or tail.
This was triggered by a guest sizing a 64-bit BAR that is smaller than a page
with PCI_COMMAND_MEMORY enabled and the upper dword clear.
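The alignment rule can be sketched as below. This is a Python illustration under assumptions, not kvm_set_phys_mem() itself: the 4 KiB page size and the function name are hypothetical. The start is rounded up, the end is rounded down, and a section smaller than a page collapses to size zero.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def align_section(start: int, size: int, page_size: int = PAGE_SIZE):
    """Return (aligned_start, aligned_size), dropping partial head/tail pages."""
    end = start + size
    aligned_start = (start + page_size - 1) & ~(page_size - 1)  # round up
    aligned_end = end & ~(page_size - 1)                        # round down
    if aligned_end <= aligned_start:
        return aligned_start, 0  # nothing page-aligned is left
    return aligned_start, aligned_end - aligned_start
```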
Anthony Liguori [Wed, 29 Feb 2012 18:57:28 +0000 (12:57 -0600)]
Merge remote-tracking branch 'kwolf/for-anthony' into staging
* kwolf/for-anthony: (27 commits)
qemu-img: fix segment fault when the image format is qed
qemu-io: fix segment fault when the image format is qed
qemu-tool: revert cpu_get_clock() abort(3)
qemu-iotests: Test rebase with short backing file
qemu-iotests: 026: Reduce output changes for cache=none qcow2
qemu-iotests: Filter out DOS line endings
test: add image streaming tests
qemu-iotests: add iotests Python module
qemu-iotests: export TEST_DIR for non-bash tests
QMP: Add qmp command for blockdev-group-snapshot-sync
qapi: Introduce blockdev-group-snapshot-sync command
qcow2: Reject too large header extensions
qcow2: Fix offset in qcow2_read_extensions
block: drop aio_multiwrite in BlockDriver
block: remove unused fields in BlockDriverState
qcow2: Fix build with DEBUG_EXT enabled
ide: fail I/O to empty disk
fdc: DIR (Digital Input Register) should return status of current drive...
fdc: fix seek command, which shouldn't check tracks
fdc: check if media rate is correct before doing any transfer
...
Anthony Liguori [Wed, 29 Feb 2012 15:11:00 +0000 (09:11 -0600)]
Merge remote-tracking branch 'kraxel/usb.39' into staging
* kraxel/usb.39: (21 commits)
usb: Resolve warnings about unassigned bus on usb device creation
usb-redir: Return USB_RET_NAK when we've no data for an interrupt endpoint
usb-redir: Limit return values returned by iso packets
usb-redir: Let the usb-host know about our device filtering
usb-redir: Always clear device state on filter reject
usb-redir: Fix printing of device version
ehci: drop old stuff
usb-ehci: Handle ISO packets failing with an error other then NAK
libcacard: fix reported ATR length
usb-ccid: advertise SELF_POWERED
libcacard: link with glib for g_strndup
usb-desc: fix user trigerrable segfaults (!config)
usb-ehci: sanity-check iso xfers
usb: add tracepoint for usb packet state changes.
usb-xhci: enable packet queuing
usb-uhci: implement packet queuing
usb-uhci: process uhci_handle_td return code via switch.
usb-uhci: add UHCIQueue
usb-uhci: cleanup UHCIAsync allocation & initialization.
usb-ehci: fix reset
...
qemu-img: fix segment fault when the image format is qed
[root@f15 qemu]# qemu-img info /home/zwu/work/misc/rh6.img
image: /home/zwu/work/misc/rh6.img
file format: qed
virtual size: 4.0G (4294967296 bytes)
disk size: 1.2G
cluster_size: 65536
Segmentation fault (core dumped)
Today, while I was fixing another issue, I found this one. After a brief
investigation, I found that the required clock, vm_clock, is not created
for the qemu tools.
qemu-io: fix segment fault when the image format is qed
[root@f15 qemu]# qemu-io -c info /home/zwu/work/misc/rh6.img
format name: qed
cluster size: 64 KiB
vm state offset: 0.000000 bytes
Segmentation fault (core dumped)
Stefan Hajnoczi [Wed, 29 Feb 2012 14:41:32 +0000 (14:41 +0000)]
qemu-tool: revert cpu_get_clock() abort(3)
Despite the fact that the qemu-tool environment has no guest running and
vm_clock therefore does not make sense, there is code that gets the
vm_clock time even in qemu-tool. Therefore, revert the abort(3) call
and just return 0 like we used to. This unbreaks qemu-img/qemu-io with
QED and Kevin has also expressed interest in this for qcow2.
Kevin Wolf [Fri, 29 Apr 2011 13:32:55 +0000 (15:32 +0200)]
qemu-iotests: 026: Reduce output changes for cache=none qcow2
qemu-iotests supports the -nocache option which makes the tests run with
cache=none. For blkdebug tests with qcow2 this means that we may see
test results that differ from cache=writethrough. This patch makes the
diff a bit smaller and therefore easier to review.
Stefan Hajnoczi [Wed, 29 Feb 2012 13:25:22 +0000 (13:25 +0000)]
test: add image streaming tests
This patch adds a test suite for the image streaming feature. It
exercises the 'block_stream', 'block_job_cancel', 'block_job_set_speed',
and 'query-block-jobs' QMP commands.
Stefan Hajnoczi [Wed, 29 Feb 2012 13:25:21 +0000 (13:25 +0000)]
qemu-iotests: add iotests Python module
Block layer tests that involve QMP commands rather than qemu-img or
qemu-io are not well-suited for shell scripting. This patch adds a
Python module which allows tests to be written in Python instead.
The basic API is:
VM - class for launching and interacting with a VM
QMPTestCase - abstract base class for tests that use QMP
qemu_img() - wrapper function for invoking qemu-img
qemu_io() - wrapper function for invoking qemu-io
imgfmt - the image format under test (e.g. qcow2, qed)
test_dir - scratch directory path for temporary files
main() - entry point for running tests
Stefan Hajnoczi [Wed, 29 Feb 2012 13:25:20 +0000 (13:25 +0000)]
qemu-iotests: export TEST_DIR for non-bash tests
Since qemu-iotests may need to create large image files it is possible
to specify the test directory. The TEST_DIR variable needs to be
exported so non-bash tests can make use of it.
Jeff Cody [Tue, 28 Feb 2012 20:54:07 +0000 (15:54 -0500)]
QMP: Add qmp command for blockdev-group-snapshot-sync
This adds the QMP command for blockdev-group-snapshot-sync. It
takes an array in as the input, for the argument devlist. The
array consists of the following elements:
+ device: device to snapshot. e.g. "ide-hd0", "virtio0"
+ snapshot-file: path & file for the snapshot image. e.g. "/tmp/file.img"
+ format: snapshot format. e.g., "qcow2". Optional
This is a QAPI/QMP only command to take a snapshot of a group of
devices. This is similar to the blockdev-snapshot-sync command, except
blockdev-group-snapshot-sync accepts a list of devices, filenames, and
formats.
The command attempts to keep the snapshot of the group atomic; if the
creation or open of any of the new snapshots fails, then all of
the new snapshots are abandoned, and the name of the snapshot image
that failed is returned. The failure case should not interrupt
any operations.
Rather than use bdrv_close() along with a subsequent bdrv_open() to
perform the pivot, the original image is never closed and the new
image is placed 'in front' of the original image via manipulation
of the BlockDriverState fields. Thus, once the new snapshot image
has been successfully created, there are no more failure points
before pivoting to the new snapshot.
This allows the group of disks to remain consistent with each other,
even across snapshot failures.
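As an illustration of the argument layout described above, the following Python sketch builds the JSON message for the command. The helper name is hypothetical; the "execute"/"arguments" envelope is the usual QMP message shape, and the devlist element keys are the ones listed in this commit message.

```python
import json


def group_snapshot_command(devices):
    """Build a QMP message for blockdev-group-snapshot-sync.

    `devices` is a list of (device, snapshot_file, fmt) tuples; pass
    fmt=None to leave out the optional "format" key.
    """
    devlist = []
    for device, snapshot_file, fmt in devices:
        entry = {"device": device, "snapshot-file": snapshot_file}
        if fmt is not None:
            entry["format"] = fmt
        devlist.append(entry)
    return {
        "execute": "blockdev-group-snapshot-sync",
        "arguments": {"devlist": devlist},
    }


# Example payload; the device names and paths are made up.
example = json.dumps(
    group_snapshot_command(
        [("ide-hd0", "/tmp/a.img", "qcow2"), ("virtio0", "/tmp/b.img", None)]
    )
)
```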
Kevin Wolf [Wed, 22 Feb 2012 11:37:13 +0000 (12:37 +0100)]
qcow2: Reject too large header extensions
Image files that make qemu-img info read several gigabytes into the
unknown header extensions list are bad. Just fail opening the image
if an extension claims to be larger than the header extension area.
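A sketch of the bounds check, assuming the qcow2 extension layout of a big-endian 32-bit type followed by a big-endian 32-bit length. The function name and the exception type are hypothetical and do not mirror qcow2_read_extensions().

```python
import struct


def read_extension_header(buf: bytes, offset: int, end_offset: int):
    """Parse one (magic, length) pair and reject oversized extensions.

    Returns (magic, length, data_offset). Raises ValueError when the
    claimed length would run past the header extension area.
    """
    magic, length = struct.unpack_from(">II", buf, offset)
    data_offset = offset + 8
    if length > end_offset - data_offset:
        raise ValueError("header extension too large")
    return magic, length, data_offset
```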
Kevin Wolf [Wed, 22 Feb 2012 11:31:47 +0000 (12:31 +0100)]
qcow2: Fix offset in qcow2_read_extensions
The spec says that the length of extensions is padded to 8 bytes, not
the offset. Currently this is the same because the header size is a
multiple of 8, so this is only about compatibility with future changes
to the header size.
While touching it, move the calculation to a common place instead of
duplicating it for each header extension type.
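The difference between padding the length and padding the offset can be shown with a little arithmetic. Both helpers below are hypothetical; the second is only an assumed model of the earlier behaviour.

```python
def next_offset_pad_length(data_offset: int, length: int) -> int:
    """Advance by the extension length rounded up to a multiple of 8."""
    return data_offset + ((length + 7) & ~7)


def next_offset_pad_offset(data_offset: int, length: int) -> int:
    """Assumed old behaviour: round the resulting offset up to 8 instead."""
    return (data_offset + length + 7) & ~7
```

With an 8-byte-aligned starting offset the two agree, which is why the bug was latent; with a hypothetical unaligned start such as 84 they diverge (92 versus 96 for a 5-byte extension).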
fdc: check if media rate is correct before doing any transfer
The programmed rate has to be the same as the required rate for the
floppy format; if that is not the case, the transfer should abort.
This check can be disabled by using the 'check_media_rate' property.
Save media rate value only if media rate check is enabled.
Floppies can be single- or double-sided. However, the current code
only took the common case into account (i.e. 2 sides).
This repairs single-sided floppies, which were totally broken
before this patch: for track > 0, the wrong sector number was
calculated, and data was read or written at the wrong place on the
underlying device.
Fortunately, only some 360 kB floppies are single-sided, so
this bug was probably not seen much.
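The effect can be illustrated with the usual cylinder/head/sector to linear sector formula. This is a sketch with hypothetical names, not the fdc code; it only shows why assuming two sides gives the wrong result for track > 0 on a single-sided disk.

```python
def sector_index(track: int, head: int, sect: int,
                 sectors_per_track: int, sides: int) -> int:
    """Linear sector number for a 1-based sector on a given track and head."""
    return (track * sides + head) * sectors_per_track + sect - 1
```

On track 0 the number of sides does not matter, but for track 1, head 0, sector 1 with 9 sectors per track the index is 9 on a single-sided disk and 18 if two sides are wrongly assumed.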
Avi Kivity [Mon, 13 Feb 2012 18:45:32 +0000 (20:45 +0200)]
memory: allow phys_map tree paths to terminate early
When storing large contiguous ranges in phys_map, all values tend to
be the same pointers to a single MemoryRegionSection. Collapse them
by marking nodes with level > 0 as leaves. This reduces tree memory
usage dramatically.
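The idea of letting a path end above the last level can be sketched with a toy radix tree. Everything here is an assumption made for illustration (the fan-out, the depth, the names); it is not the phys_map implementation.

```python
BITS = 2    # assumed fan-out of 4 children per node, kept tiny for illustration
LEVELS = 3  # the root then covers 4**3 = 64 pages


class Node:
    """A node is a leaf whenever it has no children, at any level."""

    def __init__(self, value=None):
        self.value = value
        self.children = None


def set_range(node, level, base, start, end, value):
    """Map pages [start, end) to value inside the subtree rooted at node."""
    span = 1 << (BITS * level)  # pages covered by this node
    lo, hi = max(start, base), min(end, base + span)
    if lo >= hi:
        return  # no overlap with this subtree
    if lo == base and hi == base + span:
        # The whole subtree holds one value: stop here instead of descending.
        node.value, node.children = value, None
        return
    if node.children is None:
        # Split a collapsed node, pushing its value down one level.
        node.children = [Node(node.value) for _ in range(1 << BITS)]
    child_span = span >> BITS
    for i, child in enumerate(node.children):
        set_range(child, level - 1, base + i * child_span, start, end, value)


def lookup(root, page):
    """Walk down until a leaf is reached, which may be above the last level."""
    node, level, base = root, LEVELS, 0
    while node.children is not None:
        child_span = 1 << (BITS * (level - 1))
        index = (page - base) // child_span
        node, level, base = node.children[index], level - 1, base + index * child_span
    return node.value


def count_nodes(node):
    """Count allocated nodes, as a rough proxy for tree memory usage."""
    if node.children is None:
        return 1
    return 1 + sum(count_nodes(c) for c in node.children)
```

Mapping 16 contiguous pages allocates only the root and its four children in this toy tree, instead of one entry per page.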
Avi Kivity [Mon, 13 Feb 2012 15:14:32 +0000 (17:14 +0200)]
memory: simplify multipage/subpage registration
Instead of considering subpage on a per-page basis, split each section
into a subpage head, multipage body, and subpage tail, and register
each separately. This simplifies the registration functions.
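The split can be sketched as follows, assuming a 4 KiB page size. The function name and the tuple layout are hypothetical; the sketch only shows how one section decomposes into at most three pieces.

```python
PAGE = 4096  # assumed page size for the illustration


def split_section(start: int, size: int, page: int = PAGE):
    """Split [start, start + size) into subpage head, multipage body, subpage tail.

    Returns a list of (kind, start, length) tuples, omitting empty pieces.
    """
    end = start + size
    body_start = min(end, (start + page - 1) & ~(page - 1))  # round start up
    body_end = max(body_start, end & ~(page - 1))            # round end down
    pieces = [
        ("head", start, body_start - start),
        ("body", body_start, body_end - body_start),
        ("tail", body_end, end - body_end),
    ]
    return [p for p in pieces if p[2] > 0]
```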
Avi Kivity [Sun, 12 Feb 2012 16:32:55 +0000 (18:32 +0200)]
memory: store MemoryRegionSection pointers in phys_map
Instead of storing PhysPageDesc, store pointers to MemoryRegionSections.
The various offsets (phys_offset & ~TARGET_PAGE_MASK,
PHYS_OFFSET & TARGET_PAGE_MASK, region_offset) can all be synthesized
from the information in a MemoryRegionSection. Adjust phys_page_find()
to synthesize a PhysPageDesc.
The upshot is that phys_map now contains uniform values, so it's easier
to generate and compress.
The end result is somewhat clumsy, but this will be improved as we
propagate MemoryRegionSections throughout the code instead of transforming
them to PhysPageDesc.
The MemoryRegionSection pointers are stored as uint16_t offsets in an
array. This saves space (when we also compress node pointers) and is
more cache friendly.
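Storing 16-bit indices instead of pointers can be sketched like this. The class and function names are hypothetical, and strings stand in for MemoryRegionSection objects.

```python
from array import array


class SectionTable:
    """Hypothetical section array; map entries are indices into it."""

    MAX_SECTIONS = 1 << 16  # an index must fit in 16 bits

    def __init__(self):
        self.sections = []

    def add(self, section) -> int:
        """Append a section and return its 16-bit index."""
        if len(self.sections) >= self.MAX_SECTIONS:
            raise OverflowError("section index does not fit in 16 bits")
        self.sections.append(section)
        return len(self.sections) - 1


def build_map(num_pages: int, default_index: int) -> array:
    """Return a compact per-page array of unsigned 16-bit section indices."""
    return array("H", [default_index] * num_pages)
```

Each map entry then costs two bytes on common platforms, and a lookup is one extra array access through the section table.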
Avi Kivity [Fri, 10 Feb 2012 12:57:31 +0000 (14:57 +0200)]
memory: remove first level of l1_phys_map
L1 and the lower levels in l1_phys_map are equivalent, except that L1 has
a different size, and is always allocated. Simplify the code by removing
L1. This leaves us with a tree composed solely of L2 tables, but that
problem can be renamed away later.
Avi Kivity [Thu, 9 Feb 2012 15:34:32 +0000 (17:34 +0200)]
memory: change memory registration to rebuild the memory map on each change
Instead of incrementally building the memory map, rebuild it every time.
This allows later simplification, since the code need not consider overlaying
a previous mapping. It is also RCU friendly.
With large memory guests this can get expensive, since the operation is
O(mem size), but this will be optimized later.
As a side effect subpage and L2 leaks are fixed here.
Avi Kivity [Wed, 8 Feb 2012 19:36:02 +0000 (21:36 +0200)]
memory: support stateless memory listeners
Current memory listeners are incremental; that is, they are expected to
maintain their own state, and receive callbacks for changes to that state.
This patch adds support for stateless listeners; these work by receiving
a ->begin() callback (which tells them that new state is coming), a
sequence of ->region_add() and ->region_nop() callbacks, and then a
->commit() callback which signifies the end of the new state. They should
ignore ->region_del() callbacks.
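The callback sequence can be modelled with a small Python class. This is a sketch of one reading of the description, not the MemoryListener C interface: the class is hypothetical, and treating region_nop() like region_add() is an assumption about how a stateless listener would rebuild its view.

```python
class StatelessListener:
    """Rebuilds its whole view between begin() and commit()."""

    def __init__(self):
        self.current = []     # last committed view
        self._pending = None  # view under construction

    def begin(self):
        """New state is coming: start from scratch."""
        self._pending = []

    def region_add(self, section):
        self._pending.append(section)

    def region_nop(self, section):
        # An unchanged region is still part of the new state.
        self._pending.append(section)

    def region_del(self, section):
        # Ignored: removed regions simply do not reappear after begin().
        pass

    def commit(self):
        """End of the new state: publish it."""
        self.current = self._pending
        self._pending = None
```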
Avi Kivity [Wed, 8 Feb 2012 15:01:23 +0000 (17:01 +0200)]
memory: add a readonly attribute to MemoryRegionSection
.readonly cannot be obtained from the MemoryRegion, since it is
inherited from aliases (so you can have a MemoryRegion mapped RW
at one address and RO at another). Record it in a MemoryRegionSection
for listeners.