Peter Maydell [Tue, 19 Jul 2016 14:04:36 +0000 (15:04 +0100)]
disas: Fix ATTRIBUTE_UNUSED define clash with ALSA headers
disas/bfd.h defines ATTRIBUTE_UNUSED, but unfortunately the
ALSA system headers also define this macro, which means that
you can get a compilation failure if building with ALSA and
any files happen to include the alsa headers before bfd.h
rather than the other way around.
This is unfortunate namespace pollution by the ALSA headers but
we can work around it. Add an #ifndef guard to bfd.h and remove
the unnecessary extra definition in disas/arm.c to fix this.
Peter Maydell [Tue, 19 Jul 2016 14:08:05 +0000 (15:08 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* two old patches from prospective GSoC students
* i386 -kernel device tree support
* Coverity fix
* memory usage improvement from Peter
* checkpatch fix
* g_path_get_dirname cleanup
* caching of block status for iSCSI
* remotes/bonzini/tags/for-upstream:
target-i386: Remove redundant HF_SOFTMMU_MASK
block/iscsi: allow caching of the allocation map
block/iscsi: fix rounding in iscsi_allocationmap_set
Move README to markdown
cpu-exec: Move down some declarations in cpu_exec()
exec: avoid realloc in phys_map_node_reserve
checkpatch: consider git extended headers valid patches
megasas: remove useless check for cmd->frame
compiler: never omit assertions if using a static analysis tool
hw/i386: add device tree support
Changed malloc to g_malloc, free to g_free in bsd-user/qemu.h
use g_path_get_dirname instead of dirname
Peter Maydell [Tue, 19 Jul 2016 12:00:35 +0000 (13:00 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 19 Jul 2016 03:33:40 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
e1000e: fix building without CONFIG_VMXNET3_PCI
MAINTAINERS: release Scott from being a rocker maintainer
tap: fix memory leak on failure to create a multiqueue tap device
net: fix incorrect argument to iov_to_buf
net: fix incorrect access to pointer
e1000e: fix incorrect access to pointer
* remotes/jnsnow/tags/ide-pull-request:
block: ignore flush requests when storage is clean
tests: in IDE and AHCI tests perform DMA write before flushing
ide: set retry_unit for PIO and FLUSH requests
ide: refactor retry_unit set and clear into separate function
* remotes/stefanha/tags/tracing-pull-request:
trace: Add QAPI/QMP interfaces to query and control per-vCPU tracing state
trace: Allow event name pattern in "info trace-events"
trace: Conditionally trace events based on their per-vCPU state
trace: Add per-vCPU tracing states for events with the 'vcpu' property
trace: Cosmetic changes on fast-path tracing
disas: Remove unused macro '_'
trace: Identify events with the 'vcpu' property
trace: [bsd-user] Commandline arguments to control tracing
trace: [linux-user] Commandline arguments to control tracing
Peter Maydell [Tue, 19 Jul 2016 08:02:05 +0000 (09:02 +0100)]
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160718.0' into staging
VFIO update 2016-07-18
One fix for 2.7-rc0 which hides the ARI extended capability, fixing
multifunction support in PCIe configurations where the assigned device
function topology does not match the host (Alex Williamson)
Peter Lieven [Mon, 18 Jul 2016 08:52:20 +0000 (10:52 +0200)]
block/iscsi: allow caching of the allocation map
until now the allocation map was used only as a hint if a cluster
is allocated or not. If a block was not allocated (or Qemu had
no info about the allocation status) a get_block_status call was
issued to check the allocation status and possibly avoid
a subsequent read of unallocated sectors. If a block known to be
allocated the get_block_status call was omitted. In the other case
a get_block_status call was issued before every read to avoid
the necessity for a consistent allocation map. To avoid the
potential overhead of calling get_block_status for each and
every read request this took only place for the bigger requests.
This patch enhances this mechanism to cache the allocation
status and avoid calling get_block_status for blocks where
the allocation status has been queried before. This allows
for bypassing the read request even for smaller requests and
additionally omits calling get_block_status for known to be
unallocated blocks.
Move the README file to markdown so that it makes the github page look
prettier. I know that github repo is a mirror and not the official
repo, but I think it doesn't hurt to have it in markdown format.
block: ignore flush requests when storage is clean
Some guests (win2008 server for example) do a lot of unnecessary
flushing when underlying media has not changed. This adds additional
overhead on host when calling fsync/fdatasync.
This change introduces a write generation scheme in BlockDriverState.
Current write generation is checked against last flushed generation to
avoid unnessesary flushes.
The problem with excessive flushing was found by a performance test
which does parallel directory tree creation (from 2 processes).
Results improved from 0.424 loops/sec to 0.432 loops/sec.
Each loop creates 10^3 directories with 10 files in each.
This affected some blkdebug testcases that were expecting error logs from
failure-injected flushes which are now skipped entirely
(tests 026 071 089).
This also affects the performance of block jobs and thus BLOCK_JOB_READY
events for driver-mirror and active block-commit commands now arrives
faster, before QMP send successfully returns to caller (tests 141 144).
The following sequence of tests discovered a problem in IDE emulation:
1. Send DMA write to IDE device 0
2. Send CMD_FLUSH_CACHE to same IDE device which will be failed by block
layer using blkdebug script in tests/ide-test:test_retry_flush
When doing DMA request ide/core.c will set s->retry_unit to s->unit in
ide_start_dma. When dma completes ide_set_inactive sets retry_unit to -1.
After that ide_flush_cache runs and fails thanks to blkdebug.
ide_flush_cb calls ide_handle_rw_error which asserts that s->retry_unit
== s->unit. But s->retry_unit is still -1 after previous DMA completion
and flush does not use anything related to retry.
This patch restricts retry unit assertion only to ops that actually use
retry logic.
trace: Conditionally trace events based on their per-vCPU state
Events with the 'vcpu' property are conditionally emitted according to
their per-vCPU state. Other events are emitted normally based on their
global tracing state.
Note that the per-vCPU condition check applies to all tracing backends.
Eliminates a future compilation error when UI code includes the tracing
headers (indirectly pulling "disas/bfd.h" through "qom/cpu.h") and
GLib's i18n '_' macro.
trace: [bsd-user] Commandline arguments to control tracing
[Changed const char *trace_file to char *trace_file since it's a
heap-allocated string that needs to be freed. This type is also
returned by trace_opt_parse() and used in vl.c.
Also fixed coding style on for(;;) and else statement as suggested by
Eric Blake <[email protected]> since the patch modifies these lines or
close enough.
--Stefan]
trace: [linux-user] Commandline arguments to control tracing
[Changed const char *trace_file to char *trace_file since it's a
heap-allocated string that needs to be freed. This type is also
returned by trace_opt_parse() and used in vl.c.
--Stefan]
Alex Williamson [Mon, 18 Jul 2016 16:55:17 +0000 (10:55 -0600)]
vfio/pci: Hide ARI capability
QEMU supports ARI on downstream ports and assigned devices may support
ARI in their extended capabilities. The endpoint ARI capability
specifies the next function, such that the OS doesn't need to walk
each possible function, however this next function is relative to the
host, not the guest. This leads to device discovery issues when we
combine separate functions into virtual multi-function packages in a
guest. For example, SR-IOV VFs are not enumerated by simply probing
the function address space, therefore the ARI next-function field is
zero. When we combine multiple VFs together as a multi-function
device in the guest, the guest OS identifies ARI is enabled, relies on
this next-function field, and stops looking for additional function
after the first is found.
Long term we should expose the ARI capability to the guest to enable
configurations with more than 8 functions per slot, but this requires
additional QEMU PCI infrastructure to manage the next-function field
for multiple, otherwise independent devices. In the short term,
hiding this capability allows equivalent functionality to what we
currently have on non-express chipsets.
Pranith Kumar [Mon, 27 Jun 2016 18:13:22 +0000 (14:13 -0400)]
.travis.yml: Disable IRC build status updates from forks
We want the travis build bot to post notifications on IRC only for the
master qemu repository and not the various forks/branches of
others. Currently there is no direct option to restrict the updates to
one repository. This is being worked upon by the developers and
tracked in https://github.com/travis-ci/travis-ci/issues/1094.
Until such time, we can use the workaround as posted in
ref. https://github.com/facebook/flow/pull/1822.
This basically creates an ecrypted string which decrypts to qemu IRC
channel only on "qemu/qemu" repo and not on the forks. This enables
the build bot to notify the IRC only for the main repo.
Renames look like this with git-diff(1) when diff.renames = true is set:
diff --git a/a b/b
similarity index 100%
rename from a
rename to b
This raises the "Does not appear to be a unified-diff format patch"
error because checkpatch.pl only considers a diff valid if it contains
at least one "@@" hunk.
This patch accepts renames and copies too so that checkpatch.pl exits
successfully when a diff only renames/copies files. The git diff
extended header format is described on the git-diff(1) man page.
Roman Pen [Wed, 13 Jul 2016 13:03:24 +0000 (15:03 +0200)]
linux-aio: prevent submitting more than MAX_EVENTS
Invoking io_setup(MAX_EVENTS) we ask kernel to create ring buffer for us
with specified number of events. But kernel ring buffer allocation logic
is a bit tricky (ring buffer is page size aligned + some percpu allocation
are required) so eventually more than requested events number is allocated.
From a userspace side we have to follow the convention and should not try
to io_submit() more or logic, which consumes completed events, should be
changed accordingly. The pitfall is in the following sequence:
MAX_EVENTS = 128
io_setup(MAX_EVENTS)
io_submit(MAX_EVENTS)
io_submit(MAX_EVENTS)
/* now 256 events are in-flight */
io_getevents(MAX_EVENTS) = 128
/* we can handle only 128 events at once, to be sure
* that nothing is pended the io_getevents(MAX_EVENTS)
* call must be invoked once more or hang will happen. */
To prevent the hang or reiteration of io_getevents() call this patch
restricts the number of in-flights, which is now limited to MAX_EVENTS.
Peter Maydell [Mon, 18 Jul 2016 10:24:15 +0000 (11:24 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160718' into staging
ppc patch queue 2016-07-18
Here's what ought to be the final ppc pull request before the 2.7 hard
freeze. This set contains a rework of the DBDMA device for Mac
platforms, and some assorted cleanups and bugfixes.
# gpg: Signature made Mon 18 Jul 2016 05:35:27 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg: aka "David Gibson (Red Hat) <[email protected]>"
# gpg: aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160718:
ppc: Yet another fix for the huge page support detection mechanism
target-ppc: fix left shift overflow in hpte_page_shift
ppc/mmu-hash64: Remove duplicated #include statement
ppc: abort if compat property contains an unknown value
spapr: Ensure CPU cores are added contiguously and removed in LIFO order
vfio/spapr: Remove stale ioctl() call
ppc: Fix support for odd MSR combinations
dbdma: reset io->processing flag for unassigned DBDMA channel rw accesses
dbdma: set FLUSH bit upon reception of flush command for unassigned DBDMA channels
dbdma: fix load_word/store_word value endianness
dbdma: fix endian of DBDMA_CMDPTR_LO during branch
dbdma: add per-channel debugging enabled via DEBUG_DBDMA_CHANMASK
dbdma: always define DBDMA_DPRINTF and enable debug with DEBUG_DBDMA
spapr: fix core unplug crash
Thomas Huth [Fri, 15 Jul 2016 08:10:25 +0000 (10:10 +0200)]
ppc: Yet another fix for the huge page support detection mechanism
Commit 86b50f2e1bef ("Disable huge page support if it is not available
for main RAM") already made sure that huge page support is not announced
to the guest if the normal RAM of non-NUMA configurations is not backed
by a huge page filesystem. However, there is one more case that can go
wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured
with huge page support (and only the memory of a DIMM is configured with
it). When QEMU is started with the following command line for example,
the Linux guest currently crashes because it is trying to use huge pages
on a memory region that does not support huge pages:
To fix this issue, we've got to make sure to disable huge page support,
too, when there is a NUMA node that is not using a memory backend with
huge page support.
Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740 Signed-off-by: Thomas Huth <[email protected]> Signed-off-by: David Gibson <[email protected]>
Greg Kurz [Wed, 13 Jul 2016 10:00:17 +0000 (12:00 +0200)]
ppc: abort if compat property contains an unknown value
It is not possible to set the compat property to an unknown value with
powerpc_set_compat(). Something must have gone terribly wrong in QEMU,
if we detect an "Internal error" in powerpc_get_compat(). Let's abort then.
This patch also drops the "max_compat ? *max_compat : -1" construct. It is
useless since max_compat is dereferenced a few lines above.
spapr: Ensure CPU cores are added contiguously and removed in LIFO order
If CPU core addition or removal is allowed in random order leading to
holes in the core id range (and hence in the cpu_index range), migration
can fail as migration with holes in cpu_index range isn't yet handled
correctly.
Prevent this situation by enforcing the addition in contiguous order
and removal in LIFO order so that we never end up with holes in
cpu_index range.
David Gibson [Tue, 12 Jul 2016 06:54:03 +0000 (16:54 +1000)]
vfio/spapr: Remove stale ioctl() call
This ioctl() call to VFIO_IOMMU_SPAPR_TCE_REMOVE was left over from an
earlier version of the code and has since been folded into
vfio_spapr_remove_window().
It wasn't caught because although the argument structure has been removed,
the libc function remove() means this didn't trigger a compile failure.
The ioctl() was also almost certain to fail silently and harmlessly with
the bogus argument, so this wasn't caught in testing.
MacOS uses an architecturally illegal MSR combination that
seems nonetheless supported by 32-bit processors, which is
to have MSR[PR]=1 and one or more of MSR[DR/IR/EE]=0.
This adds support for it. To work properly we need to also
properly include support for PR=1,{I,D}R=0 to the MMU index
used by the qemu TLB.
Mark Cave-Ayland [Sun, 10 Jul 2016 18:08:54 +0000 (19:08 +0100)]
dbdma: add per-channel debugging enabled via DEBUG_DBDMA_CHANMASK
By default large amounts of DBDMA debugging are produced when often it is just
1 or 2 channels that are of interest. Introduce DEBUG_DBDMA_CHANMASK to allow
the developer to select the channels of interest at compile time, and then
further add the extra channel information to each debug statement where
possible.
Also clearly mark the start/end of DBDMA_run_bh to allow tracking the bottom
half execution.
This happens because spapr_core_unplug() assumes cpu_dt_id == core_id.
As long as cpu_dt_id is derived from the non-table cpu_index, this is
only true when you plug cores with contiguous ids.
It is safer to be consistent: the DR connector was created with an
index that is immediately written to cc->core_id, and spapr_core_plug()
also relies on cc->core_id.
Peter Lieven [Fri, 15 Jul 2016 10:03:50 +0000 (12:03 +0200)]
exec: avoid realloc in phys_map_node_reserve
this is the first step in reducing the brk heap fragmentation
created by the map->nodes memory allocation. Since the introduction
of RCU the freeing of the PhysPageMaps is delayed so that sometimes
several hundred are allocated at the same time.
Even worse the memory for map->nodes is allocated and shortly
afterwards reallocated. Since the number of nodes it grows
to in the end is the same for all PhysPageMaps remember this value
and at least avoid the reallocation.
The large number of simultaneous allocations (about 450 x 70kB in
my configuration) has to be addressed later.
Renames look like this with git-diff(1) when diff.renames = true is set:
diff --git a/a b/b
similarity index 100%
rename from a
rename to b
This raises the "Does not appear to be a unified-diff format patch"
error because checkpatch.pl only considers a diff valid if it contains
at least one "@@" hunk.
This patch accepts renames and copies too so that checkpatch.pl exits
successfully when a diff only renames/copies files. The git diff
extended header format is described on the git-diff(1) man page.
Paolo Bonzini [Fri, 15 Jul 2016 16:27:40 +0000 (18:27 +0200)]
compiler: never omit assertions if using a static analysis tool
Assertions help both Coverity and the clang static analyzer avoid
false positives, but on the other hand both are confused when
the condition is compiled as (void)(x != FOO). Always expand
assertion macros when using Coverity or clang, through a new
QEMU_STATIC_ANALYSIS preprocessor symbol.
Antonio Borneo [Wed, 6 Apr 2016 20:04:14 +0000 (22:04 +0200)]
hw/i386: add device tree support
With "-dtb" on command-line:
- append the device tree blob to the kernel image;
- pass the blob's pointer to the kernel through setup_data, as
requested by upstream kernel commit da6b737b9ab7 ("x86: Add
device tree support").
The device tree blob is passed as-is to the guest; none of its
fields is modified nor updated. This is not an issue; the kernel
commit above uses the device tree only as an extension to the
traditional kernel configuration.
Peter Lieven [Fri, 15 Jul 2016 09:45:11 +0000 (11:45 +0200)]
vnc-tight: fix regression with libxenstore
commit 095497ff added thread local storage for the color counting
palette. Unfortunately, a VncPalette is about 7kB on a x86_64 system.
This memory is reserved from the stack of every thread and it
exhausted the stack space of a libxenstore thread.
Fix this by allocating memory only for the VNC encoding thread.
In tight_encode_indexed_rect32, buf(or src)’s size is count. In for loop,
the logic is supposed to be that i is an index into src, i should be
incremented when incrementing src.
This is broken when src is incremented but i is not before while loop,
resulting in off-by-one bug in while loop.
It may happen that vnc connections linger in disconnecting state forever
because VncState happens to be in a state where vnc_update_client()
exists early and never reaches the vnc_disconnect_finish() call at the
bottom of the function. Fix that by doing an additinal check at the
start of the function.
Peter Maydell [Thu, 14 Jul 2016 16:32:53 +0000 (17:32 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160714' into staging
target-arm queue:
* add virtio-mmio transport base address to device path
(avoid an assertion failure with multiple virtio-scsi-devices)
* revert hw/ptimer commit 5a50307 which causes regressions on
SPARC guests
* use Neon to accelerate zero-page checking on AArch64 hosts
* set the MPIDR for TCG to match how KVM does it (and fit with
GICv2/GICv3 restrictions on SGI target lists)
* add some missing AArch32 TLBI hypervisor TLB operations
* m25p80: Fix QIOR/DIOR handling for Winbond
* hw/misc: fix typo in Aspeed SCU hw-strap2 property name
* ast2400: pretend DMAs are done for U-boot
* ast2400: some minor code cleanups
* remotes/pmaydell/tags/pull-target-arm-20160714:
ast2400: externalize revision numbers
ast2400: pretend DMAs are done for U-boot
ast2400: replace aspeed_smc_is_implemented()
hw/misc: fix typo in Aspeed SCU hw-strap2 property name
m25p80: Fix QIOR/DIOR handling for Winbond
target-arm: Add missed AArch32 TLBI sytem registers
hw/arm/virt: tcg: adjust MPIDR like KVM
gic: provide defines for v2/v3 targetlist sizes
target-arm: Use Neon for zero checking
Revert "hw/ptimer: Perform counter wrap around if timer already expired"
virtio-mmio: format transport base address in BusClass.get_dev_path
AST2400_A0_SILICON_REV is defined twice. Fix this by including the
definition in the header file as well as the routine to check if a
silicon revision is supported. It will useful to reuse in other
controllers.
Let's add also AST2500_A0_SILICON_REV for future use.
U-boot does SPI timing calibration using DMA tranfers. To let the
initialization continue, we fake success by setting the DMA status of
the Interrupt Control Register.
For the moment, DMA support is not required as it is not used in
normal operation.
aspeed_smc_is_implemented() filters invalid registers in a peculiar
way. Let's remove it and open code the if conditions. It serves the
same purpose, the aesthetic is better, and new registers can easily be
added.
Winbond also support continuous read mode, but as an opposite for other
flash type read mode clock cycles are included to dummy cycles number.
This path add proper handling of read mode byte and update needed
dummy cycles. QPI mode and dummy cycles configuration are not supported.
Andrew Jones [Thu, 14 Jul 2016 15:51:37 +0000 (16:51 +0100)]
hw/arm/virt: tcg: adjust MPIDR like KVM
KVM adjusts the MPIDR of guest vcpus based on the architecture of
the host, 32-bit vs. 64-bit, and, for 64-bit, also on the type of
GIC the guest is using. To be consistent and improve SGI efficiency
we make the same adjustments for TCG as 64-bit KVM hosts. We neglect
to add consistency with 32-bit KVM hosts, as that would reduce SGI
efficiency and KVM is expected to change.
As MPIDR is a system register, and thus guest visible, we only make
adjustments for current and later versioned machines.
Revert "hw/ptimer: Perform counter wrap around if timer already expired"
Software should see timer counter wraparound only after IRQ being triggered.
This fixes regression introduced by the commit 5a50307 ("hw/ptimer: Perform
counter wrap around if timer already expired"), resulting in monotonic timer
jumping backwards on SPARC emulated machine running NetBSD guest OS, as
reported by Mark Cave-Ayland.
The reason is that the vmstate sections for the two scsi-hd devices are
not uniquely identifiable by name.
The direct parent buses of the scsi-hd devices -- scsi0.0 and scsi1.0 --
support the BusClass.get_dev_path member function. scsibus_get_dev_path()
formats a device path prefix with the help of its topologically parent
bus, and then appends the chan:id:lun triplet to it. For both scsi-hd
devices, this triplet is 0:0:0.
(Here we use "device path" in the QEMU migration sense, for vmstate
section identification, not in the OFW or UEFI device path senses.)
The virtio-scsi HBA is plugged into the virtio-mmio bus (implemented by
the internal VirtIOMMIOProxy device). This bus class
(TYPE_VIRTIO_MMIO_BUS) inherits, as its get_dev_path() member function,
the virtio_bus_get_dev_path() method from its parent class
(TYPE_VIRTIO_BUS).
virtio_bus_get_dev_path() does not format any kind of device address on
its own; "virtio addresses" are transport-specific. Therefore
virtio_bus_get_dev_path() asks the topologically parent bus of the proxy
object (implementing the specific virtio transport) to format the address
of the proxy object.
(For virtio-pci devices (where the proxy is an instance of VirtIOPCIProxy,
plugged into a PCI bus), this ends up in pcibus_get_dev_path().)
However, VirtIOMMIOProxy is usually (in practice: always) plugged into
"main-system-bus", the singleton TYPE_SYSTEM_BUS object. This BusClass
does not support formatting QEMU vmstate device paths at all (as
SysBusDevice objects can have zero or more IO ports and zero or more MMIO
regions). Hence the formatting request delegated from
virtio_bus_get_dev_path() gets answered with NULL.
The end result is that the two scsi-hd devices end up with the same device
path "0:0:0", which triggers the assert.
We can solve this by recognizing that virtio-mmio transports are
distinguished from each other by their base addresses in MMIO address
space. Implement virtio_mmio_bus_get_dev_path() as follows:
(1) The virtio device whose devpath is to be formatted resides on a
virtio-mmio bus that is implemented by a VirtIOMMIOProxy object. Ask
the parent bus of VirtIOMMIOProxy to format the device path of
VirtIOMMIOProxy, as a path prefix. (This is identical to what
virtio_bus_get_dev_path() does.)
(2) Append the base address of VirtIOMMIOProxy to the device path, such
as:
- virtio-mmio@000000000a003e00,
- virtio-mmio@000000000a003c00.
Given that these device paths are placed in the migration stream, step (2)
above, if done unconditionally, would break migration. So make that step
conditional on a new VirtIOMMIOProxy property, which is enabled for 2.7
machine types and later.
* remotes/bonzini/tags/for-upstream:
hostmem: detect host backend memory is being used properly
hostmem: fix QEMU crash by 'info memdev'
char: do not use atexit cleanup handler
net: do not use atexit for cleanup
slirp: use exit notifier for slirp_smb_cleanup
tap: use an exit notifier to call down_script
util: Fix MIN_NON_ZERO
qemu-sockets: use qapi_free_SocketAddress in cleanup
disas: avoid including everything in headers compiled from C++
json-streamer: fix double-free on exiting during a parse
main-loop: check return value before using pointer
Use "-s" instead of "--quiet" to resolve non-fatal build error on FreeBSD.
scsi-bus: Use longer sense buffer with scanners
scsi-bus: Add SCSI scanner support
Max Filippov [Wed, 6 Jul 2016 06:31:32 +0000 (09:31 +0300)]
target-xtensa: xtfpga: fix FLASH interface width
FLASH chip on XTFPGA boards is connected with 16-bit-wide interface.
Latest U-Boot can see the difference and does not work correctly with
32-bit-wide interface.
Set FLASH chip 'width' property to 2.
Peter Maydell [Thu, 14 Jul 2016 10:48:46 +0000 (11:48 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Wed 13 Jul 2016 12:46:17 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (34 commits)
iotests: Make 157 actually format-agnostic
vvfat: Fix qcow write target driver specification
hmp: show all of snapshot info on every block dev in output of 'info snapshots'
hmp: use snapshot name to determine whether a snapshot is 'fully available'
qemu-iotests: Test naming of throttling groups
blockdev: Fix regression with the default naming of throttling groups
vmdk: fix metadata write regression
Improve block job rate limiting for small bandwidth values
qcow2: Fix qcow2_get_cluster_offset()
qemu-io: Use correct range limitations
qcow2: Avoid making the L1 table too big
qemu-img: Use strerror() for generic resize error
block: Remove BB options from blockdev-add
qemu-iotests: Test setting WCE with qdev
block/qdev: Allow configuring rerror/werror with qdev properties
commit: Fix use of error handling policy
block/qdev: Allow configuring WCE with qdev properties
block/qdev: Allow node name for drive properties
coroutine: move entry argument to qemu_coroutine_create
test-coroutine: prepare for the next patch
...
* mreitz/tags/pull-block-for-kevin-2016-07-13:
iotests: Make 157 actually format-agnostic
vvfat: Fix qcow write target driver specification
hmp: show all of snapshot info on every block dev in output of 'info snapshots'
hmp: use snapshot name to determine whether a snapshot is 'fully available'
qemu-iotests: Test naming of throttling groups
blockdev: Fix regression with the default naming of throttling groups
vmdk: fix metadata write regression
Improve block job rate limiting for small bandwidth values
qcow2: Fix qcow2_get_cluster_offset()
qemu-io: Use correct range limitations
qcow2: Avoid making the L1 table too big
qemu-img: Use strerror() for generic resize error
Max Reitz [Mon, 11 Jul 2016 13:22:46 +0000 (15:22 +0200)]
iotests: Make 157 actually format-agnostic
iotest 157 pretends not to care about the image format used, but in fact
it does due to the format name not being filtered in its output. This
patch adds filtering and changes the reference output accordingly.
Max Reitz [Mon, 11 Jul 2016 13:54:52 +0000 (15:54 +0200)]
vvfat: Fix qcow write target driver specification
First, bdrv_open_child() expects all options for the child to be
prefixed by the child's name (and a separating dot). Second,
bdrv_open_child() does not take ownership of the QDict passed to it but
only extracts all options for the child, so if a QDict is created for
the sole purpose of passing it to bdrv_open_child(), it needs to be
freed afterwards.
This patch makes vvfat adhere to both of these rules.
Lin Ma [Thu, 7 Jul 2016 05:26:04 +0000 (13:26 +0800)]
hmp: show all of snapshot info on every block dev in output of 'info snapshots'
Currently, the output of 'info snapshots' shows fully available snapshots.
It's opaque, hides some snapshot information to users. It's not convenient
if users want to know more about all of snapshot information on every block
device via monitor.
Follow Kevin's and Max's proposals, The patch makes the output more detailed:
(qemu) info snapshots
List of snapshots present on all disks:
ID TAG VM SIZE DATE VM CLOCK
-- checkpoint-1 165M 2016-05-22 16:58:07 00:02:06.813
List of partial (non-loadable) snapshots on 'drive_image1':
ID TAG VM SIZE DATE VM CLOCK
1 snap1 0 2016-05-22 16:57:31 00:01:30.567
Lin Ma [Thu, 7 Jul 2016 05:26:03 +0000 (13:26 +0800)]
hmp: use snapshot name to determine whether a snapshot is 'fully available'
Currently qemu uses snapshot id to determine whether a snapshot is fully
available, It causes incorrect output in some scenario.
For instance:
(qemu) info block
drive_image1 (#block113): /opt/vms/SLES12-SP1-JeOS-x86_64-GM/disk0.qcow2
(qcow2)
Cache mode: writeback
drive_image2 (#block349): /opt/vms/SLES12-SP1-JeOS-x86_64-GM/disk1.qcow2
(qcow2)
Cache mode: writeback
(qemu)
(qemu) info snapshots
There is no snapshot available.
(qemu)
(qemu) snapshot_blkdev_internal drive_image1 snap1
(qemu)
(qemu) info snapshots
There is no suitable snapshot available
(qemu)
(qemu) savevm checkpoint-1
(qemu)
(qemu) info snapshots
ID TAG VM SIZE DATE VM CLOCK
1 snap1 0 2016-05-22 16:57:31 00:01:30.567
(qemu)
$ qemu-img snapshot -l disk0.qcow2
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
1 snap1 0 2016-05-22 16:57:31 00:01:30.567
2 checkpoint-1 165M 2016-05-22 16:58:07 00:02:06.813
$ qemu-img snapshot -l disk1.qcow2
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
1 checkpoint-1 0 2016-05-22 16:58:07 00:02:06.813
The patch uses snapshot name instead of snapshot id to determine whether a
snapshot is fully available and uses '--' instead of snapshot id in output
because the snapshot id is not guaranteed to be the same on all images.
For instance:
(qemu) info snapshots
List of snapshots present on all disks:
ID TAG VM SIZE DATE VM CLOCK
-- checkpoint-1 165M 2016-05-22 16:58:07 00:02:06.813
Alberto Garcia [Fri, 8 Jul 2016 14:03:01 +0000 (17:03 +0300)]
qemu-iotests: Test naming of throttling groups
Throttling groups are named using the 'group' parameter of the
block_set_io_throttle command and the throttling.group command-line
option. If that parameter is unspecified the groups get the name of
the block device.
This patch adds a new test to check the naming of throttling groups.
Alberto Garcia [Fri, 8 Jul 2016 14:03:00 +0000 (17:03 +0300)]
blockdev: Fix regression with the default naming of throttling groups
When I/O limits are set for a block device, the name of the throttling
group is taken from the BlockBackend if the user doesn't specify one.
Commit efaa7c4eeb7490c6f37f3 moved the naming of the BlockBackend in
blockdev_init() to the end of the function, after I/O limits are set.
The consequence is that the throttling group gets an empty name.
Reda Sallahi [Thu, 7 Jul 2016 08:42:49 +0000 (10:42 +0200)]
vmdk: fix metadata write regression
Commit "cdeaf1f vmdk: add bdrv_co_write_zeroes" causes a regression on
writes. It writes metadata after every write instead of doing it only once
for each cluster.
vmdk_pwritev() writes metadata whenever m_data is set as valid so this patch
sets m_data as valid only when we have a new cluster which hasn't been
allocated before or a zero grain.
Sascha Silbe [Tue, 28 Jun 2016 15:28:41 +0000 (17:28 +0200)]
Improve block job rate limiting for small bandwidth values
ratelimit_calculate_delay() previously reset the accounting every time
slice, no matter how much data had been processed before. This had (at
least) two consequences:
1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream.
Not sure if there are real-world use cases where this would be a
problem. Mirroring and backup over a slow link (e.g. DSL) would
come to mind, though.
2. Tests for block job operations (e.g. cancel) were rather racy
All block jobs currently use a time slice of 100ms. That's a
reasonable value to get smooth output during regular
operation. However this also meant that the state of block jobs
changed every 100ms, no matter how low the configured limit was. On
busy hosts, qemu often transferred additional chunks until the test
case had a chance to cancel the job.
Fix the block job rate limit code to delay for more than one time
slice to address the above issues. To make it easier to handle
oversized chunks we switch the semantics from returning a delay
_before_ the current request to a delay _after_ the current
request. If necessary, this delay consists of multiple time slice
units.
Since the mirror job sends multiple chunks in one go even if the rate
limit was exceeded in between, we need to keep track of the start of
the current time slice so we can correctly re-compute the delay for
the updated amount of data.
The minimum bandwidth now is 1 data unit per time slice. The block
jobs are currently passing the amount of data transferred in sectors
and using 100ms time slices, so this translates to 5120
bytes/second. With chunk sizes usually being O(512KiB), tests have
plenty of time (O(100s)) to operate on block jobs. The chance of a
race condition now is fairly remote, except possibly on insanely
loaded systems.