Peter Maydell [Mon, 15 Jul 2019 15:11:47 +0000 (16:11 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-07-15' into staging
Block patches for 4.1-rc1:
- Fixes for the NVMe block driver, the gluster block driver, and for
running multiple block jobs concurrently on a single chain
* remotes/maxreitz/tags/pull-block-2019-07-15:
gluster: fix .bdrv_reopen_prepare when backing file is a JSON object
iotests: Add read-only test case to 030
iotests: Add new case to 030
iotests: Add @use_log to VM.run_job()
iotests: Compare error messages in 030
iotests: Fix throttling in 030
block: Deep-clear inherits_from
block/stream: Swap backing file change order
block/stream: Fix error path
block: Add BDS.never_freeze
nvme: Set number of queues later in nvme_init()
* remotes/juanquintela/tags/migration-pull-request: (21 commits)
migration: always initial RAMBlock.bmap to 1 for new migration
migration/postcopy: remove redundant cpu_synchronize_all_post_init
migration/postcopy: fix document of postcopy_send_discard_bm_ram()
migration: allow private destination ram with x-ignore-shared
migration: Split log_clear() into smaller chunks
kvm: Support KVM_CLEAR_DIRTY_LOG
kvm: Introduce slots lock for memory listener
kvm: Persistent per kvmslot dirty bitmap
kvm: Update comments for sync_dirty_bitmap
memory: Introduce memory listener hook log_clear()
memory: Pass mr into snapshot_and_clear_dirty
bitmap: Add bitmap_copy_with_{src|dst}_offset()
memory: Don't set migration bitmap when without migration
migration: No need to take rcu during sync_dirty_bitmap
migration/ram.c: reset complete_round when we gets a queued page
migration/multifd: sync packet_num after all thread are done
cutils: remove one unnecessary pointer operation
migration/xbzrle: update cache and current_data in one place
migration/multifd: call multifd_send_sync_main when sending RAM_SAVE_FLAG_EOS
migration-test: rename parameter to parameter_int
...
gluster: fix .bdrv_reopen_prepare when backing file is a JSON object
When the backing_file is specified as a JSON object,
qemu_gluster_reopen_prepare() fails with this message:
invalid URI json:{"server.0.host": ...}
In this case, we should call qemu_gluster_init() using the QDict
'state->options' that contains the JSON parameters already parsed.
Max Reitz [Wed, 3 Jul 2019 17:28:10 +0000 (19:28 +0200)]
iotests: Add @use_log to VM.run_job()
unittest-style tests generally do not use the log file, but VM.run_job()
can still be useful to them. Add a parameter to it that hides its
output from the log file.
Max Reitz [Wed, 3 Jul 2019 17:28:09 +0000 (19:28 +0200)]
iotests: Compare error messages in 030
Currently, 030 just compares the error class, which does not say
anything.
Before HEAD^ added throttling to test_overlapping_4, that test actually
usually failed because node2 was already gone, not because the
commit and stream jobs were not allowed to overlap.
Prevent such problems in the future by comparing the error description
instead.
Max Reitz [Wed, 3 Jul 2019 17:28:08 +0000 (19:28 +0200)]
iotests: Fix throttling in 030
Currently, TestParallelOps in 030 creates images that are too small for
job throttling to be effective. This is reflected by the fact that it
never undoes the throttling.
Increase the image size and undo the throttling when the job should be
completed. Also, add throttling in test_overlapping_4, or the jobs may
not be so overlapping after all. In fact, the error usually emitted
here is that node2 simply does not exist, not that overlapping jobs are
not allowed -- the fact that this job ignores the exact error messages
and just checks the error class is something that should be fixed in a
follow-up patch.
Max Reitz [Wed, 3 Jul 2019 17:28:07 +0000 (19:28 +0200)]
block: Deep-clear inherits_from
BDS.inherits_from does not always point to an immediate parent node.
When launching a block job with a filter node, for example, the node
directly below the filter will not point to the filter, but keep its old
pointee (above the filter).
If that pointee goes away while the job is still running, the node's
inherits_from will not be updated and thus point to garbage. To fix
this, bdrv_unref_child() has to check not only the parent node's
immediate children for nodes whose inherits_from needs to be cleared,
but its whole subtree.
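The difference between checking immediate children and checking the whole subtree can be sketched as follows. This is a simplified Python model, not QEMU code: the Node class and both helper functions are hypothetical names, and the graph is assumed to be a plain tree.

```python
class Node:
    """Simplified stand-in for a block node (not QEMU's real structure)."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.inherits_from = None


def clear_shallow(parent):
    """Only look at the immediate children of the node that is going away."""
    for child in parent.children:
        if child.inherits_from is parent:
            child.inherits_from = None


def clear_deep(node, parent):
    """Walk the whole subtree below node and drop every reference to parent."""
    for child in node.children:
        if child.inherits_from is parent:
            child.inherits_from = None
        clear_deep(child, parent)


def build_chain():
    """top -> filter -> base, where base still inherits from top."""
    top, filt, base = Node("top"), Node("filter"), Node("base")
    top.children.append(filt)
    filt.children.append(base)
    base.inherits_from = top
    return top, base


# Shallow check: base sits below the filter, so it is never visited.
top, base = build_chain()
clear_shallow(top)
shallow_left_dangling = base.inherits_from is top

# Deep check: the whole subtree is visited and the stale pointer is cleared.
top, base = build_chain()
clear_deep(top, top)
deep_cleared = base.inherits_from is None
```

In this model the shallow variant leaves base.inherits_from pointing at a node that is about to disappear, which corresponds to the dangling pointer described above.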
Max Reitz [Wed, 3 Jul 2019 17:28:03 +0000 (19:28 +0200)]
block/stream: Fix error path
As of commit c624b015bf14fe01f1e6452a36e63b3ea1ae4998, the stream job
only freezes the chain until the overlay of the base node. The error
path must consider this.
Max Reitz [Wed, 3 Jul 2019 17:28:02 +0000 (19:28 +0200)]
block: Add BDS.never_freeze
The commit and the mirror block job must be able to drop their filter
node at any point. However, this will not be possible if any of the
BdrvChild links to them is frozen. Therefore, we need to prevent them
from ever becoming frozen.
Michal Privoznik [Wed, 10 Jul 2019 14:57:44 +0000 (16:57 +0200)]
nvme: Set number of queues later in nvme_init()
When creating the admin queue in nvme_init(), the variable that
holds the number of queues created is modified before the actual
queue creation. This is a problem because if creating the queue
fails, the variable is left in an inconsistent state. This was
actually observed when I tried to hotplug an NVMe disk. Control
reached nvme_file_open(), which called nvme_init(), which
failed, and thus nvme_close() was called, which in turn called
nvme_free_queue_pair() with the queue being NULL. This led to an
instant crash:
#0 0x000055d9507ec211 in nvme_free_queue_pair (bs=0x55d952ddb880, q=0x0) at block/nvme.c:164
#1 0x000055d9507ee180 in nvme_close (bs=0x55d952ddb880) at block/nvme.c:729
#2 0x000055d9507ee3d5 in nvme_file_open (bs=0x55d952ddb880, options=0x55d952bb1410, flags=147456, errp=0x7ffd8e19e200) at block/nvme.c:781
#3 0x000055d9507629f3 in bdrv_open_driver (bs=0x55d952ddb880, drv=0x55d95109c1e0 <bdrv_nvme>, node_name=0x0, options=0x55d952bb1410, open_flags=147456, errp=0x7ffd8e19e310) at block.c:1291
#4 0x000055d9507633d6 in bdrv_open_common (bs=0x55d952ddb880, file=0x0, options=0x55d952bb1410, errp=0x7ffd8e19e310) at block.c:1551
#5 0x000055d950766881 in bdrv_open_inherit (filename=0x0, reference=0x0, options=0x55d952bb1410, flags=32768, parent=0x55d9538ce420, child_role=0x55d950eaade0 <child_file>, errp=0x7ffd8e19e510) at block.c:3063
#6 0x000055d950765ae4 in bdrv_open_child_bs (filename=0x0, options=0x55d9541cdff0, bdref_key=0x55d950af33aa "file", parent=0x55d9538ce420, child_role=0x55d950eaade0 <child_file>, allow_none=true, errp=0x7ffd8e19e510) at block.c:2712
#7 0x000055d950766633 in bdrv_open_inherit (filename=0x0, reference=0x0, options=0x55d9541cdff0, flags=0, parent=0x0, child_role=0x0, errp=0x7ffd8e19e908) at block.c:3011
#8 0x000055d950766dba in bdrv_open (filename=0x0, reference=0x0, options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block.c:3156
#9 0x000055d9507cb635 in blk_new_open (filename=0x0, reference=0x0, options=0x55d953d00390, flags=0, errp=0x7ffd8e19e908) at block/block-backend.c:389
#10 0x000055d950465ec5 in blockdev_init (file=0x0, bs_opts=0x55d953d00390, errp=0x7ffd8e19e908) at blockdev.c:602
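The ordering problem behind this backtrace can be modelled in a few lines. The sketch below is a Python stand-in, not the driver code: Controller, init_count_first, init_count_last and open_device are hypothetical names, and free_queue_pair raising an exception stands in for the NULL dereference.

```python
class Controller:
    """Toy model of the driver state: a queue array plus a queue counter."""

    def __init__(self, max_queues=4):
        self.queues = [None] * max_queues
        self.nr_queues = 0


def create_queue_pair(should_fail):
    """Return a queue object, or None to model a failed creation."""
    return None if should_fail else object()


def free_queue_pair(queue):
    """Stand-in for the crash: freeing a queue that was never created."""
    if queue is None:
        raise RuntimeError("queue pair was never created")


def init_count_first(ctrl, should_fail):
    """Problematic order: bump the counter, then try to create the queue."""
    ctrl.nr_queues = 1
    ctrl.queues[0] = create_queue_pair(should_fail)
    return ctrl.queues[0] is not None


def init_count_last(ctrl, should_fail):
    """Safer order: only count the queue once it really exists."""
    queue = create_queue_pair(should_fail)
    if queue is None:
        return False
    ctrl.queues[0] = queue
    ctrl.nr_queues = 1
    return True


def close(ctrl):
    """Free as many queues as the counter claims exist."""
    for i in range(ctrl.nr_queues):
        free_queue_pair(ctrl.queues[i])
    ctrl.nr_queues = 0


def open_device(init, should_fail):
    """Model of the open path: on init failure, close() is called."""
    ctrl = Controller()
    if not init(ctrl, should_fail):
        close(ctrl)
        return None
    return ctrl
```

With init_count_first and a failing queue creation, close() trusts the counter and tries to free a queue that does not exist. With init_count_last the counter is still 0 on the error path, so close() has nothing to free.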
Ivan Ren [Sun, 14 Jul 2019 14:51:19 +0000 (22:51 +0800)]
migration: always initial RAMBlock.bmap to 1 for new migration
To reproduce the problem:
migrate
migrate_cancel
migrate
An error occurs in memory migration.
The reason is as follows:
1. QEMU starts; ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is all set to
1 by a series of cpu_physical_memory_set_dirty_range calls
2. migration starts: ram_init_bitmaps
- memory_global_dirty_log_start: begin dirty logging
- memory_global_dirty_log_sync: sync dirty bitmap to
ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]
- migration_bitmap_sync_range: sync
ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap,
and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero
3. migration data...
4. migrate_cancel, which stops dirty logging
5. migration starts again: ram_init_bitmaps
- memory_global_dirty_log_start: begin dirty logging
- memory_global_dirty_log_sync: sync dirty bitmap to
ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]
- migration_bitmap_sync_range: sync
ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap,
and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero
Here RAMBlock.bmap only has the newly logged dirty pages and does not
cover all guest pages.
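The sequence above can be replayed with two small bitmaps. This is a toy Python model under stated assumptions: the helper names are hypothetical, eight pages stand in for guest RAM, and the unfixed behaviour is assumed to start each migration with an all-zero RAMBlock.bmap, as the description implies.

```python
NUM_PAGES = 8


def sync_range(dirty_memory, bmap):
    """Model of the sync step: move dirty bits into bmap, clearing the source."""
    for page, dirty in enumerate(dirty_memory):
        if dirty:
            bmap[page] = 1
            dirty_memory[page] = 0


def start_migration(dirty_memory, bmap_initial_value):
    """Model of bitmap setup at migration start, followed by one sync."""
    bmap = [bmap_initial_value] * NUM_PAGES
    sync_range(dirty_memory, bmap)
    return bmap


def run_twice(bmap_initial_value):
    """migrate, migrate_cancel, migrate; return the bmap of each attempt."""
    dirty_memory = [1] * NUM_PAGES   # step 1: everything dirty at startup
    first = start_migration(dirty_memory, bmap_initial_value)
    dirty_memory[3] = 1              # the guest dirties one page in between
    second = start_migration(dirty_memory, bmap_initial_value)
    return first, second


first_zero_init, second_zero_init = run_twice(0)
first_one_init, second_one_init = run_twice(1)
```

With a zero-initialized bmap the second attempt only sees page 3 and would skip the other pages. Starting the bmap at 1 for every new migration, as the commit title says, makes the second attempt cover all pages again.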
Wei Yang [Mon, 15 Jul 2019 02:05:49 +0000 (10:05 +0800)]
migration/postcopy: fix document of postcopy_send_discard_bm_ram()
Commit 6b6712efccd3 ('ram: Split dirty bitmap by RAMBlock') changed the
parameters of postcopy_send_discard_bm_ram(), but left the documentation
untouched.
This patch corrects the documentation and fixes two typos by hand.
Peter Maydell [Mon, 15 Jul 2019 13:42:43 +0000 (14:42 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190715' into staging
target-arm queue:
* report ARMv8-A FP support for AArch32 -cpu max
* hw/ssi/xilinx_spips: Avoid AXI writes to the LQSPI linear memory
* hw/ssi/xilinx_spips: Avoid out-of-bound access to lqspi_buf[]
* hw/ssi/mss-spi: Avoid crash when reading empty RX FIFO
* hw/display/xlnx_dp: Avoid crash when reading empty RX FIFO
* hw/arm/virt: Fix non-secure flash mode
* pl031: Correctly migrate state when using -rtc clock=host
* fix regression that meant arm926 and arm1026 lost VFP
double-precision support
* v8M: NS BusFault on vector table fetch escalates to NS HardFault
* remotes/pmaydell/tags/pull-target-arm-20190715:
target/arm: NS BusFault on vector table fetch escalates to NS HardFault
target/arm: Set VFP-related MVFR0 fields for arm926 and arm1026
pl031: Correctly migrate state when using -rtc clock=host
hw/arm/virt: Fix non-secure flash mode
hw/display/xlnx_dp: Avoid crash when reading empty RX FIFO
hw/ssi/mss-spi: Avoid crash when reading empty RX FIFO
hw/ssi/xilinx_spips: Avoid out-of-bound access to lqspi_buf[]
hw/ssi/xilinx_spips: Avoid AXI writes to the LQSPI linear memory
hw/ssi/xilinx_spips: Convert lqspi_read() to read_with_attrs
target/arm: report ARMv8-A FP support for AArch32 -cpu max
Peng Tao [Fri, 14 Jun 2019 06:35:13 +0000 (14:35 +0800)]
migration: allow private destination ram with x-ignore-shared
By removing the shared RAM check, QEMU is able to migrate
to private destination RAM when the x-ignore-shared capability
is on. Then we can create multiple destination VMs based
on the same source VM.
This changes the x-ignore-shared migration capability to
work similarly to Lai's original bypass-shared-memory
work (https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html),
which enables Kata Containers (https://katacontainers.io)
to implement the VM templating feature.
An example usage in Kata Containers (https://katacontainers.io):
1. Start the source VM:
qemu-system-x86 -m 2G \
-object memory-backend-file,id=mem0,size=2G,share=on,mem-path=/tmpfs/template-memory \
-numa node,memdev=mem0
2. Stop the template VM, set migration x-ignore-shared capability,
migrate "exec:cat>/tmpfs/state", quit it
3. Start target VM:
qemu-system-x86 -m 2G \
-object memory-backend-file,id=mem0,size=2G,share=off,mem-path=/tmpfs/template-memory \
-numa node,memdev=mem0 \
-incoming defer
4. connect to target VM qmp, set migration x-ignore-shared capability,
migrate_incoming "exec:cat /tmpfs/state"
5. create more target VMs repeating 3 and 4
Peter Xu [Mon, 3 Jun 2019 06:50:56 +0000 (14:50 +0800)]
migration: Split log_clear() into smaller chunks
Currently we are doing log_clear() right after log_sync(), which mostly
keeps the old behavior from when log_clear() was still part of log_sync().
This patch tries to further optimize the migration log_clear() code
path by splitting huge log_clear()s into smaller chunks.
We do this by splitting the whole guest memory region into memory
chunks, whose size is decided by MigrationState.clear_bitmap_shift (an
example is given below). With that, we don't clear the dirty bitmap
on the remote node (e.g., KVM) when we fetch the dirty
bitmap; instead we explicitly clear the dirty bitmap for a memory
chunk the first time we send a page in that chunk.
Here is an example.
Assuming the guest has 64G of memory, before this patch the KVM
ioctl KVM_CLEAR_DIRTY_LOG will be a single one covering 64G of memory.
After the patch, assuming the clear bitmap shift is 18,
the memory chunk size on x86_64 will be (1UL << 18) * 4K = 1GB. Then
instead of sending one big 64G ioctl, we'll send 64 small ioctls, each
covering 1G of the guest memory. Each of the 64
small ioctls is only sent if a page in that chunk
is about to be sent right away.
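The arithmetic in this example, and the "clear a chunk only when its first page is sent" idea, can be checked with a short Python sketch. The LazyClear class and its method names are hypothetical; only the shift value, page size and guest size come from the example above.

```python
PAGE_SIZE = 4 * 1024            # 4K pages, as in the x86_64 example
GUEST_MEMORY = 64 * 1024 ** 3   # 64G guest


def chunk_size(clear_bitmap_shift, page_size=PAGE_SIZE):
    """Bytes covered by one clear operation: (1 << shift) pages."""
    return (1 << clear_bitmap_shift) * page_size


class LazyClear:
    """Toy model: every chunk needs one clear, issued on first use."""

    def __init__(self, guest_memory, clear_bitmap_shift):
        self.chunk = chunk_size(clear_bitmap_shift)
        self.pending = set(range(guest_memory // self.chunk))
        self.clears_issued = []

    def before_send_page(self, addr):
        """Issue the clear for the chunk containing addr, at most once."""
        index = addr // self.chunk
        if index in self.pending:
            self.pending.remove(index)
            self.clears_issued.append(index)


tracker = LazyClear(GUEST_MEMORY, 18)
total_chunks = len(tracker.pending)
tracker.before_send_page(0)                        # first page of chunk 0
tracker.before_send_page(PAGE_SIZE)                # same chunk, no new clear
tracker.before_send_page(5 * tracker.chunk + 123)  # first page of chunk 5
```

Chunks whose pages are never sent never get a clear issued in this model, which is the saving the patch is after.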
Peter Xu [Mon, 3 Jun 2019 06:50:55 +0000 (14:50 +0800)]
kvm: Support KVM_CLEAR_DIRTY_LOG
First detect the interface using KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
and mark it. If we fail to enable the new feature, we fall back to
the old sync.
Provide the log_clear() hook for the memory listeners for both address
spaces of KVM (normal system memory, and SMM) and deliver the clear
message to the kernel.
Peter Xu [Mon, 3 Jun 2019 06:50:54 +0000 (14:50 +0800)]
kvm: Introduce slots lock for memory listener
Introduce KVMMemoryListener.slots_lock to protect the slots inside the
kvm memory listener. Currently it is close to useless because all the
KVM code paths are now always protected by the BQL. But it'll start to
make sense in follow-up patches where we might do remote dirty bitmap
clear and also we'll update the per-slot cached dirty bitmap even
without the BQL. So let's prepare for it.
We could also use a per-slot lock for the above reason, but that seems to be
overkill. Let's just use this bigger one (which covers all the slots
of a single address space); this lock is still much smaller
than the BQL anyway.
Peter Xu [Mon, 3 Jun 2019 06:50:53 +0000 (14:50 +0800)]
kvm: Persistent per kvmslot dirty bitmap
When synchronizing the dirty bitmap from kernel KVM we do it in a
per-kvmslot fashion and we allocate the userspace bitmap for each
ioctl. This patch instead makes the bitmap cache persistent,
so we don't need to g_malloc0() every time.
More importantly, the cached per-kvmslot dirty bitmap will be further
used when we add support for KVM_CLEAR_DIRTY_LOG: this
cached bitmap will be used to guarantee we won't clear any unknown
dirty bits, which could otherwise be a severe data loss issue for
migration code.
Introduce a new memory region listener hook log_clear() to allow the
listeners to hook onto the points where the dirty bitmap is cleared by
the bitmap users.
Let's take KVM as an example - log_sync() for KVM will first copy the
kernel dirty bitmap to userspace, and at the same time we'll clear the
dirty bitmap there along with re-protecting all the guest pages again.
We add this new log_clear() interface only to split the old log_sync()
into two separate procedures:
- use log_sync() to collect the dirty bitmap only, and,
- use log_clear() to clear the remote dirty bitmap.
With the new interface, the memory listener users will still be able
to decide how to implement the log synchronization procedure, e.g.,
they can still provide only the log_sync() method and put both
procedures within log_sync() (that's how the old KVM worked before
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was introduced). However, with this
new interface the memory listener users will start to have a chance to
postpone the log clear operation explicitly if the module supports it.
That can really benefit users like KVM, at least for host kernels that
support KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2.
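The split between collecting and clearing can be illustrated with two toy listeners. This is a Python sketch, not the memory listener API: RemoteDirtyLog, CombinedListener and SplitListener are hypothetical classes, and a list of bits stands in for the remote (e.g. kernel) dirty bitmap.

```python
class RemoteDirtyLog:
    """Stand-in for a remote dirty tracker such as the kernel side of KVM."""

    def __init__(self, num_pages):
        self.bits = [0] * num_pages


class CombinedListener:
    """Old style: log_sync() both collects and clears the remote bitmap."""

    def __init__(self, remote):
        self.remote = remote

    def log_sync(self):
        snapshot = list(self.remote.bits)
        self.remote.bits = [0] * len(snapshot)
        return snapshot


class SplitListener:
    """New style: log_sync() only collects, log_clear() clears a range later."""

    def __init__(self, remote):
        self.remote = remote

    def log_sync(self):
        return list(self.remote.bits)

    def log_clear(self, first_page, num_pages):
        for page in range(first_page, first_page + num_pages):
            self.remote.bits[page] = 0


def dirty_remote():
    """Four pages, three of them dirty."""
    remote = RemoteDirtyLog(4)
    remote.bits = [1, 1, 0, 1]
    return remote


combined_remote = dirty_remote()
combined_snapshot = CombinedListener(combined_remote).log_sync()

split_remote = dirty_remote()
split = SplitListener(split_remote)
split_snapshot = split.log_sync()
bits_after_sync = list(split_remote.bits)   # still dirty on the remote side
split.log_clear(0, 2)                       # clear only the first two pages
```

The split listener lets the caller decide when, and for which range, the remote bitmap is cleared; the combined listener has no such choice.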
There are three places that can clear dirty bits in any one of the
dirty bitmap in the ram_list.dirty_memory[3] array:
Peter Xu [Mon, 3 Jun 2019 06:50:48 +0000 (14:50 +0800)]
memory: Don't set migration bitmap when without migration
Similar to 9460dee4b2 ("memory: do not touch code dirty bitmap unless
TCG is enabled", 2015-06-05) but for the migration bitmap - we can
skip the MIGRATION bitmap update if migration is not enabled.
Peter Xu [Mon, 3 Jun 2019 06:50:46 +0000 (14:50 +0800)]
migration: No need to take rcu during sync_dirty_bitmap
cpu_physical_memory_sync_dirty_bitmap() takes a RAMBlock* as a
parameter, which means that the RCU read lock must already be
held. Taking it again inside seems redundant, so remove it.
Instead, comment on the functions about the RCU read lock.
Wei Yang [Wed, 5 Jun 2019 01:08:28 +0000 (09:08 +0800)]
migration/ram.c: reset complete_round when we gets a queued page
If we get a queued page, the block order is interrupted. We can
no longer rely on the complete_round flag to say we have already searched
all the blocks on the list.
Wei Yang [Wed, 12 Jun 2019 01:43:37 +0000 (09:43 +0800)]
migration/multifd: call multifd_send_sync_main when sending RAM_SAVE_FLAG_EOS
On receiving RAM_SAVE_FLAG_EOS, multifd_recv_sync_main() is called to
synchronize receive threads. The current synchronization mechanism is to wait
for each channel's sem_sync semaphore. This semaphore is triggered by a
packet with the MULTIFD_FLAG_SYNC flag. However, in the current implementation we
don't call multifd_send_sync_main() to send such a packet when
blk_mig_bulk_active() is true.
As a result, the receive threads never notify
multifd_recv_sync_main() via sem_sync, and multifd_recv_sync_main() will
wait there forever.
[Note]: the normal migration test works, but the
blk_mig_bulk_active() case was not tested, since it is not clear how to produce this
situation.
Peter Maydell [Mon, 15 Jul 2019 13:17:04 +0000 (14:17 +0100)]
target/arm: NS BusFault on vector table fetch escalates to NS HardFault
In the M-profile architecture, when we do a vector table fetch and it
fails, we need to report a HardFault. Whether this is a Secure HF or
a NonSecure HF depends on several things. If AIRCR.BFHFNMINS is 0
then HF is always Secure, because there is no NonSecure HardFault.
Otherwise, the answer depends on whether the 'underlying exception'
(MemManage, BusFault, SecureFault) targets Secure or NonSecure. (In
the pseudocode, this is handled in the Vector() function: the final
exc.isSecure is calculated by looking at the exc.isSecure from the
exception returned from the memory access, not the isSecure input
argument.)
We weren't doing this correctly, because we were looking at
the target security domain of the exception we were trying to
load the vector table entry for. This produces errors of two kinds:
* a load from the NS vector table which hits the "NS access
to S memory" SecureFault should end up as a Secure HardFault,
but we were raising an NS HardFault
* a load from the S vector table which causes a BusFault
should raise an NS HardFault if BFHFNMINS == 1 (because
in that case all BusFaults are NonSecure), but we were raising
a Secure HardFault
Correct the logic.
We also fix a comment error where we claimed that we might
be escalating MemManage to HardFault, and forgot about SecureFault.
(Vector loads can never hit MPU access faults, because they're
always aligned and always use the default address map.)
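The decision described above fits in one small function. The following Python sketch is only a restatement of the rules in this message: the function name and the string labels for the underlying exception are hypothetical, and MemManage is left out because vector loads cannot hit MPU faults.

```python
def vector_fetch_hardfault_is_secure(bfhfnmins, underlying_exception):
    """Return True if a failed vector table fetch escalates to a Secure HardFault.

    bfhfnmins: value of AIRCR.BFHFNMINS (0 or 1)
    underlying_exception: "BusFault" or "SecureFault"
    """
    if not bfhfnmins:
        # There is no NonSecure HardFault at all in this configuration.
        return True
    if underlying_exception == "SecureFault":
        # SecureFault always targets Secure.
        return True
    if underlying_exception == "BusFault":
        # With BFHFNMINS == 1 all BusFaults are NonSecure.
        return False
    raise ValueError("unexpected underlying exception for a vector fetch")


# The two cases that used to be wrong (both with BFHFNMINS == 1):
ns_table_securefault = vector_fetch_hardfault_is_secure(1, "SecureFault")
s_table_busfault = vector_fetch_hardfault_is_secure(1, "BusFault")
```

Note that the function takes no argument for the security domain of the exception whose vector is being loaded; that input is exactly what the old logic looked at by mistake.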
Peter Maydell [Mon, 15 Jul 2019 13:17:04 +0000 (14:17 +0100)]
target/arm: Set VFP-related MVFR0 fields for arm926 and arm1026
The ARMv5 architecture didn't specify detailed per-feature ID
registers. Now that we're using the MVFR0 register fields to
gate the existence of VFP instructions, we need to set up
the correct values in the cpu->isar structure so that we still
provide an FPU to the guest.
This fixes a regression in the arm926 and arm1026 CPUs, which
are the only ones that both have VFP and are ARMv5 or earlier.
This regression was introduced by the VFP refactoring, and more
specifically by commits 1120827fa182f0e76 and 266bd25c485597c,
which accidentally disabled VFP short-vector support and
double-precision support on these CPUs.
Peter Maydell [Mon, 15 Jul 2019 13:17:04 +0000 (14:17 +0100)]
pl031: Correctly migrate state when using -rtc clock=host
The PL031 RTC tracks the difference between the guest RTC
and the host RTC using a tick_offset field. For migration,
however, we currently always migrate the offset between
the guest and the vm_clock, even if the RTC clock is not
the same as the vm_clock; this was an attempt to retain
migration backwards compatibility.
Unfortunately this results in the RTC behaving oddly across
a VM state save and restore -- since the VM clock stands still
across save-then-restore, regardless of how much real world
time has elapsed, the guest RTC ends up out of sync with the
host RTC in the restored VM.
Fix this by migrating the raw tick_offset. To retain migration
compatibility as far as possible, we have a new property
migrate-tick-offset; by default this is 'true' and we will
migrate the true tick offset in a new subsection; if the
incoming data has no subsection we fall back to the old
vm_clock-based offset information, so old->new migration
compatibility is preserved. For complete new->old migration
compatibility, the property is set to 'false' for 4.0 and
earlier machine types (this will only affect 'virt-4.0'
and below, as none of the other pl031-using machines are
versioned).
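Why the vm_clock-based offset drifts can be seen with plain numbers. The Python sketch below is a simplified model under stated assumptions: times are abstract seconds, the dictionary keys and the restored_guest_rtc helper are hypothetical, and the real device's unit conversions are ignored.

```python
def restored_guest_rtc(incoming, host_now, vm_clock_now):
    """Guest RTC value right after restore, in abstract seconds.

    If the incoming state carries the raw tick offset, use it against the
    host clock; otherwise fall back to the old vm_clock-based offset.
    """
    if "tick_offset" in incoming:
        return host_now + incoming["tick_offset"]
    return vm_clock_now + incoming["vm_clock_offset"]


# State at save time.
host_at_save = 1_000_000
vm_clock_at_save = 500
tick_offset = 42                         # guest RTC minus host RTC
guest_at_save = host_at_save + tick_offset

# One hour of real time passes; the VM clock stands still meanwhile.
host_at_restore = host_at_save + 3600
vm_clock_at_restore = vm_clock_at_save

old_stream = {"vm_clock_offset": guest_at_save - vm_clock_at_save}
new_stream = {"tick_offset": tick_offset}

expected = host_at_restore + tick_offset
drift_old = expected - restored_guest_rtc(old_stream, host_at_restore, vm_clock_at_restore)
drift_new = expected - restored_guest_rtc(new_stream, host_at_restore, vm_clock_at_restore)
```

In this model the old stream leaves the guest RTC an hour behind the host, while the raw tick offset keeps the two in step.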
David Engraf [Mon, 15 Jul 2019 13:17:04 +0000 (14:17 +0100)]
hw/arm/virt: Fix non-secure flash mode
Using the whole 128 MiB flash in non-secure mode does not work because
virt_flash_fdt() expects the same address for secure_sysmem and sysmem.
This is not correctly handled by the caller because it forwards NULL for
secure_sysmem in non-secure flash mode.
hw/display/xlnx_dp: Avoid crash when reading empty RX FIFO
In the previous commit we fixed a crash when the guest read a
register that pops from an empty FIFO.
By auditing the repository, we found another similar use with
an easy way to reproduce:
$ qemu-system-aarch64 -M xlnx-zcu102 -monitor stdio -S
QEMU 4.0.50 monitor - type 'help' for more information
(qemu) xp/b 0xfd4a0134
Aborted (core dumped)
(gdb) bt
#0 0x00007f6936dea57f in raise () at /lib64/libc.so.6
#1 0x00007f6936dd4895 in abort () at /lib64/libc.so.6
#2 0x0000561ad32975ec in xlnx_dp_aux_pop_rx_fifo (s=0x7f692babee70) at hw/display/xlnx_dp.c:431
#3 0x0000561ad3297dc0 in xlnx_dp_read (opaque=0x7f692babee70, offset=77, size=4) at hw/display/xlnx_dp.c:667
#4 0x0000561ad321b896 in memory_region_read_accessor (mr=0x7f692babf620, addr=308, value=0x7ffe05c1db88, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439
#5 0x0000561ad321bd70 in access_with_adjusted_size (addr=308, value=0x7ffe05c1db88, size=1, access_size_min=4, access_size_max=4, access_fn=0x561ad321b858 <memory_region_read_accessor>, mr=0x7f692babf620, attrs=...) at memory.c:569
#6 0x0000561ad321e9d5 in memory_region_dispatch_read1 (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1420
#7 0x0000561ad321ea9d in memory_region_dispatch_read (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1447
#8 0x0000561ad31bd742 in flatview_read_continue (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1, addr1=308, l=1, mr=0x7f692babf620) at exec.c:3385
#9 0x0000561ad31bd895 in flatview_read (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3423
#10 0x0000561ad31bd90b in address_space_read_full (as=0x561ad5bb3020, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3436
#11 0x0000561ad33b1c42 in address_space_read (len=1, buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", attrs=..., addr=4249485620, as=0x561ad5bb3020) at include/exec/memory.h:2131
#12 0x0000561ad33b1c42 in memory_dump (mon=0x561ad59c4530, count=1, format=120, wsize=1, addr=4249485620, is_physical=1) at monitor/misc.c:723
#13 0x0000561ad33b1fc1 in hmp_physical_memory_dump (mon=0x561ad59c4530, qdict=0x561ad6c6fd00) at monitor/misc.c:795
#14 0x0000561ad37b4a9f in handle_hmp_command (mon=0x561ad59c4530, cmdline=0x561ad59d0f22 "/b 0x00000000fd4a0134") at monitor/hmp.c:1082
Fix by checking that the FIFO is not empty before popping from it.
The datasheet is not clear about the reset value of this register,
so we choose to return '0'.
hw/ssi/mss-spi: Avoid crash when reading empty RX FIFO
Reading the RX_DATA register when the RX_FIFO is empty triggers
an abort. This can be easily reproduced:
$ qemu-system-arm -M emcraft-sf2 -monitor stdio -S
QEMU 4.0.50 monitor - type 'help' for more information
(qemu) x 0x40001010
Aborted (core dumped)
(gdb) bt
#1 0x00007f035874f895 in abort () at /lib64/libc.so.6
#2 0x00005628686591ff in fifo8_pop (fifo=0x56286a9a4c68) at util/fifo8.c:66
#3 0x00005628683e0b8e in fifo32_pop (fifo=0x56286a9a4c68) at include/qemu/fifo32.h:137
#4 0x00005628683e0efb in spi_read (opaque=0x56286a9a4850, addr=4, size=4) at hw/ssi/mss-spi.c:168
#5 0x0000562867f96801 in memory_region_read_accessor (mr=0x56286a9a4b60, addr=16, value=0x7ffeecb0c5c8, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439
#6 0x0000562867f96cdb in access_with_adjusted_size (addr=16, value=0x7ffeecb0c5c8, size=4, access_size_min=1, access_size_max=4, access_fn=0x562867f967c3 <memory_region_read_accessor>, mr=0x56286a9a4b60, attrs=...) at memory.c:569
#7 0x0000562867f99940 in memory_region_dispatch_read1 (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1420
#8 0x0000562867f99a08 in memory_region_dispatch_read (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1447
#9 0x0000562867f38721 in flatview_read_continue (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, addr1=16, l=4, mr=0x56286a9a4b60) at exec.c:3385
#10 0x0000562867f38874 in flatview_read (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3423
#11 0x0000562867f388ea in address_space_read_full (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3436
#12 0x0000562867f389c5 in address_space_rw (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=false) at exec.c:3466
#13 0x0000562867f3bdd7 in cpu_memory_rw_debug (cpu=0x56286aa19d00, addr=1073745936, buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=0) at exec.c:3976
#14 0x000056286811ed51 in memory_dump (mon=0x56286a8c32d0, count=1, format=120, wsize=4, addr=1073745936, is_physical=0) at monitor/misc.c:730
#15 0x000056286811eff1 in hmp_memory_dump (mon=0x56286a8c32d0, qdict=0x56286b15c400) at monitor/misc.c:785
#16 0x00005628684740ee in handle_hmp_command (mon=0x56286a8c32d0, cmdline=0x56286a8caeb2 "0x40001010") at monitor/hmp.c:1082
From the datasheet "Actel SmartFusion Microcontroller Subsystem
User's Guide" Rev.1, Table 13-3 "SPI Register Summary", this
register has a reset value of 0.
Check the FIFO is not empty before accessing it, else log an
error message.
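The guard itself is a small pattern. The Python sketch below models it with a deque; read_rx_data is a hypothetical name, and returning the reset value 0 on an empty FIFO is an assumption based on the datasheet note above.

```python
import logging
from collections import deque

RX_DATA_RESET_VALUE = 0   # reset value quoted from the datasheet above


def read_rx_data(rx_fifo):
    """Pop one word from the RX FIFO, or report an error if it is empty."""
    if not rx_fifo:
        # Popping here would be the abort seen in the backtrace.
        logging.error("read from empty RX FIFO")
        return RX_DATA_RESET_VALUE
    return rx_fifo.popleft()


empty_read = read_rx_data(deque())
fifo = deque([0x11, 0x22])
first_read = read_rx_data(fifo)
```

A read from an empty FIFO now yields a defined value instead of taking down the whole process, and reads from a non-empty FIFO are unchanged.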
Alex Bennée [Mon, 15 Jul 2019 13:17:02 +0000 (14:17 +0100)]
target/arm: report ARMv8-A FP support for AArch32 -cpu max
When we converted to using feature bits in 602f6e42cfbf we missed
the fact that (dp && arm_dc_feature(s, ARM_FEATURE_V8)) was supported for
-cpu max configurations. This caused a regression in the GCC test
suite. Fix this by setting the appropriate bits in mvfr1.FPHP to
report ARMv8-A with FP support (but not ARMv8.2-FP16).
In hmp_change(), the variable hmp_mon is only used
by code under #ifdef CONFIG_VNC. This results in a build
error when VNC is configured out with the default of
treating warnings as errors:
monitor/hmp-cmds.c: In function ‘hmp_change’:
monitor/hmp-cmds.c:1946:17: error: unused variable ‘hmp_mon’ [-Werror=unused-variable]
1946 | MonitorHMP *hmp_mon = container_of(mon, MonitorHMP, common);
| ^~~~~~~
Turn helper_retaddr into a multi-state flag that may now also
indicate when we're performing a read on behalf of the translator.
In this case, release the mmap_lock before the longjmp back to
the main cpu loop, and thereby avoid a failing assert therein.
At present we have a potential error in that helper_retaddr contains
data for handle_cpu_signal, but we have not ensured that those stores
will be scheduled properly before the operation that may fault.
It might be that these races are not in practice observable, due to
our use of -fno-strict-aliasing, but better safe than sorry.
This patch fixes two problems:
(1) The inputs to the EXTR insn were reversed,
(2) The input constraints use rZ, which means that we need to use
the REG0 macro in order to supply XZR for a constant 0 input.
Peter Maydell [Fri, 12 Jul 2019 16:34:13 +0000 (17:34 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc, pci: fixes, cleanups, tests
A bunch of fixes all over the place.
ACPI tests will now run on more systems: might
introduce new failure reports but that's for
the best, isn't it?
Signed-off-by: Michael S. Tsirkin
# gpg: Signature made Fri 12 Jul 2019 15:57:40 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin" [full]
# gpg: aka "Michael S. Tsirkin" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio pmem: remove transitional names
virtio pmem: remove memdev null check
virtio pmem: fix wrong mem region condition
tests: acpi: do not skip tests when IASL is not installed
tests: acpi: do not require IASL for dumping AML blobs
virtio-balloon: fix QEMU 4.0 config size migration incompatibility
pcie: consistent names for function args
xio3130_downstream: typo fix
Coverity reports that when we're assigning vi->size we handle the
"pmem->memdev is NULL" case; but we then pass it into
object_get_canonical_path(), which unconditionally dereferences it
and will crash if it is NULL. If this pointer can be NULL then we
need to do something else here.
We are removing 'pmem->memdev' null check here as memdev will never
be null in this function.
Igor Mammedov [Mon, 8 Jul 2019 09:24:10 +0000 (05:24 -0400)]
tests: acpi: do not skip tests when IASL is not installed
The tests do a binary comparison, so we can check tables without
IASL. Move the IASL condition right before the decompilation step
and skip that step if IASL is not installed.
The virtio-balloon config size changed in QEMU 4.0 even for existing
machine types. Migration from QEMU 3.1 to 4.0 can fail in some
circumstances with the following error:
This happens because the virtio-balloon config size affects the VIRTIO
Legacy I/O Memory PCI BAR size.
Introduce a qdev property called "qemu-4-0-config-size" and enable it
only for the QEMU 4.0 machine types. This way <4.0 machine types use
the old size, 4.0 uses the larger size, and >4.0 machine types use the
appropriate size depending on enabled virtio-balloon features.
Live migration to and from old QEMUs to QEMU 4.1 works again as long as
a versioned machine type is specified (do not use just "pc"!).
The function declarations for pci_cap_slot_get and
pci_cap_slot_write_config call the argument "slot_ctl", but the function
definitions and all the call sites drop the 'o' and call it "slt_ctl".
Let's be consistent.
file-posix: Use max transfer length/segment count only for SCSI passthrough
Regular kernel block devices (/dev/sda*, /dev/nvme*, etc.) don't expose
max segment size/max segment count hardware requirements
to userspace; instead, the kernel block layer
takes care of splitting the incoming requests that
violate these requirements.
Letting the kernel do the splitting allows QEMU to avoid
the various overheads of doing the splitting itself.
This is especially visible in the NBD server when
exporting a mostly empty qcow2 image as a raw file over the network.
In this case most of the reads by the remote user
won't even hit the underlying kernel block device,
and therefore most of the overhead will be in the
NBD traffic, which increases significantly with a lower max transfer size.
In addition, even for local block device
access the performance improves a bit due to less
traffic between QEMU and the kernel when large
transfer sizes are used (e.g. for image conversion).
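To make the effect of the transfer limit concrete, the following
back-of-the-envelope sketch counts how many requests are needed to move a
fixed amount of data under different limits. The function name and the
limits are hypothetical and chosen only for illustration; they are not
values read from any device or from QEMU.

```python
def request_count(total_bytes, max_transfer_bytes):
    """Requests needed when each one carries at most max_transfer_bytes."""
    # Integer ceiling division, exact for arbitrarily large sizes.
    return (total_bytes + max_transfer_bytes - 1) // max_transfer_bytes


GIB = 1024 ** 3
# Hypothetical limits in KiB, purely for comparison.
for limit_kib in (64, 1024, 16 * 1024):
    count = request_count(GIB, limit_kib * 1024)
    print(limit_kib, "KiB limit:", count, "requests per GiB")
```

Each request carries fixed per-request costs (a protocol header on NBD, a
syscall locally), so fewer and larger requests mean less overhead.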
More info can be found at:
https://bugzilla.redhat.com/show_bug.cgi?id=1647104
Peter Maydell [Fri, 12 Jul 2019 10:06:48 +0000 (11:06 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190712' into staging
ppc patch queue for 2019-07-12
First 4.1 hard freeze pull request. Not much here, just a bug fix for
the XICS interrupt controller and a SLOF firmware update to fix a bug
with IP discovery when there are multiple NICs.
Greg Kurz [Wed, 3 Jul 2019 17:22:20 +0000 (19:22 +0200)]
xics/kvm: Always set the MASKED bit if interrupt is masked
The ics_set_kvm_state_one() function is called either to restore the
state of an interrupt source during migration or to set the interrupt
source to a default state during reset.
Since the beginning, i.e. 2013, the code has only set the MASKED bit if the 'current
priority' and the 'saved priority' are different. This is likely true
when restoring an interrupt that had been previously masked with the
ibm,int-off RTAS call. However this is always false in the case of
reset since both 'current priority' and 'saved priority' are equal to
0xff, and the MASKED bit is never set.
The legacy KVM XICS device gets away with that because it ends up updating
its internal structure the same way, whether the MASKED bit is set or
the priority is 0xff.
The XICS-on-XIVE device for POWER9 is different. It sticks to the KVM
documentation [1] and _really_ relies on the MASKED bit being correctly
set. If not, it will configure the interrupt source in the XIVE HW, even
though the guest hasn't configured the interrupt yet. This disturbs the
complex logic implemented in XICS-on-XIVE and may result in the loss of
subsequent queued events.
Always set the MASKED bit if interrupt is masked as expected by the KVM
XICS-on-XIVE device. This has no impact on the legacy KVM XICS.
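The difference between the two conditions can be modelled in a few lines of
Python. This is a sketch, not the C code from the patch: the function names
and the bit position are hypothetical, and it assumes that "masked" means a
current priority of 0xff, as the description above suggests.

```python
MASKED_PRIORITY = 0xFF   # priority value treated as "masked" (assumption)
MASKED_BIT = 1 << 0      # hypothetical bit position, not the real KVM constant


def state_old(priority, saved_priority):
    """Old behaviour: MASKED only when the two priorities differ."""
    return MASKED_BIT if priority != saved_priority else 0


def state_new(priority, saved_priority):
    """Fixed behaviour: MASKED whenever the interrupt is masked."""
    return MASKED_BIT if priority == MASKED_PRIORITY else 0


# Reset: both priorities are 0xff, so the old condition never fires.
print("reset:", state_old(0xFF, 0xFF), state_new(0xFF, 0xFF))
# Masked via ibm,int-off: current is 0xff, saved keeps the earlier priority.
print("int-off:", state_old(0xFF, 0x05), state_new(0xFF, 0x05))
```

Both variants agree in the ibm,int-off case; only the reset case changes,
which is the case the XICS-on-XIVE device depends on.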
Peter Maydell [Thu, 11 Jul 2019 09:03:42 +0000 (10:03 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-gdbstub-100719-1' into staging
Testing and gdbstub fixes:
- fix diff-out pass in check-tcg
- ensure generation of fprem reference
- fix gdb set_reg fallback
# gpg: Signature made Wed 10 Jul 2019 11:24:28 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-and-gdbstub-100719-1:
gdbstub: revert to previous set_reg behaviour
gdbstub: add some notes to the header comment
tests/tcg: fix diff-out pass to properly report failure
tests/tcg: fix up test-i386-fprem.ref generation
John Snow [Wed, 10 Jul 2019 19:08:07 +0000 (15:08 -0400)]
docs/bitmaps: use QMP lexer instead of json
The annotated style json we use in QMP documentation is not strict json
and depending on the version of Sphinx (2.0+) or Pygments installed,
might cause the build to fail.
Use the new QMP lexer.
Further, some versions of Sphinx cannot apply custom lexers to "code"
directives and require the use of "code-block" directives instead, so
make that change at this time as well.
Tested under:
- Sphinx 1.3.6 and Pygments 2.4
- Sphinx 1.7.6 and Pygments 2.2 (Fedora 29 packages)
- Sphinx 2.0.1 and Pygments 2.4
- Sphinx 3.0.0+/f396b3a783 and Pygments 2.4 (From Sphinx git c4f44bdd)
John Snow [Wed, 10 Jul 2019 19:08:06 +0000 (15:08 -0400)]
sphinx: add qmp_lexer
Sphinx, through Pygments, does not like annotated json examples very
much. In some versions of Sphinx (1.7), it will render the non-json
portions of code blocks in red, but in newer versions (2.0) it will
throw an exception and not highlight the block at all. Though we can
suppress this warning, it doesn't bring back highlighting on non-strict
json blocks.
We can alleviate this by creating a custom lexer for QMP examples that
allows us to properly highlight these examples in a robust way, keeping
our directionality and elision notations.
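The idea of separating the annotations from the JSON can be sketched without
Pygments, using only the standard library. This is not the lexer added by
the patch: the function name is hypothetical, and the concrete marker
spellings ("-> ", "<- " and "...") are assumed here to be the notations
referred to above.

```python
import re

# Direction markers and elisions are matched first; everything between
# two markers is handed over as JSON text.
MARKER_RE = re.compile(r"-> |<- | ?\.{3} ?")


def split_qmp_example(text):
    """Return a list of (kind, text) pairs, kind being 'marker' or 'json'."""
    tokens = []
    pos = 0
    for match in MARKER_RE.finditer(text):
        if match.start() > pos:
            tokens.append(("json", text[pos:match.start()]))
        tokens.append(("marker", match.group()))
        pos = match.end()
    if pos < len(text):
        tokens.append(("json", text[pos:]))
    return tokens


example = '-> { "execute": "query-block" }\n<- { "return": [ ... ] }\n'
for kind, chunk in split_qmp_example(example):
    print(kind, repr(chunk))
```

A real lexer would pass the "json" chunks to a JSON lexer for highlighting
and would need to avoid treating "..." inside a JSON string as an elision,
which this sketch does not attempt.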
Alex Bennée [Fri, 5 Jul 2019 13:23:07 +0000 (14:23 +0100)]
gdbstub: revert to previous set_reg behaviour
The refactoring of handle_set_reg missed the fact that we previously
responded with an empty packet when we were not using XML based
protocols. This broke the fallback behaviour for architectures that
don't have registers defined in QEMU's gdb-xml directory.
Revert to the previous behaviour and clean up the commentary for what
is going on.
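For readers unfamiliar with the wire format, the sketch below frames replies
the way the GDB remote protocol does ($<payload>#<two-digit checksum>) and
shows what an empty reply looks like. The handler function is a hypothetical
stand-in written for this note, not the C function touched by the patch.

```python
def rsp_packet(payload):
    """Frame a payload as a GDB remote protocol packet."""
    # The checksum is the sum of the payload bytes modulo 256, in hex.
    checksum = sum(payload.encode("ascii")) % 256
    return "${}#{:02x}".format(payload, checksum)


def set_reg_reply(uses_xml):
    """Hypothetical stand-in for the handler: returns the reply packet."""
    if not uses_xml:
        # An empty reply tells gdb the request is not supported.
        return rsp_packet("")
    return rsp_packet("OK")


print(set_reg_reply(uses_xml=False))
print(set_reg_reply(uses_xml=True))
```

The empty packet is what lets gdb fall back to another way of writing
registers on targets without XML register descriptions.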
Alex Bennée [Fri, 5 Jul 2019 11:56:35 +0000 (12:56 +0100)]
tests/tcg: fix diff-out pass to properly report failure
A side effect of piping the output to head is to squash the exit status
of the diff command. Fix this by only doing the pipe if the diff
failed and then ensuring the status is non-zero.
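The masking effect is easy to reproduce. The sketch below drives sh from
Python and uses a brace group that prints a line and then fails as a
stand-in for a diff that found differences. The shell snippets are
illustrative and are not the rules from the test Makefile; it assumes a
POSIX sh with head available.

```python
import os
import shlex
import subprocess
import tempfile

# Stand-in for a diff that finds differences: prints output, exits non-zero.
FAILING_DIFF = "{ echo 'outputs differ'; false; }"


def exit_status(script):
    """Run a snippet with sh and return its exit status."""
    return subprocess.run(["sh", "-c", script], capture_output=True).returncode


with tempfile.TemporaryDirectory() as tmp:
    out = shlex.quote(os.path.join(tmp, "diff.out"))
    # Piping straight into head: the pipeline reports head's status.
    piped = exit_status(FAILING_DIFF + " | head -n 10")
    # Only run head when the diff failed, then force a non-zero status.
    fixed = exit_status(
        FAILING_DIFF + " > " + out + " || { head -n 10 " + out + "; exit 1; }"
    )

print("piped:", piped, "fixed:", fixed)
```

In the first form the failure is invisible to make; in the second the
truncated output is still shown but the failure propagates.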
Alex Bennée [Fri, 5 Jul 2019 10:48:02 +0000 (11:48 +0100)]
tests/tcg: fix up test-i386-fprem.ref generation
We never shipped the reference data in the source tree because it's
quite big (64M). As a result the only option is to generate it
locally. Although we have a rule to generate the reference file we
missed the dependency and location changes, probably because it's only
run for SLOW test runs.
Makefile: Fix "make clean" in "unconfigured" source directory
Recent commit "Makefile: Reuse all's recursion machinery for clean and
install" broke targets clean and distclean in the source directory
before running configure:
$ make clean
LD recurse-clean.mo
cc: fatal error: no input files
compilation terminated.
make: *** [rules.mak:118: recurse-clean.mo] Error 1
Stephen Checkoway noticed commit 3ae0343db69 is incorrect.
That commit states all parallel flashes are limited to 16-bit
accesses, however the x32 configuration exists in some models,
such as the Cypress S29CL032J, whose CFI Device Geometry Definition
announces:
CFI ADDR DATA
0x28,0x29 = 0x0003 (x32-only asynchronous interface)
Guests should not be affected by the previous change, because
QEMU does not announce itself as x32 capable:
Commit 3ae0343db69 does not restrict the bus to 16-bit accesses,
but restricts the implementation to 16-bit accesses at most, so a guest
32-bit access will result in 2x 16-bit calls.
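What "2x 16-bit calls" means for a guest can be shown with a small model.
This is an illustration only: the function names are hypothetical, the
flash is modelled as a plain byte string, and little-endian byte order is
assumed.

```python
def read16(mem, addr):
    """16-bit read from a bytes-like flash image (little-endian assumed)."""
    return mem[addr] | (mem[addr + 1] << 8)


def read32_split(mem, addr, log=None):
    """Serve one 32-bit guest read as two 16-bit implementation reads."""
    lo = read16(mem, addr)
    hi = read16(mem, addr + 2)
    if log is not None:
        # Record the address of each 16-bit call for inspection.
        log.extend([addr, addr + 2])
    return lo | (hi << 16)


flash = bytes([0x78, 0x56, 0x34, 0x12])
calls = []
print(hex(read32_split(flash, 0, calls)), calls)
```

The guest still sees the full 32-bit value; only the number of calls into
the device model changes.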
Now, we have 2 boards that register the flash device in 32-bit
access:
- PPC: taihu_405ep
The CFI id matches the S29AL008J, which is 1MB in x16 mode, while
the QEMU code forces it to be 2MB, and Linux expects
a 4MB flash.
- ARM: Digic4
While the comment says "Samsung K8P3215UQB 64M Bit (4Mx16)",
this flash is 32Mb (2MB). Also note the CFI id does not match
the comment.
To avoid unexpected side effects, we revert commit 3ae0343db69,
and will clean the board code later.
* remotes/cohuck/tags/s390x-20190709:
s390x/tcg: move fallthrough annotation
s390: cpumodel: fix description for the new vector facility
s390x/cpumodel: Set up CPU model for AQIC interception
vfio-ccw: Test vfio_set_irq_signaling() return value
s390: cpumodel: fix description for the new vector facility
The new facility is called "Vector-Packed-Decimal-Enhancement Facility"
and not "Vector BCD enhancements facility 1". As the shortname might
have already found its way into some backports, let's keep vxbeh.
Alistair Francis [Thu, 20 Jun 2019 14:04:18 +0000 (07:04 -0700)]
tcg/riscv: Fix RISC-VH host build failure
Commit 269bd5d8 "cpu: Move the softmmu tlb to CPUNegativeOffsetState"
broke the RISC-V host build as there are two variables that are used but
not defined.
This patch renames the undefined variables mask_off and table_off to the
existing (but unused) mask_ofs and table_ofs variables.
Peter Maydell [Mon, 8 Jul 2019 16:40:05 +0000 (17:40 +0100)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2019-07-08-1' into staging
Merge tpm 2019/07/08 v1
# gpg: Signature made Mon 08 Jul 2019 15:04:46 BST
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2019-07-08-1:
hw/tpm: Only build tpm_ppi.o if any of TPM_TIS/TPM_CRB is built
Peter Maydell [Mon, 8 Jul 2019 14:21:20 +0000 (15:21 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- virtio-scsi: Fix request resubmission after I/O error with iothreads
- qcow2: Fix missing v2/v3 subformat aliases for amend
- qcow(1): More specific error message for wrong format version
- MAINTAINERS: update RBD block maintainer
# gpg: Signature made Mon 08 Jul 2019 15:17:27 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
qcow2: Allow -o compat=v3 during qemu-img amend
MAINTAINERS: update RBD block maintainer
block/qcow: Improve error when opening qcow2 files as qcow
virtio-scsi: restart DMA after iothread
qdev: add qdev_add_vm_change_state_handler()
vl: add qemu_add_vm_change_state_handler_prio()
Eric Blake [Fri, 5 Jul 2019 15:28:12 +0000 (10:28 -0500)]
qcow2: Allow -o compat=v3 during qemu-img amend
Commit b76b4f60 allowed '-o compat=v3' as an alias for the
less-appealing '-o compat=1.1' for 'qemu-img create' since we want to
use the QMP form as much as possible, but forgot to do likewise for
qemu-img amend. Also, it doesn't help that '-o help' doesn't list our
new preferred spellings.
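One way to keep create and amend from drifting apart again is to route both
through a single alias table. The Python sketch below shows the idea; it is
not the C code from the patch, the names are hypothetical, and the "0.10"
and "v2" pair is included on the assumption that version 2 has the same kind
of alias.

```python
# Each image format version can be spelled two ways (assumed alias table).
COMPAT_ALIASES = {
    "0.10": 2, "v2": 2,
    "1.1": 3, "v3": 3,
}


def parse_compat(value):
    """Map a compat option string to a format version number."""
    try:
        return COMPAT_ALIASES[value]
    except KeyError:
        raise ValueError("unknown compatibility level: %s" % value)


# Both spellings resolve to the same version, whichever command uses them.
for spelling in ("1.1", "v3"):
    print(spelling, "->", parse_compat(spelling))
```

With a shared table, help output can also be generated from the same keys so
the preferred spellings are listed.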
Stefan Hajnoczi [Thu, 20 Jun 2019 17:37:09 +0000 (18:37 +0100)]
virtio-scsi: restart DMA after iothread
When the 'cont' command resumes guest execution, the vm change state
handlers are invoked. Unfortunately there is no explicit ordering
between classic qemu_add_vm_change_state_handler() callbacks. When two
layers of code both use vm change state handlers, we don't control which
handler runs first.
virtio-scsi with iothreads hits a deadlock when a failed SCSI command is
restarted and completes before the iothread is re-initialized.
This patch uses the new qdev_add_vm_change_state_handler() API to
guarantee that virtio-scsi's virtio change state handler executes before
the SCSI bus children. This way DMA is restarted after the iothread has
re-initialized.
Stefan Hajnoczi [Thu, 20 Jun 2019 17:37:08 +0000 (18:37 +0100)]
qdev: add qdev_add_vm_change_state_handler()
Children sometimes depend on their parent's vm change state handler
having completed. Add a vm change state handler API for devices that
guarantees tree depth ordering.
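The ordering guarantee can be pictured with a toy registry in which each
handler carries a priority equal to its device's depth in the tree. This is
a sketch, not the QEMU API: the class, the path strings and the depth helper
are hypothetical, and only the resume direction is modelled.

```python
class ChangeStateNotifier:
    """Toy registry of vm change state handlers ordered by priority."""

    def __init__(self):
        self._handlers = []  # list of (priority, callback)

    def add_handler_prio(self, callback, priority):
        self._handlers.append((priority, callback))
        # sort() is stable, so equal priorities keep registration order.
        self._handlers.sort(key=lambda entry: entry[0])

    def notify(self, running):
        for _, callback in self._handlers:
            callback(running)


def device_depth(path):
    """Depth of a device in the tree, counted as the number of '/'."""
    return path.count("/")


calls = []
notifier = ChangeStateNotifier()
# Register the child first on purpose; depth still puts the parent first.
for path in ("virtio-scsi/scsi-bus/disk0", "virtio-scsi"):
    notifier.add_handler_prio(
        lambda running, p=path: calls.append(p), device_depth(path)
    )
notifier.notify(running=True)
print(calls)
```

Registration order no longer matters: the parent's handler always runs
before those of its children, which is what a child relying on its parent's
state needs.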