Git Repo - qemu.git/log

virtio: Report real progress in VQ aio poll handler

In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()"
cases are making true progress.

Currently the offending one is virtio-scsi event queue, whose handler
does nothing if no event is pending. As a result aio_poll() will spin on
the "non-empty" VQ and take 100% host CPU.

Fix this by reporting actual progress from virtio queue aio handlers.

Reported-by: Ed Swierk <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>
Tested-by: Ed Swierk <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

pci/pcie: don't assume cap id 0 is reserved

VFIO actually wants to create a capability with ID == 0.
This is done to make guest drivers skip the given capability.
pcie_add_capability then trips up on this capability
when looking for end of capability list.

To support this use-case, it's easy enough to switch to
e.g. 0xffffffff for these comparisons - we can be sure
it will never match a 16-bit capability ID.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Alex Williamson <[email protected]>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* GUEST_PANICKED improvements (Anton)
* vCont gdbstub rewrite (Claudio)
* Fix CPU creation with -device (Liyang)
* Logging fixes for pty chardevs (Ed)
* Makefile "move if changed" fix (Lin)
* First part of cpu_exec refactoring (me)
* SVM emulation fix (me)
* apic_delivered fix (Pavel)
* "info ioapic" fix (Peter)
* qemu-nbd socket activation (Richard)
* QOMification of mcf_uart (Thomas)

# gpg: Signature made Thu 16 Feb 2017 17:37:31 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg:                 aka "Paolo Bonzini <[email protected]>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (23 commits)
  target-i386: correctly propagate retaddr into SVM helpers
  vl: log available guest crash information
  report guest crash information in GUEST_PANICKED event
  i386/cpu: add crash-information QOM property
  Makefile: avoid leaving the temporary QEMU_PKGVERSION header file
  vl: Move the cpu_synchronize_all_post_init() after generic devices initialization
  qemu-nbd: Implement socket activation.
  qemu-doc: Clarify that -vga std is now the default
  cpu-exec: remove outermost infinite loop
  cpu-exec: avoid repeated sigsetjmp on interrupts
  cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
  cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
  cpu-exec: fix icount out-of-bounds access
  hw/char/mcf_uart: QOMify the ColdFire UART
  gdbstub: Fix vCont behaviour
  move vm_start to cpus.c
  char: drop data written to a disconnected pty
  apic: reset apic_delivered global variable on machine reset
  qemu-char: socket backend: disconnect on write error
  test-vmstate: remove yield_until_fd_readable
  ...

Signed-off-by: Peter Maydell <[email protected]>

target-i386: correctly propagate retaddr into SVM helpers

Commit 2afbdf8 ("target-i386: exception handling for memory helpers",
2015-09-15) changed tlb_fill's cpu_restore_state+raise_exception_err
to raise_exception_err_ra.  After this change, the cpu_restore_state
and raise_exception_err's cpu_loop_exit are merged into
raise_exception_err_ra's cpu_loop_exit_restore.

This actually fixed some bugs, but when SVM is enabled there is a
second path from raise_exception_err_ra to cpu_loop_exit.  This is
the VMEXIT path, and now cpu_vmexit is called without a
cpu_restore_state before.

The fix is to pass the retaddr to cpu_vmexit (via
cpu_svm_check_intercept_param).  All helpers can now use GETPC() to pass
the correct retaddr, too.

Cc: [email protected]
Fixes: 2afbdf84807d673eb682cb78158e11cdacbf4673
Reported-by: Alexander Boettcher <[email protected]>
Tested-by: Alexander Boettcher <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-upstream-pull-request' into staging

# gpg: Signature made Thu 16 Feb 2017 14:35:46 GMT
# gpg:                using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier (Red Hat) <[email protected]>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-upstream-pull-request:
  linux-user: Add FICLONE and FICLONERANGE ioctls
  linux-user: Use correct types in load_symbols()
  linux-user: fill target sigcontext struct accordingly
  linux-user: fix tcg/mmap test
  linux-user: fix settime old value location
  linux-user: Update m68k syscall definitions to match Linux 4.6
  linux-user: Update sh4 syscall definitions to match Linux 4.8
  linux-user: manage two new IFLA host message types
  linux-user: Fix mq_open
  linux-user: Fix readahead
  linux-user: Fix inotify_init1 support
  linux-user: Fix s390x safe-syscall for z900
  linux-user: drop __cygwin__ ifdef
  linux-user: remove ifdef __USER_MISC

Signed-off-by: Peter Maydell <[email protected]>

vl: log available guest crash information

There is a suitable log mask for the purpose.

Signed-off-by: Anton Nefedov <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
Message-Id: <1487053524 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

report guest crash information in GUEST_PANICKED event

it's not very convenient to use the crash-information property interface,
so provide a CPU class callback to get the guest crash information, and pass
that information in the event

Signed-off-by: Anton Nefedov <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
Message-Id: <1487053524 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

i386/cpu: add crash-information QOM property

Windows reports BSOD parameters through Hyper-V crash MSRs. This
information is very useful for initial crash analysis and thus
it would be nice to have a way to fetch it.

Signed-off-by: Anton Nefedov <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
Message-Id: <1487053524 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Makefile: avoid leaving the temporary QEMU_PKGVERSION header file

By commit 67a1de0d, When we perform 'git pull && make && sudo make install',
In 'make' stage a qemu-version.h.tmp will be generated. If the content of
qemu-version.h.tmp and qemu-version.h aren't consistent, The qemu-version.h.tmp
will be renamed to qemu-version.h. Because of the target FORCE, The same action
will be do again in 'make install' stage.

In 'make install' stage, If there is no qemu-version.h.tmp exists and we run
'make install' with sudo, The owner and group of new qemu-version.h.tmp will be
privileged user/group. When we run 'make' next time, qemu-version.h.tmp can't
be overwritten because of permission issue.

This patch removed qemu-version.h.tmp after build to fix this issue.

Signed-off-by: Lin Ma <[email protected]>
Message-Id: <20170215024030 [email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

vl: Move the cpu_synchronize_all_post_init() after generic devices initialization

At the Qemu initialization, we call the cpu_synchronize_all_post_init()
to synchronize All CPU states to KVM in the ./vl.c::main().

Currently, it is called before we initialize the CPUs, which is created
by "-device" command and parsed by generic devices initialization, So,
these CPUs may be ignored to synchronize.

The patch moves the cpu_synchronize_all_post_init func after generic
devices initialization to make sure that all the CPUs can be included.

Signed-off-by: Dou Liyang <[email protected]>
Message-Id: <1485916178 [email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>
Acked-by: Eduardo Habkost <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qemu-nbd: Implement socket activation.

Socket activation (sometimes known as systemd socket activation)
allows an Internet superserver to pass a pre-opened listening socket
to the process, instead of having qemu-nbd open a socket itself.  This
is done via the LISTEN_FDS and LISTEN_PID environment variables, and a
standard file descriptor range.

This change partially implements socket activation for qemu-nbd.  If
the environment variables are set correctly, then socket activation
will happen automatically, otherwise everything works as before.  The
limitation is that LISTEN_FDS must be 1.

Signed-off-by: Richard W.M. Jones <[email protected]>
Message-Id: <20170204100317 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

linux-user: Add FICLONE and FICLONERANGE ioctls

Add missing FICLONE and FICLONERANGE ioctls.

Signed-off-by: Helge Deller <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20170211222602 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

linux-user: Use correct types in load_symbols()

Coverity doesn't like the code in load_symbols() which assumes
it can use 'int' for a variable that might hold an offset into
the guest ELF file, because in a 64-bit guest that could
overflow. Guest binaries with 2GB sections aren't very likely
and this isn't a security issue because we fully trust the
guest linux-user binary anyway, but we might as well use the
right types, which will placate Coverity. Use uint64_t to
hold section sizes, and bail out if the symbol table is too
large rather than just overflowing an int.

(Coverity issue CID1005776)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <1486249533 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

linux-user: fill target sigcontext struct accordingly

A segfault is noticed when an emulated program uses any of ucontext
regs fields. Risu detected this issue in the following operation when
handling a signal:
  ucontext_t *uc = (ucontext_t*)uc;
  uc->uc_mcontext.regs->nip += 4;

but this works fine:
  uc->uc_mcontext.gp_regs[PT_NIP] += 4;

This patch set regs to a valid location as well as other sigcontext
fields.

Signed-off-by: Jose Ricardo Ziviani <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <1485900317 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

linux-user: fix tcg/mmap test

tests/tcg/mmap test fails with values other than default target page
size. When creating a map beyond EOF, extra anonymous pages are added up
to the target page boundary. Currently, this operation is performed only
when qemu_real_host_page_size < TARGET_PAGE_SIZE, but it should be
performed if the configured page size (qemu -p) is larger than
qemu_real_host_page_size too.

Signed-off-by: Marc-André Lureau <[email protected]>
[pranith: dropped checkpatch changes]
Signed-off-by: Pranith Kumar <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20170119151533 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

linux-user: fix settime old value location

old_value is the 4th argument of timer_settime(), not the 2nd.

Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: Pranith Kumar <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20170119151533 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

linux-user: Update m68k syscall definitions to match Linux 4.6

Signed-off-by: John Paul Adrian Glaubitz <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20170116224915 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

linux-user: Update sh4 syscall definitions to match Linux 4.8

Signed-off-by: John Paul Adrian Glaubitz <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-Id: <20170116223140 [email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

qemu-doc: Clarify that -vga std is now the default

The QEMU manual page states that Cirrus Logic is the default video
card if the user doesn't specify any. However this is not true since
QEMU 2.2.

Signed-off-by: Alberto Garcia <[email protected]>
Message-Id: <20170127094154 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

cpu-exec: remove outermost infinite loop

Reorganize the sigsetjmp so that the restart case falls through
to cpu_handle_exception and the execution loop.

Signed-off-by: Paolo Bonzini <[email protected]>

cpu-exec: avoid repeated sigsetjmp on interrupts

The sigsetjmp only needs to be prepared once for the whole execution
of cpu_exec. This patch takes care of the "== 0" side, using a
nested loop so that cpu_handle_interrupt goes straight back to
cpu_handle_exception without doing another sigsetjmp.

Signed-off-by: Paolo Bonzini <[email protected]>

cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt

The siglongjmp goes straight back to the beginning of cpu_exec's
outermost loop. We do not need a siglongjmp, we can simply
leave the inner TB execution loop.

Signed-off-by: Paolo Bonzini <[email protected]>

cpu-exec: tighten barrier on TCG_EXIT_REQUESTED

This seems to have worked just fine so far on weakly-ordered
architectures, but I don't see anything that prevents the
reordering from:

    store 1 to exit_request
    store 1 to tcg_exit_req
                                 load tcg_exit_req
                                 store 0 to tcg_exit_req
                                 load exit_request
                                 store 0 to exit_request
    store 1 to exit_request
    store 1 to tcg_exit_req

to this:

    store 1 to exit_request
    store 1 to tcg_exit_req
                                 load tcg_exit_req
                                 load exit_request
    store 1 to exit_request
    store 1 to tcg_exit_req
                                 store 0 to tcg_exit_req
                                 store 0 to exit_request

therefore losing a request.  It's possible that other memory barriers
(e.g. in rcu_read_unlock) are hiding it, but better safe than
sorry.

Signed-off-by: Paolo Bonzini <[email protected]>

cpu-exec: fix icount out-of-bounds access

When icount is active, tb_add_jump is surprisingly called with an
out of bounds basic block index. I have no idea how that can work,
but it does not seem like a good idea. Clear *last_tb for all
TB_EXIT_ICOUNT_EXPIRED cases, even when all you have to do is
refill icount_extra.

Signed-off-by: Paolo Bonzini <[email protected]>

hw/char/mcf_uart: QOMify the ColdFire UART

Use type_init() etc. to adapt the ColdFire UART
to the latest QEMU device conventions.

Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <1485586582 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

gdbstub: Fix vCont behaviour

When GDB issues a "vCont", QEMU was not handling it correctly when
multiple VCPUs are active.
For vCont, for each thread (VCPU), it can be specified whether to
single step, continue or stop that thread. The default is to stop a
thread.
However, when (for example) "vCont;s:2" is issued, all VCPUs continue
to run, although all but VCPU nr 2 are to be stopped.

This patch completely rewrites the vCont parsing code.

Please note that this improvement only works in system emulation mode,
when in userspace emulation mode the old behaviour is preserved.

Signed-off-by: Claudio Imbrenda <[email protected]>
Message-Id: <1487092068 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

move vm_start to cpus.c

This patch:

* moves vm_start to cpus.c.
* exports qemu_vmstop_requested, since it's needed by vm_start.
* extracts vm_prepare_start from vm_start; it does what vm_start did,
except restarting the cpus.
* vm_start now calls vm_prepare_start and then restarts the cpus.

Signed-off-by: Claudio Imbrenda <[email protected]>
Message-Id: <1487092068 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

char: drop data written to a disconnected pty

When a serial port writes data to a pty that's disconnected, drop the
data and return the length dropped. This avoids triggering pointless
retries in callers like the 16550A serial_xmit(), and causes
qemu_chr_fe_write() to write all data to the log file, rather than
logging only while a pty client like virsh console happens to be
connected.

Signed-off-by: Ed Swierk <[email protected]>
Message-Id: <1485870329 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

apic: reset apic_delivered global variable on machine reset

This patch adds call to apic_reset_irq_delivered when the virtual
machine is reset.

Signed-off-by: Pavel Dovgalyuk <[email protected]>
Message-Id: <20170131114054.276.62201.stgit@PASHA-ISP>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

qemu-char: socket backend: disconnect on write error

Socket backend read handler should normally perform a disconnect, however
the read handler may not get a chance to run if the frontend is not ready
(qemu_chr_be_can_write() == 0).

This means that in virtio-serial frontend case if
- the host has disconnected (giving EPIPE on socket write)
- and the guest has disconnected (-> frontend not ready -> backend
will not read)
- and there is still data (frontend->backend) to flush (has to be a really
tricky timing but nevertheless, we have observed the case in production)

This results in virtio-serial trying to flush this data continiously forming
a busy loop.

Solution: react on write error in the socket write handler.
errno is not reliable after qio_channel_writev_full(), so we may not get
the exact EPIPE, so disconnect on any error but QIO_CHANNEL_ERR_BLOCK which
io_channel_send_full() converts to errno EAGAIN.
We must not disconnect right away though, there still may be data to read
(see 4bf1cb0).

Signed-off-by: Anton Nefedov <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
CC: Paolo Bonzini <[email protected]>
CC: Daniel P. Berrange <[email protected]>
CC: Marc-AndrÃ© Lureau <[email protected]>
Message-Id: <1486045589 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

test-vmstate: remove yield_until_fd_readable

The function is not needed anymore now that migration is built on
top of QIOChannel.

Signed-off-by: Paolo Bonzini <[email protected]>

kvm/ioapic: correct kvm ioapic version

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <1486106298 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

ioapic: fix error report value of def version

It should be 0x20, rather than 0x11.

Signed-off-by: Peter Xu <[email protected]>
Message-Id: <1486106298 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm/ioapic: dump real object instead of a fake one

When we do "info ioapic" for kvm ioapic, we were building up a temporary
ioapic object. Let's fetch the real one and update correspond to the
real object as well.

This fixes printing uninitialized version field in
ioapic_print_redtbl().

Reported-by: Peter Maydell <[email protected]>
Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Message-Id: <1486106298 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Wed 15 Feb 2017 03:46:59 GMT
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net: e1000e: fix an infinite loop issue
  net: imx: limit buffer descriptor count
  colo-compare: sort TCP packet queue by sequence number
  net: e1000e: fix dead code in e1000e_write_packet_to_guest
  net: Mark 'vlan' parameter as deprecated

Signed-off-by: Peter Maydell <[email protected]>

net: e1000e: fix an infinite loop issue

This issue is like the issue in e1000 network card addressed in
this commit:
e1000: eliminate infinite loops on out-of-bounds transfer start.

Signed-off-by: Li Qiang <[email protected]>
Reviewed-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net: imx: limit buffer descriptor count

i.MX Fast Ethernet Controller uses buffer descriptors to manage
data flow to/fro receive & transmit queues. While transmitting
packets, it could continue to read buffer descriptors if a buffer
descriptor has length of zero and has crafted values in bd.flags.
Set an upper limit to number of buffer descriptors.

Reported-by: Li Qiang <[email protected]>
Signed-off-by: Prasad J Pandit <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: sort TCP packet queue by sequence number

Improve efficiency of TCP packet comparison.

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Li Zhijian <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net: e1000e: fix dead code in e1000e_write_packet_to_guest

Because is_first is declared inside a loop, it is always true. The store
is dead, and so is the "else" branch of "if (is_first)". is_last is
okay though.

Reported by Coverity.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net: Mark 'vlan' parameter as deprecated

The 'vlan' parameter is a continuous source of confusion for the users,
many people mix it up with the more common term VLAN (the link layer
packet encapsulation), and even if they realize that the QEMU 'vlan' is
rather some kind of network hub emulation, there is still a high risk
that they configure their QEMU networking in a wrong way with this
parameter (e.g. by hooking NICs together, so they get a 'loopback'
between one and the other NIC).
Thus at one point in time, we should finally get rid of the 'vlan'
feature in QEMU. Let's do a first step in this direction by declaring
the 'vlan' parameter as deprecated and informing the users to use the
'netdev' parameter instead.

Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

linux-user: manage two new IFLA host message types

Add QEMU_IFLA_GSO_MAX_SEGS and QEMU_IFLA_GSO_MAX_SIZE
in host_to_target_data_link_rtattr().

These two messages are sent by the host kernel when
we use "sudo".

Found with qemu-m68k and Debian etch-m68k (sudo 1.6.8p12-4) and
host kernel 4.7.6-200.fc24.x86_64

Signed-off-by: Laurent Vivier <[email protected]>
Message-Id: <1477530049 [email protected]>

linux-user: Fix mq_open

If fourth argument is NULL it should be passed without
using lock_user function which would, in that case, return
EFAULT, and system call supports passing NULL as fourth argument.

Signed-off-by: Lena Djokic <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Riku Voipio <[email protected]>

linux-user: Fix readahead

Calculation of 64-bit offset was not correct for all cases.

Signed-off-by: Lena Djokic <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Riku Voipio <[email protected]>

linux-user: Fix inotify_init1 support

This commit adds necessary conversion of argument passed to inotify_init1.
inotify_init1 flags can be IN_NONBLOCK and IN_CLOEXEC which rely on O_NONBLOCK
and O_CLOEXEC and those can have different values on different platforms.

Signed-off-by: Lena Djokic <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Riku Voipio <[email protected]>

linux-user: Fix s390x safe-syscall for z900

The LT instruction was added in the extended immediate facility
introduced with the z9-109 processor.

Cc: Riku Voipio <[email protected]>
Reported-by: Michael Tokarev <[email protected]>
Fixes: c9bc3437a905b660561a26cd4ecc64579843267b
Suggested-by: Aurelien Jarno <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Signed-off-by: Riku Voipio <[email protected]>

linux-user: drop __cygwin__ ifdef

linux-user doesn't work on cygwin anyways.

Cc: Richard Henderson <[email protected]>
Signed-off-by: Riku Voipio <[email protected]>

linux-user: remove ifdef __USER_MISC

This preprocessor macro isn't set anywhere. Remove
the check so -strace can show these options.

Signed-off-by: Riku Voipio <[email protected]>

Merge remote-tracking branch 'remotes/rth/tags/pull-or-20170214' into staging

Queued openrisc patches

# gpg: Signature made Mon 13 Feb 2017 21:21:03 GMT
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <[email protected]>"
# gpg:                 aka "Richard Henderson <[email protected]>"
# gpg:                 aka "Richard Henderson <[email protected]>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-or-20170214: (24 commits)
  target/openrisc: Optimize for r0 being zero
  target/openrisc: Tidy handling of delayed branches
  target/openrisc: Tidy ppc/npc implementation
  target/openrisc: Optimize l.jal to next
  target/openrisc: Fix madd
  target/openrisc: Implement muld, muldu, macu, msbu
  target/openrisc: Represent MACHI:MACLO as a single unit
  target/openrisc: Implement msync
  target/openrisc: Enable trap, csync, msync, psync for user mode
  target/openrisc: Set flags on helpers
  target/openrisc: Use movcond where appropriate
  target/openrisc: Keep SR_CY and SR_OV in a separate variables
  target/openrisc: Keep SR_F in a separate variable
  target/openrisc: Invert the decoding in dec_calc
  target/openrisc: Put SR[OVE] in TB flags
  target/openrisc: Streamline arithmetic and OVE
  target/openrisc: Rationalize immediate extraction
  target/openrisc: Tidy insn dumping
  target/openrisc: Implement lwa, swa
  target/openrisc: Fix exception handling status registers
  ...

Signed-off-by: Peter Maydell <[email protected]>

target/openrisc: Optimize for r0 being zero

The HW does not special-case r0, but the ABI specifies that r0 should
contain 0. If we expose this fact to the optimizer, we can simplify
a lot of the generated code. We must of course verify that r0==0, but
that is trivial to do with a TB flag.

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Tidy handling of delayed branches

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Tidy ppc/npc implementation

The NPC SPR is really only supposed to be used for FPGA debugging.
It contains the same contents as PC, unless one plays games. Follow
the or1ksim implementation in flushing delayed branch state when it
is changed.

The PPC SPR need not be updated every instruction, merely when we
exit the TB or attempt to read its contents.

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Optimize l.jal to next

This allows the tcg optimizer to see, and fold, all of the
constants involved in a GOT base register load sequence.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Fix madd

Note that the specification for lf.madd.s is confused. It's
the only mention of supposed FPMADDHI/FPMADDLO special registers.
On the other hand, or1ksim implements a somewhat normal non-fused
multiply and add. Mirror that.

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Implement muld, muldu, macu, msbu

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Represent MACHI:MACLO as a single unit

Significantly simplifies the implementation of the use of MAC.

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Implement msync

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Enable trap, csync, msync, psync for user mode

Not documented as disabled for user mode.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Set flags on helpers

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Use movcond where appropriate

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Keep SR_CY and SR_OV in a separate variables

This significantly streamlines carry and overflow production.

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Keep SR_F in a separate variable

This avoids having to keep merging and extracting the flag from SR.

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Invert the decoding in dec_calc

Decoding the opcodes in the right order reduces by 100+ lines.
Also, it happens to put the opcodes in the same order as Chapter 17.

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Put SR[OVE] in TB flags

Removes a call at execution time for overflow exceptions.

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Streamline arithmetic and OVE

Fix incorrect overflow calculation.  Move overflow exception check
to a helper function, to eliminate inline branches.  Remove some
incorrect special casing of R0.  Implement multiply inline.

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Rationalize immediate extraction

The architecture manual is consistent in using "I" for signed
fields and "K" for unsigned fields. Mirror that.

Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Tidy insn dumping

Avoids warnings from unused variables etc.

Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Bastian Koppelmann <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Implement lwa, swa

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Fix exception handling status registers

I am working on testing instruction emulation patches for the linux
kernel. During testing I found these 2 issues:

- sets DSX (delay slot exception) but never clears it
- EEAR for illegal insns should point to the bad exception (as per
openrisc spec) but its not

This patch fixes these two issues by clearing the DSX flag when not in a
delay slot and by setting EEAR to exception PC when handling illegal
instruction exceptions.

After this patch the openrisc kernel with latest patches boots great on
qemu and instruction emulation works.

Cc: [email protected]
Cc: [email protected]
Signed-off-by: Stafford Horne <[email protected]>
Message-Id: <20170113220028 [email protected]>
Signed-off-by: Richard Henderson <[email protected]>

linux-user: Honor CLONE_SETTLS for openrisc

Threads work much better when you set the TLS register.
This was fixed in the upstream kernel for Linux 4.9.

Signed-off-by: Richard Henderson <[email protected]>

linux-user: Fix openrisc cpu_loop

We need to handle EXCP_DEBUG and EXCP_INTERRUPT.
We need to send signals to the guest using queue_signal.

Signed-off-by: Richard Henderson <[email protected]>

linux-user: Add MMAP_SHIFT for openrisc

The page size on openrisc is 8k. Sync the shift
required for the mmap2 syscall.

Signed-off-by: Richard Henderson <[email protected]>

target/openrisc: Rename the cpu from or32 to or1k

This is in keeping with the toolchain and or1ksim.

Signed-off-by: Richard Henderson <[email protected]>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170213a' into staging

Migration

  Amit: migration: remove myself as maintainer
        MAINTAINERS: update my email address
  Ashijeet: migrate: Introduce zero RAM checks to skip RAM migration
  Pavel: Postcopy release RAM
  Halil: consolidate VMStateField.start
  Hailiang: COLO: fix setting checkpoint-delay not working properly
         COLO: Shutdown related socket fd while do failover
         COLO: Don't process failover request while loading VM's state
  Me:
     migration: Add VMSTATE_UNUSED_VARRAY_UINT32
     migration: Add VMSTATE_WITH_TMP
     tests/migration: Add test for VMSTATE_WITH_TMP
     virtio-net VMState conversion and new VMSTATE macros

# gpg: Signature made Mon 13 Feb 2017 17:36:39 GMT
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20170213a:
  virtio/migration: Migrate virtio-net to VMState
  tests/migration: Add test for VMSTATE_WITH_TMP
  migration: Add VMSTATE_WITH_TMP
  migration: Add VMSTATE_UNUSED_VARRAY_UINT32
  COLO: Don't process failover request while loading VM's state
  COLO: Shutdown related socket fd while do failover
  COLO: fix setting checkpoint-delay not working properly
  migration: consolidate VMStateField.start
  migrate: Introduce zero RAM checks to skip RAM migration
  migration: discard non-dirty ram pages after the start of postcopy
  add 'release-ram' migrate capability
  migration: add MigrationState arg for ram_save_/compressed_/page()
  MAINTAINERS: update my email address
  migration: remove myself as maintainer

Signed-off-by: Peter Maydell <[email protected]>

virtio/migration: Migrate virtio-net to VMState

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Message-Id: <20170203160651 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Merge fix against Halil's removal of the '_start' field in
VMSTATE_VBUFFER_MULTIPLY

tests/migration: Add test for VMSTATE_WITH_TMP

Add a test for VMSTATE_WITH_TMP to tests/test-vmstate.c

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Message-Id: <20170203160651 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Add VMSTATE_WITH_TMP

VMSTATE_WITH_TMP is for handling structures where some calculation
or rearrangement of the data needs to be performed before the data
hits the wire.
For example, where the value on the wire is an offset from a
non-migrated base, but the data in the structure is the actual pointer.

To use it, a temporary type is created and a vmsd used on that type.
The first element of the type must be 'parent' a pointer back to the
type of the main structure. VMSTATE_WITH_TMP takes care of allocating
and freeing the temporary before running the child vmsd.

The post_load/pre_save on the child vmsd can copy things from the parent
to the temporary using the parent pointer and do any other calculations
needed; it can then use normal VMSD entries to do the actual data
storage without having to fiddle around with qemu_get_*/qemu_put_*

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Message-Id: <20170203160651 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: Add VMSTATE_UNUSED_VARRAY_UINT32

VMSTATE_UNUSED_VARRAY_UINT32 is used to skip a chunk of the stream
that's an n-element array; note the array size and the dynamic value
read never get multiplied so there's no overflow risk.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Message-Id: <20170203160651 [email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

COLO: Don't process failover request while loading VM's state

We should not do failover work while the main thread is loading
VM's state. Otherwise the consistent of VM's memory and
device state will be broken.

We will restart the loading process after jump over the stage,
The new failover status 'RELAUNCH' will help to record if we
need to restart the process.

Cc: Eric Blake <[email protected]>
Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Li Zhijian <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Message-Id: <1484657864 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Added a missing '(Since 2.9)'

COLO: Shutdown related socket fd while do failover

If the net connection between primary host and secondary host breaks
while COLO/COLO incoming threads are doing read() or write().
It will block until connection is timeout, and the failover process
will be blocked because of it.

So it is necessary to shutdown all the socket fds used by COLO
to avoid this situation. Besides, we should close the corresponding
file descriptors after failvoer BH shutdown them,
Or there will be an error.

Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Li Zhijian <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Cc: Dr. David Alan Gilbert <[email protected]>
Message-Id: <1484657864 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

COLO: fix setting checkpoint-delay not working properly

If we set checkpoint-delay through command 'migrate-set-parameters',
It will not take effect until we finish last sleep chekpoint-delay,
That's will be offensive espeically when we want to change its value
from an extreme big one to a proper value.

Fix it by using timer to realize checkpoint-delay.

Signed-off-by: zhanghailiang <[email protected]>
Message-Id: <1484657864 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>

migration: consolidate VMStateField.start

The member VMStateField.start is used for two things, partial data
migration for VBUFFER data (basically provide migration for a
sub-buffer) and for locating next in QTAILQ.

The implementation of the VBUFFER feature is broken when VMSTATE_ALLOC
is used. This however goes unnoticed because actually partial migration
for VBUFFER is not used at all.

Let's consolidate the usage of VMStateField.start by removing support
for partial migration for VBUFFER.

Signed-off-by: Halil Pasic <[email protected]>
Message-Id: <20170203175217 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migrate: Introduce zero RAM checks to skip RAM migration

Migration of a "none" machine with no RAM crashes abruptly as
bitmap_new() fails and thus aborts. Instead place zero RAM checks at
appropriate places to skip migration of RAM in this case and complete
migration successfully for devices only.

Signed-off-by: Ashijeet Acharya <[email protected]>
Message-Id: <1486564125 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: discard non-dirty ram pages after the start of postcopy

After the start of postcopy migration there are some non-dirty pages which have
already been migrated. These pages are no longer needed on the source vm so that
we can free them and it doen't hurt to complete the migration.

Signed-off-by: Pavel Butsykin <[email protected]>
Message-Id: <20170203152321 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

add 'release-ram' migrate capability

This feature frees the migrated memory on the source during postcopy-ram
migration. In the second step of postcopy-ram migration when the source vm
is put on pause we can free unnecessary memory. It will allow, in particular,
to start relaxing the memory stress on the source host in a load-balancing
scenario.

Signed-off-by: Pavel Butsykin <[email protected]>
Message-Id: <20170203152321 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Manually merged in Pavel's 'migration: madvise error_report fixup!'

migration: add MigrationState arg for ram_save_/compressed_/page()

Cosmetic patch. The use of ms variable instead of migrate_get_current()
looks nicer, especially when there reuse.

Signed-off-by: Pavel Butsykin <[email protected]>
Message-Id: <20170203152321 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

MAINTAINERS: update my email address

I'm leaving my job at Red Hat, this email address will stop working next week.
Update it to one that I will have access to later.

Signed-off-by: Amit Shah <[email protected]>
Message-Id: <1486120433 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: remove myself as maintainer

I'm switching jobs, and I'm not sure I can continue maintaining migration.

Signed-off-by: Amit Shah <[email protected]>
Message-Id: <1486120416 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Mon 13 Feb 2017 16:29:26 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg:                 aka "Stefan Hajnoczi <[email protected]>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  Makefile: Make "install" depend on "trace-events-all"
  docs: update manpage for stderr->log rename

Signed-off-by: Peter Maydell <[email protected]>

Makefile: Make "install" depend on "trace-events-all"

We install this file to data dir but since 0ab8ed18 it's no longer
required by any objects during "make". List it explicitly as a depended
target of install and fix the broken "make install" command.

Signed-off-by: Fam Zheng <[email protected]>
Message-id: 20170204143245 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

docs: update manpage for stderr->log rename

With commit ed7f5f1d8db06fc31352a5ef4f54985e630c575a the name of
this backend changed from “stderr” to “log”.

Signed-off-by: Philipp Gesang <[email protected]>
Message-id: 20170202114101 [email protected]
Signed-off-by: Stefan Hajnoczi <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20170213-1' into staging

vga: bugfixes for cirrus and virtio-gpu

# gpg: Signature made Mon 13 Feb 2017 08:14:47 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-vga-20170213-1:
  Revert "cirrus: allow zero source pitch in pattern fill rops"
  cirrus: fix patterncopy checks
  cirrus: replace debug printf with trace points
  vga: replace debug printf with trace points
  virtio-gpu: fix resource leak in virgl_cmd_resource_unref
  virtio-gpu: fix memory leak in set scanout

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-02-12' into staging

Block patches

# gpg: Signature made Sun 12 Feb 2017 01:26:20 GMT
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-02-12: (21 commits)
  qemu-img: Avoid setting ret to unused value in img_convert()
  qemu-img: Use qemu_strtoul() rather than raw strtoul()
  qemu-io: don't allow I/O operations larger than BDRV_REQUEST_MAX_BYTES
  qcow2: Optimize the refcount-block overlap check
  qemu-io: Add failure regression tests
  qemu-iotests: Add _unsupported_fmt helper
  qemu-io: Return non-zero exit code on failure
  block/nfs: fix naming of runtime opts
  block/nfs: fix NULL pointer dereference in URI parsing
  block: bdrv_invalidate_cache: invalidate children first
  block/qapi: reduce the execution time of qmp_query_blockstats
  block/qapi: reduce the coupling between the bdrv_query_stats and bdrv_query_bds_stats
  qemu-iotest: test to lookup protocol-based image with relative backing
  qemu-iotests: Don't create fifos / pidfiles with protocol paths
  block: check full backing filename when searching protocol filenames
  block/vmdk: Fix the endian problem of buf_len and lba
  iotests: record separate timings per format,protocol pair
  iotests: Fix reference output for 059
  qapi: Tweak error message of bdrv_query_image_info
  qemu-img: Improve commit invalid base message
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20170210.0' into staging

VFIO updates 2017-02-10

- Fix GTT wrap-around for Skylake IGD assignment (Alex Williamson)
- Tag vfio-pci-igd-lpc-bridge as bridge device category (Thomas Huth)
- Don't build calxeda-xgmac or amd-xgbe except on ARM (Thomas Huth)

# gpg: Signature made Fri 10 Feb 2017 21:34:33 GMT
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20170210.0:
  hw/vfio: Add CONFIG switches for calxeda-xgmac and amd-xgbe
  hw/vfio/pci-quirks: Set category of the "vfio-pci-igd-lpc-bridge" device
  vfio-pci: Fix GTT wrap-around for Skylake+ IGD

Signed-off-by: Peter Maydell <[email protected]>

qemu-img: Avoid setting ret to unused value in img_convert()

Coverity points out that we assign the return value from
bdrv_snapshot_load_tmp() to 'ret' in img_convert(), but then
never use that variable. (We check for failure by looking
at local_err instead.) Drop the unused assignment, bringing
the call into line with the following call to
bdrv_snapshot_laod_tmp_by_id_or_name().

(Fixes CID 1247240.)

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1486744104 [email protected]
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-img: Use qemu_strtoul() rather than raw strtoul()

Some of the argument parsing in qemu-img uses strtoul() to parse
integer arguments. This is tricky to get correct and in fact the
code does not get it right, because it assigns the result of
strtoul() to an 'int' variable and then tries to check for > INT_MAX.
Coverity correctly complains that the comparison is always false.

Rewrite to use qemu_strtoul(), which has a saner convention for
reporting conversion failures.

(Fixes CID 1356421, CID 1356422, CID 1356423.)

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1486744104 [email protected]
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-io: don't allow I/O operations larger than BDRV_REQUEST_MAX_BYTES

Passing a request size larger than BDRV_REQUEST_MAX_BYTES to any of the
I/O commands results in an error. While 'read' and 'write' handle the
error correctly, 'aio_read' and 'aio_write' hit an assertion:

blk_aio_read_entry: Assertion `rwco->qiov->size == acb->bytes' failed.

The reason is that the QEMU I/O code cannot handle request sizes
larger than BDRV_REQUEST_MAX_BYTES, so this patch makes qemu-io check
that all values are within range.

Signed-off-by: Alberto Garcia <[email protected]>
Message-id: 79f66648c685929a144396bda24d13a207131dcf.1485878688 [email protected]
[mreitz: Use BDRV_REQUEST_MAX_BYTES instead of INT_MAX]
Signed-off-by: Max Reitz <[email protected]>

qcow2: Optimize the refcount-block overlap check

The metadata overlap checks introduced in a40f1c2add help detect
corruption in the qcow2 image by verifying that data writes don't
overlap with existing metadata sections.

The 'refcount-block' check in particular iterates over the refcount
table in order to get the addresses of all refcount blocks and check
that none of them overlap with the region where we want to write.

The problem with the refcount table is that since it always occupies
complete clusters its size is usually very big. With the default
values of cluster_size=64KB and refcount_bits=16 this table holds 8192
entries, each one of them enough to map 2GB worth of host clusters.

So unless we're using images with several TB of allocated data this
table is going to be mostly empty, and iterating over it is a waste of
CPU. If the storage backend is fast enough this can have an effect on
I/O performance.

This patch keeps the index of the last used (i.e. non-zero) entry in
the refcount table and updates it every time the table changes. The
refcount-block overlap check then uses that index instead of reading
the whole table.

In my tests with a 4GB qcow2 file stored in RAM this doubles the
amount of write IOPS.

Signed-off-by: Alberto Garcia <[email protected]>
Message-id: 20170201123828 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-io: Add failure regression tests

Add regression tests checking that qemu-io fails with non-zero exit code
when reading non-existing file or using the wrong image format.

Signed-off-by: Nir Soffer <[email protected]>
Message-id: 20170201003120 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-iotests: Add _unsupported_fmt helper

This helper allows adding tests supporting any format expect the
specified formats. This may be useful to test that many formats behave
in a common way.

Signed-off-by: Nir Soffer <[email protected]>
Message-id: 20170201003120 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>

qemu-io: Return non-zero exit code on failure

The result of openfile was not checked, leading to failure deep in the
actual command with confusing error message, and exiting with exit code 0.

Here is a simple example - trying to read with the wrong format:

    $ touch file
    $ qemu-io -f qcow2 -c 'read -P 1 0 1024' file; echo $?
    can't open device file: Image is not in qcow2 format
    no file open, try 'help open'
    0

With this patch, we fail earlier with exit code 1:

    $ ./qemu-io -f qcow2 -c 'read -P 1 0 1024' file; echo $?
    can't open device file: Image is not in qcow2 format
    1

Failing earlier, we don't log this error now:

    no file open, try 'help open'

But some tests expected it; the line was removed from the test output.

Signed-off-by: Nir Soffer <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-id: 20170201003120 [email protected]
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Max Reitz <[email protected]>