Git Repo - qemu.git/log

Merge remote-tracking branch 'remotes/dgibson/tags/libfdt-20181002' into staging

Update dtc submodule to v1.4.7

We have some upcoming things planned for ppc that will require some
newer libfdt features.  In preparation, update the dtc/libfdt
submodule to upstreasm version v1.4.7.

# gpg: Signature made Tue 02 Oct 2018 05:23:43 BST
# gpg:                using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/libfdt-20181002:
  Update dtc/libfdt submodule to v1.4.7

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/xtensa/tags/20181001-xtensa' into staging

target/xtensa: preparation for FLIX support

Separate generation of per-instruction code (such as raising exceptions
and terminating TB) from per-opcode code.

# gpg: Signature made Mon 01 Oct 2018 19:14:34 BST
# gpg:                using RSA key 51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <[email protected]>"
# gpg:                 aka "Max Filippov <[email protected]>"
# gpg:                 aka "Max Filippov <[email protected]>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20181001-xtensa:
  target/xtensa: extract gen_check_interrupts call
  target/xtensa: make rsr/wsr helpers return void
  target/xtensa: extract unconditional TB termination via slot 0
  target/xtensa: always end TB on CCOUNT access/CCOMPARE write
  target/xtensa: change SR number checks to assertions
  target/xtensa: extract unconditional TB termination
  target/xtensa: extract test for division by zero
  target/xtensa: extract test for cpdisabled exception
  target/xtensa: extract test for alloca exception
  target/xtensa: extract test for window underflow exception
  target/xtensa: extract test for window overflow exception
  target/xtensa: extract test for debug exception
  target/xtensa: extract test for syscall instruction
  target/xtensa: extract test for privileged instruction
  target/xtensa: extract test for an illegal instruction

Signed-off-by: Peter Maydell <[email protected]>

Update dtc/libfdt submodule to v1.4.7

dtc v1.4.7 contains a bunch of improvements to make libfdt safer against
handling a corrupted or malicious tree, which is a good thing to have. It
also includes an explicit fdt checking function that we'll be wanting in
future.

Signed-off-by: David Gibson <[email protected]>

target/xtensa: extract gen_check_interrupts call

- mark instructions that affect active IRQ level;
- put call for gen_check_interrupts right after the instruction
translation; when FLIX is enabled it will need to appear before
other exits from the TB as well;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: make rsr/wsr helpers return void

Now that all logic for TB termination is extracted from rsr/wsr their
return value is not used and may be dropped.

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract unconditional TB termination via slot 0

- mark instructions that require TB termination via slot 0;
- put TB termination right after the instruction translation loop, if
termination w/o TB linking wasn't requested;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: always end TB on CCOUNT access/CCOMPARE write

Currently we only end TB in icount mode, because access to CCOUNT or
write to CCOMPARE are IO operations. Simplify the behaviour a bit and
end TB unconditionally.

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: change SR number checks to assertions

Opcode decoding with libisa takes care about range of valid group SRs,
like CCOMPARE, IBREAKA, DBREAKA or DBREAKC. Turn range checks in wsr
implementations into assertions.

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract unconditional TB termination

- mark all instructions that exit TB and require dynamic search for the
next TB;
- put TB termination right after the instruction translation loop;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for division by zero

- mark quos/quou/rems/remu instructions;
- drop parameter 0 from the translate_quou and split translate_remu from
it;
- put test for division by zero exception right after the coprocessor
exception test;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for cpdisabled exception

- add XtensaOpcodeOps::coprocessor with bitmask of coprocessors used by
  the instruction;
- replace coprocessor id parameter of gen_check_cpenable with the
  bitmask of used coprocessors;
- collect coprocessor IDs used by an instruction in the disassembly
  loop;
- put test for coprocessor disabled exception after the alloca test;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for alloca exception

- mark movsp instruction;
- put test for alloca exception right after the test for window
underflow;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for window underflow exception

- mark retw and retw.n instructions;
- extract window inderflow test from retw helper;
- put underflow exception check generation right after the overflow
check;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for window overflow exception

- add ps.callinc to the TB flags, that allows testing all instructions
  for window overflow statically;
- drop gen_window_check* functions; replace them with get_window_check
  that accepts bitmask of used registers;
- add XtensaOpcodeOps::test_overflow that returns bitmask of implicitly
  used registers; use it for entry and call{,x}{4,8,12};
- drop window overflow test from the entry helper;
- drop parameter 0 from translate_[di]cache and use translate_nop for
  d/i cache opcodes that don't need memory accessibility check;
- add bitmask XtensaOpcodeOps::windowed_register_op that marks opcode
  arguments that refer to windowed registers;
- translate windowed_register_op mask to a mask of actually used
  registers in the disassembly loop;
- add check for window overflow right after the check for debug
  exception;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for debug exception

- mark break and break.n instructions;
- collect debug cause bits from parameter 0 of instructions marked for
debug exception;
- put debug exception check right after syscall check;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for syscall instruction

- mark syscall instruction;
- put syscall exception check right after privileged exception check;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for privileged instruction

- mark privileged instructions;
- put single privileged instruction check after disassembly loop;
- translate_[di]cache: drop parameter 0, shift parameters one down;

Signed-off-by: Max Filippov <[email protected]>

target/xtensa: extract test for an illegal instruction

- TB flags: add XTENSA_TBFLAG_CWOE that corresponds to the architectural
CWOE state;
- entry: move CWOE check from the helper to the test_ill_entry;
- retw: move CWOE check from the helper to the test_ill_retw;
- separate instruction disassembly loop and translation loop; save
disassembly results in local array;

Signed-off-by: Max Filippov <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- qcow2 cache option default changes (Linux: 32 MB maximum, limited by
  whatever cache size can be made use of with the specific image;
  default cache-clean-interval of 10 minutes)
- reopen: Allow specifying unchanged child node references, and changing
  a few generic options (discard, detect-zeroes)
- Fix werror/rerror defaults for -device drive=<node-name>
- Test case fixes

# gpg: Signature made Mon 01 Oct 2018 18:17:35 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (23 commits)
  tests/test-bdrv-drain: Fix too late qemu_event_reset()
  test-replication: Lock AioContext around blk_unref()
  qcow2: Fix cache-clean-interval documentation
  block-backend: Set werror/rerror defaults in blk_new()
  qcow2: Explicit number replaced by a constant
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Resize the cache upon image resizing
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Make sizes more humanly readable
  include: Add a lookup table of sizes
  qcow2: Options' documentation fixes
  block: Allow changing 'detect-zeroes' on reopen
  block: Allow changing 'discard' on reopen
  file-posix: Forbid trying to change unsupported options during reopen
  block: Forbid trying to change unsupported options during reopen
  block: Allow child references on reopen
  block: Don't look for child references in append_open_options()
  block: Remove child references from bs->{options,explicit_options}
  ...

Signed-off-by: Peter Maydell <[email protected]>

tests/test-bdrv-drain: Fix too late qemu_event_reset()

qemu_event_reset() must be called before the AIO request in a different
iothread is submitted. Otherwise the request could be completed before
we do the qemu_event_reset() and the test would hang in
qemu_event_wait().

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Tested-by: Max Reitz <[email protected]>

test-replication: Lock AioContext around blk_unref()

Recently, the test case has started failing because some job related
functions want to drop the AioContext lock even though it hasn't been
taken:

    (gdb) bt
    #0  0x00007f51c067c9fb in raise () from /lib64/libc.so.6
    #1  0x00007f51c067e77d in abort () from /lib64/libc.so.6
    #2  0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
    #3  0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96
    #4  0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516
    #5  0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738
    #6  0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986
    #7  0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941
    #8  0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148
    #9  0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420
    #10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629
    #11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685
    #12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783
    #13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402
    #14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457
    #15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478
    #16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    #17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    #18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    #19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0
    #20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0
    #21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581

It is yet unclear whether this should really be considered a bug in the
test case or whether blk_unref() should work for callers that haven't
taken the AioContext lock, but in order to fix the build tests quickly,
just take the AioContext lock around blk_unref().

Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Fix cache-clean-interval documentation

Fixing cache-clean-interval documentation following the recent change to
a default of 600 seconds on supported plarforms (only Linux currently).

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block-backend: Set werror/rerror defaults in blk_new()

Currently, the default values for werror and rerror have to be set
explicitly with blk_set_on_error() by the callers of blk_new(). The only
caller actually doing this is blockdev_init(), which is called for
BlockBackends created using -drive.

In particular, anonymous BlockBackends created with
-device ...,drive=<node-name> didn't get the correct default set and
instead defaulted to the integer value 0 (= BLOCKDEV_ON_ERROR_REPORT).
This is the intended default for rerror anyway, but the default for
werror should be BLOCKDEV_ON_ERROR_ENOSPC.

Set the defaults in blk_new() instead so that they apply no matter what
way the BlockBackend was created.

Cc: [email protected]
Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/ui-20181001-pull-request' into staging

ui: some small fixes/improvements.

# gpg: Signature made Mon 01 Oct 2018 11:42:16 BST
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/ui-20181001-pull-request:
  gtk: add zoom-to-fit to gtk options.
  vnc: call sasl_server_init() only when required
  sdl2: show console #0 unconditionally

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/usb-20181001-pull-request' into staging

usb: fixes for mtp, hub and ohci.

# gpg: Signature made Mon 01 Oct 2018 10:28:36 BST
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/usb-20181001-pull-request:
  ohci: set effectively usb frame rate to 1kHz
  usb-hub: clear suspend on detach
  usb-mtp: reset ObjectInfo dataset size on cleanup
  doc: replace x-root with rootdir for usb-mtp
  usb-mtp: fix error conditions for write operation

Signed-off-by: Peter Maydell <[email protected]>

qcow2: Explicit number replaced by a constant

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Set the default cache-clean-interval to 10 minutes

The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

* For non-Linux platforms the default is kept at 0, because
cache-clean-interval is not supported there yet.

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Resize the cache upon image resizing

The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Increase the default upper limit on the L2 cache size

The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).

This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Assign the L2 cache relatively to the image size

Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Avoid duplication in setting the refcount cache size

The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Make sizes more humanly readable

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

include: Add a lookup table of sizes

Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.

This table is generatred using the following AWK script:

BEGIN {
suffix="KMGTPE";
for(i=10; i<64; i++) {
val=2**i;
s=substr(suffix, int(i/10), 1);
n=2**(i%10);
pad=21-int(log(n)/log(10));
printf("#define S_%d%siB %*d\n", n, s, pad, val);
}
}

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qcow2: Options' documentation fixes

Signed-off-by: Leonid Bloch <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Allow changing 'detect-zeroes' on reopen

'detect-zeroes' is one of the basic BlockdevOptions available for all
drivers, but it's not handled by bdrv_reopen_prepare(), so any attempt
to change it results in an error:

(qemu) qemu-io virtio0 "reopen -o detect-zeroes=on"
Cannot change the option 'detect-zeroes'

Since there's no reason why we shouldn't allow changing it and the
implementation is simple let's just do it.

Signed-off-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Allow changing 'discard' on reopen

'discard' is one of the basic BlockdevOptions available for all
drivers, but it's not handled by bdrv_reopen_prepare() so any attempt
to change it results in an error:

(qemu) qemu-io virtio0 "reopen -o discard=on"
Cannot change the option 'discard'

Since there's no reason why we shouldn't allow changing it and the
implementation is simple let's just do it.

Signed-off-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

file-posix: Forbid trying to change unsupported options during reopen

The file-posix code is used for the "file", "host_device" and
"host_cdrom" drivers, and it allows reopening images. However the only
option that is actually processed is "x-check-cache-dropped", and
changes in all other options (e.g. "filename") are silently ignored:

(qemu) qemu-io virtio0 "reopen -o file.filename=no-such-file"

While we could allow changing some of the other options, let's keep
things as they are for now but return an error if the user tries to
change any of them.

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Forbid trying to change unsupported options during reopen

The bdrv_reopen_prepare() function checks all options passed to each
BlockDriverState (in the reopen_state->options QDict) and makes all
necessary preparations to apply the option changes requested by the
user.

Options are removed from the QDict as they are processed, so at the
end of bdrv_reopen_prepare() only the options that can't be changed
are left. Then a loop goes over all remaining options and verifies
that the old and new values are identical, returning an error if
they're not.

The problem is that at the moment there are options that are removed
from the QDict although they can't be changed. The consequence of this
is any modification to any of those options is silently ignored:

(qemu) qemu-io virtio0 "reopen -o discard=on"

This happens when all options from bdrv_runtime_opts are removed
from the QDict but then only a few of them are processed. Since
it's especially important that "node-name" and "driver" are not
changed, the code puts them back into the QDict so they are checked
at the end of the function. Instead of putting only those two options
back into the QDict, this patch puts all unprocessed options using
qemu_opts_to_qdict().

update_flags_from_options() also needs to be modified to prevent
BDRV_OPT_CACHE_NO_FLUSH, BDRV_OPT_CACHE_DIRECT and BDRV_OPT_READ_ONLY
from going back to the QDict.

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Allow child references on reopen

In the previous patches we removed all child references from
bs->{options,explicit_options} because keeping them is useless and
wrong.

Because of this, any attempt to reopen a BlockDriverState using a
child reference as one of its options would result in a failure,
because bdrv_reopen_prepare() would detect that there's a new option
(the child reference) that wasn't present in bs->options.

But passing child references on reopen can be useful. It's a way to
specify a BDS's child without having to pass recursively all of the
child's options, and if the reference points to a different BDS then
this can allow us to replace the child.

However, replacing the child is something that needs to be implemented
case by case and only when it makes sense. For now, this patch allows
passing a child reference as long as it points to the current child of
the BlockDriverState.

It's also important to remember that, as a consequence of the
previous patches, this child reference will be removed from
bs->{options,explicit_options} after the reopening has been completed.

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Don't look for child references in append_open_options()

In the previous patch we removed child references from bs->options, so
there's no need to look for them here anymore.

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Remove child references from bs->{options,explicit_options}

Block drivers allow opening their children using a reference to an
existing BlockDriverState. These references remain stored in the
'options' and 'explicit_options' QDicts, but we don't need to keep
them once everything is open.

What is more important, these values can become wrong if the children
change:

    $ qemu-img create -f qcow2 hd0.qcow2 10M
    $ qemu-img create -f qcow2 hd1.qcow2 10M
    $ qemu-img create -f qcow2 hd2.qcow2 10M
    $ $QEMU -drive if=none,file=hd0.qcow2,node-name=hd0 \
            -drive if=none,file=hd1.qcow2,node-name=hd1,backing=hd0 \
            -drive file=hd2.qcow2,node-name=hd2,backing=hd1

After this hd2 has hd1 as its backing file. Now let's remove it using
block_stream:

    (qemu) block_stream hd2 0 hd0.qcow2

Now hd0 is the backing file of hd2, but hd2's options QDicts still
contain backing=hd1.

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

file-posix: x-check-cache-dropped should default to false on reopen

The default value of x-check-cache-dropped is false. There's no reason
to use the previous value as a default in raw_reopen_prepare() because
bdrv_reopen_queue_child() already takes care of putting the old
options in the BDRVReopenState.options QDict.

If x-check-cache-dropped was previously set but is now missing from
the reopen QDict then it should be reset to false.

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qemu-io: Fix writethrough check in reopen

"qemu-io reopen" doesn't allow changing the writethrough setting of
the cache, but the check is wrong, causing an error even on a simple
reopen with the default parameters:

   $ qemu-img create -f qcow2 hd.qcow2 1M
   $ qemu-system-x86_64 -monitor stdio -drive if=virtio,file=hd.qcow2
   (qemu) qemu-io virtio0 reopen
   Cannot change cache.writeback: Device attached

Signed-off-by: Alberto Garcia <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

file-posix: Include filename in locking error message

Image locking errors happening at device initialization time doesn't say
which file cannot be locked, for instance,

-device scsi-disk,drive=drive-1: Failed to get shared "write" lock
Is another process using the image?

could refer to either the overlay image or its backing image.

Hoist the error_append_hint to the caller of raw_check_lock_bytes where
file name is known, and include it in the error hint.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/vga-20180927-pull-request' into staging

vga: add edid support, qxl bugfixes.

# gpg: Signature made Thu 27 Sep 2018 08:12:32 BST
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/vga-20180927-pull-request:
  qxl: support mono cursors with inverted colors
  qxl: use guest_monitor_config for local renderer.
  display/stdvga: add edid support.
  display/edid: add DEFINE_EDID_PROPERTIES
  display/edid: add region helper.
  display/edid: add qemu_edid_size()
  display/edid: add edid generator to qemu.

Signed-off-by: Peter Maydell <[email protected]>

gtk: add zoom-to-fit to gtk options.

This allows to set the option on the command line, i.e. "-display
gtk,zoom-to-fit={on,off}", overriding the default chosen by qemu.

Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Message-id: 20180827095620 [email protected]

vnc: call sasl_server_init() only when required

VNC server is calling sasl_server_init() during startup of QEMU, even
if SASL auth has not been enabled.

This may create undesirable warnings like "Could not find keytab file:
/etc/qemu/krb5.tab" when the user didn't configure SASL on host and
started VNC server.

Instead, only initialize SASL when needed. Note that HMP/QMP "change
vnc" calls vnc_display_open() again, which will initialize SASL if
needed.

Fix assignment in if condition, while touching this code.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1609327

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Message-id: 20180907063634 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

sdl2: show console #0 unconditionally

Otherwise sdl2 will show no window in case no graphical
display device is present.

Reproducer: qemu -nodefaults -display sdl -serial vc

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 20180912114300 [email protected]

ohci: set effectively usb frame rate to 1kHz

USB frame rate is slightly lower than 1kHz: ie. ~950Hz.
Thus usb-audio device is not able to perform a simple audio playback
without underruns on audio backend.
eg. "-device pci-ohci,id=ohci -device usb-audio,bus=ohci.0" vs PulseAudio
backend. more than 50 underruns are observed per second.

Update ohci_sof_time computation, using QEMU_CLOCK_VIRTUAL in
ohci_usb_start(), and increment by usb_frame_time in ohci_sof()
makes USB frame rate close to 1kHz.
This way, no audio underrun are observed during audio playback.

Signed-off-by: Miguel GAIO <[email protected]>
Message-Id: <20180927151936 [email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>

usb-hub: clear suspend on detach

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 20180912114012 [email protected]

usb-mtp: reset ObjectInfo dataset size on cleanup

Stale values in this field may result in qemu
expecting more data on the next operation

Signed-off-by: Bandan Das <[email protected]>
Message-id: 20180907220851 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

doc: replace x-root with rootdir for usb-mtp

Signed-off-by: Bandan <[email protected]>
Message-id: 20180907220851 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

usb-mtp: fix error conditions for write operation

Return STORE_FULL if we can't write all the bytes but
return incomplete transfer if data received is less then
what was specified in the metadata. Also, use d->offset
as the file size which is valid for all file sizes.

Signed-off-by: Bandan <[email protected]>
Message-id: 20180907220851 [email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-09-26' into staging

nbd patches for 2018-09-26

Fixes for external clients; add reminder to revisit naming of x- command

- Vladimir Sementsov-Ogievskiy: nbd/server: send more than one extent of base:allocation context
- John Snow: qapi: bitmap-merge: document name change
- Vladimir Sementsov-Ogievskiy: nbd/server: fix bitmap export

# gpg: Signature made Thu 27 Sep 2018 03:40:03 BST
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2018-09-26:
  nbd/server: send more than one extent of base:allocation context
  qapi: bitmap-merge: document name change
  nbd/server: fix bitmap export

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180926' into staging

Queued tcg patches

# gpg: Signature made Wed 26 Sep 2018 19:27:22 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <[email protected]>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180926:
  tcg/i386: fix vector operations on 32-bit hosts
  qht-bench: add -p flag to precompute hash values
  qht: constify arguments to some internal functions
  qht: constify qht_statistics_init
  qht: constify qht_lookup
  qht: fix comment in qht_bucket_remove_entry
  qht: drop ht argument from qht iterators
  test-qht: speed up + test qht_resize
  test-qht: test deletion of the last entry in a bucket
  test-qht: test removal of non-existent entries
  test-qht: test qht_iter_remove
  qht: add qht_iter_remove
  qht: remove unused map param from qht_remove__locked

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180926a' into staging

Migration pull 2018-09-26

This supercedes Juan's pull from the 13th

# gpg: Signature made Wed 26 Sep 2018 18:07:30 BST
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180926a:
  migration/ram.c: Avoid taking address of fields in packed MultiFDInit_t struct
  migration: fix the compression code
  migration: fix QEMUFile leak
  tests/migration: Speed up the test on ppc64
  migration: cleanup in error paths in loadvm
  migration/postcopy: Clear have_listen_thread
  tests/migration: Add migration-test header file
  tests/migration: Support cross compilation in generating boot header file
  tests/migration: Convert x86 boot block compilation script into Makefile
  migration: use save_page_use_compression in flush_compressed_data
  migration: show the statistics of compression
  migration: do not flush_compressed_data at the end of iteration
  Add a hint message to loadvm and exits on failure
  migration: handle the error condition properly
  migration: fix calculating xbzrle_counters.cache_miss_rate
  migration/rdma: Fix uninitialised rdma_return_path

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20180926' into staging

pull-seccomp-20180926

# gpg: Signature made Wed 26 Sep 2018 14:20:06 BST
# gpg:                using RSA key DF32E7C0F0FFF9A2
# gpg: Good signature from "Eduardo Otubo (Senior Software Engineer) <[email protected]>"
# Primary key fingerprint: D67E 1B50 9374 86B4 0723  DBAB DF32 E7C0 F0FF F9A2

* remotes/otubo/tags/pull-seccomp-20180926:
  seccomp: check TSYNC host capability

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into staging

Block and testing patches

- Paolo's AIO fixes.
- VMDK streamOptimized corner case fix
- VM testing improvment on -cpu

# gpg: Signature made Wed 26 Sep 2018 03:54:08 BST
# gpg:                using RSA key CA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/staging-pull-request:
  vmdk: align end of file to a sector boundary
  tests/vm: Use -cpu max rather than -cpu host
  aio-posix: do skip system call if ctx->notifier polling succeeds
  aio-posix: compute timeout before polling
  aio-posix: fix concurrent access to poll_disable_cnt

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.1-pull-request' into staging

- some fixes for setrlimit() and write()
- fixes ELF loader when host page size is greater than target page size
- add SO_LINGER to getsockopt()/setsockopt()
- move TargetFdTrans from syscall.c
  v2: add "#include <linux/netlink.h>" in linux-user/fd-trans.c

# gpg: Signature made Tue 25 Sep 2018 21:51:13 BST
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier <[email protected]>"
# gpg:                 aka "Laurent Vivier (Red Hat) <[email protected]>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-3.1-pull-request:
  linux-user: do setrlimit selectively
  linux-user: write(fd, NULL, 0) parity with linux's treatment of same
  linux-user: elf: mmap all the target-pages of hostpage for data segment
  linux-user: add SO_LINGER to {g,s}etsockopt
  linux-user: move TargetFdTrans functions to their own file

Signed-off-by: Peter Maydell <[email protected]>

qxl: support mono cursors with inverted colors

Monochrome cursors are still used by Windows guests with the
QXL-WDDM-DOD driver. Such cursor types have one odd feature, inversion
of colors. GDK does not seem to support it, so implement an alternative
solution: fill the inverted pixels and add an outline to make the cursor
more visible. Tested with the text cursor in Notepad and Windows 10.

cursor_set_mono is also used by the vmware GPU, so add a special check
to avoid breaking its 32bpp format (tested with Kubuntu 14.04.4). I was
unable to find a guest which supports the 1bpp format with a vmware GPU.

The old implementation was buggy and removed in v2.10.0-108-g79c5a10cdd
("qxl: drop mono cursor support"), this version improves upon that by
adding bounds validation, clarifying the semantics of the two masks and
adds a workaround for inverted colors support.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1611984
Signed-off-by: Peter Wu <[email protected]>
Message-id: 20180903145447 [email protected]

[ kraxel: minor codestyle fix ]

Signed-off-by: Gerd Hoffmann <[email protected]>

qxl: use guest_monitor_config for local renderer.

When processing monitor config from guest store head0 width and height
for single-head configurations.  Use these when creating the
DisplaySurface in the local renderer.

This fixes a rendering issue with wayland.  Wayland rounds up the
framebuffer width and height to a multiple of 64, so with odd
resolutions (800x600 for example) the framebuffer is larger than the
actual screen.  The monitor config has the actual screen size though.

This fixes guest display for anything using the local renderer
(non-spice UI, screendump monitor command).

Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Message-id: 20180919103057 [email protected]

display/stdvga: add edid support.

This patch adds edid support to the qemu stdvga.  It is turned off by
default and can be enabled with the new edid property.  The patch also
adds xres and yres properties to specify the video mode you want the
guest use.  Works only with edid enabled and updated guest driver.

The mmio bar of the stdvga has some unused address space at the start.
It was reserved just in case it'll be needed for virtio, but it turned
out to not be needed for that.  So let's use that region to place the
EDID data block there.

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 20180925075646 [email protected]

display/edid: add DEFINE_EDID_PROPERTIES

Add a define for edid monitor properties.

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 20180925075646 [email protected]

display/edid: add region helper.

Create a io region for an EDID data block.

Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20180925075646 [email protected]

display/edid: add qemu_edid_size()

Helper function to figure the size of a edid blob, by checking how many
extensions are present. Both the base edid blob and the extensions are
128 bytes in size.

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 20180925075646 [email protected]

display/edid: add edid generator to qemu.

EDID is a metadata format to describe monitors.  On physical hardware
the monitor has an eeprom with that data block which can be read over
i2c bus.

On a linux system you can usually find the EDID data block in
/sys/class/drm/$card/$connector/edid.  xorg ships a edid-decode utility
which you can use to turn the blob into readable form.

I think it would be a good idea to use EDID for virtual displays too.
Needs changes in both qemu and guest kms drivers.  This patch is the
first step, it adds an generator for EDID blobs to qemu.  Comes with a
qemu-edid test tool included.

With EDID we can pass more information to the guest.  Names and serial
numbers, so the guests display configuration has no boring "Unknown
Monitor".  List of video modes.  Display resolution, pretty important
in case we want add HiDPI support some day.

Signed-off-by: Gerd Hoffmann <[email protected]>
Message-id: 20180925075646 [email protected]

nbd/server: send more than one extent of base:allocation context

This is necessary for efficient block-status export, for clients which
support it. (qemu is not yet such a client, but could become one.)

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Message-Id: <20180704112302 [email protected]>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <[email protected]>

qapi: bitmap-merge: document name change

We named these using underscores instead of the preferred dash,
document this nearby so we cannot possibly forget to rectify this
when we remove the 'x-' prefixes when the feature becomes stable.

We do not implement the change ahead of time to avoid more work
for libvirt to do in order to figure out how to use the beta version
of the API needlessly.

Reported-by: Eric Blake <[email protected]>
Signed-off-by: John Snow <[email protected]>
Message-Id: <20180919190934 [email protected]>
Reviewed-by: Eric Blake <[email protected]>
[eblake: typo fix]
Signed-off-by: Eric Blake <[email protected]>

migration/ram.c: Avoid taking address of fields in packed MultiFDInit_t struct

Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this:

migration/ram.c:651:19: warning: taking address of packed member 'magic' of class or structure 'MultiFDInit_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:652:19: warning: taking address of packed member 'version' of class or structure 'MultiFDInit_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:737:19: warning: taking address of packed member 'magic' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:745:19: warning: taking address of packed member 'version' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:755:19: warning: taking address of packed member 'size' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member]

Avoid the bug by not using the "modify in place" byteswapping
functions.

Signed-off-by: Peter Maydell <[email protected]>
Message-Id: <20180925161924 [email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: fix the compression code

Add judgement in compress_threads_save_cleanup() to check whether the
static CompressParam *comp_param has been allocated. If not, just
return; or else segmentation fault will occur when using the NULL
comp_param's parameters. One test case can reproduce this is: set
the compression on and migrate to a wrong nonexistent host IP address.

Our current code does not judge before handling comp_param[idx]'s quit
and cond that whether they have been initialized. If not initialized,
"qemu_mutex_lock_impl: Assertion `mutex->initialized' failed." will
occur. Fix this by squashing the terminate_compression_threads() into
compress_threads_save_cleanup() and employing the existing judgement
condition. One test case can reproduce this error is: set the
compression on and fail to fully setup the default eight compression
thread in compress_threads_save_setup().

Signed-off-by: Fei Li <[email protected]>
Message-Id: <20180925091440 [email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: fix QEMUFile leak

Spotted by ASAN while running:

$ tests/migration-test -p /x86_64/migration/postcopy/recovery

=================================================================
==18034==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 33864 byte(s) in 1 object(s) allocated from:
    #0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183
    #3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159
    #4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78
    #5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295
    #6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111
    #7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91
    #8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108
    #9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158
    #10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89
    #11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20180925092245 [email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

tests/migration: Speed up the test on ppc64

The SLOF boot process is always quite slow ... but we can speed it up
a little bit by specifying "-nodefaults" and by using the "nvramrc"
variable instead of "boot-command" (since "nvramrc" is evaluated earlier
in the SLOF boot process than "boot-command").

Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <1537204330 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: cleanup in error paths in loadvm

There's a couple of error paths in qemu_loadvm_state
which happen early on but after we've initialised the
load state; that needs to be cleaned up otherwise
we can hit asserts if the state gets reinitialised later.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Message-Id: <20180914170430 [email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration/postcopy: Clear have_listen_thread

Clear have_listen_thread when we exit the thread.
The fallout from this was that various things thought there was
an ongoing postcopy after the postcopy had finished.

The case that failed was postcopy->savevm->loadvm.

This corresponds to RH bug https://bugzilla.redhat.com/show_bug.cgi?id=1608765

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Message-Id: <20180914170430 [email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

tcg/i386: fix vector operations on 32-bit hosts

The TCG backend uses LOWREGMASK to get the low 3 bits of register numbers.
This was defined as no-op for 32-bit x86, with the assumption that we have
eight registers anyway. This assumption is not true once we have xmm regs.

Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes
and have overflown into other opcode fields, wreaking havoc.

To trigger these problems, you can try running the "movi d8, #0x0" AArch64
instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".

Fixes: 770c2fc7bb ("Add vector operations")
Signed-off-by: Roman Kapl <[email protected]>
Message-Id: <20180824131734 [email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht-bench: add -p flag to precompute hash values

Precomputing the hash values allows us to perform more frequent
accesses to the hash table, thereby reaching higher throughputs.

We keep the old behaviour by default, since (1) we might confuse
users if they measured a speedup without changing anything in
the QHT implementation, and (2) benchmarking the hash function
"on line" is also valuable.

Before:
$ taskset -c 0 tests/qht-bench -n 1
Throughput:        38.18 MT/s

After:
$ taskset -c 0 tests/qht-bench -n 1
Throughput:        38.16 MT/s

After (with precomputing):
$ taskset -c 0 tests/qht-bench -n 1 -p
Throughput:        50.87 MT/s

Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: constify arguments to some internal functions

These functions do not modify their @ht or @bucket arguments.
Constify those arguments.

Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: constify qht_statistics_init

Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: constify qht_lookup

seqlock_read_begin takes a const param since c04649eeea
("seqlock: constify seqlock_read_begin", 2018-08-23), so
we can constify the entire lookup.

Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: fix comment in qht_bucket_remove_entry

Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: drop ht argument from qht iterators

Accessing the HT from an iterator results almost always
in a deadlock. Given that only one qht-internal function
uses this argument, drop it from the interface.

Suggested-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

test-qht: speed up + test qht_resize

Perform first the tests that exercise code paths that are
easier to hit at small table sizes, and then resize the table
to speed up subsequent tests. If this resize is not too large,
we can make the test faster with no code coverage loss.

- With gcov enabled:

Before: 20.568s, 90.28% qht.c coverage
After: 5.168s, 93.06% qht.c coverage

The coverage increase is entirely due to calling qht_resize,
which we weren't calling before. Note that the code paths
that remain to be tested are either error handling or
can only occur when several threads are accessing the
hash table concurrently (e.g. seqlock retry, trylock fail).

- Without gcov:

Before: 1.987s
After: 0.528s

The speedup is almost the same as with gcov, although the
"before" run is a lot faster.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

test-qht: test deletion of the last entry in a bucket

This improves coverage by one (!) LoC in qht.c, bringing the
coverage rate up from 90.00% to 90.28%.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

test-qht: test removal of non-existent entries

This improves qht.c code coverage from 89.44% to 90.00%.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

test-qht: test qht_iter_remove

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: add qht_iter_remove

This currently has no users, but the use case is so common that I
think we must support it.

Note that without the appended we cannot safely remove a set of
elements; a 2-step approach (i.e. qht_iter first, keep track of
the to-be-deleted elements, and then a bunch of qht_remove calls)
would be racy, since between the iteration and the removals other
threads might insert additional elements.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

qht: remove unused map param from qht_remove__locked

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Emilio G. Cota <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

nbd/server: fix bitmap export

bitmap_to_extents function is broken: it switches dirty variable after
every iteration, however it can process only part of dirty (or zero)
area during one iteration in case when this area is too large for one
extent.

Fortunately, the bug doesn't produce wrong extent flags: it just inserts
a zero-length extent between sequential extents representing large dirty
(or zero) area. However, zero-length extents are forbidden by the NBD
protocol. So, a careful client should consider such a reply as a server
fault, while a less-careful will likely ignore zero-length extents.

The bug can only be triggered by a client that requests block status
for nearly 4G at once (a request of 4G and larger is impossible per
the protocol, and requests smaller than 4G less the bitmap granularity
cause the loop to quit iterating rather than revisit the tail of the
large area); it also cannot trigger if the client used the
NBD_CMD_FLAG_REQ_ONE flag. Since qemu 3.0 as client (using the
x-dirty-bitmap extension) always passes the flag, it is immune; and
we are not aware of other open-source clients that know how to request
qemu:dirty-bitmap:FOO contexts. Clients that want to avoid the bug
could cap block status requests to a smaller length, such as 2G or 3G.

Fix this by more careful handling of dirty variable.

Bug was introduced in 3d068aff16
"nbd/server: implement dirty bitmap export", with the whole function.
and is present in v3.0.0 release.

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Message-Id: <20180914165116 [email protected]>
CC: [email protected]
Reviewed-by: Eric Blake <[email protected]>
[eblake: improved commit message]
Signed-off-by: Eric Blake <[email protected]>

seccomp: check TSYNC host capability

Remove -sandbox option if the host is not capable of TSYNC, since the
sandbox will fail at setup time otherwise. This will help libvirt, for
ex, to figure out if -sandbox will work.

Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: Eduardo Otubo <[email protected]>
Acked-by: Eduardo Otubo <[email protected]>

tests/migration: Add migration-test header file

This patch moves the settings related migration-test from the
migration-test.c file to a new header file.

Reviewed-by: Juan Quintela <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Wei Huang <[email protected]>
Message-Id: <1536174934 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

tests/migration: Support cross compilation in generating boot header file

Recently a new configure option, CROSS_CC_GUEST, was added to
$(TARGET)-softmmu/config-target.mak to support TCG-related tests. This
patch tries to leverage this option to support cross compilation when the
migration boot block file is being re-generated:

* The x86 related files are moved to a new sub-dir (named ./i386).
* A new top-layer Makefile is created in tests/migration/ directory.
This Makefile searches and parses CROSS_CC_GUEST to generate CROSS_PREFIX.
The CROSS_PREFIX, if available, is then passed to migration/$ARCH/Makefile.

Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Wei Huang <[email protected]>
Message-Id: <1536174934 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

tests/migration: Convert x86 boot block compilation script into Makefile

The x86 boot block header currently is generated with a shell script.
To better support other CPUs (e.g. aarch64), we convert the script
into Makefile. This allows us to 1) support cross-compilation easily,
and 2) avoid creating a script file for every architecture.

Note that, in the new design, the cross compiler prefix can be specified by
setting the CROSS_PREFIX in "make" command. Also to allow gcc pre-processor
to include the C-style file correctly, it also renames the
x86-a-b-bootblock.s file extension from .s to .S.

Reviewed-by: Juan Quintela <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Wei Huang <[email protected]>
Message-Id: <1536174934 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: use save_page_use_compression in flush_compressed_data

It avoids to touch compression locks if xbzrle and compression
are both enabled

Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Message-Id: <20180906070101 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: show the statistics of compression

Currently, it includes:
pages: amount of pages compressed and transferred to the target VM
busy: amount of count that no free thread to compress data
busy-rate: rate of thread busy
compressed-size: amount of bytes after compression
compression-rate: rate of compressed size

Reviewed-by: Juan Quintela <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Message-Id: <20180906070101 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: do not flush_compressed_data at the end of iteration

flush_compressed_data() needs to wait all compression threads to
finish their work, after that all threads are free until the
migration feeds new request to them, reducing its call can improve
the throughput and use CPU resource more effectively

We do not need to flush all threads at the end of iteration, the
data can be kept locally until the memory block is changed or
memory migration starts over in that case we will meet a dirtied
page which may still exists in compression threads's ring

Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Message-Id: <20180906070101 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

Add a hint message to loadvm and exits on failure

This patch adds a small hint for the failure case of the load snapshot
process. It may be useful for users to remember that the VM
configuration has changed between the save and load processes.

(qemu) loadvm vm-20180903083641
Unknown savevm section or instance 'cpu_common' 4.
Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
Error -22 while loading VM state
(qemu) device_add host-spapr-cpu-core,core-id=4
(qemu) loadvm vm-20180903083641
(qemu) c
(qemu) info status
VM status: running

It also exits Qemu if the snapshot cannot be loaded before reaching the
main loop (-loadvm in the command line).

$ qemu-system-ppc64 ... -loadvm vm-20180903083641
qemu-system-ppc64: Unknown savevm section or instance 'cpu_common' 4.
Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
qemu-system-ppc64: Error -22 while loading VM state
$

Signed-off-by: Jose Ricardo Ziviani <[email protected]>
Message-Id: <20180903162613 [email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: handle the error condition properly

ram_find_and_save_block() can return negative if any error hanppens,
however, it is completely ignored in current code

Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Message-Id: <20180903092644 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration: fix calculating xbzrle_counters.cache_miss_rate

As Peter pointed out:
| - xbzrle_counters.cache_miss is done in save_xbzrle_page(), so it's
| per-guest-page granularity
|
| - RAMState.iterations is done for each ram_find_and_save_block(), so
| it's per-host-page granularity
|
| An example is that when we migrate a 2M huge page in the guest, we
| will only increase the RAMState.iterations by 1 (since
| ram_find_and_save_block() will be called once), but we might increase
| xbzrle_counters.cache_miss for 2M/4K=512 times (we'll call
| save_xbzrle_page() that many times) if all the pages got cache miss.
| Then IMHO the cache miss rate will be 512/1=51200% (while it should
| actually be just 100% cache miss).

And he also suggested as xbzrle_counters.cache_miss_rate is the only
user of rs->iterations we can adapt it to count target guest page
numbers

After that, rename 'iterations' to 'target_page_count' to better reflect
its meaning

Suggested-by: Peter Xu <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Xiao Guangrong <[email protected]>
Message-Id: <20180903092644 [email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

migration/rdma: Fix uninitialised rdma_return_path

Clang correctly errors out moaning that rdma_return_path
is used uninitialised in the earlier error paths.
Make it NULL so that the error path ignores it.

Fixes: 55cc1b5937a8e709e4c102e74b206281073aab82
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reported-by: Cornelia Huck <[email protected]>
Message-Id: <20180830173657 [email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Juan Quintela <[email protected]>
Signed-off-by: Juan Quintela <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

vmdk: align end of file to a sector boundary

There is a rare case which the size of last compressed cluster
is larger than the cluster size, which will cause the file is
not aligned at the sector boundary.

There are three reasons to do it. First, if vmdk doesn't align at
the sector boundary, there may be many undefined behaviors,
such as, in vbox it will show VMDK: Compressed image is corrupted
'syno-vm-disk1.vmdk' (VERR_ZIP_CORRUPTED) when we try to import an
ova with unaligned vmdk. Second, all the cluster_sector is aligned
to sector, the last one should be like this, too. Third, it ease
reading with sector based I/Os.

Signed-off-by: yuchenlin <[email protected]>
Message-Id: <20180913082952 [email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>