target/mips: Amend preprocessor constants for CP0 registers
Correct existing CP0-related preprocessor constants (replace
"CPO" with "CP0" (form letter "O" to digit "0", when needed).
Besides, add preprocessor constants for CP0 subregisters.
The names of the subregisters were chosen to be in sync with
the table of corresponding assembler mnemonics found in the
documentation for I6500 and I6400 (release 1.0).
block/mirror: fix and improve do_sync_target_write
Use bdrv_dirty_bitmap_next_dirty_area() instead of
bdrv_dirty_iter_next_area(), because of the following problems of
bdrv_dirty_iter_next_area():
1. Using HBitmap iterators we should carefully handle unaligned offset,
as first call to hbitmap_iter_next() may return a value less than
original offset (actually, it will be original offset rounded down to
bitmap granularity). This handling is not done in
do_sync_target_write().
look at the code:
if (max_offset == iter->bitmap->size) {
/* If max_offset points to the image end, round it up by the
* bitmap granularity */
gran_max_offset = ROUND_UP(max_offset, granularity);
} else {
gran_max_offset = max_offset;
}
ret = hbitmap_iter_next(&iter->hbi, false);
if (ret < 0 || ret + granularity > gran_max_offset) {
return false;
}
and assume that max_offset != iter->bitmap->size but still unaligned.
if 0 < ret < max_offset we found dirty area, but the function can
return false in this case (if ret + granularity > max_offset).
3. bdrv_dirty_iter_next_area() uses inefficient loop to find the end of
the dirty area. Let's use more efficient hbitmap_next_zero instead
(bdrv_dirty_bitmap_next_dirty_area() do so)
The function alters bdrv_dirty_iter_next_area(), which is wrong and
less efficient (see further commit
"block/mirror: fix and improve do_sync_target_write" for description).
Peter Maydell [Tue, 15 Jan 2019 18:32:57 +0000 (18:32 +0000)]
Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates
Gerd Hoffmann (1):
slirp: add tftp tracing
Marc-André Lureau (61):
slirp: associate slirp_output callback with the Slirp context
slirp: remove do_pty from fork_exec()
slirp: replace ex_pty with ex_chardev
slirp: use a dedicated field for chardev pointer
slirp: remove unused EMU_RSH
slirp: rename /extra/chardev
slirp: move internal function declarations
slirp: remove Monitor dependency, return a string for info
slirp: fix slirp_add_exec() leaks
slirp: replace the poor-man string split with g_strsplit()
slirp: remove dead declarations
slirp: move socket pair creation in helper function
slirp: remove unused M_TRAILINGSPACE
slirp: use a callback structure to interface with qemu
slirp: remove PROBE_CONN dead-code
slirp: remove FULL_BOLT
slirp: remove the disabled readv()/writev() code path
slirp: remove HAVE_SYS_SIGNAL_H
slirp: remove unused HAVE_SYS_BITYPES_H
slirp: remove NO_UNIX_SOCKETS
slirp: remove unused HAVE_SYS_STROPTS_H
slirp: remove unused HAVE_ARPA_INET_H
slirp: remove unused HAVE_SYS_WAIT_H
slirp: remove unused HAVE_SYS_SELECT_H
slirp: remove HAVE_SYS_IOCTL_H
slirp: remove HAVE_SYS_FILIO_H
slirp: remove unused DECLARE_IOVEC
slirp: remove unused HAVE_INET_ATON
slirp: replace HOST_WORDS_BIGENDIAN with glib equivalent
slirp: replace SIZEOF_CHAR_P with glib equivalent
slirp: replace compile time DO_KEEPALIVE
slirp: remove unused global slirp_instance
slirp: replace error_report() with g_critical()
slirp: improve a bit the debug macros
slirp: add a callback to log guest errors
slirp: remove #if notdef dead code
slirp: remove unused sbflush()
slirp: NULL is defined by stddef.h
slirp: remove dead TCP_ACK_HACK code
slirp: replace ARRAY_SIZE with G_N_ELEMENTS
net: do not depend on slirp internals
glib-compat: add g_spawn_async_with_fds() fallback
slirp: simplify fork_exec()
slirp: replace error_report() with g_critical()
slirp: drop <Vista compatibility
slirp: rename exec_list
slirp: use virtual time for packet expiration
slirp: replace a fprintf with g_critical()
slirp: replace some fprintf() with DEBUG_MISC
slirp: replace a DEBUG block with WITH_ICMP_ERROR_MSG
slirp: no need to make DPRINTF conditional on DEBUG
slirp: always build with debug statements
slirp: introduce SLIRP_DEBUG environment variable
slirp: use %p for pointers format
slirp: remove remaining DEBUG blocks
slirp: replace DEBUG_ARGS with DEBUG_ARG
slirp: factor out guestfwd addition checks
slirp: add clock_get_ns() callback
build-sys: use a separate slirp-obj-y && slirp.mo
slirp: set G_LOG_DOMAIN
slirp: call into g_debug() for DEBUG macros
Prasad J Pandit (1):
slirp: check data length while emulating ident function
Samuel Thibault (2):
slirp: Enable fork_exec support on Windows
slirp: Mark debugging calls as unlikely
# gpg: Signature made Mon 14 Jan 2019 22:52:32 GMT
# gpg: using RSA key DB550E89F0FA54F3
# gpg: Good signature from "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82 304B D017 8C76 7D06 9EE6
# Subkey fingerprint: E61D BB15 D417 2BDE C97E 92D9 DB55 0E89 F0FA 54F3
* remotes/thibault/tags/samuel-thibault: (65 commits)
slirp: check data length while emulating ident function
slirp: Mark debugging calls as unlikely
slirp: call into g_debug() for DEBUG macros
slirp: set G_LOG_DOMAIN
build-sys: use a separate slirp-obj-y && slirp.mo
slirp: add clock_get_ns() callback
slirp: factor out guestfwd addition checks
slirp: replace DEBUG_ARGS with DEBUG_ARG
slirp: remove remaining DEBUG blocks
slirp: use %p for pointers format
slirp: introduce SLIRP_DEBUG environment variable
slirp: always build with debug statements
slirp: no need to make DPRINTF conditional on DEBUG
slirp: replace a DEBUG block with WITH_ICMP_ERROR_MSG
slirp: replace some fprintf() with DEBUG_MISC
slirp: replace a fprintf with g_critical()
slirp: use virtual time for packet expiration
slirp: rename exec_list
slirp: drop <Vista compatibility
slirp: Enable fork_exec support on Windows
...
Peter Maydell [Tue, 15 Jan 2019 14:19:18 +0000 (14:19 +0000)]
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-01-14' into staging
nbd patches for 2019-01-14
Promote bitmap/NBD interfaces to stable for use in incremental
backups. Add 'qemu-nbd --bitmap'.
- John Snow: 0/11 bitmaps: remove x- prefix from QMP api
- Philippe Mathieu-Daudé: qemu-nbd: Rename 'exp' variable clashing with math::exp() symbol
- Eric Blake: 0/8 Promote x-nbd-server-add-bitmap to stable
# gpg: Signature made Mon 14 Jan 2019 16:13:45 GMT
# gpg: using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg: aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2019-01-14:
qemu-nbd: Add --bitmap=NAME option
nbd: Merge nbd_export_bitmap into nbd_export_new
nbd: Remove x-nbd-server-add-bitmap
nbd: Allow bitmap export during QMP nbd-server-add
nbd: Merge nbd_export_set_name into nbd_export_new
nbd: Only require disabled bitmap for read-only exports
nbd: Forbid nbd-server-stop when server is not running
nbd: Add some error case testing to iotests 223
qemu-nbd: Rename 'exp' variable clashing with math::exp() symbol
iotests: add iotest 236 for testing bitmap merge
iotests: implement pretty-print for log and qmp_log
iotests: change qmp_log filters to expect QMP objects only
iotests: remove default filters from qmp_log
iotests: add qmp recursive sorting function
iotests: add filter_generated_node_ids
iotests.py: don't abort if IMGKEYSECRET is undefined
block: remove 'x' prefix from experimental bitmap APIs
blockdev: n-ary bitmap merge
block/dirty-bitmap: remove assertion from restore
blockdev: abort transactions in reverse order
Top changeset contributors by employer
Red Hat 3091 (43.3%)
Linaro 1201 (16.8%)
(None) 484 (6.8%)
IBM 426 (6.0%)
Academics (various) 186 (2.6%)
Virtuozzo 172 (2.4%)
Wave Computing 118 (1.7%)
Igalia 109 (1.5%)
Xilinx 102 (1.4%)
Cadence Design Systems 80 (1.1%)
Top lines changed by employer
Red Hat 140523 (30.3%)
Cadence Design Systems 81010 (17.5%)
Linaro 78098 (16.8%)
Wave Computing 33134 (7.1%)
IBM 18918 (4.1%)
SiFive 14436 (3.1%)
Academics (various) 11995 (2.6%)
(None) 11458 (2.5%)
Virtuozzo 10770 (2.3%)
Oracle 6698 (1.4%)
# gpg: Signature made Mon 14 Jan 2019 16:08:52 GMT
# gpg: using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-misc-gitdm-next-140119-1:
MAINTAINERS: add myself as a route for gitdm updates
contrib/gitdm: add another name to WaveComp map
contrib/gitdm: add two more IBM'ers to group-map-ibm
contrib/gitdm: Add other IBMers
contrib/gitdm: add Nokia and Proxmox to the domain-map
slirp: check data length while emulating ident function
While emulating identification protocol, tcp_emu() does not check
available space in the 'sc_rcv->sb_data' buffer. It could lead to
heap buffer overflow issue. Add check to avoid it.
This will allow to have cflags for the whole slirp.mo -objs.
It makes it possible to build tests that links only with
slirp-obj-y (and not the whole common-obj).
It is also a step towards building slirp as a shared library, although
this requires a bit more thoughts to build with
net/slirp.o (CONFIG_SLIRP would need to be 'm') and other build issues.
Peter Maydell [Mon, 14 Jan 2019 19:28:12 +0000 (19:28 +0000)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-next-140119-1' into staging
A bunch of fixes for testing:
- Various Travis updates
- "stable" SID snapshot for docker
- avoid :latest docker tags
- g_usleep fix for some tests
# gpg: Signature made Mon 14 Jan 2019 14:59:35 GMT
# gpg: using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-next-140119-1: (21 commits)
Revert "tests: Disable qht-bench parallel test when using gprof"
tests: use g_usleep instead of rem = sleep(time)
tests/docker: remove SID_AGE test hack
tests/docker: update our Travis image
travis: bump to Xenial baseline
docker: Use a stable snapshot for Debian Sid
travis: remove matrix settings that duplicate global settings
travis: run tests in verbose mode
travis: stop using container based envs
travis: stop redefining the script commands
travis: use homebrew addon for MacOSX
travis: don't clone git submodules upfront
travis: standardize the syntax used for env variables
travis: define all the build matrix entries in one place
travis: add whitespace between each major section & matrix entry
tests: use in-place sed magic for enabling deb-src in travis image
tests: update Fedora i386 cross image to Fedora 29
tests: update Fedora dockerfile to use Fedora 29
tests: remove obsolete 'debian' dockerfile
tests: run ldconfig after installing extra software
...
Peter Maydell [Mon, 14 Jan 2019 17:35:00 +0000 (17:35 +0000)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2019-01-14
* Reenable RDTSCP support on Opteron_G[345] CPU models CPU models
(Borislav Petkov)
* host-phys-bits-limit option for better control of 5-level EPT
(Eduardo Habkost)
* Disable MPX support on named CPU models (Paolo Bonzini)
* expose HV_CPUID_ENLIGHTMENT_INFO.EAX and HV_CPUID_NESTED_FEATURES.EAX
as feature words (Vitaly Kuznetsov)
# gpg: Signature made Mon 14 Jan 2019 14:33:55 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <[email protected]>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-next-pull-request:
i386/kvm: add a comment explaining why .feat_names are commented out for Hyper-V feature bits
x86: host-phys-bits-limit option
target/i386: Disable MPX support on named CPU models
target-i386: Reenable RDTSCP support on Opteron_G[345] CPU models CPU models
i386/kvm: expose HV_CPUID_ENLIGHTMENT_INFO.EAX and HV_CPUID_NESTED_FEATURES.EAX as feature words
Eric Blake [Fri, 11 Jan 2019 19:47:20 +0000 (13:47 -0600)]
qemu-nbd: Add --bitmap=NAME option
Having to fire up qemu, then use QMP commands for nbd-server-start
and nbd-server-add, just to expose a persistent dirty bitmap, is
rather tedious. Make it possible to expose a dirty bitmap using
just qemu-nbd (of course, for now this only works when qemu-nbd is
visiting a BDS formatted as qcow2).
Of course, any good feature also needs unit testing, so expand
iotest 223 to cover it.
Eric Blake [Fri, 11 Jan 2019 19:47:19 +0000 (13:47 -0600)]
nbd: Merge nbd_export_bitmap into nbd_export_new
We only have one caller that wants to export a bitmap name,
which it does right after creation of the export. But there is
still a brief window of time where an NBD client could see the
export but not the dirty bitmap, which a robust client would
have to interpret as meaning the entire image should be treated
as dirty. Better is to eliminate the window entirely, by
inlining nbd_export_bitmap() into nbd_export_new(), and refusing
to create the bitmap in the first place if the requested bitmap
can't be located.
We also no longer need logic for setting a different bitmap
name compared to the bitmap being exported.
Eric Blake [Fri, 11 Jan 2019 19:47:18 +0000 (13:47 -0600)]
nbd: Remove x-nbd-server-add-bitmap
Now that nbd-server-add can do the same functionality (well, other
than making the exported bitmap name different than the underlying
bitamp - but we argued that was not essential, since it is just as
easy to create a new non-persistent bitmap with the desired name),
we no longer need the experimental separate command.
Eric Blake [Fri, 11 Jan 2019 19:47:17 +0000 (13:47 -0600)]
nbd: Allow bitmap export during QMP nbd-server-add
With the experimental x-nbd-server-add-bitmap command, there was
a window of time where an NBD client could see the export but not
the associated dirty bitmap, which can cause a client that planned
on using the dirty bitmap to be forced to treat the entire image
as dirty as a safety fallback. Furthermore, if the QMP client
successfully exports a disk but then fails to add the bitmap, it
has to take on the burden of removing the export. Since we don't
allow changing the exposed dirty bitmap (whether to a different
bitmap, or removing advertisement of the bitmap), it is nicer to
make the bitmap tied to the export at the time the export is
created, with automatic failure to export if the bitmap is not
available.
The experimental command included an optional 'bitmap-export-name'
field for remapping the name exposed over NBD to be different from
the bitmap name stored on disk. However, my libvirt demo code
for implementing differential backups on top of persistent bitmaps
did not need to take advantage of that feature (it is instead
possible to create a new temporary bitmap with the desired name,
use block-dirty-bitmap-merge to merge one or more persistent
bitmaps into the temporary, then associate the temporary with the
NBD export, if control is needed over the exported bitmap name).
Hence, I'm not copying that part of the experiment over to the
stable addition. For more details on the libvirt demo, see
https://www.redhat.com/archives/libvir-list/2018-October/msg01254.html,
https://kvmforum2018.sched.com/event/FzuB/facilitating-incremental-backup-eric-blake-red-hat
This patch focuses on the user interface, and reduces (but does
not completely eliminate) the window where an NBD client can see
the export but not the dirty bitmap, with less work to clean up
after errors. Later patches will add further cleanups now that
this interface is declared stable via a single QMP command,
including removing the race window.
Eric Blake [Fri, 11 Jan 2019 19:47:16 +0000 (13:47 -0600)]
nbd: Merge nbd_export_set_name into nbd_export_new
The existing NBD code had a weird split where nbd_export_new()
created an export but did not add it to the list of exported
names until a later nbd_export_set_name() came along and grabbed
a second reference on the object; later, the first call to
nbd_export_close() drops the second reference while removing
the export from the list. This is in part because the QAPI
NbdServerRemoveNode enum documents the possibility of adding a
mode where we could do a soft disconnect: preventing new clients,
but waiting for existing clients to gracefully quit, based on
the mode used when calling nbd_export_close().
But in spite of all that, note that we never change the name of
an NBD export while it is exposed, which means it is easier to
just inline the process of setting the name as part of creating
the export.
Inline the contents of nbd_export_set_name() and
nbd_export_set_description() into the two points in an export
lifecycle where they matter, then adjust both callers to pass
the name up front. Note that for creation, all callers pass a
non-NULL name, (passing NULL at creation was for old style
servers, but we removed support for that in commit 7f7dfe2a),
so we can add an assert and do things unconditionally; but for
cleanup, because of the dual nature of nbd_export_close(), we
still have to be careful to avoid use-after-free. Along the
way, add a comment reminding ourselves of the potential of
adding a middle mode disconnect.
Eric Blake [Fri, 11 Jan 2019 19:47:15 +0000 (13:47 -0600)]
nbd: Only require disabled bitmap for read-only exports
Our initial implementation of x-nbd-server-add-bitmap put
in a restriction because of incremental backups: in that
usage, we are exporting one qcow2 file (the temporary overlay
target of a blockdev-backup sync:none job) and a dirty bitmap
owned by a second qcow2 file (the source of the
blockdev-backup, which is the backing file of the temporary).
While both qcow2 files are still writable (the target in
order to capture copy-on-write of old contents, and the
source in order to track live guest writes in the meantime),
the NBD client expects to see constant data, including the
dirty bitmap. An enabled bitmap in the source would be
modified by guest writes, which is at odds with the NBD
export being a read-only constant view, hence the initial
code choice of enforcing a disabled bitmap (the intent is
that the exposed bitmap was disabled in the same transaction
that started the blockdev-backup job, although we don't want
to track enough state to actually enforce that).
However, consider the case of a bitmap contained in a read-only
node (including when the bitmap is found in a backing layer of
the active image). Because the node can't be modified, the
bitmap won't change due to writes, regardless of whether it is
still enabled. Forbidding the export unless the bitmap is
disabled is awkward, paritcularly since we can't change the
bitmap to be disabled (because the node is read-only).
Alternatively, consider the case of live storage migration,
where management directs the destination to create a writable
NBD server, then performs a drive-mirror from the source to
the target, prior to doing the rest of the live migration.
Since storage migration can be time-consuming, it may be wise
to let the destination include a dirty bitmap to track which
portions it has already received, where even if the migration
is interrupted and restarted, the source can query the
destination block status in order to potentially minimize
re-sending data that has not changed in the meantime on a
second attempt. Such code has not been written, and might not
be trivial (after all, a cluster being marked dirty in the
bitmap does not necessarily guarantee it has the desired
contents), but it makes sense that letting an active dirty
bitmap be exposed and changing alongside writes may prove
useful in the future.
Solve both issues by gating the restriction against a
disabled bitmap to only happen when the caller has requested
a read-only export, and where the BDS that owns the bitmap
(whether or not it is the BDS handed to nbd_export_new() or
from its backing chain) is still writable. We could drop
the check altogether (if management apps are prepared to
deal with a changing bitmap even on a read-only image), but
for now keeping a check for the read-only case still stands
a chance of preventing management errors.
Update iotest 223 to show the looser behavior by leaving
a bitmap enabled the whole run; note that we have to tear
down and re-export a node when handling an error.
Eric Blake [Fri, 11 Jan 2019 19:47:14 +0000 (13:47 -0600)]
nbd: Forbid nbd-server-stop when server is not running
Since we already forbid other nbd-server commands when not
in the right state, it is unlikely that any caller was relying
on a second stop to behave as a silent no-op. Update iotest
223 to show the improved behavior.
Eric Blake [Fri, 11 Jan 2019 19:47:13 +0000 (13:47 -0600)]
nbd: Add some error case testing to iotests 223
Testing success paths is important, but it's also nice to highlight
expected failure handling, to show that we don't crash, and so that
upcoming tests that change behavior can demonstrate the resulting
effects on error paths.
Add the following errors:
Attempting to export without a running server
Attempting to start a second server
Attempting to export a bad node name
Attempting to export a name that is already exported
Attempting to export an enabled bitmap
Attempting to remove an already removed export
Attempting to quit server a second time
All of these properly complain except for a second server-stop,
which will be fixed next.
qemu-nbd: Rename 'exp' variable clashing with math::exp() symbol
The use of a variable named 'exp' prevents includes to import <math.h>.
Rename it to avoid:
qemu-nbd.c:64:19: error: ‘exp’ redeclared as different kind of symbol
static NBDExport *exp;
^~~
In file included from /usr/include/features.h:428,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdint.h:26,
from /usr/lib/gcc/x86_64-redhat-linux/8/include/stdint.h:9,
from /source/qemu/include/qemu/osdep.h:80,
from /source/qemu/qemu-nbd.c:19:
/usr/include/bits/mathcalls.h:95:1: note: previous declaration of ‘exp’ was here
__MATHCALL_VEC (exp,, (_Mdouble_ __x));
^~~~~~~~~~~~~~
John Snow [Fri, 21 Dec 2018 09:35:28 +0000 (04:35 -0500)]
iotests: implement pretty-print for log and qmp_log
If iotests have lines exceeding >998 characters long, git doesn't
want to send it plaintext to the list. We can solve this by allowing
the iotests to use pretty printed QMP output that we can match against
instead.
John Snow [Fri, 21 Dec 2018 09:35:27 +0000 (04:35 -0500)]
iotests: change qmp_log filters to expect QMP objects only
As laid out in the previous commit's message:
```
Several places in iotests deal with serializing objects into JSON
strings, but to add pretty-printing it seems desirable to localize
all of those cases.
log() seems like a good candidate for that centralized behavior.
log() can already serialize json objects, but when it does so,
it assumes filters=[] operates on QMP objects, not strings.
qmp_log currently operates by dumping outgoing and incoming QMP
objects into strings and filtering them assuming that filters=[]
are string filters.
```
Therefore:
Change qmp_log to treat filters as if they're always qmp object filters,
then change the logging call to rely on log()'s ability to serialize QMP
objects, so we're not duplicating that effort.
Add a qmp version of filter_testfiles and adjust the only caller using
it for qmp_log to use the qmp version.
John Snow [Fri, 21 Dec 2018 09:35:26 +0000 (04:35 -0500)]
iotests: remove default filters from qmp_log
Several places in iotests deal with serializing objects into JSON
strings, but to add pretty-printing it seems desirable to localize
all of those cases.
log() seems like a good candidate for that centralized behavior.
log() can already serialize json objects, but when it does so,
it assumes filters=[] operates on QMP objects, not strings.
qmp_log currently operates by dumping outgoing and incoming QMP
objects into strings and filtering them assuming that filters=[]
are string filters.
To have qmp_log use log's serialization, qmp_log will need to
accept only qmp filters, not text filters.
However, only a single caller of qmp_log actually requires any
filters at all. I remove the default filter and add it explicitly
to the caller in preparation for refactoring qmp_log to use rich
filters instead.
test 206 is amended to name the filter explicitly and the default
is removed.
John Snow [Fri, 21 Dec 2018 09:35:25 +0000 (04:35 -0500)]
iotests: add qmp recursive sorting function
Python before 3.6 does not sort dictionaries (including kwargs).
Therefore, printing QMP objects involves sorting the keys to have
a predictable ordering in the iotests output. This means that
iotests output will sometimes show arguments in an order not
specified by the test author.
Presently, we accomplish this by using json.dumps' sort_keys argument,
where we only serialize the arguments dictionary, but not the command.
However, if we want to pretty-print QMP objects being sent to the
QEMU process, we need to build the entire command before logging it.
Ordinarily, this would then involve "arguments" being sorted above
"execute", which would necessitate a rather ugly and harder-to-read
change to many iotests outputs.
To facilitate pretty-printing AND maintaining predictable output AND
having "arguments" sort after "execute", add a custom sort function
that takes a dictionary and recursively builds an OrderedDict that
maintains the specific key order we wish to see in iotests output.
The qmp_log function uses this to build a QMP object that keeps
"execute" above "arguments", but sorts all keys and keys in any
subdicts in "arguments" lexicographically to maintain consistent
iotests output, with no incompatible changes to any current test.
John Snow [Fri, 21 Dec 2018 09:35:23 +0000 (04:35 -0500)]
iotests.py: don't abort if IMGKEYSECRET is undefined
Instead of using os.environ[], use .get with a default of empty string
to match the setup in check to allow us to import the iotests module
(for debugging, say) without needing a crafted environment just to
import the module.
John Snow [Fri, 21 Dec 2018 09:35:22 +0000 (04:35 -0500)]
block: remove 'x' prefix from experimental bitmap APIs
The 'x' prefix was added because I was uncertain of the direction we'd
take for the libvirt API. With the general approach solidified, I feel
comfortable committing to this API for 4.0.
John Snow [Fri, 21 Dec 2018 09:35:21 +0000 (04:35 -0500)]
blockdev: n-ary bitmap merge
Especially outside of transactions, it is helpful to provide
all-or-nothing semantics for bitmap merges. This facilitates
the coalescing of multiple bitmaps into a single target for
the "checkpoint" interpretation when assembling bitmaps that
represent arbitrary points in time from component bitmaps.
This is an incompatible change from the preliminary version
of the API.
John Snow [Fri, 11 Jan 2019 17:59:16 +0000 (11:59 -0600)]
blockdev: abort transactions in reverse order
Presently, we abort transactions in the same order they were processed in.
Bitmap commands, though, attempt to restore backup data structures on abort.
That's not valid, they need to be aborted in reverse chronological order.
Replace the QSIMPLEQ data structure with a QTAILQ one, so we can iterate
in reverse for the abort phase of the transaction.
Alex Bennée [Fri, 11 Jan 2019 13:50:02 +0000 (13:50 +0000)]
tests: use g_usleep instead of rem = sleep(time)
Relying on sleep to always return having slept isn't safe as a signal
may have occurred. If signals are constantly incoming the program will
never reach its termination condition. This is believed to be the
mechanism causing time outs for qht-test in Travis.
The glib g_usleep() deals with all of this for us so lets use it instead.
The Debian Sid repository is not garanteed to be stable, as his
'unstable' name suggest :)
To allow quick testing, packages are pushed various time a day,
which my be annoying when trying to use it for stable development
(which is not recommended, but Sid provides edge packages we use
for testing).
Debian provides repositories snapshots which are suitable for our
use. Pick a recent date that works. When required, update to newer
releases will be easy.
This fixes current issues with this image:
$ make docker-image-debian-sid
[...]
The following packages have unmet dependencies:
build-essential : Depends: dpkg-dev (>= 1.17.11) but it is not going to be installed
git : Depends: perl but it is not going to be installed
Depends: liberror-perl but it is not going to be installed
pkg-config : Depends: libdpkg-perl but it is not going to be installed
texinfo : Depends: perl (>= 5.26.2-6) but it is not going to be installed
Depends: libtext-unidecode-perl but it is not going to be installed
Depends: libxml-libxml-perl but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
Travis sometimes fails a build because it produces no console output for
over 10 minutes. If this is due to a genuine hang, it would be useful to
have used verbose test output to see where it failed. If this is just
due to tests being very slow, having verbose output might allow the
build to succeed.
"Container-based infrastructure is currently being deprecated.
Please remove any sudo: false keys in your .travis.yml file
to use the default fully-virtualized Linux infrastructure
instead."
One of the matrix entries redefines the script command in order to add
the ${MAKEFLAGS} variable. Ideally ${MAKEFLAGS} would be referenced by
the definition of the ${TEST_CMD} env variable, but this isn't possible
in travis. ${MAKEFLAGS} exists to eliminate duplication of flags in
every "make" command, but this cure causes a worse problem, namely the
reduplication of the "script" command. It is simpler to just insert "-j3"
directly into any "make" command.
The configure script & Makefile are already capable of figuring out
which git submodules are required for a given build platform, and
cloning them at the right time.
travis: define all the build matrix entries in one place
The current build matrix is constructed from entries listed under the
environment variable config section, as well as the general purpose
build matrix section. Move everything under the general purpose section
so it is clear at a glance what is in the matrix.
tests: update Fedora i386 cross image to Fedora 29
Using the "latest" tag is not a good idea because this changes what
release it points to every 6 months. Together with caching of docker
builds this can cause confusion where CI has cached & built with Fedora
N, while a developer tries to reproduce a CI problem with Fedora N + 1,
or vica-verca.
The 'debian' dockerfile was deprecated in favour of versioned
dockerfiles in July 2017. That is enough time for developers to
be warned about the rename.
tests: run ldconfig after installing extra software
The docker file builds and installs software into /usr/local but does
not run ldconfig. As a result QEMU links to libvirglrenderer.so, but
then crashes in "make check" unable to find the library.
Use a stable tag instead of some random commit from mainstream
development, to avoid unexpected build failures.
This fixes:
CC virglrenderer.lo
virglrenderer.c: In function 'virgl_has_gl_colorspace':
virglrenderer.c:208:11: error: implicit declaration of function 'virgl_has_egl_khr_gl_colorspace' [-Werror=implicit-function-declaration]
virgl_has_egl_khr_gl_colorspace(egl_info));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
virglrenderer.c:208:43: error: 'egl_info' undeclared (first use in this function)
virgl_has_egl_khr_gl_colorspace(egl_info));
^~~~~~~~
virglrenderer.c:208:43: note: each undeclared identifier is reported only once for each function it appears in
cc1: some warnings being treated as errors
As of this commit 'git virglrenderer-0.7.0' is the last stable tag.
(virglrenderer commit breaking: fb4f7577f7ef)
Eduardo Habkost [Tue, 11 Dec 2018 19:25:27 +0000 (17:25 -0200)]
x86: host-phys-bits-limit option
Some downstream distributions of QEMU set host-phys-bits=on by
default. This worked very well for most use cases, because
phys-bits really didn't have huge consequences. The only
difference was on the CPUID data seen by guests, and on the
handling of reserved bits.
This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
EPT & Shadow page table support"). Now choosing a large
phys-bits value for a VM has bigger impact: it will make KVM use
5-level EPT even when it's not really necessary. This means
using the host phys-bits value may not be the best choice.
Management software could address this problem by manually
configuring phys-bits depending on the size of the VM and the
amount of MMIO address space required for hotplug. But this is
not trivial to implement.
However, there's another workaround that would work for most
cases: keep using the host phys-bits value, but only if it's
smaller than 48. This patch makes this possible by introducing a
new "-cpu" option: "host-phys-bits-limit". Management software
or users can make sure they will always use 4-level EPT using:
"host-phys-bits=on,host-phys-bits-limit=48".
This behavior is still not enabled by default because QEMU
doesn't enable host-phys-bits=on by default. But users,
management software, or downstream distributions may choose to
change their defaults using the new option.
Paolo Bonzini [Thu, 20 Dec 2018 12:11:00 +0000 (13:11 +0100)]
target/i386: Disable MPX support on named CPU models
MPX support is being phased out by Intel; GCC has dropped it, Linux
is also going to do that. Even though KVM will have special code
to support MPX after the kernel proper stops enabling it in XCR0,
we probably also want to deprecate that in a few years. As a start,
do not enable it by default for any named CPU model starting with
the 4.0 machine types; this include Skylake, Icelake and Cascadelake.
Opteron_G2 - being family 15, model 6, doesn't have RDTSCP support
(the real hardware doesn't have it. K8 got RDTSCP support with the NPT
models, i.e., models >= 0x40).
Document the host's minimum required kernel version, while at it.
Vitaly Kuznetsov [Mon, 26 Nov 2018 13:59:58 +0000 (14:59 +0100)]
i386/kvm: expose HV_CPUID_ENLIGHTMENT_INFO.EAX and HV_CPUID_NESTED_FEATURES.EAX as feature words
It was found that QMP users of QEMU (e.g. libvirt) may need
HV_CPUID_ENLIGHTMENT_INFO.EAX/HV_CPUID_NESTED_FEATURES.EAX information. In
particular, 'hv_tlbflush' and 'hv_evmcs' enlightenments are only exposed in
HV_CPUID_ENLIGHTMENT_INFO.EAX.
HV_CPUID_NESTED_FEATURES.EAX is exposed for two reasons: convenience
(we don't need to export it from hyperv_handle_properties() and as
future-proof for Enlightened MSR-Bitmap, PV EPT invalidation and
direct virtual flush features.
It is possible for an io_poll callback to be concurrently executed along
with an aio_set_fd_handlers. This can cause all sorts of problems, like
a NULL callback or a bad opaque pointer.
This changes set_fd_handlers so that it no longer modify existing handlers
entries and instead, always insert those after having proper initialisation.
Peter Maydell [Mon, 14 Jan 2019 13:54:17 +0000 (13:54 +0000)]
Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190114' into staging
Xen queue
* Xen PV backend 'qdevification'.
Starting with xen_disk.
* Performance improvements for xen-block.
* Remove of the Xen PV domain builder.
* bug fixes.
# gpg: Signature made Mon 14 Jan 2019 13:46:33 GMT
# gpg: using RSA key 0CF5572FD7FB55AF
# gpg: Good signature from "Anthony PERARD <[email protected]>"
# gpg: aka "Anthony PERARD <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8
# Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF
* remotes/aperard/tags/pull-xen-20190114: (25 commits)
xen-block: avoid repeated memory allocation
xen-block: improve response latency
xen-block: improve batching behaviour
xen: Replace few mentions of xend by libxl
Remove broken Xen PV domain builder
xen: remove the legacy 'xen_disk' backend
MAINTAINERS: add myself as a Xen maintainer
xen: automatically create XenBlockDevice-s
xen: add a mechanism to automatically create XenDevice-s...
xen: add implementations of xen-block connect and disconnect functions...
xen: purge 'blk' and 'ioreq' from function names in dataplane/xen-block.c
xen: remove 'ioreq' struct/varable/field names from dataplane/xen-block.c
xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-block
xen: add header and build dataplane/xen-block.c
xen: remove unnecessary code from dataplane/xen-block.c
xen: duplicate xen_disk.c as basis of dataplane/xen-block.c
xen: add event channel interface for XenDevice-s
xen: add grant table interface for XenDevice-s
xen: add xenstore watcher infrastructure
xen: create xenstore areas for XenDevice-s
...
Tim Smith [Wed, 12 Dec 2018 11:16:26 +0000 (11:16 +0000)]
xen-block: avoid repeated memory allocation
The xen-block dataplane currently allocates memory to hold the data for
each request as that request is used, and frees it afterwards. Because
it requires page-aligned blocks, this interacts poorly with non-page-
aligned allocations and balloons the heap.
Instead, allocate the maximum possible buffer size required for the
protocol, which is BLKIF_MAX_SEGMENTS_PER_REQUEST (currently 11) pages
when the request structure is created, and keep that buffer until it is
destroyed. Since the requests are re-used via a free list, this should
actually improve memory usage.
Signed-off-by: Tim Smith <[email protected]>
Re-based and commit comment adjusted.
Tim Smith [Wed, 12 Dec 2018 11:16:25 +0000 (11:16 +0000)]
xen-block: improve response latency
If the I/O ring is full, the guest cannot send any more requests
until some responses are sent. Only sending all available responses
just before checking for new work does not leave much time for the
guest to supply new work, so this will cause stalls if the ring gets
full. Also, not completing reads as soon as possible adds latency
to the guest.
To alleviate that, complete IO requests as soon as they come back.
xen_block_send_response() already returns a value indicating whether
a notify should be sent, which is all the batching we need.
Signed-off-by: Tim Smith <[email protected]>
Re-based and commit comment adjusted.
Tim Smith [Wed, 12 Dec 2018 11:16:24 +0000 (11:16 +0000)]
xen-block: improve batching behaviour
When I/O consists of many small requests, performance is improved by
batching them together in a single io_submit() call. When there are
relatively few requests, the extra overhead is not worth it. This
introduces a check to start batching I/O requests via blk_io_plug()/
blk_io_unplug() in an amount proportional to the number which were
already in flight at the time we started reading the ring.
Signed-off-by: Tim Smith <[email protected]>
Re-based and commit comment adjusted.
Paul Durrant [Tue, 8 Jan 2019 14:49:02 +0000 (14:49 +0000)]
MAINTAINERS: add myself as a Xen maintainer
I have made many significant contributions to the Xen code in QEMU,
particularly the recent patches introducing a new PV device framework.
I intend to make further significant contributions, porting other PV back-
ends to the new framework with the intent of eventually removing the
legacy code. It therefore seems reasonable that I become a maintainer of
the Xen code.
Paul Durrant [Tue, 8 Jan 2019 14:49:01 +0000 (14:49 +0000)]
xen: automatically create XenBlockDevice-s
This patch adds create and destroy function for XenBlockDevice-s so that
they can be created automatically when the Xen toolstack instantiates a new
PV backend via xenstore. When the XenBlockDevice is created this way it is
also necessary to create a 'drive' which matches the configuration that the
Xen toolstack has written into xenstore. This is done by formulating the
parameters necessary for each 'blockdev' layer of the drive and then using
qmp_blockdev_add() to create the layers. Also, for compatibility with the
legacy 'xen_disk' implementation, an iothread is automatically created for
the new XenBlockDevice. This, like the driver layers, will be destroyed
after the XenBlockDevice is unrealized.
The legacy backend scan for 'qdisk' is removed by this patch, which makes
the 'xen_disk' code is redundant. The code will be removed by a subsequent
patch.
Paul Durrant [Tue, 8 Jan 2019 14:49:00 +0000 (14:49 +0000)]
xen: add a mechanism to automatically create XenDevice-s...
...that maintains compatibility with existing Xen toolstacks.
Xen toolstacks instantiate PV backends by simply writing information into
xenstore and expecting a backend implementation to be watching for this.
This patch adds a new 'xen-backend' module to allow individual XenDevice
implementations to register create and destroy functions. The creator
will be called when a tool-stack instantiates a new backend in this way,
and the destructor will then be called after the resulting XenDevice
object is unrealized.
To support this it is also necessary to add new watchers into the XenBus
implementation to handle enumeration of new backends and also destruction
of XenDevice-s when the toolstack sets the backend 'online' key to 0.
NOTE: This patch only adds the framework. A subsequent patch will add a
creator function for xen-block devices.
Paul Durrant [Tue, 8 Jan 2019 14:48:59 +0000 (14:48 +0000)]
xen: add implementations of xen-block connect and disconnect functions...
...and wire in the dataplane.
This patch adds the remaining code to make the xen-block XenDevice
functional. The parameters that a block frontend expects to find are
populated in the backend xenstore area, and the 'ring-ref' and
'event-channel' values specified in the frontend xenstore area are
mapped/bound and used to set up the dataplane.
Paul Durrant [Tue, 8 Jan 2019 14:48:58 +0000 (14:48 +0000)]
xen: purge 'blk' and 'ioreq' from function names in dataplane/xen-block.c
This is a purely cosmetic patch that purges remaining use of 'blk' and
'ioreq' in local function names, and then makes sure all functions are
prefixed with 'xen_block_'.
Paul Durrant [Tue, 8 Jan 2019 14:48:57 +0000 (14:48 +0000)]
xen: remove 'ioreq' struct/varable/field names from dataplane/xen-block.c
This is a purely cosmetic patch that purges the name 'ioreq' from struct,
variable and field names. (This name has been problematic for a long time
as 'ioreq' is the name used for generic I/O requests coming from Xen).
The patch replaces 'struct ioreq' with a new 'XenBlockRequest' type and
'ioreq' field/variable names with 'request', and then does necessary
fix-up to adhere to coding style.
Function names are not modified by this patch. They will be dealt with in
a subsequent patch.
Paul Durrant [Tue, 8 Jan 2019 14:48:56 +0000 (14:48 +0000)]
xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-block
This is a purely cosmetic patch that substitutes the old 'struct XenBlkDev'
name with 'XenBlockDataPlane' and 'blkdev' field/variable names with
'dataplane', and then does necessary fix-up to adhere to coding style.
Paul Durrant [Tue, 8 Jan 2019 14:48:55 +0000 (14:48 +0000)]
xen: add header and build dataplane/xen-block.c
This patch adds the transformations necessary to get dataplane/xen-block.c
to build against the new XenBus/XenDevice framework. MAINTAINERS is also
updated due to the introduction of dataplane/xen-block.h.
NOTE: Existing data structure names are retained for the moment. These will
be modified by subsequent patches. A typedef for XenBlockDataPlane
has been added to the header (based on the old struct XenBlkDev name
for the moment) so that the old names don't need to leak out of the
dataplane code.
Paul Durrant [Tue, 8 Jan 2019 14:48:54 +0000 (14:48 +0000)]
xen: remove unnecessary code from dataplane/xen-block.c
Not all of the code duplicated from xen_disk.c is required as the basis for
the new dataplane implementation so this patch removes extraneous code,
along with the legacy #includes and calls to the legacy xen_pv_printf()
function. Error messages are changed to be reported using error_report().
NOTE: The code is still not yet built. Further transformations will be
required to make it correctly interface to the new XenBus/XenDevice
framework. They will be delivered in a subsequent patch.
Paul Durrant [Tue, 8 Jan 2019 14:48:53 +0000 (14:48 +0000)]
xen: duplicate xen_disk.c as basis of dataplane/xen-block.c
The new xen-block XenDevice implementation requires the same core
dataplane as the legacy xen_disk implementation it will eventually replace.
This patch therefore copies the legacy xen_disk.c source module into a new
dataplane/xen-block.c source module as the basis for the new dataplane and
adjusts the MAINTAINERS file accordingly.
NOTE: The duplicated code is not yet built. It is simply put into place by
this patch (just fixing style violations) such that the
modifications that will need to be made to the code are not
conflated with code movement, thus making review harder.