Peter Maydell [Mon, 10 Dec 2018 11:26:49 +0000 (11:26 +0000)]
uuid: Make qemu_uuid_bswap() take and return a QemuUUID
Currently qemu_uuid_bswap() takes a pointer to the QemuUUID to
be byte-swapped. This means it can't be used when the UUID
to be swapped is in a packed member of a struct. It's also
out of line with the general bswap*() functions we provide
in bswap.h, which take the value to be swapped and return it.
Make qemu_uuid_bswap() take a QemuUUID and return the swapped version.
This fixes some clang warnings about taking the address of
a packed struct member in block/vdi.c.
Peter Maydell [Mon, 10 Dec 2018 11:26:48 +0000 (11:26 +0000)]
block/vdi: Don't take address of fields in packed structs
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this.
Instead of passing UUID related functions the address of a possibly
unaligned QemuUUID struct, use local variables and then copy to/from
the struct field as appropriate.
Peter Maydell [Mon, 10 Dec 2018 11:26:47 +0000 (11:26 +0000)]
block/vpc: Don't take address of fields in packed structs
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this. Avoid the bug by generating the
UUID into a local variable which is definitely safely aligned and
then copying it into place.
Kevin Wolf [Fri, 7 Dec 2018 11:42:19 +0000 (12:42 +0100)]
vmdk: Reject excess extents in blockdev-create
Clarify that the number of extents provided in BlockdevCreateOptionsVmdk
must match the number of extents that will actually be used. Providing
more extents will result in an error now.
This requires adapting the test case to provide the right number of
extents.
Fam Zheng [Tue, 15 May 2018 15:36:32 +0000 (23:36 +0800)]
vmdk: Implement .bdrv_co_create callback
This makes VMDK support blockdev-create. The implementation reuses the
image creation code in vmdk_co_create_opts which now acceptes a callback
pointer to "retrieve" BlockBackend pointers from the caller. This way we
separate the logic between file/extent acquisition and initialization.
The QAPI command parameters are mostly the same as the old create_opts
except the dropped legacy @compat6 switch, which is redundant with
@hwversion.
Fam Zheng [Tue, 15 May 2018 15:36:31 +0000 (23:36 +0800)]
vmdk: Refactor vmdk_create_extent
The extracted vmdk_init_extent takes a BlockBackend object and
initializes the format metadata. It is the common part between "qemu-img
create" and "blockdev-create".
Add a "BlockBackend *pbb" parameter to vmdk_create_extent, to return the
opened BB to the caller in the next patch.
Max Reitz [Fri, 21 Dec 2018 18:42:26 +0000 (19:42 +0100)]
iotests: Make 234 stable
This test waits for a MIGRATION event with status=completed on the
source VM before querying the migration status on both source and
destination. However, just because the source says migration has
completed does not mean the destination thinks the same. Therefore, in
some cases, the destination VM may still report "active" instead of
"completed" when asked for its migration status.
Fix this by enabling migration events on both VMs and waiting until both
source and destination emit a status=completed MIGRATION event.
Kevin Wolf [Mon, 7 Jan 2019 12:02:48 +0000 (13:02 +0100)]
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
yuchenlin [Sat, 5 Jan 2019 08:42:43 +0000 (16:42 +0800)]
qemu-iotests: add test case for dmg
Recently, some bugs in dmg file have been fixed. To prevent reading dmg
is broken someday in the future, add a simple test which ensures the
conversion from dmg to raw should not hang or face any I/O error.
Alberto Garcia [Wed, 14 Nov 2018 14:58:57 +0000 (16:58 +0200)]
qcow2: Assert that refcount block offsets fit in the refcount table
Refcount table entries have a field to store the offset of the
refcount block. The rest of the bits of the entry are currently
reserved.
The offset is always taken from the entry using REFT_OFFSET_MASK to
ensure that we only use the bits that belong to that field.
While that mask is used every time we read from the refcount table, it
is never used when we write to it. Due to the other constraints of the
qcow2 format QEMU can never produce refcount block offsets that don't
fit in that field so any such offset when allocating a refcount block
would indicate a bug in QEMU.
Alberto Garcia [Thu, 22 Nov 2018 15:00:27 +0000 (17:00 +0200)]
mirror: Block the source BlockDriverState in mirror_start_job()
The mirror_start_job() function used for the commit-active job blocks
the source, target and all intermediate nodes for the duration of the
job.
target <- intermediate <- source
Since 4ef85a9c2339 this function creates a dummy mirror_top_bs that
goes on top of the source node, and it is this dummy node that gets
blocked instead. The source node is never blocked or added to the
job's list of nodes.
target <- intermediate <- source <- mirror_top
At the moment I don't think it is possible to exploit this problem
because any additional job on 'source' would either be forbidden for
other reasons or it would need to involve an additional node that is
blocked, causing an error.
This can be seen in the error messages, however, because they never
refer to the source node being blocked:
Alberto Garcia [Thu, 22 Nov 2018 15:00:26 +0000 (17:00 +0200)]
mirror: Release the dirty bitmap if mirror_start_job() fails
At the moment I don't see how to make this function fail after the
dirty bitmap has been created, but if that was possible then we would
hit the assert(QLIST_EMPTY(&bs->dirty_bitmaps)) in bdrv_close().
Peter Maydell [Thu, 31 Jan 2019 19:26:09 +0000 (19:26 +0000)]
Merge remote-tracking branch 'remotes/xanclic/tags/pull-block-2019-01-31' into staging
Block patches:
- New debugging QMP command to explore block graphs
- Converted DPRINTF()s to trace events
- Fixed qemu-io's use of getopt() for systems with optreset
- Minor NVMe emulation fixes
- An iotest fix
# gpg: Signature made Thu 31 Jan 2019 00:51:46 GMT
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/xanclic/tags/pull-block-2019-01-31:
iotests: Allow 147 to be run concurrently
iotests: Bind qemu-nbd to localhost in 147
iotests.py: Add qemu_nbd_pipe()
nvme: use pci_dev directly in nvme_realize
nvme: ensure the num_queues is not zero
nvme: use TYPE_NVME instead of constant string
qemu-io: Add generic function for reinitializing optind.
block/sheepdog: Convert from DPRINTF() macro to trace events
block/file-posix: Convert from DPRINTF() macro to trace events
block/curl: Convert from DPRINTF() macro to trace events
block/ssh: Convert from DPRINTF() macro to trace events
scripts: add render_block_graph function for QEMUMachine
qapi: add x-debug-query-block-graph
* remotes/vivier2/tags/trivial-branch-pull-request:
virtio-blk: remove duplicate definition of VirtIOBlock *s pointer
hw/block: clean up stale xen_disk trace entries
target/m68k: Fix LGPL information in the file headers
target/s390x: Fix LGPL version in the file header comments
tcg: Fix LGPL version number
target/tricore: Fix LGPL version number
target/openrisc: Fix LGPL version number
COPYING.LIB: Synchronize the LGPL 2.1 with the version from gnu.org
Don't talk about the LGPL if the file is licensed under the GPL
hw: sd: set category of the sd memory card
hw: input: set category of the i8042 device
typo: apci->acpi
hw: edu: set category of the edu device
* remotes/kraxel/tags/usb-20190130-pull-request:
usb-mtp: replace the homebrew write with qemu_write_full
usb-mtp: breakup MTP write into smaller chunks
usb-mtp: Reallocate buffer in multiples of MTP_WRITE_BUF_SZ
usb: implement XHCI underrun/overrun events
usb: XHCI shall not halt isochronous endpoints
hw/usb: Fix LGPL information in the file headers
usb: dev-mtp: close fd in usb_mtp_object_readdir()
usb: assign unique serial numbers to hid devices
Peter Maydell [Thu, 31 Jan 2019 12:03:39 +0000 (12:03 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Pull request
User-visible changes:
* The new qemu-trace-stap script makes it convenient to collect traces without
writing SystemTap scripts. See "man qemu-trace-stap" for details.
# gpg: Signature made Wed 30 Jan 2019 03:17:57 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>" [full]
# gpg: aka "Stefan Hajnoczi <[email protected]>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/tracing-pull-request:
trace: rerun tracetool after ./configure changes
trace: improve runstate tracing
trace: add ability to do simple printf logging via systemtap
trace: forbid use of %m in trace event format strings
trace: enforce that every trace-events file has a final newline
display: ensure qxl log_buf is a nul terminated string
Max Reitz [Fri, 21 Dec 2018 23:47:50 +0000 (00:47 +0100)]
iotests: Allow 147 to be run concurrently
To do this, we need to allow creating the NBD server on various ports
instead of a single one (which may not even work if you run just one
instance, because something entirely else might be using that port).
So we just pick a random port in [32768, 32768 + 1024) and try to create
a server there. If that fails, we just retry until something sticks.
For the IPv6 test, we need a different range, though (just above that
one). This is because "localhost" resolves to both 127.0.0.1 and ::1.
This means that if you bind to it, it will bind to both, if possible, or
just one if the other is already in use. Therefore, if the IPv6 test
has already taken [::1]:some_port and we then try to take
localhost:some_port, that will work -- only the second server will be
bound to 127.0.0.1:some_port alone and not [::1]:some_port in addition.
So we have two different servers on the same port, one for IPv4 and one
for IPv6.
But when we then try to connect to the server through
localhost:some_port, we will always end up at the IPv6 one (as long as
it is up), and this may not be the one we want.
Thus, we must make sure not to create an IPv6-only NBD server on the
same port as a normal "dual-stack" NBD server -- which is done by using
distinct port ranges, as explained above.
Max Reitz [Fri, 21 Dec 2018 23:47:49 +0000 (00:47 +0100)]
iotests: Bind qemu-nbd to localhost in 147
By default, qemu-nbd binds to 0.0.0.0. However, we then proceed to
connect to "localhost". Usually, this works out fine; but if this test
is run concurrently, some other test function may have bound a different
server to ::1 (on the same port -- you can bind different serves to the
same port, as long as one is on IPv4 and the other on IPv6).
So running qemu-nbd works, it can bind to 0.0.0.0:NBD_PORT. But
potentially a concurrent test has successfully taken [::1]:NBD_PORT. In
this case, trying to connect to "localhost" will lead us to the IPv6
instance, where we do not want to end up.
Fix this by just binding to "localhost". This will make qemu-nbd error
out immediately and not give us cryptic errors later.
(Also, it will allow us to just try a different port as of a future
patch.)
Max Reitz [Fri, 21 Dec 2018 23:47:48 +0000 (00:47 +0100)]
iotests.py: Add qemu_nbd_pipe()
In some cases, we may want to deal with qemu-nbd errors (e.g. by
launching it in a different configuration until it no longer throws
any). In that case, we do not want its output ending up in the test
output.
It may still be useful for handling the error, though, so add a new
function that works basically like qemu_nbd(), only that it returns the
qemu-nbd output instead of making it end up in the log. In contrast to
qemu_img_pipe(), it does still return the exit code as well, though,
because that is even more important for error handling.
Li Qiang [Sun, 20 Jan 2019 05:55:57 +0000 (21:55 -0800)]
nvme: ensure the num_queues is not zero
When it is zero, it causes segv.
Using following command:
"-drive file=//home/test/test1.img,if=none,id=id0
-device nvme,drive=id0,serial=test,num_queues=0"
causes following Backtrack:
Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe9735700 (LWP 30952)]
0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0) at hw/block/nvme.c:825
825 if (unlikely(n->cq[0])) {
(gdb) bt
0 0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0)
at hw/block/nvme.c:825
1 0x0000555555a7af7f in nvme_write_bar (n=0x5555577473f0, offset=20,
data=4587521, size=4) at hw/block/nvme.c:969
2 0x0000555555a7b81a in nvme_mmio_write (opaque=0x5555577473f0, addr=20,
data=4587521, size=4) at hw/block/nvme.c:1163
3 0x0000555555869236 in memory_region_write_accessor (mr=0x555557747cd0,
addr=20, value=0x7fffe97320f8, size=4, shift=0, mask=4294967295, attrs=...)
at /home/test/qemu1/qemu/memory.c:502
4 0x0000555555869446 in access_with_adjusted_size (addr=20,
value=0x7fffe97320f8, size=4, access_size_min=2, access_size_max=8,
access_fn=0x55555586914d <memory_region_write_accessor>,
mr=0x555557747cd0, attrs=...) at /home/test/qemu1/qemu/memory.c:568
5 0x000055555586c479 in memory_region_dispatch_write (mr=0x555557747cd0,
addr=20, data=4587521, size=4, attrs=...)
at /home/test/qemu1/qemu/memory.c:1499
6 0x00005555558030af in flatview_write_continue (fv=0x7fffe0061130,
addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4, addr1=20,
l=4, mr=0x555557747cd0) at /home/test/qemu1/qemu/exec.c:3234
7 0x00005555558031f9 in flatview_write (fv=0x7fffe0061130, addr=4273930260,
attrs=..., buf=0x7ffff7ff0028 "\001", len=4)
at /home/test/qemu1/qemu/exec.c:3273
8 0x00005555558034ff in address_space_write (
---Type <return> to continue, or q <return> to quit---
as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
buf=0x7ffff7ff0028 "\001", len=4) at /home/test/qemu1/qemu/exec.c:3363
9 0x0000555555803550 in address_space_rw (
as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=...,
buf=0x7ffff7ff0028 "\001", len=4, is_write=true)
at /home/test/qemu1/qemu/exec.c:3374
10 0x00005555558884a1 in kvm_cpu_exec (cpu=0x555556920e40)
at /home/test/qemu1/qemu/accel/kvm/kvm-all.c:2031
11 0x000055555584cd9d in qemu_kvm_cpu_thread_fn (arg=0x555556920e40)
at /home/test/qemu1/qemu/cpus.c:1281
12 0x0000555555dbaf6d in qemu_thread_start (args=0x5555569438a0)
at util/qemu-thread-posix.c:502
13 0x00007ffff5dc86db in start_thread (arg=0x7fffe9735700)
at pthread_create.c:463
14 0x00007ffff5af188f in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
After main option parsing, we reinitialize optind so we can parse each
command. However reinitializing optind to 0 does not work on FreeBSD.
What happens when you do this is optind remains 0 after the option
parsing loop, and the result is we try to parse argv[optind] ==
argv[0] == "aio_write" as if it was the first parameter.
The FreeBSD manual page says:
In order to use getopt() to evaluate multiple sets of arguments, or to
evaluate a single set of arguments multiple times, the variable optreset
must be set to 1 before the second and each additional set of calls to
getopt(), and the variable optind must be reinitialized.
(From the rest of the man page it is clear that optind must be
reinitialized to 1).
The glibc man page says:
A program that scans multiple argument vectors, or rescans the same
vector more than once, and wants to make use of GNU extensions such as
'+' and '-' at the start of optstring, or changes the value of
POSIXLY_CORRECT between scans, must reinitialize getopt() by resetting
optind to 0, rather than the traditional value of 1. (Resetting to 0
forces the invocation of an internal initialization routine that
rechecks POSIXLY_CORRECT and checks for GNU extensions in optstring.)
This commit introduces an OS-portability function called
qemu_reset_optind which provides a way of resetting optind that works
on FreeBSD and platforms that use optreset, while keeping it the same
as now on other platforms.
Note that the qemu codebase sets optind in many other places, but in
those other places it's setting a local variable and not using getopt.
This change is only needed in places where we are using getopt and the
associated global variable optind.
scripts: add render_block_graph function for QEMUMachine
Render block nodes graph with help of graphviz. This new function is
for debugging, so there is no sense to put it into qemu.py as a method
of QEMUMachine. Let's instead put it separately.
virtio-blk: remove duplicate definition of VirtIOBlock *s pointer
VirtIOBlock *s is already defined and initialized with req->dev
on top of virtio_blk_handle_request(), so we can remove it from
the code block of VIRTIO_BLK_T_GET_ID case.
Thomas Huth [Tue, 29 Jan 2019 13:43:58 +0000 (14:43 +0100)]
target/m68k: Fix LGPL information in the file headers
It's either "GNU *Library* General Public License version 2" or
"GNU Lesser General Public License version *2.1*", but there was
no "version 2.0" of the "Lesser" license. So assume that version
2.1 is meant here.
Also some files mention the GPL instead of the LGPL after declaring
that the files are licensed under the LGPL, so change these spots to
use LGPL, too.
Thomas Huth [Tue, 29 Jan 2019 13:37:47 +0000 (14:37 +0100)]
target/s390x: Fix LGPL version in the file header comments
It's either "GNU *Library* General Public License version 2" or
"GNU Lesser General Public License version *2.1*", but there was
no "version 2.0" of the "Lesser" license. So assume that version
2.1 is meant here.
Thomas Huth [Wed, 23 Jan 2019 14:08:56 +0000 (15:08 +0100)]
tcg: Fix LGPL version number
It's either "GNU *Library* General Public version 2" or "GNU Lesser
General Public version *2.1*", but there was no "version 2.0" of the
"Lesser" library. So assume that version 2.1 is meant here.
Thomas Huth [Wed, 23 Jan 2019 14:08:55 +0000 (15:08 +0100)]
target/tricore: Fix LGPL version number
It's either "GNU *Library* General Public version 2" or "GNU Lesser
General Public version *2.1*", but there was no "version 2.0" of the
"Lesser" library. So assume that version 2.1 is meant here.
Thomas Huth [Wed, 23 Jan 2019 14:08:54 +0000 (15:08 +0100)]
target/openrisc: Fix LGPL version number
It's either "GNU *Library* General Public version 2" or "GNU Lesser
General Public version *2.1*", but there was no "version 2.0" of the
"Lesser" library. So assume that version 2.1 is meant here.
Thomas Huth [Wed, 23 Jan 2019 14:08:53 +0000 (15:08 +0100)]
COPYING.LIB: Synchronize the LGPL 2.1 with the version from gnu.org
The current version of the LGPL 2.1 from gnu.org (see the URL
https://www.gnu.org/licenses/old-licenses/lgpl-2.1.txt ) slightly
differs from the old one that we use in our repository. Especially
the recommendation to use "either version 2 of the License, or [...]
any later version" is somewhat misleading, since there was never a
"version 2" of the "Lesser GPL" license - the "version 2" was still
called "Library GPL" instead.
Thomas Huth [Wed, 23 Jan 2019 14:51:23 +0000 (15:51 +0100)]
Don't talk about the LGPL if the file is licensed under the GPL
Some files claim that the code is licensed under the GPL, but then
suddenly suggest that the user should have a look at the LGPL.
That's of course non-sense, replace it with the correct GPL wording
instead.
Bandan Das [Tue, 29 Jan 2019 13:19:07 +0000 (08:19 -0500)]
usb-mtp: breakup MTP write into smaller chunks
For every MTP_WRITE_BUF_SZ copied, this patch writes it to file before
getting the next block of data. The file is kept opened for the
duration of the operation but the sanity checks on the write operation
are performed only once when the write operation starts. Additionally,
we also update the file size in the object metadata once the file has
completely been written.
Bandan Das [Tue, 29 Jan 2019 13:19:06 +0000 (08:19 -0500)]
usb-mtp: Reallocate buffer in multiples of MTP_WRITE_BUF_SZ
This is a "pre-patch" to breaking up the write buffer for
MTP writes. Instead of allocating a mtp buffer equal to size
sent by the initiator, we start with a small size and reallocate
multiples (of that small size) as needed.
Yuri Benditovich [Mon, 28 Jan 2019 20:05:09 +0000 (20:05 +0000)]
usb: implement XHCI underrun/overrun events
Implement underrun/overrun events of isochronous endpoints
according to XHCI spec (4.10.3.1)
Guest software restarts data streaming when receives these events.
The XHCI reports these events using interrupter assigned
to the slot (as these events do not have TRB), so current
commit adds the field of assigned interrupter to the
XHCISlot structure. Guest software assigns interrupter to the
slot on 'Address Device' and 'Evaluate Context' commands.
Yuri Benditovich [Mon, 28 Jan 2019 20:05:07 +0000 (20:05 +0000)]
usb: XHCI shall not halt isochronous endpoints
According to the XHCI spec (4.10.2) the controller
never halts isochronous endpoints. This commit prevent
stop of isochronous streaming when sporadic errors
status received from backends.
Thomas Huth [Wed, 23 Jan 2019 14:40:54 +0000 (15:40 +0100)]
hw/usb: Fix LGPL information in the file headers
It's either "GNU *Library* General Public version 2" or "GNU Lesser
General Public version *2.1*", but there was no "version 2.0" of the
"Lesser" library. So assume that version 2.1 is meant here.
Additionally, suggest that the user should have received a copy of
the LGPL, and not the GPL here.
[ kraxel: dropped chunk which adds close() after successful
fdopendir() call, that is not needed according to
POSIX even though Coverity flags it as bug ]
Gerd Hoffmann [Thu, 10 Jan 2019 12:51:08 +0000 (13:51 +0100)]
usb: assign unique serial numbers to hid devices
Windows guests have trouble dealing with usb devices having identical
serial numbers. So, assign unique serial numbers to usb hid devices.
All other usb devices have this already.
In the past the fixed serial number has been used to indicate working
remote setup to linux guests. Here is a bit of history:
* First there was nothing.
* Then I added a rule to udev checking for serial == 42.
(this is in rhel-6).
* Then systemd + udev merged.
* Then I changed the rule to check for serial != 1 instead, so we can
use any serial but "1" which is the one the old broken devices had
(this is in rhel-7). March 2014 in upstream systemd.
* Then all usb power management rules where dropped from systemd (June
2015). Which I figured today (Sept 2018), after wondering that the
rules are gone in fedora 28.
So, three years ago the serial number check was dropped upstream, yet I
hav't seen a single report about autosuspend issues (or cpu usage for
usb emulation going up, which is the typical symtom).
So I figured I can stop worring that changing the serial number will
break things and just do it.
And even if it turns out autosuspend is still an issue: I think
meanwhile we can really stop worrying about guests running in old qemu
versions with broken usb suspend (fixed in 0.13 !). If needed we can
enable autosuspend unconditionally in guests.
Stefan Hajnoczi [Tue, 29 Jan 2019 02:53:43 +0000 (10:53 +0800)]
trace: rerun tracetool after ./configure changes
Autogenerated code in trace.h/trace.c and friends is specific to the
config-host.mak TRACE_BACKENDS setting and must be regenerated when
./configure --enable-trace-backend= changes settings.
This patch ensures that changes to TRACE_BACKENDS are detected. For
example, the trace-root.h file is now updated after switching trace
backends:
$ ./configure && make
$ cp trace-root.h /tmp/old-trace-root.h
$ ./configure --enable-trace-backend=simple && make
$ diff -u /tmp/old-trace-root.h trace-root.h
Peter Maydell [Tue, 29 Jan 2019 12:00:19 +0000 (12:00 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190129' into staging
target-arm queue:
* Fix validation of 32-bit address spaces for aa32 (fixes an assert introduced in ba97be9f4a4)
* v8m: Ensure IDAU is respected if SAU is disabled
* gdbstub: fix gdb_get_cpu(s, pid, tid) when pid and/or tid are 0
* exec.c: Use correct attrs in cpu_memory_rw_debug()
* accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write
* target/arm: Don't clear supported PMU events when initializing PMCEID1
* memory: add memory_region_flush_rom_device()
* microbit: Add stub NRF51 TWI magnetometer/accelerometer detection
* tests/microbit-test: extend testing of microbit devices
* checkpatch: Don't emit spurious warnings about block comments
* aspeed/smc: misc bug fixes
* xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs
* xlnx-zynqmp: Realize cluster after putting RPUs in it
* accel/tcg: Add cluster number to TCG TB hash so differently configured
CPUs don't pick up cached TBs for the wrong kind of CPU
* remotes/pmaydell/tags/pull-target-arm-20190129: (23 commits)
gdbstub: Simplify gdb_get_cpu_pid() to use cpu->cluster_index
accel/tcg: Add cluster number to TCG TB hash
qom/cpu: Add cluster_index to CPUState
hw/arm/xlnx-zynqmp: Realize cluster after putting RPUs in it
aspeed/smc: snoop SPI transfers to fake dummy cycles
aspeed/smc: Add dummy data register
aspeed/smc: define registers for all possible CS
aspeed/smc: fix default read value
xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs
checkpatch: Don't emit spurious warnings about block comments
tests/microbit-test: Check nRF51 UART functionality
tests/microbit-test: Make test independent of global_qtest
tests/libqtest: Introduce qtest_init_with_serial()
memory: add memory_region_flush_rom_device()
target/arm: Don't clear supported PMU events when initializing PMCEID1
MAINTAINERS: update microbit ARM board files
accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write
exec.c: Use correct attrs in cpu_memory_rw_debug()
tests/microbit-test: add TWI stub device test
arm: Stub out NRF51 TWI magnetometer/accelerometer detection
...
Peter Maydell [Tue, 29 Jan 2019 11:46:06 +0000 (11:46 +0000)]
accel/tcg: Add cluster number to TCG TB hash
Include the cluster number in the hash we use to look
up TBs. This is important because a TB that is valid
for one cluster at a given physical address and set
of CPU flags is not necessarily valid for another:
the two clusters may have different views of physical
memory, or may have different CPU features (eg FPU
present or absent).
We put the cluster number in the high 8 bits of the
TB cflags. This gives us up to 256 clusters, which should
be enough for anybody. If we ever need more, or need
more bits in cflags for other purposes, we could make
tb_hash_func() take more data (and expand qemu_xxhash7()
to qemu_xxhash8()).
Peter Maydell [Tue, 29 Jan 2019 11:46:05 +0000 (11:46 +0000)]
qom/cpu: Add cluster_index to CPUState
For TCG we want to distinguish which cluster a CPU is in, and
we need to do it quickly. Cache the cluster index in the CPUState
struct, by having the cluster object set cpu->cluster_index for
each CPU child when it is realized.
This means that board/SoC code must add all CPUs to the cluster
before realizing the cluster object. Regrettably QOM provides no
way to prevent adding children to a realized object and no way for
the parent to be notified when a new child is added to it, so
we don't have any way to enforce/assert this constraint; all
we can do is document it in a comment. We can at least put in a
check that the cluster contains at least one CPU, which should
catch the typical cases of "realized cluster too early" or
"forgot to parent the CPUs into it".
The restriction on how many clusters can exist in the system
is imposed by TCG code which will be added in a subsequent commit,
but the check to enforce it in cluster.c fits better in this one.
Peter Maydell [Tue, 29 Jan 2019 11:46:05 +0000 (11:46 +0000)]
hw/arm/xlnx-zynqmp: Realize cluster after putting RPUs in it
Currently the cluster implementation doesn't have any constraints
on the ordering of realizing the TYPE_CPU_CLUSTER and populating it
with child objects. We want to impose a constraint that realize
must happen only after all the child objects are added, so move
the realize of rpu_cluster. (The apu_cluster is already
realized after child population.)
Cédric Le Goater [Tue, 29 Jan 2019 11:46:05 +0000 (11:46 +0000)]
aspeed/smc: snoop SPI transfers to fake dummy cycles
The m25p80 models dummy cycles using byte transfers. This works well
when the transfers are initiated by the QEMU model of a SPI controller
but when these are initiated by the OS, it breaks emulation.
Snoop the SPI transfer to catch commands requiring dummy cycles and
replace them with byte transfers compatible with the m25p80 model.
Cédric Le Goater [Tue, 29 Jan 2019 11:46:05 +0000 (11:46 +0000)]
aspeed/smc: define registers for all possible CS
The model should expose one control register per possible CS. When
testing the validity of the register number in the read operation,
replace 's->num_cs' by 'ctrl->max_slaves' which represents the maximum
number of flash devices a controller can handle.
Peter Maydell [Tue, 29 Jan 2019 11:46:05 +0000 (11:46 +0000)]
xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs
If we aren't going to create any RPUs, then don't create the
rpu-cluster unit. This allows us to add an assertion to the
cluster object that it contains at least one CPU, which helps
to avoid bugs in creating clusters and putting CPUs in them.
Peter Maydell [Tue, 29 Jan 2019 11:46:05 +0000 (11:46 +0000)]
checkpatch: Don't emit spurious warnings about block comments
In checkpatch we attempt to check for and warn about
block comments which start with /* or /** followed by a
non-blank. Unfortunately a bug in the regex meant that
we would incorrectly warn about comments starting with
"/**" with no following text:
git show 9813dc6ac3954d58ba16b3920556f106f97e1c67|./scripts/checkpatch.pl -
WARNING: Block comments use a leading /* on a separate line
#34: FILE: tests/libqtest.h:233:
+/**
The sequence "/\*\*?" was intended to match either "/*" or "/**",
but Perl's semantics for '?' allow it to backtrack and try the
"matches 0 chars" option if the "matches 1 char" choice leads to
a failure of the rest of the regex to match. Switch to "/\*\*?+"
which uses what perlre(1) calls the "possessive" quantifier form:
this means that if it matches the "/**" string it will not later
backtrack to matching just the "/*" prefix.
The other end of the regex is also wrong: it is attempting
to check for "/* or /** followed by something that isn't
just whitespace", but [ \t]*.+[ \t]* will match on pure
whitespace. This is less significant but means that a line
with just a comment-starter followed by trailing whitespace
will generate an incorrect warning about block comment style
as well as the correct error about trailing whitespace which
a different checkpatch test emits.
Stefan Hajnoczi [Tue, 29 Jan 2019 11:46:04 +0000 (11:46 +0000)]
memory: add memory_region_flush_rom_device()
ROM devices go via MemoryRegionOps->write() callbacks for write
operations and do not dirty/invalidate that memory. Device emulation
must be able to mark memory ranges that have been modified internally
(e.g. using memory_region_get_ram_ptr()).
Introduce the memory_region_flush_rom_device() API for this purpose.
This patch introduced two calls to get_pmceid() during CPU
initialization - one each for PMCEID0 and PMCEID1. In addition to
building the register values, get_pmceid() clears an internal array
mapping event numbers to their implementations (supported_event_map)
before rebuilding it. This is an optimization since much of the logic is
shared. However, since it was called twice, the contents of
supported_event_map reflect only the events in PMCEID1 (the second call
to get_pmceid()).
Fix this bug by moving the initialization of PMCEID0 and PMCEID1 back
into a single function call, and name it more appropriately since it is
doing more than simply generating the contents of the PMCEID[01]
registers.
Peter Maydell [Tue, 29 Jan 2019 11:46:04 +0000 (11:46 +0000)]
accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write
In cpu_signal_handler() for aarch64 hosts, currently we parse
the faulting instruction to see if it is a load or a store.
Since the 3.16 kernel (~2014), the kernel has provided us with
the syndrome register for a fault, which includes the WnR bit.
Use this instead if it is present, only falling back to
instruction parsing if not.
Peter Maydell [Tue, 29 Jan 2019 11:46:04 +0000 (11:46 +0000)]
exec.c: Use correct attrs in cpu_memory_rw_debug()
In the softmmu version of cpu_memory_rw_debug(), we ask the
CPU for the attributes to use for the virtual memory access,
and we correctly use those to identify the address space
index. However, we were not passing them in to the
address_space_write_rom() and address_space_rw() functions.
The effect of this was that a memory access from the gdbstub
to a device which had behaviour that was sensitive to the
memory attributes (such as some ARMv8M NVIC registers) was
incorrectly always performed as if non-secure, rather than
using the right security state for the CPU's current state.
Steffen Görtz [Tue, 29 Jan 2019 11:46:03 +0000 (11:46 +0000)]
arm: Stub out NRF51 TWI magnetometer/accelerometer detection
Recent microbit firmwares panic if the TWI magnetometer/accelerometer
devices are not detected during startup. We don't implement TWI (I2C)
so let's stub out these devices just to let the firmware boot.
Thomas Roth [Tue, 29 Jan 2019 11:46:03 +0000 (11:46 +0000)]
target/arm: v8m: Ensure IDAU is respected if SAU is disabled
The current behavior of v8m_security_lookup in helper.c only checks whether the
IDAU specifies a higher security if the SAU is enabled. If SAU.ALLNS is set to
1, this will lead to addresses being treated as non-secure, even though the
IDAU indicates that they must be secure.
This patch changes the behavior to also check the IDAU if the SAU is currently
disabled.
(This brings the behaviour here into line with the v8M Arm ARM
SecurityCheck() pseudocode.)
Signed-off-by: Thomas Roth <[email protected]>
Message-id: CAGGekkuc+-tvp5RJP7CM+Jy_hJF7eiRHZ96132sb=hPPCappKg@mail.gmail.com Reviewed-by: Peter Maydell <[email protected]>
[PMM: added pseudocode ref to the commit message, fixed comment style] Signed-off-by: Peter Maydell <[email protected]>
Vitaly Kuznetsov [Mon, 21 Jan 2019 15:50:51 +0000 (16:50 +0100)]
i386: Enable NPT and NRIPSAVE for AMD CPUs
Modern AMD CPUs support NPT and NRIPSAVE features and KVM exposes these
when present. NRIPSAVE apeared somewhere in Opteron_G3 lifetime (e.g.
QuadCore AMD Opteron 2378 has is but QuadCore AMD Opteron HE 2344 doesn't),
NPT was introduced a bit earlier.
Add the FEAT_SVM leaf to Opteron_G4/G5 and EPYC/EPYC-IBPB cpu models.
Tao Xu [Thu, 27 Dec 2018 02:43:03 +0000 (10:43 +0800)]
i386: Update stepping of Cascadelake-Server
Update the stepping from 5 to 6, in order that
the Cascadelake-Server CPU model can support AVX512VNNI
and MSR based features exposed by ARCH_CAPABILITIES.
Emilio G. Cota [Wed, 16 Jan 2019 17:01:14 +0000 (12:01 -0500)]
tcg/i386: enable dynamic TLB sizing
As the following experiments show, this series is a net perf gain,
particularly for memory-heavy workloads. Experiments are run on an
Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz.
1. System boot + shudown, debian aarch64:
- Before (v3.1.0):
Performance counter stats for './die.sh v3.1.0' (10 runs):
No improvement (within noise range). Note that for this workload,
increasing the time window too much can lead to perf degradation,
since it flushes the TLB *very* frequently.
3. x86_64 SPEC06int:
x86_64-softmmu speedup vs. v3.1.0 for SPEC06int (test set)
Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake)
Emilio G. Cota [Wed, 16 Jan 2019 17:01:12 +0000 (12:01 -0500)]
cputlb: do not evict empty entries to the vtlb
Currently we evict an entry to the victim TLB when it doesn't match
the current address. But it could be that there's no match because
the current entry is empty (i.e. all -1's, for instance via tlb_flush).
Do not evict the entry to the vtlb in that case.
This change will help us keep track of the TLB's use rate, which
we'll use to implement a policy for dynamic TLB sizing.