Peter Maydell [Mon, 13 Jul 2020 14:37:16 +0000 (15:37 +0100)]
hw/arm/armsse: Assert info->num_cpus is in-bounds in armsse_realize()
In armsse_realize() we have a loop over [0, info->num_cpus), which
indexes into various fixed-size arrays in the ARMSSE struct. This
confuses Coverity, which warns that we might overrun those arrays
(CID 1430326, 1430337, 1430371, 1430414, 1430430). This can't
actually happen, because the info struct is always one of the entries
in the armsse_variants[] array and num_cpus is either 1 or 2; we also
already assert in armsse_init() that num_cpus is not too large.
However, adding an assert to armsse_realize() like the one in
armsse_init() should help Coverity figure out that these code paths
aren't possible.
Peter Maydell [Sat, 11 Jul 2020 14:24:23 +0000 (15:24 +0100)]
qdev: Move doc comments from qdev.c to qdev-core.h
The doc-comments which document the qdev API are split between the
header file and the C source files, because as a project we haven't
been consistent about where we put them.
Move all the doc-comments in qdev.c to the header files, so that
users of the APIs don't have to look at the implementation files for
this information.
In the process, unify them into our doc-comment format and expand on
them in some cases to clarify expected use cases.
Control this cpu feature via a machine property, much as we do
with secure=on, since both require specialized support in the
machine setup to be functional.
Default MTE to off, since this feature implies extra overhead.
Peter Maydell [Mon, 20 Jul 2020 10:03:07 +0000 (11:03 +0100)]
Merge remote-tracking branch 'remotes/cminyard/tags/for-qemu-i2c-5' into staging
Minor changes to:
Add an SMBus config entry
Cleanup/simplify/document some I2C interfaces
# gpg: Signature made Thu 16 Jul 2020 18:46:55 BST
# gpg: using RSA key FD0D5CE67CE0F59A6688268661F38C90919BFF81
# gpg: Good signature from "Corey Minyard <[email protected]>" [unknown]
# gpg: aka "Corey Minyard <[email protected]>" [unknown]
# gpg: aka "Corey Minyard <[email protected]>" [unknown]
# gpg: aka "Corey Minyard <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FD0D 5CE6 7CE0 F59A 6688 2686 61F3 8C90 919B FF81
* remotes/cminyard/tags/for-qemu-i2c-5:
hw/i2c: Document the I2C qdev helpers
hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple()
hw/i2c: Rename i2c_realize_and_unref() as i2c_slave_realize_and_unref()
hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new()
hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus()
hw/i2c/Kconfig: Add an entry for the SMBus
Peter Maydell [Fri, 17 Jul 2020 15:25:08 +0000 (16:25 +0100)]
Makefile: Remove config-devices.mak on "make clean"
The config-devices.mak files are generated by "make", and so they
should be deleted by "make clean".
(This is different from config-host.mak and config-all-disas.mak,
which are created by "configure" and so only deleted by
"make distclean".)
If we don't delete these files on "make clean", then the build
tree is left in a state where it has the config-devices.mak
file but not the config-devices.mak.d file, and make will not
realize that it needs to rebuild config-devices.mak if, for
instance, hw/sd/Kconfig changes.
NB: config-all-devices.mak is also generated by "make", but we
already remove it on "make clean".
* remotes/rth/tags/pull-tcg-20200717:
tcg/cpu-exec: precise single-stepping after an interrupt
tcg/cpu-exec: precise single-stepping after an exception
tcg: Save/restore vecop_list around minmax fallback
Peter Maydell [Sat, 18 Jul 2020 22:59:03 +0000 (23:59 +0100)]
Merge remote-tracking branch 'remotes/cminyard/tags/for-qemu-ipmi-5' into staging
Man page update and new set sensor command
Some minor man page updates for fairly obvious things.
The set sensor command addition has been in the Power group's tree for a
long time and I have neglected to submit it.
-corey
# gpg: Signature made Fri 17 Jul 2020 17:45:32 BST
# gpg: using RSA key FD0D5CE67CE0F59A6688268661F38C90919BFF81
# gpg: Good signature from "Corey Minyard <[email protected]>" [unknown]
# gpg: aka "Corey Minyard <[email protected]>" [unknown]
# gpg: aka "Corey Minyard <[email protected]>" [unknown]
# gpg: aka "Corey Minyard <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FD0D 5CE6 7CE0 F59A 6688 2686 61F3 8C90 919B FF81
* remotes/cminyard/tags/for-qemu-ipmi-5:
ipmi: add SET_SENSOR_READING command
ipmi: Fix a man page entry
ipmi: Add man page pieces for the IPMI PCI devices
Cédric Le Goater [Mon, 18 Nov 2019 09:24:29 +0000 (10:24 +0100)]
ipmi: add SET_SENSOR_READING command
SET_SENSOR_READING is a complex IPMI command (see IPMI spec 35.17)
which enables the host software to set the reading value and the event
status of sensors supporting it.
Below is a proposal for all the operations (reading, assert, deassert,
event data) with the following limitations :
- No event are generated for threshold-based sensors.
- The case in which the BMC needs to generate its own events is not
supported.
Peter Maydell [Fri, 17 Jul 2020 13:58:13 +0000 (14:58 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- file-posix: Fix read-only Linux block devices with auto-read-only
- Require aligned image size with O_DIRECT to avoid assertion failure
- Allow byte-aligned direct I/O on NFS instead of guessing 4k alignment
- Fix nbd_export_close_all() crash
- Fix race in iotests case 030
- qemu-img resize: Require --shrink for shrinking all image formats
- crypto: use a stronger private key for tests
- Remove VXHS block device
- MAINTAINERS: vvfat: set status to odd fixes
* remotes/kevin/tags/for-upstream:
file-posix: Fix leaked fd in raw_open_common() error path
file-posix: Fix check_hdev_writable() with auto-read-only
file-posix: Move check_hdev_writable() up
file-posix: Allow byte-aligned O_DIRECT with NFS
block: Require aligned image size to avoid assertion failure
iotests: test shutdown when bitmap is exported through NBD
nbd: make nbd_export_close_all() synchronous
iotests/030: Reduce job speed to make race less likely
crypto: use a stronger private key for tests
qemu-img resize: Require --shrink for shrinking all image formats
Remove VXHS block device
vvfat: set status to odd fixes
We shouldn't fail when finding an unnamed bitmap in a unnamed node or
node with auto-generated node name, as bitmap migration ignores such
bitmaps in the first place.
Kevin Wolf [Fri, 17 Jul 2020 10:54:25 +0000 (12:54 +0200)]
file-posix: Fix check_hdev_writable() with auto-read-only
For Linux block devices, being able to open the device read-write
doesn't necessarily mean that the device is actually writable (one
example is a read-only LV, as you get with lvchange -pr <device>). We
have check_hdev_writable() to check this condition and fail opening the
image read-write if it's not actually writable.
However, this check doesn't take auto-read-only into account, but
results in a hard failure instead of downgrading to read-only where
possible.
Fix this and do the writable check not based on BDRV_O_RDWR, but only
when this actually results in opening the file read-write. A second
check is inserted in raw_reconfigure_getfd() to have the same check when
dynamic auto-read-only upgrades an image file from read-only to
read-write.
Kevin Wolf [Thu, 16 Jul 2020 14:26:01 +0000 (16:26 +0200)]
file-posix: Allow byte-aligned O_DIRECT with NFS
Since commit a6b257a08e3 ('file-posix: Handle undetectable alignment'),
we assume that if we open a file with O_DIRECT and alignment probing
returns 1, we just couldn't find out the real alignment requirement
because some filesystems make the requirement only for allocated blocks.
In this case, a safe default of 4k is used.
This is too strict for NFS, which does actually allow byte-aligned
requests even with O_DIRECT. Because we can't distinguish both cases
with generic code, let's just look at the file system magic and disable
s->needs_alignment for NFS. This way, O_DIRECT can still be used on NFS
for images that are not aligned to 4k.
Kevin Wolf [Thu, 16 Jul 2020 14:26:00 +0000 (16:26 +0200)]
block: Require aligned image size to avoid assertion failure
Unaligned requests will automatically be aligned to bl.request_alignment
and we can't extend write requests to access space beyond the end of the
image without resizing the image, so if we have the WRITE permission,
but not the RESIZE one, it's required that the image size is aligned.
Failing to meet this requirement could cause assertion failures like
this if RESIZE permissions weren't requested:
This was e.g. triggered by qemu-img converting to a target image with 4k
request alignment when the image was only aligned to 512 bytes, but not
to 4k.
Turn this into a graceful error in bdrv_check_perm() so that WRITE
without RESIZE can only be taken if the image size is aligned. If a user
holds both permissions and drops only RESIZE, the function will return
an error, but bdrv_child_try_set_perm() will ignore the failure silently
if permissions are only requested to be relaxed and just keep both
permissions while returning success.
Consider nbd_export_close_all(). The call-stack looks like this:
nbd_export_close_all() -> nbd_export_close -> call client_close() for
each client.
client_close() doesn't guarantee that client is closed: nbd_trip()
keeps reference to it. So, nbd_export_close_all() just reduce
reference counter on export and removes it from the list, but doesn't
guarantee that nbd_trip() finished neither export actually removed.
Let's wait for all exports actually removed.
Without this fix, the following crash is possible:
- export bitmap through internal Qemu NBD server
- connect a client
- shutdown Qemu
On shutdown nbd_export_close_all is called, but it actually don't wait
for nbd_trip() to finish and to release its references. So, export is
not release, and exported bitmap remains busy, and on try to remove the
bitmap (which is part of bdrv_close()) the assertion fails:
Kevin Wolf [Thu, 16 Jul 2020 13:28:29 +0000 (15:28 +0200)]
iotests/030: Reduce job speed to make race less likely
It can happen that the throttling of the stream job doesn't make it slow
enough that we can be sure that it still exists when it is referenced
again. Just use a much smaller speed to make this very unlikely to
happen again.
The unit tests using the x509 crypto functionality have started
failing in Fedora 33 rawhide with a message like
The certificate uses an insecure algorithm
This is result of Fedora changes to support strong crypto [1]. RSA
with 1024 bit key is viewed as legacy and thus insecure. Generate
a new private key which is 3072 bits long and reasonable future
proof.
Kevin Wolf [Fri, 10 Jul 2020 12:17:17 +0000 (14:17 +0200)]
qemu-img resize: Require --shrink for shrinking all image formats
QEMU 2.11 introduced the --shrink option for qemu-img resize to avoid
accidentally shrinking images (commit 4ffca8904a3). However, for
compatibility reasons, it was not enforced for raw images yet, but only
a deprecation warning was printed. This warning has existed for long
enough that we can now finally require --shrink for raw images, too, and
error out if it's not given.
Documentation already describes the state as it is after this patch.
The vxhs code doesn't compile since v2.12.0. There's no point in fixing
and then adding CI for a config that our users have demonstrated that
they do not use; better to just remove it.
* remotes/huth-gitlab/tags/pull-request-2020-07-17:
gitlab-ci.yml: Add fuzzer tests
qom: Plug memory leak in "info qom-tree"
configure: Fix for running with --enable-werror on macOS
fuzz: Expect the cmdline in a freeable GString
tests: qmp-cmd-test: fix memory leak
qtest: bios-tables-test: fix a memory leak
Thomas Huth [Wed, 15 Jul 2020 04:32:48 +0000 (06:32 +0200)]
gitlab-ci.yml: Add fuzzer tests
So far we neither compile-tested nor run any of the new fuzzers in our CI,
which led to some build failures of the fuzzer code in the past weeks.
To avoid this problem, add a job to compile the fuzzer code and run some
loops (which likely don't find any new bugs via fuzzing, but at least we
know that the code can still be run).
A nice side-effect of this test is that the leak tests are enabled here,
so we should now notice some of the memory leaks in our code base earlier.
Commit e8c9e65816 "qom: Make "info qom-tree" show children sorted"
created a memory leak, because I didn't realize
object_get_canonical_path_component()'s value needs to be freed.
Reproducer:
$ qemu-system-x86_64 -nodefaults -display none -S -monitor stdio
QEMU 5.0.50 monitor - type 'help' for more information
(qemu) info qom-tree
This leaks some 4500 path components, 12-13 characters on average,
i.e. roughly 100kBytes depending on the allocator. A couple of
hundred "info qom-tree" here, a couple of hundred there, and soon
enough we're talking about real memory.
Thomas Huth [Thu, 16 Jul 2020 05:12:22 +0000 (07:12 +0200)]
configure: Fix for running with --enable-werror on macOS
The configure script currently refuses to succeed when run on macOS
with --enable-werror:
ERROR: configure test passed without -Werror but failed with -Werror.
The information in config.log indicates:
config-temp/qemu-conf.c:3:55: error: control reaches end of non-void
function [-Werror,-Wreturn-type]
static void *f(void *p) { pthread_setname_np("QEMU"); }
^
And indeed, the return statement is missing here.
In the initial FuzzTarget, get_init_cmdline returned a char *. With this
API, we had no guarantee about where the string came from. For example,
i440fx-qtest-reboot-fuzz simply returned a pointer to a string literal,
while the QOS-based targets build the arguments out in a GString an
return the gchar *str pointer. Since we did not try to free the cmdline,
we have a leak for any targets that do not simply return string
literals. Clean up this mess by forcing fuzz-targets to return
a GString, that we can free.
$ aarch64-linux-gnu-gdb
GNU gdb (GDB) 9.2
[...]
(gdb) tar rem :1234
Remote debugging using :1234
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
0x0000000000000000 in ?? ()
(gdb) # writing nop insns to 0x200 and 0x204
(gdb) set *0x200 = 0xd503201f
(gdb) set *0x204 = 0xd503201f
(gdb) # 0x0 address contains 0 which is an invalid opcode.
(gdb) # The CPU should raise an exception and jump to 0x200
(gdb) si
0x0000000000000204 in ?? ()
With this commit, the same run steps correctly on the first instruction
of the exception vector:
Peter Maydell [Thu, 16 Jul 2020 20:46:18 +0000 (21:46 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 fixes for -rc1
Fixes for x86 that missed hard freeze:
* Don't trigger warnings for features set by
CPU model versions (Xiaoyao Li)
* Missing features in Icelake-Server, Skylake-Server,
Cascadelake-Server CPU models (Chenyi Qiang)
* Fix hvf x86_64 guest boot crash (Roman Bolshakov)
* remotes/ehabkost/tags/x86-next-pull-request:
i386: hvf: Explicitly set CR4 guest/host mask
target/i386: add the missing vmx features for Skylake-Server and Cascadelake-Server CPU models
target/i386: fix model number and add missing features for Icelake-Server CPU model
target/i386: add fast short REP MOV support
i386/cpu: Don't add unavailable_features to env->user_features
i368/cpu: Clear env->user_features after loading versioned CPU model
hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple()
We use "create_simple" names for functions that allocate, initialize,
configure and realize device objects: pci_create_simple(),
isa_create_simple(), usb_create_simple(). For consistency, rename
i2c_create_slave() as i2c_slave_create_simple(). Since we have
to update all the callers, also let it return a I2CSlave object.
hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new()
We use "new" names for functions that allocate and initialize
device objects: pci_new(), isa_new(), usb_new().
Let's call this one i2c_slave_new(). Since we have to update
all the callers, also let it return a I2CSlave object.
All the callers of aspeed_i2c_get_bus() have a AspeedI2CState and
cast it to a DeviceState with DEVICE(), then aspeed_i2c_get_bus()
cast the DeviceState to an AspeedI2CState with ASPEED_I2C()...
Simplify aspeed_i2c_get_bus() callers by using AspeedI2CState
argument.
The System Management Bus is more or less a derivative of the I2C
bus, thus the Kconfig entry depends of I2C.
Not all boards providing an I2C bus support SMBus.
Use two different Kconfig entries to be able to select I2C without
selecting SMBus.
target/i386: fix model number and add missing features for Icelake-Server CPU model
Add the missing features(sha_ni, avx512ifma, rdpid, fsrm,
vmx-rdseed-exit, vmx-pml, vmx-eptp-switching) and change the model
number to 106 in the Icelake-Server-v4 CPU model.
Xiaoyao Li [Mon, 13 Jul 2020 17:44:36 +0000 (01:44 +0800)]
i386/cpu: Don't add unavailable_features to env->user_features
Features unavailable due to absent of their dependent features should
not be added to env->user_features. env->user_features only contains the
feature explicity specified with -feature/+feature by user.
Xiaoyao Li [Mon, 13 Jul 2020 17:44:35 +0000 (01:44 +0800)]
i368/cpu: Clear env->user_features after loading versioned CPU model
Features defined in versioned CPU model are recorded in env->user_features
since they are updated as property. It's unwated because they are not
user specified.
Simply clear env->user_features as a fix. It won't clear user specified
features because user specified features are filled to
env->user_features later in x86_cpu_expand_features().
Peter Maydell [Thu, 16 Jul 2020 13:46:47 +0000 (14:46 +0100)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2020-07-15-1' into staging
Merge tpm 2020/07/15 v1
# gpg: Signature made Wed 15 Jul 2020 20:16:21 BST
# gpg: using RSA key B818B9CADF9089C2D5CEC66B75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2020-07-15-1:
tests: tpm: Skip over pcrUpdateCounter byte in result comparison
tpm: tpm_spapr: Exit on TPM backend failures
Peter Maydell [Thu, 16 Jul 2020 12:12:05 +0000 (13:12 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Wed 15 Jul 2020 14:49:07 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
ftgmac100: fix dblac write test
net: detect errors from probing vnet hdr flag for TAP devices
net: check if the file descriptor is valid before using it
qemu-options.hx: Clean up and fix typo for colo-compare
net/colo-compare.c: Expose compare "max_queue_size" to users
hw/net: Added CSO for IPv6
virtio-net: fix removal of failover device
Fix the contition to figure whenever we need to wait for more data or
not. Simply check the mode, if we are not in DATAIN state any more we
are done already and don't need to go ASYNC.
Calling ramfb_display_update() might replace the DisplaySurface with the
boot display, which in turn will free the currently active
DisplaySurface.
So clear our DisplaySurface pinter (dpy->region.surface pointer) to (a)
avoid use-after-free and (b) force replacing the boot display with the
real display when switching back.
Stefan Berger [Tue, 7 Jul 2020 20:16:25 +0000 (16:16 -0400)]
tests: tpm: Skip over pcrUpdateCounter byte in result comparison
The TPM 2 code in libtpms was fixed to handle the PCR 'TCB group' according
to the PCClient profile. The change of the PCRs belonging to the 'TCB group'
now affects the pcrUpdateCounter in the TPM2_PCRRead() responses where its
value is now different (typically lower by '1') than what it was before. To
not fail the tests, we skip the comparison of the 14th byte, which
represents the pcrUpdateCounter.
Stefan Berger [Tue, 7 Jul 2020 20:16:24 +0000 (16:16 -0400)]
tpm: tpm_spapr: Exit on TPM backend failures
Exit on TPM backend failures in the same way as the TPM CRB and TIS device
models do. With this change we now get an error report when the backend
did not start up properly:
error: internal error: qemu unexpectedly closed the monitor:
2020-07-07T12:49:28.333928Z qemu-system-ppc64: tpm-emulator: \
TPM result for CMD_INIT: 0x101 operation failed
Peter Maydell [Wed, 15 Jul 2020 16:16:39 +0000 (17:16 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-misc-for-rc0-150720-3' into staging
Final fixes for 5.1-rc0
- minor documentation nit
- docker.py bootstrap fixes
- tweak containers.yml wildcards
- fix float16 nan detection
- conditional use of -Wpsabi
- fix missing iotlb data for plugins
- proper locking for helper based bb count
- drop ppc64abi32 from the plugin check-tcg test
# gpg: Signature made Wed 15 Jul 2020 11:59:08 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <[email protected]>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-misc-for-rc0-150720-3:
.travis.yml: skip ppc64abi32-linux-user with plugins
plugins: expand the bb plugin to be thread safe and track per-cpu
cputlb: ensure we save the IOTLB data in case of reset
tests/plugins: don't unconditionally add -Wpsabi
fpu/softfloat: fix up float16 nan recognition
gitlab-ci/containers: Add missing wildcard where we should look for changes
docker.py: fix fetching of FROM layers
tests/docker: Remove the libssh workaround from the ubuntu 20.04 image
docs/devel: fix grammar in multi-thread-tcg
erik-smit [Sun, 28 Jun 2020 14:26:59 +0000 (16:26 +0200)]
ftgmac100: fix dblac write test
The test of the write of the dblac register was testing the old value
instead of the new value. This would accept the write of an invalid value
but subsequently refuse any following valid writes.
net: detect errors from probing vnet hdr flag for TAP devices
When QEMU sets up a tap based network device backend, it mostly ignores errors
reported from various ioctl() calls it makes, assuming the TAP file descriptor
is valid. This assumption can easily be violated when the user is passing in a
pre-opened file descriptor. At best, the ioctls may fail with a -EBADF, but if
the user passes in a bogus FD number that happens to clash with a FD number that
QEMU has opened internally for another reason, a wide variety of errnos may
result, as the TUNGETIFF ioctl number may map to a completely different command
on a different type of file.
By ignoring all these errors, QEMU sets up a zombie network backend that will
never pass any data. Even worse, when QEMU shuts down, or that network backend
is hot-removed, it will close this bogus file descriptor, which could belong to
another QEMU device backend.
There's no obvious guaranteed reliable way to detect that a FD genuinely is a
TAP device, as opposed to a UNIX socket, or pipe, or something else. Checking
the errno from probing vnet hdr flag though, does catch the big common cases.
ie calling TUNGETIFF will return EBADF for an invalid FD, and ENOTTY when FD is
a UNIX socket, or pipe which catches accidental collisions with FDs used for
stdio, or monitor socket.
Previously the example below where bogus fd 9 collides with the FD used for the
chardev saw:
$ ./x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hostnet0,fd=9 \
-chardev socket,id=charchannel0,path=/tmp/qga,server,nowait \
-monitor stdio -vnc :0
qemu-system-x86_64: -netdev tap,id=hostnet0,fd=9: TUNGETIFF ioctl() failed: Inappropriate ioctl for device
TUNSETOFFLOAD ioctl() failed: Bad address
QEMU 2.9.1 monitor - type 'help' for more information
(qemu) Warning: netdev hostnet0 has no peer
which gives a running QEMU with a zombie network backend.
With this change applied we get an error message and QEMU immediately exits
before carrying on and making a bigger disaster:
$ ./x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hostnet0,fd=9 \
-chardev socket,id=charchannel0,path=/tmp/qga,server,nowait \
-monitor stdio -vnc :0
qemu-system-x86_64: -netdev tap,id=hostnet0,vhost=on,fd=9: Unable to query TUNGETIFF on FD 9: Inappropriate ioctl for device
net: check if the file descriptor is valid before using it
qemu_set_nonblock() checks that the file descriptor can be used and, if
not, crashes QEMU. An assert() is used for that. The use of assert() is
used to detect programming error and the coredump will allow to debug
the problem.
But in the case of the tap device, this assert() can be triggered by
a misconfiguration by the user. At startup, it's not a real problem, but it
can also happen during the hot-plug of a new device, and here it's a
problem because we can crash a perfectly healthy system.
For instance:
# ip link add link virbr0 name macvtap0 type macvtap mode bridge
# ip link set macvtap0 up
# TAP=/dev/tap$(ip -o link show macvtap0 | cut -d: -f1)
# qemu-system-x86_64 -machine q35 -device pcie-root-port,id=pcie-root-port-0 -monitor stdio 9<> $TAP
(qemu) netdev_add type=tap,id=hostnet0,vhost=on,fd=9
(qemu) device_add driver=virtio-net-pci,netdev=hostnet0,id=net0,bus=pcie-root-port-0
(qemu) device_del net0
(qemu) netdev_del hostnet0
(qemu) netdev_add type=tap,id=hostnet1,vhost=on,fd=9
qemu-system-x86_64: .../util/oslib-posix.c:247: qemu_set_nonblock: Assertion `f != -1' failed.
Aborted (core dumped)
To avoid that, add a function, qemu_try_set_nonblock(), that allows to report the
problem without crashing.
In the same way, we also update the function for vhostfd in net_init_tap_one() and
for fd in net_init_socket() (both descriptors are provided by the user and can
be wrong).
Peter Maydell [Wed, 15 Jul 2020 12:54:09 +0000 (13:54 +0100)]
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/mips-next-20200714' into staging
MIPS patches for 5.1
- A pair of fixes,
- Add Huacai Chen as MIPS KVM maintainer,
- Add Jiaxun Yang as designated MIPS TCG reviewer.
CI jobs results:
. https://travis-ci.org/github/philmd/qemu/builds/708079271
. https://gitlab.com/philmd/qemu/-/pipelines/166528104
. https://cirrus-ci.com/build/6483996878045184
# gpg: Signature made Tue 14 Jul 2020 20:59:58 BST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <[email protected]>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
Peter Maydell [Wed, 15 Jul 2020 12:04:27 +0000 (13:04 +0100)]
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/python-next-20200714' into staging
Python patches for 5.1
- Reduce race conditions on QEMUMachine::shutdown()
1. Remove the "bare except" pattern in the existing shutdown code,
which can mask problems and make debugging difficult.
2. Ensure that post-shutdown cleanup is always performed, even when
graceful termination fails.
3. Unify cleanup paths such that no matter how the VM is terminated,
the same functions and steps are always taken to reset the object
state.
4. Rewrite shutdown() such that any error encountered when attempting
a graceful shutdown will be raised as an AbnormalShutdown exception.
The pythonic idiom is to allow the caller to decide if this is a
problem or not.
- Modify part of the python/qemu library to comply with:
. mypy --strict
. pylint
. flake8
- Script for the TCG Continuous Benchmarking project that uses
callgrind to dissect QEMU execution into three main phases:
CI jobs results:
. https://cirrus-ci.com/build/5421349961203712
. https://gitlab.com/philmd/qemu/-/pipelines/166556001
. https://travis-ci.org/github/philmd/qemu/builds/708102347
# gpg: Signature made Tue 14 Jul 2020 21:40:05 BST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <[email protected]>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd-gitlab/tags/python-next-20200714:
python/qmp.py: add QMPProtocolError
python/qmp.py: add casts to JSON deserialization
python/qmp.py: Do not return None from cmd_obj
python/qmp.py: re-absorb MonitorResponseError
iotests.py: use qemu.qmp type aliases
python/qmp.py: Define common types
python/machine.py: change default wait timeout to 3 seconds
python/machine.py: re-add sigkill warning suppression
python/machine.py: split shutdown into hard and soft flavors
tests/acceptance: Don't test reboot on cubieboard
tests/acceptance: wait() instead of shutdown() where appropriate
python/machine.py: Make wait() call shutdown()
python/machine.py: Add a configurable timeout to shutdown()
python/machine.py: Prohibit multiple shutdown() calls
python/machine.py: Perform early cleanup for wait() calls, too
python/machine.py: Add _early_cleanup hook
python/machine.py: Close QMP socket in cleanup
python/machine.py: consolidate _post_shutdown()
scripts/performance: Add dissect.py script
0: 978 times (97.80%), avg time 0.270 (0.01 varience/0.08 deviation)
-6: 21 times (2.10%), avg time 0.336 (0.01 varience/0.12 deviation)
-11: 1 times (0.10%), avg time 0.502 (0.00 varience/0.00 deviation)
Ran command 1000 times, 978 passes
But when running with plugins we hit the failure a lot more often:
0: 91 times (91.00%), avg time 0.302 (0.04 varience/0.19 deviation)
-11: 9 times (9.00%), avg time 0.558 (0.01 varience/0.11 deviation)
Ran command 100 times, 91 passes
The crash occurs in guest code which is the same in both pass and fail
cases. However we see various messages reported on the console about
corrupted memory lists which seems to imply the guest memory allocation
is corrupted. This lines up with the seg fault being in the guest
__libc_free function. So we think this is a guest bug which is
exacerbated by various modes of translation. If anyone has access to
real hardware to soak test the test case we could prove this properly.
Alex Bennée [Mon, 13 Jul 2020 20:04:11 +0000 (21:04 +0100)]
plugins: expand the bb plugin to be thread safe and track per-cpu
While there isn't any easy way to make the inline counts thread safe
we can ensure the callback based ones are. While we are at it we can
reduce introduce a new option ("idle") to dump a report of the current
bb and insn count each time a vCPU enters the idle state.
Alex Bennée [Mon, 13 Jul 2020 20:04:10 +0000 (21:04 +0100)]
cputlb: ensure we save the IOTLB data in case of reset
Any write to a device might cause a re-arrangement of memory
triggering a TLB flush and potential re-size of the TLB invalidating
previous entries. This would cause users of qemu_plugin_get_hwaddr()
to see the warning:
invalid use of qemu_plugin_get_hwaddr
because of the failed tlb_lookup which should always succeed. To
prevent this we save the IOTLB data in case it is later needed by a
plugin doing a lookup.
Alex Bennée [Mon, 13 Jul 2020 20:04:09 +0000 (21:04 +0100)]
tests/plugins: don't unconditionally add -Wpsabi
Not all compilers support the -Wpsabi (clang-9 in my case). To handle
this gracefully we pare back the shared build machinery so the
Makefile is relatively "standalone". We still take advantage of
config-host.mak as configure has done a bunch of probing for us but
that is it.
Thomas Huth [Mon, 13 Jul 2020 18:22:35 +0000 (20:22 +0200)]
gitlab-ci/containers: Add missing wildcard where we should look for changes
The tests/docker/* wildcard seems to only match the files that are directly
in the tests/docker folder - but changes to the files in the directory
tests/docker/dockerfiles are currently ignored. Seems like we need a
separate entry to match the files in that folder. With this wildcard added,
the stages now get re-run successfully when something in the dockerfiles
has been changed.
Alex Bennée [Mon, 13 Jul 2020 20:04:07 +0000 (21:04 +0100)]
docker.py: fix fetching of FROM layers
This worked on a system that was already bootstrapped because the
stage 2 images already existed even if they wouldn't be used. What we
should have pulled down was the FROM line containers first because
building on gitlab doesn't have the advantage of using our build
system to build the pre-requisite bits.
We still pull the image we want to build just in case we can use the
cached data.
Peter Maydell [Wed, 15 Jul 2020 08:06:55 +0000 (09:06 +0100)]
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sdcard-CVE-2020-13253-pull-request' into staging
Fix CVE-2020-13253
By using invalidated address, guest can do out-of-bounds accesses.
These patches fix the issue by only allowing SD card image sizes
power of 2, and not switching to SEND_DATA state when the address
is invalid (out of range).
This issue was found using QEMU fuzzing mode (using --enable-fuzzing,
see docs/devel/fuzzing.txt) and reported by Alexander Bulekov.
CI jobs results:
. https://cirrus-ci.com/build/5157142548185088
. https://gitlab.com/philmd/qemu/-/pipelines/166381731
. https://travis-ci.org/github/philmd/qemu/builds/707956535
# gpg: Signature made Tue 14 Jul 2020 14:54:44 BST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <[email protected]>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd-gitlab/tags/sdcard-CVE-2020-13253-pull-request:
hw/sd/sdcard: Do not switch to ReceivingData if address is invalid
hw/sd/sdcard: Update coding style to make checkpatch.pl happy
hw/sd/sdcard: Do not allow invalid SD card sizes
hw/sd/sdcard: Simplify realize() a bit
hw/sd/sdcard: Restrict Class 6 commands to SCSD cards
tests/acceptance/boot_linux: Expand SD card image to power of 2
tests/acceptance/boot_linux: Tag tests using a SD card with 'device:sd'
docs/orangepi: Add instructions for resizing SD image to power of two
MAINTAINERS: Cc qemu-block mailing list
John Snow [Fri, 10 Jul 2020 05:22:09 +0000 (01:22 -0400)]
python/qmp.py: add casts to JSON deserialization
mypy and python type hints are not powerful enough to properly describe
JSON messages in Python 3.6. The best we can do, generally, is describe
them as Dict[str, Any].
Add casts to coerce this type for static analysis; but do NOT enforce
this type at runtime in any way.
Note: Python 3.8 adds a TypedDict construct which allows for the
description of more arbitrary Dictionary shapes. There is a third-party
module, "Pydantic", which is compatible with 3.6 that can be used
instead of the JSON library that parses JSON messages to fully-typed
Python objects, and may be preferable in some cases.
(That is well beyond the scope of this commit or series.)
John Snow [Fri, 10 Jul 2020 05:22:08 +0000 (01:22 -0400)]
python/qmp.py: Do not return None from cmd_obj
This makes typing the qmp library difficult, as it necessitates wrapping
Optional[] around the type for every return type up the stack. At some
point, it becomes difficult to discern or remember why it's None instead
of the expected object.
Use the python exception system to tell us exactly why we didn't get an
object. Remove this special-cased return.
John Snow [Fri, 10 Jul 2020 05:06:47 +0000 (01:06 -0400)]
python/machine.py: split shutdown into hard and soft flavors
This is done primarily to avoid the 'bare except' pattern, which
suppresses all exceptions during shutdown and can obscure errors.
Replace this with a pattern that isolates the different kind of shutdown
paradigms (_hard_shutdown and _soft_shutdown), and a new fallback shutdown
handler (_do_shutdown) that gracefully attempts one before the other.
This split now also ensures that no matter what happens,
_post_shutdown() is always invoked.
shutdown() changes in behavior such that if it attempts to do a graceful
shutdown and is unable to, it will now always raise an exception to
indicate this. This can be avoided by the test writer in three ways:
1. If the VM is expected to have already exited or is in the process of
exiting, wait() can be used instead of shutdown() to clean up resources
instead. This helps avoid race conditions in shutdown.
2. If a test writer is expecting graceful shutdown to fail, shutdown
should be called in a try...except block.
3. If the test writer has no interest in performing a graceful shutdown
at all, kill() can be used instead.
Handling shutdown in this way makes it much more explicit which type of
shutdown we want and allows the library to report problems with this
process.
John Snow [Fri, 10 Jul 2020 05:06:46 +0000 (01:06 -0400)]
tests/acceptance: Don't test reboot on cubieboard
cubieboard does not have a functioning reboot, it halts and QEMU does
not exit.
vm.shutdown() is modified in a forthcoming patch that makes it less tolerant
of race conditions on shutdown; tests should consciously decide to WAIT
or to SHUTDOWN qemu.
So long as this test is attempting to reboot, the correct choice would
be to WAIT for the VM to exit. However, since that's broken, we should
SHUTDOWN instead.
SHUTDOWN is indeed what already happens when the test performs teardown,
however, if anyone fixes cubieboard reboot in the future, this test will
develop a new race condition that might be hard to debug.
Therefore: remove the reboot test and make it obvious that the VM is
still running when the test concludes, where the test teardown will do
the right thing.
John Snow [Fri, 10 Jul 2020 05:06:45 +0000 (01:06 -0400)]
tests/acceptance: wait() instead of shutdown() where appropriate
When issuing 'reboot' to a VM with the no-reboot option, that VM will
exit. When then issuing a shutdown command, the cleanup may race.
Add calls to vm.wait() which will gracefully mark the VM as having
exited. Subsequent vm.shutdown() calls in generic tearDown code will not
race when called after completion of the call.