Niek Linnenbank [Wed, 11 Mar 2020 22:18:47 +0000 (23:18 +0100)]
hw/arm/allwinner-h3: add SDRAM controller device
In the Allwinner H3 SoC the SDRAM controller is responsible
for interfacing with the external Synchronous Dynamic Random
Access Memory (SDRAM). Types of memory that the SDRAM controller
supports are DDR2/DDR3 and capacities of up to 2GiB. This commit
adds emulation support of the Allwinner H3 SDRAM controller.
Niek Linnenbank [Wed, 11 Mar 2020 22:18:46 +0000 (23:18 +0100)]
hw/arm/allwinner-h3: add Boot ROM support
A real Allwinner H3 SoC contains a Boot ROM which is the
first code that runs right after the SoC is powered on.
The Boot ROM is responsible for loading user code (e.g. a bootloader)
from any of the supported external devices and writing the downloaded
code to internal SRAM. After loading the SoC begins executing the code
written to SRAM.
This commits adds emulation of the Boot ROM firmware setup functionality
by loading user code from SD card in the A1 SRAM. While the A1 SRAM is
64KiB, we limit the size to 32KiB because the real H3 Boot ROM also rejects
sizes larger than 32KiB. For reference, this behaviour is documented
by the Linux Sunxi project wiki at:
Niek Linnenbank [Wed, 11 Mar 2020 22:18:45 +0000 (23:18 +0100)]
hw/arm/allwinner-h3: add EMAC ethernet device
The Allwinner Sun8i System on Chip family includes an Ethernet MAC (EMAC)
which provides 10M/100M/1000M Ethernet connectivity. This commit
adds support for the Allwinner EMAC from the Sun8i family (H2+, H3, A33, etc),
including emulation for the following functionality:
* DMA transfers
* MII interface
* Transmit CRC calculation
Niek Linnenbank [Wed, 11 Mar 2020 22:18:44 +0000 (23:18 +0100)]
hw/arm/allwinner: add SD/MMC host controller
The Allwinner System on Chip families sun4i and above contain
an integrated storage controller for Secure Digital (SD) and
Multi Media Card (MMC) interfaces. This commit adds support
for the Allwinner SD/MMC storage controller with the following
emulated features:
Niek Linnenbank [Wed, 11 Mar 2020 22:18:43 +0000 (23:18 +0100)]
hw/arm/allwinner: add Security Identifier device
The Security Identifier device found in various Allwinner System on Chip
designs gives applications a per-board unique identifier. This commit
adds support for the Allwinner Security Identifier using a 128-bit
UUID value as input.
Niek Linnenbank [Wed, 11 Mar 2020 22:18:42 +0000 (23:18 +0100)]
hw/arm/allwinner: add CPU Configuration module
Various Allwinner System on Chip designs contain multiple processors
that can be configured and reset using the generic CPU Configuration
module interface. This commit adds support for the Allwinner CPU
configuration interface which emulates the following features:
Niek Linnenbank [Wed, 11 Mar 2020 22:18:41 +0000 (23:18 +0100)]
hw/arm/allwinner-h3: add System Control module
The Allwinner H3 System on Chip has an System Control
module that provides system wide generic controls and
device information. This commit adds support for the
Allwinner H3 System Control module.
Niek Linnenbank [Wed, 11 Mar 2020 22:18:40 +0000 (23:18 +0100)]
hw/arm/allwinner-h3: add USB host controller
The Allwinner H3 System on Chip contains multiple USB 2.0 bus
connections which provide software access using the Enhanced
Host Controller Interface (EHCI) and Open Host Controller
Interface (OHCI) interfaces. This commit adds support for
both interfaces in the Allwinner H3 System on Chip.
Niek Linnenbank [Wed, 11 Mar 2020 22:18:39 +0000 (23:18 +0100)]
hw/arm/allwinner-h3: add Clock Control Unit
The Clock Control Unit is responsible for clock signal generation,
configuration and distribution in the Allwinner H3 System on Chip.
This commit adds support for the Clock Control Unit which emulates
a simple read/write register interface.
Niek Linnenbank [Wed, 11 Mar 2020 22:18:38 +0000 (23:18 +0100)]
hw/arm: add Xunlong Orange Pi PC machine
The Xunlong Orange Pi PC is an Allwinner H3 System on Chip
based embedded computer with mainline support in both U-Boot
and Linux. The board comes with a Quad Core Cortex A7 @ 1.3GHz,
1GiB RAM, 100Mbit ethernet, USB, SD/MMC, USB, HDMI and
various other I/O. This commit add support for the Xunlong
Orange Pi PC machine.
Niek Linnenbank [Wed, 11 Mar 2020 22:18:37 +0000 (23:18 +0100)]
hw/arm: add Allwinner H3 System-on-Chip
The Allwinner H3 is a System on Chip containing four ARM Cortex A7
processor cores. Features and specifications include DDR2/DDR3 memory,
SD/MMC storage cards, 10/100/1000Mbit Ethernet, USB 2.0, HDMI and
various I/O modules. This commit adds support for the Allwinner H3
System on Chip.
Igor Mammedov [Tue, 3 Mar 2020 09:12:54 +0000 (04:12 -0500)]
hw/arm/cubieboard: make sure SOC object isn't leaked
SOC object returned by object_new() is leaked in current code.
Set SOC parent explicitly to board and then unref to SOC object
to make sure that refererence returned by object_new() is taken
care of.
The SOC object will be kept alive by its parent (machine) and
will be automatically freed when MachineState is destroyed.
target/arm: Disable clean_data_tbi for system mode
We must include the tag in the FAR_ELx register when raising
an addressing exception. Which means that we should not clear
out the tag during translation.
We cannot at present comply with this for user mode, so we
retain the clean_data_tbi function for the moment, though it
no longer does what it says on the tin for system mode. This
function is to be replaced with MTE, so don't worry about the
slight misnaming.
The Aspeed SMC Controller can operate in different modes : Read, Fast
Read, Write and User modes. When the User mode is configured, it
selects automatically the SPI slave device until the CE_STOP_ACTIVE
bit is set to 1. When any other modes are configured the device is
unselected. The HW logic handles the chip select automatically when
the flash is accessed through its AHB window.
When configuring the CEx Control Register, the User mode logic to
select and unselect the slave is incorrect and data corruption can be
seen on machines using two chips, witherspoon and romulus.
Rework the handler setting the CEx Control Register to fix this issue.
Peter Maydell [Tue, 3 Mar 2020 17:49:49 +0000 (17:49 +0000)]
target/arm: Recalculate hflags correctly after writes to CONTROL
A write to the CONTROL register can change our current EL (by
writing to the nPRIV bit). That means that we can't assume
that s->current_el is still valid in trans_MSR_v7m() when
we try to rebuild the hflags.
Add a new helper rebuild_hflags_m32_newel() which, like the
existing rebuild_hflags_a32_newel(), recalculates the current
EL from scratch, and use it in trans_MSR_v7m().
This fixes an assertion about an hflags mismatch when the
guest changes privilege by writing to CONTROL.
Peter Maydell [Tue, 3 Mar 2020 17:49:48 +0000 (17:49 +0000)]
target/arm: Update hflags in trans_CPS_v7m()
For M-profile CPUs, the FAULTMASK value affects the CPU's MMU index
(it changes the NegPri bit). We update the hflags after calls
to the v7m_msr helper in trans_MSR_v7m() but forgot to do so
in trans_CPS_v7m().
Peter Maydell [Tue, 3 Mar 2020 17:49:47 +0000 (17:49 +0000)]
hw/intc/armv7m_nvic: Rebuild hflags on reset
Some of an M-profile CPU's cached hflags state depends on state that's
in our NVIC object. We already do an hflags rebuild when the NVIC
registers are written, but we also need to do this on NVIC reset,
because there's no guarantee that this will happen before the
CPU reset.
This fixes an assertion due to mismatched hflags which happens if
the CPU is reset from inside a HardFault handler.
Peter Maydell [Thu, 12 Mar 2020 15:20:52 +0000 (15:20 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-docs-20200312' into staging
docs queue:
* Remove some no longer needed texinfo infrastructure
* Reorder the top level index docs to put most useful manuals first
* Split the Arm target-specific info into sub-pages
* Improve the Arm documentation a bit with info previously
only on the wiki page
* remotes/pmaydell/tags/pull-docs-20200312:
docs: Be consistent about capitalization of 'Arm'
docs: Move arm-cpu-features.rst into the system manual
docs/system/target-arm.rst: Add some introductory text
docs/system: Split target-arm.rst into sub-documents
Makefile: Allow for subdirectories in Sphinx manual dependencies
docs/qemu-option-trace.rst.inc: Remove redundant comment
docs/index.rst, docs/index.html.in: Reorder manuals
Makefile: Make all Sphinx documentation depend on the extensions
docs/sphinx/hxtool.py: Remove STEXI/ETEXI support
hxtool: Remove Texinfo generation support
Update comments in .hx files that mention Texinfo
Makefile: Remove redundant Texinfo related code
Peter Maydell [Mon, 9 Mar 2020 21:58:18 +0000 (21:58 +0000)]
docs: Be consistent about capitalization of 'Arm'
The company 'Arm' went through a rebranding some years back
involving a recapitalization from 'ARM' to 'Arm'. As a result
our documentation is a bit inconsistent between the two forms.
It's not worth trying to update everywhere in QEMU, but it's
easy enough to make docs/ consistent.
Note that "ARMv8" and similar architecture names, and
older CPU names like "ARM926" still retain all-caps.
Peter Maydell [Mon, 9 Mar 2020 21:58:17 +0000 (21:58 +0000)]
docs: Move arm-cpu-features.rst into the system manual
Now we have somewhere to put arm-specific rst documentation,
we can move arm-cpu-features.rst from the docs/ top level
directory into docs/system/arm/.
Peter Maydell [Mon, 9 Mar 2020 21:58:16 +0000 (21:58 +0000)]
docs/system/target-arm.rst: Add some introductory text
Now we've moved the various bits of per-board documentation into
their own files, the top level document is a little bare. Add
some introductory information, including a note that many
of the board models we support are currently undocumented.
(Most sections of this new text were originally written by me
for the wiki page https://wiki.qemu.org/Documentation/Platforms/ARM)
Peter Maydell [Mon, 9 Mar 2020 21:58:15 +0000 (21:58 +0000)]
docs/system: Split target-arm.rst into sub-documents
Currently the documentation for Arm system emulator targets is in a
single target-arm.rst. This describes only some of the boards and
often in a fairly abbreviated fashion. Restructure it so that each
board has its own documentation file in the docs/system/arm/
subdirectory.
This will hopefully encourage us to write board documentation that
describes the board in detail, rather than a few brief paragraphs
in a single long page. The table of contents should also help users
to find the board they care about faster.
Once the structure is in place we'll be able to move microvm.rst
from the top-level docs/ directory.
All the text from the old page is retained, except for the final
paragraph ("A Linux 2.6 test image is available on the QEMU web site.
More information is available in the QEMU mailing-list archive."),
which is deleted. The git history shows this was originally added
in reference to the integratorcp board (at that time the only
Arm board that was supported), and has subsequently gradually been
further and further separated from the integratorcp documentation
by the insertion of other board documentation sections. It's
extremely out of date and no longer accurate, since AFAICT there
isn't an integratorcp kernel on the website any more; so better
deleted than retained.
Peter Maydell [Mon, 9 Mar 2020 21:58:14 +0000 (21:58 +0000)]
Makefile: Allow for subdirectories in Sphinx manual dependencies
Currently we put 'docs/foo/*.rst' in the Make list of dependencies
for the Sphinx 'foo' manual, which means all the files must be
in the top level of that manual's directory. We'd like to be
able to have subdirectories inside some of the manuals, so add
'docs/foo/*/*.rst' to the dependencies too.
The Texinfo version of the tracing options documentation has now
been deleted, so we can remove the now-redundant comment at the top
of the rST version that was reminding us that the two should be
kept in sync.
Now that qemu-doc.html is no longer present, the ordering of manuals
within the top-level index page looks a bit odd. Reshuffle so that
the manuals the user is most likely to be interested in are at the
top of the list, and the reference material is at the bottom.
Similarly, we reorder the index.rst file used as the base of
the "all manuals in one" documentation for readthedocs.
The new order is:
* system
* user
* tools
* interop
* specs
* QMP reference (if present)
* Guest agent protocol reference (if present)
* devel (if present)
Peter Maydell [Fri, 6 Mar 2020 17:17:47 +0000 (17:17 +0000)]
Makefile: Make all Sphinx documentation depend on the extensions
Add the Python source files of our Sphinx extensions to the
dependencies of the Sphinx manuals, so that if we edit the
extension source code the manuals get rebuilt.
Adding this dependency unconditionally means that we'll rebuild
a manual even if it happens to not use the extension whose
source file was changed, but this is simpler and less error
prone, and it's unlikely that we'll be making frequent changes
to the extensions.
Peter Maydell [Fri, 6 Mar 2020 17:17:45 +0000 (17:17 +0000)]
hxtool: Remove Texinfo generation support
All the STEXI/ETEXI blocks and the Makfile rules that use them have now
been removed from the codebase. We can remove the code from the hxtool
script which handles the STEXI/ETEXI directives and the '-t' option.
Peter Maydell [Wed, 11 Mar 2020 17:06:40 +0000 (17:06 +0000)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-03-11' into staging
Block patches for the 5.0 softfreeze:
- qemu-img measure for LUKS
- Improve block-copy's performance by reducing inter-request
dependencies
- Make curl's detection of accept-ranges more robust
- Memleak fixes
- iotest fix
* remotes/maxreitz/tags/pull-block-2020-03-11:
block/block-copy: hide structure definitions
block/block-copy: reduce intersecting request lock
block/block-copy: rename start to offset in interfaces
block/block-copy: refactor interfaces to use bytes instead of end
block/block-copy: factor out find_conflicting_inflight_req
block/block-copy: use block_status
block/block-copy: specialcase first copy_range request
block/block-copy: fix progress calculation
job: refactor progress to separate object
block/qcow2-threads: fix qcow2_decompress
qemu-img: free memory before re-assign
block/qcow2: do free crypto_opts in qcow2_close()
iotests: Fix nonportable use of od --endian
block/curl: HTTP header field names are case insensitive
block/curl: HTTP header fields allow whitespace around values
iotests: add 288 luks qemu-img measure test
qemu-img: allow qemu-img measure --object without a filename
luks: implement .bdrv_measure()
luks: extract qcrypto_block_calculate_payload_offset()
Currently, block_copy operation lock the whole requested region. But
there is no reason to lock clusters, which are already copied, it will
disturb other parallel block_copy requests for no reason.
Let's instead do the following:
Lock only sub-region, which we are going to operate on. Then, after
copying all dirty sub-regions, we should wait for intersecting
requests block-copy, if they failed, we should retry these new dirty
clusters.
block/block-copy: refactor interfaces to use bytes instead of end
We have a lot of "chunk_end - start" invocations, let's switch to
bytes/cur_bytes scheme instead.
While being here, improve check on block_copy_do_copy parameters to not
overflow when calculating nbytes and use int64_t for bytes in
block_copy for consistency.
Use bdrv_block_status_above to chose effective chunk size and to handle
zeroes effectively.
This substitutes checking for just being allocated or not, and drops
old code path for it. Assistance by backup job is dropped too, as
caching block-status information is more difficult than just caching
is-allocated information in our dirty bitmap, and backup job is not
good place for this caching anyway.
block/block-copy: specialcase first copy_range request
In block_copy_do_copy we fallback to read+write if copy_range failed.
In this case copy_size is larger than defined for buffered IO, and
there is corresponding commit. Still, backup copies data cluster by
cluster, and most of requests are limited to one cluster anyway, so the
only source of this one bad-limited request is copy-before-write
operation.
Further patch will move backup to use block_copy directly, than for
cases where copy_range is not supported, first request will be
oversized in each backup. It's not good, let's change it now.
Fix is simple: just limit first copy_range request like buffer-based
request. If it succeed, set larger copy_range limit.
Assume we have two regions, A and B, and region B is in-flight now,
region A is not yet touched, but it is unallocated and should be
skipped.
Correspondingly, as progress we have
total = A + B
current = 0
If we reset unallocated region A and call progress_reset_callback,
it will calculate 0 bytes dirty in the bitmap and call
job_progress_set_remaining, which will set
total = current + 0 = 0 + 0 = 0
So, B bytes are actually removed from total accounting. When job
finishes we'll have
total = 0
current = B
, which doesn't sound good.
This is because we didn't considered in-flight bytes, actually when
calculating remaining, we should have set (in_flight + dirty_bytes)
as remaining, not only dirty_bytes.
To fix it, let's refactor progress calculation, moving it to block-copy
itself instead of fixing callback. And, of course, track in_flight
bytes count.
We still have to keep one callback, to maintain backup job bytes_read
calculation, but it will go on soon, when we turn the whole backup
process into one block_copy call.
On success path we return what inflate() returns instead of 0. And it
most probably works for Z_STREAM_END as it is positive, but is
definitely broken for Z_BUF_ERROR.
While being here, switch to errno return code, to be closer to
qcow2_compress API (and usual expectations).
Revert condition in if to be more positive. Drop dead initialization of
ret.
Pan Nengyuan [Thu, 27 Feb 2020 01:29:50 +0000 (09:29 +0800)]
qemu-img: free memory before re-assign
collect_image_check() is called twice in img_check(), the filename/format will be alloced without free the original memory.
It is not a big deal since the process will exit anyway, but seems like a clean code and it will remove the warning spotted by asan.
Pan Nengyuan [Thu, 27 Feb 2020 01:29:49 +0000 (09:29 +0800)]
block/qcow2: do free crypto_opts in qcow2_close()
'crypto_opts' forgot to free in qcow2_close(), this patch fix the bellow leak stack:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7f0edd81f970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f0edc6d149d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55d7eaede63d in qobject_input_start_struct /mnt/sdb/qemu-new/qemu_test/qemu/qapi/qobject-input-visitor.c:295
#3 0x55d7eaed78b8 in visit_start_struct /mnt/sdb/qemu-new/qemu_test/qemu/qapi/qapi-visit-core.c:49
#4 0x55d7eaf5140b in visit_type_QCryptoBlockOpenOptions qapi/qapi-visit-crypto.c:290
#5 0x55d7eae43af3 in block_crypto_open_opts_init /mnt/sdb/qemu-new/qemu_test/qemu/block/crypto.c:163
#6 0x55d7eacd2924 in qcow2_update_options_prepare /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1148
#7 0x55d7eacd33f7 in qcow2_update_options /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1232
#8 0x55d7eacd9680 in qcow2_do_open /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1512
#9 0x55d7eacdc55e in qcow2_open_entry /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1792
#10 0x55d7eacdc8fe in qcow2_open /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1819
#11 0x55d7eac3742d in bdrv_open_driver /mnt/sdb/qemu-new/qemu_test/qemu/block.c:1317
#12 0x55d7eac3e990 in bdrv_open_common /mnt/sdb/qemu-new/qemu_test/qemu/block.c:1575
#13 0x55d7eac4442c in bdrv_open_inherit /mnt/sdb/qemu-new/qemu_test/qemu/block.c:3126
#14 0x55d7eac45c3f in bdrv_open /mnt/sdb/qemu-new/qemu_test/qemu/block.c:3219
#15 0x55d7ead8e8a4 in blk_new_open /mnt/sdb/qemu-new/qemu_test/qemu/block/block-backend.c:397
#16 0x55d7eacde74c in qcow2_co_create /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:3534
#17 0x55d7eacdfa6d in qcow2_co_create_opts /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:3668
#18 0x55d7eac1c678 in bdrv_create_co_entry /mnt/sdb/qemu-new/qemu_test/qemu/block.c:485
#19 0x55d7eb0024d2 in coroutine_trampoline /mnt/sdb/qemu-new/qemu_test/qemu/util/coroutine-ucontext.c:115
Eric Blake [Wed, 26 Feb 2020 12:54:24 +0000 (06:54 -0600)]
iotests: Fix nonportable use of od --endian
Tests 261 and 272 fail on RHEL 7 with coreutils 8.22, since od
--endian was not added until coreutils 8.23. Fix this by manually
constructing the final value one byte at a time.
Stefan Hajnoczi [Fri, 21 Feb 2020 11:25:21 +0000 (11:25 +0000)]
qemu-img: allow qemu-img measure --object without a filename
In most qemu-img sub-commands the --object option only makes sense when
there is a filename. qemu-img measure is an exception because objects
may be referenced from the image creation options instead of an existing
image file. Allow --object without a filename.
The qcow2 .bdrv_measure() code calculates the crypto payload offset.
This logic really belongs in crypto/block.c where it can be reused by
other image formats.
The "luks" block driver will need this same logic in order to implement
.bdrv_measure(), so extract the qcrypto_block_calculate_payload_offset()
function now.
Peter Maydell [Tue, 10 Mar 2020 16:50:28 +0000 (16:50 +0000)]
Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20200310' into staging
s390x/ipl: Fixes for ipl and bios
- provide a pointer to the loadparm. This fixes crashes in zipl
- do not throw away guest changes of the IPL parameter during reset
- refactor IPLB checks
* remotes/borntraeger/tags/s390x-20200310:
s390x: ipl: Consolidate iplb validity check into one function
s390/ipl: sync back loadparm
s390x/bios: rebuild s390-ccw.img
pc-bios: s390x: Save iplb location in lowcore
Greg Kurz [Tue, 10 Mar 2020 15:12:49 +0000 (16:12 +0100)]
9p/proxy: Fix export_flags
The common fsdev options are set by qemu_fsdev_add() before it calls
the backend specific option parsing code. In the case of "proxy" this
means "writeout" or "readonly" were simply ignored. This has been
broken from the beginning.
Halil Pasic [Mon, 9 Mar 2020 13:32:23 +0000 (14:32 +0100)]
s390/ipl: sync back loadparm
We expose loadparm as a r/w machine property, but if loadparm is set by
the guest via DIAG 308, we don't update the property. Having a
disconnect between the guest view and the QEMU property is not nice in
itself, but things get even worse for SCSI, where under certain
circumstances (see 789b5a401b "s390: Ensure IPL from SCSI works as
expected" for details) we call s390_gen_initial_iplb() on resets
effectively overwriting the guest/user supplied loadparm with the stale
value.
Janosch Frank [Wed, 4 Mar 2020 11:42:31 +0000 (06:42 -0500)]
pc-bios: s390x: Save iplb location in lowcore
The POP states that for a list directed IPL the IPLB is stored into
memory by the machine loader and its address is stored at offset 0x14
of the lowcore.
ZIPL currently uses the address in offset 0x14 to access the IPLB and
acquire flags about secure boot. If the IPLB address points into
memory which has an unsupported mix of flags set, ZIPL will panic
instead of booting the OS.
As the lowcore can have quite a high entropy for a guest that did drop
out of protected mode (i.e. rebooted) we encountered the ZIPL panic
quite often.
Peter Maydell [Mon, 9 Mar 2020 19:49:53 +0000 (19:49 +0000)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20200309' into staging
HMP Pull 2020-03-09
Maxim's hmp block move, Thomas's deprecation in hostfwd.
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
# gpg: Signature made Mon 09 Mar 2020 19:43:33 GMT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-hmp-20200309:
net: Remove deprecated [hub_id name] tuple of 'hostfwd_add' / 'hostfwd_remove'
monitor/hmp: Move hmp_drive_add_node to block-hmp-cmds.c
monitor/hmp: move hmp_info_block* to block-hmp-cmds.c
monitor/hmp: move remaining hmp_block* functions to block-hmp-cmds.c
monitor/hmp: move hmp_nbd_server* to block-hmp-cmds.c
monitor/hmp: move hmp_snapshot_* to block-hmp-cmds.c
monitor/hmp: move hmp_block_job* to block-hmp-cmds.c
monitor/hmp: move hmp_drive_mirror and hmp_drive_backup to block-hmp-cmds.c
monitor/hmp: move hmp_drive_del and hmp_commit to block-hmp-cmds.c
monitor/hmp: rename device-hotplug.c to block/monitor/block-hmp-cmds.c
monitor/hmp: inline add_init_drive
usb/dev-storage: remove unused include
Stefan Hajnoczi [Thu, 5 Mar 2020 17:08:06 +0000 (17:08 +0000)]
aio-posix: remove idle poll handlers to improve scalability
When there are many poll handlers it's likely that some of them are idle
most of the time. Remove handlers that haven't had activity recently so
that the polling loop scales better for guests with a large number of
devices.
This feature only takes effect for the Linux io_uring fd monitoring
implementation because it is capable of combining fd monitoring with
userspace polling. The other implementations can't do that and risk
starving fds in favor of poll handlers, so don't try this optimization
when they are in use.
IOPS improves from 10k to 105k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device for rw=randread,iodepth=1,bs=4k,ioengine=libaio on NVMe.
[Clarified aio_poll_handlers locking discipline explanation in comment
after discussion with Paolo Bonzini <[email protected]>.
--Stefan]
Peter Maydell [Fri, 6 Mar 2020 13:47:51 +0000 (13:47 +0000)]
qemu.nsi: Install Sphinx documentation
The old qemu-doc.html is no longer built, so update the Windows
installer to install the new Sphinx manual sets.
We install all five of the manuals, even though some of them
(notably the user-mode manual) will not be very useful to Windows
users, because skipping some of them would mean broken links
in the top level 'index.html' page.
Stefan Hajnoczi [Thu, 5 Mar 2020 17:08:05 +0000 (17:08 +0000)]
aio-posix: support userspace polling of fd monitoring
Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace. Previously userspace polling was only allowed when all
AioHandler's had an ->io_poll() callback. This prevented starvation of
fds by userspace pollable handlers.
Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().
For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.
The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(2). Details are given in the source code.
Add an io_uring implementation and make it the default on Linux.
Performance is the same as with epoll(7) but later patches add
optimizations that take advantage of io_uring.
It is necessary to change how aio_set_fd_handler() deals with deleting
AioHandlers since removing monitored file descriptors is asynchronous in
io_uring. fdmon_io_uring_remove() marks the AioHandler deleted and
aio_set_fd_handler() will let it handle deletion in that case.
Stefan Hajnoczi [Thu, 5 Mar 2020 17:08:03 +0000 (17:08 +0000)]
aio-posix: simplify FDMonOps->update() prototype
The AioHandler *node, bool is_new arguments are more complicated to
think about than simply being given AioHandler *old_node, AioHandler
*new_node.
Furthermore, the new Linux io_uring file descriptor monitoring mechanism
added by the new patch requires access to both the old and the new
nodes. Make this change now in preparation.
Stefan Hajnoczi [Thu, 5 Mar 2020 17:08:02 +0000 (17:08 +0000)]
aio-posix: extract ppoll(2) and epoll(7) fd monitoring
The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code. Before adding another
implementation for Linux io_uring, extract out the existing
ones so there is a clear interface and the core code is simpler.
The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps
struct. See the patch for details.
Semantic changes:
1. ppoll(2) now reflects events from pollfds[] back into AioHandlers
while we're still on the clock for adaptive polling. This was
already happening for epoll(7), so if it's really an issue then we'll
need to fix both in the future.
2. epoll(7)'s fallback to ppoll(2) while external events are disabled
was broken when the number of fds exceeded the epoll(7) upgrade
threshold. I guess this code path simply wasn't tested and no one
noticed the bug. I didn't go out of my way to fix it but the correct
code is simpler than preserving the bug.
I also took some liberties in removing the unnecessary
AioContext->epoll_available (just check AioContext->epollfd != -1
instead) and AioContext->epoll_enabled (it's implicit if our
AioContext->fdmon_ops callbacks are being invoked) fields.
Stefan Hajnoczi [Thu, 5 Mar 2020 17:08:01 +0000 (17:08 +0000)]
aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
Now that run_poll_handlers_once() is only called by run_poll_handlers()
we can improve the CPU time profile by moving the expensive
RCU_READ_LOCK() out of the polling loop.
This reduces the run_poll_handlers() from 40% CPU to 10% CPU in perf's
sampling profiler output.
Stefan Hajnoczi [Thu, 5 Mar 2020 17:08:00 +0000 (17:08 +0000)]
aio-posix: completely stop polling when disabled
One iteration of polling is always performed even when polling is
disabled. This is done because:
1. Userspace polling is cheaper than making a syscall. We might get
lucky.
2. We must poll once more after polling has stopped in case an event
occurred while stopping polling.
However, there are downsides:
1. Polling becomes a bottleneck when the number of event sources is very
high. It's more efficient to monitor fds in that case.
2. A high-frequency polling event source can starve non-polling event
sources because ppoll(2)/epoll(7) is never invoked.
This patch removes the forced polling iteration so that poll_ns=0 really
means no polling.
IOPS increases from 10k to 60k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device because the large number of event sources being polled slows down
the event loop.
Stefan Hajnoczi [Mon, 24 Feb 2020 10:34:06 +0000 (10:34 +0000)]
aio-posix: remove confusing QLIST_SAFE_REMOVE()
QLIST_SAFE_REMOVE() is confusing here because the node must be on the
list. We actually just wanted to clear the linked list pointers when
removing it from the list. QLIST_REMOVE() now does this, so switch to
it.
Stefan Hajnoczi [Mon, 24 Feb 2020 10:34:05 +0000 (10:34 +0000)]
qemu/queue.h: clear linked list pointers on remove
Do not leave stale linked list pointers around after removal. It's
safer to set them to NULL so that use-after-removal results in an
immediate segfault.
The RCU queue removal macros are unchanged since nodes may still be
traversed after removal.
Peter Maydell [Mon, 9 Mar 2020 15:35:58 +0000 (15:35 +0000)]
Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-pull-request' into staging
- includes cleanup
- reduce .data footprint
- fix warnings reported by Clang static code analyzer
- fix dp8393x part lost in merge
- update git.orderfile and rules.mak
* remotes/vivier2/tags/trivial-branch-pull-request: (33 commits)
monitor/hmp-cmds: Remove redundant statement in hmp_rocker_of_dpa_groups()
display/exynos4210_fimd: Remove redundant statement in exynos4210_fimd_update()
display/pxa2xx_lcd: Remove redundant statement in pxa2xx_palette_parse()
scsi/scsi-disk: Remove redundant statement in scsi_disk_emulate_command()
dma/xlnx-zdma: Remove redundant statement in zdma_write_dst()
block/file-posix: Remove redundant statement in raw_handle_perm_lock()
block/stream: Remove redundant statement in stream_run()
core/qdev: fix memleak in qdev_get_gpio_out_connector()
hw/i386/pc: Clean up includes
hw/pci-host/q35: Remove unused includes
hw/i386: Include "hw/mem/nvdimm.h"
hw/acpi: Include "hw/mem/nvdimm.h"
hw/pci-host/piix: Include "qemu/range.h"
hw/i2c/smbus_ich9: Include "qemu/range.h"
hw/pci-host/q35: Include "qemu/range.h"
hw/timer/hpet: Include "exec/address-spaces.h"
hw/acpi/cpu_hotplug: Include "hw/pci/pci.h"
hw/hppa/machine: Include "net/net.h"
hw/alpha/dp264: Include "net/net.h"
hw/alpha/alpha_sys: Remove unused "hw/ide.h" header
...
Chen Qun [Mon, 2 Mar 2020 13:07:08 +0000 (21:07 +0800)]
scsi/scsi-disk: Remove redundant statement in scsi_disk_emulate_command()
Clang static code analyzer show warning:
scsi/scsi-disk.c:1918:5: warning: Value stored to 'buflen' is never read
buflen = req->cmd.xfer;
^ ~~~~~~~~~~~~~
Chen Qun [Mon, 2 Mar 2020 13:07:12 +0000 (21:07 +0800)]
dma/xlnx-zdma: Remove redundant statement in zdma_write_dst()
Clang static code analyzer show warning:
hw/dma/xlnx-zdma.c:399:13: warning: Value stored to 'dst_type' is never read
dst_type = FIELD_EX32(s->dsc_dst.words[3], ZDMA_CH_DST_DSCR_WORD3,
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All this files use methods/definitions declared in the NVDIMM
device header. Include it.
This fixes (when modifying unrelated headers):
hw/i386/acpi-build.c:2733:9: error: implicit declaration of function 'nvdimm_build_acpi' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
nvdimm_build_acpi(table_offsets, tables_blob, tables->linker,
^
hw/i386/pc.c:1996:61: error: use of undeclared identifier 'TYPE_NVDIMM'
const bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
^
hw/i386/pc.c:2032:55: error: use of undeclared identifier 'TYPE_NVDIMM'
bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM);
^
hw/i386/pc.c:2040:9: error: implicit declaration of function 'nvdimm_plug' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
nvdimm_plug(ms->nvdimms_state);
^
hw/i386/pc.c:2040:9: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
nvdimm_plug(ms->nvdimms_state);
^
hw/i386/pc.c:2065:42: error: use of undeclared identifier 'TYPE_NVDIMM'
if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
^
hw/i386/pc_i440fx.c:307:9: error: implicit declaration of function 'nvdimm_init_acpi_state' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
^
hw/i386/pc_q35.c:332:9: error: implicit declaration of function 'nvdimm_init_acpi_state' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
^