* remotes/stefanha/tags/tracing-pull-request:
trace: Forbid event format ending with newline character
trace: Remove trailing newline in events
loader: Trace loaded images
Peter Maydell [Thu, 19 Sep 2019 10:14:28 +0000 (11:14 +0100)]
Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.2-sf1-v3' into staging
RISC-V Patches for the 4.2 Soft Freeze, Part 1, v3
This contains quite a few patches that I'd like to target for 4.2.
They're mostly emulation fixes for the sifive_u board, which now much
more closely matches the hardware and can therefor run the same fireware
as what gets loaded onto the board. Additional user-visible
improvements include:
* support for loading initrd files from the command line into Linux, via
/chosen/linux,initrd-{start,end} device tree nodes.
* The conversion of LOG_TRACE to trace events.
* The addition of clock DT nodes for our uart and ethernet.
This also includes some preliminary work for the H extension patches,
but does not include the H extension patches as I haven't had time to
review them yet.
This passes my OE boot test on 32-bit and 64-bit virt machines, as well
as a 64-bit upstream Linux boot on the sifive_u machine. It has been
fixed to actually pass "make check" this time.
Changes since v2 (never made it to the list):
* Sets the sifive_u machine default core count to 2 instead of 5.
* Sets the sifive_u machine default core count to 5 instead of 1, as
it's impossible to have a single core sifive_u machine.
# gpg: Signature made Tue 17 Sep 2019 16:43:30 BST
# gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
# gpg: issuer "[email protected]"
# gpg: Good signature from "Palmer Dabbelt <[email protected]>" [unknown]
# gpg: aka "Palmer Dabbelt <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
* remotes/palmer/tags/riscv-for-master-4.2-sf1-v3: (48 commits)
gdbstub: riscv: fix the fflags registers
target/riscv: Use TB_FLAGS_MSTATUS_FS for floating point
target/riscv: Fix mstatus dirty mask
target/riscv: Use both register name and ABI name
riscv: sifive_u: Update model and compatible strings in device tree
riscv: sifive_u: Remove handcrafted clock nodes for UART and ethernet
riscv: sifive_u: Fix broken GEM support
riscv: sifive_u: Instantiate OTP memory with a serial number
riscv: sifive: Implement a model for SiFive FU540 OTP
riscv: roms: Update default bios for sifive_u machine
riscv: sifive_u: Change UART node name in device tree
riscv: sifive_u: Update UART base addresses and IRQs
riscv: sifive_u: Reference PRCI clocks in UART and ethernet nodes
riscv: sifive_u: Add PRCI block to the SoC
riscv: sifive_u: Generate hfclk and rtcclk nodes
riscv: sifive: Implement PRCI model for FU540
riscv: sifive_u: Update PLIC hart topology configuration string
riscv: sifive_u: Update hart configuration to reflect the real FU540 SoC
riscv: sifive_u: Set the minimum number of cpus to 2
riscv: hart: Add a "hartid-base" property to RISC-V hart array
...
trace: Forbid event format ending with newline character
Event format ending with newlines confuse the trace reports.
Forbid them.
Add a check to refuse new format added with trailing newline:
$ make
[...]
GEN hw/misc/trace.h
Traceback (most recent call last):
File "scripts/tracetool.py", line 152, in <module>
main(sys.argv)
File "scripts/tracetool.py", line 143, in main
events.extend(tracetool.read_events(fh, arg))
File "scripts/tracetool/__init__.py", line 367, in read_events
event = Event.build(line)
File "scripts/tracetool/__init__.py", line 281, in build
raise ValueError("Event format can not end with a newline character")
ValueError: Error at hw/misc/trace-events:121: Event format can not end with a newline character
While the tracing framework does not forbid trailing newline in
events format string, using them lead to confuse output.
It is the responsibility of the backend to properly end an event
line.
Some of our formats have trailing newlines, remove them.
[Fixed typo in commit description reported by Eric Blake
<[email protected]>
--Stefan]
KONRAD Frederic [Tue, 10 Sep 2019 08:15:41 +0000 (10:15 +0200)]
gdbstub: riscv: fix the fflags registers
While debugging an application with GDB the following might happen:
(gdb) return
Make xxx return now? (y or n) y
Could not fetch register "fflags"; remote failure reply 'E14'
This is because riscv_gdb_get_fpu calls riscv_csrrw_debug with a wrong csr
number (8). It should use the csr_register_map in order to reach the
riscv_cpu_get_fflags callback.
Bin Meng [Fri, 6 Sep 2019 16:20:18 +0000 (09:20 -0700)]
riscv: sifive_u: Remove handcrafted clock nodes for UART and ethernet
In the past we did not have a model for PRCI, hence two handcrafted
clock nodes ("/soc/ethclk" and "/soc/uartclk") were created for the
purpose of supplying hard-coded clock frequencies. But now since we
have added the PRCI support in QEMU, we don't need them any more.
Bin Meng [Fri, 6 Sep 2019 16:20:17 +0000 (09:20 -0700)]
riscv: sifive_u: Fix broken GEM support
At present the GEM support in sifive_u machine is seriously broken.
The GEM block register base was set to a weird number (0x100900FC),
which for no way could work with the cadence_gem model in QEMU.
Not like other GEM variants, the FU540-specific GEM has a management
block to control 10/100/1000Mbps link speed changes, that is mapped
to 0x100a0000. We can simply map it into MMIO space without special
handling using create_unimplemented_device().
Update the GEM node compatible string to use the official name used
by the upstream Linux kernel, and add the management block reg base
& size to the <reg> property encoding.
Tested with upstream U-Boot and Linux kernel MACB drivers.
Bin Meng [Fri, 6 Sep 2019 16:20:16 +0000 (09:20 -0700)]
riscv: sifive_u: Instantiate OTP memory with a serial number
This adds an OTP memory with a given serial number to the sifive_u
machine. With such support, the upstream U-Boot for sifive_fu540
boots out of the box on the sifive_u machine.
Bin Meng [Fri, 6 Sep 2019 16:20:15 +0000 (09:20 -0700)]
riscv: sifive: Implement a model for SiFive FU540 OTP
This implements a simple model for SiFive FU540 OTP (One-Time
Programmable) Memory interface, primarily for reading out the
stored serial number from the first 1 KiB of the 16 KiB OTP
memory reserved by SiFive for internal use.
Bin Meng [Fri, 6 Sep 2019 16:20:14 +0000 (09:20 -0700)]
riscv: roms: Update default bios for sifive_u machine
With the support of heterogeneous harts and PRCI model, it's now
possible to use the OpenSBI image (PLATFORM=sifive/fu540) built
for the real hardware.
Bin Meng [Fri, 6 Sep 2019 16:20:13 +0000 (09:20 -0700)]
riscv: sifive_u: Change UART node name in device tree
OpenSBI for fu540 does DT fix up (see fu540_modify_dt()) by updating
chosen "stdout-path" to point to "/soc/serial@...", and U-Boot will
use this information to locate the serial node and probe its driver.
However currently we generate the UART node name as "/soc/uart@...",
causing U-Boot fail to find the serial node in DT.
Bin Meng [Fri, 6 Sep 2019 16:20:11 +0000 (09:20 -0700)]
riscv: sifive_u: Reference PRCI clocks in UART and ethernet nodes
Now that we have added a PRCI node, update existing UART and ethernet
nodes to reference PRCI as their clock sources, to keep in sync with
the Linux kernel device tree.
Bin Meng [Fri, 6 Sep 2019 16:20:06 +0000 (09:20 -0700)]
riscv: sifive_u: Update hart configuration to reflect the real FU540 SoC
The FU540-C000 includes a 64-bit E51 RISC-V core and four 64-bit U54
RISC-V cores. Currently the sifive_u machine only populates 4 U54
cores. Update the max cpu number to 5 to reflect the real hardware,
by creating 2 CPU clusters as containers for RISC-V hart arrays to
populate heterogeneous harts.
The cpu nodes in the generated DTS have been updated as well.
Bin Meng [Fri, 6 Sep 2019 16:20:04 +0000 (09:20 -0700)]
riscv: hart: Add a "hartid-base" property to RISC-V hart array
At present each hart's hartid in a RISC-V hart array is assigned
the same value of its index in the hart array. But for a system
that has multiple hart arrays, this is not the case any more.
Add a new "hartid-base" property so that hartid number can be
assigned based on the property value.
Bin Meng [Fri, 6 Sep 2019 16:20:03 +0000 (09:20 -0700)]
riscv: hart: Extract hart realize to a separate routine
Currently riscv_harts_realize() creates all harts based on the
same cpu type given in the hart array property. With current
implementation it can only create homogeneous harts. Exact the
hart realize to a separate routine in preparation for supporting
multiple hart arrays.
Note the file header says the RISC-V hart array holds the state
of a heterogeneous array of RISC-V harts, which is not true.
Update the comment to mention homogeneous array of RISC-V harts.
Bin Meng [Fri, 6 Sep 2019 16:19:48 +0000 (09:19 -0700)]
riscv: hw: Remove duplicated "hw/hw.h" inclusion
Commit a27bd6c779ba ("Include hw/qdev-properties.h less") wrongly
added "hw/hw.h" to sifive_prci.c and sifive_test.c.
Another inclusion of "hw/hw.h" was later added via
commit 650d103d3ea9 ("Include hw/hw.h exactly where needed"), that
resulted in duplicated inclusion of "hw/hw.h".
Bin Meng [Wed, 14 Aug 2019 15:33:32 +0000 (08:33 -0700)]
riscv: hmp: Add a command to show virtual memory mappings
This adds 'info mem' command for RISC-V, to show virtual memory
mappings that aids debugging.
Rather than showing every valid PTE, the command compacts the
output by merging all contiguous physical address mappings into
one block and only shows the merged block mapping details.
Bin Meng [Fri, 16 Aug 2019 13:09:36 +0000 (06:09 -0700)]
riscv: Resolve full path of the given bios image
At present when "-bios image" is supplied, we just use the straight
path without searching for the configured data directories. Like
"-bios default", we add the same logic so that "-L" actually works.
Bin Meng [Thu, 8 Aug 2019 02:49:30 +0000 (19:49 -0700)]
riscv: rv32: Root page table address can be larger than 32-bit
For RV32, the root page table's PPN has 22 bits hence its address
bits could be larger than the maximum bits that target_ulong is
able to represent. Use hwaddr instead.
Alistair Francis [Tue, 30 Jul 2019 23:35:24 +0000 (16:35 -0700)]
target/riscv: Create function to test if FP is enabled
Let's create a function that tests if floating point support is
enabled. We can then protect all floating point operations based on if
they are enabled.
This patch so far doesn't change anything, it's just preparing for the
Hypervisor support for floating point operations.
riscv: sivive_u: Add dummy serial clock and aliases entry for uart
The riscv uart needs valid clocks. This requires a refereence
to the clock node. Since the SOC clock is not emulated by qemu,
add a reference to a fixed clock instead. The clock-frequency
entry in the uart node does not seem to be necessary, so drop it.
In addition to a reference to the clock, the driver also needs
an aliases entry for the serial node. Add it as well.
Without this patch, the serial driver fails to instantiate with
the following error message.
sifive-serial 10013000.uart: unable to find controller clock
sifive-serial: probe of 10013000.uart failed with error -2
Li Qiang [Sat, 31 Aug 2019 15:39:22 +0000 (08:39 -0700)]
vnc: fix memory leak when vnc disconnect
Currently when qemu receives a vnc connect, it creates a 'VncState' to
represent this connection. In 'vnc_worker_thread_loop' it creates a
local 'VncState'. The connection 'VcnState' and local 'VncState' exchange
data in 'vnc_async_encoding_start' and 'vnc_async_encoding_end'.
In 'zrle_compress_data' it calls 'deflateInit2' to allocate the libz library
opaque data. The 'VncState' used in 'zrle_compress_data' is the local
'VncState'. In 'vnc_zrle_clear' it calls 'deflateEnd' to free the libz
library opaque data. The 'VncState' used in 'vnc_zrle_clear' is the connection
'VncState'. In currently implementation there will be a memory leak when the
vnc disconnect. Following is the asan output backtrack:
Direct leak of 29760 byte(s) in 5 object(s) allocated from:
0 0xffffa67ef3c3 in __interceptor_calloc (/lib64/libasan.so.4+0xd33c3)
1 0xffffa65071cb in g_malloc0 (/lib64/libglib-2.0.so.0+0x571cb)
2 0xffffa5e968f7 in deflateInit2_ (/lib64/libz.so.1+0x78f7)
3 0xaaaacec58613 in zrle_compress_data ui/vnc-enc-zrle.c:87
4 0xaaaacec58613 in zrle_send_framebuffer_update ui/vnc-enc-zrle.c:344
5 0xaaaacec34e77 in vnc_send_framebuffer_update ui/vnc.c:919
6 0xaaaacec5e023 in vnc_worker_thread_loop ui/vnc-jobs.c:271
7 0xaaaacec5e5e7 in vnc_worker_thread ui/vnc-jobs.c:340
8 0xaaaacee4d3c3 in qemu_thread_start util/qemu-thread-posix.c:502
9 0xffffa544e8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
10 0xffffa53965cb in thread_start (/lib64/libc.so.6+0xd55cb)
This is because the opaque allocated in 'deflateInit2' is not freed in
'deflateEnd'. The reason is that the 'deflateEnd' calls 'deflateStateCheck'
and in the latter will check whether 's->strm != strm'(libz's data structure).
This check will be true so in 'deflateEnd' it just return 'Z_STREAM_ERROR' and
not free the data allocated in 'deflateInit2'.
The reason this happens is that the 'VncState' contains the whole 'VncZrle',
so when calling 'deflateInit2', the 's->strm' will be the local address.
So 's->strm != strm' will be true.
To fix this issue, we need to make 'zrle' of 'VncState' to be a pointer.
Then the connection 'VncState' and local 'VncState' exchange mechanism will
work as expection. The 'tight' of 'VncState' has the same issue, let's also turn
it to a pointer.
When the mouse will move out of the screen of the local host on
the right, the mouse and the keyboard will be grabbed and all
related events will be send to the guest OS.
This is usefull when qemu is configured without emulated graphic card
but with a VFIO attached graphic card.
More information about Barrier can be found at:
https://github.com/debauchee/barrier
This avoids to install the Barrier server in the guest OS,
for instance when it is not supported or during the installation.
Fix egl_fb_read() to use the (destination) surface size instead of the
(source) framebuffer source for glReadPixels. Pass the DisplaySurface
instead of the pixeldata pointer to egl_fb_read() to make this possible.
With that in place framebuffer reads work fine even if the surface and
framebuffer sizes don't match, so we can remove the guest-triggerable
asserts in egl_scanout_flush().
Peter Maydell [Thu, 1 Aug 2019 18:30:12 +0000 (19:30 +0100)]
target/sparc: Switch to do_transaction_failed() hook
Switch the SPARC target from the old unassigned_access hook to the
new do_transaction_failed hook.
This will cause the "if transaction failed" code paths added in
the previous commits to become active if the access is to an
unassigned address. In particular we'll now handle bus errors
during page table walks correctly (generating a translation
error with the right kind of fault status).
Peter Maydell [Thu, 1 Aug 2019 18:30:10 +0000 (19:30 +0100)]
target/sparc: Handle bus errors in mmu_probe()
Convert the mmu_probe() function to using address_space_ldl()
rather than ldl_phys(), so we can explicitly detect memory
transaction failures.
This makes no practical difference at the moment, because
ldl_phys() will return 0 on a transaction failure, and we
treat transaction failures and 0 PDEs identically. However
the spec says that MMU probe operations are supposed to
update the fault status registers, and if we ever implement
that we'll want to distinguish the difference. For the
moment, just add a TODO comment about the bug.
Peter Maydell [Thu, 1 Aug 2019 18:30:09 +0000 (19:30 +0100)]
target/sparc: Correctly handle bus errors in page table walks
Currently we use the ldl_phys() function to read page table entries.
With the unassigned_access hook in place, if these hit an unassigned
area of memory then the hook will cause us to wrongly generate
an exception with a fault address matching the address of the
page table entry.
Change to using address_space_ldl() so we can detect and correctly
handle bus errors and give them their correct behaviour of
causing a translation error with a suitable fault status register.
Note that this won't actually take effect until we switch the
over to using the do_translation_failed hook.
Peter Maydell [Thu, 1 Aug 2019 18:30:08 +0000 (19:30 +0100)]
target/sparc: Check for transaction failures in MXCC stream ASI accesses
Currently the ld/st_asi helper functions make calls to the
ld*_phys() and st*_phys() functions for those ASIs which
imply direct accesses to physical addresses. These implicitly
rely on the unassigned_access hook to cause them to generate
an MMU fault if the access fails.
Switch to using the address_space_* functions instead, which
return a MemTxResult that we can check. This means that when
we switch SPARC over to using the do_transaction_failed hook
we'll still get the same MMU faults we did before.
This commit converts the ASIs which do MXCC stream source
and destination accesses.
It's not clear to me whether raising an MMU fault like this
is the correct behaviour if we encounter a bus error, but
we retain the same behaviour that the old unassigned_access
hook would implement.
Peter Maydell [Thu, 1 Aug 2019 18:30:07 +0000 (19:30 +0100)]
target/sparc: Check for transaction failures in MMU passthrough ASIs
Currently the ld/st_asi helper functions make calls to the
ld*_phys() and st*_phys() functions for those ASIs which
imply direct accesses to physical addresses. These implicitly
rely on the unassigned_access hook to cause them to generate
an MMU fault if the access fails.
Switch to using the address_space_* functions instead, which
return a MemTxResult that we can check. This means that when
we switch SPARC over to using the do_transaction_failed hook
we'll still get the same MMU faults we did before.
This commit converts the ASIs which do "MMU passthrough".
Peter Maydell [Thu, 1 Aug 2019 18:30:06 +0000 (19:30 +0100)]
target/sparc: Factor out the body of sparc_cpu_unassigned_access()
Currently the SPARC target uses the old-style do_unassigned_access
hook. We want to switch it over to do_transaction_failed, but to do
this we must first remove all the direct calls in ldst_helper.c to
cpu_unassigned_access(). Factor out the body of the hook function's
code into a new sparc_raise_mmu_fault() and call it from the hook and
from the various places that used to call cpu_unassigned_access().
In passing, this fixes a bug where the code that raised the
MMU exception was directly calling GETPC() from a function that
was several levels deep in the callstack from the original
helper function: the new sparc_raise_mmu_fault() instead takes
the return address as an argument.
Other than the use of retaddr rather than GETPC() and a comment
format fixup, the body of the new function has no changes from
that of the old hook function.
* remotes/bonzini/tags/for-upstream: (29 commits)
hw/i386/pc: Extract the x86 generic fw_cfg code
hw/i386/pc: Rename pc_build_feature_control() as generic fw_cfg_build_*
hw/i386/pc: Let pc_build_feature_control() take a MachineState argument
hw/i386/pc: Let pc_build_feature_control() take a FWCfgState argument
hw/i386/pc: Rename pc_build_smbios() as generic fw_cfg_build_smbios()
hw/i386/pc: Let pc_build_smbios() take a generic MachineState argument
hw/i386/pc: Let pc_build_smbios() take a FWCfgState argument
hw/i386/pc: Replace PCMachineState argument with MachineState in fw_cfg_arch_create
hw/i386/pc: Pass the CPUArchIdList array by argument
hw/i386/pc: Pass the apic_id_limit value by argument
hw/i386/pc: Pass the boot_cpus value by argument
hw/i386/pc: Rename bochs_bios_init as more generic fw_cfg_arch_create
hw/i386/pc: Use address_space_memory in place
hw/i386/pc: Extract e820 memory layout code
hw/i386/pc: Use e820_get_num_entries() to access e820_entries
cpus: Fix throttling during vm_stop
qemu-thread: Add qemu_cond_timedwait
memory: inline and optimize devend_memop
memory: fetch pmem size in get_file_size()
elf-ops.h: fix int overflow in load_elf()
...
virtio-mmio: implement modern (v2) personality (virtio-1)
Implement the modern (v2) personality, according to the VirtIO 1.0
specification.
Support for v2 among guests is not as widespread as it'd be
desirable. While the Linux driver has had it for a while, support is
missing, at least, from Tianocore EDK II, NetBSD and FreeBSD.
For this reason, the v2 personality is disabled, keeping the legacy
behavior as default. Machine types willing to use v2, can enable it
using MachineClass's compat_props.
Extract all the functions that are not PC-machine specific into
the (arch-specific) fw_cfg.c file. This will allow other X86-machine
to reuse these functions.
Paolo Bonzini [Mon, 16 Sep 2019 10:42:42 +0000 (12:42 +0200)]
hw/i386/pc: Replace PCMachineState argument with MachineState in fw_cfg_arch_create
In the previous commit we removed the last access to PCMachineState.
Replace it with a generic MachineState argument and use it to retrieve
the CPUArchIdList.
hw/i386/pc: Rename bochs_bios_init as more generic fw_cfg_arch_create
The bochs_bios_init() function is not restricted to the Bochs
BIOS and is useful to other BIOS.
Since it is not specific to the PC machine, and can be reused
by other machines of the X86 architecture, rename it as
fw_cfg_arch_create().
Throttling thread sleeps in VCPU thread. For high throttle percentage
this sleep is more than 10ms. E.g. for 60% - 15ms, for 99% - 990ms.
vm_stop() kicks all VCPUs and waits for them. It's called at the end of
migration and because of the long sleep the migration downtime might be
more than 100ms even for downtime-limit 1ms.
Use qemu_cond_timedwait for high percentage to wake up during vm_stop.
The new function is needed to implement conditional sleep for CPU
throttling. It's possible to reuse qemu_sem_timedwait, but it's more
difficult than just add qemu_cond_timedwait.
Also moved compute_abs_deadline function up the code to reuse it in
qemu_cond_timedwait_impl win32.
Peter Maydell [Mon, 16 Sep 2019 14:25:55 +0000 (15:25 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-09-16' into staging
Block patches:
- Fix for block jobs when used with I/O threads
- Fix for a corruption when using qcow2's LUKS encryption mode
- cURL fix
- check-block.sh cleanups (for make check)
- Refactoring
* remotes/maxreitz/tags/pull-block-2019-09-16:
qemu-iotests: Add test for bz #1745922
block/qcow2: refactor encryption code
block/qcow2: Fix corruption introduced by commit 8ac0f15f335
blockjob: update nodes head while removing all bdrv
curl: Check curl_multi_add_handle()'s return code
curl: Handle success in multi_check_completion
curl: Report only ready sockets
curl: Pass CURLSocket to curl_multi_do()
curl: Check completion in curl_multi_do()
curl: Keep *socket until the end of curl_sock_cb()
curl: Keep pointer to the CURLState in CURLSocket
tests/qemu-iotests: Fix qemu-io related output in 026.out.nocache
tests/Makefile: Do not print the name of the check-block.sh shell script
tests/qemu-iotests/check: Replace "tests" with "iotests" in final status text
block: Remove unused masks
block: Use QEMU_IS_ALIGNED
* Change the qcow2_co_{encrypt|decrypt} to just receive full host and
guest offsets and use this function directly instead of calling
do_perform_cow_encrypt (which is removed by that patch).
* Adjust qcow2_co_encdec to take full host and guest offsets as well.
* Document the qcow2_co_{encrypt|decrypt} arguments
to prevent the bug fixed in former commit from hopefully
happening again.
The corruption happens when we do a write that
* writes to two or more unallocated clusters at once
* doesn't fully cover the first sector
* doesn't fully cover the last sector
* uses luks encryption
In this case, when allocating the new clusters we COW both areas
prior to the write and after the write, and we encrypt them.
The above mentioned commit accidentally made it so we encrypt the
second COW area using the physical cluster offset of the first area.
The problem is that offset_in_cluster in do_perform_cow_encrypt
can be larger that the cluster size, thus cluster_offset
will no longer point to the start of the cluster at which encrypted
area starts.
Next patch in this series will refactor the code to avoid all these
assumptions.
In the bugreport that was triggered by rebasing a luks image to new,
zero filled base, which lot of such writes, and causes some files
with zero areas to contain garbage there instead.
But as described above it can happen elsewhere as well
blockjob: update nodes head while removing all bdrv
block_job_remove_all_bdrv() iterates through job->nodes, calling
bdrv_root_unref_child() for each entry. The call to the latter may
reach child_job_[can_]set_aio_ctx(), which will also attempt to
traverse job->nodes, potentially finding entries that where freed
on previous iterations.
To avoid this situation, update job->nodes head on each iteration to
ensure that already freed entries are no longer linked to the list.
Max Reitz [Tue, 10 Sep 2019 12:41:35 +0000 (14:41 +0200)]
curl: Handle success in multi_check_completion
Background: As of cURL 7.59.0, it verifies that several functions are
not called from within a callback. Among these functions is
curl_multi_add_handle().
curl_read_cb() is a callback from cURL and not a coroutine. Waking up
acb->co will lead to entering it then and there, which means the current
request will settle and the caller (if it runs in the same coroutine)
may then issue the next request. In such a case, we will enter
curl_setup_preadv() effectively from within curl_read_cb().
Calling curl_multi_add_handle() will then fail and the new request will
not be processed.
Fix this by not letting curl_read_cb() wake up acb->co. Instead, leave
the whole business of settling the AIOCB objects to
curl_multi_check_completion() (which is called from our timer callback
and our FD handler, so not from any cURL callbacks).
Max Reitz [Tue, 10 Sep 2019 12:41:34 +0000 (14:41 +0200)]
curl: Report only ready sockets
Instead of reporting all sockets to cURL, only report the one that has
caused curl_multi_do_locked() to be called. This lets us get rid of the
QLIST_FOREACH_SAFE() list, which was actually wrong: SAFE foreaches are
only safe when the current element is removed in each iteration. If it
possible for the list to be concurrently modified, we cannot guarantee
that only the current element will be removed. Therefore, we must not
use QLIST_FOREACH_SAFE() here.
Max Reitz [Tue, 10 Sep 2019 12:41:33 +0000 (14:41 +0200)]
curl: Pass CURLSocket to curl_multi_do()
curl_multi_do_locked() currently marks all sockets as ready. That is
not only inefficient, but in fact unsafe (the loop is). A follow-up
patch will change that, but to do so, curl_multi_do_locked() needs to
know exactly which socket is ready; and that is accomplished by this
patch here.
Max Reitz [Tue, 10 Sep 2019 12:41:32 +0000 (14:41 +0200)]
curl: Check completion in curl_multi_do()
While it is more likely that transfers complete after some file
descriptor has data ready to read, we probably should not rely on it.
Better be safe than sorry and call curl_multi_check_completion() in
curl_multi_do(), too, just like it is done in curl_multi_read().
With this change, curl_multi_do() and curl_multi_read() are actually the
same, so drop curl_multi_read() and use curl_multi_do() as the sole FD
handler.
Max Reitz [Tue, 10 Sep 2019 12:41:30 +0000 (14:41 +0200)]
curl: Keep pointer to the CURLState in CURLSocket
A follow-up patch will make curl_multi_do() and curl_multi_read() take a
CURLSocket instead of the CURLState. They still need the latter,
though, so add a pointer to it to the former.