Víctor Colombo [Wed, 4 May 2022 21:05:20 +0000 (18:05 -0300)]
target/ppc: Remove fpscr_* macros from cpu.h
fpscr_* defined macros are hiding the usage of *env behind them.
Substitute the usage of these macros with `env->fpscr & FP_*` to make
the code cleaner.
ppc/xive: Update the state of the External interrupt signal
When pulling or pushing an OS context from/to a CPU, we should
re-evaluate the state of the External interrupt signal. Otherwise, we
can end up catching the External interrupt exception in hypervisor
mode, which is unexpected.
The problem is best illustrated with the following scenario:
1. an External interrupt is raised while the guest is on the CPU.
2. before the guest can ack the External interrupt, an hypervisor
interrupt is raised, for example the Hypervisor Decrementer or
Hypervisor Virtualization interrupt. The hypervisor interrupt forces
the guest to exit while the External interrupt is still pending.
3. the hypervisor handles the hypervisor interrupt. At this point, the
External interrupt is still pending. So it's very likely to be
delivered while the hypervisor is running. That's unexpected and can
result in an infinite loop where the hypervisor catches the External
interrupt, looks for an interrupt in its hypervisor queue, doesn't
find any, exits the interrupt handler with the External interrupt
still raised, repeat...
The fix is simply to always lower the External interrupt signal when
pulling an OS context. It means it needs to be raised again when
re-pushing the OS context. Fortunately, it's already the case, as we
now always call xive_tctx_ipb_update(), which will raise the signal if
needed.
ppc/xive: Always recompute the PIPR when pushing an OS context
The Post Interrupt Priority Register (PIPR) is not restored like the
other OS-context related fields of the TIMA when pushing an OS context
on the CPU. It's not needed because it can be calculated from the
Interrupt Pending Buffer (IPB), which is saved and restored. The PIPR
must therefore always be recomputed when pushing an OS context.
This patch fixes a path on P9 and P10 where it was not done. If there
was a pending interrupt when the OS context was pulled, the IPB was
saved correctly. When pushing back the context, the code in
xive_tctx_need_resend() was checking for a interrupt raised while the
context was not on the CPU, saved in the NVT. If one was found, then
it was merged with the saved IPB and the PIPR updated and everything
was fine. However, if there was no interrupt found in the NVT, then
xive_tctx_ipb_update() was not being called and the PIPR was not
updated. This patch fixes it by always calling xive_tctx_ipb_update().
Note that on P10 (xive2.c) and because of the above, there's no longer
any need to check the CPPR value so it can go away.
Bin Meng [Thu, 21 Apr 2022 01:17:29 +0000 (09:17 +0800)]
target/ppc: Fix BookE debug interrupt generation
Per E500 core reference manual [1], chapter 8.4.4 "Branch Taken Debug
Event" and chapter 8.4.5 "Instruction Complete Debug Event":
"A branch taken debug event occurs if both MSR[DE] and DBCR0[BRT]
are set ... Branch taken debug events are not recognized if MSR[DE]
is cleared when the branch instruction executes."
"An instruction complete debug event occurs when any instruction
completes execution so long as MSR[DE] and DBCR0[ICMP] are both
set ... Instruction complete debug events are not recognized if
MSR[DE] is cleared at the time of the instruction execution."
Current codes do not check MSR.DE bit before setting HFLAGS_SE and
HFLAGS_BE flag, which would cause the immediate debug interrupt to
be generated, e.g.: when DBCR0.ICMP bit is set by guest software
and MSR.DE is not set.
target/ppc: init 'rmmu_info' in kvm_get_radix_page_info()
Init the struct to avoid Valgrind complaints about unitialized bytes,
such as this one:
==39549== Syscall param ioctl(generic) points to uninitialised byte(s)
==39549== at 0x55864E4: ioctl (in /usr/lib64/libc.so.6)
==39549== by 0xD1F7EF: kvm_vm_ioctl (kvm-all.c:3035)
==39549== by 0xAF8F5B: kvm_get_radix_page_info (kvm.c:276)
==39549== by 0xB00533: kvmppc_host_cpu_class_init (kvm.c:2369)
==39549== by 0xD3DCE7: type_initialize (object.c:366)
==39549== by 0xD3FACF: object_class_foreach_tramp (object.c:1071)
==39549== by 0x502757B: g_hash_table_foreach (in /usr/lib64/libglib-2.0.so.0.7000.5)
==39549== by 0xD3FC1B: object_class_foreach (object.c:1093)
==39549== by 0xB0141F: kvm_ppc_register_host_cpu_type (kvm.c:2613)
==39549== by 0xAF87E7: kvm_arch_init (kvm.c:157)
==39549== by 0xD1E2A7: kvm_init (kvm-all.c:2595)
==39549== by 0x8E6E93: accel_init_machine (accel-softmmu.c:39)
==39549== Address 0x1fff00e208 is on thread 1's stack
==39549== in frame #2, created by kvm_get_radix_page_info (kvm.c:267)
==39549== Uninitialised value was created by a stack allocation
==39549== at 0xAF8EE8: kvm_get_radix_page_info (kvm.c:267)
target/ppc: init 'sregs' in kvmppc_put_books_sregs()
Init 'sregs' to avoid Valgrind complaints about uninitialized bytes
from kvmppc_put_books_sregs():
==54059== Thread 3:
==54059== Syscall param ioctl(generic) points to uninitialised byte(s)
==54059== at 0x55864E4: ioctl (in /usr/lib64/libc.so.6)
==54059== by 0xD1FA23: kvm_vcpu_ioctl (kvm-all.c:3053)
==54059== by 0xAFB18B: kvmppc_put_books_sregs (kvm.c:891)
==54059== by 0xAFB47B: kvm_arch_put_registers (kvm.c:949)
==54059== by 0xD1EDA7: do_kvm_cpu_synchronize_post_init (kvm-all.c:2766)
==54059== by 0x481AF3: process_queued_cpu_work (cpus-common.c:343)
==54059== by 0x4EF247: qemu_wait_io_event_common (cpus.c:412)
==54059== by 0x4EF343: qemu_wait_io_event (cpus.c:436)
==54059== by 0xD21E83: kvm_vcpu_thread_fn (kvm-accel-ops.c:54)
==54059== by 0xFFEBF3: qemu_thread_start (qemu-thread-posix.c:556)
==54059== by 0x54E6DC3: start_thread (in /usr/lib64/libc.so.6)
==54059== by 0x5596C9F: clone (in /usr/lib64/libc.so.6)
==54059== Address 0x799d1cc is on thread 3's stack
==54059== in frame #2, created by kvmppc_put_books_sregs (kvm.c:851)
==54059== Uninitialised value was created by a stack allocation
==54059== at 0xAFAEB0: kvmppc_put_books_sregs (kvm.c:851)
This happens because Valgrind does not consider the 'sregs'
initialization done by kvm_vcpu_ioctl() at the end of the function.
target/ppc: init 'lpcr' in kvmppc_enable_cap_large_decr()
'lpcr' is used as an input of kvm_get_one_reg(). Valgrind doesn't
understand that and it returns warnings as such for this function:
==55240== Thread 1:
==55240== Conditional jump or move depends on uninitialised value(s)
==55240== at 0xB011E4: kvmppc_enable_cap_large_decr (kvm.c:2546)
==55240== by 0x92F28F: cap_large_decr_cpu_apply (spapr_caps.c:523)
==55240== by 0x930C37: spapr_caps_cpu_apply (spapr_caps.c:921)
==55240== by 0x955D3B: spapr_reset_vcpu (spapr_cpu_core.c:73)
==55240== by 0x95612B: spapr_cpu_core_reset (spapr_cpu_core.c:209)
==55240== by 0x95619B: spapr_cpu_core_reset_handler (spapr_cpu_core.c:218)
==55240== by 0xD3605F: qemu_devices_reset (reset.c:69)
==55240== by 0x92112B: spapr_machine_reset (spapr.c:1641)
==55240== by 0x4FBD63: qemu_system_reset (runstate.c:444)
==55240== by 0x62812B: qdev_machine_creation_done (machine.c:1247)
==55240== by 0x5064C3: qemu_machine_creation_done (vl.c:2725)
==55240== by 0x5065DF: qmp_x_exit_preconfig (vl.c:2748)
==55240== Uninitialised value was created by a stack allocation
==55240== at 0xB01158: kvmppc_enable_cap_large_decr (kvm.c:2540)
target/ppc: initialize 'val' union in kvm_get_one_spr()
Valgrind isn't convinced that we are initializing the values we assign
to env->spr[spr] because it doesn't understand that the 'val' union is
being written by the kvm_vcpu_ioctl() that follows (via struct
kvm_one_reg).
This results in Valgrind complaining about uninitialized values every
time we use env->spr in a conditional, like this instance:
==707578== Thread 1:
==707578== Conditional jump or move depends on uninitialised value(s)
==707578== at 0xA10A40: hreg_compute_hflags_value (helper_regs.c:106)
==707578== by 0xA10C9F: hreg_compute_hflags (helper_regs.c:173)
==707578== by 0xA110F7: hreg_store_msr (helper_regs.c:262)
==707578== by 0xA051A3: ppc_cpu_reset (cpu_init.c:7168)
==707578== by 0xD4730F: device_transitional_reset (qdev.c:799)
==707578== by 0xD4A11B: resettable_phase_hold (resettable.c:182)
==707578== by 0xD49A77: resettable_assert_reset (resettable.c:60)
==707578== by 0xD4994B: resettable_reset (resettable.c:45)
==707578== by 0xD458BB: device_cold_reset (qdev.c:296)
==707578== by 0x48FBC7: cpu_reset (cpu-common.c:114)
==707578== by 0x97B5EB: spapr_reset_vcpu (spapr_cpu_core.c:38)
==707578== by 0x97BABB: spapr_cpu_core_reset (spapr_cpu_core.c:209)
==707578== Uninitialised value was created by a stack allocation
==707578== at 0xB11F08: kvm_get_one_spr (kvm.c:543)
Initializing 'val' has no impact in the logic and makes Valgrind output
more bearable.
* tag 'pull-target-arm-20220505' of https://git.linaro.org/people/pmaydell/qemu-arm: (23 commits)
target/arm: read access to performance counters from EL0
target/arm: Add isar_feature_{aa64,any}_ras
target/arm: Add isar predicates for FEAT_Debugv8p2
target/arm: Remove HOST_BIG_ENDIAN ifdef in add_cpreg_to_hashtable
target/arm: Reformat comments in add_cpreg_to_hashtable
target/arm: Perform override check early in add_cpreg_to_hashtable
target/arm: Hoist isbanked computation in add_cpreg_to_hashtable
target/arm: Use bool for is64 and ns in add_cpreg_to_hashtable
target/arm: Consolidate cpreg updates in add_cpreg_to_hashtable
target/arm: Hoist computation of key in add_cpreg_to_hashtable
target/arm: Merge allocation of the cpreg and its name
target/arm: Store cpregs key in the hash table directly
target/arm: Drop always-true test in define_arm_vh_e2h_redirects_aliases
target/arm: Name CPSecureState type
target/arm: Name CPState type
target/arm: Change cpreg access permissions to enum
target/arm: Avoid bare abort() or assert(0)
target/arm: Reorg ARMCPRegInfo type field bits
target/arm: Make some more cpreg data static const
target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
...
Alex Zuepke [Thu, 28 Apr 2022 13:27:17 +0000 (15:27 +0200)]
target/arm: read access to performance counters from EL0
The ARMv8 manual defines that PMUSERENR_EL0.ER enables read-access
to both PMXEVCNTR_EL0 and PMEVCNTR<n>_EL0 registers, however,
we only use it for PMXEVCNTR_EL0. Extend to PMEVCNTR<n>_EL0 as well.
Add the aa64 predicate for detecting RAS support from id registers.
We already have the aa32 version from the M-profile work.
Add the 'any' predicate for testing both aa64 and aa32.
target/arm: Merge allocation of the cpreg and its name
Simplify freeing cp_regs hash table entries by using a single
allocation for the entire value.
This fixes a theoretical bug if we were to ever free the entire
hash table, because we've been installing string literal constants
into the cpreg structure in define_arm_vh_e2h_redirects_aliases.
However, at present we only free entries created for AArch32
wildcard cpregs which get overwritten by more specific cpregs,
so this bug is never exposed.
target/arm: Store cpregs key in the hash table directly
Cast the uint32_t key into a gpointer directly, which
allows us to avoid allocating storage for each key.
Use g_hash_table_lookup when we already have a gpointer
(e.g. for callbacks like count_cpreg), or when using
get_arm_cp_reginfo would require casting away const.
Give this enum a name and use in ARMCPRegInfo and add_cpreg_to_hashtable.
Add the enumerator ARM_CP_SECSTATE_BOTH to clarify how 0
is handled in define_one_arm_cp_reg_with_opaque.
Instead of defining ARM_CP_FLAG_MASK to remove flags,
define ARM_CP_SPECIAL_MASK to isolate special cases.
Sort the specials to the low bits. Use an enum.
Split the large comment block so as to document each
value separately.
target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
Remove a possible source of error by removing REGINFO_SENTINEL
and using ARRAY_SIZE (convinently hidden inside a macro) to
find the end of the set of regs being registered or modified.
The space saved by not having the extra array element reduces
the executable's .data.rel.ro section by about 9k.
target/arm: Reorg CPAccessResult and access_check_cp_reg
Rearrange the values of the enumerators of CPAccessResult
so that we may directly extract the target el. For the two
special cases in access_check_cp_reg, use CPAccessResult.
target/arm: Enable SCTLR_EL1.BT0 for aarch64-linux-user
This controls whether the PACI{A,B}SP instructions trap with BTYPE=3
(indirect branch from register other than x16/x17). The linux kernel
sets this in bti_enable().
Merge tag 'for-upstream' of git://repo.or.cz/qemu/kevin into staging
Block layer patches
- Fix and re-enable GLOBAL_STATE_CODE assertions
- vhost-user: Fixes for VHOST_USER_ADD/REM_MEM_REG
- vmdk: Fix reopening bs->file
- coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
- docs/qemu-img: Fix list of formats which implement check
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmJyi+YRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9aphxAAt0sXEOWlcIeU87NOk+V30rRiBup0K/HZ
# wqsE6e0EMbygmC2aS/xqNu3naQ/TMY6UaoVBWSpf0D3sK2GnWEJW8bjV05ObZBwp
# 6QUgqljk1QAAVv0o2/nViAcV8mEW+OzZLveP+qxFRNlNGoJDsbGzWj939SHM13eu
# ZD+/GGs/qXL3Gxp6adhOBjxbXYjvxm13F3pVjoyAugjMSqoSuCI0eXu1xkwXNHSP
# /wqObH3dQSzIvEXfE/1BOp3ofZwvg+XzeZ6MM4I/lvHDZWuQBfCQcBYKL9mMNWGc
# ijFEeolWt7hER50ik4XPvBmbj0jU2nPXQwo1XcFeWX3MSoNsha2jCZsz4LqzadIN
# YijGQHmkfDRmG2LSoIGcgM7chdwj88K8pfMnrrTsVEB6Dl4QrK6FjXviL5mG+rcX
# 5FbKpgRwm3fmtug7Ttpgm1LJQmwK5A3YPenPH+CC2FoK3Rje46ZoMwQR5PBuHvM1
# rg9RB01eGJQGrw5Rt3VFk7304O/yT2J5m96x6CMejx4CuGK78VpkBC54HixTTh0R
# nXxLLZdqawVqwrPP6sE02FEajM931///nhU6fuN/832m3bYUsfM8SZNVqJh5xoex
# SK8x/a5HnTIkp7kys3f4juc+jzb4Yvka8IBIZi/gqzoqf8POGKLBCTpEXa3lBWIT
# gnCCnWWEdgY=
# =ETW/
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 04 May 2022 09:21:26 AM CDT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "[email protected]"
# gpg: Good signature from "Kevin Wolf <[email protected]>" [full]
* tag 'for-upstream' of git://repo.or.cz/qemu/kevin:
coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()
coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
iotests/reopen-file: Test reopening file child
block/vmdk: Fix reopening bs->file
iotests: Add regression test for issue 945
Revert "main-loop: Disable GLOBAL_STATE_CODE() assertions"
qcow2: Do not reopen data_file in invalidate_cache
block: Classify bdrv_get_flags() as I/O function
vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REG
libvhost-user: Fix extra vu_add/rem_mem_reg reply
docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG
qemu-img: properly list formats which have consistency check implemented
Merge tag 'pull-request-2022-05-04' of https://gitlab.com/thuth/qemu into staging
* Silence the warning about the msa5 feature when using the "max" CPU on s390x
* Implement the s390x Vector-Enhancements Facility 2
* Remove the old libopcode-based s390 disassembler
* Fix branch-relative-long test compilation with Clang
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmJyXKURHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbX1bg//bSZhEFekeak8nsM2piEwA/d3hEz5aTqN
# 9UW296E3MpE6cyfai+rQw1HzACA/sbOHLGBpOfo+dPkCq7JPhif62xOWd/6pfjvl
# d6+GRB7YusSnyePwQ7AJwWK7xOFi9LqYiqfM7wqUQf/TbetB4/ufssVc47LBsrqR
# 5OWJMRf0G/GItpCCy4IDp1oEJnKI9lGN+VG9hWJePeGYPLelmx0uHH02kgDCOb93
# atCOEeoDEsrVsbtwt9/NDw5H3DvgL2/bYGtVMkkXivysT3QhrxzoJMYRndK03CSx
# 2rWnmGGqorlzIJ8RdKvu27c9XfTtf8ssaidZMuCk4WD54H7Ln32L9EvRCpjtT8o2
# RHgxnkWSa2NWHhVrX9r0syRc7tFfFK3U7G5kYlZov+o1IyrgA7prwIjKzTk5ZIAl
# ZPmXWTUuewWSnGsJsRK9R8+UQ+nB6x8gxqK1s0dHf2rTgtIgWsx5s9WEdxGqeQ5h
# 5IvIBOML4aXnp2i0QGoGdq4zaDl1ac8AGpLd2jqc9svlHl44Q7NfY2MiWMVGCOP+
# O7DdO/tfmuJyPZS4QolGHghJFycC3Qr3Z42/dJrNK8bwaVGG/ysWkrutxcUzS3z9
# /xkkBWz8Vlktcy4Ft8lqkvofQGUYuJIfbU++EBu6yAp+mSzbO7elE8TZbgpGOVQv
# BFgwW3J4iqI=
# =7QOT
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 04 May 2022 03:59:49 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "[email protected]"
# gpg: Good signature from "Thomas Huth <[email protected]>" [undefined]
# gpg: aka "Thomas Huth <[email protected]>" [undefined]
# gpg: aka "Thomas Huth <[email protected]>" [unknown]
# gpg: aka "Thomas Huth <[email protected]>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2022-05-04' of https://gitlab.com/thuth/qemu:
tests/tcg/s390x: Use a different PCRel32 notation in branch-relative-long.c
disas: Remove old libopcode s390 disassembler
tests/tcg/s390x: Tests for Vector Enhancements Facility 2
target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
target/s390x: vxeh2: vector {load, store} byte reversed element
target/s390x: vxeh2: vector {load, store} byte reversed elements
target/s390x: vxeh2: vector {load, store} elements reversed
target/s390x: vxeh2: vector shift double by bit
target/s390x: vxeh2: Update for changes to vector shifts
target/s390x: vxeh2: vector string search
target/s390x: vxeh2: vector convert short/32b
tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
s390x/cpu_models: make "max" match the unmodified "qemu" CPU model under TCG
s390x/cpu_models: drop "msa5" from the TCG "max" model
target/s390x: Fix writeback to v1 in helper_vstl
Stefan Hajnoczi [Mon, 7 Mar 2022 15:38:53 +0000 (15:38 +0000)]
coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
I think coroutine-win32.c could get away with __thread because the
variables are only used in situations where either the stale value is
correct (current) or outside coroutine context (loading leader when
current is NULL). Due to the difficulty of being sure that this is
really safe in all scenarios it seems worth converting it anyway.
Stefan Hajnoczi [Mon, 7 Mar 2022 15:38:52 +0000 (15:38 +0000)]
coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
The alloc_pool QSLIST needs a typedef so the return value of
get_ptr_alloc_pool() can be stored in a local variable.
One example of why this code is necessary: a coroutine that yields
before calling qemu_coroutine_create() to create another coroutine is
affected by the TLS issue.
Stefan Hajnoczi [Mon, 7 Mar 2022 15:38:51 +0000 (15:38 +0000)]
coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.
Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
Hanna Reitz [Mon, 14 Mar 2022 16:27:18 +0000 (17:27 +0100)]
block/vmdk: Fix reopening bs->file
VMDK disk data is stored in extents, which may or may not be separate
from bs->file. VmdkExtent.file points to where they are stored. Each
that is stored in bs->file will simply reuse the exact pointer value of
bs->file.
(That is why vmdk_free_extents() will unref VmdkExtent.file (e->file)
only if e->file != bs->file.)
Reopen operations can change bs->file (they will replace the whole
BdrvChild object, not just the BDS stored in that BdrvChild), and then
we will need to change all .file pointers of all such VmdkExtents to
point to the new BdrvChild.
In vmdk_reopen_prepare(), we have to check which VmdkExtents are
affected, and in vmdk_reopen_commit(), we can modify them. We have to
split this because:
- The new BdrvChild is created only after prepare, so we can change
VmdkExtent.file only in commit
- In commit, there no longer is any (valid) reference to the old
BdrvChild object, so there would be nothing to compare VmdkExtent.file
against to see whether it was equal to bs->file before reopening
(There is BDRVReopenState.old_file_bs, but the old bs->file
BdrvChild's .bs pointer will be NULL-ed when the new BdrvChild is
created, and so we cannot compare VmdkExtent.file->bs against
BDRVReopenState.old_file_bs)
Hanna Reitz [Wed, 27 Apr 2022 11:40:57 +0000 (13:40 +0200)]
iotests: Add regression test for issue 945
Create a VM with a BDS in an iothread, add -incoming defer to the
command line, and then export this BDS via NBD. Doing so should not
fail an assertion.
This reverts commit b1c073490553f80594b903ceedfc7c1aef6b1b19. (We
wanted to do so once the 7.1 tree opens, which has happened. The issue
reported in https://gitlab.com/qemu-project/qemu/-/issues/945 should be
fixed by the preceding patches.)
Hanna Reitz [Wed, 27 Apr 2022 11:40:55 +0000 (13:40 +0200)]
qcow2: Do not reopen data_file in invalidate_cache
qcow2_co_invalidate_cache() closes and opens the qcow2 file, by calling
qcow2_close() and qcow2_do_open(). These two functions must thus be
usable from both a global-state and an I/O context.
As they are, they are not safe to call in an I/O context, because they
use bdrv_unref_child() and bdrv_open_child() to close/open the data_file
child, respectively, both of which are global-state functions. When
used from qcow2_co_invalidate_cache(), we do not need to close/open the
data_file child, though (we do not do this for bs->file or bs->backing
either), and so we should skip it in the qcow2_co_invalidate_cache()
path.
To do so, add a parameter to qcow2_do_open() and qcow2_close() to make
them skip handling s->data_file, and have qcow2_co_invalidate_cache()
exempt it from the memset() on the BDRVQcow2State.
(Note that the QED driver similarly closes/opens the QED image by
invoking bdrv_qed_close()+bdrv_qed_do_open(), but both functions seem
safe to use in an I/O context.)
Kevin Wolf [Thu, 7 Apr 2022 13:36:57 +0000 (15:36 +0200)]
vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REG
The spec clarifies now that QEMU should not send a file descriptor in a
request to remove a memory region. Change it accordingly.
For libvhost-user, this is a bug fix that makes it compatible with
rust-vmm's implementation that doesn't send a file descriptor. Keep
accepting, but ignoring a file descriptor for compatibility with older
QEMU versions.
Kevin Wolf [Thu, 7 Apr 2022 13:36:56 +0000 (15:36 +0200)]
libvhost-user: Fix extra vu_add/rem_mem_reg reply
Outside of postcopy mode, neither VHOST_USER_ADD_MEM_REG nor
VHOST_USER_REM_MEM_REG are supposed to send a reply unless explicitly
requested with the need_reply flag. Their current implementation always
sends a reply, even if it isn't requested. This confuses the master
because it will interpret the reply as a reply for the next message for
which it actually expects a reply.
need_reply is already handled correctly by vu_dispatch(), so just don't
send a reply in the non-postcopy part of the message handler for these
two commands.
Kevin Wolf [Thu, 7 Apr 2022 13:36:55 +0000 (15:36 +0200)]
docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG
The specification for VHOST_USER_ADD/REM_MEM_REG messages is unclear
in several points, which has led to clients having incompatible
implementations. This changes the specification to be more explicit
about them:
* VHOST_USER_ADD_MEM_REG is not specified as receiving a file
descriptor, though it obviously does need to do so. All
implementations agree on this one, fix the specification.
* VHOST_USER_REM_MEM_REG is not specified as receiving a file
descriptor either, and it also has no reason to do so. rust-vmm does
not send file descriptors for removing a memory region (in agreement
with the specification), libvhost-user and QEMU do (which is a bug),
though libvhost-user doesn't actually make any use of it.
Change the specification so that for compatibility QEMU's behaviour
becomes legal, even if discouraged, but rust-vmm's behaviour becomes
the explicitly recommended mode of operation.
* VHOST_USER_ADD_MEM_REG doesn't have a documented return value, which
is the desired behaviour in the non-postcopy case. It also implemented
like this in QEMU and rust-vmm, though libvhost-user is buggy and
sometimes sends an unexpected reply. This will be fixed in a separate
patch.
However, in postcopy mode it does reply like VHOST_USER_SET_MEM_TABLE.
This behaviour is shared between libvhost-user and QEMU; rust-vmm
doesn't implement postcopy mode yet. Mention it explicitly in the
spec.
* The specification doesn't mention how VHOST_USER_REM_MEM_REG
identifies the memory region to be removed. Change it to describe the
existing behaviour of libvhost-user (guest address, user address and
size must match).
qemu-img: properly list formats which have consistency check implemented
Simple grep for the .bdrv_co_check callback presence gives the following
list of block drivers
* QED
* VDI
* VHDX
* VMDK
* Parallels
which have this callback. The presense of the callback means that
consistency check is supported.
* tag 'qga-pull-request' of gitlab.com:marcandre.lureau/qemu:
qga: Introduce disk smart
qga: Introduce NVMe disk bus type
qga/commands-posix: 'guest-shutdown' for Solaris
qga/commands-posix: Log all net stats failures
qga/commands-posix: Fix listing ifaces for Solaris
qga/commands-posix: Fix iface hw address detection
qga/commands-posix: Use getifaddrs when available
zhenwei pi [Wed, 20 Apr 2022 02:26:10 +0000 (10:26 +0800)]
qga: Introduce disk smart
After assigning a NVMe/SCSI controller to guest by VFIO, we lose
everything on the host side. A guest uses these devices exclusively,
we usually don't care the actions on these devices. But there is a
low probability that hitting physical hardware warning, we need a
chance to get the basic smart log info.
Introduce disk smart, and implement NVMe smart on linux.
zhenwei pi [Wed, 20 Apr 2022 02:26:09 +0000 (10:26 +0800)]
qga: Introduce NVMe disk bus type
Assigning a NVMe disk by VFIO or emulating a NVMe controller by QEMU,
a NVMe disk get exposed in guest side. Support NVMe disk bus type and
implement posix version.
Andrew Deason [Tue, 26 Apr 2022 19:55:26 +0000 (14:55 -0500)]
qga/commands-posix: 'guest-shutdown' for Solaris
On Solaris, instead of the -P, -H, and -r flags, we need to provide
the target init state to the 'shutdown' command: state 5 is poweroff,
0 is halt, and 6 is reboot. We also need to pass -g0 to avoid the
default 60-second delay, and -y to avoid a confirmation prompt.
Implement this logic under an #ifdef CONFIG_SOLARIS, so the
'guest-shutdown' command works properly on Solaris.
Andrew Deason [Tue, 26 Apr 2022 19:55:24 +0000 (14:55 -0500)]
qga/commands-posix: Fix listing ifaces for Solaris
The code for guest-network-get-interfaces needs a couple of small
adjustments for Solaris:
- The results from SIOCGIFHWADDR are documented as being in ifr_addr,
not ifr_hwaddr (ifr_hwaddr doesn't exist on Solaris).
- The implementation of guest_get_network_stats is Linux-specific, so
hide it under #ifdef CONFIG_LINUX. On non-Linux, we just won't
provide network interface stats.
Since its introduction in commit 3424fc9f16a1 ("qemu-ga: add
guest-network-get-interfaces command"), guest-network-get-interfaces
seems to check if a given interface has a hardware address by checking
'ifa->ifa_flags & SIOCGIFHWADDR'. But ifa_flags is a field for IFF_*
flags (IFF_UP, IFF_LOOPBACK, etc), and comparing it to an ioctl like
SIOCGIFHWADDR doesn't make sense.
On Linux, this isn't a big deal, since SIOCGIFHWADDR has so many bits
set (0x8927), 'ifa->ifa_flags & SIOCGIFHWADDR' will usually have a
nonzero result for any 'normal'-looking interfaces: anything with
IFF_UP (0x1) or IFF_BROADCAST (0x2) set, as well as several
less-common flags. This means we'll try to get the hardware address
for most/all interfaces, even those that don't really have one (like
the loopback device). For those interfaces, Linux just returns a
hardware address of all zeroes.
On Solaris, however, trying to get the hardware address for a loopback
device returns an EADDRNOTAVAIL error. This causes us to return an
error and the entire guest-network-get-interfaces call fails.
Change this logic to always try to get the hardware address for each
interface, and don't return an error if we fail to get it. Instead,
just don't include the 'hardware-address' field in the result if we
can't get the hardware address.
Andrew Deason [Tue, 26 Apr 2022 19:55:22 +0000 (14:55 -0500)]
qga/commands-posix: Use getifaddrs when available
Currently, commands-posix.c assumes that getifaddrs() is only
available on Linux, and so the related guest agent command
guest-network-get-interfaces is only implemented for #ifdef __linux__.
This function does exist on other platforms, though, such as Solaris.
So, add a meson check for getifaddrs(), and move the code for
guest-network-get-interfaces to be built whenever getifaddrs() is
available.
The implementation for guest-network-get-interfaces still has some
Linux-specific code, which is not fixed in this commit. This commit
moves the relevant big chunks of code around without changing them, so
a future commit can change the code in place.
tests/tcg/s390x: Use a different PCRel32 notation in branch-relative-long.c
Binutils >=2.37 and Clang do not accept (. - 0x100000000) PCRel32
constants. While this looks like a bug that needs fixing, use a
different notation (-0x100000000) as a workaround.
s390x/cpu_models: make "max" match the unmodified "qemu" CPU model under TCG
Before we were able to bump up the qemu CPU model to a z13, we included
some experimental features during development in the "max" model only.
Nowadays, the "max" model corresponds exactly to the "qemu" CPU model
of the latest QEMU machine under TCG.
Let's remove all the special casing, effectively making both models
match completely from now on, and clean up.
s390x/cpu_models: drop "msa5" from the TCG "max" model
We don't include the "msa5" feature in the "qemu" model because it
generates a warning. The PoP states:
"The message-security-assist extension 5 requires
the secure-hash-algorithm (SHA-512) capabilities of
the message-security-assist extension 2 as a prereq-
uisite. (March, 2015)"
As SHA-512 won't be supported in the near future, let's just drop the
feature from the "max" model. This avoids the warning and allows us for
making the "max" model match the "qemu" model (except for compat
machines). We don't lose much, as we only implement the function stubs
for MSA, excluding any real subfunctions.
Merge tag 'pull-aspeed-20220503' of https://github.com/legoater/qemu into staging
aspeed queue:
* New AST1030 SoC and eval board
* Accumulative mode support for HACE controller
* GPIO fix and unit test
* Clock modeling adjustments for the AST2600
* Dummy eMMC Boot Controller model
* Change of AST2500 EVB and AST2600 EVB flash model (for quad IO)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmJwwq8ACgkQUaNDx8/7
# 7KF0IhAApbCCcg06PR66pmaDBFY2RWmU0XShDoCEeHyT5huQFcAJWNoqVAJ52E8L
# ZCPEeORQthxMwmtw7JLIGCFhDx4P4YzfNZRPANRosKs7BR0GequVgHp7c6fXhD/3
# A3w42hfuNR4Hrbsil/yhN2vxFAYXudA+NPez2ibex3UyVc/ZUu71nCqZTxh3wZdN
# XQTuqxWerA5RRBRtVn8n/aBp+3mo5enD4dx44KWMZxKxJaFJfZQHVZttGHU9azF+
# fXJ1lmrJZ7eHmWjCEvgnHXwl0nWiMwkLZ9/MKOAPkdjUG1JciGRxbJki0bGuS7Jr
# NzOyO0f++ZtOsuLGA03WiwR1oo3GmG7lBFqBcdzMwN2EMvDvVvJUp3v8IdV/L10P
# OJ10rBi6FDJuKGHJGIQywlFSYYjPb+DgNEWId2rugVVm4dR02Cn69amuL40OO9by
# /C7hO9gSvRTqSSdjFcdkbI2h+kx0354F2/gR2LFLBh1KUHulTJ4ErthrKBiuNPC8
# tsELzYVnxWVT+nc30Nmidg3uCW3/5zBlaj0qlL4aiFjKR5na6Wpz+oE/aNiNdyT3
# IBI+J5zvbtn/prNTWLW1TCuGdwj357LfYfkfkH8eqZWfX5vGq+5hVTc/m8EW5Cx8
# yV8JrbjX8uDI379skdl4imtedbKZhPLd7csM/zrorsJhBBwSoLA=
# =+hIh
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 02 May 2022 10:50:39 PM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <[email protected]>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20220503' of https://github.com/legoater/qemu:
aspeed/hace: Support AST1030 HACE
hw/gpio/aspeed_gpio: Fix QOM pin property
tests/qtest: Add test for Aspeed HACE accumulative mode
aspeed/hace: Support AST2600 HACE
aspeed/hace: Support HMAC Key Buffer register.
hw/arm/aspeed: fix AST2500/AST2600 EVB fmc model
test/avocado/machine_aspeed.py: Add ast1030 test case
aspeed: Add an AST1030 eval board
aspeed/soc : Add AST1030 support
aspeed/scu: Add AST1030 support
aspeed/timer: Add AST1030 support
aspeed/wdt: Add AST1030 support
aspeed/wdt: Fix ast2500/ast2600 default reload value
aspeed/smc: Add AST1030 support
aspeed/adc: Add AST1030 support
aspeed: Add eMMC Boot Controller stub
aspeed: sbc: Correct default reset values
hw: aspeed_scu: Introduce clkin_25Mhz attribute
hw: aspeed_scu: Add AST2600 apb_freq and hpll calculation function
The qemu_*block() functions are meant to be be used with sockets (the
win32 implementation expects SOCKET)
Over time, those functions where used with Win32 SOCKET or
file-descriptors interchangeably. But for portability, they must only be
used with socket-like file-descriptors. FDs can use
g_unix_set_fd_nonblocking() instead.
Rename the functions with "socket" in the name to prevent bad usages.
This is effectively reverting commit f9e8cacc5557e43 ("oslib-posix:
rename socket_set_nonblock() to qemu_set_nonblock()").
Since commit a2ce7dbd917 ("meson: convert tests/qtest to meson"),
libqtest.h is under libqos/ directory, while libqtest.c is still in
qtest/. Move back to its original location to avoid mixing with libqos/.