Git Repo - qemu.git/log

Merge remote-tracking branch 'remotes/rth-gitlab/tags/pull-tcg-20210604' into staging

Host vector support for arm neon.

# gpg: Signature made Fri 04 Jun 2021 19:56:59 BST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Richard Henderson <[email protected]>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth-gitlab/tags/pull-tcg-20210604:
  tcg/arm: Implement TCG_TARGET_HAS_rotv_vec
  tcg/arm: Implement TCG_TARGET_HAS_roti_vec
  tcg/arm: Implement TCG_TARGET_HAS_shv_vec
  tcg/arm: Implement TCG_TARGET_HAS_bitsel_vec
  tcg/arm: Implement TCG_TARGET_HAS_minmax_vec
  tcg/arm: Implement TCG_TARGET_HAS_sat_vec
  tcg/arm: Implement TCG_TARGET_HAS_mul_vec
  tcg/arm: Implement TCG_TARGET_HAS_shi_vec
  tcg/arm: Implement andc, orc, abs, neg, not vector operations
  tcg/arm: Implement minimal vector operations
  tcg/arm: Implement tcg_out_dup*_vec
  tcg/arm: Implement tcg_out_mov for vector types
  tcg/arm: Implement tcg_out_ld/st for vector types
  tcg/arm: Add host vector framework
  tcg: Change parameters for tcg_target_const_match

Signed-off-by: Peter Maydell <[email protected]>

tcg/arm: Implement TCG_TARGET_HAS_rotv_vec

Implement via expansion, so don't actually set TCG_TARGET_HAS_rotv_vec.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement TCG_TARGET_HAS_roti_vec

Implement via expansion, so don't actually set TCG_TARGET_HAS_roti_vec.
For NEON, this is shift-right followed by shift-left-and-insert.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
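
To illustrate the roti_vec expansion described in the commit above, here is
a minimal scalar sketch in C of a rotate-left built from a shift-right
followed by a shift-left-and-insert. It models a single 32-bit lane only.
The names lane_sli32 and lane_rotl32 are hypothetical and are not taken
from tcg/arm.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-left-and-insert on one 32-bit lane: shift src left by n and
 * insert it into dst, leaving the low n bits of dst unchanged. */
static uint32_t lane_sli32(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n < 32);
    uint32_t keep = (n == 0) ? 0 : (UINT32_MAX >> (32 - n));
    return (dst & keep) | (src << n);
}

/* Rotate left by n, for 0 < n < 32. */
static uint32_t lane_rotl32(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    uint32_t t = x >> (32 - n);   /* step 1: shift right */
    return lane_sli32(t, x, n);   /* step 2: shift left and insert */
}
```

For example, lane_rotl32(0x12345678, 8) yields 0x34567812: the shift-right
leaves 0x12 in the low byte, and the insert fills the upper 24 bits.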

tcg/arm: Implement TCG_TARGET_HAS_shv_vec

The three vector shift by vector operations are all implemented via
expansion. Therefore do not actually set TCG_TARGET_HAS_shv_vec,
as none of shlv_vec, shrv_vec, sarv_vec may actually appear in the
instruction stream, and therefore also do not appear in tcg_target_op_def.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement TCG_TARGET_HAS_bitsel_vec

NEON has 3 instructions implementing this 4 argument operation,
with each insn overlapping a different logical input onto the
destination register.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
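
The point about each instruction overlapping a different input onto the
destination can be sketched in scalar C. The three helpers below model my
understanding of the NEON VBSL, VBIT and VBIF semantics on a 64-bit value;
the function names are hypothetical and this is not the tcg/arm code.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics: take bits of a where mask is 1, else bits of b. */
static uint64_t bitsel(uint64_t mask, uint64_t a, uint64_t b)
{
    return (a & mask) | (b & ~mask);
}

/* VBSL-style: the destination d initially holds the mask. */
static uint64_t vbsl(uint64_t d, uint64_t n, uint64_t m)
{
    return (n & d) | (m & ~d);
}

/* VBIT-style: the destination d initially holds the "false" input. */
static uint64_t vbit(uint64_t d, uint64_t n, uint64_t m)
{
    return (n & m) | (d & ~m);
}

/* VBIF-style: the destination d initially holds the "true" input. */
static uint64_t vbif(uint64_t d, uint64_t n, uint64_t m)
{
    return (d & m) | (n & ~m);
}

/* All three forms compute the same 4-argument operation. */
static void check_bitsel_forms(uint64_t mask, uint64_t a, uint64_t b)
{
    uint64_t want = bitsel(mask, a, b);
    assert(vbsl(mask, a, b) == want);
    assert(vbit(b, a, mask) == want);
    assert(vbif(a, b, mask) == want);
}
```

Which form is cheapest depends on which input already lives in the output
register, since that input is the one that gets overwritten.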

tcg/arm: Implement TCG_TARGET_HAS_minmax_vec

This is minimum and maximum, signed and unsigned.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement TCG_TARGET_HAS_sat_vec

This is saturating add and subtract, signed and unsigned.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
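
As a reminder of what the four saturating operations compute, here is a
minimal per-element sketch in C for 8-bit lanes. The helper names are
hypothetical; the real backend emits NEON instructions rather than calling
anything like this.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v into [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Unsigned saturating add and subtract: results stay within 0..255. */
static uint8_t usadd8(uint8_t a, uint8_t b)
{
    return (uint8_t)clamp_int(a + b, 0, UINT8_MAX);
}

static uint8_t ussub8(uint8_t a, uint8_t b)
{
    return (uint8_t)clamp_int(a - b, 0, UINT8_MAX);
}

/* Signed saturating add and subtract: results stay within -128..127. */
static int8_t ssadd8(int8_t a, int8_t b)
{
    return (int8_t)clamp_int(a + b, INT8_MIN, INT8_MAX);
}

static int8_t sssub8(int8_t a, int8_t b)
{
    return (int8_t)clamp_int(a - b, INT8_MIN, INT8_MAX);
}
```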

tcg/arm: Implement TCG_TARGET_HAS_mul_vec

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement TCG_TARGET_HAS_shi_vec

This consists of the three immediate shifts: shli, shri, sari.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement andc, orc, abs, neg, not vector operations

These logical and arithmetic operations are optional, but are
trivial to accomplish with the existing infrastructure.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement minimal vector operations

Implementing dup2, add, sub, and, or, xor as the minimal set.
This allows us to actually enable neon in the header file.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement tcg_out_dup*_vec

Most of dupi is copied from tcg/aarch64, which has the same
encoding for AdvSimdExpandImm.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement tcg_out_mov for vector types

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Implement tcg_out_ld/st for vector types

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg/arm: Add host vector framework

Add registers and function stubs. The functionality
is disabled via use_neon_instructions defined to 0.

We must still include results for the mandatory opcodes in
tcg_target_op_def, as all opcodes are checked during tcg init.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tcg: Change parameters for tcg_target_const_match

Change the return value to bool, because that's what it should
have been from the start. Pass the ct mask instead of the whole
TCGArgConstraint, as that's the only part that's relevant.

Change the value argument to int64_t. We will need the extra
width for 32-bit hosts wanting to match vector constants.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* OpenBSD cleanup (Brad)
* fixes for the i386 accel/cpu refactoring (Claudio)
* unmap test for emulated SCSI (Kit)
* fix for iscsi module (myself)
* fix for -readconfig of objects (myself)
* fixes for x86 16-bit task switching (myself)
* fix for x86 MOV from/to CR8 (Richard)

# gpg: Signature made Fri 04 Jun 2021 12:53:32 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Paolo Bonzini <[email protected]>" [full]
# gpg:                 aka "Paolo Bonzini <[email protected]>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream:
  vl: plug -object back into -readconfig
  vl: plumb keyval-based options into -readconfig
  qemu-config: parse configuration files to a QDict
  i386: run accel_cpu_instance_init as post_init
  i386: reorder call to cpu_exec_realizefn
  tests/qtest/virtio-scsi-test: add unmap large LBA with 4k blocks test
  target/i386: Fix decode of cr8
  target/i386: tcg: fix switching from 16-bit to 32-bit tasks or vice versa
  target/i386: tcg: fix loading of registers from 16-bit TSS
  target/i386: tcg: fix segment register offsets for 16-bit TSS
  oslib-posix: Remove OpenBSD workaround for fcntl("/dev/null", F_SETFL, O_NONBLOCK) failure
  iscsi: link libm into the module
  meson: allow optional dependencies for block modules

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Fri 04 Jun 2021 08:26:16 BST
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  MAINTAINERS: Added eBPF maintainers information.
  docs: Added eBPF documentation.
  virtio-net: Added eBPF RSS to virtio-net.
  ebpf: Added eBPF RSS loader.
  ebpf: Added eBPF RSS program.
  net: Added SetSteeringEBPF method for NetClientState.
  net/tap: Added TUNSETSTEERINGEBPF code.

Signed-off-by: Peter Maydell <[email protected]>

vl: plug -object back into -readconfig

Commit bc2f4fcb1d ("qom: move user_creatable_add_opts logic to vl.c
and QAPIfy it", 2021-03-19) switched the creation of objects from
qemu_opts_foreach to a bespoke QTAILQ in preparation for supporting JSON
syntax in -object.

Unfortunately in doing so it lost support for [object] stanzas in
configuration files and also for "-set object.ID.KEY=VAL". The latter
is hard to re-establish and probably best solved by deprecating -set.
This patch uses the infrastructure introduced by the previous two
patches in order to parse QOM objects correctly from configuration
files.

Cc: Markus Armbruster <[email protected]>
Cc: [email protected]
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20210524105752.3318299 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

vl: plumb keyval-based options into -readconfig

Let -readconfig support parsing command line options into QDict or
QemuOpts. This will be used to add back support for objects in
-readconfig.

Cc: Markus Armbruster <[email protected]>
Cc: [email protected]
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20210524105752.3318299 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qemu-config: parse configuration files to a QDict

Change the parser to put the values into a QDict and pass them
to a callback. qemu_config_parse's QemuOpts creation is
itself turned into a callback function.

This is useful for -readconfig to support keyval-based options;
getting a QDict from the parser removes a roundtrip from
QDict to QemuOpts and then back to QDict.

Unfortunately there is a disadvantage in that semantic errors will
point to the last line of the group, because the entries of the QDict
do not have a location attached.

Cc: Kevin Wolf <[email protected]>
Cc: Markus Armbruster <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20210524105752.3318299 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

i386: run accel_cpu_instance_init as post_init

This fixes host and max cpu initialization, by running the accel cpu
initialization only after all instance init functions are called for all
X86 cpu subclasses.

The bug this is fixing is related to the "max" and "host" i386 cpu
subclasses, which set cpu->max_features, which is then used at cpu
realization time.

In order to properly split the accel-specific max features code that
needs to be executed at cpu instance initialization time,
we cannot call the accel cpu initialization at the end of the x86 base
class initialization, or we will have no way to specialize
"max features" cpu behavior, overriding the "max" cpu class defaults,
and checking for the "max features" flag itself.

This patch moves the accel-specific cpu instance initialization to after
all x86 cpu instance code has been executed, including subclasses,
so that proper initialization of cpu "host" and "max" can be restored.

Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c,"...)
Cc: Eduardo Habkost <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Claudio Fontana <[email protected]>
Message-Id: <20210603123001 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

i386: reorder call to cpu_exec_realizefn

i386 realizefn code is sensitive to ordering, and recent commits
aimed at refactoring it, splitting accelerator-specific code,
broke assumptions which need to be fixed.

We need to:

* process Hyper-V enlightenments first, as they assume features
  not to be expanded

* only then, expand features

* after expanding features, attempt to check them and modify them in the
  accel-specific realizefn code called by cpu_exec_realizefn().

* after the framework has been called via cpu_exec_realizefn,
  the code can check for what has or hasn't been set by accel-specific
  code, or extend its results, i.e.:

  - check and eventually set ucode_rev default
  - modify cpu->mwait after potentially being set from host CPUID.
  - finally check for phys_bits assuming all user and accel-specific
    adjustments have already been taken into account.

Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c"...)
Fixes: 30565f10 ("cpu: call AccelCPUClass::cpu_realizefn in"...)
Cc: Eduardo Habkost <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Claudio Fontana <[email protected]>
Message-Id: <20210603123001 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

tests/qtest/virtio-scsi-test: add unmap large LBA with 4k blocks test

Add test for issue #345

Signed-off-by: Kit Westneat <[email protected]>
Message-Id: <20210603142022 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: Fix decode of cr8

A recent cleanup did not recognize that there are two ways
to encode cr8: one via the LOCK prefix and the other via REX.

Fixes: 7eff2e7c
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/380
Signed-off-by: Richard Henderson <[email protected]>
Message-Id: <20210602035511 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: tcg: fix switching from 16-bit to 32-bit tasks or vice versa

The format of the task state segment is governed by bit 3 in the
descriptor type field. On a task switch, the format for saving
is given by the current value of TR's type field, while the
format for loading is given by the new descriptor.

Signed-off-by: Paolo Bonzini <[email protected]>
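
A minimal sketch of the rule in the commit above, assuming the usual x86
system descriptor types (1 and 3 for a 16-bit TSS, 9 and 0xB for a 32-bit
TSS). The struct and function names are hypothetical and do not come from
target/i386.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit 3 of the descriptor type selects the 32-bit TSS format. */
static bool tss_type_is_32bit(unsigned type)
{
    assert(type <= 0xf);
    return (type & 8) != 0;
}

struct tss_formats {
    bool save_32bit;  /* format used to save the outgoing task */
    bool load_32bit;  /* format used to load the incoming task */
};

/* The save format comes from the current TR type, the load format
 * from the descriptor of the task being switched to. */
static struct tss_formats pick_tss_formats(unsigned old_tr_type,
                                           unsigned new_desc_type)
{
    struct tss_formats f = {
        .save_32bit = tss_type_is_32bit(old_tr_type),
        .load_32bit = tss_type_is_32bit(new_desc_type),
    };
    return f;
}
```

Switching from a busy 16-bit task (type 3) to an available 32-bit task
(type 9) therefore saves in the 16-bit layout and loads in the 32-bit one.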

target/i386: tcg: fix loading of registers from 16-bit TSS

According to the manual, the high 16 bits of the registers are preserved
when switching to a 16-bit task. Implement this in switch_tss_ra.

Signed-off-by: Paolo Bonzini <[email protected]>
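
The register-preservation rule from the commit above amounts to a masked
merge. This is a minimal sketch with a hypothetical helper name, not the
code in switch_tss_ra.

```c
#include <assert.h>
#include <stdint.h>

/* Load a 16-bit value from a 16-bit TSS into a 32-bit register,
 * keeping the register's previous high 16 bits. */
static uint32_t load_reg_from_tss16(uint32_t old_reg, uint16_t tss_val)
{
    return (old_reg & 0xffff0000u) | tss_val;
}

/* Self-check: only the low half changes. */
static void check_load_reg_from_tss16(void)
{
    assert(load_reg_from_tss16(0xdeadbeefu, 0x1234) == 0xdead1234u);
}
```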

target/i386: tcg: fix segment register offsets for 16-bit TSS

The TSS offsets in the manuals have only 2-byte slots for the
segment registers. QEMU incorrectly uses 4-byte slots, so
that SS overlaps the LDT selector.

Resolves: #382
Reported-by: Peter Maydell <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
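
The overlap described above can be checked with a little arithmetic. The
sketch below assumes the 16-bit TSS layout as I understand it from the
manuals: segment selectors start at offset 0x22 in the order ES, CS, SS,
DS, and the LDT selector sits at 0x2a. The names are hypothetical.

```c
#include <assert.h>

enum {
    TSS16_ES_OFFSET  = 0x22,  /* first segment selector slot */
    TSS16_LDT_OFFSET = 0x2a   /* LDT selector */
};

/* Offset of segment slot i (0=ES, 1=CS, 2=SS, 3=DS) for a slot size. */
static unsigned tss16_seg_offset(unsigned i, unsigned slot_bytes)
{
    assert(i < 4);
    return TSS16_ES_OFFSET + i * slot_bytes;
}
```

With 4-byte slots, SS (i = 2) lands on 0x22 + 8 = 0x2a, which is the LDT
selector; with 2-byte slots it lands on 0x26 and DS ends at 0x28, just
below the LDT selector.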

oslib-posix: Remove OpenBSD workaround for fcntl("/dev/null", F_SETFL, O_NONBLOCK) failure

OpenBSD prior to 6.3 required a workaround to utilize fcntl(F_SETFL) on memory
devices.

Since the modern versions of OpenBSD, which are the only ones officially
supported and buildable on, do not have this issue, I am garbage collecting
this workaround.

Signed-off-by: Brad Smith <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

iscsi: link libm into the module

Depending on the configuration of QEMU, some binaries might not need libm
at all. In that case libiscsi, which uses exp(), will fail to load.
Link it in the module explicitly.

Reported-by: Yi Sun <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

meson: allow optional dependencies for block modules

Right now all dependencies for block modules are passed to
module_ss.add(when: ...), so they are mandatory.  In the next patch we
will need to add a libm dependency to a module, but libm does not exist
on all systems.  So, modify the creation of module_ss and modsrc so that
dependencies can also be passed to module_ss.add(if_true: ...).

While touching the array, remove the useless dependency of the curl
module on glib.  glib is always linked in QEMU and in fact all other
block modules also need it, but they don't have to specify it.

Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'remotes/rth-gitlab/tags/pull-fpu-20210603' into staging

Finish conversion of float128 and floatx80 to FloatParts.
Implement float128_muladd and float128_{min,max}*.
Optimize int-to-float conversion with hard-float.

# gpg: Signature made Thu 03 Jun 2021 22:13:10 BST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Richard Henderson <[email protected]>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth-gitlab/tags/pull-fpu-20210603: (29 commits)
  softfloat: Use hard-float for {u}int64_to_float{32,64}
  tests/fp: Enable more tests
  softfloat: Convert modrem operations to FloatParts
  softfloat: Move floatN_log2 to softfloat-parts.c.inc
  softfloat: Convert float32_exp2 to FloatParts
  softfloat: Convert floatx80 compare to FloatParts
  softfloat: Convert floatx80_scalbn to FloatParts
  softfloat: Convert floatx80 to integer to FloatParts
  softfloat: Convert floatx80 float conversions to FloatParts
  softfloat: Convert integer to floatx80 to FloatParts
  softfloat: Convert floatx80_round_to_int to FloatParts
  softfloat: Convert floatx80_round to FloatParts
  softfloat: Convert floatx80_sqrt to FloatParts
  softfloat: Convert floatx80_div to FloatParts
  softfloat: Convert floatx80_mul to FloatParts
  softfloat: Convert floatx80_add/sub to FloatParts
  tests/fp/fp-test: Reverse order of floatx80 precision tests
  softfloat: Adjust parts_uncanon_normal for floatx80
  softfloat: Introduce Floatx80RoundPrec
  softfloat: Reduce FloatFmt
  ...

Signed-off-by: Peter Maydell <[email protected]>

MAINTAINERS: Added eBPF maintainers information.

Signed-off-by: Yuri Benditovich <[email protected]>
Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

docs: Added eBPF documentation.

Signed-off-by: Yuri Benditovich <[email protected]>
Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

virtio-net: Added eBPF RSS to virtio-net.

When RSS is enabled, the device tries to load the eBPF program
to select the RX virtqueue in the TUN. If eBPF can be loaded,
RSS will also function with vhost (works with kernel 5.8 and later).
Software RSS is used as a fallback with vhost=off when eBPF can't be loaded
or when hash population is requested by the guest.

Signed-off-by: Yuri Benditovich <[email protected]>
Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

ebpf: Added eBPF RSS loader.

Added function that loads RSS eBPF program.
Added stub functions for RSS eBPF loader.
Added meson and configuration options.

By default, the eBPF feature is enabled if libbpf is present in the build system.
libbpf is checked for in the configure shell script and the meson script.

Signed-off-by: Yuri Benditovich <[email protected]>
Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

ebpf: Added eBPF RSS program.

RSS program and Makefile to build it.
bpftool is used to generate the '.h' file.
The data in that file may be loaded by libbpf.
eBPF compilation is not required for building qemu.
You can use the Makefile if you need to regenerate rss.bpf.skeleton.h.

Signed-off-by: Yuri Benditovich <[email protected]>
Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net: Added SetSteeringEBPF method for NetClientState.

For now, that method is supported only by Linux TAP.
Linux TAP uses TUNSETSTEERINGEBPF ioctl.

Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net/tap: Added TUNSETSTEERINGEBPF code.

Additional code that will be used for the eBPF steering-setting routine.

Signed-off-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

softfloat: Use hard-float for {u}int64_to_float{32,64}

For the normal case of no additional scaling, this reduces the
profile contribution of int64_to_float64 to the testcase in the
linked issue from 0.81% to 0.04%.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/134
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tests/fp: Enable more tests

Fix the trivial typo in extF80_lt_quiet, and re-enable
all of the floatx80 tests that are now fixed.

Signed-off-by: Alex Bennée <[email protected]>
Message-ID: <[email protected]>
[rth: Squash the fix for lt_quiet, and enable that too.]
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert modrem operations to FloatParts

Rename to parts$N_modrem. This was the last use of a lot
of the legacy infrastructure, so remove it as required.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Move floatN_log2 to softfloat-parts.c.inc

Rename to parts$N_log2.  Though this is partly a ruse, since I do not
believe the code will succeed for float128 without work.  Which is ok
for now, because we do not need this for more than float32 and float64.

Since berkeley-testfloat-3 doesn't support log2, compare float64_log2
vs the system log2.  Fix the errors for inputs near 1.0:

test: 3ff00000000000b0  +0x1.00000000000b0p+0
  sf: 3d2fa00000000000  +0x1.fa00000000000p-45
libm: 3d2fbd422b1bd36f  +0x1.fbd422b1bd36fp-45
Error in fraction: 32170028290927 ulp

test: 3feec24f6770b100  +0x1.ec24f6770b100p-1
  sf: bfad3740d13c9ec0  -0x1.d3740d13c9ec0p-5
libm: bfad3740d13c9e98  -0x1.d3740d13c9e98p-5
Error in fraction: 40 ulp

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert float32_exp2 to FloatParts

Keep the intermediate results in FloatParts instead of
converting back and forth between float64. Use muladd
instead of separate mul+add.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80 compare to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_scalbn to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80 to integer to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80 float conversions to FloatParts

This is the last use of commonNaNT and all of the routines
that use it, so remove all of them for Werror.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert integer to floatx80 to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_round_to_int to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_round to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_sqrt to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_div to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_mul to FloatParts

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Convert floatx80_add/sub to FloatParts

Since this is the first such, this includes all of the
packing and unpacking routines as well.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

tests/fp/fp-test: Reverse order of floatx80 precision tests

Many qemu softfloat routines will check floatx80_rounding_precision
even when berkeley testfloat will not. So begin with
floatx80_precision_x, so that's the one we use
when !FUNC_EFF_ROUNDINGPRECISION.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Adjust parts_uncanon_normal for floatx80

With floatx80_precision_x, the rounding happens across
the break between words. Notice this case with

frac_lsb = round_mask + 1 -> 0

and check the bits in frac_hi as needed.

In addition, since frac_shift == 0, we won't implicitly clear
round_mask via the right-shift, so explicitly clear those bits.
This fixes rounding for floatx80_precision_[sd].

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Introduce Floatx80RoundPrec

Use an enumeration instead of raw 32/64/80 values.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Reduce FloatFmt

Remove frac_lsb, frac_lsbm1, roundeven_mask. Compute
these from round_mask in parts$N_uncanon_normal.

With floatx80, round_mask will not be tied to frac_shift.
Everything else is easily computable.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
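
A minimal sketch of how the three removed fields can be derived from
round_mask alone, assuming round_mask is a contiguous run of low set bits
(the bits that get rounded away). The struct and function names are
hypothetical; only the field names come from the commit message, and the
formulas are my reading of what those names mean.

```c
#include <assert.h>
#include <stdint.h>

struct round_bits {
    uint64_t frac_lsb;        /* lowest bit that survives rounding */
    uint64_t frac_lsbm1;      /* highest bit that is rounded away */
    uint64_t roundeven_mask;  /* round_mask plus the surviving lsb */
};

static struct round_bits derive_round_bits(uint64_t round_mask)
{
    /* round_mask must look like 0b0...01...1 (or be 0 or all ones). */
    assert((round_mask & (round_mask + 1)) == 0);

    struct round_bits r;
    r.frac_lsb = round_mask + 1;  /* wraps to 0 if round_mask is all ones */
    r.frac_lsbm1 = round_mask ^ (round_mask >> 1);
    r.roundeven_mask = round_mask | r.frac_lsb;
    return r;
}
```

For round_mask = 0x3ff this gives frac_lsb = 0x400, frac_lsbm1 = 0x200 and
roundeven_mask = 0x7ff. When round_mask covers a whole 64-bit word,
frac_lsb wraps to 0, which is the case a caller has to handle separately.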

softfloat: Split out parts_uncanon_normal

We will need to treat the non-normal cases of floatx80 specially,
so split out the normal case that we can reuse.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Move sqrt_float to softfloat-parts.c.inc

Rename to parts$N_sqrt.
Reimplement float128_sqrt with FloatParts128.

Reimplement with the inverse sqrt Newton-Raphson algorithm from musl.
This is significantly faster than even the Berkeley sqrt N-R algorithm,
because it does not use division instructions, only multiplication.

Ordinarily, changing algorithms at the same time as migrating code is
a bad idea, but this is the only way I found that didn't break one of
the routines at the same time.

Tested-by: Alex Bennée <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
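
To show why the inverse-sqrt form needs no division, here is a minimal
sketch of the Newton-Raphson iteration r' = r * (3 - x*r*r) / 2, followed
by sqrt(x) = x * r. It uses host doubles for readability, whereas softfloat
operates on integer fractions, and it takes the initial estimate as a
parameter instead of computing one. The function names are hypothetical.

```c
#include <assert.h>

/* Refine an estimate r of 1/sqrt(x) using only multiplications.
 * The division by 2 is folded into the constants 1.5 and 0.5. */
static double rsqrt_newton(double x, double r, int iters)
{
    assert(x > 0.0 && r > 0.0);
    for (int i = 0; i < iters; i++) {
        r = r * (1.5 - 0.5 * x * r * r);
    }
    return r;
}

/* sqrt(x) = x * (1/sqrt(x)), again without dividing. */
static double sqrt_via_rsqrt(double x, double r0, int iters)
{
    return x * rsqrt_newton(x, r0, iters);
}
```

The iteration converges quadratically once the estimate is close, so a
handful of steps from a rough starting value such as 0.7 for x = 2 is
enough to reach double precision.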

softfloat: Move scalbn_decomposed to softfloat-parts.c.inc

Rename to parts$N_scalbn.
Reimplement float128_scalbn with FloatParts128.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Move compare_floats to softfloat-parts.c.inc

Rename to parts$N_compare.  Rename all of the intermediate
functions to ftype_do_compare.  Rename the hard-float functions
to ftype_hs_compare.  Convert float128 to FloatParts128.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Implement float128_(min|minnum|minnummag|max|maxnum|maxnummag)

The float128 implementation is straightforward.
Unfortunately, we don't have any tests we can simply adjust/unlock.

Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20210517142739 [email protected]>
[rth: Update for changed parts_minmax return value]
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Move minmax_flags to softfloat-parts.c.inc

Rename to parts$N_minmax. Combine 3 bool arguments to a bitmask.
Introduce ftype_minmax functions as a common optimization point.
Fold bfloat16 expansions into the same macro as the other types.

Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
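
The idea of folding three bool arguments into one bitmask can be sketched
as follows. The flag names and the function are hypothetical, and the
sketch works on host doubles and ignores signed zeros, so it only
illustrates the calling convention, not the softfloat implementation.

```c
#include <assert.h>

enum {
    MM_ISMIN = 1,  /* min instead of max */
    MM_ISNUM = 2,  /* "num" variant: a single NaN input is ignored */
    MM_ISMAG = 4   /* compare magnitudes instead of signed values */
};

static double minmax_sketch(double a, double b, int flags)
{
    assert((flags & ~(MM_ISMIN | MM_ISNUM | MM_ISMAG)) == 0);

    int a_nan = (a != a), b_nan = (b != b);
    if (a_nan || b_nan) {
        if ((flags & MM_ISNUM) && !(a_nan && b_nan)) {
            return a_nan ? b : a;   /* return the non-NaN operand */
        }
        return a_nan ? a : b;       /* propagate a NaN */
    }

    double ka = a, kb = b;
    if (flags & MM_ISMAG) {
        ka = ka < 0 ? -ka : ka;
        kb = kb < 0 ? -kb : kb;
        if (ka == kb) {             /* equal magnitude: compare signed */
            ka = a;
            kb = b;
        }
    }
    int pick_a = (flags & MM_ISMIN) ? (ka < kb) : (ka > kb);
    return pick_a ? a : b;
}
```

A caller then writes minmax_sketch(a, b, MM_ISMIN | MM_ISMAG) rather than
passing three positional booleans whose order is easy to get wrong.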

softfloat: Move uint_to_float to softfloat-parts.c.inc

Rename to parts$N_uint_to_float.
Reimplement uint64_to_float128 with FloatParts128.

Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Move int_to_float to softfloat-parts.c.inc

Rename to parts$N_sint_to_float.
Reimplement int{32,64}_to_float128 with FloatParts128.

Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

softfloat: Move round_to_uint_and_pack to softfloat-parts.c.inc

Rename to parts$N_float_to_uint. Reimplement
float128_to_uint{32,64}{_round_to_zero} with FloatParts128.

Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210603' into staging

target-arm queue:
* Some not-yet-enabled preliminaries for M-profile MVE support
* Consistently use "Cortex-Axx", not "Cortex Axx" in docs, comments
* docs: Fix installation of man pages with Sphinx 4.x
* Mark LDS{MIN,MAX} as signed operations
* Fix missing syndrome value for DAIF and PAC check exceptions
* Implement BFloat16 extensions
* Refactoring of hvf accelerator code in preparation for aarch64 support
* Fix some coverity nits in test code

# gpg: Signature made Thu 03 Jun 2021 16:58:02 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Peter Maydell <[email protected]>" [ultimate]
# gpg:                 aka "Peter Maydell <[email protected]>" [ultimate]
# gpg:                 aka "Peter Maydell <[email protected]>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20210603: (45 commits)
  tests/unit/test-vmstate: Assert that dup() and mkstemp() succeed
  tests/qtest/tpm-tests: Remove unnecessary NULL checks
  tests/qtest/pflash-cfi02-test: Avoid potential integer overflow
  tests/qtest/hd-geo-test: Fix checks on mkstemp() return value
  tests/qtest/e1000e-test: Check qemu_recv() succeeded
  tests/qtest/bios-tables-test: Check for dup2() failure
  hvf: Simplify post reset/init/loadvm hooks
  hvf: Introduce hvf vcpu struct
  hvf: Remove hvf-accel-ops.h
  hvf: Make synchronize functions static
  hvf: Use cpu_synchronize_state()
  hvf: Split out common code on vcpu init and destroy
  hvf: Remove use of hv_uvaddr_t and hv_gpaddr_t
  hvf: Make hvf_set_phys_mem() static
  hvf: Move hvf internal definitions into common header
  hvf: Move cpu functions into common directory
  hvf: Move vcpu thread functions into common directory
  hvf: Move assert_hvf_ok() into common directory
  target/arm: Enable BFloat16 extensions
  linux-user/aarch64: Enable hwcap bits for bfloat16
  ...

Signed-off-by: Peter Maydell <[email protected]>

tests/unit/test-vmstate: Assert that dup() and mkstemp() succeed

Coverity complains that we don't check for failures from dup()
and mkstemp(); add asserts that these syscalls succeeded.

Fixes: Coverity CID 1432516, 1432574
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20210525134458 [email protected]

tests/qtest/tpm-tests: Remove unnecessary NULL checks

Coverity points out that in tpm_test_swtpm_migration_test() we
assume that src_tpm_addr and dst_tpm_addr are non-NULL (we
pass them to tpm_util_migration_start_qemu() which will
unconditionally dereference them) but then later explicitly
check them for NULL. Remove the pointless checks.

Fixes: Coverity CID 1432367, 1432359
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Message-id: 20210525134458 [email protected]

tests/qtest/pflash-cfi02-test: Avoid potential integer overflow

Coverity points out that we calculate a 64-bit value using 32-bit
arithmetic; add the cast to force the multiply to be done in 64 bits.
(The overflow will never happen with the current test data.)

Fixes: Coverity CID 1432320
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Message-id: 20210525134458 [email protected]
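
The class of issue Coverity flags here can be reproduced in a few lines.
This is a minimal sketch with hypothetical function names, assuming a
platform where int is 32 bits wide; it is not the test's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit multiply, then widen: the product wraps before it is widened. */
static uint64_t size_wrapping(uint32_t count, uint32_t each)
{
    return count * each;
}

/* Widen first, then multiply: the product is computed in 64 bits. */
static uint64_t size_widened(uint32_t count, uint32_t each)
{
    return (uint64_t)count * each;
}

/* Self-check: the two forms differ once the product exceeds 32 bits. */
static void check_widening(void)
{
    assert(size_wrapping(0x10000, 0x10000) == 0);
    assert(size_widened(0x10000, 0x10000) == 0x100000000ULL);
}
```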

tests/qtest/hd-geo-test: Fix checks on mkstemp() return value

Coverity notices that the checks against mkstemp() failing in
create_qcow2_with_mbr() are wrong: mkstemp returns -1 on failure but
the check is just "g_assert(fd)". Fix to use "g_assert(fd >= 0)",
matching the correct check in create_test_img().

Fixes: Coverity CID 1432274
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Message-id: 20210525134458 [email protected]
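
The difference between the two checks can be shown without touching the
filesystem. In this minimal sketch, plain predicates stand in for the
conditions passed to g_assert(); the function names are hypothetical.

```c
#include <assert.h>

/* What "g_assert(fd)" effectively tests: any nonzero value passes. */
static int naive_fd_ok(int fd)
{
    return fd != 0;
}

/* What "g_assert(fd >= 0)" tests: any valid descriptor passes. */
static int fd_ok(int fd)
{
    return fd >= 0;
}

/* Self-check: the naive form accepts the -1 failure value and
 * rejects descriptor 0, which is valid. */
static void check_fd_predicates(void)
{
    assert(naive_fd_ok(-1));
    assert(!fd_ok(-1));
    assert(!naive_fd_ok(0));
    assert(fd_ok(0));
}
```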

tests/qtest/e1000e-test: Check qemu_recv() succeeded

The e1000e_send_verify() test calls qemu_recv() but doesn't
check that the call succeeded, which annoys Coverity. Add
an explicit test check for the length of the data.

(This is a test check, not a "we assume this syscall always
succeeds" assertion, so we use g_assert_cmpint() rather than g_assert().)

Fixes: Coverity CID 1432324
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Message-id: 20210525134458 [email protected]

tests/qtest/bios-tables-test: Check for dup2() failure

Coverity notes that we don't check for dup2() failing. Add some
assertions so that if it does ever happen we get some indication.
(This is similar to how we handle other "don't expect this syscall to
fail" checks in this test code.)

Fixes: Coverity CID 1432346
Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Message-id: 20210525134458 [email protected]

hvf: Simplify post reset/init/loadvm hooks

The hooks we have that call us after reset, init and loadvm really all
just want to say "The reference of all register state is in the QEMU
vcpu struct, please push it".

However, we already have a working push mechanism, cpu->vcpu_dirty,
so we can just reuse that for all of the above and sync state properly
the next time we actually execute a vCPU.

This fixes PSCI resets on ARM, as they modify CPU state even after the
post init call has completed, but before we execute the vCPU again.

To also make the scheme work for x86, we have to make sure we don't
move stale eflags into our env when the vcpu state is dirty.
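
The dirty-flag scheme described above can be sketched as follows. This
is a simplified model, not the hvf code: the struct and function names
are hypothetical stand-ins, and only the vcpu_dirty field name comes
from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the QEMU-side vCPU state. */
typedef struct {
    uint64_t regs[4];
    bool vcpu_dirty;   /* true: the QEMU copy is the reference */
} SketchCPU;

/* Hypothetical stand-in for the hypervisor-side copy. */
typedef struct {
    uint64_t regs[4];
    int pushes;        /* how many times state was pushed */
} SketchHypervisor;

/* The post reset/init/loadvm hooks only mark the state dirty. */
void sketch_post_reset(SketchCPU *cpu)
{
    cpu->vcpu_dirty = true;
}

/* Before running the guest, push the registers once if they are dirty. */
void sketch_vcpu_exec(SketchCPU *cpu, SketchHypervisor *hv)
{
    if (cpu->vcpu_dirty) {
        for (int i = 0; i < 4; i++) {
            hv->regs[i] = cpu->regs[i];
        }
        hv->pushes++;
        cpu->vcpu_dirty = false;
    }
    /* ... the guest would run here ... */
}
```

Because the push is deferred until execution, a register change made
after the hook has returned (as with the PSCI reset case above) is still
picked up by the next sketch_vcpu_exec() call.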

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Roman Bolshakov <[email protected]>
Tested-by: Roman Bolshakov <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hvf: Introduce hvf vcpu struct

We will need more than a single field for hvf going forward. To keep
the global vcpu struct uncluttered, let's allocate a special hvf vcpu
struct, similar to how hax does it.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Roman Bolshakov <[email protected]>
Tested-by: Roman Bolshakov <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Remove hvf-accel-ops.h

We can move the definition of hvf_vcpu_exec() into our internal
hvf header, obsoleting the need for hvf-accel-ops.h.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Make synchronize functions static

The hvf accel synchronize functions are only used as input for local
callback functions, so we can make them static.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Use cpu_synchronize_state()

There is no reason to call the hvf specific hvf_cpu_synchronize_state()
when we can just use the generic cpu_synchronize_state() instead. This
allows us to have less dependency on internal function definitions and
allows us to make hvf_cpu_synchronize_state() static.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Split out common code on vcpu init and destroy

Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch splits the vcpu init and destroy functions into a generic and
an architecture specific portion. This also allows us to move the generic
functions into the generic hvf code, removing exported functions.
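
The shape of such a split might look like the sketch below. All names
are hypothetical and chosen for illustration; the real functions and
their contents are in the patch itself.

```c
/* Hypothetical vCPU state: one part owned by generic code, one part
 * owned by the architecture-specific code. */
typedef struct {
    int generic_ready;
    int arch_ready;
} SketchVCPU;

/* Architecture-specific portion, provided once per target. */
int sketch_arch_init_vcpu(SketchVCPU *v)
{
    v->arch_ready = 1;
    return 0;
}

void sketch_arch_vcpu_destroy(SketchVCPU *v)
{
    v->arch_ready = 0;
}

/* Generic portion: does the common work, then calls the arch hook. */
int sketch_init_vcpu(SketchVCPU *v)
{
    v->generic_ready = 1;
    return sketch_arch_init_vcpu(v);
}

/* Teardown runs in the opposite order: arch hook first, then generic. */
void sketch_vcpu_destroy(SketchVCPU *v)
{
    sketch_arch_vcpu_destroy(v);
    v->generic_ready = 0;
}
```

Callers only ever see the generic entry points, which is what lets the
architecture hooks stay out of the exported interface.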

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Remove use of hv_uvaddr_t and hv_gpaddr_t

The ARM version of Hypervisor.framework no longer defines these two
types, so let's just revert to standard ones.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Make hvf_set_phys_mem() static

The hvf_set_phys_mem() function is only called within the same file.
Make it static.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Move hvf internal definitions into common header

Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves a few internal struct and constant defines over.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Move cpu functions into common directory

Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves CPU and memory operations over. While at it, make sure
the code is consumable on non-i386 systems.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Move vcpu thread functions into common directory

Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves the vCPU thread loop over.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

hvf: Move assert_hvf_ok() into common directory

Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.

This patch moves assert_hvf_ok() and introduces generic build infrastructure.

Signed-off-by: Alexander Graf <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Message-id: 20210519202253 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Enable BFloat16 extensions

Disable BF16 again for !have_neon and !have_vfp during realize.

Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

linux-user/aarch64: Enable hwcap bits for bfloat16

Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement bfloat widening fma (indexed)

This is BFMLAL{B,T} for both AArch64 AdvSIMD and SVE,
and VFMA{B,T}.BF16 for AArch32 NEON.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement bfloat widening fma (vector)

This is BFMLAL{B,T} for both AArch64 AdvSIMD and SVE,
and VFMA{B,T}.BF16 for AArch32 NEON.
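
For readers unfamiliar with the bottom/top naming, a rough sketch of the
data flow follows. It is not QEMU's helper code: the function names are
hypothetical, a plain multiply and add is used rather than an explicitly
fused operation, and the architectural rounding and NaN rules are not
modelled.

```c
#include <stdint.h>
#include <string.h>

/* bfloat16 is the top 16 bits of an IEEE float32, so widening is exact. */
float bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Widening multiply-add over "lanes" float32 accumulators.
 * top == 0 uses the even (bottom) bfloat16 element of each pair,
 * top == 1 uses the odd (top) element. */
void bf16_widening_mla(float *acc, const uint16_t *n, const uint16_t *m,
                       int lanes, int top)
{
    for (int i = 0; i < lanes; i++) {
        float a = bf16_to_f32(n[2 * i + top]);
        float b = bf16_to_f32(m[2 * i + top]);
        acc[i] += a * b;
    }
}
```

Each float32 lane therefore consumes only one of the two bfloat16
elements that share its 32 bits; the B and T forms together cover the
whole input vector.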

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement bfloat16 matrix multiply accumulate

This is BFMMLA for both AArch64 AdvSIMD and SVE,
and VMMLA.BF16 for AArch32 NEON.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement bfloat16 dot product (indexed)

This is BFDOT for both AArch64 AdvSIMD and SVE,
and VDOT.BF16 for AArch32 NEON.

Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement bfloat16 dot product (vector)

This is BFDOT for both AArch64 AdvSIMD and SVE,
and VDOT.BF16 for AArch32 NEON.
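
As a rough picture of what a bfloat16 dot product computes, here is a
sketch with hypothetical function names. It uses ordinary float32
arithmetic and does not model the special rounding and denormal handling
that the architecture specifies for these instructions.

```c
#include <stdint.h>
#include <string.h>

/* bfloat16 is the top 16 bits of an IEEE float32, so widening is exact. */
float bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Each float32 accumulator lane receives the sum of the two products
 * formed from the pair of bfloat16 elements occupying the same 32 bits. */
void bf16_dot(float *acc, const uint16_t *n, const uint16_t *m, int lanes)
{
    for (int i = 0; i < lanes; i++) {
        float p0 = bf16_to_f32(n[2 * i]) * bf16_to_f32(m[2 * i]);
        float p1 = bf16_to_f32(n[2 * i + 1]) * bf16_to_f32(m[2 * i + 1]);
        acc[i] += p0 + p1;
    }
}
```

In the indexed form, the second operand pair is taken from one selected
element group instead of varying per lane.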

Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

softfpu: Add float_round_to_odd_inf

For Arm BFDOT and BFMMLA, we need a version of round-to-odd
that overflows to infinity, instead of the max normal number.
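
For background, round-to-odd itself can be sketched on a bare integer
significand: truncate, then force the lowest kept bit to 1 if any
discarded bit was set. The function below is a hypothetical illustration
and not softfloat code. It does not cover the overflow case, which is
the only point where the new mode differs.

```c
#include <stdint.h>

/* Drop the low "shift" bits of sig using round-to-odd.
 * Requires shift < 64. An exact result is returned unchanged; an
 * inexact result always ends up with its lowest bit set. */
uint64_t round_to_odd_shift(uint64_t sig, unsigned shift)
{
    uint64_t dropped = sig & ((UINT64_C(1) << shift) - 1);
    return (sig >> shift) | (dropped != 0);
}
```

For example, dropping two bits from 8 (exactly 2) gives 2, while
dropping two bits from 9 or 10 (2.25 and 2.5) gives 3.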

Cc: Alex Bennée <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement vector float32 to bfloat16 conversion

This is BFCVT{N,T} for both AArch64 AdvSIMD and SVE,
and VCVT.BF16.F32 for AArch32 NEON.
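
As background on the format, bfloat16 keeps the sign, the 8 exponent
bits and the top 7 fraction bits of a float32. A minimal conversion
sketch with round-to-nearest-even is shown below. The function name is
hypothetical, only this one rounding mode is shown, and the NaN handling
is a simplification rather than the architectural behaviour.

```c
#include <stdint.h>
#include <string.h>

/* Convert float32 to bfloat16 with round-to-nearest, ties-to-even. */
uint16_t f32_to_bf16_rne(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));

    /* NaN: keep the top bits and set the quiet bit so that the
     * result cannot collapse into an infinity. */
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        return (uint16_t)((bits >> 16) | 0x0040u);
    }

    /* Add just under half an output ulp, plus the current low output
     * bit, so that exact ties round towards the even neighbour. */
    uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7fffu + lsb;
    return (uint16_t)(bits >> 16);
}
```

For instance, 1.0f (0x3f800000) converts to 0x3f80, and the tie value
0x3f808000 also rounds down to the even result 0x3f80.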

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Implement scalar float32 to bfloat16 conversion

This is the 64-bit BFCVT and the 32-bit VCVT{B,T}.BF16.F32.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Unify unallocated path in disas_fp_1src

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Add isar_feature_{aa32, aa64, aa64_sve}_bf16

Note that the SVE BFLOAT16 support does not require SVE2;
it is an independent extension.

Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Message-id: 20210525225817 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: use raise_exception_ra for stack limit exception

The sequence cpu_restore_state() + raise_exception() is equivalent to
raise_exception_ra(), so use that instead. (In this case we never
cared about the syndrome value, because M-profile doesn't use the
syndrome; the old code was just written unnecessarily awkwardly.)

Cc: Richard Henderson <[email protected]>
Cc: Peter Maydell <[email protected]>
Signed-off-by: Jamie Iles <[email protected]>
[PMM: Retain edited version of comment; rewrite commit message]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>