Git Repo - linux.git/log

tls: fix missing memory barrier in tls_init

In tls_init(), a write memory barrier is missing, and store-store
reordering may cause NULL dereference in tls_{setsockopt,getsockopt}.

CPU0                               CPU1
-----                              -----
// In tls_init()
// In tls_ctx_create()
ctx = kzalloc()
ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1)

// In update_sk_prot()
WRITE_ONCE(sk->sk_prot, tls_prots)     -(2)

                                   // In sock_common_setsockopt()
                                   READ_ONCE(sk->sk_prot)->setsockopt()

                                   // In tls_{setsockopt,getsockopt}()
                                   ctx->sk_proto->setsockopt()    -(3)

In the above scenario, when (1) and (2) are reordered, (3) can observe
the NULL value of ctx->sk_proto, causing NULL dereference.

To fix it, we rely on rcu_assign_pointer() which implies the release
barrier semantic. By moving rcu_assign_pointer() after ctx->sk_proto is
initialized, we can ensure that ctx->sk_proto are visible when
changing sk->sk_prot.

Fixes: d5bee7374b68 ("net/tls: Annotate access to sk_prot with READ_ONCE/WRITE_ONCE")
Signed-off-by: Yewon Choi <[email protected]>
Signed-off-by: Dae R. Jeong <[email protected]>
Link: https://lore.kernel.org/netdev/ZU4OJG56g2V9z_H7@dragonet/T/
Link: https://lore.kernel.org/r/Zkx4vjSFp0mfpjQ2@libra05
Signed-off-by: Paolo Abeni <[email protected]>

net: fec: avoid lock evasion when reading pps_enable

The assignment of pps_enable is protected by tmreg_lock, but the read
operation of pps_enable is not. So the Coverity tool reports a lock
evasion warning which may cause data race to occur when running in a
multithread environment. Although this issue is almost impossible to
occur, we'd better fix it, at least it seems more logically reasonable,
and it also prevents Coverity from continuing to issue warnings.

Fixes: 278d24047891 ("net: fec: ptp: Enable PPS output based on ptp clock")
Signed-off-by: Wei Fang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Revert "ixgbe: Manual AN-37 for troublesome link partners for X550 SFI"

This reverts commit 565736048bd5f9888990569993c6b6bfdf6dcb6d.

According to the commit, it implements a manual AN-37 for some
"troublesome" Juniper MX5 switches. This appears to be a workaround for a
particular switch.

It has been reported that this causes a severe breakage for other switches,
including a Cisco 3560CX-12PD-S.

The code appears to be a workaround for a specific switch which fails to
link in SFI mode. It expects to see AN-37 auto negotiation in order to
link. The Cisco switch is not expecting AN-37 auto negotiation. When the
device starts the manual AN-37, the Cisco switch decides that the port is
confused and stops attempting to link with it. This persists until a power
cycle. A simple driver unload and reload does not resolve the issue, even
if loading with a version of the driver which lacks this workaround.

The authors of the workaround commit have not responded with
clarifications, and the result of the workaround is complete failure to
connect with other switches.

This appears to be a case where the driver can either "correctly" link with
the Juniper MX5 switch, at the cost of bricking the link with the Cisco
switch, or it can behave properly for the Cisco switch, but fail to link
with the Junipir MX5 switch. I do not know enough about the standards
involved to clearly determine whether either switch is at fault or behaving
incorrectly. Nor do I know whether there exists some alternative fix which
corrects behavior with both switches.

Revert the workaround for the Juniper switch.

Fixes: 565736048bd5 ("ixgbe: Manual AN-37 for troublesome link partners for X550 SFI")
Link: https://lore.kernel.org/netdev/[email protected]/T/
Link: https://forum.proxmox.com/threads/intel-x553-sfp-ixgbe-no-go-on-pve8.135129/#post-612291
Signed-off-by: Jacob Keller <[email protected]>
Cc: Jeff Daly <[email protected]>
Cc: [email protected]
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/20240520-net-2024-05-20-revert-silicom-switch-workaround-v1-1-50f80f261c94@intel.com
Signed-off-by: Paolo Abeni <[email protected]>

doc: ceph: update userspace command to get CephFS metadata

According to ceph documentation [1], "getfattr -d /some/dir" no longer
displays the list of all extended attributes. Both CephFS kernel and
FUSE clients hide this information.

To retrieve the information you have to specify the particular attribute
name e.g. "getfattr -n ceph.dir.rbytes /some/dir".

[1] https://docs.ceph.com/en/latest/cephfs/quota/

Signed-off-by: Artem Ikonnikov <[email protected]>
Reviewed-by: Xiubo Li <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit

Since we have support checking the mds auth cap in kclient, just
set the feature bit.

Link: https://tracker.ceph.com/issues/61333
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Milind Changire <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: check the cephx mds auth access for async dirop

Before doing the op locally we need to check the cephx access.

Link: https://tracker.ceph.com/issues/61333
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Milind Changire <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: check the cephx mds auth access for open

Before opening the file locally we need to check the cephx access.

Link: https://tracker.ceph.com/issues/61333
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Milind Changire <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: check the cephx mds auth access for setattr

If we hit any failre just try to force it to do the sync setattr.

Link: https://tracker.ceph.com/issues/61333
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Milind Changire <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: add ceph_mds_check_access() helper

This will help check the mds auth access in client side. Always
insert the server path in front of the target path when matching
the paths.

[ idryomov: use u32 instead of uint32_t ]

Link: https://tracker.ceph.com/issues/61333
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Milind Changire <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

ceph: save cap_auths in MDS client when session is opened

Save the cap_auths, which have been parsed by the MDS, in the opened
session.

[ idryomov: use s64 and u32 instead of int64_t and uint32_t, switch to
bool for root_squash, readable and writeable ]

Link: https://tracker.ceph.com/issues/61333
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Milind Changire <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

testing: net-drv: use stats64 for testing

Testing a network device that has large numbers of bytes/packets may
overflow. Using stats64 when comparing fixes this problem.

I tripped on this while iterating on a qstats patch for mlx5. See below
for confirmation without my added code that this is a bug.

Before this patch (with added debugging output):

$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
rstat: 481708634 qstat: 666201639514 key: tx-bytes
not ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex

Note the huge delta above ^^^ in the rtnl vs qstats.

After this patch:

$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex

It looks like rtnl_fill_stats in net/core/rtnetlink.c will attempt to
copy the 64bit stats into a 32bit structure which is probably why this
behavior is occurring.

To show this is happening, you can get the underlying stats that the
stats.py test uses like this:

$ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
           --do getlink --json '{"ifi-index": 7}'

And examine the output (heavily snipped to show relevant fields):

'stats': {
           'multicast': 3739197,
           'rx-bytes': 1201525399,
           'rx-packets': 56807158,
           'tx-bytes': 492404458,
           'tx-packets': 1200285371,

'stats64': {
             'multicast': 3739197,
             'rx-bytes': 35561263767,
             'rx-packets': 56807158,
             'tx-bytes': 666212335338,
             'tx-packets': 1200285371,

The stats.py test prior to this patch was using the 'stats' structure
above, which matches the failure output on my system.

Comparing side by side, rx-bytes and tx-bytes, and getting ethtool -S
output:

rx-bytes stats:    1201525399
rx-bytes stats64: 35561263767
rx-bytes ethtool: 36203402638

tx-bytes stats:      492404458
tx-bytes stats64: 666212335338
tx-bytes ethtool: 666215360113

Note that the above was taken from a system with an mlx5 NIC, which only
exposes ndo_get_stats64.

Based on the ethtool output and qstat output, it appears that stats.py
should be updated to use the 'stats64' structure for accurate
comparisons when packet/byte counters get very large.

To confirm that this was not related to the qstats code I was iterating
on, I booted a kernel without my driver changes and re-ran the test
which shows the qstats are skipped (as they don't exist for mlx5):

NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum # SKIP qstats not supported by the device
ok 4 stats.qstat_by_ifindex # SKIP No ifindex supports qstats

But, fetching the stats using the CLI

$ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
           --do getlink --json '{"ifi-index": 7}'

Shows the same issue (heavily snipped for relevant fields only):

'stats': {
           'multicast': 105489,
           'rx-bytes': 530879526,
           'rx-packets': 751415,
           'tx-bytes': 2510191396,
           'tx-packets': 27700323,
'stats64': {
             'multicast': 105489,
             'rx-bytes': 530879526,
             'rx-packets': 751415,
             'tx-bytes': 15395093284,
             'tx-packets': 27700323,

Comparing side by side with ethtool -S on the unmodified mlx5 driver:

tx-bytes stats:    2510191396
tx-bytes stats64: 15395093284
tx-bytes ethtool: 17718435810

Fixes: f0e6c86e4bab ("testing: net-drv: add a driver test for stats reporting")
Signed-off-by: Joe Damato <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

dt-bindings: display: Reorganize legacy eDP panel bindings

Back in the day, we used to need to list the exact panel in dts for
eDP panels. This led to all sorts of problems including a large number
of cases where people listed a bogus panel in their device tree
because of the needs of second sourcing (and third sourcing, and
fourth sourcing, ...). Back when we needed to add eDP panels to dts
files we used to list them in "panel-simple.yaml".

These days we have the new way of doing things as documented in
"panel-edp.yaml". We can just list the compatible "edp-panel", add
some timing info to the source code, and we're good to go. There's not
really good reasons not to use this new method.

To try to make it obvious that we shouldn't add new compatible strings
for eDP panels, let's move them all out of the old "panel-simple.yaml"
file to their own file: "panel-edp-legacy.yaml". This new file will
have a description that makes it obvious that we shouldn't use it for
new panels.

While we're doing this:
- We can remove eDP-specific properties from panel-simple.yaml since
  there are no more panels there.
- We don't need to copy non-eDP properties to the
  "panel-edp-legacy.yaml".
- We'll fork off a separate yaml file for "samsung,atna33xc20.yaml".
  This is an eDP panel which isn't _quite_ handled by the generic
  "edp-panel" compatible since it's not allowed to have an external
  backlight (it has one builtin) and it absolutely requires an
  "enable" GPIO.
- We'll un-fork the "sharp,ld-d5116z01b.yaml" and put it in
  "panel-edp-legacy.yaml" since there doesn't appear to be any reason
  for it to be separate.

Suggested-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Rob Herring (Arm) <[email protected]>
Link: https://lore.kernel.org/r/20240520153813.1.Iefaa5b93ca2faada269af77deecdd139261da7ec@changeid
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20240520153813.1.Iefaa5b93ca2faada269af77deecdd139261da7ec@changeid

ALSA: hda/realtek: fix mute/micmute LEDs don't work for ProBook 440/460 G11.

HP ProBook 440/460 G11 needs ALC236_FIXUP_HP_GPIO_LED quirk to
make mic-mute/audio-mute working.

Signed-off-by: Andy Chi <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'drm-misc-next-fixes-2024-05-23' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next-fixes for v6.10-rc1:
- MST null deref fix.
- Don't let next bridge create connector in adv7511 to make probe work.

Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'amd-drm-fixes-6.10-2024-05-22' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-fixes-6.10-2024-05-22:

amdgpu:
- Handle vbios table integrated info v2.3

amdkfd:
- Handle duplicate BOs in reserve_bo_and_cond_vms
- Handle memory limitations on small APUs

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

- A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

- Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

- Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

- This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...

Merge branch 'next' into for-linus

Prepare input updates for 6.10 merge window.

bcachefs: Fix race path in bch2_inode_insert()

__destroy_new_inode() is appropriate when we have _just_allocated the
inode, but not when it's been fully initialized and on i_sb_list.

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

Merge tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more mm updates from Andrew Morton:
"A series from Dave Chinner which cleans up and fixes the handling of
  nested allocations within stackdepot and page-owner"

* tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/page-owner: use gfp_nested_mask() instead of open coded masking
  stackdepot: use gfp_nested_mask() instead of open coded masking
  mm: lift gfp_kmemleak_mask() to gfp.h

bcachefs: Ensure we're RW before journalling

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

tracing/treewide: Remove second parameter of __assign_str()

With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.

This means that with:

  __string(field, mystring)

Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.

There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:

  git grep -l __assign_str | while read a ; do
      sed -e 's/$__assign_str([^,]*[^ ,]$ *,[^;]*/\1)/' $a > /tmp/test-file;
      mv /tmp/test-file $a;
  done

I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.

Note, the same updates will need to be done for:

  __assign_str_len()
  __assign_rel_str()
  __assign_rel_str_len()

I tested this with both an allmodconfig and an allyesconfig (build only for both).

[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Julia Lawall <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Acked-by: Christian König <[email protected]> for the amdgpu parts.
Acked-by: Thomas Hellström <[email protected]> #for
Acked-by: Rafael J. Wysocki <[email protected]> # for thermal
Acked-by: Takashi Iwai <[email protected]>
Acked-by: Darrick J. Wong <[email protected]> # xfs
Tested-by: Guenter Roeck <[email protected]>

bcachefs: Fix shutdown ordering

the btree key cache uses the srcu struct created/destroyed by
btree_iter.c; btree_iter needs to be exited last.

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

ksmbd: ignore trailing slashes in share paths

Trailing slashes in share paths (like: /home/me/Share/) caused permission
issues with shares for clients on iOS and on Android TV for me,
but otherwise they work fine with plain old Samba.

Cc: [email protected]
Signed-off-by: Nandor Kracser <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

bcachefs: Fix unsafety in bch2_dirent_name_bytes()

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

Merge patch series "riscv: Extension parsing fixes"

Charlie Jenkins <[email protected]> says:

This series contains two minor fixes for the extension parsing in
cpufeature.c.

Some T-Head boards without vector 1.0 support report "v" in the isa
string in their DT which will cause the kernel to run vector code. The
code to blacklist "v" from these boards was doing so by using
riscv_cached_mvendorid() which has not been populated at the time of
extension parsing. This fix instead greedily reads the mvendorid CSR of
the boot hart to determine if the cpu is from T-Head.

The other fix is for an incorrect indexing bug. riscv extensions
sometimes imply other extensions. When adding these "subset" extensions
to the hardware capabilities array, they need to be checked if they are
valid. The current code only checks if the extension that is including
other extensions is valid and not the subset extensions.

These patches were previously included in:
https://lore.kernel.org/lkml/20240420-dev-charlie-support_thead_vector_6_9-v3-0-67cff4271d1d@rivosinc.com/

* b4-shazam-merge:
riscv: cpufeature: Fix extension subset checking
riscv: cpufeature: Fix thead vector hwcap removal

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: selftests: Add signal handling vector tests

Add two tests to check vector save/restore when a signal is received
during a vector routine. One test ensures that a value is not clobbered
during signal handling. The other verifies that vector registers
modified in the signal handler are properly reflected when the signal
handling is complete.

Signed-off-by: Charlie Jenkins <[email protected]>
Reviewed-by: Björn Töpel <[email protected]>
Reviewed-by: Andy Chiu <[email protected]>
Tested-by: Andy Chiu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: mm: accelerate pagefault when badaccess

The access_error() of vma already checked under per-VMA lock, if it
is a bad access, directly handle error, no need to retry with mmap_lock
again. Since the page faut is handled under per-VMA lock, count it as
a vma lock event with VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <[email protected]>
Signed-off-by: Kefeng Wang <[email protected]>
Reviewed-by: Alexandre Ghiti <[email protected]>
Tested-by: Alexandre Ghiti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: uaccess: Relax the threshold for fast path

The bytes copy for unaligned head would cover at most SZREG-1 bytes, so
it's better to set the threshold as >= (SZREG-1 + word_copy stride size)
which equals to 9*SZREG-1.

Signed-off-by: Xiao Wang <[email protected]>
Reviewed-by: Alexandre Ghiti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: uaccess: Allow the last potential unrolled copy

When the dst buffer pointer points to the last accessible aligned addr, we
could still run another iteration of unrolled copy.

Signed-off-by: Xiao Wang <[email protected]>
Reviewed-by: Alexandre Ghiti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: typo in comment for get_f64_reg

Signed-off-by: Xingyou Chen <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

Use bool value in set_cpu_online()

The declaration of set_cpu_online() takes a bool value. So replace
int here to make it consistent with the declaration.

Signed-off-by: Zhao Ke <[email protected]>
Reviewed-by: Charlie Jenkins <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: selftests: Add hwprobe binaries to .gitignore

The cbo and which-cpu hwprobe selftests leave their artifacts in the
kernel tree and end up being tracked by git. Add the binaries to the
hwprobe selftest .gitignore so this no longer happens.

Signed-off-by: Charlie Jenkins <[email protected]>
Fixes: a29e2a48afe3 ("RISC-V: selftests: Add CBO tests")
Fixes: ef7d6abb2cf5 ("RISC-V: selftests: Add which-cpus hwprobe test")
Reviewed-by: Muhammad Usama Anjum <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Link: https://lore.kernel.org/r/20240425-gitignore_hwprobe_artifacts-v1-1-dfc5a20da469@rivosinc.com
Signed-off-by: Palmer Dabbelt <[email protected]>

Merge patch series "riscv: fix debug_pagealloc"

Nam Cao <[email protected]> says:

The debug_pagealloc feature is not functional on RISCV. With this feature
enabled (CONFIG_DEBUG_PAGEALLOC=y and debug_pagealloc=on), kernel crashes
early during boot.

QEMU command that can reproduce this problem:
   qemu-system-riscv64 -machine virt \
   -kernel Image \
   -append "console=ttyS0 root=/dev/vda debug_pagealloc=on" \
   -nographic \
   -drive "file=root.img,format=raw,id=hd0" \
   -device virtio-blk-device,drive=hd0 \
   -m 4G \

This series makes debug_pagealloc functional.

* b4-shazam-merge:
  riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
  riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: stacktrace: fixed walk_stackframe()

If the load access fault occures in a leaf function (with
CONFIG_FRAME_POINTER=y), when wrong stack trace will be displayed:

[<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
---[ end trace 0000000000000000 ]---

Registers dump:
    ra     0xffffffff80485758 <regmap_mmio_read+36>
    sp     0xffffffc80200b9a0
    fp     0xffffffc80200b9b0
    pc     0xffffffff804853ba <regmap_mmio_read32le+6>

Stack dump:
    0xffffffc80200b9a0:  0xffffffc80200b9e0  0xffffffc80200b9e0
    0xffffffc80200b9b0:  0xffffffff8116d7e8  0x0000000000000100
    0xffffffc80200b9c0:  0xffffffd8055b9400  0xffffffd8055b9400
    0xffffffc80200b9d0:  0xffffffc80200b9f0  0xffffffff8047c526
    0xffffffc80200b9e0:  0xffffffc80200ba30  0xffffffff8047fe9a

The assembler dump of the function preambula:
    add     sp,sp,-16
    sd      s0,8(sp)
    add     s0,sp,16

In the fist stack frame, where ra is not stored on the stack we can
observe:

        0(sp)                  8(sp)
        .---------------------------------------------.
    sp->|       frame->fp      | frame->ra (saved fp) |
        |---------------------------------------------|
    fp->|         ....         |         ....         |
        |---------------------------------------------|
        |                      |                      |

and in the code check is performed:
if (regs && (regs->epc == pc) && (frame->fp & 0x7))

I see no reason to check frame->fp value at all, because it is can be
uninitialized value on the stack. A better way is to check frame->ra to
be an address on the stack. After the stacktrace shows as expect:

[<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
[<ffffffff80485758>] regmap_mmio_read+0x24/0x52
[<ffffffff8047c526>] _regmap_bus_reg_read+0x1a/0x22
[<ffffffff8047fe9a>] _regmap_read+0x5c/0xea
[<ffffffff80480376>] _regmap_update_bits+0x76/0xc0
...
---[ end trace 0000000000000000 ]---
As pointed by Samuel Holland it is incorrect to remove check of the stackframe
entirely.

Changes since v2 [2]:
- Add accidentally forgotten curly brace

Changes since v1 [1]:
- Instead of just dropping frame->fp check, replace it with validation of
   frame->ra, which should be a stack address.
- Move frame pointer validation into the separate function.

[1] https://lore.kernel.org/linux-riscv/20240426072701 [email protected]/
[2] https://lore.kernel.org/linux-riscv/20240521131314 [email protected]/

Fixes: f766f77a74f5 ("riscv/stacktrace: Fix stack output without ra on the stack top")
Signed-off-by: Matthew Bystrin <[email protected]>
Reviewed-by: Samuel Holland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

ftrace: riscv: move from REGS to ARGS

This commit replaces riscv's support for FTRACE_WITH_REGS with support
for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop
relying on stop_machine() for RISCV's implementation of ftrace.

The main relevant benefit that this change will bring for the above
use-case is that now we don't have separate ftrace_caller and
ftrace_regs_caller trampolines. This will allow the callsite to call
ftrace_caller by modifying a single instruction. Now the callsite can
do something similar to:

When not tracing:            |             When tracing:

func:                                      func:
  auipc t0, ftrace_caller_top                auipc t0, ftrace_caller_top
  nop  <=========<Enable/Disable>=========>  jalr  t0, ftrace_caller_bottom
  [...]                                      [...]

The above assumes that we are dropping the support of calling a direct
trampoline from the callsite. We need to drop this as the callsite can't
change the target address to call, it can only enable/disable a call to
a preset target (ftrace_caller in the above diagram). We can later optimize
this by calling an intermediate dispatcher trampoline before ftrace_caller.

Currently, ftrace_regs_caller saves all CPU registers in the format of
struct pt_regs and allows the tracer to modify them. We don't need to
save all of the CPU registers because at function entry only a subset of
pt_regs is live:

|----------+----------+---------------------------------------------|
| Register | ABI Name | Description                                 |
|----------+----------+---------------------------------------------|
| x1       | ra       | Return address for traced function          |
| x2       | sp       | Stack pointer                               |
| x5       | t0       | Return address for ftrace_caller trampoline |
| x8       | s0/fp    | Frame pointer                               |
| x10-11   | a0-1     | Function arguments/return values            |
| x12-17   | a2-7     | Function arguments                          |
|----------+----------+---------------------------------------------|

See RISCV calling convention[1] for the above table.

Saving just the live registers decreases the amount of stack space
required from 288 Bytes to 112 Bytes.

Basic testing was done with this on the VisionFive 2 development board.

Note:
  - Moving from REGS to ARGS will mean that RISCV will stop supporting
    KPROBES_ON_FTRACE as it requires full pt_regs to be saved.
  - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2].

[1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf
[2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/

Signed-off-by: Puranjay Mohan <[email protected]>
Tested-by: Björn Töpel <[email protected]>
Reviewed-by: Björn Töpel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

Merge patch series "riscv: access_ok() optimization"

Samuel Holland <[email protected]> says:

This series optimizes access_ok() by defining TASK_SIZE_MAX. At Alex's
suggestion, I also tried making TASK_SIZE constant (specifically by
making PGDIR_SHIFT a variable instead of a ternary expression, then
replacing the load with an immediate using ALTERNATIVE). This appeared
to slightly improve performance on some implementations (C906) but
regressed it on others (FU740). So I am leaving further optimizations to
a later series.

* b4-shazam-merge:
riscv: Define TASK_SIZE_MAX for __access_ok()
riscv: Remove PGDIR_SIZE_L3 and TASK_SIZE_MIN

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: do not select MODULE_SECTIONS by default

Since commit aad15bc85c18 ("riscv: Change code model of module to
medany to improve data accessing"), kernel modules have not been built
with -fPIC, so they wouldn't have R_RISCV_GOT_HI20 or R_RISCV_CALL_PLT
relocations, and handling of those relocations is unnecessary.

If RELOCATABLE=y, kernel modules will be built with -fPIE, which would
reintroduce said relocations, so only select MODULE_SECTIONS when
RELOCATABLE.

Signed-off-by: Qingfang Deng <[email protected]>
Reviewed-by: Charlie Jenkins <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: show help string for riscv-specific targets

Define the archhelp variable so that 'make ACRH=riscv help' will show
the targets specific to building a RISC-V kernel like other
architectures.

Tested-by: Björn Töpel <[email protected]>
Signed-off-by: Emil Renner Berthing <[email protected]>
Reviewed-by: Masahiro Yamada <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: make image compression configurable

Previously the build process would always set KBUILD_IMAGE to the
uncompressed Image file (unless XIP_KERNEL or EFI_ZBOOT was enabled) and
unconditionally compress it into Image.gz. However there are already
build targets for Image.bz2, Image.lz4, Image.lzma, Image.lzo and
Image.zstd, so let's make use of those, make the compression method
configurable and set KBUILD_IMAGE accordingly so that targets like
'make install' and 'make bindeb-pkg' will use the chosen image.

Tested-by: Björn Töpel <[email protected]>
Signed-off-by: Emil Renner Berthing <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
Reviewed-by: Masahiro Yamada <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

bcachefs: Fix stack oob in __bch2_encrypt_bio()

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

bcachefs: Fix btree_trans leak in bch2_readahead()

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

bcachefs: Fix bogus verify_replicas_entry() assert

verify_replicas_entry() is only for newly created replicas entries -
existing entries on disk may have unknown data types, and we have real
verifiers for them.

Reported-by: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>

i3c: dw: Add hot-join support.

Add hot-join support for dw i3c master controller.
By default, the hot-join acknowledgment is disabled, and the hardware will
automatically send the DISEC CCC when it receives the hot-join request.
Users can use the sys entry to enable it.

Signed-off-by: Billy Tsai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: master: Enable runtime PM for master controller

Enable runtime PM for i3c master node during master registration time.

Sometimes i3c client device driver may want to control the PM of the
parent (master) to perform the transactions and save the power in an
efficient way by controlling the session. Hence device can call PM
APIs by passing the parent node.

Here, I3C target device when calls pm_runtime_get_sync(dev->parent)
couldn't invoke master drivers runtime PM callback registered by
the master driver because parent's PM status was disabled in the
Master node.

Also call pm_runtime_no_callbacks() and pm_suspend_ignore_children()
for the master node to not have any callback addition and ignore the
children to have runtime PM work just locally in the driver. This
should be generic and common change for all i3c devices and should
not have any other impact.

With these changes, I3C client device works and able to invoke
master driver registered runtime PM callbacks.

Signed-off-by: Mukesh Kumar Savaliya <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: master: svc: fix invalidate IBI type and miss call client IBI handler

In an In-Band Interrupt (IBI) handle, the code logic is as follows:

1: writel(SVC_I3C_MCTRL_REQUEST_AUTO_IBI | SVC_I3C_MCTRL_IBIRESP_AUTO,
  master->regs + SVC_I3C_MCTRL);

2: ret = readl_relaxed_poll_timeout(master->regs + SVC_I3C_MSTATUS, val,
                                    SVC_I3C_MSTATUS_IBIWON(val), 0, 1000);
...
3: ibitype = SVC_I3C_MSTATUS_IBITYPE(status);
   ibiaddr = SVC_I3C_MSTATUS_IBIADDR(status);

SVC_I3C_MSTATUS_IBIWON may be set before step 1. Thus, step 2 will return
immediately, and the I3C controller has not sent out the 9th SCL yet.
Consequently, ibitype and ibiaddr are 0, resulting in an unknown IBI type
occurrence and missing call I3C client driver's IBI handler.

A typical case is that SVC_I3C_MSTATUS_IBIWON is set when an IBI occurs
during the controller send start frame in svc_i3c_master_xfer().

Clear SVC_I3C_MSTATUS_IBIWON before issue SVC_I3C_MCTRL_REQUEST_AUTO_IBI
to fix this issue.

Cc: [email protected]
Fixes: 5e5e3c92e748 ("i3c: master: svc: fix wrong data return when IBI happen during start frame")
Signed-off-by: Frank Li <[email protected]>
Reviewed-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: master: svc: change ENXIO to EAGAIN when IBI occurs during start frame

svc_i3c_master_xfer() returns error ENXIO if an In-Band Interrupt (IBI)
occurs when the host starts the frame.

Change error code to EAGAIN to inform the client driver that this
situation has occurred and to try again sometime later.

Fixes: 5e5e3c92e748 ("i3c: master: svc: fix wrong data return when IBI happen during start frame")
Signed-off-by: Frank Li <[email protected]>
Reviewed-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: Add comment for -EAGAIN in i3c_device_do_priv_xfers()

In accordance with I3C spec ver 1.1.1 09-Jun-2021, section: 5.1.2.2.3, if
a target requests hot join (HJ), In-Band Interrupt (IBI), or controller
role request (CRR) during the emission of an I3C address in
i3c_device_do_priv_xfers(), the target may win bus arbitration. In such
cases, it is imperative to notify the I3C client driver and retry
i3c_device_do_priv_xfers() after some delay.

Signed-off-by: Frank Li <[email protected]>
Reviewed-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

mm: simplify and improve print_vma_addr() output

Use '%pD' to print out the filename, and print out the actual offset
within the file too, rather than just what the virtual address of the
mapping is (which doesn't tell you anything about any mapping offsets).

Also, use the exact vma_lookup() instead of find_vma() - the latter
looks up any vma _after_ the address, which is of questionable value
(yes, maybe you fell off the beginning, but you'd be more likely to fall
off the end).

Signed-off-by: Linus Torvalds <[email protected]>

Merge local branch 'x86-codegen'

Merge trivial x86 code generation annoyances

- Introduce helper macros for clang asm input problems

- use said macros to improve trivially stupid code generation issues in
   bitops and array_index_mask_nospec

- also improve codegen with 32-bit array index comparisons

None of these really matter, but I look at code generation and profiles
fairly regularly, and these misfeatures caused the generated code to
look really odd and distract from the real issues.

* branch 'x86-codegen' of local tree:
  x86: improve bitop code generation with clang
  x86: improve array_index_mask_nospec() code generation
  clang: work around asm input constraint problems

x86: improve bitop code generation with clang

This uses the new ASM_INPUT_RM macro to avoid the bad code generation
issue that clang has with more generic asm inputs.

This ends up avoiding generating code like this:

mov    %r10,(%rsp)
tzcnt  (%rsp),%rcx

which now becomes just

tzcnt  %r10,%rcx

and in the process ends up also removing a few unnecessary stack frames
when the only use was that pointless "asm uses memory location off stack".

Signed-off-by: Linus Torvalds <[email protected]>

x86: improve array_index_mask_nospec() code generation

Don't force the inputs to be 'unsigned long', when the comparison can
easily be done in 32-bit if that's more appropriate.

Note that while we can look at the inputs to choose an appropriate size
for the compare instruction, the output is fixed at 'unsigned long'.
That's not technically optimal either, since a 32-bit 'sbbl' would often
be sufficient.

But for the outgoing mask we don't know how the mask ends up being used
(ie we have uses that have an incoming 32-bit array index, but end up
using the mask for other things).  That said, it only costs the extra
REX prefix to always generate the 64-bit mask.

[ A 'sbbl' also always technically generates a 64-bit mask, but with the
  upper 32 bits clear: that's fine for when the incoming index that will
  be masked is already 32-bit, but not if you use the mask to mask a
  pointer afterwards, like the file table lookup does ]

Cc: Peter Zijlstra <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

clang: work around asm input constraint problems

Work around clang problems with asm constraints that have multiple
possibilities, particularly "g" and "rm".

Clang seems to turn inputs like that into the most generic form, which
is the memory input - but to make matters worse, clang won't even use a
possible original memory location, but will spill the value to stack,
and use the stack for the asm input.

See

https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442

for some explanation of why clang has this strange behavior, but the end
result is that "g" and "rm" really end up generating horrid code.

Link: https://github.com/llvm/llvm-project/issues/20571
Cc: Peter Zijlstra <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'char-misc-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc and other driver subsystem updates from Greg KH:
"Here is the big set of char/misc and other driver subsystem updates
  for 6.10-rc1. Nothing major here, just lots of new drivers and updates
  for apis and new hardware types. Included in here are:

   - big IIO driver updates with more devices and drivers added

   - fpga driver updates

   - hyper-v driver updates

   - uio_pruss driver removal, no one uses it, other drivers control the
     same hardware now

   - binder minor updates

   - mhi driver updates

   - excon driver updates

   - counter driver updates

   - accessability driver updates

   - coresight driver updates

   - other hwtracing driver updates

   - nvmem driver updates

   - slimbus driver updates

   - spmi driver updates

   - other smaller misc and char driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (319 commits)
  misc: ntsync: mark driver as "broken" to prevent from building
  spmi: pmic-arb: Add multi bus support
  spmi: pmic-arb: Register controller for bus instead of arbiter
  spmi: pmic-arb: Make core resources acquiring a version operation
  spmi: pmic-arb: Make the APID init a version operation
  spmi: pmic-arb: Fix some compile warnings about members not being described
  dt-bindings: spmi: Deprecate qcom,bus-id
  dt-bindings: spmi: Add X1E80100 SPMI PMIC ARB schema
  spmi: pmic-arb: Replace three IS_ERR() calls by null pointer checks in spmi_pmic_arb_probe()
  spmi: hisi-spmi-controller: Do not override device identifier
  dt-bindings: spmi: hisilicon,hisi-spmi-controller: clean up example
  dt-bindings: spmi: hisilicon,hisi-spmi-controller: fix binding references
  spmi: make spmi_bus_type const
  extcon: adc-jack: Document missing struct members
  extcon: realtek: Remove unused of_gpio.h
  extcon: usbc-cros-ec: Convert to platform remove callback returning void
  extcon: usb-gpio: Convert to platform remove callback returning void
  extcon: max77843: Convert to platform remove callback returning void
  extcon: max3355: Convert to platform remove callback returning void
  extcon: intel-mrfld: Convert to platform remove callback returning void
  ...

Merge tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
"Here is the small set of driver core and kernfs changes for 6.10-rc1.

  Nothing major here at all, just a small set of changes for some driver
  core apis, and minor fixups. Included in here are:

   - sysfs_bin_attr_simple_read() helper added and used

   - device_show_string() helper added and used

  All usages of these were acked by the various maintainers. Also in
  here are:

   - kernfs minor cleanup

   - removed unused functions

   - typo fix in documentation

   - pay attention to sysfs_create_link() failures in module.c finally

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  device property: Fix a typo in the description of device_get_child_node_count()
  kernfs: mount: Remove unnecessary ‘NULL’ values from knparent
  scsi: Use device_show_string() helper for sysfs attributes
  platform/x86: Use device_show_string() helper for sysfs attributes
  perf: Use device_show_string() helper for sysfs attributes
  IB/qib: Use device_show_string() helper for sysfs attributes
  hwmon: Use device_show_string() helper for sysfs attributes
  driver core: Add device_show_string() helper for sysfs attributes
  treewide: Use sysfs_bin_attr_simple_read() helper
  sysfs: Add sysfs_bin_attr_simple_read() helper
  module: don't ignore sysfs_create_link() failures
  driver core: Remove unused platform_notify, platform_notify_remove

Merge tag 'staging-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver updates from Greg KH:
"Here is the big set of staging driver changes for 6.10-rc1. Not a lot
  of cleanups happening this kernel release, intern applications must be
  out of sync at the moment. But we did delete two drivers, wlan-ng and
  pi433, as they are no longer in use and the developers involved wanted
  them just gone entirely, allowing us to drop 19k lines from the tree.

  Other than the normal coding style cleanups here, there has been a lot
  of work on the vc04_services code, with the intent to finally get that
  out of staging hopefully soon. It's getting closer, which is nice to
  see.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (98 commits)
  staging: pi433: Remove unused driver
  staging: vchiq_core: Add missing blank lines
  staging: vchiq_core: Drop unnecessary blank lines
  staging: vchiq_core: Add parentheses to VCHIQ_MSG_SRCPORT
  staging: vchiq_core: Use printk messages for devices
  staging: vchiq_arm: Drop unnecessary NULL check
  staging: vc04_services: Delete unnecessary NULL check
  staging: vc04_services: vchiq_arm: Fix NULL ptr dereferences
  Staging: rtl8192e: Rename variable DssCCk
  Staging: rtl8192e: Rename variable ExtHTCapInfo
  Staging: rtl8192e: Rename variable MPDUDensity
  Staging: rtl8192e: Rename variable MaxRxAMPDUFactor
  Staging: rtl8192e: Rename variable MaxAMSDUSize
  Staging: rtl8192e: Rename variable DelayBA
  Staging: rtl8192e: Rename variable RxSTBC
  Staging: rtl8192e: Rename variable TxSTBC
  Staging: rtl8192e: Rename variable GreenField
  Staging: rtl8192e: Rename variable ShortGI20Mhz
  Staging: rtl8192e: Rename variable ShortGI40Mhz
  Staging: rtl8192e: Rename variable MimoPwrSave
  ...

Merge tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial updates from Greg KH:
"Here is the big set of tty/serial driver changes for 6.10-rc1.
  Included in here are:

   - Usual good set of api cleanups and evolution by Jiri Slaby to make
     the serial interfaces move out of the 1990's by using kfifos
     instead of hand-rolling their own logic.

   - 8250_exar driver updates

   - max3100 driver updates

   - sc16is7xx driver updates

   - exar driver updates

   - sh-sci driver updates

   - tty ldisc api addition to help refuse bindings

   - other smaller serial driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (113 commits)
  serial: Clear UPF_DEAD before calling tty_port_register_device_attr_serdev()
  serial: imx: Raise TX trigger level to 8
  serial: 8250_pnp: Simplify "line" related code
  serial: sh-sci: simplify locking when re-issuing RXDMA fails
  serial: sh-sci: let timeout timer only run when DMA is scheduled
  serial: sh-sci: describe locking requirements for invalidating RXDMA
  serial: sh-sci: protect invalidating RXDMA on shutdown
  tty: add the option to have a tty reject a new ldisc
  serial: core: Call device_set_awake_path() for console port
  dt-bindings: serial: brcm,bcm2835-aux-uart: convert to dtschema
  tty: serial: uartps: Add support for uartps controller reset
  arm64: zynqmp: Add resets property for UART nodes
  dt-bindings: serial: cdns,uart: Add optional reset property
  serial: 8250_pnp: Switch to DEFINE_SIMPLE_DEV_PM_OPS()
  serial: 8250_exar: Keep the includes sorted
  serial: 8250_exar: Make type of bit the same in exar_ee_*_bit()
  serial: 8250_exar: Use BIT() in exar_ee_read()
  serial: 8250_exar: Switch to use dev_err_probe()
  serial: 8250_exar: Return directly from switch-cases
  serial: 8250_exar: Decrease indentation level
  ...

Merge tag 'usb-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.10-rc1.
  Nothing hugely earth-shattering, just constant forward progress for
  hardware support of new devices and cleanups over the drivers.

  Included in here are:

   - Thunderbolt / USB 4 driver updates

   - typec driver updates

   - dwc3 driver updates

   - gadget driver updates

   - uss720 driver id additions and fixes (people use USB->arallel port
     devices still!)

   - onboard-hub driver rename and additions for new hardware

   - xhci driver updates

   - other small USB driver updates and additions for quirks and api
     changes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'usb-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
  drm/bridge: aux-hpd-bridge: correct devm_drm_dp_hpd_bridge_add() stub
  usb: fotg210: Add missing kernel doc description
  usb: dwc3: core: Fix unused variable warning in core driver
  usb: typec: tipd: rely on i2c_get_match_data()
  usb: typec: tipd: fix event checking for tps6598x
  usb: typec: tipd: fix event checking for tps25750
  dt-bindings: usb: qcom,dwc3: fix interrupt max items
  usb: fotg210: Use *-y instead of *-objs in Makefile
  usb: phy: tegra: Replace of_gpio.h by proper one
  usb: typec: ucsi: displayport: Fix potential deadlock
  usb: typec: qcom-pmic-typec: split HPD bridge alloc and registration
  usb: musc: Remove unused list 'buffers'
  usb: dwc3: Wait unconditionally after issuing EndXfer command
  usb: gadget: u_audio: Clear uac pointer when freed.
  usb: gadget: u_audio: Fix race condition use of controls after free during gadget unbind.
  dt-bindings: usb: dwc3: Add QDU1000 compatible
  usb: core: Remove the useless struct usb_devmap which is just a bitmap
  MAINTAINERS: Remove {ehci,uhci}-platform.c from ARM/VT8500 entry
  USB: usb_parse_endpoint: ignore reserved bits
  usb: xhci: compact 'trb_in_td()' arguments
  ...

Merge tag 'leds-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds

Pull LED updates from Lee Jones:
"Core Frameworks:
   - Ensure seldom updated triggers have a brightness value before first
     update

  New Device Support:
   - Add support for Simatic IPC Device BX_59A to IPC LEDs Core
   - Add support for Qualcomm PMI8950 PWM to LPG Core

  New Functionality:
   - Add a bunch of new LED function identifiers
   - Add support for High Resolution Timers in LED Trigger Patten

  Fix-ups:
   - Shift out Audio Trigger to the Sound subsystem
   - Convert suitable calls to devm_* managed resources
   - Device Tree binding adaptions/conversions/creation
   - Remove superfluous code/variables/attributes and simplify overall
   - Use/convert to new/better APIs/helpers/MACROs instead of
     hand-rolling implementations

  Bug Fixes:
   - Repair enabling Torch Mode from V4L2 on the second LED
   - Ensure PWM is disabled when suspending"

* tag 'leds-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (28 commits)
  leds: mt6370: Remove unused field 'reg_cfgs' from 'struct mt6370_priv'
  leds: lp50xx: Remove unused field 'num_of_banked_leds' from 'struct lp50xx'
  leds: lp50xx: Remove unused field 'bank_modules' from 'struct lp50xx_led'
  leds: aat1290: Remove unused field 'torch_brightness' from 'struct aat1290_led'
  leds: sun50i-a100: Use match_string() helper to simplify the code
  leds: pwm: Disable PWM when going to suspend
  leds: trigger: pattern: Add support for hrtimer
  leds: mt6360: Fix the second LED can not enable torch mode by V4L2
  dt-bindings: leds: leds-qcom-lpg: Add support for PMI8950 PWM
  leds: qcom-lpg: Add support for PMI8950 PWM
  leds: apu: Remove duplicate DMI lookup data
  leds: trigger: netdev: Remove not needed call to led_set_brightness in deactivate
  dt-bindings: leds: Add LED_FUNCTION_SPEED_* for link speed on LAN/WAN
  dt-bindings: leds: Add LED_FUNCTION_MOBILE for mobile network
  leds: simatic-ipc-leds-gpio: Add support for module BX-59A
  dt-bindings: leds: qcom-lpg: Document PM6150L compatible
  dt-bindings: leds: pca963x: Convert text bindings to YAML
  leds: an30259a: Use devm_mutex_init() for mutex initialization
  leds: mlxreg: Use devm_mutex_init() for mutex initialization
  leds: nic78bx: Use devm API to cleanup module's resources
  ...

Merge tag 'backlight-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
"Fix-ups:
   - FB Backlight interaction overhaul
   - Remove superfluous code and simplify overall
   - Constify various structs and struct attributes

  Bug Fixes:
   - Repair LED flickering
   - Fix signedness bugs"

* tag 'backlight-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: (42 commits)
  backlight: sky81452-backlight: Remove unnecessary call to of_node_get()
  backlight: mp3309c: Fix LEDs flickering in PWM mode
  backlight: otm3225a: Drop driver owner assignment
  backlight: lp8788: Drop support for platform data
  backlight: lcd: Make lcd_class constant
  backlight: Make backlight_class constant
  backlight: mp3309c: Fix signedness bug in mp3309c_parse_fwnode()
  const_structs.checkpatch: add lcd_ops
  fbdev: omap: lcd_ams_delta: Constify lcd_ops
  fbdev: imx: Constify lcd_ops
  fbdev: clps711x: Constify lcd_ops
  HID: picoLCD: Constify lcd_ops
  backlight: tdo24m: Constify lcd_ops
  backlight: platform_lcd: Constify lcd_ops
  backlight: otm3225a: Constify lcd_ops
  backlight: ltv350qv: Constify lcd_ops
  backlight: lms501kf03: Constify lcd_ops
  backlight: lms283gf05: Constify lcd_ops
  backlight: l4f00242t03: Constify lcd_ops
  backlight: jornada720_lcd: Constify lcd_ops
  ...

Merge tag 'mfd-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
"New Device Support:
   - Add support for X-Powers AXP717 PMIC to AXP22X
   - Add support for Rockchip RK816 PMIC to RK8XX
   - Add support for TI TPS65224 PMIC to TPS6594

  New Functionality:
   - Add Power Off functionality to Rohm BD71828
   - Allow I2C SMBus access in Renesas RSMU

  Fix-ups:
   - Device Tree binding adaptions/conversions/creation
   - Shift Intel support over to MSI interrupts
   - Generify adding platform data away from being ACPI specific
   - Use device core supplied attribute to register sysfs entries
   - Replace hand-rolled functionality with generic APIs
   - Utilise centrally provided helpers and macros
   - Clean-up error handling
   - Remove superfluous/duplicated/unused sections
   - Trivial; spelling, whitespace, coding-style adaptions
   - More Maple Tree conversions"

* tag 'mfd-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (44 commits)
  dt-bindings: mfd: Use full path to other schemas
  mfd: rsmu: support I2C SMBus access
  dt-bindings: mfd: Convert lp873x.txt to json-schema
  dt-bindings: mfd: aspeed: Drop 'oneOf' for pinctrl node
  dt-bindings: mfd: allwinner,sun6i-a31-prcm: Use hyphens in node names
  mfd: ssbi: Remove unused field 'slave' from 'struct ssbi'
  mfd: kempld: Remove custom DMI matching code
  mfd: cs42l43: Update patching revision check
  dt-bindings: mfd: qcom: pm8xxx: Add pm8901 compatible
  mfd: timberdale: Remove redundant assignment to variable err
  dt-bindings: mfd: qcom,spmi-pmic: Add pbs to SPMI device types
  dt-bindings: mfd: syscon: Add ti,am62p-cpsw-mac-efuse compatible
  dt-bindings: mfd: qcom,tcsr: Add compatible for SDX75
  mfd: axp20x: Convert to use Maple Tree register cache
  mfd: bd71828: Remove commented code lines
  mfd: intel-m10-bmc: Change staging size to a variable
  dt-bindings: mfd: Add ROHM BD71879
  mfd: Tidy Kconfig dependency's parentheses
  mfd: ocelot-spi: Use spi_sync_transfer()
  dt-bindings: mfd: syscon: Add missing simple syscon compatibles
  ...

Input: edt-ft5x06 - add support for FocalTech FT5452 and FT8719

The driver is compatible with FocalTech FT5452 and FT8719 touchscreens
too. FT5452 supports up to 5 touch points. FT8719 supports up to 10 touch
points. Add compatible data for both of them.

Signed-off-by: Joel Selvaraj <[email protected]>
Link: https://lore.kernel.org/r/20240521-add-support-ft5452-and-ft8719-touchscreen-v1-2-2a648ac7176b@gmail.com
Signed-off-by: Dmitry Torokhov <[email protected]>

dt-bindings: input: touchscreen: edt-ft5x06: Document FT5452 and FT8719 support

Document FocalTech FT5452 and FT8719 support by adding their compatibles.

Signed-off-by: Joel Selvaraj <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
Link: https://lore.kernel.org/r/20240521-add-support-ft5452-and-ft8719-touchscreen-v1-1-2a648ac7176b@gmail.com
Signed-off-by: Dmitry Torokhov <[email protected]>

blk-throttle: remove unused struct 'avg_latency_bucket'

'avg_latency_bucket' is unused since
commit bf20ab538c81 ("blk-throttle: remove
CONFIG_BLK_DEV_THROTTLING_LOW")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

io_uring: remove checks for NULL 'sq_offset'

Since the 5.12 kernel release, nobody has been passing NULL as the
sq_offset pointer. Remove the checks for it being NULL or not, it will
always be valid.

Signed-off-by: Jens Axboe <[email protected]>

Merge tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

- Add byte/half-word compare-and-exchange, emulated via LR/SC loops

- Support for Rust

- Support for Zihintpause in hwprobe

- Add PR_RISCV_SET_ICACHE_FLUSH_CTX prctl()

- Support lockless lockrefs

* tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
  riscv: defconfig: Enable CONFIG_CLK_SOPHGO_CV1800
  riscv: select ARCH_HAS_FAST_MULTIPLIER
  riscv: mm: still create swiotlb buffer for kmalloc() bouncing if required
  riscv: Annotate pgtable_l{4,5}_enabled with __ro_after_init
  riscv: Remove redundant CONFIG_64BIT from pgtable_l{4,5}_enabled
  riscv: mm: Always use an ASID to flush mm contexts
  riscv: mm: Preserve global TLB entries when switching contexts
  riscv: mm: Make asid_bits a local variable
  riscv: mm: Use a fixed layout for the MM context ID
  riscv: mm: Introduce cntx2asid/cntx2version helper macros
  riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
  riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
  riscv: mm: Combine the SMP and UP TLB flush code
  riscv: Only send remote fences when some other CPU is online
  riscv: mm: Broadcast kernel TLB flushes only when needed
  riscv: Use IPIs for remote cache/TLB flushes by default
  riscv: Factor out page table TLB synchronization
  riscv: Flush the instruction cache during SMP bringup
  riscv: hwprobe: export Zihintpause ISA extension
  riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code
  ...

Merge tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

- Select some options in Kconfig

- Give a chance to build with !CONFIG_SMP

- Switch to use built-in rustc target

- Add new supported device nodes to dts

- Some bug fixes and other small changes

- Update the default config file

* tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Update Loongson-3 default config file
  LoongArch: dts: Add new supported device nodes to Loongson-2K2000
  LoongArch: dts: Add new supported device nodes to Loongson-2K0500
  LoongArch: dts: Remove "disabled" state of clock controller node
  LoongArch: rust: Switch to use built-in rustc target
  LoongArch: Fix callchain parse error with kernel tracepoint events again
  LoongArch: Give a chance to build with !CONFIG_SMP
  LoongArch: Select THP_SWAP if HAVE_ARCH_TRANSPARENT_HUGEPAGE
  LoongArch: Select ARCH_WANT_DEFAULT_BPF_JIT
  LoongArch: Select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
  LoongArch: Select ARCH_HAS_FAST_MULTIPLIER

riscv: cpufeature: Fix extension subset checking

This loop is supposed to check if ext->subset_ext_ids[j] is valid, rather
than if ext->subset_ext_ids[i] is valid, before setting the extension
id ext->subset_ext_ids[j] in isainfo->isa.

Signed-off-by: Charlie Jenkins <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Reviewed-by: Alexandre Ghiti <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: cpufeature: Fix thead vector hwcap removal

The riscv_cpuinfo struct that contains mvendorid and marchid is not
populated until all harts are booted which happens after the DT parsing.
Use the mvendorid/marchid from the boot hart to determine if the DT
contains an invalid V.

Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
Signed-off-by: Charlie Jenkins <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Reviewed-by: Guo Ren <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

Merge tag 'microblaze-v6.10' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze updates from Michal Simek:

- Cleanup code around removed early_printk

* tag 'microblaze-v6.10' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Remove early printk call from cpuinfo-static.c
microblaze: Remove gcc flag for non existing early_printk.c file

Merge tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs updates from Miklos Szeredi:

- Add tmpfile support

- Clean up include

* tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: remove duplicate included header
  ovl: remove upper umask handling from ovl_create_upper()
  ovl: implement tmpfile

Merge tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:

- Add fs-verity support (Richard Fung)

- Add multi-queue support to virtio-fs (Peter-Jan Gootzen)

- Fix a bug in NOTIFY_RESEND handling (Hou Tao)

- page -> folio cleanup (Matthew Wilcox)

* tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  virtio-fs: add multi-queue support
  virtio-fs: limit number of request queues
  fuse: clear FR_SENT when re-adding requests into pending list
  fuse: set FR_PENDING atomically in fuse_resend()
  fuse: Add initial support for fs-verity
  fuse: Convert fuse_readpages_end() to use folio_end_read()

riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context

__kernel_map_pages() is a debug function which clears the valid bit in page
table entry for deallocated pages to detect illegal memory accesses to
freed pages.

This function set/clear the valid bit using __set_memory(). __set_memory()
acquires init_mm's semaphore, and this operation may sleep. This is
problematic, because __kernel_map_pages() can be called in atomic context,
and thus is illegal to sleep. An example warning that this causes:

BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
preempt_count: 2, expected: 0
CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff800060dc>] dump_backtrace+0x1c/0x24
[<ffffffff8091ef6e>] show_stack+0x2c/0x38
[<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
[<ffffffff8092bb24>] dump_stack+0x14/0x1c
[<ffffffff8003b7ac>] __might_resched+0x104/0x10e
[<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
[<ffffffff8093276a>] down_write+0x20/0x72
[<ffffffff8000cf00>] __set_memory+0x82/0x2fa
[<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
[<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
[<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
[<ffffffff80011904>] copy_process+0x72c/0x17ec
[<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
[<ffffffff80012f62>] kernel_thread+0x82/0xa0
[<ffffffff8003552c>] kthreadd+0x14a/0x1be
[<ffffffff809357de>] ret_from_fork+0xe/0x1c

Rewrite this function with apply_to_existing_page_range(). It is fine to
not have any locking, because __kernel_map_pages() works with pages being
allocated/deallocated and those pages are not changed by anyone else in the
meantime.

Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <[email protected]>
Cc: [email protected]
Reviewed-by: Alexandre Ghiti <[email protected]>
Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled

debug_pagealloc is a debug feature which clears the valid bit in page table
entry for freed pages to detect illegal accesses to freed memory.

For this feature to work, virtual mapping must have PAGE_SIZE resolution.
(No, we cannot map with huge pages and split them only when needed; because
pages can be allocated/freed in atomic context and page splitting cannot be
done in atomic context)

Force linear mapping to use small pages if debug_pagealloc is enabled.

Note that it is not necessary to force the entire linear mapping, but only
those that are given to memory allocator. Some parts of memory can keep
using huge page mapping (for example, kernel's executable code). But these
parts are minority, so keep it simple. This is just a debug feature, some
extra overhead should be acceptable.

Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <[email protected]>
Cc: [email protected]
Reviewed-by: Alexandre Ghiti <[email protected]>
Link: https://lore.kernel.org/r/2e391fa6c6f9b3fcf1b41cefbace02ee4ab4bf59.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <[email protected]>

drm/amdgpu/atomfirmware: add intergrated info v2.3 table

[Why]
The vram width value is 0.
Because the integratedsysteminfo table in VBIOS has updated to 2.3.

[How]
Driver needs a new intergrated info v2.3 table too.
Then the vram width value will be correct.

Signed-off-by: Li Ma <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

vfs: Delete the associated dentry when deleting a file

Our applications, built on Elasticsearch[0], frequently create and
delete files.  These applications operate within containers, some with a
memory limit exceeding 100GB.  Over prolonged periods, the accumulation
of negative dentries within these containers can amount to tens of
gigabytes.

Upon container exit, directories are deleted.  However, due to the
numerous associated dentries, this process can be time-consuming.  Our
users have expressed frustration with this prolonged exit duration,
which constitutes our first issue.

Simultaneously, other processes may attempt to access the parent
directory of the Elasticsearch directories.  Since the task responsible
for deleting the dentries holds the inode lock, processes attempting
directory lookup experience significant delays.  This issue, our second
problem, is easily demonstrated:

  - Task 1 generates negative dentries:
  $ pwd
  ~/test
  $ mkdir es && cd es/ && ./create_and_delete_files.sh

  [ After generating tens of GB dentries ]

  $ cd ~/test && rm -rf es

  [ It will take a long duration to finish ]

  - Task 2 attempts to lookup the 'test/' directory
  $ pwd
  ~/test
  $ ls

  The 'ls' command in Task 2 experiences prolonged execution as Task 1
  is deleting the dentries.

We've devised a solution to address both issues by deleting associated
dentry when removing a file.  Interestingly, we've noted that a similar
patch was proposed years ago[1], although it was rejected citing the
absence of tangible issues caused by negative dentries.  Given our
current challenges, we're resubmitting the proposal.  All relevant
stakeholders from previous discussions have been included for reference.

Some alternative solutions are also under discussion[2][3], such as
shrinking child dentries outside of the parent inode lock or even
asynchronously shrinking child dentries.  However, given the
straightforward nature of the current solution, I believe this approach
is still necessary.

[ NOTE! This is a pretty fundamental change in how we deal with
  unlinking dentries, and it doesn't change the fact that you can have
  lots of negative dentries from just doing negative lookups.

  But the kernel test robot is at least initially happy with this from a
  performance angle, so I'm applying this ASAP just to get more testing
  and as a "known fix for an issue people hit in real life".

  Put another way: we should still look at the alternatives, and this
  patch may get reverted if somebody finds a performance regression on
  some other load.       - Linus ]

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Yafang Shao <[email protected]>
Link: https://github.com/elastic/elasticsearch
Link: https://patchwork.kernel.org/project/linux-fsdevel/patch/[email protected]
Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/
Cc: Al Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Wangkai <[email protected]>
Cc: Colin Walters <[email protected]>
Tested-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Linus Torvalds <[email protected]>

drm/panel-edp: Add CMN N116BCJ-EAK

Add support for the CMN N116BCJ-EAK, place the raw EDID here for
subsequent reference.
00 ff ff ff ff ff ff 00 0d ae 60 11 00 00 00 00
04 22 01 04 95 1a 0e 78 02 67 75 98 59 53 90 27
1c 50 54 00 00 00 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 da 1d 56 e2 50 00 20 30 30 20
a6 00 00 90 10 00 00 18 00 00 00 fe 00 4e 31 31
36 42 43 4a 2d 45 41 4b 0a 20 00 00 00 fe 00 43
4d 4e 0a 20 20 20 20 20 20 20 20 20 00 00 00 fe
00 4e 31 31 36 42 43 4a 2d 45 41 4b 0a 20 00 98

Signed-off-by: Haikun Zhou <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522113924.1261683-1-zhouhaikun5@huaqin.corp-partner.google.com

virtio-pci: Check if is_avq is NULL

[bug]
In the virtio_pci_common.c function vp_del_vqs, vp_dev->is_avq is involved
to determine whether it is admin virtqueue, but this function vp_dev->is_avq
may be empty. For installations, virtio_pci_legacy does not assign a value
to vp_dev->is_avq.

[fix]
Check whether it is vp_dev->is_avq before use.

[test]
Test with virsh Attach device
Before this patch, the following command would crash the guest system

After applying the patch, everything seems to be working fine.

Signed-off-by: Li Zhang <[email protected]>
Message-Id: <1710566754 [email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

Merge tag 'stable/vduse-virtio-net' into vhost

This adds support for virtio-net to vduse.

Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: delete vq in vp_find_vqs_msix() when request_irq() fails

When request_irq() fails, error path calls vp_del_vqs(). There, as vq is
present in the list, free_irq() is called for the same vector. That
causes following splat:

[    0.414355] Trying to free already-free IRQ 27
[    0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0
[    0.414510] Modules linked in:
[    0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27
[    0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
[    0.414540] RIP: 0010:free_irq+0x1a1/0x2d0
[    0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40
[    0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086
[    0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000
[    0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001
[    0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001
[    0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760
[    0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600
[    0.414540] FS:  0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000
[    0.414540] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0
[    0.414540] Call Trace:
[    0.414540]  <TASK>
[    0.414540]  ? __warn+0x80/0x120
[    0.414540]  ? free_irq+0x1a1/0x2d0
[    0.414540]  ? report_bug+0x164/0x190
[    0.414540]  ? handle_bug+0x3b/0x70
[    0.414540]  ? exc_invalid_op+0x17/0x70
[    0.414540]  ? asm_exc_invalid_op+0x1a/0x20
[    0.414540]  ? free_irq+0x1a1/0x2d0
[    0.414540]  vp_del_vqs+0xc1/0x220
[    0.414540]  vp_find_vqs_msix+0x305/0x470
[    0.414540]  vp_find_vqs+0x3e/0x1a0
[    0.414540]  vp_modern_find_vqs+0x1b/0x70
[    0.414540]  init_vqs+0x387/0x600
[    0.414540]  virtnet_probe+0x50a/0xc80
[    0.414540]  virtio_dev_probe+0x1e0/0x2b0
[    0.414540]  really_probe+0xc0/0x2c0
[    0.414540]  ? __pfx___driver_attach+0x10/0x10
[    0.414540]  __driver_probe_device+0x73/0x120
[    0.414540]  driver_probe_device+0x1f/0xe0
[    0.414540]  __driver_attach+0x88/0x180
[    0.414540]  bus_for_each_dev+0x85/0xd0
[    0.414540]  bus_add_driver+0xec/0x1f0
[    0.414540]  driver_register+0x59/0x100
[    0.414540]  ? __pfx_virtio_net_driver_init+0x10/0x10
[    0.414540]  virtio_net_driver_init+0x90/0xb0
[    0.414540]  do_one_initcall+0x58/0x230
[    0.414540]  kernel_init_freeable+0x1a3/0x2d0
[    0.414540]  ? __pfx_kernel_init+0x10/0x10
[    0.414540]  kernel_init+0x1a/0x1c0
[    0.414540]  ret_from_fork+0x31/0x50
[    0.414540]  ? __pfx_kernel_init+0x10/0x10
[    0.414540]  ret_from_fork_asm+0x1a/0x30
[    0.414540]  </TASK>

Fix this by calling deleting the current vq when request_irq() fails.

Fixes: 0b0f9dc52ed0 ("Revert "virtio_pci: use shared interrupts for virtqueues"")
Signed-off-by: Jiri Pirko <[email protected]>
Message-Id: <20240426150845.3999481 [email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

MAINTAINERS: add Eugenio Pérez as reviewer

Add myself as a reviewer of some VirtIO areas I'm interested.

Until this point I've been scanning manually the list looking for
series that touches this area. Adding myself to make this task easier.

Signed-off-by: Eugenio Pérez <[email protected]>
Message-Id: <20240213182450 [email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost-vdpa: Remove usage of the deprecated ida_simple_xx() API

ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().

Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_max() is inclusive. So a -1 has been added when needed.

Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Message-Id: <67c2edf49788c27d5f7a49fc701520b9fcf739b5.1713088999 [email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>

vp_vdpa: don't allocate unused msix vectors

When there is a ctlq and it doesn't require interrupt
callbacks,the original method of calculating vectors
wastes hardware msi or msix resources as well as system
IRQ resources.

When conducting performance testing using testpmd in the
guest os, it was found that the performance was lower compared
to directly using vfio-pci to passthrough the device

In scenarios where the virtio device in the guest os does
not utilize interrupts, the vdpa driver still configures
the hardware's msix vector. Therefore, the hardware still
sends interrupts to the host os. Because of this unnecessary
action by the hardware, hardware performance decreases, and
it also affects the performance of the host os.

Before modification:(interrupt mode)
32:  0   0  0  0 PCI-MSI 32768-edge    vp-vdpa[0000:00:02.0]-0
33:  0   0  0  0 PCI-MSI 32769-edge    vp-vdpa[0000:00:02.0]-1
34:  0   0  0  0 PCI-MSI 32770-edge    vp-vdpa[0000:00:02.0]-2
35:  0   0  0  0 PCI-MSI 32771-edge    vp-vdpa[0000:00:02.0]-config

After modification:(interrupt mode)
32:  0  0  1  7   PCI-MSI 32768-edge  vp-vdpa[0000:00:02.0]-0
33: 36  0  3  0   PCI-MSI 32769-edge  vp-vdpa[0000:00:02.0]-1
34:  0  0  0  0   PCI-MSI 32770-edge  vp-vdpa[0000:00:02.0]-config

Before modification:(virtio pmd mode for guest os)
32:  0   0  0  0 PCI-MSI 32768-edge    vp-vdpa[0000:00:02.0]-0
33:  0   0  0  0 PCI-MSI 32769-edge    vp-vdpa[0000:00:02.0]-1
34:  0   0  0  0 PCI-MSI 32770-edge    vp-vdpa[0000:00:02.0]-2
35:  0   0  0  0 PCI-MSI 32771-edge    vp-vdpa[0000:00:02.0]-config

After modification:(virtio pmd mode for guest os)
32: 0  0  0   0   PCI-MSI 32768-edge   vp-vdpa[0000:00:02.0]-config

To verify the use of the virtio PMD mode in the guest operating
system, the following patch needs to be applied to QEMU:
https://lore.kernel.org/all/20240408073311 [email protected]

Signed-off-by: Yuxue Liu <[email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Message-Id: <20240410033020 [email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

sound: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-25-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Anton Yakovlev <[email protected]>

fuse: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-24-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>

scsi: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-23-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Acked-by: Martin K. Petersen <[email protected]>

rpmsg: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Reviewed-by: Mathieu Poirier <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-22-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

nvdimm: virtio_pmem: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Acked-by: Dave Jiang <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-21-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]
Reviewed-by: Pankaj Gupta <[email protected]>

wifi: mac80211_hwsim: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-20-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vsock/virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Acked-by: Stefano Garzarella <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-19-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>

net: 9p: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-18-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

net: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-17-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

net: caif: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-16-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

misc: nsm: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-15-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Alexander Graf <[email protected]>

iommu: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-14-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

drm/virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-13-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

gpio: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Acked-by: Bartosz Golaszewski <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-12-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Linus Walleij <[email protected]>

firmware: arm_scmi: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-11-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Sudeep Holla <[email protected]>

crypto: virtio - drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-10-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Herbert Xu <[email protected]>

virtio_console: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-9-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>

hwrng: virtio: drop owner assignment

virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Message-Id: <20240331-module-owner-virtio-v2-8-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <[email protected]>