Git Repo - linux.git/log

Merge tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Vasily Gorbik:

- Fix KMSAN build breakage caused by the conflict between s390 and
   mm-stable trees

- Add KMSAN page markers for ptdump

- Add runtime constant support

- Fix __pa/__va for modules under non-GPL licenses by exporting
   necessary vm_layout struct with EXPORT_SYMBOL to prevent linkage
   problems

- Fix an endless loop in the CF_DIAG event stop in the CPU Measurement
   Counter Facility code when the counter set size is zero

- Remove the PROTECTED_VIRTUALIZATION_GUEST config option and enable
   its functionality by default

- Support allocation of multiple MSI interrupts per device and improve
   logging of architecture-specific limitations

- Add support for lowcore relocation as a debugging feature to catch
   all null ptr dereferences in the kernel address space, improving
   detection beyond the current implementation's limited write access
   protection

- Clean up and rework CPU alternatives to allow for callbacks and early
   patching for the lowcore relocation

* tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits)
  s390: Remove protvirt and kvm config guards for uv code
  s390/boot: Add cmdline option to relocate lowcore
  s390/kdump: Make kdump ready for lowcore relocation
  s390/entry: Make system_call() ready for lowcore relocation
  s390/entry: Make ret_from_fork() ready for lowcore relocation
  s390/entry: Make __switch_to() ready for lowcore relocation
  s390/entry: Make restart_int_handler() ready for lowcore relocation
  s390/entry: Make mchk_int_handler() ready for lowcore relocation
  s390/entry: Make int handlers ready for lowcore relocation
  s390/entry: Make pgm_check_handler() ready for lowcore relocation
  s390/entry: Add base register to CHECK_VMAP_STACK/CHECK_STACK macro
  s390/entry: Add base register to SIEEXIT macro
  s390/entry: Add base register to MBEAR macro
  s390/entry: Make __sie64a() ready for lowcore relocation
  s390/head64: Make startup code ready for lowcore relocation
  s390: Add infrastructure to patch lowcore accesses
  s390/atomic_ops: Disable flag outputs constraint for GCC versions below 14.2.0
  s390/entry: Move SIE indicator flag to thread info
  s390/nmi: Simplify ptregs setup
  s390/alternatives: Remove alternative facility list
  ...

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"The usual summary below, but the main fix is for the fast GUP lockless
  page-table walk when we have a combination of compile-time and
  run-time folding of the p4d and the pud respectively.

   - Remove some redundant Kconfig conditionals

   - Fix string output in ptrace selftest

   - Fix fast GUP crashes in some page-table configurations

   - Remove obsolete linker option when building the vDSO

   - Fix some sysreg field definitions for the GIC"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: Fix lockless walks with static and dynamic page-table folding
  arm64/sysreg: Correct the values for GICv4.1
  arm64/vdso: Remove --hash-style=sysv
  kselftest: missing arg in ptrace.c
  arm64/Kconfig: Remove redundant 'if HAVE_FUNCTION_GRAPH_TRACER'
  arm64: remove redundant 'if HAVE_ARCH_KASAN' in Kconfig

Merge tag 'ceph-for-6.11-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"A small patchset to address bogus I/O errors and ultimately an
  assertion failure in the face of watch errors with -o exclusive
  mappings in RBD marked for stable and some assorted CephFS fixes"

* tag 'ceph-for-6.11-rc1' of https://github.com/ceph/ceph-client:
  rbd: don't assume rbd_is_lock_owner() for exclusive mappings
  rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings
  rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
  ceph: fix incorrect kmalloc size of pagevec mempool
  ceph: periodically flush the cap releases
  ceph: convert comma to semicolon in __ceph_dentry_dir_lease_touch()
  ceph: use cap_wait_list only if debugfs is enabled

Merge tag 'erofs-for-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull more erofs updates from Gao Xiang:

- Support STATX_DIOALIGN and FS_IOC_GETFSSYSFSPATH

- Fix a race of LZ4 decompression due to recent refactoring

- Another multi-page folio adaptation in erofs_bread()

* tag 'erofs-for-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: convert comma to semicolon
  erofs: support multi-page folios for erofs_bread()
  erofs: add support for FS_IOC_GETFSSYSFSPATH
  erofs: fix race in z_erofs_get_gbuf()
  erofs: support STATX_DIOALIGN

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull struct file leak fixes from Al Viro:
"a couple of leaks on failure exits missing fdput()"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  lirc: rc_dev_get_from_fd(): fix file leak
  powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()

arm64: allow installing compressed image by default

On arm64 we build compressed images, but "make install" by default will
install the old non-compressed one.  To actually get the compressed
image installed, you need to use "make zinstall", which is not the usual
way to install a kernel.

Which may not sound like much of an issue, but when you deal with
multiple architectures (and years of your fingers knowing the regular
"make install" incantation), this inconsistency is pretty annoying.

But as Will Deacon says:
"Sadly, bootloaders being as top quality as you might expect, I don't
  think we're in a position to rely on decompressor support across the
  board. Our Image.gz is literally just that -- we don't have a built-in
  decompressor (nor do I think we want to rush into that again after the
  fun we had on arm32) and the recent EFI zboot support solves that
  problem for platforms using EFI.

  Changing the default 'install' target terrifies me. There are bound to
  be folks with embedded boards who've scripted this and we could really
  ruin their day if we quietly give them a compressed kernel that their
  bootloader doesn't know how to handle :/"

So make this conditional on a new "COMPRESSED_INSTALL" option.

Cc: Catalin Marinas <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux

Pull bitmap updates from Yury Norov:
"Random fixes"

* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
  riscv: Remove unnecessary int cast in variable_fls()
  radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
  bitops: Add a comment explaining the double underscore macros
  lib: bitmap: add missing MODULE_DESCRIPTION() macros
  cpumask: introduce assign_cpu() macro

erofs: convert comma to semicolon

Replace a comma between expression statements by a semicolon.

Signed-off-by: Chen Ni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>

erofs: support multi-page folios for erofs_bread()

If the requested page is part of the previous multi-page folio, there
is no need to call read_mapping_folio() again.

Also, get rid of the one remaining use of page->index [1] in our codebase.

[1] https://lore.kernel.org/r/[email protected]

Cc: Matthew Wilcox <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

erofs: add support for FS_IOC_GETFSSYSFSPATH

The FS_IOC_GETFSSYSFSPATH ioctl exposes the /sys/fs path of a given
filesystem, potentially standardizing sysfs reporting. This patch adds
support for FS_IOC_GETFSSYSFSPATH to erofs: "erofs/<dev>" is output for bdev
cases, and "erofs/[domain_id,]<fs_id>" is output for fscache cases.

Signed-off-by: Huang Xiaojia <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>

erofs: fix race in z_erofs_get_gbuf()

In z_erofs_get_gbuf(), the current task may be migrated to another
CPU between `z_erofs_gbuf_id()` and `spin_lock(&gbuf->lock)`.

Therefore, z_erofs_put_gbuf() will trigger the following issue
which was found by stress test:

<2>[772156.434168] kernel BUG at fs/erofs/zutil.c:58!
..
<4>[772156.435007]
<4>[772156.439237] CPU: 0 PID: 3078 Comm: stress Kdump: loaded Tainted: G            E      6.10.0-rc7+ #2
<4>[772156.439239] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 1.0.0 01/01/2017
<4>[772156.439241] pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
<4>[772156.439243] pc : z_erofs_put_gbuf+0x64/0x70 [erofs]
<4>[772156.439252] lr : z_erofs_lz4_decompress+0x600/0x6a0 [erofs]
..
<6>[772156.445958] stress (3127): drop_caches: 1
<4>[772156.446120] Call trace:
<4>[772156.446121]  z_erofs_put_gbuf+0x64/0x70 [erofs]
<4>[772156.446761]  z_erofs_lz4_decompress+0x600/0x6a0 [erofs]
<4>[772156.446897]  z_erofs_decompress_queue+0x740/0xa10 [erofs]
<4>[772156.447036]  z_erofs_runqueue+0x428/0x8c0 [erofs]
<4>[772156.447160]  z_erofs_readahead+0x224/0x390 [erofs]
..

Fixes: f36f3010f676 ("erofs: rename per-CPU buffers to global buffer pool and make it configurable")
Cc: <[email protected]> # 6.10+
Reviewed-by: Chunhai Guo <[email protected]>
Reviewed-by: Sandeep Dhavale <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

erofs: support STATX_DIOALIGN

Add support for STATX_DIOALIGN to EROFS, so that direct I/O
alignment restrictions are exposed to userspace in a generic
way.

[Before]
```
./statx_test /mnt/erofs/testfile
statx(/mnt/erofs/testfile) = 0
dio mem align:0
dio offset align:0
```

[After]
```
./statx_test /mnt/erofs/testfile
statx(/mnt/erofs/testfile) = 0
dio mem align:512
dio offset align:512
```
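
The statx_test tool shown above is not included in the commit message. A minimal sketch of how userspace might issue the same query, assuming a libc that provides statx() and kernel headers that define STATX_DIOALIGN (Linux 6.1 or later); the helper name query_dio_align() is hypothetical:

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* AT_FDCWD */
#include <stdint.h>
#include <sys/stat.h>   /* statx(), struct statx, STATX_DIOALIGN */

/*
 * Hypothetical helper, not taken from the patch.
 * Returns 1 and fills both outputs if the filesystem reports direct I/O
 * alignment, 0 if the call succeeded but nothing was reported, and -1 on
 * error or when the headers predate STATX_DIOALIGN.
 */
int query_dio_align(const char *path, uint32_t *mem_align,
		    uint32_t *offset_align)
{
#ifdef STATX_DIOALIGN
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_DIOALIGN, &stx) != 0)
		return -1;

	/* The kernel sets the mask bit only when the fields are valid. */
	if (!(stx.stx_mask & STATX_DIOALIGN))
		return 0;

	*mem_align = stx.stx_dio_mem_align;
	*offset_align = stx.stx_dio_offset_align;
	return 1;
#else
	(void)path;
	(void)mem_align;
	(void)offset_align;
	return -1;
#endif
}
```

A caller would print the two values much like the "dio mem align" and "dio offset align" lines in the output above.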

Signed-off-by: Hongbo Li <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.

  A lot of networking people were at a conference last week, busy
  catching COVID, so relatively short PR.

  Current release - regressions:

   - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP

  Current release - new code bugs:

   - l2tp: protect session IDR and tunnel session list with one lock,
     make sure the state is coherent to avoid a warning

   - eth: bnxt_en: update xdp_rxq_info in queue restart logic

   - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field

  Previous releases - regressions:

   - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
     the field reuses previously un-validated pad

  Previous releases - always broken:

   - tap/tun: drop short frames to prevent crashes later in the stack

   - eth: ice: add a per-VF limit on number of FDIR filters

   - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash"

* tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
  tun: add missing verification for short frame
  tap: add missing verification for short frame
  mISDN: Fix a use after free in hfcmulti_tx()
  gve: Fix an edge case for TSO skb validity check
  bnxt_en: update xdp_rxq_info in queue restart logic
  tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
  xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
  bpf: Fix a segment issue when downgrading gso_size
  net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
  MAINTAINERS: make Breno the netconsole maintainer
  MAINTAINERS: Update bonding entry
  net: nexthop: Initialize all fields in dumped nexthops
  net: stmmac: Correct byte order of perfect_match
  selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
  tipc: Return non-zero value from tipc_udp_addr2str() on error
  netfilter: nft_set_pipapo_avx2: disable softinterrupts
  ice: Fix recipe read procedure
  ice: Add a per-VF limit on number of FDIR filters
  net: bonding: correctly annotate RCU in bond_should_notify_peers()
  ...

Merge tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

- trivial printk changes

The bigger "real" printk work is still being discussed.

* tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: add missing MODULE_DESCRIPTION() macro
  printk: Rename console_replay_all() and update context

Merge tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl constification from Joel Granados:
"Treewide constification of the ctl_table argument of proc_handlers
  using a coccinelle script and some manual code formatting fixups.

  This is a prerequisite to moving the static ctl_table structs into
  read-only data section which will ensure that proc_handler function
  pointers cannot be modified"

* tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: treewide: constify the ctl_table argument of proc_handlers

Merge tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

- Wipe screen_info after allocating it from the heap - used by arm32
   and EFI zboot, other EFI architectures allocate it statically

- Revert to allocating boot_params from the heap on x86 when entering
   via the native PE entrypoint, to work around a regression on older
   Dell hardware

* tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efistub: Revert to heap allocated boot_params for PE entrypoint
  efi/libstub: Zero initialize heap allocated struct screen_info

Merge tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb updates from Daniel Thompson:
"Three small changes this cycle:

   - Clean up an architecture abstraction that is no longer needed
     because all the architectures have converged.

   - Actually use the prompt argument to kdb_position_cursor() instead
     of ignoring it (functionally this fix is a nop but that was due to
     luck rather than good judgement)

   - Fix a -Wformat-security warning"

* tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kdb: Get rid of redundant kdb_curr_task()
  kdb: Use the passed prompt in kdb_position_cursor()
  kdb: address -Wformat-security warnings

Merge tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

- Use improved timer sync for Loongson64

- Fix address of GCR_ACCESS register

- Add missing MODULE_DESCRIPTION

* tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: sibyte: add missing MODULE_DESCRIPTION() macro
  MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later
  MIPS: Loongson64: Switch to SYNC_R4K

Merge tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:
"The gettimeofday() and clock_gettime() syscalls are now available as
  vDSO functions, and Dave added a patch which allows NVMe cards to be
  used in the PCI slots as a fast and easy alternative to SCSI discs.

  Summary:

   - add gettimeofday() and clock_gettime() vDSO functions

   - enable PCI_MSI_ARCH_FALLBACKS to allow PCI to PCIe bridge adaptor
     with PCIe NVME card to function in parisc machines

   - allow users to reduce kernel unaligned runtime warnings

   - minor code cleanups"

* tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Add support for CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
  parisc: Use max() to calculate parisc_tlb_flush_threshold
  parisc: Fix warning at drivers/pci/msi/msi.h:121
  parisc: Add 64-bit gettimeofday() and clock_gettime() vDSO functions
  parisc: Add 32-bit gettimeofday() and clock_gettime() vDSO functions
  parisc: Clean up unistd.h file

Merge tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML updates from Richard Weinberger:

- Support for preemption

- i386 Rust support

- Huge cleanup by Benjamin Berg

- UBSAN support

- Removal of dead code

* tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits)
  um: vector: always reset vp->opened
  um: vector: remove vp->lock
  um: register power-off handler
  um: line: always fill *error_out in setup_one_line()
  um: remove pcap driver from documentation
  um: Enable preemption in UML
  um: refactor TLB update handling
  um: simplify and consolidate TLB updates
  um: remove force_flush_all from fork_handler
  um: Do not flush MM in flush_thread
  um: Delay flushing syscalls until the thread is restarted
  um: remove copy_context_skas0
  um: remove LDT support
  um: compress memory related stub syscalls while adding them
  um: Rework syscall handling
  um: Add generic stub_syscall6 function
  um: Create signal stack memory assignment in stub_data
  um: Remove stub-data.h include from common-offsets.h
  um: time-travel: fix signal blocking race/hang
  um: time-travel: remove time_exit()
  ...

Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
"Here is the big set of driver core changes for 6.11-rc1.

  Lots of stuff in here, with not a huge diffstat, but apis are evolving
  which required lots of files to be touched. Highlights of the changes
  in here are:

   - platform remove callback api final fixups (Uwe took many releases
     to get here, finally!)

   - Rust bindings for basic firmware apis and initial driver-core
     interactions.

     It's not all that useful for a "write a whole driver in rust" type
     of thing, but the firmware bindings do help out the phy rust
     drivers, and the driver core bindings give a solid base on which
     others can start their work.

     There is still a long way to go here before we have a multitude of
     rust drivers being added, but it's a great first step.

   - driver core const api changes.

     This reached across all bus types, and there are some fix-ups for
     some not-common bus types that linux-next and 0-day testing shook
     out.

     This work is being done to help make the rust bindings more safe,
     as well as the C code, moving toward the end-goal of allowing us to
     put driver structures into read-only memory. We aren't there yet,
     but are getting closer.

   - minor devres cleanups and fixes found by code inspection

   - arch_topology minor changes

   - other minor driver core cleanups

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  ARM: sa1100: make match function take a const pointer
  sysfs/cpu: Make crash_hotplug attribute world-readable
  dio: Have dio_bus_match() callback take a const *
  zorro: make match function take a const pointer
  driver core: module: make module_[add|remove]_driver take a const *
  driver core: make driver_find_device() take a const *
  driver core: make driver_[create|remove]_file take a const *
  firmware_loader: fix soundness issue in `request_internal`
  firmware_loader: annotate doctests as `no_run`
  devres: Correct code style for functions that return a pointer type
  devres: Initialize an uninitialized struct member
  devres: Fix memory leakage caused by driver API devm_free_percpu()
  devres: Fix devm_krealloc() wasting memory
  driver core: platform: Switch to use kmemdup_array()
  driver core: have match() callback in struct bus_type take a const *
  MAINTAINERS: add Rust device abstractions to DRIVER CORE
  device: rust: improve safety comments
  MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
  MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
  firmware: rust: improve safety comments
  ...

Merge tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

- make watchdog_class const

- rework of the rzg2l_wdt driver

- other small fixes and improvements

* tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog:
  dt-bindings: watchdog: dlg,da9062-watchdog: Drop blank space
  watchdog: rzn1: Convert comma to semicolon
  watchdog: lenovo_se10_wdt: Convert comma to semicolon
  dt-bindings: watchdog: renesas,wdt: Document RZ/G3S support
  watchdog: rzg2l_wdt: Add suspend/resume support
  watchdog: rzg2l_wdt: Rely on the reset driver for doing proper reset
  watchdog: rzg2l_wdt: Remove comparison with zero
  watchdog: rzg2l_wdt: Remove reset de-assert from probe
  watchdog: rzg2l_wdt: Check return status of pm_runtime_put()
  watchdog: rzg2l_wdt: Use pm_runtime_resume_and_get()
  watchdog: rzg2l_wdt: Make the driver depend on PM
  watchdog: rzg2l_wdt: Restrict the driver to ARCH_RZG2L and ARCH_R9A09G011
  watchdog: imx7ulp_wdt: keep already running watchdog enabled
  watchdog: starfive: Add missing clk_disable_unprepare()
  watchdog: Make watchdog_class const

Merge tag 'dma-mapping-6.11-2024-07-24' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

- fix the order of actions in dmam_free_coherent (Lance Richardson)

* tag 'dma-mapping-6.11-2024-07-24' of git://git.infradead.org/users/hch/dma-mapping:
  dma: fix call order in dmam_free_coherent

Merge branch 'tap-tun-harden-by-dropping-short-frame'

Dongli Zhang says:

====================
tap/tun: harden by dropping short frame

This series hardens all of tap/tun by dropping any frame shorter than the
Ethernet header (ETH_HLEN).

While xen-netback already rejects frames shorter than ETH_HLEN ...

static void xenvif_tx_build_gops(struct xenvif_queue *queue,
                                 int budget,
                                 unsigned *copy_ops,
                                 unsigned *map_ops)
{
... ...
                if (unlikely(txreq.size < ETH_HLEN)) {
                        netdev_dbg(queue->vif->dev,
                                   "Bad packet size: %d\n", txreq.size);
                        xenvif_tx_err(queue, &txreq, extra_count, idx);
                        break;
                }

... the short frame may not be dropped by vhost-net/tap/tun.

This fixes CVE-2024-41090 and CVE-2024-41091.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tun: add missing verification for short frame

The cited commit failed to validate the frame length
in the tun_xdp_one() path, which could cause a corrupted skb to be sent
downstack. Even before the skb is transmitted,
tun_xdp_one-->eth_type_trans() may access the Ethernet header although the
frame can be shorter than ETH_HLEN. Once transmitted, this could either cause
out-of-bounds access beyond the actual length, or confuse the lower layers
with an incorrect or inconsistent header length in the skb metadata.

In the alternative path, tun_get_user() already prohibits frames shorter
than the Ethernet header size from being transmitted for
IFF_TAP.

Drop any frame shorter than the Ethernet header size, just as
tun_get_user() does.
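
As a rough illustration of the length check described here, the following is a userspace sketch and not the kernel patch itself. The function name check_frame_len() and the choice of -EINVAL as the error value are assumptions made for the example; ETH_HLEN is the 14-byte Ethernet header length.

```c
#include <errno.h>
#include <stddef.h>

/* 6-byte destination + 6-byte source + 2-byte EtherType */
#define ETH_HLEN 14

/*
 * Hypothetical stand-in for the validation step: reject a frame before any
 * code tries to parse an Ethernet header that is not fully present.
 */
int check_frame_len(size_t frame_len)
{
	if (frame_len < ETH_HLEN)
		return -EINVAL;	/* too short to hold the header: drop */

	return 0;		/* header fits, safe to parse further */
}
```

The point of checking first is that later helpers can assume at least ETH_HLEN bytes are readable, so no consumer needs to repeat the test.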

CVE: CVE-2024-41091
Inspired-by: https://lore.kernel.org/netdev/[email protected]/
Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
Cc: [email protected]
Signed-off-by: Dongli Zhang <[email protected]>
Reviewed-by: Si-Wei Liu <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Paolo Abeni <[email protected]>
Reviewed-by: Jason Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tap: add missing verification for short frame

The cited commit failed to validate the frame length
in the tap_get_user_xdp() path, which could cause a corrupted skb to be
sent downstack. Even before the skb is transmitted,
tap_get_user_xdp()-->skb_set_network_header() may assume the size is more
than ETH_HLEN. Once transmitted, this could either cause out-of-bounds
access beyond the actual length, or confuse the lower layers with an incorrect
or inconsistent header length in the skb metadata.

In the alternative path, tap_get_user() already prohibits frames shorter
than the Ethernet header size from being transmitted.

Drop any frame shorter than the Ethernet header size, just as
tap_get_user() does.

CVE: CVE-2024-41090
Link: https://lore.kernel.org/netdev/[email protected]/
Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
Cc: [email protected]
Signed-off-by: Si-Wei Liu <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Paolo Abeni <[email protected]>
Reviewed-by: Jason Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

mISDN: Fix a use after free in hfcmulti_tx()

Don't dereference *sp after calling dev_kfree_skb(*sp).
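
The general pattern can be sketched in plain userspace C, with malloc()/free() standing in for skb allocation and dev_kfree_skb(). The struct pkt type and release_and_count() are hypothetical names for illustration and do not reflect the driver's actual code:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet object standing in for an skb. */
struct pkt {
	size_t len;
	unsigned char *data;
};

/*
 * Free *pp and return the length it held. The length is copied into a
 * local while the object is still alive; reading (*pp)->len after the
 * free() calls would be a use after free.
 */
size_t release_and_count(struct pkt **pp)
{
	size_t len = (*pp)->len;	/* read before freeing */

	free((*pp)->data);
	free(*pp);
	*pp = NULL;			/* make accidental reuse obvious */

	return len;
}
```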

Fixes: af69fb3a8ffa ("Add mISDN HFC multiport driver")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

gve: Fix an edge case for TSO skb validity check

The NIC requires each TSO segment to not span more than 10
descriptors. NIC further requires each descriptor to not exceed
16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO).

The descriptors for an skb are generated by
gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format.
gve_tx_add_skb_no_copy_dqo() loops through each skb frag and
generates a descriptor for the entire frag if the frag size is
not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is
greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s)
of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for
the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO).
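
The splitting rule in the paragraph above amounts to a ceiling division. A minimal sketch of that arithmetic, assuming only the 16KB - 1 limit stated in this message; descs_for_frag() is a hypothetical helper and not a function in the driver:

```c
#include <stddef.h>

/* 16KB - 1, the per-descriptor limit quoted in the commit message. */
#define GVE_TX_MAX_BUF_SIZE_DQO ((16 * 1024) - 1)

/*
 * Hypothetical helper: number of descriptors needed for one fragment.
 * Each full chunk of GVE_TX_MAX_BUF_SIZE_DQO bytes takes one descriptor,
 * and a non-zero remainder takes one more.
 */
size_t descs_for_frag(size_t frag_size)
{
	size_t full = frag_size / GVE_TX_MAX_BUF_SIZE_DQO;
	size_t rem = frag_size % GVE_TX_MAX_BUF_SIZE_DQO;

	return full + (rem != 0);
}
```

For example, a 40000-byte fragment yields two full descriptors of 16383 bytes plus one for the remaining 7234 bytes, three in total.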

gve_can_send_tso() checks if the descriptors thus generated for an
skb would meet the requirement that each TSO-segment not span more
than 10 descriptors. However, the current code misses an edge case
when a TSO segment spans multiple descriptors within a large frag.
This change fixes the edge case.

gve_can_send_tso() relies on the assumption that max gso size (9728)
is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb
fragment a TSO segment can never span more than 2 descriptors.

Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Praveen Kaligineedi <[email protected]>
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Jeroen de Borst <[email protected]>
Cc: [email protected]
Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

bnxt_en: update xdp_rxq_info in queue restart logic

When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver
updates (creates and deletes) a page_pool.
But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still
connected to an old page_pool.
So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but
bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool.

An old page_pool is no longer used so it is supposed to be
deleted by page_pool_destroy() but it isn't.
Because the xdp_rxq_info is holding the reference count for it and the
xdp_rxq_info is not updated, an old page_pool will not be deleted in
the queue restart logic.

Before restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 4 (zombies: 0)
refs: 8192 bytes: 33554432 (refs: 0 bytes: 0)
recycling: 0.0% (alloc: 128:8048 recycle: 0:0)

After restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 5 (zombies: 0)
refs: 10240 bytes: 41943040 (refs: 0 bytes: 0)
recycling: 20.0% (alloc: 160:10080 recycle: 1920:128)

Before restarting queues, an interface has 4 page_pools.
After restarting one queue, an interface has 5 page_pools, but it
should be 4, not 5.
The reason is that queue restarting logic creates a new page_pool and
an old page_pool is not deleted due to the absence of an update of
xdp_rxq_info logic.

Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <[email protected]>
Reviewed-by: David Wei <[email protected]>
Reviewed-by: Somnath Kotur <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2024-07-25

We've added 14 non-merge commits during the last 8 day(s) which contain
a total of 19 files changed, 177 insertions(+), 70 deletions(-).

The main changes are:

1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and
   BPF sockhash. Also add test coverage for this case, from Michal Luczaj.

2) Fix a segmentation issue when downgrading gso_size in the BPF helper
   bpf_skb_adjust_room(), from Fred Li.

3) Fix a compiler warning in resolve_btfids due to a missing type cast,
   from Liwei Song.

4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte
   boundary in the fexit_sleep BPF selftest, from Puranjay Mohan.

5) Fix a xsk regression to require a flag when actuating tx_metadata_len,
   from Stanislav Fomichev.

6) Fix function prototype BTF dumping in libbpf for prototypes that have
   no input arguments, from Andrii Nakryiko.

7) Fix stacktrace symbol resolution in perf script for BPF programs
   containing subprograms, from Hou Tao.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
  xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
  bpf: Fix a segment issue when downgrading gso_size
  tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids
  bpf, events: Use prog to emit ksymbol event for main program
  selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB
  selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags
  selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected()
  af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
  bpftool: Fix typo in usage help
  libbpf: Fix no-args func prototype BTF dumping syntax
  MAINTAINERS: Update powerpc BPF JIT maintainers
  MAINTAINERS: Update email address of Naveen
  selftests/bpf: fexit_sleep: Fix stack allocation for arm64
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

arm64: mm: Fix lockless walks with static and dynamic page-table folding

Lina reports random oopsen originating from the fast GUP code when
16K pages are used with 4-level page-tables, the fourth level being
folded at runtime due to lack of LPA2.

In this configuration, the generic implementation of
p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
This is normally fine, but when the fourth level of page-table is folded
at runtime, pud_offset_lockless() will offset from the address of the
'p4d_t' to calculate the address of the PUD in the same page-table page.
This results in a stray stack read when the 'p4d_t' has been allocated
on the stack and can send the walker into the weeds.

Fix the problem by providing our own definition of p4d_offset_lockless()
when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
pointer rather than the address of the local stack variable.
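
As a rough illustration of the failure mode, the following userspace sketch
models a folded level in C. It is not kernel code: the toy_* names, the
64-byte "page" and the index function are assumptions made for this example.
It only shows why offsetting from the address of a stack copy lands outside
the real page-table page, while offsetting from the real entry pointer stays
inside it.

```c
#include <stdint.h>

/* Toy model: a "page-table page" is 64 bytes holding eight 8-byte entries. */
#define TOY_PAGE_SIZE 64u
#define TOY_ENTRIES   (TOY_PAGE_SIZE / sizeof(uint64_t))

/* Index of the entry for addr at the runtime-folded level (hypothetical layout). */
static unsigned int toy_index(uint64_t addr)
{
	return (unsigned int)(addr % TOY_ENTRIES);
}

/*
 * Runtime-folded level: the entry is assumed to live in the same page-table
 * page as the pointer passed in, so round that pointer down to the page
 * boundary and index from there.
 */
static uintptr_t toy_folded_offset(uintptr_t upper_ptr, uint64_t addr)
{
	uintptr_t page = upper_ptr & ~(uintptr_t)(TOY_PAGE_SIZE - 1);

	return page + toy_index(addr) * sizeof(uint64_t);
}

/* Generic behaviour in this model: return the address of the caller's local copy. */
static uintptr_t toy_p4d_generic(const uint64_t *table_entry,
				 const uint64_t *local_copy)
{
	(void)table_entry;
	return (uintptr_t)local_copy;
}

/* Fixed behaviour in this model: return the pointer into the real page-table page. */
static uintptr_t toy_p4d_fixed(const uint64_t *table_entry,
			       const uint64_t *local_copy)
{
	(void)local_copy;
	return (uintptr_t)table_entry;
}
```

Given a 64-byte-aligned table and a local copy of table[3],
toy_folded_offset(toy_p4d_fixed(&table[3], &local), 5) yields the address of
table[5], whereas the toy_p4d_generic() variant yields an address next to the
local copy instead.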

Cc: Catalin Marinas <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Fixes: 0dd4f60a2c76 ("arm64: mm: Add support for folding PUDs at runtime")
Reported-by: Asahi Lina <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

tcp: process the 3rd ACK with sk_socket for TFO/MPTCP

The 'Fixes' commit recently changed the behaviour of TCP by skipping the
processing of the 3rd ACK when a sk->sk_socket is set. The goal was to
skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an
unnecessary ACK in case of simultaneous connect(). Unfortunately, that
had an impact on TFO and MPTCP.

I started to look at the impact on MPTCP, because the MPTCP CI found
some issues with the MPTCP Packetdrill tests [1]. Then Paolo Abeni
suggested that I look at the impact on TFO with "plain" TCP.

For MPTCP, when receiving the 3rd ACK of a request adding a new path
(MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that
has been created when the MPTCP connection got established before with
the first path. The newly added 'goto' will then skip the processing of
the segment text (step 7) and not go through tcp_data_queue() where the
MPTCP options are validated, and some actions are triggered, e.g.
sending the MPJ 4th ACK [2] as demonstrated by the new errors when
running a packetdrill test [3] establishing a second subflow.

This doesn't fully break MPTCP: mainly, the 4th MPJ ACK will be
delayed. Still, we don't want this behaviour, as it delays the
switch to the fully established mode, and invalid MPTCP options in this
3rd ACK will not be caught any more. This modification also affects the
MPTCP + TFO feature, and is the reason why the selftests have been
unstable for the last few days [4].

For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer
passing: if the 3rd ACK contains data, and the connection is accept()ed
before receiving them, these data would no longer be processed, and thus
not ACKed.

One last thing about MPTCP, in case of simultaneous connect(), a
fallback to TCP will be done, which seems fine:

  `../common/defaults.sh`

   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)                 <mss 1460, sackOK, TS val 100 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 < S  0:0(0) win 1000        <mss 1460, sackOK, TS val 407 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 > S. 0:0(0) ack 1           <mss 1460, sackOK, TS val 330 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 < S. 0:0(0) ack 1 win 65535 <mss 1460, sackOK, TS val 700 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
  +0 >  . 1:1(0) ack 1           <nop, nop, TS val 845707014 ecr 700, nop, nop, sack 0:1>

Simultaneous SYN-data crossing is also not supported by TFO, see [6].

Kuniyuki Iwashima suggested restricting the processing to SYN+ACK only:
that's a more generic solution than the one initially proposed, and
also enough to fix the issues described above.

Later on, Eric Dumazet mentioned that an ACK should still be sent in
reaction to the second SYN+ACK that is received: not sending a DUPACK
here seems wrong and could hurt:

   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)                <mss 1460, sackOK, TS val 1000 ecr 0,nop,wscale 8>
  +0 < S  0:0(0)       win 1000 <mss 1000, sackOK, nop, nop>
  +0 > S. 0:0(0) ack 1          <mss 1460, sackOK, TS val 3308134035 ecr 0,nop,wscale 8>
  +0 < S. 0:0(0) ack 1 win 1000 <mss 1000, sackOK, nop, nop>
  +0 >  . 1:1(0) ack 1          <nop, nop, sack 0:1>  // <== Here

So in this version, the 'goto consume' is dropped, to always send an ACK
when switching from TCP_SYN_RECV to TCP_ESTABLISHED. This ACK will be
seen as a DUPACK -- with DSACK if SACK has been negotiated -- in case of
simultaneous SYN crossing: that's what is expected here.

Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9936227696 [1]
Link: https://datatracker.ietf.org/doc/html/rfc8684#fig_tokens [2]
Link: https://github.com/multipath-tcp/packetdrill/blob/mptcp-net-next/gtests/net/mptcp/syscalls/accept.pkt#L28 [3]
Link: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh [4]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/server/basic-cookie-not-reqd.pkt#L21 [5]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/client/simultaneous-fast-open.pkt [6]
Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")
Suggested-by: Paolo Abeni <[email protected]>
Suggested-by: Kuniyuki Iwashima <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/20240724-upstream-net-next-20240716-tcp-3rd-ack-consume-sk_socket-v3-1-d48339764ce9@kernel.org
Signed-off-by: Paolo Abeni <[email protected]>

rbd: don't assume rbd_is_lock_owner() for exclusive mappings

Expanding on the previous commit, assuming that rbd_is_lock_owner()
always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
In case ceph_cls_set_cookie() fails, the lock would be temporarily
released even if the mapping is exclusive, meaning that we can end up
even in RBD_LOCK_STATE_UNLOCKED.

IOW, exclusive mappings are really "just" about disabling automatic
lock transitions (as documented in the man page), not about grabbing
the lock and holding on to it whatever it takes.
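
A minimal sketch of the states involved, written as standalone C rather than
the driver's code. The toy_* names are hypothetical; the three states and the
"set_cookie failure releases the lock" behaviour are taken from the
description above.

```c
#include <stdbool.h>

/* Lock states named in the commit message; the numeric values are arbitrary here. */
enum toy_lock_state {
	TOY_LOCK_STATE_UNLOCKED,
	TOY_LOCK_STATE_LOCKED,
	TOY_LOCK_STATE_QUIESCING,
};

/* Ownership covers both the locked state and the quiescing window. */
static bool toy_is_lock_owner(enum toy_lock_state state)
{
	return state == TOY_LOCK_STATE_LOCKED ||
	       state == TOY_LOCK_STATE_QUIESCING;
}

/*
 * Model of the cookie update: if setting the cookie fails, the lock is
 * released, and this happens regardless of whether the mapping is exclusive.
 */
static enum toy_lock_state toy_after_cookie_update(bool set_cookie_ok)
{
	return set_cookie_ok ? TOY_LOCK_STATE_LOCKED : TOY_LOCK_STATE_UNLOCKED;
}
```

In this model, toy_is_lock_owner(toy_after_cookie_update(false)) is false, so
code handling an exclusive mapping has to check the state instead of assuming
ownership.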

Cc: [email protected]
Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings

Every time a watch is reestablished after getting lost, we need to
update the cookie which involves quiescing exclusive lock.  For this,
we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
roughly for the duration of rbd_reacquire_lock() call.  If the mapping
is exclusive and I/O happens to arrive in this time window, it's failed
with EROFS (later translated to EIO) based on the wrong assumption in
rbd_img_exclusive_lock() -- "lock got released?" check there stopped
making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
reacquire").

To make it worse, any such I/O is added to the acquiring list before
EROFS is returned and this sets up for violating rbd_lock_del_request()
precondition that the request is either on the running list or not on
any list at all -- see commit ded080c86b3f ("rbd: don't move requests
to the running list on errors").  rbd_lock_del_request() ends up
processing these requests as if they were on the running list which
screws up quiescing_wait completion counter and ultimately leads to

    rbd_assert(!completion_done(&rbd_dev->quiescing_wait));

being triggered on the next watch error.

Cc: [email protected] # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
Cc: [email protected]
Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait

... to RBD_LOCK_STATE_QUIESCING and quiescing_wait to recognize that
this state and the associated completion are backing rbd_quiesce_lock(),
which isn't specific to releasing the lock.

While exclusive lock does get quiesced before it's released, it also
gets quiesced before an attempt to update the cookie is made and there
the lock is not released as long as ceph_cls_set_cookie() succeeds.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test

This flag is now required to use tx_metadata_len.

Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata")
Reported-by: Julian Schindel <[email protected]>
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len

Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len")
can break existing use cases which don't zero-initialize xdp_umem_reg
padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we
interpret the padding as tx_metadata_len only when being explicitly
asked.
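
The opt-in can be sketched as follows. This is a simplified userspace model,
not the xsk code: struct toy_umem_reg, the flag value and the helper name are
assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical flag bit standing in for the new opt-in flag. */
#define TOY_UMEM_TX_METADATA_LEN (1u << 2)

/* Hypothetical registration struct; tx_metadata_len occupies former padding. */
struct toy_umem_reg {
	uint32_t flags;
	uint32_t tx_metadata_len;
};

/* Honour tx_metadata_len only when the caller opted in through the flag. */
static uint32_t toy_effective_tx_metadata_len(const struct toy_umem_reg *reg)
{
	if (reg->flags & TOY_UMEM_TX_METADATA_LEN)
		return reg->tx_metadata_len;
	return 0;	/* otherwise treat the bytes as uninitialised padding */
}
```

A caller that leaves garbage in the former padding but does not set the flag
gets 0 back, which matches the behaviour such callers saw before the field
existed.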

Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len")
Reported-by: Julian Schindel <[email protected]>
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf: Fix a segment issue when downgrading gso_size

Linearize the skb when downgrading gso_size because it may trigger a
BUG_ON() later when the skb is segmented as described in [1,2].

Fixes: 2be7e212d5419 ("bpf: add bpf_skb_adjust_room helper")
Signed-off-by: Fred Li <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/all/[email protected] [1]
Link: https://lore.kernel.org/all/[email protected] [2]
Link: https://lore.kernel.org/bpf/[email protected]

net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling

Move the freeing of the dummy net_device from mtk_free_dev() to
mtk_remove().

Previously, if alloc_netdev_dummy() failed in mtk_probe(),
eth->dummy_dev would be NULL. The error path would then call
mtk_free_dev(), which in turn called free_netdev() assuming dummy_dev
was allocated (but it was not), potentially causing a NULL pointer
dereference.

By moving free_netdev() to mtk_remove(), we ensure it's only called when
mtk_probe() has succeeded and dummy_dev is fully allocated. This
addresses a potential NULL pointer dereference detected by Smatch[1].
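
The ownership rule can be sketched in plain C. The toy_* types and functions
below are hypothetical stand-ins, not the driver's API: the shared teardown
helper no longer touches dummy_dev, and only the remove path, which runs
after a successful probe, frees it.

```c
#include <stdlib.h>

struct toy_netdev {
	int unused;
};

struct toy_eth {
	struct toy_netdev *dummy_dev;
	int common_freed;	/* counts calls to the shared teardown helper */
};

/* Shared teardown, also used on the probe error path: leaves dummy_dev alone. */
static void toy_free_dev(struct toy_eth *eth)
{
	eth->common_freed++;
}

/* Probe may fail before dummy_dev exists; the error path must cope with that. */
static int toy_probe(struct toy_eth *eth, int alloc_fails)
{
	eth->common_freed = 0;
	eth->dummy_dev = alloc_fails ? NULL : calloc(1, sizeof(*eth->dummy_dev));
	if (!eth->dummy_dev) {
		toy_free_dev(eth);
		return -1;
	}
	return 0;
}

/* Remove is only reached after a successful probe, so dummy_dev is valid here. */
static void toy_remove(struct toy_eth *eth)
{
	free(eth->dummy_dev);
	eth->dummy_dev = NULL;
	toy_free_dev(eth);
}
```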

Fixes: b209bd6d0bff ("net: mediatek: mtk_eth_sock: allocate dummy net_device dynamically")
Reported-by: Dan Carpenter <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/ [1]
Suggested-by: Dan Carpenter <[email protected]>
Reviewed-by: Dan Carpenter <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains a Netfilter fix for net:

Patch #1: if the FPU is busy, the pipapo set backend falls back to the
standard set element lookup. Moreover, disable bh while at it.
From Florian Westphal.

netfilter pull request 24-07-24

* tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_pipapo_avx2: disable softinterrupts
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
This series contains updates to ice driver only.

Ahmed enforces the iavf per VF filter limit on ice (PF) driver to prevent
possible resource exhaustion.

Wojciech corrects assignment of l2 flags read from firmware.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix recipe read procedure
ice: Add a per-VF limit on number of FDIR filters
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
"New Support
   - Samsung Exynos gs101 drd combo phy
   - Qualcomm SC8180x USB uniphy, IPQ9574 QMP PCIe phy
   - Airoha EN7581 PCIe phy
   - Freescale i.MX8Q HSIO SerDes phy
   - Starfive jh7110 dphy tx

  Updates:
   - Resume support for j721e-wiz driver
   - Updates to Exynos usbdrd driver
   - Support for optional power domains in g12a usb2-phy driver
   - Debugfs support and updates to zynqmp driver"

* tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (56 commits)
  phy: airoha: Add dtime and Rx AEQ IO registers
  dt-bindings: phy: airoha: Add dtime and Rx AEQ IO registers
  dt-bindings: phy: rockchip-emmc-phy: Convert to dtschema
  dt-bindings: phy: qcom,qmp-usb: fix spelling error
  phy: exynos5-usbdrd: support Exynos USBDRD 3.1 combo phy (HS & SS)
  phy: exynos5-usbdrd: convert Vbus supplies to regulator_bulk
  phy: exynos5-usbdrd: convert (phy) register access clock to clk_bulk
  phy: exynos5-usbdrd: convert core clocks to clk_bulk
  phy: exynos5-usbdrd: support isolating HS and SS ports independently
  dt-bindings: phy: samsung,usb3-drd-phy: add gs101 compatible
  phy: core: Fix documentation of of_phy_get
  phy: starfive: Correct the dphy configure process
  phy: zynqmp: Add debugfs support
  phy: zynqmp: Take the phy mutex in xlate
  phy: zynqmp: Only wait for PLL lock "primary" instances
  phy: zynqmp: Store instance instead of type
  phy: zynqmp: Enable reference clock correctly
  phy: cadence-torrent: Check return value on register read
  phy: Fix the cacography in phy-exynos5250-usb2.c
  phy: phy-rockchip-samsung-hdptx: Select CONFIG_MFD_SYSCON
  ...

Merge tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

- Simplification across subsystem using cleanup.h

- Support for debugfs to read/write commands

- Few Intel and Qualcomm driver updates

* tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: debugfs: simplify with cleanup.h
  soundwire: cadence: simplify with cleanup.h
  soundwire: intel_ace2x: simplify with cleanup.h
  soundwire: intel_ace2x: simplify return path in hw_params
  soundwire: intel: simplify with cleanup.h
  soundwire: intel: simplify return path in hw_params
  soundwire: amd_init: simplify with cleanup.h
  soundwire: amd: simplify with cleanup.h
  soundwire: amd: simplify return path in hw_params
  soundwire: intel_auxdevice: start the bus at default frequency
  soundwire: intel_auxdevice: add cs42l43 codec to wake_capable_list
  drivers:soundwire: qcom: cleanup port maask calculations
  soundwire: bus: simplify by using local slave->prop
  soundwire: generic_bandwidth_allocation: change port_bo parameter to pointer
  soundwire: Intel: clarify Copyright information
  soundwire: intel_ace2.x: add AC timing extensions for PantherLake
  soundwire: bus: add stream refcount
  soundwire: debugfs: add interface to read/write commands

Merge tag 'dmaengine-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
"New support:

   - New dmaengine_prep_peripheral_dma_vec() to support transfers using
     dma vectors and documentation and user in AXI dma

   - STMicro STM32 DMA3 support and new capabilities of cyclic dma

  Updates:

   - Yaml conversion for Freescale imx dma and qdma bindings,
     sprd sc9860 dma binding

   - Altera msgdma updates for descriptor management"

* tag 'dmaengine-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (35 commits)
  dt-bindings: fsl-qdma: fix interrupts 'if' check logic
  dt-bindings: dma: sprd,sc9860-dma: convert to YAML
  dmaengine: fsl-dpaa2-qdma: add missing MODULE_DESCRIPTION() macro
  dmaengine: ti: add missing MODULE_DESCRIPTION() macros
  dmaengine: ti: cppi41: add missing MODULE_DESCRIPTION() macro
  dmaengine: virt-dma: add missing MODULE_DESCRIPTION() macro
  dmaengine: ti: k3-udma: Fix BCHAN count with UHC and HC channels
  dmaengine: sh: rz-dmac: Fix lockdep assert warning
  dmaengine: qcom: gpi: clean up the IRQ disable/enable in gpi_reset_chan()
  dmaengine: fsl-edma: change the memory access from local into remote mode in i.MX 8QM
  dmaengine: qcom: gpi: remove unused struct 'reg_info'
  dmaengine: moxart-dma: remove unused struct 'moxart_filter_data'
  dt-bindings: fsl-qdma: Convert to yaml format
  dmaengine: fsl-edma: remove redundant "idle" field from fsl_chan
  dmaengine: fsl-edma: request per-channel IRQ only when channel is allocated
  dmaengine: stm32-dma3: defer channel registration to specify channel name
  dmaengine: add channel device name to channel registration
  dmaengine: stm32-dma3: improve residue granularity
  dmaengine: stm32-dma3: add device_pause and device_resume ops
  dmaengine: stm32-dma3: add DMA_MEMCPY capability
  ...

sysctl: treewide: constify the ctl_table argument of proc_handlers

const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata, which will ensure that proc_handler function
pointers cannot be modified.
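
As an illustration of what the constified signature permits, here is a small
userspace sketch. struct toy_ctl_table and toy_dointvec() are simplified,
hypothetical stand-ins (the real handler signature also takes a loff_t *ppos
argument, omitted here).

```c
#include <stddef.h>

/* Toy stand-in for struct ctl_table; only the fields this sketch needs. */
struct toy_ctl_table {
	const char *procname;
	int *data;
	int (*proc_handler)(const struct toy_ctl_table *ctl, int write,
			    void *buffer, size_t *lenp);
};

/* Handler takes a const table: it may read through ctl but not modify it. */
static int toy_dointvec(const struct toy_ctl_table *ctl, int write,
			void *buffer, size_t *lenp)
{
	int *val = buffer;

	if (*lenp < sizeof(int))
		return -1;
	if (write)
		*ctl->data = *val;	/* the pointed-to value stays writable */
	else
		*val = *ctl->data;
	*lenp = sizeof(int);
	return 0;
}

static int toy_value = 7;

/* With const-taking handlers, the table itself can be declared const. */
static const struct toy_ctl_table toy_table = {
	.procname = "toy_value",
	.data = &toy_value,
	.proc_handler = toy_dointvec,
};
```

Because toy_table is const, a compiler is free to place it in read-only data,
which is the property the commit is working towards for the real tables.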

This patch has been generated by the following coccinelle script:

```
  virtual patch

  @r1@
  identifier ctl, write, buffer, lenp, ppos;
  identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

  @r2@
  identifier func, ctl, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos)
  { ... }

  @r3@
  identifier func;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int , void *, size_t *, loff_t *);

  @r4@
  identifier func, ctl;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int , void *, size_t *, loff_t *);

  @r5@
  identifier func, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
  conventions. The xfs_stats_clear_proc_handler,
  xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax were
  adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
  This is called from a proc_handler itself and is calling back into
  another proc_handler, making it necessary to change it as part of the
  proc_handler migration.

Co-developed-by: Thomas Weißschuh <[email protected]>
Signed-off-by: Thomas Weißschuh <[email protected]>
Co-developed-by: Joel Granados <[email protected]>
Signed-off-by: Joel Granados <[email protected]>

Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.

  First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
  lets the kernel zero out pages anytime under memory pressure, which
  enables allocating memory that never gets swapped to disk but also
  doesn't count as being mlocked.

  Then, the vDSO implementation of getrandom() is introduced in a
  generic manner and hooked into random.c.

  Next, this is implemented on x86. (Also, though it's not ready for
  this pull, somebody has begun an arm64 implementation already)

  Finally, two vDSO selftests are added.

  There are also two housekeeping cleanup commits"

* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  MAINTAINERS: add random.h headers to RNG subsection
  random: note that RNDGETPOOL was removed in 2.6.9-rc2
  selftests/vDSO: add tests for vgetrandom
  x86: vdso: Wire up getrandom() vDSO implementation
  random: introduce generic vDSO getrandom() implementation
  mm: add MAP_DROPPABLE for designating always lazily freeable mappings

Merge tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
"VFS:

   - The new 64bit mount ids start after the old mount id, i.e., at the
     first non-32 bit value. However, we started counting one id too
     late and thus lost 4294967296 as the first valid id. Fix that.

   - Update a few comments on some vfs_*() creation helpers.

   - Move copying of the xattr name out from the locks required to start
     a filesystem write.

   - Extend the filelock lock UAF fix to the compat code as well.

   - Now that we added the ability to look up an inode under RCU it's
     possible that lockless hash lookup can find and lock an inode after
     it gets I_FREEING set. It then waits until inode teardown in
     evict() is finished.

     The flag however is still set after evict() has woken up all
     waiters. If the inode lock is taken late enough on the waiting side
     after hash removal and wakeup happened the waiting thread will
     never be woken.

     Before RCU based lookup this was synchronized via the
     inode_hash_lock. But since unhashing requires the inode lock as
     well we can check whether the inode is unhashed while holding inode
     lock even without holding inode_hash_lock.

  pidfd:

   - The nsproxy structure contains nearly all of the namespaces
     associated with a task. When a namespace type isn't supported
     nsproxy might contain a NULL pointer or always point to the initial
     namespace type. The logic isn't consistent. So when deriving
     namespace fds we need to ensure that the namespace type is
     supported.

     First, so that we don't risk dereferencing NULL pointers. The
     correct bigger fix would be to change all namespaces to always set
     a valid namespace pointer in struct nsproxy independent of whether
     or not it is compiled in. But that requires quite a few changes.

     Second, so that we don't allow deriving namespace fds when the
     namespace type doesn't exist and thus when they couldn't also be
     derived via /proc/self/ns/.

   - Add missing selftests for the new pidfd ioctls to derive namespace
     fds. This simply extends the already existing testsuite.

  netfs:

   - Fix debug logging and fix kconfig variable name so it actually
     works.

   - Fix writeback that goes both to the server and cache. The streams
     are only activated once a subreq is added. When a server write
     happens the subreq doesn't need to have finished by the time the
     cache write is started. If the server write has already finished by
     the time the cache write is about to start the cache write will
     operate on a folio that might already have been reused. Fix this by
     preactivating the cache write.

   - Limit cachefiles subreq size for cache writes to MAX_RW_COUNT"

* tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  inode: clarify what's locked
  vfs: Fix potential circular locking through setxattr() and removexattr()
  filelock: Fix fcntl/close race recovery compat path
  fs: use all available ids
  cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
  netfs: Fix writeback that needs to go to both server and cache
  pidfs: add selftests for new namespace ioctls
  pidfs: handle kernels without namespaces cleanly
  pidfs: when time ns disabled add check for ioctl
  vfs: correct the comments of vfs_*() helpers
  vfs: handle __wait_on_freeing_inode() and evict() race
  netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG
  netfs: Revert "netfs: Switch debug logging to pr_debug()"

hostfs: fix folio conversion

Commit e3ec0fe944d2 ("hostfs: Convert hostfs_read_folio() to use a
folio") simplified hostfs_read_folio(), but in the process of converting
to using folios natively also mis-used the folio_zero_tail() function
due to the very confusing API of that function.

Very arguably it's folio_zero_tail() API itself that is buggy, since it
would make more sense (and the documentation kind of implies) that the
third argument would be the pointer to the beginning of the folio
buffer.

But no, the third argument to folio_zero_tail() is where we should start
zeroing the tail (even if we already also pass in the offset separately
as the second argument).

So fix the hostfs caller, and we can leave any folio_zero_tail() sanity
cleanup for later.
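
The calling convention described above can be modelled in a few lines of
userspace C. This is a single-page toy, not the folio API: toy_zero_tail()
and the 16-byte "page" are assumptions chosen to show the difference between
passing the start of the buffer and passing the position where zeroing
should begin.

```c
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16u

/*
 * Toy model of the convention: zero from `from` to the end of the page,
 * where `offset` says how far into the page `from` is.
 */
static void toy_zero_tail(size_t offset, char *from)
{
	memset(from, 0, TOY_PAGE_SIZE - offset);
}

/* Mistaken call: passes the buffer start, wiping the bytes that were just read. */
static void toy_fill_wrong(char *buffer, size_t bytes_read)
{
	toy_zero_tail(bytes_read, buffer);
}

/* Intended call: passes the position where zeroing should begin. */
static void toy_fill_right(char *buffer, size_t bytes_read)
{
	toy_zero_tail(bytes_read, buffer + bytes_read);
}
```

With a 16-byte buffer and 4 bytes read, toy_fill_right() preserves bytes 0..3
and zeroes bytes 4..15, while toy_fill_wrong() zeroes bytes 0..11 and leaves
the last 4 bytes untouched.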

Reported-and-tested-by: Maciej Żenczykowski <[email protected]>
Fixes: e3ec0fe944d2 ("hostfs: Convert hostfs_read_folio() to use a folio")
Link: https://lore.kernel.org/all/CANP3RGceNzwdb7w=vPf5=7BCid5HVQDmz1K5kC9JG42+HVAh_g@mail.gmail.com/
Cc: Matthew Wilcox <[email protected]>
Cc: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: make Breno the netconsole maintainer

netconsole has no maintainer, and Breno has been working on
improving it consistently for some time. So I think we found
the maintainer :)

Acked-by: Paolo Abeni <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Breno Leitao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

MAINTAINERS: Update bonding entry

Update my email address, clarify support status, and delete the
web site that hasn't been used in a long time.

Signed-off-by: Jay Vosburgh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: nexthop: Initialize all fields in dumped nexthops

struct nexthop_grp contains two reserved fields that are not initialized by
nla_put_nh_group(), and carry garbage. This can be observed e.g. with
strace (edited for clarity):

    # ip nexthop add id 1 dev lo
    # ip nexthop add id 101 group 1
    # strace -e recvmsg ip nexthop get id 101
    ...
    recvmsg(... [{nla_len=12, nla_type=NHA_GROUP},
                 [{id=1, weight=0, resvd1=0x69, resvd2=0x67}]] ...) = 52

The fields are reserved and therefore not currently used. But as they are, they
leak kernel memory, and the fact they are not just zero complicates repurposing
of the fields for new ends. Initialize the full structure.
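
A minimal sketch of the "initialize the full structure" idea in userspace C.
struct toy_nexthop_grp mirrors only the fields visible in the strace output
above, and toy_fill_grp() is a hypothetical helper, not the kernel function.

```c
#include <stdint.h>
#include <string.h>

/* Toy layout with the fields shown in the strace output. */
struct toy_nexthop_grp {
	uint32_t id;
	uint8_t  weight;
	uint8_t  resvd1;
	uint16_t resvd2;
};

/* Zero the whole entry first so reserved fields never carry stale bytes. */
static void toy_fill_grp(struct toy_nexthop_grp *out, uint32_t id,
			 uint8_t weight)
{
	memset(out, 0, sizeof(*out));
	out->id = id;
	out->weight = weight;
}
```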

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Correct byte order of perfect_match

The perfect_match parameter of the update_vlan_hash operation is __le16,
and is correctly converted from host byte-order in the lone caller,
stmmac_vlan_update().

However, the implementations of this callback, dwmac4_update_vlan_hash()
and dwxgmac2_update_vlan_hash(), both treat this parameter as host byte
order, using the following pattern:

u32 value = ...
...
writel(value | perfect_match, ...);

This is not correct because both:
1) value is host byte order; and
2) writel expects a host byte order value as its first argument

I believe that this will break on big endian systems. And I expect it
has gone unnoticed by only being exercised on little endian systems.

The approach taken by this patch is to update the callback and its
caller to simply use a host byte order value.
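
To make the big-endian concern concrete, here is a small userspace model. It
emulates what a 16-bit little-endian conversion does on a big-endian CPU with
an explicit byte swap; the toy_* helpers are hypothetical and no device
register is written.

```c
#include <stdint.h>

/* Byte swap: what converting a 16-bit value to little endian does on a big-endian CPU. */
static uint16_t toy_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Buggy pattern as seen on big endian: a little-endian value OR-ed into a host-order word. */
static uint32_t toy_reg_mixed_be(uint32_t value, uint16_t vid)
{
	uint16_t perfect_match_le = toy_swab16(vid);

	return value | perfect_match_le;
}

/* Fixed pattern: keep everything in host byte order before the register write. */
static uint32_t toy_reg_host(uint32_t value, uint16_t vid)
{
	return value | vid;
}
```

For value 0x10000 and VLAN ID 0x0123, the host-order variant produces 0x10123,
while the mixed variant produces 0x12301, i.e. the VLAN ID bytes end up
swapped inside the register value.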

Flagged by Sparse.
Compile tested only.

Fixes: c7ab0b8088d7 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available")
Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: forwarding: skip if kernel not support setting bridge fdb learning limit

If the testing kernel doesn't support setting fdb_max_learned or showing
fdb_n_learned, just skip the test. Otherwise we will get errors like

./bridge_fdb_learning_limit.sh: line 218: [: null: integer expression expected
./bridge_fdb_learning_limit.sh: line 225: [: null: integer expression expected

Fixes: 6f84090333bb ("selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest")
Signed-off-by: Hangbin Liu <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Johannes Nixdorf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: Return non-zero value from tipc_udp_addr2str() on error

tipc_udp_addr2str() should return non-zero value if the UDP media
address is invalid. Otherwise, a buffer overflow access can occur in
tipc_media_addr_printf(). Fix this by returning 1 on an invalid UDP
media address.
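
The contract between the formatter and its caller can be sketched like this.
The names, the protocol enum and the fallback string are hypothetical; the
point is only that the caller must not read the buffer unless the formatter
returned 0.

```c
#include <stdio.h>

enum toy_proto {
	TOY_PROTO_IPV4,
	TOY_PROTO_IPV6,
	TOY_PROTO_BOGUS,
};

/* Format an address into buf; return 0 on success, non-zero if it is invalid. */
static int toy_addr2str(enum toy_proto proto, char *buf, size_t size)
{
	switch (proto) {
	case TOY_PROTO_IPV4:
		snprintf(buf, size, "ipv4-addr");
		return 0;
	case TOY_PROTO_IPV6:
		snprintf(buf, size, "ipv6-addr");
		return 0;
	default:
		return 1;	/* buf is left untouched, so the caller must not use it */
	}
}

/* Caller: only trusts the formatted string when the formatter reported success. */
static int toy_addr_printf(enum toy_proto proto, char *out, size_t size)
{
	char addr[32];

	if (toy_addr2str(proto, addr, sizeof(addr)) == 0)
		return snprintf(out, size, "%s", addr);
	return snprintf(out, size, "UNKNOWN");
}
```

If toy_addr2str() returned 0 in the default case, toy_addr_printf() would
print the uninitialised addr buffer, which is the kind of access the fix
prevents.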

Fixes: d0f91938bede ("tipc: add ip/udp media type")
Signed-off-by: Shigeru Yoshida <[email protected]>
Reviewed-by: Tung Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

inode: clarify what's locked

In __wait_on_freeing_inode() we warn in case the inode_hash_lock is held
but the inode is unhashed. We then release the inode_lock. So using
"locked" as parameter name is confusing. Use is_inode_hash_locked as
parameter name instead.

Signed-off-by: Christian Brauner <[email protected]>

vfs: Fix potential circular locking through setxattr() and removexattr()

When using cachefiles, lockdep may emit something similar to the circular
locking dependency notice below.  The problem appears to stem from the
following:

(1) Cachefiles manipulates xattrs on the files in its cache when called
     from ->writepages().

(2) The setxattr() and removexattr() system call handlers get the name
     (and value) from userspace after taking the sb_writers lock, putting
     accesses of the vma->vm_lock and mm->mmap_lock inside of that.

(3) The afs filesystem uses a per-inode lock to prevent multiple
     revalidation RPCs and in writeback vs truncate to prevent parallel
     operations from deadlocking against the server on one side and local
     page locks on the other.

Fix this by moving the getting of the name and value in {set,remove}xattr()
outside of the sb_writers lock.  This also has the minor benefits that we
don't need to reget these in the event of a retry and we never try to take
the sb_writers lock in the event we can't pull the name and value into the
kernel.

Alternative approaches that might fix this include moving the dispatch of a
write to the cache off to a workqueue or trying to do without the
validation lock in afs.  Note that this might also affect other filesystems
that use netfslib and/or cachefiles.

======================================================
WARNING: possible circular locking dependency detected
6.10.0-build2+ #956 Not tainted
------------------------------------------------------
fsstress/6050 is trying to acquire lock:
ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0

but task is already holding lock:
ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&vma->vm_lock->lock){++++}-{3:3}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        down_write+0x3b/0x50
        vma_start_write+0x6b/0xa0
        vma_link+0xcc/0x140
        insert_vm_struct+0xb7/0xf0
        alloc_bprm+0x2c1/0x390
        kernel_execve+0x65/0x1a0
        call_usermodehelper_exec_async+0x14d/0x190
        ret_from_fork+0x24/0x40
        ret_from_fork_asm+0x1a/0x30

-> #3 (&mm->mmap_lock){++++}-{3:3}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        __might_fault+0x7c/0xb0
        strncpy_from_user+0x25/0x160
        removexattr+0x7f/0x100
        __do_sys_fremovexattr+0x7e/0xb0
        do_syscall_64+0x9f/0x100
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #2 (sb_writers#14){.+.+}-{0:0}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        percpu_down_read+0x3c/0x90
        vfs_iocb_iter_write+0xe9/0x1d0
        __cachefiles_write+0x367/0x430
        cachefiles_issue_write+0x299/0x2f0
        netfs_advance_write+0x117/0x140
        netfs_write_folio.isra.0+0x5ca/0x6e0
        netfs_writepages+0x230/0x2f0
        afs_writepages+0x4d/0x70
        do_writepages+0x1e8/0x3e0
        filemap_fdatawrite_wbc+0x84/0xa0
        __filemap_fdatawrite_range+0xa8/0xf0
        file_write_and_wait_range+0x59/0x90
        afs_release+0x10f/0x270
        __fput+0x25f/0x3d0
        __do_sys_close+0x43/0x70
        do_syscall_64+0x9f/0x100
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #1 (&vnode->validate_lock){++++}-{3:3}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        down_read+0x95/0x200
        afs_writepages+0x37/0x70
        do_writepages+0x1e8/0x3e0
        filemap_fdatawrite_wbc+0x84/0xa0
        filemap_invalidate_inode+0x167/0x1e0
        netfs_unbuffered_write_iter+0x1bd/0x2d0
        vfs_write+0x22e/0x320
        ksys_write+0xbc/0x130
        do_syscall_64+0x9f/0x100
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #0 (mapping.invalidate_lock#3){++++}-{3:3}:
        check_noncircular+0x119/0x160
        check_prev_add+0x195/0x430
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        down_read+0x95/0x200
        filemap_fault+0x26e/0x8b0
        __do_fault+0x57/0xd0
        do_pte_missing+0x23b/0x320
        __handle_mm_fault+0x2d4/0x320
        handle_mm_fault+0x14f/0x260
        do_user_addr_fault+0x2a2/0x500
        exc_page_fault+0x71/0x90
        asm_exc_page_fault+0x22/0x30

other info that might help us debug this:

Chain exists of:
   mapping.invalidate_lock#3 --> &mm->mmap_lock --> &vma->vm_lock->lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   rlock(&vma->vm_lock->lock);
                                lock(&mm->mmap_lock);
                                lock(&vma->vm_lock->lock);
   rlock(mapping.invalidate_lock#3);

  *** DEADLOCK ***

1 lock held by fsstress/6050:
  #0: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250

stack backtrace:
CPU: 0 PID: 6050 Comm: fsstress Not tainted 6.10.0-build2+ #956
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
  <TASK>
  dump_stack_lvl+0x57/0x80
  check_noncircular+0x119/0x160
  ? queued_spin_lock_slowpath+0x4be/0x510
  ? __pfx_check_noncircular+0x10/0x10
  ? __pfx_queued_spin_lock_slowpath+0x10/0x10
  ? mark_lock+0x47/0x160
  ? init_chain_block+0x9c/0xc0
  ? add_chain_block+0x84/0xf0
  check_prev_add+0x195/0x430
  __lock_acquire+0xaf0/0xd80
  ? __pfx___lock_acquire+0x10/0x10
  ? __lock_release.isra.0+0x13b/0x230
  lock_acquire.part.0+0x103/0x280
  ? filemap_fault+0x26e/0x8b0
  ? __pfx_lock_acquire.part.0+0x10/0x10
  ? rcu_is_watching+0x34/0x60
  ? lock_acquire+0xd7/0x120
  down_read+0x95/0x200
  ? filemap_fault+0x26e/0x8b0
  ? __pfx_down_read+0x10/0x10
  ? __filemap_get_folio+0x25/0x1a0
  filemap_fault+0x26e/0x8b0
  ? __pfx_filemap_fault+0x10/0x10
  ? find_held_lock+0x7c/0x90
  ? __pfx___lock_release.isra.0+0x10/0x10
  ? __pte_offset_map+0x99/0x110
  __do_fault+0x57/0xd0
  do_pte_missing+0x23b/0x320
  __handle_mm_fault+0x2d4/0x320
  ? __pfx___handle_mm_fault+0x10/0x10
  handle_mm_fault+0x14f/0x260
  do_user_addr_fault+0x2a2/0x500
  exc_page_fault+0x71/0x90
  asm_exc_page_fault+0x22/0x30

Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
cc: Alexander Viro <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Jan Kara <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Gao Xiang <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
[brauner: fix minor issues]
Signed-off-by: Christian Brauner <[email protected]>

filelock: Fix fcntl/close race recovery compat path

When I wrote commit 3cad1bc01041 ("filelock: Remove locks reliably when
fcntl/close race is detected"), I missed that there are two copies of the
code I was patching: The normal version, and the version for 64-bit offsets
on 32-bit kernels.
Thanks to Greg KH for stumbling over this while doing the stable
backport...

Apply exactly the same fix to the compat path for 32-bit kernels.

Fixes: c293621bbf67 ("[PATCH] stale POSIX lock handling")
Cc: [email protected]
Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2563
Signed-off-by: Jann Horn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

fs: use all available ids

The counter is unconditionally incremented for each mount allocation.
If we set it to 1ULL << 32 we're losing 4294967296 as the first valid
non-32 bit mount id.
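
The off-by-one can be illustrated with a small userspace sketch.  The
helper alloc_mnt_id() and the macro name are hypothetical; they only model
"increment unconditionally, then use the result" and are not the kernel's
actual code.

```c
#include <stdint.h>

/* First mount ID that does not fit into 32 bits: 4294967296. */
#define FIRST_64BIT_MNT_ID (1ULL << 32)

/* Hypothetical allocator: the counter is incremented before it is used. */
static uint64_t alloc_mnt_id(uint64_t *counter)
{
	return ++(*counter);
}
```

Under this model, a counter initialised to FIRST_64BIT_MNT_ID hands out
4294967297 first, while a counter initialised to FIRST_64BIT_MNT_ID - 1
hands out 4294967296 first.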

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT

Set the maximum size of a subrequest that writes to cachefiles to be
MAX_RW_COUNT so that we don't overrun the maximum write we can make to the
backing filesystem.
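
As a rough illustration of the limit, the sketch below clamps a requested
length to a model of MAX_RW_COUNT, which the kernel defines as INT_MAX
rounded down to a page boundary.  The 4096-byte page size and the names
MODEL_PAGE_SIZE, MODEL_MAX_RW_COUNT and clamp_subreq_len() are assumptions
made for this example only.

```c
#include <limits.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE    4096UL	/* assumed page size */
#define MODEL_MAX_RW_COUNT ((size_t)INT_MAX & ~(MODEL_PAGE_SIZE - 1))

/* Hypothetical helper: never let one subrequest exceed the write limit. */
static size_t clamp_subreq_len(size_t want)
{
	return want > MODEL_MAX_RW_COUNT ? MODEL_MAX_RW_COUNT : want;
}
```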

Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
Signed-off-by: Christian Brauner <[email protected]>

netfs: Fix writeback that needs to go to both server and cache

When netfslib is performing writeback (ie. ->writepages), it maintains two
parallel streams of writes, one to the server and one to the cache, but it
doesn't mark either stream of writes as active until it gets some data that
needs to be written to that stream.

This is done because some folios will only be written to the cache
(e.g. copying to the cache on read is done by marking the folios and
letting writeback do the actual work) and sometimes we'll only be writing
to the server (e.g. if there's no cache).

Now, since we don't actually dispatch uploads and cache writes in parallel,
but rather flip between the streams, depending on which has the lowest
so-far-issued offset, and don't wait for the subreqs to finish before
flipping, we can end up in a situation where, say, we issue a write to the
server and this completes before we start the write to the cache.

But because we only activate a stream when we first add a subreq to it, the
result collection code may run before we manage to activate the stream -
resulting in the folio being cleaned and having the writeback-in-progress
mark removed. At this point, the folio no longer belongs to us.

This is only really a problem for folios that need to be written to both
streams - and in that case, the upload to the server is started first,
followed by the write to the cache - and the cache write may see a bad
folio.

Fix this by activating the cache stream up front if there's a cache
available. If there's a cache, then all data is going to be written to it.

Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
Signed-off-by: Christian Brauner <[email protected]>

pidfs: add selftests for new namespace ioctls

Add selftests to verify that deriving namespace file descriptors from
pidfd file descriptors works correctly.

Link: https://lore.kernel.org/r/20240722-work-pidfs-69dbea91edab@brauner
Signed-off-by: Christian Brauner <[email protected]>

pidfs: handle kernels without namespaces cleanly

The nsproxy structure contains nearly all of the namespaces associated
with a task. When a given namespace type is not supported by this kernel
the rules whether the corresponding pointer in struct nsproxy is NULL or
always init_<ns_type>_ns differ per namespace. Ideally, that wouldn't be
the case and for all namespace types we'd always set it to
init_<ns_type>_ns when the corresponding namespace type isn't supported.

Make sure we handle all namespaces where the pointer in struct nsproxy
can be NULL when the namespace type isn't supported.

Link: https://lore.kernel.org/r/20240722-work-pidfs-e6a83030f63e@brauner
Fixes: 5b08bd408534 ("pidfs: allow retrieval of namespace file descriptors") # mainline only
Signed-off-by: Christian Brauner <[email protected]>

pidfs: when time ns disabled add check for ioctl

syzbot calls pidfd_ioctl() with cmd "PIDFD_GET_TIME_NAMESPACE" while
CONFIG_TIME_NS is disabled. Since time_ns is NULL in that case, this causes
a NULL pointer dereference in open_namespace.

Fixes: 5b08bd408534 ("pidfs: allow retrieval of namespace file descriptors") # mainline only
Reported-and-tested-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=34a0ee986f61f15da35d
Signed-off-by: Edward Adam Davis <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

vfs: correct the comments of vfs_*() helpers

Correct the comments of vfs_*() helpers in fs/namei.c, including:
1. vfs_create()
2. vfs_mknod()
3. vfs_mkdir()
4. vfs_rmdir()
5. vfs_symlink()

All of them come from the same commit:
6521f8917082 "namei: prepare for idmapped mounts"

The @dentry is actually the dentry of the child directory rather than
the base directory (parent directory), and thus the @dir has to be
modified due to the change of @dentry.

Signed-off-by: Congjie Zhou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

vfs: handle __wait_on_freeing_inode() and evict() race

Lockless hash lookup can find and lock the inode after it gets the
I_FREEING flag set, at which point it blocks waiting for teardown in
evict() to finish.

However, the flag is still set even after evict() wakes up all waiters.

This results in a race where if the inode lock is taken late enough, it
can happen after both hash removal and wakeups, meaning there is nobody
to wake the racing thread up.

This worked prior to RCU-based lookup because the entire ordeal was
synchronized with the inode hash lock.

Since unhashing requires the inode lock, we can safely check whether it
happened after acquiring it.

Link: https://lore.kernel.org/v9fs/[email protected]/
Reported-by: Dominique Martinet <[email protected]>
Fixes: 7180f8d91fcb ("vfs: add rcu-based find_inode variants for iget ops")
Signed-off-by: Mateusz Guzik <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG

CONFIG_FSCACHE_DEBUG should have been renamed to CONFIG_NETFS_DEBUG, so do
that now.

Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
cc: Uwe Kleine-König <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
Signed-off-by: Christian Brauner <[email protected]>

netfs: Revert "netfs: Switch debug logging to pr_debug()"

Revert commit 163eae0fb0d4c610c59a8de38040f8e12f89fd43 to get back the
original operation of the debugging macros.

Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
cc: Uwe Kleine-König <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
Signed-off-by: Christian Brauner <[email protected]>

netfilter: nft_set_pipapo_avx2: disable softinterrupts

We need to disable softinterrupts, else we get the following problem:

1. pipapo_avx2 called from process context; fpu usable
2. preempt_disable() called, pcpu scratchmap in use
3. softirq handles rx or tx, we re-enter pipapo_avx2
4. fpu busy, fallback to generic non-avx version
5. fallback reuses scratch map and index, which are in use
by the preempted process

Handle this the same way as the generic version by first disabling
softinterrupts while the scratchmap is in use.

Fixes: f0b3d338064e ("netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version")
Cc: Stefano Brivio <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Reviewed-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
"Two fixes for building perf and other tools:

   - Fix breakage in tracing tools due to pkg-config for
     libtrace{event,fs}

   - Fix build of perf when libunwind is used"

* tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf dso: Fix build when libunwind is enabled
  tools/latency: Use pkg-config in lib_setup of Makefile.config
  tools/rtla: Use pkg-config in lib_setup of Makefile.config
  tools/verification: Use pkg-config in lib_setup of Makefile.config
  tools: Make pkg-config dependency checks usable by other tools
  perf build: Warn if libtracefs is not found

Merge tag 'execve-v6.11-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve fix from Kees Cook:
"This moves the exec and binfmt_elf tests out of your way and into the
  tests/ subdirectory, following the newly ratified KUnit naming
  conventions. :)"

* tag 'execve-v6.11-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  execve: Move KUnit tests to tests/ subdirectory

parisc: Add support for CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN

Allow users to disable kernel warnings for unaligned memory
accesses from kernel via the /proc/sys/kernel/ignore-unaligned-usertrap
procfs entry.
That way users can disable those warnings in case they happen too
often.

Signed-off-by: Helge Deller <[email protected]>

ice: Fix recipe read procedure

When the ice driver reads recipes from firmware, information about the
need_pass_l2 and allow_pass_l2 flags is not stored correctly.
Those flags are stored as one bit each in the ice_sw_recipe structure.
Because of that, the result of checking a flag has to be cast to bool.
Note that the need_pass_l2 flag currently works correctly, because
it's stored in the first bit.
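
The truncation can be reproduced with a small standalone sketch.  The
struct, the mask names and the bit positions below are hypothetical and
only model two one-bit flags; they are not the driver's real definitions.

```c
#include <stdbool.h>

#define FLAG_NEED_PASS_L2  (1u << 0)	/* assumed: first bit */
#define FLAG_ALLOW_PASS_L2 (1u << 1)	/* assumed: second bit */

/* Hypothetical model of a recipe with two one-bit flags. */
struct recipe_flags {
	unsigned int need_pass_l2:1;
	unsigned int allow_pass_l2:1;
};

/* Buggy variant: the masked value (0 or 2) is truncated to one bit. */
static struct recipe_flags parse_truncating(unsigned int raw)
{
	struct recipe_flags f;

	f.need_pass_l2 = raw & FLAG_NEED_PASS_L2;	/* 0 or 1, fits */
	f.allow_pass_l2 = raw & FLAG_ALLOW_PASS_L2;	/* 2 becomes 0 */
	return f;
}

/* Fixed variant: convert to bool first, so any non-zero value becomes 1. */
static struct recipe_flags parse_with_bool(unsigned int raw)
{
	struct recipe_flags f;

	f.need_pass_l2 = (bool)(raw & FLAG_NEED_PASS_L2);
	f.allow_pass_l2 = (bool)(raw & FLAG_ALLOW_PASS_L2);
	return f;
}
```

With both bits set in raw, the truncating variant reports allow_pass_l2
as 0, while the variant with the cast reports 1.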

Fixes: bccd9bce29e0 ("ice: Add guard rule when creating FDB in switchdev")
Reviewed-by: Marcin Szycik <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Wojciech Drewek <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sujai Buvaneswaran <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

ice: Add a per-VF limit on number of FDIR filters

While the iavf driver adds a s/w limit (128) on the number of FDIR
filters that the VF can request, a malicious VF driver can request more
than that and exhaust the resources for other VFs.

Add a similar limit in ice.
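
A minimal sketch of such a per-VF cap, assuming the same value of 128 that
iavf uses.  The names MODEL_MAX_FDIR_PER_VF, struct vf_model and
vf_add_fdir_filter() are hypothetical and do not come from the driver.

```c
#include <errno.h>

#define MODEL_MAX_FDIR_PER_VF 128	/* assumed to match the iavf limit */

/* Hypothetical per-VF bookkeeping kept on the PF side. */
struct vf_model {
	unsigned int fdir_active;	/* filters currently installed */
};

/* Refuse new filters once the VF has reached its cap. */
static int vf_add_fdir_filter(struct vf_model *vf)
{
	if (vf->fdir_active >= MODEL_MAX_FDIR_PER_VF)
		return -ENOSPC;
	vf->fdir_active++;
	return 0;
}
```

Because the check is made on the PF side, it does not rely on the VF
driver honouring its own software limit.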

CC: [email protected]
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Reviewed-by: Przemek Kitszel <[email protected]>
Suggested-by: Sridhar Samudrala <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Reviewed-by: Wojciech Drewek <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

Merge tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"A pretty small update including mostly minor bug fixes in zoned
  storage along with the large section support.

  Enhancements:
   - add support for FS_IOC_GETFSSYSFSPATH
   - enable atgc dynamically if conditions are met
   - use new ioprio Macro to get ckpt thread ioprio level
   - remove unreachable lazytime mount option parsing

  Bug fixes:
   - fix null reference error when checking end of zone
   - fix start segno of large section
   - fix to cover read extent cache access with lock
   - don't dirty inode for readonly filesystem
   - allocate a new section if curseg is not the first seg in its zone
   - only fragment segment in the same section
   - truncate preallocated blocks in f2fs_file_open()
   - fix to avoid use SSR allocate when do defragment
   - fix to force buffered IO on inline_data inode

  And some minor code clean-ups and sanity checks"

* tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (26 commits)
  f2fs: clean up addrs_per_{inode,block}()
  f2fs: clean up F2FS_I()
  f2fs: use meta inode for GC of COW file
  f2fs: use meta inode for GC of atomic file
  f2fs: only fragment segment in the same section
  f2fs: fix to update user block counts in block_operations()
  f2fs: remove unreachable lazytime mount option parsing
  f2fs: fix null reference error when checking end of zone
  f2fs: fix start segno of large section
  f2fs: remove redundant sanity check in sanity_check_inode()
  f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
  f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_wrtie
  f2fs: clean up set REQ_RAHEAD given rac
  f2fs: enable atgc dynamically if conditions are met
  f2fs: fix to truncate preallocated blocks in f2fs_file_open()
  f2fs: fix to cover read extent cache access with lock
  f2fs: fix return value of f2fs_convert_inline_inode()
  f2fs: use new ioprio Macro to get ckpt thread ioprio level
  f2fs: fix to don't dirty inode for readonly filesystem
  f2fs: fix to avoid use SSR allocate when do defragment
  ...

Merge tag 'jfs-6.11' of github.com:kleikamp/linux-shaggy

Pull jfs updates from David Kleikamp:
"Folio conversion from Matthew Wilcox and a few various fixes"

* tag 'jfs-6.11' of github.com:kleikamp/linux-shaggy:
  jfs: don't walk off the end of ealist
  jfs: Fix shift-out-of-bounds in dbDiscardAG
  jfs: Fix array-index-out-of-bounds in diFree
  jfs: fix null ptr deref in dtInsertEntry
  jfs: Remove use of folio error flag
  fs: Remove i_blocks_per_page
  jfs: Change metapage->page to metapage->folio
  jfs: Convert force_metapage to use a folio
  jfs: Convert inc_io to take a folio
  jfs: Convert page_to_mp to folio_to_mp
  jfs; Convert __invalidate_metapages to use a folio
  jfs: Convert dec_io to take a folio
  jfs: Convert drop_metapage and remove_metapage to take a folio
  jfs; Convert release_metapage to use a folio
  jfs: Convert insert_metapage() to take a folio
  jfs: Convert __get_metapage to use a folio
  jfs: Convert metapage_writepage to metapage_write_folio
  jfs: Convert metapage_read_folio to use folio APIs

Merge tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

- Remove tristate choice support from Kconfig

- Stop using the PROVIDE() directive in the linker script

- Reduce the number of links for the combination of CONFIG_KALLSYMS and
   CONFIG_DEBUG_INFO_BTF

- Enable the warning for symbol reference to .exit.* sections by
   default

- Fix warnings in RPM package builds

- Improve scripts/make_fit.py to generate a FIT image with separate
   base DTB and overlays

- Improve choice value calculation in Kconfig

- Fix conditional prompt behavior in choice in Kconfig

- Remove support for the uncommon EMAIL environment variable in Debian
   package builds

- Remove support for the uncommon "name <email>" form for the DEBEMAIL
   environment variable

- Raise the minimum supported GNU Make version to 4.0

- Remove stale code for the absolute kallsyms

- Move header files commonly used for host programs to scripts/include/

- Introduce the pacman-pkg target to generate a pacman package used in
   Arch Linux

- Clean up Kconfig

* tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits)
  kbuild: doc: gcc to CC change
  kallsyms: change sym_entry::percpu_absolute to bool type
  kallsyms: unify seq and start_pos fields of struct sym_entry
  kallsyms: add more original symbol type/name in comment lines
  kallsyms: use \t instead of a tab in printf()
  kallsyms: avoid repeated calculation of array size for markers
  kbuild: add script and target to generate pacman package
  modpost: use generic macros for hash table implementation
  kbuild: move some helper headers from scripts/kconfig/ to scripts/include/
  Makefile: add comment to discourage tools/* addition for kernel builds
  kbuild: clean up scripts/remove-stale-files
  kconfig: recursive checks drop file/lineno
  kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec
  kallsyms: get rid of code for absolute kallsyms
  kbuild: Create INSTALL_PATH directory if it does not exist
  kbuild: Abort make on install failures
  kconfig: remove 'e1' and 'e2' macros from expression deduplication
  kconfig: remove SYMBOL_CHOICEVAL flag
  kconfig: add const qualifiers to several function arguments
  kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups()
  ...

Merge tag 'rpmsg-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull rpmsg updates from Bjorn Andersson:

- fix interrupt handling in the stm32 remoteproc driver when being
   attached to an already running remote processor

- fix invalid kernel-doc and add missing MODULE_DESCRIPTION() in the
   rpmsg char driver

* tag 'rpmsg-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: char: add missing MODULE_DESCRIPTION() macro
  remoteproc: stm32_rproc: Fix mailbox interrupts queuing
  rpmsg: char: Fix rpmsg_eptdev structure documentation

Merge tag 'rproc-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull remoteproc updates from Bjorn Andersson:

- The maximum amount of DDR memory used by the Mediatek MT8188/MT8195
   SCP is increased to handle new use cases. Handling of optional L1TCM
   memory is made actually optional.

- An optimization is introduced to only clear the unused portion of IPI
   shared buffers, rather than the entire buffer before writing the
   message.

- Detection for IPC-only mode in the TI K3 DSP remoteproc driver is
   corrected. The loglevel of a debug print in the same is lowered from
   error.

- Support for attaching to a running remote processor is added to the
   Xilinx R5F.

- An in-kernel implementation of the Qualcomm "protected domain mapper"
   (aka service registry) service is introduced, to remove the
   dependency on a userspace implementation to detect when the battery
   monitor and USB Type-C port manager become available. This is then
   integrated with the Qualcomm remoteproc driver.

- The Qualcomm PAS remoteproc driver gains support for attempting to
   bust hwspinlocks held by the remote processor when it
   crashed/stopped.

- The TI OMAP remoteproc driver is transitioned to use devres helpers
   for various forms of allocations.

- Parsing of memory-regions in the i.MX remoteproc driver is improved
   to avoid a NULL pointer dereference if the phandle reference is
   empty. of_node reference counting is corrected in the same.
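
The IPI shared buffer optimization mentioned in the list above can be
sketched as follows.  The function name fill_share_buf() and its
parameters are hypothetical; this is a userspace model of "copy the
message, then zero only the tail", not the Mediatek driver code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write msg into the shared buffer and clear only
 * the bytes after it, instead of clearing the whole buffer up front.
 */
static int fill_share_buf(unsigned char *share_buf, size_t buf_size,
			  const void *msg, size_t len)
{
	if (len > buf_size)
		return -1;

	memcpy(share_buf, msg, len);
	memset(share_buf + len, 0, buf_size - len);
	return 0;
}
```

The end state matches "zero everything, then copy", but each byte of the
buffer is written once rather than twice for the message portion.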

* tag 'rproc-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  remoteproc: mediatek: Increase MT8188/MT8195 SCP core0 DRAM size
  remoteproc: k3-dsp: Fix log levels where appropriate
  remoteproc: xlnx: Add attach detach support
  remoteproc: qcom: select AUXILIARY_BUS
  remoteproc: k3-r5: Fix IPC-only mode detection
  remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
  remoteproc: qcom: enable in-kernel PD mapper
  dt-bindings: remoteproc: imx_rproc: Add minItems for power-domain
  remoteproc: imx_rproc: Fix refcount mistake in imx_rproc_addr_init
  remoteproc: omap: Use devm_rproc_add() helper
  remoteproc: omap: Use devm action to release reserved memory
  remoteproc: omap: Use devm_rproc_alloc() helper
  remoteproc: imx_rproc: Skip over memory region when node value is NULL
  dt-bindings: remoteproc: k3-dsp: Correct optional sram properties for AM62A SoCs
  remoteproc: qcom_q6v5_pas: Add hwspinlock bust on stop
  soc: qcom: smem: Add qcom_smem_bust_hwspin_lock_by_host()
  remoteproc: mediatek: Zero out only remaining bytes of IPI buffer

Merge tag 'hwlock-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull hwspinlock updates from Bjorn Andersson:
"This introduces a mechanism in the hardware spinlock framework, and
  the Qualcomm TCSR mutex driver, for allowing clients to bust locks
  held by a remote processor in the event that this enters a faulty
  state while holding the shared lock"

* tag 'hwlock-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  hwspinlock: qcom: implement bust operation
  hwspinlock: Introduce hwspin_lock_bust()

Merge tag 'sh-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh updates from John Paul Adrian Glaubitz:
"This is rather small this time and contains just three changes.

  The first change by Oscar Salvador drops support for memory hotplug
  and hotremove for sh as the kernel stopped supporting it on 32-bit
  platforms since 7ec58a2b941e ("mm/memory_hotplug: restrict
  CONFIG_MEMORY_HOTPLUG to 64 bit").

  That then results in a follow-up change to update all affected board
  config files.

  The third change comes from Jeff Johnson which adds the missing
  MODULE_DESCRIPTION() macro to the push-switch driver"

* tag 'sh-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: push-switch: Add missing MODULE_DESCRIPTION() macro
  sh: config: Drop CONFIG_MEMORY_{HOTPLUG,HOTREMOVE}
  sh: Drop support for memory hotplug and memory hotremove

Merge tag 'modules-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module update from Luis Chamberlain:
"This is a super boring development cycle this time around for modules,
  there is only one patch in this pull request.

  The patch deals with a corner case set of dependencies which is not
  resolved today to ensure users get the module they need on initramfs.
  Currently only one module is known to exist which needs this, however
  this can grow to capture other corner cases likely escaped and not
  reported before. The kernel change is just a section update, the real
  work is done and merged already on upstream kmod.

  This has been on linux-next for 3 weeks now"

* tag 'modules-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: create weak dependecies

Merge tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching update from Petr Mladek:

- show patch->replace flag in sysfs

- add or improve a few selftests

* tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  livepatch: Replace snprintf() with sysfs_emit()
  selftests/livepatch: Add selftests for "replace" sysfs attribute
  livepatch: Add "replace" sysfs attribute
  selftests: livepatch: Test atomic replace against multiple modules
  selftests/livepatch: define max test-syscall processes

Merge tag 'i2c-for-6.11-rc1-second-batch' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull more i2c updates from Wolfram Sang:
"The I2C core has two header documentation updates as the dependencies
  are in now.

  The I2C host drivers add some patches which nearly fell through the
  cracks:

   - Added descriptions in the DTS for the Qualcomm SM8650 and SM8550
     Camera Control Interface (CCI).

   - Added support for the "settle-time-us" property, which allows the
     gpio-mux device to switch from one bus to another with a
     configurable delay. The time can be set in the DTS. The latest
     change also includes file sorting.

   - Fixed slot numbering in the SMBus framework to prevent failures
     when more than 8 slots are occupied. It now enforces a maximum of
     8 slots to be used. This ensures that the Intel PIIX4 device can
     register the SPDs correctly without failure, even if other slots
     are populated but not used"

* tag 'i2c-for-6.11-rc1-second-batch' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: header: improve kdoc for i2c_algorithm
  i2c: header: remove unneeded stuff regarding i2c_algorithm
  i2c: piix4: Register SPDs
  i2c: smbus: remove i801 assumptions from SPD probing
  i2c: mux: gpio: Add support for the 'settle-time-us' property
  i2c: mux: gpio: Re-order #include to match alphabetic order
  dt-bindings: i2c: mux-gpio: Add 'settle-time-us' property
  dt-bindings: i2c: qcom-cci: Document sm8650 compatible
  dt-bindings: i2c: qcom-cci: Document sm8550 compatible

Merge tag 'mailbox-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox

Pull mailbox updates from Jassi Brar:
"broadcom:
   - remove unused pdc_dma_map

  imx:
   - fix TXDB_V2 channel race condition

  mediatek:
   - cleanup and refactor driver
   - add bindings for gce-props

  omap:
   - fix mailbox interrupt sharing

  qcom:
   - add bindings for SA8775p
   - add CPUCP driver

  zynqmp:
   - make polling period configurable"

* tag 'mailbox-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  mailbox: mtk-cmdq: Move devm_mbox_controller_register() after devm_pm_runtime_enable()
  mailbox: zynqmp-ipi: Make polling period configurable
  mailbox: qcom-cpucp: fix 64BIT dependency
  mailbox: Add support for QTI CPUCP mailbox controller
  dt-bindings: mailbox: qcom: Add CPUCP mailbox controller bindings
  dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP and GPDSP
  mailbox: mtk-cmdq: add missing MODULE_DESCRIPTION() macro
  mailbox: bcm-pdc: remove unused struct 'pdc_dma_map'
  mailbox: imx: fix TXDB_V2 channel race condition
  mailbox: omap: Fix mailbox interrupt sharing
  mailbox: mtk-cmdq: Dynamically allocate clk_bulk_data structure
  mailbox: mtk-cmdq: Move and partially refactor clocks probe
  mailbox: mtk-cmdq: Stop requiring name for GCE clock
  dt-bindings: mailbox: Add mediatek,gce-props.yaml

Merge tag 'pcmcia-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux

Pull PCMCIA updates from Dominik Brodowski:
"A number of tiny cleanups of the PCMCIA subsystem by Jeff Johnson,
  Jules Irenge, and Krzysztof Kozlowski"

* tag 'pcmcia-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
  pcmcia: add missing MODULE_DESCRIPTION() macros
  pcmcia: Use resource_size function on resource object
  pcmcia: bcm63xx: drop driver owner assignment

Merge tag 'for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
"Power-supply core:
   - new charging_orange_full_green RGB LED trigger
   - simplify and cleanup power-supply LED trigger code
   - expose power information via hwmon compatibility layer

  New hardware support:
   - enable battery support for Qualcomm Snapdragon X Elite
   - new battery driver for Maxim MAX17201/MAX17205
   - new battery driver for Lenovo Yoga C630 laptop (custom EC)

  Cleanups:
   - clean up 'struct i2c_device_id' initializations
   - misc small battery driver cleanups and fixes"

* tag 'for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: sysfs: use power_supply_property_is_writeable()
  power: supply: qcom_battmgr: Enable battery support on x1e80100
  power: supply: add support for MAX1720x standalone fuel gauge
  dt-bindings: power: supply: add support for MAX17201/MAX17205 fuel gauge
  power: reset: piix4: add missing MODULE_DESCRIPTION() macro
  power: supply: samsung-sdi-battery: Constify struct power_supply_maintenance_charge_table
  power: supply: samsung-sdi-battery: Constify struct power_supply_vbat_ri_table
  power: supply: lenovo_yoga_c630_battery: add Lenovo C630 driver
  power: supply: ingenic: Fix some error handling paths in ingenic_battery_get_property()
  power: supply: ab8500: Clean some error messages
  power: supply: ab8500: Use iio_read_channel_processed_scale()
  power: supply: ab8500: Fix error handling when calling iio_read_channel_processed()
  power: supply: hwmon: Add support for power sensors
  power: supply: ab8500: remove unused struct 'inst_curr_result_list'
  power: supply: bd99954: remove unused struct 'battery_data'
  power: supply: leds: Add activate() callback to triggers
  power: supply: leds: Share trig pointer for online and charging_full
  power: supply: leds: Add power_supply_[un]register_led_trigger()
  power: supply: Drop explicit initialization of struct i2c_device_id::driver_data to 0

Merge tag 'hsi-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi

Pull HSI update from Sebastian Reichel:

- drop unused gpio.h header from SSI McSAAB protocol driver

* tag 'hsi-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
  HSI: ssi_protocol: Remove unused linux/gpio.h

kbuild: doc: gcc to CC change

In this part of the documentation, $(CC) is meant, but gcc is written.

Signed-off-by: Ivan Davydov <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

Merge branch 'for-6.11/sysfs-patch-replace' into for-linus

arm64/sysreg: Correct the values for GICv4.1

Currently, sysreg uses the value 0b0010 to indicate the presence of
GICv4.1 in ID_PFR1_EL1 and ID_AA64PFR0_EL1, instead of 0b0011 as
specified by the ARM ARM. Correct both values to match the ARM ARM.

Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
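
As an illustration of the GICv4.1 correction above, here is a minimal C
sketch of decoding the GIC field from the two ID registers. The field
positions (bits [27:24] of ID_AA64PFR0_EL1, bits [31:28] of ID_PFR1_EL1)
follow the Arm ARM; the macro, enum, and function names are hypothetical
and are not the identifiers generated from the kernel's sysreg file.

```c
#include <stdint.h>

/* Field positions per the Arm ARM. */
#define ID_AA64PFR0_GIC_SHIFT 24 /* ID_AA64PFR0_EL1.GIC, bits [27:24] */
#define ID_PFR1_GIC_SHIFT     28 /* ID_PFR1_EL1.GIC, bits [31:28] */

/* Hypothetical names for the field encodings (illustration only). */
enum gic_cpuif {
	GIC_NI      = 0x0, /* no system register interface */
	GIC_V3_V4_0 = 0x1, /* GICv3.0/4.0 system register interface */
	GIC_V4P1    = 0x3, /* GICv4.1: 0b0011, not 0b0010 */
};

/* Extract a 4-bit ID field starting at bit 'shift'. */
static inline unsigned int id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* True if the AArch64 ID register value advertises exactly GICv4.1. */
static inline int has_gicv4p1(uint64_t id_aa64pfr0)
{
	return id_field(id_aa64pfr0, ID_AA64PFR0_GIC_SHIFT) == GIC_V4P1;
}
```

With the old value 0b0010, a register that reports 0b0011 would not
match the GICv4.1 encoding.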

arm64/vdso: Remove --hash-style=sysv

glibc added support for .gnu.hash in 2006, and .hash has been obsolete
for more than a decade in many Linux distributions. Using
--hash-style=sysv might imply unaddressed issues and confuse readers.

Just drop the option and rely on the linker default, which is likely
"both", or "gnu" when the distribution really wants to eliminate sysv
hash overhead.

Similar to commit 6b7e26547fad ("x86/vdso: Emit a GNU hash").

Signed-off-by: Fangrui Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
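
For readers unfamiliar with the two hash styles named above, the
following C sketch shows the symbol-name hash functions conventionally
used for the SysV .hash and GNU .gnu.hash sections. It is a standalone
illustration, not code from the vDSO build or from any linker, and the
function names are chosen here for clarity.

```c
#include <stdint.h>

/* SysV ELF hash, as used for lookups in the .hash section. */
static inline uint32_t sysv_hash(const char *name)
{
	uint32_t h = 0;

	for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
		uint32_t g;

		h = (h << 4) + *p;
		g = h & 0xf0000000u;
		if (g)
			h ^= g >> 24;
		h &= ~g;
	}
	return h;
}

/* GNU hash (h = h * 33 + c, seeded with 5381), used for .gnu.hash. */
static inline uint32_t gnu_hash(const char *name)
{
	uint32_t h = 5381;

	for (const unsigned char *p = (const unsigned char *)name; *p; p++)
		h = h * 33 + *p;
	return h;
}
```

The option only selects which of these tables the linker emits; with
"both", the dynamic loader can use either one.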

kselftest: missing arg in ptrace.c

The string passed to ksft_test_result_skip() is missing the `type_name`
argument.

Signed-off-by: Remington Brasga <[email protected]>
Reviewed-by: Dev Jain <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
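
The class of bug fixed above can be sketched with a small printf-style
helper. This is a minimal illustration only: format_skip() is a
hypothetical stand-in, not the kselftest API, and the message text is
invented. The format attribute is a GCC/Clang extension that lets the
compiler warn when a conversion has no matching argument.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style helper standing in for a skip reporter. */
__attribute__((format(printf, 3, 4)))
int format_skip(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

/*
 * Buggy call: "%s" has no matching argument, which is undefined
 * behavior and is flagged by -Wformat:
 *
 *	format_skip(buf, sizeof(buf), "%s not supported");
 *
 * Fixed call: pass the name the format string refers to:
 *
 *	format_skip(buf, sizeof(buf), "%s not supported", type_name);
 */
```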

arm64/Kconfig: Remove redundant 'if HAVE_FUNCTION_GRAPH_TRACER'

Since commit 819e50e25d0c ("arm64: Add ftrace support"),
HAVE_FUNCTION_GRAPH_TRACER has always been enabled. However, a subsequent
commit 364697032246 ("arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL")
redundantly added a check on HAVE_FUNCTION_GRAPH_TRACER while enabling the
config HAVE_FUNCTION_GRAPH_RETVAL. Let's just drop this redundant check.

Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Anshuman Khandual <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

arm64: remove redundant 'if HAVE_ARCH_KASAN' in Kconfig

Since commit 0383808e4d99 ("arm64: kasan: Reduce minimum shadow
alignment and enable 5 level paging"), HAVE_ARCH_KASAN is always 'y'.

The condition 'if HAVE_ARCH_KASAN' is always met.

Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

s390: Remove protvirt and kvm config guards for uv code

Remove the CONFIG_PROTECTED_VIRTUALIZATION_GUEST ifdefs and config
option, as well as the CONFIG_KVM ifdefs, in the uv files.

Having this configurable has been more of a pain than a help.
It's time to remove the ifdefs and the config option.

Signed-off-by: Janosch Frank <[email protected]>
Acked-by: Christian Borntraeger <[email protected]>
Acked-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/boot: Add cmdline option to relocate lowcore

Now that everything has been converted, add the option
'relocate_lowcore' to enable relocating the lowcore.

Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/kdump: Make kdump ready for lowcore relocation

In preparation for having the lowcore at an address other than zero,
add the base register to all lowcore accesses in store_status()
and __do_machine_kdump().

Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
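
The "add the base register" conversions in this series can be pictured
in C roughly as follows. This is a sketch under stated assumptions: the
structure is a heavily reduced, hypothetical stand-in for the real
lowcore layout, the function names are invented, and the actual changes
are made in s390 assembly rather than C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the s390 lowcore layout. */
struct lowcore_sketch {
	uint64_t current_task;
	uint64_t kernel_stack;
};

/*
 * Before: the lowcore is assumed to sit at absolute address zero, so
 * the address of a field is simply its offset.
 */
static inline uintptr_t field_addr_fixed(size_t offset)
{
	return (uintptr_t)offset;
}

/*
 * After: every access adds a base, which stays zero unless the lowcore
 * has been relocated.
 */
static inline uintptr_t field_addr_based(uintptr_t base, size_t offset)
{
	return base + offset;
}

/* The same idea through a pointer: the base becomes the struct pointer. */
static inline uint64_t read_kernel_stack(const struct lowcore_sketch *lc)
{
	return lc->kernel_stack;
}
```

With a base of zero both forms yield the same address, which is why the
conversion can be done before relocation is enabled.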

s390/entry: Make system_call() ready for lowcore relocation

In preparation for having the lowcore at an address other than zero,
add the base register to all lowcore accesses in system_call().

Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/entry: Make ret_from_fork() ready for lowcore relocation

In preparation for having the lowcore at an address other than zero,
add the base register to all lowcore accesses in ret_from_fork().

Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/entry: Make __switch_to() ready for lowcore relocation

In preparation for having the lowcore at an address other than zero,
add the base register to all lowcore accesses in __switch_to().

Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>