Git Repo - linux.git/log

i3c: mipi-i3c-hci: Error out instead on BUG_ON() in IBI DMA setup

Definitely condition dma_get_cache_alignment * defined value > 256
during driver initialization is not reason to BUG_ON(). Turn that to
graceful error out with -EINVAL.

Signed-off-by: Jarkko Nikula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: mipi-i3c-hci: Set IBI Status and Data Ring base addresses

IBI Status and Data Ring base address registers are not set so HW
obviously cannot update those rings after In-Band Interrupt.

Set them to already allocated and mapped ring addresses.

Signed-off-by: Jarkko Nikula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: mipi-i3c-hci: Switch to lower_32_bits()/upper_32_bits() helpers

Rather than having own lo32()/hi32() helpers for dealing with 32-bit and
64-bit build targets switch to generic lower_32_bits()/upper_32_bits()
helpers.

Signed-off-by: Jarkko Nikula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: dw: Remove ibi_capable property

Since DW I3C IP master role always supports IBI, we don't need
to keep two variants of master ops and select one using this
property. Hence remove the code.

Signed-off-by: Aniket <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: dw: Fix IBI intr programming

IBI_SIR_REQ_REJECT register is not present if the IP has
IC_HAS_IBI_DATA = 1 set. So don't rely on doing read-
modify-write op on this register.
Instead maintain a variable to store the sir reject mask
and use it to set IBI_SIR_REQ_REJECT.

Signed-off-by: Aniket <[email protected]>
Reviewed-by: Jeremy Kerr <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: dw: Fix clearing queue thld

QUEUE_THLD_CTRL_IBI_STAT_MASK is repeated twice.
Replace with QUEUE_THLD_CTRL_IBI_DATA_MASK.

Signed-off-by: Aniket <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: mipi-i3c-hci: Fix number of DAT/DCT entries for HCI versions < 1.1

I was wrong about the TABLE_SIZE field description in the
commit 0676bfebf576 ("i3c: mipi-i3c-hci: Fix DAT/DCT entry sizes").

For the MIPI I3C HCI versions 1.0 and earlier the TABLE_SIZE field in
the registers DAT_SECTION_OFFSET and DCT_SECTION_OFFSET is indeed defined
in DWORDs and not number of entries like it is defined in later versions.

Where above fix allowed driver initialization to continue the wrongly
interpreted TABLE_SIZE field leads variables DAT_entries being twice and
DCT_entries four times as big as they really are.

That in turn leads clearing the DAT table over the boundary in the
dat_v1.c: hci_dat_v1_init().

So interprete the TABLE_SIZE field in DWORDs for HCI versions < 1.1 and
fix number of DAT/DCT entries accordingly.

Fixes: 0676bfebf576 ("i3c: mipi-i3c-hci: Fix DAT/DCT entry sizes")
Signed-off-by: Jarkko Nikula <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>

i3c: master: svc: resend target address when get NACK

According to I3C Spec 1.1.1, 11-Jun-2021, section: 5.1.2.2.3:

If the Controller chooses to start an I3C Message with an I3C Dynamic
Address, then special provisions shall be made because that same I3C Target
may be initiating an IBI or a Controller Role Request. So, one of three
things may happen: (skip 1, 2)

3. The Addresses match and the RnW bits also match, and so neither
Controller nor Target will ACK since both are expecting the other side to
provide ACK. As a result, each side might think it had "won" arbitration,
but neither side would continue, as each would subsequently see that the
other did not provide ACK.
...
For either value of RnW: Due to the NACK, the Controller shall defer the
Private Write or Private Read, and should typically transmit the Target
^^^^^^^^^^^^^^^^^^^
Address again after a Repeated START (i.e., the next one or any one prior
^^^^^^^^^^^^^
to a STOP in the Frame). Since the Address Header following a Repeated
START is not arbitrated, the Controller will always win (see Section
5.1.2.2.4).

Resend target address again if address is not 7E and controller get NACK.

Reviewed-by: Miquel Raynal <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>

erofs: convert comma to semicolon

Replace a comma between expression statements by a semicolon.

Signed-off-by: Chen Ni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>

erofs: support multi-page folios for erofs_bread()

If the requested page is part of the previous multi-page folio, there
is no need to call read_mapping_folio() again.

Also, get rid of the remaining one of page->index [1] in our codebase.

[1] https://lore.kernel.org/r/[email protected]

Cc: Matthew Wilcox <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

erofs: add support for FS_IOC_GETFSSYSFSPATH

FS_IOC_GETFSSYSFSPATH ioctl exposes /sys/fs path of a given filesystem,
potentially standarizing sysfs reporting. This patch add support for
FS_IOC_GETFSSYSFSPATH for erofs, "erofs/<dev>" will be outputted for bdev
cases, "erofs/[domain_id,]<fs_id>" will be outputted for fscache cases.

Signed-off-by: Huang Xiaojia <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>

erofs: fix race in z_erofs_get_gbuf()

In z_erofs_get_gbuf(), the current task may be migrated to another
CPU between `z_erofs_gbuf_id()` and `spin_lock(&gbuf->lock)`.

Therefore, z_erofs_put_gbuf() will trigger the following issue
which was found by stress test:

<2>[772156.434168] kernel BUG at fs/erofs/zutil.c:58!
..
<4>[772156.435007]
<4>[772156.439237] CPU: 0 PID: 3078 Comm: stress Kdump: loaded Tainted: G            E      6.10.0-rc7+ #2
<4>[772156.439239] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 1.0.0 01/01/2017
<4>[772156.439241] pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
<4>[772156.439243] pc : z_erofs_put_gbuf+0x64/0x70 [erofs]
<4>[772156.439252] lr : z_erofs_lz4_decompress+0x600/0x6a0 [erofs]
..
<6>[772156.445958] stress (3127): drop_caches: 1
<4>[772156.446120] Call trace:
<4>[772156.446121]  z_erofs_put_gbuf+0x64/0x70 [erofs]
<4>[772156.446761]  z_erofs_lz4_decompress+0x600/0x6a0 [erofs]
<4>[772156.446897]  z_erofs_decompress_queue+0x740/0xa10 [erofs]
<4>[772156.447036]  z_erofs_runqueue+0x428/0x8c0 [erofs]
<4>[772156.447160]  z_erofs_readahead+0x224/0x390 [erofs]
..

Fixes: f36f3010f676 ("erofs: rename per-CPU buffers to global buffer pool and make it configurable")
Cc: <[email protected]> # 6.10+
Reviewed-by: Chunhai Guo <[email protected]>
Reviewed-by: Sandeep Dhavale <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

erofs: support STATX_DIOALIGN

Add support for STATX_DIOALIGN to EROFS, so that direct I/O
alignment restrictions are exposed to userspace in a generic
way.

[Before]
```
./statx_test /mnt/erofs/testfile
statx(/mnt/erofs/testfile) = 0
dio mem align:0
dio offset align:0
```

[After]
```
./statx_test /mnt/erofs/testfile
statx(/mnt/erofs/testfile) = 0
dio mem align:512
dio offset align:512
```

Signed-off-by: Hongbo Li <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Merge tag 'amd-drm-fixes-6.11-2024-07-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-fixes-6.11-2024-07-25:

amdgpu:
- SDMA 5.2 workaround
- GFX12 fixes
- Uninitialized variable fix
- VCN/JPEG 4.0.3 fixes
- Misc display fixes
- RAS fixes
- VCN4/5 harvest fix
- GPU reset fix

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'drm-misc-next-fixes-2024-07-25' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

A single fix for a panel compatible

Signed-off-by: Dave Airlie <[email protected]>
From: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20240725-frisky-wren-of-tact-f5f504@houat

Merge tag 'drm-intel-next-fixes-2024-07-25' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

- Do not consider preemption during execlists_dequeue for gen8 [gt] (Nitin Gote)
- Allow NULL memory region (Jonathan Cavitt)

Signed-off-by: Dave Airlie <[email protected]>
From: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/ZqICQzyzm/6hDWy4@linux

Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.

  A lot of networking people were at a conference last week, busy
  catching COVID, so relatively short PR.

  Current release - regressions:

   - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP

  Current release - new code bugs:

   - l2tp: protect session IDR and tunnel session list with one lock,
     make sure the state is coherent to avoid a warning

   - eth: bnxt_en: update xdp_rxq_info in queue restart logic

   - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field

  Previous releases - regressions:

   - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
     the field reuses previously un-validated pad

  Previous releases - always broken:

   - tap/tun: drop short frames to prevent crashes later in the stack

   - eth: ice: add a per-VF limit on number of FDIR filters

   - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash"

* tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
  tun: add missing verification for short frame
  tap: add missing verification for short frame
  mISDN: Fix a use after free in hfcmulti_tx()
  gve: Fix an edge case for TSO skb validity check
  bnxt_en: update xdp_rxq_info in queue restart logic
  tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
  xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
  bpf: Fix a segment issue when downgrading gso_size
  net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
  MAINTAINERS: make Breno the netconsole maintainer
  MAINTAINERS: Update bonding entry
  net: nexthop: Initialize all fields in dumped nexthops
  net: stmmac: Correct byte order of perfect_match
  selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
  tipc: Return non-zero value from tipc_udp_addr2str() on error
  netfilter: nft_set_pipapo_avx2: disable softinterrupts
  ice: Fix recipe read procedure
  ice: Add a per-VF limit on number of FDIR filters
  net: bonding: correctly annotate RCU in bond_should_notify_peers()
  ...

Merge tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

- trivial printk changes

The bigger "real" printk work is still being discussed.

* tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
vsprintf: add missing MODULE_DESCRIPTION() macro
printk: Rename console_replay_all() and update context

Merge tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl constification from Joel Granados:
"Treewide constification of the ctl_table argument of proc_handlers
  using a coccinelle script and some manual code formatting fixups.

  This is a prerequisite to moving the static ctl_table structs into
  read-only data section which will ensure that proc_handler function
  pointers cannot be modified"

* tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: treewide: constify the ctl_table argument of proc_handlers

Merge tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

- Wipe screen_info after allocating it from the heap - used by arm32
   and EFI zboot, other EFI architectures allocate it statically

- Revert to allocating boot_params from the heap on x86 when entering
   via the native PE entrypoint, to work around a regression on older
   Dell hardware

* tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efistub: Revert to heap allocated boot_params for PE entrypoint
  efi/libstub: Zero initialize heap allocated struct screen_info

Merge tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb updates from Daniel Thompson:
"Three small changes this cycle:

   - Clean up an architecture abstraction that is no longer needed
     because all the architectures have converged.

   - Actually use the prompt argument to kdb_position_cursor() instead
     of ignoring it (functionally this fix is a nop but that was due to
     luck rather than good judgement)

   - Fix a -Wformat-security warning"

* tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kdb: Get rid of redundant kdb_curr_task()
  kdb: Use the passed prompt in kdb_position_cursor()
  kdb: address -Wformat-security warnings

Merge tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

- Use improved timer sync for Loongson64

- Fix address of GCR_ACCESS register

- Add missing MODULE_DESCRIPTION

* tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: sibyte: add missing MODULE_DESCRIPTION() macro
  MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later
  MIPS: Loongson64: Switch to SYNC_R4K

Merge tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:
"The gettimeofday() and clock_gettime() syscalls are now available as
  vDSO functions, and Dave added a patch which allows to use NVMe cards
  in the PCI slots as fast and easy alternative to SCSI discs.

  Summary:

   - add gettimeofday() and clock_gettime() vDSO functions

   - enable PCI_MSI_ARCH_FALLBACKS to allow PCI to PCIe bridge adaptor
     with PCIe NVME card to function in parisc machines

   - allow users to reduce kernel unaligned runtime warnings

   - minor code cleanups"

* tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Add support for CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
  parisc: Use max() to calculate parisc_tlb_flush_threshold
  parisc: Fix warning at drivers/pci/msi/msi.h:121
  parisc: Add 64-bit gettimeofday() and clock_gettime() vDSO functions
  parisc: Add 32-bit gettimeofday() and clock_gettime() vDSO functions
  parisc: Clean up unistd.h file

Merge tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML updates from Richard Weinberger:

- Support for preemption

- i386 Rust support

- Huge cleanup by Benjamin Berg

- UBSAN support

- Removal of dead code

* tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits)
  um: vector: always reset vp->opened
  um: vector: remove vp->lock
  um: register power-off handler
  um: line: always fill *error_out in setup_one_line()
  um: remove pcap driver from documentation
  um: Enable preemption in UML
  um: refactor TLB update handling
  um: simplify and consolidate TLB updates
  um: remove force_flush_all from fork_handler
  um: Do not flush MM in flush_thread
  um: Delay flushing syscalls until the thread is restarted
  um: remove copy_context_skas0
  um: remove LDT support
  um: compress memory related stub syscalls while adding them
  um: Rework syscall handling
  um: Add generic stub_syscall6 function
  um: Create signal stack memory assignment in stub_data
  um: Remove stub-data.h include from common-offsets.h
  um: time-travel: fix signal blocking race/hang
  um: time-travel: remove time_exit()
  ...

Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
"Here is the big set of driver core changes for 6.11-rc1.

  Lots of stuff in here, with not a huge diffstat, but apis are evolving
  which required lots of files to be touched. Highlights of the changes
  in here are:

   - platform remove callback api final fixups (Uwe took many releases
     to get here, finally!)

   - Rust bindings for basic firmware apis and initial driver-core
     interactions.

     It's not all that useful for a "write a whole driver in rust" type
     of thing, but the firmware bindings do help out the phy rust
     drivers, and the driver core bindings give a solid base on which
     others can start their work.

     There is still a long way to go here before we have a multitude of
     rust drivers being added, but it's a great first step.

   - driver core const api changes.

     This reached across all bus types, and there are some fix-ups for
     some not-common bus types that linux-next and 0-day testing shook
     out.

     This work is being done to help make the rust bindings more safe,
     as well as the C code, moving toward the end-goal of allowing us to
     put driver structures into read-only memory. We aren't there yet,
     but are getting closer.

   - minor devres cleanups and fixes found by code inspection

   - arch_topology minor changes

   - other minor driver core cleanups

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  ARM: sa1100: make match function take a const pointer
  sysfs/cpu: Make crash_hotplug attribute world-readable
  dio: Have dio_bus_match() callback take a const *
  zorro: make match function take a const pointer
  driver core: module: make module_[add|remove]_driver take a const *
  driver core: make driver_find_device() take a const *
  driver core: make driver_[create|remove]_file take a const *
  firmware_loader: fix soundness issue in `request_internal`
  firmware_loader: annotate doctests as `no_run`
  devres: Correct code style for functions that return a pointer type
  devres: Initialize an uninitialized struct member
  devres: Fix memory leakage caused by driver API devm_free_percpu()
  devres: Fix devm_krealloc() wasting memory
  driver core: platform: Switch to use kmemdup_array()
  driver core: have match() callback in struct bus_type take a const *
  MAINTAINERS: add Rust device abstractions to DRIVER CORE
  device: rust: improve safety comments
  MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
  MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
  firmware: rust: improve safety comments
  ...

Merge tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

- make watchdog_class const

- rework of the rzg2l_wdt driver

- other small fixes and improvements

* tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog:
  dt-bindings: watchdog: dlg,da9062-watchdog: Drop blank space
  watchdog: rzn1: Convert comma to semicolon
  watchdog: lenovo_se10_wdt: Convert comma to semicolon
  dt-bindings: watchdog: renesas,wdt: Document RZ/G3S support
  watchdog: rzg2l_wdt: Add suspend/resume support
  watchdog: rzg2l_wdt: Rely on the reset driver for doing proper reset
  watchdog: rzg2l_wdt: Remove comparison with zero
  watchdog: rzg2l_wdt: Remove reset de-assert from probe
  watchdog: rzg2l_wdt: Check return status of pm_runtime_put()
  watchdog: rzg2l_wdt: Use pm_runtime_resume_and_get()
  watchdog: rzg2l_wdt: Make the driver depend on PM
  watchdog: rzg2l_wdt: Restrict the driver to ARCH_RZG2L and ARCH_R9A09G011
  watchdog: imx7ulp_wdt: keep already running watchdog enabled
  watchdog: starfive: Add missing clk_disable_unprepare()
  watchdog: Make watchdog_class const

Merge tag 'dma-mapping-6.11-2024-07-24' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

- fix the order of actions in dmam_free_coherent (Lance Richardson)

* tag 'dma-mapping-6.11-2024-07-24' of git://git.infradead.org/users/hch/dma-mapping:
dma: fix call order in dmam_free_coherent

Merge tag 'asoc-fix-v6.11-merge-window' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.11

A selection of routine fixes and quirks that came in since the merge
window. The fsl-asoc-card change is a fix for systems with multiple
cards where updating templates in place leaks data from one card to
another.

Merge branch 'tap-tun-harden-by-dropping-short-frame'

Dongli Zhang says:

====================
tap/tun: harden by dropping short frame

This is to harden all of tap/tun to avoid any short frame smaller than the
Ethernet header (ETH_HLEN).

While the xen-netback already rejects short frame smaller than ETH_HLEN ...

914 static void xenvif_tx_build_gops(struct xenvif_queue *queue,
915                                      int budget,
916                                      unsigned *copy_ops,
917                                      unsigned *map_ops)
918 {
... ...
1007                 if (unlikely(txreq.size < ETH_HLEN)) {
1008                         netdev_dbg(queue->vif->dev,
1009                                    "Bad packet size: %d\n", txreq.size);
1010                         xenvif_tx_err(queue, &txreq, extra_count, idx);
1011                         break;
1012                 }

... the short frame may not be dropped by vhost-net/tap/tun.

This fixes CVE-2024-41090 and CVE-2024-41091.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tun: add missing verification for short frame

The cited commit missed to check against the validity of the frame length
in the tun_xdp_one() path, which could cause a corrupted skb to be sent
downstack. Even before the skb is transmitted, the
tun_xdp_one-->eth_type_trans() may access the Ethernet header although it
can be less than ETH_HLEN. Once transmitted, this could either cause
out-of-bound access beyond the actual length, or confuse the underlayer
with incorrect or inconsistent header length in the skb metadata.

In the alternative path, tun_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted for
IFF_TAP.

This is to drop any frame shorter than the Ethernet header size just like
how tun_get_user() does.

CVE: CVE-2024-41091
Inspired-by: https://lore.kernel.org/netdev/[email protected]/
Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
Cc: [email protected]
Signed-off-by: Dongli Zhang <[email protected]>
Reviewed-by: Si-Wei Liu <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Paolo Abeni <[email protected]>
Reviewed-by: Jason Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tap: add missing verification for short frame

The cited commit missed to check against the validity of the frame length
in the tap_get_user_xdp() path, which could cause a corrupted skb to be
sent downstack. Even before the skb is transmitted, the
tap_get_user_xdp()-->skb_set_network_header() may assume the size is more
than ETH_HLEN. Once transmitted, this could either cause out-of-bound
access beyond the actual length, or confuse the underlayer with incorrect
or inconsistent header length in the skb metadata.

In the alternative path, tap_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted.

This is to drop any frame shorter than the Ethernet header size just like
how tap_get_user() does.

CVE: CVE-2024-41090
Link: https://lore.kernel.org/netdev/[email protected]/
Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
Cc: [email protected]
Signed-off-by: Si-Wei Liu <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Paolo Abeni <[email protected]>
Reviewed-by: Jason Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

mISDN: Fix a use after free in hfcmulti_tx()

Don't dereference *sp after calling dev_kfree_skb(*sp).

Fixes: af69fb3a8ffa ("Add mISDN HFC multiport driver")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

gve: Fix an edge case for TSO skb validity check

The NIC requires each TSO segment to not span more than 10
descriptors. NIC further requires each descriptor to not exceed
16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO).

The descriptors for an skb are generated by
gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format.
gve_tx_add_skb_no_copy_dqo() loops through each skb frag and
generates a descriptor for the entire frag if the frag size is
not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is
greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s)
of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for
the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO).

gve_can_send_tso() checks if the descriptors thus generated for an
skb would meet the requirement that each TSO-segment not span more
than 10 descriptors. However, the current code misses an edge case
when a TSO segment spans multiple descriptors within a large frag.
This change fixes the edge case.

gve_can_send_tso() relies on the assumption that max gso size (9728)
is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb
fragment a TSO segment can never span more than 2 descriptors.

Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Praveen Kaligineedi <[email protected]>
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Jeroen de Borst <[email protected]>
Cc: [email protected]
Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

bnxt_en: update xdp_rxq_info in queue restart logic

When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver
updates(creates and deletes) a page_pool.
But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still
connected to an old page_pool.
So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but
bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool.

An old page_pool is no longer used so it is supposed to be
deleted by page_pool_destroy() but it isn't.
Because the xdp_rxq_info is holding the reference count for it and the
xdp_rxq_info is not updated, an old page_pool will not be deleted in
the queue restart logic.

Before restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 4 (zombies: 0)
refs: 8192 bytes: 33554432 (refs: 0 bytes: 0)
recycling: 0.0% (alloc: 128:8048 recycle: 0:0)

After restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 5 (zombies: 0)
refs: 10240 bytes: 41943040 (refs: 0 bytes: 0)
recycling: 20.0% (alloc: 160:10080 recycle: 1920:128)

Before restarting queues, an interface has 4 page_pools.
After restarting one queue, an interface has 5 page_pools, but it
should be 4, not 5.
The reason is that queue restarting logic creates a new page_pool and
an old page_pool is not deleted due to the absence of an update of
xdp_rxq_info logic.

Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <[email protected]>
Reviewed-by: David Wei <[email protected]>
Reviewed-by: Somnath Kotur <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

io_uring/msg_ring: fix uninitialized use of target_req->flags

syzbot reports that KMSAN complains that 'nr_tw' is an uninit-value
with the following report:

BUG: KMSAN: uninit-value in io_req_local_work_add io_uring/io_uring.c:1192 [inline]
BUG: KMSAN: uninit-value in io_req_task_work_add_remote+0x588/0x5d0 io_uring/io_uring.c:1240
io_req_local_work_add io_uring/io_uring.c:1192 [inline]
io_req_task_work_add_remote+0x588/0x5d0 io_uring/io_uring.c:1240
io_msg_remote_post io_uring/msg_ring.c:102 [inline]
io_msg_data_remote io_uring/msg_ring.c:133 [inline]
io_msg_ring_data io_uring/msg_ring.c:152 [inline]
io_msg_ring+0x1c38/0x1ef0 io_uring/msg_ring.c:305
io_issue_sqe+0x383/0x22c0 io_uring/io_uring.c:1710
io_queue_sqe io_uring/io_uring.c:1924 [inline]
io_submit_sqe io_uring/io_uring.c:2180 [inline]
io_submit_sqes+0x1259/0x2f20 io_uring/io_uring.c:2295
__do_sys_io_uring_enter io_uring/io_uring.c:3205 [inline]
__se_sys_io_uring_enter+0x40c/0x3ca0 io_uring/io_uring.c:3142
__x64_sys_io_uring_enter+0x11f/0x1a0 io_uring/io_uring.c:3142
x64_sys_call+0x2d82/0x3c10 arch/x86/include/generated/asm/syscalls_64.h:427
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

which is the following check:

if (nr_tw < nr_wait)
return;

in io_req_local_work_add(). While nr_tw itself cannot be uninitialized,
it does depend on req->flags, which off the msg ring issue path can
indeed be uninitialized.

Fix this by always clearing the allocated 'req' fully if we can't grab
one from the cache itself.

Fixes: 50cf5f3842af ("io_uring/msg_ring: add an alloc cache for io_kiocb entries")
Reported-by: [email protected]
Link: https://lore.kernel.org/io-uring/[email protected]/
Signed-off-by: Jens Axboe <[email protected]>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2024-07-25

We've added 14 non-merge commits during the last 8 day(s) which contain
a total of 19 files changed, 177 insertions(+), 70 deletions(-).

The main changes are:

1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and
   BPF sockhash. Also add test coverage for this case, from Michal Luczaj.

2) Fix a segmentation issue when downgrading gso_size in the BPF helper
   bpf_skb_adjust_room(), from Fred Li.

3) Fix a compiler warning in resolve_btfids due to a missing type cast,
   from Liwei Song.

4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte
   boundary in the fexit_sleep BPF selftest, from Puranjay Mohan.

5) Fix a xsk regression to require a flag when actuating tx_metadata_len,
   from Stanislav Fomichev.

6) Fix function prototype BTF dumping in libbpf for prototypes that have
   no input arguments, from Andrii Nakryiko.

7) Fix stacktrace symbol resolution in perf script for BPF programs
   containing subprograms, from Hou Tao.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
  xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
  bpf: Fix a segment issue when downgrading gso_size
  tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids
  bpf, events: Use prog to emit ksymbol event for main program
  selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB
  selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags
  selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected()
  af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
  bpftool: Fix typo in usage help
  libbpf: Fix no-args func prototype BTF dumping syntax
  MAINTAINERS: Update powerpc BPF JIT maintainers
  MAINTAINERS: Update email address of Naveen
  selftests/bpf: fexit_sleep: Fix stack allocation for arm64
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

nvme-pci: add missing condition check for existence of mapped data

nvme_map_data() is called when request has physical segments, hence
the nvme_unmap_data() should have same condition to avoid dereference.

Fixes: 4aedb705437f ("nvme-pci: split metadata handling from nvme_map_data / nvme_unmap_data")
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Nitesh Shetty <[email protected]>
Signed-off-by: Keith Busch <[email protected]>

ASoC: fsl-asoc-card: Dynamically allocate memory for snd_soc_dai_link_components

The static snd_soc_dai_link_components cause conflict for multiple
instances of this generic driver. For example, when there is
wm8962 and SPDIF case enabled together, the contaminated
snd_soc_dai_link_components will cause another device probe fail.

Fixes: 6d174cc4f224 ("ASoC: fsl-asoc-card: merge spdif support from imx-spdif.c")
Signed-off-by: Shengjiu Wang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Mark Brown <[email protected]>

arm64: mm: Fix lockless walks with static and dynamic page-table folding

Lina reports random oopsen originating from the fast GUP code when
16K pages are used with 4-level page-tables, the fourth level being
folded at runtime due to lack of LPA2.

In this configuration, the generic implementation of
p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
This is normally fine, but when the fourth level of page-table is folded
at runtime, pud_offset_lockless() will offset from the address of the
'p4d_t' to calculate the address of the PUD in the same page-table page.
This results in a stray stack read when the 'p4d_t' has been allocated
on the stack and can send the walker into the weeds.

Fix the problem by providing our own definition of p4d_offset_lockless()
when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
pointer rather than the address of the local stack variable.

Cc: Catalin Marinas <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Fixes: 0dd4f60a2c76 ("arm64: mm: Add support for folding PUDs at runtime")
Reported-by: Asahi Lina <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

iommu: arm-smmu: Fix Tegra workaround for PAGE_SIZE mappings

PAGE_SIZE can be 16KB for Tegra which is not supported by MMU-500 on
both Tegra194 and Tegra234. Retain only valid granularities from
pgsize_bitmap which would either be 4KB or 64KB.

Signed-off-by: Ashish Mhetre <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

of: remove internal arguments from of_property_for_each_u32()

The of_property_for_each_u32() macro needs five parameters, two of which
are primarily meant as internal variables for the macro itself (in the
for() clause). Yet these two parameters are used by a few drivers, and this
can be considered misuse or at least bad practice.

Now that the kernel uses C11 to build, these two parameters can be avoided
by declaring them internally, thus changing this pattern:

  struct property *prop;
  const __be32 *p;
  u32 val;

  of_property_for_each_u32(np, "xyz", prop, p, val) { ... }

to this:

  u32 val;

  of_property_for_each_u32(np, "xyz", val) { ... }

However two variables cannot be declared in the for clause even with C11,
so declare one struct that contain the two variables we actually need. As
the variables inside this struct are not meant to be used by users of this
macro, give the struct instance the noticeable name "_it" so it is visible
during code reviews, helping to avoid new code to use it directly.

Most usages are trivially converted as they do not use those two
parameters, as expected. The non-trivial cases are:

- drivers/clk/clk.c, of_clk_get_parent_name(): easily doable anyway
- drivers/clk/clk-si5351.c, si5351_dt_parse(): this is more complex as the
   checks had to be replicated in a different way, making code more verbose
   and somewhat uglier, but I refrained from a full rework to keep as much
   of the original code untouched having no hardware to test my changes

All the changes have been build tested. The few for which I have the
hardware have been runtime-tested too.

Reviewed-by: Andre Przywara <[email protected]> # drivers/clk/sunxi/clk-simple-gates.c, drivers/clk/sunxi/clk-sun8i-bus-gates.c
Acked-by: Bartosz Golaszewski <[email protected]> # drivers/gpio/gpio-brcmstb.c
Acked-by: Nicolas Ferre <[email protected]> # drivers/irqchip/irq-atmel-aic-common.c
Acked-by: Jonathan Cameron <[email protected]> # drivers/iio/adc/ti_am335x_adc.c
Acked-by: Uwe Kleine-König <[email protected]> # drivers/pwm/pwm-samsung.c
Acked-by: Richard Leitner <[email protected]> # drivers/usb/misc/usb251xb.c
Acked-by: Mark Brown <[email protected]> # sound/soc/codecs/arizona.c
Reviewed-by: Richard Fitzgerald <[email protected]> # sound/soc/codecs/arizona.c
Acked-by: Michael Ellerman <[email protected]> # arch/powerpc/sysdev/xive/spapr.c
Acked-by: Stephen Boyd <[email protected]> # clk
Signed-off-by: Luca Ceresoli <[email protected]>
Acked-by: Lee Jones <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring (Arm) <[email protected]>

dt-bindings: watchdog: add support for Amlogic A4 SoCs

Update dt-binding document for watchdog of Amlogic A4 SoCs.

Signed-off-by: Huqiang Qin <[email protected]>
Signed-off-by: Xianwei Zhao <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring (Arm) <[email protected]>

ASoC: amd: yc: Support mic on Lenovo Thinkpad E16 Gen 2

Lenovo Thinkpad E16 Gen 2 AMD model (model 21M5) needs a corresponding
quirk entry for making the internal mic working.

Link: https://bugzilla.suse.com/show_bug.cgi?id=1228269
Cc: [email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Mark Brown <[email protected]>

x86/xen: fix memblock_reserve() usage on PVH

The current usage of memblock_reserve() in init_pvh_bootparams() is done before
the .bss is zeroed, and that used to be fine when
memblock_reserved_init_regions implicitly ended up in the .meminit.data
section. However after commit 73db3abdca58c memblock_reserved_init_regions
ends up in the .bss section, thus breaking it's usage before the .bss is
cleared.

Move and rename the call to xen_reserve_extra_memory() so it's done in the
x86_init.oem.arch_setup hook, which gets executed after the .bss has been
zeroed, but before calling e820__memory_setup().

Fixes: 73db3abdca58c ("init/modpost: conditionally check section mismatch to __meminit*")
Signed-off-by: Roger Pau Monné <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Message-ID: <20240725073116 [email protected]>
Signed-off-by: Juergen Gross <[email protected]>

x86/xen: move xen_reserve_extra_memory()

In preparation for making the function static.

No functional change.

Signed-off-by: Roger Pau Monné <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Message-ID: <20240725073116 [email protected]>
Signed-off-by: Juergen Gross <[email protected]>

tcp: process the 3rd ACK with sk_socket for TFO/MPTCP

The 'Fixes' commit recently changed the behaviour of TCP by skipping the
processing of the 3rd ACK when a sk->sk_socket is set. The goal was to
skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an
unnecessary ACK in case of simultaneous connect(). Unfortunately, that
had an impact on TFO and MPTCP.

I started to look at the impact on MPTCP, because the MPTCP CI found
some issues with the MPTCP Packetdrill tests [1]. Then Paolo Abeni
suggested me to look at the impact on TFO with "plain" TCP.

For MPTCP, when receiving the 3rd ACK of a request adding a new path
(MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that
has been created when the MPTCP connection got established before with
the first path. The newly added 'goto' will then skip the processing of
the segment text (step 7) and not go through tcp_data_queue() where the
MPTCP options are validated, and some actions are triggered, e.g.
sending the MPJ 4th ACK [2] as demonstrated by the new errors when
running a packetdrill test [3] establishing a second subflow.

This doesn't fully break MPTCP, mainly the 4th MPJ ACK that will be
delayed. Still, we don't want to have this behaviour as it delays the
switch to the fully established mode, and invalid MPTCP options in this
3rd ACK will not be caught any more. This modification also affects the
MPTCP + TFO feature as well, and being the reason why the selftests
started to be unstable the last few days [4].

For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer
passing: if the 3rd ACK contains data, and the connection is accept()ed
before receiving them, these data would no longer be processed, and thus
not ACKed.

One last thing about MPTCP, in case of simultaneous connect(), a
fallback to TCP will be done, which seems fine:

  `../common/defaults.sh`

   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)                 <mss 1460, sackOK, TS val 100 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 < S  0:0(0) win 1000        <mss 1460, sackOK, TS val 407 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 > S. 0:0(0) ack 1           <mss 1460, sackOK, TS val 330 ecr 0,   nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 < S. 0:0(0) ack 1 win 65535 <mss 1460, sackOK, TS val 700 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
  +0 >  . 1:1(0) ack 1           <nop, nop, TS val 845707014 ecr 700, nop, nop, sack 0:1>

Simultaneous SYN-data crossing is also not supported by TFO, see [6].

Kuniyuki Iwashima suggested to restrict the processing to SYN+ACK only:
that's a more generic solution than the one initially proposed, and
also enough to fix the issues described above.

Later on, Eric Dumazet mentioned that an ACK should still be sent in
reaction to the second SYN+ACK that is received: not sending a DUPACK
here seems wrong and could hurt:

   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)                <mss 1460, sackOK, TS val 1000 ecr 0,nop,wscale 8>
  +0 < S  0:0(0)       win 1000 <mss 1000, sackOK, nop, nop>
  +0 > S. 0:0(0) ack 1          <mss 1460, sackOK, TS val 3308134035 ecr 0,nop,wscale 8>
  +0 < S. 0:0(0) ack 1 win 1000 <mss 1000, sackOK, nop, nop>
  +0 >  . 1:1(0) ack 1          <nop, nop, sack 0:1>  // <== Here

So in this version, the 'goto consume' is dropped, to always send an ACK
when switching from TCP_SYN_RECV to TCP_ESTABLISHED. This ACK will be
seen as a DUPACK -- with DSACK if SACK has been negotiated -- in case of
simultaneous SYN crossing: that's what is expected here.

Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9936227696
Link: https://datatracker.ietf.org/doc/html/rfc8684#fig_tokens
Link: https://github.com/multipath-tcp/packetdrill/blob/mptcp-net-next/gtests/net/mptcp/syscalls/accept.pkt#L28
Link: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/server/basic-cookie-not-reqd.pkt#L21
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/client/simultaneous-fast-open.pkt
Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")
Suggested-by: Paolo Abeni <[email protected]>
Suggested-by: Kuniyuki Iwashima <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/20240724-upstream-net-next-20240716-tcp-3rd-ack-consume-sk_socket-v3-1-d48339764ce9@kernel.org
Signed-off-by: Paolo Abeni <[email protected]>

rbd: don't assume rbd_is_lock_owner() for exclusive mappings

Expanding on the previous commit, assuming that rbd_is_lock_owner()
always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
In case ceph_cls_set_cookie() fails, the lock would be temporarily
released even if the mapping is exclusive, meaning that we can end up
even in RBD_LOCK_STATE_UNLOCKED.

IOW, exclusive mappings are really "just" about disabling automatic
lock transitions (as documented in the man page), not about grabbing
the lock and holding on to it whatever it takes.

Cc: [email protected]
Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings

Every time a watch is reestablished after getting lost, we need to
update the cookie which involves quiescing exclusive lock.  For this,
we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
roughly for the duration of rbd_reacquire_lock() call.  If the mapping
is exclusive and I/O happens to arrive in this time window, it's failed
with EROFS (later translated to EIO) based on the wrong assumption in
rbd_img_exclusive_lock() -- "lock got released?" check there stopped
making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
reacquire").

To make it worse, any such I/O is added to the acquiring list before
EROFS is returned and this sets up for violating rbd_lock_del_request()
precondition that the request is either on the running list or not on
any list at all -- see commit ded080c86b3f ("rbd: don't move requests
to the running list on errors").  rbd_lock_del_request() ends up
processing these requests as if they were on the running list which
screws up quiescing_wait completion counter and ultimately leads to

    rbd_assert(!completion_done(&rbd_dev->quiescing_wait));

being triggered on the next watch error.

Cc: [email protected] # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
Cc: [email protected]
Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait

... to RBD_LOCK_STATE_QUIESCING and quiescing_wait to recognize that
this state and the associated completion are backing rbd_quiesce_lock(),
which isn't specific to releasing the lock.

While exclusive lock does get quiesced before it's released, it also
gets quiesced before an attempt to update the cookie is made and there
the lock is not released as long as ceph_cls_set_cookie() succeeds.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test

This flag is now required to use tx_metadata_len.

Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata")
Reported-by: Julian Schindel <[email protected]>
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len

Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len")
can break existing use cases which don't zero-initialize xdp_umem_reg
padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we
interpret the padding as tx_metadata_len only when being explicitly
asked.

Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len")
Reported-by: Julian Schindel <[email protected]>
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf: Fix a segment issue when downgrading gso_size

Linearize the skb when downgrading gso_size because it may trigger a
BUG_ON() later when the skb is segmented as described in [1,2].

Fixes: 2be7e212d5419 ("bpf: add bpf_skb_adjust_room helper")
Signed-off-by: Fred Li <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/bpf/[email protected]

net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling

Move the freeing of the dummy net_device from mtk_free_dev() to
mtk_remove().

Previously, if alloc_netdev_dummy() failed in mtk_probe(),
eth->dummy_dev would be NULL. The error path would then call
mtk_free_dev(), which in turn called free_netdev() assuming dummy_dev
was allocated (but it was not), potentially causing a NULL pointer
dereference.

By moving free_netdev() to mtk_remove(), we ensure it's only called when
mtk_probe() has succeeded and dummy_dev is fully allocated. This
addresses a potential NULL pointer dereference detected by Smatch[1].

Fixes: b209bd6d0bff ("net: mediatek: mtk_eth_sock: allocate dummy net_device dynamically")
Reported-by: Dan Carpenter <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/ [1]
Suggested-by: Dan Carpenter <[email protected]>
Reviewed-by: Dan Carpenter <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains a Netfilter fix for net:

Patch #1 if FPU is busy, then pipapo set backend falls back to standard
set element lookup. Moreover, disable bh while at this.
From Florian Westphal.

netfilter pull request 24-07-24

* tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_pipapo_avx2: disable softinterrupts
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

ALSA: hda/realtek: Implement sound init sequence for Samsung Galaxy Book3 Pro 360

Samsung Galaxy Book3 Pro 360 sends a large amount of data to the codec
through hda processing coefficients. This data was captured using a
modified version of QEMU, but the actual content of the data remains
opaque to me. Elliding any part of the data seems to cause sound to
not work.

Signed-off-by: Nick Weihs <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
This series contains updates to ice driver only.

Ahmed enforces the iavf per VF filter limit on ice (PF) driver to prevent
possible resource exhaustion.

Wojciech corrects assignment of l2 flags read from firmware.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix recipe read procedure
ice: Add a per-VF limit on number of FDIR filters
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

drm/amdgpu: reset vm state machine after gpu reset(vram lost)

[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.

[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.

v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.

Signed-off-by: ZhenGuo Yin <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
(cherry picked from commit 47c0388b0589cb481c294dcb857d25a214c46eb3)

drm/amdgpu: add missed harvest check for VCN IP v4/v5

To prevent below probe failure, add a check for models with VCN
IP v4.0.6 where VCN1 may be harvested.

v2:
Apply the same check to VCN IP v4.0 and v5.0.

[   54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43
c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02
00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
[   54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
[   54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX:
0000000000000000
[   54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI:
ffff99a82f680000
[   54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09:
0000000000000000
[   54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12:
0000000000000000
[   54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15:
0000000000000001
[   54.075400] FS:  00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000)
knlGS:0000000000000000
[   54.075988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4:
0000000000750ef0
[   54.076927] PKRU: 55555554
[   54.077132] Call Trace:
[   54.077319]  <TASK>
[   54.077484]  ? show_regs+0x69/0x80
[   54.077747]  ? __die+0x28/0x70
[   54.077979]  ? page_fault_oops+0x180/0x4b0
[   54.078286]  ? do_user_addr_fault+0x2d2/0x680
[   54.078610]  ? exc_page_fault+0x84/0x190
[   54.078910]  ? asm_exc_page_fault+0x2b/0x30
[   54.079224]  ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.079941]  ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
[   54.080617]  vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
[   54.081316]  amdgpu_device_ip_set_powergating_state+0x64/0xc0
[amdgpu]
[   54.082057]  amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
[   54.082727]  amdgpu_ring_alloc+0x44/0x70 [amdgpu]
[   54.083351]  amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
[   54.084054]  amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
[   54.084698]  vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
[   54.085307]  amdgpu_device_init+0x1f96/0x2780 [amdgpu]
[   54.085951]  amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
[   54.086591]  amdgpu_pci_probe+0x19f/0x550 [amdgpu]
[   54.087215]  local_pci_probe+0x48/0xa0
[   54.087509]  pci_device_probe+0xc9/0x250
[   54.087812]  really_probe+0x1a4/0x3f0
[   54.088101]  __driver_probe_device+0x7d/0x170
[   54.088443]  driver_probe_device+0x24/0xa0
[   54.088765]  __driver_attach+0xdd/0x1d0
[   54.089068]  ? __pfx___driver_attach+0x10/0x10
[   54.089417]  bus_for_each_dev+0x8e/0xe0
[   54.089718]  driver_attach+0x22/0x30
[   54.090000]  bus_add_driver+0x120/0x220
[   54.090303]  driver_register+0x62/0x120
[   54.090606]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[   54.091255]  __pci_register_driver+0x62/0x70
[   54.091593]  amdgpu_init+0x67/0xff0 [amdgpu]
[   54.092190]  do_one_initcall+0x5f/0x330
[   54.092495]  do_init_module+0x68/0x240
[   54.092794]  load_module+0x201c/0x2110
[   54.093093]  init_module_from_file+0x97/0xd0
[   54.093428]  ? init_module_from_file+0x97/0xd0
[   54.093777]  idempotent_init_module+0x11c/0x2a0
[   54.094134]  __x64_sys_finit_module+0x64/0xc0
[   54.094476]  do_syscall_64+0x58/0x120
[   54.094767]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Saleemkhan Jamadar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
(cherry picked from commit 0b071245ddd98539d4f7493bdd188417fcf2d629)

drm/amdgpu: Fix eeprom max record count

The eeprom table is empty before initializing,
set eeprom table version first before initializing.

Changed from V1:
Reuse amdgpu_ras_set_eeprom_table_version function

Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 015b8a2fdf39a4c288ff24e7b715b8d9198e56dc)

drm/amdgpu: fix ras UE error injection failure issue

The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.

Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 8284951a6e79c6806c675e5f68a4cd425dd56bc4)

drm/amd/display: Remove ASSERT if significance is zero in math_ceil2

In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Reviewed-by: Chaitanya Dhere <[email protected]>
Signed-off-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)

drm/amd/display: Check for NULL pointer

[why & how]
Need to make sure plane_state is initialized
before accessing its members.

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Reviewed-by: Xi (Alex) Liu <[email protected]>
Signed-off-by: Sung Joon Kim <[email protected]>
Signed-off-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 295d91cbc700651782a60572f83c24861607b648)

drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF

For VCN/JPEG 4.0.3, use only the local addressing scheme.

- Mask bit higher than AID0 range

v2
remain the case for mmhub use master XCC

Signed-off-by: Jane Jian <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit caaf576292f8ccef5cdc0ac16e77b87dbf6e17ab)

drm/amdgpu: Add empty HDP flush function to VCN v4.0.3

VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch
will do the HDP flush.

This change is necessary for VCN v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <[email protected]>
Signed-off-by: Jane Jian <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 49cfaebe48e97500a68d5322a8194736b0a2c3cf)

drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3

JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.

This change is necessary for JPEG v4.0.3, no need for backward compatibility

Signed-off-by: Lijo Lazar <[email protected]>
Signed-off-by: Jane Jian <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 585e3fdb36f59c5cfed0ae06c852dc1df22b1d60)

drm/amd/amdgpu: Fix uninitialized variable warnings

Return 0 to avoid returning an uninitialized variable r.

Cc: [email protected]
Fixes: 230dd6bb6117 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 6472de66c0aa18d50a4b5ca85f8272e88a737676)

drm/amdgpu: Fix atomics on GFX12

If PCIe supports atomics, configure register to prevent DF from
breaking atomics in separate load/store operations.

Signed-off-by: David Belanger <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 666f14cab21b17ccc1bdfe1e82458aa429b3b7e0)

drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell

We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To workaround this, we can update the wptr via MMIO, however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().

Enable this workaround while we are root causing the issue with
the HW team.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Tested-by: Friedrich Vock <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
(cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700)

Merge tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
"New Support
   - Samsung Exynos gs101 drd combo phy
   - Qualcomm SC8180x USB uniphy, IPQ9574 QMP PCIe phy
   - Airoha EN7581 PCIe phy
   - Freescale i.MX8Q HSIO SerDes phy
   - Starfive jh7110 dphy tx

  Updates:
   - Resume support for j721e-wiz driver
   - Updates to Exynos usbdrd driver
   - Support for optional power domains in g12a usb2-phy driver
   - Debugfs support and updates to zynqmp driver"

* tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (56 commits)
  phy: airoha: Add dtime and Rx AEQ IO registers
  dt-bindings: phy: airoha: Add dtime and Rx AEQ IO registers
  dt-bindings: phy: rockchip-emmc-phy: Convert to dtschema
  dt-bindings: phy: qcom,qmp-usb: fix spelling error
  phy: exynos5-usbdrd: support Exynos USBDRD 3.1 combo phy (HS & SS)
  phy: exynos5-usbdrd: convert Vbus supplies to regulator_bulk
  phy: exynos5-usbdrd: convert (phy) register access clock to clk_bulk
  phy: exynos5-usbdrd: convert core clocks to clk_bulk
  phy: exynos5-usbdrd: support isolating HS and SS ports independently
  dt-bindings: phy: samsung,usb3-drd-phy: add gs101 compatible
  phy: core: Fix documentation of of_phy_get
  phy: starfive: Correct the dphy configure process
  phy: zynqmp: Add debugfs support
  phy: zynqmp: Take the phy mutex in xlate
  phy: zynqmp: Only wait for PLL lock "primary" instances
  phy: zynqmp: Store instance instead of type
  phy: zynqmp: Enable reference clock correctly
  phy: cadence-torrent: Check return value on register read
  phy: Fix the cacography in phy-exynos5250-usb2.c
  phy: phy-rockchip-samsung-hdptx: Select CONFIG_MFD_SYSCON
  ...

Merge tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

- Simplification across subsystem using cleanup.h

- Support for debugfs to read/write commands

- Few Intel and Qualcomm driver updates

* tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: debugfs: simplify with cleanup.h
  soundwire: cadence: simplify with cleanup.h
  soundwire: intel_ace2x: simplify with cleanup.h
  soundwire: intel_ace2x: simplify return path in hw_params
  soundwire: intel: simplify with cleanup.h
  soundwire: intel: simplify return path in hw_params
  soundwire: amd_init: simplify with cleanup.h
  soundwire: amd: simplify with cleanup.h
  soundwire: amd: simplify return path in hw_params
  soundwire: intel_auxdevice: start the bus at default frequency
  soundwire: intel_auxdevice: add cs42l43 codec to wake_capable_list
  drivers:soundwire: qcom: cleanup port maask calculations
  soundwire: bus: simplify by using local slave->prop
  soundwire: generic_bandwidth_allocation: change port_bo parameter to pointer
  soundwire: Intel: clarify Copyright information
  soundwire: intel_ace2.x: add AC timing extensions for PantherLake
  soundwire: bus: add stream refcount
  soundwire: debugfs: add interface to read/write commands

Merge tag 'dmaengine-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
"New support:

   - New dmaengine_prep_peripheral_dma_vec() to support transfers using
     dma vectors and documentation and user in AXI dma

   - STMicro STM32 DMA3 support and new capabilities of cyclic dma

  Updates:

   - Yaml conversion for Freescale imx dma and qdma bindings,
     sprd sc9860 dma binding

   - Altera msgdma updates for descriptor management"

* tag 'dmaengine-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (35 commits)
  dt-bindings: fsl-qdma: fix interrupts 'if' check logic
  dt-bindings: dma: sprd,sc9860-dma: convert to YAML
  dmaengine: fsl-dpaa2-qdma: add missing MODULE_DESCRIPTION() macro
  dmaengine: ti: add missing MODULE_DESCRIPTION() macros
  dmaengine: ti: cppi41: add missing MODULE_DESCRIPTION() macro
  dmaengine: virt-dma: add missing MODULE_DESCRIPTION() macro
  dmaengine: ti: k3-udma: Fix BCHAN count with UHC and HC channels
  dmaengine: sh: rz-dmac: Fix lockdep assert warning
  dmaengine: qcom: gpi: clean up the IRQ disable/enable in gpi_reset_chan()
  dmaengine: fsl-edma: change the memory access from local into remote mode in i.MX 8QM
  dmaengine: qcom: gpi: remove unused struct 'reg_info'
  dmaengine: moxart-dma: remove unused struct 'moxart_filter_data'
  dt-bindings: fsl-qdma: Convert to yaml format
  dmaengine: fsl-edma: remove redundant "idle" field from fsl_chan
  dmaengine: fsl-edma: request per-channel IRQ only when channel is allocated
  dmaengine: stm32-dma3: defer channel registration to specify channel name
  dmaengine: add channel device name to channel registration
  dmaengine: stm32-dma3: improve residue granularity
  dmaengine: stm32-dma3: add device_pause and device_resume ops
  dmaengine: stm32-dma3: add DMA_MEMCPY capability
  ...

sysctl: treewide: constify the ctl_table argument of proc_handlers

const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.

This patch has been generated by the following coccinelle script:

```
  virtual patch

  @r1@
  identifier ctl, write, buffer, lenp, ppos;
  identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

  @r2@
  identifier func, ctl, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos)
  { ... }

  @r3@
  identifier func;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int , void *, size_t *, loff_t *);

  @r4@
  identifier func, ctl;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int , void *, size_t *, loff_t *);

  @r5@
  identifier func, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
  conventions. The xfs_stats_clear_proc_handler,
  xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
  adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
  This is called from a proc_handler itself and is calling back into
  another proc_handler, making it necessary to change it as part of the
  proc_handler migration.

Co-developed-by: Thomas Weißschuh <[email protected]>
Signed-off-by: Thomas Weißschuh <[email protected]>
Co-developed-by: Joel Granados <[email protected]>
Signed-off-by: Joel Granados <[email protected]>

apparmor: unpack transition table if dfa is not present

Due to a bug in earlier userspaces, a transition table may be present
even when the dfa is not. Commit 7572fea31e3e
("apparmor: convert fperm lookup to use accept as an index") made the
verification check more rigourous regressing old userspaces with
the bug. For compatibility reasons allow the orphaned transition table
during unpack and discard.

Fixes: 7572fea31e3e ("apparmor: convert fperm lookup to use accept as an index")
Signed-off-by: Georgia Garcia <[email protected]>
Signed-off-by: John Johansen <[email protected]>

apparmor: try to avoid refing the label in apparmor_file_open

If the label is not stale (which is the common case), the fact that the
passed file object holds a reference can be leverged to avoid the
ref/unref cycle. Doing so reduces performance impact of apparmor on
parallel open() invocations.

When benchmarking on a 24-core vm using will-it-scale's open1_process
("Separate file open"), the results are (ops/s):
before: 6092196
after: 8309726 (+36%)

Signed-off-by: Mateusz Guzik <[email protected]>
Signed-off-by: John Johansen <[email protected]>

apparmor: test: add MODULE_DESCRIPTION()

Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in security/apparmor/apparmor_policy_unpack_test.o

Signed-off-by: Jeff Johnson <[email protected]>
Signed-off-by: John Johansen <[email protected]>

apparmor: take nosymfollow flag into account

A "nosymfollow" flag was added in commit
dab741e0e02b ("Add a "nosymfollow" mount option.")

While we don't need to implement any special logic on
the AppArmor kernel side to handle it, we should provide
user with a correct list of mount flags in audit logs.

Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Georgia Garcia <[email protected]>
Signed-off-by: John Johansen <[email protected]>

Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.

  First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
  lets the kernel zero out pages anytime under memory pressure, which
  enables allocating memory that never gets swapped to disk but also
  doesn't count as being mlocked.

  Then, the vDSO implementation of getrandom() is introduced in a
  generic manner and hooked into random.c.

  Next, this is implemented on x86. (Also, though it's not ready for
  this pull, somebody has begun an arm64 implementation already)

  Finally, two vDSO selftests are added.

  There are also two housekeeping cleanup commits"

* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  MAINTAINERS: add random.h headers to RNG subsection
  random: note that RNDGETPOOL was removed in 2.6.9-rc2
  selftests/vDSO: add tests for vgetrandom
  x86: vdso: Wire up getrandom() vDSO implementation
  random: introduce generic vDSO getrandom() implementation
  mm: add MAP_DROPPABLE for designating always lazily freeable mappings

Merge tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
"VFS:

   - The new 64bit mount ids start after the old mount id, i.e., at the
     first non-32 bit value. However, we started counting one id too
     late and thus lost 4294967296 as the first valid id. Fix that.

   - Update a few comments on some vfs_*() creation helpers.

   - Move copying of the xattr name out from the locks required to start
     a filesystem write.

   - Extend the filelock lock UAF fix to the compat code as well.

   - Now that we added the ability to look up an inode under RCU it's
     possible that lockless hash lookup can find and lock an inode after
     it gets I_FREEING set. It then waits until inode teardown in
     evict() is finished.

     The flag however is still set after evict() has woken up all
     waiters. If the inode lock is taken late enough on the waiting side
     after hash removal and wakeup happened the waiting thread will
     never be woken.

     Before RCU based lookup this was synchronized via the
     inode_hash_lock. But since unhashing requires the inode lock as
     well we can check whether the inode is unhashed while holding inode
     lock even without holding inode_hash_lock.

  pidfd:

   - The nsproxy structure contains nearly all of the namespaces
     associated with a task. When a namespace type isn't supported
     nsproxy might contain a NULL pointer or always point to the initial
     namespace type. The logic isn't consistent. So when deriving
     namespace fds we need to ensure that the namespace type is
     supported.

     First, so that we don't risk dereferncing NULL pointers. The
     correct bigger fix would be to change all namespaces to always set
     a valid namespace pointer in struct nsproxy independent of whether
     or not it is compiled in. But that requires quite a few changes.

     Second, so that we don't allow deriving namespace fds when the
     namespace type doesn't exist and thus when they couldn't also be
     derived via /proc/self/ns/.

   - Add missing selftests for the new pidfd ioctls to derive namespace
     fds. This simply extends the already existing testsuite.

  netfs:

   - Fix debug logging and fix kconfig variable name so it actually
     works.

   - Fix writeback that goes both to the server and cache. The streams
     are only activated once a subreq is added. When a server write
     happens the subreq doesn't need to have finished by the time the
     cache write is started. If the server write has already finished by
     the time the cache write is about to start the cache write will
     operate on a folio that might already have been reused. Fix this by
     preactivating the cache write.

   - Limit cachefiles subreq size for cache writes to MAX_RW_COUNT"

* tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  inode: clarify what's locked
  vfs: Fix potential circular locking through setxattr() and removexattr()
  filelock: Fix fcntl/close race recovery compat path
  fs: use all available ids
  cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
  netfs: Fix writeback that needs to go to both server and cache
  pidfs: add selftests for new namespace ioctls
  pidfs: handle kernels without namespaces cleanly
  pidfs: when time ns disabled add check for ioctl
  vfs: correct the comments of vfs_*() helpers
  vfs: handle __wait_on_freeing_inode() and evict() race
  netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG
  netfs: Revert "netfs: Switch debug logging to pr_debug()"

hostfs: fix folio conversion

Commit e3ec0fe944d2 ("hostfs: Convert hostfs_read_folio() to use a
folio") simplified hostfs_read_folio(), but in the process of converting
to using folios natively also mis-used the folio_zero_tail() function
due to the very confusing API of that function.

Very arguably it's folio_zero_tail() API itself that is buggy, since it
would make more sense (and the documentation kind of implies) that the
third argument would be the pointer to the beginning of the folio
buffer.

But no, the third argument to folio_zero_tail() is where we should start
zeroing the tail (even if we already also pass in the offset separately
as the second argument).

So fix the hostfs caller, and we can leave any folio_zero_tail() sanity
cleanup for later.

Reported-and-tested-by: Maciej Żenczykowski <[email protected]>
Fixes: e3ec0fe944d2 ("hostfs: Convert hostfs_read_folio() to use a folio")
Link: https://lore.kernel.org/all/CANP3RGceNzwdb7w=vPf5=7BCid5HVQDmz1K5kC9JG42+HVAh_g@mail.gmail.com/
Cc: Matthew Wilcox <[email protected]>
Cc: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models

Adjust quirks for 0x3a20, 0x3a30, 0x3a50 to match the 0x3a60. This
set has now been confirmed to work with this patch.

Signed-off-by: Luke D. Jones <[email protected]>
Fixes: 811dd426a9b1 ("ALSA: hda/realtek: Add quirks for Asus ROG 2024 laptops using CS35L41")
Cc: <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ublk: fix UBLK_CMD_DEL_DEV_ASYNC handling

In ublk_ctrl_uring_cmd(), ioctl command NR should be used for
matching _IOC_NR(cmd_op).

Fix it by adding one private macro, and this way is clean.

Fixes: 13fe8e6825e4 ("ublk: add UBLK_CMD_DEL_DEV_ASYNC")
Signed-off-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

block: fix deadlock between sd_remove & sd_release

Our test report the following hung task:

[ 2538.459400] INFO: task "kworker/0:0":7 blocked for more than 188 seconds.
[ 2538.459427] Call trace:
[ 2538.459430]  __switch_to+0x174/0x338
[ 2538.459436]  __schedule+0x628/0x9c4
[ 2538.459442]  schedule+0x7c/0xe8
[ 2538.459447]  schedule_preempt_disabled+0x24/0x40
[ 2538.459453]  __mutex_lock+0x3ec/0xf04
[ 2538.459456]  __mutex_lock_slowpath+0x14/0x24
[ 2538.459459]  mutex_lock+0x30/0xd8
[ 2538.459462]  del_gendisk+0xdc/0x350
[ 2538.459466]  sd_remove+0x30/0x60
[ 2538.459470]  device_release_driver_internal+0x1c4/0x2c4
[ 2538.459474]  device_release_driver+0x18/0x28
[ 2538.459478]  bus_remove_device+0x15c/0x174
[ 2538.459483]  device_del+0x1d0/0x358
[ 2538.459488]  __scsi_remove_device+0xa8/0x198
[ 2538.459493]  scsi_forget_host+0x50/0x70
[ 2538.459497]  scsi_remove_host+0x80/0x180
[ 2538.459502]  usb_stor_disconnect+0x68/0xf4
[ 2538.459506]  usb_unbind_interface+0xd4/0x280
[ 2538.459510]  device_release_driver_internal+0x1c4/0x2c4
[ 2538.459514]  device_release_driver+0x18/0x28
[ 2538.459518]  bus_remove_device+0x15c/0x174
[ 2538.459523]  device_del+0x1d0/0x358
[ 2538.459528]  usb_disable_device+0x84/0x194
[ 2538.459532]  usb_disconnect+0xec/0x300
[ 2538.459537]  hub_event+0xb80/0x1870
[ 2538.459541]  process_scheduled_works+0x248/0x4dc
[ 2538.459545]  worker_thread+0x244/0x334
[ 2538.459549]  kthread+0x114/0x1bc

[ 2538.461001] INFO: task "fsck.":15415 blocked for more than 188 seconds.
[ 2538.461014] Call trace:
[ 2538.461016]  __switch_to+0x174/0x338
[ 2538.461021]  __schedule+0x628/0x9c4
[ 2538.461025]  schedule+0x7c/0xe8
[ 2538.461030]  blk_queue_enter+0xc4/0x160
[ 2538.461034]  blk_mq_alloc_request+0x120/0x1d4
[ 2538.461037]  scsi_execute_cmd+0x7c/0x23c
[ 2538.461040]  ioctl_internal_command+0x5c/0x164
[ 2538.461046]  scsi_set_medium_removal+0x5c/0xb0
[ 2538.461051]  sd_release+0x50/0x94
[ 2538.461054]  blkdev_put+0x190/0x28c
[ 2538.461058]  blkdev_release+0x28/0x40
[ 2538.461063]  __fput+0xf8/0x2a8
[ 2538.461066]  __fput_sync+0x28/0x5c
[ 2538.461070]  __arm64_sys_close+0x84/0xe8
[ 2538.461073]  invoke_syscall+0x58/0x114
[ 2538.461078]  el0_svc_common+0xac/0xe0
[ 2538.461082]  do_el0_svc+0x1c/0x28
[ 2538.461087]  el0_svc+0x38/0x68
[ 2538.461090]  el0t_64_sync_handler+0x68/0xbc
[ 2538.461093]  el0t_64_sync+0x1a8/0x1ac

  T1: T2:
  sd_remove
  del_gendisk
  __blk_mark_disk_dead
  blk_freeze_queue_start
  ++q->mq_freeze_depth
   bdev_release
mutex_lock(&disk->open_mutex)
   sd_release
scsi_execute_cmd
blk_queue_enter
wait_event(!q->mq_freeze_depth)
  mutex_lock(&disk->open_mutex)

SCSI does not set GD_OWNS_QUEUE, so QUEUE_FLAG_DYING is not set in
this scenario. This is a classic ABBA deadlock. To fix the deadlock,
make sure we don't try to acquire disk->open_mutex after freezing
the queue.

Cc: [email protected]
Fixes: eec1be4c30df ("block: delete partitions later in del_gendisk")
Signed-off-by: Yang Yang <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Fixes: and Cc: stable tags are missing. Otherwise this patch looks fine
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

selftests/landlock: Add cred_transfer test

Check that keyctl(KEYCTL_SESSION_TO_PARENT) preserves the parent's
restrictions.

Fixes: e1199815b47b ("selftests/landlock: Add user space tests")
Co-developed-by: Jann Horn <[email protected]>
Signed-off-by: Jann Horn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mickaël Salaün <[email protected]>

landlock: Don't lose track of restrictions on cred_transfer

When a process' cred struct is replaced, this _almost_ always invokes
the cred_prepare LSM hook; but in one special case (when
KEYCTL_SESSION_TO_PARENT updates the parent's credentials), the
cred_transfer LSM hook is used instead. Landlock only implements the
cred_prepare hook, not cred_transfer, so KEYCTL_SESSION_TO_PARENT causes
all information on Landlock restrictions to be lost.

This basically means that a process with the ability to use the fork()
and keyctl() syscalls can get rid of all Landlock restrictions on
itself.

Fix it by adding a cred_transfer hook that does the same thing as the
existing cred_prepare hook. (Implemented by having hook_cred_prepare()
call hook_cred_transfer() so that the two functions are less likely to
accidentally diverge in the future.)

Cc: [email protected]
Fixes: 385975dca53e ("landlock: Set up the security framework and manage credentials")
Signed-off-by: Jann Horn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mickaël Salaün <[email protected]>

RISC-V: Select ACPI PPTT drivers

After adding ACPI support to populate_cache_leaves(), RISC-V can build
cacheinfo through the ACPI PPTT table, thus enabling the ACPI_PPTT
configuration.

Signed-off-by: Yunhui Cui <[email protected]>
Reviewed-by: Jeremy Linton <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Reviewed-by: Sunil V L <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT

Before cacheinfo can be built correctly, we need to initialize level
and type. Since RISC-V currently does not have a register group that
describes cache-related attributes like ARM64, we cannot obtain them
directly, so now we obtain cache leaves from the ACPI PPTT table
(acpi_get_cache_info()) and set the cache type through split_levels.

Suggested-by: Jeremy Linton <[email protected]>
Suggested-by: Sudeep Holla <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Reviewed-by: Sunil V L <[email protected]>
Reviewed-by: Jeremy Linton <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Signed-off-by: Yunhui Cui <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()

ci_leaf_init() is a declared static function. The implementation of the
function body and the caller do not use the parameter (struct device_node
*node) input parameter, so remove it.

Fixes: 6a24915145c9 ("Revert "riscv: Set more data to cacheinfo"")
Signed-off-by: Yunhui Cui <[email protected]>
Reviewed-by: Jeremy Linton <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

RISC-V: ACPI: Enable SPCR table for console output on RISC-V

The ACPI SPCR code has been used to enable console output for ARM64 and
X86. The same code can be reused for RISC-V. Furthermore, SPCR table is
mandated for headless system as outlined in the RISC-V BRS
Specification, chapter 6.

Signed-off-by: Sia Jee Heng <[email protected]>
Reviewed-by: Sunil V L <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

MAINTAINERS: make Breno the netconsole maintainer

netconsole has no maintainer, and Breno has been working on
improving it consistently for some time. So I think we found
the maintainer :)

Acked-by: Paolo Abeni <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Breno Leitao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

MAINTAINERS: Update bonding entry

Update my email address, clarify support status, and delete the
web site that hasn't been used in a long time.

Signed-off-by: Jay Vosburgh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: nexthop: Initialize all fields in dumped nexthops

struct nexthop_grp contains two reserved fields that are not initialized by
nla_put_nh_group(), and carry garbage. This can be observed e.g. with
strace (edited for clarity):

    # ip nexthop add id 1 dev lo
    # ip nexthop add id 101 group 1
    # strace -e recvmsg ip nexthop get id 101
    ...
    recvmsg(... [{nla_len=12, nla_type=NHA_GROUP},
                 [{id=1, weight=0, resvd1=0x69, resvd2=0x67}]] ...) = 52

The fields are reserved and therefore not currently used. But as they are, they
leak kernel memory, and the fact they are not just zero complicates repurposing
of the fields for new ends. Initialize the full structure.

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Correct byte order of perfect_match

The perfect_match parameter of the update_vlan_hash operation is __le16,
and is correctly converted from host byte-order in the lone caller,
stmmac_vlan_update().

However, the implementations of this caller, dwxgmac2_update_vlan_hash()
and dwxgmac2_update_vlan_hash(), both treat this parameter as host byte
order, using the following pattern:

u32 value = ...
...
writel(value | perfect_match, ...);

This is not correct because both:
1) value is host byte order; and
2) writel expects a host byte order value as it's first argument

I believe that this will break on big endian systems. And I expect it
has gone unnoticed by only being exercised on little endian systems.

The approach taken by this patch is to update the callback, and it's
caller to simply use a host byte order value.

Flagged by Sparse.
Compile tested only.

Fixes: c7ab0b8088d7 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available")
Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

io_uring: align iowq and task request error handling

There is a difference in how io_queue_sqe and io_wq_submit_work treat
error codes they get from io_issue_sqe. The first one fails anything
unknown but latter only fails when the code is negative.

It doesn't make sense to have this discrepancy, align them to the
io_queue_sqe behaviour.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/c550e152bf4a290187f91a4322ddcb5d6d1f2c73.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring: kill REQ_F_CANCEL_SEQ

We removed the reliance on the flag by the cancellation path and now
it's unused.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/e57afe566bbe4fefeb44daffb08900f2a4756577.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring: simplify io_uring_cmd return

We don't have to return error code from an op handler back to core
io_uring, so once io_uring_cmd() sets the results and handles errors we
can juts return IOU_OK and simplify the code.

Note, only valid with e0b23d9953b0c ("io_uring: optimise ltimeout for
inline execution"), there was a problem with iopoll before.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/8eae2be5b2a49236cd5f1dadbd1aa5730e9e2d4f.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring: fix io_match_task must_hold

The __must_hold annotation in io_match_task() uses a non existing
parameter "req", fix it.

Fixes: 6af3f48bf6156 ("io_uring: fix link traversal locking")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/3e65ee7709e96507cef3d93291746f2c489f2307.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring: don't allow netpolling with SETUP_IOPOLL

IORING_SETUP_IOPOLL rings don't have any netpoll handling, let's fail
attempts to register netpolling in this case, there might be people who
will mix up IOPOLL and netpoll.

Cc: [email protected]
Fixes: ef1186c1a875b ("io_uring: add register/unregister napi function")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1e7553aee0a8ae4edec6742cd6dd0c1e6914fba8.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring: tighten task exit cancellations

io_uring_cancel_generic() should retry if any state changes like a
request is completed, however in case of a task exit it only goes for
another loop and avoids schedule() if any tracked (i.e. REQ_F_INFLIGHT)
request got completed.

Let's assume we have a non-tracked request executing in iowq and a
tracked request linked to it. Let's also assume
io_uring_cancel_generic() fails to find and cancel the request, i.e.
via io_run_local_work(), which may happen as io-wq has gaps.
Next, the request logically completes, io-wq still hold a ref but queues
it for completion via tw, which happens in
io_uring_try_cancel_requests(). After, right before prepare_to_wait()
io-wq puts the request, grabs the linked one and tries executes it, e.g.
arms polling. Finally the cancellation loop calls prepare_to_wait(),
there are no tw to run, no tracked request was completed, so the
tctx_inflight() check passes and the task is put to indefinite sleep.

Cc: [email protected]
Fixes: 3f48cf18f886c ("io_uring: unify files and task cancel")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/acac7311f4e02ce3c43293f8f1fda9c705d158f1.1721819383.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

riscv: boot: remove duplicated targets line

The "targets:" is duplicated in another line, remove the one with less
targets.

Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Emil Renner Berthing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>

trace: riscv: Remove deprecated kprobe on ftrace support

Since commit 7caa9765465f60 ("ftrace: riscv: move from REGS to ARGS"),
kprobe on ftrace is not supported by riscv, because riscv's support for
FTRACE_WITH_REGS has been replaced with support for FTRACE_WITH_ARGS, and
KPROBES_ON_FTRACE will be supplanted by FPROBES. So remove the deprecated
kprobe on ftrace support, which is misunderstood.

Signed-off-by: Jinjie Ruan <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>