Git Repo - linux.git/log

mptcp: drop the push_pending field

Such field is there to avoid acquiring the data lock in a few spots,
but it adds complexity to the already non trivial locking schema.

All the relevant call sites (mptcp-level re-injection, set socket
options), are slow-path, drop such field in favor of 'cb_flags', adding
the relevant locking.

This patch could be seen as an improvement, instead of a fix. But it
simplifies the next patch. The 'Fixes' tag has been added to help having
this series backported to stable.

Fixes: e9d09baca676 ("mptcp: avoid atomic bit manipulation when possible")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tools/rtla: Exit with EXIT_SUCCESS when help is invoked

Fix rtla so that the following commands exit with 0 when help is invoked

rtla osnoise top -h
rtla osnoise hist -h
rtla timerlat top -h
rtla timerlat hist -h

Link: https://lore.kernel.org/linux-trace-devel/[email protected]
Cc: [email protected]
Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode")
Signed-off-by: John Kacur <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

tools/rtla: Replace setting prio with nice for SCHED_OTHER

Since the sched_priority for SCHED_OTHER is always 0, it makes no
sence to set it.
Setting nice for SCHED_OTHER seems more meaningful.

Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: b1696371d865 ("rtla: Helper functions for rtla")
Signed-off-by: limingming3 <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

Merge branch 'net-misplaced-fields'

Eric Dumazet says:

====================
net: more three misplaced fields

We recently reorganized some structures for better data locality
in networking fast paths.

This series moves three fields that were not correctly classified.

There probably more to come.

Reference : https://lwn.net/Articles/951321/
====================

Signed-off-by: David S. Miller <[email protected]>

net-device: move lstats in net_device_read_txrx

dev->lstats is notably used from loopback ndo_start_xmit()
and other virtual drivers.

Per cpu stats updates are dirtying per-cpu data,
but the pointer itself is read-only.

Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Coco Li <[email protected]>
Cc: Simon Horman <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: move tp->tcp_usec_ts to tcp_sock_read_txrx group

tp->tcp_usec_ts is a read mostly field, used in rx and tx fast paths.

Fixes: d5fed5addb2b ("tcp: reorganize tcp_sock fast path variables")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Coco Li <[email protected]>
Cc: Wei Wang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: move tp->scaling_ratio to tcp_sock_read_txrx group

tp->scaling_ratio is a read mostly field, used in rx and tx fast paths.

Fixes: d5fed5addb2b ("tcp: reorganize tcp_sock fast path variables")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Coco Li <[email protected]>
Cc: Wei Wang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm/i915/dp: Limit SST link rate to <=8.1Gbps

Limit the link rate to HBR3 or below (<=8.1Gbps) in SST mode.
UHBR (10Gbps+) link rates require 128b/132b channel encoding
which we have not yet hooked up into the SST/no-sideband codepaths.

Cc: [email protected]
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Jani Nikula <[email protected]>
(cherry picked from commit 6061811d72e14f41f71b6a025510920b187bfcca)
Signed-off-by: Joonas Lahtinen <[email protected]>

drm/i915/dsc: Fix the macro that calculates DSCC_/DSCA_ PPS reg address

Commit bd077259d0a9 ("drm/i915/vdsc: Add function to read any PPS
register") defines a new macro to calculate the DSC PPS register
addresses with PPS number as an input. This macro correctly calculates
the addresses till PPS 11 since the addresses increment by 4. So in that
case the following macro works correctly to give correct register
address:

_MMIO(_DSCA_PPS_0 + (pps) * 4)

However after PPS 11, the register address for PPS 12 increments by 12
because of RC Buffer memory allocation in between. Because of this
discontinuity in the address space, the macro calculates wrong addresses
for PPS 12 - 16 resulting into incorrect DSC PPS parameter value
read/writes causing DSC corruption.

This fixes it by correcting this macro to add the offset of 12 for PPS
>=12.

v3: Add correct paranthesis for pps argument (Jani Nikula)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10172
Fixes: bd077259d0a9 ("drm/i915/vdsc: Add function to read any PPS register")
Cc: Suraj Kandpal <[email protected]>
Cc: Ankit Nautiyal <[email protected]>
Cc: Animesh Manna <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Sean Paul <[email protected]>
Cc: Drew Davenport <[email protected]>
Signed-off-by: Manasi Navare <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 6074be620c31dc2ae11af96a1a5ea95580976fb5)
Signed-off-by: Joonas Lahtinen <[email protected]>

tools/rv: Fix curr_reactor uninitialized variable

clang is reporting:

$ make HOSTCC=clang CC=clang LLVM_IAS=1

clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
-fstack-protector-strong -fasynchronous-unwind-tables
-fstack-clash-protection  -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
$(pkg-config --cflags libtracefs)  -I include
-c -o src/in_kernel.o src/in_kernel.c
[...]

src/in_kernel.c:227:6: warning: variable 'curr_reactor' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
  227 |         if (!end)
      |             ^~~~
src/in_kernel.c:242:9: note: uninitialized use occurs here
  242 |         return curr_reactor;
      |                ^~~~~~~~~~~~
src/in_kernel.c:227:2: note: remove the 'if' if its condition is always false
  227 |         if (!end)
      |         ^~~~~~~~~
  228 |                 goto out_free;
      |                 ~~~~~~~~~~~~~
src/in_kernel.c:221:6: warning: variable 'curr_reactor' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
  221 |         if (!start)
      |             ^~~~~~
src/in_kernel.c:242:9: note: uninitialized use occurs here
  242 |         return curr_reactor;
      |                ^~~~~~~~~~~~
src/in_kernel.c:221:2: note: remove the 'if' if its condition is always false
  221 |         if (!start)
      |         ^~~~~~~~~~~
  222 |                 goto out_free;
      |                 ~~~~~~~~~~~~~
src/in_kernel.c:215:20: note: initialize the variable 'curr_reactor' to silence this warning
  215 |         char *curr_reactor;
      |                           ^
      |                            = NULL
2 warnings generated.

Which is correct. Setting curr_reactor to NULL avoids the problem.

Link: https://lkml.kernel.org/r/3a35551149e5ee0cb0950035afcb8082c3b5d05b.1707217097.git.bristot@kernel.org
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Bill Wendling <[email protected]>
Cc: Justin Stitt <[email protected]>
Cc: Donald Zickus <[email protected]>
Fixes: 6d60f89691fc ("tools/rv: Add in-kernel monitor interface")
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

tools/rv: Fix Makefile compiler options for clang

The following errors are showing up when compiling rv with clang:

$ make HOSTCC=clang CC=clang LLVM_IAS=1
[...]
  clang -O -g -DVERSION=\"6.8.0-rc1\" -flto=auto -ffat-lto-objects
  -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables
  -fstack-clash-protection  -Wall -Werror=format-security
  -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
  -Wno-maybe-uninitialized $(pkg-config --cflags libtracefs)
  -I include   -c -o src/utils.o src/utils.c
  clang: warning: optimization flag '-ffat-lto-objects' is not supported [-Wignored-optimization-argument]
  warning: unknown warning option '-Wno-maybe-uninitialized'; did you mean '-Wno-uninitialized'? [-Wunknown-warning-option]
  1 warning generated.

  clang -o rv -ggdb  src/in_kernel.o src/rv.o src/trace.o src/utils.o $(pkg-config --libs libtracefs)
  src/in_kernel.o: file not recognized: file format not recognized
  clang: error: linker command failed with exit code 1 (use -v to see invocation)
  make: *** [Makefile:110: rv] Error 1

Solve these issues by:
  - removing -ffat-lto-objects and -Wno-maybe-uninitialized if using clang
  - informing the linker about -flto=auto

Link: https://lkml.kernel.org/r/ed94a8ddc2ca8c8ef663cfb7ae9dd196c4a66b33.1707217097.git.bristot@kernel.org
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Bill Wendling <[email protected]>
Cc: Justin Stitt <[email protected]>
Fixes: 4bc4b131d44c ("rv: Add rv tool")
Suggested-by: Donald Zickus <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

tools/rtla: Remove unused sched_getattr() function

Clang is reporting:

$ make HOSTCC=clang CC=clang LLVM_IAS=1
[...]
clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions -fstack-protector-strong -fasynchronous-unwind-tables -fstack-clash-protection  -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS $(pkg-config --cflags libtracefs)    -c -o src/utils.o src/utils.c
src/utils.c:241:19: warning: unused function 'sched_getattr' [-Wunused-function]
  241 | static inline int sched_getattr(pid_t pid, struct sched_attr *attr,
      |                   ^~~~~~~~~~~~~
1 warning generated.

Which is correct, so remove the unused function.

Link: https://lkml.kernel.org/r/eaed7ba122c4ae88ce71277c824ef41cbf789385.1707217097.git.bristot@kernel.org
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Bill Wendling <[email protected]>
Cc: Justin Stitt <[email protected]>
Cc: Donald Zickus <[email protected]>
Fixes: b1696371d865 ("rtla: Helper functions for rtla")
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

tools/rtla: Fix clang warning about mount_point var size

clang is reporting this warning:

$ make HOSTCC=clang CC=clang LLVM_IAS=1
[...]
clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
-fstack-protector-strong -fasynchronous-unwind-tables
-fstack-clash-protection  -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
$(pkg-config --cflags libtracefs)    -c -o src/utils.o src/utils.c

src/utils.c:548:66: warning: 'fscanf' may overflow; destination buffer in argument 3 has size 1024, but the corresponding specifier may require size 1025 [-Wfortify-source]
  548 |         while (fscanf(fp, "%*s %" STR(MAX_PATH) "s %99s %*s %*d %*d\n", mount_point, type) == 2) {
      |                                                                         ^

Increase mount_point variable size to MAX_PATH+1 to avoid the overflow.

Link: https://lkml.kernel.org/r/1b46712e93a2f4153909514a36016959dcc4021c.1707217097.git.bristot@kernel.org
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Bill Wendling <[email protected]>
Cc: Justin Stitt <[email protected]>
Cc: Donald Zickus <[email protected]>
Fixes: a957cbc02531 ("rtla: Add -C cgroup support")
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

tools/rtla: Fix uninitialized bucket/data->bucket_size warning

When compiling rtla with clang, I am getting the following warnings:

$ make HOSTCC=clang CC=clang LLVM_IAS=1

[..]
clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
-fstack-protector-strong -fasynchronous-unwind-tables
-fstack-clash-protection  -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
$(pkg-config --cflags libtracefs)
-c -o src/osnoise_hist.o src/osnoise_hist.c
src/osnoise_hist.c:138:6: warning: variable 'bucket' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
  138 |         if (data->bucket_size)
      |             ^~~~~~~~~~~~~~~~~
src/osnoise_hist.c:149:6: note: uninitialized use occurs here
  149 |         if (bucket < entries)
      |             ^~~~~~
src/osnoise_hist.c:138:2: note: remove the 'if' if its condition is always true
  138 |         if (data->bucket_size)
      |         ^~~~~~~~~~~~~~~~~~~~~~
  139 |                 bucket = duration / data->bucket_size;
src/osnoise_hist.c:132:12: note: initialize the variable 'bucket' to silence this warning
  132 |         int bucket;
      |                   ^
      |                    = 0
1 warning generated.

[...]

clang -O -g -DVERSION=\"6.8.0-rc3\" -flto=auto -fexceptions
-fstack-protector-strong -fasynchronous-unwind-tables
-fstack-clash-protection  -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
$(pkg-config --cflags libtracefs)
-c -o src/timerlat_hist.o src/timerlat_hist.c
src/timerlat_hist.c:181:6: warning: variable 'bucket' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
  181 |         if (data->bucket_size)
      |             ^~~~~~~~~~~~~~~~~
src/timerlat_hist.c:204:6: note: uninitialized use occurs here
  204 |         if (bucket < entries)
      |             ^~~~~~
src/timerlat_hist.c:181:2: note: remove the 'if' if its condition is always true
  181 |         if (data->bucket_size)
      |         ^~~~~~~~~~~~~~~~~~~~~~
  182 |                 bucket = latency / data->bucket_size;
src/timerlat_hist.c:175:12: note: initialize the variable 'bucket' to silence this warning
  175 |         int bucket;
      |                   ^
      |                    = 0
1 warning generated.

This is a legit warning, but data->bucket_size is always > 0 (see
timerlat_hist_parse_args()), so the if is not necessary.

Remove the unneeded if (data->bucket_size) to avoid the warning.

Link: https://lkml.kernel.org/r/6e1b1665cd99042ae705b3e0fc410858c4c42346.1707217097.git.bristot@kernel.org
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Bill Wendling <[email protected]>
Cc: Justin Stitt <[email protected]>
Cc: Donald Zickus <[email protected]>
Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode")
Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode")
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

tools/rtla: Fix Makefile compiler options for clang

The following errors are showing up when compiling rtla with clang:

$ make HOSTCC=clang CC=clang LLVM_IAS=1
[...]

  clang -O -g -DVERSION=\"6.8.0-rc1\" -flto=auto -ffat-lto-objects
-fexceptions -fstack-protector-strong
-fasynchronous-unwind-tables -fstack-clash-protection  -Wall
-Werror=format-security -Wp,-D_FORTIFY_SOURCE=2
-Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized
$(pkg-config --cflags libtracefs)    -c -o src/utils.o src/utils.c

  clang: warning: optimization flag '-ffat-lto-objects' is not supported [-Wignored-optimization-argument]
  warning: unknown warning option '-Wno-maybe-uninitialized'; did you mean '-Wno-uninitialized'? [-Wunknown-warning-option]
  1 warning generated.

  clang -o rtla -ggdb  src/osnoise.o src/osnoise_hist.o src/osnoise_top.o
  src/rtla.o src/timerlat_aa.o src/timerlat.o src/timerlat_hist.o
  src/timerlat_top.o src/timerlat_u.o src/trace.o src/utils.o $(pkg-config --libs libtracefs)

  src/osnoise.o: file not recognized: file format not recognized
  clang: error: linker command failed with exit code 1 (use -v to see invocation)
  make: *** [Makefile:110: rtla] Error 1

Solve these issues by:
  - removing -ffat-lto-objects and -Wno-maybe-uninitialized if using clang
  - informing the linker about -flto=auto

Link: https://lore.kernel.org/linux-trace-kernel/567ac1b94effc228ce9a0225b9df7232a9b35b55.1707217097.git.bristot@kernel.org
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Bill Wendling <[email protected]>
Cc: Justin Stitt <[email protected]>
Fixes: 1a7b22ab15eb ("tools/rtla: Build with EXTRA_{C,LD}FLAGS")
Suggested-by: Donald Zickus <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>

accel/ivpu: Fix DevTLB errors on suspend/resume and recovery

Issue IP reset before shutdown in order to
complete all upstream requests to the SOC.
Without this DevTLB is complaining about
incomplete transactions and NPU cannot resume from
suspend.
This problem is only happening on recent IFWI
releases.

IP reset in rare corner cases can mess up PCI
configuration, so save it before the reset.
After this happens it is also impossible to
issue PLL requests and D0->D3->D0 cycle is needed
to recover the NPU. Add WP 0 request on power up,
so the PUNIT is always notified about NPU reset.

Use D0/D3 cycle for recovery as it can recover
from failed IP reset and FLR cannot.

Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
Signed-off-by: Jacek Lawrynowicz <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

ASoC: Intel: Boards: Fix NULL pointer deref in BYT/CHT

Merge series from Hans de Goede <[email protected]>:

While testing 6.8 on a Bay Trail device with a ALC5640 codec
I noticed a regression in 6.8 which causes a NULL pointer deref
in probe().

All BYT/CHT Intel machine drivers are affected. Patch 1/2 of
this series fixes all of them.

Patch 2/2 adds some small cleanups to cht_bsw_rt5645.c for
issues which I noticed while working on 1/2.

cifs: update the same create_guid on replay

File open requests made to the server contain a
CreateGuid, which is used by the server to identify
the open request. If the same request needs to be
replayed, it needs to be sent with the same CreateGuid
in the durable handle v2 context.

Without doing so, we could end up leaking handles on
the server when:
1. multichannel is used AND
2. connection goes down, but not for all channels

This is because the replayed open request would have a
new CreateGuid and the server will treat this as a new
request and open a new handle.

This change fixes this by reusing the existing create_guid
stored in the cached fid struct.

REF: MS-SMB2 4.9 Replay Create Request on an Alternate Channel

Fixes: 4f1fffa23769 ("cifs: commands that are retried should have replay flag set")
Signed-off-by: Shyam Prasad N <[email protected]>
Signed-off-by: Steve French <[email protected]>

cifs: fix underflow in parse_server_interfaces()

In this loop, we step through the buffer and after each item we check
if the size_left is greater than the minimum size we need. However,
the problem is that "bytes_left" is type ssize_t while sizeof() is type
size_t. That means that because of type promotion, the comparison is
done as an unsigned and if we have negative bytes left the loop
continues instead of ending.

Fixes: fe856be475f7 ("CIFS: parse and store info on iface queries")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Shyam Prasad N <[email protected]>
Signed-off-by: Steve French <[email protected]>

Linux 6.8-rc4

Merge tag 'timers_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Borislav Petkov:

- Make sure a warning is issued when a hrtimer gets queued after the
   timers have been migrated on the CPU down path and thus said timer
   will get ignored

* tag 'timers_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Report offline hrtimer enqueue

Merge tag 'x86_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

- Correct the minimum CPU family for Transmeta Crusoe in Kconfig so
   that such hw can boot again

- Do not take into accout XSTATE buffer size info supplied by userspace
   when constructing a sigreturn frame

- Switch get_/put_user* to EX_TYPE_UACCESS exception handling when an
   MCE is encountered so that it can be properly recovered from instead
   of simply panicking

* tag 'x86_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6
  x86/fpu: Stop relying on userspace for info to fault in xsave buffer
  x86/lib: Revert to _ASM_EXTABLE_UA() for {get,put}_user() fixups

ASoC: SOF: amd: Fix locking in ACP IRQ handler

A recent change in acp_irq_thread() was meant to address a potential race
condition while trying to acquire the hardware semaphore responsible for
the synchronization between firmware and host IPC interrupts.

This resulted in an improper use of the IPC spinlock, causing normal
kernel memory allocations (which may sleep) inside atomic contexts:

1707255557.133976 kernel: BUG: sleeping function called from invalid context at include/linux/sched/mm.h:315

...

1707255557.134757 kernel:  sof_ipc3_rx_msg+0x70/0x130 [snd_sof]
1707255557.134793 kernel:  acp_sof_ipc_irq_thread+0x1e0/0x550 [snd_sof_amd_acp]
1707255557.134855 kernel:  acp_irq_thread+0xa3/0x130 [snd_sof_amd_acp]
1707255557.134904 kernel:  ? irq_thread+0xb5/0x1e0
1707255557.134947 kernel:  ? __pfx_irq_thread_fn+0x10/0x10
1707255557.134985 kernel:  irq_thread_fn+0x23/0x60

Moreover, there are attempts to lock a mutex from the same atomic
context:

1707255557.136357 kernel: =============================
1707255557.136393 kernel: [ BUG: Invalid wait context ]
1707255557.136413 kernel: 6.8.0-rc3-next-20240206-audio-next #9 Tainted: G        W
1707255557.136432 kernel: -----------------------------
1707255557.136451 kernel: irq/66-AudioDSP/502 is trying to lock:
1707255557.136470 kernel: ffff965152f26af8 (&sb->s_type->i_mutex_key#2){+.+.}-{3:3}, at: start_creating.part.0+0x5f/0x180

...

1707255557.137429 kernel:  start_creating.part.0+0x5f/0x180
1707255557.137457 kernel:  __debugfs_create_file+0x61/0x210
1707255557.137475 kernel:  snd_sof_debugfs_io_item+0x75/0xc0 [snd_sof]
1707255557.137494 kernel:  sof_ipc3_do_rx_work+0x7cf/0x9f0 [snd_sof]
1707255557.137513 kernel:  sof_ipc3_rx_msg+0xb3/0x130 [snd_sof]
1707255557.137532 kernel:  acp_sof_ipc_irq_thread+0x1e0/0x550 [snd_sof_amd_acp]
1707255557.137551 kernel:  acp_irq_thread+0xa3/0x130 [snd_sof_amd_acp]

Fix the issues by reducing the lock scope in acp_irq_thread(), so that
it guards only the hardware semaphore acquiring attempt.  Additionally,
restore the initial locking in acp_sof_ipc_irq_thread() to synchronize
the handling of immediate replies from DSP core.

Fixes: 802134c8c2c8 ("ASoC: SOF: amd: Refactor spinlock_irq(&sdev->ipc_lock) sequence in irq_handler")
Signed-off-by: Cristian Ciocaltea <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

ASoC: rt5645: Fix deadlock in rt5645_jack_detect_work()

There is a path in rt5645_jack_detect_work(), where rt5645->jd_mutex
is left locked forever. That may lead to deadlock
when rt5645_jack_detect_work() is called for the second time.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: cdba4301adda ("ASoC: rt5650: add mutex to avoid the jack detection failure")
Signed-off-by: Alexey Khoroshilov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

spi: ppc4xx: Drop write-only variable

Since commit 24778be20f87 ("spi: convert drivers to use
bits_per_word_mask") the bits_per_word variable is only written to. The
check that was there before isn't needed any more as the spi core
ensures that only 8 bit transfers are used, so the variable can go away
together with all assignments to it.

Fixes: 24778be20f87 ("spi: convert drivers to use bits_per_word_mask")
Signed-off-by: Uwe Kleine-König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

spi: ppc4xx: Fix fallout from rename in struct spi_bitbang

I failed to adapt this driver because it's not enabled in a powerpc
allmodconfig build and also wasn't hit by my grep expertise. Fix
accordingly.

Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Fixes: 2259233110d9 ("spi: bitbang: Follow renaming of SPI "master" to "controller"")
Signed-off-by: Uwe Kleine-König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

spi: ppc4xx: Fix fallout from include cleanup

The driver uses several symbols declared in <linux/platform_device.h>,
e.g module_platform_driver(). Include this header explicitly now that
<linux/of_platform.h> doesn't include <linux/platform_device.h> any
more.

Fixes: ef175b29a242 ("of: Stop circularly including of_device.h and of_platform.h")
Signed-off-by: Uwe Kleine-König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

ASoC: Intel: cht_bsw_rt5645: Cleanup codec_name handling

4 fixes / cleanups to the rt5645 mc driver's codec_name handling:

1. In the for loop looking for the dai_index for the codec, replace
card->dai_link[i] with cht_dailink[i]. The for loop already uses
ARRAY_SIZE(cht_dailink) as bound and card->dai_link is just a pointer to
cht_dailink using card->dai_link only obfuscates that cht_dailink is being
modified directly rather then say a copy of cht_dailink. Using
cht_dailink[i] also makes the code consistent with other machine drivers.

2. Don't set cht_dailink[dai_index].codecs->name in the for loop,
this immediately gets overridden using acpi_dev_name(adev) directly
below the loop.

3. Add a missing break to the loop.

4. Remove the now no longer used (only set, never read) codec_name field
from struct cht_mc_private.

Signed-off-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

ASoC: Intel: Boards: Fix NULL pointer deref in BYT/CHT boards

Since commit 13f58267cda3 ("ASoC: soc.h: don't create dummy Component
via COMP_DUMMY()") dummy snd_soc_dai_link.codecs entries no longer
have a name set.

This means that when looking for the codec dai_link the machine
driver can no longer unconditionally run strcmp() on
snd_soc_dai_link.codecs[0].name since this may now be NULL.

Add a check for snd_soc_dai_link.codecs[0].name being NULL to all
BYT/CHT machine drivers to avoid NULL pointer dereferences in
their probe() methods.

Fixes: 13f58267cda3 ("ASoC: soc.h: don't create dummy Component via COMP_DUMMY()")
Cc: Kuninori Morimoto <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

Merge tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"21 hotfixes. 12 are cc:stable and the remainder pertain to post-6.7
  issues or aren't considered to be needed in earlier kernel versions"

* tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
  nilfs2: fix potential bug in end_buffer_async_write
  mm/damon/sysfs-schemes: fix wrong DAMOS tried regions update timeout setup
  nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
  MAINTAINERS: Leo Yan has moved
  mm/zswap: don't return LRU_SKIP if we have dropped lru lock
  fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
  mailmap: switch email address for John Moon
  mm: zswap: fix objcg use-after-free in entry destruction
  mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range()
  arch/arm/mm: fix major fault accounting when retrying under per-VMA lock
  selftests: core: include linux/close_range.h for CLOSE_RANGE_* macros
  mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page
  mm: memcg: optimize parent iteration in memcg_rstat_updated()
  nilfs2: fix data corruption in dsync block recovery for small block sizes
  mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get()
  exit: wait_task_zombie: kill the no longer necessary spin_lock_irq(siglock)
  fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats
  fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()
  getrusage: use sig->stats_lock rather than lock_task_sighand()
  getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()
  ...

bcachefs: fix missing endiannes conversion in sb_members

Signed-off-by: Kent Overstreet <[email protected]>

bcachefs: fix kmemleak in __bch2_read_super error handling path

During xfstest tests, there are some kmemleak reports e.g. generic/051 with
if USE_KMEMLEAK=yes:

====================================================================
EXPERIMENTAL kmemleak reported some memory leaks!  Due to the way kmemleak
works, the leak might be from an earlier test, or something totally unrelated.
unreferenced object 0xffff9ef905aaf778 (size 8):
  comm "mount.bcachefs", pid 169844, jiffies 4295281209 (age 87.040s)
  hex dump (first 8 bytes):
    a5 cc cc cc cc cc cc cc                          ........
  backtrace:
    [<ffffffff87fd9a43>] __kmem_cache_alloc_node+0x1f3/0x2c0
    [<ffffffff87f49b66>] kmalloc_trace+0x26/0xb0
    [<ffffffffc0a3fefe>] __bch2_read_super+0xfe/0x4e0 [bcachefs]
    [<ffffffffc0a3ad22>] bch2_fs_open+0x262/0x1710 [bcachefs]
    [<ffffffffc09c9e24>] bch2_mount+0x4c4/0x640 [bcachefs]
    [<ffffffff88080c90>] legacy_get_tree+0x30/0x60
    [<ffffffff8802c748>] vfs_get_tree+0x28/0xf0
    [<ffffffff88061fe5>] path_mount+0x475/0xb60
    [<ffffffff880627e5>] __x64_sys_mount+0x105/0x140
    [<ffffffff88932642>] do_syscall_64+0x42/0xf0
    [<ffffffff88a000e6>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
unreferenced object 0xffff9ef96cdc4fc0 (size 32):
  comm "mount.bcachefs", pid 169844, jiffies 4295281209 (age 87.040s)
  hex dump (first 32 bytes):
    2f 64 65 76 2f 6d 61 70 70 65 72 2f 74 65 73 74  /dev/mapper/test
    2d 31 00 cc cc cc cc cc cc cc cc cc cc cc cc cc  -1..............
  backtrace:
    [<ffffffff87fd9a43>] __kmem_cache_alloc_node+0x1f3/0x2c0
    [<ffffffff87f4a081>] __kmalloc_node_track_caller+0x51/0x150
    [<ffffffff87f3adc2>] kstrdup+0x32/0x60
    [<ffffffffc0a3ff1a>] __bch2_read_super+0x11a/0x4e0 [bcachefs]
    [<ffffffffc0a3ad22>] bch2_fs_open+0x262/0x1710 [bcachefs]
    [<ffffffffc09c9e24>] bch2_mount+0x4c4/0x640 [bcachefs]
    [<ffffffff88080c90>] legacy_get_tree+0x30/0x60
    [<ffffffff8802c748>] vfs_get_tree+0x28/0xf0
    [<ffffffff88061fe5>] path_mount+0x475/0xb60
    [<ffffffff880627e5>] __x64_sys_mount+0x105/0x140
    [<ffffffff88932642>] do_syscall_64+0x42/0xf0
    [<ffffffff88a000e6>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
====================================================================

The leak happens if bdev_open_by_path() failed to open a block device
then it goes label 'out' directly without call of bch2_free_super().

Fix it by going to label 'err' instead of 'out' if bdev_open_by_path()
fails.

Signed-off-by: Su Yue <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>

bcachefs: Fix missing bch2_err_class() calls

We aren't supposed to be leaking our private error codes outside of
fs/bcachefs/.

Fixes:
Signed-off-by: Kent Overstreet <[email protected]>

Merge branch 'tls-fixes'

Jakub Kicinski says:

====================
net: tls: fix some issues with async encryption

valis was reporting a race on socket close so I sat down to try to fix it.
I used Sabrina's async crypto debug patch to test... and in the process
run into some of the same issues, and created very similar fixes :(
I didn't realize how many of those patches weren't applied. Once I found
Sabrina's code [1] it turned out to be so similar in fact that I added
her S-o-b's and Co-develop'eds in a semi-haphazard way.

With this series in place all expected tests pass with async crypto.
Sabrina had a few more fixes, but I'll leave those to her, things are
not crashing anymore.

[1] https://lore.kernel.org/netdev/cover.1694018970 [email protected]/
====================

Signed-off-by: David S. Miller <[email protected]>

net: tls: fix returned read length with async decrypt

We double count async, non-zc rx data. The previous fix was
lucky because if we fully zc async_copy_bytes is 0 so we add 0.
Decrypted already has all the bytes we handled, in all cases.
We don't have to adjust anything, delete the erroneous line.

Fixes: 4d42cd6bc2ac ("tls: rx: fix return value for async crypto")
Co-developed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: tls: use exact comparison in recv_partial

This exact case was fail for async crypto and we weren't
catching it.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: tls: fix use-after-free with partial reads and async decrypt

tls_decrypt_sg doesn't take a reference on the pages from clear_skb,
so the put_page() in tls_decrypt_done releases them, and we trigger
a use-after-free in process_rx_list when we try to read from the
partially-read skb.

Fixes: fd31f3996af2 ("tls: rx: decrypt into a fresh skb")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: tls: handle backlogging of crypto requests

Since we're setting the CRYPTO_TFM_REQ_MAY_BACKLOG flag on our
requests to the crypto API, crypto_aead_{encrypt,decrypt} can return
-EBUSY instead of -EINPROGRESS in valid situations. For example, when
the cryptd queue for AESNI is full (easy to trigger with an
artificially low cryptd.cryptd_max_cpu_qlen), requests will be enqueued
to the backlog but still processed. In that case, the async callback
will also be called twice: first with err == -EINPROGRESS, which it
seems we can just ignore, then with err == 0.

Compared to Sabrina's original patch this version uses the new
tls_*crypt_async_wait() helpers and converts the EBUSY to
EINPROGRESS to avoid having to modify all the error handling
paths. The handling is identical.

Fixes: a54667f6728c ("tls: Add support for encryption using async offload accelerator")
Fixes: 94524d8fc965 ("net/tls: Add support for async decryption of tls records")
Co-developed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Sabrina Dubroca <[email protected]>
Link: https://lore.kernel.org/netdev/9681d1febfec295449a62300938ed2ae66983f28.1694018970.git.sd@queasysnail.net/
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tls: fix race between tx work scheduling and socket close

Similarly to previous commit, the submitting thread (recvmsg/sendmsg)
may exit as soon as the async crypto handler calls complete().
Reorder scheduling the work before calling complete().
This seems more logical in the first place, as it's
the inverse order of what the submitting thread will do.

Reported-by: valis <[email protected]>
Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tls: fix race between async notify and socket close

The submitting thread (one which called recvmsg/sendmsg)
may exit as soon as the async crypto handler calls complete()
so any code past that point risks touching already freed data.

Try to avoid the locking and extra flags altogether.
Have the main thread hold an extra reference, this way
we can depend solely on the atomic ref counter for
synchronization.

Don't futz with reiniting the completion, either, we are now
tightly controlling when completion fires.

Reported-by: valis <[email protected]>
Fixes: 0cada33241d9 ("net/tls: fix race condition causing kernel panic")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: tls: factor out tls_*crypt_async_wait()

Factor out waiting for async encrypt and decrypt to finish.
There are already multiple copies and a subsequent fix will
need more. No functional changes.

Note that crypto_wait_req() returns wait->err

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

iio: adc: ad4130: only set GPIO_CTRL if pin is unused

Currently, GPIO_CTRL bits are set even if the pins are used for
measurements.

GPIO_CTRL bits should only be set if the pin is not used for
other functionality.

Fix this by only setting the GPIO_CTRL bits if the pin has no
other function.

Fixes: 62094060cf3a ("iio: adc: ad4130: add AD4130 driver")
Signed-off-by: Cosmin Tanislav <[email protected]>
Reviewed-by: Nuno Sa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Cameron <[email protected]>

iio: adc: ad4130: zero-initialize clock init data

The clk_init_data struct does not have all its members
initialized, causing issues when trying to expose the internal
clock on the CLK pin.

Fix this by zero-initializing the clk_init_data struct.

Fixes: 62094060cf3a ("iio: adc: ad4130: add AD4130 driver")
Signed-off-by: Cosmin Tanislav <[email protected]>
Reviewed-by: Nuno Sa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Cameron <[email protected]>

Merge tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- NVMe pull request via Keith:
     - Update a potentially stale firmware attribute (Maurizio)
     - Fixes for the recent verbose error logging (Keith, Chaitanya)
     - Protection information payload size fix for passthrough (Francis)

- Fix for a queue freezing issue in virtblk (Yi)

- blk-iocost underflow fix (Tejun)

- blk-wbt task detection fix (Jan)

* tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux:
  virtio-blk: Ensure no requests in virtqueues before deleting vqs.
  blk-iocost: Fix an UBSAN shift-out-of-bounds warning
  nvme: use ns->head->pi_size instead of t10_pi_tuple structure size
  nvme-core: fix comment to reflect right functions
  nvme: move passthrough logging attribute to head
  blk-wbt: Fix detection of dirty-throttled tasks
  nvme-host: fix the updating of the firmware version

Merge tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394

Pull firewire fix from Takashi Sakamoto:
"A change to accelerate the device detection step in some cases.

  In the self-identification step after bus-reset, all nodes in the same
  bus broadcast selfID packet including the value of gap count. The
  value is related to the cable hops between nodes, and used to
  calculate the subaction gap and the arbitration reset gap.

  When each node has the different value of the gap count, the
  asynchronous communication between them is unreliable, since an
  asynchronous transaction could be interrupted by another asynchronous
  transaction before completion. The gap count inconsistency can be
  resolved by several ways; e.g. the transfer of PHY configuration
  packet and generation of bus-reset.

  The current implementation of firewire stack can correctly detect the
  gap count inconsistency, however the recovery action from the
  inconsistency tends to be delayed after reading configuration ROM of
  root node. This results in the long time to probe devices in some
  combinations of hardware.

  Here the stack is changed to schedule the action as soon as possible"

* tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: core: send bus reset promptly on gap count error

Merge tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:
"Two ksmbd server fixes:

   - memory leak fix

   - a minor kernel-doc fix"

* tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: free aux buffer if ksmbd_iov_pin_rsp_read fails
  ksmbd: Add kernel-doc for ksmbd_extract_sharename() function

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Three small driver fixes and one core fix.

  The core fix being a fixup to the one in the last pull request which
  didn't entirely move checking of scsi_host_busy() out from under the
  host lock"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Remove the ufshcd_release() in ufshcd_err_handling_prepare()
  scsi: ufs: core: Fix shift issue in ufshcd_clear_cmd()
  scsi: lpfc: Use unsigned type for num_sge
  scsi: core: Move scsi_host_busy() out of host lock if it is for per-command

Merge tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- reconnect fix

- multichannel channel selection fix

- minor mount warning fix

- reparse point fix

- null pointer check improvement

* tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: clarify mount warning
  cifs: handle cases where multiple sessions share connection
  cifs: change tcon status when need_reconnect is set on it
  smb: client: set correct d_type for reparse points under DFS mounts
  smb3: add missing null server pointer check

Merge tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
"Some fscrypt-related fixups (sparse reads are used only for encrypted
  files) and two cap handling fixes from Xiubo and Rishabh"

* tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client:
  ceph: always check dir caps asynchronously
  ceph: prevent use-after-free in encode_cap_msg()
  ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFT
  libceph: just wait for more data to be available on the socket
  libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*()
  libceph: fail sparse-read if the data length doesn't match

Merge tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 fixes from Konstantin Komarov:
"Fixed:
   - size update for compressed file
   - some logic errors, overflows
   - memory leak
   - some code was refactored

  Added:
   - implement super_operations::shutdown

  Improved:
   - alternative boot processing
   - reduced stack usage"

* tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3: (28 commits)
  fs/ntfs3: Slightly simplify ntfs_inode_printk()
  fs/ntfs3: Add ioctl operation for directories (FITRIM)
  fs/ntfs3: Fix oob in ntfs_listxattr
  fs/ntfs3: Fix an NULL dereference bug
  fs/ntfs3: Update inode->i_size after success write into compressed file
  fs/ntfs3: Fixed overflow check in mi_enum_attr()
  fs/ntfs3: Correct function is_rst_area_valid
  fs/ntfs3: Use i_size_read and i_size_write
  fs/ntfs3: Prevent generic message "attempt to access beyond end of device"
  fs/ntfs3: use non-movable memory for ntfs3 MFT buffer cache
  fs/ntfs3: Use kvfree to free memory allocated by kvmalloc
  fs/ntfs3: Disable ATTR_LIST_ENTRY size check
  fs/ntfs3: Fix c/mtime typo
  fs/ntfs3: Add NULL ptr dereference checking at the end of attr_allocate_frame()
  fs/ntfs3: Add and fix comments
  fs/ntfs3: ntfs3_forced_shutdown use int instead of bool
  fs/ntfs3: Implement super_operations::shutdown
  fs/ntfs3: Drop suid and sgid bits as a part of fpunch
  fs/ntfs3: Add file_modified
  fs/ntfs3: Correct use bh_read
  ...

work around gcc bugs with 'asm goto' with outputs

We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f180345f53 ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").

Then, much later, we ended up removing the workaround in commit
43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.

Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround.  But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.

It looks like there are at least two separate issues that all hit in
this area:

(a) some versions of gcc don't mark the asm goto as 'volatile' when it
     has outputs:

        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420

     which is easy to work around by just adding the 'volatile' by hand.

(b) Internal compiler errors:

        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

     which are worked around by adding the extra empty 'asm' as a
     barrier, as in the original workaround.

but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.

but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.

Reported-and-tested-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Cc: Nick Desaulniers <[email protected]>
Cc: Uros Bizjak <[email protected]>
Cc: Jakub Jelinek <[email protected]>
Cc: Andrew Pinski <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'net-fix-module_description-for-net-p5'

Breno Leitao says:

====================
net: Fix MODULE_DESCRIPTION() for net (p5)

There are hundreds of network modules that misses MODULE_DESCRIPTION(),
causing a warning when compiling with W=1. Example:

WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_cmp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_nbyte.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_u32.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_meta.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_text.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_canid.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_tunnel.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ipip.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_gre.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/udp_tunnel.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_vti.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ah4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/esp4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/xfrm4_tunnel.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/tunnel4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/xfrm/xfrm_algo.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/xfrm/xfrm_user.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/ah6.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/esp6.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/xfrm6_tunnel.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/tunnel6.o

This part5 of the patchset focus on the missing net/ module, which
are now warning free.

v1: https://lore.kernel.org/all/20240205101400.1480521 [email protected]/
v2: https://lore.kernel.org/all/20240207101929 [email protected]/
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for dsa_loop_bdinfo

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the DSA loopback fixed PHY module.

Suggested-by: Florian Fainelli <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for ipvtap

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the IP-VLAN based tap driver.

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for net/sched

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the network schedulers.

Suggested-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Jamal Hadi Salim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for ipv4 modules

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the IPv4 modules.

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for ipv6 modules

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the IPv6 modules.

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for 6LoWPAN

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to IPv6 over Low power Wireless Personal Area Network.

Signed-off-by: Breno Leitao <[email protected]>
Acked-by: Alexander Aring <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for af_key

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the PF_KEY socket helpers.

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for mpoa

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the Multi-Protocol Over ATM (MPOA) driver.

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: fill in MODULE_DESCRIPTION()s for xfrm

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the XFRM interface drivers.

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

lan966x: Fix crash when adding interface under a lag

There is a crash when adding one of the lan966x interfaces under a lag
interface. The issue can be reproduced like this:
ip link add name bond0 type bond miimon 100 mode balance-xor
ip link set dev eth0 master bond0

The reason is because when adding a interface under the lag it would go
through all the ports and try to figure out which other ports are under
that lag interface. And the issue is that lan966x can have ports that are
NULL pointer as they are not probed. So then iterating over these ports
it would just crash as they are NULL pointers.
The fix consists in actually checking for NULL pointers before accessing
something from the ports. Like we do in other places.

Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
Signed-off-by: Horatiu Vultur <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net/sched: act_mirred: Don't zero blockid when net device is being deleted

While testing tdc with parallel tests for mirred to block we caught an
intermittent bug. The blockid was being zeroed out when a net device
was deleted and, thus, giving us an incorrect blockid value whenever
we tried to dump the mirred action. Since we don't increment the block
refcount in the control path (and only use the ID), we don't need to
zero the blockid field whenever a net device is going down.

Fixes: 42f39036cda8 ("net/sched: act_mirred: Allow mirred to block")
Signed-off-by: Victor Nogueira <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'net-openvswitch-limit-the-recursions-from-action-sets'

Aaron Conole says:

====================
net: openvswitch: limit the recursions from action sets

Open vSwitch module accepts actions as a list from the netlink socket
and then creates a copy which it uses in the action set processing.
During processing of the action list on a packet, the module keeps a
count of the execution depth and exits processing if the action depth
goes too high.

However, during netlink processing the recursion depth isn't checked
anywhere, and the copy trusts that kernel has large enough stack to
accommodate it.  The OVS sample action was the original action which
could perform this kinds of recursion, and it originally checked that
it didn't exceed the sample depth limit.  However, when sample became
optimized to provide the clone() semantics, the recursion limit was
dropped.

This series adds a depth limit during the __ovs_nla_copy_actions() call
that will ensure we don't exceed the max that the OVS userspace could
generate for a clone().

Additionally, this series provides a selftest in 2/2 that can be used to
determine if the OVS module is allowing unbounded access.  It can be
safely omitted where the ovs selftest framework isn't available.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: openvswitch: Add validation for the recursion test

Add a test case into the netlink checks that will show the number of
nested action recursions won't exceed 16. Going to 17 on a small
clone call isn't enough to exhaust the stack on (most) systems, so
it should be safe to run even on systems that don't have the fix
applied.

Signed-off-by: Aaron Conole <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: openvswitch: limit the number of recursions from action sets

The ovs module allows for some actions to recursively contain an action
list for complex scenarios, such as sampling, checking lengths, etc.
When these actions are copied into the internal flow table, they are
evaluated to validate that such actions make sense, and these calls
happen recursively.

The ovs-vswitchd userspace won't emit more than 16 recursion levels
deep. However, the module has no such limit and will happily accept
limits larger than 16 levels nested. Prevent this by tracking the
number of recursions happening and manually limiting it to 16 levels
nested.

The initial implementation of the sample action would track this depth
and prevent more than 3 levels of recursion, but this was removed to
support the clone use case, rather than limited at the current userspace
limit.

Fixes: 798c166173ff ("openvswitch: Optimize sample action for the clone use cases")
Signed-off-by: Aaron Conole <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

smb3: clarify mount warning

When a user tries to use the "sec=krb5p" mount parameter to encrypt
data on connection to a server (when authenticating with Kerberos), we
indicate that it is not supported, but do not note the equivalent
recommended mount parameter ("sec=krb5,seal") which turns on encryption
for that mount (and uses Kerberos for auth). Update the warning message.

Reviewed-by: Shyam Prasad N <[email protected]>
Signed-off-by: Steve French <[email protected]>

cifs: handle cases where multiple sessions share connection

Based on our implementation of multichannel, it is entirely
possible that a server struct may not be found in any channel
of an SMB session.

In such cases, we should be prepared to move on and search for
the server struct in the next session.

Signed-off-by: Shyam Prasad N <[email protected]>
Signed-off-by: Steve French <[email protected]>

cifs: change tcon status when need_reconnect is set on it

When a tcon is marked for need_reconnect, the intention
is to have it reconnected.

This change adjusts tcon->status in cifs_tree_connect
when need_reconnect is set. Also, this change has a minor
correction in resetting need_reconnect on success. It makes
sure that it is done with tc_lock held.

Signed-off-by: Shyam Prasad N <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge branch 'selftests-forwarding-various-fixes'

Ido Schimmel says:

====================
selftests: forwarding: Various fixes

Fix various problems in the forwarding selftests so that they will pass
in the netdev CI instead of being ignored. See commit messages for
details.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: forwarding: Fix bridge locked port test flakiness

The redirection test case fails in the netdev CI on debug kernels
because an FDB entry is learned despite the presence of a tc filter that
redirects incoming traffic [1].

I am unable to reproduce the failure locally, but I can see how it can
happen given that learning is first enabled and only then the ingress tc
filter is configured. On debug kernels the time window between these two
operations is longer compared to regular kernels, allowing random
packets to be transmitted and trigger learning.

Fix by reversing the order and configure the ingress tc filter before
enabling learning.

[1]
[...]
# TEST: Locked port MAB redirect [FAIL]
# Locked entry created for redirected traffic

Fixes: 38c43a1ce758 ("selftests: forwarding: Add test case for traffic redirection from a locked port")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Hangbin Liu <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: forwarding: Suppress grep warnings

Suppress the following grep warnings:

[...]
INFO: # Port group entries configuration tests - (*, G)
TEST: Common port group entries configuration tests (IPv4 (*, G))   [ OK ]
TEST: Common port group entries configuration tests (IPv6 (*, G))   [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv4 (*, G) port group entries configuration tests            [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv6 (*, G) port group entries configuration tests            [ OK ]
[...]

They do not fail the test, but do clutter the output.

Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Hangbin Liu <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: forwarding: Fix bridge MDB test flakiness

After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.

Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].

Fix by reducing the Max Response Delay to 1 second.

[1]
[...]
# TEST: IPv4 host entries forwarding tests [FAIL]
# Packet locally received after flood

Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Hangbin Liu <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: forwarding: Fix layer 2 miss test flakiness

After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.

Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].

Fix by reducing the Max Response Delay to 1 second.

[1]
[...]
# TEST: L2 miss - Multicast (IPv4) [FAIL]
# Unregistered multicast filter was hit after adding MDB entry

Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Hangbin Liu <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: Fix bridge backup port test flakiness

The test toggles the carrier of a bridge port in order to test the
bridge backup port feature.

Due to the linkwatch delayed work the carrier change is not always
reflected fast enough to the bridge driver and packets are not forwarded
as the test expects, resulting in failures [1].

Fix by busy waiting on the bridge port state until it changes to the
desired state following the carrier change.

[1]
# Backup port
# -----------
[...]
# TEST: swp1 carrier off                                              [ OK ]
# TEST: No forwarding out of swp1                                     [FAIL]
[  641.995910] br0: port 1(swp1) entered disabled state
# TEST: No forwarding out of vx0                                      [ OK ]

Fixes: b408453053fb ("selftests: net: Add bridge backup port and backup nexthop ID test")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Acked-by: Paolo Abeni <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

btrfs: add new unused block groups to the list of unused block groups

Space reservations for metadata are, most of the time, pessimistic as we
reserve space for worst possible cases - where tree heights are at the
maximum possible height (8), we need to COW every extent buffer in a tree
path, need to split extent buffers, etc.

For data, we generally reserve the exact amount of space we are going to
allocate. The exception here is when using compression, in which case we
reserve space matching the uncompressed size, as the compression only
happens at writeback time and in the worst possible case we need that
amount of space in case the data is not compressible.

This means that when there's not available space in the corresponding
space_info object, we may need to allocate a new block group, and then
that block group might not be used after all. In this case the block
group is never added to the list of unused block groups and ends up
never being deleted - except if we unmount and mount again the fs, as
when reading block groups from disk we add unused ones to the list of
unused block groups (fs_info->unused_bgs). Otherwise a block group is
only added to the list of unused block groups when we deallocate the
last extent from it, so if no extent is ever allocated, the block group
is kept around forever.

This also means that if we have a bunch of tasks reserving space in
parallel we can end up allocating many block groups that end up never
being used or kept around for too long without being used, which has
the potential to result in ENOSPC failures in case for example we over
allocate too many metadata block groups and then end up in a state
without enough unallocated space to allocate a new data block group.

This is more likely to happen with metadata reservations as of kernel
6.7, namely since commit 28270e25c69a ("btrfs: always reserve space for
delayed refs when starting transaction"), because we started to always
reserve space for delayed references when starting a transaction handle
for a non-zero number of items, and also to try to reserve space to fill
the gap between the delayed block reserve's reserved space and its size.

So to avoid this, when finishing the creation a new block group, add the
block group to the list of unused block groups if it's still unused at
that time. This way the next time the cleaner kthread runs, it will delete
the block group if it's still unused and not needed to satisfy existing
space reservations.

Reported-by: Ivan Shapovalov <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
CC: [email protected] # 6.7+
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: Boris Burkov <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: do not delete unused block group if it may be used soon

Before deleting a block group that is in the list of unused block groups
(fs_info->unused_bgs), we check if the block group became used before
deleting it, as extents from it may have been allocated after it was added
to the list.

However even if the block group was not yet used, there may be tasks that
have only reserved space and have not yet allocated extents, and they
might be relying on the availability of the unused block group in order
to allocate extents. The reservation works first by increasing the
"bytes_may_use" field of the corresponding space_info object (which may
first require flushing delayed items, allocating a new block group, etc),
and only later a task does the actual allocation of extents.

For metadata we usually don't end up using all reserved space, as we are
pessimistic and typically account for the worst cases (need to COW every
single node in a path of a tree at maximum possible height, etc). For
data we usually reserve the exact amount of space we're going to allocate
later, except when using compression where we always reserve space based
on the uncompressed size, as compression is only triggered when writeback
starts so we don't know in advance how much space we'll actually need, or
if the data is compressible.

So don't delete an unused block group if the total size of its space_info
object minus the block group's size is less then the sum of used space and
space that may be used (space_info->bytes_may_use), as that means we have
tasks that reserved space and may need to allocate extents from the block
group. In this case, besides skipping the deletion, re-add the block group
to the list of unused block groups so that it may be reconsidered later,
in case the tasks that reserved space end up not needing to allocate
extents from it.

Allowing the deletion of the block group while we have reserved space, can
result in tasks failing to allocate metadata extents (-ENOSPC) while under
a transaction handle, resulting in a transaction abort, or failure during
writeback for the case of data extents.

CC: [email protected] # 6.0+
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: Boris Burkov <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: add and use helper to check if block group is used

Add a helper function to determine if a block group is being used and make
use of it at btrfs_delete_unused_bgs(). This helper will also be used in
future code changes.

Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: Boris Burkov <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: don't drop extent_map for free space inode on write error

While running the CI for an unrelated change I hit the following panic
with generic/648 on btrfs_holes_spacecache.

assertion failed: block_start != EXTENT_MAP_HOLE, in fs/btrfs/extent_io.c:1385
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:1385!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 2695096 Comm: fsstress Kdump: loaded Tainted: G        W          6.8.0-rc2+ #1
RIP: 0010:__extent_writepage_io.constprop.0+0x4c1/0x5c0
Call Trace:
<TASK>
extent_write_cache_pages+0x2ac/0x8f0
extent_writepages+0x87/0x110
do_writepages+0xd5/0x1f0
filemap_fdatawrite_wbc+0x63/0x90
__filemap_fdatawrite_range+0x5c/0x80
btrfs_fdatawrite_range+0x1f/0x50
btrfs_write_out_cache+0x507/0x560
btrfs_write_dirty_block_groups+0x32a/0x420
commit_cowonly_roots+0x21b/0x290
btrfs_commit_transaction+0x813/0x1360
btrfs_sync_file+0x51a/0x640
__x64_sys_fdatasync+0x52/0x90
do_syscall_64+0x9c/0x190
entry_SYSCALL_64_after_hwframe+0x6e/0x76

This happens because we fail to write out the free space cache in one
instance, come back around and attempt to write it again.  However on
the second pass through we go to call btrfs_get_extent() on the inode to
get the extent mapping.  Because this is a new block group, and with the
free space inode we always search the commit root to avoid deadlocking
with the tree, we find nothing and return a EXTENT_MAP_HOLE for the
requested range.

This happens because the first time we try to write the space cache out
we hit an error, and on an error we drop the extent mapping.  This is
normal for normal files, but the free space cache inode is special.  We
always expect the extent map to be correct.  Thus the second time
through we end up with a bogus extent map.

Since we're deprecating this feature, the most straightforward way to
fix this is to simply skip dropping the extent map range for this failed
range.

I shortened the test by using error injection to stress the area to make
it easier to reproduce.  With this patch in place we no longer panic
with my error injection test.

CC: [email protected] # 4.14+
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>

Merge tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

- fix missing TLB flush during early boot on SPARSEMEM_VMEMMAP
   configurations

- fixes to correctly implement the break-before-make behavior requried
   by the ISA for NAPOT mappings

- fix a missing TLB flush on intermediate mapping changes

- fix build warning about a missing declaration of overflow_stack

- fix performace regression related to incorrect tracking of completed
   batch TLB flushes

* tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix arch_tlbbatch_flush() by clearing the batch cpumask
  riscv: declare overflow_stack as exported from traps.c
  riscv: Fix arch_hugetlb_migration_supported() for NAPOT
  riscv: Flush the tlb when a page directory is freed
  riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled
  riscv: Fix set_huge_pte_at() for NAPOT mapping
  riscv: mm: execute local TLB flush after populating vmemmap

Merge tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Fix broken direct trampolines being called when another callback is
   attached the same function.

   ARM 64 does not support FTRACE_WITH_REGS, and when it added direct
   trampoline calls from ftrace, it removed the "WITH_REGS" flag from
   the ftrace_ops for direct trampolines. This broke x86 as x86 requires
   direct trampolines to have WITH_REGS.

   This wasn't noticed because direct trampolines work as long as the
   function it is attached to is not shared with other callbacks (like
   the function tracer). When there are other callbacks, a helper
   trampoline is called, to call all the non direct callbacks and when
   it returns, the direct trampoline is called.

   For x86, the direct trampoline sets a flag in the regs field to tell
   the x86 specific code to call the direct trampoline. But this only
   works if the ftrace_ops had WITH_REGS set. ARM does things
   differently that does not require this. For now, set WITH_REGS if the
   arch supports WITH_REGS (which ARM does not), and this makes it work
   for both ARM64 and x86.

- Fix wasted memory in the saved_cmdlines logic.

   The saved_cmdlines is a cache that maps PIDs to COMMs that tracing
   can use. Most trace events only save the PID in the event. The
   saved_cmdlines file lists PIDs to COMMs so that the tracing tools can
   show an actual name and not just a PID for each event. There's an
   array of PIDs that map to a small set of saved COMM strings. The
   array is set to PID_MAX_DEFAULT which is usually set to 32768. When a
   PID comes in, it will add itself to this array along with the index
   into the COMM array (note if the system allows more than
   PID_MAX_DEFAULT, this cache is similar to cache lines as an update of
   a PID that has the same PID_MAX_DEFAULT bits set will flush out
   another task with the same matching bits set).

   A while ago, the size of this cache was changed to be dynamic and the
   array was moved into a structure and created with kmalloc(). But this
   new structure had the size of 131104 bytes, or 0x20020 in hex. As
   kmalloc allocates in powers of two, it was actually allocating
   0x40000 bytes (262144) leaving 131040 bytes of wasted memory. The
   last element of this structure was a pointer to the COMM string array
   which defaulted to just saving 128 COMMs.

   By changing the last field of this structure to a variable length
   string, and just having it round up to fill the allocated memory, the
   default size of the saved COMM cache is now 8190. This not only uses
   the wasted space, but actually saves space by removing the extra
   allocation for the COMM names.

* tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix wasted memory in saved_cmdlines logic
  ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default

Merge tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

- remove unnecessary initial values of kprobes local variables

- probe-events parser bug fixes:

    - calculate the argument size and format string after setting type
      information from BTF, because BTF can change the size and format
      string.

    - show $comm parse error correctly instead of failing silently.

* tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobes: Remove unnecessary initial values of variables
  tracing/probes: Fix to set arg size and fmt after setting type from BTF
  tracing/probes: Fix to show a parse error for bad type for $comm

PCI: Fix active state requirement in PME polling

The commit noted in fixes added a bogus requirement that runtime PM managed
devices need to be in the RPM_ACTIVE state for PME polling. In fact, only
devices in low power states should be polled.

However there's still a requirement that the device config space must be
accessible, which has implications for both the current state of the polled
device and the parent bridge, when present. It's not sufficient to assume
the bridge remains in D0 and cases have been observed where the bridge
passes the D0 test, but the PM state indicates RPM_SUSPENDING and config
space of the polled device becomes inaccessible during pci_pme_wakeup().

Therefore, since the bridge is already effectively required to be in the
RPM_ACTIVE state, formalize this in the code and elevate the PM usage count
to maintain the state while polling the subordinate device.

This resolves a regression reported in the bugzilla below where a
Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint
downstream of a bridge in a D3hot power state.

Link: https://lore.kernel.org/r/[email protected]
Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
Reported-by: Sanath S <[email protected]>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Tested-by: Sanath S <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
Cc: Lukas Wunner <[email protected]>
Cc: Mika Westerberg <[email protected]>

Merge tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:
"The only notable change here is the patch that changes the way we deal
  with spurious errors from the EFI memory attribute protocol. This will
  be backported to v6.6, and is intended to ensure that we will not
  paint ourselves into a corner when we tighten this further in order to
  comply with MS requirements on signed EFI code.

  Note that this protocol does not currently exist in x86 production
  systems in the field, only in Microsoft's fork of OVMF, but it will be
  mandatory for Windows logo certification for x86 PCs in the future.

   - Tighten ELF relocation checks on the RISC-V EFI stub

   - Give up if the new EFI memory attributes protocol fails spuriously
     on x86

   - Take care not to place the kernel in the lowest 16 MB of DRAM on
     x86

   - Omit special purpose EFI memory from memblock

   - Some fixes for the CXL CPER reporting code

   - Make the PE/COFF layout of mixed-mode capable images comply with a
     strict interpretation of the spec"

* tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
  cxl/trace: Remove unnecessary memcpy's
  cxl/cper: Fix errant CPER prints for CXL events
  efi: Don't add memblocks for soft-reserved memory
  efi: runtime: Fix potential overflow of soft-reserved region size
  efi/libstub: Add one kernel-doc comment
  x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR
  x86/efistub: Give up if memory attribute protocol returns an error
  riscv/efistub: Tighten ELF relocation check
  riscv/efistub: Ensure GP-relative addressing is not used

Merge tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

- Fix an unintentional truncation of DWC MSI-X address to 32 bits and
   update similar MSI code to match (Dan Carpenter)

* tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: dwc: Clean up dw_pcie_ep_raise_msi_irq() alignment
  PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()

Merge tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- coretemp: Various fixes, and increase number of supported CPU cores

- aspeed-pwm-tacho: Add missing mutex protection

* tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (coretemp) Enlarge per package core count limit
  hwmon: (coretemp) Fix bogus core_id to attr name mapping
  hwmon: (coretemp) Fix out-of-bounds memory access
  hwmon: (aspeed-pwm-tacho) mutex for tach reading

Merge tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"MMC core:
   - Allow non-sleeping read-only slot-gpio

  MMC host:
   - sdhci-pci-o2micro: Fix a warm reboot BIOS issue"

* tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: slot-gpio: Allow non-sleeping GPIO ro
  mmc: sdhci-pci-o2micro: Fix a warm reboot issue that disk can't be detected by BIOS

Merge tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull pmdomain fixes from Ulf Hansson:
"Core:
   - Move the unused cleanup to a _sync initcall

  Providers:
   - mediatek: Fix race conditions at probe/remove with genpd
   - renesas: r8a77980-sysc: CR7 must be always on"

* tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: mediatek: fix race conditions with genpd
  pmdomain: renesas: r8a77980-sysc: CR7 must be always on
  pmdomain: core: Move the unused cleanup to a _sync initcall

Merge tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:

- remove the new GPIO device from the global list unconditionally in
error path in core GPIOLIB

* tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: remove GPIO device from the list unconditionally in error path

Merge tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Regular weekly fixes, xe, amdgpu and msm are most of them, with some
  misc in i915, ivpu and nouveau, scattered but nothing too intense at
  this point.

  i915:
   - gvt: docs fix, uninit var, MAINTAINERS

  ivpu:
   - add aborted job status
   - disable d3 hot delay
   - mmu fixes

  nouveau:
   - fix gsp rpc size request
   - fix dma buffer leaks
   - use common code for gsp mem ctor

  xe:
   - Fix a loop in an error path
   - Fix a missing dma-fence reference
   - Fix a retry path on userptr REMAP
   - Workaround for a false gcc warning
   - Fix missing map of the usm batch buffer in the migrate vm.
   - Fix a memory leak.
   - Fix a bad assumption of used page size
   - Fix hitting a BUG() due to zero pages to map.
   - Remove some leftover async bind queue relics

  amdgpu:
   - Misc NULL/bounds check fixes
   - ODM pipe policy fix
   - Aborted suspend fixes
   - JPEG 4.0.5 fix
   - DCN 3.5 fixes
   - PSP fix
   - DP MST fix
   - Phantom pipe fix
   - VRAM vendor fix
   - Clang fix
   - SR-IOV fix

  msm:
   - DPU:
      - fix for kernel doc warnings and smatch warnings in dpu_encoder
      - fix for smatch warning in dpu_encoder
      - fix the bus bandwidth value for SDM670
   - DP:
      - fixes to handle unknown bpc case correctly for DP
      - fix for MISC0 programming
   - GPU:
      - dmabuf vmap fix
      - a610 UBWC corruption fix (incorrect hbb)
      - revert a commit that was making GPU recovery unreliable"

* tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm: (43 commits)
  drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR
  drm/xe/vm: don't ignore error when in_kthread
  drm/xe: Assume large page size if VMA not yet bound
  drm/xe/display: Fix memleak in display initialization
  drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
  drm/xe: circumvent bogus stringop-overflow warning
  drm/xe: Pick correct userptr VMA to repin on REMAP op failure
  drm/xe: Take a reference in xe_exec_queue_last_fence_get()
  drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
  drm/amdgpu: Fix HDP flush for VFs on nbio v7.9
  drm/amd/display: Implement bounds check for stream encoder creation in DCN301
  drm/amd/display: Increase frame-larger-than for all display_mode_vba files
  drm/amd/display: Clear phantom stream count and plane count
  drm/amdgpu: Avoid fetching VRAM vendor info
  drm/amd/display: Disable ODM by default for DCN35
  drm/amd/display: Update phantom pipe enable / disable sequence
  drm/amd/display: Fix MST Null Ptr for RV
  drm/amdgpu: Fix shared buff copy to user
  drm/amd/display: Increase eval/entry delay for DCN35
  drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
  ...

perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
report incorrect child count. The failing crosspoints report 8 children
while they only have two.

When the driver tries to access the inexistent child nodes, it believes it
has reached an invalid node type and probing fails. The workaround is to
ignore those incorrect child nodes and continue normally.

Signed-off-by: Ilkka Koskinen <[email protected]>
[ rm: rewrote simpler generalised version ]
Tested-by: Ilkka Koskinen <[email protected]>
Signed-off-by: Robin Murphy <[email protected]>
Link: https://lore.kernel.org/r/ce4b1442135fe03d0de41859b04b268c88c854a3.1707498577.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <[email protected]>

arm64: jump_label: use constraints "Si" instead of "i"

The generic constraint "i" seems to be copied from x86 or arm (and with
a redundant generic operand modifier "c"). It works with -fno-PIE but
not with -fPIE/-fPIC in GCC's aarch64 port.

The machine constraint "S", which denotes a symbol or label reference
with a constant offset, supports PIC and has been available in GCC since
2012 and in Clang since 7.0. However, Clang before 19 does not support
"S" on a symbol with a constant offset [1] (e.g.
`static_key_false(&nf_hooks_needed[pf][hook])` in
include/linux/netfilter.h), so we use "i" as a fallback.

Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Fangrui Song <[email protected]>
Link: https://github.com/llvm/llvm-project/pull/80255
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

arm64: fix typo in comments

fix typo in comments

thath -> that

Signed-off-by: Seongsu Park <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

perf: CXL: fix mismatched cpmu event opcode

S2M NDR BI-ConflictAck opcode is described as 4 in the CXL
r3.0 3.3.9 Table 3.43. However, it is defined as 3 in macro definition.

Fixes: 5d7107c72796 ("perf: CXL Performance Monitoring Unit driver")
Signed-off-by: Hojin Nam <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Link: https://lore.kernel.org/r/20240208013415epcms2p2904187c8a863f4d0d2adc980fb91a2dc@epcms2p2
Signed-off-by: Will Deacon <[email protected]>

arm64/signal: Don't assume that TIF_SVE means we saved SVE state

When we are in a syscall we will only save the FPSIMD subset even though
the task still has access to the full register set, and on context switch
we will only remove TIF_SVE when loading the register state. This means
that the signal handling code should not assume that TIF_SVE means that
the register state is stored in SVE format, it should instead check the
format that was recorded during save.

Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6

The kernel built with MCRUSOE is unbootable on Transmeta Crusoe.  It shows
the following error message:

  This kernel requires an i686 CPU, but only detected an i586 CPU.
  Unable to boot - please use a kernel appropriate for your CPU.

Remove MCRUSOE from the condition introduced in commit in Fixes, effectively
changing X86_MINIMUM_CPU_FAMILY back to 5 on that machine, which matches the
CPU family given by CPUID.

  [ bp: Massage commit message. ]

Fixes: 25d76ac88821 ("x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig")
Signed-off-by: Aleksander Mazur <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: H. Peter Anvin <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

spi: spi-ppc4xx: include missing platform_device.h

the driver currently fails to compile on 6.8-rc3 due to:
| spi-ppc4xx.c: In function ‘spi_ppc4xx_of_probe’:
| @346:36: error: invalid use of undefined type ‘struct platform_device’
| 346 | struct device_node *np = op->dev.of_node;
| | ^~
| ... (more similar errors)

it was working with 6.7. Looks like it only needed the include
and its compiling fine!

Signed-off-by: Christian Lamparter <[email protected]>
Link: https://lore.kernel.org/r/3eb3f9c4407ba99d1cd275662081e46b9e839173.1707490664.git.chunkeey@gmail.com
Signed-off-by: Mark Brown <[email protected]>

ASoC: cs35l56: Remove default from IRQ1_CFG register

The driver never uses the IRQ1_CFG register so there's no need to provide
a default value. It's set as a readable register only for debugging
through the regmap registers file.

A system-specific firmware could overwrite this register with a non-default
value. Therefore the driver can't hardcode what the initial value actually
is. As the register is only for debugging the value can be left unknown
until someone wants to read it through debugfs.

Signed-off-by: Richard Fitzgerald <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

ALSA: hda/cs35l56: select intended config FW_CS_DSP

Commit 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic
CS35L56 amplifier") adds configs SND_HDA_SCODEC_CS35L56_{I2C,SPI},
which selects the non-existing config CS_DSP. Note the renaming in
commit d7cfdf17cb9d ("firmware: cs_dsp: Rename KConfig symbol CS_DSP ->
FW_CS_DSP"), though.

Select the intended config FW_CS_DSP.

This broken select command probably was not noticed as the configs also
select SND_HDA_CS_DSP_CONTROLS and this then selects FW_CS_DSP. So, the
select FW_CS_DSP could actually be dropped, but we will keep this
redundancy in place as the author originally also intended to have this
redundancy of selects in place.

Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier")
Signed-off-by: Lukas Bulwahn <[email protected]>
Reviewed-by: Simon Trimmer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

tracing: Fix wasted memory in saved_cmdlines logic

While looking at improving the saved_cmdlines cache I found a huge amount
of wasted memory that should be used for the cmdlines.

The tracing data saves pids during the trace. At sched switch, if a trace
occurred, it will save the comm of the task that did the trace. This is
saved in a "cache" that maps pids to comms and exposed to user space via
the /sys/kernel/tracing/saved_cmdlines file. Currently it only caches by
default 128 comms.

The structure that uses this creates an array to store the pids using
PID_MAX_DEFAULT (which is usually set to 32768). This causes the structure
to be of the size of 131104 bytes on 64 bit machines.

In hex: 131104 = 0x20020, and since the kernel allocates generic memory in
powers of two, the kernel would allocate 0x40000 or 262144 bytes to store
this structure. That leaves 131040 bytes of wasted space.

Worse, the structure points to an allocated array to store the comm names,
which is 16 bytes times the amount of names to save (currently 128), which
is 2048 bytes. Instead of allocating a separate array, make the structure
end with a variable length string and use the extra space for that.

This is similar to a recommendation that Linus had made about eventfs_inode names:

https://lore.kernel.org/all/20240130190355 [email protected]/

Instead of allocating a separate string array to hold the saved comms,
have the structure end with: char saved_cmdlines[]; and round up to the
next power of two over sizeof(struct saved_cmdline_buffers) + num_cmdlines * TASK_COMM_LEN
It will use this extra space for the saved_cmdline portion.

Now, instead of saving only 128 comms by default, by using this wasted
space at the end of the structure it can save over 8000 comms and even
saves space by removing the need for allocating the other array.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Vincent Donnefort <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Mete Durlu <[email protected]>
Fixes: 939c7a4f04fcd ("tracing: Introduce saved_cmdlines_size file")
Signed-off-by: Steven Rostedt (Google) <[email protected]>