Git Repo - linux.git/log

Merge tag '6.4-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
"Four small smb3 client fixes:

   - two small fixes suggested by kernel test robot

   - small cleanup fix

   - update Paulo's email address in the maintainer file"

* tag '6.4-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: address unused variable warning
  smb: delete an unnecessary statement
  smb3: missing null check in SMB2_change_notify
  smb3: update a reviewer email in MAINTAINERS file

mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()

There was a bug where this code forgot to unlock the tdev->mutex if the
kzalloc() failed. Fix this issue, by moving the allocation outside the
lock.

Fixes: 2d1e952a2b8e ("mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Lee Jones <[email protected]>
Signed-off-by: Jassi Brar <[email protected]>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:

- Fix 64K ARM page size support in bnxt_re and efa

- bnxt_re fixes for a memory leak, incorrect error handling and a
   remove a bogus FW failure when running on a VF

- Update MAINTAINERS for hns and efa

- Fix two rxe regressions added this merge window in error unwind and
   incorrect spinlock primitives

- hns gets a better algorithm for allocating page tables to avoid
   running out of resources, and a timeout adjustment

- Fix a text case failure in hns

- Use after free in irdma and fix incorrect construction of a WQE
   causing mis-execution

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Fix Local Invalidate fencing
  RDMA/irdma: Prevent QP use after free
  MAINTAINERS: Update maintainer of Amazon EFA driver
  RDMA/bnxt_re: Do not enable congestion control on VFs
  RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
  RDMA/bnxt_re: Fix a possible memory leak
  RDMA/hns: Modify the value of long message loopback slice
  RDMA/hns: Fix base address table allocation
  RDMA/hns: Fix timeout attr in query qp for HIP08
  RDMA/efa: Fix unsupported page sizes in device
  RDMA/rxe: Convert spin_{lock_bh,unlock_bh} to spin_{lock_irqsave,unlock_irqrestore}
  RDMA/rxe: Fix double unlock in rxe_qp.c
  MAINTAINERS: Update maintainers of HiSilicon RoCE
  RDMA/bnxt_re: Fix the page_size used during the MR creation

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
"Fix two regressions in ext4 and a number of issues reported by syzbot"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: enable the lazy init thread when remounting read/write
  ext4: fix fsync for non-directories
  ext4: add lockdep annotations for i_data_sem for ea_inode's
  ext4: disallow ea_inodes with extended attributes
  ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find()
  ext4: add EA_INODE checking to ext4_iget()

HID: logitech-hidpp: Handle timeout differently from busy

If an attempt at contacting a receiver or a device fails because the
receiver or device never responds, don't restart the communication, only
restart it if the receiver or device answers that it's busy, as originally
intended.

This was the behaviour on communication timeout before commit 586e8fede795
("HID: logitech-hidpp: Retry commands when device is busy").

This fixes some overly long waits in a critical path on boot, when
checking whether the device is connected by getting its HID++ version.

Signed-off-by: Bastien Nocera <[email protected]>
Suggested-by: Mark Lord <[email protected]>
Fixes: 586e8fede795 ("HID: logitech-hidpp: Retry commands when device is busy")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217412
Signed-off-by: Jiri Kosina <[email protected]>

udp6: Fix race condition in udp6_sendmsg & connect

Syzkaller got the following report:
BUG: KASAN: use-after-free in sk_setup_caps+0x621/0x690 net/core/sock.c:2018
Read of size 8 at addr ffff888027f82780 by task syz-executor276/3255

The function sk_setup_caps (called by ip6_sk_dst_store_flow->
ip6_dst_store) referenced already freed memory as this memory was
freed by parallel task in udpv6_sendmsg->ip6_sk_dst_lookup_flow->
sk_dst_check.

          task1 (connect)              task2 (udp6_sendmsg)
        sk_setup_caps->sk_dst_set |
                                  |  sk_dst_check->
                                  |      sk_dst_set
                                  |      dst_release
        sk_setup_caps references  |
        to already freed dst_entry|

The reason for this race condition is: sk_setup_caps() keeps using
the dst after transferring the ownership to the dst cache.

Found by Linux Verification Center (linuxtesting.org) with syzkaller.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Vladislav Efanov <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report

The current code for the length calculation wrongly truncates the reported
length of the groups array, causing an under report of the subscribed
groups. To fix this, use 'BITS_TO_BYTES()' which rounds up the
division by 8.

Fixes: b42be38b2778 ("netlink: add API to retrieve all group memberships")
Signed-off-by: Pedro Tammela <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: sched: fix NULL pointer dereference in mq_attach

When use the following command to test:
1)ip link add bond0 type bond
2)ip link set bond0 up
3)tc qdisc add dev bond0 root handle ffff: mq
4)tc qdisc replace dev bond0 parent ffff:fff1 handle ffff: mq

The kernel reports NULL pointer dereference issue. The stack information
is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Internal error: Oops: 0000000096000006 [#1] SMP
Modules linked in:
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mq_attach+0x44/0xa0
lr : qdisc_graft+0x20c/0x5cc
sp : ffff80000e2236a0
x29: ffff80000e2236a0 x28: ffff0000c0e59d80 x27: ffff0000c0be19c0
x26: ffff0000cae3e800 x25: 0000000000000010 x24: 00000000fffffff1
x23: 0000000000000000 x22: ffff0000cae3e800 x21: ffff0000c9df4000
x20: ffff0000c9df4000 x19: 0000000000000000 x18: ffff80000a934000
x17: ffff8000f5b56000 x16: ffff80000bb08000 x15: 0000000000000000
x14: 0000000000000000 x13: 6b6b6b6b6b6b6b6b x12: 6b6b6b6b00000001
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : ffff0000c0be0730 x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008
x5 : ffff0000cae3e864 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000001 x1 : ffff8000090bc23c x0 : 0000000000000000
Call trace:
mq_attach+0x44/0xa0
qdisc_graft+0x20c/0x5cc
tc_modify_qdisc+0x1c4/0x664
rtnetlink_rcv_msg+0x354/0x440
netlink_rcv_skb+0x64/0x144
rtnetlink_rcv+0x28/0x34
netlink_unicast+0x1e8/0x2a4
netlink_sendmsg+0x308/0x4a0
sock_sendmsg+0x64/0xac
____sys_sendmsg+0x29c/0x358
___sys_sendmsg+0x90/0xd0
__sys_sendmsg+0x7c/0xd0
__arm64_sys_sendmsg+0x2c/0x38
invoke_syscall+0x54/0x114
el0_svc_common.constprop.1+0x90/0x174
do_el0_svc+0x3c/0xb0
el0_svc+0x24/0xec
el0t_64_sync_handler+0x90/0xb4
el0t_64_sync+0x174/0x178

This is because when mq is added for the first time, qdiscs in mq is set
to NULL in mq_attach(). Therefore, when replacing mq after adding mq, we
need to initialize qdiscs in the mq before continuing to graft. Otherwise,
it will couse NULL pointer dereference issue in mq_attach(). And the same
issue will occur in the attach functions of mqprio, taprio and htb.
ffff:fff1 means that the repalce qdisc is ingress. Ingress does not allow
any qdisc to be attached. Therefore, ffff:fff1 is incorrectly used, and
the command should be dropped.

Fixes: 6ec1c69a8f64 ("net_sched: add classful multiqueue dummy scheduler")
Signed-off-by: Zhengchao Shao <[email protected]>
Tested-by: Peilin Ye <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'net-sched-fixes-for-sch_ingress-and-sch_clsact'

Peilin Ye says:

====================
net/sched: Fixes for sch_ingress and sch_clsact

These are v6 fixes for ingress and clsact Qdiscs, including only first 4
patches (already tested and reviewed) from v5. Patch 5 and 6 from
previous versions are still under discussion and will be sent separately.

[a] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b

Link to v5: https://lore.kernel.org/r/cover.1684887977 [email protected]/
Link to v4: https://lore.kernel.org/r/cover.1684825171 [email protected]/
Link to v3 (incomplete): https://lore.kernel.org/r/cover.1684821877 [email protected]/
Link to v2: https://lore.kernel.org/r/cover.1684796705 [email protected]/
Link to v1: https://lore.kernel.org/r/cover.1683326865 [email protected]/
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net/sched: Prohibit regrafting ingress or clsact Qdiscs

Currently, after creating an ingress (or clsact) Qdisc and grafting it
under TC_H_INGRESS (TC_H_CLSACT), it is possible to graft it again under
e.g. a TBF Qdisc:

  $ ip link add ifb0 type ifb
  $ tc qdisc add dev ifb0 handle 1: root tbf rate 20kbit buffer 1600 limit 3000
  $ tc qdisc add dev ifb0 clsact
  $ tc qdisc link dev ifb0 handle ffff: parent 1:1
  $ tc qdisc show dev ifb0
  qdisc tbf 1: root refcnt 2 rate 20Kbit burst 1600b lat 560.0ms
  qdisc clsact ffff: parent ffff:fff1 refcnt 2
                                      ^^^^^^^^

clsact's refcount has increased: it is now grafted under both
TC_H_CLSACT and 1:1.

ingress and clsact Qdiscs should only be used under TC_H_INGRESS
(TC_H_CLSACT).  Prohibit regrafting them.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b929c ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Peilin Ye <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs

Currently it is possible to add e.g. an HTB Qdisc under ffff:fff1
(TC_H_INGRESS, TC_H_CLSACT):

  $ ip link add name ifb0 type ifb
  $ tc qdisc add dev ifb0 parent ffff:fff1 htb
  $ tc qdisc add dev ifb0 clsact
  Error: Exclusivity flag on, cannot modify.
  $ drgn
  ...
  >>> ifb0 = netdev_get_by_name(prog, "ifb0")
  >>> qdisc = ifb0.ingress_queue.qdisc_sleeping
  >>> print(qdisc.ops.id.string_().decode())
  htb
  >>> qdisc.flags.value_() # TCQ_F_INGRESS
  2

Only allow ingress and clsact Qdiscs under ffff:fff1.  Return -EINVAL
for everything else.  Make TCQ_F_INGRESS a static flag of ingress and
clsact Qdiscs.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b929c ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Peilin Ye <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/sched: sch_clsact: Only create under TC_H_CLSACT

clsact Qdiscs are only supposed to be created under TC_H_CLSACT (which
equals TC_H_INGRESS). Return -EOPNOTSUPP if 'parent' is not
TC_H_CLSACT.

Fixes: 1f211a1b929c ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Peilin Ye <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/sched: sch_ingress: Only create under TC_H_INGRESS

ingress Qdiscs are only supposed to be created under TC_H_INGRESS.
Return -EOPNOTSUPP if 'parent' is not TC_H_INGRESS, similar to
mq_init().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: [email protected]
Closes: https://lore.kernel.org/r/[email protected]/
Tested-by: Pedro Tammela <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Peilin Ye <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
"One bug fix and two build warning fixes:

   - call proper end bio callback for metadata RAID0 in a rare case of
     an unaligned block

   - fix uninitialized variable (reported by gcc 10.2)

   - fix warning about potential access beyond array bounds on mips64
     with 64k pages (runtime check would not allow that)"

* tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix csum_tree_block page iteration to avoid tripping on -Werror=array-bounds
  btrfs: fix an uninitialized variable warning in btrfs_log_inode
  btrfs: call btrfs_orig_bbio_end_io in btrfs_end_bio_work

Merge tag 'perf-tools-fixes-for-v6.4-2-2023-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Fix BPF CO-RE naming convention for checking the availability of
   fields on 'union perf_mem_data_src' on the running kernel

- Remove the use of llvm-strip on BPF skel object files, not needed,
   fixes a build breakage when the llvm package, that contains it in
   most distros, isn't installed

- Fix tools that use both evsel->{bpf_counter_list,bpf_filters},
   removing them from a union

- Remove extra "--" from the 'perf ftrace latency' --use-nsec option,
   previously it was working only when using the '-n' alternative

- Don't stop building when both binutils-devel and a C++ compiler isn't
   available to compile the alternative C++ demangle support code,
   disable that feature instead

- Sync the linux/in.h and coresight-pmu.h header copies with the kernel
   sources

- Fix relative include path to cs-etm.h

* tag 'perf-tools-fixes-for-v6.4-2-2023-05-30' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf evsel: Separate bpf_counter_list and bpf_filters, can be used at the same time
  tools headers UAPI: Sync the linux/in.h with the kernel sources
  perf cs-etm: Copy kernel coresight-pmu.h header
  perf bpf: Do not use llvm-strip on BPF binary
  perf build: Don't compile demangle-cxx.cpp if not necessary
  perf arm: Fix include path to cs-etm.h
  perf bpf filter: Fix a broken perf sample data naming for BPF CO-RE
  perf ftrace latency: Remove unnecessary "--" from --use-nsec option

Merge tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
"The most important fix here is for missing dropping of the RCU read
  lock when syncing maple tree register caches, the physical devices I
  have that use the code don't do any syncing so I'd only ever tested
  this with virtual devices and missed the fact that we need to drop the
  lock in order to write to buses that need to sleep.

  Otherwise there's a fix for an edge case when splitting up large batch
  writes which has been lurking for a long time, a check to make sure
  nobody writes new drivers with a bug that was found in several
  SoundWire drivers and a tweak to the way the new kunit tests are
  enabled to ensure they don't cause regmap to be enabled when it
  wouldn't otherwise be"

* tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: maple: Drop the RCU read lock while syncing registers
  regmap: sdw: check for invalid multi-register writes config
  regmap: Account for register length when chunking
  regmap: REGMAP_KUNIT should not select REGMAP

Merge tag 'modules-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull modules fix from Luis Chamberlain:
"A fix is provided for ia64. Even though ia64 is on life support it
  helps to fix issues if we can. Thanks to Linus for doing tons of the
  ia64 debugging"

* tag 'modules-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: fix module load for ia64

ext4: enable the lazy init thread when remounting read/write

In commit a44be64bbecb ("ext4: don't clear SB_RDONLY when remounting
r/w until quota is re-enabled") we defer clearing tyhe SB_RDONLY flag
in struct super. However, we didn't defer when we checked sb_rdonly()
to determine the lazy itable init thread should be enabled, with the
next result that the lazy inode table initialization would not be
properly started. This can cause generic/231 to fail in ext4's
nojournal mode.

Fix this by moving when we decide to start or stop the lazy itable
init thread to after we clear the SB_RDONLY flag when we are
remounting the file system read/write.

Fixes a44be64bbecb ("ext4: don't clear SB_RDONLY when remounting r/w until...")

Signed-off-by: Theodore Ts'o <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: fix fsync for non-directories

Commit e360c6ed7274 ("ext4: Drop special handling of journalled data
from ext4_sync_file()") simplified ext4_sync_file() by dropping special
handling of journalled data mode as it was not needed anymore. However
that branch was also used for directories and symlinks and since the
fastcommit code does not track metadata changes to non-regular files, the
change has caused e.g. fsync(2) on directories to not commit transaction
as it should. Fix the problem by adding handling for non-regular files.

Fixes: e360c6ed7274 ("ext4: Drop special handling of journalled data from ext4_sync_file()")
Reported-by: Eric Whitney <[email protected]>
Link: https://lore.kernel.org/all/ZFqO3xVnmhL7zv1x@debian-BULLSEYE-live-builder-AMD64
Signed-off-by: Jan Kara <[email protected]>
Tested-by: Eric Whitney <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: add lockdep annotations for i_data_sem for ea_inode's

Treat i_data_sem for ea_inodes as being in their own lockdep class to
avoid lockdep complaints about ext4_setattr's use of inode_lock() on
normal inodes potentially causing lock ordering with i_data_sem on
ea_inodes in ext4_xattr_inode_write(). However, ea_inodes will be
operated on by ext4_setattr(), so this isn't a problem.

Cc: [email protected]
Link: https://syzkaller.appspot.com/bug?extid=298c5d8fb4a128bc27b0
Reported-by: [email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: disallow ea_inodes with extended attributes

An ea_inode stores the value of an extended attribute; it can not have
extended attributes itself, or this will cause recursive nightmares.
Add a check in ext4_iget() to make sure this is the case.

Cc: [email protected]
Reported-by: [email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

ext4: set lockdep subclass for the ea_inode in ext4_xattr_inode_cache_find()

If the ea_inode has been pushed out of the inode cache while there is
still a reference in the mb_cache, the lockdep subclass will not be
set on the inode, which can lead to some lockdep false positives.

Fixes: 33d201e0277b ("ext4: fix lockdep warning about recursive inode locking")
Cc: [email protected]
Reported-by: [email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

module: fix module load for ia64

Frank reported boot regression in ia64 as:

ELILO v3.16 for EFI/IA-64
..
Uncompressing Linux... done
Loading file AC100221.initrd.img...done
[    0.000000] Linux version 6.4.0-rc3 (root@x4270) (ia64-linux-gcc
(GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP Thu May 25 15:52:20
CEST 2023
[    0.000000] efi: EFI v1.1 by HP
[    0.000000] efi: SALsystab=0x3ee7a000 ACPI 2.0=0x3fe2a000
ESI=0x3ee7b000 SMBIOS=0x3ee7c000 HCDP=0x3fe28000
[    0.000000] PCDP: v3 at 0x3fe28000
[    0.000000] earlycon: uart8250 at MMIO 0x00000000f4050000 (options
'9600n8')
[    0.000000] printk: bootconsole [uart8250] enabled
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x000000003FE2A000 000028 (v02 HP    )
[    0.000000] ACPI: XSDT 0x000000003FE2A02C 0000CC (v01 HP     rx2620
00000000 HP   00000000)
[...]
[    3.793350] Run /init as init process
Loading, please wait...
Starting systemd-udevd version 252.6-1
[    3.951100] ------------[ cut here ]------------
[    3.951100] WARNING: CPU: 6 PID: 140 at kernel/module/main.c:1547
__layout_sections+0x370/0x3c0
[    3.949512] Unable to handle kernel paging request at virtual address
1000000000000000
[    3.951100] Modules linked in:
[    3.951100] CPU: 6 PID: 140 Comm: (udev-worker) Not tainted 6.4.0-rc3 #1
[    3.956161] (udev-worker)[142]: Oops 11003706212352 [1]
[    3.951774] Hardware name: hp server rx2620                   , BIOS
04.29
11/30/2007
[    3.951774]
[    3.951774] Call Trace:
[    3.958339] Unable to handle kernel paging request at virtual address
1000000000000000
[    3.956161] Modules linked in:
[    3.951774]  [<a0000001000156d0>] show_stack.part.0+0x30/0x60
[    3.951774]                                 sp=e000000183a67b20
bsp=e000000183a61628
[    3.956161]
[    3.956161]

which bisect to module_memory change [1].

Debug showed that ia64 uses some special sections:

__layout_sections: section .got (sh_flags 10000002) matched to MOD_INVALID
__layout_sections: section .sdata (sh_flags 10000003) matched to MOD_INVALID
__layout_sections: section .sbss (sh_flags 10000003) matched to MOD_INVALID

All these sections are loaded to module core memory before [1].

Fix ia64 boot by loading these sections to MOD_DATA (core rw data).

[1] commit ac3b43283923 ("module: replace module_layout with module_memory")

Fixes: ac3b43283923 ("module: replace module_layout with module_memory")
Reported-by: Frank Scheiner <[email protected]>
Closes: https://lists.debian.org/debian-ia64/2023/05/msg00010.html
Closes: https://marc.info/?l=linux-ia64&m=168509859125505
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Tested-by: John Paul Adrian Glaubitz <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>

Merge branch 'selftests-mptcp-skip-tests-not-supported-by-old-kernels-part-1'

Matthieu Baerts says:

====================
selftests: mptcp: skip tests not supported by old kernels (part 1)

After a few years of increasing test coverage in the MPTCP selftests, we
realised [1] the last version of the selftests is supposed to run on old
kernels without issues.

Supporting older versions is not that easy for this MPTCP case: these
selftests are often validating the internals by checking packets that
are exchanged, when some MIB counters are incremented after some
actions, how connections are getting opened and closed in some cases,
etc. In other words, it is not limited to the socket interface between
the userspace and the kernelspace. In addition, the current selftests
run a lot of different sub-tests but the TAP13 protocol used in the
selftests don't support sub-tests: in other words, one failure in
sub-tests implies that the whole selftest is seen as failed at the end
because sub-tests are not tracked. It is then important to skip
sub-tests not supported by old kernels.

To minimise the modifications and reduce the complexity to support old
versions, the idea is to look at external signs and skip the whole
selftests or just some sub-tests before starting them.

This first part focuses on marking the different selftests as skipped
if MPTCP is not even supported. That's what is done in patches 2 to 8.
Patch 2/8 introduces a new file (mptcp_lib.sh) to be able to re-use some
helpers in the different selftests. The first MPTCP selftest has been
introduced in v5.6.

Patch 1/8 is a bit different but still linked: it modifies mptcp_join.sh
selftest not to use 'cmp --bytes' which is not supported by the BusyBox
implementation. It is apparently quite common to use BusyBox in CI
environments. This tool is needed for a subtest introduced in v6.1.

Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrbOMsDZLg@mail.gmail.com/
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
====================

Link: https://lore.kernel.org/r/20230528-upstream-net-20230528-mptcp-selftests-support-old-kernels-part-1-v1-0-a32d85577fc6@tessares.net
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: userspace pm: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: sockopt: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: dc65fe82fb07 ("selftests: mptcp: add packet mark test case")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: simult flows: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 1a418cb8e888 ("mptcp: simult flow self-tests")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: diag: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: join: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: pm nl: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: eedbc685321b ("selftests: add PM netlink functional tests")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: connect: skip if MPTCP is not supported

Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped". Note that this check can also
mark the test as failed if 'SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES' env
var is set to 1: by doing that, we can make sure a test is not being
skipped by mistake.

A new shared file is added here to be able to re-used the same check in
the different selftests we have.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Cc: [email protected]
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

selftests: mptcp: join: avoid using 'cmp --bytes'

BusyBox's 'cmp' command doesn't support the '--bytes' parameter.

Some CIs -- i.e. LKFT -- use BusyBox and have the mptcp_join.sh test
failing [1] because their 'cmp' command doesn't support this '--bytes'
option:

    cmp: unrecognized option '--bytes=1024'
    BusyBox v1.35.0 () multi-call binary.

    Usage: cmp [-ls] [-n NUM] FILE1 [FILE2]

Instead, 'head --bytes' can be used as this option is supported by
BusyBox. A temporary file is needed for this operation.

Because it is apparently quite common to use BusyBox, it is certainly
better to backport this fix to impacted kernels.

Fixes: 6bf41020b72b ("selftests: mptcp: update and extend fastclose test-cases")
Cc: [email protected]
Link: https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.3-rc5-5-g148341f0a2f5/testrun/16088933/suite/kselftest-net-mptcp/test/net_mptcp_userspace_pm_sh/log
Suggested-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

net: mana: Fix perf regression: remove rx_cqes, tx_cqes counters

The apc->eth_stats.rx_cqes is one per NIC (vport), and it's on the
frequent and parallel code path of all queues. So, r/w into this
single shared variable by many threads on different CPUs creates a
lot caching and memory overhead, hence perf regression. And, it's
not accurate due to the high volume concurrent r/w.

For example, a workload is iperf with 128 threads, and with RPS
enabled. We saw perf regression of 25% with the previous patch
adding the counters. And this patch eliminates the regression.

Since the error path of mana_poll_rx_cq() already has warnings, so
keeping the counter and convert it to a per-queue variable is not
necessary. So, just remove this counter from this high frequency
code path.

Also, remove the tx_cqes counter for the same reason. We have
warnings & other counters for errors on that path, and don't need
to count every normal cqe processing.

Cc: [email protected]
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge branch 'two-fixes-for-smcrv2'

Wen Gu says:

====================
Two fixes for SMCRv2

This patch set includes two bugfix for SMCRv2.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

net/smc: Don't use RMBs not mapped to new link in SMCRv2 ADD LINK

We encountered a crash when using SMCRv2. It is caused by a logical
error in smc_llc_fill_ext_v2().

BUG: kernel NULL pointer dereference, address: 0000000000000014
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G        W   E      6.4.0-rc3+ #44
Workqueue: events smc_llc_add_link_work [smc]
RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc]
RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282
RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a
RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c
R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250
FS:  0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  <TASK>
  smc_llc_send_add_link+0x1ae/0x2f0 [smc]
  smc_llc_srv_add_link+0x2c9/0x5a0 [smc]
  ? cc_mkenc+0x40/0x60
  smc_llc_add_link_work+0xb8/0x140 [smc]
  process_one_work+0x1e5/0x3f0
  worker_thread+0x4d/0x2f0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xe5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

When an alernate RNIC is available in system, SMC will try to add a new
link based on the RNIC for resilience. All the RMBs in use will be mapped
to the new link. Then the RMBs' MRs corresponding to the new link will be
filled into SMCRv2 LLC ADD LINK messages.

However, smc_llc_fill_ext_v2() mistakenly accesses to unused RMBs which
haven't been mapped to the new link and have no valid MRs, thus causing
a crash. So this patch fixes the logic.

Fixes: b4ba4652b3f8 ("net/smc: extend LLC layer for SMC-Rv2")
Signed-off-by: Wen Gu <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

net/smc: Scan from current RMB list when no position specified

When finding the first RMB of link group, it should start from the
current RMB list whose index is 0. So fix it.

Fixes: b4ba4652b3f8 ("net/smc: extend LLC layer for SMC-Rv2")
Signed-off-by: Wen Gu <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

rxrpc: Truncate UTS_RELEASE for rxrpc version

UTS_RELEASE has a maximum length of 64 which can cause rxrpc_version to
exceed the 65 byte message limit.

Per the rx spec[1]: "If a server receives a packet with a type value of 13,
and the client-initiated flag set, it should respond with a 65-byte payload
containing a string that identifies the version of AFS software it is
running."

The current implementation causes a compile error when WERROR is turned on
and/or UTS_RELEASE exceeds the length of 49 (making the version string more
than 64 characters).

Fix this by generating the string during module initialisation and limiting
the UTS_RELEASE segment of the string does not exceed 49 chars. We need to
make sure that the 64 bytes includes "linux-" at the front and " AF_RXRPC"
at the back as this may be used in pattern matching.

Fixes: 44ba06987c0b ("RxRPC: Handle VERSION Rx protocol packets")
Reported-by: Kenny Ho <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: David Howells <[email protected]>
Acked-by: Kenny Ho <[email protected]>
cc: Marc Dionne <[email protected]>
cc: Andrew Lunn <[email protected]>
cc: David Laight <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: [email protected]
cc: [email protected]
Link: https://web.mit.edu/kolya/afs/rx/rx-spec
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Jeffrey Altman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

tcp: Return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set

This patch replaces the tp->mss_cache check in getting TCP_MAXSEG
with tp->rx_opt.user_mss check for CLOSE/LISTEN sock. Since
tp->mss_cache is initialized with TCP_MSS_DEFAULT, checking if
it's zero is probably a bug.

With this change, getting TCP_MAXSEG before connecting will return
default MSS normally, and return user_mss if user_mss is set.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Jack Yang <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/netdev/CANn89i+3kL9pYtkxkwxwNMzvC_w3LNUum_2=3u+UyLBmGmifHA@mail.gmail.com/#t
Signed-off-by: Cambda Zhu <[email protected]>
Link: https://lore.kernel.org/netdev/[email protected]/
Reviewed-by: Jason Xing <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tcp: deny tcp_disconnect() when threads are waiting

Historically connect(AF_UNSPEC) has been abused by syzkaller
and other fuzzers to trigger various bugs.

A recent one triggers a divide-by-zero [1], and Paolo Abeni
was able to diagnose the issue.

tcp_recvmsg_locked() has tests about sk_state being not TCP_LISTEN
and TCP REPAIR mode being not used.

Then later if socket lock is released in sk_wait_data(),
another thread can call connect(AF_UNSPEC), then make this
socket a TCP listener.

When recvmsg() is resumed, it can eventually call tcp_cleanup_rbuf()
and attempt a divide by 0 in tcp_rcv_space_adjust() [1]

This patch adds a new socket field, counting number of threads
blocked in sk_wait_event() and inet_wait_for_connect().

If this counter is not zero, tcp_disconnect() returns an error.

This patch adds code in blocking socket system calls, thus should
not hurt performance of non blocking ones.

Note that we probably could revert commit 499350a5a6e7 ("tcp:
initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore
original tcpi_rcv_mss meaning (was 0 if no payload was ever
received on a socket)

[1]
divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4ddc5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740
Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48
RSP: 0018:ffffc900033af660 EFLAGS: 00010206
RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001
RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8
R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344
FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616
tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681
inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670
sock_recvmsg_nosec net/socket.c:1017 [inline]
sock_recvmsg+0xe2/0x160 net/socket.c:1038
____sys_recvmsg+0x210/0x5a0 net/socket.c:2720
___sys_recvmsg+0xf2/0x180 net/socket.c:2762
do_recvmmsg+0x25e/0x6e0 net/socket.c:2856
__sys_recvmmsg net/socket.c:2935 [inline]
__do_sys_recvmmsg net/socket.c:2958 [inline]
__se_sys_recvmmsg net/socket.c:2951 [inline]
__x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5c0108c0f9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9
RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003
RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000
</TASK>

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <[email protected]>
Reported-by: Paolo Abeni <[email protected]>
Diagnosed-by: Paolo Abeni <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Tested-by: Paolo Abeni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

af_packet: do not use READ_ONCE() in packet_bind()

A recent patch added READ_ONCE() in packet_bind() and packet_bind_spkt()

This is better handled by reading pkt_sk(sk)->num later
in packet_do_bind() while appropriate lock is held.

READ_ONCE() in writers are often an evidence of something being wrong.

Fixes: 822b5a1c17df ("af_packet: Fix data-races of pkt_sk(sk)->num.")
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

netlink: specs: correct types of legacy arrays

ethtool has some attrs which dump multiple scalars into
an attribute. The spec currently expects one attr per entry.

Fixes: a353318ebf24 ("tools: ynl: populate most of the ethtool spec")
Acked-by: Stanislav Fomichev <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: usb: qmi_wwan: Set DTR quirk for BroadMobi BM818

BM818 is based on Qualcomm MDM9607 chipset.

Fixes: 9a07406b00cd ("net: usb: qmi_wwan: Add the BroadMobi BM818 card")
Cc: [email protected]
Signed-off-by: Sebastian Krzyszkowiak <[email protected]>
Acked-by: Bjørn Mork <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ata: libata-scsi: Use correct device no in ata_find_dev()

For devices not attached to a port multiplier and managed directly by
libata, the device number passed to ata_find_dev() must always be lower
than the maximum number of devices returned by ata_link_max_devices().
That is 1 for SATA devices or 2 for an IDE link with master+slave
devices. This device number is the SCSI device ID which matches these
constraints as the IDs are generated per port and so never exceed the
maximum number of devices for the link being used.

However, for libsas managed devices, SCSI device IDs are assigned per
struct scsi_host, leading to device IDs for SATA devices that can be
well in excess of libata per-link maximum number of devices. This
results in ata_find_dev() to always return NULL for libsas managed
devices except for the first device of the target scsi_host with ID
(device number) equal to 0. This issue is visible by executing the
hdparm utility, which fails. E.g.:

hdparm -i /dev/sdX
/dev/sdX:
HDIO_GET_IDENTITY failed: No message of desired type

Fix this by rewriting ata_find_dev() to ignore the device number for
non-PMP attached devices with a link with at most 1 device, that is SATA
devices. For these, the device number 0 is always used to
return the correct pointer to the struct ata_device of the port link.
This change excludes IDE master/slave setups (maximum number of devices
per link is 2) and port-multiplier attached devices. Also, to be
consistant with the fact that SCSI device IDs and channel numbers used
as device numbers are both unsigned int, change the devno argument of
ata_find_dev() to unsigned int.

Reported-by: Xingui Yang <[email protected]>
Fixes: 41bda9c98035 ("libata-link: update hotplug to handle PMP links")
Cc: [email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Jason Yan <[email protected]>

RDMA/irdma: Fix Local Invalidate fencing

If the local invalidate fence is indicated in the WR, only the read fence
is currently being set in WQE. Fix this to set both the read and local
fence in the WQE.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mustafa Ismail <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

RDMA/irdma: Prevent QP use after free

There is a window where the poll cq may use a QP that has been freed.
This can happen if a CQE is polled before irdma_clean_cqes() can clear the
CQE's related to the QP and the destroy QP races to free the QP memory.
then the QP structures are used in irdma_poll_cq. Fix this by moving the
clearing of CQE's before the reference is removed and the QP is destroyed.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mustafa Ismail <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

MAINTAINERS: Update maintainer of Amazon EFA driver

Change EFA driver maintainer from Gal Pressman to myself. Keep Gal as a
reviewer at his request.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael Margolin <[email protected]>
Acked-by: Gal Pressman <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

Merge tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:
"User events:

   - Use long instead of int for storing the enable set/clear bit, as it
     was found that big endian machines could end up using the wrong
     bits.

   - Split allocating mm and attaching it. This keeps the allocation
     separate from the registration and avoids various races.

   - Remove RCU locking around pin_user_pages_remote() as that can
     schedule. The RCU protection is no longer needed with the above
     split of mm allocation and attaching.

   - Rename the "link" fields of the various structs to something more
     meaningful.

   - Add comments around user_event_mm struct usage and locking
     requirements.

  Timerlat tracer:

   - Fix missed wakeup of timerlat thread caused by the timerlat
     interrupt triggering when tracing is off. The timer interrupt
     handler needs to always wake up the timerlat thread regardless if
     tracing is enabled or not, otherwise, it will never wake up.

  Histograms:

   - Fix regression of breaking the "stacktrace" modifier for variables.
     That modifier cannot be used for values, but can be used for
     variables that are passed from one histogram to the next. This was
     broken when adding the restriction to values as the variable logic
     used the same code.

   - Rename the special field "stacktrace" to "common_stacktrace".

     Special fields (that are not actually part of the event, but can
     act just like event fields, like 'comm' and 'timestamp') should be
     prefixed with 'common_' for consistency. To keep backward
     compatibility, 'stacktrace' can still be used (as with the special
     field 'cpu'), but can be overridden if the event has a field called
     'stacktrace'.

   - Update the synthetic event selftests to use the new name (synthetic
     events are created by histograms)

  Tracing bootup selftests:

   - Reorganize the code to keep artifacts of the selftests not compiled
     in when selftests are not configured.

   - Add various cond_resched() around the selftest code, as the
     softlock watchdog was triggering much more often. It appears that
     the kernel runs slower now with full debugging enabled.

   - While debugging ftrace with ftrace (using an instance ring buffer
     instead of the top level one), I found that the selftests were
     disabling prints to the debug instance.

     This should not happen, as the selftests only disable printing to
     the main buffer as the selftests examine the main buffer to see if
     it has what it expects, and prints can make the tests fail.

     Make the selftests only disable printing to the toplevel buffer,
     and leave the instance buffers alone"

* tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Have function_graph selftest call cond_resched()
  tracing: Only make selftest conditionals affect the global_trace
  tracing: Make tracing_selftest_running/delete nops when not used
  tracing: Have tracer selftests call cond_resched() before running
  tracing: Move setting of tracing_selftest_running out of register_tracer()
  tracing/selftests: Update synthetic event selftest to use common_stacktrace
  tracing: Rename stacktrace field to common_stacktrace
  tracing/histograms: Allow variables to have some modifiers
  tracing/user_events: Document user_event_mm one-shot list usage
  tracing/user_events: Rename link fields for clarity
  tracing/user_events: Remove RCU lock while pinning pages
  tracing/user_events: Split up mm alloc and attach
  tracing/timerlat: Always wakeup the timerlat thread
  tracing/user_events: Use long vs int for atomic bit ops

Merge tag 'v6.4-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"Fix an alignment crash in x86/aria"

* tag 'v6.4-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/aria - Use 16 byte alignment for GFNI constant vectors

Revert "module: error out early on concurrent load of the same module file"

This reverts commit 9828ed3f695a138f7add89fa2a186ababceb8006.

Sadly, it does seem to cause failures to load modules. Johan Hovold reports:

"This change breaks module loading during boot on the Lenovo Thinkpad
  X13s (aarch64).

  Specifically it results in indefinite probe deferral of the display
  and USB (ethernet) which makes it a pain to debug. Typing in the dark
  to acquire some logs reveals that other modules are missing as well"

Since this was applied late as a "let's try this", I'm reverting it
asap, and we can try to figure out what goes wrong later.  The excessive
parallel module loading problem is annoying, but not noticeable in
normal situations, and this was only meant as an optimistic workaround
for a user-space bug.

One possible solution may be to do the optimistic exclusive open first,
and then use a lock to serialize loading if that fails.

Reported-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Linus Torvalds <[email protected]>

tracing: Have function_graph selftest call cond_resched()

When all kernel debugging is enabled (lockdep, KSAN, etc), the function
graph enabling and disabling can take several seconds to complete. The
function_graph selftest enables and disables function graph tracing
several times. With full debugging enabled, the soft lockup watchdog was
triggering because the selftest was running without ever scheduling.

Add cond_resched() throughout the test to make sure it does not trigger
the soft lockup detector.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing: Only make selftest conditionals affect the global_trace

The tracing_selftest_running and tracing_selftest_disabled variables were
to keep trace_printk() and other writes from affecting the tracing
selftests, as the tracing selftests would examine the ring buffer to see
if it contained what it expected or not. trace_printk() and friends could
add to the ring buffer and cause the selftests to fail (and then disable
the tracer that was being tested). To keep that from happening, these
variables were added and would keep trace_printk() and friends from
writing to the ring buffer while the tests were going on.

But this was only the top level ring buffer (owned by the global_trace
instance). There is no reason to prevent writing into ring buffers of
other instances via the trace_array_printk() and friends. For the
functions that could be used by other instances, check if the global_trace
is the tracer instance that is being written to before deciding to not
allow the write.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing: Make tracing_selftest_running/delete nops when not used

There's no reason to test the condition variables tracing_selftest_running
or tracing_selftest_delete when tracing selftests are not enabled. Make
them define 0s when not the selftests are not configured in.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing: Have tracer selftests call cond_resched() before running

As there are more and more internal selftests being added to the Linux
kernel (KSAN, lockdep, etc) the selftests are taking longer to run when
these are enabled. Add a cond_resched() to the calling of
do_run_tracer_selftest() to force a schedule if NEED_RESCHED is set,
otherwise the soft lockup watchdog may trigger on boot up.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>

tracing: Move setting of tracing_selftest_running out of register_tracer()

The variables tracing_selftest_running and tracing_selftest_disabled are
only used for when CONFIG_FTRACE_STARTUP_TEST is enabled. Make them only
visible within the selftest code. The setting of those variables are in
the register_tracer() call, and set in a location where they do not need
to be. Create a wrapper around run_tracer_selftest() called
do_run_tracer_selftest() which sets those variables, and have
register_tracer() call that instead.

Having those variables only set within the CONFIG_FTRACE_STARTUP_TEST
scope gets rid of them (and also the ability to remove testing against
them) when the startup tests are not enabled (most cases).

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>

Merge tag 'phy-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy fixes from Vinod Koul:

- init count imbalance fix in qcom-qmp-pcie and combo drivers

- kernel doc header fix for qcom-snps driver

- mediatek floating point comparison fix

- amlogic fix register value

* tag 'phy-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: qcom-snps: correct struct qcom_snps_hsphy kerneldoc
  phy: amlogic: phy-meson-g12a-mipi-dphy-analog: fix CNTL2_DIF_TX_CTL0 value
  phy: mediatek: rework the floating point comparisons to fixed point
  phy: qcom-qmp-pcie-msm8996: fix init-count imbalance
  phy: qcom-qmp-combo: fix init-count imbalance

Merge tag 'dmaengine-fix-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
"Driver fixes for the at-hdmac, pl330, TI and IDXD drivers:

   - AT HDMAC driver fixes for Flow Controller bitfield, peripheral ID
     handling and potential NULL dereference check

   - PL330 function rename to avoid conflicts

   - build warning fix for pm function in TI driver

   - IDXD driver fix for passing freed memory"

* tag 'dmaengine-fix-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: at_hdmac: Extend the Flow Controller bitfield to three bits
  dmaengine: at_hdmac: Repair bitfield macros for peripheral ID handling
  dmaengine: pl330: rename _start to prevent build error
  dmaengine: at_xdmac: fix potential Oops in at_xdmac_prep_interleaved()
  dmaengine: ti: k3-udma: annotate pm function with __maybe_unused
  dmaengine: idxd: Fix passing freed memory in idxd_cdev_open()

ext4: add EA_INODE checking to ext4_iget()

Add a new flag, EXT4_IGET_EA_INODE which indicates whether the inode
is expected to have the EA_INODE flag or not. If the flag is not
set/clear as expected, then fail the iget() operation and mark the
file system as corrupted.

This commit also makes the ext4_iget() always perform the
is_bad_inode() check even when the inode is already inode cache. This
allows us to remove the is_bad_inode() check from the callers of
ext4_iget() in the ea_inode code.

Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Cc: [email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>

Linux 6.4-rc4

Merge tag 'x86-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu fix from Thomas Gleixner:
"A single fix for x86:

   - Prevent a bogus setting for the number of HT siblings, which is
     caused by the CPUID evaluation trainwreck of X86. That recomputes
     the value for each CPU, so the last CPU "wins". That can cause
     completely bogus sibling values"

* tag 'x86-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms

Merge tag 'perf-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"A small set of perf fixes:

   - Make the MSR-readout based CHA discovery work around broken
     discovery tables in some SPR firmwares.

   - Prevent saving PEBS configuration which has software bits set that
     cause a crash when restored into the relevant MSR"

* tag 'perf-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/uncore: Correct the number of CHAs on SPR
  perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS

Merge tag 'objtool-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull unwinder fixes from Thomas Gleixner:
"A set of unwinder and tooling fixes:

   - Ensure that the stack pointer on x86 is aligned again so that the
     unwinder does not read past the end of the stack

   - Discard .note.gnu.property section which has a pointlessly
     different alignment than the other note sections. That confuses
     tooling of all sorts including readelf, libbpf and pahole"

* tag 'objtool-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/show_trace_log_lvl: Ensure stack pointer is aligned, again
  vmlinux.lds.h: Discard .note.gnu.property section

Merge tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects fixes from Thomas Gleixner:
"Two fixes for debugobjects:

   - Prevent the allocation path from waking up kswapd.

     That's a long standing issue due to the GFP_ATOMIC allocation flag.
     As debug objects can be invoked from pretty much any context waking
     kswapd can end up in arbitrary lock chains versus the waitqueue
     lock

   - Correct the explicit lockdep wait-type violation in
     debug_object_fill_pool()"

* tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Don't wake up kswapd from fill_pool()
  debugobjects,locking: Annotate debug_object_fill_pool() wait type violation

Merge tag 'irq-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"A set of fixes for interrupt chip drivers:

   - Prevent loss of state in the MIPS GIC interrupt controller

   - Disable pseudo NMIs on Mediatek based Chromebooks as they have
     firmware issues which cause instantenous chrashes and freezes wen
     pseudo NMIs are used

   - Fix the error handling path in the MBIGEN driver and a defined but
     not used warning in the meson-gpio interrupt chip driver"

* tag 'irq-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mbigen: Unify the error handling in mbigen_of_create_domain()
  irqchip/meson-gpio: Mark OF related data as maybe unused
  irqchip/mips-gic: Use raw spinlock for gic_lock
  irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable
  irqchip/gic-v3: Disable pseudo NMIs on Mediatek devices w/ firmware issues
  dt-bindings: interrupt-controller: arm,gic-v3: Add quirk for Mediatek SoCs w/ broken FW

Merge tag 'mips-fixes_6.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

- fixes to get alchemy platform back in shape

- fix for initrd detection

* tag 'mips-fixes_6.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: Move initrd_start check after initrd address sanitisation.
  MIPS: Alchemy: fix dbdma2
  MIPS: Restore Au1300 support
  MIPS: unhide PATA_PLATFORM

Merge tag 'powerpc-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:

- Reinstate ARCH_FORCE_MAX_ORDER ranges to fix various breakage

* tag 'powerpc-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Reinstate ARCH_FORCE_MAX_ORDER ranges

Merge tag 'for-linus-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

- a double free fix in the Xen pvcalls backend driver

- a fix for a regression causing the MSI related sysfs entries to not
   being created in Xen PV guests

- a fix in the Xen blkfront driver for handling insane input data
   better

* tag 'for-linus-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/pci/xen: populate MSI sysfs entries
  xen/pvcalls-back: fix double frees with pvcalls_new_active_socket()
  xen/blkfront: Only check REQ_FUA for writes

Merge tag 'char-misc-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
"Here are some small driver fixes for 6.4-rc4. They are just two
  different types:

   - binder fixes and reverts for reported problems and regressions in
     the binder "driver".

   - coresight driver fixes for reported problems.

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'char-misc-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  binder: fix UAF of alloc->vma in race with munmap()
  binder: add lockless binder_alloc_(set|get)_vma()
  Revert "android: binder: stop saving a pointer to the VMA"
  Revert "binder_alloc: add missing mmap_lock calls when using the VMA"
  binder: fix UAF caused by faulty buffer cleanup
  coresight: perf: Release Coresight path when alloc trace id failed
  coresight: Fix signedness bug in tmc_etr_buf_insert_barrier_packet()

cifs: address unused variable warning

Fix trivial unused variable warning (when SMB1 support disabled)

"ioctl.c:324:17: warning: variable 'caps' set but not used [-Wunused-but-set-variable]"

Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Steve French <[email protected]>

wifi: mt76: mt7615: fix possible race in mt7615_mac_sta_poll

Grab sta_poll_lock spinlock in mt7615_mac_sta_poll routine in order to
avoid possible races with mt7615_mac_add_txs() or mt7615_mac_fill_rx()
removing msta pointer from sta_poll_list.

Fixes: a621372a04ac ("mt76: mt7615: rework mt7615_mac_sta_poll for usb code")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/48b23404b759de4f1db2ef85975c72a4aeb1097c.1684938695.git.lorenzo@kernel.org

smb: delete an unnecessary statement

We don't need to set the list iterators to NULL before a
list_for_each_entry() loop because they are assigned inside the
macro.

Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: call putname after using the last component

last component point filename struct. Currently putname is called after
vfs_path_parent_lookup(). And then last component is used for
lookup_one_qstr_excl(). name in last component is freed by previous
calling putname(). And It cause file lookup failure when testing
generic/464 test of xfstest.

Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix incorrect AllocationSize set in smb2_get_info

If filesystem support sparse file, ksmbd should return allocated size
using ->i_blocks instead of stat->size. This fix generic/694 xfstests.

Cc: [email protected]
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix UAF issue from opinfo->conn

If opinfo->conn is another connection and while ksmbd send oplock break
request to cient on current connection, The connection for opinfo->conn
can be disconnect and conn could be freed. When sending oplock break
request, this ksmbd_conn can be used and cause user-after-free issue.
When getting opinfo from the list, ksmbd check connection is being
released. If it is not released, Increase ->r_count to wait that connection
is freed.

Cc: [email protected]
Reported-by: Per Forlin <[email protected]>
Tested-by: Per Forlin <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix multiple out-of-bounds read during context decoding

Check the remaining data length before accessing the context structure
to ensure that the entire structure is contained within the packet.
Additionally, since the context data length `ctxt_len` has already been
checked against the total packet length `len_of_ctxts`, update the
comparison to use `ctxt_len`.

Cc: [email protected]
Signed-off-by: Kuan-Ting Chen <[email protected]>
Acked-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiate

Check request_buf length first to avoid out-of-bounds read by
req->DialectCount.

[ 3350.990282] BUG: KASAN: slab-out-of-bounds in smb2_handle_negotiate+0x35d7/0x3e60
[ 3350.990282] Read of size 2 at addr ffff88810ad61346 by task kworker/5:0/276
[ 3351.000406] Workqueue: ksmbd-io handle_ksmbd_work
[ 3351.003499] Call Trace:
[ 3351.006473]  <TASK>
[ 3351.006473]  dump_stack_lvl+0x8d/0xe0
[ 3351.006473]  print_report+0xcc/0x620
[ 3351.006473]  kasan_report+0x92/0xc0
[ 3351.006473]  smb2_handle_negotiate+0x35d7/0x3e60
[ 3351.014760]  ksmbd_smb_negotiate_common+0x7a7/0xf00
[ 3351.014760]  handle_ksmbd_work+0x3f7/0x12d0
[ 3351.014760]  process_one_work+0xa85/0x1780

Cc: [email protected]
Signed-off-by: Kuan-Ting Chen <[email protected]>
Acked-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix credit count leakage

This patch fix the failure from smb2.credits.single_req_credits_granted
test. When client send 8192 credit request, ksmbd return 8191 credit
granted. ksmbd should give maximum possible credits that must be granted
within the range of not exceeding the max credit to client.

Cc: [email protected]
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix uninitialized pointer read in smb2_create_link()

There is a case that file_present is true and path is uninitialized.
This patch change file_present is set to false by default and set to
true when patch is initialized.

Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
Reported-by: Coverity Scan <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()

Uninitialized rd.delegated_inode can be used in vfs_rename().
Fix this by setting rd.delegated_inode to NULL to avoid the uninitialized
read.

Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
Reported-by: Coverity Scan <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge tag 'cxl-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull compute express link fixes from Dan Williams:
"The 'media ready' series prevents the driver from acting on bad
  capacity information, and it moves some checks earlier in the init
  sequence which impacts topics in the queue for 6.5.

  Additional hotplug testing uncovered a missing enable for memory
  decode. A debug crash fix is also included.

  Summary:

   - Stop trusting capacity data before the "media ready" indication

   - Add missing HDM decoder capability enable for the cold-plug case

   - Fix a debug message induced crash"

* tag 'cxl-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl: Explicitly initialize resources when media is not ready
  cxl/port: Fix NULL pointer access in devm_cxl_add_port()
  cxl: Move cxl_await_media_ready() to before capacity info retrieval
  cxl: Wait Memory_Info_Valid before access memory related info
  cxl/port: Enable the HDM decoder capability for switch ports

Merge tag 'arm-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
"There have not been a lot of fixes for for the soc tree in 6.4, but
  these have been sitting here for too long.

  For the devicetree side, there is one minor warning fix for vexpress,
  the rest all all for the the NXP i.MX platforms: SoC specific bugfixes
  for the iMX8 clocks and its USB-3.0 gadget device, as well as board
  specific fixes for regulators and the phy on some of the i.MX boards.

  The microchip risc-v and arm32 maintainers now also add a shared
  maintainer file entry for the arm64 parts.

  The remaining fixes are all for firmware drivers, addressing mistakes
  in the optee, scmi and ff-a firmware driver implementation, mostly in
  the error handling code, incorrect use of the alloc_workqueue()
  interface in SCMI, and compatibility with corner cases of the firmware
  implementation"

* tag 'arm-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: update arm64 Microchip entries
  arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed
  dt-binding: cdns,usb3: Fix cdns,on-chip-buff-size type
  arm64: dts: colibri-imx8x: delete adc1 and dsp
  arm64: dts: colibri-imx8x: fix iris pinctrl configuration
  arm64: dts: colibri-imx8x: move pinctrl property from SoM to eval board
  arm64: dts: colibri-imx8x: fix eval board pin configuration
  arm64: dts: imx8mp: Fix video clock parents
  ARM: dts: imx6qdl-mba6: Add missing pvcie-supply regulator
  ARM: dts: imx6ull-dhcor: Set and limit the mode for PMIC buck 1, 2 and 3
  arm64: dts: imx8mn-var-som: fix PHY detection bug by adding deassert delay
  arm64: dts: imx8mn: Fix video clock parents
  firmware: arm_ffa: Set reserved/MBZ fields to zero in the memory descriptors
  firmware: arm_ffa: Fix FFA device names for logical partitions
  firmware: arm_ffa: Fix usage of partition info get count flag
  firmware: arm_ffa: Check if ffa_driver remove is present before executing
  arm64: dts: arm: add missing cache properties
  ARM: dts: vexpress: add missing cache properties
  firmware: arm_scmi: Fix incorrect alloc_workqueue() invocation
  optee: fix uninited async notif value

Merge tag 'pci-v6.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fix from Bjorn Helgaas:

- Quirk Ice Lake Root Ports to work around DPC log size issue (Mika
Westerberg)

* tag 'pci-v6.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports

Merge tag 'vfio-v6.4-rc4' of https://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:

- Test for and return error for invalid pfns through the pin pages
interface (Yan Zhao)

* tag 'vfio-v6.4-rc4' of https://github.com/awilliam/linux-vfio:
vfio/type1: check pfn valid before converting to struct page

Merge tag 'block-6.4-2023-05-26' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:
"A few fixes for the storage side of things:

   - Fix bio caching condition for passthrough IO (Anuj)

   - end-of-device check fix for zero sized devices (Christoph)

   - Update Paolo's email address

   - NVMe pull request via Keith with a single quirk addition

   - Fix regression in how wbt enablement is done (Yu)

   - Fix race in active queue accounting (Tian)"

* tag 'block-6.4-2023-05-26' of git://git.kernel.dk/linux:
  NVMe: Add MAXIO 1602 to bogus nid list.
  block: make bio_check_eod work for zero sized devices
  block: fix bio-cache for passthru IO
  block, bfq: update Paolo's address in maintainer list
  blk-mq: fix race condition in active queue accounting
  blk-wbt: fix that wbt can't be disabled by default

Merge tag 'io_uring-6.4-2023-05-26' of git://git.kernel.dk/linux

Pull io_uring fix from Jens Axboe:
"Just a single fix for the conditional schedule with the SQPOLL thread,
dropping the uring_lock if we do need to reschedule"

* tag 'io_uring-6.4-2023-05-26' of git://git.kernel.dk/linux:
io_uring: unlock sqd->lock before sq thread release CPU

btrfs: fix csum_tree_block page iteration to avoid tripping on -Werror=array-bounds

When compiling on a MIPS 64-bit machine we get these warnings:

    In file included from ./arch/mips/include/asm/cacheflush.h:13,
             from ./include/linux/cacheflush.h:5,
             from ./include/linux/highmem.h:8,
     from ./include/linux/bvec.h:10,
     from ./include/linux/blk_types.h:10,
                     from ./include/linux/blkdev.h:9,
             from fs/btrfs/disk-io.c:7:
    fs/btrfs/disk-io.c: In function ‘csum_tree_block’:
    fs/btrfs/disk-io.c:100:34: error: array subscript 1 is above array bounds of ‘struct page *[1]’ [-Werror=array-bounds]
      100 |   kaddr = page_address(buf->pages[i]);
          |                        ~~~~~~~~~~^~~
    ./include/linux/mm.h:2135:48: note: in definition of macro ‘page_address’
     2135 | #define page_address(page) lowmem_page_address(page)
          |                                                ^~~~
    cc1: all warnings being treated as errors

We can check if i overflows to solve the problem. However, this doesn't make
much sense, since i == 1 and num_pages == 1 doesn't execute the body of the loop.
In addition, i < num_pages can also ensure that buf->pages[i] will not cross
the boundary. Unfortunately, this doesn't help with the problem observed here:
gcc still complains.

To fix this add a compile-time condition for the extent buffer page
array size limit, which would eventually lead to eliminating the whole
for loop.

CC: [email protected] # 5.10+
Signed-off-by: pengfuyuan <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: fix an uninitialized variable warning in btrfs_log_inode

This fixes the following warning reported by gcc 10.2.1 under x86_64:

../fs/btrfs/tree-log.c: In function ‘btrfs_log_inode’:
../fs/btrfs/tree-log.c:6211:9: error: ‘last_range_start’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
6211 |   ret = insert_dir_log_key(trans, log, path, key.objectid,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
6212 |       first_dir_index, last_dir_index);
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../fs/btrfs/tree-log.c:6161:6: note: ‘last_range_start’ was declared here
6161 |  u64 last_range_start;
      |      ^~~~~~~~~~~~~~~~

This might be a false positive fixed in later compiler versions but we
want to have it fixed.

Reported-by: k2ci <[email protected]>
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Shida Zhang <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: call btrfs_orig_bbio_end_io in btrfs_end_bio_work

When I implemented the storage layer bio splitting, I was under the
assumption that we'll never split metadata bios. But Qu reminded me that
this can actually happen with very old file systems with unaligned
metadata chunks and RAID0.

I still haven't seen such a case in practice, but we better handled this
case, especially as it is fairly easily to do not calling the ->end_іo
method directly in btrfs_end_io_work, and using the proper
btrfs_orig_bbio_end_io helper instead.

In addition to the old file system with unaligned metadata chunks case
documented in the commit log, the combination of the new scrub code
with Johannes pending raid-stripe-tree also triggers this case. We
spent some time debugging it and found that this patch solves
the problem.

Fixes: 103c19723c80 ("btrfs: split the bio submission path into a separate file")
CC: [email protected] # 6.3+
Reviewed-by: Johannes Thumshirn <[email protected]>
Tested-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: David Sterba <[email protected]>

Merge tag 'thermal-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fix from Rafael Wysocki:
"Fix a regression introduced inadvertently during the 6.3 cycle by a
  commit making the Intel int340x thermal driver use sysfs_emit_at()
  instead of scnprintf() (Srinivas Pandruvada)"

* tag 'thermal-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel: int340x: Add new line for UUID display

Merge tag 'pm-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"Fix three issues related to the ->fast_switch callback in the AMD
  P-state cpufreq driver (Gautham R. Shenoy and Wyes Karny)"

* tag 'pm-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
  cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver
  cpufreq: amd-pstate: Add ->fast_switch() callback

cxl: Explicitly initialize resources when media is not ready

When media is not ready do not assume that the capacity information from
the identify command is valid, i.e. ->total_bytes
->partition_align_bytes ->{volatile,persistent}_only_bytes. Explicitly
zero out the capacity resources and exit early.

Given zero-init of those fields this patch is functionally equivalent to
the prior state, but it improves readability and robustness going
forward.

Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/168506118166.3004974.13523455340007852589.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <[email protected]>

Merge tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

- fix incorrect output in in-tree gpio tools

- fix a shell coding issue in gpio-sim selftests

- correctly set the permissions for debugfs attributes exposed by
   gpio-mockup

- fix chip name and pin count in gpio-f7188x for one of the supported
   models

- fix numberspace pollution when using dynamically and statically
   allocated GPIOs together

* tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio-f7188x: fix chip name and pin count on Nuvoton chip
  gpiolib: fix allocation of mixed dynamic/static GPIOs
  gpio: mockup: Fix mode of debugfs files
  selftests: gpio: gpio-sim: Fix BUG: test FAILED due to recent change
  tools: gpio: fix debounce_period_us output of lsgpio

Merge tag 'for-6.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- handle memory allocation error in checksumming helper (reported by
   syzbot)

- fix lockdep splat when aborting a transaction, add NOFS protection
   around invalidate_inode_pages2 that could allocate with GFP_KERNEL

- reduce chances to hit an ENOSPC during scrub with RAID56 profiles

* tag 'for-6.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: use nofs when cleaning up aborted transactions
  btrfs: handle memory allocation failure in btrfs_csum_one_bio
  btrfs: scrub: try harder to mark RAID56 block groups read-only

Merge tag 'drm-fixes-2023-05-26' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"This week's collection is pretty spread out, accel/qaic has a bunch of
  fixes, amdgpu, then lots of single fixes across a bunch of places.

  core:
   - fix drmm_mutex_init lock class

  mgag200:
   - fix gamma lut initialisation

  pl111:
   - fix FB depth on IMPD-1 framebuffer

  amdgpu:
   - Fix missing BO unlocking in KIQ error path
   - Avoid spurious secure display error messages
   - SMU13 fix
   - Fix an OD regression
   - GPU reset display IRQ warning fix
   - MST fix

  radeon:
   - Fix a DP regression

  i915:
   - PIPEDMC disabling fix for bigjoiner config

  panel:
   - fix aya neo air plus quirk

  sched:
   - remove redundant NULL check

  qaic:
   - fix NNC message corruption
   - Grab ch_lock during QAIC_ATTACH_SLICE_BO
   - Flush the transfer list again
   - Validate if BO is sliced before slicing
   - Validate user data before grabbing any lock
   - initialize ret variable to 0
   - silence some uninitialized variable warnings"

* tag 'drm-fixes-2023-05-26' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Have Payload Properly Created After Resume
  drm/amd/display: Fix warning in disabling vblank irq
  drm/amd/pm: Fix output of pp_od_clk_voltage
  drm/amd/pm: add missing NotifyPowerSource message mapping for SMU13.0.7
  drm/radeon: reintroduce radeon_dp_work_func content
  drm/amdgpu: don't enable secure display on incompatible platforms
  drm:amd:amdgpu: Fix missing buffer object unlock in failure path
  accel/qaic: Fix NNC message corruption
  accel/qaic: Grab ch_lock during QAIC_ATTACH_SLICE_BO
  accel/qaic: Flush the transfer list again
  accel/qaic: Validate if BO is sliced before slicing
  accel/qaic: Validate user data before grabbing any lock
  accel/qaic: initialize ret variable to 0
  drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration
  drm: fix drmm_mutex_init()
  drm/sched: Remove redundant check
  drm: panel-orientation-quirks: Change Air's quirk to support Air Plus
  accel/qaic: silence some uninitialized variable warnings
  drm/pl111: Fix FB depth on IMPD-1 framebuffer
  drm/mgag200: Fix gamma lut not initialized.

x86: re-introduce support for ERMS copies for user space accesses

I tried to streamline our user memory copy code fairly aggressively in
commit adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory
copies"), in order to then be able to clean up the code and inline the
modern FSRM case in commit 577e6a7fd50d ("x86: inline the 'rep movs' in
user copies for the FSRM case").

We had reports [1] of that causing regressions earlier with blogbench,
but that turned out to be a horrible benchmark for that case, and not a
sufficient reason for re-instating "rep movsb" on older machines.

However, now Eric Dumazet reported [2] a regression in performance that
seems to be a rather more real benchmark, where due to the removal of
"rep movs" a TCP stream over a 100Gbps network no longer reaches line
speed.

And it turns out that with the simplified the calling convention for the
non-FSRM case in commit 427fda2c8a49 ("x86: improve on the non-rep
'copy_user' function"), re-introducing the ERMS case is actually fairly
simple.

Of course, that "fairly simple" is glossing over several missteps due to
having to fight our assembler alternative code. This code really wanted
to rewrite a conditional branch to have two different targets, but that
made objtool sufficiently unhappy that this instead just ended up doing
a choice between "jump to the unrolled loop, or use 'rep movsb'
directly".

Let's see if somebody finds a case where the kernel memory copies also
care (see commit 68674f94ffc9: "x86: don't use REP_GOOD or ERMS for
small memory copies"). But Eric does argue that the user copies are
special because networking tries to copy up to 32KB at a time, if
order-3 pages allocations are possible.

In-kernel memory copies are typically small, unless they are the special
"copy pages at a time" kind that still use "rep movs".

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com/
Reported-and-tested-by: Eric Dumazet <[email protected]>
Fixes: adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory copies")
Signed-off-by: Linus Torvalds <[email protected]>

perf evsel: Separate bpf_counter_list and bpf_filters, can be used at the same time

'struct evsel' uses a union for the two lists. This turned out to be
error prone.

For example:

  [root@quaco ~]# perf stat --bpf-prog 5246
  Error: cpu-clock event does not have sample flags 66c660
  failed to set filter "(null)" on event cpu-clock with 2 (No such file or directory)
  [root@quaco ~]# perf stat --bpf-prog 5246

Fix this issue by separating the two lists.

Fixes: 56ec9457a4a20c5e ("perf bpf filter: Implement event sample filtering")
Signed-off-by: Song Liu <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

tools headers UAPI: Sync the linux/in.h with the kernel sources

Picking the changes from:

  3632679d9e4f879f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")

That includes a define (IP_PROTOCOL) that isn't being used in generating
any id -> string table used by 'perf trace'.

Addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nicolas Dichtel <[email protected]>
Cc: Paolo Abeni <[email protected]>
Link: https://lore.kernel.org/lkml/ZHD/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf cs-etm: Copy kernel coresight-pmu.h header

Copy the kernel version of the header to fix the header diff build
warning. Some new definitions were only added to the tools side header,
but these are only used in Perf so move them to a different header.

Signed-off-by: James Clark <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Mark Rutland <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: [email protected]
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: John Garry <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf bpf: Do not use llvm-strip on BPF binary

llvm-strip is not really required. Remove this dependency to make it
easier to build perf with BUILD_BPF_SKEL=1.

Committer notes:

This removes the need for the 'llvm' package just to get llvm-strip.

Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf build: Don't compile demangle-cxx.cpp if not necessary

demangle-cxx.cpp requires a C++ compiler, but feature checks may fail
because of the absence of this. Add a CONFIG_CXX_DEMANGLE so that the
source isn't built if not supported. Copy libbfd and cplus demangle
variants to a weak symbol-elf.c version so they aren't dependent on
C++. These variants are only built with the build option
BUILD_NONDISTRO=1.

Committer note:

This also handles this build break when a C++ compiler isn't available:

CXX /tmp/build/perf/util/demangle-cxx.o
/bin/sh: g++: command not found

Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Qi Liu <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf arm: Fix include path to cs-etm.h

Change "../cs-etm.h" to just "../../../util/cs-etm.h" as ../cs-etm.h
doesn't exist.

Suggested-by: Leo Yan <[email protected]>
Reviewed-by: Leo Yan <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>