Git Repo - linux.git/log

mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

When handling shmem page fault the THP with corrupted subpage could be
PMD mapped if certain conditions are satisfied.  But kernel is supposed
to send SIGBUS when trying to map hwpoisoned page.

There are two paths which may do PMD map: fault around and regular
fault.

Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") the thing was even worse in fault around path.  The THP
could be PMD mapped as long as the VMA fits regardless what subpage is
accessed and corrupted.  After this commit as long as head page is not
corrupted the THP could be PMD mapped.

In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.

This loophole could be fixed by iterating every subpage to check if any
of them is hwpoisoned or not, but it is somewhat costly in page fault
path.

So introduce a new page flag called HasHWPoisoned on the first tail
page.  It indicates the THP has hwpoisoned subpage(s).  It is set if any
subpage of THP is found hwpoisoned by memory failure and after the
refcount is bumped successfully, then cleared when the THP is freed or
split.

The soft offline path doesn't need this since soft offline handler just
marks a subpage hwpoisoned when the subpage is migrated successfully.
But shmem THP didn't get split then migrated at all.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <[email protected]>
Reviewed-by: Naoya Horiguchi <[email protected]>
Suggested-by: Kirill A. Shutemov <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: hwpoison: remove the unnecessary THP check

When handling THP hwpoison checked if the THP is in allocation or free
stage since hwpoison may mistreat it as hugetlb page.  After commit
415c64c1453a ("mm/memory-failure: split thp earlier in memory error
handling") the problem has been fixed, so this check is no longer
needed.  Remove it.  The side effect of the removal is hwpoison may
report unsplit THP instead of unknown error for shmem THP.  It seems not
like a big deal.

The following patch "mm: filemap: check if THP has hwpoisoned subpage
for PMD page fault" depends on this, which fixes shmem THP with
hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
backported to -stable as well.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Shi <[email protected]>
Suggested-by: Naoya Horiguchi <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT

Commit 5c1f4e690eec ("mm/vmalloc: switch to bulk allocator in
__vmalloc_area_node()") switched to bulk page allocator for order 0
allocation backing vmalloc. However bulk page allocator does not
support __GFP_ACCOUNT allocations and there are several users of
kvmalloc(__GFP_ACCOUNT).

For now make __GFP_ACCOUNT allocations bypass bulk page allocator. In
future if there is workload that can be significantly improved with the
bulk page allocator with __GFP_ACCCOUNT support, we can revisit the
decision.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 5c1f4e690eec ("mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()")
Signed-off-by: Shakeel Butt <[email protected]>
Reported-by: Vasily Averin <[email protected]>
Tested-by: Vasily Averin <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from WiFi (mac80211), and BPF.

  Current release - regressions:

   - skb_expand_head: adjust skb->truesize to fix socket memory
     accounting

   - mptcp: fix corrupt receiver key in MPC + data + checksum

  Previous releases - regressions:

   - multicast: calculate csum of looped-back and forwarded packets

   - cgroup: fix memory leak caused by missing cgroup_bpf_offline

   - cfg80211: fix management registrations locking, prevent list
     corruption

   - cfg80211: correct false positive in bridge/4addr mode check

   - tcp_bpf: fix race in the tcp_bpf_send_verdict resulting in reusing
     previous verdict

  Previous releases - always broken:

   - sctp: enhancements for the verification tag, prevent attackers from
     killing SCTP sessions

   - tipc: fix size validations for the MSG_CRYPTO type

   - mac80211: mesh: fix HE operation element length check, prevent out
     of bound access

   - tls: fix sign of socket errors, prevent positive error codes being
     reported from read()/write()

   - cfg80211: scan: extend RCU protection in
     cfg80211_add_nontrans_list()

   - implement ->sock_is_readable() for UDP and AF_UNIX, fix poll() for
     sockets in a BPF sockmap

   - bpf: fix potential race in tail call compatibility check resulting
     in two operations which would make the map incompatible succeeding

   - bpf: prevent increasing bpf_jit_limit above max

   - bpf: fix error usage of map_fd and fdget() in generic batch update

   - phy: ethtool: lock the phy for consistency of results

   - prevent infinite while loop in skb_tx_hash() when Tx races with
     driver reconfiguring the queue <> traffic class mapping

   - usbnet: fixes for bad HW conjured by syzbot

   - xen: stop tx queues during live migration, prevent UAF

   - net-sysfs: initialize uid and gid before calling
     net_ns_get_ownership

   - mlxsw: prevent Rx stalls under memory pressure"

* tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  Revert "net: hns3: fix pause config problem after autoneg disabled"
  mptcp: fix corrupt receiver key in MPC + data + checksum
  riscv, bpf: Fix potential NULL dereference
  octeontx2-af: Fix possible null pointer dereference.
  octeontx2-af: Display all enabled PF VF rsrc_alloc entries.
  octeontx2-af: Check whether ipolicers exists
  net: ethernet: microchip: lan743x: Fix skb allocation failure
  net/tls: Fix flipped sign in async_wait.err assignment
  net/tls: Fix flipped sign in tls_err_abort() calls
  net/smc: Correct spelling mistake to TCPF_SYN_RECV
  net/smc: Fix smc_link->llc_testlink_time overflow
  nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
  vmxnet3: do not stop tx queues after netif_device_detach()
  r8169: Add device 10ec:8162 to driver r8169
  ptp: Document the PTP_CLK_MAGIC ioctl number
  usbnet: fix error return code in usbnet_probe()
  net: hns3: adjust string spaces of some parameters of tx bd info in debugfs
  net: hns3: expand buffer len for some debugfs command
  net: hns3: add more string spaces for dumping packets number of queue info in debugfs
  net: hns3: fix data endian problem of some functions of debugfs
  ...

Merge tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A couple of final driver specific fixes for v5.15, one fixing
  potential ID collisions between two instances of the Altera driver and
  one making Microwire full duplex mode actually work on pl022"

* tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spl022: fix Microwire full duplex mode
  spi: altera: Change to dynamic allocation of spi id

Merge tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
"This fixes a potential double free when handling an out of memory
error inserting a node into an rbtree regcache"

* tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fix possible double-free in regcache_rbtree_exit()

Merge tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:
"I overlooked Guenters request to sent this upstream earlier, so it's a
  bit late in the release cycle.

  This contains:

   - Revert "watchdog: iTCO_wdt: Account for rebooting on second
     timeout"

   - sbsa: only use 32-bit accessors

   - sbsa: drop unneeded MODULE_ALIAS

   - ixp4xx_wdt: Fix address space warning

   - Fix OMAP watchdog early handling"

* tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: Fix OMAP watchdog early handling
  watchdog: ixp4xx_wdt: Fix address space warning
  watchdog: sbsa: drop unneeded MODULE_ALIAS
  watchdog: sbsa: only use 32-bit accessors
  Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"

Merge tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"Do not WARN when attaching event probe to non-existent event

  If the user tries to attach an event probe (eprobe) to an event that
  does not exist, it will trigger a warning. There's an error check that
  only expects memory issues otherwise it is considered a bug. But
  changes in the code to move around the locking made it that it can
  error out if the user attempts to attach to an event that does not
  exist, returning an -ENODEV. As this path can be caused by user space
  putting in a bad value, do not trigger a WARN"

* tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do not warn when connecting eprobe to non existing event

Revert "net: hns3: fix pause config problem after autoneg disabled"

This reverts commit 3bda2e5df476417b6d08967e2d84234a59d57b1c.

According to discussion with Andrew as follow:
https://lore.kernel.org/netdev/09eda9fe-196b-006b-6f01-f54e75715961@huawei.com/

HNS3 driver needs to separate pause autoneg from general autoneg, so revert
this incorrect patch.

Signed-off-by: Guangbin Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

mptcp: fix corrupt receiver key in MPC + data + checksum

using packetdrill it's possible to observe that the receiver key contains
random values when clients transmit MP_CAPABLE with data and checksum (as
specified in RFC8684 §3.1). Fix the layout of mptcp_out_options, to avoid
using the skb extension copy when writing the MP_CAPABLE sub-option.

Fixes: d7b269083786 ("mptcp: shrink mptcp_out_options struct")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/233
Reported-by: Poorva Sonparote <[email protected]>
Signed-off-by: Davide Caratti <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

riscv, bpf: Fix potential NULL dereference

The bpf_jit_binary_free() function requires a non-NULL argument. When
the RISC-V BPF JIT fails to converge in NR_JIT_ITERATIONS steps,
jit_data->header will be NULL, which triggers a NULL
dereference. Avoid this by checking the argument, prior calling the
function.

Fixes: ca6cb5447cec ("riscv, bpf: Factor common RISC-V JIT code")
Signed-off-by: Björn Töpel <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'octeontx2-debugfs-fixes'

Rakesh Babu Saladi says:

====================
RVU Debugfs fix updates.

The following patch series consists of the patch fixes done over
rvu_debugfs.c and rvu_nix.c files.

Patch 1: Check and return if ipolicers do not exists.
Patch 2: Fix rsrc_alloc to print all enabled PF/VF entries with list of LFs
allocated for each functional block.
Patch 3: Fix possible null pointer dereference.
====================

Signed-off-by: David S. Miller <[email protected]>

octeontx2-af: Fix possible null pointer dereference.

This patch fixes possible null pointer dereference in files
"rvu_debugfs.c" and "rvu_nix.c"

Fixes: 8756828a8148 ("octeontx2-af: Add NPA aura and pool contexts to debugfs")
Fixes: 9a946def264d ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Rakesh Babu Saladi <[email protected]>
Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

octeontx2-af: Display all enabled PF VF rsrc_alloc entries.

Currently, we are using a fixed buffer size of length 2048 to display
rsrc_alloc output. As a result a maximum of 2048 characters of
rsrc_alloc output is displayed, which may lead sometimes to display only
partial output. This patch fixes this dependency on max limit of buffer
size and displays all PF VF entries.

Each column of the debugfs entry "rsrc_alloc" uses a fixed width of 12
characters to print the list of LFs of each block for a PF/VF. If the
length of list of LFs of a block exceeds this fixed width then the list
gets truncated and displays only a part of the list. This patch fixes
this by using the maximum possible length of list of LFs among all
blocks of all PFs and VFs entries as the width size.

Fixes: f7884097141b ("octeontx2-af: Formatting debugfs entry rsrc_alloc.")
Fixes: 23205e6d06d4 ("octeontx2-af: Dump current resource provisioning status")
Signed-off-by: Rakesh Babu <[email protected]>
Signed-off-by: Nithin Dabilpuram <[email protected]>
Signed-off-by: Sunil Kovvuri Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

octeontx2-af: Check whether ipolicers exists

While displaying ingress policers information in
debugfs check whether ingress policers exist in
the hardware or not because some platforms(CN9XXX)
do not have this feature.

Fixes: e7d8971763f3 ("octeontx2-af: cn10k: Debugfs support for bandwidth")
Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Rakesh Babu <[email protected]>
Signed-off-by: Sunil Kovvuri Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: microchip: lan743x: Fix skb allocation failure

The driver allocates skb during ndo_open with GFP_ATOMIC which has high chance of failure when there are multiple instances.
GFP_KERNEL is enough while open and use GFP_ATOMIC only from interrupt context.

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/tls: Fix flipped sign in async_wait.err assignment

sk->sk_err contains a positive number, yet async_wait.err wants the
opposite. Fix the missed sign flip, which Jakub caught by inspection.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Jordan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/tls: Fix flipped sign in tls_err_abort() calls

sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,

    [kworker]
    tls_encrypt_done(..., err=<negative error from crypto request>)
      tls_err_abort(.., err)
        sk->sk_err = err;

    [task]
    splice_from_pipe_feed
      ...
        tls_sw_do_sendpage
          if (sk->sk_err) {
            ret = -sk->sk_err;  // ret is positive

    splice_from_pipe_feed (continued)
      ret = actor(...)  // ret is still positive and interpreted as bytes
                        // written, resulting in underflow of buf->len and
                        // sd->len, leading to huge buf->offset and bogus
                        // addresses computed in later calls to actor()

Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.

Cc: [email protected]
Fixes: c46234ebb4d1e ("tls: RX path for ktls")
Reported-by: [email protected]
Signed-off-by: Daniel Jordan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'SMC-fixes'

Tony Lu says:

====================
Fixes for SMC

There are some fixes for SMC.

v1->v2:
- fix wrong email address.
====================

Signed-off-by: David S. Miller <[email protected]>

net/smc: Correct spelling mistake to TCPF_SYN_RECV

There should use TCPF_SYN_RECV instead of TCP_SYN_RECV.

Signed-off-by: Wen Gu <[email protected]>
Reviewed-by: Tony Lu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: Fix smc_link->llc_testlink_time overflow

The value of llc_testlink_time is set to the value stored in
net->ipv4.sysctl_tcp_keepalive_time when linkgroup init. The value of
sysctl_tcp_keepalive_time is already jiffies, so we don't need to
multiply by HZ, which would cause smc_link->llc_testlink_time overflow,
and test_link send flood.

Signed-off-by: Tony Lu <[email protected]>
Reviewed-by: Xuan Zhuo <[email protected]>
Reviewed-by: Wen Gu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfp: bpf: relax prog rejection for mtu check through max_pkt_offset

MTU change is refused whenever the value of new MTU is bigger than
the max packet bytes that fits in NFP Cluster Target Memory (CTM).
However, an eBPF program doesn't always need to access the whole
packet data.

The maximum direct packet access (DPA) offset has always been
caculated by verifier and stored in the max_pkt_offset field of prog
aux data.

Signed-off-by: Yu Xiao <[email protected]>
Reviewed-by: Yinjun Zhang <[email protected]>
Reviewed-by: Niklas Soderlund <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vmxnet3: do not stop tx queues after netif_device_detach()

The netif_device_detach() conditionally stops all tx queues if the queues
are running. There is no need to call netif_tx_stop_all_queues() again.

Signed-off-by: Dongli Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tracing: Do not warn when connecting eprobe to non existing event

When the syscall trace points are not configured in, the kselftests for
ftrace will try to attach an event probe (eprobe) to one of the system
call trace points. This triggered a WARNING, because the failure only
expects to see memory issues. But this is not the only failure. The user
may attempt to attach to a non existent event, and the kernel must not
warn about it.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 7491e2c442781 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

r8169: Add device 10ec:8162 to driver r8169

This patch makes the driver r8169 pick up device Realtek Semiconductor Co.
, Ltd. Device [10ec:8162].

Signed-off-by: Janghyub Seo <[email protected]>
Suggested-by: Rushab Shah <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ptp: Document the PTP_CLK_MAGIC ioctl number

Add PTP_CLK_MAGIC to the userspace-api/ioctl/ioctl-number.rst
documentation file.

Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Richard Cochran <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
"A couple of fixes that seem important enough to pick at the last
  moment"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-ring: fix DMA metadata flags
  vduse: Fix race condition between resetting and irq injecting
  vduse: Disallow injecting interrupt before DRIVER_OK is set

virtio-ring: fix DMA metadata flags

The flags are currently overwritten, leading to the wrong direction
being passed to the DMA unmap functions.

Fixes: 72b5e8958738aaa4 ("virtio-ring: store DMA metadata in desc_extra for split virtqueue")
Signed-off-by: Vincent Whitchurch <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>

usbnet: fix error return code in usbnet_probe()

Return error code if usb_maxpacket() returns 0 in usbnet_probe()

Fixes: 397430b50a36 ("usbnet: sanity check for maxpacket")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Reviewed-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull nds32 tracing fix from Steven Rostedt:
"Fix nds32le build when DYNAMIC_FTRACE is disabled

  A randconfig found that nds32le architecture fails to build due to a
  prototype mismatch between a ftrace function pointer and the function
  it was to be assigned to. That function pointer prototype missed being
  updated when all the ftrace callbacks were updated"

* tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/nds32: Update the proto for ftrace_trace_function to match ftrace_stub

Merge tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux

Pull nios2 fix from Dinh Nguyen:
"Fix a build error for allmodconfig"

* tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"Nothing very exciting here, it has been a quiet cycle overall. Usual
  collection of small bug fixes:

   - irdma issues with CQ entries, VLAN completions and a mutex deadlock

   - Incorrect DCT packets in mlx5

   - Userspace triggered overflows in qib

   - Locking error in hfi

   - Typo in errno value in qib/hfi1

   - Double free in qedr

   - Leak of random kernel memory to userspace with a netlink callback"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
  RDMA/irdma: Do not hold qos mutex twice on QP resume
  RDMA/irdma: Set VLAN in UD work completion correctly
  RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR
  rdma/qedr: Fix crash due to redundant release of device's qp memory
  RDMA/rdmavt: Fix error code in rvt_create_qp()
  IB/hfi1: Fix abba locking issue with sc_disable()
  IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
  RDMA/mlx5: Set user priority for DCT
  RDMA/irdma: Process extended CQ entries correctly

ftrace/nds32: Update the proto for ftrace_trace_function to match ftrace_stub

The ftrace callback prototype was changed to pass a special ftrace_regs
instead of pt_regs as the last parameter, but the static ftrace for nds32
missed updating ftrace_trace_function and this caused a warning when
compared to ftrace_stub:

../arch/nds32/kernel/ftrace.c: In function '_mcount':
../arch/nds32/kernel/ftrace.c:24:35: error: comparison of distinct pointer types lacks a cast [-Werror]
24 | if (ftrace_trace_function != ftrace_stub)
| ^~

Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Fixes: d19ad0775dcd6 ("ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs")
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

Merge tag 'mac80211-for-net-2021-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Two fixes:
* bridge vs. 4-addr mode check was wrong
* management frame registrations locking was
wrong, causing list corruption/crashes
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST

nios2:allmodconfig builds fail with

make[1]: *** No rule to make target 'arch/nios2/boot/dts/""',
needed by 'arch/nios2/boot/dts/built-in.a'. Stop.
make: [Makefile:1868: arch/nios2/boot/dts] Error 2 (ignored)

This is seen with compile tests since those enable NIOS2_DTB_SOURCE_BOOL,
which in turn enables NIOS2_DTB_SOURCE. This causes the build error
because the default value for NIOS2_DTB_SOURCE is an empty string.
Disable NIOS2_DTB_SOURCE_BOOL for compile tests to avoid the error.

Fixes: 2fc8483fdcde ("nios2: Build infrastructure")
Signed-off-by: Guenter Roeck <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
Signed-off-by: Dinh Nguyen <[email protected]>

Merge branch 'hns3-fixes'

Guangbin Huang says:

====================
net: hns3: add some fixes for -net

This series adds some fixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <[email protected]>

net: hns3: adjust string spaces of some parameters of tx bd info in debugfs

This patch adjusts the string spaces of some parameters of tx bd info in
debugfs according to their maximum needs.

Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: expand buffer len for some debugfs command

The specified buffer length for three debugfs files fd_tcam, uc and tqp
is not enough for their maximum needs, so this patch fixes them.

Fixes: b5a0b70d77b9 ("net: hns3: refactor dump fd tcam of debugfs")
Fixes: 1556ea9120ff ("net: hns3: refactor dump mac list of debugfs")
Fixes: d96b0e59468d ("net: hns3: refactor dump reg of debugfs")
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: add more string spaces for dumping packets number of queue info in debugfs

As the width of packets number registers is 32 bits, they needs at most
10 characters for decimal data printing, but now the string spaces is not
enough, so this patch fixes it.

Fixes: e44c495d95e ("net: hns3: refactor queue info of debugfs")
Signed-off-by: Jie Wang <[email protected]>
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix data endian problem of some functions of debugfs

The member data in struct hclge_desc is type of __le32, it needs endian
conversion before using it, and some functions of debugfs didn't do that,
so this patch fixes it.

Fixes: c0ebebb9ccc1 ("net: hns3: Add "dcb register" status information query function")
Signed-off-by: Jie Wang <[email protected]>
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: ignore reset event before initialization process is done

Currently, if there is a reset event triggered by RAS during device in
initialization process, driver may run reset process concurrently with
initialization process. In this case, it may cause problem. For example,
the RSS indirection table may has not been alloc memory in initialization
process yet, but it is used in reset process, it will cause a call trace
like this:

[61228.744836] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[61228.897677] Workqueue: hclgevf hclgevf_service_task [hclgevf]
[61228.911390] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[61228.918670] pc : hclgevf_set_rss_indir_table+0xb4/0x190 [hclgevf]
[61228.927812] lr : hclgevf_set_rss_indir_table+0x90/0x190 [hclgevf]
[61228.937248] sp : ffff8000162ebb50
[61228.941087] x29: ffff8000162ebb50 x28: ffffb77add72dbc0 x27: ffff0820c7dc8080
[61228.949516] x26: 0000000000000000 x25: ffff0820ad4fc880 x24: ffff0820c7dc8080
[61228.958220] x23: ffff0820c7dc8090 x22: 00000000ffffffff x21: 0000000000000040
[61228.966360] x20: ffffb77add72b9c0 x19: 0000000000000000 x18: 0000000000000030
[61228.974646] x17: 0000000000000000 x16: ffffb77ae713feb0 x15: ffff0820ad4fcce8
[61228.982808] x14: ffffffffffffffff x13: ffff8000962eb7f7 x12: 00003834ec70c960
[61228.991990] x11: 00e0fafa8c206982 x10: 9670facc78a8f9a8 x9 : ffffb77add717530
[61229.001123] x8 : ffff0820ad4fd6b8 x7 : 0000000000000000 x6 : 0000000000000011
[61229.010249] x5 : 00000000000cb1b0 x4 : 0000000000002adb x3 : 0000000000000049
[61229.018662] x2 : ffff8000162ebbb8 x1 : 0000000000000000 x0 : 0000000000000480
[61229.027002] Call trace:
[61229.030177]  hclgevf_set_rss_indir_table+0xb4/0x190 [hclgevf]
[61229.039009]  hclgevf_rss_init_hw+0x128/0x1b4 [hclgevf]
[61229.046809]  hclgevf_reset_rebuild+0x17c/0x69c [hclgevf]
[61229.053862]  hclgevf_reset_service_task+0x4cc/0xa80 [hclgevf]
[61229.061306]  hclgevf_service_task+0x6c/0x630 [hclgevf]
[61229.068491]  process_one_work+0x1dc/0x48c
[61229.074121]  worker_thread+0x15c/0x464
[61229.078562]  kthread+0x168/0x16c
[61229.082873]  ret_from_fork+0x10/0x18
[61229.088221] Code: 7900e7f6 f904a683 d503201f 9101a3e2 (38616b43)
[61229.095357] ---[ end trace 153661a538f6768c ]---

To fix this problem, don't schedule reset task before initialization
process is done.

Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: change hclge/hclgevf workqueue to WQ_UNBOUND mode

Currently, the workqueue of hclge/hclgevf is executed on
the CPU that initiates scheduling requests by default. In
stress scenarios, the CPU may be busy and workqueue scheduling
is completed after a long period of time. To avoid this
situation and implement proper scheduling, use the WQ_UNBOUND
mode instead. In this way, the workqueue can be performed on
a relatively idle CPU.

Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix pause config problem after autoneg disabled

If a TP port is configured by follow steps:
1.ethtool -s ethx autoneg off speed 100 duplex full
2.ethtool -A ethx rx on tx on
3.ethtool -s ethx autoneg on(rx&tx negotiated pause results are off)
4.ethtool -s ethx autoneg off speed 100 duplex full

In step 3, driver will set rx&tx pause parameters of hardware to off as
pause parameters negotiated with link partner are off.

After step 4, the "ethtool -a ethx" command shows both rx and tx pause
parameters are on. However, pause parameters of hardware are still off
and port has no flow control function actually.

To fix this problem, if autoneg is disabled, driver uses its saved
parameters to restore pause of hardware. If the speed is not changed in
this case, there is no link state changed for phy, it will cause the pause
parameter is not taken effect, so we need to force phy to go down and up.

Fixes: aacbe27e82f0 ("net: hns3: modify how pause options is displayed")
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'arm-soc-fixes-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
"One last set of small fixes for the soc tree:

   - Incorrect ethernet phy settings found on i.mx and allwinner
     platforms

   - a revert for a Qualcomm DT change that caused a boot regression

   - four patches for incorrect settings in i.MX DT files

   - new MAINTAINER file entries for dhcom boards

   - a Kconfig fix for a reset driver that became unselectable

   - three more code changes for bugs in reset drivers"

* tag 'arm-soc-fixes-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: Add maintainers for DHCOM i.MX6 and DHCOM/DHCOR STM32MP1
  Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"
  arm64: dts: imx8mm-kontron: Fix connection type for VSC8531 RGMII PHY
  arm64: dts: imx8mm-kontron: Fix CAN SPI clock frequency
  arm64: dts: imx8mm-kontron: Fix polarity of reg_rst_eth2
  arm64: dts: imx8mm-kontron: Set lower limit of VDD_SNVS to 800 mV
  arm64: dts: imx8mm-kontron: Make sure SOC and DRAM supply voltages are correct
  reset: socfpga: add empty driver allowing consumers to probe
  reset: tegra-bpmp: Handle errors in BPMP response
  reset: pistachio: Re-enable driver selection
  reset: brcmstb-rescal: fix incorrect polarity of status bit
  ARM: dts: sun7i: A20-olinuxino-lime2: Fix ethernet phy-mode
  arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2021-10-26

We've added 12 non-merge commits during the last 7 day(s) which contain
a total of 23 files changed, 118 insertions(+), 98 deletions(-).

The main changes are:

1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen.

2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang.

3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai.

4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer.

5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun.

6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian.

7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix potential race in tail call compatibility check
  bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET
  selftests/bpf: Use recv_timeout() instead of retries
  net: Implement ->sock_is_readable() for UDP and AF_UNIX
  skmsg: Extract and reuse sk_msg_is_readable()
  net: Rename ->stream_memory_read to ->sock_is_readable
  tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
  cgroup: Fix memory leak caused by missing cgroup_bpf_offline
  bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch()
  bpf: Prevent increasing bpf_jit_limit above max
  bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
  bpf: Define bpf_jit_alloc_exec_limit for riscv JIT
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

bpf: Fix potential race in tail call compatibility check

Lorenzo noticed that the code testing for program type compatibility of
tail call maps is potentially racy in that two threads could encounter a
map with an unset type simultaneously and both return true even though they
are inserting incompatible programs.

The race window is quite small, but artificially enlarging it by adding a
usleep_range() inside the check in bpf_prog_array_compatible() makes it
trivial to trigger from userspace with a program that does, essentially:

        map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
        pid = fork();
        if (pid) {
                key = 0;
                value = xdp_fd;
        } else {
                key = 1;
                value = tc_fd;
        }
        err = bpf_map_update_elem(map_fd, &key, &value, 0);

While the race window is small, it has potentially serious ramifications in
that triggering it would allow a BPF program to tail call to a program of a
different type. So let's get rid of it by protecting the update with a
spinlock. The commit in the Fixes tag is the last commit that touches the
code in question.

v2:
- Use a spinlock instead of an atomic variable and cmpxchg() (Alexei)
v3:
- Put lock and the members it protects into an embedded 'owner' struct (Daniel)

Fixes: 3324b584b6f6 ("ebpf: misc core cleanup")
Reported-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET

bpf_types.h has BPF_MAP_TYPE_INODE_STORAGE and BPF_MAP_TYPE_TASK_STORAGE
declared inside #ifdef CONFIG_NET although they are built regardless of
CONFIG_NET. So, when CONFIG_BPF_SYSCALL && !CONFIG_NET, they are built
without the declarations leading to spurious build failures and not
registered to bpf_map_types making them unavailable.

Fix it by moving the BPF_MAP_TYPE for the two map types outside of
CONFIG_NET.

Reported-by: kernel test robot <[email protected]>
Fixes: a10787e6d58c ("bpf: Enable task local storage for tracing programs")
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Merge branch 'sock_map: fix ->poll() and update selftests'

Cong Wang says:

====================
This patchset fixes ->poll() for sockets in sockmap and updates
selftests accordingly with select(). Please check each patch
for more details.

Fixes: c50524ec4e3a ("Merge branch 'sockmap: add sockmap support for unix datagram socket'")
Fixes: 89d69c5d0fbc ("Merge branch 'sockmap: introduce BPF_SK_SKB_VERDICT and support UDP'")
Acked-by: John Fastabend <[email protected]>
---
v4: add a comment in udp_poll()

v3: drop sk_psock_get_checked()
    reuse tcp_bpf_sock_is_readable()

v2: rename and reuse ->stream_memory_read()
    fix a compile error in sk_psock_get_checked()

Cong Wang (3):
  net: rename ->stream_memory_read to ->sock_is_readable
  skmsg: extract and reuse sk_msg_is_readable()
  net: implement ->sock_is_readable() for UDP and AF_UNIX

====================

Reviewed-by: Jakub Sitnicki <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: Use recv_timeout() instead of retries

We use non-blocking sockets in those tests, retrying for
EAGAIN is ugly because there is no upper bound for the packet
arrival time, at least in theory. After we fix poll() on
sockmap sockets, now we can switch to select()+recv().

Signed-off-by: Yucong Sun <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

net: Implement ->sock_is_readable() for UDP and AF_UNIX

Yucong noticed we can't poll() sockets in sockmap even
when they are the destination sockets of redirections.
This is because we never poll any psock queues in ->poll(),
except for TCP. With ->sock_is_readable() now we can
overwrite >sock_is_readable(), invoke and implement it for
both UDP and AF_UNIX sockets.

Reported-by: Yucong Sun <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

skmsg: Extract and reuse sk_msg_is_readable()

tcp_bpf_sock_is_readable() is pretty much generic,
we can extract it and reuse it for non-TCP sockets.

Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

net: Rename ->stream_memory_read to ->sock_is_readable

The proto ops ->stream_memory_read() is currently only used
by TCP to check whether psock queue is empty or not. We need
to rename it before reusing it for non-TCP protocols, and
adjust the exsiting users accordingly.

Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function

With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or
multiple cores) on a single socket 'sk' we could get the following flow.

msgA, sk                               msgB, sk
-----------                            ---------------
tcp_bpf_sendmsg()
lock(sk)
psock = sk->psock
                                        tcp_bpf_sendmsg()
                                        lock(sk) ... blocking
tcp_bpf_send_verdict
if (psock->eval == NONE)
   psock->eval = sk_psock_msg_verdict
..
< handle SK_REDIRECT case >
   release_sock(sk)                     < lock dropped so grab here >
   ret = tcp_bpf_sendmsg_redir
                                        psock = sk->psock
                                        tcp_bpf_send_verdict
lock_sock(sk) ... blocking on B
                                        if (psock->eval == NONE) <- boom.
                                         psock->eval will have msgA state

The problem here is we dropped the lock on msgA and grabbed it with msgB.
Now we have old state in psock and importantly psock->eval has not been
cleared. So msgB will run whatever action was done on A and the verdict
program may never see it.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

watchdog: Fix OMAP watchdog early handling

TI's implementation does not service the watchdog even if the kernel
command line parameter omap_wdt.early_enable is set to 1. This patch
fixes the issue.

Signed-off-by: Walter Stoll <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>

watchdog: ixp4xx_wdt: Fix address space warning

sparse reports the following address space warning.

drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
incorrect type in assignment (different address spaces)
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
expected void [noderef] __iomem *base
drivers/watchdog/ixp4xx_wdt.c:122:20: sparse:
got void *platform_data

Add a typecast to solve the problem.

Fixes: 21a0a29d16c6 ("watchdog: ixp4xx: Rewrite driver to use core")
Cc: Linus Walleij <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wim Van Sebroeck <[email protected]>

watchdog: sbsa: drop unneeded MODULE_ALIAS

The MODULE_DEVICE_TABLE already creates proper alias for platform
driver. Having another MODULE_ALIAS causes the alias to be duplicated.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>

watchdog: sbsa: only use 32-bit accessors

SBSA says of the generic watchdog:

  All registers are 32 bits in size and should be accessed using 32-bit
  reads and writes. If an access size other than 32 bits is used then
  the results are IMPLEMENTATION DEFINED.

and for qemu, the implementation will only allow 32-bit accesses
resulting in a synchronous external abort when configuring the watchdog.
Use lo_hi_* accessors rather than a readq/writeq.

Fixes: abd3ac7902fb ("watchdog: sbsa: Support architecture version 1")
Signed-off-by: Jamie Iles <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Reviewed-by: Shaokun Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>

Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"

This reverts commit cb011044e34c ("watchdog: iTCO_wdt: Account for
rebooting on second timeout") and commit aec42642d91f ("watchdog: iTCO_wdt:
Fix detection of SMI-off case") since those patches cause a regression
on certain boards (https://bugzilla.kernel.org/show_bug.cgi?id=213809).

While this revert may result in some boards to only reset after twice
the configured timeout value, that is still better than a watchdog reset
after half the configured value.

Fixes: cb011044e34c ("watchdog: iTCO_wdt: Account for rebooting on second timeout")
Fixes: aec42642d91f ("watchdog: iTCO_wdt: Fix detection of SMI-off case")
Cc: Jan Kiszka <[email protected]>
Cc: Mantas Mikulėnas <[email protected]>
Reported-by: Javier S. Pedro <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wim Van Sebroeck <[email protected]>

MAINTAINERS: Add maintainers for DHCOM i.MX6 and DHCOM/DHCOR STM32MP1

Add maintainers for DH electronics DHCOM i.MX6
and DHCOM/DHCOR STM32MP1 boards.

Signed-off-by: Christoph Niedermaier <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]'
To: [email protected]
To: [email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'qcom-arm64-fixes-for-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes

Qualcomm ARM64 DTS one more fix for 5.15

This reverts a clock change in the Qualcomm RB5 devicetree which in some
combinations of firmware and configuration causes the device to crash
during boot.

Data on an adjacent platform indicates that this is probably not be the
root cause of the problem, but this resolves the regression seen on RB5
and will allow the SM8250 platform to boot v5.15.

* tag 'qcom-arm64-fixes-for-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

MAINTAINERS: please remove myself from the Prestera driver

Signed-off-by: Vadym Kochan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: lan78xx: fix division by zero in send path

Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in lan78xx_tx_bh() in case a malicious device has
broken descriptors (or when doing descriptor fuzz testing).

Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4fb0 ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: [email protected] # 4.3
Cc: [email protected] <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: batman-adv: fix error handling

Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was
in wrong error handling in batadv_mesh_init().

Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case
of any batadv_*_init() calls failure. This approach may work well, when
there is some kind of indicator, which can tell which parts of batadv are
initialized; but there isn't any.

All written above lead to cleaning up uninitialized fields. Even if we hide
ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit
GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]

To fix these bugs we can unwind batadv_*_init() calls one by one.
It is good approach for 2 reasons: 1) It fixes bugs on error handling
path 2) It improves the performance, since we won't call unneeded
batadv_*_free() functions.

So, this patch makes all batadv_*_init() clean up all allocated memory
before returning with an error to no call correspoing batadv_*_free()
and open-codes batadv_mesh_free() with proper order to avoid touching
uninitialized fields.

Link: https://lore.kernel.org/netdev/[email protected]/
Reported-and-tested-by: [email protected]
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
Signed-off-by: Pavel Skripkin <[email protected]>
Acked-by: Sven Eckelmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: fix size validations for the MSG_CRYPTO type

The function tipc_crypto_key_rcv is used to parse MSG_CRYPTO messages
to receive keys from other nodes in the cluster in order to decrypt any
further messages from them.
This patch verifies that any supplied sizes in the message body are
valid for the received message.

Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange")
Signed-off-by: Max VA <[email protected]>
Acked-by: Ying Xue <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfc: port100: fix using -ERRNO as command type mask

During probing, the driver tries to get a list (mask) of supported
command types in port100_get_command_type_mask() function. The value
is u64 and 0 is treated as invalid mask (no commands supported). The
function however returns also -ERRNO as u64 which will be interpret as
valid command mask.

Return 0 on every error case of port100_get_command_type_mask(), so the
probing will stop.

Cc: <[email protected]>
Fixes: 0347a6ab300a ("NFC: port100: Commands mechanism implementation")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-10-25

This series contains updates to ice driver only.

Dave adds event handler for LAG NETDEV_UNREGISTER to unlink device from
link aggregate.

Yongxin Liu adds a check for PTP support during release which would
cause a call trace on non-PTP supported devices.
====================

Signed-off-by: David S. Miller <[email protected]>

net: multicast: calculate csum of looped-back and forwarded packets

During a testing of an user-space application which transmits UDP
multicast datagrams and utilizes multicast routing to send the UDP
datagrams out of defined network interfaces, I've found a multicast
router does not fill-in UDP checksum into locally produced, looped-back
and forwarded UDP datagrams, if an original output NIC the datagrams
are sent to has UDP TX checksum offload enabled.

The datagrams are sent malformed out of the NIC the datagrams have been
forwarded to.

It is because:

1. If TX checksum offload is enabled on the output NIC, UDP checksum
   is not calculated by kernel and is not filled into skb data.

2. dev_loopback_xmit(), which is called solely by
   ip_mc_finish_output(), sets skb->ip_summed = CHECKSUM_UNNECESSARY
   unconditionally.

3. Since 35fc92a9 ("[NET]: Allow forwarding of ip_summed except
   CHECKSUM_COMPLETE"), the ip_summed value is preserved during
   forwarding.

4. If ip_summed != CHECKSUM_PARTIAL, checksum is not calculated during
   a packet egress.

The minimum fix in dev_loopback_xmit():

1. Preserves skb->ip_summed CHECKSUM_PARTIAL. This is the
   case when the original output NIC has TX checksum offload enabled.
   The effects are:

     a) If the forwarding destination interface supports TX checksum
        offloading, the NIC driver is responsible to fill-in the
        checksum.

     b) If the forwarding destination interface does NOT support TX
        checksum offloading, checksums are filled-in by kernel before
        skb is submitted to the NIC driver.

     c) For local delivery, checksum validation is skipped as in the
        case of CHECKSUM_UNNECESSARY, thanks to skb_csum_unnecessary().

2. Translates ip_summed CHECKSUM_NONE to CHECKSUM_UNNECESSARY. It
   means, for CHECKSUM_NONE, the behavior is unmodified and is there
   to skip a looped-back packet local delivery checksum validation.

Signed-off-by: Cyril Strejc <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

spi: spl022: fix Microwire full duplex mode

There are missing braces in the function that verify controller parameters,
then an error is always returned when the parameter to select Microwire
frames operation is used on devices allowing it.

Signed-off-by: Thomas Perrot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

mlxsw: pci: Recycle received packet upon allocation failure

When the driver fails to allocate a new Rx buffer, it passes an empty Rx
descriptor (contains zero address and size) to the device and marks it
as invalid by setting the skb pointer in the descriptor's metadata to
NULL.

After processing enough Rx descriptors, the driver will try to process
the invalid descriptor, but will return immediately seeing that the skb
pointer is NULL. Since the driver no longer passes new Rx descriptors to
the device, the Rx queue will eventually become full and the device will
start to drop packets.

Fix this by recycling the received packet if allocation of the new
packet failed. This means that allocation is no longer performed at the
end of the Rx routine, but at the start, before tearing down the DMA
mapping of the received packet.

Remove the comment about the descriptor being zeroed as it is no longer
correct. This is OK because we either use the descriptor as-is (when
recycling) or overwrite its address and size fields with that of the
newly allocated Rx buffer.

The issue was discovered when a process ("perf") consumed too much
memory and put the system under memory pressure. It can be reproduced by
injecting slab allocation failures [1]. After the fix, the Rx queue no
longer comes to a halt.

[1]
# echo 10 > /sys/kernel/debug/failslab/times
# echo 1000 > /sys/kernel/debug/failslab/interval
# echo 100 > /sys/kernel/debug/failslab/probability

FAULT_INJECTION: forcing a failure.
name failslab, interval 1000, probability 100, space 0, times 8
[...]
Call Trace:
  <IRQ>
  dump_stack_lvl+0x34/0x44
  should_fail.cold+0x32/0x37
  should_failslab+0x5/0x10
  kmem_cache_alloc_node+0x23/0x190
  __alloc_skb+0x1f9/0x280
  __netdev_alloc_skb+0x3a/0x150
  mlxsw_pci_rdq_skb_alloc+0x24/0x90
  mlxsw_pci_cq_tasklet+0x3dc/0x1200
  tasklet_action_common.constprop.0+0x9f/0x100
  __do_softirq+0xb5/0x252
  irq_exit_rcu+0x7a/0xa0
  common_interrupt+0x83/0xa0
  </IRQ>
  asm_common_interrupt+0x1e/0x40
RIP: 0010:cpuidle_enter_state+0xc8/0x340
[...]
mlxsw_spectrum2 0000:06:00.0: Failed to alloc skb for RDQ

Fixes: eda6500a987a ("mlxsw: Add PCI bus implementation")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ice: check whether PTP is initialized in ice_ptp_release()

PTP is currently only supported on E810 devices, it is checked
in ice_ptp_init(). However, there is no check in ice_ptp_release().
For other E800 series devices, ice_ptp_release() will be wrongly executed.

Fix the following calltrace.

  INFO: trying to register non-static key.
  The code is fine but needs lockdep annotation, or maybe
  you didn't initialize this object before use?
  turning off the locking correctness validator.
  Workqueue: ice ice_service_task [ice]
  Call Trace:
   dump_stack_lvl+0x5b/0x82
   dump_stack+0x10/0x12
   register_lock_class+0x495/0x4a0
   ? find_held_lock+0x3c/0xb0
   __lock_acquire+0x71/0x1830
   lock_acquire+0x1e6/0x330
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   ? _raw_spin_lock+0x19/0x70
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   _raw_spin_lock+0x38/0x70
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   ice_ptp_release+0x3c/0x1e0 [ice]
   ice_prepare_for_reset+0xcb/0xe0 [ice]
   ice_do_reset+0x38/0x110 [ice]
   ice_service_task+0x138/0xf10 [ice]
   ? __this_cpu_preempt_check+0x13/0x20
   process_one_work+0x26a/0x650
   worker_thread+0x3f/0x3b0
   ? __kthread_parkme+0x51/0xb0
   ? process_one_work+0x650/0x650
   kthread+0x161/0x190
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x1f/0x30

Fixes: 4dd0d5c33c3e ("ice: add lock around Tx timestamp tracker flush")
Signed-off-by: Yongxin Liu <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

ice: Respond to a NETDEV_UNREGISTER event for LAG

When the PF is a member of a link aggregate, and the driver
is removed, the process will hang unless we respond to the
NETDEV_UNREGISTER event that is sent to the event_handler
for LAG.

Add a case statement for the ice_lag_event_handler to unlink
the PF from the link aggregate.

Also remove code that was incorrectly applying a dev_hold to
peer_netdevs that were associated with the ice driver.

Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG")
Signed-off-by: Dave Ertman <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"

This reverts commit 001ce9785c0674d913531345e86222c965fc8bf4.

This upstream commit broke AOSP (post Android 12 merge) build
on RB5. The device either silently crashes into USB crash mode
after android boot animation or we see a blank blue screen
with following dpu errors in dmesg:

[  T444] hw recovery is not complete for ctl:3
[  T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu error]enc31 intf1 ctl 3 reset failure: -22
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110
[    C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame done timeout
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110

Fixes: 001ce9785c06 ("arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target")
Signed-off-by: Amit Pundir <[email protected]>
Signed-off-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Linux 5.15-rc7

secretmem: Prevent secretmem_users from wrapping to zero

Commit 110860541f44 ("mm/secretmem: use refcount_t instead of atomic_t")
attempted to fix the problem of secretmem_users wrapping to zero and
allowing suspend once again.

But it was reverted in commit 87066fdd2e30 ("Revert 'mm/secretmem: use
refcount_t instead of atomic_t'") because of the problems it caused - a
refcount_t was not semantically the right type to use.

Instead prevent secretmem_users from wrapping to zero by forbidding new
users if the number of users has wrapped from positive to negative.
This stops a long way short of reaching the necessary 4 billion users
where it wraps to zero again, so there's no need to be clever with
special anti-wrap types or checking the return value from atomic_inc().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: Jordy Zomer <[email protected]>
Cc: Kees Cook <[email protected]>,
Cc: James Bottomley <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

spi: Fix tegra20 build with CONFIG_PM=n once again

Commit efafec27c565 ("spi: Fix tegra20 build with CONFIG_PM=n") already
fixed the build without PM support once.  There was an alternative fix
by Guenter in commit 2bab94090b01 ("spi: tegra20-slink: Declare runtime
suspend and resume functions conditionally"), and Mark then merged the
two correctly in ffb1e76f4f32 ("Merge tag 'v5.15-rc2' into spi-5.15").

But for some inexplicable reason, Mark then merged things _again_ in
commit 59c4e190b10c ("Merge tag 'v5.15-rc3' into spi-5.15"), and screwed
things up at that point, and the __maybe_unused attribute on
tegra_slink_runtime_resume() went missing.

Reinstate it, so that alpha (and other architectures without PM support)
builds cleanly again.

Btw, this is another prime example of how random back-merges are not
good.  Just don't do them.  Subsystem developers should not merge my
tree in any normal circumstances.  Both of those merge commits pointed
to above are bad: even the one that got the merge result right doesn't
even mention _why_ it was done, and the one that got it wrong is
obviously broken.

Reported-by: Guenter Roeck <[email protected]>
Cc: Mark Brown <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

- Fix clang-related relocation warning in futex code

- Fix incorrect use of get_kernel_nofault()

- Fix bad code generation in __get_user_check() when kasan is enabled

- Ensure TLB function table is correctly aligned

- Remove duplicated string function definitions in decompressor

- Fix link-time orphan section warnings

- Fix old-style function prototype for arch_init_kprobes()

- Only warn about XIP address when not compile testing

- Handle BE32 big endian for keystone2 remapping

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
  ARM: 9141/1: only warn about XIP address when not compile testing
  ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
  ARM: 9138/1: fix link warning with XIP + frame-pointer
  ARM: 9134/1: remove duplicate memcpy() definition
  ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
  ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
  ARM: 9125/1: fix incorrect use of get_kernel_nofault()
  ARM: 9122/1: select HAVE_FUTEX_CMPXCHG

Merge tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull libata fix from Damien Le Moal:
"A single fix in this pull request addressing an invalid error code
return in the sata_mv driver (from Zheyu)"

* tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: sata_mv: Fix the error handling of mv_chip_id()

Merge tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"Some late pin control fixes, the most generally annoying will probably
  be the AMD IRQ storm fix affecting the Microsoft surface.

  Summary:

   - Three fixes pertaining to Broadcom DT bindings. Some stuff didn't
     work out as inteded, we need to back out

   - A resume bug fix in the STM32 driver

   - Disable and mask the interrupts on probe in the AMD pinctrl driver,
     affecting Microsoft surface"

* tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: disable and mask interrupts on probe
  pinctrl: stm32: use valid pin identifier in stm32_pinctrl_resume()
  Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
  dt-bindings: pinctrl: brcm,ns-pinmux: drop unneeded CRU from example
  Revert "dt-bindings: pinctrl: bcm4708-pinmux: rework binding to use syscon"

net-sysfs: initialize uid and gid before calling net_ns_get_ownership

Currently in net_ns_get_ownership() it may not be able to set uid or gid
if make_kuid or make_kgid returns an invalid value, and an uninit-value
issue can be triggered by this.

This patch is to fix it by initializing the uid and gid before calling
net_ns_get_ownership(), as it does in kobject_get_ownership()

Fixes: e6dee9f3893c ("net-sysfs: add netdev_change_owner()")
Reported-by: Paolo Abeni <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

xen/netfront: stop tx queues during live migration

The tx queues are not stopped during the live migration. As a result, the
ndo_start_xmit() may access netfront_info->queues which is freed by
talk_to_netback()->xennet_destroy_queues().

This patch is to netif_device_detach() at the beginning of xen-netfront
resuming, and netif_device_attach() at the end of resuming.

     CPU A                                CPU B

talk_to_netback()
-> if (info->queues)
        xennet_destroy_queues(info);
    to free netfront_info->queues

                                        xennet_start_xmit()
                                        to access netfront_info->queues

  -> err = xennet_create_queues(info, &num_queues);

The idea is borrowed from virtio-net.

Cc: Joe Jin <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Prevent infinite while loop in skb_tx_hash()

Drivers call netdev_set_num_tc() and then netdev_set_tc_queue()
to set the queue count and offset for each TC.  So the queue count
and offset for the TCs may be zero for a short period after dev->num_tc
has been set.  If a TX packet is being transmitted at this time in the
code path netdev_pick_tx() -> skb_tx_hash(), skb_tx_hash() may see
nonzero dev->num_tc but zero qcount for the TC.  The while loop that
keeps looping while hash >= qcount will not end.

Fix it by checking the TC's qcount to be nonzero before using it.

Fixes: eadec877ce9c ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Reviewed-by: Andy Gospodarek <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string

When copying the device name, the length of the data memcpy copied exceeds
the length of the source buffer, which cause the KASAN issue below.  Use
strscpy_pad() instead.

BUG: KASAN: slab-out-of-bounds in ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
Read of size 64 at addr ffff88811a10f5e0 by task rping/140263
CPU: 3 PID: 140263 Comm: rping Not tainted 5.15.0-rc1+ #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
  dump_stack_lvl+0x57/0x7d
  print_address_description.constprop.0+0x1d/0xa0
  kasan_report+0xcb/0x110
  kasan_check_range+0x13d/0x180
  memcpy+0x20/0x60
  ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
  ib_nl_make_request+0x1c6/0x380 [ib_core]
  send_mad+0x20a/0x220 [ib_core]
  ib_sa_path_rec_get+0x3e3/0x800 [ib_core]
  cma_query_ib_route+0x29b/0x390 [rdma_cm]
  rdma_resolve_route+0x308/0x3e0 [rdma_cm]
  ucma_resolve_route+0xe1/0x150 [rdma_ucm]
  ucma_write+0x17b/0x1f0 [rdma_ucm]
  vfs_write+0x142/0x4d0
  ksys_write+0x133/0x160
  do_syscall_64+0x43/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f26499aa90f
Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 5c fd ff ff 48
RSP: 002b:00007f26495f2dc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000000007d0 RCX: 00007f26499aa90f
RDX: 0000000000000010 RSI: 00007f26495f2e00 RDI: 0000000000000003
RBP: 00005632a8315440 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000293 R12: 00007f26495f2e00
R13: 00005632a83154e0 R14: 00005632a8315440 R15: 00005632a830a810

Allocated by task 131419:
  kasan_save_stack+0x1b/0x40
  __kasan_kmalloc+0x7c/0x90
  proc_self_get_link+0x8b/0x100
  pick_link+0x4f1/0x5c0
  step_into+0x2eb/0x3d0
  walk_component+0xc8/0x2c0
  link_path_walk+0x3b8/0x580
  path_openat+0x101/0x230
  do_filp_open+0x12e/0x240
  do_sys_openat2+0x115/0x280
  __x64_sys_openat+0xce/0x140
  do_syscall_64+0x43/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink")
Link: https://lore.kernel.org/r/72ede0f6dab61f7f23df9ac7a70666e07ef314b0.1635055496.git.leonro@nvidia.com
Signed-off-by: Mark Zhang <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

net: nxp: lpc_eth.c: avoid hang when bringing interface down

A hard hang is observed whenever the ethernet interface is brought
down. If the PHY is stopped before the LPC core block is reset,
the SoC will hang. Comparing lpc_eth_close() and lpc_eth_open() I
re-arranged the ordering of the functions calls in lpc_eth_close() to
reset the hardware before stopping the PHY.
Fixes: b7370112f519 ("lpc32xx: Added ethernet driver")
Signed-off-by: Trevor Woerner <[email protected]>
Acked-by: Vladimir Zapolskiy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cfg80211: correct bridge/4addr mode check

Without the patch we fail:

$ sudo brctl addbr br0
$ sudo brctl addif br0 wlp1s0
$ sudo iw wlp1s0 set 4addr on
command failed: Device or resource busy (-16)

Last command failed but iface was already in 4addr mode.

Fixes: ad4bb6f8883a ("cfg80211: disallow bridging managed/adhoc interfaces")
Signed-off-by: Janusz Dziedzic <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[add fixes tag, fix indentation, edit commit log]
Signed-off-by: Johannes Berg <[email protected]>

cfg80211: fix management registrations locking

The management registrations locking was broken, the list was
locked for each wdev, but cfg80211_mgmt_registrations_update()
iterated it without holding all the correct spinlocks, causing
list corruption.

Rather than trying to fix it with fine-grained locking, just
move the lock to the wiphy/rdev (still need the list on each
wdev), we already need to hold the wdev lock to change it, so
there's no contention on the lock in any case. This trivially
fixes the bug since we hold one wdev's lock already, and now
will hold the lock that protects all lists.

Cc: [email protected]
Reported-by: Jouni Malinen <[email protected]>
Fixes: 6cd536fe62ef ("cfg80211: change internal management frame registration API")
Link: https://lore.kernel.org/r/20211025133111.5cf733eab0f4.I7b0abb0494ab712f74e2efcd24bb31ac33f7eee9@changeid
Signed-off-by: Johannes Berg <[email protected]>

Merge branch 'ksettings-locking-fixes'

Andrew Lunn says:

====================
ksettings_{get|set} lock fixes

Walter Stoll <[email protected]> reported a race condition
between "ethtool -s eth0 speed 100 duplex full autoneg off" and phylib
reading the current status from the PHY. Both ksetting_get and
ksetting_set fail the take the phydev mutex, and as a result, there is
a small window of time where the phydev members are not self
consistent.

Patch 1 fixes phy_ethtool_ksettings_get by adding the needed lock.
Patches 2 and 3 move code around and perform to refactoring, to allow
patch 4 to fix phy_ethtool_ksettings_set by added the lock.

Thanks go to Walter for the detailed origional report, suggested fix,
and testing of the proposed patches.
====================

Signed-off-by: David S. Miller <[email protected]>

phy: phy_ethtool_ksettings_set: Lock the PHY while changing settings

There is a race condition where the PHY state machine can change
members of the phydev structure at the same time userspace requests a
change via ethtool. To prevent this, have phy_ethtool_ksettings_set
take the PHY lock.

Fixes: 2d55173e71b0 ("phy: add generic function to support ksetting support")
Reported-by: Walter Stoll <[email protected]>
Suggested-by: Walter Stoll <[email protected]>
Tested-by: Walter Stoll <[email protected]>
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

phy: phy_start_aneg: Add an unlocked version

Split phy_start_aneg into a wrapper which takes the PHY lock, and a
helper doing the real work. This will be needed when
phy_ethtook_ksettings_set takes the lock.

Fixes: 2d55173e71b0 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

phy: phy_ethtool_ksettings_set: Move after phy_start_aneg

This allows it to make use of a helper which assume the PHY is already
locked.

Fixes: 2d55173e71b0 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

phy: phy_ethtool_ksettings_get: Lock the phy for consistency

The PHY structure should be locked while copying information out if
it, otherwise there is no guarantee of self consistency. Without the
lock the PHY state machine could be updating the structure.

Fixes: 2d55173e71b0 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S

My intel-ixp42x-welltech-epbx100 no longer boot since 4.14.
This is due to commit 463dbba4d189 ("ARM: 9104/2: Fix Keystone 2 kernel
mapping regression")
which forgot to handle CONFIG_CPU_ENDIAN_BE32 as possible BE config.

Suggested-by: Krzysztof Hałasa <[email protected]>
Fixes: 463dbba4d189 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression")
Signed-off-by: Corentin Labbe <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>

ata: sata_mv: Fix the error handling of mv_chip_id()

mv_init_host() propagates the value returned by mv_chip_id() which in turn
gets propagated by mv_pci_init_one() and hits local_pci_probe().

During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Since this is a bug rather than a recoverable runtime error we should
use dev_alert() instead of dev_err().

Signed-off-by: Zheyu Ma <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>

Revert "mm/secretmem: use refcount_t instead of atomic_t"

This reverts commit 110860541f443f950c1274f217a1a3e298670a33.

Converting the "secretmem_users" counter to a refcount is incorrect,
because a refcount is special in zero and can't just be incremented (but
a count of users is not, and "no users" is actually perfectly valid and
not a sign of a free'd resource).

Reported-by: [email protected]
Cc: Jordy Zomer <[email protected]>
Cc: Kees Cook <[email protected]>,
Cc: Jordy Zomer <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull autofs fix from Al Viro:
"Fix for a braino of mine (in getting rid of open-coded
  dentry_path_raw() in autofs a couple of cycles ago).

  Mea culpa...  Obvious -stable fodder"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  autofs: fix wait name hash calculation in autofs_wait()

Merge tag 'sched_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:
"Reset clang's Shadow Call Stack on hotplug to prevent it from
overflowing"

* tag 'sched_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/scs: Reset the shadow stack when idle_task_exit

Merge tag 'x86_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:
"A single change adding Dave Hansen to our maintainers team"

* tag 'x86_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add Dave Hansen to the x86 maintainer team

Merge tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd

Pull ksmbd fixes from Steve French:
"Ten fixes for the ksmbd kernel server, for improved security and
  additional buffer overflow checks:

   - a security improvement to session establishment to reduce the
     possibility of dictionary attacks

   - fix to ensure that maximum i/o size negotiated in the protocol is
     not less than 64K and not more than 8MB to better match expected
     behavior

   - fix for crediting (flow control) important to properly verify that
     sufficient credits are available for the requested operation

   - seven additional buffer overflow, buffer validation checks"

* tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd:
  ksmbd: add buffer validation in session setup
  ksmbd: throttle session setup failures to avoid dictionary attacks
  ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests
  ksmbd: validate credit charge after validating SMB2 PDU body size
  ksmbd: add buffer validation for smb direct
  ksmbd: limit read/write/trans buffer size not to exceed 8MB
  ksmbd: validate compound response buffer
  ksmbd: fix potencial 32bit overflow from data area check in smb2_write
  ksmbd: improve credits management
  ksmbd: add validation in smb2_ioctl

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Ten fixes, seven of which are in drivers.

  The core fixes are one to fix a potential crash on resume, one to sort
  out our reference count releases to avoid releasing in-use modules and
  one to adjust the cmd per lun calculation to avoid an overflow in
  hyper-v"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: ufs-pci: Force a full restore after suspend-to-disk
  scsi: qla2xxx: Fix unmap of already freed sgl
  scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
  scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
  scsi: sd: Fix crashes in sd_resume_runtime()
  scsi: mpi3mr: Fix duplicate device entries when scanning through sysfs
  scsi: core: Put LLD module refcnt after SCSI device is released
  scsi: storvsc: Fix validation for unsolicited incoming packets
  scsi: iscsi: Fix set_param() handling
  scsi: core: Fix shost->cmd_per_lun calculation in scsi_add_host_with_dma()

net: ethernet: microchip: lan743x: Fix dma allocation failure by using dma_set_mask_and_coherent

The dma failure was reported in the raspberry pi github (issue #4117).
https://github.com/raspberrypi/linux/issues/4117
The use of dma_set_mask_and_coherent fixes the issue.
Tested on 32/64-bit raspberry pi CM4 and 64-bit ubuntu x86 PC with EVB-LAN7430.

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails

The driver needs to clean up and return when the initialization fails on resume.

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <[email protected]>
Signed-off-by: David S. Miller <[email protected]>