Git Repo - linux.git/log

powerpc64/ftrace: Fix ftrace for clang builds

Clang doesn't support -mprofile-kernel ABI, so guard the checks against
CONFIG_DYNAMIC_FTRACE_WITH_REGS, rather than the elf ABI version.

Fixes: 23b44fc248f4 ("powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64")
Cc: [email protected] # v5.19+
Reported-by: Nick Desaulniers <[email protected]>
Reported-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Tested-by: Ondrej Mosnacek <[email protected]>
Acked-by: Nick Desaulniers <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://github.com/llvm/llvm-project/issues/57031
Link: https://github.com/ClangBuiltLinux/linux/issues/1682
Link: https://lore.kernel.org/r/[email protected]

powerpc: Make eh value more explicit when using lwarx

Just like the first patch of this series, define a local 'eh' in order
to make the code clearer.

And IS_ENABLED() returns either 1 or 0 so no need to do
IS_ENABLED(CONFIG_PPC64) ? 1 : 0.

Signed-off-by: Christophe Leroy <[email protected]>
[mpe: Use symbolic names, use 'n' constraint per Segher]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/629befaa2d05e2922346e58a383886510d6af55a.1659430931.git.christophe.leroy@csgroup.eu

powerpc: Don't hide eh field of lwarx behind a macro

The eh field must remain 0 for PPC32 and is only used
by PPC64.

Don't hide that behind a macro, just leave the responsibility
to the user.

At the time being, the only users of PPC_RAW_L{WDQ}ARX are
setting the eh field to 0, so the special handling of __PPC_EH
is useless. Just take the value given by the caller.

Same for DEFINE_TESTOP(), don't do special handling in that
macro, ensure the caller hands over the proper eh value.

Signed-off-by: Christophe Leroy <[email protected]>
[mpe: Use 'n' constraint per Segher]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/8b9c8a1a14f9143552a85fcbf96698224a8c2469.1659430931.git.christophe.leroy@csgroup.eu

powerpc: Fix eh field when calling lwarx on PPC32

Commit 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of
PPC_LWARX/LDARX macros") properly handled the eh field of lwarx
in asm/bitops.h but failed to clear it for PPC32 in
asm/simple_spinlock.h

So, do as in arch_atomic_try_cmpxchg_lock(), set it to 1 if PPC64
but set it to 0 if PPC32. For that use IS_ENABLED(CONFIG_PPC64) which
returns 1 when CONFIG_PPC64 is set and 0 otherwise.

Fixes: 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros")
Cc: [email protected] # v5.15+
Reported-by: Pali Rohár <[email protected]>
Signed-off-by: Christophe Leroy <[email protected]>
Tested-by: Pali Rohár <[email protected]>
Reviewed-by: Segher Boessenkool <[email protected]>
[mpe: Use symbolic names, use 'n' constraint per Segher]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a1176e19e627dd6a1b8d24c6c457a8ab874b7d12.1659430931.git.christophe.leroy@csgroup.eu

Merge branch 'do-not-use-rt_tos-for-ipv6-flowlabel'

Matthias May says:

====================
Do not use RT_TOS for IPv6 flowlabel

According to Guillaume Nault RT_TOS should never be used for IPv6.

Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.

But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ipv6: do not use RT_TOS for IPv6 flowlabel

According to Guillaume Nault RT_TOS should never be used for IPv6.

Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.

But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.

Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: Matthias May <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

mlx5: do not use RT_TOS for IPv6 flowlabel

According to Guillaume Nault RT_TOS should never be used for IPv6.

Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.

But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.

Fixes: ce99f6b97fcd ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels")
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: Matthias May <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

vxlan: do not use RT_TOS for IPv6 flowlabel

According to Guillaume Nault RT_TOS should never be used for IPv6.

Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.

But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.

Fixes: 1400615d64cf ("vxlan: allow setting ipv6 traffic class")
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: Matthias May <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

geneve: do not use RT_TOS for IPv6 flowlabel

According to Guillaume Nault RT_TOS should never be used for IPv6.

Quote:
RT_TOS() is an old macro used to interprete IPv4 TOS as described in
the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
code, although, given the current state of the code, most of the
existing calls have no consequence.

But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
field to be interpreted the RFC 1349 way. There's no historical
compatibility to worry about.

Fixes: 3a56f86f1be6 ("geneve: handle ipv6 priority like ipv4 tos")
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: Matthias May <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

geneve: fix TOS inheriting for ipv4

The current code retrieves the TOS field after the lookup
on the ipv4 routing table. The routing process currently
only allows routing based on the original 3 TOS bits, and
not on the full 6 DSCP bits.
As a result the retrieved TOS is cut to the 3 bits.
However for inheriting purposes the full 6 bits should be used.

Extract the full 6 bits before the route lookup and use
that instead of the cut off 3 TOS bits.

Fixes: e305ac6cf5a1 ("geneve: Add support to collect tunnel metadata.")
Signed-off-by: Matthias May <[email protected]>
Acked-by: Guillaume Nault <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: atlantic: fix aq_vec index out of range error

The final update statement of the for loop exceeds the array range, the
dereference of self->aq_vec[i] is not checked and then leads to the
index out of range error.
Also fixed this kind of coding style in other for loop.

[   97.937604] UBSAN: array-index-out-of-bounds in drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1404:48
[   97.937607] index 8 is out of range for type 'aq_vec_s *[8]'
[   97.937608] CPU: 38 PID: 3767 Comm: kworker/u256:18 Not tainted 5.19.0+ #2
[   97.937610] Hardware name: Dell Inc. Precision 7865 Tower/, BIOS 1.0.0 06/12/2022
[   97.937611] Workqueue: events_unbound async_run_entry_fn
[   97.937616] Call Trace:
[   97.937617]  <TASK>
[   97.937619]  dump_stack_lvl+0x49/0x63
[   97.937624]  dump_stack+0x10/0x16
[   97.937626]  ubsan_epilogue+0x9/0x3f
[   97.937627]  __ubsan_handle_out_of_bounds.cold+0x44/0x49
[   97.937629]  ? __scm_send+0x348/0x440
[   97.937632]  ? aq_vec_stop+0x72/0x80 [atlantic]
[   97.937639]  aq_nic_stop+0x1b6/0x1c0 [atlantic]
[   97.937644]  aq_suspend_common+0x88/0x90 [atlantic]
[   97.937648]  aq_pm_suspend_poweroff+0xe/0x20 [atlantic]
[   97.937653]  pci_pm_suspend+0x7e/0x1a0
[   97.937655]  ? pci_pm_suspend_noirq+0x2b0/0x2b0
[   97.937657]  dpm_run_callback+0x54/0x190
[   97.937660]  __device_suspend+0x14c/0x4d0
[   97.937661]  async_suspend+0x23/0x70
[   97.937663]  async_run_entry_fn+0x33/0x120
[   97.937664]  process_one_work+0x21f/0x3f0
[   97.937666]  worker_thread+0x4a/0x3c0
[   97.937668]  ? process_one_work+0x3f0/0x3f0
[   97.937669]  kthread+0xf0/0x120
[   97.937671]  ? kthread_complete_and_exit+0x20/0x20
[   97.937672]  ret_from_fork+0x22/0x30
[   97.937676]  </TASK>

v2. fixed "warning: variable 'aq_vec' set but not used"

v3. simplified a for loop

Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
Signed-off-by: Chia-Lin Kao (AceLan) <[email protected]>
Acked-by: Sudarsana Reddy Kalluru <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ax88796: Fix some typo in a comment

s/by caused/be caused/
s/ax88786/ax88796/

Signed-off-by: Christophe JAILLET <[email protected]>
Link: https://lore.kernel.org/r/7db4b622d2c3e5af58c1d1f32b81836f4af71f18.1659801746.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Harden set element field checks to avoid out-of-bound memory access,
   this patch also fixes the type of issue described in 7e6bc1f6cabc
   ("netfilter: nf_tables: stricter validation of element data") in a
   broader way.

2) Patches to restrict the chain, set, and rule id lookup in the
   transaction to the corresponding top-level table, patches from
   Thadeu Lima de Souza Cascardo.

3) Fix incorrect comment in ip6t_LOG.h

4) nft_data_init() performs upfront validation of the expected data.
   struct nft_data_desc is used to describe the expected data to be
   received from userspace. The .size field represents the maximum size
   that can be stored, for bound checks. Then, .len is an input/output field
   which stores the expected length as input (this is optional, to restrict
   the checks), as output it stores the real length received from userspace
   (if it was not specified as input). This patch comes in response to
   7e6bc1f6cabc ("netfilter: nf_tables: stricter validation of element data")
   to address this type of issue in a more generic way by avoid opencoded
   data validation. Next patch requires this as a dependency.

5) Disallow jump to implicit chain from set element, this configuration
   is invalid. Only allow jump to chain via immediate expression is
   supported at this stage.

6) Fix possible null-pointer derefence in the error path of table updates,
   if memory allocation of the transaction fails. From Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: fix null deref due to zeroed list head
  netfilter: nf_tables: disallow jump to implicit chain from set element
  netfilter: nf_tables: upfront validation of data via nft_data_init()
  netfilter: ip6t_LOG: Fix a typo in a comment
  netfilter: nf_tables: do not allow RULE_ID to refer to another chain
  netfilter: nf_tables: do not allow CHAIN_ID to refer to another table
  netfilter: nf_tables: do not allow SET_ID to refer to another table
  netfilter: nf_tables: validate variable length element extension
====================

Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'Don't reinit map value in prealloc_lru_pop'

Kumar Kartikeya Dwivedi says:

====================

Fix for a bug in prealloc_lru_pop spotted while reading the code, then a test +
example that checks whether it is fixed.

Changelog:
----------
v2 -> v3:
v2: https://lore.kernel.org/bpf/20220809140615 [email protected]

* Switch test to use kptr instead of kptr_ref to stabilize test runs
* Fix missing lru_bug__destroy (Yonghong)
* Collect Acks

v1 -> v2:
v1: https://lore.kernel.org/bpf/20220806014603 [email protected]

* Expand commit log to include summary of the discussion with Yonghong
* Make lru_bug selftest serial to not mess up refcount for map_kptr test
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: Add test for prealloc_lru_pop bug

Add a regression test to check against invalid check_and_init_map_value
call inside prealloc_lru_pop.

The kptr should not be reset to NULL once we set it after deleting the
map element. Hence, we trigger a program that updates the element
causing its reuse, and checks whether the unref kptr is reset or not.
If it is, prealloc_lru_pop does an incorrect check_and_init_map_value
call and the test fails.

Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Don't reinit map value in prealloc_lru_pop

The LRU map that is preallocated may have its elements reused while
another program holds a pointer to it from bpf_map_lookup_elem. Hence,
only check_and_free_fields is appropriate when the element is being
deleted, as it ensures proper synchronization against concurrent access
of the map value. After that, we cannot call check_and_init_map_value
again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while
they can be concurrently accessed from a BPF program.

This is safe to do as when the map entry is deleted, concurrent access
is protected against by check_and_free_fields, i.e. an existing timer
would be freed, and any existing kptr will be released by it. The
program can create further timers and kptrs after check_and_free_fields,
but they will eventually be released once the preallocated items are
freed on map destruction, even if the item is never reused again. Hence,
the deleted item sitting in the free list can still have resources
attached to it, and they would never leak.

With spin_lock, we never touch the field at all on delete or update, as
we may end up modifying the state of the lock. Since the verifier
ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock
call, the program will eventually release the lock so that on reuse the
new user of the value can take the lock.

Essentially, for the preallocated case, we must assume that the map
value may always be in use by the program, even when it is sitting in
the freelist, and handle things accordingly, i.e. use proper
synchronization inside check_and_free_fields, and never reinitialize the
special fields when it is reused on update.

Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Allow calling bpf_prog_test kfuncs in tracing programs

In addition to TC hook, enable these in tracing programs so that they
can be used in selftests.

Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

dt-bindings: mfd: convert to yaml Qualcomm SPMI PMIC

Convert Qualcomm SPMI PMIC binding to yaml format.

Additional changes:
- filled many missing compatibles

Co-developed-by: Caleb Connolly <[email protected]>
Signed-off-by: David Heidelberg <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

dm writecache: fix smatch warning about invalid return from writecache_map

There's a smatch warning "inconsistent returns '&wc->lock'" in
dm-writecache. The reason for the warning is that writecache_map()
doesn't drop the lock on the impossible path.

Fix this warning by adding wc_unlock() after the BUG statement (so
that it will be compiled-away anyway).

Fixes: df699cc16ea5e ("dm writecache: report invalid return from writecache_map helpers")
Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm verity: fix verity_parse_opt_args parsing

Commit df326e7a0699 ("dm verity: allow optional args to alter primary
args handling") introduced a bug where verity_parse_opt_args() wouldn't
properly shift past an optional argument's additional params (by
ignoring them).

Fix this by avoiding returning with error if an unknown argument is
encountered when @only_modifier_opts=true is passed to
verity_parse_opt_args().

In practice this regressed the cryptsetup testsuite's FEC testing
because unknown optional arguments were encountered, wherey
short-circuiting ever testing FEC mode. With this fix all of the
cryptsetup testsuite's verity FEC tests pass.

Fixes: df326e7a0699 ("dm verity: allow optional args to alter primary args handling")
Reported-by: Milan Broz <[email protected]>>
Signed-off-by: Mike Snitzer <[email protected]>

dm verity: fix DM_VERITY_OPTS_MAX value yet again

Must account for the possibility that "try_verify_in_tasklet" is used.

This is the same issue that was fixed with commit 160f99db94322 -- it
is far too easy to miss that additional a new argument(s) require
bumping DM_VERITY_OPTS_MAX accordingly.

Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Signed-off-by: Mike Snitzer <[email protected]>

dm bufio: simplify DM_BUFIO_CLIENT_NO_SLEEP locking

Historically none of the bufio code runs in interrupt context but with
the use of DM_BUFIO_CLIENT_NO_SLEEP a bufio client can, see: commit
5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
That said, the new tasklet usecase still doesn't require interrupts be
disabled by bufio (let alone conditionally restore them).

Yet with PREEMPT_RT, and falling back from tasklet to workqueue, care
must be taken to properly synchronize between softirq and process
context, otherwise ABBA deadlock may occur. While it is unnecessary to
disable bottom-half preemption within a tasklet, we must consistently do
so in process context to ensure locking is in the proper order.

Fix these issues by switching from spin_lock_irq{save,restore} to using
spin_{lock,unlock}_bh instead. Also remove the 'spinlock_flags' member
in dm_bufio_client struct (that can be used unsafely if bufio must
recurse on behalf of some caller, e.g. block layer's submit_bio).

Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Reported-by: Jens Axboe <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

add barriers to buffer_uptodate and set_buffer_uptodate

Let's have a look at this piece of code in __bread_slow:

get_bh(bh);
bh->b_end_io = end_buffer_read_sync;
submit_bh(REQ_OP_READ, 0, bh);
wait_on_buffer(bh);
if (buffer_uptodate(bh))
return bh;

Neither wait_on_buffer nor buffer_uptodate contain any memory barrier.
Consequently, if someone calls sb_bread and then reads the buffer data,
the read of buffer data may be executed before wait_on_buffer(bh) on
architectures with weak memory ordering and it may return invalid data.

Fix this bug by adding a memory barrier to set_buffer_uptodate and an
acquire barrier to buffer_uptodate (in a similar way as
folio_test_uptodate and folio_mark_uptodate).

Signed-off-by: Mikulas Patocka <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
"Work on 'courteous server', which was introduced in 5.19, continues
  apace. This release introduces a more flexible limit on the number of
  NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in
  courtesy state long after the lease expiration timeout. The client
  limit is adjusted based on the physical memory size of the server.

  The NFSD filecache is a cache of files held open by NFSv4 clients or
  recently touched by NFSv2 or NFSv3 clients. This cache had some
  significant scalability constraints that have been relieved in this
  release. Thanks to all who contributed to this work.

  A data corruption bug found during the most recent NFS bake-a-thon
  that involves NFSv3 and NFSv4 clients writing the same file has been
  addressed in this release.

  This release includes several improvements in CPU scalability for
  NFSv4 operations. In addition, Neil Brown provided patches that
  simplify locking during file lookup, creation, rename, and removal
  that enables subsequent work on making these operations more scalable.
  We expect to see that work materialize in the next release.

  There are also numerous single-patch fixes, clean-ups, and the usual
  improvements in observability"

* tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits)
  lockd: detect and reject lock arguments that overflow
  NFSD: discard fh_locked flag and fh_lock/fh_unlock
  NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
  NFSD: use explicit lock/unlock for directory ops
  NFSD: reduce locking in nfsd_lookup()
  NFSD: only call fh_unlock() once in nfsd_link()
  NFSD: always drop directory lock in nfsd_unlink()
  NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
  NFSD: add posix ACLs to struct nfsd_attrs
  NFSD: add security label to struct nfsd_attrs
  NFSD: set attributes when creating symlinks
  NFSD: introduce struct nfsd_attrs
  NFSD: verify the opened dentry after setting a delegation
  NFSD: drop fh argument from alloc_init_deleg
  NFSD: Move copy offload callback arguments into a separate structure
  NFSD: Add nfsd4_send_cb_offload()
  NFSD: Remove kmalloc from nfsd4_do_async_copy()
  NFSD: Refactor nfsd4_do_copy()
  NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
  NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
  ...

NTB: EPF: Tidy up some bounds checks

This sscanf() is reading from the filename which was set by the kernel
so it should be trust worthy.  Although the data is likely trust worthy
there is some bounds checking but unfortunately, it is not complete or
consistent.  Additionally, the Smatch static checker marks everything
that comes from sscanf() as tainted and so Smatch complains that this
code can lead to an out of bounds issue.  Let's clean things up and make
Smatch happy.

The first problem is that there is no bounds checking in the _show()
functions.  The _store() and _show() functions are very similar so make
the bounds checking the same in both.

The second issue is that if "win_no" is zero it leads to an array
underflow so add an if (win_no <= 0) check for that.

Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Souptick Joarder (HPE) <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

NTB: EPF: Fix error code in epf_ntb_bind()

Return an error code if pci_register_driver() fails. Don't return
success.

Fixes: da51fd247424 ("NTB: EPF: support NTB transfer between PCI RC and EP connection")
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Souptick Joarder (HPE) <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

PCI: endpoint: pci-epf-vntb: reduce several globals to statics

sparse reports
drivers/pci/endpoint/functions/pci-epf-vntb.c:975:5: warning: symbol 'pci_read' was not declared. Should it be static?
drivers/pci/endpoint/functions/pci-epf-vntb.c:984:5: warning: symbol 'pci_write' was not declared. Should it be static?
drivers/pci/endpoint/functions/pci-epf-vntb.c:989:16: warning: symbol 'vpci_ops' was not declared. Should it be static?

These functions and variables are only used in pci-epf-vntb.c, so their storage
class specifiers should be static.

Fixes: ff32fac00d97 ("NTB: EPF: support NTB transfer between PCI RC and EP connection")
Signed-off-by: Tom Rix <[email protected]>
Acked-by: Frank Li <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

PCI: endpoint: pci-epf-vntb: fix error handle in epf_ntb_mw_bar_init()

In error case of epf_ntb_mw_bar_init(), memory window BARs should be
cleared, so add 'num_mws' parameter in epf_ntb_mw_bar_clear() and
calling it in error path to clear the BARs. Also add missing error
code when pci_epc_mem_alloc_addr() fails.

Fixes: ff32fac00d97 ("NTB: EPF: support NTB transfer between PCI RC and EP connection")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

PCI: endpoint: Fix Kconfig dependency

If CONFIG_NTB is not set and CONFIG_PCI_EPF_VNTB is y.

make ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu-, will be failed, like this:

drivers/pci/endpoint/functions/pci-epf-vntb.o: In function `epf_ntb_cmd_handler':
pci-epf-vntb.c:(.text+0x95e): undefined reference to `ntb_db_event'
pci-epf-vntb.c:(.text+0xa1f): undefined reference to `ntb_link_event'
pci-epf-vntb.c:(.text+0xa42): undefined reference to `ntb_link_event'
drivers/pci/endpoint/functions/pci-epf-vntb.o: In function `pci_vntb_probe':
pci-epf-vntb.c:(.text+0x1250): undefined reference to `ntb_register_device'

The functions ntb_*() are defined in drivers/ntb/core.c, which need CONFIG_NTB setting y to be build-in.
To fix this build error, add depends on NTB.

Reported-by: Hulk Robot <[email protected]>
Fixes: ff32fac00d97("NTB: EPF: support NTB transfer between PCI RC and EP connection")
Signed-off-by: Ren Zhijie <[email protected]>
Acked-by: Frank Li <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Tested-by: Randy Dunlap <[email protected]> # build-tested
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

NTB: EPF: set pointer addr to null using NULL rather than 0

The pointer addr is being set to null using 0. Use NULL instead.

Cleans up sparse warning:
warning: Using plain integer as NULL pointer

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

Documentation: PCI: extend subheading underline for "lspci output" section

The underline syntax for "lspci output..." section is off-by-one less
than the section heading's length, hence triggers the warning:

Documentation/PCI/endpoint/pci-vntb-howto.rst:131: WARNING: Title underline too short.

Extend the underline by one to match the heading length.

Link: https://lore.kernel.org/linux-next/[email protected]/
Fixes: 0c4b285d9636cc ("Documentation: PCI: Add specification for the PCI vNTB function device")
Reported-by: Stephen Rothwell <[email protected]>
Cc: Kishon Vijay Abraham I <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: "Krzysztof Wilczyński" <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Frank Li <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Bagas Sanjaya <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

Documentation: PCI: Use code-block block for scratchpad registers diagram

The diagram in "Scratchpad Registers" isn't formatted inside code block,
hence triggers indentation warning:

Documentation/PCI/endpoint/pci-vntb-function.rst:82: WARNING: Unexpected indentation.

Fix the warning by using code-block directive to format the diagram
inside code block, as in other diagrams in Documentation/. While at it,
unindent the preceeding text.

Link: https://lore.kernel.org/linux-next/[email protected]/
Fixes: 0c4b285d9636cc ("Documentation: PCI: Add specification for the PCI vNTB function device")
Reported-by: Stephen Rothwell <[email protected]>
Cc: Kishon Vijay Abraham I <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: "Krzysztof Wilczyński" <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Frank Li <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Bagas Sanjaya <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

Documentation: PCI: Add specification for the PCI vNTB function device

Add specification for the PCI vNTB function device. The endpoint function
driver and the host PCI driver should be created based on this
specification.

Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

PCI: endpoint: Support NTB transfer between RC and EP

Add NTB function driver and virtual PCI Bus and Virtual NTB driver
to implement communication between PCIe Root Port and PCIe EP devices

┌────────────┐         ┌─────────────────────────────────────┐
│            │         │                                     │
├────────────┤         │                      ┌──────────────┤
│ NTB        │         │                      │ NTB          │
│ NetDev     │         │                      │ NetDev       │
├────────────┤         │                      ├──────────────┤
│ NTB        │         │                      │ NTB          │
│ Transfer   │         │                      │ Transfer     │
├────────────┤         │                      ├──────────────┤
│            │         │                      │              │
│  PCI NTB   │         │                      │              │
│    EPF     │         │                      │              │
│   Driver   │         │                      │ PCI Virtual  │
│            │         ├───────────────┐      │ NTB Driver   │
│            │         │ PCI EP NTB    │◄────►│              │
│            │         │  FN Driver    │      │              │
├────────────┤         ├───────────────┤      ├──────────────┤
│            │         │               │      │              │
│  PCI Bus   │ ◄─────► │  PCI EP Bus   │      │  Virtual PCI │
│            │  PCI    │               │      │     Bus      │
└────────────┘         └───────────────┴──────┴──────────────┘
PCIe Root Port                        PCI EP

This driver includes 3 parts:
1 PCI EP NTB function driver
2 Virtual PCI bus
3 PCI virtual NTB driver, which is loaded only by above virtual PCI bus

Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

NTB: epf: Allow more flexibility in the memory BAR map method

Support the below BAR configuration methods for epf NTB.

BAR 0: config and scratchpad
BAR 2: doorbell
BAR 4: memory map windows

Set difference BAR number information into struct ntb_epf_data. So difference
VID/PID can choose different BAR configurations. There are difference
BAR map method between epf NTB and epf vNTB Endpoint function.

Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address

ntb_mw_set_trans() will set memory map window after endpoint function
driver bind. The inbound map address need be updated dynamically when
using NTB by PCIe Root Port and PCIe Endpoint connection.

Checking if iatu already assigned to the BAR, if yes, using assigned iatu
number to update inbound address map and skip set BAR's register.

Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

dt-bindings: soc: qcom: smd-rpm: extend example

Replace existing limited example with proper code for Qualcomm Resource
Power Manager (RPM) over SMD based on MSM8916. This also fixes the
example's indentation.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

dt-bindings: soc: qcom: smd: reference SMD edge schema

The child node of smd is an SMD edge representing remote subsystem.
Bring back missing reference from previously sent patch (disappeared
when applying).

Link: https://lore.kernel.org/r/[email protected]
Fixes: 385fad1303af ("dt-bindings: remoteproc: qcom,smd-edge: define re-usable schema for smd-edge")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

dt-bindings: mmc: sdhci-msm: Fix 'operating-points-v2 was unexpected' issue

As Rob reported in [1], there is one more issue present
in the 'sdhci-msm' dt-binding which shows up when a fix for
'unevaluatedProperties' handling is applied:

Documentation/devicetree/bindings/mmc/sdhci-msm.example.dtb:
mmc@8804000: Unevaluated properties are not allowed
('operating-points-v2' was unexpected)

Fix the same.

[1]. https://lore.kernel.org/lkml/20220514220116.1008254 [email protected]/

Cc: Bjorn Andersson <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Ulf Hansson <[email protected]>
Signed-off-by: Bhupesh Sharma <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

dt-bindings: display: simple-framebuffer: Drop Bartlomiej Zolnierkiewicz

Bartlomiej's Samsung email address is not working since around last
year and there was no follow up patch take over of the drivers, so drop
the email from maintainers.

Cc: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

can: mcp251x: Fix race condition on receive interrupt

The mcp251x driver uses both receiving mailboxes of the CAN controller
chips. For retrieving the CAN frames from the controller via SPI, it checks
once per interrupt which mailboxes have been filled and will retrieve the
messages accordingly.

This introduces a race condition, as another CAN frame can enter mailbox 1
while mailbox 0 is emptied. If now another CAN frame enters mailbox 0 until
the interrupt handler is called next, mailbox 0 is emptied before
mailbox 1, leading to out-of-order CAN frames in the network device.

This is fixed by checking the interrupt flags once again after freeing
mailbox 0, to correctly also empty mailbox 1 before leaving the handler.

For reproducing the bug I created the following setup:
- Two CAN devices, one Raspberry Pi with MCP2515, the other can be any.
- Setup CAN to 1 MHz
- Spam bursts of 5 CAN-messages with increasing CAN-ids
- Continue sending the bursts while sleeping a second between the bursts
- Check on the RPi whether the received messages have increasing CAN-ids
- Without this patch, every burst of messages will contain a flipped pair

v3: https://lore.kernel.org/all/20220804075914 [email protected]
v2: https://lore.kernel.org/all/20220804064803 [email protected]
v1: https://lore.kernel.org/all/20220803153300 [email protected]

Fixes: bf66f3736a94 ("can: mcp251x: Move to threaded interrupts instead of workqueues.")
Signed-off-by: Sebastian Würl <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
[mkl: reduce scope of intf1, eflag1]
Signed-off-by: Marc Kleine-Budde <[email protected]>

plip: avoid rcu debug splat

WARNING: suspicious RCU usage
5.2.0-rc2-00605-g2638eb8b50cfc #1 Not tainted
drivers/net/plip/plip.c:1110 suspicious rcu_dereference_check() usage!

plip_open is called with RTNL held, switch to the correct helper.

Fixes: 2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: bgmac: Fix a BUG triggered by wrong bytes_compl

On one of our machines we got:

kernel BUG at lib/dynamic_queue_limits.c:27!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
CPU: 0 PID: 1166 Comm: irq/41-bgmac Tainted: G        W  O    4.14.275-rt132 #1
Hardware name: BRCM XGS iProc
task: ee3415c0 task.stack: ee32a000
PC is at dql_completed+0x168/0x178
LR is at bgmac_poll+0x18c/0x6d8
pc : [<c03b9430>]    lr : [<c04b5a18>]    psr: 800a0313
sp : ee32be14  ip : 000005ea  fp : 00000bd4
r10: ee558500  r9 : c0116298  r8 : 00000002
r7 : 00000000  r6 : ef128810  r5 : 01993267  r4 : 01993851
r3 : ee558000  r2 : 000070e1  r1 : 00000bd4  r0 : ee52c180
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 12c5387d  Table: 8e88c04a  DAC: 00000051
Process irq/41-bgmac (pid: 1166, stack limit = 0xee32a210)
Stack: (0xee32be14 to 0xee32c000)
be00:                                              ee558520 ee52c100 ef128810
be20: 00000000 00000002 c0116298 c04b5a18 00000000 c0a0c8c4 c0951780 00000040
be40: c0701780 ee558500 ee55d520 ef05b340 ef6f9780 ee558520 00000001 00000040
be60: ffffe000 c0a56878 ef6fa040 c0952040 0000012c c0528744 ef6f97b0 fffcfb6a
be80: c0a04104 2eda8000 c0a0c4ec c0a0d368 ee32bf44 c0153534 ee32be98 ee32be98
bea0: ee32bea0 ee32bea0 ee32bea8 ee32bea8 00000000 c01462e4 ffffe000 ef6f22a8
bec0: ffffe000 00000008 ee32bee4 c0147430 ffffe000 c094a2a8 00000003 ffffe000
bee0: c0a54528 00208040 0000000c c0a0c8c4 c0a65980 c0124d3c 00000008 ee558520
bf00: c094a23c c0a02080 00000000 c07a9910 ef136970 ef136970 ee30a440 ef136900
bf20: ee30a440 00000001 ef136900 ee30a440 c016d990 00000000 c0108db0 c012500c
bf40: ef136900 c016da14 ee30a464 ffffe000 00000001 c016dd14 00000000 c016db28
bf60: ffffe000 ee21a080 ee30a400 00000000 ee32a000 ee30a440 c016dbfc ee25fd70
bf80: ee21a09c c013edcc ee32a000 ee30a400 c013ec7c 00000000 00000000 00000000
bfa0: 00000000 00000000 00000000 c0108470 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c03b9430>] (dql_completed) from [<c04b5a18>] (bgmac_poll+0x18c/0x6d8)
[<c04b5a18>] (bgmac_poll) from [<c0528744>] (net_rx_action+0x1c4/0x494)
[<c0528744>] (net_rx_action) from [<c0124d3c>] (do_current_softirqs+0x1ec/0x43c)
[<c0124d3c>] (do_current_softirqs) from [<c012500c>] (__local_bh_enable+0x80/0x98)
[<c012500c>] (__local_bh_enable) from [<c016da14>] (irq_forced_thread_fn+0x84/0x98)
[<c016da14>] (irq_forced_thread_fn) from [<c016dd14>] (irq_thread+0x118/0x1c0)
[<c016dd14>] (irq_thread) from [<c013edcc>] (kthread+0x150/0x158)
[<c013edcc>] (kthread) from [<c0108470>] (ret_from_fork+0x14/0x24)
Code: a83f15e0 0200001a 0630a0e1 c3ffffea (f201f0e7)

The issue seems similar to commit 90b3b339364c ("net: hisilicon: Fix a BUG
trigered by wrong bytes_compl") and potentially introduced by commit
b38c83dd0866 ("bgmac: simplify tx ring index handling").

If there is an RX interrupt between setting ring->end
and netdev_sent_queue() we can hit the BUG_ON as bgmac_dma_tx_free()
can miscalculate the queue size while called from bgmac_poll().

The machine which triggered the BUG runs a v4.14 RT kernel - but the issue
seems present in mainline too.

Fixes: b38c83dd0866 ("bgmac: simplify tx ring index handling")
Signed-off-by: Sandor Bodo-Merle <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: felix: suppress non-changes to the tagging protocol

The way in which dsa_tree_change_tag_proto() works is that when
dsa_tree_notify() fails, it doesn't know whether the operation failed
mid way in a multi-switch tree, or it failed for a single-switch tree.
So even though drivers need to fail cleanly in
ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify()
again, to restore the old tag protocol for potential switches in the
tree where the change did succeeed (before failing for others).

This means for the felix driver that if we report an error in
felix_change_tag_protocol(), we'll get another call where proto_ops ==
old_proto_ops. If we proceed to act upon that, we may do unexpected
things. For example, we will call dsa_tag_8021q_register() twice in a
row, without any dsa_tag_8021q_unregister() in between. Then we will
actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown,
which (if it manages to run at all, after walking through corrupted data
structures) will leave the ports inoperational anyway.

The bug can be readily reproduced if we force an error while in
tag_8021q mode; this crashes the kernel.

echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
echo edsa > /sys/class/net/eno2/dsa/tagging # -EPROTONOSUPPORT

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
Call trace:
vcap_entry_get+0x24/0x124
ocelot_vcap_filter_del+0x198/0x270
felix_tag_8021q_vlan_del+0xd4/0x21c
dsa_switch_tag_8021q_vlan_del+0x168/0x2cc
dsa_switch_event+0x68/0x1170
dsa_tree_notify+0x14/0x34
dsa_port_tag_8021q_vlan_del+0x84/0x110
dsa_tag_8021q_unregister+0x15c/0x1c0
felix_tag_8021q_teardown+0x16c/0x180
felix_change_tag_protocol+0x1bc/0x230
dsa_switch_event+0x14c/0x1170
dsa_tree_change_tag_proto+0x118/0x1c0

Fixes: 7a29d220f4c0 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'wireless-2022-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v6.0

First set of fixes for v6.0. Small one this time, fix a cfg80211
warning seen with brcmfmac and remove an unncessary inline keyword
from wilc1000.

* tag 'wireless-2022-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: wilc1000: fix spurious inline in wilc_handle_disconnect()
wifi: cfg80211: Fix validating BSS pointers in __cfg80211_connect_result
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

netfilter: nf_tables: fix null deref due to zeroed list head

In nf_tables_updtable, if nf_tables_table_enable returns an error,
nft_trans_destroy is called to free the transaction object.

nft_trans_destroy() calls list_del(), but the transaction was never
placed on a list -- the list head is all zeroes, this results in
a null dereference:

BUG: KASAN: null-ptr-deref in nft_trans_destroy+0x26/0x59
Call Trace:
nft_trans_destroy+0x26/0x59
nf_tables_newtable+0x4bc/0x9bc
[..]

Its sane to assume that nft_trans_destroy() can be called
on the transaction object returned by nft_trans_alloc(), so
make sure the list head is initialised.

Fixes: 55dd6f93076b ("netfilter: nf_tables: use new transaction infrastructure to handle table")
Reported-by: mingi cho <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: disallow jump to implicit chain from set element

Extend struct nft_data_desc to add a flag field that specifies
nft_data_init() is being called for set element data.

Use it to disallow jump to implicit chain from set element, only jump
to chain via immediate expression is allowed.

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: upfront validation of data via nft_data_init()

Instead of parsing the data and then validate that type and length are
correct, pass a description of the expected data so it can be validated
upfront before parsing it to bail out earlier.

This patch adds a new .size field to specify the maximum size of the
data area. The .len field is optional and it is used as an input/output
field, it provides the specific length of the expected data in the input
path. If then .len field is not specified, then obtained length from the
netlink attribute is stored. This is required by cmp, bitwise, range and
immediate, which provide no netlink attribute that describes the data
length. The immediate expression uses the destination register type to
infer the expected data type.

Relying on opencoded validation of the expected data might lead to
subtle bugs as described in 7e6bc1f6cabc ("netfilter: nf_tables:
stricter validation of element data").

Signed-off-by: Pablo Neira Ayuso <[email protected]>

NFS: Improve readpage/writepage tracing

Switch formatting to better match that used by other NFS tracepoints.

Signed-off-by: Trond Myklebust <[email protected]>

NFS: Improve O_DIRECT tracing

Switch the formatting to match the other NFS tracepoints.

Signed-off-by: Trond Myklebust <[email protected]>

NFS: Improve write error tracing

Don't leak request pointers, but use the "device:inode" labelling that
is used by all the other trace points. Furthermore, replace use of page
indexes with an offset, again in order to align behaviour with other
NFS trace points.

Signed-off-by: Trond Myklebust <[email protected]>

posix-cpu-timers: Cleanup CPU timers before freeing them during exec

Commit 55e8c8eb2c7b ("posix-cpu-timers: Store a reference to a pid not a
task") started looking up tasks by PID when deleting a CPU timer.

When a non-leader thread calls execve, it will switch PIDs with the leader
process. Then, as it calls exit_itimers, posix_cpu_timer_del cannot find
the task because the timer still points out to the old PID.

That means that armed timers won't be disarmed, that is, they won't be
removed from the timerqueue_list. exit_itimers will still release their
memory, and when that list is later processed, it leads to a
use-after-free.

Clean up the timers from the de-threaded task before freeing them. This
prevents a reported use-after-free.

Fixes: 55e8c8eb2c7b ("posix-cpu-timers: Store a reference to a pid not a task")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

time: Correct the prototype of ns_to_kernel_old_timeval and ns_to_timespec64

In ns_to_kernel_old_timeval() definition, the function argument is defined
with const identifier in kernel/time/time.c, but the prototype in
include/linux/time32.h looks different.

- The function is defined in kernel/time/time.c as below:
struct __kernel_old_timeval ns_to_kernel_old_timeval(const s64 nsec)

- The function is decalared in include/linux/time32.h as below:
extern struct __kernel_old_timeval ns_to_kernel_old_timeval(s64 nsec);

Because the variable of arithmethic types isn't modified in the calling scope,
there's no need to mark arguments as const, which was already mentioned during
review (Link[1) of the original patch.

Likewise remove the "const" keyword in both definition and declaration of
ns_to_timespec64() as requested by Arnd (Link[2]).

Fixes: a84d1169164b ("y2038: Introduce struct __kernel_old_timeval")
Signed-off-by: Youngmin Nam <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Link[1]: https://lore.kernel.org/all/20180310081123 [email protected]/
Link[2]: https://lore.kernel.org/all/CAK8P3a3nknJgEDESGdJH91jMj6R_xydFqWASd8r5BbesdvMBgA@mail.gmail.com/

netfilter: ip6t_LOG: Fix a typo in a comment

s/_IPT_LOG_H/_IP6T_LOG_H/

While at it add some surrounding space to ease reading.

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: do not allow RULE_ID to refer to another chain

When doing lookups for rules on the same batch by using its ID, a rule from
a different chain can be used. If a rule is added to a chain but tries to
be positioned next to a rule from a different chain, it will be linked to
chain2, but the use counter on chain1 would be the one to be incremented.

When looking for rules by ID, use the chain that was used for the lookup by
name. The chain used in the context copied to the transaction needs to
match that same chain. That way, struct nft_rule does not need to get
enlarged with another member.

Fixes: 1a94e38d254b ("netfilter: nf_tables: add NFTA_RULE_ID attribute")
Fixes: 75dd48e2e420 ("netfilter: nf_tables: Support RULE_ID reference in new rule")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Cc: <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: do not allow CHAIN_ID to refer to another table

When doing lookups for chains on the same batch by using its ID, a chain
from a different table can be used. If a rule is added to a table but
refers to a chain in a different table, it will be linked to the chain in
table2, but would have expressions referring to objects in table1.

Then, when table1 is removed, the rule will not be removed as its linked to
a chain in table2. When expressions in the rule are processed or removed,
that will lead to a use-after-free.

When looking for chains by ID, use the table that was used for the lookup
by name, and only return chains belonging to that same table.

Fixes: 837830a4b439 ("netfilter: nf_tables: add NFTA_RULE_CHAIN_ID attribute")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Cc: <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: do not allow SET_ID to refer to another table

When doing lookups for sets on the same batch by using its ID, a set from a
different table can be used.

Then, when the table is removed, a reference to the set may be kept after
the set is freed, leading to a potential use-after-free.

When looking for sets by ID, use the table that was used for the lookup by
name, and only return sets belonging to that same table.

This fixes CVE-2022-2586, also reported as ZDI-CAN-17470.

Reported-by: Team Orca of Sea Security (@seasecresponse)
Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Cc: <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_tables: validate variable length element extension

Update template to validate variable length extensions. This patch adds
a new .ext_len[id] field to the template to store the expected extension
length. This is used to sanity check the initialization of the variable
length extension.

Use PTR_ERR() in nft_set_elem_init() to report errors since, after this
update, there are two reason why this might fail, either because of
ENOMEM or insufficient room in the extension field (EINVAL).

Kernels up until 7e6bc1f6cabc ("netfilter: nf_tables: stricter
validation of element data") allowed to copy more data to the extension
than was allocated. This ext_len field allows to validate if the
destination has the correct size as additional check.

Signed-off-by: Pablo Neira Ayuso <[email protected]>

ACPI: property: Fix error handling in acpi_init_properties()

buf.pointer, memory for storing _DSD data and nodes, was released if either
parsing properties or, as recently added, attaching data node tags failed.
Alas, properties were still left pointing to this memory if parsing
properties were successful but attaching data node tags failed.

Fix this by separating error handling for the two, and leaving properties
intact if data nodes cannot be tagged for a reason or another.

Reported-by: kernel test robot <[email protected]>
Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles")
Signed-off-by: Sakari Ailus <[email protected]>
[ rjw: Drop unrelated white space change ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

Merge tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull fscache updates from David Howells:

- Fix a cookie access ref leak if a cookie is invalidated a second time
   before the first invalidation is actually processed.

- Add a tracepoint to log cookie lookup failure

* tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: add tracepoint when failing cookie
  fscache: don't leak cookie access refs if invalidation is in progress or failed

Merge tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:
"Fix AFS refcount handling.

  The first patch converts afs to use refcount_t for its refcounts and
  the second patch fixes afs_put_call() and afs_put_server() to save the
  values they're going to log in the tracepoint before decrementing the
  refcount"

* tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix access after dec in put functions
  afs: Use refcount_t rather than atomic_t

Merge tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull setgid updates from Christian Brauner:
"This contains the work to move setgid stripping out of individual
  filesystems and into the VFS itself.

  Creating files that have both the S_IXGRP and S_ISGID bit raised in
  directories that themselves have the S_ISGID bit set requires
  additional privileges to avoid security issues.

  When a filesystem creates a new inode it needs to take care that the
  caller is either in the group of the newly created inode or they have
  CAP_FSETID in their current user namespace and are privileged over the
  parent directory of the new inode. If any of these two conditions is
  true then the S_ISGID bit can be raised for an S_IXGRP file and if not
  it needs to be stripped.

  However, there are several key issues with the current implementation:

   - S_ISGID stripping logic is entangled with umask stripping.

     For example, if the umask removes the S_IXGRP bit from the file
     about to be created then the S_ISGID bit will be kept.

     The inode_init_owner() helper is responsible for S_ISGID stripping
     and is called before posix_acl_create(). So we can end up with two
     different orderings:

     1. FS without POSIX ACL support

        First strip umask then strip S_ISGID in inode_init_owner().

        In other words, if a filesystem doesn't support or enable POSIX
        ACLs then umask stripping is done directly in the vfs before
        calling into the filesystem:

     2. FS with POSIX ACL support

        First strip S_ISGID in inode_init_owner() then strip umask in
        posix_acl_create().

        In other words, if the filesystem does support POSIX ACLs then
        unmask stripping may be done in the filesystem itself when
        calling posix_acl_create().

     Note that technically filesystems are free to impose their own
     ordering between posix_acl_create() and inode_init_owner() meaning
     that there's additional ordering issues that influence S_ISGID
     inheritance.

     (Note that the commit message of commit 1639a49ccdce ("fs: move
     S_ISGID stripping into the vfs_*() helpers") gets the ordering
     between inode_init_owner() and posix_acl_create() the wrong way
     around. I realized this too late.)

   - Filesystems that don't rely on inode_init_owner() don't get S_ISGID
     stripping logic.

     While that may be intentional (e.g. network filesystems might just
     defer setgid stripping to a server) it is often just a security
     issue.

     Note that mandating the use of inode_init_owner() was proposed as
     an alternative solution but that wouldn't fix the ordering issues
     and there are examples such as afs where the use of
     inode_init_owner() isn't possible.

     In any case, we should also try the cleaner and generalized
     solution first before resorting to this approach.

   - We still have S_ISGID inheritance bugs years after the initial
     round of S_ISGID inheritance fixes:

       e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes")
       01ea173e103e ("xfs: fix up non-directory creation in SGID directories")
       fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories")

  All of this led us to conclude that the current state is too messy.
  While we won't be able to make it completely clean as
  posix_acl_create() is still a filesystem specific call we can improve
  the S_SIGD stripping situation quite a bit by hoisting it out of
  inode_init_owner() and into the respective vfs creation operations.

  The obvious advantage is that we don't need to rely on individual
  filesystems getting S_ISGID stripping right and instead can
  standardize the ordering between S_ISGID and umask stripping directly
  in the VFS.

  A few short implementation notes:

   - The stripping logic needs to happen in vfs_*() helpers for the sake
     of stacking filesystems such as overlayfs that rely on these
     helpers taking care of S_ISGID stripping.

   - Security hooks have never seen the mode as it is ultimately seen by
     the filesystem because of the ordering issue we mentioned. Nothing
     is changed for them. We simply continue to strip the umask before
     passing the mode down to the security hooks.

   - The following filesystems use inode_init_owner() and thus relied on
     S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs,
     hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs,
     overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs,
     bpf, tmpfs.

     We've audited all callchains as best as we could. More details can
     be found in the commit message to 1639a49ccdce ("fs: move S_ISGID
     stripping into the vfs_*() helpers")"

* tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  ceph: rely on vfs for setgid stripping
  fs: move S_ISGID stripping into the vfs_*() helpers
  fs: Add missing umask strip in vfs_tmpfile
  fs: add mode_strip_sgid() helper

Merge tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:

- An optimization in memblock_add_range() to reduce array traversals

- Improvements to the memblock test suite

* tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock test: Modify the obsolete description in README
  memblock tests: fix compilation errors
  memblock tests: change build options to run-time options
  memblock tests: remove completed TODO items
  memblock tests: set memblock_debug to enable memblock_dbg() messages
  memblock tests: add verbose output to memblock tests
  memblock tests: Makefile: add arguments to control verbosity
  memblock: avoid some repeat when add new range

drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error

Use ww_acquire_fini() in the error code paths. Otherwise lockdep
thinks that lock is held when lock's memory is freed after the
drm_gem_lock_reservations() error. The ww_acquire_context needs to be
annotated as "released", which fixes the noisy "WARNING: held lock freed!"
splat of VirtIO-GPU driver with CONFIG_DEBUG_MUTEXES=y and enabled lockdep.

Cc: [email protected]
Fixes: 7edc3e3b975b5 ("drm: Add helpers for locking an array of BO reservations.")
Reviewed-by: Thomas Hellström <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull m68knommu fixes from Greg Ungerer:

- spelling in comment

- compilation when flexcan driver enabled

- sparse warning

* tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: Fix syntax errors in comments
  m68k: coldfire: make symbol m523x_clk_lookup static
  m68k: coldfire/device.c: protect FLEXCAN blocks

Merge tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 eIBRS fixes from Borislav Petkov:
"More from the CPU vulnerability nightmares front:

  Intel eIBRS machines do not sufficiently mitigate against RET
  mispredictions when doing a VM Exit therefore an additional RSB,
  one-entry stuffing is needed"

* tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add LFENCE to RSB fill sequence
  x86/speculation: Add RSB VM Exit protections

drm/shmem-helper: Add missing vunmap on error

The vmapping of dma-buf may succeed, but DRM SHMEM rejects the IOMEM
mapping, and thus, drm_gem_shmem_vmap_locked() should unvmap the IOMEM
before erroring out.

Cc: [email protected]
Fixes: 49a3f51dfeee ("drm/gem: Use struct dma_buf_map in GEM vmap ops and convert GEM backends")
Signed-off-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

ntb: intel: add GNR support for Intel PCIe gen5 NTB

Add Intel Granite Rapids NTB PCI device ID and related enabling.
Expectation is same hardware interface as Saphire Rapids Xeon platforms.

Signed-off-by: Dave Jiang <[email protected]>
Acked-by: Allen Hubbe <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

NTB: ntb_tool: uninitialized heap data in tool_fn_write()

The call to:

ret = simple_write_to_buffer(buf, size, offp, ubuf, size);

will return success if it is able to write even one byte to "buf".
The value of "*offp" controls which byte. This could result in
reading uninitialized data when we do the sscanf() on the next line.

This code is not really desigined to handle partial writes where
*offp is non-zero and the "buf" is preserved and re-used between writes.
Just ban partial writes and replace the simple_write_to_buffer() with
copy_from_user().

Fixes: 578b881ba9c4 ("NTB: Add tool test client")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

ntb: idt: fix clang -Wformat warnings

When building with Clang we encounter these warnings:
| drivers/ntb/hw/idt/ntb_hw_idt.c:2409:28: error: format specifies type
| 'unsigned char' but the argument has type 'int' [-Werror,-Wformat]
| "\t%hhu-%hhu.\t", idx + cnt - 1);
-
| drivers/ntb/hw/idt/ntb_hw_idt.c:2438:29: error: format specifies type
| 'unsigned char' but the argument has type 'int' [-Werror,-Wformat]
| "\t%hhu-%hhu.\t", idx + cnt - 1);
-
| drivers/ntb/hw/idt/ntb_hw_idt.c:2484:15: error: format specifies type
| 'unsigned char' but the argument has type 'int' [-Werror,-Wformat], src);

For the first two warnings the format specifier used is `%hhu` which
describes a u8. Both `idx` and `cnt` are u8 as well. However, the
expression as a whole is promoted to an int as you cannot get
smaller-than-int from addition. Therefore, to fix the warning, use the
promoted-to-type's format specifier -- in this case `%d`.

example:
``
uint8_t a = 4, b = 7;
int size = sizeof(a + b - 1);
printf("%d\n", size);
// output: 4
```

For the last warning, src is of type `int` while the format specifier
describes a u8. The fix here is just to use the proper specifier `%d`.

See more:
(https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+Understand+integer+conversion+rules)
"Integer types smaller than int are promoted when an operation is
performed on them. If all values of the original type can be represented
as an int, the value of the smaller type is converted to an int;
otherwise, it is converted to an unsigned int."

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <[email protected]>
Acked-by: Serge Semin <[email protected]>
Signed-off-by: Jon Mason <[email protected]>

ALSA: hda/realtek: Add a quirk for HP OMEN 15 (8786) mute LED

Board ID 8786 seems to be another variant of the Omen 15 that needs
ALC285_FIXUP_HP_MUTE_LED for working mute LED.

Signed-off-by: Bedant Patnaik <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

fscache: add tracepoint when failing cookie

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: David Howells <[email protected]>

fscache: don't leak cookie access refs if invalidation is in progress or failed

It's possible for a request to invalidate a fscache_cookie will come in
while we're already processing an invalidation. If that happens we
currently take an extra access reference that will leak. Only call
__fscache_begin_cookie_access if the FSCACHE_COOKIE_DO_INVALIDATE bit
was previously clear.

Also, ensure that we attempt to clear the bit when the cookie is
"FAILED" and put the reference to avoid an access leak.

Fixes: 85e4ea1049c7 ("fscache: Fix invalidation/lookup race")
Suggested-by: David Howells <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: David Howells <[email protected]>

ALSA: usb-audio: More comprehensive mixer map for ASUS ROG Zenith II

ASUS ROG Zenith II has two USB interfaces, one for the front headphone
and another for the rest I/O. Currently we provided the mixer mapping
for the latter but with an incomplete form.

This patch corrects and provides more comprehensive mixer mapping, as
well as providing the proper device names for both the front headphone
and main audio.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211005
Fixes: 2a48218f8e23 ("ALSA: usb-audio: Add mixer workaround for TRX40 and co")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: scarlett2: Add Focusrite Clarett+ 8Pre support

The Focusrite Clarett+ 8Pre uses the same protocol as the Scarlett Gen
2 and Gen 3 product range. This patch adds support for the Clarett+
8Pre by adding appropriate entries to the scarlett2 driver.

The Clarett+ 2Pre and 4Pre, and the Clarett USB product line
presumably use the same protocol as well, so support for them can
easily be added if someone can test.

Signed-off-by: Christian Colglazier <[email protected]>
Signed-off-by: Geoffrey D. Bennett <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

can: ems_usb: fix clang's -Wunaligned-access warning

clang emits a -Wunaligned-access warning on struct __packed
ems_cpc_msg.

The reason is that the anonymous union msg (not declared as packed) is
being packed right after some non naturally aligned variables (3*8
bits + 2*32) inside a packed struct:

| struct __packed ems_cpc_msg {
| u8 type; /* type of message */
| u8 length; /* length of data within union 'msg' */
| u8 msgid; /* confirmation handle */
| __le32 ts_sec; /* timestamp in seconds */
| __le32 ts_nsec; /* timestamp in nano seconds */
| /* ^ not naturally aligned */
|
| union {
| /* ^ not declared as packed */
| u8 generic[64];
| struct cpc_can_msg can_msg;
| struct cpc_can_params can_params;
| struct cpc_confirm confirmation;
| struct cpc_overrun overrun;
| struct cpc_can_error error;
| struct cpc_can_err_counter err_counter;
| u8 can_state;
| } msg;
| };

Starting from LLVM 14, having an unpacked struct nested in a packed
struct triggers a warning. c.f. [1].

Fix the warning by marking the anonymous union as packed.

[1] https://github.com/llvm/llvm-project/issues/55520

Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
Link: https://lore.kernel.org/all/[email protected]
Cc: Gerhard Uttenthaler <[email protected]>
Cc: Sebastian Haas <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: j1939: j1939_session_destroy(): fix memory leak of skbs

We need to drop skb references taken in j1939_session_skb_queue() when
destroying a session in j1939_session_destroy(). Otherwise those skbs
would be lost.

Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

V1: https://lore.kernel.org/all/20220708175949 [email protected]

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Suggested-by: Oleksij Rempel <[email protected]>
Signed-off-by: Fedor Pchelkin <[email protected]>
Signed-off-by: Alexey Khoroshilov <[email protected]>
Acked-by: Oleksij Rempel <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: j1939: j1939_sk_queue_activate_next_locked(): replace WARN_ON_ONCE with netdev_warn_once()

We should warn user-space that it is doing something wrong when trying
to activate sessions with identical parameters but WARN_ON_ONCE macro
can not be used here as it serves a different purpose.

So it would be good to replace it with netdev_warn_once() message.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Fedor Pchelkin <[email protected]>
Signed-off-by: Alexey Khoroshilov <[email protected]>
Acked-by: Oleksij Rempel <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
[mkl: fix indention]
Signed-off-by: Marc Kleine-Budde <[email protected]>

Merge tag 'for-net-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

- Fixes various issues related to ISO channel/socket support
- Fixes issues when building with C=1
- Fix cancel uninitilized work which blocks syzbot to run

* tag 'for-net-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: ISO: Fix not using the correct QoS
  Bluetooth: don't try to cancel uninitialized works at mgmt_index_removed()
  Bluetooth: ISO: Fix iso_sock_getsockopt for BT_DEFER_SETUP
  Bluetooth: MGMT: Fixes build warnings with C=1
  Bluetooth: hci_event: Fix build warning with C=1
  Bluetooth: ISO: Fix memory corruption
  Bluetooth: Fix null pointer deref on unexpected status event
  Bluetooth: ISO: Fix info leak in iso_sock_getsockopt()
  Bluetooth: hci_conn: Fix updating ISO QoS PHY
  Bluetooth: ISO: unlock on error path in iso_sock_setsockopt()
  Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

s390/qeth: cache link_info for ethtool

Since
commit e6e771b3d897 ("s390/qeth: detach netdevice while card is offline")
there was a timing window during recovery, that qeth_query_card_info could
be sent to the card, even before it was ready for it, leading to a failing
card recovery. There is evidence that this window was hit, as not all
callers of get_link_ksettings() check for netif_device_present.

Use cached values in qeth_get_link_ksettings(), instead of calling
qeth_query_card_info() and falling back to default values in case it
fails. Link info is already updated when the card goes online, e.g. after
STARTLAN (physical link up). Set the link info to default values, when the
card goes offline or at STOPLAN (physical link down). A follow-on patch
will improve values reported for link down.

Fixes: e6e771b3d897 ("s390/qeth: detach netdevice while card is offline")
Signed-off-by: Alexandra Winter <[email protected]>
Reviewed-by: Thorsten Winkler <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: phy: dp83867: fix get nvmem cell fail

If CONFIG_NVMEM is not set of_nvmem_cell_get, of_nvmem_device_get
functions will return ERR_PTR(-EOPNOTSUPP) and "failed to get nvmem
cell io_impedance_ctrl" error would be reported despite "io_impedance_ctrl"
is completely missing in Device Tree and we should use default values.

Check -EOPNOTSUPP togather with -ENOENT to avoid this situation.

Fixes: 5c2d0a6a0701 ("net: phy: dp83867: implement support for io_impedance_ctrl nvmem cell")
Signed-off-by: Nikita Shubin <[email protected]>
Acked-by: Rasmus Villemoes <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: phy: c45 baset1: do not skip aneg configuration if clock role is not specified

In case master/slave clock role is not specified (which is default), the
aneg registers will not be written.

The visible impact of this is missing pause advertisement.

So, rework genphy_c45_baset1_an_config_aneg() to be able to write
advertisement registers even if clock role is unknown.

Fixes: 3da8ffd8545f ("net: phy: Add 10BASE-T1L support in phy-c45")
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

atm: idt77252: fix use-after-free bugs caused by tst_timer

There are use-after-free bugs caused by tst_timer. The root cause
is that there are no functions to stop tst_timer in idt77252_exit().
One of the possible race conditions is shown below:

    (thread 1)          |        (thread 2)
                        |  idt77252_init_one
                        |    init_card
                        |      fill_tst
                        |        mod_timer(&card->tst_timer, ...)
idt77252_exit           |  (wait a time)
                        |  tst_timer
                        |
                        |    ...
  kfree(card) // FREE   |
                        |    card->soft_tst[e] // USE

The idt77252_dev is deallocated in idt77252_exit() and used in
timer handler.

This patch adds del_timer_sync() in idt77252_exit() in order that
the timer handler could be stopped before the idt77252_dev is
deallocated.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: felix: fix min gate len calculation for tc when its first gate is closed

min_gate_len[tc] is supposed to track the shortest interval of
continuously open gates for a traffic class. For example, in the
following case:

TC 76543210

t0 00000001b 200000 ns
t1 00000010b 200000 ns

min_gate_len[0] and min_gate_len[1] should be 200000, while
min_gate_len[2-7] should be 0.

However what happens is that min_gate_len[0] is 200000, but
min_gate_len[1] ends up being 0 (despite gate_len[1] being 200000 at the
point where the logic detects the gate close event for TC 1).

The problem is that the code considers a "gate close" event whenever it
sees that there is a 0 for that TC (essentially it's level rather than
edge triggered). By doing that, any time a gate is seen as closed
without having been open prior, gate_len, which is 0, will be written
into min_gate_len. Once min_gate_len becomes 0, it's impossible for it
to track anything higher than that (the length of actually open
intervals).

To fix this, we make the writing to min_gate_len[tc] be edge-triggered,
which avoids writes for gates that are closed in consecutive intervals.
However what this does is it makes us need to special-case the
permanently closed gates at the end.

Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port")
Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net/x25: fix call timeouts in blocking connects

When a userspace application starts a blocking connect(), a CALL REQUEST
is sent, the t21 timer is started and the connect is waiting in
x25_wait_for_connection_establishment(). If then for some reason the t21
timer expires before any reaction on the assigned logical channel (e.g.
CALL ACCEPT, CLEAR REQUEST), there is sent a CLEAR REQUEST and timer
t23 is started waiting for a CLEAR confirmation. If we now receive a
CLEAR CONFIRMATION from the peer, x25_disconnect() is called in
x25_state2_machine() with reason "0", which means "normal" call
clearing. This is ok, but the parameter "reason" is used as sk->sk_err
in x25_disconnect() and sock_error(sk) is evaluated in
x25_wait_for_connection_establishment() to check if the call is still
pending. As "0" is not rated as an error, the connect will stuck here
forever.

To fix this situation, also check if the sk->sk_state changed form
TCP_SYN_SENT to TCP_CLOSE in the meantime, which is also done by
x25_disconnect().

Signed-off-by: Martin Schiller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'tsnep-two-fixes-for-the-driver'

Gerhard Engleder says:

====================
tsnep: Two fixes for the driver

Two simple bugfixes for tsnep driver.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tsnep: Fix tsnep_tx_unmap() error path usage

If tsnep_tx_map() fails, then tsnep_tx_unmap() shall start at the write
index like tsnep_tx_map(). This is different to the normal operation.
Thus, add an additional parameter to tsnep_tx_unmap() to enable start at
different positions for successful TX and failed TX.

Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver")
Signed-off-by: Gerhard Engleder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

tsnep: Fix unused warning for 'tsnep_of_match'

Kernel test robot found the following warning:

drivers/net/ethernet/engleder/tsnep_main.c:1254:34: warning:
'tsnep_of_match' defined but not used [-Wunused-const-variable=]

of_match_ptr() compiles into NULL if CONFIG_OF is disabled.
tsnep_of_match exists always so use of of_match_ptr() is useless.
Fix warning by dropping of_match_ptr().

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Gerhard Engleder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull ksmbd updates from Steve French:

- fixes for memory access bugs (out of bounds access, oops, leak)

- multichannel fixes

- session disconnect performance improvement, and session register
   improvement

- cleanup

* tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix heap-based overflow in set_ntacl_dacl()
  ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT
  ksmbd: prevent out of bound read for SMB2_WRITE
  ksmbd: fix use-after-free bug in smb2_tree_disconect
  ksmbd: fix memory leak in smb2_handle_negotiate
  ksmbd: fix racy issue while destroying session on multichannel
  ksmbd: use wait_event instead of schedule_timeout()
  ksmbd: fix kernel oops from idr_remove()
  ksmbd: add channel rwlock
  ksmbd: replace sessions list in connection with xarray
  MAINTAINERS: ksmbd: add entry for documentation
  ksmbd: remove unused ksmbd_share_configs_cleanup function

Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull more iov_iter updates from Al Viro:

- more new_sync_{read,write}() speedups - ITER_UBUF introduction

- ITER_PIPE cleanups

- unification of iov_iter_get_pages/iov_iter_get_pages_alloc and
   switching them to advancing semantics

- making ITER_PIPE take high-order pages without splitting them

- handling copy_page_from_iter() for high-order pages properly

* tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
  fix copy_page_from_iter() for compound destinations
  hugetlbfs: copy_page_to_iter() can deal with compound pages
  copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
  expand those iov_iter_advance()...
  pipe_get_pages(): switch to append_pipe()
  get rid of non-advancing variants
  ceph: switch the last caller of iov_iter_get_pages_alloc()
  9p: convert to advancing variant of iov_iter_get_pages_alloc()
  af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
  iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
  block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: saner helper for page array allocation
  fold __pipe_get_pages() into pipe_get_pages()
  ITER_XARRAY: don't open-code DIV_ROUND_UP()
  unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
  unify xarray_get_pages() and xarray_get_pages_alloc()
  unify pipe_get_pages() and pipe_get_pages_alloc()
  iov_iter_get_pages(): sanity-check arguments
  iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
  ...

fix copy_page_from_iter() for compound destinations

had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when
ITER_BVEC had first appeared)...

Signed-off-by: Al Viro <[email protected]>

hugetlbfs: copy_page_to_iter() can deal with compound pages

... since April 2021

Signed-off-by: Al Viro <[email protected]>

copy_page_to_iter(): don't split high-order page in case of ITER_PIPE

... just shove it into one pipe_buffer.

Signed-off-by: Al Viro <[email protected]>

expand those iov_iter_advance()...

Signed-off-by: Al Viro <[email protected]>

pipe_get_pages(): switch to append_pipe()

now that we are advancing the iterator, there's no need to
treat the first page separately - just call append_pipe()
in a loop.

Signed-off-by: Al Viro <[email protected]>

get rid of non-advancing variants

mechanical change; will be further massaged in subsequent commits

Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

ceph: switch the last caller of iov_iter_get_pages_alloc()

here nothing even looks at the iov_iter after the call, so we couldn't
care less whether it advances or not.

Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

9p: convert to advancing variant of iov_iter_get_pages_alloc()

that one is somewhat clumsier than usual and needs serious testing.

Signed-off-by: Al Viro <[email protected]>

af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()

... and adjust the callers

Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>

iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()

... and untangle the cleanup on failure to add into pipe.

Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>