Git Repo - linux.git/log

sctp: fix memory leak in sctp_stream_outq_migrate()

When sctp_stream_outq_migrate() is called to release stream out resources,
the memory pointed to by prio_head in stream out is not released.

The memory leak information is as follows:
unreferenced object 0xffff88801fe79f80 (size 64):
   comm "sctp_repo", pid 7957, jiffies 4294951704 (age 36.480s)
   hex dump (first 32 bytes):
     80 9f e7 1f 80 88 ff ff 80 9f e7 1f 80 88 ff ff  ................
     90 9f e7 1f 80 88 ff ff 90 9f e7 1f 80 88 ff ff  ................
   backtrace:
     [<ffffffff81b215c6>] kmalloc_trace+0x26/0x60
     [<ffffffff88ae517c>] sctp_sched_prio_set+0x4cc/0x770
     [<ffffffff88ad64f2>] sctp_stream_init_ext+0xd2/0x1b0
     [<ffffffff88aa2604>] sctp_sendmsg_to_asoc+0x1614/0x1a30
     [<ffffffff88ab7ff1>] sctp_sendmsg+0xda1/0x1ef0
     [<ffffffff87f765ed>] inet_sendmsg+0x9d/0xe0
     [<ffffffff8754b5b3>] sock_sendmsg+0xd3/0x120
     [<ffffffff8755446a>] __sys_sendto+0x23a/0x340
     [<ffffffff87554651>] __x64_sys_sendto+0xe1/0x1b0
     [<ffffffff89978b49>] do_syscall_64+0x39/0xb0
     [<ffffffff89a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Link: https://syzkaller.appspot.com/bug?exrid=29c402e56c4760763cc0
Fixes: 637784ade221 ("sctp: introduce priority based stream scheduler")
Reported-by: [email protected]
Signed-off-by: Zhengchao Shao <[email protected]>
Reviewed-by: Xin Long <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE

CHECKSUM_COMPLETE signals that skb->csum stores the sum over the
entire packet. It does not imply that an embedded l4 checksum
field has been validated.

Fixes: 682f048bd494 ("af_packet: pass checksum validation status to the user")
Signed-off-by: Willem de Bruijn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net/mlx5: Lag, Fix for loop when checking lag

The cited commit adds a for loop to check if each port supports lag
or not. But dev is not initialized correctly. Fix it by initializing
dev for each iteration.

Fixes: e87c6a832f88 ("net/mlx5: E-switch, Fix duplicate lag creation")
Signed-off-by: Chris Mi <[email protected]>
Reported-by: Jacob Keller <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path"

This reverts commit c0071be0e16c461680d87b763ba1ee5e46548fde.

The cited commit removed the validity checks which initialized the
window_sz and never removed the use of the now uninitialized variable,
so now we are left with wrong value in the window size and the following
clang warning: [-Wuninitialized]
drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c:232:45:
warning: variable 'window_sz' is uninitialized when used here
MLX5_SET(macsec_aso, aso_ctx, window_size, window_sz);

Revet at this time to address the clang issue due to lack of time to
test the proper solution.

Fixes: c0071be0e16c ("net/mlx5e: MACsec, remove replay window size limitation in offload path")
Signed-off-by: Saeed Mahameed <[email protected]>
Reported-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: marvell: prestera: Fix a NULL vs IS_ERR() check in some functions

rhashtable_lookup_fast() returns NULL when failed instead of error
pointer.

Fixes: 396b80cb5cc8 ("net: marvell: prestera: Add neighbour cache accounting")
Fixes: 0a23ae237171 ("net: marvell: prestera: Add router nexthops ABI")
Signed-off-by: Shang XiaoJing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

net: tun: Fix use-after-free in tun_detach()

syzbot reported use-after-free in tun_detach() [1]. This causes call
trace like below:

==================================================================
BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673

CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:284 [inline]
print_report+0x15e/0x461 mm/kasan/report.c:395
kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
call_netdevice_notifiers net/core/dev.c:1997 [inline]
netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
tun_detach drivers/net/tun.c:704 [inline]
tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
__fput+0x27c/0xa90 fs/file_table.c:320
task_work_run+0x16f/0x270 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xb3d/0x2a30 kernel/exit.c:820
do_group_exit+0xd4/0x2a0 kernel/exit.c:950
get_signal+0x21b1/0x2440 kernel/signal.c:2858
arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The cause of the issue is that sock_put() from __tun_detach() drops
last reference count for struct net, and then notifier_call_chain()
from netdev_state_change() accesses that struct net.

This patch fixes the issue by calling sock_put() from tun_detach()
after all necessary accesses for the struct net has done.

Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified")
Reported-by: [email protected]
Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe
Signed-off-by: Shigeru Yoshida <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

net: mdiobus: fix unbalanced node reference count

I got the following report while doing device(mscc-miim) load test
with CONFIG_OF_UNITTEST and CONFIG_OF_DYNAMIC enabled:

  OF: ERROR: memory leak, expected refcount 1 instead of 2,
  of_node_get()/of_node_put() unbalanced - destroy cset entry:
  attach overlay node /spi/soc@0/mdio@7107009c/ethernet-phy@0

If the 'fwnode' is not an acpi node, the refcount is get in
fwnode_mdiobus_phy_device_register(), but it has never been
put when the device is freed in the normal path. So call
fwnode_handle_put() in phy_device_release() to avoid leak.

If it's an acpi node, it has never been get, but it's put
in the error path, so call fwnode_handle_get() before
phy_device_register() to keep get/put operation balanced.

Fixes: bc1bee3b87ee ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()")
Signed-off-by: Yang Yingliang <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: hsr: Fix potential use-after-free

The skb is delivered to netif_rx() which may free it, after calling this,
dereferencing skb may trigger use-after-free.

Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: YueHaibing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tipc: re-fetch skb cb after tipc_msg_validate

As the call trace shows, the original skb was freed in tipc_msg_validate(),
and dereferencing the old skb cb would cause an use-after-free crash.

  BUG: KASAN: use-after-free in tipc_crypto_rcv_complete+0x1835/0x2240 [tipc]
  Call Trace:
   <IRQ>
   tipc_crypto_rcv_complete+0x1835/0x2240 [tipc]
   tipc_crypto_rcv+0xd32/0x1ec0 [tipc]
   tipc_rcv+0x744/0x1150 [tipc]
  ...
  Allocated by task 47078:
   kmem_cache_alloc_node+0x158/0x4d0
   __alloc_skb+0x1c1/0x270
   tipc_buf_acquire+0x1e/0xe0 [tipc]
   tipc_msg_create+0x33/0x1c0 [tipc]
   tipc_link_build_proto_msg+0x38a/0x2100 [tipc]
   tipc_link_timeout+0x8b8/0xef0 [tipc]
   tipc_node_timeout+0x2a1/0x960 [tipc]
   call_timer_fn+0x2d/0x1c0
  ...
  Freed by task 47078:
   tipc_msg_validate+0x7b/0x440 [tipc]
   tipc_crypto_rcv_complete+0x4b5/0x2240 [tipc]
   tipc_crypto_rcv+0xd32/0x1ec0 [tipc]
   tipc_rcv+0x744/0x1150 [tipc]

This patch fixes it by re-fetching the skb cb from the new allocated skb
after calling tipc_msg_validate().

Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Shuang Li <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Link: https://lore.kernel.org/r/1b1cdba762915325bd8ef9a98d0276eb673df2a5.1669398403.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'mptcp-more-fixes-for-6-1'

Matthieu Baerts says:

====================
mptcp: More fixes for 6.1

Patch 1 makes sure data received after a close will still be processed
and acked as exepected. This is a regression for a commit introduced
in v5.11.

Patch 2 fixes a kernel deadlock found when working on validating TFO
with a listener MPTCP socket. This is not directly linked to TFO but
it is easier to reproduce the issue with it. This fixes a bug introduced
by a commit from v6.0.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

mptcp: fix sleep in atomic at close time

Matt reported a splat at msk close time:

    BUG: sleeping function called from invalid context at net/mptcp/protocol.c:2877
    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 155, name: packetdrill
    preempt_count: 201, expected: 0
    RCU nest depth: 0, expected: 0
    4 locks held by packetdrill/155:
    #0: ffff888001536990 (&sb->s_type->i_mutex_key#6){+.+.}-{3:3}, at: __sock_release (net/socket.c:650)
    #1: ffff88800b498130 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close (net/mptcp/protocol.c:2973)
    #2: ffff88800b49a130 (sk_lock-AF_INET/1){+.+.}-{0:0}, at: __mptcp_close_ssk (net/mptcp/protocol.c:2363)
    #3: ffff88800b49a0b0 (slock-AF_INET){+...}-{2:2}, at: __lock_sock_fast (include/net/sock.h:1820)
    Preemption disabled at:
    0x0
    CPU: 1 PID: 155 Comm: packetdrill Not tainted 6.1.0-rc5 #365
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
    Call Trace:
    <TASK>
    dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
    __might_resched.cold (kernel/sched/core.c:9891)
    __mptcp_destroy_sock (include/linux/kernel.h:110)
    __mptcp_close (net/mptcp/protocol.c:2959)
    mptcp_subflow_queue_clean (include/net/sock.h:1777)
    __mptcp_close_ssk (net/mptcp/protocol.c:2363)
    mptcp_destroy_common (net/mptcp/protocol.c:3170)
    mptcp_destroy (include/net/sock.h:1495)
    __mptcp_destroy_sock (net/mptcp/protocol.c:2886)
    __mptcp_close (net/mptcp/protocol.c:2959)
    mptcp_close (net/mptcp/protocol.c:2974)
    inet_release (net/ipv4/af_inet.c:432)
    __sock_release (net/socket.c:651)
    sock_close (net/socket.c:1367)
    __fput (fs/file_table.c:320)
    task_work_run (kernel/task_work.c:181 (discriminator 1))
    exit_to_user_mode_prepare (include/linux/resume_user_mode.h:49)
    syscall_exit_to_user_mode (kernel/entry/common.c:130)
    do_syscall_64 (arch/x86/entry/common.c:87)
    entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

We can't call mptcp_close under the 'fast' socket lock variant, replace
it with a sock_lock_nested() as the relevant code is already under the
listening msk socket lock protection.

Reported-by: Matthieu Baerts <[email protected]>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/316
Fixes: 30e51b923e43 ("mptcp: fix unreleased socket in accept queue")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

mptcp: don't orphan ssk in mptcp_close()

All of the subflows of a msk will be orphaned in mptcp_close(), which
means the subflows are in DEAD state. After then, DATA_FIN will be sent,
and the other side will response with a DATA_ACK for this DATA_FIN.

However, if the other side still has pending data, the data that received
on these subflows will not be passed to the msk, as they are DEAD and
subflow_data_ready() will not be called in tcp_data_ready(). Therefore,
these data can't be acked, and they will be retransmitted again and again,
until timeout.

Fix this by setting ssk->sk_socket and ssk->sk_wq to 'NULL', instead of
orphaning the subflows in __mptcp_close(), as Paolo suggested.

Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close")
Reviewed-by: Biao Jiang <[email protected]>
Reviewed-by: Mengen Sun <[email protected]>
Signed-off-by: Menglong Dong <[email protected]>
Reviewed-by: Paolo Abeni <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

dsa: lan9303: Correct stat name

This patch changes the reported ethtool statistics for the lan9303
family of parts covered by this driver.

The TxUnderRun statistic label is renamed to RxShort to accurately
reflect what stat the device is reporting. I did not reorder the
statistics as that might cause problems with existing user code that
are expecting the stats at a certain offset.

Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Jerry Ray <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'wireless-2022-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v6.1

Third, and hopefully final, set of fixes for v6.1. We are marking the
rsi driver as orphan, have some Information Element parsing fixes to
wilc1000 driver and three small fixes to the stack.

* tag 'wireless-2022-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac8021: fix possible oob access in ieee80211_get_rate_duration
  wifi: cfg80211: don't allow multi-BSSID in S1G
  wifi: cfg80211: fix buffer overflow in elem comparison
  wifi: wilc1000: validate number of channels
  wifi: wilc1000: validate length of IEEE80211_P2P_ATTR_CHANNEL_LIST attribute
  wifi: wilc1000: validate length of IEEE80211_P2P_ATTR_OPER_CHANNEL attribute
  wifi: wilc1000: validate pairwise and authentication suite offsets
  MAINTAINERS: mark rsi wifi driver as orphan
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
bpf 2022-11-25

We've added 10 non-merge commits during the last 8 day(s) which contain
a total of 7 files changed, 48 insertions(+), 30 deletions(-).

The main changes are:

1) Several libbpf ringbuf fixes related to probing for its availability,
   size overflows when mmaping a 2G ringbuf and rejection of invalid
   reservationsizes, from Hou Tao.

2) Fix a buggy return pointer in libbpf for attach_raw_tp function,
   from Jiri Olsa.

3) Fix a local storage BPF map bug where the value's spin lock field
   can get initialized incorrectly, from Xu Kuohai.

4) Two follow-up fixes in kprobe_multi BPF selftests for BPF CI,
   from Jiri Olsa.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Make test_bench_attach serial
  selftests/bpf: Filter out default_idle from kprobe_multi bench
  bpf: Set and check spin lock value in sk_storage_map_test
  bpf: Do not copy spin lock field from user in bpf_selem_alloc
  libbpf: Check the validity of size in user_ring_buffer__reserve()
  libbpf: Handle size overflow for user ringbuf mmap
  libbpf: Handle size overflow for ringbuf mmap
  libbpf: Use page size as max_entries when probing ring buffer map
  bpf, perf: Use subprog name when reporting subprog ksymbol
  libbpf: Use correct return pointer in attach_raw_tp
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ipv4: Fix route deletion when nexthop info is not specified

When the kernel receives a route deletion request from user space it
tries to delete a route that matches the route attributes specified in
the request.

If only prefix information is specified in the request, the kernel
should delete the first matching FIB alias regardless of its associated
FIB info. However, an error is currently returned when the FIB info is
backed by a nexthop object:

# ip nexthop add id 1 via 192.0.2.2 dev dummy10
# ip route add 198.51.100.0/24 nhid 1
# ip route del 198.51.100.0/24
RTNETLINK answers: No such process

Fix by matching on such a FIB info when legacy nexthop attributes are
not specified in the request. An earlier check already covers the case
where a nexthop ID is specified in the request.

Add tests that cover these flows. Before the fix:

# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes           [FAIL]

Tests passed:  11
Tests failed:   1

After the fix:

# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes           [ OK ]

Tests passed:  12
Tests failed:   0

No regressions in other tests:

# ./fib_nexthops.sh
...
Tests passed: 228
Tests failed:   0

# ./fib_tests.sh
...
Tests passed: 186
Tests failed:   0

Cc: [email protected]
Reported-by: Jonas Gorski <[email protected]>
Tested-by: Jonas Gorski <[email protected]>
Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects")
Fixes: 6bf92d70e690 ("net: ipv4: fix route with nexthop object delete warning")
Fixes: 61b91eb33a69 ("ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'wwan-iosm-fixes'

M Chetan Kumar says:

====================
net: wwan: iosm: fix build errors & bugs

This patch series fixes iosm driver bugs & build errors.

PATCH1: Fix kernel build robot reported error.
PATCH2: Fix build error reported on armhf while preparing
6.1-rc5 for Debian.
PATCH3: Fix UL throughput crash.
PATCH4: Fix incorrect skb length.

Refer to commit message for details.

Changes since v1:
* PATCH4: Fix sparse warning.
====================

Signed-off-by: David S. Miller <[email protected]>

net: wwan: iosm: fix incorrect skb length

skb passed to network layer contains incorrect length.

In mux aggregation protocol, the datagram block received
from device contains block signature, packet & datagram
header. The right skb len to be calculated by subracting
datagram pad len from datagram length.

Whereas in mux lite protocol, the skb contains single
datagram so skb len is calculated by subtracting the
packet offset from datagram header.

Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
Signed-off-by: M Chetan Kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: wwan: iosm: fix crash in peek throughput test

Peek throughput UL test is resulting in crash. If the UL
transfer block free list is exhaust, the peeked skb is freed.
In the next transfer freed skb is referred from UL list which
results in crash.

Don't free the skb if UL transfer blocks are unavailable. The
pending skb will be picked for transfer on UL transfer block
available.

Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
Signed-off-by: M Chetan Kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type

Fix build error reported on armhf while preparing 6.1-rc5
for Debian.

iosm_ipc_protocol.c:244:36: error: passing argument 3 of
'dma_alloc_coherent' from incompatible pointer type.

Change phy_ap_shm type from phys_addr_t to dma_addr_t.

Fixes: faed4c6f6f48 ("net: iosm: shared memory protocol")
Reported-by: Bonaccorso Salvatore <[email protected]>
Signed-off-by: M Chetan Kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: wwan: iosm: fix kernel test robot reported error

sparse warnings - iosm_ipc_mux_codec.c:1474 using plain
integer as NULL pointer.

Use skb_trim() to reset skb tail & len.

Fixes: 9413491e20e1 ("net: iosm: encode or decode datagram")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: M Chetan Kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: phylink: fix PHY validation with rate adaption

Tim Harvey reports that link modes which he does not expect to be
supported are being advertised, and this is because of the workaround
we have for PHYs that switch interface modes.

Fix this up by checking whether rate matching will be used for the
requested interface mode, and if rate matching will be used, perform
validation only with the requested interface mode, rather than invoking
this workaround.

Signed-off-by: Russell King (Oracle) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: nixge: fix NULL dereference

In function nixge_hw_dma_bd_release() dereference of NULL pointer
priv->rx_bd_v is possible for the case of its allocation failure in
nixge_hw_dma_bd_init().

Move for() loop with priv->rx_bd_v dereference under the check for
its validity.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
Signed-off-by: Yuri Karpov <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/9p: Fix a potential socket leak in p9_socket_open

Both p9_fd_create_tcp() and p9_fd_create_unix() will call
p9_socket_open(). If the creation of p9_trans_fd fails,
p9_fd_create_tcp() and p9_fd_create_unix() will return an
error directly instead of releasing the cscoket, which will
result in a socket leak.

This patch adds sock_release() to fix the leak issue.

Fixes: 6b18662e239a ("9p connect fixes")
Signed-off-by: Wang Hai <[email protected]>
ACKed-by: Al Viro <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: net_netdev: Fix error handling in ntb_netdev_init_module()

The ntb_netdev_init_module() returns the ntb_transport_register_client()
directly without checking its return value, if
ntb_transport_register_client() failed, the NTB client device is not
unregistered.

Fix by unregister NTB client device when ntb_transport_register_client()
failed.

Fixes: 548c237c0a99 ("net: Add support for NTB virtual ethernet device")
Signed-off-by: Yuan Can <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: ti: am65-cpsw: fix error handling in am65_cpsw_nuss_probe()

The am65_cpsw_nuss_cleanup_ndev() function calls unregister_netdev()
even if register_netdev() fails, which triggers WARN_ON(1) in
unregister_netdevice_many(). To fix it, make sure that
unregister_netdev() is called only on registered netdev.

Compile tested only.

Fixes: 84b4aa493249 ("net: ethernet: ti: am65-cpsw: add multi port support in mac-only mode")
Signed-off-by: Zhang Changzhong <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'mlx5-fixes-2022-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-fixes-2022-11-24
This series provides bug fixes to mlx5 driver.

Focusing on error handling and proper memory management in mlx5, in
general and in the newly added macsec module.

I still have few fixes left in my queue and I hope those will be the
last ones for mlx5 for this cycle.

Please pull and let me know if there is any problem.

Happy thanksgiving.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-11-23 (ixgbevf, i40e, fm10k, iavf, e100)

This series contains updates to various Intel drivers.

Shang XiaoJing fixes init module error path stop to resource leaks for
ixgbevf and i40e.

Yuan Can also does the same for fm10k and iavf.

Wang Hai stops freeing of skb as it was causing use after free error for
e100.
====================

Signed-off-by: David S. Miller <[email protected]>

net: phy: fix null-ptr-deref while probe() failed

I got a null-ptr-deref report as following when doing fault injection test:

BUG: kernel NULL pointer dereference, address: 0000000000000058
Oops: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 PID: 253 Comm: 507-spi-dm9051 Tainted: G    B            N 6.1.0-rc3+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:klist_put+0x2d/0xd0
Call Trace:
<TASK>
klist_remove+0xf1/0x1c0
device_release_driver_internal+0x23e/0x2d0
bus_remove_device+0x1bd/0x240
device_del+0x357/0x770
phy_device_remove+0x11/0x30
mdiobus_unregister+0xa5/0x140
release_nodes+0x6a/0xa0
devres_release_all+0xf8/0x150
device_unbind_cleanup+0x19/0xd0

//probe path:
phy_device_register()
  device_add()

phy_connect
  phy_attach_direct() //set device driver
    probe() //it's failed, driver is not bound
    device_bind_driver() // probe failed, it's not called

//remove path:
phy_device_remove()
  device_del()
    device_release_driver_internal()
      __device_release_driver() //dev->drv is not NULL
        klist_remove() <- knode_driver is not added yet, cause null-ptr-deref

In phy_attach_direct(), after setting the 'dev->driver', probe() fails,
device_bind_driver() is not called, so the knode_driver->n_klist is not
set, then it causes null-ptr-deref in __device_release_driver() while
deleting device. Fix this by setting dev->driver to NULL in the error
path in phy_attach_direct().

Fixes: e13934563db0 ("[PATCH] PHY Layer fixup")
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

wifi: mac8021: fix possible oob access in ieee80211_get_rate_duration

Fix possible out-of-bound access in ieee80211_get_rate_duration routine
as reported by the following UBSAN report:

UBSAN: array-index-out-of-bounds in net/mac80211/airtime.c:455:47
index 15 is out of range for type 'u16 [12]'
CPU: 2 PID: 217 Comm: kworker/u32:10 Not tainted 6.1.0-060100rc3-generic
Hardware name: Acer Aspire TC-281/Aspire TC-281, BIOS R01-A2 07/18/2017
Workqueue: mt76 mt76u_tx_status_data [mt76_usb]
Call Trace:
<TASK>
show_stack+0x4e/0x61
dump_stack_lvl+0x4a/0x6f
dump_stack+0x10/0x18
ubsan_epilogue+0x9/0x43
__ubsan_handle_out_of_bounds.cold+0x42/0x47
ieee80211_get_rate_duration.constprop.0+0x22f/0x2a0 [mac80211]
? ieee80211_tx_status_ext+0x32e/0x640 [mac80211]
ieee80211_calc_rx_airtime+0xda/0x120 [mac80211]
ieee80211_calc_tx_airtime+0xb4/0x100 [mac80211]
mt76x02_send_tx_status+0x266/0x480 [mt76x02_lib]
mt76x02_tx_status_data+0x52/0x80 [mt76x02_lib]
mt76u_tx_status_data+0x67/0xd0 [mt76_usb]
process_one_work+0x225/0x400
worker_thread+0x50/0x3e0
? process_one_work+0x400/0x400
kthread+0xe9/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30

Fixes: db3e1c40cf2f ("mac80211: Import airtime calculation code from mt76")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

wifi: cfg80211: don't allow multi-BSSID in S1G

In S1G beacon frames there shouldn't be multi-BSSID elements
since that's not supported, remove that to avoid a potential
integer underflow and/or misparsing the frames due to the
different length of the fixed part of the frame.

While at it, initialize non_tx_data so we don't send garbage
values to the user (even if it doesn't seem to matter now.)

Reported-and-tested-by: Sönke Huster <[email protected]>
Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results")
Signed-off-by: Johannes Berg <[email protected]>

wifi: cfg80211: fix buffer overflow in elem comparison

For vendor elements, the code here assumes that 5 octets
are present without checking. Since the element itself is
already checked to fit, we only need to check the length.

Reported-and-tested-by: Sönke Huster <[email protected]>
Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning")
Signed-off-by: Johannes Berg <[email protected]>

net: loopback: use NET_NAME_PREDICTABLE for name_assign_type

When the name_assign_type attribute was introduced (commit
685343fc3ba6, "net: add name_assign_type netdev attribute"), the
loopback device was explicitly mentioned as one which would make use
of NET_NAME_PREDICTABLE:

    The name_assign_type attribute gives hints where the interface name of a
    given net-device comes from. These values are currently defined:
...
      NET_NAME_PREDICTABLE:
        The ifname has been assigned by the kernel in a predictable way
        that is guaranteed to avoid reuse and always be the same for a
        given device. Examples include statically created devices like
        the loopback device [...]

Switch to that so that reading /sys/class/net/lo/name_assign_type
produces something sensible instead of returning -EINVAL.

Signed-off-by: Rasmus Villemoes <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fec: don't reset irq coalesce settings to defaults on "ip link up"

Currently, when a FEC device is brought up, the irq coalesce settings
are reset to their default values (1000us, 200 frames). That's
unexpected, and breaks for example use of an appropriate .link file to
make systemd-udev apply the desired
settings (https://www.freedesktop.org/software/systemd/man/systemd.link.html),
or any other method that would do a one-time setup during early boot.

Refactor the code so that fec_restart() instead uses
fec_enet_itr_coal_set(), which simply applies the settings that are
stored in the private data, and initialize that private data with the
default values.

Signed-off-by: Rasmus Villemoes <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

octeontx2-pf: Fix pfc_alloc_status array overflow

This patch addresses pfc_alloc_status array overflow occurring for
send queue index value greater than PFC priority. Queue index can be
greater than supported PFC priority for multiple scenarios (e.g. QoS,
during non zero SMQ allocation for a PF/VF).
In those scenarios the API should return default tx scheduler '0'.
This is causing mbox errors as otx2_get_smq_idx returing invalid smq value.

Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support")
Signed-off-by: Suman Ghosh <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Set MAC's flow control register to reflect current settings

Currently, pause frame register GMAC_RX_FLOW_CTRL_RFE is not updated
correctly when 'ethtool -A <IFACE> autoneg off rx off tx off' command
is issued. This fix ensures the flow control change is reflected directly
in the GMAC_RX_FLOW_CTRL_RFE register.

Fixes: 46f69ded988d ("net: stmmac: Use resolved link config in mac_link_up()")
Cc: <[email protected]> # 5.10.x
Signed-off-by: Goh, Wei Sheng <[email protected]>
Signed-off-by: Noor Azura Ahmad Tarmizi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

aquantia: Do not purge addresses when setting the number of rings

IPV6 addresses are purged when setting the number of rx/tx
rings using ethtool -G. The function aq_set_ringparam
calls dev_close, which removes the addresses. As a solution,
call an internal function (aq_ndev_close).

Fixes: c1af5427954b ("net: aquantia: Ethtool based ring size configuration")
Signed-off-by: Izabela Bakollari <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qed: avoid defines prefixed with CONFIG

Defines prefixed with "CONFIG" should be limited to proper Kconfig options,
that are introduced in a Kconfig file.

Here, constants for bitmap indices of some configs are defined and these
defines begin with the config's name, and are suffixed with BITMAP_IDX.

To avoid defines prefixed with "CONFIG", name these constants
BITMAP_IDX_FOR_CONFIG_XYZ instead of CONFIG_XYZ_BITMAP_IDX.

No functional change.

Signed-off-by: Lukas Bulwahn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qlcnic: fix sleep-in-atomic-context bugs caused by msleep

The watchdog timer is used to monitor whether the process
of transmitting data is timeout. If we use qlcnic driver,
the dev_watchdog() that is the timer handler of watchdog
timer will call qlcnic_tx_timeout() to process the timeout.
But the qlcnic_tx_timeout() calls msleep(), as a result,
the sleep-in-atomic-context bugs will happen. The processes
are shown below:

   (atomic context)
dev_watchdog
  qlcnic_tx_timeout
    qlcnic_83xx_idc_request_reset
      qlcnic_83xx_lock_driver
        msleep

---------------------------

   (atomic context)
dev_watchdog
  qlcnic_tx_timeout
    qlcnic_83xx_idc_request_reset
      qlcnic_83xx_lock_driver
        qlcnic_83xx_recover_driver_lock
          msleep

Fix by changing msleep() to mdelay(), the mdelay() is
busy-waiting and the bugs could be mitigated.

Fixes: 629263acaea3 ("qlcnic: 83xx CNA inter driver communication mechanism")
Signed-off-by: Duoming Zhou <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'linux-can-fixes-for-6.1-20221124' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
linux-can-fixes-for-6.1-20221124

this is a pull request of 8 patches for net/master.

Ziyang Xuan contributes a patch for the can327, fixing a potential SKB
leak when the netdev is down.

Heiko Schocher's patch for the sja1000 driver fixes the width of the
definition of the OCR_MODE_MASK.

Zhang Changzhong contributes 4 patches. In the sja1000_isa, cc770, and
m_can_pci drivers the error path in the probe() function and in case
of the etas_es58x a function that is called by probe() are fixed.

Jiasheng Jiang add a missing check for the return value of the
devm_clk_get() in the m_can driver.

Yasushi SHOJI's patch for the mcba_usb fixes setting of the external
termination resistor.
====================

Signed-off-by: David S. Miller <[email protected]>

wifi: wilc1000: validate number of channels

There is no validation of 'e->no_of_channels' which can trigger an
out-of-bounds write in the following 'memset' call. Validate that the
number of channels does not extends beyond the size of the channel list
element.

Signed-off-by: Phil Turnbull <[email protected]>
Tested-by: Ajay Kathat <[email protected]>
Acked-by: Ajay Kathat <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: wilc1000: validate length of IEEE80211_P2P_ATTR_CHANNEL_LIST attribute

Validate that the IEEE80211_P2P_ATTR_CHANNEL_LIST attribute contains
enough space for a 'struct wilc_attr_oper_ch'. If the attribute is too
small then it can trigger an out-of-bounds write later in the function.

'struct wilc_attr_oper_ch' is variable sized so also check 'attr_len'
does not extend beyond the end of 'buf'.

Signed-off-by: Phil Turnbull <[email protected]>
Tested-by: Ajay Kathat <[email protected]>
Acked-by: Ajay Kathat <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: wilc1000: validate length of IEEE80211_P2P_ATTR_OPER_CHANNEL attribute

Validate that the IEEE80211_P2P_ATTR_OPER_CHANNEL attribute contains
enough space for a 'struct struct wilc_attr_oper_ch'. If the attribute is
too small then it triggers an out-of-bounds write later in the function.

Signed-off-by: Phil Turnbull <[email protected]>
Tested-by: Ajay Kathat <[email protected]>
Acked-by: Ajay Kathat <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: wilc1000: validate pairwise and authentication suite offsets

There is no validation of 'offset' which can trigger an out-of-bounds
read when extracting RSN capabilities.

Signed-off-by: Phil Turnbull <[email protected]>
Tested-by: Ajay Kathat <[email protected]>
Acked-by: Ajay Kathat <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

can: mcba_usb: Fix termination command argument

Microchip USB Analyzer can activate the internal termination resistors
by setting the "termination" option ON, or OFF to to deactivate them.
As I've observed, both with my oscilloscope and captured USB packets
below, you must send "0" to turn it ON, and "1" to turn it OFF.

From the schematics in the user's guide, I can confirm that you must
drive the CAN_RES signal LOW "0" to activate the resistors.

Reverse the argument value of usb_msg.termination to fix this.

These are the two commands sequence, ON then OFF.

> No.     Time           Source                Destination           Protocol Length Info
>       1 0.000000       host                  1.3.1                 USB      46     URB_BULK out
>
> Frame 1: 46 bytes on wire (368 bits), 46 bytes captured (368 bits)
> USB URB
> Leftover Capture Data: a80000000000000000000000000000000000a8
>
> No.     Time           Source                Destination           Protocol Length Info
>       2 4.372547       host                  1.3.1                 USB      46     URB_BULK out
>
> Frame 2: 46 bytes on wire (368 bits), 46 bytes captured (368 bits)
> USB URB
> Leftover Capture Data: a80100000000000000000000000000000000a9

Signed-off-by: Yasushi SHOJI <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: m_can: Add check for devm_clk_get

Since the devm_clk_get may return error,
it should be better to add check for the cdev->hclk,
as same as cdev->cclk.

Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework")
Signed-off-by: Jiasheng Jiang <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: m_can: pci: add missing m_can_class_free_dev() in probe/remove methods

In m_can_pci_remove() and error handling path of m_can_pci_probe(),
m_can_class_free_dev() should be called to free resource allocated by
m_can_class_allocate_dev(), otherwise there will be memleak.

Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake")
Signed-off-by: Zhang Changzhong <[email protected]>
Reviewed-by: Jarkko Nikula <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: etas_es58x: es58x_init_netdev(): free netdev when register_candev()

In case of register_candev() fails, clear
es58x_dev->netdev[channel_idx] and add free_candev(). Otherwise
es58x_free_netdevs() will unregister the netdev that has never been
registered.

Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Signed-off-by: Zhang Changzhong <[email protected]>
Acked-by: Arunachalam Santhanam <[email protected]>
Acked-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: cc770: cc770_isa_probe(): add missing free_cc770dev()

Add the missing free_cc770dev() before return from cc770_isa_probe()
in the register_cc770dev() error handling case.

In addition, remove blanks before goto labels.

Fixes: 7e02e5433e00 ("can: cc770: legacy CC770 ISA bus driver")
Signed-off-by: Zhang Changzhong <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: sja1000_isa: sja1000_isa_probe(): add missing free_sja1000dev()

Add the missing free_sja1000dev() before return from
sja1000_isa_probe() in the register_sja1000dev() error handling case.

In addition, remove blanks before goto labels.

Fixes: 2a6ba39ad6a2 ("can: sja1000: legacy SJA1000 ISA bus driver")
Signed-off-by: Zhang Changzhong <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: sja1000: fix size of OCR_MODE_MASK define

bitfield mode in ocr register has only 2 bits not 3, so correct
the OCR_MODE_MASK define.

Signed-off-by: Heiko Schocher <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: can327: can327_feed_frame_to_netdev(): fix potential skb leak when netdev is down

In can327_feed_frame_to_netdev(), it did not free the skb when netdev
is down, and all callers of can327_feed_frame_to_netdev() did not free
allocated skb too. That would trigger skb leak.

Fix it by adding kfree_skb() in can327_feed_frame_to_netdev() when netdev
is down. Not tested, just compiled.

Fixes: 43da2f07622f ("can: can327: CAN/ldisc driver for ELM327 based OBD-II adapters")
Signed-off-by: Ziyang Xuan <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Reviewed-by: Max Staudt <[email protected]>
Cc: [email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

net: thunderx: Fix the ACPI memory leak

The ACPI buffer memory (string.pointer) should be freed as the buffer is
not used after returning from bgx_acpi_match_id(), free it to prevent
memory leak.

Fixes: 46b903a01c05 ("net, thunder, bgx: Add support to get MAC address from ACPI.")
Signed-off-by: Yu Liao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

octeontx2-af: Fix reference count issue in rvu_sdp_init()

pci_get_device() will decrease the reference count for the *from*
parameter. So we don't need to call put_device() to decrease the
reference. Let's remove the put_device() in the loop and only decrease
the reference count of the returned 'pdev' for the last loop because it
will not be passed to pci_get_device() as input parameter. We don't need
to check if 'pdev' is NULL because it is already checked inside
pci_dev_put(). Also add pci_dev_put() for the error path.

Fixes: fe1939bb2340 ("octeontx2-af: Add SDP interface support")
Signed-off-by: Xiongfeng Wang <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

net: altera_tse: release phylink resources in tse_shutdown()

Call phylink_disconnect_phy() in tse_shutdown() to release the
resources occupied by phylink_of_phy_connect() in the tse_open().

Fixes: fef2998203e1 ("net: altera: tse: convert to phylink")
Signed-off-by: Liu Jian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

virtio_net: Fix probe failed when modprobe virtio_net

When doing the following test steps, an error was found:
  step 1: modprobe virtio_net succeeded
    # modprobe virtio_net        <-- OK

  step 2: fault injection in register_netdevice()
    # modprobe -r virtio_net     <-- OK
    # ...
      FAULT_INJECTION: forcing a failure.
      name failslab, interval 1, probability 0, space 0, times 0
      CPU: 0 PID: 3521 Comm: modprobe
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      Call Trace:
       <TASK>
       ...
       should_failslab+0xa/0x20
       ...
       dev_set_name+0xc0/0x100
       netdev_register_kobject+0xc2/0x340
       register_netdevice+0xbb9/0x1320
       virtnet_probe+0x1d72/0x2658 [virtio_net]
       ...
       </TASK>
      virtio_net: probe of virtio0 failed with error -22

  step 3: modprobe virtio_net failed
    # modprobe virtio_net        <-- failed
      virtio_net: probe of virtio0 failed with error -2

The root cause of the problem is that the queues are not
disable on the error handling path when register_netdevice()
fails in virtnet_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. This
makes error handling follow the same order as normal device
cleanup in virtnet_remove() which does: unregister, destroy
failover, then reset. And that flow is better tested than
error handling so we can be reasonably sure it works well.

Fixes: 024655555021 ("virtio_net: fix use after free on allocation failure")
Signed-off-by: Li Zetao <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

net/mlx5e: MACsec, block offload requests with encrypt off

Currently offloading MACsec with authentication only (encrypt
property set to off) is not supported, block such requests
when adding/updating a macsec device.

Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support")
Signed-off-by: Emeel Hakim <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, fix Tx SA active field update

Currently during update Tx security association (SA) flow, the Tx SA
active state is updated only if the Tx SA in question is the same SA
that the MACsec interface is using for Tx,in consequence when the
MACsec interface chose to work with this Tx SA later, where this SA
for example should have been updated to active state and it was not,
the relevant Tx SA HW context won't be installed, hence the MACSec
flow won't be offloaded.

Fix by update Tx SA active state as part of update flow regardless
whether the SA in question is the same Tx SA used by the MACsec
interface.

Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Emeel Hakim <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, remove replay window size limitation in offload path

Currently offload path limits replay window size to 32/64/128/256 bits,
such a limitation should not exist since software allows it.
Remove such limitation.

Fixes: eb43846b43c3 ("net/mlx5e: Support MACsec offload replay window")
Signed-off-by: Emeel Hakim <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, fix add Rx security association (SA) rule memory leak

Currently MACsec's add Rx SA flow steering (fs) rule routine
uses a spec object which is dynamically allocated and do
not free it upon leaving. The above led to a memory leak.

Fix by freeing dynamically allocated objects.

Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules")
Signed-off-by: Emeel Hakim <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, fix mlx5e_macsec_update_rxsa bail condition and functionality

Fix update Rx SA wrong bail condition, naturally update functionality
needs to check that something changed otherwise bailout currently the
active state check does just the opposite, furthermore unlike deactivate
path which remove the macsec rules to deactivate the offload, the
activation path does not include the counter part installation of the
macsec rules.

Fix by using correct bailout condition and when Rx SA changes state to
active then add the relevant macsec rules.

While at it, refine function name to reflect more precisely its role.

Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Emeel Hakim <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, fix update Rx secure channel active field

The main functionality for this operation is to update the
active state of the Rx security channel (SC) if the new
active setting is different from the current active state
of this Rx SC, however the relevant active state check is
done post updating the current active state to match the
new active state, effectively blocks any offload state
update for the Rx SC in question.

Fix by delay the assignment to be post the relevant check.

Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Emeel Hakim <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, fix memory leak when MACsec device is deleted

When the MACsec netdevice is deleted, all related Rx/Tx HW/SW
states should be released/deallocated, however currently part
of the Rx security channel association data is not cleaned
properly, hence the memory leaks.

Fix by make sure all related Rx Sc resources are cleaned/freed,
while at it improve code by grouping release SC context in a
function so it can be used in both delete MACsec device and
delete Rx SC operations.

Fixes: 5a39816a75e5 ("net/mlx5e: Add MACsec offload SecY support")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Emeel Hakim <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: MACsec, fix RX data path 16 RX security channel limit

Currently the data path metadata flow id mask wrongly limits the
number of different RX security channels (SC) to 16, whereas in
adding RX SC the limit is "2^16 - 1" this cause an overlap in
metadata flow id once more than 16 RX SCs is added, this corrupts
MACsec RX offloaded flow handling.

Fix by using the correct mask, while at it improve code to use this
mask when adding the Rx rule and improve visibility of such errors
by adding debug massage.

Fixes: b7c9400cbc48 ("net/mlx5e: Implement MACsec Rx data path using MACsec skb_metadata_dst")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Emeel Hakim <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: Use kvfree() in mlx5e_accel_fs_tcp_create()

'accel_tcp' is allocated by kvzalloc(), which should freed by kvfree().

Fixes: f52f2faee581 ("net/mlx5e: Introduce flow steering API")
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: Fix a couple error codes

If kvzalloc() fails then return -ENOMEM. Don't return success.

Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules")
Fixes: e467b283ffd5 ("net/mlx5e: Add MACsec TX steering rules")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: Fix use-after-free when reverting termination table

When having multiple dests with termination tables and second one
or afterwards fails the driver reverts usage of term tables but
doesn't reset the assignment in attr->dests[num_vport_dests].termtbl
which case a use-after-free when releasing the rule.
Fix by resetting the assignment of termtbl to null.

Fixes: 10caabdaad5a ("net/mlx5e: Use termination table for VLAN push actions")
Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: Fix uninitialized variable bug in outlen_write()

If sscanf() return 0, outlen is uninitialized and used in kzalloc(),
this is unexpected. We should return -EINVAL if the string is invalid.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: E-switch, Fix duplicate lag creation

If creating bond first and then enabling sriov in switchdev mode,
will hit the following syndrome:

mlx5_core 0000:08:00.0: mlx5_cmd_out_err:778:(pid 25543): CREATE_LAG(0x840) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x7d49cb), err(-22)

The reason is because the offending patch removes eswitch mode
none. In vf lag, the checking of eswitch mode none is replaced
by checking if sriov is enabled. But when driver enables sriov,
it triggers the bond workqueue task first and then setting sriov
number in pci_enable_sriov(). So the check fails.

Fix it by checking if sriov is enabled using eswitch internal
counter that is set before triggering the bond workqueue task.

Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: E-switch, Destroy legacy fdb table when needed

The cited commit removes eswitch mode none. But when disabling
sriov in legacy mode or changing from switchdev to legacy mode
without sriov enabled, the legacy fdb table is not destroyed.

It is not the right behavior. Destroy legacy fdb table in above
two caes.

Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Eli Cohen <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: DR, Fix uninitialized var warning

Smatch warns this:

drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c:81
mlx5dr_table_set_miss_action() error: uninitialized symbol 'ret'.

Initializing ret with -EOPNOTSUPP and fix missing action case.

Fixes: 7838e1725394 ("net/mlx5: DR, Expose steering table functionality")
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net: wwan: t7xx: Fix the ACPI memory leak

The ACPI buffer memory (buffer.pointer) should be freed as the
buffer is not used after acpi_evaluate_object(), free it to
prevent memory leak.

Fixes: 13e920d93e37 ("net: wwan: t7xx: Add core components")
Signed-off-by: Hanjun Guo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

octeontx2-pf: Add check for devm_kcalloc

As the devm_kcalloc may return NULL pointer,
it should be better to add check for the return
value, as same as the others.

Fixes: e8e095b3b370 ("octeontx2-af: cn10k: Bandwidth profiles config support")
Signed-off-by: Jiasheng Jiang <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

net: enetc: preserve TX ring priority across reconfiguration

In the blamed commit, a rudimentary reallocation procedure for RX buffer
descriptors was implemented, for the situation when their format changes
between normal (no PTP) and extended (PTP).

enetc_hwtstamp_set() calls enetc_close() and enetc_open() in a sequence,
and this sequence loses information which was previously configured in
the TX BDR Mode Register, specifically via the enetc_set_bdr_prio() call.
The TX ring priority is configured by tc-mqprio and tc-taprio, and
affects important things for TSN such as the TX time of packets. The
issue manifests itself most visibly by the fact that isochron --txtime
reports premature packet transmissions when PTP is first enabled on an
enetc interface.

Save the TX ring priority in a new field in struct enetc_bdr (occupies a
2 byte hole on arm64) in order to make this survive a ring reconfiguration.

Fixes: 434cebabd3a2 ("enetc: Add dynamic allocation of extended Rx BD rings")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Alexander Lobakin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: marvell: prestera: add missing unregister_netdev() in prestera_port_create()

If prestera_port_sfp_bind() fails, unregister_netdev() should be called
in error handling path.

Compile tested only.

Fixes: 52323ef75414 ("net: marvell: prestera: add phylink support")
Signed-off-by: Zhang Changzhong <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'nfc-st-nci-restructure-validating-logic-in-evt_transaction'

Martin Faltesek says:

====================
nfc: st-nci: Restructure validating logic in EVT_TRANSACTION

These are the same 3 patches that were applied in st21nfca here:
https://lore.kernel.org/netdev/20220607025729.1673212 [email protected]
with a couple minor differences.

st-nci has nearly identical code to that of st21nfca for EVT_TRANSACTION,
except that there are two extra validation checks that are not present
in the st-nci code.

The 3/3 patch as coded for st21nfca pulls those checks in, bringing both
drivers into parity.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

nfc: st-nci: fix incorrect sizing calculations in EVT_TRANSACTION

The transaction buffer is allocated by using the size of the packet buf,
and subtracting two which seems intended to remove the two tags which are
not present in the target structure. This calculation leads to under
counting memory because of differences between the packet contents and the
target structure. The aid_len field is a u8 in the packet, but a u32 in
the structure, resulting in at least 3 bytes always being under counted.
Further, the aid data is a variable length field in the packet, but fixed
in the structure, so if this field is less than the max, the difference is
added to the under counting.

To fix, perform validation checks progressively to safely reach the
next field, to determine the size of both buffers and verify both tags.
Once all validation checks pass, allocate the buffer and copy the data.
This eliminates freeing memory on the error path, as validation checks are
moved ahead of memory allocation.

Reported-by: Denis Efremov <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Fixes: 5d1ceb7f5e56 ("NFC: st21nfcb: Add HCI transaction event support")
Signed-off-by: Martin Faltesek <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

nfc: st-nci: fix memory leaks in EVT_TRANSACTION

Error path does not free previously allocated memory. Add devm_kfree() to
the failure path.

Reported-by: Denis Efremov <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Fixes: 5d1ceb7f5e56 ("NFC: st21nfcb: Add HCI transaction event support")
Signed-off-by: Martin Faltesek <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

nfc: st-nci: fix incorrect validating logic in EVT_TRANSACTION

The first validation check for EVT_TRANSACTION has two different checks
tied together with logical AND. One is a check for minimum packet length,
and the other is for a valid aid_tag. If either condition is true (fails),
then an error should be triggered. The fix is to change && to ||.

Reported-by: Denis Efremov <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Fixes: 5d1ceb7f5e56 ("NFC: st21nfcb: Add HCI transaction event support")
Signed-off-by: Martin Faltesek <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
ipsec 2022-11-23

1) Fix "disable_policy" on ipv4 early demuxP Packets after
   the initial packet in a flow might be incorectly dropped
   on early demux if there are no matching policies.
   From Eyal Birger.

2) Fix a kernel warning in case XFRM encap type is not
   available. From Eyal Birger.

3) Fix ESN wrap around for GSO to avoid a double usage of a
    sequence number. From Christian Langrock.

4) Fix a send_acquire race with pfkey_register.
   From Herbert Xu.

5) Fix a list corruption panic in __xfrm_state_delete().
   Thomas Jarosch.

6) Fix an unchecked return value in xfrm6_init().
   Chen Zhongjin.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: Fix ignored return value in xfrm6_init()
  xfrm: Fix oops in __xfrm_state_delete()
  af_key: Fix send_acquire race with pfkey_register
  xfrm: replay: Fix ESN wrap around for GSO
  xfrm: lwtunnel: squelch kernel warning in case XFRM encap type is not available
  xfrm: fix "disable_policy" on ipv4 early demux
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Fix regression in ipset hash:ip with IPv4 range, from Vishwanath Pai.
   This is fixing up a bug introduced in the 6.0 release.

2) The "netfilter: ipset: enforce documented limit to prevent allocating
   huge memory" patch contained a wrong condition which makes impossible to
   add up to 64 clashing elements to a hash:net,iface type of set while it
   is the documented feature of the set type. The patch fixes the condition
   and thus makes possible to add the elements while keeps preventing
   allocating huge memory, from Jozsef Kadlecsik. This has been broken
   for several releases.

3) Missing locking when updating the flow block list which might lead
   a reader to crash. This has been broken since the introduction of the
   flowtable hardware offload support.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: flowtable_offload: add missing locking
  netfilter: ipset: restore allowing 64 clashing elements in hash:net,iface
  netfilter: ipset: regression in ip_set_hash_ip.c
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Documentation: networking: Update generic_netlink_howto URL

The documentation refers to invalid web page under www.linuxfoundation.org
The patch refers to a working URL under wiki.linuxfoundation.org

Signed-off-by: Nir Levy <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jakub Kicinski <[email protected]>

e100: Fix possible use after free in e100_xmit_prepare

In e100_xmit_prepare(), if we can't map the skb, then return -ENOMEM, so
e100_xmit_frame() will return NETDEV_TX_BUSY and the upper layer will
resend the skb. But the skb is already freed, which will cause UAF bug
when the upper layer resends the skb.

Remove the harmful free.

Fixes: 5e5d49422dfb ("e100: Release skb when DMA mapping is failed in e100_xmit_prepare")
Signed-off-by: Wang Hai <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

iavf: Fix error handling in iavf_init_module()

The iavf_init_module() won't destroy workqueue when pci_register_driver()
failed. Call destroy_workqueue() when pci_register_driver() failed to
prevent the resource leak.

Similar to the handling of u132_hcd_init in commit f276e002793c
("usb: u132-hcd: fix resource leak")

Fixes: 2803b16c10ea ("i40e/i40evf: Use private workqueue")
Signed-off-by: Yuan Can <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

fm10k: Fix error handling in fm10k_init_module()

A problem about modprobe fm10k failed is triggered with the following log
given:

Intel(R) Ethernet Switch Host Interface Driver
Copyright(c) 2013 - 2019 Intel Corporation.
debugfs: Directory 'fm10k' with parent '/' already present!

The reason is that fm10k_init_module() returns fm10k_register_pci_driver()
directly without checking its return value, if fm10k_register_pci_driver()
failed, it returns without removing debugfs and destroy workqueue,
resulting the debugfs of fm10k can never be created later and leaks the
workqueue.

fm10k_init_module()
   alloc_workqueue()
   fm10k_dbg_init() # create debugfs
   fm10k_register_pci_driver()
     pci_register_driver()
       driver_register()
         bus_add_driver()
           priv = kzalloc(...) # OOM happened
   # return without remove debugfs and destroy workqueue

Fix by remove debugfs and destroy workqueue when
fm10k_register_pci_driver() returns error.

Fixes: 7461fd913afe ("fm10k: Add support for debugfs")
Fixes: b382bb1b3e2d ("fm10k: use separate workqueue for fm10k driver")
Signed-off-by: Yuan Can <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

i40e: Fix error handling in i40e_init_module()

i40e_init_module() won't free the debugfs directory created by
i40e_dbg_init() when pci_register_driver() failed. Add fail path to
call i40e_dbg_exit() to remove the debugfs entries to prevent the bug.

i40e: Intel(R) Ethernet Connection XL710 Network Driver
i40e: Copyright (c) 2013 - 2019 Intel Corporation.
debugfs: Directory 'i40e' with parent '/' already present!

Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Shang XiaoJing <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ixgbevf: Fix resource leak in ixgbevf_init_module()

ixgbevf_init_module() won't destroy the workqueue created by
create_singlethread_workqueue() when pci_register_driver() failed. Add
destroy_workqueue() in fail path to prevent the resource leak.

Similar to the handling of u132_hcd_init in commit f276e002793c
("usb: u132-hcd: fix resource leak")

Fixes: 40a13e2493c9 ("ixgbevf: Use a private workqueue to avoid certain possible hangs")
Signed-off-by: Shang XiaoJing <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

net/cdc_ncm: Fix multicast RX support for CDC NCM devices with ZLP

ZLP for DisplayLink ethernet devices was enabled in 6.0:
266c0190aee3 ("net/cdc_ncm: Enable ZLP for DisplayLink ethernet devices").
The related driver_info should be the "same as cdc_ncm_info, but with
FLAG_SEND_ZLP". However, set_rx_mode that enables handling multicast
traffic was missing in the new cdc_ncm_zlp_info.

usbnet_cdc_update_filter rx mode was introduced in linux 5.9 with:
e10dcb1b6ba7 ("net: cdc_ncm: hook into set_rx_mode to admit multicast
traffic")

Without this hook, multicast, and then IPv6 SLAAC, is broken.

Fixes: 266c0190aee3 ("net/cdc_ncm: Enable ZLP for DisplayLink ethernet devices")
Signed-off-by: Santiago Ruano Rincón <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: usb: qmi_wwan: add u-blox 0x1342 composition

Add RmNet support for LARA-L6.

LARA-L6 module can be configured (by AT interface) in three different
USB modes:
* Default mode (Vendor ID: 0x1546 Product ID: 0x1341) with 4 serial
interfaces
* RmNet mode (Vendor ID: 0x1546 Product ID: 0x1342) with 4 serial
interfaces and 1 RmNet virtual network interface
* CDC-ECM mode (Vendor ID: 0x1546 Product ID: 0x1343) with 4 serial
interface and 1 CDC-ECM virtual network interface

In RmNet mode LARA-L6 exposes the following interfaces:
If 0: Diagnostic
If 1: AT parser
If 2: AT parser
If 3: AT parset/alternative functions
If 4: RMNET interface

Signed-off-by: Davide Tronchin <[email protected]>
Acked-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

l2tp: Don't sleep and disable BH under writer-side sk_callback_lock

When holding a reader-writer spin lock we cannot sleep. Calling
setup_udp_tunnel_sock() with write lock held violates this rule, because we
end up calling percpu_down_read(), which might sleep, as syzbot reports
[1]:

__might_resched.cold+0x222/0x26b kernel/sched/core.c:9890
percpu_down_read include/linux/percpu-rwsem.h:49 [inline]
cpus_read_lock+0x1b/0x140 kernel/cpu.c:310
static_key_slow_inc+0x12/0x20 kernel/jump_label.c:158
udp_tunnel_encap_enable include/net/udp_tunnel.h:187 [inline]
setup_udp_tunnel_sock+0x43d/0x550 net/ipv4/udp_tunnel_core.c:81
l2tp_tunnel_register+0xc51/0x1210 net/l2tp/l2tp_core.c:1509
pppol2tp_connect+0xcdc/0x1a10 net/l2tp/l2tp_ppp.c:723

Trim the writer-side critical section for sk_callback_lock down to the
minimum, so that it covers only operations on sk_user_data.

Also, when grabbing the sk_callback_lock, we always need to disable BH, as
Eric points out. Failing to do so leads to deadlocks because we acquire
sk_callback_lock in softirq context, which can get stuck waiting on us if:

1) it runs on the same CPU, or

       CPU0
       ----
  lock(clock-AF_INET6);
  <Interrupt>
    lock(clock-AF_INET6);

2) lock ordering leads to priority inversion

       CPU0                    CPU1
       ----                    ----
  lock(clock-AF_INET6);
                               local_irq_disable();
                               lock(&tcp_hashinfo.bhash[i].lock);
                               lock(clock-AF_INET6);
  <Interrupt>
    lock(&tcp_hashinfo.bhash[i].lock);

... as syzbot reports [2,3]. Use the _bh variants for write_(un)lock.

[1] https://lore.kernel.org/netdev/0000000000004e78ec05eda79749@google.com/
[2] https://lore.kernel.org/netdev/000000000000e38b6605eda76f98@google.com/
[3] https://lore.kernel.org/netdev/000000000000dfa31e05eda76f75@google.com/

v2:
- Check and set sk_user_data while holding sk_callback_lock for both
  L2TP encapsulation types (IP and UDP) (Tetsuo)

Cc: Tom Parkin <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Fixes: b68777d54fac ("l2tp: Serialize access to sk_user_data with sk_callback_lock")
Reported-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Jakub Sitnicki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dm9051: Fix missing dev_kfree_skb() in dm9051_loop_rx()

The dm9051_loop_rx() returns without release skb when dm9051_stop_mrcmd()
returns error, free the skb to avoid this leak.

Fixes: 2dc95a4d30ed ("net: Add dm9051 driver")
Signed-off-by: Yuan Can <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

arcnet: fix potential memory leak in com20020_probe()

In com20020_probe(), if com20020_config() fails, dev and info
will not be freed, which will lead to a memory leak.

This patch adds freeing dev and info after com20020_config()
fails to fix this bug.

Compile tested only.

Fixes: 15b99ac17295 ("[PATCH] pcmcia: add return value to _config() functions")
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'mlx5-fixes-2022-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-11-21

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2022-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Fix possible race condition in macsec extended packet number update routine
  net/mlx5e: Fix MACsec update SecY
  net/mlx5e: Fix MACsec SA initialization routine
  net/mlx5e: Remove leftovers from old XSK queues enumeration
  net/mlx5e: Offload rule only when all encaps are valid
  net/mlx5e: Fix missing alignment in size of MTT/KLM entries
  net/mlx5: Fix sync reset event handler error flow
  net/mlx5: E-Switch, Set correctly vport destination
  net/mlx5: Lag, avoid lockdep warnings
  net/mlx5: Fix handling of entry refcount when command is not issued to FW
  net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
  net/mlx5: SF: Fix probing active SFs during driver probe phase
  net/mlx5: Fix FW tracer timestamp calculation
  net/mlx5: Do not query pci info while pci disabled
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ipv4: Fix error return code in fib_table_insert()

In fib_table_insert(), if the alias was already inserted, but node not
exist, the error code should be set before return from error handling path.

Fixes: a6c76c17df02 ("ipv4: Notify route after insertion to the routing table")
Signed-off-by: Ziyang Xuan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'net-ethernet-mtk_eth_soc-fix-memory-leak-in-error-path'

Yan Cangang says:

====================
net: ethernet: mtk_eth_soc: fix memory leak in error path
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ethernet: mtk_eth_soc: fix memory leak in error path

In mtk_ppe_init(), when dmam_alloc_coherent() or devm_kzalloc() failed,
the rhashtable ppe->l2_flows isn't destroyed. Fix it.

In mtk_probe(), when mtk_ppe_init() or mtk_eth_offload_init() or
register_netdev() failed, have the same problem. Fix it.

Fixes: 33fc42de3327 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
Signed-off-by: Yan Cangang <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: ethernet: mtk_eth_soc: fix resource leak in error path

In mtk_probe(), when mtk_ppe_init() or mtk_eth_offload_init() failed,
mtk_mdio_cleanup() isn't called. Fix it.

Fixes: ba37b7caf1ed ("net: ethernet: mtk_eth_soc: add support for initializing the PPE")
Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")
Signed-off-by: Yan Cangang <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: ethernet: mtk_eth_soc: fix potential memory leak in mtk_rx_alloc()

When fail to dma_map_single() in mtk_rx_alloc(), it returns directly.
But the memory allocated for local variable data is not freed, and
local variabel data has not been attached to ring->data[i] yet, so the
memory allocated for local variable data will not be freed outside
mtk_rx_alloc() too. Thus memory leak would occur in this scenario.

Add skb_free_frag(data) when dma_map_single() failed.

Fixes: 23233e577ef9 ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers")
Signed-off-by: Ziyang Xuan <[email protected]>
Acked-by: Lorenzo Bianconi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'dccp-tcp-fix-bhash2-issues-related-to-warn_on-in-inet_csk_get_port'

Kuniyuki Iwashima says:

====================
dccp/tcp: Fix bhash2 issues related to WARN_ON() in inet_csk_get_port().

syzkaller was hitting a WARN_ON() in inet_csk_get_port() in the 4th patch,
which was because we forgot to fix up bhash2 bucket when connect() for a
socket bound to a wildcard address fails in __inet_stream_connect().

There was a similar report [0], but its repro does not fire the WARN_ON() due
to inconsistent error handling.

When connect() for a socket bound to a wildcard address fails, saddr may or
may not be reset depending on where the failure happens. When we fail in
__inet_stream_connect(), sk->sk_prot->disconnect() resets saddr. OTOH, in
(dccp|tcp)_v[46]_connect(), if we fail after inet_hash6?_connect(), we
forget to reset saddr.

We fix this inconsistent error handling in the 1st patch, and then we'll
fix the bhash2 WARN_ON() issue.

Note that there is still an issue in that we reset saddr without checking
if there are conflicting sockets in bhash and bhash2, but this should be
another series.

See [1][2] for the previous discussion.

[0]: https://lore.kernel.org/netdev/0000000000003f33bc05dfaf44fe@google.com/
[1]: https://lore.kernel.org/netdev/20221029001249 [email protected]/
[2]: https://lore.kernel.org/netdev/20221103172419 [email protected]/
[3]: https://lore.kernel.org/netdev/20221118081906.053d5231@kernel.org/T/#m00aafedb29ff0b55d5e67aef0252ef1baaf4b6ee
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

dccp/tcp: Fixup bhash2 bucket when connect() fails.

If a socket bound to a wildcard address fails to connect(), we
only reset saddr and keep the port.  Then, we have to fix up the
bhash2 bucket; otherwise, the bucket has an inconsistent address
in the list.

Also, listen() for such a socket will fire the WARN_ON() in
inet_csk_get_port(). [0]

Note that when a system runs out of memory, we give up fixing the
bucket and unlink sk from bhash and bhash2 by inet_put_port().

[0]:
WARNING: CPU: 0 PID: 207 at net/ipv4/inet_connection_sock.c:548 inet_csk_get_port (net/ipv4/inet_connection_sock.c:548 (discriminator 1))
Modules linked in:
CPU: 0 PID: 207 Comm: bhash2_prev_rep Not tainted 6.1.0-rc3-00799-gc8421681c845 #63
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.amzn2022.0.1 04/01/2014
RIP: 0010:inet_csk_get_port (net/ipv4/inet_connection_sock.c:548 (discriminator 1))
Code: 74 a7 eb 93 48 8b 54 24 18 0f b7 cb 4c 89 e6 4c 89 ff e8 48 b2 ff ff 49 8b 87 18 04 00 00 e9 32 ff ff ff 0f 0b e9 34 ff ff ff <0f> 0b e9 42 ff ff ff 41 8b 7f 50 41 8b 4f 54 89 fe 81 f6 00 00 ff
RSP: 0018:ffffc900003d7e50 EFLAGS: 00010202
RAX: ffff8881047fb500 RBX: 0000000000004e20 RCX: 0000000000000000
RDX: 000000000000000a RSI: 00000000fffffe00 RDI: 00000000ffffffff
RBP: ffffffff8324dc00 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000004e20 R15: ffff8881054e1280
FS:  00007f8ac04dc740(0000) GS:ffff88842fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020001540 CR3: 00000001055fa003 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
inet_csk_listen_start (net/ipv4/inet_connection_sock.c:1205)
inet_listen (net/ipv4/af_inet.c:228)
__sys_listen (net/socket.c:1810)
__x64_sys_listen (net/socket.c:1819 net/socket.c:1817 net/socket.c:1817)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
RIP: 0033:0x7f8ac051de5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc1c177248 EFLAGS: 00000206 ORIG_RAX: 0000000000000032
RAX: ffffffffffffffda RBX: 0000000020001550 RCX: 00007f8ac051de5d
RDX: ffffffffffffff80 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00007ffc1c177270 R08: 0000000000000018 R09: 0000000000000007
R10: 0000000020001540 R11: 0000000000000206 R12: 00007ffc1c177388
R13: 0000000000401169 R14: 0000000000403e18 R15: 00007f8ac0723000
</TASK>

Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Reported-by: syzbot <[email protected]>
Reported-by: Mat Martineau <[email protected]>
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Acked-by: Joanne Koong <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>