]> Git Repo - linux.git/log
linux.git
3 years agoMerge branch 'ethtool-link_mode'
David S. Miller [Wed, 7 Apr 2021 21:53:04 +0000 (14:53 -0700)]
Merge branch 'ethtool-link_mode'

Danielle Ratson says:

====================
Fix link_mode derived params functionality

Currently, link_mode parameter derives 3 other link parameters, speed,
lanes and duplex, and the derived information is sent to user space.

Few bugs were found in that functionality.
First, some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback and cause receiving wrong link mode
information in user space. And also, some drivers can report random
values in the 'link_mode' field and cause general protection fault.

Second, the link parameters are only derived in netlink path so in ioctl
path, we don't any reasonable values.

Third, setting 'speed 10000 lanes 1' fails since the lanes parameter
wasn't set for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT.

Patch #1 solves the first two problems by removing link_mode parameter
and deriving the link parameters in driver instead of ethtool.
Patch #2 solves the third one, by setting the lanes parameter for the
link_mode.

v3:
* Remove the link_mode parameter in the first patch to solve
  both two issues from patch#1 and patch#2.
* Add the second patch to solve the third issue.

v2:
* Add patch #2.
* Introduce 'cap_link_mode_supported' instead of adding a
  validity field to 'ethtool_link_ksettings' struct in patch #1.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: Add lanes parameter for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT
Danielle Ratson [Wed, 7 Apr 2021 10:06:52 +0000 (13:06 +0300)]
ethtool: Add lanes parameter for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT

Lanes field is missing for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT
link mode and it causes a failure when trying to set
'speed 10000 lanes 1' on Spectrum-2 machines when autoneg is set to on.

Add the lanes parameter for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT
link mode.

Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: Remove link_mode param and derive link params from driver
Danielle Ratson [Wed, 7 Apr 2021 10:06:51 +0000 (13:06 +0300)]
ethtool: Remove link_mode param and derive link params from driver

Some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback, before populating it with actual values.
Such drivers will set the new 'link_mode' field to zero, resulting in
user space receiving wrong link mode information given that zero is a
valid value for the field.

Another problem is that some drivers (notably tun) can report random
values in the 'link_mode' field. This can result in a general protection
fault when the field is used as an index to the 'link_mode_params' array
[1].

This happens because such drivers implement their set_link_ksettings()
callback by simply overwriting their private copy of
'ethtool_link_ksettings' struct with the one they get from the stack,
which is not always properly initialized.

Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings'
and instead have drivers call ethtool_params_from_link_mode() with the
current link mode. The function will derive the link parameters (e.g.,
speed) from the link mode and fill them in the 'ethtool_link_ksettings'
struct.

v3:
* Remove link_mode parameter and derive the link parameters in
  the driver instead of passing link_mode parameter to ethtool
  and derive it there.

v2:
* Introduce 'cap_link_mode_supported' instead of adding a
  validity field to 'ethtool_link_ksettings' struct.

[1]
general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967]
CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446
Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03
+38 d0 7c 08 84 d2 0f 85 b9
RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000
RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960
RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f
R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000
R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210
FS:  0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37
 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586
 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656
 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620
 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842
 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440
 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
 sock_ioctl+0x477/0x6a0 net/socket.c:1177
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <[email protected]>
Reported-by: Eric Dumazet <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoxircom: remove redundant error check on variable err
Colin Ian King [Wed, 7 Apr 2021 09:39:22 +0000 (10:39 +0100)]
xircom: remove redundant error check on variable err

The error check on err is always false as err is always 0 at the
port_found label. The code is redundant and can be removed.

Addresses-Coverity: ("Logically dead code")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'linux-can-next-for-5.13-20210407' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Wed, 7 Apr 2021 21:44:52 +0000 (14:44 -0700)]
Merge tag 'linux-can-next-for-5.13-20210407' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-04-07

this is a pull request of 6 patches for net-next/master.

The first patch targets the CAN driver infrastructure, it improves the
alloc_can{,fd}_skb() function to set the pointer to the CAN frame to
NULL if skb allocation fails.

The next patch adds missing error handling to the m_can driver's RX
path (the code was introduced in -next, no need to backport).

In the next patch an unused constant is removed from an enum in the
c_can driver.

The last 3 patches target the mcp251xfd driver. They add BQL support
and try to work around a sometimes broken CRC when reading the TBC
register.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet: remove the new_ifindex argument from dev_change_net_namespace
Andrei Vagin [Wed, 7 Apr 2021 06:40:51 +0000 (23:40 -0700)]
net: remove the new_ifindex argument from dev_change_net_namespace

Here is only one place where we want to specify new_ifindex. In all
other cases, callers pass 0 as new_ifindex. It looks reasonable to add a
low-level function with new_ifindex and to convert
dev_change_net_namespace to a static inline wrapper.

Fixes: eeb85a14ee34 ("net: Allow to specify ifindex when device is moved to another namespace")
Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: introduce nla_policy for IFLA_NEW_IFINDEX
Andrei Vagin [Wed, 7 Apr 2021 06:40:03 +0000 (23:40 -0700)]
net: introduce nla_policy for IFLA_NEW_IFINDEX

In this case, we don't need to check that new_ifindex is positive in
validate_linkmsg.

Fixes: eeb85a14ee34 ("net: Allow to specify ifindex when device is moved to another namespace")
Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'mlx5-updates-2021-04-06' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 7 Apr 2021 21:38:24 +0000 (14:38 -0700)]
Merge tag 'mlx5-updates-2021-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-06

Introduce TC sample offload

Background
----------

The tc sample action allows user to sample traffic matched by tc
classifier. The sampling consists of choosing packets randomly and
sampling them using psample module.

The tc sample parameters include group id, sampling rate and packet's
truncation (to save kernel-user traffic).

Sample in TC SW
---------------

User must specify rate and group id for sample action, truncate is
optional.

tc filter add dev enp4s0f0_0 ingress protocol ip prio 1 flower \
src_mac 02:25:d0:14:01:02 dst_mac 02:25:d0:14:01:03 \
action sample rate 10 group 5 trunc 60 \
action mirred egress redirect dev enp4s0f0_1

The tc sample action kernel module 'act_sample' will call another
kernel module 'psample' to send sampled packets to userspace.

MLX5 sample HW offload - MLX5 driver patches
--------------------------------------------

The sample action is translated to a goto flow table object
destination which samples packets according to the provided
sample ratio. Sampled packets are duplicated. One copy is
processed by a termination table, named the sample table,
which sends the packet to the eswitch manager port (that will
be processed by software).

The second copy is processed by the default table which executes
the subsequent actions. The default table is created per <vport,
chain, prio> tuple as rules with different prios and chains may
overlap.

For example, for the following typical flow table:

+-------------------------------+
+       original flow table     +
+-------------------------------+
+         original match        +
+-------------------------------+
+ sample action + other actions +
+-------------------------------+

We translate the tc filter with sample action to the following HW model:

        +---------------------+
        + original flow table +
        +---------------------+
        +   original match    +
        +---------------------+
                   |
                   v
+------------------------------------------------+
+                Flow Sampler Object             +
+------------------------------------------------+
+                    sample ratio                +
+------------------------------------------------+
+    sample table id    |    default table id    +
+------------------------------------------------+
           |                            |
           v                            v
+-----------------------------+  +----------------------------------------+
+        sample table         +  + default table per <vport, chain, prio> +
+-----------------------------+  +----------------------------------------+
+ forward to management vport +  +            original match              +
+-----------------------------+  +----------------------------------------+
                                 +            other actions               +
                                 +----------------------------------------+

Flow sampler object
-------------------

Hardware introduces flow sampler object to do sample. It is a new
destination type. Driver needs to specify two flow table ids in it.
One is sample table id. The other one is the default table id.
Sample table samples the packets according to the sample rate and
forward the sampled packets to eswitch manager port. Default table
finishes the subsequent actions.

Group id and reg_c0
-------------------

Userspace program will take different actions for sampled packets
according to tc sample action group id. So hardware must pass group
id to software for each sampled packets. In Paul Blakey's "Introduce
connection tracking offload" patch set, reg_c0 lower 16 bits are used
for miss packet chain id restore. We convert reg_c0 lower 16 bits to
a common object pool, so other features can also use it.

Since sample group id is 32 bits, create a 16 bits object id to map
the group id and write the object id to reg_c0 lower 16 bits. reg_c0
can only be used for matching. Write reg_c0 to flow_tag, so software
can get the object id via flow_tag and find group id via the common
object pool.

Sampler restore handle
----------------------

Use common object pool to create an object id to map sample parameters.
Allocate a modify header action to write the object id to reg_c0 lower
16 bits. Create a restore rule to pass the object id to software. So
software can identify sampled packets via the object id and send it to
userspace.

Aggregate the modify header action, restore rule and object id to a
sample restore handle. Re-use identical sample restore handle for
the same object id.

Send sampled packets to userspace
---------------------------------

The destination for sampled packets is eswitch manager port, so
representors can receive sampled packets together with the group id.
Driver will send sampled packets and group id to userspace via psample.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'mlx5-fixes-2021-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Wed, 7 Apr 2021 21:34:03 +0000 (14:34 -0700)]
Merge tag 'mlx5-fixes-2021-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-04-06

This series provides some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonfc/fdp: remove unnecessary assignment and label
wengjianfeng [Wed, 7 Apr 2021 03:16:38 +0000 (11:16 +0800)]
nfc/fdp: remove unnecessary assignment and label

In function fdp_nci_patch_otp and fdp_nci_patch_ram,many goto
out statements are used, and out label just return variable r.
in some places,just jump to the out label, and in other places,
assign a value to the variable r,then jump to the out label.
It is unnecessary, we just use return sentences to replace goto
sentences and delete out label.

Signed-off-by: wengjianfeng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodrm/amd/display: Add missing mask for DCN3
Qingqing Zhuo [Thu, 25 Mar 2021 07:44:11 +0000 (03:44 -0400)]
drm/amd/display: Add missing mask for DCN3

[Why]
DCN3 is not reusing DCN1 mask_sh_list, causing
SURFACE_FLIP_INT_MASK missing in the mapping.

[How]
Add the corresponding entry to DCN3 list.

Signed-off-by: Qingqing Zhuo <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Qingqing Zhuo <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
3 years agonet: tipc: Fix spelling errors in net/tipc module
Zheng Yongjun [Wed, 7 Apr 2021 01:59:45 +0000 (09:59 +0800)]
net: tipc: Fix spelling errors in net/tipc module

These patches fix a series of spelling errors in net/tipc module.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zheng Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomlxsw: core: Remove critical trip points from thermal zones
Vadim Pasternak [Tue, 6 Apr 2021 12:27:33 +0000 (15:27 +0300)]
mlxsw: core: Remove critical trip points from thermal zones

Disable software thermal protection by removing critical trip points
from all thermal zones.

The software thermal protection is redundant given there are two layers
of protection below it in firmware and hardware. The first layer is
performed by firmware, the second, in case firmware was not able to
perform protection, by hardware.
The temperature threshold set for hardware protection is always higher
than for firmware.

Signed-off-by: Vadim Pasternak <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: hsr: Reset MAC header for Tx path
Kurt Kanzenbach [Tue, 6 Apr 2021 07:35:09 +0000 (09:35 +0200)]
net: hsr: Reset MAC header for Tx path

Reset MAC header in HSR Tx path. This is needed, because direct packet
transmission, e.g. by specifying PACKET_QDISC_BYPASS does not reset the MAC
header.

This has been observed using the following setup:

|$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1
|$ ifconfig hsr0 up
|$ ./test hsr0

The test binary is using mmap'ed sockets and is specifying the
PACKET_QDISC_BYPASS socket option.

This patch resolves the following warning on a non-patched kernel:

|[  112.725394] ------------[ cut here ]------------
|[  112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568
|[  112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0)

The warning can be safely removed, because the other call sites of
hsr_forward_skb() make sure that the skb is prepared correctly.

Fixes: d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Kurt Kanzenbach <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agostmmac: intel: Enable SERDES PHY rx clk for PSE
Voon Weifeng [Tue, 6 Apr 2021 01:32:50 +0000 (09:32 +0800)]
stmmac: intel: Enable SERDES PHY rx clk for PSE

EHL PSE SGMII mode requires to ungate the SERDES PHY rx clk for power up
sequence and vice versa.

Signed-off-by: Voon Weifeng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'ethtool-doc'
David S. Miller [Wed, 7 Apr 2021 21:22:49 +0000 (14:22 -0700)]
Merge branch 'ethtool-doc'

Jakub Kicinski says:

====================
ethtool: kdoc fixes

Number of kdoc fixes to ethtool headers. All comment changes.

With all the patches posted kdoc script seems happy:
$ ./scripts/kernel-doc -none include/uapi/linux/ethtool.h include/linux/ethtool.h
$

Note that some of the changes are in -next, e.g. the FEC
documentation update so full effect will be seen after
trees converge.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: fix kdoc in headers
Jakub Kicinski [Wed, 7 Apr 2021 00:28:27 +0000 (17:28 -0700)]
ethtool: fix kdoc in headers

Fix remaining issues with kdoc in the ethtool headers.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: document reserved fields in the uAPI
Jakub Kicinski [Wed, 7 Apr 2021 00:28:26 +0000 (17:28 -0700)]
ethtool: document reserved fields in the uAPI

Add a note on expected handling of reserved fields,
and references to all kdocs. This fixes a bunch
of kdoc warnings.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: un-kdocify extended link state
Jakub Kicinski [Wed, 7 Apr 2021 00:28:25 +0000 (17:28 -0700)]
ethtool: un-kdocify extended link state

Extended link state structures and enums use kdoc headers
but then do not describe any of the members.

Convert to normal comments.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: document PHY tunable callbacks
Jakub Kicinski [Wed, 7 Apr 2021 00:23:59 +0000 (17:23 -0700)]
ethtool: document PHY tunable callbacks

Add missing kdoc for phy tunable callbacks.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'mptcp-next'
David S. Miller [Wed, 7 Apr 2021 21:09:40 +0000 (14:09 -0700)]
Merge branch 'mptcp-next'

Mat Martineau says:

====================
mptcp: Cleanup, a new test case, and header trimming

Some more patches to include from the MPTCP tree:

Patches 1-6 refactor an address-related data structure and reduce some
duplicate code that handles IPv4 and IPv6 addresses.

Patch 7 adds a test case for the MPTCP netlink interface, passing a
specific ifindex to the kernel.

Patch 8 drops extra header options from IPv4 address echo packets,
improving consistency and testability between IPv4 and IPv6.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: drop all sub-options except ADD_ADDR when the echo bit is set
Davide Caratti [Wed, 7 Apr 2021 00:16:04 +0000 (17:16 -0700)]
mptcp: drop all sub-options except ADD_ADDR when the echo bit is set

Current Linux carries echo-ed ADD_ADDR over pure TCP ACKs, so there is no
need to add a DSS element that would fit only ADD_ADDR with IPv4 address.
Drop the DSS from echo-ed ADD_ADDR, regardless of the IP version.

Signed-off-by: Davide Caratti <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoselftests: mptcp: add the net device name testcase
Geliang Tang [Wed, 7 Apr 2021 00:16:03 +0000 (17:16 -0700)]
selftests: mptcp: add the net device name testcase

This patch added a new testcase for setting the net device name. In it,
pass the net device name to pm_nl_ctl to set the ifindex field of struct
mptcp_pm_addr_entry.

Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: unify add_addr(6)_generate_hmac
Geliang Tang [Wed, 7 Apr 2021 00:16:02 +0000 (17:16 -0700)]
mptcp: unify add_addr(6)_generate_hmac

The length of the IPv4 address is 4 octets and IPv6 is 16. That's the only
difference between add_addr_generate_hmac and add_addr6_generate_hmac.

This patch dropped the duplicate code and unify them into one.

Co-developed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: drop MPTCP_ADDR_IPVERSION_4/6
Geliang Tang [Wed, 7 Apr 2021 00:16:01 +0000 (17:16 -0700)]
mptcp: drop MPTCP_ADDR_IPVERSION_4/6

Since the type of the address family in struct mptcp_options_received
became sa_family_t, we should set AF_INET/AF_INET6 to it, instead of
using MPTCP_ADDR_IPVERSION_4/6.

Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: use mptcp_addr_info in mptcp_options_received
Geliang Tang [Wed, 7 Apr 2021 00:16:00 +0000 (17:16 -0700)]
mptcp: use mptcp_addr_info in mptcp_options_received

This patch added a new struct mptcp_addr_info member addr in struct
mptcp_options_received, and dropped the original family, addr_id, addr,
addr6 and port fields in it. Then we can pass the parameter mp_opt.addr
directly to mptcp_pm_add_addr_received and mptcp_pm_add_addr_echoed.

Since the port number became big-endian now, use htons to convert the
incoming port number to it. Also use ntohs to convert it when passing
it to add_addr_generate_hmac or printing it out.

Co-developed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: drop OPTION_MPTCP_ADD_ADDR6
Geliang Tang [Wed, 7 Apr 2021 00:15:59 +0000 (17:15 -0700)]
mptcp: drop OPTION_MPTCP_ADD_ADDR6

Since the family field was added in struct mptcp_out_options, no need to
use OPTION_MPTCP_ADD_ADDR6 to identify the IPv6 address. Drop it.

Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: use mptcp_addr_info in mptcp_out_options
Geliang Tang [Wed, 7 Apr 2021 00:15:58 +0000 (17:15 -0700)]
mptcp: use mptcp_addr_info in mptcp_out_options

This patch moved the mptcp_addr_info struct from protocol.h to mptcp.h,
added a new struct mptcp_addr_info member addr in struct mptcp_out_options,
and dropped the original addr, addr6, addr_id and port fields in it. Then
we can use opts->addr to get the adding address from PM directly using
mptcp_pm_add_addr_signal.

Since the port number became big-endian now, use ntohs to convert it
before sending it out with the ADD_ADDR suboption. Also convert it
when passing it to add_addr_generate_hmac or printing it out.

Co-developed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agomptcp: move flags and ifindex out of mptcp_addr_info
Geliang Tang [Wed, 7 Apr 2021 00:15:57 +0000 (17:15 -0700)]
mptcp: move flags and ifindex out of mptcp_addr_info

This patch moved the flags and ifindex fields from struct mptcp_addr_info
to struct mptcp_pm_addr_entry. Add the flags and ifindex values as two new
parameters to __mptcp_subflow_connect.

In mptcp_pm_create_subflow_or_signal_addr, pass the local address entry's
flags and ifindex fields to __mptcp_subflow_connect.

In mptcp_pm_nl_add_addr_received, just pass two zeros to it.

Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet/rds: Avoid potential use after free in rds_send_remove_from_sock
Aditya Pakki [Wed, 7 Apr 2021 00:09:12 +0000 (19:09 -0500)]
net/rds: Avoid potential use after free in rds_send_remove_from_sock

In case of rs failure in rds_send_remove_from_sock(), the 'rm' resource
is freed and later under spinlock, causing potential use-after-free.
Set the free pointer to NULL to avoid undefined behavior.

Signed-off-by: Aditya Pakki <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'arc-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Linus Torvalds [Wed, 7 Apr 2021 20:21:21 +0000 (13:21 -0700)]
Merge tag 'arc-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixlets from Vineet Gupta:
 "A few straggler fixes for ARC"

* tag 'arc-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: treewide: avoid the pointer addition with NULL pointer
  arc: kernel: Return -EFAULT if copy_to_user() fails
  ARC: haps: bump memory to 1 GB

3 years agoIB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
Mike Marciniszyn [Mon, 29 Mar 2021 13:48:19 +0000 (09:48 -0400)]
IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS

A panic can result when AIP is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 000000000000000
  PGD 0 P4D 0
  Oops: 0000 1 SMP PTI
  CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1
  Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
  RIP: 0010:__bitmap_and+0x1b/0x70
  RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
  RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
  RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
  R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
  FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
  Call Trace:
  hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
  hfi1_init_dd+0xd7f/0x1a90 [hfi1]
  ? pci_bus_read_config_dword+0x49/0x70
  ? pci_mmcfg_read+0x3e/0xe0
  do_init_one.isra.18+0x336/0x640 [hfi1]
  local_pci_probe+0x41/0x90
  pci_device_probe+0x105/0x1c0
  really_probe+0x212/0x440
  driver_probe_device+0x49/0xc0
  device_driver_attach+0x50/0x60
  __driver_attach+0x61/0x130
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x77/0xc0
  ? klist_add_tail+0x3b/0x70
  bus_add_driver+0x14d/0x1e0
  ? dev_init+0x10b/0x10b [hfi1]
  driver_register+0x6b/0xb0
  ? dev_init+0x10b/0x10b [hfi1]
  hfi1_mod_init+0x1e6/0x20a [hfi1]
  do_one_initcall+0x46/0x1c3
  ? free_unref_page_commit+0x91/0x100
  ? _cond_resched+0x15/0x30
  ? kmem_cache_alloc_trace+0x140/0x1c0
  do_init_module+0x5a/0x220
  load_module+0x14b4/0x17e0
  ? __do_sys_finit_module+0xa8/0x110
  __do_sys_finit_module+0xa8/0x110
  do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NO_NUMA_NODE.

Fix this issue by moving the initialization of dd->node to hfi1_devdata
allocation and remove the other pcibus_to_node() calls in the probe path
and use dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry as a
guard instead of dd->node.

Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev")
Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Cc: [email protected]
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
3 years agoRDMA/cxgb4: check for ipv6 address properly while destroying listener
Potnuri Bharat Teja [Wed, 31 Mar 2021 13:57:15 +0000 (19:27 +0530)]
RDMA/cxgb4: check for ipv6 address properly while destroying listener

ipv6 bit is wrongly set by the below which causes fatal adapter lookup
engine errors for ipv4 connections while destroying a listener.  Fix it to
properly check the local address for ipv6.

Fixes: 3408be145a5d ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Potnuri Bharat Teja <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
3 years agoof: properly check for error returned by fdt_get_name()
Frank Rowand [Mon, 5 Apr 2021 03:28:45 +0000 (22:28 -0500)]
of: properly check for error returned by fdt_get_name()

fdt_get_name() returns error values via a parameter pointer
instead of in function return.  Fix check for this error value
in populate_node() and callers of populate_node().

Chasing up the caller tree showed callers of various functions
failing to initialize the value of pointer parameters that
can return error values.  Initialize those values to NULL.

The bug was introduced by
commit e6a6928c3ea1 ("of/fdt: Convert FDT functions to use libfdt")
but this patch can not be backported directly to that commit
because the relevant code has further been restructured by
commit dfbd4c6eff35 ("drivers/of: Split unflatten_dt_node()")

The bug became visible by triggering a crash on openrisc with:
commit 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
as reported in:
https://lore.kernel.org/lkml/20210327224116[email protected]/

Fixes: 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Frank Rowand <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
3 years agoACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m
Vitaly Kuznetsov [Tue, 6 Apr 2021 15:56:40 +0000 (17:56 +0200)]
ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m

Commit 8cdddd182bd7 ("ACPI: processor: Fix CPU0 wakeup in
acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()//mwait_play_dead()
into acpi_idle_play_dead(). The problem is that these functions are not
exported to modules so when CONFIG_ACPI_PROCESSOR=m build fails.

The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
(the later from assembly) but it seems putting the whole pattern into a
new function and exporting it instead is better.

Reported-by: kernel test robot <[email protected]>
Fixes: 8cdddd182bd7 ("CPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()")
Cc: <[email protected]> # 5.10+
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
3 years agoMerge tag 'arm-fixes-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Wed, 7 Apr 2021 16:26:50 +0000 (09:26 -0700)]
Merge tag 'arm-fixes-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Most of the changes again are devicetree fixes, but there are also
  five trivial build fixes for issues I found when test building with
  gcc-11 or when running 'make W=1', and some OMAP platform specific
  code fixups.

  Broadcom:
   - One revert for a Raspberry pi interrupt controller change that
     caused a regression.

  TI OMAP:
   - Remove unused duplicate sha2md5_fck clock node that can race with
     the OMAP4_SHA2MD5_CLKCTRL clock node for disable for unused clocks

   - Add aliases for omap4/5 mmc to put the slots back into the right
     order again

   - Fix typo for bionic voltage controllers that accidentally use mpu
     for all instances instead of mpu, core and iva

   - Fix random hangs for droid4 caused by missing fix from TI Android
     kernel tree to do a dummy smc call on cpuidle wakeup path

  NXP i.MX:
   - Fix a system failure on imx6qdl-phytec-pfla02 board when booting
     from SD, by adding missing vmmc supply for SD interfaces.

   - Fix address typo in i.MX8MM/Q IOMUXC_SD1_DATA0_GPIO2_IO2
     definition.

  Marvell mvebu:
   - Fix storm interrupt on Turris Omnia

   - Enable hardware buffer management as it should be

  ... and build fixes for PXA, Freescale, Marvell, OMAP1 and Keystone"

* tag 'arm-fixes-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: turris-omnia: configure LED[2]/INTn pin as interrupt pin
  ARM: dts: turris-omnia: fix hardware buffer management
  Revert "arm64: dts: marvell: armada-cp110: Switch to per-port SATA interrupts"
  ARM: mvebu: avoid clang -Wtautological-constant warning
  ARM: pxa: mainstone: avoid -Woverride-init warning
  ARM: omap1: fix building with clang IAS
  soc/fsl: qbman: fix conflicting alignment attributes
  ARM: keystone: fix integer overflow warning
  ARM: dts: imx6: pbab01: Set vmmc supply for both SD interfaces
  arm64: dts: imx8mm/q: Fix pad control of SD1_DATA0
  ARM: OMAP4: PM: update ROM return address for OSWR and OFF
  ARM: OMAP4: Fix PMIC voltage domains for bionic
  ARM: dts: Fix moving mmc devices with aliases for omap4 & 5
  ARM: dts: Drop duplicate sha2md5_fck to fix clk_disable race
  Revert "ARM: dts: bcm2711: Add the BSC interrupt controller"

3 years agoMerge branch 'parisc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Wed, 7 Apr 2021 16:20:07 +0000 (09:20 -0700)]
Merge branch 'parisc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
 "One link error fix found by the kernel test robot, one sparse warning
  fix, remove a duplicate declaration and some spelling fixes"

* 'parisc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: math-emu: Few spelling fixes in the file fpu.h
  parisc: avoid a warning on u8 cast for cmpxchg on u8 pointers
  parisc: parisc-agp requires SBA IOMMU driver
  parisc: Remove duplicate struct task_struct declaration

3 years agoMerge tag 'platform-drivers-x86-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 7 Apr 2021 16:14:04 +0000 (09:14 -0700)]
Merge tag 'platform-drivers-x86-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fix from Hans de Goede:
 "A single bugfix to fix spurious wakeups from suspend caused by recent
  intel-hid driver changes"

* tag 'platform-drivers-x86-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: intel-hid: Fix spurious wakeups caused by tablet-mode events during suspend

3 years agoMerge tag 'regulator-fix-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 7 Apr 2021 16:08:36 +0000 (09:08 -0700)]
Merge tag 'regulator-fix-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "bd9571mwv regulator fixes for v5.12.

  A set of driver specific fixes here, the main one is a fix to not try
  to set unsupported voltages on this device. The other two patches
  clean up the error handling and eliminate the possibility that we
  could overflow the page when writing sysfs output (which AFAICT wasn't
  an issue but better to be sure)"

* tag 'regulator-fix-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: bd9571mwv: Convert device attribute to sysfs_emit()
  regulator: bd9571mwv: Fix regulator name printed on registration failure
  regulator: bd9571mwv: Fix AVS and DVFS voltage range

3 years agoxen/evtchn: Change irq_info lock to raw_spinlock_t
Luca Fancellu [Tue, 6 Apr 2021 10:51:04 +0000 (11:51 +0100)]
xen/evtchn: Change irq_info lock to raw_spinlock_t

Unmask operation must be called with interrupt disabled,
on preempt_rt spin_lock_irqsave/spin_unlock_irqrestore
don't disable/enable interrupts, so use raw_* implementation
and change lock variable in struct irq_info from spinlock_t
to raw_spinlock_t

Cc: [email protected]
Fixes: 25da4618af24 ("xen/events: don't unmask an event channel when an eoi is pending")
Signed-off-by: Luca Fancellu <[email protected]>
Reviewed-by: Julien Grall <[email protected]>
Reviewed-by: Wei Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Boris Ostrovsky <[email protected]>
3 years agoMerge tag 'asoc-fix-v5.12-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git...
Takashi Iwai [Wed, 7 Apr 2021 13:00:33 +0000 (15:00 +0200)]
Merge tag 'asoc-fix-v5.12-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.12

A fairly small batch of driver specific fixes, mainly for various x86
systems with the biggest set being fixes to power down DSPs properly on
x86 SOF systems.

3 years agos390/setup: use memblock_free_late() to free old stack
Heiko Carstens [Mon, 5 Apr 2021 20:32:27 +0000 (22:32 +0200)]
s390/setup: use memblock_free_late() to free old stack

Use memblock_free_late() to free the old machine check stack to the
buddy allocator instead of leaking it.

Fixes: b61b1595124a ("s390: add stack for machine check handler")
Cc: Vasily Gorbik <[email protected]>
Acked-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
3 years agoALSA: aloop: Fix initialization of controls
Jonas Holmberg [Wed, 7 Apr 2021 07:54:28 +0000 (09:54 +0200)]
ALSA: aloop: Fix initialization of controls

Add a control to the card before copying the id so that the numid field
is initialized in the copy. Otherwise the numid field of active_id,
format_id, rate_id and channels_id will be the same (0) and
snd_ctl_notify() will not queue the events properly.

Signed-off-by: Jonas Holmberg <[email protected]>
Reviewed-by: Jaroslav Kysela <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
3 years agocan: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
Marc Kleine-Budde [Mon, 15 Mar 2021 09:59:09 +0000 (10:59 +0100)]
can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register

MCP251XFD_REG_TBC is the time base counter register. It increments
once per SYS clock tick, which is 20 or 40 MHz. Observation shows that
if the lowest byte (which is transferred first on the SPI bus) of that
register is 0x00 or 0x80 the calculated CRC doesn't always match the
transferred one.

To reproduce this problem let the driver read the TBC register in a
high frequency. This can be done by attaching only the mcp251xfd CAN
controller to a valid terminated CAN bus and send a single CAN frame.
As there are no other CAN controller on the bus, the sent CAN frame is
not ACKed and the mcp251xfd repeats it. If user space enables the bus
error reporting, each of the NACK errors is reported with a time
stamp (which is read from the TBC register) to user space.

$ ip link set can0 down
$ ip link set can0 up type can bitrate 500000 berr-reporting on
$ cansend can0 4FF#ff.01.00.00.00.00.00.00

This leads to several error messages per second:

| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 3a 86 da, CRC=0x7753) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 01 b4 da, CRC=0x5830) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 e9 23 db, CRC=0xa723) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 8a 30 db, CRC=0x4a9c) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 f3 43 db, CRC=0x66d2) retrying.

If the highest bit in the lowest byte is flipped the transferred CRC
matches the calculated one. We assume for now the CRC calculation in
the chip works on wrong data and the transferred data is correct.

This patch implements the following workaround:

- If a CRC read error on the TBC register is detected and the lowest
  byte is 0x00 or 0x80, the highest bit of the lowest byte is flipped
  and the CRC is calculated again.
- If the CRC now matches, the _original_ data is passed to the reader.
  For now we assume transferred data was OK.

Link: https://lore.kernel.org/r/[email protected]
Cc: Manivannan Sadhasivam <[email protected]>
Cc: Thomas Kopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
3 years agocan: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate...
Marc Kleine-Budde [Mon, 15 Mar 2021 07:59:15 +0000 (08:59 +0100)]
can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function

This patch factors out the crc check into a separate function. This is
preparation for the next patch.

Link: https://lore.kernel.org/r/[email protected]
Cc: Manivannan Sadhasivam <[email protected]>
Cc: Thomas Kopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
3 years agocan: mcp251xfd: add BQL support
Marc Kleine-Budde [Sun, 13 Dec 2020 16:25:15 +0000 (17:25 +0100)]
can: mcp251xfd: add BQL support

This patch re-adds BQL support to the driver. Support for
netdev_xmit_more() will be added in a separate patch series.

Link: https://lore.kernel.org/r/[email protected]
Cc: Manivannan Sadhasivam <[email protected]>
Cc: Thomas Kopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
3 years agocan: c_can: remove unused enum BOSCH_C_CAN_PLATFORM
Marc Kleine-Budde [Wed, 24 Jul 2019 09:51:32 +0000 (11:51 +0200)]
can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM

This patch removes the unused enum BOSCH_C_CAN_PLATFORM.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
3 years agocan: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_s...
Marc Kleine-Budde [Thu, 1 Apr 2021 08:37:31 +0000 (10:37 +0200)]
can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call

In commit 1be37d3b0414 ("can: m_can: fix periph RX path: use
rx-offload to ensure skbs are sent from softirq context") the RX path
for peripherals (i.e. SPI based m_can controllers) was converted to
the rx-offload infrastructure. However, the error handling for
can_rx_offload_queue_sorted() was forgotten.
can_rx_offload_queue_sorted() will return with an error if the
internal queue is full.

This patch adds the missing error handling, by increasing the
rx_fifo_errors.

Fixes: 1be37d3b0414 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: coverity-bot <[email protected]>
Addresses-Coverity-ID: 1503583 ("Error handling issues")
Reviewed-by: Kees Cook <[email protected]>
Cc: Torin Cooper-Bennun <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
3 years agocan: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails
Marc Kleine-Budde [Fri, 2 Apr 2021 10:05:39 +0000 (12:05 +0200)]
can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails

The handling of CAN bus errors typically consist of allocating a CAN
error SKB using alloc_can_err_skb() followed by stats handling and
filling the error details in the newly allocated CAN error SKB. Even
if the allocation of the SKB fails the stats handling should not be
skipped.

The common pattern in CAN drivers is to allocate the skb and work on
the struct can_frame pointer "cf", if it has been assigned by
alloc_can_err_skb().

| skb = alloc_can_err_skb(priv->ndev, &cf);
|
|  /* RX errors */
|  if (bdiag1 & (MCP251XFD_REG_BDIAG1_DCRCERR |
|        MCP251XFD_REG_BDIAG1_NCRCERR)) {
|  netdev_dbg(priv->ndev, "CRC error\n");
|
|  stats->rx_errors++;
|  if (cf)
|  cf->data[3] |= CAN_ERR_PROT_LOC_CRC_SEQ;
|  }

In case of an OOM alloc_can_err_skb() returns NULL, but doesn't set
"cf" to NULL as well. For the above pattern to work the "cf" has to be
initialized to NULL, which is easily forgotten.

To solve this kind of problems, set "cf" to NULL if
alloc_can_err_skb() returns NULL.

Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Vincent MAILHOL <[email protected]>
Reviewed-by: Vincent Mailhol <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
3 years agonet/mlx5e: TC, Add support to offload sample action
Chris Mi [Mon, 21 Sep 2020 08:45:07 +0000 (16:45 +0800)]
net/mlx5e: TC, Add support to offload sample action

The following diagram illustrates the hardware model for tc sample action:

        +---------------------+
        + original flow table +
        +---------------------+
        +   original match    +
        +---------------------+
                   |
                   v
+------------------------------------------------+
+                Flow Sampler Object             +
+------------------------------------------------+
+                    sample ratio                +
+------------------------------------------------+
+    sample table id    |    default table id    +
+------------------------------------------------+
           |                            |
           v                            v
+-----------------------------+  +----------------------------------------+
+        sample table         +  + default table per <vport, chain, prio> +
+-----------------------------+  +----------------------------------------+
+ forward to management vport +  +            original match              +
+-----------------------------+  +----------------------------------------+
                                 +            other actions               +
                                 +----------------------------------------+

The sample action is translated to a goto flow table object
destination which samples packets according to the provided
sample ratio. Sampled packets are duplicated. One copy is
processed by a termination table, named the sample table,
which sends the packet to the eswitch manager port (that will
be processed by software).

The second copy is processed by the default table which executes
the subsequent actions. The default table is created per <vport,
chain, prio> tuple as rules with different prios and chains may
overlap.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5e: TC, Handle sampled packets
Chris Mi [Tue, 26 Jan 2021 03:15:46 +0000 (11:15 +0800)]
net/mlx5e: TC, Handle sampled packets

Mark the sampled packets with a sample restore object. Send sampled
packets using the psample api.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5e: TC, Refactor tc update skb function
Chris Mi [Fri, 15 Jan 2021 08:03:47 +0000 (16:03 +0800)]
net/mlx5e: TC, Refactor tc update skb function

As a pre-step to process sampled packet in this function.

Signed-off-by: Chris Mi <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5e: TC, Add sampler restore handle API
Chris Mi [Tue, 26 Jan 2021 03:08:28 +0000 (11:08 +0800)]
net/mlx5e: TC, Add sampler restore handle API

Use common object pool to create an object ID to map sample parameters.
Allocate a modify header action to write the object ID to reg_c0 lower
16 bits. Create a restore rule to pass the object ID to software. So
software can identify sampled packets via the object ID and send it to
userspace.

Aggregate the modify header action, restore rule and object ID to a
sample restore handle. Re-use identical sample restore handle for
the same object ID.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5e: TC, Add sampler object API
Chris Mi [Mon, 21 Sep 2020 05:25:54 +0000 (13:25 +0800)]
net/mlx5e: TC, Add sampler object API

In order to offload sample action, HW introduces sampler object. The
sampler object samples packets according to the provided sample ratio.
Sampled packets are duplicated. One copy is processed by a termination
table, named the sample table, which sends the packet up to software.
The second copy is processed by the default table.

Instantiate sampler object. Re-use identical sampler object for
the same sample ratio, sample table and default table as a prestep for
offloading tc sample actions.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5e: TC, Add sampler termination table API
Chris Mi [Fri, 18 Sep 2020 09:46:57 +0000 (17:46 +0800)]
net/mlx5e: TC, Add sampler termination table API

Sampled packets are sent to software using termination tables. There
is only one rule in that table that is to forward sampled packets to
the e-switch management vport.

Create a sampler termination table and rule for each eswitch.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5e: TC, Parse sample action
Chris Mi [Mon, 31 Aug 2020 05:28:35 +0000 (13:28 +0800)]
net/mlx5e: TC, Parse sample action

Parse TC sample action and save sample parameters in flow attribute
data structure.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Instantiate separate mapping objects for FDB and NIC tables
Chris Mi [Mon, 31 Aug 2020 05:27:53 +0000 (13:27 +0800)]
net/mlx5: Instantiate separate mapping objects for FDB and NIC tables

Currently, the u32 chain id is mapped to u16 value which is stored on
the lower 16 bits of reg_c0 for FDB and reg_b for NIC tables. The
mapping is internally maintained by the chains object. However, with
the introduction of reg_c0 objects the fdb may store more than just
the chain id on reg_c0. This is not relevant for NIC tables.

Separate the chains mapping instantiation for FDB and NIC tables.
Remove the mapping from the chains object. For FDB tables, create
the mapping per eswitch. For NIC tables, create the mapping per tc
table. Pass the corresponding mapping pointer when creating the
chains object.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Map register values to restore objects
Chris Mi [Thu, 10 Sep 2020 07:28:02 +0000 (15:28 +0800)]
net/mlx5: Map register values to restore objects

Currently reg_c0 lower 16 bits and reg_b are used to store the chain
id that missed in FDB and NIC tables accordingly. However, the
registers' values may index a restore object, rather than a single u32
value. Different object types can be used to restore mutually exclusive
contexts such as chain id and sample group id.

Use the mapping object to associate an index with a restore object
as a prestep for supporting additional restore types.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: E-switch, Set per vport table default group number
Chris Mi [Fri, 9 Oct 2020 03:06:33 +0000 (11:06 +0800)]
net/mlx5: E-switch, Set per vport table default group number

Different per voprt table is created using a different per vport table
namespace. Because we can't use variable to set the namespace member
value.  If max group number is 0 in the namespace, use the eswitch
default max group number.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: E-switch, Generalize per vport table API
Chris Mi [Mon, 31 Aug 2020 05:23:25 +0000 (13:23 +0800)]
net/mlx5: E-switch, Generalize per vport table API

Currently, per vport table was used only for port mirroring actions.
However, sample action will also require a per vport table instance.

Generalize the vport table API to work with multiple namespaces where
each namespace manages its own vport table instance.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: E-switch, Rename functions to follow naming convention.
Chris Mi [Thu, 14 Jan 2021 07:12:36 +0000 (15:12 +0800)]
net/mlx5: E-switch, Rename functions to follow naming convention.

Public api starts with mlx5 and remove mlx5 for non-public api.

Signed-off-by: Chris Mi <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: E-switch, Move vport table functions to a new file
Chris Mi [Mon, 31 Aug 2020 05:22:20 +0000 (13:22 +0800)]
net/mlx5: E-switch, Move vport table functions to a new file

Currently, the vport table functions are in common eswitch offload
file. This file is too big. Move the vport table create, delete and
lookup functions to a separate file. Put the file in esw directory.

Pre-step for generalizing its functionality for serving both the
mirroring and the sample features.

Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: fix kfree mismatch in indir_table.c
Xiaoming Ni [Mon, 5 Apr 2021 02:53:39 +0000 (10:53 +0800)]
net/mlx5: fix kfree mismatch in indir_table.c

Memory allocated by kvzalloc() should be freed by kvfree().

Fixes: 34ca65352ddf2 ("net/mlx5: E-Switch, Indirect table infrastructur")
Signed-off-by: Xiaoming Ni <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix PBMC register mapping
Aya Levin [Sun, 4 Apr 2021 09:55:00 +0000 (12:55 +0300)]
net/mlx5: Fix PBMC register mapping

Add reserved mapping to cover all the register in order to avoid setting
arbitrary values to newer FW which implements the reserved fields.

Fixes: 50b4a3c23646 ("net/mlx5: PPTB and PBMC register firmware command support")
Signed-off-by: Aya Levin <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix PPLM register mapping
Aya Levin [Sun, 4 Apr 2021 07:50:50 +0000 (10:50 +0300)]
net/mlx5: Fix PPLM register mapping

Add reserved mapping to cover all the register in order to avoid
setting arbitrary values to newer FW which implements the reserved
fields.

Fixes: a58837f52d43 ("net/mlx5e: Expose FEC feilds and related capability bit")
Signed-off-by: Aya Levin <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix placement of log_max_flow_counter
Raed Salem [Thu, 21 Jan 2021 14:01:37 +0000 (16:01 +0200)]
net/mlx5: Fix placement of log_max_flow_counter

The cited commit wrongly placed log_max_flow_counter field of
mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended
placement.

Fixes: 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters")
Signed-off-by: Raed Salem <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agonet/mlx5: Fix HW spec violation configuring uplink
Eli Cohen [Wed, 24 Mar 2021 07:46:09 +0000 (09:46 +0200)]
net/mlx5: Fix HW spec violation configuring uplink

Make sure to modify uplink port to follow only if the uplink_follow
capability is set as required by the HW spec. Failure to do so causes
traffic to the uplink representor net device to cease after switching to
switchdev mode.

Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down")
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
3 years agoLOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late
Al Viro [Tue, 6 Apr 2021 23:46:51 +0000 (19:46 -0400)]
LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late

That (and traversals in case of umount .) should be done before
complete_walk().  Either a braino or mismerge damage on queue
reorders - either way, I should've spotted that much earlier.

Fucked-up-by: Al Viro <[email protected]>
X-Paperbag: Brown
Fixes: 161aff1d93ab "LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()"
Cc: [email protected] # v5.7+
Signed-off-by: Al Viro <[email protected]>
3 years agodocs: ethtool: correct quotes
Jakub Kicinski [Tue, 6 Apr 2021 22:59:31 +0000 (15:59 -0700)]
docs: ethtool: correct quotes

Quotes to backticks. All commands use backticks since the names
are constants.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agodocs: ethtool: fix some copy-paste errors
Jakub Kicinski [Tue, 6 Apr 2021 22:58:15 +0000 (15:58 -0700)]
docs: ethtool: fix some copy-paste errors

Fix incorrect documentation. Mostly referring to other objects,
likely because the text was copied and not adjusted.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: tun: set tun->dev->addr_len during TUNSETLINK processing
Phillip Potter [Tue, 6 Apr 2021 17:45:54 +0000 (18:45 +0100)]
net: tun: set tun->dev->addr_len during TUNSETLINK processing

When changing type with TUNSETLINK ioctl command, set tun->dev->addr_len
to match the appropriate type, using new tun_get_addr_len utility function
which returns appropriate address length for given type. Fixes a
KMSAN-found uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=0766d38c656abeace60621896d705743aeefed51

Reported-by: [email protected]
Diagnosed-by: Eric Dumazet <[email protected]>
Signed-off-by: Phillip Potter <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonfp: flower: add support for packet-per-second policing
Peng Zhang [Tue, 6 Apr 2021 15:54:52 +0000 (17:54 +0200)]
nfp: flower: add support for packet-per-second policing

Allow hardware offload of a policer action attached to a matchall filter
which enforces a packets-per-second rate-limit.

e.g.
tc filter add dev tap1 parent ffff: u32 match \
        u32 0 0 police pkts_rate 3000 pkts_burst 1000

Signed-off-by: Peng Zhang <[email protected]>
Signed-off-by: Baowen Zheng <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoethtool: fix incorrect datatype in set_eee ops
Wong Vee Khee [Tue, 6 Apr 2021 13:17:30 +0000 (21:17 +0800)]
ethtool: fix incorrect datatype in set_eee ops

The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool
header file. Hence, we should use ethnl_update_u32() in set_eee ops.

Fixes: fd77be7bd43c ("ethtool: set EEE settings with EEE_SET request")
Cc: <[email protected]> # 5.10.x
Cc: Michal Kubecek <[email protected]>
Signed-off-by: Wong Vee Khee <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Reviewed-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: hns3: clear VF down state bit before request link status
Guangbin Huang [Tue, 6 Apr 2021 13:10:43 +0000 (21:10 +0800)]
net: hns3: clear VF down state bit before request link status

Currently, the VF down state bit is cleared after VF sending
link status request command. There is problem that when VF gets
link status replied from PF, the down state bit may still set
as 1. In this case, the link status replied from PF will be
ignored and always set VF link status to down.

To fix this problem, clear VF down state bit before VF requests
link status.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Tue, 6 Apr 2021 23:36:41 +0000 (16:36 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter/IPVS updates for your net-next tree:

1) Simplify log infrastructure modularity: Merge ipv4, ipv6, bridge,
   netdev and ARP families to nf_log_syslog.c. Add module softdeps.
   This fixes a rare deadlock condition that might occur when log
   module autoload is required. From Florian Westphal.

2) Moves part of netfilter related pernet data from struct net to
   net_generic() infrastructure. All of these users can be modules,
   so if they are not loaded there is no need to waste space. Size
   reduction is 7 cachelines on x86_64, also from Florian.

2) Update nftables audit support to report events once per table,
   to get it aligned with iptables. From Richard Guy Briggs.

3) Check for stale routes from the flowtable garbage collector path.
   This is fixing IPv6 which breaks due missing check for the dst_cookie.

4) Add a nfnl_fill_hdr() function to simplify netlink + nfnetlink
   headers setup.

5) Remove documentation on several statified functions.

6) Remove printk on netns creation for the FTP IPVS tracker,
   from Florian Westphal.

7) Remove unnecessary nf_tables_destroy_list_lock spinlock
   initialization, from Yang Yingliang.

7) Remove a duplicated forward declaration in ipset,
   from Wan Jiabing.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'linux-can-fixes-for-5.12-20210406' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Tue, 6 Apr 2021 23:34:11 +0000 (16:34 -0700)]
Merge tag 'linux-can-fixes-for-5.12-20210406' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2021-04-06

this is a pull request of 1 patch for net/master.

The patch is by me and fixes the SPI half duplex support in the
mcp251x CAN driver.
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agotime64.h: Consolidated PSEC_PER_SEC definition
Andy Shevchenko [Tue, 6 Apr 2021 10:22:51 +0000 (13:22 +0300)]
time64.h: Consolidated PSEC_PER_SEC definition

We have currently three users of the PSEC_PER_SEC each of them defining it
individually. Instead, move it to time64.h to be available for everyone.

There is a new user coming with the same constant in use. It will also
make its life easier.

Signed-off-by: Andy Shevchenko <[email protected]>
Acked-by: Heiko Stuebner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agostmmac: intel: Drop duplicate ID in the list of PCI device IDs
Andy Shevchenko [Tue, 6 Apr 2021 10:13:06 +0000 (13:13 +0300)]
stmmac: intel: Drop duplicate ID in the list of PCI device IDs

The PCI device IDs are defined with a prefix PCI_DEVICE_ID.
There is no need to repeat the ID part at the end of each definition.

Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Wong Vee Khee <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agopcnet32: Use pci_resource_len to validate PCI resource
Guenter Roeck [Tue, 6 Apr 2021 04:29:22 +0000 (21:29 -0700)]
pcnet32: Use pci_resource_len to validate PCI resource

pci_resource_start() is not a good indicator to determine if a PCI
resource exists or not, since the resource may start at address 0.
This is seen when trying to instantiate the driver in qemu for riscv32
or riscv64.

pci 0000:00:01.0: reg 0x10: [io  0x0000-0x001f]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000001f]
...
pcnet32: card has no PCI IO resources, aborting

Use pci_resouce_len() instead.

Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agobpf, sockmap: Fix incorrect fwd_alloc accounting
John Fastabend [Thu, 1 Apr 2021 22:00:40 +0000 (15:00 -0700)]
bpf, sockmap: Fix incorrect fwd_alloc accounting

Incorrect accounting fwd_alloc can result in a warning when the socket
is torn down,

 [18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230
 [...]
 [18455.319543] Call Trace:
 [18455.319556]  inet_csk_destroy_sock+0xba/0x1f0
 [18455.319577]  tcp_rcv_state_process+0x1b4e/0x2380
 [18455.319593]  ? lock_downgrade+0x3a0/0x3a0
 [18455.319617]  ? tcp_finish_connect+0x1e0/0x1e0
 [18455.319631]  ? sk_reset_timer+0x15/0x70
 [18455.319646]  ? tcp_schedule_loss_probe+0x1b2/0x240
 [18455.319663]  ? lock_release+0xb2/0x3f0
 [18455.319676]  ? __release_sock+0x8a/0x1b0
 [18455.319690]  ? lock_downgrade+0x3a0/0x3a0
 [18455.319704]  ? lock_release+0x3f0/0x3f0
 [18455.319717]  ? __tcp_close+0x2c6/0x790
 [18455.319736]  ? tcp_v4_do_rcv+0x168/0x370
 [18455.319750]  tcp_v4_do_rcv+0x168/0x370
 [18455.319767]  __release_sock+0xbc/0x1b0
 [18455.319785]  __tcp_close+0x2ee/0x790
 [18455.319805]  tcp_close+0x20/0x80

This currently happens because on redirect case we do skb_set_owner_r()
with the original sock. This increments the fwd_alloc memory accounting
on the original sock. Then on redirect we may push this into the queue
of the psock we are redirecting to. When the skb is flushed from the
queue we give the memory back to the original sock. The problem is if
the original sock is destroyed/closed with skbs on another psocks queue
then the original sock will not have a way to reclaim the memory before
being destroyed. Then above warning will be thrown

  sockA                          sockB

  sk_psock_strp_read()
   sk_psock_verdict_apply()
     -- SK_REDIRECT --
     sk_psock_skb_redirect()
                                skb_queue_tail(psock_other->ingress_skb..)

  sk_close()
   sock_map_unref()
     sk_psock_put()
       sk_psock_drop()
         sk_psock_zap_ingress()

At this point we have torn down our own psock, but have the outstanding
skb in psock_other. Note that SK_PASS doesn't have this problem because
the sk_psock_drop() logic releases the skb, its still associated with
our psock.

To resolve lets only account for sockets on the ingress queue that are
still associated with the current socket. On the redirect case we will
check memory limits per 6fa9201a89898, but will omit fwd_alloc accounting
until skb is actually enqueued. When the skb is sent via skb_send_sock_locked
or received with sk_psock_skb_ingress memory will be claimed on psock_other.

Fixes: 6fa9201a89898 ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self")
Reported-by: Andrii Nakryiko <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@john-XPS-13-9370
3 years agobpf, sockmap: Fix sk->prot unhash op reset
John Fastabend [Thu, 1 Apr 2021 22:00:19 +0000 (15:00 -0700)]
bpf, sockmap: Fix sk->prot unhash op reset

In '4da6a196f93b1' we fixed a potential unhash loop caused when
a TLS socket in a sockmap was removed from the sockmap. This
happened because the unhash operation on the TLS ctx continued
to point at the sockmap implementation of unhash even though the
psock has already been removed. The sockmap unhash handler when a
psock is removed does the following,

 void sock_map_unhash(struct sock *sk)
 {
void (*saved_unhash)(struct sock *sk);
struct sk_psock *psock;

rcu_read_lock();
psock = sk_psock(sk);
if (unlikely(!psock)) {
rcu_read_unlock();
if (sk->sk_prot->unhash)
sk->sk_prot->unhash(sk);
return;
}
        [...]
 }

The unlikely() case is there to handle the case where psock is detached
but the proto ops have not been updated yet. But, in the above case
with TLS and removed psock we never fixed sk_prot->unhash() and unhash()
points back to sock_map_unhash resulting in a loop. To fix this we added
this bit of code,

 static inline void sk_psock_restore_proto(struct sock *sk,
                                          struct sk_psock *psock)
 {
       sk->sk_prot->unhash = psock->saved_unhash;

This will set the sk_prot->unhash back to its saved value. This is the
correct callback for a TLS socket that has been removed from the sock_map.
Unfortunately, this also overwrites the unhash pointer for all psocks.
We effectively break sockmap unhash handling for any future socks.
Omitting the unhash operation will leave stale entries in the map if
a socket transition through unhash, but does not do close() op.

To fix set unhash correctly before calling into tls_update. This way the
TLS enabled socket will point to the saved unhash() handler.

Fixes: 4da6a196f93b1 ("bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop")
Reported-by: Cong Wang <[email protected]>
Reported-by: Lorenz Bauer <[email protected]>
Suggested-by: Cong Wang <[email protected]>
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/161731441904.68884.15593917809745631972.stgit@john-XPS-13-9370
3 years agonetdevsim: remove unneeded semicolon
Qiheng Lin [Tue, 6 Apr 2021 03:18:13 +0000 (11:18 +0800)]
netdevsim: remove unneeded semicolon

Eliminate the following coccicheck warning:
 drivers/net/netdevsim/fib.c:569:2-3: Unneeded semicolon

Signed-off-by: Qiheng Lin <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: ethernet: mtk_eth_soc: remove unneeded semicolon
Qiheng Lin [Tue, 6 Apr 2021 03:04:33 +0000 (11:04 +0800)]
net: ethernet: mtk_eth_soc: remove unneeded semicolon

Eliminate the following coccicheck warning:
 drivers/net/ethernet/mediatek/mtk_ppe.c:270:2-3: Unneeded semicolon

Signed-off-by: Qiheng Lin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agotipc: increment the tmp aead refcnt before attaching it
Xin Long [Tue, 6 Apr 2021 02:45:23 +0000 (10:45 +0800)]
tipc: increment the tmp aead refcnt before attaching it

Li Shuang found a NULL pointer dereference crash in her testing:

  [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc]
  [] Call Trace:
  []  <IRQ>
  []  tipc_crypto_rcv+0x2d9/0x8f0 [tipc]
  []  tipc_rcv+0x2fc/0x1120 [tipc]
  []  tipc_udp_recv+0xc6/0x1e0 [tipc]
  []  udpv6_queue_rcv_one_skb+0x16a/0x460
  []  udp6_unicast_rcv_skb.isra.35+0x41/0xa0
  []  ip6_protocol_deliver_rcu+0x23b/0x4c0
  []  ip6_input+0x3d/0xb0
  []  ipv6_rcv+0x395/0x510
  []  __netif_receive_skb_core+0x5fc/0xc40

This is caused by NULL returned by tipc_aead_get(), and then crashed when
dereferencing it later in tipc_crypto_rcv_complete(). This might happen
when tipc_crypto_rcv_complete() is called by two threads at the same time:
the tmp attached by tipc_crypto_key_attach() in one thread may be released
by the one attached by that in the other thread.

This patch is to fix it by incrementing the tmp's refcnt before attaching
it instead of calling tipc_aead_get() after attaching it.

Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Li Shuang <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonfc: s3fwrn5: remove unnecessary label
wengjianfeng [Tue, 6 Apr 2021 01:59:54 +0000 (09:59 +0800)]
nfc: s3fwrn5: remove unnecessary label

In function s3fwrn5_nci_post_setup, the variable ret is assigned then
goto out label, which just return ret, so we use return to replace it.
Other goto sentences are similar, we use return sentences to replace
goto sentences and delete out label.

Signed-off-by: wengjianfeng <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge branch 'usbnet-speed'
David S. Miller [Tue, 6 Apr 2021 23:22:37 +0000 (16:22 -0700)]
Merge branch 'usbnet-speed'

Grant Grundler says:

====================
usbnet: speed reporting for devices without MDIO

This series introduces support for USB network devices that report
speed as a part of their protocol, not emulating an MII to be accessed
over MDIO.

v2: rebased on recent upstream changes
v3: incorporated hints on naming and comments
v4: fix misplaced hunks; reword some commit messages;
    add same change for cdc_ether
v4-repost: added "net-next" to subject and Andrew Lunn's Reviewed-by

I'm reposting Oliver Neukum's <[email protected]> patch series with
fix ups for "misplaced hunks" (landed in the wrong patches).
Please fixup the "author" if "git am" fails to attribute the
patches 1-3 (of 4) to Oliver.

I've tested v4 series with "5.12-rc3+" kernel on Intel NUC6i5SYB
and + Sabrent NT-S25G. Google Pixelbook Go (chromeos-4.4 kernel)
+ Alpha Network AUE2500C were connected directly to the NT-S25G
to get 2.5Gbps link rate:
Settings for enx002427880815:
        Supported ports: [  ]
        Supported link modes:   Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 2500Mb/s
        Duplex: Half
        Auto-negotiation: off
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        MDI-X: Unknown
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

"Duplex" is a lie since we get no information about it.

I expect "Auto-Negotiation" is always true for cdc_ncm and
cdc_ether devices and perhaps someone knows offhand how
to have ethtool report "true" instead.

But this is good step in the right direction.

base-commit: 1c273e10bc0cc7efb933e0ca10e260cdfc9f0b8c
====================

Signed-off-by: David S. Miller <[email protected]>
3 years agonet: cdc_ether: record speed in status method
Grant Grundler [Mon, 5 Apr 2021 23:13:44 +0000 (16:13 -0700)]
net: cdc_ether: record speed in status method

Until very recently, the usbnet framework only had support functions
for devices which reported the link speed by explicitly querying the
PHY over a MDIO interface. However, the cdc_ether devices send
notifications when the link state or link speeds change and do not
expose the PHY (or modem) directly.

Support funtions (e.g. usbnet_get_link_ksettings_internal()) to directly
query state recorded by the cdc_ether driver were added in a previous patch.

Instead of cdc_ether spewing the link speed into the dmesg buffer,
record the link speed encoded in these notifications and tell the
usbnet framework to use the new functions to get link speed/state.

User space can now get the most recent link speed/state using ethtool.

v4: added to series since cdc_ether uses same notifications
    as cdc_ncm driver.

Signed-off-by: Grant Grundler <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: cdc_ncm: record speed in status method
Oliver Neukum [Mon, 5 Apr 2021 23:13:43 +0000 (16:13 -0700)]
net: cdc_ncm: record speed in status method

Until very recently, the usbnet framework only had support functions
for devices which reported the link speed by explicitly querying the
PHY over a MDIO interface. However, the cdc_ncm devices send
notifications when the link state or link speeds change and do not
expose the PHY (or modem) directly.

Support funtions (e.g. usbnet_get_link_ksettings_internal()) to directly
query state recorded by the cdc_ncm driver were added in a previous patch.

So instead of cdc_ncm spewing the link speed into the dmesg buffer,
record the link speed encoded in these notifications and tell the
usbnet framework to use the new functions to get link speed/state.
Link speed/state is now available via ethtool.

This is especially useful given all current RTL8156 devices emit
a connection/speed status notification every 32ms and this would
fill the dmesg buffer. This implementation replaces the one
recently submitted in de658a195ee23ca6aaffe197d1d2ea040beea0a2 :
   "net: usb: cdc_ncm: don't spew notifications"

v2: rebased on upstream
v3: changed variable names
v4: rewrote commit message

Signed-off-by: Oliver Neukum <[email protected]>
Tested-by: Roland Dreier <[email protected]>
Signed-off-by: Grant Grundler <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agousbnet: add method for reporting speed without MII
Oliver Neukum [Mon, 5 Apr 2021 23:13:42 +0000 (16:13 -0700)]
usbnet: add method for reporting speed without MII

The old method for reporting link speed assumed a driver uses the
generic phy (mii) MDIO read/write functions. CDC devices don't
expose the phy.

Add a primitive internal version reporting back directly what
the CDC notification/status operations recorded.

v2: rebased on upstream
v3: changed names and made clear which units are used
v4: moved hunks to correct patch; rewrote commmit messages

Signed-off-by: Oliver Neukum <[email protected]>
Tested-by: Roland Dreier <[email protected]>
Reviewed-by: Grant Grundler <[email protected]>
Tested-by: Grant Grundler <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agousbnet: add _mii suffix to usbnet_set/get_link_ksettings
Oliver Neukum [Mon, 5 Apr 2021 23:13:41 +0000 (16:13 -0700)]
usbnet: add _mii suffix to usbnet_set/get_link_ksettings

The generic functions assumed devices provided an MDIO interface (accessed
via older mii code, not phylib). This is true only for genuine ethernet.

Devices with a higher level of abstraction or based on different
technologies do not have MDIO. To support this case, first rename
the existing functions with _mii suffix.

v2: rebased on changed upstream
v3: changed names to clearly say that this does NOT use phylib
v4: moved hunks to correct patch; reworded commmit messages

Signed-off-by : Oliver Neukum <[email protected]>
Tested-by: Roland Dreier <[email protected]>
Reviewed-by: Grant Grundler <[email protected]>
Tested-by: Grant Grundler <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agovirtio_net: Do not pull payload in skb->head
Eric Dumazet [Fri, 2 Apr 2021 13:26:02 +0000 (06:26 -0700)]
virtio_net: Do not pull payload in skb->head

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <[email protected]>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <[email protected]>
Signed-off-by: Xuan Zhuo <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: [email protected]
Acked-by: Jason Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agotcp: Reset tcp connections in SYN-SENT state
Manoj Basapathi [Mon, 5 Apr 2021 17:02:42 +0000 (22:32 +0530)]
tcp: Reset tcp connections in SYN-SENT state

Userspace sends tcp connection (sock) destroy on network switch
i.e switching the default network of the device between multiple
networks(Cellular/Wifi/Ethernet).

Kernel though doesn't send reset for the connections in SYN-SENT state
and these connections continue to remain.
Even as per RFC 793, there is no hard rule to not send RST on ABORT in
this state.

Modify tcp_abort and tcp_disconnect behavior to send RST for connections
in syn-sent state to avoid lingering connections on network switch.

Signed-off-by: Manoj Basapathi <[email protected]>
Signed-off-by: Sauvik Saha <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agonet: broadcom: bcm4908enet: Fix a double free in bcm4908_enet_dma_alloc
Lv Yunlong [Fri, 2 Apr 2021 17:40:19 +0000 (10:40 -0700)]
net: broadcom: bcm4908enet: Fix a double free in bcm4908_enet_dma_alloc

In bcm4908_enet_dma_alloc, if callee bcm4908_dma_alloc_buf_descs() failed,
it will free the ring->cpu_addr by dma_free_coherent() and return error.
Then bcm4908_enet_dma_free() will be called, and free the same cpu_addr
by dma_free_coherent() again.

My patch set ring->cpu_addr to NULL after it is freed in
bcm4908_dma_alloc_buf_descs() to avoid the double free.

Fixes: 4feffeadbcb2e ("net: broadcom: bcm4908enet: add BCM4908 controller driver")
Signed-off-by: Lv Yunlong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
3 years agoMerge tag 'mvebu-fixes-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclem...
Arnd Bergmann [Sat, 3 Apr 2021 19:57:53 +0000 (21:57 +0200)]
Merge tag 'mvebu-fixes-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes

mvebu fixes for 5.12 (part 1)

2 fixes on on turris-omnia (Armada 38x based:)
 - Fix storm interrupt
 - Enable hardware buffer management as it should be

Unbreak AHCI on all Marvell Armada 7k8k / CN913x platforms

* tag 'mvebu-fixes-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
  ARM: dts: turris-omnia: configure LED[2]/INTn pin as interrupt pin
  ARM: dts: turris-omnia: fix hardware buffer management
  Revert "arm64: dts: marvell: armada-cp110: Switch to per-port SATA interrupts"

Link: https://lore.kernel.org/r/87a6qgctit.fsf@BL-laptop
Signed-off-by: Arnd Bergmann <[email protected]>
3 years agoBluetooth: Add support for reading AOSP vendor capabilities
Marcel Holtmann [Tue, 6 Apr 2021 19:55:52 +0000 (21:55 +0200)]
Bluetooth: Add support for reading AOSP vendor capabilities

When drivers indicate support for AOSP vendor extension, initialize them
and read its capabilities.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
3 years agonet: mac802154: Fix general protection fault
Pavel Skripkin [Thu, 4 Mar 2021 15:21:25 +0000 (18:21 +0300)]
net: mac802154: Fix general protection fault

syzbot found general protection fault in crypto_destroy_tfm()[1].
It was caused by wrong clean up loop in llsec_key_alloc().
If one of the tfm array members is in IS_ERR() range it will
cause general protection fault in clean up function [1].

Call Trace:
 crypto_free_aead include/crypto/aead.h:191 [inline] [1]
 llsec_key_alloc net/mac802154/llsec.c:156 [inline]
 mac802154_llsec_key_add+0x9e0/0xcc0 net/mac802154/llsec.c:249
 ieee802154_add_llsec_key+0x56/0x80 net/mac802154/cfg.c:338
 rdev_add_llsec_key net/ieee802154/rdev-ops.h:260 [inline]
 nl802154_add_llsec_key+0x3d3/0x560 net/ieee802154/nl802154.c:1584
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:674
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Pavel Skripkin <[email protected]>
Reported-by: [email protected]
Acked-by: Alexander Aring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
3 years agonet: ieee802154: stop dump llsec params for monitors
Alexander Aring [Mon, 5 Apr 2021 00:30:54 +0000 (20:30 -0400)]
net: ieee802154: stop dump llsec params for monitors

This patch stops dumping llsec params for monitors which we don't support
yet. Otherwise we will access llsec mib which isn't initialized for
monitors.

Reported-by: [email protected]
Signed-off-by: Alexander Aring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
3 years agonet: ieee802154: forbid monitor for del llsec seclevel
Alexander Aring [Mon, 5 Apr 2021 00:30:53 +0000 (20:30 -0400)]
net: ieee802154: forbid monitor for del llsec seclevel

This patch forbids to del llsec seclevel for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.

Reported-by: [email protected]
Signed-off-by: Alexander Aring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
3 years agonet: ieee802154: forbid monitor for add llsec seclevel
Alexander Aring [Mon, 5 Apr 2021 00:30:52 +0000 (20:30 -0400)]
net: ieee802154: forbid monitor for add llsec seclevel

This patch forbids to add llsec seclevel for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.

Signed-off-by: Alexander Aring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
3 years agonet: ieee802154: stop dump llsec seclevels for monitors
Alexander Aring [Mon, 5 Apr 2021 00:30:51 +0000 (20:30 -0400)]
net: ieee802154: stop dump llsec seclevels for monitors

This patch stops dumping llsec seclevels for monitors which we don't
support yet. Otherwise we will access llsec mib which isn't initialized
for monitors.

Signed-off-by: Alexander Aring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
This page took 0.135332 seconds and 4 git commands to generate.