Heiner Kallweit [Wed, 6 Jan 2021 13:03:40 +0000 (14:03 +0100)]
net: phy: replace mutex_is_locked with lockdep_assert_held in phylib
Switch to lockdep_assert_held(_once), similar to what is being done
in other subsystems. One advantage is that there's zero runtime
overhead if lockdep support isn't enabled.
Christian Perle reported a PMTU blackhole due to unexpected interaction
between the ip defragmentation that comes with connection tracking and
ip tunnels.
Unfortunately setting 'nopmtudisc' on the tunnel breaks the test
scenario even without netfilter.
MTU is 1500 everywhere, except on Router A to Wanrouter and
Wanrouter to Router B.
Router A and Router B use IPIP tunnel interfaces to tunnel traffic
between Client A and Client B over WAN.
Client A sends a 1400 byte UDP datagram to Client B.
This packet gets encapsulated in the IPIP tunnel.
This works, packet is received on client B.
When conntrack (or anything else that forces ip defragmentation) is
enabled on Router A, the packet gets dropped on Router A after
encapsulation because they exceed the link MTU.
Setting the 'nopmtudisc' flag on the IPIP tunnel makes things worse,
no packets pass even in the no-netfilter scenario.
Patch one is a reproducer script for selftest infra.
Patch two is a fix for 'nopmtudisc' behaviour so ip_tunnel will send
an icmp error to Client A. This allows 'nopmtudisc' tunnel to forward
the UDP datagrams.
Patch three enables ip refragmentation for all reassembled packets, just
like ipv6.
====================
net: ip: always refragment ip defragmented packets
Conntrack reassembly records the largest fragment size seen in IPCB.
However, when this gets forwarded/transmitted, fragmentation will only
be forced if one of the fragmented packets had the DF bit set.
In that case, a flag in IPCB will force fragmentation even if the
MTU is large enough.
This should work fine, but this breaks with ip tunnels.
Consider client that sends a UDP datagram of size X to another host.
The client fragments the datagram, so two packets, of size y and z, are
sent. DF bit is not set on any of these packets.
Middlebox netfilter reassembles those packets back to single size-X
packet, before routing decision.
packet-size-vs-mtu checks in ip_forward are irrelevant, because DF bit
isn't set. At output time, ip refragmentation is skipped as well
because x is still smaller than the mtu of the output device.
If ttransmit device is an ip tunnel, the packet size increases to
x+overhead.
Also, tunnel might be configured to force DF bit on outer header.
In this case, packet will be dropped (exceeds MTU) and an ICMP error is
generated back to sender.
But sender already respects the announced MTU, all the packets that
it sent did fit the announced mtu.
Force refragmentation as per original sizes unconditionally so ip tunnel
will encapsulate the fragments instead.
The only other solution I see is to place ip refragmentation in
the ip_tunnel code to handle this case.
Fixes: d6b915e29f4ad ("ip_fragment: don't forward defragmented DF packet") Reported-by: Christian Perle <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Acked-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
For some reason ip_tunnel insist on setting the DF bit anyway when the
inner header has the DF bit set, EVEN if the tunnel was configured with
'nopmtudisc'.
This means that the script added in the previous commit
cannot be made to work by adding the 'nopmtudisc' flag to the
ip tunnel configuration. Doing so breaks connectivity even for the
without-conntrack/netfilter scenario.
When nopmtudisc is set, the tunnel will skip the mtu check, so no
icmp error is sent to client. Then, because inner header has DF set,
the outer header gets added with DF bit set as well.
IP stack then sends an error to itself because the packet exceeds
the device MTU.
It has been two releases since we added the common infra for UDP
tunnel port offload, and we have not heard of any major issues.
Remove the old direct driver NDOs completely, and perform minor
simplifications in the tunnel drivers.
Move the NETIF_F_RX_UDP_TUNNEL_PORT feature check into
udp_tunnel_nic_*_port() helpers, since they're always
done right before the call.
Add similar checks before calling the notifier.
udp_tunnel_nic invokes the notifier without checking
features which could result in some wasted cycles.
Lukas Bulwahn [Wed, 6 Jan 2021 16:17:35 +0000 (17:17 +0100)]
docs: octeontx2: tune rst markup
Commit 80b9414832a1 ("docs: octeontx2: Add Documentation for NPA health
reporters") added new documentation with improper formatting for rst, and
caused a few new warnings for make htmldocs in octeontx2.rst:169--202.
Tune markup and formatting for better presentation in the HTML view.
====================
bcm63xx_enet: major makeover of driver
This patch series aim to improve the bcm63xx_enet driver by integrating the
latest networking features, i.e. batched rx processing, BQL, build_skb,
etc.
The newer enetsw SoCs are found to be able to do unaligned rx DMA by adding
NET_IP_ALIGN padding which, combined with these patches, improved packet
processing performance by ~50% on BCM6328.
Older non-enetsw SoCs still benefit mainly from rx batching. Performance
improvement of ~30% is observed on BCM6333.
The BCM63xx SoCs are designed for routers. As such, having BQL is
beneficial as well as trivial to add.
v3:
* Simplify xmit_more patch by not moving around the code needlessly.
* Fix indentation in xmit_more patch.
* Fix indentation in build_skb patch.
* Split rx ring cleanup patch from build_skb patch and precede build_skb
patch for better understanding, as suggested by Florian Fainelli.
v2:
* Add xmit_more support and rx loop improvisation patches.
* Moved BQL netdev_reset_queue() to bcm_enet_stop()/bcm_enetsw_stop()
functions as suggested by Florian Fainelli.
* Improved commit messages.
====================
We can increase the efficiency of rx path by using buffers to receive
packets then build SKBs around them just before passing into the network
stack. In contrast, preallocating SKBs too early reduces CPU cache
efficiency.
Check if we're in NAPI context when refilling RX. Normally we're almost
always running in NAPI context. Dispatch to napi_alloc_frag directly
instead of relying on netdev_alloc_frag which does the same but
with the overhead of local_bh_disable/enable.
Tested on BCM6328 320 MHz and iperf3 -M 512 to measure packet/sec
performance. Included netif_receive_skb_list and NET_IP_ALIGN
optimizations.
Use netdev_alloc_skb_ip_align on newer SoCs with integrated switch
(enetsw) when refilling RX. Increases packet processing performance
by 30% (with netif_receive_skb_list).
Non-enetsw SoCs cannot function with the extra pad so continue to use
the regular netdev_alloc_skb.
Tested on BCM6328 320 MHz and iperf3 -M 512 to measure packet/sec
performance.
Dinghao Liu [Mon, 21 Dec 2020 11:27:31 +0000 (19:27 +0800)]
net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups
When mlx5_create_flow_group() fails, ft->g should be
freed just like when kvzalloc() fails. The caller of
mlx5e_create_l2_table_groups() does not catch this
issue on failure, which leads to memleak.
Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Dinghao Liu [Mon, 28 Dec 2020 08:48:40 +0000 (16:48 +0800)]
net/mlx5e: Fix two double free cases
mlx5e_create_ttc_table_groups() frees ft->g on failure of
kvzalloc(), but such failure will be caught by its caller
in mlx5e_create_ttc_table() and ft->g will be freed again
in mlx5e_destroy_flow_table(). The same issue also occurs
in mlx5e_create_ttc_table_groups(). Set ft->g to NULL after
kfree() to avoid double free.
Fixes: 7b3722fa9ef6 ("net/mlx5e: Support RSS for GRE tunneled packets") Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Aya Levin [Sun, 27 Dec 2020 14:33:19 +0000 (16:33 +0200)]
net/mlx5e: ethtool, Fix restriction of autoneg with 56G
Prior to this patch, configuring speed to 50G with autoneg off over
devices supporting 50G per lane failed.
Support for 50G per lane introduced a new set of link-modes, on which
driver always performed a speed validation as if only legacy link-modes
were configured. Fix driver speed validation to force setting autoneg
over 56G only if in legacy link-mode.
Fixes: 3d7cadae51f1 ("net/mlx5e: ethtool, Fix analysis of speed setting") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Maor Dickman [Mon, 14 Dec 2020 11:53:03 +0000 (13:53 +0200)]
net/mlx5e: In skb build skip setting mark in switchdev mode
sop_drop_qpn field in the cqe is used by two features, in SWITCHDEV mode
to restore the chain id in case of a miss and in LEGACY mode to support
skbedit mark action. In build RX skb, the skb mark field is set regardless
of the configured mode which cause a corruption of the mark field in case
of switchdev mode.
Fix by overriding the mark value back to 0 in the representor tc update
skb flow.
Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Raed Salem <[email protected]> Reviewed-by: Oz Shlomo <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Alaa Hleihel [Mon, 4 Jan 2021 10:54:40 +0000 (12:54 +0200)]
net/mlx5: E-Switch, fix changing vf VLANID
Adding vf VLANID for the first time, or after having cleared previously
defined VLANID works fine, however, attempting to change an existing vf
VLANID clears the rules on the firmware, but does not add new rules for
the new vf VLANID.
Fix this by changing the logic in function esw_acl_egress_lgcy_setup()
so that it will always configure egress rules.
Moshe Shemesh [Fri, 13 Nov 2020 04:06:28 +0000 (06:06 +0200)]
net/mlx5e: Fix SWP offsets when vlan inserted by driver
In case WQE includes inline header the vlan is inserted by driver even
if vlan offload is set. On geneve over vlan interface where software
parser is used the SWP offsets should be updated according to the added
vlan.
Oz Shlomo [Mon, 7 Dec 2020 08:15:18 +0000 (08:15 +0000)]
net/mlx5e: CT: Use per flow counter when CT flow accounting is enabled
Connection counters may be shared for both directions when the counter
is used for connection aging purposes. However, if TC flow
accounting is enabled then a unique counter is required per direction.
Instantiate a unique counter per direction if the conntrack accounting
extension is enabled. Use a shared counter when the connection accounting
extension is disabled.
Aya Levin [Tue, 24 Nov 2020 20:16:23 +0000 (22:16 +0200)]
net/mlx5e: Add missing capability check for uplink follow
Expose firmware indication that it supports setting eswitch uplink state
to follow (follow the physical link). Condition setting the eswitch
uplink admin-state with this capability bit. Older FW may not support
the uplink state setting.
Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Fixes: 9b412cc35f00 ("net/mlx5e: Add LAG warning if bond slave is not lag master") Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Linus Torvalds [Thu, 7 Jan 2021 20:21:32 +0000 (12:21 -0800)]
Merge tag 'spi-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of core fixes here, both to do with handling of drivers which
don't report their maximum speed since we factored some of the
handling for transfer speeds out into the core in the previous
release.
There's also some driver specific fixes, including a relatively large
set for some races around timeouts in spi-geni-qcom"
* tag 'spi-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: fix the divide by 0 error when calculating xfer waiting time
spi: Fix the clamping of spi->max_speed_hz
spi: altera: fix return value for altera_spi_txrx()
spi: stm32: FIFO threshold level - fix align packet size
spi: spi-geni-qcom: Print an error when we timeout setting the CS
spi: spi-geni-qcom: Don't try to set CS if an xfer is pending
spi: spi-geni-qcom: Fail new xfers if xfer/cancel/abort pending
spi: spi-geni-qcom: Fix geni_spi_isr() NULL dereference in timeout case
When measuring the throughput (iperf3 + TCP) while routing on a
not-so-powerful device (Mediatek MT7621, 880MHz CPU), I noticed that I
achieved significantly lower speeds with QMI-based modems than for
example a USB LAN dongle. The CPU was saturated in all of my tests.
With the dongle I got ~300 Mbit/s, while I only measured ~200 Mbit/s
with the modems. All offloads, etc. were switched off for the dongle,
and I configured the modems to use QMAP (16k aggregation). The tests
with the dongle were performed in my local (gigabit) network, while the
LTE network the modems were connected to delivers 700-800 Mbit/s.
Profiling the kernel revealed the cause of the performance difference.
In qmimux_rx_fixup(), an SKB is allocated for each packet contained in
the URB. This SKB has too little headroom, causing the check in
skb_cow() (called from ip_forward()) to fail. pskb_expand_head() is then
called and the SKB is reallocated. In the output from perf, I see that a
significant amount of time is spent in pskb_expand_head() + support
functions.
In order to ensure that the SKB has enough headroom, this commit
increases the amount of memory allocated in qmimux_rx_fixup() by
LL_MAX_HEADER. The reason for using LL_MAX_HEADER and not a more
accurate value, is that we do not know the type of the outgoing network
interface. After making this change, I achieve the same throughput with
the modems as with the dongle.
Sean Tranchetti [Wed, 6 Jan 2021 00:22:26 +0000 (16:22 -0800)]
tools: selftests: add test for changing routes with PTMU exceptions
Adds new 2 new tests to the PTMU script: pmtu_ipv4/6_route_change.
These tests explicitly test for a recently discovered problem in the
IPv6 routing framework where PMTU exceptions were not properly released
when replacing a route via "ip route change ...".
After creating PMTU exceptions, the route from the device A to R1 will be
replaced with a new route, then device A will be deleted. If the PMTU
exceptions were properly cleaned up by the kernel, this device deletion
will succeed. Otherwise, the unregistration of the device will stall, and
messages such as the following will be logged in dmesg:
unregister_netdevice: waiting for veth_A-R1 to become free. Usage count = 4
Sean Tranchetti [Wed, 6 Jan 2021 00:22:25 +0000 (16:22 -0800)]
net: ipv6: fib: flush exceptions when purging route
Route removal is handled by two code paths. The main removal path is via
fib6_del_route() which will handle purging any PMTU exceptions from the
cache, removing all per-cpu copies of the DST entry used by the route, and
releasing the fib6_info struct.
The second removal location is during fib6_add_rt2node() during a route
replacement operation. This path also calls fib6_purge_rt() to handle
cleaning up the per-cpu copies of the DST entries and releasing the
fib6_info associated with the older route, but it does not flush any PMTU
exceptions that the older route had. Since the older route is removed from
the tree during the replacement, we lose any way of accessing it again.
As these lingering DSTs and the fib6_info struct are holding references to
the underlying netdevice struct as well, unregistering that device from the
kernel can never complete.
Linus Torvalds [Thu, 7 Jan 2021 19:57:56 +0000 (11:57 -0800)]
Merge tag 'regmap-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"A couple of small fixes for leaks when attaching a device to a
preexisting regmap"
* tag 'regmap-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: debugfs: Fix a reversed if statement in regmap_debugfs_init()
regmap: debugfs: Fix a memory leak when calling regmap_attach_dev
Jakub Kicinski [Thu, 7 Jan 2021 19:08:08 +0000 (11:08 -0800)]
Merge tag 'linux-can-fixes-for-5.11-20210107' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2021-01-07
The first patch is by me for the m_can driver and removes an erroneous
m_can_clk_stop() from the driver's unregister function.
The second patch targets the tcan4x5x driver, is by me, and fixes the bit
timing constant parameters.
The next two patches are by me, target the mcp251xfd driver, and fix a race
condition in the optimized TEF path (which was added in net-next for v5.11).
The similar code in the RX path is changed to look the same, although it
doesn't suffer from the race condition.
A patch by Lad Prabhakar updates the description and help text for the rcar CAN
driver to reflect all supported SoCs.
In the last patch Sriram Dash transfers the maintainership of the m_can driver
to Pankaj Sharma.
* tag 'linux-can-fixes-for-5.11-20210107' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
MAINTAINERS: Update MCAN MMIO device driver maintainer
can: rcar: Kconfig: update help description for CAN_RCAR config
can: mcp251xfd: mcp251xfd_handle_rxif_ring(): first increment RX tail pointer in HW, then in driver
can: mcp251xfd: mcp251xfd_handle_tefif(): fix TEF vs. TX race condition
can: tcan4x5x: fix bittiming const, use common bittiming from m_can driver
can: m_can: m_can_class_unregister(): remove erroneous m_can_clk_stop()
====================
The first 16 patches are by me and target the tcan4x5x SPI glue driver for the
m_can CAN driver. First there are a several cleanup commits, then the SPI
regmap part is converted to 8 bits per word, to make it possible to use that
driver on SPI controllers that only support the 8 bit per word mode (such as
the SPI cores on the raspberry pi).
Oliver Hartkopp contributes a patch for the CAN_RAW protocol. The getsockopt()
for CAN_RAW_FILTER is changed to return -ERANGE if the filterset does not fit
into the provided user space buffer.
The last two patches are by Joakim Zhang and add wakeup support to the flexcan
driver for the i.MX8QM SoC. The dt-bindings docs are extended to describe the
added property.
* tag 'linux-can-next-for-5.12-20210106' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: flexcan: add CAN wakeup function for i.MX8QM
dt-bindings: can: fsl,flexcan: add fsl,scu-index property to indicate a resource
can: raw: return -ERANGE when filterset does not fit into user space buffer
can: tcan4x5x: add support for half-duplex controllers
can: tcan4x5x: rework SPI access
can: tcan4x5x: add {wr,rd}_table
can: tcan4x5x: add max_raw_{read,write} of 256
can: tcan4x5x: tcan4x5x_regmap: set reg_stride to 4
can: tcan4x5x: fix max register value
can: tcan4x5x: tcan4x5x_regmap_init(): use spi as context pointer
can: tcan4x5x: tcan4x5x_regmap_write(): remove not needed casts and replace 4 by sizeof
can: tcan4x5x: rename regmap_spi_gather_write() -> tcan4x5x_regmap_gather_write()
can: tcan4x5x: remove regmap async support
can: tcan4x5x: tcan4x5x_bus: remove not needed read_flag_mask
can: tcan4x5x: mark struct regmap_bus tcan4x5x_bus as constant
can: tcan4x5x: move regmap code into seperate file
can: tcan4x5x: rename tcan4x5x.c -> tcan4x5x-core.c
can: tcan4x5x: beautify indention of tcan4x5x_of_match and tcan4x5x_id_table
can: tcan4x5x: replace DEVICE_NAME by KBUILD_MODNAME
====================
can: mcp251xfd: mcp251xfd_handle_rxif_ring(): first increment RX tail pointer in HW, then in driver
The previous patch fixes a TEF vs. TX race condition, by first updating the TEF
tail pointer in hardware, and then updating the driver internal pointer.
The same pattern exists in the RX-path, too. This should be no problem, as the
driver accesses the RX-FIFO from the interrupt handler only, thus the access is
properly serialized. Fix the order here, too, so that the TEF- and RX-path look
similar.
can: mcp251xfd: mcp251xfd_handle_tefif(): fix TEF vs. TX race condition
The mcp251xfd driver uses a TX FIFO for sending CAN frames and a TX Event FIFO
(TEF) for completed TX-requests.
The TEF event handling in the mcp251xfd_handle_tefif() function has a race
condition. It first increments the tx-ring's tail counter to signal that
there's room in the TX and TEF FIFO, then it increments the TEF FIFO in
hardware.
A running mcp251xfd_start_xmit() on a different CPU might not stop the txqueue
(as the tx-ring still shows free space). The next mcp251xfd_start_xmit() will
push a message into the chip and the TX complete event might overflow the TEF
FIFO.
can: tcan4x5x: fix bittiming const, use common bittiming from m_can driver
According to the TCAN4550 datasheet "SLLSF91 - DECEMBER 2018" the tcan4x5x has
the same bittiming constants as a m_can revision 3.2.x/3.3.0.
The tcan4x5x chip I'm using identifies itself as m_can revision 3.2.1, so
remove the tcan4x5x specific bittiming values and rely on the values in the
m_can driver, which are selected according to core revision.
Randy Dunlap [Wed, 6 Jan 2021 04:25:31 +0000 (20:25 -0800)]
ptp: ptp_ines: prevent build when HAS_IOMEM is not set
ptp_ines.c uses devm_platform_ioremap_resource(), which is only
built/available when CONFIG_HAS_IOMEM is enabled.
CONFIG_HAS_IOMEM is not enabled for arch/s390/, so builds on S390
have a build error:
s390-linux-ld: drivers/ptp/ptp_ines.o: in function `ines_ptp_ctrl_probe':
ptp_ines.c:(.text+0x17e6): undefined reference to `devm_platform_ioremap_resource'
Prevent builds of ptp_ines.c when HAS_IOMEM is not set.
Randy Dunlap [Wed, 6 Jan 2021 02:18:15 +0000 (18:18 -0800)]
net: dsa: fix led_classdev build errors
Fix build errors when LEDS_CLASS=m and NET_DSA_HIRSCHMANN_HELLCREEK=y.
This limits the latter to =m when LEDS_CLASS=m.
microblaze-linux-ld: drivers/net/dsa/hirschmann/hellcreek_ptp.o: in function `hellcreek_ptp_setup':
(.text+0xf80): undefined reference to `led_classdev_register_ext'
microblaze-linux-ld: (.text+0xf94): undefined reference to `led_classdev_register_ext'
microblaze-linux-ld: drivers/net/dsa/hirschmann/hellcreek_ptp.o: in function `hellcreek_ptp_free':
(.text+0x1018): undefined reference to `led_classdev_unregister'
microblaze-linux-ld: (.text+0x1024): undefined reference to `led_classdev_unregister'
Alan Maguire [Wed, 6 Jan 2021 15:59:06 +0000 (15:59 +0000)]
bpftool: Fix compilation failure for net.o with older glibc
For older glibc ~2.17, #include'ing both linux/if.h and net/if.h
fails due to complaints about redefinition of interface flags:
CC net.o
In file included from net.c:13:0:
/usr/include/linux/if.h:71:2: error: redeclaration of enumerator ‘IFF_UP’
IFF_UP = 1<<0, /* sysfs */
^
/usr/include/net/if.h:44:5: note: previous definition of ‘IFF_UP’ was here
IFF_UP = 0x1, /* Interface is up. */
The issue was fixed in kernel headers in [1], but since compilation
of net.c picks up system headers the problem can recur.
Dropping #include <linux/if.h> resolves the issue and it is
not needed for compilation anyhow.
Valdis Klētnieks [Sat, 26 Dec 2020 18:21:58 +0000 (13:21 -0500)]
gcc-plugins: fix gcc 11 indigestion with plugins...
Fedora Rawhide has started including gcc 11,and the g++ compiler
throws a wobbly when it hits scripts/gcc-plugins:
HOSTCXX scripts/gcc-plugins/latent_entropy_plugin.so
In file included from /usr/include/c++/11/type_traits:35,
from /usr/lib/gcc/x86_64-redhat-linux/11/plugin/include/system.h:244,
from /usr/lib/gcc/x86_64-redhat-linux/11/plugin/include/gcc-plugin.h:28,
from scripts/gcc-plugins/gcc-common.h:7,
from scripts/gcc-plugins/latent_entropy_plugin.c:78:
/usr/include/c++/11/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO
C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
32 | #error This file requires compiler and library support \
In fact, it works just fine with c++11, which has been in gcc since 4.8,
and we now require 4.9 as a minimum.
Linus Torvalds [Wed, 6 Jan 2021 19:19:08 +0000 (11:19 -0800)]
Merge tag 'for-5.11-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes that arrived before the end of the year:
- a bunch of fixes related to transaction handle lifetime wrt various
operations (umount, remount, qgroup scan, orphan cleanup)
- async discard scheduling fixes
- fix item size calculation when item keys collide for extend refs
(hardlinks)
- fix qgroup flushing from running transaction
- fix send, wrong file path when there is an inode with a pending
rmdir
- fix deadlock when cloning inline extent and low on free metadata
space"
* tag 'for-5.11-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: run delayed iputs when remounting RO to avoid leaking them
btrfs: add assertion for empty list of transactions at late stage of umount
btrfs: fix race between RO remount and the cleaner task
btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
btrfs: merge critical sections of discard lock in workfn
btrfs: fix racy access to discard_ctl data
btrfs: fix async discard stall
btrfs: tests: initialize test inodes location
btrfs: send: fix wrong file path when there is an inode with a pending rmdir
btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
btrfs: correctly calculate item size used when item key collision happens
btrfs: fix deadlock when cloning inline extent and low on free metadata space
Joakim Zhang [Fri, 6 Nov 2020 10:56:27 +0000 (18:56 +0800)]
can: flexcan: add CAN wakeup function for i.MX8QM
The System Controller Firmware (SCFW) is a low-level system function
which runs on a dedicated Cortex-M core to provide power, clock, and
resource management. It exists on some i.MX8 processors. e.g. i.MX8QM
(QM, QP), and i.MX8QX (QXP, DX). SCU driver manages the IPC interface
between host CPU and the SCU firmware running on M4.
For i.MX8QM, stop mode request is controlled by System Controller Unit(SCU)
firmware, this patch introduces FLEXCAN_QUIRK_SETUP_STOP_MODE_SCFW quirk
for this function.
Oliver Hartkopp [Wed, 16 Dec 2020 17:49:28 +0000 (18:49 +0100)]
can: raw: return -ERANGE when filterset does not fit into user space buffer
Multiple filters (struct can_filter) can be set with the setsockopt()
function, which was originally intended as a write-only operation.
As getsockopt() also provides a CAN_RAW_FILTER option to read back the
given filters, the caller has to provide an appropriate user space buffer.
In the case this buffer is too small the getsockopt() silently truncates
the filter information and gives no information about the needed space.
This is safe but not convenient for the programmer.
In net/core/sock.c the SO_PEERGROUPS sockopt had a similar requirement
and solved it by returning -ERANGE in the case that the provided data
does not fit into the given user space buffer and fills the required size
into optlen, so that the caller can retry with a matching buffer length.
This patch adopts this approach for CAN_RAW_FILTER getsockopt().
This patch reworks the SPI access and fixes several probems:
- tcan4x5x_regmap_gather_write(), tcan4x5x_regmap_read():
Do not place variable "addr" on stack and use it as buffer for SPI
transfer. Buffers for SPI transfers must be allocated from DMA save
memory.
- tcan4x5x_regmap_gather_write(), tcan4x5x_regmap_read():
Halfe number of SPI transfers by using a single buffer + memcpy().
This improves the performance, especially on SPI controllers, which
use interrupt based transfers.
- Use "8" bits per word, not "32". This makes it possible to use this
driver on SoCs like the Raspberry Pi, which SPI host controller
drivers only support 8 bits per word.
Note: this breaks half duplex only controllers. Support for them will be
re-added in the next patch.
can: tcan4x5x: tcan4x5x_regmap_init(): use spi as context pointer
This patch replaces the context pointer of the regmap callback functions by a
pointer to the spi_device. This saves one level of indirection in the
callbacks.
This patch renames the regmap_spi_gather_write() function to
tcan4x5x_regmap_gather_write(). Now it has a "tcan4x5x_" prefix as all other
functions in this driver.
Jiri Olsa [Tue, 5 Jan 2021 23:42:19 +0000 (00:42 +0100)]
tools/resolve_btfids: Warn when having multiple IDs for single type
The kernel image can contain multiple types (structs/unions)
with the same name. This causes distinct type hierarchies in
BTF data and makes resolve_btfids fail with error like:
BTFIDS vmlinux
FAILED unresolved symbol udp6_sock
as reported by Qais Yousef [1].
This change adds warning when multiple types of the same name
are detected:
BTFIDS vmlinux
WARN: multiple IDs found for 'file': 526, 113351 - using 526
WARN: multiple IDs found for 'sk_buff': 2744, 113958 - using 2744
We keep the lower ID for the given type instance and let the
build continue.
Also changing the 'nr' variable name to 'nr_types' to avoid confusion.
David S. Miller [Wed, 6 Jan 2021 01:05:12 +0000 (17:05 -0800)]
Merge branch 'net-ks8851-Add-KS8851-PHY-support'
Marek Vasut says:
====================
net: ks8851: Add KS8851 PHY support
The KS8851 has a reduced internal PHY, which is accessible through its
registers at offset 0xe4. The PHY is compatible with KS886x PHY present
in Micrel switches, including the PHY ID Low/High registers swap, which
is present both in the MAC and the switch.
====================
Marek Vasut [Tue, 5 Jan 2021 14:11:51 +0000 (15:11 +0100)]
net: ks8851: Register MDIO bus and the internal PHY
The KS8851 has a reduced internal PHY, which is accessible through its
registers at offset 0xe4. The PHY is compatible with KS886x PHY present
in Micrel switches, except the PHY ID Low/High registers are swapped.
Register MDIO bus so this PHY can be detected and probed by phylib.
Marek Vasut [Tue, 5 Jan 2021 14:11:50 +0000 (15:11 +0100)]
net: phy: micrel: Add KS8851 PHY support
The KS8851 has a reduced internal PHY, which is accessible through its
registers at offset 0xe4. The PHY is compatible with KS886x PHY present
in Micrel switches, including the PHY ID Low/High registers swap, which
is present both in the MAC and the switch.
Qinglang Miao [Tue, 5 Jan 2021 05:57:54 +0000 (13:57 +0800)]
net: qrtr: fix null-ptr-deref in qrtr_ns_remove
A null-ptr-deref bug is reported by Hulk Robot like this:
--------------
KASAN: null-ptr-deref in range [0x0000000000000128-0x000000000000012f]
Call Trace:
qrtr_ns_remove+0x22/0x40 [ns]
qrtr_proto_fini+0xa/0x31 [qrtr]
__x64_sys_delete_module+0x337/0x4e0
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x468ded
--------------
When qrtr_ns_init fails in qrtr_proto_init, qrtr_ns_remove which would
be called later on would raise a null-ptr-deref because qrtr_ns.workqueue
has been destroyed.
Fix it by making qrtr_ns_init have a return value and adding a check in
qrtr_proto_init.
net: cdc_ncm: correct overhead in delayed_ndp_size
Aligning to tx_ndp_modulus is not sufficient because the next align
call can be cdc_ncm_align_tail, which can add up to ctx->tx_modulus +
ctx->tx_remainder - 1 bytes. This used to lead to occasional crashes
on a Huawei 909s-120 LTE module as follows:
- the condition marked /* if there is a remaining skb [...] */ is true
so the swaps happen
- skb_out is set from ctx->tx_curr_skb
- skb_out->len is exactly 0x3f52
- ctx->tx_curr_size is 0x4000 and delayed_ndp_size is 0xac
(note that the sum of skb_out->len and delayed_ndp_size is 0x3ffe)
- the for loop over n is executed once
- the cdc_ncm_align_tail call marked /* align beginning of next frame */
increases skb_out->len to 0x3f56 (the sum is now 0x4002)
- the condition marked /* check if we had enough room left [...] */ is
false so we break out of the loop
- the condition marked /* If requested, put NDP at end of frame. */ is
true so the NDP is written into skb_out
- now skb_out->len is 0x4002, so padding_count is minus two interpreted
as an unsigned number, which is used as the length argument to memset,
leading to a crash with various symptoms but usually including
The cdc_ncm_align_tail call first aligns on a ctx->tx_modulus
boundary (adding at most ctx->tx_modulus-1 bytes), then adds
ctx->tx_remainder bytes. Alternatively, the next alignment call can
occur in cdc_ncm_ndp16 or cdc_ncm_ndp32, in which case at most
ctx->tx_ndp_modulus-1 bytes are added.
A similar problem has occurred before, and the code is nontrivial to
reason about, so add a guard before the crashing call. By that time it
is too late to prevent any memory corruption (we'll have written past
the end of the buffer already) but we can at least try to get a warning
written into an on-disk log by avoiding the hard crash caused by padding
past the buffer with a huge number of zeros.
Jian Shen [Tue, 5 Jan 2021 03:37:28 +0000 (11:37 +0800)]
net: hns3: fix incorrect handling of sctp6 rss tuple
For DEVICE_VERSION_V2, the hardware only supports src-ip,
dst-ip and verification-tag for rss tuple set of sctp6
packet. For DEVICE_VERSION_V3, the hardware supports
src-port and dst-port as well.
Currently, when user queries the sctp6 rss tuples info,
some unsupported information will be showed on V2. So add
a check for hardware version when initializing and queries
sctp6 rss tuple to fix this issue.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Jian Shen <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Yufeng Mo [Tue, 5 Jan 2021 03:37:27 +0000 (11:37 +0800)]
net: hns3: fix the number of queues actually used by ARQ
HCLGE_MBX_MAX_ARQ_MSG_NUM is used to apply memory for the number
of queues used by ARQ(Asynchronous Receive Queue), so the head
and tail pointers should also use this macro.
Fixes: 07a0556a3a73 ("net: hns3: Changes to support ARQ(Asynchronous Receive Queue)") Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Yonglong Liu [Tue, 5 Jan 2021 03:37:26 +0000 (11:37 +0800)]
net: hns3: fix a phy loopback fail issue
When phy driver does not implement the set_loopback interface,
phy loopback test will return -EOPNOTSUPP, and the loopback test
will fail. So when phy driver does not implement the set_loopback
interface, don't do phy loopback test.
Fixes: c9765a89d142 ("net: hns3: add phy selftest function") Signed-off-by: Yonglong Liu <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Geetha sowjanya [Mon, 4 Jan 2021 07:20:39 +0000 (12:50 +0530)]
octeontx2-pf: Add RSS multi group support
Hardware supports 8 RSS groups per interface. Currently we are using
only group '0'. This patch allows user to create new RSS groups/contexts
and use the same as destination for flow steering rules.
usage:
To steer the traffic to RQ 2,3
ethtool -X eth0 weight 0 0 1 1 context new
(It will print the allocated context id number)
New RSS context is 1
David S. Miller [Wed, 6 Jan 2021 00:32:08 +0000 (16:32 -0800)]
Merge branch 'stmmac-fixes'
Samuel Holland says:
====================
Fixes for dwmac-sun8i suspend/resume
This series fixes issues preventing dwmac-sun8i from working after a
suspend/resume cycle. Those issues include the PHY being left powered
off, the MAC syscon configuration being reset, and the reference to the
reset controller being improperly dropped. They also fix related issues
in probe error handling and driver removal.
====================
Previously, sun8i_dwmac_set_syscon was called from a chain of functions
in several different files:
sun8i_dwmac_probe
stmmac_dvr_probe
stmmac_hw_init
stmmac_hwif_init
sun8i_dwmac_setup
sun8i_dwmac_set_syscon
which made the lifetime of the syscon values hard to reason about. Part
of the problem is that there is no similar platform driver callback from
stmmac_dvr_remove. As a result, the driver unset the syscon value in
sun8i_dwmac_exit, but this leaves it uninitialized after a suspend/
resume cycle. It was also unset a second time (outside sun8i_dwmac_exit)
in the probe error path.
Move the init to the earliest available place in sun8i_dwmac_probe
(after stmmac_probe_config_dt, which initializes plat_dat), and the
deinit to the corresponding position in the cleanup order.
Since priv is not filled in until stmmac_dvr_probe, this requires
changing the sun8i_dwmac_set_syscon parameters to priv's two relevant
members.
Fixes: 9f93ac8d4085 ("net-next: stmmac: Add dwmac-sun8i") Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs") Signed-off-by: Samuel Holland <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Samuel Holland [Sun, 3 Jan 2021 11:17:43 +0000 (05:17 -0600)]
net: stmmac: dwmac-sun8i: Balance internal PHY power
sun8i_dwmac_exit calls sun8i_dwmac_unpower_internal_phy, but
sun8i_dwmac_init did not call sun8i_dwmac_power_internal_phy. This
caused PHY power to remain off after a suspend/resume cycle. Fix this by
recording if PHY power should be restored, and if so, restoring it.
Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs") Signed-off-by: Samuel Holland <[email protected]> Signed-off-by: David S. Miller <[email protected]>
While stmmac_pltfr_remove calls sun8i_dwmac_exit, the sun8i_dwmac_init
and sun8i_dwmac_exit functions are also called by the stmmac_platform
suspend/resume callbacks. They may be called many times during the
device's lifetime and should not release resources used by the driver.
Furthermore, there was no error handling in case registering the MDIO
mux failed during probe, and the EPHY clock was never released at all.
Fix all of these issues by moving the deinitialization code to a driver
removal callback. Also ensure the EPHY is powered down before removal.
Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs") Signed-off-by: Samuel Holland <[email protected]> Reviewed-by: Chen-Yu Tsai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
stmmac_pltfr_remove does three things in one function, making it
inapproprate for unwinding the steps in the probe function. Currently,
a failure before the call to stmmac_dvr_probe would leak OF node
references due to missing a call to stmmac_remove_config_dt. And an
error in stmmac_dvr_probe would cause the driver to attempt to remove a
netdevice that was never added. Fix these by reordering the init and
splitting out the error handling steps.
Fixes: 9f93ac8d4085 ("net-next: stmmac: Add dwmac-sun8i") Fixes: 40a1dcee2d18 ("net: ethernet: dwmac-sun8i: Use the correct function in exit path") Signed-off-by: Samuel Holland <[email protected]> Reviewed-by: Chen-Yu Tsai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Jakub Kicinski [Thu, 31 Dec 2020 03:40:27 +0000 (19:40 -0800)]
net: vlan: avoid leaks on register_vlan_dev() failures
VLAN checks for NETREG_UNINITIALIZED to distinguish between
registration failure and unregistration in progress.
Since commit cb626bf566eb ("net-sysfs: Fix reference count leak")
registration failure may, however, result in NETREG_UNREGISTERED
as well as NETREG_UNINITIALIZED.
This fix is similer to cebb69754f37 ("rtnetlink: Fix
memory(net_device) leak when ->newlink fails")
Fixes: cb626bf566eb ("net-sysfs: Fix reference count leak") Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Bongsu Jeon [Thu, 31 Dec 2020 02:59:26 +0000 (11:59 +0900)]
net: nfc: nci: Change the NCI close sequence
If there is a NCI command in work queue after closing the NCI device at
nci_unregister_device, The NCI command timer starts at flush_workqueue
function and then NCI command timeout handler would be called 5 second
after flushing the NCI command work queue and destroying the queue.
At that time, the timeout handler would try to use NCI command work queue
that is destroyed already. it will causes the problem. To avoid this
abnormal situation, change the sequence to prevent the NCI command timeout
handler from being called after destroying the NCI command work queue.
Zheng Yongjun [Wed, 30 Dec 2020 09:18:09 +0000 (17:18 +0800)]
net: kcm: Replace fput with sockfd_put
The function sockfd_lookup uses fget on the value that is stored in
the file field of the returned structure, so fput should ultimately be
applied to this value. This can be done directly, but it seems better
to use the specific macro sockfd_put, which does the same thing.
Perform a source code refactoring by using the following semantic patch.
// <smpl>
@@
expression s;
@@
s = sockfd_lookup(...)
...
+ sockfd_put(s);
- fput(s->file);
// </smpl>