nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs
In nfp_flower_spawn_vnic_reprs(), if initialization or an allocation
fails inside the loop, memory is leaked. Add the appropriate releases.
Fixes: b94524529741 ("nfp: flower: add per repr private data for LAG offload") Signed-off-by: Navid Emamdoost <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs
In nfp_flower_spawn_phy_reprs(), in the for loop over eth_tbl, if any of
the intermediate allocations or initializations fails, memory is leaked.
Add the required releases.
Fixes: b94524529741 ("nfp: flower: add per repr private data for LAG offload") Signed-off-by: Navid Emamdoost <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Paul Blakey [Wed, 25 Sep 2019 15:02:35 +0000 (18:02 +0300)]
net/sched: Set default of CONFIG_NET_TC_SKB_EXT to N
This is a new feature, so it is preferred that it defaults to N.
We will probe the feature support from userspace before actually using it.
Fixes: 95a7233c452a ('net: openvswitch: Set OvS recirc_id from tc chain index') Signed-off-by: Paul Blakey <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David Ahern [Wed, 25 Sep 2019 14:53:19 +0000 (07:53 -0700)]
vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled
A user reported that vrf create fails when IPv6 is disabled at boot using
'ipv6.disable=1':
https://bugzilla.kernel.org/show_bug.cgi?id=204903
The failure occurs when adding fib rules at create time. Add
RTNL_FAMILY_IP6MR to the check in vrf_fib_rule() that applies when
ipv6_mod_enabled() reports IPv6 as disabled.
Fixes: e4a38c0c4b27 ("ipv6: add vrf table handling code for ipv6 mcast") Signed-off-by: David Ahern <[email protected]> Cc: Patrick Ruddy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Fri, 27 Sep 2019 10:13:55 +0000 (12:13 +0200)]
Merge branch 'qdisc-destroy'
Vlad Buslov says:
====================
Fix Qdisc destroy issues caused by adding fine-grained locking to filter API
TC filter API unlocking introduced several new fine-grained locks. The
change caused sleeping-while-atomic BUGs in several Qdiscs that call cls
APIs which need to obtain new mutex while holding sch tree spinlock. This
series fixes affected Qdiscs by ensuring that cls API that became sleeping
is only called outside of sch tree lock critical section.
====================
net: sched: sch_sfb: don't call qdisc_put() while holding tree lock
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results in a sleeping-while-atomic BUG.
Steps to reproduce for sfb:
tc qdisc add dev ens1f0 handle 1: root sfb
tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
tc qdisc change dev ens1f0 root handle 1: sfb
In sfb_change() function use qdisc_purge_queue() instead of
qdisc_tree_flush_backlog() to properly reset old child Qdisc and save
pointer to it into local temporary variable. Put reference to Qdisc after
sch tree lock is released in order not to call potentially sleeping cls API
in atomic section. This is safe to do because Qdisc has already been reset
by qdisc_purge_queue() inside sch tree lock critical section.
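The ordering described above can be illustrated with a minimal userspace sketch. It is not the sch_sfb code: struct child, struct sched, child_put() and sched_change() are hypothetical names, and a boolean stands in for the sch tree spinlock. It only shows the pattern of resetting the old child under the lock, saving the pointer, and dropping the reference after the lock is released.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a child Qdisc and its parent's state. */
struct child {
	int backlog;	/* queued packets */
	int refcnt;	/* references held on this child */
};

struct sched {
	bool tree_locked;	/* models the sch tree spinlock */
	struct child *child;
	int atomic_sleep_bugs;	/* counts puts issued under the lock */
};

/* Models a release that may sleep, so it must not run under the lock. */
static void child_put(struct sched *s, struct child *c)
{
	if (s->tree_locked)
		s->atomic_sleep_bugs++;
	if (c)
		c->refcnt--;
}

/* Replace the child: reset it under the lock, drop the reference after. */
static void sched_change(struct sched *s, struct child *new_child)
{
	struct child *old;

	s->tree_locked = true;
	old = s->child;			/* save pointer for the deferred put */
	if (old)
		old->backlog = 0;	/* purge while still locked */
	s->child = new_child;
	s->tree_locked = false;

	child_put(s, old);		/* lock already released */
}
```

Because the old child is already empty when the lock is dropped, deferring the put does not change what other users of the tree can observe.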
net: sched: multiq: don't call qdisc_put() while holding tree lock
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results in a sleeping-while-atomic BUG.
Rearrange locking in multiq_tune() in the following ways:
- In the loop that removes Qdiscs from disabled queues, call
qdisc_purge_queue() instead of qdisc_tree_flush_backlog() on the Qdisc that
is being destroyed. Save the Qdisc in a temporarily allocated array and call
qdisc_put() on each element of the array after sch tree lock is released.
This is safe to do because Qdiscs have already been reset by
qdisc_purge_queue() inside sch tree lock critical section.
- Make the same change for the second loop that initializes Qdiscs for newly
enabled queues in multiq_tune() function. Since sch tree lock is obtained
and released on each iteration of this loop, just call qdisc_put()
directly outside of critical section. Don't verify that old Qdisc is not
noop_qdisc before releasing reference to it because such a check is already
performed by qdisc_put*() functions.
Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex") Signed-off-by: Vlad Buslov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
net: sched: sch_htb: don't call qdisc_put() while holding tree lock
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results in a sleeping-while-atomic BUG.
Steps to reproduce for htb:
tc qdisc add dev ens1f0 root handle 1: htb default 12
tc class add dev ens1f0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
tc qdisc add dev ens1f0 parent 1:1 handle 40: sfq perturb 10
tc class add dev ens1f0 parent 1:1 classid 1:2 htb rate 100kbps ceil 100kbps
In htb_change_class() function save parent->leaf.q to local temporary
variable and put reference to it after sch tree lock is released in order
not to call potentially sleeping cls API in atomic section. This is safe to
do because Qdisc has already been reset by qdisc_purge_queue() inside sch
tree lock critical section.
Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex") Signed-off-by: Vlad Buslov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Fri, 27 Sep 2019 10:05:02 +0000 (12:05 +0200)]
Merge branch 'SO_PRIORITY'
Eric Dumazet says:
====================
tcp: provide correct skb->priority
SO_PRIORITY socket option requests TCP egress packets
to contain a user provided value.
TCP manages to send most packets with the requested values,
notably for TCP_ESTABLISHED state, but fails to do so for
a few packets.
These packets are control packets sent on behalf
of SYN_RECV or TIME_WAIT states.
Note that testing this with packetdrill is a bit
of a hassle, since packetdrill cannot verify the priority
of egress packets other than through indirect observations,
for example using sch_prio on its tunnel device.
The bad skb priorities cause problems for GCP,
as this field is one of the keys used in routing.
====================
Eric Dumazet [Tue, 24 Sep 2019 15:01:16 +0000 (08:01 -0700)]
tcp: honor SO_PRIORITY in TIME_WAIT state
ctl packets sent on behalf of TIME_WAIT sockets currently
have a zero skb->priority, which can cause various problems.
In this patch we:
- add a tw_priority field in struct inet_timewait_sock.
- populate it from sk->sk_priority when a TIME_WAIT is created.
- For IPv4, change ip_send_unicast_reply() and its two
callers to propagate tw_priority correctly.
ip_send_unicast_reply() no longer changes sk->sk_priority.
- For IPv6, make sure TIME_WAIT sockets pass their tw_priority
field to tcp_v6_send_response() and tcp_v6_send_ack().
Eric Dumazet [Tue, 24 Sep 2019 15:01:14 +0000 (08:01 -0700)]
ipv6: add priority parameter to ip6_xmit()
Currently, ip6_xmit() sets skb->priority based on sk->sk_priority.
This is not desirable for TCP since TCP shares the same ctl socket
for a given netns. We want to be able to send RST or ACK packets
with a nonzero skb->priority.
Allan Zhang [Wed, 25 Sep 2019 23:43:12 +0000 (16:43 -0700)]
bpf: Fix bpf_event_output re-entry issue
A BPF_PROG_TYPE_SOCK_OPS program can re-enter bpf_event_output because it
can be called from both atomic and non-atomic contexts, and there is no
bpf_prog_active check to prevent that from happening.
This patch enables 3 levels of nesting to support normal, irq and nmi
context.
We can easily reproduce the issue by running netperf crr mode with 100
flows and 10 threads from netperf client side.
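The idea of allowing a bounded number of nested entries can be sketched in userspace C as follows. This is an illustration under assumptions, not the kernel change itself: struct scratch, scratch_get() and scratch_put() are hypothetical names, and plain static variables stand in for what would be per-CPU data in the kernel.

```c
#include <stddef.h>

#define MAX_NEST_LEVEL 3	/* normal, irq and nmi context */

/* Hypothetical per-invocation scratch area. */
struct scratch {
	char data[64];
};

/* Per-CPU in a real kernel; plain statics keep the sketch simple. */
static struct scratch scratch_bufs[MAX_NEST_LEVEL];
static int nest_level;

/* Hand out one buffer per nesting level, or NULL when too deep. */
static struct scratch *scratch_get(void)
{
	int level = ++nest_level;

	if (level > MAX_NEST_LEVEL) {
		nest_level--;
		return NULL;
	}
	return &scratch_bufs[level - 1];
}

/* Must be called once for every successful scratch_get(). */
static void scratch_put(void)
{
	nest_level--;
}
```

Each nesting level gets its own buffer, so a re-entrant call cannot overwrite the data of the call it interrupted.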
Andrew Lunn [Wed, 25 Sep 2019 00:47:07 +0000 (02:47 +0200)]
net: dsa: qca8k: Fix port enable for CPU port
The CPU port does not have a PHY connected to it. So calling
phy_support_asym_pause() results in an Oops. As with other DSA
drivers, add a guard that the port is a user port.
Fixes: a2c11b034142 ("kcm: use BPF_PROG_RUN") Fixes: 6cab5e90ab2b ("bpf: run bpf programs with preemption disabled") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 11:05:54 +0000 (14:05 +0300)]
net: ethernet: stmmac: Fix signedness bug in ipq806x_gmac_of_parse()
The "gmac->phy_mode" variable is an enum and in this context GCC will
treat it as an unsigned int so the error handling will never be
triggered.
Fixes: b1c17215d718 ("stmmac: add ipq806x glue layer") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
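The class of bug described in this and the following signedness fixes can be sketched in userspace C. The names below (enum iface_mode, lookup_mode(), probe()) are hypothetical. The sketch keeps the return value in an int until it has been checked, which is one way to keep the error check meaningful; casting the enum to int before comparing is another.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical interface modes; no negative enumerators. */
enum iface_mode { MODE_MII, MODE_RGMII, MODE_SGMII };

struct priv {
	enum iface_mode mode;
};

/* Returns an enum iface_mode value, or a negative errno on failure. */
static int lookup_mode(const char *name)
{
	if (strcmp(name, "mii") == 0)
		return MODE_MII;
	if (strcmp(name, "rgmii") == 0)
		return MODE_RGMII;
	if (strcmp(name, "sgmii") == 0)
		return MODE_SGMII;
	return -EINVAL;
}

static int probe(struct priv *p, const char *name)
{
	/*
	 * Buggy form:
	 *	p->mode = lookup_mode(name);
	 *	if (p->mode < 0) ...
	 * If the compiler picks an unsigned type for the enum, the
	 * comparison is never true and the error is silently lost.
	 */
	int ret = lookup_mode(name);

	if (ret < 0)
		return ret;
	p->mode = ret;
	return 0;
}
```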
Dan Carpenter [Wed, 25 Sep 2019 11:05:24 +0000 (14:05 +0300)]
net: nixge: Fix a signedness bug in nixge_probe()
The "priv->phy_mode" is an enum and in this context GCC will treat it
as an unsigned int so it can never be less than zero.
Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 11:01:00 +0000 (14:01 +0300)]
of: mdio: Fix a signedness bug in of_phy_get_and_connect()
The "iface" variable is an enum and in this context GCC treats it as
an unsigned int so the error handling is never triggered.
Fixes: b78624125304 ("of_mdio: Abstract a general interface for phy connect") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 10:59:11 +0000 (13:59 +0300)]
net: axienet: fix a signedness bug in probe
The "lp->phy_mode" is an enum but in this context GCC treats it as an
unsigned int so the error handling is never triggered.
Fixes: ee06b1728b95 ("net: axienet: add support for standard phy-mode binding") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Radhey Shyam Pandey <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 10:58:22 +0000 (13:58 +0300)]
net: stmmac: dwmac-meson8b: Fix signedness bug in probe
The "dwmac->phy_mode" is an enum and in this context GCC treats it as
an unsigned int so the error handling is never triggered.
Fixes: 566e82516253 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Martin Blumenstingl <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 10:57:14 +0000 (13:57 +0300)]
enetc: Fix a signedness bug in enetc_of_get_phy()
The "priv->if_mode" is type phy_interface_t which is an enum. In this
context GCC will treat the enum as an unsigned int so this error
handling is never triggered.
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 10:56:38 +0000 (13:56 +0300)]
net: netsec: Fix signedness bug in netsec_probe()
The "priv->phy_interface" variable is an enum and in this context GCC
will treat it as an unsigned int so the error handling is never
triggered.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 10:55:32 +0000 (13:55 +0300)]
net: hisilicon: Fix signedness bug in hix5hd2_dev_probe()
The "priv->phy_mode" variable is an enum and in this context GCC will
treat it as unsigned so the error handling will never trigger.
Fixes: 57c5bc9ad7d7 ("net: hisilicon: add hix5hd2 mac driver") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 25 Sep 2019 10:54:30 +0000 (13:54 +0300)]
net: aquantia: Fix aq_vec_isr_legacy() return value
The irqreturn_t type is an enum or an unsigned int in GCC. That
creates two problems: the function can't detect if the
self->aq_hw_ops->hw_irq_read() call fails, and at the end the function
always returns IRQ_HANDLED.
drivers/net/ethernet/aquantia/atlantic/aq_vec.c:316 aq_vec_isr_legacy() warn: unsigned 'err' is never less than zero.
drivers/net/ethernet/aquantia/atlantic/aq_vec.c:329 aq_vec_isr_legacy() warn: always true condition '(err >= 0) => (0-u32max >= 0)'
Fixes: 970a2e9864b0 ("net: ethernet: aquantia: Vector operations") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
According to Tal Gilboa the only benefit from DIM comes from a driver
that uses it. So it doesn't make sense to make this symbol user visible,
instead all drivers that use it should select it (as is already the case
AFAICT).
libbpf: Teach btf_dumper to emit stand-alone anonymous enum definitions
The BTF-to-C converter previously skipped anonymous enums on the assumption
that those are embedded in struct's field definitions. This is not
always the case and a lot of kernel constants are defined as part of
anonymous enums. This change fixes the logic by eagerly marking all
types as either referenced by any other type or not. This is enough to
distinguish two classes of anonymous enums and emit previously omitted
enum definitions.
ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule
Commit 7d9e5f422150 removed references from certain dsts, but accounting
for this never translated down into the fib6 suppression code. This bug
was triggered by WireGuard users who use wg-quick(8), which uses the
"suppress-prefix" directive to ip-rule(8) for routing all of their
internet traffic without routing loops. The test case added here
causes the reference underflow by causing packets to evaluate a suppress
rule.
Fixes: 7d9e5f422150 ("ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF") Signed-off-by: Jason A. Donenfeld <[email protected]> Acked-by: Wei Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Li RongQing [Tue, 24 Sep 2019 11:11:52 +0000 (19:11 +0800)]
openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC
The userspace openvswitch patch "dpif-linux: Implement the API
functions to allow multiple handler threads read upcall"
changed the attribute's type from U32 to UNSPEC but left the kernel
unchanged.
After kernel commit 6e237d099fac ("netlink: Relax attr validation
for fixed length types"), this bug is exposed by the warning
below:
[ 57.215841] netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
Fixes: 5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's") Signed-off-by: Li RongQing <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
The size of individual pages in the page pool is given by an order. The
order is the binary logarithm of the number of pages that make up one of
the pages in the pool. However, the driver currently passes the number
of pages rather than the order, so it ends up wasting quite a bit of
memory.
Fix this by taking the binary logarithm and passing that in the order
field.
Fixes: 2af6106ae949 ("net: stmmac: Introducing support for Page Pool") Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: David S. Miller <[email protected]>
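The relation between a page count and an order can be checked with a small userspace sketch. order_from_pages() and pages_from_order() are hypothetical helpers, with the first standing in for a floor log2 such as the kernel's ilog2().

```c
/* Floor of log2(n) for n >= 1; returns 0 for n == 0 as well. */
static unsigned int order_from_pages(unsigned int num_pages)
{
	unsigned int order = 0;

	while (num_pages > 1) {
		num_pages >>= 1;
		order++;
	}
	return order;
}

/* Number of base pages covered by an allocation of the given order. */
static unsigned int pages_from_order(unsigned int order)
{
	return 1u << order;
}
```

Passing a page count of 4 directly as the order would request pages_from_order(4) = 16 base pages per pool page, while the intended order is order_from_pages(4) = 2.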
The issue was caused by the skb's truesize being changed in
tcp/skb_gro_receive() without a matching increase of its sk's
sk_wmem_alloc. Later, when the skb is freed and its truesize is
subtracted from its sk's sk_wmem_alloc in tcp_wfree(), an underflow
occurs.
macsec is calling gro_cells_receive() to receive a packet, which
actually requires skb->sk to be NULL. However when macsec dev is
over veth, it's possible the skb->sk is still set if the skb was
not unshared or expanded from the peer veth.
ip_rcv() calls skb_orphan() to drop the skb's sk for tproxy,
but that is too late for macsec's call to gro_cells_receive(). So
fix it by dropping the skb's sk earlier on the rx path of macsec.
net/sched: cbs: Fix not adding cbs instance to list
When removing a cbs instance when offloading is enabled, the crash
below can be observed.
The problem happens because, when offloading is enabled, the cbs
instance is not added to the list.
Also, the current code doesn't correctly handle the case when offload
is disabled without removing the qdisc: if the link speed changes, the
credit calculations will be wrong. A cbs instance created with
offloading enabled is not added to the notification list, so when
offloading is later disabled it is still not in the list, and link
speed changes will not affect it.
The solution for both issues is the same: add the cbs instance being
created unconditionally to the global list, even if the link state
notification isn't useful "right now".
Fixes: e0a7683d30e9 ("net/sched: cbs: fix port_rate miscalculation") Signed-off-by: Vinicius Costa Gomes <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
selftests/bpf: adjust strobemeta loop to satisfy latest clang
Some recent changes in latest Clang started causing the following
warning when unrolling strobemeta test case main loop:
progs/strobemeta.h:416:2: warning: loop not unrolled: the optimizer was
unable to perform the requested transformation; the transformation might
be disabled or specified as part of an unsupported transformation
ordering [-Wpass-failed=transform-warning]
This patch simplifies loop's exit condition to depend only on constant
max iteration number (STROBE_MAX_MAP_ENTRIES), while moving early
termination logic inside the loop body. The changes are equivalent from
program logic standpoint, but fix the warning. It also appears to
improve generated BPF code, as it fixes previously failing non-unrolled
strobemeta test cases.
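The shape of the transformation can be sketched with plain C. This is not the strobemeta code: MAX_ENTRIES, sum_before() and sum_after() are hypothetical, and the sketch only shows an exit condition reduced to a constant bound with the early termination moved into the body.

```c
#define MAX_ENTRIES 20	/* hypothetical constant iteration bound */

/* Before: exit condition mixes the constant bound with a runtime value. */
static int sum_before(const int *vals, int cnt)
{
	int i, sum = 0;

	for (i = 0; i < MAX_ENTRIES && i < cnt; i++)
		sum += vals[i];
	return sum;
}

/* After: constant trip count, early termination inside the loop body. */
static int sum_after(const int *vals, int cnt)
{
	int i, sum = 0;

	for (i = 0; i < MAX_ENTRIES; i++) {
		if (i >= cnt)
			break;
		sum += vals[i];
	}
	return sum;
}
```

Both functions visit the same elements, so the result is identical; only the form of the loop header differs.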
Some compilers emit a warning about potentially uninitialized use of
next_id. The code is correct, but the control flow is too complicated
for some compilers to figure this out. Re-initialize next_id to satisfy
the compiler.
Jonathan Lemon [Tue, 24 Sep 2019 16:25:21 +0000 (09:25 -0700)]
bpf/xskmap: Return ERR_PTR for failure case instead of NULL.
When kzalloc() failed, NULL was returned to the caller, which
tested the pointer with IS_ERR(), which didn't match, so the
pointer was used later, resulting in a NULL dereference.
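Why a NULL return slips past an IS_ERR() test can be shown with userspace re-creations of the error-pointer helpers. The macros below only mimic the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() convention for illustration, and struct node and node_alloc() are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err)
{
	return (void *)err;
}

static inline long PTR_ERR(const void *p)
{
	return (long)p;
}

/* True only for the top MAX_ERRNO addresses; NULL is not an error. */
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct node {
	int key;
};

/* Reports allocation failure as ERR_PTR(-ENOMEM), never as NULL. */
static struct node *node_alloc(int key, int simulate_oom)
{
	struct node *n = simulate_oom ? NULL : calloc(1, sizeof(*n));

	if (!n)
		return ERR_PTR(-ENOMEM);
	n->key = key;
	return n;
}
```

A caller that checks only IS_ERR() is safe with node_alloc() as written, but would dereference NULL if the function returned NULL on failure.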
selftests/bpf: test_progs: fix client/server race in tcp_rtt
This is the same problem I found earlier in test_sockopt_inherit:
there is a race between server thread doing accept() and client
thread doing connect(). Let's explicitly synchronize them via a
pthread condition variable.
v2:
* don't exit from server_thread without signaling condvar,
fixes possible issue where main() would wait forever (Andrii Nakryiko)
Fixes: b55873984dab ("selftests/bpf: test BPF_SOCK_OPS_RTT_CB") Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Jose Abreu [Mon, 23 Sep 2019 07:49:08 +0000 (09:49 +0200)]
net: stmmac: selftests: Flow Control test can also run with ASYM Pause
The Flow Control selftest is also available with ASYM Pause. Let's add
this check to the test and fix possible false-positive failures.
Fixes: 091810dbded9 ("net: stmmac: Introduce selftests support") Signed-off-by: Jose Abreu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
This series includes two fixes. The first improves reset code to allow
linkwatch_event to proceed during reset. The second ensures that no more
than one thread runs in reset at a time.
v2:
- Separate change param reset from do_reset()
- Return IBMVNIC_OPEN_FAILED if __ibmvnic_open fails
- Remove setting wait_for_reset to false from __ibmvnic_reset(), this
is done in wait_for_reset()
- Move the check for force_reset_recovery from patch 1 to patch 2
v3:
- Restore reset’s successful return in open failure case
v4:
- Change resetting flag access to atomic
====================
Juliet Kim [Fri, 20 Sep 2019 20:11:23 +0000 (16:11 -0400)]
net/ibmvnic: prevent more than one thread from running in reset
The current code allows more than one thread to run in reset. This can
corrupt struct adapter data. Check adapter->resetting before performing
a reset; if another reset is already running, delay (100 msec) before
trying again.
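A single-owner guard of this kind can be sketched with C11 atomics. This is a userspace illustration, not the ibmvnic code: reset_try_begin() and reset_end() are hypothetical names, and the retry delay is left to the caller.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Set while one thread is running the reset. */
static atomic_flag resetting = ATOMIC_FLAG_INIT;

/* Returns true if the caller now owns the reset, false if one is running. */
static bool reset_try_begin(void)
{
	return !atomic_flag_test_and_set(&resetting);
}

/* Called by the owner once the reset has finished. */
static void reset_end(void)
{
	atomic_flag_clear(&resetting);
}
```

A caller that gets false from reset_try_begin() would wait and try again later instead of entering the reset path.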
Juliet Kim [Fri, 20 Sep 2019 20:11:22 +0000 (16:11 -0400)]
net/ibmvnic: unlock rtnl_lock in reset so linkwatch_event can run
Commit a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset")
made the change to hold the RTNL lock during a reset to avoid deadlock
but linkwatch_event is fired during the reset and needs the RTNL lock.
That keeps the linkwatch_event process from proceeding until the reset
is complete. The reset process cannot tolerate the linkwatch_event
processing after reset completes, so release the RTNL lock during the
process to allow a chance for linkwatch_event to run during reset.
This does not guarantee that the linkwatch_event will be processed as
soon as link state changes, but is an improvement over the current code
where linkwatch_event processing is always delayed, which prevents
transmissions on the device from being deactivated, leading the transmit
watchdog timer to time out.
Release the RTNL lock before link state change and re-acquire after
the link state change to allow linkwatch_event to grab the RTNL lock
and run during the reset.
Fixes: a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset") Signed-off-by: Juliet Kim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
arcnet: provide a buffer big enough to actually receive packets
struct archdr is only big enough to hold the header of various types of
arcnet packets. So to provide enough space to hold the data read from
hardware provide a buffer large enough to hold a packet with maximal
size.
The problem was noticed by the stack protector which makes the kernel
oops.
iwlwifi: fw: don't send GEO_TX_POWER_LIMIT command to FW version 36
The intention was to have the GEO_TX_POWER_LIMIT command in FW version
36 as well, but not all devices in the 8000 family got this feature
enabled. The 8000 family is the only one using version 36, so skip this
version entirely. If we try to send this command to firmwares that do not
support it, we get a BAD_COMMAND response from the firmware.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=204151.
The mt7615 patch/n9/cr4 firmwares are available in the mediatek folder of
the linux-firmware repository, but the driver's path definitions do not
point there. Because of this, mt7615 won't work on regular distributions
like Ubuntu. Fix the path definitions. Moreover, remove the useless
firmware name pointers and use the definitions directly.
David S. Miller [Tue, 24 Sep 2019 14:37:18 +0000 (16:37 +0200)]
Merge branch 'check-CAP_NEW_RAW'
Greg Kroah-Hartman says:
====================
Raw socket cleanups
Ori Nimron pointed out that there are a number of places in the kernel
where you can create a raw socket, without having to have the
CAP_NET_RAW permission.
To resolve this, here's a short patch series to test these odd and old
protocols for this permission before allowing the creation to succeed.
All patches are currently against the net tree.
====================
Dmytro Linkin [Fri, 13 Sep 2019 10:42:21 +0000 (10:42 +0000)]
net/mlx5e: Fix matching on tunnel addresses type
In mlx5 parse_tunnel_attr() function dispatch on encap IP address type
is performed by directly checking flow_rule_match_key() on
FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS, and then on
FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS. However, since those are stored in
union, first check is always true if any type of encap address is set,
which leads to IPv6 tunnel encap address being parsed as IPv4 by mlx5.
Determine the correct IP address type by checking the control key first
and, if it is set, take the address type from match.key->addr_type.
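By analogy, the hazard of probing union members instead of consulting a recorded type can be sketched in userspace C. struct encap_key and both helpers are hypothetical and far simpler than the flow dissector structures; addr_type plays the role of the control key.

```c
#include <stdbool.h>
#include <stdint.h>

enum addr_type { ADDR_NONE, ADDR_IPV4, ADDR_IPV6 };

/* Hypothetical encap key: both address forms share the same storage. */
struct encap_key {
	enum addr_type addr_type;	/* records which member is valid */
	union {
		uint32_t v4;
		uint8_t v6[16];
	} addr;
};

/* Naive probe: inspects the IPv4 member regardless of what was stored. */
static bool looks_like_ipv4(const struct encap_key *k)
{
	return k->addr.v4 != 0;
}

/* Dispatch on the recorded type instead of probing union members. */
static enum addr_type encap_addr_type(const struct encap_key *k)
{
	return k->addr_type;
}
```

With an IPv6 address stored, looks_like_ipv4() still reports true because the first four bytes of the IPv6 address overlap the IPv4 member, while encap_addr_type() returns ADDR_IPV6.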
net/mlx5e: Fix traffic duplication in ethtool steering
Before this patch, when adding multiple ethtool steering rules with
identical classification, the driver used to append the new destination
to the already existing hw rule, which caused the hw to forward the
traffic to all destinations (rx queues).
Here we avoid this by setting the "no append" mlx5 fs core flag when
adding a new ethtool rule.
Bodong Wang [Mon, 26 Aug 2019 21:34:12 +0000 (16:34 -0500)]
net/mlx5: Add device ID of upcoming BlueField-2
Add the device ID of upcoming BlueField-2 integrated ConnectX-6 Dx
network controller. Its VFs will be using the generic VF device ID:
0x101e "ConnectX Family mlx5Gen Virtual Function".
Fixes: 2e9d3e83ab82 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Bodong Wang <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Alex Vesker [Thu, 19 Sep 2019 08:24:19 +0000 (11:24 +0300)]
net/mlx5: DR, Fix getting incorrect prev node in ste_free
When we free an STE and the STE is in the middle of collision
list, the prev_ste was obtained incorrectly from the list.
To avoid such issues, list_entry calls are replaced with the standard list API.
After firmware has been downloaded, the driver should send
some information to it through H2C commands. Those H2C
commands are transmitted through the TX path.
But before HCI has been started, the TX path is not
fully working. For example, on PCI interfaces the interrupts
are not enabled, hence TX interrupts will not be issued
after an H2C skb has been DMAed to the device, and the H2C
skbs will not be released until the device is powered off.
net: dsa: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style
in header file for Distributed Switch Architecture drivers.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (as opposed to C source files where
C++ style should be used).
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
net: dsa: b53: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style
in header file for Broadcom BCM53xx managed switch driver.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (as opposed to C source files where
C++ style should be used).
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Mao Wenan [Sun, 22 Sep 2019 05:38:08 +0000 (13:38 +0800)]
net: ena: Select DIMLIB for ENA_ETHERNET
If CONFIG_ENA_ETHERNET=y and CONFIG_DIMLIB=n,
the errors below can be found:
drivers/net/ethernet/amazon/ena/ena_netdev.o: In function `ena_dim_work':
ena_netdev.c:(.text+0x21cc): undefined reference to `net_dim_get_rx_moderation'
ena_netdev.c:(.text+0x21cc): relocation truncated to
fit: R_AARCH64_CALL26 against undefined symbol `net_dim_get_rx_moderation'
drivers/net/ethernet/amazon/ena/ena_netdev.o: In function `ena_io_poll':
ena_netdev.c:(.text+0x7bd4): undefined reference to `net_dim'
ena_netdev.c:(.text+0x7bd4): relocation truncated to fit:
R_AARCH64_CALL26 against undefined symbol `net_dim'
Commit 282faf61a053 ("net: ena: switch to dim algorithm for rx adaptive
interrupt moderation") introduced the dim algorithm, which is configured by
CONFIG_DIMLIB. So select DIMLIB for ENA_ETHERNET.
Fixes: 282faf61a053 ("net: ena: switch to dim algorithm for rx adaptive interrupt moderation") Signed-off-by: Mao Wenan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Putting a struct stmmac_rss object on the stack is a bad idea,
as it exceeds the warning limit for a stack frame on 32-bit architectures:
drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c:1221:12: error: stack frame size of 1208 bytes in function '__stmmac_test_l3filt' [-Werror,-Wframe-larger-than=]
drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c:1338:12: error: stack frame size of 1208 bytes in function '__stmmac_test_l4filt' [-Werror,-Wframe-larger-than=]
As the object is the trivial empty case, change the called function
to accept a NULL pointer to mean the same thing and remove the
large variable in the two callers.
Fixes: 4647e021193d ("net: stmmac: selftests: Add selftest for L3/L4 Filters") Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Jose Abreu <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Mao Wenan [Thu, 19 Sep 2019 06:38:19 +0000 (14:38 +0800)]
net: dsa: sja1105: Add dependency for NET_DSA_SJA1105_TAS
If CONFIG_NET_DSA_SJA1105_TAS=y and CONFIG_NET_SCH_TAPRIO=n,
the error below can be found:
drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_setup_tc_taprio':
sja1105_tas.c:(.text+0x318): undefined reference to `taprio_offload_free'
sja1105_tas.c:(.text+0x590): undefined reference to `taprio_offload_get'
drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_tas_teardown':
sja1105_tas.c:(.text+0x610): undefined reference to `taprio_offload_free'
make: *** [vmlinux] Error 1
sja1105_tas needs tc-taprio, so this patch adds the dependency for it.
Fixes: 317ab5b86c8e ("net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload") Signed-off-by: Mao Wenan <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Cong Wang [Thu, 19 Sep 2019 01:44:43 +0000 (18:44 -0700)]
net_sched: add policy validation for action attributes
Similar to commit 8b4c3cdd9dd8
("net: sched: Add policy validation for tc attributes"), we need
to add proper policy validation for TC action attributes too.
Cong Wang [Wed, 18 Sep 2019 23:24:12 +0000 (16:24 -0700)]
net_sched: add max len check for TCA_KIND
The TCA_KIND attribute is of NLA_STRING which does not check
the NUL char. KMSAN reported an uninit-value of TCA_KIND which
is likely caused by the lack of NUL.
Change it to NLA_NUL_STRING and add a max len too.
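What a NUL-terminated string check with a maximum length buys can be sketched in userspace C. attr_is_nul_string() is a hypothetical helper modeled loosely on that kind of validation; it is not the netlink implementation, and max_len is assumed to exclude the terminator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Accept the payload only if a NUL terminator appears within the
 * first max_len + 1 bytes. A max_len of 0 means "no length limit".
 */
static bool attr_is_nul_string(const char *data, size_t attr_len,
			       size_t max_len)
{
	size_t scan = attr_len;

	if (max_len && scan > max_len + 1)
		scan = max_len + 1;
	return scan != 0 && memchr(data, '\0', scan) != NULL;
}
```

A payload of "htb" sent with length 3 carries no terminator and is rejected, so later string operations never read past the attribute.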
As the endpoint is unregistered there might still be work pending to
handle incoming messages, which will result in a use after free
scenario. The plan is to remove the rx_worker, but until then (and for
stable@) ensure that the work is stopped before the node is freed.
Peter Mamonov [Wed, 18 Sep 2019 16:27:55 +0000 (19:27 +0300)]
net/phy: fix DP83865 10 Mbps HDX loopback disable function
According to the DP83865 datasheet "the 10 Mbps HDX loopback can be
disabled in the expanded memory register 0x1C0.1". The driver erroneously
used bit 0 instead of bit 1.
Fixes: 4621bf129856 ("phy: Add file missed in previous commit.") Signed-off-by: Peter Mamonov <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
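The difference between bit 0 and bit 1 can be made explicit with a named mask. This is a sketch under assumptions: HDX_LOOPBACK_DISABLE and hdx_loopback_set() are hypothetical names, and it assumes that setting bit 1 of register 0x1C0 disables the loopback.

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))
/* Hypothetical name for bit 1 of expanded memory register 0x1C0. */
#define HDX_LOOPBACK_DISABLE	BIT(1)

/* Return the new register value with the disable bit set or cleared. */
static uint16_t hdx_loopback_set(uint16_t reg, int disable)
{
	if (disable)
		return (uint16_t)(reg | HDX_LOOPBACK_DISABLE);
	return (uint16_t)(reg & ~HDX_LOOPBACK_DISABLE);
}
```

Using a literal 1 instead of BIT(1) would touch bit 0, which is the mistake the message describes.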
usbnet: ignore endpoints with invalid wMaxPacketSize
Endpoints with zero wMaxPacketSize are not usable for transferring
data. Ignore such endpoints when looking for valid in, out and
status pipes, to make the drivers more robust against invalid and
meaningless descriptors.
The wMaxPacketSize of these endpoints are used for memory allocations
and as divisors in many usbnet minidrivers. Avoiding zero is therefore
critical.
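The filtering described above can be sketched in userspace C. struct endpoint_desc, first_usable_endpoint() and packets_needed() are hypothetical and much simpler than the USB descriptor structures; the sketch only shows why a zero wMaxPacketSize must be skipped before the value is used as a divisor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified endpoint descriptor. */
struct endpoint_desc {
	uint8_t address;
	uint16_t max_packet_size;	/* wMaxPacketSize */
};

/* Return the first endpoint that can carry data, or NULL if none can. */
static const struct endpoint_desc *
first_usable_endpoint(const struct endpoint_desc *eps, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (eps[i].max_packet_size == 0)
			continue;	/* unusable, and a zero divisor */
		return &eps[i];
	}
	return NULL;
}

/* Division is safe only because zero-sized endpoints were filtered out. */
static unsigned int packets_needed(const struct endpoint_desc *ep,
				   unsigned int len)
{
	return (len + ep->max_packet_size - 1) / ep->max_packet_size;
}
```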
cdc_ncm: fix divide-by-zero caused by invalid wMaxPacketSize
Endpoints with zero wMaxPacketSize are not usable for transferring
data. Ignore such endpoints when looking for valid in, out and
status pipes, to make the driver more robust against invalid and
meaningless descriptors.
The wMaxPacketSize of the out pipe is used as divisor. So this change
fixes a divide-by-zero bug.
drivers/net/wireless/zydas/zd1211rw/zd_usb.c: In function ‘check_read_regs’:
drivers/net/wireless/zydas/zd1211rw/zd_def.h:18:25: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 6 has type ‘size_t’ {aka ‘unsigned int’} [-Wformat=]
dev_printk(level, dev, "%s() " fmt, __func__, ##args)
^~~~~~~
drivers/net/wireless/zydas/zd1211rw/zd_def.h:22:4: note: in expansion of macro ‘dev_printk_f’
dev_printk_f(KERN_DEBUG, dev, fmt, ## args)
^~~~~~~~~~~~
drivers/net/wireless/zydas/zd1211rw/zd_usb.c:1635:3: note: in expansion of macro ‘dev_dbg_f’
dev_dbg_f(zd_usb_dev(usb),
^~~~~~~~~
drivers/net/wireless/zydas/zd1211rw/zd_usb.c:1636:51: note: format string is defined here
"error: actual length %d less than expected %ld\n",
~~^
%d
Fixes: 84b0b66352470e64 ("zd1211rw: zd_usb: Use struct_size() helper") Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
Interrupts are disabled to stop PCI, which means the skbs
queued for each TX ring will not be released via the DMA
interrupt. To avoid leaving those skbs in
the skb queue until PCI has been removed, the driver needs
to release the skbs by itself.
rtw88: pci: extract skbs free routine for trx rings
These skb free routines can be used when the driver wants
to stop the PCI bus, because some of the skbs remaining in the
queue may not have been returned via the DMA interrupt.
The `adi,disable-energy-detect` property was implemented in an initial
version of the `adin` driver series, but after a review it was discarded in
favor of implementing the ETHTOOL_PHY_EDPD phy-tunable option.
With the ETHTOOL_PHY_EDPD control, it's possible to disable/enable
Energy-Detect-Power-Down for the `adin` PHY, so this device-tree property
is not needed.
Fixes: 767078132ff9 ("dt-bindings: net: add bindings for ADIN PHY driver") Signed-off-by: Alexandru Ardelean <[email protected]> Reviewed-by: Rob Herring <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
David Ahern [Tue, 17 Sep 2019 17:39:49 +0000 (10:39 -0700)]
ipv4: Revert removal of rt_uses_gateway
Julian noted that rt_uses_gateway has a more subtle use than 'is gateway
set':
https://lore.kernel.org/netdev/alpine.LFD.2.21.1909151104060[email protected]/
Revert that part of the commit referenced in the Fixes tag.
Currently, there are no u8 holes in 'struct rtable'. There is a 4-byte hole
in the second cacheline which contains the gateway declaration. So move
rt_gw_family down to the gateway declarations since they are always used
together, and then re-use that u8 for rt_uses_gateway. End result is that
rtable size is unchanged.
David Ahern [Tue, 17 Sep 2019 17:30:35 +0000 (10:30 -0700)]
selftests: Update fib_nexthop_multiprefix to handle missing ping6
Some distributions (e.g., Debian buster) do not install ping6. Re-use
the hook in pmtu.sh to detect this and fall back to ping.
Fixes: 735ab2f65dce ("selftests: Add test with multiple prefixes using single nexthop") Signed-off-by: David Ahern <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>