Git Repo - linux.git/log

batman-adv: mcast: detect, distribute and maintain multicast router presence

To be able to apply our group aware multicast optimizations to packets
with a scope greater than link-local we need to not only keep track of
multicast listeners but also multicast routers.

With this patch a node detects the presence of multicast routers on
its segment by checking if
/proc/sys/net/ipv{4,6}/conf/<bat0|br0(bat)>/mc_forwarding is set for one
thing. This option is enabled by multicast routing daemons and needed
for the kernel's multicast routing tables to receive and route packets.

For another thing if a bridge is configured on top of bat0 then the
presence of an IPv6 multicast router behind this bridge is currently
detected by checking for an IPv6 multicast "All Routers Address"
(ff02::2). This should later be replaced by querying the bridge, which
performs proper, RFC4286 compliant Multicast Router Discovery (our
simplified approach includes more hosts than necessary, most notably
not just multicast routers but also unicast ones and is not applicable
for IPv4).

If no multicast router is detected then this is signalized via the new
BATADV_MCAST_WANT_NO_RTR4 and BATADV_MCAST_WANT_NO_RTR6
multicast tvlv flags.

Signed-off-by: Linus Lüssing <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.

Because we don't care if debugfs works or not, this trickles back a bit
so we can clean things up by making some functions return void instead
of an error value that is never going to fail.

Cc: Marek Lindner <[email protected]>
Cc: Simon Wunderlich <[email protected]>
Cc: Antonio Quartulli <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[[email protected]: drop unused variables]
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: mcast: avoid redundant multicast TT entries with bridges

When a bridge is added on top of bat0 we set the WANT_ALL_UNSNOOPABLES
flag. Which means we sign up for all traffic for ff02::1/128 and
224.0.0.0/24.

When the node itself had IPv6 enabled or joined a group in 224.0.0.0/24
itself then so far this would result in a multicast TT entry which is
redundant to the WANT_ALL_UNSNOOPABLES.

With this patch such redundant TT entries are avoided.

Signed-off-by: Linus Lüssing <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: mcast: collect softif listeners from IP lists instead

Instead of collecting multicast MAC addresses from the netdev hw mc
list collect a node's multicast listeners from the IP lists and convert
those to MAC addresses.

This allows to exclude addresses of specific scope later. On a
multicast MAC address the IP destination scope is not visible anymore.

Signed-off-by: Linus Lüssing <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2019-06-27

please apply another round of qeth updates for net-next.
This completes the conversion of the control path to use dynamically
allocated cmd buffers, along with some fine-tuning for the route
validation fix that recently went into -net.
====================

Signed-off-by: David S. Miller <[email protected]>

s390/qeth: move cast type selection into fill_header()

The cast type currently gets selected in .ndo_start_xmit, and is then
piped through several layers until it's stored into the HW header.
Push the selection down into qeth_l?_fill_header() to (1) reduce the
number of xmit-wide parameters, and (2) merge the two route validation
checks into just one.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: extract helper for route validation

As follow-up to commit 0cd6783d3c7d ("s390/qeth: check dst entry before use"),
consolidate the dst_check() logic into a single helper and add a wrapper
around the cast type selection.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: consolidate skb RX processing in L3 driver

Use napi_gro_receive() to pass up all types of packets that a L3 device
may receive.
1) For proper L2 packets received by the IQD sniffer, this is the
   obvious thing to do.
2) For af_iucv (which doesn't provide a GRO assist), the GRO code will
   transparently fall back to netif_receive_skb(). So there's no need to
   special-case this traffic in our code.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: consolidate pm code

De-duplicate the pm callback implementations from the two sub-drivers,
replacing them with core helpers that delegate to the .set_online and
.set_offline callbacks.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: streamline SNMP cmd code

Apply some cleanups to qeth_snmp_command() and its callback:
1. when accessing the user data, use the proper struct instead of
   hard-coded offsets. Also copy the request data straight into the
   allocated cmd, skipping the extra memdup_user() to a tmp buffer.
2. capping the request length is no longer needed, the same check gets
   applied at a base level in qeth_alloc_cmd().
3. clean up some duplicated (and misindented) trace statements.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: remove static cmd buffer infrastructure

Now that all cmds are dynamically allocated, the code for static cmd
buffers can go away entirely. Resulting in a nice reduction of
code/data size & complexity, while removing the risk that
qeth_clear_cmd_buffers() releases cmds that are still in-flight.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: dynamically allocate MPC cmds

The base MPC cmds are the last remaining user of the static cmd buffers.
Port them over to use dynamic allocation, and stop backing the write
channel's cmd buffers with pages.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: dynamically allocate vnicc cmds

The VNICC code is somewhat quirky in that it defers the whole cmd setup
to a common helper qeth_l2_vnicc_request(). Some of the cmd specifics
are then passed in via parameter, while others are simply hard-coded.

Split the whole machinery up into the usual format: one helper that
allocates the cmd & fills in the common fields, while all the cmd
originators take care of their sub-cmd type specific work.
This makes it much easier to calculate the cmd's precise length, and
reduces code complexity.

Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Alexandra Winter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: dynamically allocate diag cmds

Add a new wrapper that allocates DIAG cmds of the right size, and fills
in the common fields.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: dynamically allocate various cmds with sub-types

This patch converts the adapter, assist and bridgeport cmd paths to
dynamic allocation. Most of the work is about re-organizing the cmd
headers, calculating the correct cmd length, and filling in the right
value in the sub-cmd's length field.

Since we now also set the correct length for cmds that are not reflected
by a fixed struct (ie SNMP), we can remove the work-around from
qeth_snmp_command().

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: clarify parameter for simple assist cmds

For code that uses qeth_send_simple_setassparms_prot(), we currently
can't differentiate whether the cmd should contain (1) no parameter, or
(2) a 4-byte parameter with value 0.
At the moment this doesn't cause any trouble. But when using dynamically
allocated cmds, we need to know whether to allocate & transmit an
additional 4 bytes of zeroes.
So instead of the raw parameter value, pass a parameter pointer
(or NULL) to qeth_send_simple_setassparms_prot().

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: dynamically allocate simple IPA cmds

This patch reduces the usage of the write channel's static cmd buffers,
by dynamically allocating all simple IPA cmds (eg. STARTLAN, SETVMAC).
It also converts the OSN path.

Doing so requires some changes to how we calculate the cmd length.
Currently when building IPA cmds, we're quite generous in how much data
we send down to the device (basically the size of the biggest cmd we
know). This is no real concern at the moment, since the static cmd
buffers are backed with zeroed pages. But for dynamic allocations, the
exact length matters. So this patch also adds the needed length
calculations to each cmd path.

Commands that have multiple subtypes (eg. SETADP) of differing length
will be converted with follow-up patches.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: common release code for non-accepted sockets

There are common steps when releasing an accepted or unaccepted socket.
Move this code into a common routine.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'net-ipv4-fix-circular-list-infinite-loop'

Florian Westphal says:

====================
net: ipv4: fix circular-list infinite loop

Tariq and Ran reported a regression caused by net-next commit
2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list").

This happens when net.ipv4.conf.$dev.promote_secondaries sysctl is
enabled -- we can arrange for ifa->next to point at ifa, so next
process that tries to walk the list loops forever.

Fix this and extend rtnetlink.sh with a small test case for this.
====================

Signed-off-by: David S. Miller <[email protected]>

selftests: rtnetlink: add small test case with 'promote_secondaries' enabled

This exercises the 'promote_secondaries' code path.

Without previous fix, this triggers infinite loop/soft lockup:
ifconfig process spinning at 100%, never to return.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ipv4: fix infinite loop on secondary addr promotion

secondary address promotion causes infinite loop -- it arranges
for ifa->ifa_next to point back to itself.

Problem is that 'prev_prom' and 'last_prim' might point at the same entry,
so 'last_sec' pointer must be obtained after prev_prom->next update.

Fixes: 2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list")
Reported-by: Ran Rozenstein <[email protected]>
Reported-by: Tariq Toukan <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethtool: Allow parsing ETHER_FLOW types when using flow_rule

When parsing an ethtool_rx_flow_spec, users can specify an ethernet flow
which could contain matches based on the ethernet header, such as the
MAC address, the VLAN tag or the ethertype.

ETHER_FLOW uses the src and dst ethernet addresses, along with the
ethertype as keys. Matches based on the vlan tag are also possible, but
they are specified using the special FLOW_EXT flag.

Signed-off-by: Maxime Chevallier <[email protected]>
Acked-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

proc: remove useless d_is_dir() check

Remove the d_is_dir() check from tgid_pidfd_to_pid().

It is pointless since you should never get &proc_tgid_base_operations
for f_op on a non-directory.

Suggested-by: Al Viro <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

copy_process(): don't use ksys_close() on cleanups

anon_inode_getfd() should be used *ONLY* in situations when we are
guaranteed to be past the last failure point (including copying the
descriptor number to userland, at that). And ksys_close() should
not be used for cleanups at all.

anon_inode_getfile() is there for all nontrivial cases like that.
Just use that...

Fixes: b3e583825266 ("clone: add CLONE_PIDFD")
Signed-off-by: Al Viro <[email protected]>
Reviewed-by: Jann Horn <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET

When an application is run that:
a) Sets its scheduler to be SCHED_FIFO
and
b) Opens a memory mapped AF_PACKET socket, and sends frames with the
MSG_DONTWAIT flag cleared, its possible for the application to hang
forever in the kernel.  This occurs because when waiting, the code in
tpacket_snd calls schedule, which under normal circumstances allows
other tasks to run, including ksoftirqd, which in some cases is
responsible for freeing the transmitted skb (which in AF_PACKET calls a
destructor that flips the status bit of the transmitted frame back to
available, allowing the transmitting task to complete).

However, when the calling application is SCHED_FIFO, its priority is
such that the schedule call immediately places the task back on the cpu,
preventing ksoftirqd from freeing the skb, which in turn prevents the
transmitting task from detecting that the transmission is complete.

We can fix this by converting the schedule call to a completion
mechanism.  By using a completion queue, we force the calling task, when
it detects there are no more frames to send, to schedule itself off the
cpu until such time as the last transmitted skb is freed, allowing
forward progress to be made.

Tested by myself and the reporter, with good results

Change Notes:

V1->V2:
Enhance the sleep logic to support being interruptible and
allowing for honoring to SK_SNDTIMEO (Willem de Bruijn)

V2->V3:
Rearrage the point at which we wait for the completion queue, to
avoid needing to check for ph/skb being null at the end of the loop.
Also move the complete call to the skb destructor to avoid needing to
modify __packet_set_status.  Also gate calling complete on
packet_read_pending returning zero to avoid multiple calls to complete.
(Willem de Bruijn)

Move timeo computation within loop, to re-fetch the socket
timeout since we also use the timeo variable to record the return code
from the wait_for_complete call (Neil Horman)

V3->V4:
Willem has requested that the control flow be restored to the
previous state.  Doing so lets us eliminate the need for the
po->wait_on_complete flag variable, and lets us get rid of the
packet_next_frame function, but introduces another complexity.
Specifically, but using the packet pending count, we can, if an
applications calls sendmsg multiple times with MSG_DONTWAIT set, each
set of transmitted frames, when complete, will cause
tpacket_destruct_skb to issue a complete call, for which there will
never be a wait_on_completion call.  This imbalance will lead to any
future call to wait_for_completion here to return early, when the frames
they sent may not have completed.  To correct this, we need to re-init
the completion queue on every call to tpacket_snd before we enter the
loop so as to ensure we wait properly for the frames we send in this
iteration.

Change the timeout and interrupted gotos to out_put rather than
out_status so that we don't try to free a non-existant skb
Clean up some extra newlines (Willem de Bruijn)

Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Neil Horman <[email protected]>
Reported-by: Matteo Croce <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sctp: change to hold sk after auth shkey is created successfully

Now in sctp_endpoint_init(), it holds the sk then creates auth
shkey. But when the creation fails, it doesn't release the sk,
which causes a sk defcnf leak,

Here to fix it by only holding the sk when auth shkey is created
successfully.

Fixes: a29a5bd4f5c3 ("[SCTP]: Implement SCTP-AUTH initializations.")
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Xin Long <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'macb-build-fixes'

Palmer Dabbelt says:

====================
net: macb: Fix compilation on systems without COMMON_CLK, v2

Our patch to add support for the FU540-C000 broke compilation on at
least powerpc allyesconfig, which was found as part of the linux-next
build regression tests.  This must have somehow slipped through the
cracks, as the patch has been reverted in linux-next for a while now.
This patch applies on top of the offending commit, which is the only one
I've even tried it on as I'm not sure how this subsystem makes it to
Linus.

This patch set fixes the issue by adding a dependency of COMMON_CLK to
the MACB Kconfig entry, which avoids the build failure by disabling MACB
on systems where it wouldn't compile.  All known users of MACB have
COMMON_CLK, so this shouldn't cause any issues.  This is a significantly
simpler approach than disabling just the FU540-C000 support.

I've also included a second patch to indicate this is a driver for a
Cadence device that was originally written by an engineer at Atmel.  The
only relation is that I stumbled across it when writing the first patch.

Changes since v1 <20190624061603 [email protected]>:

* Disable MACB on systems without COMMON_CLK, instead of just disabling
  the FU540-C000 support on these systems.
* Update the commit message to reflect the driver was written by Atmel.
====================

Signed-off-by: David S. Miller <[email protected]>

net: macb: Kconfig: Rename Atmel to Cadence

The help text makes it look like NET_VENDOR_CADENCE enables support for
Atmel devices, when in reality it's a driver written by Atmel that
supports Cadence devices. This may confuse users that have this device
on a non-Atmel SoC.

The fix is just s/Atmel/Cadence/, but I did go and re-wrap the Kconfig
help text as that change caused it to go over 80 characters.

Signed-off-by: Palmer Dabbelt <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: macb: Kconfig: Make MACB depend on COMMON_CLK

commit c218ad559020 ("macb: Add support for SiFive FU540-C000") added a
dependency on the common clock framework to the macb driver, but didn't
express that dependency in Kconfig. As a result macb now fails to
compile on systems without COMMON_CLK, which specifically causes a build
failure on powerpc allyesconfig.

This patch adds the dependency, which results in the macb driver no
longer being selectable on systems without the common clock framework.
All known systems that have this device already support the common clock
framework, so this should not cause trouble for any uses. Supporting
both the FU540-C000 and systems without COMMON_CLK is quite ugly.

I've build tested this on powerpc allyesconfig and RISC-V defconfig
(which selects MACB), but I have not even booted the resulting kernels.

Fixes: c218ad559020 ("macb: Add support for SiFive FU540-C000")
Signed-off-by: Palmer Dabbelt <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ipv6-fix-neighbour-resolution-with-raw-socket'

Nicolas Dichtel says:

====================
ipv6: fix neighbour resolution with raw socket

The first patch prepares the fix, it constify rt6_nexthop().
The detail of the bug is explained in the second patch.

v1 -> v2:
- fix compilation warnings
- split the initial patch
====================

Signed-off-by: David S. Miller <[email protected]>

ipv6: fix neighbour resolution with raw socket

The scenario is the following: the user uses a raw socket to send an ipv6
packet, destinated to a not-connected network, and specify a connected nh.
Here is the corresponding python script to reproduce this scenario:

import socket
IPPROTO_RAW = 255
send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
# scapy
# p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
# str(p)
req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
send_s.sendto(req, ('fd00:175::2', 0, 0, 0))

fd00:175::/64 is a connected route and fd00:200::fa is not a connected
host.

With this scenario, the kernel starts by sending a NS to resolve
fd00:175::2. When it receives the NA, it flushes its queue and try to send
the initial packet. But instead of sending it, it sends another NS to
resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
the user sends again the packet, it now uses the right nh (fd00:175::2).

The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
:: because the associated route is a connected route, thus it uses the dst
addr of the packet. Let's use rt6_nexthop() to choose the right nh.

Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: constify rt6_nexthop()

There is no functional change in this patch, it only prepares the next one.

rt6_nexthop() will be used by ip6_dst_lookup_neigh(), which uses const
variables.

Signed-off-by: Nicolas Dichtel <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Acked-by: Nick Desaulniers <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: microchip: Use gpiod_set_value_cansleep()

Replace gpiod_set_value() with gpiod_set_value_cansleep(), as the switch
reset GPIO can be connected to e.g. I2C GPIO expander and it is perfectly
fine for the kernel to sleep for a bit in ksz_switch_register().

Signed-off-by: Marek Vasut <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Tristram Ha <[email protected]>
Cc: Woojung Huh <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Allow 0.0.0.0/8 as a valid address range

The longstanding prohibition against using 0.0.0.0/8 dates back
to two issues with the early internet.

There was an interoperability problem with BSD 4.2 in 1984, fixed in
BSD 4.3 in 1986. BSD 4.2 has long since been retired.

Secondly, addresses of the form 0.x.y.z were initially defined only as
a source address in an ICMP datagram, indicating "node number x.y.z on
this IPv4 network", by nodes that know their address on their local
network, but do not yet know their network prefix, in RFC0792 (page
19). This usage of 0.x.y.z was later repealed in RFC1122 (section
3.2.2.7), because the original ICMP-based mechanism for learning the
network prefix was unworkable on many networks such as Ethernet (which
have longer addresses that would not fit into the 24 "node number"
bits). Modern networks use reverse ARP (RFC0903) or BOOTP (RFC0951)
or DHCP (RFC2131) to find their full 32-bit address and CIDR netmask
(and other parameters such as default gateways). 0.x.y.z has had
16,777,215 addresses in 0.0.0.0/8 space left unused and reserved for
future use, since 1989.

This patch allows for these 16m new IPv4 addresses to appear within
a box or on the wire. Layer 2 switches don't care.

0.0.0.0/32 is still prohibited, of course.

Signed-off-by: Dave Taht <[email protected]>
Signed-off-by: John Gilmore <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: aquantia: fix vlans not working over bridged network

In configuration of vlan over bridge over aquantia device
it was found that vlan tagged traffic is dropped on chip.

The reason is that bridge device enables promisc mode,
but in atlantic chip vlan filters will still apply.
So we have to corellate promisc settings with vlan configuration.

The solution is to track in a separate state variable the
need of vlan forced promisc. And also consider generic
promisc configuration when doing vlan filter config.

Fixes: 7975d2aff5af ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rtnetlink: skip metrics loop for dst_default_metrics

dst_default_metrics has all of the metrics initialized to 0, so nothing
will be added to the skb in rtnetlink_put_metrics. Avoid the loop if
metrics is from dst_default_metrics.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'skfp-cleanups'

Puranjay Mohan says:

====================
net: fddi: skfp: Use PCI generic definitions instead of private duplicates

This patch series removes the private duplicates of PCI definitions in
favour of generic definitions defined in pci_regs.h.

This driver only uses some of the generic PCI definitons,
which are included from pci_regs.h and thier private versions
are removed from skfbi.h with all other private defines.

The skfbi.h defines PCI_REV_ID and other private defines with different
names, these are renamed to Generic PCI names to make them
compatible with defines in pci_regs.h.

All unused defines are removed from skfbi.h.

Changes in v5:
Removed unused PCI definitions which were left in v4

Changes in v4:
Removed unused PCI definitions which were left in v3

Changes in v3:
Renamed all local PCI definitions to Generic names.
Corrected coding style mistakes.

Changes in v2:
Converted individual patches to a series.
Made sure that individual patches build correctly
====================

Signed-off-by: David S. Miller <[email protected]>

net: fddi: skfp: Remove unused private PCI definitions

Remove unused private PCI definitions from skfbi.h because generic PCI
symbols are already included from pci_regs.h.

Signed-off-by: Puranjay Mohan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fddi: skfp: Include generic PCI definitions

Include the uapi/linux/pci_regs.h header file which contains the generic
PCI defines.

Signed-off-by: Puranjay Mohan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fddi: skfp: Rename local PCI defines to match generic PCI defines

Rename the PCI_REV_ID and other local defines to Generic PCI define names
in skfbi.h and drvfbi.c to make it compatible with the pci_regs.h.

Signed-off-by: Puranjay Mohan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv4: reset rt_iif for recirculated mcast/bcast out pkts

Multicast or broadcast egress packets have rt_iif set to the oif. These
packets might be recirculated back as input and lookup to the raw
sockets may fail because they are bound to the incoming interface
(skb_iif). If rt_iif is not zero, during the lookup, inet_iif() function
returns rt_iif instead of skb_iif. Hence, the lookup fails.

v2: Make it non vrf specific (David Ahern). Reword the changelog to
reflect it.
Signed-off-by: Stephen Suryaputra <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

team: Always enable vlan tx offload

We should rather have vlan_tci filled all the way down
to the transmitting netdevice and let it do the hw/sw
vlan implementation.

Suggested-by: Jiri Pirko <[email protected]>
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'wireless-drivers-next-for-davem-2019-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valu says:

====================
wireless-drivers-next patches for 5.3

First set of patches for 5.3, but not that many patches this time.

This pull request fails to compile with the tip tree due to
ktime_get_boot_ns() API changes there. It should be easy for Linus to
fix it in p54 driver once he pulls this, an example resolution here:

https://lkml.kernel.org/r/20190625160432.533aa140@canb.auug.org.au

Major changes:

airo

* switch to use skcipher interface

p54

* support boottime in scan results

rtw88

* add fast xmit support

* add random mac address on scan support

rt2x00

* add software watchdog to detect hangs, it's disabled by default
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'smc-fixes'

Ursula Braun says:

====================
net/smc: fixes 2019-06-26

here are 2 small smc fixes for the net tree.
====================

Signed-off-by: David S. Miller <[email protected]>

net/smc: Fix error path in smc_init

If register_pernet_subsys success in smc_init,
we should cleanup it in case any other error.

Fixes: 64e28b52c7a6 (net/smc: add pnet table namespace support")
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: hold conns_lock before calling smc_lgr_register_conn()

After smc_lgr_create(), the newly created link group is added
to smc_lgr_list, thus is accessible from other context.
Although link group creation is serialized by
smc_create_lgr_pending, the new link group may still be accessed
concurrently. For example, if ib_device is no longer active,
smc_ib_port_event_work() will call smc_port_terminate(), which
in turn will call __smc_lgr_terminate() on every link group of
this device. So conns_lock is required here.

Signed-off-by: Huaping Zhou <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

i40e/i40e_virtchnl_pf: Use struct_size() in kzalloc()

One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct virtchnl_iwarp_qvlist_info {
...
struct virtchnl_iwarp_qv_info qv_info[1];
};

size = sizeof(struct virtchnl_iwarp_qvlist_info) + (sizeof(struct virtchnl_iwarp_qv_info) * count;
instance = kzalloc(size, GFP_KERNEL);

and

struct virtchnl_vf_resource {
...
struct virtchnl_vsi_resource vsi_res[1];
};

size = sizeof(struct virtchnl_vf_resource) + sizeof(struct virtchnl_vsi_resource) * count;
instance = kzalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kzalloc(struct_size(instance, qv_info, count), GFP_KERNEL);

and

instance = kzalloc(struct_size(instance, vsi_res, count), GFP_KERNEL);

Notice that, in the first case above, variable size is not necessary, hence it
is removed.

This code was detected with the help of Coccinelle.

Signed-off-by: "Gustavo A. R. Silva" <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: update copyright string

It was found that the string that prints our copyright was
not up to date. Updating to reflect our copyright.

Signed-off-by: Alice Michael <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Fix descriptor count manipulation

Changing descriptor count via 'ethtool -G' is not persistent across resets.
When PF reset occurs, we roll back to the default value of vsi->num_desc,
which is used then in i40e_alloc_rings to set descriptor count. XDP does a
PF reset so when user has changed the descriptor count and load XDP
program, the default count will be back there.

To fix this:
  * introduce new VSI members - num_tx_desc and num_rx_desc in favour of
    num_desc
  * set them in i40e_set_ringparam to user's values
  * set them to default values in i40e_set_num_rings_in_vsi only when they
    don't have previous values

Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: missing priorities for any QoS traffic

This patch fixes reading f/w LLDP agent status at DCB init time.
It's done by removing direct NVM reading in i40e_update_dcb_config()
and checking whether f/w LLDP agent is disabled via
I40E_FLAG_DISABLE_FW_LLDP flag in i40e_init_pf_dcb(). The function
i40e_update_dcb_config() in i40e_main.c is a temporary solution which
will be later renamed to i40e_init_dcb() in the i40e_dcb module. Also
logging was extended to make visible if f/w LLDP agent is running or not
and always log a message when DCB was not initialized. Without this
patch for new f/w versions f/w LLDP agent status was always read
from NVM as disabled and DCB initialization failed without
clear reason in logs.

Signed-off-by: Aleksandr Loktionov <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Add log entry while creating or deleting TC0

Generate log entry when TC0 is created or deleted.
Log entry is generated during main VSI setup.
Before there was no log info about adding or deleting TC0.

Signed-off-by: Piotr Kwapulinski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: fix incorrect function documentation comment

Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: Fix for missing "link modes" info in ethtool

Fix for missing "Supported link modes" and "Advertised link modes"
info in ethtool after changed speed on X722 devices with BASE-T PHY
with FW API version >= 1.7.
The same FW API version on X710 and X722 does not mean the same
feature set so the change was needed as mac type of the device
should also be checked instead of FW API version only.

Signed-off-by: Martyna Szapar <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb speeds

This patch fixes 'NIC Link is Up, Unknown bps' message in dmesg
for 2.5Gb/5Gb speeds. This problem is fixed by adding constants
for VIRTCHNL_LINK_SPEED_2_5GB and VIRTCHNL_LINK_SPEED_5GB cases
in the i40e_virtchnl_link_speed() function.

Signed-off-by: Aleksandr Loktionov <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbevf: fix possible divide by zero in ixgbevf_update_itr

The next call to ixgbevf_update_itr will continue to dynamically
update ITR.

Copy from commit bdbeefe8ea8c ("ixgbe: fix possible divide by zero in
ixgbe_update_itr")

Signed-off-by: Young Xiao <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: Check DDM existence in transceiver before access

Some transceivers may comply with SFF-8472 but not implement the Digital
Diagnostic Monitoring (DDM) interface described in it. The existence of
such area is specified by bit 6 of byte 92, set to 1 if implemented.

Currently, due to not checking this bit ixgbe fails trying to read SFP
module's eeprom with the follow message:

ethtool -m enP51p1s0f0
Cannot get Module EEPROM data: Input/output error

Because it fails to read the additional 256 bytes in which it was assumed
to exist the DDM data.

This issue was noticed using a Mellanox Passive DAC PN 01FT738. The eeprom
data was confirmed by Mellanox as correct and present in other Passive
DACs in from other manufacturers.

Signed-off-by: "Mauro S. M. Rodrigues" <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

net: stmmac: Fix crash observed if PHY does not support EEE

If the PHY does not support EEE mode, then a crash is observed when the
ethernet interface is enabled. The crash occurs, because if the PHY does
not support EEE, then although the EEE timer is never configured, it is
still marked as enabled and so the stmmac ethernet driver is still
trying to update the timer by calling mod_timer(). This triggers a BUG()
in the mod_timer() because we are trying to update a timer when there is
no callback function set because timer_setup() was never called for this
timer.

The problem is caused because we return true from the function
stmmac_eee_init(), marking the EEE timer as enabled, even when we have
not configured the EEE timer. Fix this by ensuring that we return false
if the PHY does not support EEE and hence, 'eee_active' is not set.

Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jon Hunter <[email protected]>
Tested-by: Thierry Reding <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Fix possible deadlock when disabling EEE support

When stmmac_eee_init() is called to disable EEE support, then the timer
for EEE support is stopped and we return from the function. Prior to
stopping the timer, a mutex was acquired but in this case it is never
released and so could cause a deadlock. Fix this by releasing the mutex
prior to returning from stmmax_eee_init() when stopping the EEE timer.

Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jon Hunter <[email protected]>
Tested-by: Thierry Reding <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: fix suspicious RCU usage in rt6_dump_route()

syzbot reminded us that rt6_nh_dump_exceptions() needs to be called
with rcu_read_lock()

net/ipv6/route.c:1593 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syz-executor609/8966:
#0: 00000000b7dbe288 (rtnl_mutex){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
#1: 00000000f2d87c21 (&(&tb->tb6_lock)->rlock){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
#1: 00000000f2d87c21 (&(&tb->tb6_lock)->rlock){+...}, at: fib6_dump_table.isra.0+0x37e/0x570 net/ipv6/ip6_fib.c:533

stack backtrace:
CPU: 0 PID: 8966 Comm: syz-executor609 Not tainted 5.2.0-rc5+ #43
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5250
fib6_nh_get_excptn_bucket+0x18e/0x1b0 net/ipv6/route.c:1593
rt6_nh_dump_exceptions+0x45/0x4d0 net/ipv6/route.c:5541
rt6_dump_route+0x904/0xc50 net/ipv6/route.c:5640
fib6_dump_node+0x168/0x280 net/ipv6/ip6_fib.c:467
fib6_walk_continue+0x4a9/0x8e0 net/ipv6/ip6_fib.c:1986
fib6_walk+0x9d/0x100 net/ipv6/ip6_fib.c:2034
fib6_dump_table.isra.0+0x38a/0x570 net/ipv6/ip6_fib.c:534
inet6_dump_fib+0x93c/0xb00 net/ipv6/ip6_fib.c:624
rtnl_dump_all+0x295/0x490 net/core/rtnetlink.c:3445
netlink_dump+0x558/0xfb0 net/netlink/af_netlink.c:2244
__netlink_dump_start+0x5b1/0x7d0 net/netlink/af_netlink.c:2352
netlink_dump_start include/linux/netlink.h:226 [inline]
rtnetlink_rcv_msg+0x73d/0xb00 net/core/rtnetlink.c:5182
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:646 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:665
sock_write_iter+0x27c/0x3e0 net/socket.c:994
call_write_iter include/linux/fs.h:1872 [inline]
new_sync_write+0x4d3/0x770 fs/read_write.c:483
__vfs_write+0xe1/0x110 fs/read_write.c:496
vfs_write+0x20c/0x580 fs/read_write.c:558
ksys_write+0x14f/0x290 fs/read_write.c:611
__do_sys_write fs/read_write.c:623 [inline]
__se_sys_write fs/read_write.c:620 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:620
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4401b9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc8e134978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401b9
RDX: 000000000000001c RSI: 0000000020000000 RDI: 00

Fixes: 1e47b4837f3b ("ipv6: Dump route exceptions if requested")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Stefano Brivio <[email protected]>
Cc: David Ahern <[email protected]>
Reviewed-by: Stefano Brivio <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv4: fix suspicious RCU usage in fib_dump_info_fnhe()

sysbot reported that we lack appropriate rcu_read_lock()
protection in fib_dump_info_fnhe()

net/ipv4/route.c:2875 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz-executor609/8966:
#0: 00000000b7dbe288 (rtnl_mutex){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199

stack backtrace:
CPU: 0 PID: 8966 Comm: syz-executor609 Not tainted 5.2.0-rc5+ #43
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5250
fib_dump_info_fnhe+0x9d9/0x1080 net/ipv4/route.c:2875
fn_trie_dump_leaf net/ipv4/fib_trie.c:2141 [inline]
fib_table_dump+0x64a/0xd00 net/ipv4/fib_trie.c:2175
inet_dump_fib+0x83c/0xa90 net/ipv4/fib_frontend.c:1004
rtnl_dump_all+0x295/0x490 net/core/rtnetlink.c:3445
netlink_dump+0x558/0xfb0 net/netlink/af_netlink.c:2244
__netlink_dump_start+0x5b1/0x7d0 net/netlink/af_netlink.c:2352
netlink_dump_start include/linux/netlink.h:226 [inline]
rtnetlink_rcv_msg+0x73d/0xb00 net/core/rtnetlink.c:5182
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:646 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:665
sock_write_iter+0x27c/0x3e0 net/socket.c:994
call_write_iter include/linux/fs.h:1872 [inline]
new_sync_write+0x4d3/0x770 fs/read_write.c:483
__vfs_write+0xe1/0x110 fs/read_write.c:496
vfs_write+0x20c/0x580 fs/read_write.c:558
ksys_write+0x14f/0x290 fs/read_write.c:611
__do_sys_write fs/read_write.c:623 [inline]
__se_sys_write fs/read_write.c:620 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:620
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4401b9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc8e134978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401b9
RDX: 000000000000001c RSI: 0000000020000000 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000000000000010 R11: 0000000000000246 R12: 0000000000401a40
R13: 0000000000401ad0 R14: 0000000000000000 R15: 0000000000000000

Fixes: ee28906fd7a1 ("ipv4: Dump route exceptions if requested")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Stefano Brivio <[email protected]>
Cc: David Ahern <[email protected]>
Reported-by: syzbot <[email protected]>
Reviewed-by: Stefano Brivio <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Revert "net: ena: ethtool: add extra properties retrieval via get_priv_flags"

This reverts commit 315c28d2b714 ("net: ena: ethtool: add extra properties retrieval via get_priv_flags").

As discussed at netconf and on the mailing list we can't allow
for the the abuse of private flags for exposing arbitrary device
labels.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'net-hns3-some-code-optimizations-bugfixes'

Huazhong Tan says:

====================
net: hns3: some code optimizations & bugfixes

This patch-set includes code optimizations and bugfixes for
the HNS3 ethernet controller driver.

[patch 1/11] fixes a selftest issue when doing autoneg.

[patch 2/11 - 3-11] adds two code optimizations about VLAN issue.

[patch 4/11] restores the MAC autoneg state after reset.

[patch 5/11 - 8/11] adds some code optimizations and bugfixes about
HW errors handling.

[patch 9/11 - 11/11] fixes some issues related to driver loading and
unloading.
====================

Signed-off-by: David S. Miller <[email protected]>

net: hns3: add exception handling when enable NIC HW error interrupts

If we failed to enable NIC HW error interrupts during client
initialization in some cases, we should do exception handling to clear
flags and free the resources.

Fixes: 00ea6e5fda9d ("net: hns3: delay and separate enabling of NIC and ROCE HW errors")
Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fixes wrong place enabling ROCE HW error when loading

The ROCE HW errors should only be enabled when initializing ROCE's
client, the current code enable it no matter initializing NIC or
ROCE client.

So this patch fixes it.

Fixes: 00ea6e5fda9d ("net: hns3: delay and separate enabling of NIC and ROCE HW errors")
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix race conditions between reset and module loading & unloading

When loading or unloading module, it should wait for the reset task
done before it un-initializes the client, otherwise the reset task
may cause a NULL pointer reference.

Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: add check to number of buffer descriptors

This patch adds check to number of bds before we allocate memory for
them. If we get an invalid bd num in some cases, it will cause a memory
overflow.

Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: remove override_pci_need_reset

We add override_pci_need_reset to prevent redundant and unwanted PF
resets if a RAS error occurs in commit 69b51bbb03f7 ("net: hns3: fix
to stop multiple HNS reset due to the AER changes").

Now in HNS3 driver, we use hw_err_reset_req to record reset level that
we need to recover from a RAS error. This variable cans solve above
issue as override_pci_need_reset, so this patch removes
override_pci_need_reset.

Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: modify handling of out of memory in hclge_err.c

Users should be informed if HNS driver failed to allocate memory for
descriptor when handling hw errors. This patch solve above issues.

Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: code optimizaition of hclge_handle_hw_ras_error()

This patch optimizes hclge_handle_hw_ras_error() to make the code logic
clearer.
1. If there was no NIC or Roce RAS when we read
   HCLGE_RAS_PF_OTHER_INT_STS_REG, we return directly.
2. Because NIC and Roce RAS may occurs at the same time, so we should
   check value of revision at first before we handle Roce RAS instead
   of only checking it in branch of no NIC RAS is detected.
3. Check HCLGE_STATE_RST_HANDLING each time before we want to return
   PCI_ERS_RESULT_NEED_RESET.
4. Remove checking of HCLGE_RAS_REG_NFE_MASK and
   HCLGE_RAS_REG_ROCEE_ERR_MASK because if hw_err_reset_req is not
   zero, it proves that we have set it in handling of NIC or Roce RAS.

Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: restore the MAC autoneg state after reset

When doing global reset, the MAC autoneg state of fibre
port is set to default, which may cause user configuration
lost. This patch fixes it by restore the MAC autoneg state
after reset.

Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: sync VLAN filter entries when kill VLAN ID failed

When HW is resetting, firmware is unable to handle commands
from driver. So if remove VLAN device from stack at this time,
it will fail to remove the VLAN ID from HW VLAN filter, then
the VLAN filter status is unsynced with stack.

This patch fixes it by recording the VLAN ID delete failed,
and removes them again when reset complete.

Fixes: 44e626f720c3 ("net: hns3: fix VLAN offload handle for VLAN inserted by port")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: remove VF VLAN filter entry inexistent warning print

For VF VLAN filter is disabled when VF VLAN table is full, then the
new VLAN ID won't be added into VF VLAN table, it will always print
fail log when remove these VLAN IDs. If user has added too many
VLANs, it will cause massive verbose print logs.

Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix selftest fail issue for fibre port with autoneg on

When doing selftest for fibre port with autoneg on, the MAC speed
may be incorrect, which may cause the selftest failed. This patch
fixes it by halting autoneg during the selftest.

Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bonding: Always enable vlan tx offload

We build vlan on top of bonding interface, which vlan offload
is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is
BOND_XMIT_POLICY_ENCAP34.

Because vlan tx offload is off, vlan tci is cleared and skb push
the vlan header in validate_xmit_vlan() while sending from vlan
devices. Then in bond_xmit_hash, __skb_flow_dissect() fails to
get information from protocol headers encapsulated within vlan,
because 'nhoff' is points to IP header, so bond hashing is based
on layer 2 info, which fails to distribute packets across slaves.

This patch always enable bonding's vlan tx offload, pass the vlan
packets to the slave devices with vlan tci, let them to handle
vlan implementation.

Fixes: 278339a42a1b ("bonding: propogate vlan_features to bonding master")
Suggested-by: Jiri Pirko <[email protected]>
Signed-off-by: YueHaibing <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

HID: intel-ish-hid: fix wrong driver_data usage

Currently, in suspend() and resume(), ishtp client drivers are using
driver_data to get "struct ishtp_cl_device" object which is set by
bus driver. It's wrong since the driver_data should not be owned bus.
driver_data should be owned by the corresponding ishtp client driver.
Due to this, some ishtp client driver like cros_ec_ishtp which uses
its driver_data to transfer its data to its child doesn't work correctly.

So this patch removes setting driver_data in bus drier and instead of
using driver_data to get "struct ishtp_cl_device", since "struct device"
is embedded in "struct ishtp_cl_device", we introduce a helper function
that returns "struct ishtp_cl_device" from "struct device".

Signed-off-by: Hyungwoo Yang <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

HID: multitouch: Add pointstick support for ALPS Touchpad

There's a new ALPS touchpad/pointstick combo device that requires
MT_CLS_WIN_8_DUAL to make its pointsitck work as a mouse.

The device can be found on HP ZBook 17 G5.

Signed-off-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

HID: logitech-dj: Fix forwarding of very long HID++ reports

The HID++ spec also defines very long HID++ reports, with a reportid of
0x12. The MX5000 and MX5500 keyboards use 0x12 output reports for sending
messages to display on their buildin LCD.

Userspace (libmx5000) supports this, in order for this to work when talking
to the HID devices instantiated for the keyboard by hid-logitech-dj,
we need to properly forward these reports to the device.

This commit fixes logi_dj_ll_raw_request not forwarding these reports.

Fixes: f2113c3020ef ("HID: logitech-dj: add support for Logitech Bluetooth Mini-Receiver")
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

HID: uclogic: Add support for Huion HS64 tablet

Add support for Huion HS64 drawing tablet to hid-uclogic

Signed-off-by: Kyle Godbey <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

HID: chicony: add another quirk for PixArt mouse

I've spotted another Chicony PixArt mouse in the wild, which requires
HID_QUIRK_ALWAYS_POLL quirk, otherwise it disconnects each minute.

USB ID of this device is 0x04f2:0x0939.

We've introduced quirks like this for other models before, so lets add
this mouse too.

Link: https://github.com/sriemer/fix-linux-mouse#usb-mouse-disconnectsreconnects-every-minute-on-linux
Signed-off-by: Oleksandr Natalenko <[email protected]>
Acked-by: Sebastian Parschauer <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

HID: intel-ish-hid: Fix a use after free in load_fw_from_host()

We have to print the filename first before we can kfree it.

Fixes: 91b228107da3 ("HID: intel-ish-hid: ISH firmware loader client driver")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

csky: Fixup libgcc unwind error

The struct rt_sigframe is also defined in libgcc/config/csky/linux-unwind.h
of gcc. Although there is no use for the first three word space, we must
keep them the same with linux-unwind.h for member position.

The BUG is found in glibc test with the tst-cancel02.
The BUG is from commit:bf2416829362 of linux-5.2-rc1 merge window.

Signed-off-by: Guo Ren <[email protected]>
Signed-off-by: Mao Han <[email protected]>
Cc: Arnd Bergmann <[email protected]>

tc-testing: add ingress qdisc tests

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

linux/dim: Add completions count to dim_sample

Added a measurement of completions per/msec to allow for completion based
dim algorithms.

In order to use dynamic interrupt moderation with RDMA we need to have a
different measurment than packets per second. This change is meant to
prepare for adding a new DIM method.

All drivers that use net_dim and thus do not need a completion count will
have the completions set to 0.

Signed-off-by: Yamin Friedman <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

linux/dim: Move implementation to .c files

Moved all logic from dim.h and net_dim.h to dim.c and net_dim.c.
This is both more structurally appealing and would allow to only
expose externally used functions.

Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

linux/dim: Rename externally used net_dim members

Removed 'net' prefix from functions and structs used by external drivers.

Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

linux/dim: Rename net_dim_sample() to net_dim_update_sample()

In order to avoid confusion between the function and the similarly
named struct.
In preparation for removing the 'net' prefix from dim members.

Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

linux/dim: Rename externally exposed macros

Renamed macros in use by external drivers.

Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

linux/dim: Remove "net" prefix from internal DIM members

Only renaming functions and structs which aren't used by an external code.

Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

linux/dim: Move logic to dim.h

In preparation for supporting more implementations of the DIM
algorithm, I'm moving what would become common logic to a common
library. Downstream DIM implementations will use the common lib
for their implementation.

Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

tipc: rename function msg_get_wrapped() to msg_inner_hdr()

We rename the inline function msg_get_wrapped() to the more
comprehensible msg_inner_hdr().

Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: eliminate unnecessary skb expansion during retransmission

We increase the allocated headroom for the buffer copies to be
retransmitted. This eliminates the need for the lower stack levels
(UDP/IP/L2) to expand the headroom in order to add their own headers.

Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

clk: socfpga: stratix10: fix divider entry for the emac clocks

The fixed dividers for the emac clocks should be 2 not 4.

Cc: [email protected]
Signed-off-by: Dinh Nguyen <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>

tipc: simplify stale link failure criteria

In commit a4dc70d46cf1 ("tipc: extend link reset criteria for stale
packet retransmission") we made link retransmission failure events
dependent on the link tolerance, and not only of the number of failed
retransmission attempts, as we did earlier. This works well. However,
keeping the original, additional criteria of 99 failed retransmissions
is now redundant, and may in some cases lead to failure detection
times in the order of minutes instead of the expected 1.5 sec link
tolerance value.

We now remove this criteria altogether.

Acked-by: Ying Xue <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ipv6: Fix misuse of proc_dointvec "skip_notify_on_dev_down"

/proc/sys/net/ipv6/route/skip_notify_on_dev_down assumes given value to be
0 or 1. Use proc_dointvec_minmax instead of proc_dointvec.

Fixes: 7c6bb7d2faaf ("net/ipv6: Add knob to skip DELROUTE message ondevice down")
Signed-off-by: Eiichi Tsukata <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tc-testing: Restore original behaviour for namespaces in tdc

This patch restores the original behaviour for tdc prior to the
introduction of the plugin system, where the network namespace
functionality was split from the main script.

It introduces the concept of required plugins for testcases,
and will automatically load any plugin that isn't already
enabled when said plugin is required by even one testcase.

Additionally, the -n option for the nsPlugin is deprecated
so the default action is to make use of the namespaces.
Instead, we introduce -N to not use them, but still create
the veth pair.

buildebpfPlugin's -B option is also deprecated.

If a test cases requires the features of a specific plugin
in order to pass, it should instead include a new key/value
pair describing plugin interactions:

        "plugins": {
                "requires": "buildebpfPlugin"
        },

A test case can have more than one required plugin: a list
can be inserted as the value for 'requires'.

Signed-off-by: Lucas Bates <[email protected]>
Acked-by: Davide Caratti <[email protected]>
Tested-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop

In commit 19e4e768064a8 ("ipv4: Fix raw socket lookup for local
traffic"), the dif argument to __raw_v4_lookup() is coming from the
returned value of inet_iif() but the change was done only for the first
lookup. Subsequent lookups in the while loop still use skb->dev->ifIndex.

Fixes: 19e4e768064a8 ("ipv4: Fix raw socket lookup for local traffic")
Signed-off-by: Stephen Suryaputra <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patches contains Netfilter updates for net-next:

1) .br_defrag indirection depends on CONFIG_NF_DEFRAG_IPV6, from wenxu.

2) Remove unnecessary memset() in ipset, from Florent Fourcot.

3) Merge control plane addition and deletion in ipset, also from Florent.

4) A few missing check for nla_parse() in ipset, from Aditya Pakki
   and Jozsef Kadlecsik.

5) Incorrect cleanup in error path of xt_set version 3, from Jozsef.

6) Memory accounting problems when resizing in ipset, from Stefano Brivio.

7) Jozsef updates his email to @netfilter.org, this batch comes with a
   conflict resolution with recent SPDX header updates.

8) Add to create custom conntrack expectations via nftables, from
   Stephane Veyret.

9) A lookup optimization for conntrack, from Florian Westphal.

10) Check for supported flags in xt_owner.

11) Support for pernet sysctl in br_netfilter, patches
    from Christian Brauner.

12) Patches to move common synproxy infrastructure to nf_synproxy.c,
    to prepare the synproxy support for nf_tables, patches from
    Fernando Fernandez Mancera.

13) Support to restore expiration time in set element, from Laura Garcia.

14) Fix recent rewrite of netfilter IPv6 to avoid indirections
    when CONFIG_IPV6 is unset, from Arnd Bergmann.

15) Always reset vlan tag on skbuff fraglist when refragmenting in
    bridge conntrack, from wenxu.

16) Support to match IPv4 options in nf_tables, from Stephen Suryaputra.
====================

Signed-off-by: David S. Miller <[email protected]>

dm verity: use message limit for data block corruption message

DM verity should also use DMERR_LIMIT to limit repeat data block
corruption messages.

Signed-off-by: Milan Broz <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm table: don't copy from a NULL pointer in realloc_argv()

For the first call to realloc_argv() in dm_split_args(), old_argv is
NULL and size is zero. Then memcpy is called, with the NULL old_argv
as the source argument and a zero size argument. AFAIK, this is
undefined behavior and generates the following warning when compiled
with UBSAN on ppc64le:

In file included from ./arch/powerpc/include/asm/paca.h:19,
                 from ./arch/powerpc/include/asm/current.h:16,
                 from ./include/linux/sched.h:12,
                 from ./include/linux/kthread.h:6,
                 from drivers/md/dm-core.h:12,
                 from drivers/md/dm-table.c:8:
In function 'memcpy',
    inlined from 'realloc_argv' at drivers/md/dm-table.c:565:3,
    inlined from 'dm_split_args' at drivers/md/dm-table.c:588:9:
./include/linux/string.h:345:9: error: argument 2 null where non-null expected [-Werror=nonnull]
  return __builtin_memcpy(p, q, size);
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/md/dm-table.c: In function 'dm_split_args':
./include/linux/string.h:345:9: note: in a call to built-in function '__builtin_memcpy'

Signed-off-by: Jerome Marchand <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm log writes: make sure super sector log updates are written in order

Currently, although we submit super bios in order (and super.nr_entries
is incremented by each logged entry), submit_bio() is async so each
super sector may not be written to log device in order and then the
final nr_entries may be smaller than it should be.

This problem can be reproduced by the xfstests generic/455 with ext4:

QA output created by 455
-Silence is golden
+mark 'end' does not exist

Fix this by serializing submission of super sectors to make sure each
is written to the log disk in order.

Fixes: 0e9cebe724597 ("dm: add log writes target")
Cc: [email protected]
Signed-off-by: zhangyi (F) <[email protected]>
Suggested-by: Josef Bacik <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>