Git Repo - linux.git/log

pktgen: use dynamic allocation for debug print buffer

After the removal of the VLA, we get a harmless warning about a large
stack frame:

net/core/pktgen.c: In function 'pktgen_if_write':
net/core/pktgen.c:1710:1: error: the frame size of 1076 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

The function was previously shown to be safe despite hitting
the 1024 byte warning level. To get rid of the annoying warning
while keeping the code readable, this changes it to use strndup_user().

Obviously this is not a fast path, so the kmalloc() overhead
can be disregarded.
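
As a rough user-space analogy of this change (not the kernel code itself), the sketch below swaps a large on-stack scratch buffer for a bounded heap copy that the caller frees. The names debug_copy() and MAX_DEBUG_LEN are hypothetical; the patch itself uses strndup_user() on a user-space pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit, chosen for illustration only. */
#define MAX_DEBUG_LEN 1024

/*
 * Return a NUL-terminated heap copy of at most MAX_DEBUG_LEN bytes of
 * src, or NULL on allocation failure. src must hold at least 'count'
 * readable bytes. The caller must free() the result.
 * Nothing large lives on the stack, so the frame stays small.
 */
static char *debug_copy(const char *src, size_t count)
{
	size_t len = count < MAX_DEBUG_LEN ? count : MAX_DEBUG_LEN;
	char *buf = malloc(len + 1);

	if (!buf)
		return NULL;
	memcpy(buf, src, len);
	buf[len] = '\0';
	return buf;
}
```

The allocation cost is acceptable here for the reason the message gives: this is a debug print on a slow path.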

Fixes: 35951393bbff ("pktgen: Remove VLA usage")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fix sysctl_fb_tunnels_only_for_init_net link error

The new variable is only available when CONFIG_SYSCTL is enabled,
otherwise we get a link error:

net/ipv4/ip_tunnel.o: In function `ip_tunnel_init_net':
ip_tunnel.c:(.text+0x278b): undefined reference to `sysctl_fb_tunnels_only_for_init_net'
net/ipv6/sit.o: In function `sit_init_net':
sit.c:(.init.text+0x4c): undefined reference to `sysctl_fb_tunnels_only_for_init_net'
net/ipv6/ip6_tunnel.o: In function `ip6_tnl_init_net':
ip6_tunnel.c:(.init.text+0x39): undefined reference to `sysctl_fb_tunnels_only_for_init_net'

This adds an extra condition, keeping the traditional behavior when
CONFIG_SYSCTL is disabled.
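
A minimal user-space sketch of the idea, assuming a build-time macro HAVE_SYSCTL that stands in for CONFIG_SYSCTL. The variable and function names below are illustrative and are not taken from the patch.

```c
#include <stdbool.h>

/*
 * HAVE_SYSCTL models CONFIG_SYSCTL. The knob only exists when it is
 * defined, so every reference to it must sit behind the same guard;
 * otherwise the link step fails with an undefined reference.
 */
#ifdef HAVE_SYSCTL
static int fb_tunnels_only_for_init_net;
#endif

/* Should a namespace get fallback tunnel devices? */
static bool has_fallback_tunnels(bool is_init_net)
{
#ifdef HAVE_SYSCTL
	return is_init_net || !fb_tunnels_only_for_init_net;
#else
	/* No knob to consult: keep the traditional behavior. */
	(void)is_init_net;
	return true;
#endif
}
```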

Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Add comment about pernet_operations methods and synchronization

Make the locking scheme visible to users, and add
a comment explaining why the exit_batch() method is needed
and when it should be used.

Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: Add HMA support

HMA (Host Memory Access) maps a part of host memory for T6-SO memfree cards.

This commit does the following:
- Query FW to check if we have HMA support. If yes, the params will
return HMA size configured in FW. We will dma map memory based
on this size.
- Also contains changes to get HMA memory information via debugfs.

Signed-off-by: Arjun Vynipadath <[email protected]>
Signed-off-by: Santosh Rastapur <[email protected]>
Signed-off-by: Michael Werner <[email protected]>
Signed-off-by: Ganesh GR <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'pernet-convert-part6'

Kirill Tkhai says:

====================
Converting pernet_operations (part #6)

This series continues to review and convert pernet_operations
so that they can be executed in parallel for several
net namespaces at the same time. This series covers sctp, tipc
and rds.
====================

Signed-off-by: David S. Miller <[email protected]>

net: Convert rds_tcp_net_ops

These pernet_operations create and destroy sysctl table
and listen socket. Also, exit method flushes global
workqueue and work. Everything looks per-net safe,
so we can mark them async.

Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Convert tipc_net_ops

TIPC looks self-contained, and other pernet_operations
do not seem to touch its entities.

tipc_net_ops look pernet-divided, and they should be safe to
be executed in parallel for several net namespaces at the same time.

Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Convert sctp_ctrlsock_ops

These pernet_operations create and destroy net::sctp::ctl_sock.
Since pernet_operations do not send sctp packets to each other,
they look safe to be marked as async.

Signed-off-by: Kirill Tkhai <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Convert sctp_defaults_ops

These pernet_operations deal with sysctl, /proc
entries and statistics. Also, the exit method frees the
net::sctp::addr_waitq queue and net::sctp::local_addr_list.
All of them look pernet-divided, and it
seems these items are only of interest to sctp_defaults_ops,
which are safe to be executed in parallel.

Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sctp: fix error return code in sctp_sendmsg_new_asoc()

Return error code -EINVAL in the address length check error handling
case, since 'err' can be overwritten with 0 by 'err = sctp_verify_addr()'
in the for loop.
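
The bug pattern can be sketched in plain C as follows. The helper verify_addr() and the function check_addrs() are hypothetical stand-ins and do not reproduce the sctp code; the point is only that 'err' must be set again before bailing out of the loop.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical validator: 0 on success, negative errno otherwise. */
static int verify_addr(size_t len)
{
	return len >= 4 ? 0 : -EINVAL;
}

/*
 * Walk a list of address lengths. 'max_len' models the space left
 * in the control message.
 */
static int check_addrs(const size_t *lens, size_t n, size_t max_len)
{
	int err = -EINVAL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (lens[i] > max_len) {
			/*
			 * Without this assignment, err may still hold 0
			 * from the previous iteration and the failure
			 * would be reported as success.
			 */
			err = -EINVAL;
			goto out;
		}
		err = verify_addr(lens[i]);
		if (err)
			goto out;
	}
out:
	return err;
}
```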

Fixes: 2c0dbaa0c43d ("sctp: add support for SCTP_DSTADDRV4/6 Information for sendmsg")
Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Neil Horman <[email protected]>
Reviewed-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Fix recent errata commit

Sorry, one of the patches I sent in an earlier series
has some dumb mistakes. One was that I had changed the
parameter for the errata workaround function but forgot
to make that change in the code that called it.

The second mistake was a forgotten return value at the end
of the function in case the workaround was not needed.

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ibmvnic-Fix-VLAN-and-other-device-errata'

Thomas Falcon says:

====================
ibmvnic: Fix VLAN and other device errata

This patch series contains fixes for VLAN and other backing hardware
errata. The VLAN fixes are mostly to account for the additional four
bytes VLAN header in TX descriptors and buffers, when applicable.

The other fixes for device errata are to pad small packets to avoid a
possible connection error that can occur when some devices attempt to
transmit small packets. The other fixes are GSO related. Some devices
cannot handle a smaller MSS or a packet with a single segment, so
disable GSO in those cases.

v2: Fix style mistake (unneeded brackets) in patch 3/4
====================

Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Handle TSO backing device errata

TSO packets with one segment or with an MSS less than 224 can
cause errors on some backing devices, so disable GSO in those cases.
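
The condition described above can be written as a small predicate. This is only a sketch: must_disable_gso() and MIN_TSO_MSS are hypothetical names, and the driver applies the check to skb fields rather than to plain integers.

```c
#include <stdbool.h>

/* Threshold taken from the commit message. */
#define MIN_TSO_MSS 224

/*
 * gso_segs: number of segments the packet would be split into.
 * gso_size: MSS used for segmentation.
 */
static bool must_disable_gso(unsigned int gso_segs, unsigned int gso_size)
{
	return gso_segs == 1 || gso_size < MIN_TSO_MSS;
}
```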

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Pad small packets to minimum MTU size

Some backing devices cannot handle small packets well,
so pad any small packets to avoid that. It was recommended
that the VNIC driver should not send packets smaller than the
minimum MTU value provided by firmware, so pad small packets
to be at least that long.
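
A minimal sketch of the padding step on a flat buffer, assuming the minimum length has already been obtained from firmware. pad_frame() is a hypothetical helper; the driver operates on skbs instead.

```c
#include <stddef.h>
#include <string.h>

/*
 * Zero-pad the frame in 'buf' (capacity 'cap', current length 'len')
 * up to 'min_len'. Returns the new length, or 0 if the padded frame
 * does not fit in the buffer.
 */
static size_t pad_frame(unsigned char *buf, size_t cap, size_t len,
			size_t min_len)
{
	if (len >= min_len)
		return len;	/* already long enough */
	if (min_len > cap)
		return 0;
	memset(buf + len, 0, min_len - len);
	return min_len;
}
```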

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Account for VLAN header length in TX buffers

The extra four bytes of a VLAN packet were throwing off
TX buffer entry values used by the driver. Account for those
bytes in the buffer size and buffer entry calculations.

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Account for VLAN tag in L2 Header descriptor

If a VLAN tag is present in the Ethernet header, account
for that when providing the L2 header to firmware.
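
For illustration, the L2 header length with and without an 802.1Q tag can be computed as below. The macro and function names are made up for this sketch and are not the driver's; only the sizes (14-byte Ethernet header, 4-byte tag, TPID 0x8100) are standard.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN	4	/* TPID + TCI */
#define TPID_8021Q	0x8100

/* 'frame' must point to at least 14 bytes of an Ethernet frame. */
static size_t l2_header_len(const uint8_t *frame)
{
	uint16_t ethertype = (uint16_t)(frame[12] << 8 | frame[13]);

	return ethertype == TPID_8021Q ? ETH_HDR_LEN + VLAN_TAG_LEN
				       : ETH_HDR_LEN;
}
```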

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tc-testing: updated gact tests with batch test cases

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tc-testing: add TC vlan action tests

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: phy: set link state to down when creating the phy_device

Currently the link state is initialized to "up" when the phy_device is
being created. This is not consistent with the phy state being
initialized to PHY_DOWN.

Usually this doesn't do any harm because the link state is updated
once the PHY reaches state PHY_AN. However, if e.g. a LAN port isn't
used and the PHY remains down, this inconsistency persists and calls
to functions like phy_print_status() give false results.
Therefore change the initialization to link being down.

Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-03-12

This series contains updates to ixgbe and ixgbevf only.

Shannon Nelson provides three fixes to the ipsec portion of ixgbe.  Make
sure we are using 128-bit authentication, since it is the only size
supported for hardware offload.  Fixed the transmit trailer length
calculation for ipsec by finding the padding value and adding it to the
authentication length, then save it off so that we can put it in the
transmit descriptor to tell the device where to stop the checksum
calculation.  Lastly, cleaned up useless and dead code.

Tonghao Zhang adds an ethtool stat for receive length errors, since the
driver was already collecting this counter.

Arnd Bergmann fixed a warning about an unused variable by "rephrasing" the
code so that the compiler can see the use of the variable in question.

Paul fixes an issue where "HIDE_VLAN" was being cleared on VF reset, so
make sure "HIDE_VLAN" is set when port VLAN is enabled after a VF reset.
====================

Signed-off-by: David S. Miller <[email protected]>

ixgbe: fix disabling hide VLAN on VF reset

If port VLAN is enabled, set PFQDE.HIDE_VLAN during VF reset.

Setting only PFQDE.PFQDE during VF reset was clearing PFQDE.HIDE_VLAN.
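
The underlying issue is a whole-register write that carries only one flag. A sketch, with made-up bit positions (the real PFQDE layout is defined by the hardware datasheet and is not reproduced here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are hypothetical, for illustration only. */
#define QDE_ENABLE	(1u << 0)
#define QDE_HIDE_VLAN	(1u << 1)

/*
 * Value to write to the register during VF reset. Writing QDE_ENABLE
 * alone would overwrite, and thereby clear, QDE_HIDE_VLAN.
 */
static uint32_t qde_reset_value(bool port_vlan_enabled)
{
	uint32_t reg = QDE_ENABLE;

	if (port_vlan_enabled)
		reg |= QDE_HIDE_VLAN;
	return reg;
}
```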

Signed-off-by: Paul Greenwalt <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

net: rds: drop VLA in rds_walk_conn_path_info()

Avoid VLA[1] by using an already allocated buffer passed
by the caller.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: rds: drop VLA in rds_for_each_conn_info()

Avoid VLA[1] by using an already allocated buffer passed
by the caller.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ixgbevf: fix unused variable warning

The new ixgbevf_set_rx_buffer_len() function causes a harmless warning
in configurations with large page size:

drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function 'ixgbevf_set_rx_buffer_len':
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:1758:15: error: unused variable 'max_frame' [-Werror=unused-variable]

This rephrases the code so that the compiler can see the use of that
variable, making it slightly easier to read in the process.

Fixes: f15c5ba5b6cd ("ixgbevf: add support for using order 1 pages to receive large frames")
Signed-off-by: Arnd Bergmann <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: Add receive length error counter

ixgbe already collects the rlec counter and uses it for rx_error.
We can export the counter directly via ethtool -S ethX.

Signed-off-by: Tonghao Zhang <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: remove unneeded ipsec state free callback

With commit 7f05b467a735 ("xfrm: check for xdo_dev_state_free")
we no longer need to add an empty callback function
to the driver, so now let's remove the useless code.

Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: fix ipsec trailer length

Fix up the Tx trailer length calculation. We can't believe the
trailer len from the xstate information because it was calculated
before the packet was put together and padding added. This bit
of code finds the padding value in the trailer, adds it to the
authentication length, and saves it so later we can put it into
the Tx descriptor to tell the device where to stop the checksum
calculation.
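
A sketch of the calculation on a flat buffer, based on the ESP trailer layout from RFC 4303 (padding, one pad-length byte, one next-header byte, then the ICV). esp_trailer_len() is a hypothetical helper, and counting the two fixed bytes in the result is an assumption of this sketch; the message above only mentions the padding and the authentication length.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * End of an ESP packet:
 *   ... padding | pad length (1) | next header (1) | ICV (authlen)
 * Returns the trailer length, or 0 if the packet is too short.
 */
static size_t esp_trailer_len(const uint8_t *pkt, size_t pkt_len,
			      size_t authlen)
{
	uint8_t padlen;

	if (pkt_len < authlen + 2)
		return 0;
	/* The pad length byte sits just before next header and ICV. */
	padlen = pkt[pkt_len - authlen - 2];
	return authlen + 2 + padlen;
}
```

Reading the pad length from the finished packet is what makes the value reliable, since the padding is only known once the packet has been assembled.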

Fixes: 592594704761 ("ixgbe: process the Tx ipsec offload")
Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: check for 128-bit authentication

Make sure the Security Association is using
a 128-bit authentication, since that's the only
size that the hardware offload supports.

Signed-off-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

mlxsw: spectrum_kvdl: Make some functions static

Fixes the following sparse warnings:

drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:371:5: warning:
symbol 'mlxsw_sp_kvdl_single_occ_get' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:384:5: warning:
symbol 'mlxsw_sp_kvdl_chunks_occ_get' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:397:5: warning:
symbol 'mlxsw_sp_kvdl_large_chunks_occ_get' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Make RX-FCS and HW GRO mutually exclusive

Same as LRO, hardware GRO cannot be enabled with RX-FCS.
When both are requested, hardware GRO will be dropped.
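
The rule can be sketched as a feature fix-up on a bitmask. The flag values and fix_features() below are illustrative and are not the kernel's NETIF_F_* definitions.

```c
#include <stdint.h>

/* Illustrative flag values only. */
#define F_LRO		(1u << 0)
#define F_GRO_HW	(1u << 1)
#define F_RXFCS		(1u << 2)

/* Drop LRO and hardware GRO whenever RX-FCS is requested. */
static uint32_t fix_features(uint32_t features)
{
	if (features & F_RXFCS)
		features &= ~(uint32_t)(F_LRO | F_GRO_HW);
	return features;
}
```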

Suggested-by: David Miller <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: llc: drop VLA in llc_sap_mcast()

Avoid a VLA[1] by using a real constant expression instead of a variable.
The compiler should be able to optimize the original code and avoid using
an actual VLA. Anyway this change is useful because it will avoid a false
positive with -Wvla, it might also help the compiler generating better
code.
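
A sketch of the difference, using a placeholder type and hypothetical names rather than the llc code. An array sized by a variable is formally a VLA even when the value is obvious; sizing it with an integer constant expression is not.

```c
#include <stddef.h>

struct item;	/* opaque placeholder type */

/* An integer constant expression, so arrays sized by it are not VLAs. */
#define STACK_SLOTS (256 / sizeof(struct item *))

static size_t stack_bytes(void)
{
	/*
	 * Before (trips -Wvla):
	 *   int count = 256 / sizeof(struct item *);
	 *   struct item *stack[count];
	 */
	struct item *stack[STACK_SLOTS];

	return sizeof(stack);
}
```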

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

lan743x: make functions lan743x_csr_read and lan743x_csr_write static

Functions lan743x_csr_read and lan743x_csr_write are local to the source
and do not need to be in global scope, so make them static.

Cleans up sparse warning:
drivers/net/ethernet/microchip/lan743x_main.c:56:5: warning: symbol
'lan743x_csr_read' was not declared. Should it be static?
drivers/net/ethernet/microchip/lan743x_main.c:61:6: warning: symbol
'lan743x_csr_write' was not declared. Should it be static?

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

lan743x: remove some redundant variables and assignments

Function lan743x_phy_init assigns pointer 'netdev' but this is never read
and hence it can be removed. The return error code handling can also be
cleaned up to remove the variable 'ret'.

Function lan743x_phy_link_status_change assigns pointer 'phy' twice and
this is never read, so it also can be removed.

Finally, function lan743x_tx_napi_poll initializes pointer 'adapter'
and then re-assigns the same value into this pointer a little later on
so this second assignment is redundant and can be also removed.

Cleans up clang warnings:
drivers/net/ethernet/microchip/lan743x_main.c:951:2: warning: Value
stored to 'netdev' is never read
drivers/net/ethernet/microchip/lan743x_main.c:971:3: warning: Value
stored to 'phy' is never read
drivers/net/ethernet/microchip/lan743x_main.c:1583:26: warning: Value
stored to 'adapter' during its initialization is never read

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rds: remove redundant variable 'sg_off'

Variable sg_off is assigned a value but it is never read, hence it is
redundant and can be removed.

Cleans up clang warning:
net/rds/message.c:373:2: warning: Value stored to 'sg_off' is never read

Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Sowmini Varadhan <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: Use ip6_multipath_hash_policy() in rt6_multipath_hash().

Make use of the new helper.

Suggested-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mlxsw-Removing-dependency-of-mlxsw-on-GRE'

Ido Schimmel says:

====================
mlxsw: Removing dependency of mlxsw on GRE

Petr says:

mlxsw_spectrum supports offloading of a tc action mirred egress mirror
to a gretap or ip6gretap netdevice, which necessitates calls to
functions defined in ip_gre, ip6_gre and ip6_tunnel modules. Previously
this was enabled by introducing a hard dependency of MLXSW_SPECTRUM on
NET_IPGRE and IPV6_GRE. However the rest of mlxsw is careful about
picking which modules are absolutely required, and therefore the better
approach is to make mlxsw_spectrum tolerant of absence of one or both of
the GRE flavors.

One way this might be resolved is by keeping the code in mlxsw_spectrum
intact, and defining defaults for functions that mlxsw_spectrum depends
on. The downsides are that other modules end up littered with these
do-nothing defaults; that the driver ends up carrying quite a bit of
dead code; and that the driver ends up having to explicitly depend on
IPV6_TUNNEL to prevent configurations where mlxsw_spectrum is compiled
in and ip6_tunnel is a module, something that it currently can treat
as an implementation detail of the IPV6_GRE dependency.

Alternatively, the driver should just bite the bullet and ifdef-out the
code that handles configurations that are not supported. Since that's
what we are doing for IPv6 dependency, let's do the same for the GRE
flavors.

Patch #1 introduces a wrapper function for determining the value of
ipv6.sysctl.multipath_hash_policy, which defaults to 0 on non-IPv6
builds. That function is then used from spectrum_router.c, instead of
the direct variable reference that was introduced there during the short
window when the Spectrum driver had a hard dependency on IPv6.

Patch #2 moves one function to keep together in one block all the
callbacks for handling (IPv4) gretap mirroring.

Patch #3 then introduces the ifdefs to hide the irrelevant code.
====================

Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Don't depend on ip_gre and ip6_gre

mlxsw_spectrum supports offloading of a tc action mirred egress mirror
to a gretap or an ip6gretap netdevice, which necessitates calls to
functions defined in ip_gre, ip6_gre and ip6_tunnel modules. Previously
this was enabled by introducing a hard dependency of MLXSW_SPECTRUM on
NET_IPGRE and IPV6_GRE. However the rest of mlxsw is careful about
picking which modules are absolutely required, and therefore the better
approach is to make mlxsw_spectrum tolerant of absence of one or both of
the GRE flavors.

Hence rework the NET_IPGRE and IPV6_GRE dependencies to just guard
matching modularity, and hide the corresponding code in spectrum_span.c
in an #if IS_ENABLED. Mark mlxsw_sp_span_entry_tunnel_parms_common as
maybe unused, to muffle warnings if neither GRE flavor is selected,
which seems cleaner than introducing a composite #if.

Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum: Move mlxsw_sp_span_gretap4_route()

Move the function next to the rest of gretap4 functions. Thus the
generic functions shared between gretap4 and gretap6 are in one block at
the beginning, followed by a gretap4 block, followed by a gretap6 block.

Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ipv6: Introduce ip6_multipath_hash_policy()

In order to abstract away access to the
ipv6.sysctl.multipath_hash_policy variable, which is not available on
systems compiled without IPv6 support, introduce a wrapper function
ip6_multipath_hash_policy() that falls back to 0 on non-IPv6 systems.

Use this wrapper from mlxsw/spectrum_router instead of a direct
reference.

Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4/cxgb4vf: check fw caps to set link mode mask

Check firmware capabilities before setting the ethtool
link mode mask, and add a few missing speeds.

Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: do not display 50Gbps as unsupported speed

50Gbps is a supported speed, so stop reporting it as
an unsupported speed.

Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

liquidio: fix ndo_change_mtu to always return correct status to the caller

In a scenario where the command queued to firmware gets dropped or times
out, MTU change from host will not propagate to firmware. So, it is
required for host driver to wait for response from firmware or timeout
and then return correct status to caller of ndo_change_mtu.

Also move the common code for MTU change from the PF and VF driver files to
the common file lio_core.c.

Signed-off-by: Veerasenareddy Burru <[email protected]>
Signed-off-by: Felix Manlunas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'hns3-next'

Peng Li says:

====================
fix some bugs for HNS3 driver

This patchset fixes some bugs for HNS3 driver:
[Patch 1/12 - Patch 8/12] fix various bugs for PF driver.
[Patch 9/12 - Patch 12/12] fix issues when changing the uc mac address of
a PF/VF device to one that already exists in the mac_vlan table.
====================

Signed-off-by: David S. Miller <[email protected]>

net: hns3: add result checking for VF when modifying unicast mac address

VF changes unicast mac address by sending mailbox msg to PF, then PF
completes the mac address modification. It may fail when the target
uc mac address is already in the mac_vlan table. VF should be aware
of it by reading the message result.

Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: add existence checking before adding unicast mac address

It's not allowed to add two identical unicast mac address entries to the
mac_vlan table. When the uc mac address of a VF device is changed to the
same value as the PF device's, the PF device will lose its entry in
the mac_vlan table.

Look up the mac address in the mac_vlan table, and add it only if the
entry does not exist.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix return value error of hclge_get_mac_vlan_cmd_status()

Error code -EIO was used to indicate multiple errors in function
hclge_get_mac_vlan_cmd_status(). This patch fixes it by using
an error code that depends on the error type.

For no space error, return -ENOSPC.
For entry not found, return -ENOENT.
For command send fail, return -EIO.
For invalid op code, return -EINVAL.
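
The mapping listed above, sketched as a switch. The enum tbl_status and its values are hypothetical; the real firmware response codes are not shown in this log.

```c
#include <errno.h>

/* Hypothetical status values; the real firmware codes may differ. */
enum tbl_status {
	TBL_OK,
	TBL_NO_SPACE,
	TBL_NOT_FOUND,
	TBL_SEND_FAIL,
	TBL_BAD_OP,
};

/* Translate a table command status into a distinct negative errno. */
static int tbl_status_to_errno(enum tbl_status st)
{
	switch (st) {
	case TBL_OK:		return 0;
	case TBL_NO_SPACE:	return -ENOSPC;
	case TBL_NOT_FOUND:	return -ENOENT;
	case TBL_SEND_FAIL:	return -EIO;
	case TBL_BAD_OP:	return -EINVAL;
	}
	return -EINVAL;		/* unknown status */
}
```

Distinct codes let callers tell "entry not found" apart from a real I/O failure, which the existence checks elsewhere in this series rely on.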

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix error type definition of return value

An enum type variable was used to store an "int" type return value.
This patch fixes it.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for buffer overflow smatch warning

This patch fixes the buffer overflow warning by refactoring
hclgevf_bind_ring_to_vector and hclge_get_ring_chain_from_mbx.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for loopback failure when vlan filter is enabled

When vlan ctag filter is enabled, the loopback selftest fails because
loopback selftest does not support vlan.

This patch fixes it by disabling the vlan ctag filter when running
loopback selftest.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: add support for querying pfc pause packets statistic

This patch adds support for querying pfc pause packets statistic
in hclge_ieee_getpfc, which is used to tell the user how many pfc
pause packets have been sent and received by this mac port.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix rx path skb->truesize reporting bug

Originally, skb->truesize reported the received packet size,
not the actual buffer size the NIC driver allocated (1 page).
As a result, the Linux network stack misjudges the true size of the rx queue.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: unify the pause params setup function

Since the firmware cmd to setup mac pause params is the same as the
firmware cmd to pfc pause params, this patch unifies the pause params
setup function.

Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for ipv6 address loss problem after setting channels

dev_close and dev_open behave just like ifconfig <netif> down
and ifconfig <netif> up. The ipv6 address will be lost after dev_close and
dev_open are called. This patch uses hns3_nic_net_stop to replace dev_close
and uses hns3_nic_net_open to replace dev_open.

Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for netdev not running problem after calling net_stop and net_open

The link status update function is called by a timer every second. But
net_stop and net_open may be called at very short intervals, so the link
status update function cannot detect that the link state has changed. This
leaves the netdev in a not-running state.

This patch fixes it by updating the link state in ae_stop function.

Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: add existence check when remove old uc mac address

When the driver is in its initial state, the mac_vlan table is empty,
so the delete operation for the mac address must fail. An existence check
is needed here; otherwise, the error message will confuse the user.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'selftests-forwarding-Tweaks-and-a-new-test'

Ido Schimmel says:

====================
selftests: forwarding: Tweaks and a new test

First patch adds a new test for VLAN-unaware bridges.

Next two patches make the tests fail in case they are missing interfaces
or dependencies.

Last patch allows one to create the veth interfaces even without the
optional configuration file.
====================

Signed-off-by: David S. Miller <[email protected]>

selftests: forwarding: Allow creation of interfaces without a config file

Some users want to be able to run the tests without a configuration file,
which is useful when one needs to test both virtual and physical
interfaces on the same machine.

Move the defines that set the type of interface to create and whether to
create it away from the optional configuration file to the library like
the rest of the defines.

Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: forwarding: Exit with error when missing interfaces

Returning 0 gives a false sense of success when the required modules did
not even manage to be initialized and register the required net devices.

Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: forwarding: Exit with error when missing dependencies

We already return an error when some dependencies (e.g., 'jq') are
missing, so let's be consistent and do that for all.

Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests: forwarding: Add a test for VLAN-unaware bridge

Similar to the VLAN-aware bridge test, test the VLAN-unaware bridge and
make sure that ping, FDB learning and flooding work as expected.

Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2018-03-09

here is the current pile of qeth patches for net-next. Just the usual
small updates and clean ups. Please apply.
====================

Signed-off-by: David S. Miller <[email protected]>

s390/qeth: shrink qeth_ipaddr struct

Using up 8 bytes in every ipaddr object to store SETIP/DELIP flags is
rather wasteful. Except for takeover eligibility, the flag values all
just depend on the address type, so determine them on demand.

While at it reorder the struct to fill an alignment hole.
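
The effect of filling an alignment hole can be illustrated with two layouts of the same fields. The field names are hypothetical and are not the real qeth_ipaddr members.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Small field first: the compiler must pad before the 64-bit member. */
struct addr_before {
	uint8_t  type;
	uint64_t addr;
	bool     takeover;
};

/* Largest member first: the small fields share one padded tail. */
struct addr_after {
	uint64_t addr;
	uint8_t  type;
	bool     takeover;
};
```

On a typical 64-bit ABI the first layout occupies 24 bytes and the second 16, although exact sizes depend on the target's alignment rules.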

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: extract helpers for managing special IPs

Reduce code duplication.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: simplify card look-up on IP notification

On an IP event, current code tries to determine if the netdev belongs
to a L3 card by walking all qeth cards in the system, and then all of
their VLAN devices too. Short-cut the whole thing by identifying a L3
device through its netdev_ops.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: restructure IP notification handlers

Extract a helper that does the actual work & returns the right NOTIFY_*
responses, and start putting the temporary ipaddr container objects
on the stack rather than kmalloc'ing them. They are small, and this
reduces the confusion of which objects actually get added to qeth's
IP tables.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: reset NAPI context during queue init

init_qdio_queues() resets the Input Queue's overall QDIO state, and
positions the buffer cursor back to 0. So this is the obvious place to
also reset the queue's NAPI context (in contrast to doing it rather
randomly in the middle of the big set_online() path).
No functional change.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: reduce RX skb setup

Newly-allocated skbs default to PACKET_HOST, and eth_type_trans() is
smart enough to determine any other packet type from the frame's
destination address.
So except for the IQD sniffer case, there is no need to set up
skb->pkt_type manually.
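
The classification that makes the manual setup redundant can be approximated in userspace as below. This is a simplified sketch, not the kernel's eth_type_trans(): the enum values and the function name are made up for illustration, and the broadcast address is assumed to be all-ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-ins for the kernel's PACKET_* values. */
enum pkt_type { PKT_HOST, PKT_BROADCAST, PKT_MULTICAST, PKT_OTHERHOST };

/* Classify a frame by its destination MAC, relative to the device address. */
static enum pkt_type classify_dest(const uint8_t dest[MAC_LEN],
				   const uint8_t dev_addr[MAC_LEN])
{
	static const uint8_t bcast[MAC_LEN] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	};

	assert(dest && dev_addr);

	if (dest[0] & 0x01) {	/* group bit set: broadcast or multicast */
		if (memcmp(dest, bcast, MAC_LEN) == 0)
			return PKT_BROADCAST;
		return PKT_MULTICAST;
	}
	if (memcmp(dest, dev_addr, MAC_LEN) != 0)
		return PKT_OTHERHOST;
	return PKT_HOST;	/* the default a fresh skb already carries */
}
```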

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: allocate skb from NAPI cache

napi_alloc_skb() doesn't need to disable IRQs during the allocation,
and thus may save us a few cycles.
Doing so requires a small fix-up in the HiperTransport path, which
currently assumes a fixed NET_SKB_PAD headroom padding. napi_alloc_skb()
adds an additional NET_IP_ALIGN padding, so use the proper helper for
setting up the mac_header offset.

Use this opportunity to convert the non-NAPI path to netdev_alloc_skb(),
which means that skb->dev is now always set-up during allocation and
doesn't need to be assigned manually.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: pass correct length to header_ops->create()

We need to pass the *payload* length, not the L2 address length.
For qeth (using eth_header()) this is merely a cosmetic change:
the parameter only matters when building headers for ETH_P_802_2
or ETH_P_802_3, whereas our fake headers are built with
ETH_P_IP / ETH_P_IPV6 / 0.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: advertise IFF_UNICAST_FLT

qeth implements HW-based Unicast Filtering (via SETVMAC) on L2 devices.
Tell the stack, so it knows that receiving traffic for secondary
addresses doesn't require full-blown promiscuous mode.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: support SG for more device types

NETIF_F_SG support is currently limited to OSA (and for L2 even OSD)
devices. Advertise it for some more device types (OSM, L2 OSX, z/VM OSA)
that share the same code paths. For now, keep it switched off by
default on these devices.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: remove outdated portname debug msg

The 'portname' attribute is deprecated and setting it has no effect.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/qeth: use __ipa_cmd() for casting an IPA cmd buffer

"s390/qeth: fix SETIP command handling" introduced a new helper, apply
it driver-wide.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: introduce IFF_NO_RX_HANDLER

Some network devices - notably ipvlan slave - are not compatible with
any kind of rx_handler. Currently the hook can be installed but any
configuration (bridge, bond, macsec, ...) is nonfunctional.

This change allocates a priv_flag bit to mark such devices and explicitly
forbid installing a rx_handler if such bit is set. The new bit is used
by ipvlan slave device.
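
The check described above could look roughly like the following userspace sketch. The struct, the flag value, the function name, and the specific error codes are all assumptions made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FLAG_NO_RX_HANDLER (1u << 0)	/* hypothetical bit position */

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct fake_dev {
	unsigned int priv_flags;
	void *rx_handler;
};

/* Refuse to install a handler on devices that opted out via the flag. */
static int register_rx_handler(struct fake_dev *dev, void *handler)
{
	assert(dev && handler);

	if (dev->rx_handler)
		return -EBUSY;	/* assumed: a handler is already installed */
	if (dev->priv_flags & FLAG_NO_RX_HANDLER)
		return -EINVAL;	/* assumed error code for flagged devices */

	dev->rx_handler = handler;
	return 0;
}
```

Rejecting the registration up front means an upper device (bridge, bond, macsec, ...) fails at setup time instead of silently never seeing traffic.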

Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: increase max tx rate limit to 100 Gbps

T6 cards can support up to 100 G speeds. So, increase
max programmable tx rate limit to 100 Gbps.

Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

pktgen: Remove VLA usage

In preparation to enabling -Wvla, remove VLA usage and replace it
with a fixed-length array instead.
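
A generic illustration of this kind of conversion, not the actual pktgen change: instead of sizing a stack array from a runtime value, the buffer gets a compile-time size and the copy is clamped to it. The buffer length and function name below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEBUG_BUF_LEN 256	/* hypothetical fixed size */

/*
 * Copy at most DEBUG_BUF_LEN - 1 bytes of input into a fixed-size buffer
 * and NUL-terminate it. Returns the number of bytes copied.
 * A VLA version would have declared: char buf[count + 1];
 */
static size_t copy_for_debug(char out[DEBUG_BUF_LEN], const char *in,
			     size_t count)
{
	size_t n = count < DEBUG_BUF_LEN - 1 ? count : DEBUG_BUF_LEN - 1;

	assert(out && in);
	memcpy(out, in, n);
	out[n] = '\0';
	return n;
}
```

The trade-off is that the fixed array always occupies its full size on the stack, even for short inputs.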

Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: vhost: vsock: fixed a brace coding style issue

Fixed a coding style issue.

Signed-off-by: Vaibhav Murkute <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'hns3-fixes-for-configuration-lost-problems'

Peng Li says:

====================
fixes for configuration lost problems

This patchset refactors some functions and fixes some bugs in order
to resolve the configuration loss problem when resetting and when
setting the channel number.
====================

Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for coal configuation lost when setting the channel

This patch fixes the coalesce configuration loss problem when
setting the channel number by restoring every vector's coalesce
configuration to vector 0's, because all vectors belonging to
the same netdev have the same coalesce configuration for now.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: refactor the coalesce related struct

This patch refactors the coalesce related struct by introducing
the hns3_enet_coalesce struct, in order to fix the coalesce
configuration loss problem when changing the channel number.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for coalesce configuration lost during reset

Coalesce configuration will be set to default value by
hns3_nic_init_vector_data during reset, which causes the
coalesce configuration loss problem.

This patch fixes it by setting the default value in
hns3_nic_alloc_vector_data, which will not be called in the
reset process.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: refactor the get/put_vector function

There is a get_vector function, which allocates the vectors
for a client, but there is no put_vector function to free
them.

This patch introduces the put_vector function in order to
fix the coalesce configuration loss problem during the reset
process.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for use-after-free when setting ring parameter

In hns3_set_ringparam, hns3_uninit_all_ring frees the
memory pointed to by priv->ring_data[i].ring, and
hns3_change_all_ring_bd_num then uses that pointer without
allocating it again, which causes a use-after-free problem.

The patch fixes it by not freeing the memory in
hns3_uninit_all_ring, and uses hns3_put_ring_config to free it
when necessary.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for pause configuration lost during reset

Pause configuration will be set to default value by hclge_tm_schd_init
during reset, which causes the pause configuration loss problem.

This patch fixes it by calling hclge_tm_init_hw during the reset process,
which will not set the pause configuration to default value.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: fix for RSS configuration loss problem during reset

RSS configuration will be set to default value by hclge_rss_init_hw
during reset, which causes the RSS configuration loss problem.

This patch fixes it by setting the default value in
hclge_rss_init_cfg function, which will not be called in the reset
process.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: refactor the hclge_get/set_rss_tuple function

This patch refactors the hclge_get/set_rss_tuple function
in order to fix the rss configuration loss problem during
reset process.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: refactor the hclge_get/set_rss function

This patch refactors the hclge_get/set_rss function in
order to fix the rss configuration loss problem during
reset process.

Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'sched-action-events'

Roman Mashak says:

====================
Fix event generation for actions batch Add/Delete mode

When adding or deleting a batch of entries, the kernel sends up to
TCA_ACT_MAX_PRIO entries in an event to user space. However, it does not
consider that the action sizes may vary and require different skb sizes.

For example :

% cat tc-batch.sh
TC="sudo /mnt/iproute2.git/tc/tc"

$TC actions flush action gact
for i in `seq 1 $1`;
do
   cmd="action pass index $i "
   args=$args$cmd
done
$TC actions add $args
%
% ./tc-batch.sh 32
Error: Failed to fill netlink attributes while adding TC action.
We have an error talking to the kernel
%

This patchset introduces a new callback in tc_action_ops, which calculates
the action size, and passes the size to tcf_add_notify()/tcf_del_notify().
The patchset fixes act_gact, and the rest of the actions will be updated in
follow-up patches.

v3:
   Fixed tcf_action_fill_size() to return shared attrs length when
   action ->get_fill_size() isn't implemented.
v2:
   Restructured patches to make them bisectable.
====================

Signed-off-by: David S. Miller <[email protected]>

net sched actions: implement get_fill_size routine in act_gact

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net sched actions: calculate add/delete event message size

Introduce routines to calculate size of the shared tc netlink attributes
and the full message size including netlink header and tc service header.

Update add/delete action logic to have the size for event messages,
the size is passed to tcf_add_notify() and tcf_del_notify() where the
notification message is being allocated and constructed.
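
The size arithmetic can be sketched in userspace as follows. This is an illustration under stated assumptions, not the code from the patch: the helper names are hypothetical, and the model is simplified to one attribute per action with a variable payload. The 16-byte message header, 4-byte attribute header, and 4-byte alignment are the standard netlink values.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN4(x)	(((x) + 3u) & ~(size_t)3u)
#define NL_HDR_LEN	16u	/* netlink message header */
#define ATTR_HDR_LEN	4u	/* netlink attribute header */

/* Space for one attribute: header plus payload, padded to 4 bytes. */
static size_t attr_total_size(size_t payload)
{
	return ALIGN4(ATTR_HDR_LEN + payload);
}

/*
 * Estimate an event message size: netlink header plus service header,
 * then one attribute per action, each with its own payload size.
 */
static size_t event_msg_size(size_t svc_hdr_len, const size_t *action_sizes,
			     size_t n)
{
	size_t total = ALIGN4(NL_HDR_LEN + svc_hdr_len);
	size_t i;

	assert(action_sizes || n == 0);
	for (i = 0; i < n; i++)
		total += attr_total_size(action_sizes[i]);
	return total;
}
```

Because the total grows with each action's payload, a fixed-size skb can run out of room for a large batch, which is the failure the series addresses.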

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net sched actions: add new tc_action_ops callback

Add a new callback in tc_action_ops. It will be needed by each tc action
to compute its size when an ADD/DELETE notification message is constructed.
This routine has to take into account optional/variable size TLVs specific
to each action.

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net sched actions: update Add/Delete action API with new argument

Introduce a new function argument to carry total attributes size for
correct allocation of skb in event messages.

Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: do not create fallback tunnels for non-default namespaces

Fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0,
ip6tnl0, ip6gre0) are automatically created when the corresponding
module is loaded.

These tunnels are also automatically created when a new network
namespace is created, at a great cost.

In many cases, netns are used for isolation purposes, and these
extra network devices are a waste of resources. We are using
thousands of netns per host, and hit the netns creation/delete
bottleneck a lot. (Many thanks to Kirill for recent work on this)

Add a new sysctl so that we can opt-out from this automatic creation.

Note that these tunnels are still created for the initial namespace,
to be the least intrusive for typical setups.

Tested:
lpk43:~# cat add_del_unshare.sh
for i in `seq 1 40`
do
(for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
done
wait

lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net
lpk43:~# time ./add_del_unshare.sh

real 0m37.521s
user 0m0.886s
sys 7m7.084s
lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net
lpk43:~# time ./add_del_unshare.sh

real 0m4.761s
user 0m0.851s
sys 1m8.343s
lpk43:~#

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tools: tc-testing: Can pause just before post-suite

With option -P, the test script will pause just before
the post_suite functions are called. This allows the tester to
inspect the system before it is torn down.

Signed-off-by: Brenda J. Butler <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tools: tc-testing: Can refer to $TESTID in test spec

When processing the commands in the test cases, substitute
the test id for $TESTID. This helps to make more flexible
tests. For example, the testid can be given as a command
line argument.

As an example, if we wish to save the test output to a file
named for the test case, we can write in the test case:

"cmdUnderTest": "some test command | tee -a $TESTID.out"

Signed-off-by: Brenda J. Butler <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: Fix irq free'ing

Call the common irq free function, rather than going recursive and
blowing away the stack, followed by the machine.

Fixes: 294d711ee8c0 ("net: dsa: mv88e6xxx: Poll when no interrupt defined")
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tc-testing: add csum tests

Signed-off-by: Roman Mashak <[email protected]>
Tested-by: Davide Caratti <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: usb: asix88179_178a: de-duplicate code

Remove the duplicated code for asix88179_178a bind and reset methods.

Signed-off-by: Alexander Kurz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: usb: asix88179_178a: set permanent address once only

The permanent address of asix88179_178a devices is read at probe time
and should not be overwritten later. Otherwise it may be overwritten
unintentionally with a configured address.

Signed-off-by: Alexander Kurz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ntuple-filters-with-RSS'

Edward Cree says:

====================
ntuple filters with RSS

This series introduces the ability to mark an ethtool steering filter to use
RSS spreading, and the ability to create and configure multiple RSS contexts
with different indirection tables, hash keys, and hash fields.
An implementation for the sfc driver (for 7000-series and later SFC NICs) is
included in patch 2/2.

The anticipated use case of this feature is for steering traffic destined for
a container (or virtual machine) to the subset of CPUs on which processes in
the container (or the VM's vCPUs) are bound, while retaining the scalability
of RSS spreading from the viewpoint inside the container.
The use of both a base queue number (ring_cookie) and indirection table is
intended to allow re-use of a single RSS context to target multiple sets of
CPUs. For instance, if an 8-core system is hosting three containers on CPUs
[1,2], [3,4] and [6,7], then a single RSS context with an equal-weight [0,1]
indirection table could be used to target all three containers by setting
ring_cookie to 1, 3 and 6 on the respective filters.
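
The arithmetic of this example can be sketched as below. The struct and function names are hypothetical, and selecting the table entry with `hash % indir_len` is an assumption made for illustration; real NICs choose the entry in a hardware-specific way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical RSS context: an indirection table of queue offsets. */
struct rss_ctx {
	const uint32_t *indir;
	size_t indir_len;
};

/* Target queue = the filter's base queue (ring_cookie) + table offset. */
static uint32_t rss_target_queue(const struct rss_ctx *ctx,
				 uint32_t ring_cookie, uint32_t hash)
{
	assert(ctx && ctx->indir && ctx->indir_len > 0);
	return ring_cookie + ctx->indir[hash % ctx->indir_len];
}
```

With the [0,1] table, a filter whose ring_cookie is 3 can only ever land on queues 3 and 4, so one context serves all three containers.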

v2: Initialised ctx in efx_ef10_filter_insert() to avoid (false positive) gcc
warning.
====================

Signed-off-by: David S. Miller <[email protected]>