Git Repo - linux.git/log

Merge branch 'improve-test-coverage-sparc'

David Miller says:

====================
On sparc64 a ton of test cases in test_verifier.c fail because
the memory accesses in the test case are unaligned (or cannot
be proven to be aligned by the verifier).

Perhaps we can eventually try to (carefully) modify each test case
which has this problem to not use unaligned accesses but:

1) That is delicate work.

2) The changes might not fully respect the original
   intention of the testcase.

3) In some cases, such a transformation might not even
   be feasible at all.

So add an "any alignment" flag to tell the verifier to forcefully
disable it's alignment checks completely.

test_verifier.c is then annotated to use this flag when necessary.

The presence of the flag in each test case is good documentation to
anyone who wants to actually tackle the job of eliminating the
unaligned memory accesses in the test cases.

I've also seen several weird things in test cases, like trying to
access __skb->mark in a packet buffer.

This gets rid of 104 test_verifier.c failures on sparc64.

Changes since v1:

1) Explain the new BPF_PROG_LOAD flag in easier to understand terms.
   Suggested by Alexei.

2) Make bpf_verify_program() just take a __u32 prog_flags instead of
   just accumulating boolean arguments over and over.  Also suggested
   by Alexei.

Changes since RFC:

1) Only the admin can allow the relaxation of alignment restrictions
   on inefficient unaligned access architectures.

2) Use F_NEEDS_EFFICIENT_UNALIGNED_ACCESS instead of making a new
   flag.

3) Annotate in the output, when we have a test case that the verifier
   accepted but we did not try to execute because we are on an
   inefficient unaligned access platform.  Maybe with some arch
   machinery we can avoid this in the future.

Signed-off-by: David S. Miller <[email protected]>
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Apply F_NEEDS_EFFICIENT_UNALIGNED_ACCESS to more ACCEPT test cases.

If a testcase has alignment problems but is expected to be ACCEPT,
verify it using F_NEEDS_EFFICIENT_UNALIGNED_ACCESS too.

Maybe in the future if we add some architecture specific code to elide
the unaligned memory access warnings during the test, we can execute
these as well.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Make more use of 'any' alignment in test_verifier.c

Use F_NEEDS_EFFICIENT_UNALIGNED_ACCESS in more tests where the
expected result is REJECT.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Adjust F_NEEDS_EFFICIENT_UNALIGNED_ACCESS handling in test_verifier.c

Make it set the flag argument to bpf_verify_program() which will relax
the alignment restrictions.

Now all such test cases will go properly through the verifier even on
inefficient unaligned access architectures.

On inefficient unaligned access architectures do not try to run such
programs, instead mark the test case as passing but annotate the
result similarly to how it is done now in the presence of this flag.

So, we get complete full coverage for all REJECT test cases, and at
least verifier level coverage for ACCEPT test cases.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Add BPF_F_ANY_ALIGNMENT.

Often we want to write tests cases that check things like bad context
offset accesses. And one way to do this is to use an odd offset on,
for example, a 32-bit load.

This unfortunately triggers the alignment checks first on platforms
that do not set CONFIG_EFFICIENT_UNALIGNED_ACCESS. So the test
case see the alignment failure rather than what it was testing for.

It is often not completely possible to respect the original intention
of the test, or even test the same exact thing, while solving the
alignment issue.

Another option could have been to check the alignment after the
context and other validations are performed by the verifier, but
that is a non-trivial change to the verifier.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: Fix verifier log string check for bad alignment.

The message got changed a lot time ago.

This was responsible for 36 test case failures on sparc64.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

Merge branch 'bpftool-fixes'

Quentin Monnet says:

====================
Hi,
Several items for bpftool are included in this set: the first three patches
are fixes for bpftool itself and bash completion, while the last two
slightly improve the information obtained when dumping programs or maps, on
Daniel's suggestion. Please refer to individual commit logs for more
details.
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

tools: bpftool: add owner_prog_type and owner_jited to bpftool output

For prog array maps, the type of the owner program, and the JIT-ed state
of that program, are available from the file descriptor information
under /proc. Add them to "bpftool map show" output. Example output:

    # bpftool map show
    158225: prog_array  name jmp_table  flags 0x0
        key 4B  value 4B  max_entries 8  memlock 4096B
        owner_prog_type flow_dissector  owner jited
    # bpftool --json --pretty map show
    [{
            "id": 1337,
            "type": "prog_array",
            "name": "jmp_table",
            "flags": 0,
            "bytes_key": 4,
            "bytes_value": 4,
            "max_entries": 8,
            "bytes_memlock": 4096,
            "owner_prog_type": "flow_dissector",
            "owner_jited": true
        }
    ]

As we move the table used for associating names to program types,
complete it with the missing types (lwt_seg6local and sk_reuseport).
Also add missing types to the help message for "bpftool prog"
(sk_reuseport and flow_dissector).

Suggested-by: Daniel Borkmann <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

tools: bpftool: mark offloaded programs more explicitly in plain output

In bpftool (plain) output for "bpftool prog show" or "bpftool map show",
an offloaded BPF object is simply denoted with "dev ifname", which is
not really explicit. Change it with something that clearly shows the
program is offloaded.

While at it also add an additional space, as done between other
information fields.

Example output, before:

    # bpftool prog show
    1337: xdp  tag a04f5eef06a7f555 dev foo
            loaded_at 2018-10-19T16:40:36+0100  uid 0
            xlated 16B  not jited  memlock 4096B

After:

    # bpftool prog show
    1337: xdp  tag a04f5eef06a7f555  offloaded_to foo
            loaded_at 2018-10-19T16:40:36+0100  uid 0
            xlated 16B  not jited  memlock 4096B

Suggested-by: Daniel Borkmann <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

tools: bpftool: fix bash completion for new map types (queue and stack)

Commit 197c2dac74e4 ("bpf: Add BPF_MAP_TYPE_QUEUE and BPF_MAP_TYPE_STACK
to bpftool-map") added support for queue and stack eBPF map types in
bpftool map handling. Let's update the bash completion accordingly.

Fixes: 197c2dac74e4 ("bpf: Add BPF_MAP_TYPE_QUEUE and BPF_MAP_TYPE_STACK to bpftool-map")
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

tools: bpftool: fix bash completion for bpftool prog (attach|detach)

Fix bash completion for "bpftool prog (attach|detach) PROG TYPE MAP" so
that the list of indices proposed for MAP are map indices, and not PROG
indices. Also use variables for map and prog reference types ("id",
"pinned", and "tag" for programs).

Fixes: b7d3826c2ed6 ("bpf: bpftool, add support for attaching programs to maps")
Signed-off-by: Quentin Monnet <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

tools: bpftool: use "/proc/self/" i.o. crafting links with getpid()

The getpid() function is called in a couple of places in bpftool to
craft links of the shape "/proc/<pid>/...". Instead, it is possible to
use the "/proc/self/" shortcut, which makes things a bit easier, in
particular in jit_disasm.c.

Do the replacement, and remove the includes of <sys/types.h> from the
relevant files, now we do not use getpid() anymore.

Signed-off-by: Quentin Monnet <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

arm64/bpf: use movn/movk/movk sequence to generate kernel addresses

On arm64, all executable code is guaranteed to reside in the vmalloc
space (or the module space), and so jump targets will only use 48
bits at most, and the remaining bits are guaranteed to be 0x1.

This means we can generate an immediate jump address using a sequence
of one MOVN (move wide negated) and two MOVK instructions, where the
first one sets the lower 16 bits but also sets all top bits to 0x1.

Signed-off-by: Ard Biesheuvel <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next 2018-11-30

The following pull-request contains BPF updates for your *net-next* tree.

(Getting out bit earlier this time to pull in a dependency from bpf.)

The main changes are:

1) Add libbpf ABI versioning and document API naming conventions
   as well as ABI versioning process, from Andrey.

2) Add a new sk_msg_pop_data() helper for sk_msg based BPF
   programs that is used in conjunction with sk_msg_push_data()
   for adding / removing meta data to the msg data, from John.

3) Optimize convert_bpf_ld_abs() for 0 offset and fix various
   lib and testsuite build failures on 32 bit, from David.

4) Make BPF prog dump for !JIT identical to how we dump subprogs
   when JIT is in use, from Yonghong.

5) Rename btf_get_from_id() to make it more conform with libbpf
   API naming conventions, from Martin.

6) Add a missing BPF kselftest config item, from Naresh.
====================

Signed-off-by: David S. Miller <[email protected]>

tools/bpf: make libbpf _GNU_SOURCE friendly

During porting libbpf to bcc, I got some warnings like below:
  ...
  [  2%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf.c.o
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf.c:12:0:
  warning: "_GNU_SOURCE" redefined [enabled by default]
   #define _GNU_SOURCE
  ...
  [  3%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf_errno.c.o
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c: In function ‘libbpf_strerror’:
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c:45:7:
  warning: assignment makes integer from pointer without a cast [enabled by default]
     ret = strerror_r(err, buf, size);
  ...

bcc is built with _GNU_SOURCE defined and this caused the above warning.
This patch intends to make libpf _GNU_SOURCE friendly by
  . define _GNU_SOURCE in libbpf.c unless it is not defined
  . undefine _GNU_SOURCE as non-gnu version of strerror_r is expected.

Signed-off-by: Yonghong Song <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

net: Don't default Aquantia USB driver to 'y'

Reported-by: Pavel Machek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: explain __skb_checksum_complete() with comments

Cc: Herbert Xu <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: remove loop to compute wscale

We can remove the loop and conditional branches
and compute wscale efficiently thanks to ilog2()

Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qede - Add a statistic for a case where driver drops tx packet due to memory allocation failure.

skb_linearization can fail due to memory allocation failure.
In such a case, the driver will drop the packet. In such a case
The driver used to print an error message.
This patch replaces this error message by a dedicated statistic.

Signed-off-by: Michael Shteinbok <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Add "fall through" comments

Add comments in the switch statement for XDP action to indicate
fallthrough is intended.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ave-suspend-resume'

Kunihiko Hayashi says:

====================
Add suspend/resume support for AVE ethernet driver

This series adds support for suspend/resume to AVE ethernet driver.

And to avoid the error that wol state of phy hardware is enabled by default,
this sets initial wol state to disabled and add preservation the state in
suspend/resume sequence.
====================

Signed-off-by: David S. Miller <[email protected]>

net: ethernet: ave: Preserve wol state in suspend/resume sequence

Since the wol state forces to be initialized after reset, the state should
be preserved in suspend/resume sequence.

Signed-off-by: Kunihiko Hayashi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: ave: Set initial wol state to disabled

If wol state of phy hardware is enabled after reset, phy_ethtool_get_wol()
returns that wol.wolopts is true.

However, since net_device.wol_enabled is zero and this doesn't apply wol
state until calling ethtool_set_wol(), so mdio_bus_phy_may_suspend()
returns true, that is, it's in a state where phy can suspend even though
wol state is enabled.

In this inconsistency, phy_suspend() returns -EBUSY, and at last,
suspend sequence fails with the following message:

    dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x58 returns -16
    PM: Device 65000000.ethernet-ffffffff:01 failed to suspend: error -16
    PM: Some devices failed to suspend, or early wake event detected

In order to fix the above issue, this patch forces to set initial wol state
to disabled as default.

Signed-off-by: Kunihiko Hayashi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: ave: Add suspend/resume support

This patch introduces suspend and resume functions to ave driver.

Signed-off-by: Kunihiko Hayashi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'linux-can-next-for-4.21-20181128' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
This is a pull request for net-next/master consisting of 18 patches.

The first patch is by Colin Ian King and fixes the spelling in the ucan
driver.

The next three patches target the xilinx driver. YueHaibing's patch
fixes the return type of ndo_start_xmit function. Two patches by
Shubhrajyoti Datta add support for the CAN FD 2.0 controllers.

Flavio Suligoi's patch for the sja1000 driver add support for the ASEM
CAN raw hardware.

Wolfram Sang's and Kuninori Morimoto's patches switch the rcar driver to
use SPDX license identifiers.

The remaining 111 patches improve the flexcan driver. Pankaj Bansal's
patch enables the driver in Kconfig on all architectures with IOMEM
support. The next four patches by me fix indention, add missing
parentheses and comments. Aisheng Dong's patches add self wake support
and document it in the DT bindings. The remaining patches by Pankaj
Bansal first fix the loopback support and prepare the driver for the
CAN-FD support needed for the LX2160A SoC. The actual CAN-FD support
will be added in a later patch series.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Trivial conflict in net/core/filter.c, a locally computed
'sdif' is now an argument to the function.

Signed-off-by: David S. Miller <[email protected]>

bpf: Fix various lib and testsuite build failures on 32-bit.

Cannot cast a u64 to a pointer on 32-bit without an intervening (long)
cast otherwise GCC warns.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Song Liu <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: add config fragment CONFIG_FTRACE_SYSCALLS

CONFIG_FTRACE_SYSCALLS=y is required for get_cgroup_id_user test case
this test reads a file from debug trace path
/sys/kernel/debug/tracing/events/syscalls/sys_enter_nanosleep/id

Signed-off-by: Naresh Kamboju <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

Merge branch 'bpf-sk-msg-pop-data'

John Fastabend says:

====================
After being able to add metadata to messages with sk_msg_push_data we
have also found it useful to be able to "pop" this metadata off before
sending it to applications in some cases. This series adds a new helper
sk_msg_pop_data() and the associated patches to add tests and tools/lib
support.

Thanks!

v2: Daniel caught that we missed adding sk_msg_pop_data to the changes
    data helper so that the verifier ensures BPF programs revalidate
    data after using this helper. Also improve documentation adding a
    return description and using RST syntax per Quentin's comment. And
    delta calculations for DROP with pop'd data (albeit a strange set
    of operations for a program to be doing) had potential to be
    incorrect possibly confusing user space applications, so fix it.
====================

Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: test_sockmap, add options for msg_pop_data() helper

Similar to msg_pull_data and msg_push_data add a set of options to
have msg_pop_data() exercised.

Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: add msg_pop_data helper to tools

Add the necessary header definitions to tools for new
msg_pop_data_helper.

Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

bpf: helper to pop data from messages

This adds a BPF SK_MSG program helper so that we can pop data from a
msg. We use this to pop metadata from a previous push data call.

Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) ARM64 JIT fixes for subprog handling from Daniel Borkmann.

2) Various sparc64 JIT bug fixes (fused branch convergance, frame
    pointer usage detection logic, PSEODU call argument handling).

3) Fix to use BH locking in nf_conncount, from Taehee Yoo.

4) Fix race of TX skb freeing in ipheth driver, from Bernd Eckstein.

5) Handle return value of TX NAPI completion properly in lan743x
    driver, from Bryan Whitehead.

6) MAC filter deletion in i40e driver clears wrong state bit, from
    Lihong Yang.

7) Fix use after free in rionet driver, from Pan Bian.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
  s390/qeth: fix length check in SNMP processing
  net: hisilicon: remove unexpected free_netdev
  rapidio/rionet: do not free skb before reading its length
  i40e: fix kerneldoc for xsk methods
  ixgbe: recognize 1000BaseLX SFP modules as 1Gbps
  i40e: Fix deletion of MAC filters
  igb: fix uninitialized variables
  netfilter: nf_tables: deactivate expressions in rule replecement routine
  lan743x: Enable driver to work with LAN7431
  tipc: fix lockdep warning during node delete
  lan743x: fix return value for lan743x_tx_napi_poll
  net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"
  qed: fix spelling mistake "attnetion" -> "attention"
  net: thunderx: fix NULL pointer dereference in nic_remove
  sctp: increase sk_wmem_alloc when head->truesize is increased
  firestream: fix spelling mistake: "Inititing" -> "Initializing"
  net: phy: add workaround for issue where PHY driver doesn't bind to the device
  usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
  sparc: Adjust bpf JIT prologue for PSEUDO calls.
  bpf, doc: add entries of who looks over which jits
  ...

Merge tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fixes from Max Filippov:

- fix kernel exception on userspace access to a currently disabled
   coprocessor

- fix coprocessor data saving/restoring in configurations with multiple
   coprocessors

- fix ptrace access to coprocessor data on configurations with multiple
   coprocessors with high alignment requirements

* tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: fix coprocessor part of ptrace_{get,set}xregs
  xtensa: fix coprocessor context offset definitions
  xtensa: enable coprocessors that are being flushed

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Fixes 2018-11-28

This series contains fixes to igb, ixgbe and i40e.

Yunjian Wang from Huawei resolves a variable that could potentially be
NULL before it is used.

Lihong fixes an i40e issue which goes back to 4.17 kernels, where
deleting any of the MAC filters was causing the incorrect syncing for
the PF.

Josh Elsasser caught that there were missing enum values in the link
capabilities for x550 devices, which was preventing link for 1000BaseLX
SFP modules.

Jan fixes the function header comments for XSK methods.
====================

Signed-off-by: David S. Miller <[email protected]>

s390/qeth: fix length check in SNMP processing

The response for a SNMP request can consist of multiple parts, which
the cmd callback stages into a kernel buffer until all parts have been
received. If the callback detects that the staging buffer provides
insufficient space, it bails out with error.
This processing is buggy for the first part of the response - while it
initially checks for a length of 'data_len', it later copies an
additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.

Fix the calculation of 'data_len' for the first part of the response.
This also nicely cleans up the memcpy code.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Julian Wiedmann <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: qualcomm: rmnet: remove set but not used variables 'ip_family, fc_seq, qos_id'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:26:6:
warning: variable 'ip_family' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:27:6:
warning: variable 'fc_seq' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:28:6:
warning: variable 'qos_id' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit
ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qlcnic: remove set but not used variables 'cur_rings, max_hw_rings, tx_desc_info'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:4011:5:
warning: variable 'max_hw_rings' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:4013:6:
warning: variable 'cur_rings' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:2996:25:
warning: variable 'tx_desc_info' set but not used [-Wunused-but-set-variable]

'cur_rings, max_hw_rings' never used since introduction in commit
34e8c406fda5 ("qlcnic: refactor Tx/SDS ring calculation and validation in driver.")
'tx_desc_info' never used since commit
95b3890ae39f ("qlcnic: Enhance Tx timeout debugging.")
Also 'queue_type' only can be QLCNIC_RX_QUEUE/QLCNIC_TX_QUEUE,
so make a trival cleanup on if statement.

Signed-off-by: YueHaibing <[email protected]>
Acked-by: Shahed Shaikh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: neterion: vxge: remove set but not used variables 'max_frags' and 'txdl_priv'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/neterion/vxge/vxge-traffic.c:1698:35:
warning: variable 'txdl_priv' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/neterion/vxge/vxge-traffic.c:1699:6:
warning: variable 'max_frags' set but not used [-Wunused-but-set-variable]

It never used since introduction in
commit 113241321dcd ("Neterion: New driver: Traffic & alarm handler")

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Disable BH while holding list spinlock in nf_conncount, from
   Taehee Yoo.

2) List corruption in nf_conncount, also from Taehee.

3) Fix race that results in leaving around an empty list node in
   nf_conncount, from Taehee Yoo.

4) Proper chain handling for inactive chains from the commit path,
   from Florian Westphal. This includes a selftest for this.

5) Do duplicate rule handles when replacing rules, also from Florian.

6) Remove net_exit path in xt_RATEEST that results in splat, from Taehee.

7) Possible use-after-free in nft_compat when releasing extensions.
   From Florian.

8) Memory leak in xt_hashlimit, from Taehee.

9) Call ip_vs_dst_notifier after ipv6_dev_notf, from Xin Long.

10) Fix cttimeout with udplite and gre, from Florian.

11) Preserve oif for IPv6 link-local generated traffic from mangle
    table, from Alin Nastac.

12) Missing error handling in masquerade notifiers, from Taehee Yoo.

13) Use mutex to protect registration/unregistration of masquerade
    extensions in order to prevent a race, from Taehee.

14) Incorrect condition check in tree_nodes_free(), also from Taehee.

15) Fix chain counter leak in rule replacement path, from Taehee.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'dpaa2-eth-Introduce-XDP-support'

Ioana Ciocoi Radulescu says:

====================
dpaa2-eth: Introduce XDP support

Add support for XDP programs. Only XDP_PASS, XDP_DROP and XDP_TX
actions are supported for now. Frame header changes are also
allowed.

v2: - count the XDP packets in the rx/tx inteface stats
- add message with the maximum supported MTU value for XDP
====================

Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Add xdp counters

Add counters for xdp processed frames to the channel statistics.

Signed-off-by: Ioana Radulescu <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Cleanup channel stats

Remove unused counter. Reorder fields in channel stats structure
to match the ethtool strings order and make it easier to print them
with ethtool -S.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Add support for XDP_TX

Send frames back on the same port for XDP_TX action.
Since the frame buffers have been allocated by us, we can recycle
them directly into the Rx buffer pool instead of requesting a
confirmation frame upon transmission complete.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Map Rx buffers as bidirectional

In order to support enqueueing Rx FDs back to hardware, we need to
DMA map Rx buffers as bidirectional.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Release buffers back to pool on XDP_DROP

Instead of freeing the RX buffers, release them back into the pool.
We wait for the maximum number of buffers supported by a single
release command to accumulate before issuing the command.

Also, don't unmap the Rx buffers at the beginning of the Rx routine
anymore, since that would require remapping them before release.
Instead, just do a DMA sync at first and only unmap if the frame is
meant for the stack.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Move function

We'll use function free_bufs() on the XDP path as well, so move
it higher in order to avoid a forward declaration.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Allow XDP header adjustments

Reserve XDP_PACKET_HEADROOM bytes in Rx buffers to allow XDP
programs to increase frame header size.

Signed-off-by: Ioana Radulescu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dpaa2-eth: Add basic XDP support

We keep one XDP program reference per channel. The only actions
supported for now are XDP_DROP and XDP_PASS.

Until now we didn't enforce a maximum size for Rx frames based
on MTU value. Change that, since for XDP mode we must ensure no
scatter-gather frames can be received.

Signed-off-by: Ioana Radulescu <[email protected]>
Acked-by: Camelia Groza <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hisilicon: remove unexpected free_netdev

The net device ndev is freed via free_netdev when failing to register
the device. The control flow then jumps to the error handling code
block. ndev is used and freed again. Resulting in a use-after-free bug.

Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rapidio/rionet: do not free skb before reading its length

skb is freed via dev_kfree_skb_any, however, skb->len is read then. This
may result in a use-after-free bug.

Fixes: e6161d64263 ("rapidio/rionet: rework driver initialization and removal")
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

i40e: fix kerneldoc for xsk methods

One method, xsk_umem_setup, had an incorrect kernel doc
description, which has been corrected.

Also fixes small typos found in the comments.

Signed-off-by: Jan Sokolowski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

Merge tag 'for-4.20-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
"Some of these bugs are being hit during testing so we'd like to get
  them merged, otherwise there are usual stability fixes for stable
  trees"

* tag 'for-4.20-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: relocation: set trans to be NULL after ending transaction
  Btrfs: fix race between enabling quotas and subvolume creation
  Btrfs: send, fix infinite loop due to directory rename dependencies
  Btrfs: ensure path name is null terminated at btrfs_control_ioctl
  Btrfs: fix rare chances for data loss when doing a fast fsync
  btrfs: Always try all copies when reading extent buffers

Merge tag 'spi-fix-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A few driver specific fixes here, nothing big or that stands out for
  anyone other than the driver users.

  The omap2-mcspi fix is for issues that started showing up with a
  change in defconfig in this release to make cpuidle get turned on by
  default"

* tag 'spi-fix-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: omap2-mcspi: Add missing suspend and resume calls
  spi: mediatek: use correct mata->xfer_len when in fifo transfer
  spi: uniphier: fix incorrect property items

ixgbe: recognize 1000BaseLX SFP modules as 1Gbps

Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules,
allowing the core driver code to establish a link over this SFP type.

This is done by the out-of-tree driver but the fix wasn't in mainline.

Fixes: e23f33367882 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+”)
Fixes: 6a14ee0cfb19 ("ixgbe: Add X550 support function pointers")
Signed-off-by: Josh Elsasser <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Bugfixes, many of them reported by syzkaller and mostly predating the
  merge window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
  kvm: mmu: Fix race in emulated page table writes
  KVM: nVMX: vmcs12 revision_id is always VMCS12_REVISION even when copied from eVMCS
  KVM: nVMX: Verify eVMCS revision id match supported eVMCS version on eVMCS VMPTRLD
  KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset
  x86/kvm/vmx: fix old-style function declaration
  KVM: x86: fix empty-body warnings
  KVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes
  KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall
  KVM: nVMX: Fix kernel info-leak when enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS more than once
  svm: Add mutex_lock to protect apic_access_page_done on AMD systems
  KVM: X86: Fix scan ioapic use-before-initialization
  KVM: LAPIC: Fix pv ipis use-before-initialization
  KVM: VMX: re-add ple_gap module parameter
  KVM: PPC: Book3S HV: Fix handling for interrupted H_ENTER_NESTED

i40e: Fix deletion of MAC filters

In __i40e_del_filter function, the flag __I40E_MACVLAN_SYNC_PENDING for
the PF state is wrongly set for the VSI. Deleting any of the MAC filters
has caused the incorrect syncing for the PF. Fix it by setting this state
flag to the intended PF.

CC: stable <[email protected]>
Signed-off-by: Lihong Yang <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb: fix uninitialized variables

This patch fixes the variable 'phy_word' may be used uninitialized.

Signed-off-by: Yunjian Wang <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

can: flexcan: split the Message Buffer RAM area

The message buffer RAM area is not a contiguous 1KB area but 2 partitions
of 512 bytes each. Till now, we used Message buffers with payload size 8
bytes, which translates to 32 MBs per partition and no spare space is left
in any partition.
However, in upcoming SOC LX2160A the message buffers can have payload size
64 bytes. This results in less than 32 MBs per partition and some empty
area is left at the end of each partition.This empty area should not be
accessed.
Therefore, split the Message Buffer RAM area into two partitions.

Signed-off-by: Pankaj Bansal <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: Add provision for variable payload size

Till now the flexcan module supported 8 byte payload size as per CAN 2.0
specifications. But now upcoming flexcan module in NXP LX2160A SOC
supports CAN FD protocol too. The Message buffers need to be configured
to have payload size 64 bytes.

Therefore, added provision in the driver for payload size to be 64 bytes.

Signed-off-by: Pankaj Bansal <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: move rx_offload_add() from flexcan_probe() to flexcan_open()

rx offload depends on number of message buffers, which in turn depends
on messgae buffer size. with the upcoming LX2160A SOC the message buffer
size can be configured to 72 bytes if it were to be used in CAN FD mode.

The current mode in which the flexcan is being operated is known at the
time of flexcan_open() but not at the time of flexcan_probe().

Therefore, move the rx_offload_add() from flexcan_probe() to
flexcan_open(). correspondingly, move rx_offload_delete() from
flexcan_remove() to flexcan_close().

Signed-off-by: Pankaj Bansal <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: flexcan_chip_start(): enable loopback mode in flexcan

Self reception disable bit needs to be cleared for loopback mode to work
in flexcan.

Signed-off-by: Pankaj Bansal <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: add self wakeup support

If wakeup is enabled, enter stop mode, else enter disabled mode. Self wake
can only work on stop mode.

Starting from IMX6, the flexcan stop mode control bits is SoC specific,
move it out of IP driver and parse it from devicetree.

Signed-off-by: Aisheng Dong <[email protected]>
Signed-off-by: Joakim Zhang <[email protected]>
Reviewed-by: Dong Aisheng <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

dt-bindings: can: flexcan: add stop mode property to device tree

The FlexCAN controller can parse the stop mode property to enable CAN
self wakeup feature.

Signed-off-by: Aisheng Dong <[email protected]>
Signed-off-by: Joakim Zhang <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: flexcan_chip_start(): adjust comment to match the code

With the conversion of the flexcan driver to support both timestamp and
FIFO mode the setup of the MCR register ("enable fifo") has been moved.

This patch moves the comment too, in order to match the code again.

Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: FLEXCAN_IFLAG_MB: add () around macro argument

This patch fixes the following checkpatch warning:

| Macro argument 'x' may be better as '(x)' to avoid precedence issues

Fixes: cbffaf7aa09e ("can: flexcan: Always use last mailbox for TX")
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: flexcan_irq(): fix indention

The patch fixes the indention by replacing space by tabs, as noted by
checkpatch:

| ERROR: code indent should use tabs where possible
| #980: FILE: drivers/net/can/flexcan.c:980:

Fixes: da49a8075c00 ("can: flexcan: implement error passive state quirk")
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: flexcan_start_xmit(): fix indention

This patch fixes the indentio nin flexcan_start_xmit() by alligning the
code to the opening parenthesis.

Signed-off-by: Marc Kleine-Budde <[email protected]>

can: flexcan: enable flexcan for all architectures

flexcan is an IP module independent of the SOC architecture.
Therefore, enable it for all architectures.

Signed-off-by: Pankaj Bansal <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: rcar: add SPDX identifiers to Kconfig and Makefile

This patch adds SPDX-License-Identifier to Kconfig and Makefile.

Signed-off-by: Kuninori Morimoto <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Acked-by: Ramesh Shanmugasundaram <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: rcar: use SPDX identifier for Renesas drivers

Adopt the SPDX license identifier headers to ease license compliance
management.

Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: sja1000: plx_pci: add support for ASEM CAN raw device

This patch adds support for ASEM opto-isolated dual channels
CAN raw device (http://www.asem.it)

Signed-off-by: Flavio Suligoi <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: xilinx: add can 2.0 support

Add support for can 2.0.

Signed-off-by: Shubhrajyoti Datta <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

dt-bindings: can: xilinx_can: add Xilinx CAN FD 2.0 bindings

Add compatible string and new attributes to support the Xilinx CAN
FD 2.0.

Signed-off-by: Shubhrajyoti Datta <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: xilinx: fix return type of ndo_start_xmit function

The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

can: ucan: fix spelling mistake: "resumbmitting" -> "resubmitting"

Trivial fix to spelling mistake in netdev_dbg error message

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

netfilter: nf_tables: deactivate expressions in rule replecement routine

There is no expression deactivation call from the rule replacement path,
hence, chain counter is not decremented. A few steps to reproduce the
problem:

   %nft add table ip filter
   %nft add chain ip filter c1
   %nft add chain ip filter c1
   %nft add rule ip filter c1 jump c2
   %nft replace rule ip filter c1 handle 3 accept
   %nft flush ruleset

<jump c2> expression means immediate NFT_JUMP to chain c2.
Reference count of chain c2 is increased when the rule is added.

When rule is deleted or replaced, the reference counter of c2 should be
decreased via nft_rule_expr_deactivate() which calls
nft_immediate_deactivate().

Splat looks like:
[  214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[  214.398983] Modules linked in: nf_tables nfnetlink
[  214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44
[  214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
[  214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[  214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8
[  214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202
[  214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0
[  214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78
[  214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000
[  214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba
[  214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6
[  214.398983] FS:  0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000
[  214.398983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0
[  214.398983] Call Trace:
[  214.398983]  ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables]
[  214.398983]  ? __kasan_slab_free+0x145/0x180
[  214.398983]  ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables]
[  214.398983]  ? kfree+0xdb/0x280
[  214.398983]  nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables]
[ ... ]

Fixes: bb7b40aecbf7 ("netfilter: nf_tables: bogus EBUSY in chain deletions")
Reported by: Christoph Anton Mitterer <[email protected]>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

net/ipv4: Fix missing raw_init when CONFIG_PROC_FS is disabled

Randy reported when CONFIG_PROC_FS is not enabled:
ld: net/ipv4/af_inet.o: in function `inet_init':
af_inet.c:(.init.text+0x42d): undefined reference to `raw_init'

Fix by moving the endif up to the end of the proc entries

Fixes: 6897445fb194c ("net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs")
Reported-by: Randy Dunlap <[email protected]>
Cc: Mike Manning <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'bnx2x-Popoulate-firmware-versions-in-driver-info-query'

Sudarsana Reddy Kalluru says:

====================
bnx2x: Popoulate firmware versions in driver info query.

The patch series populates MBI and storm firware versions in the ethtool
driver info query.
====================

Signed-off-by: David S. Miller <[email protected]>

bnx2x: Add storm FW version to ethtool driver query output.

The patch populates the Storm FW version in the ethtool driver query data.

Signed-off-by: Sudarsana Reddy Kalluru <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bnx2x: Add MBI version to ethtool driver query output.

The patch populates the MBI version in the ethtool driver query data.
Adding 'extended_dev_info_shared_cfg' structure describing the nvram
structure, this is required to access the mbi version string.

Signed-off-by: Sudarsana Reddy Kalluru <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: remove hdrlen argument from tcp_queue_rcv()

Only one caller needs to pull TCP headers, so lets
move __skb_pull() to the caller side.

Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ncsi: Add NCSI Mellanox OEM command

This patch adds OEM Mellanox commands and response handling. It also
defines OEM Get MAC Address handler to get and configure the device.

ncsi_oem_gma_handler_mlx: This handler send NCSI mellanox command for
getting mac address.
ncsi_rsp_handler_oem_mlx: This handles response received for all
mellanox OEM commands.
ncsi_rsp_handler_oem_mlx_gma: This handles get mac address response and
set it to device.

Signed-off-by: Vijay Khemka <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

r8169: remove unneeded mmiowb barriers

writex() has implicit barriers, that's what makes it different from
writex_relaxed(). Therefore these calls to mmiowb() can be removed.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns3: Config NIC port speed same as that of optical module

Port 0/1 of HiP08 supports 10G and 25G. This patch adds a
change to configure NIC port speed same as that of optical
module(SFP/QFSP). Driver gets the optical module speed and
sets NIC port speed accordingly.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Salil Mehta <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

lan743x: Enable driver to work with LAN7431

This driver was designed to work with both LAN7430 and LAN7431.
The only difference between the two is the LAN7431 has support
for external phy.

This change adds LAN7431 to the list of recognized devices
supported by this driver.

Updates for v2:
changed 'fixes' tag to match defined format

fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: fix lockdep warning during node delete

We see the following lockdep warning:

[ 2284.078521] ======================================================
[ 2284.078604] WARNING: possible circular locking dependency detected
[ 2284.078604] 4.19.0+ #42 Tainted: G            E
[ 2284.078604] ------------------------------------------------------
[ 2284.078604] rmmod/254 is trying to acquire lock:
[ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
[ 2284.078604]
[ 2284.078604] but task is already holding lock:
[ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
[ 2284.078604]
[ 2284.078604] which lock already depends on the new lock.
[ 2284.078604]
[ 2284.078604]
[ 2284.078604] the existing dependency chain (in reverse order) is:
[ 2284.078604]
[ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}:
[ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
[ 2284.078604]        call_timer_fn+0xa1/0x280
[ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
[ 2284.078604]        __do_softirq+0xfc/0x413
[ 2284.078604]        irq_exit+0xb5/0xc0
[ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
[ 2284.078604]        apic_timer_interrupt+0xf/0x20
[ 2284.078604]        default_idle+0x1c/0x140
[ 2284.078604]        do_idle+0x1bc/0x280
[ 2284.078604]        cpu_startup_entry+0x19/0x20
[ 2284.078604]        start_secondary+0x187/0x1c0
[ 2284.078604]        secondary_startup_64+0xa4/0xb0
[ 2284.078604]
[ 2284.078604] -> #0 ((&n->timer)#2){+.-.}:
[ 2284.078604]        del_timer_sync+0x34/0xa0
[ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
[ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
[ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
[ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
[ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
[ 2284.078604]        unregister_pernet_operations+0x87/0xd0
[ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
[ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
[ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
[ 2284.078604]        do_syscall_64+0x66/0x460
[ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 2284.078604]
[ 2284.078604] other info that might help us debug this:
[ 2284.078604]
[ 2284.078604]  Possible unsafe locking scenario:
[ 2284.078604]
[ 2284.078604]        CPU0                    CPU1
[ 2284.078604]        ----                    ----
[ 2284.078604]   lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]                                lock((&n->timer)#2);
[ 2284.078604]                                lock(&(&tn->node_list_lock)->rlock);
[ 2284.078604]   lock((&n->timer)#2);
[ 2284.078604]
[ 2284.078604]  *** DEADLOCK ***
[ 2284.078604]
[ 2284.078604] 3 locks held by rmmod/254:
[ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
[ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
[ 2284.078604]  #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19
[...}

The reason is that the node timer handler sometimes needs to delete a
node which has been disconnected for too long. To do this, it grabs
the lock 'node_list_lock', which may at the same time be held by the
generic node cleanup function, tipc_node_stop(), during module removal.
Since the latter is calling del_timer_sync() inside the same lock, we
have a potential deadlock.

We fix this letting the timer cleanup function use spin_trylock()
instead of just spin_lock(), and when it fails to grab the lock it
just returns so that the timer handler can terminate its execution.
This is safe to do, since tipc_node_stop() anyway is about to
delete both the timer and the node instance.

Fixes: 6a939f365bdb ("tipc: Auto removal of peer down node instance")
Acked-by: Ying Xue <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

lan743x: fix return value for lan743x_tx_napi_poll

The lan743x driver, when under heavy traffic load, has been noticed
to sometimes hang, or cause a kernel panic.

Debugging reveals that the TX napi poll routine was returning
the wrong value, 'weight'. Most other drivers return 0.
And call napi_complete, instead of napi_complete_done.

Additionally when creating the tx napi poll routine.
Changed netif_napi_add, to netif_tx_napi_add.

Updates for v3:
    changed 'fixes' tag to match defined format

Updates for v2:
use napi_complete, instead of napi_complete_done in
    lan743x_tx_napi_poll
use netif_tx_napi_add, instead of netif_napi_add for
    registration of tx napi poll routine

fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"

The text in array velocity_gstrings contains a spelling mistake,
rename rx_frame_alignement_errors to rx_frame_alignment_errors.

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qed: fix spelling mistake "attnetion" -> "attention"

The text in array s_igu_fifo_error_strs contains a spelling mistake,
fix it.

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'net-nsid-interpretation'

Nicolas Dichtel says:

====================
Ease to interpret net-nsid

The goal of this series is to ease the interpretation of nsid received in
netlink messages from other netns (when the user uses
NETLINK_F_LISTEN_ALL_NSID).

After this series, with a patched iproute2:

$ ip netns add foo
$ ip netns add bar
$ touch /var/run/netns/init_net
$ mount --bind /proc/1/ns/net /var/run/netns/init_net
$ ip netns set init_net 11
$ ip netns set foo 12
$ ip netns set bar 13
$ ip netns
init_net (id: 11)
bar (id: 13)
foo (id: 12)
$ ip -n foo netns set init_net 21
$ ip -n foo netns set foo 22
$ ip -n foo netns set bar 23
$ ip -n foo netns
init_net (id: 21)
bar (id: 23)
foo (id: 22)
$ ip -n bar netns set init_net 31
$ ip -n bar netns set foo 32
$ ip -n bar netns set bar 33
$ ip -n bar netns
init_net (id: 31)
bar (id: 33)
foo (id: 32)
$ ip netns list-id target-nsid 12
nsid 21 current-nsid 11 (iproute2 netns name: init_net)
nsid 22 current-nsid 12 (iproute2 netns name: foo)
nsid 23 current-nsid 13 (iproute2 netns name: bar)
$ ip -n bar netns list-id target-nsid 32 nsid 31
nsid 21 current-nsid 31 (iproute2 netns name: init_net)

v3 -> v4:
  - patch 5/5: fix imbalance lock in error path

v2 -> v3:
  - patch 5/5: account NETNSA_CURRENT_NSID in rtnl_net_get_size()

v1 -> v2:
  - patch 1/5: remove net from struct rtnl_net_dump_cb
  - patch 2/5: new in this version
  - patch 3/5: use a bool to know if rtnl_get_net_ns_capable() was called
  - patch 5/5: use struct net_fill_args
====================

Signed-off-by: David S. Miller <[email protected]>

netns: enable to dump full nsid translation table

Like the previous patch, the goal is to ease to convert nsids from one
netns to another netns.
A new attribute (NETNSA_CURRENT_NSID) is added to the kernel answer when
NETNSA_TARGET_NSID is provided, thus the user can easily convert nsids.

Signed-off-by: Nicolas Dichtel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netns: enable to specify a nsid for a get request

Combined with NETNSA_TARGET_NSID, it enables to "translate" a nsid from one
netns to a nsid of another netns.
This is useful when using NETLINK_F_LISTEN_ALL_NSID because it helps the
user to interpret a nsid received from an other netns.

Signed-off-by: Nicolas Dichtel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netns: add support of NETNSA_TARGET_NSID

Like it was done for link and address, add the ability to perform get/dump
in another netns by specifying a target nsid attribute.

Signed-off-by: Nicolas Dichtel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netns: introduce 'struct net_fill_args'

This is a preparatory work. To avoid having to much arguments for the
function rtnl_net_fill(), a new structure is defined.

Signed-off-by: Nicolas Dichtel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netns: remove net arg from rtnl_net_fill()

This argument is not used anymore.

Fixes: cab3c8ec8d57 ("netns: always provide the id to rtnl_net_fill()")
Signed-off-by: Nicolas Dichtel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: thunderx: fix NULL pointer dereference in nic_remove

Fix a possible NULL pointer dereference in nic_remove routine
removing the nicpf module if nic_probe fails.
The issue can be triggered with the following reproducer:

$rmmod nicvf
$rmmod nicpf

[  521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014
[  521.422777] Mem abort info:
[  521.425561]   ESR = 0x96000004
[  521.428624]   Exception class = DABT (current EL), IL = 32 bits
[  521.434535]   SET = 0, FnV = 0
[  521.437579]   EA = 0, S1PTW = 0
[  521.440730] Data abort info:
[  521.443603]   ISV = 0, ISS = 0x00000004
[  521.447431]   CM = 0, WnR = 0
[  521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42
[  521.457022] [0000000000000014] pgd=0000000000000000
[  521.461916] Internal error: Oops: 96000004 [#1] SMP
[  521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[  521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO)
[  521.523451] pc : nic_remove+0x24/0x88 [nicpf]
[  521.527808] lr : pci_device_remove+0x48/0xd8
[  521.532066] sp : ffff000013433cc0
[  521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000
[  521.540672] x27: 0000000000000000 x26: 0000000000000000
[  521.545974] x25: 0000000056000000 x24: 0000000000000015
[  521.551274] x23: ffff8007ff89a110 x22: ffff000001667070
[  521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000
[  521.561877] x19: 0000000000000000 x18: 0000000000000025
[  521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000
[  521.593683] x7 : 0000000000000000 x6 : 0000000000000001
[  521.598983] x5 : 0000000000000002 x4 : 0000000000000003
[  521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184
[  521.609585] x1 : ffff000001662118 x0 : ffff000008557be0
[  521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3)
[  521.621490] Call trace:
[  521.623928]  nic_remove+0x24/0x88 [nicpf]
[  521.627927]  pci_device_remove+0x48/0xd8
[  521.631847]  device_release_driver_internal+0x1b0/0x248
[  521.637062]  driver_detach+0x50/0xc0
[  521.640628]  bus_remove_driver+0x60/0x100
[  521.644627]  driver_unregister+0x34/0x60
[  521.648538]  pci_unregister_driver+0x24/0xd8
[  521.652798]  nic_cleanup_module+0x14/0x111c [nicpf]
[  521.657672]  __arm64_sys_delete_module+0x150/0x218
[  521.662460]  el0_svc_handler+0x94/0x110
[  521.666287]  el0_svc+0x8/0xc
[  521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)

Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'qed-enhancements-series'

Sudarsana Reddy Kalluru says:

====================
qed* enhancements series

The patch series add few enhancements to qed/qede drivers.

Changes from previous versions:
-------------------------------
v3: Revert v2 changes as the other paths (i.e. ptp) access the same data in
atomic context.
v2: Use __set_bit()/__clear_bit() where data access doesn't need to be
atomic.
====================

Signed-off-by: David S. Miller <[email protected]>

qed: Add support for MBI upgrade over MFW.

The patch adds driver support for MBI image update through MFW.

Signed-off-by: Sudarsana Reddy Kalluru <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qede: Update link status only when interface is ready.

In the case of internal reload (e.g., mtu change), there could be a race
between link-up notification from mfw and the driver unload processing. In
such case kernel assumes the link is up and starts using the queues which
leads to the server crash.

Send link notification to the kernel only when driver has already requested
MFW for the link.

Signed-off-by: Sudarsana Reddy Kalluru <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>