Git Repo - linux.git/log

Tobias Waldekranz [Fri, 15 Jan 2021 12:52:59 +0000 (13:52 +0100)]

net: dsa: mv88e6xxx: Only allow LAG offload on supported hardware

There are chips that do not have Global 2 registers, and therefore trunk
mapping/mask tables are not available. Refuse the offload as early as
possible on those devices.

Fixes: 57e661aae6a8 ("net: dsa: mv88e6xxx: Link aggregation support")
Signed-off-by: Tobias Waldekranz <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Tobias Waldekranz [Fri, 15 Jan 2021 12:52:58 +0000 (13:52 +0100)]

net: dsa: mv88e6xxx: Provide dummy implementations for trunk setters

Support for Global 2 registers is build-time optional. In the case
where it was not enabled the build would fail as no "dummy"
implementation of these functions was available.
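
A minimal sketch of the stub pattern this fix describes, not the driver's
actual code. HAVE_GLOBAL2, trunk_mask_write and setup_trunk are hypothetical
names standing in for the Kconfig option and the trunk setters; the point is
only that callers compile in both configurations.

```c
#include <errno.h>

/* Hypothetical build-time switch standing in for the kernel's Kconfig option. */
#ifdef HAVE_GLOBAL2

/* Real implementation would live in another translation unit. */
int trunk_mask_write(int num, unsigned int mask);

#else

/* Dummy: keeps callers compiling and reports that the feature is absent. */
static inline int trunk_mask_write(int num, unsigned int mask)
{
	(void)num;
	(void)mask;
	return -EOPNOTSUPP;
}

#endif

/* A caller that builds identically whether or not the feature is enabled. */
static inline int setup_trunk(int num, unsigned int mask)
{
	return trunk_mask_write(num, mask);
}
```

With the option disabled, the caller links against the stub and simply sees
an error code instead of a missing symbol.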

Fixes: 57e661aae6a8 ("net: dsa: mv88e6xxx: Link aggregation support")
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Tested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Tobias Waldekranz <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Jakub Kicinski [Fri, 15 Jan 2021 23:37:53 +0000 (15:37 -0800)]

Merge branch 'arrow-speedchips-xrs700x-dsa-driver'

George McCollister says:

====================
Arrow SpeedChips XRS700x DSA Driver

This series adds a DSA driver for the Arrow SpeedChips XRS 7000 series
of HSR/PRP gigabit switch chips.

The chips use Flexibilis IP.
More information can be found here:
https://www.flexibilis.com/products/speedchips-xrs7000/

The switches have up to three RGMII ports and one MII port and are
managed via mdio or i2c. They use a one byte trailing tag to identify
the switch port when in managed mode so I've added a tag driver which
implements this.

This series contains minimal DSA functionality which may be built upon
in future patches. The ultimate goal is to add HSR and PRP
(IEC 62439-3 Clause 5 & 4) offloading with integration into net/hsr.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

George McCollister [Thu, 14 Jan 2021 19:57:34 +0000 (13:57 -0600)]

dt-bindings: net: dsa: add bindings for xrs700x switches

Add documentation and an example for Arrow SpeedChips XRS7000 Series
single chip Ethernet switches.

Signed-off-by: George McCollister <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

George McCollister [Thu, 14 Jan 2021 19:57:33 +0000 (13:57 -0600)]

net: dsa: add Arrow SpeedChips XRS700x driver

Add a driver with initial support for the Arrow SpeedChips XRS7000
series of gigabit Ethernet switch chips which are typically used in
critical networking applications.

The switches have up to three RGMII ports and one RMII port.
Management to the switches can be performed over i2c or mdio.

Support for advanced features such as PTP and
HSR/PRP (IEC 62439-3 Clause 5 & 4) is not included in this patch and
may be added at a later date.

Signed-off-by: George McCollister <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

George McCollister [Thu, 14 Jan 2021 19:57:32 +0000 (13:57 -0600)]

dsa: add support for Arrow XRS700x tag trailer

Add support for Arrow SpeedChips XRS700x single byte tag trailer. This
is modeled on tag_trailer.c which works in a similar way.
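
A rough sketch of how a single-byte trailing tag can be added and stripped.
The encoding used here (a one-hot port bitmap in the last byte) is an
assumption made for illustration, and tag_xmit/tag_rcv are hypothetical
helpers operating on a plain byte buffer rather than an skb.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: one-hot port bitmap in the single trailing byte. */

/* Append the tag for `port` to a frame of `len` bytes. Returns the new
 * length, or 0 if the buffer is too small or the port does not fit. */
static size_t tag_xmit(uint8_t *buf, size_t len, size_t cap, unsigned int port)
{
	if (port > 7 || len + 1 > cap)
		return 0;
	buf[len] = (uint8_t)(1u << port);
	return len + 1;
}

/* Strip the trailing tag. Returns the source port, or -1 on a malformed
 * tag. *len is shortened by one byte on success. */
static int tag_rcv(const uint8_t *buf, size_t *len)
{
	uint8_t tag;
	int port;

	if (*len < 1)
		return -1;
	tag = buf[*len - 1];
	/* Exactly one bit must be set for the tag to name a single port. */
	if (tag == 0 || (tag & (tag - 1)) != 0)
		return -1;
	for (port = 0; !(tag & 1); port++)
		tag >>= 1;
	*len -= 1;
	return port;
}
```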

Signed-off-by: George McCollister <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Jakub Kicinski [Fri, 15 Jan 2021 23:06:08 +0000 (15:06 -0800)]

Merge branch 'add-further-dt-configuration-for-at803x-phys'

Russell King says:

====================
Add further DT configuration for AT803x PHYs

This patch series adds the ability to configure the SmartEEE feature
in AT803x PHYs. SmartEEE defaults to enabled on these PHYs, and has
a history of causing random sporadic link drops at Gigabit speeds.

There appear to be two solutions to this. There is the approach that
Freescale adopted early on, which is to disable the SmartEEE feature.
However, this loses the power saving provided by EEE. Another solution,
found by Jon Nettleton, is to increase the Tw parameter for Gigabit
links.

This patch series adds support for both approaches, by adding a boolean:

qca,disable-smarteee

if one wishes to disable SmartEEE, and two properties to configure the
SmartEEE Tw parameters:

qca,smarteee-tw-us-100m
qca,smarteee-tw-us-1g

Sadly, the PHY quirk I merged a while back for AT8035 on iMX6 is broken
- rather than disabling SmartEEE mode, it enables it.

The addition of these properties will be sent to the appropriate
platform maintainers - although for SolidRun platforms, we only make use
of "qca,smarteee-tw-us-1g".
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Russell King [Thu, 14 Jan 2021 10:45:49 +0000 (10:45 +0000)]

net: phy: at803x: add support for configuring SmartEEE

SmartEEE for the atheros phy was deemed buggy by Freescale and commits
were added to disable it for their boards.

In initial testing, SolidRun found that the default settings were
causing disconnects but by increasing the Tw buffer time we could allow
enough time for all parts of the link to come out of a low power state
and function properly without causing a disconnect. This allows us to
have functional power savings of between 300 and 400mW, rather than
disabling the feature altogether.

This commit adds support for disabling SmartEEE and configuring the Tw
parameters for 1G and 100M speeds.

Signed-off-by: Russell King <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Russell King [Thu, 14 Jan 2021 10:45:44 +0000 (10:45 +0000)]

dt: ar803x: document SmartEEE properties

The SmartEEE feature of Atheros AR803x PHYs can cause the link to
bounce. Add DT properties to allow SmartEEE to be disabled, and to
allow the Tw parameters for 100M and 1G links to be configured.

Signed-off-by: Russell King <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Alexei Starovoitov [Fri, 15 Jan 2021 03:29:58 +0000 (19:29 -0800)]

Merge branch 'perf: Add mmap2 build id support'

Jiri Olsa says:

====================

hi,
adding the support to have buildid stored in mmap2 event,
so we can bypass the final perf record hunt on build ids.

This patchset allows perf to record build ID in mmap2 event,
and adds perf tooling to store/download binaries to .debug
cache based on these build IDs.

Note that the build id retrieval code is stolen from bpf
code, where it's been used (together with file offsets)
to replace IPs in user space stack traces. It's now added
under lib directory.

v7 changes:
  - included only missing kernel patches, cc-ed bpf@vger and
    rebased on bpf-next/master [Alexei]

v6 changes:
  - last 4 patches rebased on Arnaldo's perf/core

v5 changes:
  - rebased on latest perf/core
  - several patches already pulled in
  - fixed trace+probe_vfs_getname.sh output redirection
  - fixed changelogs [Arnaldo]
  - renamed BUILD_ID_SIZE to BUILD_ID_SIZE_MAX [Song]

v4 changes:
  - fixed typo in changelog [Namhyung]
  - removed force_download bool from struct dso_store_data,
    because it's not used  [Namhyung]

v3 changes:
  - added acks
  - removed forgotten debug code [Arnaldo]
  - fixed readlink termination [Ian]
  - fixed doc for --debuginfod=URLs [Ian]
  - adopted kernel's memchr_inv function and used
    it in build_id__is_defined function [Arnaldo]

On recording server:

  - on the recording server we can run record with --buildid-mmap
    option to store build ids in mmap2 events:

    # perf record --buildid-mmap
    ^C[ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.836 MB perf.data ]

  - it stores nothing to ~/.debug cache:

    # find ~/.debug
    find: ‘/root/.debug’: No such file or directory

  - and still reports properly:

    # perf report --stdio
    ...
    99.82%  swapper          [kernel.kallsyms]  [k] native_safe_halt
     0.03%  swapper          [kernel.kallsyms]  [k] finish_task_switch
     0.02%  swapper          [kernel.kallsyms]  [k] __softirqentry_text_start
     0.01%  kcompactd0       [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     0.01%  ksoftirqd/6      [kernel.kallsyms]  [k] slab_free_freelist_hook
     0.01%  kworker/17:1H-x  [kernel.kallsyms]  [k] slab_free_freelist_hook

  - display used/hit build ids:

    # perf buildid-list | head -5
    5dcec522abf136fcfd3128f47e131f2365834dd7 /proc/kcore
    589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
    94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
    559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
    40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so

  - store build id binaries into build id cache:

    # perf buildid-cache -a perf.data
    OK   5dcec522abf136fcfd3128f47e131f2365834dd7 /proc/kcore
    OK   589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
    OK   94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
    OK   559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
    OK   40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
    OK   a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
    OK   e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
    OK   9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find

    # find ~/.debug | head -5
    /root/.debug
    /root/.debug/proc
    /root/.debug/proc/kcore
    /root/.debug/proc/kcore/5dcec522abf136fcfd3128f47e131f2365834dd7
    /root/.debug/proc/kcore/5dcec522abf136fcfd3128f47e131f2365834dd7/kallsyms

  - run debuginfod daemon to provide binaries to another server (below)
    (the initialization could take some time)

    # debuginfod -F /

On another server:

  - copy perf.data from 'record' server and run:

    $ find ~/.debug/
    find: ‘/home/jolsa/.debug/’: No such file or directory

    $ perf buildid-list | head -5
    No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
    5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
    5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
    589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
    94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
    559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so

  - report does not show anything (kernel build id does not match):

   $ perf report --stdio
   ...
    76.73%  swapper          [kernel.kallsyms]    [k] 0xffffffff81aa8ebe
     1.89%  find             [kernel.kallsyms]    [k] 0xffffffff810f2167
     0.93%  sshd             [kernel.kallsyms]    [k] 0xffffffff8153380c
     0.83%  swapper          [kernel.kallsyms]    [k] 0xffffffff81104b0b
     0.71%  kworker/u40:2-e  [kernel.kallsyms]    [k] 0xffffffff810f3850
     0.70%  kworker/u40:0-e  [kernel.kallsyms]    [k] 0xffffffff810f3850
     0.64%  find             [kernel.kallsyms]    [k] 0xffffffff81a9ba0a
     0.63%  find             [kernel.kallsyms]    [k] 0xffffffff81aa93b0

  - adding build ids does not work, because existing binaries (on another server)
    have different build ids:

    $ perf buildid-cache -a perf.data
    No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
    FAIL 5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
    FAIL 5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
    FAIL 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
    FAIL 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
    FAIL 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
    FAIL 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
    FAIL a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
    FAIL e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
    FAIL 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find

  - add build ids with debuginfod setup pointing to record server:

    $ perf buildid-cache -a perf.data --debuginfod http://192.168.122.174:8002
    No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
    OK   5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
    OK   5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
    OK   589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
    OK   94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
    OK   559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
    OK   40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
    OK   a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
    OK   e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
    OK   9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find

  - and report works:

    $ perf report --stdio
    ...
    76.73%  swapper          [kernel.kallsyms]    [k] native_safe_halt
     1.91%  find             [kernel.kallsyms]    [k] queue_work_on
     0.93%  sshd             [kernel.kallsyms]    [k] iowrite16
     0.83%  swapper          [kernel.kallsyms]    [k] finish_task_switch
     0.72%  kworker/u40:2-e  [kernel.kallsyms]    [k] process_one_work
     0.70%  kworker/u40:0-e  [kernel.kallsyms]    [k] process_one_work
     0.64%  find             [kernel.kallsyms]    [k] syscall_enter_from_user_mode
     0.63%  find             [kernel.kallsyms]    [k] _raw_spin_unlock_irqrestore

  - because we have the data in build id cache:

    $ find ~/.debug | head -10
    .../.debug
    .../.debug/home
    .../.debug/home/jolsa
    .../.debug/home/jolsa/.cache
    .../.debug/home/jolsa/.cache/debuginfod_client
    .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7
    .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable
    .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7
    .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7/elf
    .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7/debug

Available also in:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/build_id

thanks,
jirka
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

Jiri Olsa [Thu, 14 Jan 2021 13:40:44 +0000 (14:40 +0100)]

perf: Add build id data in mmap2 event

Adding support to carry build id data in mmap2 event.

The build id data replaces maj/min/ino/ino_generation
fields, which are also used to identify map's binary,
so it's ok to replace them with build id data:

  union {
          struct {
                  u32       maj;
                  u32       min;
                  u64       ino;
                  u64       ino_generation;
          };
          struct {
                  u8        build_id_size;
                  u8        __reserved_1;
                  u16       __reserved_2;
                  u8        build_id[20];
          };
  };

Replaced maj/min/ino/ino_generation fields give us size
of 24 bytes. We use 20 bytes for build id data, 1 byte
for size and rest is unused.

There's new misc bit for mmap2 to signal there's build
id data in it:

  #define PERF_RECORD_MISC_MMAP_BUILD_ID   (1 << 14)
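
The size arithmetic above can be checked with a small userspace mirror of
the union. This is only a sketch: the type name mmap2_id and the helper
mmap2_has_build_id are hypothetical, fixed-width stdint types replace the
kernel's u8/u16/u32/u64, and the reserved fields are renamed to avoid
reserved identifiers in userspace.

```c
#include <stdint.h>

#define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)

/* Userspace mirror of the union from the commit message. */
union mmap2_id {
	struct {
		uint32_t maj;
		uint32_t min;
		uint64_t ino;
		uint64_t ino_generation;
	};
	struct {
		uint8_t  build_id_size;
		uint8_t  reserved_1;   /* __reserved_1 in the commit message */
		uint16_t reserved_2;   /* __reserved_2 in the commit message */
		uint8_t  build_id[20];
	};
};

/* 4 + 4 + 8 + 8 == 1 + 1 + 2 + 20 == 24 */
_Static_assert(sizeof(union mmap2_id) == 24, "both views occupy 24 bytes");

/* Tells a reader of the event which view of the union is valid. */
static int mmap2_has_build_id(uint16_t misc)
{
	return (misc & PERF_RECORD_MISC_MMAP_BUILD_ID) != 0;
}
```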

Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Jiri Olsa [Thu, 14 Jan 2021 13:40:43 +0000 (14:40 +0100)]

bpf: Add size arg to build_id_parse function

It's possible to have other build id types (other than default SHA1).
Currently there's also ld support for MD5 build id.

Adding size argument to build_id_parse function, that returns (if defined)
size of the parsed build id, so we can recognize the build id type.
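
As an illustration of why the size is useful, a caller could map it back to
the hash type. The enum and function below are hypothetical and not part of
the patch; the only facts relied on are the digest lengths (16 bytes for
MD5, 20 bytes for SHA1).

```c
#include <stddef.h>

enum build_id_kind { BUILD_ID_UNKNOWN, BUILD_ID_MD5, BUILD_ID_SHA1 };

/* Guess the build id type from the size reported by the parser. */
static enum build_id_kind build_id_kind_from_size(size_t size)
{
	switch (size) {
	case 16:
		return BUILD_ID_MD5;   /* MD5 digest: 128 bits */
	case 20:
		return BUILD_ID_SHA1;  /* SHA1 digest: 160 bits */
	default:
		return BUILD_ID_UNKNOWN;
	}
}
```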

Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Jiri Olsa [Thu, 14 Jan 2021 13:40:42 +0000 (14:40 +0100)]

bpf: Move stack_map_get_build_id into lib

Moving stack_map_get_build_id into lib with
declaration in linux/buildid.h header:

int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id);

This function returns build id for given struct vm_area_struct.
There is no functional change to stack_map_get_build_id function.

Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Jakub Kicinski [Fri, 15 Jan 2021 02:34:50 +0000 (18:34 -0800)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Signed-off-by: Jakub Kicinski <[email protected]>

Alexei Starovoitov [Fri, 15 Jan 2021 02:34:30 +0000 (18:34 -0800)]

Merge branch 'Atomics for eBPF'

Brendan Jackman says:

====================

There's still one unresolved review comment from John[3] which I
will resolve with a followup patch.

Differences from v6->v7 [1]:

* Fixed riscv build error detected by 0-day robot.

Differences from v5->v6 [1]:

* Carried Björn Töpel's ack for RISC-V code, plus a couple more acks from
  Yonghong.

* Doc fixups.

* Trivial cleanups.

Differences from v4->v5 [1]:

* Fixed bogus type casts in interpreter that led to warnings from
  the 0day robot.

* Dropped feature-detection for Clang per Andrii's suggestion in [4].
  The selftests will now fail to build unless you have llvm-project
  commit 286daafd6512. The ENABLE_ATOMICS_TESTS macro is still needed
  to support the no_alu32 tests.

* Carried some Acks from John and Yonghong.

* Dropped confusing usage of __atomic_exchange from prog_test in
  favour of __sync_lock_test_and_set.

* [Really] got rid of all the forest of instruction macros
  (BPF_ATOMIC_FETCH_ADD and friends); now there's just BPF_ATOMIC_OP
  to define all the instructions as we use them in the verifier
  tests. This makes the atomic ops less special in that API, and I
  don't think the resulting usage is actually any harder to read.

Differences from v3->v4 [1]:

* Added one Ack from Yonghong. He acked some other patches but those
  have now changed non-trivially so I didn't add those acks.

* Fixups to commit messages.

* Fixed disassembly and comments: first arg to atomic_fetch_* is a
  pointer.

* Improved prog_test efficiency. BPF progs are now all loaded in a
  single call, then the skeleton is re-used for each subtest.

* Dropped use of tools/build/feature in favour of a one-liner in the
  Makefile.

* Dropped the commit that created an emit_neg helper in the x86
  JIT. It's not used any more (it wasn't used in v3 either).

* Combined all the different filter.h macros (used to be
  BPF_ATOMIC_ADD, BPF_ATOMIC_FETCH_ADD, BPF_ATOMIC_AND, etc) into
  just BPF_ATOMIC32 and BPF_ATOMIC64.

* Removed some references to BPF_STX_XADD from tools/, samples/ and
  lib/ that I missed before.

Differences from v2->v3 [1]:

* More minor fixes and naming/comment changes

* Dropped atomic subtract: compilers can implement this by preceding
  an atomic add with a NEG instruction (which is what the x86 JIT did
  under the hood anyway).

* Dropped the use of -mcpu=v4 in the Clang BPF command-line; there is
  no longer an architecture version bump. Instead a feature test is
  added to Kbuild - it builds a source file to check if Clang
  supports BPF atomics.

* Fixed the prog_test so it no longer breaks
  test_progs-no_alu32. This requires some ifdef acrobatics to avoid
  complicating the prog_tests model where the same userspace code
  exercises both the normal and no_alu32 BPF test objects, using the
  same skeleton header.

Differences from v1->v2 [1]:

* Fixed mistakes in the netronome driver

* Added sub, add, or, xor operations

* The above led to some refactors to keep things readable. (Maybe I
  should have just waited until I'd implemented these before starting
  the review...)

* Replaced BPF_[CMP]SET | BPF_FETCH with just BPF_[CMP]XCHG, which
  include the BPF_FETCH flag

* Added a bit of documentation. Suggestions welcome for more places
  to dump this info...

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in commit 286daafd6512 (was
https://reviews.llvm.org/D72184).

This only includes a JIT implementation for x86_64 - I don't plan to
implement JIT support myself for other architectures.

Operations
==========

This patchset adds atomic operations to the eBPF instruction set. The
use-case that motivated this work was a trivial and efficient way to
generate globally-unique cookies in BPF progs, but I think it's
obvious that these features are pretty widely applicable.  The
instructions that are added here can be summarised with this list of
kernel operations:

* atomic[64]_[fetch_]add
* atomic[64]_[fetch_]and
* atomic[64]_[fetch_]or
* atomic[64]_[fetch_]xor
* atomic[64]_xchg
* atomic[64]_cmpxchg

The following are left out of scope for this effort:

* 16 and 8 bit operations
* Explicit memory barriers

Encoding
========

I originally planned to add new values for bpf_insn.opcode. This was
rather unpleasant: the opcode space has holes in it but no entire
instruction classes[2]. Yonghong Song had a better idea: use the
immediate field of the existing STX XADD instruction to encode the
operation. This works nicely, without breaking existing programs,
because the immediate field is currently reserved-must-be-zero, and
extra-nicely because BPF_ADD happens to be zero.

Note that this of course makes immediate-source atomic operations
impossible. It's hard to imagine a measurable speedup from such
instructions, and if it existed it would certainly not benefit x86,
which has no support for them.

The BPF_OP opcode fields are re-used in the immediate, and an
additional flag BPF_FETCH is used to mark instructions that should
fetch a pre-modification value from memory.

So, BPF_XADD is now called BPF_ATOMIC (the old name is kept to avoid
breaking userspace builds), and where we previously had .imm = 0, we
now have .imm = BPF_ADD (which is 0).
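
The encoding described above can be sketched as follows. The constants are
copied from the kernel UAPI headers as I understand them; struct atomic_insn
and atomic_op are hypothetical simplifications (the real struct bpf_insn
packs the two register numbers into 4-bit fields).

```c
#include <stdint.h>

/* Instruction class, size and mode bits of the opcode byte. */
#define BPF_STX     0x03
#define BPF_W       0x00
#define BPF_DW      0x18
#define BPF_ATOMIC  0xc0   /* formerly BPF_XADD */

/* Operation selectors, now carried in the immediate field. */
#define BPF_ADD     0x00
#define BPF_OR      0x40
#define BPF_AND     0x50
#define BPF_XOR     0xa0
#define BPF_FETCH   0x01
#define BPF_XCHG    (0xe0 | BPF_FETCH)
#define BPF_CMPXCHG (0xf0 | BPF_FETCH)

/* Simplified stand-in for struct bpf_insn. */
struct atomic_insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Build an atomic instruction: the opcode stays the same for every
 * operation, only the immediate changes. */
static struct atomic_insn atomic_op(uint8_t size, int32_t op, uint8_t dst,
				    uint8_t src, int16_t off)
{
	struct atomic_insn insn = {
		.code = (uint8_t)(BPF_STX | size | BPF_ATOMIC),
		.dst_reg = dst,
		.src_reg = src,
		.off = off,
		.imm = op,
	};
	return insn;
}
```

An old XADD program and a new atomic add produce the same bytes, because
BPF_ADD is zero and the immediate previously had to be zero.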

Operands
========

Reg-source eBPF instructions only have two operands, while these
atomic operations have up to four. To avoid needing to encode
additional operands, then:

- One of the input registers is re-used as an output register
  (e.g. atomic_fetch_add both reads from and writes to the source
  register).

- Where necessary (i.e. for cmpxchg), R0 is "hard-coded" as one of
  the operands.

This approach also allows the new eBPF instructions to map directly
to single x86 instructions.
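
A small model of the register re-use, written with C11 atomics. struct vm
and atomic_fetch_add64 are hypothetical names for illustration only; the
sketch just shows the source register acting as both input and output.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: regs[] stands in for the BPF register file. */
struct vm {
	uint64_t regs[11];
};

/* 64-bit BPF_ADD | BPF_FETCH: add regs[src] to *mem, then overwrite
 * regs[src] with the value *mem held before the addition. */
static void atomic_fetch_add64(struct vm *vm, _Atomic uint64_t *mem, int src)
{
	vm->regs[src] = atomic_fetch_add(mem, vm->regs[src]);
}
```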

[1] Previous iterations:
    v1: https://lore.kernel.org/bpf/20201123173202.1335708 [email protected]/
    v2: https://lore.kernel.org/bpf/20201127175738.1085417 [email protected]/
    v3: https://lore.kernel.org/bpf/[email protected]/
    v4: https://lore.kernel.org/bpf/20201207160734.2345502 [email protected]/
    v5: https://lore.kernel.org/bpf/20201215121816.1048557 [email protected]/
    v6: https://lore.kernel.org/bpf/20210112154235.2192781 [email protected]/

[2] Visualisation of eBPF opcode space:
    https://gist.github.com/bjackman/00fdad2d5dfff601c1918bc29b16e778

[3] Comment from John about propagating bounds in verifier:
    https://lore.kernel.org/bpf/[email protected]/

[4] Mail from Andrii about not supporting old Clang in selftests:
    https://lore.kernel.org/bpf/CAEf4BzYBddPaEzRUs=jaWSo5kbf=LZdb7geAUVj85GxLQztuAQ@mail.gmail.com/
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

Brendan Jackman [Thu, 14 Jan 2021 18:17:51 +0000 (18:17 +0000)]

bpf: Document new atomic instructions

Document new atomic instructions.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:50 +0000 (18:17 +0000)]

bpf: Add tests for new BPF atomic operations

The prog_test that's added depends on Clang/LLVM features added by
Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184).

Note the use of a define called ENABLE_ATOMICS_TESTS: this is used
to:

- Avoid breaking the build for people on old versions of Clang
- Avoid needing separate lists of test objects for no_alu32, where
atomics are not supported even if Clang has the feature.

The atomics_test.o BPF object is built unconditionally both for
test_progs and test_progs-no_alu32. For test_progs, if Clang supports
atomics, ENABLE_ATOMICS_TESTS is defined, so it includes the proper
test code. Otherwise, progs and global vars are defined anyway, as
stubs; this means that the skeleton user code still builds.

The atomics_test.o userspace object is built once and used for both
test_progs and test_progs-no_alu32. A variable called skip_tests is
defined in the BPF object's data section, which tells the userspace
object whether to skip the atomics test.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:49 +0000 (18:17 +0000)]

bpf: Add bitwise atomic instructions

This adds instructions for

atomic[64]_[fetch_]and
atomic[64]_[fetch_]or
atomic[64]_[fetch_]xor

All these operations are isomorphic enough to implement with the same
verifier, interpreter, and x86 JIT code, hence being a single commit.

The main interesting thing here is that x86 doesn't directly support
the fetch_ version of these operations, so we need to generate a CMPXCHG
loop in the JIT. This requires the use of two temporary registers,
IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for this purpose.
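
The CMPXCHG loop can be sketched in portable C11 as below. This is an
illustration of the technique, not the code the JIT emits;
fetch_and_via_cmpxchg is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Emulate an atomic fetch-and using only compare-and-exchange. */
static uint64_t fetch_and_via_cmpxchg(_Atomic uint64_t *mem, uint64_t val)
{
	uint64_t old = atomic_load(mem);

	/* If another writer got in first, the exchange fails, `old` is
	 * refreshed with the current value, and the new value is recomputed
	 * on the next iteration. */
	while (!atomic_compare_exchange_weak(mem, &old, old & val))
		;
	return old;
}
```

The same loop works for or and xor by swapping the operator.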

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:48 +0000 (18:17 +0000)]

bpf: Pull out a macro for interpreting atomic ALU operations

Since the atomic operations that are added in subsequent commits are
all isomorphic with BPF_ADD, pull out a macro to avoid the
interpreter becoming dominated by lines of atomic-related code.

Note that this sacrifices interpreter performance (combining
STX_ATOMIC_W and STX_ATOMIC_DW into single switch case means that we
need an extra conditional branch to differentiate them) in favour of
compact and (relatively!) simple C code.
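
The trade-off can be illustrated with a macro that stamps out one handler
per operation and picks the width at run time. This is a sketch using C11
atomics, not the interpreter's macro; DEFINE_ATOMIC_ALU and the
interp_atomic_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One definition per operation; the is64 test is the extra conditional
 * branch paid for sharing the 32-bit and 64-bit cases. */
#define DEFINE_ATOMIC_ALU(name, op)                                          \
static void name(int is64, void *mem, uint64_t val)                          \
{                                                                            \
	if (is64)                                                            \
		atomic_fetch_##op((_Atomic uint64_t *)mem, val);             \
	else                                                                 \
		atomic_fetch_##op((_Atomic uint32_t *)mem, (uint32_t)val);   \
}

DEFINE_ATOMIC_ALU(interp_atomic_add, add)
DEFINE_ATOMIC_ALU(interp_atomic_and, and)
DEFINE_ATOMIC_ALU(interp_atomic_or, or)
DEFINE_ATOMIC_ALU(interp_atomic_xor, xor)
```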

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:47 +0000 (18:17 +0000)]

bpf: Add instructions for atomic_[cmp]xchg

This adds two atomic opcodes, both of which include the BPF_FETCH
flag. XCHG without the BPF_FETCH flag would naturally encode
atomic_set. This is not supported because it would be of limited
value to userspace (it doesn't imply any barriers). CMPXCHG without
BPF_FETCH would be an atomic compare-and-write. We don't have such
an operation in the kernel so it isn't provided to BPF either.

There are two significant design decisions made for the CMPXCHG
instruction:

- This operation fundamentally has 3 operands, but we only have two
   register fields. Therefore the operand we compare against (the
   kernel's API calls it 'old') is hard-coded to be R0. x86 has a
   similar design (and A64 doesn't have this problem).

   A potential alternative might be to encode the other operand's
   register number in the immediate field.

- The kernel's atomic_cmpxchg returns the old value, while the C11
   userspace APIs return a boolean indicating the comparison
   result. Which should BPF do? A64 returns the old value. x86 returns
   the old value in the hard-coded register (and also sets a
   flag). That means return-old-value is easier to JIT, so that's
   what we use.
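
To make the two conventions concrete, here is a sketch that builds the
return-old-value form on top of the C11 boolean form and then applies the
R0 convention. cmpxchg_return_old and bpf_cmpxchg64 are hypothetical names;
this models the semantics only and is not the verifier or JIT code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Kernel-style cmpxchg built on the C11 primitive: always returns the
 * value that was in memory before the operation. */
static uint64_t cmpxchg_return_old(_Atomic uint64_t *mem, uint64_t old,
				   uint64_t new_val)
{
	/* On failure C11 stores the observed value into `old`; on success
	 * `old` already equals it. Either way `old` holds the prior value. */
	atomic_compare_exchange_strong(mem, &old, new_val);
	return old;
}

/* BPF view: R0 is the implicit compare operand and receives the result. */
static void bpf_cmpxchg64(uint64_t *r0, _Atomic uint64_t *mem, uint64_t src)
{
	*r0 = cmpxchg_return_old(mem, *r0, src);
}
```

A caller can still recover the boolean by comparing R0 before and after.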

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:46 +0000 (18:17 +0000)]

bpf: Add BPF_FETCH field / create atomic_fetch_add instruction

The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
instructions, in order to have the previous value of the
atomically-modified memory location loaded into the src register
after an atomic op is carried out.

Suggested-by: Yonghong Song <[email protected]>
Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:45 +0000 (18:17 +0000)]

bpf: Move BPF_STX reserved field check into BPF_STX verifier code

I can't find a reason why this code is in resolve_pseudo_ldimm64;
since I'll be modifying it in a subsequent commit, tidy it up.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:44 +0000 (18:17 +0000)]

bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.

In preparation, rename the instruction mode BPF_ATOMIC and start
calling the zero immediate BPF_ADD.

This is possible (doesn't break existing valid BPF progs) because the
immediate field is currently reserved MBZ and BPF_ADD is zero.

All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Brendan Jackman [Thu, 14 Jan 2021 18:17:43 +0000 (18:17 +0000)]

bpf: x86: Factor out a lookup table for some ALU opcodes

A later commit will need to look up a subset of these opcodes. To
avoid duplicating code, pull out a table.

The shift opcodes won't be needed by that later commit, but they're
already duplicated, so fold them into the table anyway.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
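
A minimal userspace sketch of such a lookup table follows. It covers only
the five arithmetic/logic operations and leaves the shift opcodes out; the
table name and the -1 "no entry" convention are assumptions made for the
example, not the layout used by the JIT.

```c
#include <assert.h>
#include <stdint.h>

/* BPF ALU operation codes (the high nibble of the opcode byte). */
#define SK_BPF_ADD 0x00
#define SK_BPF_SUB 0x10
#define SK_BPF_OR  0x40
#define SK_BPF_AND 0x50
#define SK_BPF_XOR 0xa0

/* x86 "op r/m, r" opcodes indexed by BPF operation; unlisted slots stay 0. */
static const uint8_t sk_alu_opcode[256] = {
	[SK_BPF_ADD] = 0x01,
	[SK_BPF_SUB] = 0x29,
	[SK_BPF_OR]  = 0x09,
	[SK_BPF_AND] = 0x21,
	[SK_BPF_XOR] = 0x31,
};

/* Return the x86 opcode for a BPF ALU op, or -1 if the table has no entry. */
static int sk_lookup_alu(uint8_t bpf_op)
{
	uint8_t opcode = sk_alu_opcode[bpf_op];

	return opcode ? opcode : -1;
}
```

Treating 0 as "absent" works here only because none of the listed x86
opcodes is 0x00.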

Brendan Jackman [Thu, 14 Jan 2021 18:17:42 +0000 (18:17 +0000)]

bpf: x86: Factor out emission of REX byte

The JIT case for encoding atomic ops is about to get more
complicated. In order to make the review & resulting code easier,
let's factor out some shared helpers.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
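
For readers unfamiliar with the REX prefix, a small userspace sketch of how
the byte is assembled is given below. The helper names are hypothetical and
are not the ones introduced by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build an x86-64 REX prefix, bit pattern 0100WRXB:
 *   w: 64-bit operand size
 *   r: extends the ModR/M reg field
 *   x: extends the SIB index field
 *   b: extends the ModR/M r/m (or SIB base) field
 */
static uint8_t sk_rex(bool w, bool r, bool x, bool b)
{
	return 0x40 | (w << 3) | (r << 2) | (x << 1) | b;
}

/* Registers r8..r15 have bit 3 set in their 4-bit hardware number. */
static bool sk_is_ext_reg(unsigned int hw_reg)
{
	assert(hw_reg < 16);
	return hw_reg & 8;
}
```

For example, a 64-bit operation whose reg operand is r9 and whose r/m
operand is rbx would use sk_rex(true, sk_is_ext_reg(9), false,
sk_is_ext_reg(3)).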

Brendan Jackman [Thu, 14 Jan 2021 18:17:41 +0000 (18:17 +0000)]

bpf: x86: Factor out emission of ModR/M for *(reg + off)

The case for JITing atomics is about to get more complicated. Let's
factor out some common code to make the review and result more
readable.

NB the atomics code doesn't yet use the new helper - a subsequent
patch will add its use as a side-effect of other changes.

Signed-off-by: Brendan Jackman <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
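
The encoding being factored out can be sketched in userspace as follows.
This is an illustration under simplifying assumptions (3-bit register
fields, no SIB byte, displacement always emitted); the function name is
hypothetical and the real helper may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Emit ModR/M plus displacement for the memory operand [rm + off].
 * reg and rm are 3-bit register fields. Returns the number of bytes
 * written (2 or 5); buf must have room for 5 bytes.
 */
static size_t sk_emit_modrm_disp(uint8_t *buf, unsigned int reg,
				 unsigned int rm, int32_t off)
{
	size_t n = 0;

	assert(reg < 8 && rm < 8);
	assert(rm != 4);	/* r/m = 100b would require a SIB byte */

	if (off >= -128 && off <= 127) {
		buf[n++] = 0x40 | (reg << 3) | rm;	/* mod = 01: disp8 */
		buf[n++] = (uint8_t)off;
	} else {
		uint32_t disp = (uint32_t)off;

		buf[n++] = 0x80 | (reg << 3) | rm;	/* mod = 10: disp32 */
		buf[n++] = disp & 0xff;			/* little-endian */
		buf[n++] = (disp >> 8) & 0xff;
		buf[n++] = (disp >> 16) & 0xff;
		buf[n++] = (disp >> 24) & 0xff;
	}
	return n;
}
```

Choosing the one-byte displacement whenever the offset fits keeps the
generated instruction three bytes shorter.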

Jakub Kicinski [Fri, 15 Jan 2021 02:24:55 +0000 (18:24 -0800)]

Merge branch 'dissect-ptp-l2-packet-header'

Eran Ben Elisha says:

====================
Dissect PTP L2 packet header

This series adds support for dissecting PTP L2 packet
header (EtherType 0x88F7).

For packet header dissecting, skb->protocol is needed. Add protocol
parsing operation to vlan ops, to guarantee skb->protocol is set,
as EtherType 0x88F7 occasionally follows a vlan header.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Eran Ben Elisha [Tue, 12 Jan 2021 19:07:13 +0000 (21:07 +0200)]

net: flow_dissector: Parse PTP L2 packet header

Add support for parsing PTP L2 packet header. Such packet consists
of an L2 header (with ethertype of ETH_P_1588), PTP header, body
and an optional suffix.

Signed-off-by: Eran Ben Elisha <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
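
As an illustration of the frame layout described above, here is a userspace
sketch that reads three fields of the common PTPv2 header from an untagged
Ethernet frame. The field offsets are my reading of the PTPv2 common header;
the struct and function names are hypothetical and unrelated to the flow
dissector code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_P_1588  0x88F7
#define SK_ETH_HLEN    14
#define SK_PTP_HDR_LEN 34	/* size of the common PTPv2 header */

struct sk_ptp_info {
	uint8_t  msg_type;	/* low nibble of header byte 0 */
	uint8_t  version;	/* low nibble of header byte 1 */
	uint16_t msg_len;	/* header bytes 2-3, network byte order */
};

/* Parse an untagged PTP-over-Ethernet frame; 0 on success, -1 otherwise. */
static int sk_parse_ptp_l2(const uint8_t *frame, size_t len,
			   struct sk_ptp_info *out)
{
	const uint8_t *ptp;

	if (len < SK_ETH_HLEN + SK_PTP_HDR_LEN)
		return -1;
	if (((frame[12] << 8) | frame[13]) != SK_ETH_P_1588)
		return -1;

	ptp = frame + SK_ETH_HLEN;
	out->msg_type = ptp[0] & 0x0f;
	out->version = ptp[1] & 0x0f;
	out->msg_len = (uint16_t)((ptp[2] << 8) | ptp[3]);
	return 0;
}
```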

Eran Ben Elisha [Tue, 12 Jan 2021 19:07:12 +0000 (21:07 +0200)]

net: vlan: Add parse protocol header ops

Add parse protocol header ops for vlan device. Before this patch, vlan
tagged packet transmitted by af_packet had skb->protocol unset. Some
kernel methods (like __skb_flow_dissect()) rely on this information for
their packet processing.

Signed-off-by: Eran Ben Elisha <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
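
To show what "parsing the protocol" of a VLAN-tagged frame involves, the
userspace sketch below walks past any 802.1Q/802.1ad tags and returns the
encapsulated EtherType. It is an illustration only; the function name is
hypothetical and this is not the header op added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_P_8021Q  0x8100
#define SK_ETH_P_8021AD 0x88A8

/* Return the first non-VLAN EtherType in the frame, or 0 if truncated. */
static uint16_t sk_inner_ethertype(const uint8_t *frame, size_t len)
{
	size_t off = 12;	/* EtherType follows the two MAC addresses */

	while (off + 2 <= len) {
		uint16_t proto = (uint16_t)((frame[off] << 8) | frame[off + 1]);

		if (proto != SK_ETH_P_8021Q && proto != SK_ETH_P_8021AD)
			return proto;
		off += 4;	/* skip TPID (2 bytes) + TCI (2 bytes) */
	}
	return 0;
}
```

With a result such as 0x88F7 available, a dissector can pick the right
parser even though the outer EtherType only says "VLAN".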

Ayush Sawal [Wed, 13 Jan 2021 04:43:02 +0000 (10:13 +0530)]

ch_ipsec: Remove initialization of rxq related data

Removing initialization of nrxq and rxq_size in uld_info. As
ipsec uses nic queues only, there is no need to create uld
rx queues for ipsec.

Signed-off-by: Ayush Sawal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Jakub Kicinski [Fri, 15 Jan 2021 01:40:14 +0000 (17:40 -0800)]

Merge branch 'net-ipa-gsi-interrupt-updates'

Alex Elder says:

====================
net: ipa: GSI interrupt updates

This series implements some updates for the GSI interrupt code,
building on some bug fixes implemented last month.

The first two are simple changes made to improve readability and
consistency.  The third replaces all msleep() calls with comparable
usleep_range() calls.

The remainder make some more substantive changes to make the code
align with recommendations from Qualcomm.  The fourth implements a
much shorter timeout for completion GSI commands, and the fifth
implements a longer delay between retries of the STOP channel
command.  Finally, the last implements retries for stopping TX
channels (in addition to RX channels).
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Alex Elder [Wed, 13 Jan 2021 17:15:32 +0000 (11:15 -0600)]

net: ipa: retry TX channel stop commands

For RX channels we issue a stop command more than once if necessary
to allow it to stop. It turns out that TX channels could also
require retries.

Retry channel stop commands if necessary regardless of the channel
direction. Rename the symbol defining the retry count so it's not
RX-specific.

Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
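
The direction-independent retry logic can be sketched in userspace as
below. The retry budget, the callback types and the fake channel are all
hypothetical and exist only to make the example self-contained; they are
not the driver's symbols or values.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_STOP_RETRIES 10	/* hypothetical retry budget */

/* Hypothetical callbacks standing in for the stop command and the delay. */
typedef bool (*sk_stop_fn)(void *ctx);	/* true once the channel stopped */
typedef void (*sk_delay_fn)(void *ctx);

/* Issue the stop command up to 1 + SK_STOP_RETRIES times; 0 or -1. */
static int sk_channel_stop(sk_stop_fn stop, sk_delay_fn delay, void *ctx)
{
	unsigned int attempt;

	assert(stop && delay);
	for (attempt = 0; attempt <= SK_STOP_RETRIES; attempt++) {
		if (attempt)
			delay(ctx);	/* wait before each retry */
		if (stop(ctx))
			return 0;
	}
	return -1;
}

/* Fake channel: reports "still stopping" for the first busy_polls calls. */
struct sk_fake_chan {
	unsigned int busy_polls;
	unsigned int stops;
	unsigned int delays;
};

static bool sk_fake_stop(void *ctx)
{
	struct sk_fake_chan *chan = ctx;

	chan->stops++;
	if (chan->busy_polls) {
		chan->busy_polls--;
		return false;
	}
	return true;
}

static void sk_fake_delay(void *ctx)
{
	struct sk_fake_chan *chan = ctx;

	chan->delays++;
}
```

Nothing in sk_channel_stop() depends on whether the channel is TX or RX,
which is the point of the change.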

Alex Elder [Wed, 13 Jan 2021 17:15:31 +0000 (11:15 -0600)]

net: ipa: change stop channel retry delay

If a GSI stop channel command leaves the channel in STOP_IN_PROC
state, we retry the stop command after a 1-2 millisecond delay.

I have been told that a 3-5 millisecond delay is a better choice.

Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Alex Elder [Wed, 13 Jan 2021 17:15:30 +0000 (11:15 -0600)]

net: ipa: change GSI command timeout

The GSI command timeout is currently 5 seconds, which is much higher
than it should be.

Express the timeout in milliseconds rather than seconds, and reduce
it to 50 milliseconds.

Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Alex Elder [Wed, 13 Jan 2021 17:15:29 +0000 (11:15 -0600)]

net: ipa: use usleep_range()

The use of msleep() for small periods (less than 20 milliseconds) is
not recommended because the actual delay can be much different than
expected.

We use msleep(1) in several places in the IPA driver to insert short
delays. Replace them with usleep_range calls, which should reliably
delay a period in the range requested.

Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
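
To give a feel for how far off a short msleep() can be, the userspace
sketch below models the delay as "round the request up to whole jiffies,
then add one jiffy". That model is a simplifying assumption about the
timer-tick based implementation, and the function name is hypothetical.

```c
#include <assert.h>

/*
 * Simplified model of a jiffy-based sleep: the request is rounded up to
 * whole jiffies and one extra jiffy is added so that at least the
 * requested time elapses. Returns the resulting bound in microseconds.
 */
static unsigned long sk_msleep_bound_us(unsigned int msecs, unsigned int hz)
{
	unsigned long jiffies;

	assert(hz > 0 && hz <= 1000);
	jiffies = ((unsigned long)msecs * hz + 999) / 1000;	/* round up */
	return (jiffies + 1) * (1000000UL / hz);
}
```

Under this model a 1 ms request can take up to 20 ms with HZ=100 and up to
8 ms with HZ=250, whereas a usleep_range() call names its own lower and
upper bounds.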

Alex Elder [Wed, 13 Jan 2021 17:15:28 +0000 (11:15 -0600)]

net: ipa: introduce some interrupt helpers

Create a new function gsi_irq_ev_ctrl_enable() that encapsulates
enabling the event ring control GSI interrupt type, and enables a
single event ring to signal that interrupt. When an event ring
changes state as a result of an event ring command, it triggers this
interrupt.

Create an inverse function gsi_irq_ev_ctrl_disable() as well.
Because only one event ring at a time is enabled for this interrupt,
we can simply disable the interrupt for *all* channels.

Create a pair of helpers that serve the same purpose for channel
commands.

Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Alex Elder [Wed, 13 Jan 2021 17:15:27 +0000 (11:15 -0600)]

net: ipa: a few simple renames

The return value of gsi_command() is true if successful or false if
we time out waiting for a completion interrupt.

Rename the variables in the three callers of gsi_command() to be
"timeout", to make it more obvious that's the only reason for
failure.

In addition, add a "gsi_" prefix to evt_ring_command() so its name
is consistent with the convention used for GSI channel and generic
commands.

Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Jakub Kicinski [Fri, 15 Jan 2021 01:22:07 +0000 (17:22 -0800)]

Merge tag 'linux-can-next-for-5.12-20210114' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-01-14

The first two patches update the MAINTAINERS file: Lukas Bulwahn's patch fixes
the files entry for the tcan4x5x driver, which was broken by me in net-next,
and a patch by me adds a missing header file to the CAN Networking Layer.

The next 5 patches are by me and split the CAN driver related
infrastructure code into more files in a separate subdir. The next two patches
by me clean up the CAN length related code. This is followed by 6 patches by
Vincent Mailhol and me, which add helper code for the CAN frame length
calculation needed for BQL support.

A patch by Vincent Mailhol adds software TX timestamp support.

The last patch is by me, targets the tcan4x5x driver, and removes the unneeded
__packed attribute from the struct tcan4x5x_map_buf.

* tag 'linux-can-next-for-5.12-20210114' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: tcan4x5x: remove __packed attribute from struct tcan4x5x_map_buf
  can: dev: can_put_echo_skb(): add software tx timestamps
  can: dev: can_rx_offload_get_echo_skb(): extend to return can frame length
  can: dev: can_get_echo_skb(): extend to return can frame length
  can: dev: can_put_echo_skb(): extend to handle frame_len
  can: dev: extend struct can_skb_priv to hold CAN frame length
  can: length: can_skb_get_frame_len(): introduce function to get data length of frame in data link layer
  can: length: canfd_sanitize_len(): add function to sanitize CAN-FD data length
  can: length: can_fd_len2dlc(): simplify length calculcation
  can: length: convert to kernel coding style
  can: dev: move netlink related code into seperate file
  can: dev: move skb related into seperate file
  can: dev: move length related code into seperate file
  can: dev: move bittiming related code into seperate file
  can: dev: move driver related infrastructure into separate subdir
  MAINTAINERS: CAN network layer: add missing header file can-ml.h
  MAINTAINERS: adjust entry to tcan4x5x file split
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
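
For context on the length helpers mentioned in the pull request, here is a
userspace sketch of the CAN FD relationship between payload length and DLC.
The table reflects the standard CAN FD length steps; the sk_ function names
are hypothetical and only loosely modelled on the helpers listed above.

```c
#include <assert.h>
#include <stdint.h>

#define SK_CANFD_MAX_DLEN 64
#define SK_CANFD_MAX_DLC  15

/* Payload length for each 4-bit DLC value in CAN FD. */
static const uint8_t sk_dlc2len[16] = {
	0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64
};

/* Smallest DLC whose payload length can hold len bytes. */
static uint8_t sk_len2dlc(uint8_t len)
{
	uint8_t dlc = 0;

	if (len > SK_CANFD_MAX_DLEN)
		return SK_CANFD_MAX_DLC;
	while (sk_dlc2len[dlc] < len)
		dlc++;
	return dlc;
}

/* Round len up to the next length that CAN FD can actually transmit. */
static uint8_t sk_sanitize_len(uint8_t len)
{
	return sk_dlc2len[sk_len2dlc(len)];
}
```

A byte queue limit needs the on-wire size of each frame, so a 9-byte
payload has to be accounted as the 12 bytes that are really sent.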

Jakub Kicinski [Fri, 15 Jan 2021 01:11:59 +0000 (17:11 -0800)]

Merge branch 'net-dsa-link-aggregation-support'

Tobias Waldekranz says:

====================
net: dsa: Link aggregation support

Start off by adding an extra notification when adding a port to a bond,
this allows static LAGs to be offloaded using the bonding driver.

Then add the generic support required to offload link aggregates to
drivers built on top of the DSA subsystem.

Finally, implement offloading for the mv88e6xxx driver, i.e. Marvell's
LinkStreet family.

Supported LAG implementations:
- Bonding
- Team

Supported modes:
- Isolated. The LAG may be used as a regular interface outside of any
  bridge.
- Bridged. The LAG may be added to a bridge, in which case switching
  is offloaded between the LAG and any other switch ports. I.e. the
  LAG behaves just like a port from this perspective.

In bridged mode, the following is supported:
- STP filtering.
- VLAN filtering.
- Multicast filtering. The bridge correctly snoops IGMP and configures
  the proper groups if snooping is enabled. Static groups can also be
  configured. MLD seems to work, but has not been extensively tested.
- Unicast filtering. Automatic learning works. Static entries are
  _not_ supported. This will be added in a later series as it requires
  some more general refactoring in mv88e6xxx before I can test it.

v4 -> v5:
- Cleanup PVT configuration for LAGed ports in mv88e6xxx (Vladimir)
- Document dsa_lag_{map,unmap} (Vladimir)
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Tobias Waldekranz [Wed, 13 Jan 2021 08:42:55 +0000 (09:42 +0100)]

net: dsa: tag_dsa: Support reception of packets from LAG devices

Packets ingressing on a LAG that egress on the CPU port, which are not
classified as management, will have a FORWARD tag that does not
contain the normal source device/port tuple. Instead the trunk bit
will be set, and the port field holds the LAG id.

Since the exact source port information is not available in the tag,
frames are injected directly on the LAG interface and thus never
pass through any DSA port interface on ingress.

Management frames (TO_CPU) are not affected and will pass through the
DSA port interface as usual.

Signed-off-by: Tobias Waldekranz <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
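
The receive-side decision can be sketched in userspace as follows. The bit
positions (command in the top two bits of byte 0, source device in its low
five bits, port or LAG id in bits 7:3 of byte 1, trunk flag in bit 2 of
byte 1) are assumptions about the 4-byte DSA tag layout, and all names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sk_dsa_cmd {
	SK_TO_CPU = 0,
	SK_FROM_CPU = 1,
	SK_TO_SNIFFER = 2,
	SK_FORWARD = 3,
};

struct sk_dsa_src {
	enum sk_dsa_cmd cmd;
	bool is_lag;		/* true: port_or_lag holds a LAG id */
	uint8_t device;
	uint8_t port_or_lag;
};

/* Decode the source information from a 4-byte DSA tag (assumed layout). */
static struct sk_dsa_src sk_dsa_parse(const uint8_t tag[4])
{
	struct sk_dsa_src src;

	src.cmd = (enum sk_dsa_cmd)(tag[0] >> 6);
	src.device = tag[0] & 0x1f;
	src.port_or_lag = (tag[1] >> 3) & 0x1f;
	/* Only FORWARD frames can carry the trunk flag. */
	src.is_lag = src.cmd == SK_FORWARD && (tag[1] & 0x04);
	return src;
}
```

A receiver would hand the frame to the LAG interface when is_lag is set and
to the individual port interface otherwise.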

Tobias Waldekranz [Wed, 13 Jan 2021 08:42:54 +0000 (09:42 +0100)]

net: dsa: mv88e6xxx: Link aggregation support

Support offloading of LAGs to hardware. LAGs may be attached to a
bridge in which case VLANs, multicast groups, etc. are also offloaded
as usual.

Signed-off-by: Tobias Waldekranz <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Tobias Waldekranz [Wed, 13 Jan 2021 08:42:53 +0000 (09:42 +0100)]

net: dsa: Link aggregation support

Monitor the following events and notify the driver when:

- A DSA port joins/leaves a LAG.
- A LAG, made up of DSA ports, joins/leaves a bridge.
- A DSA port in a LAG is enabled/disabled (enabled meaning
"distributing" in 802.3ad LACP terms).

When a LAG joins a bridge, the DSA subsystem will treat that as each
individual port joining the bridge. The driver may look at the port's
LAG device pointer to see if it is associated with any LAG, if that is
required. This is analogous to how switchdev events are replicated out
to all lower devices when reaching e.g. a LAG.

Drivers can optionally request that DSA maintain a linear mapping from
a LAG ID to the corresponding netdev by setting ds->num_lag_ids to the
desired size.

In the event that the hardware is not capable of offloading a
particular LAG for any reason (the typical case being use of exotic
modes like broadcast), DSA will take a hands-off approach, allowing
the LAG to be formed as a pure software construct. This is reported
back through the extended ACK, but is otherwise transparent to the
user.

Signed-off-by: Tobias Waldekranz <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Tested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
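
The linear LAG ID mapping can be pictured with the userspace sketch below:
a fixed-size array indexed by ID, where mapping reuses an existing slot or
takes the first free one. The table size, the opaque pointers and the sk_
names are hypothetical; this is not the dsa_lag_map()/dsa_lag_unmap()
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NUM_LAG_IDS 4	/* stands in for ds->num_lag_ids */

struct sk_lag_table {
	const void *lags[SK_NUM_LAG_IDS];	/* LAG ID -> device, NULL if free */
};

/* Return the ID already assigned to lag, or -1 if it is not mapped. */
static int sk_lag_id(const struct sk_lag_table *t, const void *lag)
{
	for (int id = 0; id < SK_NUM_LAG_IDS; id++)
		if (t->lags[id] == lag)
			return id;
	return -1;
}

/* Map lag to an ID, reusing an existing mapping; -1 if the table is full. */
static int sk_lag_map(struct sk_lag_table *t, const void *lag)
{
	int id;

	assert(lag != NULL);
	id = sk_lag_id(t, lag);
	if (id >= 0)
		return id;

	for (id = 0; id < SK_NUM_LAG_IDS; id++) {
		if (!t->lags[id]) {
			t->lags[id] = lag;
			return id;
		}
	}
	return -1;	/* no free ID: the LAG cannot be offloaded */
}

/* Release the ID held by lag, if any. */
static void sk_lag_unmap(struct sk_lag_table *t, const void *lag)
{
	int id = sk_lag_id(t, lag);

	if (id >= 0)
		t->lags[id] = NULL;
}
```

Running out of IDs is one concrete reason a LAG might end up as a pure
software construct.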

Tobias Waldekranz [Wed, 13 Jan 2021 08:42:52 +0000 (09:42 +0100)]

net: dsa: Don't offload port attributes on standalone ports

In a situation where a standalone port is indirectly attached to a
bridge (e.g. via a LAG) which is not offloaded, do not offload any
port attributes either. The port should behave as a standard NIC.

Previously, on mv88e6xxx, this meant that in the following setup:

     br0
     /
  team0
   / \
swp0 swp1

If vlan filtering was enabled on br0, swp0's and swp1's QMode was set
to "secure". This caused all untagged packets to be dropped, as their
default VID (0) was not loaded into the VTU.

Signed-off-by: Tobias Waldekranz <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Tobias Waldekranz [Wed, 13 Jan 2021 08:42:51 +0000 (09:42 +0100)]

net: bonding: Notify ports about their initial state

When creating a static bond (e.g. balance-xor), all ports will always
be enabled. This is set, and the corresponding notification is sent
out, before the port is linked to the bond upper.

In the offloaded case, this ordering is hard to deal with.

The lower will first see a notification that it can not associate with
any bond. Then the bond is joined. After that point no more
notifications are sent, so all ports remain disabled.

This change simply sends an extra notification once the port has been
linked to the upper to synchronize the initial state.

Signed-off-by: Tobias Waldekranz <[email protected]>
Acked-by: Jay Vosburgh <[email protected]>
Tested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Christophe JAILLET [Thu, 14 Jan 2021 08:47:57 +0000 (09:47 +0100)]

mlxsw: pci: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'mlxsw_pci_queue_init()' and
'mlxsw_pci_fw_area_init()' GFP_KERNEL can be used because both functions
are already using this flag and no lock is acquired.

When memory is allocated in 'mlxsw_pci_mbox_alloc()' GFP_KERNEL can be used
because it is only called from the probe function and no lock is acquired
in between.
The call chain is:
  --> mlxsw_pci_probe()
    --> mlxsw_pci_cmd_init()
      --> mlxsw_pci_mbox_alloc()

While at it, also replace the 'dma_set_mask/dma_set_coherent_mask' sequence
by a less verbose 'dma_set_mask_and_coherent()' call.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Vladimir Oltean [Thu, 14 Jan 2021 08:35:56 +0000 (10:35 +0200)]

net: marvell: prestera: fix uninitialized vid in prestera_port_vlans_add

prestera_bridge_port_vlan_add should have been called with vlan->vid,
however this was masked by the presence of the local vid variable and I
did not notice the build warning.

Reported-by: kernel test robot <[email protected]>
Fixes: b7a9e0da2d1c ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Taras Chornyi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Eelco Chaudron [Wed, 13 Jan 2021 13:50:00 +0000 (14:50 +0100)]

net: openvswitch: add log message for error case

As requested by upstream OVS, added some error messages in the
validate_and_copy_dec_ttl function.

Includes a small cleanup, which removes an unnecessary parameter
from the dec_ttl_exception_handler() function.

Reported-by: Flavio Leitner <[email protected]>
Signed-off-by: Eelco Chaudron <[email protected]>
Acked-by: Flavio Leitner <[email protected]>
Link: https://lore.kernel.org/r/161054576573.26637.18396634650212670580.stgit@ebuild
Signed-off-by: Jakub Kicinski <[email protected]>

Jakub Kicinski [Fri, 15 Jan 2021 00:26:54 +0000 (16:26 -0800)]

Merge branch 'selftests-updates-to-allow-single-instance-of-nettest-for-client-and-server'

David Ahern says:

====================
selftests: Updates to allow single instance of nettest for client and server

Update nettest to handle namespace change internally to allow a
single instance to run both client and server modes. Device validation
needs to be moved after the namespace change and a few run time
options need to be split to allow values for client and server.

v4
- really fix the memory leak with stdout/stderr buffers

v3
- send proper status in do_server for UDP sockets
- fix memory leak with stdout/stderr buffers
- new patch with separate option for address binding
- new patch to remove unnecessary newline

v2
- fix checkpatch warnings
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:49 +0000 (20:09 -0700)]

selftests: Add separate option to nettest for address binding

Add separate option to nettest to specify local address
binding in client mode.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:48 +0000 (20:09 -0700)]

selftests: Remove exraneous newline in nettest

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:47 +0000 (20:09 -0700)]

selftests: Add separate options for server device bindings

Add new options to nettest to specify device binding and expected
device binding for server mode, and update fcnal-test script. This
is needed to allow a single instance of nettest running both server
and client modes to use different device bindings.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:46 +0000 (20:09 -0700)]

selftests: Add new option for client-side passwords

Add new option to nettest to specify MD5 password to use for client
side. Update fcnal-test script. This is needed for a single instance
running both server and client modes to test password mismatches.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:45 +0000 (20:09 -0700)]

selftests: Consistently specify address for MD5 protection

nettest started with -r as the remote address for MD5 passwords.
The -m argument was added to use prefixes with a length when that
feature was added to the kernel. Since -r is used to specify
remote address for client mode, change nettest to only use -m
for MD5 passwords and update fcnal-test script.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:44 +0000 (20:09 -0700)]

selftests: Make address validation apply only to client mode

When a single instance of nettest is used for client and server,
make sure address validation is only done for client mode.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:43 +0000 (20:09 -0700)]

selftests: Add missing newline in nettest error messages

A few logging lines are missing the newline, or need it moved up for
cleaner logging.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:42 +0000 (20:09 -0700)]

selftests: Use separate stdout and stderr buffers in nettest

When a single instance of nettest is doing both client and
server modes, stdout and stderr messages can get interlaced
and become unreadable. Allocate a new set of buffers for the
child process handling server mode.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:41 +0000 (20:09 -0700)]

selftests: Add support to nettest to run both client and server

Add option to nettest to run both client and server within a
single instance. Client forks a child process to run the server
code. A pipe is used for the server to tell the client it has
initialized and is ready or had an error. This avoids unnecessary
sleeps to handle such a race when the commands are separately launched.

Signed-off-by: Seth David Schoen <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
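
The fork-plus-pipe handshake described above can be sketched as follows.
The server_init hook, the status encoding and the function names are
hypothetical; in this sketch the child exits right after reporting, whereas
a real server would go on to serve requests.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a "server" child and block until it reports readiness over a pipe.
 * server_init is a hypothetical setup hook returning 0 on success.
 * Returns 1 if the server came up, 0 if it reported an error, -1 on failure.
 */
static int sk_start_server(int (*server_init)(void))
{
	int fd[2];
	int status = 0;
	pid_t pid;
	ssize_t n;

	if (pipe(fd) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fd[0]);
		close(fd[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: initialise, report the outcome, then exit. */
		close(fd[0]);
		status = server_init() == 0;
		if (write(fd[1], &status, sizeof(status)) != sizeof(status))
			_exit(2);
		_exit(status ? 0 : 1);
	}

	/* Parent (client side): read() blocks until the child has reported. */
	close(fd[1]);
	n = read(fd[0], &status, sizeof(status));
	close(fd[0]);
	waitpid(pid, NULL, 0);	/* reap the short-lived sketch child */

	return n == (ssize_t)sizeof(status) ? status : -1;
}

/* Trivial hooks used to exercise both outcomes. */
static int sk_init_ok(void)
{
	return 0;
}

static int sk_init_fail(void)
{
	return -1;
}
```

Because the parent blocks in read(), it proceeds as soon as the server is
ready instead of sleeping for a guessed interval.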

David Ahern [Thu, 14 Jan 2021 03:09:40 +0000 (20:09 -0700)]

selftests: Add options to set network namespace to nettest

Add options to specify server and client network namespace to
use before running respective functions.

Signed-off-by: Seth David Schoen <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:39 +0000 (20:09 -0700)]

selftests: Move address validation in nettest

IPv6 addresses can have a device name to declare a scope (e.g.,
fe80::5054:ff:fe12:3456%eth0). The next patch adds support to
switch network namespace before running client or server code
(or both), so move the address validation to the server and
client functions.

IPv4 multicast groups do not have the device scope in the address
specification, so they can be validated inline with option parsing.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
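
To illustrate the scoped address form, the sketch below splits "addr%dev"
into a binary IPv6 address and a device name. It deliberately stops short
of resolving the device to an interface index, since that lookup only makes
sense inside the target namespace. The function name is hypothetical and
this is not nettest's convert_addr().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "addr%dev" into a binary IPv6 address and a device name.
 * dev is set to "" when no scope is present. Returns 0 on success,
 * -1 on a malformed or oversized input.
 */
static int sk_split_scoped_addr(const char *str, struct in6_addr *addr,
				char *dev, size_t devlen)
{
	char buf[INET6_ADDRSTRLEN];
	const char *pct = strchr(str, '%');
	size_t alen = pct ? (size_t)(pct - str) : strlen(str);

	assert(devlen > 0);
	if (alen >= sizeof(buf))
		return -1;
	memcpy(buf, str, alen);
	buf[alen] = '\0';

	dev[0] = '\0';
	if (pct) {
		if (pct[1] == '\0' || strlen(pct + 1) >= devlen)
			return -1;
		strcpy(dev, pct + 1);
	}
	return inet_pton(AF_INET6, buf, addr) == 1 ? 0 : -1;
}
```

The returned device name would then be looked up (for example with
if_nametoindex()) after the namespace switch, which is why validation moves
into the client and server functions.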

David Ahern [Thu, 14 Jan 2021 03:09:38 +0000 (20:09 -0700)]

selftests: Move convert_addr up in nettest

convert_addr needs to be invoked in a different location. Move
the code up to avoid a forward declaration.

Code move only.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

David Ahern [Thu, 14 Jan 2021 03:09:37 +0000 (20:09 -0700)]

selftests: Move device validation in nettest

A later patch adds support for switching network namespaces before
running client, server or both. Device validations need to be
done after the network namespace switch, so add a helper to do it
and invoke in server and client code versus inline with argument
parsing. Move related argument checks as well.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Jakub Kicinski [Thu, 14 Jan 2021 23:40:36 +0000 (15:40 -0800)]

Merge branch 'add-100-base-x-mode'

Bjarni Jonasson says:

====================
Add 100 base-x mode

Adding support for 100 base-x in phylink.
The Sparx5 switch supports 100 base-x pcs (IEEE 802.3 Clause 24) 4b5b encoded.
These patches add phylink support for that mode.

Tested in Sparx5, using sfp modules:
Axcen 100fx AXFE-1314-0521 (base-fx)
Axcen 100lx AXFE-1314-0551 (base-lx)
HP SFP 100FX J9054C (bx-10)
Excom SFP-SX-M1002 (base-lx)

v1 -> v2:
  Added description to Documentation/networking/phy.rst
  Moved PHY_INTERFACE_MODE_100BASEX to above 1000BASEX
  Patching against net-next
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
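
As background on the 4b5b encoding mentioned in the cover letter, the
sketch below maps a data nibble to its 5-bit code group and derives the
resulting line rate. The table is the commonly published 4B/5B data code
set; the function names are hypothetical and nothing here reflects how the
Sparx5 PCS is programmed.

```c
#include <assert.h>
#include <stdint.h>

/* 5-bit code groups for data nibbles 0x0..0xF. */
static const uint8_t sk_4b5b[16] = {
	0x1E, 0x09, 0x14, 0x15, 0x0A, 0x0B, 0x0E, 0x0F,
	0x12, 0x13, 0x16, 0x17, 0x1A, 0x1B, 0x1C, 0x1D,
};

/* Encode one 4-bit data nibble as a 5-bit code group. */
static uint8_t sk_encode_nibble(uint8_t nibble)
{
	assert(nibble < 16);
	return sk_4b5b[nibble];
}

/* Every 4 data bits become 5 line bits, so the line runs 25% faster. */
static unsigned int sk_line_rate_mbaud(unsigned int data_mbps)
{
	return data_mbps * 5 / 4;
}
```

This is why a 100 Mb/s data stream is carried at 125 Mbaud on the wire.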

Bjarni Jonasson [Wed, 13 Jan 2021 11:56:26 +0000 (12:56 +0100)]

sfp: add support for 100 base-x SFPs

Add support for 100Base-FX, 100Base-LX, 100Base-PX and 100Base-BX10 modules.
This is needed for the Sparx-5 switch.

Signed-off-by: Bjarni Jonasson <[email protected]>
Reviewed-by: Russell King <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Bjarni Jonasson [Wed, 13 Jan 2021 11:56:25 +0000 (12:56 +0100)]

net: phy: Add 100 base-x mode

Sparx-5 supports this mode and it is missing in the PHY core.

Signed-off-by: Bjarni Jonasson <[email protected]>
Reviewed-by: Russell King <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Russell King [Tue, 12 Jan 2021 22:59:43 +0000 (22:59 +0000)]

net: phy: ar803x: disable extended next page bit

This bit is enabled by default and advertises extended next page (XNP)
support. XNP is only needed for 10GBase-T and MultiGig, neither of
which is supported. Additionally, Cisco MultiGig switches
will read this bit and attempt 10Gb negotiation even though Next Page
support is disabled. This will cause timeouts when the interface is
forced to 100Mbps and auto-negotiation will fail. The interfaces are
only 1000Base-T and supporting auto-negotiation for this only requires
the Next Page bit to be set.

Taken from:
https://github.com/SolidRun/linux-stable/commit/7406c5244b7ea6bc17a2afe8568277a8c4b126a9
and adapted to mainline kernels by rmk.

Signed-off-by: Russell King <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Linus Torvalds [Thu, 14 Jan 2021 21:54:09 +0000 (13:54 -0800)]

Merge tag 'linux-kselftest-fixes-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
"One single fix to skip BPF selftests by default.

  BPF selftests have a hard dependency on cutting edge versions of tools
  in the BPF ecosystem including LLVM.

  Skipping BPF by default will make it easier for users interested in
  running kselftest as a whole. Users can include BPF in the Kselftest
  build via the SKIP_TARGETS variable"

* tag 'linux-kselftest-fixes-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: Skip BPF seftests by default

Linus Torvalds [Thu, 14 Jan 2021 21:31:07 +0000 (13:31 -0800)]

Merge tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"We have a few fixes for long standing issues, in particular Eric's fix
  to not underestimate the skb sizes, and my fix for brokenness of
  register_netdevice() error path. They may uncover other bugs so we
  will keep an eye on them. Also included are Willem's fixes for
  kmap(_atomic).

  Looking at the "current release" fixes, it seems we are about one rc
  behind a normal cycle. We've previously seen an uptick of "people had
  run their test suites" / "humans actually tried to use new features"
  fixes between rc2 and rc3.

  Summary:

  Current release - regressions:

   - fix feature enforcement to allow NETIF_F_HW_TLS_TX if IP_CSUM &&
     IPV6_CSUM

   - dcb: accept RTM_GETDCB messages carrying set-like DCB commands if
     user is admin for backward-compatibility

   - selftests/tls: fix selftests build after adding ChaCha20-Poly1305

  Current release - always broken:

   - ppp: fix refcount underflow on channel unbridge

   - bnxt_en: clear DEFRAG flag in firmware message when retry flashing

   - smc: fix out of bound access in the new netlink interface

  Previous releases - regressions:

   - fix use-after-free with UDP GRO by frags

   - mptcp: better msk-level shutdown

   - rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM
     request

   - i40e: xsk: fix potential NULL pointer dereferencing

  Previous releases - always broken:

   - skb frag: kmap_atomic fixes

   - avoid 32 x truesize under-estimation for tiny skbs

   - fix issues around register_netdevice() failures

   - udp: prevent reuseport_select_sock from reading uninitialized socks

   - dsa: unbind all switches from tree when DSA master unbinds

   - dsa: clear devlink port type before unregistering slave netdevs

   - can: isotp: isotp_getname(): fix kernel information leak

   - mlxsw: core: Thermal control fixes

   - ipv6: validate GSO SKB against MTU before finish IPv6 processing

   - stmmac: use __napi_schedule() for PREEMPT_RT

   - net: mvpp2: remove Pause and Asym_Pause support

  Misc:

   - remove from MAINTAINERS folks who had been inactive for >5yrs"

* tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
  mptcp: fix locking in mptcp_disconnect()
  net: Allow NETIF_F_HW_TLS_TX if IP_CSUM && IPV6_CSUM
  MAINTAINERS: dccp: move Gerrit Renker to CREDITS
  MAINTAINERS: ipvs: move Wensong Zhang to CREDITS
  MAINTAINERS: tls: move Aviad to CREDITS
  MAINTAINERS: ena: remove Zorik Machulsky from reviewers
  MAINTAINERS: vrf: move Shrijeet to CREDITS
  MAINTAINERS: net: move Alexey Kuznetsov to CREDITS
  MAINTAINERS: altx: move Jay Cliburn to CREDITS
  net: avoid 32 x truesize under-estimation for tiny skbs
  nt: usb: USB_RTL8153_ECM should not default to y
  net: stmmac: fix taprio configuration when base_time is in the past
  net: stmmac: fix taprio schedule configuration
  net: tip: fix a couple kernel-doc markups
  net: sit: unregister_netdevice on newlink's error path
  net: stmmac: Fixed mtu channged by cache aligned
  cxgb4/chtls: Fix tid stuck due to wrong update of qid
  i40e: fix potential NULL pointer dereferencing
  net: stmmac: use __napi_schedule() for PREEMPT_RT
  can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check
  ...

commit | commitdiff | tree

Paolo Abeni [Thu, 14 Jan 2021 15:37:37 +0000 (16:37 +0100)]

mptcp: fix locking in mptcp_disconnect()

tcp_disconnect() expects the caller to acquire the sock lock,
but mptcp_disconnect() is not doing that. Add the missing
required lock.

Reported-by: Eric Dumazet <[email protected]>
Fixes: 76e2a55d1625 ("mptcp: better msk-level shutdown.")
Signed-off-by: Paolo Abeni <[email protected]>
Link: https://lore.kernel.org/r/f818e82b58a556feeb71dcccc8bf1c87aafc6175.1610638176.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Linus Torvalds [Thu, 14 Jan 2021 19:10:12 +0000 (11:10 -0800)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- memory leak fix for Wacom driver (Ping Cheng)

- various trivial small fixes, cleanups and device ID additions

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add product ID for MX Ergo in Bluetooth mode
  HID: Ignore battery for Elan touchscreen on ASUS UX550
  HID: logitech-dj: add the G602 receiver
  HID: wiimote: remove h from printk format specifier
  HID: uclogic: remove h from printk format specifier
  HID: sony: select CONFIG_CRC32
  HID: sfh: fix address space confusion
  HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad device
  HID: wacom: Fix memory leakage caused by kfifo_alloc

commit | commitdiff | tree

Tariq Toukan [Thu, 14 Jan 2021 15:12:15 +0000 (17:12 +0200)]

net: Allow NETIF_F_HW_TLS_TX if IP_CSUM && IPV6_CSUM

Cited patch below blocked the TLS TX device offload unless HW_CSUM
is set. This broke devices that use IP_CSUM && IP6_CSUM.
Here we fix it.

Note that the single HW_TLS_TX feature flag indicates support for
both IPv4/6, hence it should still be disabled in case only one of
(IP_CSUM | IPV6_CSUM) is set.
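
The rule described above can be sketched as follows. This is only an
illustration in Python, not the kernel code: the flag values and the helper
name fix_tls_tx() are hypothetical, and the real NETIF_F_* bits are defined
by the kernel in C.

```python
# Hypothetical flag values for illustration only; they do not match the
# kernel's real NETIF_F_* bit positions.
IP_CSUM = 1 << 0
IPV6_CSUM = 1 << 1
HW_CSUM = 1 << 2
HW_TLS_TX = 1 << 3


def fix_tls_tx(features: int) -> int:
    """Drop HW_TLS_TX unless checksum offload covers both IPv4 and IPv6."""
    both_ip_csum = bool(features & IP_CSUM) and bool(features & IPV6_CSUM)
    has_csum = bool(features & HW_CSUM) or both_ip_csum
    if (features & HW_TLS_TX) and not has_csum:
        features &= ~HW_TLS_TX
    return features


# HW_CSUM alone, or IP_CSUM together with IPV6_CSUM, keeps TLS TX offload.
print(bool(fix_tls_tx(HW_TLS_TX | HW_CSUM) & HW_TLS_TX))
print(bool(fix_tls_tx(HW_TLS_TX | IP_CSUM | IPV6_CSUM) & HW_TLS_TX))
# Only one of the two per-protocol flags is not enough.
print(bool(fix_tls_tx(HW_TLS_TX | IP_CSUM) & HW_TLS_TX))
```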

Fixes: ae0b04b238e2 ("net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled")
Signed-off-by: Tariq Toukan <[email protected]>
Reported-by: Rohit Maheshwari <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 18:53:50 +0000 (10:53 -0800)]

Merge branch 'maintainers-remove-inactive-folks-from-networking'

To make maintainers' lives easier we're trying to nudge people
towards CCing all the relevant folks on patches, in an attempt
to improve review rate. We have a check in patchwork which validates
the CC list against get_maintainers.pl. It's a little awkward, however,
to force people to CC maintainers who we haven't seen on the mailing
list for years. This series removes from maintainers folks who didn't
provide any tag (incl. authoring a patch) in the last 5 years.
To ensure reasonable signal to noise ratio we only considered
MAINTAINERS entries which had more than 100 patches fall under
them in that time period.

All this is purely a process-greasing exercise, I hope nobody
sees this series as an affront. Most folks are moved to CREDITS,
a couple entries are simply removed.

The following inactive maintainers are kept, because they indicated
the intention to come back in the near future:

- Veaceslav Falico (bonding)
- Christian Benvenuti (Cisco drivers)
- Felix Fietkau (mtk-eth)
- Mirko Linder (skge/sky2)

Patches in this series contain report from a script which did
the analysis. Big thanks to Jonathan Corbet for help and writing
the script (although I feel like I used it differently than Jon
may have intended ;)). The output format is thus:

Subsystem $name
  Changes $reviewed / $total ($percent%)           // how many changes to the subsystem had at least one ack/review
  Last activity: $date_of_most_recent_patch
  $maintainer/reviewer1:
    Author $last_commit_authored_by_the_person $how_many_in_5yrs
    Committer $last_committed $how_many
    Tags $last_tag_like_review_signoff_etc $how_many
  $maintainer/reviewer2:
    Author $last_commit_authored_by_the_person $how_many_in_5yrs
    Committer $last_committed $how_many
    Tags $last_tag_like_review_signoff_etc $how_many
  Top reviewers: // Top 3 reviewers (who are not listed in MAINTAINERS)
    [$count_of_reviews_and_acks]: $email
  INACTIVE MAINTAINER $name   // maintainer / reviewer who has done nothing in last 5yrs

v2:
- keep Felix and Mirko

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:12 +0000 (17:49 -0800)]

MAINTAINERS: dccp: move Gerrit Renker to CREDITS

As far as I can tell we haven't heard from Gerrit for roughly
5 years now. DCCP patch would really benefit from some review.
Gerrit was the last maintainer so mark this entry as orphaned.

Subsystem DCCP PROTOCOL
  Changes 38 / 166 (22%)
  (No activity)
  Top reviewers:
    [6]: [email protected]
    [6]: [email protected]
    [5]: [email protected]
  INACTIVE MAINTAINER Gerrit Renker <[email protected]>

Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:11 +0000 (17:49 -0800)]

MAINTAINERS: ipvs: move Wensong Zhang to CREDITS

Move Wensong Zhang to credits, we haven't heard from
him in years.

Subsystem IPVS
  Changes 83 / 226 (36%)
  Last activity: 2020-11-27
  Wensong Zhang <[email protected]>:
  Simon Horman <[email protected]>:
    Committer c24b75e0f923 2019-10-24 00:00:00 33
    Tags 7980d2eabde8 2020-10-12 00:00:00 76
  Julian Anastasov <[email protected]>:
    Author 7980d2eabde8 2020-10-12 00:00:00 26
    Tags 4bc3c8dc9f5f 2020-11-27 00:00:00 78
  Top reviewers:
    [6]: [email protected]
  INACTIVE MAINTAINER Wensong Zhang <[email protected]>

Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:10 +0000 (17:49 -0800)]

MAINTAINERS: tls: move Aviad to CREDITS

Aviad wrote parts of the initial TLS implementation
but hasn't been contributing to TLS since.

Subsystem NETWORKING [TLS]
  Changes 123 / 308 (39%)
  Last activity: 2020-12-01
  Boris Pismenny <[email protected]>:
    Tags 138559b9f99d 2020-11-17 00:00:00 1
  Aviad Yehezkel <[email protected]>:
  John Fastabend <[email protected]>:
    Author e91de6afa81c 2020-06-01 00:00:00 22
    Tags e91de6afa81c 2020-06-01 00:00:00 29
  Daniel Borkmann <[email protected]>:
    Author c16ee04c9b30 2018-10-20 00:00:00 7
    Committer b8e202d1d1d0 2020-02-21 00:00:00 19
    Tags b8e202d1d1d0 2020-02-21 00:00:00 28
  Jakub Kicinski <[email protected]>:
    Author 5c39f26e67c9 2020-11-27 00:00:00 89
    Committer d31c08007523 2020-12-01 00:00:00 15
    Tags d31c08007523 2020-12-01 00:00:00 117
  Top reviewers:
    [50]: [email protected]
    [26]: [email protected]
    [14]: [email protected]
  INACTIVE MAINTAINER Aviad Yehezkel <[email protected]>

Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:09 +0000 (17:49 -0800)]

MAINTAINERS: ena: remove Zorik Machulsky from reviewers

While ENA has 3 reviewers and 2 maintainers, we mostly see review
tags and comments from the maintainers. While we very much appreciate
Zorik's involvement in the community let's trim the reviewer list
down to folks we've seen tags from.

Subsystem AMAZON ETHERNET DRIVERS
  Changes 13 / 269 (4%)
  Last activity: 2020-11-24
  Netanel Belgazal <[email protected]>:
    Author 24dee0c7478d 2019-12-10 00:00:00 43
    Tags 0e3a3f6dacf0 2020-07-21 00:00:00 47
  Arthur Kiyanovski <[email protected]>:
    Author 0e3a3f6dacf0 2020-07-21 00:00:00 79
    Tags 09323b3bca95 2020-11-24 00:00:00 104
  Guy Tzalik <[email protected]>:
    Tags 713865da3c62 2020-09-10 00:00:00 3
  Saeed Bishara <[email protected]>:
    Tags 470793a78ce3 2020-02-11 00:00:00 2
  Zorik Machulsky <[email protected]>:
  Top reviewers:
    [4]: [email protected]
    [3]: [email protected]
    [3]: [email protected]
  INACTIVE MAINTAINER Zorik Machulsky <[email protected]>

Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:08 +0000 (17:49 -0800)]

MAINTAINERS: vrf: move Shrijeet to CREDITS

Shrijeet has moved on from VRF-related work.

Subsystem VRF
  Changes 30 / 120 (25%)
  Last activity: 2020-12-09
  David Ahern <[email protected]>:
    Author 1b6687e31a2d 2020-07-23 00:00:00 1
    Tags 9125abe7b9cb 2020-12-09 00:00:00 4
  Shrijeet Mukherjee <[email protected]>:
  Top reviewers:
    [13]: [email protected]
    [4]: [email protected]
  INACTIVE MAINTAINER Shrijeet Mukherjee <[email protected]>

Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:07 +0000 (17:49 -0800)]

MAINTAINERS: net: move Alexey Kuznetsov to CREDITS

Move Alexey to CREDITS.

I am probably not giving him enough justice with
the description line..

Subsystem NETWORKING [IPv4/IPv6]
  Changes 1535 / 5111 (30%)
  Last activity: 2020-12-10
  "David S. Miller" <[email protected]>:
    Author b7e4ba9a91df 2020-12-09 00:00:00 407
    Committer e0fecb289ad3 2020-12-10 00:00:00 3992
    Tags e0fecb289ad3 2020-12-10 00:00:00 3978
  Alexey Kuznetsov <[email protected]>:
  Hideaki YOSHIFUJI <[email protected]>:
    Tags d5d8760b78d0 2016-06-16 00:00:00 8
  Top reviewers:
    [225]: [email protected]
    [222]: [email protected]
    [176]: [email protected]
  INACTIVE MAINTAINER Alexey Kuznetsov <[email protected]>

Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:49:06 +0000 (17:49 -0800)]

MAINTAINERS: altx: move Jay Cliburn to CREDITS

Jay was not active in recent years and does not have plans
to return to work on ATLX drivers.

Subsystem ATLX ETHERNET DRIVERS
  Changes 20 / 116 (17%)
  Last activity: 2020-02-24
  Jay Cliburn <[email protected]>:
  Chris Snook <[email protected]>:
    Tags ea973742140b 2020-02-24 00:00:00 1
  Top reviewers:
    [4]: [email protected]
    [2]: [email protected]
    [2]: [email protected]
  INACTIVE MAINTAINER Jay Cliburn <[email protected]>

Acked-by: Chris Snook <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Eric Dumazet [Wed, 13 Jan 2021 16:18:19 +0000 (08:18 -0800)]

net: avoid 32 x truesize under-estimation for tiny skbs

Both virtio net and napi_get_frags() allocate skbs
with a very small skb->head

While using page fragments instead of a kmalloc backed skb->head might give
a small performance improvement in some cases, there is a huge risk of
under estimating memory usage.

For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations
per page (order-3 page in x86), or even 64 on PowerPC
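
A rough worked example of where the factors 32 and 64 come from. The sizes
are assumptions for illustration (4 KiB base pages on x86, a 64 KiB page on
PowerPC, and a hypothetical 1024-byte footprint per small skb head); the
helper name is made up.

```python
def worst_case_underestimate(frag_page_bytes: int, head_bytes: int) -> int:
    """Number of small heads that share one page-fragment page.

    If a single long-lived skb keeps the page alive, the whole page stays
    pinned while only about head_bytes is accounted, so this ratio is also
    the worst-case under-estimation factor.
    """
    return frag_page_bytes // head_bytes


X86_ORDER3_BYTES = 4096 << 3   # order-3 allocation with 4 KiB pages: 32768
PPC_PAGE_BYTES = 65536         # assumed 64 KiB page
HEAD_BYTES = 1024              # hypothetical per-head footprint

print(worst_case_underestimate(X86_ORDER3_BYTES, HEAD_BYTES))  # 32
print(worst_case_underestimate(PPC_PAGE_BYTES, HEAD_BYTES))    # 64
```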

We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
but consuming far more memory for TCP buffers than instructed in tcp_mem[2]

Even if we force napi_alloc_skb() to only use order-0 pages, the issue
would still be there on arches with PAGE_SIZE >= 32768

This patch makes sure that small skb heads are kmalloc backed, so that
other objects in the slab page can be reused instead of being held as long
as skbs are sitting in socket queues.

Note that we might in the future use the sk_buff napi cache,
instead of going through a more expensive __alloc_skb()

Another idea would be to use separate page sizes depending
on the allocated length (to never have more than 4 frags per page)

I would like to thank Greg Thelen for his precious help on this matter,
analysing crash dumps is always a time consuming task.

Fixes: fd11a83dd363 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Greg Thelen <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Geert Uytterhoeven [Wed, 13 Jan 2021 14:43:09 +0000 (15:43 +0100)]

nt: usb: USB_RTL8153_ECM should not default to y

In general, device drivers should not be enabled by default.

Fixes: 657bc1d10bfc23ac ("r8153_ecm: avoid to be prior to r8152 driver")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Yannick Vignon [Wed, 13 Jan 2021 13:15:57 +0000 (14:15 +0100)]

net: stmmac: fix taprio configuration when base_time is in the past

The Synopsys TSN MAC supports Qbv base times in the past, but only up to a
certain limit. As a result, a taprio qdisc configuration with a small
base time (for example when treating the base time as a simple phase
offset) is not applied by the hardware and silently ignored.

This was observed on an NXP i.MX8MPlus device, but likely affects all
TSN-variants of the MAC.

Fix the issue by making sure the base time is in the future, pushing it by
an integer amount of cycle times if needed. (a similar check is already
done in several other taprio implementations, see for example
drivers/net/ethernet/intel/igc/igc_tsn.c#L116 or
drivers/net/dsa/sja1105/sja1105_ptp.h#L39).
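
The adjustment can be sketched like this. It is a minimal Python illustration
of "push the base time forward by a whole number of cycles", not the driver
code; the function name and the nanosecond integer arguments are assumptions.

```python
def push_base_time(base_ns: int, now_ns: int, cycle_ns: int) -> int:
    """Return a base time strictly after now_ns with the same phase as base_ns."""
    if cycle_ns <= 0:
        raise ValueError("cycle time must be positive")
    if base_ns > now_ns:
        return base_ns  # already in the future, nothing to do
    # Smallest whole number of cycles that moves base_ns past now_ns.
    cycles = (now_ns - base_ns) // cycle_ns + 1
    return base_ns + cycles * cycle_ns


# Base time used as a simple phase offset of 100 ns with a 200 ns cycle.
print(push_base_time(100, 1050, 200))  # 1100
```

Because only whole cycles are added, the schedule keeps the phase the user
asked for.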

Fixes: b60189e0392f ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Signed-off-by: Yannick Vignon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Yannick Vignon [Wed, 13 Jan 2021 13:15:56 +0000 (14:15 +0100)]

net: stmmac: fix taprio schedule configuration

When configuring an 802.1Qbv schedule through the tc taprio qdisc on an NXP
i.MX8MPlus device, the effective cycle time differed from the requested one
by N*96ns, with N the number of entries in the Qbv Gate Control List. This is
because the driver was adding a 96ns margin to each interval of the GCL,
apparently to account for the IPG. The problem was observed on NXP
i.MX8MPlus devices but likely affected all devices relying on the same
configuration callback (dwmac 4.00, 4.10, 5.10 variants).

Fix the issue by removing the margins, and simply set up the MAC with the
provided cycle time value. This is the behavior expected by the user-space
API, as altering the Qbv schedule timings would break standards conformance.
This is also the behavior of several other Ethernet MAC implementations
supporting taprio, including the dwxgmac variant of stmmac.
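
A small worked example of the N*96ns drift, written in Python purely for
illustration. The function names and the four-entry gate control list are
hypothetical; only the 96 ns per-entry margin comes from the description
above.

```python
IPG_MARGIN_NS = 96  # per-entry margin the driver used to add


def cycle_with_margin(intervals_ns):
    """Effective cycle time when every GCL entry is stretched by the margin."""
    return sum(interval + IPG_MARGIN_NS for interval in intervals_ns)


def cycle_without_margin(intervals_ns):
    """Effective cycle time when the intervals are programmed as requested."""
    return sum(intervals_ns)


gcl = [250_000, 250_000, 250_000, 250_000]  # hypothetical 1 ms schedule
print(cycle_with_margin(gcl) - cycle_without_margin(gcl))  # 384
```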

Fixes: 504723af0d85 ("net: stmmac: Add basic EST support for GMAC5+")
Signed-off-by: Yannick Vignon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 14 Jan 2021 08:04:48 +0000 (09:04 +0100)]

net: tip: fix a couple kernel-doc markups

Two functions have names that differ between their prototypes
and their kernel-doc markups:

../net/tipc/link.c:2551: warning: expecting prototype for link_reset_stats(). Prototype was for tipc_link_reset_stats() instead
../net/tipc/node.c:1678: warning: expecting prototype for is the general link level function for message sending(). Prototype was for tipc_node_xmit() instead

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Jakub Kicinski [Thu, 14 Jan 2021 01:29:47 +0000 (17:29 -0800)]

net: sit: unregister_netdevice on newlink's error path

We need to unregister the netdevice if config failed.
.ndo_uninit takes care of most of the heavy lifting.

This was uncovered by recent commit c269a24ce057 ("net: make
free_netdev() more lenient with unregistering devices").
Previously the partially-initialized device would be left
in the system.

Reported-and-tested-by: [email protected]
Fixes: e2f1f072db8d ("sit: allow to configure 6rd tunnels via netlink")
Acked-by: Nicolas Dichtel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Nicholas Miell [Mon, 11 Jan 2021 06:09:25 +0000 (22:09 -0800)]

HID: logitech-hidpp: Add product ID for MX Ergo in Bluetooth mode

The Logitech MX Ergo trackball supports HID++ 4.5 over Bluetooth. Add its
product ID to the table so we can get battery monitoring support.
(The hid-logitech-hidpp driver already recognizes it when connected via
a Unifying Receiver.)

[[email protected]: fix whitespace damage]
Signed-off-by: Nicholas Miell <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

commit | commitdiff | tree

Marc Kleine-Budde [Wed, 13 Jan 2021 20:00:53 +0000 (21:00 +0100)]

can: tcan4x5x: remove __packed attribute from struct tcan4x5x_map_buf

The first member of struct tcan4x5x_map_buf is the struct tcan4x5x_buf_cmd,
which has a size of 4 bytes. It's followed by an array of u8. The compiler
places the array directly after the struct tcan4x5x_buf_cmd.

This patch removes the unneeded __packed attribute from struct
tcan4x5x_map_buf.
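
One way to see why the attribute is redundant, sketched with Python's struct
module rather than kernel C. The layout is an assumption for illustration: a
4-byte command header followed by 8 single-byte slots (the real array length
is not stated here).

```python
import struct

# "@" uses native sizes and alignment, like an ordinary C struct.
# "=" uses standard sizes with no alignment padding, similar to __packed.
native = struct.calcsize("@4s8B")
packed = struct.calcsize("=4s8B")
print(native, packed)  # 12 12

# Contrast: a 4-byte integer after a 5-byte header occupies 9 bytes when
# packed, while native alignment inserts padding before the integer.
print(struct.calcsize("=5sI"))  # 9
print(struct.calcsize("@5sI") > struct.calcsize("=5sI"))
```

A byte array has an alignment requirement of one, so nothing needs to be
inserted between the header and the array in either layout.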

Suggested-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Vincent Mailhol [Tue, 12 Jan 2021 09:54:37 +0000 (18:54 +0900)]

can: dev: can_put_echo_skb(): add software tx timestamps

Call skb_tx_timestamp() within can_put_echo_skb() so that a software tx
timestamp gets attached to the skb.

There are two main reasons to include this call in can_put_echo_skb():

  * It makes it easy to enable the tx timestamp on all devices with
    just one small change.

  * According to Documentation/networking/timestamping.rst, the tx
    timestamps should be generated in the device driver as close as possible,
    but always prior to passing the packet to the network interface. During the
    call to can_put_echo_skb(), the skb gets cloned meaning that the driver
    should not dereference the skb variable anymore after can_put_echo_skb()
    returns. This makes can_put_echo_skb() the very last place we can use the
    skb without having to access the echo_skb[] array.

Remark: by default, skb_tx_timestamp() does nothing. It needs to be activated
by passing the SOF_TIMESTAMPING_TX_SOFTWARE flag either through socket options
or control messages.

References:

  * Support for the error queue in CAN RAW sockets (which is needed for
    tx timestamps) was introduced in:
    https://git.kernel.org//torvalds/c/eb88531bdbfaafb827192d1fc6c5a3fcc4fadd96

  * Put the call to skb_tx_timestamp() just before adding it to the
    array:
    https://lore.kernel.org/r/043c3ea1-6bdd-59c0-0269-27b2b5b36cec@victronenergy.com

  * About Tx hardware timestamps
    https://lore.kernel.org/r/20210111171152 [email protected]

Signed-off-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Marc Kleine-Budde [Mon, 11 Jan 2021 14:19:29 +0000 (15:19 +0100)]

can: dev: can_rx_offload_get_echo_skb(): extend to return can frame length

In order to implement byte queue limits (bql) in CAN drivers, the length of the
CAN frame needs to be passed into the networking stack after queueing and after
transmission completion.

To avoid calculating this length twice, extend can_rx_offload_get_echo_skb()
to return that value. Convert all users of this function, too.

Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Marc Kleine-Budde [Mon, 11 Jan 2021 14:19:28 +0000 (15:19 +0100)]

can: dev: can_get_echo_skb(): extend to return can frame length

In order to implement byte queue limits (bql) in CAN drivers, the length of the
CAN frame needs to be passed into the networking stack after queueing and after
transmission completion.

To avoid calculating this length twice, extend can_get_echo_skb() to return
that value. Convert all users of this function, too.

Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Vincent Mailhol [Mon, 11 Jan 2021 14:19:27 +0000 (15:19 +0100)]

can: dev: can_put_echo_skb(): extend to handle frame_len

Add a frame_len argument to can_put_echo_skb() which is used to save the
length of the CAN frame in the frame_len field of struct can_skb_priv so that
it can be used later, after transmission completion. Convert all users of this
function, too.

Drivers which implement BQL call can_put_echo_skb() with the output of
can_skb_get_frame_len(skb) and drivers which do not simply pass zero as an
input (in the same way that NULL would be given to can_get_echo_skb()). This
way, we have a nice symmetry between the two echo functions.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vincent Mailhol <[email protected]>

commit | commitdiff | tree

Marc Kleine-Budde [Mon, 11 Jan 2021 14:19:26 +0000 (15:19 +0100)]

can: dev: extend struct can_skb_priv to hold CAN frame length

In order to implement byte queue limits (bql) in CAN drivers, the length of the
CAN frame needs to be passed into the networking stack after queueing and after
transmission completion.

To avoid calculating this length twice, extend the struct can_skb_priv to hold
the length of the CAN frame and extend __can_get_echo_skb() to return that
value.

Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Vincent Mailhol [Mon, 11 Jan 2021 14:19:25 +0000 (15:19 +0100)]

can: length: can_skb_get_frame_len(): introduce function to get data length of frame in data link layer

This patch adds the function can_skb_get_frame_len() which returns the length
of a CAN frame on the data link layer, including Start-of-frame, Identifier,
various other bits, the actual data, the CRC, the End-of-frame, the Inter frame
spacing.

Co-developed-by: Arunachalam Santhanam <[email protected]>
Signed-off-by: Arunachalam Santhanam <[email protected]>
Co-developed-by: Vincent Mailhol <[email protected]>
Signed-off-by: Vincent Mailhol <[email protected]>
Acked-by: Vincent Mailhol <[email protected]>
Reviewed-by: Vincent Mailhol <[email protected]>
Co-developed-by: Marc Kleine-Budde <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Marc Kleine-Budde [Mon, 11 Jan 2021 14:19:24 +0000 (15:19 +0100)]

can: length: canfd_sanitize_len(): add function to sanitize CAN-FD data length

The data field in CAN-FD frames has specific valid lengths (0, 1, 2, 3, 4, 5,
6, 7, 8, 12, 16, 20, 24, 32, 48, 64). This function "rounds" up a given length
to the next valid CAN-FD frame length.
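
The rounding can be sketched as below. This is a Python illustration under
stated assumptions, not the kernel implementation: the function name
sanitize_len() is hypothetical, and clamping anything above 64 to 64 is an
assumption about out-of-range input.

```python
import bisect

# Valid CAN-FD data lengths, as listed above.
VALID_LENGTHS = (0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64)


def sanitize_len(length: int) -> int:
    """Round length up to the next valid CAN-FD data length."""
    if length < 0:
        raise ValueError("length must not be negative")
    if length > VALID_LENGTHS[-1]:
        return VALID_LENGTHS[-1]  # assumption: clamp oversized input to 64
    # Index of the first valid length that is >= length.
    return VALID_LENGTHS[bisect.bisect_left(VALID_LENGTHS, length)]


print(sanitize_len(9))   # 12
print(sanitize_len(25))  # 32
```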

Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

Marc Kleine-Budde [Mon, 11 Jan 2021 14:19:23 +0000 (15:19 +0100)]

can: length: can_fd_len2dlc(): simplify length calculcation

If the length parameter in len2dlc() exceeds the size of the len2dlc array, we
return 0xF. This is equal to the last 16 members of the array.

This patch removes these members from the array, uses ARRAY_SIZE() for the
length check, and returns CANFD_MAX_DLC (which is 0xf).
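
A sketch of the shortened table in Python, for illustration only. The table
contents are reconstructed from the valid CAN-FD lengths and should be read
as an assumption rather than a copy of the kernel array; len() stands in for
ARRAY_SIZE().

```python
CANFD_MAX_DLC = 0xF

# One entry per length 0..48; lengths 49..64 are no longer stored because
# they all map to CANFD_MAX_DLC.
LEN2DLC = (
    [0, 1, 2, 3, 4, 5, 6, 7, 8]  # 0-8
    + [9] * 4                    # 9-12
    + [10] * 4                   # 13-16
    + [11] * 4                   # 17-20
    + [12] * 4                   # 21-24
    + [13] * 8                   # 25-32
    + [14] * 16                  # 33-48
)


def len2dlc(length: int) -> int:
    """Map a data length to its DLC, using the table size as the bound."""
    if length < 0:
        raise ValueError("length must not be negative")
    if length >= len(LEN2DLC):
        return CANFD_MAX_DLC
    return LEN2DLC[length]


print(len(LEN2DLC), len2dlc(48), len2dlc(49), len2dlc(64))  # 49 14 15 15
```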

Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>

commit | commitdiff | tree

David Wu [Wed, 13 Jan 2021 03:41:09 +0000 (11:41 +0800)]

net: stmmac: Fixed mtu channged by cache aligned

Since the original MTU is not used when the MTU is updated, the
cache-aligned value is stored instead, which yields an incorrect MTU.
For example, if you configure the MTU to be 1500, an MTU of 1536 is
actually configured.
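
The 1500 versus 1536 figures are consistent with rounding up to a 64-byte
boundary. A minimal Python sketch, assuming a 64-byte cache line (the
alignment value and the helper name are assumptions, not taken from the
driver):

```python
def align_up(value: int, boundary: int) -> int:
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary


CACHE_LINE_BYTES = 64  # assumed cache line size
requested_mtu = 1500

print(align_up(requested_mtu, CACHE_LINE_BYTES))  # 1536
```

The aligned value is useful for sizing buffers, but the value reported as the
device MTU should remain the requested 1500.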

Fixes: eaf4fac478077 ("net: stmmac: Do not accept invalid MTU values")
Signed-off-by: David Wu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Ayush Sawal [Tue, 12 Jan 2021 05:36:00 +0000 (11:06 +0530)]

cxgb4/chtls: Fix tid stuck due to wrong update of qid

A stuck TID is seen when CPL_PASS_ACCEPT_RPL and CPL_ABORT_REQ
race and the abort arrives before the accept reply, which sets
the queue number. In this case the HW ends up sending
CPL_ABORT_RPL_RSS to an incorrect ingress queue.

V1->V2:
- Removed the unused variable len in chtls_set_quiesce_ctrl().

V2->V3:
- kfree_skb() already checks for a NULL skb, so removed this
check before calling kfree_skb() in chtls_send_reset().

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <[email protected]>
Signed-off-by: Ayush Sawal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Yuchung Cheng [Mon, 11 Jan 2021 23:05:52 +0000 (15:05 -0800)]

tcp: assign skb hash after tcp_event_data_sent

Move skb_set_hash_from_sk s.t. it's called after instead of before
tcp_event_data_sent is called. This enables congestion control
modules to change the socket hash right before restarting from
idle (via the TX_START congestion event).

Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Cristian Dumitrescu [Mon, 11 Jan 2021 18:11:38 +0000 (18:11 +0000)]

i40e: fix potential NULL pointer dereferencing

Currently, the function i40e_construct_skb_zc only frees the input xdp
buffer when the output skb is successfully built. On error, the
function i40e_clean_rx_irq_zc does not commit anything for the current
packet descriptor and simply exits the packet descriptor processing
loop, with the plan to restart the processing of this descriptor on
the next invocation. Therefore, on error the ring next-to-clean
pointer should not advance, the xdp i.e. *bi buffer should not be
freed and the current buffer info should not be invalidated by setting
*bi to NULL. Therefore, the *bi should only be set to NULL when the
function i40e_construct_skb_zc is successful, otherwise a NULL *bi
will be dereferenced when the work for the current descriptor is
eventually restarted.

Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Signed-off-by: Cristian Dumitrescu <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Ioana Ciornei [Mon, 11 Jan 2021 17:18:02 +0000 (19:18 +0200)]

dpaa2-mac: fix the remove path for non-MAC interfaces

Check if the interface is indeed connected to a MAC before trying to
close the DPMAC object representing it. Without this check we end up
working with a NULL pointer.

Fixes: d87e606373f6 ("dpaa2-mac: export MAC counters even when in TYPE_FIXED")
Signed-off-by: Ioana Ciornei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

commit | commitdiff | tree

Ionut-robert Aron [Mon, 11 Jan 2021 17:07:25 +0000 (19:07 +0200)]

dpaa2-eth: add support for Rx VLAN filtering

Declare Rx VLAN filtering as supported and user-changeable only when
there are VLAN filtering entries available on the DPNI object. Even
then, rx-vlan-filtering is by default disabled.
Also, populate the .ndo_vlan_rx_add_vid() and .ndo_vlan_rx_kill_vid()
callbacks for adding and removing a specific VLAN from the VLAN table.

Signed-off-by: Ionut-robert Aron <[email protected]>
Signed-off-by: Ioana Ciornei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
