Git Repo - linux.git/log
linux.git
4 years ago  Merge branch 'mptcp-genl-events'
David S. Miller [Sat, 13 Feb 2021 00:31:46 +0000 (16:31 -0800)]
Merge branch 'mptcp-genl-events'

Mat Martineau says:

====================
mptcp: Add genl events for connection info

This series from the MPTCP tree adds genl multicast events that are
important for implementing a userspace path manager. In MPTCP, a path
manager is responsible for adding or removing additional subflows on
each MPTCP connection. The in-kernel path manager (already part of the
kernel) is a better fit for many server use cases, but the additional
flexibility of userspace path managers is often useful for client
devices.

Patches 1, 2, 4, 5, and 6 do some refactoring to streamline the netlink
event implementation in the final patch.

Patch 3 improves the timeliness of subflow destruction to ensure the
'subflow closed' event will be sent soon enough.

Patch 7 allows use of the GENL_UNS_ADMIN_PERM flag on genl mcast groups
to mandate CAP_NET_ADMIN, which is important to protect token information
in the MPTCP events. This is a genetlink change.

Patch 8 adds the MPTCP netlink events.
====================

Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: add netlink event support
Florian Westphal [Sat, 13 Feb 2021 00:00:01 +0000 (16:00 -0800)]
mptcp: add netlink event support

Allow userspace (mptcpd) to subscribe to mptcp genl multicast events.
This implementation reuses the same event API as the mptcp kernel fork
to ease integration of existing tools, e.g. mptcpd.

Supported events include:
1. start and close of an mptcp connection
2. start and close of subflows (joins)
3. announce and withdrawals of addresses
4. subflow priority (backup/non-backup) change.

Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: avoid lock_fast usage in accept path
Florian Westphal [Fri, 12 Feb 2021 23:59:59 +0000 (15:59 -0800)]
mptcp: avoid lock_fast usage in accept path

Once event support is added this may need to allocate memory while msk
lock is held with softirqs disabled.

Not using lock_fast also allows the allocation to be done with GFP_KERNEL.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: pass subflow socket to a few helpers
Florian Westphal [Fri, 12 Feb 2021 23:59:58 +0000 (15:59 -0800)]
mptcp: pass subflow socket to a few helpers

Pass the first/initial subflow to the existing functions so they can
pass this on to the notification handler that is added later in the
series.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: move subflow close loop after sk close check
Florian Westphal [Fri, 12 Feb 2021 23:59:57 +0000 (15:59 -0800)]
mptcp: move subflow close loop after sk close check

In case mptcp socket is already dead the entire mptcp socket
will be freed. We can avoid the close check in this case.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: schedule worker when subflow is closed
Florian Westphal [Fri, 12 Feb 2021 23:59:56 +0000 (15:59 -0800)]
mptcp: schedule worker when subflow is closed

When remote side closes a subflow we should schedule the worker to
dispose of the subflow in a timely manner.

Otherwise, SF_CLOSED event won't be generated until the mptcp
socket itself is closing or local side is closing another subflow.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: split __mptcp_close_ssk helper
Florian Westphal [Fri, 12 Feb 2021 23:59:55 +0000 (15:59 -0800)]
mptcp: split __mptcp_close_ssk helper

Prepare for subflow close events:

When an mptcp connection is torn down, it's enough to send the mptcp socket
close notification rather than a subflow close event for all of the
subflows followed by the mptcp close event.

This splits the helper: mptcp_close_ssk() will emit the close
notification, __mptcp_close_ssk will not.
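
The split described above can be sketched in plain userspace C. This is only an illustration of the "wrapper emits the event, inner helper stays quiet" pattern; every name below (sketch_conn, sketch_close_subflow and so on) is hypothetical and none of it is the actual MPTCP code.

```c
/* Hypothetical connection state, for illustration only. */
struct sketch_conn {
	unsigned int open_subflows;
	unsigned int subflow_closed_events;
	unsigned int conn_closed_events;
};

/* Inner helper: closes one subflow without emitting any event. */
static void sketch_close_subflow_quiet(struct sketch_conn *conn)
{
	if (conn->open_subflows)
		conn->open_subflows--;
}

/* Wrapper: closes one subflow and emits a subflow-closed event. */
static void sketch_close_subflow(struct sketch_conn *conn)
{
	sketch_close_subflow_quiet(conn);
	conn->subflow_closed_events++;
}

/* Teardown: uses the quiet helper for every remaining subflow, then
 * emits a single connection-level close event. */
static void sketch_close_conn(struct sketch_conn *conn)
{
	while (conn->open_subflows)
		sketch_close_subflow_quiet(conn);
	conn->conn_closed_events++;
}
```

With this shape, tearing down a connection produces one close event regardless of how many subflows were still open.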

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  mptcp: move pm netlink work into pm_netlink
Florian Westphal [Fri, 12 Feb 2021 23:59:54 +0000 (15:59 -0800)]
mptcp: move pm netlink work into pm_netlink

Allows some functions to be made static and avoids acquiring the pm
spinlock in protocol.c.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  Merge branch 'mptcp-selftests'
David S. Miller [Sat, 13 Feb 2021 00:20:34 +0000 (16:20 -0800)]
Merge branch 'mptcp-selftests'

Mat Martineau says:

====================
mptcp: Selftest enhancement and fixes

This is a collection of selftest updates from the MPTCP tree.

Patch 1 uses additional 'ss' command line parameters and 'nstat' to
improve output when certain MPTCP tests fail.

Patches 2 & 3 fix a copy/paste error and some output formatting.

Patch 4 makes sure tests still pass if certain connection-related
packets are retransmitted.
====================

Signed-off-by: David S. Miller <[email protected]>
4 years ago  selftests: mptcp: fail if not enough SYN/3rd ACK
Matthieu Baerts [Fri, 12 Feb 2021 23:20:30 +0000 (15:20 -0800)]
selftests: mptcp: fail if not enough SYN/3rd ACK

If we receive fewer MPCapable SYNs or 3rd ACKs than expected, we now mark
the test as failed.

On the other hand, if we receive more, we keep the warning but we add a
hint that it is probably due to retransmissions and that's why we don't
mark the test as failed.
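
The rule above can be sketched as a small classification function. The selftests themselves are shell scripts, so this C version is only a stand-in for the logic; the enum and function names are hypothetical.

```c
/* Hypothetical verdicts, for illustration only. */
enum sketch_verdict {
	SKETCH_PASS,
	SKETCH_WARN,	/* more packets than expected: probably retransmissions */
	SKETCH_FAIL,	/* fewer packets than expected */
};

static enum sketch_verdict sketch_check_count(unsigned int expected,
					      unsigned int got)
{
	if (got < expected)
		return SKETCH_FAIL;
	if (got > expected)
		return SKETCH_WARN;
	return SKETCH_PASS;
}
```

For example, expecting 11 SYNs and seeing 13 yields a warning, while seeing only 9 fails the test.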

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/148
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  selftests: mptcp: display warnings on one line
Matthieu Baerts [Fri, 12 Feb 2021 23:20:29 +0000 (15:20 -0800)]
selftests: mptcp: display warnings on one line

Before we had this in case of SYN retransmissions:

  (...)
  # ns4 MPTCP -> ns2 (10.0.1.2:10034      ) MPTCP (duration  1201ms) [ OK ]
  # ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration  1242ms) [ OK ]
  # ns4 MPTCP -> ns2 (10.0.2.1:10036      ) MPTCP ns2-60143c00-cDZWo4 SYNRX: MPTCP -> MPTCP: expect 11, got
  # 13
  # (duration  6221ms) [ OK ]
  # ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration  1427ms) [ OK ]
  # ns4 MPTCP -> ns3 (10.0.2.2:10038      ) MPTCP (duration   881ms) [ OK ]
  (...)

Now we have:

  (...)
  # ns4 MPTCP -> ns2 (10.0.1.2:10034      ) MPTCP (duration  1201ms) [ OK ]
  # ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration  1242ms) [ OK ]
  # ns4 MPTCP -> ns2 (10.0.2.1:10036      ) MPTCP (duration  6221ms) [ OK ] WARN: SYNRX: expect 11, got 13
  # ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration  1427ms) [ OK ]
  # ns4 MPTCP -> ns3 (10.0.2.2:10038      ) MPTCP (duration   881ms) [ OK ]
  (...)

So we put everything on one line, keep the durations and "OK" aligned
and remove duplicated info to shorten the warning.

Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  selftests: mptcp: fix ACKRX debug message
Matthieu Baerts [Fri, 12 Feb 2021 23:20:28 +0000 (15:20 -0800)]
selftests: mptcp: fix ACKRX debug message

Info from the received MPCapable SYN was printed instead of the info from
the received MPCapable 3rd ACK.

Fixes: fed61c4b584c ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally")
Signed-off-by: Matthieu Baerts <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  selftests: mptcp: dump more info on errors
Paolo Abeni [Fri, 12 Feb 2021 23:20:27 +0000 (15:20 -0800)]
selftests: mptcp: dump more info on errors

Even if that may sound completely unlikely, the mptcp implementation
is not perfect, yet.

When the self-tests report an error we usually need more information
than what the scripts currently report. iproute has provided
some additional goodies for a few releases, so let's dump them.

Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  Merge branch 'hns3-cleanups'
David S. Miller [Fri, 12 Feb 2021 21:13:16 +0000 (13:13 -0800)]
Merge branch 'hns3-cleanups'

Huazhong Tan says:

====================
net: hns3: some cleanups for -next

To improve code readability and maintainability, this series
refactors some bloated functions in the HNS3 ethernet driver.

change log:
V2: remove an unused variable in #5

previous version:
V1: https://patchwork.kernel.org/project/netdevbpf/cover/1612943005[email protected]/
====================

Acked-by: Jakub Kicinski <[email protected]>
4 years ago  net: hns3: refactor out hclge_rm_vport_all_mac_table()
Hao Chen [Fri, 12 Feb 2021 03:24:17 +0000 (11:24 +0800)]
net: hns3: refactor out hclge_rm_vport_all_mac_table()

hclge_rm_vport_all_mac_table() is bloated, so split it into
separate functions for readability and maintainability.

Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclgevf_set_rss_tuple()
Huazhong Tan [Fri, 12 Feb 2021 03:24:16 +0000 (11:24 +0800)]
net: hns3: refactor out hclgevf_set_rss_tuple()

To make it more readable and maintainable, split
hclgevf_set_rss_tuple() into two parts.

Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclge_set_rss_tuple()
Huazhong Tan [Fri, 12 Feb 2021 03:24:15 +0000 (11:24 +0800)]
net: hns3: refactor out hclge_set_rss_tuple()

To make it more readable and maintainable, split
hclge_set_rss_tuple() into two parts.

Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: split out hclgevf_cmd_send()
Yufeng Mo [Fri, 12 Feb 2021 03:24:14 +0000 (11:24 +0800)]
net: hns3: split out hclgevf_cmd_send()

hclgevf_cmd_send() is bloated, so split it into separate
functions for readability and maintainability.

Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: split out hclge_cmd_send()
Yufeng Mo [Fri, 12 Feb 2021 03:24:13 +0000 (11:24 +0800)]
net: hns3: split out hclge_cmd_send()

hclge_cmd_send() is bloated, so split it into separate
functions for readability and maintainability.

Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: split out hclge_dbg_dump_qos_buf_cfg()
Jian Shen [Fri, 12 Feb 2021 03:21:08 +0000 (11:21 +0800)]
net: hns3: split out hclge_dbg_dump_qos_buf_cfg()

hclge_dbg_dump_qos_buf_cfg() is bloated, so split it into
separate functions for readability and maintainability.

Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclgevf_get_rss_tuple()
Jian Shen [Fri, 12 Feb 2021 03:21:07 +0000 (11:21 +0800)]
net: hns3: refactor out hclgevf_get_rss_tuple()

To improve code readability and maintainability, separate
the flow type parsing part and the converting part from
bloated hclgevf_get_rss_tuple().

Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclge_get_rss_tuple()
Jian Shen [Fri, 12 Feb 2021 03:21:06 +0000 (11:21 +0800)]
net: hns3: refactor out hclge_get_rss_tuple()

To improve code readability and maintainability, separate
the flow type parsing part and the converting part from
bloated hclge_get_rss_tuple().

Signed-off-by: Jian Shen <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclge_set_vf_vlan_common()
Peng Li [Fri, 12 Feb 2021 03:21:05 +0000 (11:21 +0800)]
net: hns3: refactor out hclge_set_vf_vlan_common()

To improve code readability and maintainability, separate
the command handling part and the status parsing part from
bloated hclge_set_vf_vlan_common().

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: use ipv6_addr_any() helper
Jiaran Zhang [Fri, 12 Feb 2021 03:21:04 +0000 (11:21 +0800)]
net: hns3: use ipv6_addr_any() helper

Use common ipv6_addr_any() to determine if an addr is ipv6 any addr.

Signed-off-by: Jiaran Zhang <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: clean up hns3_dbg_cmd_write()
Peng Li [Fri, 12 Feb 2021 03:21:03 +0000 (11:21 +0800)]
net: hns3: clean up hns3_dbg_cmd_write()

As more commands are added, hns3_dbg_cmd_write() is going to
get more bloated, so move the part about command check into
a separate function.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclgevf_cmd_convert_err_code()
Peng Li [Fri, 12 Feb 2021 03:21:02 +0000 (11:21 +0800)]
net: hns3: refactor out hclgevf_cmd_convert_err_code()

To improve code readability and maintainability, refactor
hclgevf_cmd_convert_err_code() with an array of imp_errcode
and common_errno mapping, instead of a bloated switch/case.
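
A table-driven conversion of this kind might look like the sketch below. The status codes, the table contents and the fallback value are all hypothetical and chosen only to show the shape of the refactor; they are not taken from the HNS3 driver.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical firmware status codes, for illustration only. */
enum sketch_imp_errcode {
	SKETCH_IMP_OK = 0,
	SKETCH_IMP_NO_AUTH = 1,
	SKETCH_IMP_NOT_SUPPORTED = 2,
	SKETCH_IMP_QUEUE_FULL = 3,
};

struct sketch_errcode_map {
	int imp_errcode;
	int common_errno;
};

/* One row per known code replaces one case label in a switch. */
static const struct sketch_errcode_map sketch_errcode_table[] = {
	{ SKETCH_IMP_OK, 0 },
	{ SKETCH_IMP_NO_AUTH, -EPERM },
	{ SKETCH_IMP_NOT_SUPPORTED, -EOPNOTSUPP },
	{ SKETCH_IMP_QUEUE_FULL, -EBUSY },
};

static int sketch_convert_err_code(int imp_errcode)
{
	size_t n = sizeof(sketch_errcode_table) / sizeof(sketch_errcode_table[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (sketch_errcode_table[i].imp_errcode == imp_errcode)
			return sketch_errcode_table[i].common_errno;
	}

	/* Unknown codes fall back to a generic I/O error (an assumption). */
	return -EIO;
}
```

Adding a new code then means adding one table row rather than another case label.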

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  net: hns3: refactor out hclge_cmd_convert_err_code()
Peng Li [Fri, 12 Feb 2021 03:21:01 +0000 (11:21 +0800)]
net: hns3: refactor out hclge_cmd_convert_err_code()

To improve code readability and maintainability, refactor
hclge_cmd_convert_err_code() with an array of imp_errcode
and common_errno mapping, instead of a bloated switch/case.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years ago  ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:18 +0000 (16:13 +0100)]
ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring

Output of ixgbe_rx_offset() is based on ethtool's priv flag setting, which
when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to ixgbe_ring that is meant to hold the
ixgbe_rx_offset() result and use it within ixgbe_clean_rx_irq().
Furthermore, use it within ixgbe_alloc_mapped_page().
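
The caching idea can be sketched as follows. The struct, the legacy_rx flag and the headroom value are hypothetical stand-ins; the only point illustrated is that a value which can change only across a reset is computed once at configuration time and then read per frame.

```c
#include <stdbool.h>

#define SKETCH_RX_HEADROOM 192u	/* hypothetical headroom in bytes */

struct sketch_ring {
	bool legacy_rx;		/* hypothetical private-flag setting */
	unsigned int rx_offset;	/* cached result of sketch_rx_offset() */
};

/* Depends only on configuration that changes across a reset. */
static unsigned int sketch_rx_offset(const struct sketch_ring *ring)
{
	return ring->legacy_rx ? 0 : SKETCH_RX_HEADROOM;
}

/* Called once whenever the ring is (re)configured. */
static void sketch_configure_ring(struct sketch_ring *ring, bool legacy_rx)
{
	ring->legacy_rx = legacy_rx;
	ring->rx_offset = sketch_rx_offset(ring);
}

/* Per-frame path: reads the cached field instead of recomputing it. */
static unsigned int sketch_frame_start(const struct sketch_ring *ring,
				       unsigned int page_offset)
{
	return page_offset + ring->rx_offset;
}
```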

Last but not least, un-inline the function of interest as it lives in a .c
file, so let the compiler make the decision about inlining.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  ice: store the result of ice_rx_offset() onto ice_ring
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:17 +0000 (16:13 +0100)]
ice: store the result of ice_rx_offset() onto ice_ring

Output of ice_rx_offset() is based on ethtool's priv flag setting, which
when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to ice_ring that is meant to hold the
ice_rx_offset() result and use it within ice_clean_rx_irq().
Furthermore, use it within ice_alloc_mapped_page().

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  i40e: store the result of i40e_rx_offset() onto i40e_ring
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:16 +0000 (16:13 +0100)]
i40e: store the result of i40e_rx_offset() onto i40e_ring

Output of i40e_rx_offset() is based on ethtool's priv flag setting,
which when changed, causes PF reset (disables napi, frees irqs, loads
different Rx mem model, etc.). This means that within napi its result is
constant and there is no reason to call it per each processed frame.

Add new 'rx_offset' field to i40e_ring that is meant to hold the
i40e_rx_offset() result and use it within i40e_clean_rx_irq().
Furthermore, use it within i40e_alloc_mapped_page().

Last but not least, un-inline the function of interest so that the compiler
makes the decision about inlining, as it lives in a .c file.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  i40e: Simplify the do-while allocation loop
Björn Töpel [Mon, 18 Jan 2021 15:13:15 +0000 (16:13 +0100)]
i40e: Simplify the do-while allocation loop

Fold the count decrement into the while-statement.
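
As a generic illustration of this kind of change (not the actual i40e loop; the function names and the loop body are hypothetical), the two forms below behave identically for any count of at least 1:

```c
/* Before: the decrement is a separate statement in the loop body. */
static unsigned int sketch_alloc_before(unsigned int count)
{
	unsigned int allocated = 0;

	do {
		allocated++;	/* stand-in for allocating one buffer */
		count--;
	} while (count);

	return allocated;
}

/* After: the decrement is folded into the loop condition. */
static unsigned int sketch_alloc_after(unsigned int count)
{
	unsigned int allocated = 0;

	do {
		allocated++;	/* stand-in for allocating one buffer */
	} while (--count);

	return allocated;
}
```

Both versions assume the caller never passes a count of zero, since a do-while body always runs at least once.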

Reviewed-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: Björn Töpel <[email protected]>
Tested-by: Kiran Bhandare <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  ice: skip NULL check against XDP prog in ZC path
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:14 +0000 (16:13 +0100)]
ice: skip NULL check against XDP prog in ZC path

The whole zero-copy variant of clean Rx IRQ is executed only when an xsk_pool
is attached to the rx_ring, which can happen only when an XDP program is
present on the interface. Therefore it is safe to assume that the program is
always !NULL and there is no need to check it in ice_run_xdp_zc.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Kiran Bhandare <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  ice: remove redundant checks in ice_change_mtu
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:13 +0000 (16:13 +0100)]
ice: remove redundant checks in ice_change_mtu

dev_validate_mtu checks that mtu value specified by user is not less
than min mtu and not greater than max allowed mtu. It is being done
before calling the ndo_change_mtu exposed by driver, so remove these
redundant checks in ice_change_mtu.
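
The layering argument can be sketched like this. It is a simplified userspace model with hypothetical names and a simplified range check, not the kernel's dev_validate_mtu() or the ice driver; it only shows why a driver callback does not need to repeat a range check that the core always performs first.

```c
#include <errno.h>

/* Hypothetical, simplified device state. */
struct sketch_netdev {
	unsigned int min_mtu;
	unsigned int max_mtu;
	unsigned int mtu;
};

/* Core-side validation, run before the driver callback. */
static int sketch_validate_mtu(const struct sketch_netdev *dev,
			       unsigned int new_mtu)
{
	if (new_mtu < dev->min_mtu || new_mtu > dev->max_mtu)
		return -EINVAL;
	return 0;
}

/* Driver callback: no range checks, the core has already done them. */
static int sketch_driver_change_mtu(struct sketch_netdev *dev,
				    unsigned int new_mtu)
{
	dev->mtu = new_mtu;
	return 0;
}

/* Core entry point: validate first, then call into the driver. */
static int sketch_set_mtu(struct sketch_netdev *dev, unsigned int new_mtu)
{
	int err = sketch_validate_mtu(dev, new_mtu);

	if (err)
		return err;
	return sketch_driver_change_mtu(dev, new_mtu);
}
```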

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  ice: move skb pointer from rx_buf to rx_ring
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:12 +0000 (16:13 +0100)]
ice: move skb pointer from rx_buf to rx_ring

Similar thing has been done in i40e, as there is no real need for having
the sk_buff pointer in each rx_buf. Non-eop frames can be simply handled
on that pointer moved upwards to rx_ring.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  ice: simplify ice_run_xdp
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:11 +0000 (16:13 +0100)]
ice: simplify ice_run_xdp

There's no need for 'result' variable, we can directly return the
internal status based on action returned by xdp prog.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Kiran Bhandare <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  i40e: adjust i40e_is_non_eop
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:10 +0000 (16:13 +0100)]
i40e: adjust i40e_is_non_eop

i40e_is_non_eop had a leftover comment and unused skb argument which was
used for placing the skb onto rx_buf in case when current buffer was
non-eop one. This is not relevant anymore as commit e72e56597ba1
("i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring") pulled the
non-complete skb handling out of rx_bufs up to rx_ring.  Therefore,
let's adjust the function arguments that i40e_is_non_eop takes.

Furthermore, since there is already a function responsible for bumping
the ntc, make use of that and drop that logic from i40e_is_non_eop so
that the scope of this function is limited to what the name actually
states.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  i40e: drop misleading function comments
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:09 +0000 (16:13 +0100)]
i40e: drop misleading function comments

i40e_cleanup_headers has a statement about check against skb being
linear or not which is not relevant anymore, so let's remove it.

Same case for i40e_can_reuse_rx_page, it references things that are not
present there anymore.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  i40e: drop redundant check when setting xdp prog
Maciej Fijalkowski [Mon, 18 Jan 2021 15:13:08 +0000 (16:13 +0100)]
i40e: drop redundant check when setting xdp prog

Net core handles the case where netdev has no xdp prog attached and
current prog is NULL. Therefore, remove such check within
i40e_xdp_setup.

Reviewed-by: Björn Töpel <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Kiran Bhandare <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
4 years ago  nl80211: add documentation for HT/VHT/HE disable attributes
Johannes Berg [Fri, 12 Feb 2021 09:50:23 +0000 (10:50 +0100)]
nl80211: add documentation for HT/VHT/HE disable attributes

These were missed earlier, add the necessary documentation
and, while at it, clarify it.

Signed-off-by: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/r/20210212105023.895c3389f063.I46dea3bfc64385bc6f600c50d294007510994f8f@changeid
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  cfg80211/mac80211: Support disabling HE mode
Ben Greear [Thu, 4 Feb 2021 14:46:10 +0000 (06:46 -0800)]
cfg80211/mac80211: Support disabling HE mode

Allow user to disable HE mode, similar to how VHT and HT
can be disabled.  Useful for testing.

Signed-off-by: Ben Greear <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: add STBC encoding to ieee80211_parse_tx_radiotap
Philipp Borgers [Mon, 25 Jan 2021 15:07:44 +0000 (16:07 +0100)]
mac80211: add STBC encoding to ieee80211_parse_tx_radiotap

This patch adds support for STBC encoding to the radiotap tx parse
function. Prior to this change adding the STBC flag to the radiotap
header did not encode frames with STBC.

Signed-off-by: Philipp Borgers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[use u8_get_bits/u32_encode_bits instead of manually shifting]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: minstrel_ht: remove sample rate switching code for constrained devices
Felix Fietkau [Wed, 27 Jan 2021 05:57:35 +0000 (06:57 +0100)]
mac80211: minstrel_ht: remove sample rate switching code for constrained devices

This was added to mitigate the effects of too much sampling on devices that
use a static global fallback table instead of configurable multi-rate retry.
Now that the sampling algorithm is improved, this code path no longer performs
any better than the standard probing on affected devices.

Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: minstrel_ht: show sampling rates in debugfs
Felix Fietkau [Wed, 27 Jan 2021 05:57:34 +0000 (06:57 +0100)]
mac80211: minstrel_ht: show sampling rates in debugfs

This makes it easier to see what rates are going to be tested next

Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: minstrel_ht: significantly redesign the rate probing strategy
Felix Fietkau [Wed, 27 Jan 2021 05:57:33 +0000 (06:57 +0100)]
mac80211: minstrel_ht: significantly redesign the rate probing strategy

The biggest flaw in current minstrel_ht is the fact that it needs way too
many probing packets to be able to quickly find the best rate.
Depending on the wifi hardware and operating mode, this can significantly
reduce throughput when not operating at the highest available data rate.

In order to be able to significantly reduce the amount of rate sampling,
we need a much smarter selection of probing rates.

The new approach introduced by this patch maintains a limited set of
available rates to be tested during a statistics window.

They are split into distinct categories:
- MINSTREL_SAMPLE_TYPE_INC - incremental rate upgrade:
  Pick the next rate group and find the first rate that is faster than
  the current max. throughput rate
- MINSTREL_SAMPLE_TYPE_JUMP - random testing of higher rates:
  Pick a random rate from the next group that is faster than the current
  max throughput rate. This allows faster adaptation when the link changes
  significantly
- MINSTREL_SAMPLE_TYPE_SLOW - test a rate between max_prob, max_tp2 and
  max_tp in order to reduce the gap between them

In order to prioritize sampling, every 6 attempts are split into 3x INC,
2x JUMP, 1x SLOW.
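
One way to realize a 3/2/1 split over every 6 attempts is a fixed sequence table indexed by the attempt counter. The sketch below is an assumption about how this could be done: the interleaving order and all identifiers are hypothetical, and only the 3x INC, 2x JUMP, 1x SLOW ratio comes from the description above.

```c
/* Hypothetical mirror of the three sampling categories. */
enum sketch_sample_type {
	SKETCH_SAMPLE_TYPE_INC,
	SKETCH_SAMPLE_TYPE_JUMP,
	SKETCH_SAMPLE_TYPE_SLOW,
};

#define SKETCH_SAMPLE_SEQ_LEN 6u

/* 3x INC, 2x JUMP, 1x SLOW per window; the order is an assumption. */
static const enum sketch_sample_type sketch_sample_seq[SKETCH_SAMPLE_SEQ_LEN] = {
	SKETCH_SAMPLE_TYPE_INC,
	SKETCH_SAMPLE_TYPE_JUMP,
	SKETCH_SAMPLE_TYPE_INC,
	SKETCH_SAMPLE_TYPE_JUMP,
	SKETCH_SAMPLE_TYPE_INC,
	SKETCH_SAMPLE_TYPE_SLOW,
};

/* Pick the sampling category for a given attempt number. */
static enum sketch_sample_type sketch_next_sample_type(unsigned int attempt)
{
	return sketch_sample_seq[attempt % SKETCH_SAMPLE_SEQ_LEN];
}
```

Interleaving the categories rather than grouping them keeps any one category from being starved within a short window.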

Available rates are checked and refilled on every stats window update.

With this approach, we finally get a very small delta in throughput when
comparing setting the optimal data rate as a fixed rate vs normal rate
control operation.

Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: minstrel_ht: reduce the need to sample slower rates
Felix Fietkau [Wed, 27 Jan 2021 05:57:32 +0000 (06:57 +0100)]
mac80211: minstrel_ht: reduce the need to sample slower rates

In order to fall back to lower rates more gracefully, without too
much throughput fluctuation, initialize all untested rates below tested ones
to the maximum probability of higher rates.
Usually this leads to untested lower rates getting initialized with a
probability value of 100%, making them better candidates for fallback without
having to rely on random probing
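
A minimal sketch of this seeding step, assuming a per-rate array ordered from slowest to fastest, is shown below. The struct layout, the percent scale and the function name are hypothetical and this is not the minstrel_ht implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-rate statistics. */
struct sketch_rate_stats {
	bool tested;
	unsigned int prob;	/* success probability in percent */
};

/* rates[] is ordered from slowest (index 0) to fastest. Walk from the
 * fastest rate down, tracking the best probability seen among tested
 * rates, and copy it into every untested rate below a tested one. */
static void sketch_seed_untested_lower_rates(struct sketch_rate_stats *rates,
					     size_t n)
{
	unsigned int max_prob = 0;
	bool seen_tested = false;
	size_t i = n;

	while (i-- > 0) {
		if (rates[i].tested) {
			seen_tested = true;
			if (rates[i].prob > max_prob)
				max_prob = rates[i].prob;
		} else if (seen_tested) {
			rates[i].prob = max_prob;
		}
	}
}
```

Untested rates above every tested rate are left alone, since nothing faster has been measured for them yet.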

Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: minstrel_ht: update total packets counter in tx status path
Felix Fietkau [Wed, 27 Jan 2021 05:57:31 +0000 (06:57 +0100)]
mac80211: minstrel_ht: update total packets counter in tx status path

Keep the update in one place and prepare for further rework

Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: minstrel_ht: use bitfields to encode rate indexes
Felix Fietkau [Wed, 27 Jan 2021 05:57:30 +0000 (06:57 +0100)]
mac80211: minstrel_ht: use bitfields to encode rate indexes

Get rid of a lot of divisions and modulo operations
Reduces code size and improves performance
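
To see why a bitfield encoding avoids divisions, compare it with a flat index of the form group * rates_per_group + rate, which needs / and % to decode whenever rates_per_group is not a power of two. The sketch below packs the two parts with shifts and masks instead; the 4-bit rate field and all names are assumptions for illustration.

```c
#include <stdint.h>

#define SKETCH_RATE_BITS 4u	/* hypothetical width of the rate field */
#define SKETCH_RATE_MASK ((1u << SKETCH_RATE_BITS) - 1u)

/* Pack a group number and a rate number into one index. */
static uint16_t sketch_encode_index(unsigned int group, unsigned int rate)
{
	return (uint16_t)((group << SKETCH_RATE_BITS) | (rate & SKETCH_RATE_MASK));
}

/* Decoding is a shift and a mask, with no division or modulo. */
static unsigned int sketch_index_group(uint16_t idx)
{
	return idx >> SKETCH_RATE_BITS;
}

static unsigned int sketch_index_rate(uint16_t idx)
{
	return idx & SKETCH_RATE_MASK;
}
```

The cost is that index values are no longer densely packed when a group has fewer than 16 rates.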

Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  cfg80211: initialize reg_rule in __freq_reg_info()
Luca Coelho [Thu, 4 Feb 2021 13:44:39 +0000 (15:44 +0200)]
cfg80211: initialize reg_rule in __freq_reg_info()

Sparse started warning on this function because we can potentially
return an uninitialized value.  The reason is that if the caller
passes a min_bw value that is higher than the last value in bws[], we
will not go into the loop and reg_rule will remain uninitialized.  This
cannot happen because the only caller of this function uses either 1
or 20 in min_bw, but the function will be more robust if we
pre-initialize the value.
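
The hazard and the fix can be sketched with a loop whose body may never run. The table values, the function name and the selection rule are hypothetical; the point is only that the result has a defined value even when min_bw is larger than every table entry.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical bandwidth steps in MHz, ascending. */
static const unsigned int sketch_bws[] = { 1, 5, 10, 20 };

/* Returns the widest entry that is >= min_bw and <= max_bw,
 * or -ERANGE if there is none. */
static int sketch_pick_bw(unsigned int min_bw, unsigned int max_bw)
{
	/* Pre-initialized so the return value is defined even if the
	 * loop body below never executes. */
	int result = -ERANGE;
	size_t i = sizeof(sketch_bws) / sizeof(sketch_bws[0]);

	while (i-- > 0 && sketch_bws[i] >= min_bw) {
		if (sketch_bws[i] <= max_bw) {
			result = (int)sketch_bws[i];
			break;
		}
	}

	return result;
}
```

Without the initializer, a call such as sketch_pick_bw(40, 80) would return an indeterminate value.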

Signed-off-by: Luca Coelho <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20210204154439.6c884ea7281c.I257278d03b0c1ae0aa6631672cfa48f1a95d5996@changeid
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: fix potential overflow when multiplying to u32 integers
Colin Ian King [Fri, 5 Feb 2021 17:53:52 +0000 (17:53 +0000)]
mac80211: fix potential overflow when multiplying to u32 integers

The multiplication of the u32 variables tx_time and estimated_retx is
performed using a 32 bit multiplication and the result is stored in
a u64 result. This has a potential u32 overflow issue, so avoid this
by casting tx_time to a u64 to force a 64 bit multiply.
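
The difference is easy to reproduce in ordinary userspace C. The sketch below is not the mac80211 code; the function names are hypothetical, uint32_t and uint64_t stand in for the kernel's u32 and u64, and it assumes a platform where int is 32 bits wide.

```c
#include <stdint.h>

/* Both operands are 32 bits wide, so the product wraps modulo 2^32
 * before it is widened to the 64-bit return type. */
static uint64_t sketch_product_truncating(uint32_t tx_time,
					  uint32_t estimated_retx)
{
	return tx_time * estimated_retx;
}

/* Casting one operand first forces a 64-bit multiply. */
static uint64_t sketch_product_widened(uint32_t tx_time,
				       uint32_t estimated_retx)
{
	return (uint64_t)tx_time * estimated_retx;
}
```

With both inputs set to 65536, the first version returns 0 while the second returns 4294967296.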

Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol")
Signed-off-by: Colin Ian King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  mac80211: enable QoS support for nl80211 ctrl port
Markus Theil [Sat, 6 Feb 2021 11:51:12 +0000 (12:51 +0100)]
mac80211: enable QoS support for nl80211 ctrl port

This patch unifies sending control port frames
over nl80211 and AF_PACKET sockets a little more.

Before this patch, EAPOL frames got QoS prioritization
only when using AF_PACKET sockets.

__ieee80211_select_queue only selects a QoS-enabled queue
for control port frames, when the control port protocol
is set correctly on the skb. For the AF_PACKET path this
works, but the nl80211 path used ETH_P_802_3.

Another check for injected frames in wme.c then prevented
the QoS TID from being copied into the frame.

In order to fix this, get rid of the frame injection marking
for nl80211 ctrl port and set the correct ethernet protocol.

Please note:
An earlier version of this patch tried to prevent
frame aggregation for control port frames in order to speed up
the initial connection setup a little. This seemed to cause
issues on my older Intel dvm-based hardware, and was therefore
removed again. Future commits which try to reintroduce this
have to check carefully how hw behaves with aggregated and
non-aggregated traffic for the same TID.
My NIC: Intel(R) Centrino(R) Ultimate-N 6300 AGN, REV=0x74

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Markus Theil <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  cfg80211: remove unused callback
Matteo Croce [Mon, 8 Feb 2021 11:33:56 +0000 (12:33 +0100)]
cfg80211: remove unused callback

The ieee80211 class registers a callback which actually does nothing.
Given that the callback is optional, and all its accesses are protected
by a NULL check, remove it entirely.

Signed-off-by: Matteo Croce <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
4 years ago  rtw88: 8822c: update RF_B (2/2) parameter tables to v60
Po-Hao Huang [Tue, 9 Feb 2021 07:07:55 +0000 (15:07 +0800)]
rtw88: 8822c: update RF_B (2/2) parameter tables to v60

Update RTL8822C devices' RF_B tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: 8822c: update RF_B (1/2) parameter tables to v60
Po-Hao Huang [Tue, 9 Feb 2021 07:07:54 +0000 (15:07 +0800)]
rtw88: 8822c: update RF_B (1/2) parameter tables to v60

Update RTL8822C devices' RF_B tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: 8822c: update RF_A parameter tables to v60
Po-Hao Huang [Tue, 9 Feb 2021 07:07:53 +0000 (15:07 +0800)]
rtw88: 8822c: update RF_A parameter tables to v60

Update RTL8822C devices' RF_A tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: 8822c: update MAC/BB parameter tables to v60
Po-Hao Huang [Tue, 9 Feb 2021 07:07:52 +0000 (15:07 +0800)]
rtw88: 8822c: update MAC/BB parameter tables to v60

Update RTL8822C devices' MAC/BB tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: replace tx tasklet with work queue
Po-Hao Huang [Tue, 9 Feb 2021 07:07:51 +0000 (15:07 +0800)]
rtw88: replace tx tasklet with work queue

Replace tasklet so we can do tx scheduling in parallel. Since throughput
is delay-sensitive in most cases, we allocate a dedicated, high priority
wq for our needs.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: add napi support
Po-Hao Huang [Tue, 9 Feb 2021 07:07:50 +0000 (15:07 +0800)]
rtw88: add napi support

Use napi to reduce overhead on rx interrupts.

The driver used to interrupt the kernel for every Rx packet, which could
affect both system and network performance. NAPI is a mechanism that
uses polling when processing a huge amount of traffic; by doing this
the number of interrupts can be decreased.

Network performance can also benefit from this patch. A TCP
connection is bidirectional, and an ack is required for every several
packets. These ack packets occupy PCI bus bandwidth and could
lead to performance degradation.

When napi is used, GRO receive is enabled by default in the mac80211
stack, so mac80211 won't pass every RX TCP packet to the kernel TCP
network stack immediately. Instead, an aggregated large TCP packet
will be delivered.

This reduces the tx acks sent and gains rx performance. After the patch,
the Rx throughput increases about 25Mbps in 11ac.
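
The interrupt reduction described above can be sketched with a toy model.
This is not the rtw88 code or the kernel NAPI API: struct rx_stats and both
functions are hypothetical, and the model assumes the whole burst is already
pending when the first interrupt fires.

```c
#include <stddef.h>

/* Hypothetical toy model, not the rtw88 or kernel NAPI API. */
struct rx_stats {
	size_t interrupts;	/* how many times the "IRQ" fired */
	size_t packets;		/* packets handed to the stack */
};

/* One interrupt per received packet, as before the patch. */
static struct rx_stats rx_per_packet_irq(size_t pending)
{
	struct rx_stats s = { .interrupts = pending, .packets = pending };

	return s;
}

/*
 * NAPI-style: one interrupt schedules polling. Each poll round handles
 * up to `budget` packets, and no further interrupt is taken while
 * packets are still pending.
 */
static struct rx_stats rx_napi_poll(size_t pending, size_t budget)
{
	struct rx_stats s = { .interrupts = 0, .packets = 0 };

	if (pending == 0 || budget == 0)
		return s;

	s.interrupts = 1;	/* the IRQ that schedules the poll */
	while (pending > 0) {
		size_t done = pending < budget ? pending : budget;

		s.packets += done;
		pending -= done;
	}
	return s;
}
```

In this model a burst of 256 packets costs 256 interrupts in the first
variant and a single interrupt in the second, whatever the budget.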

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Tested-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: add rts condition
Po-Hao Huang [Tue, 9 Feb 2021 07:07:49 +0000 (15:07 +0800)]
rtw88: add rts condition

Since we set the IEEE80211_HW_HAS_RATE_CONTROL flag, use_rts in
ieee80211_tx_info will never be set in the ieee80211_xmit_fast path.
Add length check for skb to decide whether rts is needed.
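
A minimal sketch of such a length check follows. The function name, the
strictly-greater comparison and the example threshold are assumptions made
for illustration; the message above only says that the skb length decides
whether rts is needed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example value; a real driver reads the configured threshold. */
#define EXAMPLE_RTS_THRESHOLD 2347u

/* Assumed rule: frames longer than the threshold are protected by RTS. */
static bool frame_needs_rts(size_t frame_len, size_t rts_threshold)
{
	return frame_len > rts_threshold;
}
```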

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agortw88: add dynamic rrsr configuration
Po-Hao Huang [Tue, 9 Feb 2021 07:07:48 +0000 (15:07 +0800)]
rtw88: add dynamic rrsr configuration

Register rrsr determines the response rates we send.
In field tests, using a response rate higher than the current tx rate
could make it difficult for the receiving end to receive management/control
frames. Calculate the current modulation level from the tx rate, then cross
out the rates higher than that level.
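
The masking step can be sketched as follows. The bit layout is an
assumption for illustration only: bit i of the mask stands for rate index i,
with rates in ascending order, and limit_response_rates() is a hypothetical
helper, not a function from the driver.

```c
#include <stdint.h>

/*
 * Keep only the response rates whose index does not exceed the current
 * tx rate index. Assumes bit i of rrsr_mask represents rate index i.
 */
static uint32_t limit_response_rates(uint32_t rrsr_mask,
				     unsigned int cur_rate_idx)
{
	uint32_t allowed;

	if (cur_rate_idx >= 31)
		return rrsr_mask;	/* nothing above the current rate */

	allowed = (UINT32_C(1) << (cur_rate_idx + 1)) - 1;
	return rrsr_mask & allowed;
}
```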

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
4 years agoiwlwifi: remove incorrect comment in pnvm
Luca Coelho [Thu, 11 Feb 2021 20:30:55 +0000 (22:30 +0200)]
iwlwifi: remove incorrect comment in pnvm

We use this driver as a backport that also runs on older kernels (as
part of the backports project).  So we use some checks to backport or
prevent code from compiling in incompatible kernel versions.

When I took one of the PNVM patches from the backport, I accidentally
left the comment that a certain part of the code doesn't work in older
kernels.  This obviously should never be valid for the mainline.
Remove this comment.

Reported-by: Kalle Valo <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20210211223049.40d545a0fa89.I04793aaa5312b926335c8db32131f000432df511@changeid
4 years agoMerge branch 'sock-rx-qmap'
David S. Miller [Fri, 12 Feb 2021 03:08:07 +0000 (19:08 -0800)]
Merge branch 'sock-rx-qmap'

Tariq Toukan says:

====================
Compile-flag for sock RX queue mapping

Socket's RX queue mapping logic is useful also for non-XPS use cases.
This series breaks the dependency between the two, introducing a new
kernel config flag SOCK_RX_QUEUE_MAPPING.

Here we select this new kernel flag from TLS_DEVICE, as well as XPS.
====================

Acked-by: Jakub Kicinski <[email protected]>
4 years agonet/mlx5: Remove TLS dependencies on XPS
Tariq Toukan [Thu, 11 Feb 2021 11:35:53 +0000 (13:35 +0200)]
net/mlx5: Remove TLS dependencies on XPS

No real dependency on XPS, but on RX queue mapping, which
is being selected by TLS_DEVICE.

Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet/tls: Select SOCK_RX_QUEUE_MAPPING from TLS_DEVICE
Tariq Toukan [Thu, 11 Feb 2021 11:35:52 +0000 (13:35 +0200)]
net/tls: Select SOCK_RX_QUEUE_MAPPING from TLS_DEVICE

Compile-in the socket RX queue mapping field and logic when TLS_DEVICE
is enabled. This allows device drivers to pick the recorded socket's
RX queue and use it for streams distribution.

Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet/sock: Add kernel config SOCK_RX_QUEUE_MAPPING
Tariq Toukan [Thu, 11 Feb 2021 11:35:51 +0000 (13:35 +0200)]
net/sock: Add kernel config SOCK_RX_QUEUE_MAPPING

Use a new config SOCK_RX_QUEUE_MAPPING to compile-in the socket
RX queue field and logic, instead of the XPS config.
This breaks the dependency on XPS, and allows selecting it from non-XPS
use cases, as we do in the next patch.

In addition, use the new flag to wrap the logic in sk_rx_queue_get()
and protect access to the sk_rx_queue_mapping field, while keeping
the function exposed unconditionally, just like sk_rx_queue_set()
and sk_rx_queue_clear().
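
The pattern can be sketched in userspace as below: the accessors stay
visible unconditionally, while the field and its logic are compiled in only
when the option is defined. struct toy_sock and the toy_* helpers are
hypothetical stand-ins, not the kernel definitions.

```c
#include <stdint.h>

/* Stand-in for the Kconfig symbol; remove the define to compile it out. */
#define CONFIG_SOCK_RX_QUEUE_MAPPING 1

#define NO_QUEUE_MAPPING UINT16_MAX

struct toy_sock {
#ifdef CONFIG_SOCK_RX_QUEUE_MAPPING
	uint16_t rx_queue_mapping;
#endif
	int unused;	/* keeps the struct non-empty when the option is off */
};

static void toy_rx_queue_set(struct toy_sock *sk, uint16_t queue)
{
#ifdef CONFIG_SOCK_RX_QUEUE_MAPPING
	sk->rx_queue_mapping = queue;
#else
	(void)sk;
	(void)queue;
#endif
}

static void toy_rx_queue_clear(struct toy_sock *sk)
{
	toy_rx_queue_set(sk, NO_QUEUE_MAPPING);
}

/* Always callable; returns -1 when no queue is recorded or compiled in. */
static int toy_rx_queue_get(const struct toy_sock *sk)
{
#ifdef CONFIG_SOCK_RX_QUEUE_MAPPING
	if (sk && sk->rx_queue_mapping != NO_QUEUE_MAPPING)
		return sk->rx_queue_mapping;
#else
	(void)sk;
#endif
	return -1;
}
```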

Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agotcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.
Arjun Roy [Thu, 11 Feb 2021 21:21:07 +0000 (13:21 -0800)]
tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.

Explicitly define reserved field and require it and any subsequent
fields to be zero-valued for now. Additionally, limit the valid CMSG
flags that tcp_zerocopy_receive accepts.
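
The validation idea can be sketched as follows. The struct layout, the
TOY_VALID_FLAGS value and toy_zc_validate() are assumptions for
illustration and do not mirror the real uapi structure.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical set of flag bits the call is willing to accept. */
#define TOY_VALID_FLAGS 0x1u

/* Hypothetical argument block with an explicit reserved field. */
struct toy_zc_args {
	uint64_t address;
	uint32_t length;
	uint32_t flags;
	uint64_t reserved;	/* must be zero for now */
};

/* Reject non-zero reserved data and unknown flag bits. */
static int toy_zc_validate(const struct toy_zc_args *a)
{
	if (a->reserved != 0)
		return -EINVAL;
	if (a->flags & ~TOY_VALID_FLAGS)
		return -EINVAL;
	return 0;
}
```

Rejecting non-zero reserved fields today is what allows them to be given a
meaning later without breaking callers that passed uninitialized memory.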

Fixes: 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off-by: Arjun Roy <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Soheil Hassas Yeganeh <[email protected]>
Suggested-by: David Ahern <[email protected]>
Suggested-by: Leon Romanovsky <[email protected]>
Suggested-by: Jakub Kicinski <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agor8169: handle tx before rx in napi poll
Heiner Kallweit [Thu, 11 Feb 2021 20:20:08 +0000 (21:20 +0100)]
r8169: handle tx before rx in napi poll

Cleaning up tx descriptors first increases the chance that
rtl_rx() can allocate new skb's from the cache.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: fix dev_ifsioc_locked() race condition
Cong Wang [Thu, 11 Feb 2021 19:34:10 +0000 (11:34 -0800)]
net: fix dev_ifsioc_locked() race condition

dev_ifsioc_locked() is called with only RCU read lock, so when
there is a parallel writer changing the mac address, it could
get a partially updated mac address, as shown below:

Thread 1                                        Thread 2
// eth_commit_mac_addr_change()
memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
                                                // dev_ifsioc_locked()
                                                memcpy(ifr->ifr_hwaddr.sa_data,
                                                       dev->dev_addr,...);

Close this race condition by guarding them with a RW semaphore,
like netdev_get_name(). We can not use seqlock here as it does not
allow blocking. The writers already take RTNL anyway, so this does
not affect the slow path. To avoid bothering existing
dev_set_mac_address() callers in drivers, introduce a new wrapper
just for user-facing callers on ioctl and rtnetlink paths.

Note, bonding also changes slave mac addresses but that requires
a separate patch due to the complexity of bonding code.
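
The torn read can be replayed deterministically without threads. The sketch
below is only a model of the interleaving, not the kernel code and not the
fix: the writer's memcpy() is split into two hypothetical steps and the
reader runs between them.

```c
#include <string.h>

#define MAC_LEN 6

/* Writer: copy bytes [from, to) only, modelling an interrupted memcpy(). */
static void write_partial(unsigned char *dev_addr,
			  const unsigned char *new_addr,
			  size_t from, size_t to)
{
	memcpy(dev_addr + from, new_addr + from, to - from);
}

/* Reader: snapshot the address with no lock held. */
static void read_mac(unsigned char *out, const unsigned char *dev_addr)
{
	memcpy(out, dev_addr, MAC_LEN);
}

/* A snapshot is torn if it matches neither the old nor the new address. */
static int is_torn(const unsigned char *snap,
		   const unsigned char *old_addr,
		   const unsigned char *new_addr)
{
	return memcmp(snap, old_addr, MAC_LEN) != 0 &&
	       memcmp(snap, new_addr, MAC_LEN) != 0;
}
```

Reading after only the first three bytes were written gives an address that
is half old and half new. Serializing readers against the writer, as the
patch does, rules that snapshot out.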

Fixes: 3710becf8a58 ("net: RCU locking for simple ioctl()")
Reported-by: "Gong, Sishuai" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: mvpp2: fix interrupt mask/unmask skip condition
Stefan Chulski [Thu, 11 Feb 2021 15:13:19 +0000 (17:13 +0200)]
net: mvpp2: fix interrupt mask/unmask skip condition

The CPU should also be skipped if its CPU ID is equal to nthreads.
The patch doesn't fix any actual issue since
nthreads = min_t(unsigned int, num_present_cpus(), MVPP2_MAX_THREADS).
On all current Armada platforms, the number of CPU's is
less than MVPP2_MAX_THREADS.
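
The off-by-one can be illustrated with a small counting model. It assumes
the skip test changed from a strict to a non-strict comparison, which the
message implies but does not spell out; cpus_handled() is hypothetical.

```c
/*
 * Count the CPUs that are not skipped. Valid s/w thread IDs are
 * 0..nthreads-1, so a CPU whose ID equals nthreads must be skipped too.
 */
static unsigned int cpus_handled(unsigned int num_cpus,
				 unsigned int nthreads, int fixed)
{
	unsigned int cpu, handled = 0;

	for (cpu = 0; cpu < num_cpus; cpu++) {
		if (fixed ? cpu >= nthreads : cpu > nthreads)
			continue;
		handled++;
	}
	return handled;
}
```

With 8 CPUs and 4 threads the old test lets 5 CPUs through and the fixed one
4. With as many threads as CPUs both give the same count, which matches the
remark that no current platform is affected.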

Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
Reported-by: Russell King <[email protected]>
Signed-off-by: Stefan Chulski <[email protected]>
Reviewed-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge branch 'am65-cpsw-nuss-switchdev-driver'
David S. Miller [Fri, 12 Feb 2021 01:52:13 +0000 (17:52 -0800)]
Merge branch 'am65-cpsw-nuss-switchdev-driver'

Vignesh Raghavendra says:

====================
net: ti: am65-cpsw-nuss: Add switchdev driver

This series adds switchdev support for AM65 CPSW NUSS driver to support
multi port CPSW present on J721e and AM64 SoCs.
It adds devlink hook to switch b/w switch mode and multi mac mode.

v2:
Rebased on latest net-next
Update patch 1/4 with rationale for using devlink
====================

4 years agodocs: networking: ti: Add driver doc for AM65 NUSS switch driver
Vignesh Raghavendra [Thu, 11 Feb 2021 10:56:44 +0000 (16:26 +0530)]
docs: networking: ti: Add driver doc for AM65 NUSS switch driver

J721e, J7200 and AM64 have multi port switches which can work in multi
mac mode and in switch mode. Add documentation explaining how to use
different modes.

Borrowed from:
Documentation/networking/device_drivers/ethernet/ti/cpsw_switchdev.rst

Signed-off-by: Vignesh Raghavendra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: ti: am65-cpsw-nuss: Add switchdev support
Vignesh Raghavendra [Thu, 11 Feb 2021 10:56:43 +0000 (16:26 +0530)]
net: ti: am65-cpsw-nuss: Add switchdev support

J721e, J7200 and AM64 have multi port switches which can work in multi
mac mode and in switch mode. Add support for configuring this HW in
switch mode using devlink and switchdev notifiers.

Support is similar to existing CPSW switchdev implementation of TI's 32 bit
platform like AM33/AM43/AM57.

To enable switch mode:
devlink dev param set platform/8000000.ethernet name switch_mode value true cmode runtime

All configuration is implemented via switchdev API and notifiers.
Supported:
      - SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS
      - SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
      - SWITCHDEV_ATTR_ID_PORT_STP_STATE
      - SWITCHDEV_OBJ_ID_PORT_VLAN
      - SWITCHDEV_OBJ_ID_PORT_MDB
      - SWITCHDEV_OBJ_ID_HOST_MDB

Hence AM65 CPSW switchdev driver supports:
     - FDB offloading
     - MDB offloading
     - VLAN filtering and offloading
     - STP

Signed-off-by: Vignesh Raghavendra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: ti: am65-cpsw-nuss: Add netdevice notifiers
Vignesh Raghavendra [Thu, 11 Feb 2021 10:56:42 +0000 (16:26 +0530)]
net: ti: am65-cpsw-nuss: Add netdevice notifiers

Register netdevice notifiers in order to receive notification when
individual MAC ports are added to the HW bridge.

Signed-off-by: Vignesh Raghavendra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: ti: am65-cpsw-nuss: Add devlink support
Vignesh Raghavendra [Thu, 11 Feb 2021 10:56:41 +0000 (16:26 +0530)]
net: ti: am65-cpsw-nuss: Add devlink support

AM65 NUSS ethernet switch on K3 devices can be configured to work either
in independent mac mode where each port acts as independent network
interface (multi mac) or switch mode.

Add devlink hooks to provide a way to switch b/w these modes.

Rationale to use devlink instead of defaulting to bridge mode is that
SoC use cases require to support multiple independent MAC ports with no
switching so that users can use software bridges with multi-mac
configuration (e.g: to support LAG, HSR/PRP, etc). Also, switching
between multi mac and switch mode requires significant Port and ALE
reconfiguration, therefore is easier to be made as part of mode change
devlink hooks. It also allows to keep user interface similar to what
was implemented for the previous generation of TI CPSW IP
(on AM33/AM43/AM57 SoCs).

Signed-off-by: Vignesh Raghavendra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge branch 'bcm4908_enet-post-review-fixes'
David S. Miller [Thu, 11 Feb 2021 23:04:17 +0000 (15:04 -0800)]
Merge branch 'bcm4908_enet-post-review-fixes'

Rafał Miłecki says:

====================
bcm4908_enet: post-review fixes

V2 of my BCM4908 Ethernet patchset was applied to the net-next.git and
it later received some extra reviews. I'm sending patches
that handle the issues that were pointed out.

David: earlier I missed that V2 was applied and I sent V3 and V4 of my
initial patchset. Sorry for that. I think it's best to ignore the V3 and
V4 I sent and proceed with this fixes patchset instead.
====================

Signed-off-by: David S. Miller <[email protected]>
4 years agonet: broadcom: bcm4908_enet: fix endianness in xmit code
Rafał Miłecki [Thu, 11 Feb 2021 12:12:39 +0000 (13:12 +0100)]
net: broadcom: bcm4908_enet: fix endianness in xmit code

Use le32_to_cpu() for reading __le32 struct field filled by hw.
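
For readers outside the kernel, the conversion can be sketched portably as
below. toy_le32_to_cpu() is a hypothetical userspace equivalent operating on
raw bytes; the kernel helper takes a __le32 value instead.

```c
#include <stdint.h>

/* Decode a little-endian 32-bit word, independent of host byte order. */
static uint32_t toy_le32_to_cpu(const uint8_t b[4])
{
	return (uint32_t)b[0] |
	       (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 |
	       (uint32_t)b[3] << 24;
}
```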

Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: broadcom: bcm4908_enet: fix received skb length
Rafał Miłecki [Thu, 11 Feb 2021 12:12:38 +0000 (13:12 +0100)]
net: broadcom: bcm4908_enet: fix received skb length

Use ETH_FCS_LEN instead of magic value and drop incorrect + 2

Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: broadcom: bcm4908_enet: fix minor typos
Rafał Miłecki [Thu, 11 Feb 2021 12:12:37 +0000 (13:12 +0100)]
net: broadcom: bcm4908_enet: fix minor typos

1. Fix "ensable" typo noticed by Andrew
2. Fix chipset name in the struct net_device_ops variable

Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: broadcom: bcm4908_enet: drop "inline" from C functions
Rafał Miłecki [Thu, 11 Feb 2021 12:12:36 +0000 (13:12 +0100)]
net: broadcom: bcm4908_enet: drop "inline" from C functions

It seems preferred to let compiler optimize code if applicable.
While at it drop unused enet_umac_maskset().

Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: broadcom: bcm4908_enet: drop unneeded memset()
Rafał Miłecki [Thu, 11 Feb 2021 12:12:35 +0000 (13:12 +0100)]
net: broadcom: bcm4908_enet: drop unneeded memset()

dma_alloc_coherent takes care of zeroing allocated memory

Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agonet: broadcom: rename BCM4908 driver & update DT binding
Rafał Miłecki [Thu, 11 Feb 2021 12:12:34 +0000 (13:12 +0100)]
net: broadcom: rename BCM4908 driver & update DT binding

compatible string was updated to match normal naming convention so
update driver as well

Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agodt-bindings: net: bcm4908-enet: include ethernet-controller.yaml
Rafał Miłecki [Thu, 11 Feb 2021 12:12:33 +0000 (13:12 +0100)]
dt-bindings: net: bcm4908-enet: include ethernet-controller.yaml

It should be /included/ by every Ethernet controller binding. It adds
support for various generic properties.

Suggested-by: Rob Herring <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agodt-bindings: net: rename BCM4908 Ethernet binding
Rafał Miłecki [Thu, 11 Feb 2021 12:12:32 +0000 (13:12 +0100)]
dt-bindings: net: rename BCM4908 Ethernet binding

Rob pointed out that a normal convention is "brcm,bcm4908-enet" so
update whole binding to match it.

Suggested-by: Rob Herring <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kern
David S. Miller [Thu, 11 Feb 2021 22:59:01 +0000 (14:59 -0800)]
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2021-02-11

Here's the main bluetooth-next pull request for 5.12:

 - Add support for advertising monitor offloading using Microsoft
   vendor extensions
 - Add firmware download support for MediaTek MT7921U USB devices
 - Suspend-related fixes for Qualcomm devices
 - Add support for Intel GarfieldPeak controller
 - Various other smaller fixes & cleanups

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge branch 'marvell-cn10k'
David S. Miller [Thu, 11 Feb 2021 22:55:04 +0000 (14:55 -0800)]
Merge branch 'marvell-cn10k'

Geetha sowjanya says:

====================
Add Marvell CN10K support

The current admin function (AF) driver and the netdev driver support
OcteonTx2 silicon variants. The same OcteonTx2
Resource Virtualization Unit (RVU) is carried forward to the next-gen
silicon, i.e. OcteonTx3, with some changes and feature enhancements.

This patch set adds support for OcteonTx3 (CN10K) silicon and gets
the drivers to the same level as OcteonTx2. No new OcteonTx3 specific
features are added.

Changes cover below HW level differences
- PCIe BAR address changes wrt shared mailbox memory region
- Receive buffer freeing to HW
- Transmit packet's descriptor submission to HW
- Programmable HW interface identifiers (channels)
- Increased MTU support
- A Serdes MAC block (RPM) configuration

v5-v6
Rebased on top of latest net-next branch.

v4-v5
Fixed sparse warnings.

v3-v4
Fixed compiler warnings.

v2-v3
Reposting as a single thread.
Rebased on top latest net-next branch.

v1-v2
Fixed check-patch reported issues.
====================

Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: MAC internal loopback support
Hariprasad Kelam [Thu, 11 Feb 2021 15:58:34 +0000 (21:28 +0530)]
octeontx2-af: cn10k: MAC internal loopback support

MAC on CN10K silicon supports loopback for selftest or debug purposes.
This patch does the necessary configuration to loop back packets upon
receiving a request from the LMAC-mapped RVU PF's netdev via mailbox.

Also MAC (CGX) on OcteonTx2 silicon variants and MAC (RPM) on
OcteonTx3 CN10K are different and loopback needs to be configured
differently. Upper layer interface between RVU AF and PF netdev is
kept same. Based on silicon variant appropriate fn() pointer is
called to config the MAC.
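
The dispatch idea can be sketched as a table of function pointers selected
per MAC type. All names below (toy_mac_ops, the toy_* functions and the
register arrays) are hypothetical and do not reproduce the driver's
structures.

```c
enum toy_mac_type { TOY_MAC_CGX, TOY_MAC_RPM };

#define TOY_MAX_LMACS 4

/* Stand-ins for the per-LMAC loopback control of each MAC block. */
static unsigned int toy_cgx_regs[TOY_MAX_LMACS];
static unsigned int toy_rpm_regs[TOY_MAX_LMACS];

struct toy_mac_ops {
	const char *name;
	int (*internal_loopback)(unsigned int lmac, int enable);
};

static int toy_cgx_loopback(unsigned int lmac, int enable)
{
	if (lmac >= TOY_MAX_LMACS)
		return -1;
	toy_cgx_regs[lmac] = enable ? 1 : 0;
	return 0;
}

static int toy_rpm_loopback(unsigned int lmac, int enable)
{
	if (lmac >= TOY_MAX_LMACS)
		return -1;
	toy_rpm_regs[lmac] = enable ? 1 : 0;
	return 0;
}

static const struct toy_mac_ops toy_cgx_ops = { "cgx", toy_cgx_loopback };
static const struct toy_mac_ops toy_rpm_ops = { "rpm", toy_rpm_loopback };

/* The caller stays the same; only the ops table differs per silicon. */
static const struct toy_mac_ops *toy_get_mac_ops(enum toy_mac_type type)
{
	return type == TOY_MAC_RPM ? &toy_rpm_ops : &toy_cgx_ops;
}
```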

Signed-off-by: Hariprasad Kelam <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Reported-by: kernel test robot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Add RPM Rx/Tx stats support
Hariprasad Kelam [Thu, 11 Feb 2021 15:58:33 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Add RPM Rx/Tx stats support

RPM supports below list of counters as an extension to existing counters
 *  class based flow control pause frames
 *  vlan/jabber/fragmented packets
 *  fcs/alignment/oversized error packets

This patch adds support to display supported RPM counters via debugfs
and defines a new mbox rpm_stats to read all supported counters.

Signed-off-by: Hariprasad Kelam <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Kovvuri Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Add RPM LMAC pause frame support
Rakesh Babu [Thu, 11 Feb 2021 15:58:32 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Add RPM LMAC pause frame support

Flow control configuration is different for CGX(Octeontx2)
and RPM(CN10K) functional blocks. This patch adds the necessary
changes for RPM to support 802.3 pause frames configuration on
cn10k platforms.

Signed-off-by: Rakesh Babu <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Kovvuri Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-pf: cn10k: Get max mtu supported from admin function
Hariprasad Kelam [Thu, 11 Feb 2021 15:58:31 +0000 (21:28 +0530)]
octeontx2-pf: cn10k: Get max mtu supported from admin function

CN10K supports max MTU of 16K on LMAC links and 64k on LBK
links and Octeontx2 silicon supports 9K mtu on both links.
Get the same from nix_get_hw_info mbox message in netdev probe.

This patch also calculates receive buffer size required based
on the MTU set.

Signed-off-by: Hariprasad Kelam <[email protected]>
Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10K: Add MTU configuration
Hariprasad Kelam [Thu, 11 Feb 2021 15:58:30 +0000 (21:28 +0530)]
octeontx2-af: cn10K: Add MTU configuration

OcteonTx3 CN10K silicon supports a bigger MTU when compared
to the 9216 MTU supported by OcteonTx2 silicon variants. The loopback
interface supports up to 64K and RPM LMAC interfaces support
up to 16K.

This patch does the necessary configuration and adds support
for PF/VF drivers to retrieve max packet size supported via mbox

This patch also configures tx link credit by considering supported
fifo size and max packet length for Octeontx3 silicon.

This patch also removes platform specific name from the driver name.

Signed-off-by: Hariprasad Kelam <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Add support for programmable channels
Subbaraya Sundeep [Thu, 11 Feb 2021 15:58:29 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Add support for programmable channels

NIX uses unique channel numbers to identify the packet sources/sinks
like CGX,LBK and SDP. The channel numbers assigned to each block are
hardwired in CN9xxx silicon.
The fixed channel numbers in CN9xxx are:

0x0 | a << 8 | b            - LBK(0..3)_CH(0..63)
0x0 | a << 8                - Reserved
0x700 | a                   - SDP_CH(0..255)
0x800 | a << 8 | b << 4 | c - CGX(0..7)_LMAC(0..3)_CH(0..15)

Not all the channels in the above fixed enumeration (with the maximum
number of blocks) are required, since some chips
have fewer blocks.
For CN10K silicon the channel numbers need to be programmed by
software in each block with the base channel number and range of
channels. This patch calculates and assigns the channel numbers
to efficiently distribute the channel number range(0-4095) among
all the blocks. The assignment is made based on the actual number of
blocks present and also contiguously leaving no holes.
The channel numbers remaining after the math are used as new CPT
replay channels present in CN10K. Also since channel numbers are
not fixed the transmit channel link number needed by AF consumers
is calculated by AF and sent along with nix_lf_alloc mailbox response.
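
The fixed CN9xxx encodings and the idea of a contiguous assignment can be
sketched as below. The cn9k_* helpers follow the table above; struct
toy_block and toy_assign_channels() are hypothetical and simply pack blocks
back to back, ignoring any alignment rules the real AF driver may apply.

```c
#include <stdint.h>

#define TOY_MAX_CHANNELS 4096u	/* channel numbers 0..4095 */

/* Fixed CN9xxx channel numbers, as listed in the table above. */
static uint16_t cn9k_lbk_chan(unsigned int lbk, unsigned int ch)
{
	return (uint16_t)(0x000u | lbk << 8 | ch);
}

static uint16_t cn9k_sdp_chan(unsigned int ch)
{
	return (uint16_t)(0x700u | ch);
}

static uint16_t cn9k_cgx_chan(unsigned int cgx, unsigned int lmac,
			      unsigned int ch)
{
	return (uint16_t)(0x800u | cgx << 8 | lmac << 4 | ch);
}

/* Hypothetical description of one block that needs a channel range. */
struct toy_block {
	unsigned int nr_chans;	/* channels the block needs */
	unsigned int base;	/* assigned base channel (output) */
};

/*
 * Assign bases back to back, leaving no holes. Returns the first
 * unassigned channel number, or -1 if the blocks do not fit.
 */
static int toy_assign_channels(struct toy_block *blk, unsigned int n)
{
	unsigned int next = 0, i;

	for (i = 0; i < n; i++) {
		if (blk[i].nr_chans > TOY_MAX_CHANNELS - next)
			return -1;
		blk[i].base = next;
		next += blk[i].nr_chans;
	}
	return (int)next;
}
```

In this model the value returned by toy_assign_channels() marks where the
leftover range starts, which is the part the message says is used for the
CPT replay channels.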

Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Add RPM MAC support
Hariprasad Kelam [Thu, 11 Feb 2021 15:58:28 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Add RPM MAC support

OcteonTx2's next-gen platform, the CN10K, has an RPM MAC which has a
different serdes when compared to the CGX MAC. Though the underlying
HW is different, the CSR interface has been designed largely in line
with the CGX MAC, with a few exceptions. So we are using the same
CGX driver for the RPM MAC as well and will have a different set of APIs
for RPM wherever necessary.

This patch adds initial support for CN10K's RPM MAC i.e. the driver
registration, communication with firmware etc. For communication with
firmware, RPM provides a different IRQ when compared to CGX.
The CGX and RPM blocks support different features. Currently a few
features like ptp, flowcontrol and higig are not supported by RPM. This
patch adds a new mailbox message "CGX_FEATURES_GET" to get the list of
features supported by the underlying MAC.

RPM has a different implementation for RX/TX stats. Unlike CGX,
the BAR offsets of the stat registers are different. This patch adds
support to access them and dump the values in debugfs.

Signed-off-by: Hariprasad Kelam <[email protected]>
Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-pf: cn10k: Use LMTST lines for NPA/NIX operations
Geetha sowjanya [Thu, 11 Feb 2021 15:58:27 +0000 (21:28 +0530)]
octeontx2-pf: cn10k: Use LMTST lines for NPA/NIX operations

This patch adds support for using the new LMTST lines for NPA batch free
and burst SQE flush. It adds a new dev_hw_ops structure to hold platform
specific functions and creates the new files cn10k.c and cn10k.h.

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-pf: cn10k: Map LMTST region
Geetha sowjanya [Thu, 11 Feb 2021 15:58:26 +0000 (21:28 +0530)]
octeontx2-pf: cn10k: Map LMTST region

On the CN10K platform, transmit/receive buffer alloc and free from/to hardware
have changed to support burst operation, whereas previous silicon only
supports freeing a single buffer at a time.
To support this, firmware allocates a DRAM region for each PF/VF for
storing LMTLINES. These LMTLINES are used for NPA batch free and for
flushing SQEs to the hardware.
The PF/VF LMTST region is accessed via BAR4. A PF's LMTST region is followed
by its VFs' mbox memory. The size of the region varies from 2KB to 256KB based
on the number of LMTLINES configured.

This patch adds support for
- Mapping PF/VF LMTST region.
- Reserves 0-71 (RX + TX + XDP) LMTST lines for NPA batch
  free operation.
- Reserves 72-512 LMTST lines for NIX SQE flush.
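
The reservation can be sketched as a simple classifier over the line index.
The enum and toy_lmt_line_use() are hypothetical; the ranges are taken
literally from the list above, with both bounds treated as inclusive.

```c
enum toy_lmt_use {
	TOY_LMT_NPA_BATCH_FREE,	/* lines 0-71 (RX + TX + XDP) */
	TOY_LMT_NIX_SQE_FLUSH,	/* lines 72-512 */
	TOY_LMT_INVALID,
};

/* Map an LMTST line index to the purpose it is reserved for. */
static enum toy_lmt_use toy_lmt_line_use(unsigned int line)
{
	if (line <= 71)
		return TOY_LMT_NPA_BATCH_FREE;
	if (line <= 512)
		return TOY_LMT_NIX_SQE_FLUSH;
	return TOY_LMT_INVALID;
}
```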

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-pf: cn10k: Initialise NIX context
Geetha sowjanya [Thu, 11 Feb 2021 15:58:25 +0000 (21:28 +0530)]
octeontx2-pf: cn10k: Initialise NIX context

On CN10K platform NIX RQ and SQ context structure got changed.
This patch uses new mbox message "NIX_CN10K_AQ_ENQ" for NIX
context initialization on CN10K platform.

This patch also updates the nix_rx_parse_s and nix_sqe_sg_s
structures to add packet steering bit fields.

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Update NIX and NPA context in debugfs
Geetha sowjanya [Thu, 11 Feb 2021 15:58:24 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Update NIX and NPA context in debugfs

On the CN10K platform, the NPA and NIX context structure bit fields
have changed to support new features like bandwidth steering etc.
This patch dumps the appropriate context for the CN10K platform.

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Update NIX/NPA context structure
Geetha sowjanya [Thu, 11 Feb 2021 15:58:23 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Update NIX/NPA context structure

NIX hardware context structure got changed to accommodate new
features like bandwidth steering, L3/L4 outer/inner checksum
enable/disable etc., on CN10K platform.
This patch defines new mbox message NIX_CN10K_AQ_INST for new
NIX context initialization.

This patch also updates the NPA context structures to accommodate
bit field changes made for CN10K platform.

This patch also removes Big endian bit fields from existing
structures, as big endian support is deprecated in current and upcoming
silicon.

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-pf: cn10k: Add mbox support for CN10K
Subbaraya Sundeep [Thu, 11 Feb 2021 15:58:22 +0000 (21:28 +0530)]
octeontx2-pf: cn10k: Add mbox support for CN10K

Firmware allocates memory regions for PFs and VFs in DRAM.
The PF's memory region is used for the AF-PF and PF-VF mailboxes.
This mbox facilitates communication between AF-PF and PF-VF.

On CN10K platform:
The DRAM region allocated to a PF is enumerated as PF BAR4 memory.
PF BAR4 contains the AF-PF mbox region followed by its VFs' mbox regions.
The AF-PF mbox region base address is configured at RVU_AF_PFX_BAR4_ADDR.
The PF-VF mailbox base address is configured at
RVU_PF(x)_VF_MBOX_ADDR = RVU_AF_PF()_BAR4_ADDR+64KB. A PF accesses its
mbox region via BAR4, whereas a VF accesses PF-VF DRAM mailboxes via
BAR2 indirect access.

On CN9XX platform:
The mailbox region in DRAM is divided into two parts, the AF-PF mbox
region and the PF-VF mbox region, i.e. the mbox regions of all PFs are
contiguous, and likewise those of all VFs.
The base address of the AF-PF mbox region is configured at
RVU_AF_PF_BAR4_ADDR.
The AF-PF1 mbox address can be calculated as RVU_AF_PF_BAR4_ADDR + mbox
size.
The base address of the PF-VF mbox region for each PF is configured at
RVU_AF_PF(0..15)_VF_BAR4_ADDR. A PF accesses its mbox region via BAR4 and
its VF mbox regions from the RVU_PF_VF_BAR4_ADDR register, whereas a VF
accesses its mbox region via BAR4.
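
The two layouts above can be sketched as plain address arithmetic. This
is an illustration only, not the driver code: the helper names are
hypothetical, and it assumes that every mbox region is 64KB (matching
the 64KB offset quoted for CN10K), that VF mbox regions are laid out
back to back with the same stride, and that on CN9XX the AF-PFn region
sits n regions above the base address.

```c
#include <stdint.h>

/* Assumption: every mbox region is 64KB, as the CN10K offset suggests. */
#define MBOX_REGION_SIZE 0x10000ULL

/* CN10K: the AF-PF mbox is the first region in the PF's BAR4 memory. */
static inline uint64_t cn10k_af_pf_mbox(uint64_t pf_bar4_addr)
{
	return pf_bar4_addr;
}

/*
 * CN10K: the VF mbox regions follow the AF-PF region, so the first one
 * starts at BAR4 + 64KB. The per-VF stride is an assumption.
 */
static inline uint64_t cn10k_pf_vf_mbox(uint64_t pf_bar4_addr, unsigned int vf)
{
	return pf_bar4_addr + MBOX_REGION_SIZE + (uint64_t)vf * MBOX_REGION_SIZE;
}

/*
 * CN9XX: the AF-PF regions of all PFs are contiguous from one base
 * address, so PF n is assumed to sit n regions above that base.
 */
static inline uint64_t cn9xx_af_pf_mbox(uint64_t af_pf_base, unsigned int pf)
{
	return af_pf_base + (uint64_t)pf * MBOX_REGION_SIZE;
}
```

With a BAR4 address of 0x100000, for example, the sketch places the
first CN10K VF mbox at 0x110000, while a CN9XX base of 0x200000 puts the
AF-PF1 mbox at 0x210000.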

This patch changes mbox initialization to support both the CN9XX and
CN10K platforms.
The patch also adds a new hw_cap flag for setting hw features like TSO
and removes the platform-specific name from the PF/VF driver name to make
it appropriate for all supported platforms.

Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoocteontx2-af: cn10k: Add mbox support for CN10K platform
Subbaraya Sundeep [Thu, 11 Feb 2021 15:58:21 +0000 (21:28 +0530)]
octeontx2-af: cn10k: Add mbox support for CN10K platform

Firmware allocates memory regions for PFs and VFs in DRAM.
The PF's memory region is used for the AF-PF and PF-VF mailboxes.
This mbox facilitates communication between AF-PF and PF-VF.

On CN10K platform:
The DRAM region allocated to a PF is enumerated as PF BAR4 memory.
PF BAR4 contains the AF-PF mbox region followed by its VFs' mbox regions.
The AF-PF mbox region base address is configured at RVU_AF_PFX_BAR4_ADDR.
The PF-VF mailbox base address is configured at
RVU_PF(x)_VF_MBOX_ADDR = RVU_AF_PF()_BAR4_ADDR+64KB. A PF accesses its
mbox region via BAR4, whereas a VF accesses PF-VF DRAM mailboxes via
BAR2 indirect access.

On CN9XX platform:
The mailbox region in DRAM is divided into two parts, the AF-PF mbox
region and the PF-VF mbox region, i.e. the mbox regions of all PFs are
contiguous, and likewise those of all VFs.
The base address of the AF-PF mbox region is configured at
RVU_AF_PF_BAR4_ADDR.
The AF-PF1 mbox address can be calculated as RVU_AF_PF_BAR4_ADDR + mbox
size.
The base address of the PF-VF mbox region for each PF is configured at
RVU_AF_PF(0..15)_VF_BAR4_ADDR. A PF accesses its mbox region via BAR4 and
its VF mbox regions from the RVU_PF_VF_BAR4_ADDR register, whereas a VF
accesses its mbox region via BAR4.

This patch changes mbox initialization to support both the CN9XX and
CN10K platforms.

This patch also adds the CN10K PTP subsystem and device IDs to the PTP
driver ID table.

Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 years agoMerge branch 'mvpp2-tx-flow-control'
David S. Miller [Thu, 11 Feb 2021 22:50:24 +0000 (14:50 -0800)]
Merge branch 'mvpp2-tx-flow-control'

Stefan Chulski says:

====================
net: mvpp2: Add TX Flow Control support

Armada hardware has a pause generation mechanism in the GOP (MAC).
The GOP generates flow control frames based on an indication programmed in the Ports Control 0 Register. There is one bit per port.
However, asserting the PortX Pause bits in the Ports Control 0 Register only sends a one-time pause.
To complement this function, the GOP has a mechanism to periodically send pause control messages based on periodic counters.
This mechanism ensures that the pause is effective as long as the appropriate PortX Pause bit is asserted.

The problem is that the Packet Processor, which can actually drop packets due to lack of resources, is not connected to the GOP flow control generation mechanism.
To solve this issue, Armada has firmware running on a CM3 CPU dedicated to Flow Control support.
The firmware monitors Packet Processor resources and asserts XON/XOFF by writing to the Ports Control 0 Register.

MSS shared SRAM memory is used to communicate between the CM3 firmware and the PP2 driver.
During init, the PP2 driver informs the firmware about the BM pools and RXQs in use and the congestion and depletion thresholds.

Pause frames are generated whenever congestion or depletion of resources is detected.
The back pressure is stopped when the resource reaches a sufficient level.
The congestion/depletion threshold and the sufficient level thus implement a hysteresis that reduces the XON/XOFF toggle frequency.
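
The hysteresis described above can be illustrated with a minimal sketch.
This is not the CM3 firmware or the PP2 driver code: the struct, the
function name and the threshold fields are hypothetical, and the model
assumes a single counter of free resources with one XOFF and one XON
threshold.

```c
#include <stdbool.h>

/* Hypothetical flow control state for one monitored resource. */
struct fc_state {
	unsigned int xoff_thresh; /* pause at or below this many free resources */
	unsigned int xon_thresh;  /* resume at or above this many free resources */
	bool paused;              /* true while XOFF is asserted */
};

/*
 * Update the pause state from the current number of free resources.
 * Returns true only when the state toggles. Because xon_thresh is
 * assumed to be larger than xoff_thresh, readings between the two
 * thresholds never cause a toggle.
 */
static inline bool fc_update(struct fc_state *fc, unsigned int free_res)
{
	if (!fc->paused && free_res <= fc->xoff_thresh) {
		fc->paused = true;  /* depletion detected: assert XOFF */
		return true;
	}
	if (fc->paused && free_res >= fc->xon_thresh) {
		fc->paused = false; /* sufficient level reached: assert XON */
		return true;
	}
	return false;
}
```

With thresholds of 16 and 64, a drop to 10 free resources asserts XOFF,
a recovery to 40 leaves the pause in place, and only a recovery to 64 or
more releases it.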

Packet Processor v23 hardware introduces support for an RX FIFO fill level monitor.
Patch "add PPv23 version definition" differentiates between v23 and v22 hardware.
Patch "add TX FC firmware check" verifies that the CM3 firmware supports Flow Control monitoring.

v12 --> v13
- Remove bm_underrun_protect module_param

v11 --> v12
- Improve warning message in "net: mvpp2: add TX FC firmware check" patch

v10 --> v11
- Improve "net: mvpp2: add CM3 SRAM memory map" comment
- Move condition check to 'net: mvpp2: always compare hw-version vs MVPP21' patch

v9 --> v10
- Add CM3 SRAM description to PPv2 documentation

v8 --> v9
- Replace generic pool allocation with devm_ioremap_resource

v7 --> v8
- Reorder "always compare hw-version vs MVPP21" and "add PPv23 version definition" commits
- Typo fixes
- Remove condition fix from "add RXQ flow control configurations"

v6 --> v7
- Reduce patch set from 18 to 15 patches
 - Documentation change combined into a single patch
 - RXQ and BM size change combined into a single patch
 - Ring size change check moved into "add RXQ flow control configurations" commit

v5 --> v6
- No change

v4 --> v5
- Add missed Signed-off
- Fix warnings in patches 3 and 12
- Add revision requirement to warning message
- Move mss_spinlock into RXQ flow control configurations patch
- Improve FCA RXQ non occupied descriptor threshold commit message

v3 --> v4
- Remove RFC tag

v2 --> v3
- Remove inline functions
- Add PPv2.3 description into marvell-pp2.txt
- Improve mvpp2_interrupts_mask/unmask procedure
- Improve FC enable/disable procedure
- Add priv->sram_pool check
- Remove gen_pool_destroy call
- Reduce Flow Control timer to x100 faster

v1 --> v2
- Add memory requirements information
- Add EPROBE_DEFER if of_gen_pool_get return NULL
- Move Flow control configuration to mvpp2_mac_link_up callback
====================

Signed-off-by: David S. Miller <[email protected]>
4 years agonet: mvpp2: add TX FC firmware check
Stefan Chulski [Thu, 11 Feb 2021 10:49:02 +0000 (12:49 +0200)]
net: mvpp2: add TX FC firmware check

This patch checks that the TX FC firmware is running on the CM3.
If not, global TX FC is disabled.

Signed-off-by: Stefan Chulski <[email protected]>
Acked-by: Marcin Wojtas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>