Edward Cree [Fri, 11 Sep 2020 22:39:02 +0000 (23:39 +0100)]
sfc: decouple TXQ type from label
Make it possible to have an arbitrary mapping from types to labels,
because when we add inner-csum-offload TXQs there will no longer be a
convenient nesting hierarchy of NIC types (EF10 will have inner-csum
TXQs, while Siena will have HIGHPRI).
Correct a misleading comment on efx_hard_start_xmit().
These are never modified, so constify them to allow the compiler to
place them in read-only memory. This moves about 25kB to read-only
memory as seen by the output of the size command.
Before:
   text    data     bss     dec     hex filename
 296203   65464    1248  362915   589a3 drivers/net/ethernet/marvell/octeontx2/af/octeontx2_af.ko
After:
   text    data     bss     dec     hex filename
 321003   40664    1248  362915   589a3 drivers/net/ethernet/marvell/octeontx2/af/octeontx2_af.ko
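The constification described above can be illustrated with a small user-space sketch. The struct, table and function names here are hypothetical and are not taken from the octeontx2 driver; the point is only that a table which is never written can be declared "static const", which allows the toolchain to place it in a read-only section.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor type; not taken from any real driver. */
struct toy_reg_desc {
	const char *name;
	unsigned int offset;
};

/*
 * The table is never written after initialisation, so declaring it
 * "static const" lets the toolchain place it in a read-only section
 * instead of writable .data.
 */
static const struct toy_reg_desc toy_regs[] = {
	{ "ctrl",   0x00 },
	{ "status", 0x04 },
	{ "irq",    0x08 },
};

/* Look up a descriptor by name; returns NULL when it is not in the table. */
const struct toy_reg_desc *toy_reg_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_regs) / sizeof(toy_regs[0]); i++) {
		if (strcmp(toy_regs[i].name, name) == 0)
			return &toy_regs[i];
	}
	return NULL;
}
```

Callers only ever receive a pointer to const, so an accidental write to the table is rejected at compile time.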
Edward Cree [Fri, 11 Sep 2020 18:45:14 +0000 (19:45 +0100)]
sfc: cleanups around efx_alloc_channel
The old_channel argument is never used, so remove it.
The function is only called from elsewhere in efx_channels.c, so make
it static and remove the declaration from the header file.
Each MDB entry is encoded in a nested netlink attribute called
'MDBA_MDB_ENTRY'. In turn, this attribute contains another nested
attribute called 'MDBA_MDB_ENTRY_INFO', which encodes a single port
group entry within the MDB entry.
The cited commit added the ability to restart a dump from a specific
port group entry. However, on failure to add a port group entry to the
dump the entire MDB entry (stored in 'nest2') is removed, resulting in
missing port group entries.
Fix this by finalizing the MDB entry with the partial list of already
encoded port group entries.
Fixes: 5205e919c9f0 ("net: bridge: mcast: add support for src list and filter mode dumping")
Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
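The difference between cancelling and finalizing a partially filled nest can be sketched with a toy model. This is not the netlink API: the message is reduced to an entry counter, and all names (toy_msg, toy_dump_groups, cancel_on_full) are hypothetical. The sketch assumes the dump records the index of the first entry that did not fit and restarts from there.

```c
/* Toy message: counts encoded entries instead of holding real attributes. */
struct toy_msg {
	int len;	/* entries encoded so far */
	int cap;	/* room for this many entries */
};

/*
 * Encode port group entries [*resume, nports) into the message.
 * Returns 0 when every entry fit, -1 when the message filled up.
 * With cancel_on_full set, the whole nest is dropped on overflow (the
 * behaviour described as the bug); otherwise the partial nest is kept.
 */
int toy_dump_groups(struct toy_msg *m, int nports, int *resume,
		    int cancel_on_full)
{
	int nest_start = m->len;
	int i;

	for (i = *resume; i < nports; i++) {
		if (m->len >= m->cap) {
			if (cancel_on_full)
				m->len = nest_start;	/* entries before i are lost */
			*resume = i;			/* next dump restarts at i */
			return -1;
		}
		m->len++;
	}
	*resume = nports;
	return 0;
}
```

With five entries and room for three, the cancelling variant delivers nothing in the first message and still resumes at index 3, so entries 0 to 2 never reach user space. Keeping the partial nest delivers three entries first and the remaining two in the next message.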
Colin Ian King [Fri, 11 Sep 2020 10:35:09 +0000 (11:35 +0100)]
ipv6: remove redundant assignment to variable err
The variable err is being initialized with a value that is never read and
it is being updated later with a new value. The initialization is redundant
and can be removed. Also re-order variable declarations in reverse
Christmas tree ordering.
====================
ag71xx: add ethtool and flow control support
The main target of these patches is to provide flow control support
for the ag71xx driver. To be able to validate this functionality, I also
added ethtool support with HW counters. So, these patches were validated
with iperf3 and counters showing Pause frames sent or received by this
NIC.
====================
Add flow control support. The functionality was tested on AR9331 SoC and
confirmed by iperf3 results and HW counters exported over ethtool.
The following test configurations were used:
Xie He [Fri, 11 Sep 2020 06:35:03 +0000 (23:35 -0700)]
drivers/net/wan/x25_asy: Remove an unused flag "SLF_OUTWAIT"
The "SLF_OUTWAIT" flag defined in x25_asy.h is not actually used.
It is only cleared at one place in x25_asy.c but is never read or set.
So we can remove it.
Luo Jiaxing [Fri, 11 Sep 2020 03:55:58 +0000 (11:55 +0800)]
net: stmmac: set get_rx_header_len() as void for it didn't have any error code to return
We found the following warning when using W=1 to build kernel:
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:3634:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
int ret, coe = priv->hw->rx_csum;
When digging into stmmac_get_rx_header_len(), we found that
dwmac4_get_rx_header_len() and dwxgmac2_get_rx_header_len() only ever
return 0, with no error code to report. Therefore, it's better to define
get_rx_header_len() as void.
David S. Miller [Fri, 11 Sep 2020 21:31:54 +0000 (14:31 -0700)]
Merge branch 'Add-GVE-Features'
David Awogbemila says:
====================
Add GVE Features.
Note: Patch 4 in v3 was dropped.
Patch 4 (patch 5 in v3): Start/stop timer only when report stats is
enabled/disabled.
Patch 7 (patch 8 in v3): Use netdev_info, not dev_info, to log
device link status.
====================
David Awogbemila [Fri, 11 Sep 2020 17:38:51 +0000 (10:38 -0700)]
gve: Enable Link Speed Reporting in the driver.
This change allows the driver to report the device link speed
when the ethtool command:
ethtool <nic name>
is run.
Getting the link speed is done via a new admin queue command:
ReportLinkSpeed.
gve: Use link status register to report link status
This makes the driver better aware of the connectivity status of the
device. Based on the device's status register, the driver can call
netif_carrier_{on,off}.
David Awogbemila [Fri, 11 Sep 2020 17:38:48 +0000 (10:38 -0700)]
gve: NIC stats for report-stats and for ethtool
This adds per queue NIC stats to ethtool stats and to report-stats.
These stats are always exposed to guest whether or not the
report-stats flag is turned on.
Kuo Zhao [Fri, 11 Sep 2020 17:38:47 +0000 (10:38 -0700)]
gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.
This adds functionality to report driver stats to Hypervisor.
(Users may want to turn this feature off as a matter of privacy
so a "report-stats" flag is added as an ethtool priv option.
It is also disabled by default.)
The hypervisor would trigger a stats report in case "too many"
packets dropped; the stats would be useful in debugging stuck
queues.
A "stats_report_trigger_cnt" stat is added to count the number of times
the hypervisor attempts to trigger stats report.
A timer is also added so that when report-stats is enabled, stats are
updated once every 20 seconds.
David S. Miller [Fri, 11 Sep 2020 20:32:31 +0000 (13:32 -0700)]
Merge tag 'wireless-drivers-next-2020-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.10
First set of patches for v5.10. Most noteworthy here is ath11k getting
initial support for QCA6390 and IPQ6018 devices. But most of the
patches are cleanup: W=1 warning fixes, fallthrough keywords, DMA API
changes and tasklet API changes.
Major changes:
ath10k
* support SDIO firmware coredumps
* support station specific TID configurations
ath11k
* add support for IPQ6018
* add support for QCA6390 PCI devices
ath9k
* add support for NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 to improve PTK0
rekeying
Luo Jiaxing [Thu, 10 Sep 2020 13:12:16 +0000 (21:12 +0800)]
net: smc91x: Remove set but not used variable 'status' in smc_phy_configure()
Fixes the following warning when using W=1 to build kernel:
drivers/net/ethernet/smsc/smc91x.c: In function ‘smc_phy_configure’:
drivers/net/ethernet/smsc/smc91x.c:1039:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]
int status;
David S. Miller [Thu, 10 Sep 2020 22:24:27 +0000 (15:24 -0700)]
Merge branch 'smc-next'
Karsten Graul says:
====================
net/smc: updates 2020-09-10
Please apply the following patch series for smc to netdev's net-next tree.
This patch series is a mix of various improvements and cleanups.
The patches 1 and 10 improve the handling of large parallel workloads.
Patch 8 corrects a kernel config default for config CCWGROUP on s390.
Patch 9 allows userspace tools to retrieve socket information for more
sockets.
====================
net/smc: use separate work queues for different worker types
There are 6 types of workers which exist per smc connection. 3 of them
are used for listen and handshake processing, another 2 are used for
close and abort processing and 1 is the tx worker that moves calls to
sleeping functions into a worker.
To prevent flooding of the system work queue when many connections are
opened or closed at the same time (some pattern uperf implements), move
those workers to one of 3 smc-specific work queues. Two work queues are
module-global and used for handshake and close workers. The third work
queue is defined per link group and used by the tx workers that may
sleep waiting for resources of this link group.
And in smc_llc_enqueue() queue the llc_event_work work to the system
prio work queue because it's critical that this work is started fast.
net/smc: use the retry mechanism for netlink messages
When the netlink messages to be sent to the userspace
are too big for a single netlink message, send them in
chunks using the netlink_dump infrastructure. Modify the
smc diag dump code so that it can signal to the netlink_dump
infrastructure that it needs to send more data.
s390/net: add SMC config as one of the defaults of CCWGROUP
arch/s390/net/pnet.c uses ccwgroup function dev_is_ccwgroup()
in pnetid_by_dev_port().
For s390 the net/smc code makes use of function pnetid_by_dev_port().
Make sure ccwgroup is built into the kernel, if smc is to be built
into the kernel.
net/smc: immediate freeing in smc_lgr_cleanup_early()
smc_lgr_cleanup_early() schedules the free worker with delay. DMB
unregistering occurs in this delayed worker increasing the risk
to reach the SMCD SBA limit without need. Terminate the
linkgroup immediately, since termination means early DMB unregistering.
For SMCD the global smc_server_lgr_pending lock is given up early.
A linkgroup to be given up with smc_lgr_cleanup_early() may already
contain more than one connection. Using __smc_lgr_terminate() in
smc_lgr_cleanup_early() covers this.
And consolidate smc_ism_put_vlan() and smc_put_device() into smc_lgr_free()
only.
net/smc: common routine for CLC accept and confirm
smc_clc_send_accept() and smc_clc_send_confirm() are quite similar.
Move common code into a separate function smc_clc_send_confirm_accept().
And introduce separate SMCD and SMCR struct definitions for CLC accept
and confirm, respectively.
No functional change.
Field names "srv_first_contact" and "cln_first_contact" are misleading,
since they apply to both server and client. Rename them to
"first_contact_peer" and "first_contact_local".
Rename "ism_gid" to the more precise name "ism_peer_gid".
Rename version constant "SMC_CLC_V1" into "SMC_V1".
No functional change.
SMC starts a separate tcp_listen worker for every SMC socket in
state SMC_LISTEN, and can accept an incoming connection request only
if this worker is really running and waiting in kernel_accept(). But
the number of running workers is limited.
This patch reworks the listening SMC code and starts a tcp_listen worker
after the SYN-ACK handshake on the internal clc-socket only.
David S. Miller [Thu, 10 Sep 2020 22:22:17 +0000 (15:22 -0700)]
Merge branch 'nfc-s3fwrn5-Few-cleanups'
Krzysztof Kozlowski says:
====================
nfc: s3fwrn5: Few cleanups
Changes since v2:
1. Fix dtschema ID after rename (patch 1/8).
2. Apply patch 9/9 (defconfig change).
Changes since v1:
1. Rename dtschema file and add additionalProperties:false, as Rob
suggested,
2. Add Marek's tested-by,
3. New patches: #4, #5, #6, #7 and #9.
====================
MAINTAINERS: Add Krzysztof Kozlowski to Samsung S3FWRN5 and remove Robert
Robert Bałdyga's email does not work (bounces) since 2016 so remove it.
Additionally there are no review/ack/tested tags from Krzysztof Opasiak
so it looks like the driver is not supported.
As a maintainer of Samsung ARM/ARM64 SoC, I can take care of this
driver and provide some review. However, clearly the driver is not in
supported mode as I do not work at Samsung anymore.
nfc: s3fwrn5: Remove wrong vendor prefix from GPIOs
The device tree property prefix describes the vendor, which in case of
S3FWRN5 chip is Samsung. Therefore the "s3fwrn5" prefix for "en-gpios"
and "fw-gpios" is not correct and should be deprecated. Introduce
properly named properties for these GPIOs but still support deprecated
ones.
dt-bindings: net: nfc: s3fwrn5: Remove wrong vendor prefix from GPIOs
The device tree property prefix describes the vendor, which in case of
S3FWRN5 chip is Samsung. Therefore the "s3fwrn5" prefix for "en-gpios"
and "fw-gpios" is not correct and should be deprecated. Introduce
properly named properties for these GPIOs and rename the "fw-gpios" to
"wake-gpios" to better describe its purpose.
dt-bindings: net: nfc: s3fwrn5: Convert to dtschema
Convert the Samsung S3FWRN5 NCI NFC controller bindings to dtschema.
This is conversion only so it includes properties with invalid prefixes
(s3fwrn5,en-gpios) which should be addressed later.
Wang Hai [Thu, 10 Sep 2020 14:56:18 +0000 (22:56 +0800)]
net: hns: Fix some kernel-doc warnings in hns_enet.c
Fixes the following W=1 kernel build warning(s):
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1841: warning: Excess function parameter 'netdev' description in 'hns_set_multicast_list'
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1841: warning: Excess function parameter 'p' description in 'hns_set_multicast_list'
Wang Hai [Thu, 10 Sep 2020 14:56:17 +0000 (22:56 +0800)]
net: hns: Fix some kernel-doc warnings in hns_dsaf_xgmac.c
Fixes the following W=1 kernel build warning(s):
drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c:137: warning: Excess function parameter 'drv' description in 'hns_xgmac_enable'
drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c:497: warning: Excess function parameter 'cmd' description in 'hns_xgmac_get_regs'
Wang Hai [Thu, 10 Sep 2020 14:56:15 +0000 (22:56 +0800)]
hinic: Fix some kernel-doc warnings in hinic_hw_io.c
Fixes the following W=1 kernel build warning(s):
drivers/net/ethernet/huawei/hinic/hinic_hw_io.c:373: warning: Excess function parameter 'sq_msix_entry' description in 'hinic_io_create_qps'
drivers/net/ethernet/huawei/hinic/hinic_hw_io.c:373: warning: Excess function parameter 'rq_msix_entry' description in 'hinic_io_create_qps'
Alex Dewar [Thu, 10 Sep 2020 13:49:10 +0000 (14:49 +0100)]
net: mvpp2: ptp: Fix unused variables
In the functions mvpp2_isr_handle_xlg() and
mvpp2_isr_handle_gmac_internal(), the bool variable link is assigned a
true value in the case that a given bit of val is set. However, if the
bit is unset, no value is assigned to link and it is then passed to
mvpp2_isr_handle_link() without being initialised. Fix by assigning to
link the value of the bit test.
Build-tested on x86.
Fixes: 36cfd3a6e52b ("net: mvpp2: restructure "link status" interrupt handling")
Signed-off-by: Alex Dewar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
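The pattern behind this fix can be shown in isolation. The sketch below is not the mvpp2 code; the bit name is hypothetical. It only illustrates assigning the result of the bit test directly, so that the variable holds a defined value whether the bit is set or clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit, chosen for illustration only. */
#define TOY_LINK_STATUS_BIT	(1u << 0)

/*
 * The problematic shape was:
 *
 *	bool link;
 *	if (val & TOY_LINK_STATUS_BIT)
 *		link = true;
 *
 * which leaves "link" uninitialised when the bit is clear. Assigning the
 * result of the test gives a defined value in both cases.
 */
bool toy_link_from_status(uint32_t val)
{
	bool link = val & TOY_LINK_STATUS_BIT;

	return link;
}
```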
Wang Hai [Thu, 10 Sep 2020 13:36:16 +0000 (21:36 +0800)]
net: cxgb3: Fix some kernel-doc warnings
Fixes the following W=1 kernel build warning(s):
drivers/net/ethernet/chelsio/cxgb3/t3_hw.c:2209: warning: Excess function parameter 'adapter' description in 'clear_sge_ctxt'
drivers/net/ethernet/chelsio/cxgb3/t3_hw.c:2975: warning: Excess function parameter 'adapter' description in 't3_set_proto_sram'
====================
Enhance current features in ena driver
This series adds the following:
* Exposes new device stats using ethtool.
* Adds and exposes the stats of xdp TX queues through ethtool.
====================
The new metrics provide granular visibility along multiple network
dimensions and enable troubleshooting and remediation of issues caused
by instances exceeding network performance allowances.
The new statistics can be queried using the ethtool command.
net: ena: ethtool: convert stat_offset to 64 bit resolution
The type of all stat fields is u64, therefore when iterating over stat
fields in a stats struct, it makes sense to use an offset in 64 bit
resolution. Doing so allows us to drop some of the casting that is
currently used when referencing stats.
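A rough user-space sketch of the idea follows. The struct and macro names are hypothetical and do not come from the ena driver. It assumes a stats struct made up only of 64-bit fields, so a field can be addressed by its index in units of 64 bits rather than by a byte offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats struct in which every field is 64 bits wide. */
struct toy_stats {
	uint64_t tx_packets;
	uint64_t tx_bytes;
	uint64_t rx_drops;
};

/* Offset of a field expressed in 64-bit units rather than in bytes. */
#define TOY_STAT_OFFSET(field) \
	(offsetof(struct toy_stats, field) / sizeof(uint64_t))

/* Read the stat at the given 64-bit offset; no byte-pointer casts needed. */
uint64_t toy_read_stat(const struct toy_stats *stats, size_t stat_offset)
{
	const uint64_t *base = (const uint64_t *)stats;

	return base[stat_offset];
}
```

With a byte offset, the reader would first have to cast the struct pointer to a byte pointer, add the offset and cast back to a 64-bit pointer. Indexing in 64-bit units removes those intermediate casts.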
The delay was intended to be configured to "simulate" a high(er) BDP
link. As such, it needs to be set as part of the loss-configuration and
not as part of the netem reordering configuration.
The reordering-config also requires a delay but that delay is the
reordering-extend. So, a good approach is to set the reordering-extend
as a function of the configured latency. E.g., 25% of the overall
latency.
To speed up the selftests, we limit the delay to 50ms maximum to avoid
having the selftests run for too long.
Finally, the intention of tc_reorder was that when it is unset, the test
picks a random configuration. However, currently it is always initialized
and thus the random config won't be picked up.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/6
Reported-and-reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Christoph Paasch <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Thu, 10 Sep 2020 20:15:40 +0000 (13:15 -0700)]
Merge branch 'tcp-add-tos-reflection-feature'
Wei Wang says:
====================
tcp: add tos reflection feature
This patch series adds a new tcp feature to reflect TOS value received in
SYN, and send it out in SYN-ACK, and eventually set the TOS value of the
established socket with this reflected TOS value. This provides a way to
set the traffic class/QoS level for all traffic in the same connection
to be the same as the incoming SYN. It could be useful for datacenters
to provide equivalent QoS according to the incoming request.
This feature is guarded by /proc/sys/net/ipv4/tcp_reflect_tos, and is by
default turned off.
====================
Wei Wang [Thu, 10 Sep 2020 00:50:48 +0000 (17:50 -0700)]
tcp: reflect tos value received in SYN to the socket
This commit adds a new TCP feature to reflect the tos value received in
SYN, and send it out on the SYN-ACK, and eventually set the tos value of
the established socket with this reflected tos value. This provides a
way to set the traffic class/QoS level for all traffic in the same
connection to be the same as the incoming SYN request. It could be
useful in data centers to provide equivalent QoS according to the
incoming request.
This feature is guarded by /proc/sys/net/ipv4/tcp_reflect_tos, and is by
default turned off.
Wei Wang [Thu, 10 Sep 2020 00:50:47 +0000 (17:50 -0700)]
ip: pass tos into ip_build_and_send_pkt()
This commit adds tos as a new passed in parameter to
ip_build_and_send_pkt() which will be used in the later commit.
This is a pure restructure and does not have any functional change.
Wei Wang [Thu, 10 Sep 2020 00:50:46 +0000 (17:50 -0700)]
tcp: record received TOS value in the request socket
A new field is added to the request sock to record the TOS value
received on the listening socket during 3WHS:
When not under syn flood, it is recording the TOS value sent in SYN.
When under syn flood, it is recording the TOS value sent in the ACK.
This is a preparation patch in order to do TOS reflection in the later
commit.
====================
netpoll: make sure napi_list is safe for RCU traversal
This series is a follow-up to the fix in commit 96e97bc07e90 ("net:
disable netpoll on fresh napis"). To avoid any latent race conditions
convert dev->napi_list to a proper RCU list. We need minor restructuring
because it looks like netif_napi_del() used to be idempotent, and
it may be quite hard to track down everyone who depends on that.
====================
Jakub Kicinski [Wed, 9 Sep 2020 17:37:52 +0000 (10:37 -0700)]
net: manage napi add/del idempotence explicitly
To RCUify napi->dev_list we need to replace list_del_init()
with list_del_rcu(). There is no _init() version for RCU for
obvious reasons. Up until now netif_napi_del() was idempotent
so to make sure it remains such add a bit which is set when
NAPI is listed, and cleared when it is removed. Since we don't
expect multiple calls to netif_napi_add() to be correct,
add a warning on that side.
Now that napi_hash_add / napi_hash_del are only called by
napi_add / del we can actually steal its bit. We just need
to make sure hash node is initialized correctly.
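The "listed" bit can be pictured with a single-threaded toy model. The names are hypothetical, and the real code uses atomic bit operations on the NAPI state rather than a plain bool. The counter stands in for the list manipulation that must not run twice.

```c
#include <stdbool.h>

/* Toy stand-in for a NAPI instance; not the kernel's struct napi_struct. */
struct toy_napi {
	bool listed;	/* set while the instance is on the device list */
	int list_ops;	/* counts how often the list was really modified */
};

/* Add to the list; a repeated add is refused (the kernel would warn). */
bool toy_napi_add(struct toy_napi *n)
{
	if (n->listed)
		return false;
	n->listed = true;
	n->list_ops++;
	return true;
}

/* Remove from the list; a repeated delete is a harmless no-op. */
bool toy_napi_del(struct toy_napi *n)
{
	if (!n->listed)
		return false;
	n->listed = false;
	n->list_ops++;
	return true;
}
```

Because the second delete returns early, the non-reinitialising list removal runs at most once per add, which is what keeps the operation idempotent.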
Jakub Kicinski [Wed, 9 Sep 2020 17:37:51 +0000 (10:37 -0700)]
net: remove napi_hash_del() from driver-facing API
We allow drivers to call napi_hash_del() before calling
netif_napi_del() to batch RCU grace periods. This makes
the API asymmetric and leaks internal implementation details.
Soon we will want the grace period to protect more than just
the NAPI hash table.
Restructure the API and have drivers call a new function -
__netif_napi_del() if they want to take care of RCU waits.
Note that only core was checking the return status from
napi_hash_del() so the new helper does not report if the
NAPI was actually deleted.
Some notes on driver oddness:
- veth observed the grace period before calling netif_napi_del()
but that should not matter
- myri10ge observed normal RCU flavor
- bnx2x and enic did not actually observe the grace period
(unless they did so implicitly)
- virtio_net and enic only unhashed Rx NAPIs
The last two points seem to indicate that the calls to
napi_hash_del() were a left over rather than an optimization.
Regardless, it's easy enough to correct them.
This patch may introduce extra synchronize_net() calls for
interfaces which set NAPI_STATE_NO_BUSY_POLL and depend on
free_netdev() to call netif_napi_del(). This seems inevitable
since we want to use RCU for netpoll dev->napi_list traversal,
and almost no drivers set IFF_DISABLE_NETPOLL.
Jakub Kicinski [Tue, 8 Sep 2020 22:21:14 +0000 (15:21 -0700)]
mlx4: make sure to always set the port type
Even though mlx4_core registers the devlink ports, it's mlx4_en
and mlx4_ib which set their type. In situations where one of
the two is not built yet the machine has ports of given type
we see the devlink warning from devlink_port_type_warn() trigger.
Having ports of a type not supported by the kernel may seem
surprising, but it does occur in practice - when the unsupported
port is not plugged in to a switch anyway users are more than happy
not to see it (and potentially allocate any resources to it).
Set the type in mlx4_core if type-specific driver is not built.
Jakub Kicinski [Tue, 8 Sep 2020 22:21:13 +0000 (15:21 -0700)]
devlink: don't crash if netdev is NULL
Following change will add support for a corner case where
we may not have a netdev to pass to devlink_port_type_eth_set()
but we still want to set port type.
This is definitely a corner case, and drivers should not normally
pass NULL netdev - print a warning message when this happens.
Sadly for other port types (ib) switches don't have a device
reference, the way we always do for Ethernet, so we can't put
the warning in __devlink_port_type_set().
net: mvneta: rely on MVNETA_MAX_RX_BUF_SIZE for pkt split in mvneta_swbm_rx_frame()
In order to easily change the rx buffer size, rely on
MVNETA_MAX_RX_BUF_SIZE instead of PAGE_SIZE in mvneta_swbm_rx_frame
routine for rx buffer split. Currently this is not an issue since we set
MVNETA_MAX_RX_BUF_SIZE to PAGE_SIZE - MVNETA_SKB_PAD, but it is good to
have when configuring a different rx buffer size.
====================
Allow more than 255 IPv4 multicast interfaces
Currently it is not possible to use more than 255 multicast interfaces
for IPv4 due to the format of the igmpmsg header which only has 8 bits
available for the VIF ID. There is space available in the igmpmsg
header to store the full VIF ID in the form of an unused byte following
the VIF ID field. There is also enough space for the full VIF ID in
the Netlink cache notifications, however the value is currently taken
directly from the igmpmsg header and has thus already been truncated.
Adding the high byte of the VIF ID into the unused3 byte of igmpmsg
allows use of more than 255 IPv4 multicast interfaces. The full VIF ID
is also available in the Netlink notification by assembling it from
both bytes from the igmpmsg.
Additionally this reveals a deficiency in the Netlink cache report
notifications: they lack any means for differentiating cache reports
relating to different multicast routing tables. This is easily
resolved by adding the multicast route table ID to the cache reports.
changes in v2:
- Added high byte of VIF ID to igmpmsg struct replacing unused3
member.
- Assemble VIF ID in Netlink notification from both bytes in igmpmsg
header.
====================
Paul Davey [Mon, 7 Sep 2020 22:04:07 +0000 (10:04 +1200)]
ipmr: Add high byte of VIF ID to igmpmsg
Use the unused3 byte in struct igmpmsg to hold the high 8 bits of the
VIF ID.
If using more than 255 IPv4 multicast interfaces, it is necessary to have
access to a VIF ID for cache reports that is wider than 8 bits. The VIF
ID present in the igmpmsg reports sent to mroute_sk was only 8 bits wide
in the igmpmsg header. Adding the high 8 bits of the 16-bit VIF ID in
the unused byte allows use of more than 255 IPv4 multicast interfaces.
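Splitting a 16-bit VIF ID across two bytes and reassembling it can be sketched as follows. The struct and field names (vif_lo, vif_hi) are hypothetical placeholders for the existing VIF byte and the formerly unused byte; the real header layout is not reproduced here.

```c
#include <stdint.h>

/* Toy header holding only the two bytes relevant to the VIF ID. */
struct toy_igmpmsg {
	uint8_t vif_lo;	/* existing 8-bit VIF ID field */
	uint8_t vif_hi;	/* formerly unused byte, now the high 8 bits */
};

/* Store a 16-bit VIF ID across the two bytes. */
void toy_set_vif(struct toy_igmpmsg *msg, uint16_t vifi)
{
	msg->vif_lo = (uint8_t)(vifi & 0xff);
	msg->vif_hi = (uint8_t)(vifi >> 8);
}

/* Reassemble the full VIF ID, as a notification consumer would. */
uint16_t toy_get_vif(const struct toy_igmpmsg *msg)
{
	return (uint16_t)(msg->vif_lo | (msg->vif_hi << 8));
}
```

A reader that only looks at the low byte still sees the same value as before for IDs below 256, since the high byte is zero in that case.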
Paul Davey [Mon, 7 Sep 2020 22:04:06 +0000 (10:04 +1200)]
ipmr: Add route table ID to netlink cache reports
Insert the multicast route table ID as a Netlink attribute to Netlink
cache report notifications.
When multiple route tables are in use it is necessary to have a way to
determine which route table a given cache report belongs to when
receiving the cache report.
David S. Miller [Wed, 9 Sep 2020 21:22:42 +0000 (14:22 -0700)]
Merge branch 'Marvell-PP2-2-PTP-support'
Russell King says:
====================
Marvell PP2.2 PTP support
This series adds PTP support for PP2.2 hardware to the mvpp2 driver.
Tested on the Macchiatobin eth1 port.
Note that on the Macchiatobin, eth0 uses a separate TAI block from
eth1, and there is no hardware synchronisation between the two.
====================
Russell King [Wed, 9 Sep 2020 16:25:55 +0000 (17:25 +0100)]
net: mvpp2: ptp: add support for receive timestamping
Add support for receive timestamping. When enabled, the hardware adds
a timestamp into the receive queue descriptor for all received packets
with no filtering. Hence, we can only support NONE or ALL receive
filter modes.
The timestamp in the receive queue contains two bits of seconds and
the full nanosecond timestamp. This has to be merged with the remainder
of the seconds from the TAI clock to arrive at a full timestamp before
we can convert it to a ktime for the skb hardware timestamp field.
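One way to perform such a merge is sketched below. The layout is an assumption for illustration: the top two bits of a 32-bit word carry the low two bits of the seconds and the remaining 30 bits carry nanoseconds. The sketch also assumes the descriptor timestamp is at most one second older, or up to two seconds newer, than the seconds value read from the clock. All names are hypothetical.

```c
#include <stdint.h>

struct toy_ts {
	int64_t sec;
	uint32_t nsec;
};

/*
 * Merge a 32-bit descriptor timestamp with the full seconds counter.
 * Assumed layout: bits 31..30 are the low two bits of the seconds,
 * bits 29..0 are nanoseconds.
 */
struct toy_ts toy_merge_tstamp(int64_t tai_sec, uint32_t tstamp)
{
	struct toy_ts ts;
	int delta;

	ts.nsec = tstamp & 0x3fffffffu;

	/* Distance between the two-bit seconds values, modulo 4. */
	delta = (int)(((tstamp >> 30) - (uint32_t)(tai_sec & 3)) & 3);
	if (delta == 3)
		delta -= 4;	/* treat as one second in the past */

	ts.sec = tai_sec + delta;
	return ts;
}
```

Two bits can only distinguish four consecutive seconds, so the clock's seconds value has to be sampled close in time to the packet for the reconstruction to be unambiguous.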
Russell King [Wed, 9 Sep 2020 16:25:45 +0000 (17:25 +0100)]
net: mvpp2: check first level interrupt status registers
Check the first level interrupt status registers to determine how to
further process the port interrupt. We will need this to know whether
to invoke the link status processing and/or the PTP processing for
both XLG and GMAC.
The link interrupt is used for way more than just the link status; it
comes from a collection of units to do with the port. The Marvell
documentation describes the interrupt as "GOP port X interrupt".
Since we are adding PTP support, and the PTP interrupt uses this,
rename it to be more in line with the documentation.
This interrupt is also mis-named in the DT binding, but we leave that
alone.
David S. Miller [Wed, 9 Sep 2020 21:19:56 +0000 (14:19 -0700)]
Merge branch 'devlink-show-controller-number'
Parav Pandit says:
====================
devlink show controller number
Currently a devlink instance that supports an eswitch handles eswitch
ports of two types of controllers.
(1) controller discovered on same system where eswitch resides.
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external system.
This is the case where a controller is plugged in one system and its
devlink eswitch ports are located in a different system. In this case
devlink instance of the eswitch only has access to ports of the
controller.
However, there is no way to describe which controller an eswitch devlink
port belongs to (mainly which external host controller).
This problem is more prevalent when port attributes such as PF and VF
numbers overlap between multiple controllers of the same eswitch.
Due to this, for a specific switch_id, unique phys_port_name cannot
be constructed for such devlink ports.
This short series overcomes this limitation by defining two new
attributes.
(a) external: Indicates if port belongs to external controller
(b) controller number: Indicates a controller number of the port
Based on this a unique phys_port_name is prepared using controller
number.
phys_port_name construction using unique controller number is only
applicable to external controller ports. This ensures that for
non smartnic usecases where there is no external controller,
phys_port_name stays same as before.
Patch summary:
Patch-1 Added mlx5 driver to read controller number
Patch-2 Adds the missing comment for the port attributes
Patch-3 Move structure comments away from structure fields
Patch-4 external attribute added for PCI port flavours
Patch-5 Add controller number
Patch-6 Use controller number to build phys_port_name
---
Changelog:
v2->v3:
- Updated diagram to get rid of controller 'A' and 'B'
- Kept ports of single controller together in diagram
- Updated diagram for pf1's VF and SF and its ports
v1->v2:
- Added text diagram of multiple controllers
- Updated example for a VF
- Addressed comments from Jiri and Jakub
- Moved controller number attribute to PCI port flavours
This enables a better, hierarchical view with controller and its
PF, VF numbers
- Split 'external' and 'controller number' attributes as two
different attributes
- Merged mlx5_core driver to avoid compilation break
====================
devlink: Use controller while building phys_port_name
Now that controller number attribute is available, use it when
building phys_port_name for external controller ports.
An example devlink port and representor netdev name consist of controller
annotation for external controller with controller number = 1,
for a VF 1 of PF 0:
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
function:
hw_addr 00:00:00:00:00:00
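The naming rule can be sketched in user space as below. The struct and function names are hypothetical and this is not the devlink implementation. It assumes, consistent with the example above, that an external VF port gets a "c<controller>" prefix in front of the usual "pf<N>vf<M>" name, while a local port keeps the unprefixed form.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical attribute set for a PCI VF port. */
struct toy_vf_attrs {
	unsigned int controller;
	unsigned int pf;
	unsigned int vf;
	bool external;
};

/*
 * Build the port name into "name". Returns 0 on success and -1 when the
 * buffer is too small. Only external ports get the controller prefix.
 */
int toy_vf_port_name(const struct toy_vf_attrs *a, char *name, size_t len)
{
	int n = 0;
	int m;

	if (a->external) {
		n = snprintf(name, len, "c%u", a->controller);
		if (n < 0 || (size_t)n >= len)
			return -1;
	}
	m = snprintf(name + n, len - (size_t)n, "pf%uvf%u", a->pf, a->vf);
	if (m < 0 || (size_t)m >= len - (size_t)n)
		return -1;
	return 0;
}
```

For controller 1, PF 0, VF 1 on an external controller this yields "c1pf0vf1", matching the suffix of the netdev name in the example, and "pf0vf1" for the same numbers on a local controller.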
A devlink port may be for a controller consisting of PCI device(s).
A devlink instance holds ports of two types of controllers.
(1) controller discovered on same system where eswitch resides
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external host system.
This is the case where a controller is located in one system and its
devlink eswitch ports are located in a different system.
When a devlink eswitch instance serves the devlink ports of both
controllers together, PCI PF/VF numbers may overlap.
Due to this a unique phys_port_name cannot be constructed.
For example, in the system below, controller-0 and controller-1 each have
PCI PF pf0 whose eswitch ports can be present in controller-0.
This results in phys_port_name as "pf0" for both.
Similar problem exists for VFs and upcoming Sub functions.