Git Repo - linux.git/log

sfc: select inner-csum-offload TX queues for skbs that need it

Won't actually be exercised until we start advertising the corresponding
offload features.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: create inner-csum queues on EF10 if supported

If the MC reports the VXLAN_NVGRE datapath capability, then these queues
can be used for checksum offload of encapsulated packets.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: define inner/outer csum offload TXQ types

Nothing yet creates inner csum TXQs; just change all references to
EFX_TXQ_TYPE_OFFLOAD to the new EFX_TXQ_TYPE_OUTER_CSUM.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: decouple TXQ type from label

Make it possible to have an arbitrary mapping from types to labels,
because when we add inner-csum-offload TXQs there will no longer be a
convenient nesting hierarchy of NIC types (EF10 will have inner-csum
TXQs, while Siena will have HIGHPRI).
Correct a misleading comment on efx_hard_start_xmit().
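
The idea of an arbitrary type-to-label mapping can be sketched with a small Python model. This is only an illustration: the flag values, the NIC names used as keys, and the label tables below are assumptions chosen for the example, not the driver's actual definitions.

```python
# Hypothetical type flags, for illustration only.
TXQ_TYPE_OUTER_CSUM = 1
TXQ_TYPE_INNER_CSUM = 2
TXQ_TYPE_HIGHPRI = 4

# Each NIC family supports a different, non-nested subset of types, so the
# label (the queue's index within its channel) comes from a lookup table
# instead of being equal to the type value. The tables are hypothetical.
TYPE_TO_LABEL = {
    "ef10": {
        0: 0,
        TXQ_TYPE_OUTER_CSUM: 1,
        TXQ_TYPE_INNER_CSUM: 2,
        TXQ_TYPE_OUTER_CSUM | TXQ_TYPE_INNER_CSUM: 3,
    },
    "siena": {
        0: 0,
        TXQ_TYPE_OUTER_CSUM: 1,
        TXQ_TYPE_HIGHPRI: 2,
        TXQ_TYPE_OUTER_CSUM | TXQ_TYPE_HIGHPRI: 3,
    },
}


def label_for(nic, txq_type):
    """Return the label for txq_type on nic, or None if unsupported."""
    return TYPE_TO_LABEL[nic].get(txq_type)
```

In this model both NIC families use the dense labels 0..3, even though the sets of types behind those labels differ.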

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

octeontx2-af: Constify npc_kpu_profile_{action,cam}

These are never modified, so constify them to allow the compiler to
place them in read-only memory. This moves about 25kB to read-only
memory as seen by the output of the size command.

Before:
   text    data     bss     dec     hex filename
296203   65464    1248  362915   589a3 drivers/net/ethernet/marvell/octeontx2/af/octeontx2_af.ko

After:
   text    data     bss     dec     hex filename
321003   40664    1248  362915   589a3 drivers/net/ethernet/marvell/octeontx2/af/octeontx2_af.ko
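
As a quick cross-check of the "about 25kB" figure, the two size outputs above can be compared in a few lines of Python. The numbers are copied from the tables; the variable names are only for this sketch.

```python
# Section sizes in bytes, copied from the size output above.
before = {"text": 296203, "data": 65464, "bss": 1248}
after = {"text": 321003, "data": 40664, "bss": 1248}

moved = before["data"] - after["data"]    # bytes that left the data column
gained = after["text"] - before["text"]   # bytes that the text column gained
total_before = sum(before.values())
total_after = sum(after.values())

print(moved, gained, total_before, total_after)
```

The data column shrinks by exactly the amount the text column grows, and the total matches the dec column in both tables, which is what one would expect when objects only move between sections.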

Signed-off-by: Rikard Falkeborn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'sfc-misc-cleanups'

Edward Cree says:

====================
sfc: misc cleanups

Clean up a few nits I noticed while working on TXQ stuff.
====================

Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: cleanups around efx_alloc_channel

The old_channel argument is never used, so remove it.
The function is only called from elsewhere in efx_channels.c, so make
it static and remove the declaration from the header file.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: remove spurious unreachable return statement

The statement above it already returns, so there is no way to get here.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sfc: remove duplicate call to efx_init_channels from EF100 probe

efx_init_struct already calls this, we don't need to do it again.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: mcast: Fix incomplete MDB dump

Each MDB entry is encoded in a nested netlink attribute called
'MDBA_MDB_ENTRY'. In turn, this attribute contains another nested
attribute called 'MDBA_MDB_ENTRY_INFO', which encodes a single port
group entry within the MDB entry.

The cited commit added the ability to restart a dump from a specific
port group entry. However, on failure to add a port group entry to the
dump the entire MDB entry (stored in 'nest2') is removed, resulting in
missing port group entries.

Fix this by finalizing the MDB entry with the partial list of already
encoded port group entries.
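
The effect can be illustrated with a simplified Python model of a resumable dump. It is a sketch under stated assumptions, not the bridge code: messages are plain lists, "capacity" stands in for the space left in the message, and the function and parameter names are hypothetical.

```python
def dump_mdb(entries, capacity, finalize_partial):
    """Dump port groups as a list of messages.

    entries: list of (entry_name, [port_group, ...])
    capacity: port groups that fit in one message (must be >= 1)
    finalize_partial: True models the fix, False models the old behaviour
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    messages = []
    start_entry, start_group = 0, 0           # restart point of the dump
    while True:
        msg, used, resume = [], 0, None
        for ei in range(start_entry, len(entries)):
            name, groups = entries[ei]
            nest = []                          # models one MDBA_MDB_ENTRY
            first = start_group if ei == start_entry else 0
            for gi in range(first, len(groups)):
                if used == capacity:           # message full: adding fails
                    resume = (ei, gi)
                    break
                nest.append(groups[gi])        # models one MDBA_MDB_ENTRY_INFO
                used += 1
            if resume is not None:
                if finalize_partial and nest:
                    msg.append((name, nest))   # keep what was already encoded
                break                          # otherwise the nest is discarded
            msg.append((name, nest))
        messages.append(msg)
        if resume is None:
            return messages
        start_entry, start_group = resume


def dumped_groups(messages):
    """Flatten all messages into the list of port groups that were dumped."""
    return [g for msg in messages for _, nest in msg for g in nest]
```

With finalize_partial=False the dump resumes from the failing port group, but the groups encoded before it in the same entry were thrown away with the nest, so they never reach the receiver. With finalize_partial=True the partial entry is kept and nothing is lost.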

Fixes: 5205e919c9f0 ("net: bridge: mcast: add support for src list and filter mode dumping")
Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: remove redundant assignment to variable err

The variable err is being initialized with a value that is never read and
it is being updated later with a new value. The initialization is redundant
and can be removed. Also re-order variable declarations in reverse
Christmas tree ordering.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'ag71xx-add-ethtool-and-flow-control-support'

Oleksij Rempel says:

====================
ag71xx: add ethtool and flow control support

The main goal of these patches is to provide flow control support
for the ag71xx driver. To be able to validate this functionality, I also
added ethtool support with HW counters. These patches were validated
with iperf3 and counters showing Pause frames sent or received by this
NIC.
====================

Signed-off-by: David S. Miller <[email protected]>

net: ag71xx: add flow control support

Add flow control support. The functionality was tested on AR9331 SoC and
confirmed by iperf3 results and HW counters exported over ethtool.
The following test configuration was used:

iMX6S receiver <--- TL-SG1005D switch <---- AR9331 sender

The switch supports symmetric flow control:
Settings for eth0:
        Supported ports: [ MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
--->>   Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 100Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: MII
        PHYAD: 4
        Transceiver: external
        Link detected: yes

The iMX6S system was configured to 10Mbit, to let the switch use flow
control:
  - ethtool -s eth0 speed 10

With flow control disabled on AR9331:
  - ethtool -A eth0  rx off tx off
  - iperf3 -u -c 172.17.0.1 -b100M -l1472 -t10

[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  66.2 MBytes  55.5 Mbits/sec  0.000 ms  0/47155 (0%)  sender
[  5]   0.00-10.04  sec  11.5 MBytes  9.57 Mbits/sec  1.309 ms  38986/47146 (83%)  receiver

With flow control enabled on AR9331:
  - ethtool -A eth0  rx on tx on
  - iperf3 -u -c 172.17.0.1 -b100M -l1472 -t10

[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  15.1 MBytes  12.6 Mbits/sec  0.000 ms  0/10727 (0%)  sender
[  5]   0.00-10.05  sec  11.5 MBytes  9.57 Mbits/sec  1.371 ms  2525/10689 (24%)  receiver

Similar results are obtained in the opposite direction by introducing
extra CPU load on AR9331:
  - chrt 40 dd if=/dev/zero of=/dev/null &
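
The loss percentages in the two iperf3 runs above can be recomputed from the Lost/Total datagram counts. The snippet below is only a convenience sketch; the helper name is hypothetical.

```python
def loss_percent(lost, total):
    """Return the share of lost datagrams as a percentage."""
    return 100.0 * lost / total


# Lost/Total counts from the receiver lines of the two runs above.
fc_off = loss_percent(38986, 47146)   # flow control disabled
fc_on = loss_percent(2525, 10689)     # flow control enabled

print(f"flow control off: {fc_off:.1f}% lost")
print(f"flow control on:  {fc_on:.1f}% lost")
```

Rounded to whole percent, these values agree with the figures iperf3 reports for the two runs.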

Signed-off-by: Oleksij Rempel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ag71xx: add ethtool support

Add basic ethtool support. The functionality was tested on AR9331 SoC.

Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers/net/wan/x25_asy: Remove an unused flag "SLF_OUTWAIT"

The "SLF_OUTWAIT" flag defined in x25_asy.h is not actually used.
It is only cleared at one place in x25_asy.c but is never read or set.
So we can remove it.

Signed-off-by: Xie He <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: set get_rx_header_len() as void for it didn't have any error code to return

We found the following warning when using W=1 to build kernel:

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:3634:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
int ret, coe = priv->hw->rx_csum;

Looking into stmmac_get_rx_header_len(), both dwmac4_get_rx_header_len()
and dwxgmac2_get_rx_header_len() only ever return 0 and have no error code
to report. Therefore, it's better to define get_rx_header_len() as void.

Signed-off-by: Luo Jiaxing <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'Add-GVE-Features'

David Awogbemila says:

====================
Add GVE Features.

Note: Patch 4 in v3 was dropped.

Patch 4 (patch 5 in v3): Start/stop timer only when report stats is
enabled/disabled.
Patch 7 (patch 8 in v3): Use netdev_info, not dev_info, to log
device link status.
====================

Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Enable Link Speed Reporting in the driver.

This change allows the driver to report the device link speed
when the ethtool command:
ethtool <nic name>
is run.
Getting the link speed is done via a new admin queue command:
ReportLinkSpeed.

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Use link status register to report link status

This makes the driver better aware of the connectivity status of the
device. Based on the device's status register, the driver can call
netif_carrier_{on,off}.

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: Patricio Noyola <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Batch AQ commands for creating and destroying queues.

Adds support for batching AQ commands and uses it for creating and
destroying queues.

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: Sagi Shahar <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: NIC stats for report-stats and for ethtool

This adds per queue NIC stats to ethtool stats and to report-stats.
These stats are always exposed to guest whether or not the
report-stats flag is turned on.

Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.

This adds functionality to report driver stats to Hypervisor.
(Users may want to turn this feature off as a matter of privacy
so a "report-stats" flag is added as an ethtool priv option.
It is also disabled by default.)
The hypervisor would trigger a stats report in case "too many"
packets are dropped; the stats would be useful in debugging stuck
queues.
A "stats_report_trigger_cnt" stat is added to count the number of times
the hypervisor attempts to trigger a stats report.

A timer is also added so that when report-stats is enabled, stats are
updated once every 20 seconds.

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: Kuo Zhao <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Use dev_info/err instead of netif_info/err.

Update the driver to use dev_info/err instead of netif_info/err.

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: Catherine Sullivan <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Add stats for gve.

Sample output of "ethtool -S <interface-name>" with 1 RX queue and 1 TX
queue:
NIC statistics:
     rx_packets: 1039
     tx_packets: 37
     rx_bytes: 822071
     tx_bytes: 4100
     rx_dropped: 0
     tx_dropped: 0
     tx_timeouts: 0
     rx_skb_alloc_fail: 0
     rx_buf_alloc_fail: 0
     rx_desc_err_dropped_pkt: 0
     interface_up_cnt: 1
     interface_down_cnt: 0
     reset_cnt: 0
     page_alloc_fail: 0
     dma_mapping_error: 0
     rx_posted_desc[0]: 1365
     rx_completed_desc[0]: 341
     rx_bytes[0]: 215094
     rx_dropped_pkt[0]: 0
     rx_copybreak_pkt[0]: 3
     rx_copied_pkt[0]: 3
     tx_posted_desc[0]: 6
     tx_completed_desc[0]: 6
     tx_bytes[0]: 420
     tx_wake[0]: 0
     tx_stop[0]: 0
     tx_event_counter[0]: 6
     adminq_prod_cnt: 34
     adminq_cmd_fail: 0
     adminq_timeouts: 0
     adminq_describe_device_cnt: 1
     adminq_cfg_device_resources_cnt: 1
     adminq_register_page_list_cnt: 16
     adminq_unregister_page_list_cnt: 0
     adminq_create_tx_queue_cnt: 8
     adminq_create_rx_queue_cnt: 8
     adminq_destroy_tx_queue_cnt: 0
     adminq_destroy_rx_queue_cnt: 0
     adminq_dcfg_device_resources_cnt: 0
     adminq_set_driver_parameter_cnt: 0

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: Kuo Zhao <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

gve: Get and set Rx copybreak via ethtool

This adds support for getting and setting the RX copybreak
value via ethtool.

Reviewed-by: Yangchun Fu <[email protected]>
Signed-off-by: Kuo Zhao <[email protected]>
Signed-off-by: David Awogbemila <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'wireless-drivers-next-2020-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.10

First set of patches for v5.10. Most noteworthy here is ath11k getting
initial support for QCA6390 and IPQ6018 devices. But most of the
patches are cleanup: W=1 warning fixes, fallthrough keywords, DMA API
changes and tasklet API changes.

Major changes:

ath10k

* support SDIO firmware coredumps

* support station specific TID configurations

ath11k

* add support for IPQ6018

* add support for QCA6390 PCI devices

ath9k

* add support for NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 to improve PTK0
rekeying

wcn36xx

* add support for TX ack
====================

Signed-off-by: David S. Miller <[email protected]>

Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git

ath.git patches for v5.10. Major changes:

ath10k

* support SDIO firmware coredumps

* support station specific TID configurations

ath11k

* add support for IPQ6018

ath10k: Remove unused macro ATH10K_ROC_TIMEOUT_HZ

There is no caller in tree, so it can be removed.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

ath11k: Remove unused inline function htt_htt_stats_debug_dump()

There is no caller in tree, so it can be removed.

Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

ath11k: fix link error when CONFIG_REMOTEPROC is disabled

If CONFIG_REMOTEPROC was disabled the linking failed with:

ERROR: modpost: "rproc_get_by_phandle" [drivers/net/wireless/ath/ath11k/ath11k.ko] undefined!

Compile tested only.

Reported-by: Randy Dunlap <[email protected]>
Fixes: 1ff8ed786d5d ("ath11k: use remoteproc only with AHB devices")
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/0101017476e38f40-c4168ac4-c00a-4220-a032-fe17e4a157cb-000000@us-west-2.amazonses.com

ath11k: remove calling ath11k_init_hw_params() second time

During probe ath11k_init_hw_params() is called from ath11k_core_pre_init()
and is not needed again in ath11k_core_init().

Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00009-QCAHKSWPL_SILICONZ-1

Fixes: 1ff8ed786d5d ("ath11k: use remoteproc only with AHB devices")
Signed-off-by: Anilkumar Kolli <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/010101746d2a40d3-25cd7dbe-c0dd-4fdf-8735-366d7fb40207-000000@us-west-2.amazonses.com

ath11k: add raw mode and software crypto support

Adding raw mode tx/rx support. Also, adding support
for software crypto which depends on raw mode.

To enable raw mode tx/rx:
insmod ath11k.ko frame_mode=0

To enable software crypto:
insmod ath11k.ko crypto_mode=1

These modes could be helpful in debugging crypto related issues.

Tested-on: IPQ8074 WLAN.HK.2.1.0.1-01228-QCAHKSWPL_SILICONZ-1

Signed-off-by: Manikanta Pubbisetty <[email protected]>
Signed-off-by: Venkateswara Naralasetty <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/010101746c6a52d9-18302a2c-0d6d-4057-aa4b-95960c809646-000000@us-west-2.amazonses.com

ath11k: add ipq6018 support

IPQ6018 has one 5G and one 2G radio with 2x2,
shares ipq8074 configurations.

Tested on: IPQ6018 hw1.0 AHB WLAN.HK.2.2-02134-QCAHKSWPL_SILICONZ-1
Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00009-QCAHKSWPL_SILICONZ-1

Signed-off-by: Anilkumar Kolli <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/010101746cb68b63-c2bc31ec-a31e-442e-a572-26f4c045c06b-000000@us-west-2.amazonses.com

ath11k: move target ce configs to hw_params

Move target CE config and target CE service config to hw_params.
No functional changes.

Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00009-QCAHKSWPL_SILICONZ-1

Signed-off-by: Anilkumar Kolli <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/010101746cb685d9-6bedeccb-29a1-4d32-8664-fcfe7d105f4a-000000@us-west-2.amazonses.com

dt: bindings: net: update compatible for ath11k

Add IPQ6018 wireless driver support;
it is based on the ath11k driver.

Signed-off-by: Anilkumar Kolli <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/010101746cb6751a-ca300933-1174-4534-a01b-b1dbf1c1f305-000000@us-west-2.amazonses.com

net: smc91x: Remove set but not used variable 'status' in smc_phy_configure()

Fixes the following warning when using W=1 to build kernel:

drivers/net/ethernet/smsc/smc91x.c: In function ‘smc_phy_configure’:
drivers/net/ethernet/smsc/smc91x.c:1039:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]
int status;

Signed-off-by: Luo Jiaxing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'smc-next'

Karsten Graul says:

====================
net/smc: updates 2020-09-10

Please apply the following patch series for smc to netdev's net-next tree.

This patch series is a mix of various improvements and cleanups.
The patches 1 and 10 improve the handling of large parallel workloads.
Patch 8 corrects a kernel config default for config CCWGROUP on s390.
Patch 9 allows userspace tools to retrieve socket information for more
sockets.
====================

Signed-off-by: David S. Miller <[email protected]>

net/smc: use separate work queues for different worker types

There are 6 types of workers which exist per smc connection. 3 of them
are used for listen and handshake processing, another 2 are used for
close and abort processing and 1 is the tx worker that moves calls to
sleeping functions into a worker.
To prevent flooding of the system work queue when many connections are
opened or closed at the same time (a pattern that uperf implements), move
those workers to one of 3 smc-specific work queues. Two work queues are
module-global and used for handshake and close workers. The third work
queue is defined per link group and used by the tx workers that may
sleep waiting for resources of this link group.
In addition, smc_llc_enqueue() now queues the llc_event_work work to the
system prio work queue because it's critical that this work is started fast.

Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: use the retry mechanism for netlink messages

When the netlink messages to be sent to the userspace
are too big for a single netlink message, send them in
chunks using the netlink_dump infrastructure. Modify the
smc diag dump code so that it can signal to the netlink_dump
infrastructure that it needs to send more data.

Signed-off-by: Guvenc Gulce <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390/net: add SMC config as one of the defaults of CCWGROUP

arch/s390/net/pnet.c uses ccwgroup function dev_is_ccwgroup()
in pnetid_by_dev_port().
For s390 the net/smc code makes use of function pnetid_by_dev_port().
Make sure ccwgroup is built into the kernel, if smc is to be built
into the kernel.

Signed-off-by: Guvenc Gulce <[email protected]>
Reviewed-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: immediate freeing in smc_lgr_cleanup_early()

smc_lgr_cleanup_early() schedules the free worker with a delay. DMB
unregistering occurs in this delayed worker, which needlessly increases
the risk of reaching the SMCD SBA limit. Terminate the
linkgroup immediately, since termination means early DMB unregistering.

For SMCD the global smc_server_lgr_pending lock is given up early.
A linkgroup to be given up with smc_lgr_cleanup_early() may already
contain more than one connection. Using __smc_lgr_terminate() in
smc_lgr_cleanup_early() covers this.

And consolidate smc_ism_put_vlan() and smc_put_device() into smc_lgr_free()
only.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: reduce smc_listen_decline() calls

smc_listen_work() already contains an smc_listen_decline() exit.
Use this exit for smc_listen_rdma_finish() problems as well.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: improve server ISM device determination

Move the check whether the peer can be reached into
smc_pnet_find_ism_by_pnetid(). Thus the search continues with another
ISM device if the check fails.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: common routine for CLC accept and confirm

smc_clc_send_accept() and smc_clc_send_confirm() are quite similar.
Move common code into a separate function smc_clc_send_confirm_accept().
And introduce separate SMCD and SMCR struct definitions for CLC accept
resp. confirm.
No functional change.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: dynamic allocation of CLC proposal buffer

Reduce stack size for smc_listen_work() and smc_clc_send_proposal()
by dynamic allocation of the CLC buffer to be received or sent.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: introduce better field names

Field names "srv_first_contact" and "cln_first_contact" are misleading,
since they apply to both, server and client. Rename them to
"first_contact_peer" and "first_contact_local".
Rename "ism_gid" to the more precise name "ism_peer_gid".
Rename version constant "SMC_CLC_V1" into "SMC_V1".
No functional change.

Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/smc: reduce active tcp_listen workers

SMC starts a separate tcp_listen worker for every SMC socket in
state SMC_LISTEN, and can accept an incoming connection request only
if this worker is really running and waiting in kernel_accept(). But
the number of running workers is limited.
This patch reworks the listening SMC code and starts a tcp_listen worker
after the SYN-ACK handshake on the internal clc-socket only.

Suggested-by: Karsten Graul <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Reviewed-by: Guvenc Gulce <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'nfc-s3fwrn5-Few-cleanups'

Krzysztof Kozlowski says:

====================
nfc: s3fwrn5: Few cleanups

Changes since v2:
1. Fix dtschema ID after rename (patch 1/8).
2. Apply patch 9/9 (defconfig change).

Changes since v1:
1. Rename dtschema file and add additionalProperties:false, as Rob
suggested,
2. Add Marek's tested-by,
3. New patches: #4, #5, #6, #7 and #9.
====================

Signed-off-by: David S. Miller <[email protected]>

arm64: dts: exynos: Use newer S3FWRN5 GPIO properties in Exynos5433 TM2

Since "s3fwrn5" is not a valid vendor prefix, use the new GPIO properties
instead of the deprecated ones.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

MAINTAINERS: Add Krzysztof Kozlowski to Samsung S3FWRN5 and remove Robert

Robert Bałdyga's email has not worked (it bounces) since 2016, so remove it.
Additionally there are no review/ack/tested tags from Krzysztof Opasiak
so it looks like the driver is not supported.

As a maintainer of Samsung ARM/ARM64 SoC, I can take care of this
driver and provide some review. However, the driver is clearly not in
supported mode as I do not work at Samsung anymore.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfc: s3fwrn5: Constify s3fwrn5_fw_info when not modified

Two functions accept pointer to struct s3fwrn5_fw_info but do not
modify the contents. Make them const so the code is a little bit safer.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfc: s3fwrn5: Add missing CRYPTO_HASH dependency

The driver uses crypto hash functions so it needs to select CRYPTO_HASH.
This fixes build errors:

arc-linux-ld: drivers/nfc/s3fwrn5/firmware.o: in function `s3fwrn5_fw_download':
firmware.c:(.text+0x152): undefined reference to `crypto_alloc_shash'

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfc: s3fwrn5: Remove unneeded 'ret' variable

The local variable 'ret' can be removed:

drivers/nfc/s3fwrn5/i2c.c:167:6: warning: variable 'ret' set but not used [-Wunused-but-set-variable]

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfc: s3fwrn5: Remove wrong vendor prefix from GPIOs

The device tree property prefix describes the vendor, which in case of
S3FWRN5 chip is Samsung. Therefore the "s3fwrn5" prefix for "en-gpios"
and "fw-gpios" is not correct and should be deprecated. Introduce
properly named properties for these GPIOs but still support deprecated
ones.
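
The "new name first, deprecated name as fallback" lookup can be sketched as follows. This is a Python model over a plain dictionary, not the driver's device tree code; the helper name is hypothetical, and the exact property strings used in the usage note ("en-gpios", "s3fwrn5,en-gpios") are assumptions based on the names mentioned in this series.

```python
def find_gpio(node, name, deprecated_name):
    """Look up a GPIO property, preferring the new name.

    node: dict modelling the properties of a device tree node
    Returns (value, used_deprecated), or (None, False) if neither exists.
    """
    if name in node:
        return node[name], False
    if deprecated_name in node:
        return node[deprecated_name], True
    return None, False
```

For example, find_gpio({"s3fwrn5,en-gpios": 20}, "en-gpios", "s3fwrn5,en-gpios") still finds the GPIO in an old device tree and reports that the deprecated name was used, so existing boards keep working while new ones use the properly named property.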

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dt-bindings: net: nfc: s3fwrn5: Remove wrong vendor prefix from GPIOs

The device tree property prefix describes the vendor, which in case of
S3FWRN5 chip is Samsung. Therefore the "s3fwrn5" prefix for "en-gpios"
and "fw-gpios" is not correct and should be deprecated. Introduce
properly named properties for these GPIOs and rename the "fw-gpios" to
"wake-gpios" to better describe its purpose.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dt-bindings: net: nfc: s3fwrn5: Convert to dtschema

Convert the Samsung S3FWRN5 NCI NFC controller bindings to dtschema.
This is conversion only so it includes properties with invalid prefixes
(s3fwrn5,en-gpios) which should be addressed later.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'hns-kdoc'

Wang Hai says:

====================
Fix some kernel-doc warnings for hns.
====================

Signed-off-by: David S. Miller <[email protected]>

net: hns: Fix a kernel-doc warning in hinic_hw_eqs.c

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.c:115: warning: Excess function parameter 'hw_handler' description in 'hinic_aeq_register_hw_cb'

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: Fix a kernel-doc warning in hinic_hw_api_cmd.c

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/huawei/hinic/hinic_hw_api_cmd.c:382: warning: Excess function parameter 'size' description in 'api_cmd'

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: Fix some kernel-doc warnings in hns_enet.c

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/hisilicon/hns/hns_enet.c:1841: warning: Excess function parameter 'netdev' description in 'hns_set_multicast_list'
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1841: warning: Excess function parameter 'p' description in 'hns_set_multicast_list'

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: Fix some kernel-doc warnings in hns_dsaf_xgmac.c

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c:137: warning: Excess function parameter 'drv' description in 'hns_xgmac_enable'
drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c:497: warning: Excess function parameter 'cmd' description in 'hns_xgmac_get_regs'

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: fix 'cdev' kernel-doc warning in hnae_ae_unregister()

Rename cdev to hdev.

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/hisilicon/hns/hnae.c:444: warning: Excess function parameter 'cdev' description in 'hnae_ae_unregister'

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hinic: Fix some kernel-doc warnings in hinic_hw_io.c

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/huawei/hinic/hinic_hw_io.c:373: warning: Excess function parameter 'sq_msix_entry' description in 'hinic_io_create_qps'
drivers/net/ethernet/huawei/hinic/hinic_hw_io.c:373: warning: Excess function parameter 'rq_msix_entry' description in 'hinic_io_create_qps'

Rename these wrong names.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: ptp: Fix unused variables

In the functions mvpp2_isr_handle_xlg() and
mvpp2_isr_handle_gmac_internal(), the bool variable link is assigned a
true value in the case that a given bit of val is set. However, if the
bit is unset, no value is assigned to link and it is then passed to
mvpp2_isr_handle_link() without being initialised. Fix by assigning to
link the value of the bit test.
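
The before/after pattern can be mimicked in Python, with one caveat: in C the unassigned variable holds an indeterminate value, whereas Python raises an exception, so this is an analogy only. The bit constant and function names are hypothetical.

```python
LINK_STATUS_BIT = 1 << 0  # hypothetical bit position, for illustration


def link_before(val):
    # Mirrors the old pattern: link is only assigned on one branch.
    if val & LINK_STATUS_BIT:
        link = True
    return link  # unassigned when the bit is clear


def link_after(val):
    # Mirrors the fix: link always takes the result of the bit test.
    link = bool(val & LINK_STATUS_BIT)
    return link
```

Calling link_before(0) fails with UnboundLocalError, which is Python's way of surfacing the same missing assignment; link_after() returns a defined value for every input.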

Build-tested on x86.

Fixes: 36cfd3a6e52b ("net: mvpp2: restructure "link status" interrupt handling")
Signed-off-by: Alex Dewar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: cxgb3: Fix some kernel-doc warnings

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/chelsio/cxgb3/t3_hw.c:2209: warning: Excess function parameter 'adapter' description in 'clear_sge_ctxt'
drivers/net/ethernet/chelsio/cxgb3/t3_hw.c:2975: warning: Excess function parameter 'adapter' description in 't3_set_proto_sram'

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'Enhance-current-features-in-ena-driver'

Sameeh Jubran says:

====================
Enhance current features in ena driver

This series adds the following:
* Exposes new device stats using ethtool.
* Adds and exposes the stats of xdp TX queues through ethtool.
====================

Signed-off-by: David S. Miller <[email protected]>

net: ena: xdp: add queue counters for xdp actions

When using XDP every ingress packet is passed to an eBPF (xdp) program
which returns an action for this packet.

This patch adds counters for the number of times each such action was
received. It also counts all the invalid actions received from the eBPF
program.
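
A rough sketch of per-action counting, assuming one counter per XDP
action plus one for unknown return values. The struct and function
names are hypothetical; the numeric action values mirror the kernel's
enum xdp_action.

```c
#include <stdint.h>

/* Same numeric values as the kernel's enum xdp_action. */
enum example_xdp_action {
	EX_XDP_ABORTED = 0,
	EX_XDP_DROP,
	EX_XDP_PASS,
	EX_XDP_TX,
	EX_XDP_REDIRECT,
};

/* Hypothetical per-queue counters, one per action plus "invalid". */
struct example_xdp_stats {
	uint64_t aborted;
	uint64_t drop;
	uint64_t pass;
	uint64_t tx;
	uint64_t redirect;
	uint64_t invalid;
};

/* Bump the counter matching the verdict returned by the eBPF program. */
void example_count_xdp_verdict(struct example_xdp_stats *s, uint32_t verdict)
{
	switch (verdict) {
	case EX_XDP_ABORTED:
		s->aborted++;
		break;
	case EX_XDP_DROP:
		s->drop++;
		break;
	case EX_XDP_PASS:
		s->pass++;
		break;
	case EX_XDP_TX:
		s->tx++;
		break;
	case EX_XDP_REDIRECT:
		s->redirect++;
		break;
	default:
		/* Anything else is not a known action. */
		s->invalid++;
		break;
	}
}
```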

Signed-off-by: Shay Agroskin <[email protected]>
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ena: ethtool: add stats printing to XDP queues

Added statistics for TX queues that are used for XDP TX. The statistics
are the same as the ones printed for regular non-XDP TX queues.

The XDP queue statistics can be queried using
`ethtool -S <ifname>`

Signed-off-by: Shay Agroskin <[email protected]>
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ena: ethtool: Add new device statistics

The new metrics provide granular visibility along multiple network
dimensions and enable troubleshooting and remediation of issues caused
by instances exceeding network performance allowances.

The new statistics can be queried using ethtool command.

Signed-off-by: Guy Tzalik <[email protected]>
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ena: ethtool: convert stat_offset to 64 bit resolution

The type of all stat fields is u64, therefore when iterating over stat
fields in a stats struct, it makes sense to use an offset in 64 bit
resolution. Doing so allows us to drop some of the casting that is
currently used when referencing stats.
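
To illustrate the idea (with a hypothetical stats struct, not the ena
one): when every field is a u64, an offset counted in u64 slots can be
used to index the struct directly, without byte-pointer casts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats struct; every field is 64 bits wide. */
struct example_stats {
	uint64_t cnt;
	uint64_t bytes;
	uint64_t drops;
};

/* Offset of a field measured in u64 slots rather than in bytes. */
#define EXAMPLE_STAT_OFFSET(field) \
	(offsetof(struct example_stats, field) / sizeof(uint64_t))

/* Read a stat by slot index; no char-pointer arithmetic is needed. */
uint64_t example_read_stat(const struct example_stats *s, size_t stat_offset)
{
	const uint64_t *base = (const uint64_t *)s;

	return base[stat_offset];
}
```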

Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

selftests/mptcp: Better delay & reordering configuration

The delay was intended to be configured to "simulate" a high(er) BDP
link. As such, it needs to be set as part of the loss-configuration and
not as part of the netem reordering configuration.

The reordering-config also requires a delay but that delay is the
reordering-extend. So, a good approach is to set the reordering-extend
as a function of the configured latency. E.g., 25% of the overall
latency.

To speed up the selftests, we limit the delay to 50ms maximum to avoid
having the selftests run for too long.
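
The arithmetic described above, sketched in C for illustration only
(the selftest itself is a shell script). The helper names are
hypothetical, and the 25% figure is the example ratio from the text,
not necessarily the value the script uses.

```c
/* Cap the simulated link delay at 50 ms so the tests stay short. */
unsigned int example_cap_delay_ms(unsigned int delay_ms)
{
	return delay_ms > 50 ? 50 : delay_ms;
}

/* Reordering delay derived as 25% of the configured latency. */
unsigned int example_reorder_delay_ms(unsigned int delay_ms)
{
	return delay_ms / 4;
}
```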

Finally, the intention of tc_reorder was that when it is unset, the test
picks a random configuration. However, currently it is always initialized
and thus the random config won't be picked up.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/6
Reported-and-reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Christoph Paasch <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'tcp-add-tos-reflection-feature'

Wei Wang says:

====================
tcp: add tos reflection feature

This patch series adds a new tcp feature to reflect TOS value received in
SYN, and send it out in SYN-ACK, and eventually set the TOS value of the
established socket with this reflected TOS value. This provides a way to
set the traffic class/QoS level for all traffic in the same connection
to be the same as the incoming SYN. It could be useful for datacenters
to provide equivalent QoS according to the incoming request.
This feature is guarded by /proc/sys/net/ipv4/tcp_reflect_tos, and is by
default turned off.
====================

Signed-off-by: David S. Miller <[email protected]>

tcp: reflect tos value received in SYN to the socket

This commit adds a new TCP feature to reflect the tos value received in
SYN, and send it out on the SYN-ACK, and eventually set the tos value of
the established socket with this reflected tos value. This provides a
way to set the traffic class/QoS level for all traffic in the same
connection to be the same as the incoming SYN request. It could be
useful in data centers to provide equivalent QoS according to the
incoming request.
This feature is guarded by /proc/sys/net/ipv4/tcp_reflect_tos, and is by
default turned off.
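
A simplified sketch of the behaviour described above, assuming a
boolean that stands in for the tcp_reflect_tos sysctl. The struct and
function names are hypothetical and do not reflect the actual kernel
code paths.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request state holding the TOS seen during the handshake. */
struct example_request {
	uint8_t syn_tos;
};

/* Remember the TOS value received from the peer. */
void example_record_tos(struct example_request *req, uint8_t received_tos)
{
	req->syn_tos = received_tos;
}

/*
 * TOS to use for the SYN-ACK and the established socket:
 * the reflected value when the feature is on, the listener's otherwise.
 */
uint8_t example_reply_tos(bool reflect_tos_enabled,
			  const struct example_request *req,
			  uint8_t listener_tos)
{
	return reflect_tos_enabled ? req->syn_tos : listener_tos;
}
```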

Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ip: pass tos into ip_build_and_send_pkt()

This commit adds tos as a new parameter to ip_build_and_send_pkt(),
which will be used in a later commit.
This is a pure restructuring and has no functional change.

Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: record received TOS value in the request socket

A new field is added to the request sock to record the TOS value
received on the listening socket during 3WHS:
When not under syn flood, it records the TOS value sent in the SYN.
When under syn flood, it records the TOS value sent in the ACK.
This is a preparation patch in order to do TOS reflection in the later
commit.

Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvneta: drop mvneta_stats from mvneta_swbm_rx_frame signature

Remove mvneta_stats from the mvneta_swbm_rx_frame signature, since stats
are now accounted in the mvneta_run_xdp routine.

Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'netpoll-make-sure-napi_list-is-safe-for-RCU-traversal'

Jakub Kicinski says:

====================
netpoll: make sure napi_list is safe for RCU traversal

This series is a follow-up to the fix in commit 96e97bc07e90 ("net:
disable netpoll on fresh napis"). To avoid any latent race conditions
convert dev->napi_list to a proper RCU list. We need minor restructuring
because it looks like netif_napi_del() used to be idempotent, and
it may be quite hard to track down everyone who depends on that.
====================

Signed-off-by: David S. Miller <[email protected]>

net: make sure napi_list is safe for RCU traversal

netpoll needs to traverse dev->napi_list under RCU, make
sure it uses the right iterator and that removal from this
list is handled safely.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: manage napi add/del idempotence explicitly

To RCUify napi->dev_list we need to replace list_del_init()
with list_del_rcu(). There is no _init() version for RCU for
obvious reasons. Up until now netif_napi_del() was idempotent,
so to make sure it remains such, add a bit which is set when
NAPI is listed, and cleared when it is removed. Since we don't
expect multiple calls to netif_napi_add() to be correct,
add a warning on that side.

Now that napi_hash_add / napi_hash_del are only called by
napi_add / del we can actually steal its bit. We just need
to make sure hash node is initialized correctly.
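
A userspace sketch of the "listed" bit idea, with a plain bool standing
in for the state bit and comments marking where the RCU list operations
would go. All names are hypothetical; the kernel would use atomic bit
operations rather than a bool.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a NAPI instance; 'listed' models the state bit. */
struct example_napi {
	bool listed;
};

/* Returns true if the instance was added, false on an unexpected repeat. */
bool example_napi_add(struct example_napi *n)
{
	if (n->listed) {
		/* Adding twice is not expected to be correct, so warn. */
		fprintf(stderr, "warning: example_napi_add() called twice\n");
		return false;
	}
	n->listed = true;
	/* list_add_rcu() on the device list would happen here. */
	return true;
}

/* Returns true if the instance was removed; repeated calls are no-ops. */
bool example_napi_del(struct example_napi *n)
{
	if (!n->listed)
		return false;
	n->listed = false;
	/* list_del_rcu() would happen here; there is no _init() variant. */
	return true;
}
```

With this shape a second example_napi_del() returns early instead of
touching an entry that has already been unlinked.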

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: remove napi_hash_del() from driver-facing API

We allow drivers to call napi_hash_del() before calling
netif_napi_del() to batch RCU grace periods. This makes
the API asymmetric and leaks internal implementation details.
Soon we will want the grace period to protect more than just
the NAPI hash table.

Restructure the API and have drivers call a new function -
__netif_napi_del() if they want to take care of RCU waits.

Note that only core was checking the return status from
napi_hash_del() so the new helper does not report if the
NAPI was actually deleted.

Some notes on driver oddness:
- veth observed the grace period before calling netif_napi_del()
  but that should not matter
- myri10ge observed normal RCU flavor
- bnx2x and enic did not actually observe the grace period
  (unless they did so implicitly)
- virtio_net and enic only unhashed Rx NAPIs

The last two points seem to indicate that the calls to
napi_hash_del() were a leftover rather than an optimization.
Regardless, it's easy enough to correct them.

This patch may introduce extra synchronize_net() calls for
interfaces which set NAPI_STATE_NO_BUSY_POLL and depend on
free_netdev() to call netif_napi_del(). This seems inevitable
since we want to use RCU for netpoll dev->napi_list traversal,
and almost no drivers set IFF_DISABLE_NETPOLL.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mlx4-avoid-devlink-port-type-not-set-warnings'

Jakub Kicinski says:

====================
mlx4: avoid devlink port type not set warnings

This small set addresses the issue of mlx4 potentially not setting
devlink port type when Ethernet or IB driver is not built, but
port has that type.

v2:
- add patch 1
====================

Signed-off-by: David S. Miller <[email protected]>

mlx4: make sure to always set the port type

Even though mlx4_core registers the devlink ports, it's mlx4_en
and mlx4_ib which set their type. In situations where one of
the two is not built, yet the machine has ports of the given type,
we see the devlink warning from devlink_port_type_warn() trigger.

Having ports of a type not supported by the kernel may seem
surprising, but it does occur in practice. When the unsupported
port is not plugged in to a switch anyway, users are more than happy
not to see it (and potentially not to allocate any resources to it).

Set the type in mlx4_core if type-specific driver is not built.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

devlink: don't crash if netdev is NULL

Following change will add support for a corner case where
we may not have a netdev to pass to devlink_port_type_eth_set()
but we still want to set port type.

This is definitely a corner case, and drivers should not normally
pass NULL netdev - print a warning message when this happens.

Sadly for other port types (ib) switches don't have a device
reference, the way we always do for Ethernet, so we can't put
the warning in __devlink_port_type_set().

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvneta: rely on MVNETA_MAX_RX_BUF_SIZE for pkt split in mvneta_swbm_rx_frame()

In order to easily change the rx buffer size, rely on
MVNETA_MAX_RX_BUF_SIZE instead of PAGE_SIZE in mvneta_swbm_rx_frame
routine for rx buffer split. Currently this is not an issue since we set
MVNETA_MAX_RX_BUF_SIZE to PAGE_SIZE - MVNETA_SKB_PAD, but it is good to
have in order to configure a different rx buffer size.

Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'Allow-more-than-255-IPv4-multicast-interfaces'

Paul Davey says:

====================
Allow more than 255 IPv4 multicast interfaces

Currently it is not possible to use more than 255 multicast interfaces
for IPv4 due to the format of the igmpmsg header which only has 8 bits
available for the VIF ID.  There is space available in the igmpmsg
header to store the full VIF ID in the form of an unused byte following
the VIF ID field.  There is also enough space for the full VIF ID in
the Netlink cache notifications, however the value is currently taken
directly from the igmpmsg header and has thus already been truncated.

Adding the high byte of the VIF ID into the unused3 byte of igmpmsg
allows use of more than 255 IPv4 multicast interfaces. The full VIF ID
is also available in the Netlink notification by assembling it from
both bytes from the igmpmsg.

Additionally, this reveals a deficiency in the Netlink cache report
notifications: they lack any means for differentiating cache reports
relating to different multicast routing tables.  This is easily
resolved by adding the multicast route table ID to the cache reports.

changes in v2:
- Added high byte of VIF ID to igmpmsg struct replacing unused3
   member.
- Assemble VIF ID in Netlink notification from both bytes in igmpmsg
   header.
====================

Signed-off-by: David S. Miller <[email protected]>

ipmr: Use full VIF ID in netlink cache reports

Insert the full 16 bit VIF ID into ipmr Netlink cache reports.

The VIF_ID attribute has 32 bits of space so can store the full VIF ID
extracted from the high and low byte fields in the igmpmsg.

Signed-off-by: Paul Davey <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipmr: Add high byte of VIF ID to igmpmsg

Use the unused3 byte in struct igmpmsg to hold the high 8 bits of the
VIF ID.

If using more than 255 IPv4 multicast interfaces, it is necessary to have
access to a VIF ID for cache reports that is wider than 8 bits. The VIF
ID present in the igmpmsg reports sent to mroute_sk was only 8 bits wide
in the igmpmsg header. Adding the high 8 bits of the 16 bit VIF ID in
the unused byte allows use of more than 255 IPv4 multicast interfaces.
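
The byte split can be sketched as follows. The struct and field names
here are hypothetical stand-ins for the low and high VIF ID bytes of
the header, not the real struct igmpmsg layout.

```c
#include <stdint.h>

/* Hypothetical pair of header bytes carrying a 16 bit VIF ID. */
struct example_vif_bytes {
	uint8_t vif_lo;	/* existing 8 bit VIF ID field */
	uint8_t vif_hi;	/* previously unused byte */
};

/* Store a 16 bit VIF ID across the two bytes. */
struct example_vif_bytes example_split_vif(uint16_t vif)
{
	struct example_vif_bytes b;

	b.vif_lo = (uint8_t)(vif & 0xff);
	b.vif_hi = (uint8_t)(vif >> 8);
	return b;
}

/* Reassemble the full VIF ID, e.g. for a Netlink attribute. */
uint16_t example_join_vif(struct example_vif_bytes b)
{
	return (uint16_t)(b.vif_lo | (b.vif_hi << 8));
}
```

A receiver that only reads the low byte still sees the same value as
before for VIF IDs below 256, since the high byte is then zero.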

Signed-off-by: Paul Davey <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipmr: Add route table ID to netlink cache reports

Insert the multicast route table ID as a Netlink attribute to Netlink
cache report notifications.

When multiple route tables are in use it is necessary to have a way to
determine which route table a given cache report belongs to when
receiving the cache report.

Signed-off-by: Paul Davey <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: b53: Report VLAN table occupancy via devlink

We already maintain an array of VLANs used by the switch so we can
simply iterate over it to report the occupancy via devlink.
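
A minimal sketch of such an occupancy count, assuming an entry counts
as "in use" when it has at least one member port. The struct and field
names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VLAN entry; a non-zero port mask means "in use". */
struct example_vlan {
	uint16_t members;
};

/* Count how many entries of the VLAN array are currently occupied. */
size_t example_vlan_occupancy(const struct example_vlan *vlans, size_t num_vlans)
{
	size_t used = 0;
	size_t i;

	for (i = 0; i < num_vlans; i++) {
		if (vlans[i].members)
			used++;
	}
	return used;
}
```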

Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'Marvell-PP2-2-PTP-support'

Russell King says:

====================
Marvell PP2.2 PTP support

This series adds PTP support for PP2.2 hardware to the mvpp2 driver.
Tested on the Macchiatobin eth1 port.

Note that on the Macchiatobin, eth0 uses a separate TAI block from
eth1, and there is no hardware synchronisation between the two.
====================

Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: ptp: add support for transmit timestamping

Add support for timestamping transmit packets. We allocate SYNC
messages to queue 1, every other message to queue 0.
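
The queue choice can be pictured like this. The function name is
hypothetical; the only protocol fact relied on is that the PTP
messageType (low four bits of the first header octet) is 0 for Sync.

```c
#include <stdint.h>

/* PTP messageType value for Sync (IEEE 1588). */
#define EXAMPLE_PTP_MSGTYPE_SYNC 0x0

/* Pick the timestamp queue: Sync goes to queue 1, everything else to 0. */
unsigned int example_tx_tstamp_queue(uint8_t first_header_octet)
{
	uint8_t msg_type = first_header_octet & 0x0f;

	return msg_type == EXAMPLE_PTP_MSGTYPE_SYNC ? 1 : 0;
}
```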

Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: ptp: add support for receive timestamping

Add support for receive timestamping. When enabled, the hardware adds
a timestamp into the receive queue descriptor for all received packets
with no filtering. Hence, we can only support NONE or ALL receive
filter modes.

The timestamp in the receive queue contains two bits of seconds and
the full nanosecond timestamp. This has to be merged with the remainder
of the seconds from the TAI clock to arrive at a full timestamp before
we can convert it to a ktime for the skb hardware timestamp field.
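
One way the merge could work, as a sketch under stated assumptions:
the descriptor value is taken to be 32 bits wide with the two seconds
bits on top and 30 bits of nanoseconds below, and a timestamp is
allowed to be at most one second older than the clock reading. Names
and layout are assumptions, not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical full timestamp. */
struct example_ts {
	int64_t sec;
	uint32_t nsec;
};

/*
 * Extend a raw descriptor timestamp using the current TAI seconds.
 * raw[31:30] = low two bits of seconds, raw[29:0] = nanoseconds (assumed).
 */
struct example_ts example_extend_tstamp(uint32_t raw, int64_t tai_sec)
{
	struct example_ts ts;
	int delta;

	ts.nsec = raw & 0x3fffffffu;

	/* Distance between descriptor seconds and clock seconds, modulo 4. */
	delta = (int)(((raw >> 30) - (uint32_t)(tai_sec & 3)) & 3u);

	/* A distance of 3 is read as "one second in the past". */
	if (delta == 3)
		delta -= 4;

	ts.sec = tai_sec + delta;
	return ts;
}
```

The modulo-4 arithmetic only resolves a small window around the clock
reading, so the clock seconds need to be sampled close to the time the
packet was received.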

Signed-off-by: Russell King <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: ptp: add TAI support

Add support for the TAI block in the mvpp2.2 hardware.

Acked-by: Richard Cochran <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: check first level interrupt status registers

Check the first level interrupt status registers to determine how to
further process the port interrupt. We will need this to know whether
to invoke the link status processing and/or the PTP processing for
both XLG and GMAC.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: rename mis-named "link status" interrupt

The link interrupt is used for way more than just the link status; it
comes from a collection of units to do with the port. The Marvell
documentation describes the interrupt as "GOP port X interrupt".

Since we are adding PTP support, and the PTP interrupt uses this,
rename it to be more in line with the documentation.

This interrupt is also mis-named in the DT binding, but we leave that
alone.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mvpp2: restructure "link status" interrupt handling

The "link status" interrupt is used for more than just link status.
Restructure mvpp2_link_status_isr() so we can add additional handling.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'devlink-show-controller-number'

Parav Pandit says:

====================
devlink show controller number

Currently a devlink instance that supports an eswitch handles eswitch
ports of two types of controllers.
(1) controller discovered on same system where eswitch resides.
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external system.
This is the case where a controller is plugged in one system and its
devlink eswitch ports are located in a different system. In this case
the devlink instance of the eswitch only has access to ports of the
controller.
However, there is no way to describe which controller (mainly which
external host controller) an eswitch devlink port belongs to.
This problem is more prevalent when port attributes such as PF and VF
numbers overlap between multiple controllers of the same eswitch.
Due to this, for a specific switch_id, a unique phys_port_name cannot
be constructed for such devlink ports.

This short series overcomes this limitation by defining two new
attributes.
(a) external: Indicates if port belongs to external controller
(b) controller number: Indicates a controller number of the port

Based on this a unique phys_port_name is prepared using controller
number.

phys_port_name construction using unique controller number is only
applicable to external controller ports. This ensures that for
non smartnic usecases where there is no external controller,
phys_port_name stays same as before.

Patch summary:
Patch-1 Added mlx5 driver to read controller number
Patch-2 Adds the missing comment for the port attributes
Patch-3 Move structure comments away from structure fields
Patch-4 external attribute added for PCI port flavours
Patch-5 Add controller number
Patch-6 Use controller number to build phys_port_name

---
Changelog:
v2->v3:
- Updated diagram to get rid of controller 'A' and 'B'
- Kept ports of single controller together in diagram
- Updated diagram for pf1's VF and SF and its ports
v1->v2:
- Added text diagram of multiple controllers
- Updated example for a VF
- Addressed comments from Jiri and Jakub
- Moved controller number attribute to PCI port flavours
   This enables a better, hierarchical view with the controller and its
    PF, VF numbers
- Split 'external' and 'controller number' attributes as two
   different attributes
- Merged mlx5_core driver to avoid compilation break
====================

Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

devlink: Use controller while building phys_port_name

Now that the controller number attribute is available, use it when
building phys_port_name for external controller ports.

An example devlink port and representor netdev name, including the
controller annotation, for an external controller with controller
number = 1, for VF 1 of PF 0:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0c1pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Controller number annotation is skipped for non external controllers to
maintain backward compatibility.
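
The naming rule can be sketched like this for a PCI VF port. The
function name and signature are hypothetical; the "c1pf0vf1" format is
inferred from the netdev name suffix in the example output above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build a phys_port_name for a PCI VF port.
 * Returns the name length, or -1 if the buffer is too small.
 */
int example_pcivf_port_name(char *buf, size_t len, bool external,
			    unsigned int controller,
			    unsigned int pf, unsigned int vf)
{
	int n = 0;
	int m;

	/* Only external controller ports get the "c<N>" prefix. */
	if (external) {
		n = snprintf(buf, len, "c%u", controller);
		if (n < 0 || (size_t)n >= len)
			return -1;
	}

	m = snprintf(buf + n, len - (size_t)n, "pf%uvf%u", pf, vf);
	if (m < 0 || (size_t)m >= len - (size_t)n)
		return -1;

	return n + m;
}
```

For a local (non external) port the result is the same "pf0vf1" style
name as before, which is what keeps existing names stable.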

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

devlink: Introduce controller number

A devlink port may be for a controller consisting of a PCI device.
A devlink instance holds ports of two types of controllers.
(1) controller discovered on same system where eswitch resides
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external host system.
This is the case where a controller is located in one system and its
devlink eswitch ports are located in a different system.

When a devlink eswitch instance serves the devlink ports of both
controllers together, PCI PF/VF numbers may overlap.
Due to this a unique phys_port_name cannot be constructed.

For example, in the system below, controller-0 and controller-1 each have
a PCI PF pf0 whose eswitch ports can be present in controller-0.
This results in a phys_port_name of "pf0" for both.
A similar problem exists for VFs and upcoming sub functions.

An example view of two controller systems:

             ---------------------------------------------------------
             |                                                       |
             |           --------- ---------         ------- ------- |
-----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
| server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
| pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
| connect |  | -------                       -------                 |
-----------  |     | controller_num=1 (no eswitch)                   |
             ------|--------------------------------------------------
             (internal wire)
                   |
             ---------------------------------------------------------
             | devlink eswitch ports and reps                        |
             | ----------------------------------------------------- |
             | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
             | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
             | ----------------------------------------------------- |
             | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
             | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
             | ----------------------------------------------------- |
             |                                                       |
             |                                                       |
             |           --------- ---------         ------- ------- |
             |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
             | -------   ----/---- ---/----- ------- ---/--- ---/--- |
             | | pf0 |______/________/       | pf1 |___/_______/     |
             | -------                       -------                 |
             |                                                       |
             |  local controller_num=0 (eswitch)                     |
             ---------------------------------------------------------

An example devlink port for external controller with controller
number = 1 for a VF 1 of PF 0:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

devlink: Introduce external controller flag

A devlink eswitch port may represent PCI PF/VF ports of a controller.

A controller is either located on the same system, or it is an external
controller located in the host where such a NIC is plugged in.

Add the ability for driver to specify if a port is for external
controller.

Use such flag in the mlx5_core driver.

An example of a port for an external controller, where VF1 of PF0
belongs to controller 1:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00
$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0pf0vf1",
            "flavour": "pcivf",
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>