When running short on descriptors, only stop the queue for the netdev that
tx was attempted for. By the time something tries to send on the other
netdev, the ring might have some more room already.
Felix Fietkau [Fri, 23 Apr 2021 05:20:58 +0000 (22:20 -0700)]
net: ethernet: mtk_eth_soc: reduce MDIO bus access latency
usleep_range often ends up sleeping much longer than the 10-20us provided
as a range here. This causes significant latency in MDIO bus accesses,
which easily adds multiple seconds to the boot time on MT7621 when polling
DSA slave ports.
Use cond_resched instead of usleep_range, since the MDIO access does not
take much time.
In case build_skb fails, call skb_free_frag on the correct pointer. Also
update the DMA structures with the new mapping before exiting, because
the mapping was successful.
Felix Fietkau [Fri, 23 Apr 2021 05:20:55 +0000 (22:20 -0700)]
net: ethernet: mtk_eth_soc: unmap RX data before calling build_skb
Since build_skb accesses the data area (for initializing shinfo), the DMA
unmap needs to happen before that call.
Signed-off-by: Felix Fietkau <[email protected]>
[Ilya: split build_skb cleanup fix into a separate commit]
Signed-off-by: Ilya Lipnitskiy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Dexuan Cui [Thu, 22 Apr 2021 20:08:16 +0000 (13:08 -0700)]
net: mana: Use int to check the return value of mana_gd_poll_cq()
mana_gd_poll_cq() may return -1 if an overflow error is detected (this
should never happen unless there is a bug in the driver or the hardware).
Fix the type of the variable "comp_read" by using int rather than u32.
Reported-by: Dan Carpenter <[email protected]>
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Dexuan Cui <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
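The effect of the variable type described above can be shown with a small
userspace sketch. poll_cq_sketch() is a hypothetical stand-in for
mana_gd_poll_cq(), not the driver function, and the helper names are
invented for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for mana_gd_poll_cq(): returns the number of
 * completions read, or -1 if an overflow is detected. */
static int poll_cq_sketch(int available, int overflow)
{
	return overflow ? -1 : available;
}

/* Old pattern: storing the result in a u32 turns -1 into 0xFFFFFFFF,
 * so it looks like a huge completion count instead of an error. */
static uint32_t comp_read_as_u32(int available, int overflow)
{
	uint32_t comp_read = (uint32_t)poll_cq_sketch(available, overflow);

	return comp_read;
}

/* Fixed pattern: a signed int keeps the error value visible. */
static int poll_failed(int available, int overflow)
{
	int comp_read = poll_cq_sketch(available, overflow);

	return comp_read < 0;
}
```

With the signed variable, a plain "comp_read < 0" check is enough to catch
the overflow case.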
net: sock: remove the unnecessary check in proto_register
tw_prot_cleanup will check the twsk_prot.
Fixes: 0f5907af3913 ("net: Fix potential memory leak in proto_register()")
Cc: Miaohe Lin <[email protected]>
Signed-off-by: Tonghao Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
This patch series adds a setting for HW descriptor prefetch for DWMAC
version 5.20 onwards. For Intel platforms, the capability is enabled by
default.
====================
stmmac: intel: Enable HW descriptor prefetch by default
Enable HW descriptor prefetch by default by setting plat->dma_cfg->dche =
true in intel_mgbe_common_data(). Note that this capability is only
supported in DWMAC core version 5.20 onwards. stmmac checks the core
version; if the core version is below 5.20, this capability is not
configured.
Below is the iperf result comparison between HW descriptor prefetch
disabled (DCHE=0b) and enabled (DCHE=1b). Tested on an Intel Elkhartlake
platform with DWMAC Core 5.20. A line rate performance improvement was
observed with HW descriptor prefetch enabled.
DWMAC Core 5.20 onwards supports HW descriptor prefetching.
Additionally, it also depends on platform specific RTL configuration.
This capability could be enabled by setting DMA_Mode bit-19 (DCHE).
So, to enable this capability, the platform must set plat->dma_cfg->dche =
true and the DWMAC core version must be 5.20 or later. Otherwise, this
capability is not configured.
net/mlx4: Treat VFs fair when handling comm_channel_events
Handling comm_channel_event in mlx4_master_comm_channel uses a double
loop to determine which slaves have requested work. The search is
always started at the lowest slave. This leads to unfairness; lower VFs
tend to be prioritized over higher VFs.
The patch uses find_next_bit to determine which slaves to handle.
Fairness is implemented by always starting at the next to the last
start.
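The rotating start described above can be sketched in plain C as follows.
This is only an illustration under assumptions: find_next_bit_sketch() is a
simplified single-word stand-in for the kernel's find_next_bit(), and
NUM_SLAVES and next_slave() are hypothetical names, not taken from mlx4.

```c
#define NUM_SLAVES 32u /* hypothetical count, chosen to fit one word */

/* Simplified stand-in for find_next_bit(): index of the first set bit at
 * or after 'start', or 'size' if there is none. */
static unsigned int find_next_bit_sketch(unsigned long map, unsigned int size,
					 unsigned int start)
{
	unsigned int i;

	for (i = start; i < size; i++)
		if (map & (1UL << i))
			return i;
	return size;
}

/* Pick the next slave to handle, beginning one past the previous start
 * position. Returns NUM_SLAVES if no slave has pending work. */
static unsigned int next_slave(unsigned long pending, unsigned int last_start)
{
	unsigned int start = (last_start + 1u) % NUM_SLAVES;
	unsigned int bit = find_next_bit_sketch(pending, NUM_SLAVES, start);

	if (bit < NUM_SLAVES)
		return bit;
	/* Nothing at or after 'start': wrap and search the lower part. */
	bit = find_next_bit_sketch(pending, start, 0);
	return bit < start ? bit : NUM_SLAVES;
}
```

Because the start position moves on every pass, a busy low-numbered slave
can no longer keep the higher-numbered ones waiting indefinitely.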
An MPI program has been used to measure improvements. It runs 500
ibv_reg_mr, synchronizes with all other instances and then runs 500
ibv_dereg_mr.
Results when running 500 processes; the reported time is for 500
calls:
Hayes Wang [Thu, 22 Apr 2021 08:48:02 +0000 (16:48 +0800)]
r8152: replace return with break for ram code speedup mode timeout
When the timeout occurs, we still have to run the subsequent steps
to release the patch request. Otherwise, the PHY would stay without a
link. Therefore, use break to stop the firmware loading loop and
release the patch request rather than returning from the function directly.
Fixes: 4a51b0e8a014 ("r8152: support PHY firmware for RTL8156 series")
Signed-off-by: Hayes Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Thu, 22 Apr 2021 20:57:21 +0000 (13:57 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-04-22
This series contains updates to virtchnl header file, ice, and iavf
drivers.
Vignesh adds support to warn about potentially malicious VFs; those that
are overflowing the mailbox for the ice driver.
Michal adds support for an allowlist/denylist of VF commands based on
supported capabilities for the ice driver.
Brett adds support for iavf UDP segmentation offload by adding the
capability bit to virtchnl, advertising support in the ice driver, and
enabling it in the iavf driver. He also adds a helper function for
getting the VF VSI for ice.
Colin Ian King removes an unneeded pointer assignment.
Qi enables support in the ice driver to support virtchnl requests from
the iavf to configure its own RSS input set. This includes adding new
capability bits, structures, and commands to virtchnl header file.
Haiyue enables configuring RSS flow hash via ethtool to support TCP, UDP
and SCTP protocols in iavf.
====================
There are a few warnings about empty debug macros in this driver:
drivers/net/ethernet/neterion/vxge/vxge-main.c: In function 'vxge_probe':
drivers/net/ethernet/neterion/vxge/vxge-main.c:4480:76: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
4480 | "Failed in enabling SRIOV mode: %d\n", ret);
Change them to proper 'do { } while (0)' expressions to make the
code a little more robust and avoid the warnings.
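A minimal sketch of the macro pattern mentioned above. sketch_debug() and
report_status() are hypothetical names, not the vxge macros; the point is
only that an empty 'do { } while (0)' body still forms a complete statement
after an 'if'.

```c
#include <stdio.h>

/* Hypothetical debug macro. When debugging is compiled out it expands to
 * an empty statement instead of to nothing. */
#ifdef SKETCH_DEBUG
#define sketch_debug(fmt, ...) fprintf(stderr, fmt, __VA_ARGS__)
#else
#define sketch_debug(fmt, ...) do { } while (0)
#endif

static int report_status(int ret)
{
	/* With an empty macro this would read "if (ret) ;" and trigger
	 * -Wempty-body; the do/while form avoids that. */
	if (ret)
		sketch_debug("Failed in enabling SRIOV mode: %d\n", ret);
	return ret;
}
```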
net: wwan: core: Return poll error in case of port removal
Ensure that the poll system call returns proper error flags when the port
is removed (nullified port ops), allowing user space to fail properly,
without further reads or writes.
netdevsim: Only use sampling truncation length when valid
When the sampling truncation length is invalid (zero), pass the length
of the packet. Without the fix, no payload is reported to user space
when the truncation length is zero.
Fixes: a8700c3dd0a4 ("netdevsim: Add dummy psample implementation")
Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
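The length selection described above could look roughly like this.
sample_len() is a hypothetical helper, and clamping a non-zero truncation
length to the packet length is an assumption made for the sketch, not
something stated in the commit message.

```c
#include <stddef.h>

/* Return the number of payload bytes to report for a sampled packet. */
static size_t sample_len(size_t trunc_size, size_t pkt_len)
{
	if (trunc_size == 0) /* invalid: fall back to the full packet */
		return pkt_len;
	/* Assumption: never report more than the packet actually holds. */
	return trunc_size < pkt_len ? trunc_size : pkt_len;
}
```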
The problem is that the enetc Makefile is not actually used for
the ierb module if that is the only built-in driver in there
and everything else is a loadable module.
Fix it by always entering the directory this time, regardless
of which symbols are configured. This should reliably fix the
problem and prevent it from coming back another time.
Fixes: 112463ddbe82 ("net: dsa: felix: fix link error")
Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The MANA driver causes a build failure in some configurations when
it selects an unavailable symbol:
WARNING: unmet direct dependencies detected for PCI_HYPERV
Depends on [n]: PCI [=y] && X86_64 [=y] && HYPERV [=n] && PCI_MSI [=y] && PCI_MSI_IRQ_DOMAIN [=y] && SYSFS [=y]
Selected by [y]:
- MICROSOFT_MANA [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROSOFT [=y] && PCI_MSI [=y] && X86_64 [=y]
drivers/pci/controller/pci-hyperv.c: In function 'hv_irq_unmask':
drivers/pci/controller/pci-hyperv.c:1217:9: error: implicit declaration of function 'hv_set_msi_entry_from_desc' [-Werror=implicit-function-declaration]
1217 | hv_set_msi_entry_from_desc(&params->int_entry.msi_entry, msi_desc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
A PCI driver should never depend on a particular host bridge
implementation in the first place, but if we have this dependency
it's better to express it as a 'depends on' rather than 'select'.
Currently, the RSS hash input is not available to AVF via ethtool; it is
set by the PF directly.
Add RSS configuration support for AVF through a new virtchnl message, and
define the capability flag VIRTCHNL_VF_OFFLOAD_ADV_RSS_PF to query this
new RSS offload support.
Brett Creeley [Tue, 2 Mar 2021 18:15:39 +0000 (10:15 -0800)]
ice: Add helper function to get the VF's VSI
Currently, the driver gets the VF's VSI by using a long string of
dereferences (i.e. vf->pf->vsi[vf->lan_vsi_idx]). If the method to get
the VF's VSI were to change, the driver would have to change it in every
location. Fix this by adding the helper ice_get_vf_vsi().
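The idea of the helper can be sketched with heavily reduced structures. All
sketch_* types and functions below are hypothetical and only mirror the
dereference chain quoted above; they are not the ice driver definitions.

```c
#include <stddef.h>

/* Reduced, hypothetical versions of the driver structures. */
struct sketch_vsi { int id; };
struct sketch_pf { struct sketch_vsi *vsi[4]; };
struct sketch_vf { struct sketch_pf *pf; unsigned int lan_vsi_idx; };

/* Single place that knows how to get from a VF to its VSI. */
static struct sketch_vsi *sketch_get_vf_vsi(struct sketch_vf *vf)
{
	return vf->pf->vsi[vf->lan_vsi_idx];
}

/* Small demonstration: build one PF with one VSI and look it up. */
static int demo_lookup(void)
{
	struct sketch_vsi vsi = { .id = 7 };
	struct sketch_pf pf = { .vsi = { NULL, &vsi, NULL, NULL } };
	struct sketch_vf vf = { .pf = &pf, .lan_vsi_idx = 1 };

	return sketch_get_vf_vsi(&vf)->id;
}
```

If the lookup method changes later, only the helper body needs to be
touched.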
As the hardware is capable of supporting UDP segmentation offload, add a
capability bit to virtchnl.h to communicate this and have the driver
advertise its support.
Declare a bitmap of allowed commands per VF. Initialize the default
list of opcodes that should always be supported. Declare an array of
supported opcodes for each capability used in the virtchnl code.
Change the allowed bitmap by setting the corresponding bit to allowlist
an opcode (bit set) or clearing it to denylist it (bit clear).
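A minimal sketch of such a per-VF bitmap, assuming a small invented opcode
set. The enum values, structure, and function names are hypothetical and do
not correspond to the real virtchnl opcodes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum sketch_opcode {
	OP_VERSION,
	OP_RESET_VF,
	OP_GET_VF_RESOURCES,
	OP_CONFIG_RSS_KEY,
	OP_COUNT
};

struct sketch_vf { uint32_t opcodes_allowlist; };

static void sketch_allow(struct sketch_vf *vf, enum sketch_opcode op)
{
	vf->opcodes_allowlist |= UINT32_C(1) << op; /* bit set: allowed */
}

static void sketch_deny(struct sketch_vf *vf, enum sketch_opcode op)
{
	vf->opcodes_allowlist &= ~(UINT32_C(1) << op); /* bit clear: denied */
}

static bool sketch_is_allowed(const struct sketch_vf *vf, enum sketch_opcode op)
{
	return (vf->opcodes_allowlist >> op) & 1u;
}

/* Opcodes that are always supported, regardless of capabilities. */
static void sketch_set_defaults(struct sketch_vf *vf)
{
	vf->opcodes_allowlist = 0;
	sketch_allow(vf, OP_VERSION);
	sketch_allow(vf, OP_RESET_VF);
	sketch_allow(vf, OP_GET_VF_RESOURCES);
}

/* Demonstration: the RSS opcode is only allowed with the RSS capability. */
static bool demo_rss_allowed(bool has_rss_cap)
{
	struct sketch_vf vf;

	sketch_set_defaults(&vf);
	if (has_rss_cap)
		sketch_allow(&vf, OP_CONFIG_RSS_KEY);
	else
		sketch_deny(&vf, OP_CONFIG_RSS_KEY);
	return sketch_is_allowed(&vf, OP_CONFIG_RSS_KEY);
}
```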
Vignesh Sridhar [Tue, 2 Mar 2021 18:12:00 +0000 (10:12 -0800)]
ice: warn about potentially malicious VFs
Attempt to detect malicious VFs and, if suspected, log the information but
keep going to allow the user to take any desired actions.
Potentially malicious VFs are identified by checking if the VFs are
transmitting too many messages via the PF-VF mailbox which could cause an
overflow of this channel resulting in denial of service. This is done by
creating a snapshot or static capture of the mailbox buffer which can be
traversed and in which the messages sent by VFs are tracked.
Vladimir Oltean [Wed, 21 Apr 2021 18:44:20 +0000 (21:44 +0300)]
net: bridge: fix error in br_multicast_add_port when CONFIG_NET_SWITCHDEV=n
When CONFIG_NET_SWITCHDEV is disabled, the shim for switchdev_port_attr_set
inside br_mc_disabled_update returns -EOPNOTSUPP. This is not caught,
and propagated to the caller of br_multicast_add_port, preventing ports
from joining the bridge.
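The failure mode described above can be reproduced in miniature. The text
here does not spell out the fix, so treating -EOPNOTSUPP as "nothing to
offload" in the fixed variant is an assumption; all function names are
hypothetical.

```c
#include <errno.h>

/* Hypothetical shim: what a compiled-out subsystem stub returns. */
static int attr_set_shim(void)
{
	return -EOPNOTSUPP;
}

/* Problematic pattern: the shim's error is passed straight up, so the
 * caller sees a failure although nothing actually went wrong. */
static int mc_update_propagating(void)
{
	return attr_set_shim();
}

/* Assumed handling: "not supported" just means there is nothing to
 * offload, so it is not treated as a failure. */
static int mc_update_tolerant(void)
{
	int err = attr_set_shim();

	if (err == -EOPNOTSUPP)
		err = 0;
	return err;
}
```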
Adam Ford [Wed, 21 Apr 2021 14:05:05 +0000 (09:05 -0500)]
net: ethernet: ravb: Fix release of refclk
The call to clk_disable_unprepare() can happen before priv is
initialized. This means moving clk_disable_unprepare out of
out_release into a new label.
Fixes: 8ef7adc6beb2 ("net: ethernet: ravb: Enable optional refclk")
Signed-off-by: Adam Ford <[email protected]>
Reviewed-by: Sergei Shtylyov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
net: dsa: fix bridge support for drivers without port_bridge_flags callback
Starting with patch: a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags")
drivers without "port_bridge_flags" callback will fail to join the bridge.
Looking at the code, -EOPNOTSUPP seems to be the proper return value,
which makes at least microchip and atheros switches work again.
Fixes: 5961d6a12c13 ("net: dsa: inherit the actual bridge port flags at join time")
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Dan Carpenter [Wed, 21 Apr 2021 13:22:50 +0000 (16:22 +0300)]
stmmac: intel: unlock on error path in intel_crosststamp()
We recently added some new locking to this function but one error path
was overlooked. We need to drop the lock before returning.
Fixes: f4da56529da6 ("net: stmmac: Add support for external trigger timestamping")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Wong Vee Khee <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
net: dsa: mv88e6xxx: Fix off-by-one in VTU devlink region size
In the unlikely event of the VTU being loaded to the brim with 4k
entries, the last one was placed in the buffer, but the size reported
to devlink was off-by-one. Make sure that the final entry is available
to the caller.
Fixes: ca4d632aef03 ("net: dsa: mv88e6xxx: Export VTU as devlink region")
Signed-off-by: Tobias Waldekranz <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
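The off-by-one above is the usual "highest index versus number of entries"
mix-up. The sketch below assumes that the size was derived from the highest
VLAN ID (4095) rather than from the entry count (4096); the entry layout
and all names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_MAX_VID 4095u /* highest VLAN ID; IDs run from 0 to 4095 */

/* Hypothetical entry layout, for illustration only. */
struct sketch_vtu_entry { unsigned short fid; unsigned short vid; };

/* Assumed old calculation: one entry short. */
static size_t vtu_region_size_old(void)
{
	return SKETCH_MAX_VID * sizeof(struct sketch_vtu_entry);
}

/* Room for every VID from 0 up to and including SKETCH_MAX_VID. */
static size_t vtu_region_size_new(void)
{
	return (SKETCH_MAX_VID + 1u) * sizeof(struct sketch_vtu_entry);
}

static size_t vtu_entries(size_t region_size)
{
	return region_size / sizeof(struct sketch_vtu_entry);
}
```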
David S. Miller [Wed, 21 Apr 2021 17:23:17 +0000 (10:23 -0700)]
Merge branch 'octeontx2-af-cn10k'
Srujana Challa says:
====================
Add support for CN10K CPT block
OcteonTX3 (CN10K) silicon is a Marvell next-gen silicon. CN10K CPT
introduces new features like reassembly support and some feature
enhancements.
This patchset adds new mailbox messages and some minor changes to
existing mailbox messages to support CN10K CPT.
octeontx2-af: cn10k: Add mailbox to configure reassembly timeout
CN10K CPT coprocessor includes a component named RXC which
is responsible for reassembly of inner IP packets. RXC has
the feature to evict oldest entries based on age/threshold.
This patch adds a new mailbox to configure reassembly age
or threshold.
The mhi_wwan_rx_budget_dec function is supposed to return true if the
RX buffer budget has been successfully decremented, allowing a new RX
buffer to be queued for transfer. However, the current implementation is
broken when the RX budget is '1', in which case the budget is decremented
but false is returned, preventing one buffer from being requeued and
leading to RX buffer starvation.
Fixes: fa588eba632d ("net: Add Qcom WWAN control driver")
Signed-off-by: Loic Poulain <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
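The budget problem described above can be modelled without any locking.
The "broken" variant below is a guess at code that shows the described
symptom, not a copy of the driver; struct sketch_dev and the function names
are hypothetical.

```c
#include <stdbool.h>

struct sketch_dev { unsigned int rx_budget; };

/* Shows the described symptom: with a budget of 1 the counter is
 * decremented, yet false is returned. */
static bool budget_dec_broken(struct sketch_dev *d)
{
	bool ret = false;

	if (d->rx_budget) {
		d->rx_budget--;
		if (d->rx_budget)
			ret = true;
	}
	return ret;
}

/* Returns true exactly when a unit of budget was consumed. */
static bool budget_dec_fixed(struct sketch_dev *d)
{
	if (!d->rx_budget)
		return false;
	d->rx_budget--;
	return true;
}

/* Demonstration helper: try one decrement starting from 'budget'. */
static bool try_dec(bool fixed, unsigned int budget)
{
	struct sketch_dev d = { .rx_budget = budget };

	return fixed ? budget_dec_fixed(&d) : budget_dec_broken(&d);
}
```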
Michael Walle [Tue, 20 Apr 2021 10:29:29 +0000 (12:29 +0200)]
net: phy: at803x: fix probe error if copper page is selected
The commit c329e5afb42f ("net: phy: at803x: select correct page on
config init") selects the copper page during probe. This fails if the
copper page was already selected. In this case, the value of the copper
page (which is 1) is propagated through phy_restore_page() and is
finally returned by at803x_probe(). Fix it by using
at803x_page_write() directly.
Also, in case of an error, the regulator is not disabled, which leads to a
WARN_ON() when the probe fails. This couldn't happen before, because
at803x_parse_dt() was the last call in at803x_probe(). It is hard to
see that parse_dt() actually enables the regulator. Thus move the
regulator_enable() to the probe function and undo it in case of an
error.
Fixes: c329e5afb42f ("net: phy: at803x: select correct page on config init")
Signed-off-by: Michael Walle <[email protected]>
Reviewed-by: David Bauer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Colin Ian King [Tue, 20 Apr 2021 12:27:30 +0000 (13:27 +0100)]
net: mana: remove redundant initialization of variable err
The variable err is initialized with a value that is never read and
is updated later with a new value. The initialization is redundant
and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Eric Dumazet [Tue, 20 Apr 2021 09:43:41 +0000 (02:43 -0700)]
virtio-net: fix use-after-free in page_to_skb()
KASAN/syzbot had 4 reports, one of them being:
BUG: KASAN: slab-out-of-bounds in memcpy include/linux/fortify-string.h:191 [inline]
BUG: KASAN: slab-out-of-bounds in page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480
Read of size 12 at addr ffff888014a5f800 by task systemd-udevd/8445
Michael Walle [Tue, 20 Apr 2021 14:28:21 +0000 (16:28 +0200)]
net: enetc: automatically select IERB module
Now that enetc supports flow control we have to make sure the settings in
the IERB are correct. Therefore, we actually depend on the enetc-ierb
module. Previously it was possible that this module was disabled while the
enetc was enabled. Fix it by automatically selecting the enetc-ierb module.
Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Signed-off-by: Michael Walle <[email protected]>
Acked-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Eric Dumazet [Tue, 20 Apr 2021 20:01:44 +0000 (13:01 -0700)]
virtio-net: restrict build_skb() use to some arches
build_skb() is supposed to be followed by
skb_reserve(skb, NET_IP_ALIGN), so that IP headers are word-aligned.
(Best practice is to reserve NET_IP_ALIGN+NET_SKB_PAD, but the NET_SKB_PAD
part is only a performance optimization if tunnel encaps are added.)
Unfortunately virtio_net has not provisioned this reserve.
We can only use build_skb() for arches where NET_IP_ALIGN == 0.
Bit operation helpers such as test_bit, clear_bit, etc. take the bit
position as a parameter, not the bit value. The current usage causes a
double shift => BIT(BIT(0)). Fix that in wwan_core and mhi_wwan_ctrl.
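The double shift can be demonstrated with a tiny userspace model.
SKETCH_BIT() and set_bit_sketch() are simplified, non-atomic stand-ins for
the kernel's BIT() and set_bit(); the helper names are hypothetical.

```c
#define SKETCH_BIT(n) (1UL << (n))

/* Simplified stand-in for set_bit(): 'nr' is a bit position. */
static void set_bit_sketch(unsigned int nr, unsigned long *word)
{
	*word |= SKETCH_BIT(nr);
}

/* Return the word that results from setting bit 'nr' in an empty word. */
static unsigned long word_after_set_bit(unsigned int nr)
{
	unsigned long word = 0;

	set_bit_sketch(nr, &word);
	return word;
}
```

Passing the position 0 sets bit 0 (word 0x1), while passing the value
SKETCH_BIT(0), which is 1, sets bit 1 (word 0x2) instead.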
In addition to the mv88e6xxx support to dynamically change the
protocol, it is now possible to override the protocol from the device
tree. This means that when a board vendor finds an incompatibility,
they can specify a working protocol in the DT, and users will not have
to worry about it.
Some background information:
In a system using an NXP T1023 SoC connected to a 6390X switch, we
noticed that TO_CPU frames were not reaching the CPU. This only
happened on hardware port 8. Looking at the DSA master interface
(dpaa-ethernet) we could see that an Rx error counter was bumped at
the same rate. The logs indicated a parser error.
It just so happens that a TO_CPU coming in on device 0, port 8, will
result in the first two bytes of the DSA tag being one of:
00 40
00 44
00 46
My guess was that since these values looked like 802.3 length fields,
the controller's parser would signal an error if the frame length did
not match what was in the header.
This was later confirmed using two different workarounds provided by
Vladimir. Unfortunately these either bypass or ignore the hardware
parser and thus rob working combinations of the ability to do RSS and
other nifty things. It was therefore decided to go with the option of
a DT override.
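The length-field guess described above can be checked against the usual
Ethernet convention that a two-byte field of 1500 or less is an 802.3
length, while values of 0x0600 and above are EtherTypes. The helper below
is a hypothetical illustration and says nothing about how the controller's
parser is actually implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the two bytes, read in network order, fall into the range that
 * is interpreted as an 802.3 length rather than an EtherType. */
static bool looks_like_8023_length(uint8_t b0, uint8_t b1)
{
	uint16_t v = (uint16_t)((b0 << 8) | b1);

	return v <= 1500;
}
```

All three byte pairs listed above (00 40, 00 44, 00 46) fall into the
length range, whereas an EtherType such as 08 00 does not.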
v1 -> v2:
- Fail if the device does not support changing protocols instead of
falling back to the default. (Andrew)
- Only call change_tag_protocol on CPU ports. (Andrew/Vladimir)
- Only allow changing the protocol on chips that have at least
"undocumented" level of support for EDSA. (Andrew).
- List the supported protocols in the binding documentation. I opted
for only listing the protocols that I have tested. As more people
test their drivers, they can add them. (Rob)
v2 -> v3:
- Rename "dsa,tag-protocol" -> "dsa-tag-protocol". (Rob)
- Some cleanups to 4/5. (Vladimir)
- Add a comment detailing how tree/driver agreement on the tag
protocol is enforced. (Vladimir).
====================
The 'dsa-tag-protocol' is used to force a switch tree to use a
particular tag protocol, typically because the Ethernet controller
that it is connected to is not compatible with the default one.
net: dsa: Allow default tag protocol to be overridden from DT
Some combinations of tag protocols and Ethernet controllers are
incompatible, and it is hard for the driver to keep track of these.
Therefore, allow the device tree author (typically the board vendor)
to inform the driver of this fact by selecting an alternate protocol
that is known to work.
net: dsa: mv88e6xxx: Allow dynamic reconfiguration of tag protocol
For devices that support both regular and Ethertyped DSA tags, allow
the user to change the protocol.
Additionally, because there are ethernet controllers that do not
handle regular DSA tags in all cases, also allow the protocol to be
changed on devices with undocumented support for EDSA. But, in those
cases, make sure to log the fact that an undocumented feature has been
enabled.
net: dsa: mv88e6xxx: Mark chips with undocumented EDSA tag support
All devices are capable of using regular DSA tags. Support for
Ethertyped DSA tags falls into three categories:
1. No support. Older chips fall into this category.
2. Full support. Datasheet explicitly supports configuring the CPU
port to receive FORWARDs with a DSA tag.
3. Undocumented support. Datasheet lists the configuration from
category 2 as "reserved for future use", but does empirically
behave like a category 2 device.
So, instead of listing the one true protocol that should be used by a
particular chip, specify the level of support for EDSA (support for
regular DSA is implicit on all chips). As before, we use EDSA for all
chips that fully support it.
In upcoming changes, we will use this information to support
dynamically changing the tag protocol.
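The three support levels and the resulting choices can be modelled as
follows. The enum and function names are hypothetical and are not the
mv88e6xxx definitions; the warning rule reflects the note elsewhere in this
log that enabling an undocumented feature should be logged.

```c
#include <stdbool.h>

enum sketch_edsa_support {
	SKETCH_EDSA_NONE,         /* category 1: no support */
	SKETCH_EDSA_SUPPORTED,    /* category 2: full, documented support */
	SKETCH_EDSA_UNDOCUMENTED  /* category 3: works, but undocumented */
};

enum sketch_tag_proto { SKETCH_TAG_DSA, SKETCH_TAG_EDSA };

/* EDSA is the default only for chips that fully support it. */
static enum sketch_tag_proto sketch_default_proto(enum sketch_edsa_support s)
{
	return s == SKETCH_EDSA_SUPPORTED ? SKETCH_TAG_EDSA : SKETCH_TAG_DSA;
}

/* Regular DSA is implicit on all chips; EDSA needs some level of support. */
static bool sketch_can_use(enum sketch_edsa_support s, enum sketch_tag_proto p)
{
	return p == SKETCH_TAG_DSA || s != SKETCH_EDSA_NONE;
}

/* Switching to EDSA on a category 3 chip should be logged. */
static bool sketch_needs_warning(enum sketch_edsa_support s,
				 enum sketch_tag_proto p)
{
	return p == SKETCH_TAG_EDSA && s == SKETCH_EDSA_UNDOCUMENTED;
}
```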
David S. Miller [Tue, 20 Apr 2021 23:44:04 +0000 (16:44 -0700)]
Merge tag 'mac80211-next-for-net-next-2021-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Another set of updates, all over the map:
* set sk_pacing_shift for 802.3->802.11 encap offload
* some monitor support for 802.11->802.3 decap offload
* HE (802.11ax) spec updates
* userspace API for TDLS HE support
* along with various other small features, cleanups and
fixups
====================
Currently, mlxsw admits for offload a suitable root qdisc, and its
children. Thus up to two levels of hierarchy are offloaded. Often, this is
enough: one can configure TCs with RED and TCs with a shaper, and can even
see counters for each TC by looking at a qdisc at a sufficiently shallow
position.
While simple, the system has obvious shortcomings. It is not possible to
configure both RED and shaping on one TC. It is not possible to place a
PRIO below root TBF, which would then be offloaded as port shaper. FIFOs
are only offloaded at root or directly below, which is confusing to users,
because RED and TBF of course have their own FIFO.
This patchset is a step towards the end goal of allowing more comprehensive
qdisc tree offload and cleans up the qdisc offload code.
- Patches #1-#4 contain small cleanups.
- Up until now, since mlxsw offloaded only very simple qdisc
configurations, basically all bookkeeping was done using one container
for the root qdisc, and 8 containers for its children. Patches #5, #6, #8
and #9 gradually introduce a more dynamic structure, where parent-child
relationships are tracked directly at qdiscs, instead of being implicit.
- This tree management assumes only one qdisc is created at a time. In FIFO
handlers, this condition was enforced simply by asserting RTNL lock. But
instead of furthering this RTNL dependence, patch #7 converts the whole
qdisc offload logic to a per-port mutex.
Petr Machata [Tue, 20 Apr 2021 14:53:48 +0000 (16:53 +0200)]
selftests: mlxsw: sch_red_ets: Test proper counter cleaning in ETS
A bug introduced during the rework caused a non-zero backlog to get
stuck at ETS. Introduce a selftest that would have caught the issue
earlier.
Petr Machata [Tue, 20 Apr 2021 14:53:47 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Index future FIFOs by band number
mlxsw used to hold an array of qdiscs indexed by the TC number. In the
previous patch, it was changed to allocate child qdiscs dynamically, and
they are now indexed by band number. Follow suit with the array of future
FIFOs.
Instead of keeping qdiscs in globally-preallocated arrays, introduce a
per-qdisc-kind value num_classes, and then allocate the necessary child
qdiscs (if any) based on that value. Since now dynamic allocation is
involved, mlxsw_sp_qdisc_replace() gets messy enough that it is worth it to
split it to two cases: a new qdisc allocation and a change of existing
qdisc. (Note that the change also includes what TC formally calls replace,
if the qdisc kind is the same.)
Petr Machata [Tue, 20 Apr 2021 14:53:45 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Guard all qdisc accesses with a lock
The FIFO handler currently guards accesses to the future FIFO tracking by
asserting RTNL. In the future, the changes to the qdisc state will be more
thorough, so other qdiscs will need this guarding as well. In order
to not further the RTNL infestation, instead convert to a custom lock that
will guard accesses to the qdisc state.
Petr Machata [Tue, 20 Apr 2021 14:53:44 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Track children per qdisc
mlxsw currently allows a two-level structure of qdiscs: the root and
possibly a number of children. In order to support offloading more general
qdisc trees, introduce to struct mlxsw_sp_qdisc a pointer to child qdiscs.
Refer to the child qdiscs through this pointer, instead of going through
the tclass_qdiscs in qdisc_state. Additionally introduce a field
num_classes, which holds the number of the given qdisc's children.
Also introduce a generic function for walking qdisc trees. Rewrite
mlxsw_sp_qdisc_find() and _find_by_handle() to use the generic walker.
For now, keep the qdisc_state.tclass_qdisc, and just point root_qdiscs's
children to this array. Following patches will make the allocation dynamic.
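A generic walker over such a tree might look like the sketch below. The
structure is a hypothetical reduction that keeps only a handle, the
num_classes count, and the child pointer; the function names are invented
and the traversal order is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_qdisc {
	uint32_t handle;
	unsigned int num_classes;    /* number of children */
	struct sketch_qdisc *qdiscs; /* array of children, or NULL */
};

typedef struct sketch_qdisc *(*sketch_walk_cb)(struct sketch_qdisc *q,
						void *data);

/* Visit 'q' first, then its children; stop at the first callback hit. */
static struct sketch_qdisc *sketch_walk(struct sketch_qdisc *q,
					sketch_walk_cb cb, void *data)
{
	struct sketch_qdisc *found;
	unsigned int i;

	if (!q)
		return NULL;
	found = cb(q, data);
	if (found)
		return found;
	for (i = 0; i < q->num_classes; i++) {
		found = sketch_walk(&q->qdiscs[i], cb, data);
		if (found)
			return found;
	}
	return NULL;
}

static struct sketch_qdisc *match_handle(struct sketch_qdisc *q, void *data)
{
	return q->handle == *(const uint32_t *)data ? q : NULL;
}

/* Lookup by handle, expressed through the generic walker. */
static struct sketch_qdisc *sketch_find_by_handle(struct sketch_qdisc *root,
						  uint32_t handle)
{
	return sketch_walk(root, match_handle, &handle);
}

/* Demonstration: a root with two children. */
static int demo_find(uint32_t handle)
{
	struct sketch_qdisc kids[2] = { { .handle = 0x10 }, { .handle = 0x20 } };
	struct sketch_qdisc root = {
		.handle = 0x1, .num_classes = 2, .qdiscs = kids,
	};

	return sketch_find_by_handle(&root, handle) != NULL;
}
```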
Petr Machata [Tue, 20 Apr 2021 14:53:43 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Promote backlog reduction to mlxsw_sp_qdisc_destroy()
When a qdisc is removed, it is necessary to update the backlog value at its
parent--unless the qdisc is at root position. RED, TBF and FIFO all do
that, each separately. Since all of them need to do this, just promote the
operation directly to mlxsw_sp_qdisc_destroy(), instead of deferring it to
individual destructors. Since FIFO dtor thus becomes trivial, remove it.
Add struct mlxsw_sp_qdisc.parent to point at the parent qdisc. This will be
handy later as deeper structures are offloaded. Use the parent qdisc to
find the chain of parents whose backlog value needs to be updated.
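Walking the parent chain to reduce the backlog could be sketched as below.
The structure is reduced to a parent pointer and a backlog counter, and all
names are hypothetical; how the driver actually accounts backlog is not
shown here.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_qdisc {
	struct sketch_qdisc *parent; /* NULL at root position */
	uint64_t backlog;
};

/* Before 'q' goes away, subtract its backlog from every ancestor. */
static void sketch_reduce_parent_backlog(const struct sketch_qdisc *q)
{
	struct sketch_qdisc *p;

	for (p = q->parent; p; p = p->parent)
		p->backlog -= q->backlog;
}

/* Demonstration: root (10) -> mid (7) -> leaf (3), then destroy the leaf.
 * Returns the root backlog for level 0 and the mid backlog otherwise. */
static uint64_t demo_backlog_after_destroy(int level)
{
	struct sketch_qdisc root = { NULL, 10 };
	struct sketch_qdisc mid = { &root, 7 };
	struct sketch_qdisc leaf = { &mid, 3 };

	sketch_reduce_parent_backlog(&leaf);
	return level == 0 ? root.backlog : mid.backlog;
}
```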
Petr Machata [Tue, 20 Apr 2021 14:53:42 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Track tclass_num as int, not u8
tclass_num is just a number, a value that would be ordinarily passed around
as an int. (Which is unlike a u8 prio_bitmap.) In several places,
tclass_num already is an int. Convert the remaining instances.
Petr Machata [Tue, 20 Apr 2021 14:53:41 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Drop an always-true condition
The function mlxsw_sp_qdisc_compare() is invoked a couple lines above this
check, which will bounce any requests where this condition does not hold.
Therefore drop it.
The purpose of this function is to filter out events that are related to
qdiscs that are not offloaded, or are not offloaded anymore. But the
function is unnecessarily thorough:
- mlxsw_sp_qdisc pointer is never NULL in the context where it is called
- Two qdiscs with the same handle will never have different types. Even
when replacing one qdisc with another in the same class, Linux will not
permit handle reuse unless the qdisc type also matches.
Simplify the function by omitting these two unnecessary conditions.
Marek Behún [Tue, 20 Apr 2021 07:54:00 +0000 (09:54 +0200)]
net: phy: marvell: fix HWMON enable register for 6390
Register 27_6.15:14 has the following description in 88E6393X
documentation:
Temperature Sensor Enable
0x0 - Sample every 1s
0x1 - Sense rate decided by bits 10:8 of this register
0x2 - Use 26_6.5 (One shot Temperature Sample) to enable
0x3 - Disable
This is compatible with how the 6390 code uses this register currently,
but the 6390 code handles it as two 1-bit registers (somewhat), instead
of one register with 4 possible values.
(A newer version of the 6390 documentation removed the temperature sensor
section completely. In an older version, the above-mentioned register
is reserved, although it is R/W. Since the code works, I think we can
assume that it is correct.)
Rename this register and define all 4 values according to 6393X
documentation.
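Treating bits 15:14 as one 2-bit field with four values can be sketched as
follows. The macro and function names are hypothetical, not the names used
in the marvell PHY driver; only the bit positions and the four values come
from the description above.

```c
#include <stdint.h>

/* Bits 15:14 of the register, handled as a single 2-bit field. */
#define SKETCH_TEMP_EN_SHIFT 14
#define SKETCH_TEMP_EN_MASK  (0x3u << SKETCH_TEMP_EN_SHIFT)

enum sketch_temp_mode {
	SKETCH_TEMP_SAMPLE_1S = 0x0, /* sample every 1s */
	SKETCH_TEMP_SENSE_RATE = 0x1, /* rate decided by bits 10:8 */
	SKETCH_TEMP_ONE_SHOT = 0x2,  /* enabled via one shot sample bit */
	SKETCH_TEMP_DISABLE = 0x3    /* disabled */
};

/* Replace the field while leaving all other register bits untouched. */
static uint16_t sketch_set_temp_mode(uint16_t reg, unsigned int mode)
{
	return (uint16_t)((reg & ~SKETCH_TEMP_EN_MASK) |
			  ((mode << SKETCH_TEMP_EN_SHIFT) & SKETCH_TEMP_EN_MASK));
}

static unsigned int sketch_get_temp_mode(uint16_t reg)
{
	return (reg & SKETCH_TEMP_EN_MASK) >> SKETCH_TEMP_EN_SHIFT;
}
```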
Marek Behún [Tue, 20 Apr 2021 07:53:59 +0000 (09:53 +0200)]
net: phy: marvell: refactor HWMON OOP style
Use a structure of Marvell PHY specific HWMON methods to reduce code
duplication. Store a pointer to this structure into the PHY driver's
driver_data member.
David S. Miller [Tue, 20 Apr 2021 23:14:02 +0000 (16:14 -0700)]
Merge tag 'mlx5-updates-2021-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-04-19
This patchset provides some updates to mlx5e and mlx5 SW steering drivers:
1) Tariq and Vladyslav both provide some trivial updates to the mlx5e netdev.
The next 12 patches in the patchset are focused toward mlx5 SW steering:
2) 3 trivial cleanup patches
3) Dynamic Flex parser support:
Flex parser is a HW parser that can support protocols that are not
natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
There are 8 such parsers, and each of them can be assigned to parse a
specific set of protocols.
4) Enable matching on Geneve TLV options
5) Use Flex parser for MPLS over UDP/GRE
6) Enable matching on tunnel GTP-U and GTP-U first extension
header using
7) Improved QoS for SW steering internal QPair for a better insertion rate
====================
Xiaoliang Yang [Mon, 19 Apr 2021 10:25:30 +0000 (18:25 +0800)]
net: dsa: felix: disable always guard band bit for TAS config
The ALWAYS_GUARD_BAND_SCH_Q bit in the TAS config register is described
as follows:
0: Guard band is implemented for nonschedule queues to schedule
queues transition.
1: Guard band is implemented for any queue to schedule queue
transition.
Previously, the driver set the guard band to be implemented for any queue
to schedule queue transition, which makes each GCL time slot reserve a
guard band time that can pass the max SDU frame. Because the guard band
time cannot currently be set in tc-taprio, it uses about 12000ns to pass
the 1500B max SDU. This limits each GCL time interval to more than 12000ns.
This patch changes the guard band to be implemented only for nonschedule
queues to schedule queues transitions, so that there is no need to reserve
a guard band on each GCL. Users can manually add guard band time for each
schedule queue in their configuration if they want.
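The 12000ns figure above is consistent with the time needed to serialize a
1500 byte frame at 1 Gbit/s. The link speed is an inference, not stated in
the commit message, and the sketch ignores preamble, headers, and
inter-frame gap; tx_time_ns() is a hypothetical helper.

```c
#include <stdint.h>

/* Time in nanoseconds to put 'frame_bytes' on the wire at 'link_mbps'.
 * bits / (Mbit/s) gives microseconds, hence the factor of 1000. */
static uint64_t tx_time_ns(uint64_t frame_bytes, uint64_t link_mbps)
{
	return frame_bytes * 8u * 1000u / link_mbps;
}
```

For example, tx_time_ns(1500, 1000) gives 12000, and the same frame at
100 Mbit/s would need ten times as long.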
David S. Miller [Tue, 20 Apr 2021 23:08:02 +0000 (16:08 -0700)]
Merge branch 'net-generic-selftest-support'
Oleksij Rempel says:
====================
provide generic net selftest support
changes v3:
- make more granular tests
- enable loopback for all PHYs by default
- fix allmodconfig build errors
- poll for link status update after switching to the loopback mode
changes v2:
- make generic selftests available for all networking devices.
- make use of net_selftest* on FEC, ag71xx and all DSA switches.
- add loopback support on more PHYs.
This patch set provides diagnostic capabilities for some iMX, ag71xx or
any DSA based devices. For proper functionality, PHY loopback support is
needed.
So far there is only initial infrastructure with basic tests.
====================
net: dsa: enable selftest support for all switches by default
Most of the generic selftests should work with probably all Ethernet
controllers. The DSA switches are no exception, so enable the selftests by
default at least for DSA.
Port some parts of the stmmac selftest and reuse them as a basic generic
selftest library. This patch was tested with the following combinations:
- iMX6DL FEC -> AT8035
- iMX6DL FEC -> SJA1105Q switch -> KSZ8081
- iMX6DL FEC -> SJA1105Q switch -> KSZ9031
- AR9331 ag71xx -> AR9331 PHY
- AR9331 ag71xx -> AR9331 switch -> AR9331 PHY
net: phy: genphy_loopback: add link speed configuration
In case of loopback, in most cases we need to disable autoneg support
and force some speed configuration. Otherwise, depending on currently
active auto negotiated link speed, the loopback may or may not work.
This patch was tested with the following PHYs: TJA1102, KSZ8081, KSZ9031,
AT8035, AR9331.
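Forcing a fixed speed with autonegotiation disabled comes down to a few
bits in the basic mode control register (register 0 in IEEE 802.3 Clause
22). The bit values below follow that register layout; loopback_bmcr() is a
hypothetical helper and not the genphy_loopback() implementation.

```c
#include <stdint.h>

/* Basic mode control register bits (IEEE 802.3 Clause 22, register 0). */
#define BMCR_SPEED1000 0x0040 /* speed select, MSB */
#define BMCR_FULLDPLX  0x0100 /* full duplex */
#define BMCR_ANENABLE  0x1000 /* autonegotiation enable */
#define BMCR_SPEED100  0x2000 /* speed select, LSB */
#define BMCR_LOOPBACK  0x4000 /* loopback mode */

/* Build a control value with loopback on, autoneg off, and a forced
 * full-duplex speed. Any speed other than 1000 or 100 selects 10 Mbit/s. */
static uint16_t loopback_bmcr(uint16_t bmcr, int speed_mbps)
{
	bmcr = (uint16_t)(bmcr & ~(BMCR_ANENABLE | BMCR_SPEED1000 | BMCR_SPEED100));
	bmcr = (uint16_t)(bmcr | BMCR_LOOPBACK | BMCR_FULLDPLX);
	if (speed_mbps == 1000)
		bmcr = (uint16_t)(bmcr | BMCR_SPEED1000);
	else if (speed_mbps == 100)
		bmcr = (uint16_t)(bmcr | BMCR_SPEED100);
	return bmcr;
}
```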
net: phy: execute genphy_loopback() per default on all PHYs
The generic loopback is really generic and is defined by the 802.3
standard; we should just mandate that drivers implement a custom
loopback if the generic one cannot work.
When using SW steering, the rule insertion rate depends on the performance
of the RDMA RC QP used for writing to the ICM. During stress this QP
competes for HW resources with all the other QPs that are used to send data.
To protect the SW steering QP's performance in such cases, we set this QP
to use an isolated VL. The VL number is reserved by FW and is not exposed
to the driver.
Support for this QP on isolated VL exists only when both force-loopback and
isolate_vl_tc capabilities are set.
When supported by the device, SW steering RoCE RC QP that is used to
write/read to/from ICM will be created with force-loopback attribute.
Such QP doesn't require GID index upon creation.
Flex parser is a HW parser that can support protocols that are not
natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
There are 8 such parsers, and each of them can be assigned to parse a
specific set of protocols.
This patch adds misc4 match params, which allow using the correct flex
parser that was programmed for the required protocol.