Git Repo - linux.git/log

octeontx2-af: Manage new blocks in 98xx

AF manages the tasks of allocating, freeing
LFs from RVU blocks to PF and VFs. With new
NIX1 and CPT1 blocks in 98xx, this patch
adds support for handling new blocks too.

Co-developed-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Rakesh Babu <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

octeontx2-af: Update get/set resource count functions

Since multiple blocks of same type are present in
98xx, modify functions which get resource count and
which update resource count to work with individual
block address instead of block type.

Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Signed-off-by: Rakesh Babu <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: axienet: Properly handle PCS/PMA PHY for 1000BaseX mode

Update the axienet driver to properly support the Xilinx PCS/PMA PHY
component which is used for 1000BaseX and SGMII modes, including
properly configuring the auto-negotiation mode of the PHY and reading
the negotiated state from the PHY.

Signed-off-by: Robert Hancock <[email protected]>
Reviewed-by: Radhey Shyam Pandey <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipa: avoid a bogus warning

The previous commit added support for IPA having up to six source
and destination resources.  But currently nothing uses more than
four.  (Five of each are used in a newer version of the hardware.)

I find that in one of my build environments the compiler complains
about newly-added code in two spots.  Inspection shows that the
warnings have no merit, but this compiler does not recognize that.

    ipa_main.c:457:39: warning: array index 5 is past the end of the
        array (which contains 4 elements) [-Warray-bounds]
    (and the same warning at line 483)

We can make this warning go away by changing the number of elements
in the source and destination resource limit arrays--now rather than
waiting until we need it to support the newer hardware.  This change
was coming soon anyway; make it now to get rid of the warning.

Signed-off-by: Alex Elder <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'net-add-functionality-to-net-core-byte-packet-counters-and-use-it-in-r8169'

Heiner Kallweit says:

====================
net: add functionality to net core byte/packet counters and use it in r8169

This series adds missing functionality to the net core handling of
byte/packet counters and statistics. The extensions are then used
to remove private rx/tx byte/packet counters in r8169 driver.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

r8169: remove no longer needed private rx/tx packet/byte counters

After switching to the net core rx/tx byte/packet counters we can
remove the now unused private version.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

r8169: use struct pcpu_sw_netstats for rx/tx packet/byte counters

Switch to the net core rx/tx byte/packet counter infrastructure.
This simplifies the code, only small drawback is some memory overhead
because we use just one queue, but allocate the counters per cpu.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: core: add devm_netdev_alloc_pcpu_stats

We have netdev_alloc_pcpu_stats(), and we have devm_alloc_percpu().
Add a managed version of netdev_alloc_pcpu_stats, e.g. for allocating
the per-cpu stats in the probe() callback of a driver. It needs to be
a macro for dealing properly with the type argument.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: core: add dev_sw_netstats_tx_add

Add dev_sw_netstats_tx_add(), complementing already existing
dev_sw_netstats_rx_add(). Other than dev_sw_netstats_rx_add allow to
pass the number of packets as function argument.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'in_interrupt-cleanup-part-2'

Sebastian Andrzej Siewior says:

====================
in_interrupt() cleanup, part 2

in the discussion about preempt count consistency across kernel configurations:

https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/

Linus clearly requested that code in drivers and libraries which changes
behaviour based on execution context should either be split up so that
e.g. task context invocations and BH invocations have different interfaces
or if that's not possible the context information has to be provided by the
caller which knows in which context it is executing.

This includes conditional locking, allocation mode (GFP_*) decisions and
avoidance of code paths which might sleep.

In the long run, usage of 'preemptible, in_*irq etc.' should be banned from
driver code completely.

This is part two addressing remaining drivers except for orinoco-usb.
====================

Cherry picking only Ethernet changes.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: tlan: Replace in_irq() usage

The driver uses in_irq() to determine if the tlan_priv::lock has to be
acquired in tlan_mii_read_reg() and tlan_mii_write_reg().

The interrupt handler acquires the lock outside of these functions so the
in_irq() check is meant to prevent a lock recursion deadlock. But this
check is incorrect when interrupt force threading is enabled because then
the handler runs in thread context and in_irq() correctly returns false.

The usage of in_*() in drivers is phased out and Linus clearly requested
that code which changes behaviour depending on context should either be
seperated or the context be conveyed in an argument passed by the caller,
which usually knows the context.

tlan_set_timer() has this conditional as well, but this function is only
invoked from task context or the timer callback itself. So it always has to
lock and the check can be removed.

tlan_mii_read_reg(), tlan_mii_write_reg() and tlan_phy_print() are invoked
from interrupt and other contexts.

Split out the actual function body into helper variants which are called
from interrupt context and make the original functions wrappers which
acquire tlan_priv::lock unconditionally.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Cc: Samuel Chessman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: forcedeth: Replace context and lock check with a lockdep_assert()

nv_update_stats() triggers a WARN_ON() when invoked from hard interrupt
context because the locks in use are not hard interrupt safe. It also has
an assert_spin_locked() which was the lock check before the lockdep era.

Lockdep has way broader locking correctness checks and covers both issues,
so replace the warning and the lock assert with lockdep_assert_held().

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Cc: Rain River <[email protected]>
Cc: Zhu Yanjun <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: neterion: s2io: Replace in_interrupt() for context detection

wait_for_cmd_complete() uses in_interrupt() to detect whether it is safe to
sleep or not.

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be seperated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

in_interrupt() also is only partially correct because it fails to chose the
correct code path when just preemption or interrupts are disabled.

Add an argument 'may_block' to both functions and adjust the callers to
pass the context information.

The following call chains which end up invoking wait_for_cmd_complete()
were analyzed to be safe to sleep:

s2io_card_up()
   s2io_set_multicast()

init_nic()
   init_tti()

s2io_close()
   do_s2io_delete_unicast_mc()
     do_s2io_add_mac()

s2io_set_mac_addr()
   do_s2io_prog_unicast()
     do_s2io_add_mac()

s2io_reset()
   do_s2io_restore_unicast_mc()
     do_s2io_add_mc()
       do_s2io_add_mac()

s2io_open()
   do_s2io_prog_unicast()
     do_s2io_add_mac()

The following call chains which end up invoking wait_for_cmd_complete()
were analyzed to be safe to sleep:

__dev_set_rx_mode()
    s2io_set_multicast()

s2io_txpic_intr_handle()
   s2io_link()
     init_tti()

Add a may_sleep argument to wait_for_cmd_complete(), s2io_set_multicast()
and init_tti() and hand the context information in from the call sites.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Cc: Jon Mason <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

netfilter: ipset: Expose the initval hash parameter to userspace

It makes possible to reproduce exactly the same set after a save/restore.

Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: ipset: Add bucketsize parameter to all hash types

The parameter defines the upper limit in any hash bucket at adding new entries
from userspace - if the limit would be exceeded, ipset doubles the hash size
and rehashes. It means the set may consume more memory but gives faster
evaluation at matching in the set.

Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: ipset: Support the -exist flag with the destroy command

The -exist flag was supported with the create, add and delete commands.
In order to gracefully handle the destroy command with nonexistent sets,
the -exist flag is added to destroy too.

Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nft_reject: add reject verdict support for netdev

Adds support for reject from ingress hook in netdev family.
Both stacks ipv4 and ipv6. With reject packets supporting ICMP
and TCP RST.

This ability is required in devices that need to REJECT legitimate
clients which traffic is forwarded from the ingress hook.

Joint work with Laura Garcia.

Signed-off-by: Jose M. Guisado Gomez <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nft_reject: unify reject init and dump into nft_reject

Bridge family is using the same static init and dump function as inet.

This patch removes duplicate code unifying these functions body into
nft_reject.c so they can be reused in the rest of families supporting
reject verdict.

Signed-off-by: Jose M. Guisado Gomez <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nf_reject: add reject skbuff creation helpers

Adds reject skbuff creation helper functions to ipv4/6 nf_reject
infrastructure. Use these functions for reject verdict in bridge
family.

Can be reused by all different families that support reject and
will not inject the reject packet through ip local out.

Signed-off-by: Jose M. Guisado Gomez <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

Merge branch 'l2-multicast-forwarding-for-ocelot-switch'

Vladimir Oltean says:

====================
L2 multicast forwarding for Ocelot switch

This series enables the mscc_ocelot switch to forward raw L2 (non-IP)
mdb entries as configured by the bridge driver after this patch:

https://patchwork.ozlabs.org/project/netdev/patch/20201028233831 [email protected]/
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: mscc: ocelot: support L2 multicast entries

There is one main difference in mscc_ocelot between IP multicast and L2
multicast. With IP multicast, destination ports are encoded into the
upper bytes of the multicast MAC address. Example: to deliver the
address 01:00:5E:11:22:33 to ports 3, 8, and 9, one would need to
program the address of 00:03:08:11:22:33 into hardware. Whereas for L2
multicast, the MAC table entry points to a Port Group ID (PGID), and
that PGID contains the port mask that the packet will be forwarded to.
As to why it is this way, no clue. My guess is that not all port
combinations can be supported simultaneously with the limited number of
PGIDs, and this was somehow an issue for IP multicast but not for L2
multicast. Anyway.

Prior to this change, the raw L2 multicast code was bogus, due to the
fact that there wasn't really any way to test it using the bridge code.
There were 2 issues:
- A multicast PGID was allocated for each MDB entry, but it wasn't in
  fact programmed to hardware. It was dummy.
- In fact we don't want to reserve a multicast PGID for every single MDB
  entry. That would be odd because we can only have ~60 PGIDs, but
  thousands of MDB entries. So instead, we want to reserve a multicast
  PGID for every single port combination for multicast traffic. And
  since we can have 2 (or more) MDB entries delivered to the same port
  group (and therefore PGID), we need to reference-count the PGIDs.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: mscc: ocelot: make entry_type a member of struct ocelot_multicast

This saves a re-classification of the MDB address on deletion.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: mscc: ocelot: remove the "new" variable in ocelot_port_mdb_add

It is Not Needed, a comment will suffice.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: mscc: ocelot: use ether_addr_copy

Since a helper is available for copying Ethernet addresses, let's use it.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: mscc: ocelot: classify L2 mdb entries as LOCKED

ocelot.h says:

/* MAC table entry types.
* ENTRYTYPE_NORMAL is subject to aging.
* ENTRYTYPE_LOCKED is not subject to aging.
* ENTRYTYPE_MACv4 is not subject to aging. For IPv4 multicast.
* ENTRYTYPE_MACv6 is not subject to aging. For IPv6 multicast.
*/

We don't want the permanent entries added with 'bridge mdb' to be
subject to aging.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: bridge: explicitly convert between mdb entry state and port group flags

When creating a new multicast port group, there is implicit conversion
between the __u8 state member of struct br_mdb_entry and the unsigned
char flags member of struct net_bridge_port_group. This implicit
conversion relies on the fact that MDB_PERMANENT is equal to
MDB_PG_FLAGS_PERMANENT.

Let's be more explicit and convert the state to flags manually.

Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: bridge: mcast: add support for raw L2 multicast groups

Extend the bridge multicast control and data path to configure routes
for L2 (non-IP) multicast groups.

The uapi struct br_mdb_entry union u is extended with another variant,
mac_addr, which does not change the structure size, and which is valid
when the proto field is zero.

To be compatible with the forwarding code that is already in place,
which acts as an IGMP/MLD snooping bridge with querier capabilities, we
need to declare that for L2 MDB entries (for which there exists no such
thing as IGMP/MLD snooping/querying), that there is always a querier.
Otherwise, these entries would be flooded to all bridge ports and not
just to those that are members of the L2 multicast group.

Needless to say, only permanent L2 multicast groups can be installed on
a bridge port.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'sfc-ef100-tso-enhancements'

Edward Cree says:

====================
sfc: EF100 TSO enhancements

Support TSO over encapsulation (with GSO_PARTIAL), and over VLANs
(which the code already handled but we didn't advertise). Also
correct our handling of IPID mangling.

I couldn't find documentation of exactly what shaped SKBs we can
get given, so patch #2 is slightly guesswork, but when I tested
TSO over both underlay and (VxLAN) overlay, the checksums came
out correctly, so at least in those cases the edits we're making
must be the right ones.
Similarly, I'm not 100% sure I've correctly understood how FIXEDID
and MANGLEID are supposed to work in patch #3.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

sfc: advertise our vlan features

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sfc: only use fixed-id if the skb asks for it

AIUI, the NETIF_F_TSO_MANGLEID flag is a signal to the stack that a
driver may _need_ to mangle IDs in order to do TSO, and conversely
a signal from the stack that the driver is permitted to do so.
Since we support both fixed and incrementing IPIDs, we should rely
on the SKB_GSO_FIXEDID flag on a per-skb basis, rather than using
the MANGLEID feature to make all TSOs fixed-id.
Includes other minor cleanups of ef100_make_tso_desc() coding style.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sfc: implement encap TSO on EF100

The NIC only needs to know where the headers it has to edit (TCP and
inner and outer IPv4) are, which fits GSO_PARTIAL nicely.
It also supports non-PARTIAL offload of UDP tunnels, again just
needing to be told the outer transport offset so that it can edit
the UDP length field.
(It's not clear to me whether the stack will ever use the non-PARTIAL
version with the netdev feature flags we're setting here.)

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sfc: extend bitfield macros to 17 fields

We need EFX_POPULATE_OWORD_17 for an encap TSO descriptor on EF100.

Signed-off-by: Edward Cree <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'net-ipa-minor-bug-fixes'

Alex Elder says:

====================
net: ipa: minor bug fixes

This series fixes several bugs.  They are minor, in that the code
currently works on supported platforms even without these patches
applied, but they're bugs nevertheless and should be fixed.

Version 2 improves the commit message for the fourth patch.  It also
fixes a bug in two spots in the last patch.  Both of these changes
were suggested by Willem de Bruijn.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipa: avoid going past end of resource group array

The minimum and maximum limits for resources assigned to a given
resource group are programmed in pairs, with the limits for two
groups set in a single register.

If the number of supported resource groups is odd, only half of the
register that defines these limits is valid for the last group; that
group has no second group in the pair.

Currently we ignore this constraint, and it turns out to be harmless,
but it is not guaranteed to be.  This patch addresses that, and adds
support for programming the 5th resource group's limits.

Rework how the resource group limit registers are programmed by
having a single function program all group pairs rather than having
one function program each pair.  Add the programming of the 4-5
resource group pair limits to this function.  If a resource group is
not supported, pass a null pointer to ipa_resource_config_common()
for that group and have that function write zeroes in that case.

Tested-by: Sujit Kautkar <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipa: distinguish between resource group types

The number of resource groups supported by the hardware can be
different for source and destination resources.  Determine the
number supported for each using separate functions.  Make the
functions inline end move their definitions into "ipa_reg.h",
because they determine whether certain register definitions are
valid.  Pass just the IPA hardware version as argument.

IPA_RESOURCE_GROUP_COUNT represents the maximum number of resource
groups the driver supports for any hardware version.  Change that
symbol to be two separate constants, one for source and the other
for destination resource groups.  Rename them to end with "_MAX"
rather than "_COUNT", to reflect their true purpose.

Tested-by: Sujit Kautkar <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipa: assign endpoint to a resource group

The IPA hardware manages various resources (e.g. descriptors)
internally to perform its functions. The resources are grouped,
allowing different endpoints to use separate resource pools. This
way one group of endpoints can be configured to operate unaffected
by the resource use of endpoints in a different group.

Endpoints should be assigned to a resource group, but we currently
don't do that.

Define a new resource_group field in the endpoint configuration
data, and use it to assign the proper resource group to use for
each AP endpoint.

Tested-by: Sujit Kautkar <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipa: fix resource group field mask definition

The mask for the RSRC_GRP field in the INIT_RSRC_GRP endpoint
initialization register is incorrectly defined for IPA v4.2 (where
it is only one bit wide).  So we need to fix this.

The fix is not straightforward, however.  Field masks are passed to
functions like u32_encode_bits(), and for that they must be constant.

To address this, we define a new inline function that returns the
*encoded* value to use for a given RSRC_GRP field, which depends on
the IPA version.  The caller can then use something like this, to
assign a given endpoint resource id 1:

    u32 offset = IPA_REG_ENDP_INIT_RSRC_GRP_N_OFFSET(endpoint_id);
    u32 val = rsrc_grp_encoded(ipa->version, 1);

    iowrite32(val, ipa->reg_virt + offset);

The next patch requires this fix.

Tested-by: Sujit Kautkar <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipa: assign proper packet context base

At the end of ipa_mem_setup() we write the local packet processing
context base register to tell it where the processing context memory
is. But we are writing the wrong value.

The value written turns out to be the offset of the modem header
memory region (assigned earlier in the function). Fix this bug.

Tested-by: Sujit Kautkar <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: dec: tulip: de2104x: Add shutdown handler to stop NIC

The driver does not implement a shutdown handler which leads to issues
when using kexec in certain scenarios. The NIC keeps on fetching
descriptors which gets flagged by the IOMMU with errors like this:

DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000

Signed-off-by: Moritz Fischer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: phy: marvell: add special handling of Finisar modules with 88E1111

The Finisar FCLF8520P2BTL 1000BaseT SFP module uses a Marvel 88E1111 PHY
with a modified PHY ID. Add support for this ID using the 88E1111
methods.

By default these modules do not have 1000BaseX auto-negotiation enabled,
which is not generally desirable with Linux networking drivers. Add
handling to enable 1000BaseX auto-negotiation when these modules are
used in 1000BaseX mode. Also, some special handling is required to ensure
that 1000BaseT auto-negotiation is enabled properly when desired.

Based on existing handling in the AMD xgbe driver and the information in
the Finisar FAQ:
https://www.finisar.com/sites/default/files/resources/an-2036_1000base-t_sfp_faqreve1.pdf

Signed-off-by: Robert Hancock <[email protected]>
Reviewed-by: Russell King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'sctp-implement-rfc6951-udp-encapsulation-of-sctp'

Xin Long says:

====================
sctp: Implement RFC6951: UDP Encapsulation of SCTP

Description From the RFC:

   The Main Reasons:

   o  To allow SCTP traffic to pass through legacy NATs, which do not
      provide native SCTP support as specified in [BEHAVE] and
      [NATSUPP].

   o  To allow SCTP to be implemented on hosts that do not provide
      direct access to the IP layer.  In particular, applications can
      use their own SCTP implementation if the operating system does not
      provide one.

   Implementation Notes:

   UDP-encapsulated SCTP is normally communicated between SCTP stacks
   using the IANA-assigned UDP port number 9899 (sctp-tunneling) on both
   ends.  There are circumstances where other ports may be used on
   either end, and it might be required to use ports other than the
   registered port.

   Each SCTP stack uses a single local UDP encapsulation port number as
   the destination port for all its incoming SCTP packets, this greatly
   simplifies implementation design.

   An SCTP implementation supporting UDP encapsulation MUST maintain a
   remote UDP encapsulation port number per destination address for each
   SCTP association.  Again, because the remote stack may be using ports
   other than the well-known port, each port may be different from each
   stack.  However, because of remapping of ports by NATs, the remote
   ports associated with different remote IP addresses may not be
   identical, even if they are associated with the same stack.

   Because the well-known port might not be used, implementations need
   to allow other port numbers to be specified as a local or remote UDP
   encapsulation port number through APIs.

Patches:

   This patchset is using the udp4/6 tunnel APIs to implement the UDP
   Encapsulation of SCTP with not much change in SCTP protocol stack
   and with all current SCTP features keeped in Linux Kernel.

   1 - 4: Fix some UDP issues that may be triggered by SCTP over UDP.
   5 - 7: Process incoming UDP encapsulated packets and ICMP packets.
   8 -10: Remote encap port's update by sysctl, sockopt and packets.
   11-14: Process outgoing pakects with UDP encapsulated and its GSO.
   15-16: Add the part from draft-tuexen-tsvwg-sctp-udp-encaps-cons-03.
      17: Enable this feature.

Tests:

  - lksctp-tools/src/func_tests with UDP Encapsulation enabled/disabled:

      Both make v4test and v6test passed.

  - sctp-tests with UDP Encapsulation enabled/disabled:

      repeatability/procdumps/sctpdiag/gsomtuchange/extoverflow/
      sctphashtable passed. Others failed as expected due to those
      "iptables -p sctp" rules.

  - netperf on lo/netns/virtio_net, with gso enabled/disabled and
    with ip_checksum enabled/disabled, with UDP Encapsulation
    enabled/disabled:

      No clear performance dropped.

v1->v2:
  - Fix some incorrect code in the patches 5,6,8,10,11,13,14,17, suggested
    by Marcelo.
  - Append two patches 15-16 to add the Additional Considerations for UDP
    Encapsulation of SCTP from draft-tuexen-tsvwg-sctp-udp-encaps-cons-03.
v2->v3:
  - remove the cleanup code in patch 2, suggested by Willem.
  - remove the patch 3 and fix the checksum in the new patch 3 after
    talking with Paolo, Marcelo and Guillaume.
  - add 'select NET_UDP_TUNNEL' in patch 4 to solve a compiling error.
  - fix __be16 type cast warning in patch 8.
  - fix the wrong endian orders when setting values in 14,16.
v3->v4:
  - add entries in ip-sysctl.rst in patch 7,16, as Marcelo Suggested.
  - not create udp socks when udp_port is set to 0 in patch 16, as
    Marcelo noticed.
v4->v5:
  - improve the description for udp_port and encap_port entries in patch
    7, 16.
  - use 0 as the default udp_port.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: enable udp tunneling socks

This patch is to enable udp tunneling socks by calling
sctp_udp_sock_start() in sctp_ctrlsock_init(), and
sctp_udp_sock_stop() in sctp_ctrlsock_exit().

Also add sysctl udp_port to allow changing the listening
sock's port by users.

Wit this patch, the whole sctp over udp feature can be
enabled and used.

v1->v2:
  - Also update ctl_sock udp_port in proc_sctp_do_udp_port()
    where netns udp_port gets changed.
v2->v3:
  - Call htons() when setting sk udp_port from netns udp_port.
v3->v4:
  - Not call sctp_udp_sock_start() when new_value is 0.
  - Add udp_port entry in ip-sysctl.rst.
v4->v5:
  - Not call sctp_udp_sock_start/stop() in sctp_ctrlsock_init/exit().
  - Improve the description of udp_port in ip-sysctl.rst.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: handle the init chunk matching an existing asoc

This is from Section 4 of draft-tuexen-tsvwg-sctp-udp-encaps-cons-03,
and it requires responding with an abort chunk with an error cause
when the udp source port of the received init chunk doesn't match the
encap port of the transport.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: add the error cause for new encapsulation port restart

This patch is to add the function to make the abort chunk with
the error cause for new encapsulation port restart, defined
on Section 4.4 in draft-tuexen-tsvwg-sctp-udp-encaps-cons-03.

v1->v2:
- no change.
v2->v3:
- no need to call htons() when setting nep.cur_port/new_port.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: support for sending packet over udp6 sock

This one basically does the similar things in sctp_v6_xmit as does for
udp4 sock in the last patch, just note that:

  1. label needs to be calculated, as it's the param of
     udp_tunnel6_xmit_skb().

  2. The 'nocheck' param of udp_tunnel6_xmit_skb() is false, as
     required by RFC.

v1->v2:
  - Use sp->udp_port instead in sctp_v6_xmit(), which is more safe.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: support for sending packet over udp4 sock

This patch does what the rfc6951#section-5.3 says for ipv4:

  "Within the UDP header, the source port MUST be the local UDP
   encapsulation port number of the SCTP stack, and the destination port
   MUST be the remote UDP encapsulation port number maintained for the
   association and the destination address to which the packet is sent
   (see Section 5.1).

   Because the SCTP packet is the UDP payload, the length of the UDP
   packet MUST be the length of the SCTP packet plus the size of the UDP
   header.

   The SCTP checksum MUST be computed for IPv4 and IPv6, and the UDP
   checksum SHOULD be computed for IPv4 and IPv6."

Some places need to be adjusted in sctp_packet_transmit():

  1. For non-gso packets, when transport's encap_port is set, sctp
     checksum has to be done in sctp_packet_pack(), as the outer
     udp will use ip_summed = CHECKSUM_PARTIAL to do the offload
     setting for checksum.

  2. Delay calling dst_clone() and skb_dst_set() for non-udp packets
     until sctp_v4_xmit(), as for udp packets, skb_dst_set() is not
     needed before calling udp_tunnel_xmit_skb().

then in sctp_v4_xmit():

  1. Go to udp_tunnel_xmit_skb() only when transport->encap_port and
     net->sctp.udp_port both are set, as these are one for dst port
     and another for src port.

  2. For gso packet, SKB_GSO_UDP_TUNNEL_CSUM is set for gso_type, and
     with this udp checksum can be done in __skb_udp_tunnel_segment()
     for each segments after the sctp gso.

  3. inner_mac_header and inner_transport_header are set, as these
     will be needed in __skb_udp_tunnel_segment() to find the right
     headers.

  4. df and ttl are calculated, as these are the required params by
     udp_tunnel_xmit_skb().

  5. nocheck param has to be false, as "the UDP checksum SHOULD be
     computed for IPv4 and IPv6", says in rfc6951#section-5.3.

v1->v2:
  - Use sp->udp_port instead in sctp_v4_xmit(), which is more safe.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: call sk_setup_caps in sctp_packet_transmit instead

sk_setup_caps() was originally called in Commit 90017accff61 ("sctp:
Add GSO support"), as:

"We have to refresh this in case we are xmiting to more than one
transport at a time"

This actually happens in the loop of sctp_outq_flush_transports(),
and it shouldn't be tied to gso, so move it out of gso part and
before sctp_packet_pack().

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: add udphdr to overhead when udp_port is set

sctp_mtu_payload() is for calculating the frag size before making
chunks from a msg. So we should only add udphdr size to overhead
when udp socks are listening, as only then sctp can handle the
incoming sctp over udp packets and outgoing sctp over udp packets
will be possible.

Note that we can't do this according to transport->encap_port, as
different transports may be set to different values, while the
chunks were made before choosing the transport, we could not be
able to meet all rfc6951#section-5.6 recommends.

v1->v2:
- Add udp_port for sctp_sock to avoid a potential race issue, it
will be used in xmit path in the next patch.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: allow changing transport encap_port by peer packets

As rfc6951#section-5.4 says:

  "After finding the SCTP association (which
   includes checking the verification tag), the UDP source port MUST be
   stored as the encapsulation port for the destination address the SCTP
   packet is received from (see Section 5.1).

   When a non-encapsulated SCTP packet is received by the SCTP stack,
   the encapsulation of outgoing packets belonging to the same
   association and the corresponding destination address MUST be
   disabled."

transport encap_port should be updated by a validated incoming packet's
udp src port.

We save the udp src port in sctp_input_cb->encap_port, and then update
the transport in two places:

  1. right after vtag is verified, which is required by RFC, and this
     allows the existent transports to be updated by the chunks that
     can only be processed on an asoc.

  2. right before processing the 'init' where the transports are added,
     and this allows building a sctp over udp connection by client with
     the server not knowing the remote encap port.

  3. when processing ootb_pkt and creating the temporary transport for
     the reply pkt.

Note that sctp_input_cb->header is removed, as it's not used any more
in sctp.

v1->v2:
  - Change encap_port as __be16 for sctp_input_cb.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: add SCTP_REMOTE_UDP_ENCAPS_PORT sockopt

This patch is to implement:

  rfc6951#section-6.1: Get or Set the Remote UDP Encapsulation Port Number

with the param of the struct:

  struct sctp_udpencaps {
    sctp_assoc_t sue_assoc_id;
    struct sockaddr_storage sue_address;
    uint16_t sue_port;
  };

the encap_port of sock, assoc or transport can be changed by users,
which also means it allows the different transports of the same asoc
to have different encap_port value.

v1->v2:
  - no change.
v2->v3:
  - fix the endian warning when setting values between encap_port and
    sue_port.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: add encap_port for netns sock asoc and transport

encap_port is added as per netns/sock/assoc/transport, and the
latter one's encap_port inherits the former one's by default.
The transport's encap_port value would mostly decide if one
packet should go out with udp encapsulated or not.

This patch also allows users to set netns' encap_port by sysctl.

v1->v2:
  - Change to define encap_port as __be16 for sctp_sock, asoc and
    transport.
v2->v3:
  - No change.
v3->v4:
  - Add 'encap_port' entry in ip-sysctl.rst.
v4->v5:
  - Improve the description of encap_port in ip-sysctl.rst.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: add encap_err_lookup for udp encap socks

As it says in rfc6951#section-5.5:

  "When receiving ICMP or ICMPv6 response packets, there might not be
   enough bytes in the payload to identify the SCTP association that the
   SCTP packet triggering the ICMP or ICMPv6 packet belongs to.  If a
   received ICMP or ICMPv6 packet cannot be related to a specific SCTP
   association or the verification tag cannot be verified, it MUST be
   discarded silently.  In particular, this means that the SCTP stack
   MUST NOT rely on receiving ICMP or ICMPv6 messages.  Implementation
   constraints could prevent processing received ICMP or ICMPv6
   messages."

ICMP or ICMPv6 packets need to be handled, and this is implemented by
udp encap sock .encap_err_lookup function.

The .encap_err_lookup function is called in __udp(6)_lib_err_encap()
to confirm this path does need to be updated. For sctp, what we can
do here is check if the corresponding asoc and transport exist.

Note that icmp packet process for sctp over udp is done by udp sock
.encap_err_lookup(), and it means for now we can't do as much as
sctp_v4/6_err() does. Also we can't do the two mappings mentioned
in rfc6951#section-5.5.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: create udp6 sock and set its encap_rcv

This patch is to add the udp6 sock part in sctp_udp_sock_start/stop().
udp_conf.use_udp6_rx_checksums is set to true, as:

   "The SCTP checksum MUST be computed for IPv4 and IPv6, and the UDP
    checksum SHOULD be computed for IPv4 and IPv6"

says in rfc6951#section-5.3.

v1->v2:
  - Add pr_err() when fails to create udp v6 sock.
  - Add #if IS_ENABLED(CONFIG_IPV6) not to create v6 sock when ipv6 is
    disabled.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

sctp: create udp4 sock and add its encap_rcv

This patch is to add the functions to create/release udp4 sock,
and set the sock's encap_rcv to process the incoming udp encap
sctp packets. In sctp_udp_rcv(), as we can see, all we need to
do is fix the transport header for sctp_rcv(), then it would
implement the part of rfc6951#section-5.4:

  "When an encapsulated packet is received, the UDP header is removed.
   Then, the generic lookup is performed, as done by an SCTP stack
   whenever a packet is received, to find the association for the
   received SCTP packet"

Note that these functions will be called in the last patch of
this patchset when enabling this feature.

v1->v2:
  - Add pr_err() when fails to create udp v4 sock.
v2->v3:
  - Add 'select NET_UDP_TUNNEL' in sctp Kconfig.
v3->v4:
  - No change.
v4->v5:
  - Change to set udp_port to 0 by default.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

udp: support sctp over udp in skb_udp_tunnel_segment

For the gso of sctp over udp packets, sctp_gso_segment() will be called in
skb_udp_tunnel_segment(), we need to set transport_header to sctp header.

As all the current HWs can't handle both crc checksum and udp checksum at
the same time, the crc checksum has to be done in sctp_gso_segment() by
removing the NETIF_F_SCTP_CRC flag from the features.

Meanwhile, if the HW can't do udp checksum, csum and csum_start has to be
set correctly, and udp checksum will be done in __skb_udp_tunnel_segment()
by calling gso_make_checksum().

Thanks to Paolo, Marcelo and Guillaume for helping with this one.

v1->v2:
  - no change.
v2->v3:
  - remove the he NETIF_F_SCTP_CRC flag from the features.
  - set csum and csum_start in sctp_gso_make_checksum().

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

udp6: move the mss check after udp gso tunnel processing

For some protocol's gso, like SCTP, it's using GSO_BY_FRAGS for
gso_size. When using UDP to encapsulate its packet, it will
return error in udp6_ufo_fragment() as skb->len < gso_size,
and it will never go to the gso tunnel processing.

So we should move this check after udp gso tunnel processing,
the same as udp4_ufo_fragment() does.

v1->v2:
- no change.
v2->v3:
- not do any cleanup.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

udp: check udp sock encap_type in __udp_lib_err

There is a chance that __udp4/6_lib_lookup() returns a udp encap
sock in __udp_lib_err(), like the udp encap listening sock may
use the same port as remote encap port, in which case it should
go to __udp4/6_lib_err_encap() for more validation before
processing the icmp packet.

This patch is to check encap_type in __udp_lib_err() for the
further validation for a encap sock.

Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: mv88e6xxx: fix vlan setup

DSA assumes that a bridge which has vlan filtering disabled is not
vlan aware, and ignores all vlan configuration. However, the kernel
software bridge code allows configuration in this state.

This causes the kernel's idea of the bridge vlan state and the
hardware state to disagree, so "bridge vlan show" indicates a correct
configuration but the hardware lacks all configuration. Even worse,
enabling vlan filtering on a DSA bridge immediately blocks all traffic
which, given the output of "bridge vlan show", is very confusing.

Allow the VLAN configuration to be updated on Marvell DSA bridges,
otherwise we end up cutting all traffic when enabling vlan filtering.

Signed-off-by: Russell King <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Tested-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

drivers: net: phy: Fix spelling in comment defalut to default

Fixed spelling in comment like below:

s/defalut/default/p

This is in linux-next.

Signed-off-by: Bhaskar Chowdhury <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: cls_api: remove unneeded local variable in tc_dump_chain()

make clang-analyzer on x86_64 defconfig caught my attention with:

net/sched/cls_api.c:2964:3: warning: Value stored to 'parent' is never read
  [clang-analyzer-deadcode.DeadStores]
                parent = 0;
                ^

net/sched/cls_api.c:2977:4: warning: Value stored to 'parent' is never read
  [clang-analyzer-deadcode.DeadStores]
                        parent = q->handle;
                        ^

Commit 32a4f5ecd738 ("net: sched: introduce chain object to uapi")
introduced tc_dump_chain() and this initial implementation already
contained these unneeded dead stores.

Simplify the code to make clang-analyzer happy.

As compilers will detect these unneeded assignments and optimize this
anyway, the resulting binary is identical before and after this change.

No functional change. No change in object code.

Signed-off-by: Lukas Bulwahn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ipv6: mcast: make annotations for ip6_mc_msfget() consistent

Commit 931ca7ab7fe8 ("ip*_mc_gsfget(): lift copyout of struct group_filter
into callers") adjusted the type annotations for ip6_mc_msfget() at its
declaration, but missed the type annotations at its definition.

Hence, sparse complains on ./net/ipv6/mcast.c:

  mcast.c:550:5: error: symbol 'ip6_mc_msfget' redeclared with different type \
  (incompatible argument 3 (different address spaces))

Make ip6_mc_msfget() annotations consistent, which also resolves this
warning from sparse:

  mcast.c:607:34: warning: incorrect type in argument 1 (different address spaces)
  mcast.c:607:34:    expected void [noderef] __user *to
  mcast.c:607:34:    got struct __kernel_sockaddr_storage *p

No functional change. No change in object code.

Signed-off-by: Lukas Bulwahn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

tipc: remove dead code in tipc_net and relatives

dist_queue is no longer used since commit 37922ea4a310
("tipc: permit overlapping service ranges in name table")

Acked-by: Jon Maloy <[email protected]>
Acked-by: Ying Xue <[email protected]>
Signed-off-by: Hoang Huu Le <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipv6: calipso: Fix kerneldoc warnings

net/ipv6/calipso.c:1236: warning: Excess function parameter 'reg' description in 'calipso_req_delattr'
net/ipv6/calipso.c:1236: warning: Function parameter or member 'req' not described in 'calipso_req_delattr'
net/ipv6/calipso.c:435: warning: Excess function parameter 'audit_secid' description in 'calipso_doi_remove'
net/ipv6/calipso.c:435: warning: Function parameter or member 'audit_info' not described in 'calipso_doi_remove'

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipv6: rpl*: Fix strange kerneldoc warnings due to bad header

net/ipv6/rpl_iptunnel.c:15: warning: cannot understand function prototype: 'struct rpl_iptunnel_encap '

The header on the file containing the author copyright message uses
kerneldoc /** opener. This confuses the parser when it gets to

struct rpl_iptunnel_encap {
struct ipv6_rpl_sr_hdr srh[0];
};

Similarly:

net//ipv6/rpl.c:10: warning: Function parameter or member 'x' not described in 'IPV6_PFXTAIL_LEN'

where IPV6_PFXTAIL_LEN is a macro definition, not a function.

Convert the header comments to a plain /* comment.

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: ipv4: Fix some kerneldoc warnings in TCP Low Priority

net//ipv4/tcp_lp.c:120: warning: Function parameter or member 'sk' not described in 'tcp_lp_cong_avoid'
net//ipv4/tcp_lp.c:135: warning: Function parameter or member 'sk' not described in 'tcp_lp_remote_hz_estimator'
net//ipv4/tcp_lp.c:188: warning: Function parameter or member 'sk' not described in 'tcp_lp_owd_calculator'
net//ipv4/tcp_lp.c:222: warning: Function parameter or member 'rtt' not described in 'tcp_lp_rtt_sample'
net//ipv4/tcp_lp.c:222: warning: Function parameter or member 'sk' not described in 'tcp_lp_rtt_sample'
net//ipv4/tcp_lp.c:265: warning: Function parameter or member 'sk' not described in 'tcp_lp_pkts_acked'
net//ipv4/tcp_lp.c:97: warning: Function parameter or member 'sk' not described in 'tcp_lp_init'

There are still a few kerneldoc warnings after this fix.

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dccp: Fix most of the kerneldoc warnings

net/dccp/ccids/ccid2.c:190: warning: Function parameter or member 'hc' not described in 'ccid2_update_used_window'
net/dccp/ccids/ccid2.c:190: warning: Function parameter or member 'new_wnd' not described in 'ccid2_update_used_window'
net/dccp/ccids/ccid2.c:360: warning: Function parameter or member 'sk' not described in 'ccid2_rtt_estimator'
net/dccp/ccids/ccid3.c:112: warning: Function parameter or member 'sk' not described in 'ccid3_hc_tx_update_x'
net/dccp/ccids/ccid3.c:159: warning: Function parameter or member 'hc' not described in 'ccid3_hc_tx_update_s'
net/dccp/ccids/ccid3.c:268: warning: Function parameter or member 'sk' not described in 'ccid3_hc_tx_send_packet'
net/dccp/ccids/ccid3.c:667: warning: Function parameter or member 'sk' not described in 'ccid3_first_li'
net/dccp/ccids/ccid3.c:85: warning: Function parameter or member 'hc' not described in 'ccid3_update_send_interval'
net/dccp/ccids/lib/loss_interval.c:85: warning: Function parameter or member 'lh' not described in 'tfrc_lh_update_i_mean'
net/dccp/ccids/lib/loss_interval.c:85: warning: Function parameter or member 'skb' not described in 'tfrc_lh_update_i_mean'
net/dccp/ccids/lib/packet_history.c:392: warning: Function parameter or member 'h' not described in 'tfrc_rx_hist_sample_rtt'
net/dccp/ccids/lib/packet_history.c:392: warning: Function parameter or member 'skb' not described in 'tfrc_rx_hist_sample_rtt'
net/dccp/feat.c:1003: warning: Function parameter or member 'dreq' not described in 'dccp_feat_server_ccid_dependencies'
net/dccp/feat.c:1040: warning: Function parameter or member 'array_len' not described in 'dccp_feat_prefer'
net/dccp/feat.c:1040: warning: Function parameter or member 'array' not described in 'dccp_feat_prefer'
net/dccp/feat.c:1040: warning: Function parameter or member 'preferred_value' not described in 'dccp_feat_prefer'
net/dccp/output.c:151: warning: Function parameter or member 'dp' not described in 'dccp_determine_ccmps'
net/dccp/output.c:242: warning: Function parameter or member 'sk' not described in 'dccp_xmit_packet'
net/dccp/output.c:305: warning: Function parameter or member 'sk' not described in 'dccp_flush_write_queue'
net/dccp/output.c:305: warning: Function parameter or member 'time_budget' not described in 'dccp_flush_write_queue'
net/dccp/output.c:378: warning: Function parameter or member 'sk' not described in 'dccp_retransmit_skb'
net/dccp/qpolicy.c:88: warning: Function parameter or member '' not described in 'dccp_qpolicy_operations'
net/dccp/qpolicy.c:88: warning: Function parameter or member '{' not described in 'dccp_qpolicy_operations'
net/dccp/qpolicy.c:88: warning: Function parameter or member 'params' not described in 'dccp_qpolicy_operations'

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dcb: Fix kerneldoc warnings

net//dcb/dcbnl.c:1836: warning: Function parameter or member 'app' not described in 'dcb_getapp'
net//dcb/dcbnl.c:1836: warning: Function parameter or member 'dev' not described in 'dcb_getapp'
net//dcb/dcbnl.c:1858: warning: Function parameter or member 'dev' not described in 'dcb_setapp'
net//dcb/dcbnl.c:1858: warning: Function parameter or member 'new' not described in 'dcb_setapp'
net//dcb/dcbnl.c:1899: warning: Function parameter or member 'app' not described in 'dcb_ieee_getapp_mask'
net//dcb/dcbnl.c:1899: warning: Function parameter or member 'dev' not described in 'dcb_ieee_getapp_mask'
net//dcb/dcbnl.c:1922: warning: Function parameter or member 'dev' not described in 'dcb_ieee_setapp'
net//dcb/dcbnl.c:1922: warning: Function parameter or member 'new' not described in 'dcb_ieee_setapp'
net//dcb/dcbnl.c:1953: warning: Function parameter or member 'del' not described in 'dcb_ieee_delapp'
net//dcb/dcbnl.c:1953: warning: Function parameter or member 'dev' not described in 'dcb_ieee_delapp'
net//dcb/dcbnl.c:1986: warning: Function parameter or member 'dev' not described in 'dcb_ieee_getapp_prio_dscp_mask_map'
net//dcb/dcbnl.c:1986: warning: Function parameter or member 'p_map' not described in 'dcb_ieee_getapp_prio_dscp_mask_map'
net//dcb/dcbnl.c:2016: warning: Function parameter or member 'dev' not described in 'dcb_ieee_getapp_dscp_prio_mask_map'
net//dcb/dcbnl.c:2016: warning: Function parameter or member 'p_map' not described in 'dcb_ieee_getapp_dscp_prio_mask_map'
net//dcb/dcbnl.c:2045: warning: Function parameter or member 'dev' not described in 'dcb_ieee_getapp_default_prio_mask'

For some of these warnings, change to comments to plain comments,
since no attempt is being made to follow kerneldoc syntax.

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: nfc: Fix kerneldoc warnings

net//nfc/core.c:1046: warning: Function parameter or member 'tx_headroom' not described in 'nfc_allocate_device'
net//nfc/core.c:1046: warning: Function parameter or member 'tx_tailroom' not described in 'nfc_allocate_device'
net//nfc/core.c:198: warning: Excess function parameter 'protocols' description in 'nfc_start_poll'
net//nfc/core.c:198: warning: Function parameter or member 'im_protocols' not described in 'nfc_start_poll'
net//nfc/core.c:198: warning: Function parameter or member 'tm_protocols' not described in 'nfc_start_poll'
net//nfc/core.c:441: warning: Function parameter or member 'mode' not described in 'nfc_deactivate_target'
net//nfc/core.c:711: warning: Function parameter or member 'dev' not described in 'nfc_alloc_send_skb'
net//nfc/core.c:711: warning: Function parameter or member 'err' not described in 'nfc_alloc_send_skb'
net//nfc/core.c:711: warning: Function parameter or member 'flags' not described in 'nfc_alloc_send_skb'
net//nfc/core.c:711: warning: Function parameter or member 'sk' not described in 'nfc_alloc_send_skb'
net//nfc/digital_core.c:470: warning: Function parameter or member 'im_protocols' not described in 'digital_start_poll'
net//nfc/digital_core.c:470: warning: Function parameter or member 'nfc_dev' not described in 'digital_start_poll'
net//nfc/digital_core.c:470: warning: Function parameter or member 'tm_protocols' not described in 'digital_start_poll'
net//nfc/nci/core.c:1119: warning: Function parameter or member 'tx_headroom' not described in 'nci_allocate_device'
net//nfc/nci/core.c:1119: warning: Function parameter or member 'tx_tailroom' not described in 'nci_allocate_device'

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: appletalk: fix kerneldoc warnings

net/appletalk/aarp.c:68: warning: Function parameter or member 'dev' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'expires_at' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'hwaddr' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'last_sent' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'next' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'packet_queue' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'status' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'target_addr' not described in 'aarp_entry'
net/appletalk/aarp.c:68: warning: Function parameter or member 'xmit_count' not described in 'aarp_entry'
net/appletalk/ddp.c:1422: warning: Function parameter or member 'dev' not described in 'atalk_rcv'
net/appletalk/ddp.c:1422: warning: Function parameter or member 'orig_dev' not described in 'atalk_rcv'
net/appletalk/ddp.c:1422: warning: Function parameter or member 'pt' not described in 'atalk_rcv'
net/appletalk/ddp.c:1422: warning: Function parameter or member 'skb' not described in 'atalk_rcv'

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: netlabel: Fix kerneldoc warnings

net/netlabel/netlabel_calipso.c:376: warning: Function parameter or member 'ops' not described in 'netlbl_calipso_ops_register'

Signed-off-by: Andrew Lunn <[email protected]>
Acked-by: Paul Moore <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: l3mdev: Fix kerneldoc warning

net/l3mdev/l3mdev.c:249: warning: Function parameter or member 'arg' not described in 'l3mdev_fib_rule_match'

Signed-off-by: Andrew Lunn <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: openvswitch: Fix kerneldoc warnings

net/openvswitch/flow.c:303: warning: Function parameter or member 'key_vh' not described in 'parse_vlan_tag'
net/openvswitch/flow.c:303: warning: Function parameter or member 'skb' not described in 'parse_vlan_tag'
net/openvswitch/flow.c:303: warning: Function parameter or member 'untag_vlan' not described in 'parse_vlan_tag'
net/openvswitch/vport.c:122: warning: Function parameter or member 'parms' not described in 'ovs_vport_alloc'

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: llc: Fix kerneldoc warnings

net/llc/llc_conn.c:917: warning: Function parameter or member 'kern' not described in 'llc_sk_alloc'
net/llc/llc_conn.c:917: warning: Function parameter or member 'prot' not described in 'llc_sk_alloc'

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'markup-some-printk-like-functions'

Andrew Lunn says:

====================
Markup some printk like functions

W=1 warns of functions which look like printk but don't have
attributes so the compile can check that arguments matches the format
string.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: tipc: Add __printf() markup to fix -Wsuggest-attribute=format

net/tipc/netlink_compat.c: In function ‘tipc_tlv_sprintf’:
net/tipc/netlink_compat.c:137:2: warning: function ‘tipc_tlv_sprintf’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
137 | n = vscnprintf(buf, rem, fmt, args);

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: dccp: Add __printf() markup to fix -Wsuggest-attribute=format

net/dccp/ccid.c: In function ‘ccid_kmem_cache_create’:
net/dccp/ccid.c:85:2: warning: function ‘ccid_kmem_cache_create’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
85 | vsnprintf(slab_name_fmt, CCID_SLAB_NAME_LENGTH, fmt, args);

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: tipc: Fix parameter types passed to %s formater

Now that the compiler is performing printf checking, we get the warning:

net/tipc/netlink_compat.c: In function ‘tipc_nl_compat_link_stat_dump’:
net/tipc/netlink_compat.c:591:39: warning: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘void *’ [-Wformat=]
  591 |  tipc_tlv_sprintf(msg->rep, "\nLink <%s>\n",
      |                                      ~^
      |                                       |
      |                                       char *
      |                                      %p
  592 |     nla_data(link[TIPC_NLA_LINK_NAME]));
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |     |
      |     void *

There is no nla_string(), so cast to a char *.

Signed-off-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'selftests-net-bridge-add-tests-for-igmpv3'

Nikolay Aleksandrov says:

====================
selftests: net: bridge: add tests for IGMPv3

This set adds tests for the bridge's new IGMPv3 support. The tests use
precooked packets which are sent via mausezahn and the resulting state
after each test is checked for proper X,Y sets, (*,G) source list, source
list entry timers, (S,G) existence and flags, packet forwarding and
blocking, exclude group expiration and (*,G) auto-add. The first 3 patches
prepare the existing IGMPv2 tests, then patch 4 adds new helpers which are
used throughout the rest of the v3 tests.
The following new tests are added:
- base case: IGMPv3 report 239.10.10.10 is_include (A)
- include -> allow report
- include -> is_include report
- include -> is_exclude report
- include -> to_exclude report
- exclude -> allow report
- exclude -> is_include report
- exclude -> is_exclude report
- exclude -> to_exclude report
- include -> block report
- exclude -> block report
- exclude timeout (move to include + entry deletion)
- S,G port entry automatic add to a *,G,exclude port

The variable names and set notation are the same as per RFC 3376,
for more information check RFC 3376 sections 4.2.15 and 6.4.1.
MLDv2 tests will be added by a separate patch-set.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 *,g auto-add

When we have *,G ports in exclude mode and a new S,G,port is added
the kernel has to automatically create an S,G entry for each exclude
port to get proper forwarding.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 exclude timeout

Test that when a group in exclude mode expires it changes mode to
include and the blocked entries are deleted.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 exc -> block report

The test checks for the following case:
   state          report        result                  action
EXCLUDE (X,Y)  BLOCK (A)     EXCLUDE (X+(A-Y),Y)      (A-X-Y)=Group Timer
                                                       Send Q(G,A-Y)

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 inc -> block report

The test checks for the following case:
state report result action
INCLUDE (A) BLOCK (B) INCLUDE (A) Send Q(G,A*B)

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 exc -> to_exclude report

The test checks for the following case:
    state          report        result                  action
  EXCLUDE (X,Y)  TO_EX (A)     EXCLUDE (A-Y,Y*A)        (A-X-Y)=Group Timer
                                                        Delete (X-A)
                                                        Delete (Y-A)
                                                        Send Q(G,A-Y)
                                                        Group Timer=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 exc -> is_exclude report

The test checks for the following case:
     state          report        result                  action
   EXCLUDE (X,Y)  IS_EX (A)     EXCLUDE (A-Y,Y*A)        (A-X-Y)=GMI
                                                         Delete (X-A)
                                                         Delete (Y-A)
                                                         Group Timer=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 exc -> is_include report

The test checks for the following case:
state report result action
EXCLUDE (X,Y) IS_IN (A) EXCLUDE (X+A,Y-A) (A)=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 exc -> allow report

The test checks for the following case:
state report result action
EXCLUDE (X,Y) ALLOW (A) EXCLUDE (X+A,Y-A) (A)=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 inc -> to_exclude report

The test checks for the following case:
  state          report        result                  action
INCLUDE (A)    TO_EX (B)     EXCLUDE (A*B,B-A)       (B-A)=0
                                                      Delete (A-B)
                                                      Send Q(G,A*B)
                                                      Group Timer=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 inc -> is_exclude report

The test checks for the following case:
   state          report        result                 action
  INCLUDE (A)    IS_EX (B)     EXCLUDE (A*B,B-A)      (B-A)=0
                                                      Delete (A-B)
                                                      Group Timer=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add test for igmpv3 inc -> is_include report

The test checks for the following case:
state report result action
INCLUDE (A) IS_IN (B) INCLUDE (A+B) (B)=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: add tests for igmpv3 is_include and inc -> allow reports

First we test is_include/include mode then we build on that with allow
effectively achieving:
state report result action
INCLUDE (A) ALLOW (B) INCLUDE (A+B) (B)=GMI

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: igmp: add IGMPv3 entries' state helpers

Add helpers which will be used in subsequent tests, they are:
- check_sg_entries: check for proper source list and S,G entry
existence
- check_sg_fwding: check for proper traffic forwarding/blocking
- check_sg_state: check for proper blocked/forwarding entry state

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: igmp: check for specific udp ip protocol

We have to specifically check for udp protocol in addition to the mac
address because in IGMPv3 tests group-specific queries will use the same
mac address.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: igmp: add support for packet source address

Add support for one more argument which specifies the source address to
use. It will be later used for IGMPv3 S,G entry testing.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

selftests: net: bridge: rename current igmp tests to igmpv2

To prepare the bridge_igmp.sh for IGMPv3 we need to rename the
current test to IGMPv2.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: phy: leds: Deduplicate link LED trigger registration

Refactor phy_led_trigger_register() and deduplicate its functionality
when registering LED trigger for link.

Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: stmmac: Enable EEE HW LPI timer with auto SW/HW switching

This patch enables the HW LPI Timer which controls the automatic entry
and exit of the LPI state.
The EEE LPI timer value is configured through ethtool. The driver will
auto select the LPI HW timer if the value in the HW timer supported range.
Else, the driver will fallback to SW timer.

Signed-off-by: Vineetha G. Jaya Kumaran <[email protected]>
Signed-off-by: Voon Weifeng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'wimax-staging' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground

Arnd Bergmann says:

====================
wimax: move to staging

After I sent a fix for what appeared to be a harmless warning in
the wimax user interface code, the conclusion was that the whole
thing has most likely not been used in a very long time, and the
user interface possibly been broken since b61a5eea5904 ("wimax: use
genl_register_family_with_ops()").

Using a shared branch between net-next and staging should help
coordinate patches getting submitted against it.
====================

Signed-off-by: Jakub Kicinski <[email protected]>

tipc: add stricter control of reserved service types

TIPC reserves 64 service types for current and future internal use.
Therefore, the bind() function is meant to block regular user sockets
from being bound to these values, while it should let through such
bindings from internal users.

However, since we at the design moment saw no way to distinguish
between regular and internal users the filter function ended up
with allowing all bindings of the reserved types which were really
in use ([0,1]), and block all the rest ([2,63]).

This is risky, since a regular user may bind to the service type
representing the topology server (TIPC_TOP_SRV == 1) or the one used
for indicating neighboring node status (TIPC_CFG_SRV == 0), and wreak
havoc for users of those services, i.e., most users.

The reality is however that TIPC_CFG_SRV never is bound through the
bind() function, since it doesn't represent a regular socket, and
TIPC_TOP_SRV can also be made to bypass the checks in tipc_bind()
by introducing a different entry function, tipc_sk_bind().

It should be noted that although this is a change of the API semantics,
there is no risk we will break any currently working applications by
doing this. Any application trying to bind to the values in question
would be badly broken from the outset, so there is no chance we would
find any such applications in real-world production systems.

v2: Added warning printout when a user is blocked from binding,
as suggested by Jakub Kicinski

Acked-by: Yung Xue <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net/mac8390: discard unnecessary breaks

The 'break' is unnecessary because of previous 'return',
and we could discard it.

Signed-off-by: Zhang Qilong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: mii: Report advertised link capabilities when autonegotiation is off

Unify the set of information returned by mii_ethtool_get_link_ksettings(),
mii_ethtool_gset() and phy_ethtool_ksettings_get(). Make the mii_*()
functions report advertised settings when autonegotiation if disabled.

Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Łukasz Stelmach <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>