Git Repo - linux.git/log
4 months ago Merge tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilt...
Jakub Kicinski [Fri, 15 Nov 2024 22:09:20 +0000 (14:09 -0800)]
Merge tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Extended netlink error reporting if nfnetlink attribute parser fails,
   from Donald Hunter.

2) Fix for an incorrect request_module() format argument, from Simon Horman.

3) A series of patches to reduce memory consumption for set element
   transactions.
   Florian Westphal says:

"When doing a flush on a set or mass adding/removing elements from a
set, each element needs to allocate 96 bytes to hold the transactional
state.

In such cases, virtually all the information in struct nft_trans_elem
is the same.

Change nft_trans_elem to a flex-array, i.e. a single nft_trans_elem
can hold multiple set element pointers.

The number of elements that can be stored in one nft_trans_elem is limited
by the slab allocator; this series limits the compaction to at most 62
elements as it caps the reallocation to 2048 bytes of memory."

4) A series of patches to prepare the transition to dscp_t in .flowi_tos.
   From Guillaume Nault.

5) Support for bitwise operations with two source registers,
   from Jeremy Sowden.

* tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: bitwise: add support for doing AND, OR and XOR directly
  netfilter: bitwise: rename some boolean operation functions
  netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.
  netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.
  netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.
  netfilter: flow_offload: Convert nft_flow_route() to dscp_t.
  netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.
  netfilter: nf_tables: allocate element update information dynamically
  netfilter: nf_tables: switch trans_elem to real flex array
  netfilter: nf_tables: prepare nft audit for set element compaction
  netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure
  netfilter: nf_tables: add nft_trans_commit_list_add_elem helper
  netfilter: bpf: Pass string literal as format argument of request_module()
  netfilter: nfnetlink: Report extack policy errors for batched ops
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago netfilter: bitwise: add support for doing AND, OR and XOR directly
Jeremy Sowden [Thu, 14 Nov 2024 21:08:13 +0000 (22:08 +0100)]
netfilter: bitwise: add support for doing AND, OR and XOR directly

Hitherto, these operations have been converted in user space to
mask-and-xor operations on one register and two immediate values, and it
is the latter which have been evaluated by the kernel.  We add support
for evaluating these operations directly in kernel space on one register
and either an immediate value or a second register.
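
As a rough illustration of the older scheme (a sketch, not kernel or nft
code), each of the three operations on a register and an immediate can be
written as a mask-and-xor pair, dst = (src & mask) ^ xor. The helper names
and the 32-bit register width below are assumptions made for the example.

```python
WIDTH_MASK = 0xFFFFFFFF  # assumption: 32-bit registers, chosen only for this sketch


def mask_xor(src, mask, xor):
    """Evaluate the mask-and-xor form: (src & mask) ^ xor."""
    return ((src & mask) ^ xor) & WIDTH_MASK


def encode(op, imm):
    """Return the (mask, xor) immediates that reproduce `src <op> imm`."""
    if op == "and":
        return imm, 0
    if op == "or":
        return ~imm & WIDTH_MASK, imm
    if op == "xor":
        return WIDTH_MASK, imm
    raise ValueError(f"unsupported operation: {op}")


def direct(op, src, operand):
    """Evaluate the operation directly on two operands."""
    if op == "and":
        return src & operand
    if op == "or":
        return src | operand
    if op == "xor":
        return src ^ operand
    raise ValueError(f"unsupported operation: {op}")
```

The encoding needs both immediates to be known up front, so it presumably
cannot express an operation whose second operand is only available in a
register at evaluation time; direct() models that case.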

Pablo made a few changes to the original patch:

- EINVAL if NFTA_BITWISE_SREG2 is used with fast version.
- Allow _AND,_OR,_XOR with _DATA != sizeof(u32)
- Dump _SREG2 or _DATA with _AND,_OR,_XOR

Signed-off-by: Jeremy Sowden <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: bitwise: rename some boolean operation functions
Jeremy Sowden [Thu, 14 Nov 2024 21:07:51 +0000 (22:07 +0100)]
netfilter: bitwise: rename some boolean operation functions

In the next patch we add support for doing AND, OR and XOR operations
directly in the kernel, so rename some functions and an enum constant
related to mask-and-xor boolean operations.

Signed-off-by: Jeremy Sowden <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.
Guillaume Nault [Thu, 14 Nov 2024 16:03:52 +0000 (17:03 +0100)]
netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.

Use ip4h_dscp() instead of reading iph->tos directly.

ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.
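
The round trip described above can be modelled in a few lines. This is a
sketch only: the *_model function names are hypothetical, and the 0xFC mask
is an assumption that DSCP occupies the upper six bits of the TOS byte, with
ECN in the lower two.

```python
INET_DSCP_MASK = 0xFC  # assumption: DSCP is the upper six bits of the TOS byte


def ip4h_dscp_model(tos):
    """Model of ip4h_dscp(): keep the DSCP bits, clear the two ECN bits."""
    return tos & INET_DSCP_MASK


def inet_dscp_to_dsfield_model(dscp):
    """Model of inet_dscp_to_dsfield(): same bits, viewed as a plain u8."""
    return dscp & 0xFF


tos = 0xB9  # DSCP 46 (EF) in the upper six bits, ECN 0b01 in the lower two
flowi4_tos = inet_dscp_to_dsfield_model(ip4h_dscp_model(tos))
```

Under these assumptions the ECN bits never reach flowi4_tos, whereas copying
iph->tos directly would carry them along.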

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.
Guillaume Nault [Thu, 14 Nov 2024 16:03:45 +0000 (17:03 +0100)]
netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.

Use ip4h_dscp() instead of reading iph->tos directly.

ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.
Guillaume Nault [Thu, 14 Nov 2024 16:03:38 +0000 (17:03 +0100)]
netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.

Use ip4h_dscp() instead of reading iph->tos directly.

ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: flow_offload: Convert nft_flow_route() to dscp_t.
Guillaume Nault [Thu, 14 Nov 2024 16:03:31 +0000 (17:03 +0100)]
netfilter: flow_offload: Convert nft_flow_route() to dscp_t.

Use ip4h_dscp() instead of reading ip_hdr()->tos directly.

ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.

Also, remove the comment about the net/ip.h include file, since it's
now required for the ip4h_dscp() helper too.

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.
Guillaume Nault [Thu, 14 Nov 2024 16:03:21 +0000 (17:03 +0100)]
netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.

Use ip4h_dscp() instead of reading iph->tos directly.

ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago Merge branch 'net-make-rss-rxnfc-semantics-more-explicit'
Jakub Kicinski [Fri, 15 Nov 2024 03:53:43 +0000 (19:53 -0800)]
Merge branch 'net-make-rss-rxnfc-semantics-more-explicit'

Edward Cree says:

====================
net: make RSS+RXNFC semantics more explicit

The original semantics of ntuple filters with FLOW_RSS were not
 fully understood by all drivers, some ignoring the ring_cookie from
 the flow rule.  Require this support to be explicitly declared by
 the driver for filters relying on it to be inserted, and add self-
 test coverage for this functionality.
Also teach ethtool_check_max_channel() about this.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago selftest: extend test_rss_context_queue_reconfigure for action addition
Edward Cree [Wed, 13 Nov 2024 12:13:13 +0000 (12:13 +0000)]
selftest: extend test_rss_context_queue_reconfigure for action addition

The combination of ntuple action (ring_cookie) and RSS context can
 cause an ntuple rule to target a higher queue than appears in any
 RSS indirection table or directly in the ntuple rule, since the two
 numbers are added together.  Verify the logic that prevents reducing
 the queue count in this case.

Signed-off-by: Edward Cree <[email protected]>
Link: https://patch.msgid.link/58276b800ab78c0a79c1918046ccae7fe45ba802.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago selftest: validate RSS+ntuple filters with nonzero ring_cookie
Edward Cree [Wed, 13 Nov 2024 12:13:12 +0000 (12:13 +0000)]
selftest: validate RSS+ntuple filters with nonzero ring_cookie

Test creates an ntuple filter with 'action 2' and an RSS context whose
 indirection table has entries 0 and 1.  Resulting traffic should go to
 queues 2 and 3; verify that it never hits queues 0 and 1.
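
The expected queues follow from the addition semantics. A minimal sketch,
assuming the indirection table is indexed by the flow hash modulo its length
(the function name and that indexing rule are illustrative assumptions, not
taken from any driver):

```python
def target_queue(ring_cookie, indir_table, flow_hash):
    """Queue chosen under the addition semantics: base queue + table entry."""
    entry = indir_table[flow_hash % len(indir_table)]  # modulo lookup is an assumption
    return ring_cookie + entry


# 'action 2' with an indirection table holding entries 0 and 1
reachable = {target_queue(2, [0, 1], h) for h in range(16)}
```

With these inputs the reachable set contains only queues 2 and 3, which is
what the test checks for.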

Signed-off-by: Edward Cree <[email protected]>
Link: https://patch.msgid.link/114afdf4d2867f72ed27751e8e08fe8b128a8529.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago selftest: include dst-ip in ethtool ntuple rules
Edward Cree [Wed, 13 Nov 2024 12:13:11 +0000 (12:13 +0000)]
selftest: include dst-ip in ethtool ntuple rules

sfc hardware does not support filters with only ipproto + dst-port;
 adding dst-ip to the flow spec allows the rss_ctx test to be run on
 these devices.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Martin Habets <[email protected]>
Link: https://patch.msgid.link/8e5d23c8f21310c23c080cc7bcd31b76f8fd3096.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago net: ethtool: account for RSS+RXNFC add semantics when checking channel count
Edward Cree [Wed, 13 Nov 2024 12:13:10 +0000 (12:13 +0000)]
net: ethtool: account for RSS+RXNFC add semantics when checking channel count

In ethtool_check_max_channel(), the new RX count must not only cover the
 max queue indices in RSS indirection tables and RXNFC destinations
 separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum
 of the destination queue and the maximum index in the associated RSS
 context's indirection table, since that is the highest queue that the
 rule can actually deliver traffic to.
It could be argued that the max queue across all custom RSS contexts
 (ethtool_get_max_rss_ctx_channel()) need no longer be considered, since
 any context to which packets can actually be delivered will be targeted
 by some RXNFC rule and its max will thus be allowed for by
 ethtool_get_max_rxnfc_channel().  For simplicity we keep both checks, so
 even RSS contexts unused by any RXNFC rule must fit the channel count.
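
The rule can be sketched as follows. The data layout (a dict of indirection
tables keyed by context ID, and rules as dicts with "ring_cookie" and an
optional "rss_ctx") and both function names are assumptions made for the
example; this is not the ethtool core code.

```python
def max_rx_queue_in_use(rss_tables, rxnfc_rules):
    """Highest RX queue index that any table or rule can deliver traffic to."""
    highest = -1
    # Every RSS context must fit, even one that no RXNFC rule references.
    for table in rss_tables.values():
        highest = max(highest, max(table, default=-1))
    for rule in rxnfc_rules:
        dest = rule["ring_cookie"]
        ctx = rule.get("rss_ctx")
        if ctx is not None:
            # FLOW_RSS rule: base queue plus the largest table entry.
            dest += max(rss_tables[ctx], default=0)
        highest = max(highest, dest)
    return highest


def channel_count_ok(new_rx_count, rss_tables, rxnfc_rules):
    """True if the proposed RX channel count still covers every queue in use."""
    return new_rx_count > max_rx_queue_in_use(rss_tables, rxnfc_rules)
```

For a context with table [0, 1] and a rule with ring_cookie 2, the highest
queue in use is 3, so shrinking to 3 RX channels would be refused while 4
remains acceptable.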

Signed-off-by: Edward Cree <[email protected]>
Link: https://patch.msgid.link/43257d375434bef388e36181492aa4c458b88336.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in
Edward Cree [Wed, 13 Nov 2024 12:13:09 +0000 (12:13 +0000)]
net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in

Ethtool ntuple filters with FLOW_RSS were originally defined as adding
 the base queue ID (ring_cookie) to the value from the indirection table,
 so that the same table could distribute over more than one set of queues
 when used by different filters.
However, some drivers / hardware ignore the ring_cookie, and simply use
 the indirection table entries as queue IDs directly.  Thus, for drivers
 which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to
 declare that they support the original (addition) semantics, reject in
 ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring.
(For a ring_cookie of zero, both behaviours are equivalent.)
Set the cap bit in sfc, as it is known to support this feature.
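
The acceptance rule reduces to a single condition. A minimal sketch, with a
hypothetical function name and boolean inputs standing in for the driver
capability bit and the filter fields:

```python
def rxnfc_filter_allowed(cap_rss_rxnfc_adds, uses_flow_rss, ring_cookie):
    """Model of the check: reject FLOW_RSS + nonzero ring unless the driver opted in."""
    if uses_flow_rss and ring_cookie != 0 and not cap_rss_rxnfc_adds:
        return False
    return True
```

A ring_cookie of zero always passes, since adding zero gives the same queue
under either interpretation.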

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Martin Habets <[email protected]>
Link: https://patch.msgid.link/cc3da0844083b0e301a33092a6299e4042b65221.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago dt-bindings: net: sff,sfp: Fix "interrupts" property typo
Rob Herring (Arm) [Wed, 13 Nov 2024 22:58:25 +0000 (16:58 -0600)]
dt-bindings: net: sff,sfp: Fix "interrupts" property typo

The example has "interrupt" property which is not a defined property. It
should be "interrupts" instead. "interrupts" also should not contain a
phandle.

Signed-off-by: Rob Herring (Arm) <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago dt-bindings: net: mdio-mux-gpio: Drop undocumented "marvell,reg-init"
Rob Herring (Arm) [Wed, 13 Nov 2024 22:57:13 +0000 (16:57 -0600)]
dt-bindings: net: mdio-mux-gpio: Drop undocumented "marvell,reg-init"

"marvell,reg-init" is not yet documented by schema. It's irrelevant to
the example, so just drop it.

Signed-off-by: Rob Herring (Arm) <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago net: sparx5: add missing lan969x Kconfig dependency
Arnd Bergmann [Wed, 13 Nov 2024 11:55:08 +0000 (12:55 +0100)]
net: sparx5: add missing lan969x Kconfig dependency

The sparx5 switchdev driver can be built either with or without support
for the Lan969x switch. However, it cannot be built-in when the lan969x
driver is a loadable module because of a link-time dependency:

arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/sparx5/sparx5_main.o:(.rodata+0xd44): undefined reference to `lan969x_desc'

Add a Kconfig dependency to reflect this in Kconfig, allowing all
the valid configurations but forcing sparx5 to be a loadable module
as well if lan969x is.

Fixes: 98a01119608d ("net: sparx5: add compatible string for lan969x")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Daniel Machon <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago net: enetc: clean up before returning in probe()
Dan Carpenter [Wed, 13 Nov 2024 07:31:25 +0000 (10:31 +0300)]
net: enetc: clean up before returning in probe()

We recently added this error path.  We need to call enetc_pci_remove()
before returning.  It cleans up the resources from enetc_pci_probe().

Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Wei Fang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago mdio: Remove mdio45_ethtool_gset_npage()
Alistair Francis [Tue, 12 Nov 2024 10:54:30 +0000 (20:54 +1000)]
mdio: Remove mdio45_ethtool_gset_npage()

The mdio45_ethtool_gset_npage() function isn't called, so let's remove
it.

Signed-off-by: Alistair Francis <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago include: mdio: Remove mdio45_ethtool_gset()
Alistair Francis [Tue, 12 Nov 2024 10:54:29 +0000 (20:54 +1000)]
include: mdio: Remove mdio45_ethtool_gset()

mdio45_ethtool_gset() is never called, so let's remove it.

Signed-off-by: Alistair Francis <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Fri, 15 Nov 2024 03:08:04 +0000 (19:08 -0800)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2024-11-14

We've added 9 non-merge commits during the last 4 day(s) which contain
a total of 3 files changed, 226 insertions(+), 84 deletions(-).

The main changes are:

1) Fixes to bpf_msg_push/pop_data and test_sockmap. The changes have a
   dependency on the other changes in the bpf-next/net branch,
   from Zijian Zhang.

2) Drop netns code from the mptcp test. Reuse the common helpers in
   test_progs, from Geliang Tang.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  bpf, sockmap: Fix sk_msg_reset_curr
  bpf, sockmap: Several fixes to bpf_msg_pop_data
  bpf, sockmap: Several fixes to bpf_msg_push_data
  selftests/bpf: Add more tests for test_txmsg_push_pop in test_sockmap
  selftests/bpf: Add push/pop checking for msg_verify_data in test_sockmap
  selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap
  selftests/bpf: Fix SENDPAGE data logic in test_sockmap
  selftests/bpf: Add txmsg_pass to pull/push/pop in test_sockmap
  selftests/bpf: Drop netns helpers in mptcp
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Merge branch 'ipv4-prepare-bpf-helpers-to-flowi4_tos-conversion'
Jakub Kicinski [Fri, 15 Nov 2024 03:07:50 +0000 (19:07 -0800)]
Merge branch 'ipv4-prepare-bpf-helpers-to-flowi4_tos-conversion'

Guillaume Nault says:

====================
ipv4: Prepare bpf helpers to .flowi4_tos conversion.

Continue the process of making a dscp_t variable available when setting
.flowi4_tos. This series focuses on the BPF helpers that initialise a
struct flowi4 manually.

The objective is to eventually convert .flowi4_tos to dscp_t (to get
type annotation and prevent ECN bits from interfering with DSCP).
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago bpf: lwtunnel: Prepare bpf_lwt_xmit_reroute() to future .flowi4_tos conversion.
Guillaume Nault [Fri, 8 Nov 2024 16:47:15 +0000 (17:47 +0100)]
bpf: lwtunnel: Prepare bpf_lwt_xmit_reroute() to future .flowi4_tos conversion.

Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the
dscp_t value to __u8 with inet_dscp_to_dsfield().

Then, when we convert .flowi4_tos to dscp_t, we'll just have to drop
the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Link: https://patch.msgid.link/8338a12377c44f698a651d1ce357dd92bdf18120.1731064982.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago bpf: ipv4: Prepare __bpf_redirect_neigh_v4() to future .flowi4_tos conversion.
Guillaume Nault [Fri, 8 Nov 2024 16:47:12 +0000 (17:47 +0100)]
bpf: ipv4: Prepare __bpf_redirect_neigh_v4() to future .flowi4_tos conversion.

Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the
dscp_t value to __u8 with inet_dscp_to_dsfield().

Then, when we convert .flowi4_tos to dscp_t, we'll just have to drop
the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Link: https://patch.msgid.link/35eacc8955003e434afb1365d404193cc98a9579.1731064982.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Merge branch 'tools-net-ynl-rework-async-notification-handling'
Jakub Kicinski [Fri, 15 Nov 2024 02:09:08 +0000 (18:09 -0800)]
Merge branch 'tools-net-ynl-rework-async-notification-handling'

Donald Hunter says:

====================
tools/net/ynl: rework async notification handling

Revert patch 1bf70e6c3a53 which modified check_ntf() and instead add a
new poll_ntf() with async notification semantics. See patch 2 for a
detailed description.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago tools/net/ynl: add async notification handling
Donald Hunter [Wed, 13 Nov 2024 09:08:43 +0000 (09:08 +0000)]
tools/net/ynl: add async notification handling

The notification handling in ynl is currently very simple, using sleep()
to wait a period of time and then handling all the buffered messages in
a single batch.

This patch adds async notification handling so that messages can be
processed as they are received. This makes it possible to use ynl as a
library that supplies notifications in a timely manner.

- Add poll_ntf() to be a generator that yields 1 notification at a
  time and blocks until a notification is available.
- Add a --duration parameter to the CLI, with --sleep as an alias.

./tools/net/ynl/cli.py \
    --spec <SPEC> --subscribe <TOPIC> [ --duration <SECS> ]

The cli will report any notifications for duration seconds and then
exit. If duration is not specified, then it will poll forever, until
interrupted.

Here is an example python snippet that shows how to use ynl as a library
for receiving notifications:

    ynl = YnlFamily(f"{dir}/rt_route.yaml")
    ynl.ntf_subscribe('rtnlgrp-ipv4-route')

    for event in ynl.poll_ntf():
        handle(event)
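
For readers unfamiliar with the pattern, a blocking generator with an
optional duration might look roughly like the sketch below. It is not the
ynl implementation: poll_messages(), its parameters and the use of a plain
datagram socket are all assumptions chosen to keep the example
self-contained.

```python
import select
import socket
import time


def poll_messages(sock, duration=None, interval=0.1):
    """Yield datagrams from sock as they arrive.

    Stops after `duration` seconds, or runs until interrupted if it is None.
    """
    deadline = None if duration is None else time.monotonic() + duration
    while True:
        timeout = interval
        if deadline is not None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return
            timeout = min(interval, remaining)
        # Block until the socket is readable or the timeout expires.
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            yield sock.recv(4096)
```

A caller iterates over it the same way as in the snippet above, handling
each message as soon as it is received rather than after a fixed sleep.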

Signed-off-by: Donald Hunter <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Revert "tools/net/ynl: improve async notification handling"
Donald Hunter [Wed, 13 Nov 2024 09:08:42 +0000 (09:08 +0000)]
Revert "tools/net/ynl: improve async notification handling"

This reverts commit 1bf70e6c3a5346966c25e0a1ff492945b25d3f80.

This modification to check_ntf() is being reverted so that its behaviour
remains equivalent to ynl_ntf_check() in the C YNL. Instead a new
poll_ntf() will be added in a separate patch.

Signed-off-by: Donald Hunter <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Merge branch 'net-phy-switch-eee_broken_modes-to-linkmode-bitmap-and-add-accessor'
Jakub Kicinski [Fri, 15 Nov 2024 02:01:40 +0000 (18:01 -0800)]
Merge branch 'net-phy-switch-eee_broken_modes-to-linkmode-bitmap-and-add-accessor'

Heiner Kallweit says:

====================
net: phy: switch eee_broken_modes to linkmode bitmap and add accessor

eee_broken_modes currently has an eee_cap1 register layout. This doesn't
allow flagging e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome
this limitation switch eee_broken_modes to a linkmode bitmap.
Add an accessor for the bitmap and use it in r8169.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago r8169: copy vendor driver 2.5G/5G EEE advertisement constraints
Heiner Kallweit [Fri, 8 Nov 2024 07:08:24 +0000 (08:08 +0100)]
r8169: copy vendor driver 2.5G/5G EEE advertisement constraints

Vendor driver r8125 doesn't advertise 2.5G EEE on RTL8125A, and r8126
doesn't advertise 5G EEE. Likely there are compatibility issues,
therefore do the same in r8169.
With this change we don't have to disable 2.5G EEE advertisement in
rtl8125a_config_eee_phy() any longer.
We use new phylib accessor phy_set_eee_broken() to mark the respective
EEE modes as broken.

Signed-off-by: Heiner Kallweit <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago net: phy: add phy_set_eee_broken
Heiner Kallweit [Fri, 8 Nov 2024 07:07:10 +0000 (08:07 +0100)]
net: phy: add phy_set_eee_broken

Add an accessor for eee_broken_modes, so that drivers
don't have to deal with phylib internals.

Signed-off-by: Heiner Kallweit <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago net: phy: convert eee_broken_modes to a linkmode bitmap
Heiner Kallweit [Fri, 8 Nov 2024 06:54:47 +0000 (07:54 +0100)]
net: phy: convert eee_broken_modes to a linkmode bitmap

eee_broken_modes currently has an eee_cap1 register layout. This doesn't
allow flagging e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome
this limitation switch eee_broken_modes to a linkmode bitmap.

Signed-off-by: Heiner Kallweit <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 14 Nov 2024 19:27:36 +0000 (11:27 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.12-rc8).

Conflicts:

tools/testing/selftests/net/.gitignore
  252e01e68241 ("selftests: net: add netlink-dumps to .gitignore")
  be43a6b23829 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw")
https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/

drivers/net/phy/phylink.c
  671154f174e0 ("net: phylink: ensure PHY momentary link-fails are handled")
  7530ea26c810 ("net: phylink: remove "using_mac_select_pcs"")

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
  5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
  e96321fad3ad ("net: ethernet: Switch back to struct platform_driver::remove()")

Signed-off-by: Jakub Kicinski <[email protected]>
4 months ago Merge tag 'net-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 14 Nov 2024 18:05:33 +0000 (10:05 -0800)]
Merge tag 'net-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth.

  Quite calm week. No new regression under investigation.

  Current release - regressions:

   - eth: revert "igb: Disable threaded IRQ for igb_msix_other"

  Current release - new code bugs:

   - bluetooth: btintel: direct exception event to bluetooth stack

  Previous releases - regressions:

   - core: fix data-races around sk->sk_forward_alloc

   - netlink: terminate outstanding dump on socket close

   - mptcp: error out earlier on disconnect

   - vsock: fix accept_queue memory leak

   - phylink: ensure PHY momentary link-fails are handled

   - eth: mlx5:
      - fix null-ptr-deref in add rule err flow
      - lock FTE when checking if active

   - eth: dwmac-mediatek: fix inverted handling of mediatek,mac-wol

  Previous releases - always broken:

   - sched: fix u32's systematic failure to free IDR entries for hnodes.

   - sctp: fix possible UAF in sctp_v6_available()

   - eth: bonding: add ns target multicast address to slave device

   - eth: mlx5: fix msix vectors to respect platform limit

   - eth: icssg-prueth: fix 1 PPS sync"

* tag 'net-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
  net: sched: u32: Add test case for systematic hnode IDR leaks
  selftests: bonding: add ns multicast group testing
  bonding: add ns target multicast address to slave device
  net: ti: icssg-prueth: Fix 1 PPS sync
  stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines
  net: Make copy_safe_from_sockptr() match documentation
  net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol
  ipmr: Fix access to mfc_cache_list without lock held
  samples: pktgen: correct dev to DEV
  net: phylink: ensure PHY momentary link-fails are handled
  mptcp: pm: use _rcu variant under rcu_read_lock
  mptcp: hold pm lock when deleting entry
  mptcp: update local address flags when setting it
  net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
  MAINTAINERS: Re-add cancelled Renesas driver sections
  Revert "igb: Disable threaded IRQ for igb_msix_other"
  Bluetooth: btintel: Direct exception event to bluetooth stack
  Bluetooth: hci_core: Fix calling mgmt_device_connected
  virtio/vsock: Improve MSG_ZEROCOPY error handling
  vsock: Fix sk_error_queue memory leak
  ...

4 months ago Merge tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefs
Linus Torvalds [Thu, 14 Nov 2024 18:00:23 +0000 (10:00 -0800)]
Merge tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "This fixes one minor regression from the btree cache fixes (in the
  scan_for_btree_nodes repair path) - and the shutdown path fix is the
  big one here, in terms of bugs closed:

   - Assorted tiny syzbot fixes

   - Shutdown path fix: "bch2_btree_write_buffer_flush_going_ro()"

     The shutdown path wasn't flushing the btree write buffer, leading
     to shutting down while we still had operations in flight. This
     fixes a whole slew of syzbot bugs, and undoubtedly other strange
     heisenbugs."

* tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefs:
  bcachefs: Fix assertion pop in bch2_ptr_swab()
  bcachefs: Fix journal_entry_dev_usage_to_text() overrun
  bcachefs: Allow for unknown key types in backpointers fsck
  bcachefs: Fix assertion pop in topology repair
  bcachefs: Fix hidden btree errors when reading roots
  bcachefs: Fix validate_bset() repair path
  bcachefs: Fix missing validation for bch_backpointer.level
  bcachefs: Fix bch_member.btree_bitmap_shift validation
  bcachefs: bch2_btree_write_buffer_flush_going_ro()

4 months ago eth: fbnic: Add support to dump registers
Mohsin Bashir [Tue, 12 Nov 2024 22:26:05 +0000 (14:26 -0800)]
eth: fbnic: Add support to dump registers

Add support for the 'ethtool -d <dev>' command to retrieve and print
a register dump for fbnic. The dump defaults to version 1 and consists
of two parts: all the register sections that can be dumped linearly, and
an RPC RAM section that is structured in an interleaved fashion and
requires special handling. For each register section, the dump also
contains the start and end boundary information which can simplify parsing.

Signed-off-by: Mohsin Bashir <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
4 months ago netfilter: nf_tables: allocate element update information dynamically
Florian Westphal [Wed, 13 Nov 2024 15:35:53 +0000 (16:35 +0100)]
netfilter: nf_tables: allocate element update information dynamically

Move the timeout/expire/flag members from the nft_trans_one_elem struct into
a dynamically allocated structure, only needed when a timeout update was
requested.

This halves the size of the nft_trans_one_elem struct and allows compacting
up to 124 elements in one transaction container rather than 62.

This halves memory requirements for a large flush or insert transaction,
where ->update remains NULL.

Care has to be taken to release the extra data in all spots, including
abort path.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: nf_tables: switch trans_elem to real flex array
Florian Westphal [Wed, 13 Nov 2024 15:35:52 +0000 (16:35 +0100)]
netfilter: nf_tables: switch trans_elem to real flex array

When queueing a set element add or removal operation to the transaction
log, check if the previous operation already asks for the identical
operation on the same set.

If so, store the element reference in the preceding operation.
This significantly reduces memory consumption when many set add/delete
operations appear in a single transaction.
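
The coalescing step can be pictured with a small model. This is only a
sketch: the transaction log is a Python list of dicts, queue_elem_op() is a
hypothetical name, and the cap of 62 elements is taken from the figures
quoted in this series.

```python
def queue_elem_op(log, op, set_name, elem, max_elems=62):
    """Append an element operation, reusing the previous entry when it matches."""
    if log:
        last = log[-1]
        same_request = last["op"] == op and last["set"] == set_name
        if same_request and len(last["elems"]) < max_elems:
            # Identical operation on the same set: store the reference there.
            last["elems"].append(elem)
            return
    log.append({"op": op, "set": set_name, "elems": [elem]})
```

Only the immediately preceding entry is considered, so interleaving
operations on different sets would defeat the compaction in this model.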

Example: 10k elements require 937 kbytes of memory (10k allocations from
the kmalloc-96 slab).

Assuming we can compact 4 elements in the same set, 468 kbytes
are needed (64 bytes for the base struct, nft_trans_elem, plus 32 bytes
per nft_trans_one_elem structure, so 2500 allocations from the kmalloc-192
slab).

For large batch updates we can compact up to 62 elements
into one single nft_trans_elem structure (~65% mem reduction):
64 bytes for the base struct, nft_trans_elem, plus 32 bytes for each
nft_trans_one_elem struct.

We can halve the size of the nft_trans_one_elem struct by moving
timeout/expire/update_flags into a dynamically allocated structure,
which allows storing 124 elements in a 2k slab nft_trans_elem struct.
This is done in a followup patch.
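
The figures above can be reproduced with some quick arithmetic. The sketch
below uses the sizes quoted in this message (64-byte base, 32 or 16 bytes
per element, 2048-byte cap, 96 bytes per uncompacted transaction) and makes
the simplifying assumption that every container is allocated at its full
size, including the last, partially filled one.

```python
import math

BASE = 64         # bytes for the nft_trans_elem base struct (from the text)
SLAB_CAP = 2048   # reallocation cap in bytes (from the text)
UNCOMPACTED = 96  # bytes per element transaction before this series


def max_elems(per_elem):
    """Elements that fit in one container under the 2048-byte cap."""
    return (SLAB_CAP - BASE) // per_elem


def batch_bytes(n_elems, per_elem, elems_per_container):
    """Total bytes for n_elems, assuming full-size containers throughout."""
    containers = math.ceil(n_elems / elems_per_container)
    return containers * (BASE + elems_per_container * per_elem)


before = 10_000 * UNCOMPACTED
after_62 = batch_bytes(10_000, 32, max_elems(32))
after_124 = batch_bytes(10_000, 16, max_elems(16))
reduction_62 = 1 - after_62 / before
```

Under these assumptions 10k elements drop from 960000 bytes to 331776 bytes
with 62 elements per container (about 65% less), and to 165888 bytes with
124 elements per container.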

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months ago netfilter: nf_tables: prepare nft audit for set element compaction
Florian Westphal [Wed, 13 Nov 2024 15:35:51 +0000 (16:35 +0100)]
netfilter: nf_tables: prepare nft audit for set element compaction

nftables audit log format emits the number of added/deleted rules, sets,
set elements and so on, to userspace:

    table=t1 family=2 entries=4 op=nft_register_set
                      ~~~~~~~~~

At this time, the 'entries' key is the number of transactions that will
be applied.

The upcoming set element compression will coalesce subsequent
adds/deletes to the same set in the same transaction
request to conserve memory.

Without this patch, we'd under-report the number of altered elements.

Increment the audit counter by the number of elements to keep the reported
entries value the same.

Without this, the nft_audit.sh selftest fails because the recorded
entries value is smaller than the expected one.
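
The counting change can be pictured as follows. This is a toy model, not the kernel code: the `TransElem` class and the `audit_entries()` helper are hypothetical names used only to show why counting containers under-reports once elements are coalesced.

```python
from dataclasses import dataclass


@dataclass
class TransElem:
    set_name: str
    op: str
    nelems: int   # elements coalesced into this transaction container


def audit_entries(transactions):
    """Count altered elements, not transaction containers."""
    return sum(t.nelems for t in transactions)


# Four element operations, compacted into two containers.
log = [TransElem("s1", "add", 3), TransElem("s1", "del", 1)]
print("containers:", len(log), "entries:", audit_entries(log))
```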

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months agonetfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure
Florian Westphal [Wed, 13 Nov 2024 15:35:50 +0000 (16:35 +0100)]
netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure

Add helpers to release the individual elements contained in the
trans_elem container structure.

No functional change intended.

Followup patch will add 'nelems' member and will turn 'priv' into
a flexible array.

These helpers can then loop over all elements.
Care needs to be taken to handle a mix of new elements and existing
elements that are being updated (e.g. timeout refresh).

Before this patch, NEWSETELEM transaction with update is released
early so nft_trans_set_elem_destroy() won't get called, so we need
to skip elements marked as update.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months agonetfilter: nf_tables: add nft_trans_commit_list_add_elem helper
Florian Westphal [Wed, 13 Nov 2024 15:35:49 +0000 (16:35 +0100)]
netfilter: nf_tables: add nft_trans_commit_list_add_elem helper

Add and use a wrapper to append trans_elem structures to the
transaction log.

Unlike the existing helper, pass a gfp_t to indicate if sleeping
is allowed.

This will be used by a followup patch to realloc nft_trans_elem
structures after they gain a flexible array member to reduce
number of such container structures on the transaction list.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months agonetfilter: bpf: Pass string literal as format argument of request_module()
Simon Horman [Mon, 11 Nov 2024 14:47:51 +0000 (14:47 +0000)]
netfilter: bpf: Pass string literal as format argument of request_module()

Both gcc-14 and clang-18 report that passing a non-string literal as the
format argument of request_module() is potentially insecure.

E.g. clang-18 says:

.../nf_bpf_link.c:46:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
   46 |                 err = request_module(mod);
      |                                      ^~~
.../kmod.h:25:55: note: expanded from macro 'request_module'
   25 | #define request_module(mod...) __request_module(true, mod)
      |                                                       ^~~
.../nf_bpf_link.c:46:24: note: treat the string as an argument to avoid this
   46 |                 err = request_module(mod);
      |                                      ^
      |                                      "%s",
.../kmod.h:25:55: note: expanded from macro 'request_module'
   25 | #define request_module(mod...) __request_module(true, mod)
      |                                                       ^

The contents of mod are always safe to pass as the format argument.
That is, in my understanding, they never contain any format escape
sequences.

But, it seems better to be safe than sorry. And, as a bonus, compiler
output becomes less verbose by addressing this issue as suggested by
clang-18.
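
A rough analogy in Python, using the printf-style `%` operator, shows why the fixed "%s" format is the safer habit. This is only an analogy to the C warning; the function names are invented for the example.

```python
def format_unsafe(mod):
    # Analogue of request_module(mod): the data itself is used as the format.
    return mod % ()


def format_safe(mod):
    # Analogue of request_module("%s", mod): fixed format, data as argument.
    return "%s" % (mod,)


print(format_safe("nf-%d"))        # the text is passed through untouched
print(format_unsafe("nf-plain"))   # works only because there is no '%' in it
```

With a name that happens to contain a conversion such as "%d", `format_unsafe()` raises TypeError, whereas `format_safe()` keeps treating it as plain data.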

Compile tested only.

Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months agonetfilter: nfnetlink: Report extack policy errors for batched ops
Donald Hunter [Fri, 1 Nov 2024 14:32:07 +0000 (14:32 +0000)]
netfilter: nfnetlink: Report extack policy errors for batched ops

The nftables batch processing does not currently populate extack with
policy errors. Fix this by passing extack when parsing batch messages.

Signed-off-by: Donald Hunter <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
4 months agonet: sched: u32: Add test case for systematic hnode IDR leaks
Alexandre Ferrieux [Wed, 13 Nov 2024 10:04:28 +0000 (11:04 +0100)]
net: sched: u32: Add test case for systematic hnode IDR leaks

Add a tdc test case to exercise the just-fixed systematic leak of
IDR entries in u32 hnode disposal. Given the IDR in question is
confined to the range [1..0x7FF], it is sufficient to create/delete
the same filter 2048 times to fill it up and get a nonzero exit
status from "tc filter add".
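
The reasoning behind the 2048 iterations can be modelled in a few lines. The `LeakyIdr` class below is a hypothetical stand-in for the kernel IDR, kept only to show that a range of [1..0x7FF] holds 2047 IDs, so a leak of one ID per create/delete cycle makes the 2048th allocation fail.

```python
class LeakyIdr:
    """Toy ID allocator over [lo..hi]; optionally forgets to free IDs."""

    def __init__(self, lo=1, hi=0x7FF, leak=True):
        self.free = set(range(lo, hi + 1))
        self.leak = leak

    def alloc(self):
        # Returns None when the range is exhausted.
        return self.free.pop() if self.free else None

    def release(self, ident):
        if not self.leak:
            self.free.add(ident)


def first_failure(idr, rounds=2048):
    """Create/delete `rounds` times; return the round that fails, if any."""
    for i in range(1, rounds + 1):
        ident = idr.alloc()
        if ident is None:
            return i
        idr.release(ident)
    return None


print("leaky allocator fails at round", first_failure(LeakyIdr(leak=True)))
print("fixed allocator fails at round", first_failure(LeakyIdr(leak=False)))
```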

Signed-off-by: Alexandre Ferrieux <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Victor Nogueira <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
4 months agoMerge branch 'bonding-fix-ns-targets-not-work-on-hardware-nic'
Paolo Abeni [Thu, 14 Nov 2024 10:16:30 +0000 (11:16 +0100)]
Merge branch 'bonding-fix-ns-targets-not-work-on-hardware-nic'

Hangbin Liu says:

====================
bonding: fix ns targets not work on hardware NIC

The first patch fixes ns targets not working on hardware NICs when
bonding sets arp_validate.

The second patch adds a related selftest for bonding.

v4: Thanks Nikolay for the comments:
    use bond_slave_ns_maddrs_{add/del} with clear name
    fix comments typos
    remove _slave_set_ns_maddrs underscore directly
    update bond_option_arp_validate_set() change logic
v3: use ndisc_mc_map to convert the mcast mac address (Jay Vosburgh)
v2: only add/del mcast group on backup slaves when arp_validate is set (Jay Vosburgh)
    arp_validate doesn't support 3ad, tlb, alb. So let's only do it on ab mode.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
4 months agoselftests: bonding: add ns multicast group testing
Hangbin Liu [Mon, 11 Nov 2024 10:16:50 +0000 (10:16 +0000)]
selftests: bonding: add ns multicast group testing

Add a test to make sure the backup slaves join the correct multicast group
when arp_validate is enabled and ns_ip6_target is set. Here is the result:

TEST: arp_validate (active-backup ns_ip6_target arp_validate 0)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 1)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 2)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 3)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 4)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 5)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 6)     [ OK ]
TEST: arp_validate (join mcast group)                               [ OK ]

Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
4 months agobonding: add ns target multicast address to slave device
Hangbin Liu [Mon, 11 Nov 2024 10:16:49 +0000 (10:16 +0000)]
bonding: add ns target multicast address to slave device

Commit 4598380f9c54 ("bonding: fix ns validation on backup slaves")
tried to resolve the issue where backup slaves couldn't be brought up when
receiving IPv6 Neighbor Solicitation (NS) messages. However, this fix only
worked for drivers that receive all multicast messages, such as the veth
interface.

For standard drivers, the NS multicast message is silently dropped because
the slave device is not a member of the NS target multicast group.

To address this, we need to make the slave device join the NS target
multicast group, ensuring it can receive these IPv6 NS messages to validate
the slave’s status properly.

There are three policies before joining the multicast group:
1. All settings must be under active-backup mode (alb and tlb do not support
   arp_validate), with backup slaves and slaves supporting multicast.
2. We can add or remove multicast groups when arp_validate changes.
3. Other operations, such as enslaving, releasing, or setting NS targets,
   need to be guarded by arp_validate.
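
For illustration, the sketch below assumes the "NS target multicast group" is the IPv6 solicited-node multicast address of the target (ff02::1:ff00:0/104 plus the low 24 bits of the address), and that it maps to an Ethernet address of the form 33:33 followed by the low 32 bits of the group. The function names are invented for the example and this is not the driver code.

```python
import ipaddress


def solicited_node_mcast(target):
    """Solicited-node multicast group for an IPv6 NS target."""
    addr = int(ipaddress.IPv6Address(target))
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | (addr & 0xFFFFFF))


def mcast_mac(group):
    """Ethernet multicast MAC for an IPv6 multicast group: 33:33 + low 32 bits."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


group = solicited_node_mcast("2001:db8::1:2345:6789")
print(group, mcast_mac(group))
```

A NIC that filters multicast in hardware only passes the NS up if this MAC has been added to the slave device, which is the gap this patch closes.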

Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets")
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
4 months agonet: ti: icssg-prueth: Fix 1 PPS sync
Meghana Malladi [Mon, 11 Nov 2024 09:58:42 +0000 (15:28 +0530)]
net: ti: icssg-prueth: Fix 1 PPS sync

The first PPS latch time needs to be calculated by the driver
(in rounded off seconds) and configured as the start time
offset for the cycle. After synchronizing two PTP clocks
running as master/slave, missing this would cause master
and slave to start immediately with some milliseconds
drift which causes the PPS signal to never synchronize with
the PTP master.

Fixes: 186734c15886 ("net: ti: icssg-prueth: add packet timestamping and ptp support")
Signed-off-by: Meghana Malladi <[email protected]>
Reviewed-by: Vadim Fedorenko <[email protected]>
Reviewed-by: MD Danish Anwar <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
4 months agoMerge branch 'net-dsa-microchip-add-lan9646-switch-support'
Jakub Kicinski [Thu, 14 Nov 2024 03:55:02 +0000 (19:55 -0800)]
Merge branch 'net-dsa-microchip-add-lan9646-switch-support'

Tristram Ha says:

====================
net: dsa: microchip: Add LAN9646 switch support

This series of patches is to add LAN9646 switch support to the KSZ DSA
driver.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: dsa: microchip: Add LAN9646 switch support to KSZ DSA driver
Tristram Ha [Sat, 9 Nov 2024 01:57:05 +0000 (17:57 -0800)]
net: dsa: microchip: Add LAN9646 switch support to KSZ DSA driver

LAN9646 switch is a 6-port switch with functions like KSZ9897.  It has
4 internal PHYs and 1 SGMII port.  The chip id read from hardware is the
same as KSZ9477, so the software driver needs to create a new chip id and
group allowable functions under its chip data structure to
differentiate the product.

Signed-off-by: Tristram Ha <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agodt-bindings: net: dsa: microchip: Add LAN9646 switch support
Tristram Ha [Sat, 9 Nov 2024 01:57:04 +0000 (17:57 -0800)]
dt-bindings: net: dsa: microchip: Add LAN9646 switch support

LAN9646 switch is a 6-port switch with functions like KSZ9897.  It has
4 internal PHYs and 1 SGMII port.

Signed-off-by: Tristram Ha <[email protected]>
Acked-by: Krzysztof Kozlowski <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agostmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines
Vitalii Mordan [Fri, 8 Nov 2024 17:33:34 +0000 (20:33 +0300)]
stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines

If the clock dwmac->tx_clk was not enabled in intel_eth_plat_probe,
it should not be disabled in any path.

Conversely, if it was enabled in intel_eth_plat_probe, it must be disabled
in all error paths to ensure proper cleanup.

Found by Linux Verification Center (linuxtesting.org) with Klever.

Fixes: 9efc9b2b04c7 ("net: stmmac: Add dwmac-intel-plat for GBE driver")
Signed-off-by: Vitalii Mordan <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: Make copy_safe_from_sockptr() match documentation
Michal Luczaj [Sun, 10 Nov 2024 23:17:34 +0000 (00:17 +0100)]
net: Make copy_safe_from_sockptr() match documentation

copy_safe_from_sockptr()
  return copy_from_sockptr()
    return copy_from_sockptr_offset()
      return copy_from_user()

copy_from_user() does not return an error on fault. Instead, it returns a
number of bytes that were not copied. Have it handled.

Patch has a side effect: it un-breaks garbage input handling of
nfc_llcp_setsockopt() and mISDN's data_sock_setsockopt().
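
The return-value mismatch can be modelled in user space. The two functions below are simulations with invented names, assuming the documented contract is "0 on success, -EFAULT on fault"; they are not the kernel helpers themselves.

```python
import errno


def copy_from_user_sim(dst, src, n, readable):
    """Model of copy_from_user(): returns the number of bytes NOT copied."""
    copied = min(n, readable)
    dst[:copied] = src[:copied]
    return n - copied


def copy_safe_sim(dst, src, n, readable):
    """Returns 0 on success, or -EFAULT if any byte could not be copied."""
    if copy_from_user_sim(dst, src, n, readable):
        return -errno.EFAULT
    return 0


buf = bytearray(4)
print(copy_safe_sim(buf, b"abcd", 4, readable=4), bytes(buf))
print(copy_safe_sim(bytearray(4), b"abcd", 4, readable=2))
```

Without the conversion, a caller testing for a negative error would treat the positive "2 bytes left" result as success.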

Fixes: 6309863b31dd ("net: add copy_safe_from_sockptr() helper")
Signed-off-by: Michal Luczaj <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol
Nícolas F. R. A. Prado [Sat, 9 Nov 2024 15:16:32 +0000 (10:16 -0500)]
net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol

The mediatek,mac-wol property is being handled backwards to what is
described in the binding: it currently enables PHY WOL when the property
is present and vice versa. Invert the driver logic so it matches the
binding description.

Fixes: fd1d62d80ebc ("net: stmmac: replace the use_phy_wol field with a flag")
Signed-off-by: Nícolas F. R. A. Prado <[email protected]>
Link: https://patch.msgid.link/20241109-mediatek-mac-wol-noninverted-v2-1-0e264e213878@collabora.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoipmr: Fix access to mfc_cache_list without lock held
Breno Leitao [Fri, 8 Nov 2024 14:08:36 +0000 (06:08 -0800)]
ipmr: Fix access to mfc_cache_list without lock held

Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the
following code flow, the RCU read lock is not held, causing the
following error when `RCU_PROVE` is enabled. The same problem might
show up in the IPv6 code path.

6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G            E    N
-----------------------------
net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!!

rcu_scheduler_active = 2, debug_locks = 1
   2 locks held by RetransmitAggre/3519:
    #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290
    #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90

stack backtrace:
    lockdep_rcu_suspicious
    mr_table_dump
    ipmr_rtm_dumproute
    rtnl_dump_all
    rtnl_dumpit
    netlink_dump
    __netlink_dump_start
    rtnetlink_rcv_msg
    netlink_rcv_skb
    netlink_unicast
    netlink_sendmsg

This is not a problem per se, since the RTNL lock is held here, so it
is safe to iterate over the list without the RCU read lock, as suggested
by Eric.

To alleviate the concern, modify the code to use
list_for_each_entry_rcu() with the RTNL-held argument.

The annotation will raise an error only if neither RTNL nor the RCU read
lock is held during iteration, signaling a legitimate problem; otherwise
it will avoid this false positive.

This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls
this function as well.
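
The effect of the extra argument can be summarised as a truth table. The helper below is a hypothetical model of the check, not the lockdep implementation.

```python
def traversal_is_suspicious(rcu_read_lock_held, rtnl_held):
    """Flag the list walk only when neither protection is in place."""
    return not (rcu_read_lock_held or rtnl_held)


for rcu in (False, True):
    for rtnl in (False, True):
        print(f"rcu={rcu!s:5} rtnl={rtnl!s:5} ->",
              "warn" if traversal_is_suspicious(rcu, rtnl) else "ok")
```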

Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agosamples: pktgen: correct dev to DEV
Wei Fang [Tue, 12 Nov 2024 03:03:47 +0000 (11:03 +0800)]
samples: pktgen: correct dev to DEV

In the pktgen_sample01_simple.sh script, the device variable is uppercase
'DEV' instead of lowercase 'dev'. Because of this typo, the script cannot
enable UDP tx checksum.

Fixes: 460a9aa23de6 ("samples: pktgen: add UDP tx checksum support")
Signed-off-by: Wei Fang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: phylink: ensure PHY momentary link-fails are handled
Russell King (Oracle) [Tue, 12 Nov 2024 16:20:00 +0000 (16:20 +0000)]
net: phylink: ensure PHY momentary link-fails are handled

Normally, phylib won't notify changes in quick succession. However, as
a result of commit 3e43b903da04 ("net: phy: Immediately call
adjust_link if only tx_lpi_enabled changes") this is no longer true -
it is now possible that phy_link_down() and phy_link_up() will both
complete before phylink's resolver has run, which means it'll miss that
pl->phy_state.link momentarily became false.

Rename "mac_link_dropped" to be more generic "link_failed" since it will
cover more than the MAC/PCS end of the link failing, and arrange to set
this in phylink_phy_change() if we notice that the PHY reports that the
link is down.

This will ensure that we capture an EEE reconfiguration event.

Fixes: 3e43b903da04 ("net: phy: Immediately call adjust_link if only tx_lpi_enabled changes")
Signed-off-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoMerge branch 'support-external-snapshots-on-dwmac1000'
Jakub Kicinski [Thu, 14 Nov 2024 02:52:16 +0000 (18:52 -0800)]
Merge branch 'support-external-snapshots-on-dwmac1000'

Maxime Chevallier says:

====================
Support external snapshots on dwmac1000

The main change since v3 is the move of the fifo flush wait in the
ptp_clock_info enable() function within the mutex that protects the ptp
registers. Thanks Jakub and Paolo for spotting this.

This series also aggregates Daniel's reviews, except for the patch 4
which was modified since then.

This series is another take on the previous work [1] done by
Alexis Lothoré, that fixes the support for external snapshots
timestamping in GMAC3-based devices.

Details on why this is needed are mentioned in the cover letter [2] from V1.

[1]: https://lore.kernel.org/netdev/20230616100409[email protected]/
[2]: https://lore.kernel.org/netdev/20241029115419.1160201[email protected]/

Link to V1: https://lore.kernel.org/netdev/20241029115419.1160201[email protected]/
Link to V2: https://lore.kernel.org/netdev/20241104170251.2202270[email protected]/
Link to V3: https://lore.kernel.org/netdev/20241106090331[email protected]/
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: dwmac_socfpga: This platform has GMAC
Maxime Chevallier [Tue, 12 Nov 2024 17:06:57 +0000 (18:06 +0100)]
net: stmmac: dwmac_socfpga: This platform has GMAC

Indicate that dwmac_socfpga has a gmac. This will make sure that
gmac-specific interrupt processing is done, including timestamp
interrupt handling. Without this, the external snapshot interrupt is
never ack'd and we have an interrupt storm on external snapshot event.

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Configure only the relevant bits for timestamping setup
Maxime Chevallier [Tue, 12 Nov 2024 17:06:56 +0000 (18:06 +0100)]
net: stmmac: Configure only the relevant bits for timestamping setup

The PTP_TCR (Timestamp Control Register) is used to configure several
features related to packet timestamping.

On one hand, it configures the 1588 packet processing, to indicate what
types of frames should be timestamped (all, only 1588v1 or 1588v2, using
L2 or L4 timestamping, on IPv4 or IPv6, etc.). This is usually configured
through the ioctl / ndo dedicated for such setup. This
configuration is done by setting some fields in that register, that seem
to behave the same way on all dwmac variants, including DWMAC1000.

On the other hand, and only on DWMAC1000 apparently, some fields in that
register are used to configure external snapshots (bits 24/25).
On DWMAC4 and others, these fields are reserved and external
snapshots are configured through a dedicated register that simply
doesn't seem to exist on DWMAC1000.

This configuration is done in the dwmac1000-specific ptp_clock_info ops
(cf dwmac1000_ptp_enable()).

So to avoid the timestamping configuration interfering with the external
snapshots, this commit makes sure that the config_hw_tstamping only
configures the relevant bits in PTP_TCR, so that the DWMAC1000
timestamping can correctly rely on these otherwise reserved fields.
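
The idea is a masked read-modify-write. In the sketch below, bits 24/25 come from this message; the `TSTAMP_CFG_MASK` value and the function name are assumptions chosen for illustration and do not reflect the real register layout.

```python
SNAPSHOT_BITS = (1 << 24) | (1 << 25)   # external snapshot config on DWMAC1000
TSTAMP_CFG_MASK = 0x0003FFFF            # hypothetical mask of timestamping fields


def config_hw_tstamping_sim(old_tcr, new_cfg):
    """Update only the timestamping fields, leaving all other bits as they were."""
    return (old_tcr & ~TSTAMP_CFG_MASK) | (new_cfg & TSTAMP_CFG_MASK)


old = SNAPSHOT_BITS | 0x1               # snapshots enabled, some old config
new = config_hw_tstamping_sim(old, 0x100)
print(hex(old), "->", hex(new))
```

Writing `new_cfg` to the register directly would instead clear bits 24/25 and silently disable external snapshots.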

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Don't include dwmac4 definitions in stmmac_ptp
Maxime Chevallier [Tue, 12 Nov 2024 17:06:55 +0000 (18:06 +0100)]
net: stmmac: Don't include dwmac4 definitions in stmmac_ptp

The stmmac_ptp code doesn't need the dwmac4 register definitions, remove
the inclusion.

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Enable timestamping interrupt on dwmac1000
Maxime Chevallier [Tue, 12 Nov 2024 17:06:54 +0000 (18:06 +0100)]
net: stmmac: Enable timestamping interrupt on dwmac1000

The default configuration for the interrupts on dwmac1000 has the
timestamping interrupt masked. Now that the timestamping has been
adapted to dwmac1000, enable the timestamping interrupt on these
platforms.

On dwmac1000, the external snapshot interrupt is configured through a
dedicated bit, that is set as reserved on other dwmac variants. The
timestamping interrupt is acknowledged by reading the
GMAC3_X_TIMESTAMP_STATUS register.

Make sure that this interrupt is enabled when snapshot is enabled, and
masked when disabled.

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Introduce dwmac1000 timestamping operations
Maxime Chevallier [Tue, 12 Nov 2024 17:06:53 +0000 (18:06 +0100)]
net: stmmac: Introduce dwmac1000 timestamping operations

In GMAC3_X, the timestamping configuration differs from GMAC4 in the
layout of the registers accessed to grab the number of snapshots in FIFO
as well as the register offset to grab the aux snapshot timestamp.

Introduce dedicated ops to configure timestamping on dwmac100 and
dwmac1000. The latency correction doesn't seem to exist on GMAC3, so its
corresponding operation isn't populated.

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Introduce dwmac1000 ptp_clock_info and operations
Maxime Chevallier [Tue, 12 Nov 2024 17:06:52 +0000 (18:06 +0100)]
net: stmmac: Introduce dwmac1000 ptp_clock_info and operations

The PTP configuration for GMAC3_X differs from the other implementations
in several ways:

 - There's only one external snapshot trigger
 - The snapshot configuration is done through the PTP_TCR register,
   whereas the other dwmac variants have a dedicated ACR (auxiliary
   control reg) for that purpose
 - The layout for the PTP_TCR register also differs, as bits 24/25 are
   used for the snapshot configuration. These bits are reserved on other
   variants.

On GMAC3_X, we also can't discover the number of snapshot triggers
automatically.

The GMAC3_X has one PPS output, however its configuration isn't
supported yet so report 0 n_per_out for now.

Introduce a dedicated set of ptp_clock_info ops and configuration
parameters to reflect these differences specific to GMAC3_X.

This was tested on dwmac_socfpga.

Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Only update the auto-discovered PTP clock features
Maxime Chevallier [Tue, 12 Nov 2024 17:06:51 +0000 (18:06 +0100)]
net: stmmac: Only update the auto-discovered PTP clock features

Some DWMAC variants such as dwmac1000 don't support discovering the
number of output pps and auxiliary snapshots. Allow these parameters to
be defined in default ptp_clock_info, and let them be updated only when
the feature discovery yielded a result.

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Use per-hw ptp clock ops
Maxime Chevallier [Tue, 12 Nov 2024 17:06:50 +0000 (18:06 +0100)]
net: stmmac: Use per-hw ptp clock ops

The auxiliary snapshot configuration was found to differ depending on
the dwmac version. To prepare supporting this, allow specifying the
ptp_clock_info ops in the hwif array.

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: stmmac: Don't modify the global ptp ops directly
Maxime Chevallier [Tue, 12 Nov 2024 17:06:49 +0000 (18:06 +0100)]
net: stmmac: Don't modify the global ptp ops directly

The stmmac_ptp_clock_ops are copied into the stmmac_priv structure
before being registered to the PTP core. Some adjustments are made prior
to that, such as the number of snapshots or max adjustment parameters.

Instead of modifying the global definition, then copying into the local
private data, let's first copy then modify the local parameters.
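
The ordering matters because the global definition is shared. A minimal sketch of "copy first, then modify", with a simplified `PtpClockInfo` class standing in for the kernel structure and an invented `setup_priv_ops()` helper:

```python
import copy
from dataclasses import dataclass


@dataclass
class PtpClockInfo:
    n_ext_ts: int = 0
    n_per_out: int = 0
    max_adj: int = 0


DEFAULT_OPS = PtpClockInfo()   # shared, global template


def setup_priv_ops(n_ext_ts, max_adj):
    """Copy the template into private data, then adjust the private copy."""
    ops = copy.copy(DEFAULT_OPS)
    ops.n_ext_ts = n_ext_ts
    ops.max_adj = max_adj
    return ops


priv_a = setup_priv_ops(n_ext_ts=1, max_adj=1000)
priv_b = setup_priv_ops(n_ext_ts=4, max_adj=5000)
print(DEFAULT_OPS, priv_a, priv_b, sep="\n")
```

Each device ends up with its own values and the template stays untouched, so one instance can no longer leak its adjustments into another.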

Reviewed-by: Daniel Machon <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoMerge branch 'mptcp-pm-a-few-more-fixes'
Jakub Kicinski [Thu, 14 Nov 2024 02:51:09 +0000 (18:51 -0800)]
Merge branch 'mptcp-pm-a-few-more-fixes'

Matthieu Baerts says:

====================
mptcp: pm: a few more fixes

Three small fixes related to the MPTCP path-manager:

- Patch 1: correctly reflect the backup flag to the corresponding local
  address entry of the userspace path-manager. A fix for v5.19.

- Patch 2: hold the PM lock when deleting an entry from the local
  addresses of the userspace path-manager to avoid messing up with this
  list. A fix for v5.19.

- Patch 3: use _rcu variant to iterate the in-kernel path-manager's
  local addresses list, when under rcu_read_lock(). A fix for v5.17.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agomptcp: pm: use _rcu variant under rcu_read_lock
Matthieu Baerts (NGI0) [Tue, 12 Nov 2024 19:18:35 +0000 (20:18 +0100)]
mptcp: pm: use _rcu variant under rcu_read_lock

In mptcp_pm_create_subflow_or_signal_addr(), rcu_read_(un)lock() are
used as expected to iterate over the list of local addresses, but
list_for_each_entry() was used instead of list_for_each_entry_rcu() in
__lookup_addr(). It is important to use this variant which adds the
required READ_ONCE() (and diagnostic checks if enabled).

Because __lookup_addr() is also used in mptcp_pm_nl_set_flags() where it
is called under the pernet->lock and not rcu_read_lock(), an extra
condition is then passed to help the diagnostic checks making sure
either the associated spin lock or the RCU lock is held.

Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk")
Cc: [email protected]
Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agomptcp: hold pm lock when deleting entry
Geliang Tang [Tue, 12 Nov 2024 19:18:34 +0000 (20:18 +0100)]
mptcp: hold pm lock when deleting entry

When traversing userspace_pm_local_addr_list and deleting an entry from
it in mptcp_pm_nl_remove_doit(), msk->pm.lock should be held.

This patch holds this lock before mptcp_userspace_pm_lookup_addr_by_id()
and releases it after list_move() in mptcp_pm_nl_remove_doit().

Fixes: d9a4594edabf ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE")
Cc: [email protected]
Signed-off-by: Geliang Tang <[email protected]>
Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agomptcp: update local address flags when setting it
Geliang Tang [Tue, 12 Nov 2024 19:18:33 +0000 (20:18 +0100)]
mptcp: update local address flags when setting it

Just like in-kernel pm, when userspace pm does set_flags, it needs to send
out MP_PRIO signal, and also modify the flags of the corresponding address
entry in the local address list. This patch implements the missing logic.

Traverse all address entries on userspace_pm_local_addr_list to find the
local address entry, if bkup is true, set the flags of this entry with
FLAG_BACKUP, otherwise, clear FLAG_BACKUP.

Fixes: 892f396c8e68 ("mptcp: netlink: issue MP_PRIO signals from userspace PMs")
Cc: [email protected]
Signed-off-by: Geliang Tang <[email protected]>
Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: phy: c45: don't use temporary linkmode bitmaps in genphy_c45_ethtool_get_eee
Heiner Kallweit [Tue, 12 Nov 2024 20:33:11 +0000 (21:33 +0100)]
net: phy: c45: don't use temporary linkmode bitmaps in genphy_c45_ethtool_get_eee

genphy_c45_eee_is_active() populates both bitmaps only if it returns
successfully. So we can avoid the overhead of the temporary bitmaps.

Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agonet: simplify eeecfg_mac_can_tx_lpi
Heiner Kallweit [Tue, 12 Nov 2024 20:36:29 +0000 (21:36 +0100)]
net: simplify eeecfg_mac_can_tx_lpi

Simplify the function.

Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoynl: samples: Fix the wrong format specifier
Luo Yifan [Wed, 13 Nov 2024 01:11:42 +0000 (09:11 +0800)]
ynl: samples: Fix the wrong format specifier

Make a minor change to eliminate a static checker warning. The type
of s->ifc is unsigned int, so the correct format specifier should be
%u instead of %d.

Signed-off-by: Luo Yifan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoMerge branch 'tools-ynl-two-patches-to-ease-building-with-rpmbuild'
Jakub Kicinski [Thu, 14 Nov 2024 02:43:47 +0000 (18:43 -0800)]
Merge branch 'tools-ynl-two-patches-to-ease-building-with-rpmbuild'

Jan Stancek says:

====================
tools: ynl: two patches to ease building with rpmbuild

I'm looking to build and package ynl for Fedora and CentOS Stream users.
rpmbuild has a couple of hardening options enabled by default [1][2],
which currently prevent ynl from building.

This series contains 2 small patches to address it.

[1] https://fedoraproject.org/wiki/Changes/Harden_All_Packages
[2] https://fedoraproject.org/wiki/Changes/PythonSafePath
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agotools: ynl: extend CFLAGS to keep options from environment
Jan Stancek [Tue, 12 Nov 2024 08:21:33 +0000 (09:21 +0100)]
tools: ynl: extend CFLAGS to keep options from environment

Package build environments like Fedora rpmbuild introduced hardening
options (e.g. -pie -Wl,-z,now) by passing a -spec option to CFLAGS
and LDFLAGS.

ynl Makefiles currently override CFLAGS but not LDFLAGS, which leads
to a mismatch and build failure:
        CC sample devlink
  /usr/bin/ld: devlink.o: relocation R_X86_64_32 against symbol `ynl_devlink_family' can not be used when making a PIE object; recompile with -fPIE
  /usr/bin/ld: failed to set dynamic section sizes: bad value
  collect2: error: ld returned 1 exit status

Extend CFLAGS to support hardening options set by build environment.

Signed-off-by: Jan Stancek <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Reviewed-by: Donald Hunter <[email protected]>
Link: https://patch.msgid.link/265b2d5d3a6d4721a161219f081058ed47dc846a.1731399562.git.jstancek@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agotools: ynl: add script dir to sys.path
Jan Stancek [Tue, 12 Nov 2024 08:21:32 +0000 (09:21 +0100)]
tools: ynl: add script dir to sys.path

Python options like PYTHONSAFEPATH or -P [1] do not add the script
directory to sys.path. ynl depends on this path to build and run.

[1] This option is default for Fedora rpmbuild since introduction of
    https://fedoraproject.org/wiki/Changes/PythonSafePath

Signed-off-by: Jan Stancek <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Reviewed-by: Donald Hunter <[email protected]>
Link: https://patch.msgid.link/b26537cdb6e1b24435b50b2ef81d71f31c630bc1.1731399562.git.jstancek@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoMerge tag 'wireless-next-2024-11-13' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Thu, 14 Nov 2024 02:35:18 +0000 (18:35 -0800)]
Merge tag 'wireless-next-2024-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.13

Most likely the last -next pull request for v6.13. Most changes are in
Realtek and Qualcomm drivers, otherwise not really anything
noteworthy.

Major changes:

mac80211
 * EHT 1024 aggregation size for transmissions

ath12k
 * switch to using wiphy_lock() and remove ar->conf_mutex
 * firmware coredump collection support
 * add debugfs support for a multitude of statistics

ath11k
 * dt: document WCN6855 hardware inputs

ath9k
 * remove include/linux/ath9k_platform.h

ath5k
 * Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support

rtw88:
 * 8821au and 8812au USB adapters support

rtw89
 * thermal protection
 * firmware secure boot for WiFi 6 chip

* tag 'wireless-next-2024-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (154 commits)
  Revert "wifi: iwlegacy: do not skip frames with bad FCS"
  wifi: mac80211: pass MBSSID config by reference
  wifi: mac80211: Support EHT 1024 aggregation size in TX
  net: rfkill: gpio: Add check for clk_enable()
  wifi: brcmfmac: Fix oops due to NULL pointer dereference in brcmf_sdiod_sglist_rw()
  wifi: Switch back to struct platform_driver::remove()
  wifi: ipw2x00: libipw_rx_any(): fix bad alignment
  wifi: brcmfmac: release 'root' node in all execution paths
  wifi: iwlwifi: mvm: don't call power_update_mac in fast suspend
  wifi: iwlwifi: s/IWL_MVM_INVALID_STA/IWL_INVALID_STA
  wifi: iwlwifi: bump minimum API version in BZ/SC to 92
  wifi: iwlwifi: move IWL_LMAC_*_INDEX to fw/api/context.h
  wifi: iwlwifi: be less noisy if the NIC is dead in S3
  wifi: iwlwifi: mvm: tell iwlmei when we finished suspending
  wifi: iwlwifi: allow fast resume on ax200
  wifi: iwlwifi: mvm: support new initiator and responder command version
  wifi: iwlwifi: mvm: use wiphy locked debugfs for low-latency
  wifi: iwlwifi: mvm: MLO scan upon channel condition degradation
  wifi: iwlwifi: mvm: support new versions of the wowlan APIs
  wifi: iwlwifi: mvm: allow always calling iwl_mvm_get_bss_vif()
  ...
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
4 months agoMerge tag 'pm-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Wed, 13 Nov 2024 21:32:51 +0000 (13:32 -0800)]
Merge tag 'pm-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a locking issue in the asymmetric CPU capacity setup code in the
  intel_pstate driver that may lead to a deadlock if CPU online/offline
  runs in parallel with the code in question, which is unlikely but not
  impossible (Rafael Wysocki)"

* tag 'pm-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Rearrange locking in hybrid_init_cpu_capacity_scaling()

4 months agoMerge tag 'tpmdd-next-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 13 Nov 2024 21:28:58 +0000 (13:28 -0800)]
Merge tag 'tpmdd-next-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fixes from Jarkko Sakkinen:
 "Two bug fixes for TPM bus encryption (the remaining reported issues in
  the feature)"

* tag 'tpmdd-next-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Disable TPM on tpm2_create_primary() failure
  tpm: Opt-in in disable PCR integrity protection

4 months agotpm: Disable TPM on tpm2_create_primary() failure
Jarkko Sakkinen [Wed, 13 Nov 2024 18:35:39 +0000 (20:35 +0200)]
tpm: Disable TPM on tpm2_create_primary() failure

The earlier bug fix misplaced the error-label when dealing with the
tpm2_create_primary() return value, which the original completely ignored.

Cc: [email protected]
Reported-by: Christoph Anton Mitterer <[email protected]>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087331
Fixes: cc7d8594342a ("tpm: Rollback tpm2_load_null()")
Signed-off-by: Jarkko Sakkinen <[email protected]>
4 months agotpm: Opt-in in disable PCR integrity protection
Jarkko Sakkinen [Wed, 13 Nov 2024 05:54:12 +0000 (07:54 +0200)]
tpm: Opt-in in disable PCR integrity protection

The initial HMAC session feature added TPM bus encryption and/or integrity
protection to various in-kernel TPM operations. This can cause performance
bottlenecks with IMA, as it heavily utilizes PCR extend operations.

In order to mitigate this performance issue, introduce a kernel
command-line parameter to the TPM driver for disabling the integrity
protection for PCR extend operations (i.e. TPM2_PCR_Extend).

Cc: James Bottomley <[email protected]>
Link: https://lore.kernel.org/linux-integrity/[email protected]/
Fixes: 6519fea6fd37 ("tpm: add hmac checks to tpm2_pcr_extend()")
Tested-by: Mimi Zohar <[email protected]>
Co-developed-by: Roberto Sassu <[email protected]>
Signed-off-by: Roberto Sassu <[email protected]>
Co-developed-by: Mimi Zohar <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
4 months agoMerge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Linus Torvalds [Wed, 13 Nov 2024 17:14:19 +0000 (09:14 -0800)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Daniel Borkmann:

 - Fix a mismatching RCU unlock flavor in bpf_out_neigh_v6 (Jiawei Ye)

 - Fix BPF sockmap with kTLS to reject vsock and unix sockets upon kTLS
   context retrieval (Zijian Zhang)

 - Fix BPF bits iterator selftest for s390x (Hou Tao)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix mismatched RCU unlock flavour in bpf_out_neigh_v6
  bpf: Add sk_is_inet and IS_ICSK check in tls_sw_has_ctx_tx/rx
  selftests/bpf: Use -4095 as the bad address for bits iterator

4 months agoMerge tag 'loongarch-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 13 Nov 2024 17:09:00 +0000 (09:09 -0800)]
Merge tag 'loongarch-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:

 - set up the logical-physical CPU mapping for all possible CPUs, in
   order to avoid a CPU hotplug issue

 - fix some KASAN bugs

 - fix AP booting issue in VM mode

 - some trivial cleanups

* tag 'loongarch-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Fix AP booting issue in VM mode
  LoongArch: Add WriteCombine shadow mapping in KASAN
  LoongArch: Disable KASAN if PGDIR_SIZE is too large for cpu_vabits
  LoongArch: Make KASAN work with 5-level page-tables
  LoongArch: Define a default value for VM_DATA_DEFAULT_FLAGS
  LoongArch: Fix early_numa_add_cpu() usage for FDT systems
  LoongArch: For all possible CPUs setup logical-physical CPU mapping

4 months agoMerge tag 'mm-hotfixes-stable-2024-11-12-16-39' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Wed, 13 Nov 2024 16:58:11 +0000 (08:58 -0800)]
Merge tag 'mm-hotfixes-stable-2024-11-12-16-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "10 hotfixes, 7 of which are cc:stable. 7 are MM, 3 are not. All
  singletons"

* tag 'mm-hotfixes-stable-2024-11-12-16-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: swapfile: fix cluster reclaim work crash on rotational devices
  selftests: hugetlb_dio: fixup check for initial conditions to skip in the start
  mm/thp: fix deferred split queue not partially_mapped: fix
  mm/gup: avoid an unnecessary allocation call for FOLL_LONGTERM cases
  nommu: pass NULL argument to vma_iter_prealloc()
  ocfs2: fix UBSAN warning in ocfs2_verify_volume()
  nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint
  nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint
  mm: page_alloc: move mlocked flag clearance into free_pages_prepare()
  mm: count zeromap read and set for swapout and swapin

4 months agoMerge branch 'phy-mediatek-reorg'
David S. Miller [Wed, 13 Nov 2024 13:06:04 +0000 (13:06 +0000)]
Merge branch 'phy-mediatek-reorg'

Sky Huang says:

====================
Re-organize MediaTek ethernet phy drivers and propose mtk-phy-lib

This patchset comes from patch 1/9, 3/9, 4/9, 5/9 and 7/9 of:
https://lore.kernel.org/netdev/20241004102413[email protected]/

This patchset changes MediaTek's ethernet phy's folder structure and
integrates helper functions, including LED & token ring manipulation,
into mtk-phy-lib.

---
Change in v2:
- Add correct Reviewed-by tag in each patch.

Change in v3:
[patch 4/5]
- Fix kernel test robot error by adding missing MTK_NET_PHYLIB.
====================

Signed-off-by: Sky Huang <[email protected]>
4 months agonet: phy: mediatek: add MT7530 & MT7531's PHY ID macros
SkyLake.Huang [Fri, 8 Nov 2024 16:34:55 +0000 (00:34 +0800)]
net: phy: mediatek: add MT7530 & MT7531's PHY ID macros

This patch adds MT7530 & MT7531's PHY ID macros in mtk-ge.c so that
it follows the same rule of mtk-ge-soc.c.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: SkyLake.Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agonet: phy: mediatek: Integrate read/write page helper functions
SkyLake.Huang [Fri, 8 Nov 2024 16:34:54 +0000 (00:34 +0800)]
net: phy: mediatek: Integrate read/write page helper functions

This patch integrates read/write page helper functions as MTK phy lib.
They are basically the same in mtk-ge.c & mtk-ge-soc.c.

Signed-off-by: SkyLake.Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agonet: phy: mediatek: Improve readability of mtk-phy-lib.c's mtk_phy_led_hw_ctrl_set()
SkyLake.Huang [Fri, 8 Nov 2024 16:34:53 +0000 (00:34 +0800)]
net: phy: mediatek: Improve readability of mtk-phy-lib.c's mtk_phy_led_hw_ctrl_set()

This patch removes parens around TRIGGER_NETDEV_RX/TRIGGER_NETDEV_TX in
mtk_phy_led_hw_ctrl_set(), which improves readability.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: SkyLake.Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agonet: phy: mediatek: Move LED helper functions into mtk phy lib
SkyLake.Huang [Fri, 8 Nov 2024 16:34:52 +0000 (00:34 +0800)]
net: phy: mediatek: Move LED helper functions into mtk phy lib

This patch creates mtk-phy-lib.c & mtk-phy.h and integrates mtk-ge-soc.c's
LED helper functions so that we can use those helper functions in other
MTK's ethernet phy driver.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: SkyLake.Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agonet: phy: mediatek: Re-organize MediaTek ethernet phy drivers
SkyLake.Huang [Fri, 8 Nov 2024 16:34:51 +0000 (00:34 +0800)]
net: phy: mediatek: Re-organize MediaTek ethernet phy drivers

Re-organize MediaTek ethernet phy driver files and get ready to integrate
some common functions and add new 2.5G phy driver.
mtk-ge.c: MT7530 Gphy on MT7621 & MT7531 Gphy
mtk-ge-soc.c: Built-in Gphy on MT7981 & Built-in switch Gphy on MT7988
mtk-2p5ge.c: Planned for built-in 2.5G phy on MT7988

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: SkyLake.Huang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoMerge branch 'octeontx2-rvu-rep'
David S. Miller [Wed, 13 Nov 2024 11:57:12 +0000 (11:57 +0000)]
Merge branch 'octeontx2-rvu-rep'

Geetha sowjanya says:

====================
Introduce RVU representors

This series adds representor support for each RVU device.
When switchdev mode is enabled, a representor netdev is registered
for each RVU device. In the implementation of the representor model,
one NIX HW LF with multiple SQs and RQs is reserved, where each
RQ and SQ of the LF is mapped to a representor. A loopback channel
is reserved to support the packet path between representors and VFs.
CN10K silicon supports 2 types of MACs, RPM and SDP. This
patch set adds representor support for both RPM and SDP MAC
interfaces.

- Patch 1: Implements basic representor driver.
- Patch 2: Add devlink support to create representor netdevs that
  can be used to manage VFs.
- Patch 3: Implements basic netdev_ndo_ops.
- Patch 4: Installs tcam rules to route packets between representor and
   VFs.
- Patch 5: Enables fetching VF stats via representor interface
- Patch 6: Adds support to sync link state between representors and VFs.
- Patch 7: Enables configuring VF MTU via representor netdevs.
- Patch 8: Adds representors for sdp MAC.
- Patch 9: Adds devlink port support.
- Patch 10: Implements offload stats.
- Patch 11: Implements tc offload support.
- Patch 12: Adds documentation for rvu port representor.

pci/0002:1c:00.0

Command to create PF/VF representor

Rpf1vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether f6:43:83:ee:26:21 brd ff:ff:ff:ff:ff:ff
Rpf1vf1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 12:b2:54:0e:24:54 brd ff:ff:ff:ff:ff:ff
Rpf1vf2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 4a:12:c4:4c:32:62 brd ff:ff:ff:ff:ff:ff
Rpf1vf3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether ca:cb:68:0e:e2:6e brd ff:ff:ff:ff:ff:ff
Rpf2vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 06:cc:ad:b4:f0:93 brd ff:ff:ff:ff:ff:ff

~# devlink port
pci/0002:1c:00.0/0: type eth netdev Rpf1vf0 flavour physical port 0 splittable false
pci/0002:1c:00.0/1: type eth netdev Rpf1vf1 flavour pcivf controller 0 pfnum 1 vfnum 1 external false splittable false
pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false
pci/0002:1c:00.0/3: type eth netdev Rpf1vf3 flavour pcivf controller 0 pfnum 1 vfnum 3 external false splittable false

-----------
v11:v1:
  - Submitted refactoring changes as a separate patch set.
https://lore.kernel.org/netdev/20241023161843[email protected]/T/
  - Moved documentation to a separate patch.
  - patch 9: Added code changes to forward updated mac address to VF.
  - Implemented TC offload support.

v10-v11:
  - As suggested by "Jiri Pirko" adjusted the documentation.
  - Added more commit description to patch1.

v9-v10:
  - Fixed build warning w.r.t documentation.

v8-v9:
   - Updated the documentation.

v7-v8:
   - Implemented offload stats ndo.
   - Added documentation.

v6-v7:
  - Rebased on top net-next branch.

v5-v6:
  - Addressed review comments provided by "Simon Horman".
  - Added review tag.

v4-v5:
  - Patch 3: Removed devm_* usage in rvu_rep_create()
  - Patch 3: Fixed build warnings.

v3-v4:
 - Patch 2 & 3: Fixed coccinelle reported warnings.
 - Patch 10: Added devlink port support.

v2-v3:
 - Used extack for error messages.
 - As suggested reworked commit messages.
 - Fixed sparse warning.

v1-v2:
 -Fixed build warnings.
 -Address review comments provided by "Kalesh Anakkur Purayil".
====================

Signed-off-by: David S. Miller <[email protected]>
4 months agoDocumentation: octeontx2: Add Documentation for RVU representors
Geetha sowjanya [Thu, 7 Nov 2024 16:08:39 +0000 (21:38 +0530)]
Documentation: octeontx2: Add Documentation for RVU representors

Adds documentation for creating and configuring rvu port representors

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Adds TC offload support
Geetha sowjanya [Thu, 7 Nov 2024 16:08:38 +0000 (21:38 +0530)]
octeontx2-pf: Adds TC offload support

Implements tc offload support for rvu representors.

Usage example:

 - Add tc rule to drop packets with vlan id 3 using port
   representor(Rpf1vf0).

# tc filter add dev Rpf1vf0 protocol 802.1Q parent ffff: flower
   vlan_id 3 vlan_ethtype ipv4 skip_sw action drop

- Redirect packets with vlan id 5 and IPv4 packets to eth1,
  after stripping vlan header.

# tc filter add dev Rpf1vf0 ingress protocol 802.1Q flower vlan_id 5
  vlan_ethtype ipv4 skip_sw action vlan pop action mirred ingress
  redirect dev eth1

Signed-off-by: Geetha sowjanya <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Implement offload stats ndo for representors
Geetha sowjanya [Thu, 7 Nov 2024 16:08:37 +0000 (21:38 +0530)]
octeontx2-pf: Implement offload stats ndo for representors

Implement the offload stat ndo by fetching the HW stats
of rx/tx queues attached to the representor.

Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Add devlink port support
Geetha sowjanya [Thu, 7 Nov 2024 16:08:36 +0000 (21:38 +0530)]
octeontx2-pf: Add devlink port support

Register devlink port for the rvu representors.

Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Add representors for sdp MAC
Geetha sowjanya [Thu, 7 Nov 2024 16:08:35 +0000 (21:38 +0530)]
octeontx2-pf: Add representors for sdp MAC

Hardware supports different types of MACs, e.g. RPM, SDP and LBK.
LBK is for the internal Tx->Rx HW loopback path. RPM and SDP MACs support
ingress/egress pkt IO on interfaces with different sets of capabilities,
like interface modes. At the time of netdev driver registration the PF
seeks MAC related information from the Admin function driver
'drivers/net/ethernet/marvell/octeontx2/af' and sets up ingress/egress
queues etc. such that pkt IO on the channels of these different MACs is
possible. This patch adds representors for the SDP MAC.

Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Configure VF mtu via representor
Geetha sowjanya [Thu, 7 Nov 2024 16:08:34 +0000 (21:38 +0530)]
octeontx2-pf: Configure VF mtu via representor

Adds support to manage the mtu configuration for a VF through its
representor. On update of the representor mtu an mbox notification is sent
to the VF to update its mtu.

This feature is implemented based on the "Network Function Representors"
kernel documentation.
"
Setting an MTU on the representor should cause that same MTU
to be reported to the representee.
"

Signed-off-by: Sai Krishna <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Add support to sync link state between representor and VFs
Geetha sowjanya [Thu, 7 Nov 2024 16:08:33 +0000 (21:38 +0530)]
octeontx2-pf: Add support to sync link state between representor and VFs

Implements the below requirement mentioned
in the representors documentation.

"
The representee's link state is controlled through the
representor. Setting the representor administratively UP
or DOWN should cause carrier ON or OFF at the representee.
"

This patch enables
- Reflecting the link state of representor based on the VF state and
 link state of VF based on representor.
- On VF interface up/down a notification is sent via mbox to representor
  to update the link state.
  e.g. ip link set eth0 up/down will turn carrier on/off
       on the corresponding representor (r0p1) interface.
- Setting the representor interface up/down updates the link state of the VF.
  e.g. ip link set r0p1 up/down will turn carrier on/off
       on the corresponding representee (eth0) interface.

Signed-off-by: Harman Kalra <[email protected]>
Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-pf: Get VF stats via representor
Geetha sowjanya [Thu, 7 Nov 2024 16:08:32 +0000 (21:38 +0530)]
octeontx2-pf: Get VF stats via representor

Adds support to export VF port statistics via representor
netdev. Defines new mbox "NIX_LF_STATS" to fetch VF hw stats.

Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 months agoocteontx2-af: Add packet path between representor and VF
Geetha sowjanya [Thu, 7 Nov 2024 16:08:31 +0000 (21:38 +0530)]
octeontx2-af: Add packet path between representor and VF

Current HW does not support an in-built switch which would forward pkts
between representee and representor. When a representor is put under
a bridge and pkts need to be sent to the representee, pkts from the
representor are sent on a HW internal loopback channel, which again
are punted to the ingress pkt parser. The rules that this patch
installs are the MCAM filters/rules which match against these
pkts and forward them to the representee.
They cover the basic representor <=> representee path, similar to
Tun/TAP between VM and Host.

Signed-off-by: Geetha sowjanya <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>