Git Repo - linux.git/log

net: enetc: include MAC Merge / FP registers in register dump

These have been useful in debugging various problems related to frame
preemption, so make them available through ethtool --register-dump for
later too.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: enetc: only commit preemptible TCs to hardware when MM TX is active

This was left as TODO in commit 01e23b2b3bad ("net: enetc: add support
for preemptible traffic classes") since it's relatively complicated.

Where this makes a difference is with a configuration as follows:

ethtool --set-mm eno0 pmac-enabled on tx-enabled on verify-enabled on

Preemptible packets should only be sent when the MAC Merge TX direction
becomes active (i.o.w. when the verification process succeeds, aka when
the link partner confirms it can process preemptible traffic). But the
tc qdisc with the preemptible traffic classes is offloaded completely
asynchronously w.r.t. the MM becoming active.

The ENETC manual does suggest that this should be handled in the driver:
"On startup, software should wait for the verification process to
complete (MMCSR[VSTS]=011) before initiating traffic".

Adding the necessary logic allows future selftests to uphold the claim
that an inactive or disabled MAC Merge layer should never send data
packets through the pMAC.

This change moves enetc_set_ptcfpr() from enetc.c to enetc_ethtool.c,
where its only caller is now - enetc_mm_commit_preemptible_tcs().

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: enetc: report mm tx-active based on tx-enabled and verify-status

The MMCSR register contains 2 fields with overlapping meaning:

- LPA (Local preemption active):
This read-only status bit indicates whether preemption is active for
this port. This bit will be set if preemption is both enabled and has
completed the verification process.
- TXSTS (Merge status):
This read-only status field provides the state of the MAC Merge sublayer
transmit status as defined in IEEE Std 802.3-2018 Clause 99.
00 Transmit preemption is inactive
01 Transmit preemption is active
10 Reserved
11 Reserved

However none of these 2 fields offer reliable reporting to software.

When connecting ENETC to a link partner which is not capable of Frame
Preemption, the expectation is that ENETC's verification should fail
(VSTS=4) and its MM TX direction should be inactive (LPA=0, TXSTS=00)
even though the MM TX is enabled (ME=1). But surprise, the LPA bit of
MMCSR stays set even if VSTS=4 and ME=1.

OTOH, the TXSTS field has the opposite problem. I cannot get its value
to change from 0, even when connecting to a link partner capable of
frame preemption, which does respond to its verification frames (ME=1
and VSTS=3, "SUCCEEDED").

The only option with such buggy hardware seems to be to reimplement the
formula for calculating tx-active in software, which is for tx-enabled
to be true, and for the verify-status to be either SUCCEEDED, or
DISABLED.

Without reliable tx-active reporting, we have no good indication when
to commit the preemptible traffic classes to hardware, which makes it
possible (but not desirable) to send preemptible traffic to a link
partner incapable of receiving it. However, currently we do not have the
logic to wait for TX to be active yet, so the impact is limited.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net: enetc: fix MAC Merge layer remaining enabled until a link down event

Current enetc_set_mm() is designed to set the priv->active_offloads bit
ENETC_F_QBU for enetc_mm_link_state_update() to act on, but if the link
is already up, it modifies the ENETC_MMCSR_ME ("Merge Enable") bit
directly.

The problem is that it only *sets* ENETC_MMCSR_ME if the link is up, it
doesn't *clear* it if needed. So subsequent enetc_get_mm() calls still
see tx-enabled as true, up until a link down event, which is when
enetc_mm_link_state_update() will get called.

This is not a functional issue as far as I can assess. It has only come
up because I'd like to uphold a simple API rule in core ethtool code:
the pMAC cannot be disabled if TX is going to be enabled. Currently,
the fact that TX remains enabled for longer than expected (after the
enetc_set_mm() call that disables it) is going to violate that rule,
which is how it was caught.

Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

wwan: core: add print for wwan port attach/disconnect

Refer to USB serial device or net device, there is a notice to
let end user know the status of device, like attached or
disconnected. Add attach/disconnect print for wwan device as
well.

Signed-off-by: Slark Xiao <[email protected]>
Reviewed-by: Loic Poulain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: skbuff: update and rename __kfree_skb_defer()

__kfree_skb_defer() uses the old naming where "defer" meant
slab bulk free/alloc APIs. In the meantime we also made
__kfree_skb_defer() feed the per-NAPI skb cache, which
implies bulk APIs. So take away the 'defer' and add 'napi'.

While at it add a drop reason. This only matters on the
tx_action path, if the skb has a frag_list. But getting
rid of a SKB_DROP_REASON_NOT_SPECIFIED seems like a net
benefit so why not.

Reviewed-by: Alexander Lobakin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

eth: mlx5: avoid iterator use outside of a loop

Fix the following warning about risky iterator use:

drivers/net/ethernet/mellanox/mlx5/core/eq.c:1010 mlx5_comp_irq_get_affinity_mask() warn: iterator used outside loop: 'eq'

Acked-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

flow_dissector: Address kdoc warnings

Address a number of warnings flagged by
./scripts/kernel-doc -none include/net/flow_dissector.h

include/net/flow_dissector.h:23: warning: Function parameter or member 'addr_type' not described in 'flow_dissector_key_control'
include/net/flow_dissector.h:23: warning: Function parameter or member 'flags' not described in 'flow_dissector_key_control'
include/net/flow_dissector.h:46: warning: Function parameter or member 'padding' not described in 'flow_dissector_key_basic'
include/net/flow_dissector.h:145: warning: Function parameter or member 'tipckey' not described in 'flow_dissector_key_addrs'
include/net/flow_dissector.h:157: warning: cannot understand function prototype: 'struct flow_dissector_key_arp '
include/net/flow_dissector.h:171: warning: cannot understand function prototype: 'struct flow_dissector_key_ports '
include/net/flow_dissector.h:203: warning: cannot understand function prototype: 'struct flow_dissector_key_icmp '

Also improve indentation on adjacent lines to those changed
to address the above.

No functional changes intended.

Signed-off-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

page_pool: unlink from napi during destroy

Jesper points out that we must prevent recycling into cache
after page_pool_destroy() is called, because page_pool_destroy()
is not synchronized with recycling (some pages may still be
outstanding when destroy() gets called).

I assumed this will not happen because NAPI can't be scheduled
if its page pool is being destroyed. But I missed the fact that
NAPI may get reused. For instance when user changes ring configuration
driver may allocate a new page pool, stop NAPI, swap, start NAPI,
and then destroy the old pool. The NAPI is running so old page
pool will think it can recycle to the cache, but the consumer
at that point is the destroy() path, not NAPI.

To avoid extra synchronization let the drivers do "unlinking"
during the "swap" stage while NAPI is indeed disabled.

Fixes: 8c48eea3adf3 ("page_pool: allow caching from safely localized NAPI")
Reported-by: Jesper Dangaard Brouer <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Acked-by: Jesper Dangaard Brouer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: phy: fix circular LEDS_CLASS dependencies

The CONFIG_PHYLIB symbol is selected by a number of device drivers that
need PHY support, but it now has a dependency on CONFIG_LEDS_CLASS,
which may not be enabled, causing build failures.

Avoid the risk of missing and circular dependencies by guarding the
phylib LED support itself in another Kconfig symbol that can only be
enabled if the dependency is met.

This could be made a hidden symbol and always enabled when both CONFIG_OF
and CONFIG_LEDS_CLASS are reachable from the phylib, but there may be an
advantage in having users see this option when they have a misconfigured
kernel without built-in LED support.

Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net/mlx5: Update op_mode to op_mod for port selection

To be consistent with the other enum keys use OP_MOD
instead of OP_MODE.

Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()

Remove unused function which also seems a duplicate
of esw_port_metadata_set().

Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()

The passded esw->dev is redundant as esw being passed and esw->dev being
used inside.

Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()

Add include directive to assure pci_msix_can_alloc_dyn() prototype.

Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation")
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Eli Cohen <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: RX, Hook NAPIs to page pools

Linking the NAPI to the rq page_pool to improve page_pool cache
usage during skb recycling.

Here are the observed improvements for a iperf single stream
test case:

- For 1500 MTU and legacy rq, seeing a 20% improvement of cache usage.

- For 9K MTU, seeing 33-40 % page_pool cache usage improvements for
both striding and legacy rq (depending if the application is running on
the same core as the rq or not).

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case

When the XDP handler marks the data for transmission (XDP_TX),
it is incorrect to release the page fragment. Instead, the
fragments should be marked as MLX5E_WQE_FRAG_SKIP_RELEASE
because XDP will release the page directly to the page_pool
(page_pool_put_defragged_page) after TX completion.

The linear case already does this. This patch fixes the
nonlinear part as well.

Also, the looping over the fragments was incorrect: When handling
pages after XDP_TX in the legacy rq nonlinear case, the loop was
skipping the first wqe fragment.

Fixes: 3f93f82988bc ("net/mlx5e: RX, Defer page release in legacy rq for better recycling")
Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ

mlx5e_free_rx_descs is responsible for calling the dealloc_wqe op which
returns pages to the page_pool. This can happen during flush or close.
For XSK, the regular RQ is flushed (when replaced by the XSK RQ) and
also closed later. This is normally not a problem as the wqe list is
empty on a second call to mlx5e_free_rx_descs. However, for striding RQ,
the previously released wqes from the list will appear as missing
and will be released a second time by mlx5e_free_rx_missing_descs.

This patch sets the no release bits on the striding RQ wqes in the
dealloc_wqe op to prevent releasing the pages a second time.

Please note that the bits are set only in the control path during
close and not in the data path.

Fixes: 4c2a13236807 ("net/mlx5e: RX, Defer page release in striding rq for better recycling")
Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5e: Add vnic devlink health reporter to representors

Create a new devlink health reporter for representor interface, which
reports the values of representor vnic diagnostic counters when diagnosed.

This patch will allow admins to monitor VF diagnostic counters through
the representor-interface vnic reporter.

Example of usage:
$ devlink health diagnose pci/0000:08:00.0/65537 reporter vnic
  vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

Signed-off-by: Maher Sanalla <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: Add vnic devlink health reporter to PFs/VFs

Create a vnic devlink health reporter for PFs/VFs interfaces.
The reporter's diagnose callback displays the values of vNIC/vport
transport debug counters of PFs/VFs, as follows:

$ devlink health diagnose pci/0000:08:00.0 reporter vnic
vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

Moreover, add documentation on the reporter functionality and the
counters description.

While at it, expose the vNIC counters diagnose function to be used by
the downstream patch, which will reveal the counters for representor
interfaces.

Signed-off-by: Maher Sanalla <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"

This reverts commit 606e6a72e29dff9e3341c4cc9b554420e4793f401 which exposes
the vnic diagnostic counters via debugfs. Instead, The upcoming series will
expose the same counters through devlink health reporter.

Signed-off-by: Maher Sanalla <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

Revert "net/mlx5: Expose steering dropped packets counter"

This reverts commit 4fe1b3a5f8fe2fdcedcaba9561e5b0ae5cb1d15b, which
exposes the steering dropped packets counter via debugfs. The upcoming
series will expose the counter via devlink health reporter instead
of debugfs.

Signed-off-by: Maher Sanalla <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: DR, Add memory statistics for domain object

Add counters for number of buddies that are currently in use per domain
per buddy type (STE, MODIFY-HEADER, MODIFY-PATTERN).

Signed-off-by: Erez Shitrit <[email protected]>
Signed-off-by: Yevgeny Kliteynik <[email protected]>
Reviewed-by: Alex Vesker <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: DR, Add more info in domain dbg dump

Add additinal items to domain info dump: Linux version and device name.

Signed-off-by: Yevgeny Kliteynik <[email protected]>
Reviewed-by: Alex Vesker <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: DR, Calculate sync threshold of each pool according to its type

When certain ICM chunk is no longer needed, it needs to be freed.
Fully freeing ICM memory involves issuing FW SYNC_STEERING command.
This is very time consuming, and it is impractical to do it for every
freed chunk.
Instead, we manage these 'freed' chunks in hot list (list of chunks
that are not required by SW any more, but HW might still access them).
When size of the hot list reaches certain threshold, we purge it and
issue SYNC_STEERING FW command.
There is one threshold for all the different ICM types, which is not
optimal, as different ICM types require different approach: STEs pool
is very large, and it is very 'dynamic' in its nature, so letting hot
list to become too large will result in a significant perf hiccup when
purging the hot list. Modify action is much smaller and less dynamic,
so we can let the hot list to grow to almost the size of the whole pool.

This patch fixes this problem: instead of having same hot memory
threshold for all the pools, sync operation will be triggered in
accordance with the ICM type.

Signed-off-by: Yevgeny Kliteynik <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump

The steering dump parser expects to see 0 as rewrite num of actions
in case pattern/args aren't supported - parsing of legacy modify header
is based on this assumption.
Fix this to align to parser's expectation.

Signed-off-by: Yevgeny Kliteynik <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

Merge branch 'fix __retval() being always ignored'

Eduard Zingerman says:

====================

Florian Westphal found a bug in test_loader.c processing of __retval tag.
Because of this bug the function test_loader.c:do_prog_test_run()
never executed and all __retval test tags were ignored. See [1].

Fix for this bug uncovers two additional bugs:
- During test_verifier tests migration to inline assembly (see [2])
  I missed the fact that some tests require maps to contain mock values;
- Some issue with a new refcounted_kptr test, which causes kernel to
  produce dead lock and refcount saturation warnings when subject to
  libbpf's bpf_test_run_opts().

This series fixes the bug in __retval() processing, and address the
issue with test maps not being populated. The issue in refcounted_kptr
is not addressed, __retval tags in those tests are commented out.

I found that the following tests depend on test maps being populated:
- progs/verifier_array_access.c
- verifier/value_ptr_arith.c (planned for migration to inline assembly)

Given the small amount of these tests I decided to opt for simple
non-generic solution (see patch #4).

[1] https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c [email protected]/T/
[2] https://lore.kernel.org/bpf/20230325025524 [email protected]/
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: populate map_array_ro map for verifier_array_access test

Two test cases:
- "valid read map access into a read-only array 1" and
- "valid read map access into a read-only array 2"

Expect that map_array_ro map is filled with mock data. This logic was
not taken into acount during initial test conversion.

This commit modifies prog_tests/verifier.c entry point for this test
to fill the map.

Fixes: a3c830ae0209 ("selftests/bpf: verifier/array_access.c converted to inline assembly")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: add pre bpf_prog_test_run_opts() callback for test_loader

When a test case is annotated with __retval tag the test_loader engine
would use libbpf's bpf_prog_test_run_opts() to do a test run of the
program and compare retvals.

This commit allows to perform arbitrary actions on bpf object right
before test loader invokes bpf_prog_test_run_opts(). This could be
used to setup some state for program execution, e.g. fill some maps.

Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: fix __retval() being always ignored

Florian Westphal found a bug in and suggested a fix for test_loader.c
processing of __retval tag. Because of this bug the function
test_loader.c:do_prog_test_run() never executed and all __retval test
tags were ignored.

If this bug is fixed a number of test cases from
progs/verifier_array_access.c fail with retval not matching the
expected value. This test was recently converted to use test_loader.c
and inline assembly in [1]. When doing the conversion I missed the
important detail of test_verifier.c operation: when it creates
fixup_map_array_ro, fixup_map_array_wo and fixup_map_array_small it
populates these maps with a dummy record.

Disabling the __retval checks for the affected verifier_array_access
in this commit to avoid false-postivies in any potential bisects.
The issue is addressed in the next patch.

I verified that the __retval tags are now respected by changing
expected return values for all tests annotated with __retval, and
checking that these tests started to fail.

[1] https://lore.kernel.org/bpf/20230325025524 [email protected]/

Fixes: 19a8e06f5f91 ("selftests/bpf: Tests execution support for test_loader.c")
Reported-by: Florian Westphal <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]/T/
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: disable program test run for progs/refcounted_kptr.c

Florian Westphal found a bug in test_loader.c processing of __retval
tag. Because of this bug the function test_loader.c:do_prog_test_run()
never executed and all __retval test tags were ignored. This hid an
issue with progs/refcounted_kptr.c tests.

When __retval tag bug is fixed and refcounted_kptr.c tests are run
kernel reports various issues and eventually hangs. Shortest reproducer
is the following command run a few times:

$ for i in $(seq 1 4); do (./test_progs --allow=refcounted_kptr &); done

Commenting out __retval tags for these tests until this issue is resolved.

Reported-by: Florian Westphal <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]/T/
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

bpftool: Replace "__fallthrough" by a comment to address merge conflict

The recent support for inline annotations in control flow graphs
generated by bpftool introduced the usage of the "__fallthrough" macro
in a switch/case block in btf_dumper.c. This change went through the
bpf-next tree, but resulted in a merge conflict in linux-next, because
this macro has been renamed "fallthrough" (no underscores) in the
meantime.

To address the conflict, we temporarily switch to a simple comment
instead of a macro.

Related: commit f7a858bffcdd ("tools: Rename __fallthrough to fallthrough")

Fixes: 9fd496848b1c ("bpftool: Support inline annotations when dumping the CFG of a program")
Reported-by: Sven Schnelle <[email protected]>
Reported-by: Thomas Richter <[email protected]>
Suggested-by: Daniel Borkmann <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/bpf/[email protected]/
Link: https://lore.kernel.org/bpf/[email protected]

ice: sleep, don't busy-wait, in the SQ send retry loop

10 ms is a lot of time to spend busy-waiting. Sleeping is clearly
allowed here, because we have just returned from ice_sq_send_cmd(),
which takes a mutex.

On kernels with HZ=100, this msleep may be twice as long, but I don't
think it matters.
I did not actually observe any retries happening here.

Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: remove unused buffer copy code in ice_sq_send_cmd_retry()

The 'buf_cpy'-related code in ice_sq_send_cmd_retry() looks broken.
'buf' is nowhere copied into 'buf_cpy'.

The reason this does not cause problems is that all commands for which
'is_cmd_for_retry' is true go with a NULL buf.

Let's remove 'buf_cpy'. Add a WARN_ON in case the assumption no longer
holds in the future.

Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: sleep, don't busy-wait, for ICE_CTL_Q_SQ_CMD_TIMEOUT

The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
and it uses udelay to do that.

Let's use usleep_range instead. We know sleeping is allowed here,
because we're holding a mutex (cq->sq_lock). To preserve the total
max waiting time, measure the timeout in jiffies.

ICE_CTL_Q_SQ_CMD_TIMEOUT is used also in ice_release_res(), but there
the polling period is 1 ms (i.e. 10 times longer). Since the timeout was
expressed in terms of the number of loops, the total timeout in this
function is 10 s. I do not know if this is intentional. This patch keeps
it.

The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
on my system from ~8 % to less than 1 %.

I received a report of high CPU usage with ptp4l where the busy-waiting
in ice_sq_send_cmd dominated the profile. This patch has been tested in
that usecase too and it made a huge improvement there.

Tested-by: Brent Rowsell <[email protected]>
Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: remove ice_ctl_q_info::sq_cmd_timeout

sq_cmd_timeout is initialized to ICE_CTL_Q_SQ_CMD_TIMEOUT and never
changed, so just use the constant directly.

Suggested-by: Simon Horman <[email protected]>
Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: increase the GNSS data polling interval to 20 ms

Double the GNSS data polling interval from 10 ms to 20 ms.
According to Karol Kolacinski from the Intel team, they have been
planning to make this change.

Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

ice: do not busy-wait to read GNSS data

The ice-gnss-<dev_name> kernel thread, which reads data from the u-blox
GNSS module, keep a CPU core almost 100% busy. The main reason is that
it busy-waits for data to become available.

A simple improvement would be to replace the "mdelay(10);" in
ice_gnss_read() with sleeping. A better fix is to not do any waiting
directly in the function and just requeue this delayed work as needed.
The advantage is that canceling the work from ice_gnss_exit() becomes
immediate, rather than taking up to ~2.5 seconds (ICE_MAX_UBX_READ_TRIES
* 10 ms).

This lowers the CPU usage of the ice-gnss-<dev_name> thread on my system
from ~90 % to ~8 %.

I am not sure if the larger 0.1 s pause after inserting data into the
gnss subsystem is really necessary, but I'm keeping that as it was.

Of course, ideally the driver would not have to poll at all, but I don't
know if the E810 can watch for GNSS data availability over the i2c bus
by itself and notify the driver.

Signed-off-by: Michal Schmidt <[email protected]>
Reviewed-by: Arkadiusz Kubalewski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Adjacent changes:

net/mptcp/protocol.h
  63740448a32e ("mptcp: fix accept vs worker race")
  2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
  ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time")

Signed-off-by: Jakub Kicinski <[email protected]>

wifi: ath9k: Don't mark channelmap stack variable read-only in ath9k_mci_update_wlan_channels()

This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.

Turns out the channelmap variable is not actually read-only, it's modified
through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
so making it read-only causes page faults when that code is hit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
Link: https://lore.kernel.org/r/[email protected]
Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
Cc: [email protected]
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'rust-fixes-6.3' of https://github.com/Rust-for-Linux/linux

Pull Rust fixes from Miguel Ojeda:
"Most of these are straightforward.

  The last one is more complex, but it only touches Rust + GCC builds
  which are for the moment best-effort.

   - Code: Missing 'extern "C"' fix.

   - Scripts: 'is_rust_module.sh' and 'generate_rust_analyzer.py' fixes.

   - A couple trivial fixes

   - Build: Rust + GCC build fix and 'grep' warning fix"

* tag 'rust-fixes-6.3' of https://github.com/Rust-for-Linux/linux:
  rust: allow to use INIT_STACK_ALL_ZERO
  rust: fix regexp in scripts/is_rust_module.sh
  rust: build: Fix grep warning
  scripts: generate_rust_analyzer: Handle sub-modules with no Makefile
  rust: kernel: Mark rust_fmt_argument as extern "C"
  rust: sort uml documentation arch support table
  rust: str: fix requierments->requirements typo

Merge tag 'net-6.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter and bpf.

  There are a few fixes for new code bugs, including the Mellanox one
  noted in the last networking pull. No known regressions outstanding.

  Current release - regressions:

   - sched: clear actions pointer in miss cookie init fail

   - mptcp: fix accept vs worker race

   - bpf: fix bpf_arch_text_poke() with new_addr == NULL on s390

   - eth: bnxt_en: fix a possible NULL pointer dereference in unload
     path

   - eth: veth: take into account peer device for
     NETDEV_XDP_ACT_NDO_XMIT xdp_features flag

  Current release - new code bugs:

   - eth: revert "net/mlx5: Enable management PF initialization"

  Previous releases - regressions:

   - netfilter: fix recent physdev match breakage

   - bpf: fix incorrect verifier pruning due to missing register
     precision taints

   - eth: virtio_net: fix overflow inside xdp_linearize_page()

   - eth: cxgb4: fix use after free bugs caused by circular dependency
     problem

   - eth: mlxsw: pci: fix possible crash during initialization

  Previous releases - always broken:

   - sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg

   - netfilter: validate catch-all set elements

   - bridge: don't notify FDB entries with "master dynamic"

   - eth: bonding: fix memory leak when changing bond type to ethernet

   - eth: i40e: fix accessing vsi->active_filters without holding lock

  Misc:

   - Mat is back as MPTCP co-maintainer"

* tag 'net-6.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
  net: bridge: switchdev: don't notify FDB entries with "master dynamic"
  Revert "net/mlx5: Enable management PF initialization"
  MAINTAINERS: Resume MPTCP co-maintainer role
  mailmap: add entries for Mat Martineau
  e1000e: Disable TSO on i219-LM card to increase speed
  bnxt_en: fix free-runnig PHC mode
  net: dsa: microchip: ksz8795: Correctly handle huge frame configuration
  bpf: Fix incorrect verifier pruning due to missing register precision taints
  hamradio: drop ISA_DMA_API dependency
  mlxsw: pci: Fix possible crash during initialization
  mptcp: fix accept vs worker race
  mptcp: stops worker on unaccepted sockets at listener close
  net: rpl: fix rpl header size calculation
  net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete()
  bonding: Fix memory leak when changing bond type to Ethernet
  veth: take into account peer device for NETDEV_XDP_ACT_NDO_XMIT xdp_features flag
  mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next()
  bnxt_en: Fix a possible NULL pointer dereference in unload path
  bnxt_en: Do not initialize PTP on older P3/P4 chips
  netfilter: nf_tables: tighten netlink attribute requirements for catch-all elements
  ...

Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git

ath.git patches for v6.4. Major changes:

wcn36xx

* support for pronto v3 hardware

ath11k

* PCIe DeviceTree bindings

* WCN6750: enable SAR support

ath10k

* convert DeviceTree bindings to YAML

net: libwx: fix memory leak in wx_setup_rx_resources

When wx_alloc_page_pool() failed in wx_setup_rx_resources(), it doesn't
release DMA buffer. Add dma_free_coherent() in the error path to release
the DMA buffer.

Fixes: 850b971110b2 ("net: libwx: Allocate Rx and Tx resources")
Signed-off-by: Zhengchao Shao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge tag 'mt76-for-kvalo-2023-04-18' of https://github.com/nbd168/wireless

mt76 patches for 6.4

- fixes
- connac code unification
- mt7921 p2p support
- mt7996 mesh a-msdu support
- mt7996 eht support
- mt7996 coredump support

wifi: rtw88: Update spelling in main.h

Update spelling in comments in main.h

Found by inspection.

Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: airo: remove ISA_DMA_API dependency

This driver does not actually use the ISA DMA API, it is purely
PIO based, so remove the dependency.

Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtl8xxxu: Simplify setting the initial gain

The goal of writing 0x6954341e / 0x6955341e to REG_OFDM0_XA_AGC_CORE1
appears to be setting the initial gain, which is stored in bits 0..6.
Bits 7..31 are the same as what the phy init tables write.

Modify only bits 0..6 so that we don't have to care about the values
of the others. This way we don't have to add another "else if" for the
RTL8192FU.

Why we need to change the initial gain from the default 0x20 to 0x1e?
Not sure. Some of the vendor drivers change it to 0x1e before scanning
and then restore it to the original value after.

Signed-off-by: Bitterblue Smith <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtl8xxxu: Add rtl8xxxu_write{8,16,32}_{set,clear}

Also add rtl8xxxu_write32_mask, rtl8xxxu_write_rfreg_mask.

These helper functions make it easier to modify only parts of a register
by eliminating the call to the register reading function and the bit
manipulations.

Signed-off-by: Bitterblue Smith <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtl8xxxu: Don't print the vendor/product/serial

Most devices have a vendor name, product name, and serial number in the
efuse, but it's pretty useless. It duplicates the information already
printed by the USB subsystem:

   usb 1-4: New USB device found, idVendor=0bda, idProduct=8178, bcdDevice= 2.00
   usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
   usb 1-4: Product: 802.11n WLAN Adapter
   usb 1-4: Manufacturer: Realtek
   usb 1-4: SerialNumber: 00e04c000001
-> usb 1-4: Vendor: Realtek
-> usb 1-4: Product: 802.11n WLAN Adapter

   usb 1-4: New USB device found, idVendor=0bda, idProduct=818b, bcdDevice= 2.00
   usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
   usb 1-4: Product: 802.11n NIC
   usb 1-4: Manufacturer: Realtek
   usb 1-4: SerialNumber: 00e04c000001
-> usb 1-4: Vendor: Realtek
-> usb 1-4: Product: 802.11n NIC
-> usb 1-4: Serial not available.

   usb 1-4: New USB device found, idVendor=0bda, idProduct=f179, bcdDevice= 0.00
   usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
   usb 1-4: Product: 802.11n
   usb 1-4: Manufacturer: Realtek
   usb 1-4: SerialNumber: 002E2DC0041F
-> usb 1-4: Vendor: Realtek
-> usb 1-4: Product: 802.11n

   usb 1-4: New USB device found, idVendor=0bda, idProduct=8179, bcdDevice= 0.00
   usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
   usb 1-4: Product: 802.11n NIC
   usb 1-4: Manufacturer: Realtek
   usb 1-4: SerialNumber: 00E04C0001
-> usb 1-4: Vendor: Realtek
-> usb 1-4: Product: 802.11n NIC
-> usb 1-4: Serial: 00E04C0001

Also, that data is not interpreted correctly in all cases:

usb 3-1.1.2: New USB device found, idVendor=0bda, idProduct=8179, bcdDevice= 0.00
usb 3-1.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 3-1.1.2: Product: 802.11n NIC
usb 3-1.1.2: Manufacturer: Realtek
usb 3-1.1.2: Vendor: Realtek
usb 3-1.1.2: Product: \x03802.11n NI
usb 3-1.1.2: Serial: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217231
Signed-off-by: Bitterblue Smith <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: Fix memory leak in rtw88_usb

Kmemleak shows the following leak arising from routine in the usb
probe routine:

unreferenced object 0xffff895cb29bba00 (size 512):
  comm "(udev-worker)", pid 534, jiffies 4294903932 (age 102751.088s)
  hex dump (first 32 bytes):
    77 30 30 30 00 00 00 00 02 2f 2d 2b 30 00 00 00  w000...../-+0...
    02 00 2a 28 00 00 00 00 ff 55 ff ff ff 00 00 00  ..*(.....U......
  backtrace:
    [<ffffffff9265fa36>] kmalloc_trace+0x26/0x90
    [<ffffffffc17eec41>] rtw_usb_probe+0x2f1/0x680 [rtw_usb]
    [<ffffffffc03e19fd>] usb_probe_interface+0xdd/0x2e0 [usbcore]
    [<ffffffff92b4f2fe>] really_probe+0x18e/0x3d0
    [<ffffffff92b4f5b8>] __driver_probe_device+0x78/0x160
    [<ffffffff92b4f6bf>] driver_probe_device+0x1f/0x90
    [<ffffffff92b4f8df>] __driver_attach+0xbf/0x1b0
    [<ffffffff92b4d350>] bus_for_each_dev+0x70/0xc0
    [<ffffffff92b4e51e>] bus_add_driver+0x10e/0x210
    [<ffffffff92b50935>] driver_register+0x55/0xf0
    [<ffffffffc03e0708>] usb_register_driver+0x88/0x140 [usbcore]
    [<ffffffff92401153>] do_one_initcall+0x43/0x210
    [<ffffffff9254f42a>] do_init_module+0x4a/0x200
    [<ffffffff92551d1c>] __do_sys_finit_module+0xac/0x120
    [<ffffffff92ee6626>] do_syscall_64+0x56/0x80
    [<ffffffff9300006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

The leak was verified to be real by unloading the driver, which resulted
in a dangling pointer to the allocation.

The allocated memory is freed in rtw_usb_intf_deinit().

Signed-off-by: Larry Finger <[email protected]>
Cc: Sascha Hauer <[email protected]>
Cc: Ping-Ke Shih <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: call rtw8821c_switch_rf_set() according to chip variant

We have to call rtw8821c_switch_rf_set() with SWITCH_TO_WLG or
SWITCH_TO_BTG according to the chip variant as denoted in rfe_option.
The information which argument to use for which variant has been
taken from the vendor driver.

Signed-off-by: Sascha Hauer <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: set pkg_type correctly for specific rtw8821c variants

According to the vendor driver the pkg_type has to be set to '1'
for some rtw8821c variants. As the pkg_type has been hardcoded to
'0', add a field for it in struct rtw_hal and set this correctly
in the rtw8821c part.
With this parsing of a rtw_table is influenced and check_positive()
in phy.c returns true for some cases here. The same is done in the
vendor driver. However, this has no visible effect on the driver
here.

Signed-off-by: Sascha Hauer <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Sascha Hauer <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: rtw8821c: Fix rfe_option field width

On my RTW8821CU chipset rfe_option reads as 0x22. Looking at the
vendor driver suggests that the field width of rfe_option is 5 bit,
so rfe_option should be masked with 0x1f.

Without this the rfe_option comparisons with 2 further down the
driver evaluate as false when they should really evaluate as true.
The effect is that 2G channels do not work.

rfe_option is also used as an array index into rtw8821c_rfe_defs[].
rtw8821c_rfe_defs[34] (0x22) was added as part of adding USB support,
likely because rfe_option reads as 0x22. As this now becomes 0x2,
rtw8821c_rfe_defs[34] is no longer used and can be removed.

Note that this might not be the whole truth. In the vendor driver
there are indeed places where the unmasked rfe_option value is used.
However, the driver has several places where rfe_option is tested
with the pattern if (rfe_option == 2 || rfe_option == 0x22) or
if (rfe_option == 4 || rfe_option == 0x24), so that rfe_option BIT(5)
has no influence on the code path taken. We therefore mask BIT(5)
out from rfe_option entirely until this assumption is proved wrong
by some chip variant we do not know yet.

Signed-off-by: Sascha Hauer <[email protected]>
Tested-by: Alexandru gagniuc <[email protected]>
Tested-by: Larry Finger <[email protected]>
Tested-by: ValdikSS <[email protected]>
Cc: [email protected]
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: usb: fix priority queue to endpoint mapping

The RTW88 chipsets have four different priority queues in hardware. For
the USB type chipsets the packets destined for a specific priority queue
must be sent through the endpoint corresponding to the queue. This was
not fully understood when porting from the RTW88 USB out of tree driver
and thus violated.

This patch implements the qsel to endpoint mapping as in
get_usb_bulkout_id_88xx() in the downstream driver.

Without this the driver often issues "timed out to flush queue 3"
warnings and often TX stalls completely.

Signed-off-by: Sascha Hauer <[email protected]>
Tested-by: ValdikSS <[email protected]>
Tested-by: Alexandru gagniuc <[email protected]>
Tested-by: Larry Finger <[email protected]>
Cc: [email protected]
Reviewed-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: 8822c: add iface combination

Allow 8822c to operate two interface concurrently, only 1 AP mode plus
1 station mode under same frequency is supported. Combination of other
types will not be added until requested.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: handle station mode concurrent scan with AP mode

This patch allows vifs sharing same hardware with the AP mode vif to
do scan, do note that this could lead to packet loss or disconnection
of the AP's clients. Since we don't have chanctx, update scan info
upon set channel so bandwidth changes won't go unnoticed and get
misconfigured after scan. Download beacon just before scan starts to
allow hardware to get proper content to do beaconing. Last, beacons
should only be transmitted in AP's operating channel. Turn related
beacon functions off while we're in other channels so the receiving
stations won't get confused.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: prevent scan abort with other VIFs

Only abort scan with current scanning VIF. If we have more than one
interface, we could call rtw_hw_scan_abort() with the wrong VIF as
input. This avoids potential null pointer being accessed when actually
the other VIF is scanning.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: refine reserved page flow for AP mode

Only gather reserved pages from AP interface after it has started. Or
else ieee80211_beacon_get_*() returns NULL and causes other VIFs'
reserved pages fail to download. Update location of current reserved page
after beacon renews so offsets changed by beacon can be recognized.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: disallow PS during AP mode

Firmware can't support PS mode during AP mode, so disallow this case.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: 8822c: extend reserved page number

Extend 8822c's reserved page number to accommodate additional required
pages. Reserved page is an area of memory in the FIFO dedicated for
special purposes. Previously only one interface is supported so 8 pages
should suffice, extend it so we can support 2 interfaces concurrently.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: add port switch for AP mode

Switch port settings if AP mode does not start on port 0 because of
hardware limitation. For some ICs, beacons on ports other than zero
could misbehave and do not issue properly, to fix this we change AP
VIFs to port zero when multiple interfaces is active.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw88: add bitmap for dynamic port settings

In order to support multiple interfaces, multiple port settings will
be required. Current code always uses port 0 and should be changed.
Declare a bitmap with size equal to hardware port number to record
the current usage.

Signed-off-by: Po-Hao Huang <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: rtw89: mac: use regular int as return type of DLE buffer request

The function to request DLE (data link engine) buffer uses 'u16' as return
value that mixes error code, so change it to 'int' as regular error code.
Also, treat invalid register value (0xfff) as an error.

Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

wifi: mac80211: remove return value check of debugfs_create_dir()

Smatch complains that:
debugfs_hw_add() warn: 'statsd' is an error pointer or valid

Debugfs checks are generally not supposed to be checked for errors
and it is not necessary here.

Just delete the dead code.

Signed-off-by: Yingsha Xu <[email protected]>
Reviewed-by: Dongliang Mu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mvm: fix RFKILL report when driver is going down

When CSME takes ownership, the driver sets RFKILL on, and this
triggers driver unload and sending the confirmation SAP message.
However, when IWL_MVM_MEI_REPORT_RFKILL is set, RFKILL was not
reported and as a result, the driver did not confirm the ownership
transition. Fix it.

Signed-off-by: Avraham Stern <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.29ac3cd3df73.I96b32bc274bfe1e3871e54d3fa29c7ac4f40446f@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mei: re-ask for ownership after it was taken by CSME

When the host disconnects from the AP CSME takes ownership right away.
Since the driver never asks for ownership again wifi is left in rfkill
until CSME releases the NIC, although in many cases the host could
re-connect shortly after the disconnection. To allow the host to
recover from occasional disconnection, re-ask for ownership to let
the host connect again.
Allow one minute before re-asking for ownership to avoid too frequent
ownership transitions.

Signed-off-by: Avraham Stern <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.a6c6ebc48f2d.I8a17003b86e71b3567521cc69864b9cbe9553ea9@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mei: make mei filtered scan more aggressive

When mei filtered scan is performed, it must find the AP on the first
scan, otherwise CSME will take the ownership of the NIC.
Make this scan more aggressive by scanning the channel the AP is
supposed to be on (as reported by CSME) several times.

Signed-off-by: Avraham Stern <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.47e383b10b18.I14340a118acdb19ecb7214e7ff413054c77bd99c@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: modify scan request and results when in link protection

When CSME is connected and has link protection set, the driver must
connect to the same AP CSME is connected to.
When in link protection, modify scan request parameters to include
only the channel of the AP CSME is connected to and scan for the
same SSID. In addition, filter the scan results to include only
results from the same AP. This will make sure the driver will connect
to the same AP and will do it fast enough to keep the session alive.

Signed-off-by: Avraham Stern <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.c1b55de3d704.I3895eebe18b3b672607695c887d728e113fc85ec@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mvm: enable support for MLO APIs

Enable driver's support for MLO APIs to unlock this functionality.

Signed-off-by: Gregory Greenman <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.0ae0dd6f0481.Iec993cf0f28eacb2483fb9d1e755b0b2fd62e163@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mvm: prefer RCU_INIT_POINTER()

For constant values we don't need rcu_assign_pointer(),
use RCU_INIT_POINTER() instead.

Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.7b400d21a27f.Iccdef9d777677390a9881c88b06c0ed13a83d978@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mvm: fix potential memory leak

If we do get multiple notifications from firmware, then
we might have allocated 'notif', but don't free it. Fix
that by checking for duplicates before allocation.

Fixes: 4da46a06d443 ("wifi: iwlwifi: mvm: Add support for wowlan info notification")
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.116758321cc4.I8bdbcbb38c89ac637eaa20dda58fa9165b25893a@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: fw: fix argument to efi.get_variable

We should pass the newly allocated data to fill.

Signed-off-by: Alon Giladi <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.aaa6d8874442.I734841c71aad9564cb22c50f2737aaff489fadaf@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: mvm: fix MIC removal confusion

The RADA/firmware collaborate on MIC stripping in the following
way:
- the firmware fills the IWL_RX_MPDU_MFLG1_MIC_CRC_LEN_MASK
   value for how many words need to be removed at the end of
   the frame, CRC and, if decryption was done, MIC
- if the RADA is active, it will
   - remove that much from the end of the frame
   - zero the value in IWL_RX_MPDU_MFLG1_MIC_CRC_LEN_MASK

As a consequence, the only thing the driver should need to do
is to
- unconditionally tell mac80211 that the MIC was removed
   if decryption was already done
- remove as much as IWL_RX_MPDU_MFLG1_MIC_CRC_LEN_MASK says
   at the end of the frame, since either RADA did it and then
   the value is 0, or RADA was disabled and then the value is
   whatever should be removed to strip both CRC & MIC

However, all this code was historically grown and getting a
bit confused. Originally, we were indicating that the MIC was
not stripped, which is the version of the code upstreamed in
commit 780e87c29e77 ("iwlwifi: mvm: add 9000 series RX processing")
which indicated RX_FLAG_DECRYPTED in iwl_mvm_rx_crypto().

We later had a commit to change that to also indicate that the
MIC was stripped, adding RX_FLAG_MIC_STRIPPED. However, this was
then "fixed" later to only do that conditionally on RADA being
enabled, since otherwise RADA didn't strip the MIC bytes yet.
At the time, we were also always including the FCS if the RADA
was not enabled, so that was still broken wrt. the FCS if the
RADA isn't enabled - but that's a pretty rare case. Notably
though, it does happen for management frames, where we do need
to remove the MIC and CRC but the RADA is disabled.

Later, in commit 40a0b38d7a7f ("iwlwifi: mvm: Fix calculation of
frame length"), we changed this again, upstream this was just a
single commit, but internally it was split into first the correct
commit and then an additional fix that reduced the number of bytes
that are removed by crypt_len. Note that this is clearly wrong
since crypt_len indicates the length of the PN header (always 8),
not the length of the MIC (8 or 16 depending on algorithm).
However, this additional fix mostly canceled the other bugs,
apart from the confusion about the size of the MIC.

To fix this correctly, remove all those additional workarounds.
We really should always indicate to mac80211 the MIC was stripped
(it cannot use it anyway if decryption was already done), and also
always actually remove it and the CRC regardless of the RADA being
enabled or not. That's simple though, the value indicated in the
metadata is zeroed by the RADA if it's enabled and used the value,
so there's no need to check if it's enabled or not.

Notably then, this fixes the MIC size confusion, letting us receive
GCMP-256 encrypted management frames correctly that would otherwise
be reported to mac80211 8 bytes too short since the RADA is turned
off for them, crypt_len is 8, but the MIC size is 16, so when we do
the adjustment based on IWL_RX_MPDU_MFLG1_MIC_CRC_LEN_MASK (which
indicates 20 bytes to remove) we remove 12 bytes but indicate then
to mac80211 the MIC is still present, so mac80211 again removes the
MIC of 16 bytes, for an overall removal of 28 rather than 20 bytes.

Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.81345b6ab0cd.Ibe0348defb6cce11c99929a1f049e60b5cfc150c@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: fw: fix memory leak in debugfs

Fix a memory leak that occurs when reading the fw_info
file all the way, since we return NULL indicating no
more data, but don't free the status tracking object.

Fixes: 36dfe9ac6e8b ("iwlwifi: dump api version in yaml format")
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.239e501b3b8d.I4268f87809ef91209cbcd748eee0863195e70fa2@changeid
Signed-off-by: Johannes Berg <[email protected]>

wifi: iwlwifi: Update support for b0 version

Add support for B0 version of MAC of MR device

Signed-off-by: Mukesh Sisodiya <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230418122405.5dca1ea7a0cf.I87932e1e216a1940eeae8824071ecb777f4c034f@changeid
Signed-off-by: Johannes Berg <[email protected]>

net: bridge: switchdev: don't notify FDB entries with "master dynamic"

There is a structural problem in switchdev, where the flag bits in
struct switchdev_notifier_fdb_info (added_by_user, is_local etc) only
represent a simplified / denatured view of what's in struct
net_bridge_fdb_entry :: flags (BR_FDB_ADDED_BY_USER, BR_FDB_LOCAL etc).
Each time we want to pass more information about struct
net_bridge_fdb_entry :: flags to struct switchdev_notifier_fdb_info
(here, BR_FDB_STATIC), we find that FDB entries were already notified to
switchdev with no regard to this flag, and thus, switchdev drivers had
no indication whether the notified entries were static or not.

For example, this command:

ip link add br0 type bridge && ip link set swp0 master br0
bridge fdb add dev swp0 00:01:02:03:04:05 master dynamic

has never worked as intended with switchdev. It causes a struct
net_bridge_fdb_entry to be passed to br_switchdev_fdb_notify() which has
a single flag set: BR_FDB_ADDED_BY_USER.

This is further passed to the switchdev notifier chain, where interested
drivers have no choice but to assume this is a static (does not age) and
sticky (does not migrate) FDB entry. So currently, all drivers offload
it to hardware as such, as can be seen below ("offload" is set).

bridge fdb get 00:01:02:03:04:05 dev swp0 master
00:01:02:03:04:05 dev swp0 offload master br0

The software FDB entry expires $ageing_time centiseconds after the
kernel last sees a packet with this MAC SA, and the bridge notifies its
deletion as well, so it eventually disappears from hardware too.

This is a problem, because it is actually desirable to start offloading
"master dynamic" FDB entries correctly - they should expire $ageing_time
centiseconds after the *hardware* port last sees a packet with this
MAC SA - and this is how the current incorrect behavior was discovered.
With an offloaded data plane, it can be expected that software only sees
exception path packets, so an otherwise active dynamic FDB entry would
be aged out by software sooner than it should.

With the change in place, these FDB entries are no longer offloaded:

bridge fdb get 00:01:02:03:04:05 dev swp0 master
00:01:02:03:04:05 dev swp0 master br0

and this also constitutes a better way (assuming a backport to stable
kernels) for user space to determine whether the kernel has the
capability of doing something sane with these or not.

As opposed to "master dynamic" FDB entries, on the current behavior of
which no one currently depends on (which can be deduced from the lack of
kselftests), Ido Schimmel explains that entries with the "extern_learn"
flag (BR_FDB_ADDED_BY_EXT_LEARN) should still be notified to switchdev,
since the spectrum driver listens to them (and this is kind of okay,
because although they are treated identically to "static", they are
expected to not age, and to roam).

Fixes: 6b26b51b1d13 ("net: bridge: Add support for notifying devices about FDB add/del")
Link: https://lore.kernel.org/netdev/20230327115206.jk5q5l753aoelwus@skbuf/
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>

Merge branch 'Access variable length array relaxed for integer type'

Feng zhou says:

====================

From: Feng Zhou <[email protected]>

Add support for integer type of accessing variable length array.
Add a selftest to check it.
====================

Signed-off-by: Alexei Starovoitov <[email protected]>

selftests/bpf: Add test to access integer type of variable array

Add prog test for accessing integer type of variable array in tracing
program.
In addition, hook load_balance function to access sd->span[0], only
to confirm whether the load is successful. Because there is no direct
way to trigger load_balance call.

Co-developed-by: Chengming Zhou <[email protected]>
Signed-off-by: Chengming Zhou <[email protected]>
Signed-off-by: Feng Zhou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

bpf: support access variable length array of integer type

After this commit:
bpf: Support variable length array in tracing programs (9c5f8a1008a1)
Trace programs can access variable length array, but for structure
type. This patch adds support for integer type.

Example:
Hook load_balance
struct sched_domain {
...
unsigned long span[];
}

The access: sd->span[0].

Co-developed-by: Chengming Zhou <[email protected]>
Signed-off-by: Chengming Zhou <[email protected]>
Signed-off-by: Feng Zhou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>

Revert "net/mlx5: Enable management PF initialization"

This reverts commit fe998a3c77b9f989a30a2a01fb00d3729a6d53a4.

Paul reports that it causes a regression with IB on CX4
and FW 12.18.1000. In addition I think that the concept
of "management PF" is not fully accepted and requires
a discussion.

Fixes: fe998a3c77b9 ("net/mlx5: Enable management PF initialization")
Reported-by: Paul Moore <[email protected]>
Link: https://lore.kernel.org/all/CAHC9VhQ7A4+msL38WpbOMYjAqLp0EtOjeLh4Dc6SQtD6OUvCQg@mail.gmail.com/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch 'another-crack-at-a-handshake-upcall-mechanism'

Chuck Lever says:

====================
Another crack at a handshake upcall mechanism

Here is v10 of a series to add generic support for transport layer
security handshake on behalf of kernel socket consumers (user space
consumers use a security library directly, of course). A summary of
the purpose of these patches is archived here:

https://lore.kernel.org/netdev/1DE06BB1-6BA9-4DB4-B2AA-07DE532963D6@oracle.com/

The first patch in the series applies to the top-level .gitignore
file to address the build warnings reported a few days ago. I intend
to submit that separately. I'd like you to consider taking the rest
of this series for v6.4.

The full patch set to support SunRPC with TLSv1.3 is available in
the topic-rpc-with-tls-upcall branch here, based on net-next/main:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git

This patch set includes support for in-transit confidentiality and
peer authentication for both the Linux NFS client and server.

A user space handshake agent for TLSv1.3 to go along with the kernel
patches is available in the "main" branch here:

https://github.com/oracle/ktls-utils
====================

Link: https://lore.kernel.org/r/168174169259.9520.1911007910797225963.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Jakub Kicinski <[email protected]>

net/handshake: Add Kunit tests for the handshake consumer API

These verify the API contracts and help exercise lifetime rules for
consumer sockets and handshake_req structures.

One way to run these tests:

./tools/testing/kunit/kunit.py run --kunitconfig ./net/handshake/.kunitconfig

Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/handshake: Add a kernel API for requesting a TLSv1.3 handshake

To enable kernel consumers of TLS to request a TLS handshake, add
support to net/handshake/ to request a handshake upcall.

This patch also acts as a template for adding handshake upcall
support for other kernel transport layer security providers.

Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

net/handshake: Create a NETLINK service for handling handshake requests

When a kernel consumer needs a transport layer security session, it
first needs a handshake to negotiate and establish a session. This
negotiation can be done in user space via one of the several
existing library implementations, or it can be done in the kernel.

No in-kernel handshake implementations yet exist. In their absence,
we add a netlink service that can:

a. Notify a user space daemon that a handshake is needed.

b. Once notified, the daemon calls the kernel back via this
   netlink service to get the handshake parameters, including an
   open socket on which to establish the session.

c. Once the handshake is complete, the daemon reports the
   session status and other information via a second netlink
   operation. This operation marks that it is safe for the
   kernel to use the open socket and the security session
   established there.

The notification service uses a multicast group. Each handshake
mechanism (eg, tlshd) adopts its own group number so that the
handshake services are completely independent of one another. The
kernel can then tell via netlink_has_listeners() whether a handshake
service is active and prepared to handle a handshake request.

A new netlink operation, ACCEPT, acts like accept(2) in that it
instantiates a file descriptor in the user space daemon's fd table.
If this operation is successful, the reply carries the fd number,
which can be treated as an open and ready file descriptor.

While user space is performing the handshake, the kernel keeps its
muddy paws off the open socket. A second new netlink operation,
DONE, indicates that the user space daemon is finished with the
socket and it is safe for the kernel to use again. The operation
also indicates whether a session was established successfully.

Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

.gitignore: Do not ignore .kunitconfig files

Circumvent the .gitignore wildcard to avoid warnings about ignored
.kunitconfig files. As far as I can tell, the warnings are harmless
and these files are not actually ignored.

Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'ipsec-next-2023-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
ipsec-next 2023-04-19

1) Remove inner/outer modes from input/output path. These are
   not needed anymore. From Herbert Xu.

* tag 'ipsec-next-2023-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
  xfrm: Remove inner/outer modes from output path
  xfrm: Remove inner/outer modes from input path
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

dt-bindings: net: ethernet: Fix JSON pointer references

A JSON pointer reference (the part after the "#") must start with a "/".
Conversely, references to the entire document must not have a trailing "/"
and should be just a "#". The existing jsonschema package allows these,
but coming changes make allowed "$ref" URIs stricter and throw errors on
these references.

Signed-off-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: micrel: Update the list of supported phys

At the beginning of the file micrel.c there is list of supported PHYs.
Extend this list with the following PHYs lan8841, lan8814 and lan8804,
as these PHYs were added but the list was not updated.

Signed-off-by: Horatiu Vultur <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: stmmac: dwmac-meson8b: Avoid cast to incompatible function type

Rather than casting clk_disable_unprepare to an incompatible function
type provide a trivial wrapper with the correct signature for the
use-case.

Reported by clang-16 with W=1:

drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c:276:6: error: cast from 'void (*)(struct clk *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
(void(*)(void *))clk_disable_unprepare,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
No functional change intended.
Compile tested only.

Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
bpf 2023-04-19

We've added 3 non-merge commits during the last 6 day(s) which contain
a total of 3 files changed, 34 insertions(+), 9 deletions(-).

The main changes are:

1) Fix a crash on s390's bpf_arch_text_poke() under a NULL new_addr,
   from Ilya Leoshkevich.

2) Fix a bug in BPF verifier's precision tracker, from Daniel Borkmann
   and Andrii Nakryiko.

3) Fix a regression in veth's xdp_features which led to a broken BPF CI
   selftest, from Lorenzo Bianconi.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix incorrect verifier pruning due to missing register precision taints
  veth: take into account peer device for NETDEV_XDP_ACT_NDO_XMIT xdp_features flag
  s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

MAINTAINERS: Resume MPTCP co-maintainer role

I'm returning to the MPTCP maintainer role I held for most of the
subsytem's history. This time I'm using my kernel.org email address.

Acked-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/mptcp/[email protected]/
Signed-off-by: Mat Martineau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

mailmap: add entries for Mat Martineau

Map Mat's old corporate addresses to his kernel.org one.

Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts <[email protected]>
Link: https://lore.kernel.org/r/20230418-upstream-net-20230418-mailmap-mat-v1-1-13ca5dc83037@tessares.net
Signed-off-by: Jakub Kicinski <[email protected]>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-04-17 (i40e)

This series contains updates to i40e only.

Alex moves setting of active filters to occur under lock and checks/takes
error path in rebuild if re-initializing the misc interrupt vector
failed.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: fix i40e_setup_misc_vector() error handling
i40e: fix accessing vsi->active_filters without holding lock
====================

Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"22 hotfixes.

  19 are cc:stable and the remainder address issues which were
  introduced during this merge cycle, or aren't considered suitable for
  -stable backporting.

  19 are for MM and the remainder are for other subsystems"

* tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
  nilfs2: initialize unused bytes in segment summary blocks
  mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages
  mm/mmap: regression fix for unmapped_area{_topdown}
  maple_tree: fix mas_empty_area() search
  maple_tree: make maple state reusable after mas_empty_area_rev()
  mm: kmsan: handle alloc failures in kmsan_ioremap_page_range()
  mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush()
  tools/Makefile: do missed s/vm/mm/
  mm: fix memory leak on mm_init error handling
  mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock
  kernel/sys.c: fix and improve control flow in __sys_setres[ug]id()
  Revert "userfaultfd: don't fail on unrecognized features"
  writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
  maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug
  tools/mm/page_owner_sort.c: fix TGID output when cull=tg is used
  mailmap: update jtoppins' entry to reference correct email
  mm/mempolicy: fix use-after-free of VMA iterator
  mm/huge_memory.c: warn with pr_warn_ratelimited instead of VM_WARN_ON_ONCE_FOLIO
  mm/mprotect: fix do_mprotect_pkey() return on error
  mm/khugepaged: check again on anon uffd-wp during isolation
  ...

e1000e: Disable TSO on i219-LM card to increase speed

While using i219-LM card currently it was only possible to achieve
about 60% of maximum speed due to regression introduced in Linux 5.8.
This was caused by TSO not being disabled by default despite commit
f29801030ac6 ("e1000e: Disable TSO for buffer overrun workaround").
Fix that by disabling TSO during driver probe.

Fixes: f29801030ac6 ("e1000e: Disable TSO for buffer overrun workaround")
Signed-off-by: Sebastian Basierski <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

bnxt_en: fix free-runnig PHC mode

The patch in fixes changed the way real-time mode is chosen for PHC on
the NIC. Apparently there is one more use case of the check outside of
ptp part of the driver which was not converted to the new macro and is
making a lot of noise in free-running mode.

Fixes: 131db4991622 ("bnxt_en: reset PHC frequency in free-running mode")
Signed-off-by: Vadim Fedorenko <[email protected]>
Reviewed-by: Michael Chan <[email protected]>
Reviewed-by: Pavan Chebbi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: mt7530: fix support for MT7531BE

There are two variants of the MT7531 switch IC which got different
features (and pins) regarding port 5:
* MT7531AE: SGMII/1000Base-X/2500Base-X SerDes PCS
* MT7531BE: RGMII

Moving the creation of the SerDes PCS from mt753x_setup to mt7530_probe
with commit 6de285229773 ("net: dsa: mt7530: move SGMII PCS creation
to mt7530_probe function") works fine for MT7531AE which got two
instances of mtk-pcs-lynxi, however, MT7531BE requires mt7531_pll_setup
to setup clocks before the single PCS on port 6 (usually used as CPU
port) starts to work and hence the PCS creation failed on MT7531BE.

Fix this by introducing a pointer to mt7531_create_sgmii function in
struct mt7530_priv and call it again at the end of mt753x_setup like it
was before commit 6de285229773 ("net: dsa: mt7530: move SGMII PCS
creation to mt7530_probe function").

Fixes: 6de285229773 ("net: dsa: mt7530: move SGMII PCS creation to mt7530_probe function")
Signed-off-by: Daniel Golle <[email protected]>
Acked-by: Arınç ÜNAL <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'spi-fix-v6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
"A small fix in the error handling for the rockchip driver, ensuring we
  don't leak clock enables if we fail to request the interrupt for the
  device"

* tag 'spi-fix-v6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-rockchip: Fix missing unwind goto in rockchip_sfc_probe()

Merge tag 'regulator-fix-v6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A few driver specific fixes, one build coverage issue and a couple of
  'someone typed in the wrong number' style errors in describing devices
  to the subsystem"

* tag 'regulator-fix-v6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: sm5703: Fix missing n_voltages for fixed regulators
  regulator: fan53555: Fix wrong TCS_SLEW_MASK
  regulator: fan53555: Explicitly include bits header

net: dsa: microchip: ksz8795: Correctly handle huge frame configuration

Because of the logic in place, SW_HUGE_PACKET can never be set.
(If the first condition is true, then the 2nd one is also true, but is not
executed)

Change the logic and update each bit individually.

Fixes: 29d1e85f45e0 ("net: dsa: microchip: ksz8: add MTU configuration support")
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/43107d9e8b5b8b05f0cbd4e1f47a2bb88c8747b2.1681755535.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <[email protected]>