Paolo Abeni [Tue, 20 Feb 2024 11:00:01 +0000 (12:00 +0100)]
udp: add local "peek offset enabled" flag
We want to re-organize the struct sock layout. The sk_peek_off
field location is problematic, as most protocols want it in the
RX read area, while UDP wants it on a cacheline different from
sk_receive_queue.
Create a local (inside udp_sock) copy of the 'peek offset is enabled'
flag and place it inside the same cacheline of reader_queue.
Check such flag before reading sk_peek_off. This will save potential
false sharing and cache misses in the fast-path.
Tested under UDP flood with small packets. The struct sock layout
update causes a 4% performance drop, and this patch completely restores
the original throughput.
mv88q2xxx_config_init calls genphy_c45_read_pma, which is already done by
mv88q2xxx_read_status. It also calls mv88q2xxx_config_aneg, which is
also called by the PHY state machine. Let the PHY state machine handle
the phydriver ops in their intended way.
Dimitri Fedrau [Sun, 18 Feb 2024 07:57:48 +0000 (08:57 +0100)]
net: phy: marvell-88q2xxx: switch to mv88q2xxx_config_aneg
Switch to mv88q2xxx_config_aneg for Marvell 88Q2220 devices and remove
the mv88q222x_config_aneg function which is basically a copy of the
mv88q2xxx_config_aneg function.
Dimitri Fedrau [Sun, 18 Feb 2024 07:57:47 +0000 (08:57 +0100)]
net: phy: marvell-88q2xxx: make mv88q2xxx_config_aneg generic
Marvell 88Q2xxx devices follow the same scheme, after configuration they
need a soft reset. Soft resets differ between devices, so we use the
.soft_reset callback instead of creating .config_aneg callbacks for each
device.
Dimitri Fedrau [Sun, 18 Feb 2024 07:57:43 +0000 (08:57 +0100)]
net: phy: marvell-88q2xxx: add interrupt support for link detection
Add .config_intr and .handle_interrupt callbacks. Whenever the link
goes up or down, an interrupt is triggered. Interrupts are configured
separately for 100BASE-T1 and 1000BASE-T1.
Dimitri Fedrau [Sun, 18 Feb 2024 07:57:42 +0000 (08:57 +0100)]
net: phy: marvell-88q2xxx: add driver for the Marvell 88Q2220 PHY
Add a driver for the Marvell 88Q2220. This driver can detect the
link, switch between 100BASE-T1 and 1000BASE-T1, and switch between
master and slave mode. Autonegotiation is supported.
Dimitri Fedrau [Sun, 18 Feb 2024 07:57:41 +0000 (08:57 +0100)]
net: phy: marvell-88q2xxx: fix typos
Rename mv88q2xxxx_get_sqi to mv88q2xxx_get_sqi and
mv88q2xxxx_get_sqi_max to mv88q2xxx_get_sqi_max.
Fix line breaks and consistently write hexadecimal numbers with
lowercase letters instead of mixing upper and lower case.
This patch series reworks the way that we manage the GENET MDIO
controller clocks around I/O accesses. During testing with a fully
modular build where bcmgenet, mdio-bcm-unimac, and the Broadcom PHY
driver (broadcom) are all loaded as modules, with no particular care
being taken to order them to minimize deferred probing, the following bus
error was obtained:
The issue here is that we managed to probe the GENET controller, the
mdio-bcm-unimac MDIO controller, but the PHY was still being held in a
probe deferral state because it depended upon a GPIO controller provider
not loaded yet. As soon as that provider is loaded however, the PHY
continues to probe, tries to disable the interrupts, and this causes a
MDIO transaction. That MDIO transaction requires I/O register accesses
within the GENET's larger block, and since its clocks are turned off,
the CPU gets a bus error signaled as a System Error.
The patch series takes the simplest approach of keeping the clocks
enabled just for the duration of the I/O accesses. This is also
beneficial to other drivers like bcmasp2 which make use of the same MDIO
controller driver.
Changes in v2:
- added missing ret assignment in the if (IS_ERR(priv->clk)) branch
- added Jacob's R-by tags
- corrected the commit ID being reverted in patch #3
====================
Florian Fainelli [Mon, 19 Feb 2024 20:40:53 +0000 (12:40 -0800)]
Revert "net: bcmgenet: Ensure MDIO unregistration has clocks enabled"
This reverts commit 1b5ea7ffb7a3bdfffb4b7f40ce0d20a3372ee405 ("net:
bcmgenet: Ensure MDIO unregistration has clocks enabled"). This is no
longer necessary now that the MDIO bus controller has a clock that it
can manage around the I/O accesses.
Florian Fainelli [Mon, 19 Feb 2024 20:40:52 +0000 (12:40 -0800)]
net: bcmgenet: Pass "main" clock down to the MDIO driver
GENET has historically had to create a MDIO platform device for its
controller and pass some auxiliary data to it, like a MDIO completion
callback. Now we also pass the "main" clock to allow for the MDIO bus
controller to manage that clock adequately around I/O accesses.
Florian Fainelli [Mon, 19 Feb 2024 20:40:51 +0000 (12:40 -0800)]
net: mdio: mdio-bcm-unimac: Manage clock around I/O accesses
Up until now we have managed not to have the mdio-bcm-unimac manage its
clock except during probe and suspend/resume. This works most of the
time, except where it does not.
With a fully modular build, we can get into a situation whereby the
GENET driver is fully registered, and so is the mdio-bcm-unimac driver,
however the Ethernet PHY driver is not yet, because it depends on a
resource that is not yet available (e.g.: GPIO provider). In that state,
the network device is not usable yet, and so to conserve power, the
GENET driver will have turned off its "main" clock which feeds its MDIO
controller.
When the PHY driver finally probes however, we make an access to the PHY
registers to e.g.: disable interrupts, and this causes a bus error
within the MDIO controller space because the MDIO controller clock(s)
are turned off.
To remedy that, we manage the clock around all of the I/O accesses to
the hardware which are done exclusively during read, write and clock
divider configuration.
This ensures that the register space is accessible, and this also
ensures that there are not unnecessarily elevated reference counts
keeping the clocks active when the network device is administratively
turned off. It would be the case with the previous way of managing the
clock.
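The clock handling described above can be modeled in userspace. This is a minimal sketch, not the driver code: Clock, MdioController and BusError are hypothetical names, and the register number is arbitrary. It only shows the pattern of holding a reference on the clock for the duration of each register access, so the reference count drops back to zero between accesses.

```python
from contextlib import contextmanager


class BusError(Exception):
    """Raised when a register is touched while its clock is gated."""


class Clock:
    """Reference-counted clock gate (hypothetical stand-in for a clk handle)."""

    def __init__(self):
        self.enable_count = 0

    def enable(self):
        self.enable_count += 1

    def disable(self):
        assert self.enable_count > 0, "unbalanced disable"
        self.enable_count -= 1

    @property
    def running(self):
        return self.enable_count > 0


class MdioController:
    def __init__(self, clk):
        self.clk = clk
        self.regs = {}

    @contextmanager
    def _clocked(self):
        # Hold the clock only for the duration of the register access.
        self.clk.enable()
        try:
            yield
        finally:
            self.clk.disable()

    def _raw_read(self, reg):
        if not self.clk.running:
            raise BusError(f"read of reg {reg:#x} with clock off")
        return self.regs.get(reg, 0)

    def _raw_write(self, reg, val):
        if not self.clk.running:
            raise BusError(f"write of reg {reg:#x} with clock off")
        self.regs[reg] = val

    def read(self, reg):
        with self._clocked():
            return self._raw_read(reg)

    def write(self, reg, val):
        with self._clocked():
            self._raw_write(reg, val)


clk = Clock()
mdio = MdioController(clk)
mdio.write(0x1B, 0xFFFF)   # e.g. a late PHY probe touching a register
value = mdio.read(0x1B)
```

Because every access takes and drops its own reference, no reference is left held once the access completes, which is the property the commit message calls out for the administratively-down case.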
David S. Miller [Wed, 21 Feb 2024 11:28:58 +0000 (11:28 +0000)]
Merge branch 'net-kmem-cache-create'
Kunwu Chan says:
====================
net: Use KMEM_CACHE instead of kmem_cache_create
As Jiri Pirko suggested, this patchset cleans up the same issues
across the 'net' module.
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
Some cache names are changed to match the struct names.
These changes are recorded in the changelog for easy reference.
This is harmless because the name is used in /proc/slabinfo to identify the cache.
---
Changes in v2:
- Delete a patch as Eric said in https://lore.kernel.org/all/CANn89iLkWvum6wSqSya_K+1eqnFvp=L2WLW=kAYrZTF8Ei4b7g@mail.gmail.com/
- No code changes, only add Reviewed-by tags
====================
Kunwu Chan [Tue, 20 Feb 2024 07:36:45 +0000 (15:36 +0800)]
ipv4: Simplify the allocation of slab caches in ip_rt_init
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'ip_dst_cache' to 'rtable'.
Kunwu Chan [Tue, 20 Feb 2024 07:36:44 +0000 (15:36 +0800)]
ipmr: Simplify the allocation of slab caches
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'ip_mrt_cache' to 'mfc_cache'.
Kunwu Chan [Tue, 20 Feb 2024 07:36:43 +0000 (15:36 +0800)]
ip6mr: Simplify the allocation of slab caches in ip6_mr_init
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'ip6_mrt_cache' to 'mfc6_cache'.
Kunwu Chan [Tue, 20 Feb 2024 07:36:42 +0000 (15:36 +0800)]
net: kcm: Simplify the allocation of slab caches
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'kcm_mux_cache' to 'kcm_mux',
'kcm_psock_cache' to 'kcm_psock'.
Breno Leitao [Mon, 19 Feb 2024 13:43:28 +0000 (05:43 -0800)]
net/dummy: Move stats allocation to core
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core instead
of this driver.
With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.
Move dummy driver to leverage the core allocation.
Heiner Kallweit [Sun, 18 Feb 2024 14:49:55 +0000 (15:49 +0100)]
tg3: copy only needed fields from userspace-provided EEE data
The current code overwrites fields in tp->eee with unchecked data from
edata, e.g. the bitmap with supported modes. ethtool properly returns
the received data from get_eee() call, but we have no guarantee that
other users of the ioctl set_eee() interface behave properly too.
Therefore copy only fields which are actually needed.
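As an illustration of the idea only, the following sketch copies a fixed list of fields from caller-provided data and leaves everything else in the driver state alone. EeeState and the USER_SETTABLE list are assumptions made for this example; they are not the tg3 structures, and the exact set of fields the patch copies is not stated here.

```python
from dataclasses import dataclass


@dataclass
class EeeState:
    supported: int = 0            # bitmap owned by the driver
    eee_enabled: bool = False
    tx_lpi_enabled: bool = False
    tx_lpi_timer: int = 0


# Fields the caller may change (assumed list, for illustration only).
USER_SETTABLE = ("eee_enabled", "tx_lpi_enabled", "tx_lpi_timer")


def set_eee(driver_state, user_data):
    """Copy only the allowed fields; never overwrite driver-owned ones."""
    for name in USER_SETTABLE:
        setattr(driver_state, name, getattr(user_data, name))
    return driver_state


driver = EeeState(supported=0b1100)
# A misbehaving caller passes a bogus 'supported' bitmap.
user = EeeState(supported=0xFFFF, eee_enabled=True, tx_lpi_timer=500)
set_eee(driver, user)
```

After the call, driver.supported still holds the driver's own bitmap, while the three settable fields reflect the request.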
This is a simple and straightforward cleanup series that makes all device
types in the net subsystem constants. This has been possible since 2011 [1]
but not all occurrences were cleaned. I have been sweeping the tree to fix
them all.
I was not sure if I should send these squashed, but there are quite a few
changes so I decided to send them separately. Please let me know if that is
not desirable.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the hso_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
net: wwan: core: constify the struct device_type usage
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the wwan_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
net: netdevsim: constify the struct device_type usage
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the
nsim_bus_dev_type variable to be a constant structure as well, placing it
into read-only memory which can not be modified at runtime.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the vlan_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the l2tpeth_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the hsr_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
net: geneve: constify the struct device_type usage
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the geneve_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the ppp_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the vxlan_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
net: bridge: constify the struct device_type usage
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the br_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the dsa_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.
net: usbnet: constify the struct device_type usage
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the wlan_type
and wwan_type variables to be constant structures as well, placing them into
read-only memory which can not be modified at runtime.
net: wan: framer: constify of_phandle_args in xlate
The xlate callbacks are supposed to translate of_phandle_args to proper
provider without modifying the of_phandle_args. Make the argument
pointer to const for code safety and readability.
This is a pull request of 9 patches for net-next/master.
The first patch is by Francesco Dolcini and removes a redundant check
for pm_clock_support from the m_can driver.
Martin Hundebøll contributes 3 patches to the m_can/tcan4x5x driver to
allow resume upon RX of a CAN frame.
3 patches by Srinivas Goud add support for ECC statistics to the
xilinx_can driver.
The last 2 patches are by Oliver Hartkopp and me, target the CAN RAW
protocol and fix an error in the getsockopt() for CAN-XL introduced in
the previous pull request to net-next (linux-can-next-for-6.9-20240213).
* tag 'linux-can-next-for-6.9-20240220' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: raw: raw_getsockopt(): reduce scope of err
can: raw: fix getsockopt() for new CAN_RAW_XL_VCID_OPTS
can: xilinx_can: Add ethtool stats interface for ECC errors
can: xilinx_can: Add ECC support
dt-bindings: can: xilinx_can: Add 'xlnx,has-ecc' optional property
can: tcan4x5x: support resuming from rx interrupt signal
can: m_can: allow keeping the transceiver running in suspend
dt-bindings: can: tcan4x5x: Document the wakeup-source flag
can: m_can: remove redundant check for pm_clock_support
====================
Florian Westphal [Fri, 16 Feb 2024 11:36:57 +0000 (12:36 +0100)]
net: skbuff: add overflow debug check to pull/push helpers
syzbot managed to trigger following splat:
BUG: KASAN: use-after-free in __skb_flow_dissect+0x4a3b/0x5e50
Read of size 1 at addr ffff888208a4000e by task a.out/2313
[..]
__skb_flow_dissect+0x4a3b/0x5e50
__skb_get_hash+0xb4/0x400
ip_tunnel_xmit+0x77e/0x26f0
ipip_tunnel_xmit+0x298/0x410
..
Analysis shows that the skb has a valid ->head, but bogus ->data
pointer.
skb->data gets its bogus value via the neigh layer, which does:
1556 __skb_pull(skb, skb_network_offset(skb));
... and the skb was already dodgy at this point:
skb_network_offset(skb) returns a negative value due to an
earlier overflow of skb->network_header (u16). __skb_pull thus
"adjusts" skb->data by a huge offset, pointing outside skb->head
area.
Allow debug builds to splat when we try to pull/push more than
INT_MAX bytes.
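The arithmetic behind this can be reproduced with plain integers. The sketch below is a userspace model under stated assumptions: offsets are relative to skb->head, the helper names are made up for the example, and the concrete numbers are arbitrary. It shows how a 16-bit wrap of the header offset turns into a negative network offset, and how that becomes a length above INT_MAX once it is treated as unsigned.

```python
INT_MAX = 2**31 - 1
U16_MASK = 0xFFFF
U32_MASK = 0xFFFFFFFF


def store_u16(value):
    """Model storing an offset into a 16-bit field: the high bits are lost."""
    return value & U16_MASK


def network_offset(network_header, data_offset):
    """Signed distance from the data pointer to the network header."""
    return network_header - data_offset


def as_unsigned_len(offset):
    """Model passing a signed int where an unsigned 32-bit length is expected."""
    return offset & U32_MASK


def debug_pull_check(length):
    """Debug-build style check: accept only lengths up to INT_MAX."""
    return length <= INT_MAX


data_offset = 200                          # data pointer, relative to head
network_header = store_u16(65536 + 100)    # intended 65636, stored as 100
offset = network_offset(network_header, data_offset)   # 100 - 200 = -100
pull_len = as_unsigned_len(offset)         # 4294967196
```

Here debug_pull_check(pull_len) is False, so a debug build would complain at the pull itself rather than later, when the bogus data pointer is dereferenced.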
After this, the syzkaller reproducer yields a more precise splat
before the flow dissector attempts to read off skb->data memory:
Eric Dumazet [Fri, 16 Feb 2024 16:20:06 +0000 (16:20 +0000)]
net: reorganize "struct sock" fields
Last major reorg happened in commit 9115e8cd2a0c ("net: reorganize
struct sock for better data locality")
Since then, many changes have been done.
Before SO_PEEK_OFF support is added to TCP, we need
to move sk_peek_off to a better location.
It is time to make another pass, and add six groups,
without explicit alignment.
- sock_write_rx (following sk_refcnt) read-write fields in rx path.
- sock_read_rx read-mostly fields in rx path.
- sock_read_rxtx read-mostly fields in both rx and tx paths.
- sock_write_rxtx read-write fields in both rx and tx paths.
- sock_write_tx read-write fields in tx paths.
- sock_read_tx read-mostly fields in tx paths.
Results on TCP_RR benchmarks seem to show a gain (4 to 5 %).
It is possible UDP needs a change, because sk_peek_off
shares a cache line with sk_receive_queue.
If this is the case, we can exchange roles of sk->sk_receive_queue
and up->reader_queue queues.
Colin Ian King [Fri, 16 Feb 2024 12:54:43 +0000 (12:54 +0000)]
net: tcp: Remove redundant initialization of variable len
The variable len is initialized with a value that is never read; an
if statement assigns it in both of its branches.
The initialization is redundant and can be removed.
Cleans up clang scan build warning:
net/ipv4/tcp_ao.c:512:11: warning: Value stored to 'len' during its
initialization is never read [deadcode.DeadStores]
Reduce the scope of the variable "err" to the individual cases. This
is to avoid the mistake of setting "err" in the mistaken belief that
it will be evaluated later.
Currently these components in the net stack use the struct page
directly:
1. Drivers.
2. Page pool.
3. skb_frag_t.
To add support for new (non struct page) memory types to the net stack, we
must first abstract the current memory type.
Originally the plan was to reuse struct page* for the new memory types,
and to set the LSB on the page* to indicate it's not really a page.
However, for safe compiler type checking we need to introduce a new type.
struct netmem is introduced to abstract the underlying memory type.
Currently it's a no-op abstraction that is always a struct page underneath.
In parallel there is an ongoing effort to add support for devmem to the
net stack:
Mina Almasry [Wed, 14 Feb 2024 22:34:03 +0000 (14:34 -0800)]
net: add netmem to skb_frag_t
Use struct netmem* instead of page in skb_frag_t. Currently struct
netmem* is always a struct page underneath, but the abstraction
allows efforts to add support for skb frags not backed by pages.
There is unfortunately one instance, in kcm, where the skb_frag_t is
assumed to be exactly a bio_vec. For this case, WARN_ON_ONCE and return
an error before doing the cast.
Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so
that the API can be used to create netmem skbs.
Mina Almasry [Wed, 14 Feb 2024 22:34:02 +0000 (14:34 -0800)]
net: introduce abstraction for network memory
Add the netmem_ref type, an abstraction for network memory.
To add support for new memory types to the net stack, we must first
abstract the current memory type. Currently parts of the net stack
use struct page directly:
- page_pool
- drivers
- skb_frag_t
Originally the plan was to reuse struct page* for the new memory types,
and to set the LSB on the page* to indicate it's not really a page.
However, for compiler type checking we need to introduce a new type.
netmem_ref is introduced to abstract the underlying memory type.
Currently it's a no-op abstraction that is always a struct page
underneath. In parallel there is an ongoing effort to add support
for devmem to the net stack:
netmem_ref can be pointers to different underlying memory types, and the
low bits are set to indicate the memory type. Helpers are provided
to convert netmem pointers to the underlying memory type (currently only
struct page). In the devmem series helpers are provided so that calling
code can use netmem without worrying about the underlying memory type
unless absolutely necessary.
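The low-bit tagging scheme can be sketched with integers standing in for pointers. This is an illustration only: the mask, the type values and the helper names are assumptions for the example and are not the kernel's netmem helpers.

```python
TYPE_MASK = 0x1     # low bit reserved as a type tag (assumed layout)
TYPE_PAGE = 0x0
TYPE_OTHER = 0x1    # hypothetical non-page memory type


def make_ref(addr, mem_type):
    """Pack an aligned address and a type tag into one integer."""
    # Pointers to aligned objects have zero low bits, so a tag fits there.
    if addr & TYPE_MASK:
        raise ValueError("address must be aligned")
    return addr | mem_type


def ref_type(ref):
    """Return the memory type encoded in the low bit."""
    return ref & TYPE_MASK


def ref_to_page_addr(ref):
    """Convert back to a page address, refusing non-page references."""
    if ref_type(ref) != TYPE_PAGE:
        raise TypeError("reference is not backed by a page")
    return ref & ~TYPE_MASK


page_ref = make_ref(0x1000, TYPE_PAGE)
other_ref = make_ref(0x2000, TYPE_OTHER)
```

Calling code that only passes the reference around never needs to look at the tag; only the conversion helper does, which mirrors the intent described above.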
Breno Leitao [Fri, 16 Feb 2024 09:41:52 +0000 (01:41 -0800)]
net: sysfs: Do not create sysfs for non BQL device
Creation of sysfs entries is expensive, mainly for workloads that
constantly create netdevs and netns.
Do not create BQL sysfs entries for devices that don't need them,
basically those that do not have a real queue, i.e., devices that have
NETIF_F_LLTX and IFF_NO_QUEUE, such as the `lo` interface.
This will remove the /sys/class/net/eth0/queues/tx-X/byte_queue_limits/
directory for these devices.
In the example below, eth0 has the `byte_queue_limits` directory but not
`lo`.
# ls /sys/class/net/lo/queues/tx-0/
traffic_class tx_maxrate tx_timeout xps_cpus xps_rxqs
# ls /sys/class/net/eth0/queues/tx-0/byte_queue_limits/
hold_time inflight limit limit_max limit_min
This also removes the #ifdefs, since we can also use netdev_uses_bql() to
check if the config is enabled. (as suggested by Jakub).
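A rough model of the resulting behaviour follows. The flag bit values are placeholders, the predicate's exact condition is an assumption (this sketch treats either flag as sufficient to skip BQL), and tx_queue_sysfs_entries is a hypothetical helper that just lists the entry names from the example above.

```python
# Placeholder bit values; the kernel defines its own.
NETIF_F_LLTX = 1 << 0
IFF_NO_QUEUE = 1 << 1


def netdev_uses_bql(features, priv_flags, bql_configured=True):
    """Model of the predicate: queue-less devices never use BQL."""
    # Assumption of this sketch: either flag alone disables BQL.
    if features & NETIF_F_LLTX or priv_flags & IFF_NO_QUEUE:
        return False
    # Folding the config check in here is what makes #ifdefs unnecessary.
    return bql_configured


def tx_queue_sysfs_entries(features, priv_flags):
    """List the per-queue entries that would be created for a device."""
    entries = ["traffic_class", "tx_maxrate", "tx_timeout",
               "xps_cpus", "xps_rxqs"]
    if netdev_uses_bql(features, priv_flags):
        entries.append("byte_queue_limits")
    return entries


lo_entries = tx_queue_sysfs_entries(NETIF_F_LLTX, IFF_NO_QUEUE)
eth0_entries = tx_queue_sysfs_entries(0, 0)
```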
Lorenzo Bianconi [Fri, 16 Feb 2024 09:25:43 +0000 (10:25 +0100)]
net: page_pool: fix recycle stats for system page_pool allocator
Use global percpu page_pool_recycle_stats counter for system page_pool
allocator instead of allocating a separate percpu variable for each
(also percpu) page pool instance.
page_pool: disable direct recycling based on pool->cpuid on destroy
Now that direct recycling is performed based on pool->cpuid when set,
memory leaks are possible:
1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool->cpuid is still set.
4. napi_pp_put_page() does direct recycling based on pool->cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.
In order to avoid that, rewrite pool->cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool->cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -> page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.
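The leak sequence and the effect of resetting pool->cpuid can be replayed with a toy model. Pool, put_page and destroy below are simplified stand-ins written for this example, not the page_pool implementation; the 'freed' list stands for pages that took the normal release path.

```python
class Pool:
    def __init__(self, cpuid):
        self.cpuid = cpuid
        self.alloc_cache = []

    def put_page(self, page, current_cpu, freed):
        # Direct recycling: only when the caller runs on the pool's CPU.
        if self.cpuid == current_cpu:
            self.alloc_cache.append(page)
        else:
            freed.append(page)       # normal release path

    def destroy(self, freed, disable_direct_recycling):
        if disable_direct_recycling:
            self.cpuid = -1          # the fix: no CPU can match any more
        # The alloc cache is emptied only once, here.
        freed.extend(self.alloc_cache)
        self.alloc_cache.clear()


def leaked_pages(disable_direct_recycling):
    """Replay steps 1-5: destroy the pool, then recycle a late page."""
    freed = []
    pool = Pool(cpuid=0)
    pool.destroy(freed, disable_direct_recycling)
    pool.put_page("late-page", current_cpu=0, freed=freed)
    return list(pool.alloc_cache)


without_fix = leaked_pages(disable_direct_recycling=False)
with_fix = leaked_pages(disable_direct_recycling=True)
```

Without the reset, the late page lands in a cache nobody will drain again; with it, the page takes the normal release path.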
Ajay Singh [Thu, 15 Feb 2024 15:36:21 +0000 (16:36 +0100)]
wifi: wilc1000: add missing read critical sections around vif list traversal
Some code manipulating the vif list is still missing some srcu_read_lock /
srcu_read_unlock, and so can trigger RCU warnings:
=============================
WARNING: suspicious RCU usage
6.8.0-rc1+ #37 Not tainted
-----------------------------
drivers/net/wireless/microchip/wilc1000/hif.c:110 RCU-list traversed without holding the required lock!!
[...]
stack backtrace:
CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 6.8.0-rc1+ #37
Hardware name: Atmel SAMA5
Workqueue: events sdio_irq_work
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x34/0x58
dump_stack_lvl from wilc_get_vif_from_idx+0x158/0x180
wilc_get_vif_from_idx from wilc_network_info_received+0x80/0x48c
wilc_network_info_received from wilc_handle_isr+0xa10/0xd30
wilc_handle_isr from wilc_sdio_interrupt+0x44/0x58
wilc_sdio_interrupt from process_sdio_pending_irqs+0x1c8/0x60c
process_sdio_pending_irqs from sdio_irq_work+0x6c/0x14c
sdio_irq_work from process_one_work+0x8d4/0x169c
process_one_work from worker_thread+0x8cc/0x1340
worker_thread from kthread+0x448/0x510
kthread from ret_from_fork+0x14/0x28
Fix those warnings by adding the needed lock around the corresponding
critical sections.
wifi: wilc1000: use SRCU instead of RCU for vif list traversal
Enabling CONFIG_PROVE_RCU_LIST raises many warnings in the wilc driver, even
in some places already protected by a read critical section. An example of
such a case is in wilc_get_available_idx:
=============================
WARNING: suspicious RCU usage
6.8.0-rc1+ #32 Not tainted
-----------------------------
drivers/net/wireless/microchip/wilc1000/netdev.c:944 RCU-list traversed in non-reader section!!
[...]
stack backtrace:
CPU: 0 PID: 26 Comm: kworker/0:3 Not tainted 6.8.0-rc1+ #32
Hardware name: Atmel SAMA5
Workqueue: events_freezable mmc_rescan
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x34/0x58
dump_stack_lvl from wilc_netdev_ifc_init+0x788/0x8ec
wilc_netdev_ifc_init from wilc_cfg80211_init+0x690/0x910
wilc_cfg80211_init from wilc_sdio_probe+0x168/0x490
wilc_sdio_probe from sdio_bus_probe+0x230/0x3f4
sdio_bus_probe from really_probe+0x270/0xdf4
really_probe from __driver_probe_device+0x1dc/0x580
__driver_probe_device from driver_probe_device+0x60/0x140
driver_probe_device from __device_attach_driver+0x268/0x364
__device_attach_driver from bus_for_each_drv+0x15c/0x1cc
bus_for_each_drv from __device_attach+0x1ec/0x3e8
__device_attach from bus_probe_device+0x190/0x1c0
bus_probe_device from device_add+0x10dc/0x18e4
device_add from sdio_add_func+0x1c0/0x2c0
sdio_add_func from mmc_attach_sdio+0xa08/0xe1c
mmc_attach_sdio from mmc_rescan+0xa00/0xfe0
mmc_rescan from process_one_work+0x8d4/0x169c
process_one_work from worker_thread+0x8cc/0x1340
worker_thread from kthread+0x448/0x510
kthread from ret_from_fork+0x14/0x28
This warning is due to the section being protected by an SRCU read critical
section, while the list traversal is done with the classic RCU API. Fix the
warning by using the corresponding SRCU read lock/unlock APIs. While doing so,
since we always manipulate the same list (managed through a pointer
embedded in struct wilc), add a macro to reduce the corresponding
boilerplate in each call site.
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:41 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add helper of set_channel
Reset the hardware state to prevent the hardware from staying in an abnormal
state while setting the channel. Besides, add preparation for MLO/DBCC before
setting the channel, and reconfigure registers after that.
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:40 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add set_channel RF part
Configure RF registers according to band, channel and bandwidth. Since this
chip will support MLO, it needs to check the operating mode to decide which
paths we are going to configure.
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:38 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add set_channel MAC part
To set the channel, add a function to get TXSB (TX subband), which is the
hardware index that indicates the primary channel. Then, configure band,
channel, bandwidth and TXSB via registers.
Kees Cook [Fri, 16 Feb 2024 23:27:44 +0000 (15:27 -0800)]
net: sched: Annotate struct tc_pedit with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct tc_pedit.
Additionally, since the element count member must be set before accessing
the annotated flexible array member, move its initialization earlier.
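Why the count has to be assigned first can be seen in a small model where every store is checked against the current value of the count member. CountedArray and its field names are hypothetical and only mimic the run-time check; they are not the tc_pedit layout.

```python
class CountedArray:
    """Array whose bound is read from a count member at access time."""

    def __init__(self, capacity):
        self.count = 0                    # plays the role of the counted_by member
        self._items = [0] * capacity      # backing storage

    def store(self, index, value):
        # Bounds check against the *current* count, not the capacity.
        if not 0 <= index < self.count:
            raise IndexError(
                f"index {index} out of bounds for count {self.count}")
        self._items[index] = value

    def items(self):
        return self._items[:self.count]


def fill_count_first(n):
    arr = CountedArray(n)
    arr.count = n             # initialize the count before any access
    for i in range(n):
        arr.store(i, i * 10)
    return arr


def fill_count_late(n):
    arr = CountedArray(n)
    for i in range(n):
        arr.store(i, i * 10)  # count is still 0: the checked store fails
    arr.count = n
    return arr
```

fill_count_first(3) succeeds, while fill_count_late(3) trips the check on the very first store even though the storage is large enough.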
Shannon Nelson [Fri, 16 Feb 2024 22:29:51 +0000 (14:29 -0800)]
pds_core: delete VF dev on reset
When the VF is hit with a reset, remove the aux device in
the prepare for reset and try to restore it after the reset.
The userland mechanics will need to recover and rebuild whatever
uses the device afterwards.
David S. Miller [Mon, 19 Feb 2024 10:20:39 +0000 (10:20 +0000)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
i40e: Simplify VSI and VEB handling
Ivan Vecera says:
The series simplifies handling of VSIs and VEBs by introducing for-each
iterating macros and 'find' helper functions. It also removes the VEB
recursion, because VEBs cannot have sub-VEBs according to the datasheet,
and fixes the support for floating VEBs.
The series content:
Patch 1 - Uses existing helper function to find the FDIR VSI instead of a loop
Patch 2 - Adds and uses macros to iterate VSI and VEB arrays
Patch 3 - Adds 2 helper functions to find VSIs and VEBs by their SEID
Patch 4 - Fixes broken support for floating VEBs
Patch 5 - Removes VEB recursion and simplifies VEB handling
====================
If a message contains an unknown attribute and the user passes the
"--process-unknown" command line option, _decode() gets called with the space
arg set to None. In that case, the attr_space variable is used without
being initialized, which leads to the following trace:
Traceback (most recent call last):
File "./tools/net/ynl/cli.py", line 77, in <module>
main()
File "./tools/net/ynl/cli.py", line 68, in main
reply = ynl.dump(args.dump, attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "tools/net/ynl/lib/ynl.py", line 909, in dump
return self._op(method, vals, [], dump=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "tools/net/ynl/lib/ynl.py", line 894, in _op
rsp_msg = self._decode(decoded.raw_attrs, op.attr_set.name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "tools/net/ynl/lib/ynl.py", line 639, in _decode
self._rsp_add(rsp, attr_name, None, self._decode_unknown(attr))
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "tools/net/ynl/lib/ynl.py", line 569, in _decode_unknown
return self._decode(NlAttrs(attr.raw), None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "tools/net/ynl/lib/ynl.py", line 630, in _decode
search_attrs = SpaceAttrs(attr_space, rsp, outer_attrs)
^^^^^^^^^^
UnboundLocalError: cannot access local variable 'attr_space' where it is not associated with a value
Fix this by moving search_attrs assignment under the if statement
above it to make sure attr_space is initialized.
Fixes: bf8b832374fb ("tools/net/ynl: Support sub-messages in nested attribute spaces")
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
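The failure mode in the ynl change above is a general Python pattern and can be reproduced in a few lines. The two functions below are simplified stand-ins written for this note, not the ynl code; the tuple stands in for the real search object, and defaulting search_attrs to None is a choice of this sketch.

```python
def decode_buggy(attrs, space, attr_sets):
    if space:
        attr_space = attr_sets[space]
    # Bug pattern: attr_space is unbound here when space is None.
    search_attrs = (attr_space, attrs)
    return {"space": space, "search": search_attrs}


def decode_fixed(attrs, space, attr_sets):
    search_attrs = None
    if space:
        attr_space = attr_sets[space]
        # Build the search object only where attr_space is known to exist.
        search_attrs = (attr_space, attrs)
    return {"space": space, "search": search_attrs}


attr_sets = {"demo": {"a": 1}}       # hypothetical attribute set table
known = decode_fixed(["x"], "demo", attr_sets)
unknown = decode_fixed(["x"], None, attr_sets)
```

Calling decode_buggy(["x"], None, attr_sets) raises UnboundLocalError, matching the last line of the trace, while decode_fixed handles both cases.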
Kamal Heib [Thu, 15 Feb 2024 22:31:04 +0000 (17:31 -0500)]
net: ena: Remove ena_select_queue
Avoid the following warnings by removing the ena_select_queue() function
and relying on the net core to do the queue selection. The issue happens
when an skb received from an interface with more queues than ena is
forwarded to the ena interface.
[ 1176.159959] eth0 selects TX queue 11, but real number of TX queues is 8
[ 1176.863976] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1180.767877] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1188.703742] eth0 selects TX queue 14, but real number of TX queues is 8
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Kamal Heib <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
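The numbers in the ena warnings above can be replayed with a small model. Both functions are hypothetical: the first mimics a driver callback that reuses the recorded RX queue index unchecked, the second shows one way a generic selector could fold the index back into range. The exact logic used by the net core is not reproduced here.

```python
def driver_select_queue(rx_queue):
    """Bug pattern: reuse the recorded RX queue index without bounding it."""
    return rx_queue


def bounded_select_queue(rx_queue, real_num_tx_queues):
    """Fold an out-of-range index back into [0, real_num_tx_queues)."""
    queue = rx_queue
    while queue >= real_num_tx_queues:
        queue -= real_num_tx_queues
    return queue


REAL_NUM_TX_QUEUES = 8
# RX queue indexes recorded by an interface with more queues.
recorded = [11, 14, 5]
unbounded = [driver_select_queue(q) for q in recorded]
bounded = [bounded_select_queue(q, REAL_NUM_TX_QUEUES) for q in recorded]
```

The unbounded results 11 and 14 are exactly the values reported in the log lines, while every bounded result is below 8.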
Heiner Kallweit [Wed, 14 Feb 2024 20:17:11 +0000 (21:17 +0100)]
net: phy: add PHY_EEE_CAP2_FEATURES
As a prerequisite for adding EEE CAP2 register support, complement
PHY_EEE_CAP1_FEATURES with PHY_EEE_CAP2_FEATURES.
For now only 2500baseT and 5000baseT modes are supported.
Ivan Vecera [Fri, 24 Nov 2023 15:03:43 +0000 (16:03 +0100)]
i40e: Remove VEB recursion
The VEB (virtual embedded switch) as a switch element can be
connected, according to the datasheet, through its uplink to:
- Physical port
- Port Virtualizer (not used directly by i40e driver but can
be present in MFP mode where the physical port is shared
between PFs)
- No uplink (aka floating VEB)
But VEB uplink cannot be connected to another VEB and any attempt
to do so results in:
that indicates "the uplink SEID does not point to valid element".
Remove this logic from the driver code this way:
1) For debugfs only allow to build floating VEB (uplink_seid == 0)
or main VEB (uplink_seid == mac_seid)
2) Do not recurse in i40e_veb_link_event() as a VEB cannot have
sub-VEBs
3) Ditto for i40e_veb_rebuild() + simplify the function as we know
that the VEB for rebuild can be only the main LAN VEB or some
of the floating VEBs
4) In i40e_rebuild() there is no need to check veb->uplink_seid
as the possible ones are 0 and MAC SEID
5) In i40e_vsi_release() do not take into account VEBs whose
uplink is another VEB as this is not possible
6) Remove veb_idx field from i40e_veb as a VEB cannot have
sub-VEBs
Ivan Vecera [Fri, 24 Nov 2023 15:03:42 +0000 (16:03 +0100)]
i40e: Fix broken support for floating VEBs
Although the i40e supports so-called floating VEB (VEB without
an uplink connection to external network), this support is
broken. This functionality is currently unused (except debugfs)
but it will be used by subsequent series for switchdev mode
slow-path. Fix this as follows:
1) Handle correctly floating VEB (VEB with uplink_seid == 0)
in i40e_reconstitute_veb() and look for owner VSI and
create it only for non-floating VEBs and also set bridge
mode only for such VEBs as the floating ones are using
always VEB mode.
2) Handle correctly floating VEB in i40e_veb_release() and
disallow its release when there are some VSIs. This is
different from regular VEB that have owner VSI that is
connected to VEB's uplink after VEB deletion by FW.
3) Fix i40e_add_veb() to handle 'vsi' that is NULL for floating
VEBs. For floating VEB use 0 for downlink SEID and 'true'
for 'default_port' parameters as per datasheet.
4) Fix 'add relay' command in i40e_dbg_command_write() to allow
to create floating VEB by 'add relay 0 0' or 'add relay'
Ivan Vecera [Fri, 24 Nov 2023 15:03:41 +0000 (16:03 +0100)]
i40e: Add helpers to find VSI and VEB by SEID and use them
Add two helpers i40e_(veb|vsi)_get_by_seid() to find corresponding
VEB or VSI by their SEID value and use these helpers to replace
existing open-coded loops.
ECC is an IP configuration option where counter registers are added in
IP for 1bit/2bit ECC errors count and reset.
Also driver reports 1bit/2bit ECC errors for FIFOs based on ECC error
interrupts.
Add xlnx,has-ecc optional property for Xilinx AXI CAN controller
to support ECC if the ECC block is enabled in the HW.
Add ethtool stats interface for getting all the ECC errors information.
There is no public documentation for it available.
Changes in v8:
- Use u64_stats_sync instead of spinlock
- Renamed stats strings: use "_" instead of "-"
- Renamed stats strings: add "_errors" trailer
- Renamed stats variables similar to stats strings
Changes in v7:
- Update with spinlock only for stats counters
Changes in v6:
- Update commit description
Changes in v5:
- Fix review comments
- Change the sequence of updates the stats
- Add get_strings and get_sset_count stats interface
- Use u64 stats helper function
Srinivas Goud [Tue, 13 Feb 2024 10:36:44 +0000 (11:36 +0100)]
can: xilinx_can: Add ECC support
Add ECC support for the Xilinx CAN Controller, so that this driver reports
1bit/2bit ECC errors for FIFOs based on the ECC error interrupt. The ECC
feature of the Xilinx CAN Controller is selected through the 'xlnx,has-ecc'
DT property.
Add Aquantia AQR113 PHY ID. Aquantia AQR113 is just a chip size variant of
the already supported AQR133C where the only difference is the PHY ID
and the hw chip size.
David S. Miller [Fri, 16 Feb 2024 08:48:09 +0000 (08:48 +0000)]
Merge branch 'ionic-xdp-support'
Shannon Nelson says:
====================
ionic: add XDP support
This patchset is new support in ionic for XDP processing,
including basic XDP on Rx packets, TX and REDIRECT, and frags
for jumbo frames.
Since ionic has not yet been converted to use the page_pool APIs,
this uses the simple MEM_TYPE_PAGE_ORDER0 buffering. There are plans
to convert the driver in the near future.
v4:
- removed "inline" from short utility functions
- changed to use "goto err_out" in ionic_xdp_register_rxq_info()
- added "continue" to reduce nesting in ionic_xdp_queues_config()
- used xdp_prog in ionic_rx_clean() to flag whether or not to sync
the rx buffer after calling ionic_xdp_run()
- swapped order of XDP_TX and XDP_REDIRECT cases in ionic_xdp_run()
to make patch 6 a little cleaner
v2:
https://lore.kernel.org/netdev/20240208005725[email protected]/
- added calls to txq_trans_cond_update()
- added a new patch to catch NAPI budget==0