====================
net: ndo_fdb_add/del: Have drivers report whether they notified
Currently when FDB entries are added to or deleted from a VXLAN netdevice,
the VXLAN driver emits one notification, including the VXLAN-specific
attributes. The core however always sends a notification as well, a generic
one. Thus two notifications are unnecessarily sent for these operations. A
similar situation comes up with bridge driver, which also emits
notifications on its own.
# ip link add name vx type vxlan id 1000 dstport 4789
# bridge monitor fdb &
[1] 1981693
# bridge fdb add de:ad:be:ef:13:37 dev vx self dst 192.0.2.1
de:ad:be:ef:13:37 dev vx dst 192.0.2.1 self permanent
de:ad:be:ef:13:37 dev vx self permanent
In order to prevent this duplicity, add a parameter, bool *notified, to
ndo_fdb_add and ndo_fdb_del. The flag is primed to false, and if the callee
sends a notification on its own, it sets the flag to true, thus informing
the core that it should not generate another notification.
Patches #1 to #2 are concerned with the above.
In the remaining patches, #3 to #7, add a selftest. This takes place across
several patches. Many of the helpers we would like to use for the test are
in forwarding/lib.sh, whereas net/ is a more suitable place for the test,
so the libraries need to be massaged a bit first.
====================
Petr Machata [Thu, 14 Nov 2024 14:09:59 +0000 (15:09 +0100)]
selftests: net: fdb_notify: Add a test for FDB notifications
Check that only one notification is produced for various FDB edit
operations.
Regarding the ip_link_add() and ip_link_master() helpers. This pattern of
action plus corresponding defer is bound to come up often, and a dedicated
vocabulary to capture it will be handy. tunnel_create() and vlan_create()
from forwarding/lib.sh are somewhat opaque and perhaps too kitchen-sinky,
so I tried to go in the opposite direction with these ones, and wrapped
only the bare minimum to schedule a corresponding cleanup.
Petr Machata [Thu, 14 Nov 2024 14:09:58 +0000 (15:09 +0100)]
selftests: net: lib: Add kill_process
A number of selftests run processes in the background and need to kill them
afterwards. Instead for everyone to open-code the kill / wait / redirect
mantra, add a helper in net/lib.sh. Convert existing open-code sites.
Petr Machata [Thu, 14 Nov 2024 14:09:57 +0000 (15:09 +0100)]
selftests: net: lib: Move checks from forwarding/lib.sh here
For logging to be useful, something has to set RET and retmsg by calling
ret_set_ksft_status(). There is a suite of functions to that end in
forwarding/lib: check_err, check_fail et.al. Move them to net/lib.sh so
that every net test can use them.
Existing lib.sh users might be using these same names for their functions.
However lib.sh is always sourced near the top of the file (checked), and
whatever new definitions will simply override the ones provided by lib.sh.
Petr Machata [Thu, 14 Nov 2024 14:09:56 +0000 (15:09 +0100)]
selftests: net: lib: Move tests_run from forwarding/lib.sh here
It would be good to use the same mechanism for scheduling and dispatching
general net tests as the many forwarding tests already use. To that end,
move the logging helpers to net/lib.sh so that every net test can use them.
Existing lib.sh users might be using the name themselves. However lib.sh is
always sourced near the top of the file (checked), and whatever new
definition will simply override the one provided by lib.sh.
Petr Machata [Thu, 14 Nov 2024 14:09:55 +0000 (15:09 +0100)]
selftests: net: lib: Move logging from forwarding/lib.sh here
Many net selftests invent their own logging helpers. These really should be
in a library sourced by these tests. Currently forwarding/lib.sh has a
suite of perfectly fine logging helpers, but sourcing a forwarding/ library
from a higher-level directory smells of layering violation. In this patch,
move the logging helpers to net/lib.sh so that every net test can use them.
Together with the logging helpers, it's also necessary to move
pause_on_fail(), and EXIT_STATUS and RET.
Existing lib.sh users might be using these same names for their functions
or variables. However lib.sh is always sourced near the top of the
file (checked), and whatever new definitions will simply override the ones
provided by lib.sh.
Petr Machata [Thu, 14 Nov 2024 14:09:54 +0000 (15:09 +0100)]
ndo_fdb_del: Add a parameter to report whether notification was sent
In a similar fashion to ndo_fdb_add, which was covered in the previous
patch, add the bool *notified argument to ndo_fdb_del. Callees that send a
notification on their own set the flag to true.
Petr Machata [Thu, 14 Nov 2024 14:09:53 +0000 (15:09 +0100)]
ndo_fdb_add: Add a parameter to report whether notification was sent
Currently when FDB entries are added to or deleted from a VXLAN netdevice,
the VXLAN driver emits one notification, including the VXLAN-specific
attributes. The core however always sends a notification as well, a generic
one. Thus two notifications are unnecessarily sent for these operations. A
similar situation comes up with bridge driver, which also emits
notifications on its own:
# ip link add name vx type vxlan id 1000 dstport 4789
# bridge monitor fdb &
[1] 1981693
# bridge fdb add de:ad:be:ef:13:37 dev vx self dst 192.0.2.1
de:ad:be:ef:13:37 dev vx dst 192.0.2.1 self permanent
de:ad:be:ef:13:37 dev vx self permanent
In order to prevent this duplicity, add a paremeter to ndo_fdb_add,
bool *notified. The flag is primed to false, and if the callee sends a
notification on its own, it sets it to true, thus informing the core that
it should not generate another notification.
====================
net: netpoll: Improve SKB pool management
The netpoll subsystem pre-allocates 32 SKBs in a pool for emergency use
during out-of-memory conditions. However, the current implementation has
several inefficiencies:
* The SKB pool, once allocated, is never freed:
* Resources remain allocated even after netpoll users are removed
* Failed initialization can leave pool populated forever
* The global pool design makes resource tracking difficult
This series addresses these issues through three patches:
Patch 1 ("net: netpoll: Individualize the skb pool"):
- Replace global pool with per-user pools in netpoll struct
Patch 2 ("net: netpoll: flush skb pool during cleanup"):
- Properly free pool resources during netconsole cleanup
These changes improve resource management and make the code more
maintainable. As a side benefit, the improved structure would allow
netpoll to be modularized if desired in the future.
Breno Leitao [Thu, 14 Nov 2024 11:00:12 +0000 (03:00 -0800)]
net: netpoll: flush skb pool during cleanup
The netpoll subsystem maintains a pool of 32 pre-allocated SKBs per
instance, but these SKBs are not freed when the netpoll user is brought
down. This leads to memory waste as these buffers remain allocated but
unused.
Add skb_pool_flush() to properly clean up these SKBs when netconsole is
terminated, improving memory efficiency.
Breno Leitao [Thu, 14 Nov 2024 11:00:11 +0000 (03:00 -0800)]
net: netpoll: Individualize the skb pool
The current implementation of the netpoll system uses a global skb
pool, which can lead to inefficient memory usage and
waste when targets are disabled or no longer in use.
This can result in a significant amount of memory being unnecessarily
allocated and retained, potentially causing performance issues and
limiting the availability of resources for other system components.
Modify the netpoll system to assign a skb pool to each target instead of
using a global one.
This approach allows for more fine-grained control over memory
allocation and deallocation, ensuring that resources are only allocated
and retained as needed.
====================
enic: Use all the resources configured on VIC
Allow users to configure and use more than 8 rx queues and 8 tx queues
on the Cisco VIC.
This series changes the maximum number of tx and rx queues supported
from 8 to the hardware limit of 256, and allocates memory based on the
number of resources configured on the VIC.
Nelson Escobar [Wed, 13 Nov 2024 23:56:37 +0000 (23:56 +0000)]
enic: Adjust used MSI-X wq/rq/cq/interrupt resources in a more robust way
Instead of failing to use MSI-X if resources aren't configured exactly
right, use the resources we do have. Since we could start using large
numbers of rq resources, we do limit the rq count to what
netif_get_num_default_rss_queues() recommends.
Nelson Escobar [Wed, 13 Nov 2024 23:56:35 +0000 (23:56 +0000)]
enic: Save resource counts we read from HW
Save the resources counts for wq,rq,cq, and interrupts in *_avail variables
so that we don't lose the information when adjusting the counts we are
actually using.
Report the wq_avail and rq_avail as the channel maximums in 'ethtool -l'
output.
Nelson Escobar [Wed, 13 Nov 2024 23:56:34 +0000 (23:56 +0000)]
enic: Make MSI-X I/O interrupts come after the other required ones
The VIC hardware has a constraint that the MSIX interrupt used for errors
be specified as a 7 bit number. Before this patch, it was allocated after
the I/O interrupts, which would cause a problem if 128 or more I/O
interrupts are in use.
So make the required interrupts come before the I/O interrupts to
guarantee the error interrupt offset never exceeds 7 bits.
Vadim Fedorenko [Thu, 14 Nov 2024 11:48:20 +0000 (03:48 -0800)]
bnxt_en: optimize gettimex64
Current implementation of gettimex64() makes at least 3 PCIe reads to
get current PHC time. It takes at least 2.2us to get this value back to
userspace. At the same time there is cached value of upper bits of PHC
available for packet timestamps already. This patch reuses cached value
to speed up reading of PHC time.
Jakub Kicinski [Fri, 15 Nov 2024 22:16:28 +0000 (14:16 -0800)]
Merge tag 'for-net-next-2024-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- btusb: add Foxconn 0xe0fc for Qualcomm WCN785x
- btmtk: Fix ISO interface handling
- Add quirk for ATS2851
- btusb: Add RTL8852BE device 0489:e123
- ISO: Do not emit LE PA/BIG Create Sync if previous is pending
- btusb: Add USB HW IDs for MT7920/MT7925
- btintel_pcie: Add handshake between driver and firmware
- btintel_pcie: Add recovery mechanism
- hci_conn: Use disable_delayed_work_sync
- SCO: Use kref to track lifetime of sco_conn
- ISO: Use kref to track lifetime of iso_conn
- btnxpuart: Add GPIO support to power save feature
- btusb: Add 0x0489:0xe0f3 and 0x13d3:0x3623 for Qualcomm WCN785x
* tag 'for-net-next-2024-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (51 commits)
Bluetooth: MGMT: Add initial implementation of MGMT_OP_HCI_CMD_SYNC
Bluetooth: fix use-after-free in device_for_each_child()
Bluetooth: btintel: Direct exception event to bluetooth stack
Bluetooth: hci_core: Fix calling mgmt_device_connected
Bluetooth: hci_bcm: Use the devm_clk_get_optional() helper
Bluetooth: ISO: Send BIG Create Sync via hci_sync
Bluetooth: hci_conn: Remove alloc from critical section
Bluetooth: ISO: Use kref to track lifetime of iso_conn
Bluetooth: SCO: Use kref to track lifetime of sco_conn
Bluetooth: HCI: Add IPC(11) bus type
Bluetooth: btusb: Add 3 HWIDs for MT7925
Bluetooth: btusb: Add new VID/PID 0489/e124 for MT7925
Bluetooth: ISO: Update hci_conn_hash_lookup_big for Broadcast slave
Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending
Bluetooth: ISO: Fix matching parent socket for BIS slave
Bluetooth: ISO: Do not emit LE PA Create Sync if previous is pending
Bluetooth: btrtl: Decrease HCI_OP_RESET timeout from 10 s to 2 s
Bluetooth: btbcm: fix missing of_node_put() in btbcm_get_board_name()
Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925
Bluetooth: btmtk: adjust the position to init iso data anchor
...
====================
Jakub Kicinski [Fri, 15 Nov 2024 22:09:20 +0000 (14:09 -0800)]
Merge tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Extended netlink error reporting if nfnetlink attribute parser fails,
from Donald Hunter.
2) Incorrect request_module() module, from Simon Horman.
3) A series of patches to reduce memory consumption for set element
transactions.
Florian Westphal says:
"When doing a flush on a set or mass adding/removing elements from a
set, each element needs to allocate 96 bytes to hold the transactional
state.
In such cases, virtually all the information in struct nft_trans_elem
is the same.
Change nft_trans_elem to a flex-array, i.e. a single nft_trans_elem
can hold multiple set element pointers.
The number of elements that can be stored in one nft_trans_elem is limited
by the slab allocator, this series limits the compaction to at most 62
elements as it caps the reallocation to 2048 bytes of memory."
4) A series of patches to prepare the transition to dscp_t in .flowi_tos.
From Guillaume Nault.
5) Support for bitwise operations with two source registers,
from Jeremy Sowden.
* tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: bitwise: add support for doing AND, OR and XOR directly
netfilter: bitwise: rename some boolean operation functions
netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.
netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.
netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.
netfilter: flow_offload: Convert nft_flow_route() to dscp_t.
netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.
netfilter: nf_tables: allocate element update information dynamically
netfilter: nf_tables: switch trans_elem to real flex array
netfilter: nf_tables: prepare nft audit for set element compaction
netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure
netfilter: nf_tables: add nft_trans_commit_list_add_elem helper
netfilter: bpf: Pass string literal as format argument of request_module()
netfilter: nfnetlink: Report extack policy errors for batched ops
====================
Jeremy Sowden [Thu, 14 Nov 2024 21:08:13 +0000 (22:08 +0100)]
netfilter: bitwise: add support for doing AND, OR and XOR directly
Hitherto, these operations have been converted in user space to
mask-and-xor operations on one register and two immediate values, and it
is the latter which have been evaluated by the kernel. We add support
for evaluating these operations directly in kernel space on one register
and either an immediate value or a second register.
Pablo made a few changes to the original patch:
- EINVAL if NFTA_BITWISE_SREG2 is used with fast version.
- Allow _AND,_OR,_XOR with _DATA != sizeof(u32)
- Dump _SREG2 or _DATA with _AND,_OR,_XOR
Jeremy Sowden [Thu, 14 Nov 2024 21:07:51 +0000 (22:07 +0100)]
netfilter: bitwise: rename some boolean operation functions
In the next patch we add support for doing AND, OR and XOR operations
directly in the kernel, so rename some functions and an enum constant
related to mask-and-xor boolean operations.
Guillaume Nault [Thu, 14 Nov 2024 16:03:52 +0000 (17:03 +0100)]
netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.
Use ip4h_dscp() instead of reading iph->tos directly.
ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.
Guillaume Nault [Thu, 14 Nov 2024 16:03:45 +0000 (17:03 +0100)]
netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.
Use ip4h_dscp() instead of reading iph->tos directly.
ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.
Guillaume Nault [Thu, 14 Nov 2024 16:03:38 +0000 (17:03 +0100)]
netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.
Use ip4h_dscp() instead of reading iph->tos directly.
ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.
Guillaume Nault [Thu, 14 Nov 2024 16:03:31 +0000 (17:03 +0100)]
netfilter: flow_offload: Convert nft_flow_route() to dscp_t.
Use ip4h_dscp()instead of reading ip_hdr()->tos directly.
ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.
Also, remove the comment about the net/ip.h include file, since it's
now required for the ip4h_dscp() helper too.
Guillaume Nault [Thu, 14 Nov 2024 16:03:21 +0000 (17:03 +0100)]
netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.
Use ip4h_dscp()instead of reading iph->tos directly.
ip4h_dscp() returns a dscp_t value which is temporarily converted back
to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to
dscp_t in the future, we'll only have to remove that
inet_dscp_to_dsfield() call.
====================
net: make RSS+RXNFC semantics more explicit
The original semantics of ntuple filters with FLOW_RSS were not
fully understood by all drivers, some ignoring the ring_cookie from
the flow rule. Require this support to be explicitly declared by
the driver for filters relying on it to be inserted, and add self-
test coverage for this functionality.
Also teach ethtool_check_max_channel() about this.
====================
Edward Cree [Wed, 13 Nov 2024 12:13:13 +0000 (12:13 +0000)]
selftest: extend test_rss_context_queue_reconfigure for action addition
The combination of ntuple action (ring_cookie) and RSS context can
cause an ntuple rule to target a higher queue than appears in any
RSS indirection table or directly in the ntuple rule, since the two
numbers are added together. Verify the logic that prevents reducing
the queue count in this case.
Edward Cree [Wed, 13 Nov 2024 12:13:12 +0000 (12:13 +0000)]
selftest: validate RSS+ntuple filters with nonzero ring_cookie
Test creates an ntuple filter with 'action 2' and an RSS context whose
indirection table has entries 0 and 1. Resulting traffic should go to
queues 2 and 3; verify that it never hits queues 0 and 1.
Edward Cree [Wed, 13 Nov 2024 12:13:10 +0000 (12:13 +0000)]
net: ethtool: account for RSS+RXNFC add semantics when checking channel count
In ethtool_check_max_channel(), the new RX count must not only cover the
max queue indices in RSS indirection tables and RXNFC destinations
separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum
of the destination queue and the maximum index in the associated RSS
context's indirection table, since that is the highest queue that the
rule can actually deliver traffic to.
It could be argued that the max queue across all custom RSS contexts
(ethtool_get_max_rss_ctx_channel()) need no longer be considered, since
any context to which packets can actually be delivered will be targeted
by some RXNFC rule and its max will thus be allowed for by
ethtool_get_max_rxnfc_channel(). For simplicity we keep both checks, so
even RSS contexts unused by any RXNFC rule must fit the channel count.
Edward Cree [Wed, 13 Nov 2024 12:13:09 +0000 (12:13 +0000)]
net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in
Ethtool ntuple filters with FLOW_RSS were originally defined as adding
the base queue ID (ring_cookie) to the value from the indirection table,
so that the same table could distribute over more than one set of queues
when used by different filters.
However, some drivers / hardware ignore the ring_cookie, and simply use
the indirection table entries as queue IDs directly. Thus, for drivers
which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to
declare that they support the original (addition) semantics, reject in
ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring.
(For a ring_cookie of zero, both behaviours are equivalent.)
Set the cap bit in sfc, as it is known to support this feature.
The example has "interrupt" property which is not a defined property. It
should be "interrupts" instead. "interrupts" also should not contain a
phandle.
The sparx5 switchdev driver can be built either with or without support
for the Lan969x switch. However, it cannot be built-in when the lan969x
driver is a loadable module because of a link-time dependency:
arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/sparx5/sparx5_main.o:(.rodata+0xd44): undefined reference to `lan969x_desc'
Add a Kconfig dependency to reflect this in Kconfig, allowing all
the valid configurations but forcing sparx5 to be a loadable module
as well if lan969x is.
We've added 9 non-merge commits during the last 4 day(s) which contain
a total of 3 files changed, 226 insertions(+), 84 deletions(-).
The main changes are:
1) Fixes to bpf_msg_push/pop_data and test_sockmap. The changes has
dependency on the other changes in the bpf-next/net branch,
from Zijian Zhang.
2) Drop netns codes from mptcp test. Reuse the common helpers in
test_progs, from Geliang Tang.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
bpf, sockmap: Fix sk_msg_reset_curr
bpf, sockmap: Several fixes to bpf_msg_pop_data
bpf, sockmap: Several fixes to bpf_msg_push_data
selftests/bpf: Add more tests for test_txmsg_push_pop in test_sockmap
selftests/bpf: Add push/pop checking for msg_verify_data in test_sockmap
selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap
selftests/bpf: Fix SENDPAGE data logic in test_sockmap
selftests/bpf: Add txmsg_pass to pull/push/pop in test_sockmap
selftests/bpf: Drop netns helpers in mptcp
====================
====================
ipv4: Prepare bpf helpers to .flowi4_tos conversion.
Continue the process of making a dscp_t variable available when setting
.flowi4_tos. This series focuses on the BPF helpers that initialise a
struct flowi4 manually.
The objective is to eventually convert .flowi4_tos to dscp_t, (to get
type annotation and prevent ECN bits from interfering with DSCP).
====================
Revert patch 1bf70e6c3a53 which modified check_ntf() and instead add a
new poll_ntf() with async notification semantics. See patch 2 for a
detailed description.
====================
Donald Hunter [Wed, 13 Nov 2024 09:08:43 +0000 (09:08 +0000)]
tools/net/ynl: add async notification handling
The notification handling in ynl is currently very simple, using sleep()
to wait a period of time and then handling all the buffered messages in
a single batch.
This patch adds async notification handling so that messages can be
processed as they are received. This makes it possible to use ynl as a
library that supplies notifications in a timely manner.
- Add poll_ntf() to be a generator that yields 1 notification at a
time and blocks until a notification is available.
- Add a --duration parameter to the CLI, with --sleep as an alias.
This modification to check_ntf() is being reverted so that its behaviour
remains equivalent to ynl_ntf_check() in the C YNL. Instead a new
poll_ntf() will be added in a separate patch.
====================
net: phy: switch eee_broken_modes to linkmode bitmap and add accessor
eee_broken_modes has a eee_cap1 register layout currently. This doesn't
allow to flag e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome
this limitation switch eee_broken_modes to a linkmode bitmap.
Add an accessor for the bitmap and use it in r8169.
====================
Vendor driver r8125 doesn't advertise 2.5G EEE on RTL8125A, and r8126
doesn't advertise 5G EEE. Likely there are compatibility issues,
therefore do the same in r8169.
With this change we don't have to disable 2.5G EEE advertisement in
rtl8125a_config_eee_phy() any longer.
We use new phylib accessor phy_set_eee_broken() to mark the respective
EEE modes as broken.
Heiner Kallweit [Fri, 8 Nov 2024 06:54:47 +0000 (07:54 +0100)]
net: phy: convert eee_broken_modes to a linkmode bitmap
eee_broken_modes has a eee_cap1 register layout currently. This doen't
allow to flag e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome
this limitation switch eee_broken_modes to a linkmode bitmap.
This command may be used to send a HCI command and wait for an
(optional) event.
The HCI command is specified by the Opcode, any arbitrary is supported
including vendor commands, but contrary to the like of
Raw/User channel it is run as an HCI command send by the kernel
since it uses its command synchronization thus it is possible to wait
for a specific event as a response.
Setting event to 0x00 will cause the command to wait for either
HCI Command Status or HCI Command Complete.
Timeout is specified in seconds, setting it to 0 will cause the
default timeout to be used.
In 'hci_conn_del_sysfs()', 'device_unregister()' may be called when
an underlying (kobject) reference counter is greater than 1. This
means that reparenting (happened when the device is actually freed)
is delayed and, during that delay, parent controller device (hciX)
may be deleted. Since the latter may create a dangling pointer to
freed parent, avoid that scenario by reparenting to NULL explicitly.
Reported-by: [email protected] Tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=6cf5652d3df49fae2e3f Fixes: a85fb91e3d72 ("Bluetooth: Fix double free in hci_conn_cleanup") Signed-off-by: Dmitry Antipov <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Kiran K [Tue, 22 Oct 2024 09:11:34 +0000 (14:41 +0530)]
Bluetooth: btintel: Direct exception event to bluetooth stack
Have exception event part of HCI traces which helps for debug.
snoop traces:
> HCI Event: Vendor (0xff) plen 79
Vendor Prefix (0x8780)
Intel Extended Telemetry (0x03)
Unknown extended telemetry event type (0xde)
01 01 de
Unknown extended subevent 0x07
01 01 de 07 01 de 06 1c ef be ad de ef be ad de
ef be ad de ef be ad de ef be ad de ef be ad de
ef be ad de 05 14 ef be ad de ef be ad de ef be
ad de ef be ad de ef be ad de 43 10 ef be ad de
ef be ad de ef be ad de ef be ad de
Fixes: af395330abed ("Bluetooth: btintel: Add Intel devcoredump support") Signed-off-by: Kiran K <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Since 61a939c68ee0 ("Bluetooth: Queue incoming ACL data until
BT_CONNECTED state is reached") there is no long the need to call
mgmt_device_connected as ACL data will be queued until BT_CONNECTED
state.
Iulia Tanasescu [Mon, 11 Nov 2024 11:47:08 +0000 (13:47 +0200)]
Bluetooth: ISO: Send BIG Create Sync via hci_sync
Before issuing the LE BIG Create Sync command, an available BIG handle
is chosen by iterating through the conn_hash list and finding the first
unused value.
If a BIG is terminated, the associated hcons are removed from the list
and the LE BIG Terminate Sync command is sent via hci_sync queue.
However, a new LE BIG Create sync command might be issued via
hci_send_cmd, before the previous BIG sync was terminated. This
can cause the same BIG handle to be reused and the LE BIG Create Sync
to fail with Command Disallowed.
< HCI Command: LE Broadcast Isochronous Group Create Sync (0x08|0x006b)
BIG Handle: 0x00
BIG Sync Handle: 0x0002
Encryption: Unencrypted (0x00)
Broadcast Code[16]: 00000000000000000000000000000000
Maximum Number Subevents: 0x00
Timeout: 20000 ms (0x07d0)
Number of BIS: 1
BIS ID: 0x01
> HCI Event: Command Status (0x0f) plen 4
LE Broadcast Isochronous Group Create Sync (0x08|0x006b) ncmd 1
Status: Command Disallowed (0x0c)
< HCI Command: LE Broadcast Isochronous Group Terminate Sync (0x08|0x006c)
BIG Handle: 0x00
This commit fixes the ordering of the LE BIG Create Sync/LE BIG Terminate
Sync commands, to make sure that either the previous BIG sync is
terminated before reusing the handle, or that a new handle is chosen
for a new sync.
Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Iulia Tanasescu <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Iulia Tanasescu [Mon, 11 Nov 2024 11:47:07 +0000 (13:47 +0200)]
Bluetooth: hci_conn: Remove alloc from critical section
This removes the kzalloc memory allocation inside critical section in
create_pa_sync, fixing the following message that appears when the kernel
is compiled with CONFIG_DEBUG_ATOMIC_SLEEP enabled:
BUG: sleeping function called from invalid context at
include/linux/sched/mm.h:321
Bluetooth: ISO: Use kref to track lifetime of iso_conn
This make use of kref to keep track of reference of iso_conn which
allows better tracking of its lifetime with usage of things like
kref_get_unless_zero in a similar way as used in l2cap_chan.
In addition to it remove call to iso_sock_set_timer on iso_sock_disconn
since at that point it is useless to set a timer as the sk will be freed
there is nothing to be done in iso_sock_timeout.
Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Bluetooth: SCO: Use kref to track lifetime of sco_conn
This make use of kref to keep track of reference of sco_conn which
allows better tracking of its lifetime with usage of things like
kref_get_unless_zero in a similar way as used in l2cap_chan.
In addition to it remove call to sco_sock_set_timer on __sco_sock_close
since at that point it is useless to set a timer as the sk will be freed
there is nothing to be done in sco_sock_timeout.
Zephyr(1) has been using the same bus defines as Linux so tools likes of
btmon, etc, are able to decode the bus used by the driver to transport
HCI packets.
Iulia Tanasescu [Fri, 1 Nov 2024 08:23:39 +0000 (10:23 +0200)]
Bluetooth: ISO: Update hci_conn_hash_lookup_big for Broadcast slave
Currently, hci_conn_hash_lookup_big only checks for BIS master connections,
by filtering out connections with the destination address set. This commit
updates this function to also consider BIS slave connections, since it is
also used for a Broadcast Receiver to set an available BIG handle before
issuing the LE BIG Create Sync command.
Iulia Tanasescu [Fri, 1 Nov 2024 08:23:38 +0000 (10:23 +0200)]
Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending
The Bluetooth Core spec does not allow a LE BIG Create sync command to be
sent to Controller if another one is pending (Vol 4, Part E, page 2586).
In order to avoid this issue, the HCI_CONN_CREATE_BIG_SYNC was added
to mark that the LE BIG Create Sync command has been sent for a hcon.
Once the BIG Sync Established event is received, the hcon flag is
erased and the next pending hcon is handled.
Iulia Tanasescu [Fri, 1 Nov 2024 08:23:37 +0000 (10:23 +0200)]
Bluetooth: ISO: Fix matching parent socket for BIS slave
Currently, when a BIS slave connection is notified to the
ISO layer, the parent socket is tried to be matched by the
HCI_EVT_LE_BIG_SYNC_ESTABILISHED event. However, a BIS slave
connection is notified to the ISO layer after the Command
Complete for the LE Setup ISO Data Path command is received.
This causes the parent to be incorrectly matched if multiple
listen sockets are present.
This commit adds a fix by matching the parent based on the
BIG handle set in the notified connection.
Iulia Tanasescu [Fri, 1 Nov 2024 08:23:36 +0000 (10:23 +0200)]
Bluetooth: ISO: Do not emit LE PA Create Sync if previous is pending
The Bluetooth Core spec does not allow a LE PA Create sync command to be
sent to Controller if another one is pending (Vol 4, Part E, page 2493).
In order to avoid this issue, the HCI_CONN_CREATE_PA_SYNC was added
to mark that the LE PA Create Sync command has been sent for a hcon.
Once the PA Sync Established event is received, the hcon flag is
erased and the next pending hcon is handled.
Hilda Wu [Wed, 30 Oct 2024 08:43:34 +0000 (16:43 +0800)]
Bluetooth: btrtl: Decrease HCI_OP_RESET timeout from 10 s to 2 s
The original timeout setting for HCI Reset on shutdown is 10 seconds.
HCI Reset shouldn't take 10 seconds to complete so instead use the
default timeout for commands.
Javier Carrasco [Thu, 31 Oct 2024 12:11:23 +0000 (13:11 +0100)]
Bluetooth: btbcm: fix missing of_node_put() in btbcm_get_board_name()
of_find_node_by_path() returns a pointer to a device_node with its
refcount incremented, and a call to of_node_put() is required to
decrement the refcount again and avoid leaking the resource.
If 'of_property_read_string_index(root, "compatible", 0, &tmp)' fails,
the function returns without calling of_node_put(root) before doing so.
The automatic cleanup attribute can be used by means of the __free()
macro to automatically call of_node_put() when the variable goes out of
scope, fixing the issue and also accounting for new error paths.
Fixes: 63fac3343b99 ("Bluetooth: btbcm: Support per-board firmware variants") Signed-off-by: Javier Carrasco <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Chris Lu [Fri, 25 Oct 2024 06:07:17 +0000 (14:07 +0800)]
Bluetooth: btmtk: adjust the position to init iso data anchor
MediaTek iso data anchor init should be moved to where MediaTek
claims iso data interface.
If there is an unexpected BT usb disconnect during setup flow,
it will cause a NULL pointer crash issue when releasing iso
anchor since the anchor wasn't been init yet. Adjust the position
to do iso data anchor init.
Fixes: ceac1cb0259d ("Bluetooth: btusb: mediatek: add ISO data transmission functions") Signed-off-by: Chris Lu <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
pcim_iomap_regions() and pcim_iomap_table() have been deprecated in
commit e354bb84a4c1 ("PCI: Deprecate pcim_iomap_table(),
pcim_iomap_regions_request_all()").
Colin Ian King [Mon, 7 Oct 2024 16:10:35 +0000 (17:10 +0100)]
Bluetooth: btintel_pcie: remove redundant assignment to variable ret
The variable ret is being assigned -ENOMEM however this is never
read and it is being re-assigned a new value when the code jumps
to label resubmit. The assignment is redundant and can be removed.
Andrej Shadura [Wed, 9 Oct 2024 12:14:24 +0000 (14:14 +0200)]
Bluetooth: Fix type of len in rfcomm_sock_getsockopt{,_old}()
Commit 9bf4e919ccad worked around an issue introduced after an innocuous
optimisation change in LLVM main:
> len is defined as an 'int' because it is assigned from
> '__user int *optlen'. However, it is clamped against the result of
> sizeof(), which has a type of 'size_t' ('unsigned long' for 64-bit
> platforms). This is done with min_t() because min() requires compatible
> types, which results in both len and the result of sizeof() being casted
> to 'unsigned int', meaning len changes signs and the result of sizeof()
> is truncated. From there, len is passed to copy_to_user(), which has a
> third parameter type of 'unsigned long', so it is widened and changes
> signs again. This excessive casting in combination with the KCSAN
> instrumentation causes LLVM to fail to eliminate the __bad_copy_from()
> call, failing the build.
The same issue occurs in rfcomm in functions rfcomm_sock_getsockopt and
rfcomm_sock_getsockopt_old.
Change the type of len to size_t in both rfcomm_sock_getsockopt and
rfcomm_sock_getsockopt_old and replace min_t() with min().
Kiran K [Thu, 17 Oct 2024 11:51:56 +0000 (17:21 +0530)]
Bluetooth: btintel: Do no pass vendor events to stack
During firmware download, vendor specific events like boot up and
secure send result are generated. These events can be safely processed at
the driver level. Passing on these events to stack prints unnecessary
log as below.
Everest K.C. [Wed, 16 Oct 2024 21:39:55 +0000 (15:39 -0600)]
Bluetooth: btintel_pcie: Remove deadcode
The switch case statement has a default branch. Thus, the return
statement at the end of the function can never be reached.
Fix it by removing the return statement at the end of the
function.
Chen-Yu Tsai [Tue, 8 Oct 2024 08:27:20 +0000 (16:27 +0800)]
Bluetooth: btmtksdio: Lookup device node only as fallback
If the device tree is properly written, the SDIO function device node
should be correctly defined, and the mmc core in Linux should correctly
tie it to the device being probed.
Only fall back to searching for the device node by compatible if the
original device node tied to the device is incorrect, as seen in older
device trees.
Kiran K [Tue, 1 Oct 2024 10:44:51 +0000 (16:14 +0530)]
Bluetooth: btintel_pcie: Add recovery mechanism
This patch adds a recovery mechanism to recover controller if firmware
download fails due to unresponsive controller, command timeout, firmware
signature verification failure etc. Driver attmepts maximum of 5 times
before giving up.
Kiran K [Tue, 1 Oct 2024 10:44:50 +0000 (16:14 +0530)]
Bluetooth: btintel_pcie: Add handshake between driver and firmware
The following handshake mechanism needs be followed after firmware
download is completed to bring the firmware to running state.
After firmware fragments of Operational image are downloaded and
secure sends result of the image succeeds,
1. Driver sends HCI Intel reset with boot option #1 to switch FW image.
2. FW sends Alive GP[0] MSIx
3. Driver enables data path (doorbell 0x460 for RBDs, etc...)
4. Driver gets Bootup event from firmware
5. Driver performs D0 entry to device (WRITE to IPC_Sleep_Control =0x0)
6. FW sends Alive GP[0] MSIx
7. Device host interface is fully set for BT protocol stack operation.
8. Driver may optionally get debug event with ID 0x97 which can be dropped
For Intermediate loadger image, all the above steps are applicable
expcept #5 and #6.
On HCI_OP_RESET, firmware raises alive interrupt. Driver needs to wait
for it before passing control over to bluetooth stack.
Bluetooth: hci_core: Fix not checking skb length on hci_scodata_packet
This fixes not checking if skb really contains an SCO header otherwise
the code may attempt to access some uninitilized/invalid memory past the
valid skb->data.
Bluetooth: hci_core: Fix not checking skb length on hci_acldata_packet
This fixes not checking if skb really contains an ACL header otherwise
the code may attempt to access some uninitilized/invalid memory past the
valid skb->data.
Bluetooth: btnxpuart: Add GPIO support to power save feature
This adds support for driving the chip into sleep or wakeup with a GPIO.
If the device tree property device-wakeup-gpios is defined, the driver
utilizes this GPIO for controlling the chip's power save state, else it
uses the default UART-break method.
Bluetooth: hci_conn: Use disable_delayed_work_sync
This makes use of disable_delayed_work_sync instead
cancel_delayed_work_sync as it not only cancel the ongoing work but also
disables new submit which is disarable since the object holding the work
is about to be freed.
Jiande Lu [Mon, 30 Sep 2024 10:47:50 +0000 (18:47 +0800)]
Bluetooth: btusb: Add USB HW IDs for MT7920/MT7925
Add HW IDs for wireless module. These HW IDs are extracted from
Windows driver inf file and the test for card bring up successful.
MT7920 HW IDs test with below patch.
https://patchwork.kernel.org/project/bluetooth/
patch/20240930081257[email protected]/
Patch has been tested successfully and controller is recognized
devices pair successfully.
MT7920 module bring up message as below.
Bluetooth: Core ver 2.22
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO socket layer initialized
Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: 20240930111457
Bluetooth: hci0: Device setup in 143004 usecs
Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection
command is advertised, but not supported.
Bluetooth: hci0: AOSP extensions version v1.00
Bluetooth: hci0: AOSP quality report is supported
Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Bluetooth: BNEP filters: protocol multicast
Bluetooth: BNEP socket layer initialized
Bluetooth: MGMT ver 1.22
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM ver 1.11
MT7925 module bring up message as below.
Bluetooth: Core ver 2.22
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO socket layer initialized
Bluetooth: hci0: HW/SW Version: 0x00000000, Build Time: 20240816133202
Bluetooth: hci0: Device setup in 286558 usecs
Bluetooth: hci0: HCI Enhanced Setup Synchronous
Connection command is advertised, but not supported.
Bluetooth: hci0: AOSP extensions version v1.00
Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Bluetooth: BNEP filters: protocol multicast
Bluetooth: BNEP socket layer initialized
Bluetooth: MGMT ver 1.22
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM ver 1.11