Ido Schimmel [Thu, 21 Mar 2024 17:30:42 +0000 (19:30 +0200)]
ipv6: Fix address dump when IPv6 is disabled on an interface
Cited commit started returning an error when user space requests to dump
the interface's IPv6 addresses and IPv6 is disabled on the interface.
Restore the previous behavior and do not return an error.
Before cited commit:
# ip address show dev dummy1
2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 1a:52:02:5a:c2:6e brd ff:ff:ff:ff:ff:ff
inet6 fe80::1852:2ff:fe5a:c26e/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
# ip link set dev dummy1 mtu 1000
# ip address show dev dummy1
2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1000 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 1a:52:02:5a:c2:6e brd ff:ff:ff:ff:ff:ff
After cited commit:
# ip address show dev dummy1
2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 1e:9b:94:00:ac:e8 brd ff:ff:ff:ff:ff:ff
inet6 fe80::1c9b:94ff:fe00:ace8/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
# ip link set dev dummy1 mtu 1000
# ip address show dev dummy1
RTNETLINK answers: No such device
Dump terminated
With this patch:
# ip address show dev dummy1
2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 42:35:fc:53:66:cf brd ff:ff:ff:ff:ff:ff
inet6 fe80::4035:fcff:fe53:66cf/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
# ip link set dev dummy1 mtu 1000
# ip address show dev dummy1
2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1000 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 42:35:fc:53:66:cf brd ff:ff:ff:ff:ff:ff
Jakub Kicinski [Thu, 21 Mar 2024 02:02:14 +0000 (19:02 -0700)]
tools: ynl: fix setting presence bits in simple nests
When we set members of simple nested structures in requests
we need to set "presence" bits for all the nesting layers
below. This has nothing to do with the presence type of
the last layer.
Ryosuke Yasuoka [Wed, 20 Mar 2024 00:54:10 +0000 (09:54 +0900)]
nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet
syzbot reported the following uninit-value access issue [1][2]:
nci_rx_work() parses and processes received packet. When the payload
length is zero, each message type handler reads uninitialized payload
and KMSAN detects this issue. The receipt of a packet with a zero-size
payload is considered unexpected, and therefore, such packets should be
silently discarded.
This patch resolved this issue by checking payload size before calling
each message type handler codes.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reported-and-tested-by: [email protected] Reported-and-tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=7ea9413ea6749baf5574 [1] Closes: https://syzkaller.appspot.com/bug?extid=29b5ca705d2e0f4a44d2 [2] Signed-off-by: Ryosuke Yasuoka <[email protected]> Reviewed-by: Jeremy Cline <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Linus Torvalds [Thu, 21 Mar 2024 21:50:39 +0000 (14:50 -0700)]
Merge tag 'net-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from CAN, netfilter, wireguard and IPsec.
I'd like to highlight [ lowlight? - Linus ] Florian W stepping down as
a netfilter maintainer due to constant stream of bug reports. Not sure
what we can do but IIUC this is not the first such case.
Current release - regressions:
- rxrpc: fix use of page_frag_alloc_align(), it changed semantics and
we added a new caller in a different subtree
- xfrm: allow UDP encapsulation only in offload modes
Current release - new code bugs:
- tcp: fix refcnt handling in __inet_hash_connect()
- Revert "net: Re-use and set mono_delivery_time bit for userspace
tstamp packets", conflicted with some expectations in BPF uAPI
Previous releases - regressions:
- ipv4: raw: fix sending packets from raw sockets via IPsec tunnels
- report RCU QS for busy network kthreads (with Paul McK's blessing)
- tcp/rds: fix use-after-free on netns with kernel TCP reqsk
- virt: vmxnet3: fix missing reserved tailroom with XDP
Misc:
- couple of build fixes for Documentation"
* tag 'net-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
selftests: forwarding: Fix ping failure due to short timeout
MAINTAINERS: step down as netfilter maintainer
netfilter: nf_tables: Fix a memory leak in nf_tables_updchain
net: dsa: mt7530: fix handling of all link-local frames
net: dsa: mt7530: fix link-local frames that ingress vlan filtering ports
bpf: report RCU QS in cpumap kthread
net: report RCU QS on threaded NAPI repolling
rcu: add a helper to report consolidated flavor QS
ionic: update documentation for XDP support
lib/bitmap: Fix bitmap_scatter() and bitmap_gather() kernel doc
netfilter: nf_tables: do not compare internal table flags on updates
netfilter: nft_set_pipapo: release elements in clone only from destroy path
octeontx2-af: Use separate handlers for interrupts
octeontx2-pf: Send UP messages to VF only when VF is up.
octeontx2-pf: Use default max_active works instead of one
octeontx2-pf: Wait till detach_resources msg is complete
octeontx2: Detect the mbox up or down message via register
devlink: fix port new reply cmd type
tcp: Clear req->syncookie in reqsk_alloc().
net/bnx2x: Prevent access to a freed page in page_pool
...
Linus Torvalds [Thu, 21 Mar 2024 21:41:00 +0000 (14:41 -0700)]
Merge tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Generate a list of built DTB files (arch/*/boot/dts/dtbs-list)
- Use more threads when building Debian packages in parallel
- Fix warnings shown during the RPM kernel package uninstallation
- Change OBJECT_FILES_NON_STANDARD_*.o etc. to take a relative path to
Makefile
- Support GCC's -fmin-function-alignment flag
- Fix a null pointer dereference bug in modpost
- Add the DTB support to the RPM package
- Various fixes and cleanups in Kconfig
* tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (67 commits)
kconfig: tests: test dependency after shuffling choices
kconfig: tests: add a test for randconfig with dependent choices
kconfig: tests: support KCONFIG_SEED for the randconfig runner
kbuild: rpm-pkg: add dtb files in kernel rpm
kconfig: remove unneeded menu_is_visible() call in conf_write_defconfig()
kconfig: check prompt for choice while parsing
kconfig: lxdialog: remove unused dialog colors
kconfig: lxdialog: fix button color for blackbg theme
modpost: fix null pointer dereference
kbuild: remove GCC's default -Wpacked-bitfield-compat flag
kbuild: unexport abs_srctree and abs_objtree
kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1
kconfig: remove named choice support
kconfig: use linked list in get_symbol_str() to iterate over menus
kconfig: link menus to a symbol
kbuild: fix inconsistent indentation in top Makefile
kbuild: Use -fmin-function-alignment when available
alpha: merge two entries for CONFIG_ALPHA_GAMMA
alpha: merge two entries for CONFIG_ALPHA_EV4
kbuild: change DTC_FLAGS_<basetarget>.o to take the path relative to $(obj)
...
Linus Torvalds [Thu, 21 Mar 2024 21:13:18 +0000 (14:13 -0700)]
Merge tag 'firewire-fixes-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixes Takashi Sakamoto:
"The previous pull includes some regressions in some device attributes
exposed to sysfs. They are fixed now"
* tag 'firewire-fixes-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: core: add memo about the caller of show functions for device attributes
Revert "firewire: Kill unnecessary buf check in device_attribute.show"
Linus Torvalds [Thu, 21 Mar 2024 20:34:15 +0000 (13:34 -0700)]
Merge tag 'driver-core-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the "big" set of driver core and kernfs changes for 6.9-rc1.
Nothing all that crazy here, just some good updates that include:
- automatic attribute group hiding from Dan Williams (he fixed up my
horrible attempt at doing this.)
- kobject lock contention fixes from Eric Dumazet
- driver core cleanups from Andy
- kernfs rcu work from Tejun
- fw_devlink changes to resolve some reported issues
- other minor changes, all details in the shortlog
All of these have been in linux-next for a long time with no reported
issues"
* tag 'driver-core-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits)
device: core: Log warning for devices pending deferred probe on timeout
driver: core: Use dev_* instead of pr_* so device metadata is added
driver: core: Log probe failure as error and with device metadata
of: property: fw_devlink: Add support for "post-init-providers" property
driver core: Add FWLINK_FLAG_IGNORE to completely ignore a fwnode link
driver core: Adds flags param to fwnode_link_add()
debugfs: fix wait/cancellation handling during remove
device property: Don't use "proxy" headers
device property: Move enum dev_dma_attr to fwnode.h
driver core: Move fw_devlink stuff to where it belongs
driver core: Drop unneeded 'extern' keyword in fwnode.h
firmware_loader: Suppress warning on FW_OPT_NO_WARN flag
sysfs:Addresses documentation in sysfs_merge_group and sysfs_unmerge_group.
firmware_loader: introduce __free() cleanup hanler
platform-msi: Remove usage of the deprecated ida_simple_xx() API
sysfs: Introduce DEFINE_SIMPLE_SYSFS_GROUP_VISIBLE()
sysfs: Document new "group visible" helpers
sysfs: Fix crash on empty group attributes array
sysfs: Introduce a mechanism to hide static attribute_groups
sysfs: Introduce a mechanism to hide static attribute_groups
...
Linus Torvalds [Thu, 21 Mar 2024 20:21:31 +0000 (13:21 -0700)]
Merge tag 'char-misc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver subsystem updates from Greg KH:
"Here is the big set of char/misc and a number of other driver
subsystem updates for 6.9-rc1. Included in here are:
- IIO driver updates, loads of new ones and evolution of existing ones
- coresight driver updates
- const cleanups for many driver subsystems
- speakup driver additions
- platform remove callback void cleanups
- mei driver updates
- mhi driver updates
- cdx driver updates for MSI interrupt handling
- nvmem driver updates
- other smaller driver updates and cleanups, full details in the
shortlog
All of these have been in linux-next for a long time with no reported
issue, other than a build warning for the speakup driver"
The build warning hits clang and is a gcc (and C23) extension, and is
fixed up in the merge.
Link: https://lore.kernel.org/all/[email protected]/
* tag 'char-misc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (279 commits)
binder: remove redundant variable page_addr
uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion
uio_pruss: UIO_MEM_DMA_COHERENT conversion
cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT
uio: introduce UIO_MEM_DMA_COHERENT type
cdx: add MSI support for CDX bus
pps: use cflags-y instead of EXTRA_CFLAGS
speakup: Add /dev/synthu device
speakup: Fix 8bit characters from direct synth
parport: sunbpp: Convert to platform remove callback returning void
parport: amiga: Convert to platform remove callback returning void
char: xillybus: Convert to platform remove callback returning void
vmw_balloon: change maintainership
MAINTAINERS: change the maintainer for hpilo driver
char: xilinx_hwicap: Fix NULL vs IS_ERR() bug
hpet: remove hpets::hp_clocksource
platform: goldfish: move the separate 'default' propery for CONFIG_GOLDFISH
char: xilinx_hwicap: drop casting to void in dev_set_drvdata
greybus: move is_gb_* functions out of greybus.h
greybus: Remove usage of the deprecated ida_simple_xx() API
...
Linus Torvalds [Thu, 21 Mar 2024 20:03:44 +0000 (13:03 -0700)]
Merge tag 'staging-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the big set of Staging driver cleanups for 6.9-rc1. Nothing
major in here, lots of small coding style cleanups for most drivers,
and the removal of some obsolete hardare (the emxx_udc and some
drivers/staging/board/ files).
All of these have been in linux-next for a long time with no reported
issues"
* tag 'staging-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (122 commits)
staging: greybus: Replaces directive __attribute__((packed)) by __packed as suggested by checkpatch
staging: greybus: Replace __attribute__((packed)) by __packed in various instances
Staging: rtl8192e: Rename function GetHalfNmodeSupportByAPsHandler()
Staging: rtl8192e: Rename function rtllib_FlushRxTsPendingPkts()
Staging: rtl8192e: Rename goto OnADDBARsp_Reject
Staging: rtl8192e: Rename goto OnADDBAReq_Fail
Staging: rtl8192e: Rename function rtllib_send_ADDBARsp()
Staging: rtl8192e: Rename function rtllib_send_ADDBAReq()
Staging: rtl8192e: Rename variable TxRxSelect
Staging: rtl8192e: Fix 5 chckpatch alignment warnings in rtl819x_BAProc.c
Staging: rtl8192e: Rename function MgntQuery_MgntFrameTxRate
Staging: rtl8192e: Rename boolean variable bHalfWirelessN24GMode
Staging: rtl8192e: Rename reference AllowAllDestAddrHandler
Staging: rtl8192e: Rename varoable asSta
Staging: rtl8192e: Rename varoable osCcxVerNum
Staging: rtl8192e: Rename variable CcxAironetBuf
Staging: rtl8192e: Rename variable osCcxAironetIE
Staging: rtl8192e: Rename variable AironetIeOui
Staging: rtl8192e: Rename variable asRsn
Staging: rtl8192e: Rename variable CcxVerNumBuf
...
Linus Torvalds [Thu, 21 Mar 2024 19:44:10 +0000 (12:44 -0700)]
Merge tag 'tty-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the big set of TTY/Serial driver updates and cleanups for
6.9-rc1. Included in here are:
- more tty cleanups from Jiri
- loads of 8250 driver cleanups from Andy
- max310x driver updates
- samsung serial driver updates
- uart_prepare_sysrq_char() updates for many drivers
- platform driver remove callback void cleanups
- stm32 driver updates
- other small tty/serial driver updates
All of these have been in linux-next for a long time with no reported
issues"
* tag 'tty-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
dt-bindings: serial: stm32: add power-domains property
serial: 8250_dw: Replace ACPI device check by a quirk
serial: Lock console when calling into driver before registration
serial: 8250_uniphier: Switch to use uart_read_port_properties()
serial: 8250_tegra: Switch to use uart_read_port_properties()
serial: 8250_pxa: Switch to use uart_read_port_properties()
serial: 8250_omap: Switch to use uart_read_port_properties()
serial: 8250_of: Switch to use uart_read_port_properties()
serial: 8250_lpc18xx: Switch to use uart_read_port_properties()
serial: 8250_ingenic: Switch to use uart_read_port_properties()
serial: 8250_dw: Switch to use uart_read_port_properties()
serial: 8250_bcm7271: Switch to use uart_read_port_properties()
serial: 8250_bcm2835aux: Switch to use uart_read_port_properties()
serial: 8250_aspeed_vuart: Switch to use uart_read_port_properties()
serial: port: Introduce a common helper to read properties
serial: core: Add UPIO_UNKNOWN constant for unknown port type
serial: core: Move struct uart_port::quirks closer to possible values
serial: sh-sci: Call sci_serial_{in,out}() directly
serial: core: only stop transmit when HW fifo is empty
serial: pch: Use uart_prepare_sysrq_char().
...
Linus Torvalds [Thu, 21 Mar 2024 19:35:20 +0000 (12:35 -0700)]
Merge tag 'usb-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.9-rc1. Lots
of tiny changes and forward progress to support new hardware and
better support for existing devices. Included in here are:
- Thunderbolt (i.e. USB4) updates for newer hardware and uses as more
people start to use the hardware
- default USB authentication mode Kconfig and documentation update to
make it more obvious what is going on
- USB typec updates and enhancements
- usual dwc3 driver updates
- usual xhci driver updates
- function USB (i.e. gadget) driver updates and additions
- new device ids for lots of drivers
- loads of other small updates, full details in the shortlog
All of these, including a "last minute regression fix" have been in
linux-next with no reported issues"
* tag 'usb-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (185 commits)
usb: usb-acpi: Fix oops due to freeing uninitialized pld pointer
usb: gadget: net2272: Use irqflags in the call to net2272_probe_fin
usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic
phy: tegra: xusb: Add API to retrieve the port number of phy
USB: gadget: pxa27x_udc: Remove unused of_gpio.h
usb: gadget/snps_udc_plat: Remove unused of_gpio.h
usb: ohci-pxa27x: Remove unused of_gpio.h
usb: sl811-hcd: only defined function checkdone if QUIRK2 is defined
usb: Clarify expected behavior of dev_bin_attrs_are_visible()
xhci: Allow RPM on the USB controller (1022:43f7) by default
usb: isp1760: remove SLAB_MEM_SPREAD flag usage
usb: misc: onboard_hub: use pointer consistently in the probe function
usb: gadget: fsl: Increase size of name buffer for endpoints
usb: gadget: fsl: Add of device table to enable module autoloading
usb: typec: tcpm: add support to set tcpc connector orientatition
usb: typec: tcpci: add generic tcpci fallback compatible
dt-bindings: usb: typec-tcpci: add tcpci fallback binding
usb: gadget: fsl-udc: Replace custom log wrappers by dev_{err,warn,dbg,vdbg}
usb: core: Set connect_type of ports based on DT node
dt-bindings: usb: Add downstream facing ports to realtek binding
...
Linus Torvalds [Thu, 21 Mar 2024 17:49:54 +0000 (10:49 -0700)]
Merge tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull hwspinlock updates from Bjorn Andersson:
"Some code cleanup for the OMAP hwspinlock driver"
* tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
hwspinlock: omap: Use index to get hwspinlock pointer
hwspinlock: omap: Use devm_hwspin_lock_register() helper
hwspinlock: omap: Use devm_pm_runtime_enable() helper
hwspinlock: omap: Remove unneeded check for OF node
Linus Torvalds [Thu, 21 Mar 2024 17:45:43 +0000 (10:45 -0700)]
Merge tag 'rpmsg-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
"This transitions rpmsg_ctrl and rpmsg_char drivers away from the
deprecated ida_simple_*() API. It also makes the rpmsg_bus const"
* tag 'rpmsg-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
rpmsg: core: Make rpmsg_bus const
rpmsg: Remove usage of the deprecated ida_simple_xx() API
Linus Torvalds [Thu, 21 Mar 2024 17:37:39 +0000 (10:37 -0700)]
Merge tag 'rproc-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
"Qualcomm SM8650 audio, compute and modem remoteproc are added.
Qualcomm X1 Elite audio and compute remoteprocs are added, after
support for shutting down the bootloader-loaded firmware loaded into
the audio DSP..
A dozen drivers in the subsystem are transitioned to use devres
helpers for remoteproc and memory allocations - this makes it possible
to acquire in-kernel handle to individual remoteproc instances in a
cluster.
The release of DMA memory for remoteproc virtio is corrected to ensure
that restarting due to a watchdog bite doesn't attempt to allocate the
memory again without first freeing it.
Last, but not least, a couple of DeviceTree binding cleanups"
* tag 'rproc-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (30 commits)
remoteproc: qcom_q6v5_pas: Unload lite firmware on ADSP
remoteproc: qcom_q6v5_pas: Add support for X1E80100 ADSP/CDSP
dt-bindings: remoteproc: qcom,sm8550-pas: document the X1E80100 aDSP & cDSP
remoteproc: qcom_wcnss: Use devm_rproc_alloc() helper
remoteproc: qcom_q6v5_wcss: Use devm_rproc_alloc() helper
remoteproc: qcom_q6v5_pas: Use devm_rproc_alloc() helper
remoteproc: qcom_q6v5_mss: Use devm_rproc_alloc() helper
remoteproc: qcom_q6v5_adsp: Use devm_rproc_alloc() helper
dt-bindings: remoteproc: do not override firmware-name $ref
dt-bindings: remoteproc: qcom,glink-rpm-edge: drop redundant type from label
remoteproc: qcom: pas: correct data indentation
remoteproc: Make rproc_get_by_phandle() work for clusters
remoteproc: qcom: pas: Add SM8650 remoteproc support
remoteproc: qcom: pas: make region assign more generic
dt-bindings: remoteproc: qcom,sm8550-pas: document the SM8650 PAS
remoteproc: k3-dsp: Use devm_rproc_add() helper
remoteproc: k3-dsp: Use devm_ioremap_wc() helper
remoteproc: k3-dsp: Add devm action to release tsp
remoteproc: k3-dsp: Use devm_kzalloc() helper
remoteproc: k3-dsp: Use devm_ti_sci_get_by_phandle() helper
...
Linus Torvalds [Thu, 21 Mar 2024 17:13:47 +0000 (10:13 -0700)]
Merge tag 'sh-for-v6.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
"Two patches by Ricardo B. Marliere make two instances of struct
bus_type in the interrupt controller driver and the DMA sysfs
interface const since the driver core in the kernel is now able to
handle that.
A third patch by Artur Rojek enforces internal linkage for the
function setup_hd64461() in order to fix the build of hp6xx_defconfig
with -Werror=missing-prototypes"
* tag 'sh-for-v6.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: hd64461: Make setup_hd64461() static
sh: intc: Make intc_subsys const
sh: dma-sysfs: Make dma_subsys const
Linus Torvalds [Thu, 21 Mar 2024 17:01:02 +0000 (10:01 -0700)]
Merge tag 'hyperv-next-signed-20240320' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Use Hyper-V entropy to seed guest random number generator (Michael
Kelley)
- Convert to platform remove callback returning void for vmbus (Uwe
Kleine-König)
- Introduce hv_get_hypervisor_version function (Nuno Das Neves)
- Rename some HV_REGISTER_* defines for consistency (Nuno Das Neves)
- Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_* (Nuno Das
Neves)
- Cosmetic changes for hv_spinlock.c (Purna Pavan Chandra Aekkaladevi)
- Use per cpu initial stack for vtl context (Saurabh Sengar)
* tag 'hyperv-next-signed-20240320' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Use Hyper-V entropy to seed guest random number generator
x86/hyperv: Cosmetic changes for hv_spinlock.c
hyperv-tlfs: Rename some HV_REGISTER_* defines for consistency
hv: vmbus: Convert to platform remove callback returning void
mshyperv: Introduce hv_get_hypervisor_version function
x86/hyperv: Use per cpu initial stack for vtl context
hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*
Linus Torvalds [Thu, 21 Mar 2024 16:54:28 +0000 (09:54 -0700)]
Merge tag 'for-6.9-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"Fix a problem found in 6.7 after adding the temp-fsid feature which
changed device tracking in memory and broke grub-probe. This is used
on initrd-less systems. There were several iterations of the fix and
it took longer than expected"
* tag 'for-6.9-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: do not skip re-registration for the mounted device
Linus Torvalds [Thu, 21 Mar 2024 16:47:12 +0000 (09:47 -0700)]
Merge tag 'exfat-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Improve dirsync performance by syncing on a dentry-set rather than on
a per-directory entry
* tag 'exfat-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: remove duplicate update parent dir
exfat: do not sync parent dir if just update timestamp
exfat: remove unused functions
exfat: convert exfat_find_empty_entry() to use dentry cache
exfat: convert exfat_init_ext_entry() to use dentry cache
exfat: move free cluster out of exfat_init_ext_entry()
exfat: convert exfat_remove_entries() to use dentry cache
exfat: convert exfat_add_entry() to use dentry cache
exfat: add exfat_get_empty_dentry_set() helper
exfat: add __exfat_get_dentry_set() helper
Linus Torvalds [Thu, 21 Mar 2024 16:27:37 +0000 (09:27 -0700)]
Merge tag 'bitmap-for-6.9' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:
"A couple of random cleanups plus a step-down patch from Andy"
* tag 'bitmap-for-6.9' of https://github.com/norov/linux:
bitmap: Step down as a reviewer
lib/find: optimize find_*_bit_wrap
lib/find_bit: Fix the code comments about find_next_bit_wrap
Paolo Abeni [Thu, 21 Mar 2024 14:16:16 +0000 (15:16 +0100)]
Merge tag 'nf-24-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net. There is a
larger batch of fixes still pending that will follow up asap, this is
what I deemed to be more urgent at this time:
1) Use clone view in pipapo set backend to release elements from destroy
path, otherwise it is possible to destroy elements twice.
2) Incorrect check for internal table flags lead to bogus transaction
objects.
3) Fix counters memleak in netdev basechain update error path,
from Quan Tian.
netfilter pull request 24-03-21
* tag 'nf-24-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: Fix a memory leak in nf_tables_updchain
netfilter: nf_tables: do not compare internal table flags on updates
netfilter: nft_set_pipapo: release elements in clone only from destroy path
====================
Paolo Abeni [Thu, 21 Mar 2024 11:59:04 +0000 (12:59 +0100)]
Merge tag 'linux-can-fixes-for-6.9-20240319' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2024-03-20
this is a pull request of 1 patch for net/master.
Martin Jocić contributes a fix for the kvaser_pciefd driver, so that
up to 8 channels on the Xilinx-based adapters can be used. This issue
has been introduced in net-next for v6.9.
Ido Schimmel [Wed, 20 Mar 2024 06:57:17 +0000 (08:57 +0200)]
selftests: forwarding: Fix ping failure due to short timeout
The tests send 100 pings in 0.1 second intervals and force a timeout of
11 seconds, which is borderline (especially on debug kernels), resulting
in random failures in netdev CI [1].
Fix by increasing the timeout to 20 seconds. It should not prolong the
test unless something is wrong, in which case the test will rightfully
fail.
[1]
# selftests: net/forwarding: vxlan_bridge_1d_port_8472_ipv6.sh
# INFO: Running tests with UDP port 8472
# TEST: ping: local->local [ OK ]
# TEST: ping: local->remote 1 [FAIL]
# Ping failed
[...]
Quan Tian [Wed, 6 Mar 2024 17:24:02 +0000 (01:24 +0800)]
netfilter: nf_tables: Fix a memory leak in nf_tables_updchain
If nft_netdev_register_hooks() fails, the memory associated with
nft_stats is not freed, causing a memory leak.
This patch fixes it by moving nft_stats_alloc() down after
nft_netdev_register_hooks() succeeds.
Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain") Signed-off-by: Quan Tian <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
Arınç ÜNAL [Thu, 14 Mar 2024 09:33:42 +0000 (12:33 +0300)]
net: dsa: mt7530: fix handling of all link-local frames
Currently, the MT753X switches treat frames with :01-0D and :0F MAC DAs as
regular multicast frames, therefore flooding them to user ports.
On page 205, section "8.6.3 Frame filtering" of the active standard, IEEE
Std 802.1Qâ„¢-2022, it is stated that frames with 01:80:C2:00:00:00-0F as MAC
DA must only be propagated to C-VLAN and MAC Bridge components. That means
VLAN-aware and VLAN-unaware bridges. On the switch designs with CPU ports,
these frames are supposed to be processed by the CPU (software). So we make
the switch only forward them to the CPU port. And if received from a CPU
port, forward to a single port. The software is responsible of making the
switch conform to the latter by setting a single port as destination port
on the special tag.
This switch intellectual property cannot conform to this part of the
standard fully. Whilst the REV_UN frame tag covers the remaining :04-0D and
:0F MAC DAs, it also includes :22-FF which the scope of propagation is not
supposed to be restricted for these MAC DAs.
Set frames with :01-03 MAC DAs to be trapped to the CPU port(s). Add a
comment for the remaining MAC DAs.
Note that the ingress port must have a PVID assigned to it for the switch
to forward untagged frames. A PVID is set by default on VLAN-aware and
VLAN-unaware ports. However, when the network interface that pertains to
the ingress port is attached to a vlan_filtering enabled bridge, the user
can remove the PVID assignment from it which would prevent the link-local
frames from being trapped to the CPU port. I am yet to see a way to forward
link-local frames while preventing other untagged frames from being
forwarded too.
Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
Whether VLAN-aware or not, on every VID VLAN table entry that has the CPU
port as a member of it, frames are set to egress the CPU port with the VLAN
tag stacked. This is so that VLAN tags can be appended after hardware
special tag (called DSA tag in the context of Linux drivers).
For user ports on a VLAN-unaware bridge, frame ingressing the user port
egresses CPU port with only the special tag.
For user ports on a VLAN-aware bridge, frame ingressing the user port
egresses CPU port with the special tag and the VLAN tag.
This causes issues with link-local frames, specifically BPDUs, because the
software expects to receive them VLAN-untagged.
There are two options to make link-local frames egress untagged. Setting
CONSISTENT or UNTAGGED on the EG_TAG bits on the relevant register.
CONSISTENT means frames egress exactly as they ingress. That means
egressing with the VLAN tag they had at ingress or egressing untagged if
they ingressed untagged. Although link-local frames are not supposed to be
transmitted VLAN-tagged, if they are done so, when egressing through a CPU
port, the special tag field will be broken.
BPDU egresses CPU port with VLAN tag egressing stacked, received on
software:
To prevent confusing the software, force the frame to egress UNTAGGED
instead of CONSISTENT. This way, frames can't possibly be received TAGGED
by software which would have the special tag field broken.
VLAN Tag Egress Procedure
For all frames, one of these options set the earliest in this order will
apply to the frame:
- EG_TAG in certain registers for certain frames.
This will apply to frame with matching MAC DA or EtherType.
- EG_TAG in the address table.
This will apply to frame at its incoming port.
- EG_TAG in the PVC register.
This will apply to frame at its incoming port.
- EG_CON and [EG_TAG per port] in the VLAN table.
This will apply to frame at its outgoing port.
- EG_TAG in the PCR register.
This will apply to frame at its outgoing port.
EG_TAG in certain registers for certain frames:
PPPoE Discovery_ARP/RARP: PPP_EG_TAG and ARP_EG_TAG in the APC register.
IGMP_MLD: IGMP_EG_TAG and MLD_EG_TAG in the IMC register.
BPDU and PAE: BPDU_EG_TAG and PAE_EG_TAG in the BPC register.
REV_01 and REV_02: R01_EG_TAG and R02_EG_TAG in the RGAC1 register.
REV_03 and REV_0E: R03_EG_TAG and R0E_EG_TAG in the RGAC2 register.
REV_10 and REV_20: R10_EG_TAG and R20_EG_TAG in the RGAC3 register.
REV_21 and REV_UN: R21_EG_TAG and RUN_EG_TAG in the RGAC4 register.
With this change, it can be observed that a bridge interface with stp_state
and vlan_filtering enabled will properly block ports now.
Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
====================
Report RCU QS for busy network kthreads
This changeset fixes a common problem for busy networking kthreads.
These threads, e.g. NAPI threads, typically will do:
* polling a batch of packets
* if there are more work, call cond_resched() to allow scheduling
* continue to poll more packets when rx queue is not empty
We observed this being a problem in production, since it can block RCU
tasks from making progress under heavy load. Investigation indicates
that just calling cond_resched() is insufficient for RCU tasks to reach
quiescent states. This also has the side effect of frequently clearing
the TIF_NEED_RESCHED flag on voluntary preempt kernels. As a result,
schedule() will not be called in these circumstances, despite schedule()
in fact provides required quiescent states. This at least affects NAPI
threads, napi_busy_loop, and also cpumap kthread.
By reporting RCU QSes in these kthreads periodically before cond_resched, the
blocked RCU waiters can correctly progress. Instead of just reporting QS for
RCU tasks, these code share the same concern as noted in the commit d28139c4e967 ("rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe").
So report a consolidated QS for safety.
It is worth noting that, although this problem is reproducible in
napi_busy_loop, it only shows up when setting the polling interval to as high
as 2ms, which is far larger than recommended 50us-100us in the documentation.
So napi_busy_loop is left untouched.
Lastly, this does not affect RT kernels, which does not enter the scheduler
through cond_resched(). Without the mentioned side effect, schedule() will
be called time by time, and clear the RCU task holdouts.
Yan Zhai [Tue, 19 Mar 2024 20:44:40 +0000 (13:44 -0700)]
bpf: report RCU QS in cpumap kthread
When there are heavy load, cpumap kernel threads can be busy polling
packets from redirect queues and block out RCU tasks from reaching
quiescent states. It is insufficient to just call cond_resched() in such
context. Periodically raise a consolidated RCU QS before cond_resched
fixes the problem.
Yan Zhai [Tue, 19 Mar 2024 20:44:37 +0000 (13:44 -0700)]
net: report RCU QS on threaded NAPI repolling
NAPI threads can keep polling packets under load. Currently it is only
calling cond_resched() before repolling, but it is not sufficient to
clear out the holdout of RCU tasks, which prevent BPF tracing programs
from detaching for long period. This can be reproduced easily with
following set up:
ip netns add test1
ip netns add test2
ip -n test1 link add veth1 type veth peer name veth2 netns test2
ip -n test1 link set veth1 up
ip -n test1 link set lo up
ip -n test2 link set veth2 up
ip -n test2 link set lo up
ip -n test1 addr add 192.168.1.2/31 dev veth1
ip -n test1 addr add 1.1.1.1/32 dev lo
ip -n test2 addr add 192.168.1.3/31 dev veth2
ip -n test2 addr add 2.2.2.2/31 dev lo
ip -n test1 route add default via 192.168.1.3
ip -n test2 route add default via 192.168.1.2
for i in `seq 10 210`; do
for j in `seq 10 210`; do
ip netns exec test2 iptables -I INPUT -s 3.3.$i.$j -p udp --dport 5201
done
done
ip netns exec test2 ethtool -K veth2 gro on
ip netns exec test2 bash -c 'echo 1 > /sys/class/net/veth2/threaded'
ip netns exec test1 ethtool -K veth1 tso off
Then run an iperf3 client/server and a bpftrace script can trigger it:
Yan Zhai [Tue, 19 Mar 2024 20:44:34 +0000 (13:44 -0700)]
rcu: add a helper to report consolidated flavor QS
When under heavy load, network processing can run CPU-bound for many
tens of seconds. Even in preemptible kernels (non-RT kernel), this can
block RCU Tasks grace periods, which can cause trace-event removal to
take more than a minute, which is unacceptably long.
This commit therefore creates a new helper function that passes through
both RCU and RCU-Tasks quiescent states every 100 milliseconds. This
hard-coded value suffices for current workloads.
Herve Codina [Thu, 14 Mar 2024 12:00:06 +0000 (13:00 +0100)]
lib/bitmap: Fix bitmap_scatter() and bitmap_gather() kernel doc
The make htmldoc command failed with the following error
... include/linux/bitmap.h:524: ERROR: Unexpected indentation.
... include/linux/bitmap.h:524: CRITICAL: Unexpected section title or transition.
Move the visual representation to a literal block.
Linus Torvalds [Wed, 20 Mar 2024 23:42:47 +0000 (16:42 -0700)]
Merge tag 'v6.9-rc-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server updates from Steve French:
- add support for durable file handles (an important data integrity
feature)
- fixes for potential out of bounds issues
- fix possible null dereference in close
- getattr fixes
- trivial typo fix and minor cleanup
* tag 'v6.9-rc-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: remove module version
ksmbd: fix potencial out-of-bounds when buffer offset is invalid
ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16()
ksmbd: Fix spelling mistake "connction" -> "connection"
ksmbd: fix possible null-deref in smb_lazy_parent_lease_break_close
ksmbd: add support for durable handles v1/v2
ksmbd: mark SMB2_SESSION_EXPIRED to session when destroying previous session
ksmbd: retrieve number of blocks using vfs_getattr in set_file_allocation_info
ksmbd: replace generic_fillattr with vfs_getattr
Linus Torvalds [Wed, 20 Mar 2024 23:37:07 +0000 (16:37 -0700)]
Merge tag 'trace-tools-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull trace tool updates from Steven Rostedt:
"Tracing:
- Update makefiles for latency-collector and RTLA, using tools/build/
makefiles like perf does, inheriting its benefits. For example,
having a proper way to handle library dependencies.
- The timerlat tracer has an interface for any tool to use. rtla
timerlat tool uses this interface dispatching its own threads as
workload. But, rtla timerlat could also be used for any other
process. So, add 'rtla timerlat -U' option, allowing the timerlat
tool to measure the latency of any task using the timerlat tracer
interface.
Verification:
- Update makefiles for verification/rv, using tools/build/ makefiles
like perf does, inheriting its benefits. For example, having a
proper way to handle dependencies"
* tag 'trace-tools-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/rtla: Add -U/--user-load option to timerlat
tools/verification: Use tools/build makefiles on rv
tools/rtla: Use tools/build makefiles to build rtla
tools/tracing: Use tools/build makefiles on latency-collector
Masahiro Yamada [Wed, 20 Mar 2024 16:52:11 +0000 (01:52 +0900)]
kconfig: tests: test dependency after shuffling choices
Commit c8fb7d7e48d1 ("kconfig: fix broken dependency in randconfig-
generated .config") fixed the issue, but I did not add a test case.
This commit adds a test case that emulates the reported situation.
The test would fail without c8fb7d7e48d1.
To handle the choice "choose X", FOO must be calculated beforehand.
FOO depends on A, which is a member of another choice "choose A or B".
Kconfig _temporarily_ assumes the value of A to proceed. The choice
"choose A or B" will be shuffled later, but the result may or may not
meet "FOO depends on A". Kconfig should invalidate the symbol values
and recompute them.
In the real example for ARCH=arm64, the choice "Instrumentation type"
needs the value of CPU_BIG_ENDIAN. The choice "Endianness" will be
shuffled later.
Masahiro Yamada [Wed, 20 Mar 2024 16:52:10 +0000 (01:52 +0900)]
kconfig: tests: add a test for randconfig with dependent choices
Since commit 3b9a19e08960 ("kconfig: loop as long as we changed some
symbols in randconfig"), conf_set_all_new_symbols() is repeated until
there is no more choice left to be shuffled. The motivation was to
shuffle a choice nested in another choice.
Although commit 09d5873e4d1f ("kconfig: allow only 'config', 'comment',
and 'if' inside 'choice'") disallowed the nested choice structure,
we must still keep 3b9a19e08960 because there are still cases where
conf_set_all_new_symbols() must iterate.
scripts/kconfig/tests/choice_randomize/Kconfig is the test case.
The second choice depends on 'B', which is the member of the first
choice.
With 3b9a19e08960 reverted, we would never get the pattern specified by
scripts/kconfig/tests/choice_randomize/expected_config2.
A real example can be found in lib/Kconfig.debug. Without 3b9a19e08960,
the randconfig would not shuffle the "Compressed Debug information"
choice, which depends on DEBUG_INFO, which is derived from another
choice "Debug information".
My goal is to refactor Kconfig so that randconfig will work more
simply, without using the loop.
For now, let's add a test case to ensure all dependent choices are
shuffled, as it is a somewhat tricky case for the current Kconfig.
This patchset fixes the problems related to RVU mailbox.
During long run tests some times VF commands like setting
MTU or toggling interface fails because VF mailbox is timedout
waiting for response from PF.
Below are the fixes
Patch 1: There are two types of messages in RVU mailbox namely up and down
messages. Down messages are synchronous messages where a PF/VF sends
a message to AF and AF replies back with response. UP messages are
notifications and are asynchronous like AF sending link events to
PF. When VF sends a down message to PF, PF forwards to AF and sends
the response from AF back to VF. PF has to forward VF messages since
there is no path in hardware for VF to send directly to AF.
There is one mailbox interrupt from AF to PF when raised could mean
two scenarios one is where AF sending reply to PF for a down message
sent by PF and another one is AF sending up message asynchronously
when link changed for that PF. Receiving the up message interrupt while
PF is in middle of forwarding down message causes mailbox errors.
Fix this by receiver detecting the type of message from the mbox data register
set by sender.
Patch 2:
During VF driver remove, VF has to wait until last message is
completed and then turn off mailbox interrupts from PF.
Patch 3:
Do not use ordered workqueue for message processing since multiple works are
queued simultaneously by all the VFs and PF link UP messages.
Patch 4:
When sending link event to VF by PF check whether VF is really up to
receive this message.
Patch 5:
In AF driver, use separate interrupt handlers for the AF-VF interrupt and
AF-PF interrupt. Sometimes both interrupts are raised to two CPUs at same
time and both CPUs execute same function at same time corrupting the data.
v2 changes:
Added missing mutex unlock in error path in patch 1
Refactored if else logic in patch 1 as suggested by Paolo Abeni
====================
octeontx2-af: Use separate handlers for interrupts
For PF to AF interrupt vector and VF to AF vector same
interrupt handler is registered which is causing race condition.
When two interrupts are raised to two CPUs at same time
then two cores serve same event corrupting the data.
Fixes: 7304ac4567bc ("octeontx2-af: Add mailbox IRQ and msg handlers") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
octeontx2-pf: Send UP messages to VF only when VF is up.
When PF sending link status messages to VF, it is possible
that by the time link_event_task work function is executed
VF might have brought down. Hence before sending VF link
status message check whether VF is up to receive it.
Fixes: ad513ed938c9 ("octeontx2-vf: Link event notification support") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
octeontx2-pf: Use default max_active works instead of one
Only one execution context for the workqueue used for PF and
VFs mailbox communication is incorrect since multiple works are
queued simultaneously by all the VFs and PF link UP messages.
Hence use default number of execution contexts by passing zero
as max_active to alloc_workqueue function. With this fix in place,
modify UP messages also to wait until completion.
Fixes: d424b6c02415 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
octeontx2-pf: Wait till detach_resources msg is complete
During VF driver remove, a message is sent to detach VF
resources to PF but VF is not waiting until message is
complete. Also mailbox interrupts need to be turned off
after the detach resource message is complete. This patch
fixes that problem.
Fixes: 05fcc9e08955 ("octeontx2-pf: Attach NIX and NPA block LFs") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
octeontx2: Detect the mbox up or down message via register
A single line of interrupt is used to receive up notifications
and down reply messages from AF to PF (similarly from PF to its VF).
PF acts as bridge and forwards VF messages to AF and sends respsones
back from AF to VF. When an async event like link event is received
by up message when PF is in middle of forwarding VF message then
mailbox errors occur because PF state machine is corrupted.
Since VF is a separate driver or VF driver can be in a VM it is
not possible to serialize from the start of communication at VF.
Hence to differentiate between type of messages at PF this patch makes
sender to set mbox data register with distinct values for up and down
messages. Sender also checks whether previous interrupt is received
before triggering current interrupt by waiting for mailbox data register
to become zero.
Fixes: 5a6d7c9daef3 ("octeontx2-pf: Mailbox communication with AF") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
The timerlat tracer provides an interface for any application to wait
for the timerlat's periodic wakeup. Currently, rtla timerlat uses it
to dispatch its user-space workload (-u option).
But as the tracer interface is generic, rtla timerlat can also be used
to monitor any workload that uses it. For example, a user might
place their own workload to wait on the tracer interface, and
monitor the results with rtla timerlat.
Add the -U option to rtla timerlat top and hist. With this option, rtla
timerlat will not dispatch its workload but only setting up the
system, waiting for a user to dispatch its workload.
The sample code in this patch is an example of python application
that loops in the timerlat tracer fd.
To use it, dispatch:
# rtla timerlat -U
In a terminal, then run the python program on another terminal,
specifying the CPU to run it. For example, setting on CPU 1:
#./timerlat_load.py 1
Then rtla timerlat will start printing the statistics of the
./timerlat_load.py app.
An interesting point is that the "Ret user Timer Latency" value
is the overall response time of the load. The sample load does
a memory copy to exemplify that.
The stop tracing options on rtla timerlat works in this setup
as well, including auto analysis.
tools/rtla: Use tools/build makefiles to build rtla
Use tools/build/ makefiles to build rtla, inheriting the benefits of
it. For example, having a proper way to handle dependencies.
rtla is built using perf infra-structure when building inside the
kernel tree.
At this point, rtla diverges from perf in two points: Documentation
and tarball generation/build.
At the documentation level, rtla is one step ahead, placing the
documentation at Documentation/tools/rtla/, using the same build
tools as kernel documentation. The idea is to move perf
documentation to the same scheme and then share the same makefiles.
rtla has a tarball target that the (old) RHEL8 uses. The tarball was
kept using a simple standalone makefile for compatibility. The
standalone makefile shares most of the code, e.g., flags, with
regular buildings.
The tarball method was set as deprecated. If necessary, we can make
a rtla tarball like perf, which includes the entire tools/build.
But this would also require changes in the user side (the directory
structure changes, and probably the deps to build the package).
tools/tracing: Use tools/build makefiles on latency-collector
Use tools/build/ makefiles to build latency-collector, inheriting
the benefits of it. For example: Before this patch, a missing
tracefs/traceevents headers will result in fail like this:
~/linux/tools/tracing/latency $ make
cc -Wall -Wextra -g -O2 -o latency-collector latency-collector.c -lpthread
latency-collector.c:26:10: fatal error: tracefs.h: No such file or directory
26 | #include <tracefs.h>
| ^~~~~~~~~~~
compilation terminated.
make: *** [Makefile:14: latency-collector] Error 1
Which is not that helpful. After this change it reports:
~/linux/tools/tracing/latency# make
Auto-detecting system features:
... libtraceevent: [ OFF ]
... libtracefs: [ OFF ]
libtraceevent is missing. Please install libtraceevent-dev/libtraceevent-devel
libtracefs is missing. Please install libtracefs-dev/libtracefs-devel
Makefile.config:29: *** Please, check the errors above.. Stop.
This type of output is common across other tools in tools/ like perf
and objtool.
1) Fix possible page_pool leak triggered by esp_output.
From Dragos Tatulea.
2) Fix UDP encapsulation in software GSO path.
From Leon Romanovsky.
* tag 'ipsec-2024-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: Allow UDP encapsulation only in offload modes
net: esp: fix bad handling of pages from page_pool
====================
syzkaller reported a read of uninit req->syncookie. [0]
Originally, req->syncookie was used only in tcp_conn_request()
to indicate if we need to encode SYN cookie in SYN+ACK, so the
field remains uninitialised in other places.
The commit 695751e31a63 ("bpf: tcp: Handle BPF SYN Cookie in
cookie_v[46]_check().") added another meaning in ACK path;
req->syncookie is set true if SYN cookie is validated by BPF
kfunc.
After the change, cookie_v[46]_check() always read req->syncookie,
but it is not initialised in the normal SYN cookie case as reported
by KMSAN.
Let's make sure we always initialise req->syncookie in reqsk_alloc().
Thinh Tran [Fri, 15 Mar 2024 20:55:35 +0000 (15:55 -0500)]
net/bnx2x: Prevent access to a freed page in page_pool
Fix race condition leading to system crash during EEH error handling
During EEH error recovery, the bnx2x driver's transmit timeout logic
could cause a race condition when handling reset tasks. The
bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
overlap with the EEH driver's attempt to reset the device using
bnx2x_io_slot_reset(), which also tries to free SGEs. This race
condition can result in system crashes due to accessing freed memory
locations in bnx2x_free_rx_sge()
799 static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
800 struct bnx2x_fastpath *fp, u16 index)
801 {
802 struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
803 struct page *page = sw_buf->page;
....
where sw_buf was set to NULL after the call to dma_unmap_page()
by the preceding thread.
Linus Torvalds [Wed, 20 Mar 2024 00:27:25 +0000 (17:27 -0700)]
Merge tag 'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Assorted bugfixes.
Most are fixes for simple assertion pops; the most significant fix is
for a deadlock in recovery when we have to rewrite large numbers of
btree nodes to fix errors. This was incorrectly running out of the
same workqueue as the core interior btree update path - we now give it
its own single threaded workqueue.
This was visible to users as "bch2_btree_update_start(): error:
BCH_ERR_journal_reclaim_would_deadlock" - and then recovery hanging"
* tag 'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix lost wakeup on journal shutdown
bcachefs; Fix deadlock in bch2_btree_update_start()
bcachefs: ratelimit errors from async_btree_node_rewrite
bcachefs: Run check_topology() first
bcachefs: Improve bch2_fatal_error()
bcachefs: Fix lost transaction restart error
bcachefs: Don't corrupt journal keys gap buffer when dropping alloc info
bcachefs: fix for building in userspace
bcachefs: bch2_snapshot_is_ancestor() now safe to call in early recovery
bcachefs: Fix nested transaction restart handling in bch2_bucket_gens_init()
bcachefs: Improve sysfs internal/btree_updates
bcachefs: Split out btree_node_rewrite_worker
bcachefs: Fix locking in bch2_alloc_write_key()
bcachefs: Avoid extent entry type assertions in .invalid()
bcachefs: Fix spurious -BCH_ERR_transaction_restart_nested
bcachefs: Fix check_key_has_snapshot() call
bcachefs: Change "accounting overran journal reservation" to a warning
Linus Torvalds [Tue, 19 Mar 2024 18:57:26 +0000 (11:57 -0700)]
Merge tag 'soc-late-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more ARM SoC updates from Arnd Bergmann:
"These are changes that for some reason ended up not making it into the
first four branches but that should still make it into 6.9:
- A rework of the omap clock support that touches both drivers and
device tree files
- The reset controller branch changes that had a dependency on late
bugfixes. Merging them here avoids a backmerge of 6.8-rc5 into the
drivers branch
- The RISC-V/starfive, RISC-V/microchip and ARM/Broadcom devicetree
changes that got delayed and needed some extra time in linux-next
for wider testing"
* tag 'soc-late-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits)
soc: fsl: dpio: fix kcalloc() argument order
bus: ts-nbus: Improve error reporting
bus: ts-nbus: Convert to atomic pwm API
riscv: dts: starfive: jh7110: Add camera subsystem nodes
ARM: bcm: stop selecing CONFIG_TICK_ONESHOT
ARM: dts: omap3: Update clksel clocks to use reg instead of ti,bit-shift
ARM: dts: am3: Update clksel clocks to use reg instead of ti,bit-shift
clk: ti: Improve clksel clock bit parsing for reg property
clk: ti: Handle possible address in the node name
dt-bindings: pwm: opencores: Add compatible for StarFive JH8100
dt-bindings: riscv: cpus: reg matches hart ID
reset: Instantiate reset GPIO controller for shared reset-gpios
reset: gpio: Add GPIO-based reset controller
cpufreq: do not open-code of_phandle_args_equal()
of: Add of_phandle_args_equal() helper
reset: simple: add support for Sophgo SG2042
dt-bindings: reset: sophgo: support SG2042
riscv: dts: microchip: add specific compatible for mpfs pdma
riscv: dts: microchip: add missing CAN bus clocks
ARM: brcmstb: Add debug UART entry for 74165
...
Linus Torvalds [Tue, 19 Mar 2024 18:38:27 +0000 (11:38 -0700)]
Merge tag 's390-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Various virtual vs physical address usage fixes
- Add new bitwise types and helper functions and use them in s390
specific drivers and code to make it easier to find virtual vs
physical address usage bugs.
Right now virtual and physical addresses are identical for s390,
except for module, vmalloc, and similar areas. This will be changed,
hopefully with the next merge window, so that e.g. the kernel image
and modules will be located close to each other, allowing for direct
branches and also for some other simplifications.
As a prerequisite this requires to fix all misuses of virtual and
physical addresses. As it turned out people are so used to the
concept that virtual and physical addresses are the same, that new
bugs got added to code which was already fixed. In order to avoid
that even more code gets merged which adds such bugs add and use new
bitwise types, so that sparse can be used to find such usage bugs.
Most likely the new types can go away again after some time
- Provide a simple ARCH_HAS_DEBUG_VIRTUAL implementation
- Fix kprobe branch handling: if an out-of-line single stepped relative
branch instruction has a target address within a certain address area
in the entry code, the program check handler may incorrectly execute
cleanup code as if KVM code was executed, leading to crashes
- Fix reference counting of zcrypt card objects
- Various other small fixes and cleanups
* tag 's390-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
s390/entry: compare gmap asce to determine guest/host fault
s390/entry: remove OUTSIDE macro
s390/entry: add CIF_SIE flag and remove sie64a() address check
s390/cio: use while (i--) pattern to clean up
s390/raw3270: make class3270 constant
s390/raw3270: improve raw3270_init() readability
s390/tape: make tape_class constant
s390/vmlogrdr: make vmlogrdr_class constant
s390/vmur: make vmur_class constant
s390/zcrypt: make zcrypt_class constant
s390/mm: provide simple ARCH_HAS_DEBUG_VIRTUAL support
s390/vfio_ccw_cp: use new address translation helpers
s390/iucv: use new address translation helpers
s390/ctcm: use new address translation helpers
s390/lcs: use new address translation helpers
s390/qeth: use new address translation helpers
s390/zfcp: use new address translation helpers
s390/tape: fix virtual vs physical address confusion
s390/3270: use new address translation helpers
s390/3215: use new address translation helpers
...
tracing: Just use strcmp() for testing __string() and __assign_str() match
As __assign_str() no longer uses its "src" parameter, there's a check to
make sure nothing depends on it being different than what was passed to
__string(). It originally just compared the pointer passed to __string()
with the pointer passed into __assign_str() via the "src" parameter. But
there's a couple of outliers that just pass in a quoted string constant,
where comparing the pointers is UB to the compiler, as the compiler is
free to create multiple copies of the same string constant.
Instead, just use strcmp(). It may slow down the trace event, but this
will eventually be removed.
Also, fix the issue of passing NULL to strcmp() by adding a WARN_ON() to
make sure that both "src" and the pointer saved in __string() are either
both NULL or have content, and then checking if "src" is not NULL before
performing the strcmp().
Linus Torvalds [Tue, 19 Mar 2024 18:19:36 +0000 (11:19 -0700)]
Merge tag 'pm-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the Energy Model to make it prevent errors due to power
unit mismatches, fix a typo in power management documentation, convert
one driver to using a platform remove callback returning void, address
two cpufreq issues (one in the core and one in the DT driver), and
enable boost support in the SCMI cpufreq driver.
Specifics:
- Modify the Energy Model code to bail out and complain if the unit
of power is not uW to prevent errors due to unit mismatches (Lukasz
Luba)
- Make the intel_rapl platform driver use a remove callback returning
void (Uwe Kleine-König)
- Fix typo in the suspend and interrupts document (Saravana Kannan)
- Make per-policy boost flags actually take effect on platforms using
cpufreq_boost_set_sw() (Sibi Sankar)
- Enable boost support in the SCMI cpufreq driver (Sibi Sankar)
- Make the DT cpufreq driver use zalloc_cpumask_var() for allocating
cpumasks to avoid using unitinialized memory (Marek Szyprowski)"
* tag 'pm-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: scmi: Enable boost support
firmware: arm_scmi: Add support for marking certain frequencies as turbo
cpufreq: dt: always allocate zeroed cpumask
cpufreq: Fix per-policy boost behavior on SoCs using cpufreq_boost_set_sw()
Documentation: power: Fix typo in suspend and interrupts doc
PM: EM: Force device drivers to provide power in uW
powercap: intel_rapl: Convert to platform remove callback returning void
Linus Torvalds [Tue, 19 Mar 2024 18:15:14 +0000 (11:15 -0700)]
Merge tag 'acpi-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These update ACPI documentation and kerneldoc comments.
Specifics:
- Add markup to generate links from footnotes in the ACPI enumeration
document (Chris Packham)
- Update the handle_eject_request() kerneldoc comment to document the
arguments of the function and improve kerneldoc comments for ACPI
suspend and hibernation functions (Yang Li)"
* tag 'acpi-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Improve kerneldoc comments for suspend and hibernation functions
ACPI: docs: enumeration: Make footnotes links
ACPI: Document handle_eject_request() arguments
Linus Torvalds [Tue, 19 Mar 2024 18:11:01 +0000 (11:11 -0700)]
Merge tag 'thermal-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These update thermal drivers for ARM platforms by adding new hardware
support (r8a779h0, H616 THS), addressing issues (Mediatek LVTS,
Mediatek MT7896, thermal-of) and cleaning up code.
Specifics:
- Fix memory leak in the error path at probe time in the Mediatek
LVTS driver (Christophe Jaillet)
- Fix control buffer enablement regression on Meditek MT7896 (Frank
Wunderlich)
- Drop spaces before TABs in different places: thermal-of, ST drivers
and Makefile (Geert Uytterhoeven)
- Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary
among several SoC versions (Fabio Estevam)
- Add support for the H616 THS controller on Sun8i platforms (Martin
Botka)
- Don't fail probe due to zone registration failure because there is
no trip points defined in the DT (Mark Brown)
- Support variable TMU array size for new platforms (Peng Fan)
- Adjust the DT binding for thermal-of and make the polling time not
required and assume it is zero when not found in the DT (Konrad
Dybcio)
- Add r8a779h0 support in both the DT and the rcar_gen3 driver (Geert
Uytterhoeven)"
* tag 'thermal-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/drivers/rcar_gen3: Add support for R-Car V4M
dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support
thermal/of: Assume polling-delay(-passive) 0 when absent
dt-bindings: thermal-zones: Don't require polling-delay(-passive)
thermal/drivers/qoriq: Fix getting tmu range
thermal/drivers/sun8i: Don't fail probe due to zone registration failure
thermal/drivers/sun8i: Add support for H616 THS controller
thermal/drivers/sun8i: Add SRAM register access code
thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors
thermal/drivers/sun8i: Explain unknown H6 register value
dt-bindings: thermal: sun8i: Add H616 THS controller
soc: sunxi: sram: export register 0 for THS on H616
dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems
thermal: Drop spaces before TABs
thermal/drivers/mediatek: Fix control buffer enablement on MT7896
thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
Linus Torvalds [Tue, 19 Mar 2024 18:05:34 +0000 (11:05 -0700)]
Merge tag 'ata-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
"A single fix for ASMedia HBAs.
These HBAs do not indicate that they support SATA Port Multipliers
CAP.SPM (Supports Port Multiplier) is not set.
Likewise, they do not allow you to probe the devices behind an
attached PMP, as defined according to the SATA-IO PMP specification.
Instead, they have decided to implement their own version of PMP,
and because of this, plugging in a PMP actually works, even if the
HBA claims that it does not support PMP.
Revert a recent quirk for these HBAs, as that breaks ASMedia's own
implementation of PMP.
Unfortunately, this will once again give some users of these HBAs
significantly increased boot time. However, a longer boot time for
some, is the lesser evil compared to some other users not being able
to detect their drives at all"
* tag 'ata-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ahci: asm1064: asm1166: don't limit reported ports
Linus Torvalds [Tue, 19 Mar 2024 15:57:39 +0000 (08:57 -0700)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- Per vq sizes in vdpa
- Info query for block devices support in vdpa
- DMA sync callbacks in vduse
- Fixes, cleanups
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits)
virtio_net: rename free_old_xmit_skbs to free_old_xmit
virtio_net: unify the code for recycling the xmit ptr
virtio-net: add cond_resched() to the command waiting loop
virtio-net: convert rx mode setting to use workqueue
virtio: packed: fix unmap leak for indirect desc table
vDPA: report virtio-blk flush info to user space
vDPA: report virtio-block read-only info to user space
vDPA: report virtio-block write zeroes configuration to user space
vDPA: report virtio-block discarding configuration to user space
vDPA: report virtio-block topology info to user space
vDPA: report virtio-block MQ info to user space
vDPA: report virtio-block max segments in a request to user space
vDPA: report virtio-block block-size to user space
vDPA: report virtio-block max segment size to user space
vDPA: report virtio-block capacity to user space
virtio: make virtio_bus const
vdpa: make vdpa_bus const
vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
vDPA/ifcvf: get_max_vq_size to return max size
virtio_vdpa: create vqs with the actual size
...
Linus Torvalds [Tue, 19 Mar 2024 15:48:09 +0000 (08:48 -0700)]
Merge tag 'for-linus-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- Xen event channel handling fix for a regression with a rare kernel
config and some added hardening
- better support of running Xen dom0 in PVH mode
- a cleanup for the xen grant-dma-iommu driver
* tag 'for-linus-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: increment refcnt only if event channel is refcounted
xen/evtchn: avoid WARN() when unbinding an event channel
x86/xen: attempt to inflate the memory balloon on PVH
xen/grant-dma-iommu: Convert to platform remove callback returning void
Nikita Kiryushin [Fri, 15 Mar 2024 17:50:52 +0000 (20:50 +0300)]
net: phy: fix phy_read_poll_timeout argument type in genphy_loopback
read_poll_timeout inside phy_read_poll_timeout can set val negative
in some cases (for example, __mdiobus_read inside phy_read can return
-EOPNOTSUPP).
Supposedly, commit 4ec732951702 ("net: phylib: fix phy_read*_poll_timeout()")
should fix problems with wrong-signed vals, but I do not see how
as val is sent to phy_read as is and __val = phy_read (not val)
is checked for sign.
Change val type for signed to allow better error handling as done in other
phy_read_poll_timeout callers. This will not fix any error handling
by itself, but allows, for example, to modify cond with appropriate
sign check or check resulting val separately.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Michal Koutný [Fri, 15 Mar 2024 16:02:10 +0000 (17:02 +0100)]
net/sched: Add module alias for sch_fq_pie
The commit 2c15a5aee2f3 ("net/sched: Load modules via their alias")
starts loading modules via aliases and not canonical names. The new
aliases were added in commit 241a94abcf46 ("net/sched: Add module
aliases for cls_,sch_,act_ modules") via a Coccinele script.
sch_fq_pie.c is missing module.h header and thus Coccinele did not patch
it. Add the include and module alias manually, so that autoloading works
for sch_fq_pie too.
(Note: commit message in commit 241a94abcf46 ("net/sched: Add module
aliases for cls_,sch_,act_ modules") was mangled due to '#'
misinterpretation. The predicate haskernel is:
Tobias Brunner [Fri, 15 Mar 2024 14:35:40 +0000 (15:35 +0100)]
ipv4: raw: Fix sending packets from raw sockets via IPsec tunnels
Since the referenced commit, the xfrm_inner_extract_output() function
uses the protocol field to determine the address family. So not setting
it for IPv4 raw sockets meant that such packets couldn't be tunneled via
IPsec anymore.
IPv6 raw sockets are not affected as they already set the protocol since 9c9c9ad5fae7 ("ipv6: set skb->protocol on tcp, raw and ip6_append_data
genereated skbs").
Felix Maurer [Fri, 15 Mar 2024 12:04:52 +0000 (13:04 +0100)]
hsr: Handle failures in module init
A failure during registration of the netdev notifier was not handled at
all. A failure during netlink initialization did not unregister the netdev
notifier.
Handle failures of netdev notifier registration and netlink initialization.
Both functions should only return negative values on failure and thereby
lead to the hsr module not being loaded.
Mathias Nyman [Fri, 8 Mar 2024 11:34:25 +0000 (13:34 +0200)]
usb: usb-acpi: Fix oops due to freeing uninitialized pld pointer
If reading the ACPI _PLD port location object fails, or the port
doesn't have a _PLD ACPI object then the *pld pointer will remain
uninitialized and oops when freed.
The patch that caused this is currently in next, on its way to v6.9.
So no need to add this to stable or current 6.8 kernel.
Yuezhang Mo [Fri, 5 Aug 2022 08:57:04 +0000 (16:57 +0800)]
exfat: remove unused functions
exfat_count_ext_entries() is no longer called, remove it.
exfat_update_dir_chksum() is no longer called, remove it and
rename exfat_update_dir_chksum_with_entry_set() to it.
Yuezhang Mo [Mon, 30 Oct 2023 10:00:51 +0000 (18:00 +0800)]
exfat: convert exfat_find_empty_entry() to use dentry cache
Before this conversion, each dentry traversed needs to be read
from the storage device or page cache. There are at least 16
dentries in a sector. This will result in frequent page cache
searches.
After this conversion, if all directory entries in a sector are
used, the sector only needs to be read once.
Yuezhang Mo [Fri, 5 Aug 2022 08:42:02 +0000 (16:42 +0800)]
exfat: convert exfat_init_ext_entry() to use dentry cache
Before this conversion, in exfat_init_ext_entry(), to init
the dentries in a dentry set, the sync times is equals the
dentry number if 'dirsync' or 'sync' is enabled.
That affects not only performance but also device life.
After this conversion, only needs to be synchronized once if
'dirsync' or 'sync' is enabled.
Yuezhang Mo [Wed, 1 Feb 2023 10:53:18 +0000 (18:53 +0800)]
exfat: move free cluster out of exfat_init_ext_entry()
exfat_init_ext_entry() is an init function, it's a bit strange
to free cluster in it. And the argument 'inode' will be removed
from exfat_init_ext_entry(). So this commit changes to free the
cluster in exfat_remove_entries().
Yuezhang Mo [Fri, 5 Aug 2022 07:55:58 +0000 (15:55 +0800)]
exfat: convert exfat_remove_entries() to use dentry cache
Before this conversion, in exfat_remove_entries(), to mark the
dentries in a dentry set as deleted, the sync times is equals
the dentry numbers if 'dirsync' or 'sync' is enabled.
That affects not only performance but also device life.
After this conversion, only needs to be synchronized once if
'dirsync' or 'sync' is enabled.
Yuezhang Mo [Mon, 30 Oct 2023 09:25:31 +0000 (17:25 +0800)]
exfat: add exfat_get_empty_dentry_set() helper
This helper is used to lookup empty dentry set. If there are
no enough empty dentries at the input location, this helper will
return the number of dentries that need to be skipped for the
next lookup.
Yuezhang Mo [Fri, 8 Dec 2023 11:17:02 +0000 (19:17 +0800)]
exfat: add __exfat_get_dentry_set() helper
Since exfat_get_dentry_set() invokes the validate functions of
exfat_validate_entry(), it only supports getting a directory
entry set of an existing file, doesn't support getting an empty
entry set.
Yewon Choi [Fri, 15 Mar 2024 09:28:38 +0000 (18:28 +0900)]
rds: introduce acquire/release ordering in acquire/release_in_xmit()
acquire/release_in_xmit() work as bit lock in rds_send_xmit(), so they
are expected to ensure acquire/release memory ordering semantics.
However, test_and_set_bit/clear_bit() don't imply such semantics, on
top of this, following smp_mb__after_atomic() does not guarantee release
ordering (memory barrier actually should be placed before clear_bit()).
Instead, we use clear_bit_unlock/test_and_set_bit_lock() here.
Previously, patches have been added to limit the reported count of SATA
ports for asm1064 and asm1166 SATA controllers, as those controllers do
report more ports than physically having.
While it is allowed to report more ports than physically having in CAP.NP,
it is not allowed to report more ports than physically having in the PI
(Ports Implemented) register, which is what these HBAs do.
(This is a AHCI spec violation.)
Unfortunately, it seems that the PMP implementation in these ASMedia HBAs
is also violating the AHCI and SATA-IO PMP specification.
What these HBAs do is that they do not report that they support PMP
(CAP.SPM (Supports Port Multiplier) is not set).
Instead, they have decided to add extra "virtual" ports in the PI register
that is used if a port multiplier is connected to any of the physical
ports of the HBA.
Enumerating the devices behind the PMP as specified in the AHCI and
SATA-IO specifications, by using PMP READ and PMP WRITE commands to the
physical ports of the HBA is not possible, you have to use the "virtual"
ports.
This is of course bad, because this gives us no way to detect the device
and vendor ID of the PMP actually connected to the HBA, which means that
we can not apply the proper PMP quirks for the PMP that is connected to
the HBA.
Limiting the port map will thus stop these controllers from working with
SATA Port Multipliers.
This patch reverts both patches for asm1064 and asm1166, so old behavior
is restored and SATA PMP will work again, but it will also reintroduce the
(minutes long) extra boot time for the ASMedia controllers that do not
have a PMP connected (either on the PCIe card itself, or an external PMP).
However, a longer boot time for some, is the lesser evil compared to some
other users not being able to detect their drives at all.
Fixes: 0077a504e1a4 ("ahci: asm1166: correct count of reported ports") Fixes: 9815e3961754 ("ahci: asm1064: correct count of reported ports") Cc: [email protected] Reported-by: Matt <[email protected]> Signed-off-by: Conrad Kostecki <[email protected]> Reviewed-by: Hans de Goede <[email protected]>
[cassel: rewrote commit message] Signed-off-by: Niklas Cassel <[email protected]>
Jakub Kicinski [Fri, 15 Mar 2024 00:21:08 +0000 (17:21 -0700)]
tools: ynl: add header guards for nlctrl
I "extracted" YNL C into a GitHub repo to make it easier
to use in other projects: https://github.com/linux-netdev/ynl-c
GitHub actions use Ubuntu by default, and the kernel headers
there are missing f329a0ebeaba ("genetlink: correct uAPI defines").
Add the direct include workaround for nlctrl.
Paolo Abeni [Tue, 19 Mar 2024 10:22:54 +0000 (11:22 +0100)]
Merge branch 'wireguard-fixes-for-6-9-rc1'
Jason A. Donenfeld says:
====================
wireguard fixes for 6.9-rc1
This series has four WireGuard fixes:
1) Annotate a data race that KCSAN found by using READ_ONCE/WRITE_ONCE,
which has been causing syzkaller noise.
2) Use the generic netdev tstats allocation and stats getters instead of
doing this within the driver.
3) Explicitly check a flag variable instead of an empty list in the
netlink code, to prevent a UaF situation when paging through GET
results during a remove-all SET operation.
4) Set a flag in the RISC-V CI config so the selftests continue to boot.
====================
wireguard: selftests: set RISCV_ISA_FALLBACK on riscv{32,64}
This option is needed to continue booting with QEMU. Recent changes that
made this optional meant that it gets unset in the test harness, and so
WireGuard CI has been broken. Fix this by simply setting this option.
wireguard: netlink: access device through ctx instead of peer
The previous commit fixed a bug that led to a NULL peer->device being
dereferenced. It's actually easier and faster performance-wise to
instead get the device from ctx->wg. This semantically makes more sense
too, since ctx->wg->peer_allowedips.seq is compared with
ctx->allowedips_seq, basing them both in ctx. This also acts as a
defence in depth provision against freed peers.
wireguard: netlink: check for dangling peer via is_dead instead of empty list
If all peers are removed via wg_peer_remove_all(), rather than setting
peer_list to empty, the peer is added to a temporary list with a head on
the stack of wg_peer_remove_all(). If a netlink dump is resumed and the
cursored peer is one that has been removed via wg_peer_remove_all(), it
will iterate from that peer and then attempt to dump freed peers.
Fix this by instead checking peer->is_dead, which was explictly created
for this purpose. Also move up the device_update_lock lockdep assertion,
since reading is_dead relies on that.
It can be reproduced by a small script like:
echo "Setting config..."
ip link add dev wg0 type wireguard
wg setconf wg0 /big-config
(
while true; do
echo "Showing config..."
wg showconf wg0 > /dev/null
done
) &
sleep 4
wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n")
Resulting in:
BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20
Read of size 8 at addr ffff88811956ec70 by task wg/59
CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5
Call Trace:
<TASK>
dump_stack_lvl+0x47/0x70
print_address_description.constprop.0+0x2c/0x380
print_report+0xab/0x250
kasan_report+0xba/0xf0
__lock_acquire+0x182a/0x1b20
lock_acquire+0x191/0x4b0
down_read+0x80/0x440
get_peer+0x140/0xcb0
wg_get_device_dump+0x471/0x1130
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is
configured") moved the callback to dev_get_tstats64() to net core, so,
unless the driver is doing some custom stats collection, it does not
need to set .ndo_get_stats64.
Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it
doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64
function pointer.
Breno Leitao [Thu, 14 Mar 2024 22:49:07 +0000 (16:49 -0600)]
wireguard: device: leverage core stats allocator
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core
and convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.
With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.
Remove the allocation in this driver and leverage the network core
allocation instead.
wireguard: receive: annotate data-race around receiving_counter.counter
Syzkaller with KCSAN identified a data-race issue when accessing
keypair->receiving_counter.counter. Use READ_ONCE() and WRITE_ONCE()
annotations to mark the data race as intentional.
BUG: KCSAN: data-race in wg_packet_decrypt_worker / wg_packet_rx_poll
write to 0xffff888107765888 of 8 bytes by interrupt on cpu 0:
counter_validate drivers/net/wireguard/receive.c:321 [inline]
wg_packet_rx_poll+0x3ac/0xf00 drivers/net/wireguard/receive.c:461
__napi_poll+0x60/0x3b0 net/core/dev.c:6536
napi_poll net/core/dev.c:6605 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6738
__do_softirq+0xc4/0x279 kernel/softirq.c:553
do_softirq+0x5e/0x90 kernel/softirq.c:454
__local_bh_enable_ip+0x64/0x70 kernel/softirq.c:381
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
_raw_spin_unlock_bh+0x36/0x40 kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:396 [inline]
ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline]
wg_packet_decrypt_worker+0x6c5/0x700 drivers/net/wireguard/receive.c:499
process_one_work kernel/workqueue.c:2633 [inline]
...
read to 0xffff888107765888 of 8 bytes by task 3196 on cpu 1:
decrypt_packet drivers/net/wireguard/receive.c:252 [inline]
wg_packet_decrypt_worker+0x220/0x700 drivers/net/wireguard/receive.c:501
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2706
worker_thread+0x525/0x730 kernel/workqueue.c:2787
...