Linus Torvalds [Wed, 20 Nov 2024 22:05:34 +0000 (14:05 -0800)]
Merge tag 'media/v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fix from Mauro Carvalho Chehab:
- uvcvideo: Skip parsing frames of type UVC_VS_UNDEFINED in
uvc_parse_format
* tag 'media/v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: uvcvideo: Skip parsing frames of type UVC_VS_UNDEFINED in uvc_parse_format
Linus Torvalds [Wed, 20 Nov 2024 21:57:40 +0000 (13:57 -0800)]
Merge tag 'hid-for-linus-2024111801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- improvement in the way hid-bpf coexists with specific drivers (others
than hid-generic) that are already bound to devices (Benjamin
Tissoires)
- removal of three way-too-aggressive BUG_ON()s from HID drivers (He
Lugang)
- assorted cleanups and small code fixes to HID core (Dmitry Torokhov,
Yan Zhen, Nathan Chancellor, Andy Shevchenko)
- support for Corsair Void headset family (Stuart Hayhurst)
- Support for Goodix GT7986U SPI (Charles Wang)
- initial vendor-specific driver for Kysona, currently adding support
for Kysona M600 (Lode Willems)
- other assorted code cleanups and small bugfixes all over the place
* tag 'hid-for-linus-2024111801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (40 commits)
HID: multitouch: make mt_set_mode() less cryptic
HID: hid-goodix-spi: Add OF supports
dt-bindings: input: Goodix GT7986U SPI HID Touchscreen
HID: hyperv: streamline driver probe to avoid devres issues
HID: magicmouse: Apple Magic Trackpad 2 USB-C driver support
HID: rmi: Add select RMI4_F3A in Kconfig
HID: wacom: Interpret tilt data from Intuos Pro BT as signed values
HID: steelseries: Add capacity_level mapping
HID: steelseries: Fix battery requests stopping after some time
HID: hid-goodix: Fix HID get/set feature operation overwritten problem
HID: hid-goodix: Return 0 when receiving an empty HID feature package
HID: bpf: drop use of Logical|Physical|UsageRange
HID: bpf: Fix Rapoo M50 Plus Silent side buttons
HID: bpf: Fix NKRO on Mistel MD770
HID: replace BUG_ON() with WARN_ON()
HID: wacom: Set eraser status when either 'Eraser' or 'Invert' usage is set
HID: Kysona: add basic online status
HID: Kysona: check battery status every 5s using a workqueue
HID: Kysona: Add basic battery reporting for Kysona M600
HID: Add IDs for Kysona
...
Linus Torvalds [Wed, 20 Nov 2024 21:19:25 +0000 (13:19 -0800)]
Merge tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"Bindings:
- Enable dtc "interrupt_provider" warnings for binding examples. Fix
the warnings in fsl,mu-msi and ti,sci-inta due to this.
- Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton, and
altr,fpga-passive-serial to DT schema format
- Add some documentation on the different forms of YAML text blocks
which are a constant source of review comments
- Fix some schema errors in constraints for arrays
- Add compatibles for qcom,sar2130p-pdc and onnn,adt7462
DT core:
- Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
- Add some warnings on deprecated address handling
- Rework early_init_dt_scan() so the arch can pass in the phys
address of the DTB as __pa() is not always valid to use. This fixes
a warning for arm64 with kexec.
- Add and use some new DT graph iterators for iterating over ports
and endpoints
- Rework reserved-memory handling to be sized dynamically for fixed
regions
- Optimize of_modalias() to avoid a strlen() call
- Constify struct device_node and property pointers where ever
possible"
* tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (36 commits)
of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
dt-bindings: interrupt-controller: qcom,pdc: Add SAR2130P compatible
of/address: Rework bus matching to avoid warnings
of: WARN on deprecated #address-cells/#size-cells handling
of/fdt: Don't use default address cell sizes for address translation
dt-bindings: Enable dtc "interrupt_provider" warnings
of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
dt-bindings: cache: qcom,llcc: Fix X1E80100 reg entries
dt-bindings: watchdog: convert zii,rave-sp-wdt.txt to yaml format
dt-bindings: input: convert zii,rave-sp-pwrbutton.txt to yaml
media: xilinx-tpg: use new of_graph functions
fbdev: omapfb: use new of_graph functions
gpu: drm: omapdrm: use new of_graph functions
ASoC: audio-graph-card2: use new of_graph functions
ASoC: audio-graph-card: use new of_graph functions
ASoC: test-component: use new of_graph functions
of: property: use new of_graph functions
of: property: add of_graph_get_next_port_endpoint()
of: property: add of_graph_get_next_port()
of: module: remove strlen() call in of_modalias()
...
Linus Torvalds [Wed, 20 Nov 2024 20:55:41 +0000 (12:55 -0800)]
Merge tag 'auxdisplay-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay
Pull auxdisplay update from Andy Shevchenko:
- Move Holtek 16k33 driver to use agnostic i2c_get_match_data()
- Miscellaneuous cleanups
* tag 'auxdisplay-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
auxdisplay: Remove unused functions
auxdisplay: ht16k33: Make use of i2c_get_match_data()
auxdisplay: Drop explicit initialization of struct i2c_device_id::driver_data to 0
Linus Torvalds [Wed, 20 Nov 2024 20:51:32 +0000 (12:51 -0800)]
Merge tag 'mmc-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Add support for Ultra Capacity SD cards (SDUC, 2TB to 128TB)
- Add support for Ultra High-Speed II SD cards (UHS-II)
- Use a reset control for pwrseq_simple
- Add SD card quirk for broken poweroff notification
- Use GFP_NOIO for SD ACMD22
MMC host:
- bcm2835: Introduce proper clock handling
- mtk-sd: Add support for the Host-Software-Queue interface
- mtk-sd: Add support for the mt7988/mt8196 variants
- mtk-sd: Fix a couple of error paths in ->probe()
- sdhci: Add interface to support UHS-II SD cards
- sdhci_am654: Fixup support for changing the signal voltage level
- sdhci-cadence: Add support for the Microchip PIC64GX variant
- sdhci-esdhc-imx: Add support for eMMC HW-reset
- sdhci-msm: Add support for the X1E80100/IPQ5424/SAR2130P/QCS615 variants
- sdhci-of-arasan: Add support for eMMC HW-reset
- sdhci-pci-gli: Add UHS-II support for the GL9767/GL9755 variants
MEMSTICK:
- A couple of minor updates"
* tag 'mmc-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (78 commits)
mmc: pwrseq_simple: Handle !RESET_CONTROLLER properly
mmc: mtk-sd: Fix MMC_CAP2_CRYPTO flag setting
mmc: mtk-sd: Fix error handle of probe function
mmc: core: Correction a warning caused by incorrect type in assignment for UHS-II
mmc: sdhci-esdhc-imx: Update esdhc sysctl dtocv bitmask
mmc: sdhci-esdhc-imx: Implement emmc hardware reset
mmc: core: Correct type in variable assignment for UHS-II
mmc: sdhci-uhs2: correction a warning caused by incorrect type in argument
mmc: sdhci-uhs2: Remove unnecessary variables
mmc: sdhci-uhs2: Correct incorrect type in argument
mmc: sdhci: Make MMC_SDHCI_UHS2 config symbol invisible
mmc: sdhci-uhs2: Remove unnecessary NULL check
mmc: core: Fix error paths for UHS-II card init and re-init
mmc: core: Add error handling of sd_uhs2_power_up()
mmc: core: Simplify sd_uhs2_power_up()
mmc: bcm2835: Introduce proper clock handling
mmc: bcm2835: Fix type of current clock speed
dt-bindings: mmc: Add sdhci compatible for QCS615
mmc: core: Use GFP_NOIO in ACMD22
dt-bindings: mmc: sdhci-msm: Add SAR2130P compatible
...
Linus Torvalds [Wed, 20 Nov 2024 20:44:59 +0000 (12:44 -0800)]
Merge tag 'pmdomain-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain updates from Ulf Hansson:
"pmdomain core:
- Set the required dev for a required OPP during genpd attach
- Add support for required OPPs to dev_pm_domain_attach_list()
pmdomain providers:
- ti: Enable GENPD_FLAG_ACTIVE_WAKEUP flag for ti_sci PM domains
- mediatek: Add support for MT6735 PM domains
- mediatek: Use OF-specific regulator API to get power domain supply
- qcom: Add support for the SM8750/SAR2130P/qcs615/qcs8300 rpmhpds
pmdomain consumers:
- Convert a couple of consumer drivers to
*_pm_domain_attach|detach_list()
opp core:
- Rework and cleanup some code that manages required OPPs
- Remove *_opp_attach|detach_genpd()"
* tag 'pmdomain-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: (25 commits)
pmdomain: qcom: rpmhpd: Add rpmhpd support for SM8750
dt-bindings: power: qcom,rpmpd: document the SM8750 RPMh Power Domains
pmdomain: imx: Use of_property_present() for non-boolean properties
pmdomain: imx: gpcv2: replace dev_err() with dev_err_probe()
pmdomain: ti-sci: Use scope based of_node_put() to simplify code.
pmdomain: ti-sci: Add missing of_node_put() for args.np
pmdomain: ti-sci: set the GENPD_FLAG_ACTIVE_WAKEUP flag for all PM domains
pmdomain: mediatek: Add support for MT6735
pmdomain: qcom: rpmhpd: add support for SAR2130P
dt-bindings: power: Add binding for MediaTek MT6735 power controller
dt-bindings: power: rpmpd: Add SAR2130P compatible
OPP: Drop redundant *_opp_attach|detach_genpd()
cpufreq: qcom-nvmem: Convert to dev_pm_domain_attach|detach_list()
media: venus: Convert into devm_pm_domain_attach_list() for OPP PM domain
drm/tegra: gr3d: Convert into devm_pm_domain_attach_list()
OPP: Drop redundant code in _link_required_opps()
pmdomain: core: Set the required dev for a required OPP during genpd attach
pmdomain: core: Manage the default required OPP from a separate function
PM: domains: Support required OPPs in dev_pm_domain_attach_list()
OPP: Rework _set_required_devs() to manage a single device per call
...
Linus Torvalds [Wed, 20 Nov 2024 20:41:03 +0000 (12:41 -0800)]
Merge tag 'pwrseq-updates-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull power sequencing updates from Bartosz Golaszewski:
- extend support for the wcn6855 model in the pwrseq-qcom-wcn driver
- make this driver depend on CONFIG_OF=y as it uses some very
OF-specific interfaces and depends on phandle parsing
* tag 'pwrseq-updates-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
power: sequencing: qcom-wcn: improve support for wcn6855
power: sequencing: make the QCom PMU pwrseq driver depend on CONFIG_OF
Linus Torvalds [Wed, 20 Nov 2024 20:37:06 +0000 (12:37 -0800)]
Merge tag 'gpio-updates-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
"Three new drivers, support for some new models in existing ones and
lots of various tweaks and improvements across the board (switching to
using recommended APIs, code shrink and simplification, etc.).
Also a new feature in the character device uAPI where we now notify
the user-space about changes triggered by in-kernel users as well, not
only when they were done by other user-space agents.
Summary:
GPIOLIB core:
- use the new mem_is_zero() instead of memchr_inv(s, 0, n)
- don't store debounce period twice needlessly
- clean-up debugfs handling
- remove leftover comments referring to no longer used spinlocks
- unduplicate some operations like SRCU locks and initializing GPIO
descriptors
- constify the sysfs class struct
- use lock guards in GPIO sysfs code
- update GPIO uAPI internal flags all at once atomically for
consistency with other places
- modify the behavior of the sysfs interface by no longer exporting
lines that are named inside the driver code or board files with the
sysfs links bearing the line names as this has for many years been
largely unused due to the prevalence of DT, ACPI and firmware nodes
over board files and made the API inconsistent
- for GPIO interrupt providers: free irqs that are still requested by
users when removing the chip
GPIO uAPI:
- notify user-space about changes to GPIO lines' state (requested,
released, reconfigured) triggered from the kernel as well (until
now we'd only do this for changes triggered from user-space)
- to that end: modify the internal workings of the notification
mechanism by switching to an atomic notifier which allows us to
send events from atomic context
- also to that end store the debounce period in the GPIO descriptor
struct and not in the character device context struct
- while at it, also cover the corner-case of users introducing
changes over sysfs while others watch them via the character device
- don't report GPIO lines requested as interrupts as "used" to
user-space as it can still request them as GPIOs
New drivers:
- GPIO part of the MFD Congatec Board Controller
- PolarFire GPIO controller
- GPIOs on FTDI FT2232H
Driver improvements:
- use generic device property accessors instead of OF-specific ones
across many GPIO drivers (mpc8xxx, vf610, eic-sprd, davinci,
ts4900, xilinx, mvebu)
- use devres helpers to simplify error paths and either shrink or
entirely remove the driver's remove() callback (grgpio, amdpt,
menz127, max730x, ftgpio010, 74x164, ljca)
- use helper variables to store the address of pdev->dev and avoid
some line-breaks
- use device_for_each_child_node_scoped() to avoid having to put the
fwnode on breaks or errors (gpio-sim, gpio-dwapb, gpiolib-acpi)
- use a scoped bitmap to simplify the code and drop goto labels in
gpio-aggregator
- drop unneeded Kconfig dependencies on OF_GPIO (grgpio, mveby,
xilinx)
- add support for new models to gpio-aspeed, gpio-rockchip and
gpio-dwapb
- clean-up ACPI handling and some other bits in gpio-xgene-sb
- replace deprecated PCI functions in pcie-idio-24 and pci-idio-16
- allow to build davinci and mvebu drivers with COMPILE_TEST=y
- remove dead code in gpio-mb86s7x
- switch back to using platform_driver::remove() (after the
conversion to remove_new()) across the GPIO drivers
- remove remaining uses of GPIOF_ACTIVE_LOW across the tree and drop
this deprecated symbol
- convert the gpio-altera driver to no longer pull in the deprecated
legacy-of-mm-gpiochip.h header
- use of_property_present() instead of of_property_read_bool() in
gpiolib-of and gpio-rockchip
- allow to build the tegra186 driver on Tegra234 platforms in Kconfig
Late fixes:
- add a missing return value check after devm_kasprintf() to
gpio-grgpio
DT bindings:
- document the ngpios property of gpio-mmio
- add support for a new aspeed model
- fix the example for st,nomadik-gpio
Other:
- kernel doc and comments tweaks
- fix typos in TODO
- reorder headers alphabetically in some drivers
- fix incorrect format specifiers in gpio tools"
* tag 'gpio-updates-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (98 commits)
gpio: tegra186: Allow to enable driver on Tegra234
gpio: grgpio: Add NULL check in grgpio_probe
tools: gpio: Fix several incorrect format specifiers
gpio: mpfs: add CoreGPIO support
gpio: rockchip: support new version GPIO
gpio: rockchip: change the GPIO version judgment logic
gpio: rockchip: explan the format of the GPIO version ID
gpiolib: cdev: use !mem_is_zero() instead of memchr_inv(s, 0, n)
MAINTAINERS: add gpio driver to PolarFire entry
gpio: Get rid of GPIOF_ACTIVE_LOW
USB: gadget: pxa27x_udc: Avoid using GPIOF_ACTIVE_LOW
pcmcia: soc_common: Avoid using GPIOF_ACTIVE_LOW
leds: gpio: Avoid using GPIOF_ACTIVE_LOW
Input: gpio_keys_polled - avoid using GPIOF_ACTIVE_LOW
Input: gpio_keys - avoid using GPIOF_ACTIVE_LOW
gpio: Use of_property_present() for non-boolean properties
gpio: mpfs: add polarfire soc gpio support
gpio: altera: Drop legacy-of-mm-gpiochip.h header
gpio: pcie-idio-24: Replace deprecated PCI functions
gpio: pci-idio-16: Replace deprecated PCI functions
...
Linus Torvalds [Wed, 20 Nov 2024 20:34:22 +0000 (12:34 -0800)]
Merge tag 'pwm/for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm updates from Uwe Kleine-König:
"This contains a new abstraction for PWM waveforms that is more
expressive that the legacy one. Compared to the old abstraction it
contains a duty_offset member instead of polarity. This new
abstraction is already used in an ADC driver merged into the iio tree.
The new API requires changes to the lowlevel drivers. For now there
are two drivers that are converted to the new API (axi-pwmgen and
stm32). Converted drivers continue to work with the old API. Drivers
not yet converted only work with the older API.
Otherwise it's the usual collection of fixes, cleanups and dt doc
updates.
This time around thanks go to Andy Shevchenko, Clark Wang, Conor
Dooley, David Lechner, Dimitri Fedrau, Frank Li, Jun Li, Kelvin Zhang,
Krzysztof Kozlowski, Nuno Sa, Shen Lichuan and Trevor Gamblin for code
contributions, testing and review"
* tag 'pwm/for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: Assume a disabled PWM to emit a constant inactive output
pwm: core: export pwm_get_state_hw()
pwm: core: use device_match_name() instead of strcmp(dev_name(...
dt-bindings: pwm: adi,axi-pwmgen: Increase #pwm-cells to 3
pwm: imx27: Use clk_bulk_*() API to simplify clock handling
pwm: imx27: Workaround of the pwm output bug when decrease the duty cycle
pwm: axi-pwmgen: Enable FORCE_ALIGN by default
pwm: axi-pwmgen: Rename 0x10 register
dt-bindings: pwm: amlogic: Document C3 PWM
pwm: axi-pwmgen: Create a dedicated function for getting driver data from a chip
pwm: atmel-tcb: Use min() macro
pwm: stm32: Fix error checking for a regmap_read() call
pwm: Add kernel doc for members added to pwm_ops recently
pwm: Reorder symbols in core.c
pwm: stm32: Implementation of the waveform callbacks
pwm: axi-pwmgen: Implementation of the waveform callbacks
pwm: Add tracing for waveform callbacks
pwm: Provide new consumer API functions for waveforms
pwm: New abstraction for PWM waveforms
pwm: Add more locking
Linus Torvalds [Wed, 20 Nov 2024 20:23:06 +0000 (12:23 -0800)]
Merge tag 'spi-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"The only real core work we've got this time around is the completion
of the transition to the new host/target naming for the core APIs,
Kconfig still needs doing but that's a lot less invasive.
Otherwise the big changes are the new drivers that have been added:
- Completion of the conversion to spi_alloc_host()/_target() and
removal of the old naming.
- Cleanups for Rockchip drivers, these brought in a new logging
helper in the driver core for warnings during probe.
- Support for configuration of the word delay via spidev_test.
- Support for AMD HID2 controllers, Apple SPI controller and Realtek
SPI-NAND controllers"
* tag 'spi-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (58 commits)
spi: imx: support word delay
spi: imx: pass struct spi_transfer to prepare_transfer()
spi: cs42l43: Add GPIO speaker id support to the bridge configuration
spi: Delete useless checks
spi: apple: Remove unnecessary .owner for apple_spi_driver
spi: spidev_test: add support for word delay
spi: apple: Add driver for Apple SPI controller
spi: dt-bindings: apple,spi: Add binding for Apple SPI controllers
spi: Use of_property_present() for non-boolean properties
spi: zynqmp-gqspi: Undo runtime PM changes at driver exit time
spi: spi-mem: rtl-snand: Correctly handle DMA transfers
spi: tegra210-quad: Avoid shift-out-of-bounds
spi: axi-spi-engine: Emit trace events for spi transfers
dt-bindings: spi: sprd,sc9860-spi: convert to YAML
spi: Replace deprecated PCI functions
spi: dt-bindings: samsung: Add a compatible for samsung,exynos8895-spi
spi: spi-mem: Add Realtek SPI-NAND controller
dt-bindings: spi: Add realtek,rtl9301-snand
spi: make class structs const
spi: dt-bindings: brcm,bcm2835-aux-spi: Convert to dtschema
...
Linus Torvalds [Wed, 20 Nov 2024 20:18:50 +0000 (12:18 -0800)]
Merge tag 'regulator-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"This was a quite quiet release for regulators on the drivers front,
but we do have two small but useful core improvements:
- Improve handling of cases where DT platforms specify both DT and
init_data based configuration for a single regulator.
- A helper of_regulator_get_optional() to simplify writing generic
bindings for regulator consumers in subsystems.
There's also some YAML conversions for the DT bindings which are a big
part of the diffstat"
* tag 'regulator-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: dt-bindings: qcom,rpmh: Correct PM8550VE supplies
regulator: Switch back to struct platform_driver::remove()
regulator: doc: remove documentation comment for regulator_init
regulator: doc: add missing documentation for init_cb
regulator: rk808: Restrict DVS GPIOs to the RK808 variant only
regulator: rk808: Use dev_err_probe() in the probe path
regulator: rk808: Perform trivial code cleanups
regulator: dt-bindings: qcom,qca6390-pmu: add more properties for wcn6855
regulator: dt-bindings: lltc,ltc3676: convert to YAML
regulator: core: Use fsleep() to get best sleep mechanism
regulator: core: remove machine init callback from config
regulator: core: add callback to perform runtime init
regulator: core: do not silently ignore provided init_data
regulator: max5970: Drop unused structs
regulator: dt-bindings: vctrl-regulator: convert to YAML
regulator: qcom-smd: make smd_vreg_rpm static
regulator: Call of_node_put() only once in rzg2l_usb_vbus_regulator_probe()
regulator: isl6271a: Drop explicit initialization of struct i2c_device_id::driver_data to 0
regulator: Add devres version of of_regulator_get_optional()
regulator: Add of_regulator_get_optional() for pure DT regulator lookup
Linus Torvalds [Wed, 20 Nov 2024 20:09:47 +0000 (12:09 -0800)]
Merge tag 'regmap-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The main thing for regmap this time around is some improvements of the
lockdep annotations which stop some false positives. We also have one
new helper for setting a bitmask to the same value, and several test
improvements"
* tag 'regmap-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: provide regmap_assign_bits()
regmap: irq: Set lockdep class for hierarchical IRQ domains
regmap: maple: Provide lockdep (sub)class for maple tree's internal lock
regmap: kunit: Fix repeated test param
regcache: Improve documentation of available cache types
regmap: Specifically test writing 0 as a value to sparse caches
regmap-irq: Consistently use memset32() in regmap_irq_thread()
Linus Torvalds [Wed, 20 Nov 2024 19:54:39 +0000 (11:54 -0800)]
Merge tag 'linux_kselftest-next-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest update from Shuah Khan:
"timer test:
- remove duplicate defines
- fixes to improve error reporting
rtc test:
- check rtc alarm status in alarm test
resctrl test:
- add array overrun checks during iMC config parsing code and when
reading strings
- fixes and reorganizing code"
* tag 'linux_kselftest-next-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (23 commits)
selftests/resctrl: Replace magic constants used as array size
selftests/resctrl: Keep results from first test run
selftests/resctrl: Do not compare performance counters and resctrl at low bandwidth
selftests/resctrl: Use cache size to determine "fill_buf" buffer size
selftests/resctrl: Ensure measurements skip initialization of default benchmark
selftests/resctrl: Make benchmark parameter passing robust
selftests/resctrl: Remove unused measurement code
selftests/resctrl: Only support measured read operation
selftests/resctrl: Remove "once" parameter required to be false
selftests/resctrl: Make wraparound handling obvious
selftests/resctrl: Protect against array overflow when reading strings
selftests/resctrl: Protect against array overrun during iMC config parsing
selftests/resctrl: Fix memory overflow due to unhandled wraparound
selftests/resctrl: Print accurate buffer size as part of MBM results
selftests/resctrl: Make functions only used in same file static
selftests: Add a test mangling with uc_sigmask
selftests: Rename sigaltstack to generic signal
selftest: rtc: Add to check rtc alarm status for alarm related test
selftests:timers: remove local CLOCKID defines
selftests: timers: Remove unneeded semicolon
...
Linus Torvalds [Wed, 20 Nov 2024 19:47:43 +0000 (11:47 -0800)]
Merge tag 'kgdb-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
"A relatively modest collection of changes:
- Adopt kstrtoint() and kstrtol() instead of the simple_strtoXX
family for better error checking of user input.
- Align the print behavour when breakpoints are enabled and disabled
by adopting the current behaviour of breakpoint disable for both.
- Remove some of the (rather odd and user hostile) hex fallbacks and
require kdb users to prefix with 0x instead.
- Tidy up (and fix) control code handling in kdb's keyboard code.
This makes the control code handling at the keyboard behave the
same way as it does via the UART.
- Switch my own entry in MAINTAINERS to my @kernel.org address"
* tag 'kgdb-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kdb: fix ctrl+e/a/f/b/d/p/n broken in keyboard mode
MAINTAINERS: Use Daniel Thompson's korg address for kgdb work
kdb: Fix breakpoint enable to be silent if already enabled
kdb: Remove fallback interpretation of arbitrary numbers as hex
trace: kdb: Replace simple_strtoul with kstrtoul in kdb_ftdump
kdb: Replace the use of simple_strto with safer kstrto in kdb_main
Linus Torvalds [Wed, 20 Nov 2024 19:34:10 +0000 (11:34 -0800)]
Merge tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace updates from Steven Rostedt:
- Restructure the function graph shadow stack to prepare it for use
with kretprobes
With the goal of merging the shadow stack logic of function graph and
kretprobes, some more restructuring of the function shadow stack is
required.
Move out function graph specific fields from the fgraph
infrastructure and store it on the new stack variables that can pass
data from the entry callback to the exit callback.
Hopefully, with this change, the merge of kretprobes to use fgraph
shadow stacks will be ready by the next merge window.
- Make shadow stack 4k instead of using PAGE_SIZE.
Some architectures have very large PAGE_SIZE values which make its
use for shadow stacks waste a lot of memory.
- Give shadow stacks its own kmem cache.
When function graph is started, every task on the system gets a
shadow stack. In the future, shadow stacks may not be 4K in size.
Have it have its own kmem cache so that whatever size it becomes will
still be efficient in allocations.
- Initialize profiler graph ops as it will be needed for new updates to
fgraph
- Convert to use guard(mutex) for several ftrace and fgraph functions
- Add more comments and documentation
- Show function return address in function graph tracer
Add an option to show the caller of a function at each entry of the
function graph tracer, similar to what the function tracer does.
- Abstract out ftrace_regs from being used directly like pt_regs
ftrace_regs was created to store a partial pt_regs. It holds only the
registers and stack information to get to the function arguments and
return values. On several archs, it is simply a wrapper around
pt_regs. But some users would access ftrace_regs directly to get the
pt_regs which will not work on all archs. Make ftrace_regs an
abstract structure that requires all access to its fields be through
accessor functions.
- Show how long it takes to do function code modifications
When code modification for function hooks happen, it always had the
time recorded in how long it took to do the conversion. But this
value was never exported. Recently the code was touched due to new
ROX modification handling that caused a large slow down in doing the
modifications and had a significant impact on boot times.
Expose the timings in the dyn_ftrace_total_info file. This file was
created a while ago to show information about memory usage and such
to implement dynamic function tracing. It's also an appropriate file
to store the timings of this modification as well. This will make it
easier to see the impact of changes to code modification on boot up
timings.
- Other clean ups and small fixes
* tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits)
ftrace: Show timings of how long nop patching took
ftrace: Use guard to take ftrace_lock in ftrace_graph_set_hash()
ftrace: Use guard to take the ftrace_lock in release_probe()
ftrace: Use guard to lock ftrace_lock in cache_mod()
ftrace: Use guard for match_records()
fgraph: Use guard(mutex)(&ftrace_lock) for unregister_ftrace_graph()
fgraph: Give ret_stack its own kmem cache
fgraph: Separate size of ret_stack from PAGE_SIZE
ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value
selftests/ftrace: Fix check of return value in fgraph-retval.tc test
ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros
ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs
ftrace: Make ftrace_regs abstract from direct use
fgragh: No need to invoke the function call_filter_check_discard()
fgraph: Simplify return address printing in function graph tracer
function_graph: Remove unnecessary initialization in ftrace_graph_ret_addr()
function_graph: Support recording and printing the function return address
ftrace: Have calltime be saved in the fgraph storage
ftrace: Use a running sleeptime instead of saving on shadow stack
fgraph: Use fgraph data to store subtime for profiler
...
Linus Torvalds [Wed, 20 Nov 2024 18:08:00 +0000 (10:08 -0800)]
Merge tag 'sched_ext-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext updates from Tejun Heo:
- Improve the default select_cpu() implementation making it topology
aware and handle WAKE_SYNC better.
- set_arg_maybe_null() was used to inform the verifier which ops args
could be NULL in a rather hackish way. Use the new __nullable CFI
stub tags instead.
- On Sapphire Rapids multi-socket systems, a BPF scheduler, by
hammering on the same queue across sockets, could live-lock the
system to the point where the system couldn't make reasonable forward
progress.
This could lead to soft-lockup triggered resets or stalling out
bypass mode switch and thus BPF scheduler ejection for tens of
minutes if not hours. After trying a number of mitigations, the
following set worked reliably:
- Injecting artificial cpu_relax() loops in two places while
sched_ext is trying to turn on the bypass mode.
- Triggering scheduler ejection when soft-lockup detection is
imminent (a quarter of threshold left).
While not the prettiest, the impact both in terms of code complexity
and overhead is minimal.
- A common complaint on the API is the overuse of the word "dispatch"
and the confusion around "consume". This is due to how the dispatch
queues became more generic over time. Rename the affected kfuncs for
clarity. Thanks to BPF's compatibility features, this change can be
made in a way that's both forward and backward compatible. The
compatibility code will be dropped in a few releases.
- Other misc changes
* tag 'sched_ext-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (21 commits)
sched_ext: Replace scx_next_task_picked() with switch_class() in comment
sched_ext: Rename scx_bpf_dispatch[_vtime]_from_dsq*() -> scx_bpf_dsq_move[_vtime]*()
sched_ext: Rename scx_bpf_consume() to scx_bpf_dsq_move_to_local()
sched_ext: Rename scx_bpf_dispatch[_vtime]() to scx_bpf_dsq_insert[_vtime]()
sched_ext: scx_bpf_dispatch_from_dsq_set_*() are allowed from unlocked context
sched_ext: add a missing rcu_read_lock/unlock pair at scx_select_cpu_dfl()
sched_ext: Clarify sched_ext_ops table for userland scheduler
sched_ext: Enable the ops breather and eject BPF scheduler on softlockup
sched_ext: Avoid live-locking bypass mode switching
sched_ext: Fix incorrect use of bitwise AND
sched_ext: Do not enable LLC/NUMA optimizations when domains overlap
sched_ext: Introduce NUMA awareness to the default idle selection policy
sched_ext: Replace set_arg_maybe_null() with __nullable CFI stub tags
sched_ext: Rename CFI stubs to names that are recognized by BPF
sched_ext: Introduce LLC awareness to the default idle selection policy
sched_ext: Clarify ops.select_cpu() for single-CPU tasks
sched_ext: improve WAKE_SYNC behavior for default idle CPU selection
sched_ext: Use btf_ids to resolve task_struct
sched/ext: Use tg_cgroup() to elieminate duplicate code
sched/ext: Fix unmatch trailing comment of CONFIG_EXT_GROUP_SCHED
...
Linus Torvalds [Wed, 20 Nov 2024 17:54:49 +0000 (09:54 -0800)]
Merge tag 'cgroup-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- cpu.stat now also shows niced CPU time
- Freezer and cpuset optimizations
- Other misc changes
* tag 'cgroup-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Disable cpuset_cpumask_can_shrink() test if not load balancing
cgroup/cpuset: Further optimize code if CONFIG_CPUSETS_V1 not set
cgroup/cpuset: Enforce at most one rebuild_sched_domains_locked() call per operation
cgroup/cpuset: Revert "Allow suppression of sched domain rebuild in update_cpumasks_hier()"
MAINTAINERS: remove Zefan Li
cgroup/freezer: Add cgroup CGRP_FROZEN flag update helper
cgroup/freezer: Reduce redundant traversal for cgroup_freeze
cgroup/bpf: only cgroup v2 can be attached by bpf programs
Revert "cgroup: Fix memory leak caused by missing cgroup_bpf_offline"
selftests/cgroup: Fix compile error in test_cpu.c
cgroup/rstat: Selftests for niced CPU statistics
cgroup/rstat: Tracking cgroup-level niced CPU time
cgroup/cpuset: Fix spelling errors in file kernel/cgroup/cpuset.c
Linus Torvalds [Wed, 20 Nov 2024 17:41:11 +0000 (09:41 -0800)]
Merge tag 'wq-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
- The maximum concurrency limit of 512 which was set a long time ago is
too low now.
A legitimate use (BPF cgroup release) of system_wq could saturate it
under stress test conditions leading to false dependencies and
deadlocks.
While the offending use was switched to a dedicated workqueue, use
the opportunity to bump WQ_MAX_ACTIVE four fold and document that
system workqueue shouldn't be saturated. Workqueue should add at
least a warning mechanism for cases where system workqueues are
saturated.
- Recent workqueue updates to support more flexible execution topology
made unbound workqueues use per-cpu worker pool frontends which
pushed up workqueue flush overhead.
As consecutive CPUs are likely to be pointing to the same worker
pool, reduce overhead by switching locks only when necessary.
* tag 'wq-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Reduce expensive locks for unbound workqueue
workqueue: Adjust WQ_MAX_ACTIVE from 512 to 2048
workqueue: doc: Add a note saturating the system_wq is not permitted
Linus Torvalds [Wed, 20 Nov 2024 17:36:05 +0000 (09:36 -0800)]
Merge tag 'probes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
"Kprobes cleanups. Functionality does not change.
- kprobes: Cleanup the config comment
Adjust #endif comments.
- kprobes: Cleanup collect_one_slot() and __disable_kprobe()
Make fail fast to reduce code nested level.
- kprobes: Use struct_size() in __get_insn_slot()
Use struct_size() to avoid special macro.
- x86/kprobes: Cleanup kprobes on ftrace code
Use macro instead of direct field access/magic number, and avoid
redundant instruction pointer setting"
* tag 'probes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
x86/kprobes: Cleanup kprobes on ftrace code
kprobes: Use struct_size() in __get_insn_slot()
kprobes: Cleanup collect_one_slot() and __disable_kprobe()
kprobes: Cleanup the config comment
Linus Torvalds [Wed, 20 Nov 2024 17:33:23 +0000 (09:33 -0800)]
Merge tag 'livepatching-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching update from Petr Mladek:
- A new selftest for livepatching of a kprobed function
* tag 'livepatching-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
selftests: livepatch: test livepatching a kprobed function
selftests: livepatch: save and restore kprobe state
selftests: livepatch: rename KLP_SYSFS_DIR to SYSFS_KLP_DIR
Linus Torvalds [Wed, 20 Nov 2024 17:21:11 +0000 (09:21 -0800)]
Merge tag 'printk-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Print more precise information about the printk log buffer memory
usage.
- Make sure that the sysrq title is shown on the console even when
deferred.
- Do not enable earlycon by `console=` which is meant to disable the
default console.
* tag 'printk-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: add dummy printk_force_console_enter/exit helpers
tty: sysrq: Use printk_force_console context on __handle_sysrq
printk: Introduce FORCE_CON flag
printk: Improve memory usage logging during boot
init: Don't proxy `console=` to earlycon
Linus Torvalds [Wed, 20 Nov 2024 17:16:45 +0000 (09:16 -0800)]
Merge tag 'docs-6.13' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"Another moderately busy cycle in docsland:
- Work on Chinese translations has picked up again. Happily, they are
maintaining the existing translations and not just adding new ones.
- Some maintenance of the Japanese and Italian translations as well.
- The removal of the venerable "dontdiff" file. It has long outlived
its usefulness and contained entries ("parse.*") that would
actively mask actual source change.
- The addition of enforcement information to the code-of-conduct
documentation.
Along with some build-system fixes and a lot of typo and language
fixes"
* tag 'docs-6.13' of git://git.lwn.net/linux: (52 commits)
Documentation/CoC: spell out enforcement for unacceptable behaviors
docs: fix typos and whitespace in Documentation/process/backporting.rst
docs/zh_CN: fix one sentence in llvm.rst
docs: bug-bisect: add a note about bisecting -next
docs/zh_CN: add the translation of kbuild/llvm.rst
Documentation: Fix incorrect paths/magic in magic numbers rst
Documentation/maintainer-tip: Fix typos
Documentation: Improve crash_kexec_post_notifiers description
Docs/zh_CN: Translate physical_memory.rst to Simplified Chinese
Documentation: admin: reorganize kernel-parameters intro
docs/zh_CN: update the translation of process/programming-language.rst
docs/zh_CN: update the translation of mm/page_owner.rst
docs/zh_CN: update the translation of mm/page_table_check.rst
docs/zh_CN: update the translation of mm/overcommit-accounting.rst
docs/zh_CN: update the translation of mm/admon/faq.rst
docs/zh_CN: update the translation of mm/active_mm.rst
docs/zh_CN: update the translation of mm/hmm.rst
docs: remove Documentation/dontdiff
docs/zh_CN: Add a entry in Chinese glossary
Docs/zh_CN: Fix the pfn calculation error in page_tables.rst
...
Linus Torvalds [Wed, 20 Nov 2024 00:35:06 +0000 (16:35 -0800)]
Merge tag 'timers-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"A rather large update for timekeeping and timers:
- The final step to get rid of auto-rearming posix-timers
posix-timers are currently auto-rearmed by the kernel when the
signal of the timer is ignored so that the timer signal can be
delivered once the corresponding signal is unignored.
This requires to throttle the timer to prevent a DoS by small
intervals and keeps the system pointlessly out of low power states
for no value. This is a long standing non-trivial problem due to
the lock order of posix-timer lock and the sighand lock along with
life time issues as the timer and the sigqueue have different life
time rules.
Cure this by:
- Embedding the sigqueue into the timer struct to have the same
life time rules. Aside of that this also avoids the lookup of
the timer in the signal delivery and rearm path as it's just a
always valid container_of() now.
- Queuing ignored timer signals onto a seperate ignored list.
- Moving queued timer signals onto the ignored list when the
signal is switched to SIG_IGN before it could be delivered.
- Walking the ignored list when SIG_IGN is lifted and requeue the
signals to the actual signal lists. This allows the signal
delivery code to rearm the timer.
This also required to consolidate the signal delivery rules so they
are consistent across all situations. With that all self test
scenarios finally succeed.
- Core infrastructure for VFS multigrain timestamping
This is required to allow the kernel to use coarse grained time
stamps by default and switch to fine grained time stamps when inode
attributes are actively observed via getattr().
These changes have been provided to the VFS tree as well, so that
the VFS specific infrastructure could be built on top.
- Cleanup and consolidation of the sleep() infrastructure
- Move all sleep and timeout functions into one file
- Rework udelay() and ndelay() into proper documented inline
functions and replace the hardcoded magic numbers by proper
defines.
- Rework the fsleep() implementation to take the reality of the
timer wheel granularity on different HZ values into account.
Right now the boundaries are hard coded time ranges which fail
to provide the requested accuracy on different HZ settings.
- Update documentation for all sleep/timeout related functions
and fix up stale documentation links all over the place
- Fixup a few usage sites
- Rework of timekeeping and adjtimex(2) to prepare for multiple PTP
clocks
A system can have multiple PTP clocks which are participating in
seperate and independent PTP clock domains. So far the kernel only
considers the PTP clock which is based on CLOCK TAI relevant as
that's the clock which drives the timekeeping adjustments via the
various user space daemons through adjtimex(2).
The non TAI based clock domains are accessible via the file
descriptor based posix clocks, but their usability is very limited.
They can't be accessed fast as they always go all the way out to
the hardware and they cannot be utilized in the kernel itself.
As Time Sensitive Networking (TSN) gains traction it is required to
provide fast user and kernel space access to these clocks.
The approach taken is to utilize the timekeeping and adjtimex(2)
infrastructure to provide this access in a similar way how the
kernel provides access to clock MONOTONIC, REALTIME etc.
Instead of creating a duplicated infrastructure this rework
converts timekeeping and adjtimex(2) into generic functionality
which operates on pointers to data structures instead of using
static variables.
This allows to provide time accessors and adjtimex(2) functionality
for the independent PTP clocks in a subsequent step.
- Consolidate hrtimer initialization
hrtimers are set up by initializing the data structure and then
seperately setting the callback function for historical reasons.
That's an extra unnecessary step and makes Rust support less
straight forward than it should be.
Provide a new set of hrtimer_setup*() functions and convert the
core code and a few usage sites of the less frequently used
interfaces over.
The bulk of the htimer_init() to hrtimer_setup() conversion is
already prepared and scheduled for the next merge window.
- Drivers:
- Ensure that the global timekeeping clocksource is utilizing the
cluster 0 timer on MIPS multi-cluster systems.
Otherwise CPUs on different clusters use their cluster specific
clocksource which is not guaranteed to be synchronized with
other clusters.
- Mostly boring cleanups, fixes, improvements and code movement"
* tag 'timers-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits)
posix-timers: Fix spurious warning on double enqueue versus do_exit()
clocksource/drivers/arm_arch_timer: Use of_property_present() for non-boolean properties
clocksource/drivers/gpx: Remove redundant casts
clocksource/drivers/timer-ti-dm: Fix child node refcount handling
dt-bindings: timer: actions,owl-timer: convert to YAML
clocksource/drivers/ralink: Add Ralink System Tick Counter driver
clocksource/drivers/mips-gic-timer: Always use cluster 0 counter as clocksource
clocksource/drivers/timer-ti-dm: Don't fail probe if int not found
clocksource/drivers:sp804: Make user selectable
clocksource/drivers/dw_apb: Remove unused dw_apb_clockevent functions
hrtimers: Delete hrtimer_init_on_stack()
alarmtimer: Switch to use hrtimer_setup() and hrtimer_setup_on_stack()
io_uring: Switch to use hrtimer_setup_on_stack()
sched/idle: Switch to use hrtimer_setup_on_stack()
hrtimers: Delete hrtimer_init_sleeper_on_stack()
wait: Switch to use hrtimer_setup_sleeper_on_stack()
timers: Switch to use hrtimer_setup_sleeper_on_stack()
net: pktgen: Switch to use hrtimer_setup_sleeper_on_stack()
futex: Switch to use hrtimer_setup_sleeper_on_stack()
fs/aio: Switch to use hrtimer_setup_sleeper_on_stack()
...
Linus Torvalds [Wed, 20 Nov 2024 00:09:13 +0000 (16:09 -0800)]
Merge tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vdso data page handling updates from Thomas Gleixner:
"First steps of consolidating the VDSO data page handling.
The VDSO data page handling is architecture specific for historical
reasons, but there is no real technical reason to do so.
Aside of that VDSO data has become a dump ground for various
mechanisms and fail to provide a clear separation of the
functionalities.
Clean this up by:
- consolidating the VDSO page data by getting rid of architecture
specific warts especially in x86 and PowerPC.
- removing the last includes of header files which are pulling in
other headers outside of the VDSO namespace.
- seperating timekeeping and other VDSO data accordingly.
Further consolidation of the VDSO page handling is done in subsequent
changes scheduled for the next merge window.
This also lays the ground for expanding the VDSO time getters for
independent PTP clocks in a generic way without making every
architecture add support seperately"
* tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
x86/vdso: Add missing brackets in switch case
vdso: Rename struct arch_vdso_data to arch_vdso_time_data
powerpc: Split systemcfg struct definitions out from vdso
powerpc: Split systemcfg data out of vdso data page
powerpc: Add kconfig option for the systemcfg page
powerpc/pseries/lparcfg: Use num_possible_cpus() for potential processors
powerpc/pseries/lparcfg: Fix printing of system_active_processors
powerpc/procfs: Propagate error of remap_pfn_range()
powerpc/vdso: Remove offset comment from 32bit vdso_arch_data
x86/vdso: Split virtual clock pages into dedicated mapping
x86/vdso: Delete vvar.h
x86/vdso: Access vdso data without vvar.h
x86/vdso: Move the rng offset to vsyscall.h
x86/vdso: Access rng vdso data without vvar.h
x86/vdso: Access timens vdso data without vvar.h
x86/vdso: Allocate vvar page from C code
x86/vdso: Access rng data from kernel without vvar
x86/vdso: Place vdso_data at beginning of vvar page
x86/vdso: Use __arch_get_vdso_data() to access vdso data
x86/mm/mmap: Remove arch_vma_name()
...
Linus Torvalds [Tue, 19 Nov 2024 23:54:19 +0000 (15:54 -0800)]
Merge tag 'irq-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt subsystem updates from Thomas Gleixner:
"Tree wide:
- Make nr_irqs static to the core code and provide accessor functions
to remove existing and prevent future aliasing problems with local
variables or function arguments of the same name.
Core code:
- Prevent freeing an interrupt in the devres code which is not
managed by devres in the first place.
- Use seq_put_decimal_ull_width() for decimal values output in
/proc/interrupts which increases performance significantly as it
avoids parsing the format strings over and over.
- Optimize raising the timer and hrtimer soft interrupts by using the
'set bit only' variants instead of the combined version which
checks whether ksoftirqd should be woken up. The latter is a
pointless exercise as both soft interrupts are raised in the
context of the timer interrupt and therefore never wake up
ksoftirqd.
- Delegate timer/hrtimer soft interrupt processing to a dedicated
thread on RT.
Timer and hrtimer soft interrupts are always processed in ksoftirqd
on RT enabled kernels. This can lead to high latencies when other
soft interrupts are delegated to ksoftirqd as well.
The separate thread allows to run them seperately under a RT
scheduling policy to reduce the latency overhead.
Drivers:
- New drivers or extensions of existing drivers to support Renesas
RZ/V2H(P), Aspeed AST27XX, T-HEAD C900 and ATMEL sam9x7 interrupt
chips
- Support for multi-cluster GICs on MIPS.
MIPS CPUs can come with multiple CPU clusters, where each CPU
cluster has its own GIC (Generic Interrupt Controller). This
requires to access the GIC of a remote cluster through a redirect
register block.
This is encapsulated into a set of helper functions to keep the
complexity out of the actual code paths which handle the GIC
details.
- Support for encrypted guests in the ARM GICV3 ITS driver
The ITS page needs to be shared with the hypervisor and therefore
must be decrypted.
- Small cleanups and fixes all over the place"
* tag 'irq-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
irqchip/riscv-aplic: Prevent crash when MSI domain is missing
genirq/proc: Use seq_put_decimal_ull_width() for decimal values
softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.
timers: Use __raise_softirq_irqoff() to raise the softirq.
hrtimer: Use __raise_softirq_irqoff() to raise the softirq
riscv: defconfig: Enable T-HEAD C900 ACLINT SSWI drivers
irqchip: Add T-HEAD C900 ACLINT SSWI driver
dt-bindings: interrupt-controller: Add T-HEAD C900 ACLINT SSWI device
irqchip/stm32mp-exti: Use of_property_present() for non-boolean properties
irqchip/mips-gic: Fix selection of GENERIC_IRQ_EFFECTIVE_AFF_MASK
irqchip/mips-gic: Prevent indirect access to clusters without CPU cores
irqchip/mips-gic: Multi-cluster support
irqchip/mips-gic: Setup defaults in each cluster
irqchip/mips-gic: Support multi-cluster in for_each_online_cpu_gic()
irqchip/mips-gic: Replace open coded online CPU iterations
genirq/irqdesc: Use str_enabled_disabled() helper in wakeup_show()
genirq/devres: Don't free interrupt which is not managed by devres
irqchip/gic-v3-its: Fix over allocation in itt_alloc_pool()
irqchip/aspeed-intc: Add AST27XX INTC support
dt-bindings: interrupt-controller: Add support for ASPEED AST27XX INTC
...
Linus Torvalds [Tue, 19 Nov 2024 23:20:04 +0000 (15:20 -0800)]
Merge tag 'core-debugobjects-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects updates from Thomas Gleixner:
- Prevent destroying the kmem_cache on early failure.
Destroying a kmem_cache requires work queues to be set up, but in the
early failure case they are not yet initializated. So rather leak the
cache instead of triggering a BUG.
- Reduce parallel pool fill attempts.
Refilling the object pool requires to take the global pool lock,
which causes a massive performance issue when a large number of CPUs
attempt to refill concurrently. It turns out that it's sufficient to
let one CPU handle the refill from the to free list and in case there
are not enough objects on it to allocate new objects from the kmem
cache.
This also splits the free list handling from the actual allocation
path as that yields better results on RT where allocation is
restricted to preemptible code paths. The refill from free list has
no such restrictions.
- Consolidate the global and the per CPU pools to use the same data
structure, so all helper functions can be shared.
- Simplify the object allocation/free logic.
The allocation/free logic is an incomprehensible maze, which tries to
utilize the to free list and the global pool in the best way. This
all can be simplified into a straight forward comprehensible code
flow.
- Convert the allocation/free mechanism to batch mode.
Transferring objects from the global pool to the per CPU pools or
vice versa is done by walking the hlist and moving object by object.
That not only increases the pool lock held time, it also dirties up
to 17 cache lines.
This can be avoided by storing the pointer to the first object in a
batch of 16 objects in the objects themself and propagate it through
the batch when an object is enqueued into a pool or to a temporary
hlist head on allocation.
This allows to move batches of objects with at max four cache lines
dirtied and reduces the pool lock held time and therefore contention
significantly.
- Improve the object reusage
The current implementation is too agressively freeing unused objects,
which is counterproductive on bursty workloads like a kernel compile.
Address this by:
* increasing the per CPU pool size
* refilling the per CPU pool from the to be freed pool when the
per CPU pool emptied a batch
* keeping track of object usage with a exponentially wheighted
moving average which prevents the work queue callback to free
objects prematuraly.
This combined reduces the allocation/free rate for a full kernel
compile significantly:
- A few cleanups and a more cache line friendly layout of debug
information on top.
* tag 'core-debugobjects-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
debugobjects: Track object usage to avoid premature freeing of objects
debugobjects: Refill per CPU pool more agressively
debugobjects: Double the per CPU slots
debugobjects: Move pool statistics into global_pool struct
debugobjects: Implement batch processing
debugobjects: Prepare kmem_cache allocations for batching
debugobjects: Prepare for batching
debugobjects: Use static key for boot pool selection
debugobjects: Rework free_object_work()
debugobjects: Rework object freeing
debugobjects: Rework object allocation
debugobjects: Move min/max count into pool struct
debugobjects: Rename and tidy up per CPU pools
debugobjects: Use separate list head for boot pool
debugobjects: Move pools into a datastructure
debugobjects: Reduce parallel pool fill attempts
debugobjects: Make debug_objects_enabled bool
debugobjects: Provide and use free_object_list()
debugobjects: Remove pointless debug printk
debugobjects: Reuse put_objects() on OOM
...
Linus Torvalds [Tue, 19 Nov 2024 22:48:31 +0000 (14:48 -0800)]
Merge tag 'x86-mm-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
- Put cpumask_test_cpu() check in switch_mm_irqs_off() under
CONFIG_DEBUG_VM, to micro-optimize the context-switching code (Rik
van Riel)
- Add missing details in virtual memory layout (Kirill A. Shutemov)
* tag 'x86-mm-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/tlb: Put cpumask_test_cpu() check in switch_mm_irqs_off() under CONFIG_DEBUG_VM
x86/mm/doc: Add missing details in virtual memory layout
Linus Torvalds [Tue, 19 Nov 2024 22:46:39 +0000 (14:46 -0800)]
Merge tag 'x86-cleanups-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
- x86/boot: Remove unused function atou() (Dr. David Alan Gilbert)
- x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() (Thorsten
Blum)
- x86/platform: Switch back to struct platform_driver::remove() (Uwe
Kleine-König)
* tag 'x86-cleanups-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Remove unused function atou()
x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc()
x86/platform: Switch back to struct platform_driver::remove()
Linus Torvalds [Tue, 19 Nov 2024 22:34:02 +0000 (14:34 -0800)]
Merge tag 'x86-splitlock-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 splitlock updates from Ingo Molnar:
- Move Split and Bus lock code to a dedicated file (Ravi Bangoria)
- Add split/bus lock support for AMD (Ravi Bangoria)
* tag 'x86-splitlock-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bus_lock: Add support for AMD
x86/split_lock: Move Split and Bus lock code to a dedicated file
Linus Torvalds [Tue, 19 Nov 2024 22:16:06 +0000 (14:16 -0800)]
Merge tag 'sched-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core facilities:
- Add the "Lazy preemption" model (CONFIG_PREEMPT_LAZY=y), which
optimizes fair-class preemption by delaying preemption requests to
the tick boundary, while working as full preemption for
RR/FIFO/DEADLINE classes. (Peter Zijlstra)
- x86: Enable Lazy preemption (Peter Zijlstra)
- riscv: Enable Lazy preemption (Jisheng Zhang)
- Initialize idle tasks only once (Thomas Gleixner)
Linus Torvalds [Tue, 19 Nov 2024 21:34:06 +0000 (13:34 -0800)]
Merge tag 'perf-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance events updates from Ingo Molnar:
"Uprobes:
- Add BPF session support (Jiri Olsa)
- Switch to RCU Tasks Trace flavor for better performance (Andrii
Nakryiko)
- Massively increase uretprobe SMP scalability by SRCU-protecting
the uretprobe lifetime (Andrii Nakryiko)
- Kill xol_area->slot_count (Oleg Nesterov)
Core facilities:
- Implement targeted high-frequency profiling by adding the ability
for an event to "pause" or "resume" AUX area tracing (Adrian
Hunter)
VM profiling/sampling:
- Correct perf sampling with guest VMs (Colton Lewis)
New hardware support:
- x86/intel: Add PMU support for Intel ArrowLake-H CPUs (Dapeng Mi)
Misc fixes and enhancements:
- x86/intel/pt: Fix buffer full but size is 0 case (Adrian Hunter)
- x86/amd: Warn only on new bits set (Breno Leitao)
- x86/amd/uncore: Avoid a false positive warning about snprintf
truncation in amd_uncore_umc_ctx_init (Jean Delvare)
- uprobes: Re-order struct uprobe_task to save some space
(Christophe JAILLET)
- x86/rapl: Move the pmu allocation out of CPU hotplug (Kan Liang)
- x86/rapl: Clean up cpumask and hotplug (Kan Liang)
- uprobes: Deuglify xol_get_insn_slot/xol_free_insn_slot paths (Oleg
Nesterov)"
* tag 'perf-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
perf/core: Correct perf sampling with guest VMs
perf/x86: Refactor misc flag assignments
perf/powerpc: Use perf_arch_instruction_pointer()
perf/core: Hoist perf_instruction_pointer() and perf_misc_flags()
perf/arm: Drop unused functions
uprobes: Re-order struct uprobe_task to save some space
perf/x86/amd/uncore: Avoid a false positive warning about snprintf truncation in amd_uncore_umc_ctx_init
perf/x86/intel: Do not enable large PEBS for events with aux actions or aux sampling
perf/x86/intel/pt: Add support for pause / resume
perf/core: Add aux_pause, aux_resume, aux_start_paused
perf/x86/intel/pt: Fix buffer full but size is 0 case
uprobes: SRCU-protect uretprobe lifetime (with timeout)
uprobes: allow put_uprobe() from non-sleepable softirq context
perf/x86/rapl: Clean up cpumask and hotplug
perf/x86/rapl: Move the pmu allocation out of CPU hotplug
uprobe: Add support for session consumer
uprobe: Add data pointer to consumer handlers
perf/x86/amd: Warn only on new bits set
uprobes: fold xol_take_insn_slot() into xol_get_insn_slot()
uprobes: kill xol_area->slot_count
...
Linus Torvalds [Tue, 19 Nov 2024 21:27:52 +0000 (13:27 -0800)]
Merge tag 'objtool-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Detect non-relocated text references for more robust
IBT sealing (Josh Poimboeuf)
- Fix build error when building stripped down
UAPI headers (HONG Yifan)
- Exclude __tracepoints data from ENDBR checks to fix
false positives on clang builds (Peter Zijlstra)
- Fix ORC unwind for newly forked tasks (Zheng Yejian)
- Fix readelf related faddr2line regression (Carlos Llamas)
* tag 'objtool-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Exclude __tracepoints data from ENDBR checks
Revert "scripts/faddr2line: Check only two symbols when calculating symbol size"
x86/unwind/orc: Fix unwind for newly forked tasks
objtool: Also include tools/include/uapi
objtool: Detect non-relocated text references
Linus Torvalds [Tue, 19 Nov 2024 20:43:11 +0000 (12:43 -0800)]
Merge tag 'locking-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"Lockdep:
- Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING (Sebastian Andrzej
Siewior)
- Add lockdep_cleanup_dead_cpu() (David Woodhouse)
futexes:
- Use atomic64_inc_return() in get_inode_sequence_number() (Uros
Bizjak)
- Use atomic64_try_cmpxchg_relaxed() in get_inode_sequence_number()
(Uros Bizjak)
spinlocks:
- Use atomic_try_cmpxchg_release() in osq_unlock() (Uros Bizjak)
atomics:
- x86: Use ALT_OUTPUT_SP() for __alternative_atomic64() (Uros Bizjak)
- x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu() (Uros
Bizjak)
KCSAN, seqlocks:
- Support seqcount_latch_t (Marco Elver)
<linux/cleanup.h>:
- Add if_not_guard() conditional guard helper (David Lechner)
- Adjust scoped_guard() macros to avoid potential warning (Przemek
Kitszel)
- Remove address space of returned pointer (Uros Bizjak)
Rust integration:
- Fix raw_spin_lock initialization on PREEMPT_RT (Eder Zulian)
Misc cleanups & fixes:
- lockdep: Fix wait-type check related warnings (Ahmed Ehab)
- lockdep: Use info level for initial info messages (Jiri Slaby)
- spinlocks: Make __raw_* lock ops static (Geert Uytterhoeven)
- pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase
(Qiuxu Zhuo)
- iio: magnetometer: Fix if () scoped_guard() formatting (Stephen
Rothwell)
- rtmutex: Fix misleading comment (Peter Zijlstra)
- percpu-rw-semaphores: Fix grammar in percpu-rw-semaphore.rst (Xiu
Jianfeng)"
* tag 'locking-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
locking/Documentation: Fix grammar in percpu-rw-semaphore.rst
iio: magnetometer: fix if () scoped_guard() formatting
rust: helpers: Avoid raw_spin_lock initialization for PREEMPT_RT
kcsan, seqlock: Fix incorrect assumption in read_seqbegin()
seqlock, treewide: Switch to non-raw seqcount_latch interface
kcsan, seqlock: Support seqcount_latch_t
time/sched_clock: Broaden sched_clock()'s instrumentation coverage
time/sched_clock: Swap update_clock_read_data() latch writes
locking/atomic/x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu()
locking/atomic/x86: Use ALT_OUTPUT_SP() for __alternative_atomic64()
cleanup: Add conditional guard helper
cleanup: Adjust scoped_guard() macros to avoid potential warning
locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
cleanup: Remove address space of returned pointer
locking/rtmutex: Fix misleading comment
locking/rt: Annotate unlock followed by lock for sparse.
locking/rt: Add sparse annotation for RCU.
locking/rt: Remove one __cond_lock() in RT's spin_trylock_irqsave()
locking/rt: Add sparse annotation PREEMPT_RT's sleeping locks.
locking/pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase
...
Linus Torvalds [Tue, 19 Nov 2024 20:27:19 +0000 (12:27 -0800)]
Merge tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid updates from Borislav Petkov:
- Add a feature flag which denotes AMD CPUs supporting workload
classification with the purpose of using such hints when making
scheduling decisions
- Determine the boost enumerator for each AMD core based on its type:
efficiency or performance, in the cppc driver
- Add the type of a CPU to the topology CPU descriptor with the goal of
supporting and making decisions based on the type of the respective
core
- Add a feature flag to denote AMD cores which have heterogeneous
topology and enable SD_ASYM_PACKING for those
- Check microcode revisions before disabling PCID on Intel
- Cleanups and fixlets
* tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Remove redundant CONFIG_NUMA guard around numa_add_cpu()
x86/cpu: Fix FAM5_QUARK_X1000 to use X86_MATCH_VFM()
x86/cpu: Fix formatting of cpuid_bits[] in scattered.c
x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit
x86/amd: Use heterogeneous core topology for identifying boost numerator
x86/cpu: Add CPU type to struct cpuinfo_topology
x86/cpu: Enable SD_ASYM_PACKING for PKG domain on AMD
x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES
x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
x86/mm: Don't disable PCID when INVLPG has been fixed by microcode
Linus Torvalds [Tue, 19 Nov 2024 20:21:35 +0000 (12:21 -0800)]
Merge tag 'x86_sev_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Do the proper memory conversion of guest memory in order to be able
to kexec kernels in SNP guests along with other adjustments and
cleanups to that effect
- Start converting and moving functionality from the sev-guest driver
into core code with the purpose of supporting the secure TSC SNP
feature where the hypervisor cannot influence the TSC exposed to the
guest anymore
- Add a "nosnp" cmdline option in order to be able to disable SNP
support in the hypervisor and thus free-up resources which are not
going to be used
- Cleanups
[ Reminding myself about the endless TLA's again: SEV is the AMD Secure
Encrypted Virtualization - Linus ]
* tag 'x86_sev_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Cleanup vc_handle_msr()
x86/sev: Convert shared memory back to private on kexec
x86/mm: Refactor __set_clr_pte_enc()
x86/boot: Skip video memory access in the decompressor for SEV-ES/SNP
virt: sev-guest: Carve out SNP message context structure
virt: sev-guest: Reduce the scope of SNP command mutex
virt: sev-guest: Consolidate SNP guest messaging parameters to a struct
x86/sev: Cache the secrets page address
x86/sev: Handle failures from snp_init()
virt: sev-guest: Use AES GCM crypto library
x86/virt: Provide "nosnp" boot option for sev kernel command line
x86/virt: Move SEV-specific parsing into arch/x86/virt/svm
Linus Torvalds [Tue, 19 Nov 2024 20:13:16 +0000 (12:13 -0800)]
Merge tag 'x86_microcode_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode loader update from Borislav Petkov:
- Remove the unconditional cache writeback and invalidation after
loading the microcode patch on Intel as this was addressing a
microcode bug for which there is a concrete microcode revision check
instead
* tag 'x86_microcode_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/intel: Remove unnecessary cache writeback and invalidation
Linus Torvalds [Tue, 19 Nov 2024 20:11:28 +0000 (12:11 -0800)]
Merge tag 'x86_cache_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cache resource control updates from Borislav Petkov:
- Add support for 6-node sub-NUMA clustering on Intel
- Cleanup
* tag 'x86_cache_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Support Sub-NUMA cluster mode SNC6
x86/resctrl: Slightly clean-up mbm_config_show()
Linus Torvalds [Tue, 19 Nov 2024 20:04:51 +0000 (12:04 -0800)]
Merge tag 'ras_core_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- Log and handle twp new AMD-specific MCA registers: SYND1 and SYND2
and report the Field Replaceable Unit text info reported through them
- Add support for handling variable-sized SMCA BERT records
- Add the capability for reporting vendor-specific RAS error info
without adding vendor-specific fields to struct mce
- Cleanups
* tag 'ras_core_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
EDAC/mce_amd: Add support for FRU text in MCA
x86/mce/apei: Handle variable SMCA BERT record size
x86/MCE/AMD: Add support for new MCA_SYND{1,2} registers
tracing: Add __print_dynamic_array() helper
x86/mce: Add wrapper for struct mce to export vendor specific info
x86/mce/intel: Use MCG_BANKCNT_MASK instead of 0xff
x86/mce/mcelog: Use xchg() to get and clear the flags
Linus Torvalds [Tue, 19 Nov 2024 20:00:10 +0000 (12:00 -0800)]
Merge tag 'edac_updates_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Add support for Bluefield-2 SOCs to bluefield_edac
- Add support for Intel Panther Lake-H to igen6_edac
- Add polling support to igen6_edac as some Intel M100 chips have
trouble with error interrupts
- Add Kaby Lake-S support to ie31200_edac
- Fix memory source detection in the SKX common module which is used by
a couple of Intel EDAC drivers
- Add support for the NXP i.MX9 memory controller to fsl_edac
- The usual fixes and cleanups all over the place
* tag 'edac_updates_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/igen6: Add polling support
EDAC/igen6: Initialize edac_op_state according to the configuration data
EDAC/igen6: Avoid segmentation fault on module unload
EDAC/ie31200: Add Kaby Lake-S dual-core host bridge ID
MAINTAINERS: Change FSL DDR EDAC maintainership
EDAC/{skx_common,i10nm}: Fix incorrect far-memory error source indicator
EDAC/skx_common: Differentiate memory error sources
EDAC/fsl_ddr: Add support for i.MX9 DDR controller
dt-bindings: memory: fsl: Add compatible string nxp,imx9-memory-controller
EDAC/fsl_ddr: Fix bad bit shift operations
EDAC/fsl_ddr: Move global variables into struct fsl_mc_pdata
EDAC/fsl_ddr: Pass down fsl_mc_pdata in ddr_in32() and ddr_out32()
RAS/AMD/ATL: Add debug prints for DF register reads
EDAC/bluefield: Use Arm SMC for EMI access on BlueField-2
EDAC/bluefield: Fix potential integer overflow
EDAC/igen6: Add Intel Panther Lake-H SoCs support
Linus Torvalds [Tue, 19 Nov 2024 19:44:17 +0000 (11:44 -0800)]
Merge tag 'kcsan-20241112-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/melver/linux
Pull Kernel Concurrency Sanitizer (KCSAN) updates from Marco Elver:
- Make KCSAN compatible with PREEMPT_RT
- Minor cleanup
* tag 'kcsan-20241112-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/melver/linux:
kcsan: Remove redundant call of kallsyms_lookup_name()
kcsan: Turn report_filterlist_lock into a raw_spinlock
Linus Torvalds [Tue, 19 Nov 2024 19:27:07 +0000 (11:27 -0800)]
Merge tag 'rcu.release.v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Frederic Weisbecker:
"SRCU:
- Introduction of the new SRCU-lite flavour with a new pair of
srcu_read_[un]lock_lite() APIs. In practice the read side using
this flavour becomes lighter by removing a full memory barrier on
LOCK and a full memory barrier on UNLOCK. This comes at the expense
of a higher latency write side with two (in the best case of a
snaphot of unused read-sides) or more RCU grace periods on the
update side which now assumes by itself the whole full ordering
guarantee against the LOCK/UNLOCK counters on both indexes, along
with the accesses performed inside.
Uretprobes is a known potential user.
Note this doesn't replace the default normal flavour of SRCU which
still behaves the same as usual.
- Add testing of SRCU-lite through rcutorture and rcuscale
- Various cleanups on the way.
Fixes:
- Allow short-circuiting RCU-TASKS-RUDE grace periods on
architectures that have sane noinstr boundaries forbidding tracing
on low-level idle and kernel entry code. RCU-TASKS is enough on
such configurations because it involves an RCU grace period that
waits for all idle tasks to either schedule out voluntarily or
enter into RCU unwatched noinstr code.
- Allow and test start_poll_synchronize_rcu() with IRQs disabled.
- Mention rcuog kthreads in relevant documentation and Kconfig help
- Various fixes and consolidations
rcutorture:
- Add --no-affinity on tools to leave the affinity setting of guests
up to the user.
- Add guest_os_delay parameter to rcuscale for better warm-up
control.
- Fix and improve some rcuscale error handling.
- Various cleanups and fixes
stall:
- Remove dead code
- Stop dumping tasks if a stalled grace period eventually ended
midway as that only produces confusing output.
- Optimize detection of stalling CPUs and avoid useless node locking
otherwise.
NOCB:
- Fix rcu_barrier() hang due to a race against callbacks
deoffloading. This is not yet used, except by rcutorture, and waits
for its promised cpusets interface.
- Remove leftover function declaration"
* tag 'rcu.release.v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (42 commits)
rcuscale: Remove redundant WARN_ON_ONCE() splat
rcuscale: Do a proper cleanup if kfree_scale_init() fails
srcu: Unconditionally record srcu_read_lock_lite() in ->srcu_reader_flavor
srcu: Check for srcu_read_lock_lite() across all CPUs
srcu: Remove smp_mb() from srcu_read_unlock_lite()
rcutorture: Avoid printing cpu=-1 for no-fault RCU boost failure
rcuscale: Add guest_os_delay module parameter
refscale: Correct affinity check
torture: Add --no-affinity parameter to kvm.sh
rcu/nocb: Fix missed RCU barrier on deoffloading
rcu/kvfree: Fix data-race in __mod_timer / kvfree_call_rcu
rcu/srcutiny: don't return before reenabling preemption
rcu-tasks: Remove open-coded one-byte cmpxchg() emulation
doc: Remove kernel-parameters.txt entry for rcutorture.read_exit
rcutorture: Test start-poll primitives with interrupts disabled
rcu: Permit start_poll_synchronize_rcu*() with interrupts disabled
rcu: Allow short-circuiting of synchronize_rcu_tasks_rude()
doc: Add rcuog kthreads to kernel-per-CPU-kthreads.rst
rcu: Add rcuog kthreads to RCU_NOCB_CPU help text
rcu: Use the BITS_PER_LONG macro
...
Linus Torvalds [Tue, 19 Nov 2024 19:23:52 +0000 (11:23 -0800)]
Merge tag 'hwmon-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hmon updates from Guenter Roeck:
"New drivers:
- ISL28022 power monitor
- Nuvoton NCT7363Y
Added support for new chips to existing drivers:
- The tmp180 driver now supports NXP p3t1085, including its I3C mode
- The ina2xx driver now supports SY24655 and INA260
- The amc6821 driver now supports tsd,mule
Other notable improvements:
- The sht4x driver now supports the chip heater
- The pmbus/isl68137 driver now supports a voltage divider on Vout
- The cros_ec driver registers with the thermal framework
- The pmbus/ltc2978 driver now supports LTC7841
- The PMBus core now allow drivers to override WRITE_PROTECT
Various other minor improvements and cleanups"
* tag 'hwmon-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (51 commits)
hwmon: (pmbus/isl68137) add support for voltage divider on Vout
dt-bindings: hwmon: isl68137: add bindings to support voltage dividers
hwmon: tmp108: fix I3C dependency
hwmon: (cros_ec) register thermal sensors to thermal framework
hwmon: (tmp108) Add support for I3C device
hwmon: (tmp108) Add helper function tmp108_common_probe() to prepare I3C support
hwmon: (acpi_power_meter) Fix fail to load module on platform without _PMD method
hwmon: (nct6775-core) Fix overflows seen when writing limit attributes
hwmon: (pwm-fan) Introduce start from stopped state handling
dt-bindings: hwmon: pwm-fan: Document start from stopped state properties
hwmon: (tmp108) Add NXP p3t1085 support
dt-bindings: hwmon: ti,tmp108: Add nxp,p3t1085 compatible string
hwmon: (sch5627, max31827) Fix typos in driver documentation
hwmon: (jc42) Drop of_match_ptr() protection
hwmon: (f71882fg) Fix grammar in fan speed trip points explanation
dt-bindings: hwmon: pmbus: add ti tps25990 support
hwmon: (pmbus/core) clear faults after setting smbalert mask
hwmon: (pmbus/core) allow drivers to override WRITE_PROTECT
hwmon: (pmbus) add documentation for existing flags
hwmon: (ina226) Add support for SY24655
...
Linus Torvalds [Tue, 19 Nov 2024 19:17:53 +0000 (11:17 -0800)]
Merge tag 'acpi-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These include a couple of fixes, a new ACPI backlight quirk for Apple
MacbookPro11,2 and Air7,2 and a bunch of cleanups:
- Fix _CPC register setting issue for registers located in memory in
the ACPI CPPC library code (Lifeng Zheng)
- Use DEFINE_SIMPLE_DEV_PM_OPS in the ACPI battery driver, make it
use devm_ for initializing mutexes and allocating driver data, and
make it check the register_pm_notifier() return value (Thomas
Weißschuh, Andy Shevchenko)
- Make the ACPI EC driver support compile-time conditional and allow
ACPI to be built without CONFIG_HAS_IOPORT (Arnd Bergmann)
- Remove a redundant error check from the pfr_telemetry driver (Colin
Ian King)
- Rearrange the processor_perflib code in the ACPI processor driver
to avoid compiling x86-specific code on other architectures (Arnd
Bergmann)
- Add adev NULL check to acpi_quirk_skip_serdev_enumeration() and
make UART skip quirks work on PCI UARTs without an UID (Hans de
Goede)
- Force native backlight handling Apple MacbookPro11,2 and Air7,2 in
the ACPI video driver (Jonathan Denose)
- Switch several ACPI platform drivers back to using struct
platform_driver::remove() (Uwe Kleine-König)
- Replace strcpy() with strscpy() in multiple places in the ACPI
subsystem (Muhammad Qasim Abdul Majeed, Abdul Rahim)"
* tag 'acpi-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits)
ACPI: video: force native for Apple MacbookPro11,2 and Air7,2
ACPI: CPPC: Fix _CPC register setting issue
ACPI: Switch back to struct platform_driver::remove()
ACPI: x86: Add adev NULL check to acpi_quirk_skip_serdev_enumeration()
ACPI: x86: Make UART skip quirks work on PCI UARTs without an UID
ACPI: allow building without CONFIG_HAS_IOPORT
ACPI: processor_perflib: extend X86 dependency
ACPI: scan: Use strscpy() instead of strcpy()
ACPI: SBSHC: Use strscpy() instead of strcpy()
ACPI: SBS: Use strscpy() instead of strcpy()
ACPI: power: Use strscpy() instead of strcpy()
ACPI: pci_root: Use strscpy() instead of strcpy()
ACPI: pci_link: Use strscpy() instead of strcpy()
ACPI: event: Use strscpy() instead of strcpy()
ACPI: EC: Use strscpy() instead of strcpy()
ACPI: APD: Use strscpy() instead of strcpy()
ACPI: thermal: Use strscpy() instead of strcpy()
ACPI: battery: Check for error code from devm_mutex_init() call
ACPI: EC: make EC support compile-time conditional
ACPI: pfr_telemetry: remove redundant error check on ret
...
Linus Torvalds [Tue, 19 Nov 2024 19:15:40 +0000 (11:15 -0800)]
Merge tag 'thermal-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These are thermal core changes, including the addition of support for
temperature thresholds that can be set from user space, fixes related
to thermal zone initialization, suspend/resume and exit, locking
rework and rearrangement of the code handling thermal zone temperature
updates.
Specifics:
- Add support for thermal thresholds that can be added and removed
from user space via netlink along with a related library update
(Daniel Lezcano)
- Fix thermal zone initialization, suspend/resume and exit
synchronization issues (Rafael Wysocki)
- Rearrange locking in the thermal core to use guards (Rafael
Wysocki)
- Make the code handling thermal zone temperature updates use sorted
lists of trip points to reduce the number of trip points table
walks in the thermal core (Rafael Wysocki)
- Fix and clean up the thermal testing facility code (Rafael Wysocki)
- Fix a Power Allocator thermal governor issue (ZhengShaobo)"
* tag 'thermal-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (45 commits)
thermal: testing: Initialize some variables annoteded with _free()
thermal: testing: Use DEFINE_FREE() and __free() to simplify code
thermal: testing: Simplify tt_get_tt_zone()
thermal: gov_power_allocator: Granted power set to max when nobody request power
thermal: core: Relocate thermal zone initialization routine
thermal: core: Use trip lists for trip crossing detection
thermal: core: Eliminate thermal_zone_trip_down()
thermal: core: Relocate functions that update trip points
thermal: core: Move some trip processing to thermal_trip_crossed()
thermal: core: Pass trip descriptor to thermal_trip_crossed()
thermal: core: Rearrange __thermal_zone_device_update()
thermal: core: Prepare for moving trips between sorted lists
thermal: core: Rename trip list node in struct thermal_trip_desc
thermal: core: Build sorted lists instead of sorting them later
thermal/lib: Fix memory leak on error in thermal_genl_auto()
thermal: thresholds: Fix thermal lock annotation issue
tools/thermal/thermal-engine: Take into account the thresholds API
tools/lib/thermal: Add the threshold netlink ABI
tools/lib/thermal: Make more generic the command encoding function
thermal: netlink: Add the commands and the events for the thresholds
...
Linus Torvalds [Tue, 19 Nov 2024 19:05:00 +0000 (11:05 -0800)]
Merge tag 'pm-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"The amd-pstate cpufreq driver gets the majority of changes this time.
They are mostly fixes and cleanups, but one of them causes it to
become the default cpufreq driver on some AMD server platforms.
Apart from that, the menu cpuidle governor is modified to not use
iowait any more, the intel_idle gets a custom C-states table for
Granite Rapids Xeon D, and the intel_pstate driver will use a more
aggressive Balance- performance default EPP value on Granite Rapids
now.
There are also some fixes, cleanups and tooling updates.
Specifics:
- Update the amd-pstate driver to set the initial scaling frequency
policy lower bound to be the lowest non-linear frequency (Dhananjay
Ugwekar)
- Enable amd-pstate by default on servers starting with newer AMD
Epyc processors (Swapnil Sapkal)
- Align more codepaths between shared memory and MSR designs in
amd-pstate (Dhananjay Ugwekar)
- Clean up amd-pstate code to rename functions and remove redundant
calls (Dhananjay Ugwekar, Mario Limonciello)
- Do other assorted fixes and cleanups in amd-pstate (Dhananjay
Ugwekar and Mario Limonciello)
- Change the Balance-performance EPP value for Granite Rapids in the
intel_pstate driver to a more performance-biased one (Srinivas
Pandruvada)
- Simplify MSR read on the boot CPU in the ACPI cpufreq driver (Chang
S. Bae)
- Ensure sugov_eas_rebuild_sd() is always called when sugov_init()
succeeds to always enforce sched domains rebuild in case EAS needs
to be enabled (Christian Loehle)
- Switch cpufreq back to platform_driver::remove() (Uwe Kleine-König)
- Use proper frequency unit names in cpufreq (Marcin Juszkiewicz)
- Add a built-in idle states table for Granite Rapids Xeon D to the
intel_idle driver (Artem Bityutskiy)
- Fix some typos in comments in the cpuidle core and drivers (Shen
Lichuan)
- Remove iowait influence from the menu cpuidle governor (Christian
Loehle)
- Add min/max available performance state limits to the Energy Model
management code (Lukasz Luba)
- Update pm-graph to v5.13 (Todd Brandt)
- Add documentation for some recently introduced cpupower utility
options (Tor Vic)
- Make cpupower inform users where cpufreq-bench.conf should be
located when opening it fails (Peng Fan)
- Allow overriding cross-compiling env params in cpupower (Peng Fan)
- Add compile_commands.json to .gitignore in cpupower (John B. Wyatt
IV)
- Improve disable c_state block in cpupower bindings and add a test
to confirm that CPU state is disabled to it (John B. Wyatt IV)
- Add Chinese Simplified translation to cpupower (Kieran Moy)
- Add checks for xgettext and msgfmt to cpupower (Siddharth Menon)"
* tag 'pm-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
cpufreq: intel_pstate: Update Balance-performance EPP for Granite Rapids
cpufreq: ACPI: Simplify MSR read on the boot CPU
sched/cpufreq: Ensure sd is rebuilt for EAS check
intel_idle: add Granite Rapids Xeon D support
PM: EM: Add min/max available performance state limits
cpufreq/amd-pstate: Move registration after static function call update
cpufreq/amd-pstate: Push adjust_perf vfunc init into cpu_init
cpufreq/amd-pstate: Align offline flow of shared memory and MSR based systems
cpufreq/amd-pstate: Call cppc_set_epp_perf in the reenable function
cpufreq/amd-pstate: Do not attempt to clear MSR_AMD_CPPC_ENABLE
cpufreq/amd-pstate: Rename functions that enable CPPC
cpufreq/amd-pstate-ut: Add fix for min freq unit test
amd-pstate: Switch to amd-pstate by default on some Server platforms
amd-pstate: Set min_perf to nominal_perf for active mode performance gov
cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call
cpufreq/amd-pstate: Remove the switch case in amd_pstate_init()
cpufreq/amd-pstate: Call amd_pstate_set_driver() in amd_pstate_register_driver()
cpufreq/amd-pstate: Call amd_pstate_register() in amd_pstate_init()
cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
cpufreq/amd-pstate: Remove the redundant verify() function
...
Linus Torvalds [Tue, 19 Nov 2024 18:43:44 +0000 (10:43 -0800)]
Merge tag 'random-6.13-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"This contains a single series from Uros to replace uses of
<linux/random.h> with prandom.h or other more specific headers
as needed, in order to avoid a circular header issue.
Uros' goal is to be able to use percpu.h from prandom.h, which
will then allow him to define __percpu in percpu.h rather than
in compiler_types.h"
* tag 'random-6.13-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
prandom: Include <linux/percpu.h> in <linux/prandom.h>
random: Do not include <linux/prandom.h> in <linux/random.h>
netem: Include <linux/prandom.h> in sch_netem.c
lib/test_scanf: Include <linux/prandom.h> instead of <linux/random.h>
lib/test_parman: Include <linux/prandom.h> instead of <linux/random.h>
bpf/tests: Include <linux/prandom.h> instead of <linux/random.h>
lib/rbtree-test: Include <linux/prandom.h> instead of <linux/random.h>
random32: Include <linux/prandom.h> instead of <linux/random.h>
kunit: string-stream-test: Include <linux/prandom.h>
lib/interval_tree_test.c: Include <linux/prandom.h> instead of <linux/random.h>
bpf: Include <linux/prandom.h> instead of <linux/random.h>
scsi: libfcoe: Include <linux/prandom.h> instead of <linux/random.h>
fscrypt: Include <linux/once.h> in fs/crypto/keyring.c
mtd: tests: Include <linux/prandom.h> instead of <linux/random.h>
media: vivid: Include <linux/prandom.h> in vivid-vid-cap.c
drm/lib: Include <linux/prandom.h> instead of <linux/random.h>
drm/i915/selftests: Include <linux/prandom.h> instead of <linux/random.h>
crypto: testmgr: Include <linux/prandom.h> instead of <linux/random.h>
x86/kaslr: Include <linux/prandom.h> instead of <linux/random.h>
Linus Torvalds [Tue, 19 Nov 2024 18:28:41 +0000 (10:28 -0800)]
Merge tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add sig driver API
- Remove signing/verification from akcipher API
- Move crypto_simd_disabled_for_test to lib/crypto
- Add WARN_ON for return values from driver that indicates memory
corruption
Algorithms:
- Provide crc32-arch and crc32c-arch through Crypto API
- Optimise crc32c code size on x86
- Optimise crct10dif on arm/arm64
- Optimise p10-aes-gcm on powerpc
- Optimise aegis128 on x86
- Output full sample from test interface in jitter RNG
- Retry without padata when it fails in pcrypt
Drivers:
- Add support for Airoha EN7581 TRNG
- Add support for STM32MP25x platforms in stm32
- Enable iproc-r200 RNG driver on BCMBCA
- Add Broadcom BCM74110 RNG driver"
* tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits)
crypto: marvell/cesa - fix uninit value for struct mv_cesa_op_ctx
crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()
crypto: aesni - Move back to module_init
crypto: lib/mpi - Export mpi_set_bit
crypto: aes-gcm-p10 - Use the correct bit to test for P10
hwrng: amd - remove reference to removed PPC_MAPLE config
crypto: arm/crct10dif - Implement plain NEON variant
crypto: arm/crct10dif - Macroify PMULL asm code
crypto: arm/crct10dif - Use existing mov_l macro instead of __adrl
crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
crypto: arm64/crct10dif - Remove obsolete chunking logic
crypto: bcm - add error check in the ahash_hmac_init function
crypto: caam - add error check to caam_rsa_set_priv_key_form
hwrng: bcm74110 - Add Broadcom BCM74110 RNG driver
dt-bindings: rng: add binding for BCM74110 RNG
padata: Clean up in padata_do_multithreaded()
crypto: inside-secure - Fix the return value of safexcel_xcbcmac_cra_init()
crypto: qat - Fix missing destroy_workqueue in adf_init_aer()
crypto: rsassa-pkcs1 - Reinstate support for legacy protocols
...
Linus Torvalds [Tue, 19 Nov 2024 18:25:47 +0000 (10:25 -0800)]
Merge tag 'chrome-platform-firmware-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform firmware updates from Tzung-Bi Shih:
- Do not double register "simple-framebuffer" platform device if
Generic System Framebuffers (sysfb) already did that
- Fix a missing of unregistering platform driver in error handling path
* tag 'chrome-platform-firmware-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
firmware: google: Unregister driver_info on failure
firmware: coreboot: Don't register a pdev if screen_info data is present
firmware: sysfb: Add a sysfb_handles_screen_info() helper function
Linus Torvalds [Tue, 19 Nov 2024 18:24:12 +0000 (10:24 -0800)]
Merge tag 'chrome-platform-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform updates from Tzung-Bi Shih:
"Fixes:
- Fix a leak of fwnode refcount.
Cleanups:
- Drop unused I2C driver data
- Move back from platform_driver::remove_new() to
platform_driver::remove()"
* tag 'chrome-platform-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: Switch back to struct platform_driver::remove()
platform/chrome: cros_ec_typec: fix missing fwnode reference decrement
platform/chrome: Drop explicit initialization of struct i2c_device_id::driver_data to 0
Linus Torvalds [Tue, 19 Nov 2024 18:18:45 +0000 (10:18 -0800)]
Merge tag 'csd-lock.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull CSD-lock update from Paul McKenney:
"This switches from sched_clock() to ktime_get_mono_fast_ns(), which on
x86 switches from the rdtsc instruction to the rdtscp instruction,
thus avoiding instruction reorderings that cause false-positive
reports of CSD-lock stalls of almost 2^46 nanoseconds. These false
positives are rare, but really are seen in the wild"
* tag 'csd-lock.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
locking/csd-lock: Switch from sched_clock() to ktime_get_mono_fast_ns()
Linus Torvalds [Tue, 19 Nov 2024 18:16:59 +0000 (10:16 -0800)]
Merge tag 'scftorture.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull scftorture updates from Paul McKenney:
- Avoid divide operation
- Fix cleanup code waiting for IPI handlers
- Move memory allocations out of preempt-disable region of code for
PREEMPT_RT compatibility
- Use a lockless list to avoid freeing memory while interrupts are
disabled, again for PREEMPT_RT compatibility
- Make lockless list scf_add_to_free_list() correctly handle freeing a
NULL pointer
* tag 'scftorture.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
scftorture: Handle NULL argument passed to scf_add_to_free_list().
scftorture: Use a lock-less list to free memory.
scftorture: Move memory allocation outside of preempt_disable region.
scftorture: Wait until scf_cleanup_handler() completes.
scftorture: Avoid additional div operation.
Linus Torvalds [Tue, 19 Nov 2024 18:13:53 +0000 (10:13 -0800)]
Merge tag 'nolibc.2024.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull nolibc updates from Paul McKenney:
- Fix potential error due to missing #include on s390
- Compatibility with -Wmissing-fallthrough
- Run qemu with more memory during tests
* tag 'nolibc.2024.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
selftests/nolibc: start qemu with 1 GiB of memory
tools/nolibc: compiler: add macro __nolibc_fallthrough
tools/nolibc: s390: include std.h
Linus Torvalds [Tue, 19 Nov 2024 02:26:57 +0000 (18:26 -0800)]
Merge tag 'for-linus-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- a series for booting as a PVH guest, doing some cleanups after the
previous work to make PVH boot code position independent
- a fix of the xenbus driver avoiding a leak in an error case
* tag 'for-linus-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Fix the issue of resource not being properly released in xenbus_dev_probe()
x86/pvh: Avoid absolute symbol references in .head.text
x86/xen: Avoid relocatable quantities in Xen ELF notes
x86/pvh: Omit needless clearing of phys_base
x86/pvh: Use correct size value in GDT descriptor
x86/pvh: Call C code via the kernel virtual mapping
Linus Torvalds [Tue, 19 Nov 2024 02:10:37 +0000 (18:10 -0800)]
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Support for running Linux in a protected VM under the Arm
Confidential Compute Architecture (CCA)
- Guarded Control Stack user-space support. Current patches follow the
x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
patches (already on the list) will add support for clone3() allowing
finer-grained control of the shadow stack size and placement from
libc
- AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
getting close with the upcoming dpISA support)
- Other arch features:
- In-kernel use of the memcpy instructions, FEAT_MOPS (previously
only exposed to user; uaccess support not merged yet)
- MTE: hugetlbfs support and the corresponding kselftests
- Optimise CRC32 using the PMULL instructions
- Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG
- Optimise the kernel TLB flushing to use the range operations
- POE/pkey (permission overlays): further cleanups after bringing
the signal handler in line with the x86 behaviour for 6.12
- arm64 perf updates:
- Support for the NXP i.MX91 PMU in the existing IMX driver
- Support for Ampere SoCs in the Designware PCIe PMU driver
- Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC
- Support for Samsung's 'Mongoose' CPU PMU
- Support for PMUv3.9 finer-grained userspace counter access
control
- Switch back to platform_driver::remove() now that it returns
'void'
- Add some missing events for the CXL PMU driver
- Miscellaneous arm64 fixes/cleanups:
- Page table accessors cleanup: type updates, drop unused macros,
reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
check addresses before runtime P4D/PUD folding
- Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
FEAT_ECV for the generic timers) allowing Linux to boot with
firmware deployments that don't set SCTLR_EL3.ECVEn
- ACPI/arm64: tighten the check for the array of platform timer
structures and adjust the error handling procedure in
gtdt_parse_timer_block()
- Optimise the cache flush for the uprobes xol slot (skip if no
change) and other uprobes/kprobes cleanups
- Fix the context switching of tpidrro_el0 when kpti is enabled
- Dynamic shadow call stack fixes
- Sysreg updates
- Various arm64 kselftest improvements
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits)
arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
kselftest/arm64: Try harder to generate different keys during PAC tests
kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
arm64/ptrace: Clarify documentation of VL configuration via ptrace
kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
acpi/arm64: remove unnecessary cast
arm64/mm: Change protval as 'pteval_t' in map_range()
kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
kselftest/arm64: Add FPMR coverage to fp-ptrace
kselftest/arm64: Expand the set of ZA writes fp-ptrace does
kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
kselftest/arm64: Enable build of PAC tests with LLVM=1
kselftest/arm64: Check that SVCR is 0 in signal handlers
selftests/mm: Fix unused function warning for aarch64_write_signal_pkey()
kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
kselftest/arm64: Fix build with stricter assemblers
arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
arm64/scs: Deal with 64-bit relative offsets in FDE frames
...
Linus Torvalds [Tue, 19 Nov 2024 02:00:18 +0000 (18:00 -0800)]
Merge tag 'm68k-for-v6.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
- Revive SCSI and early console support on MVME147
- Fix early kernel parameters using static keys
- Prevent and improve handling of kernel configurations that lack
specific platform, CPU, or MMU support, to avoid build failures
- Miscellaneous fixes and improvements
- Defconfig updates
* tag 'm68k-for-v6.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: defconfig: Update defconfigs for v6.12-rc1
m68k: mvme147: Reinstate early console
m68k: Make sure NR_IRQS is never zero
m68k: Select M68020 as fallback for classic
m68k: Move Sun 3 into a top-level platform option
m68k: kernel: Use str_read_write() helper function
m68k: Initialize jump labels early during setup_arch()
m68k: mvme147: Fix SCSI controller IRQ numbers
m68k: mvme147: Make mvme147_sched_init() __init
Linus Torvalds [Tue, 19 Nov 2024 01:48:39 +0000 (17:48 -0800)]
Merge tag 'mips_6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
"Just cleanups and fixes"
* tag 'mips_6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: dts: realtek: Add I2C controllers
mips: dts: realtek: Add syscon-reboot node
MIPS: loongson3_defconfig: Enable blk_dev_nvme by default
MIPS: loongson3_defconfig: Update configs dependencies
MAINTAINERS: Remove linux-mips.org references
MAINTAINERS: Retire Ralf Baechle
TC: Fix the wrong format specifier
MIPS: kernel: proc: Use str_yes_no() helper function
MIPS: mobileye: eyeq6h-epm6: Use eyeq6h in the board device tree
mips: bmips: bcm6358/6368: define required brcm,bmips-cbr-reg
MIPS: Allow using more than 32-bit addresses for reset vectors when possible
mips: asm: fix warning when disabling MIPS_FP_SUPPORT
mips: sgi-ip22: Replace "s[n]?printf" with sysfs_emit in sysfs callbacks
Linus Torvalds [Tue, 19 Nov 2024 01:45:41 +0000 (17:45 -0800)]
Merge tag 's390-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Add firmware sysfs interface which allows user space to retrieve the
dump area size of the machine
- Add 'measurement_chars_full' CHPID sysfs attribute to make the
complete associated Channel-Measurements Characteristics Block
available
- Add virtio-mem support
- Move gmap aka KVM page fault handling from the main fault handler to
KVM code. This is the first step to make s390 KVM page fault handling
similar to other architectures. With this first step the main fault
handler does not have any special handling anymore, and therefore
convert it to support LOCK_MM_AND_FIND_VMA
- With gcc 14 s390 support for flag output operand support for inline
assemblies was added. This allows for several optimizations:
- Provide a cmpxchg inline assembly which makes use of this, and
provide all variants of arch_try_cmpxchg() so that the compiler
can generate slightly better code
- Convert a few cmpxchg() loops to try_cmpxchg() loops
- Similar to x86 add a CC_OUT() helper macro (and other macros),
and convert all inline assemblies to make use of them, so that
depending on compiler version better code can be generated
- List installed host-key hashes in sysfs if the machine supports the
Query Ultravisor Keys UVC
- Add 'Retrieve Secret' ioctl which allows user space in protected
execution guests to retrieve previously stored secrets from the
Ultravisor
- Add pkey-uv module which supports the conversion of Ultravisor
retrievable secrets to protected keys
- Extend the existing paes cipher to exploit the full AES-XTS hardware
acceleration introduced with message-security assist extension 10
- Convert hopefully all sysfs show functions to use sysfs_emit() so
that the constant flow of such patches stop
- For PCI devices make use of the newly added Topology ID attribute to
enable whole card multi-function support despite the change to PCHID
per port. Additionally improve the overall robustness and usability
of the multifunction support
- Various other small improvements, fixes, and cleanups
* tag 's390-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (133 commits)
s390/cio/ioasm: Convert to use flag output macros
s390/cio/qdio: Convert to use flag output macros
s390/sclp: Convert to use flag output macros
s390/dasd: Convert to use flag output macros
s390/boot/physmem: Convert to use flag output macros
s390/pci: Convert to use flag output macros
s390/kvm: Convert to use flag output macros
s390/extmem: Convert to use flag output macros
s390/string: Convert to use flag output macros
s390/diag: Convert to use flag output macros
s390/irq: Convert to use flag output macros
s390/smp: Convert to use flag output macros
s390/uv: Convert to use flag output macros
s390/pai: Convert to use flag output macros
s390/mm: Convert to use flag output macros
s390/cpu_mf: Convert to use flag output macros
s390/cpcmd: Convert to use flag output macros
s390/topology: Convert to use flag output macros
s390/time: Convert to use flag output macros
s390/pageattr: Convert to use flag output macros
...
Linus Torvalds [Tue, 19 Nov 2024 01:34:05 +0000 (17:34 -0800)]
Merge tag 'lsm-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
"Thirteen patches, all focused on moving away from the current 'secid'
LSM identifier to a richer 'lsm_prop' structure.
This move will help reduce the translation that is necessary in many
LSMs, offering better performance, and make it easier to support
different LSMs in the future"
* tag 'lsm-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: remove lsm_prop scaffolding
netlabel,smack: use lsm_prop for audit data
audit: change context data from secid to lsm_prop
lsm: create new security_cred_getlsmprop LSM hook
audit: use an lsm_prop in audit_names
lsm: use lsm_prop in security_inode_getsecid
lsm: use lsm_prop in security_current_getsecid
audit: update shutdown LSM data
lsm: use lsm_prop in security_ipc_getsecid
audit: maintain an lsm_prop in audit_context
lsm: add lsmprop_to_secctx hook
lsm: use lsm_prop in security_audit_rule_match
lsm: add the lsm_prop data structure
Linus Torvalds [Tue, 19 Nov 2024 01:30:52 +0000 (17:30 -0800)]
Merge tag 'selinux-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add support for netlink xperms
Some time ago we added the concept of "xperms" to the SELinux policy
so that we could write policy for individual ioctls, this builds upon
this by using extending xperms to netlink so that we can write
SELinux policy for individual netlnk message types and not rely on
the fairly coarse read/write mapping tables we currently have.
There are limitations involving generic netlink due to the
multiplexing that is done, but it's no worse that what we currently
have. As usual, more information can be found in the commit message.
- Deprecate /sys/fs/selinux/user
We removed the only known userspace use of this back in 2020 and now
that several years have elapsed we're starting down the path of
deprecating it in the kernel.
- Cleanup the build under scripts/selinux
A couple of patches to move the genheaders tool under
security/selinux and correct our usage of kernel headers in the tools
located under scripts/selinux. While these changes originated out of
an effort to build Linux on different systems, they are arguably the
right thing to do regardless.
- Minor code cleanups and style fixes
Not much to say here, two minor cleanup patches that came out of the
netlink xperms work
* tag 'selinux-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: Deprecate /sys/fs/selinux/user
selinux: apply clang format to security/selinux/nlmsgtab.c
selinux: streamline selinux_nlmsg_lookup()
selinux: Add netlink xperm support
selinux: move genheaders to security/selinux/
selinux: do not include <linux/*.h> headers from host programs
Linus Torvalds [Tue, 19 Nov 2024 01:28:52 +0000 (17:28 -0800)]
Merge tag 'audit-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"The audit patches are minimal this time around with one patch to
correct some kdoc function parameters and one to leverage the
`str_yes_no()` function; nothing very exciting"
* tag 'audit-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: Use str_yes_no() helper function
audit: Reorganize kerneldoc parameter names
Linus Torvalds [Tue, 19 Nov 2024 01:02:57 +0000 (17:02 -0800)]
Merge tag 'for-6.13/io_uring-20241118' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe:
- Cleanups of the eventfd handling code, making it fully private.
- Support for sending a sync message to another ring, without having a
ring available to send a normal async message.
- Get rid of the separate unlocked hash table, unify everything around
the single locked one.
- Add support for ring resizing. It can be hard to appropriately size
the CQ ring upfront, if the application doesn't know how busy it will
be. This results in applications sizing rings for the most busy case,
which can be wasteful. With ring resizing, they can start small and
grow the ring, if needed.
- Add support for fixed wait regions, rather than needing to copy the
same wait data tons of times for each wait operation.
- Rewrite the resource node handling, which before was serialized per
ring. This caused issues with particularly fixed files, where one
file waiting on IO could hold up putting and freeing of other
unrelated files. Now each node is handled separately. New code is
much simpler too, and was a net 250 line reduction in code.
- Add support for just doing partial buffer clones, rather than always
cloning the entire buffer table.
- Series adding static NAPI support, where a specific NAPI instance is
used rather than having a list of them available that need lookup.
- Add support for mapped regions, and also convert the fixed wait
support mentioned above to that concept. This avoids doing special
mappings for various planned features, and folds the existing
registered wait into that too.
- Add support for hybrid IO polling, which is a variant of strict
IOPOLL but with an initial sleep delay to avoid spinning too early
and wasting resources on devices that aren't necessarily in the < 5
usec category wrt latencies.
- Various cleanups and little fixes.
* tag 'for-6.13/io_uring-20241118' of git://git.kernel.dk/linux: (79 commits)
io_uring/region: fix error codes after failed vmap
io_uring: restore back registered wait arguments
io_uring: add memory region registration
io_uring: introduce concept of memory regions
io_uring: temporarily disable registered waits
io_uring: disable ENTER_EXT_ARG_REG for IOPOLL
io_uring: fortify io_pin_pages with a warning
switch io_msg_ring() to CLASS(fd)
io_uring: fix invalid hybrid polling ctx leaks
io_uring/uring_cmd: fix buffer index retrieval
io_uring/rsrc: add & apply io_req_assign_buf_node()
io_uring/rsrc: remove '->ctx_ptr' of 'struct io_rsrc_node'
io_uring/rsrc: pass 'struct io_ring_ctx' reference to rsrc helpers
io_uring: avoid normal tw intermediate fallback
io_uring/napi: add static napi tracking strategy
io_uring/napi: clean up __io_napi_do_busy_loop
io_uring/napi: Use lock guards
io_uring/napi: improve __io_napi_add
io_uring/napi: fix io_napi_entry RCU accesses
io_uring/napi: protect concurrent io_napi_entry timeout accesses
...
Linus Torvalds [Tue, 19 Nov 2024 00:50:08 +0000 (16:50 -0800)]
Merge tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe updates via Keith:
- Use uring_cmd helper (Pavel)
- Host Memory Buffer allocation enhancements (Christoph)
- Target persistent reservation support (Guixin)
- Persistent reservation tracing (Guixen)
- NVMe 2.1 specification support (Keith)
- Rotational Meta Support (Matias, Wang, Keith)
- Volatile cache detection enhancment (Guixen)
- MD updates via Song:
- Maintainers update
- raid5 sync IO fix
- Enhance handling of faulty and blocked devices
- raid5-ppl atomic improvement
- md-bitmap fix
- Support for manually defining embedded partition tables
- Zone append fixes and cleanups
- Stop sending the queued requests in the plug list to the driver
->queue_rqs() handle in reverse order.
- Zoned write plug cleanups
- Cleanups disk stats tracking and add support for disk stats for
passthrough IO
- Add preparatory support for file system atomic writes
- Add lockdep support for queue freezing. Already found a bunch of
issues, and some fixes for that are in here. More will be coming.
- Fix race between queue stopping/quiescing and IO queueing
- ublk recovery improvements
- Fix ublk mmap for 64k pages
- Various fixes and cleanups
* tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux: (118 commits)
MAINTAINERS: Update git tree for mdraid subsystem
block: make struct rq_list available for !CONFIG_BLOCK
block/genhd: use seq_put_decimal_ull for diskstats decimal values
block: don't reorder requests in blk_mq_add_to_batch
block: don't reorder requests in blk_add_rq_to_plug
block: add a rq_list type
block: remove rq_list_move
virtio_blk: reverse request order in virtio_queue_rqs
nvme-pci: reverse request order in nvme_queue_rqs
btrfs: validate queue limits
block: export blk_validate_limits
nvmet: add tracing of reservation commands
nvme: parse reservation commands's action and rtype to string
nvmet: report ns's vwc not present
md/raid5: Increase r5conf.cache_name size
block: remove the ioprio field from struct request
block: remove the write_hint field from struct request
nvme: check ns's volatile write cache not present
nvme: add rotational support
nvme: use command set independent id ns if available
...
Linus Torvalds [Tue, 19 Nov 2024 00:45:28 +0000 (16:45 -0800)]
Merge tag 'ata-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata updates from Niklas Cassel:
- Fix typos in comments (Yan Zhen)
- Remove unused macro definitions (Damien Le Moal)
- Switch back to the .remove() callback (Uwe Kleine-König)
- Make use of the get_unaligned_be24() helper instead of open coding
(Andy Shevchenko)
- Refactor and cleanup ata_scsi_simulate() command emulation, such that
all commands use ata_scsi_rbuf_fill() with its own callback (Damien
Le Moal)
- Improve ata_scsi_simulate() command emulation by accurately setting
the SCSI command residual (number of bytes not filled) in the command
reply (Damien Le Moal)
- Add missing iommus property in ahci-platform device tree binding
(Frank Wunderlich)
* tag 'ata-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
dt-bindings: ata: ahci-platform: add missing iommus property
ata: libata-scsi: Return residual for emulated SCSI commands
ata: libata-scsi: Remove struct ata_scsi_args
ata: libata-scsi: Document all VPD page inquiry actors
ata: libata-scsi: Refactor ata_scsiop_maint_in()
ata: libata-scsi: Refactor ata_scsiop_read_cap()
ata: libata-scsi: Refactor ata_scsi_simulate()
ata: libata-scsi: Refactor scsi_6_lba_len() with use of get_unaligned_be24()
ata: Switch back to struct platform_driver::remove()
ata: libata: Remove unused macro definitions
ata: Fix typos in the comment
Linus Torvalds [Tue, 19 Nov 2024 00:37:41 +0000 (16:37 -0800)]
Merge tag 'for-6.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Changes outside of btrfs: add io_uring command flag to track a dying
task (the rest will go via the block git tree).
User visible changes:
- wire encoded read (ioctl) to io_uring commands, this can be used on
itself, in the future this will allow 'send' to be asynchronous. As
a consequence, the encoded read ioctl can also work in non-blocking
mode
- new ioctl to wait for cleaned subvolumes, no need to use the
generic and root-only SEARCH_TREE ioctl, will be used by "btrfs
subvol sync"
- recognize different paths/symlinks for the same devices and don't
report them during rescanning, this can be observed with LVM or DM
- seeding device use case change, the sprout device (the one
capturing new writes) will not clear the read-only status of the
super block; this prevents accumulating space from deleted
snapshots
Performance improvements:
- reduce lock contention when traversing extent buffers
- reduce extent tree lock contention when searching for inline
backref
- switch from rb-trees to xarray for delayed ref tracking,
improvements due to better cache locality, branching factors and
more compact data structures
- enable extent map shrinker again (prevent memory exhaustion under
some types of IO load), reworked to run in a single worker thread
(there used to be problems causing long stalls under memory
pressure)
Core changes:
- raid-stripe-tree feature updates:
- make device replace and scrub work
- implement partial deletion of stripe extents
- new selftests
- split the config option BTRFS_DEBUG and add EXPERIMENTAL for
features that are experimental or with known problems so we don't
misuse debugging config for that
- continued folio API conversions:
- buffered writes
- make buffered write copy one page at a time, preparatory work for
future integration with large folios, may cause performance drop
- proper locking of root item regarding starting send
- error handling improvements
- code cleanups and refactoring:
- dead code removal
- unused parameter reduction
- lockdep assertions"
* tag 'for-6.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (119 commits)
btrfs: send: check for read-only send root under critical section
btrfs: send: check for dead send root under critical section
btrfs: remove check for NULL fs_info at btrfs_folio_end_lock_bitmap()
btrfs: fix warning on PTR_ERR() against NULL device at btrfs_control_ioctl()
btrfs: fix a typo in btrfs_use_zone_append
btrfs: avoid superfluous calls to free_extent_map() in btrfs_encoded_read()
btrfs: simplify logic to decrement snapshot counter at btrfs_mksnapshot()
btrfs: remove hole from struct btrfs_delayed_node
btrfs: update stale comment for struct btrfs_delayed_ref_node::add_list
btrfs: add new ioctl to wait for cleaned subvolumes
btrfs: simplify range tracking in cow_file_range()
btrfs: remove conditional path allocation in btrfs_read_locked_inode()
btrfs: push cleanup into btrfs_read_locked_inode()
io_uring/cmd: let cmds to know about dying task
btrfs: add struct io_btrfs_cmd as type for io_uring_cmd_to_pdu()
btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)
btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()
btrfs: don't sleep in btrfs_encoded_read() if IOCB_NOWAIT is set
btrfs: change btrfs_encoded_read() so that reading of extent is done by caller
btrfs: remove pointless iocb::ki_pos addition in btrfs_encoded_read()
...
Linus Torvalds [Tue, 19 Nov 2024 00:32:58 +0000 (16:32 -0800)]
Merge tag 'ext4_for_linus-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"A lot of miscellaneous ext4 bug fixes and cleanups this cycle, most
notably in the journaling code, bufered I/O, and compiler warning
cleanups"
* tag 'ext4_for_linus-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (33 commits)
jbd2: Fix comment describing journal_init_common()
ext4: prevent an infinite loop in the lazyinit thread
ext4: use struct_size() to improve ext4_htree_store_dirent()
ext4: annotate struct fname with __counted_by()
jbd2: avoid dozens of -Wflex-array-member-not-at-end warnings
ext4: use str_yes_no() helper function
ext4: prevent delalloc to nodelalloc on remount
jbd2: make b_frozen_data allocation always succeed
ext4: cleanup variable name in ext4_fc_del()
ext4: use string choices helpers
jbd2: remove the 'success' parameter from the jbd2_do_replay() function
jbd2: remove useless 'block_error' variable
jbd2: factor out jbd2_do_replay()
jbd2: refactor JBD2_COMMIT_BLOCK process in do_one_pass()
jbd2: unified release of buffer_head in do_one_pass()
jbd2: remove redundant judgments for check v1 checksum
ext4: use ERR_CAST to return an error-valued pointer
mm: zero range of eof folio exposed by inode size extension
ext4: partial zero eof block on unaligned inode size extension
ext4: disambiguate the return value of ext4_dio_write_end_io()
...
Linus Torvalds [Mon, 18 Nov 2024 22:54:10 +0000 (14:54 -0800)]
Merge tag 'pull-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull statx updates from Al Viro:
"Sanitize struct filename and lookup flags handling in statx and
friends"
* tag 'pull-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
libfs: kill empty_dir_getattr()
fs: Simplify getattr interface function checking AT_GETATTR_NOSEC flag
fs/stat.c: switch to CLASS(fd_raw)
kill getname_statx_lookup_flags()
io_statx_prep(): use getname_uflags()
Linus Torvalds [Mon, 18 Nov 2024 20:58:23 +0000 (12:58 -0800)]
Merge tag 'pull-ufs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ufs updates from Al Viro:
"ufs cleanups, fixes and folio conversion"
* tag 'pull-ufs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: ufs_sb_private_info: remove unused s_{2,3}apb fields
ufs: Convert ufs_change_blocknr() to take a folio
ufs: Pass a folio to ufs_new_fragments()
ufs: Convert ufs_inode_getfrag() to take a folio
ufs: Convert ufs_extend_tail() to take a folio
ufs: Convert ufs_inode_getblock() to take a folio
ufs: take the handling of free block counters into a helper
clean ufs_trunc_direct() up a bit...
ufs: get rid of ubh_{ubhcpymem,memcpyubh}()
ufs_inode_getfrag(): remove junk comment
ufs_free_fragments(): fix the braino in sanity check
ufs_clusteracct(): switch to passing fragment number
ufs: untangle ubh_...block...(), part 3
ufs: untangle ubh_...block...(), part 2
ufs: untangle ubh_...block...() macros, part 1
ufs: fix ufs_read_cylinder() failure handling
ufs: missing ->splice_write()
ufs: fix handling of delete_entry and set_link failures
Dmitry Torokhov [Mon, 28 Oct 2024 19:07:50 +0000 (12:07 -0700)]
HID: multitouch: make mt_set_mode() less cryptic
mt_set_mode() accepts 2 boolean switches indicating whether the device
(if it follows Windows Precision Touchpad specification) should report
hardware buttons and/or surface contacts. For a casual reader it is
completely not clear, as they look at the call site, which exact mode
is being requested.
Define report_mode enum and change mt_set_mode() to accept is as
an argument instead. This allows to write:
Linus Torvalds [Mon, 18 Nov 2024 20:44:25 +0000 (12:44 -0800)]
Merge tag 'pull-xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull xattr updates from Al Viro:
"Sanitize xattr and io_uring interactions with it, add *xattrat()
syscalls, sanitize struct filename handling in there"
* tag 'pull-xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
xattr: remove redundant check on variable err
fs/xattr: add *at family syscalls
new helpers: file_removexattr(), filename_removexattr()
new helpers: file_listxattr(), filename_listxattr()
replace do_getxattr() with saner helpers.
replace do_setxattr() with saner helpers.
new helper: import_xattr_name()
fs: rename struct xattr_ctx to kernel_xattr_ctx
xattr: switch to CLASS(fd)
io_[gs]etxattr_prep(): just use getname()
io_uring: IORING_OP_F[GS]ETXATTR is fine with REQ_F_FIXED_FILE
getname_maybe_null() - the third variant of pathname copy-in
teach filename_lookup() to treat NULL filename as ""
Linus Torvalds [Mon, 18 Nov 2024 20:24:06 +0000 (12:24 -0800)]
Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 'struct fd' class updates from Al Viro:
"The bulk of struct fd memory safety stuff
Making sure that struct fd instances are destroyed in the same scope
where they'd been created, getting rid of reassignments and passing
them by reference, converting to CLASS(fd{,_pos,_raw}).
We are getting very close to having the memory safety of that stuff
trivial to verify"
* tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
deal with the last remaing boolean uses of fd_file()
css_set_fork(): switch to CLASS(fd_raw, ...)
memcg_write_event_control(): switch to CLASS(fd)
assorted variants of irqfd setup: convert to CLASS(fd)
do_pollfd(): convert to CLASS(fd)
convert do_select()
convert vfs_dedupe_file_range().
convert cifs_ioctl_copychunk()
convert media_request_get_by_fd()
convert spu_run(2)
switch spufs_calls_{get,put}() to CLASS() use
convert cachestat(2)
convert do_preadv()/do_pwritev()
fdget(), more trivial conversions
fdget(), trivial conversions
privcmd_ioeventfd_assign(): don't open-code eventfd_ctx_fdget()
o2hb_region_dev_store(): avoid goto around fdget()/fdput()
introduce "fd_pos" class, convert fdget_pos() users to it.
fdget_raw() users: switch to CLASS(fd_raw)
convert vmsplice() to CLASS(fd)
...
Linus Torvalds [Mon, 18 Nov 2024 19:44:06 +0000 (11:44 -0800)]
Merge tag 'vfs-6.13.ecryptfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull ecryptfs updates from Christian Brauner:
"The folio project is about to remove page->index. This contains the
work required for ecryptfs"
* tag 'vfs-6.13.ecryptfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ecryptfs: Pass the folio index to crypt_extent()
ecryptfs: Convert lower_offset_for_page() to take a folio
ecryptfs: Convert ecryptfs_decrypt_page() to take a folio
ecryptfs: Convert ecryptfs_encrypt_page() to take a folio
ecryptfs: Convert ecryptfs_write_lower_page_segment() to take a folio
ecryptfs: Convert ecryptfs_write() to use a folio
ecryptfs: Convert ecryptfs_read_lower_page_segment() to take a folio
ecryptfs: Convert ecryptfs_copy_up_encrypted_with_header() to take a folio
ecryptfs: Use a folio throughout ecryptfs_read_folio()
ecryptfs: Convert ecryptfs_writepage() to ecryptfs_writepages()
Linus Torvalds [Mon, 18 Nov 2024 19:30:09 +0000 (11:30 -0800)]
Merge tag 'vfs-6.13.untorn.writes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs untorn write support from Christian Brauner:
"An atomic write is a write issed with torn-write protection. This
means for a power failure or any hardware failure all or none of the
data from the write will be stored, never a mix of old and new data.
This work is already supported for block devices. If a block device is
opened with O_DIRECT and the block device supports atomic write, then
FMODE_CAN_ATOMIC_WRITE is added to the file of the opened block
device.
This contains the work to expand atomic write support to filesystems,
specifically ext4 and XFS. Currently, only support for writing exactly
one filesystem block atomically is added.
Since it's now possible to have filesystem block size > page size for
XFS, it's possible to write 4K+ blocks atomically on x86"
* tag 'vfs-6.13.untorn.writes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iomap: drop an obsolete comment in iomap_dio_bio_iter
ext4: Do not fallback to buffered-io for DIO atomic write
ext4: Support setting FMODE_CAN_ATOMIC_WRITE
ext4: Check for atomic writes support in write iter
ext4: Add statx support for atomic writes
xfs: Support setting FMODE_CAN_ATOMIC_WRITE
xfs: Validate atomic writes
xfs: Support atomic write for statx
fs: iomap: Atomic write support
fs: Export generic_atomic_write_valid()
block: Add bdev atomic write limits helpers
fs/block: Check for IOCB_DIRECT in generic_atomic_write_valid()
block/fs: Pass an iocb to generic_atomic_write_valid()
Linus Torvalds [Mon, 18 Nov 2024 19:05:26 +0000 (11:05 -0800)]
Merge tag 'vfs-6.13.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull tmpfs case folding updates from Christian Brauner:
"This adds case-insensitive support for tmpfs.
The work contained in here adds support for case-insensitive file
names lookups in tmpfs. The main difference from other casefold
filesystems is that tmpfs has no information on disk, just on RAM, so
we can't use mkfs to create a case-insensitive tmpfs. For this
implementation, there's a mount option for casefolding. The rest of
the patchset follows a similar approach as ext4 and f2fs.
The use case for this feature is similar to the use case for ext4, to
better support compatibility layers (like Wine), particularly in
combination with sandboxing/container tools (like Flatpak).
Those containerization tools can share a subset of the host filesystem
with an application. In the container, the root directory and any
parent directories required for a shared directory are on tmpfs, with
the shared directories bind-mounted into the container's view of the
filesystem.
If the host filesystem is using case-insensitive directories, then the
application can do lookups inside those directories in a
case-insensitive way, without this needing to be implemented in
user-space. However, if the host is only sharing a subset of a
case-insensitive directory with the application, then the parent
directories of the mount point will be part of the container's root
tmpfs. When the application tries to do case-insensitive lookups of
those parent directories on a case-sensitive tmpfs, the lookup will
fail"
* tag 'vfs-6.13.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
tmpfs: Initialize sysfs during tmpfs init
tmpfs: Fix type for sysfs' casefold attribute
libfs: Fix kernel-doc warning in generic_ci_validate_strict_name
docs: tmpfs: Add casefold options
tmpfs: Expose filesystem features via sysfs
tmpfs: Add flag FS_CASEFOLD_FL support for tmpfs dirs
tmpfs: Add casefold lookup support
libfs: Export generic_ci_ dentry functions
unicode: Recreate utf8_parse_version()
unicode: Export latest available UTF-8 version number
ext4: Use generic_ci_validate_strict_name helper
libfs: Create the helper function generic_ci_validate_strict_name()
Linus Torvalds [Mon, 18 Nov 2024 18:50:09 +0000 (10:50 -0800)]
Merge tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull copy_struct_to_user helper from Christian Brauner:
"This adds a copy_struct_to_user() helper which is a companion helper
to the already widely used copy_struct_from_user().
It copies a struct from kernel space to userspace, in a way that
guarantees backwards-compatibility for struct syscall arguments as
long as future struct extensions are made such that all new fields are
appended to the old struct, and zeroed-out new fields have the same
meaning as the old struct.
The first user is sched_getattr() system call but the new extensible
pidfs ioctl will be ported to it as well"
* tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
sched_getattr: port to copy_struct_to_user
uaccess: add copy_struct_to_user helper
Linus Torvalds [Mon, 18 Nov 2024 18:47:46 +0000 (10:47 -0800)]
Merge tag 'vfs-6.13.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pidfs update from Christian Brauner:
"This adds a new ioctl to retrieve information about a pidfd.
A common pattern when using pidfds is having to get information about
the process, which currently requires /proc being mounted, resolving
the fd to a pid, and then do manual string parsing of /proc/N/status
and friends. This needs to be reimplemented over and over in all
userspace projects (e.g.: it has been reimplemented in systemd, dbus,
dbus-daemon, polkit so far), and requires additional care in checking
that the fd is still valid after having parsed the data, to avoid
races.
Having a programmatic API that can be used directly removes all these
requirements, including having /proc mounted.
As discussed at LPC24, add an ioctl with an extensible struct so that
more parameters can be added later if needed. Start with returning
pid/tgid/ppid and some creds unconditionally, and cgroupid optionally"
* tag 'vfs-6.13.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
pidfd: add ioctl to retrieve pid info
Linus Torvalds [Mon, 18 Nov 2024 18:45:06 +0000 (10:45 -0800)]
Merge tag 'vfs-6.13.ovl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull overlayfs updates from Christian Brauner:
"Make overlayfs support specifying layers through file descriptors.
Currently overlayfs only allows specifying layers through path names.
This is inconvenient for users that want to assemble an overlayfs
mount purely based on file descriptors:
There's also a large set of new overlayfs selftests to test new
features and some older properties"
* tag 'vfs-6.13.ovl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: add test for specifying 500 lower layers
selftests: add overlayfs fd mounting selftests
selftests: use shared header
Documentation,ovl: document new file descriptor based layers
ovl: specify layers via file descriptors
fs: add helper to use mount option as path or fd
Linus Torvalds [Mon, 18 Nov 2024 18:30:29 +0000 (10:30 -0800)]
Merge tag 'vfs-6.13.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs file updates from Christian Brauner:
"This contains changes the changes for files for this cycle:
- Introduce a new reference counting mechanism for files.
As atomic_inc_not_zero() is implemented with a try_cmpxchg() loop
it has O(N^2) behaviour under contention with N concurrent
operations and it is in a hot path in __fget_files_rcu().
The rcuref infrastructures remedies this problem by using an
unconditional increment relying on safe- and dead zones to make
this work and requiring rcu protection for the data structure in
question. This not just scales better it also introduces overflow
protection.
However, in contrast to generic rcuref, files require a memory
barrier and thus cannot rely on *_relaxed() atomic operations and
also require to be built on atomic_long_t as having massive amounts
of reference isn't unheard of even if it is just an attack.
This adds a file specific variant instead of making this a generic
library.
This has been tested by various people and it gives consistent
improvement up to 3-5% on workloads with loads of threads.
- Add a fastpath for find_next_zero_bit(). Skip 2-levels searching
via find_next_zero_bit() when there is a free slot in the word that
contains the next fd. This improves pts/blogbench-1.1.0 read by 8%
and write by 4% on Intel ICX 160.
- Conditionally clear full_fds_bits since it's very likely that a bit
in full_fds_bits has been cleared during __clear_open_fds(). This
improves pts/blogbench-1.1.0 read up to 13%, and write up to 5% on
Intel ICX 160.
- Get rid of all lookup_*_fdget_rcu() variants. They were used to
lookup files without taking a reference count. That became invalid
once files were switched to SLAB_TYPESAFE_BY_RCU and now we're
always taking a reference count. Switch to an already existing
helper and remove the legacy variants.
- Remove pointless includes of <linux/fdtable.h>.
- Avoid cmpxchg() in close_files() as nobody else has a reference to
the files_struct at that point.
- Move close_range() into fs/file.c and fold __close_range() into it.
- Cleanup calling conventions of alloc_fdtable() and expand_files().
- Merge __{set,clear}_close_on_exec() into one.
- Make __set_open_fd() set cloexec as well instead of doing it in two
separate steps"
* tag 'vfs-6.13.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: add file SLAB_TYPESAFE_BY_RCU recycling stressor
fs: port files to file_ref
fs: add file_ref
expand_files(): simplify calling conventions
make __set_open_fd() set cloexec state as well
fs: protect backing files with rcu
file.c: merge __{set,clear}_close_on_exec()
alloc_fdtable(): change calling conventions.
fs/file.c: add fast path in find_next_fd()
fs/file.c: conditionally clear full_fds
fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()
move close_range(2) into fs/file.c, fold __close_range() into it
close_files(): don't bother with xchg()
remove pointless includes of <linux/fdtable.h>
get rid of ...lookup...fdget_rcu() family
Linus Torvalds [Mon, 18 Nov 2024 18:26:49 +0000 (10:26 -0800)]
Merge tag 'vfs-6.13.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull netfs updates from Christian Brauner:
"Various fixes for the netfs library and related infrastructure:
cachefiles:
- Fix a dentry leak in cachefiles_open_file()
- Fix incorrect length return value in
cachefiles_ondemand_fd_write_iter()
- Fix missing pos updates in cachefiles_ondemand_fd_write_iter()
- Clean up in cachefiles_commit_tmpfile()
- Fix NULL pointer dereference in object->file
- Add a memory barrier for FSCACHE_VOLUME_CREATING
netfs:
- Remove call to folio_index()
- Fix a few minor bugs in netfs_page_mkwrite()
- Remove unnecessary references to pages"
* tag 'vfs-6.13.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
netfs/fscache: Add a memory barrier for FSCACHE_VOLUME_CREATING
cachefiles: Fix NULL pointer dereference in object->file
cachefiles: Clean up in cachefiles_commit_tmpfile()
cachefiles: Fix missing pos updates in cachefiles_ondemand_fd_write_iter()
cachefiles: Fix incorrect length return value in cachefiles_ondemand_fd_write_iter()
netfs: Remove unnecessary references to pages
netfs: Fix a few minor bugs in netfs_page_mkwrite()
netfs: Remove call to folio_index()
Linus Torvalds [Mon, 18 Nov 2024 17:54:32 +0000 (09:54 -0800)]
Merge tag 'vfs-6.13.pagecache' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs pagecache updates from Christian Brauner:
"Cleanup filesystem page flag usage: This continues the work to make
the mappedtodisk/owner_2 flag available to filesystems which don't use
buffer heads. Further patches remove uses of Private2. This brings us
very close to being rid of it entirely"
* tag 'vfs-6.13.pagecache' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
migrate: Remove references to Private2
ceph: Remove call to PagePrivate2()
btrfs: Switch from using the private_2 flag to owner_2
mm: Remove PageMappedToDisk
nilfs2: Convert nilfs_copy_buffer() to use folios
fs: Move clearing of mappedtodisk to buffer.c
Linus Torvalds [Mon, 18 Nov 2024 17:51:32 +0000 (09:51 -0800)]
Merge tag 'vfs-6.13.rust.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs rust file abstractions from Christian Brauner:
"This contains the file abstractions needed by the Rust implementation
of the Binder driver and other parts of the kernel.
Let's treat this as a first attempt at getting something working but I
do expect the actual interfaces to change significantly over time.
Simply because we are still figuring out what actually works. But
there's no point in further theorizing. Let's see how it holds up with
actual users"
* tag 'vfs-6.13.rust.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
rust: task: adjust safety comments in Task methods
rust: add seqfile abstraction
rust: file: add abstraction for `poll_table`
rust: file: add `Kuid` wrapper
rust: file: add `FileDescriptorReservation`
rust: security: add abstraction for secctx
rust: cred: add Rust abstraction for `struct cred`
rust: file: add Rust abstraction for `struct file`
rust: task: add `Task::current_raw`
rust: types: add `NotThreadSafe`
Linus Torvalds [Mon, 18 Nov 2024 17:35:30 +0000 (09:35 -0800)]
Merge tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Features:
- Fixup and improve NLM and kNFSD file lock callbacks
Last year both GFS2 and OCFS2 had some work done to make their
locking more robust when exported over NFS. Unfortunately, part of
that work caused both NLM (for NFS v3 exports) and kNFSD (for
NFSv4.1+ exports) to no longer send lock notifications to clients
This in itself is not a huge problem because most NFS clients will
still poll the server in order to acquire a conflicted lock
It's important for NLM and kNFSD that they do not block their
kernel threads inside filesystem's file_lock implementations
because that can produce deadlocks. We used to make sure of this by
only trusting that posix_lock_file() can correctly handle blocking
lock calls asynchronously, so the lock managers would only setup
their file_lock requests for async callbacks if the filesystem did
not define its own lock() file operation
However, when GFS2 and OCFS2 grew the capability to correctly
handle blocking lock requests asynchronously, they started
signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check
for also trusting posix_lock_file() was inadvertently dropped, so
now most filesystems no longer produce lock notifications when
exported over NFS
Fix this by using an fop_flag which greatly simplifies the problem
and grooms the way for future uses by both filesystems and lock
managers alike
- Add a sysctl to delete the dentry when a file is removed instead of
making it a negative dentry
Commit 681ce8623567 ("vfs: Delete the associated dentry when
deleting a file") introduced an unconditional deletion of the
associated dentry when a file is removed. However, this led to
performance regressions in specific benchmarks, such as
ilebench.sum_operations/s, prompting a revert in commit 4a4be1ad3a6e ("Revert "vfs: Delete the associated dentry when
deleting a file""). This reintroduces the concept conditionally
through a sysctl
- Expand the statmount() system call:
* Report the filesystem subtype in a new fs_subtype field to
e.g., report fuse filesystem subtypes
* Report the superblock source in a new sb_source field
* Add a new way to return filesystem specific mount options in an
option array that returns filesystem specific mount options
separated by zero bytes and unescaped. This allows caller's to
retrieve filesystem specific mount options and immediately pass
them to e.g., fsconfig() without having to unescape or split
them
* Report security (LSM) specific mount options in a separate
security option array. We don't lump them together with
filesystem specific mount options as security mount options are
generic and most users aren't interested in them
The format is the same as for the filesystem specific mount
option array
- Support relative paths in fsconfig()'s FSCONFIG_SET_STRING command
- Optimize acl_permission_check() to avoid costly {g,u}id ownership
checks if possible
- Use smp_mb__after_spinlock() to avoid full smp_mb() in evict()
- Add synchronous wakeup support for ep_poll_callback.
Currently, epoll only uses wake_up() to wake up task. But sometimes
there are epoll users which want to use the synchronous wakeup flag
to give a hint to the scheduler, e.g., the Android binder driver.
So add a wake_up_sync() define, and use wake_up_sync() when sync is
true in ep_poll_callback()
Fixes:
- Fix kernel documentation for inode_insert5() and iget5_locked()
- Annotate racy epoll check on file->f_ep
- Make F_DUPFD_QUERY associative
- Avoid filename buffer overrun in initramfs
- Don't let statmount() return empty strings
- Add a cond_resched() to dump_user_range() to avoid hogging the CPU
- Don't query the device logical blocksize multiple times for hfsplus
- Make filemap_read() check that the offset is positive or zero
Cleanups:
- Various typo fixes
- Cleanup wbc_attach_fdatawrite_inode()
- Add __releases annotation to wbc_attach_and_unlock_inode()
- Add hugetlbfs tracepoints
- Fix various vfs kernel doc parameters
- Remove obsolete TODO comment from io_cancel()
- Convert wbc_account_cgroup_owner() to take a folio
- Fix comments for BANDWITH_INTERVAL and wb_domain_writeout_add()
- Reorder struct posix_acl to save 8 bytes
- Annotate struct posix_acl with __counted_by()
- Replace one-element array with flexible array member in freevxfs
- Use idiomatic atomic64_inc_return() in alloc_mnt_ns()"
* tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
statmount: retrieve security mount options
vfs: make evict() use smp_mb__after_spinlock instead of smp_mb
statmount: add flag to retrieve unescaped options
fs: add the ability for statmount() to report the sb_source
writeback: wbc_attach_fdatawrite_inode out of line
writeback: add a __releases annoation to wbc_attach_and_unlock_inode
fs: add the ability for statmount() to report the fs_subtype
fs: don't let statmount return empty strings
fs:aio: Remove TODO comment suggesting hash or array usage in io_cancel()
hfsplus: don't query the device logical block size multiple times
freevxfs: Replace one-element array with flexible array member
fs: optimize acl_permission_check()
initramfs: avoid filename buffer overrun
fs/writeback: convert wbc_account_cgroup_owner to take a folio
acl: Annotate struct posix_acl with __counted_by()
acl: Realign struct posix_acl to save 8 bytes
epoll: Add synchronous wakeup support for ep_poll_callback
coredump: add cond_resched() to dump_user_range
mm/page-writeback.c: Fix comment of wb_domain_writeout_add()
mm/page-writeback.c: Update comment for BANDWIDTH_INTERVAL
...
Linus Torvalds [Mon, 18 Nov 2024 17:33:34 +0000 (09:33 -0800)]
Merge tag 'vfs-6.13.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount api conversions from Christian Brauner:
"Convert adfs, affs, befs, hfs, hfsplus, jfs, and hpfs to the new mount
api"
* tag 'vfs-6.13.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
efs: fix the efs new mount api implementation
ubifs: Convert ubifs to use the new mount API
hpfs: convert hpfs to use the new mount api
jfs: convert jfs to use the new mount api
hfsplus: convert hfsplus to use the new mount api
hfs: convert hfs to use the new mount api
befs: convert befs to use the new mount api
affs: convert affs to use the new mount api
adfs: convert adfs to use the new mount api
Linus Torvalds [Mon, 18 Nov 2024 17:15:39 +0000 (09:15 -0800)]
Merge tag 'vfs-6.13.mgtime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs multigrain timestamps from Christian Brauner:
"This is another try at implementing multigrain timestamps. This time
with significant help from the timekeeping maintainers to reduce the
performance impact.
Thomas provided a base branch that contains the required timekeeping
interfaces for the VFS. It serves as the base for the multi-grain
timestamp work:
- Multigrain timestamps allow the kernel to use fine-grained
timestamps when an inode's attributes is being actively observed
via ->getattr(). With this support, it's possible for a file to get
a fine-grained timestamp, and another modified after it to get a
coarse-grained stamp that is earlier than the fine-grained time. If
this happens then the files can appear to have been modified in
reverse order, which breaks VFS ordering guarantees.
To prevent this, a floor value is maintained for multigrain
timestamps. Whenever a fine-grained timestamp is handed out, record
it, and when later coarse-grained stamps are handed out, ensure
they are not earlier than that value. If the coarse-grained
timestamp is earlier than the fine-grained floor, return the floor
value instead.
The timekeeper changes add a static singleton atomic64_t into
timekeeper.c that is used to keep track of the latest fine-grained
time ever handed out. This is tracked as a monotonic ktime_t value
to ensure that it isn't affected by clock jumps. Because it is
updated at different times than the rest of the timekeeper object,
the floor value is managed independently of the timekeeper via a
cmpxchg() operation, and sits on its own cacheline.
Two new public timekeeper interfaces are added:
(1) ktime_get_coarse_real_ts64_mg() fills a timespec64 with the
later of the coarse-grained clock and the floor time
(2) ktime_get_real_ts64_mg() gets the fine-grained clock value,
and tries to swap it into the floor. A timespec64 is filled
with the result.
- The VFS has always used coarse-grained timestamps when updating the
ctime and mtime after a change. This has the benefit of allowing
filesystems to optimize away a lot metadata updates, down to around
1 per jiffy, even when a file is under heavy writes.
Unfortunately, this has always been an issue when we're exporting
via NFSv3, which relies on timestamps to validate caches. A lot of
changes can happen in a jiffy, so timestamps aren't sufficient to
help the client decide when to invalidate the cache. Even with
NFSv4, a lot of exported filesystems don't properly support a
change attribute and are subject to the same problems with
timestamp granularity. Other applications have similar issues with
timestamps (e.g backup applications).
If we were to always use fine-grained timestamps, that would
improve the situation, but that becomes rather expensive, as the
underlying filesystem would have to log a lot more metadata
updates.
This adds a way to only use fine-grained timestamps when they are
being actively queried. Use the (unused) top bit in
inode->i_ctime_nsec as a flag that indicates whether the current
timestamps have been queried via stat() or the like. When it's set,
we allow the kernel to use a fine-grained timestamp iff it's
necessary to make the ctime show a different value.
This solves the problem of being able to distinguish the timestamp
between updates, but introduces a new problem: it's now possible
for a file being changed to get a fine-grained timestamp. A file
that is altered just a bit later can then get a coarse-grained one
that appears older than the earlier fine-grained time. This
violates timestamp ordering guarantees.
This is where the earlier mentioned timkeeping interfaces help. A
global monotonic atomic64_t value is kept that acts as a timestamp
floor. When we go to stamp a file, we first get the latter of the
current floor value and the current coarse-grained time. If the
inode ctime hasn't been queried then we just attempt to stamp it
with that value.
If it has been queried, then first see whether the current coarse
time is later than the existing ctime. If it is, then we accept
that value. If it isn't, then we get a fine-grained time and try to
swap that into the global floor. Whether that succeeds or fails, we
take the resulting floor time, convert it to realtime and try to
swap that into the ctime.
We take the result of the ctime swap whether it succeeds or fails,
since either is just as valid.
Filesystems can opt into this by setting the FS_MGTIME fstype flag.
Others should be unaffected (other than being subject to the same
floor value as multigrain filesystems)"
* tag 'vfs-6.13.mgtime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: reduce pointer chasing in is_mgtime() test
tmpfs: add support for multigrain timestamps
btrfs: convert to multigrain timestamps
ext4: switch to multigrain timestamps
xfs: switch to multigrain timestamps
Documentation: add a new file documenting multigrain timestamps
fs: add percpu counters for significant multigrain timestamp events
fs: tracepoints around multigrain timestamp events
fs: handle delegated timestamps in setattr_copy_mgtime
timekeeping: Add percpu counter for tracking floor swap events
timekeeping: Add interfaces for handling timestamps with a floor value
fs: have setattr_copy handle multigrain timestamps appropriately
fs: add infrastructure for multigrain timestamps
posix-timers: Fix spurious warning on double enqueue versus do_exit()
A timer sigqueue may find itself already pending when it is tried to
be enqueued. This situation can happen if the timer sigqueue is enqueued
but then the timer is reset afterwards and fires before the pending
signal managed to be delivered.
However when such a double enqueue occurs while the corresponding signal
is ignored, the sigqueue is expected to be found either on the dedicated
ignored list if the timer was periodic or dropped if the timer was
one-shot. In any case it is not supposed to be queued on the real signal
queue.
An assertion verifies the latter expectation on top of the return value
of prepare_signal(), assuming "false" means that the signal is being
ignored. But prepare_signal() may also fail if the target is exiting as
the last task of its group. In this case the double enqueue observes the
sigqueue queued, as in such a situation:
TASK A (same group as B) TASK B (same group as A)
------------------------ ------------------------
// timer event
// queue signal to TASK B
posix_timer_queue_signal()
// reset timer through syscall
do_timer_settime()
// exit, leaving task B alone
do_exit()
do_exit()
synchronize_group_exit()
signal->flags = SIGNAL_GROUP_EXIT
// ========> <IRQ> timer event
posix_timer_queue_signal()
// return false due to SIGNAL_GROUP_EXIT
if (!prepare_signal())
WARN_ON_ONCE(!list_empty(&q->list))
Fortunately the recovery code in that case already does the right thing:
just exit from posixtimer_send_sigqueue() and wait for __exit_signal()
to flush the pending signal. Just make sure to warn only the case when
the sigqueue is queued and the signal is really ignored.
Nir Lichtman [Mon, 11 Nov 2024 21:56:22 +0000 (21:56 +0000)]
kdb: fix ctrl+e/a/f/b/d/p/n broken in keyboard mode
Problem: When using kdb via keyboard it does not react to control
characters which are supported in serial mode.
Example: Chords such as ctrl+a/e/d/p do not work in keyboard mode
Solution: Before disregarding non-printable key characters, check if they
are one of the supported control characters, I have took the control
characters from the switch case upwards in this function that translates
scan codes of arrow keys/backspace/home/.. to the control characters.