Git Repo - linux.git/log

wireguard: queueing: use CFI-safe ptr_ring cleanup function

We make too nuanced use of ptr_ring to entirely move to the skb_array
wrappers, but we at least should avoid the naughty function pointer cast
when cleaning up skbs. Otherwise RAP/CFI will honk at us. This patch
uses the __skb_array_destroy_skb wrapper for the cleanup, rather than
directly providing kfree_skb, which is what other drivers in the same
situation do too.

Reported-by: PaX Team <[email protected]>
Fixes: 886fcee939ad ("wireguard: receive: use ring buffer for incoming handshakes")
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

RISC-V CPU Idle Support

This series adds RISC-V CPU Idle support using SBI HSM suspend function.
The RISC-V SBI CPU idle driver added by this series is highly inspired
from the ARM PSCI CPU idle driver.

Special thanks Sandeep Tripathy for providing early feeback on SBI HSM
support in all above projects (RISC-V SBI specification, OpenSBI, and
Linux RISC-V).

* palmer/riscv-idle:
  RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine
  dt-bindings: Add common bindings for ARM and RISC-V idle states
  cpuidle: Add RISC-V SBI CPU idle driver
  cpuidle: Factor-out power domain related code from PSCI domain driver
  RISC-V: Add SBI HSM suspend related defines
  RISC-V: Add arch functions for non-retentive suspend entry/exit
  RISC-V: Rename relocate() and make it global
  RISC-V: Enable CPU_IDLE drivers

mm: page_alloc: validate buddy before check its migratetype.

Whenever a buddy page is found, page_is_buddy() should be called to
check its validity. Add the missing check during pageblock merge check.

Fixes: 1dd214b8f21c ("mm: page_alloc: avoid merging non-fallbackable pageblocks with others")
Link: https://lore.kernel.org/all/[email protected]/
Reported-and-tested-by: Steven Rostedt <[email protected]>
Signed-off-by: Zi Yan <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

riscv: Rename "sp_in_global" to "current_stack_pointer"

To follow the existing per-arch conventions, rename "sp_in_global" to
"current_stack_pointer". This will let it be used in non-arch places
(like HARDENED_USERCOPY).

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

Merge tag 'for-5.18/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull more parisc architecture updates from Helge Deller:

- Revert a patch to the invalidate/flush vmap routines which broke
   kernel patching functions on older PA-RISC machines.

- Fix the kernel patching code wrt locking and flushing. Works now on
   B160L machine as well.

- Fix CPU IRQ affinity for LASI, WAX and Dino chips

- Add CPU hotplug support

- Detect the hppa-suse-linux-gcc compiler when cross-compiling

* tag 'for-5.18/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix patch code locking and flushing
  parisc: Find a new timesync master if current CPU is removed
  parisc: Move common_stext into .text section when CONFIG_HOTPLUG_CPU=y
  parisc: Rewrite arch_cpu_idle_dead() for CPU hotplugging
  parisc: Implement __cpu_die() and __cpu_disable() for CPU hotplugging
  parisc: Add PDC locking functions for rendezvous code
  parisc: Move disable_sr_hashing_asm() into .text section
  parisc: Move CPU startup-related functions into .text section
  parisc: Move store_cpu_topology() into text section
  parisc: Switch from GENERIC_CPU_DEVICES to GENERIC_ARCH_TOPOLOGY
  parisc: Ensure set_firmware_width() is called only once
  parisc: Add constants for control registers and clean up mfctl()
  parisc: Detect hppa-suse-linux-gcc compiler for cross-building
  parisc: Clean up cpu_check_affinity() and drop cpu_set_affinity_irq()
  parisc: Fix CPU affinity for Lasi, WAX and Dino chips
  Revert "parisc: Fix invalidate/flush vmap routines"

Merge tag 'modules-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module update from Luis Chamberlain:
"There is only one patch which qualifies for modules for v5.18-rc1 and
  its a small fix from Dan Carpenter for lib/test_kmod module.

  The rest of the changes are too major and landed in modules-testing
  too late for inclusion. The good news is that most of the major
  changes for v5.19 is going to be tested very early through linux-next.

  This simple fix is all we have for modules for v5.18-rc1"

* tag 'modules-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  lib/test: use after free in register_test_dev_kmod()

docs: Add a document on how to fix a messy diffstat

A branch with merges in will sometimes create a diffstat containing a lot
of unrelated work at "git request-pull" time. Create a document based on
Linus's advice (found in the links below) and add it to the maintainer
manual in the hope of saving some wear on Linus's keyboard going forward.

Link: https://lore.kernel.org/lkml/CAHk-=wg3wXH2JNxkQi+eLZkpuxqV+wPiHhw_Jf7ViH33Sw7PHA@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com/
Acked-by: Borislav Petkov <[email protected]>
Reviewed-by: Akira Yokosawa <[email protected]>
Signed-off-by: Jonathan Corbet <[email protected]>

docs: sphinx/requirements: Limit jinja2<3.1

jinja2 release 3.1.0 (March 24, 2022) broke Sphinx<4.0.
This looks like the result of deprecating Python 3.6.
It has been tested against Sphinx 4.3.0 and later.

Setting an upper limit of <3.1 to junja2 can unbreak Sphinx<4.0
including Sphinx 2.4.4.

Signed-off-by: Akira Yokosawa <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: [email protected] # v5.15+
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>

sfc: Avoid NULL pointer dereference on systems without numa awareness

On such systems cpumask_of_node() returns NULL, which bitmap
operations are not happy with.

Fixes: c265b569a45f ("sfc: default config to 1 channel/core in local NUMA node only")
Fixes: 09a99ab16c60 ("sfc: set affinity hints in local NUMA node only")
Signed-off-by: Martin Habets <[email protected]>
Reviewed-by: Íñigo Huguet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

ptp: ocp: handle error from nvmem_device_find

nvmem_device_find returns a valid pointer or IS_ERR().
Handle this properly.

Fixes: 0cfcdd1ebcfe ("ptp: ocp: add nvmem interface for accessing eeprom")
Signed-off-by: Jonathan Lemon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

net: dsa: felix: fix possible NULL pointer dereference

As the possible failure of the allocation, kzalloc() may return NULL
pointer.
Therefore, it should be better to check the 'sgi' in order to prevent
the dereference of NULL pointer.

Fixes: 23ae3a7877718 ("net: dsa: felix: add stream gate settings for psfp").
Signed-off-by: Zheng Yongjun <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

drbd: fix potential silent data corruption

Scenario:
---------

bio chain generated by blk_queue_split().
Some split bio fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.

We would clobber the already recorded error status with BLK_STS_OK,
causing silent data corruption.

Reproducer:
-----------

How to trigger this in the real world within seconds:

DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").

Cause significant read ahead.

Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
would need to wait for a "stripe_head" to become available,
are rejected immediately.

For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.

Signed-off-by: Lars Ellenberg <[email protected]>
Signed-off-by: Christoph Böhmwalder <[email protected]>
Cc: <[email protected]> # 4.13.x
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

MIPS: rb532: move GPIOD definition into C-files

My kernel robot reports build error from drivers/iio/adc/da9150-gpadc.c,

  drivers/iio/adc/da9150-gpadc.c:254:13: error: ‘DA9150_GPADC_CHAN_0x08’
  undeclared here (not in a function); did you mean ‘DA9150_GPADC_CHAN_TBAT’?
     254 |  .channel = DA9150_GPADC_CHAN_##_id,

We define GPIOD in rb.h, in fact it should only be used in gpio.c, but
it affects the driver da9150-gpadc.c which goes against the original
intention of the design, just move it to its scope.

Fixes: 1b432840d0a4 ("MIPS: RB532: GPIO register offsets are relative to GPIOBASE")
Suggested-by: Jonathan Cameron <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Reported-by: k2ci <[email protected]>
Signed-off-by: Jackie Liu <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>

MIPS: lantiq: check the return value of kzalloc()

kzalloc() is a memory allocation function which can return NULL when
some internal memory errors happen. So it is better to check the
return value of it to prevent potential wrong memory access or
memory leak.

Signed-off-by: Xiaoke Wang <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>

mips: sgi-ip22: add a check for the return of kzalloc()

kzalloc() is a memory allocation function which can return NULL when
some internal memory errors happen. So it is better to check it to
prevent potential wrong memory access.

Signed-off-by: Xiaoke Wang <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>

Merge tag 'pwm/for-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
"This contains conversions of some more drivers to the atomic API as
  well as the addition of new chip support for some existing drivers.

  There are also various minor fixes and cleanups across the board, from
  drivers to device tree bindings"

* tag 'pwm/for-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (45 commits)
  pwm: rcar: Simplify multiplication/shift logic
  dt-bindings: pwm: renesas,tpu: Do not require pwm-cells twice
  dt-bindings: pwm: tiehrpwm: Do not require pwm-cells twice
  dt-bindings: pwm: tiecap: Do not require pwm-cells twice
  dt-bindings: pwm: samsung: Do not require pwm-cells twice
  dt-bindings: pwm: intel,keembay: Do not require pwm-cells twice
  dt-bindings: pwm: brcm,bcm7038: Do not require pwm-cells twice
  dt-bindings: pwm: toshiba,visconti: Include generic PWM schema
  dt-bindings: pwm: renesas,pwm: Include generic PWM schema
  dt-bindings: pwm: sifive: Include generic PWM schema
  dt-bindings: pwm: rockchip: Include generic PWM schema
  dt-bindings: pwm: mxs: Include generic PWM schema
  dt-bindings: pwm: iqs620a: Include generic PWM schema
  dt-bindings: pwm: intel,lgm: Include generic PWM schema
  dt-bindings: pwm: imx: Include generic PWM schema
  dt-bindings: pwm: allwinner,sun4i-a10: Include generic PWM schema
  pwm: pwm-mediatek: Beautify error messages text
  pwm: pwm-mediatek: Allocate clk_pwms with devm_kmalloc_array
  pwm: pwm-mediatek: Simplify error handling with dev_err_probe()
  pwm: brcmstb: Remove useless locking
  ...

Merge tag 'regulator-fix-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of fixes for the rt4831 driver which fix features that didn't
  work due to incomplete description of the register configuration"

* tag 'regulator-fix-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: rt4831: Add active_discharge_on to fix discharge API
  regulator: rt4831: Add bypass mask to fix set_bypass API work

Merge tag 'dmaengine-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
"This time we have bunch of driver updates and some new device support.

  New support:
   - Document RZ/V2L and RZ/G2UL dma binding
   - TI AM62x k3-udma and k3-psil support

  Updates:
   - Yaml conversion for Mediatek uart apdma schema
   - Removal of DMA-32 fallback configuration for various drivers
   - imx-sdma updates for channel restart"

* tag 'dmaengine-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (23 commits)
  dmaengine: hisi_dma: fix MSI allocate fail when reload hisi_dma
  dmaengine: dw-axi-dmac: cleanup comments
  dmaengine: fsl-dpaa2-qdma: Drop comma after SoC match table sentinel
  dt-bindings: dma: Convert mtk-uart-apdma to DT schema
  dmaengine: ppc4xx: Make use of the helper macro LIST_HEAD()
  dmaengine: idxd: Remove useless DMA-32 fallback configuration
  dmaengine: qcom_hidma: Remove useless DMA-32 fallback configuration
  dmaengine: sh: Kconfig: Add ARCH_R9A07G054 dependency for RZ_DMAC config option
  dmaengine: ti: k3-psil: Add AM62x PSIL and PDMA data
  dmaengine: ti: k3-udma: Add AM62x DMSS support
  dmaengine: ti: cleanup comments
  dmaengine: imx-sdma: clean up some inconsistent indenting
  dmaengine: Revert "dmaengine: shdma: Fix runtime PM imbalance on error"
  dmaengine: idxd: restore traffic class defaults after wq reset
  dmaengine: altera-msgdma: Remove useless DMA-32 fallback configuration
  dmaengine: stm32-dma: set dma_device max_sg_burst
  dmaengine: imx-sdma: fix cyclic buffer race condition
  dmaengine: imx-sdma: restart cyclic channel if needed
  dmaengine: iot: Remove useless DMA-32 fallback configuration
  dmaengine: ptdma: handle the cases based on DMA is complete
  ...

Merge tag 'rproc-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull remoteproc updates from Bjorn Andersson:
"In the remoteproc core, it's now possible to mark the sysfs attributes
  read only on a per-instance basis, which is then used by the TI wkup
  M3 driver.

  Also, the rproc_shutdown() interface propagates errors to the caller
  and an array underflow is fixed in the debugfs interface. The
  rproc_da_to_va() API is moved to the public API to allow e.g. child
  rpmsg devices to acquire pointers to memory shared with the remote
  processor.

  The TI K3 R5F and DSP drivers gains support for attaching to instances
  already started by the bootloader, aka IPC-only mode.

  The Mediatek remoteproc driver gains support for the MT8186 SCP. The
  driver's probe function is reordered and moved to use the devres
  version of rproc_alloc() to save a few gotos. The driver's probe
  function is also transitioned to use dev_err_probe() to provide better
  debug support.

  Support for the Qualcomm SC7280 Wireless Subsystem (WPSS) is
  introduced. The Hexagon based remoteproc drivers gains support for
  voting for interconnect bandwidth during launch of the remote
  processor. The modem subsystem (MSS) driver gains support for probing
  the BAM-DMUX driver, which provides the network interface towards the
  modem on a set of older Qualcomm platforms. In addition a number a bug
  fixes are introduces in the Qualcomm drivers.

  Lastly Qualcomm ADSP DeviceTree binding is converted to YAML format,
  to allow validation of DeviceTree source files"

* tag 'rproc-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (22 commits)
  remoteproc: qcom_q6v5_mss: Create platform device for BAM-DMUX
  remoteproc: qcom: q6v5_wpss: Add support for sc7280 WPSS
  dt-bindings: remoteproc: qcom: Add SC7280 WPSS support
  dt-bindings: remoteproc: qcom: adsp: Convert binding to YAML
  remoteproc: k3-dsp: Add support for IPC-only mode for all K3 DSPs
  remoteproc: k3-dsp: Refactor mbox request code in start
  remoteproc: k3-r5: Add support for IPC-only mode for all R5Fs
  remoteproc: k3-r5: Refactor mbox request code in start
  remoteproc: Change rproc_shutdown() to return a status
  remoteproc: qcom: q6v5: Add interconnect path proxy vote
  remoteproc: mediatek: Support mt8186 scp
  dt-bindings: remoteproc: mediatek: Add binding for mt8186 scp
  remoteproc: qcom_q6v5_mss: Fix some leaks in q6v5_alloc_memory_region
  remoteproc: qcom_wcnss: Add missing of_node_put() in wcnss_alloc_memory_region
  remoteproc: qcom: Fix missing of_node_put in adsp_alloc_memory_region
  remoteproc: move rproc_da_to_va declaration to remoteproc.h
  remoteproc: wkup_m3: Set sysfs_read_only flag
  remoteproc: Introduce sysfs_read_only flag
  remoteproc: Fix count check in rproc_coredump_write()
  remoteproc: mtk_scp: Use dev_err_probe() where possible
  ...

Merge tag 'hwlock-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull hwspinlock updates from Bjorn Andersson:
"This updates sprd and srm32 drivers to use struct_size() instead of
  their open-coded equivalents. It also cleans up the omap dt-bindings
  example"

* tag 'hwlock-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  hwspinlock: sprd: Use struct_size() helper in devm_kzalloc()
  hwspinlock: stm32: Use struct_size() helper in devm_kzalloc()
  dt-bindings: hwlock: omap: Remove redundant binding example

Merge tag 'rpmsg-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull rpmsg updates from Bjorn Andersson:
"The major part of the rpmsg changes for v5.18 relates to improvements
  in the rpmsg char driver, which now allow automatically attaching to
  rpmsg channels as well as initiating new communication channels from
  the Linux side.

  The SMD driver is moved to arch_initcall with the purpose of
  registering root clocks earlier during boot.

  Also in the SMD driver, a workaround for the resource power management
  (RPM) channel is introduced to resolve an issue where both the RPM and
  Linux side waits for the other to close the communication established
  by the bootloader - this unblocks support for clocks and regulators on
  some older Qualcomm platforms"

* tag 'rpmsg-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: ctrl: Introduce new RPMSG_CREATE/RELEASE_DEV_IOCTL controls
  rpmsg: char: Introduce the "rpmsg-raw" channel
  rpmsg: char: Add possibility to use default endpoint of the rpmsg device
  rpmsg: char: Refactor rpmsg_chrdev_eptdev_create function
  rpmsg: Update rpmsg_chrdev_register_device function
  rpmsg: Move the rpmsg control device from rpmsg_char to rpmsg_ctrl
  rpmsg: Create the rpmsg class in core instead of in rpmsg char
  rpmsg: char: Export eptdev create and destroy functions
  rpmsg: char: treat rpmsg_trysend() ENOMEM as EAGAIN
  rpmsg: qcom_smd: Fix redundant channel->registered assignment
  rpmsg: use struct_size over open coded arithmetic
  rpmsg: smd: allow opening rpm_requests even if already opened
  rpmsg: qcom_smd: Promote to arch_initcall

Merge tag 'i3c/for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull i3c updates from Alexandre Belloni:

- support dynamic addition of i2c devices

* tag 'i3c/for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: fix uninitialized variable use in i2c setup
  i3c: support dynamically added i2c devices
  i3c: remove i2c board info from i2c_dev_desc

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
"There's one large change in the core clk framework here. We change how
  clk_set_rate_range() works so that the frequency is re-evaulated each
  time the rate is changed. Previously we wouldn't let clk providers see
  a rate that was different if it was still within the range, which
  could be bad for power if the clk could run slower when a range
  expands. Now the clk provider can decide to do something differently
  when the constraints change. This broke Nvidia's clk driver so we had
  to wait for the fix for that to bake a little more in -next.

  The rate range patch series also introduced a kunit suite for the clk
  framework that we're going to extend in the next release. It already
  made it easy to find corner cases in the rate range patches so I'm
  excited to see it cover more clk code and increase our confidence in
  core framework patches in the future. I also added a kunit test for
  the basic clk gate code and that work will continue to cover more
  basic clk types: muxes, dividers, etc.

  Beyond the core code we have the usual set of clk driver updates and
  additions. Qualcomm again dominates the diffstat here with lots more
  SoCs being supported and i.MX follows afer that with a similar number
  of SoCs gaining clk drivers. Beyond those large additions there's
  drivers being modernized to use clk_parent_data so we can move away
  from global string names for all the clks in an SoC. Finally there's
  lots of little fixes all over the clk drivers for typos, warnings, and
  missing clks that aren't critical and get batched up waiting for the
  next merge window to open. Nothing super big stands out in the driver
  pile. Full details are below.

  Core:
   - Make clk_set_rate_range() re-evaluate the limits each time
   - Introduce various clk_set_rate_range() tests
   - Add clk_drop_range() to drop a previously set range

  New Drivers:
   - i.MXRT1050 clock driver and bindings
   - i.MX8DXL clock driver and bindings
   - i.MX93 clock driver and bindings
   - NCO blocks on Apple SoCs
   - Audio clks on StarFive JH7100 RISC-V SoC
   - Add support for the new Renesas RZ/V2L SoC
   - Qualcomm SDX65 A7 PLL
   - Qualcomm SM6350 GPU clks
   - Qualcomm SM6125, SM6350, QCS2290 display clks
   - Qualcomm MSM8226 multimedia clks

  Updates:
   - Kunit tests for clk-gate implementation
   - Terminate arrays with sentinels and make that clearer
   - Cleanup SPDX tags
   - Fix typos in comments
   - Mark mux table as const in clk-mux
   - Make the all_lists array const
   - Convert Cirrus Logic CS2000P driver to regmap, yamlify DT binding
     and add support for dynamic mode
   - Clock configuration on Microchip PolarFire SoCs
   - Free allocations on probe error in Mediatek clk driver
   - Modernize Mediatek clk driver by consolidating code
   - Add watchdog (WDT), I2C, and pin function controller (PFC) clocks
     on Renesas R-Car S4-8
   - Improve the clocks for the Rockchip rk3568 display outputs
     (parenting, pll-rates)
   - Use of_device_get_match_data() instead of open-coding on Rockchip
     rk3568
   - Reintroduce the expected fractional-divider behaviour that
     disappeared with the addition of CLK_FRAC_DIVIDER_POWER_OF_TWO_PS
   - Remove SYS PLL 1/2 clock gates for i.MX8M*
   - Remove AUDIO MCLK ROOT from i.MX7D
   - Add fracn gppll clock type used by i.MX93
   - Add new composite clock for i.MX93
   - Add missing media mipi phy ref clock for i.MX8MP
   - Fix off by one in imx_lpcg_parse_clks_from_dt()
   - Rework for the imx pll14xx
   - sama7g5: One low priority fix for GCLK of PDMC
   - Add DMA engine (SYS-DMAC) clocks on Renesas R-Car S4-8
   - Add MOST (MediaLB I/F) clocks on Renesas R-Car E3 and D3
   - Add CAN-FD clocks on Renesas R-Car V3U
   - Qualcomm SC8280XP RPMCC
   - Add some missing clks on Qualcomm MSM8992/MSM8994/MSM8998 SoCs
   - Rework Qualcomm GCC bindings and convert SDM845 camera bindig to
     YAML
   - Convert various Qualcomm drivers to use clk_parent_data
   - Remove test clocks from various Qualcomm drivers
   - Crypto engine clks on Qualcomm IPQ806x + more freqs for SDCC/NSS
   - Qualcomm SM8150 EMAC, PCIe, UFS GDSCs
   - Better pixel clk frequency support on Qualcomm RCG2 clks"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (227 commits)
  clk: zynq: Update the parameters to zynq_clk_register_periph_clk
  clk: zynq: trivial warning fix
  clk: Drop the rate range on clk_put()
  clk: test: Test clk_set_rate_range on orphan mux
  clk: Initialize orphan req_rate
  dt-bindings: clock: drop useless consumer example
  dt-bindings: clock: renesas: Make example 'clocks' parsable
  clk: qcom: gcc-msm8994: Fix gpll4 width
  dt-bindings: clock: fix dt_binding_check error for qcom,gcc-other.yaml
  clk: rs9: Add Renesas 9-series PCIe clock generator driver
  clk: fixed-factor: Introduce devm_clk_hw_register_fixed_factor_index()
  clk: visconti: prevent array overflow in visconti_clk_register_gates()
  dt-bindings: clk: rs9: Add Renesas 9-series I2C PCIe clock generator
  clk: sifive: Move all stuff into SoCs header files from C files
  clk: sifive: Add SoCs prefix in each SoCs-dependent data
  riscv: dts: Change the macro name of prci in each device node
  dt-bindings: change the macro name of prci in header files and example
  clk: sifive: duplicate the macro definitions for the time being
  clk: qcom: sm6125-gcc: fix typos in comments
  clk: ti: clkctrl: fix typos in comments
  ...

Merge tag 'libnvdimm-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
"The update for this cycle includes the deprecation of block-aperture
  mode and a new perf events interface for the papr_scm nvdimm driver.

  The perf events approach was acked by PeterZ.

   - Add perf support for nvdimm events, initially only for 'papr_scm'
     devices.

   - Deprecate the 'block aperture' support in libnvdimm, it only ever
     existed in the specification, not in shipping product"

* tag 'libnvdimm-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm/blk: Fix title level
  MAINTAINERS: remove section LIBNVDIMM BLK: MMIO-APERTURE DRIVER
  powerpc/papr_scm: Fix build failure when
  drivers/nvdimm: Fix build failure when CONFIG_PERF_EVENTS is not set
  nvdimm/region: Delete nd_blk_region infrastructure
  ACPI: NFIT: Remove block aperture support
  nvdimm/namespace: Delete nd_namespace_blk
  nvdimm/namespace: Delete blk namespace consideration in shared paths
  nvdimm/blk: Delete the block-aperture window driver
  nvdimm/region: Fix default alignment for small regions
  docs: ABI: sysfs-bus-nvdimm: Document sysfs event format entries for nvdimm pmu
  powerpc/papr_scm: Add perf interface support
  drivers/nvdimm: Add perf interface to expose nvdimm performance stats
  drivers/nvdimm: Add nvdimm pmu structure

fs: fix an infinite loop in iomap_fiemap

when get fiemap starting from MAX_LFS_FILESIZE, (maxbytes - *len) < start
will always true , then *len set zero. because of start offset is beyond
file size, for erofs filesystem it will always return iomap.length with
zero,iomap iterate will enter infinite loop. it is necessary cover this
corner case to avoid this situation.

------------[ cut here ]------------
WARNING: CPU: 7 PID: 905 at fs/iomap/iter.c:35 iomap_iter+0x97f/0xc70
Modules linked in: xfs erofs
CPU: 7 PID: 905 Comm: iomap Tainted: G        W         5.17.0-rc8 #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:iomap_iter+0x97f/0xc70
Code: 85 a1 fc ff ff e8 71 be 9c ff 0f 1f 44 00 00 e9 92 fc ff ff e8 62 be 9c ff 0f 0b b8 fb ff ff ff e9 fc f8 ff ff e8 51 be 9c ff <0f> 0b e9 2b fc ff ff e8 45 be 9c ff 0f 0b e9 e1 fb ff ff e8 39 be
RSP: 0018:ffff888060a37ab0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888060a37bb0 RCX: 0000000000000000
RDX: ffff88807e19a900 RSI: ffffffff81a7da7f RDI: ffff888060a37be0
RBP: 7fffffffffffffff R08: 0000000000000000 R09: ffff888060a37c20
R10: ffff888060a37c67 R11: ffffed100c146f8c R12: 7fffffffffffffff
R13: 0000000000000000 R14: ffff888060a37bd8 R15: ffff888060a37c20
FS:  00007fd3cca01540(0000) GS:ffff888108780000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010820 CR3: 0000000054b92000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
iomap_fiemap+0x1c9/0x2f0
erofs_fiemap+0x64/0x90 [erofs]
do_vfs_ioctl+0x40d/0x12e0
__x64_sys_ioctl+0xaa/0x1c0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
---[ end trace 0000000000000000 ]---
watchdog: BUG: soft lockup - CPU#7 stuck for 26s! [iomap:905]

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Guo Xuenan <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
[djwong: fix some typos]
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

loop: fix ioctl calls using compat_loop_info

Support for cryptoloop was deleted in commit 47e9624616c8 ("block:
remove support for cryptoloop and the xor transfer"), making the usage
of loop_info->lo_encrypt_type obsolete. However, this member was also
removed from the compat_loop_info definition and this breaks userspace
ioctl calls for 32-bit binaries and CONFIG_COMPAT=y.

This patch restores the compat_loop_info->lo_encrypt_type member and
marks it obsolete as well as in the uapi header definitions.

Fixes: 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer")
Signed-off-by: Carlos Llamas <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

PCI/doc: cleanup references to the legacy PCI DMA API

Mention the regular DMA API calls instead of the now removed PCI DMA API.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>

ksmbd: replace usage of found with dedicated list iterator variable

To move the list iterator variable into the list_for_each_entry_*()
macro in the future it should be avoided to use the list iterator
variable after the loop body.

To *never* use the list iterator variable after the loop it was
concluded to use a separate iterator variable instead of a
found boolean [1].

This removes the need to use a found variable and simply checking if
the variable was set, can determine if the break/goto was hit.

Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/
Signed-off-by: Jakob Koschel <[email protected]>
Reviewed-by: Hyunchul Lee <[email protected]>
Acked-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: Remove a redundant zeroing of memory

fill_transform_hdr() has only one caller that already clears tr_buf (it is
kzalloc'ed).

So there is no need to clear it another time here.

Remove the superfluous memset() and add a comment to remind that the caller
must clear the buffer.

Signed-off-by: Christophe JAILLET <[email protected]>
Acked-by: Hyunchul Lee <[email protected]>
Acked-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

MAINTAINERS: ksmbd: switch Sergey to reviewer

Sergey don't have the time to work ksmbd. He will continue to review
ksmbd works at free time. This patch switches him from maintainer
to reviewer.

Cc: Sergey Senozhatsky <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ksmbd: shorten experimental warning on loading the module

ksmbd is continuing to improve. Shorten the warning message
logged the first time it is loaded to:
"The ksmbd server is experimental"

Acked-by: Namjae Jeon <[email protected]>
Signed-off-by: Steve French <[email protected]>

ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock

syzbot caught a potential deadlock between the PCM
runtime->buffer_mutex and the mm->mmap_lock.  It was brought by the
recent fix to cover the racy read/write and other ioctls, and in that
commit, I overlooked a (hopefully only) corner case that may take the
revert lock, namely, the OSS mmap.  The OSS mmap operation
exceptionally allows to re-configure the parameters inside the OSS
mmap syscall, where mm->mmap_mutex is already held.  Meanwhile, the
copy_from/to_user calls at read/write operations also take the
mm->mmap_lock internally, hence it may lead to a AB/BA deadlock.

A similar problem was already seen in the past and we fixed it with a
refcount (in commit b248371628aa).  The former fix covered only the
call paths with OSS read/write and OSS ioctls, while we need to cover
the concurrent access via both ALSA and OSS APIs now.

This patch addresses the problem above by replacing the buffer_mutex
lock in the read/write operations with a refcount similar as we've
used for OSS.  The new field, runtime->buffer_accessing, keeps the
number of concurrent read/write operations.  Unlike the former
buffer_mutex protection, this protects only around the
copy_from/to_user() calls; the other codes are basically protected by
the PCM stream lock.  The refcount can be a negative, meaning blocked
by the ioctls.  If a negative value is seen, the read/write aborts
with -EBUSY.  In the ioctl side, OTOH, they check this refcount, too,
and set to a negative value for blocking unless it's already being
accessed.

Reported-by: [email protected]
Fixes: dca947d4d26d ("ALSA: pcm: Fix races among concurrent read/write and buffer changes")
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'asoc-fix-v5.18' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.18

A few fixes that came in during the merge window, all fairly routine.

x86/fpu/xstate: Consolidate size calculations

Use the offset calculation to do the size calculation which avoids yet
another series of CPUID instructions for each invocation.

[ Fix the FP/SSE only case which missed to take the xstate
header into account, as
Reported-by: kernel test robot <[email protected]> ]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/87o81pgbp2.ffs@tglx

x86/fpu/xstate: Handle supervisor states in XSTATE permissions

The size calculation in __xstate_request_perm() fails to take supervisor
states into account because the permission bitmap is only relevant for user
states.

Up to 5.17 this does not matter because there are no supervisor states
supported, but the (re-)enabling of ENQCMD makes them available.

Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD")
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/fpu/xsave: Handle compacted offsets correctly with supervisor states

So far the cached fixed compacted offsets worked, but with (re-)enabling
of ENQCMD this does no longer work with KVM fpstate.

KVM does not have supervisor features enabled for the guest FPU, which
means that KVM has then a different XSAVE area layout than the host FPU
state. This in turn breaks the copy from/to UABI functions when invoked for
a guest state.

Remove the pre-calculated compacted offsets and calculate the offset
of each component at runtime based on the XCOMP_BV field in the XSAVE
header.

The runtime overhead is not interesting because these copy from/to UABI
functions are not used in critical fast paths. KVM uses them to save and
restore FPU state during migration. The host uses them for ptrace and for
the slow path of 32bit signal handling.

Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD")
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/fpu: Cache xfeature flags from CPUID

In preparation for runtime calculation of XSAVE offsets cache the feature
flags for each XSTATE component during feature enumeration via CPUID(0xD).

EDX has two relevant bits:
0 Supervisor component
1 Feature storage must be 64 byte aligned

These bits are currently only evaluated during init, but the alignment bit
must be cached to make runtime calculation of XSAVE offsets efficient.

Cache the full EDX content and use it for the existing alignment and
supervisor checks.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/fpu/xsave: Initialize offset/size cache early

Reading XSTATE feature information from CPUID over and over does not make
sense. The information has to be cached anyway, so it can be done early.

Prepare for runtime calculation of XSTATE offsets and allow
consolidation of the size calculation functions in a later step.

Rename the function while at it as it does not setup any features.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/fpu: Remove unused supervisor only offsets

No users.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

ALSA: hda: Avoid unsol event during RPM suspending

There is a corner case with unsol event handling during codec runtime
suspending state. When the codec runtime suspend call initiated, the
codec->in_pm atomic variable would be 0, currently the codec runtime
suspend function calls snd_hdac_enter_pm() which will just increments
the codec->in_pm atomic variable. Consider unsol event happened just
after this step and before snd_hdac_leave_pm() in the codec runtime
suspend function. The snd_hdac_power_up_pm() in the unsol event
flow in hdmi_present_sense_via_verbs() function would just increment
the codec->in_pm atomic variable without calling pm_runtime_get_sync
function.

As codec runtime suspend flow is already in progress and in parallel
unsol event is also accessing the codec verbs, as soon as codec
suspend flow completes and clocks are  switched off before completing
the unsol event handling as both functions doesn't wait for each other.
This will result in below errors

[  589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
to polling mode: last cmd=0x505f2f57
[  589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
last cmd=0x505f2f57
[  589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
last cmd=0x505f2f57

To avoid this, the unsol event flow should not perform any codec verb
related operations during RPM_SUSPENDING state.

Signed-off-by: Mohan Kumar <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hda/realtek: Fix audio regression on Mi Notebook Pro 2020

Commit 5aec98913095 ("ALSA: hda/realtek - ALC236 headset MIC recording
issue") is to solve recording issue met on AL236, by matching codec
variant ALC269_TYPE_ALC257 and ALC269_TYPE_ALC256.

This match can be too broad and Mi Notebook Pro 2020 is broken by the
patch.

Instead, use codec ID to be narrow down the scope, in order to make
ALC256 unaffected.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215484
Fixes: 5aec98913095 ("ALSA: hda/realtek - ALC236 headset MIC recording issue")
Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Cc: <[email protected]>
Signed-off-by: Kai-Heng Feng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

fs: fix fd table size alignment properly

Jason Donenfeld reports that my commit 1c24a186398f ("fs: fd tables have
to be multiples of BITS_PER_LONG") doesn't work, and the reason is an
embarrassing brown-paper-bag bug.

Yes, we want to align the number of fds to BITS_PER_LONG, and yes, the
reason they might not be aligned is because the incoming 'max_fd'
argument might not be aligned.

But aligining the argument - while simple - will cause a "infinitely
big" maxfd (eg NR_OPEN_MAX) to just overflow to zero. Which most
definitely isn't what we want either.

The obvious fix was always just to do the alignment last, but I had
moved it earlier just to make the patch smaller and the code look
simpler. Duh. It certainly made _me_ look simple.

Fixes: 1c24a186398f ("fs: fd tables have to be multiples of BITS_PER_LONG")
Reported-and-tested-by: Jason A. Donenfeld <[email protected]>
Cc: Fedor Pchelkin <[email protected]>
Cc: Alexey Khoroshilov <[email protected]>
Cc: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

PCI: Remove the deprecated "pci-dma-compat.h" API

Now that all usages of the functions defined in "pci-dma-compat.h" have
been removed, it is time to remove this file as well.

In order not to break builds, move the "#include <linux/dma-mapping.h>"
that was in "pci-dma-compat.h" into "include/linux/pci.h"

Signed-off-by: Christophe JAILLET <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

crypto: x86/sm3 - Fixup SLS

This missed the big asm update due to being merged through the crypto
tree.

Fixes: f94909ceb1ed ("x86: Prepare asm files for straight-line-speculation")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Alexei Starovoitov says:

====================
pull-request: bpf 2022-03-29

We've added 16 non-merge commits during the last 1 day(s) which contain
a total of 24 files changed, 354 insertions(+), 187 deletions(-).

The main changes are:

1) x86 specific bits of fprobe/rethook, from Masami and Peter.

2) ice/xsk fixes, from Maciej and Magnus.

3) Various small fixes, from Andrii, Yonghong, Geliang and others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix clang compilation errors
  ice: xsk: Fix indexing in ice_tx_xsk_pool()
  ice: xsk: Stop Rx processing when ntc catches ntu
  ice: xsk: Eliminate unnecessary loop iteration
  xsk: Do not write NULL in SW ring at allocation failure
  x86,kprobes: Fix optprobe trampoline to generate complete pt_regs
  x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs
  x86,rethook,kprobes: Replace kretprobe with rethook on x86
  kprobes: Use rethook for kretprobe if possible
  bpftool: Fix generated code in codegen_asserts
  selftests/bpf: fix selftest after random: Urandom_read tracepoint removal
  bpf: Fix maximum permitted number of arguments check
  bpf: Sync comments for bpf_get_stack
  fprobe: Fix sparse warning for acccessing __rcu ftrace_hash
  fprobe: Fix smatch type mismatch warning
  bpf/bpftool: Add unprivileged_bpf_disabled check against value of 2
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

Merge tag 'nfs-for-5.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Highlights include:

  Features:

   - Switch NFS to use readahead instead of the obsolete readpages.

   - Readdir fixes to improve cacheability of large directories when
     there are multiple readers and writers.

   - Readdir performance improvements when doing a seekdir() immediately
     after opening the directory (common when re-exporting NFS).

   - NFS swap improvements from Neil Brown.

   - Loosen up memory allocation to permit direct reclaim and write back
     in cases where there is no danger of deadlocking the writeback code
     or NFS swap.

   - Avoid sillyrename when the NFSv4 server claims to support the
     necessary features to recover the unlinked but open file after
     reboot.

  Bugfixes:

   - Patch from Olga to add a mount option to control NFSv4.1 session
     trunking discovery, and default it to being off.

   - Fix a lockup in nfs_do_recoalesce().

   - Two fixes for list iterator variables being used when pointing to
     the list head.

   - Fix a kernel memory scribble when reading from a non-socket
     transport in /sys/kernel/sunrpc.

   - Fix a race where reconnecting to a server could leave the TCP
     socket stuck forever in the connecting state.

   - Patch from Neil to fix a shutdown race which can leave the SUNRPC
     transport timer primed after we free the struct xprt itself.

   - Patch from Xin Xiong to fix reference count leaks in the NFSv4.2
     copy offload.

   - Sunrpc patch from Olga to avoid resending a task on an offlined
     transport.

  Cleanups:

   - Patches from Dave Wysochanski to clean up the fscache code"

* tag 'nfs-for-5.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (91 commits)
  NFSv4/pNFS: Fix another issue with a list iterator pointing to the head
  NFS: Don't loop forever in nfs_do_recoalesce()
  SUNRPC: Don't return error values in sysfs read of closed files
  SUNRPC: Do not dereference non-socket transports in sysfs
  NFSv4.1: don't retry BIND_CONN_TO_SESSION on session error
  SUNRPC don't resend a task on an offlined transport
  NFS: replace usage of found with dedicated list iterator variable
  SUNRPC: avoid race between mod_timer() and del_timer_sync()
  pNFS/files: Ensure pNFS allocation modes are consistent with nfsiod
  pNFS/flexfiles: Ensure pNFS allocation modes are consistent with nfsiod
  NFSv4/pnfs: Ensure pNFS allocation modes are consistent with nfsiod
  NFS: Avoid writeback threads getting stuck in mempool_alloc()
  NFS: nfsiod should not block forever in mempool_alloc()
  SUNRPC: Make the rpciod and xprtiod slab allocation modes consistent
  SUNRPC: Fix unx_lookup_cred() allocation
  NFS: Fix memory allocation in rpc_alloc_task()
  NFS: Fix memory allocation in rpc_malloc()
  SUNRPC: Improve accuracy of socket ENOBUFS determination
  SUNRPC: Replace internal use of SOCKWQ_ASYNC_NOSPACE
  SUNRPC: Fix socket waits for write buffer space
  ...

xfs: drop async cache flushes from CIL commits.

Jan Kara reported a performance regression in dbench that he
bisected down to commit bad77c375e8d ("xfs: CIL checkpoint
flushes caches unconditionally").

Whilst developing the journal flush/fua optimisations this cache was
part of, it appeared to made a significant difference to
performance. However, now that this patchset has settled and all the
correctness issues fixed, there does not appear to be any
significant performance benefit to asynchronous cache flushes.

In fact, the opposite is true on some storage types and workloads,
where additional cache flushes that can occur from fsync heavy
workloads have measurable and significant impact on overall
throughput.

Local dbench testing shows little difference on dbench runs with
sync vs async cache flushes on either fast or slow SSD storage, and
no difference in streaming concurrent async transaction workloads
like fs-mark.

Fast NVME storage.

From `dbench -t 30`, CIL scale:

clients async sync
BW Latency BW Latency
1 935.18   0.855 915.64   0.903
8 2404.51   6.873 2341.77   6.511
16 3003.42   6.460 2931.57   6.529
32 3697.23   7.939 3596.28   7.894
128 7237.43  15.495 7217.74  11.588
512 5079.24  90.587 5167.08  95.822

fsmark, 32 threads, create w/ 64 byte xattr w/32k logbsize

create chown unlink
async   1m41s 1m16s 2m03s
sync 1m40s 1m19s 1m54s

Slower SATA SSD storage:

From `dbench -t 30`, CIL scale:

clients async sync
BW Latency BW Latency
1   78.59  15.792   83.78  10.729
8 367.88  92.067 404.63  59.943
16 564.51  72.524 602.71  76.089
32 831.66 105.984 870.26 110.482
128 1659.76 102.969 1624.73  91.356
512 2135.91 223.054 2603.07 161.160

fsmark, 16 threads, create w/32k logbsize

create unlink
async   5m06s 4m15s
sync 5m00s 4m22s

And on Jan's test machine:

                   5.18-rc8-vanilla       5.18-rc8-patched
Amean     1        71.22 (   0.00%)       64.94 *   8.81%*
Amean     2        93.03 (   0.00%)       84.80 *   8.85%*
Amean     4       150.54 (   0.00%)      137.51 *   8.66%*
Amean     8       252.53 (   0.00%)      242.24 *   4.08%*
Amean     16      454.13 (   0.00%)      439.08 *   3.31%*
Amean     32      835.24 (   0.00%)      829.74 *   0.66%*
Amean     64     1740.59 (   0.00%)     1686.73 *   3.09%*

Performance and cache flush behaviour is restored to pre-regression
levels.

As such, we can now consider the async cache flush mechanism an
unnecessary exercise in premature optimisation and hence we can
now remove it and the infrastructure it requires completely.

Fixes: bad77c375e8d ("xfs: CIL checkpoint flushes caches unconditionally")
Reported-and-tested-by: Jan Kara <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: shutdown during log recovery needs to mark the log shutdown

When a checkpoint writeback is run by log recovery, corruption
propagated from the log can result in writeback verifiers failing
and calling xfs_force_shutdown() from
xfs_buf_delwri_submit_buffers().

This results in the mount being marked as shutdown, but the log does
not get marked as shut down because:

        /*
         * If this happens during log recovery then we aren't using the runtime
         * log mechanisms yet so there's nothing to shut down.
         */
        if (!log || xlog_in_recovery(log))
                return false;

If there are other buffers that then fail (say due to detecting the
mount shutdown), they will now hang in xfs_do_force_shutdown()
waiting for the log to shut down like this:

  __schedule+0x30d/0x9e0
  schedule+0x55/0xd0
  xfs_do_force_shutdown+0x1cd/0x200
  ? init_wait_var_entry+0x50/0x50
  xfs_buf_ioend+0x47e/0x530
  __xfs_buf_submit+0xb0/0x240
  xfs_buf_delwri_submit_buffers+0xfe/0x270
  xfs_buf_delwri_submit+0x3a/0xc0
  xlog_do_recovery_pass+0x474/0x7b0
  ? do_raw_spin_unlock+0x30/0xb0
  xlog_do_log_recovery+0x91/0x140
  xlog_do_recover+0x38/0x1e0
  xlog_recover+0xdd/0x170
  xfs_log_mount+0x17e/0x2e0
  xfs_mountfs+0x457/0x930
  xfs_fs_fill_super+0x476/0x830

xlog_force_shutdown() always needs to mark the log as shut down,
regardless of whether recovery is in progress or not, so that
multiple calls to xfs_force_shutdown() during recovery don't end
up waiting for the log to be shut down like this.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: xfs_trans_commit() path must check for log shutdown

If a shut races with xfs_trans_commit() and we have shut down the
filesystem but not the log, we will still cancel the transaction.
This can result in aborting dirty log items instead of committing and
pinning them whilst the log is still running. Hence we can end up
with dirty, unlogged metadata that isn't in the AIL in memory that
can be flushed to disk via writeback clustering.

This was discovered from a g/388 trace where an inode log item was
having IO completed on it and it wasn't in the AIL, hence tripping
asserts xfs_ail_check(). Inode cluster writeback started long after
the filesystem shutdown started, and long after the transaction
containing the dirty inode was aborted and the log item marked
XFS_LI_ABORTED. The inode was seen as dirty and unpinned, so it
was flushed. IO completion tried to remove the inode from the AIL,
at which point stuff went bad:

XFS (pmem1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xa3/0xf0 (fs/xfs/xfs_fsops.c:500).  Shutting down filesystem.
XFS: Assertion failed: in_ail, file: fs/xfs/xfs_trans_ail.c, line: 67
XFS (pmem1): Please unmount the filesystem and rectify the problem(s)
Workqueue: xfs-buf/pmem1 xfs_buf_ioend_work
RIP: 0010:assfail+0x27/0x2d
Call Trace:
  <TASK>
  xfs_ail_check+0xa8/0x180
  xfs_ail_delete_one+0x3b/0xf0
  xfs_buf_inode_iodone+0x329/0x3f0
  xfs_buf_ioend+0x1f8/0x530
  xfs_buf_ioend_work+0x15/0x20
  process_one_work+0x1ac/0x390
  worker_thread+0x56/0x3c0
  kthread+0xf6/0x120
  ret_from_fork+0x1f/0x30
  </TASK>

xfs_trans_commit() needs to check log state for shutdown, not mount
state. It cannot abort dirty log items while the log is still
running as dirty items must remained pinned in memory until they are
either committed to the journal or the log has shut down and they
can be safely tossed away. Hence if the log has not shut down, the
xfs_trans_commit() path must allow completed transactions to commit
to the CIL and pin the dirty items even if a mount shutdown has
started.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: xfs_do_force_shutdown needs to block racing shutdowns

When we call xfs_forced_shutdown(), the caller often expects the
filesystem to be completely shut down when it returns. However,
if we have racing xfs_forced_shutdown() calls, the first caller sets
the mount shutdown flag then goes to shutdown the log. The second
caller sees the mount shutdown flag and returns immediately - it
does not wait for the log to be shut down.

Unfortunately, xfs_forced_shutdown() is used in some places that
expect it to completely shut down the filesystem before it returns
(e.g. xfs_trans_log_inode()). As such, returning before the log has
been shut down leaves us in a place where the transaction failed to
complete correctly but we still call xfs_trans_commit(). This
situation arises because xfs_trans_log_inode() does not return an
error and instead calls xfs_force_shutdown() to ensure that the
transaction being committed is aborted.

Unfortunately, we have a race condition where xfs_trans_commit()
needs to check xlog_is_shutdown() because it can't abort log items
before the log is shut down, but it needs to use xfs_is_shutdown()
because xfs_forced_shutdown() does not block waiting for the log to
shut down.

To fix this conundrum, first we make all calls to
xfs_forced_shutdown() block until the log is also shut down. This
means we can then safely use xfs_forced_shutdown() as a mechanism
that ensures the currently running transaction will be aborted by
xfs_trans_commit() regardless of the shutdown check it uses.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: log shutdown triggers should only shut down the log

We've got a mess on our hands.

1. xfs_trans_commit() cannot cancel transactions because the mount is
shut down - that causes dirty, aborted, unlogged log items to sit
unpinned in memory and potentially get written to disk before the
log is shut down. Hence xfs_trans_commit() can only abort
transactions when xlog_is_shutdown() is true.

2. xfs_force_shutdown() is used in places to cause the current
modification to be aborted via xfs_trans_commit() because it may be
impractical or impossible to cancel the transaction directly, and
hence xfs_trans_commit() must cancel transactions when
xfs_is_shutdown() is true in this situation. But we can't do that
because of #1.

3. Log IO errors cause log shutdowns by calling xfs_force_shutdown()
to shut down the mount and then the log from log IO completion.

4. xfs_force_shutdown() can result in a log force being issued,
which has to wait for log IO completion before it will mark the log
as shut down. If #3 races with some other shutdown trigger that runs
a log force, we rely on xfs_force_shutdown() silently ignoring #3
and avoiding shutting down the log until the failed log force
completes.

5. To ensure #2 always works, we have to ensure that
xfs_force_shutdown() does not return until the the log is shut down.
But in the case of #4, this will result in a deadlock because the
log Io completion will block waiting for a log force to complete
which is blocked waiting for log IO to complete....

So the very first thing we have to do here to untangle this mess is
dissociate log shutdown triggers from mount shutdowns. We already
have xlog_forced_shutdown, which will atomically transistion to the
log a shutdown state. Due to internal asserts it cannot be called
multiple times, but was done simply because the only place that
could call it was xfs_do_force_shutdown() (i.e. the mount shutdown!)
and that could only call it once and once only. So the first thing
we do is remove the asserts.

We then convert all the internal log shutdown triggers to call
xlog_force_shutdown() directly instead of xfs_force_shutdown(). This
allows the log shutdown triggers to shut down the log without
needing to care about mount based shutdown constraints. This means
we shut down the log independently of the mount and the mount may
not notice this until it's next attempt to read or modify metadata.
At that point (e.g. xfs_trans_commit()) it will see that the log is
shutdown, error out and shutdown the mount.

To ensure that all the unmount behaviours and asserts track
correctly as a result of a log shutdown, propagate the shutdown up
to the mount if it is not already set. This keeps the mount and log
state in sync, and saves a huge amount of hassle where code fails
because of a log shutdown but only checks for mount shutdowns and
hence ends up doing the wrong thing. Cleaning up that mess is
an exercise for another day.

This enables us to address the other problems noted above in
followup patches.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks

Brian reported a null pointer dereference failure during unmount in
xfs/006. He tracked the problem down to the AIL being torn down
before a log shutdown had completed and removed all the items from
the AIL. The failure occurred in this path while unmount was
proceeding in another task:

xfs_trans_ail_delete+0x102/0x130 [xfs]
xfs_buf_item_done+0x22/0x30 [xfs]
xfs_buf_ioend+0x73/0x4d0 [xfs]
xfs_trans_committed_bulk+0x17e/0x2f0 [xfs]
xlog_cil_committed+0x2a9/0x300 [xfs]
xlog_cil_process_committed+0x69/0x80 [xfs]
xlog_state_shutdown_callbacks+0xce/0xf0 [xfs]
xlog_force_shutdown+0xdf/0x150 [xfs]
xfs_do_force_shutdown+0x5f/0x150 [xfs]
xlog_ioend_work+0x71/0x80 [xfs]
process_one_work+0x1c5/0x390
worker_thread+0x30/0x350
kthread+0xd7/0x100
ret_from_fork+0x1f/0x30

This is processing an EIO error to a log write, and it's
triggering a force shutdown. This causes the log to be shut down,
and then it is running attached iclog callbacks from the shutdown
context. That means the fs and log has already been marked as
xfs_is_shutdown/xlog_is_shutdown and so high level code will abort
(e.g. xfs_trans_commit(), xfs_log_force(), etc) with an error
because of shutdown.

The umount would have been blocked waiting for a log force
completion inside xfs_log_cover() -> xfs_sync_sb(). The first thing
for this situation to occur is for xfs_sync_sb() to exit without
waiting for the iclog buffer to be comitted to disk. The
above trace is the completion routine for the iclog buffer, and
it is shutting down the filesystem.

xlog_state_shutdown_callbacks() does this:

{
        struct xlog_in_core     *iclog;
        LIST_HEAD(cb_list);

        spin_lock(&log->l_icloglock);
        iclog = log->l_iclog;
        do {
                if (atomic_read(&iclog->ic_refcnt)) {
                        /* Reference holder will re-run iclog callbacks. */
                        continue;
                }
                list_splice_init(&iclog->ic_callbacks, &cb_list);
>>>>>>           wake_up_all(&iclog->ic_write_wait);
>>>>>>           wake_up_all(&iclog->ic_force_wait);
        } while ((iclog = iclog->ic_next) != log->l_iclog);

        wake_up_all(&log->l_flush_wait);
        spin_unlock(&log->l_icloglock);

>>>>>>  xlog_cil_process_committed(&cb_list);
}

This wakes any thread waiting on IO completion of the iclog (in this
case the umount log force) before shutdown processes all the pending
callbacks.  That means the xfs_sync_sb() waiting on a sync
transaction in xfs_log_force() on iclog->ic_force_wait will get
woken before the callbacks attached to that iclog are run. This
results in xfs_sync_sb() returning an error, and so unmount unblocks
and continues to run whilst the log shutdown is still in progress.

Normally this is just fine because the force waiter has nothing to
do with AIL operations. But in the case of this unmount path, the
log force waiter goes on to tear down the AIL because the log is now
shut down and so nothing ever blocks it again from the wait point in
xfs_log_cover().

Hence it's a race to see who gets to the AIL first - the unmount
code or xlog_cil_process_committed() killing the superblock buffer.

To fix this, we just have to change the order of processing in
xlog_state_shutdown_callbacks() to run the callbacks before it wakes
any task waiting on completion of the iclog.

Reported-by: Brian Foster <[email protected]>
Fixes: aad7272a9208 ("xfs: separate out log shutdown callback processing")
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: shutdown in intent recovery has non-intent items in the AIL

generic/388 triggered a failure in RUI recovery due to a corrupted
btree record and the system then locked up hard due to a subsequent
assert failure while holding a spinlock cancelling intents:

XFS (pmem1): Corruption of in-memory data (0x8) detected at xfs_do_force_shutdown+0x1a/0x20 (fs/xfs/xfs_trans.c:964).  Shutting down filesystem.
XFS (pmem1): Please unmount the filesystem and rectify the problem(s)
XFS: Assertion failed: !xlog_item_is_intent(lip), file: fs/xfs/xfs_log_recover.c, line: 2632
Call Trace:
  <TASK>
  xlog_recover_cancel_intents.isra.0+0xd1/0x120
  xlog_recover_finish+0xb9/0x110
  xfs_log_mount_finish+0x15a/0x1e0
  xfs_mountfs+0x540/0x910
  xfs_fs_fill_super+0x476/0x830
  get_tree_bdev+0x171/0x270
  ? xfs_init_fs_context+0x1e0/0x1e0
  xfs_fs_get_tree+0x15/0x20
  vfs_get_tree+0x24/0xc0
  path_mount+0x304/0xba0
  ? putname+0x55/0x60
  __x64_sys_mount+0x108/0x140
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Essentially, there's dirty metadata in the AIL from intent recovery
transactions, so when we go to cancel the remaining intents we assume
that all objects after the first non-intent log item in the AIL are
not intents.

This is not true. Intent recovery can log new intents to continue
the operations the original intent could not complete in a single
transaction. The new intents are committed before they are deferred,
which means if the CIL commits in the background they will get
inserted into the AIL at the head.

Hence if we shut down the filesystem while processing intent
recovery, the AIL may have new intents active at the current head.
Hence this check:

                /*
                 * We're done when we see something other than an intent.
                 * There should be no intents left in the AIL now.
                 */
                if (!xlog_item_is_intent(lip)) {
#ifdef DEBUG
                        for (; lip; lip = xfs_trans_ail_cursor_next(ailp, &cur))
                                ASSERT(!xlog_item_is_intent(lip));
#endif
                        break;
                }

in both xlog_recover_process_intents() and
log_recover_cancel_intents() is simply not valid. It was valid back
when we only had EFI/EFD intents and didn't chain intents, but it
hasn't been valid ever since intent recovery could create and commit
new intents.

Given that crashing the mount task like this pretty much prevents
diagnosing what went wrong that lead to the initial failure that
triggered intent cancellation, just remove the checks altogether.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

xfs: aborting inodes on shutdown may need buffer lock

Most buffer io list operations are run with the bp->b_lock held, but
xfs_iflush_abort() can be called without the buffer lock being held
resulting in inodes being removed from the buffer list while other
list operations are occurring. This causes problems with corrupted
bp->b_io_list inode lists during filesystem shutdown, leading to
traversals that never end, double removals from the AIL, etc.

Fix this by passing the buffer to xfs_iflush_abort() if we have
it locked. If the inode is attached to the buffer, we're going to
have to remove it from the buffer list and we'd have to get the
buffer off the inode log item to do that anyway.

If we don't have a buffer passed in (e.g. from xfs_reclaim_inode())
then we can determine if the inode has a log item and if it is
attached to a buffer before we do anything else. If it does have an
attached buffer, we can lock it safely (because the inode has a
reference to it) and then perform the inode abort.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

Merge tag 'jfs-5.18' of https://github.com/kleikamp/linux-shaggy

Pull jfs updates from Dave Kleikamp:
"A couple bug fixes"

* tag 'jfs-5.18' of https://github.com/kleikamp/linux-shaggy:
jfs: prevent NULL deref in diFree
jfs: fix divide error in dbNextAG

dt-bindings: net: qcom,ethqos: Document SM8150 SoC compatible

SM8150 has an ethernet controller and it needs a different
configuration, so add a new compatible for this.

Acked-by: Rob Herring <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
[bhsharma: Massage the commit log]
Signed-off-by: Bhupesh Sharma <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

lib/test: use after free in register_test_dev_kmod()

The "test_dev" pointer is freed but then returned to the caller.

Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>

fs: fd tables have to be multiples of BITS_PER_LONG

This has always been the rule: fdtables have several bitmaps in them,
and as a result they have to be sized properly for bitmaps.  We walk
those bitmaps in chunks of 'unsigned long' in serveral cases, but even
when we don't, we use the regular kernel bitops that are defined to work
on arrays of 'unsigned long', not on some byte array.

Now, the distinction between arrays of bytes and 'unsigned long'
normally only really ends up being noticeable on big-endian systems, but
Fedor Pchelkin and Alexey Khoroshilov reported that copy_fd_bitmaps()
could be called with an argument that wasn't even a multiple of
BITS_PER_BYTE.  And then it fails to do the proper copy even on
little-endian machines.

The bug wasn't in copy_fd_bitmap(), but in sane_fdtable_size(), which
didn't actually sanitize the fdtable size sufficiently, and never made
sure it had the proper BITS_PER_LONG alignment.

That's partly because the alignment historically came not from having to
explicitly align things, but simply from previous fdtable sizes, and
from count_open_files(), which counts the file descriptors by walking
them one 'unsigned long' word at a time and thus naturally ends up doing
sizing in the proper 'chunks of unsigned long'.

But with the introduction of close_range(), we now have an external
source of "this is how many files we want to have", and so
sane_fdtable_size() needs to do a better job.

This also adds that explicit alignment to alloc_fdtable(), although
there it is mainly just for documentation at a source code level.  The
arithmetic we do there to pick a reasonable fdtable size already aligns
the result sufficiently.

In fact,clang notices that the added ALIGN() in that function doesn't
actually do anything, and does not generate any extra code for it.

It turns out that gcc ends up confusing itself by combining a previous
constant-sized shift operation with the variable-sized shift operations
in roundup_pow_of_two().  And probably due to that doesn't notice that
the ALIGN() is a no-op.  But that's a (tiny) gcc misfeature that doesn't
matter.  Having the explicit alignment makes sense, and would actually
matter on a 128-bit architecture if we ever go there.

This also adds big comments above both functions about how fdtable sizes
have to have that BITS_PER_LONG alignment.

Fixes: 60997c3d45d9 ("close_range: add CLOSE_RANGE_UNSHARE")
Reported-by: Fedor Pchelkin <[email protected]>
Reported-by: Alexey Khoroshilov <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Tested-and-acked-by: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

riscv module: remove (NOLOAD)

On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually
inappropriate for .plt, .got, and .got.plt sections which are always
SHT_PROGBITS.

In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway
and (NOLOAD) will be essentially ignored. In ld.lld, since
https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=<value>) to
customize the output section type"), ld.lld will report a `section type
mismatch` error (later changed to a warning). Just remove (NOLOAD) to
fix the warning.

[1] https://lld.llvm.org/ELF/linker_script.html As of today, "The
section should be marked as not loadable" on
https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is
outdated for ELF.

Link: https://github.com/ClangBuiltLinux/linux/issues/1597
Fixes: ab1ef68e5401 ("RISC-V: Add sections of PLT and GOT for kernel module")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Fangrui Song <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>

rtc: check if __rtc_read_time was successful

Clang static analysis reports this issue
interface.c:810:8: warning: Passed-by-value struct
  argument contains uninitialized data
  now = rtc_tm_to_ktime(tm);
      ^~~~~~~~~~~~~~~~~~~

tm is set by a successful call to __rtc_read_time()
but its return status is not checked.  Check if
it was successful before setting the enabled flag.
Move the decl of err to function scope.

Fixes: 2b2f5ff00f63 ("rtc: interface: ignore expired timers when enqueuing new timers")
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

rtc: gamecube: Fix refcount leak in gamecube_rtc_read_offset_from_sram

The of_find_compatible_node() function returns a node pointer with
refcount incremented, We should use of_node_put() on it when done
Add the missing of_node_put() to release the refcount.

Fixes: 86559400b3ef ("rtc: gamecube: Add a RTC driver for the GameCube, Wii and Wii U")
Signed-off-by: Miaoqian Lin <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

rtc: mc146818-lib: Fix the AltCentury for AMD platforms

Setting the century forward has been failing on AMD platforms.
There was a previous attempt at fixing this for family 0x17 as part of
commit 7ad295d5196a ("rtc: Fix the AltCentury value on AMD/Hygon
platform") but this was later reverted due to some problems reported
that appeared to stem from an FW bug on a family 0x17 desktop system.

The same comments mentioned in the previous commit continue to apply
to the newer platforms as well.

```
MC146818 driver use function mc146818_set_time() to set register
RTC_FREQ_SELECT(RTC_REG_A)'s bit4-bit6 field which means divider stage
reset value on Intel platform to 0x7.

While AMD/Hygon RTC_REG_A(0Ah)'s bit4 is defined as DV0 [Reference]:
DV0 = 0 selects Bank 0, DV0 = 1 selects Bank 1. Bit5-bit6 is defined
as reserved.

DV0 is set to 1, it will select Bank 1, which will disable AltCentury
register(0x32) access. As UEFI pass acpi_gbl_FADT.century 0x32
(AltCentury), the CMOS write will be failed on code:
CMOS_WRITE(century, acpi_gbl_FADT.century).

Correct RTC_REG_A bank select bit(DV0) to 0 on AMD/Hygon CPUs, it will
enable AltCentury(0x32) register writing and finally setup century as
expected.
```

However in closer examination the change previously submitted was also
modifying bits 5 & 6 which are declared reserved in the AMD documentation.
So instead modify just the DV0 bank selection bit.

Being cognizant that there was a failure reported before, split the code
change out to a static function that can also be used for exclusions if
any regressions such as Mikhail's pop up again.

Cc: Jinke Fan <[email protected]>
Cc: Mikhail Gavrilov <[email protected]>
Link: https://lore.kernel.org/all/CABXGCsMLob0DC25JS8wwAYydnDoHBSoMh2_YLPfqm3TTvDE-Zw@mail.gmail.com/
Link: https://www.amd.com/system/files/TechDocs/51192_Bolton_FCH_RRG.pdf
Signed-off-by: Raul E Rangel <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

io_uring: defer msg-ring file validity check until command issue

In preparation for not using the file at prep time, defer checking if this
file refers to a valid io_uring instance until issue time.

Signed-off-by: Jens Axboe <[email protected]>

parisc: Fix patch code locking and flushing

This change fixes the following:

1) The flags variable is not initialized. Always use raw_spin_lock_irqsave
and raw_spin_unlock_irqrestore to serialize patching.

2) flush_kernel_vmap_range is primarily intended for DMA flushes. Since
__patch_text_multiple is often called with interrupts disabled, it is
better to directly call flush_kernel_dcache_range_asm and
flush_kernel_icache_range_asm. This avoids an extra call.

3) The final call to flush_icache_range is unnecessary.

Signed-off-by: John David Anglin <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: Find a new timesync master if current CPU is removed

When CPU hotplugging is enabled, the user may want to remove the
current CPU which is providing the timer ticks. If this happens
we need to find a new timesync master.

Signed-off-by: Helge Deller <[email protected]>

parisc: Move common_stext into .text section when CONFIG_HOTPLUG_CPU=y

Move the common_stext function into the non-init text section if hotplug
is enabled. This function is called from the firmware when hotplugged
CPUs are brought up.

Signed-off-by: Helge Deller <[email protected]>

parisc: Rewrite arch_cpu_idle_dead() for CPU hotplugging

Let the PDC firmware put the CPU into firmware idle loop with the
pdc_cpu_rendezvous() function.

Signed-off-by: Helge Deller <[email protected]>

parisc: Implement __cpu_die() and __cpu_disable() for CPU hotplugging

Add relevant code to __cpu_die() and __cpu_disable() to finally enable
the CPU hotplugging features. Reset the irq count values in smp_callin()
to zero before bringing up the CPU.

It seems that the firmware may need up to 8 seconds to fully stop a CPU
in which no other PDC calls are allowed to be made. Use a timeout
__cpu_die() to accommodate for this.

Use "chcpu -d 1" to bring CPU1 down, and "chcpu -e 1" to bring it up.

Signed-off-by: Helge Deller <[email protected]>

parisc: Add PDC locking functions for rendezvous code

Add pdc_cpu_rendezvous_lock() and pdc_cpu_rendezvous_unlock()
to lock PDC while CPU is transitioning into rendezvous state.
This is needed, because the transition phase may take up to 8 seconds.

Add pdc_pat_get_PDC_entrypoint() to get PDC entry point for current CPU.

Signed-off-by: Helge Deller <[email protected]>

parisc: Move disable_sr_hashing_asm() into .text section

Signed-off-by: Helge Deller <[email protected]>

parisc: Move CPU startup-related functions into .text section

If CONFIG_HOTPLUG_CPU is enabled, those functions will be run again
after bootup. So they need to reside in the .text section.

Signed-off-by: Helge Deller <[email protected]>

parisc: Move store_cpu_topology() into text section

Signed-off-by: Helge Deller <[email protected]>

parisc: Switch from GENERIC_CPU_DEVICES to GENERIC_ARCH_TOPOLOGY

Switch away from the own cpu topology code to common code which is used
by ARM64 and RISCV. That will allow us to enable CPU hotplug later on.

Signed-off-by: Helge Deller <[email protected]>

parisc: Ensure set_firmware_width() is called only once

Call set_firmware_width() only once at runtime.
This prevents that hotplugged CPUs will get stuck in spinlocks later on.

Signed-off-by: Helge Deller <[email protected]>

parisc: Add constants for control registers and clean up mfctl()

Clean up the code for the mfctl() and mtctl() functions and add often
used constants.

Signed-off-by: Helge Deller <[email protected]>

parisc: Detect hppa-suse-linux-gcc compiler for cross-building

Allow the system to find the SUSE hppa compiler and linker to set
CROSS32_COMPILE and CROSS_COMPILE.

Suggested-by: Jiri Slaby <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: Clean up cpu_check_affinity() and drop cpu_set_affinity_irq()

The cpu_set_affinity_irq() isn't needed. Not the CPU irqs need to
change, but the slave irq chips simply need to be reprogrammed to
a new CPU irq with the txn_* functions.

Signed-off-by: Helge Deller <[email protected]>

parisc: Fix CPU affinity for Lasi, WAX and Dino chips

Add the missing logic to allow Lasi, WAX and Dino to set the
CPU affinity. This fixes IRQ migration to other CPUs when a
CPU is shutdown which currently holds the IRQs for one of those
chips.

Signed-off-by: Helge Deller <[email protected]>

x86/fpu: Remove redundant XCOMP_BV initialization

fpu_copy_uabi_to_guest_fpstate() initializes the XCOMP_BV field in the
XSAVE header. That's a leftover from the old KVM FPU buffer handling code.

Since

d69c1382e1b7 ("x86/kvm: Convert FPU handling to a single swap buffer")

KVM uses the FPU core allocation code, which initializes the XCOMP_BV
field already.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Merge tag 'devprop-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull device properties code update from Rafael Wysocki:
"This is based on new i2c material for 5.18-rc1 and simply reorganizes
  the code on top of it so as to group similar functions together (Andy
  Shevchenko)"

* tag 'devprop-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: Don't split fwnode_get_irq*() APIs in the code

Merge tag 'pm-5.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
"These update ARM cpufreq drivers, the OPP (Operating Performance
  Points) library and the power management documentation.

  Specifics:

   - Add per core DVFS support for QCom SoC (Bjorn Andersson), convert
     to yaml binding (Manivannan Sadhasivam) and various other fixes to
     the QCom drivers (Luca Weiss).

   - Add OPP table for imx7s SoC (Denys Drozdov) and minor fixes (Stefan
     Agner).

   - Fix CPPC driver's freq/performance conversions (Pierre Gondois).

   - Minor generic cleanups (Yury Norov).

   - Introduce opp-microwatt property to the OPP core, bindings, etc
     (Lukasz Luba).

   - Convert DT bindings to schema format and various related fixes
     (Yassine Oudjana).

   - Expose OPP's OF node in debugfs (Viresh Kumar).

   - Add Intel uncore frequency scaling documentation file to its
     MAINTAINERS entry (Srinivas Pandruvada).

   - Clean up the AMD P-state driver documentation (Jan Engelhardt)"

* tag 'pm-5.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits)
  Documentation: amd-pstate: grammar and sentence structure updates
  dt-bindings: cpufreq: cpufreq-qcom-hw: Convert to YAML bindings
  dt-bindings: dvfs: Use MediaTek CPUFREQ HW as an example
  Documentation: EM: Describe new registration method using DT
  OPP: Add support of "opp-microwatt" for EM registration
  PM: EM: add macro to set .active_power() callback conditionally
  OPP: Add "opp-microwatt" supporting code
  dt-bindings: opp: Add "opp-microwatt" entry in the OPP
  MAINTAINERS: Add additional file to uncore frequency control
  cpufreq: blocklist Qualcomm sc8280xp and sa8540p in cpufreq-dt-platdev
  cpufreq: qcom-hw: Add support for per-core-dcvs
  dt-bindings: power: avs: qcom,cpr: Convert to DT schema
  arm64: dts: qcom: qcs404: Rename CPU and CPR OPP tables
  arm64: dts: qcom: msm8996: Rename cluster OPP tables
  dt-bindings: opp: Convert qcom-nvmem-cpufreq to DT schema
  dt-bindings: opp: qcom-opp: Convert to DT schema
  arm64: dts: qcom: msm8996-mtp: Add msm8996 compatible
  dt-bindings: arm: qcom: Add msm8996 and apq8096 compatibles
  opp: Expose of-node's name in debugfs
  cpufreq: CPPC: Fix performance/frequency conversion
  ...

KVM: x86: Forbid VMM to set SYNIC/STIMER MSRs when SynIC wasn't activated

Setting non-zero values to SYNIC/STIMER MSRs activates certain features,
this should not happen when KVM_CAP_HYPERV_SYNIC{,2} was not activated.

Note, it would've been better to forbid writing anything to SYNIC/STIMER
MSRs, including zeroes, however, at least QEMU tries clearing
HV_X64_MSR_STIMER0_CONFIG without SynIC. HV_X64_MSR_EOM MSR is somewhat
'special' as writing zero there triggers an action, this also should not
happen when SynIC wasn't activated.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Message-Id: <20220325132140 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: Avoid theoretical NULL pointer dereference in kvm_irq_delivery_to_apic_fast()

When kvm_irq_delivery_to_apic_fast() is called with APIC_DEST_SELF
shorthand, 'src' must not be NULL. Crash the VM with KVM_BUG_ON()
instead of crashing the host.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Message-Id: <20220325132140 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: Check lapic_in_kernel() before attempting to set a SynIC irq

When KVM_CAP_HYPERV_SYNIC{,2} is activated, KVM already checks for
irqchip_in_kernel() so normally SynIC irqs should never be set. It is,
however, possible for a misbehaving VMM to write to SYNIC/STIMER MSRs
causing erroneous behavior.

The immediate issue being fixed is that kvm_irq_delivery_to_apic()
(kvm_irq_delivery_to_apic_fast()) crashes when called with
'irq.shorthand = APIC_DEST_SELF' and 'src == NULL'.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Message-Id: <20220325132140 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

Documentation: KVM: add API issues section

Add a section to document all the different ways in which the KVM API sucks.

I am sure there are way more, give people a place to vent so that userspace
authors are aware.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20220322110712 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Documentation: KVM: add virtual CPU errata documentation

Add a file to document all the different ways in which the virtual CPU
emulation is imperfect. Include an example to show how to document
such errata.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Message-Id: <20220322110712 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Documentation: KVM: add separate directories for architecture-specific documentation

ARM already has an arm/ subdirectory, but s390 and x86 do not even though
they have a relatively large number of files specific to them. Create
new directories in Documentation/virt/kvm for these two architectures
as well.

While at it, group the API documentation and the developer documentation
in the table of contents.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20220322110712 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Documentation: kvm: include new locks

kvm->mn_invalidate_lock and kvm->slots_arch_lock were not included in the
documentation, add them.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20220322110720 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Documentation: kvm: fixes for locking.rst

Separate the various locks clearly, and include the new names of blocked_vcpu_on_cpu_lock
and blocked_vcpu_on_cpu.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <20220322110720 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: Fix clang -Wimplicit-fallthrough in do_host_cpuid()

Clang warns:

  arch/x86/kvm/cpuid.c:739:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
          default:
          ^
  arch/x86/kvm/cpuid.c:739:2: note: insert 'break;' to avoid fall-through
          default:
          ^
          break;
  1 error generated.

Clang is a little more pedantic than GCC, which does not warn when
falling through to a case that is just break or return. Clang's version
is more in line with the kernel's own stance in deprecated.rst, which
states that all switch/case blocks must end in either break,
fallthrough, continue, goto, or return. Add the missing break to silence
the warning.

Fixes: f144c49e8c39 ("KVM: x86: synthesize CPUID leaf 0x80000021h if useful")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Message-Id: <20220322152906 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge branches 'clk-sifive' and 'clk-visconti' into clk-next

* clk-sifive:
  clk: sifive: Move all stuff into SoCs header files from C files
  clk: sifive: Add SoCs prefix in each SoCs-dependent data
  riscv: dts: Change the macro name of prci in each device node
  dt-bindings: change the macro name of prci in header files and example
  clk: sifive: duplicate the macro definitions for the time being

* clk-visconti:
  clk: visconti: prevent array overflow in visconti_clk_register_gates()

Merge branches 'clk-range', 'clk-uniphier', 'clk-apple' and 'clk-qcom' into clk-next

- Make clk_set_rate_range() re-evaluate the limits each time
- Introduce various clk_set_rate_range() tests
- Add clk_drop_range() to drop a previously set range
- Support for NCO blocks on Apple SoCs

* clk-range:
  clk: Drop the rate range on clk_put()
  clk: test: Test clk_set_rate_range on orphan mux
  clk: Initialize orphan req_rate
  clk: bcm: rpi: Run some clocks at the minimum rate allowed
  clk: bcm: rpi: Set a default minimum rate
  clk: bcm: rpi: Add variant structure
  clk: Add clk_drop_range
  clk: Always set the rate on clk_set_range_rate
  clk: Use clamp instead of open-coding our own
  clk: Always clamp the rounded rate
  clk: Enforce that disjoints limits are invalid
  clk: Introduce Kunit Tests for the framework
  clk: Fix clk_hw_get_clk() when dev is NULL

* clk-uniphier:
  clk: uniphier: Fix fixed-rate initialization

* clk-apple:
  clk: clk-apple-nco: Allow and fix module building
  MAINTAINERS: Add clk-apple-nco under ARM/APPLE MACHINE
  clk: clk-apple-nco: Add driver for Apple NCO
  dt-bindings: clock: Add Apple NCO

* clk-qcom: (61 commits)
  clk: qcom: gcc-msm8994: Fix gpll4 width
  dt-bindings: clock: fix dt_binding_check error for qcom,gcc-other.yaml
  clk: qcom: Add display clock controller driver for SM6125
  dt-bindings: clock: add QCOM SM6125 display clock bindings
  clk: qcom: Fix sorting of SDX_GCC_65 in Makefile and Kconfig
  clk: qcom: gcc: Add emac GDSC support for SM8150
  clk: qcom: gcc: sm8150: Fix some identation issues
  clk: qcom: gcc: Add UFS_CARD and UFS_PHY GDSCs for SM8150
  clk: qcom: gcc: Add PCIe0 and PCIe1 GDSC for SM8150
  clk: qcom: clk-rcg2: Update the frac table for pixel clock
  clk: qcom: clk-rcg2: Update logic to calculate D value for RCG
  clk: qcom: smd: Add missing MSM8998 RPM clocks
  clk: qcom: smd: Add missing RPM clocks for msm8992/4
  dt-bindings: clock: qcom: rpmcc: Add RPM Modem SubSystem (MSS) clocks
  clk: qcom: gcc-ipq806x: add CryptoEngine resets
  dt-bindings: reset: add ipq8064 ce5 resets
  clk: qcom: gcc-ipq806x: add CryptoEngine clocks
  dt-bindings: clock: add ipq8064 ce5 clk define
  clk: qcom: gcc-ipq806x: add additional freq for sdc table
  clk: qcom: clk-rcg: add clk_rcg_floor_ops ops
  ...

Merge branches 'clk-starfive', 'clk-ti', 'clk-terminate' and 'clk-cleanup' into clk-next

- Audio clks on StarFive JH7100 RISC-V SoC
- Terminate arrays with sentinels and make that clearer
- Cleanup SPDX tags
- Fix typos in comments

* clk-starfive:
  clk: starfive: Add JH7100 audio clock driver
  clk: starfive: jh7100: Support more clock types
  clk: starfive: jh7100: Make hw clock implementation reusable
  dt-bindings: clock: Add starfive,jh7100-audclk bindings
  dt-bindings: clock: Add JH7100 audio clock definitions
  clk: starfive: jh7100: Handle audio_div clock properly
  clk: starfive: jh7100: Don't round divisor up twice

* clk-ti:
  clk: ti: Drop legacy compatibility clocks for dra7
  clk: ti: Drop legacy compatibility clocks for am4
  clk: ti: Drop legacy compatibility clocks for am3
  clk: ti: Update component clocks to use ti_dt_clk_name()
  clk: ti: Update pll and clockdomain clocks to use ti_dt_clk_name()
  clk: ti: Add ti_dt_clk_name() helper to use clock-output-names
  clk: ti: Use clock-output-names for clkctrl
  clk: ti: Add ti_find_clock_provider() to use clock-output-names
  clk: ti: Optionally parse IO address from parent clock node
  clk: ti: Preserve node in ti_dt_clocks_register()
  clk: ti: Constify clkctrl_name

* clk-terminate:
  clk: actions: Make sentinel elements more obvious
  clk: clps711x: Terminate clk_div_table with sentinel element
  clk: hisilicon: Terminate clk_div_table with sentinel element
  clk: loongson1: Terminate clk_div_table with sentinel element
  clk: actions: Terminate clk_div_table with sentinel element

* clk-cleanup:
  clk: zynq: Update the parameters to zynq_clk_register_periph_clk
  clk: zynq: trivial warning fix
  clk: qcom: sm6125-gcc: fix typos in comments
  clk: ti: clkctrl: fix typos in comments
  clk: COMMON_CLK_LAN966X should depend on SOC_LAN966
  clk: Use of_device_get_match_data()
  clk: bcm2835: Remove unused variable
  clk: tegra: tegra124-emc: Fix missing put_device() call in emc_ensure_emc_driver
  clk: cleanup comments
  clk: socfpga: cleanup spdx tags

Merge branches 'clk-mvebu', 'clk-const', 'clk-imx' and 'clk-rockchip' into clk-next

- Mark mux table as const in clk-mux
- Make the all_lists array const

* clk-mvebu:
  clk: mvebu: use time_is_before_eq_jiffies() instead of open coding it

* clk-const:
  clk: Mark clk_core_evict_parent_cache_subtree() 'target' const
  clk: Mark 'all_lists' as const
  clk: pistachio: Declare mux table as const u32[]
  clk: qcom: Declare mux table as const u32[]
  clk: mmp: Declare mux tables as const u32[]
  clk: hisilicon: Remove unnecessary cast of mux table to u32 *
  clk: mux: Declare u32 *table parameter as const
  clk: nxp: Declare mux table parameter as const u32 *
  clk: nxp: Remove unused variable

* clk-imx: (28 commits)
  dt-bindings: clock: drop useless consumer example
  clk: imx: Select MXC_CLK for i.MX93 clock driver
  clk: imx: remove redundant re-assignment of pll->base
  MAINTAINERS: clk: imx: add git tree and dt-bindings files
  clk: imx: pll14xx: Support dynamic rates
  clk: imx: pll14xx: Add pr_fmt
  clk: imx: pll14xx: explicitly return lowest rate
  clk: imx: pll14xx: name variables after usage
  clk: imx: pll14xx: consolidate rate calculation
  clk: imx: pll14xx: Use FIELD_GET/FIELD_PREP
  clk: imx: pll14xx: Drop wrong shifting
  clk: imx: pll14xx: Use register defines consistently
  clk: imx8mp: remove SYS PLL 1/2 clock gates
  clk: imx8mn: remove SYS PLL 1/2 clock gates
  clk: imx8mm: remove SYS PLL 1/2 clock gates
  clk: imx: add i.MX93 clk
  clk: imx: support fracn gppll
  clk: imx: add i.MX93 composite clk
  dt-bindings: clock: add i.MX93 clock definition
  dt-bindings: clock: Add imx93 clock support
  ...

* clk-rockchip:
  clk: rockchip: re-add rational best approximation algorithm to the fractional divider
  clk/rockchip: Use of_device_get_match_data()
  clk: rockchip: Add CLK_SET_RATE_PARENT to the HDMI reference clock on rk3568
  clk: rockchip: drop CLK_SET_RATE_PARENT from dclk_vop* on rk3568
  clk: rockchip: Add more PLL rates for rk3568

Merge branches 'clk-xilinx', 'clk-kunit', 'clk-cs2000' and 'clk-renesas' into clk-next

- Kunit tests for clk-gate implementation
- Convert Cirrus Logic CS2000P driver to regmap, yamlify DT binding and add
   support for dynamic mode

* clk-xilinx:
  clk: zynqmp: replace warn_once with pr_debug for failed clock ops

* clk-kunit:
  clk: gate: Add some kunit test suites

* clk-cs2000:
  clk: cs2000-cp: convert driver to regmap
  clk: cs2000-cp: freeze config during register fiddling
  clk: cs2000-cp: make clock skip setting configurable
  clk: cs2000-cp: add support for dynamic mode
  clk: cs2000-cp: Make aux output function controllable
  dt-bindings: clock: cs2000-cp: document cirrus,dynamic-mode
  dt-bindings: clock: cs2000-cp: document cirrus,clock-skip flag
  dt-bindings: clock: cs2000-cp: document aux-output-source
  dt-bindings: clock: convert cs2000-cp bindings to yaml

* clk-renesas:
  dt-bindings: clock: renesas: Make example 'clocks' parsable
  clk: rs9: Add Renesas 9-series PCIe clock generator driver
  clk: fixed-factor: Introduce devm_clk_hw_register_fixed_factor_index()
  dt-bindings: clk: rs9: Add Renesas 9-series I2C PCIe clock generator
  clk: renesas: r8a779f0: Add PFC clock
  clk: renesas: r8a779f0: Add I2C clocks
  clk: renesas: r8a779f0: Add WDT clock
  clk: renesas: r8a779f0: Fix RSW2 clock divider
  clk: renesas: rzg2l-cpg: Add support for RZ/V2L SoC
  dt-bindings: clock: renesas: Document RZ/V2L SoC
  dt-bindings: clock: Add R9A07G054 CPG Clock and Reset Definitions
  clk: renesas: r8a779a0: Add CANFD module clock
  clk: renesas: r9a07g044: Update multiplier and divider values for PLL2/3
  clk: renesas: r8a7799[05]: Add MLP clocks
  clk: renesas: r8a779f0: Add SYS-DMAC clocks

Merge branches 'clk-microchip', 'clk-si', 'clk-mtk', 'clk-at91' and 'clk-st' into clk-next

- Clock configuration on Microchip PolarFire SoCs
- Free allocations on probe error in Mediatek clk driver
- Modernize Mediatek clk driver by consolidating code

* clk-microchip:
  clk: microchip: Add driver for Microchip PolarFire SoC
  dt-bindings: clk: microchip: Add Microchip PolarFire host binding

* clk-si:
  clk-si5341: replace snprintf in show functions with sysfs_emit
  clk: si5341: fix reported clk_rate when output divider is 2

* clk-mtk: (32 commits)
  clk: mediatek: Warn if clk IDs are duplicated
  clk: mediatek: mt8195: Implement remove functions
  clk: mediatek: mt8195: Implement error handling in probe functions
  clk: mediatek: mt8195: Hook up mtk_clk_simple_remove()
  clk: mediatek: Unregister clks in mtk_clk_simple_probe() error path
  clk: mediatek: mtk: Implement error handling in register APIs
  clk: mediatek: pll: Implement error handling in register API
  clk: mediatek: mux: Implement error handling in register API
  clk: mediatek: mux: Reverse check for existing clk to reduce nesting level
  clk: mediatek: gate: Implement error handling in register API
  clk: mediatek: cpumux: Implement error handling in register API
  clk: mediatek: mtk: Clean up included headers
  clk: mediatek: Add mtk_clk_simple_remove()
  clk: mediatek: Implement mtk_clk_unregister_composites() API
  clk: mediatek: Implement mtk_clk_unregister_divider_clks() API
  clk: mediatek: Implement mtk_clk_unregister_factors() API
  clk: mediatek: Implement mtk_clk_unregister_fixed_clks() API
  clk: mediatek: pll: Clean up included headers
  clk: mediatek: pll: Implement unregister API
  clk: mediatek: pll: Split definitions into separate header file
  ...

* clk-at91:
  clk: at91: clk-master: remove dead code
  clk: at91: sama7g5: fix parents of PDMCs' GCLK
  clk: at91: sama7g5: Allow MCK1 to be exported and referenced in DT
  clk: at91: allow setting PMC_AUDIOPINCK clock parents via DT

* clk-st:
  clk: stm32mp1: Add parent_data to ETHRX clock
  clk: stm32mp1: Split ETHCK_K into separate MUX and GATE clock

clk: zynq: Update the parameters to zynq_clk_register_periph_clk

In case there are only one gate or the two_gate is 0 the clk1 clock
passed is not used. We are passing 0 which is arm_pll.
Pass a invalid clock instead.

Signed-off-by: Shubhrajyoti Datta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>

clk: zynq: trivial warning fix

Fix the below warning

WARNING: Missing a blank line after declarations
+ int enable = !!(fclk_enable & BIT(i - fclk0));
+ zynq_clk_register_fclk(i, clk_output_name[i],

Signed-off-by: Shubhrajyoti Datta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>

Revert "KVM: set owner of cpu and vm file operations"

This reverts commit 3d3aab1b973b01bd2a1aa46307e94a1380b1d802.

Now that the KVM module's lifetime is tied to kvm.users_count, there is
no need to also tie it's lifetime to the lifetime of the VM and vCPU
file descriptors.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: David Matlack <[email protected]>
Message-Id: <20220303183328.1499189 [email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: Prevent module exit until all VMs are freed

Tie the lifetime the KVM module to the lifetime of each VM via
kvm.users_count. This way anything that grabs a reference to the VM via
kvm_get_kvm() cannot accidentally outlive the KVM module.

Prior to this commit, the lifetime of the KVM module was tied to the
lifetime of /dev/kvm file descriptors, VM file descriptors, and vCPU
file descriptors by their respective file_operations "owner" field.
This approach is insufficient because references grabbed via
kvm_get_kvm() do not prevent closing any of the aforementioned file
descriptors.

This fixes a long standing theoretical bug in KVM that at least affects
async page faults. kvm_setup_async_pf() grabs a reference via
kvm_get_kvm(), and drops it in an asynchronous work callback. Nothing
prevents the VM file descriptor from being closed and the KVM module
from being unloaded before this callback runs.

Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
Fixes: 3d3aab1b973b ("KVM: set owner of cpu and vm file operations")
Cc: [email protected]
Suggested-by: Ben Gardon <[email protected]>
[ Based on a patch from Ben implemented for Google's kernel. ]
Signed-off-by: David Matlack <[email protected]>
Message-Id: <20220303183328.1499189 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>