Git Repo - linux.git/log

arm64: defconfig: enable syscon-poweroff driver

Enable the generic syscon-poweroff driver used on all Exynos ARM64 SoCs
(e.g. Exynos5433) and few APM SoCs.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Alim Akhtar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

ARM: locomo: fix locomolcd_power declaration

The locomolcd driver has one remaining missing-prototype warning:

drivers/video/backlight/locomolcd.c:83:6: error: no previous prototype for 'locomolcd_power' [-Werror=missing-prototypes]

There is in fact an unused prototype with a similar name in a global
header, so move the actual one there and remove the old one.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'scmi-fix-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes

Arm SCMI fix for v6.6

A single fix to address scmi_perf_attributes_get() using the protocol
version even before it was populated and ending up with unexpected
bogowatts power scale.

* tag 'scmi-fix-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: Fixup perf power-cost/microwatt support

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'ffa-fix-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes

Arm FF-A fix for v6.6

It has been reported that the driver sets the memory region attributes
for MEM_LEND operation when the specification clearly states not to. The
fix here addresses the issue by ensuring the memory region attributes are
cleared for the memory lending operation.

* tag 'ffa-fix-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_ffa: Don't set the memory region attributes for MEM_LEND

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

i2c: npcm7xx: Fix callback completion ordering

Sometimes, our completions race with new master transfers and override
the bus->operation and bus->master_or_slave variables. This causes
transactions to timeout and kernel crashes less frequently.

To remedy this, we re-order all completions to the very end of the
function.

Fixes: 56a1485b102e ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver")
Signed-off-by: William A. Kennington III <[email protected]>
Reviewed-by: Tali Perry <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"A single fix for libata: older devices don't support command duration
  limits (CDL) and some don't support report opcodes, meaning there's no
  way to tell if they support the command or not.

  Reduce the problems of incorrectly using CDL commands on older devices
  by checking SCSI spec compliance at SPC-5 (the spec which introduced
  the command) before turning on CDL"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: ata: Do no try to probe for CDL on old drives

Merge tag 'vfio-v6.6-rc4' of https://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

- The new PDS vfio-pci variant driver only supports SR-IOV VF devices
   and incorrectly made a direct reference to the physfn field of the
   pci_dev.  Fix this both by making the Kconfig depend on IOV support
   as well as using the correct wrapper for this access (Shixiong Ou)

- Resolve an error path issue where on unwind of the mdev registration
   the created kset is not unregistered and the wrong error code is
   returned (Jinjie Ruan)

* tag 'vfio-v6.6-rc4' of https://github.com/awilliam/linux-vfio:
  vfio/mdev: Fix a null-ptr-deref bug for mdev_unregister_parent()
  vfio/pds: Use proper PF device access helper
  vfio/pds: Add missing PCI_IOV depends

spi: spi-gxp: BUG: Correct spi write return value

Bug fix to correct return value of gxp_spi_write function to zero.
Completion of succesful operation should return zero.

Fixes: 730bc8ba5e9e spi: spi-gxp: Add support for HPE GXP SoCs
Signed-off-by: Charles Kearney <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

timers: Tag (hr)timer softirq as hotplug safe

Specific stress involving frequent CPU-hotplug operations, such as
running rcutorture for example, may trigger the following message:

NOHZ tick-stop error: local softirq work is pending, handler #02!!!"

This happens in the CPU-down hotplug process, after
CPUHP_AP_SMPBOOT_THREADS whose teardown callback parks ksoftirqd, and
before the target CPU shuts down through CPUHP_AP_IDLE_DEAD. In this
fragile intermediate state, softirqs waiting for threaded handling may be
forever ignored and eventually reported by the idle task as in the above
example.

However some vectors are known to be safe as long as the corresponding
subsystems have teardown callbacks handling the migration of their
events. The above error message reports pending timers softirq although
this vector can be considered as hotplug safe because the
CPUHP_TIMERS_PREPARE teardown callback performs the necessary migration
of timers after the death of the CPU. Hrtimers also have a similar
hotplug handling.

Therefore this error message, as far as (hr-)timers are concerned, can
be considered spurious and the relevant softirq vectors can be marked as
hotplug safe.

Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]

swiotlb: fix the check whether a device has used software IO TLB

When CONFIG_SWIOTLB_DYNAMIC=y, devices which do not use the software IO TLB
can avoid swiotlb lookup. A flag is added by commit 1395706a1490 ("swiotlb:
search the software IO TLB only if the device makes use of it"), the flag
is correctly set, but it is then never checked. Add the actual check here.

Note that this code is an alternative to the default pool check, not an
additional check, because:

1. swiotlb_find_pool() also searches the default pool;
2. if dma_uses_io_tlb is false, the default swiotlb pool is not used.

Tested in a KVM guest against a QEMU RAM-backed SATA disk over virtio and
*not* using software IO TLB, this patch increases IOPS by approx 2% for
4-way parallel I/O.

The write memory barrier in swiotlb_dyn_alloc() is not needed, because a
newly allocated pool must always be observed by swiotlb_find_slots() before
an address from that pool is passed to is_swiotlb_buffer().

Correctness was verified using the following litmus test:

C swiotlb-new-pool

(*
* Result: Never
*
* Check that a newly allocated pool is always visible when the
* corresponding swiotlb buffer is visible.
*)

{
mem_pools = default;
}

P0(int **mem_pools, int *pool)
{
/* add_mem_pool() */
WRITE_ONCE(*pool, 999);
rcu_assign_pointer(*mem_pools, pool);
}

P1(int **mem_pools, int *flag, int *buf)
{
/* swiotlb_find_slots() */
int *r0;
int r1;

rcu_read_lock();
r0 = READ_ONCE(*mem_pools);
r1 = READ_ONCE(*r0);
rcu_read_unlock();

if (r1) {
WRITE_ONCE(*flag, 1);
smp_mb();
}

/* device driver (presumed) */
WRITE_ONCE(*buf, r1);
}

P2(int **mem_pools, int *flag, int *buf)
{
/* device driver (presumed) */
int r0 = READ_ONCE(*buf);

/* is_swiotlb_buffer() */
int r1;
int *r2;
int r3;

smp_rmb();
r1 = READ_ONCE(*flag);
if (r1) {
/* swiotlb_find_pool() */
rcu_read_lock();
r2 = READ_ONCE(*mem_pools);
r3 = READ_ONCE(*r2);
rcu_read_unlock();
}
}

exists (2:r0<>0 /\ 2:r3=0) (* Not found. *)

Fixes: 1395706a1490 ("swiotlb: search the software IO TLB only if the device makes use of it")
Reported-by: Jonathan Corbet <[email protected]>
Closes: https://lore.kernel.org/linux-iommu/[email protected]/
Signed-off-by: Petr Tesarik <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

Merge tag 'aspeed-6.6-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc into arm/fixes

ASPEED Maintainers update

Andrew has changed addresses and the git tree has long since been at a
different location.

* tag 'aspeed-6.6-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/bmc:
MAINTAINERS: aspeed: Update Andrew's email address
MAINTAINERS: aspeed: Update git tree URL

Link: https://lore.kernel.org/r/CACPK8Xc+D=YBc2Dhk-6-gOuvKN0xGgZYNop6oJVa=VNgaEYOHw@mail.gmail.com
Signed-off-by: Arnd Bergmann <[email protected]>

soc: loongson: loongson2_guts: Remove unneeded semicolon

No functional modification involved.

./drivers/soc/loongson/loongson2_guts.c:73:2-3: Unneeded semicolon.

Reviewed-by: Huacai Chen <[email protected]>
Signed-off-by: Mingtong Bao <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

soc: loongson: loongson2_guts: Convert to devm_platform_ioremap_resource()

Use devm_platform_ioremap_resource() to simplify code.

Signed-off-by: Dongliang Mu <[email protected]>
Signed-off-by: Binbin Zhou <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

soc: loongson: loongson_pm2: Populate children syscon nodes

The syscon poweroff and reboot nodes logically belong to the Power
Management Unit so populate possible children.

Without it, the reboot/poweroff feature becomes unavailable.

Signed-off-by: Binbin Zhou <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

dt-bindings: soc: loongson,ls2k-pmc: Allow syscon-reboot/syscon-poweroff as child

The reboot and poweroff features are actually part of the Power
Management Unit system controller, thus allow them as its children,
instead of specifying as separate device nodes with syscon phandle.

Without it, the reboot/poweroff feature becomes unavailable.

Signed-off-by: Binbin Zhou <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

soc: loongson: loongson_pm2: Drop useless of_device_id compatible

Now, "loongson,ls2k0500-pmc" is used as fallback compatible, so the
ls2k1000 compatible in the driver can be dropped directly.

Signed-off-by: Binbin Zhou <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

dt-bindings: soc: loongson,ls2k-pmc: Use fallbacks for ls2k-pmc compatible

The Loongson-2K series chips (ls2k0500/ls2k1000/ls2k2000) share the same
PM system controller, using ls2k0500 compatible as fallback for the
others.

Signed-off-by: Binbin Zhou <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

soc: loongson: loongson_pm2: Add dependency for INPUT

Since commit 67694c076bd7 ("soc: loongson2_pm: add power management
support"), the Loongson-2K PM driver was added, but it didn't update the
Kconfig entry for the INPUT dependency, leading to build errors, so
update the Kconfig entry to depend on INPUT.

/opt/crosstool/gcc-13.2.0-nolibc/loongarch64-linux/bin/loongarch64-linux-ld:
drivers/soc/loongson/loongson2_pm.o: in function `loongson2_power_button_init':
/work/lnx/next/linux-next-20230825/LOONG64/../drivers/soc/loongson/loongson2_pm.c:101:(.text+0x350): undefined reference to `input_allocate_device'
/opt/crosstool/gcc-13.2.0-nolibc/loongarch64-linux/bin/loongarch64-linux-ld:
/work/lnx/next/linux-next-20230825/LOONG64/../drivers/soc/loongson/loongson2_pm.c:109:(.text+0x3dc): undefined reference to `input_set_capability'
/opt/crosstool/gcc-13.2.0-nolibc/loongarch64-linux/bin/loongarch64-linux-ld:
/work/lnx/next/linux-next-20230825/LOONG64/../drivers/soc/loongson/loongson2_pm.c:111:(.text+0x3e4): undefined reference to `input_register_device'
/opt/crosstool/gcc-13.2.0-nolibc/loongarch64-linux/bin/loongarch64-linux-ld:
/work/lnx/next/linux-next-20230825/LOONG64/../drivers/soc/loongson/loongson2_pm.c:125:(.text+0x3fc): undefined reference to `input_free_device'
/opt/crosstool/gcc-13.2.0-nolibc/loongarch64-linux/bin/loongarch64-linux-ld: drivers/soc/loongson/loongson2_pm.o: in function `input_report_key':
/work/lnx/next/linux-next-20230825/LOONG64/../include/linux/input.h:425:(.text+0x58c): undefined reference to `input_event'

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Binbin Zhou <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'optee-for-for-v6.6' of https://git.linaro.org/people/jens.wiklander/linux-tee into arm/fixes

Remove a few unused declarations in TEE subsystem

* tag 'optee-for-for-v6.6' of https://git.linaro.org/people/jens.wiklander/linux-tee:
tee: Remove unused declarations

Link: https://lore.kernel.org/r/20230913083909.GA473533@rayden
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'riscv-dt-fixes-for-v6.6-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V Devicetree fixes for v6.6-rc3

Starfive:
A fix for the size of the NOR flash that was causing complaints from the
MTD subsystem during boot & two issues that a certain someone introduced
while resolving merge conflicts. Of the latter, one is a cosmetic
ordering change & the other lead to the usb controller being disabled.

Signed-off-by: Conor Dooley <[email protected]>
* tag 'riscv-dt-fixes-for-v6.6-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: visionfive 2: Fix uart0 pins sort order
  riscv: dts: starfive: visionfive 2: Enable usb0
  riscv: dts: starfive: fix NOR flash reserved-data partition size

Link: https://lore.kernel.org/r/20230916-previous-oversold-9d30891ac6cf@spud
Signed-off-by: Arnd Bergmann <[email protected]>

arm64: defconfig: remove CONFIG_COMMON_CLK_NPCM8XX=y

There is no code for this config option and enabling it in defconfig
causes warnings from tools which are detecting unused and obsolete
kernel config flags since the flag will be completely missing from
effective build config after "make olddefconfig".

Fixes yocto kernel recipe build time warning:

WARNING: [kernel config]: This BSP contains fragments with warnings:
...
[INFO]: the following symbols were not found in the active
configuration:
- CONFIG_COMMON_CLK_NPCM8XX

The flag was added with commit 45472f1e5348c7b755b4912f2f529ec81cea044b
v5.19-rc4-15-g45472f1e5348 so 6.1 and 6.4 stable kernel trees are
affected.

Fixes: 45472f1e5348c7b755b4912f2f529ec81cea044b ("arm64: defconfig: Add Nuvoton NPCM family support")
Cc: [email protected]
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Cc: Konrad Dybcio <[email protected]>
Cc: Neil Armstrong <[email protected]>
Cc: Tomer Maimon <[email protected]>
Cc: Bruce Ashfield <[email protected]>
Cc: Jon Mason <[email protected]>
Cc: Jon Mason <[email protected]>
Cc: Ross Burton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Mikko Rapeli <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'imx-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 6.6:

- A couple of i.MX8MP device tree changes from Adam Ford to fix clock
  configuration regressions caused by 16c984524862 ("arm64: dts: imx8mp:
  don't initialize audio clocks from CCM node").
- Fix pmic-irq-hog GPIO line in imx93-tqma9352 device tree.
- Fix a mmemory leak with error handling path of imx_dsp_setup_channels()
  in imx-dsp driver.
- Fix HDMI node in imx8mm-evk device tree.
- Add missing clock enable functionality for imx8mm_soc_uid() function
  in soc-imx8m driver.
- Add missing imx8mm-prt8mm.dtb build target.

* tag 'imx-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: imx: Add imx8mm-prt8mm.dtb to build
  arm64: dts: imx8mm-evk: Fix hdmi@3d node
  soc: imx8m: Enable OCOTP clock for imx8mm before reading registers
  arm64: dts: imx8mp-beacon-kit: Fix audio_pll2 clock
  arm64: dts: imx8mp: Fix SDMA2/3 clocks
  arm64: dts: freescale: tqma9352: Fix gpio hog
  firmware: imx-dsp: Fix an error handling path in imx_dsp_setup_channels()

Link: https://lore.kernel.org/r/20230926123710.GT7231@dragon
Signed-off-by: Arnd Bergmann <[email protected]>

Merge tag 'omap-for-v6.6/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes

Fixes for omaps and ti-sysc

Fixes for several ti-sysc interconnect target module driver issues for
external abort on non-linefetch, am35x soc match, and uart module quirks
handling needed for devices to work and to allow device wake-up to work.

Fixes for droid4 boot time errors and warnings as noticed after boot doing
dmesg -lerr,warn. Let's also cut down the debug uart noise by using
overrun-throttle-ms, and downgrade the u-boot version warnings to
debug statements to further reduce the boot time noise with warnings.

* tag 'omap-for-v6.6/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  bus: ti-sysc: Fix SYSC_QUIRK_SWSUP_SIDLE_ACT handling for uart wake-up
  ARM: omap2+: Downgrade u-boot version warnings to debug statements
  ARM: dts: ti: omap: Fix noisy serial with overrun-throttle-ms for mapphone
  ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot
  ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4
  bus: ti-sysc: Fix missing AM35xx SoC matching
  bus: ti-sysc: Use fsleep() instead of usleep_range() in sysc_reset()

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

ARM: uniphier: fix cache kernel-doc warnings

Fix kernel-doc warning(s) as reported by lkp:

arch/arm/mm/cache-uniphier.c:72: warning: cannot understand function prototype: 'struct uniphier_cache_data '
cache-uniphier.c:82: warning: Function parameter or member 'way_ctrl_base' not described in 'uniphier_cache_data'

Fixes: e7ecbc057bc5 ("ARM: uniphier: add outer cache support")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: kernel test robot <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Olof Johansson <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Kunihiko Hayashi <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Link: lore.kernel.org/r/202309260130 [email protected] # fixes only one item
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

LoongArch: Add support for 64_PCREL relocation type

When build and update kernel with the latest upstream binutils and
loongson3_defconfig, module loader fails with:

  kmod: zsmalloc: Unknown relocation type 109
  kmod: fuse: Unknown relocation type 109
  kmod: fuse: Unknown relocation type 109
  kmod: radeon: Unknown relocation type 109
  kmod: nf_tables: Unknown relocation type 109
  kmod: nf_tables: Unknown relocation type 109

This is because the latest upstream binutils replaces a pair of ADD64
and SUB64 with 64_PCREL, so add support for 64_PCREL relocation type.

Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
Cc: <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>

LoongArch: Add support for 32_PCREL relocation type

When build and update kernel with the latest upstream binutils and
loongson3_defconfig, module loader fails with:

  kmod: zsmalloc: Unsupport relocation type 99, please add its support.
  kmod: fuse: Unsupport relocation type 99, please add its support.
  kmod: ipmi_msghandler: Unsupport relocation type 99, please add its support.
  kmod: ipmi_msghandler: Unsupport relocation type 99, please add its support.
  kmod: pstore: Unsupport relocation type 99, please add its support.
  kmod: drm_display_helper: Unsupport relocation type 99, please add its support.
  kmod: drm_display_helper: Unsupport relocation type 99, please add its support.
  kmod: drm_display_helper: Unsupport relocation type 99, please add its support.
  kmod: fuse: Unsupport relocation type 99, please add its support.
  kmod: fat: Unsupport relocation type 99, please add its support.

This is because the latest upstream binutils replaces a pair of ADD32
and SUB32 with 32_PCREL, so add support for 32_PCREL relocation type.

Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
Cc: <[email protected]>
Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>

LoongArch: Define relocation types for ABI v2.10

The relocation types from 101 to 109 are used by GNU binutils >= 2.41,
add their definitions to use them in later patches.

Link: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/elf/loongarch.h#l230
Cc: <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>

LoongArch: numa: Fix high_memory calculation

For 64bit kernel without HIGHMEM, high_memory is the virtual address of
the highest physical address in the system. But __va(get_num_physpages()
<< PAGE_SHIFT) is not what we want for high_memory because there may be
holes in the physical address space. On the other hand, max_low_pfn is
calculated from memblock_end_of_DRAM(), which is exactly corresponding
to the highest physical address, so use it for high_memory calculation.

Cc: <[email protected]>
Fixes: d4b6f1562a3c3284adce ("LoongArch: Add Non-Uniform Memory Access (NUMA) support")
Signed-off-by: Chong Qiao <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>

gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip

The drivers uses a mutex and I2C bus access in its PMIC EIC chip
get implementation. This means these functions can sleep and the PMIC EIC
chip should set the can_sleep property to true.

This will ensure that a warning is printed when trying to get the
value from a context that potentially can't sleep.

Fixes: 348f3cde84ab ("gpio: Add Spreadtrum PMIC EIC driver support")
Signed-off-by: Wenhua Lin <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>

gpio: timberdale: Fix potential deadlock on &tgpio->lock

As timbgpio_irq_enable()/timbgpio_irq_disable() callback could be
executed under irq context, it could introduce double locks on
&tgpio->lock if it preempts other execution units requiring
the same locks.

timbgpio_gpio_set()
--> timbgpio_update_bit()
--> spin_lock(&tgpio->lock)
<interrupt>
--> timbgpio_irq_disable()
--> spin_lock_irqsave(&tgpio->lock)

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.

To prevent the potential deadlock, the patch uses spin_lock_irqsave()
on &tgpio->lock inside timbgpio_gpio_set() to prevent the possible
deadlock scenario.

Signed-off-by: Chengfeng Ye <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>

accel/ivpu: Use cached buffers for FW loading

Create buffers with cache coherency on the CPU side (write-back) while
disabling snooping on the VPU side. These buffers require an explicit
cache flush after each CPU-side modification.

Configuring pages as write-combined may introduce significant delays,
potentially taking hundreds of milliseconds for 64 MB buffers.

Added internal DRM_IVPU_BO_NOSNOOP mask which disables snooping on the
VPU side. Allocate FW runtime memory buffer (64 MB) as cached with
snooping-disabled.

This fixes random long FW loading times and boot params memory
corruption on warmboot (due to missed wmb).

Fixes: 02d5b0aacd05 ("accel/ivpu: Implement firmware parsing and booting")
Signed-off-by: Karol Wachowski <[email protected]>
Reviewed-by: Stanislaw Gruszka <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

accel/ivpu/40xx: Fix missing VPUIP interrupts

Move sequence of masking and unmasking global interrupts from buttress
interrupt handler to generic one that handles both VPUIP and BTRS
interrupts.

Unmasking global interrupts will re-trigger MSI for any pending interrupts.
Lack of this sequence can randomly cause to miss any VPUIP interrupt that
comes after reading VPU_40XX_HOST_SS_ICB_STATUS_0 and before clearing
all active interrupt sources.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Karol Wachowski <[email protected]>
Reviewed-by: Stanislaw Gruszka <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

accel/ivpu/40xx: Disable frequency change interrupt

Do not enable frequency change interrupt on 40xx as it might
lead to an interrupt storm in current design.

FREQ_CHANGE interrupt is triggered on D0I2 entry which will cause
KMD to check VPU interrupt sources by reading VPUIP registers.
Access to those registers will toggle necessary clocks and trigger
another FREQ_CHANGE interrupt possibly ending in an infinite loop.

FREQ_CHANGE interrupt has only debug purposes and can be permanently
disabled.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Karol Wachowski <[email protected]>
Reviewed-by: Stanislaw Gruszka <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

accel/ivpu/40xx: Ensure clock resource ownership Ack before Power-Up

We need to wait for the CLOCK_RESOURCE_OWN_ACK bit to be set
after configuring the workpoint. This step ensures that the VPU
microcontroller clock is actively toggling and ready for operation.

Previously, we relied solely on the READY bit in the VPU_STATUS
register, which indicated the completion of the workpoint download.
However, this approach was insufficient, as the READY bit could be set
while the device was still running on a sideband clock until the PLL
locked. To guarantee that the PLL is locked and the device is running on
the main clock source, we now wait for the CLOCK_RESOURCE_OWN_ACK before
proceeding with the remainder of the power-up sequence.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Karol Wachowski <[email protected]>
Reviewed-by: Stanislaw Gruszka <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

accel/ivpu: Don't flood dmesg with VPU ready message

Use ivpu_dbg() to print the VPU ready message so it doesn't pollute
the dmesg.

Signed-off-by: Jacek Lawrynowicz <[email protected]>
Reviewed-by: Stanislaw Gruszka <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

accel/ivpu: Do not use wait event interruptible

If we receive signal when waiting for IPC message response in
ivpu_ipc_receive() we return error and continue to operate.
Then the driver can send another IPC messages and re-use occupied
slot of the message still processed by the firmware. This can result
in corrupting firmware memory and following FW crash with messages:

[ 3698.569719] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x1103, ret -512
[ 3698.569747] intel_vpu 0000:00:0b.0: [drm] ivpu_jsm_unregister_db(): Failed to unregister doorbell 3: -512
[ 3698.569756] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): IPC message vpu:0x88980000 not released by firmware
[ 3698.569763] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): JSM message vpu:0x88980040 not released by firmware
[ 3698.570234] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x110e, ret -512
[ 3698.570318] intel_vpu 0000:00:0b.0: [drm] *ERROR* ivpu_mmu_dump_event(): MMU EVTQ: 0x10 (Translation fault) SSID: 0 SID: 3, e[2] 00000000, e[3] 00000208, in addr: 0x88988000, fetch addr: 0x0

To fix the issue don't use interruptible variant of wait event to
allow firmware to finish IPC processing.

Fixes: 5d7422cfb498 ("accel/ivpu: Add IPC driver and JSM messages")
Reviewed-by: Karol Wachowski <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

MAINTAINERS: update nouveau maintainers

Since I will continue to work on Nouveau consistently, also beyond my
former and still ongoing VM_BIND/EXEC work, add myself to the list of
Nouveau maintainers.

Signed-off-by: Danilo Krummrich <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'wq-for-6.6-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:

- Remove double allocation of wq_update_pod_attrs_buf

- Fix missing allocation of pwq_release_worker when
   wq_cpu_intensive_thresh_us is set to a custom value

* tag 'wq-for-6.6-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Fix missed pwq_release_worker creation in wq_cpu_intensive_thresh_init()
  workqueue: Removed double allocation of wq_update_pod_attrs_buf

i915/guc: Get runtime pm in busyness worker only if already active

Ideally the busyness worker should take a gt pm wakeref because the
worker only needs to be active while gt is awake. However, the gt_park
path cancels the worker synchronously and this complicates the flow if
the worker is also running at the same time. The cancel waits for the
worker and when the worker releases the wakeref, that would call gt_park
and would lead to a deadlock.

The resolution is to take the global pm wakeref if runtime pm is already
active. If not, we don't need to update the busyness stats as the stats
would already be updated when the gt was parked.

Note:
- We do not requeue the worker if we cannot take a reference to runtime
  pm since intel_guc_busyness_unpark would requeue the worker in the
  resume path.

- If the gt was parked longer than time taken for GT timestamp to roll
  over, we ignore those rollovers since we don't care about tracking the
  exact GT time. We only care about roll overs when the gt is active and
  running workloads.

- There is a window of time between gt_park and runtime suspend, where
  the worker may run. This is acceptable since the worker will not find
  any new data to update busyness.

v2: (Daniele)
- Edit commit message and code comment
- Use runtime pm in the worker
- Put runtime pm after enabling the worker
- Use Link tag and add Fixes tag

v3: (Daniele)
- Reword commit and comments and add details

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7077
Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit e2f99b79d4c594cdf7ab449e338d4947f5ea8903)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915/gt: Fix reservation address in ggtt_reserve_guc_top

There is an assertion in ggtt_reserve_guc_top that the global GTT
is of size at least GUC_GGTT_TOP, which is not the case on a 32-bit
platform; see commit 562d55d991b39ce376c492df2f7890fd6a541ffc
("drm/i915/bdw: Only use 2g GGTT for 32b platforms"). If GEM_BUG_ON
is enabled, this triggers a BUG(); if GEM_BUG_ON is disabled, the
subsequent reservation fails and the driver fails to initialise
the device:

i915 0000:00:02.0: [drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC
i915 0000:00:02.0: Device initialization failed (-28)
i915 0000:00:02.0: Please file a bug on drm/i915; see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
i915: probe of 0000:00:02.0 failed with error -28

Make the reservation at the top of the available space, whatever
that is, instead of assuming that the top will be GUC_GGTT_TOP.

Fixes: 911800765ef6 ("drm/i915/uc: Reserve upper range of GGTT")
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9080
Signed-off-by: Javier Pello <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Fernando Pacheco <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: [email protected]
Cc: [email protected] # v5.3+
Signed-off-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 0f3fa942d91165c2702577e9274d2ee1c7212afc)
Signed-off-by: Rodrigo Vivi <[email protected]>

i915: Limit the length of an sg list to the requested length

The folio conversion changed the behaviour of shmem_sg_alloc_table() to
put the entire length of the last folio into the sg list, even if the sg
list should have been shorter. gen8_ggtt_insert_entries() relied on the
list being the right length and would overrun the end of the page tables.
Other functions may also have been affected.

Clamp the length of the last entry in the sg list to be the expected
length.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Fixes: 0b62af28f249 ("i915: convert shmem_sg_free_table() to use a folio_batch")
Cc: [email protected] # 6.5.x
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/9256
Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Oleksandr Natalenko <[email protected]>
Tested-by: Oleksandr Natalenko <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Signed-off-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 26a8e32e6d77900819c0c730fbfb393692dbbeea)
Signed-off-by: Rodrigo Vivi <[email protected]>

Merge tag 'for-6.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- delayed refs fixes:
     - fix race when refilling delayed refs block reserve
     - prevent transaction block reserve underflow when starting
       transaction
     - error message and value adjustments

- fix build warnings with CONFIG_CC_OPTIMIZE_FOR_SIZE and
   -Wmaybe-uninitialized

- fix for smatch report where uninitialized data from invalid extent
   buffer range could be returned to the caller

- fix numeric overflow in statfs when calculating lower threshold
   for a full filesystem

* tag 'for-6.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: initialize start_slot in btrfs_log_prealloc_extents
  btrfs: make sure to initialize start and len in find_free_dev_extent
  btrfs: reset destination buffer when read_extent_buffer() gets invalid range
  btrfs: properly report 0 avail for very full file systems
  btrfs: log message if extent item not found when running delayed extent op
  btrfs: remove redundant BUG_ON() from __btrfs_inc_extent_ref()
  btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1
  btrfs: prevent transaction block reserve underflow when starting transaction
  btrfs: fix race when refilling delayed refs block reserve

Merge tag 'linux-kselftest-fixes-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
"One single fix to unmount tracefs when test created mount"

* tag 'linux-kselftest-fixes-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/user_events: Fix to unmount tracefs when test created mount

Merge tag 'v6.6-rc4.vfs.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
"This contains the usual miscellaneous fixes and cleanups for vfs and
  individual fses:

  Fixes:
   - Revert ki_pos on error from buffered writes for direct io fallback
   - Add missing documentation for block device and superblock handling
     for changes merged this cycle
   - Fix reiserfs flexible array usage
   - Ensure that overlayfs sets ctime when setting mtime and atime
   - Disable deferred caller completions with overlayfs writes until
     proper support exists

  Cleanups:
   - Remove duplicate initialization in pipe code
   - Annotate aio kioctx_table with __counted_by"

* tag 'v6.6-rc4.vfs.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  overlayfs: set ctime when setting mtime and atime
  ntfs3: put resources during ntfs_fill_super()
  ovl: disable IOCB_DIO_CALLER_COMP
  porting: document superblock as block device holder
  porting: document new block device opening order
  fs/pipe: remove duplicate "offset" initializer
  fs-writeback: do not requeue a clean inode having skipped pages
  aio: Annotate struct kioctx_table with __counted_by
  direct_write_fallback(): on error revert the ->ki_pos update from buffered write
  reiserfs: Replace 1-element array with C99 style flex-array

Merge tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
"Build:

   - Update header files in the tools/**/include directory to sync with
     the kernel sources as usual.

   - Remove unused bpf-prologue files. While it's not strictly a fix,
     but the functionality was removed in this cycle so better to get
     rid of the code together.

   - Other minor build fixes.

  Misc:

   - Fix uninitialized memory access in PMU parsing code

   - Fix segfaults on software event"

* tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf jevent: fix core dump on software events on s390
  perf pmu: Ensure all alias variables are initialized
  perf jevents metric: Fix type of strcmp_cpuid_str
  perf trace: Avoid compile error wrt redefining bool
  perf bpf-prologue: Remove unused file
  tools headers UAPI: Update tools's copy of drm.h headers
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf bench sched-seccomp-notify: Use the tools copy of seccomp.h UAPI
  tools headers UAPI: Copy seccomp.h to be able to build 'perf bench' in older systems
  tools headers UAPI: Sync files changed by new fchmodat2 and map_shadow_stack syscalls with the kernel sources
  perf tools: Update copy of libbpf's hashmap.c

MAINTAINERS: aspeed: Update Andrew's email address

I've changed employers, have company email that deals with patch-based
workflows without too much of a headache, and am trying to steer some
content out of my personal mail.

Signed-off-by: Andrew Jeffery <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joel Stanley <[email protected]>

MAINTAINERS: aspeed: Update git tree URL

The description for joel/aspeed.git on git.kernel.org currently says:

Old Aspeed tree. Please see joel/bmc.git

Let's update MAINTAINERS accordingly.

Signed-off-by: Zev Weiss <[email protected]>
Acked-by: Joel Stanley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joel Stanley <[email protected]>

rbd: take header_rwsem in rbd_dev_refresh() only when updating

rbd_dev_refresh() has been holding header_rwsem across header and
parent info read-in unnecessarily for ages.  With commit 870611e4877e
("rbd: get snapshot context after exclusive lock is ensured to be
held"), the potential for deadlocks became much more real owning to
a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores
not allowing new readers after a writer is registered.

For example, assuming that I/O request 1, I/O request 2 and header
read-in request all target the same OSD:

1. I/O request 1 comes in and gets submitted
2. watch error occurs
3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and
   releases lock_rwsem
4. after reestablishing the watch, rbd_reregister_watch() calls
   rbd_dev_refresh() which takes header_rwsem for write and submits
   a header read-in request
5. I/O request 2 comes in: after taking lock_rwsem for read in
   __rbd_img_handle_request(), it blocks trying to take header_rwsem
   for read in rbd_img_object_requests()
6. another watch error occurs
7. rbd_watch_errcb() blocks trying to take lock_rwsem for write
8. I/O request 1 completion is received by the messenger but can't be
   processed because lock_rwsem won't be granted anymore
9. header read-in request completion can't be received, let alone
   processed, because the messenger is stranded

Change rbd_dev_refresh() to take header_rwsem only for actually
updating rbd_dev->header.  Header and parent info read-in don't need
any locking.

Cc: [email protected] # 0b035401c570: rbd: move rbd_dev_refresh() definition
Cc: [email protected] # 510a7330c82a: rbd: decouple header read-in from updating rbd_dev->header
Cc: [email protected] # c10311776f0a: rbd: decouple parent info read-in from updating rbd_dev
Cc: [email protected]
Fixes: 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: decouple parent info read-in from updating rbd_dev

Unlike header read-in, parent info read-in is already decoupled in
get_parent_info(), but it's buried in rbd_dev_v2_parent_info() along
with the processing logic.

Separate the initial read-in and update read-in logic into
rbd_dev_setup_parent() and rbd_dev_update_parent() respectively and
have rbd_dev_v2_parent_info() just populate struct parent_image_info
(i.e. what get_parent_info() did). Some existing QoI issues, like
flatten of a standalone clone being disregarded on refresh, remain.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: decouple header read-in from updating rbd_dev->header

Make rbd_dev_header_info() populate a passed struct rbd_image_header
instead of rbd_dev->header and introduce rbd_dev_update_header() for
updating mutable fields in rbd_dev->header upon refresh. The initial
read-in of both mutable and immutable fields in rbd_dev_image_probe()
passes in rbd_dev->header so no update step is required there.

rbd_init_layout() is now called directly from rbd_dev_image_probe()
instead of individually in format 1 and format 2 implementations.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

rbd: move rbd_dev_refresh() definition

Move rbd_dev_refresh() definition further down to avoid having to
move struct parent_image_info definition in the next commit. This
spares some forward declarations too.

Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>

block: fix kernel-doc for disk_force_media_change()

Drop one function parameter's kernel-doc comment since the parameter
was removed. This prevents a kernel-doc warning:

block/disk-events.c:300: warning: Excess function parameter 'events' description in 'disk_force_media_change'

Fixes: ab6860f62bfe ("block: simplify the disk_force_media_change interface")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: lore.kernel.org/r/[email protected]
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

firmware: arm_ffa: Don't set the memory region attributes for MEM_LEND

As per the FF-A specification: section "Usage of other memory region
attributes", in a transaction to donate memory or lend memory to a single
borrower, if the receiver is a PE or Proxy endpoint, the owner must not
specify the attributes and the relayer will return INVALID_PARAMETERS
if the attributes are set.

Let us not set the memory region attributes for MEM_LEND.

Fixes: 82a8daaecfd9 ("firmware: arm_ffa: Add support for MEM_LEND")
Reported-by: Joao Alves <[email protected]>
Reported-by: Olivier Deprez <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sudeep Holla <[email protected]>

iomap: add a workaround for racy i_size updates on block devices

A szybot reproducer that does write I/O while truncating the size of a
block device can end up in clean_bdev_aliases, which tries to clean the
bdev aliases that it uses.  This is because iomap_to_bh automatically
sets the BH_New flag when outside of i_size.  For block devices updates
to i_size are racy and we can hit this case in a tiny race window,
leading to the eventual clean_bdev_aliases call.  Fix this by erroring
out of > i_size I/O on block devices.

Reported-by: [email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: [email protected]
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

Merge tag 'fix-fix-iunlink-6.6_2023-09-25' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-fixesB

xfs: fix reloading the last iunlink item

It's not a good idea to be trying to send bug fixes to the mailing list
while also trying to take a vacation.  Dave sent some review comments
about the iunlink reloading patches, I changed them in djwong-dev, and
forgot to backport those changes to my -fixes tree.

As a result, the patch is missing some important pieces.  Perhaps
manually copying code diffs between email and two separate git trees
is archaic and stupid^W^W^W^Wisn't really a good idea?

Signed-off-by: Darrick J. Wong <[email protected]>
Signed-off-by: Chandan Babu R <[email protected]>
* tag 'fix-fix-iunlink-6.6_2023-09-25' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: fix reloading entire unlinked bucket lists

overlayfs: set ctime when setting mtime and atime

Nathan reported that he was seeing the new warning in
setattr_copy_mgtime pop when starting podman containers. Overlayfs is
trying to set the atime and mtime via notify_change without also
setting the ctime.

POSIX states that when the atime and mtime are updated via utimes() that
we must also update the ctime to the current time. The situation with
overlayfs copy-up is analogies, so add ATTR_CTIME to the bitmask.
notify_change will fill in the value.

Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Acked-by: Amir Goldstein <[email protected]>
Message-Id: <20230913-ctime-v1-1-c6bc509cbc27@kernel.org>
Signed-off-by: Christian Brauner <[email protected]>

ntfs3: put resources during ntfs_fill_super()

During ntfs_fill_super() some resources are allocated that we need to
cleanup in ->put_super() such as additional inodes. When
ntfs_fill_super() fails these resources need to be cleaned up as well.

Reported-by: [email protected]
Fixes: 78a06688a4d4 ("ntfs3: drop inode references in ntfs_put_super()")
Signed-off-by: Christian Brauner <[email protected]>

dt-bindings: spi: fsl-imx-cspi: Document missing entries

The imx25, imx50, imx51 and imx53 SPIs are compatible with the imx35.

Document them accordingly.

Signed-off-by: Fabio Estevam <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

ACPI: video: Fix NULL pointer dereference in acpi_video_bus_add()

acpi_video_bus_add_notify_handler() could free video->input and
set it to NULL on failure, but this failure would be missed in its
caller acpi_video_bus_add(). As a result, when an error happens in
acpi_dev_install_notify_handler(), acpi_video_bus_add() would call
acpi_video_bus_remove_notify_handler(), where a potential NULL pointer
video->input is dereferenced in input_unregister_device().

Fix this by adding a return value check and adjusting the following
error handling code.

Fixes: 6f7016819766 ("ACPI: video: Install Notify() handler directly")
Signed-off-by: Dinghao Liu <[email protected]>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled

While commit d4a5c59a955b ("mmc: au1xmmc: force non-modular build and
remove symbol_get usage") to be built in, it can still build a kernel
without MMC support and thuse no mmc_detect_change symbol at all.

Add ifdefs to build the mmc support code in the alchemy arch code
conditional on mmc support.

Fixes: d4a5c59a955b ("mmc: au1xmmc: force non-modular build and remove symbol_get usage")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Tested-by: Randy Dunlap <[email protected]> # build-tested
Signed-off-by: Thomas Bogendoerfer <[email protected]>

ovl: disable IOCB_DIO_CALLER_COMP

overlayfs copies the kiocb flags when it sets up a new kiocb to handle
a write, but it doesn't properly support dealing with the deferred
caller completions of the kiocb. This means it doesn't get the final
write completion value, and hence will complete the write with '0' as
the result.

We could support the caller completions in overlayfs, but for now let's
just disable them in the generated write kiocb.

Reported-by: Zorro Lang <[email protected]>
Link: https://lore.kernel.org/io-uring/20230924142754.ejwsjen5pvyc32l4@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/
Fixes: 8c052fb3002e ("iomap: support IOCB_DIO_CALLER_COMP")
Signed-off-by: Jens Axboe <[email protected]>
Message-Id: <71897125-e570-46ce-946a-d4729725e28f@kernel.dk>
Signed-off-by: Christian Brauner <[email protected]>

perf/x86/amd: Do not WARN() on every IRQ

Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI
handler, as shown below, several times while perf runs. A simple
`perf top` run is enough to render the system unusable:

  WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0

This happens because the Performance Counter Global Status Register
(PerfCntGlobalStatus) has one or more bits set which are considered
reserved according to the "AMD64 Architecture Programmer’s Manual,
Volume 2: System Programming, 24593":

  https://www.amd.com/system/files/TechDocs/24593.pdf

To make this less intrusive, warn just once if any reserved bit is set
and prompt the user to update the microcode. Also sanitize the value to
what the code is handling, so that the overflow events continue to be
handled for the number of counters that are known to be sane.

Going forward, the following microcode patch levels are recommended
for Zen 4 processors in order to avoid such issues with reserved bits:

  Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a10113e
  Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a10123e
  Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116
  Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00212

Commit f2eb058afc57 ("linux-firmware: Update AMD cpu microcode") from
the linux-firmware tree has binaries that meet the minimum required
patch levels.

  [ sandipan: - add message to prompt users to update microcode
              - rework commit message and call out required microcode levels ]

Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Reported-by: Jirka Hladky <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Signed-off-by: Sandipan Das <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/all/3540f985652f41041e54ee82aa53e7dbd55739ae.1694696888.git.sandipan.das@amd.com/

misc: rtsx: Fix some platforms can not boot and move the l1ss judgment to probe

commit 101bd907b424 ("misc: rtsx: judge ASPM Mode to set PETXCFG Reg")
some readers no longer force #CLKREQ to low
when the system need to enter ASPM.
But some platform maybe not implement complete ASPM?
it causes some platforms can not boot

Like in the past only the platform support L1ss we release the #CLKREQ.
Move the judgment (L1ss) to probe,
we think read config space one time when the driver start is enough

Fixes: 101bd907b424 ("misc: rtsx: judge ASPM Mode to set PETXCFG Reg")
Cc: stable <[email protected]>
Reported-by: Paul Grandperrin <[email protected]>
Signed-off-by: Ricky Wu <[email protected]>
Tested-By: Jade Lovelace <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched/rt: Make rt_rq->pushable_tasks updates drive rto_mask

Sebastian noted that the rto_push_work IRQ work can be queued for a CPU
that has an empty pushable_tasks list, which means nothing useful will be
done in the IPI other than queue the work for the next CPU on the rto_mask.

rto_push_irq_work_func() only operates on tasks in the pushable_tasks list,
but the conditions for that irq_work to be queued (and for a CPU to be
added to the rto_mask) rely on rq_rt->nr_migratory instead.

nr_migratory is increased whenever an RT task entity is enqueued and it has
nr_cpus_allowed > 1. Unlike the pushable_tasks list, nr_migratory includes a
rt_rq's current task. This means a rt_rq can have a migratible current, N
non-migratible queued tasks, and be flagged as overloaded / have its CPU
set in the rto_mask, despite having an empty pushable_tasks list.

Make an rt_rq's overload logic be driven by {enqueue,dequeue}_pushable_task().
Since rt_rq->{rt_nr_migratory,rt_nr_total} become unused, remove them.

Note that the case where the current task is pushed away to make way for a
migration-disabled task remains unchanged: the migration-disabled task has
to be in the pushable_tasks list in the first place, which means it has
nr_cpus_allowed > 1.

Reported-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

ata: libata-sata: increase PMP SRST timeout to 10s

On certain SATA controllers, softreset fails after wakeup from S2RAM with
the message "softreset failed (1st FIS failed)", sometimes resulting in
drives not being detected again. With the increased timeout, this issue
is avoided. Instead, "softreset failed (device not ready)" is now
logged 1-2 times; this later failure seems to cause fewer problems
however, and the drives are detected reliably once they've spun up and
the probe is retried.

The issue was observed with the primary SATA controller of the QNAP
TS-453B, which is an "Intel Corporation Celeron/Pentium Silver Processor
SATA Controller [8086:31e3] (rev 06)" integrated in the Celeron J4125 CPU,
and the following drives:

- Seagate IronWolf ST12000VN0008
- Seagate IronWolf ST8000NE0004

The SATA controller seems to be more relevant to this issue than the
drives, as the same drives are always detected reliably on the secondary
SATA controller on the same board (an ASMedia 106x) without any "softreset
failed" errors even without the increased timeout.

Fixes: e7d3ef13d52a ("libata: change drive ready wait after hard reset to 5s")
Cc: [email protected]
Signed-off-by: Matthias Schiffer <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>

Documentation: kbuild: explain handling optional dependencies

This problem frequently comes up in randconfig testing, with
drivers failing to link because of a dependency on an optional
feature.

The Kconfig language for this is very confusing, so try to
document it in "Kconfig hints" section.

Reviewed-by: Javier Martinez Canillas <[email protected]>
Reviewed-by: Sakari Ailus <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: Use CRC32 and a 1MiB dictionary for XZ compressed modules

Kmod is now (since kmod commit 09c9f8c5df04 ("libkmod: Use kernel
decompression when available")) using the kernel decompressor, when
loading compressed modules.

However, the kernel XZ decompressor is XZ Embedded, which doesn't
handle CRC64 and dictionaries larger than 1MiB.

Use CRC32 and 1MiB dictionary when XZ compressing and installing
kernel modules.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1050582
Signed-off-by: Martin Nybo Andersen <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES

For REPORT SUPPORTED OPERATION CODES command, the service action field is
defined as bits 0-4 in the second byte in the CDB. Bits 5-7 in the second
byte are reserved.

Only look at the service action field in the second byte when determining
if the MAINTENANCE IN opcode is a REPORT SUPPORTED OPERATION CODES command.

This matches how we only look at the service action field in the second
byte when determining if the SERVICE ACTION IN(16) opcode is a READ
CAPACITY(16) command (reserved bits 5-7 in the second byte are ignored).

Fixes: 7b2030942859 ("libata: Add support for SCT Write Same")
Cc: [email protected]
Signed-off-by: Niklas Cassel <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>

dt-bindings: ata: pata-common: Add missing additionalProperties on child nodes

The PATA child node schema is missing constraints to prevent unknown
properties. As none of the users of this common binding extend the child
nodes with additional properties, adding "additionalProperties: false"
here is sufficient.

Signed-off-by: Rob Herring <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>

accel/ivpu: Add Arrow Lake pci id

Enable VPU on Arrow Lake CPUs.

Reviewed-by: Krystian Pradzynski <[email protected]>
Reviewed-by: Karol Wachowski <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

i2c: mux: Avoid potential false error message in i2c_mux_add_adapter

I2C_CLASS_DEPRECATED is a flag and not an actual class.
There's nothing speaking against both, parent and child, having
I2C_CLASS_DEPRECATED set. Therefore exclude it from the check.

Signed-off-by: Heiner Kallweit <[email protected]>
Acked-by: Peter Rosin <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

dt-bindings: i2c: mxs: Pass ref and 'unevaluatedProperties: false'

Running 'make dtbs_check DT_SCHEMA_FILES=i2c-mxs.yaml' throws
several schema warnings such as:

imx28-m28evk.dtb: i2c@80058000: '#address-cells', '#size-cells', 'codec@a', 'eeprom@51', 'rtc@68' do not match any of the regexes: 'pinctrl-[0-9]+'
from schema $id: http://devicetree.org/schemas/i2c/i2c-mxs.yaml#

Fix these warnings by passing a reference to i2c-controller.yaml#
and using 'unevaluatedProperties: false' just like the yaml bindings
of other I2C controllers.

Signed-off-by: Fabio Estevam <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

arm64: dts: imx: Add imx8mm-prt8mm.dtb to build

imx8mm-prt8mm.dts was not getting built. Add it to the build.

Fixes: 58497d7a13ed ("arm64: dts: imx: add Protonic PRT8MM board")
Signed-off-by: Rob Herring <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>

xfs: fix reloading entire unlinked bucket lists

During review of the patcheset that provided reloading of the incore
iunlink list, Dave made a few suggestions, and I updated the copy in my
dev tree.  Unfortunately, I then got distracted by ... who even knows
what ... and forgot to backport those changes from my dev tree to my
release candidate branch.  I then sent multiple pull requests with stale
patches, and that's what was merged into -rc3.

So.

This patch re-adds the use of an unlocked iunlink list check to
determine if we want to allocate the resources to recreate the incore
list.  Since lost iunlinked inodes are supposed to be rare, this change
helps us avoid paying the transaction and AGF locking costs every time
we open any inode.

This also re-adds the shutdowns on failure, and re-applies the
restructuring of the inner loop in xfs_inode_reload_unlinked_bucket, and
re-adds a requested comment about the quotachecking code.

Retain the original RVB tag from Dave since there's no code change from
the last submission.

Fixes: 68b957f64fca1 ("xfs: load uncached unlinked inodes into memory on demand")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

Linux 6.6-rc3

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"ARM:

   - Fix EL2 Stage-1 MMIO mappings where a random address was used

   - Fix SMCCC function number comparison when the SVE hint is set

  RISC-V:

   - Fix KVM_GET_REG_LIST API for ISA_EXT registers

   - Fix reading ISA_EXT register of a missing extension

   - Fix ISA_EXT register handling in get-reg-list test

   - Fix filtering of AIA registers in get-reg-list test

  x86:

   - Fixes for TSC_AUX virtualization

   - Stop zapping page tables asynchronously, since we don't zap them as
     often as before"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX
  KVM: SVM: Fix TSC_AUX virtualization setup
  KVM: SVM: INTERCEPT_RDTSCP is never intercepted anyway
  KVM: x86/mmu: Stop zapping invalidated TDP MMU roots asynchronously
  KVM: x86/mmu: Do not filter address spaces in for_each_tdp_mmu_root_yield_safe()
  KVM: x86/mmu: Open code leaf invalidation from mmu_notifier
  KVM: riscv: selftests: Selectively filter-out AIA registers
  KVM: riscv: selftests: Fix ISA_EXT register handling in get-reg-list
  RISC-V: KVM: Fix riscv_vcpu_get_isa_ext_single() for missing extensions
  RISC-V: KVM: Fix KVM_GET_REG_LIST API for ISA_EXT registers
  KVM: selftests: Assert that vasprintf() is successful
  KVM: arm64: nvhe: Ignore SVE hint in SMCCC function ID
  KVM: arm64: Properly return allocated EL2 VA from hyp_alloc_private_va_range()

Merge tag 'trace-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Fix the "bytes" output of the per_cpu stat file

   The tracefs/per_cpu/cpu*/stats "bytes" was giving bogus values as the
   accounting was not accurate. It is suppose to show how many used
   bytes are still in the ring buffer, but even when the ring buffer was
   empty it would still show there were bytes used.

- Fix a bug in eventfs where reading a dynamic event directory (open)
   and then creating a dynamic event that goes into that diretory screws
   up the accounting.

   On close, the newly created event dentry will get a "dput" without
   ever having a "dget" done for it. The fix is to allocate an array on
   dir open to save what dentries were actually "dget" on, and what ones
   to "dput" on close.

* tag 'trace-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  eventfs: Remember what dentries were created on dir open
  ring-buffer: Fix bytes info in per_cpu buffer stats

Merge tag 'cxl-fixes-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl fixes from Dan Williams:
"A collection of regression fixes, bug fixes, and some small cleanups
  to the Compute Express Link code.

  The regressions arrived in the v6.5 dev cycle and missed the v6.6
  merge window due to my personal absences this cycle. The most
  important fixes are for scenarios where the CXL subsystem fails to
  parse valid region configurations established by platform firmware.
  This is important because agreement between OS and BIOS on the CXL
  configuration is fundamental to implementing "OS native" error
  handling, i.e. address translation and component failure
  identification.

  Other important fixes are a driver load error when the BIOS lets the
  Linux PCI core handle AER events, but not CXL memory errors.

  The other fixex might have end user impact, but for now are only known
  to trigger in our test/emulation environment.

  Summary:

   - Fix multiple scenarios where platform firmware defined regions fail
     to be assembled by the CXL core.

   - Fix a spurious driver-load failure on platforms that enable OS
     native AER, but not OS native CXL error handling.

   - Fix a regression detecting "poison" commands when "security"
     commands are also defined.

   - Fix a cxl_test regression with the move to centralize CXL port
     register enumeration in the CXL core.

   - Miscellaneous small fixes and cleanups"

* tag 'cxl-fixes-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/acpi: Annotate struct cxl_cxims_data with __counted_by
  cxl/port: Fix cxl_test register enumeration regression
  cxl/region: Refactor granularity select in cxl_port_setup_targets()
  cxl/region: Match auto-discovered region decoders by HPA range
  cxl/mbox: Fix CEL logic for poison and security commands
  cxl/pci: Replace host_bridge->native_aer with pcie_aer_is_native()
  PCI/AER: Export pcie_aer_is_native()
  cxl/pci: Fix appropriate checking for _OSC while handling CXL RAS registers

arm64: dts: imx8mm-evk: Fix hdmi@3d node

The hdmi@3d node's compatible string is "adi,adv7535" instead of
"adi,adv7533" or "adi,adv751*".

Fix the hdmi@3d node by means of:
* Use default register addresses for "cec", "edid" and "packet", because
there is no need to use a non-default address map.
* Add missing interrupt related properties.
* Drop "adi,input-*" properties which are only valid for adv751*.
* Add VDDEXT_3V3 fixed regulator
* Add "*-supply" properties, since most are required.
* Fix label names - s/adv7533/adv7535/.

Fixes: a27335b3f1e0 ("arm64: dts: imx8mm-evk: Add HDMI support")
Signed-off-by: Liu Ying <[email protected]>
Tested-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>

soc: imx8m: Enable OCOTP clock for imx8mm before reading registers

Commit 836fb30949d9 ("soc: imx8m: Enable OCOTP clock before reading the
register") added configuration to enable the OCOTP clock before
attempting to read from the associated registers.

This same kexec issue is present with the imx8m SoCs that use the
imx8mm_soc_uid function (e.g. imx8mp). This requires the imx8mm_soc_uid
function to configure the OCOTP clock before accessing the associated
registers. This change implements the same clock enable functionality
that is present in the imx8mq_soc_revision function for the
imx8mm_soc_uid function.

Signed-off-by: Nathan Rossi <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Fixes: 836fb30949d9 ("soc: imx8m: Enable OCOTP clock before reading the register")
Signed-off-by: Shawn Guo <[email protected]>

arm64: dts: imx8mp-beacon-kit: Fix audio_pll2 clock

Commit 16c984524862 ("arm64: dts: imx8mp: don't initialize audio clocks
from CCM node") removed the Audio clocks from the main clock node, because
the intent is to force people to setup the audio PLL clocks per board
instead of having a common set of rates since not all boards may use
the various audio PLL clocks for audio devices.

This resulted in an incorrect clock rate when attempting to playback
audio, since the AUDIO_PLL2 wasn't set any longer. Fix this by
setting the AUDIO_PLL2 rate inside the SAI3 node since it's the SAI3
that needs it.

Fixes: 16c984524862 ("arm64: dts: imx8mp: don't initialize audio clocks from CCM node")
Signed-off-by: Adam Ford <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>

arm64: dts: imx8mp: Fix SDMA2/3 clocks

Commit 16c984524862 ("arm64: dts: imx8mp: don't initialize audio clocks
from CCM node") removed the Audio clocks from the main clock node, because
the intent is to force people to setup the audio PLL clocks per board
instead of having a common set of rates, since not all boards may use
the various audio PLL clocks in the same way.

Unfortunately, with this parenting removed, the SDMA2 and SDMA3
clocks were slowed to 24MHz because the SDMA2/3 clocks are controlled
via the audio_blk_ctrl which is clocked from IMX8MP_CLK_AUDIO_ROOT,
and that clock is enabled by pgc_audio.

Per the TRM, "The SDMA2/3 target frequency is 400MHz IPG and 400MHz
AHB, always 1:1 mode, to make sure there is enough throughput for all
the audio use cases."

Instead of cluttering the clock node, place the clock rate and parent
information into the pgc_audio node.

With the parenting and clock rates restored for IMX8MP_CLK_AUDIO_AHB,
and IMX8MP_CLK_AUDIO_AXI_SRC, it appears the SDMA2 and SDMA3 run at
400MHz again.

Fixes: 16c984524862 ("arm64: dts: imx8mp: don't initialize audio clocks from CCM node")
Signed-off-by: Adam Ford <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>

sched/core: Refactor the task_flags check for worker sleeping in sched_submit_work()

Simplify the conditional logic for checking worker flags
by splitting the original compound `if` statement into
separate `if` and `else if` clauses.

This modification not only retains the previous functionality,
but also reduces a single `if` check, improving code clarity
and potentially enhancing performance.

Signed-off-by: Wang Jinchao <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/ZOIMvURE99ZRAYEj@fedora

sched/fair: Fix warning in bandwidth distribution

We've observed the following warning being hit in
distribute_cfs_runtime():

SCHED_WARN_ON(cfs_rq->runtime_remaining > 0)

We have the following race:

- CPU 0: running bandwidth distribution (distribute_cfs_runtime).
   Inspects the local cfs_rq and makes its runtime_remaining positive.
   However, we defer unthrottling the local cfs_rq until after
   considering all remote cfs_rq's.

- CPU 1: starts running bandwidth distribution from the slack timer. When
   it finds the cfs_rq for CPU 0 on the throttled list, it observers the
   that the cfs_rq is throttled, yet is not on the CSD list, and has a
   positive runtime_remaining, thus triggering the warning in
   distribute_cfs_runtime.

To fix this, we can rework the local unthrottling logic to put the local
cfs_rq on a local list, so that any future bandwidth distributions will
realize that the cfs_rq is about to be unthrottled.

Signed-off-by: Josh Don <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

sched/fair: Make cfs_rq->throttled_csd_list available on !SMP

This makes the following patch cleaner by avoiding extra CONFIG_SMP
conditionals on the availability of rq->throttled_csd_list.

Signed-off-by: Josh Don <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/kgdb: Fix a kerneldoc warning when build with W=1

When compiled with W=1, the following warning is generated:

arch/x86/kernel/kgdb.c:698: warning: Cannot understand *
on line 698 - I thought it was a doc line

Remove the corresponding empty comment line to fix the warning.

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Link: https://lore.kernel.org/r/aad659537c1d4ebd86912a6f0be458676c8e69af.1695401178.git.christophe.jaillet@wanadoo.fr

Merge tag 'gpio-fixes-for-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

- fix an invalid usage of __free(kfree) leading to kfreeing an
   ERR_PTR()

- fix an irq domain leak in gpio-tb10x

- MAINTAINERS update

* tag 'gpio-fixes-for-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: fix an invalid __free() usage
  gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()
  MAINTAINERS: gpio-regmap: make myself a maintainer of it

Merge tag 'mm-hotfixes-stable-2023-09-23-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"13 hotfixes, 10 of which pertain to post-6.5 issues. The other three
  are cc:stable"

* tag 'mm-hotfixes-stable-2023-09-23-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  proc: nommu: fix empty /proc/<pid>/maps
  filemap: add filemap_map_order0_folio() to handle order0 folio
  proc: nommu: /proc/<pid>/maps: release mmap read lock
  mm: memcontrol: fix GFP_NOFS recursion in memory.high enforcement
  pidfd: prevent a kernel-doc warning
  argv_split: fix kernel-doc warnings
  scatterlist: add missing function params to kernel-doc
  selftests/proc: fixup proc-empty-vm test after KSM changes
  revert "scripts/gdb/symbols: add specific ko module load command"
  selftests: link libasan statically for tests with -fsanitize=address
  task_work: add kerneldoc annotation for 'data' argument
  mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list
  sh: mm: re-add lost __ref to ioremap_prot() to fix modpost warning

Merge tag '6.6-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
"Six smb3 client fixes, including three for stable, from the SMB
  plugfest (testing event) this week:

   - Reparse point handling fix (found when investigating dir
     enumeration when fifo in dir)

   - Fix excessive thread creation for dir lease cleanup

   - UAF fix in negotiate path

   - remove duplicate error message mapping and fix confusing warning
     message

   - add dynamic trace point to improve debugging RDMA connection
     attempts"

* tag '6.6-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: fix confusing debug message
  smb: client: handle STATUS_IO_REPARSE_TAG_NOT_HANDLED
  smb3: remove duplicate error mapping
  cifs: Fix UAF in cifs_demultiplex_thread()
  smb3: do not start laundromat thread when dir leases  disabled
  smb3: Add dynamic trace points for RDMA (smbdirect) reconnect

Merge tag 'i2c-for-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"A set of I2C driver fixes. Mostly fixing resource leaks or sanity
  checks"

* tag 'i2c-for-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: xiic: Correct return value check for xiic_reinit()
  i2c: mux: gpio: Add missing fwnode_handle_put()
  i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()
  i2c: designware: fix __i2c_dw_disable() in case master is holding SCL low
  i2c: i801: unregister tco_pdev in i801_probe() error path

mfd: cs42l43: Use correct macro for new-style PM runtime ops

The code was accidentally mixing new and old style macros, update the
macros used to remove an unused function warning whilst building with
no PM enabled in the config.

Fixes: ace6d1448138 ("mfd: cs42l43: Add support for cs42l43 core driver")
Signed-off-by: Charles Keepax <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Reviewed-by: Nathan Chancellor <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Acked-by: Lee Jones <[email protected]>
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'loongarch-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
"Fix lockdep, fix a boot failure, fix some build warnings, fix document
  links, and some cleanups"

* tag 'loongarch-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  docs/zh_CN/LoongArch: Update the links of ABI
  docs/LoongArch: Update the links of ABI
  LoongArch: Don't inline kasan_mem_to_shadow()/kasan_shadow_to_mem()
  kasan: Cleanup the __HAVE_ARCH_SHADOW_MAP usage
  LoongArch: Set all reserved memblocks on Node#0 at initialization
  LoongArch: Remove dead code in relocate_new_kernel
  LoongArch: Use _UL() and _ULL()
  LoongArch: Fix some build warnings with W=1
  LoongArch: Fix lockdep static memory detection

Merge tag 's390-6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

- Fix potential string buffer overflow in hypervisor user-defined
   certificates handling

- Update defconfigs

* tag 's390-6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/cert_store: fix string length handling
  s390: update defconfigs

Merge tag 'iomap-6.6-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap fixes from Darrick Wong:

- Return EIO on bad inputs to iomap_to_bh instead of BUGging, to deal
   less poorly with block device io racing with block device resizing

- Fix a stale page data exposure bug introduced in 6.6-rc1 when
   unsharing a file range that is not in the page cache

* tag 'iomap-6.6-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: convert iomap_unshare_iter to use large folios
  iomap: don't skip reading in !uptodate folios when unsharing a range
  iomap: handle error conditions more gracefully in iomap_to_bh

Merge tag 'kvm-riscv-fixes-6.6-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.6, take #1

- Fix KVM_GET_REG_LIST API for ISA_EXT registers
- Fix reading ISA_EXT register of a missing extension
- Fix ISA_EXT register handling in get-reg-list test
- Fix filtering of AIA registers in get-reg-list test

KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX

When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
within the VMSA. This means that the guest value is loaded on VMRUN and
the host value is restored from the host save area on #VMEXIT.

Since the value is restored on #VMEXIT, the KVM user return MSR support
for TSC_AUX can be replaced by populating the host save area with the
current host value of TSC_AUX. And, since TSC_AUX is not changed by Linux
post-boot, the host save area can be set once in svm_hardware_enable().
This eliminates the two WRMSR instructions associated with the user return
MSR support.

Signed-off-by: Tom Lendacky <[email protected]>
Message-Id: <d381de38eb0ab6c9c93dda8503b72b72546053d7.1694811272 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: SVM: Fix TSC_AUX virtualization setup

The checks for virtualizing TSC_AUX occur during the vCPU reset processing
path. However, at the time of initial vCPU reset processing, when the vCPU
is first created, not all of the guest CPUID information has been set. In
this case the RDTSCP and RDPID feature support for the guest is not in
place and so TSC_AUX virtualization is not established.

This continues for each vCPU created for the guest. On the first boot of
an AP, vCPU reset processing is executed as a result of an APIC INIT
event, this time with all of the guest CPUID information set, resulting
in TSC_AUX virtualization being enabled, but only for the APs. The BSP
always sees a TSC_AUX value of 0 which probably went unnoticed because,
at least for Linux, the BSP TSC_AUX value is 0.

Move the TSC_AUX virtualization enablement out of the init_vmcb() path and
into the vcpu_after_set_cpuid() path to allow for proper initialization of
the support after the guest CPUID information has been set.

With the TSC_AUX virtualization support now in the vcpu_set_after_cpuid()
path, the intercepts must be either cleared or set based on the guest
CPUID input.

Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Tom Lendacky <[email protected]>
Message-Id: <4137fbcb9008951ab5f0befa74a0399d2cce809a.1694811272 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: SVM: INTERCEPT_RDTSCP is never intercepted anyway

svm_recalc_instruction_intercepts() is always called at least once
before the vCPU is started, so the setting or clearing of the RDTSCP
intercept can be dropped from the TSC_AUX virtualization support.

Extracted from a patch by Tom Lendacky.

Cc: [email protected]
Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86/mmu: Stop zapping invalidated TDP MMU roots asynchronously

Stop zapping invalidate TDP MMU roots via work queue now that KVM
preserves TDP MMU roots until they are explicitly invalidated.  Zapping
roots asynchronously was effectively a workaround to avoid stalling a vCPU
for an extended during if a vCPU unloaded a root, which at the time
happened whenever the guest toggled CR0.WP (a frequent operation for some
guest kernels).

While a clever hack, zapping roots via an unbound worker had subtle,
unintended consequences on host scheduling, especially when zapping
multiple roots, e.g. as part of a memslot.  Because the work of zapping a
root is no longer bound to the task that initiated the zap, things like
the CPU affinity and priority of the original task get lost.  Losing the
affinity and priority can be especially problematic if unbound workqueues
aren't affined to a small number of CPUs, as zapping multiple roots can
cause KVM to heavily utilize the majority of CPUs in the system, *beyond*
the CPUs KVM is already using to run vCPUs.

When deleting a memslot via KVM_SET_USER_MEMORY_REGION, the async root
zap can result in KVM occupying all logical CPUs for ~8ms, and result in
high priority tasks not being scheduled in in a timely manner.  In v5.15,
which doesn't preserve unloaded roots, the issues were even more noticeable
as KVM would zap roots more frequently and could occupy all CPUs for 50ms+.

Consuming all CPUs for an extended duration can lead to significant jitter
throughout the system, e.g. on ChromeOS with virtio-gpu, deleting memslots
is a semi-frequent operation as memslots are deleted and recreated with
different host virtual addresses to react to host GPU drivers allocating
and freeing GPU blobs.  On ChromeOS, the jitter manifests as audio blips
during games due to the audio server's tasks not getting scheduled in
promptly, despite the tasks having a high realtime priority.

Deleting memslots isn't exactly a fast path and should be avoided when
possible, and ChromeOS is working towards utilizing MAP_FIXED to avoid the
memslot shenanigans, but KVM is squarely in the wrong.  Not to mention
that removing the async zapping eliminates a non-trivial amount of
complexity.

Note, one of the subtle behaviors hidden behind the async zapping is that
KVM would zap invalidated roots only once (ignoring partial zaps from
things like mmu_notifier events).  Preserve this behavior by adding a flag
to identify roots that are scheduled to be zapped versus roots that have
already been zapped but not yet freed.

Add a comment calling out why kvm_tdp_mmu_invalidate_all_roots() can
encounter invalid roots, as it's not at all obvious why zapping
invalidated roots shouldn't simply zap all invalid roots.

Reported-by: Pattara Teerapong <[email protected]>
Cc: David Stevens <[email protected]>
Cc: Yiwei Zhang<[email protected]>
Cc: Paul Hsia <[email protected]>
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <20230916003916.2545000 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86/mmu: Do not filter address spaces in for_each_tdp_mmu_root_yield_safe()

All callers except the MMU notifier want to process all address spaces.
Remove the address space ID argument of for_each_tdp_mmu_root_yield_safe()
and switch the MMU notifier to use __for_each_tdp_mmu_root_yield_safe().

Extracted out of a patch by Sean Christopherson <[email protected]>

Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>