Dave Airlie [Fri, 12 Apr 2024 01:01:44 +0000 (11:01 +1000)]
Merge tag 'drm-msm-next-2024-04-11' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.9
Display:
- Fixes for PM refcount leak when DP goes to disconnected state and
also when link training fails. This is also one of the issues found
with the pm runtime series
- Add missing newlines to prints in msm_fb and msm_kms
- Change permissions of some dpu debugfs entries which write to const
data from catalog to read-only to avoid protection faults
- Fix the interface table for the catalog of X1E80100. This is an
important fix to bringup DP for X1E80100.
- Logging fix to print the callback symbol in the invalid IRQ message
case rather than printing when its known to be NULL.
- Bindings fix to add DP node as child of mdss for mdss node
- Minor typo fix in DP driver API which handles port status change
GPU:
- fix CHRASHDUMP_READ()
- fix HHB (highest bank bit) for a619 to fix UBWC corruption
Lucas De Marchi [Fri, 5 Apr 2024 20:07:11 +0000 (13:07 -0700)]
drm/xe/display: Fix double mutex initialization
All of these mutexes are already initialized by the display side since
commit 3fef3e6ff86a ("drm/i915: move display mutex inits to display
code"), so the xe shouldn´t initialize them.
Harry Wentland [Thu, 21 Mar 2024 15:13:38 +0000 (11:13 -0400)]
drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST
The previous check for the is_vsc_sdp_colorimetry_supported flag
for MST sink signals did nothing. Simplify the code and use the
same check for MST and SST.
Harry Wentland [Tue, 12 Mar 2024 15:55:52 +0000 (11:55 -0400)]
drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4
In order for display colorimetry to work correctly on DP displays
we need to send the VSC SDP packet. We should only do so for
panels with DPCD revision greater or equal to 1.4 as older
receivers might have problems with it.
Wenjing Liu [Fri, 22 Mar 2024 19:02:45 +0000 (15:02 -0400)]
drm/amd/display: always reset ODM mode in context when adding first plane
[why]
In current implemenation ODM mode is only reset when the last plane is
removed from dc state. For any dc validate we will always remove all
current planes and add new planes. However when switching from no planes
to 1 plane, ODM mode is not reset because no planes get removed. This
has caused an issue where we kept ODM combine when it should have been
remove when a plane is added. The change is to reset ODM mode when
adding the first plane.
Zhigang Luo [Mon, 18 Mar 2024 18:13:10 +0000 (14:13 -0400)]
amd/amdkfd: sync all devices to wait all processes being evicted
If there are more than one device doing reset in parallel, the first
device will call kfd_suspend_all_processes() to evict all processes
on all devices, this call takes time to finish. other device will
start reset and recover without waiting. if the process has not been
evicted before doing recover, it will be restored, then caused page
fault.
Lijo Lazar [Wed, 6 Mar 2024 11:35:07 +0000 (17:05 +0530)]
drm/amdgpu: Fix VCN allocation in CPX partition
VCN need not be shared in CPX mode always for all GFX 9.4.3 SOC SKUs. In
certain configs, VCN instance can be exclusively allocated to a
partition even under CPX mode.
Alex Hung [Sat, 16 Mar 2024 03:25:25 +0000 (21:25 -0600)]
drm/amd/display: Skip on writeback when it's not applicable
[WHY]
dynamic memory safety error detector (KASAN) catches and generates error
messages "BUG: KASAN: slab-out-of-bounds" as writeback connector does not
support certain features which are not initialized.
[HOW]
Skip them when connector type is DRM_MODE_CONNECTOR_WRITEBACK.
shaoyunl [Fri, 22 Mar 2024 16:25:16 +0000 (12:25 -0400)]
drm/amdgpu : Add mes_log_enable to control mes log feature
The MES log might slow down the performance for extra step of log the data,
disable it by default and introduce a parameter can enable it when necessary
Lijo Lazar [Wed, 14 Feb 2024 12:25:54 +0000 (17:55 +0530)]
drm/amdgpu: Reset dGPU if suspend got aborted
For SOC21 ASICs, there is an issue in re-enabling PM features if a
suspend got aborted. In such cases, reset the device during resume
phase. This is a workaround till a proper solution is finalized.
Currently, with F32 HWS GPU reset is only when unmap queue fails.
However, if compute queue doesn't repond to preemption request in time
unmap will return without any error. In this case, only preemption error
is logged and Reset is not triggered. Call GPU reset in this case also.
Ville Syrjälä [Thu, 4 Apr 2024 21:34:29 +0000 (00:34 +0300)]
drm/i915/vrr: Disable VRR when using bigjoiner
All joined pipes share the same transcoder/timing generator.
Currently we just do the commits per-pipe, which doesn't really
work if we need to change switch between non-VRR and VRR timings
generators on the fly, or even when sending the push to the
transcoder. For now just disable VRR when bigjoiner is needed.
Ville Syrjälä [Thu, 4 Apr 2024 21:34:28 +0000 (00:34 +0300)]
drm/i915: Disable live M/N updates when using bigjoiner
All joined pipes share the same transcoder/timing generator.
Currently we just do the commits per-pipe, which doesn't really
work if we need to change the timings at the same time. For
now just disable live M/N updates when bigjoiner is needed.
Ville Syrjälä [Thu, 4 Apr 2024 21:34:27 +0000 (00:34 +0300)]
drm/i915: Disable port sync when bigjoiner is used
The current modeset sequence can't handle port sync and bigjoiner
at the same time. Refuse port sync when bigjoiner is needed,
at least until we fix the modeset sequence.
Ville Syrjälä [Thu, 4 Apr 2024 21:34:26 +0000 (00:34 +0300)]
drm/i915/psr: Disable PSR when bigjoiner is used
Bigjoiner seem to be causing all kinds of grief to the PSR
code currently. I don't believe there is any hardware issue
but the code simply not handling this correctly. For now
just disable PSR when bigjoiner is needed.
John Harrison [Fri, 29 Mar 2024 23:53:05 +0000 (16:53 -0700)]
drm/i915/guc: Fix the fix for reset lock confusion
The previous fix for the circlular lock splat about the busyness
worker wasn't quite complete. Even though the reset-in-progress flag
is cleared at the start of intel_uc_reset_finish, the entire function
is still inside the reset mutex lock. Not sure why the patch appeared
to fix the issue both locally and in CI. However, it is now back
again.
There is a further complication that the wedge code path within
intel_gt_reset() jumps around so much that it results in nested
reset_prepare/_finish calls. That is, the call sequence is:
intel_gt_reset
| reset_prepare
| __intel_gt_set_wedged
| | reset_prepare
| | reset_finish
| reset_finish
The nested finish means that even if the clear of the in-progress flag
was moved to the end of _finish, it would still be clear for the
entire second call. Surprisingly, this does not seem to be causing any
other problems at present.
As an aside, a wedge on fini does not call the finish functions at
all. The reset_in_progress flag is left set (twice).
So instead of trying to cancel the worker anywhere at all in the reset
path, just add a cancel to intel_guc_submission_fini instead. Note
that it is not a problem if the worker is still active during a reset.
Either it will run before the reset path starts locking things and
will simply block the reset code for a tiny amount of time. Or it will
run after the locks have been acquired and will early exit due to the
try-lock.
Also, do not use the reset-in-progress flag to decide whether a
synchronous cancel is safe (from a lockdep perspective) or not.
Instead, use the actual reset mutex state (both the genuine one and
the custom rolled BACKOFF one).
Ville Syrjälä [Tue, 2 Apr 2024 15:50:04 +0000 (18:50 +0300)]
drm/i915/cdclk: Fix voltage_level programming edge case
Currently we only consider the relationship of the
old and new CDCLK frequencies when determining whether
to do the repgramming from intel_set_cdclk_pre_plane_update()
or intel_set_cdclk_post_plane_update().
It is technically possible to have a situation where the
CDCLK frequency is decreasing, but the voltage_level is
increasing due a DDI port. In this case we should bump
the voltage level already in intel_set_cdclk_pre_plane_update()
(so that the voltage_level will have been increased by the
time the port gets enabled), while leaving the CDCLK frequency
unchanged (as active planes/etc. may still depend on it).
We can then reduce the CDCLK frequency to its final value
from intel_set_cdclk_post_plane_update().
In order to handle that correctly we shall construct a
suitable amalgam of the old and new cdclk states in
intel_set_cdclk_pre_plane_update().
And we can simply call intel_set_cdclk() unconditionally
in both places as it will not do anything if nothing actually
changes vs. the current hw state.
v2: Handle cdclk_state->disable_pipes
v3: Only synchronize the cd2x update against the pipe's vblank
when the cdclk frequency is changing during the current
commit phase (Gustavo)
Ville Syrjälä [Tue, 2 Apr 2024 15:50:03 +0000 (18:50 +0300)]
drm/i915/cdclk: Fix CDCLK programming order when pipes are active
Currently we always reprogram CDCLK from the
intel_set_cdclk_pre_plane_update() when using squash/crawl.
The code only works correctly for the cd2x update or full
modeset cases, and it was simply never updated to deal with
squash/crawl.
If the CDCLK frequency is increasing we must reprogram it
before we do anything else that might depend on the new
higher frequency, and conversely we must not decrease
the frequency until everything that might still depend
on the old higher frequency has been dealt with.
Since cdclk_state->pipe is only relevant when doing a cd2x
update we can't use it to determine the correct sequence
during squash/crawl. To that end introduce cdclk_state->disable_pipes
which simply indicates that we must perform the update
while the pipes are disable (ie. during
intel_set_cdclk_pre_plane_update()). Otherwise we use the
same old vs. new CDCLK frequency comparsiong as for cd2x
updates.
The only remaining problem case is when the voltage_level
needs to increase due to a DDI port, but the CDCLK frequency
is decreasing (and not all pipes are being disabled). The
current approach will not bump the voltage level up until
after the port has already been enabled, which is too late.
But we'll take care of that case separately.
v2: Don't break the "must disable pipes case"
v3: Keep the on stack 'pipe' for future use
Ville Syrjälä [Thu, 4 Apr 2024 20:33:25 +0000 (23:33 +0300)]
drm/client: Fully protect modes[] with dev->mode_config.mutex
The modes[] array contains pointers to modes on the connectors'
mode lists, which are protected by dev->mode_config.mutex.
Thus we need to extend modes[] the same protection or by the
time we use it the elements may already be pointing to
freed/reused memory.
ivpu_device->context_xa is locked both in kernel thread and IRQ context.
It requires XA_FLAGS_LOCK_IRQ flag to be passed during initialization
otherwise the lock could be acquired from a thread and interrupted by
an IRQ that locks it for the second time causing the deadlock.
This deadlock was reported by lockdep and observed in internal tests.
accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE
DRM_IVPU_PARAM_CORE_CLOCK_RATE returns current NPU frequency which
could be 0 if device was sleeping. This value isn't really useful to
the user space, so return max freq instead which can be used to estimate
NPU performance.
This patch improves readability and clarity of MMU error messages.
Previously, the error strings were somewhat confusing and could lead to
ambiguous interpretations, making it difficult to diagnose issues.
accel/ivpu: Put NPU back to D3hot after failed resume
Put NPU in D3hot after ivpu_resume() fails to power up the device.
This will assure that D3->D0 power cycle will be performed before
the next resume and also will minimize power usage in this corner case.
In case of failed power up we end up left in PCI D3hot
state making it impossible to access NPU registers on retry.
Enter D0 state on retry before proceeding with power up sequence.
Merge tag 'x86-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix MCE timer reinit locking
- Fix/improve CoCo guest random entropy pool init
- Fix SEV-SNP late disable bugs
- Fix false positive objtool build warning
- Fix header dependency bug
- Fix resctrl CPU offlining bug
* tag 'x86-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk
x86/mce: Make sure to grab mce_sysfs_mutex in set_bank()
x86/CPU/AMD: Track SNP host status with cc_platform_*()
x86/cc: Add cc_platform_set/_clear() helpers
x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM
x86/coco: Require seeding RNG with RDRAND on CoCo systems
x86/numa/32: Include missing <asm/pgtable_areas.h>
x86/resctrl: Fix uninitialized memory read when last CPU of domain goes offline
Merge tag 'timers-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"Fix various timer bugs:
- Fix a timer migration bug that may result in missed events
- Fix timer migration group hierarchy event updates
- Fix a PowerPC64 build warning
- Fix a handful of DocBook annotation bugs"
* tag 'timers-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers/migration: Return early on deactivation
timers/migration: Fix ignored event due to missing CPU update
vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h
timers: Fix text inconsistencies and spelling
tick/sched: Fix struct tick_sched doc warnings
tick/sched: Fix various kernel-doc warnings
timers: Fix kernel-doc format and add Return values
time/timekeeping: Fix kernel-doc warnings and typos
time/timecounter: Fix inline documentation
Merge tag 'perf-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fix from Ingo Molnar:
"Fix a combined PEBS events bug on x86 Intel CPUs"
* tag 'perf-urgent-2024-04-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/ds: Don't clear ->pebs_data_cfg for the last PEBS event
Merge tag 'nfsd-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Address a slow memory leak with RPC-over-TCP
- Prevent another NFS4ERR_DELAY loop during CREATE_SESSION
* tag 'nfsd-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
SUNRPC: Fix a slow server-side memory leak with RPC-over-TCP
Merge tag '6.9-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- fix to retry close to avoid potential handle leaks when server
returns EBUSY
- DFS fixes including a fix for potential use after free
- fscache fix
- minor strncpy cleanup
- reconnect race fix
- deal with various possible UAF race conditions tearing sessions down
* tag '6.9-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix potential UAF in cifs_signal_cifsd_for_reconnect()
smb: client: fix potential UAF in smb2_is_network_name_deleted()
smb: client: fix potential UAF in is_valid_oplock_break()
smb: client: fix potential UAF in smb2_is_valid_oplock_break()
smb: client: fix potential UAF in smb2_is_valid_lease_break()
smb: client: fix potential UAF in cifs_stats_proc_show()
smb: client: fix potential UAF in cifs_stats_proc_write()
smb: client: fix potential UAF in cifs_dump_full_key()
smb: client: fix potential UAF in cifs_debug_files_proc_show()
smb3: retrying on failed server close
smb: client: serialise cifs_construct_tcon() with cifs_mount_mutex
smb: client: handle DFS tcons in cifs_construct_tcon()
smb: client: refresh referral without acquiring refpath_lock
smb: client: guarantee refcounted children from parent session
cifs: Fix caching to try to do open O_WRONLY as rdwr on server
smb: client: fix UAF in smb2_reconnect_server()
smb: client: replace deprecated strncpy with strscpy
x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk
srso_alias_untrain_ret() is special code, even if it is a dummy
which is called in the !SRSO case, so annotate it like its real
counterpart, to address the following objtool splat:
vmlinux.o: warning: objtool: .export_symbol+0x2b290: data relocation to !ENDBR: srso_alias_untrain_ret+0x0
Merge tag 'firewire-fixes-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixes from Takashi Sakamoto:
"The firewire-ohci kernel module has a parameter for verbose kernel
logging. It is well-known that it logs the spurious IRQ for bus-reset
event due to the unmasked register for IRQ event. This update fixes
the issue"
* tag 'firewire-fixes-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: ohci: mask bus reset interrupts between ISR and bottom half
Adam Goldman [Sun, 24 Mar 2024 22:38:41 +0000 (07:38 +0900)]
firewire: ohci: mask bus reset interrupts between ISR and bottom half
In the FireWire OHCI interrupt handler, if a bus reset interrupt has
occurred, mask bus reset interrupts until bus_reset_work has serviced and
cleared the interrupt.
Normally, we always leave bus reset interrupts masked. We infer the bus
reset from the self-ID interrupt that happens shortly thereafter. A
scenario where we unmask bus reset interrupts was introduced in 2008 in a007bb857e0b26f5d8b73c2ff90782d9c0972620: If
OHCI_PARAM_DEBUG_BUSRESETS (8) is set in the debug parameter bitmask, we
will unmask bus reset interrupts so we can log them.
irq_handler logs the bus reset interrupt. However, we can't clear the bus
reset event flag in irq_handler, because we won't service the event until
later. irq_handler exits with the event flag still set. If the
corresponding interrupt is still unmasked, the first bus reset will
usually freeze the system due to irq_handler being called again each
time it exits. This freeze can be reproduced by loading firewire_ohci
with "modprobe firewire_ohci debug=-1" (to enable all debugging output).
Apparently there are also some cases where bus_reset_work will get called
soon enough to clear the event, and operation will continue normally.
This freeze was first reported a few months after a007bb85 was committed,
but until now it was never fixed. The debug level could safely be set
to -1 through sysfs after the module was loaded, but this would be
ineffectual in logging bus reset interrupts since they were only
unmasked during initialization.
irq_handler will now leave the event flag set but mask bus reset
interrupts, so irq_handler won't be called again and there will be no
freeze. If OHCI_PARAM_DEBUG_BUSRESETS is enabled, bus_reset_work will
unmask the interrupt after servicing the event, so future interrupts
will be caught as desired.
As a side effect to this change, OHCI_PARAM_DEBUG_BUSRESETS can now be
enabled through sysfs in addition to during initial module loading.
However, when enabled through sysfs, logging of bus reset interrupts will
be effective only starting with the second bus reset, after
bus_reset_work has executed.
Merge tag 'spi-fix-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few small driver specific fixes, the most important being the
s3c64xx change which is likely to be hit during normal operation"
* tag 'spi-fix-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: mchp-pci1xxx: Fix a possible null pointer dereference in pci1xxx_spi_probe
spi: spi-fsl-lpspi: remove redundant spi_controller_put call
spi: s3c64xx: Use DMA mode from fifo size
Merge tag 'regmap-fix-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Richard found a nasty corner case in the maple tree code which he
fixed, and also fixed a compiler warning which was showing up with the
toolchain he uses and helpfully identified a possible incorrect error
code which could have runtime impacts"
* tag 'regmap-fix-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: maple: Fix uninitialized symbol 'ret' warnings
regmap: maple: Fix cache corruption in regcache_maple_drop()
* tag 'block-6.9-20240405' of git://git.kernel.dk/linux:
nvme-fc: rename free_ctrl callback to match name pattern
nvmet-fc: move RCU read lock to nvmet_fc_assoc_exists
nvmet: implement unique discovery NQN
nvme: don't create a multipath node for zero capacity devices
nvme: split nvme_update_zone_info
nvme-multipath: don't inherit LBA-related fields for the multipath node
block: fix overflow in blk_ioctl_discard()
nullblk: Fix cleanup order in null_add_dev() error path
Merge tag 'io_uring-6.9-20240405' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Backport of some fixes that came up during development of the 6.10
io_uring patches. This includes some kbuf cleanups and reference
fixes.
- Disable multishot read if we don't have NOWAIT support on the target
- Fix for a dependency issue with workqueue flushing
* tag 'io_uring-6.9-20240405' of git://git.kernel.dk/linux:
io_uring/kbuf: hold io_buffer_list reference over mmap
io_uring/kbuf: protect io_buffer_list teardown with a reference
io_uring/kbuf: get rid of bl->is_ready
io_uring/kbuf: get rid of lower BGID lists
io_uring: use private workqueue for exit work
io_uring: disable io-wq execution of multishot NOWAIT requests
io_uring/rw: don't allow multishot reads without NOWAIT support
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The most important is the libsas fix, which is a problem for DMA to a
kmalloc'd structure too small causing cache line interference. The
other fixes (all in drivers) are mostly for allocation length fixes,
error leg unwinding, suspend races and a missing retry"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Fix MCQ mode dev command timeout
scsi: libsas: Align SMP request allocation to ARCH_DMA_MINALIGN
scsi: sd: Unregister device if device_add_disk() failed in sd_probe()
scsi: ufs: core: WLUN suspend dev/link state error recovery
scsi: mylex: Fix sysfs buffer lengths
Merge tag 'mm-hotfixes-stable-2024-04-05-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"8 hotfixes, 3 are cc:stable
There are a couple of fixups for this cycle's vmalloc changes and one
for the stackdepot changes. And a fix for a very old x86 PAT issue
which can cause a warning splat"
* tag 'mm-hotfixes-stable-2024-04-05-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
stackdepot: rename pool_index to pool_index_plus_1
x86/mm/pat: fix VM_PAT handling in COW mappings
MAINTAINERS: change vmware.com addresses to broadcom.com
selftests/mm: include strings.h for ffsl
mm: vmalloc: fix lockdep warning
mm: vmalloc: bail out early in find_vmap_area() if vmap is not init
init: open output files from cpio unpacking with O_LARGEFILE
mm/secretmem: fix GUP-fast succeeding on secretmem folios
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"arm64/ptrace fix to use the correct SVE layout based on the saved
floating point state rather than the TIF_SVE flag. The latter may be
left on during syscalls even if the SVE state is discarded"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/ptrace: Use saved floating point state type to determine SVE layout
Merge tag 'riscv-for-linus-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for an __{get,put}_kernel_nofault to avoid an uninitialized
value causing spurious failures
- compat_vdso.so.dbg is now installed to the standard install location
- A fix to avoid initializing PERF_SAMPLE_BRANCH_*-related events, as
they aren't supported and will just later fail
- A fix to make AT_VECTOR_SIZE_ARCH correct now that we're providing
AT_MINSIGSTKSZ
- pgprot_nx() is now implemented, which fixes vmap W^X protection
- A fix for the vector save/restore code, which at least manifests as
corrupted vector state when a signal is taken
- A fix for a race condition in instruction patching
- A fix to avoid leaking the kernel-mode GP to userspace, which is a
kernel pointer leak that can be used to defeat KASLR in various ways
- A handful of smaller fixes to build warnings, an overzealous printk,
and some missing tracing annotations
* tag 'riscv-for-linus-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: process: Fix kernel gp leakage
riscv: Disable preemption when using patch_map()
riscv: Fix warning by declaring arch_cpu_idle() as noinstr
riscv: use KERN_INFO in do_trap
riscv: Fix vector state restore in rt_sigreturn()
riscv: mm: implement pgprot_nx
riscv: compat_vdso: align VDSOAS build log
RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZ
riscv: Mark __se_sys_* functions __used
drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported
riscv: compat_vdso: install compat_vdso.so.dbg to /lib/modules/*/vdso/
riscv: hwprobe: do not produce frtace relocation
riscv: Fix spurious errors from __get/put_kernel_nofault
riscv: mm: Fix prototype to avoid discarding const
Merge tag 's390-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix missing NULL pointer check when determining guest/host fault
- Mark all functions in asm/atomic_ops.h, asm/atomic.h and
asm/preempt.h as __always_inline to avoid unwanted instrumentation
- Fix removal of a Processor Activity Instrumentation (PAI) sampling
event in PMU device driver
- Align system call table on 8 bytes
* tag 's390-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/entry: align system call table on 8 bytes
s390/pai: fix sampling event removal for PMU device driver
s390/preempt: mark all functions __always_inline
s390/atomic: mark all functions __always_inline
s390/mm: fix NULL pointer dereference
Merge tag 'pm-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a recent Energy Model change that went against a recent scheduler
change made independently (Vincent Guittot)"
* tag 'pm-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: EM: fix wrong utilization estimation in em_cpu_energy()
Merge tag 'thermal-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"These fix two power allocator thermal governor issues and an ACPI
thermal driver regression that all were introduced during the 6.8
development cycle.
Specifics:
- Allow the power allocator thermal governor to bind to a thermal
zone without cooling devices and/or without trip points (Nikita
Travkin)
- Make the ACPI thermal driver register a tripless thermal zone when
it cannot find any usable trip points instead of returning an error
from acpi_thermal_add() (Stephen Horvath)"
* tag 'thermal-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: gov_power_allocator: Allow binding without trip points
thermal: gov_power_allocator: Allow binding without cooling devices
ACPI: thermal: Register thermal zones without valid trip points
Merge tag 'gpio-fixes-for-v6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- make sure GPIO devices are registered with the subsystem before
trying to return them to a caller of gpio_device_find()
- fix two issues with incorrect sanitization of the interrupt labels
* tag 'gpio-fixes-for-v6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: cdev: fix missed label sanitizing in debounce_setup()
gpio: cdev: check for NULL labels when sanitizing them for irqs
gpiolib: Fix triggering "kobject: 'gpiochipX' is not initialized, yet" kobject_get() errors
Merge tag 'ata-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:
- Compilation warning fixes from Arnd: one in the sata_sx4 driver due
to an incorrect calculation of the parameters passed to memcpy() and
another one in the sata_mv driver when CONFIG_PCI is not set
- Drop the owner driver field assignment in the pata_macio driver. That
is not needed as the PCI core code does that already (Krzysztof)
- Remove an unusued field in struct st_ahci_drv_data of the ahci_st
driver (Christophe)
- Add a missing clock probe error check in the sata_gemini driver
(Chen)
* tag 'ata-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: sata_gemini: Check clk_enable() result
ata: sata_mv: Fix PCI device ID table declaration compilation warning
ata: ahci_st: Remove an unused field in struct st_ahci_drv_data
ata: pata_macio: drop driver owner assignment
ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit
Merge tag 'sound-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became a bit bigger collection of patches, but almost all are
about device-specific fixes, and should be safe for 6.9:
- Lots of ASoC Intel SOF-related fixes/updates
- Locking fixes in SoundWire drivers
- ASoC AMD ACP/SOF updates
- ASoC ES8326 codec fixes
- HD-audio codec fixes and quirks
- A regression fix in emu10k1 synth code"
* tag 'sound-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (49 commits)
ASoC: SOF: Core: Add remove_late() to sof_init_environment failure path
ASoC: SOF: amd: fix for false dsp interrupts
ASoC: SOF: Intel: lnl: Disable DMIC/SSP offload on remove
ASoC: Intel: avs: boards: Add modules description
ASoC: codecs: ES8326: Removing the control of ADC_SCALE
ASoC: codecs: ES8326: Solve a headphone detection issue after suspend and resume
ASoC: codecs: ES8326: modify clock table
ASoC: codecs: ES8326: Solve error interruption issue
ALSA: line6: Zero-initialize message buffers
ALSA: hda/realtek: cs35l41: Support ASUS ROG G634JYR
ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone
ALSA: hda/realtek: Add sound quirks for Lenovo Legion slim 7 16ARHA7 models
Revert "ALSA: emu10k1: fix synthesizer sample playback position and caching"
OSS: dmasound/paula: Mark driver struct with __refdata to prevent section mismatch
ALSA: hda/realtek: Add quirks for ASUS Laptops using CS35L56
ASoC: amd: acp: fix for acp_init function error handling
ASoC: tas2781: mark dvc_tlv with __maybe_unused
ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw
ASoC: rt-sdw*: add __func__ to all error logs
ASoC: rt722-sdca-sdw: fix locking sequence
...
Merge tag 'drm-fixes-2024-04-05' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes, mostly xe and i915, amdgpu on a week off, otherwise a
nouveau fix for a crash with new vulkan cts tests, and a couple of
cleanups and misc fixes.
display:
- fix typos in kerneldoc
prime:
- unbreak dma-buf export for virt-gpu
nouveau:
- uvmm: fix remap address calculation
- minor cleanups
panfrost:
- fix power-transition timeouts
xe:
- Stop using system_unbound_wq for preempt fences
- Fix saving unordered rebinding fences by attaching them as kernel
feces to the vm's resv
- Fix TLB invalidation fences completing out of order
- Move rebind TLB invalidation to the ring ops to reduce the latency
i915:
- A few DisplayPort related fixes
- eDP PSR fixes
- Remove some VM space restrictions on older platforms
- Disable automatic load CCS load balancing"
* tag 'drm-fixes-2024-04-05' of https://gitlab.freedesktop.org/drm/kernel: (22 commits)
drm/xe: Use ordered wq for preempt fence waiting
drm/xe: Move vma rebinding to the drm_exec locking loop
drm/xe: Make TLB invalidation fences unordered
drm/xe: Rework rebinding
drm/xe: Use ring ops TLB invalidation for rebinds
drm/i915/mst: Reject FEC+MST on ICL
drm/i915/mst: Limit MST+DSC to TGL+
drm/i915/dp: Fix the computation for compressed_bpp for DISPLAY < 13
drm/i915/gt: Enable only one CCS for compute workload
drm/i915/gt: Do not generate the command streamer for all the CCS
drm/i915/gt: Disable HW load balancing for CCS
drm/i915/gt: Limit the reserved VM space to only the platforms that need it
drm/i915/psr: Fix intel_psr2_sel_fetch_et_alignment usage
drm/i915/psr: Move writing early transport pipe src
drm/i915/psr: Calculate PIPE_SRCSZ_ERLY_TPT value
drm/i915/dp: Remove support for UHBR13.5
drm/i915/dp: Fix DSC state HW readout for SST connectors
drm/display: fix typo
drm/prime: Unbreak virtgpu dma-buf export
nouveau/uvmm: fix addr/range calcs for remap operations
...
Luca Weiss [Thu, 28 Mar 2024 08:02:45 +0000 (09:02 +0100)]
drm/msm/adreno: Set highest_bank_bit for A619
The default highest_bank_bit of 15 didn't seem to cause issues so far
but downstream defines it to be 14. But similar to [0] leaving it on 14
(or 15 for that matter) causes some corruption issues with some
resolutions with DisplayPort, like 1920x1200.
So set it to 13 for now so that there's no screen corruption.
[0] commit 6a0dbcd20ef2 ("drm/msm/a6xx: set highest_bank_bit to 13 for a610")
Fixes: b7616b5c69e6 ("drm/msm/adreno: Add A619 support") Signed-off-by: Luca Weiss <[email protected]>
Patchwork: https://patchwork.freedesktop.org/patch/585215/ Signed-off-by: Rob Clark <[email protected]>
stackdepot: rename pool_index to pool_index_plus_1
Commit 3ee34eabac2a ("lib/stackdepot: fix first entry having a 0-handle")
changed the meaning of the pool_index field to mean "the pool index plus
1". This made the code accessing this field less self-documenting, as
well as causing debuggers such as drgn to not be able to easily remain
compatible with both old and new kernels, because they typically do that
by testing for presence of the new field. Because stackdepot is a
debugging tool, we should make sure that it is debugger friendly.
Therefore, give the field a different name to improve readability as well
as enabling debugger backwards compatibility.
This is needed in 6.9, which would otherwise become an odd release with
the new semantics and old name so debuggers wouldn't recognize the new
semantics there.
PAT handling won't do the right thing in COW mappings: the first PTE (or,
in fact, all PTEs) can be replaced during write faults to point at anon
folios. Reliably recovering the correct PFN and cachemode using
follow_phys() from PTEs will not work in COW mappings.
Using follow_phys(), we might just get the address+protection of the anon
folio (which is very wrong), or fail on swap/nonswap entries, failing
follow_phys() and triggering a WARN_ON_ONCE() in untrack_pfn() and
track_pfn_copy(), not properly calling free_pfn_range().
In free_pfn_range(), we either wouldn't call memtype_free() or would call
it with the wrong range, possibly leaking memory.
To fix that, let's update follow_phys() to refuse returning anon folios,
and fallback to using the stored PFN inside vma->vm_pgoff for COW mappings
if we run into that.
We will now properly handle untrack_pfn() with COW mappings, where we
don't need the cachemode. We'll have to fail fork()->track_pfn_copy() if
the first page was replaced by an anon folio, though: we'd have to store
the cachemode in the VMA to make this work, likely growing the VMA size.
For now, lets keep it simple and let track_pfn_copy() just fail in that
case: it would have failed in the past with swap/nonswap entries already,
and it would have done the wrong thing with anon folios.
Simple reproducer to trigger the WARN_ON_ONCE() in untrack_pfn():
indeed it can happen if the find_vmap_area_exceed_addr_lock() gets called
concurrently because it tries to acquire two nodes locks. It was done to
prevent removing a lowest VA found on a previous step.
To address this a lowest VA is found first without holding a node lock
where it resides. As a last step we check if a VA still there because it
can go away, if removed, proceed with next lowest.
Some background. It worked before because the lock that is in question
was statically defined and initialized. As of now, the locks and data
structures are initialized in the vmalloc_init() function.
To address that issue add the check whether the "vmap_initialized"
variable is set, if not find_vmap_area() bails out on entry returning NULL.
John Sperbeck [Sat, 23 Mar 2024 15:29:34 +0000 (08:29 -0700)]
init: open output files from cpio unpacking with O_LARGEFILE
If a member of a cpio archive for an initrd or initrams is larger than
2Gb, we'll eventually fail to write to that file when we get to that
limit, unless O_LARGEFILE is set.
The problem can be seen with this recipe, assuming that BLK_DEV_RAM
is not configured:
mm/secretmem: fix GUP-fast succeeding on secretmem folios
folio_is_secretmem() currently relies on secretmem folios being LRU
folios, to save some cycles.
However, folios might reside in a folio batch without the LRU flag set, or
temporarily have their LRU flag cleared. Consequently, the LRU flag is
unreliable for this purpose.
In particular, this is the case when secretmem_fault() allocates a fresh
page and calls filemap_add_folio()->folio_add_lru(). The folio might be
added to the per-cpu folio batch and won't get the LRU flag set until the
batch was drained using e.g., lru_add_drain().
Consequently, folio_is_secretmem() might not detect secretmem folios and
GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel
when we would later try reading/writing to the folio, because the folio
has been unmapped from the directmap.
Miguel Ojeda [Tue, 26 Mar 2024 21:23:24 +0000 (22:23 +0100)]
drm/msm: fix the `CRASHDUMP_READ` target of `a6xx_get_shader_block()`
Clang 14 in an (essentially) defconfig arm64 build for next-20240326
reports [1]:
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error:
variable 'out' set but not used [-Werror,-Wunused-but-set-variable]
The variable `out` in these functions is meant to compute the `target` of
`CRASHDUMP_READ()`, but in this case only the initial value (`dumper->iova
+ A6XX_CD_DATA_OFFSET`) was being passed.
Thus use `out` as it was intended by Connor [2].
There was an alternative patch at [3] that removed the variable
altogether, but that would only use the initial value.
Jeff Layton [Fri, 5 Apr 2024 17:56:18 +0000 (13:56 -0400)]
nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
client. While a callback job is technically an RPC that counter is
really more for client-driven RPCs, and this has the effect of
preventing the client from being unhashed until the callback completes.
If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
can end up in a situation where the callback can't complete on the (now
dead) callback channel, but the new client can't connect because the old
client can't be unhashed. This usually manifests as a NFS4ERR_DELAY
return on the CREATE_SESSION operation.
The job is only holding a reference to the client so it can clear a flag
after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a
reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of
reference when dealing with the nfsdfs info files, but it should work
appropriately here to ensure that the nfs4_client doesn't disappear.
Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition") Reported-by: Vladimir Benes <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
Merge tag '9p-for-6.9-rc3' of https://github.com/martinetd/linux
Pull minor 9p cleanups from Dominique Martinet:
- kernel doc fix & removal of unused flag
- fix some bogus debug statement for read/write
* tag '9p-for-6.9-rc3' of https://github.com/martinetd/linux:
9p: remove SLAB_MEM_SPREAD flag usage
9p: Fix read/write debug statements to report server reply
9p/trans_fd: remove Excess kernel-doc comment
Merge tag '6.9-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
"Three fixes, all also for stable:
- encryption fix
- memory overrun fix
- oplock break fix"
* tag '6.9-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1
ksmbd: validate payload size in ipc response
ksmbd: don't send oplock break if rename fails