dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users
Limit the workaround introduced by commit 31729e8c21ec ("drm/amd/pm: fixes
a random hang in S4 for SMU v13.0.4/11") to only run in the s4 path.
Cc: Tim Huang <[email protected]> Fixes: 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3351 Signed-off-by: Mario Limonciello <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
[Why]
Some older MST hubs do not report DPCD registers according to
specification.
[How]
This change re-applies commit c53655545141 ("drm/amd/display: dsc mst
re-compute pbn for changes on hub").
With an additional check for these older MST devices.
[Why]
This fixes a bug introduced by commit c53655545141 ("drm/amd/display: dsc
mst re-compute pbn for changes on hub").
The change caused light-up issues with a second display that required
DSC on some MST docks.
[How]
Use Virtual DPCD for DSC caps in MST case.
[Limitations]
This change only affects MST DSC devices that follow specifications
additional changes are required to check for old MST DSC devices such as
ones which do not check for Virtual DPCD registers.
Linus Torvalds [Fri, 3 May 2024 20:36:09 +0000 (13:36 -0700)]
epoll: be better about file lifetimes
epoll can call out to vfs_poll() with a file pointer that may race with
the last 'fput()'. That would make f_count go down to zero, and while
the ep->mtx locking means that the resulting file pointer tear-down will
be blocked until the poll returns, it means that f_count is already
dead, and any use of it won't actually get a reference to the file any
more: it's dead regardless.
Make sure we have a valid ref on the file pointer before we call down to
vfs_poll() from the epoll routines.
Linus Torvalds [Sun, 5 May 2024 17:51:29 +0000 (10:51 -0700)]
Merge tag 'edac_urgent_for_v6.9_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:
- Fix error logging and check user-supplied data when injecting an
error in the versal EDAC driver
* tag 'edac_urgent_for_v6.9_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/versal: Do not log total error counts
EDAC/versal: Check user-supplied data before injecting an error
EDAC/versal: Do not register for NOC errors
Linus Torvalds [Sun, 5 May 2024 17:44:04 +0000 (10:44 -0700)]
Merge tag 'powerpc-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix incorrect delay handling in the plpks (keystore) code
- Fix a panic when an LPAR boots with a frozen PE
Thanks to Andrew Donnellan, Gaurav Batra, Nageswara R Sastry, and Nayna
Jain.
* tag 'powerpc-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries/iommu: LPAR panics during boot up with a frozen PE
powerpc/pseries: make max polling consistent for longer H_CALLs
Linus Torvalds [Sun, 5 May 2024 17:17:05 +0000 (10:17 -0700)]
Merge tag 'x86-urgent-2024-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Remove the broken vsyscall emulation code from
the page fault code
- Fix kexec crash triggered by certain SEV RMP
table layouts
- Fix unchecked MSR access error when disabling
the x2APIC via iommu=off
* tag 'x86-urgent-2024-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Remove broken vsyscall emulation code from the page fault code
x86/apic: Don't access the APIC when disabling x2APIC
x86/sev: Add callback to apply RMP table fixups for kexec
x86/e820: Add a new e820 table update helper
Linus Torvalds [Sun, 5 May 2024 17:08:52 +0000 (10:08 -0700)]
Merge tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc/other driver fixes and new device ids
for 6.9-rc7 that resolve some reported problems.
Included in here are:
- iio driver fixes
- mei driver fix and new device ids
- dyndbg bugfix
- pvpanic-pci driver bugfix
- slimbus driver bugfix
- fpga new device id
All have been in linux-next with no reported problems"
* tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
slimbus: qcom-ngd-ctrl: Add timeout for wait operation
dyndbg: fix old BUG_ON in >control parser
misc/pvpanic-pci: register attributes via pci_driver
fpga: dfl-pci: add PCI subdevice ID for Intel D5005 card
mei: me: add lunar lake point M DID
mei: pxp: match against PCI_CLASS_DISPLAY_OTHER
iio:imu: adis16475: Fix sync mode setting
iio: accel: mxc4005: Reset chip on probe() and resume()
iio: accel: mxc4005: Interrupt handling fixes
dt-bindings: iio: health: maxim,max30102: fix compatible check
iio: pressure: Fixes SPI support for BMP3xx devices
iio: pressure: Fixes BME280 SPI driver data
Linus Torvalds [Sun, 5 May 2024 17:00:47 +0000 (10:00 -0700)]
Merge tag 'input-for-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a new ID for ASUS ROG RAIKIRI controllers added to xpad driver
- amimouse driver structure annotated with __refdata to prevent section
mismatch warnings.
* tag 'input-for-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: amimouse - mark driver struct with __refdata to prevent section mismatch
Input: xpad - add support for ASUS ROG RAIKIRI
Linus Torvalds [Sun, 5 May 2024 16:56:50 +0000 (09:56 -0700)]
Merge tag 'probes-fixes-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fix from Masami Hiramatsu:
- probe-events: Fix memory leak in parsing probe argument.
There is a memory leak (forget to free an allocated buffer) in a
memory allocation failure path. Fix it to jump to the correct error
handling code.
* tag 'probes-fixes-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Fix memory leak in traceprobe_parse_probe_arg_body()
Linus Torvalds [Sun, 5 May 2024 16:53:09 +0000 (09:53 -0700)]
Merge tag 'trace-v6.9-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing and tracefs fixes from Steven Rostedt:
- Fix RCU callback of freeing an eventfs_inode.
The freeing of the eventfs_inode from the kref going to zero freed
the contents of the eventfs_inode and then used kfree_rcu() to free
the inode itself. But the contents should also be protected by RCU.
Switch to a call_rcu() that calls a function to free all of the
eventfs_inode after the RCU synchronization.
- The tracing subsystem maps its own descriptor to a file represented
by eventfs. The freeing of this descriptor needs to know when the
last reference of an eventfs_inode is released, but currently there
is no interface for that.
Add a "release" callback to the eventfs_inode entry array that allows
for freeing of data that can be referenced by the eventfs_inode being
opened. Then increment the ref counter for this descriptor when the
eventfs_inode file is created, and decrement/free it when the last
reference to the eventfs_inode is released and the file is removed.
This prevents races between freeing the descriptor and the opening of
the eventfs file.
- Fix the permission processing of eventfs.
The change to make the permissions of eventfs default to the mount
point but keep track of when changes were made had a side effect that
could cause security concerns. When the tracefs is remounted with a
given gid or uid, all the files within it should inherit that gid or
uid. But if the admin had changed the permission of some file within
the tracefs file system, it would not get updated by the remount.
This caused the kselftest of file permissions to fail the second time
it is run. The first time, all changes would look fine, but the
second time, because the changes were "saved", the remount did not
reset them.
Create a link list of all existing tracefs inodes, and clear the
saved flags on them on a remount if the remount changes the
corresponding gid or uid fields.
This also simplifies the code by removing the distinction between the
toplevel eventfs and an instance eventfs. They should both act the
same. They were different because of a misconception due to the
remount not resetting the flags. Now that remount resets all the
files and directories to default to the root node if a uid/gid is
specified, it makes the logic simpler to implement.
* tag 'trace-v6.9-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
eventfs: Have "events" directory get permissions from its parent
eventfs: Do not treat events directory different than other directories
eventfs: Do not differentiate the toplevel events directory
tracefs: Still use mount point as default permissions for instances
tracefs: Reset permissions on remount if permissions are options
eventfs: Free all of the eventfs_inode after RCU
eventfs/tracing: Add callback for release of an eventfs_inode
Linus Torvalds [Sun, 5 May 2024 16:49:21 +0000 (09:49 -0700)]
Merge tag 'dma-mapping-6.9-2024-05-04' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
- fix the combination of restricted pools and dynamic swiotlb
(Will Deacon)
* tag 'dma-mapping-6.9-2024-05-04' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: initialise restricted pool list_head when SWIOTLB_DYNAMIC=y
Linus Torvalds [Sun, 5 May 2024 16:37:10 +0000 (09:37 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A handful of clk driver fixes:
- Avoid a deadlock in the Qualcomm clk driver by making the regulator
which supplies the GDSC optional
- Restore RPM clks on Qualcomm msm8976 by setting num_clks
- Fix Allwinner H6 CPU rate changing logic to avoid system crashes by
temporarily reparenting the CPU clk to something that isn't being
changed
- Set a MIPI PLL min/max rate on Allwinner A64 to fix blank screens
on some devices
- Revert back to of_match_device() in the Samsung clkout driver to
get the match data based on the parent device's compatible string"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: samsung: Revert "clk: Use device_get_match_data()"
clk: sunxi-ng: a64: Set minimum and maximum rate for PLL-MIPI
clk: sunxi-ng: common: Support minimum and maximum rate
clk: sunxi-ng: h6: Reparent CPUX during PLL CPUX rate change
clk: qcom: smd-rpm: Restore msm8976 num_clk
clk: qcom: gdsc: treat optional supplies as optional
eventfs: Have "events" directory get permissions from its parent
The events directory gets its permissions from the root inode. But this
can cause an inconsistency if the instances directory changes its
permissions, as the permissions of the created directories under it should
inherit the permissions of the instances directory when directories under
it are created.
Currently the behavior is:
# cd /sys/kernel/tracing
# chgrp 1002 instances
# mkdir instances/foo
# ls -l instances/foo
[..]
-r--r----- 1 root lkp 0 May 1 18:55 buffer_total_size_kb
-rw-r----- 1 root lkp 0 May 1 18:55 current_tracer
-rw-r----- 1 root lkp 0 May 1 18:55 error_log
drwxr-xr-x 1 root root 0 May 1 18:55 events
--w------- 1 root lkp 0 May 1 18:55 free_buffer
drwxr-x--- 2 root lkp 0 May 1 18:55 options
drwxr-x--- 10 root lkp 0 May 1 18:55 per_cpu
-rw-r----- 1 root lkp 0 May 1 18:55 set_event
All the files and directories under "foo" has the "lkp" group except the
"events" directory. That's because its getting its default value from the
mount point instead of its parent.
Have the "events" directory make its default value based on its parent's
permissions. That now gives:
# ls -l instances/foo
[..]
-rw-r----- 1 root lkp 0 May 1 21:16 buffer_subbuf_size_kb
-r--r----- 1 root lkp 0 May 1 21:16 buffer_total_size_kb
-rw-r----- 1 root lkp 0 May 1 21:16 current_tracer
-rw-r----- 1 root lkp 0 May 1 21:16 error_log
drwxr-xr-x 1 root lkp 0 May 1 21:16 events
--w------- 1 root lkp 0 May 1 21:16 free_buffer
drwxr-x--- 2 root lkp 0 May 1 21:16 options
drwxr-x--- 10 root lkp 0 May 1 21:16 per_cpu
-rw-r----- 1 root lkp 0 May 1 21:16 set_event
eventfs: Do not treat events directory different than other directories
Treat the events directory the same as other directories when it comes to
permissions. The events directory was considered different because it's
dentry is persistent, whereas the other directory dentries are created
when accessed. But the way tracefs now does its ownership by using the
root dentry's permissions as the default permissions, the events directory
can get out of sync when a remount is performed setting the group and user
permissions.
Remove the special case for the events directory on setting the
attributes. This allows the updates caused by remount to work properly as
well as simplifies the code.
eventfs: Do not differentiate the toplevel events directory
The toplevel events directory is really no different than the events
directory of instances. Having the two be different caused
inconsistencies and made it harder to fix the permissions bugs.
tracefs: Still use mount point as default permissions for instances
If the instances directory's permissions were never change, then have it
and its children use the mount point permissions as the default.
Currently, the permissions of instance directories are determined by the
instance directory's permissions itself. But if the tracefs file system is
remounted and changes the permissions, the instance directory and its
children should use the new permission.
But because both the instance directory and its children use the instance
directory's inode for permissions, it misses the update.
To demonstrate this:
# cd /sys/kernel/tracing/
# mkdir instances/foo
# ls -ld instances/foo
drwxr-x--- 5 root root 0 May 1 19:07 instances/foo
# ls -ld instances
drwxr-x--- 3 root root 0 May 1 18:57 instances
# ls -ld current_tracer
-rw-r----- 1 root root 0 May 1 18:57 current_tracer
# mount -o remount,gid=1002 .
# ls -ld instances
drwxr-x--- 3 root root 0 May 1 18:57 instances
# ls -ld instances/foo/
drwxr-x--- 5 root root 0 May 1 19:07 instances/foo/
# ls -ld current_tracer
-rw-r----- 1 root lkp 0 May 1 18:57 current_tracer
Notice that changing the group id to that of "lkp" did not affect the
instances directory nor its children. It should have been:
# ls -ld current_tracer
-rw-r----- 1 root root 0 May 1 19:19 current_tracer
# ls -ld instances/foo/
drwxr-x--- 5 root root 0 May 1 19:25 instances/foo/
# ls -ld instances
drwxr-x--- 3 root root 0 May 1 19:19 instances
# mount -o remount,gid=1002 .
# ls -ld current_tracer
-rw-r----- 1 root lkp 0 May 1 19:19 current_tracer
# ls -ld instances
drwxr-x--- 3 root lkp 0 May 1 19:19 instances
# ls -ld instances/foo/
drwxr-x--- 5 root lkp 0 May 1 19:25 instances/foo/
Where all files were updated by the remount gid update.
tracefs: Reset permissions on remount if permissions are options
There's an inconsistency with the way permissions are handled in tracefs.
Because the permissions are generated when accessed, they default to the
root inode's permission if they were never set by the user. If the user
sets the permissions, then a flag is set and the permissions are saved via
the inode (for tracefs files) or an internal attribute field (for
eventfs).
But if a remount happens that specify the permissions, all the files that
were not changed by the user gets updated, but the ones that were are not.
If the user were to remount the file system with a given permission, then
all files and directories within that file system should be updated.
This can cause security issues if a file's permission was updated but the
admin forgot about it. They could incorrectly think that remounting with
permissions set would update all files, but miss some.
For example:
# cd /sys/kernel/tracing
# chgrp 1002 current_tracer
# ls -l
[..]
-rw-r----- 1 root root 0 May 1 21:25 buffer_size_kb
-rw-r----- 1 root root 0 May 1 21:25 buffer_subbuf_size_kb
-r--r----- 1 root root 0 May 1 21:25 buffer_total_size_kb
-rw-r----- 1 root lkp 0 May 1 21:25 current_tracer
-rw-r----- 1 root root 0 May 1 21:25 dynamic_events
-r--r----- 1 root root 0 May 1 21:25 dyn_ftrace_total_info
-r--r----- 1 root root 0 May 1 21:25 enabled_functions
Where current_tracer now has group "lkp".
# mount -o remount,gid=1001 .
# ls -l
-rw-r----- 1 root tracing 0 May 1 21:25 buffer_size_kb
-rw-r----- 1 root tracing 0 May 1 21:25 buffer_subbuf_size_kb
-r--r----- 1 root tracing 0 May 1 21:25 buffer_total_size_kb
-rw-r----- 1 root lkp 0 May 1 21:25 current_tracer
-rw-r----- 1 root tracing 0 May 1 21:25 dynamic_events
-r--r----- 1 root tracing 0 May 1 21:25 dyn_ftrace_total_info
-r--r----- 1 root tracing 0 May 1 21:25 enabled_functions
Everything changed but the "current_tracer".
Add a new link list that keeps track of all the tracefs_inodes which has
the permission flags that tell if the file/dir should use the root inode's
permission or not. Then on remount, clear all the flags so that the
default behavior of using the root inode's permission is done for all
files and directories.
The freeing of eventfs_inode via a kfree_rcu() callback. But the content
of the eventfs_inode was being freed after the last kref. This is
dangerous, as changes are being made that can access the content of an
eventfs_inode from an RCU loop.
Instead of using kfree_rcu() use call_rcu() that calls a function to do
all the freeing of the eventfs_inode after a RCU grace period has expired.
eventfs/tracing: Add callback for release of an eventfs_inode
Synthetic events create and destroy tracefs files when they are created
and removed. The tracing subsystem has its own file descriptor
representing the state of the events attached to the tracefs files.
There's a race between the eventfs files and this file descriptor of the
tracing system where the following can cause an issue:
With two scripts 'A' and 'B' doing:
Script 'A':
echo "hello int aaa" > /sys/kernel/tracing/synthetic_events
while :
do
echo 0 > /sys/kernel/tracing/events/synthetic/hello/enable
done
Script 'A' creates a synthetic event "hello" and then just writes zero
into its enable file.
Script 'B' removes all synthetic events (including the newly created
"hello" event).
What happens is that the opening of the "enable" file has:
{
struct trace_event_file *file = inode->i_private;
int ret;
ret = tracing_check_open_get_tr(file->tr);
[..]
But deleting the events frees the "file" descriptor, and a "use after
free" happens with the dereference at "file->tr".
The file descriptor does have a reference counter, but there needs to be a
way to decrement it from the eventfs when the eventfs_inode is removed
that represents this file descriptor.
Add an optional "release" callback to the eventfs_entry array structure,
that gets called when the eventfs file is about to be removed. This allows
for the creating on the eventfs file to increment the tracing file
descriptor ref counter. When the eventfs file is deleted, it can call the
release function that will call the put function for the tracing file
descriptor.
This will protect the tracing file from being freed while a eventfs file
that references it is being opened.
Linus Torvalds [Fri, 3 May 2024 23:21:05 +0000 (16:21 -0700)]
Merge tag 'cxl-fixes-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fix from Dave Jiang:
"Add missing RCH support for endpoint access_coordinate calculation.
A late bug was reported by Robert Richter that the Restricted CXL Host
(RCH) support was missing in the CXL endpoint access_coordinate
calculation.
The missing support causes the topology iterator to stumble over a
NULL pointer and triggers a kernel OOPS on a platform with CXL 1.1
support.
The fix bypasses RCH topology as the access_coordinate calculation is
not necessary since RCH does not support hotplug and the memory region
exported should be covered by the HMAT table already.
A unit test is also added to cxl_test to check against future
regressions on the topology iterator"
* tag 'cxl-fixes-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl: Fix cxl_endpoint_get_perf_coordinate() support for RCH
Linus Torvalds [Fri, 3 May 2024 19:10:41 +0000 (12:10 -0700)]
Merge tag 'for-linus-6.9a-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Two fixes when running as Xen PV guests for issues introduced in the
6.9 merge window, both related to apic id handling"
* tag 'for-linus-6.9a-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: return a sane initial apic id when running as PV guest
x86/xen/smp_pv: Register the boot CPU APIC properly
Linus Torvalds [Fri, 3 May 2024 19:05:19 +0000 (12:05 -0700)]
Merge tag 'efi-urgent-for-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
"This works around a shortcoming in the memory acceptation API, which
may apparently hog the CPU for long enough to trigger the softlockup
watchdog.
Note that this only affects confidential VMs running under the Intel
TDX hypervisor, which is why I accepted this for now, but this should
obviously be fixed properly in the future"
* tag 'efi-urgent-for-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/unaccepted: touch soft lockup during memory accept
Linus Torvalds [Fri, 3 May 2024 16:33:59 +0000 (09:33 -0700)]
Merge tag 'block-6.9-20240503' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"Nothing major in here - an nvme pull request with mostly auth/tcp
fixes, and a single fix for ublk not setting segment count and size
limits"
* tag 'block-6.9-20240503' of git://git.kernel.dk/linux:
nvme-tcp: strict pdu pacing to avoid send stalls on TLS
nvmet: fix nvme status code when namespace is disabled
nvmet-tcp: fix possible memory leak when tearing down a controller
nvme: cancel pending I/O if nvme controller is in terminal state
nvmet-auth: replace pr_debug() with pr_err() to report an error.
nvmet-auth: return the error code to the nvmet_auth_host_hash() callers
nvme: find numa distance only if controller has valid numa id
ublk: remove segment count and size limits
nvme: fix warn output about shared namespaces without CONFIG_NVME_MULTIPATH
Linus Torvalds [Fri, 3 May 2024 16:24:46 +0000 (09:24 -0700)]
Merge tag 'sound-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"As usual in a late stage, we received a fair amount of fixes for ASoC,
and it became bigger than wished. But all fixes are rather device-
specific, and they look pretty safe to apply.
A major par of changes are series of fixes for ASoC meson and SOF
drivers as well as for Realtek and Cirrus codecs. In addition, recent
emu10k1 regression fixes and usual HD-audio quirks are included"
* tag 'sound-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (46 commits)
ALSA: hda/realtek: Fix build error without CONFIG_PM
ALSA: hda/realtek: Fix conflicting PCI SSID 17aa:386f for Lenovo Legion models
ALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318
ALSA: hda: intel-sdw-acpi: fix usage of device_get_named_child_node()
ALSA: hda: intel-dsp-config: harden I2C/I2S codec detection
ASoC: cs35l56: fix usages of device_get_named_child_node()
ASoC: da7219-aad: fix usage of device_get_named_child_node()
ASoC: meson: cards: select SND_DYNAMIC_MINORS
ASoC: meson: axg-tdm: add continuous clock support
ASoC: meson: axg-tdm-interface: manage formatters in trigger
ASoC: meson: axg-card: make links nonatomic
ASoC: meson: axg-fifo: use threaded irq to check periods
ALSA: hda/realtek: Fix mute led of HP Laptop 15-da3001TU
ALSA: emu10k1: make E-MU FPGA writes potentially more reliable
ALSA: emu10k1: fix E-MU dock initialization
ALSA: emu10k1: use mutex for E-MU FPGA access locking
ALSA: emu10k1: move the whole GPIO event handling to the workqueue
ALSA: emu10k1: factor out snd_emu1010_load_dock_firmware()
ALSA: emu10k1: fix E-MU card dock presence monitoring
ASoC: rt715-sdca: volume step modification
...
panel:
- ili9341: avoid OF for device properties; respect deferred probe;
fix usage of errno codes
ttm:
- fix status output
vmwgfx:
- fix legacy display unit
- fix read length in fence signalling"
* tag 'drm-fixes-2024-05-03' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
drm/xe/display: Fix ADL-N detection
drm/panel: ili9341: Use predefined error codes
drm/panel: ili9341: Respect deferred probe
drm/panel: ili9341: Correct use of device property APIs
drm/xe/vm: prevent UAF in rebind_work_func()
drm/amd/display: Disable panel replay by default for now
drm/amdgpu: fix doorbell regression
drm/amdkfd: Flush the process wq before creating a kfd_process
drm/amd/display: Disable seamless boot on 128b/132b encoding
drm/amd/display: Fix DC mode screen flickering on DCN321
drm/amd/display: Add VCO speed parameter for DCN31 FPU
drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2
drm/amd/display: Allocate zero bw after bw alloc enable
drm/amd/display: Fix incorrect DSC instance for MST
drm/amd/display: Atom Integrated System Info v2_2 for DCN35
drm/amd/display: Add dtbclk access to dcn315
drm/amd/display: Ensure that dmcub support flag is set for DCN20
drm/amd/display: Handle Y carry-over in VCP X.Y calculation
drm/amdgpu: Fix VRAM memory accounting
drm/vmwgfx: Fix invalid reads in fence signaled events
...
Linus Torvalds [Fri, 3 May 2024 16:12:28 +0000 (09:12 -0700)]
Merge tag 'spi-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few small fixes for v6.9,
The core fix is for issues with reuse of a spi_message in the case
where we've got queued messages (a relatively rare occurrence with
modern code so it wasn't noticed in testing).
We also avoid an issue with the Kunpeng driver by simply removing the
debug interface that could trigger it, and address issues with
confusing and corrupted output when printing the IP version of the AXI
SPI engine"
* tag 'spi-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: fix null pointer dereference within spi_sync
spi: hisi-kunpeng: Delete the dump interface of data registers in debugfs
spi: axi-spi-engine: fix version format string
Linus Torvalds [Thu, 2 May 2024 17:49:12 +0000 (10:49 -0700)]
Merge tag 'for-6.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- set correct ram_bytes when splitting ordered extent. This can be
inconsistent on-disk but harmless as it's not used for calculations
and it's only advisory for compression
- fix lockdep splat when taking cleaner mutex in qgroups disable ioctl
- fix missing mutex unlock on error path when looking up sys chunk for
relocation
* tag 'for-6.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: set correct ram_bytes when splitting ordered extent
btrfs: take the cleaner_mutex earlier in qgroup disable
btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
Linus Torvalds [Thu, 2 May 2024 17:43:35 +0000 (10:43 -0700)]
Merge tag 's390-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- The function __storage_key_init_range() expects the end address to be
the first byte outside the range to be initialized. Fix the callers
that provide the last byte within the range instead.
- 3270 Channel Command Word (CCW) may contain zero data address in case
there is no data in the request. Add data availability check to avoid
erroneous non-zero value as result of virt_to_dma32(NULL) application
in cases there is no data
- Add missing CFI directives for an unwinder to restore the return
address in the vDSO assembler code
- NUL-terminate kernel buffer when duplicating user space memory region
on Channel IO (CIO) debugfs write inject
- Fix wrong format string in zcrypt debug output
- Return -EBUSY code when a CCA card is temporarily unavailabile
- Restore a loop that retries derivation of a protected key from a
secure key in cases the low level reports temporarily unavailability
with -EBUSY code
* tag 's390-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/paes: Reestablish retry loop in paes
s390/zcrypt: Use EBUSY to indicate temp unavailability
s390/zcrypt: Handle ep11 cprb return code
s390/zcrypt: Fix wrong format string in debug feature printout
s390/cio: Ensure the copied buf is NUL terminated
s390/vdso: Add CFI for RA register to asm macro vdso_func
s390/3270: Fix buffer assignment
s390/mm: Fix clearing storage keys for huge pages
s390/mm: Fix storage key clearing for guest huge pages
Linus Torvalds [Thu, 2 May 2024 17:41:28 +0000 (10:41 -0700)]
Merge tag 'xtensa-20240502' of https://github.com/jcmvbkbc/linux-xtensa
Pull xtensa fixes from Max Filippov:
- fix unused variable warning caused by empty flush_dcache_page()
definition
- fix stack unwinding on windowed noMMU XIP configurations
- fix Coccinelle warning 'opportunity for min()' in xtensa ISS platform
code
* tag 'xtensa-20240502' of https://github.com/jcmvbkbc/linux-xtensa:
xtensa: remove redundant flush_dcache_page and ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE macros
tty: xtensa/iss: Use min() to fix Coccinelle warning
xtensa: fix MAKE_PC_FROM_RA second argument
x86/xen: return a sane initial apic id when running as PV guest
With recent sanity checks for topology information added, there are now
warnings issued for APs when running as a Xen PV guest:
[Firmware Bug]: CPU 1: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0001
This is due to the initial APIC ID obtained via CPUID for PV guests is
always 0.
Avoid the warnings by synthesizing the CPUID data to contain the same
initial APIC ID as xen_pv_smp_config() is using for registering the
APIC IDs of all CPUs.
Fixes: 52128a7a21f7 ("86/cpu/topology: Make the APIC mismatch warnings complete") Signed-off-by: Juergen Gross <[email protected]>
Lucas De Marchi [Thu, 25 Apr 2024 18:16:09 +0000 (11:16 -0700)]
drm/xe/display: Fix ADL-N detection
Contrary to i915, in xe ADL-N is kept as a different platform, not a
subplatform of ADL-P. Since the display side doesn't need to
differentiate between P and N, i.e. IS_ALDERLAKE_P_N() is never called,
just fixup the compat header to check for both P and N.
Moving ADL-N to be a subplatform would be more complex as the firmware
loading in xe only handles platforms, not subplatforms, as going forward
the direction is to check on IP version rather than
platforms/subplatforms.
Fix warning when initializing display:
xe 0000:00:02.0: [drm:intel_pch_type [xe]] Found Alder Lake PCH
------------[ cut here ]------------
xe 0000:00:02.0: drm_WARN_ON(!((dev_priv)->info.platform == XE_ALDERLAKE_S) && !((dev_priv)->info.platform == XE_ALDERLAKE_P))
Linus Torvalds [Thu, 2 May 2024 16:05:21 +0000 (09:05 -0700)]
Merge tag 'firewire-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixes from Takashi Sakamoto:
"Two driver fixes:
- The firewire-ohci driver for 1394 OHCI hardware does not fill time
stamp for response packet when handling asynchronous transaction to
local destination. This brings an inconvenience that the response
packet is not equivalent between the transaction to local and
remote. It is fixed by fulfilling the time stamp with hardware
time. The fix should be applied to Linux kernel v6.5 or later as
well.
- The nosy driver for Texas Instruments TSB12LV21A (PCILynx) has
long-standing issue about the behaviour when user space application
passes less size of buffer than expected. It is fixed by returning
zero according to the convention of UNIX-like systems"
* tag 'firewire-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: ohci: fulfill timestamp for some local asynchronous transaction
firewire: nosy: ensure user_length is taken into account when fetching packet contents
Thomas Gleixner [Thu, 2 May 2024 14:39:47 +0000 (16:39 +0200)]
x86/xen/smp_pv: Register the boot CPU APIC properly
The topology core expects the boot APIC to be registered from earhy APIC
detection first and then again when the firmware tables are evaluated. This
is used for detecting the real BSP CPU on a kexec kernel.
The recent conversion of XEN/PV to register fake APIC IDs failed to
register the boot CPU APIC correctly as it only registers it once. This
causes the BSP detection mechanism to trigger wrongly:
CPU topo: Boot CPU APIC ID not the first enumerated APIC ID: 0 > 1
Additionally this results in one CPU being ignored.
Register the boot CPU APIC twice so that the XEN/PV fake enumeration
behaves like real firmware.
Linus Torvalds [Thu, 2 May 2024 16:01:27 +0000 (09:01 -0700)]
Merge tag 'thermal-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Fix a memory leak and a few locking issues (that may cause the kernel
to crash in principle if all goes wrong) in the thermal debug code
introduced during the 6.8 development cycle"
* tag 'thermal-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/debugfs: Prevent use-after-free from occurring after cdev removal
thermal/debugfs: Fix two locking issues with thermal zone debug
thermal/debugfs: Free all thermal zone debug memory on zone removal
Linus Torvalds [Thu, 2 May 2024 15:51:47 +0000 (08:51 -0700)]
Merge tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf.
Relatively calm week, likely due to public holiday in most places. No
known outstanding regressions.
Current release - regressions:
- rxrpc: fix wrong alignmask in __page_frag_alloc_align()
- eth: e1000e: change usleep_range to udelay in PHY mdic access
Previous releases - regressions:
- gro: fix udp bad offset in socket lookup
- bpf: fix incorrect runtime stat for arm64
- tipc: fix UAF in error path
- netfs: fix a potential infinite loop in extract_user_to_sg()
- eth: ice: ensure the copied buf is NUL terminated
- eth: qeth: fix kernel panic after setting hsuid
Previous releases - always broken:
- bpf:
- verifier: prevent userspace memory access
- xdp: use flags field to disambiguate broadcast redirect
- bridge: fix multicast-to-unicast with fraglist GSO
- mptcp: ensure snd_nxt is properly initialized on connect
- nsh: fix outer header access in nsh_gso_segment().
- eth: bcmgenet: fix racing registers access
- eth: vxlan: fix stats counters.
Misc:
- a bunch of MAINTAINERS file updates"
* tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
MAINTAINERS: remove Ariel Elior
net: gro: add flush check in udp_gro_receive_segment
net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
ipv4: Fix uninit-value access in __ip_make_skb()
s390/qeth: Fix kernel panic after setting hsuid
vxlan: Pull inner IP header in vxlan_rcv().
tipc: fix a possible memleak in tipc_buf_append
tipc: fix UAF in error path
rxrpc: Clients must accept conn from any address
net: core: reject skb_copy(_expand) for fraglist GSO skbs
net: bridge: fix multicast-to-unicast with fraglist GSO
mptcp: ensure snd_nxt is properly initialized on connect
e1000e: change usleep_range to udelay in PHY mdic access
net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
cxgb4: Properly lock TX queue for the selftest.
rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
vxlan: Add missing VNI filter counter update in arp_reduce().
vxlan: Fix racy device stats updates.
net: qede: use return from qede_parse_actions()
...
* git://git.infradead.org/nvme:
nvme-tcp: strict pdu pacing to avoid send stalls on TLS
nvmet: fix nvme status code when namespace is disabled
nvmet-tcp: fix possible memory leak when tearing down a controller
nvme: cancel pending I/O if nvme controller is in terminal state
nvmet-auth: replace pr_debug() with pr_err() to report an error.
nvmet-auth: return the error code to the nvmet_auth_host_hash() callers
nvme: find numa distance only if controller has valid numa id
nvme: fix warn output about shared namespaces without CONFIG_NVME_MULTIPATH
Will Deacon [Thu, 2 May 2024 09:37:23 +0000 (10:37 +0100)]
swiotlb: initialise restricted pool list_head when SWIOTLB_DYNAMIC=y
Using restricted DMA pools (CONFIG_DMA_RESTRICTED_POOL=y) in conjunction
with dynamic SWIOTLB (CONFIG_SWIOTLB_DYNAMIC=y) leads to the following
crash when initialising the restricted pools at boot-time:
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
| Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
| pc : rmem_swiotlb_device_init+0xfc/0x1ec
| lr : rmem_swiotlb_device_init+0xf0/0x1ec
| Call trace:
| rmem_swiotlb_device_init+0xfc/0x1ec
| of_reserved_mem_device_init_by_idx+0x18c/0x238
| of_dma_configure_id+0x31c/0x33c
| platform_dma_configure+0x34/0x80
faddr2line reveals that the crash is in the list validation code:
because add_mem_pool() is trying to list_add_rcu() to a NULL
'mem->pools'.
Fix the crash by initialising the 'mem->pools' list_head in
rmem_swiotlb_device_init() before calling add_mem_pool().
Reported-by: Nikita Ioffe <[email protected]> Tested-by: Nikita Ioffe <[email protected]> Fixes: 1aaa736815eb ("swiotlb: allocate a new memory pool when existing pools are full") Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
====================
net: gro: add flush/flush_id checks and fix wrong offset in udp
This series fixes a bug in the complete phase of UDP in GRO, in which
socket lookup fails due to using network_header when parsing encapsulated
packets. The fix is to add network_offset and inner_network_offset to
napi_gro_cb and use these offsets for socket lookup.
In addition p->flush/flush_id should be checked in all UDP flows. The
same logic from tcp_gro_receive is applied for all flows in
udp_gro_receive_segment. This prevents packets with mismatching network
headers (flush/flush_id turned on) from merging in UDP GRO.
The original series includes a change to vxlan test which adds the local
parameter to prevent similar future bugs. I plan to submit it separately to
net-next.
This series is part of a previously submitted series to net-next:
https://lore.kernel.org/all/20240408141720[email protected]/
v3 -> v4:
- Store network offsets, and use them only in udp_gro_complete flows
- Correct commit hash used in Fixes tag
- v3:
https://lore.kernel.org/netdev/20240424163045[email protected]/
v2 -> v3:
- Add network_offsets and fix udp bug in a single commit to make backporting easier
- Write to inner_network_offset in {inet,ipv6}_gro_receive
- Use network_offsets union in tcp[46]_gro_complete as well
- v2:
https://lore.kernel.org/netdev/20240419153542[email protected]/
v1 -> v2:
- Use network_offsets instead of p_poff param as suggested by Willem
- Check flush before postpull, and for all UDP GRO flows
- v1:
https://lore.kernel.org/netdev/20240412152120[email protected]/
====================
Richard Gobert [Tue, 30 Apr 2024 14:35:55 +0000 (16:35 +0200)]
net: gro: add flush check in udp_gro_receive_segment
GRO-GSO path is supposed to be transparent and as such L3 flush checks are
relevant to all UDP flows merging in GRO. This patch uses the same logic
and code from tcp_gro_receive, terminating merge if flush is non zero.
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Richard Gobert <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
Richard Gobert [Tue, 30 Apr 2024 14:35:54 +0000 (16:35 +0200)]
net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp:
additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
complete phase of gro. The functions always return skb->network_header,
which in the case of encapsulated packets at the gro complete phase, is
always set to the innermost L3 of the packet. That means that calling
{ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
L3/L4 may return an unexpected value.
This incorrect usage leads to a bug in GRO's UDP socket lookup.
udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
*_hdr functions return network_header which will point to the innermost L3,
resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
encapsulated packets.
This patch adds network_offset and inner_network_offset to napi_gro_cb, and
makes sure both are set correctly.
To fix the issue, network_offsets union is used inside napi_gro_cb, in
which both the outer and the inner network offsets are saved.
Reproduction example:
Endpoint configuration example (fou + local address bind)
# ip fou add port 6666 ipproto 4
# ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
# ip link set tun1 up
# ip a add 1.1.1.2/24 dev tun1
Netperf TCP_STREAM result on net-next before patch is applied:
Shigeru Yoshida [Tue, 30 Apr 2024 12:39:45 +0000 (21:39 +0900)]
ipv4: Fix uninit-value access in __ip_make_skb()
KMSAN reported uninit-value access in __ip_make_skb() [1]. __ip_make_skb()
tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a
race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL
while __ip_make_skb() is running, the function will access icmphdr in the
skb even if it is not included. This causes the issue reported by KMSAN.
Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL
on the socket.
Also, fl4->fl4_icmp_type and fl4->fl4_icmp_code are not initialized. These
are union in struct flowi4 and are implicitly initialized by
flowi4_init_output(), but we should not rely on specific union layout.
Andy Shevchenko [Thu, 25 Apr 2024 14:26:19 +0000 (17:26 +0300)]
drm/panel: ili9341: Use predefined error codes
In one case the -1 is returned which is quite confusing code for
the wrong device ID, in another the ret is returning instead of
plain 0 that also confusing as readed may ask the possible meaning
of positive codes, which are never the case there. Convert both
to use explicit predefined error codes to make it clear what's going
on there.
Andy Shevchenko [Thu, 25 Apr 2024 14:26:18 +0000 (17:26 +0300)]
drm/panel: ili9341: Respect deferred probe
GPIO controller might not be available when driver is being probed.
There are plenty of reasons why, one of which is deferred probe.
Since GPIOs are optional, return any error code we got to the upper
layer, including deferred probe. With that in mind, use dev_err_probe()
in order to avoid spamming the logs.
Alexandra Winter [Tue, 30 Apr 2024 09:10:04 +0000 (11:10 +0200)]
s390/qeth: Fix kernel panic after setting hsuid
Symptom:
When the hsuid attribute is set for the first time on an IQD Layer3
device while the corresponding network interface is already UP,
the kernel will try to execute a napi function pointer that is NULL.
Analysis:
There is one napi structure per out_q: card->qdio.out_qs[i].napi
The napi.poll functions are set during qeth_open().
Since
commit 1cfef80d4c2b ("s390/qeth: Don't call dev_close/dev_open (DOWN/UP)")
qeth_set_offline()/qeth_set_online() no longer call dev_close()/
dev_open(). So if qeth_free_qdio_queues() cleared
card->qdio.out_qs[i].napi.poll while the network interface was UP and the
card was offline, they are not set again.
Reproduction:
chzdev -e $devno layer2=0
ip link set dev $network_interface up
echo 0 > /sys/bus/ccwgroup/devices/0.0.$devno/online
echo foo > /sys/bus/ccwgroup/devices/0.0.$devno/hsuid
echo 1 > /sys/bus/ccwgroup/devices/0.0.$devno/online
-> Crash (can be enforced e.g. by af_iucv connect(), ip link down/up, ...)
Note that a Completion Queue (CQ) is only enabled or disabled, when hsuid
is set for the first time or when it is removed.
Workarounds:
- Set hsuid before setting the device online for the first time
or
- Use chzdev -d $devno; chzdev $devno hsuid=xxx; chzdev -e $devno;
to set hsuid on an existing device. (this will remove and recreate the
network interface)
Fix:
There is no need to free the output queues when a completion queue is
added or removed.
card->qdio.state now indicates whether the inbound buffer pool and the
outbound queues are allocated.
card->qdio.c_q indicates whether a CQ is allocated.
Takashi Iwai [Thu, 2 May 2024 06:24:42 +0000 (08:24 +0200)]
ALSA: hda/realtek: Fix build error without CONFIG_PM
The alc_spec.power_hook is defined only with CONFIG_PM, and the recent
fix overlooked it, resulting in a build error without CONFIG_PM.
Fix it with the simple ifdef and set __maybe_unused for the function.
We may drop the whole CONFIG_PM dependency there, but it should be
done in a separate cleanup patch later.
Ensure the inner IP header is part of skb's linear data before reading
its ECN bits. Otherwise we might read garbage.
One symptom is the system erroneously logging errors like
"vxlan: non-ECT from xxx.xxx.xxx.xxx with TOS=xxxx".
Similar bugs have been fixed in geneve, ip_tunnel and ip6_tunnel (see
commit 1ca1ba465e55 ("geneve: make sure to pull inner header in
geneve_rx()") for example). So let's reuse the same code structure for
consistency. Maybe we'll can add a common helper in the future.
Paolo Abeni [Tue, 30 Apr 2024 13:53:37 +0000 (15:53 +0200)]
tipc: fix UAF in error path
Sam Page (sam4k) working with Trend Micro Zero Day Initiative reported
a UAF in the tipc_buf_append() error path:
BUG: KASAN: slab-use-after-free in kfree_skb_list_reason+0x47e/0x4c0
linux/net/core/skbuff.c:1183
Read of size 8 at addr ffff88804d2a7c80 by task poc/8034
In the critical scenario, either the relevant skb is freed or its
ownership is transferred into a frag_lists. In both cases, the cleanup
code must not free it again: we need to clear the skb reference earlier.
The find connection logic of Transarc's Rx was modified in the mid-1990s
to support multi-homed servers which might send a response packet from
an address other than the destination address in the received packet.
The rules for accepting a packet by an Rx initiator (RX_CLIENT_CONNECTION)
were altered to permit acceptance of a packet from any address provided
that the port number was unchanged and all of the connection identifiers
matched (Epoch, CID, SecurityClass, ...).
This change applies the same rules to the Linux implementation which makes
it consistent with IBM AFS 3.6, Arla, OpenAFS and AuriStorFS.
Takashi Iwai [Wed, 1 May 2024 16:05:13 +0000 (18:05 +0200)]
Merge tag 'asoc-fix-v6.9-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.9
This is much larger than is ideal, partly due to your holiday but also
due to several vendors having come in with relatively large fixes at
similar times. It's all driver specific stuff.
The meson fixes from Jerome fix some rare timing issues with blocking
operations happening in triggers, plus the continuous clock support
which fixes clocking for some platforms. The SOF series from Peter
builds to the fix to avoid spurious resets of ChainDMA which triggered
errors in cleanup paths with both PulseAudio and PipeWire, and there's
also some simple new debugfs files from Pierre which make support a lot
eaiser.
Linus Torvalds [Wed, 1 May 2024 15:58:56 +0000 (08:58 -0700)]
Merge tag 'regulator-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"There's a few simple driver specific fixes here, plus some core
cleanups from Matti which fix issues found with client drivers due to
the API being confusing.
The two fixes for the stubs provide more constructive behaviour with
!REGULATOR configurations, issues were noticed with some hwmon drivers
which would otherwise have needed confusing bodges in the users.
The irq_helpers fix to duplicate the provided name for the interrupt
controller was found because a driver got this wrong and it's again a
case where the core is the sensible place to put the fix"
* tag 'regulator-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: change devm_regulator_get_enable_optional() stub to return Ok
regulator: change stubbed devm_regulator_get_enable to return Ok
regulator: vqmmc-ipq4019: fix module autoloading
regulator: qcom-refgen: fix module autoloading
regulator: mt6360: De-capitalize devicetree regulator subnodes
regulator: irq_helpers: duplicate IRQ name
Matthew Auld [Tue, 23 Apr 2024 07:47:23 +0000 (08:47 +0100)]
drm/xe/vm: prevent UAF in rebind_work_func()
We flush the rebind worker during the vm close phase, however in places
like preempt_fence_work_func() we seem to queue the rebind worker
without first checking if the vm has already been closed. The concern
here is the vm being closed with the worker flushed, but then being
rearmed later, which looks like potential uaf, since there is no actual
refcounting to track the queued worker. We can't take the vm->lock here
in preempt_rebind_work_func() to first check if the vm is closed since
that will deadlock, so instead flush the worker again when the vm
refcount reaches zero.
v2:
- Grabbing vm->lock in the preempt worker creates a deadlock, so
checking the closed state is tricky. Instead flush the worker when
the refcount reaches zero. It should be impossible to queue the
preempt worker without already holding vm ref.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1676 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1364 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249 Signed-off-by: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Cc: <[email protected]> # v6.8+ Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 3d44d67c441a9fe6f81a1d705f7de009a32a5b35) Signed-off-by: Lucas De Marchi <[email protected]>
drm/amd/display: Disable panel replay by default for now
Panel replay was enabled by default in commit 5950efe25ee0
("drm/amd/display: Enable Panel Replay for static screen use case"), but
it isn't working properly at least on some BOE and AUO panels. Instead
of being static the screen is solid black when active. As it's a new
feature that was just introduced that regressed VRR disable it for now
so that problem can be properly root caused.
Cc: Tom Chung <[email protected]> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344 Fixes: 5950efe25ee0 ("drm/amd/display: Enable Panel Replay for static screen use case") Signed-off-by: Mario Limonciello <[email protected]> Acked-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
Felix Fietkau [Sat, 27 Apr 2024 18:24:19 +0000 (20:24 +0200)]
net: core: reject skb_copy(_expand) for fraglist GSO skbs
SKB_GSO_FRAGLIST skbs must not be linearized, otherwise they become
invalid. Return NULL if such an skb is passed to skb_copy or
skb_copy_expand, in order to prevent a crash on a potential later
call to skb_gso_segment.
Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.") Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Felix Fietkau [Sat, 27 Apr 2024 18:24:18 +0000 (20:24 +0200)]
net: bridge: fix multicast-to-unicast with fraglist GSO
Calling skb_copy on a SKB_GSO_FRAGLIST skb is not valid, since it returns
an invalid linearized skb. This code only needs to change the ethernet
header, so pskb_copy is the right function to call here.
Hannes Reinecke [Thu, 18 Apr 2024 10:39:45 +0000 (12:39 +0200)]
nvme-tcp: strict pdu pacing to avoid send stalls on TLS
TLS requires a strict pdu pacing via MSG_EOR to signal the end
of a record and subsequent encryption. If we do not set MSG_EOR
at the end of a sequence the record won't be closed, encryption
doesn't start, and we end up with a send stall as the message
will never be passed on to the TCP layer.
So do not check for the queue status when TLS is enabled but
rather make the MSG_MORE setting dependent on the current
request only.
nvmet: fix nvme status code when namespace is disabled
If the user disabled a nvmet namespace, it is removed from the subsystem
namespaces list. When nvmet processes a command directed to an nsid that
was disabled, it cannot differentiate between a nsid that is disabled
vs. a non-existent namespace, and resorts to return NVME_SC_INVALID_NS
with the dnr bit set.
This translates to a non-retryable status for the host, which translates
to a user error. We should expect disabled namespaces to not cause an
I/O error in a multipath environment.
Address this by searching a configfs item for the namespace nvmet failed
to find, and if we found one, conclude that the namespace is disabled
(perhaps temporarily). Return NVME_SC_INTERNAL_PATH_ERROR in this case
and keep DNR bit cleared.
nvmet-tcp: fix possible memory leak when tearing down a controller
When we teardown the controller, we wait for pending I/Os to complete
(sq->ref on all queues to drop to zero) and then we go over the commands,
and free their command buffers in case they are still fetching data from
the host (e.g. processing nvme writes) and have yet to take a reference
on the sq.
However, we may miss the case where commands have failed before executing
and are queued for sending a response, but will never occur because the
queue socket is already down. In this case we may miss deallocating command
buffers.
Solve this by freeing all commands buffers as nvmet_tcp_free_cmd_buffers is
idempotent anyways.
nvme: cancel pending I/O if nvme controller is in terminal state
While I/O is running, if the pci bus error occurs then
in-flight I/O can not complete. Worst, if at this time,
user (logically) hot-unplug the nvme disk then the
nvme_remove() code path can't forward progress until
in-flight I/O is cancelled. So these sequence of events
may potentially hang hot-unplug code path indefinitely.
This patch helps cancel the pending/in-flight I/O from the
nvme request timeout handler in case the nvme controller
is in the terminal (DEAD/DELETING/DELETING_NOIO) state and
that helps nvme_remove() code path forward progress and
finish successfully.
nvme: find numa distance only if controller has valid numa id
On system where native nvme multipath is configured and iopolicy
is set to numa but the nvme controller numa node id is undefined
or -1 (NUMA_NO_NODE) then avoid calculating node distance for
finding optimal io path. In such case we may access numa distance
table with invalid index and that may potentially refer to incorrect
memory. So this patch ensures that if the nvme controller numa node
id is -1 then instead of calculating node distance for finding optimal
io path, we set the numa node distance of such controller to default 10
(LOCAL_DISTANCE).
With commit ed6776c96c60 ("s390/crypto: remove retry
loop with sleep from PAES pkey invocation") the retry
loop to retry derivation of a protected key from a
secure key has been removed. This was based on the
assumption that theses retries are not needed any
more as proper retries are done in the zcrypt layer.
However, tests have revealed that there exist some
cases with master key change in the HSM and immediately
(< 1 second) attempt to derive a protected key from a
secure key with exact this HSM may eventually fail.
The low level functions in zcrypt_ccamisc.c and
zcrypt_ep11misc.c detect and report this temporary
failure and report it to the caller as -EBUSY. The
re-established retry loop in the paes implementation
catches exactly this -EBUSY and eventually may run
some retries.
Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") Signed-off-by: Harald Freudenberger <[email protected]> Reviewed-by: Ingo Franzki <[email protected]> Reviewed-by: Holger Dengler <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
s390/zcrypt: Use EBUSY to indicate temp unavailability
Use -EBUSY instead of -EAGAIN in zcrypt_ccamisc.c
in cases where the CCA card returns 8/2290 to indicate
a temporarily unavailability of this function.
Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") Signed-off-by: Harald Freudenberger <[email protected]> Reviewed-by: Ingo Franzki <[email protected]> Reviewed-by: Holger Dengler <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
An EP11 reply cprb contains a field ret_code which may
hold an error code different than the error code stored
in the payload of the cprb. As of now all the EP11 misc
functions do not evaluate this field but focus on the
error code in the payload.
Before checking the payload error, first the cprb error
field should be evaluated which is introduced with this
patch.
If the return code value 0x000c0003 is seen, this
indicates a busy situation which is reflected by
-EBUSY in the zcrpyt_ep11misc.c low level function.
A higher level caller should consider to retry after
waiting a dedicated duration (say 1 second).
Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") Signed-off-by: Harald Freudenberger <[email protected]> Reviewed-by: Ingo Franzki <[email protected]> Reviewed-by: Holger Dengler <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
... and I think that's actually the important thing here:
- the first page fault is from user space, and triggers the vsyscall
emulation.
- the second page fault is from __do_sys_gettimeofday(), and that should
just have caused the exception that then sets the return value to
-EFAULT
- the third nested page fault is due to _raw_spin_unlock_irqrestore() ->
preempt_schedule() -> trace_sched_switch(), which then causes a BPF
trace program to run, which does that bpf_probe_read_compat(), which
causes that page fault under pagefault_disable().
It's quite the nasty backtrace, and there's a lot going on.
The problem is literally the vsyscall emulation, which sets
current->thread.sig_on_uaccess_err = 1;
and that causes the fixup_exception() code to send the signal *despite* the
exception being caught.
And I think that is in fact completely bogus. It's completely bogus
exactly because it sends that signal even when it *shouldn't* be sent -
like for the BPF user mode trace gathering.
In other words, I think the whole "sig_on_uaccess_err" thing is entirely
broken, because it makes any nested page-faults do all the wrong things.
Now, arguably, I don't think anybody should enable vsyscall emulation any
more, but this test case clearly does.
I think we should just make the "send SIGSEGV" be something that the
vsyscall emulation does on its own, not this broken per-thread state for
something that isn't actually per thread.
The x86 page fault code actually tried to deal with the "incorrect nesting"
by having that:
if (in_interrupt())
return;
which ignores the sig_on_uaccess_err case when it happens in interrupts,
but as shown by this example, these nested page faults do not need to be
about interrupts at all.
IOW, I think the only right thing is to remove that horrendously broken
code.
The attached patch looks like the ObviouslyCorrect(tm) thing to do.
NOTE! This broken code goes back to this commit in 2011:
4fc3490114bb ("x86-64: Set siginfo and context on vsyscall emulation faults")
... and back then the reason was to get all the siginfo details right.
Honestly, I do not for a moment believe that it's worth getting the siginfo
details right here, but part of the commit says:
This fixes issues with UML when vsyscall=emulate.
... and so my patch to remove this garbage will probably break UML in this
situation.
I do not believe that anybody should be running with vsyscall=emulate in
2024 in the first place, much less if you are doing things like UML. But
let's see if somebody screams.
Mans Rullgard [Tue, 30 Apr 2024 18:27:05 +0000 (19:27 +0100)]
spi: fix null pointer dereference within spi_sync
If spi_sync() is called with the non-empty queue and the same spi_message
is then reused, the complete callback for the message remains set while
the context is cleared, leading to a null pointer dereference when the
callback is invoked from spi_finalize_current_message().
With function inlining disabled, the call stack might look like this:
_raw_spin_lock_irqsave from complete_with_flags+0x18/0x58
complete_with_flags from spi_complete+0x8/0xc
spi_complete from spi_finalize_current_message+0xec/0x184
spi_finalize_current_message from spi_transfer_one_message+0x2a8/0x474
spi_transfer_one_message from __spi_pump_transfer_message+0x104/0x230
__spi_pump_transfer_message from __spi_transfer_message_noqueue+0x30/0xc4
__spi_transfer_message_noqueue from __spi_sync+0x204/0x248
__spi_sync from spi_sync+0x24/0x3c
spi_sync from mcp251xfd_regmap_crc_read+0x124/0x28c [mcp251xfd]
mcp251xfd_regmap_crc_read [mcp251xfd] from _regmap_raw_read+0xf8/0x154
_regmap_raw_read from _regmap_bus_read+0x44/0x70
_regmap_bus_read from _regmap_read+0x60/0xd8
_regmap_read from regmap_read+0x3c/0x5c
regmap_read from mcp251xfd_alloc_can_err_skb+0x1c/0x54 [mcp251xfd]
mcp251xfd_alloc_can_err_skb [mcp251xfd] from mcp251xfd_irq+0x194/0xe70 [mcp251xfd]
mcp251xfd_irq [mcp251xfd] from irq_thread_fn+0x1c/0x78
irq_thread_fn from irq_thread+0x118/0x1f4
irq_thread from kthread+0xd8/0xf4
kthread from ret_from_fork+0x14/0x28
Fix this by also setting message->complete to NULL when the transfer is
complete.
Lancelot SIX [Wed, 10 Apr 2024 13:14:13 +0000 (14:14 +0100)]
drm/amdkfd: Flush the process wq before creating a kfd_process
There is a race condition when re-creating a kfd_process for a process.
This has been observed when a process under the debugger executes
exec(3). In this scenario:
- The process executes exec.
- This will eventually release the process's mm, which will cause the
kfd_process object associated with the process to be freed
(kfd_process_free_notifier decrements the reference count to the
kfd_process to 0). This causes kfd_process_ref_release to enqueue
kfd_process_wq_release to the kfd_process_wq.
- The debugger receives the PTRACE_EVENT_EXEC notification, and tries to
re-enable AMDGPU traps (KFD_IOC_DBG_TRAP_ENABLE).
- When handling this request, KFD tries to re-create a kfd_process.
This eventually calls kfd_create_process and kobject_init_and_add.
At this point the call to kobject_init_and_add can fail because the
old kfd_process.kobj has not been freed yet by kfd_process_wq_release.
This patch proposes to avoid this race by making sure to drain
kfd_process_wq before creating a new kfd_process object. This way, we
know that any cleanup task is done executing when we reach
kobject_init_and_add.
When fallback to TCP happens early on a client socket, snd_nxt
is not yet initialized and any incoming ack will copy such value
into snd_una. If the mptcp worker (dumbly) tries mptcp-level
re-injection after such ack, that would unconditionally trigger a send
buffer cleanup using 'bad' snd_una values.
We could easily disable re-injection for fallback sockets, but such
dumb behavior already helped catching a few subtle issues and a very
low to zero impact in practice.
Instead address the issue always initializing snd_nxt (and write_seq,
for consistency) at connect time.
Leo Ma [Thu, 11 Apr 2024 21:17:04 +0000 (17:17 -0400)]
drm/amd/display: Fix DC mode screen flickering on DCN321
[Why && How]
Screen flickering saw on 4K@60 eDP with high refresh rate external
monitor when booting up in DC mode. DC Mode Capping is disabled
which caused wrong UCLK being used.
e1000e: change usleep_range to udelay in PHY mdic access
This is a partial revert of commit 6dbdd4de0362 ("e1000e: Workaround
for sporadic MDI error on Meteor Lake systems"). The referenced commit
used usleep_range inside the PHY access routines, which are sometimes
called from an atomic context. This can lead to a kernel panic in some
scenarios, such as cable disconnection and reconnection on vPro systems.
Solve this by changing the usleep_range calls back to udelay.
Christian König [Thu, 21 Mar 2024 10:32:02 +0000 (11:32 +0100)]
drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2
This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move
on same heap. The basic problem here is that after the move the old
location is simply not available any more.
Some fixes were suggested, but essentially we should call the move
notification before actually moving things because only this way we have
the correct order for DMA-buf and VM move notifications as well.
Also rework the statistic handling so that we don't update the eviction
counter before the move.
v2: add missing NULL check
Signed-off-by: Christian König <[email protected]> Fixes: 94aeb4117343 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171 Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
drm/amd/display: Allocate zero bw after bw alloc enable
[Why]
During DP tunnel creation, CM preallocates BW and reduces
estimated BW of other DPIA. CM release preallocation only
when allocation is complete. Display mode validation logic
validates timings based on bw available per host router.
In multi display setup, this causes bw allocation failure
when allocation greater than estimated bw.
[How]
Do zero alloc to make the CM to release preallocation and
update estimated BW correctly for all DPIAs per host router.
Hersen Wu [Tue, 13 Feb 2024 19:26:06 +0000 (14:26 -0500)]
drm/amd/display: Fix incorrect DSC instance for MST
[Why] DSC debugfs, such as dp_dsc_clock_en_read,
use aconnector->dc_link to find pipe_ctx for display.
Displays connected to MST hub share the same dc_link.
DSC instance is from pipe_ctx. This causes incorrect
DSC instance for display connected to MST hub.
[How] Add aconnector->sink check to find pipe_ctx.
Gabe Teeger [Tue, 9 Apr 2024 14:38:58 +0000 (10:38 -0400)]
drm/amd/display: Atom Integrated System Info v2_2 for DCN35
New request from KMD/VBIOS in order to support new UMA carveout
model. This fixes a null dereference from accessing
Ctx->dc_bios->integrated_info while it was NULL.
DAL parses through the BIOS and extracts the necessary
integrated_info but was missing a case for the new BIOS
version 2.3.
Currently DCN315 clk manager is missing code to enable/disable dtbclk.
Because of this, "optimized_required" flag is constantly set
and this prevents FreeSync from engaging for certain high bandwidth
display Modes which require DTBCLK.
The selftest for the driver sends a dummy packet and checks if the
packet will be received properly as it should be. The regular TX path
and the selftest can use the same network queue so locking is required
and was missing in the selftest path. This was addressed in the commit
cited below.
Unfortunately locking the TX queue requires BH to be disabled which is
not the case in selftest path which is invoked in process context.
Lockdep should be complaining about this.
Yunsheng Lin [Sun, 28 Apr 2024 11:16:38 +0000 (19:16 +0800)]
rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
rxrpc_alloc_data_txbuf() may be called with data_align being
zero in none_alloc_txbuf() and rxkad_alloc_txbuf(), data_align
is supposed to be an order-based alignment value, but zero is
not a valid order-based alignment value, and '~(data_align - 1)'
doesn't result in a valid mask-based alignment value for
__page_frag_alloc_align().
Fix it by passing a valid order-based alignment value in
none_alloc_txbuf() and rxkad_alloc_txbuf().
Also use page_frag_alloc_align() expecting an order-based
alignment value in rxrpc_alloc_data_txbuf() to avoid doing the
alignment converting operation and to catch possible invalid
alignment value in the future. Remove the 'if (data_align)'
checking too, as it is always true for a valid order-based
alignment value.
Subtract the VRAM pinned memory when checking for available memory
in amdgpu_amdkfd_reserve_mem_limit function since that memory is not
available for use.
ublk_drv currently creates block devices with the default max_segments
and max_segment_size limits of BLK_MAX_SEGMENTS (128) and
BLK_MAX_SEGMENT_SIZE (65536) respectively. These defaults can
artificially constrain the I/O size seen by the ublk server - for
example, suppose that the ublk server has configured itself to accept
I/Os up to 1M and the application is also issuing 1M sized I/Os. If the
I/O buffer used by the application is backed by 4K pages, the buffer
could consist of up to 1M / 4K = 256 physically discontiguous segments
(even if the buffer is virtually contiguous). As such, the I/O could
exceed the default max_segments limit and get split. This can cause
unnecessary performance issues if the ublk server is optimized to handle
1M I/Os. The block layer's segment count/size limits exist to model
hardware constraints which don't exist in ublk_drv's case, so just
remove those limits for the block devices created by ublk_drv.