Vlastimil Babka [Thu, 22 Feb 2024 21:59:31 +0000 (22:59 +0100)]
mm, mmap: fix vma_merge() case 7 with vma_ops->close
When debugging issues with a workload using SysV shmem, Michal Hocko has
come up with a reproducer that shows how a series of mprotect() operations
can result in an elevated shm_nattch and thus leak of the resource.
The problem is caused by wrong assumptions in vma_merge() commit 714965ca8252 ("mm/mmap: start distinguishing if vma can be removed in
mergeability test"). The shmem vmas have a vma_ops->close callback that
decrements shm_nattch, and we remove the vma without calling it.
vma_merge() has thus historically avoided merging vma's with
vma_ops->close and commit 714965ca8252 was supposed to keep it that way.
It relaxed the checks for vma_ops->close in can_vma_merge_after() assuming
that it is never called on a vma that would be a candidate for removal.
However, the vma_merge() code does also use the result of this check in
the decision to remove a different vma in the merge case 7.
A robust solution would be to refactor vma_merge() code in a way that the
vma_ops->close check is only done for vma's that are actually going to be
removed, and not as part of the preliminary checks. That would both solve
the existing bug, and also allow additional merges that the checks
currently prevent unnecessarily in some cases.
However to fix the existing bug first with a minimized risk, and for
easier stable backports, this patch only adds a vma_ops->close check to
the buggy case 7 specifically. All other cases of vma removal are covered
by the can_vma_merge_before() check that includes the test for
vma_ops->close.
The reproducer code, adapted from Michal Hocko's code:
int main(int argc, char *argv[]) {
int segment_id;
size_t segment_size = 20 * PAGE_SIZE;
char * sh_mem;
struct shmid_ds shmid_ds;
Qi Zheng [Thu, 22 Feb 2024 08:08:15 +0000 (16:08 +0800)]
mm: userfaultfd: fix unexpected change to src_folio when UFFDIO_MOVE fails
After ptep_clear_flush(), if we find that src_folio is pinned we will fail
UFFDIO_MOVE and put src_folio back to src_pte entry, but the change to
src_folio->{mapping,index} is not restored in this process. This is not
what we expected, so fix it.
This can cause the rmap for that page to be invalid, possibly resulting
in memory corruption. At least swapout+migration would no longer work,
because we might fail to locate the mappings of that folio.
Vlastimil Babka [Wed, 21 Feb 2024 11:43:58 +0000 (12:43 +0100)]
mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations
Sven reports an infinite loop in __alloc_pages_slowpath() for costly order
__GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO. Such combination
can happen in a suspend/resume context where a GFP_KERNEL allocation can
have __GFP_IO masked out via gfp_allowed_mask.
Quoting Sven:
1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER)
with __GFP_RETRY_MAYFAIL set.
2. page alloc's __alloc_pages_slowpath tries to get a page from the
freelist. This fails because there is nothing free of that costly
order.
3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim,
which bails out because a zone is ready to be compacted; it pretends
to have made a single page of progress.
4. page alloc tries to compact, but this always bails out early because
__GFP_IO is not set (it's not passed by the snd allocator, and even
if it were, we are suspending so the __GFP_IO flag would be cleared
anyway).
5. page alloc believes reclaim progress was made (because of the
pretense in item 3) and so it checks whether it should retry
compaction. The compaction retry logic thinks it should try again,
because:
a) reclaim is needed because of the early bail-out in item 4
b) a zonelist is suitable for compaction
6. goto 2. indefinite stall.
(end quote)
The immediate root cause is confusing the COMPACT_SKIPPED returned from
__alloc_pages_direct_compact() (step 4) due to lack of __GFP_IO to be
indicating a lack of order-0 pages, and in step 5 evaluating that in
should_compact_retry() as a reason to retry, before incrementing and
limiting the number of retries. There are however other places that
wrongly assume that compaction can happen while we lack __GFP_IO.
To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO
evaluation and switch the open-coded test in try_to_compact_pages() to use
it.
Also use the new helper in:
- compaction_ready(), which will make reclaim not bail out in step 3, so
there's at least one attempt to actually reclaim, even if chances are
small for a costly order
- in_reclaim_compaction() which will make should_continue_reclaim()
return false and we don't over-reclaim unnecessarily
- in __alloc_pages_slowpath() to set a local variable can_compact,
which is then used to avoid retrying reclaim/compaction for costly
allocations (step 5) if we can't compact and also to skip the early
compaction attempt that we do in some cases
mm/debug_vm_pgtable: fix BUG_ON with pud advanced test
Architectures like powerpc add debug checks to ensure we find only devmap
PUD pte entries. These debug checks are only done with CONFIG_DEBUG_VM.
This patch marks the ptes used for PUD advanced test devmap pte entries so
that we don't hit on debug checks on architecture like ppc64 as below.
Nhat Pham [Tue, 20 Feb 2024 03:01:21 +0000 (19:01 -0800)]
mm: cachestat: fix folio read-after-free in cache walk
In cachestat, we access the folio from the page cache's xarray to compute
its page offset, and check for its dirty and writeback flags. However, we
do not hold a reference to the folio before performing these actions,
which means the folio can concurrently be released and reused as another
folio/page/slab.
Get around this altogether by just using xarray's existing machinery for
the folio page offsets and dirty/writeback states.
This changes behavior for tmpfs files to now always report zeroes in their
dirty and writeback counters. This is okay as tmpfs doesn't follow
conventional writeback cache behavior: its pages get "cleaned" during
swapout, after which they're no longer resident etc.
Lorenzo Stoakes [Tue, 20 Feb 2024 06:44:10 +0000 (06:44 +0000)]
MAINTAINERS: add memory mapping entry with reviewers
Recently there have been a number of patches which have affected various
aspects of the memory mapping logic as implemented in mm/mmap.c where it
would have been useful for regular contributors to have been notified.
Add an entry for this part of mm in particular with regular contributors
tagged as reviewers.
Byungchul Park [Fri, 16 Feb 2024 11:15:02 +0000 (20:15 +0900)]
mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index
With numa balancing on, when a numa system is running where a numa node
doesn't have its local memory so it has no managed zones, the following
oops has been observed. It's because wakeup_kswapd() is called with a
wrong zone index, -1. Fixed it by checking the index before calling
wakeup_kswapd().
Marco Elver [Mon, 29 Jan 2024 10:07:02 +0000 (11:07 +0100)]
kasan: revert eviction of stack traces in generic mode
This partially reverts commits cc478e0b6bdf, 63b85ac56a64, 08d7c94d9635, a414d4286f34, and 773688a6cb24 to make use of variable-sized stack depot
records, since eviction of stack entries from stack depot forces fixed-
sized stack records. Care was taken to retain the code cleanups by the
above commits.
Eviction was added to generic KASAN as a response to alleviating the
additional memory usage from fixed-sized stack records, but this still
uses more memory than previously.
With the re-introduction of variable-sized records for stack depot, we can
just switch back to non-evictable stack records again, and return back to
the previous performance and memory usage baseline.
As can be seen from the counters, with a generic KASAN config, refcounted
allocations and evictions are no longer used. Due to using variable-sized
records, I observe a reduction of 278 stack depot pools (saving 4448 KiB)
with my test setup.
Marco Elver [Mon, 29 Jan 2024 10:07:01 +0000 (11:07 +0100)]
stackdepot: use variable size records for non-evictable entries
With the introduction of stack depot evictions, each stack record is now
fixed size, so that future reuse after an eviction can safely store
differently sized stack traces. In all cases that do not make use of
evictions, this wastes lots of space.
Fix it by re-introducing variable size stack records (up to the max
allowed size) for entries that will never be evicted. We know if an entry
will never be evicted if the flag STACK_DEPOT_FLAG_GET is not provided,
since a later stack_depot_put() attempt is undefined behavior.
With my current kernel config that enables KASAN and also SLUB owner
tracking, I observe (after a kernel boot) a whopping reduction of 296
stack depot pools, which translates into 4736 KiB saved. The savings here
are from SLUB owner tracking only, because KASAN generic mode still uses
refcounting.
There are no kasan_arch_is_ready() guards here, allowing an oops when the
shadow is not initialized. The oops can be seen on a Power8 KVM guest.
This patch adds the guard to release_free_meta(), as it's the first level
that specifically requires the shadow.
It is safe to put the guard at the start of this function, before the
stack put: only kasan_save_free_info() can initialize the saved stack,
which itself is guarded with kasan_arch_is_ready() by its caller
poison_slab_object(). If the arch becomes ready before
release_free_meta() then we will not observe KASAN_SLAB_FREE_META in the
object's shadow, so we will not put an uninitialized stack either.
SeongJae Park [Fri, 16 Feb 2024 19:40:25 +0000 (11:40 -0800)]
mm/damon/lru_sort: fix quota status loss due to online tunings
For online parameters change, DAMON_LRU_SORT creates new schemes based on
latest values of the parameters and replaces the old schemes with the new
one. When creating it, the internal status of the quotas of the old
schemes is not preserved. As a result, charging of the quota starts from
zero after the online tuning. The data that collected to estimate the
throughput of the scheme's action is also reset, and therefore the
estimation should start from the scratch again. Because the throughput
estimation is being used to convert the time quota to the effective size
quota, this could result in temporal time quota inaccuracy. It would be
recovered over time, though. In short, the quota accuracy could be
temporarily degraded after online parameters update.
Fix the problem by checking the case and copying the internal fields for
the status.
SeongJae Park [Fri, 16 Feb 2024 19:40:24 +0000 (11:40 -0800)]
mm/damon/reclaim: fix quota stauts loss due to online tunings
Patch series "mm/damon: fix quota status loss due to online tunings".
DAMON_RECLAIM and DAMON_LRU_SORT is not preserving internal quota status
when applying new user parameters, and hence could cause temporal quota
accuracy degradation. Fix it by preserving the status.
This patch (of 2):
For online parameters change, DAMON_RECLAIM creates new scheme based on
latest values of the parameters and replaces the old scheme with the new
one. When creating it, the internal status of the quota of the old
scheme is not preserved. As a result, charging of the quota starts from
zero after the online tuning. The data that collected to estimate the
throughput of the scheme's action is also reset, and therefore the
estimation should start from the scratch again. Because the throughput
estimation is being used to convert the time quota to the effective size
quota, this could result in temporal time quota inaccuracy. It would be
recovered over time, though. In short, the quota accuracy could be
temporarily degraded after online parameters update.
Fix the problem by checking the case and copying the internal fields for
the status.
SeongJae Park [Tue, 13 Feb 2024 02:36:32 +0000 (18:36 -0800)]
mm/damon/sysfs-schemes: handle schemes sysfs dir removal before commit_schemes_quota_goals
'commit_schemes_quota_goals' command handler,
damos_sysfs_set_quota_scores() assumes the number of schemes sysfs
directory will be same to the number of schemes of the DAMON context. The
assumption is wrong since users can remove schemes sysfs directories while
DAMON is running. In the case, illegal memory accesses can happen. Fix
it by checking the case.
The swapaccount deprecation warning is throwing false positives. Since we
deprecated the knob and defaulted to enabling, the only reports we've been
getting are from folks that set swapaccount=1. While this is a nice
affirmation that always-enabling was the right choice, we certainly don't
want to warn when users request the supported mode.
Only warn when disabling is requested, and clarify the warning.
mm/memblock: add MEMBLOCK_RSRV_NOINIT into flagname[] array
The commit 77e6c43e137c ("memblock: introduce MEMBLOCK_RSRV_NOINIT flag")
skipped adding this newly introduced memblock flag into flagname[] array,
thus preventing a correct memblock flags output for applicable memblock
regions.
Chengming Zhou [Thu, 8 Feb 2024 02:32:54 +0000 (02:32 +0000)]
mm/zswap: invalidate duplicate entry when !zswap_enabled
We have to invalidate any duplicate entry even when !zswap_enabled since
zswap can be disabled anytime. If the folio store success before, then
got dirtied again but zswap disabled, we won't invalidate the old
duplicate entry in the zswap_store(). So later lru writeback may
overwrite the new data in swapfile.
Kairui Song [Tue, 6 Feb 2024 18:25:59 +0000 (02:25 +0800)]
mm/swap: fix race when skipping swapcache
When skipping swapcache for SWP_SYNCHRONOUS_IO, if two or more threads
swapin the same entry at the same time, they get different pages (A, B).
Before one thread (T0) finishes the swapin and installs page (A) to the
PTE, another thread (T1) could finish swapin of page (B), swap_free the
entry, then swap out the possibly modified page reusing the same entry.
It breaks the pte_same check in (T0) because PTE value is unchanged,
causing ABA problem. Thread (T0) will install a stalled page (A) into the
PTE and cause data corruption.
One possible callstack is like this:
CPU0 CPU1
---- ----
do_swap_page() do_swap_page() with same entry
<direct swapin path> <direct swapin path>
<alloc page A> <alloc page B>
swap_read_folio() <- read to page A swap_read_folio() <- read to page B
<slow on later locks or interrupt> <finished swapin first>
... set_pte_at()
swap_free() <- entry is free
<write to page B, now page A stalled>
<swap out page B to same swap entry>
pte_same() <- Check pass, PTE seems
unchanged, but page A
is stalled!
swap_free() <- page B content lost!
set_pte_at() <- staled page A installed!
And besides, for ZRAM, swap_free() allows the swap device to discard the
entry content, so even if page (B) is not modified, if swap_read_folio()
on CPU0 happens later than swap_free() on CPU1, it may also cause data
loss.
To fix this, reuse swapcache_prepare which will pin the swap entry using
the cache flag, and allow only one thread to swap it in, also prevent any
parallel code from putting the entry in the cache. Release the pin after
PT unlocked.
Racers just loop and wait since it's a rare and very short event. A
schedule_timeout_uninterruptible(1) call is added to avoid repeated page
faults wasting too much CPU, causing livelock or adding too much noise to
perf statistics. A similar livelock issue was described in commit 029c4628b2eb ("mm: swap: get rid of livelock in swapin readahead")
Reproducer:
This race issue can be triggered easily using a well constructed
reproducer and patched brd (with a delay in read path) [1]:
With latest 6.8 mainline, race caused data loss can be observed easily:
$ gcc -g -lpthread test-thread-swap-race.c && ./a.out
Polulating 32MB of memory region...
Keep swapping out...
Starting round 0...
Spawning 65536 workers...
32746 workers spawned, wait for done...
Round 0: Error on 0x5aa00, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x395200, expected 32746, got 32743, 3 data loss!
Round 0: Error on 0x3fd000, expected 32746, got 32737, 9 data loss!
Round 0 Failed, 15 data loss!
This reproducer spawns multiple threads sharing the same memory region
using a small swap device. Every two threads updates mapped pages one by
one in opposite direction trying to create a race, with one dedicated
thread keep swapping out the data out using madvise.
The reproducer created a reproduce rate of about once every 5 minutes, so
the race should be totally possible in production.
After this patch, I ran the reproducer for over a few hundred rounds and
no data loss observed.
Performance overhead is minimal, microbenchmark swapin 10G from 32G
zram:
Nhat Pham [Mon, 5 Feb 2024 23:24:42 +0000 (15:24 -0800)]
mm/swap_state: update zswap LRU's protection range with the folio locked
When a folio is swapped in, the protection size of the corresponding zswap
LRU is incremented, so that the zswap shrinker is more conservative with
its reclaiming action. This field is embedded within the struct lruvec,
so updating it requires looking up the folio's memcg and lruvec. However,
currently this lookup can happen after the folio is unlocked, for instance
if a new folio is allocated, and swap_read_folio() unlocks the folio
before returning. In this scenario, there is no stability guarantee for
the binding between a folio and its memcg and lruvec:
* A folio's memcg and lruvec can be freed between the lookup and the
update, leading to a UAF.
* Folio migration can clear the now-unlocked folio's memcg_data, which
directs the zswap LRU protection size update towards the root memcg
instead of the original memcg. This was recently picked up by the
syzbot thanks to a warning in the inlined folio_lruvec() call.
Move the zswap LRU protection range update above the swap_read_folio()
call, and only when a new page is allocated, to prevent this.
Terry Tritton [Mon, 5 Feb 2024 14:50:56 +0000 (14:50 +0000)]
selftests/mm: uffd-unit-test check if huge page size is 0
If HUGETLBFS is not enabled then the default_huge_page_size function will
return 0 and cause a divide by 0 error. Add a check to see if the huge page
size is 0 and skip the hugetlb tests if it is.
SeongJae Park [Mon, 5 Feb 2024 20:13:06 +0000 (12:13 -0800)]
mm/damon/core: check apply interval in damon_do_apply_schemes()
kdamond_apply_schemes() checks apply intervals of schemes and avoid
further applying any schemes if no scheme passed its apply interval.
However, the following schemes applying function, damon_do_apply_schemes()
iterates all schemes without the apply interval check. As a result, the
shortest apply interval is applied to all schemes. Fix the problem by
checking the apply interval in damon_do_apply_schemes().
Yosry Ahmed [Thu, 25 Jan 2024 08:51:27 +0000 (08:51 +0000)]
mm: zswap: fix missing folio cleanup in writeback race path
In zswap_writeback_entry(), after we get a folio from
__read_swap_cache_async(), we grab the tree lock again to check that the
swap entry was not invalidated and recycled. If it was, we delete the
folio we just added to the swap cache and exit.
However, __read_swap_cache_async() returns the folio locked when it is
newly allocated, which is always true for this path, and the folio is
ref'd. Make sure to unlock and put the folio before returning.
This was discovered by code inspection, probably because this path handles
a race condition that should not happen often, and the bug would not crash
the system, it will only strand the folio indefinitely.
Linus Torvalds [Sun, 18 Feb 2024 18:09:25 +0000 (10:09 -0800)]
Merge tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Reformat nested if-conditionals in Makefiles with 4 spaces
- Fix CONFIG_DEBUG_INFO_BTF builds for big endian
- Fix modpost for module srcversion
- Fix an escape sequence warning in gen_compile_commands.py
- Fix kallsyms to ignore ARMv4 thunk symbols
* tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kallsyms: ignore ARMv4 thunks along with others
modpost: trim leading spaces when processing source files list
gen_compile_commands: fix invalid escape sequence warning
kbuild: Fix changing ELF file type for output of gen_btf for big endian
docs: kconfig: Fix grammar and formatting
kbuild: use 4-space indentation when followed by conditionals
Linus Torvalds [Sun, 18 Feb 2024 17:22:48 +0000 (09:22 -0800)]
Merge tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Use a GB page for identity mapping only when memory of this size is
requested so that mapping of reserved regions is prevented which
would otherwise lead to system crashes on UV machines
* tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
- Prevent spurious interrupts on Broadcom devices using GIC v3
architecture
- Other minor fixes
* tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
irqchip/gic-v3-its: Restore quirk probing for ACPI-based systems
irqchip/gic-v3-its: Handle non-coherent GICv4 redistributors
irqchip/qcom-mpm: Fix IS_ERR() vs NULL check in qcom_mpm_init()
irqchip/loongson-eiointc: Use correct struct type in eiointc_domain_alloc()
irqchip/irq-brcmstb-l2: Add write memory barrier before exit
Linus Torvalds [Sun, 18 Feb 2024 17:08:57 +0000 (09:08 -0800)]
Merge tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Two fixes for i801 and qcom-geni devices. Meanwhile, a fix from Arnd
addresses a compilation error encountered during compile test on
powerpc"
* tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i801: Fix block process call transactions
i2c: pasemi: split driver into two separate modules
i2c: qcom-geni: Correct I2C TRE sequence
Linus Torvalds [Sun, 18 Feb 2024 00:59:31 +0000 (16:59 -0800)]
Merge tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"This is a bit of a big batch for rc4, but just due to holiday hangover
and because I didn't send any fixes last week due to a late revert
request. I think next week should be back to normal.
- Fix ftrace bug on boot caused by exit text sections with
'-fpatchable-function-entry'
- Fix accuracy of stolen time on pseries since the switch to
VIRT_CPU_ACCOUNTING_GEN
- Fix a crash in the IOMMU code when doing DLPAR remove
- Set pt_regs->link on scv entry to fix BPF stack unwinding
- Add missing PPC_FEATURE_BOOKE on 64-bit e5500/e6500, which broke
gdb
- Fix boot on some 6xx platforms with STRICT_KERNEL_RWX enabled
- Fix build failures with KASAN enabled and 32KB stack size
- Some other minor fixes
Thanks to Arnd Bergmann, Benjamin Gray, Christophe Leroy, David
Engraf, Gaurav Batra, Jason Gunthorpe, Jiangfeng Xiao, Matthias
Schiffer, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A,
R Nageswara Sastry, Shivaprasad G Bhat, Shrikanth Hegde, Spoorthy,
Srikar Dronamraju, and Venkat Rao Bagalkote"
* tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach
powerpc/pseries: fix accuracy of stolen time
powerpc/ftrace: Ignore ftrace locations in exit text sections
powerpc/cputable: Add missing PPC_FEATURE_BOOKE on PPC64 Book-E
powerpc/kasan: Limit KASAN thread size increase to 32KB
Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add"
powerpc: 85xx: mark local functions static
powerpc: udbg_memcons: mark functions static
powerpc/kasan: Fix addr error caused by page alignment
powerpc/6xx: set High BAT Enable flag on G2_LE cores
selftests/powerpc/papr_vpd: Check devfd before get_system_loc_code()
powerpc/64: Set task pt_regs->link to the LR value on scv entry
powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add
powerpc/pseries/papr-sysparm: use u8 arrays for payloads
Linus Torvalds [Sat, 17 Feb 2024 16:56:41 +0000 (08:56 -0800)]
Merge tag 'driver-core-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some driver core fixes, a kobject fix, and a documentation
update for 6.8-rc5. In detail these changes are:
- devlink fixes for reported issues with 6.8-rc1
- topology scheduling regression fix that has been reported by many
- kobject loosening of checks change in -rc1 is now reverted as some
codepaths seemed to need the checks
- documentation update for the CVE process. Has been reviewed by
many, the last minute change to the document was to bring the .rst
format back into the the new style rules, the contents did not
change.
All of these, except for the documentation update, have been in
linux-next for over a week. The documentation update has been reviewed
for weeks by a group of developers, and in public for a week and the
wording has stabilized for now. If future changes are needed, we can
do so before 6.8-final is out (or anytime after that)"
* tag 'driver-core-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation: Document the Linux Kernel CVE process
Revert "kobject: Remove redundant checks for whether ktype is NULL"
driver core: fw_devlink: Improve logs for cycle detection
driver core: fw_devlink: Improve detection of overlapping cycles
driver core: Fix device_link_flag_is_sync_state_only()
topology: Set capacity_freq_ref in all cases
Linus Torvalds [Sat, 17 Feb 2024 16:52:38 +0000 (08:52 -0800)]
Merge tag 'char-misc-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / miscdriver fixes from Greg KH:
"Here is a small set of char/misc and IIO driver fixes for 6.8-rc5.
Included in here are:
- lots of iio driver fixes for reported issues
- nvmem device naming fixup for reported problem
- interconnect driver fixes for reported issues
All of these have been in linux-next for a while with no reported the
issues (the nvmem patch was included in a different branch in
linux-next before sent to me for inclusion here)"
* tag 'char-misc-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
nvmem: include bit index in cell sysfs file name
iio: adc: ad4130: only set GPIO_CTRL if pin is unused
iio: adc: ad4130: zero-initialize clock init data
interconnect: qcom: x1e80100: Add missing ACV enable_mask
interconnect: qcom: sm8650: Use correct ACV enable_mask
iio: accel: bma400: Fix a compilation problem
iio: commom: st_sensors: ensure proper DMA alignment
iio: hid-sensor-als: Return 0 for HID_USAGE_SENSOR_TIME_TIMESTAMP
iio: move LIGHT_UVA and LIGHT_UVB to the end of iio_modifier
staging: iio: ad5933: fix type mismatch regression
iio: humidity: hdc3020: fix temperature offset
iio: adc: ad7091r8: Fix error code in ad7091r8_gpio_setup()
iio: adc: ad_sigma_delta: ensure proper DMA alignment
iio: imu: adis: ensure proper DMA alignment
iio: humidity: hdc3020: Add Makefile, Kconfig and MAINTAINERS entry
iio: imu: bno055: serdev requires REGMAP
iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC
iio: pressure: bmp280: Add missing bmp085 to SPI id table
iio: core: fix memleak in iio_device_register_sysfs
interconnect: qcom: sm8550: Enable sync_state
...
Linus Torvalds [Sat, 17 Feb 2024 16:46:57 +0000 (08:46 -0800)]
Merge tag 'tty-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
"Here are three small tty and serial driver fixes for 6.8-rc5:
- revert a 8250_pci1xxxx off-by-one change that was incorrect
- two changes to fix the transmit path of the mxs-auart driver,
fixing a regression in the 6.2 release
All of these have been in linux-next for over a week with no reported
issues"
* tag 'tty-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: mxs-auart: fix tx
serial: core: introduce uart_port_tx_flags()
serial: 8250_pci1xxxx: partially revert off by one patch
Linus Torvalds [Sat, 17 Feb 2024 16:44:55 +0000 (08:44 -0800)]
Merge tag 'usb-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt fixes from Greg KH:
"Here are two small fixes for 6.8-rc5:
- thunderbolt to fix a reported issue on many platforms
- dwc3 driver revert of a commit that caused problems in -rc1
Both of these changes have been in linux-next for over a week with no
reported issues"
* tag 'usb-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "usb: dwc3: Support EBC feature of DWC_usb31"
thunderbolt: Fix setting the CNS bit in ROUTER_CS_5
Linus Torvalds [Sat, 17 Feb 2024 16:13:32 +0000 (08:13 -0800)]
Merge tag 'media/v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- regression fix for rkisp1 shared IRQ logic
- fix atomisp breakage due to a kAPI change
- permission fix for remote controller BPF support
- memleak fix in ir_toy driver
- Kconfig dependency fix for pwm-ir-rx
* tag 'media/v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: pwm-ir-tx: Depend on CONFIG_HIGH_RES_TIMERS
media: ir_toy: fix a memleak in irtoy_tx
media: rc: bpf attach/detach requires write permission
media: atomisp: Adjust for v4l2_subdev_state handling changes in 6.8
media: rkisp1: Fix IRQ handling due to shared interrupts
media: Revert "media: rkisp1: Drop IRQF_SHARED"
Linus Torvalds [Sat, 17 Feb 2024 16:06:20 +0000 (08:06 -0800)]
Merge tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Keep bridges in D0 if we need to poll downstream devices for PME to
resolve a v6.6 regression where we failed to enumerate devices below
bridges put in D3hot by runtime PM, e.g., NVMe drives connected via
Thunderbolt or USB4 docks (Alex Williamson)
- Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer
* tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
MAINTAINERS: Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer
PCI: Fix active state requirement in PME polling
Linus Torvalds [Sat, 17 Feb 2024 15:59:47 +0000 (07:59 -0800)]
Merge tag 'probes-fixes-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fix from Masami Hiramatsu:
- tracing/probes: Fix BTF structure member finder to find the members
which are placed after any anonymous union member correctly.
* tag 'probes-fixes-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Fix to search structure fields correctly
Linus Torvalds [Sat, 17 Feb 2024 15:56:10 +0000 (07:56 -0800)]
Merge tag '6.8-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Five smb3 client fixes, most also for stable:
- Two multichannel fixes (one to fix potential handle leak on retry)
- Work around possible serious data corruption (due to change in
folios in 6.3, for cases when non standard maximum write size
negotiated)
- Symlink creation fix
- Multiuser automount fix"
* tag '6.8-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: Fix regression in writes when non-standard maximum write size negotiated
smb: client: handle path separator of created SMB symlinks
smb: client: set correct id, uid and cruid for multiuser automounts
cifs: update the same create_guid on replay
cifs: fix underflow in parse_server_interfaces()
Documentation: Document the Linux Kernel CVE process
The Linux kernel project now has the ability to assign CVEs to fixed
issues, so document the process and how individual developers can get a
CVE if one is not automatically assigned for their fixes.
tracing/probes: Fix to search structure fields correctly
Fix to search a field from the structure which has anonymous union
correctly.
Since the reference `type` pointer was updated in the loop, the search
loop suddenly aborted where it hits an anonymous union. Thus it can not
find the field after the anonymous union. This avoids updating the
cursor `type` pointer in the loop.
Wolfram Sang [Sat, 17 Feb 2024 12:13:33 +0000 (13:13 +0100)]
Merge tag 'i2c-host-fixes-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
Three fixes are included here. Two are strictly hardware-related
for the i801 and qcom-geni devices. Meanwhile, a fix from Arnd
addresses a compilation error encountered during compile test on
powerpc.
Linus Torvalds [Fri, 16 Feb 2024 22:05:02 +0000 (14:05 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three fixes: the two fnic ones are a revert and a refix, which is why
the diffstat is a bit big. The target one also extracts a function to
add a check for configuration and so looks bigger than it is"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: fnic: Move fnic_fnic_flush_tx() to a work queue
scsi: Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock"
scsi: target: Fix unmap setup during configuration
Linus Torvalds [Fri, 16 Feb 2024 22:00:19 +0000 (14:00 -0800)]
Merge tag 'wq-for-6.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"Just one patch to revert commit ca10d851b9ad ("workqueue: Override
implicit ordered attribute in workqueue_apply_unbound_cpumask()").
This commit could break ordering guarantees for ordered workqueues.
The problem that the commit tried to resolve partially - making
ordered workqueues follow unbound cpumask - is fully solved in
wq/for-6.9 branch"
* tag 'wq-for-6.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
Revert "workqueue: Override implicit ordered attribute in workqueue_apply_unbound_cpumask()"
Linus Torvalds [Fri, 16 Feb 2024 21:37:15 +0000 (13:37 -0800)]
Merge tag 'ceph-for-6.8-rc5' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Additional cap handling fixes from Xiubo to avoid "client isn't
responding to mclientcaps(revoke)" stalls on the MDS side"
* tag 'ceph-for-6.8-rc5' of https://github.com/ceph/ceph-client:
ceph: add ceph_cap_unlink_work to fire check_caps() immediately
ceph: always queue a writeback when revoking the Fb caps
Linus Torvalds [Fri, 16 Feb 2024 18:48:14 +0000 (10:48 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM:
- Avoid dropping the page refcount twice when freeing an unlinked
page-table subtree.
- Don't source the VFIO Kconfig twice
- Fix protected-mode locking order between kvm and vcpus
RISC-V:
- Fix steal-time related sparse warnings
x86:
- Cleanup gtod_is_based_on_tsc() to return "bool" instead of an "int"
- Make a KVM_REQ_NMI request while handling KVM_SET_VCPU_EVENTS if
and only if the incoming events->nmi.pending is non-zero. If the
target vCPU is in the UNITIALIZED state, the spurious request will
result in KVM exiting to userspace, which in turn causes QEMU to
constantly acquire and release QEMU's global mutex, to the point
where the BSP is unable to make forward progress.
- Fix a type (u8 versus u64) goof that results in pmu->fixed_ctr_ctrl
being incorrectly truncated, and ultimately causes KVM to think a
fixed counter has already been disabled (KVM thinks the old value
is '0').
- Fix a stack leak in KVM_GET_MSRS where a failed MSR read from
userspace that is ultimately ignored due to ignore_msrs=true
doesn't zero the output as intended.
Selftests cleanups and fixes:
- Remove redundant newlines from error messages.
- Delete an unused variable in the AMX test (which causes build
failures when compiling with -Werror).
- Fail instead of skipping tests if open(), e.g. of /dev/kvm, fails
with an error code other than ENOENT (a Hyper-V selftest bug
resulted in an EMFILE, and the test eventually got skipped).
- Fix TSC related bugs in several Hyper-V selftests.
- Fix a bug in the dirty ring logging test where a sem_post() could
be left pending across multiple runs, resulting in incorrect
synchronization between the main thread and the vCPU worker thread.
- Relax the dirty log split test's assertions on 4KiB mappings to fix
false positives due to the number of mappings for memslot 0 (used
for code and data that is NOT being dirty logged) changing, e.g.
due to NUMA balancing"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
KVM: arm64: Fix double-free following kvm_pgtable_stage2_free_unlinked()
RISC-V: KVM: Use correct restricted types
RISC-V: paravirt: Use correct restricted types
RISC-V: paravirt: steal_time should be static
KVM: selftests: Don't assert on exact number of 4KiB in dirty log split test
KVM: selftests: Fix a semaphore imbalance in the dirty ring logging test
KVM: x86: Fix KVM_GET_MSRS stack info leak
KVM: arm64: Do not source virt/lib/Kconfig twice
KVM: x86/pmu: Fix type length error when reading pmu->fixed_ctr_ctrl
KVM: x86: Make gtod_is_based_on_tsc() return 'bool'
KVM: selftests: Make hyperv_clock require TSC based system clocksource
KVM: selftests: Run clocksource dependent tests with hyperv_clocksource_tsc_page too
KVM: selftests: Use generic sys_clocksource_is_tsc() in vmx_nested_tsc_scaling_test
KVM: selftests: Generalize check_clocksource() from kvm_clock_test
KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
KVM: arm64: Fix circular locking dependency
KVM: selftests: Fail tests when open() fails with !ENOENT
KVM: selftests: Avoid infinite loop in hyperv_features when invtsc is missing
KVM: selftests: Delete superfluous, unused "stage" variable in AMX test
KVM: selftests: x86_64: Remove redundant newlines
...
Linus Torvalds [Fri, 16 Feb 2024 18:33:51 +0000 (10:33 -0800)]
Merge tag 'trace-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix the #ifndef that didn't have the 'CONFIG_' prefix on
HAVE_DYNAMIC_FTRACE_WITH_REGS
The fix to have dynamic trampolines work with x86 broke arm64 as the
config used in the #ifdef was HAVE_DYNAMIC_FTRACE_WITH_REGS and not
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS which removed the fix that the
previous fix was to fix.
- Fix tracing_on state
The code to test if "tracing_on" is set incorrectly used
ring_buffer_record_is_on() which returns false if the ring buffer
isn't able to be written to.
But the ring buffer disable has several bits that disable it. One is
internal disabling which is used for resizing and other modifications
of the ring buffer. But the "tracing_on" user space visible flag
should only report if tracing is actually on and not internally
disabled, as this can cause confusion as writing "1" when it is
disabled will not enable it.
Instead use ring_buffer_record_is_set_on() which shows the user space
visible settings.
- Fix a false positive kmemleak on saved cmdlines
Now that the saved_cmdlines structure is allocated via alloc_page()
and not via kmalloc() it has become invisible to kmemleak. The
allocation done to one of its pointers was flagged as a dangling
allocation leak. Make kmemleak aware of this allocation and free.
- Fix synthetic event dynamic strings
An update that cleaned up the synthetic event code removed the return
value of trace_string(), and had it return zero instead of the
length, causing dynamic strings in the synthetic event to always have
zero size.
- Clean up documentation and header files for seq_buf
* tag 'trace-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
seq_buf: Fix kernel documentation
seq_buf: Don't use "proxy" headers
tracing/synthetic: Fix trace_string() return value
tracing: Inform kmemleak of saved_cmdlines allocation
tracing: Use ring_buffer_record_is_set_on() in tracer_tracing_is_on()
tracing: Fix HAVE_DYNAMIC_FTRACE_WITH_REGS ifdef
Linus Torvalds [Fri, 16 Feb 2024 18:28:29 +0000 (10:28 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"It's a little busier than normal, but it's still not a lot of code and
things seem fairly quiet in general:
- Fix allocation failure during SVE coredumps
- Fix handling of SVE context on signal delivery
- Enable Neoverse N2 CPU errata workarounds for Microsoft's "Azure
Cobalt 100" clone
- Work around CMN PMU erratum in AmpereOneX implementation
- Fix typo in CXL PMU event definition
- Fix jump label asm constraints"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/sve: Lower the maximum allocation for the SVE ptrace regset
arm64: Subscribe Microsoft Azure Cobalt 100 to ARM Neoverse N2 errata
perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)
arm64: jump_label: use constraints "Si" instead of "i"
arm64: fix typo in comments
perf: CXL: fix mismatched cpmu event opcode
arm64/signal: Don't assume that TIF_SVE means we saved SVE state
Linus Torvalds [Fri, 16 Feb 2024 17:29:26 +0000 (09:29 -0800)]
Merge tag 'zonefs-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fix from Damien Le Moal:
- Fix direct write error handling to avoid a race between failed IO
completion and the submission path itself which can result in an
invalid file size exposed to the user after the failed IO.
* tag 'zonefs-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Improve error handling
Linus Torvalds [Fri, 16 Feb 2024 17:02:19 +0000 (09:02 -0800)]
Merge tag 'sound-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of device-specific fixes. It became a bit bigger than
wished, but all look reasonably small and safe to apply.
- A few Cirrus Logic CS35L56 and CS42L43 driver fixes
- ASoC SOF fixes and workarounds
- Various ASoC Intel fixes
- Lots of HD-, USB-audio and AMD ACP quirks"
* tag 'sound-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits)
ALSA: usb-audio: More relaxed check of MIDI jack names
ALSA: hda/realtek: fix mute/micmute LED For HP mt645
ALSA: hda/realtek: cs35l41: Fix order and duplicates in quirks table
ALSA: hda/realtek: cs35l41: Fix device ID / model name
ALSA: hda/realtek: cs35l41: Add internal speaker support for ASUS UM3402 with missing DSD
ASoC: cs35l56: Workaround for ACPI with broken spk-id-gpios property
ALSA: hda: Add Lenovo Legion 7i gen7 sound quirk
ASoC: SOF: IPC3: fix message bounds on ipc ops
ASoC: SOF: ipc4-pcm: Workaround for crashed firmware on system suspend
ASoC: q6dsp: fix event handler prototype
ASoC: SOF: Intel: pci-lnl: Change the topology path to intel/sof-ipc4-tplg
ASoC: SOF: Intel: pci-tgl: Change the default paths and firmware names
ASoC: amd: yc: Fix non-functional mic on Lenovo 82UU
ASoC: rt5645: Add DMI quirk for inverted jack-detect on MeeGoPad T8
ASoC: rt5645: Make LattePanda board DMI match more precise
ASoC: SOF: amd: Fix locking in ACP IRQ handler
ASoC: rt5645: Fix deadlock in rt5645_jack_detect_work()
ASoC: Intel: cht_bsw_rt5645: Cleanup codec_name handling
ASoC: Intel: Boards: Fix NULL pointer deref in BYT/CHT boards
ASoC: cs35l56: Remove default from IRQ1_CFG register
...
Linus Torvalds [Fri, 16 Feb 2024 16:55:46 +0000 (08:55 -0800)]
Merge tag 'gpio-fixes-for-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- add missing stubs for functions that are not built with GPIOLIB
disabled
* tag 'gpio-fixes-for-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: add gpio_device_get_label() stub for !GPIOLIB
gpiolib: add gpio_device_get_base() stub for !GPIOLIB
gpiolib: add gpiod_to_gpio_device() stub for !GPIOLIB
Linus Torvalds [Fri, 16 Feb 2024 16:05:30 +0000 (08:05 -0800)]
Merge tag 'drm-fixes-2024-02-16' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular weekly fixes, nothing too major, mostly amdgpu, then i915, xe,
msm and nouveau with some scattered bits elsewhere.
crtc:
- fix uninit variable
prime:
- support > 4GB page arrays
buddy:
- fix error handling in allocations
i915:
- fix blankscreen on JSL chromebooks
- stable fix to limit DP sst link rates
xe:
- Fix an out-of-bounds shift.
- Fix the display code thinking xe uses shmem
- Fix a warning about index out-of-bound
- Fix a clang-16 compilation warning
amdgpu:
- PSR fixes
- Suspend/resume fixes
- Link training fix
- Aspect ratio fix
- DCN 3.5 fixes
- VCN 4.x fix
- GFX 11 fix
- Misc display fixes
- Misc small fixes
amdkfd:
- Cache size reporting fix
- SIMD distribution fix
msm:
- GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable
- tlb invalidation fix
* tag 'drm-fixes-2024-02-16' of git://anongit.freedesktop.org/drm/drm: (38 commits)
drm/amdgpu: Fix implicit assumtion in gfx11 debug flags
drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards
drm/amd/display: Increase ips2_eval delay for DCN35
drm/amdgpu/display: Initialize gamma correction mode variable in dcn30_get_gamcor_current()
drm/amdgpu/soc21: update VCN 4 max HEVC encoding resolution
drm/amd/display: fixed integer types and null check locations
drm/amd/display: Fix array-index-out-of-bounds in dcn35_clkmgr
drm/amd/display: Preserve original aspect ratio in create stream
drm/amd/display: Fix possible NULL dereference on device remove/driver unload
Revert "drm/amd/display: increased min_dcfclk_mhz and min_fclk_mhz"
drm/amd/display: Add align done check
Revert "drm/amd: flush any delayed gfxoff on suspend entry"
drm/amd: Stop evicting resources on APUs in suspend
drm/amd/display: Fix possible buffer overflow in 'find_dcfclk_for_voltage()'
drm/amd/display: Fix possible use of uninitialized 'max_chunks_fbc_mode' in 'calculate_bandwidth()'
drm/amd/display: Initialize 'wait_time_microsec' variable in link_dp_training_dpia.c
drm/amd/display: Fix && vs || typos
drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3
drm/amdgpu: make damage clips support configurable
drm/msm: Wire up tlb ops
...
Dave Airlie [Fri, 16 Feb 2024 05:45:03 +0000 (15:45 +1000)]
Merge tag 'drm-xe-fixes-2024-02-15' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix an out-of-bounds shift.
- Fix the display code thinking xe uses shmem
- Fix a warning about index out-of-bound
- Fix a clang-16 compilation warning
Dave Airlie [Fri, 16 Feb 2024 04:33:09 +0000 (14:33 +1000)]
Merge tag 'drm-misc-fixes-2024-02-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A suspend/resume error fix for ivpu, a couple of scheduler fixes for
nouveau, a patch to support large page arrays in prime, a uninitialized
variable fix in crtc, a locking fix in rockchip/vop2 and a buddy
allocator error reporting fix.
Steve French [Tue, 6 Feb 2024 22:34:22 +0000 (16:34 -0600)]
smb: Fix regression in writes when non-standard maximum write size negotiated
The conversion to netfs in the 6.3 kernel caused a regression when
maximum write size is set by the server to an unexpected value which is
not a multiple of 4096 (similarly if the user overrides the maximum
write size by setting mount parm "wsize", but sets it to a value that
is not a multiple of 4096). When negotiated write size is not a
multiple of 4096 the netfs code can skip the end of the final
page when doing large sequential writes, causing data corruption.
This section of code is being rewritten/removed due to a large
netfs change, but until that point (ie for the 6.3 kernel until now)
we can not support non-standard maximum write sizes.
Add a warning if a user specifies a wsize on mount that is not
a multiple of 4096 (and round down), also add a change where we
round down the maximum write size if the server negotiates a value
that is not a multiple of 4096 (we also have to check to make sure that
we do not round it down to zero).
Damien Le Moal [Thu, 8 Feb 2024 08:26:59 +0000 (17:26 +0900)]
zonefs: Improve error handling
Write error handling is racy and can sometime lead to the error recovery
path wrongly changing the inode size of a sequential zone file to an
incorrect value which results in garbage data being readable at the end
of a file. There are 2 problems:
1) zonefs_file_dio_write() updates a zone file write pointer offset
after issuing a direct IO with iomap_dio_rw(). This update is done
only if the IO succeed for synchronous direct writes. However, for
asynchronous direct writes, the update is done without waiting for
the IO completion so that the next asynchronous IO can be
immediately issued. However, if an asynchronous IO completes with a
failure right before the i_truncate_mutex lock protecting the update,
the update may change the value of the inode write pointer offset
that was corrected by the error path (zonefs_io_error() function).
2) zonefs_io_error() is called when a read or write error occurs. This
function executes a report zone operation using the callback function
zonefs_io_error_cb(), which does all the error recovery handling
based on the current zone condition, write pointer position and
according to the mount options being used. However, depending on the
zoned device being used, a report zone callback may be executed in a
context that is different from the context of __zonefs_io_error(). As
a result, zonefs_io_error_cb() may be executed without the inode
truncate mutex lock held, which can lead to invalid error processing.
Fix both problems as follows:
- Problem 1: Perform the inode write pointer offset update before a
direct write is issued with iomap_dio_rw(). This is safe to do as
partial direct writes are not supported (IOMAP_DIO_PARTIAL is not
set) and any failed IO will trigger the execution of zonefs_io_error()
which will correct the inode write pointer offset to reflect the
current state of the one on the device.
- Problem 2: Change zonefs_io_error_cb() into zonefs_handle_io_error()
and call this function directly from __zonefs_io_error() after
obtaining the zone information using blkdev_report_zones() with a
simple callback function that copies to a local stack variable the
struct blk_zone obtained from the device. This ensures that error
handling is performed holding the inode truncate mutex.
This change also simplifies error handling for conventional zone files
by bypassing the execution of report zones entirely. This is safe to
do because the condition of conventional zones cannot be read-only or
offline and conventional zone files are always fully mapped with a
constant file size.
Linus Torvalds [Thu, 15 Feb 2024 19:39:27 +0000 (11:39 -0800)]
Merge tag 'net-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from can, wireless and netfilter.
Current release - regressions:
- af_unix: fix task hung while purging oob_skb in GC
- pds_core: do not try to run health-thread in VF path
Current release - new code bugs:
- sched: act_mirred: don't zero blockid when net device is being
deleted
Previous releases - regressions:
- netfilter:
- nat: restore default DNAT behavior
- nf_tables: fix bidirectional offload, broken when unidirectional
offload support was added
- openvswitch: limit the number of recursions from action sets
- eth: i40e: do not allow untrusted VF to remove administratively set
MAC address
Previous releases - always broken:
- tls: fix races and bugs in use of async crypto
- mptcp: prevent data races on some of the main socket fields, fix
races in fastopen handling
- dpll: fix possible deadlock during netlink dump operation
- dsa: lan966x: fix crash when adding interface under a lag when some
of the ports are disabled
- can: j1939: prevent deadlock by changing j1939_socks_lock to rwlock
Misc:
- a handful of fixes and reliability improvements for selftests
- fix sysfs documentation missing net/ in paths
- finish the work of squashing the missing MODULE_DESCRIPTION()
warnings in networking"
* tag 'net-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits)
net: fill in MODULE_DESCRIPTION()s for missing arcnet
net: fill in MODULE_DESCRIPTION()s for mdio_devres
net: fill in MODULE_DESCRIPTION()s for ppp
net: fill in MODULE_DESCRIPTION()s for fddik/skfp
net: fill in MODULE_DESCRIPTION()s for plip
net: fill in MODULE_DESCRIPTION()s for ieee802154/fakelb
net: fill in MODULE_DESCRIPTION()s for xen-netback
net: ravb: Count packets instead of descriptors in GbEth RX path
pppoe: Fix memory leak in pppoe_sendmsg()
net: sctp: fix skb leak in sctp_inq_free()
net: bcmasp: Handle RX buffer allocation failure
net-timestamp: make sk_tskey more predictable in error path
selftests: tls: increase the wait in poll_partial_rec_async
ice: Add check for lport extraction to LAG init
netfilter: nf_tables: fix bidirectional offload regression
netfilter: nat: restore default DNAT behavior
netfilter: nft_set_pipapo: fix missing : in kdoc
igc: Remove temporary workaround
igb: Fix string truncation warnings in igb_set_fw_version
can: netlink: Fix TDCO calculation using the old data bittiming
...
Linus Torvalds [Thu, 15 Feb 2024 19:33:35 +0000 (11:33 -0800)]
Merge tag 'for-linus-6.8a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Fixes and simple cleanups:
- use a proper flexible array instead of a one-element array in order
to avoid array-bounds sanitizer errors
- add NULL pointer checks after allocating memory
- use memdup_array_user() instead of open-coding it
- fix a rare race condition in Xen event channel allocation code
- make struct bus_type instances const
- make kerneldoc inline comments match reality"
* tag 'for-linus-6.8a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: close evtchn after mapping cleanup
xen/gntalloc: Replace UAPI 1-element array
xen: balloon: make balloon_subsys const
xen: pcpu: make xen_pcpu_subsys const
xen/privcmd: Use memdup_array_user() in alloc_ioreq()
x86/xen: Add some null pointer checking to smp.c
xen/xenbus: document will_handle argument for xenbus_watch_path()
drm/amdgpu: Fix implicit assumtion in gfx11 debug flags
Gfx11 debug flags mask is currently set with an implicit assumption that
no other mqd update flags exist. This needs to be fixed with newly
introduced flag UPDATE_FLAG_IS_GWS by the previous patch.
drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards
In certain cooperative group dispatch scenarios the default SPI resource
allocation may cause reduced per-CU workgroup occupancy. Set
COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST=1 to mitigate soft hang
scenarions.
drm/amdgpu/display: Initialize gamma correction mode variable in dcn30_get_gamcor_current()
The dcn30_get_gamcor_current() function is responsible for determining
the current gamma correction mode used by the display controller.
However, the 'mode' variable, which stores the gamma correction mode,
was not initialized before its first usage, leading to an uninitialized
symbol error.
Thus initializes the 'mode' variable with a default value of LUT_BYPASS
before the conditional statements in the function, improves code clarity
and stability, ensuring correct behavior of the
dcn30_get_gamcor_current() function in determining the gamma correction
mode.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_dpp_cm.c:77 dpp30_get_gamcor_current() error: uninitialized symbol 'mode'.
Tom Chung [Tue, 30 Jan 2024 07:34:08 +0000 (15:34 +0800)]
drm/amd/display: Preserve original aspect ratio in create stream
[Why]
The original picture aspect ratio in mode struct may have chance be
overwritten with wrong aspect ratio data in create_stream_for_sink().
It will create a different VIC output and cause HDMI compliance test
failed.
[How]
Preserve the original picture aspect ratio data during create the
stream.
drm/amd/display: Fix possible NULL dereference on device remove/driver unload
As part of a cleanup amdgpu_dm_fini() function, which is typically
called when a device is being shut down or a driver is being unloaded
The below error message suggests that there is a potential null pointer
dereference issue with adev->dm.dc.
In the below, line of code where adev->dm.dc is used without a preceding
null check:
for (i = 0; i < adev->dm.dc->caps.max_links; i++) {
To fix this issue, add a null check for adev->dm.dc before this line.
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1959 amdgpu_dm_fini() error: we previously assumed 'adev->dm.dc' could be null (see line 1943)
Revert "drm/amd: flush any delayed gfxoff on suspend entry"
commit ab4750332dbe ("drm/amdgpu/sdma5.2: add begin/end_use ring
callbacks") caused GFXOFF control to be used more heavily and the
codepath that was removed from commit 0dee72639533 ("drm/amd: flush any
delayed gfxoff on suspend entry") now can be exercised at suspend again.
Users report that by using GNOME to suspend the lockscreen trigger will
cause SDMA traffic and the system can deadlock.
drm/amd: Stop evicting resources on APUs in suspend
commit 5095d5418193 ("drm/amd: Evict resources during PM ops prepare()
callback") intentionally moved the eviction of resources to earlier in
the suspend process, but this introduced a subtle change that it occurs
before adev->in_s0ix or adev->in_s3 are set. This meant that APUs
actually started to evict resources at suspend time as well.
Explicitly set s0ix or s3 in the prepare() stage, and unset them if the
prepare() stage failed.
v2: squash in warning fix from Stephen Rothwell
Reported-by: Jürg Billeter <[email protected]> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132#note_2271038 Fixes: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Acked-by: Alex Deucher <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
drm/amd/display: Fix possible buffer overflow in 'find_dcfclk_for_voltage()'
when 'find_dcfclk_for_voltage()' function is looping over
VG_NUM_SOC_VOLTAGE_LEVELS (which is 8), but the size of the DcfClocks
array is VG_NUM_DCFCLK_DPM_LEVELS (which is 7).
When the loop variable i reaches 7, the function tries to access
clock_table->DcfClocks[7]. However, since the size of the DcfClocks
array is 7, the valid indices are 0 to 6. Index 7 is beyond the size of
the array, leading to a buffer overflow.
Reported by smatch & thus fixing the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.c:550 find_dcfclk_for_voltage() error: buffer overflow 'clock_table->DcfClocks' 7 <= 7
If 'data->fbc_en[i]' is not equal to 1 for any i, max_chunks_fbc_mode
will not be initialized if it's used outside of this for loop.
Ensure that 'max_chunks_fbc_mode' is properly initialized before it's
used. Initialize it to a default value right after its declaration to
ensure that it gets a value assigned under all possible control flow
paths.
Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dce_calcs.c:914 calculate_bandwidth() error: uninitialized symbol 'max_chunks_fbc_mode'.
drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dce_calcs.c:917 calculate_bandwidth() error: uninitialized symbol 'max_chunks_fbc_mode'.
Above line is trying to assign the maximum value between
'wait_time_microsec' and 'DPIA_CLK_SYNC_DELAY' to wait_time_microsec.
However, 'wait_time_microsec' has not been assigned a value before this
line, initialize 'wait_time_microsec' at the point of declaration.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_training_dpia.c:697 dpia_training_eq_non_transparent() error: uninitialized symbol 'wait_time_microsec'.
Dan Carpenter [Fri, 9 Feb 2024 13:02:42 +0000 (16:02 +0300)]
drm/amd/display: Fix && vs || typos
These ANDs should be ORs or it will lead to a NULL dereference.
Fixes: fb5a3d037082 ("drm/amd/display: Add NULL test for 'timing generator' in 'dcn21_set_pipe()'") Fixes: 886571d217d7 ("drm/amd/display: Fix 'panel_cntl' could be null in 'dcn21_set_backlight_level()'") Reviewed-by: Anthony Koo <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
Kent Russell [Tue, 6 Feb 2024 17:45:44 +0000 (12:45 -0500)]
drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3
Its currently incorrectly multiplied by number of XCCs in the partition
Fixes: be457b2252b6 ("drm/amdkfd: Update cache info for GFX 9.4.3") Signed-off-by: Kent Russell <[email protected]> Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
Hamza Mahfooz [Thu, 8 Feb 2024 21:23:29 +0000 (16:23 -0500)]
drm/amdgpu: make damage clips support configurable
We have observed that there are quite a number of PSR-SU panels on the
market that are unable to keep up with what user space throws at them,
resulting in hangs and random black screens. So, make damage clips
support configurable and disable it by default for PSR-SU displays.
Linus Torvalds [Thu, 15 Feb 2024 19:14:33 +0000 (11:14 -0800)]
update workarounds for gcc "asm goto" issue
In commit 4356e9f841f7 ("work around gcc bugs with 'asm goto' with
outputs") I did the gcc workaround unconditionally, because the cause of
the bad code generation wasn't entirely clear.
In the meantime, Jakub Jelinek debugged the issue, and has come up with
a fix in gcc [2], which also got backported to the still maintained
branches of gcc-11, gcc-12 and gcc-13.
Note that while the fix technically wasn't in the original gcc-14
branch, Jakub says:
"while it is true that no GCC 14 snapshots until today (or whenever the
fix will be committed) have the fix, for GCC trunk it is up to the
distros to use the latest snapshot if they use it at all and would
allow better testing of the kernel code without the workaround, so
that if there are other issues they won't be discovered years later.
Most userland code doesn't actually use asm goto with outputs..."
so we will consider gcc-14 to be fixed - if somebody is using gcc
snapshots of the gcc-14 before the fix, they should upgrade.
Note that while the bug goes back to gcc-11, in practice other gcc
changes seem to have effectively hidden it since gcc-12.1 as per a
bisect by Jakub. So even a gcc-14 snapshot without the fix likely
doesn't show actual problems.
Also, make the default 'asm_goto_output()' macro mark the asm as
volatile by hand, because of an unrelated gcc issue [1] where it doesn't
match the documented behavior ("asm goto is always volatile").
Linus Torvalds [Thu, 15 Feb 2024 18:19:55 +0000 (10:19 -0800)]
Merge tag 'devicetree-fixes-for-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Improve devlink dependency parsing for DT graphs
- Fix devlink handling of io-channels dependencies
- Fix PCI addressing in marvell,prestera example
- A few schema fixes for property constraints
- Improve performance of DT unprobed devices kselftest
- Fix regression in DT_SCHEMA_FILES handling
- Fix compile error in unittest for !OF_DYNAMIC
* tag 'devicetree-fixes-for-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: ufs: samsung,exynos-ufs: Add size constraints on "samsung,sysreg"
of: property: Add in-ports/out-ports support to of_graph_get_port_parent()
of: property: Improve finding the supplier of a remote-endpoint property
of: property: Improve finding the consumer of a remote-endpoint property
net: marvell,prestera: Fix example PCI bus addressing
of: unittest: Fix compile in the non-dynamic case
of: property: fix typo in io-channels
dt-bindings: tpm: Drop type from "resets"
dt-bindings: display: nxp,tda998x: Fix 'audio-ports' constraints
dt-bindings: xilinx: replace Piyush Mehta maintainership
kselftest: dt: Stop relying on dirname to improve performance
dt-bindings: don't anchor DT_SCHEMA_FILES to bindings directory
Andy Shevchenko [Thu, 15 Feb 2024 15:25:06 +0000 (17:25 +0200)]
seq_buf: Fix kernel documentation
There are plenty of issues with the kernel documentation here:
- misspelled word "sequence"
- different style of returned value descriptions
- missed Return sections
- unaligned style of ASCII / NUL-terminated / etc
- wrong function references
Linus Torvalds [Thu, 15 Feb 2024 17:13:12 +0000 (09:13 -0800)]
Merge tag 'spi-fix-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A smallish collection of fixes for SPI, all driver specific, plus one
device ID addition for a new Intel part.
The ppc4xx isn't routinely covered by most of the automated testing so
there were some errors that were missed in some of the recent API
conversions, otherwise there's nothing super remarkable here"
* tag 'spi-fix-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi-mxs: Fix chipselect glitch
spi: intel-pci: Add support for Lunar Lake-M SPI serial flash
spi: omap2-mcspi: Revert FIFO support without DMA
spi: ppc4xx: Drop write-only variable
spi: ppc4xx: Fix fallout from rename in struct spi_bitbang
spi: ppc4xx: Fix fallout from include cleanup
spi: spi-ppc4xx: include missing platform_device.h
spi: imx: fix the burst length at DMA mode and CPU mode
Linus Torvalds [Thu, 15 Feb 2024 17:11:06 +0000 (09:11 -0800)]
Merge tag 'regmap-fix-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap test fixes from Mark Brown:
"Guenter runs a lot of KUnit tests so noticed that there were a couple
of the regmap tests, including the newly added noinc test, which could
show spurious failures due to the use of randomly generated test
values. These changes handle the randomly generated data properly"
* tag 'regmap-fix-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: kunit: Ensure that changed bytes are actually different
regmap: kunit: fix raw noinc write test wrapping
Linus Torvalds [Thu, 15 Feb 2024 17:08:19 +0000 (09:08 -0800)]
Merge tag 'hid-for-linus-2024021501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fix for 'MSC_SERIAL = 0' corner case handling in wacom driver (Jason
Gerecke)
- ACPI S3 suspend/resume fix for intel-ish-hid (Even Xu)
- race condition fix preventing Wacom driver from losing events shortly
after initialization (Jason Gerecke)
- fix preventing certain Logitech HID++ devices from spamming kernel
log (Oleksandr Natalenko)
* tag 'hid-for-linus-2024021501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: generic: Avoid reporting a serial of '0' to userspace
HID: Intel-ish-hid: Ishtp: Fix sensor reads after ACPI S3 suspend
HID: multitouch: Add required quirk for Synaptics 0xcddc device
HID: wacom: Do not register input devices until after hid_hw_start
HID: logitech-hidpp: Do not flood kernel log
Rob Clark [Tue, 13 Feb 2024 17:23:40 +0000 (09:23 -0800)]
drm/msm: Wire up tlb ops
The brute force iommu_flush_iotlb_all() was good enough for unmap, but
in some cases a map operation could require removing a table pte entry
to replace with a block entry. This also requires tlb invalidation.
Missing this was resulting an obscure iova fault on what should be a
valid buffer address.
Thanks to Robin Murphy for helping me understand the cause of the fault.
Cc: Robin Murphy <[email protected]> Cc: [email protected] Fixes: b145c6e65eb0 ("drm/msm: Add support to create a local pagetable") Signed-off-by: Rob Clark <[email protected]>
Patchwork: https://patchwork.freedesktop.org/patch/578117/
Thorsten Blum [Wed, 14 Feb 2024 22:05:56 +0000 (23:05 +0100)]
tracing/synthetic: Fix trace_string() return value
Fix trace_string() by assigning the string length to the return variable
which got lost in commit ddeea494a16f ("tracing/synthetic: Use union
instead of casts") and caused trace_string() to always return 0.
Jakub Kicinski [Thu, 15 Feb 2024 16:03:49 +0000 (08:03 -0800)]
Merge branch 'fix-module_description-for-net-p6'
Breno Leitao says:
====================
Fix MODULE_DESCRIPTION() for net (p6)
There are a few network modules left that misses MODULE_DESCRIPTION(),
causing a warnning when compiling with W=1. Example:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/....
This last patchset solves the problem for all the missing driver. It is
not expect to see any warning for the driver/net and net/ directory once
all these patches have landed.