The 'single-channel' property is an uint32, not an array, so 'items' is
an incorrect constraint. This didn't matter until dtschema recently
changed how properties are decoded. This results in this warning:
Documentation/devicetree/bindings/iio/adc/adi,ad7192.example.dtb: adc@0: \
channel@1:single-channel: 1 is not of type 'array'
of: remove internal arguments from of_property_for_each_u32()
The of_property_for_each_u32() macro needs five parameters, two of which
are primarily meant as internal variables for the macro itself (in the
for() clause). Yet these two parameters are used by a few drivers, and this
can be considered misuse or at least bad practice.
Now that the kernel uses C11 to build, these two parameters can be avoided
by declaring them internally, thus changing this pattern:
However two variables cannot be declared in the for clause even with C11,
so declare one struct that contain the two variables we actually need. As
the variables inside this struct are not meant to be used by users of this
macro, give the struct instance the noticeable name "_it" so it is visible
during code reviews, helping to avoid new code to use it directly.
Most usages are trivially converted as they do not use those two
parameters, as expected. The non-trivial cases are:
- drivers/clk/clk.c, of_clk_get_parent_name(): easily doable anyway
- drivers/clk/clk-si5351.c, si5351_dt_parse(): this is more complex as the
checks had to be replicated in a different way, making code more verbose
and somewhat uglier, but I refrained from a full rework to keep as much
of the original code untouched having no hardware to test my changes
All the changes have been build tested. The few for which I have the
hardware have been runtime-tested too.
Merge tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"New Support
- Samsung Exynos gs101 drd combo phy
- Qualcomm SC8180x USB uniphy, IPQ9574 QMP PCIe phy
- Airoha EN7581 PCIe phy
- Freescale i.MX8Q HSIO SerDes phy
- Starfive jh7110 dphy tx
Updates:
- Resume support for j721e-wiz driver
- Updates to Exynos usbdrd driver
- Support for optional power domains in g12a usb2-phy driver
- Debugfs support and updates to zynqmp driver"
* tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (56 commits)
phy: airoha: Add dtime and Rx AEQ IO registers
dt-bindings: phy: airoha: Add dtime and Rx AEQ IO registers
dt-bindings: phy: rockchip-emmc-phy: Convert to dtschema
dt-bindings: phy: qcom,qmp-usb: fix spelling error
phy: exynos5-usbdrd: support Exynos USBDRD 3.1 combo phy (HS & SS)
phy: exynos5-usbdrd: convert Vbus supplies to regulator_bulk
phy: exynos5-usbdrd: convert (phy) register access clock to clk_bulk
phy: exynos5-usbdrd: convert core clocks to clk_bulk
phy: exynos5-usbdrd: support isolating HS and SS ports independently
dt-bindings: phy: samsung,usb3-drd-phy: add gs101 compatible
phy: core: Fix documentation of of_phy_get
phy: starfive: Correct the dphy configure process
phy: zynqmp: Add debugfs support
phy: zynqmp: Take the phy mutex in xlate
phy: zynqmp: Only wait for PLL lock "primary" instances
phy: zynqmp: Store instance instead of type
phy: zynqmp: Enable reference clock correctly
phy: cadence-torrent: Check return value on register read
phy: Fix the cacography in phy-exynos5250-usb2.c
phy: phy-rockchip-samsung-hdptx: Select CONFIG_MFD_SYSCON
...
Merge tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
- Simplification across subsystem using cleanup.h
- Support for debugfs to read/write commands
- Few Intel and Qualcomm driver updates
* tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: debugfs: simplify with cleanup.h
soundwire: cadence: simplify with cleanup.h
soundwire: intel_ace2x: simplify with cleanup.h
soundwire: intel_ace2x: simplify return path in hw_params
soundwire: intel: simplify with cleanup.h
soundwire: intel: simplify return path in hw_params
soundwire: amd_init: simplify with cleanup.h
soundwire: amd: simplify with cleanup.h
soundwire: amd: simplify return path in hw_params
soundwire: intel_auxdevice: start the bus at default frequency
soundwire: intel_auxdevice: add cs42l43 codec to wake_capable_list
drivers:soundwire: qcom: cleanup port maask calculations
soundwire: bus: simplify by using local slave->prop
soundwire: generic_bandwidth_allocation: change port_bo parameter to pointer
soundwire: Intel: clarify Copyright information
soundwire: intel_ace2.x: add AC timing extensions for PantherLake
soundwire: bus: add stream refcount
soundwire: debugfs: add interface to read/write commands
Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.
First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
lets the kernel zero out pages anytime under memory pressure, which
enables allocating memory that never gets swapped to disk but also
doesn't count as being mlocked.
Then, the vDSO implementation of getrandom() is introduced in a
generic manner and hooked into random.c.
Next, this is implemented on x86. (Also, though it's not ready for
this pull, somebody has begun an arm64 implementation already)
Finally, two vDSO selftests are added.
There are also two housekeeping cleanup commits"
* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
MAINTAINERS: add random.h headers to RNG subsection
random: note that RNDGETPOOL was removed in 2.6.9-rc2
selftests/vDSO: add tests for vgetrandom
x86: vdso: Wire up getrandom() vDSO implementation
random: introduce generic vDSO getrandom() implementation
mm: add MAP_DROPPABLE for designating always lazily freeable mappings
Merge tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"VFS:
- The new 64bit mount ids start after the old mount id, i.e., at the
first non-32 bit value. However, we started counting one id too
late and thus lost 4294967296 as the first valid id. Fix that.
- Update a few comments on some vfs_*() creation helpers.
- Move copying of the xattr name out from the locks required to start
a filesystem write.
- Extend the filelock lock UAF fix to the compat code as well.
- Now that we added the ability to look up an inode under RCU it's
possible that lockless hash lookup can find and lock an inode after
it gets I_FREEING set. It then waits until inode teardown in
evict() is finished.
The flag however is still set after evict() has woken up all
waiters. If the inode lock is taken late enough on the waiting side
after hash removal and wakeup happened the waiting thread will
never be woken.
Before RCU based lookup this was synchronized via the
inode_hash_lock. But since unhashing requires the inode lock as
well we can check whether the inode is unhashed while holding inode
lock even without holding inode_hash_lock.
pidfd:
- The nsproxy structure contains nearly all of the namespaces
associated with a task. When a namespace type isn't supported
nsproxy might contain a NULL pointer or always point to the initial
namespace type. The logic isn't consistent. So when deriving
namespace fds we need to ensure that the namespace type is
supported.
First, so that we don't risk dereferncing NULL pointers. The
correct bigger fix would be to change all namespaces to always set
a valid namespace pointer in struct nsproxy independent of whether
or not it is compiled in. But that requires quite a few changes.
Second, so that we don't allow deriving namespace fds when the
namespace type doesn't exist and thus when they couldn't also be
derived via /proc/self/ns/.
- Add missing selftests for the new pidfd ioctls to derive namespace
fds. This simply extends the already existing testsuite.
netfs:
- Fix debug logging and fix kconfig variable name so it actually
works.
- Fix writeback that goes both to the server and cache. The streams
are only activated once a subreq is added. When a server write
happens the subreq doesn't need to have finished by the time the
cache write is started. If the server write has already finished by
the time the cache write is about to start the cache write will
operate on a folio that might already have been reused. Fix this by
preactivating the cache write.
- Limit cachefiles subreq size for cache writes to MAX_RW_COUNT"
* tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
inode: clarify what's locked
vfs: Fix potential circular locking through setxattr() and removexattr()
filelock: Fix fcntl/close race recovery compat path
fs: use all available ids
cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
netfs: Fix writeback that needs to go to both server and cache
pidfs: add selftests for new namespace ioctls
pidfs: handle kernels without namespaces cleanly
pidfs: when time ns disabled add check for ioctl
vfs: correct the comments of vfs_*() helpers
vfs: handle __wait_on_freeing_inode() and evict() race
netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG
netfs: Revert "netfs: Switch debug logging to pr_debug()"
Commit e3ec0fe944d2 ("hostfs: Convert hostfs_read_folio() to use a
folio") simplified hostfs_read_folio(), but in the process of converting
to using folios natively also mis-used the folio_zero_tail() function
due to the very confusing API of that function.
Very arguably it's folio_zero_tail() API itself that is buggy, since it
would make more sense (and the documentation kind of implies) that the
third argument would be the pointer to the beginning of the folio
buffer.
But no, the third argument to folio_zero_tail() is where we should start
zeroing the tail (even if we already also pass in the offset separately
as the second argument).
So fix the hostfs caller, and we can leave any folio_zero_tail() sanity
cleanup for later.
In __wait_on_freeing_inode() we warn in case the inode_hash_lock is held
but the inode is unhashed. We then release the inode_lock. So using
"locked" as parameter name is confusing. Use is_inode_hash_locked as
parameter name instead.
David Howells [Tue, 23 Jul 2024 08:59:54 +0000 (09:59 +0100)]
vfs: Fix potential circular locking through setxattr() and removexattr()
When using cachefiles, lockdep may emit something similar to the circular
locking dependency notice below. The problem appears to stem from the
following:
(1) Cachefiles manipulates xattrs on the files in its cache when called
from ->writepages().
(2) The setxattr() and removexattr() system call handlers get the name
(and value) from userspace after taking the sb_writers lock, putting
accesses of the vma->vm_lock and mm->mmap_lock inside of that.
(3) The afs filesystem uses a per-inode lock to prevent multiple
revalidation RPCs and in writeback vs truncate to prevent parallel
operations from deadlocking against the server on one side and local
page locks on the other.
Fix this by moving the getting of the name and value in {get,remove}xattr()
outside of the sb_writers lock. This also has the minor benefits that we
don't need to reget these in the event of a retry and we never try to take
the sb_writers lock in the event we can't pull the name and value into the
kernel.
Alternative approaches that might fix this include moving the dispatch of a
write to the cache off to a workqueue or trying to do without the
validation lock in afs. Note that this might also affect other filesystems
that use netfslib and/or cachefiles.
======================================================
WARNING: possible circular locking dependency detected
6.10.0-build2+ #956 Not tainted
------------------------------------------------------
fsstress/6050 is trying to acquire lock: ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0
but task is already holding lock: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
When I wrote commit 3cad1bc01041 ("filelock: Remove locks reliably when
fcntl/close race is detected"), I missed that there are two copies of the
code I was patching: The normal version, and the version for 64-bit offsets
on 32-bit kernels.
Thanks to Greg KH for stumbling over this while doing the stable
backport...
Apply exactly the same fix to the compat path for 32-bit kernels.
The counter is unconditionally incremented for each mount allocation.
If we set it to 1ULL << 32 we're losing 4294967296 as the first valid
non-32 bit mount id.
David Howells [Fri, 19 Jul 2024 14:19:02 +0000 (15:19 +0100)]
cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
Set the maximum size of a subrequest that writes to cachefiles to be
MAX_RW_COUNT so that we don't overrun the maximum write we can make to the
backing filesystem.
David Howells [Fri, 19 Jul 2024 14:20:18 +0000 (15:20 +0100)]
netfs: Fix writeback that needs to go to both server and cache
When netfslib is performing writeback (ie. ->writepages), it maintains two
parallel streams of writes, one to the server and one to the cache, but it
doesn't mark either stream of writes as active until it gets some data that
needs to be written to that stream.
This is done because some folios will only be written to the cache
(e.g. copying to the cache on read is done by marking the folios and
letting writeback do the actual work) and sometimes we'll only be writing
to the server (e.g. if there's no cache).
Now, since we don't actually dispatch uploads and cache writes in parallel,
but rather flip between the streams, depending on which has the lowest
so-far-issued offset, and don't wait for the subreqs to finish before
flipping, we can end up in a situation where, say, we issue a write to the
server and this completes before we start the write to the cache.
But because we only activate a stream when we first add a subreq to it, the
result collection code may run before we manage to activate the stream -
resulting in the folio being cleaned and having the writeback-in-progress
mark removed. At this point, the folio no longer belongs to us.
This is only really a problem for folios that need to be written to both
streams - and in that case, the upload to the server is started first,
followed by the write to the cache - and the cache write may see a bad
folio.
Fix this by activating the cache stream up front if there's a cache
available. If there's a cache, then all data is going to be written to it.
The nsproxy structure contains nearly all of the namespaces associated
with a task. When a given namespace type is not supported by this kernel
the rules whether the corresponding pointer in struct nsproxy is NULL or
always init_<ns_type>_ns differ per namespace. Ideally, that wouldn't be
the case and for all namespace types we'd always set it to
init_<ns_type>_ns when the corresponding namespace type isn't supported.
Make sure we handle all namespaces where the pointer in struct nsproxy
can be NULL when the namespace type isn't supported.
syzbot call pidfd_ioctl() with cmd "PIDFD_GET_TIME_NAMESPACE" and disabled
CONFIG_TIME_NS, since time_ns is NULL, it will make NULL ponter deref in
open_namespace.
correct the comments of vfs_*() helpers in fs/namei.c, including:
1. vfs_create()
2. vfs_mknod()
3. vfs_mkdir()
4. vfs_rmdir()
5. vfs_symlink()
All of them come from the same commit: 6521f8917082 "namei: prepare for idmapped mounts"
The @dentry is actually the dentry of child directory rather than
base directory(parent directory), and thus the @dir has to be
modified due to the change of @dentry.
vfs: handle __wait_on_freeing_inode() and evict() race
Lockless hash lookup can find and lock the inode after it gets the
I_FREEING flag set, at which point it blocks waiting for teardown in
evict() to finish.
However, the flag is still set even after evict() wakes up all waiters.
This results in a race where if the inode lock is taken late enough, it
can happen after both hash removal and wakeups, meaning there is nobody
to wake the racing thread up.
This worked prior to RCU-based lookup because the entire ordeal was
synchronized with the inode hash lock.
Since unhashing requires the inode lock, we can safely check whether it
happened after acquiring it.
Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"Two fixes for building perf and other tools:
- Fix breakage in tracing tools due to pkg-config for
libtrace{event,fs}
- Fix build of perf when libunwind is used"
* tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf dso: Fix build when libunwind is enabled
tools/latency: Use pkg-config in lib_setup of Makefile.config
tools/rtla: Use pkg-config in lib_setup of Makefile.config
tools/verification: Use pkg-config in lib_setup of Makefile.config
tools: Make pkg-config dependency checks usable by other tools
perf build: Warn if libtracefs is not found
Merge tag 'execve-v6.11-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve fix from Kees Cook:
"This moves the exec and binfmt_elf tests out of your way and into the
tests/ subdirectory, following the newly ratified KUnit naming
conventions. :)"
* tag 'execve-v6.11-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
execve: Move KUnit tests to tests/ subdirectory
Merge tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"A pretty small update including mostly minor bug fixes in zoned
storage along with the large section support.
Enhancements:
- add support for FS_IOC_GETFSSYSFSPATH
- enable atgc dynamically if conditions are met
- use new ioprio Macro to get ckpt thread ioprio level
- remove unreachable lazytime mount option parsing
Bug fixes:
- fix null reference error when checking end of zone
- fix start segno of large section
- fix to cover read extent cache access with lock
- don't dirty inode for readonly filesystem
- allocate a new section if curseg is not the first seg in its zone
- only fragment segment in the same section
- truncate preallocated blocks in f2fs_file_open()
- fix to avoid use SSR allocate when do defragment
- fix to force buffered IO on inline_data inode
And some minor code clean-ups and sanity checks"
* tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (26 commits)
f2fs: clean up addrs_per_{inode,block}()
f2fs: clean up F2FS_I()
f2fs: use meta inode for GC of COW file
f2fs: use meta inode for GC of atomic file
f2fs: only fragment segment in the same section
f2fs: fix to update user block counts in block_operations()
f2fs: remove unreachable lazytime mount option parsing
f2fs: fix null reference error when checking end of zone
f2fs: fix start segno of large section
f2fs: remove redundant sanity check in sanity_check_inode()
f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_wrtie
f2fs: clean up set REQ_RAHEAD given rac
f2fs: enable atgc dynamically if conditions are met
f2fs: fix to truncate preallocated blocks in f2fs_file_open()
f2fs: fix to cover read extent cache access with lock
f2fs: fix return value of f2fs_convert_inline_inode()
f2fs: use new ioprio Macro to get ckpt thread ioprio level
f2fs: fix to don't dirty inode for readonly filesystem
f2fs: fix to avoid use SSR allocate when do defragment
...
Merge tag 'jfs-6.11' of github.com:kleikamp/linux-shaggy
Pull jfs updates from David Kleikamp:
"Folio conversion from Matthew Wilcox and a few various fixes"
* tag 'jfs-6.11' of github.com:kleikamp/linux-shaggy:
jfs: don't walk off the end of ealist
jfs: Fix shift-out-of-bounds in dbDiscardAG
jfs: Fix array-index-out-of-bounds in diFree
jfs: fix null ptr deref in dtInsertEntry
jfs: Remove use of folio error flag
fs: Remove i_blocks_per_page
jfs: Change metapage->page to metapage->folio
jfs: Convert force_metapage to use a folio
jfs: Convert inc_io to take a folio
jfs: Convert page_to_mp to folio_to_mp
jfs; Convert __invalidate_metapages to use a folio
jfs: Convert dec_io to take a folio
jfs: Convert drop_metapage and remove_metapage to take a folio
jfs; Convert release_metapage to use a folio
jfs: Convert insert_metapage() to take a folio
jfs: Convert __get_metapage to use a folio
jfs: Convert metapage_writepage to metapage_write_folio
jfs: Convert metapage_read_folio to use folio APIs
Merge tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove tristate choice support from Kconfig
- Stop using the PROVIDE() directive in the linker script
- Reduce the number of links for the combination of CONFIG_KALLSYMS and
CONFIG_DEBUG_INFO_BTF
- Enable the warning for symbol reference to .exit.* sections by
default
- Fix warnings in RPM package builds
- Improve scripts/make_fit.py to generate a FIT image with separate
base DTB and overlays
- Improve choice value calculation in Kconfig
- Fix conditional prompt behavior in choice in Kconfig
- Remove support for the uncommon EMAIL environment variable in Debian
package builds
- Remove support for the uncommon "name <email>" form for the DEBEMAIL
environment variable
- Raise the minimum supported GNU Make version to 4.0
- Remove stale code for the absolute kallsyms
- Move header files commonly used for host programs to scripts/include/
- Introduce the pacman-pkg target to generate a pacman package used in
Arch Linux
- Clean up Kconfig
* tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits)
kbuild: doc: gcc to CC change
kallsyms: change sym_entry::percpu_absolute to bool type
kallsyms: unify seq and start_pos fields of struct sym_entry
kallsyms: add more original symbol type/name in comment lines
kallsyms: use \t instead of a tab in printf()
kallsyms: avoid repeated calculation of array size for markers
kbuild: add script and target to generate pacman package
modpost: use generic macros for hash table implementation
kbuild: move some helper headers from scripts/kconfig/ to scripts/include/
Makefile: add comment to discourage tools/* addition for kernel builds
kbuild: clean up scripts/remove-stale-files
kconfig: recursive checks drop file/lineno
kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec
kallsyms: get rid of code for absolute kallsyms
kbuild: Create INSTALL_PATH directory if it does not exist
kbuild: Abort make on install failures
kconfig: remove 'e1' and 'e2' macros from expression deduplication
kconfig: remove SYMBOL_CHOICEVAL flag
kconfig: add const qualifiers to several function arguments
kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups()
...
Merge tag 'rproc-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
- The maximum amount of DDR memory used by the Mediatek MT8188/MT8195
SCP is increased to handle new use cases. Handling of optional L1TCM
memory is made actually optional.
- An optimization is introduced to only clear the unused portion of IPI
shared buffers, rather than the entire buffer before writing the
message.
- Detection for IPC-only mode in the TI K3 DSP remoteproc driver is
corrected. The loglevel of a debug print in the same is lowered from
error.
- Support for attaching to an running remote processor is added to the
Xilinx R5F.
- An in-kernel implementation of the Qualcomm "protected domain mapper"
(aka service registry) service is introduced, to remove the
dependency on a userspace implementation to detect when the battery
monitor and USB Type-C port manager becomes available. This is then
integrated with the Qualcomm remoteproc driver.
- The Qualcomm PAS remoteproc driver gains support for attempting to
bust hwspinlocks held by the remote processor when it
crashed/stopped.
- The TI OMAP remoteproc driver is transitioned to use devres helpers
for various forms of allocations.
- Parsing of memory-regions in the i.MX remoteproc driver is improved
to avoid a NULL pointer dereference if the phandle reference is
empty. of_node reference counting is corrected in the same.
* tag 'rproc-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
remoteproc: mediatek: Increase MT8188/MT8195 SCP core0 DRAM size
remoteproc: k3-dsp: Fix log levels where appropriate
remoteproc: xlnx: Add attach detach support
remoteproc: qcom: select AUXILIARY_BUS
remoteproc: k3-r5: Fix IPC-only mode detection
remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
remoteproc: qcom: enable in-kernel PD mapper
dt-bindings: remoteproc: imx_rproc: Add minItems for power-domain
remoteproc: imx_rproc: Fix refcount mistake in imx_rproc_addr_init
remoteproc: omap: Use devm_rproc_add() helper
remoteproc: omap: Use devm action to release reserved memory
remoteproc: omap: Use devm_rproc_alloc() helper
remoteproc: imx_rproc: Skip over memory region when node value is NULL
dt-bindings: remoteproc: k3-dsp: Correct optional sram properties for AM62A SoCs
remoteproc: qcom_q6v5_pas: Add hwspinlock bust on stop
soc: qcom: smem: Add qcom_smem_bust_hwspin_lock_by_host()
remoteproc: mediatek: Zero out only remaining bytes of IPI buffer
Merge tag 'hwlock-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull hwspinlock updates from Bjorn Andersson:
"This introduces a mechanism in the hardware spinlock framework, and
the Qualcomm TCSR mutex driver, for allowing clients to bust locks
held by a remote processor in the event that this enters a faulty
state while holding the shared lock"
* tag 'hwlock-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
hwspinlock: qcom: implement bust operation
hwspinlock: Introduce hwspin_lock_bust()
Merge tag 'sh-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
"This is rather small this time and contains just three changes.
The first change by Oscar Salvador drops support for memory hotplug
and hotremove for sh as the kernel stopped supporting it on 32-bit
platforms since 7ec58a2b941e ("mm/memory_hotplug: restrict
CONFIG_MEMORY_HOTPLUG to 64 bit").
That then results in a follow-up change to update all affected board
config files.
The third change comes from Jeff Johnson which adds the missing
MODULE_DESCRIPTION() macro to the push-switch driver"
* tag 'sh-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: push-switch: Add missing MODULE_DESCRIPTION() macro
sh: config: Drop CONFIG_MEMORY_{HOTPLUG,HOTREMOVE}
sh: Drop support for memory hotplug and memory hotremove
Merge tag 'modules-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module update from Luis Chamberlain:
"This is a super boring development cycle this time around for modules,
there is only one patch in this pull request.
The patch deals with a corner case set of dependencies which is not
resolved today to ensure users get the module they need on initramfs.
Currently only one module is known to exist which needs this, however
this can grow to capture other corner cases likely escaped and not
reported before. The kernel change is just a section update, the real
work is done and merged already on upstream kmod.
This has been on linux-next for 3 weeks now"
* tag 'modules-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module: create weak dependecies
Merge tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching update from Petr Mladek:
- show patch->replace flag in sysfs
- add or improve few selftests
* tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
livepatch: Replace snprintf() with sysfs_emit()
selftests/livepatch: Add selftests for "replace" sysfs attribute
livepatch: Add "replace" sysfs attribute
selftests: livepatch: Test atomic replace against multiple modules
selftests/livepatch: define max test-syscall processes
Merge tag 'i2c-for-6.11-rc1-second-batch' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull more i2c updates from Wolfram Sang:
"The I2C core has two header documentation updates as the dependecies
are in now.
The I2C host drivers add some patches which nearly fell through the
cracks:
- Added descriptions in the DTS for the Qualcomm SM8650 and SM8550
Camera Control Interface (CCI).
- Added support for the "settle-time-us" property, which allows the
gpio-mux device to switch from one bus to another with a
configurable delay. The time can be set in the DTS. The latest
change also includes file sorting.
- Fixed slot numbering in the SMBus framework to prevent failures
when more than 8 slots are occupied. It now enforces a a maximum of
8 slots to be used. This ensures that the Intel PIIX4 device can
register the SPDs correctly without failure, even if other slots
are populated but not used"
* tag 'i2c-for-6.11-rc1-second-batch' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: header: improve kdoc for i2c_algorithm
i2c: header: remove unneeded stuff regarding i2c_algorithm
i2c: piix4: Register SPDs
i2c: smbus: remove i801 assumptions from SPD probing
i2c: mux: gpio: Add support for the 'settle-time-us' property
i2c: mux: gpio: Re-order #include to match alphabetic order
dt-bindings: i2c: mux-gpio: Add 'settle-time-us' property
dt-bindings: i2c: qcom-cci: Document sm8650 compatible
dt-bindings: i2c: qcom-cci: Document sm8550 compatible
Merge tag 'pcmcia-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux
Pull PCMCIA updates from Dominik Brodowski:
"A number of tiny cleanups of the PCMCIA subsystem by Jeff Johnson,
Jules Irenge, and Krzysztof Kozlowski"
* tag 'pcmcia-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
pcmcia: add missing MODULE_DESCRIPTION() macros
pcmcia: Use resource_size function on resource object
pcmcia: bcm63xx: drop driver owner assignment
Merge tag 'for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
"Power-supply core:
- new charging_orange_full_green RGB LED trigger
- simplify and cleanup power-supply LED trigger code
- expose power information via hwmon compatibility layer
New hardware support:
- enable battery support for Qualcomm Snapdragon X Elite
- new battery driver for Maxim MAX17201/MAX17205
- new battery driver for Lenovo Yoga C630 laptop (custom EC)
Cleanups:
- cleanup 'struct i2c_device_id' initializations
- misc small battery driver cleanups and fixes"
* tag 'for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power: supply: sysfs: use power_supply_property_is_writeable()
power: supply: qcom_battmgr: Enable battery support on x1e80100
power: supply: add support for MAX1720x standalone fuel gauge
dt-bindings: power: supply: add support for MAX17201/MAX17205 fuel gauge
power: reset: piix4: add missing MODULE_DESCRIPTION() macro
power: supply: samsung-sdi-battery: Constify struct power_supply_maintenance_charge_table
power: supply: samsung-sdi-battery: Constify struct power_supply_vbat_ri_table
power: supply: lenovo_yoga_c630_battery: add Lenovo C630 driver
power: supply: ingenic: Fix some error handling paths in ingenic_battery_get_property()
power: supply: ab8500: Clean some error messages
power: supply: ab8500: Use iio_read_channel_processed_scale()
power: supply: ab8500: Fix error handling when calling iio_read_channel_processed()
power: supply: hwmon: Add support for power sensors
power: supply: ab8500: remove unused struct 'inst_curr_result_list'
power: supply: bd99954: remove unused struct 'battery_data'
power: supply: leds: Add activate() callback to triggers
power: supply: leds: Share trig pointer for online and charging_full
power: supply: leds: Add power_supply_[un]register_led_trigger()
power: supply: Drop explicit initialization of struct i2c_device_id::driver_data to 0
Move the exec KUnit tests into a separate directory to avoid polluting
the local directory namespace. Additionally update MAINTAINERS for the
new files.
Merge tag 'irq-msi-2024-07-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI interrupt updates from Thomas Gleixner:
"Switch ARM/ARM64 over to the modern per device MSI domains.
This simplifies the handling of platform MSI and wire to MSI
controllers and removes about 500 lines of legacy code.
Aside of that it paves the way for ARM/ARM64 to utilize the dynamic
allocation of PCI/MSI interrupts and to support the upcoming non
standard IMS (Interrupt Message Store) mechanism on PCIe devices"
* tag 'irq-msi-2024-07-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
irqchip/gic-v3-its: Correctly fish out the DID for platform MSI
irqchip/gic-v3-its: Correctly honor the RID remapping
genirq/msi: Move msi_device_data to core
genirq/msi: Remove platform MSI leftovers
irqchip/irq-mvebu-icu: Remove platform MSI leftovers
irqchip/irq-mvebu-sei: Switch to MSI parent
irqchip/mvebu-odmi: Switch to parent MSI
irqchip/mvebu-gicp: Switch to MSI parent
irqchip/irq-mvebu-icu: Prepare for real per device MSI
irqchip/imx-mu-msi: Switch to MSI parent
irqchip/gic-v2m: Switch to device MSI
irqchip/gic_v3_mbi: Switch over to parent domain
genirq/msi: Remove platform_msi_create_device_domain()
irqchip/mbigen: Remove platform_msi_create_device_domain() fallback
irqchip/gic-v3-its: Switch platform MSI to MSI parent
irqchip/irq-msi-lib: Prepare for DOMAIN_BUS_WIRED_TO_MSI
irqchip/mbigen: Prepare for real per device MSI
irqchip/irq-msi-lib: Prepare for DEVICE MSI to replace platform MSI
irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]
irqchip/irq-msi-lib: Prepare for PCI MSI/MSIX
...
Merge tag 'irq-core-2024-07-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt subsystem updates from Thomas Gleixner:
"Core:
- Provide a new mechanism to create interrupt domains. The existing
interfaces have already too many parameters and it's a pain to
expand any of this for new required functionality.
The new function takes a pointer to a data structure as argument.
The data structure combines all existing parameters and allows for
easy extension.
The first extension for this is to handle the instantiation of
generic interrupt chips at the core level and to allow drivers to
provide extra init/exit callbacks.
This is necessary to do the full interrupt chip initialization
before the new domain is published, so that concurrent usage sites
won't see a half initialized interrupt domain. Similar problems
exist on teardown.
This has turned out to be a real problem due to the deferred and
parallel probing which was added in recent years.
Handling this at the core level allows to remove quite some accrued
boilerplate code in existing drivers and avoids horrible
workarounds at the driver level.
- The usual small improvements all over the place
Drivers:
- Add support for LAN966x OIC and RZ/Five SoC
- Split the STM ExtI driver into a microcontroller and a SMP version
to allow building the latter as a module for multi-platform
kernels
- Enable MSI support for Armada 370XP on platforms which do not
support IPIs
- The usual small fixes and enhancements all over the place"
* tag 'irq-core-2024-07-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits)
irqdomain: Fix the kernel-doc and plug it into Documentation
genirq: Set IRQF_COND_ONESHOT in request_irq()
irqchip/imx-irqsteer: Handle runtime power management correctly
irqchip/gic-v3: Pass #redistributor-regions to gic_of_setup_kvm_info()
irqchip/bcm2835: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
irqchip/gic-v4: Make sure a VPE is locked when VMAPP is issued
irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock
irqchip/gic-v4: Always configure affinity on VPE activation
Revert "irqchip/dw-apb-ictl: Support building as module"
Revert "Loongarch: Support loongarch avec"
arm64: Kconfig: Allow build irq-stm32mp-exti driver as module
ARM: stm32: Allow build irq-stm32mp-exti driver as module
irqchip/stm32mp-exti: Allow building as module
irqchip/stm32mp-exti: Rename internal symbols
irqchip/stm32-exti: Split MCU and MPU code
arm64: Kconfig: Select STM32MP_EXTI on STM32 platforms
ARM: stm32: Use different EXTI driver on ARMv7m and ARMv7a
irqchip/stm32-exti: Add CONFIG_STM32MP_EXTI
irqchip/dw-apb-ictl: Support building as module
irqchip/riscv-aplic: Simplify the initialization code
...
Merge tag 'loongarch-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Define __ARCH_WANT_NEW_STAT in unistd.h
- Always enumerate MADT and setup logical-physical CPU mapping
- Add irq_work support via self IPIs
- Add RANDOMIZE_KSTACK_OFFSET support
- Add ARCH_HAS_PTE_DEVMAP support
- Add ARCH_HAS_DEBUG_VM_PGTABLE support
- Add writecombine support for DMW-based ioremap()
- Add architectural preparation for CPUFreq
- Add ACPI standard hardware register based S3 support
- Add support for relocating the kernel with RELR relocation
- Some bug fixes and other small changes
* tag 'loongarch-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Make the users of larch_insn_gen_break() constant
LoongArch: Check TIF_LOAD_WATCH to enable user space watchpoint
LoongArch: Use rustc option -Zdirect-access-external-data
LoongArch: Add support for relocating the kernel with RELR relocation
LoongArch: Remove a redundant checking in relocator
LoongArch: Use correct API to map cmdline in relocate_kernel()
LoongArch: Automatically disable KASLR for hibernation
LoongArch: Add ACPI standard hardware register based S3 support
LoongArch: Add architectural preparation for CPUFreq
LoongArch: Add writecombine support for DMW-based ioremap()
LoongArch: Add ARCH_HAS_DEBUG_VM_PGTABLE support
LoongArch: Add ARCH_HAS_PTE_DEVMAP support
LoongArch: Add RANDOMIZE_KSTACK_OFFSET support
LoongArch: Add irq_work support via self IPIs
LoongArch: Always enumerate MADT and setup logical-physical CPU mapping
LoongArch: Define __ARCH_WANT_NEW_STAT in unistd.h
Merge tag 'thermal-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix a flood of kernel messages coming from the thermal core on systems
where iwlwifi is loaded, but the network interfaces controlled by it
are down (Rafael Wysocki)"
* tag 'thermal-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: core: Allow thermal zones to tell the core to ignore them
- Series cleaning up and hardening the blk-mq debugfs flag handling
(John, Christoph)
- blk-cgroup cleanup (Xiu)
- Error polled IO attempts if backend doesn't support it (hexue)
- Fix for an sbitmap hang (Yang)
* tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux: (23 commits)
blk-cgroup: move congestion_count to struct blkcg
sbitmap: fix io hung due to race on sbitmap_word::cleared
block: avoid polling configuration errors
block: Catch possible entries missing from rqf_name[]
block: Simplify definition of RQF_NAME()
block: Use enum to define RQF_x bit indexes
block: Catch possible entries missing from cmd_flag_name[]
block: Catch possible entries missing from alloc_policy_name[]
block: Catch possible entries missing from hctx_flag_name[]
block: Catch possible entries missing from hctx_state_name[]
block: Catch possible entries missing from blk_queue_flag_name[]
block: Make QUEUE_FLAG_x as an enum
block: Relocate BLK_MQ_MAX_DEPTH
block: Relocate BLK_MQ_CPU_WORK_BATCH
block: remove QUEUE_FLAG_STOPPED
block: Add missing entry to hctx_flag_name[]
block: Add zone write plugging entry to rqf_name[]
block: Add missing entries from cmd_flag_name[]
s390/dasd: fix error checks in dasd_copy_pair_store()
s390/dasd: add missing MODULE_DESCRIPTION() macros
...
Merge tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux
Pull block integrity mapping updates from Jens Axboe:
"A set of cleanups and fixes for the block integrity support.
Sent separately from the main block changes from last week, as they
depended on later fixes in the 6.10-rc cycle"
* tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux:
block: don't free the integrity payload in bio_integrity_unmap_free_user
block: don't free submitter owned integrity payload on I/O completion
block: call bio_integrity_unmap_free_user from blk_rq_unmap_user
block: don't call bio_uninit from bio_endio
block: also return bio_integrity_payload * from stubs
block: split integrity support out of bio.h
Merge tag 'bcachefs-2024-07-22' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- another fix for fsck getting stuck, from marcin
- small syzbot fix
- another undefined shift fix
* tag 'bcachefs-2024-07-22' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix printbuf usage while atomic
bcachefs: More informative error message in reattach_inode()
bcachefs: kill btree_trans_too_many_iters() in bch2_bucket_alloc_freelist()
bcachefs: mean_and_variance: Avoid too-large shift amounts
Merge tag 'ntfs3_for_6.11' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
"New code:
- simple fileattr support
Fixes:
- transform resident to nonresident for compressed files
- the format of the "nocase" mount option
- getting file type
- many other internal bugs
Refactoring:
- remove unused functions and macros
- partial transition from page to folio (suggested by Matthew Wilcox)
- legacy ntfs support"
* tag 'ntfs3_for_6.11' of https://github.com/Paragon-Software-Group/linux-ntfs3: (42 commits)
fs/ntfs3: Fix formatting, change comments, renaming
fs/ntfs3: Update log->page_{mask,bits} if log->page_size changed
fs/ntfs3: Implement simple fileattr
fs/ntfs3: Redesign legacy ntfs support
fs/ntfs3: Use function file_inode to get inode from file
fs/ntfs3: Minor ntfs_list_ea refactoring
fs/ntfs3: Check more cases when directory is corrupted
fs/ntfs3: Do copy_to_user out of run_lock
fs/ntfs3: Keep runs for $MFT::$ATTR_DATA and $MFT::$ATTR_BITMAP
fs/ntfs3: Missed error return
fs/ntfs3: Fix the format of the "nocase" mount option
fs/ntfs3: Fix field-spanning write in INDEX_HDR
ntfs3: Convert attr_wof_frame_info() to use a folio
ntfs3: Convert ni_readpage_cmpr() to take a folio
ntfs3: Convert ntfs_get_frame_pages() to use a folio
ntfs3: Remove calls to set/clear the error flag
ntfs3: Convert attr_make_nonresident to use a folio
ntfs3: Convert attr_data_write_resident to use a folio
ntfs3: Convert ntfs_write_end() to work on a folio
ntfs3: Convert attr_data_read_resident() to take a folio
...
Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- In the series "treewide: Refactor heap related implementation",
Kuan-Wei Chiu has significantly reworked the min_heap library code
and has taught bcachefs to use the new more generic implementation.
- Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
reworks the cpumask and nodemask headers to make things generally
more rational.
- Kuan-Wei Chiu has sent along some maintenance work against our
sorting library code in the series "lib/sort: Optimizations and
cleanups".
- More library maintainance work from Christophe Jaillet in the series
"Remove usage of the deprecated ida_simple_xx() API".
- Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
series "nilfs2: eliminate the call to inode_attach_wb()".
- Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix
GDB command error".
- Plus the usual shower of singleton patches all over the place. Please
see the relevant changelogs for details.
* tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits)
ia64: scrub ia64 from poison.h
watchdog/perf: properly initialize the turbo mode timestamp and rearm counter
tsacct: replace strncpy() with strscpy()
lib/bch.c: use swap() to improve code
test_bpf: convert comma to semicolon
init/modpost: conditionally check section mismatch to __meminit*
init: remove unused __MEMINIT* macros
nilfs2: Constify struct kobj_type
nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
math: rational: add missing MODULE_DESCRIPTION() macro
lib/zlib: add missing MODULE_DESCRIPTION() macro
fs: ufs: add MODULE_DESCRIPTION()
lib/rbtree.c: fix the example typo
ocfs2: add bounds checking to ocfs2_check_dir_entry()
fs: add kernel-doc comments to ocfs2_prepare_orphan_dir()
coredump: simplify zap_process()
selftests/fpu: add missing MODULE_DESCRIPTION() macro
compiler.h: simplify data_race() macro
build-id: require program headers to be right after ELF header
resource: add missing MODULE_DESCRIPTION()
...
Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- In the series "mm: Avoid possible overflows in dirty throttling" Jan
Kara addresses a couple of issues in the writeback throttling code.
These fixes are also targetted at -stable kernels.
- Ryusuke Konishi's series "nilfs2: fix potential issues related to
reserved inodes" does that. This should actually be in the
mm-nonmm-stable tree, along with the many other nilfs2 patches. My
bad.
- More folio conversions from Kefeng Wang in the series "mm: convert to
folio_alloc_mpol()"
- Kemeng Shi has sent some cleanups to the writeback code in the series
"Add helper functions to remove repeated code and improve readability
of cgroup writeback"
- Kairui Song has made the swap code a little smaller and a little
faster in the series "mm/swap: clean up and optimize swap cache
index".
- In the series "mm/memory: cleanly support zeropage in
vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
Hildenbrand has reworked the rather sketchy handling of the use of
the zeropage in MAP_SHARED mappings. I don't see any runtime effects
here - more a cleanup/understandability/maintainablity thing.
- Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling
of higher addresses, for aarch64. The (poorly named) series is
"Restructure va_high_addr_switch".
- The core TLB handling code gets some cleanups and possible slight
optimizations in Bang Li's series "Add update_mmu_tlb_range() to
simplify code".
- Jane Chu has improved the handling of our
fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in
the series "Enhance soft hwpoison handling and injection".
- Jeff Johnson has sent a billion patches everywhere to add
MODULE_DESCRIPTION() to everything. Some landed in this pull.
- In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang
has simplified migration's use of hardware-offload memory copying.
- Yosry Ahmed performs more folio API conversions in his series "mm:
zswap: trivial folio conversions".
- In the series "large folios swap-in: handle refault cases first",
Chuanhua Han inches us forward in the handling of large pages in the
swap code. This is a cleanup and optimization, working toward the end
objective of full support of large folio swapin/out.
- In the series "mm,swap: cleanup VMA based swap readahead window
calculation", Huang Ying has contributed some cleanups and a possible
fixlet to his VMA based swap readahead code.
- In the series "add mTHP support for anonymous shmem" Baolin Wang has
taught anonymous shmem mappings to use multisize THP. By default this
is a no-op - users must opt in vis sysfs controls. Dramatic
improvements in pagefault latency are realized.
- David Hildenbrand has some cleanups to our remaining use of
page_mapcount() in the series "fs/proc: move page_mapcount() to
fs/proc/internal.h".
- David also has some highmem accounting cleanups in the series
"mm/highmem: don't track highmem pages manually".
- Build-time fixes and cleanups from John Hubbard in the series
"cleanups, fixes, and progress towards avoiding "make headers"".
- Cleanups and consolidation of the core pagemap handling from Barry
Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
and utilize them".
- Lance Yang's series "Reclaim lazyfree THP without splitting" has
reduced the latency of the reclaim of pmd-mapped THPs under fairly
common circumstances. A 10x speedup is seen in a microbenchmark.
It does this by punting to aother CPU but I guess that's a win unless
all CPUs are pegged.
- hugetlb_cgroup cleanups from Xiu Jianfeng in the series
"mm/hugetlb_cgroup: rework on cftypes".
- Miaohe Lin's series "Some cleanups for memory-failure" does just that
thing.
- Someone other than SeongJae has developed a DAMON feature in Honggyu
Kim's series "DAMON based tiered memory management for CXL memory".
This adds DAMON features which may be used to help determine the
efficiency of our placement of CXL/PCIe attached DRAM.
- DAMON user API centralization and simplificatio work in SeongJae
Park's series "mm/damon: introduce DAMON parameters online commit
function".
- In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
David Hildenbrand does some maintenance work on zsmalloc - partially
modernizing its use of pageframe fields.
- Kefeng Wang provides more folio conversions in the series "mm: remove
page_maybe_dma_pinned() and page_mkclean()".
- More cleanup from David Hildenbrand, this time in the series
"mm/memory_hotplug: use PageOffline() instead of PageReserved() for
!ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
pages" and permits the removal of some virtio-mem hacks.
- Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
__folio_add_anon_rmap()" is a cleanup to the anon folio handling in
preparation for mTHP (multisize THP) swapin.
- Kefeng Wang's series "mm: improve clear and copy user folio"
implements more folio conversions, this time in the area of large
folio userspace copying.
- The series "Docs/mm/damon/maintaier-profile: document a mailing tool
and community meetup series" tells people how to get better involved
with other DAMON developers. From SeongJae Park.
- A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
that.
- David Hildenbrand sends along more cleanups, this time against the
migration code. The series is "mm/migrate: move NUMA hinting fault
folio isolation + checks under PTL".
- Jan Kara has found quite a lot of strangenesses and minor errors in
the readahead code. He addresses this in the series "mm: Fix various
readahead quirks".
- SeongJae Park's series "selftests/damon: test DAMOS tried regions and
{min,max}_nr_regions" adds features and addresses errors in DAMON's
self testing code.
- Gavin Shan has found a userspace-triggerable WARN in the pagecache
code. The series "mm/filemap: Limit page cache size to that supported
by xarray" addresses this. The series is marked cc:stable.
- Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
and cleanup" cleans up and slightly optimizes KSM.
- Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
code motion. The series (which also makes the memcg-v1 code
Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put
under config option" and "mm: memcg: put cgroup v1-specific memcg
data under CONFIG_MEMCG_V1"
- Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
adds an additional feature to this cgroup-v2 control file.
- The series "Userspace controls soft-offline pages" from Jiaqi Yan
permits userspace to stop the kernel's automatic treatment of
excessive correctable memory errors. In order to permit userspace to
monitor and handle this situation.
- Kefeng Wang's series "mm: migrate: support poison recover from
migrate folio" teaches the kernel to appropriately handle migration
from poisoned source folios rather than simply panicing.
- SeongJae Park's series "Docs/damon: minor fixups and improvements"
does those things.
- In the series "mm/zsmalloc: change back to per-size_class lock"
Chengming Zhou improves zsmalloc's scalability and memory
utilization.
- Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
pinning memfd folios" makes the GUP code use FOLL_PIN rather than
bare refcount increments. So these paes can first be moved aside if
they reside in the movable zone or a CMA block.
- Andrii Nakryiko has added a binary ioctl()-based API to
/proc/pid/maps for much faster reading of vma information. The series
is "query VMAs from /proc/<pid>/maps".
- In the series "mm: introduce per-order mTHP split counters" Lance
Yang improves the kernel's presentation of developer information
related to multisize THP splitting.
- Michael Ellerman has developed the series "Reimplement huge pages
without hugepd on powerpc (8xx, e500, book3s/64)". This permits
userspace to use all available huge page sizes.
- In the series "revert unconditional slab and page allocator fault
injection calls" Vlastimil Babka removes a performance-affecting and
not very useful feature from slab fault injection.
* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
mm/mglru: fix ineffective protection calculation
mm/zswap: fix a white space issue
mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
mm/hugetlb: fix possible recursive locking detected warning
mm/gup: clear the LRU flag of a page before adding to LRU batch
mm/numa_balancing: teach mpol_to_str about the balancing mode
mm: memcg1: convert charge move flags to unsigned long long
alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
lib: reuse page_ext_data() to obtain codetag_ref
lib: add missing newline character in the warning message
mm/mglru: fix overshooting shrinker memory
mm/mglru: fix div-by-zero in vmpressure_calc_level()
mm/kmemleak: replace strncpy() with strscpy()
mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
mm: ignore data-race in __swap_writepage
hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
mm: shmem: rename mTHP shmem counters
mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
mm/migrate: putback split folios when numa hint migration fails
...
Merge tag 'rtc-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Subsystem:
- add missing MODULE_DESCRIPTION() macro
- fix offset addition for alarms
Drivers:
- isl1208: alarm clearing fixes
- mcp794xx: oscillator failure detection
- stm32: stm32mp25 support
- tps6594: power management support"
* tag 'rtc-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: stm32: add new st,stm32mp25-rtc compatible and check RIF configuration
dt-bindings: rtc: stm32: introduce new st,stm32mp25-rtc compatible
rtc: Drop explicit initialization of struct i2c_device_id::driver_data to 0
rtc: interface: Add RTC offset to alarm after fix-up
rtc: ds1307: Clamp year to valid BCD (0-99) in `set_time()`
rtc: ds1307: Detect oscillator fail on mcp794xx
rtc: isl1208: Update correct procedure for clearing alarm
rtc: isl1208: Add a delay for clearing alarm
dt-bindings: rtc: Convert rtc-fsl-ftm-alarm.txt to yaml format
rtc: add missing MODULE_DESCRIPTION() macro
rtc: abx80x: Fix return value of nvmem callback on read
rtc: cmos: Fix return value of nvmem callbacks
rtc: isl1208: Fix return value of nvmem callbacks
rtc: tps6594: Add power management support
rtc: tps6594: introduce private structure as drvdata
rtc: tps6594: Fix memleak in probe
Merge tag '6.11-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Six smb3 client fixes, most for stable including important netfs fixes:
- various netfs related fixes for cifs addressing some regressions in
6.10 (e.g. generic/708 and some multichannel crediting related
issues)
- fix for a noisy log message on copy_file_range
- add trace point for read/write credits"
* tag '6.11-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix missing fscache invalidation
cifs: Add a tracepoint to track credits involved in R/W requests
cifs: Fix setting of zero_point after DIO write
cifs: Fix missing error code set
cifs: Fix server re-repick on subrequest retry
cifs: fix noisy message on copy_file_range
kallsyms: unify seq and start_pos fields of struct sym_entry
The struct sym_entry uses the 'seq' and 'start_pos' fields to remember
the index in the symbol table. They serve the same purpose and are not
used simultaneously. Unify them.
kallsyms: add more original symbol type/name in comment lines
Commit bea5b7450474 ("kallsyms: expand symbol name into comment for
debugging") added the uncompressed type/name in the comment lines of
kallsyms_offsets.
It would be useful to do the same for kallsyms_names and
kallsyms_seqs_of_names.
Thomas Weißschuh [Sat, 20 Jul 2024 09:18:12 +0000 (11:18 +0200)]
kbuild: add script and target to generate pacman package
pacman is the package manager used by Arch Linux and its derivates.
Creating native packages from the kernel tree has multiple advantages:
* The package triggers the correct hooks for initramfs generation and
bootloader configuration
* Uninstallation is complete and also invokes the relevant hooks
* New UAPI headers can be installed without any manual bookkeeping
The PKGBUILD file is a modified version of the one used for the
downstream Arch Linux "linux" package.
Extra steps that should not be necessary for a development kernel have
been removed and an UAPI header package has been added.
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under
KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
LoongArch:
- Add paravirt steal time support
- Add support for KVM_DIRTY_LOG_INITIALLY_SET
- Add perf kvm-stat support for loongarch
RISC-V:
- Redirect AMO load/store access fault traps to guest
- perf kvm stat support
- Use guest files for IMSIC virtualization, when available
s390:
- Assortment of tiny fixes which are not time critical
x86:
- Fixes for Xen emulation
- Add a global struct to consolidate tracking of host values, e.g.
EFER
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
effective APIC bus frequency, because TDX
- Print the name of the APICv/AVIC inhibits in the relevant
tracepoint
- Clean up KVM's handling of vendor specific emulation to
consistently act on "compatible with Intel/AMD", versus checking
for a specific vendor
- Drop MTRR virtualization, and instead always honor guest PAT on
CPUs that support self-snoop
- Update to the newfangled Intel CPU FMS infrastructure
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
it reads '0' and writes from userspace are ignored
- Misc cleanups
x86 - MMU:
- Small cleanups, renames and refactoring extracted from the upcoming
Intel TDX support
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
that can't hold leafs SPTEs
- Unconditionally drop mmu_lock when allocating TDP MMU page tables
for eager page splitting, to avoid stalling vCPUs when splitting
huge pages
- Bug the VM instead of simply warning if KVM tries to split a SPTE
that is non-present or not-huge. KVM is guaranteed to end up in a
broken state because the callers fully expect a valid SPTE, it's
all but dangerous to let more MMU changes happen afterwards
x86 - AMD:
- Make per-CPU save_area allocations NUMA-aware
- Force sev_es_host_save_area() to be inlined to avoid calling into
an instrumentable function from noinstr code
- Base support for running SEV-SNP guests. API-wise, this includes a
new KVM_X86_SNP_VM type, encrypting/measure the initial image into
guest memory, and finalizing it before launching it. Internally,
there are some gmem/mmu hooks needed to prepare gmem-allocated
pages before mapping them into guest private memory ranges
This includes basic support for attestation guest requests, enough
to say that KVM supports the GHCB 2.0 specification
There is no support yet for loading into the firmware those signing
keys to be used for attestation requests, and therefore no need yet
for the host to provide certificate data for those keys.
To support fetching certificate data from userspace, a new KVM exit
type will be needed to handle fetching the certificate from
userspace.
An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
exit type to handle this was introduced in v1 of this patchset, but
is still being discussed by community, so for now this patchset
only implements a stub version of SNP Extended Guest Requests that
does not provide certificate data
x86 - Intel:
- Remove an unnecessary EPT TLB flush when enabling hardware
- Fix a series of bugs that cause KVM to fail to detect nested
pending posted interrupts as valid wake eents for a vCPU executing
HLT in L2 (with HLT-exiting disable by L1)
- KVM: x86: Suppress MMIO that is triggered during task switch
emulation
Explicitly suppress userspace emulated MMIO exits that are
triggered when emulating a task switch as KVM doesn't support
userspace MMIO during complex (multi-step) emulation
Silently ignoring the exit request can result in the
WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
for some other reason prior to purging mmio_needed
See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write
exits if emulator detects exception") for more details on KVM's
limitations with respect to emulated MMIO during complex emulator
flows
Generic:
- Rename the AS_UNMOVABLE flag that was introduced for KVM to
AS_INACCESSIBLE, because the special casing needed by these pages
is not due to just unmovability (and in fact they are only
unmovable because the CPU cannot access them)
- New ioctl to populate the KVM page tables in advance, which is
useful to mitigate KVM page faults during guest boot or after live
migration. The code will also be used by TDX, but (probably) not
through the ioctl
- Enable halt poll shrinking by default, as Intel found it to be a
clear win
- Setup empty IRQ routing when creating a VM to avoid having to
synchronize SRCU when creating a split IRQCHIP on x86
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
a flag that arch code can use for hooking both sched_in() and
sched_out()
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace
detect bugs
- Mark a vCPU as preempted if and only if it's scheduled out while in
the KVM_RUN loop, e.g. to avoid marking it preempted and thus
writing guest memory when retrieving guest state during live
migration blackout
Selftests:
- Remove dead code in the memslot modification stress test
- Treat "branch instructions retired" as supported on all AMD Family
17h+ CPUs
- Print the guest pseudo-RNG seed only when it changes, to avoid
spamming the log for tests that create lots of VMs
- Make the PMU counters test less flaky when counting LLC cache
misses by doing CLFLUSH{OPT} in every loop iteration"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
crypto: ccp: Add the SNP_VLEK_LOAD command
KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
KVM: x86: Replace static_call_cond() with static_call()
KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
x86/sev: Move sev_guest.h into common SEV header
KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
KVM: x86: Suppress MMIO that is triggered during task switch emulation
KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
KVM: Document KVM_PRE_FAULT_MEMORY ioctl
mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
perf kvm: Add kvm-stat for loongarch64
LoongArch: KVM: Add PV steal time support in guest side
...
Merge tag 'mtd/for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD updates from Miquel Raynal:
"Nothing stands out for this merge window, mostly minor fixes, such as
module descriptions, the use of debug macros and Makefile
improvements.
Raw NAND changes;
- The Freescale MXC driver has been converted to the newer
'->exec_op()' interface
- The meson driver now supports handling the boot ROM area with very
specific ECC needs
- Support for the iMX8QXP has been added to the GPMI driver
- The lpx32xx driver now can get the DMA channels using DT entries
- The Qcom binding has been improved to be more future proof by Rob
- And then there is the usual load of misc and minor changes
SPI-NAND changes:
- The Macronix vendor driver has been improved to support an extended
ID to avoid conflicting with older devices after an ID reuse issue
SPI NOR changes:
- Drop support for Xilinx S3AN flashes. These flashes are for the
very old Xilinx Spartan 3 FPGAs and they need some awkward code in
the core to support.
Drop support for these flashes, along with the special handling we
needed for them in the core like non-power-of-2 page size handling
and the .setup() callback.
- Fix regression for old w25q128 flashes without SFDP tables.
Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond
w25q128") dropped support for such devices under the assumption
that they aren't being used anymore. Users have now surfaced [0] so
fix the regression by supporting both kind of devices.
- Core cleanups including removal of SPI_NOR_NO_FR flag and
simplification of spi_nor_get_flash_info()"
Link: https://lore.kernel.org/r/CALxbwRo_-9CaJmt7r7ELgu+vOcgk=xZcGHobnKf=oT2=u4d4aA@mail.gmail.com/
* tag 'mtd/for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (28 commits)
mtd: rawnand: lpx32xx: Fix dma_request_chan() error checks
mtd: spinand: macronix: Add support for serial NAND flash
mtd: spinand: macronix: Add support for reading Device ID 2
mtd: rawnand: lpx32xx: Request DMA channels using DT entries
dt-bindings: mtd: qcom,nandc: Define properties at top-level
mtd: rawnand: intel: use 'time_left' variable with wait_for_completion_timeout()
mtd: rawnand: mxc: use 'time_left' variable with wait_for_completion_timeout()
mtd: rawnand: gpmi: add iMX8QXP support.
mtd: rawnand: gpmi: add 'support_edo_timing' in gpmi_devdata
mtd: cmdlinepart: Replace `dbg()` macro with `pr_debug()`
mtd: add missing MODULE_DESCRIPTION() macros
mtd: make mtd_test.c a separate module
dt-bindings: mtd: gpmi-nand: Add 'fsl,imx8qxp-gpmi-nand' compatible string
mtd: rawnand: cadence: remove unused struct 'ecc_info'
mtd: rawnand: mxc: support software ECC
mtd: rawnand: mxc: implement exec_op
mtd: rawnand: mxc: separate page read from ecc calc
mtd: spi-nor: winbond: fix w25q128 regression
mtd: spi-nor: simplify spi_nor_get_flash_info()
mtd: spi-nor: get rid of SPI_NOR_NO_FR
...
Merge tag 'landlock-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This simplifies code and improves documentation"
* tag 'landlock-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Various documentation improvements
landlock: Clarify documentation for struct landlock_ruleset_attr
landlock: Use bit-fields for storing handled layer access masks
Merge tag 'firewire-updates-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire updates from Takashi Sakamoto:
"There are many lines of changes for FireWire subsystem, but there is
practically no functional change.
Most of the changes are for code refactoring, some KUnit tests to
added helper functions, and new tracepoints events for both the core
functions and 1394 OHCI driver.
The tracepoints events now cover the verbose logging enabled by debug
parameter of firewire-ohci kernel module. The parameter would be
removed in any future timing, thus it is now deprecated"
* tag 'firewire-updates-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: (32 commits)
firewire: core: move copy_port_status() helper function to TP_fast_assign() block
Revert "firewire: ohci: use common macro to interpret be32 data in le32 buffer"
firewire: ohci: add tracepoints event for data of Self-ID DMA
firewire: ohci: use inline functions to operate data of self-ID DMA
firewire: ohci: add static inline functions to deserialize for Self-ID DMA operation
firewire: ohci: use static function to handle endian issue on PowerPC platform
firewire: ohci: use common macro to interpret be32 data in le32 buffer
firewire: core: Fix spelling mistakes in tracepoint messages
firewire: ohci: add tracepoints event for hardIRQ event
firewire: ohci: add support for Linux kernel tracepoints
firewire: core: add tracepoints events for completions of packets in isochronous context
firewire: core: add tracepoints events for queueing packets of isochronous context
firewire: core: add tracepoints events for flushing completions of isochronous context
firewire: core: add tracepoints events for flushing of isochronous context
firewire: core: add tracepoints events for starting/stopping of isochronous context
firewire: core: add tracepoints events for setting channels of multichannel context
firewire: core: add tracepoints events for allocation/deallocation of isochronous context
firewire: core: undefine macros after use in tracepoints events
firewire: core: record card index in tracepoints event for self ID sequence
firewire: core: use inline helper functions to serialize phy config packet
...
Pavel Begunkov [Tue, 16 Jul 2024 18:05:46 +0000 (19:05 +0100)]
io_uring: fix lost getsockopt completions
There is a report that iowq executed getsockopt never completes. The
reason being that io_uring_cmd_sock() can return a positive result, and
io_uring_cmd() propagates it back to core io_uring, instead of IOU_OK.
In case of io_wq_submit_work(), the request will be dropped without
completing it.
The offending code was introduced by a hack in a9c3eda7eada9 ("io_uring: fix submission-failure handling for uring-cmd"),
however it was fine until getsockopt was introduced and started
returning positive results.
The right solution is to always return IOU_OK, since e0b23d9953b0c ("io_uring: optimise ltimeout for inline execution"),
we should be able to do it without problems, however for the sake of
backporting and minimising side effects, let's keep returning negative
return codes and otherwise do IOU_OK.
Merge tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for various new ISA extensions:
* The Zve32[xf] and Zve64[xfd] sub-extensios of the vector
extension
* Zimop and Zcmop for may-be-operations
* The Zca, Zcf, Zcd and Zcb sub-extensions of the C extension
* Zawrs
- riscv,cpu-intc is now dtschema
- A handful of performance improvements and cleanups to text patching
- Support for memory hot{,un}plug
- The highest user-allocatable virtual address is now visible in
hwprobe
* tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (58 commits)
riscv: lib: relax assembly constraints in hweight
riscv: set trap vector earlier
KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
KVM: riscv: Support guest wrs.nto
riscv: hwprobe: export Zawrs ISA extension
riscv: Add Zawrs support for spinlocks
dt-bindings: riscv: Add Zawrs ISA extension description
riscv: Provide a definition for 'pause'
riscv: hwprobe: export highest virtual userspace address
riscv: Improve sbi_ecall() code generation by reordering arguments
riscv: Add tracepoints for SBI calls and returns
riscv: Optimize crc32 with Zbc extension
riscv: Enable DAX VMEMMAP optimization
riscv: mm: Add support for ZONE_DEVICE
virtio-mem: Enable virtio-mem for RISC-V
riscv: Enable memory hotplugging for RISC-V
riscv: mm: Take memory hotplug read-lock during kernel page table dump
riscv: mm: Add memory hotplugging support
riscv: mm: Add pfn_to_kaddr() implementation
riscv: mm: Refactor create_linear_mapping_range() for memory hot add
...
Tiezhu Yang [Sat, 20 Jul 2024 14:41:07 +0000 (22:41 +0800)]
LoongArch: Check TIF_LOAD_WATCH to enable user space watchpoint
Currently, there are some places to set CSR.PRMD.PWE, the first one is
in hw_breakpoint_thread_switch() to enable user space singlestep via
checking TIF_SINGLESTEP, the second one is in hw_breakpoint_control() to
enable user space watchpoint. For the latter case, it should also check
TIF_LOAD_WATCH to make the logic correct and clear.
WANG Rui [Sat, 20 Jul 2024 14:41:07 +0000 (22:41 +0800)]
LoongArch: Use rustc option -Zdirect-access-external-data
-Zdirect-access-external-data is a new Rust compiler option added in
Rust 1.78, which we use to optimize the access of external data in the
Linux kernel's Rust code. This patch modifies the Rust code in vmlinux
to directly access externa data, using PC-REL instead of GOT. However,
Rust code whithin modules is constrained by the PC-REL addressing range
and is explicitly set to use an indirect method.
Xi Ruoyao [Sat, 20 Jul 2024 14:41:07 +0000 (22:41 +0800)]
LoongArch: Add support for relocating the kernel with RELR relocation
RELR as a relocation packing format for relative relocations for
reducing the size of relative relocation records. In a position
independent executable there are often many relative relocation
records, and our vmlinux is a PIE.
The LLD linker (since 17.0.0) and the BFD linker (since 2.43) supports
packing the relocations in the RELR format for LoongArch, with the flag
-z pack-relative-relocs.
Commits 5cf896fb6be3eff ("arm64: Add support for relocating the kernel
with RELR relocations") and ccb2d173b983984bfa ("Makefile: use -z
pack-relative-relocs") have already added the framework to use RELR.
We just need to wire it up and process the RELR relocation records in
relocate_relative() in addition to the RELA relocation records.
A ".p2align 3" directive is added to la_abs macro or the BFD linker
cannot pack the relocation records against the .la_abs section (the
". = ALIGN(8);" directive in vmlinux.lds.S is too late in the linking
process).
With defconfig and CONFIG_RELR vmlinux.efi is 2.1 MiB (6%) smaller, and
vmlinuz.efi (using gzip compression) is 384 KiB (2.8%) smaller.
LoongArch: Use correct API to map cmdline in relocate_kernel()
fw_arg1 is in memory space rather than I/O space, so we should use
early_memremap_ro() instead of early_ioremap() to map the cmdline.
Moreover, we should unmap it after using.
LoongArch: Automatically disable KASLR for hibernation
Hibernation assumes the memory layout after resume be the same as that
before sleep, so it expects the kernel is loaded at the same position.
To achieve this goal we automatically disable KASLR if user explicitly
requests hibernation via the "resume=" command line. Since "nohibernate"
and "noresume" have higher priorities than "resume=", we only disable
KASLR if there is no "nohibernate" and "noresume".
Jiaxun Yang [Sat, 20 Jul 2024 14:41:06 +0000 (22:41 +0800)]
LoongArch: Add ACPI standard hardware register based S3 support
Most LoongArch 64 machines are using custom "SADR" ACPI extension to
perform ACPI S3 sleep. However the standard ACPI way to perform sleep
is to write a value to ACPI PM1/SLEEP_CTL register, and this is never
supported properly in kernel.
Add standard S3 sleep by providing a default DoSuspend function which
calls ACPI's acpi_enter_sleep_state() routine when SADR is not provided
by the firmware.
Also fix suspend assembly code so that ra is set properly before go
into sleep routine. (Previously linked address of jirl was set to a0,
some firmware do require return address in a0 but it's already set with
la.pcrel before).
LoongArch: Add architectural preparation for CPUFreq
Add architectural preparation for CPUFreq driver, including: Kconfig,
register definition and platform device registration.
Some of LoongArch processors support DVFS, their IOCSR.FEATURES has
IOCSRF_FREQSCALE set. And they has a micro-core in the package called
SMC (System Management Controller) to scale frequency, voltage, etc.
LoongArch: Add writecombine support for DMW-based ioremap()
Currently, only TLB-based ioremap() support writecombine, so add the
counterpart for DMW-based ioremap() with help of DMW2. The base address
(WRITECOMBINE_BASE) is configured as 0xa000000000000000.
DMW3 is unused by kernel now, however firmware may leave garbage in them
and interfere kernel's address mapping. So clear it as necessary.
BTW, centralize the DMW configuration to macro SETUP_DMWINS.
Add ARCH_HAS_DEBUG_VM_PGTABLE selection in Kconfig, in order to make
corresponding vm debug features usable on LoongArch. Also update the
corresponding arch-support.txt document.
In order for things like get_user_pages() to work on ZONE_DEVICE memory,
we need a software PTE bit to identify device-backed PFNs. Hook this up
along with the relevant helpers to join in with ARCH_HAS_PTE_DEVMAP.
Add support of kernel stack offset randomization while handling syscall,
the offset is defaultly limited by KSTACK_OFFSET_MAX().
In order to avoid triggering stack canaries (due to __builtin_alloca())
and slowing down the entry path, use __no_stack_protector attribute to
disable stack protector for do_syscall() at function level.
Add irq_work support for LoongArch via self IPIs. This make it possible
to run works in hardware interrupt context, which is a prerequisite for
NOHZ_FULL.
LoongArch: Always enumerate MADT and setup logical-physical CPU mapping
Some drivers want to use cpu_logical_map(), early_cpu_to_node() and some
other CPU mapping APIs, even if we use "nr_cpus=1" to hard limit the CPU
number. This is strongly required for the multi-bridges machines.
Currently, we stop parsing the MADT if the nr_cpus limit is reached, but
to achieve the above goal we should always enumerate the MADT table and
setup logical-physical CPU mapping whether there is a nr_cpus limit.
Rework the MADT enumeration:
1. Define a flag "cpu_enumerated" to distinguish the first enumeration
(cpu_enumerated=0) and the physical hotplug case (cpu_enumerated=1)
for set_processor_mask().
2. If cpu_enumerated=0, stop parsing only when NR_CPUS limit is reached,
so we can setup logical-physical CPU mapping; if cpu_enumerated=1,
stop parsing when nr_cpu_ids limit is reached, so we can avoid some
runtime bugs. Once logical-physical CPU mapping is setup, we will let
cpu_enumerated=1.
3. Use find_first_zero_bit() instead of cpumask_next_zero() to find the
next zero bit (free logical CPU id) in the cpu_present_mask, because
cpumask_next_zero() will stop at nr_cpu_ids.
4. Only touch cpu_possible_mask if cpu_enumerated=0, this is in order to
avoid some potential crashes, because cpu_possible_mask is marked as
__ro_after_init.
5. In prefill_possible_map(), clear cpu_present_mask bits greater than
nr_cpu_ids, in order to avoid a CPU be "present" but not "possible".
LoongArch: Define __ARCH_WANT_NEW_STAT in unistd.h
Chromium sandbox apparently wants to deny statx [1] so it could properly
inspect arguments after the sandboxed process later falls back to fstat.
Because there's currently not a "fd-only" version of statx, so that the
sandbox has no way to ensure the path argument is empty without being
able to peek into the sandboxed process's memory. For architectures able
to do newfstatat though, glibc falls back to newfstatat after getting
-ENOSYS for statx, then the respective SIGSYS handler [2] takes care of
inspecting the path argument, transforming allowed newfstatat's into
fstat instead which is allowed and has the same type of return value.
But, as LoongArch is the first architecture to not have fstat nor
newfstatat, the LoongArch glibc does not attempt falling back at all
when it gets -ENOSYS for statx -- and you see the problem there!
Actually, back when the LoongArch port was under review, people were
aware of the same problem with sandboxing clone3 [3], so clone was
eventually kept. Unfortunately it seemed at that time no one had noticed
statx, so besides restoring fstat/newfstatat to LoongArch uapi (and
postponing the problem further), it seems inevitable that we would need
to tackle seccomp deep argument inspection.
However, this is obviously a decision that shouldn't be taken lightly,
so we just restore fstat/newfstatat by defining __ARCH_WANT_NEW_STAT
in unistd.h. This is the simplest solution for now, and so we hope the
community will tackle the long-standing problem of seccomp deep argument
inspection in the future [4][5].
Also add "newstat" to syscall_abis_64 in Makefile.syscalls due to
upstream asm-generic changes.
Wolfram Sang [Tue, 16 Jul 2024 08:36:25 +0000 (10:36 +0200)]
i2c: header: improve kdoc for i2c_algorithm
Reword the explanation of @xfer, the old one was confusing and mixing up
terminology. Other than that, capitalize some words correctly and use
full line length.
The forward declaration is not needed anymore. The sentence about
"following structs" became obsolete when struct i2c_algorithm became a
kdoc. The paragraph about return values can go because we have this
information in kdoc already.
Wolfram Sang [Sat, 20 Jul 2024 13:42:18 +0000 (15:42 +0200)]
Merge tag 'i2c-host-6.11-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow
Added descriptions in the DTS for the Qualcomm SM8650 and SM8550
Camera Control Interface (CCI).
Added support for the "settle-time-us" property, which allows the
gpio-mux device to switch from one bus to another with a
configurable delay. The time can be set in the DTS.
The latest change also includes file sorting.
Fixed slot numbering in the SMBus framework to prevent failures
when more than 8 slots are occupied. It now enforces a a maximum
of 8 slots to be used. This ensures that the Intel PIIX4 device
can register the SPDs correctly without failure, even if other
slots are populated but not used.
The Freescale MXC driver has been converted to the newer ->exec_op()
interface. The meson driver now supports handling the boot ROM area with
very specific ECC needs. Support for the iMX8QXP has been added to the
GPMI driver. The lpx32xx driver now can get the DMA channels using DT
entries. The Qcom binding has been improved to be more future proof by
Rob. And then there is the usual load of misc and minor changes.
SPI-NAND changes:
The Macronix vendor driver has been improved to support an extended ID
to avoid conflicting with older devices after an ID reuse issue.
- Drop support for Xilinx S3AN flashes. These flashes are for the very
old Xilinx Spartan 3 FPGAs and they need some awkward code in the core
to support. Drop support for these flashes, along with the special
handling we needed for them in the core like non-power-of-2 page size
handling and the .setup() callback.
- Fix regression for old w25q128 flashes without SFDP tables. Commit 83e824a4a595 ("mtd: spi-nor: Correct flags for Winbond w25q128")
dropped support for such devices under the assumption that they aren't
being used anymore. Users have now surfaced [0] so fix the regression
by supporting both kind of devices.
- Core cleanups including removal of SPI_NOR_NO_FR flag and
simplification of spi_nor_get_flash_info().
The piix4 I2C bus can carry SPDs, register them if present.
Only look on bus 0, as this is where the SPDs seem to be located.
Only the first 8 slots are supported. If the system has more,
then these will not be visible.
The AUX bus can not be probed as on some platforms it reports all
devices present and all reads return "0".
This would allow the ee1004 to be probed incorrectly.
Makefile: add comment to discourage tools/* addition for kernel builds
Kbuild provides scripts/Makefile.host to build host programs used for
building the kernel. Unfortunately, there are two exceptions that opt
out of Kbuild. The build system under tools/ is a cheesy replica, and
cause issues. I was recently poked about a problem in the tools build
system, which I do not maintain (and nobody maintains). [1]
Without a comment, people might believe this is the right location
because that is where objtool lives, even if a more robust Kbuild
syntax satisfies their needs. [2]