Merge tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
cache info (via PPTT) on ACPI-based systems.
- The trap entry/exit code no longer breaks the return address stack
predictor on many systems, which results in an improvement to trap
latency.
- Support for HAVE_ARCH_STACKLEAK.
- The sv39 linear map has been extended to support 128GiB mappings.
- The frequency of the mtime CSR is now visible via hwprobe.
* tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
RISC-V: Provide the frequency of time CSR via hwprobe
riscv: Extend sv39 linear mapping max size to 128G
riscv: enable HAVE_ARCH_STACKLEAK
riscv: signal: Remove unlikely() from WARN_ON() condition
riscv: Improve exception and system call latency
RISC-V: Select ACPI PPTT drivers
riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
RISC-V: ACPI: Enable SPCR table for console output on RISC-V
riscv: boot: remove duplicated targets line
trace: riscv: Remove deprecated kprobe on ftrace support
riscv: cpufeature: Extract common elements from extension checking
riscv: Introduce vendor variants of extension helpers
riscv: Add vendor extensions to /proc/cpuinfo
riscv: Extend cpufeature.c to detect vendor extensions
RISC-V: run savedefconfig for defconfig
RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically
ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init
ACPI: NUMA: change the ACPI_NUMA to a hidden option
ACPI: NUMA: Add handler for SRAT RINTC affinity structure
...
Merge tag 'for-linus-6.11-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Two fixes for issues introduced in this merge window:
- fix enhanced debugging in the Xen multicall handling
- two patches fixing a boot failure when running as dom0 in PVH mode"
* tag 'for-linus-6.11-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: fix memblock_reserve() usage on PVH
x86/xen: move xen_reserve_extra_memory()
xen: fix multicall debug data referencing
minmax: avoid overly complicated constant expressions in VM code
The minmax infrastructure is overkill for simple constants, and can
cause huge expansions because those simple constants are then used by
other things.
For example, 'pageblock_order' is a core VM constant, but because it was
implemented using 'min_t()' and all the type-checking that involves, it
actually expanded to something like 2.5kB of preprocessor noise.
And when that simple constant was then used inside other expansions:
the end result was that one statement expanding to 253kB in size.
There are probably other cases of this, but this one case certainly
stood out.
I've added 'MIN_T()' and 'MAX_T()' macros for this kind of "core simple
constant with specific type" use. These macros skip the type checking,
and as such need to be very sparingly used only for obvious cases that
have active issues like this.
minmax: avoid overly complex min()/max() macro arguments in xen
We have some very fancy min/max macros that have tons of sanity checking
to warn about mixed signedness etc.
This is all things that a sane compiler should warn about, but there are
no sane compiler interfaces for this, and '-Wsign-compare' is broken [1]
and not useful.
So then we compensate (some would say over-compensate) by doing the
checks manually with some truly horrid macro games.
And no, we can't just use __builtin_types_compatible_p(), because the
whole question of "does it make sense to compare these two values" is a
lot more complicated than that.
For example, it makes a ton of sense to compare unsigned values with
simple constants like "5", even if that is indeed a signed type. So we
have these very strange macros to try to make sensible type checking
decisions on the arguments to 'min()' and 'max()'.
But that can cause enormous code expansion if the min()/max() macros are
used with complicated expressions, and particularly if you nest these
things so that you get the first big expansion then expanded again.
The xen setup.c file ended up ballooning to over 50MB of preprocessed
noise that takes 15s to compile (obviously depending on the build host),
largely due to one single line.
So let's split that one single line to just be simpler. I think it ends
up being more legible to humans too at the same time. Now that single
file compiles in under a second.
Merge tag 'auxdisplay-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull auxdisplay updates from Geert Uytterhoeven:
- add support for configuring the boot message on line displays
- miscellaneous fixes and improvements
* tag 'auxdisplay-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
auxdisplay: ht16k33: Drop reference after LED registration
auxdisplay: Use sizeof(*pointer) instead of sizeof(type)
auxdisplay: hd44780: add missing MODULE_DESCRIPTION() macro
auxdisplay: linedisp: add missing MODULE_DESCRIPTION() macro
auxdisplay: linedisp: Support configuring the boot message
auxdisplay: charlcd: Provide a forward declaration
Merge tag 'sound-fix-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of fixes gathered since the previous pull.
We see a bit large LOCs at a HD-audio quirk, but that's only bulk COEF
data, hence it's safe to take. In addition to that, there were two
minor fixes for MIDI 2.0 handling for ALSA core, and the rest are all
rather random small and device-specific fixes"
* tag 'sound-fix-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: fsl-asoc-card: Dynamically allocate memory for snd_soc_dai_link_components
ASoC: amd: yc: Support mic on Lenovo Thinkpad E16 Gen 2
ALSA: hda/realtek: Implement sound init sequence for Samsung Galaxy Book3 Pro 360
ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models
ASoC: SOF: ipc4-topology: Preserve the DMA Link ID for ChainDMA on unprepare
ASoC: SOF: ipc4-topology: Only handle dai_config with HW_PARAMS for ChainDMA
ALSA: ump: Force 1 Group for MIDI1 FBs
ALSA: ump: Don't update FB name for static blocks
ALSA: usb-audio: Add a quirk for Sonix HD USB Camera
ASoC: TAS2781: Fix tasdev_load_calibrated_data()
ASoC: tegra: select CONFIG_SND_SIMPLE_CARD_UTILS
ASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is reachable
ALSA: usb-audio: Move HD Webcam quirk to the right place
ALSA: hda: tas2781: mark const variables as __maybe_unused
ALSA: usb-audio: Fix microphone sound on HD webcam.
ASoC: sof: amd: fix for firmware reload failure in Vangogh platform
ASoC: Intel: Fix RT5650 SSP lookup
ASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices
ASoC: SOF: imx8m: Fix DSP control regmap retrieval
i915:
- Reset intel_dp->link_trained before retraining the link
- Don't switch the LTTPR mode on an active link
- Do not consider preemption during execlists_dequeue for gen8
- Allow NULL memory region
xe:
- xe_exec ioctl minor fix on sync entry cleanup upon error
- SRIOV: limit VF LMEM provisioning
- Wedge mode fixes
v3d:
- fix indirect dispatch on newer v3d revs
panel:
- fix panel backlight bindings"
* tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drm/kernel: (39 commits)
drm/amdgpu: reset vm state machine after gpu reset(vram lost)
drm/amdgpu: add missed harvest check for VCN IP v4/v5
drm/amdgpu: Fix eeprom max record count
drm/amdgpu: fix ras UE error injection failure issue
drm/amd/display: Remove ASSERT if significance is zero in math_ceil2
drm/amd/display: Check for NULL pointer
drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
drm/amdgpu: Add empty HDP flush function to VCN v4.0.3
drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
drm/amd/amdgpu: Fix uninitialized variable warnings
drm/amdgpu: Fix atomics on GFX12
drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
drm/i915: Allow NULL memory region
drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
dt-bindings: display: panel: samsung,atna33xc20: Document ATNA45AF01
drm/xe: Don't suspend device upon wedge
drm/xe: Wedge the entire device
drm/xe/pf: Limit fair VF LMEM provisioning
drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
drm/amd/display: fix corruption with high refresh rates on DCN 3.0
...
Merge tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:
- Fix KMSAN build breakage caused by the conflict between s390 and
mm-stable trees
- Add KMSAN page markers for ptdump
- Add runtime constant support
- Fix __pa/__va for modules under non-GPL licenses by exporting
necessary vm_layout struct with EXPORT_SYMBOL to prevent linkage
problems
- Fix an endless loop in the CF_DIAG event stop in the CPU Measurement
Counter Facility code when the counter set size is zero
- Remove the PROTECTED_VIRTUALIZATION_GUEST config option and enable
its functionality by default
- Support allocation of multiple MSI interrupts per device and improve
logging of architecture-specific limitations
- Add support for lowcore relocation as a debugging feature to catch
all null ptr dereferences in the kernel address space, improving
detection beyond the current implementation's limited write access
protection
- Clean up and rework CPU alternatives to allow for callbacks and early
patching for the lowcore relocation
* tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits)
s390: Remove protvirt and kvm config guards for uv code
s390/boot: Add cmdline option to relocate lowcore
s390/kdump: Make kdump ready for lowcore relocation
s390/entry: Make system_call() ready for lowcore relocation
s390/entry: Make ret_from_fork() ready for lowcore relocation
s390/entry: Make __switch_to() ready for lowcore relocation
s390/entry: Make restart_int_handler() ready for lowcore relocation
s390/entry: Make mchk_int_handler() ready for lowcore relocation
s390/entry: Make int handlers ready for lowcore relocation
s390/entry: Make pgm_check_handler() ready for lowcore relocation
s390/entry: Add base register to CHECK_VMAP_STACK/CHECK_STACK macro
s390/entry: Add base register to SIEEXIT macro
s390/entry: Add base register to MBEAR macro
s390/entry: Make __sie64a() ready for lowcore relocation
s390/head64: Make startup code ready for lowcore relocation
s390: Add infrastructure to patch lowcore accesses
s390/atomic_ops: Disable flag outputs constraint for GCC versions below 14.2.0
s390/entry: Move SIE indicator flag to thread info
s390/nmi: Simplify ptregs setup
s390/alternatives: Remove alternative facility list
...
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The usual summary below, but the main fix is for the fast GUP lockless
page-table walk when we have a combination of compile-time and
run-time folding of the p4d and the pud respectively.
- Remove some redundant Kconfig conditionals
- Fix string output in ptrace selftest
- Fix fast GUP crashes in some page-table configurations
- Remove obsolete linker option when building the vDSO
- Fix some sysreg field definitions for the GIC"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Fix lockless walks with static and dynamic page-table folding
arm64/sysreg: Correct the values for GICv4.1
arm64/vdso: Remove --hash-style=sysv
kselftest: missing arg in ptrace.c
arm64/Kconfig: Remove redundant 'if HAVE_FUNCTION_GRAPH_TRACER'
arm64: remove redundant 'if HAVE_ARCH_KASAN' in Kconfig
Merge tag 'ceph-for-6.11-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"A small patchset to address bogus I/O errors and ultimately an
assertion failure in the face of watch errors with -o exclusive
mappings in RBD marked for stable and some assorted CephFS fixes"
* tag 'ceph-for-6.11-rc1' of https://github.com/ceph/ceph-client:
rbd: don't assume rbd_is_lock_owner() for exclusive mappings
rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings
rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
ceph: fix incorrect kmalloc size of pagevec mempool
ceph: periodically flush the cap releases
ceph: convert comma to semicolon in __ceph_dentry_dir_lease_touch()
ceph: use cap_wait_list only if debugfs is enabled
Merge tag 'erofs-for-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull more erofs updates from Gao Xiang:
- Support STATX_DIOALIGN and FS_IOC_GETFSSYSFSPATH
- Fix a race of LZ4 decompression due to recent refactoring
- Another multi-page folio adaption in erofs_bread()
* tag 'erofs-for-6.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: convert comma to semicolon
erofs: support multi-page folios for erofs_bread()
erofs: add support for FS_IOC_GETFSSYSFSPATH
erofs: fix race in z_erofs_get_gbuf()
erofs: support STATX_DIOALIGN
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull struct file leak fixes from Al Viro:
"a couple of leaks on failure exits missing fdput()"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
lirc: rc_dev_get_from_fd(): fix file leak
powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
arm64: allow installing compressed image by default
On arm64 we build compressed images, but "make install" by default will
install the old non-compressed one. To actually get the compressed
image install, you need to use "make zinstall", which is not the usual
way to install a kernel.
Which may not sound like much of an issue, but when you deal with
multiple architectures (and years of your fingers knowing the regular
"make install" incantation), this inconsistency is pretty annoying.
But as Will Deacon says:
"Sadly, bootloaders being as top quality as you might expect, I don't
think we're in a position to rely on decompressor support across the
board. Our Image.gz is literally just that -- we don't have a built-in
decompressor (nor do I think we want to rush into that again after the
fun we had on arm32) and the recent EFI zboot support solves that
problem for platforms using EFI.
Changing the default 'install' target terrifies me. There are bound to
be folks with embedded boards who've scripted this and we could really
ruin their day if we quietly give them a compressed kernel that their
bootloader doesn't know how to handle :/"
So make this conditional on a new "COMPRESSED_INSTALL" option.
Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux
Pull bitmap updates from Yury Norov:
"Random fixes"
* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
riscv: Remove unnecessary int cast in variable_fls()
radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
bitops: Add a comment explaining the double underscore macros
lib: bitmap: add missing MODULE_DESCRIPTION() macros
cpumask: introduce assign_cpu() macro
RISC-V: Provide the frequency of time CSR via hwprobe
The RISC-V architecture makes a real time counter CSR (via RDTIME
instruction) available for applications in U-mode but there is no
architected mechanism for an application to discover the frequency
the counter is running at. Some applications (e.g., DPDK) use the
time counter for basic performance analysis as well as fine grained
time-keeping.
Add support to the hwprobe system call to export the time CSR
frequency to code running in U-mode.
Stuart Menefy [Sun, 30 Jun 2024 11:05:49 +0000 (12:05 +0100)]
riscv: Extend sv39 linear mapping max size to 128G
This harmonizes all virtual addressing modes which can now all map
(PGDIR_SIZE * PTRS_PER_PGD) / 4 of physical memory.
The RISCV implementation of KASAN requires that the boundary between
shallow mappings are aligned on an 8G boundary. In this case we need
VMALLOC_START to be 8G aligned. So although we only need to move the
start of the linear mapping down by 4GiB to allow 128GiB to be mapped,
we actually move it down by 8GiB (creating a 4GiB hole between the
linear mapping and KASAN shadow space) to maintain the alignment
requirement.
The ACPI SPCR code has been used to enable console output for ARM64 and
X86. The same code can be reused for RISC-V. Furthermore, SPCR table is
mandated for headless system as outlined in the RISC-V BRS
Specification, chapter 6.
* b4-shazam-merge:
RISC-V: ACPI: Enable SPCR table for console output on RISC-V
Anton Blanchard [Fri, 7 Jun 2024 06:13:35 +0000 (23:13 -0700)]
riscv: Improve exception and system call latency
Many CPUs implement return address branch prediction as a stack. The
RISCV architecture refers to this as a return address stack (RAS). If
this gets corrupted then the CPU will mispredict at least one but
potentally many function returns.
There are two issues with the current RISCV exception code:
- We are using the alternate link stack (x5/t0) for the indirect branch
which makes the hardware think this is a function return. This will
corrupt the RAS.
- We modify the return address of handle_exception to point to
ret_from_exception. This will also corrupt the RAS.
Testing the null system call latency before and after the patch:
FS_IOC_GETFSSYSFSPATH ioctl exposes /sys/fs path of a given filesystem,
potentially standarizing sysfs reporting. This patch add support for
FS_IOC_GETFSSYSFSPATH for erofs, "erofs/<dev>" will be outputted for bdev
cases, "erofs/[domain_id,]<fs_id>" will be outputted for fscache cases.
Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
A lot of networking people were at a conference last week, busy
catching COVID, so relatively short PR.
Current release - regressions:
- tcp: process the 3rd ACK with sk_socket for TFO and MPTCP
Current release - new code bugs:
- l2tp: protect session IDR and tunnel session list with one lock,
make sure the state is coherent to avoid a warning
- eth: bnxt_en: update xdp_rxq_info in queue restart logic
- eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field
Previous releases - regressions:
- xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
the field reuses previously un-validated pad
Previous releases - always broken:
- tap/tun: drop short frames to prevent crashes later in the stack
- eth: ice: add a per-VF limit on number of FDIR filters
- af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash"
* tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
tun: add missing verification for short frame
tap: add missing verification for short frame
mISDN: Fix a use after free in hfcmulti_tx()
gve: Fix an edge case for TSO skb validity check
bnxt_en: update xdp_rxq_info in queue restart logic
tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
bpf: Fix a segment issue when downgrading gso_size
net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
MAINTAINERS: make Breno the netconsole maintainer
MAINTAINERS: Update bonding entry
net: nexthop: Initialize all fields in dumped nexthops
net: stmmac: Correct byte order of perfect_match
selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
tipc: Return non-zero value from tipc_udp_addr2str() on error
netfilter: nft_set_pipapo_avx2: disable softinterrupts
ice: Fix recipe read procedure
ice: Add a per-VF limit on number of FDIR filters
net: bonding: correctly annotate RCU in bond_should_notify_peers()
...
Merge tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl constification from Joel Granados:
"Treewide constification of the ctl_table argument of proc_handlers
using a coccinelle script and some manual code formatting fixups.
This is a prerequisite to moving the static ctl_table structs into
read-only data section which will ensure that proc_handler function
pointers cannot be modified"
* tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
sysctl: treewide: constify the ctl_table argument of proc_handlers
Merge tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Wipe screen_info after allocating it from the heap - used by arm32
and EFI zboot, other EFI architectures allocate it statically
- Revert to allocating boot_params from the heap on x86 when entering
via the native PE entrypoint, to work around a regression on older
Dell hardware
* tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efistub: Revert to heap allocated boot_params for PE entrypoint
efi/libstub: Zero initialize heap allocated struct screen_info
Merge tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
"Three small changes this cycle:
- Clean up an architecture abstraction that is no longer needed
because all the architectures have converged.
- Actually use the prompt argument to kdb_position_cursor() instead
of ignoring it (functionally this fix is a nop but that was due to
luck rather than good judgement)
- Fix a -Wformat-security warning"
* tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kdb: Get rid of redundant kdb_curr_task()
kdb: Use the passed prompt in kdb_position_cursor()
kdb: address -Wformat-security warnings
Merge tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- Use improved timer sync for Loongson64
- Fix address of GCR_ACCESS register
- Add missing MODULE_DESCRIPTION
* tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: sibyte: add missing MODULE_DESCRIPTION() macro
MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later
MIPS: Loongson64: Switch to SYNC_R4K
Merge tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"The gettimeofday() and clock_gettime() syscalls are now available as
vDSO functions, and Dave added a patch which allows to use NVMe cards
in the PCI slots as fast and easy alternative to SCSI discs.
Summary:
- add gettimeofday() and clock_gettime() vDSO functions
- enable PCI_MSI_ARCH_FALLBACKS to allow PCI to PCIe bridge adaptor
with PCIe NVME card to function in parisc machines
- allow users to reduce kernel unaligned runtime warnings
- minor code cleanups"
* tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Add support for CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN
parisc: Use max() to calculate parisc_tlb_flush_threshold
parisc: Fix warning at drivers/pci/msi/msi.h:121
parisc: Add 64-bit gettimeofday() and clock_gettime() vDSO functions
parisc: Add 32-bit gettimeofday() and clock_gettime() vDSO functions
parisc: Clean up unistd.h file
Merge tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML updates from Richard Weinberger:
- Support for preemption
- i386 Rust support
- Huge cleanup by Benjamin Berg
- UBSAN support
- Removal of dead code
* tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits)
um: vector: always reset vp->opened
um: vector: remove vp->lock
um: register power-off handler
um: line: always fill *error_out in setup_one_line()
um: remove pcap driver from documentation
um: Enable preemption in UML
um: refactor TLB update handling
um: simplify and consolidate TLB updates
um: remove force_flush_all from fork_handler
um: Do not flush MM in flush_thread
um: Delay flushing syscalls until the thread is restarted
um: remove copy_context_skas0
um: remove LDT support
um: compress memory related stub syscalls while adding them
um: Rework syscall handling
um: Add generic stub_syscall6 function
um: Create signal stack memory assignment in stub_data
um: Remove stub-data.h include from common-offsets.h
um: time-travel: fix signal blocking race/hang
um: time-travel: remove time_exit()
...
Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core changes for 6.11-rc1.
Lots of stuff in here, with not a huge diffstat, but apis are evolving
which required lots of files to be touched. Highlights of the changes
in here are:
- platform remove callback api final fixups (Uwe took many releases
to get here, finally!)
- Rust bindings for basic firmware apis and initial driver-core
interactions.
It's not all that useful for a "write a whole driver in rust" type
of thing, but the firmware bindings do help out the phy rust
drivers, and the driver core bindings give a solid base on which
others can start their work.
There is still a long way to go here before we have a multitude of
rust drivers being added, but it's a great first step.
- driver core const api changes.
This reached across all bus types, and there are some fix-ups for
some not-common bus types that linux-next and 0-day testing shook
out.
This work is being done to help make the rust bindings more safe,
as well as the C code, moving toward the end-goal of allowing us to
put driver structures into read-only memory. We aren't there yet,
but are getting closer.
- minor devres cleanups and fixes found by code inspection
- arch_topology minor changes
- other minor driver core cleanups
All of these have been in linux-next for a very long time with no
reported problems"
* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
ARM: sa1100: make match function take a const pointer
sysfs/cpu: Make crash_hotplug attribute world-readable
dio: Have dio_bus_match() callback take a const *
zorro: make match function take a const pointer
driver core: module: make module_[add|remove]_driver take a const *
driver core: make driver_find_device() take a const *
driver core: make driver_[create|remove]_file take a const *
firmware_loader: fix soundness issue in `request_internal`
firmware_loader: annotate doctests as `no_run`
devres: Correct code style for functions that return a pointer type
devres: Initialize an uninitialized struct member
devres: Fix memory leakage caused by driver API devm_free_percpu()
devres: Fix devm_krealloc() wasting memory
driver core: platform: Switch to use kmemdup_array()
driver core: have match() callback in struct bus_type take a const *
MAINTAINERS: add Rust device abstractions to DRIVER CORE
device: rust: improve safety comments
MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
firmware: rust: improve safety comments
...
Merge tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
- make watchdog_class const
- rework of the rzg2l_wdt driver
- other small fixes and improvements
* tag 'linux-watchdog-6.11-rc1' of git://www.linux-watchdog.org/linux-watchdog:
dt-bindings: watchdog: dlg,da9062-watchdog: Drop blank space
watchdog: rzn1: Convert comma to semicolon
watchdog: lenovo_se10_wdt: Convert comma to semicolon
dt-bindings: watchdog: renesas,wdt: Document RZ/G3S support
watchdog: rzg2l_wdt: Add suspend/resume support
watchdog: rzg2l_wdt: Rely on the reset driver for doing proper reset
watchdog: rzg2l_wdt: Remove comparison with zero
watchdog: rzg2l_wdt: Remove reset de-assert from probe
watchdog: rzg2l_wdt: Check return status of pm_runtime_put()
watchdog: rzg2l_wdt: Use pm_runtime_resume_and_get()
watchdog: rzg2l_wdt: Make the driver depend on PM
watchdog: rzg2l_wdt: Restrict the driver to ARCH_RZG2L and ARCH_R9A09G011
watchdog: imx7ulp_wdt: keep already running watchdog enabled
watchdog: starfive: Add missing clk_disable_unprepare()
watchdog: Make watchdog_class const
Merge tag 'asoc-fix-v6.11-merge-window' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.11
A selection of routine fixes and quirks that came in since the merge
window. The fsl-asoc-card change is a fix for systems with multiple
cards where updating templates in place leaks data from one card to
another.
The cited commit missed to check against the validity of the frame length
in the tun_xdp_one() path, which could cause a corrupted skb to be sent
downstack. Even before the skb is transmitted, the
tun_xdp_one-->eth_type_trans() may access the Ethernet header although it
can be less than ETH_HLEN. Once transmitted, this could either cause
out-of-bound access beyond the actual length, or confuse the underlayer
with incorrect or inconsistent header length in the skb metadata.
In the alternative path, tun_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted for
IFF_TAP.
This is to drop any frame shorter than the Ethernet header size just like
how tun_get_user() does.
Si-Wei Liu [Wed, 24 Jul 2024 17:04:51 +0000 (10:04 -0700)]
tap: add missing verification for short frame
The cited commit missed to check against the validity of the frame length
in the tap_get_user_xdp() path, which could cause a corrupted skb to be
sent downstack. Even before the skb is transmitted, the
tap_get_user_xdp()-->skb_set_network_header() may assume the size is more
than ETH_HLEN. Once transmitted, this could either cause out-of-bound
access beyond the actual length, or confuse the underlayer with incorrect
or inconsistent header length in the skb metadata.
In the alternative path, tap_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted.
This is to drop any frame shorter than the Ethernet header size just like
how tap_get_user() does.
The NIC requires each TSO segment to not span more than 10
descriptors. NIC further requires each descriptor to not exceed
16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO).
The descriptors for an skb are generated by
gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format.
gve_tx_add_skb_no_copy_dqo() loops through each skb frag and
generates a descriptor for the entire frag if the frag size is
not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is
greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s)
of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for
the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO).
gve_can_send_tso() checks if the descriptors thus generated for an
skb would meet the requirement that each TSO-segment not span more
than 10 descriptors. However, the current code misses an edge case
when a TSO segment spans multiple descriptors within a large frag.
This change fixes the edge case.
gve_can_send_tso() relies on the assumption that max gso size (9728)
is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb
fragment a TSO segment can never span more than 2 descriptors.
bnxt_en: update xdp_rxq_info in queue restart logic
When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver
updates(creates and deletes) a page_pool.
But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still
connected to an old page_pool.
So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but
bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool.
An old page_pool is no longer used so it is supposed to be
deleted by page_pool_destroy() but it isn't.
Because the xdp_rxq_info is holding the reference count for it and the
xdp_rxq_info is not updated, an old page_pool will not be deleted in
the queue restart logic.
Before restarting queues, an interface has 4 page_pools.
After restarting one queue, an interface has 5 page_pools, but it
should be 4, not 5.
The reason is that queue restarting logic creates a new page_pool and
an old page_pool is not deleted due to the absence of an update of
xdp_rxq_info logic.
Jakub Kicinski [Thu, 25 Jul 2024 14:40:24 +0000 (07:40 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-07-25
We've added 14 non-merge commits during the last 8 day(s) which contain
a total of 19 files changed, 177 insertions(+), 70 deletions(-).
The main changes are:
1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and
BPF sockhash. Also add test coverage for this case, from Michal Luczaj.
2) Fix a segmentation issue when downgrading gso_size in the BPF helper
bpf_skb_adjust_room(), from Fred Li.
3) Fix a compiler warning in resolve_btfids due to a missing type cast,
from Liwei Song.
4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte
boundary in the fexit_sleep BPF selftest, from Puranjay Mohan.
5) Fix a xsk regression to require a flag when actuating tx_metadata_len,
from Stanislav Fomichev.
6) Fix function prototype BTF dumping in libbpf for prototypes that have
no input arguments, from Andrii Nakryiko.
7) Fix stacktrace symbol resolution in perf script for BPF programs
containing subprograms, from Hou Tao.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
bpf: Fix a segment issue when downgrading gso_size
tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids
bpf, events: Use prog to emit ksymbol event for main program
selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB
selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags
selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected()
af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
bpftool: Fix typo in usage help
libbpf: Fix no-args func prototype BTF dumping syntax
MAINTAINERS: Update powerpc BPF JIT maintainers
MAINTAINERS: Update email address of Naveen
selftests/bpf: fexit_sleep: Fix stack allocation for arm64
====================
Shengjiu Wang [Thu, 25 Jul 2024 03:22:53 +0000 (11:22 +0800)]
ASoC: fsl-asoc-card: Dynamically allocate memory for snd_soc_dai_link_components
The static snd_soc_dai_link_components cause conflict for multiple
instances of this generic driver. For example, when there is
wm8962 and SPDIF case enabled together, the contaminated
snd_soc_dai_link_components will cause another device probe fail.
Will Deacon [Thu, 25 Jul 2024 09:03:45 +0000 (10:03 +0100)]
arm64: mm: Fix lockless walks with static and dynamic page-table folding
Lina reports random oopsen originating from the fast GUP code when
16K pages are used with 4-level page-tables, the fourth level being
folded at runtime due to lack of LPA2.
In this configuration, the generic implementation of
p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
This is normally fine, but when the fourth level of page-table is folded
at runtime, pud_offset_lockless() will offset from the address of the
'p4d_t' to calculate the address of the PUD in the same page-table page.
This results in a stray stack read when the 'p4d_t' has been allocated
on the stack and can send the walker into the weeds.
Fix the problem by providing our own definition of p4d_offset_lockless()
when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
pointer rather than the address of the local stack variable.
The current usage of memblock_reserve() in init_pvh_bootparams() is done before
the .bss is zeroed, and that used to be fine when
memblock_reserved_init_regions implicitly ended up in the .meminit.data
section. However after commit 73db3abdca58c memblock_reserved_init_regions
ends up in the .bss section, thus breaking it's usage before the .bss is
cleared.
Move and rename the call to xen_reserve_extra_memory() so it's done in the
x86_init.oem.arch_setup hook, which gets executed after the .bss has been
zeroed, but before calling e820__memory_setup().
tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
The 'Fixes' commit recently changed the behaviour of TCP by skipping the
processing of the 3rd ACK when a sk->sk_socket is set. The goal was to
skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an
unnecessary ACK in case of simultaneous connect(). Unfortunately, that
had an impact on TFO and MPTCP.
I started to look at the impact on MPTCP, because the MPTCP CI found
some issues with the MPTCP Packetdrill tests [1]. Then Paolo Abeni
suggested me to look at the impact on TFO with "plain" TCP.
For MPTCP, when receiving the 3rd ACK of a request adding a new path
(MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that
has been created when the MPTCP connection got established before with
the first path. The newly added 'goto' will then skip the processing of
the segment text (step 7) and not go through tcp_data_queue() where the
MPTCP options are validated, and some actions are triggered, e.g.
sending the MPJ 4th ACK [2] as demonstrated by the new errors when
running a packetdrill test [3] establishing a second subflow.
This doesn't fully break MPTCP, mainly the 4th MPJ ACK that will be
delayed. Still, we don't want to have this behaviour as it delays the
switch to the fully established mode, and invalid MPTCP options in this
3rd ACK will not be caught any more. This modification also affects the
MPTCP + TFO feature as well, and being the reason why the selftests
started to be unstable the last few days [4].
For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer
passing: if the 3rd ACK contains data, and the connection is accept()ed
before receiving them, these data would no longer be processed, and thus
not ACKed.
One last thing about MPTCP, in case of simultaneous connect(), a
fallback to TCP will be done, which seems fine:
`../common/defaults.sh`
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
Simultaneous SYN-data crossing is also not supported by TFO, see [6].
Kuniyuki Iwashima suggested to restrict the processing to SYN+ACK only:
that's a more generic solution than the one initially proposed, and
also enough to fix the issues described above.
Later on, Eric Dumazet mentioned that an ACK should still be sent in
reaction to the second SYN+ACK that is received: not sending a DUPACK
here seems wrong and could hurt:
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460, sackOK, TS val 1000 ecr 0,nop,wscale 8>
+0 < S 0:0(0) win 1000 <mss 1000, sackOK, nop, nop>
+0 > S. 0:0(0) ack 1 <mss 1460, sackOK, TS val 3308134035 ecr 0,nop,wscale 8>
+0 < S. 0:0(0) ack 1 win 1000 <mss 1000, sackOK, nop, nop>
+0 > . 1:1(0) ack 1 <nop, nop, sack 0:1> // <== Here
So in this version, the 'goto consume' is dropped, to always send an ACK
when switching from TCP_SYN_RECV to TCP_ESTABLISHED. This ACK will be
seen as a DUPACK -- with DSACK if SACK has been negotiated -- in case of
simultaneous SYN crossing: that's what is expected here.
rbd: don't assume rbd_is_lock_owner() for exclusive mappings
Expanding on the previous commit, assuming that rbd_is_lock_owner()
always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
In case ceph_cls_set_cookie() fails, the lock would be temporarily
released even if the mapping is exclusive, meaning that we can end up
even in RBD_LOCK_STATE_UNLOCKED.
IOW, exclusive mappings are really "just" about disabling automatic
lock transitions (as documented in the man page), not about grabbing
the lock and holding on to it whatever it takes.
rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings
Every time a watch is reestablished after getting lost, we need to
update the cookie which involves quiescing exclusive lock. For this,
we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
roughly for the duration of rbd_reacquire_lock() call. If the mapping
is exclusive and I/O happens to arrive in this time window, it's failed
with EROFS (later translated to EIO) based on the wrong assumption in
rbd_img_exclusive_lock() -- "lock got released?" check there stopped
making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
reacquire").
To make it worse, any such I/O is added to the acquiring list before
EROFS is returned and this sets up for violating rbd_lock_del_request()
precondition that the request is either on the running list or not on
any list at all -- see commit ded080c86b3f ("rbd: don't move requests
to the running list on errors"). rbd_lock_del_request() ends up
processing these requests as if they were on the running list which
screws up quiescing_wait completion counter and ultimately leads to
rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
... to RBD_LOCK_STATE_QUIESCING and quiescing_wait to recognize that
this state and the associated completion are backing rbd_quiesce_lock(),
which isn't specific to releasing the lock.
While exclusive lock does get quiesced before it's released, it also
gets quiesced before an attempt to update the cookie is made and there
the lock is not released as long as ceph_cls_set_cookie() succeeds.
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len")
can break existing use cases which don't zero-initialize xdp_umem_reg
padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we
interpret the padding as tx_metadata_len only when being explicitly
asked.
Move the freeing of the dummy net_device from mtk_free_dev() to
mtk_remove().
Previously, if alloc_netdev_dummy() failed in mtk_probe(),
eth->dummy_dev would be NULL. The error path would then call
mtk_free_dev(), which in turn called free_netdev() assuming dummy_dev
was allocated (but it was not), potentially causing a NULL pointer
dereference.
By moving free_netdev() to mtk_remove(), we ensure it's only called when
mtk_probe() has succeeded and dummy_dev is fully allocated. This
addresses a potential NULL pointer dereference detected by Smatch[1].
Paolo Abeni [Thu, 25 Jul 2024 09:17:21 +0000 (11:17 +0200)]
Merge tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains a Netfilter fix for net:
Patch #1 if FPU is busy, then pipapo set backend falls back to standard
set element lookup. Moreover, disable bh while at this.
From Florian Westphal.
netfilter pull request 24-07-24
* tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_pipapo_avx2: disable softinterrupts
====================
Nick Weihs [Thu, 25 Jul 2024 05:47:22 +0000 (22:47 -0700)]
ALSA: hda/realtek: Implement sound init sequence for Samsung Galaxy Book3 Pro 360
Samsung Galaxy Book3 Pro 360 sends a large amount of data to the codec
through hda processing coefficients. This data was captured using a
modified version of QEMU, but the actual content of the data remains
opaque to me. Elliding any part of the data seems to cause sound to
not work.
Paolo Abeni [Thu, 25 Jul 2024 08:10:05 +0000 (10:10 +0200)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
This series contains updates to ice driver only.
Ahmed enforces the iavf per VF filter limit on ice (PF) driver to prevent
possible resource exhaustion.
Wojciech corrects assignment of l2 flags read from firmware.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix recipe read procedure
ice: Add a per-VF limit on number of FDIR filters
====================
drm/amdgpu: reset vm state machine after gpu reset(vram lost)
[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.
[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.
v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.
The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.
drm/amd/display: Remove ASSERT if significance is zero in math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Alex Deucher [Tue, 9 Jul 2024 21:54:11 +0000 (17:54 -0400)]
drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To workaround this, we can update the wptr via MMIO, however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().
Enable this workaround while we are root causing the issue with
the HW team.
Merge tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"New Support
- Samsung Exynos gs101 drd combo phy
- Qualcomm SC8180x USB uniphy, IPQ9574 QMP PCIe phy
- Airoha EN7581 PCIe phy
- Freescale i.MX8Q HSIO SerDes phy
- Starfive jh7110 dphy tx
Updates:
- Resume support for j721e-wiz driver
- Updates to Exynos usbdrd driver
- Support for optional power domains in g12a usb2-phy driver
- Debugfs support and updates to zynqmp driver"
* tag 'phy-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (56 commits)
phy: airoha: Add dtime and Rx AEQ IO registers
dt-bindings: phy: airoha: Add dtime and Rx AEQ IO registers
dt-bindings: phy: rockchip-emmc-phy: Convert to dtschema
dt-bindings: phy: qcom,qmp-usb: fix spelling error
phy: exynos5-usbdrd: support Exynos USBDRD 3.1 combo phy (HS & SS)
phy: exynos5-usbdrd: convert Vbus supplies to regulator_bulk
phy: exynos5-usbdrd: convert (phy) register access clock to clk_bulk
phy: exynos5-usbdrd: convert core clocks to clk_bulk
phy: exynos5-usbdrd: support isolating HS and SS ports independently
dt-bindings: phy: samsung,usb3-drd-phy: add gs101 compatible
phy: core: Fix documentation of of_phy_get
phy: starfive: Correct the dphy configure process
phy: zynqmp: Add debugfs support
phy: zynqmp: Take the phy mutex in xlate
phy: zynqmp: Only wait for PLL lock "primary" instances
phy: zynqmp: Store instance instead of type
phy: zynqmp: Enable reference clock correctly
phy: cadence-torrent: Check return value on register read
phy: Fix the cacography in phy-exynos5250-usb2.c
phy: phy-rockchip-samsung-hdptx: Select CONFIG_MFD_SYSCON
...
Merge tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
- Simplification across subsystem using cleanup.h
- Support for debugfs to read/write commands
- Few Intel and Qualcomm driver updates
* tag 'soundwire-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: debugfs: simplify with cleanup.h
soundwire: cadence: simplify with cleanup.h
soundwire: intel_ace2x: simplify with cleanup.h
soundwire: intel_ace2x: simplify return path in hw_params
soundwire: intel: simplify with cleanup.h
soundwire: intel: simplify return path in hw_params
soundwire: amd_init: simplify with cleanup.h
soundwire: amd: simplify with cleanup.h
soundwire: amd: simplify return path in hw_params
soundwire: intel_auxdevice: start the bus at default frequency
soundwire: intel_auxdevice: add cs42l43 codec to wake_capable_list
drivers:soundwire: qcom: cleanup port maask calculations
soundwire: bus: simplify by using local slave->prop
soundwire: generic_bandwidth_allocation: change port_bo parameter to pointer
soundwire: Intel: clarify Copyright information
soundwire: intel_ace2.x: add AC timing extensions for PantherLake
soundwire: bus: add stream refcount
soundwire: debugfs: add interface to read/write commands
Joel Granados [Wed, 24 Jul 2024 18:59:29 +0000 (20:59 +0200)]
sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.
This patch has been generated by the following coccinelle script:
* Code formatting was adjusted in xfs_sysctl.c to comply with code
conventions. The xfs_stats_clear_proc_handler,
xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
adjusted.
* The ctl_table argument in proc_watchdog_common was const qualified.
This is called from a proc_handler itself and is calling back into
another proc_handler, making it necessary to change it as part of the
proc_handler migration.
Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.
First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
lets the kernel zero out pages anytime under memory pressure, which
enables allocating memory that never gets swapped to disk but also
doesn't count as being mlocked.
Then, the vDSO implementation of getrandom() is introduced in a
generic manner and hooked into random.c.
Next, this is implemented on x86. (Also, though it's not ready for
this pull, somebody has begun an arm64 implementation already)
Finally, two vDSO selftests are added.
There are also two housekeeping cleanup commits"
* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
MAINTAINERS: add random.h headers to RNG subsection
random: note that RNDGETPOOL was removed in 2.6.9-rc2
selftests/vDSO: add tests for vgetrandom
x86: vdso: Wire up getrandom() vDSO implementation
random: introduce generic vDSO getrandom() implementation
mm: add MAP_DROPPABLE for designating always lazily freeable mappings
Merge tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"VFS:
- The new 64bit mount ids start after the old mount id, i.e., at the
first non-32 bit value. However, we started counting one id too
late and thus lost 4294967296 as the first valid id. Fix that.
- Update a few comments on some vfs_*() creation helpers.
- Move copying of the xattr name out from the locks required to start
a filesystem write.
- Extend the filelock lock UAF fix to the compat code as well.
- Now that we added the ability to look up an inode under RCU it's
possible that lockless hash lookup can find and lock an inode after
it gets I_FREEING set. It then waits until inode teardown in
evict() is finished.
The flag however is still set after evict() has woken up all
waiters. If the inode lock is taken late enough on the waiting side
after hash removal and wakeup happened the waiting thread will
never be woken.
Before RCU based lookup this was synchronized via the
inode_hash_lock. But since unhashing requires the inode lock as
well we can check whether the inode is unhashed while holding inode
lock even without holding inode_hash_lock.
pidfd:
- The nsproxy structure contains nearly all of the namespaces
associated with a task. When a namespace type isn't supported
nsproxy might contain a NULL pointer or always point to the initial
namespace type. The logic isn't consistent. So when deriving
namespace fds we need to ensure that the namespace type is
supported.
First, so that we don't risk dereferncing NULL pointers. The
correct bigger fix would be to change all namespaces to always set
a valid namespace pointer in struct nsproxy independent of whether
or not it is compiled in. But that requires quite a few changes.
Second, so that we don't allow deriving namespace fds when the
namespace type doesn't exist and thus when they couldn't also be
derived via /proc/self/ns/.
- Add missing selftests for the new pidfd ioctls to derive namespace
fds. This simply extends the already existing testsuite.
netfs:
- Fix debug logging and fix kconfig variable name so it actually
works.
- Fix writeback that goes both to the server and cache. The streams
are only activated once a subreq is added. When a server write
happens the subreq doesn't need to have finished by the time the
cache write is started. If the server write has already finished by
the time the cache write is about to start the cache write will
operate on a folio that might already have been reused. Fix this by
preactivating the cache write.
- Limit cachefiles subreq size for cache writes to MAX_RW_COUNT"
* tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
inode: clarify what's locked
vfs: Fix potential circular locking through setxattr() and removexattr()
filelock: Fix fcntl/close race recovery compat path
fs: use all available ids
cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
netfs: Fix writeback that needs to go to both server and cache
pidfs: add selftests for new namespace ioctls
pidfs: handle kernels without namespaces cleanly
pidfs: when time ns disabled add check for ioctl
vfs: correct the comments of vfs_*() helpers
vfs: handle __wait_on_freeing_inode() and evict() race
netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG
netfs: Revert "netfs: Switch debug logging to pr_debug()"
Commit e3ec0fe944d2 ("hostfs: Convert hostfs_read_folio() to use a
folio") simplified hostfs_read_folio(), but in the process of converting
to using folios natively also mis-used the folio_zero_tail() function
due to the very confusing API of that function.
Very arguably it's folio_zero_tail() API itself that is buggy, since it
would make more sense (and the documentation kind of implies) that the
third argument would be the pointer to the beginning of the folio
buffer.
But no, the third argument to folio_zero_tail() is where we should start
zeroing the tail (even if we already also pass in the offset separately
as the second argument).
So fix the hostfs caller, and we can leave any folio_zero_tail() sanity
cleanup for later.
Yunhui Cui [Mon, 17 Jun 2024 13:14:25 +0000 (21:14 +0800)]
RISC-V: Select ACPI PPTT drivers
After adding ACPI support to populate_cache_leaves(), RISC-V can build
cacheinfo through the ACPI PPTT table, thus enabling the ACPI_PPTT
configuration.
Yunhui Cui [Mon, 17 Jun 2024 13:14:24 +0000 (21:14 +0800)]
riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
Before cacheinfo can be built correctly, we need to initialize level
and type. Since RISC-V currently does not have a register group that
describes cache-related attributes like ARM64, we cannot obtain them
directly, so now we obtain cache leaves from the ACPI PPTT table
(acpi_get_cache_info()) and set the cache type through split_levels.
Yunhui Cui [Mon, 17 Jun 2024 13:14:23 +0000 (21:14 +0800)]
riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
ci_leaf_init() is a declared static function. The implementation of the
function body and the caller do not use the parameter (struct device_node
*node) input parameter, so remove it.
Sia Jee Heng [Thu, 2 May 2024 07:37:51 +0000 (00:37 -0700)]
RISC-V: ACPI: Enable SPCR table for console output on RISC-V
The ACPI SPCR code has been used to enable console output for ARM64 and
X86. The same code can be reused for RISC-V. Furthermore, SPCR table is
mandated for headless system as outlined in the RISC-V BRS
Specification, chapter 6.
Petr Machata [Tue, 23 Jul 2024 16:04:16 +0000 (18:04 +0200)]
net: nexthop: Initialize all fields in dumped nexthops
struct nexthop_grp contains two reserved fields that are not initialized by
nla_put_nh_group(), and carry garbage. This can be observed e.g. with
strace (edited for clarity):
# ip nexthop add id 1 dev lo
# ip nexthop add id 101 group 1
# strace -e recvmsg ip nexthop get id 101
...
recvmsg(... [{nla_len=12, nla_type=NHA_GROUP},
[{id=1, weight=0, resvd1=0x69, resvd2=0x67}]] ...) = 52
The fields are reserved and therefore not currently used. But as they are, they
leak kernel memory, and the fact they are not just zero complicates repurposing
of the fields for new ends. Initialize the full structure.
Simon Horman [Tue, 23 Jul 2024 13:29:27 +0000 (14:29 +0100)]
net: stmmac: Correct byte order of perfect_match
The perfect_match parameter of the update_vlan_hash operation is __le16,
and is correctly converted from host byte-order in the lone caller,
stmmac_vlan_update().
However, the implementations of this caller, dwxgmac2_update_vlan_hash()
and dwxgmac2_update_vlan_hash(), both treat this parameter as host byte
order, using the following pattern:
u32 value = ...
...
writel(value | perfect_match, ...);
This is not correct because both:
1) value is host byte order; and
2) writel expects a host byte order value as it's first argument
I believe that this will break on big endian systems. And I expect it
has gone unnoticed by only being exercised on little endian systems.
The approach taken by this patch is to update the callback, and it's
caller to simply use a host byte order value.
Flagged by Sparse.
Compile tested only.
Fixes: c7ab0b8088d7 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available") Signed-off-by: Simon Horman <[email protected]> Reviewed-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Jinjie Ruan [Thu, 13 Jun 2024 11:13:47 +0000 (19:13 +0800)]
trace: riscv: Remove deprecated kprobe on ftrace support
Since commit 7caa9765465f60 ("ftrace: riscv: move from REGS to ARGS"),
kprobe on ftrace is not supported by riscv, because riscv's support for
FTRACE_WITH_REGS has been replaced with support for FTRACE_WITH_ARGS, and
KPROBES_ON_FTRACE will be supplanted by FPROBES. So remove the deprecated
kprobe on ftrace support, which is misunderstood.
Hangbin Liu [Tue, 23 Jul 2024 08:22:52 +0000 (16:22 +0800)]
selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
If the testing kernel doesn't support setting fdb_max_learned or show
fdb_n_learned, just skip it. Or we will get errors like
./bridge_fdb_learning_limit.sh: line 218: [: null: integer expression expected
./bridge_fdb_learning_limit.sh: line 225: [: null: integer expression expected
Fixes: 6f84090333bb ("selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest") Signed-off-by: Hangbin Liu <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Reviewed-by: Johannes Nixdorf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Shigeru Yoshida [Tue, 16 Jul 2024 02:09:05 +0000 (11:09 +0900)]
tipc: Return non-zero value from tipc_udp_addr2str() on error
tipc_udp_addr2str() should return non-zero value if the UDP media
address is invalid. Otherwise, a buffer overflow access can occur in
tipc_media_addr_printf(). Fix this by returning 1 on an invalid UDP
media address.
Peter Ujfalusi [Wed, 24 Jul 2024 08:19:32 +0000 (11:19 +0300)]
ASoC: SOF: ipc4-topology: Preserve the DMA Link ID for ChainDMA on unprepare
The DMA Link ID is set to the IPC message's primary during dai_config,
which is only during hw_params.
During xrun handling the hw_params is not called and the DMA Link ID
information will be lost.
All other fields in the message expected to be 0 for re-configuration, only
the DMA Link ID needs to be preserved and the in case of repeated
dai_config, it is correctly updated (masked and then set).
Peter Ujfalusi [Wed, 24 Jul 2024 08:19:31 +0000 (11:19 +0300)]
ASoC: SOF: ipc4-topology: Only handle dai_config with HW_PARAMS for ChainDMA
The DMA Link ID is only valid in snd_sof_dai_config_data when the
dai_config is called with HW_PARAMS.
The commit that this patch fixes is actually moved a code section without
changing it, the same bug exists in the original code, needing different
patch to kernel prior to 6.9 kernels.
In __wait_on_freeing_inode() we warn in case the inode_hash_lock is held
but the inode is unhashed. We then release the inode_lock. So using
"locked" as parameter name is confusing. Use is_inode_hash_locked as
parameter name instead.
David Howells [Tue, 23 Jul 2024 08:59:54 +0000 (09:59 +0100)]
vfs: Fix potential circular locking through setxattr() and removexattr()
When using cachefiles, lockdep may emit something similar to the circular
locking dependency notice below. The problem appears to stem from the
following:
(1) Cachefiles manipulates xattrs on the files in its cache when called
from ->writepages().
(2) The setxattr() and removexattr() system call handlers get the name
(and value) from userspace after taking the sb_writers lock, putting
accesses of the vma->vm_lock and mm->mmap_lock inside of that.
(3) The afs filesystem uses a per-inode lock to prevent multiple
revalidation RPCs and in writeback vs truncate to prevent parallel
operations from deadlocking against the server on one side and local
page locks on the other.
Fix this by moving the getting of the name and value in {get,remove}xattr()
outside of the sb_writers lock. This also has the minor benefits that we
don't need to reget these in the event of a retry and we never try to take
the sb_writers lock in the event we can't pull the name and value into the
kernel.
Alternative approaches that might fix this include moving the dispatch of a
write to the cache off to a workqueue or trying to do without the
validation lock in afs. Note that this might also affect other filesystems
that use netfslib and/or cachefiles.
======================================================
WARNING: possible circular locking dependency detected
6.10.0-build2+ #956 Not tainted
------------------------------------------------------
fsstress/6050 is trying to acquire lock: ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0
but task is already holding lock: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is: