Kent Overstreet [Fri, 3 Nov 2023 01:01:25 +0000 (21:01 -0400)]
bcachefs: Break up bch2_journal_write()
Split up bch2_journal_write() to simplify locking:
- bch2_journal_write_pick_flush(), which needs j->lock
- bch2_journal_write_prep, which operates on the journal buffer to be
written and will need the upcoming buf_lock for synchronization with
the btree write buffer flush path
riscv: Improve PTDUMP to show RSW with non-zero value
RSW field can be used to encode 2 bits of software
defined information. Currently, PTDUMP only prints
"RSW" when its value is 1 or 3.
To fix this issue and improve the debugging experience
with PTDUMP, we redefine _PAGE_SPECIAL to its original
value and use _PAGE_SOFT as the RSW mask, allow it to
print the RSW with any non-zero value.
This patch also removes the val from the struct prot_bits
as it is no longer needed.
The CMO op macros initially used lower case, as the original iteration
of the ALT_CMO_OP alternative stringified the first parameter to
finalise the assembly for the standard variant.
As a knock-on, the T-Head versions of these CMOs had to use mixed case
defines. Commit dd23e9535889 ("RISC-V: replace cbom instructions with
an insn-def") removed the asm construction with stringify, replacing it
an insn-def macro, rending the lower-case surplus to requirements.
As far as I can tell from a brief check, CBO_zero does not see similar
use and didn't require the mixed case define in the first place.
Replace the lower case characters now for consistency with other
insn-def macros in the standard and T-Head forms, and adjust the
callsites.
riscv: don't probe unaligned access speed if already done
If misaligned_access_speed percpu var isn't so called "HWPROBE
MISALIGNED UNKNOWN", it means the probe has happened(this is possible
for example, hotplug off then hotplug on one cpu), and the percpu var
has been set, don't probe again in this case.
If these config not set, mmc can't run for jh7110, rootfs can't
be found when using SD card. So set CONFIG_MMC_DW=y like arm64
defconfig, and set CONFIG_MMC_DW_STARFIVE=y for starfive. Then
starfive vf2 board can start SD card rootfs with mainline defconfig
and dtb.
Haorong Lu [Thu, 3 Aug 2023 22:44:54 +0000 (15:44 -0700)]
riscv: signal: handle syscall restart before get_signal
In the current riscv implementation, blocking syscalls like read() may
not correctly restart after being interrupted by ptrace. This problem
arises when the syscall restart process in arch_do_signal_or_restart()
is bypassed due to changes to the regs->cause register, such as an
ebreak instruction.
Steps to reproduce:
1. Interrupt the tracee process with PTRACE_SEIZE & PTRACE_INTERRUPT.
2. Backup original registers and instruction at new_pc.
3. Change pc to new_pc, and inject an instruction (like ebreak) to this
address.
4. Resume with PTRACE_CONT and wait for the process to stop again after
executing ebreak.
5. Restore original registers and instructions, and detach from the
tracee process.
6. Now the read() syscall in tracee will return -1 with errno set to
ERESTARTSYS.
Specifically, during an interrupt, the regs->cause changes from
EXC_SYSCALL to EXC_BREAKPOINT due to the injected ebreak, which is
inaccessible via ptrace so we cannot restore it. This alteration breaks
the syscall restart condition and ends the read() syscall with an
ERESTARTSYS error. According to include/linux/errno.h, it should never
be seen by user programs. X86 can avoid this issue as it checks the
syscall condition using a register (orig_ax) exposed to user space.
Arm64 handles syscall restart before calling get_signal, where it could
be paused and inspected by ptrace/debugger.
This patch adjusts the riscv implementation to arm64 style, which also
checks syscall using a kernel register (syscallno). It ensures the
syscall restart process is not bypassed when changes to the cause
register occur, providing more consistent behavior across various
architectures.
For a simplified reproduction program, feel free to visit:
https://github.com/ancientmodern/riscv-ptrace-bug-demo.
Since commit 61cadb9 ("Provide new description of misaligned load/store
behavior compatible with privileged architecture.") in the RISC-V ISA
manual, it is stated that misaligned load/store might not be supported.
However, the RISC-V kernel uABI describes that misaligned accesses are
supported. In order to support that, this series adds support for S-mode
handling of misaligned accesses as well support for prctl(PR_UNALIGN).
Handling misaligned access in kernel allows for a finer grain control
of the misaligned accesses behavior, and thanks to the prctl() call,
can allow disabling misaligned access emulation to generate SIGBUS. User
space can then optimize its software by removing such access based on
SIGBUS generation.
This series is useful when using a SBI implementation that does not
handle misaligned traps as well as detecting misaligned accesses
generated by userspace application using the prctrl(PR_SET_UNALIGN)
feature.
This series can be tested using the spike simulator[1] and a modified
openSBI version[2] which allows to always delegate misaligned load/store to
S-mode. A test[3] that exercise various instructions/registers can be
executed to verify the unaligned access support.
* b4-shazam-merge:
riscv: add support for PR_SET_UNALIGN and PR_GET_UNALIGN
riscv: report misaligned accesses emulation to hwprobe
riscv: annotate check_unaligned_access_boot_cpu() with __init
riscv: add support for sysctl unaligned_enabled control
riscv: add floating point insn support to misaligned access emulation
riscv: report perf event for misaligned fault
riscv: add support for misaligned trap handling in S-mode
riscv: remove unused functions in traps_misaligned.c
firewire: Annotate struct fw_node with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct fw_node.
Linus Torvalds [Sun, 5 Nov 2023 02:25:36 +0000 (16:25 -1000)]
Merge tag 'i3c/for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c updates from Alexandre Belloni:
"There are now more fixes because as stated in my previous pull
request, people now have access to actual hardware.
Core:
- handle IBI in the proper order
Drivers:
- cdns: fix status register access
- mipi-i3c-hci: many fixes now that the driver has been actually
tested
- svc: many IBI fixes, correct compatible string, fix hot join corner
cases"
* tag 'i3c/for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (29 commits)
i3c: master: handle IBIs in order they came
i3c: master: mipi-i3c-hci: Fix a kernel panic for accessing DAT_data.
i3c: master: svc: fix compatibility string mismatch with binding doc
i3c: master: svc: fix random hot join failure since timeout error
i3c: master: svc: fix SDA keep low when polling IBIWON timeout happen
i3c: master: svc: fix check wrong status register in irq handler
i3c: master: svc: fix ibi may not return mandatory data byte
i3c: master: svc: fix wrong data return when IBI happen during start frame
i3c: master: svc: fix race condition in ibi work thread
i3c: Fix typo "Provisional ID" to "Provisioned ID"
i3c: Fix potential refcount leak in i3c_master_register_new_i3c_devs
i3c: mipi-i3c-hci: Resume controller after aborted transfer
i3c: mipi-i3c-hci: Resume controller explicitly
i3c: mipi-i3c-hci: Fix missing xfer->completion in hci_cmd_v1_daa()
i3c: mipi-i3c-hci: Do not unmap region not mapped for transfer
i3c: mipi-i3c-hci: Set number of SW enabled Ring Bundles earlier
i3c: mipi-i3c-hci: Fix race between bus cleanup and interrupt
i3c: mipi-i3c-hci: Set ring start request together with enable
i3c: mipi-i3c-hci: Remove BUG() when Ring Abort request times out
i3c: mipi-i3c-hci: Fix out of bounds access in hci_dma_irq_handler
...
Linus Torvalds [Sun, 5 Nov 2023 02:20:36 +0000 (16:20 -1000)]
Merge tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL (Compute Express Link) updates from Dan Williams:
"The main new functionality this time is work to allow Linux to
natively handle CXL link protocol errors signalled via PCIe AER for
current generation CXL platforms. This required some enlightenment of
the PCIe AER core to workaround the fact that current generation RCH
(Restricted CXL Host) platforms physically hide topology details and
registers via a mechanism called RCRB (Root Complex Register Block).
The next major highlight is reworks to address bugs in parsing region
configurations for next generation VH (Virtual Host) topologies. The
old broken algorithm is replaced with a simpler one that significantly
increases the number of region configurations supported by Linux. This
is again relevant for error handling so that forward and reverse
address translation of memory errors can be carried out by Linux for
memory regions instantiated by platform firmware.
As for other cross-tree work, the ACPI table parsing code has been
refactored for reuse parsing the "CDAT" structure which is an
ACPI-like data structure that is reported by CXL devices. That work is
in preparation for v6.8 support for CXL QoS. Think of this as dynamic
generation of NUMA node topology information generated by Linux rather
than platform firmware.
Lastly, a number of internal object lifetime issues have been resolved
along with misc. fixes and feature updates (decoders_committed sysfs
ABI).
Summary:
- Add support for RCH (Restricted CXL Host) Error recovery
- Fix several region assembly bugs
- Fix mem-device lifetime issues relative to the sanitize command and
RCH topology.
- Refactor ACPI table parsing for CDAT parsing re-use in preparation
for CXL QOS support"
* tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (50 commits)
lib/fw_table: Remove acpi_parse_entries_array() export
cxl/pci: Change CXL AER support check to use native AER
cxl/hdm: Remove broken error path
cxl/hdm: Fix && vs || bug
acpi: Move common tables helper functions to common lib
cxl: Add support for reading CXL switch CDAT table
cxl: Add checksum verification to CDAT from CXL
cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
cxl: Add decoders_committed sysfs attribute to cxl_port
cxl: Add cxl_decoders_committed() helper
cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm
cxl/core/regs: Rename phys_addr in cxl_map_component_regs()
PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling
PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler
cxl/pci: Disable root port interrupts in RCH mode
cxl/pci: Add RCH downstream port error logging
cxl/pci: Map RCH downstream AER registers for logging protocol errors
cxl/pci: Update CXL error logging to use RAS register address
PCI/AER: Refactor cper_print_aer() for use by CXL driver module
cxl/pci: Add RCH downstream port AER register discovery
...
Kent Overstreet [Fri, 3 Nov 2023 22:30:08 +0000 (18:30 -0400)]
bcachefs: rebalance_work btree is not a snapshots btree
rebalance_work entries may refer to entries in the extents btree, which
is a snapshots btree, or they may also refer to entries in the reflink
btree, which is not.
Hence rebalance_work keys may use the snapshot field but it's not
required to be nonzero - add a new btree flag to reflect this.
Kent Overstreet [Fri, 3 Nov 2023 15:55:44 +0000 (11:55 -0400)]
bcachefs: Fix recovery when forced to use JSET_NO_FLUSH journal entry
When we didn't find anything in the journal that we'd like to use, and
we're forced to use whatever we can find - that entry will have been a
JSET_NO_FLUSH entry with a garbage last_seq value, since it's not
normally used.
Initialize it to something sane, for bch2_fs_journal_start().
Kent Overstreet [Thu, 2 Nov 2023 19:28:15 +0000 (15:28 -0400)]
bcachefs: Fix bch2_delete_dead_inodes()
- the fsck_err() check for the filesystem being clean was incorrect,
causing us to always fail to delete unlinked inodes
- if a snapshot had been taken, the unlinked inode needs to be
propagated to snapshot leaves so the unlink can happen there - fixed.
Brian Foster [Fri, 3 Nov 2023 13:09:38 +0000 (09:09 -0400)]
bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield
The bucket_offset field of bch_backpointer is a 40-bit bitfield, but the
bch2_backpointer_swab() helper uses swab32. This leads to inconsistency
when an on-disk fs is accessed from an opposite endian machine.
As it turns out, we already have an internal swab40() helper that is
used from the bch_alloc_v4 swab callback. Lift it into the backpointers
header file and use it consistently in both places.
Brian Foster [Fri, 3 Nov 2023 13:09:37 +0000 (09:09 -0400)]
bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field
A simple test to populate a filesystem on one CPU architecture and
fsck on an arch of the opposite byte order produces errors related
to the fragmentation LRU. This occurs because the 64-bit
fragmentation_lru field is not byte-order swapped when reads detect
that the on-disk/bset key values were written in opposite byte-order
of the current CPU.
Update the bch2_alloc_v4 swab callback to handle fragmentation_lru
as is done for other multi-byte fields. This doesn't affect existing
filesystems when accessed by CPUs of the same endianness because the
->swab() callback is only called when the bset flags indicate an
endianness mismatch between the CPU and on-disk data.
Brian Foster [Fri, 3 Nov 2023 13:09:36 +0000 (09:09 -0400)]
bcachefs: allow writeback to fill bio completely
The bcachefs folio writeback code includes a bio full check as well
as a fixed size check to determine when to split off and submit
writeback I/O. The inclusive check of the latter against the limit
means that writeback can submit slightly prematurely. This is not a
functional problem, but results in unnecessarily split I/Os and
extent merging.
This can be observed with a buffered write sized exactly to the
current maximum value (1MB) and with key_merging_disabled=1. The
latter prevents the merge from the second write such that a
subsequent check of the extent list shows a 1020k extent followed by
a contiguous 4k extent.
The purpose for the fixed size check is also undocumented and
somewhat obscure. Lift this check into a new helper that wraps the
bio check, fix the comparison logic, and add a comment to document
the purpose and how we might improve on this in the future.
Brian Foster [Wed, 1 Nov 2023 13:01:02 +0000 (09:01 -0400)]
bcachefs: fix odebug warn and lockdep splat due to on-stack rhashtable
Guenter Roeck reports a lockdep splat and DEBUG_OBJECTS_WORK related
warning when bch2_copygc_thread() initializes its rhashtable. The
lockdep splat relates to a warning print caused by the fact that the
rhashtable exists on the stack but is not annotated as so. This is
something that could be addressed by INIT_WORK_ONSTACK(), but
rhashtable doesn't expose that control and probably isnt worth the
churn for just one user. Instead, dynamically allocate the
buckets_in_flight structure and avoid the splat that way.
Brian Foster [Wed, 1 Nov 2023 19:02:45 +0000 (15:02 -0400)]
bcachefs: update alloc cursor in early bucket allocator
A recent bug report uncovered a scenario where a filesystem never
runs with freespace_initialized, and therefore the user observes
significantly degraded write performance by virtue of running the
early bucket allocator. The associated bug aside, the primary cause
of the performance drop in this particular instance is that the
early bucket allocator does not update the allocation cursor. This
means that every allocation walks the alloc btree from the first
bucket of the associated device looking for a bucket marked as free
space.
Update the early allocator code to set the alloc cursor to the last
processed position in the tree, similar to how the freelist
allocator behaves. With the alloc_cursor being updated, the retry
logic also needs to be updated to restart from the beginning of the
device when a free bucket is not available between the cursor and
the end of the device. Track the restart position in a first_bucket
variable to make the code a bit more easily readable and consistent
with the freelist allocator.
Brian Foster [Wed, 1 Nov 2023 19:02:44 +0000 (15:02 -0400)]
bcachefs: serialize on cached key in early bucket allocator
bcachefs had a transient bug where freespace_initialized was not
properly being set, which lead to unexpected use of the early bucket
allocator at runtime. This issue has been fixed, but the existence
of it uncovered a coherency issue in the early bucket allocation
code that is somewhat related to how uncached iterators deal with
the key cache.
The problem itself manifests as occasional failure of generic/113
due to corruption, often seen as a duplicate backpointer or multiple
data types per-bucket error. The immediate cause of the error is a
racing bucket allocation along the lines of the following sequence:
- Task 1 selects key A in bch2_bucket_alloc_early() and schedules.
- Task 2 selects the same key A, but proceeds to complete the
allocation and associated I/O, after which it releases the
open_bucket.
- Task 1 resumes with key A, but does not recognize the bucket is
now allocated because the open_bucket has been removed
from the hash when it was released in the previous step.
This generally shouldn't happen because the allocating task updates
the alloc btree key before releasing the bucket. This is not
sufficient in this particular instance, however, because an uncached
iterator for a cached btree doesn't actually lock the key cache slot
when no key exists for a given slot in the cache. Thus the fact that
the allocation side updates the cached key means that multiple
uncached iters can stumble across the same alloc key and duplicate
the bucket allocation as described above.
This is something that probably needs a longer term fix in the
iterator code. As a short term fix, close the race through explicit
use of a cached iterator for likely allocation candidates. We don't
want to scan the btree with a cached iterator because that would
unnecessarily pollute the cache. This mitigates cache pollution by
primarily scanning the tree with an uncached iterator, but closes
the race by creating a key cache entry for any prospective slot
prior to the bucket allocation attempt (also similar to how
_alloc_freelist() works via try_alloc_bucket()). This survives many
iterations of generic/113 on a kernel hacked to always use the early
bucket allocator.
Linus Torvalds [Sun, 5 Nov 2023 01:58:13 +0000 (15:58 -1000)]
Merge tag 'tsm-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux
Pull unified attestation reporting from Dan Williams:
"In an ideal world there would be a cross-vendor standard attestation
report format for confidential guests along with a common device
definition to act as the transport.
In the real world the situation ended up with multiple platform
vendors inventing their own attestation report formats with the
SEV-SNP implementation being a first mover to define a custom
sev-guest character device and corresponding ioctl(). Later, this
configfs-tsm proposal intercepted an attempt to add a tdx-guest
character device and a corresponding new ioctl(). It also anticipated
ARM and RISC-V showing up with more chardevs and more ioctls().
The proposal takes for granted that Linux tolerates the vendor report
format differentiation until a standard arrives. From talking with
folks involved, it sounds like that standardization work is unlikely
to resolve anytime soon. It also takes the position that kernfs ABIs
are easier to maintain than ioctl(). The result is a shared configfs
mechanism to return per-vendor report-blobs with the option to later
support a standard when that arrives.
Part of the goal here also is to get the community into the
"uncomfortable, but beneficial to the long term maintainability of the
kernel" state of talking to each other about their differentiation and
opportunities to collaborate. Think of this like the device-driver
equivalent of the common memory-management infrastructure for
confidential-computing being built up in KVM.
As for establishing an "upstream path for cross-vendor
confidential-computing device driver infrastructure" this is something
I want to discuss at Plumbers. At present, the multiple vendor
proposals for assigning devices to confidential computing VMs likely
needs a new dedicated repository and maintainer team, but that is a
discussion for v6.8.
For now, Greg and Thomas have acked this approach and this is passing
is AMD, Intel, and Google tests.
Summary:
- Introduce configfs-tsm as a shared ABI for confidential computing
attestation reports
- Convert sev-guest to additionally support configfs-tsm alongside
its vendor specific ioctl()
- Added signed attestation report retrieval to the tdx-guest driver
forgoing a new vendor specific ioctl()
- Misc cleanups and a new __free() annotation for kvfree()"
* tag 'tsm-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux:
virt: tdx-guest: Add Quote generation support using TSM_REPORTS
virt: sevguest: Add TSM_REPORTS support for SNP_GET_EXT_REPORT
mm/slab: Add __free() support for kvfree
virt: sevguest: Prep for kernel internal get_ext_report()
configfs-tsm: Introduce a shared ABI for attestation reports
virt: coco: Add a coco/Makefile and coco/Kconfig
virt: sevguest: Fix passing a stack buffer as a scatterlist target
Linus Torvalds [Sat, 4 Nov 2023 21:04:30 +0000 (11:04 -1000)]
Merge tag 'mtd/for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd updates from Miquel Raynal:
"The main set of changes is related to Uwe's work converting platform
remove callbacks to return void. Comes next (in number of changes)
Kees' additional structures annotations to improve the sanitizers. The
usual amount of cleanups apply.
About the more substancial contribution, one main function of the
partitions core could return an error which was not checked, this is
now fixed. On the bindings side, fixed partitions can now have a
compression property. Finally, an erroneous situation is now always
avoided in the MAP RAM driver.
CFI:
- A several years old byte swap has been fixed.
NAND:
- The subsystem has, as usual, seen a bit of cleanup being done this
cycle, typically return values of platform_get_irq() and
devm_kasprintf(). There is also a better ECC check in the Arasan
driver. This comes with smaller misc changes.
- In the SPI-NAND world there is now support for Foresee F35SQA002G,
Winbond W25N and XTX XT26 chips.
SPI NOR:
- For SPI NOR we cleaned the flash info entries in order to have them
slimmer and self explanatory. In order to make the entries as slim
as possible, we introduced sane default values so that the actual
flash entries don't need to specify them. We now use a flexible
macro to specify the flash ID instead of the previous INFOx()
macros that had hardcoded ID lengths.
- We also removed some flash entries: the very old Catalyst SPI
EEPROMs that were introduced once with the SPI-NOR subsystem, and a
Fujitsu MRAM. Both should use the at25 EEPROM driver. The latter
even has device tree bindings for the at25 driver.
- We made sure that the conversion didn't introduce any unwanted
changes by comparing the .rodata segment before and after the
conversion. The patches landed in linux-next immediately after
v6.6-rc2, we haven't seen any regressions yet.
- Apart of the autumn cleaning we introduced a new flash entry,
at25ff321a, and added block protection support for mt25qu512a"
* tag 'mtd/for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (91 commits)
mtd: cfi_cmdset_0001: Byte swap OTP info
mtd: rawnand: meson: check return value of devm_kasprintf()
mtd: rawnand: intel: check return value of devm_kasprintf()
mtd: rawnand: sh_flctl: Convert to module_platform_driver()
mtd: spi-nor: micron-st: use SFDP table for mt25qu512a
mtd: spi-nor: micron-st: enable lock/unlock for mt25qu512a
mtd: rawnand: Remove unused of_gpio.h inclusion
mtd: spinand: Add support for XTX XT26xxxDxxxxx
mtd: spinand: winbond: add support for serial NAND flash
mtd: rawnand: cadence: Annotate struct cdns_nand_chip with __counted_by
mtd: rawnand: Annotate struct mtk_nfc_nand_chip with __counted_by
mtd: spinand: add support for FORESEE F35SQA002G
mtd: rawnand: rockchip: Use struct_size()
mtd: rawnand: arasan: Include ECC syndrome along with in-band data while checking for ECC failure
mtd: Use device_get_match_data()
mtd: spi-nor: nxp-spifi: Convert to platform remove callback returning void
mtd: spi-nor: hisi-sfc: Convert to platform remove callback returning void
mtd: maps: sun_uflash: Convert to platform remove callback returning void
mtd: maps: sa1100-flash: Convert to platform remove callback returning void
mtd: maps: pxa2xx-flash: Convert to platform remove callback returning void
...
Linus Torvalds [Sat, 4 Nov 2023 20:42:07 +0000 (10:42 -1000)]
Merge tag 'topic/nvidia-gsp-2023-11-03' of git://anongit.freedesktop.org/drm/drm
Pull drm nouveau GSP support from Dave Airlie:
"This adds the initial support for the NVIDIA GSP firmware to nouveau.
This firmware is a new direction for Turing+ GPUs, and is only enabled
by default on Ada generation. Other generations need to use
nouveau.config=NvGspRm=1
The GSP firmware takes nearly all the GPU init and power management
tasks onto a risc-v CPU on the GPU.
This series is mostly the work from Ben Skeggs, and Dave added some
patches to rebase it to the latest firmware release which is where we
will stay for as long as possible as the firmwares have no ABI
stability"
* tag 'topic/nvidia-gsp-2023-11-03' of git://anongit.freedesktop.org/drm/drm: (49 commits)
nouveau/gsp: add some basic registry entries.
nouveau/gsp: fix message signature.
nouveau/gsp: move to 535.113.01
nouveau/disp: fix post-gsp build on 32-bit arm.
nouveau: fix r535 build on 32-bit arm.
drm/nouveau/ofa/r535: initial support
drm/nouveau/nvjpg/r535: initial support
drm/nouveau/nvenc/r535: initial support
drm/nouveau/nvdec/r535: initial support
drm/nouveau/gr/r535: initial support
drm/nouveau/ce/r535: initial support
drm/nouveau/fifo/r535: initial support
drm/nouveau/disp/r535: initial support
drm/nouveau/mmu/r535: initial support
drm/nouveau/gsp/r535: add interrupt handling
drm/nouveau/gsp/r535: add support for rm alloc
drm/nouveau/gsp/r535: add support for rm control
drm/nouveau/gsp/r535: add support for booting GSP-RM
drm/nouveau/nvkm: support loading fws into sg_table
drm/nouveau/kms/tu102-: disable vbios parsing when running on RM
...
Linus Torvalds [Sat, 4 Nov 2023 19:26:23 +0000 (09:26 -1000)]
Merge tag 'f2fs-for-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this cycle, we introduce a bigger page size support by changing the
internal f2fs's block size aligned to the page size. We also continue
to improve zoned block device support regarding the power off
recovery. As usual, there are some bug fixes regarding the error
handling routines in compression and ioctl.
Enhancements:
- Support Block Size == Page Size
- let f2fs_precache_extents() traverses in file range
- stop iterating f2fs_map_block if hole exists
- preload extent_cache for POSIX_FADV_WILLNEED
- compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file()
Bug fixes:
- do not return EFSCORRUPTED, but try to run online repair
- finish previous checkpoints before returning from remount
- fix error handling of __get_node_page and __f2fs_build_free_nids
- clean up zones when not successfully unmounted
- fix to initialize map.m_pblk in f2fs_precache_extents()
- fix to drop meta_inode's page cache in f2fs_put_super()
- set the default compress_level on ioctl
- fix to avoid use-after-free on dic
- fix to avoid redundant compress extension
- do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on
- fix deadloop in f2fs_write_cache_pages()"
* tag 'f2fs-for-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: finish previous checkpoints before returning from remount
f2fs: fix error handling of __get_node_page
f2fs: do not return EFSCORRUPTED, but try to run online repair
f2fs: fix error path of __f2fs_build_free_nids
f2fs: Clean up errors in segment.h
f2fs: clean up zones when not successfully unmounted
f2fs: let f2fs_precache_extents() traverses in file range
f2fs: avoid format-overflow warning
f2fs: fix to initialize map.m_pblk in f2fs_precache_extents()
f2fs: Support Block Size == Page Size
f2fs: stop iterating f2fs_map_block if hole exists
f2fs: preload extent_cache for POSIX_FADV_WILLNEED
f2fs: set the default compress_level on ioctl
f2fs: compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file()
f2fs: fix to drop meta_inode's page cache in f2fs_put_super()
f2fs: split initial and dynamic conditions for extent_cache
f2fs: compress: fix to avoid redundant compress extension
f2fs: compress: do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on
f2fs: compress: fix to avoid use-after-free on dic
f2fs: compress: fix deadloop in f2fs_write_cache_pages()
Linus Torvalds [Sat, 4 Nov 2023 19:20:04 +0000 (09:20 -1000)]
Merge tag '9p-for-6.7-rc1' of https://github.com/martinetd/linux
Pull 9p updates from Dominique Martinet:
A bunch of small fixes:
- three W=1 warning fixes: the NULL -> "" replacement isn't trivial
but is serialized identically by the protocol layer and has been
tested
- one syzbot/KCSAN datarace annotation where we don't care about
users messing with the fd they passed to mount -t 9p
- removing a declaration without implementation
- yet another race fix for trans_fd around connection close: the
'err' field is also used in potentially racy calls and this isn't
complete, but it's better than what we had
- and finally a theorical memory leak fix on serialization failure"
* tag '9p-for-6.7-rc1' of https://github.com/martinetd/linux:
9p/net: fix possible memory leak in p9_check_errors()
9p/fs: add MODULE_DESCRIPTION
9p/net: xen: fix false positive printf format overflow warning
9p: v9fs_listxattr: fix %s null argument warning
9p/trans_fd: Annotate data-racy writes to file::f_flags
fs/9p: Remove unused function declaration v9fs_inode2stat()
9p/trans_fd: avoid sending req to a cancelled conn
Linus Torvalds [Sat, 4 Nov 2023 19:13:50 +0000 (09:13 -1000)]
Merge tag '6.7-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- use after free fixes and deadlock fix
- symlink timestamp fix
- hashing perf improvement
- multichannel fixes
- minor debugging improvements
- fix creating fifos when using "sfu" mounts
- NTLMSSP authentication improvement
- minor fixes to include some missing create flags and structures from
recently updated protocol documentation
* tag '6.7-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6:
cifs: force interface update before a fresh session setup
cifs: do not reset chan_max if multichannel is not supported at mount
cifs: reconnect helper should set reconnect for the right channel
smb: client: fix use-after-free in smb2_query_info_compound()
smb: client: remove extra @chan_count check in __cifs_put_smb_ses()
cifs: add xid to query server interface call
cifs: print server capabilities in DebugData
smb: use crypto_shash_digest() in symlink_hash()
smb: client: fix use-after-free bug in cifs_debug_data_proc_show()
smb: client: fix potential deadlock when releasing mids
smb3: fix creating FIFOs when mounting with "sfu" mount option
Add definition for new smb3.1.1 command type
SMB3: clarify some of the unused CreateOption flags
cifs: Add client version details to NTLM authenticate message
smb3: fix touch -h of symlink
Linus Torvalds [Sat, 4 Nov 2023 18:46:37 +0000 (08:46 -1000)]
Merge tag 'x86_microcode_for_v6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode loading updates from Borislac Petkov:
"Major microcode loader restructuring, cleanup and improvements by
Thomas Gleixner:
- Restructure the code needed for it and add a temporary initrd
mapping on 32-bit so that the loader can access the microcode
blobs. This in itself is a preparation for the next major
improvement:
- Do not load microcode on 32-bit before paging has been enabled.
Handling this has caused an endless stream of headaches, issues,
ugly code and unnecessary hacks in the past. And there really
wasn't any sensible reason to do that in the first place. So switch
the 32-bit loading to happen after paging has been enabled and turn
the loader code "real purrty" again
- Drop mixed microcode steppings loading on Intel - there, a single
patch loaded on the whole system is sufficient
- Rework late loading to track which CPUs have updated microcode
successfully and which haven't, act accordingly
- Move late microcode loading on Intel in NMI context in order to
guarantee concurrent loading on all threads
- Make the late loading CPU-hotplug-safe and have the offlined
threads be woken up for the purpose of the update
- Add support for a minimum revision which determines whether late
microcode loading is safe on a machine and the microcode does not
change software visible features which the machine cannot use
anyway since feature detection has happened already. Roughly, the
minimum revision is the smallest revision number which must be
loaded currently on the system so that late updates can be allowed
- Other nice leanups, fixess, etc all over the place"
* tag 'x86_microcode_for_v6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
x86/microcode/intel: Add a minimum required revision for late loading
x86/microcode: Prepare for minimal revision check
x86/microcode: Handle "offline" CPUs correctly
x86/apic: Provide apic_force_nmi_on_cpu()
x86/microcode: Protect against instrumentation
x86/microcode: Rendezvous and load in NMI
x86/microcode: Replace the all-in-one rendevous handler
x86/microcode: Provide new control functions
x86/microcode: Add per CPU control field
x86/microcode: Add per CPU result state
x86/microcode: Sanitize __wait_for_cpus()
x86/microcode: Clarify the late load logic
x86/microcode: Handle "nosmt" correctly
x86/microcode: Clean up mc_cpu_down_prep()
x86/microcode: Get rid of the schedule work indirection
x86/microcode: Mop up early loading leftovers
x86/microcode/amd: Use cached microcode for AP load
x86/microcode/amd: Cache builtin/initrd microcode early
x86/microcode/amd: Cache builtin microcode too
x86/microcode/amd: Use correct per CPU ucode_cpu_info
...
Kent Overstreet [Mon, 30 Oct 2023 16:30:52 +0000 (12:30 -0400)]
bcachefs: Ensure srcu lock is not held too long
The SRCU read lock that btree_trans takes exists to make it safe for
bch2_trans_relock() to deref pointers to btree nodes/key cache items we
don't have locked, but as a side effect it blocks reclaim from freeing
those items.
Thus, it's important to not hold it for too long: we need to
differentiate between bch2_trans_unlock() calls that will be only for a
short duration, and ones that will be for an unbounded duration.
This introduces bch2_trans_unlock_long(), to be used mainly by the data
move paths.
Kent Overstreet [Tue, 31 Oct 2023 22:05:22 +0000 (18:05 -0400)]
bcachefs: Fix build errors with gcc 10
gcc 10 seems to complain about array bounds in situations where gcc 11
does not - curious.
This unfortunately requires adding some casts for now; we may
investigate getting rid of our __u64 _data[] VLA in a future patch so
that our start[0] members can be VLAs.
Linus Torvalds [Sat, 4 Nov 2023 17:54:21 +0000 (07:54 -1000)]
Merge tag 'acpi-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix the acpi_thermal_add() error path that may do a double-free in
some cases after recent changes (Dan Carpenter)"
* tag 'acpi-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: thermal: Fix acpi_thermal_unregister_thermal_zone() cleanup
Miquel Raynal [Sat, 4 Nov 2023 10:50:22 +0000 (11:50 +0100)]
Merge tag 'nand/for-6.7' into mtd/next
The raw NAND subsystem has, as usual, seen a bit of cleanup being done
this cycle, typically return values of platform_get_irq() and
devm_kasprintf(), plus structure annotations for sanitizers. There is
also a better ECC check in the Arasan driver. This comes with smaller
misc changes.
In the SPI-NAND world there is now support for Foresee F35SQA002G,
Winbond W25N and XTX XT26 chips.
Miquel Raynal [Sat, 4 Nov 2023 10:49:52 +0000 (11:49 +0100)]
Merge tag 'spi-nor/for-6.7' into mtd/next
For SPI NOR we cleaned the flash info entries in order to have
them slimmer and self explanatory. In order to make the entries
as slim as possible, we introduced sane default values so that
the actual flash entries don't need to specify them. We now use
a flexible macro to specify the flash ID instead of the previous
INFOx() macros that had hardcoded ID lengths.
We now use:
+ .id = SNOR_ID(0xef, 0x80, 0x20),
+ .name = "w25q512nwm",
+ .otp = SNOR_OTP(256, 3, 0x1000, 0x1000),
We also removed some flash entries: the very old Catalyst
SPI EEPROMs that were introduced once with the SPI-NOR subsystem,
and a Fujitsu MRAM. Both should use the at25 EEPROM driver.
The latter even has device tree bindings for the at25 driver.
We made sure that the conversion didn't introduce any unwanted
changes by comparing the .rodata segment before and after the
conversion. The patches landed in linux-next immediately after
v6.6-rc2, we haven't seen any regressions yet.
Apart of the autumn cleaning we introduced a new flash entry,
at25ff321a, and added block protection support for mt25qu512a.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
pinctrl: mediatek: paris: use new pinctrl GPIO helpers
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
pinctrl: mediatek: common: use new pinctrl GPIO helpers
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
pinctrl: mediatek: moore: use new pinctrl GPIO helpers
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.
Replace the pinctrl helpers taking the global GPIO number as argument
with the improved variants that instead take a pointer to the GPIO chip
and the controller-relative offset.