Merge tag 'folio-5.18e' of git://git.infradead.org/users/willy/pagecache
Pull folio fixes from Matthew Wilcox:
"Fewer bug reports than I was expecting from enabling large folios.
One that doesn't show up on x86 but does on arm64, one that shows up
with hugetlbfs memory failure testing and one that shows up with page
migration, which it turns out I wasn't testing because my last NUMA
machine died. Need to set up a qemu fake NUMA machine so I don't skip
testing that in future.
Summary:
- Remove the migration code's assumptions about large pages being PMD
sized
- Don't call pmd_page() on a non-leaf PMD
- Fix handling of hugetlbfs pages in page_vma_mapped_walk"
* tag 'folio-5.18e' of git://git.infradead.org/users/willy/pagecache:
mm/rmap: Fix handling of hugetlbfs pages in page_vma_mapped_walk
mm/mempolicy: Use vma_alloc_folio() in new_page()
mm: Add vma_alloc_folio()
mm/migrate: Use a folio in migrate_misplaced_transhuge_page()
mm/migrate: Use a folio in alloc_migration_target()
mm/huge_memory: Avoid calling pmd_page() on a non-leaf PMD
Merge tag 'spi-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of fixes that have arrived since the merge window,
the most noticable one is a fix for unmapping messages when the
mapping was done with the struct device supplied to do the mapping
overridden"
* tag 'spi-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: bcm-qspi: fix MSPI only access with bcm_qspi_exec_mem_op()
spi: cadence-quadspi: fix protocol setup for non-1-1-X operations
spi: core: add dma_map_dev for __spi_unmap_msg()
spi: mxic: Fix an error handling path in mxic_spi_probe()
spi: rpc-if: Fix RPM imbalance in probe error path
Merge tag 'regulator-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A few small driver specific fixes for v5.18, plus an update to the
MAINTAINERS file"
* tag 'regulator-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
MAINTAINERS: Fix reviewer info for a few ROHM ICs
regulator: atc260x: Fix missing active_discharge_on setting
regulator: rtq2134: Fix missing active_discharge_on setting
regulator: wm8994: Add an off-on delay for WM8994 variant
Merge tag 'mmc-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Improve API to make it clear that mmc_hw_reset() is for cards
- Fixup support for writeback-cache for eMMC and SD
- Check for errors after writes on SPI
MMC host:
- renesas_sdhi: A couple of fixes of TAP settings for eMMC HS400 mode
- mmci_stm32: Fixup check of all elements in sg list
- sdhci-xenon: Revert unnecessary fix for annoying 1.8V regulator warning"
* tag 'mmc-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: improve API to make clear mmc_hw_reset is for cards
mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete
mmc: renesas_sdhi: special 4tap settings only apply to HS400
mmc: core: Fixup support for writeback-cache for eMMC and SD
mmc: block: Check for errors after write on SPI
mmc: mmci: stm32: correctly check all elements of sg list
Revert "mmc: sdhci-xenon: fix annoying 1.8V regulator warning"
Merge tag 'iommu-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fix from Joerg Roedel:
- Fix boot regression due to a NULL-ptr dereference on OMAP machines
* tag 'iommu-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/omap: Fix regression in probe for NULL pointer dereference
perf/imx_ddr: Fix undefined behavior due to shift overflowing the constant
Fix:
In file included from <command-line>:0:0:
In function ‘ddr_perf_counter_enable’,
inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2:
././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \
declared with attribute error: FIELD_PREP: mask is not constant
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
...
See https://lore.kernel.org/r/YkwQ6%[email protected] for the gory
details as to why it triggers with older gccs only.
Matti Vaittinen [Fri, 8 Apr 2022 08:32:00 +0000 (11:32 +0300)]
MAINTAINERS: Fix reviewer info for a few ROHM ICs
The email backend used by ROHM keeps labeling patches as spam.
Additionally, there have been reports of some emails been completely
dropped. Finally also the email list (or shared inbox) [email protected] inadvertly stopped working and has not
been reviwed during the past few weeks.
Remove no longer working list 'linux-power' list-entry and switch my
email to use the personal gmail account instead of the company account.
arm64: patch_text: Fixup last cpu should be master
These patch_text implementations are using stop_machine_cpuslocked
infrastructure with atomic cpu_count. The original idea: When the
master CPU patch_text, the others should wait for it. But current
implementation is using the first CPU as master, which couldn't
guarantee the remaining CPUs are waiting. This patch changes the
last CPU as the master to solve the potential risk.
Tony Lindgren [Thu, 31 Mar 2022 06:23:01 +0000 (09:23 +0300)]
iommu/omap: Fix regression in probe for NULL pointer dereference
Commit 3f6634d997db ("iommu: Use right way to retrieve iommu_ops") started
triggering a NULL pointer dereference for some omap variants:
__iommu_probe_device from probe_iommu_group+0x2c/0x38
probe_iommu_group from bus_for_each_dev+0x74/0xbc
bus_for_each_dev from bus_iommu_probe+0x34/0x2e8
bus_iommu_probe from bus_set_iommu+0x80/0xc8
bus_set_iommu from omap_iommu_init+0x88/0xcc
omap_iommu_init from do_one_initcall+0x44/0x24
This is caused by omap iommu probe returning 0 instead of ERR_PTR(-ENODEV)
as noted by Jason Gunthorpe <[email protected]>.
Looks like the regression already happened with an earlier commit 6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
that changed the function return type and missed converting one place.
Wolfram Sang [Fri, 8 Apr 2022 08:00:42 +0000 (10:00 +0200)]
mmc: core: improve API to make clear mmc_hw_reset is for cards
To make it unambiguous that mmc_hw_reset() is for cards and not for
controllers, we make the function argument mmc_card instead of mmc_host.
Also, all users are converted.
imx:
- Catch an EDID allocation failure in imx-ldb
- fix a leaked drm display mode on DT parsing error in parallel-display
- properly remove the dw_hdmi bridge in case the component_add fails in dw_hdmi-imx
- fix the IPU clock frequency debug printout in ipu-di"
* tag 'drm-fixes-2022-04-08' of git://anongit.freedesktop.org/drm/drm: (61 commits)
dt-bindings: display: panel: mipi-dbi-spi: Make width-mm/height-mm mandatory
fbdev: Fix unregistering of framebuffers without device
drm/amdgpu/smu10: fix SoC/fclk units in auto mode
drm/amd/display: update dcn315 clock table read
drm/amdgpu/display: change pipe policy for DCN 2.1
drm/amd/display: Add configuration options for AUX wake work around.
drm/amd/display: remove assert for odm transition case
drm/amdgpu: don't use BACO for reset in S3
drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
drm/amdkfd: Create file descriptor after client is added to smi_clients list
drm/amdgpu: Sync up header and implementation to use the same parameter names
drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address
amd/display: set backlight only if required
drm/amd/display: Fix allocate_mst_payload assert on resume
drm/amd/display: Revert FEC check in validation
drm/amd/display: Add work around for AUX failure on wake.
drm/amd/display: Clear optc false state when disable otg
drm/amd/display: Enable power gating before init_pipes
drm/amd/display: Remove redundant dsc power gating from init_hw
drm/amd/display: Correct Slice reset calculation
...
Merge tag '5.18-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
- reconnect fixes: one for DFS and one to avoid a reconnect race
- small change to deal with upcoming behavior change of list iterators
* tag '5.18-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module number
cifs: force new session setup and tcon for dfs
cifs: remove check of list iterator against head past the loop body
cifs: fix potential race with cifsd thread
- eth: ice:
- clear default forwarding VSI during VSI release
- fix broken IFF_ALLMULTI handling
- synchronize_rcu() when terminating rings
- eth: qede: confirm skb is allocated before using
- eth: aqc111: fix out-of-bounds accesses in RX fixup
- eth: slip: fix NPD bug in sl_tx_timeout()"
* tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
drivers: net: slip: fix NPD bug in sl_tx_timeout()
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
qede: confirm skb is allocated before using
net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n
net: phy: mscc-miim: reject clause 45 register accesses
net: axiemac: use a phandle to reference pcs_phy
dt-bindings: net: add pcs-handle attribute
net: axienet: factor out phy_node in struct axienet_local
net: axienet: setup mdio unconditionally
net: sfc: fix using uninitialized xdp tx_queue
rxrpc: fix a race in rxrpc_exit_net()
net: openvswitch: fix leak of nested actions
net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address()
net: openvswitch: don't send internal clone attribute to the userspace.
net: micrel: Fix KS8851 Kconfig
ice: clear cmd_type_offset_bsz for TX rings
ice: xsk: fix VSI state check in ice_xsk_wakeup()
...
Dave Airlie [Thu, 7 Apr 2022 23:22:07 +0000 (09:22 +1000)]
Merge tag 'drm-misc-fixes-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.18-rc2:
- Fix a crash when booting with nouveau on tegra.
- Don't require input port for MIPI-DSI, and make width/height mandatory.
- Fix unregistering of framebuffers without device.
Dave Airlie [Thu, 7 Apr 2022 23:13:33 +0000 (09:13 +1000)]
Merge tag 'drm-misc-next-fixes-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-next-fixes for v5.18-rc2:
- fix warning about fence containers
- fix logic error in new fence merge code
- handle empty dma_fence_arrays gracefully
- Try all possible cases for bridge/panel detection.
SUNRPC: Move the call to xprt_send_pagedata() out of xprt_sock_sendmsg()
The client and server have different requirements for their memory
allocation, so move the allocation of the send buffer out of the socket
send code that is common to both.
Reported-by: NeilBrown <[email protected]> Fixes: b2648015d452 ("SUNRPC: Make the rpciod and xprtiod slab allocation modes consistent") Signed-off-by: Trond Myklebust <[email protected]>
Muchun Song [Fri, 1 Apr 2022 02:59:05 +0000 (10:59 +0800)]
NFSv4.2: Fix missing removal of SLAB_ACCOUNT on kmem_cache allocation
The commit 5c60e89e71f8 ("NFSv4.2: Fix up an invalid combination of memory
allocation flags") has stripped GFP_KERNEL_ACCOUNT down to GFP_KERNEL,
however, it forgot to remove SLAB_ACCOUNT from kmem_cache allocation.
It means that memory is still limited by kmemcg. This patch also fix a
NULL pointer reference issue [1] reported by NeilBrown.
SUNRPC: Ensure we flush any closed sockets before xs_xprt_free()
We must ensure that all sockets are closed before we call xprt_free()
and release the reference to the net namespace. The problem is that
calling fput() will defer closing the socket until delayed_fput() gets
called.
Let's fix the situation by allowing rpciod and the transport teardown
code (which runs on the system wq) to call __fput_sync(), and directly
close the socket.
Reported-by: Felix Fu <[email protected]> Acked-by: Al Viro <[email protected]> Fixes: a73881c96d73 ("SUNRPC: Fix an Oops in udp_poll()") Cc: [email protected] # 5.1.x: 3be232f11a3c: SUNRPC: Prevent immediate close+reconnect Cc: [email protected] # 5.1.x: 89f42494f92f: SUNRPC: Don't call connect() more than once on a TCP socket Cc: [email protected] # 5.1.x Signed-off-by: Trond Myklebust <[email protected]>
Trond Myklebust [Thu, 31 Mar 2022 00:00:07 +0000 (20:00 -0400)]
NFS: Replace readdir's use of xxhash() with hash_64()
Both xxhash() and hash_64() appear to give similarly low collision
rates with a standard linearly increasing readdir offset. They both give
similarly higher collision rates when applied to ext4's offsets.
Pavel Begunkov [Thu, 7 Apr 2022 13:05:04 +0000 (14:05 +0100)]
io_uring: zero tag on rsrc removal
Automatically default rsrc tag in io_queue_rsrc_removal(), it's safer
than leaving it there and relying on the rest of the code to behave and
not use it.
Pavel Begunkov [Wed, 6 Apr 2022 11:43:58 +0000 (12:43 +0100)]
io_uring: don't touch scm_fp_list after queueing skb
It's safer to not touch scm_fp_list after we queued an skb to which it
was assigned, there might be races lurking if we screw subtle sync
guarantees on the io_uring side.
Fixes: 6b06314c47e14 ("io_uring: add file set registration") Signed-off-by: Pavel Begunkov <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
Pavel Begunkov [Wed, 6 Apr 2022 11:43:57 +0000 (12:43 +0100)]
io_uring: nospec index for tags on files update
Don't forget to array_index_nospec() for indexes before updating rsrc
tags in __io_sqe_files_update(), just use already safe and precalculated
index @i.
Fixes: c3bdad0271834 ("io_uring: add generic rsrc update with tags") Signed-off-by: Pavel Begunkov <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
There's some discussion on the API not being as good as it can be.
Rather than ship something and be stuck with it forever, let's revert
the NAPI support for now and work on getting something sorted out
for the next kernel release instead.
Jens Axboe [Thu, 31 Mar 2022 18:38:46 +0000 (12:38 -0600)]
io_uring: drop the old style inflight file tracking
io_uring tracks requests that are referencing an io_uring descriptor to
be able to cancel without worrying about loops in the references. Since
we now assign the file at execution time, the easier approach is to drop
a potentially problematic reference before we punt the request. This
eliminates the need to special case these types of files beyond just
marking them as such, and simplifies cancelation quite a bit.
This also fixes a recent issue where an async punted tee operation would
with the io_uring descriptor as the output file would crash when
attempting to get a reference to the file from the io-wq worker. We
could have worked around that, but this is the much cleaner fix.
Jens Axboe [Tue, 29 Mar 2022 16:10:08 +0000 (10:10 -0600)]
io_uring: defer file assignment
If an application uses direct open or accept, it knows in advance what
direct descriptor value it will get as it picks it itself. This allows
combined requests such as:
where we prepare both a file open and read, and only get a completion
event for the read when both have completed successfully.
Currently links are fully prepared before the head is issued, but that
fails if the dependent link needs a file assigned that isn't valid until
the head has completed.
Conversely, if the same chain is performed but the fixed file slot is
already valid, then we would be unexpectedly returning data from the
old file slot rather than the newly opened one. Make sure we're
consistent here.
Allow deferral of file setup, which makes this documented case work.
io_uring: propagate issue_flags state down to file assignment
We'll need this in a future patch, when we could be assigning the file
after the prep stage. While at it, get rid of the io_file_get() helper,
it just makes the code harder to read.
- Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri)
- Fix a crash when unloading VMbus module (Guilherme G. Piccoli)
* tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
Drivers: hv: balloon: Disable balloon and hot-add accordingly
Drivers: hv: balloon: Support status report for larger page sizes
Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
PCI: hv: Propagate coherence from VMbus device to PCI device
Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
Drivers: hv: vmbus: Fix potential crash on module unload
Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register()
Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
Merge tag 'random-5.18-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:
- Another fixup to the fast_init/crng_init split, this time in how much
entropy is being credited, from Jan Varho.
- As discussed, we now opportunistically call try_to_generate_entropy()
in /dev/urandom reads, as a replacement for the reverted commit. I
opted to not do the more invasive wait_for_random_bytes() change at
least for now, preferring to do something smaller and more obvious
for the time being, but maybe that can be revisited as things evolve
later.
- Userspace can use FUSE or userfaultfd or simply move a process to
idle priority in order to make a read from the random device never
complete, which breaks forward secrecy, fixed by overwriting
sensitive bytes early on in the function.
- Jann Horn noticed that /dev/urandom reads were only checking for
pending signals if need_resched() was true, a bug going back to the
genesis commit, now fixed by always checking for signal_pending() and
calling cond_resched(). This explains various noticeable signal
delivery delays I've seen in programs over the years that do long
reads from /dev/urandom.
- In order to be more like other devices (e.g. /dev/zero) and to
mitigate the impact of fixing the above bug, which has been around
forever (users have never really needed to check the return value of
read() for medium-sized reads and so perhaps many didn't), we now
move signal checking to the bottom part of the loop, and do so every
PAGE_SIZE-bytes.
* tag 'random-5.18-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: check for signals every PAGE_SIZE chunk of /dev/[u]random
random: check for signal_pending() outside of need_resched() check
random: do not allow user to keep crng key around on stack
random: opportunistically initialize on /dev/urandom reads
random: do not split fast init input in add_hwgenerator_randomness()
Merge tag 'ata-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
- Fix a compilation warning due to an uninitialized variable in
ata_sff_lost_interrupt(), from me.
- Fix invalid internal command tag handling in the sata_dwc_460ex
driver, from Christian.
- Disable READ LOG DMA EXT with Samsung 840 EVO SSDs as this command
causes the drives to hang, from Christian.
- Change the config option CONFIG_SATA_LPM_POLICY back to its original
name CONFIG_SATA_LPM_MOBILE_POLICY to avoid potential problems with
users losing their configuration (as discussed during the merge
window), from Mario.
* tag 'ata-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: ahci: Rename CONFIG_SATA_LPM_POLICY configuration item back
ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
ata: sata_dwc_460ex: Fix crash due to OOB write
ata: libata-sff: Fix compilation warning in ata_sff_lost_interrupt()
Chuck Lever [Wed, 6 Apr 2022 17:51:32 +0000 (13:51 -0400)]
SUNRPC: Fix the svc_deferred_event trace class
Fix a NULL deref crash that occurs when an svc_rqst is deferred
while the sunrpc tracing subsystem is enabled. svc_revisit() sets
dr->xprt to NULL, so it can't be relied upon in the tracepoint to
provide the remote's address.
Unfortunately we can't revert the "svc_deferred_class" hunk in
commit ece200ddd54b ("sunrpc: Save remote presentation address in
svc_xprt for trace events") because there is now a specific check
of event format specifiers for unsafe dereferences. The warning
that check emits is:
event svc_defer_recv has unsafe dereference of argument 1
A "%pISpc" format specifier with a "struct sockaddr *" is indeed
flagged by this check.
Instead, take the brute-force approach used by the svcrdma_qp_error
tracepoint. Convert the dr::addr field into a presentation address
in the TP_fast_assign() arm of the trace event, and store that as
a string. This fix can be backported to -stable kernels.
In the meantime, commit c6ced22997ad ("tracing: Update print fmt
check to handle new __get_sockaddr() macro") is now in v5.18, so
this wonky fix can be replaced with __sockaddr() and friends
properly during the v5.19 merge window.
Fixes: ece200ddd54b ("sunrpc: Save remote presentation address in svc_xprt for trace events") Signed-off-by: Chuck Lever <[email protected]>
zhenwei pi [Thu, 7 Apr 2022 06:40:08 +0000 (14:40 +0800)]
mm/rmap: Fix handling of hugetlbfs pages in page_vma_mapped_walk
page_mapped_in_vma() sets nr_pages to 1, which is usually correct as we
only want to know about the precise page and not about other pages in
the folio. However, hugetlbfs does want to know about the entire hpage,
and using nr_pages to get the size of the hpage is wrong. We could
change page_mapped_in_vma() to special-case hugetlbfs pages, but it's
better to ignore nr_pages in page_vma_mapped_walk() and get the size
from the VMA instead.
Fixes: 2aff7a4755bed ("mm: Convert page_vma_mapped_walk to work on PFNs") Signed-off-by: zhenwei pi <[email protected]> Reviewed-by: Muchun Song <[email protected]> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
[edit commit message, use hstate directly]
This wrapper around alloc_pages_vma() calls prep_transhuge_page(),
removing the obligation from the caller. This is in the same spirit
as __folio_alloc().
mm/migrate: Use a folio in alloc_migration_target()
This removes an assumption that a large folio is HPAGE_PMD_ORDER
as well as letting us remove the call to prep_transhuge_page()
and a few hidden calls to compound_head().
mm/huge_memory: Avoid calling pmd_page() on a non-leaf PMD
Calling try_to_unmap() with TTU_SPLIT_HUGE_PMD and a folio that's not
mapped by a PMD causes oopses on arm64 because we now call page_folio()
on an invalid page. pmd_page() returns a valid page for non-leaf PMDs on
some architectures, so this bug escaped testing before now. Fix this bug
by delaying the call to pmd_page() until after we know the PMD is a leaf.
The x86 MSI message data is 32 bits in total and is either in
compatibility or remappable format, see Intel Virtualization Technology
for Directed I/O, section 5.1.2.
Wolfram Sang [Mon, 4 Apr 2022 11:49:02 +0000 (13:49 +0200)]
mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete
When HS400 tuning is complete and HS400 is going to be activated, we
have to keep the current number of TAPs and should not overwrite them
with a hardcoded value. This was probably a copy&paste mistake when
upporting HS400 support from the BSP.
<inline asm>:29:1: while in macro instantiation
extable_type_reg reg=%eax, type=(17 | ((0) << 16))
^
This appears to be another LTO specific issue similar to what was folded
into commit 4b5305decc84 ("x86/extable: Extend extable functionality"),
where the `.set found, 0` in DEFINE_EXTABLE_TYPE_REG in
arch/x86/include/asm/asm.h conflicts with the symbol for the static
function `found` in arch/x86/kernel/pmem.c.
Assembler .set directive declare symbols with global visibility, so the
assembler may not rename such symbols in the event of a conflict. LTO
could rename static functions if there was a conflict in C sources, but
it cannot see into symbols defined in inline asm.
The symbols are also retained in the symbol table, regardless of LTO.
Give the symbols .L prefixes making them locally visible, so that they
may be renamed for LTO to avoid conflicts, and to drop them from the
symbol table regardless of LTO.
Oliver Upton [Wed, 6 Apr 2022 23:56:15 +0000 (23:56 +0000)]
selftests: KVM: Free the GIC FD when cleaning up in arch_timer
In order to correctly destroy a VM, all references to the VM must be
freed. The arch_timer selftest creates a VGIC for the guest, which
itself holds a reference to the VM.
Oliver Upton [Wed, 6 Apr 2022 23:56:14 +0000 (23:56 +0000)]
selftests: KVM: Don't leak GIC FD across dirty log test iterations
dirty_log_perf_test instantiates a VGICv3 for the guest (if supported by
hardware) to reduce the overhead of guest exits. However, the test does
not actually close the GIC fd when cleaning up the VM between test
iterations, meaning that the VM is never actually destroyed in the
kernel.
While this is generally a bad idea, the bug was detected from the kernel
spewing about duplicate debugfs entries as subsequent VMs happen to
reuse the same FD even though the debugfs directory is still present.
Abstract away the notion of setup/cleanup of the GIC FD from the test
by creating arch-specific helpers for test setup/cleanup. Close the GIC
FD on VM cleanup and do nothing for the other architectures.
Oliver Upton [Wed, 6 Apr 2022 23:56:13 +0000 (23:56 +0000)]
KVM: Don't create VM debugfs files outside of the VM directory
Unfortunately, there is no guarantee that KVM was able to instantiate a
debugfs directory for a particular VM. To that end, KVM shouldn't even
attempt to create new debugfs files in this case. If the specified
parent dentry is NULL, debugfs_create_file() will instantiate files at
the root of debugfs.
For arm64, it is possible to create the vgic-state file outside of a
VM directory, the file is not cleaned up when a VM is destroyed.
Nonetheless, the corresponding struct kvm is freed when the VM is
destroyed.
Nip the problem in the bud for all possible errant debugfs file
creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing,
debugfs_create_file() will fail instead of creating the file in the root
directory.
When testing a kernel with commit a5905d6af492 ("KVM: arm64:
Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")
get-reg-list output
vregs: Number blessed registers: 234
vregs: Number registers: 238
vregs: There are 1 new registers.
Consider adding them to the blessed reg list with the following lines:
KVM_REG_ARM_FW_REG(3),
vregs: PASS
...
That output inspired two changes: 1) add the new register to the
blessed list and 2) explain why "Number registers" is actually four
larger than "Number blessed registers" (on the system used for
testing), even though only one register is being stated as new.
The reason is that some registers are host dependent and they get
filtered out when comparing with the blessed list. The system
used for the test apparently had three filtered registers.
drivers: net: slip: fix NPD bug in sl_tx_timeout()
When a slip driver is detaching, the slip_close() will act to
cleanup necessary resources and sl->tty is set to NULL in
slip_close(). Meanwhile, the packet we transmit is blocked,
sl_tx_timeout() will be called. Although slip_close() and
sl_tx_timeout() use sl->lock to synchronize, we don`t judge
whether sl->tty equals to NULL in sl_tx_timeout() and the
null pointer dereference bug will happen.
We've added 8 non-merge commits during the last 8 day(s) which contain
a total of 9 files changed, 139 insertions(+), 36 deletions(-).
The main changes are:
1) rethook related fixes, from Jiri and Masami.
2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin.
3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf: selftests: Test fentry tracing a struct_ops program
bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
rethook: Fix to use WRITE_ONCE() for rethook:: Handler
selftests/bpf: Fix warning comparing pointer to 0
bpf: Fix sparse warnings in kprobe_multi_resolve_syms
bpftool: Explicit errno handling in skeletons
====================
scsi: megaraid_sas: Target with invalid LUN ID is deleted during scan
The megaraid_sas driver supports single LUN for RAID devices. That is LUN
0. All other LUNs are unsupported. When a device scan on a logical target
with invalid LUN number is invoked through sysfs, that target ends up
getting removed.
Add LUN ID validation in the slave destroy function to avoid the target
deletion.
Xiaomeng Tong [Sun, 20 Mar 2022 15:07:33 +0000 (23:07 +0800)]
scsi: ufs: ufshpb: Fix a NULL check on list iterator
The list iterator is always non-NULL so the check 'if (!rgn)' is always
false and the dev_err() is never called. Move the check outside the loop
and determine if 'victim_rgn' is NULL, to fix this bug.
scsi: sd: Clean up gendisk if device_add_disk() failed
We forgot to call blk_cleanup_disk() when device_add_disk() failed. This
would cause a memory leak of gendisk and sched_tags allocated in
elevator_init_mq()
Variable dmp is being assigned a value that is never read, the variable is
redundant and can be removed.
Cleans up clang scan build warning:
drivers/message/fusion/mptbase.c:6667:39: warning: Although
the value stored to 'dmp' is used in the enclosing expression,
the value is never actually read from 'dmp' [deadcode.DeadStores]
Alexey Galakhov [Wed, 9 Mar 2022 21:25:35 +0000 (22:25 +0100)]
scsi: mvsas: Add PCI ID of RocketRaid 2640
The HighPoint RocketRaid 2640 is a low-cost SAS controller based on Marvell
chip. The chip in question was already supported by the kernel, just the
PCI ID of this particular board was missing.
scsi: mpt3sas: Fail reset operation if config request timed out
As part of controller reset operation the driver issues a config request
command. If this command gets times out, then fail the controller reset
operation instead of retrying it.
Finn Thain [Wed, 6 Apr 2022 09:05:39 +0000 (19:05 +1000)]
scsi: sym53c500_cs: Stop using struct scsi_pointer
This driver doesn't use SCp.ptr to save a SCSI command data pointer which
means "scsi pointer" is a complete misnomer here. Only a few members of
struct scsi_pointer are needed so move those to private command data.
The start_addres argument of mpt3sas_check_same_4gb_region() was misnamed
in the function kdoc comment, resulting in the following warning when
compiling with W=1.
drivers/scsi/mpt3sas/mpt3sas_base.c:5728: warning: Function parameter or
member 'start_address' not described in 'mpt3sas_check_same_4gb_region'
drivers/scsi/mpt3sas/mpt3sas_base.c:5728: warning: Excess function
parameter 'reply_pool_start_address' description in
'mpt3sas_check_same_4gb_region'
Fix the argument name in the function kdoc comment to avoid it. While at
it, remove a useless blank line between the kdoc and function code.
Damien Le Moal [Mon, 4 Apr 2022 04:55:47 +0000 (13:55 +0900)]
scsi: scsi_debug: Fix sdebug_blk_mq_poll() in_use_bm bitmap use
The in_use_bm bitmap of struct sdebug_queue should be accessed under
protection of the qc_lock spinlock. Make sure that this lock is taken
before calling find_first_bit() at the beginning of the function
sdebug_blk_mq_poll().
Dave Airlie [Thu, 7 Apr 2022 00:23:03 +0000 (10:23 +1000)]
Merge tag 'imx-drm-fixes-2022-04-06' of git://git.pengutronix.de/pza/linux into drm-fixes
drm/imx: error handling and debug output fixes
Catch an EDID allocation failure in imx-ldb, fix a leaked drm display
mode on DT parsing error in parallel-display, properly remove the
dw_hdmi bridge in case the component_add fails in dw_hdmi-imx, and
fix the IPU clock frequency debug printout in ipu-di.
random: check for signals every PAGE_SIZE chunk of /dev/[u]random
In 1448769c9cdb ("random: check for signal_pending() outside of
need_resched() check"), Jann pointed out that we previously were only
checking the TIF_NOTIFY_SIGNAL and TIF_SIGPENDING flags if the process
had TIF_NEED_RESCHED set, which meant in practice, super long reads to
/dev/[u]random would delay signal handling by a long time. I tried this
using the below program, and indeed I wasn't able to interrupt a
/dev/urandom read until after several megabytes had been read. The bug
he fixed has always been there, and so code that reads from /dev/urandom
without checking the return value of read() has mostly worked for a long
time, for most sizes, not just for <= 256.
Maybe it makes sense to keep that code working. The reason it was so
small prior, ignoring the fact that it didn't work anyway, was likely
because /dev/random used to block, and that could happen for pretty
large lengths of time while entropy was gathered. But now, it's just a
chacha20 call, which is extremely fast and is just operating on pure
data, without having to wait for some external event. In that sense,
/dev/[u]random is a lot more like /dev/zero.
Taking a page out of /dev/zero's read_zero() function, it always returns
at least one chunk, and then checks for signals after each chunk. Chunk
sizes there are of length PAGE_SIZE. Let's just copy the same thing for
/dev/[u]random, and check for signals and cond_resched() for every
PAGE_SIZE amount of data. This makes the behavior more consistent with
expectations, and should mitigate the impact of Jann's fix for the
age-old signal check bug.
Kefeng Wang [Wed, 6 Apr 2022 14:57:57 +0000 (00:57 +1000)]
powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit
mpe: On 64-bit Book3E vmalloc space starts at 0x8000000000000000.
Because of the way __pa() works we have:
__pa(0x8000000000000000) == 0, and therefore
virt_to_pfn(0x8000000000000000) == 0, and therefore
virt_addr_valid(0x8000000000000000) == true
Which is wrong, virt_addr_valid() should be false for vmalloc space.
In fact all vmalloc addresses that alias with a valid PFN will return
true from virt_addr_valid(). That can cause bugs with hardened usercopy
as described below by Kefeng Wang:
When running ethtool eth0 on 64-bit Book3E, a BUG occurred:
usercopy: Kernel memory exposure attempt detected from SLUB object not in SLUB page?! (offset 0, size 1048)!
kernel BUG at mm/usercopy.c:99
...
usercopy_abort+0x64/0xa0 (unreliable)
__check_heap_object+0x168/0x190
__check_object_size+0x1a0/0x200
dev_ethtool+0x2494/0x2b20
dev_ioctl+0x5d0/0x770
sock_do_ioctl+0xf0/0x1d0
sock_ioctl+0x3ec/0x5a0
__se_sys_ioctl+0xf0/0x160
system_call_exception+0xfc/0x1f0
system_call_common+0xf8/0x200
The code shows below,
data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
The data is alloced by vmalloc(), virt_addr_valid(ptr) will return true
on 64-bit Book3E, which leads to the panic.
As commit 4dd7554a6456 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va
and __pa addresses") does, make sure the virt addr above PAGE_OFFSET in
the virt_addr_valid() for 64-bit, also add upper limit check to make
sure the virt is below high_memory.
Meanwhile, for 32-bit PAGE_OFFSET is the virtual address of the start
of lowmem, high_memory is the upper low virtual address, the check is
suitable for 32-bit, this will fix the issue mentioned in commit 602946ec2f90 ("powerpc: Set max_mapnr correctly") too.
On 32-bit there is a similar problem with high memory, that was fixed in
commit 602946ec2f90 ("powerpc: Set max_mapnr correctly"), but that
commit breaks highmem and needs to be reverted.
We can't easily fix __pa(), we have code that relies on its current
behaviour. So for now add extra checks to virt_addr_valid().
For 64-bit Book3S the extra checks are not necessary, the combination of
virt_to_pfn() and pfn_valid() should yield the correct result, but they
are harmless.
fbdev: Fix unregistering of framebuffers without device
OF framebuffers do not have an underlying device in the Linux
device hierarchy. Do a regular unregister call instead of hot
unplugging such a non-existing device. Fixes a NULL dereference.
An example error message on ppc64le is shown below.
The bug [1] was introduced by commit 27599aacbaef ("fbdev: Hot-unplug
firmware fb devices on forced removal"). Most firmware framebuffers
have an underlying platform device, which can be hot-unplugged
before loading the native graphics driver. OF framebuffers do not
(yet) have that device. Fix the code by unregistering the framebuffer
as before without a hot unplug.
drbd: fix an invalid memory access caused by incorrect use of list iterator
The bug is here:
idr_remove(&connection->peer_devices, vnr);
If the previous for_each_connection() don't exit early (no goto hit
inside the loop), the iterator 'connection' after the loop will be a
bogus pointer to an invalid structure object containing the HEAD
(&resource->connections). As a result, the use of 'connection' above
will lead to a invalid memory access (including a possible invalid free
as idr_remove could call free_layer).
The original intention should have been to remove all peer_devices,
but the following lines have already done the work. So just remove
this line and the unneeded label, to fix this bug.
drbd: Fix five use after free bugs in get_initial_state
In get_initial_state, it calls notify_initial_state_done(skb,..) if
cb->args[5]==1. If genlmsg_put() failed in notify_initial_state_done(),
the skb will be freed by nlmsg_free(skb).
Then get_initial_state will goto out and the freed skb will be used by
return value skb->len, which is a uaf bug.
What's worse, the same problem goes even further: skb can also be
freed in the notify_*_state_change -> notify_*_state calls below.
Thus 4 additional uaf bugs happened.
My patch lets the problem callee functions: notify_initial_state_done
and notify_*_state_change return an error code if errors happen.
So that the error codes could be propagated and the uaf bugs can be avoid.
v2 reports a compilation warning. This v3 fixed this warning and built
successfully in my local environment with no additional warnings.
v2: https://lore.kernel.org/patchwork/patch/1435218/
Chuck Lever [Fri, 1 Apr 2022 21:08:21 +0000 (17:08 -0400)]
SUNRPC: Fix NFSD's request deferral on RDMA transports
Trond Myklebust reports an NFSD crash in svc_rdma_sendto(). Further
investigation shows that the crash occurred while NFSD was handling
a deferred request.
This patch addresses two inter-related issues that prevent request
deferral from working correctly for RPC/RDMA requests:
1. Prevent the crash by ensuring that the original
svc_rqst::rq_xprt_ctxt value is available when the request is
revisited. Otherwise svc_rdma_sendto() does not have a Receive
context available with which to construct its reply.
2. Possibly since before commit 71641d99ce03 ("svcrdma: Properly
compute .len and .buflen for received RPC Calls"),
svc_rdma_recvfrom() did not include the transport header in the
returned xdr_buf. There should have been no need for svc_defer()
and friends to save and restore that header, as of that commit.
This issue is addressed in a backport-friendly way by simply
having svc_rdma_recvfrom() set rq_xprt_hlen to zero
unconditionally, just as svc_tcp_recvfrom() does. This enables
svc_deferred_recv() to correctly reconstruct an RPC message
received via RPC/RDMA.
Paolo Bonzini [Wed, 6 Apr 2022 17:13:42 +0000 (13:13 -0400)]
KVM: avoid NULL pointer dereference in kvm_dirty_ring_push
kvm_vcpu_release() will call kvm_dirty_ring_free(), freeing
ring->dirty_gfns and setting it to NULL. Afterwards, it calls
kvm_arch_vcpu_destroy().
However, if closing the file descriptor races with KVM_RUN in such away
that vcpu->arch.st.preempted == 0, the following call stack leads to a
NULL pointer dereference in kvm_dirty_run_push():
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
The previous commit fixed support for dual-stack sockets in
bpf_tcp_check_syncookie. This commit adjusts the selftest to verify the
fixed functionality.
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf_tcp_gen_syncookie looks at the IP version in the IP header and
validates the address family of the socket. It supports IPv4 packets in
AF_INET6 dual-stack sockets.
On the other hand, bpf_tcp_check_syncookie looks only at the address
family of the socket, ignoring the real IP version in headers, and
validates only the packet size. This implementation has some drawbacks:
1. Packets are not validated properly, allowing a BPF program to trick
bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4
socket.
2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end
up receiving a SYNACK with the cookie, but the following ACK gets
dropped.
This patch fixes these issues by changing the checks in
bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. IP
version from the header is taken into account, and it is validated
properly with address family.
Alex Deucher [Fri, 1 Apr 2022 15:08:48 +0000 (11:08 -0400)]
drm/amdgpu/smu10: fix SoC/fclk units in auto mode
SMU takes clock limits in Mhz units. socclk and fclk were
using 10 khz units in some cases. Switch to Mhz units.
Fixes higher than required SoC clocks.
drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
[Why]
Below general protection fault observed when WebGL Aquarium is run for
longer duration. If drm debug logs are enabled and set to 0x1f then the
issue is observed within 10 minutes of run.
[How]
It calles populate_dml_pipes which uses doubles to initialize.
Adding FPU protection avoids context switch and probable loss of vba context
as there is potential contention while drm debug logs are enabled.
Shirish S [Fri, 11 Mar 2022 15:00:17 +0000 (20:30 +0530)]
amd/display: set backlight only if required
[Why]
comparing pwm bl values (coverted) with user brightness(converted)
levels in commit_tail leads to continuous setting of backlight via dmub
as they don't to match.
This leads overdrive in queuing of commands to DMCU that sometimes lead
to depending on load on DMCU fw:
"[drm:dc_dmub_srv_wait_idle] *ERROR* Error waiting for DMUB idle: status=3"
[How]
Store last successfully set backlight value and compare with it instead
of pwm reads which is not what we should compare with.
Roman Li [Thu, 17 Mar 2022 23:55:05 +0000 (19:55 -0400)]
drm/amd/display: Fix allocate_mst_payload assert on resume
[Why]
On resume we do link detection for all non-MST connectors.
MST is handled separately. However the condition for telling
if connector is on mst branch is not enough for mst hub case.
Link detection for mst branch link leads to mst topology reset.
That causes assert in dc_link_allocate_mst_payload()
Roman Li [Tue, 15 Mar 2022 20:31:14 +0000 (16:31 -0400)]
drm/amd/display: Enable power gating before init_pipes
[Why]
In init_hw() we call init_pipes() before enabling power gating.
init_pipes() tries to power gate dsc but it may fail because
required force-ons are not released yet.
As a result with dsc config the following errors observed on resume:
"REG_WAIT timeout 1us * 1000 tries - dcn20_dsc_pg_control"
"REG_WAIT timeout 1us * 1000 tries - dcn20_dpp_pg_control"
"REG_WAIT timeout 1us * 1000 tries - dcn20_hubp_pg_control"
[How]
Move enable_power_gating_plane() before init_pipes() in init_hw()