Merge tag 'nfsd-5.14' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
- add tracepoints for callbacks and for client creation and destruction
- cache the mounts used for server-to-server copies
- expose callback information in /proc/fs/nfsd/clients/*/info
- don't hold locks unnecessarily while waiting for commits
- update NLM to use xdr_stream, as we have for NFSv2/v3/v4
* tag 'nfsd-5.14' of git://linux-nfs.org/~bfields/linux: (69 commits)
nfsd: fix NULL dereference in nfs3svc_encode_getaclres
NFSD: Prevent a possible oops in the nfs_dirent() tracepoint
nfsd: remove redundant assignment to pointer 'this'
nfsd: Reduce contention for the nfsd_file nf_rwsem
lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream
lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream
lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream
lockd: Update the NLMv4 void results encoder to use struct xdr_stream
lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream
lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream
lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream
lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream
lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream
...
Colin Ian King [Tue, 6 Jul 2021 15:11:32 +0000 (16:11 +0100)]
pwm: Remove redundant assignment to pointer pwm
The pointer pwm is being initialized with a value that is never read and
it is being updated later with a new value. The initialization is
redundant and can be removed.
Merge tag 'modules-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:
- Fix incorrect logic in module_kallsyms_on_each_symbol()
- Fix for a Coccinelle warning
* tag 'modules-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: correctly exit module_kallsyms_on_each_symbol when fn() != 0
kernel/module: Use BUG_ON instead of if condition followed by BUG
Merge branches 'acpi-misc', 'acpi-video' and 'acpi-prm'
* acpi-misc:
ACPI: AMBA: Fix resource name in /proc/iomem
* acpi-video:
ACPI: video: Add quirk for the Dell Vostro 3350
* acpi-prm:
ACPI: Do not singal PRM support if not enabled
ACPI: Correct \_SB._OSC bit definition for PRM
ACPI: Kconfig: Provide help text for the ACPI_PRMT option
Merge tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu updates from Thomas Gleixner:
"Fixes and improvements for FPU handling on x86:
- Prevent sigaltstack out of bounds writes.
The kernel unconditionally writes the FPU state to the alternate
stack without checking whether the stack is large enough to
accomodate it.
Check the alternate stack size before doing so and in case it's too
small force a SIGSEGV instead of silently corrupting user space
data.
- MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never
been updated despite the fact that the FPU state which is stored on
the signal stack has grown over time which causes trouble in the
field when AVX512 is available on a CPU. The kernel does not expose
the minimum requirements for the alternate stack size depending on
the available and enabled CPU features.
ARM already added an aux vector AT_MINSIGSTKSZ for the same reason.
Add it to x86 as well.
- A major cleanup of the x86 FPU code. The recent discoveries of
XSTATE related issues unearthed quite some inconsistencies,
duplicated code and other issues.
The fine granular overhaul addresses this, makes the code more
robust and maintainable, which allows to integrate upcoming XSTATE
related features in sane ways"
* tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again
x86/fpu/signal: Let xrstor handle the features to init
x86/fpu/signal: Handle #PF in the direct restore path
x86/fpu: Return proper error codes from user access functions
x86/fpu/signal: Split out the direct restore code
x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()
x86/fpu/signal: Sanitize the xstate check on sigframe
x86/fpu/signal: Remove the legacy alignment check
x86/fpu/signal: Move initial checks into fpu__restore_sig()
x86/fpu: Mark init_fpstate __ro_after_init
x86/pkru: Remove xstate fiddling from write_pkru()
x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate()
x86/fpu: Remove PKRU handling from switch_fpu_finish()
x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
x86/fpu: Hook up PKRU into ptrace()
x86/fpu: Add PKRU storage outside of task XSAVE buffer
x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
...
Merge tag 'for-linus-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"Only two minor patches this time: one cleanup patch and one patch
refreshing a Xen header"
* tag 'for-linus-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: sync include/xen/interface/io/ring.h with Xen's newest version
xen: Use DEVICE_ATTR_*() macro
Merge tag 'hwlock-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc
Pull hwspinlock updates from Bjorn Andersson:
"This adds a driver for the hardware spinlock in Allwinner sun6i"
* tag 'hwlock-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
dt-bindings: hwlock: sun6i: Fix various warnings in binding
hwspinlock: add sun6i hardware spinlock support
dt-bindings: hwlock: add sun6i_hwspinlock
Merge tag 'rproc-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc
Pull remoteproc updates from Bjorn Andersson:
"This adds support for controlling the PRU and R5F clusters on the TI
AM64x, the remote processor in i.MX7ULP, i.MX8MN/P and i.MX8ULP NXP
and the audio, compute and modem remoteprocs in the Qualcomm SC8180x
platform.
It fixes improper ordering of cdev and device creation of the
remoteproc control interface and it fixes resource leaks in the error
handling path of rproc_add() and the Qualcomm modem and wifi
remoteproc drivers.
Lastly it fixes a few build warnings and replace the dummy parameter
passed in the mailbox api of the stm32 driver to something not living
on the stack"
* tag 'rproc-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc: (32 commits)
remoteproc: qcom: pas: Add SC8180X adsp, cdsp and mpss
dt-bindings: remoteproc: qcom: pas: Add SC8180X adsp, cdsp and mpss
remoteproc: imx_rproc: support i.MX8ULP
dt-bindings: remoteproc: imx_rproc: support i.MX8ULP
remoteproc: stm32: fix mbox_send_message call
remoteproc: core: Cleanup device in case of failure
remoteproc: core: Fix cdev remove and rproc del
remoteproc: core: Move validate before device add
remoteproc: core: Move cdev add before device add
remoteproc: pru: Add support for various PRU cores on K3 AM64x SoCs
dt-bindings: remoteproc: pru: Update bindings for K3 AM64x SoCs
remoteproc: qcom_wcnss: Use devm_qcom_smem_state_get()
remoteproc: qcom_q6v5: Use devm_qcom_smem_state_get() to fix missing put()
soc: qcom: smem_state: Add devm_qcom_smem_state_get()
dt-bindings: remoteproc: qcom: pas: Fix indentation warnings
remoteproc: imx-rproc: Fix IMX_REMOTEPROC configuration
remoteproc: imx_rproc: support i.MX8MN/P
remoteproc: imx_rproc: support i.MX7ULP
remoteproc: imx_rproc: make clk optional
remoteproc: imx_rproc: initial support for mutilple start/stop method
...
tracing/histograms: Fix parsing of "sym-offset" modifier
With the addition of simple mathematical operations (plus and minus), the
parsing of the "sym-offset" modifier broke, as it took the '-' part of the
"sym-offset" as a minus, and tried to break it up into a mathematical
operation of "field.sym - offset", in which case it failed to parse
(unless the event had a field called "offset").
Both .sym and .sym-offset modifiers should not be entered into
mathematical calculations anyway. If ".sym-offset" is found in the
modifier, then simply make it not an operation that can be calculated on.
Steve French [Wed, 7 Jul 2021 02:42:08 +0000 (21:42 -0500)]
CIFS: Clarify SMB1 code for delete
Coverity also complains about the way we calculate the offset
(starting from the address of a 4 byte array within the
header structure rather than from the beginning of the struct
plus 4 bytes) for SMB1 SetFileDisposition (which is used to
unlink a file by setting the delete on close flag). This
changeset doesn't change the address but makes it slightly
clearer.
Addresses-Coverity: 711524 ("Out of bounds write") Reviewed-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
Steve French [Wed, 7 Jul 2021 02:27:26 +0000 (21:27 -0500)]
CIFS: Clarify SMB1 code for SetFileSize
Coverity also complains about the way we calculate the offset
(starting from the address of a 4 byte array within the header
structure rather than from the beginning of the struct plus
4 bytes) for setting the file size using SMB1. This changeset
doesn't change the address but makes it slightly clearer.
Addresses-Coverity: 711525 ("Out of bounds write") Reviewed-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs
We run a test that create millions of cgroups and blkgs, and then trigger
blkg_destroy_all(). blkg_destroy_all() will hold spin lock for a long
time in such situation. Thus release the lock when a batch of blkgs are
destroyed.
blkcg_activate_policy() and blkcg_deactivate_policy() might have the
same problem, however, as they are basically only called from module
init/exit paths, let's leave them alone for now.
ALSA: usb-audio: Reduce latency at playback start, take#2
This is another attempt for the reduction of the latency at the start
of a USB audio playback stream. The first attempt in the commit 9ce650a75a3b caused an unexpected regression (a deadlock with pipewire
usage) and was later reverted by the commit 4b820e167bf6. The devils
are always living in details, of course; the cause of the deadlock was
the call of snd_pcm_period_elapsed() inside prepare_playback_urb()
callback. In the original code, this callback is never called from
the stream lock context as it's driven solely from the URB complete
callback. Along with the movement of the URB submission into the
trigger START, this prepare call may be also executed in the stream
lock context, hence it deadlocked with the another lock in
snd_pcm_period_elapsed(). (Note that this happens only conditionally
with a small period size that matches with the URB buffer length,
which was a reason I overlooked during my tests. Also, the problem
wasn't seen in the capture stream because the capture stream handles
the period-elapsed only at retire callback that isn't executed at the
trigger.)
If it were only about avoiding the deadlock, it'd be possible to use
snd_pcm_period_elapsed_under_stream_lock() as a solution. However, in
general, the period elapsed notification must be sent after the actual
stream start, and replacing the call wouldn't satisfy the pattern.
A better option is to delay the notification after the stream start
procedure finished, instead. In the case of USB framework, one of the
fitting place would be the complete callback of the first URB.
So, as a workaround of the deadlock and the order fixes above, in
addition to the re-applying the changes in the commit 9ce650a75a3,
this patch introduces a new flag indicating the delayed period-elapsed
handling and sets it under the possible deadlock condition
(i.e. prepare callback being called before subs->running is set).
Once when the flag is set, the period-elapsed call is handled at a
later URB complete call instead.
As a reference for the original motivation for the low-latency change,
I cite here again:
| USB-audio driver behaves a bit strangely for the playback stream --
| namely, it starts sending silent packets at PCM prepare state while
| the actual data is submitted at first when the trigger START is
| kicked off. This is a workaround for the behavior where URBs are
| processed too quickly at the beginning. That is, if we start
| submitting URBs at trigger START, the first few URBs will be
| immediately completed, and this would result in the immediate
| period-elapsed calls right after the start, which may confuse
| applications.
|
| OTOH, submitting the data after silent URBs would, of course, result
| in a certain delay of the actual data processing, and this is rather
| more serious problem on modern systems, in practice.
|
| This patch tries to revert the workaround and lets the URB
| submission starting at PCM trigger for the playback again. As far
| as I've tested with various backends (native ALSA, PA, JACK, PW), I
| haven't seen any problems (famous last words :)
|
| Note that the capture stream handling needs no such workaround,
| since the capture is driven per received URB.
Adrian Hunter [Thu, 1 Jul 2021 17:51:32 +0000 (20:51 +0300)]
perf intel-pt: Add a config for max loops without consuming a packet
The Intel PT decoder limits the number of unconditional branches (e.g.
jmps) decoded without consuming any trace packets. Generally, a loop
needs a conditional branch which generates a TNT packet, whereas a "ret"
instruction will generate a TIP or TNT packet. So exceeding the limit is
assumed to be a never-ending loop, which can happen if there has been a
decoding error putting the decoder at the wrong place in the code.
Up until now, the limit of 10000 has been enough but some analytic
purposes have been reported to exceed that.
Increase the limit to 100000, and make it configurable via perf config
intel-pt.max-loops. Also amend the "Never-ending loop" message to
mention the configuration entry.
Jin Yao [Thu, 10 Jun 2021 03:45:57 +0000 (11:45 +0800)]
perf stat: Disable the NMI watchdog message on hybrid
If we run a single workload that only runs on big core, there is always
a ugly message about disabling the NMI watchdog because the atom is not
counted.
Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
The events in group usually have to be from the same PMU. Try reorganizing the group.
Now we disable the NMI watchdog message on hybrid, otherwise there
are too many false positives.
Kajol Jain [Mon, 28 Jun 2021 06:23:41 +0000 (11:53 +0530)]
perf script python: Fix buffer size to report iregs in perf script
Commit 48a1f565261d2ab1 ("perf script python: Add more PMU fields to
event handler dict") added functionality to report fields like weight,
iregs, uregs etc via perf report. That commit predefined buffer size to
512 bytes to print those fields.
But in PowerPC, since we added extended regs support in:
068aeea3773a6f4c ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs") d735599a069f6936 ("powerpc/perf: Add extended regs support for power10 platform")
Now iregs can carry more bytes of data and this predefined buffer size
can result to data loss in perf script output.
This patch resolves this issue by making the buffer size dynamic, based
on the number of registers needed to print. It also changes the
regs_map() return type from int to void, as it is not being used by the
set_regs_in_dict(), its only caller.
The install perf_dlfilter.h patch included what seems to be a typo in
the Makefile.perf, which changed the location of the trace link from
'$(DESTDIR_SQ)$(bindir_SQ)/trace' to '$(DESTDIR_SQ)$(dir_SQ)/trace'.
Riccardo Mancini [Mon, 21 Jun 2021 22:21:08 +0000 (00:21 +0200)]
perf top: Fix overflow in elf_sec__is_text()
ASan reports a heap-buffer-overflow in elf_sec__is_text when using perf-top.
The bug is caused by the fact that secstrs is built from runtime_ss, while
shdr is built from syms_ss if shdr.sh_type != SHT_NOBITS. Therefore, they
point to two different ELF files.
This patch renames secstrs to secstrs_run and adds secstrs_sym, so that
the correct secstrs is chosen depending on shdr.sh_type.
$ ASAN_OPTIONS=abort_on_error=1:disable_coredump=0:unmap_shadow_on_exit=1 ./perf top
=================================================================
==363148==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61300009add6 at pc 0x00000049875c bp 0x7f4f56446440 sp 0x7f4f56445bf0
READ of size 1 at 0x61300009add6 thread T6
#0 0x49875b in StrstrCheck(void*, char*, char const*, char const*) (/home/user/linux/tools/perf/perf+0x49875b)
#1 0x4d13a2 in strstr (/home/user/linux/tools/perf/perf+0x4d13a2)
#2 0xacae36 in elf_sec__is_text /home/user/linux/tools/perf/util/symbol-elf.c:176:9
#3 0xac3ec9 in elf_sec__filter /home/user/linux/tools/perf/util/symbol-elf.c:187:9
#4 0xac2c3d in dso__load_sym /home/user/linux/tools/perf/util/symbol-elf.c:1254:20
#5 0x883981 in dso__load /home/user/linux/tools/perf/util/symbol.c:1897:9
#6 0x8e6248 in map__load /home/user/linux/tools/perf/util/map.c:332:7
#7 0x8e66e5 in map__find_symbol /home/user/linux/tools/perf/util/map.c:366:6
#8 0x7f8278 in machine__resolve /home/user/linux/tools/perf/util/event.c:707:13
#9 0x5f3d1a in perf_event__process_sample /home/user/linux/tools/perf/builtin-top.c:773:6
#10 0x5f30e4 in deliver_event /home/user/linux/tools/perf/builtin-top.c:1197:3
#11 0x908a72 in do_flush /home/user/linux/tools/perf/util/ordered-events.c:244:9
#12 0x905fae in __ordered_events__flush /home/user/linux/tools/perf/util/ordered-events.c:323:8
#13 0x9058db in ordered_events__flush /home/user/linux/tools/perf/util/ordered-events.c:341:9
#14 0x5f19b1 in process_thread /home/user/linux/tools/perf/builtin-top.c:1109:7
#15 0x7f4f6a21a298 in start_thread /usr/src/debug/glibc-2.33-16.fc34.x86_64/nptl/pthread_create.c:481:8
#16 0x7f4f697d0352 in clone ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
0x61300009add6 is located 10 bytes to the right of 332-byte region [0x61300009ac80,0x61300009adcc)
allocated by thread T6 here:
#0 0x4f3f7f in malloc (/home/user/linux/tools/perf/perf+0x4f3f7f)
#1 0x7f4f6a0a88d9 (/lib64/libelf.so.1+0xa8d9)
Thread T6 created by T0 here:
#0 0x464856 in pthread_create (/home/user/linux/tools/perf/perf+0x464856)
#1 0x5f06e0 in __cmd_top /home/user/linux/tools/perf/builtin-top.c:1309:6
#2 0x5ef19f in cmd_top /home/user/linux/tools/perf/builtin-top.c:1762:11
#3 0x7b28c0 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#4 0x7b119f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#5 0x7b2423 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#6 0x7b0c19 in main /home/user/linux/tools/perf/perf.c:539:3
#7 0x7f4f696f7b74 in __libc_start_main /usr/src/debug/glibc-2.33-16.fc34.x86_64/csu/../csu/libc-start.c:332:16
SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/user/linux/tools/perf/perf+0x49875b) in StrstrCheck(void*, char*, char const*, char const*)
Shadow bytes around the buggy address:
0x0c268000b560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c268000b570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c268000b580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c268000b590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c268000b5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c268000b5b0: 00 00 00 00 00 00 00 00 00 04[fa]fa fa fa fa fa
0x0c268000b5c0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c268000b5d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c268000b5e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c268000b5f0: 07 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c268000b600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==363148==ABORTING
perf annotate: Fix 's' on source line when disasm is empty
If the disasm is empty, 's' should fail. Instead it seemingly works,
hiding the empty lines and causing an assertion error on the next time
annotate is called (from within perf report).
The problem is caused by a buffer overflow, caused by a wrong exit
condition in annotate_browser__find_next_asm_line, which checks
browser->b.top instead of browser->b.entries.
This patch fixes the issue, making annotate_browser__toggle_source
fail if the disasm is empty (nothing happens to the user).
perf symbol-elf: Decode dynsym even if symtab exists
In Fedora34, libc-2.33.so has both .dynsym and .symtab sections and
most of (not all) symbols moved to .dynsym. In this case, perf only
decode the symbols in .symtab, and perf probe can not list up the
functions in the library.
To fix this issue, decode both .symtab and .dynsym sections.
Without this fix,
-----
$ ./perf probe -x /usr/lib64/libc-2.33.so -F
@plt
@plt
calloc@plt
free@plt
malloc@plt
memalign@plt
realloc@plt
-----
perf probe: Fix debuginfo__new() to enable build-id based debuginfo
Fix debuginfo__new() to set the build-id to dso before
dso__read_binary_type_filename() so that it can find
DSO_BINARY_TYPE__BUILDID_DEBUGINFO debuginfo correctly.
However, this may not change the result, because elfutils (libdwfl) has
its own debuginfo finder. With/without this patch, the perf probe
correctly find the debuginfo file.
This is just a failsafe and keep code's sanity (if you use
dso__read_binary_type_filename(), you must set the build-id to the dso.)
block: fix the problem of io_ticks becoming smaller
On the IO submission path, blk_account_io_start() may interrupt
the system interruption. When the interruption returns, the value
of part->stamp may have been updated by other cores, so the time
value collected before the interruption may be less than part->
stamp. So when this happens, we should do nothing to make io_ticks
more accurate? For kernels less than 5.0, this may cause io_ticks
to become smaller, which in turn may cause abnormal ioutil values.
Zhen Lei [Wed, 7 Jul 2021 07:40:51 +0000 (15:40 +0800)]
ALSA: isa: Fix error return code in snd_cmi8330_probe()
When 'SB_HW_16' check fails, the error code -ENODEV instead of 0 should be
returned, which is the same as that returned when 'WSS_HW_CMI8330' check
fails.
... it's possible to reliably trigger an oops by running:
stress-ng -v --mmap 1 -t 30s
... which results in a NULL pointer dereference in
__split_huge_pmd_locked().
The underlying problem is that commit ff5b4f1ed580c59d left
arch_cmpxchg64_local() defined in terms of cmpxchg_local() rather than
arch_cmpxchg_local(). In <asm-generic/atomic-instrumented.h> we wrap
these with macros which use identically-named variables. When
cmpxchg_local() nests inside cmpxchg64_local(), this casues it to use an
unitialized variable as the pointer, which can be NULL.
This can also be seen in pmdp_establish(), where the compiler can
generate the pointer with a `clr` instruction:
This patch fixes the problem by defining arch_cmpxchg64_local() in terms
of arch_cmpxchg_local(), avoiding potential shadowing, and resulting in
working cmpxchg64_local() and variants, e.g.
J. Bruce Fields [Fri, 2 Jul 2021 00:06:56 +0000 (20:06 -0400)]
nfsd: fix NULL dereference in nfs3svc_encode_getaclres
In error cases the dentry may be NULL.
Before 20798dfe249a, the encoder also checked dentry and
d_really_is_positive(dentry), but that looks like overkill to me--zero
status should be enough to guarantee a positive dentry.
This isn't the first time we've seen an error-case NULL dereference
hidden in the initialization of a local variable in an xdr encoder. But
I went back through the other recent rewrites and didn't spot any
similar bugs.
Reported-by: JianHong Yin <[email protected]> Reviewed-by: Chuck Lever III <[email protected]> Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder...") Signed-off-by: J. Bruce Fields <[email protected]>
Chuck Lever [Fri, 25 Jun 2021 15:12:49 +0000 (11:12 -0400)]
NFSD: Prevent a possible oops in the nfs_dirent() tracepoint
The double copy of the string is a mistake, plus __assign_str()
uses strlen(), which is wrong to do on a string that isn't
guaranteed to be NUL-terminated.
Fixes: 6019ce0742ca ("NFSD: Add a tracepoint to record directory entry encoding") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
Colin Ian King [Thu, 13 May 2021 15:16:39 +0000 (16:16 +0100)]
nfsd: remove redundant assignment to pointer 'this'
The pointer 'this' is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
Trond Myklebust [Thu, 17 Jun 2021 23:26:52 +0000 (19:26 -0400)]
nfsd: Reduce contention for the nfsd_file nf_rwsem
When flushing out the unstable file writes as part of a COMMIT call, try
to perform most of of the data writes and waits outside the semaphore.
This means that if the client is sending the COMMIT as part of a memory
reclaim operation, then it can continue performing I/O, with contention
for the lock occurring only once the data sync is finished.
J. Bruce Fields [Mon, 14 Jun 2021 15:20:49 +0000 (11:20 -0400)]
nfsd: rpc_peeraddr2str needs rcu lock
I'm not even sure cl_xprt can change here, but we're getting "suspicious
RCU usage" warnings, and other rpc_peeraddr2str callers are taking the
rcu lock.
Wei Yongjun [Fri, 4 Jun 2021 10:12:37 +0000 (10:12 +0000)]
NFSD: Fix error return code in nfsd4_interssc_connect()
'status' has been overwritten to 0 after nfsd4_ssc_setup_dul(), this
cause 0 will be return in vfs_kern_mount() error case. Fix to return
nfserr_nodev in this error.
Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
Alexandre Ghiti [Thu, 24 Jun 2021 12:17:21 +0000 (14:17 +0200)]
riscv: Fix PTDUMP output now BPF region moved back to module region
BPF region was moved back to the region below the kernel at the end of
the module region by 3a02764c372c ("riscv: Ensure BPF_JIT_REGION_START
aligned with PMD size"), so reflect this change in kernel page table
output.
Akira Tsukamoto [Wed, 23 Jun 2021 12:40:39 +0000 (21:40 +0900)]
riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
This patch will reduce cpu usage dramatically in kernel space especially
for application which use sys-call with large buffer size, such as
network applications. The main reason behind this is that every
unaligned memory access will raise exceptions and switch between s-mode
and m-mode causing large overhead.
First copy in bytes until reaches the first word aligned boundary in
destination memory address. This is the preparation before the bulk
aligned word copy.
The destination address is aligned now, but oftentimes the source
address is not in an aligned boundary. To reduce the unaligned memory
access, it reads the data from source in aligned boundaries, which will
cause the data to have an offset, and then combines the data in the next
iteration by fixing offset with shifting before writing to destination.
The majority of the improving copy speed comes from this shift copy.
In the lucky situation that the both source and destination address are
on the aligned boundary, perform load and store with register size to
copy the data. Without the unrolling, it will reduce the speed since the
next store instruction for the same register using from the load will
stall the pipeline.
At last, copying the remainder in one byte at a time.
In preparation to enable -Wimplicit-fallthrough for Clang, fix a
warning by explicitly adding a fallthrough; statement.
Notice that this seems to be a Duff device for performance[1]. So,
although the code looks a bit _funny_, I didn't want to refactor
or modify it beyond merely adding a fallthrough marking, which
might be the least disruptive way to fix this issue.
In preparation to enable -Wimplicit-fallthrough for Clang, fix a
warning by explicitly adding a fallthrough; statement.
Notice that this seems to be a Duff device for performance[1]. So,
although the code looks a bit _funny_, I didn't want to refactor
or modify it beyond merely adding a fallthrough marking, which
might be the least disruptive way to fix this issue.
Tong Tiangen [Mon, 21 Jun 2021 03:28:55 +0000 (11:28 +0800)]
riscv: add VMAP_STACK overflow detection
This patch adds stack overflow detection to riscv, usable when
CONFIG_VMAP_STACK=y.
Overflow is detected in kernel exception entry(kernel/entry.S), if the
kernel stack is overflow and been detected, the overflow handler is
invoked on a per-cpu overflow stack. This approach preserves GPRs and
the original exception information.
The overflow detect is performed before any attempt is made to access
the stack and the principle of stack overflow detection: kernel stacks
are aligned to double their size, enabling overflow to be detected with
a single bit test. For example, a 16K stack is aligned to 32K, ensuring
that bit 14 of the SP must be zero. On an overflow (or underflow), this
bit is flipped. Thus, overflow (of less than the size of the stack) can
be detected by testing whether this bit is set.
This gives us a useful error message on stack overflow, as can be
trigger with the LKDTM overflow test:
The code in xcs_resume() probably didn't work as intended. It uses
struct drm_device.irq, which is allocated to 0, but never initialized
by i915 to the device's interrupt number.
Change all calls to synchronize_hardirq() to intel_synchronize_irq(),
which uses the correct interrupt. _hardirq() functions are not needed
in this context.
v5:
* go back to _hardirq() after PCI probe reported wrong
context; add rsp comment
v4:
* switch everything to intel_synchronize_irq() (Daniel)
v3:
* also use intel_synchronize_hardirq() at another callsite
v2:
* wrap irq code in intel_synchronize_hardirq() (Ville)
drm/i915/display/dg1: Correctly map DPLLs during state readout
_DG1_DPCLKA0_CFGCR0 maps between DPLL 0 and 1 with one bit for phy A
and B while _DG1_DPCLKA1_CFGCR0 maps between DPLL 2 and 3 with one
bit for phy C and D.
Reusing _cnl_ddi_get_pll() don't take that into cosideration returing
DPLL 0 and 1 for phy C and D.
That is a regression introduced in the refactor done in
commit 351221ffc5e5 ("drm/i915: Move DDI clock readout to
encoder->get_config()").
While at it also dropping the macros previously used, not reusing it
to improve readability.
Kees Cook [Thu, 17 Jun 2021 21:33:01 +0000 (14:33 -0700)]
drm/i915/display: Do not zero past infoframes.vsc
intel_dp_vsc_sdp_unpack() was using a memset() size (36, struct dp_sdp)
larger than the destination (24, struct drm_dp_vsc_sdp), clobbering
fields in struct intel_crtc_state after infoframes.vsc. Use the actual
target size for the memset().
Merge tag 'kgdb-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
"This was a extremely quiet cycle for kgdb. This consists of two
patches that between them address spelling errors and a switch
fallthrough warning"
* tag 'kgdb-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kgdb: Fix fall-through warning for Clang
kgdb: Fix spelling mistakes
Merge tag 'fuse-update-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Fixes for virtiofs submounts
- Misc fixes and cleanups
* tag 'fuse-update-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
virtiofs: Fix spelling mistakes
fuse: use DIV_ROUND_UP helper macro for calculations
fuse: fix illegal access to inode with reused nodeid
fuse: allow fallocate(FALLOC_FL_ZERO_RANGE)
fuse: Make fuse_fill_super_submount() static
fuse: Switch to fc_mount() for submounts
fuse: Call vfs_get_tree() for submounts
fuse: add dedicated filesystem context ops for submounts
virtiofs: propagate sync() to file server
fuse: reject internal errno
fuse: check connected before queueing on fpq->io
fuse: ignore PG_workingset after stealing
fuse: Fix infinite loop in sget_fc()
fuse: Fix crash if superblock of submount gets killed early
fuse: Fix crash in fuse_dentry_automount() error path
Merge tag 'for-linus-5.14-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"A read-ahead adjustment and a fix.
The readahead adjustment was suggested by Matthew Wilcox and looks
like how I should have written it in the first place... the "df fix"
was suggested by Walt Ligon, some Orangefs users have been complaining
about whacky df output..."
* tag 'for-linus-5.14-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: fix orangefs df output.
orangefs: readahead adjustment
Merge tag 'exfat-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Improved compatibility issue with exfat from some camera vendors.
- Do not need to release root inode on error path.
* tag 'exfat-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: handle wrong stream entry size in exfat_readdir()
exfat: avoid incorrectly releasing for root inode