Rodrigo Vivi [Wed, 11 Nov 2020 14:09:36 +0000 (09:09 -0500)]
drm/i915/tgl: Fix Media power gate sequence.
Some media power gates are disabled by default. commit 5d86923060fc
("drm/i915/tgl: Enable VD HCP/MFX sub-pipe power gating")
tried to enable it, but it duplicated an existent register.
So, the main PG setup sequences ended up overwriting it.
So, let's now merge this to the main PG setup sequence.
v2: (Chris): s/BIT/REG_BIT, remove useless comment,
remove useless =0, use the right gt,
remove rc6 sequence doubt from commit message.
Linus Torvalds [Mon, 16 Nov 2020 23:02:33 +0000 (15:02 -0800)]
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V fix from Wei Liu:
"One patch from Chris to fix kexec on Hyper-V"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Allow cleanup of VMBUS_CONNECT_CPU if disconnected
Linus Torvalds [Mon, 16 Nov 2020 22:58:23 +0000 (14:58 -0800)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost fixes from Michael Tsirkin:
"Fixes all over the place, most notably vhost scsi IO error fixes"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost scsi: Add support for LUN resets.
vhost scsi: add lun parser helper
vhost scsi: fix cmd completion race
vhost scsi: alloc cmds per vq instead of session
vhost: add helper to check if a vq has been setup
vdpasim: fix "mac_pton" undefined error
swiotlb: using SIZE_MAX needs limits.h included
net: lantiq: Wait for the GPHY firmware to be ready
A user reports (slightly shortened from the original message):
libphy: lantiq,xrx200-mdio: probed
mdio_bus 1e108000.switch-mii: MDIO device at address 17 is missing.
gswip 1e108000.switch lan: no phy at 2
gswip 1e108000.switch lan: failed to connect to port 2: -19
lantiq,xrx200-net 1e10b308.eth eth0: error -19 setting up slave phy
This is a single-port board using the internal Fast Ethernet PHY. The
user reports that switching to PHY scanning instead of configuring the
PHY within device-tree works around this issue.
The documentation for the standalone variant of the PHY11G (which is
probably very similar to what is used inside the xRX200 SoCs but having
the firmware burnt onto that standalone chip in the factory) states that
the PHY needs 300ms to be ready for MDIO communication after releasing
the reset.
Add a 300ms delay after initializing all GPHYs to ensure that the GPHY
firmware had enough time to initialize and to appear on the MDIO bus.
Unfortunately there is no (known) documentation on what the minimum time
to wait after releasing the reset on an internal PHY so play safe and
take the one for the external variant. Only wait after the last GPHY
firmware is loaded to not slow down the initialization too much (
xRX200 has two GPHYs but newer SoCs have at least three GPHYs).
Jens Axboe [Mon, 16 Nov 2020 20:36:24 +0000 (13:36 -0700)]
mm: never attempt async page lock if we've transferred data already
We catch the case where we enter generic_file_buffered_read() with data
already transferred, but we also need to be careful not to allow an async
page lock if we're looping transferring data. If not, we could be
returning -EIOCBQUEUED instead of the transferred amount, and it could
result in double waitqueue additions as well.
Cc: [email protected] # v5.9 Fixes: 1a0a7853b901 ("mm: support async buffered reads in generic_file_buffered_read()") Signed-off-by: Jens Axboe <[email protected]>
Cezary Rojewski [Mon, 16 Nov 2020 13:33:29 +0000 (14:33 +0100)]
ASoC: Intel: catpt: Correct clock selection for dai trigger
During stream start DSP firmware requires LPCS disabled as that moment in
time is resource heavy. Currently high-clock is selected on start of
second stream onwards while low-clock is re-selected before stream
actually leaves RESUME state i.e. PAUSE_STREAM call. Fix this by always
updating clock before RESUME_STREAM and directly after PAUSE_STREAM.
Cezary Rojewski [Mon, 16 Nov 2020 13:33:28 +0000 (14:33 +0100)]
ASoC: Intel: catpt: Skip position update for unprepared streams
Playing with very low period sizes may lead to timeouts when awaiting
RESET_STREAM reply for offload streams. This is caused by NOTIFY_POSITION
appearing in the middle of trigger(stop).
Stream is unprepared during trigger(stop) where PAUSE_STREAM IPC gets
invoked. However, all data that is already mixed in DSP firmware's mixer
stream will still be played regardless of the pause. For offload streams,
this means possibility for another NOTIFY_POSITION to process. Keep these
notifications in check by only handling them when stream is in prepared
state.
Aili Yao [Tue, 10 Nov 2020 08:33:34 +0000 (00:33 -0800)]
ACPI, APEI, Fix error return value in apei_map_generic_address()
From commit 6915564dc5a8 ("ACPI: OSL: Change the type of
acpi_os_map_generic_address() return value"),
acpi_os_map_generic_address() will return logical address or NULL
for error, but for ACPI_ADR_SPACE_SYSTEM_IO case, it should be also
return 0 as it's a normal case, but now it will return -ENXIO.
So check it out for such case to avoid einj module initialization
fail.
Xie He [Sat, 14 Nov 2020 11:10:29 +0000 (03:10 -0800)]
MAINTAINERS: Add Martin Schiller as a maintainer for the X.25 stack
Martin Schiller is an active developer and reviewer for the X.25 code.
His company is providing products based on the Linux X.25 stack.
So he is a good candidate for maintainers of the X.25 code.
The original maintainer of the X.25 network layer (Andrew Hendry) has
not sent any email to the netdev mail list since 2013. So he is probably
inactive now.
Georg Kohmann [Wed, 11 Nov 2020 11:50:25 +0000 (12:50 +0100)]
ipv6/netfilter: Discard first fragment not including all headers
Packets are processed even though the first fragment don't include all
headers through the upper layer header. This breaks TAHI IPv6 Core
Conformance Test v6LC.1.3.6.
Referring to RFC8200 SECTION 4.5: "If the first fragment does not include
all headers through an Upper-Layer header, then that fragment should be
discarded and an ICMP Parameter Problem, Code 3, message should be sent to
the source of the fragment, with the Pointer field set to zero."
The fragment needs to be validated the same way it is done in
commit 2efdaaaf883a ("IPv6: reply ICMP error if the first fragment don't
include all headers") for ipv6. Wrap the validation into a common function,
ipv6_frag_thdr_truncated() to check for truncation in the upper layer
header. This validation does not fullfill all aspects of RFC 8200,
section 4.5, but is at the moment sufficient to pass mentioned TAHI test.
In netfilter, utilize the fragment offset returned by find_prev_fhdr() to
let ipv6_frag_thdr_truncated() start it's traverse from the fragment
header.
Return 0 to drop the fragment in the netfilter. This is the same behaviour
as used on other protocol errors in this function, e.g. when
nf_ct_frag6_queue() returns -EPROTO. The Fragment will later be picked up
by ipv6_frag_rcv() in reassembly.c. ipv6_frag_rcv() will then send an
appropriate ICMP Parameter Problem message back to the source.
References commit 2efdaaaf883a ("IPv6: reply ICMP error if the first
fragment don't include all headers")
====================
Fix usage counter leak by adding a general sync ops
In many case, we need to check return value of pm_runtime_get_sync,
but it brings a trouble to the usage counter processing. Many callers
forget to decrease the usage counter when it failed, which could
resulted in reference leak. It has been discussed a lot[0][1]. So we
add a function to deal with the usage counter for better coding and
view. Then, we replace pm_runtime_resume_and_get with it in fec_main.c
to avoid it.
Zhang Qilong [Tue, 10 Nov 2020 09:29:33 +0000 (17:29 +0800)]
net: fec: Fix reference count leak in fec series ops
pm_runtime_get_sync() will increment pm usage at first and it will
resume the device later. If runtime of the device has error or
device is in inaccessible state(or other error state), resume
operation will fail. If we do not call put operation to decrease
the reference, it will result in reference count leak. Moreover,
this device cannot enter the idle state and always stay busy or other
non-idle state later. So we fixed it by replacing it with
pm_runtime_resume_and_get.
Fixes: 8fff755e9f8d0 ("net: fec: Ensure clocks are enabled while using mdio bus") Signed-off-by: Zhang Qilong <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
Zhang Qilong [Tue, 10 Nov 2020 09:29:32 +0000 (17:29 +0800)]
PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter
In many case, we need to check return value of pm_runtime_get_sync, but
it brings a trouble to the usage counter processing. Many callers forget
to decrease the usage counter when it failed, which could resulted in
reference leak. It has been discussed a lot[0][1]. So we add a function
to deal with the usage counter for better coding.
[0]https://lkml.org/lkml/2020/6/14/88
[1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139 Signed-off-by: Zhang Qilong <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
dmatest: dma0chan0-copy0: result #1: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #2: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #3: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #4: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #5: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #6: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #7: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
dmatest: dma0chan0-copy0: result #8: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000
Lukas Bulwahn [Fri, 13 Nov 2020 08:12:48 +0000 (09:12 +0100)]
dmaengine: ioatdma: remove unused function missed during dma_v2 removal
Commit 7f832645d0e5 ("dmaengine: ioatdma: remove ioatdma v2 registration")
missed to remove dca2_tag_map_valid() during its removal. Hence, since
then, dca2_tag_map_valid() is unused and make CC=clang W=1 warns:
drivers/dma/ioat/dca.c:44:19:
warning: unused function 'dca2_tag_map_valid' [-Wunused-function]
So, remove this unused function and get rid of a -Wused-function warning.
Ian Rogers [Fri, 13 Nov 2020 18:20:53 +0000 (10:20 -0800)]
perf test: Avoid an msan warning in a copied stack.
This fix is for a failure that occurred in the DWARF unwind perf test.
Stack unwinders may probe memory when looking for frames.
Memory sanitizer will poison and track uninitialized memory on the
stack, and on the heap if the value is copied to the heap.
This can lead to false memory sanitizer failures for the use of an
uninitialized value.
Avoid this problem by removing the poison on the copied stack.
The full msan failure with track origins looks like:
==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
#1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
#2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#24 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
#1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
#2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
#3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#25 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
#1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
#2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
#3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
#4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#17 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
#0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445
Dave Jiang [Wed, 11 Nov 2020 22:23:46 +0000 (15:23 -0700)]
dmaengine: idxd: fix mapping of portal size
Portal size is 4k. Current code is mapping all 4 portals in a single chunk.
Restrict the mapped portal size to a single portal to ensure that submission
only goes to the intended portal address.
Al Grant [Fri, 13 Nov 2020 20:38:26 +0000 (20:38 +0000)]
perf inject: Fix file corruption due to event deletion
"perf inject" can create corrupt files when synthesizing sample events from AUX
data. This happens when in the input file, the first event (for the AUX data)
has a different sample_type from the second event (generally dummy).
Specifically, they differ in the bits that indicate the standard fields
appended to perf records in the mmap buffer. "perf inject" deletes the first
event and moves up the second event to first position.
The problem is with the synthetic PERF_RECORD_MMAP (etc.) events created
by "perf record".
Since these are synthetic versions of events which are normally produced
by the kernel, they have to have the standard fields appended as
described by sample_type.
"perf record" fills these in with zeroes, including the IDENTIFIER
field; perf readers interpret records with zero IDENTIFIER using the
descriptor for the first event in the file.
Since "perf inject" changes the first event, these synthetic records are
then processed with the wrong value of sample_type, and the perf reader
reads bad data, reports on incorrect length records etc.
Mismatching sample_types are seen with "perf record -e cs_etm//", where the AUX
event has TID|TIME|CPU|IDENTIFIER and the dummy event has TID|TIME|IDENTIFIER.
Perhaps they could be the same, but it isn't normally a problem if they aren't
- perf has no problems reading the file.
The sample_types have to agree on the position of IDENTIFIER, because
that's how perf finds the right event descriptor in the first place, but
they don't normally have to agree on other fields, and perf doesn't
check that they do.
The problem is specific to the way "perf inject" reorganizes the events
and the way synthetic MMAP events are recorded with a zero identifier. A
simple solution is to stop "perf inject" deleting the tracing event.
Committer testing
Removed the now unused 'evsel' variable, update the comment about the
evsel removal not being performed anymore, and apply the patch manually
as it failed with this warning:
warning: Patch sent with format=flowed; space at the end of lines might be lost.
Testing it with:
$ perf bench internals inject-build-id
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 8.543 msec (+- 0.130 msec)
Average time per event: 0.838 usec (+- 0.013 usec)
Average memory usage: 12717 KB (+- 9 KB)
Average build-id-all injection took: 5.710 msec (+- 0.058 msec)
Average time per event: 0.560 usec (+- 0.006 usec)
Average memory usage: 12079 KB (+- 7 KB)
$
Arnd Bergmann [Mon, 16 Nov 2020 16:05:00 +0000 (17:05 +0100)]
Merge tag 'imx-fixes-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.10, round 4:
- Fix MDIO over clocking on vf610-zii-dev-rev-b board to get switch
device work reliably.
- Fix imx50-evk IOMUX for the chip select 1 to use GPIO4_13 instead of
the native CSPI_SSI function.
- Fix voltage for 1.6GHz CPU operating point on i.MX8MM to match
hardware datasheet.
- Fix phy-mode for KSZ9031 PHY on imx6qdl-udoo board.
* tag 'imx-fixes-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx50-evk: Fix the chip select 1 IOMUX
arm64: dts: imx8mm: fix voltage for 1.6GHz CPU operating point
ARM: dts: vf610-zii-dev-rev-b: Fix MDIO over clocking
arm: dts: imx6qdl-udoo: fix rgmii phy-mode for ksz9031 phy
Jakub Kicinski [Mon, 16 Nov 2020 15:34:30 +0000 (07:34 -0800)]
Merge tag 'linux-can-fixes-for-5.10-20201115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2020-11-15
Anant Thazhemadam contributed two patches for the AF_CAN that prevent potential
access of uninitialized member in can_rcv() and canfd_rcv().
The next patch is by Alejandro Concepcion Rodriguez and changes can_restart()
to use the correct function to push a skb into the networking stack from
process context.
Zhang Qilong's patch fixes a memory leak in the error path of the ti_hecc's
probe function.
A patch by me fixes mcba_usb_start_xmit() function in the mcba_usb driver, to
first fill the skb and then pass it to can_put_echo_skb().
Colin Ian King's patch fixes a potential integer overflow on shift in the
peak_usb driver.
The next two patches target the flexcan driver, a patch by me adds the missing
"req_bit" to the stop mode property comment (which was broken during net-next
for v5.10). Zhang Qilong's patch fixes the failure handling of
pm_runtime_get_sync().
The next seven patches target the m_can driver including the tcan4x5x spi
driver glue code. Enric Balletbo i Serra's patch for the tcan4x5x Kconfig fix
the REGMAP_SPI dependency handling. A patch by me for the tcan4x5x driver's
probe() function adds missing error handling to for devm_regmap_init(), and in
tcan4x5x_can_remove() the order of deregistration is fixed. Wu Bo's patch for
the m_can driver fixes the state change handling in
m_can_handle_state_change(). Two patches by Dan Murphy first introduce
m_can_class_free_dev() and then make use of it to fix the freeing of the can
device. A patch by Faiz Abbas add a missing shutdown of the CAN controller in
the m_can_stop() function.
* tag 'linux-can-fixes-for-5.10-20201115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: m_can: m_can_stop(): set device to software init mode before closing
can: m_can: Fix freeing of can device from peripherials
can: m_can: m_can_class_free_dev(): introduce new function
can: m_can: m_can_handle_state_change(): fix state change
can: tcan4x5x: tcan4x5x_can_remove(): fix order of deregistration
can: tcan4x5x: tcan4x5x_can_probe(): add missing error checking for devm_regmap_init()
can: tcan4x5x: replace depends on REGMAP_SPI with depends on SPI
can: flexcan: fix failure handling of pm_runtime_get_sync()
can: flexcan: flexcan_setup_stop_mode(): add missing "req_bit" to stop mode property comment
can: peak_usb: fix potential integer overflow on shift of a int
can: mcba_usb: mcba_usb_start_xmit(): first fill skb, then pass to can_put_echo_skb()
can: ti_hecc: Fix memleak in ti_hecc_probe
can: dev: can_restart(): post buffer from the right context
can: af_can: prevent potential access of uninitialized member in canfd_rcv()
can: af_can: prevent potential access of uninitialized member in can_rcv()
====================
Stefan Haberland [Mon, 16 Nov 2020 15:23:47 +0000 (16:23 +0100)]
s390/dasd: fix null pointer dereference for ERP requests
When requeueing all requests on the device request queue to the blocklayer
we might get to an ERP (error recovery) request that is a copy of an
original CQR.
Those requests do not have blocklayer request information or a pointer to
the dasd_queue set. When trying to access those data it will lead to a
null pointer dereference in dasd_requeue_all_requests().
Fix by checking if the request is an ERP request that can simply be
ignored. The blocklayer request will be requeued by the original CQR that
is on the device queue right behind the ERP request.
Thomas Gleixner [Mon, 16 Nov 2020 12:52:47 +0000 (13:52 +0100)]
iommu/vt-d: Take CONFIG_PCI_ATS into account
pci_dev::physfn is only available when CONFIG_PCI_ATS is set. The recent
fix for the irqdomain rework missed that dependency which makes the build
fail when CONFIG_PCI_ATS=n.
Dmitry Osipenko [Wed, 4 Nov 2020 13:21:26 +0000 (16:21 +0300)]
cpuidle: tegra: Annotate tegra_pm_set_cpu_in_lp2() with RCU_NONIDLE
Annotate tegra_pm_set[clear]_cpu_in_lp2() with RCU_NONIDLE in order to
fix lockdep warning about suspicious RCU usage of a spinlock during late
idling phase.
WARNING: suspicious RCU usage
...
include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage!
...
(dump_stack) from (lock_acquire)
(lock_acquire) from (_raw_spin_lock)
(_raw_spin_lock) from (tegra_pm_set_cpu_in_lp2)
(tegra_pm_set_cpu_in_lp2) from (tegra_cpuidle_enter)
(tegra_cpuidle_enter) from (cpuidle_enter_state)
(cpuidle_enter_state) from (cpuidle_enter_state_coupled)
(cpuidle_enter_state_coupled) from (cpuidle_enter)
(cpuidle_enter) from (do_idle)
...
Max Filippov [Mon, 16 Nov 2020 09:38:59 +0000 (01:38 -0800)]
xtensa: disable preemption around cache alias management calls
Although cache alias management calls set up and tear down TLB entries
and fast_second_level_miss is able to restore TLB entry should it be
evicted they absolutely cannot preempt each other because they use the
same TLBTEMP area for different purposes.
Disable preemption around all cache alias management calls to enforce
that.
Max Filippov [Mon, 16 Nov 2020 09:25:56 +0000 (01:25 -0800)]
xtensa: fix TLBTEMP area placement
fast_second_level_miss handler for the TLBTEMP area has an assumption
that page table directory entry for the TLBTEMP address range is 0. For
it to be true the TLBTEMP area must be aligned to 4MB boundary and not
share its 4MB region with anything that may use a page table. This is
not true currently: TLBTEMP shares space with vmalloc space which
results in the following kinds of runtime errors when
fast_second_level_miss loads page table directory entry for the vmalloc
space instead of fixing up the TLBTEMP area:
Mike Christie [Tue, 10 Nov 2020 05:33:23 +0000 (23:33 -0600)]
vhost scsi: Add support for LUN resets.
In newer versions of virtio-scsi we just reset the timer when an a
command times out, so TMFs are never sent for the cmd time out case.
However, in older kernels and for the TMF inject cases, we can still get
resets and we end up just failing immediately so the guest might see the
device get offlined and IO errors.
For the older kernel cases, we want the same end result as the
modern virtio-scsi driver where we let the lower levels fire their error
handling and handle the problem. And at the upper levels we want to
wait. This patch ties the LUN reset handling into the LIO TMF code which
will just wait for outstanding commands to complete like we are doing in
the modern virtio-scsi case.
Note: I did not handle the ABORT case to keep this simple. For ABORTs
LIO just waits on the cmd like how it does for the RESET case. If
an ABORT fails, the guest OS ends up escalating to LUN RESET, so in
the end we get the same behavior where we wait on the outstanding
cmds.
Mike Christie [Tue, 10 Nov 2020 05:33:21 +0000 (23:33 -0600)]
vhost scsi: fix cmd completion race
We might not do the final se_cmd put from vhost_scsi_complete_cmd_work.
When the last put happens a little later then we could race where
vhost_scsi_complete_cmd_work does vhost_signal, the guest runs and sends
more IO, and vhost_scsi_handle_vq runs but does not find any free cmds.
This patch has us delay completing the cmd until the last lio core ref
is dropped. We then know that once we signal to the guest that the cmd
is completed that if it queues a new command it will find a free cmd.
Mike Christie [Tue, 10 Nov 2020 05:33:20 +0000 (23:33 -0600)]
vhost scsi: alloc cmds per vq instead of session
We currently are limited to 256 cmds per session. This leads to problems
where if the user has increased virtqueue_size to more than 2 or
cmd_per_lun to more than 256 vhost_scsi_get_tag can fail and the guest
will get IO errors.
This patch moves the cmd allocation to per vq so we can easily match
whatever the user has specified for num_queues and
virtqueue_size/cmd_per_lun. It also makes it easier to control how much
memory we preallocate. For cases, where perf is not as important and
we can use the current defaults (1 vq and 128 cmds per vq) memory use
from preallocate cmds is cut in half. For cases, where we are willing
to use more memory for higher perf, cmd mem use will now increase as
the num queues and queue depth increases.
Mike Christie [Tue, 10 Nov 2020 05:33:19 +0000 (23:33 -0600)]
vhost: add helper to check if a vq has been setup
This adds a helper check if a vq has been setup. The next patches
will use this when we move the vhost scsi cmd preallocation from per
session to per vq. In the per vq case, we only want to allocate cmds
for vqs that have actually been setup and not for all the possible
vqs.
Linus Torvalds [Sun, 15 Nov 2020 21:07:36 +0000 (13:07 -0800)]
Merge tag 'drm-fixes-2020-11-16' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Nouveau fixes:
- atomic modesetting regression fix
- ttm pre-nv50 fix
- connector NULL ptr deref fix"
* tag 'drm-fixes-2020-11-16' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/kms/nv50-: Use atomic encoder callbacks everywhere
drm/nouveau/ttm: avoid using nouveau_drm.ttm.type_vram prior to nv50
drm/nouveau/kms: Fix NULL pointer dereference in nouveau_connector_detect_depth
Linus Torvalds [Sun, 15 Nov 2020 18:15:17 +0000 (10:15 -0800)]
Merge tag 'char-misc-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc/whatever driver fixes for 5.10-rc4.
Nothing huge, lots of small fixes for reported issues:
- habanalabs driver fixes
- speakup driver fixes
- uio driver fixes
- virtio driver fix
- other tiny driver fixes
Full details are in the shortlog.
All of these have been in linux-next for a full week with no reported
issues"
* tag 'char-misc-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
uio: Fix use-after-free in uio_unregister_device()
firmware: xilinx: fix out-of-bounds access
nitro_enclaves: Fixup type and simplify logic of the poll mask setup
speakup ttyio: Do not schedule() in ttyio_in_nowait
speakup: Fix clearing selection in safe context
speakup: Fix var_id_t values and thus keymap
virtio: virtio_console: fix DMA memory allocation for rproc serial
habanalabs/gaudi: mask WDT error in QMAN
habanalabs/gaudi: move coresight mmu config
habanalabs: fix kernel pointer type
mei: protect mei_cl_mtu from null dereference
Linus Torvalds [Sun, 15 Nov 2020 18:02:41 +0000 (10:02 -0800)]
Merge tag 'usb-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and Thunderbolt fixes from Greg KH:
"Here are some small Thunderbolt and USB driver fixes for 5.10-rc4 to
solve some reported issues.
Nothing huge in here, just small things:
- thunderbolt memory leaks fixed and new device ids added
- revert of problem patch for the musb driver
- new quirks added for USB devices
- typec power supply fixes to resolve much reported problems about
charging notifications not working anymore
All except the cdc-acm driver quirk addition have been in linux-next
with no reported issues (the quirk patch was applied on Friday, and is
self-contained)"
* tag 'usb-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
MAINTAINERS: add usb raw gadget entry
usb: typec: ucsi: Report power supply changes
xhci: hisilicon: fix refercence leak in xhci_histb_probe
Revert "usb: musb: convert to devm_platform_ioremap_resource_byname"
thunderbolt: Add support for Intel Tiger Lake-H
thunderbolt: Only configure USB4 wake for lane 0 adapters
thunderbolt: Add uaccess dependency to debugfs interface
thunderbolt: Fix memory leak if ida_simple_get() fails in enumerate_services()
thunderbolt: Add the missed ida_simple_remove() in ring_request_msix()
Linus Torvalds [Sun, 15 Nov 2020 17:57:58 +0000 (09:57 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Fixes for ARM and x86, the latter especially for old processors
without two-dimensional paging (EPT/NPT)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use
KVM: SVM: Update cr3_lm_rsvd_bits for AMD SEV guests
KVM: x86: Introduce cr3_lm_rsvd_bits in kvm_vcpu_arch
KVM: x86: clflushopt should be treated as a no-op by emulation
KVM: arm64: Handle SCXTNUM_ELx traps
KVM: arm64: Unify trap handlers injecting an UNDEF
KVM: arm64: Allow setting of ID_AA64PFR0_EL1.CSV2 from userspace
Linus Torvalds [Sun, 15 Nov 2020 17:49:56 +0000 (09:49 -0800)]
Merge tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Cure the fallout from the MSI irqdomain overhaul which missed that
the Intel IOMMU does not register virtual function devices and
therefore never reaches the point where the MSI interrupt domain is
assigned. This made the VF devices use the non-remapped MSI domain
which is trapped by the IOMMU/remap unit
- Remove an extra space in the SGI_UV architecture type procfs output
for UV5
- Remove a unused function which was missed when removing the UV BAU
TLB shootdown handler"
* tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
iommu/vt-d: Cure VF irqdomain hickup
x86/platform/uv: Fix copied UV5 output archtype
x86/platform/uv: Drop last traces of uv_flush_tlb_others
Linus Torvalds [Sun, 15 Nov 2020 17:46:36 +0000 (09:46 -0800)]
Merge tag 'perf-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A set of fixes for perf:
- A set of commits which reduce the stack usage of various perf
event handling functions which allocated large data structs on
stack causing stack overflows in the worst case
- Use the proper mechanism for detecting soft interrupts in the
recursion protection
- Make the resursion protection simpler and more robust
- Simplify the scheduling of event groups to make the code more
robust and prepare for fixing the issues vs. scheduling of
exclusive event groups
- Prevent event multiplexing and rotation for exclusive event groups
- Correct the perf event attribute exclusive semantics to take
pinned events, e.g. the PMU watchdog, into account
- Make the anythread filtering conditional for Intel's generic PMU
counters as it is not longer guaranteed to be supported on newer
CPUs. Check the corresponding CPUID leaf to make sure
- Fixup a duplicate initialization in an array which was probably
caused by the usual 'copy & paste - forgot to edit' mishap"
* tag 'perf-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Fix Add BW copypasta
perf/x86/intel: Make anythread filter support conditional
perf: Tweak perf_event_attr::exclusive semantics
perf: Fix event multiplexing for exclusive groups
perf: Simplify group_sched_in()
perf: Simplify group_sched_out()
perf/x86: Make dummy_iregs static
perf/arch: Remove perf_sample_data::regs_user_copy
perf: Optimize get_recursion_context()
perf: Fix get_recursion_context()
perf/x86: Reduce stack usage for x86_pmu::drain_pebs()
perf: Reduce stack usage of perf_output_begin()
Linus Torvalds [Sun, 15 Nov 2020 17:39:35 +0000 (09:39 -0800)]
Merge tag 'sched-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"A set of scheduler fixes:
- Address a load balancer regression by making the load balancer use
the same logic as the wakeup path to spread tasks in the LLC domain
- Prefer the CPU on which a task run last over the local CPU in the
fast wakeup path for asymmetric CPU capacity systems to align with
the symmetric case. This ensures more locality and prevents massive
migration overhead on those asymetric systems
- Fix a memory corruption bug in the scheduler debug code caused by
handing a modified buffer pointer to kfree()"
* tag 'sched-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Fix memory corruption caused by multiple small reads of flags
sched/fair: Prefer prev cpu in asymmetric wakeup path
sched/fair: Ensure tasks spreading in LLC during LB
Faiz Abbas [Tue, 25 Aug 2020 05:54:42 +0000 (11:24 +0530)]
can: m_can: m_can_stop(): set device to software init mode before closing
There might be some requests pending in the buffer when the interface close
sequence occurs. In some devices, these pending requests might lead to the
module not shutting down properly when m_can_clk_stop() is called.
Therefore, move the device to init state before potentially powering it down.
Linus Torvalds [Sun, 15 Nov 2020 17:25:43 +0000 (09:25 -0800)]
Merge tag 'locking-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"Two fixes for the locking subsystem:
- Prevent an unconditional interrupt enable in a futex helper
function which can be called from contexts which expect interrupts
to stay disabled across the call
- Don't modify lockdep chain keys in the validation process as that
causes chain inconsistency"
* tag 'locking-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Avoid to modify chain keys in validate_chain()
futex: Don't enable IRQs unconditionally in put_pi_state()
can: tcan4x5x: replace depends on REGMAP_SPI with depends on SPI
regmap is a library function that gets selected by drivers that need it. No
driver modules should depend on it. Instead depends on SPI and select
REGMAP_SPI. Depending on REGMAP_SPI makes this driver only build if another
driver already selected REGMAP_SPI, as the symbol can't be selected through the
menu kernel configuration.
Zhang Qilong [Sun, 8 Nov 2020 08:30:00 +0000 (16:30 +0800)]
can: flexcan: fix failure handling of pm_runtime_get_sync()
pm_runtime_get_sync() will increment pm usage at first and it will resume the
device later. If runtime of the device has error or device is in inaccessible
state(or other error state), resume operation will fail. If we do not call put
operation to decrease the reference, it will result in reference leak in the
two functions flexcan_get_berr_counter() and flexcan_open().
Moreover, this device cannot enter the idle state and always stay busy or other
non-idle state later. So we should fix it through adding
pm_runtime_put_noidle().
Colin Ian King [Thu, 5 Nov 2020 11:24:27 +0000 (11:24 +0000)]
can: peak_usb: fix potential integer overflow on shift of a int
The left shift of int 32 bit integer constant 1 is evaluated using 32 bit
arithmetic and then assigned to a signed 64 bit variable. In the case where
time_ref->adapter->ts_used_bits is 32 or more this can lead to an oveflow.
Avoid this by shifting using the BIT_ULL macro instead.
can: mcba_usb: mcba_usb_start_xmit(): first fill skb, then pass to can_put_echo_skb()
The driver has to first fill the skb with data and then handle it to
can_put_echo_skb(). This patch moves the can_put_echo_skb() down, right before
sending the skb out via USB.
can: dev: can_restart(): post buffer from the right context
netif_rx() is meant to be called from interrupt contexts. can_restart() may be
called by can_restart_work(), which is called from a worqueue, so it may run in
process context. Use netif_rx_ni() instead.
can: af_can: prevent potential access of uninitialized member in canfd_rcv()
In canfd_rcv(), cfd->len is uninitialized when skb->len = 0, and this
uninitialized cfd->len is accessed nonetheless by pr_warn_once().
Fix this uninitialized variable access by checking cfd->len's validity
condition (cfd->len > CANFD_MAX_DLEN) separately after the skb->len's
condition is checked, and appropriately modify the log messages that
are generated as well.
In case either of the required conditions fail, the skb is freed and
NET_RX_DROP is returned, same as before.
can: af_can: prevent potential access of uninitialized member in can_rcv()
In can_rcv(), cfd->len is uninitialized when skb->len = 0, and this
uninitialized cfd->len is accessed nonetheless by pr_warn_once().
Fix this uninitialized variable access by checking cfd->len's validity
condition (cfd->len > CAN_MAX_DLEN) separately after the skb->len's
condition is checked, and appropriately modify the log messages that
are generated as well.
In case either of the required conditions fail, the skb is freed and
NET_RX_DROP is returned, same as before.
Linus Torvalds [Sun, 15 Nov 2020 16:57:19 +0000 (08:57 -0800)]
Merge branch 'for-5.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Pull percpu fix and cleanup from Dennis Zhou:
"A fix for a Wshadow warning in the asm-generic percpu macros came in
and then I tacked on the removal of flexible array initializers in the
percpu allocator"
* 'for-5.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu: convert flexible array initializers to use struct_size()
asm-generic: percpu: avoid Wshadow warning
Paolo Bonzini [Sun, 15 Nov 2020 13:55:43 +0000 (08:55 -0500)]
kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use
In some cases where shadow paging is in use, the root page will
be either mmu->pae_root or vcpu->arch.mmu->lm_root. Then it will
not have an associated struct kvm_mmu_page, because it is allocated
with alloc_page instead of kvm_mmu_alloc_page.
Just return false quickly from is_tdp_mmu_root if the TDP MMU is
not in use, which also includes the case where shadow paging is
enabled.
lan743x: prevent entire kernel HANG on open, for some platforms
On arm imx6, when opening the chip's netdev, the whole Linux
kernel intermittently hangs/freezes.
This is caused by a bug in the driver code which tests if pcie
interrupts are working correctly, using the software interrupt:
1. open: enable the software interrupt
2. open: tell the chip to assert the software interrupt
3. open: wait for flag
4. ISR: acknowledge s/w interrupt, set flag
5. open: notice flag, disable the s/w interrupt, continue
Unfortunately the ISR only acknowledges the s/w interrupt, but
does not disable it. This will re-trigger the ISR in a tight
loop.
On some (lucky) platforms, open proceeds to disable the s/w
interrupt even while the ISR is 'spinning'. On arm imx6,
the spinning ISR does not allow open to proceed, resulting
in a hung Linux kernel.
Fix minimally by disabling the s/w interrupt in the ISR, which
will prevent it from spinning. This won't break anything because
the s/w interrupt is used as a one-shot interrupt.
Note that this is a minimal fix, overlooking many possible
cleanups, e.g.:
- lan743x_intr_software_isr() is completely redundant and reads
INT_STS twice for no apparent reason
- disabling the s/w interrupt in lan743x_intr_test_isr() is now
redundant, but harmless
- waiting on software_isr_flag can be converted from a sleeping
poll loop to wait_event_timeout()
When running this chip on arm imx6, we intermittently observe
the following kernel warning in the log, especially when the
system is under high load:
[ 50.119484] ------------[ cut here ]------------
[ 50.124377] WARNING: CPU: 0 PID: 303 at kernel/softirq.c:169 __local_bh_enable_ip+0x100/0x184
[ 50.132925] IRQs not enabled as expected
[ 50.159250] CPU: 0 PID: 303 Comm: rngd Not tainted 5.7.8 #1
[ 50.164837] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 50.171395] [<c0111a38>] (unwind_backtrace) from [<c010be28>] (show_stack+0x10/0x14)
[ 50.179162] [<c010be28>] (show_stack) from [<c05b9dec>] (dump_stack+0xac/0xd8)
[ 50.186408] [<c05b9dec>] (dump_stack) from [<c0122e40>] (__warn+0xd0/0x10c)
[ 50.193391] [<c0122e40>] (__warn) from [<c0123238>] (warn_slowpath_fmt+0x98/0xc4)
[ 50.200892] [<c0123238>] (warn_slowpath_fmt) from [<c012b010>] (__local_bh_enable_ip+0x100/0x184)
[ 50.209860] [<c012b010>] (__local_bh_enable_ip) from [<bf09ecbc>] (destroy_conntrack+0x48/0xd8 [nf_conntrack])
[ 50.220038] [<bf09ecbc>] (destroy_conntrack [nf_conntrack]) from [<c0ac9b58>] (nf_conntrack_destroy+0x94/0x168)
[ 50.230160] [<c0ac9b58>] (nf_conntrack_destroy) from [<c0a4aaa0>] (skb_release_head_state+0xa0/0xd0)
[ 50.239314] [<c0a4aaa0>] (skb_release_head_state) from [<c0a4aadc>] (skb_release_all+0xc/0x24)
[ 50.247946] [<c0a4aadc>] (skb_release_all) from [<c0a4b4cc>] (consume_skb+0x74/0x17c)
[ 50.255796] [<c0a4b4cc>] (consume_skb) from [<c081a2dc>] (lan743x_tx_release_desc+0x120/0x124)
[ 50.264428] [<c081a2dc>] (lan743x_tx_release_desc) from [<c081a98c>] (lan743x_tx_napi_poll+0x5c/0x18c)
[ 50.273755] [<c081a98c>] (lan743x_tx_napi_poll) from [<c0a6b050>] (net_rx_action+0x118/0x4a4)
[ 50.282306] [<c0a6b050>] (net_rx_action) from [<c0101364>] (__do_softirq+0x13c/0x53c)
[ 50.290157] [<c0101364>] (__do_softirq) from [<c012b29c>] (irq_exit+0x150/0x17c)
[ 50.297575] [<c012b29c>] (irq_exit) from [<c0196a08>] (__handle_domain_irq+0x60/0xb0)
[ 50.305423] [<c0196a08>] (__handle_domain_irq) from [<c05d44fc>] (gic_handle_irq+0x4c/0x90)
[ 50.313790] [<c05d44fc>] (gic_handle_irq) from [<c0100ed4>] (__irq_usr+0x54/0x80)
[ 50.321287] Exception stack(0xecd99fb0 to 0xecd99ff8)
[ 50.326355] 9fa0: 1cf1aa74000000010000000100000000
[ 50.334547] 9fc0: 00000001000000000000000000000000000000000000000000004097b6d17d14
[ 50.342738] 9fe0: 00000001b6d17c6000000000b6e71f94800b0010ffffffff
[ 50.349364] irq event stamp: 2525027
[ 50.352955] hardirqs last enabled at (2525026): [<c0a6afec>] net_rx_action+0xb4/0x4a4
[ 50.360892] hardirqs last disabled at (2525027): [<c0d6d2fc>] _raw_spin_lock_irqsave+0x1c/0x50
[ 50.369517] softirqs last enabled at (2524660): [<c01015b4>] __do_softirq+0x38c/0x53c
[ 50.377446] softirqs last disabled at (2524693): [<c012b29c>] irq_exit+0x150/0x17c
[ 50.385027] ---[ end trace c0b571db4bc8087d ]---
The driver is calling dev_kfree_skb() from code inside a spinlock,
where h/w interrupts are disabled. This is forbidden, as documented
in include/linux/netdevice.h. The correct function to use
dev_kfree_skb_irq(), or dev_kfree_skb_any().
Fix by using the correct dev_kfree_skb_xxx() functions:
in lan743x_tx_release_desc():
called by lan743x_tx_release_completed_descriptors()
called by in lan743x_tx_napi_poll()
which holds a spinlock
called by lan743x_tx_release_all_descriptors()
called by lan743x_tx_close()
which can-sleep
conclusion: use dev_kfree_skb_any()
in lan743x_tx_xmit_frame():
which holds a spinlock
conclusion: use dev_kfree_skb_irq()
in lan743x_tx_close():
which can-sleep
conclusion: use dev_kfree_skb()
in lan743x_rx_release_ring_element():
called by lan743x_rx_close()
which can-sleep
called by lan743x_rx_open()
which can-sleep
conclusion: use dev_kfree_skb()
Linus Torvalds [Sat, 14 Nov 2020 20:35:11 +0000 (12:35 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"14 patches.
Subsystems affected by this patch series: mm (migration, vmscan, slub,
gup, memcg, hugetlbfs), mailmap, kbuild, reboot, watchdog, panic, and
ocfs2"
* emailed patches from Andrew Morton <[email protected]>:
ocfs2: initialize ip_next_orphan
panic: don't dump stack twice on warn
hugetlbfs: fix anon huge page migration race
mm: memcontrol: fix missing wakeup polling thread
kernel/watchdog: fix watchdog_allowed_mask not used warning
reboot: fix overflow parsing reboot cpu number
Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
compiler.h: fix barrier_data() on clang
mm/gup: use unpin_user_pages() in __gup_longterm_locked()
mm/slub: fix panic in slab_alloc_node()
mailmap: fix entry for Dmitry Baryshkov/Eremin-Solenikov
mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
mm/compaction: count pages and stop correctly during page isolation
Linus Torvalds [Sat, 14 Nov 2020 20:30:18 +0000 (12:30 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Two small clk driver fixes:
- Make to_clk_regmap() inline to avoid compiler annoyance
- Fix critical clks on i.MX imx8m SoCs"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: imx8m: fix bus critical clk registration
clk: define to_clk_regmap() as inline function
Linus Torvalds [Sat, 14 Nov 2020 20:25:39 +0000 (12:25 -0800)]
Merge tag 'hwmon-for-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix potential bufer overflow in pmbus/max20730 driver
- Fix locking issue in pmbus core
- Fix regression causing timeouts in applesmc driver
- Fix RPM calculation in pwm-fan driver
- Restrict counter visibility in amd_energy driver
* tag 'hwmon-for-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (amd_energy) modify the visibility of the counters
hwmon: (applesmc) Re-work SMC comms
hwmon: (pwm-fan) Fix RPM calculation
hwmon: (pmbus) Add mutex locking for sysfs reads
hwmon: (pmbus/max20730) use scnprintf() instead of snprintf()
Linus Torvalds [Sat, 14 Nov 2020 20:19:21 +0000 (12:19 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small fixes, all in the embedded ufs driver subsystem"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufshcd: Fix missing destroy_workqueue()
scsi: ufs: Try to save power mode change and UIC cmd completion timeout
scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()
Linus Torvalds [Sat, 14 Nov 2020 20:04:02 +0000 (12:04 -0800)]
Merge tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small SELinux patch to make sure we return an error code when an
allocation fails. It passes all of our tests, but given the nature of
the patch that isn't surprising"
* tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: Fix error return code in sel_ib_pkey_sid_slow()
This is because it holds asoc when transport->proto_unreach_timer starts
and puts asoc when the timer stops, and without holding transport the
transport could be freed when the timer is still running.
So fix it by holding/putting transport instead for proto_unreach_timer
in transport, just like other timers in transport.
v1->v2:
- Also use sctp_transport_put() for the "out_unlock:" path in
sctp_generate_proto_unreach_event(), as Marcelo noticed.
David Howells [Sat, 14 Nov 2020 17:27:57 +0000 (17:27 +0000)]
afs: Fix afs_write_end() when called with copied == 0 [ver #3]
When afs_write_end() is called with copied == 0, it tries to set the
dirty region, but there's no way to actually encode a 0-length region in
the encoding in page->private.
"0,0", for example, indicates a 1-byte region at offset 0. The maths
miscalculates this and sets it incorrectly.
Fix it to just do nothing but unlock and put the page in this case. We
don't actually need to mark the page dirty as nothing presumably
changed.
Fixes: 65dd2d6072d3 ("afs: Alter dirty range encoding in page->private") Signed-off-by: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
vsock: forward all packets to the host when no H2G is registered
Before commit c0cfa2d8a788 ("vsock: add multi-transports support"),
if a G2H transport was loaded (e.g. virtio transport), every packets
was forwarded to the host, regardless of the destination CID.
The H2G transports implemented until then (vhost-vsock, VMCI) always
responded with an error, if the destination CID was not
VMADDR_CID_HOST.
From that commit, we are using the remote CID to decide which
transport to use, so packets with remote CID > VMADDR_CID_HOST(2)
are sent only through H2G transport. If no H2G is available, packets
are discarded directly in the guest.
Some use cases (e.g. Nitro Enclaves [1]) rely on the old behaviour
to implement sibling VMs communication, so we restore the old
behavior when no H2G is registered.
It will be up to the host to discard packets if the destination is
not the right one. As it was already implemented before adding
multi-transport support.
Tested with nested QEMU/KVM by me and Nitro Enclaves by Andra.
As soon as you add the second port to a VLAN, all other port
membership configuration is overwritten with zeroes. The HW interprets
this as all ports being "unmodified members" of the VLAN.
In the simple case when all ports belong to the same VLAN, switching
will still work. But using multiple VLANs or trying to set multiple
ports as tagged members will not work.
On the 6352, doing a VTU GetNext op, followed by an STU GetNext op
will leave you with both the member- and state- data in the VTU/STU
data registers. But on the 6097 (which uses the same implementation),
the STU GetNext will override the information gathered from the VTU
GetNext.
Separate the two stages, parsing the result of the VTU GetNext before
doing the STU GetNext.
We opt to update the existing implementation for all applicable chips,
as opposed to creating a separate callback for 6097, because although
the previous implementation did work for (at least) 6352, the
datasheet does not mention the masking behavior.
The above stack is not reasonable, the final iput shouldn't happen in
ocfs2_orphan_filldir() function. Looking at the code,
2067 /* Skip inodes which are already added to recover list, since dio may
2068 * happen concurrently with unlink/rename */
2069 if (OCFS2_I(iter)->ip_next_orphan) {
2070 iput(iter);
2071 return 0;
2072 }
2073
The logic thinks the inode is already in recover list on seeing
ip_next_orphan is non-NULL, so it skip this inode after dropping a
reference which incremented in ocfs2_iget().
While, if the inode is already in recover list, it should have another
reference and the iput() at line 2070 should not be the final iput
(dropping the last reference). So I don't think the inode is really in
the recover list (no vmcore to confirm).
Note that ocfs2_queue_orphans(), though not shown up in the call back
trace, is holding cluster lock on the orphan directory when looking up
for unlinked inodes. The on disk inode eviction could involve a lot of
IOs which may need long time to finish. That means this node could hold
the cluster lock for very long time, that can lead to the lock requests
(from other nodes) to the orhpan directory hang for long time.
Looking at more on ip_next_orphan, I found it's not initialized when
allocating a new ocfs2_inode_info structure.
This causes te reflink operations from some nodes hang for very long
time waiting for the cluster lock on the orphan directory.
Christophe Leroy [Sat, 14 Nov 2020 06:52:20 +0000 (22:52 -0800)]
panic: don't dump stack twice on warn
Before commit 3f388f28639f ("panic: dump registers on panic_on_warn"),
__warn() was calling show_regs() when regs was not NULL, and show_stack()
otherwise.
After that commit, show_stack() is called regardless of whether
show_regs() has been called or not, leading to duplicated Call Trace:
Mike Kravetz [Sat, 14 Nov 2020 06:52:16 +0000 (22:52 -0800)]
hugetlbfs: fix anon huge page migration race
Qian Cai reported the following BUG in [1]
LTP: starting move_pages12
BUG: unable to handle page fault for address: ffffffffffffffe0
...
RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170 avc_start_pgoff at mm/interval_tree.c:63
Call Trace:
rmap_walk_anon+0x141/0xa30 rmap_walk_anon at mm/rmap.c:1864
try_to_unmap+0x209/0x2d0 try_to_unmap at mm/rmap.c:1763
migrate_pages+0x1005/0x1fb0
move_pages_and_store_status.isra.47+0xd7/0x1a0
__x64_sys_move_pages+0xa5c/0x1100
do_syscall_64+0x5f/0x310
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickins diagnosed this as a migration bug caused by code introduced
to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the
routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for
anon pages as the anon_vma_lock should be held in this case. Further
analysis suggested that i_mmap_rwsem was not required to he held at all
when calling try_to_unmap for anon pages as an anon page could never be
part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to
keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
will now simply do a 'trylock' while still holding the page lock. If
the trylock fails, it will return NULL. This could impact the
callers:
- migration calling code will receive -EAGAIN and retry up to the
hard coded limit (10).
- memory error code will treat the page as BUSY. This will force
killing (SIGKILL) instead of SIGBUS any mapping tasks.
Do note that this change in behavior only happens when there is a
race. None of the standard kernel testing suites actually hit this
race, but it is possible.
Matteo Croce [Sat, 14 Nov 2020 06:52:07 +0000 (22:52 -0800)]
reboot: fix overflow parsing reboot cpu number
Limit the CPU number to num_possible_cpus(), because setting it to a
value lower than INT_MAX but higher than NR_CPUS produces the following
error on reboot and shutdown:
kstrtoint() and simple_strtoul() have a subtle difference which makes
them non interchangeable: if a non digit character is found amid the
parsing, the former will return an error, while the latter will just
stop parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for
rebooting, with the syntax `s####` among the other flags, e.g.
"reboot=warm,s31,force", so if this flag is not the last given, it's
silently ignored as well as the subsequent ones.
Arvind Sankar [Sat, 14 Nov 2020 06:51:59 +0000 (22:51 -0800)]
compiler.h: fix barrier_data() on clang
Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") neglected to copy barrier_data() from
compiler-gcc.h into compiler-clang.h.
The definition in compiler-gcc.h was really to work around clang's more
aggressive optimization, so this broke barrier_data() on clang, and
consequently memzero_explicit() as well.
For example, this results in at least the memzero_explicit() call in
lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
Fix this by moving the definition of barrier_data() into compiler.h.
Also move the gcc/clang definition of barrier() into compiler.h,
__memory_barrier() is icc-specific (and barrier() is already defined
using it in compiler-intel.h) and doesn't belong in compiler.h.
Jason Gunthorpe [Sat, 14 Nov 2020 06:51:56 +0000 (22:51 -0800)]
mm/gup: use unpin_user_pages() in __gup_longterm_locked()
When FOLL_PIN is passed to __get_user_pages() the page list must be put
back using unpin_user_pages() otherwise the page pin reference persists
in a corrupted state.
There are two places in the unwind of __gup_longterm_locked() that put
the pages back without checking. Normally on error this function would
return the partial page list making this the caller's responsibility,
but in these two cases the caller is not allowed to see these pages at
all.
The issue is that object is not NULL while page is NULL which is odd but
may happen if the cache flush happened after loading object but before
loading page. Thus checking for the page pointer is required too.
The cache flush is done through an inter processor interrupt when a
piece of memory is off-lined. That interrupt is triggered when a memory
hot-unplug operation is initiated and offline_pages() is calling the
slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback()
which is calling flush_cpu_slab(). If that interrupt is caught between
the reading of c->freelist and the reading of c->page, this could lead
to such a situation. That situation is expected and the later call to
this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
the whole operation.
In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
node_match()") check on the page pointer has been removed assuming that
page is always valid when it is called. It happens that this is not
true in that particular case, so check for page before calling
node_match() here.
Nicholas Piggin [Sat, 14 Nov 2020 06:51:46 +0000 (22:51 -0800)]
mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
Previously the negated unsigned long would be cast back to signed long
which would have the correct negative value. After commit 730ec8c01a2b
("mm/vmscan.c: change prototype for shrink_page_list"), the large
unsigned int converts to a large positive signed long.
Symptoms include CMA allocations hanging forever holding the cma_mutex
due to alloc_contig_range->...->isolate_migratepages_block waiting
forever in "while (unlikely(too_many_isolated(pgdat)))".
Zi Yan [Sat, 14 Nov 2020 06:51:43 +0000 (22:51 -0800)]
mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
In isolate_migratepages_block, if we have too many isolated pages and
nr_migratepages is not zero, we should try to migrate what we have
without wasting time on isolating.
In theory it's possible that multiple parallel compactions will cause
too_many_isolated() to become true even if each has isolated less than
COMPACT_CLUSTER_MAX, and loop forever in the while loop. Bailing
immediately prevents that.