Linus Torvalds [Fri, 12 Jun 2020 19:24:42 +0000 (12:24 -0700)]
Merge tag 'pwm/for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"Nothing too exciting for this cycle. A couple of fixes across the
board, and Lee volunteered to help with patch review"
* tag 'pwm/for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: Add missing "CONFIG_" prefix
MAINTAINERS: Add Lee Jones as reviewer for the PWM subsystem
pwm: imx27: Fix rounding behavior
pwm: rockchip: Simplify rockchip_pwm_get_state()
pwm: img: Call pm_runtime_put() in pm_runtime_get_sync() failed case
pwm: tegra: Support dynamic clock frequency configuration
pwm: jz4740: Add support for the JZ4725B
pwm: jz4740: Make PWM start with the active part
pwm: jz4740: Enhance precision in calculation of duty cycle
pwm: jz4740: Drop dependency on MACH_INGENIC
pwm: lpss: Fix get_state runtime-pm reference handling
pwm: sun4i: Support direct clock output on Allwinner A64
pwm: Add support for Azoteq IQS620A PWM generator
dt-bindings: pwm: rcar: add r8a77961 support
pwm: Add missing '\n' in log messages
Linus Torvalds [Fri, 12 Jun 2020 19:19:13 +0000 (12:19 -0700)]
Merge tag 'iommu-drivers-move-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu driver directory structure cleanup from Joerg Roedel:
"Move the Intel and AMD IOMMU drivers into their own subdirectory.
Both drivers consist of several files by now and giving them their own
directory unclutters the IOMMU top-level directory a bit"
* tag 'iommu-drivers-move-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Move Intel IOMMU driver into subdirectory
iommu/amd: Move AMD IOMMU driver into subdirectory
Linus Torvalds [Fri, 12 Jun 2020 19:13:36 +0000 (12:13 -0700)]
Merge tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:
"One more printk change for 5.8: make sure that messages printed from
KDB context are redirected to KDB console handlers. It did not work
when KDB interrupted NMI or printk_safe contexts.
Arm people started hitting this problem more often recently. I forgot
to add the fix into the previous pull request by mistake"
* tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk/kdb: Redirect printk messages into kdb in any context
Recently syzbot reported that unmounting proc when there is an ongoing
inotify watch on the root directory of proc could result in a use
after free when the watch is removed after the unmount of proc
when the watcher exits.
Commit 69879c01a0c3 ("proc: Remove the now unnecessary internal mount
of proc") made it easier to unmount proc and allowed syzbot to see the
problem, but looking at the code it has been around for a long time.
Looking at the code the fsnotify watch should have been removed by
fsnotify_sb_delete in generic_shutdown_super. Unfortunately the inode
was allocated with new_inode_pseudo instead of new_inode so the inode
was not on the sb->s_inodes list. Which prevented
fsnotify_unmount_inodes from finding the inode and removing the watch
as well as made it so the "VFS: Busy inodes after unmount" warning
could not find the inodes to warn about them.
Make all of the inodes in proc visible to generic_shutdown_super,
and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo.
The only functional difference is that new_inode places the inodes
on the sb->s_inodes list.
I wrote a small test program and I can verify that without changes it
can trigger this issue, and by replacing new_inode_pseudo with
new_inode the issues goes away.
Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Reported-by: [email protected] Fixes: 0097875bd415 ("proc: Implement /proc/thread-self to point at the directory of the current thread") Fixes: 021ada7dff22 ("procfs: switch /proc/self away from proc_dir_entry") Fixes: 51f0885e5415 ("vfs,proc: guarantee unique inodes in /proc") Signed-off-by: "Eric W. Biederman" <[email protected]>
Linus Torvalds [Fri, 12 Jun 2020 18:56:43 +0000 (11:56 -0700)]
Merge tag 'devicetree-fixes-for-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull Devicetree fixes from Rob Herring:
- Another round of whack-a-mole removing 'allOf', redundant cases of
'maxItems' and incorrect 'reg' sizes
- Fix support for yaml.h in non-standard paths
* tag 'devicetree-fixes-for-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: Remove redundant 'maxItems'
dt-bindings: Fix more incorrect 'reg' property sizes in examples
dt-bindings: phy: qcom: Fix missing 'ranges' and example addresses
dt-bindings: Remove more cases of 'allOf' containing a '$ref'
scripts/dtc: use pkg-config to include <yaml.h> in non-standard path
Steve French [Fri, 12 Jun 2020 15:36:37 +0000 (10:36 -0500)]
cifs: fix chown and chgrp when idsfromsid mount option enabled
idsfromsid was ignored in chown and chgrp causing it to fail
when upcalls were not configured for lookup. idsfromsid allows
mapping users when setting user or group ownership using
"special SID" (reserved for this). Add support for chmod and chgrp
when idsfromsid mount option is enabled.
Steve French [Fri, 12 Jun 2020 14:25:21 +0000 (09:25 -0500)]
smb3: allow uid and gid owners to be set on create with idsfromsid mount option
Currently idsfromsid mount option allows querying owner information from the
special sids used to represent POSIX uids and gids but needed changes to
populate the security descriptor context with the owner information when
idsfromsid mount option was used.
Linus Torvalds [Fri, 12 Jun 2020 18:05:52 +0000 (11:05 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
"The guest side of the asynchronous page fault work has been delayed to
5.9 in order to sync with Thomas's interrupt entry rework, but here's
the rest of the KVM updates for this merge window.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits)
KVM: x86: do not pass poisoned hva to __kvm_set_memory_region
KVM: selftests: fix sync_with_host() in smm_test
KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected
KVM: async_pf: Cleanup kvm_setup_async_pf()
kvm: i8254: remove redundant assignment to pointer s
KVM: x86: respect singlestep when emulating instruction
KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported
KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check
KVM: nVMX: Consult only the "basic" exit reason when routing nested exit
KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h
KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
KVM: arm64: Remove host_cpu_context member from vcpu structure
KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr
KVM: arm64: Handle PtrAuth traps early
KVM: x86: Unexport x86_fpu_cache and make it static
KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K
KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
KVM: arm64: Stop save/restoring ACTLR_EL1
KVM: arm64: Add emulation for 32bit guests accessing ACTLR2
...
Linus Torvalds [Fri, 12 Jun 2020 18:00:45 +0000 (11:00 -0700)]
Merge tag 'for-linus-5.8b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- several smaller cleanups
- a fix for a Xen guest regression with CPU offlining
- a small fix in the xen pvcalls backend driver
- an update of MAINTAINERS
* tag 'for-linus-5.8b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
xen/pci: Get rid of verbose_request and use dev_dbg() instead
xenbus: Use dev_printk() when possible
xen-pciback: Use dev_printk() when possible
xen: enable BALLOON_MEMORY_HOTPLUG by default
xen: expand BALLOON_MEMORY_HOTPLUG description
xen/pvcalls: Make pvcalls_back_global static
xen/cpuhotplug: Fix initial CPU offlining for PV(H) guests
xen-platform: Constify dev_pm_ops
xen/pvcalls-back: test for errors when calling backend_connect()
Steve French [Fri, 12 Jun 2020 03:43:01 +0000 (22:43 -0500)]
smb311: add support for using info level for posix extensions query
Adds calls to the newer info level for query info using SMB3.1.1 posix extensions.
The remaining two places that call the older query info (non-SMB3.1.1 POSIX)
require passing in the fid and can be updated in a later patch.
Thomas Gleixner [Fri, 12 Jun 2020 12:02:27 +0000 (14:02 +0200)]
x86/entry: Make NMI use IDTENTRY_RAW
For no reason other than beginning brainmelt, IDTENTRY_NMI was mapped to
IDTENTRY_IST.
This is not a problem on 64bit because the IST default entry point maps to
IDTENTRY_RAW which does not any entry handling. The surplus function
declaration for the noist C entry point is unused and as there is no ASM
code emitted for NMI this went unnoticed.
On 32bit IDTENTRY_IST maps to a regular IDTENTRY which does the normal
entry handling. That is clearly the wrong thing to do for NMI.
Map it to IDTENTRY_RAW to unbreak it. The IDTENTRY_NMI mapping needs to
stay to avoid emitting ASM code.
Steve French [Fri, 12 Jun 2020 00:25:47 +0000 (19:25 -0500)]
SMB311: Add support for query info using posix extensions (level 100)
Adds support for better query info on dentry revalidation (using
the SMB3.1.1 POSIX extensions level 100). Followon patch will
add support for translating the UID/GID from the SID and also
will add support for using the posix query info on lookup.
Namjae Jeon [Thu, 11 Jun 2020 02:21:19 +0000 (11:21 +0900)]
smb3: add indatalen that can be a non-zero value to calculation of credit charge in smb2 ioctl
Some of tests in xfstests failed with cifsd kernel server since commit e80ddeb2f70e. cifsd kernel server validates credit charge from client
by calculating it base on max((InputCount + OutputCount) and
(MaxInputResponse + MaxOutputResponse)) according to specification.
MS-SMB2 specification describe credit charge calculation of smb2 ioctl :
If Connection.SupportsMultiCredit is TRUE, the server MUST validate
CreditCharge based on the maximum of (InputCount + OutputCount) and
(MaxInputResponse + MaxOutputResponse), as specified in section 3.3.5.2.5.
If the validation fails, it MUST fail the IOCTL request with
STATUS_INVALID_PARAMETER.
This patch add indatalen that can be a non-zero value to calculation of
credit charge in SMB2_ioctl_init().
Andy Lutomirski [Fri, 12 Jun 2020 03:26:38 +0000 (20:26 -0700)]
x86/entry: Treat BUG/WARN as NMI-like entries
BUG/WARN are cleverly optimized using UD2 to handle the BUG/WARN out of
line in an exception fixup.
But if BUG or WARN is issued in a funny RCU context, then the
idtentry_enter...() path might helpfully WARN that the RCU context is
invalid, which results in infinite recursion.
Split the BUG/WARN handling into an nmi_enter()/nmi_exit() path in
exc_invalid_op() to increase the chance to survive the experience.
[ tglx: Make the declaration match the implementation ]
Before commit 6cdf30375f82 ("powerpc/kvm/book3s: Use kvm helpers
to walk shadow or secondary table") we called __find_linux_pte() with
a page table pointer from a kvm_nested_guest struct but
now we rely on kvmhv_find_nested() which takes an L1 LPID and returns
a kvm_nested_guest pointer, however we pass a L0 LPID there and
the L2 guest hangs.
This fixes the LPID passed to kvmppc_hv_handle_set_rc().
Ley Foon Tan [Fri, 12 Jun 2020 06:04:49 +0000 (14:04 +0800)]
nios2: signal: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning through the use of the new the new
pseudo-keyword fallthrough;
arch/nios2/kernel/signal.c:254:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
254 | restart = -2;
| ~~~~~~~~^~~~
arch/nios2/kernel/signal.c:255:3: note: here
255 | case ERESTARTNOHAND:
| ^~~~
Linus Torvalds [Fri, 12 Jun 2020 01:55:43 +0000 (18:55 -0700)]
Merge tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull the Kernel Concurrency Sanitizer from Thomas Gleixner:
"The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector,
which relies on compile-time instrumentation, and uses a
watchpoint-based sampling approach to detect races.
The feature was under development for quite some time and has already
found legitimate bugs.
Unfortunately it comes with a limitation, which was only understood
late in the development cycle:
It requires an up to date CLANG-11 compiler
CLANG-11 is not yet released (scheduled for June), but it's the only
compiler today which handles the kernel requirements and especially
the annotations of functions to exclude them from KCSAN
instrumentation correctly.
These annotations really need to work so that low level entry code and
especially int3 text poke handling can be completely isolated.
A detailed discussion of the requirements and compiler issues can be
found here:
We came to the conclusion that trying to work around compiler
limitations and bugs again would end up in a major trainwreck, so
requiring a working compiler seemed to be the best choice.
For Continous Integration purposes the compiler restriction is
manageable and that's where most xxSAN reports come from.
For a change this limitation might make GCC people actually look at
their bugs. Some issues with CSAN in GCC are 7 years old and one has
been 'fixed' 3 years ago with a half baken solution which 'solved' the
reported issue but not the underlying problem.
The KCSAN developers also ponder to use a GCC plugin to become
independent, but that's not something which will show up in a few
days.
Blocking KCSAN until wide spread compiler support is available is not
a really good alternative because the continuous growth of lockless
optimizations in the kernel demands proper tooling support"
* tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits)
compiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide inlining
compiler.h: Move function attributes to compiler_types.h
compiler.h: Avoid nested statement expression in data_race()
compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()
kcsan: Update Documentation to change supported compilers
kcsan: Remove 'noinline' from __no_kcsan_or_inline
kcsan: Pass option tsan-instrument-read-before-write to Clang
kcsan: Support distinguishing volatile accesses
kcsan: Restrict supported compilers
kcsan: Avoid inserting __tsan_func_entry/exit if possible
ubsan, kcsan: Don't combine sanitizer with kcov on clang
objtool, kcsan: Add kcsan_disable_current() and kcsan_enable_current_nowarn()
kcsan: Add __kcsan_{enable,disable}_current() variants
checkpatch: Warn about data_race() without comment
kcsan: Use GFP_ATOMIC under spin lock
Improve KCSAN documentation a bit
kcsan: Make reporting aware of KCSAN tests
kcsan: Fix function matching in report
kcsan: Change data_race() to no longer require marking racing accesses
kcsan: Move kcsan_{disable,enable}_current() to kcsan-checks.h
...
This series fixes four bugs in the configuration of IPA endpoints.
See the description of each for more information.
In this version I have dropped the last patch from the series, and
restored a "static" keyword that had inadvertently gotten removed.
====================
Alex Elder [Thu, 11 Jun 2020 19:48:33 +0000 (14:48 -0500)]
net: ipa: header pad field only valid for AP->modem endpoint
Only QMAP endpoints should be configured to find a pad size field
within packet headers. They are found in the first byte of the QMAP
header (and the hardware fills only the 6 bits in that byte that
constitute the pad_len field).
The RMNet driver assumes the pad_len field is valid for received
packets, so we want to ensure the pad_len field is filled in that
case. That driver also assumes the length in the QMAP header
includes the pad bytes.
The RMNet driver does *not* pad the packets it sends, so the pad_len
field can be ignored.
Fix ipa_endpoint_init_hdr_ext() so it only marks the pad field
offset valid for QMAP RX endpoints, and in that case indicates
that the length field in the header includes the pad bytes.
Alex Elder [Thu, 11 Jun 2020 19:48:32 +0000 (14:48 -0500)]
net: ipa: program upper nibbles of sequencer type
The upper two nibbles of the sequencer type were not used for
SDM845, and were assumed to be 0. But for SC7180 they are used, and
so they must be programmed by ipa_endpoint_init_seq(). Fix this bug.
IPA_SEQ_PKT_PROCESS_NO_DEC_NO_UCP_DMAP doesn't have a descriptive
comment, so add one.
Alex Elder [Thu, 11 Jun 2020 19:48:31 +0000 (14:48 -0500)]
net: ipa: fix modem LAN RX endpoint id
The endpoint id assigned to the modem LAN RX endpoint for the SC7180 SoC
is incorrect. The erroneous value might have been copied from SDM845 and
never updated. The correct endpoint id to use for this SoC is 11.
Alex Elder [Thu, 11 Jun 2020 19:48:30 +0000 (14:48 -0500)]
net: ipa: program metadata mask differently
The way the mask value is programmed for QMAP RX endpoints was based
on some wrong assumptions about the way metadata containing the QMAP
mux_id value is formatted. The metadata value supplied by the
modem is *not* in QMAP format, and in fact contains the mux_id we
want in its (big endian) low-order byte. That byte must be written
by the IPA into offset 1 of the QMAP header it inserts before the
received packet.
QMAP TX endpoints *do* use a QMAP header as the metadata sent with
each packet. The modem assumes this, and based on that assumes the
mux_id is in the second byte. To match those assumptions we must
program the modem TX (QMAP) endpoint HDR register to indicate the
metadata will be found at offset 0 in the message header.
The previous configuration managed to work, but it was not working
correctly. This patch fixes a bug whose symptom was receipt of
messages containing the wrong QMAP mux_id.
In fixing this, get rid of ipa_rmnet_mux_id_metadata_mask(), which
was more or less defined so there was a separate place to explain
what was happening as we generated the mask value. Instead, put a
longer description of how this works above ipa_endpoint_init_hdr(),
and define the metadata mask to use as a simple constant.
Linus Torvalds [Fri, 12 Jun 2020 01:27:19 +0000 (18:27 -0700)]
Merge tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull atomics rework from Thomas Gleixner:
"Peter Zijlstras rework of atomics and fallbacks. This solves two
problems:
1) Compilers uninline small atomic_* static inline functions which
can expose them to instrumentation.
2) The instrumentation of atomic primitives was done at the
architecture level while composites or fallbacks were provided at
the generic level. As a result there are no uninstrumented
variants of the fallbacks.
Both issues were in the way of fully isolating fragile entry code
pathes and especially the text poke int3 handler which is prone to an
endless recursion problem when anything in that code path is about to
be instrumented. This was always a problem, but got elevated due to
the new batch mode updates of tracing.
The solution is to mark the functions __always_inline and to flip the
fallback and instrumentation so the non-instrumented variants are at
the architecture level and the instrumentation is done in generic
code.
The latter introduces another fallback variant which will go away once
all architectures have been moved over to arch_atomic_*"
* tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/atomics: Flip fallbacks and instrumentation
asm-generic/atomic: Use __always_inline for fallback wrappers
Shannon Nelson [Fri, 12 Jun 2020 00:18:15 +0000 (17:18 -0700)]
ionic: add pcie_print_link_status
Print the PCIe link information for our device.
Fixes: 77f972a7077d ("ionic: remove support for mgmt device") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
David S. Miller [Fri, 12 Jun 2020 01:25:20 +0000 (18:25 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2020-06-11
This series contains fixes to the iavf driver.
Brett fixes the supported link speeds in the iavf driver, which was only
able to report speeds that the i40e driver supported and was missing the
speeds supported by the ice driver. In addition, fix how 2.5 and 5.0
GbE speeds are reported.
Alek fixes a enum comparison that was comparing two different enums that
may have different values, so update the comparison to use matching
enums.
Paul increases the time to complete a reset to allow for 128 VFs to
complete a reset.
====================
Linus Torvalds [Fri, 12 Jun 2020 01:18:50 +0000 (18:18 -0700)]
Merge branch 'akpm' (patches from Andrew)
Pull updates from Andrew Morton:
"A few fixes and stragglers.
Subsystems affected by this patch series: mm/memory-failure, ocfs2,
lib/lzo, misc"
* emailed patches from Andrew Morton <[email protected]>:
amdgpu: a NULL ->mm does not mean a thread is a kthread
lib/lzo: fix ambiguous encoding bug in lzo-rle
ocfs2: fix build failure when TCP/IP is disabled
mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread
mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill
David Howells [Thu, 11 Jun 2020 20:57:00 +0000 (21:57 +0100)]
rxrpc: Fix race between incoming ACK parser and retransmitter
There's a race between the retransmission code and the received ACK parser.
The problem is that the retransmission loop has to drop the lock under
which it is iterating through the transmission buffer in order to transmit
a packet, but whilst the lock is dropped, the ACK parser can crank the Tx
window round and discard the packets from the buffer.
The retransmission code then updated the annotations for the wrong packet
and a later retransmission thought it had to retransmit a packet that
wasn't there, leading to a NULL pointer dereference.
Fix this by:
(1) Moving the annotation change to before we drop the lock prior to
transmission. This means we can't vary the annotation depending on
the outcome of the transmission, but that's fine - we'll retransmit
again later if it failed now.
(2) Skipping the packet if the skb pointer is NULL.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Dave Rodgman [Fri, 12 Jun 2020 00:34:54 +0000 (17:34 -0700)]
lib/lzo: fix ambiguous encoding bug in lzo-rle
In some rare cases, for input data over 32 KB, lzo-rle could encode two
different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e. data may be corrupted - although
zram is not affected because it operates over 4 KB pages).
This modifies the compressor without changing the decompressor or the
bitstream format, such that:
- there is no change to how data produced by the old compressor is
decompressed
- an old decompressor will correctly decode data from the updated
compressor
- performance and compression ratio are not affected
- we avoid introducing a new bitstream format
In testing over 12.8M real-world files totalling 903 GB, three files
were affected by this bug. I also constructed 37M semi-random 64 KB
files totalling 2.27 TB, and saw no affected files. Finally I tested
over files constructed to contain each of the ~1024 possible bad input
sequences; for all of these cases, updated lzo-rle worked correctly.
There is no significant impact to performance or compression ratio.
Tom Seewald [Fri, 12 Jun 2020 00:34:51 +0000 (17:34 -0700)]
ocfs2: fix build failure when TCP/IP is disabled
After commit 12abc5ee7873 ("tcp: add tcp_sock_set_nodelay") and commit c488aeadcbd0 ("tcp: add tcp_sock_set_user_timeout"), building the kernel
with OCFS2_FS=y but without INET=y causes it to fail with:
ld: fs/ocfs2/cluster/tcp.o: in function `o2net_accept_many':
tcp.c:(.text+0x21b1): undefined reference to `tcp_sock_set_nodelay'
ld: tcp.c:(.text+0x21c1): undefined reference to `tcp_sock_set_user_timeout'
ld: fs/ocfs2/cluster/tcp.o: in function `o2net_start_connect':
tcp.c:(.text+0x2633): undefined reference to `tcp_sock_set_nodelay'
ld: tcp.c:(.text+0x2643): undefined reference to `tcp_sock_set_user_timeout'
This is due to tcp_sock_set_nodelay() and tcp_sock_set_user_timeout()
being declared in linux/tcp.h and defined in net/ipv4/tcp.c, which
depend on TCP/IP being enabled.
To fix this, make OCFS2_FS depend on INET=y which already requires
NET=y.
Naoya Horiguchi [Fri, 12 Jun 2020 00:34:48 +0000 (17:34 -0700)]
mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread
Action Required memory error should happen only when a processor is
about to access to a corrupted memory, so it's synchronous and only
affects current process/thread.
Recently commit 872e9a205c84 ("mm, memory_failure: don't send
BUS_MCEERR_AO for action required error") fixed the issue that Action
Required memory could unnecessarily send SIGBUS to the processes which
share the error memory. But we still have another issue that we could
send SIGBUS to a wrong thread.
This is because collect_procs() and task_early_kill() fails to add the
current process to "to-kill" list. So this patch is suggesting to fix
it. With this fix, SIGBUS(BUS_MCEERR_AR) is never sent to non-current
process/thread.
Naoya Horiguchi [Fri, 12 Jun 2020 00:34:45 +0000 (17:34 -0700)]
mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill
Patch series "hwpoison: fixes signaling on memory error"
This is a small patchset to solve issues in memory error handler to send
SIGBUS to proper process/thread as expected in configuration. Please
see descriptions in individual patches for more details.
This patch (of 2):
Early-kill policy is controlled from two types of settings, one is
per-process setting prctl(PR_MCE_KILL) and the other is system-wide
setting vm.memory_failure_early_kill. Users expect per-process setting
to override system-wide setting as many other settings do, but
early-kill setting doesn't work as such.
For example, if a system configures vm.memory_failure_early_kill to 1
(enabled), a process receives SIGBUS even if it's configured to
explicitly disable PF_MCE_KILL by prctl(). That's not desirable for
applications with their own policies.
This patch is suggesting to change the priority of these two types of
settings, by checking sysctl_memory_failure_early_kill only when a given
process has the default kill policy.
Note that this patch is solving a thread choice issue too.
Originally, collect_procs() always chooses the main thread when
vm.memory_failure_early_kill is 1, even if the process has a dedicated
thread for memory error handling. SIGBUS should be sent to the
dedicated thread if early-kill is enabled via
vm.memory_failure_early_kill as we are doing for PR_MCE_KILL_EARLY
processes.
Linus Torvalds [Thu, 11 Jun 2020 23:10:08 +0000 (16:10 -0700)]
Merge tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A few late stragglers in here. In particular:
- Validate full range for provided buffers (Bijan)
- Fix bad use of kfree() in buffer registration failure (Denis)
- Don't allow close of ring itself, it's not fully safe. Making it
fully safe would require making the system call more expensive,
which isn't worth it.
- Buffer selection fix
- Regression fix for O_NONBLOCK retry
- Make IORING_OP_ACCEPT honor O_NONBLOCK (Jiufei)
- Restrict opcode handling for SQ/IOPOLL (Pavel)
- io-wq work handling cleanups and improvements (Pavel, Xiaoguang)
- IOPOLL race fix (Xiaoguang)"
* tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
io_uring: fix io_kiocb.flags modification race in IOPOLL mode
io_uring: check file O_NONBLOCK state for accept
io_uring: avoid unnecessary io_wq_work copy for fast poll feature
io_uring: avoid whole io_wq_work copy for requests completed inline
io_uring: allow O_NONBLOCK async retry
io_wq: add per-wq work handler instead of per work
io_uring: don't arm a timeout through work.func
io_uring: remove custom ->func handlers
io_uring: don't derive close state from ->func
io_uring: use kvfree() in io_sqe_buffer_register()
io_uring: validate the full range of provided buffers for access
io_uring: re-set iov base/len for buffer select retry
io_uring: move send/recv IOPOLL check into prep
io_uring: deduplicate io_openat{,2}_prep()
io_uring: do build_open_how() only once
io_uring: fix {SQ,IO}POLL with unsupported opcodes
io_uring: disallow close of ring itself
Linus Torvalds [Thu, 11 Jun 2020 23:07:33 +0000 (16:07 -0700)]
Merge tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Some followup fixes for this merge window. In particular:
- Seqcount write missing preemption disable for stats (Ahmed)
- blktrace fixes (Chaitanya)
- Redundant initializations (Colin)
- Various small NVMe fixes (Chaitanya, Christoph, Daniel, Max,
Niklas, Rikard)
- loop flag bug regression fix (Martijn)
- blk-mq tagging fixes (Christoph, Ming)"
* tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
umem: remove redundant initialization of variable ret
pktcdvd: remove redundant initialization of variable ret
nvmet: fail outstanding host posted AEN req
nvme-pci: use simple suspend when a HMB is enabled
nvme-fc: don't call nvme_cleanup_cmd() for AENs
nvmet-tcp: constify nvmet_tcp_ops
nvme-tcp: constify nvme_tcp_mq_ops and nvme_tcp_admin_mq_ops
nvme: do not call del_gendisk() on a disk that was never added
blk-mq: fix blk_mq_all_tag_iter
blk-mq: split out a __blk_mq_get_driver_tag helper
blktrace: fix endianness for blk_log_remap()
blktrace: fix endianness in get_pdu_int()
blktrace: use errno instead of bi_status
block: nr_sects_write(): Disable preemption on seqcount write
block: remove the error argument to the block_bio_complete tracepoint
loop: Fix wrong masking of status flags
block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed
David Howells [Thu, 11 Jun 2020 20:50:24 +0000 (21:50 +0100)]
afs: Fix afs_store_data() to set mtime in new operation descriptor
Fix afs_store_data() so that it sets the mtime in the new operation
descriptor otherwise the mtime on the server gets set to 0 when a write is
stored to the server.
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Dave Botsch <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Linus Torvalds [Thu, 11 Jun 2020 22:54:31 +0000 (15:54 -0700)]
Merge tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 updates from Thomas Gleixner:
"A set of fixes and updates for x86:
- Unbreak paravirt VDSO clocks.
While the VDSO code was moved into lib for sharing a subtle check
for the validity of paravirt clocks got replaced. While the
replacement works perfectly fine for bare metal as the update of
the VDSO clock mode is synchronous, it fails for paravirt clocks
because the hypervisor can invalidate them asynchronously.
Bring it back as an optional function so it does not inflict this
on architectures which are free of PV damage.
- Fix the jiffies to jiffies64 mapping on 64bit so it does not
trigger an ODR violation on newer compilers
- Three fixes for the SSBD and *IB* speculation mitigation maze to
ensure consistency, not disabling of some *IB* variants wrongly and
to prevent a rogue cross process shutdown of SSBD. All marked for
stable.
- Add yet more CPU models to the splitlock detection capable list
!@#%$!
- Bring the pr_info() back which tells that TSC deadline timer is
enabled.
- Reboot quirk for MacBook6,1"
* tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Unbreak paravirt VDSO clocks
lib/vdso: Provide sanity check for cycles (again)
clocksource: Remove obsolete ifdef
x86_64: Fix jiffies ODR violation
x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
x86/speculation: Prevent rogue cross-process SSBD shutdown
x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
x86/cpu: Add Sapphire Rapids CPU model number
x86/split_lock: Add Icelake microserver and Tigerlake CPU models
x86/apic: Make TSC deadline timer detection message visible
x86/reboot/quirks: Add MacBook6,1 reboot quirk
Dan Carpenter [Wed, 3 Jun 2020 17:54:36 +0000 (20:54 +0300)]
net/mlx5: E-Switch, Fix some error pointer dereferences
We can't leave "counter" set to an error pointer. Otherwise either it
will lead to an error pointer dereference later in the function or it
leads to an error pointer dereference when we call mlx5_fc_destroy().
Leon Romanovsky [Tue, 2 Jun 2020 12:28:37 +0000 (15:28 +0300)]
net/mlx5: Don't fail driver on failure to create debugfs
Clang warns:
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:6: warning: variable
'err' is used uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
if (!priv->dbg_root) {
^~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1303:9: note:
uninitialized use occurs here
return err;
^~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:2: note: remove the
'if' if its condition is always false
if (!priv->dbg_root) {
^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1259:9: note: initialize
the variable 'err' to silence this warning
int err;
^
= 0
1 warning generated.
The check of returned value of debugfs_create_dir() is wrong because
by the design debugfs failures should never fail the driver and the
check itself was wrong too. The kernel compiled without CONFIG_DEBUG_FS
will return ERR_PTR(-ENODEV) and not NULL as expected.
Parav Pandit [Fri, 15 May 2020 07:44:06 +0000 (02:44 -0500)]
net/mlx5: Fix devlink objects and devlink device unregister sequence
Current below problems exists.
1. devlink device is registered by mlx5_load_one(). But it is
not unregistered by mlx5_unload_one(). This is incorrect.
2. Above issue leads to,
When mlx5 PCI device is removed, currently devlink device is
unregistered before devlink ports are unregistered in below ladder
diagram.
remove_one()
mlx5_devlink_unregister()
[..]
devlink_unregister() <- ports are still registered!
mlx5_unload_one()
mlx5_unregister_device()
mlx5_remove_device()
mlx5e_remove()
mlx5e_devlink_port_unregister()
devlink_port_unregister()
3. Condition checking for registering and unregister device are not
symmetric either in these routines.
Hence, fix the sequence by having load and unload routines symmetric
and in right order.
i.e.
(a) register devlink device followed by registering devlink ports
(b) unregister devlink ports followed by devlink device
Do this based on boot and cleanup flags instead of different
conditions.
Fixes: c6acd629eec7 ("net/mlx5e: Add support for devlink-port in non-representors mode") Fixes: f60f315d339e ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs") Signed-off-by: Parav Pandit <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Parav Pandit [Thu, 14 May 2020 10:12:56 +0000 (05:12 -0500)]
net/mlx5: Disable reload while removing the device
While unregistration is in progress, user might be reloading the
interface.
This can race with unregistration in below flow which uses the
resources which are getting disabled by reload flow.
Hence, disable the devlink reloading first when removing the device.
Aya Levin [Sun, 17 May 2020 09:45:52 +0000 (12:45 +0300)]
net/mlx5e: Fix ethtool hfunc configuration change
Changing RX hash function requires rearranging of RQT internal indexes,
the user isn't exposed to such changes and these changes do not affect
the user configured indirection table. Rebuild RQ table on hfunc change.
After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.
This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.
Shay Drory [Thu, 7 May 2020 06:32:53 +0000 (09:32 +0300)]
net/mlx5: Fix fatal error handling during device load
Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev->state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.
Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Shay Drory [Wed, 6 May 2020 12:59:48 +0000 (15:59 +0300)]
net/mlx5: drain health workqueue in case of driver load error
In case there is a work in the health WQ when we teardown the driver,
in driver load error flow, the health work will try to read dev->iseg,
which was already unmap in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().
Linus Torvalds [Thu, 11 Jun 2020 22:36:28 +0000 (15:36 -0700)]
Merge tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A small fix for the VDSO code to force inline
__cvdso_clock_gettime_common() so the compiler
can't generate horrible code"
* tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lib/vdso: Force inlining of __cvdso_clock_gettime_common()
Brett Creeley [Fri, 5 Jun 2020 17:09:45 +0000 (10:09 -0700)]
iavf: Fix reporting 2.5 Gb and 5Gb speeds
Commit 4ae4916b5643 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb
speeds") added the ability for the PF to report 2.5 and 5Gb speeds,
however, the iavf driver does not recognize those speeds as the values were
not added there. Add the proper enums and values so that iavf can properly
deal with those speeds.
adapter->link_speed has type enum virtchnl_link_speed but our comparisons
are against enum iavf_aq_link_speed. Though they are, currently, the same
values, change the comparison to the matching enum virtchnl_link_speed
since that may not always be the case.
Brett Creeley [Fri, 5 Jun 2020 17:09:43 +0000 (10:09 -0700)]
iavf: fix speed reporting over virtchnl
Link speeds are communicated over virtchnl using an enum
virtchnl_link_speed. Currently, the highest link speed is 40Gbps which
leaves us unable to reflect some speeds that an ice VF is capable of.
This causes link speed to be misreported on the iavf driver.
Allow for communicating link speeds using Mbps so that the proper speed can
be reported for an ice VF. Moving away from the enum allows us to
communicate future speed changes without requiring a new enum to be added.
In order to support communicating link speeds over virtchnl in Mbps the
following functionality was added:
- Added u32 link_speed_mbps in the iavf_adapter structure.
- Added the macro ADV_LINK_SUPPORT(_a) to determine if the VF
driver supports communicating link speeds in Mbps.
- Added the function iavf_get_vpe_link_status() to fill the
correct link_status in the event_data union based on the
ADV_LINK_SUPPORT(_a) macro.
- Added the function iavf_set_adapter_link_speed_from_vpe()
to determine whether or not to fill the u32 link_speed_mbps or
enum virtchnl_link_speed link_speed field in the iavf_adapter
structure based on the ADV_LINK_SUPPORT(_a) macro.
- Do not free vf_res in iavf_init_get_resources() as vf_res will be
accessed in iavf_get_link_ksettings(); memset to 0 instead. This
memory is subsequently freed in iavf_remove().
Tobias Klauser [Thu, 11 Jun 2020 10:33:41 +0000 (12:33 +0200)]
tools, bpftool: Exit on error in function codegen
Currently, the codegen function might fail and return an error. But its
callers continue without checking its return value. Since codegen can
fail only in the unlikely case of the system running out of memory or
the static template being malformed, just exit(-1) directly from codegen
and make it void-returning.
Li RongQing [Thu, 11 Jun 2020 05:11:06 +0000 (13:11 +0800)]
xdp: Fix xsk_generic_xmit errno
Propagate sock_alloc_send_skb error code, not set it to
EAGAIN unconditionally, when fail to allocate skb, which
might cause that user space unnecessary loops.
Linus Torvalds [Thu, 11 Jun 2020 20:25:53 +0000 (13:25 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge some more updates from Andrew Morton:
- various hotfixes and minor things
- hch's use_mm/unuse_mm clearnups
Subsystems affected by this patch series: mm/hugetlb, scripts, kcov,
lib, nilfs, checkpatch, lib, mm/debug, ocfs2, lib, misc.
* emailed patches from Andrew Morton <[email protected]>:
kernel: set USER_DS in kthread_use_mm
kernel: better document the use_mm/unuse_mm API contract
kernel: move use_mm/unuse_mm to kthread.c
kernel: move use_mm/unuse_mm to kthread.c
stacktrace: cleanup inconsistent variable type
lib: test get_count_order/long in test_bitops.c
mm: add comments on pglist_data zones
ocfs2: fix spelling mistake and grammar
mm/debug_vm_pgtable: fix kernel crash by checking for THP support
lib: fix bitmap_parse() on 64-bit big endian archs
checkpatch: correct check for kernel parameters doc
nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
lib/lz4/lz4_decompress.c: document deliberate use of `&'
kcov: check kcov_softirq in kcov_remote_stop()
scripts/spelling: add a few more typos
khugepaged: selftests: fix timeout condition in wait_for_scan()
Rob Herring [Thu, 11 Jun 2020 14:58:04 +0000 (08:58 -0600)]
dt-bindings: Fix more incorrect 'reg' property sizes in examples
The examples template is a 'simple-bus' with a size of 1 cell for
had between 2 and 4 cells which really only errors on I2C or SPI type
devices with a single cell.
The easiest fix in most cases is to change the 'reg' property to 1 cell
for address and size.
Joerg Roedel [Thu, 11 Jun 2020 09:11:39 +0000 (11:11 +0200)]
alpha: Fix build around srm_sysrq_reboot_op
The patch introducing the struct was probably never compile tested,
because it sets a handler with a wrong function signature. Wrap the
handler into a functions with the correct signature to fix the build.
Linus Torvalds [Thu, 11 Jun 2020 19:55:20 +0000 (12:55 -0700)]
Merge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Kconfig select statements are now sorted alphanumerically
- first-level interrupts are now handled via a full irqchip driver
- CPU hotplug is fixed
- vDSO calls now use the common vDSO infrastructure
* tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: set the permission of vdso_data to read-only
riscv: use vDSO common flow to reduce the latency of the time-related functions
riscv: fix build warning of missing prototypes
RISC-V: Don't mark init section as non-executable
RISC-V: Force select RISCV_INTC for CONFIG_RISCV
RISC-V: Remove do_IRQ() function
clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
irqchip: RISC-V per-HART local interrupt controller driver
RISC-V: Rename and move plic_find_hart_id() to arch directory
RISC-V: self-contained IPI handling routine
RISC-V: Sort select statements alphanumerically
- Fix incorrect mask in HiSilicon L3C perf PMU driver
- Fix compat vDSO compilation under some toolchain configurations
- Fix false UBSAN warning from ACPI IORT parsing code
- Fix booting under bootloaders that ignore TEXT_OFFSET
- Annotate debug initcall function with '__init'"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: warn on incorrect placement of the kernel by the bootloader
arm64: acpi: fix UBSAN warning
arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
drivers/perf: hisi: Fix wrong value for all counters enable
arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
arm64: debug: mark a function as __init to save some memory
scs: Report SCS usage in bytes rather than number of entries
Linus Torvalds [Thu, 11 Jun 2020 19:50:54 +0000 (12:50 -0700)]
Merge tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
- casting clean up in the user access macros
- memory leak on error case fix for PCI probing
- update of a defconfig
* tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k,nommu: fix implicit cast from __user in __{get,put}_user_asm()
m68k,nommu: add missing __user in uaccess' __ptr() macro
m68k: Drop CONFIG_MTD_M25P80 in stmark2_defconfig
m68k/PCI: Fix a memory leak in an error handling path
Rob Herring [Thu, 11 Jun 2020 14:52:38 +0000 (08:52 -0600)]
dt-bindings: phy: qcom: Fix missing 'ranges' and example addresses
The QCom QMP PHY bindings have child nodes with translatable (MMIO)
addresses, so a 'ranges' property is required in the parent node.
Additionally, the examples default to 1 address and size cell, so let's
fix that, too.
Rob Herring [Wed, 10 Jun 2020 21:49:12 +0000 (15:49 -0600)]
dt-bindings: Remove more cases of 'allOf' containing a '$ref'
Another round of 'allOf' removals that came in this cycle.
json-schema versions draft7 and earlier have a weird behavior in that
any keywords combined with a '$ref' are ignored (silently). The correct
form was to put a '$ref' under an 'allOf'. This behavior is now changed
in the 2019-09 json-schema spec and '$ref' can be mixed with other
keywords. The json-schema library doesn't yet support this, but the
tooling now does a fixup for this and either way works.
This has been a constant source of review comments, so let's change this
treewide so everyone copies the simpler syntax.
Tuong Lien [Thu, 11 Jun 2020 10:08:08 +0000 (17:08 +0700)]
tipc: fix NULL pointer dereference in tipc_disc_rcv()
When a bearer is enabled, we create a 'tipc_discoverer' object to store
the bearer related data along with a timer and a preformatted discovery
message buffer for later probing... However, this is only carried after
the bearer was set 'up', that left a race condition resulting in kernel
panic.
It occurs when a discovery message from a peer node is received and
processed in bottom half (since the bearer is 'up' already) just before
the discoverer object is created but is now accessed in order to update
the preformatted buffer (with a new trial address, ...) so leads to the
NULL pointer dereference.
We solve the problem by simply moving the bearer 'up' setting to later,
so make sure everything is ready prior to any message receiving.
Tuong Lien [Thu, 11 Jun 2020 10:07:35 +0000 (17:07 +0700)]
tipc: fix kernel WARNING in tipc_msg_append()
syzbot found the following issue:
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...
This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming") that tried to build at least one buffer even
when the message data length is zero... However, it now exposes another
bug that the 'mss' can be zero and the 'cpy' will be negative, thus the
above kernel WARNING will appear!
The zero value of 'mss' is never expected because it means Nagle is not
enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'),
so the function 'tipc_msg_append()' must not be called at all. But that
was in this particular case since the message data length was zero, and
the 'send <= maxnagle' check became true.
We resolve the issue by explicitly checking if Nagle is enabled for the
socket, i.e. 'maxnagle != 0' before calling the 'tipc_msg_append()'. We
also reinforce the function to against such a negative values if any.
Shannon Nelson [Thu, 11 Jun 2020 04:07:39 +0000 (21:07 -0700)]
ionic: remove support for mgmt device
We no longer support the mgmt device in the ionic driver,
so remove the device id and related code.
Fixes: b3f064e9746d ("ionic: add support for device id 0x1004") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Linus Torvalds [Thu, 11 Jun 2020 19:42:14 +0000 (12:42 -0700)]
Merge tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
"qcom:
- new controller driver for IPCC
- reorg the of_device data
- add support for ipq6018 platform
spreadtrum:
- new sprd controller driver
imx:
- implement suspend/resume PM support
misc:
- make pcc driver struct static
- fix return value in imx_mu_scu
- disable clock before bailout in imx probe
- remove duplicate error mssg in zynqmp probe
- fix header size in imx.scu
- check for null instead of is-err in zynqmp"
* tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: qcom: Add ipq6018 apcs compatible
mailbox: qcom: Add clock driver name in apcs mailbox driver data
dt-bindings: mailbox: Add YAML schemas for QCOM APCS global block
mailbox: imx: ONLY IPC MU needs IRQF_NO_SUSPEND flag
mailbox: imx: Add runtime PM callback to handle MU clocks
mailbox: imx: Add context save/restore for suspend/resume
MAINTAINERS: Add entry for Qualcomm IPCC driver
mailbox: Add support for Qualcomm IPCC
dt-bindings: mailbox: Add devicetree binding for Qcom IPCC
mailbox: zynqmp-ipi: Fix NULL vs IS_ERR() check in zynqmp_ipi_mbox_probe()
mailbox: imx-mailbox: fix scu msg header size check
mailbox: sprd: Add Spreadtrum mailbox driver
dt-bindings: mailbox: Add the Spreadtrum mailbox documentation
mailbox: ZynqMP IPI: Delete an error message in zynqmp_ipi_probe()
mailbox: imx: Disable the clock on devm_mbox_controller_register() failure
mailbox: imx: Fix return in imx_mu_scu_xlate()
mailbox: imx: Support runtime PM
mailbox: pcc: make pcc_mbox_driver static
Xu Wang [Thu, 11 Jun 2020 02:45:20 +0000 (02:45 +0000)]
drivers: dpaa2: Use devm_kcalloc() in setup_dpni()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
Linus Torvalds [Thu, 11 Jun 2020 19:38:11 +0000 (12:38 -0700)]
Merge tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are last-minute fixes gathered before merge window close; a few
fixes are for the core while the rest majority are driver fixes.
- PCM locking annotation fixes and the possible self-lock fix
- ASoC DPCM regression fixes with multi-CPU DAI
- A fix for inconsistent resume from system-PM on USB-audio
- Improved runtime-PM handling with multiple USB interfaces
- Quirks for HD-audio and USB-audio
- Hardened firmware handling in max98390 codec
- A couple of fixes for meson"
* tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
ASoC: rt5645: Add platform-data for Asus T101HA
ASoC: Intel: bytcr_rt5640: Add quirk for Toshiba Encore WT10-A tablet
ASoC: SOF: nocodec: conditionally set dpcm_capture/dpcm_playback flags
ASoC: Intel: boards: replace capture_only by dpcm_capture
ASoC: core: only convert non DPCM link to DPCM link
ASoC: soc-pcm: dpcm: fix playback/capture checks
ASoC: meson: add missing free_irq() in error path
ALSA: pcm: disallow linking stream to itself
ALSA: usb-audio: Manage auto-pm of all bundled interfaces
ALSA: hda/realtek - add a pintbl quirk for several Lenovo machines
ALSA: pcm: fix snd_pcm_link() lockdep splat
ALSA: usb-audio: Use the new macro for HP Dock rename quirks
ALSA: usb-audio: Add vendor, product and profile name for HP Thunderbolt Dock
ALSA: emu10k1: delete an unnecessary condition
dt-bindings: ASoc: Fix tdm-slot documentation spelling error
ASoC: meson: fix memory leak of links if allocation of ldata fails
ALSA: usb-audio: Fix inconsistent card PM state after resume
ASoC: max98390: Fix potential crash during param fw loading
ASoC: max98390: Fix incorrect printf qualifier
ASoC: fsl-asoc-card: Defer probe when fail to find codec device
...
Linus Torvalds [Thu, 11 Jun 2020 19:27:06 +0000 (12:27 -0700)]
Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"One sun4i fix and a connector hotplug race The ast fix is for a
regression in 5.6, and one of the i915 ones fixes an oops reported by
dhowells.
core:
- fix race in connectors sending hotplug
i915:
- Avoid use after free in cmdparser
- Avoid NULL dereference when probing all display encoders
- Fixup to module parameter type
sun4i:
- clock divider fix
ast:
- 24/32 bpp mode setting fix"
* tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm:
drm/ast: fix missing break in switch statement for format->cpp[0] case 4
drm/sun4i: hdmi ddc clk: Fix size of m divider
drm/i915/display: Only query DP state of a DDI encoder
drm/i915/params: fix i915.reset module param type
drm/i915/gem: Mark the buffer pool as active for the cmdparser
drm/connector: notify userspace on hotplug after register complete
Linus Torvalds [Thu, 11 Jun 2020 19:22:41 +0000 (12:22 -0700)]
Merge tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"New features and improvements:
- Sunrpc receive buffer sizes only change when establishing a GSS credentials
- Add more sunrpc tracepoints
- Improve on tracepoints to capture internal NFS I/O errors
Other bugfixes and cleanups:
- Move a dprintk() to after a call to nfs_alloc_fattr()
- Fix off-by-one issues in rpc_ntop6
- Fix a few coccicheck warnings
- Use the correct SPDX license identifiers
- Fix rpc_call_done assignment for BIND_CONN_TO_SESSION
- Replace zero-length array with flexible array
- Remove duplicate headers
- Set invalid blocks after NFSv4 writes to update space_used attribute
- Fix direct WRITE throughput regression"
* tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
NFS: Fix direct WRITE throughput regression
SUNRPC: rpc_xprt lifetime events should record xprt->state
xprtrdma: Make xprt_rdma_slot_table_entries static
nfs: set invalid blocks after NFSv4 writes
NFS: remove redundant initialization of variable result
sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
NFS: Add a tracepoint in nfs_set_pgio_error()
NFS: Trace short NFS READs
NFS: nfs_xdr_status should record the procedure name
SUNRPC: Set SOFTCONN when destroying GSS contexts
SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
SUNRPC: trace RPC client lifetime events
SUNRPC: Trace transport lifetime events
SUNRPC: Split the xdr_buf event class
SUNRPC: Add tracepoint to rpc_call_rpcerror()
SUNRPC: Update the RPC_SHOW_SOCKET() macro
SUNRPC: Update the rpc_show_task_flags() macro
SUNRPC: Trace GSS context lifetimes
SUNRPC: receive buffer size estimation values almost never change
...
Marco Elver [Thu, 21 May 2020 14:20:45 +0000 (16:20 +0200)]
compiler.h: Avoid nested statement expression in data_race()
It appears that compilers have trouble with nested statement
expressions. Therefore, remove one level of statement expression nesting
from the data_race() macro. This will help avoiding potential problems
in the future as its usage increases.
Marco Elver [Thu, 21 May 2020 14:20:44 +0000 (16:20 +0200)]
compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()
The volatile accesses no longer need to be wrapped in data_race()
because compilers that emit instrumentation distinguishing volatile
accesses are required for KCSAN.
Consequently, the explicit kcsan_check_atomic*() are no longer required
either since the compiler emits instrumentation distinguishing the
volatile accesses.
Finally, simplify __READ_ONCE_SCALAR() and remove __WRITE_ONCE_SCALAR().
Marco Elver [Thu, 21 May 2020 14:20:41 +0000 (16:20 +0200)]
kcsan: Remove 'noinline' from __no_kcsan_or_inline
Some compilers incorrectly inline small __no_kcsan functions, which then
results in instrumenting the accesses. For this reason, the 'noinline'
attribute was added to __no_kcsan_or_inline. All known versions of GCC
are affected by this. Supported versions of Clang are unaffected, and
never inline a no_sanitize function.
However, the attribute 'noinline' in __no_kcsan_or_inline causes
unexpected code generation in functions that are __no_kcsan and call a
__no_kcsan_or_inline function.
In certain situations it is expected that the __no_kcsan_or_inline
function is actually inlined by the __no_kcsan function, and *no* calls
are emitted. By removing the 'noinline' attribute, give the compiler
the ability to inline and generate the expected code in __no_kcsan
functions.
Marco Elver [Thu, 21 May 2020 14:20:40 +0000 (16:20 +0200)]
kcsan: Pass option tsan-instrument-read-before-write to Clang
Clang (unlike GCC) removes reads before writes with matching addresses
in the same basic block. This is an optimization for TSAN, since writes
will always cause conflict if the preceding read would have.
However, for KCSAN we cannot rely on this option, because we apply
several special rules to writes, in particular when the
KCSAN_ASSUME_PLAIN_WRITES_ATOMIC option is selected. To avoid missing
potential data races, pass the -tsan-instrument-read-before-write option
to Clang if it is available [1].
Marco Elver [Thu, 21 May 2020 14:20:39 +0000 (16:20 +0200)]
kcsan: Support distinguishing volatile accesses
In the kernel, the "volatile" keyword is used in various concurrent
contexts, whether in low-level synchronization primitives or for
legacy reasons. If supported by the compiler, it will be assumed
that aligned volatile accesses up to sizeof(long long) (matching
compiletime_assert_rwonce_type()) are atomic.
Recent versions of Clang [1] (GCC tentative [2]) can instrument
volatile accesses differently. Add the option (required) to enable the
instrumentation, and provide the necessary runtime functions. None of
the updated compilers are widely available yet (Clang 11 will be the
first release to support the feature).
Marco Elver [Thu, 21 May 2020 14:20:42 +0000 (16:20 +0200)]
kcsan: Restrict supported compilers
The first version of Clang that supports -tsan-distinguish-volatile will
be able to support KCSAN. The first Clang release to do so, will be
Clang 11. This is due to satisfying all the following requirements:
1. Never emit calls to __tsan_func_{entry,exit}.
2. __no_kcsan functions should not call anything, not even
kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE => Requires
leaving them plain!
3. Support atomic_{read,set}*() with KCSAN, which rely on
arch_atomic_{read,set}*() using __{READ,WRITE}_ONCE() => Because of
#2, rely on Clang 11's -tsan-distinguish-volatile support. We will
double-instrument atomic_{read,set}*(), but that's reasonable given
it's still lower cost than the data_race() variant due to avoiding 2
extra calls (kcsan_{en,dis}able_current() calls).
4. __always_inline functions inlined into __no_kcsan functions are never
instrumented.
5. __always_inline functions inlined into instrumented functions are
instrumented.
6. __no_kcsan_or_inline functions may be inlined into __no_kcsan functions =>
Implies leaving 'noinline' off of __no_kcsan_or_inline.
7. Because of #6, __no_kcsan and __no_kcsan_or_inline functions should never be
spuriously inlined into instrumented functions, causing the accesses of the
__no_kcsan function to be instrumented.
Older versions of Clang do not satisfy #3. The latest GCC currently
doesn't support at least #1, #3, and #7.
Marco Elver [Thu, 21 May 2020 14:20:38 +0000 (16:20 +0200)]
kcsan: Avoid inserting __tsan_func_entry/exit if possible
To avoid inserting __tsan_func_{entry,exit}, add option if supported by
compiler. Currently only Clang can be told to not emit calls to these
functions. It is safe to not emit these, since KCSAN does not rely on
them.
Note that, if we disable __tsan_func_{entry,exit}(), we need to disable
tail-call optimization in sanitized compilation units, as otherwise we
may skip frames in the stack trace; in particular when the tail called
function is one of the KCSAN's runtime functions, and a report is
generated, we might miss the function where the actual access occurred.
Since __tsan_func_{entry,exit}() insertion effectively disabled
tail-call optimization, there should be no observable change.
This was caught and confirmed with kcsan-test & UNWINDER_ORC.
Thomas Gleixner [Thu, 11 Jun 2020 18:02:46 +0000 (20:02 +0200)]
Rebase locking/kcsan to locking/urgent
Merge the state of the locking kcsan branch before the read/write_once()
and the atomics modifications got merged.
Squash the fallout of the rebase on top of the read/write once and atomic
fallback work into the merge. The history of the original branch is
preserved in tag locking-kcsan-2020-06-02.
Paolo Bonzini [Thu, 11 Jun 2020 18:02:32 +0000 (14:02 -0400)]
Merge tag 'kvmarm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for Linux 5.8, take #1
* 32bit VM fixes:
- Fix embarassing mapping issue between AArch32 CSSELR and AArch64
ACTLR
- Add ACTLR2 support for AArch32
- Get rid of the useless ACTLR_EL1 save/restore
- Fix CP14/15 accesses for AArch32 guests on BE hosts
- Ensure that we don't loose any state when injecting a 32bit
exception when running on a VHE host
* 64bit VM fixes:
- Fix PtrAuth host saving happening in preemptible contexts
- Optimize PtrAuth lazy enable
- Drop vcpu to cpu context pointer
- Fix sparse warnings for HYP per-CPU accesses
Paolo Bonzini [Thu, 11 Jun 2020 18:01:51 +0000 (14:01 -0400)]
KVM: x86: do not pass poisoned hva to __kvm_set_memory_region
__kvm_set_memory_region does not use the hva at all, so trying to
catch use-after-delete is pointless and, worse, it fails access_ok
now that we apply it to all memslots including private kernel ones.
This fixes an AVIC regression.
Fixes: 09d952c971a5 ("KVM: check userspace_addr for all memslots") Reported-by: Maxim Levitsky <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
Linus Torvalds [Thu, 11 Jun 2020 17:48:12 +0000 (10:48 -0700)]
Merge tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull DAX updates part three from Darrick Wong:
"Now that the xfs changes have landed, this third piece changes the
FS_XFLAG_DAX ioctl code in xfs to request that the inode be reloaded
after the last program closes the file, if doing so would make a S_DAX
change happen. The goal here is to make dax access mode switching
quicker when possible.
Summary:
- Teach XFS to ask the VFS to drop an inode if the administrator
changes the FS_XFLAG_DAX inode flag such that the S_DAX state would
change. This can result in files changing access modes without
requiring an unmount cycle"
* tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs/xfs: Update xfs_ioctl_setattr_dax_invalidate()
fs/xfs: Combine xfs_diflags_to_linux() and xfs_diflags_to_iflags()
fs/xfs: Create function xfs_inode_should_enable_dax()
fs/xfs: Make DAX mount option a tri-state
fs/xfs: Change XFS_MOUNT_DAX to XFS_MOUNT_DAX_ALWAYS
fs/xfs: Remove unnecessary initialization of i_rwsem
Chuck Lever [Fri, 29 May 2020 18:14:40 +0000 (14:14 -0400)]
NFS: Fix direct WRITE throughput regression
I measured a 50% throughput regression for large direct writes.
The observed on-the-wire behavior is that the client sends every
NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once
as a FILE_SYNC WRITE.
This is because the nfs_write_match_verf() check in
nfs_direct_commit_complete() fails for every WRITE.
Buffered writes use nfs_write_completion(), which sets req->wb_verf
correctly. Direct writes use nfs_direct_write_completion(), which
does not set req->wb_verf at all. This leaves req->wb_verf set to
all zeroes for every direct WRITE, and thus
nfs_direct_commit_completion() always sets NFS_ODIRECT_RESCHED_WRITES.
This fix appears to restore nearly all of the lost performance.
Fixes: 1f28476dcb98 ("NFS: Fix O_DIRECT commit verifier handling") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Zheng Bin [Thu, 21 May 2020 09:17:21 +0000 (17:17 +0800)]
nfs: set invalid blocks after NFSv4 writes
Use the following command to test nfsv4(size of file1M is 1MB):
mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1/dir1 /mnt
cp file1M /mnt
du -h /mnt/file1M -->0 within 60s, then 1M
When write is done(cp file1M /mnt), will call this:
nfs_writeback_done
nfs4_write_done
nfs4_write_done_cb
nfs_writeback_update_inode
nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime
nfs_post_op_update_inode_force_wcc_locked
nfs_set_cache_invalid
nfs_refresh_inode_locked
nfs_update_inode
nfsd write response contains change, ctime, mtime, the flag will be
clear after nfs_update_inode. Howerver, write response does not contain
space_used, previous open response contains space_used whose value is 0,
so inode->i_blocks is still 0.
nfs_getattr -->called by "du -h"
do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s
cache_validity = READ_ONCE(NFS_I(inode)->cache_validity)
do_update |= cache_validity & (NFS_INO_INVALID_ATTR -->false
if (do_update) {
__nfs_revalidate_inode
}
Within 60s, does not send getattr request to nfsd, thus "du -h /mnt/file1M"
is 0.
Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done.
Fixes: 16e143751727 ("NFS: More fine grained attribute tracking") Signed-off-by: Zheng Bin <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Colin Ian King [Wed, 27 May 2020 12:56:11 +0000 (13:56 +0100)]
NFS: remove redundant initialization of variable result
The variable result is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>