Parav Pandit [Fri, 15 May 2020 07:44:06 +0000 (02:44 -0500)]
net/mlx5: Fix devlink objects and devlink device unregister sequence
Current below problems exists.
1. devlink device is registered by mlx5_load_one(). But it is
not unregistered by mlx5_unload_one(). This is incorrect.
2. Above issue leads to,
When mlx5 PCI device is removed, currently devlink device is
unregistered before devlink ports are unregistered in below ladder
diagram.
remove_one()
mlx5_devlink_unregister()
[..]
devlink_unregister() <- ports are still registered!
mlx5_unload_one()
mlx5_unregister_device()
mlx5_remove_device()
mlx5e_remove()
mlx5e_devlink_port_unregister()
devlink_port_unregister()
3. Condition checking for registering and unregister device are not
symmetric either in these routines.
Hence, fix the sequence by having load and unload routines symmetric
and in right order.
i.e.
(a) register devlink device followed by registering devlink ports
(b) unregister devlink ports followed by devlink device
Do this based on boot and cleanup flags instead of different
conditions.
Fixes: c6acd629eec7 ("net/mlx5e: Add support for devlink-port in non-representors mode") Fixes: f60f315d339e ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs") Signed-off-by: Parav Pandit <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Parav Pandit [Thu, 14 May 2020 10:12:56 +0000 (05:12 -0500)]
net/mlx5: Disable reload while removing the device
While unregistration is in progress, user might be reloading the
interface.
This can race with unregistration in below flow which uses the
resources which are getting disabled by reload flow.
Hence, disable the devlink reloading first when removing the device.
Aya Levin [Sun, 17 May 2020 09:45:52 +0000 (12:45 +0300)]
net/mlx5e: Fix ethtool hfunc configuration change
Changing RX hash function requires rearranging of RQT internal indexes,
the user isn't exposed to such changes and these changes do not affect
the user configured indirection table. Rebuild RQ table on hfunc change.
After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.
This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.
Shay Drory [Thu, 7 May 2020 06:32:53 +0000 (09:32 +0300)]
net/mlx5: Fix fatal error handling during device load
Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev->state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.
Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
Shay Drory [Wed, 6 May 2020 12:59:48 +0000 (15:59 +0300)]
net/mlx5: drain health workqueue in case of driver load error
In case there is a work in the health WQ when we teardown the driver,
in driver load error flow, the health work will try to read dev->iseg,
which was already unmap in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().
Linus Torvalds [Thu, 11 Jun 2020 22:36:28 +0000 (15:36 -0700)]
Merge tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A small fix for the VDSO code to force inline
__cvdso_clock_gettime_common() so the compiler
can't generate horrible code"
* tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lib/vdso: Force inlining of __cvdso_clock_gettime_common()
Brett Creeley [Fri, 5 Jun 2020 17:09:45 +0000 (10:09 -0700)]
iavf: Fix reporting 2.5 Gb and 5Gb speeds
Commit 4ae4916b5643 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb
speeds") added the ability for the PF to report 2.5 and 5Gb speeds,
however, the iavf driver does not recognize those speeds as the values were
not added there. Add the proper enums and values so that iavf can properly
deal with those speeds.
adapter->link_speed has type enum virtchnl_link_speed but our comparisons
are against enum iavf_aq_link_speed. Though they are, currently, the same
values, change the comparison to the matching enum virtchnl_link_speed
since that may not always be the case.
Brett Creeley [Fri, 5 Jun 2020 17:09:43 +0000 (10:09 -0700)]
iavf: fix speed reporting over virtchnl
Link speeds are communicated over virtchnl using an enum
virtchnl_link_speed. Currently, the highest link speed is 40Gbps which
leaves us unable to reflect some speeds that an ice VF is capable of.
This causes link speed to be misreported on the iavf driver.
Allow for communicating link speeds using Mbps so that the proper speed can
be reported for an ice VF. Moving away from the enum allows us to
communicate future speed changes without requiring a new enum to be added.
In order to support communicating link speeds over virtchnl in Mbps the
following functionality was added:
- Added u32 link_speed_mbps in the iavf_adapter structure.
- Added the macro ADV_LINK_SUPPORT(_a) to determine if the VF
driver supports communicating link speeds in Mbps.
- Added the function iavf_get_vpe_link_status() to fill the
correct link_status in the event_data union based on the
ADV_LINK_SUPPORT(_a) macro.
- Added the function iavf_set_adapter_link_speed_from_vpe()
to determine whether or not to fill the u32 link_speed_mbps or
enum virtchnl_link_speed link_speed field in the iavf_adapter
structure based on the ADV_LINK_SUPPORT(_a) macro.
- Do not free vf_res in iavf_init_get_resources() as vf_res will be
accessed in iavf_get_link_ksettings(); memset to 0 instead. This
memory is subsequently freed in iavf_remove().
Tobias Klauser [Thu, 11 Jun 2020 10:33:41 +0000 (12:33 +0200)]
tools, bpftool: Exit on error in function codegen
Currently, the codegen function might fail and return an error. But its
callers continue without checking its return value. Since codegen can
fail only in the unlikely case of the system running out of memory or
the static template being malformed, just exit(-1) directly from codegen
and make it void-returning.
Li RongQing [Thu, 11 Jun 2020 05:11:06 +0000 (13:11 +0800)]
xdp: Fix xsk_generic_xmit errno
Propagate sock_alloc_send_skb error code, not set it to
EAGAIN unconditionally, when fail to allocate skb, which
might cause that user space unnecessary loops.
Linus Torvalds [Thu, 11 Jun 2020 20:25:53 +0000 (13:25 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge some more updates from Andrew Morton:
- various hotfixes and minor things
- hch's use_mm/unuse_mm clearnups
Subsystems affected by this patch series: mm/hugetlb, scripts, kcov,
lib, nilfs, checkpatch, lib, mm/debug, ocfs2, lib, misc.
* emailed patches from Andrew Morton <[email protected]>:
kernel: set USER_DS in kthread_use_mm
kernel: better document the use_mm/unuse_mm API contract
kernel: move use_mm/unuse_mm to kthread.c
kernel: move use_mm/unuse_mm to kthread.c
stacktrace: cleanup inconsistent variable type
lib: test get_count_order/long in test_bitops.c
mm: add comments on pglist_data zones
ocfs2: fix spelling mistake and grammar
mm/debug_vm_pgtable: fix kernel crash by checking for THP support
lib: fix bitmap_parse() on 64-bit big endian archs
checkpatch: correct check for kernel parameters doc
nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
lib/lz4/lz4_decompress.c: document deliberate use of `&'
kcov: check kcov_softirq in kcov_remote_stop()
scripts/spelling: add a few more typos
khugepaged: selftests: fix timeout condition in wait_for_scan()
Rob Herring [Thu, 11 Jun 2020 14:58:04 +0000 (08:58 -0600)]
dt-bindings: Fix more incorrect 'reg' property sizes in examples
The examples template is a 'simple-bus' with a size of 1 cell for
had between 2 and 4 cells which really only errors on I2C or SPI type
devices with a single cell.
The easiest fix in most cases is to change the 'reg' property to 1 cell
for address and size.
Joerg Roedel [Thu, 11 Jun 2020 09:11:39 +0000 (11:11 +0200)]
alpha: Fix build around srm_sysrq_reboot_op
The patch introducing the struct was probably never compile tested,
because it sets a handler with a wrong function signature. Wrap the
handler into a functions with the correct signature to fix the build.
Linus Torvalds [Thu, 11 Jun 2020 19:55:20 +0000 (12:55 -0700)]
Merge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Kconfig select statements are now sorted alphanumerically
- first-level interrupts are now handled via a full irqchip driver
- CPU hotplug is fixed
- vDSO calls now use the common vDSO infrastructure
* tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: set the permission of vdso_data to read-only
riscv: use vDSO common flow to reduce the latency of the time-related functions
riscv: fix build warning of missing prototypes
RISC-V: Don't mark init section as non-executable
RISC-V: Force select RISCV_INTC for CONFIG_RISCV
RISC-V: Remove do_IRQ() function
clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
irqchip: RISC-V per-HART local interrupt controller driver
RISC-V: Rename and move plic_find_hart_id() to arch directory
RISC-V: self-contained IPI handling routine
RISC-V: Sort select statements alphanumerically
- Fix incorrect mask in HiSilicon L3C perf PMU driver
- Fix compat vDSO compilation under some toolchain configurations
- Fix false UBSAN warning from ACPI IORT parsing code
- Fix booting under bootloaders that ignore TEXT_OFFSET
- Annotate debug initcall function with '__init'"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: warn on incorrect placement of the kernel by the bootloader
arm64: acpi: fix UBSAN warning
arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
drivers/perf: hisi: Fix wrong value for all counters enable
arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
arm64: debug: mark a function as __init to save some memory
scs: Report SCS usage in bytes rather than number of entries
Linus Torvalds [Thu, 11 Jun 2020 19:50:54 +0000 (12:50 -0700)]
Merge tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
- casting clean up in the user access macros
- memory leak on error case fix for PCI probing
- update of a defconfig
* tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k,nommu: fix implicit cast from __user in __{get,put}_user_asm()
m68k,nommu: add missing __user in uaccess' __ptr() macro
m68k: Drop CONFIG_MTD_M25P80 in stmark2_defconfig
m68k/PCI: Fix a memory leak in an error handling path
Rob Herring [Thu, 11 Jun 2020 14:52:38 +0000 (08:52 -0600)]
dt-bindings: phy: qcom: Fix missing 'ranges' and example addresses
The QCom QMP PHY bindings have child nodes with translatable (MMIO)
addresses, so a 'ranges' property is required in the parent node.
Additionally, the examples default to 1 address and size cell, so let's
fix that, too.
Rob Herring [Wed, 10 Jun 2020 21:49:12 +0000 (15:49 -0600)]
dt-bindings: Remove more cases of 'allOf' containing a '$ref'
Another round of 'allOf' removals that came in this cycle.
json-schema versions draft7 and earlier have a weird behavior in that
any keywords combined with a '$ref' are ignored (silently). The correct
form was to put a '$ref' under an 'allOf'. This behavior is now changed
in the 2019-09 json-schema spec and '$ref' can be mixed with other
keywords. The json-schema library doesn't yet support this, but the
tooling now does a fixup for this and either way works.
This has been a constant source of review comments, so let's change this
treewide so everyone copies the simpler syntax.
Tuong Lien [Thu, 11 Jun 2020 10:08:08 +0000 (17:08 +0700)]
tipc: fix NULL pointer dereference in tipc_disc_rcv()
When a bearer is enabled, we create a 'tipc_discoverer' object to store
the bearer related data along with a timer and a preformatted discovery
message buffer for later probing... However, this is only carried after
the bearer was set 'up', that left a race condition resulting in kernel
panic.
It occurs when a discovery message from a peer node is received and
processed in bottom half (since the bearer is 'up' already) just before
the discoverer object is created but is now accessed in order to update
the preformatted buffer (with a new trial address, ...) so leads to the
NULL pointer dereference.
We solve the problem by simply moving the bearer 'up' setting to later,
so make sure everything is ready prior to any message receiving.
Tuong Lien [Thu, 11 Jun 2020 10:07:35 +0000 (17:07 +0700)]
tipc: fix kernel WARNING in tipc_msg_append()
syzbot found the following issue:
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...
This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming") that tried to build at least one buffer even
when the message data length is zero... However, it now exposes another
bug that the 'mss' can be zero and the 'cpy' will be negative, thus the
above kernel WARNING will appear!
The zero value of 'mss' is never expected because it means Nagle is not
enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'),
so the function 'tipc_msg_append()' must not be called at all. But that
was in this particular case since the message data length was zero, and
the 'send <= maxnagle' check became true.
We resolve the issue by explicitly checking if Nagle is enabled for the
socket, i.e. 'maxnagle != 0' before calling the 'tipc_msg_append()'. We
also reinforce the function to against such a negative values if any.
Shannon Nelson [Thu, 11 Jun 2020 04:07:39 +0000 (21:07 -0700)]
ionic: remove support for mgmt device
We no longer support the mgmt device in the ionic driver,
so remove the device id and related code.
Fixes: b3f064e9746d ("ionic: add support for device id 0x1004") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Linus Torvalds [Thu, 11 Jun 2020 19:42:14 +0000 (12:42 -0700)]
Merge tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
"qcom:
- new controller driver for IPCC
- reorg the of_device data
- add support for ipq6018 platform
spreadtrum:
- new sprd controller driver
imx:
- implement suspend/resume PM support
misc:
- make pcc driver struct static
- fix return value in imx_mu_scu
- disable clock before bailout in imx probe
- remove duplicate error mssg in zynqmp probe
- fix header size in imx.scu
- check for null instead of is-err in zynqmp"
* tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: qcom: Add ipq6018 apcs compatible
mailbox: qcom: Add clock driver name in apcs mailbox driver data
dt-bindings: mailbox: Add YAML schemas for QCOM APCS global block
mailbox: imx: ONLY IPC MU needs IRQF_NO_SUSPEND flag
mailbox: imx: Add runtime PM callback to handle MU clocks
mailbox: imx: Add context save/restore for suspend/resume
MAINTAINERS: Add entry for Qualcomm IPCC driver
mailbox: Add support for Qualcomm IPCC
dt-bindings: mailbox: Add devicetree binding for Qcom IPCC
mailbox: zynqmp-ipi: Fix NULL vs IS_ERR() check in zynqmp_ipi_mbox_probe()
mailbox: imx-mailbox: fix scu msg header size check
mailbox: sprd: Add Spreadtrum mailbox driver
dt-bindings: mailbox: Add the Spreadtrum mailbox documentation
mailbox: ZynqMP IPI: Delete an error message in zynqmp_ipi_probe()
mailbox: imx: Disable the clock on devm_mbox_controller_register() failure
mailbox: imx: Fix return in imx_mu_scu_xlate()
mailbox: imx: Support runtime PM
mailbox: pcc: make pcc_mbox_driver static
Xu Wang [Thu, 11 Jun 2020 02:45:20 +0000 (02:45 +0000)]
drivers: dpaa2: Use devm_kcalloc() in setup_dpni()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
Linus Torvalds [Thu, 11 Jun 2020 19:38:11 +0000 (12:38 -0700)]
Merge tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are last-minute fixes gathered before merge window close; a few
fixes are for the core while the rest majority are driver fixes.
- PCM locking annotation fixes and the possible self-lock fix
- ASoC DPCM regression fixes with multi-CPU DAI
- A fix for inconsistent resume from system-PM on USB-audio
- Improved runtime-PM handling with multiple USB interfaces
- Quirks for HD-audio and USB-audio
- Hardened firmware handling in max98390 codec
- A couple of fixes for meson"
* tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
ASoC: rt5645: Add platform-data for Asus T101HA
ASoC: Intel: bytcr_rt5640: Add quirk for Toshiba Encore WT10-A tablet
ASoC: SOF: nocodec: conditionally set dpcm_capture/dpcm_playback flags
ASoC: Intel: boards: replace capture_only by dpcm_capture
ASoC: core: only convert non DPCM link to DPCM link
ASoC: soc-pcm: dpcm: fix playback/capture checks
ASoC: meson: add missing free_irq() in error path
ALSA: pcm: disallow linking stream to itself
ALSA: usb-audio: Manage auto-pm of all bundled interfaces
ALSA: hda/realtek - add a pintbl quirk for several Lenovo machines
ALSA: pcm: fix snd_pcm_link() lockdep splat
ALSA: usb-audio: Use the new macro for HP Dock rename quirks
ALSA: usb-audio: Add vendor, product and profile name for HP Thunderbolt Dock
ALSA: emu10k1: delete an unnecessary condition
dt-bindings: ASoc: Fix tdm-slot documentation spelling error
ASoC: meson: fix memory leak of links if allocation of ldata fails
ALSA: usb-audio: Fix inconsistent card PM state after resume
ASoC: max98390: Fix potential crash during param fw loading
ASoC: max98390: Fix incorrect printf qualifier
ASoC: fsl-asoc-card: Defer probe when fail to find codec device
...
Linus Torvalds [Thu, 11 Jun 2020 19:27:06 +0000 (12:27 -0700)]
Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"One sun4i fix and a connector hotplug race The ast fix is for a
regression in 5.6, and one of the i915 ones fixes an oops reported by
dhowells.
core:
- fix race in connectors sending hotplug
i915:
- Avoid use after free in cmdparser
- Avoid NULL dereference when probing all display encoders
- Fixup to module parameter type
sun4i:
- clock divider fix
ast:
- 24/32 bpp mode setting fix"
* tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm:
drm/ast: fix missing break in switch statement for format->cpp[0] case 4
drm/sun4i: hdmi ddc clk: Fix size of m divider
drm/i915/display: Only query DP state of a DDI encoder
drm/i915/params: fix i915.reset module param type
drm/i915/gem: Mark the buffer pool as active for the cmdparser
drm/connector: notify userspace on hotplug after register complete
Linus Torvalds [Thu, 11 Jun 2020 19:22:41 +0000 (12:22 -0700)]
Merge tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"New features and improvements:
- Sunrpc receive buffer sizes only change when establishing a GSS credentials
- Add more sunrpc tracepoints
- Improve on tracepoints to capture internal NFS I/O errors
Other bugfixes and cleanups:
- Move a dprintk() to after a call to nfs_alloc_fattr()
- Fix off-by-one issues in rpc_ntop6
- Fix a few coccicheck warnings
- Use the correct SPDX license identifiers
- Fix rpc_call_done assignment for BIND_CONN_TO_SESSION
- Replace zero-length array with flexible array
- Remove duplicate headers
- Set invalid blocks after NFSv4 writes to update space_used attribute
- Fix direct WRITE throughput regression"
* tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
NFS: Fix direct WRITE throughput regression
SUNRPC: rpc_xprt lifetime events should record xprt->state
xprtrdma: Make xprt_rdma_slot_table_entries static
nfs: set invalid blocks after NFSv4 writes
NFS: remove redundant initialization of variable result
sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
NFS: Add a tracepoint in nfs_set_pgio_error()
NFS: Trace short NFS READs
NFS: nfs_xdr_status should record the procedure name
SUNRPC: Set SOFTCONN when destroying GSS contexts
SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
SUNRPC: trace RPC client lifetime events
SUNRPC: Trace transport lifetime events
SUNRPC: Split the xdr_buf event class
SUNRPC: Add tracepoint to rpc_call_rpcerror()
SUNRPC: Update the RPC_SHOW_SOCKET() macro
SUNRPC: Update the rpc_show_task_flags() macro
SUNRPC: Trace GSS context lifetimes
SUNRPC: receive buffer size estimation values almost never change
...
Marco Elver [Thu, 21 May 2020 14:20:45 +0000 (16:20 +0200)]
compiler.h: Avoid nested statement expression in data_race()
It appears that compilers have trouble with nested statement
expressions. Therefore, remove one level of statement expression nesting
from the data_race() macro. This will help avoiding potential problems
in the future as its usage increases.
Marco Elver [Thu, 21 May 2020 14:20:44 +0000 (16:20 +0200)]
compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()
The volatile accesses no longer need to be wrapped in data_race()
because compilers that emit instrumentation distinguishing volatile
accesses are required for KCSAN.
Consequently, the explicit kcsan_check_atomic*() are no longer required
either since the compiler emits instrumentation distinguishing the
volatile accesses.
Finally, simplify __READ_ONCE_SCALAR() and remove __WRITE_ONCE_SCALAR().
Marco Elver [Thu, 21 May 2020 14:20:41 +0000 (16:20 +0200)]
kcsan: Remove 'noinline' from __no_kcsan_or_inline
Some compilers incorrectly inline small __no_kcsan functions, which then
results in instrumenting the accesses. For this reason, the 'noinline'
attribute was added to __no_kcsan_or_inline. All known versions of GCC
are affected by this. Supported versions of Clang are unaffected, and
never inline a no_sanitize function.
However, the attribute 'noinline' in __no_kcsan_or_inline causes
unexpected code generation in functions that are __no_kcsan and call a
__no_kcsan_or_inline function.
In certain situations it is expected that the __no_kcsan_or_inline
function is actually inlined by the __no_kcsan function, and *no* calls
are emitted. By removing the 'noinline' attribute, give the compiler
the ability to inline and generate the expected code in __no_kcsan
functions.
Marco Elver [Thu, 21 May 2020 14:20:40 +0000 (16:20 +0200)]
kcsan: Pass option tsan-instrument-read-before-write to Clang
Clang (unlike GCC) removes reads before writes with matching addresses
in the same basic block. This is an optimization for TSAN, since writes
will always cause conflict if the preceding read would have.
However, for KCSAN we cannot rely on this option, because we apply
several special rules to writes, in particular when the
KCSAN_ASSUME_PLAIN_WRITES_ATOMIC option is selected. To avoid missing
potential data races, pass the -tsan-instrument-read-before-write option
to Clang if it is available [1].
Marco Elver [Thu, 21 May 2020 14:20:39 +0000 (16:20 +0200)]
kcsan: Support distinguishing volatile accesses
In the kernel, the "volatile" keyword is used in various concurrent
contexts, whether in low-level synchronization primitives or for
legacy reasons. If supported by the compiler, it will be assumed
that aligned volatile accesses up to sizeof(long long) (matching
compiletime_assert_rwonce_type()) are atomic.
Recent versions of Clang [1] (GCC tentative [2]) can instrument
volatile accesses differently. Add the option (required) to enable the
instrumentation, and provide the necessary runtime functions. None of
the updated compilers are widely available yet (Clang 11 will be the
first release to support the feature).
Marco Elver [Thu, 21 May 2020 14:20:42 +0000 (16:20 +0200)]
kcsan: Restrict supported compilers
The first version of Clang that supports -tsan-distinguish-volatile will
be able to support KCSAN. The first Clang release to do so, will be
Clang 11. This is due to satisfying all the following requirements:
1. Never emit calls to __tsan_func_{entry,exit}.
2. __no_kcsan functions should not call anything, not even
kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE => Requires
leaving them plain!
3. Support atomic_{read,set}*() with KCSAN, which rely on
arch_atomic_{read,set}*() using __{READ,WRITE}_ONCE() => Because of
#2, rely on Clang 11's -tsan-distinguish-volatile support. We will
double-instrument atomic_{read,set}*(), but that's reasonable given
it's still lower cost than the data_race() variant due to avoiding 2
extra calls (kcsan_{en,dis}able_current() calls).
4. __always_inline functions inlined into __no_kcsan functions are never
instrumented.
5. __always_inline functions inlined into instrumented functions are
instrumented.
6. __no_kcsan_or_inline functions may be inlined into __no_kcsan functions =>
Implies leaving 'noinline' off of __no_kcsan_or_inline.
7. Because of #6, __no_kcsan and __no_kcsan_or_inline functions should never be
spuriously inlined into instrumented functions, causing the accesses of the
__no_kcsan function to be instrumented.
Older versions of Clang do not satisfy #3. The latest GCC currently
doesn't support at least #1, #3, and #7.
Marco Elver [Thu, 21 May 2020 14:20:38 +0000 (16:20 +0200)]
kcsan: Avoid inserting __tsan_func_entry/exit if possible
To avoid inserting __tsan_func_{entry,exit}, add option if supported by
compiler. Currently only Clang can be told to not emit calls to these
functions. It is safe to not emit these, since KCSAN does not rely on
them.
Note that, if we disable __tsan_func_{entry,exit}(), we need to disable
tail-call optimization in sanitized compilation units, as otherwise we
may skip frames in the stack trace; in particular when the tail called
function is one of the KCSAN's runtime functions, and a report is
generated, we might miss the function where the actual access occurred.
Since __tsan_func_{entry,exit}() insertion effectively disabled
tail-call optimization, there should be no observable change.
This was caught and confirmed with kcsan-test & UNWINDER_ORC.
Thomas Gleixner [Thu, 11 Jun 2020 18:02:46 +0000 (20:02 +0200)]
Rebase locking/kcsan to locking/urgent
Merge the state of the locking kcsan branch before the read/write_once()
and the atomics modifications got merged.
Squash the fallout of the rebase on top of the read/write once and atomic
fallback work into the merge. The history of the original branch is
preserved in tag locking-kcsan-2020-06-02.
Paolo Bonzini [Thu, 11 Jun 2020 18:02:32 +0000 (14:02 -0400)]
Merge tag 'kvmarm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for Linux 5.8, take #1
* 32bit VM fixes:
- Fix embarassing mapping issue between AArch32 CSSELR and AArch64
ACTLR
- Add ACTLR2 support for AArch32
- Get rid of the useless ACTLR_EL1 save/restore
- Fix CP14/15 accesses for AArch32 guests on BE hosts
- Ensure that we don't loose any state when injecting a 32bit
exception when running on a VHE host
* 64bit VM fixes:
- Fix PtrAuth host saving happening in preemptible contexts
- Optimize PtrAuth lazy enable
- Drop vcpu to cpu context pointer
- Fix sparse warnings for HYP per-CPU accesses
Paolo Bonzini [Thu, 11 Jun 2020 18:01:51 +0000 (14:01 -0400)]
KVM: x86: do not pass poisoned hva to __kvm_set_memory_region
__kvm_set_memory_region does not use the hva at all, so trying to
catch use-after-delete is pointless and, worse, it fails access_ok
now that we apply it to all memslots including private kernel ones.
This fixes an AVIC regression.
Fixes: 09d952c971a5 ("KVM: check userspace_addr for all memslots") Reported-by: Maxim Levitsky <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
Linus Torvalds [Thu, 11 Jun 2020 17:48:12 +0000 (10:48 -0700)]
Merge tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull DAX updates part three from Darrick Wong:
"Now that the xfs changes have landed, this third piece changes the
FS_XFLAG_DAX ioctl code in xfs to request that the inode be reloaded
after the last program closes the file, if doing so would make a S_DAX
change happen. The goal here is to make dax access mode switching
quicker when possible.
Summary:
- Teach XFS to ask the VFS to drop an inode if the administrator
changes the FS_XFLAG_DAX inode flag such that the S_DAX state would
change. This can result in files changing access modes without
requiring an unmount cycle"
* tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs/xfs: Update xfs_ioctl_setattr_dax_invalidate()
fs/xfs: Combine xfs_diflags_to_linux() and xfs_diflags_to_iflags()
fs/xfs: Create function xfs_inode_should_enable_dax()
fs/xfs: Make DAX mount option a tri-state
fs/xfs: Change XFS_MOUNT_DAX to XFS_MOUNT_DAX_ALWAYS
fs/xfs: Remove unnecessary initialization of i_rwsem
Chuck Lever [Fri, 29 May 2020 18:14:40 +0000 (14:14 -0400)]
NFS: Fix direct WRITE throughput regression
I measured a 50% throughput regression for large direct writes.
The observed on-the-wire behavior is that the client sends every
NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once
as a FILE_SYNC WRITE.
This is because the nfs_write_match_verf() check in
nfs_direct_commit_complete() fails for every WRITE.
Buffered writes use nfs_write_completion(), which sets req->wb_verf
correctly. Direct writes use nfs_direct_write_completion(), which
does not set req->wb_verf at all. This leaves req->wb_verf set to
all zeroes for every direct WRITE, and thus
nfs_direct_commit_completion() always sets NFS_ODIRECT_RESCHED_WRITES.
This fix appears to restore nearly all of the lost performance.
Fixes: 1f28476dcb98 ("NFS: Fix O_DIRECT commit verifier handling") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Zheng Bin [Thu, 21 May 2020 09:17:21 +0000 (17:17 +0800)]
nfs: set invalid blocks after NFSv4 writes
Use the following command to test nfsv4(size of file1M is 1MB):
mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1/dir1 /mnt
cp file1M /mnt
du -h /mnt/file1M -->0 within 60s, then 1M
When write is done(cp file1M /mnt), will call this:
nfs_writeback_done
nfs4_write_done
nfs4_write_done_cb
nfs_writeback_update_inode
nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime
nfs_post_op_update_inode_force_wcc_locked
nfs_set_cache_invalid
nfs_refresh_inode_locked
nfs_update_inode
nfsd write response contains change, ctime, mtime, the flag will be
clear after nfs_update_inode. Howerver, write response does not contain
space_used, previous open response contains space_used whose value is 0,
so inode->i_blocks is still 0.
nfs_getattr -->called by "du -h"
do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s
cache_validity = READ_ONCE(NFS_I(inode)->cache_validity)
do_update |= cache_validity & (NFS_INO_INVALID_ATTR -->false
if (do_update) {
__nfs_revalidate_inode
}
Within 60s, does not send getattr request to nfsd, thus "du -h /mnt/file1M"
is 0.
Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done.
Fixes: 16e143751727 ("NFS: More fine grained attribute tracking") Signed-off-by: Zheng Bin <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Colin Ian King [Wed, 27 May 2020 12:56:11 +0000 (13:56 +0100)]
NFS: remove redundant initialization of variable result
The variable result is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
Chuck Lever [Tue, 12 May 2020 21:14:05 +0000 (17:14 -0400)]
NFS: Trace short NFS READs
A short read can generate an -EIO error without there being an error
on the wire. This tracepoint acts as an eyecatcher when there is no
obvious I/O error.
Commit a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with
'struct cred'.") made rpc_call_null_helper() set RPC_TASK_NULLCREDS
unconditionally. Therefore there's no need for
rpc_call_null_helper()'s call sites to set RPC_TASK_NULLCREDS.
Chuck Lever [Tue, 12 May 2020 21:13:39 +0000 (17:13 -0400)]
SUNRPC: trace RPC client lifetime events
The "create" tracepoint records parts of the rpc_create arguments,
and the shutdown tracepoint records when the rpc_clnt is about to
signal pending tasks and destroy auths.
Chuck Lever [Tue, 12 May 2020 21:13:28 +0000 (17:13 -0400)]
SUNRPC: Split the xdr_buf event class
To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.
Chuck Lever [Tue, 12 May 2020 21:13:01 +0000 (17:13 -0400)]
SUNRPC: receive buffer size estimation values almost never change
Avoid unnecessary cache sloshing by placing the buffer size
estimation update logic behind an atomic bit flag.
The size of GSS information included in each wrapped Reply does
not change during the lifetime of a GSS context. Therefore, the
au_rslack and au_ralign fields need to be updated only once after
establishing a fresh GSS credential.
Thus a slack size update must occur after a cred is created,
duplicated, renewed, or expires. I'm not sure I have this exactly
right. A trace point is introduced to track updates to these
variables to enable troubleshooting the problem if I missed a spot.
Jonas Karlman [Fri, 22 May 2020 20:21:33 +0000 (22:21 +0200)]
media: rkvdec: Fix H264 scaling list order
The Rockchip Video Decoder driver is expecting that the values in a
scaling list are in zig-zag order and applies the inverse scanning process
to get the values in matrix order.
Commit 0b0393d59eb4 ("media: uapi: h264: clarify expected
scaling_list_4x4/8x8 order") clarified that the values in the scaling list
should already be in matrix order.
Fix this by removing the reordering and change to use two memcpy.
Tomi Valkeinen [Wed, 27 May 2020 08:23:34 +0000 (10:23 +0200)]
media: videobuf2-dma-contig: fix bad kfree in vb2_dma_contig_clear_max_seg_size
Commit 9495b7e92f716ab2bd6814fab5e97ab4a39adfdd ("driver core: platform:
Initialize dma_parms for platform devices") in v5.7-rc5 causes
vb2_dma_contig_clear_max_seg_size() to kfree memory that was not
allocated by vb2_dma_contig_set_max_seg_size().
The assumption in vb2_dma_contig_set_max_seg_size() seems to be that
dev->dma_parms is always NULL when the driver is probed, and the case
where dev->dma_parms has bee initialized by someone else than the driver
(by calling vb2_dma_contig_set_max_seg_size) will cause a failure.
All the current users of these functions are platform devices, which now
always have dma_parms set by the driver core. To fix the issue for v5.7,
make vb2_dma_contig_set_max_seg_size() return an error if dma_parms is
NULL to be on the safe side, and remove the kfree code from
vb2_dma_contig_clear_max_seg_size().
For v5.8 we should remove the two functions and move the
dma_set_max_seg_size() calls into the drivers.
Michael Rodin [Wed, 27 May 2020 16:21:32 +0000 (18:21 +0200)]
media: v4l2-subdev.rst: correct information about v4l2 events
Remove description of non-existing v4l2_subdev.nevents and replace the
undefined flag V4L2_SUBDEV_USES_EVENTS by the correct flag
V4L2_SUBDEV_FL_HAS_EVENTS, which is already documented in v4l2_subdev.flags
Marek Szyprowski [Thu, 28 May 2020 14:03:26 +0000 (16:03 +0200)]
media: s5p-mfc: Properly handle dma_parms for the allocated devices
Commit 9495b7e92f71 ("driver core: platform: Initialize dma_parms for
platform devices") in v5.7-rc5 added allocation of dma_parms structure to
all platform devices. Then vb2_dma_contig_set_max_seg_size() have been
changed not to allocate dma_parms structure and rely on the one allocated
by the device core. Lets allocate the needed structure also for the
devices created for the 2 MFC device memory ports.
media: medium: cec: Make MEDIA_CEC_SUPPORT default to n if !MEDIA_SUPPORT
Recently, MEDIA_CEC_SUPPORT became indepedent of MEDIA_SUPPORT.
However, if MEDIA_SUPPORT is not enabled, MEDIA_SUPPORT_FILTER is not
defined, and MEDIA_CEC_SUPPORT is thus enabled by default, which is not
desirable.
Fix this by adding a dependency on MEDIA_CEC_SUPPORT to the default
configuration.
Fixes: 46d2a3b964ddbe63 ("media: place CEC menu before MEDIA_SUPPORT") Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Samuel Holland [Sat, 9 May 2020 20:06:42 +0000 (22:06 +0200)]
media: cedrus: Program output format during each run
Previously, the output format was programmed as part of the ioctl()
handler. However, this has two problems:
1) If there are multiple active streams with different output
formats, the hardware will use whichever format was set last
for both streams. Similarly, an ioctl() done in an inactive
context will wrongly affect other active contexts.
2) The registers are written while the device is not actively
streaming. To enable runtime PM tied to the streaming state,
all hardware access needs to be moved inside cedrus_device_run().
The call to cedrus_dst_format_set() is now placed just before the
codec-specific callback that programs the hardware.
media: atomisp: improve sensor detection code to use _DSM table
Instead of keep hardcoding device-specific tables, read them
directly from the ACPI BIOS, if available.
This method is know to work with Asus T101HA device. the
same table is also visible on EzPad devices. So, it seems
that at least some BIOSes use this method to pass data about
ISP2401-connected sensors.
media: atomisp: get rid of an iomem abstraction layer
The hive_isp_css_custom_host_hrt.h code, together
with atomisp_helper.h, provides an abstraction layer for
some functions inside atomisp_compat_css20.c and atomisp_cmd.c.
There's no good reason for that. In a matter of fact, after
removing the abstraction, the code looked a lot cleaner
and easier to understand.
So, get rid of them.
While here, get rid also of the udelay(1) abstraction code.
There are several parts of the driver that could produce
a "dfs failed!" message. Change the texts, in order to help
identifying from where they're coming.
media: atomisp: change the detection of ISP2401 at runtime
Instead of having a static var to detect it, let's use the
already-existing arch-specific bytes, as this is how other
parts of the code also checks when it needs to do something
different, depending on an specific chipset version.
media: atomisp: don't set hpll_freq twice with different values
The logic which sets the hpll_freq for BYT sets hpll_freq
to 1600MHz, but ignores it, and sets it again after reading
from-device-specific EFI vars (this time, using a default
of 2000MHz).
Remove the first set, as this will be overriden anyway.
While here, do minor adjustments on comments and on a
printk message.
Well, this is something that we don't have upstream. So,
without further details about that, we can't really parse it.
If we ever end supporting those devices with the upstream driver,
this patch can be reverted and the device can be detected
via DMI (or maybe via PCI ID?).
There are lots of mess with IRQ ifdef settings. As the
*_global.h will already detect the type of IRQ system at
compile time, we can get rid of them, replacing by just
one ifdef for ISP2401.