Git Repo - linux.git/log

parisc: Avoid kernel panic triggered by invalid kprobe

When running gdb I was able to trigger this kernel panic:

Kernel Fault: Code=26 (Data memory access rights trap) at addr 0000000000000060
CPU: 0 PID: 1401 Comm: gdb-crash Not tainted 5.2.0-rc7-64bit+ #1053

      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001000000000000001111 Not tainted
r00-03  000000000804000f 0000000040dee1a0 0000000040c78cf0 00000000b8d50160
r04-07  0000000040d2b1a0 000000004360a098 00000000bbbe87b8 0000000000000003
r08-11  00000000fac20a70 00000000fac24160 00000000fac1bbe0 0000000000000000
r12-15  00000000fabfb79a 00000000fac244a4 0000000000010000 0000000000000001
r16-19  00000000bbbe87b8 00000000f8f02910 0000000000010034 0000000000000000
r20-23  00000000fac24630 00000000fac24630 000000006474e552 00000000fac1aa52
r24-27  0000000000000028 00000000bbbe87b8 00000000bbbe87b8 0000000040d2b1a0
r28-31  0000000000000000 00000000b8d501c0 00000000b8d501f0 0000000003424000
sr00-03  0000000000423000 0000000000000000 0000000000000000 0000000000423000
sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040c78cf0 0000000040c78cf4
  IIR: 539f00c0    ISR: 0000000000000000  IOR: 0000000000000060
  CPU:        0   CR30: 00000000b8d50000 CR31: 00000000d22345e2
  ORIG_R28: 0000000040250798
  IAOQ[0]: parisc_kprobe_ss_handler+0x58/0x170
  IAOQ[1]: parisc_kprobe_ss_handler+0x5c/0x170
  RP(r2): parisc_kprobe_ss_handler+0x58/0x170
Backtrace:
  [<0000000040206ff8>] handle_interruption+0x178/0xbb8
Kernel panic - not syncing: Kernel Fault

Avoid this panic by checking the return value of kprobe_running() and
skip kprobe if none is currently active.

Cc: <[email protected]> # v5.2
Acked-by: Sven Schnelle <[email protected]>
Tested-by: Rolf Eike Beer <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: Ensure userspace privilege for ptraced processes in regset functions

On parisc the privilege level of a process is stored in the lowest two bits of
the instruction pointers (IAOQ0 and IAOQ1). On Linux we use privilege level 0
for the kernel and privilege level 3 for user-space. So userspace should not be
allowed to modify IAOQ0 or IAOQ1 of a ptraced process to change it's privilege
level to e.g. 0 to try to gain kernel privileges.

This patch prevents such modifications in the regset support functions by
always setting the two lowest bits to one (which relates to privilege level 3
for user-space) if IAOQ0 or IAOQ1 are modified via ptrace regset calls.

Link: https://bugs.gentoo.org/481768
Cc: <[email protected]> # v4.7+
Tested-by: Rolf Eike Beer <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1

On parisc the privilege level of a process is stored in the lowest two bits of
the instruction pointers (IAOQ0 and IAOQ1). On Linux we use privilege level 0
for the kernel and privilege level 3 for user-space. So userspace should not be
allowed to modify IAOQ0 or IAOQ1 of a ptraced process to change it's privilege
level to e.g. 0 to try to gain kernel privileges.

This patch prevents such modifications by always setting the two lowest bits to
one (which relates to privilege level 3 for user-space) if IAOQ0 or IAOQ1 are
modified via ptrace calls in the native and compat ptrace paths.

Link: https://bugs.gentoo.org/481768
Reported-by: Jeroen Roovers <[email protected]>
Cc: <[email protected]>
Tested-by: Rolf Eike Beer <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

net_sched: unset TCQ_F_CAN_BYPASS when adding filters

For qdisc's that support TC filters and set TCQ_F_CAN_BYPASS,
notably fq_codel, it makes no sense to let packets bypass the TC
filters we setup in any scenario, otherwise our packets steering
policy could not be enforced.

This can be reproduced easily with the following script:

ip li add dev dummy0 type dummy
ifconfig dummy0 up
tc qd add dev dummy0 root fq_codel
tc filter add dev dummy0 parent 8001: protocol arp basic action mirred egress redirect dev lo
tc filter add dev dummy0 parent 8001: protocol ip basic action mirred egress redirect dev lo
ping -I dummy0 192.168.112.1

Without this patch, packets are sent directly to dummy0 without
hitting any of the filters. With this patch, packets are redirected
to loopback as expected.

This fix is not perfect, it only unsets the flag but does not set it back
because we have to save the information somewhere in the qdisc if we
really want that. Note, both fq_codel and sfq clear this flag in their
->bind_tcf() but this is clearly not sufficient when we don't use any
class ID.

Fixes: 23624935e0c4 ("net_sched: TCQ_F_CAN_BYPASS generalization")
Cc: Eric Dumazet <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'platform-drivers-x86-v5.3-2' of git://git.infradead.org/linux-platform-drivers-x86

Pull another x86 platform driver update from Andy Shevchenko:
"Provide better naming for ABI, i.e. tell that we have fan boost mode.

  It won't break any ABI, but has to be done now to avoid confusion in
  the future"

* tag 'platform-drivers-x86-v5.3-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: asus: Rename "fan mode" to "fan boost mode"

Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux

Pull thermal management updates from Zhang Rui:

- Convert thermal documents to ReST (Mauro Carvalho Chehab)

- Fix a cyclic depedency in between thermal core and governors (Daniel
   Lezcano)

- Fix processor_thermal_device driver to re-evaluate power limits after
   resume (Srinivas Pandruvada, Zhang Rui)

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  drivers: thermal: processor_thermal_device: Fix build warning
  docs: thermal: convert to ReST
  thermal/drivers/core: Use governor table to initialize
  thermal/drivers/core: Add init section table for self-encapsulation
  drivers: thermal: processor_thermal: Read PPCC on resume

Merge tag 'gpio-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:

- Revert a SPIO GPIO fix that didn't fix anything but instead created
   new problems.

- Remove the EM GPIO irqdomain in a safe manner.

- Fix a memory leak in the gpio quirks.

- Make the DaVinci error path silent on probe deferral.

* tag 'gpio-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  Revert "gpio/spi: Fix spi-gpio regression on active high CS"
  gpio: em: remove the gpiochip before removing the irq domain
  gpiolib: of: fix a memory leak in of_gpio_flags_quirks()
  gpio: davinci: silence error prints in case of EPROBE_DEFER

Merge branch 'net-rds-RDMA-fixes'

Gerd Rausch says:

====================
net/rds: RDMA fixes

A number of net/rds fixes necessary to make "rds_rdma.ko"
pass some basic Oracle internal tests.
====================

Signed-off-by: David S. Miller <[email protected]>

net/rds: Initialize ic->i_fastreg_wrs upon allocation

Otherwise, if an IB connection is torn down before "rds_ib_setup_qp"
is called, the value of "ic->i_fastreg_wrs" is still at zero
(as it wasn't initialized by "rds_ib_setup_qp").
Consequently "rds_ib_conn_path_shutdown" will spin forever,
waiting for it to go back to "RDS_IB_DEFAULT_FR_WR",
which of course will never happen as there are no
outstanding work requests.

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: Keep track of and wait for FRWR segments in use upon shutdown

Since "rds_ib_free_frmr" and "rds_ib_free_frmr_list" simply put
the FRMR memory segments on the "drop_list" or "free_list",
and it is the job of "rds_ib_flush_mr_pool" to reap those entries
by ultimately issuing a "IB_WR_LOCAL_INV" work-request,
we need to trigger and then wait for all those memory segments
attached to a particular connection to be fully released before
we can move on to release the QP, CQ, etc.

So we make "rds_ib_conn_path_shutdown" wait for one more
atomic_t called "i_fastreg_inuse_count" that keeps track of how
many FRWR memory segments are out there marked "FRMR_IS_INUSE"
(and also wake_up rds_ib_ring_empty_wait, as they go away).

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: Set fr_state only to FRMR_IS_FREE if IB_WR_LOCAL_INV had been successful

Fix a bug where fr_state first goes to FRMR_IS_STALE, because of a failure
of operation IB_WR_LOCAL_INV, but then gets set back to "FRMR_IS_FREE"
uncoditionally, even though the operation failed.

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: Fix NULL/ERR_PTR inconsistency

Make function "rds_ib_try_reuse_ibmr" return NULL in case
memory region could not be allocated, since callers
simply check if the return value is not NULL.

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: Wait for the FRMR_IS_FREE (or FRMR_IS_STALE) transition after posting IB_WR_LOCAL_INV

In order to:
1) avoid a silly bouncing between "clean_list" and "drop_list"
   triggered by function "rds_ib_reg_frmr" as it is releases frmr
   regions whose state is not "FRMR_IS_FREE" right away.

2) prevent an invalid access error in a race from a pending
   "IB_WR_LOCAL_INV" operation with a teardown ("dma_unmap_sg", "put_page")
   and de-registration ("ib_dereg_mr") of the corresponding
   memory region.

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: Get rid of "wait_clean_list_grace" and add locking

Waiting for activity on the "clean_list" to quiesce is no substitute
for proper locking.

We can have multiple threads competing for "llist_del_first"
via "rds_ib_reuse_mr", and a single thread competing
for "llist_del_all" and "llist_del_first" via "rds_ib_flush_mr_pool".

Since "llist_del_first" depends on "list->first->next" not to change
in the midst of the operation, simply waiting for all current calls
to "rds_ib_reuse_mr" to quiesce across all CPUs is woefully inadequate:

By the time "wait_clean_list_grace" is done iterating over all CPUs to see
that there is no concurrent caller to "rds_ib_reuse_mr", a new caller may
have just shown up on the first CPU.

Furthermore, <linux/llist.h> explicitly calls out the need for locking:
* Cases where locking is needed:
* If we have multiple consumers with llist_del_first used in one consumer,
* and llist_del_first or llist_del_all used in other consumers,
* then a lock is needed.

Also, while at it, drop the unused "pool" parameter
from "list_to_llist_nodes".

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/rds: Give fr_state a chance to transition to FRMR_IS_FREE

In the context of FRMR (ib_frmr.c):

Memory regions make it onto the "clean_list" via "rds_ib_flush_mr_pool",
after the memory region has been posted for invalidation via
"rds_ib_post_inv".

At that point in time, "fr_state" may still be in state "FRMR_IS_INUSE",
since the only place where "fr_state" transitions to "FRMR_IS_FREE"
is in "rds_ib_mr_cqe_handler", which is triggered by a tasklet.

So in case we notice that "fr_state != FRMR_IS_FREE" (see below),
we wait for "fr_inv_done" to trigger with a maximum of 10msec.
Then we check again, and only put the memory region onto the drop_list
(via "rds_ib_free_frmr") in case the situation remains unchanged.

This avoids the problem of memory-regions bouncing between "clean_list"
and "drop_list" before they even have a chance to be properly invalidated.

Signed-off-by: Gerd Rausch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: Make NET_ACT_CT depends on NF_NAT

If NF_NAT is m and NET_ACT_CT is y, build fails:

net/sched/act_ct.o: In function `tcf_ct_act':
act_ct.c:(.text+0x21ac): undefined reference to `nf_ct_nat_ext_add'
act_ct.c:(.text+0x229a): undefined reference to `nf_nat_icmp_reply_translation'
act_ct.c:(.text+0x233a): undefined reference to `nf_nat_setup_info'
act_ct.c:(.text+0x234a): undefined reference to `nf_nat_alloc_null_binding'
act_ct.c:(.text+0x237c): undefined reference to `nf_nat_packet'

Reported-by: Hulk Robot <[email protected]>
Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sctp: fix warning "NULL check before some freeing functions is not needed"

This patch removes NULL checks before calling kfree.

fixes below issues reported by coccicheck
net/sctp/sm_make_chunk.c:2586:3-8: WARNING: NULL check before some
freeing functions is not needed.
net/sctp/sm_make_chunk.c:2652:3-8: WARNING: NULL check before some
freeing functions is not needed.
net/sctp/sm_make_chunk.c:2667:3-8: WARNING: NULL check before some
freeing functions is not needed.
net/sctp/sm_make_chunk.c:2684:3-8: WARNING: NULL check before some
freeing functions is not needed.

Signed-off-by: Hariprasad Kelam <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

caif-hsi: fix possible deadlock in cfhsi_exit_module()

cfhsi_exit_module() calls unregister_netdev() under rtnl_lock().
but unregister_netdev() internally calls rtnl_lock().
So deadlock would occur.

Fixes: c41254006377 ("caif-hsi: Add rtnl support")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'hwlock-v5.3' of git://github.com/andersson/remoteproc

Pull hwspinlock updates from Bjorn Andersson:
"This contains support for hardware spinlock TI K3 AM65x and J721E
  family of SoCs, support for using hwspinlocks from atomic context and
  better error reporting when dealing with hardware disabled in
  DeviceTree"

* tag 'hwlock-v5.3' of git://github.com/andersson/remoteproc:
  hwspinlock: add the 'in_atomic' API
  hwspinlock: document the hwspinlock 'raw' API
  hwspinlock: stm32: implement the relax() ops
  hwspinlock: ignore disabled device
  hwspinlock/omap: Add a trace during probe
  hwspinlock/omap: Add support for TI K3 SoCs
  dt-bindings: hwlock: Update OMAP binding for TI K3 SoCs

Merge tag 'rproc-v5.3' of git://github.com/andersson/remoteproc

Pull remoteproc updates from Bjorn Andersson:
"This adds support for the STM32 remoteproc, additional i.MX platforms
  with Cortex M4 remoteprocs and Qualcomm's QCS404 Compute DSP.

  Also initial support for vendor specific resource table entries and
  support for unprocessed Qualcomm firmware files"

* tag 'rproc-v5.3' of git://github.com/andersson/remoteproc:
  remoteproc: stm32: fix building without ARM SMCC
  remoteproc: qcom: q6v5-mss: Fix build error without QCOM_MDT_LOADER
  remoteproc: copy parent dma_pfn_offset for vdev
  remoteproc: qcom: q6v5-mss: Support loading non-split images
  soc: qcom: mdt_loader: Support loading non-split images
  remoteproc: stm32: add an ST stm32_rproc driver
  dt-bindings: remoteproc: add bindings for stm32 remote processor driver
  dt-bindings: stm32: add bindings for ML-AHB interconnect
  remoteproc: Use struct_size() helper
  remoteproc: add vendor resources handling
  remoteproc: imx: Fix typo in "failed"
  remoteproc: imx: Broaden the Kconfig selection logic
  remoteproc,rpmsg: add missing MAINTAINERS file entries
  remoteproc: qcom: qdsp6-adsp: Add support for QCS404 CDSP
  dt-bindings: remoteproc: Rename and amend Hexagon v56 binding

drm/amdkfd: Remove GWS from process during uninit

If we shut down a process without having destroyed its GWS-using
queues, it is possible that GWS BO will still be in the process
BO list during the gpuvm destruction. This list should be empty
at that time, so we should remove the GWS allocation at the
process uninit point if it is still around.

Signed-off-by: Joseph Greathouse <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/amdgpu: Fix offset for vmid selection in debugfs interface

The register debugfs interface was using the wrong bitmask for vmid
selection for GFX_CNTL.

Signed-off-by: Tom St Denis <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: update vega20 driver if to fit latest SMU firmware

Optimization for the socket power calculation is introduced.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: maintain SMU FW backward compatibility

Do not halt driver loading on if_version mismatch. As our
driver and FWs are backward compatible.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: correct smu_update_table usage

The interface was used in a confusing way. In profile mode scenario,
the 2nd parameter of the interface was used in a different way from
other scenarios.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: fix deadlock around smu_handle_task V2

As the lock was already held on the entrance to smu_handle_task.

- V2: lock in small granularity

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: avoid access before allocation

No access before allocation.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: fix memory allocation failure check V2

Fix memory allocation failure check.

- V2: fix one more similar error

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Fix silent amdgpu_bo_move failures

Under memory pressure, buffer moves between RAM to VRAM can
fail when there is no GTT space available. In those cases
amdgpu_bo_move falls back to ttm_bo_move_memcpy, which seems to
succeed, although it doesn't really support non-contiguous or
invisible VRAM. This manifests as VM faults with corrupted page
table entries in KFD eviction stress tests.

Print some helpful messages when lack of GTT space is causing buffer
moves to fail. Check that source and destination memory regions are
supported by ttm_bo_move_memcpy before taking that fallback.

Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: drop dead header

Not used anymore.

Reviewed-by: Evan Quan <[email protected]>
Acked-by: Christian König <[email protected]>
Noticed-by: Dave Airlie <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

Merge tag 'rpmsg-v5.3' of git://github.com/andersson/remoteproc

Pull rpmsg updates from Bjorn Andersson:
"This contains a DT binding update and a change to make the remote
  function of rpmsg_devices optional"

* tag 'rpmsg-v5.3' of git://github.com/andersson/remoteproc:
  rpmsg: core: Make remove handler for rpmsg driver optional.
  dt-bindings: soc: qcom: Add remote-pid binding for GLINK SMEM

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio, vhost updates from Michael Tsirkin:
"Fixes, features, performance:

   - new iommu device

   - vhost guest memory access using vmap (just meta-data for now)

   - minor fixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-mmio: add error check for platform_get_irq
  scsi: virtio_scsi: Use struct_size() helper
  iommu/virtio: Add event queue
  iommu/virtio: Add probe request
  iommu: Add virtio-iommu driver
  PCI: OF: Initialize dev->fwnode appropriately
  of: Allow the iommu-map property to omit untranslated devices
  dt-bindings: virtio: Add virtio-pci-iommu node
  dt-bindings: virtio-mmio: Add IOMMU description
  vhost: fix clang build warning
  vhost: access vq metadata through kernel virtual address
  vhost: factor out setting vring addr and num
  vhost: introduce helpers to get the size of metadata area
  vhost: rename vq_iotlb_prefetch() to vq_meta_prefetch()
  vhost: fine grain userspace memory accessors
  vhost: generalize adding used elem

Merge tag 'vfio-v5.3-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

- Static symbol cleanup in mdev samples (Kefeng Wang)

- Use vma help in nvlink code (Peng Hao)

- Remove unused code in mbochs sample (YueHaibing)

- Send uevents around mdev registration (Alex Williamson)

* tag 'vfio-v5.3-rc1' of git://github.com/awilliam/linux-vfio:
  mdev: Send uevents around parent device registration
  sample/mdev/mbochs: remove set but not used variable 'mdev_state'
  vfio: vfio_pci_nvlink2: use a vma helper function
  vfio-mdev/samples: make some symbols static

kbuild: split out *.mod out of {single,multi}-used-m rules

Currently, *.mod is created as a side-effect of obj-m.

Split out *.mod as a dedicated build rule, which allows to unify
the %.c -> %.o rule, and remove the single-used-m rule.

This also makes the incremental build of allmodconfig faster because
it saves $(NM) invocation when there is no change in the module.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: remove 'prepare1' target

Now that there is no rule for 'prepare1', it can go away.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: remove the first line of *.mod files

The current format of *.mod is like this:

  line 1: directory path to the .ko file
  line 2: a list of objects linked into this module
  line 3: unresolved symbols (only when CONFIG_TRIM_UNUSED_KSYMS=y)

Now that *.mod and *.ko are created in the same directory, the line 1
provides no valuable information. It can be derived by replacing the
extension .mod with .ko. In fact, nobody uses the first line any more.

Cut down the first line.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: create *.mod with full directory path and remove MODVERDIR

While descending directories, Kbuild produces objects for modules,
but do not link final *.ko files; it is done in the modpost.

To keep track of modules, Kbuild creates a *.mod file in $(MODVERDIR)
for every module it is building. Some post-processing steps read the
necessary information from *.mod files. This avoids descending into
directories again. This mechanism was introduced in 2003 or so.

Later, commit 551559e13af1 ("kbuild: implement modules.order") added
modules.order. So, we can simply read it out to know all the modules
with directory paths. This is easier than parsing the first line of
*.mod files.

$(MODVERDIR) has a flat directory structure, that is, *.mod files
are named only with base names. This is based on the assumption that
the module name is unique across the tree. This assumption is really
fragile.

Stephen Rothwell reported a race condition caused by a module name
conflict:

https://lkml.org/lkml/2019/5/13/991

In parallel building, two different threads could write to the same
$(MODVERDIR)/*.mod simultaneously.

Non-unique module names are the source of all kind of troubles, hence
commit 3a48a91901c5 ("kbuild: check uniqueness of module names")
introduced a new checker script.

However, it is still fragile in the build system point of view because
this race happens before scripts/modules-check.sh is invoked. If it
happens again, the modpost will emit unclear error messages.

To fix this issue completely, create *.mod with full directory path
so that two threads never attempt to write to the same file.

$(MODVERDIR) is no longer needed.

Since modules with directory paths are listed in modules.order, Kbuild
is still able to find *.mod files without additional descending.

I also killed cmd_secanalysis; scripts/mod/sumversion.c computes MD4 hash
for modules with MODULE_VERSION(). When CONFIG_DEBUG_SECTION_MISMATCH=y,
it occurs not only in the modpost stage, but also during directory
descending, where sumversion.c may parse stale *.mod files. It would emit
'No such file or directory' warning when an object consisting a module is
renamed, or when a single-obj module is turned into a multi-obj module or
vice versa.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Nicolas Pitre <[email protected]>

kbuild: export_report: read modules.order instead of .tmp_versions/*.mod

Towards the goal of removing MODVERDIR aka .tmp_versions, read out
modules.order to get the list of modules to be processed. This is
simpler than parsing *.mod files in .tmp_versions.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod

Towards the goal of removing MODVERDIR, read out modules.order to get
the list of modules to be processed. This is simpler than parsing *.mod
files in $(MODVERDIR).

For external modules, $(KBUILD_EXTMOD)/modules.order should be read.

I removed the single target %.ko from the top Makefile. To make sure
modpost works correctly, vmlinux and the other modules must be built.
You cannot build a particular .ko file alone.

Signed-off-by: Masahiro Yamada <[email protected]>

dm: use printk ratelimiting functions

DM provided its own ratelimiting printk wrapper but given printk
advances this is no longer needed.

Also, switching DMDEBUG_LIMIT to using pr_debug_ratelimited() fixes the
reported issue where DMDEBUG_LIMIT() still caused a flood of "callbacks
suppressed" messages.

Reported-by: Milan Broz <[email protected]>
Depends-on: 29fc2bc7539386 ("printk: pr_debug_ratelimited: check state first to reduce "callbacks suppressed" messages")
Signed-off-by: Mike Snitzer <[email protected]>

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
"This round of clk driver and framework updates is heavy on the driver
  update side. The two main highlights in the core framework are the
  addition of an bulk clk_get API that handles optional clks and an
  extra debugfs file that tells the developer about the current parent
  of a clk.

  The driver updates are dominated by i.MX in the diffstat, but that is
  mostly because that SoC has started converting to the clk_hw style of
  clk registration. The next big update is in the Amlogic meson clk
  driver that gained some support for audio, cpu, and temperature clks
  while fixing some PLL issues. Finally, the biggest thing that stands
  out is the conversion of a large part of the Allwinner sunxi-ng driver
  to the new clk parent scheme that uses less strings and more pointer
  comparisons to match clk parents and children up.

  In general, it looks like we have a lot of little fixes and tweaks
  here and there to clk data along with the normal addition of a handful
  of new drivers and a couple new core framework features.

  Core:
   - Add a 'clk_parent' file in clk debugfs
   - Add a clk_bulk_get_optional() API (with devm too)

  New Drivers:
   - Support gated clk controller on MIPS based BCM63XX SoCs
   - Support SiLabs Si5341 and Si5340 chips
   - Support for CPU clks on Raspberry Pi devices
   - Audsys clock driver for MediaTek MT8516 SoCs

  Updates:
   - Convert a large portion of the Allwinner sunxi-ng driver to new clk parent scheme
   - Small frequency support for SiLabs Si544 chips
   - Slow clk support for AT91 SAM9X60 SoCs
   - Remove dead code in various clk drivers (-Wunused)
   - Support for Marvell 98DX1135 SoCs
   - Get duty cycle of generic pwm clks
   - Improvement in mmc phase calculation and cleanup of some rate defintions
   - Switch i.MX6 and i.MX7 clock drivers to clk_hw based APIs
   - Add GPIO, SNVS and GIC clocks for i.MX8 drivers
   - Mark imx6sx/ul/ull/sll MMDC_P1_IPG and imx8mm DRAM_APB as critical clock
   - Correct imx7ulp nic1_bus_clk and imx8mm audio_pll2_clk clock setting
   - Add clks for new Exynos5422 Dynamic Memory Controller driver
   - Clock definition for Exynos4412 Mali
   - Add CMM (Color Management Module) clocks on Renesas R-Car H3, M3-N, E3, and D3
   - Add TPU (Timer Pulse Unit / PWM) clocks on Renesas RZ/G2M
   - Support for 32 bit clock IDs in TI's sci-clks for J721e SoCs
   - TI clock probing done from DT by default instead of firmware
   - Fix Amlogic Meson mpll fractional part and spread sprectrum issues
   - Add Amlogic meson8 audio clocks
   - Add Amlogic g12a temperature sensors clocks
   - Add Amlogic g12a and g12b cpu clocks
   - Add TPU (Timer Pulse Unit / PWM) clocks on Renesas R-Car H3, M3-W, and M3-N
   - Add CMM (Color Management Module) clocks on Renesas R-Car M3-W
   - Add Clock Domain support on Renesas RZ/N1"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (190 commits)
  clk: consoldiate the __clk_get_hw() declarations
  clk: sprd: Add check for return value of sprd_clk_regmap_init()
  clk: lochnagar: Update DT binding doc to include the primary SPDIF MCLK
  clk: Add Si5341/Si5340 driver
  dt-bindings: clock: Add silabs,si5341
  clk: clk-si544: Implement small frequency change support
  clk: add BCM63XX gated clock controller driver
  devicetree: document the BCM63XX gated clock bindings
  clk: at91: sckc: use dedicated functions to unregister clock
  clk: at91: sckc: improve error path for sama5d4 sck registration
  clk: at91: sckc: remove unnecessary line
  clk: at91: sckc: improve error path for sam9x5 sck register
  clk: at91: sckc: add support to free slow clock osclillator
  clk: at91: sckc: add support to free slow rc oscillator
  clk: at91: sckc: add support to free slow oscillator
  clk: rockchip: export HDMIPHY clock on rk3228
  clk: rockchip: add watchdog pclk on rk3328
  clk: rockchip: add clock id for hdmi_phy special clock on rk3228
  clk: rockchip: add clock id for watchdog pclk on rk3328
  clk: at91: sckc: add support for SAM9X60
  ...

Merge tag 'rtc-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
"A quiet cycle this time.

   - ds1307: properly handle oscillator failure flags

   - imx-sc: alarm support

   - pcf2123: alarm support, correct offset handling

   - sun6i: add R40 support

   - simplify getting the adapter of an i2c client"

* tag 'rtc-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (37 commits)
  rtc: wm831x: Add IRQF_ONESHOT flag
  rtc: stm32: remove one condition check in stm32_rtc_set_alarm()
  rtc: pcf2123: Fix build error
  rtc: interface: Change type of 'count' from int to u64
  rtc: pcf8563: Clear event flags and disable interrupts before requesting irq
  rtc: pcf8563: Fix interrupt trigger method
  rtc: pcf2123: fix negative offset rounding
  rtc: pcf2123: add alarm support
  rtc: pcf2123: use %ptR
  rtc: pcf2123: port to regmap
  rtc: pcf2123: remove sysfs register view
  rtc: rx8025: simplify getting the adapter of a client
  rtc: rx8010: simplify getting the adapter of a client
  rtc: rv8803: simplify getting the adapter of a client
  rtc: m41t80: simplify getting the adapter of a client
  rtc: fm3130: simplify getting the adapter of a client
  rtc: tegra: Drop MODULE_ALIAS
  rtc: sun6i: Add R40 compatible
  dt-bindings: rtc: sun6i: Add the R40 RTC compatible
  dt-bindings: rtc: Convert Allwinner A31 RTC to a schema
  ...

Merge tag 'dmaengine-5.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine updates from Vinod Koul:

- Add support in dmaengine core to do device node checks for DT devices
   and update bunch of drivers to use that and remove open coding from
   drivers

- New driver/driver support for new hardware, namely:
     - MediaTek UART APDMA
     - Freescale i.mx7ulp edma2
     - Synopsys eDMA IP core version 0
     - Allwinner H6 DMA

- Updates to axi-dma and support for interleaved cyclic transfers

- Greg's debugfs return value check removals on drivers

- Updates to stm32-dma, hsu, dw, pl330, tegra drivers

* tag 'dmaengine-5.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (68 commits)
  dmaengine: Revert "dmaengine: fsl-edma: add i.mx7ulp edma2 version support"
  dmaengine: at_xdmac: check for non-empty xfers_list before invoking callback
  Documentation: dmaengine: clean up description of dmatest usage
  dmaengine: tegra210-adma: remove PM_CLK dependency
  dmaengine: fsl-edma: add i.mx7ulp edma2 version support
  dt-bindings: dma: fsl-edma: add new i.mx7ulp-edma
  dmaengine: fsl-edma-common: version check for v2 instead
  dmaengine: fsl-edma-common: move dmamux register to another single function
  dmaengine: fsl-edma: add drvdata for fsl-edma
  dmaengine: Revert "dmaengine: fsl-edma: support little endian for edma driver"
  dmaengine: rcar-dmac: Reject zero-length slave DMA requests
  dmaengine: dw: Enable iDMA 32-bit on Intel Elkhart Lake
  dmaengine: dw-edma: fix semicolon.cocci warnings
  dmaengine: sh: usb-dmac: Use [] to denote a flexible array member
  dmaengine: dmatest: timeout value of -1 should specify infinite wait
  dmaengine: dw: Distinguish ->remove() between DW and iDMA 32-bit
  dmaengine: fsl-edma: support little endian for edma driver
  dmaengine: hsu: Revert "set HSU_CH_MTSR to memory width"
  dmagengine: pl330: add code to get reset property
  dt-bindings: pl330: document the optional resets property
  ...

Merge tag 'mips_5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Paul Burton:
"A light batch this time around but significant improvements for
  certain systems:

   - Removal of readq & writeq for MIPS32 kernels where they would
     simply BUG() anyway, allowing drivers or other code that #ifdefs on
     their presence to work properly.

   - Improvements for Ingenic JZ4740 systems, including support for the
     external memory controller & pinmuxing fixes for qi_lb60/NanoNote
     systems.

   - Improvements for Lantiq systems, in particular around SMP & IPIs.

   - DT updates for ralink/MediaTek MT7628a systems to probe & configure
     a bunch more devices.

   - Miscellaneous cleanups & build fixes"

* tag 'mips_5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (30 commits)
  MIPS: fix some more fall through errors in arch/mips
  MIPS: perf events: handle switch statement falling through warnings
  mips/kprobes: Export kprobe_fault_handler()
  MAINTAINERS: Add myself as Ingenic SoCs maintainer
  MIPS: ralink: mt7628a.dtsi: Add watchdog controller DT node
  MIPS: ralink: mt7628a.dtsi: Add SPI controller DT node
  MIPS: ralink: mt7628a.dtsi: Add GPIO controller DT node
  MIPS: ralink: mt7628a.dtsi: Add pinctrl DT properties to the UART nodes
  MIPS: ralink: mt7628a.dtsi: Add pinmux DT node
  MIPS: ralink: mt7628a.dtsi: Add SPDX GPL-2.0 license identifier
  MIPS: lantiq: Add SMP support for lantiq interrupt controller
  MIPS: lantiq: Shorten register names, remove unused macros
  MIPS: lantiq: Fix bitfield masking
  MIPS: lantiq: Remove unused macros
  MIPS: lantiq: Fix attributes of of_device_id structure
  MIPS: lantiq: Change variables to the same type as the source
  MIPS: lantiq: Move macro directly to iomem function
  mips: Remove q-accessors from non-64bit platforms
  FDDI: defza: Include linux/io-64-nonatomic-lo-hi.h
  MIPS: configs: Remove useless UEVENT_HELPER_PATH
  ...

Merge tag 'h8300-for-linus-20190617' of git://git.sourceforge.jp/gitroot/uclinux-h8/linux

Pull h8300 update from Yoshinori Sato:
"Remove unused barrier defines"

* tag 'h8300-for-linus-20190617' of git://git.sourceforge.jp/gitroot/uclinux-h8/linux:
H8300: remove unused barrier defines

Merge tag 'for-linus-20190617' of git://git.sourceforge.jp/gitroot/uclinux-h8/linux

Pull SH updates from Yoshinori Sato.

kprobe fix, defconfig updates and a SH Kconfig fix.

* tag 'for-linus-20190617' of git://git.sourceforge.jp/gitroot/uclinux-h8/linux:
  arch/sh: Check for kprobe trap number before trying to handle a kprobe trap
  sh: configs: Remove useless UEVENT_HELPER_PATH
  Fix allyesconfig output.

KVM: LAPIC: Make lapic timer unpinned

Commit 61abdbe0bcc2 ("kvm: x86: make lapic hrtimer pinned") pinned the
lapic timer to avoid to wait until the next kvm exit for the guest to
see KVM_REQ_PENDING_TIMER set. There is another solution to give a kick
after setting the KVM_REQ_PENDING_TIMER bit, make lapic timer unpinned
will be used in follow up patches.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

platform/x86: asus: Rename "fan mode" to "fan boost mode"

The Asus WMI spec indicates that the function being controlled here
is called "Fan Boost Mode". The user-facing documentation also calls it
this.

The spec uses the term "fan mode" is used to refer to other things,
including functionality expected to appear on future products.
We missed this before as we are not dealing with the most readable of
specs, and didn't forsee any confusion around shortening the name.

Rename "fan mode" to "fan boost mode" to improve consistency with the
spec and to avoid a future naming conflict.

There is no interface breakage here since this has yet to be included
in an official kernel release. I also updated the kernel version listed
under ABI accordingly.

Signed-off-by: Daniel Drake <[email protected]>
Acked-by: Yurii Pavlovskyi <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>

Merge branch 'akpm' (patches from Andrew)

Merge more updates from Andrew Morton:
"VM:
   - z3fold fixes and enhancements by Henry Burns and Vitaly Wool

   - more accurate reclaimed slab caches calculations by Yafang Shao

   - fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
     Christoph Hellwig

   - !CONFIG_MMU fixes by Christoph Hellwig

   - new novmcoredd parameter to omit device dumps from vmcore, by
     Kairui Song

   - new test_meminit module for testing heap and pagealloc
     initialization, by Alexander Potapenko

   - ioremap improvements for huge mappings, by Anshuman Khandual

   - generalize kprobe page fault handling, by Anshuman Khandual

   - device-dax hotplug fixes and improvements, by Pavel Tatashin

   - enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V

   - add pte_devmap() support for arm64, by Robin Murphy

   - unify locked_vm accounting with a helper, by Daniel Jordan

   - several misc fixes

  core/lib:
   - new typeof_member() macro including some users, by Alexey Dobriyan

   - make BIT() and GENMASK() available in asm, by Masahiro Yamada

   - changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better
     code generation, by Alexey Dobriyan

   - rbtree code size optimizations, by Michel Lespinasse

   - convert struct pid count to refcount_t, by Joel Fernandes

  get_maintainer.pl:
   - add --no-moderated switch to skip moderated ML's, by Joe Perches

  misc:
   - ptrace PTRACE_GET_SYSCALL_INFO interface

   - coda updates

   - gdb scripts, various"

[ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ]

* emailed patches from Andrew Morton <[email protected]>: (100 commits)
  fs/select.c: use struct_size() in kmalloc()
  mm: add account_locked_vm utility function
  arm64: mm: implement pte_devmap support
  mm: introduce ARCH_HAS_PTE_DEVMAP
  mm: clean up is_device_*_page() definitions
  mm/mmap: move common defines to mman-common.h
  mm: move MAP_SYNC to asm-generic/mman-common.h
  device-dax: "Hotremove" persistent memory that is used like normal RAM
  mm/hotplug: make remove_memory() interface usable
  device-dax: fix memory and resource leak if hotplug fails
  include/linux/lz4.h: fix spelling and copy-paste errors in documentation
  ipc/mqueue.c: only perform resource calculation if user valid
  include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
  scripts/gdb: add helpers to find and list devices
  scripts/gdb: add lx-genpd-summary command
  drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
  kernel/pid.c: convert struct pid count to refcount_t
  drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
  select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
  select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR
  ...

dm kcopyd: Increase default sub-job size to 512KB

Currently, kcopyd has a sub-job size of 64KB and a maximum number of 8
sub-jobs. As a result, for any kcopyd job, we have a maximum of 512KB of
I/O in flight.

This upper limit to the amount of in-flight I/O under-utilizes fast
devices and results in decreased throughput, e.g., when writing to a
snapshotted thin LV with I/O size less than the pool's block size (so
COW is performed using kcopyd).

Increase kcopyd's default sub-job size to 512KB, so we have a maximum of
4MB of I/O in flight for each kcopyd job. This results in an up to 96%
improvement of bandwidth when writing to a snapshotted thin LV, with I/O
sizes less than the pool's block size.

Also, add dm_mod.kcopyd_subjob_size_kb module parameter to allow users
to fine tune the sub-job size of kcopyd. The default value of this
parameter is 512KB and the maximum allowed value is 1024KB.

We evaluate the performance impact of the change by running the
snap_breaking_throughput benchmark, from the device mapper test suite
[1].

The benchmark:

  1. Creates a 1G thin LV
  2. Provisions the thin LV
  3. Takes a snapshot of the thin LV
  4. Writes to the thin LV with:

      dd if=/dev/zero of=/dev/vg/thin_lv oflag=direct bs=<I/O size>

Running this benchmark with various thin pool block sizes and dd I/O
sizes (all combinations triggering the use of kcopyd) we get the
following results:

+-----------------+-------------+------------------+-----------------+
| Pool block size | dd I/O size | BW before (MB/s) | BW after (MB/s) |
+-----------------+-------------+------------------+-----------------+
|       1 MB      |      256 KB |       242        |       280       |
|       1 MB      |      512 KB |       238        |       295       |
|                 |             |                  |                 |
|       2 MB      |      256 KB |       238        |       354       |
|       2 MB      |      512 KB |       241        |       380       |
|       2 MB      |        1 MB |       245        |       394       |
|                 |             |                  |                 |
|       4 MB      |      256 KB |       248        |       412       |
|       4 MB      |      512 KB |       234        |       432       |
|       4 MB      |        1 MB |       251        |       474       |
|       4 MB      |        2 MB |       257        |       504       |
|                 |             |                  |                 |
|       8 MB      |      256 KB |       239        |       420       |
|       8 MB      |      512 KB |       256        |       431       |
|       8 MB      |        1 MB |       264        |       467       |
|       8 MB      |        2 MB |       264        |       502       |
|       8 MB      |        4 MB |       281        |       537       |
+-----------------+-------------+------------------+-----------------+

[1] https://github.com/jthornber/device-mapper-test-suite

Signed-off-by: Nikos Tsironis <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm snapshot: fix oversights in optional discard support

__find_snapshots_sharing_cow() should always be used with _origins_lock
held so fix snapshot_io_hints() accordingly. Also, once a snapshot is
being merged discards must not be allowed -- otherwise incorrect or
duplicate work will be performed.

Fixes: 2e6023850e177d ("dm snapshot: add optional discard support features")
Reported-by: Nikos Tsironis <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm zoned: fix zone state management race

dm-zoned uses the zone flag DMZ_ACTIVE to indicate that a zone of the
backend device is being actively read or written and so cannot be
reclaimed. This flag is set as long as the zone atomic reference
counter is not 0. When this atomic is decremented and reaches 0 (e.g.
on BIO completion), the active flag is cleared and set again whenever
the zone is reused and BIO issued with the atomic counter incremented.
These 2 operations (atomic inc/dec and flag set/clear) are however not
always executed atomically under the target metadata mutex lock and
this causes the warning:

WARN_ON(!test_bit(DMZ_ACTIVE, &zone->flags));

in dmz_deactivate_zone() to be displayed. This problem is regularly
triggered with xfstests generic/209, generic/300, generic/451 and
xfs/077 with XFS being used as the file system on the dm-zoned target
device. Similarly, xfstests ext4/303, ext4/304, generic/209 and
generic/300 trigger the warning with ext4 use.

This problem can be easily fixed by simply removing the DMZ_ACTIVE flag
and managing the "ACTIVE" state by directly looking at the reference
counter value. To do so, the functions dmz_activate_zone() and
dmz_deactivate_zone() are changed to inline functions respectively
calling atomic_inc() and atomic_dec(), while the dmz_is_active() macro
is changed to an inline function calling atomic_read().

Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target")
Cc: [email protected]
Reported-by: Masato Suzuki <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

iomap: move internal declarations into fs/iomap/

Move internal function declarations out of fs/internal.h into
include/linux/iomap.h so that our transition is complete.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

iomap: move the main iteration code into a separate file

Move the main iteration code into a separate file so that we can group
related functions in a single file instead of having a single enormous
source file.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

iomap: move the buffered IO code into a separate file

Move the buffered IO code into a separate file so that we can group
related functions in a single file instead of having a single enormous
source file.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

iomap: move the direct IO code into a separate file

Move the direct IO code into a separate file so that we can group
related functions in a single file instead of having a single enormous
source file.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

iomap: move the SEEK_HOLE code into a separate file

Move the SEEK_HOLE/SEEK_DATA code into a separate file so that we can
group related functions in a single file instead of having a single
enormous source file.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

iomap: move the file mapping reporting code into a separate file

Move the file mapping reporting code (FIEMAP/FIBMAP) into a separate
file so that we can group related functions in a single file instead of
having a single enormous source file.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

iomap: move the swapfile code into a separate file

Move the swapfile activation code into a separate file so that we can
group related functions in a single file instead of having a single
enormous source file.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>

kbuild: modsign: read modules.order instead of $(MODVERDIR)/*.mod

Towards the goal of removing MODVERDIR, read out modules.order to get
the list of modules to be signed. This is simpler than parsing *.mod
files in $(MODVERDIR).

The modules_sign target is only supported for in-kernel modules.
So, this commit does not take care of external modules.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: modinst: read modules.order instead of $(MODVERDIR)/*.mod

Towards the goal of removing MODVERDIR, read out modules.order to get
the list of modules to be installed. This is simpler than parsing *.mod
files in $(MODVERDIR).

For external modules, $(KBUILD_EXTMOD)/modules.order should be read.

Signed-off-by: Masahiro Yamada <[email protected]>

scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.ver

Nothing depends on this, so it is dead code.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: remove duplication from modules.order in sub-directories

Currently, only the top-level modules.order drops duplicated entries.

The modules.order files in sub-directories potentially contain
duplication. To list out the paths of all modules, I want to use
modules.order instead of parsing *.mod files in $(MODVERDIR).

To achieve this, I want to rip off duplication from modules.order
of external modules too.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: get rid of kernel/ prefix from in-tree modules.{order,builtin}

Removing the 'kernel/' prefix will make our life easier because we can
simply do 'cat modules.order' to get all built modules with full paths.

Currently, we parse the first line of '*.mod' files in $(MODVERDIR).
Since we have duplicated functionality here, I plan to remove MODVERDIR
entirely.

In fact, modules.order is generated also for external modules in a
broken format. It adds the 'kernel/' prefix to the absolute path of
the module, like this:

kernel//path/to/your/external/module/foo.ko

This is fine for now since modules.order is not used for external
modules. However, I want to sanitize the format everywhere towards
the goal of removing MODVERDIR.

We cannot change the format of installed module.{order,builtin}.
So, 'make modules_install' will add the 'kernel/' prefix while copying
them to $(MODLIB)/.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: do not create empty modules.order in the prepare stage

Currently, $(objtree)/modules.order is touched in two places.

In the 'prepare0' rule, scripts/Makefile.build creates an empty
modules.order while processing 'obj=.'

In the 'modules' rule, the top-level Makefile overwrites it with
the correct list of modules.

While this might be a good side-effect that modules.order is made
empty every time (probably this is not intended functionality),
I personally do not like this behavior.

Create modules.order only when it is sensible to do so.

This avoids creating the following pointless files:

  scripts/basic/modules.order
  scripts/dtc/modules.order
  scripts/gcc-plugins/modules.order
  scripts/genksyms/modules.order
  scripts/mod/modules.order
  scripts/modules.order
  scripts/selinux/genheaders/modules.order
  scripts/selinux/mdp/modules.order
  scripts/selinux/modules.order

Going forward, $(objtree)/modules.order lists the modules that
was built in the last successful build.

Signed-off-by: Masahiro Yamada <[email protected]>

coccinelle: api: add devm_platform_ioremap_resource script

Use recently introduced devm_platform_ioremap_resource
helper which wraps platform_get_resource() and
devm_ioremap_resource() together. This helps produce much
cleaner code and remove local `struct resource` declaration.

Signed-off-by: Himanshu Jha <[email protected]>
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: compile-test headers listed in header-test-m as well

It will be useful to control the header-test by a tristate option.

If CONFIG_FOO is a tristate option, you can write like this:

header-test-$(CONFIG_FOO) += foo.h

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: remove unused hostcc-option

We can re-add this whenever it is needed. At this moment, it is unused.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: remove tag files by distclean instead of mrproper

It takes somewhat long time to generate these tag files.
Keep such precious files until we run 'make distclean'.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: add --hash-style= and --build-id unconditionally

As commit 1e0221374e30 ("mips: vdso: drop unnecessary cc-ldoption")
explained, these flags are supported by the minimal required version
of binutils. They are supported by ld.lld too.

Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>

kbuild: get rid of misleading $(AS) from documents

The assembler files in the kernel are *.S instead of *.s, so they must
be preprocessed. Since 'as' of GNU binutils is not able to preprocess,
we always use $(CC) as an assembler driver.

$(AS) is almost unused in Kbuild. As of v5.2, there is just one place
that directly invokes $(AS).

$ git grep -e '$(AS)' -e '${AS}' -e '$AS' -e '$(AS:' -e '${AS:' -- :^Documentation
drivers/net/wan/Makefile: AS68K = $(AS)

The documentation about *_AFLAGS* sounds like the flags were passed
to $(AS). This is somewhat misleading.

Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>

kconfig: fix missing choice values in auto.conf

Since commit 00c864f8903d ("kconfig: allow all config targets to write
auto.conf if missing"), Kconfig creates include/config/auto.conf in the
defconfig stage when it is missing.

Joonas Kylmälä reported incorrect auto.conf generation under some
circumstances.

To reproduce it, apply the following diff:

|  --- a/arch/arm/configs/imx_v6_v7_defconfig
|  +++ b/arch/arm/configs/imx_v6_v7_defconfig
|  @@ -345,14 +345,7 @@ CONFIG_USB_CONFIGFS_F_MIDI=y
|   CONFIG_USB_CONFIGFS_F_HID=y
|   CONFIG_USB_CONFIGFS_F_UVC=y
|   CONFIG_USB_CONFIGFS_F_PRINTER=y
|  -CONFIG_USB_ZERO=m
|  -CONFIG_USB_AUDIO=m
|  -CONFIG_USB_ETH=m
|  -CONFIG_USB_G_NCM=m
|  -CONFIG_USB_GADGETFS=m
|  -CONFIG_USB_FUNCTIONFS=m
|  -CONFIG_USB_MASS_STORAGE=m
|  -CONFIG_USB_G_SERIAL=m
|  +CONFIG_USB_FUNCTIONFS=y
|   CONFIG_MMC=y
|   CONFIG_MMC_SDHCI=y
|   CONFIG_MMC_SDHCI_PLTFM=y

And then, run:

$ make ARCH=arm mrproper imx_v6_v7_defconfig

You will see CONFIG_USB_FUNCTIONFS=y is correctly contained in the
.config, but not in the auto.conf.

Please note drivers/usb/gadget/legacy/Kconfig is included from a choice
block in drivers/usb/gadget/Kconfig. So USB_FUNCTIONFS is a choice value.

This is probably a similar situation described in commit beaaddb62540
("kconfig: tests: test defconfig when two choices interact").

When sym_calc_choice() is called, the choice symbol forgets the
SYMBOL_DEF_USER unless all of its choice values are explicitly set by
the user.

The choice symbol is given just one chance to recall it because
set_all_choice_values() is called if SYMBOL_NEED_SET_CHOICE_VALUES
is set.

When sym_calc_choice() is called again, the choice symbol forgets it
forever, since SYMBOL_NEED_SET_CHOICE_VALUES is a one-time aid.
Hence, we cannot call sym_clear_all_valid() again and again.

It is crazy to repeat set and unset of internal flags. However, we
cannot simply get rid of "sym->flags &= flags | ~SYMBOL_DEF_USER;"
Doing so would re-introduce the problem solved by commit 5d09598d488f
("kconfig: fix new choices being skipped upon config update").

To work around the issue, conf_write_autoconf() stopped calling
sym_clear_all_valid().

conf_write() must be changed accordingly. Currently, it clears
SYMBOL_WRITE after the symbol is written into the .config file. This
is needed to prevent it from writing the same symbol multiple times in
case the symbol is declared in two or more locations. I added the new
flag SYMBOL_WRITTEN, to track the symbols that have been written.

Anyway, this is a cheesy workaround in order to suppress the issue
as far as defconfig is concerned.

Handling of choices is totally broken. sym_clear_all_valid() is called
every time a user touches a symbol from the GUI interface. To reproduce
it, just add a new symbol drivers/usb/gadget/legacy/Kconfig, then touch
around unrelated symbols from menuconfig. USB_FUNCTIONFS will disappear
from the .config file.

I added the Fixes tag since it is more fatal than before. But, this
has been broken since long long time before, and still it is.
We should take a closer look to fix this correctly somehow.

Fixes: 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing")
Cc: linux-stable <[email protected]> # 4.19+
Reported-by: Joonas Kylmälä <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
Tested-by: Joonas Kylmälä <[email protected]>

KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_counters

To avoid semantic inconsistency, the fixed_counters in Intel vPMU
need to be reset to 0 in intel_pmu_reset() as gp_counters does.

Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

dma-direct: only limit the mapping size if swiotlb could be used

Don't just check for a swiotlb buffer, but also if buffering might
be required for this particular device.

Fixes: 133d624b1cee ("dma: Introduce dma_max_mapping_size()")
Reported-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Benjamin Herrenschmidt <[email protected]>

dma-mapping: add a dma_addressing_limited helper

This helper returns if the device has issues addressing all present
memory in the system.

Signed-off-by: Christoph Hellwig <[email protected]>

xen/pv: Fix a boot up hang revealed by int3 self test

Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in int3 exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[    0.772876] general protection fault: 0000 [#1] SMP NOPTI
[    0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[    0.772893] RIP: e030:int3_magic+0x0/0x7
[    0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246
[    0.773334] Call Trace:
[    0.773334]  alternative_instructions+0x3d/0x12e
[    0.773334]  check_bugs+0x7c9/0x887
[    0.773334]  ? __get_locked_pte+0x178/0x1f0
[    0.773334]  start_kernel+0x4ff/0x535
[    0.773334]  ? set_init_arg+0x55/0x55
[    0.773334]  xen_start_kernel+0x571/0x57a

For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with
%rcx/%r11 on the stack. To convert back to "normal" looking exceptions,
the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.

E.g. Extracting 'xen_pv_trap xenint3' we have:
xen_xenint3:
pop %rcx;
pop %r11;
jmp xenint3

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andrew Cooper <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

x86/xen: Add "nopv" support for HVM guest

PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time. In this case, we
have to panic early if PVH is detected and nopv is enabled to avoid a
worse situation later.

Remove static from bool_x86_init_noop/x86_op_int_noop so they could be
used globally. Move xen_platform_hvm() after xen_hvm_guest_late_init()
to avoid compile error.

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable

.. as "nopv" support needs it to be changeable at boot up stage.

Checkpatch reports warning, so move variable declarations from
hypervisor.c to hypervisor.h

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete

Clean up unnecessory code after that operation.

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

x86: Add "nopv" parameter to disable PV extensions

In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. We have "xen_nopv" for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter "nopv" set across all PV guest implementations is a
better choice.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

Suggested-by: Juergen Gross <[email protected]>
Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jan Kiszka <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init

.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

xen: remove tmem driver

The Xen tmem (transcendent memory) driver can be removed, as the
related Xen hypervisor feature never made it past the "experimental"
state and will be removed in future Xen versions (>= 4.13).

The xen-selfballoon driver depends on tmem, so it can be removed, too.

Signed-off-by: Juergen Gross <[email protected]>
Acked-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized"

This reverts commit ca5d376e17072c1b60c3fee66f3be58ef018952d.

Commit 8990cac6e5ea ("x86/jump_label: Initialize static branching
early") adds jump_label_init() call in setup_arch() to make static
keys initialized early, so we could use the original simpler code
again.

Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

xen/events: fix binding user event channels to cpus

When binding an interdomain event channel to a vcpu via
IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be
bound, but the affinity of the associated IRQi must be changed, too.
Otherwise the IRQ and the event channel won't be moved to another vcpu
in case the original vcpu they were bound to is going offline.

Cc: <[email protected]> # 4.13
Fixes: c48f64ab472389df ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU")
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

scsi: megaraid_sas: set an unlimited max_segment_size

When using a virt_boundary_mask, as done for NVMe devices attached to
megaraid_sas controllers, we require an unlimited max_segment_size as the
virt boundary merging code assumes that.  But we also need to propagate
that to the DMA mapping layer to make dma-debug happy.  The SCSI layer
takes care of that when using the per-host virt_boundary setting, but
given that megaraid_sas only wants to set the virt_boundary for actual
NVMe devices, we can't rely on that.  The DMA layer maximum segment is
global to the HBA however, so we have to set it explicitly.  This patch
assumes that megaraid_sas does not have a segment size limitation, which
seems true based on the SGL format, but will need to be verified.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs

When using a virt_boundary_mask, as done for NVMe devices attached to
mpt3sas controllers, we require an unlimited max_segment_size as the virt
boundary merging code assumes that.  But we also need to propagate that to
the DMA mapping layer to make dma-debug happy.  The SCSI layer takes care
of that when using the per-host virt_boundary setting, but given that
mpt3sas only wants to set the virt_boundary for actual NVMe devices, we
can't rely on that.  The DMA layer maximum segment is global to the HBA
however, so we have to set it explicitly.  This patch assumes that mpt3sas
does not have a segment size limitation, which seems true based on the SGL
format, but will need to be verified.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Suganath Prabu <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: IB/srp: set virt_boundary_mask in the scsi host

This ensures all proper DMA layer handling is taken care of by the SCSI
midlayer.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Acked-by: Bart Van Assche <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: IB/iser: set virt_boundary_mask in the scsi host

This ensures all proper DMA layer handling is taken care of by the SCSI
midlayer.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Sagi Grimberg <[email protected]>
Reviewed-by: Max Gurtovoy <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: storvsc: set virt_boundary_mask in the scsi host template

This ensures all proper DMA layer handling is taken care of by the SCSI
midlayer.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: ufshcd: set max_segment_size in the scsi host template

We need to also mirror the value to the device to ensure IOMMU merging
doesn't undo it, and the SCSI host level parameter will ensure that.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: core: take the DMA max mapping size into account

We need to limit the device's max_sectors to what the DMA mapping
implementation can support. If not, we risk running out of swiotlb
buffers easily.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: core: add a host / host template field for the virt boundary

This allows drivers setting it up easily instead of branching out to block
layer calls in slave_alloc, and ensures the upgraded max_segment_size
setting gets picked up by the DMA layer.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Kashyap Desai < [email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

switch the remnants of releasing the mountpoint away from fs_pin

We used to need rather convoluted ordering trickery to guarantee
that dput() of ex-mountpoints happens before the final mntput()
of the same. Since we don't need that anymore, there's no point
playing with fs_pin for that.

Signed-off-by: Al Viro <[email protected]>

get rid of detach_mnt()

Lift getting the original mount (dentry is actually not needed at all)
of the mountpoint into the callers - to do_move_mount() and pivot_root()
level. That simplifies the cleanup in those and allows to get saner
arguments for attach_mnt_recursive().

Signed-off-by: Al Viro <[email protected]>

virtio_pmem: fix sparse warning

This patch fixes below sparse warning related to __virtio
type in virtio pmem driver. This is reported by Intel test
bot on linux-next tree.

nd_virtio.c:56:28: warning: incorrect type in assignment
                                (different base types)
nd_virtio.c:56:28:    expected unsigned int [unsigned] [usertype] type
nd_virtio.c:56:28:    got restricted __virtio32
nd_virtio.c:93:59: warning: incorrect type in argument 2
                                (different base types)
nd_virtio.c:93:59:    expected restricted __virtio32 [usertype] val
nd_virtio.c:93:59:    got unsigned int [unsigned] [usertype] ret

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Pankaj Gupta <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Dan Williams <[email protected]>

make struct mountpoint bear the dentry reference to mountpoint, not struct mount

Using dput_to_list() to shift the contributing reference from ->mnt_mountpoint
to ->mnt_mp->m_dentry.  Dentries are dropped (with dput_to_list()) as soon
as struct mountpoint is destroyed; in cases where we are under namespace_sem
we use the global list, shrinking it in namespace_unlock().  In case of
detaching stuck MNT_LOCKed children at final mntput_no_expire() we use a local
list and shrink it ourselves.  ->mnt_ex_mountpoint crap is gone.

Signed-off-by: Al Viro <[email protected]>

scsi: core: Fix race on creating sense cache

When scsi_init_sense_cache(host) is called concurrently from different
hosts, each code path may find that no cache has been created and
allocate a new one. The lack of locking can lead to potentially
overriding a cache allocated by a different host.

Fix the issue by moving 'mutex_lock(&scsi_sense_cache_mutex)' before
scsi_select_sense_cache().

Fixes: 0a6ac4ee7c21 ("scsi: respect unchecked_isa_dma for blk-mq")
Cc: Stable <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Cc: Ewan D. Milne <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: sd_zbc: Fix compilation warning

kbuild test robot gets the following compilation warning using gcc 7.4
cross compilation for c6x (GCC_VERSION=7.4.0 make.cross ARCH=c6x).

   In file included from include/asm-generic/bug.h:18:0,
                    from arch/c6x/include/asm/bug.h:12,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:12,
                    from include/asm-generic/current.h:5,
                    from ./arch/c6x/include/generated/asm/current.h:1,
                    from include/linux/sched.h:12,
                    from include/linux/blkdev.h:5,
                    from drivers//scsi/sd_zbc.c:11:
   drivers//scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
>> include/linux/kernel.h:62:48: warning: 'zone_blocks' may be used
   uninitialized in this function [-Wmaybe-uninitialized]
    #define __round_mask(x, y) ((__typeof__(x))((y)-1))
                                                   ^
   drivers//scsi/sd_zbc.c:464:6: note: 'zone_blocks' was declared here
     u32 zone_blocks;
         ^~~~~~~~~~~

This is a false-positive report. The variable zone_blocks is always
initialized in sd_zbc_check_zones() before use. It is not initialized
only and only if sd_zbc_check_zones() fails.

Avoid this warning by initializing the zone_blocks variable to 0.

Fixes: 5f832a395859 ("scsi: sd_zbc: Fix sd_zbc_check_zones() error checks")
Cc: Stable <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: libfc: fix null pointer dereference on a null lport

Currently if lport is null then the null lport pointer is dereference when
printing out debug via the FC_LPORT_DB macro. Fix this by using the more
generic FC_LIBFC_DBG debug macro instead that does not use lport.

Addresses-Coverity: ("Dereference after null check")
Fixes: 7414705ea4ae ("libfc: Add runtime debugging with debug_logging module parameter")
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

dax: Fix missed wakeup with PMD faults

RocksDB can hang indefinitely when using a DAX file. This is due to
a bug in the XArray conversion when handling a PMD fault and finding a
PTE entry. We use the wrong index in the hash and end up waiting on
the wrong waitqueue.

There's actually no need to wait; if we find a PTE entry while looking
for a PMD entry, we can return immediately as we know we should fall
back to a PTE fault (which may not conflict with the lock held).

We reuse the XA_RETRY_ENTRY to signal a conflicting entry was found.
This value can never be found in an XArray while holding its lock, so
it does not create an ambiguity.

Cc: <[email protected]>
Link: http://lkml.kernel.org/r/CAPcyv4hwHpX-MkUEqxwdTj7wCCZCN4RV-L4jsnuwLGyL_UEG4A@mail.gmail.com
Fixes: b15cd800682f ("dax: Convert page fault handlers to XArray")
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Tested-by: Dan Williams <[email protected]>
Reported-by: Robert Barror <[email protected]>
Reported-by: Seema Pandit <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Dan Williams <[email protected]>