Git Repo - linux.git/log

IB/mad: Fix use-after-free in ib mad completion handling

We encountered a use-after-free bug when unloading the driver:

[ 3562.116059] BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.117233] Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862
[ 3562.118385]
[ 3562.119519] CPU: 2 PID: 23862 Comm: kworker/u13:2 Tainted: G           OE     5.1.0-for-upstream-dbg-2019-05-19_16-44-30-13 #1
[ 3562.121806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
[ 3562.123075] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
[ 3562.124383] Call Trace:
[ 3562.125640]  dump_stack+0x9a/0xeb
[ 3562.126911]  print_address_description+0xe3/0x2e0
[ 3562.128223]  ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.129545]  __kasan_report+0x15c/0x1df
[ 3562.130866]  ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.132174]  kasan_report+0xe/0x20
[ 3562.133514]  ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
[ 3562.134835]  ? find_mad_agent+0xa00/0xa00 [ib_core]
[ 3562.136158]  ? qlist_free_all+0x51/0xb0
[ 3562.137498]  ? mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib]
[ 3562.138833]  ? quarantine_reduce+0x1fa/0x270
[ 3562.140171]  ? kasan_unpoison_shadow+0x30/0x40
[ 3562.141522]  ib_mad_recv_done+0xdf6/0x3000 [ib_core]
[ 3562.142880]  ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 3562.144277]  ? ib_mad_send_done+0x1810/0x1810 [ib_core]
[ 3562.145649]  ? mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib]
[ 3562.147008]  ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 3562.148380]  ? debug_object_deactivate+0x2b9/0x4a0
[ 3562.149814]  __ib_process_cq+0xe2/0x1d0 [ib_core]
[ 3562.151195]  ib_cq_poll_work+0x45/0xf0 [ib_core]
[ 3562.152577]  process_one_work+0x90c/0x1860
[ 3562.153959]  ? pwq_dec_nr_in_flight+0x320/0x320
[ 3562.155320]  worker_thread+0x87/0xbb0
[ 3562.156687]  ? __kthread_parkme+0xb6/0x180
[ 3562.158058]  ? process_one_work+0x1860/0x1860
[ 3562.159429]  kthread+0x320/0x3e0
[ 3562.161391]  ? kthread_park+0x120/0x120
[ 3562.162744]  ret_from_fork+0x24/0x30
...
[ 3562.187615] Freed by task 31682:
[ 3562.188602]  save_stack+0x19/0x80
[ 3562.189586]  __kasan_slab_free+0x11d/0x160
[ 3562.190571]  kfree+0xf5/0x2f0
[ 3562.191552]  ib_mad_port_close+0x200/0x380 [ib_core]
[ 3562.192538]  ib_mad_remove_device+0xf0/0x230 [ib_core]
[ 3562.193538]  remove_client_context+0xa6/0xe0 [ib_core]
[ 3562.194514]  disable_device+0x14e/0x260 [ib_core]
[ 3562.195488]  __ib_unregister_device+0x79/0x150 [ib_core]
[ 3562.196462]  ib_unregister_device+0x21/0x30 [ib_core]
[ 3562.197439]  mlx4_ib_remove+0x162/0x690 [mlx4_ib]
[ 3562.198408]  mlx4_remove_device+0x204/0x2c0 [mlx4_core]
[ 3562.199381]  mlx4_unregister_interface+0x49/0x1d0 [mlx4_core]
[ 3562.200356]  mlx4_ib_cleanup+0xc/0x1d [mlx4_ib]
[ 3562.201329]  __x64_sys_delete_module+0x2d2/0x400
[ 3562.202288]  do_syscall_64+0x95/0x470
[ 3562.203277]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The problem was that the MAD PD was deallocated before the MAD CQ.
There was completion work pending for the CQ when the PD got deallocated.
When the mad completion handling reached procedure
ib_mad_post_receive_mads(), we got a use-after-free bug in the following
line of code in that procedure:
   sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey;
(the pd pointer in the above line is no longer valid, because the
pd has been deallocated).

We fix this by allocating the PD before the CQ in procedure
ib_mad_port_open(), and deallocating the PD after freeing the CQ
in procedure ib_mad_port_close().

Since the CQ completion work queue is flushed during ib_free_cq(),
no completions will be pending for that CQ when the PD is later
deallocated.

Note that freeing the CQ before deallocating the PD is the practice
in the ULPs.

Fixes: 4be90bc60df4 ("IB/mad: Remove ib_get_dma_mr calls")
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Doug Ledford <[email protected]>

RDMA/restrack: Track driver QP types in resource tracker

The check for QP type different than XRC has excluded driver QP
types from the resource tracker.
As a result, "rdma resource show" user command would not show opened
driver QPs which does not reflect the real state of the system.

Check QP type explicitly instead of assuming enum values/ordering.

Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation")
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Doug Ledford <[email protected]>

IB/mlx5: Fix MR registration flow to use UMR properly

Driver shouldn't allow to use UMR to register a MR when
umr_modify_atomic_disabled is set. Otherwise it will always end up with a
failure in the post send flow which sets the UMR WQE to modify atomic access
right.

Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Guy Levi <[email protected]>
Reviewed-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Doug Ledford <[email protected]>

RDMA/devices: Remove the lock around remove_client_context

Due to the complexity of client->remove() callbacks it is desirable to not
hold any locks while calling them. Remove the last one by tracking only
the highest client ID and running backwards from there over the xarray.

Since the only purpose of that lock was to protect the linked list, we can
drop the lock.

Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Doug Ledford <[email protected]>

RDMA/devices: Do not deadlock during client removal

lockdep reports:

   WARNING: possible circular locking dependency detected

   modprobe/302 is trying to acquire lock:
   0000000007c8919c ((wq_completion)ib_cm){+.+.}, at: flush_workqueue+0xdf/0x990

   but task is already holding lock:
   000000002d3d2ca9 (&device->client_data_rwsem){++++}, at: remove_client_context+0x79/0xd0 [ib_core]

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #2 (&device->client_data_rwsem){++++}:
          down_read+0x3f/0x160
          ib_get_net_dev_by_params+0xd5/0x200 [ib_core]
          cma_ib_req_handler+0x5f6/0x2090 [rdma_cm]
          cm_process_work+0x29/0x110 [ib_cm]
          cm_req_handler+0x10f5/0x1c00 [ib_cm]
          cm_work_handler+0x54c/0x311d [ib_cm]
          process_one_work+0x4aa/0xa30
          worker_thread+0x62/0x5b0
          kthread+0x1ca/0x1f0
          ret_from_fork+0x24/0x30

   -> #1 ((work_completion)(&(&work->work)->work)){+.+.}:
          process_one_work+0x45f/0xa30
          worker_thread+0x62/0x5b0
          kthread+0x1ca/0x1f0
          ret_from_fork+0x24/0x30

   -> #0 ((wq_completion)ib_cm){+.+.}:
          lock_acquire+0xc8/0x1d0
          flush_workqueue+0x102/0x990
          cm_remove_one+0x30e/0x3c0 [ib_cm]
          remove_client_context+0x94/0xd0 [ib_core]
          disable_device+0x10a/0x1f0 [ib_core]
          __ib_unregister_device+0x5a/0xe0 [ib_core]
          ib_unregister_device+0x21/0x30 [ib_core]
          mlx5_ib_stage_ib_reg_cleanup+0x9/0x10 [mlx5_ib]
          __mlx5_ib_remove+0x3d/0x70 [mlx5_ib]
          mlx5_ib_remove+0x12e/0x140 [mlx5_ib]
          mlx5_remove_device+0x144/0x150 [mlx5_core]
          mlx5_unregister_interface+0x3f/0xf0 [mlx5_core]
          mlx5_ib_cleanup+0x10/0x3a [mlx5_ib]
          __x64_sys_delete_module+0x227/0x350
          do_syscall_64+0xc3/0x6a4
          entry_SYSCALL_64_after_hwframe+0x49/0xbe

Which is due to the read side of the client_data_rwsem being obtained
recursively through a work queue flush during cm client removal.

The lock is being held across the remove in remove_client_context() so
that the function is a fence, once it returns the client is removed. This
is required so that the two callers do not proceed with destruction until
the client completes removal.

Instead of using client_data_rwsem use the existing device unregistration
refcount and add a similar client unregistration (client->uses) refcount.

This will fence the two unregistration paths without holding any locks.

Cc: <[email protected]>
Fixes: 921eab1143aa ("RDMA/devices: Re-organize device.c locking")
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Doug Ledford <[email protected]>

IB/core: Add mitigation for Spectre V1

Some processors may mispredict an array bounds check and
speculatively access memory that they should not. With
a user supplied array index we like to play things safe
by masking the value with the array size before it is
used as an index.

Signed-off-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Doug Ledford <[email protected]>

arm64/mm: fix variable 'tag' set but not used

When CONFIG_KASAN_SW_TAGS=n, set_tag() is compiled away. GCC throws a
warning,

mm/kasan/common.c: In function '__kasan_kmalloc':
mm/kasan/common.c:464:5: warning: variable 'tag' set but not used
[-Wunused-but-set-variable]
u8 tag = 0xff;
^~~

Fix it by making __tag_set() a static inline function the same as
arch_kasan_set_tag() in mm/kasan/kasan.h for consistency because there
is a macro in arch/arm64/include/asm/kasan.h,

#define arch_kasan_set_tag(addr, tag) __tag_set(addr, tag)

However, when CONFIG_DEBUG_VIRTUAL=n and CONFIG_SPARSEMEM_VMEMMAP=y,
page_to_virt() will call __tag_set() with incorrect type of a
parameter, so fix that as well. Also, still let page_to_virt() return
"void *" instead of "const void *", so will not need to add a similar
cast in lowmem_page_address().

Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

drm/msm: Annotate intentional switch statement fall throughs

Explicitly mark intentional fall throughs in switch statements to keep
-Wimplicit-fallthrough from complaining.

Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

drm/msm: add support for per-CRTC max_vblank_count on mdp5

The mdp5 drm/kms driver currently does not work on command-mode DSI
panels due to 'vblank wait timed out' errors. This causes a latency
of seconds, or tens of seconds in some cases, before content is shown
on the panel. This hardware does not have the something that we can use
as a frame counter available when running in command mode, so we need to
fall back to using timestamps by setting the max_vblank_count to zero.
This can be done on a per-CRTC basis, so the convert mdp5 to use
drm_crtc_set_max_vblank_count().

This change was tested on a LG Nexus 5 (hammerhead) phone.

Suggested-by: Jeffrey Hugo <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Brian Masney <[email protected]>
Signed-off-by: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

arm64/mm: fix variable 'pud' set but not used

GCC throws a warning,

arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page':
arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used
[-Wunused-but-set-variable]
pud_t pud;
^~~

because pud_table() is a macro and compiled away. Fix it by making it a
static inline function and for pud_sect() as well.

Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: Remove unneeded rcu_read_lock from debug handlers

Remove rcu_read_lock()/rcu_read_unlock() from debug exception
handlers since we are sure those are not preemptible and
interrupts are off.

Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: unwind: Prohibit probing on return_address()

Prohibit probing on return_address() and subroutines which
is called from return_address(), since the it is invoked from
trace_hardirqs_off() which is also kprobe blacklisted.

Reported-by: Naresh Kamboju <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: Lower priority mask for GIC_PRIO_IRQON

On a system with two security states, if SCR_EL3.FIQ is cleared,
non-secure IRQ priorities get shifted to fit the secure view but
priority masks aren't.

On such system, it turns out that GIC_PRIO_IRQON masks the priority of
normal interrupts, which obviously ends up in a hang.

Increase GIC_PRIO_IRQON value (i.e. lower priority) to make sure
interrupts are not blocked by it.

Cc: Oleg Nesterov <[email protected]>
Fixes: bd82d4bd21880b7c ("arm64: Fix incorrect irqflag restore for priority masking")
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
[will: fixed Fixes: tag]
Signed-off-by: Will Deacon <[email protected]>

Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

- sdhci-sprd: Add a missing pm_runtime_put_noidle() to fix deferred
   probe

- dw_mmc: Fix occasional hang after tuning on eMMC

- meson-mx-sdio: Fix misuse of GENMASK macro

- mmc_spi: Fix CRC problems for writes by using BDI_CAP_STABLE_WRITES

* tag 'mmc-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mmc_spi: Enable stable writes
  mmc: meson-mx-sdio: Fix misuse of GENMASK macro
  mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  mmc: host: sdhci-sprd: Fix the missing pm_runtime_put_noidle()

Merge tag 'gpio-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"Three GPIO fixes, all touching the core, so quite important:

   - Fix the request of active low GPIO line events.

   - Don't issue WARN() stuff on NULL descriptors if the GPIOLIB is
     disabled.

   - Preserve the descriptor flags when setting the initial direction on
     lines"

* tag 'gpio-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib: Preserve desc->flags when setting state
  gpio: don't WARN() on NULL descs if gpiolib is disabled
  gpiolib: fix incorrect IRQ requesting of an active-low lineevent

drm/bochs: Use shadow buffer for bochs framebuffer console

The bochs driver (and virtual hardware) requires buffer objects to
reside in video ram to display them to the screen. So it can not
display the framebuffer console because the respective buffer object
is permanently pinned in system memory.

Using a shadow buffer for the console solves this problem. The console
emulation will pin the buffer object only during updates from the shadow
buffer. Otherwise, the bochs driver can freely relocated the buffer
between system memory and video ram.

v2:
* select shadow FB via struct drm_mode_config.prefer_shadow_fbdev

Signed-off-by: Thomas Zimmermann <[email protected]>
Acked-by: Noralf Trønnes <[email protected]>
Link: https://patchwork.freedesktop.org/patch/315833/
Signed-off-by: Gerd Hoffmann <[email protected]>

drm/fb-helper: Instanciate shadow FB if configured in device's mode_config

Generic framebuffer emulation uses a shadow buffer for framebuffers with
dirty() function. If drivers want to use the shadow FB without such a
function, they can now set prefer_shadow or prefer_shadow_fbdev in their
mode_config structures. The former flag is exported to userspace, the
latter flag is fbdev-only.

v3:
* only schedule dirty worker if fbdev uses shadow fb
* test shadow fb settings with boolean operators
* use bool for struct drm_mode_config.prefer_shadow_fbdev
* fix documentation comments

Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Noralf Trønnes <[email protected]>
Tested-by: Noralf Trønnes <[email protected]>
Link: https://patchwork.freedesktop.org/patch/315834/
Signed-off-by: Gerd Hoffmann <[email protected]>

drm/fb-helper: Map DRM client buffer only when required

This patch changes DRM clients to not map the buffer by default. The
buffer, like any buffer object, should be mapped and unmapped when
needed.

An unmapped buffer object can be evicted to system memory and does
not consume video ram until displayed. This allows to use generic fbdev
emulation with drivers for low-memory devices, such as ast and mgag200.

This change affects the generic framebuffer console. HW-based consoles
map their console buffer once and keep it mapped. Userspace can mmap this
buffer into its address space. The shadow-buffered framebuffer console
only needs the buffer object to be mapped during updates. While not being
updated from the shadow buffer, the buffer object can remain unmapped.
Userspace will always mmap the shadow buffer.

v2:
* change DRM client to not map buffer by default
* manually map client buffer for fbdev with HW framebuffer

Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Noralf Trønnes <[email protected]>
Link: https://patchwork.freedesktop.org/patch/315830/
Signed-off-by: Gerd Hoffmann <[email protected]>

drm/client: Support unmapping of DRM client buffers

DRM clients, such as the fbdev emulation, have their buffer objects
mapped by default. Mapping a buffer implicitly prevents its relocation.
Hence, the buffer may permanently consume video memory while it's
allocated. This is a problem for drivers of low-memory devices, such as
ast, mgag200 or older framebuffer hardware, which will then not have
enough memory to display other content (e.g., X11).

This patch introduces drm_client_buffer_vmap() and _vunmap(). Internal
DRM clients can use these functions to unmap and remap buffer objects
as needed.

There's no reference counting for vmap operations. Callers are expected
to either keep buffers mapped (as it is now), or call vmap and vunmap
in pairs around code that accesses the mapped memory.

v2:
* remove several duplicated NULL-pointer checks
v3:
* style and typo fixes

Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Noralf Trønnes <[email protected]>
Link: https://patchwork.freedesktop.org/patch/315831/
Signed-off-by: Gerd Hoffmann <[email protected]>

i2c: iproc: Fix i2c master read more than 63 bytes

Use SMBUS_MASTER_DATA_READ.MASTER_RD_STATUS bit to check for RX
FIFO empty condition because SMBUS_MASTER_FIFO_CONTROL.MASTER_RX_PKT_COUNT
is not updated for read >= 64 bytes. This fixes the issue when trying to
read from the I2C slave more than 63 bytes.

Fixes: c24b8d574b7c ("i2c: iproc: Extend I2C read up to 255 bytes")
Cc: [email protected]
Signed-off-by: Rayagonda Kokatanur <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

parisc: Add archclean Makefile target

Apparently we don't have an archclean target in our
arch/parisc/Makefile, so files in there never get cleaned out by make
mrproper. This, in turn means that the sizes.h file in
arch/parisc/boot/compressed never gets removed and worse, when you
transition to an O=build/parisc[64] build model it overrides the
generated file. The upshot being my bzImage was building with a SZ_end
that was too small.

I fixed it by making mrproper clean everything.

Signed-off-by: James Bottomley <[email protected]>
Cc: [email protected] # v4.20+
Signed-off-by: Helge Deller <[email protected]>

parisc: Strip debug info from kernel before creating compressed vmlinuz

Same as on x86-64, strip the .comment, .note and debug sections from the
Linux kernel before creating the compressed image for the boot loader.

Reported-by: James Bottomley <[email protected]>
Reported-by: Sven Schnelle <[email protected]>
Cc: [email protected] # v4.20+
Signed-off-by: Helge Deller <[email protected]>

parisc: Fix build of compressed kernel even with debug enabled

With debug info enabled (CONFIG_DEBUG_INFO=y) the resulting vmlinux may get
that huge that we need to increase the start addresss for the decompression
text section otherwise one will face a linker error.

Reported-by: Sven Schnelle <[email protected]>
Tested-by: Sven Schnelle <[email protected]>
Cc: [email protected] # v4.14+
Signed-off-by: Helge Deller <[email protected]>

Merge tag 'at24-v5.3-rc3-fixes-for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current

at24 fixes for v5.3-rc3

- make spd eeproms world-readable again

drm/i915: Only recover active engines

If we issue a reset to a currently idle engine, leave it idle
afterwards. This is useful to excise a linkage between reset and the
shrinker. When waking the engine, we need to pin the default context
image which we use for overwriting a guilty context -- if the engine is
idle we do not need this pinned image! However, this pinning means that
waking the engine acquires the FS_RECLAIM, and so may trigger the
shrinker. The shrinker itself may need to wait upon the GPU to unbind
and object and so may require services of reset; ergo we should avoid
the engine wake up path.

The danger in skipping the recovery for idle engines is that we leave the
engine with no context defined, which may interfere with the operation of
the power context on some older platforms. In practice, we should only
be resetting an active GPU but it something to look out for on Ironlake
(if memory serves).

Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 18398904ca9e3ddd180e2ecd45886e146b1d9d5b)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Add a wakeref getter for iff the wakeref is already active

For use in the next patch, we want to acquire a wakeref without having
to wake the device up -- i.e. only acquire the engine wakeref if the
engine is already active.

Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit de5147b8ce6d51f634661d7c531385371485cec6)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Lift intel_engines_resume() to callers

Since the reset path wants to recover the engines itself, it only wants
to reinitialise the hardware using i915_gem_init_hw(). Pull the call to
intel_engines_resume() to the module init/resume path so we can avoid it
during reset.

Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 092be382a2602067766f190a113514d469162456)
Signed-off-by: Jani Nikula <[email protected]>

xen/swiotlb: remember having called xen_create_contiguous_region()

Instead of always calling xen_destroy_contiguous_region() in case the
memory is DMA-able for the used device, do so only in case it has been
made DMA-able via xen_create_contiguous_region() before.

This will avoid a lot of xen_destroy_contiguous_region() calls for
64-bit capable devices.

As the memory in question is owned by swiotlb-xen the PG_owner_priv_1
flag of the first allocated page can be used for remembering.

Signed-off-by: Juergen Gross <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

xen/swiotlb: simplify range_straddles_page_boundary()

range_straddles_page_boundary() is open coding several macros from
include/xen/page.h. Use those instead. Additionally there is no need
to have check_pages_physically_contiguous() as a separate function as
it is used only once, so merge it into range_straddles_page_boundary().

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()

The condition in xen_swiotlb_free_coherent() for deciding whether to
call xen_destroy_contiguous_region() is wrong: in case the region to
be freed is not contiguous calling xen_destroy_contiguous_region() is
the wrong thing to do: it would result in inconsistent mappings of
multiple PFNs to the same MFN. This will lead to various strange
crashes or data corruption.

Instead of calling xen_destroy_contiguous_region() in that case a
warning should be issued as that situation should never occur.

Cc: [email protected]
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Jan Beulich <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

selinux: fix memory leak in policydb_init()

Since roles_init() adds some entries to the role hash table, we need to
destroy also its keys/values on error, otherwise we get a memory leak in
the error path.

Cc: <[email protected]>
Reported-by: [email protected]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Paul Moore <[email protected]>

drm/msm: Use the correct dma_sync calls in msm_gem

[subject was: drm/msm: shake fist angrily at dma-mapping]

So, using dma_sync_* for our cache needs works out w/ dma iommu ops, but
it falls appart with dma direct ops.  The problem is that, depending on
display generation, we can have either set of dma ops (mdp4 and dpu have
iommu wired to mdss node, which maps to toplevel drm device, but mdp5
has iommu wired up to the mdp sub-node within mdss).

Fixes this splat on mdp5 devices:

   Unable to handle kernel paging request at virtual address ffffffff80000000
   Mem abort info:
     ESR = 0x96000144
     Exception class = DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
   Data abort info:
     ISV = 0, ISS = 0x00000144
     CM = 1, WnR = 1
   swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000810e4000
   [ffffffff80000000] pgd=0000000000000000
   Internal error: Oops: 96000144 [#1] SMP
   Modules linked in: btqcomsmd btqca bluetooth cfg80211 ecdh_generic ecc rfkill libarc4 panel_simple msm wcnss_ctrl qrtr_smd drm_kms_helper venus_enc venus_dec videobuf2_dma_sg videobuf2_memops drm venus_core ipv6 qrtr qcom_wcnss_pil v4l2_mem2mem qcom_sysmon videobuf2_v4l2 qmi_helpers videobuf2_common crct10dif_ce mdt_loader qcom_common videodev qcom_glink_smem remoteproc bmc150_accel_i2c bmc150_magn_i2c bmc150_accel_core bmc150_magn snd_soc_lpass_apq8016 snd_soc_msm8916_analog mms114 mc nf_defrag_ipv6 snd_soc_lpass_cpu snd_soc_apq8016_sbc industrialio_triggered_buffer kfifo_buf snd_soc_lpass_platform snd_soc_msm8916_digital drm_panel_orientation_quirks
   CPU: 2 PID: 33 Comm: kworker/2:1 Not tainted 5.3.0-rc2 #1
   Hardware name: Samsung Galaxy A5U (EUR) (DT)
   Workqueue: events deferred_probe_work_func
   pstate: 80000005 (Nzcv daif -PAN -UAO)
   pc : __clean_dcache_area_poc+0x20/0x38
   lr : arch_sync_dma_for_device+0x28/0x30
   sp : ffff0000115736a0
   x29: ffff0000115736a0 x28: 0000000000000001
   x27: ffff800074830800 x26: ffff000011478000
   x25: 0000000000000000 x24: 0000000000000001
   x23: ffff000011478a98 x22: ffff800009fd1c10
   x21: 0000000000000001 x20: ffff800075ad0a00
   x19: 0000000000000000 x18: ffff0000112b2000
   x17: 0000000000000000 x16: 0000000000000000
   x15: 00000000fffffff0 x14: ffff000011455d70
   x13: 0000000000000000 x12: 0000000000000028
   x11: 0000000000000001 x10: ffff00001106c000
   x9 : ffff7e0001d6b380 x8 : 0000000000001000
   x7 : ffff7e0001d6b380 x6 : ffff7e0001d6b382
   x5 : 0000000000000000 x4 : 0000000000001000
   x3 : 000000000000003f x2 : 0000000000000040
   x1 : ffffffff80001000 x0 : ffffffff80000000
   Call trace:
    __clean_dcache_area_poc+0x20/0x38
    dma_direct_sync_sg_for_device+0xb8/0xe8
    get_pages+0x22c/0x250 [msm]
    msm_gem_get_and_pin_iova+0xdc/0x168 [msm]
    ...

Fixes the combination of two patches:

Fixes: 0036bc73ccbe (drm/msm: stop abusing dma_map/unmap for cache)
Fixes: 449fa54d6815 (dma-direct: correct the physical addr in dma_direct_sync_sg_for_cpu/device)
Tested-by: Stephan Gerhold <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
[seanpaul changed subject to something more desriptive]
Signed-off-by: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull mount_capable() fix from Al Viro.

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Unbreak mount_capable()

Bluetooth: hci_uart: check for missing tty operations

Certain ttys operations (pty_unix98_ops) lack tiocmget() and tiocmset()
functions which are called by the certain HCI UART protocols (hci_ath,
hci_bcm, hci_intel, hci_mrvl, hci_qca) via hci_uart_set_flow_control()
or directly. This leads to an execution at NULL and can be triggered by
an unprivileged user. Fix this by adding a helper function and a check
for the missing tty operations in the protocols code.

This fixes CVE-2019-10207. The Fixes: lines list commits where calls to
tiocm[gs]et() or hci_uart_set_flow_control() were added to the HCI UART
protocols.

Link: https://syzkaller.appspot.com/bug?id=1b42faa2848963564a5b1b7f8c837ea7b55ffa50
Reported-by: [email protected]
Cc: [email protected] # v2.6.36+
Fixes: b3190df62861 ("Bluetooth: Support for Atheros AR300x serial chip")
Fixes: 118612fb9165 ("Bluetooth: hci_bcm: Add suspend/resume PM functions")
Fixes: ff2895592f0f ("Bluetooth: hci_intel: Add Intel baudrate configuration support")
Fixes: 162f812f23ba ("Bluetooth: hci_uart: Add Marvell support")
Fixes: fa9ad876b8e0 ("Bluetooth: hci_qca: Add support for Qualcomm Bluetooth chip wcn3990")
Signed-off-by: Vladis Dronov <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Reviewed-by: Yu-Chen, Cho <[email protected]>
Tested-by: Yu-Chen, Cho <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: slub: Fix slab walking for init_on_free

To properly clear the slab on free with slab_want_init_on_free, we walk
the list of free objects using get_freepointer/set_freepointer.

The value we get from get_freepointer may not be valid.  This isn't an
issue since an actual value will get written later but this means
there's a chance of triggering a bug if we use this value with
set_freepointer:

  kernel BUG at mm/slub.c:306!
  invalid opcode: 0000 [#1] PREEMPT PTI
  CPU: 0 PID: 0 Comm: swapper Not tainted 5.2.0-05754-g6471384a #4
  RIP: 0010:kfree+0x58a/0x5c0
  Code: 48 83 05 78 37 51 02 01 0f 0b 48 83 05 7e 37 51 02 01 48 83 05 7e 37 51 02 01 48 83 05 7e 37 51 02 01 48 83 05 d6 37 51 02 01 <0f> 0b 48 83 05 d4 37 51 02 01 48 83 05 d4 37 51 02 01 48 83 05 d4
  RSP: 0000:ffffffff82603d90 EFLAGS: 00010002
  RAX: ffff8c3976c04320 RBX: ffff8c3976c04300 RCX: 0000000000000000
  RDX: ffff8c3976c04300 RSI: 0000000000000000 RDI: ffff8c3976c04320
  RBP: ffffffff82603db8 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff8c3976c04320 R11: ffffffff8289e1e0 R12: ffffd52cc8db0100
  R13: ffff8c3976c01a00 R14: ffffffff810f10d4 R15: ffff8c3976c04300
  FS:  0000000000000000(0000) GS:ffffffff8266b000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff8c397ffff000 CR3: 0000000125020000 CR4: 00000000000406b0
  Call Trace:
   apply_wqattrs_prepare+0x154/0x280
   apply_workqueue_attrs_locked+0x4e/0xe0
   apply_workqueue_attrs+0x36/0x60
   alloc_workqueue+0x25a/0x6d0
   workqueue_init_early+0x246/0x348
   start_kernel+0x3c7/0x7ec
   x86_64_start_reservations+0x40/0x49
   x86_64_start_kernel+0xda/0xe4
   secondary_startup_64+0xb6/0xc0
  Modules linked in:
  ---[ end trace f67eb9af4d8d492b ]---

Fix this by ensuring the value we set with set_freepointer is either NULL
or another value in the chain.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Laura Abbott <[email protected]>
Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

riscv: defconfig: align RV64 defconfig to the output of "make savedefconfig"

Align the RV64 defconfig to the output of "make savedefconfig" to
avoid unnecessary deltas for future defconfig patches. This patch
should have no runtime functional impact.

Signed-off-by: Paul Walmsley <[email protected]>
Reviewed-by: Bin Meng <[email protected]>

riscv: dts: fu540-c000: drop "timebase-frequency"

On FU540-based systems, the "timebase-frequency" (RTCCLK) is sourced
from an external crystal located on the PCB.  Thus the
timebase-frequency DT property should be defined by the board that
uses the SoC, not the SoC itself.  Drop the superfluous
timebase-frequency property from the SoC DT data.  (It's already
present in the board DT data.)

Signed-off-by: Paul Walmsley <[email protected]>
Reviewed-by: Bin Meng <[email protected]>

riscv: Fix perf record without libelf support

This patch fix following perf record error by linking vdso.so with
build id.

perf.data perf.data.old
[ perf record: Woken up 1 times to write data ]
free(): double free detected in tcache 2
Aborted

perf record use filename__read_build_id(util/symbol-minimal.c) to get
build id when libelf is not supported. When vdso.so is linked without
build id, the section size of PT_NOTE will be zero, buf size will
realloc to zero and cause memory corruption.

Signed-off-by: Mao Han <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Signed-off-by: Paul Walmsley <[email protected]>

drm/vgem: fix cache synchronization on arm/arm64

drm_cflush_pages() is no-op on arm/arm64. But instead we can use
dma_sync API.

Fixes failures w/ vgem_test.

Acked-by: Daniel Vetter <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Signed-off-by: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'trace-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"Two minor fixes:

   - Fix trace event header include guards, as several did not match the
     #define to the #ifdef

   - Remove a redundant test to ftrace_graph_notrace_addr() that was
     accidentally added"

* tag 'trace-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  fgraph: Remove redundant ftrace_graph_notrace_addr() test
  tracing: Fix header include guards in trace event headers

arm64/efi: fix variable 'si' set but not used

GCC throws out this warning on arm64.

drivers/firmware/efi/libstub/arm-stub.c: In function 'efi_entry':
drivers/firmware/efi/libstub/arm-stub.c:132:22: warning: variable 'si'
set but not used [-Wunused-but-set-variable]

Fix it by making free_screen_info() a static inline function.

Acked-by: Will Deacon <[email protected]>
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

Merge tag 'for-linus-5.3-2' of git://github.com/cminyard/linux-ipmi

Pull IPMI fix from Corey Minyard:
"One necessary fix for an uninitialized variable in the new IPMB driver.

Nothing else has come in besides things that need to wait until later"

* tag 'for-linus-5.3-2' of git://github.com/cminyard/linux-ipmi:
Fix uninitialized variable in ipmb_dev_int.c

arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}

If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have
their architecturally maximum values, which defeats the use of
FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous
machines.

Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively
saturate at zero.

Fixes: 3c739b571084 ("arm64: Keep track of CPU feature registers")
Cc: <[email protected]> # 4.4.x-
Reviewed-by: Suzuki K Poulose <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

arm64: vdso: Fix Makefile regression

Using an old .config in combination with "make oldconfig" can cause
an incorrect detection of the compat compiler:

$ grep CROSS_COMPILE_COMPAT .config
CONFIG_CROSS_COMPILE_COMPAT_VDSO=""

$ make oldconfig && make
arch/arm64/Makefile:58: gcc not found, check CROSS_COMPILE_COMPAT.
Stop.

Accordingly to the section 7.2 of the GNU Make manual "Syntax of
Conditionals", "When the value results from complex expansions of
variables and functions, expansions you would consider empty may
actually contain whitespace characters and thus are not seen as
empty. However, you can use the strip function to avoid interpreting
whitespace as a non-empty value."

Fix the issue adding strip to the CROSS_COMPILE_COMPAT string
evaluation.

Reported-by: Matteo Croce <[email protected]>
Tested-by: Matteo Croce <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>

gfs2: Inode dirtying fix

With the recent iomap write page reclaim deadlock fix, it turns out that the
GLF_DIRTY flag isn't always set when it needs to be anymore: previously, this
happened as a side effect of always adding the inode buffer head to the current
transaction with gfs2_trans_add_meta, but this isn't happening consistently
anymore. Fix by removing an additional unnecessary gfs2_trans_add_meta call
and by setting the GLF_DIRTY flag in gfs2_iomap_end.

(The GLF_DIRTY flag causes inode_go_sync to flush the transaction log when
syncing out the glock of that inode. When the flag isn't set, inode_go_sync
will skip inodes, including ones with an i_state of I_DIRTY_PAGES, which will
lead to cluster incoherency.)

In addition, in gfs2_iomap_page_done, if the metadata has changed, mark the
inode as I_DIRTY_DATASYNC to have the inode added to the current transaction:
we don't expect metadata to change here, but let's err on the safe side.

Fixes: d0a22a4b03b8 ("gfs2: Fix iomap write page reclaim deadlock");
Signed-off-by: Andreas Gruenbacher <[email protected]>

Unbreak mount_capable()

In "consolidate the capability checks in sget_{fc,userns}())" the
wrong argument had been passed to mount_capable() by sget_fc().
That mistake had been further obscured later, when switching
mount_capable() to fs_context has moved the calculation of
bogus argument from sget_fc() to mount_capable() itself. It
should've been fc->user_ns all along.

Screwed-up-by: Al Viro <[email protected]>
Reported-by: Christian Brauner <[email protected]>
Tested-by: Christian Brauner <[email protected]>
Reviewed-by: David Howells <[email protected]>
Signed-off-by: Al Viro <[email protected]>

MAINTAINERS: floppy: take over maintainership

I would like to maintain the floppy driver. After the recent fixes,
I think I know the code pretty well. Nowadays I've got 2 physical 3.5"
readers to test all the changes.

Signed-off-by: Denis Efremov <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

kbuild: Check for unknown options with cc-option usage in Kconfig and clang

If the particular version of clang a user has doesn't enable
-Werror=unknown-warning-option by default, even though it is the
default[1], then make sure to pass the option to the Kconfig cc-option
command so that testing options from Kconfig files works properly.
Otherwise, depending on the default values setup in the clang toolchain
we will silently assume options such as -Wmaybe-uninitialized are
supported by clang, when they really aren't.

A compilation issue only started happening for me once commit
589834b3a009 ("kbuild: Add -Werror=unknown-warning-option to
CLANG_FLAGS") was applied on top of commit b303c6df80c9 ("kbuild:
compute false-positive -Wmaybe-uninitialized cases in Kconfig"). This
leads kbuild to try and test for the existence of the
-Wmaybe-uninitialized flag with the cc-option command in
scripts/Kconfig.include, and it doesn't see an error returned from the
option test so it sets the config value to Y. Then the Makefile tries to
pass the unknown option on the command line and
-Werror=unknown-warning-option catches the invalid option and breaks the
build. Before commit 589834b3a009 ("kbuild: Add
-Werror=unknown-warning-option to CLANG_FLAGS") the build works fine,
but any cc-option test of a warning option in Kconfig files silently
evaluates to true, even if the warning option flag isn't supported on
clang.

Note: This doesn't change cc-option usages in Makefiles because those
use a different rule that includes KBUILD_CFLAGS by default (see the
__cc-option command in scripts/Kbuild.incluide). The KBUILD_CFLAGS
variable already has the -Werror=unknown-warning-option flag set. Thanks
to Doug for pointing out the different rule.

[1] https://clang.llvm.org/docs/DiagnosticsReference.html#wunknown-warning-option
Cc: Peter Smith <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Douglas Anderson <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

lib/raid6: fix unnecessary rebuild of vpermxor*.c

The following four files are every time rebuilt:

  UNROLL  lib/raid6/vpermxor1.c
  UNROLL  lib/raid6/vpermxor2.c
  UNROLL  lib/raid6/vpermxor4.c
  UNROLL  lib/raid6/vpermxor8.c

Fix the suffixes in the targets.

Fixes: 72ad21075df8 ("lib/raid6: refactor unroll rules with pattern rules")
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: modpost: do not parse unnecessary rules for vmlinux modpost

Since commit ff9b45c55b26 ("kbuild: modpost: read modules.order instead
of $(MODVERDIR)/*.mod"), 'make vmlinux' emits a warning, like this:

$ make defconfig vmlinux
  [ snip ]
  LD      vmlinux.o
cat: modules.order: No such file or directory
  MODPOST vmlinux.o
  MODINFO modules.builtin.modinfo
  KSYM    .tmp_kallsyms1.o
  KSYM    .tmp_kallsyms2.o
  LD      vmlinux
  SORTEX  vmlinux
  SYSMAP  System.map

When building only vmlinux, KBUILD_MODULES is not set. Hence, the
modules.order is not generated. For the vmlinux modpost, it is not
necessary at all.

Separate scripts/Makefile.modpost for the vmlinux/modules stages.
This works more efficiently because the vmlinux modpost does not
need to include .*.cmd files.

Fixes: ff9b45c55b26 ("kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod")
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: modpost: remove unnecessary dependency for __modpost

__modpost is a phony target. The dependency on FORCE is pointless.
All the objects have been built in the previous stage, so the
dependency on the objects are not necessary either.

Count the number of modules in a more straightforward way.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules

KBUILD_EXTRA_SYMBOLS makes sense only when building external modules.
Moreover, the modpost sets 'external_module' if the -e option is given.

I replaced $(patsubst %, -e %,...) with simpler $(addprefix -e,...)
while I was here.

Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: modpost: include .*.cmd files only when targets exist

If a build rule fails, the .DELETE_ON_ERROR special target removes the
target, but does nothing for the .*.cmd file, which might be corrupted.
So, .*.cmd files should be included only when the corresponding targets
exist.

Commit 392885ee82d3 ("kbuild: let fixdep directly write to .*.cmd
files") missed to fix up this file.

Fixes: 392885ee82d3 ("kbuild: let fixdep directly write to .*.cmd")
Cc: <[email protected]> # v5.0+
Signed-off-by: Masahiro Yamada <[email protected]>

drm/i810: Use CONFIG_PREEMPTION

CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by
CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same
functionality which today depends on CONFIG_PREEMPT.

Change the Kconfig dependency of i810 to !CONFIG_PREEMPTION so the driver
is not accidentally built on a RT kernel.

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Maarten Lankhorst <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

Merge tag 'v5.3-rc2' into drm-misc-fixes

Linux 5.3-rc2

Required for a CONFIG_PREEMPTION fix to i810. :)

Signed-off-by: Maarten Lankhorst <[email protected]>

nbd: replace kill_bdev() with __invalidate_device() again

Commit abbbdf12497d ("replace kill_bdev() with __invalidate_device()")
once did this, but 29eaadc03649 ("nbd: stop using the bdev everywhere")
resurrected kill_bdev() and it has been there since then. So buffer_head
mappings still get killed on a server disconnection, and we can still
hit the BUG_ON on a filesystem on the top of the nbd device.

  EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
  block nbd0: Receive control failed (result -32)
  block nbd0: shutting down sockets
  print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
  EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
  print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
  EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
  EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
  ------------[ cut here ]------------
  kernel BUG at fs/buffer.c:3057!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
  Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
  RIP: 0010:submit_bh_wbc+0x18b/0x190
  ...
  Call Trace:
   jbd2_write_superblock+0xf1/0x230 [jbd2]
   ? account_entity_enqueue+0xc5/0xf0
   jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
   jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
   ? __switch_to_asm+0x40/0x70
   ...
   ? lock_timer_base+0x67/0x80
   kjournald2+0x121/0x360 [jbd2]
   ? remove_wait_queue+0x60/0x60
   kthread+0xf8/0x130
   ? commit_timeout+0x10/0x10 [jbd2]
   ? kthread_bind+0x10/0x10
   ret_from_fork+0x35/0x40

With __invalidate_device(), I no longer hit the BUG_ON with sync or
unmount on the disconnected device.

Fixes: 29eaadc03649 ("nbd: stop using the bdev everywhere")
Cc: [email protected]
Cc: Ratna Manoj Bolla <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Munehisa Kamata <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

ata: libahci: do not complain in case of deferred probe

Retrieving PHYs can defer the probe, do not spawn an error when
-EPROBE_DEFER is returned, it is normal behavior.

Fixes: b1a9edbda040 ("ata: libahci: allow to use multiple PHYs")
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

io_uring: fix KASAN use after free in io_sq_wq_submit_work

[root@localhost ~]# ./liburing/test/link

QEMU Standard PC report that:

[   29.379892] CPU: 0 PID: 84 Comm: kworker/u2:2 Not tainted 5.3.0-rc2-00051-g4010b622f1d2-dirty #86
[   29.379902] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   29.379913] Workqueue: io_ring-wq io_sq_wq_submit_work
[   29.379929] Call Trace:
[   29.379953]  dump_stack+0xa9/0x10e
[   29.379970]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.379986]  print_address_description.cold.6+0x9/0x317
[   29.379999]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.380010]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.380026]  __kasan_report.cold.7+0x1a/0x34
[   29.380044]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.380061]  kasan_report+0xe/0x12
[   29.380076]  io_sq_wq_submit_work+0xbf4/0xe90
[   29.380104]  ? io_sq_thread+0xaf0/0xaf0
[   29.380152]  process_one_work+0xb59/0x19e0
[   29.380184]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
[   29.380221]  worker_thread+0x8c/0xf40
[   29.380248]  ? __kthread_parkme+0xab/0x110
[   29.380265]  ? process_one_work+0x19e0/0x19e0
[   29.380278]  kthread+0x30b/0x3d0
[   29.380292]  ? kthread_create_on_node+0xe0/0xe0
[   29.380311]  ret_from_fork+0x3a/0x50

[   29.380635] Allocated by task 209:
[   29.381255]  save_stack+0x19/0x80
[   29.381268]  __kasan_kmalloc.constprop.6+0xc1/0xd0
[   29.381279]  kmem_cache_alloc+0xc0/0x240
[   29.381289]  io_submit_sqe+0x11bc/0x1c70
[   29.381300]  io_ring_submit+0x174/0x3c0
[   29.381311]  __x64_sys_io_uring_enter+0x601/0x780
[   29.381322]  do_syscall_64+0x9f/0x4d0
[   29.381336]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[   29.381633] Freed by task 84:
[   29.382186]  save_stack+0x19/0x80
[   29.382198]  __kasan_slab_free+0x11d/0x160
[   29.382210]  kmem_cache_free+0x8c/0x2f0
[   29.382220]  io_put_req+0x22/0x30
[   29.382230]  io_sq_wq_submit_work+0x28b/0xe90
[   29.382241]  process_one_work+0xb59/0x19e0
[   29.382251]  worker_thread+0x8c/0xf40
[   29.382262]  kthread+0x30b/0x3d0
[   29.382272]  ret_from_fork+0x3a/0x50

[   29.382569] The buggy address belongs to the object at ffff888067172140
                which belongs to the cache io_kiocb of size 224
[   29.384692] The buggy address is located 120 bytes inside of
                224-byte region [ffff888067172140, ffff888067172220)
[   29.386723] The buggy address belongs to the page:
[   29.387575] page:ffffea00019c5c80 refcount:1 mapcount:0 mapping:ffff88806ace5180 index:0x0
[   29.387587] flags: 0x100000000000200(slab)
[   29.387603] raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806ace5180
[   29.387617] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
[   29.387624] page dumped because: kasan: bad access detected

[   29.387920] Memory state around the buggy address:
[   29.388771]  ffff888067172080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[   29.390062]  ffff888067172100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[   29.391325] >ffff888067172180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   29.392578]                                         ^
[   29.393480]  ffff888067172200: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
[   29.394744]  ffff888067172280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   29.396003] ==================================================================
[   29.397260] Disabling lock debugging due to kernel taint

io_sq_wq_submit_work free and read req again.

Cc: Zhengyuan Liu <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: f7b76ac9d17e ("io_uring: fix counter inc/dec mismatch in async_list")
Signed-off-by: Jackie Liu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

parisc: fix race condition in patching code

Assume the following ftrace code sequence that was patched in earlier by
ftrace_make_call():

PAGE A:
ffc: addr of ftrace_caller()
PAGE B:
000: 0x6fc10080 /* stw,ma r1,40(sp) */
004: 0x48213fd1 /* ldw -18(r1),r1 */
008: 0xe820c002 /* bv,n r0(r1) */
00c: 0xe83f1fdf /* b,l,n .-c,r1 */

When a Code sequences that is to be patched spans a page break, we might
have already cleared the part on the PAGE A. If an interrupt is coming in
during the remap of the fixed mapping to PAGE B, it might execute the
patched function with only parts of the FTRACE code cleared. To prevent
this, clear the jump to our mini trampoline first, and clear the remaining
parts after this. This might also happen when patch_text() patches a
function that it calls during remap.

Signed-off-by: Sven Schnelle <[email protected]>
Cc: <[email protected]> # 5.2+
Signed-off-by: Helge Deller <[email protected]>

parisc: rename default_defconfig to defconfig

'default_defconfig' is an awkward name since 'defconfig' is the default.
Let's simply say 'defconfig' like other architectures. You can drop the
KBUILD_DEFCONFIG define by following the standard naming.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: Fix fall-through warnings in fpudispatch.c

In fpudispatch.c we see a lot of fall-through warnings, but for this file we
prefer to not mark the switches and instead keep it in it's original state as
it's copied from HP-UX.

Fixes: a035d552a93b ("Makefile: Globally enable fall-through warning")
Signed-off-by: Helge Deller <[email protected]>

parisc: Mark expected switch fall-throughs in fault.c

Fix a fall-through warning in fault.c.

Fixes: a035d552a93b ("Makefile: Globally enable fall-through warning")
Signed-off-by: Helge Deller <[email protected]>

powerpc/kasan: fix early boot failure on PPC32

Due to commit 4a6d8cf90017 ("powerpc/mm: don't use pte_alloc_kernel()
until slab is available on PPC32"), pte_alloc_kernel() cannot be used
during early KASAN init.

Fix it by using memblock_alloc() instead.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Cc: [email protected] # v5.2+
Reported-by: Erhard F. <[email protected]>
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/da89670093651437f27d2975224712e0a130b055.1564552796.git.christophe.leroy@c-s.fr

drivers/macintosh/smu.c: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: powerpc):

  drivers/macintosh/smu.c: In function 'smu_queue_i2c':
  drivers/macintosh/smu.c:854:21: warning: this statement may fall through [-Wimplicit-fallthrough=]
     cmd->info.devaddr &= 0xfe;
     ~~~~~~~~~~~~~~~~~~^~~~~~~
  drivers/macintosh/smu.c:855:2: note: here
    case SMU_I2C_TRANSFER_STDSUB:
    ^~~~

Fixes: 0365ba7fb1fa ("[PATCH] ppc64: SMU driver update & i2c support")
Signed-off-by: Stephen Rothwell <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval

VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: correct Navi10 VCN powergate control (v2)

No VCN DPM bit check as that's different from VCN PG. Also
no extra check for possible double enablement/disablement
as that's already done by VCN.

v2: check return value of smu_feature_set_enabled

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: support VCN powergate status retrieval for SW SMU

Commonly used for VCN powergate status retrieval for SW SMU.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: support VCN powergate status retrieval on Raven

Enable VCN powergate status report on Raven.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: add new sensor type for VCN powergate status

VCN is widely used in new ASICs and different from tranditional
UVD and VCE.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix a potential information leaking bug

Coccinelle reports a path that the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simplier than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.

Signed-off-by: Wang Xiayang <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep

We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

xen: avoid link error on ARM

Building the privcmd code as a loadable module on ARM, we get
a link error due to the private cache management functions:

ERROR: "__sync_icache_dcache" [drivers/xen/xen-privcmd.ko] undefined!

Move the code into a new that is always built in when Xen is enabled,
as suggested by Juergen Gross and Boris Ostrovsky.

Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

xen/gntdev.c: Replace vm_map_pages() with vm_map_pages_zero()

'commit df9bde015a72 ("xen/gntdev.c: convert to use vm_map_pages()")'
breaks gntdev driver. If vma->vm_pgoff > 0, vm_map_pages()
will:
- use map->pages starting at vma->vm_pgoff instead of 0
- verify map->count against vma_pages()+vma->vm_pgoff instead of just
   vma_pages().

In practice, this breaks using a single gntdev FD for mapping multiple
grants.

relevant strace output:
[pid   857] ioctl(7, IOCTL_GNTDEV_MAP_GRANT_REF, 0x7ffd3407b6d0) = 0
[pid   857] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0) =
0x777f1211b000
[pid   857] ioctl(7, IOCTL_GNTDEV_SET_UNMAP_NOTIFY, 0x7ffd3407b710) = 0
[pid   857] ioctl(7, IOCTL_GNTDEV_MAP_GRANT_REF, 0x7ffd3407b6d0) = 0
[pid   857] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 7,
0x1000) = -1 ENXIO (No such device or address)

details here:
https://github.com/QubesOS/qubes-issues/issues/5199

The reason is -> ( copying Marek's word from discussion)

vma->vm_pgoff is used as index passed to gntdev_find_map_index. It's
basically using this parameter for "which grant reference to map".
map struct returned by gntdev_find_map_index() describes just the pages
to be mapped. Specifically map->pages[0] should be mapped at
vma->vm_start, not vma->vm_start+vma->vm_pgoff*PAGE_SIZE.

When trying to map grant with index (aka vma->vm_pgoff) > 1,
__vm_map_pages() will refuse to map it because it will expect map->count
to be at least vma_pages(vma)+vma->vm_pgoff, while it is exactly
vma_pages(vma).

Converting vm_map_pages() to use vm_map_pages_zero() will fix the
problem.

Marek has tested and confirmed the same.

Cc: [email protected] # v5.2+
Fixes: df9bde015a72 ("xen/gntdev.c: convert to use vm_map_pages()")
Reported-by: Marek Marczykowski-Górecki <[email protected]>
Signed-off-by: Souptick Joarder <[email protected]>
Tested-by: Marek Marczykowski-Górecki <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>

drm/amd/powerplay: enable SW SMU reset functionality

Move SMU irq handler register to sw_init as that's totally
software related. Otherwise, it will prevent SMU reset working.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: fix null pointer dereference around dpm state relates

DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu/powerplay: use proper revision id for navi

The PCI revision id determines the sku.

Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: fix temperature granularity error in smu11

in this patch,
drm/amd/powerplay: add callback function of get_thermal_temperature_range
the driver missed temperature granularity change on other temperature.

Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/powerplay: add callback function of get_thermal_temperature_range

1. the thermal temperature is asic related data, move the code logic to
xxx_ppt.c.
2. replace data structure PP_TemperatureRange with
smu_temperature_range.
3. change temperature uint from temp*1000 to temp (temperature uint).

Signed-off-by: Kevin Wang <[email protected]>
Signed-off-by: Kenneth Feng <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdkfd: Fix byte align on VegaM

This was missed during the addition of VegaM support

Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Kent Russell <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

fgraph: Remove redundant ftrace_graph_notrace_addr() test

We already have tested it before. The second one should be removed.
With this change, the performance should have little improvement.

Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 9cd2992f2d6c ("fgraph: Have set_graph_notrace only affect function_graph tracer")
Signed-off-by: Changbin Du <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

tracing: Fix header include guards in trace event headers

These include guards are broken.

Match the #if !define() and #define lines so that they work correctly.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: f54d1867005c3 ("dma-buf: Rename struct fence to dma_fence")
Fixes: 2e26ca7150a4f ("tracing: Fix tracepoint.h DECLARE_TRACE() to allow more than one header")
Fixes: e543002f77f46 ("qdisc: add tracepoint qdisc:qdisc_dequeue for dequeued SKBs")
Fixes: 95f295f9fe081 ("dmaengine: tegra: add tracepoints to driver")
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

Merge branch 'dax-fix-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull dax fix from Dan Williams:
"Fix a botched manual patch update that got dropped between testing and
application"

* 'dax-fix-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix missed wakeup in put_unlocked_entry()

dm table: fix various whitespace issues with recent DAX code

Also, rename device_synchronous to device_dax_synchronous.

Signed-off-by: Mike Snitzer <[email protected]>

dm table: fix dax_dev NULL dereference in device_synchronous()

If a device doesn't support DAX its 'dax_dev' is NULL. Fix
device_synchronous() to first check if dax_dev is NULL before
dereferencing it.

Fixes: 2e9ee0955d3c ("dm: enable synchronous dax")
Reported-by: [email protected]
Signed-off-by: Pankaj Gupta <[email protected]>
Acked-by: Dan Williams <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

arm64: compat: vdso: Use legacy syscalls as fallback

The generic VDSO implementation uses the Y2038 safe clock_gettime64() and
clock_getres_time64() syscalls as fallback for 32bit VDSO. This breaks
seccomp setups because these syscalls might be not (yet) allowed.

Implement the 32bit variants which use the legacy syscalls and select the
variant in the core library.

The 64bit time variants are not removed because they are required for the
time64 based vdso accessors.

Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation")
Reported-by: Sean Christopherson <[email protected]>
Reported-by: Paul Bolle <[email protected]>
Suggested-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

x86/vdso/32: Use 32bit syscall fallback

The generic VDSO implementation uses the Y2038 safe clock_gettime64() and
clock_getres_time64() syscalls as fallback for 32bit VDSO. This breaks
seccomp setups because these syscalls might be not (yet) allowed.

Implement the 32bit variants which use the legacy syscalls and select the
variant in the core library.

The 64bit time variants are not removed because they are required for the
time64 based vdso accessors.

Fixes: 7ac870747988 ("x86/vdso: Switch to generic vDSO implementation")
Reported-by: Sean Christopherson <[email protected]>
Reported-by: Paul Bolle <[email protected]>
Suggested-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

lib/vdso/32: Provide legacy syscall fallbacks

To address the regression which causes seccomp to deny applications the
access to clock_gettime64() and clock_getres64() syscalls because they
are not enabled in the existing filters.

That trips over the fact that 32bit VDSOs use the new clock_gettime64() and
clock_getres64() syscalls in the fallback path.

Add a conditional to invoke the 32bit legacy fallback syscalls instead of
the new 64bit variants. The conditional can go away once all architectures
are converted.

Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Sean Christopherson <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

lib/vdso: Move fallback invocation to the callers

To allow syscall fallbacks using the legacy 32bit syscall for 32bit VDSO
builds, move the fallback invocation out into the callers.

Split the common code out of __cvdso_clock_gettime/getres() and invoke the
syscall fallback in the 64bit and 32bit variants.

Preparatory work for using legacy syscalls in 32bit VDSO. No functional
change.

Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

lib/vdso/32: Remove inconsistent NULL pointer checks

The 32bit variants of vdso_clock_gettime()/getres() have a NULL pointer
check for the timespec pointer. That's inconsistent vs. 64bit.

But the vdso implementation will never be consistent versus the syscall
because the only case which it can handle is NULL. Any other invalid
pointer will cause a segfault. So special casing NULL is not really useful.

Remove it along with the superflouos syscall fallback invocation as that
will return -EFAULT anyway. That also gets rid of the dubious typecast
which only works because the pointer is NULL.

Fixes: 00b26474c2f1 ("lib/vdso: Provide generic VDSO implementation")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

Merge tag 'for-linus-20190730' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull pidfd fixes from Christian Brauner:
"This makes setting the exit_state in exit_notify() consistent after
  fixing the pidfd polling race pre-rc1. Related to the race fix, this
  adds a WARN_ON() to do_notify_pidfd() to catch any future exit_state
  races.

  Last, this removes an obsolete comment from the pidfd tests"

* tag 'for-linus-20190730' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  exit: make setting exit_state consistent
  pidfd: Add warning if exit_state is 0 during notification
  pidfd: remove obsolete comments from test

Merge tag 'f2fs-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:
"This set of patches adjust to follow recent setflags changes and fix
  two regressions"

* tag 'f2fs-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: use EINVAL for superblock with invalid magic
  f2fs: fix to read source block before invalidating it
  f2fs: remove redundant check from f2fs_setflags_common()
  f2fs: use generic checking function for FS_IOC_FSSETXATTR
  f2fs: use generic checking and prep function for FS_IOC_SETFLAGS

Merge tag 'linux-kselftest-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
"Minor fixes to tests and one major fix to livepatch test to add skip
  handling to avoid false fail reports when livepatch is disabled"

* tag 'linux-kselftest-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/livepatch: add test skip handling
  selftests: mlxsw: Fix typo in qos_mc_aware.sh
  selftests/x86: fix spelling mistake "FAILT" -> "FAIL"
  selftests: kmod: Fix typo in kmod.sh

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"A few regression and bug fixes for the patches merged in the last
  cycle:

   - hns fixes a subtle crash from the ib core SGL rework

   - hfi1 fixes various error handling, oops and protocol errors

   - bnxt_re fixes a regression where nvmeof doesn't work on some
     configurations

   - mlx5 fixes a serious 'use after free' bug in how MR caching is
     handled

   - some edge case crashers in the new statistic core code

   - more siw static checker fixups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
  IB/counters: Always initialize the port counter object
  IB/core: Fix querying total rdma stats
  IB/mlx5: Prevent concurrent MR updates during invalidation
  IB/mlx5: Fix clean_mr() to work in the expected order
  IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
  IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
  IB/mlx5: Fix unreg_umr to ignore the mkey state
  RDMA/siw: Remove set but not used variables 'rv'
  IB/mlx5: Replace kfree with kvfree
  RDMA/bnxt_re: Honor vlan_id in GID entry comparison
  IB/hfi1: Drop all TID RDMA READ RESP packets after r_next_psn
  IB/hfi1: Field not zero-ed when allocating TID flow memory
  IB/hfi1: Unreserve a flushed OPFN request
  IB/hfi1: Check for error on call to alloc_rsm_map_table
  RDMA/hns: Fix sg offset non-zero issue
  RDMA/siw: Fix error return code in siw_init_module()

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull HMM fixes from Jason Gunthorpe:
"Fix the locking around nouveau's use of the hmm_range_* APIs. It works
  correctly in the success case, but many of the the edge cases have
  missing unlocks or double unlocks.

  The diffstat is a bit big as Christoph did a comprehensive job to move
  the obsolete API from the core header and into the driver before
  fixing its flow, but the risk of regression from this code motion is
  low"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  nouveau: unlock mmap_sem on all errors from nouveau_range_fault
  nouveau: remove the block parameter to nouveau_range_fault
  mm/hmm: move hmm_vma_range_done and hmm_vma_fault to nouveau
  mm/hmm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}

loop: Fix mount(2) failure due to race with LOOP_SET_FD

Commit 33ec3e53e7b1 ("loop: Don't change loop device under exclusive
opener") made LOOP_SET_FD ioctl acquire exclusive block device reference
while it updates loop device binding. However this can make perfectly
valid mount(2) fail with EBUSY due to racing LOOP_SET_FD holding
temporarily the exclusive bdev reference in cases like this:

for i in {a..z}{a..z}; do
        dd if=/dev/zero of=$i.image bs=1k count=0 seek=1024
        mkfs.ext2 $i.image
        mkdir mnt$i
done

echo "Run"
for i in {a..z}{a..z}; do
        mount -o loop -t ext2 $i.image mnt$i &
done

Fix the problem by not getting full exclusive bdev reference in
LOOP_SET_FD but instead just mark the bdev as being claimed while we
update the binding information. This just blocks new exclusive openers
instead of failing them with EBUSY thus fixing the problem.

Fixes: 33ec3e53e7b1 ("loop: Don't change loop device under exclusive opener")
Cc: [email protected]
Tested-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

xfs: Fix possible null-pointer dereferences in xchk_da_btree_block_check_sibling()

In xchk_da_btree_block_check_sibling(), there is an if statement on
line 274 to check whether ds->state->altpath.blk[level].bp is NULL:
    if (ds->state->altpath.blk[level].bp)

When ds->state->altpath.blk[level].bp is NULL, it is used on line 281:
    xfs_trans_brelse(..., ds->state->altpath.blk[level].bp);
        struct xfs_buf_log_item *bip = bp->b_log_item;
        ASSERT(bp->b_transp == tp);

Thus, possible null-pointer dereferences may occur.

To fix these bugs, ds->state->altpath.blk[level].bp is checked before
being used.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

exit: make setting exit_state consistent

Since commit b191d6491be6 ("pidfd: fix a poll race when setting exit_state")
we unconditionally set exit_state to EXIT_ZOMBIE before calling into
do_notify_parent(). This was done to eliminate a race when querying
exit_state in do_notify_pidfd().
Back then we decided to do the absolute minimal thing to fix this and
not touch the rest of the exit_notify() function where exit_state is
set.
Since this fix has not caused any issues change the setting of
exit_state to EXIT_DEAD in the autoreap case to account for the fact hat
exit_state is set to EXIT_ZOMBIE unconditionally. This fix was planned
but also explicitly requested in [1] and makes the whole code more
consistent.

/* References */
[1]: https://lore.kernel.org/lkml/CAHk-=wigcxGFR2szue4wavJtH5cYTTeNES=toUBVGsmX0rzX+g@mail.gmail.com

Signed-off-by: Christian Brauner <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Linus Torvalds <[email protected]>

scsi: qla2xxx: Fix possible fcport null-pointer dereferences

In qla2x00_alloc_fcport(), fcport is assigned to NULL in the error
handling code on line 4880:
fcport = NULL;

Then fcport is used on lines 4883-4886:
INIT_WORK(&fcport->del_work, qla24xx_delete_sess_fn);
INIT_WORK(&fcport->reg_work, qla_register_fcport_fn);
INIT_LIST_HEAD(&fcport->gnl_entry);
INIT_LIST_HEAD(&fcport->list);

Thus, possible null-pointer dereferences may occur.

To fix these bugs, qla2x00_alloc_fcport() directly returns NULL
in the error handling code.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <[email protected]>
Acked-by: Himanshu Madhani <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA

Although SAS3 & SAS3.5 IT HBA controllers support 64-bit DMA addressing, as
per hardware design, if DMA-able range contains all 64-bits
set (0xFFFFFFFF-FFFFFFFF) then it results in a firmware fault.

E.g. SGE's start address is 0xFFFFFFFF-FFFF000 and data length is 0x1000
bytes. when HBA tries to DMA the data at 0xFFFFFFFF-FFFFFFFF location then
HBA will fault the firmware.

Driver will set 63-bit DMA mask to ensure the above address will not be
used.

Cc: <[email protected]> # 5.1.20+
Signed-off-by: Suganath Prabu <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: hpsa: remove printing internal cdb on tag collision

Remove racy printing of internal commands. Completion thread can be
cleaning up the command in parallel.

Reviewed-by: Bader Ali - Saleh <[email protected]>
Reviewed-by: Scott Teel <[email protected]>
Reviewed-by: Scott Benesh <[email protected]>
Reviewed-by: Kevin Barnett <[email protected]>
Signed-off-by: Don Brace <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>