Git Repo - linux.git/log

btrfs: fix NULL pointer dereference after failure to create snapshot

When trying to get a new fs root for a snapshot during the transaction
at transaction.c:create_pending_snapshot(), if btrfs_get_new_fs_root()
fails we leave "pending->snap" pointing to an error pointer, and then
later at ioctl.c:create_snapshot() we dereference that pointer, resulting
in a crash:

  [12264.614689] BUG: kernel NULL pointer dereference, address: 00000000000007c4
  [12264.615650] #PF: supervisor write access in kernel mode
  [12264.616487] #PF: error_code(0x0002) - not-present page
  [12264.617436] PGD 0 P4D 0
  [12264.618328] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  [12264.619150] CPU: 0 PID: 2310635 Comm: fsstress Tainted: G        W         5.9.0-rc3-btrfs-next-67 #1
  [12264.619960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  [12264.621769] RIP: 0010:btrfs_mksubvol+0x438/0x4a0 [btrfs]
  [12264.622528] Code: bc ef ff ff (...)
  [12264.624092] RSP: 0018:ffffaa6fc7277cd8 EFLAGS: 00010282
  [12264.624669] RAX: 00000000fffffff4 RBX: ffff9d3e8f151a60 RCX: 0000000000000000
  [12264.625249] RDX: 0000000000000001 RSI: ffffffff9d56c9be RDI: fffffffffffffff4
  [12264.625830] RBP: ffff9d3e8f151b48 R08: 0000000000000000 R09: 0000000000000000
  [12264.626413] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffff4
  [12264.626994] R13: ffff9d3ede380538 R14: ffff9d3ede380500 R15: ffff9d3f61b2eeb8
  [12264.627582] FS:  00007f140d5d8200(0000) GS:ffff9d3fb5e00000(0000) knlGS:0000000000000000
  [12264.628176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [12264.628773] CR2: 00000000000007c4 CR3: 000000020f8e8004 CR4: 00000000003706f0
  [12264.629379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [12264.629994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [12264.630594] Call Trace:
  [12264.631227]  btrfs_mksnapshot+0x7b/0xb0 [btrfs]
  [12264.631840]  __btrfs_ioctl_snap_create+0x16f/0x1a0 [btrfs]
  [12264.632458]  btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs]
  [12264.633078]  btrfs_ioctl+0x1864/0x3130 [btrfs]
  [12264.633689]  ? do_sys_openat2+0x1a7/0x2d0
  [12264.634295]  ? kmem_cache_free+0x147/0x3a0
  [12264.634899]  ? __x64_sys_ioctl+0x83/0xb0
  [12264.635488]  __x64_sys_ioctl+0x83/0xb0
  [12264.636058]  do_syscall_64+0x33/0x80
  [12264.636616]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

  (gdb) list *(btrfs_mksubvol+0x438)
  0x7c7b8 is in btrfs_mksubvol (fs/btrfs/ioctl.c:858).
  853 ret = 0;
  854 pending_snapshot->anon_dev = 0;
  855 fail:
  856 /* Prevent double freeing of anon_dev */
  857 if (ret && pending_snapshot->snap)
  858 pending_snapshot->snap->anon_dev = 0;
  859 btrfs_put_root(pending_snapshot->snap);
  860 btrfs_subvolume_release_metadata(root, &pending_snapshot->block_rsv);
  861 free_pending:
  862 if (pending_snapshot->anon_dev)

So fix this by setting "pending->snap" to NULL if we get an error from the
call to btrfs_get_new_fs_root() at transaction.c:create_pending_snapshot().

Fixes: 2dfb1e43f57dd3 ("btrfs: preallocate anon block device at first phase of snapshot creation")
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

usb: typec: intel_pmc_mux: Do not configure SBU and HSL Orientation in Alternate modes

According to the PMC Type C Subsystem (TCSS) Mux programming guide rev
0.7, bits 4 and 5 are reserved in Alternate modes.
SBU Orientation and HSL Orientation needs to be configured only during
initial cable detection in USB connect flow based on device property of
"sbu-orientation" and "hsl-orientation".
Configuring these reserved bits in the Alternate modes may result in delay
in display link training or some unexpected behaviour.
So do not configure them while issuing Alternate Mode requests.

Fixes: ff4a30d5e243 ("usb: typec: mux: intel_pmc_mux: Support for static SBU/HSL orientation")
Signed-off-by: Utkarsh Patel <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Heikki Krogerus <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: intel_pmc_mux: Do not configure Altmode HPD High

According to the PMC Type C Subsystem (TCSS) Mux programming guide rev
0.7, bit 14 is reserved in Alternate mode.
In DP Alternate Mode state, if the HPD_STATE (bit 7) field in the
status update command VDO is set to HPD_HIGH, HPD is configured via
separate HPD mode request after configuring DP Alternate mode request.
Configuring reserved bit may show unexpected behaviour.
So do not configure them while issuing the Alternate Mode request.

Fixes: 7990be48ef4d ("usb: typec: mux: intel: Handle alt mode HPD_HIGH")
Cc: [email protected]
Signed-off-by: Utkarsh Patel <[email protected]>
Signed-off-by: Heikki Krogerus <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

scripts/tags.sh: exclude tools directory from tags generation

when COMPILED_SOURCE is set, running 'make ARCH=x86_64 COMPILED_SOURCE=1
cscope tags' in KBUILD_OUTPUT directory produces lots of "No such file
or directory" warnings:
...
realpath: sigchain.h: No such file or directory
realpath: orc_gen.c: No such file or directory
realpath: objtool.c: No such file or directory
...
let's exclude tools directory from tags generation

Fixes: 4f491bb6ea2a ("scripts/tags.sh: collect compiled source precisely")
Link: https://lore.kernel.org/lkml/20200809210056.GA1344537@thinkpad
Signed-off-by: Rustam Kovhaev <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: free data reloc tree on failed mount

While testing a weird problem with -o degraded, I noticed I was getting
leaked root errors

  BTRFS warning (device loop0): writable mount is not allowed due to too many missing devices
  BTRFS error (device loop0): open_ctree failed
  BTRFS error (device loop0): leaked root -9-0 refcount 1

This is the DATA_RELOC root, which gets read before the other fs roots,
but is included in the fs roots radix tree.  Handle this by adding a
btrfs_drop_and_free_fs_root() on the data reloc root if it exists.  This
is ok to do here if we fail further up because we will only drop the ref
if we delete the root from the radix tree, and all other cleanup won't
be duplicated.

CC: [email protected] # 5.8+
Reviewed-by: Nikolay Borisov <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

btrfs: require only sector size alignment for parent eb bytenr

[BUG]
A completely sane converted fs will cause kernel warning at balance
time:

  [ 1557.188633] BTRFS info (device sda7): relocating block group 8162107392 flags data
  [ 1563.358078] BTRFS info (device sda7): found 11722 extents
  [ 1563.358277] BTRFS info (device sda7): leaf 7989321728 gen 95 total ptrs 213 free space 3458 owner 2
  [ 1563.358280] item 0 key (7984947200 169 0) itemoff 16250 itemsize 33
  [ 1563.358281] extent refs 1 gen 90 flags 2
  [ 1563.358282] ref#0: tree block backref root 4
  [ 1563.358285] item 1 key (7985602560 169 0) itemoff 16217 itemsize 33
  [ 1563.358286] extent refs 1 gen 93 flags 258
  [ 1563.358287] ref#0: shared block backref parent 7985602560
  [ 1563.358288] (parent 7985602560 is NOT ALIGNED to nodesize 16384)
  [ 1563.358290] item 2 key (7985635328 169 0) itemoff 16184 itemsize 33
  ...
  [ 1563.358995] BTRFS error (device sda7): eb 7989321728 invalid extent inline ref type 182
  [ 1563.358996] ------------[ cut here ]------------
  [ 1563.359005] WARNING: CPU: 14 PID: 2930 at 0xffffffff9f231766

Then with transaction abort, and obviously failed to balance the fs.

[CAUSE]
That mentioned inline ref type 182 is completely sane, it's
BTRFS_SHARED_BLOCK_REF_KEY, it's some extra check making kernel to
believe it's invalid.

Commit 64ecdb647ddb ("Btrfs: add one more sanity check for shared ref
type") introduced extra checks for backref type.

One of the requirement is, parent bytenr must be aligned to node size,
which is not correct.

One example is like this:

0 1G  1G+4K 2G 2G+4K
|   |///////////////////|//|  <- A chunk starts at 1G+4K
            |   | <- A tree block get reserved at bytenr 1G+4K

Then we have a valid tree block at bytenr 1G+4K, but not aligned to
nodesize (16K).

Such chunk is not ideal, but current kernel can handle it pretty well.
We may warn about such tree block in the future, but should not reject
them.

[FIX]
Change the alignment requirement from node size alignment to sector size
alignment.

Also, to make our lives a little easier, also output @iref when
btrfs_get_extent_inline_ref_type() failed, so we can locate the item
easier.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205475
Fixes: 64ecdb647ddb ("Btrfs: add one more sanity check for shared ref type")
CC: [email protected] # 4.14+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
[ update comments and messages ]
Signed-off-by: David Sterba <[email protected]>

btrfs: fix lockdep splat in add_missing_dev

Nikolay reported a lockdep splat in generic/476 that I could reproduce
with btrfs/187.

  ======================================================
  WARNING: possible circular locking dependency detected
  5.9.0-rc2+ #1 Tainted: G        W
  ------------------------------------------------------
  kswapd0/100 is trying to acquire lock:
  ffff9e8ef38b6268 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x330

  but task is already holding lock:
  ffffffffa9d74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0x65/0x80
slab_pre_alloc_hook.constprop.0+0x20/0x200
kmem_cache_alloc_trace+0x3a/0x1a0
btrfs_alloc_device+0x43/0x210
add_missing_dev+0x20/0x90
read_one_chunk+0x301/0x430
btrfs_read_sys_array+0x17b/0x1b0
open_ctree+0xa62/0x1896
btrfs_mount_root.cold+0x12/0xea
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x10d/0x379
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
path_mount+0x434/0xc00
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
__mutex_lock+0x7e/0x7e0
btrfs_chunk_alloc+0x125/0x3a0
find_free_extent+0xdf6/0x1210
btrfs_reserve_extent+0xb3/0x1b0
btrfs_alloc_tree_block+0xb0/0x310
alloc_tree_block_no_bg_flush+0x4a/0x60
__btrfs_cow_block+0x11a/0x530
btrfs_cow_block+0x104/0x220
btrfs_search_slot+0x52e/0x9d0
btrfs_lookup_inode+0x2a/0x8f
__btrfs_update_delayed_inode+0x80/0x240
btrfs_commit_inode_delayed_inode+0x119/0x120
btrfs_evict_inode+0x357/0x500
evict+0xcf/0x1f0
vfs_rmdir.part.0+0x149/0x160
do_rmdir+0x136/0x1a0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&delayed_node->mutex){+.+.}-{3:3}:
__lock_acquire+0x1184/0x1fa0
lock_acquire+0xa4/0x3d0
__mutex_lock+0x7e/0x7e0
__btrfs_release_delayed_node.part.0+0x3f/0x330
btrfs_evict_inode+0x24c/0x500
evict+0xcf/0x1f0
dispose_list+0x48/0x70
prune_icache_sb+0x44/0x50
super_cache_scan+0x161/0x1e0
do_shrink_slab+0x178/0x3c0
shrink_slab+0x17c/0x290
shrink_node+0x2b2/0x6d0
balance_pgdat+0x30a/0x670
kswapd+0x213/0x4c0
kthread+0x138/0x160
ret_from_fork+0x1f/0x30

  other info that might help us debug this:

  Chain exists of:
    &delayed_node->mutex --> &fs_info->chunk_mutex --> fs_reclaim

   Possible unsafe locking scenario:

CPU0                    CPU1
----                    ----
    lock(fs_reclaim);
lock(&fs_info->chunk_mutex);
lock(fs_reclaim);
    lock(&delayed_node->mutex);

   *** DEADLOCK ***

  3 locks held by kswapd0/100:
   #0: ffffffffa9d74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
   #1: ffffffffa9d65c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290
   #2: ffff9e8e9da260e0 (&type->s_umount_key#48){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0

  stack backtrace:
  CPU: 1 PID: 100 Comm: kswapd0 Tainted: G        W         5.9.0-rc2+ #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:
   dump_stack+0x92/0xc8
   check_noncircular+0x12d/0x150
   __lock_acquire+0x1184/0x1fa0
   lock_acquire+0xa4/0x3d0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   __mutex_lock+0x7e/0x7e0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   ? lock_acquire+0xa4/0x3d0
   ? btrfs_evict_inode+0x11e/0x500
   ? find_held_lock+0x2b/0x80
   __btrfs_release_delayed_node.part.0+0x3f/0x330
   btrfs_evict_inode+0x24c/0x500
   evict+0xcf/0x1f0
   dispose_list+0x48/0x70
   prune_icache_sb+0x44/0x50
   super_cache_scan+0x161/0x1e0
   do_shrink_slab+0x178/0x3c0
   shrink_slab+0x17c/0x290
   shrink_node+0x2b2/0x6d0
   balance_pgdat+0x30a/0x670
   kswapd+0x213/0x4c0
   ? _raw_spin_unlock_irqrestore+0x46/0x60
   ? add_wait_queue_exclusive+0x70/0x70
   ? balance_pgdat+0x670/0x670
   kthread+0x138/0x160
   ? kthread_create_worker_on_cpu+0x40/0x40
   ret_from_fork+0x1f/0x30

This is because we are holding the chunk_mutex when we call
btrfs_alloc_device, which does a GFP_KERNEL allocation.  We don't want
to switch that to a GFP_NOFS lock because this is the only place where
it matters.  So instead use memalloc_nofs_save() around the allocation
in order to avoid the lockdep splat.

Reported-by: Nikolay Borisov <[email protected]>
CC: [email protected] # 4.4+
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>

PM: <linux/device.h>: fix @em_pd kernel-doc warning

Fix kernel-doc warning in <linux/device.h>:

../include/linux/device.h:613: warning: Function parameter or member 'em_pd' not described in 'device'

Fixes: 1bc138c62295 ("PM / EM: add support for other devices than CPUs in Energy Model")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

openrisc: Fix cache API compile issue when not inlining

I found this when compiling a kbuild random config with GCC 11.  The
config enables CONFIG_DEBUG_SECTION_MISMATCH, which sets CFLAGS
-fno-inline-functions-called-once. This causes the call to cache_loop in
cache.c to not be inlined causing the below compile error.

    In file included from arch/openrisc/mm/cache.c:13:
    arch/openrisc/mm/cache.c: In function 'cache_loop':
    ./arch/openrisc/include/asm/spr.h:16:27: warning: 'asm' operand 0 probably does not match constraints
       16 | #define mtspr(_spr, _val) __asm__ __volatile__ (  \
  |                           ^~~~~~~
    arch/openrisc/mm/cache.c:25:3: note: in expansion of macro 'mtspr'
       25 |   mtspr(reg, line);
  |   ^~~~~
    ./arch/openrisc/include/asm/spr.h:16:27: error: impossible constraint in 'asm'
       16 | #define mtspr(_spr, _val) __asm__ __volatile__ (  \
  |                           ^~~~~~~
    arch/openrisc/mm/cache.c:25:3: note: in expansion of macro 'mtspr'
       25 |   mtspr(reg, line);
  |   ^~~~~
    make[1]: *** [scripts/Makefile.build:283: arch/openrisc/mm/cache.o] Error 1

The asm constraint "K" requires a immediate constant argument to mtspr,
however because of no inlining a register argument is passed causing a
failure.  Fix this by using __always_inline.

Link: https://lore.kernel.org/lkml/202008200453.ohnhqkjQ%[email protected]/
Signed-off-by: Stafford Horne <[email protected]>

openrisc: Reserve memblock for initrd

Recently OpenRISC added support for external initrd images, but I found
some instability when using larger buildroot initrd images. It turned
out that I forgot to reserve the memblock space for the initrd image.

This patch fixes the instability issue by reserving memblock space.

Fixes: ff6c923dbec3 ("openrisc: Add support for external initrd images")
Signed-off-by: Stafford Horne <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>

spi: stm32: Rate-limit the 'Communication suspended' message

The 'spi_stm32 44004000.spi: Communication suspended' message means that
when using PIO, the kernel did not read the FIFO fast enough and so the
SPI controller paused the transfer. Currently, this is printed on every
single such event, so if the kernel is busy and the controller is pausing
the transfers often, the kernel will be all the more busy scrolling this
message into the log buffer every few milliseconds. That is not helpful.

Instead, rate-limit the message and print it every once in a while. It is
not possible to use the default dev_warn_ratelimited(), because that is
still too verbose, as it prints 10 lines (DEFAULT_RATELIMIT_BURST) every
5 seconds (DEFAULT_RATELIMIT_INTERVAL). The policy here is to print 1 line
every 50 seconds (DEFAULT_RATELIMIT_INTERVAL * 10), because 1 line is more
than enough and the cycles saved on printing are better left to the CPU to
handle the SPI. However, dev_warn_once() is also not useful, as the user
should be aware that this condition is possibly recurring or ongoing. Thus
the custom rate-limit policy.

Finally, turn the message from dev_warn() to dev_dbg(), since the system
does not suffer any sort of malfunction if this message appears, it is
just slowing down. This further reduces the printing into the log buffer
and frees the CPU to do useful work.

Fixes: dcbe0d84dfa5 ("spi: add driver for STM32 SPI controller")
Signed-off-by: Marek Vasut <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Amelie Delaunay <[email protected]>
Cc: Antonio Borneo <[email protected]>
Cc: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

rbd: require global CAP_SYS_ADMIN for mapping and unmapping

It turns out that currently we rely only on sysfs attribute
permissions:

  $ ll /sys/bus/rbd/{add*,remove*}
  --w------- 1 root root 4096 Sep  3 20:37 /sys/bus/rbd/add
  --w------- 1 root root 4096 Sep  3 20:37 /sys/bus/rbd/add_single_major
  --w------- 1 root root 4096 Sep  3 20:37 /sys/bus/rbd/remove
  --w------- 1 root root 4096 Sep  3 20:38 /sys/bus/rbd/remove_single_major

This means that images can be mapped and unmapped (i.e. block devices
can be created and deleted) by a UID 0 process even after it drops all
privileges or by any process with CAP_DAC_OVERRIDE in its user namespace
as long as UID 0 is mapped into that user namespace.

Be consistent with other virtual block devices (loop, nbd, dm, md, etc)
and require CAP_SYS_ADMIN in the initial user namespace for mapping and
unmapping, and also for dumping the configuration string and refreshing
the image header.

Cc: [email protected]
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>

kobject: Drop unneeded conditional in __kobject_del()

__kobject_del() is called from two places, in one where kobj is dereferenced
before and thus can't be NULL, and in the other the NULL check is done before
call. Drop unneeded conditional in __kobject_del().

Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

mmc: sdio: Use mmc_pre_req() / mmc_post_req()

SDHCI changed from using a tasklet to finish requests, to using an IRQ
thread i.e. commit c07a48c2651965 ("mmc: sdhci: Remove finish_tasklet").
Because this increased the latency to complete requests, a preparatory
change was made to complete the request from the IRQ handler if
possible i.e. commit 19d2f695f4e827 ("mmc: sdhci: Call mmc_request_done()
from IRQ handler if possible").  That alleviated the situation for MMC
block devices because the MMC block driver makes use of mmc_pre_req()
and mmc_post_req() so that successful requests are completed in the IRQ
handler and any DMA unmapping is handled separately in mmc_post_req().
However SDIO was still affected, and an example has been reported with
up to 20% degradation in performance.

Looking at SDIO I/O helper functions, sdio_io_rw_ext_helper() appeared
to be a possible candidate for making use of asynchronous requests
within its I/O loops, but analysis revealed that these loops almost
never iterate more than once, so the complexity of the change would not
be warrented.

Instead, mmc_pre_req() and mmc_post_req() are added before and after I/O
submission (mmc_wait_for_req) in mmc_io_rw_extended().  This still has
the potential benefit of reducing the duration of interrupt handlers, as
well as addressing the latency issue for SDHCI.  It also seems a more
reasonable solution than forcing drivers to do everything in the IRQ
handler.

Reported-by: Dmitry Osipenko <[email protected]>
Fixes: c07a48c2651965 ("mmc: sdhci: Remove finish_tasklet")
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Dmitry Osipenko <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci-of-esdhc: Don't walk device-tree on every interrupt

Commit b214fe592ab7 ("mmc: sdhci-of-esdhc: add erratum eSDHC7 support")
added code to check for a specific compatible string in the device-tree
on every esdhc interrupat. Instead of doing this record the quirk in
struct sdhci_esdhc and lookup the struct in esdhc_irq.

Signed-off-by: Chris Packham <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Fixes: b214fe592ab7 ("mmc: sdhci-of-esdhc: add erratum eSDHC7 support")
Cc: [email protected]
Signed-off-by: Ulf Hansson <[email protected]>

mmc: mmc_spi: Allow the driver to be built when CONFIG_HAS_DMA is unset

The commit cd57d07b1e4e ("sh: don't allow non-coherent DMA for NOMMU") made
CONFIG_NO_DMA to be set for some platforms, for good reasons.
Consequentially, CONFIG_HAS_DMA doesn't get set, which makes the DMA
mapping interface to be built as stub functions, but also prevent the
mmc_spi driver from being built as it depends on CONFIG_HAS_DMA.

It turns out that for some odd cases, the driver still relied on the DMA
mapping interface, even if the DMA was not actively being used.

To fixup the behaviour, let's drop the build dependency for CONFIG_HAS_DMA.
Moreover, as to allow the driver to succeed probing, let's move the DMA
initializations behind "#ifdef CONFIG_HAS_DMA".

Fixes: cd57d07b1e4e ("sh: don't allow non-coherent DMA for NOMMU")
Reported-by: Rich Felker <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
Tested-by: Rich Felker <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

mmc: sdhci-msm: Add retries when all tuning phases are found valid

As the comments in this patch say, if we tune and find all phases are
valid it's _almost_ as bad as no phases being found valid.  Probably
all phases are not really reliable but we didn't detect where the
unreliable place is.  That means we'll essentially be guessing and
hoping we get a good phase.

This is not just a problem in theory.  It was causing real problems on
a real board.  On that board, most often phase 10 is found as the only
invalid phase, though sometimes 10 and 11 are invalid and sometimes
just 11.  Some percentage of the time, however, all phases are found
to be valid.  When this happens, the current logic will decide to use
phase 11.  Since phase 11 is sometimes found to be invalid, this is a
bad choice.  Sure enough, when phase 11 is picked we often get mmc
errors later in boot.

I have seen cases where all phases were found to be valid 3 times in a
row, so increase the retry count to 10 just to be extra sure.

Fixes: 415b5a75da43 ("mmc: sdhci-msm: Add platform_execute_tuning implementation")
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Veerabhadrarao Badiganti <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Link: https://lore.kernel.org/r/20200827075809.1.If179abf5ecb67c963494db79c3bc4247d987419b@changeid
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci-acpi: Clear amd_sdhci_host on reset

The commit 61d7437ed1390 ("mmc: sdhci-acpi: Fix HS400 tuning for AMDI0040")
broke resume for eMMC HS400. When the system suspends the eMMC controller
is powered down. So, on resume we need to reinitialize the controller.
Although, amd_sdhci_host was not getting cleared, so the DLL was never
re-enabled on resume. This results in HS400 being non-functional.

To fix the problem, this change clears the tuned_clock flag, clears the
dll_enabled flag and disables the DLL on reset.

Fixes: 61d7437ed1390 ("mmc: sdhci-acpi: Fix HS400 tuning for AMDI0040")
Signed-off-by: Raul E Rangel <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Link: https://lore.kernel.org/r/20200831150517.1.I93c78bfc6575771bb653c9d3fca5eb018a08417d@changeid
Signed-off-by: Ulf Hansson <[email protected]>

cifs: fix DFS mount with cifsacl/modefromsid

RHBZ: 1871246

If during cifs_lookup()/get_inode_info() we encounter a DFS link
and we use the cifsacl or modefromsid mount options we must suppress
any -EREMOTE errors that triggers or else we will not be able to follow
the DFS link and automount the target.

This fixes an issue with modefromsid/cifsacl where these mountoptions
would break DFS and we would no longer be able to access the share.

Signed-off-by: Ronnie Sahlberg <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>

Linux 5.9-rc4

Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block

Pull more io_uring fixes from Jens Axboe:
"Two followup fixes. One is fixing a regression from this merge window,
  the other is two commits fixing cancelation of deferred requests.

  Both have gone through full testing, and both spawned a few new
  regression test additions to liburing.

   - Don't play games with const, properly store the output iovec and
     assign it as needed.

   - Deferred request cancelation fix (Pavel)"

* tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
  io_uring: fix linked deferred ->files cancellation
  io_uring: fix cancel of deferred reqs with ->files
  io_uring: fix explicit async read/write mapping for large segments

Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

- three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
   pointer dereference bug and serialize a hardware register access as
   required by the VT-d spec.

- two patches for AMD IOMMU to force AMD GPUs into translation mode
   when memory encryption is active and disallow using IOMMUv2
   functionality.  This makes the AMDGPU driver work when memory
   encryption is active.

- two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
   Table Entries.

- MAINTAINERS file update for the Qualcom IOMMU driver.

* tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Handle 36bit addressing for x86-32
  iommu/amd: Do not use IOMMUv2 functionality when SME is active
  iommu/amd: Do not force direct mapping when SME is active
  iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
  iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
  iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
  iommu/vt-d: Serialize IOMMU GCMD register modifications
  MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move

Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

- more generic entry code ABI fallout

- debug register handling bugfixes

- fix vmalloc mappings on 32-bit kernels

- kprobes instrumentation output fix on 32-bit kernels

- fix over-eager WARN_ON_ONCE() on !SMAP hardware

- NUMA debugging fix

- fix Clang related crash on !RETPOLINE kernels

* tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry: Unbreak 32bit fast syscall
  x86/debug: Allow a single level of #DB recursion
  x86/entry: Fix AC assertion
  tracing/kprobes, x86/ptrace: Fix regs argument order for i386
  x86, fakenuma: Fix invalid starting node ID
  x86/mm/32: Bring back vmalloc faulting on x86_32
  x86/cmdline: Disable jump tables for cmdline.c

Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
"A small series for fixing a problem with Xen PVH guests when running
  as backends (e.g. as dom0).

  Mapping other guests' memory is now working via ZONE_DEVICE, thus not
  requiring to abuse the memory hotplug functionality for that purpose"

* tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: add helpers to allocate unpopulated memory
  memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
  xen/balloon: add header guard

io_uring: fix linked deferred ->files cancellation

While looking for ->files in ->defer_list, consider that requests there
may actually be links.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

io_uring: fix cancel of deferred reqs with ->files

While trying to cancel requests with ->files, it also should look for
requests in ->defer_list, otherwise it might end up hanging a thread.

Cancel all requests in ->defer_list up to the last request there with
matching ->files, that's needed to follow drain ordering semantics.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux

Pull misc fixes from Miguel Ojeda:
"A trivial patch for auxdisplay:

   - Replace HTTP links with HTTPS ones (Alexander A. Klimov)

  The usual clang-format trivial update:

   - Update with the latest for_each macro list (Miguel Ojeda)

  And Luc requested me to pick a sparse fix on my queue, so here it goes
  along with other two trivial Compiler Attributes ones (also from Luc).

   - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van
     Oostenryck)

   - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van
     Oostenryck)

   - Compiler Attributes: remove comment about sparse not supporting
     __has_attribute (Luc Van Oostenryck)"

* tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
  auxdisplay: Replace HTTP links with HTTPS ones

* tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
  clang-format: Update with the latest for_each macro list

* tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
  sparse: use static inline for __chk_{user,io}_ptr()
  Compiler Attributes: fix comment concerning GCC 4.6
  Compiler Attributes: remove comment about sparse not supporting __has_attribute

Merge tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

- HSDK-4xd Dev system: perf driver updates for sampling interrupt

- HSDK* Dev System: Ethernet broken [Evgeniy Didin]

- HIGHMEM broken (2 memory banks) [Mike Rapoport]

- show_regs() rewrite once and for all

- Other minor fixes

* tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id
  arc: fix memory initialization for systems with two memory banks
  irqchip/eznps: Fix build error for !ARC700 builds
  ARC: show_regs: fix r12 printing and simplify
  ARC: HSDK: wireup perf irq
  ARC: perf: don't bail setup if pct irq missing in device-tree
  ARC: pgalloc.h: delete a duplicated word + other fixes

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"19 patches.

  Subsystems affected by this patch series: MAINTAINERS, ipc, fork,
  checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration,
  hugetlb)"

* emailed patches from Andrew Morton <[email protected]>:
  include/linux/log2.h: add missing () around n in roundup_pow_of_two()
  mm/khugepaged.c: fix khugepaged's request size in collapse_file
  mm/hugetlb: fix a race between hugetlb sysctl handlers
  mm/hugetlb: try preferred node first when alloc gigantic page from cma
  mm/migrate: preserve soft dirty in remove_migration_pte()
  mm/migrate: remove unnecessary is_zone_device_page() check
  mm/rmap: fixup copying of soft dirty and uffd ptes
  mm/migrate: fixup setting UFFD_WP flag
  mm: madvise: fix vma user-after-free
  checkpatch: fix the usage of capture group ( ... )
  fork: adjust sysctl_max_threads definition to match prototype
  ipc: adjust proc_ipc_sem_dointvec definition to match prototype
  mm: track page table modifications in __apply_to_page_range()
  MAINTAINERS: IA64: mark Status as Odd Fixes only
  MAINTAINERS: add LLVM maintainers
  MAINTAINERS: update Cavium/Marvell entries
  mm: slub: fix conversion of freelist_corrupted()
  mm: memcg: fix memcg reclaim soft lockup
  memcg: fix use-after-free in uncharge_batch

include/linux/log2.h: add missing () around n in roundup_pow_of_two()

Otherwise gcc generates warnings if the expression is complicated.

Fixes: 312a0c170945 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant")
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/khugepaged.c: fix khugepaged's request size in collapse_file

collapse_file() in khugepaged passes PAGE_SIZE as the number of pages to
be read to page_cache_sync_readahead(). The intent was probably to read
a single page. Fix it to use the number of pages to the end of the
window instead.

Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Yang Shi <[email protected]>
Acked-by: Pankaj Gupta <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/hugetlb: fix a race between hugetlb sysctl handlers

There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.

  CPU0:                                 CPU1:
                                        proc_sys_write
  hugetlb_sysctl_handler                  proc_sys_call_handler
  hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
    table->data = &tmp;                       hugetlb_sysctl_handler_common
                                                table->data = &tmp;
      proc_doulongvec_minmax
        do_proc_doulongvec_minmax           sysctl_head_finish
          __do_proc_doulongvec_minmax         unuse_table
            i = table->data;
            *i = val;  // corrupt CPU1's stack

Fix this by duplicating the `table`, and only update the duplicate of
it.  And introduce a helper of proc_hugetlb_doulongvec_minmax() to
simplify the code.

The following oops was seen:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    Code: Bad RIP value.
    ...
    Call Trace:
     ? set_max_huge_pages+0x3da/0x4f0
     ? alloc_pool_huge_page+0x150/0x150
     ? proc_doulongvec_minmax+0x46/0x60
     ? hugetlb_sysctl_handler_common+0x1c7/0x200
     ? nr_hugepages_store+0x20/0x20
     ? copy_fd_bitmaps+0x170/0x170
     ? hugetlb_sysctl_handler+0x1e/0x20
     ? proc_sys_call_handler+0x2f1/0x300
     ? unregister_sysctl_table+0xb0/0xb0
     ? __fd_install+0x78/0x100
     ? proc_sys_write+0x14/0x20
     ? __vfs_write+0x4d/0x90
     ? vfs_write+0xef/0x240
     ? ksys_write+0xc0/0x160
     ? __ia32_sys_read+0x50/0x50
     ? __close_fd+0x129/0x150
     ? __x64_sys_write+0x43/0x50
     ? do_syscall_64+0x6c/0x200
     ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
Signed-off-by: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/hugetlb: try preferred node first when alloc gigantic page from cma

Since commit cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic
hugepages using cma"), the gigantic page would be allocated from node
which is not the preferred node, although there are pages available from
that node. The reason is that the nid parameter has been ignored in
alloc_gigantic_page().

Besides, the __GFP_THISNODE also need be checked if user required to
alloc only from the preferred node.

After this patch, the preferred node is tried first before other allowed
nodes, and don't try to allocate from other nodes if __GFP_THISNODE is
specified. If user don't specify the preferred node, the current node
will be used as preferred node, which makes sure consistent behavior of
allocating gigantic and non-gigantic hugetlb page.

Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Li Xinhai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Roman Gushchin <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/migrate: preserve soft dirty in remove_migration_pte()

The code to remove a migration PTE and replace it with a device private
PTE was not copying the soft dirty bit from the migration entry. This
could lead to page contents not being marked dirty when faulting the page
back from device private memory.

Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Bharata B Rao <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/migrate: remove unnecessary is_zone_device_page() check

Patch series "mm/migrate: preserve soft dirty in remove_migration_pte()".

I happened to notice this from code inspection after seeing Alistair
Popple's patch ("mm/rmap: Fixup copying of soft dirty and uffd ptes").

This patch (of 2):

The check for is_zone_device_page() and is_device_private_page() is
unnecessary since the latter is sufficient to determine if the page is a
device private page. Simplify the code for easier reading.

Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Bharata B Rao <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/rmap: fixup copying of soft dirty and uffd ptes

During memory migration a pte is temporarily replaced with a migration
swap pte. Some pte bits from the existing mapping such as the soft-dirty
and uffd write-protect bits are preserved by copying these to the
temporary migration swap pte.

However these bits are not stored at the same location for swap and
non-swap ptes. Therefore testing these bits requires using the
appropriate helper function for the given pte type.

Unfortunately several code locations were found where the wrong helper
function is being used to test soft_dirty and uffd_wp bits which leads to
them getting incorrectly set or cleared during page-migration.

Fix these by using the correct tests based on pte type.

Fixes: a5430dda8a3a ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Fixes: f45ec5ff16a7 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Alistair Popple <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/migrate: fixup setting UFFD_WP flag

Commit f45ec5ff16a75 ("userfaultfd: wp: support swap and page migration")
introduced support for tracking the uffd wp bit during page migration.
However the non-swap PTE variant was used to set the flag for zone device
private pages which are a type of swap page.

This leads to corruption of the swap offset if the original PTE has the
uffd_wp flag set.

Fixes: f45ec5ff16a75 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Alistair Popple <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Ralph Campbell <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: madvise: fix vma user-after-free

The syzbot reported the below use-after-free:

  BUG: KASAN: use-after-free in madvise_willneed mm/madvise.c:293 [inline]
  BUG: KASAN: use-after-free in madvise_vma mm/madvise.c:942 [inline]
  BUG: KASAN: use-after-free in do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145
  Read of size 8 at addr ffff8880a6163eb0 by task syz-executor.0/9996

  CPU: 0 PID: 9996 Comm: syz-executor.0 Not tainted 5.9.0-rc1-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x18f/0x20d lib/dump_stack.c:118
    print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
    __kasan_report mm/kasan/report.c:513 [inline]
    kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
    madvise_willneed mm/madvise.c:293 [inline]
    madvise_vma mm/madvise.c:942 [inline]
    do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145
    do_madvise mm/madvise.c:1169 [inline]
    __do_sys_madvise mm/madvise.c:1171 [inline]
    __se_sys_madvise mm/madvise.c:1169 [inline]
    __x64_sys_madvise+0xd9/0x110 mm/madvise.c:1169
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Allocated by task 9992:
    kmem_cache_alloc+0x138/0x3a0 mm/slab.c:3482
    vm_area_alloc+0x1c/0x110 kernel/fork.c:347
    mmap_region+0x8e5/0x1780 mm/mmap.c:1743
    do_mmap+0xcf9/0x11d0 mm/mmap.c:1545
    vm_mmap_pgoff+0x195/0x200 mm/util.c:506
    ksys_mmap_pgoff+0x43a/0x560 mm/mmap.c:1596
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 9992:
    kmem_cache_free.part.0+0x67/0x1f0 mm/slab.c:3693
    remove_vma+0x132/0x170 mm/mmap.c:184
    remove_vma_list mm/mmap.c:2613 [inline]
    __do_munmap+0x743/0x1170 mm/mmap.c:2869
    do_munmap mm/mmap.c:2877 [inline]
    mmap_region+0x257/0x1780 mm/mmap.c:1716
    do_mmap+0xcf9/0x11d0 mm/mmap.c:1545
    vm_mmap_pgoff+0x195/0x200 mm/util.c:506
    ksys_mmap_pgoff+0x43a/0x560 mm/mmap.c:1596
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

It is because vma is accessed after releasing mmap_lock, but someone
else acquired the mmap_lock and the vma is gone.

Releasing mmap_lock after accessing vma should fix the problem.

Fixes: 692fe62433d4c ("mm: Handle MADV_WILLNEED through vfs_fadvise()")
Reported-by: [email protected]
Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: <[email protected]> [5.4+]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: fix the usage of capture group ( ... )

The usage of "capture group (...)" in the immediate condition after `&&`
results in `$1` being uninitialized.  This issues a warning "Use of
uninitialized value $1 in regexp compilation at ./scripts/checkpatch.pl
line 2638".

I noticed this bug while running checkpatch on the set of commits from
v5.7 to v5.8-rc1 of the kernel on the commits with a diff content in
their commit message.

This bug was introduced in the script by commit e518e9a59ec3
("checkpatch: emit an error when there's a diff in a changelog").  It
has been in the script since then.

The author intended to store the match made by capture group in variable
`$1`.  This should have contained the name of the file as `[\w/]+`
matched.  However, this couldn't be accomplished due to usage of capture
group and `$1` in the same regular expression.

Fix this by placing the capture group in the condition before `&&`.
Thus, `$1` can be initialized to the text that capture group matches
thereby setting it to the desired and required value.

Fixes: e518e9a59ec3 ("checkpatch: emit an error when there's a diff in a changelog")
Signed-off-by: Mrinal Pandey <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Lukas Bulwahn <[email protected]>
Reviewed-by: Lukas Bulwahn <[email protected]>
Cc: Joe Perches <[email protected]>
Link: https://lkml.kernel.org/r/20200714032352.f476hanaj2dlmiot@mrinalpandey
Signed-off-by: Linus Torvalds <[email protected]>

fork: adjust sysctl_max_threads definition to match prototype

Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
definition of sysctl_max_threads to match its prototype in
linux/sysctl.h which fixes the following sparse error/warning:

  kernel/fork.c:3050:47: warning: incorrect type in argument 3 (different address spaces)
  kernel/fork.c:3050:47:    expected void *
  kernel/fork.c:3050:47:    got void [noderef] __user *buffer
  kernel/fork.c:3036:5: error: symbol 'sysctl_max_threads' redeclared with different type (incompatible argument 3 (different address spaces)):
  kernel/fork.c:3036:5:    int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )
  kernel/fork.c: note: in included file (through include/linux/key.h, include/linux/cred.h, include/linux/sched/signal.h, include/linux/sched/cputime.h):
  include/linux/sysctl.h:242:5: note: previously declared as:
  include/linux/sysctl.h:242:5:    int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )

Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Al Viro <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

ipc: adjust proc_ipc_sem_dointvec definition to match prototype

Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
signature of proc_ipc_sem_dointvec to match ctl_table.proc_handler which
fixes the following sparse error/warning:

  ipc/ipc_sysctl.c:94:47: warning: incorrect type in argument 3 (different address spaces)
  ipc/ipc_sysctl.c:94:47:    expected void *buffer
  ipc/ipc_sysctl.c:94:47:    got void [noderef] __user *buffer
  ipc/ipc_sysctl.c:194:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces))
  ipc/ipc_sysctl.c:194:35:    expected int ( [usertype] *proc_handler )( ... )
  ipc/ipc_sysctl.c:194:35:    got int ( * )( ... )

Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Alexander Viro <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: track page table modifications in __apply_to_page_range()

__apply_to_page_range() is also used to change and/or allocate
page-table pages in the vmalloc area of the address space.  Make sure
these changes get synchronized to other page-tables in the system by
calling arch_sync_kernel_mappings() when necessary.

The impact appears limited to x86-32, where apply_to_page_range may miss
updating the PMD.  That leads to explosions in drivers like

  BUG: unable to handle page fault for address: fe036000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  *pde = 00000000
  Oops: 0002 [#1] SMP
  CPU: 3 PID: 1300 Comm: gem_concurrent_ Not tainted 5.9.0-rc1+ #16
  Hardware name:  /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
  EIP: __execlists_context_alloc+0x132/0x2d0 [i915]
  Code: 31 d2 89 f0 e8 2f 55 02 00 89 45 e8 3d 00 f0 ff ff 0f 87 11 01 00 00 8b 4d e8 03 4b 30 b8 5a 5a 5a 5a ba 01 00 00 00 8d 79 04 <c7> 01 5a 5a 5a 5a c7 81 fc 0f 00 00 5a 5a 5a 5a 83 e7 fc 29 f9 81
  EAX: 5a5a5a5a EBX: f60ca000 ECX: fe036000 EDX: 00000001
  ESI: f43b7340 EDI: fe036004 EBP: f6389cb8 ESP: f6389c9c
  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010286
  CR0: 80050033 CR2: fe036000 CR3: 2d361000 CR4: 001506d0
  DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
  DR6: fffe0ff0 DR7: 00000400
  Call Trace:
    execlists_context_alloc+0x10/0x20 [i915]
    intel_context_alloc_state+0x3f/0x70 [i915]
    __intel_context_do_pin+0x117/0x170 [i915]
    i915_gem_do_execbuffer+0xcc7/0x2500 [i915]
    i915_gem_execbuffer2_ioctl+0xcd/0x1f0 [i915]
    drm_ioctl_kernel+0x8f/0xd0
    drm_ioctl+0x223/0x3d0
    __ia32_sys_ioctl+0x1ab/0x760
    __do_fast_syscall_32+0x3f/0x70
    do_fast_syscall_32+0x29/0x60
    do_SYSENTER_32+0x15/0x20
    entry_SYSENTER_32+0x9f/0xf2
  EIP: 0xb7f28559
  Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
  EAX: ffffffda EBX: 00000005 ECX: c0406469 EDX: bf95556c
  ESI: b7e68000 EDI: c0406469 EBP: 00000005 ESP: bf9554d8
  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
  Modules linked in: i915 x86_pkg_temp_thermal intel_powerclamp crc32_pclmul crc32c_intel intel_cstate intel_uncore intel_gtt drm_kms_helper intel_pch_thermal video button autofs4 i2c_i801 i2c_smbus fan
  CR2: 00000000fe036000

It looks like kasan, xen and i915 are vulnerable.

Actual impact is "on thinkpad X60 in 5.9-rc1, screen starts blinking
after 30-or-so minutes, and machine is unusable"

[[email protected]: ARCH_PAGE_TABLE_SYNC_MASK needs vmalloc.h]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: changelog addition]
[[email protected]: changelog addition]

Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Fixes: 86cf69f1d893 ("x86/mm/32: implement arch_sync_kernel_mappings()")
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Chris Wilson <[email protected]> [x86-32]
Tested-by: Pavel Machek <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: <[email protected]> [5.8+]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: IA64: mark Status as Odd Fixes only

IA64 isn't really being maintained, so mark it as Odd Fixes only.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: add LLVM maintainers

Nominate Nathan and myself to be point of contact for clang/LLVM related
support, after a poll at the LLVM BoF at Linux Plumbers Conf 2020.

While corporate sponsorship is beneficial, its important to not entrust
the keys to the nukes with any one entity. Should Nathan and I find
ourselves at the same employer, I would gladly step down.

Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Sedat Dilek <[email protected]>
Acked-by: Nathan Chancellor <[email protected]>
Acked-by: Lukas Bulwahn <[email protected]>
Acked-by: Miguel Ojeda <[email protected]>
Acked-by: Masahiro Yamada <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: update Cavium/Marvell entries

I am leaving Marvell and already do not have access to my @marvell.com
email address. So switching over to my korg mail address or removing my
address there another maintainer is already listed. For the entries
there no other maintainer is listed I will keep looking into patches for
Cavium systems for a while until someone from Marvell takes it over.

Since I might have limited access to hardware and also limited time I
changed state to 'Odd Fixes' for those entries.

Signed-off-by: Robert Richter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Ganapatrao Kulkarni <[email protected]>
Cc: Sunil Goutham <[email protected]>
CC: Borislav Petkov <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Wolfram Sang <[email protected]>,
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: slub: fix conversion of freelist_corrupted()

Commit 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in
deactivate_slab()") suffered an update when picked up from LKML [1].

Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
created a no-op statement. Fix it by sticking to the behavior intended
in the original patch [1]. In addition, make freelist_corrupted()
immune to passing NULL instead of &freelist.

The issue has been spotted via static analysis and code review.

[1] https://lore.kernel.org/linux-mm/20200331031450 [email protected]/

Fixes: 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in deactivate_slab()")
Signed-off-by: Eugeniu Rosca <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Dongli Zhang <[email protected]>
Cc: Joe Jin <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: memcg: fix memcg reclaim soft lockup

We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg
doesn't have any reclaimable memory.

It can be easily reproduced as below:

  watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
  CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
  Call Trace:
    shrink_lruvec+0x49f/0x640
    shrink_node+0x2a6/0x6f0
    do_try_to_free_pages+0xe9/0x3e0
    try_to_free_mem_cgroup_pages+0xef/0x1f0
    try_charge+0x2c1/0x750
    mem_cgroup_charge+0xd7/0x240
    __add_to_page_cache_locked+0x2fd/0x370
    add_to_page_cache_lru+0x4a/0xc0
    pagecache_get_page+0x10b/0x2f0
    filemap_fault+0x661/0xad0
    ext4_filemap_fault+0x2c/0x40
    __do_fault+0x4d/0xf9
    handle_mm_fault+0x1080/0x1790

It only happens on our 1-vcpu instances, because there's no chance for
oom reaper to run to reclaim the to-be-killed process.

Add a cond_resched() at the upper shrink_node_memcgs() to solve this
issue, this will mean that we will get a scheduling point for each memcg
in the reclaimed hierarchy without any dependency on the reclaimable
memory in that memcg thus making it more predictable.

Suggested-by: Michal Hocko <[email protected]>
Signed-off-by: Xunlei Pang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Chris Down <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

memcg: fix use-after-free in uncharge_batch

syzbot has reported an use-after-free in the uncharge_batch path

  BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
  BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
  BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
  BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
  BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
  Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304

  CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1f0/0x31e lib/dump_stack.c:118
    print_address_description+0x66/0x620 mm/kasan/report.c:383
    __kasan_report mm/kasan/report.c:513 [inline]
    kasan_report+0x132/0x1d0 mm/kasan/report.c:530
    check_memory_region_inline mm/kasan/generic.c:183 [inline]
    check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
    instrument_atomic_write include/linux/instrumented.h:71 [inline]
    atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
    atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
    page_counter_cancel mm/page_counter.c:54 [inline]
    page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
    uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
    uncharge_page+0x115/0x430 mm/memcontrol.c:6796
    uncharge_list mm/memcontrol.c:6835 [inline]
    mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
    release_pages+0x13a2/0x1550 mm/swap.c:911
    tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
    tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
    tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
    tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
    exit_mmap+0x296/0x550 mm/mmap.c:3185
    __mmput+0x113/0x370 kernel/fork.c:1076
    exit_mm+0x4cd/0x550 kernel/exit.c:483
    do_exit+0x576/0x1f20 kernel/exit.c:793
    do_group_exit+0x161/0x2d0 kernel/exit.c:903
    get_signal+0x139b/0x1d30 kernel/signal.c:2743
    arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
    exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
    exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
    syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from
page accounting") reworked the memcg lifetime to be bound the the struct
page rather than charges.  It also removed the css_put_many from
uncharge_batch and that is causing the above splat.

uncharge_batch() is supposed to uncharge accumulated charges for all
pages freed from the same memcg.  The queuing is done by uncharge_page
which however drops the memcg reference after it adds charges to the
batch.  If the current page happens to be the last one holding the
reference for its memcg then the memcg is OK to go and the next page to
be freed will trigger batched uncharge which needs to access the memcg
which is gone already.

Fix the issue by taking a reference for the memcg in the current batch.

Fixes: 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting")
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
"Fix a broken metadata verifier that would incorrectly validate attr
fork extents of a realtime file against the realtime volume"

* tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files

xfs: don't update mtime on COW faults

When running in a dax mode, if the user maps a page with MAP_PRIVATE and
PROT_WRITE, the xfs filesystem would incorrectly update ctime and mtime
when the user hits a COW fault.

This breaks building of the Linux kernel.  How to reproduce:

1. extract the Linux kernel tree on dax-mounted xfs filesystem
2. run make clean
3. run make -j12
4. run make -j12

at step 4, make would incorrectly rebuild the whole kernel (although it
was already built in step 3).

The reason for the breakage is that almost all object files depend on
objtool.  When we run objtool, it takes COW page fault on its .data
section, and these faults will incorrectly update the timestamp of the
objtool binary.  The updated timestamp causes make to rebuild the whole
tree.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

ext2: don't update mtime on COW faults

When running in a dax mode, if the user maps a page with MAP_PRIVATE and
PROT_WRITE, the ext2 filesystem would incorrectly update ctime and mtime
when the user hits a COW fault.

This breaks building of the Linux kernel.  How to reproduce:

1. extract the Linux kernel tree on dax-mounted ext2 filesystem
2. run make clean
3. run make -j12
4. run make -j12

at step 4, make would incorrectly rebuild the whole kernel (although it
was already built in step 3).

The reason for the breakage is that almost all object files depend on
objtool.  When we run objtool, it takes COW page fault on its .data
section, and these faults will incorrectly update the timestamp of the
objtool binary.  The updated timestamp causes make to rebuild the whole
tree.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

drm: xlnx: dpsub: Fix DMADEVICES Kconfig dependency

The dpsub driver uses the DMA engine API, and thus selects DMA_ENGINE to
provide that API. DMA_ENGINE depends on DMADEVICES, which can be
deselected by the user, creating a possibly unmet indirect dependency:

WARNING: unmet direct dependencies detected for DMA_ENGINE
  Depends on [n]: DMADEVICES [=n]
  Selected by [m]:
  - DRM_ZYNQMP_DPSUB [=m] && HAS_IOMEM [=y] && (ARCH_ZYNQMP || COMPILE_TEST [=y]) && COMMON_CLK [=y] && DRM [=m] && OF [=y]

Add a dependency on DMADEVICES to fix this.

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Laurent Pinchart <[email protected]>
Acked-by: Randy Dunlap <[email protected]>

rapidio: Replace 'select' DMAENGINES 'with depends on'

Enabling a whole subsystem from a single driver 'select' is frowned
upon and won't be accepted in new drivers, that need to use 'depends on'
instead. Existing selection of DMAENGINES will then cause circular
dependencies. Replace them with a dependency.

Signed-off-by: Laurent Pinchart <[email protected]>
Acked-by: Randy Dunlap <[email protected]>

io_uring: fix explicit async read/write mapping for large segments

If we exceed UIO_FASTIOV, we don't handle the transition correctly
between an allocated vec for requests that are queued with IOSQE_ASYNC.
Store the iovec appropriately and re-set it in the iter iov in case
it changed.

Fixes: ff6165b2d7f6 ("io_uring: retain iov_iter state over io_read/io_write calls")
Reported-by: Nick Hill <[email protected]>
Tested-by: Norman Maurer <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

SUNRPC: stop printk reading past end of string

Since p points at raw xdr data, there's no guarantee that it's NULL
terminated, so we should give a length. And probably escape any special
characters too.

Reported-by: Zhi Li <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFS: Zero-stateid SETATTR should first return delegation

If a write delegation isn't available, the Linux NFS client uses
a zero-stateid when performing a SETATTR.

NFSv4.0 provides no mechanism for an NFS server to match such a
request to a particular client. It recalls all delegations for that
file, even delegations held by the client issuing the request. If
that client happens to hold a read delegation, the server will
recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
DELEGRETURN sequence.

Optimize out this pipeline bubble by having the client return any
delegations it may hold on a file before it issues a
SETATTR(zero-stateid) on that file.

Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

ARM: dts: imx6sx: fix the pad QSPI1B_SCLK mux mode for uart3

The pad QSPI1B_SCLK mux mode 0x1 is for function UART3_DTE_TX,
correct the mux mode.

Fixes: 743636f25f1d ("ARM: dts: imx: add pin function header for imx6sx")
Signed-off-by: Fugang Duan <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>

arm64: dts: imx8mp: correct sdma1 clk setting

Correct sdma1 ahb clk, otherwise wrong 1:1 clk ratio will be chosed so
that sdma1 function broken. sdma1 should use 1:2 clk, while sdma2/3 use
1:1.

Fixes: 6d9b8d20431f ("arm64: dts: freescale: Add i.MX8MP dtsi support")
Cc: <[email protected]>
Signed-off-by: Robin Gong <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>

Merge tag 's390-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

- Fix GENERIC_LOCKBREAK dependency on PREEMPTION in Kconfig broken
   because of a typo

- Update defconfigs

* tag 's390-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: update defconfigs
  s390: fix GENERIC_LOCKBREAK dependency typo in Kconfig

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Fix the loading of modules built with binutils-2.35. This version
   produces writable and executable .text.ftrace_trampoline section
   which is rejected by the kernel.

- Remove the exporting of cpu_logical_map() as the Tegra driver has now
   been fixed and no longer uses this function.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE
  arm64: Remove exporting cpu_logical_map symbol

Merge tag 'mips_fixes_5.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:
"A few MIPS fixes:

   - fallthrough fallout fix

   - BMIPS fixes

   - MSA fix to avoid leaking MSA register contents

   - Loongson perf and cpu feature fix

   - SNI interrupt fix"

* tag 'mips_fixes_5.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: SNI: Fix SCSI interrupt
  MIPS: add missing MSACSR and upper MSA initialization
  MIPS: perf: Fix wrong check condition of Loongson event IDs
  mips/oprofile: Fix fallthrough placement
  MIPS: Loongson64: Remove unnecessary inclusion of boot_param.h
  MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores
  MIPS: mm: BMIPS5000 has inclusive physical caches
  MIPS: Loongson64: Do not override watch and ejtag feature

Merge tag 'kbuild-fixes-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- fix documents

- fix warning in 'make localmodconfig'

* tag 'kbuild-fixes-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: remove redundant assignment prompt = prompt
  kbuild: Documentation: clean up makefiles.rst
  kconfig: streamline_config.pl: check defined(ENV variable) before using it
  Documentation/llvm: Improve formatting of commands, variables, and arguments

Merge tag 'pm-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix reference counting in the operating performance points (OPP)
  framework and address a few intel_pstate driver issues, mostly related
  to switching driver operation modes and similar with hardware-managed
  P-states (HWP) enabled.

  Specifics:

   - Fix reference counting of operating performance points (OPP) tables
     (Viresh Kumar).

   - Address intel_pstate driver interface issues, mostly related to
     switching operation modes and handling CPU offline and online and
     system-wide suspend/resume with hardware-managed P-states (HWP)
     enabled (Rafael Wysocki).

   - Fix the maximum frequency computation in the intel_pstate driver
     with turbo P-states disabled by the platform firmware and HWP
     enabled (Francisco Jerez)"

* tag 'pm-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled
  cpufreq: intel_pstate: Free memory only when turning off
  cpufreq: intel_pstate: Add ->offline and ->online callbacks
  cpufreq: intel_pstate: Tweak the EPP sysfs interface
  cpufreq: intel_pstate: Update cached EPP in the active mode
  cpufreq: intel_pstate: Refuse to turn off with HWP enabled
  opp: Don't drop reference for an OPP table that was never parsed

Merge tag 'libata-5.9-2020-09-04' of git://git.kernel.dk/linux-block

Pull libata fixes from Jens Axboe:

- improve Sandisks ATA_HORKAGE on NCQ (Tejun)

- link printk cleanup (Xu)

* tag 'libata-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks
ata: ahci: use ata_link_info() instead of ata_link_printk()

Merge tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A bit larger than usual this week, mostly due to the NVMe fixes
  arriving late for -rc3 and hence didn't make last weeks pull request.

   - NVMe:
        - instance leak and io boundary fixes from Keith
        - fc locking fix from Christophe
        - various tcp/rdma reset during traffic fixes from Sagi
        - pci use-after-free fix from Tong
        - tcp target null deref fix from Ziye

   - Locking fix for partition removal (Christoph)

   - Ensure bdi->io_pages is always set (me)

   - Fixup for hd struct reference (Ming)

   - Fix for zero length bvecs (Ming)

   - Two small blk-iocost fixes (Tejun)"

* tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
  block: allow for_each_bvec to support zero len bvec
  blk-stat: make q->stats->lock irqsafe
  blk-iocost: ioc_pd_free() shouldn't assume irq disabled
  block: fix locking in bdev_del_partition
  block: release disk reference in hd_struct_free_work
  block: ensure bdi->io_pages is always initialized
  nvme-pci: cancel nvme device request before disabling
  nvme: only use power of two io boundaries
  nvme: fix controller instance leak
  nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
  nvme: Fix NULL dereference for pci nvme controllers
  nvme-rdma: fix reset hang if controller died in the middle of a reset
  nvme-rdma: fix timeout handler
  nvme-rdma: serialize controller teardown sequences
  nvme-tcp: fix reset hang if controller died in the middle of a reset
  nvme-tcp: fix timeout handler
  nvme-tcp: serialize controller teardown sequences
  nvme: have nvme_wait_freeze_timeout return if it timed out
  nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance
  nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu

Merge tag 'io_uring-5.9-2020-09-04' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

- EAGAIN with O_NONBLOCK retry fix

- Two small fixes for registered files (Jiufei)

* tag 'io_uring-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
  io_uring: no read/write-retry on -EAGAIN error and O_NONBLOCK marked file
  io_uring: set table->files[i] to NULL when io_sqe_file_register failed
  io_uring: fix removing the wrong file in __io_sqe_files_update()

Merge tag 'thermal-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal fixes from Daniel Lezcano:

- Fix bogus thermal shutdowns for omap4430 where bogus values resulting
   from an incorrect ADC conversion are too high and fire an emergency
   shutdown (Tony Lindgren)

- Don't suppress negative temp for qcom spmi as they are valid and
   userspace needs them (Veera Vegivada)

- Fix use-after-free in thermal_zone_device_unregister reported by
   Kasan (Dmitry Osipenko)

* tag 'thermal-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal: core: Fix use-after-free in thermal_zone_device_unregister()
  thermal: qcom-spmi-temp-alarm: Don't suppress negative temp
  thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430

drm/msm: Disable the RPTR shadow

Disable the RPTR shadow across all targets. It will be selectively
re-enabled later for targets that need it.

Cc: [email protected]
Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/msm: Disable preemption on all 5xx targets

Temporarily disable preemption on a5xx targets pending some improvements
to protect the RPTR shadow from being corrupted.

Cc: [email protected]
Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/msm: Enable expanded apriv support for a650

a650 supports expanded apriv support that allows us to map critical buffers
(ringbuffer and memstore) as as privileged to protect them from corruption.

Cc: [email protected]
Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

drm/msm: Split the a5xx preemption record

The main a5xx preemption record can be marked as privileged to
protect it from user access but the counters storage needs to be
remain unprivileged. Split the buffers and mark the critical memory
as privileged.

Cc: [email protected]
Signed-off-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>

Merge tag 'dmaengine-fix-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
"A couple of core fixes and odd driver fixes for dmaengine subsystem:

  Core:
   - drop ACPI CSRT table reference after using it
   - fix of_dma_router_xlate() error handling

  Drivers fixes in idxd, at_hdmac, pl330, dw-edma and jz478"

* tag 'dmaengine-fix-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: ti: k3-udma: Update rchan_oes_offset for am654 SYSFW ABI 3.0
  drivers/dma/dma-jz4780: Fix race condition between probe and irq handler
  dmaengine: dw-edma: Fix scatter-gather address calculation
  dmaengine: ti: k3-udma: Fix the TR initialization for prep_slave_sg
  dmaengine: pl330: Fix burst length if burst size is smaller than bus width
  dmaengine: at_hdmac: add missing kfree() call in at_dma_xlate()
  dmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()
  dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate()
  dmaengine: of-dma: Fix of_dma_router_xlate's of_dma_xlate handling
  dmaengine: idxd: reset states after device disable or reset
  dmaengine: acpi: Put the CSRT table after using it

Merge tag 'sound-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of small changes, nothing intrusive:

   - remaining tasklet API conversions, now all sound stuff have been
     converted

   - a few HD-audio and USB-audio quirks and minor fixes

   - FireWire Tascam and Digi00xx fixes

   - drop a kernel WARNING from PCM OSS for syzkaller"

* tag 'sound-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
  ALSA: hda/realtek - Improved routing for Thinkpad X1 7th/8th Gen
  ALSA: hda: use consistent HDAudio spelling in comments/docs
  ALSA: hda: add dev_dbg log when driver is not selected
  ALSA: hda: fix a runtime pm issue in SOF when integrated GPU is disabled
  ALSA: hda: hdmi - add Rocketlake support
  ALSA: ua101: convert tasklets to use new tasklet_setup() API
  ALSA: usb-audio: convert tasklets to use new tasklet_setup() API
  ASoC: txx9: convert tasklets to use new tasklet_setup() API
  ASoC: siu: convert tasklets to use new tasklet_setup() API
  ASoC: fsl_esai: convert tasklets to use new tasklet_setup() API
  ALSA: hdsp: convert tasklets to use new tasklet_setup() API
  ALSA: riptide: convert tasklets to use new tasklet_setup() API
  ALSA: pci/asihpi: convert tasklets to use new tasklet_setup() API
  ALSA: firewire: convert tasklets to use new tasklet_setup() API
  ALSA: core: convert tasklets to use new tasklet_setup() API
  ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check
  ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO
  ALSA: hda/hdmi: always check pin power status in i915 pin fixup
  ALSA: hda/realtek: Add quirk for Samsung Galaxy Book Ion NT950XCJ-X716A
  ALSA: usb-audio: Add basic capture support for Pioneer DJ DJM-250MK2
  ...

Merge tag 'drm-fixes-2020-09-04' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Not much going on this week, nouveau has a display hw bug workaround,
  amdgpu has some PM fixes and CIK regression fixes, one single radeon
  PLL fix, and a couple of i915 display fixes.

  amdgpu:
   - Fix for 32bit systems
   - SW CTF fix
   - Update for Sienna Cichlid
   - CIK bug fixes

  radeon:
   - PLL fix

  i915:
   - Clang build warning fix
   - HDCP fixes

  nouveau:
   - display fixes"

* tag 'drm-fixes-2020-09-04' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug
  drm/nouveau/kms/nv50-gp1xx: disable notifies again after core update
  drm/nouveau/kms/nv50-: add some whitespace before debug message
  drm/nouveau/kms/gv100-: Include correct push header in crcc37d.c
  drm/radeon: Prefer lower feedback dividers
  drm/amdgpu: Fix bug in reporting voltage for CIK
  drm/amdgpu: Specify get_argument function for ci_smu_funcs
  drm/amd/pm: enable MP0 DPM for sienna_cichlid
  drm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting
  drm/amd/pm: fix is_dpm_running() run error on 32bit system
  drm/i915: Clear the repeater bit on HDCP disable
  drm/i915: Fix sha_text population code
  drm/i915/display: Ensure that ret is always initialized in icl_combo_phy_verify_state

net/packet: fix overflow in tpacket_rcv

Using tp_reserve to calculate netoff can overflow as
tp_reserve is unsigned int and netoff is unsigned short.

This may lead to macoff receving a smaller value then
sizeof(struct virtio_net_hdr), and if po->has_vnet_hdr
is set, an out-of-bounds write will occur when
calling virtio_net_hdr_from_skb.

The bug is fixed by converting netoff to unsigned int
and checking if it exceeds USHRT_MAX.

This addresses CVE-2020-14386

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Signed-off-by: Or Cohen <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'simplify-do_wp_page'

Merge emailed patches from Peter Xu:
"This is a small series that I picked up from Linus's suggestion to
  simplify cow handling (and also make it more strict) by checking
  against page refcounts rather than mapcounts.

  This makes uffd-wp work again (verified by running upmapsort)"

Note: this is horrendously bad timing, and making this kind of
fundamental vm change after -rc3 is not at all how things should work.
The saving grace is that it really is a a nice simplification:

8 files changed, 29 insertions(+), 120 deletions(-)

The reason for the bad timing is that it turns out that commit
17839856fd58 ("gup: document and work around 'COW can break either way'
issue" broke not just UFFD functionality (as Peter noticed), but Mikulas
Patocka also reports that it caused issues for strace when running in a
DAX environment with ext4 on a persistent memory setup.

And we can't just revert that commit without re-introducing the original
issue that is a potential security hole, so making COW stricter (and in
the process much simpler) is a step to then undoing the forced COW that
broke other uses.

Link: https://lore.kernel.org/lkml/alpine.LRH.2.02.2009031328040.6929@file01.intranet.prod.int.rdu2.redhat.com/
* emailed patches from Peter Xu <[email protected]>:
  mm: Add PGREUSE counter
  mm/gup: Remove enfornced COW mechanism
  mm/ksm: Remove reuse_ksm_page()
  mm: do_wp_page() simplification

Merge branch 'pm-cpufreq'

* pm-cpufreq:
  cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled
  cpufreq: intel_pstate: Free memory only when turning off
  cpufreq: intel_pstate: Add ->offline and ->online callbacks
  cpufreq: intel_pstate: Tweak the EPP sysfs interface
  cpufreq: intel_pstate: Update cached EPP in the active mode
  cpufreq: intel_pstate: Refuse to turn off with HWP enabled

mm: Add PGREUSE counter

This accounts for wp_page_reuse() case, where we reused a page for COW.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/gup: Remove enfornced COW mechanism

With the more strict (but greatly simplified) page reuse logic in
do_wp_page(), we can safely go back to the world where cow is not
enforced with writes.

This essentially reverts commit 17839856fd58 ("gup: document and work
around 'COW can break either way' issue").  There are some context
differences due to some changes later on around it:

  2170ecfa7688 ("drm/i915: convert get_user_pages() --> pin_user_pages()", 2020-06-03)
  376a34efa4ee ("mm/gup: refactor and de-duplicate gup_fast() code", 2020-06-03)

Some lines moved back and forth with those, but this revert patch should
have striped out and covered all the enforced cow bits anyways.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/ksm: Remove reuse_ksm_page()

Remove the function as the last reference has gone away with the do_wp_page()
changes.

Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: do_wp_page() simplification

How about we just make sure we're the only possible valid user fo the
page before we bother to reuse it?

Simplify, simplify, simplify.

And get rid of the nasty serialization on the page lock at the same time.

[peterx: add subject prefix]

Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

driver core: Fix device_pm_lock() locking for device links

This commit fixes two issues:

1. The lockdep warning reported by Dong Aisheng <[email protected]> [1].

It is a warning about a cycle (dpm_list_mtx --> kn->active#3 --> fw_lock)
that was introduced when device-link devices were added to expose device
link information in sysfs.

The patch that "introduced" this cycle can't be reverted because it's fixes
a real SRCU issue and also ensures that the device-link device is deleted
as soon as the device-link is deleted. This is important to avoid sysfs
name collisions if the device-link is create again immediately (this can
happen a lot with deferred probing).

2. Inconsistency in grabbing device_pm_lock() during device link deletion

Some device link deletion code paths grab device_pm_lock(), while others
don't. The device_pm_lock() is grabbed during device_link_add() because it
checks if the supplier is in the dpm_list and also reorders the dpm_list.
However, when a device link is deleted, it does not do either of those and
therefore device_pm_lock() is not necessary. Dropping the device_pm_lock()
in all the device link deletion paths removes the inconsistency in locking.

Thanks to Stephen Boyd for helping me understand the lockdep splat.

Fixes: 843e600b8a2b ("driver core: Fix sleeping in invalid context during device link deletion")
[1] - https://lore.kernel.org/lkml/CAA+hA=S4eAreb7vo69LAXSk2t5=DEKNxHaiY1wSpk4xTp9urLg@mail.gmail.com/
Reported-by: Dong Aisheng <[email protected]>
Signed-off-by: Saravana Kannan <[email protected]>
Tested-by: Peng Fan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

gcov: Disable gcov build with GCC 10

GCOV built with GCC 10 doesn't initialize n_function variable. This
produces different kernel panics as was seen by Colin in Ubuntu and me
in FC 32.

As a workaround, let's disable GCOV build for broken GCC 10 version.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1891288
Link: https://lore.kernel.org/lkml/[email protected]
Link: https://lore.kernel.org/lkml/CAHk-=whbijeSdSvx-Xcr0DPMj0BiwhJ+uiNnDSVZcr_h_kg7UA@mail.gmail.com/
Cc: Colin Ian King <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

init: fix error check in clean_path()

init_stat() returns 0 on success, same as vfs_lstat(). When it replaced
vfs_lstat(), the '!' was dropped.

Fixes: 716308a5331b ("init: add an init_stat helper")
Signed-off-by: Barret Rhoden <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: Add the security document to SECURITY CONTACT

When changing the document related to kernel security workflow, notify
the security mailing list as its concerned by this.

Cc: <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

driver code: print symbolic error code

dev_err_probe() prepends the message with an error code. Let's make it
more readable by translating the code to a more recognisable symbol.

Fixes: a787e5400a1c ("driver core: add device probe log helper")
Signed-off-by: Michał Mirosław <[email protected]>
Link: https://lore.kernel.org/r/ea3f973e4708919573026fdce52c264db147626d.1598630856.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Greg Kroah-Hartman <[email protected]>

debugfs: Fix module state check condition

The '#ifdef MODULE' check in the original commit does not work as intended.
The code under the check is not built at all if CONFIG_DEBUG_FS=y. Fix this
by using a correct check.

Fixes: 275678e7a9be ("debugfs: Check module state before warning in {full/open}_proxy_open()")
Signed-off-by: Vladis Dronov <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

video: fbdev: fix OOB read in vga_8planes_imageblit()

syzbot is reporting OOB read at vga_8planes_imageblit() [1], for
"cdat[y] >> 4" can become a negative value due to "const char *cdat".

[1] https://syzkaller.appspot.com/bug?id=0d7a0da1557dcd1989e00cb3692b26d4173b4132

Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

dyndbg: fix problem parsing format="foo bar"

commit 14775b049642 ("dyndbg: accept query terms like file=bar and
module=foo") added the combined keyword=value parsing poorly; revert
most of it, keeping the keyword & arg change.

Instead, fix the tokenizer for the new input, by terminating the
keyword (an unquoted word) on '=' as well as space, thus letting the
tokenizer work on the quoted argument, like it would have previously.

Also add a few debug-prints to show more parsing context, into
tokenizer and parse-query, and use "keyword, value" in others.

Fixes: 14775b049642 ("dyndbg: accept query terms like file=bar and module=foo")
Signed-off-by: Jim Cromie <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

dyndbg: refine export, rename to dynamic_debug_exec_queries()

commit 4c0d77828d4f ("dyndbg: export ddebug_exec_queries") had a few
problems:
- broken non DYNAMIC_DEBUG_CORE configs, sparse warning
- the exported function modifies query string, breaks on RO strings.
- func name follows internal convention, shouldn't be exposed as is.

1st is fixed in header with ifdefd function prototype or stub defn.
Also remove an obsolete HAVE-symbol ifdef-comment, and add others.

Fix others by wrapping existing internal function with a new one,
named in accordance with module-prefix naming convention, before
export hits v5.9.0. In new function, copy query string to a local
buffer, so users can pass hard-coded/RO queries, and internal function
can be used unchanged.

Fixes: 4c0d77828d4f ("dyndbg: export ddebug_exec_queries")
Signed-off-by: Jim Cromie <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

dyndbg: give %3u width in pr-format, cosmetic only

Specify the print-width so log entries line up nicely.

no functional changes.

Signed-off-by: Jim Cromie <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: core: fix slab-out-of-bounds Read in read_descriptors

The USB device descriptor may get changed between two consecutive
enumerations on the same device for some reason, such as DFU or
malicius device.
In that case, we may access the changing descriptor if we don't take
the device lock here.

The issue is reported:
https://syzkaller.appspot.com/bug?id=901a0d9e6519ef8dc7acab25344bd287dd3c7be9

Cc: stable <[email protected]>
Cc: Alan Stern <[email protected]>
Reported-by: [email protected]
Fixes: 217a9081d8e6 ("USB: add all configs to the "descriptors" attribute")
Signed-off-by: Zeng Tao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "usb: dwc3: meson-g12a: fix shared reset control use"

This reverts commit 7a410953d1fb4dbe91ffcfdee9cbbf889d19b0d7.

This commit breaks USB on meson-gxl-s905x-libretech-cc. Reverting
the change solves the issue.

In fact, according to the reset framework code, consumers must not use
reset_control_(de)assert() on shared reset lines when reset_control_reset
has been used, and vice-versa.

Moreover, with this commit, usb is not guaranted to be reset since the
reset is likely to be initially deasserted.

Reverting the commit will bring back the suspend warning mentioned in the
commit description. Nevertheless, a warning is much less critical than
breaking dwc3-meson-g12a USB completely. We will address the warning
issue in another way as a 2nd step.

Fixes: 7a410953d1fb ("usb: dwc3: meson-g12a: fix shared reset control use")
Cc: stable <[email protected]>
Signed-off-by: Amjad Ouled-Ameur <[email protected]>
Reported-by: Jerome Brunet <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Acked-by: Philipp Zabel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: ucsi: acpi: Check the _DEP dependencies

Failing probe with -EPROBE_DEFER until all dependencies
listed in the _DEP (Operation Region Dependencies) object
have been met.

This will fix an issue where on some platforms UCSI ACPI
driver fails to probe because the address space handler for
the operation region that the UCSI ACPI interface uses has
not been loaded yet.

Fixes: 8243edf44152 ("usb: typec: ucsi: Add ACPI driver")
Cc: [email protected]
Signed-off-by: Heikki Krogerus <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: intel_pmc_mux: Un-register the USB role switch

Added missing code for un-register USB role switch in the remove and
error path.

Cc: Stable <[email protected]> # v5.8
Reviewed-by: Heikki Krogerus <[email protected]>
Fixes: 6701adfa9693b ("usb: typec: driver for Intel PMC mux control")
Signed-off-by: Madhusudanarao Amara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: Fix out of sync data toggle if a configured device is reconfigured

Userspace drivers that use a SetConfiguration() request to "lightweight"
reset an already configured usb device might cause data toggles to get out
of sync between the device and host, and the device becomes unusable.

The xHCI host requires endpoints to be dropped and added back to reset the
toggle. If USB core notices the new configuration is the same as the
current active configuration it will avoid these extra steps by calling
usb_reset_configuration() instead of usb_set_configuration().

A SetConfiguration() request will reset the device side data toggles.
Make sure usb_reset_configuration() function also drops and adds back the
endpoints to ensure data toggles are in sync.

To avoid code duplication split the current usb_disable_device() function
and reuse the endpoint specific part.

Cc: stable <[email protected]>
Tested-by: Martin Thierer <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/entry: Unbreak 32bit fast syscall

Andy reported that the syscall treacing for 32bit fast syscall fails:

# ./tools/testing/selftests/x86/ptrace_syscall_32
...
[RUN] SYSEMU
[FAIL] Initial args are wrong (nr=224, args=10 11 12 13 14 4289172732)
...
[RUN] SYSCALL
[FAIL] Initial args are wrong (nr=29, args=0 0 0 0 0 4289172732)

The eason is that the conversion to generic entry code moved the retrieval
of the sixth argument (EBP) after the point where the syscall entry work
runs, i.e. ptrace, seccomp, audit...

Unbreak it by providing a split up version of syscall_enter_from_user_mode().

- syscall_enter_from_user_mode_prepare() establishes state and enables
interrupts

- syscall_enter_from_user_mode_work() runs the entry work

Replace the call to syscall_enter_from_user_mode() in the 32bit fast
syscall C-entry with the split functions and stick the EBP retrieval
between them.

Fixes: 27d6b4d14f5c ("x86/entry: Use generic syscall entry function")
Reported-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/debug: Allow a single level of #DB recursion

Trying to clear DR7 around a #DB from usermode malfunctions if the tasks
schedules when delivering SIGTRAP.

Rather than trying to define a special no-recursion region, just allow a
single level of recursion. The same mechanism is used for NMI, and it
hasn't caused any problems yet.

Fixes: 9f58fdde95c9 ("x86/db: Split out dr6/7 handling")
Reported-by: Kyle Huey <[email protected]>
Debugged-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Daniel Thompson <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/8b9bd05f187231df008d48cf818a6a311cbd5c98.1597882384.git.luto@kernel.org
Link: https://lore.kernel.org/r/[email protected]

x86/entry: Fix AC assertion

The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
improve user entry sanity checks") unconditionally triggers on a IVB
machine because it does not support SMAP.

For !SMAP hardware the CLAC/STAC instructions are patched out and thus if
userspace sets AC, it is still have set after entry.

Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Daniel Thompson <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]

tracing/kprobes, x86/ptrace: Fix regs argument order for i386

On i386, the order of parameters passed on regs is eax,edx,and ecx
(as per regparm(3) calling conventions).

Change the mapping in regs_get_kernel_argument(), so that arg1=ax
arg2=dx, and arg3=cx.

Running the selftests testcase kprobes_args_use.tc shows the result
as passed.

Fixes: 3c88ee194c28 ("x86: ptrace: Add function argument access API")
Signed-off-by: Vamshi K Sthambamkadi <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/20200828113242.GA1424@cosmos