Git Repo - linux.git/log

exfat: Unify access to the boot sector

Unify access to boot sector via 'sbi->pbr_bh'.
This fixes vol_flags inconsistency at read failed in fs_set_vol_flags(),
and buffer_head leak in __exfat_fill_super().

Signed-off-by: Tetsuhiro Kohada <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>

exfat: add missing MODULE_ALIAS_FS()

This adds the necessary MODULE_ALIAS_FS() to exfat so the module gets
automatically loaded when an exfat filesystem is mounted.

Signed-off-by: Thomas Backlund <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>

exfat: Fix discard support

Discard support was always unconditionally disabled. Now it is disabled
only in the case when blk_queue_discard() returns false.

Signed-off-by: Pali Rohár <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>

pnp: Use list_for_each_entry() instead of open coding

Aside from good practice, this avoids a warning from gcc 10:

./include/linux/kernel.h:997:3: warning: array subscript -31 is outside array bounds of ‘struct list_head[1]’ [-Warray-bounds]
  997 |  ((type *)(__mptr - offsetof(type, member))); })
      |  ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/list.h:493:2: note: in expansion of macro ‘container_of’
  493 |  container_of(ptr, type, member)
      |  ^~~~~~~~~~~~
./include/linux/pnp.h:275:30: note: in expansion of macro ‘list_entry’
  275 | #define global_to_pnp_dev(n) list_entry(n, struct pnp_dev, global_list)
      |                              ^~~~~~~~~~
./include/linux/pnp.h:281:11: note: in expansion of macro ‘global_to_pnp_dev’
  281 |  (dev) != global_to_pnp_dev(&pnp_global); \
      |           ^~~~~~~~~~~~~~~~~
arch/x86/kernel/rtc.c:189:2: note: in expansion of macro ‘pnp_for_each_dev’
  189 |  pnp_for_each_dev(dev) {

Because the common code doesn't cast the starting list_head to the
containing struct.

Signed-off-by: Jason Gunthorpe <[email protected]>
[ rjw: Whitespace adjustments ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

net: stmmac: Enable SERDES power up/down sequence

This patch is to enable Intel SERDES power up/down sequence. The SERDES
converts 8/10 bits data to SGMII signal. Below is an example of
HW configuration for SGMII mode. The SERDES is located in the PHY IF
in the diagram below.

<-----------------GBE Controller---------->|<--External PHY chip-->
+----------+         +----+            +---+           +----------+
|   EQoS   | <-GMII->| DW | < ------ > |PHY| <-SGMII-> | External |
|   MAC    |         |xPCS|            |IF |           | PHY      |
+----------+         +----+            +---+           +----------+
       ^               ^                 ^                ^
       |               |                 |                |
       +---------------------MDIO-------------------------+

PHY IF configuration and status registers are accessible through
mdio address 0x15 which is defined as mdio_adhoc_addr. During D0,
The driver will need to power up PHY IF by changing the power state
to P0. Likewise, for D3, the driver sets PHY IF power state to P3.

Signed-off-by: Voon Weifeng <[email protected]>
Signed-off-by: Ong Boon Leong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: broadcom: convert to devm_platform_ioremap_resource_byname()

Use the function devm_platform_ioremap_resource_byname() to simplify
source code which calls the functions platform_get_resource_byname()
and devm_ioremap_resource(). Remove also a few error messages which
became unnecessary with this software refactoring.

Suggested-by: Markus Elfring <[email protected]>
Signed-off-by: Dejin Zheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

macvlan: fix null dereference in macvlan_device_event()

In the macvlan_device_event(), the list_first_entry_or_null() is used.
This function could return null pointer if there is no node.
But, the macvlan module doesn't check the null pointer.
So, null-ptr-deref would occur.

      bond0
        |
   +----+-----+
   |          |
macvlan0   macvlan1
   |          |
dummy0     dummy1

The problem scenario.
If dummy1 is removed,
1. ->dellink() of dummy1 is called.
2. NETDEV_UNREGISTER of dummy1 notification is sent to macvlan module.
3. ->dellink() of macvlan1 is called.
4. NETDEV_UNREGISTER of macvlan1 notification is sent to bond module.
5. __bond_release_one() is called and it internally calls
   dev_set_mac_address().
6. dev_set_mac_address() calls the ->ndo_set_mac_address() of macvlan1,
   which is macvlan_set_mac_address().
7. macvlan_set_mac_address() calls the dev_set_mac_address() with dummy1.
8. NETDEV_CHANGEADDR of dummy1 is sent to macvlan module.
9. In the macvlan_device_event(), it calls list_first_entry_or_null().
At this point, dummy1 and macvlan1 were removed.
So, list_first_entry_or_null() will return NULL.

Test commands:
    ip netns add nst
    ip netns exec nst ip link add bond0 type bond
    for i in {0..10}
    do
        ip netns exec nst ip link add dummy$i type dummy
ip netns exec nst ip link add macvlan$i link dummy$i \
type macvlan mode passthru
ip netns exec nst ip link set macvlan$i master bond0
    done
    ip netns del nst

Splat looks like:
[   40.585687][  T146] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEI
[   40.587249][  T146] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[   40.588342][  T146] CPU: 1 PID: 146 Comm: kworker/u8:2 Not tainted 5.7.0-rc1+ #532
[   40.589299][  T146] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   40.590469][  T146] Workqueue: netns cleanup_net
[   40.591045][  T146] RIP: 0010:macvlan_device_event+0x4e2/0x900 [macvlan]
[   40.591905][  T146] Code: 00 00 00 00 00 fc ff df 80 3c 06 00 0f 85 45 02 00 00 48 89 da 48 b8 00 00 00 00 00 fc ff d2
[   40.594126][  T146] RSP: 0018:ffff88806116f4a0 EFLAGS: 00010246
[   40.594783][  T146] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   40.595653][  T146] RDX: 0000000000000000 RSI: ffff88806547ddd8 RDI: ffff8880540f1360
[   40.596495][  T146] RBP: ffff88804011a808 R08: fffffbfff4fb8421 R09: fffffbfff4fb8421
[   40.597377][  T146] R10: ffffffffa7dc2107 R11: 0000000000000000 R12: 0000000000000008
[   40.598186][  T146] R13: ffff88804011a000 R14: ffff8880540f1000 R15: 1ffff1100c22de9a
[   40.599012][  T146] FS:  0000000000000000(0000) GS:ffff888067800000(0000) knlGS:0000000000000000
[   40.600004][  T146] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   40.600665][  T146] CR2: 00005572d3a807b8 CR3: 000000005fcf4003 CR4: 00000000000606e0
[   40.601485][  T146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   40.602461][  T146] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   40.603443][  T146] Call Trace:
[   40.603871][  T146]  ? nf_tables_dump_setelem+0xa0/0xa0 [nf_tables]
[   40.604587][  T146]  ? macvlan_uninit+0x100/0x100 [macvlan]
[   40.605212][  T146]  ? __module_text_address+0x13/0x140
[   40.605842][  T146]  notifier_call_chain+0x90/0x160
[   40.606477][  T146]  dev_set_mac_address+0x28e/0x3f0
[   40.607117][  T146]  ? netdev_notify_peers+0xc0/0xc0
[   40.607762][  T146]  ? __module_text_address+0x13/0x140
[   40.608440][  T146]  ? notifier_call_chain+0x90/0x160
[   40.609097][  T146]  ? dev_set_mac_address+0x1f0/0x3f0
[   40.609758][  T146]  dev_set_mac_address+0x1f0/0x3f0
[   40.610402][  T146]  ? __local_bh_enable_ip+0xe9/0x1b0
[   40.611071][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
[   40.611823][  T146]  ? netdev_notify_peers+0xc0/0xc0
[   40.612461][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
[   40.613213][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
[   40.613963][  T146]  ? __local_bh_enable_ip+0xe9/0x1b0
[   40.614631][  T146]  ? bond_time_in_interval.isra.31+0x90/0x90 [bonding]
[   40.615484][  T146]  ? __bond_release_one+0x9f0/0x12c0 [bonding]
[   40.616230][  T146]  __bond_release_one+0x9f0/0x12c0 [bonding]
[   40.616949][  T146]  ? bond_enslave+0x47c0/0x47c0 [bonding]
[   40.617642][  T146]  ? lock_downgrade+0x730/0x730
[   40.618218][  T146]  ? check_flags.part.42+0x450/0x450
[   40.618850][  T146]  ? __mutex_unlock_slowpath+0xd0/0x670
[   40.619519][  T146]  ? trace_hardirqs_on+0x30/0x180
[   40.620117][  T146]  ? wait_for_completion+0x250/0x250
[   40.620754][  T146]  bond_netdev_event+0x822/0x970 [bonding]
[   40.621460][  T146]  ? __module_text_address+0x13/0x140
[   40.622097][  T146]  notifier_call_chain+0x90/0x160
[   40.622806][  T146]  rollback_registered_many+0x660/0xcf0
[   40.623522][  T146]  ? netif_set_real_num_tx_queues+0x780/0x780
[   40.624290][  T146]  ? notifier_call_chain+0x90/0x160
[   40.624957][  T146]  ? netdev_upper_dev_unlink+0x114/0x180
[   40.625686][  T146]  ? __netdev_adjacent_dev_unlink_neighbour+0x30/0x30
[   40.626421][  T146]  ? mutex_is_locked+0x13/0x50
[   40.627016][  T146]  ? unregister_netdevice_queue+0xf2/0x240
[   40.627663][  T146]  unregister_netdevice_many.part.134+0x13/0x1b0
[   40.628362][  T146]  default_device_exit_batch+0x2d9/0x390
[   40.628987][  T146]  ? unregister_netdevice_many+0x40/0x40
[   40.629615][  T146]  ? dev_change_net_namespace+0xcb0/0xcb0
[   40.630279][  T146]  ? prepare_to_wait_exclusive+0x2e0/0x2e0
[   40.630943][  T146]  ? ops_exit_list.isra.9+0x97/0x140
[   40.631554][  T146]  cleanup_net+0x441/0x890
[ ... ]

Fixes: e289fd28176b ("macvlan: fix the problem when mac address changes for passthru mode")
Reported-by: [email protected]
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

e1000: remove unneeded conversion to bool

The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/net/ethernet/intel/e1000/e1000_main.c:1479:44-49: WARNING:
conversion to bool not needed here

Signed-off-by: Jason Yan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

i40e: Remove unneeded conversion to bool

The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/net/ethernet/intel/i40e/i40e_main.c:1614:52-57: WARNING:
conversion to bool not needed here
drivers/net/ethernet/intel/i40e/i40e_main.c:11439:52-57: WARNING:
conversion to bool not needed here

Signed-off-by: Jason Yan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ptp: Remove unneeded conversion to bool

The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/ptp/ptp_ines.c:403:55-60: WARNING: conversion to bool not
needed here
drivers/ptp/ptp_ines.c:404:55-60: WARNING: conversion to bool not
needed here

Signed-off-by: Jason Yan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cgroup, netclassid: remove double cond_resched

Commit 018d26fcd12a ("cgroup, netclassid: periodically release file_lock
on classid") added a second cond_resched to write_classid indirectly by
update_classid_task. Remove the one in write_classid.

Signed-off-by: Jiri Slaby <[email protected]>
Cc: Dmitry Yakunin <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"15 fixes"

* emailed patches from Andrew Morton <[email protected]>:
  tools/vm: fix cross-compile build
  coredump: fix null pointer dereference on coredump
  mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path
  shmem: fix possible deadlocks on shmlock_user_lock
  vmalloc: fix remap_vmalloc_range() bounds checks
  mm/shmem: fix build without THP
  mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
  tools/build: tweak unused value workaround
  checkpatch: fix a typo in the regex for $allocFunctions
  mm, gup: return EINTR when gup is interrupted by fatal signals
  mm/hugetlb: fix a addressing exception caused by huge_pte_offset
  MAINTAINERS: add an entry for kfifo
  mm/userfaultfd: disable userfaultfd-wp on x86_32
  slub: avoid redzone when choosing freepointer location
  sh: fix build error in mm/init.c

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Bugfixes, and a few cleanups to the newly-introduced assembly language
  vmentry code for AMD"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
  kvm: Disable objtool frame pointer checking for vmenter.S
  MAINTAINERS: add a reviewer for KVM/s390
  KVM: s390: Fix PV check in deliverable_irqs()
  kvm: Handle reads of SandyBridge RAPL PMU MSRs rather than injecting #GP
  KVM: Remove CREATE_IRQCHIP/SET_PIT2 race
  KVM: SVM: Fix __svm_vcpu_run declaration.
  KVM: SVM: Do not setup frame pointer in __svm_vcpu_run
  KVM: SVM: Fix build error due to missing release_pages() include
  KVM: SVM: Do not mark svm_vcpu_run with STACK_FRAME_NON_STANDARD
  kvm: nVMX: match comment with return type for nested_vmx_exit_reflected
  kvm: nVMX: reflect MTF VM-exits if injected by L1
  KVM: s390: Return last valid slot if approx index is out-of-bounds
  KVM: Check validity of resolved slot when searching memslots
  KVM: VMX: Enable machine check support for 32bit targets
  KVM: SVM: move more vmentry code to assembly
  KVM: SVM: fix compilation with modular PSP and non-modular KVM

Merge tag 'asoc-fix-v5.7-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.7

Quite a lot of fixes here, a lot of driver specific ones but the biggest
one is the revert of changes to the startup and shutdown sequence for
DAIs that went in during the merge window - they broke some older x86
platforms and attempts to fix them didn't succeed so it's safer to just
roll them back and try to make sure those platforms are handled properly
in any future attempt.

The rockchip S/PDIF DT stuff was IIRC for validation issues.

ALSA: usb-audio: Apply async workaround for Scarlett 2i4 2nd gen

Due to rounding error driver sometimes incorrectly calculate next packet
size, which results in audible clicks on devices with synchronous playback
endpoints. For example on a high speed bus and a sample rate 44.1 kHz it
loses one sample every ~40.9 seconds. Fortunately playback interface on
Scarlett 2i4 2nd gen has a working explicit feedback endpoint, so we can
switch playback data endpoint to asynchronous mode as a workaround.

Signed-off-by: Alexander Tsoy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes and cleanups from Michael Tsirkin:

- Some bug fixes

- Cleanup a couple of issues that surfaced meanwhile

- Disable vhost on ARM with OABI for now - to be fixed fully later in
   the cycle or in the next release.

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (24 commits)
  vhost: disable for OABI
  virtio: drop vringh.h dependency
  virtio_blk: add a missing include
  virtio-balloon: Avoid using the word 'report' when referring to free page hinting
  virtio-balloon: make virtballoon_free_page_report() static
  vdpa: fix comment of vdpa_register_device()
  vdpa: make vhost, virtio depend on menu
  vdpa: allow a 32 bit vq alignment
  drm/virtio: fix up for include file changes
  remoteproc: pull in slab.h
  rpmsg: pull in slab.h
  virtio_input: pull in slab.h
  remoteproc: pull in slab.h
  virtio-rng: pull in slab.h
  virtgpu: pull in uaccess.h
  tools/virtio: make asm/barrier.h self contained
  tools/virtio: define aligned attribute
  virtio/test: fix up after IOTLB changes
  vhost: Create accessors for virtqueues private_data
  vdpasim: Return status in vdpasim_get_status
  ...

Merge tag 'tpmdd-next-20200421' of git://git.infradead.org/users/jjs/linux-tpmdd

Pull tpm fixes from Jarkko Sakkinen:
"A few bug fixes"

* tag 'tpmdd-next-20200421' of git://git.infradead.org/users/jjs/linux-tpmdd:
  tpm/tpm_tis: Free IRQ if probing fails
  tpm: fix wrong return value in tpm_pcr_extend
  tpm: ibmvtpm: retry on H_CLOSED in tpm_ibmvtpm_send()
  tpm: Export tpm2_get_cc_attrs_tbl for ibmvtpm driver as module

Merge tag 'clang-format-for-linus-v5.7-rc3' of git://github.com/ojeda/linux

Pull clang-format fixlets from Miguel Ojeda:
"Two trivial clang-format changes:

   - Don't indent C++ namespaces (Ian Rogers)

   - The usual clang-format macro list update (Miguel Ojeda)"

* tag 'clang-format-for-linus-v5.7-rc3' of git://github.com/ojeda/linux:
  clang-format: Update with the latest for_each macro list
  clang-format: don't indent namespaces

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) flow_block_cb memleak in nf_flow_table_offload_del_cb(), from Roi Dayan.

2) Fix error path handling in nf_nat_inet_register_fn(), from Hillf Danton.
====================

Signed-off-by: David S. Miller <[email protected]>

tools/vm: fix cross-compile build

Commit 7ed1c1901fe5 ("tools: fix cross-compile var clobbering") moved
the setup of the CC variable to tools/scripts/Makefile.include to make
the behavior consistent across all the tools Makefiles.

As the vm tools missed the include we end up with the wrong CC in a
cross-compiling evironment.

Fixes: 7ed1c1901fe5 (tools: fix cross-compile var clobbering)
Signed-off-by: Lucas Stach <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Martin Kelly <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

coredump: fix null pointer dereference on coredump

If the core_pattern is set to "|" and any process segfaults then we get
a null pointer derefernce while trying to coredump. The call stack shows:

    RIP: do_coredump+0x628/0x11c0

When the core_pattern has only "|" there is no use of trying the
coredump and we can check that while formating the corename and exit
with an error.

After this change I get:

    format_corename failed
    Aborting core

Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template")
Reported-by: Matthew Ruffell <[email protected]>
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Paul Wise <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Neil Horman <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path

Syzbot reported the below lockdep splat:

    WARNING: possible irq lock inversion dependency detected
    5.6.0-rc7-syzkaller #0 Not tainted
    --------------------------------------------------------
    syz-executor.0/10317 just changed the state of lock:
    ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
    ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: shmem_mfill_atomic_pte+0x1012/0x21c0 mm/shmem.c:2407
    but this lock was taken by another, SOFTIRQ-safe lock in the past:
     (&(&xa->xa_lock)->rlock#5){..-.}

    and interrupts could create inverse lock ordering between them.

    other info that might help us debug this:
     Possible interrupt unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&(&info->lock)->rlock);
                                   local_irq_disable();
                                   lock(&(&xa->xa_lock)->rlock#5);
                                   lock(&(&info->lock)->rlock);
      <Interrupt>
        lock(&(&xa->xa_lock)->rlock#5);

     *** DEADLOCK ***

The full report is quite lengthy, please see:

  https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2004152007370 [email protected]/T/#m813b412c5f78e25ca8c6c7734886ed4de43f241d

It is because CPU 0 held info->lock with IRQ enabled in userfaultfd_copy
path, then CPU 1 is splitting a THP which held xa_lock and info->lock in
IRQ disabled context at the same time.  If softirq comes in to acquire
xa_lock, the deadlock would be triggered.

The fix is to acquire/release info->lock with *_irq version instead of
plain spin_{lock,unlock} to make it softirq safe.

Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Reported-by: [email protected]
Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: [email protected]
Acked-by: Hugh Dickins <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

shmem: fix possible deadlocks on shmlock_user_lock

Recent commit 71725ed10c40 ("mm: huge tmpfs: try to split_huge_page()
when punching hole") has allowed syzkaller to probe deeper, uncovering a
long-standing lockdep issue between the irq-unsafe shmlock_user_lock,
the irq-safe xa_lock on mapping->i_pages, and shmem inode's info->lock
which nests inside xa_lock (or tree_lock) since 4.8's shmem_uncharge().

user_shm_lock(), servicing SysV shmctl(SHM_LOCK), wants
shmlock_user_lock while its caller shmem_lock() holds info->lock with
interrupts disabled; but hugetlbfs_file_setup() calls user_shm_lock()
with interrupts enabled, and might be interrupted by a writeback endio
wanting xa_lock on i_pages.

This may not risk an actual deadlock, since shmem inodes do not take
part in writeback accounting, but there are several easy ways to avoid
it.

Requiring interrupts disabled for shmlock_user_lock would be easy, but
it's a high-level global lock for which that seems inappropriate.
Instead, recall that the use of info->lock to guard info->flags in
shmem_lock() dates from pre-3.1 days, when races with SHMEM_PAGEIN and
SHMEM_TRUNCATE could occur: nowadays it serves no purpose, the only flag
added or removed is VM_LOCKED itself, and calls to shmem_lock() an inode
are already serialized by the caller.

Take info->lock out of the chain and the possibility of deadlock or
lockdep warning goes away.

Fixes: 4595ef88d136 ("shmem: make shmem_inode_info::lock irq-safe")
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Yang Shi <[email protected]>
Cc: Yang Shi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Linus Torvalds <[email protected]>

vmalloc: fix remap_vmalloc_range() bounds checks

remap_vmalloc_range() has had various issues with the bounds checks it
promises to perform ("This function checks that addr is a valid
vmalloc'ed area, and that it is big enough to cover the vma") over time,
e.g.:

- not detecting pgoff<<PAGE_SHIFT overflow

- not detecting (pgoff<<PAGE_SHIFT)+usize overflow

- not checking whether addr and addr+(pgoff<<PAGE_SHIFT) are the same
vmalloc allocation

- comparing a potentially wildly out-of-bounds pointer with the end of
the vmalloc region

In particular, since commit fc9702273e2e ("bpf: Add mmap() support for
BPF_MAP_TYPE_ARRAY"), unprivileged users can cause kernel null pointer
dereferences by calling mmap() on a BPF map with a size that is bigger
than the distance from the start of the BPF map to the end of the
address space.

This could theoretically be used as a kernel ASLR bypass, by using
whether mmap() with a given offset oopses or returns an error code to
perform a binary search over the possible address range.

To allow remap_vmalloc_range_partial() to verify that addr and
addr+(pgoff<<PAGE_SHIFT) are in the same vmalloc region, pass the offset
to remap_vmalloc_range_partial() instead of adding it to the pointer in
remap_vmalloc_range().

In remap_vmalloc_range_partial(), fix the check against
get_vm_area_size() by using size comparisons instead of pointer
comparisons, and add checks for pgoff.

Fixes: 833423143c3a ("[PATCH] mm: introduce remap_vmalloc_range()")
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: KP Singh <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/shmem: fix build without THP

Some optimizers don't notice that shmem_punch_compound() is always true
(PageTransCompound() being false) without CONFIG_TRANSPARENT_HUGEPAGE==y.

Use IS_ENABLED to help them to avoid the BUILD_BUG inside HPAGE_PMD_NR.

Fixes: 71725ed10c40 ("mm: huge tmpfs: try to split_huge_page() when punching hole")
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Randy Dunlap <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/ksm: fix NULL pointer dereference when KSM zero page is enabled

find_mergeable_vma() can return NULL.  In this case, it leads to a crash
when we access vm_mm(its offset is 0x40) later in write_protect_page.
And this case did happen on our server.  The following call trace is
captured in kernel 4.19 with the following patch applied and KSM zero
page enabled on our server.

  commit e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")

So add a vma check to fix it.

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
  Oops: 0000 [#1] SMP NOPTI
  CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
  RIP: try_to_merge_one_page+0xc7/0x760
  Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
        60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
        8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
  RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
  RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
  RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
  RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
  R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
  R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
  FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
    ksm_scan_thread+0x115e/0x1960
    kthread+0xf5/0x130
    ret_from_fork+0x1f/0x30

[[email protected]: if the vma is out of date, just exit]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: add the conventional braces, replace /** with /*]
Fixes: e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
Co-developed-by: Xiongchun Duan <[email protected]>
Signed-off-by: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Kirill Tkhai <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

tools/build: tweak unused value workaround

Clang has -Wself-assign enabled by default under -Wall, which always
gets -Werror'ed on this file, causing sync-compare-and-swap to be
disabled by default.

The generally-accepted way to spell "this value is intentionally
unused," is casting it to `void`. This is accepted by both GCC and
Clang with -Wall enabled: https://godbolt.org/z/qqZ9r3

Signed-off-by: George Burgess IV <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: fix a typo in the regex for $allocFunctions

Here, we look for function such as 'netdev_alloc_skb_ip_align', so a '_'
is missing in the regex.

To make sure:
grep -r --include=*.c skbip_a * | wc ==> 0 results
grep -r --include=*.c skb_ip_a * | wc ==> 112 results

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Joe Perches <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm, gup: return EINTR when gup is interrupted by fatal signals

EINTR is the usual error code which other killable interfaces return.
This is the case for the other fatal_signal_pending break out from the
same function. Make the code consistent.

ERESTARTSYS is also quite confusing because the signal is fatal and so
no restart will happen before returning to the userspace.

Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Hillf Danton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/hugetlb: fix a addressing exception caused by huge_pte_offset

Our machine encountered a panic(addressing exception) after run for a
long time and the calltrace is:

    RIP: hugetlb_fault+0x307/0xbe0
    RSP: 0018:ffff9567fc27f808  EFLAGS: 00010286
    RAX: e800c03ff1258d48 RBX: ffffd3bb003b69c0 RCX: e800c03ff1258d48
    RDX: 17ff3fc00eda72b7 RSI: 00003ffffffff000 RDI: e800c03ff1258d48
    RBP: ffff9567fc27f8c8 R08: e800c03ff1258d48 R09: 0000000000000080
    R10: ffffaba0704c22a8 R11: 0000000000000001 R12: ffff95c87b4b60d8
    R13: 00005fff00000000 R14: 0000000000000000 R15: ffff9567face8074
    FS:  00007fe2d9ffb700(0000) GS:ffff956900e40000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffd3bb003b69c0 CR3: 000000be67374000 CR4: 00000000003627e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
      follow_hugetlb_page+0x175/0x540
      __get_user_pages+0x2a0/0x7e0
      __get_user_pages_unlocked+0x15d/0x210
      __gfn_to_pfn_memslot+0x3c5/0x460 [kvm]
      try_async_pf+0x6e/0x2a0 [kvm]
      tdp_page_fault+0x151/0x2d0 [kvm]
     ...
      kvm_arch_vcpu_ioctl_run+0x330/0x490 [kvm]
      kvm_vcpu_ioctl+0x309/0x6d0 [kvm]
      do_vfs_ioctl+0x3f0/0x540
      SyS_ioctl+0xa1/0xc0
      system_call_fastpath+0x22/0x27

For 1G hugepages, huge_pte_offset() wants to return NULL or pudp, but it
may return a wrong 'pmdp' if there is a race.  Please look at the
following code snippet:

    ...
    pud = pud_offset(p4d, addr);
    if (sz != PUD_SIZE && pud_none(*pud))
        return NULL;
    /* hugepage or swap? */
    if (pud_huge(*pud) || !pud_present(*pud))
        return (pte_t *)pud;

    pmd = pmd_offset(pud, addr);
    if (sz != PMD_SIZE && pmd_none(*pmd))
        return NULL;
    /* hugepage or swap? */
    if (pmd_huge(*pmd) || !pmd_present(*pmd))
        return (pte_t *)pmd;
    ...

The following sequence would trigger this bug:

- CPU0: sz = PUD_SIZE and *pud = 0 , continue
- CPU0: "pud_huge(*pud)" is false
- CPU1: calling hugetlb_no_page and set *pud to xxxx8e7(PRESENT)
- CPU0: "!pud_present(*pud)" is false, continue
- CPU0: pmd = pmd_offset(pud, addr) and maybe return a wrong pmdp

However, we want CPU0 to return NULL or pudp in this case.

We must make sure there is exactly one dereference of pud and pmd.

Signed-off-by: Longpeng <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: add an entry for kfifo

Kfifo has been written by Stefani Seibold and she's implicitly expected
to Ack any changes to it. She's not however officially listed as kfifo
maintainer which leads to delays in patch review. This patch proposes
to add an explitic entry for kfifo to MAINTAINERS file.

[[email protected]: alphasort F: entries, per Joe]
[[email protected]: remove colon, per Bartosz]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Acked-by: Stefani Seibold <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/userfaultfd: disable userfaultfd-wp on x86_32

Userfaultfd-wp is not yet working on 32bit hosts, but it's accidentally
enabled previously. Disable it.

Fixes: 5a281062af1d ("userfaultfd: wp: add WP pagetable tracking to x86")
Reported-by: Naresh Kamboju <[email protected]>
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Naresh Kamboju <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: Hillf Danton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

slub: avoid redzone when choosing freepointer location

Marco Elver reported system crashes when booting with "slub_debug=Z".

The freepointer location (s->offset) was not taking into account that
the "inuse" size that includes the redzone area should not be used by
the freelist pointer. Change the calculation to save the area of the
object that an inline freepointer may be written into.

Fixes: 3202fa62fb43 ("slub: relocate freelist pointer to middle of object")
Reported-by: Marco Elver <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Marco Elver <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Link: http://lkml.kernel.org/r/202004151054.BD695840@keescook
Link: https://lore.kernel.org/linux-mm/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

sh: fix build error in mm/init.c

The closing parenthesis is missing.

Fixes: bfeb022f8fe4 ("mm/memory_hotplug: add pgprot_t to mhp_params")
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Logan Gunthorpe <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

block: remove unused header

Dax related code already removed from this file.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jianpeng Ma <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

blk-iocost: Fix error on iocost_ioc_vrate_adj

Systemtap 4.2 is unable to correctly interpret the "u32 (*missed_ppm)[2]"
argument of the iocost_ioc_vrate_adj trace entry defined in
include/trace/events/iocost.h leading to the following error:

  /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8:
  error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token
   , u32[]* __tracepoint_arg_missed_ppm

That argument type is indeed rather complex and hard to read. Looking
at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying
the argument to a simple "u32 *missed_ppm" and adjusting the trace
entry accordingly, the compilation error was gone.

Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Acked-by: Steven Rostedt (VMware) <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

signal: Avoid corrupting si_pid and si_uid in do_notify_parent

Christof Meerwald <[email protected]> writes:
> Hi,
>
> this is probably related to commit
> 7a0cf094944e2540758b7f957eb6846d5126f535 (signal: Correct namespace
> fixups of si_pid and si_uid).
>
> With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a
> properly set si_pid field - this seems to happen for multi-threaded
> child processes.
>
> A simple test program (based on the sample from the signalfd man page):
>
> #include <sys/signalfd.h>
> #include <signal.h>
> #include <unistd.h>
> #include <spawn.h>
> #include <stdlib.h>
> #include <stdio.h>
>
> #define handle_error(msg) \
>     do { perror(msg); exit(EXIT_FAILURE); } while (0)
>
> int main(int argc, char *argv[])
> {
>   sigset_t mask;
>   int sfd;
>   struct signalfd_siginfo fdsi;
>   ssize_t s;
>
>   sigemptyset(&mask);
>   sigaddset(&mask, SIGCHLD);
>
>   if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
>     handle_error("sigprocmask");
>
>   pid_t chldpid;
>   char *chldargv[] = { "./sfdclient", NULL };
>   posix_spawn(&chldpid, "./sfdclient", NULL, NULL, chldargv, NULL);
>
>   sfd = signalfd(-1, &mask, 0);
>   if (sfd == -1)
>     handle_error("signalfd");
>
>   for (;;) {
>     s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo));
>     if (s != sizeof(struct signalfd_siginfo))
>       handle_error("read");
>
>     if (fdsi.ssi_signo == SIGCHLD) {
>       printf("Got SIGCHLD %d %d %d %d\n",
>           fdsi.ssi_status, fdsi.ssi_code,
>           fdsi.ssi_uid, fdsi.ssi_pid);
>       return 0;
>     } else {
>       printf("Read unexpected signal\n");
>     }
>   }
> }
>
>
> and a multi-threaded client to test with:
>
> #include <unistd.h>
> #include <pthread.h>
>
> void *f(void *arg)
> {
>   sleep(100);
> }
>
> int main()
> {
>   pthread_t t[8];
>
>   for (int i = 0; i != 8; ++i)
>   {
>     pthread_create(&t[i], NULL, f, NULL);
>   }
> }
>
> I tried to do a bit of debugging and what seems to be happening is
> that
>
>   /* From an ancestor pid namespace? */
>   if (!task_pid_nr_ns(current, task_active_pid_ns(t))) {
>
> fails inside task_pid_nr_ns because the check for "pid_alive" fails.
>
> This code seems to be called from do_notify_parent and there we
> actually have "tsk != current" (I am assuming both are threads of the
> current process?)

I instrumented the code with a warning and received the following backtrace:
> WARNING: CPU: 0 PID: 777 at kernel/pid.c:501 __task_pid_nr_ns.cold.6+0xc/0x15
> Modules linked in:
> CPU: 0 PID: 777 Comm: sfdclient Not tainted 5.7.0-rc1userns+ #2924
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> RIP: 0010:__task_pid_nr_ns.cold.6+0xc/0x15
> Code: ff 66 90 48 83 ec 08 89 7c 24 04 48 8d 7e 08 48 8d 74 24 04 e8 9a b6 44 00 48 83 c4 08 c3 48 c7 c7 59 9f ac 82 e8 c2 c4 04 00 <0f> 0b e9 3fd
> RSP: 0018:ffffc9000042fbf8 EFLAGS: 00010046
> RAX: 000000000000000c RBX: 0000000000000000 RCX: ffffc9000042faf4
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81193d29
> RBP: ffffc9000042fc18 R08: 0000000000000000 R09: 0000000000000001
> R10: 000000100f938416 R11: 0000000000000309 R12: ffff8880b941c140
> R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880b941c140
> FS:  0000000000000000(0000) GS:ffff8880bca00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f2e8c0a32e0 CR3: 0000000002e10000 CR4: 00000000000006f0
> Call Trace:
>  send_signal+0x1c8/0x310
>  do_notify_parent+0x50f/0x550
>  release_task.part.21+0x4fd/0x620
>  do_exit+0x6f6/0xaf0
>  do_group_exit+0x42/0xb0
>  get_signal+0x13b/0xbb0
>  do_signal+0x2b/0x670
>  ? __audit_syscall_exit+0x24d/0x2b0
>  ? rcu_read_lock_sched_held+0x4d/0x60
>  ? kfree+0x24c/0x2b0
>  do_syscall_64+0x176/0x640
>  ? trace_hardirqs_off_thunk+0x1a/0x1c
>  entry_SYSCALL_64_after_hwframe+0x49/0xb3

The immediate problem is as Christof noticed that "pid_alive(current) == false".
This happens because do_notify_parent is called from the last thread to exit
in a process after that thread has been reaped.

The bigger issue is that do_notify_parent can be called from any
process that manages to wait on a thread of a multi-threaded process
from wait_task_zombie.  So any logic based upon current for
do_notify_parent is just nonsense, as current can be pretty much
anything.

So change do_notify_parent to call __send_signal directly.

Inspecting the code it appears this problem has existed since the pid
namespace support started handling this case in 2.6.30.  This fix only
backports to 7a0cf094944e ("signal: Correct namespace fixups of si_pid and si_uid")
where the problem logic was moved out of __send_signal and into send_signal.

Cc: [email protected]
Fixes: 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary")
Ref: 921cf9f63089 ("signals: protect cinit from unblocked SIG_DFL signals")
Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Christof Meerwald <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>

arm64: sync kernel APIAKey when installing

A direct write to a APxxKey_EL1 register requires a context
synchronization event to ensure that indirect reads made by subsequent
instructions (e.g. AUTIASP, PACIASP) observe the new value.

When we initialize the boot task's APIAKey in boot_init_stack_canary()
via ptrauth_keys_switch_kernel() we miss the necessary ISB, and so there
is a window where instructions are not guaranteed to use the new APIAKey
value. This has been observed to result in boot-time crashes where
PACIASP and AUTIASP within a function used a mixture of the old and new
key values.

Fix this by having ptrauth_keys_switch_kernel() synchronize the new key
value with an ISB. At the same time, __ptrauth_key_install() is renamed
to __ptrauth_key_install_nosync() so that it is obvious that this
performs no synchronization itself.

Fixes: 28321582334c261c ("arm64: initialize ptrauth keys for kernel booting task")
Signed-off-by: Mark Rutland <[email protected]>
Reported-by: Will Deacon <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Marc Zyngier <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Tested-by: Will Deacon <[email protected]>

ASoC: wm8960: Fix wrong clock after suspend & resume

After suspend & resume, wm8960_hw_params may be called when
bias_level is not SND_SOC_BIAS_ON, then wm8960_configure_clocking
is not called. But if sample rate is changed at that time, then
the output clock rate will be not correct.

So judgement of bias_level is SND_SOC_BIAS_ON in wm8960_hw_params
is not necessary and it causes above issue.

Fixes: 3176bf2d7ccd ("ASoC: wm8960: update pll and clock setting function")
Signed-off-by: Shengjiu Wang <[email protected]>
Acked-by: Charles Keepax <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

Merge tag 'kvm-ppc-fixes-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master

PPC KVM fix for 5.7

- Fix a regression introduced in the last merge window, which results
in guests in HPT mode dying randomly.

Merge tag 'kvm-s390-master-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fix for 5.7 and maintainer update

- Silence false positive lockdep warning
- add Claudio as reviewer

MAINTAINERS: update mt76 reviewers

Roy no longer works here. Time to say goodbye, my friend.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/c171e0dfce9f2dad5ca6935eaf6004117f82e259.1587195398.git.ryder.lee@mediatek.com

iwlwifi: fix WGDS check when WRDS is disabled

In the reference BIOS implementation, WRDS can be disabled without
disabling WGDS. And this happens in most cases where WRDS is
disabled, causing the WGDS without WRDS check and issue an error.

To avoid this issue, we change the check so that we only considered it
an error if the WRDS entry doesn't exist. If the entry (or the
selected profile is disabled for any other reason), we just silently
ignore WGDS.

Cc: [email protected] # 4.14+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205513
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417133700.72ad25c3998b.I875d935cefd595ed7f640ddcfc7bc802627d2b7f@changeid

iwlwifi: mvm: fix inactive TID removal return value usage

The function iwl_mvm_remove_inactive_tids() returns bool, so we
should just check "if (ret)", not "if (ret >= 0)" (which would
do nothing useful here). We obviously therefore cannot use the
return value of the function for the free_queue, we need to use
the queue (i) we're currently dealing with instead.

Cc: [email protected] # v5.4+
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.9d862ed72535.I9e27ccc3ee3c8855fc13682592b571581925dfbd@changeid

iwlwifi: mvm: Do not declare support for ACK Enabled Aggregation

As this was not supposed to be enabled to begin with.

Cc: [email protected] # v4.19+
Signed-off-by: Ilan Peer <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.53dbc3c6c36b.Idfe118546b92cc31548b2211472a5303c7de5909@changeid

iwlwifi: mvm: limit maximum queue appropriately

Due to some hardware issues, queue 31 isn't usable on devices that have
32 queues (7000, 8000, 9000 families), which is correctly reflected in
the configuration and TX queue initialization.

However, the firmware API and queue allocation code assumes that there
are 32 queues, and if something actually attempts to use #31 this leads
to a NULL-pointer dereference since it's not allocated.

Fix this by limiting to 31 in the IWL_MVM_DQA_MAX_DATA_QUEUE, and also
add some code to catch this earlier in the future, if the configuration
changes perhaps.

Cc: [email protected] # v4.9+
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.98a79be2db6a.I3a4af6b03b87a6bc18db9b1ff9a812f397bee1fc@changeid

iwlwifi: pcie: indicate correct RB size to device

In the context info, we need to indicate the correct RB size
to the device so that it will not think we have 4k when we
only use 2k. This seems to not have caused any issues right
now, likely because the hardware no longer supports putting
multiple entries into a single RB, and practically all of
the entries should be smaller than 2k.

Nevertheless, it's a bug, and we must advertise the right
size to the device.

Note that right now we can only tell it 2k vs. 4k, so for
the cases where we have more, still use 4k. This needs to
be fixed by the firmware first.

Signed-off-by: Johannes Berg <[email protected]>
Fixes: cfdc20efebdc ("iwlwifi: pcie: use partial pages if applicable")
Cc: [email protected] # v5.6
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.ae6cd345764f.I0985c55223decf70182b9ef1d8edf4179f537853@changeid

iwlwifi: mvm: beacon statistics shouldn't go backwards

We reset statistics also in case that we didn't reassoc so in
this cases keep last beacon counter.

Cc: [email protected] # v4.19+
Signed-off-by: Mordechay Goodstein <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.1f9142751fbc.Ifbfd0f928a0a761110b8f4f2ca5483a61fb21131@changeid

iwlwifi: pcie: actually release queue memory in TVQM

The iwl_trans_pcie_dyn_txq_free() function only releases the frames
that may be left on the queue by calling iwl_pcie_gen2_txq_unmap(),
but doesn't actually free the DMA ring or byte-count tables for the
queue. This leads to pretty large memory leaks (at least before my
queue size improvements), in particular in monitor/sniffer mode on
channel hopping since this happens on every channel change.

This was also now more evident after the move to a DMA pool for the
byte count tables, showing messages such as

BUG iwlwifi:bc (...): Objects remaining in iwlwifi:bc on __kmem_cache_shutdown()

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=206811.

Signed-off-by: Johannes Berg <[email protected]>
Fixes: 6b35ff91572f ("iwlwifi: pcie: introduce a000 TX queues management")
Cc: [email protected] # v4.14+
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.f5f4c4193ec1.Id5feebc9b4318041913a9c89fc1378bb5454292c@changeid

Merge tag 'zynqmp-soc-for-v5.7-rc3' of https://github.com/Xilinx/linux-xlnx into arm/fixes

arm64: soc: ZynqMP SoC fixes for v5.7

- Fix firmware driver dependency
- Fix one spare warning in firmware driver

* tag 'zynqmp-soc-for-v5.7-rc3' of https://github.com/Xilinx/linux-xlnx:
firmware: xilinx: make firmware_debugfs_root static
drivers: soc: xilinx: fix firmware driver Kconfig dependency

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

batman-adv: Fix refcnt leak in batadv_v_ogm_process

batadv_v_ogm_process() invokes batadv_hardif_neigh_get(), which returns
a reference of the neighbor object to "hardif_neigh" with increased
refcount.

When batadv_v_ogm_process() returns, "hardif_neigh" becomes invalid, so
the refcount should be decreased to keep refcount balanced.

The reference counting issue happens in one exception handling paths of
batadv_v_ogm_process(). When batadv_v_ogm_orig_get() fails to get the
orig node and returns NULL, the refcnt increased by
batadv_hardif_neigh_get() is not decreased, causing a refcnt leak.

Fix this issue by jumping to "out" label when batadv_v_ogm_orig_get()
fails to get the orig node.

Fixes: 9323158ef9f4 ("batman-adv: OGMv2 - implement originators logic")
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: Fix refcnt leak in batadv_store_throughput_override

batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(),
which gets a batadv_hard_iface object from net_dev with increased refcnt
and its reference is assigned to a local pointer 'hard_iface'.

When batadv_store_throughput_override() returns, "hard_iface" becomes
invalid, so the refcount should be decreased to keep refcount balanced.

The issue happens in one error path of
batadv_store_throughput_override(). When batadv_parse_throughput()
returns NULL, the refcnt increased by batadv_hardif_get_by_netdev() is
not decreased, causing a refcnt leak.

Fix this issue by jumping to "out" label when batadv_parse_throughput()
returns NULL.

Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces")
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: Fix refcnt leak in batadv_show_throughput_override

batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(),
which gets a batadv_hard_iface object from net_dev with increased refcnt
and its reference is assigned to a local pointer 'hard_iface'.

When batadv_show_throughput_override() returns, "hard_iface" becomes
invalid, so the refcount should be decreased to keep refcount balanced.

The issue happens in the normal path of
batadv_show_throughput_override(), which forgets to decrease the refcnt
increased by batadv_hardif_get_by_netdev() before the function returns,
causing a refcnt leak.

Fix this issue by calling batadv_hardif_put() before the
batadv_show_throughput_override() returns in the normal path.

Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces")
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: fix batadv_nc_random_weight_tq

and change to pseudorandom numbers, as this is a traffic dithering
operation that doesn't need crypto-grade.

The previous code operated in 4 steps:

1. Generate a random byte 0 <= rand_tq <= 255
2. Multiply it by BATADV_TQ_MAX_VALUE - tq
3. Divide by 255 (= BATADV_TQ_MAX_VALUE)
4. Return BATADV_TQ_MAX_VALUE - rand_tq

This would apperar to scale (BATADV_TQ_MAX_VALUE - tq) by a random
value between 0/255 and 255/255.

But!  The intermediate value between steps 3 and 4 is stored in a u8
variable.  So it's truncated, and most of the time, is less than 255, after
which the division produces 0.  Specifically, if tq is odd, the product is
always even, and can never be 255.  If tq is even, there's exactly one
random byte value that will produce a product byte of 255.

Thus, the return value is 255 (511/512 of the time) or 254 (1/512
of the time).

If we assume that the truncation is a bug, and the code is meant to scale
the input, a simpler way of looking at it is that it's returning a random
value between tq and BATADV_TQ_MAX_VALUE, inclusive.

Well, we have an optimized function for doing just that.

Fixes: 3c12de9a5c75 ("batman-adv: network coding - code and transmit packets if possible")
Signed-off-by: George Spelvin <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

iwlwifi: actually check allocated conf_tlv pointer

Commit 71bc0334a637 ("iwlwifi: check allocated pointer when allocating
conf_tlvs") attempted to fix a typoe introduced by commit 17b809c9b22e
("iwlwifi: dbg: move debug data to a struct") but does not implement the
check correctly.

Fixes: 71bc0334a637 ("iwlwifi: check allocated pointer when allocating conf_tlvs")
Tweeted-by: @grsecurity
Signed-off-by: Chris Rorvick <[email protected]>
Signed-off-by: Sedat Dilek <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

ALSA: usx2y: Fix potential NULL dereference

The error handling code in usX2Y_rate_set() may hit a potential NULL
dereference when an error occurs before allocating all us->urb[].
Add a proper NULL check for fixing the corner case.

Reported-by: Lin Yi <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: usb-audio: Add quirk for Focusrite Scarlett 2i2

Force it to use asynchronous playback.

Same quirk has already been added for Focusrite Scarlett Solo (2nd gen)
with a commit 46f5710f0b88 ("ALSA: usb-audio: Add quirk for Focusrite
Scarlett Solo").

This also seems to prevent regular clicks when playing at 44100Hz
on Scarlett 2i2 (2nd gen). I did not notice any side effects.

Moved both quirks to snd_usb_audioformat_attributes_quirk() as suggested.

Signed-off-by: Gregor Pintar <[email protected]>
Reviewed-by: Alexander Tsoy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

bpf, selftests: Add test for BPF_STX BPF_B storing R10

This patch adds a test to test_verifier that writes the lower 8 bits of
R10 (aka FP) using BPF_B to an array map and reads the result back. The
expected behavior is that the result should be the same as first copying
R10 to R9, and then storing / loading the lower 8 bits of R9.

This test catches a bug that was present in the x86-64 JIT that caused
an incorrect encoding for BPF_STX BPF_B when the source operand is R10.

Signed-off-by: Xi Wang <[email protected]>
Signed-off-by: Luke Nelson <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf, x86: Fix encoding for lower 8-bit registers in BPF_STX BPF_B

This patch fixes an encoding bug in emit_stx for BPF_B when the source
register is BPF_REG_FP.

The current implementation for BPF_STX BPF_B in emit_stx saves one REX
byte when the operands can be encoded using Mod-R/M alone. The lower 8
bits of registers %rax, %rbx, %rcx, and %rdx can be accessed without using
a REX prefix via %al, %bl, %cl, and %dl, respectively. Other registers,
(e.g., %rsi, %rdi, %rbp, %rsp) require a REX prefix to use their 8-bit
equivalents (%sil, %dil, %bpl, %spl).

The current code checks if the source for BPF_STX BPF_B is BPF_REG_1
or BPF_REG_2 (which map to %rdi and %rsi), in which case it emits the
required REX prefix. However, it misses the case when the source is
BPF_REG_FP (mapped to %rbp).

The result is that BPF_STX BPF_B with BPF_REG_FP as the source operand
will read from register %ch instead of the correct %bpl. This patch fixes
the problem by fixing and refactoring the check on which registers need
the extra REX byte. Since no BPF registers map to %rsp, there is no need
to handle %spl.

Fixes: 622582786c9e0 ("net: filter: x86: internal BPF JIT")
Signed-off-by: Xi Wang <[email protected]>
Signed-off-by: Luke Nelson <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf: Fix handling of XADD on BTF memory

check_xadd() can cause check_ptr_to_btf_access() to be executed with
atype==BPF_READ and value_regno==-1 (meaning "just check whether the access
is okay, don't tell me what type it will result in").
Handle that case properly and skip writing type information, instead of
indexing into the registers at index -1 and writing into out-of-bounds
memory.

Note that at least at the moment, you can't actually write through a BTF
pointer, so check_xadd() will reject the program after calling
check_ptr_to_btf_access with atype==BPF_WRITE; but that's after the
verifier has already corrupted memory.

This patch assumes that BTF pointers are not available in unprivileged
programs.

Fixes: 9e15db66136a ("bpf: Implement accurate raw_tp context access via BTF")
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf: Forbid XADD on spilled pointers for unprivileged users

When check_xadd() verifies an XADD operation on a pointer to a stack slot
containing a spilled pointer, check_stack_read() verifies that the read,
which is part of XADD, is valid. However, since the placeholder value -1 is
passed as `value_regno`, check_stack_read() can only return a binary
decision and can't return the type of the value that was read. The intent
here is to verify whether the value read from the stack slot may be used as
a SCALAR_VALUE; but since check_stack_read() doesn't check the type, and
the type information is lost when check_stack_read() returns, this is not
enforced, and a malicious user can abuse XADD to leak spilled kernel
pointers.

Fix it by letting check_stack_read() verify that the value is usable as a
SCALAR_VALUE if no type information is passed to the caller.

To be able to use __is_pointer_value() in check_stack_read(), move it up.

Fix up the expected unprivileged error message for a BPF selftest that,
until now, assumed that unprivileged users can use XADD on stack-spilled
pointers. This also gives us a test for the behavior introduced in this
patch for free.

In theory, this could also be fixed by forbidding XADD on stack spills
entirely, since XADD is a locked operation (for operations on memory with
concurrency) and there can't be any concurrency on the BPF stack; but
Alexei has said that he wants to keep XADD on stack slots working to avoid
changes to the test suite [1].

The following BPF program demonstrates how to leak a BPF map pointer as an
unprivileged user using this bug:

    // r7 = map_pointer
    BPF_LD_MAP_FD(BPF_REG_7, small_map),
    // r8 = launder(map_pointer)
    BPF_STX_MEM(BPF_DW, BPF_REG_FP, BPF_REG_7, -8),
    BPF_MOV64_IMM(BPF_REG_1, 0),
    ((struct bpf_insn) {
      .code  = BPF_STX | BPF_DW | BPF_XADD,
      .dst_reg = BPF_REG_FP,
      .src_reg = BPF_REG_1,
      .off = -8
    }),
    BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_FP, -8),

    // store r8 into map
    BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_7),
    BPF_MOV64_REG(BPF_REG_ARG2, BPF_REG_FP),
    BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG2, -4),
    BPF_ST_MEM(BPF_W, BPF_REG_ARG2, 0, 0),
    BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
    BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
    BPF_EXIT_INSN(),
    BPF_STX_MEM(BPF_DW, BPF_REG_0, BPF_REG_8, 0),

    BPF_MOV64_IMM(BPF_REG_0, 0),
    BPF_EXIT_INSN()

[1] https://lore.kernel.org/bpf/20200416211116 [email protected]/

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

cpumap: Avoid warning when CONFIG_DEBUG_PER_CPU_MAPS is enabled

When the kernel is built with CONFIG_DEBUG_PER_CPU_MAPS, the cpumap code
can trigger a spurious warning if CONFIG_CPUMASK_OFFSTACK is also set. This
happens because in this configuration, NR_CPUS can be larger than
nr_cpumask_bits, so the initial check in cpu_map_alloc() is not sufficient
to guard against hitting the warning in cpumask_check().

Fix this by explicitly checking the supplied key against the
nr_cpumask_bits variable before calling cpu_possible().

Fixes: 6710e1126934 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP")
Reported-by: Xiumei Mu <[email protected]>
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Tested-by: Xiumei Mu <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions

Since cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT
page fault handler", it's been possible in fairly rare circumstances to
load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a
guest on a POWER8 host.

Because that case wasn't checked for, we could misinterpret the non-present
PTE as being a cache-inhibited PTE.  That could mismatch with the
corresponding hash PTE, which would cause the function to fail with -EFAULT
a little further down.  That would propagate up to the KVM_RUN ioctl()
generally causing the KVM userspace (usually qemu) to fall over.

This addresses the problem by catching that case and returning to the guest
instead.

For completeness, this fixes the radix page fault handler in the same
way.  For radix this didn't cause any obvious misbehaviour, because we
ended up putting the non-present PTE into the guest's partition-scoped
page tables, leading immediately to another hypervisor data/instruction
storage interrupt, which would go through the page fault path again
and fix things up.

Fixes: cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler"
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402
Reported-by: David Gibson <[email protected]>
Tested-by: David Gibson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>

Merge tag 'mlx5-fixes-2020-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

mlx5-fixes-2020-04-20

Signed-off-by: David S. Miller <[email protected]>

net/mlx5e: Get the latest values from counters in switchdev mode

In the switchdev mode, when running "cat
/sys/class/net/NIC/statistics/tx_packets", the ppcnt register is
accessed to get the latest values. But currently this command can
not get the correct values from ppcnt.

From firmware manual, before getting the 802_3 counters, the 802_3
data layout should be set to the ppcnt register.

When the command "cat /sys/class/net/NIC/statistics/tx_packets" is
run, before updating 802_3 data layout with ppcnt register, the
monitor counters are tested. The test result will decide the
802_3 data layout is updated or not.

Actually the monitor counters do not support to monitor rx/tx
stats of 802_3 in switchdev mode. So the rx/tx counters change
will not trigger monitor counters. So the 802_3 data layout will
not be updated in ppcnt register. Finally this command can not get
the latest values from ppcnt register with 802_3 data layout.

Fixes: 5c7e8bbb0257 ("net/mlx5e: Use monitor counters for update stats")
Signed-off-by: Zhu Yanjun <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: Kconfig: convert imply usage to weak dependency

MLX5_CORE uses the 'imply' keyword to depend on VXLAN, PTP_1588_CLOCK,
MLXFW and PCI_HYPERV_INTERFACE.

This was useful to force vxlan, ptp, etc.. to be reachable to mlx5
regardless of their config states.

Due to the changes in the cited commit below, the semantics of 'imply'
was changed to not force any restriction on the implied config.

As a result of this change, the compilation of MLX5_CORE=y and VXLAN=m
would result in undefined references, as VXLAN now would stay as 'm'.

To fix this we change MLX5_CORE to have a weak dependency on
these modules/configs and make sure they are reachable, by adding:
depend on symbol || !symbol.

For example: VXLAN=m MLX5_CORE=y, this will force MLX5_CORE to m

Fixes: def2fbffe62c ("kconfig: allow symbols implied by y to become m")
Signed-off-by: Saeed Mahameed <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Reported-by: Randy Dunlap <[email protected]>

net/mlx5e: Don't trigger IRQ multiple times on XSK wakeup to avoid WQ overruns

XSK wakeup function triggers NAPI by posting a NOP WQE to a special XSK
ICOSQ. When the application floods the driver with wakeup requests by
calling sendto() in a certain pattern that ends up in mlx5e_trigger_irq,
the XSK ICOSQ may overflow.

Multiple NOPs are not required and won't accelerate the process, so
avoid posting a second NOP if there is one already on the way. This way
we also avoid increasing the queue size (which might not help anyway).

Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: CT: Change idr to xarray to protect parallel tuple id allocation

After allowing parallel tuple insertion, we get the following trace:

[ 5505.142249] ------------[ cut here ]------------
[ 5505.148155] WARNING: CPU: 21 PID: 13313 at lib/radix-tree.c:581 delete_node+0x16c/0x180
[ 5505.295553] CPU: 21 PID: 13313 Comm: kworker/u50:22 Tainted: G           OE     5.6.0+ #78
[ 5505.304824] Hardware name: Supermicro Super Server/X10DRT-P, BIOS 2.0b 03/30/2017
[ 5505.313740] Workqueue: nf_flow_table_offload flow_offload_work_handler [nf_flow_table]
[ 5505.323257] RIP: 0010:delete_node+0x16c/0x180
[ 5505.349862] RSP: 0018:ffffb19184eb7b30 EFLAGS: 00010282
[ 5505.356785] RAX: 0000000000000000 RBX: ffff904ac95b86d8 RCX: ffff904b6f938838
[ 5505.365190] RDX: 0000000000000000 RSI: ffff904ac954b908 RDI: ffff904ac954b920
[ 5505.373628] RBP: ffff904b4ac13060 R08: 0000000000000001 R09: 0000000000000000
[ 5505.382155] R10: 0000000000000000 R11: 0000000000000040 R12: 0000000000000000
[ 5505.390527] R13: ffffb19184eb7bfc R14: ffff904b6bef5800 R15: ffff90482c1203c0
[ 5505.399246] FS:  0000000000000000(0000) GS:ffff904c2fc80000(0000) knlGS:0000000000000000
[ 5505.408621] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5505.415739] CR2: 00007f5d27006010 CR3: 0000000058c10006 CR4: 00000000001626e0
[ 5505.424547] Call Trace:
[ 5505.428429]  idr_alloc_u32+0x7b/0xc0
[ 5505.433803]  mlx5_tc_ct_entry_add_rule+0xbf/0x950 [mlx5_core]
[ 5505.441354]  ? mlx5_fc_create+0x23c/0x370 [mlx5_core]
[ 5505.448225]  mlx5_tc_ct_block_flow_offload+0x874/0x10b0 [mlx5_core]
[ 5505.456278]  ? mlx5_tc_ct_block_flow_offload+0x63d/0x10b0 [mlx5_core]
[ 5505.464532]  nf_flow_offload_tuple.isra.21+0xc5/0x140 [nf_flow_table]
[ 5505.472286]  ? __kmalloc+0x217/0x2f0
[ 5505.477093]  ? flow_rule_alloc+0x1c/0x30
[ 5505.482117]  flow_offload_work_handler+0x1d0/0x290 [nf_flow_table]
[ 5505.489674]  ? process_one_work+0x17c/0x580
[ 5505.494922]  process_one_work+0x202/0x580
[ 5505.500082]  ? process_one_work+0x17c/0x580
[ 5505.505696]  worker_thread+0x4c/0x3f0
[ 5505.510458]  kthread+0x103/0x140
[ 5505.514989]  ? process_one_work+0x580/0x580
[ 5505.520616]  ? kthread_bind+0x10/0x10
[ 5505.525837]  ret_from_fork+0x3a/0x50
[ 5505.570841] ---[ end trace 07995de9c56d6831 ]---

This happens from parallel deletes/adds to idr, as idr isn't protected.
Fix that by using xarray as the tuple_ids allocator instead of idr.

Fixes: 7da182a998d6 ("netfilter: flowtable: Use work entry per offload command")
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Paul Blakey <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

net/mlx5: Fix failing fw tracer allocation on s390

On s390 FORCE_MAX_ZONEORDER is 9 instead of 11, thus a larger kzalloc()
allocation as done for the firmware tracer will always fail.

Looking at mlx5_fw_tracer_save_trace(), it is actually the driver itself
that copies the debug data into the trace array and there is no need for
the allocation to be contiguous in physical memory. We can therefor use
kvzalloc() instead of kzalloc() and get rid of the large contiguous
allcoation.

Fixes: f53aaa31cce7 ("net/mlx5: FW tracer, implement tracer logic")
Signed-off-by: Niklas Schnelle <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

kvm: Disable objtool frame pointer checking for vmenter.S

Frame pointers are completely broken by vmenter.S because it clobbers
RBP:

arch/x86/kvm/svm/vmenter.o: warning: objtool: __svm_vcpu_run()+0xe4: BP used as a scratch register

That's unavoidable, so just skip checking that file when frame pointers
are configured in.

On the other hand, ORC can handle that code just fine, so leave objtool
enabled in the !FRAME_POINTER case.

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Message-Id: <01fae42917bacad18be8d2cbc771353da6603473.1587398610 [email protected]>
Tested-by: Randy Dunlap <[email protected]> # build-tested
Fixes: 199cd1d7b534 ("KVM: SVM: Split svm_vcpu_run inline assembly to separate file")
Signed-off-by: Paolo Bonzini <[email protected]>

audit: check the length of userspace generated audit records

Commit 756125289285 ("audit: always check the netlink payload length
in audit_receive_msg()") fixed a number of missing message length
checks, but forgot to check the length of userspace generated audit
records. The good news is that you need CAP_AUDIT_WRITE to submit
userspace audit records, which is generally only given to trusted
processes, so the impact should be limited.

Cc: [email protected]
Fixes: 756125289285 ("audit: always check the netlink payload length in audit_receive_msg()")
Reported-by: [email protected]
Signed-off-by: Paul Moore <[email protected]>

tpm/tpm_tis: Free IRQ if probing fails

Call disable_interrupts() if we have to revert to polling in order not to
unnecessarily reserve the IRQ for the life-cycle of the driver.

Cc: [email protected] # 4.5.x
Reported-by: Hans de Goede <[email protected]>
Fixes: e3837e74a06d ("tpm_tis: Refactor the interrupt setup")
Signed-off-by: Jarkko Sakkinen <[email protected]>

tpm: fix wrong return value in tpm_pcr_extend

For the algorithm that does not match the bank, a positive
value EINVAL is returned here. I think this is a typo error.
It is necessary to return an error value.

Cc: [email protected] # 5.4.x
Fixes: 9f75c8224631 ("KEYS: trusted: correctly initialize digests and fix locking issue")
Signed-off-by: Tianjia Zhang <[email protected]>
Reviewed-by: Roberto Sassu <[email protected]>
Reviewed-by: Jerry Snitselaar <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>

tpm: ibmvtpm: retry on H_CLOSED in tpm_ibmvtpm_send()

tpm_ibmvtpm_send() can fail during PowerVM Live Partition Mobility resume
with an H_CLOSED return from ibmvtpm_send_crq(). The PAPR says, 'The
"partner partition suspended" transport event disables the associated CRQ
such that any H_SEND_CRQ hcall() to the associated CRQ returns H_Closed
until the CRQ has been explicitly enabled using the H_ENABLE_CRQ hcall.'
This patch adds a check in tpm_ibmvtpm_send() for an H_CLOSED return from
ibmvtpm_send_crq() and in that case calls tpm_ibmvtpm_resume() and
retries the ibmvtpm_send_crq() once.

Cc: [email protected] # 3.7.x
Fixes: 132f76294744 ("drivers/char/tpm: Add new device driver to support IBM vTPM")
Reported-by: Linh Pham <[email protected]>
Reviewed-by: Stefan Berger <[email protected]>
Signed-off-by: George Wilson <[email protected]>
Tested-by: Linh Pham <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>

tpm: Export tpm2_get_cc_attrs_tbl for ibmvtpm driver as module

This patch fixes the following problem when the ibmvtpm driver
is built as a module:

ERROR: modpost: "tpm2_get_cc_attrs_tbl" [drivers/char/tpm/tpm_ibmvtpm.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
make: *** [Makefile:1298: modules] Error 2

Fixes: 18b3670d79ae ("tpm: ibmvtpm: Add support for TPM2")
Signed-off-by: Stefan Berger <[email protected]>
Reported-by: Sachin Sant <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>

team: fix hang in team_mode_get()

When team mode is changed or set, the team_mode_get() is called to check
whether the mode module is inserted or not. If the mode module is not
inserted, it calls the request_module().
In the request_module(), it creates a child process, which is
the "modprobe" process and waits for the done of the child process.
At this point, the following locks were used.
down_read(&cb_lock()); by genl_rcv()
    genl_lock(); by genl_rcv_msc()
        rtnl_lock(); by team_nl_cmd_options_set()
            mutex_lock(&team->lock); by team_nl_team_get()

Concurrently, the team module could be removed by rmmod or "modprobe -r"
The __exit function of team module is team_module_exit(), which calls
team_nl_fini() and it tries to acquire following locks.
down_write(&cb_lock);
    genl_lock();
Because of the genl_lock() and cb_lock, this process can't be finished
earlier than request_module() routine.

The problem secenario.
CPU0                                     CPU1
team_mode_get
    request_module()
                                         modprobe -r team_mode_roundrobin
                                                     team <--(B)
        modprobe team <--(A)
            team_mode_roundrobin

By request_module(), the "modprobe team_mode_roundrobin" command
will be executed. At this point, the modprobe process will decide
that the team module should be inserted before team_mode_roundrobin.
Because the team module is being removed.

By the module infrastructure, the same module insert/remove operations
can't be executed concurrently.
So, (A) waits for (B) but (B) also waits for (A) because of locks.
So that the hang occurs at this point.

Test commands:
    while :
    do
        teamd -d &
killall teamd &
modprobe -rv team_mode_roundrobin &
    done

The approach of this patch is to hold the reference count of the team
module if the team module is compiled as a module. If the reference count
of the team module is not zero while request_module() is being called,
the team module will not be removed at that moment.
So that the above scenario could not occur.

Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Taehee Yoo <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mptcp-fix-races-on-accept'

Paolo Abeni says:

====================
mptcp: fix races on accept()

This series includes some fixes for accept() races which may cause inconsistent
MPTCP socket status and oops. Please see the individual patches for the
technical details.
====================

Signed-off-by: David S. Miller <[email protected]>

mptcp: drop req socket remote_key* fields

We don't need them, as we can use the current ingress opt
data instead. Setting them in syn_recv_sock() may causes
inconsistent mptcp socket status, as per previous commit.

Fixes: cc7972ea1932 ("mptcp: parse and emit MP_CAPABLE option according to v1 spec")
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mptcp: avoid flipping mp_capable field in syn_recv_sock()

If multiple CPUs races on the same req_sock in syn_recv_sock(),
flipping such field can cause inconsistent child socket status.

When racing, the CPU losing the req ownership may still change
the mptcp request socket mp_capable flag while the CPU owning
the request is cloning the socket, leaving the child socket with
'is_mptcp' set but no 'mp_capable' flag.

Such socket will stay with 'conn' field cleared, heading to oops
in later mptcp callback.

Address the issue tracking the fallback status in a local variable.

Fixes: 58b09919626b ("mptcp: create msk early")
Co-developed-by: Florian Westphal <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mptcp: handle mptcp listener destruction via rcu

Following splat can occur during self test:

BUG: KASAN: use-after-free in subflow_data_ready+0x156/0x160
Read of size 8 at addr ffff888100c35c28 by task mptcp_connect/4808

  subflow_data_ready+0x156/0x160
  tcp_child_process+0x6a3/0xb30
  tcp_v4_rcv+0x2231/0x3730
  ip_protocol_deliver_rcu+0x5c/0x860
  ip_local_deliver_finish+0x220/0x360
  ip_local_deliver+0x1c8/0x4e0
  ip_rcv_finish+0x1da/0x2f0
  ip_rcv+0xd0/0x3c0
  __netif_receive_skb_one_core+0xf5/0x160
  __netif_receive_skb+0x27/0x1c0
  process_backlog+0x21e/0x780
  net_rx_action+0x35f/0xe90
  do_softirq+0x4c/0x50
  [..]

This occurs when accessing subflow_ctx->conn.

Problem is that tcp_child_process() calls listen sockets'
sk_data_ready() notification, but it doesn't hold the listener
lock.  Another cpu calling close() on the listener will then cause
transition of refcount to 0.

Fixes: 58b09919626bf ("mptcp: create msk early")
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: fix large delays in PTP synchronization

Fetching PTP sync information from mailbox is slow and can take
up to 10 milliseconds. Reduce this unnecessary delay by directly
reading the information from the corresponding registers.

Fixes: 9c33e4208bce ("cxgb4: Add PTP Hardware Clock (PHC) support")
Signed-off-by: Manoj Malviya <[email protected]>
Signed-off-by: Rahul Lakkireddy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'omap-for-v5.6/fixes-rc7-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes

Boot regression fix for N950/N9

We need to tag RNG as disabled for N950/N9 as it blocked by the secure
mode. We have a similar change done for N900, but I missed adding it
for N950/N9 with the recent RNG changes.

* tag 'omap-for-v5.6/fixes-rc7-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: OMAP3: disable RNG on N950/N9

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>

net: stmmac: dwmac-meson8b: Add missing boundary to RGMII TX clock array

Running with KASAN on a VIM3L systems leads to the following splat
when probing the Ethernet device:

==================================================================
BUG: KASAN: global-out-of-bounds in _get_maxdiv+0x74/0xd8
Read of size 4 at addr ffffa000090615f4 by task systemd-udevd/139
CPU: 1 PID: 139 Comm: systemd-udevd Tainted: G E 5.7.0-rc1-00101-g8624b7577b9c #781
Hardware name: amlogic w400/w400, BIOS 2020.01-rc5 03/12/2020
Call trace:
dump_backtrace+0x0/0x2a0
show_stack+0x20/0x30
dump_stack+0xec/0x148
print_address_description.isra.12+0x70/0x35c
__kasan_report+0xfc/0x1d4
kasan_report+0x4c/0x68
__asan_load4+0x9c/0xd8
_get_maxdiv+0x74/0xd8
clk_divider_bestdiv+0x74/0x5e0
clk_divider_round_rate+0x80/0x1a8
clk_core_determine_round_nolock.part.9+0x9c/0xd0
clk_core_round_rate_nolock+0xf0/0x108
clk_hw_round_rate+0xac/0xf0
clk_factor_round_rate+0xb8/0xd0
clk_core_determine_round_nolock.part.9+0x9c/0xd0
clk_core_round_rate_nolock+0xf0/0x108
clk_core_round_rate_nolock+0xbc/0x108
clk_core_set_rate_nolock+0xc4/0x2e8
clk_set_rate+0x58/0xe0
meson8b_dwmac_probe+0x588/0x72c [dwmac_meson8b]
platform_drv_probe+0x78/0xd8
really_probe+0x158/0x610
driver_probe_device+0x140/0x1b0
device_driver_attach+0xa4/0xb0
__driver_attach+0xcc/0x1c8
bus_for_each_dev+0xf4/0x168
driver_attach+0x3c/0x50
bus_add_driver+0x238/0x2e8
driver_register+0xc8/0x1e8
__platform_driver_register+0x88/0x98
meson8b_dwmac_driver_init+0x28/0x1000 [dwmac_meson8b]
do_one_initcall+0xa8/0x328
do_init_module+0xe8/0x368
load_module+0x3300/0x36b0
__do_sys_finit_module+0x120/0x1a8
__arm64_sys_finit_module+0x4c/0x60
el0_svc_common.constprop.2+0xe4/0x268
do_el0_svc+0x98/0xa8
el0_svc+0x24/0x68
el0_sync_handler+0x12c/0x318
el0_sync+0x158/0x180

The buggy address belongs to the variable:
div_table.63646+0x34/0xfffffffffffffa40 [dwmac_meson8b]

Memory state around the buggy address:
ffffa00009061480: fa fa fa fa 00 00 00 01 fa fa fa fa 00 00 00 00
ffffa00009061500: 05 fa fa fa fa fa fa fa 00 04 fa fa fa fa fa fa
>ffffa00009061580: 00 03 fa fa fa fa fa fa 00 00 00 00 00 00 fa fa
^
ffffa00009061600: fa fa fa fa 00 01 fa fa fa fa fa fa 01 fa fa fa
ffffa00009061680: fa fa fa fa 00 01 fa fa fa fa fa fa 04 fa fa fa
==================================================================

Digging into this indeed shows that the clock divider array is
lacking a final fence, and that the clock subsystems goes in the
weeds. Oh well.

Let's add the empty structure that indicates the end of the array.

Fixes: bd6f48546b9c ("net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs")
Signed-off-by: Marc Zyngier <[email protected]>
Cc: Martin Blumenstingl <[email protected]>
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: fix restrict IPV6_ADDRFORM operation

Commit b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation") fixed a
problem found by syzbot an unfortunate logic error meant that it
also broke IPV6_ADDRFORM.

Rearrange the checks so that the earlier test is just one of the series
of checks made before moving the socket from IPv6 to IPv4.

Fixes: b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation")
Signed-off-by: John Haxby <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>

net: systemport: Omit superfluous error message in bcm_sysport_probe()

In the function bcm_sysport_probe(), when get irq failed, the function
platform_get_irq() logs an error message, so remove redundant message
here.

Signed-off-by: Tang Bin <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: openvswitch: ovs_ct_exit to be done under ovs_lock

syzbot wrote:
| =============================
| WARNING: suspicious RCU usage
| 5.7.0-rc1+ #45 Not tainted
| -----------------------------
| net/openvswitch/conntrack.c:1898 RCU-list traversed in non-reader section!!
|
| other info that might help us debug this:
| rcu_scheduler_active = 2, debug_locks = 1
| ...
|
| stack backtrace:
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
| Workqueue: netns cleanup_net
| Call Trace:
| ...
| ovs_ct_exit
| ovs_exit_net
| ops_exit_list.isra.7
| cleanup_net
| process_one_work
| worker_thread

To avoid that warning, invoke the ovs_ct_exit under ovs_lock and add
lockdep_ovsl_is_held as optional lockdep expression.

Link: https://lore.kernel.org/lkml/[email protected]
Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
Cc: Pravin B Shelar <[email protected]>
Cc: Yi-Hung Wei <[email protected]>
Reported-by: [email protected]
Signed-off-by: Tonghao Zhang <[email protected]>
Acked-by: Pravin B Shelar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vdso/datapage: Use correct clock mode name in comment

While the explanation for time namespace <-> vdso interactions is very
helpful it uses the wrong name in the comment when describing the clock
mode making grepping a bit annoying.

This seems like an accidental oversight when moving from VCLOCK_TIMENS
to VDSO_CLOCKMODE_TIMENS. It seems that
660fd04f9317 ("lib/vdso: Prepare for time namespace support") misspelled
VCLOCK_TIMENS as VLOCK_TIMENS which explains why it got missed when
VCLOCK_TIMENS became VDSO_CLOCKMODE_TIMENS in
2d6b01bd88cc ("lib/vdso: Move VCLOCK_TIMENS to vdso_clock_modes").

Update the comment to use VDSO_CLOCKMODE_TIMENS.

Fixes: 660fd04f9317 ("lib/vdso: Prepare for time namespace support")
Fixes: 2d6b01bd88cc ("lib/vdso: Move VCLOCK_TIMENS to vdso_clock_modes")
Signed-off-by: Christian Brauner <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Andrei Vagin <[email protected]>
Acked-by: Vincenzo Frascino <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

drm/i915/dpcd_bl: Unbreak enable_dpcd_backlight modparam

Looks like I accidentally made it so you couldn't force DPCD backlight
support on, whoops. Fix that.

Signed-off-by: Lyude Paul <[email protected]>
Fixes: 17f5d57915be ("drm/i915: Force DPCD backlight mode on X1 Extreme 2nd Gen 4K AMOLED panel")
Cc: Adam Jackson <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: "Ville Syrjälä" <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit d7fb38ae36a2dc97924b075ad1d1a88792777ea9)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915: fix Sphinx build duplicate label warning

Fix the warning caused by enabling the autosectionlabel extension in the
kernel Sphinx build:

Documentation/gpu/i915.rst:610: WARNING: duplicate label
gpu/i915:layout, other instance in Documentation/gpu/i915.rst

The autosectionlabel extension adds labels to each section title for
cross-referencing, but forbids identical section titles in a
document. With kernel-doc, this includes sections titles in the included
kernel-doc comments.

In the warning message, Sphinx is unable to reference the labels in
their true locations in the kernel-doc comments in source. In this case,
there's "Layout" sections in both gt/intel_workarounds.c and
i915_reg.h. Rename the section in the latter to "File Layout".

Fixes: 58ad30cf91f0 ("docs: fix reference to core-api/namespaces.rst")
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 27be41de45a70fe1cb0ae1cbd2fd6da1ce3ffe9a)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915/display: Load DP_TP_CTL/STATUS offset before use it

Right now dp.regs.dp_tp_ctl/status are only set during the encoder
pre_enable() hook, what is causing all reads and writes to those
registers to go to offset 0x0 before pre_enable() is executed.

So if i915 takes the BIOS state and don't do a modeset any following
link retraing will fail.

In the case that i915 needs to do a modeset, the DDI disable sequence
will write to a wrong register not disabling DP 'Transport Enable' in
DP_TP_CTL, making a HDMI modeset in the same port/transcoder to
not light up the monitor.

So here for GENs older than 12, that have those registers fixed at
port offset range it is loading at encoder/port init while for GEN12
it will keep setting it at encoder pre_enable() and during HW state
readout.

Fixes: 4444df6e205b ("drm/i915/tgl: move DP_TP_* to transcoder")
Cc: Matt Roper <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: José Roberto de Souza <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit edcb9028d66b44d74ba4f8b9daa379b004dc1f85)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915/tgl: TBT AUX should use TC power well ops

As on ICL, we want to use the Type-C aux handlers for the TBT aux wells
to ensure the DP_AUX_CH_CTL_TBT_IO flag is set properly.

Fixes: 656409bbaf87 ("drm/i915/tgl: Add power well support")
Cc: José Roberto de Souza <[email protected]>
Cc: Imre Deak <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: José Roberto de Souza <[email protected]>
(cherry picked from commit 3cbdb97564a39020262e62b655e788b63cf426cb)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915: HDCP: fix Ri prime check done during link check

The check was always succeeding even in case of a mismatch due to the
HDCP_STATUS_ENC bit being set. Make sure both bits are actually set.

Signed-off-by: Oliver Barta <[email protected]>
Fixes: 2320175feb74 ("drm/i915: Implement HDCP for HDMI")
Reviewed-by: Ramalingam C <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 3ffaf56e912e76cf09c560244c8804f9bebda8b1)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915/gt: Update PMINTRMSK holding fw

If we use a non-forcewaked write to PMINTRMSK, it does not take effect
until much later, if at all, causing a loss of RPS interrupts and no GPU
reclocking, leaving the GPU running at the wrong frequency for long
periods of time.

Reported-by: Francisco Jerez <[email protected]>
Suggested-by: Francisco Jerez <[email protected]>
Fixes: 35cc7f32c298 ("drm/i915/gt: Use non-forcewake writes for RPS")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Francisco Jerez <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Andi Shyti <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Cc: <[email protected]> # v5.6+
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit a080bd994c4023042a2b605c65fa10a25933f636)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915/tgl: Add Wa_14010477008:tgl

Media decompression support should not be advertised on any display
planes for steppings A0-C0.

Bspec: 53273
Fixes: 2dfbf9d2873a ("drm/i915/tgl: Gen-12 display can decompress surfaces compressed by the media engine")
Cc: Matt Atwood <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: José Roberto de Souza <[email protected]>
(cherry picked from commit dbff5a8db9c630f61a892ab41a283445e01270f5)
Signed-off-by: Rodrigo Vivi <[email protected]>

bdev: Reduce time holding bd_mutex in sync in blkdev_close()

While trying to "dd" to the block device for a USB stick, I
encountered a hung task warning (blocked for > 120 seconds).  I
managed to come up with an easy way to reproduce this on my system
(where /dev/sdb is the block device for my USB stick) with:

  while true; do dd if=/dev/zero of=/dev/sdb bs=4M; done

With my reproduction here are the relevant bits from the hung task
detector:

INFO: task udevd:294 blocked for more than 122 seconds.
...
udevd           D    0   294      1 0x00400008
Call trace:
  ...
  mutex_lock_nested+0x40/0x50
  __blkdev_get+0x7c/0x3d4
  blkdev_get+0x118/0x138
  blkdev_open+0x94/0xa8
  do_dentry_open+0x268/0x3a0
  vfs_open+0x34/0x40
  path_openat+0x39c/0xdf4
  do_filp_open+0x90/0x10c
  do_sys_open+0x150/0x3c8
  ...

...
Showing all locks held in the system:
...
1 lock held by dd/2798:
  #0: ffffff814ac1a3b8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x50/0x204
...
dd              D    0  2798   2764 0x00400208
Call trace:
  ...
  schedule+0x8c/0xbc
  io_schedule+0x1c/0x40
  wait_on_page_bit_common+0x238/0x338
  __lock_page+0x5c/0x68
  write_cache_pages+0x194/0x500
  generic_writepages+0x64/0xa4
  blkdev_writepages+0x24/0x30
  do_writepages+0x48/0xa8
  __filemap_fdatawrite_range+0xac/0xd8
  filemap_write_and_wait+0x30/0x84
  __blkdev_put+0x88/0x204
  blkdev_put+0xc4/0xe4
  blkdev_close+0x28/0x38
  __fput+0xe0/0x238
  ____fput+0x1c/0x28
  task_work_run+0xb0/0xe4
  do_notify_resume+0xfc0/0x14bc
  work_pending+0x8/0x14

The problem appears related to the fact that my USB disk is terribly
slow and that I have a lot of RAM in my system to cache things.
Specifically my writes seem to be happening at ~15 MB/s and I've got
~4 GB of RAM in my system that can be used for buffering.  To write 4
GB of buffer to disk thus takes ~4000 MB / ~15 MB/s = ~267 seconds.

The 267 second number is a problem because in __blkdev_put() we call
sync_blockdev() while holding the bd_mutex.  Any other callers who
want the bd_mutex will be blocked for the whole time.

The problem is made worse because I believe blkdev_put() specifically
tells other tasks (namely udev) to go try to access the device at right
around the same time we're going to hold the mutex for a long time.

Putting some traces around this (after disabling the hung task detector),
I could confirm:
dd:    437.608600: __blkdev_put() right before sync_blockdev() for sdb
udevd: 437.623901: blkdev_open() right before blkdev_get() for sdb
dd:    661.468451: __blkdev_put() right after sync_blockdev() for sdb
udevd: 663.820426: blkdev_open() right after blkdev_get() for sdb

A simple fix for this is to realize that sync_blockdev() works fine if
you're not holding the mutex.  Also, it's not the end of the world if
you sync a little early (though it can have performance impacts).
Thus we can make a guess that we're going to need to do the sync and
then do it without holding the mutex.  We still do one last sync with
the mutex but it should be much, much faster.

With this, my hung task warnings for my test case are gone.

Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

ASoC: wm89xx: Add missing dependency

sound/soc/codecs/wm8900.o: In function `wm8900_i2c_probe':
wm8900.c:(.text+0xa36): undefined reference to `__devm_regmap_init_i2c'
sound/soc/codecs/wm8900.o: In function `wm8900_modinit':
wm8900.c:(.init.text+0xb): undefined reference to `i2c_register_driver'
sound/soc/codecs/wm8900.o: In function `wm8900_exit':
wm8900.c:(.exit.text+0x8): undefined reference to `i2c_del_driver'
sound/soc/codecs/wm8988.o: In function `wm8988_i2c_probe':
wm8988.c:(.text+0x857): undefined reference to `__devm_regmap_init_i2c'
sound/soc/codecs/wm8988.o: In function `wm8988_modinit':
wm8988.c:(.init.text+0xb): undefined reference to `i2c_register_driver'
sound/soc/codecs/wm8988.o: In function `wm8988_exit':
wm8988.c:(.exit.text+0x8): undefined reference to `i2c_del_driver'
sound/soc/codecs/wm8995.o: In function `wm8995_i2c_probe':
wm8995.c:(.text+0x1c4f): undefined reference to `__devm_regmap_init_i2c'
sound/soc/codecs/wm8995.o: In function `wm8995_modinit':
wm8995.c:(.init.text+0xb): undefined reference to `i2c_register_driver'
sound/soc/codecs/wm8995.o: In function `wm8995_exit':
wm8995.c:(.exit.text+0x8): undefined reference to `i2c_del_driver'

Add SND_SOC_I2C_AND_SPI dependency to fix this.

Fixes: ea00d95200d02ece ("ASoC: Use imply for SND_SOC_ALL_CODECS")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: YueHaibing <[email protected]>
Acked-by: Charles Keepax <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>

vhost: disable for OABI

vhost is currently broken on the some ARM configs.

The reason is that the ring element addresses are passed between
components with different alignments assumptions. Thus, if
guest selects a pointer and host then gets and dereferences
it, then alignment assumed by the host's compiler might be
greater than the actual alignment of the pointer.
compiler on the host from assuming pointer is aligned.

This actually triggers on ARM with -mabi=apcs-gnu - which is a
deprecated configuration. With this OABI, compiler assumes that
all structures are 4 byte aligned - which is stronger than
virtio guarantees for available and used rings, which are
merely 2 bytes. Thus a guest without -mabi=apcs-gnu running
on top of host with -mabi=apcs-gnu will be broken.

The correct fix is to force alignment of structures - however
that is an intrusive fix that's best deferred until the next release.

We didn't previously support such ancient systems at all - this surfaced
after vdpa support prompted removing dependency of vhost on
VIRTULIZATION. So for now, let's just add something along the lines of

depends on !ARM || AEABI

to the virtio Kconfig declaration, and add a comment that it has to do
with struct member alignment.

Note: we can't make VHOST and VHOST_RING themselves have
a dependency since these are selected. Add a new symbol for that.

We should be able to drop this dependency down the road.

Fixes: 20c384f1ea1a0bc7 ("vhost: refine vhost and vringh kconfig")
Suggested-by: Ard Biesheuvel <[email protected]>
Suggested-by: Richard Earnshaw <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

Merge series "ASoC: rsnd: multi-SSI setup fixes" from Matthias Blankertz <[email protected]>:

Fix rsnd_dai_call() operations being performed twice for the master SSI
in multi-SSI setups, and fix the rsnd_ssi_stop operation for multi-SSI
setups.
The only visible effect of these issues was some "status check failed"
spam when the rsnd_ssi_stop was called, but overall the code is cleaner
now, and some questionable writes to the SSICR register which did not
lead to any observable misbehaviour but were contrary to the datasheet
are fixed.

Mark:
The first patch kind of reverts my "ASoC: rsnd: Fix parent SSI
start/stop in multi-SSI mode" from a few days ago and achieves the same
effect in a simpler fashion, if you would prefer a clean patch series
based on v5.6 drop me a note.

Greetings,
Matthias

Matthias Blankertz (2):
ASoC: rsnd: Don't treat master SSI in multi SSI setup as parent
ASoC: rsnd: Fix "status check failed" spam for multi-SSI

sound/soc/sh/rcar/ssi.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)

base-commit: 15a5760cb8b6d5c1ebbf1d2e1f0b77380ab68a82
--
2.26.1

Merge series "ASoC: meson: fix codec-to-codec link setup" from Jerome Brunet <[email protected]>:

This patchset fixes the problem reported by Marc in this thread [0]
The problem was due to an error in the meson card drivers which had
the "no_pcm" dai_link property set on codec-to-codec links

[0]: https://lore.kernel.org/r/20200417122732 [email protected]

Jerome Brunet (2):
  ASoC: meson: axg-card: fix codec-to-codec link setup
  ASoC: meson: gx-card: fix codec-to-codec link setup

sound/soc/meson/axg-card.c | 4 +++-
sound/soc/meson/gx-card.c  | 4 +++-
2 files changed, 6 insertions(+), 2 deletions(-)

--
2.25.2

ASoC: dapm: fixup dapm kcontrol widget

snd_soc_dapm_kcontrol widget which is created by autodisable control
should contain correct on_val, mask and shift because it is set when the
widget is powered and changed value is applied on registers by following
code in dapm_seq_run_coalesced().

mask |= w->mask << w->shift;
if (w->power)
value |= w->on_val << w->shift;
else
value |= w->off_val << w->shift;

Shift on the mask in dapm_kcontrol_data_alloc() is removed to prevent
double shift.
And, on_val in dapm_kcontrol_set_value() is modified to get correct
value in the dapm_seq_run_coalesced().

Signed-off-by: Gyeongtaek Lee <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>