Git Repo - linux.git/log

ibmvnic: remove duplicate napi_schedule call in do_reset function

During adapter reset, do_reset/do_hard_reset calls ibmvnic_open(),
which will calls napi_schedule if previous state is VNIC_CLOSED
(i.e, the reset case, and "ifconfig down" case). So there is no need
for do_reset to call napi_schedule again at the end of the function
though napi_schedule will neglect the request if napi is already
scheduled.

Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
Signed-off-by: Lijun Pan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: avoid calling napi_disable() twice

__ibmvnic_open calls napi_disable without checking whether NAPI polling
has already been disabled or not. This could cause napi_disable
being called twice, which could generate deadlock. For example,
the first napi_disable will spin until NAPI_STATE_SCHED is cleared
by napi_complete_done, then set it again.
When napi_disable is called the second time, it will loop infinitely
because no dev->poll will be running to clear NAPI_STATE_SCHED.

To prevent above scenario from happening, call ibmvnic_napi_disable()
which checks if napi is disabled or not before calling napi_disable.

Fixes: bfc32f297337 ("ibmvnic: Move resource initialization to its own routine")
Suggested-by: Thomas Falcon <[email protected]>
Signed-off-by: Lijun Pan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

r8169: don't advertise pause in jumbo mode

It has been reported [0] that using pause frames in jumbo mode impacts
performance. There's no available chip documentation, but vendor
drivers r8168 and r8125 don't advertise pause in jumbo mode. So let's
do the same, according to Roman it fixes the issue.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=212617

Fixes: 9cf9b84cc701 ("r8169: make use of phy_set_asym_pause")
Reported-by: Roman Mamedov <[email protected]>
Tested-by: Roman Mamedov <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>

ethtool: pause: make sure we init driver stats

The intention was for pause statistics to not be reported
when driver does not have the relevant callback (only
report an empty netlink nest). What happens currently
we report all 0s instead. Make sure statistics are
initialized to "not set" (which is -1) so the dumping
code skips them.

Fixes: 9a27a33027f2 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

io_uring: fix early sqd_list removal sqpoll hangs

[  245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
[  245.463334] task:iou-sqp-1374    state:D flags:0x00004000
[  245.463345] Call Trace:
[  245.463352]  __schedule+0x36b/0x950
[  245.463376]  schedule+0x68/0xe0
[  245.463385]  __io_uring_cancel+0xfb/0x1a0
[  245.463407]  do_exit+0xc0/0xb40
[  245.463423]  io_sq_thread+0x49b/0x710
[  245.463445]  ret_from_fork+0x22/0x30

It happens when sqpoll forgot to run park_task_work and goes to exit,
then exiting user may remove ctx from sqd_list, and so corresponding
io_sq_thread() -> io_uring_cancel_sqpoll() won't be executed. Hopefully
it just stucks in do_exit() in this case.

Fixes: dbe1bdbb39db ("io_uring: handle signals for IO threads like a normal thread")
Reported-by: Joakim Hassila <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

dm verity fec: fix misaligned RS roots IO

commit df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to
block size") introduced the possibility for misaligned roots IO
relative to the underlying device's logical block size. E.g. Android's
default RS roots=2 results in dm_bufio->block_size=1024, which causes
the following EIO if the logical block size of the device is 4096,
given v->data_dev_block_bits=12:

E sd 0 : 0:0:0: [sda] tag#30 request not aligned to the logical block size
E blk_update_request: I/O error, dev sda, sector 10368424 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
E device-mapper: verity-fec: 254:8: FEC 9244672: parity read failed (block 18056): -5

Fix this by onlu using f->roots for dm_bufio blocksize IFF it is
aligned to v->data_dev_block_bits.

Fixes: df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size")
Cc: [email protected]
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

Merge tag 's390-5.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

- setup stack backchain properly in external and i/o interrupt handler
   to fix stack unwinding. This broke when converting to generic entry

  - save caller address of psw_idle to get a sane stacktrace

* tag 's390-5.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/entry: save the caller of psw_idle
  s390/entry: avoid setting up backchain in ext|io handlers

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

- Fix incorrect asm constraint for load_unaligned_zeropad() fixup

- Fix thread flag update when setting TIF_MTE_ASYNC_FAULT

- Fix restored irq state when handling fault on kprobe

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kprobes: Restore local irqflag if kprobes is cancelled
  arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is set atomically
  arm64: fix inline asm in load_unaligned_zeropad()

Merge tag 'dmaengine-fix-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
"A couple of dmaengine driver fixes for:

   - race and descriptor issue for xilinx driver

   - fix interrupt handling, wq state & cleanup, field sizes for
     completion, msix permissions for idxd driver

   - runtime pm fix for tegra driver

   - double free fix in dma_async_device_register"

* tag 'dmaengine-fix-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: idxd: fix wq cleanup of WQCFG registers
  dmaengine: idxd: clear MSIX permission entry on shutdown
  dmaengine: plx_dma: add a missing put_device() on error path
  dmaengine: tegra20: Fix runtime PM imbalance on error
  dmaengine: Fix a double free in dma_async_device_register
  dmaengine: dw: Make it dependent to HAS_IOMEM
  dmaengine: idxd: fix wq size store permission state
  dmaengine: idxd: fix opcap sysfs attribute output
  dmaengine: idxd: fix delta_rec and crc size field for completion record
  dmaengine: idxd: Fix clobbering of SWERR overflow bit on writeback
  dmaengine: xilinx: dpdma: Fix race condition in done IRQ
  dmaengine: xilinx: dpdma: Fix descriptor issuing on video group

Merge tag 'vfio-v5.12-rc8' of git://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:
"Verify mmap region within range (Christian A. Ehrhardt)"

* tag 'vfio-v5.12-rc8' of git://github.com/awilliam/linux-vfio:
vfio/pci: Add missing range check in vfio_pci_mmap

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
"Fix for a possible out-of-bounds access"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: VMX: Don't use vcpu->run->internal.ndata as an array index

Merge tag 'trace-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"Fix a memory link in dyn_event_release().

  An error path exited the function before freeing the allocated 'argv'
  variable"

* tag 'trace-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/dynevent: Fix a memory leak in an error handling path

xen-netback: Check for hotplug-status existence before watching

The logic in connect() is currently written with the assumption that
xenbus_watch_pathfmt() will return an error for a node that does not
exist.  This assumption is incorrect: xenstore does allow a watch to
be registered for a nonexistent node (and will send notifications
should the node be subsequently created).

As of commit 1f2565780 ("xen-netback: remove 'hotplug-status' once it
has served its purpose"), this leads to a failure when a domU
transitions into XenbusStateConnected more than once.  On the first
domU transition into Connected state, the "hotplug-status" node will
be deleted by the hotplug_status_changed() callback in dom0.  On the
second or subsequent domU transition into Connected state, the
hotplug_status_changed() callback will therefore never be invoked, and
so the backend will remain stuck in InitWait.

This failure prevents scenarios such as reloading the xen-netfront
module within a domU, or booting a domU via iPXE.  There is
unfortunately no way for the domU to work around this dom0 bug.

Fix by explicitly checking for existence of the "hotplug-status" node,
thereby creating the behaviour that was previously assumed to exist.

Signed-off-by: Michael Brown <[email protected]>
Reviewed-by: Paul Durrant <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

KVM: VMX: Don't use vcpu->run->internal.ndata as an array index

__vmx_handle_exit() uses vcpu->run->internal.ndata as an index for
an array access. Since vcpu->run is (can be) mapped to a user address
space with a writer permission, the 'ndata' could be updated by the
user process at anytime (the user process can set it to outside the
bounds of the array).
So, it is not safe that __vmx_handle_exit() uses the 'ndata' that way.

Fixes: 1aa561b1a4c0 ("kvm: x86: Add "last CPU" to some KVM_EXIT information")
Signed-off-by: Reiji Watanabe <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Message-Id: <20210413154739 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

gro: ensure frag0 meets IP header alignment

After commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Guenter Roeck reported one failure in his tests using sh architecture.

After much debugging, we have been able to spot silent unaligned accesses
in inet_gro_receive()

The issue at hand is that upper networking stacks assume their header
is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
bytes before the Ethernet header to make that happen.

This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
if the fragment is not properly aligned.

Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
as 0, this extra check will be a NOP for them.

Note that if frag0 is not used, GRO will call pskb_may_pull()
as many times as needed to pull network and transport headers.

Fixes: 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Guenter Roeck <[email protected]>
Cc: Xuan Zhuo <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sctp: fix race condition in sctp_destroy_sock

If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
held and sp->do_auto_asconf is true, then an element is removed
from the auto_asconf_splist without any proper locking.

This can happen in the following functions:
1. In sctp_accept, if sctp_sock_migrate fails.
2. In inet_create or inet6_create, if there is a bpf program
attached to BPF_CGROUP_INET_SOCK_CREATE which denies
creation of the sctp socket.

The bug is fixed by acquiring addr_wq_lock in sctp_destroy_sock
instead of sctp_close.

This addresses CVE-2021-23133.

Reported-by: Or Cohen <[email protected]>
Reviewed-by: Xin Long <[email protected]>
Fixes: 610236587600 ("bpf: Add new cgroup attach type to enable sock modifications")
Signed-off-by: Or Cohen <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: correctly use dev_consume/free_skb_irq

It is more correct to use dev_kfree_skb_irq when packets are dropped,
and to use dev_consume_skb_irq when packets are consumed.

Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls")
Suggested-by: Thomas Falcon <[email protected]>
Signed-off-by: Lijun Pan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Make tcp_allowed_congestion_control readonly in non-init netns

Currently, tcp_allowed_congestion_control is global and writable;
writing to it in any net namespace will leak into all other net
namespaces.

tcp_available_congestion_control and tcp_allowed_congestion_control are
the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
NULL data pointer; their handlers (proc_tcp_available_congestion_control
and proc_allowed_congestion_control) have no other way of referencing a
struct net. Thus, they operate globally.

Because ipv4_net_table does not use designated initializers, there is no
easy way to fix up this one "bad" table entry. However, the data pointer
updating logic shouldn't be applied to NULL pointers anyway, so we
instead force these entries to be read-only.

These sysctls used to exist in ipv4_table (init-net only), but they were
moved to the per-net ipv4_net_table, presumably without realizing that
tcp_allowed_congestion_control was writable and thus introduced a leak.

Because the intent of that commit was only to know (i.e. read) "which
congestion algorithms are available or allowed", this read-only solution
should be sufficient.

The logic added in recent commit
31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
does not and cannot check for NULL data pointers, because
other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
.data=NULL but use other methods (.extra2) to access the struct net.

Fixes: 9cb8e048e5d9 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns")
Signed-off-by: Jonathon Reinhart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'catch-all-devices'

Hristo Venev says:

====================
net: Fix two use-after-free bugs

The two patches fix two use-after-free bugs related to cleaning up
network namespaces, one in sit and one in ip6_tunnel. They are easy to
trigger if the user has the ability to create network namespaces.

The bugs can be used to trigger null pointer dereferences. I am not
sure if they can be exploited further, but I would guess that they
can. I am not sending them to the mailing list without confirmation
that doing so would be OK.
====================

Signed-off-by: David S. Miller <[email protected]>

net: ip6_tunnel: Unregister catch-all devices

Similarly to the sit case, we need to remove the tunnels with no
addresses that have been moved to another network namespace.

Fixes: 0bd8762824e73 ("ip6tnl: add x-netns support")
Signed-off-by: Hristo Venev <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sit: Unregister catch-all devices

A sit interface created without a local or a remote address is linked
into the `sit_net::tunnels_wc` list of its original namespace. When
deleting a network namespace, delete the devices that have been moved.

The following script triggers a null pointer dereference if devices
linked in a deleted `sit_net` remain:

    for i in `seq 1 30`; do
        ip netns add ns-test
        ip netns exec ns-test ip link add dev veth0 type veth peer veth1
        ip netns exec ns-test ip link add dev sit$i type sit dev veth0
        ip netns exec ns-test ip link set dev sit$i netns $$
        ip netns del ns-test
    done
    for i in `seq 1 30`; do
        ip link del dev sit$i
    done

Fixes: 5e6700b3bf98f ("sit: add support of x-netns")
Signed-off-by: Hristo Venev <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'fixes-for-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull MTD fix from Richard Weinberger:
"Fix WAITRDY break condition and timeout in mtk nand driver"

* tag 'fixes-for-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: mtk: Fix WAITRDY break condition and timeout

ice: Fix potential infinite loop when using u8 loop counter

A for-loop is using a u8 loop counter that is being compared to
a u32 cmp_dcbcfg->numapp to check for the end of the loop. If
cmp_dcbcfg->numapp is larger than 255 then the counter j will wrap
around to zero and hence an infinite loop occurs. Fix this by making
counter j the same type as cmp_dcbcfg->numapp.

Addresses-Coverity: ("Infinite loop")
Fixes: aeac8ce864d9 ("ice: Recognize 860 as iSCSI port in CEE mode")
Signed-off-by: Colin Ian King <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

ixgbe: fix unbalanced device enable/disable in suspend/resume

pci_disable_device() called in __ixgbe_shutdown() decreases
dev->enable_cnt by 1. pci_enable_device_mem() which increases
dev->enable_cnt by 1, was removed from ixgbe_resume() in commit
6f82b2558735 ("ixgbe: use generic power management"). This caused
unbalanced increase/decrease. So add pci_enable_device_mem() back.

Fix the following call trace.

  ixgbe 0000:17:00.1: disabling already-disabled device
  Call Trace:
   __ixgbe_shutdown+0x10a/0x1e0 [ixgbe]
   ixgbe_suspend+0x32/0x70 [ixgbe]
   pci_pm_suspend+0x87/0x160
   ? pci_pm_freeze+0xd0/0xd0
   dpm_run_callback+0x42/0x170
   __device_suspend+0x114/0x460
   async_suspend+0x1f/0xa0
   async_run_entry_fn+0x3c/0xf0
   process_one_work+0x1dd/0x410
   worker_thread+0x34/0x3f0
   ? cancel_delayed_work+0x90/0x90
   kthread+0x14c/0x170
   ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30

Fixes: 6f82b2558735 ("ixgbe: use generic power management")
Signed-off-by: Yongxin Liu <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

ixgbe: Fix NULL pointer dereference in ethtool loopback test

The ixgbe driver currently generates a NULL pointer dereference when
performing the ethtool loopback test. This is due to the fact that there
isn't a q_vector associated with the test ring when it is setup as
interrupts are not normally added to the test rings.

To address this I have added code that will check for a q_vector before
returning a napi_id value. If a q_vector is not present it will return a
value of 0.

Fixes: b02e5a0ebb17 ("xsk: Propagate napi_id to XDP socket Rx path")
Signed-off-by: Alexander Duyck <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

tracing/dynevent: Fix a memory leak in an error handling path

We must free 'argv' before returning, as already done in all the other
paths of this function.

Link: https://lkml.kernel.org/r/21e3594ccd7fc88c5c162c98450409190f304327.1618136448.git.christophe.jaillet@wanadoo.fr
Fixes: d262271d0483 ("tracing/dynevent: Delegate parsing to create function")
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

vfio/pci: Add missing range check in vfio_pci_mmap

When mmaping an extra device region verify that the region index
derived from the mmap offset is valid.

Fixes: a15b1883fee1 ("vfio_pci: Allow mapping extra regions")
Cc: [email protected]
Signed-off-by: Christian A. Ehrhardt <[email protected]>
Message-Id: <20210412214124 [email protected]>
Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade()

Commit 1a1c130ab757 ("ACPI: tables: x86: Reserve memory occupied by
ACPI tables") attempted to address an issue with reserving the memory
occupied by ACPI tables, but it broke the initrd-based table override
mechanism relied on by multiple users.

To restore the initrd-based ACPI table override functionality, move
the acpi_boot_table_init() invocation in setup_arch() on x86 after
the acpi_table_upgrade() one.

Fixes: 1a1c130ab757 ("ACPI: tables: x86: Reserve memory occupied by ACPI tables")
Reported-by: Hans de Goede <[email protected]>
Tested-by: Hans de Goede <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

arm64: kprobes: Restore local irqflag if kprobes is cancelled

If instruction being single stepped caused a page fault, the kprobes
is cancelled to let the page fault handler continue as a normal page
fault. But the local irqflags are disabled so cpu will restore pstate
with DAIF masked. After pagefault is serviced, the kprobes is
triggerred again, we overwrite the saved_irqflag by calling
kprobes_save_local_irqflag(). NOTE, DAIF is masked in this new saved
irqflag. After kprobes is serviced, the cpu pstate is retored with
DAIF masked.

This patch is inspired by one patch for riscv from Liao Chang.

Signed-off-by: Jisheng Zhang <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix NAT IPv6 offload in the flowtable.

2) icmpv6 is printed as unknown in /proc/net/nf_conntrack.

3) Use div64_u64() in nft_limit, from Eric Dumazet.

4) Use pre_exit to unregister ebtables and arptables hooks,
from Florian Westphal.

5) Fix out-of-bound memset in x_tables compat match/target,
also from Florian.

6) Clone set elements expression to ensure proper initialization.
====================

Signed-off-by: David S. Miller <[email protected]>

netfilter: nftables: clone set element expression template

memcpy() breaks when using connlimit in set elements. Use
nft_expr_clone() to initialize the connlimit expression list, otherwise
connlimit garbage collector crashes when walking on the list head copy.

[  493.064656] Workqueue: events_power_efficient nft_rhash_gc [nf_tables]
[  493.064685] RIP: 0010:find_or_evict+0x5a/0x90 [nf_conncount]
[  493.064694] Code: 2b 43 40 83 f8 01 77 0d 48 c7 c0 f5 ff ff ff 44 39 63 3c 75 df 83 6d 18 01 48 8b 43 08 48 89 de 48 8b 13 48 8b 3d ee 2f 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 03 48 83
[  493.064699] RSP: 0018:ffffc90000417dc0 EFLAGS: 00010297
[  493.064704] RAX: 0000000000000000 RBX: ffff888134f38410 RCX: 0000000000000000
[  493.064708] RDX: 0000000000000000 RSI: ffff888134f38410 RDI: ffff888100060cc0
[  493.064711] RBP: ffff88812ce594a8 R08: ffff888134f38438 R09: 00000000ebb9025c
[  493.064714] R10: ffffffff8219f838 R11: 0000000000000017 R12: 0000000000000001
[  493.064718] R13: ffffffff82146740 R14: ffff888134f38410 R15: 0000000000000000
[  493.064721] FS:  0000000000000000(0000) GS:ffff88840e440000(0000) knlGS:0000000000000000
[  493.064725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  493.064729] CR2: 0000000000000008 CR3: 00000001330aa002 CR4: 00000000001706e0
[  493.064733] Call Trace:
[  493.064737]  nf_conncount_gc_list+0x8f/0x150 [nf_conncount]
[  493.064746]  nft_rhash_gc+0x106/0x390 [nf_tables]

Reported-by: Laura Garcia Liebana <[email protected]>
Fixes: 409444522976 ("netfilter: nf_tables: add elements with stateful expressions")
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: x_tables: fix compat match/target pad out-of-bound write

xt_compat_match/target_from_user doesn't check that zeroing the area
to start of next rule won't write past end of allocated ruleset blob.

Remove this code and zero the entire blob beforehand.

Reported-by: [email protected]
Reported-by: Andy Nguyen <[email protected]>
Fixes: 9fa492cdc160c ("[NETFILTER]: x_tables: simplify compat API")
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

ethtool: fix kdoc attr name

Add missing 't' in attrtype.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: phy: marvell: fix detection of PHY on Topaz switches

Since commit fee2d546414d ("net: phy: marvell: mv88e6390 temperature
sensor reading"), Linux reports the temperature of Topaz hwmon as
constant -75°C.

This is because switches from the Topaz family (88E6141 / 88E6341) have
the address of the temperature sensor register different from Peridot.

This address is instead compatible with 88E1510 PHYs, as was used for
Topaz before the above mentioned commit.

Create a new mapping table between switch family and PHY ID for families
which don't have a model number. And define PHY IDs for Topaz and Peridot
families.

Create a new PHY ID and a new PHY driver for Topaz's internal PHY.
The only difference from Peridot's PHY driver is the HWMON probing
method.

Prior this change Topaz's internal PHY is detected by kernel as:

PHY [...] driver [Marvell 88E6390] (irq=63)

And afterwards as:

PHY [...] driver [Marvell 88E6341 Family] (irq=63)

Signed-off-by: Pali Rohár <[email protected]>
BugLink: https://github.com/globalscaletechnologies/linux/issues/1
Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading")
Reviewed-by: Marek Behún <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dmaengine: idxd: fix wq cleanup of WQCFG registers

A pre-release silicon erratum workaround where wq reset does not clear
WQCFG registers was leaked into upstream code. Use wq reset command
instead of blasting the MMIO region. This also address an issue where
we clobber registers in future devices.

Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling")
Reported-by: Shreenivaas Devarajan <[email protected]>
Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/161824330020.881560.16375921906426627033.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: idxd: clear MSIX permission entry on shutdown

Add disabling/clearing of MSIX permission entries on device shutdown to
mirror the enabling of the MSIX entries on probe. Current code left the
MSIX enabled and the pasid entries still programmed at device shutdown.

Fixes: 8e50d392652f ("dmaengine: idxd: Add shared workqueue support")
Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/161824457969.882533.6020239898682672311.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>

Merge tag 'm68knommu-for-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull m68knommu fix from Greg Ungerer:
"Some m68k platforms with a non-zero memory base fail to boot with the
  recent flatmem changes.

  This is a single regression fix to the pfn offset for that case"

* tag 'm68knommu-for-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: fix flatmem memory model setup

arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is set atomically

The entry from EL0 code checks the TFSRE0_EL1 register for any
asynchronous tag check faults in user space and sets the
TIF_MTE_ASYNC_FAULT flag. This is not done atomically, potentially
racing with another CPU calling set_tsk_thread_flag().

Replace the non-atomic ORR+STR with an STSET instruction. While STSET
requires ARMv8.1 and an assembler that understands LSE atomics, the MTE
feature is part of ARMv8.5 and already requires an updated assembler.

Signed-off-by: Catalin Marinas <[email protected]>
Fixes: 637ec831ea4f ("arm64: mte: Handle synchronous and asynchronous tag check faults")
Cc: <[email protected]> # 5.10.x
Reported-by: Will Deacon <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>

drm/i915/display/vlv_dsi: Do not skip panel_pwr_cycle_delay when disabling the panel

After the recently added commit fe0f1e3bfdfe ("drm/i915: Shut down
displays gracefully on reboot"), the DSI panel on a Cherry Trail based
Predia Basic tablet would no longer properly light up after reboot.

I've managed to reproduce this without rebooting by doing:
chvt 3; echo 1 > /sys/class/graphics/fb0/blank;\
echo 0 > /sys/class/graphics/fb0/blank

Which rapidly turns the panel off and back on again.

The vlv_dsi.c code uses an intel_dsi_msleep() helper for the various delays
used for panel on/off, since starting with MIPI-sequences version >= 3 the
delays are already included inside the MIPI-sequences.

The problems exposed by the "Shut down displays gracefully on reboot"
change, show that using this helper for the panel_pwr_cycle_delay is
not the right thing to do. This has not been noticed until now because
normally the panel never is cycled off and directly on again in quick
succession.

Change the msleep for the panel_pwr_cycle_delay to a normal msleep()
call to avoid the panel staying black after a quick off + on cycle.

Cc: Ville Syrjälä <[email protected]>
Fixes: fe0f1e3bfdfe ("drm/i915: Shut down displays gracefully on reboot")
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 2878b29fc25a0dac0e1c6c94177f07c7f94240f0)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915: Don't zero out the Y plane's watermarks

Don't zero out the watermarks for the Y plane since we've already
computed them when computing the UV plane's watermarks (since the
UV plane always appears before ethe Y plane when iterating through
the planes).

This leads to allocating no DDB for the Y plane since .min_ddb_alloc
also gets zeroed. And that of course leads to underruns when scanning
out planar formats.

Cc: [email protected]
Cc: Stanislav Lisovskiy <[email protected]>
Fixes: dbf71381d733 ("drm/i915: Nuke intel_atomic_crtc_state_for_each_plane_state() from skl+ wm code")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Stanislav Lisovskiy <[email protected]>
(cherry picked from commit f99b805fb9413ff007ca0b6add871737664117dd)
Signed-off-by: Rodrigo Vivi <[email protected]>

drm/i915/dpcd_bl: Don't try vesa interface unless specified by VBT

Looks like that there actually are another subset of laptops on the market
that don't support the Intel HDR backlight interface, but do advertise
support for the VESA DPCD backlight interface despite the fact it doesn't
seem to work.

Note though I'm not entirely clear on this - on one of the machines where
this issue was observed, I also noticed that we appeared to be rejecting
the VBT defined backlight frequency in
intel_dp_aux_vesa_calc_max_backlight(). It's noted in this function that:

/* Use highest possible value of Pn for more granularity of brightness
* adjustment while satifying the conditions below.
* ...
* - FxP is within 25% of desired value.
* Note: 25% is arbitrary value and may need some tweak.
*/

So it's possible that this value might just need to be tweaked, but for now
let's just disable the VESA backlight interface unless it's specified in
the VBT just to be safe. We might be able to try enabling this again by
default in the future.

Fixes: 2227816e647a ("drm/i915/dp: Allow forcing specific interfaces through enable_dpcd_backlight")
Cc: Jani Nikula <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Bugzilla: https://gitlab.freedesktop.org/drm/intel/-/issues/3169
Signed-off-by: Lyude Paul <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 9e2eb6d5380e9dadcd2baecb51f238e5eba94bee)
Signed-off-by: Rodrigo Vivi <[email protected]>

s390/entry: save the caller of psw_idle

Currently psw_idle does not allocate a stack frame and does not
save its r14 and r15 into the save area. Even though this is valid from
call ABI point of view, because psw_idle does not make any calls
explicitly, in reality psw_idle is an entry point for controlled
transition into serving interrupts. So, in practice, psw_idle stack
frame is analyzed during stack unwinding. Depending on build options
that r14 slot in the save area of psw_idle might either contain a value
saved by previous sibling call or complete garbage.

  [task    0000038000003c28] do_ext_irq+0xd6/0x160
  [task    0000038000003c78] ext_int_handler+0xba/0xe8
  [task   *0000038000003dd8] psw_idle_exit+0x0/0x8 <-- pt_regs
([task    0000038000003dd8] 0x0)
  [task    0000038000003e10] default_idle_call+0x42/0x148
  [task    0000038000003e30] do_idle+0xce/0x160
  [task    0000038000003e70] cpu_startup_entry+0x36/0x40
  [task    0000038000003ea0] arch_call_rest_init+0x76/0x80

So, to make a stacktrace nicer and actually point for the real caller of
psw_idle in this frequently occurring case, make psw_idle save its r14.

  [task    0000038000003c28] do_ext_irq+0xd6/0x160
  [task    0000038000003c78] ext_int_handler+0xba/0xe8
  [task   *0000038000003dd8] psw_idle_exit+0x0/0x6 <-- pt_regs
([task    0000038000003dd8] arch_cpu_idle+0x3c/0xd0)
  [task    0000038000003e10] default_idle_call+0x42/0x148
  [task    0000038000003e30] do_idle+0xce/0x160
  [task    0000038000003e70] cpu_startup_entry+0x36/0x40
  [task    0000038000003ea0] arch_call_rest_init+0x76/0x80

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>

s390/entry: avoid setting up backchain in ext|io handlers

Currently when interrupt arrives to cpu while in kernel context
INT_HANDLER macro (used for ext_int_handler and io_int_handler)
allocates new stack frame and pt_regs on the kernel stack and
sets up the backchain to jump over the pt_regs to the frame which has
been interrupted. This is not ideal to two reasons:

1. This hides the fact that kernel stack contains interrupt frame in it
   and hence breaks arch_stack_walk_reliable(), which needs to know that to
   guarantee "reliability" and checks that there are no pt_regs on the way.

2. It breaks the backchain unwinder logic, which assumes that the next
   stack frame after an interrupt frame is reliable, while it is not.
   In some cases (when r14 contains garbage) this leads to early unwinding
   termination with an error, instead of marking frame as unreliable
   and continuing.

To address that, only set backchain to 0.

Fixes: 56e62a737028 ("s390: convert to generic entry")
Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>

dmaengine: plx_dma: add a missing put_device() on error path

Add a missing put_device(&pdev->dev) if the call to
dma_async_device_register(dma); fails.

Fixes: 905ca51e63be ("dmaengine: plx-dma: Introduce PLX DMA engine PCI driver skeleton")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Logan Gunthorpe <[email protected]>
Link: https://lore.kernel.org/r/YFnq/0IQzixtAbC1@mwanda
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: tegra20: Fix runtime PM imbalance on error

pm_runtime_get_sync() will increase the runtime PM counter
even it returns an error. Thus a pairing decrement is needed
to prevent refcount leak. Fix this by replacing this API with
pm_runtime_resume_and_get(), which will not change the runtime
PM counter on error.

Signed-off-by: Dinghao Liu <[email protected]>
Acked-by: Thierry Reding <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: Fix a double free in dma_async_device_register

In the first list_for_each_entry() macro of dma_async_device_register,
it gets the chan from list and calls __dma_async_device_channel_register
(..,chan). We can see that chan->local is allocated by alloc_percpu() and
it is freed chan->local by free_percpu(chan->local) when
__dma_async_device_channel_register() failed.

But after __dma_async_device_channel_register() failed, the caller will
goto err_out and freed the chan->local in the second time by free_percpu().

The cause of this problem is forget to set chan->local to NULL when
chan->local was freed in __dma_async_device_channel_register(). My
patch sets chan->local to NULL when the callee failed to avoid double free.

Fixes: d2fb0a0438384 ("dmaengine: break out channel registration")
Signed-off-by: Lv Yunlong <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: dw: Make it dependent to HAS_IOMEM

Some architectures do not provide devm_*() APIs. Hence make the driver
dependent on HAVE_IOMEM.

Fixes: dbde5c2934d1 ("dw_dmac: use devm_* functions to simplify code")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: idxd: fix wq size store permission state

WQ size can only be changed when the device is disabled. Current code
allows change when device is enabled but wq is disabled. Change the check
to detect device state.

Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver")
Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/161782558755.107710.18138252584838406025.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: idxd: fix opcap sysfs attribute output

The operation capability register is 256bits. The current output only
prints out the first 64bits. Fix to output the entire 256bits. The current
code omits operation caps from IAX devices.

Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver")
Reported-by: Lucas Van <[email protected]>
Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/161645624963.2003736.829798666998490151.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: idxd: fix delta_rec and crc size field for completion record

The delta_rec_size and crc_val in the completion record should
be 32bits and not 16bits.

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Reported-by: Nikhil Rao <[email protected]>
Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/161645618572.2003490.14466173451736323035.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>

dmaengine: idxd: Fix clobbering of SWERR overflow bit on writeback

Current code blindly writes over the SWERR and the OVERFLOW bits. Write
back the bits actually read instead so the driver avoids clobbering the
OVERFLOW bit that comes after the register is read.

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Reported-by: Sanjay Kumar <[email protected]>
Signed-off-by: Dave Jiang <[email protected]>
Link: https://lore.kernel.org/r/161352082229.3511254.1002151220537623503.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>

net: geneve: check skb is large enough for IPv4/IPv6 header

Check within geneve_xmit_skb/geneve6_xmit_skb that sk_buff structure
is large enough to include IPv4 or IPv6 header, and reject if not. The
geneve_xmit_skb portion and overall idea was contributed by Eric Dumazet.
Fixes a KMSAN-found uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=abe95dc3e3e9667fc23b8d81f29ecad95c6f106f

Suggested-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Signed-off-by: Phillip Potter <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: davicom: Fix regulator not turned off on failed probe

When the probe fails, we must disable the regulator that was previously
enabled.

This patch is a follow-up to commit ac88c531a5b3
("net: davicom: Fix regulator not turned off on failed probe") which missed
one case.

Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

MAINTAINERS: update maintainer entry for freescale fec driver

Update maintainer entry for freescale fec driver.

Suggested-by: Heiner Kallweit <[email protected]>
Signed-off-by: Joakim Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

m68k: fix flatmem memory model setup

Detected a broken boot on mcf54415, likely introduced from

commit 4bfc848e0981
("m68k/mm: enable use of generic memory_model.h for !DISCONTIGMEM")

Fix ARCH_PFN_OFFSET to be a pfn.

Signed-off-by: Angelo Dureghello <[email protected]>
Acked-by: Mike Rapoport <[email protected]>
Signed-off-by: Greg Ungerer <[email protected]>

Linux 5.12-rc7

Merge tag 'for-5.12-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fix from David Sterba:
"One more patch that we'd like to get to 5.12 before release.

  It's changing where and how the superblock is stored in the zoned
  mode. It is an on-disk format change but so far there are no
  implications for users as the proper mkfs support hasn't been merged
  and is waiting for the kernel side to settle.

  Until now, the superblocks were derived from the zone index, but zone
  size can differ per device. This is changed to be based on fixed
  offset values, to make it independent of the device zone size.

  The work on that got a bit delayed, we discussed the exact locations
  to support potential device sizes and usecases. (Partially delayed
  also due to my vacation.) Having that in the same release where the
  zoned mode is declared usable is highly desired, there are userspace
  projects that need to be updated to recognize the feature. Pushing
  that to the next release would make things harder to test"

* tag 'for-5.12-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: move superblock logging zone location

Merge tag 'locking-urgent-2021-04-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixlets from Ingo Molnar:
"Two minor fixes: one for a Clang warning, the other improves an
  ambiguous/confusing kernel log message"

* tag 'locking-urgent-2021-04-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Address clang -Wformat warning printing for %hd
  lockdep: Add a missing initialization hint to the "INFO: Trying to register non-static key" message

Merge tag 'x86_urgent_for_v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

- Fix the vDSO exception handling return path to disable interrupts
   again.

- A fix for the CE collector to return the proper return values to its
   callers which are used to convey what the collector has done with the
   error address.

* tag 'x86_urgent_for_v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/traps: Correct exc_general_protection() and math_error() return paths
  RAS/CEC: Correct ce_add_elem()'s returned values

Merge branch 'for-5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu

Pull percpu fix from Dennis Zhou:
"This contains a fix for sporadically failing atomic percpu
  allocations.

  I only caught it recently while I was reviewing a new series [1] and
  simultaneously saw reports by btrfs in xfstests [2] and [3].

  In v5.9, memcg accounting was extended to percpu done by adding a
  second type of chunk. I missed an interaction with the free page float
  count used to ensure we can support atomic allocations. If one type of
  chunk has no free pages, but the other has enough to satisfy the free
  page float requirement, we will not repopulate the free pages for the
  former type of chunk. This led to the sporadically failing atomic
  allocations"

Link: https://lore.kernel.org/linux-mm/[email protected]/
Link: https://lore.kernel.org/linux-mm/[email protected]/
Link: https://lore.kernel.org/linux-mm/CAL3q7H5RNBjCi708GH7jnczAOe0BLnacT9C+OBgA-Dx9jhB6SQ@mail.gmail.com/
* 'for-5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  percpu: make pcpu_nr_empty_pop_pages per chunk type

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Seven fixes, all in drivers.

  The hpsa three are the most extensive and the most problematic: it's a
  packed structure misalignment that oopses on ia64 but looks like it
  would also oops on quite a few non-x86 architectures.

  The pm80xx is a regression and the rest are bug fixes for patches in
  the misc tree"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST state
  scsi: target: iscsi: Fix zero tag inside a trace event
  scsi: pm80xx: Fix chip initialization failure
  scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs
  scsi: ufs: core: Fix task management request completion timeout
  scsi: hpsa: Add an assert to prevent __packed reintroduction
  scsi: hpsa: Fix boot on ia64 (atomic_t alignment)
  scsi: hpsa: Use __packed on individual structs, not header-wide

netfilter: arp_tables: add pre_exit hook for table unregister

Same problem that also existed in iptables/ip(6)tables, when
arptable_filter is removed there is no longer a wait period before the
table/ruleset is free'd.

Unregister the hook in pre_exit, then remove the table in the exit
function.
This used to work correctly because the old nf_hook_unregister API
did unconditional synchronize_net.

The per-net hook unregister function uses call_rcu instead.

Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: bridge: add pre_exit hooks for ebtable unregistration

Just like ip/ip6/arptables, the hooks have to be removed, then
synchronize_rcu() has to be called to make sure no more packets are being
processed before the ruleset data is released.

Place the hook unregistration in the pre_exit hook, then call the new
ebtables pre_exit function from there.

Years ago, when first netns support got added for netfilter+ebtables,
this used an older (now removed) netfilter hook unregister API, that did
a unconditional synchronize_rcu().

Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit
handlers may free the ebtable ruleset while packets are still in flight.

This can only happens on module removal, not during netns exit.

The new function expects the table name, not the table struct.

This is because upcoming patch set (targeting -next) will remove all
net->xt.{nat,filter,broute}_table instances, this makes it necessary
to avoid external references to those member variables.

The existing APIs will be converted, so follow the upcoming scheme of
passing name + hook type instead.

Fixes: aee12a0a3727e ("ebtables: remove nf_hook_register usage")
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nft_limit: avoid possible divide error in nft_limit_init

div_u64() divides u64 by u32.

nft_limit_init() wants to divide u64 by u64, use the appropriate
math function (div64_u64)

divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8390 Comm: syz-executor188 Not tainted 5.12.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:div_u64_rem include/linux/math64.h:28 [inline]
RIP: 0010:div_u64 include/linux/math64.h:127 [inline]
RIP: 0010:nft_limit_init+0x2a2/0x5e0 net/netfilter/nft_limit.c:85
Code: ef 4c 01 eb 41 0f 92 c7 48 89 de e8 38 a5 22 fa 4d 85 ff 0f 85 97 02 00 00 e8 ea 9e 22 fa 4c 0f af f3 45 89 ed 31 d2 4c 89 f0 <49> f7 f5 49 89 c6 e8 d3 9e 22 fa 48 8d 7d 48 48 b8 00 00 00 00 00
RSP: 0018:ffffc90009447198 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000200000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff875152e6 RDI: 0000000000000003
RBP: ffff888020f80908 R08: 0000200000000000 R09: 0000000000000000
R10: ffffffff875152d8 R11: 0000000000000000 R12: ffffc90009447270
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 000000000097a300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200001c4 CR3: 0000000026a52000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
nf_tables_newexpr net/netfilter/nf_tables_api.c:2675 [inline]
nft_expr_init+0x145/0x2d0 net/netfilter/nf_tables_api.c:2713
nft_set_elem_expr_alloc+0x27/0x280 net/netfilter/nf_tables_api.c:5160
nf_tables_newset+0x1997/0x3150 net/netfilter/nf_tables_api.c:4321
nfnetlink_rcv_batch+0x85a/0x21b0 net/netfilter/nfnetlink.c:456
nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:580 [inline]
nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:598
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: c26844eda9d4 ("netfilter: nf_tables: Fix nft limit burst handling")
Fixes: 3e0f64b7dd31 ("netfilter: nft_limit: fix packet ratelimiting")
Signed-off-by: Eric Dumazet <[email protected]>
Diagnosed-by: Luigi Rizzo <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

Merge tag 'powerpc-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
"Some some more powerpc fixes for 5.12:

   - Fix an oops triggered by ptrace when CONFIG_PPC_FPU_REGS=n

   - Fix an oops on sigreturn when the VDSO is unmapped on 32-bit

   - Fix vdso_wrapper.o not being rebuilt everytime vdso.so is rebuilt

  Thanks to Christophe Leroy"

* tag 'powerpc-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/vdso: Make sure vdso_wrapper.o is rebuilt everytime vdso.so is rebuilt
  powerpc/signal32: Fix Oops on sigreturn with unmapped VDSO
  powerpc/ptrace: Don't return error when getting/setting FP regs without CONFIG_PPC_FPU_REGS

Merge tag 'driver-core-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
"Here is a single driver core fix for 5.12-rc7 to resolve a reported
  problem that caused some devices to lockup when booting. It has been
  in linux-next with no reported issues"

* tag 'driver-core-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Fix locking bug in deferred_probe_timeout_work_func()

Merge tag 'usb-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt fixes from Greg KH:
"Here are a few small USB and Thunderbolt driver fixes for 5.12-rc7 for
  reported issues:

   - thunderbolt leaks and off-by-one fix

   - cdnsp deque fix

   - usbip fixes for syzbot-reported issues

  All have been in linux-next with no reported problems"

* tag 'usb-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usbip: synchronize event handler with sysfs code paths
  usbip: vudc synchronize sysfs code paths
  usbip: stub-dev synchronize sysfs code paths
  usbip: add sysfs_lock to synchronize sysfs code paths
  thunderbolt: Fix off by one in tb_port_find_retimer()
  thunderbolt: Fix a leak in tb_retimer_add()
  usb: cdnsp: Fixes issue with dequeuing requests after disabling endpoint

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"A mixture of driver and documentation bugfixes for I2C"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: imx: mention Oleksij as maintainer of the binding docs
  i2c: exynos5: correct top kerneldoc
  i2c: designware: Adjust bus_freq_hz when refuse high speed mode set
  i2c: hix5hd2: use the correct HiSilicon copyright
  i2c: gpio: update email address in binding docs
  i2c: imx: drop me as maintainer of binding docs
  i2c: stm32f4: Mundane typo fix
  I2C: JZ4780: Fix bug for Ingenic X1000.
  i2c: turn recovery error on init to debug

btrfs: zoned: move superblock logging zone location

Moves the location of the superblock logging zones. The new locations of
the logging zones are now determined based on fixed block addresses
instead of on fixed zone numbers.

The old placement method based on fixed zone numbers causes problems when
one needs to inspect a file system image without access to the drive zone
information. In such case, the super block locations cannot be reliably
determined as the zone size is unknown. By locating the superblock logging
zones using fixed addresses, we can scan a dumped file system image without
the zone information since a super block copy will always be present at or
after the fixed known locations.

Introduce the following three pairs of zones containing fixed offset
locations, regardless of the device zone size.

  - primary superblock: offset   0B (and the following zone)
  - first copy:         offset 512G (and the following zone)
  - Second copy:        offset   4T (4096G, and the following zone)

If a logging zone is outside of the disk capacity, we do not record the
superblock copy.

The first copy position is much larger than for a non-zoned filesystem,
which is at 64M.  This is to avoid overlapping with the log zones for
the primary superblock. This higher location is arbitrary but allows
supporting devices with very large zone sizes, plus some space around in
between.

Such large zone size is unrealistic and very unlikely to ever be seen in
real devices. Currently, SMR disks have a zone size of 256MB, and we are
expecting ZNS drives to be in the 1-4GB range, so this limit gives us
room to breathe. For now, we only allow zone sizes up to 8GB. The
maximum zone size that would still fit in the space is 256G.

The fixed location addresses are somewhat arbitrary, with the intent of
maintaining superblock reliability for smaller and larger devices, with
the preference for the latter. For this reason, there are two superblocks
under the first 1T. This should cover use cases for physical devices and
for emulated/device-mapper devices.

The superblock logging zones are reserved for superblock logging and
never used for data or metadata blocks. Note that we only reserve the
two zones per primary/copy actually used for superblock logging. We do
not reserve the ranges of zones possibly containing superblocks with the
largest supported zone size (0-16GB, 512G-528GB, 4096G-4112G).

The zones containing the fixed location offsets used to store
superblocks on a non-zoned volume are also reserved to avoid confusion.

Signed-off-by: Naohiro Aota <[email protected]>
Signed-off-by: David Sterba <[email protected]>

Merge branch 'for-5.12/dax' into libnvdimm-fixes

Pick up dax compile fix.

libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC

In case a platform doesn't provide explicit flush-hints but provides an
explicit flush callback via ND_REGION_ASYNC region flag, then
nvdimm_has_flush() still returns '0' indicating that writes do not
require flushing. This happens on PPC64 with patch at [1] applied, where
'deep_flush' of a region was denied even though an explicit flush
function was provided.

Fix this by adding a condition to nvdimm_has_flush() to test for the
ND_REGION_ASYNC flag on the region and see if a 'region->flush' callback
is assigned.

Link: http://lore.kernel.org/r/161703936121.36.7260632399582101498.stgit@e1fbed493c87
Fixes: c5d4355d10d4 ("libnvdimm: nd_region flush callback support")
Reported-by: Shivaprasad G Bhat <[email protected]>
Signed-off-by: Vaibhav Jain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]>

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"Here's the latest pile of clk driver and clk framework fixes for this
  release:

   - Two clk framework fixes for a long standing issue in
     clk_notifier_{register,unregister}() where we used a pointer that
     was for a struct containing a list head when there was no container
     struct

   - A compile warning fix for socfpga that's good to have

   - A double free problem with devm registered fixed factor clks

   - One last fix to the Qualcomm camera clk driver to use the right clk
     ops so clks don't get stuck and stop working because the firmware
     takes them for a ride"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: fixed: fix double free in resource managed fixed-factor clock
  clk: fix invalid usage of list cursor in unregister
  clk: fix invalid usage of list cursor in register
  clk: qcom: camcc: Update the clock ops for the SC7180
  clk: socfpga: fix iomem pointer cast on 64-bit

Merge tag 'perf-tools-fixes-for-v5.12-2020-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool fixes from Arnaldo Carvalho de Melo:

- Fix wrong LBR block sorting in 'perf report'

- Fix 'perf inject' repipe usage when consuming perf.data files

- Avoid potential buffer overrun when decoding ARM SPE hardware tracing
   packets, bug found using a fuzzer

* tag 'perf-tools-fixes-for-v5.12-2020-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf arm-spe: Avoid potential buffer overrun
  perf report: Fix wrong LBR block sorting
  perf inject: Fix repipe usage

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"14 patches.

  Subsystems affected by this patch series: mm (kasan, gup, pagecache,
  and kfence), MAINTAINERS, mailmap, nds32, gcov, ocfs2, ia64, and lib"

* emailed patches from Andrew Morton <[email protected]>:
  lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS
  kfence, x86: fix preemptible warning on KPTI-enabled systems
  lib/test_kasan_module.c: suppress unused var warning
  kasan: fix conflict with page poisoning
  fs: direct-io: fix missing sdio->boundary
  ia64: fix user_stack_pointer() for ptrace()
  ocfs2: fix deadlock between setattr and dio_end_io_write
  gcov: re-fix clang-11+ support
  nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff
  mm/gup: check page posion status for coredump.
  .mailmap: fix old email addresses
  mailmap: update email address for Jordan Crouse
  treewide: change my e-mail address, fix my name
  MAINTAINERS: update CZ.NIC's Turris information

Merge tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.12-rc7, including fixes from can, ipsec,
  mac80211, wireless, and bpf trees.

  No scary regressions here or in the works, but small fixes for 5.12
  changes keep coming.

  Current release - regressions:

   - virtio: do not pull payload in skb->head

   - virtio: ensure mac header is set in virtio_net_hdr_to_skb()

   - Revert "net: correct sk_acceptq_is_full()"

   - mptcp: revert "mptcp: provide subflow aware release function"

   - ethernet: lan743x: fix ethernet frame cutoff issue

   - dsa: fix type was not set for devlink port

   - ethtool: remove link_mode param and derive link params from driver

   - sched: htb: fix null pointer dereference on a null new_q

   - wireless: iwlwifi: Fix softirq/hardirq disabling in
     iwl_pcie_enqueue_hcmd()

   - wireless: iwlwifi: fw: fix notification wait locking

   - wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the
     rtnl dependency

  Current release - new code bugs:

   - napi: fix hangup on napi_disable for threaded napi

   - bpf: take module reference for trampoline in module

   - wireless: mt76: mt7921: fix airtime reporting and related tx hangs

   - wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending
     config command

  Previous releases - regressions:

   - rfkill: revert back to old userspace API by default

   - nfc: fix infinite loop, refcount & memory leaks in LLCP sockets

   - let skb_orphan_partial wake-up waiters

   - xfrm/compat: Cleanup WARN()s that can be user-triggered

   - vxlan, geneve: do not modify the shared tunnel info when PMTU
     triggers an ICMP reply

   - can: fix msg_namelen values depending on CAN_REQUIRED_SIZE

   - can: uapi: mark union inside struct can_frame packed

   - sched: cls: fix action overwrite reference counting

   - sched: cls: fix err handler in tcf_action_init()

   - ethernet: mlxsw: fix ECN marking in tunnel decapsulation

   - ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx

   - ethernet: i40e: fix receiving of single packets in xsk zero-copy
     mode

   - ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic

  Previous releases - always broken:

   - bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET

   - bpf: Refcount task stack in bpf_get_task_stack

   - bpf, x86: Validate computation of branch displacements

   - ieee802154: fix many similar syzbot-found bugs
       - fix NULL dereferences in netlink attribute handling
       - reject unsupported operations on monitor interfaces
       - fix error handling in llsec_key_alloc()

   - xfrm: make ipv4 pmtu check honor ip header df

   - xfrm: make hash generation lock per network namespace

   - xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp
     offload

   - ethtool: fix incorrect datatype in set_eee ops

   - xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory
     model

   - openvswitch: fix send of uninitialized stack memory in ct limit
     reply

  Misc:

   - udp: add get handling for UDP_GRO sockopt"

* tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits)
  net: fix hangup on napi_disable for threaded napi
  net: hns3: Trivial spell fix in hns3 driver
  lan743x: fix ethernet frame cutoff issue
  net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
  net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
  net: dsa: lantiq_gswip: Don't use PHY auto polling
  net: sched: sch_teql: fix null-pointer dereference
  ipv6: report errors for iftoken via netlink extack
  net: sched: fix err handler in tcf_action_init()
  net: sched: fix action overwrite reference counting
  Revert "net: sched: bump refcount for new action in ACT replace mode"
  ice: fix memory leak of aRFS after resuming from suspend
  i40e: Fix sparse warning: missing error code 'err'
  i40e: Fix sparse error: 'vsi->netdev' could be null
  i40e: Fix sparse error: uninitialized symbol 'ring'
  i40e: Fix sparse errors in i40e_txrx.c
  i40e: Fix parameters in aq_get_phy_register()
  nl80211: fix beacon head validation
  bpf, x86: Validate computation of branch displacements for x86-32
  bpf, x86: Validate computation of branch displacements for x86-64
  ...

Merge tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
"Two minor fixups for the reissue logic, and one for making sure that
  unbounded work is canceled on io-wq exit"

* tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block:
  io-wq: cancel unbounded works on io-wq destroy
  io_uring: fix rw req completion
  io_uring: clear F_REISSUE right after getting it

lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS

When LATENCYTOP, LOCKDEP, or FAULT_INJECTION_STACKTRACE_FILTER is
enabled and ARCH_WANT_FRAME_POINTERS is disabled, Kbuild gives a warning
such as:

  WARNING: unmet direct dependencies detected for FRAME_POINTER
    Depends on [n]: DEBUG_KERNEL [=y] && (M68K || UML || SUPERH) || ARCH_WANT_FRAME_POINTERS [=n] || MCOUNT [=n]
    Selected by [y]:
    - LATENCYTOP [=y] && DEBUG_KERNEL [=y] && STACKTRACE_SUPPORT [=y] && PROC_FS [=y] && !MIPS && !PPC && !S390 && !MICROBLAZE && !ARM && !ARC && !X86

Depending on ARCH_WANT_FRAME_POINTERS causes a recursive dependency
error.  ARCH_WANT_FRAME_POINTERS is to be selected by the architecture,
and is not supposed to be overridden by other config options.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Julian Braha <[email protected]>
Cc: Andreas Schwab <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Necip Fazil Yildiran <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kfence, x86: fix preemptible warning on KPTI-enabled systems

On systems with KPTI enabled, we can currently observe the following
warning:

  BUG: using smp_processor_id() in preemptible
  caller is invalidate_user_asid+0x13/0x50
  CPU: 6 PID: 1075 Comm: dmesg Not tainted 5.12.0-rc4-gda4a2b1a5479-kfence_1+ #1
  Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
  Call Trace:
   dump_stack+0x7f/0xad
   check_preemption_disabled+0xc8/0xd0
   invalidate_user_asid+0x13/0x50
   flush_tlb_one_kernel+0x5/0x20
   kfence_protect+0x56/0x80
   ...

While it normally makes sense to require preemption to be off, so that
the expected CPU's TLB is flushed and not another, in our case it really
is best-effort (see comments in kfence_protect_page()).

Avoid the warning by disabling preemption around flush_tlb_one_kernel().

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Marco Elver <[email protected]>
Reported-by: Tomi Sarvela <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/test_kasan_module.c: suppress unused var warning

Local `unused' is intentionally unused - it is there to suppress
__must_check warnings.

Reported-by: kernel test robot <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Cc: Marco Elver <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: fix conflict with page poisoning

When page poisoning is enabled, it accesses memory that is marked as
poisoned by KASAN, which leas to false-positive KASAN reports.

Suppress the reports by adding KASAN annotations to unpoison_page()
(poison_page() already has them).

Link: https://lkml.kernel.org/r/2dc799014d31ac13fd97bd906bad33e16376fc67.1617118501.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs: direct-io: fix missing sdio->boundary

I encountered a hung task issue, but not a performance one.  I run DIO
on a device (need lba continuous, for example open channel ssd), maybe
hungtask in below case:

  DIO: Checkpoint:
  get addr A(at boundary), merge into BIO,
  no submit because boundary missing
flush dirty data(get addr A+1), wait IO(A+1)
writeback timeout, because DIO(A) didn't submit
  get addr A+2 fail, because checkpoint is doing

dio_send_cur_page() may clear sdio->boundary, so prevent it from missing
a boundary.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: b1058b981272 ("direct-io: submit bio after boundary buffer is added to it")
Signed-off-by: Jack Qiu <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ia64: fix user_stack_pointer() for ptrace()

ia64 has two stacks:

- memory stack (or stack), pointed at by by r12

- register backing store (register stack), pointed at by
   ar.bsp/ar.bspstore with complications around dirty
   register frame on CPU.

In [1] Dmitry noticed that PTRACE_GET_SYSCALL_INFO returns the register
stack instead memory stack.

The bug comes from the fact that user_stack_pointer() and
current_user_stack_pointer() don't return the same register:

  ulong user_stack_pointer(struct pt_regs *regs) { return regs->ar_bspstore; }
  #define current_user_stack_pointer() (current_pt_regs()->r12)

The change gets both back in sync.

I think ptrace(PTRACE_GET_SYSCALL_INFO) is the only affected user by
this bug on ia64.

The change fixes 'rt_sigreturn.gen.test' strace test where it was
observed initially.

Link: https://bugs.gentoo.org/769614
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sergei Trofimovich <[email protected]>
Reported-by: Dmitry V. Levin <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: fix deadlock between setattr and dio_end_io_write

The following deadlock is detected:

  truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).

  PID: 14827  TASK: ffff881686a9af80  CPU: 20  COMMAND: "ora_p005_hrltd9"
   #0  __schedule at ffffffff818667cc
   #1  schedule at ffffffff81866de6
   #2  inode_dio_wait at ffffffff812a2d04
   #3  ocfs2_setattr at ffffffffc05f322e [ocfs2]
   #4  notify_change at ffffffff812a5a09
   #5  do_truncate at ffffffff812808f5
   #6  do_sys_ftruncate.constprop.18 at ffffffff81280cf2
   #7  sys_ftruncate at ffffffff81280d8e
   #8  do_syscall_64 at ffffffff81003949
   #9  entry_SYSCALL_64_after_hwframe at ffffffff81a001ad

dio completion path is going to complete one direct IO (decrement
inode->i_dio_count), but before that it hung at locking inode->i_rwsem:

   #0  __schedule+700 at ffffffff818667cc
   #1  schedule+54 at ffffffff81866de6
   #2  rwsem_down_write_failed+536 at ffffffff8186aa28
   #3  call_rwsem_down_write_failed+23 at ffffffff8185a1b7
   #4  down_write+45 at ffffffff81869c9d
   #5  ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
   #6  ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
   #7  dio_complete+140 at ffffffff812c873c
   #8  dio_aio_complete_work+25 at ffffffff812c89f9
   #9  process_one_work+361 at ffffffff810b1889
  #10  worker_thread+77 at ffffffff810b233d
  #11  kthread+261 at ffffffff810b7fd5
  #12  ret_from_fork+62 at ffffffff81a0035e

Thus above forms ABBA deadlock.  The same deadlock was mentioned in
upstream commit 28f5a8a7c033 ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()").  It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.

End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.

This is to fix the deadlock itself.  It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.

[[email protected]: remove the "had_alloc_lock" as suggested]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wengang Wang <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

gcov: re-fix clang-11+ support

LLVM changed the expected function signature for llvm_gcda_emit_function()
in the clang-11 release.  Users of clang-11 or newer may have noticed
their kernels producing invalid coverage information:

  $ llvm-cov gcov -a -c -u -f -b <input>.gcda -- gcno=<input>.gcno
  1 <func>: checksum mismatch, \
    (<lineno chksum A>, <cfg chksum B>) != (<lineno chksum A>, <cfg chksum C>)
  2 Invalid .gcda File!
  ...

Fix up the function signatures so calling this function interprets its
parameters correctly and computes the correct cfg checksum.  In
particular, in clang-11, the additional checksum is no longer optional.

Link: https://reviews.llvm.org/rG25544ce2df0daa4304c07e64b9c8b0f7df60c11d
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Prasad Sodagudi <[email protected]>
Tested-by: Prasad Sodagudi <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Cc: <[email protected]> [5.4+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff

Commit cb9f753a3731 ("mm: fix races between swapoff and flush dcache")
updated flush_dcache_page implementations on several architectures to
use page_mapping_file() in order to avoid races between page_mapping()
and swapoff().

This update missed arch/nds32 and there is a possibility of a race
there.

Replace page_mapping() with page_mapping_file() in nds32 implementation
of flush_dcache_page().

Link: https://lkml.kernel.org/r/[email protected]
Fixes: cb9f753a3731 ("mm: fix races between swapoff and flush dcache")
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Greentime Hu <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/gup: check page posion status for coredump.

When we do coredump for user process signal, this may be an SIGBUS signal
with BUS_MCEERR_AR or BUS_MCEERR_AO code, which means this signal is
resulted from ECC memory fail like SRAR or SRAO, we expect the memory
recovery work is finished correctly, then the get_dump_page() will not
return the error page as its process pte is set invalid by
memory_failure().

But memory_failure() may fail, and the process's related pte may not be
correctly set invalid, for current code, we will return the poison page,
get it dumped, and then lead to system panic as its in kernel code.

So check the poison status in get_dump_page(), and if TRUE, return NULL.

There maybe other scenario that is also better to check the posion status
and not to panic, so make a wrapper for this check, Thanks to David's
suggestion(<[email protected]>).

[[email protected]: s/0/false/]
[[email protected]: is_page_poisoned() arg cannot be null, per Matthew]

Link: https://lkml.kernel.org/r/20210322115233.05e4e82a@alex-virtual-machine
Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine
Signed-off-by: Aili Yao <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Aili Yao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

.mailmap: fix old email addresses

Update Nick & Nadia's old addresses.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Nadia Yvette Chambers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mailmap: update email address for Jordan Crouse

jcrouse at codeaurora.org has started bouncing. Redirect to a more
permanent address.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jordan Crouse <[email protected]>
Cc: Alexander Lobakin <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

treewide: change my e-mail address, fix my name

Change my e-mail address to [email protected], and fix my name in
non-code parts (add diacritical mark).

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Marek Behún <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jassi Brar <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Pavel Machek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: update CZ.NIC's Turris information

Add all the files maintained by Turris team, not only for MOX, but also
for Omnia. Change website.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Marek Behún <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Jassi Brar <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

- Fix fw_devlink failure with ".*,nr-gpios" properties

- Doc link reference fixes from Mauro

- Fixes for unaligned FDT handling found on OpenRisc. First, avoid
   crash with better error handling when unflattening an unaligned FDT.
   Second, fix memory allocations for FDTs to ensure alignment.

* tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: property: fw_devlink: do not link ".*,nr-gpios"
  dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference
  dt-bindings: fix references for iio-bindings.txt
  dt-bindings: don't use ../dir for doc references
  of: unittest: overlay: ensure proper alignment of copied FDT
  of: properly check for error returned by fdt_get_name()

Merge tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Was relatively quiet this week, but still a few pulls came in, pretty
  much small fixes across the board, a couple of regression fixes in the
  amdgpu/radeon code, msm has a few minor fixes across the board, a
  panel regression fix also.

  amdgpu:
   - DCN3 fix
   - Fix CAC setting regression for TOPAZ
   - Fix ttm regression

  radeon:
   - Fix ttm regression

  msm:
   - a5xx/a6xx timestamp fix
   - microcode version check
   - fail path fix
   - block programming fix
   - error removal fix

  i915:
   - Fix invalid access to ACPI _DSM objects

  xen:
   - Fix use-after-free in xen
   - minor duplicate defintion cleanup

  vc4:
   - Reduce fifo threshold on hvs4 to fix a fifo full error
   - minor redunantant assignment cleanup

  panel:
   - Disable TE support for Droid4 and N950"

* tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm:
  drm/vc4: crtc: Reduce PV fifo threshold on hvs4
  drm/vc4: plane: Remove redundant assignment
  drm/amdgpu/smu7: fix CAC setting on TOPAZ
  drm/radeon: Fix size overflow
  drm/amdgpu: Fix size overflow
  drm/i915: Fix invalid access to ACPI _DSM objects
  drm/amd/display: Add missing mask for DCN3
  drm/panel: panel-dsi-cm: disable TE for now
  drm/msm/disp/dpu1: program 3d_merge only if block is attached
  drm/msm: a6xx: fix version check for the A650 SQE microcode
  drm/msm: Fix a5xx/a6xx timestamps
  drm/msm: Fix removal of valid error case when checking speed_bin
  drm/msm: Set drvdata to NULL when msm_drm_init() fails
  drivers: gpu: drm: xen_drm_front_drm_info is declared twice
  gpu/xen: Fix a use after free in xen_drm_drv_init

net: fix hangup on napi_disable for threaded napi

napi_disable() is subject to an hangup, when the threaded
mode is enabled and the napi is under heavy traffic.

If the relevant napi has been scheduled and the napi_disable()
kicks in before the next napi_threaded_wait() completes - so
that the latter quits due to the napi_disable_pending() condition,
the existing code leaves the NAPI_STATE_SCHED bit set and the
napi_disable() loop waiting for such bit will hang.

This patch addresses the issue by dropping the NAPI_STATE_DISABLE
bit test in napi_thread_wait(). The later napi_threaded_poll()
iteration will take care of clearing the NAPI_STATE_SCHED.

This also addresses a related problem reported by Jakub:
before this patch a napi_disable()/napi_enable() pair killed
the napi thread, effectively disabling the threaded mode.
On the patched kernel napi_disable() simply stops scheduling
the relevant thread.

v1 -> v2:
- let the main napi_thread_poll() loop clear the SCHED bit

Reported-by: Jakub Kicinski <[email protected]>
Fixes: 29863d41bb6e ("net: implement threaded-able napi poll loop support")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/883923fa22745a9589e8610962b7dc59df09fb1f.1617981844.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <[email protected]>

net: hns3: Trivial spell fix in hns3 driver

Some trivial spelling mistakes which caught my eye during the
review of the code.

Signed-off-by: Salil Mehta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

lan743x: fix ethernet frame cutoff issue

The ethernet frame length is calculated incorrectly. Depending on
the value of RX_HEAD_PADDING, this may result in ethernet frames
that are too short (cut off at the end), or too long (garbage added
to the end).

Fix by calculating the ethernet frame length correctly. For added
clarity, use the ETH_FCS_LEN constant in the calculation.

Many thanks to Heiner Kallweit for suggesting this solution.

Suggested-by: Heiner Kallweit <[email protected]>
Fixes: 3e21a10fdea3 ("lan743x: trim all 4 bytes of the FCS; not just 2")
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Sven Van Asbroeck <[email protected]>
Reviewed-by: George McCollister <[email protected]>
Tested-by: George McCollister <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

of: property: fw_devlink: do not link ".*,nr-gpios"

[<vendor>,]nr-gpios property is used by some GPIO drivers[0] to indicate
the number of GPIOs present on a system, not define a GPIO. nr-gpios is
not configured by #gpio-cells and can't be parsed along with other
"*-gpios" properties.

nr-gpios without the "<vendor>," prefix is not allowed by the DT
spec[1], so only add exception for the ",nr-gpios" suffix and let the
error message continue being printed for non-compliant implementations.

[0] nr-gpios is referenced in Documentation/devicetree/bindings/gpio:
- gpio-adnp.txt
- gpio-xgene-sb.txt
- gpio-xlp.txt
- snps,dw-apb-gpio.yaml

Link: https://github.com/devicetree-org/dt-schema/blob/cb53a16a1eb3e2169ce170c071e47940845ec26e/schemas/gpio/gpio-consumer.yaml#L20
Fixes errors such as:
OF: /palmbus@300000/gpio@600: could not find phandle

Fixes: 7f00be96f125 ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)")
Signed-off-by: Ilya Lipnitskiy <[email protected]>
Cc: Saravana Kannan <[email protected]>
Cc: [email protected] # v5.5+
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>

dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference

Changeset 1ca9d1b1342d ("dt-bindings:iio:adc:motorola,cpcap-adc yaml conversion")
renamed: Documentation/devicetree/bindings/iio/adc/cpcap-adc.txt
to: Documentation/devicetree/bindings/iio/adc/motorola,cpcap-adc.yaml.

Update its cross-reference accordingly.

Fixes: 1ca9d1b1342d ("dt-bindings:iio:adc:motorola,cpcap-adc yaml conversion")
Acked-by: Jonathan Cameron <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Link: https://lore.kernel.org/r/3e205e5fa701e4bc15d39d6ac1f57717df2bb4c6.1617972339.git.mchehab+huawei@kernel.org
Signed-off-by: Rob Herring <[email protected]>

dt-bindings: fix references for iio-bindings.txt

The iio-bindings.txt was converted into two files and merged
at the dt-schema git tree at:

https://github.com/devicetree-org/dt-schema

Yet, some documents still refer to the old file. Fix their
references, in order to point to the right URL.

Fixes: dba91f82d580 ("dt-bindings:iio:iio-binding.txt Drop file as content now in dt-schema")
Acked-by: Jonathan Cameron <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Link: https://lore.kernel.org/r/4efd81eca266ca0875d3bf9d1672097444146c69.1617972339.git.mchehab+huawei@kernel.org
Signed-off-by: Rob Herring <[email protected]>

dt-bindings: don't use ../dir for doc references

As documents have been renamed and moved around, their
references will break, but this will be unnoticed, as the
script which checks for it won't handle "../" references.

So, replace them by the full patch.

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Link: https://lore.kernel.org/r/68d3a1244119d1f2829c375b0ef554cf348bc89f.1617972339.git.mchehab+huawei@kernel.org
Signed-off-by: Rob Herring <[email protected]>

Merge tag 'drm-intel-fixes-2021-04-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix invalid access to ACPI _DSM objects (Takashi)

Signed-off-by: Dave Airlie <[email protected]>
From: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]