Git Repo - linux.git/log

rtc: au1xxx: convert to SPDX identifier

Use SPDX-License-Identifier instead of a verbose license text.

Signed-off-by: Nobuhiro Iwamatsu <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Revert "PCI: Coalesce host bridge contiguous apertures"

This reverts commit 65db04053efea3f3e412a7e0cc599962999c96b4.

Guenter reported that after 65db04053efe, the ppc:sam460ex qemu emulation
no longer boots from nvme:

nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0
nvme nvme0: Removing after probe failure status: -19

Link: https://lore.kernel.org/r/[email protected]
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

rtc: pcf85063: Update the PCF85063A datasheet revision

After updating the datasheet URL, the PCF85063A datasheet revision
has changed.

Adjust it accordingly.

Reported-by: Nobuhiro Iwamatsu <[email protected]>
Signed-off-by: Fabio Estevam <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

dt-bindings: rtc: ti,bq32k: take maintainership

Take maintainership of the binding as PAvel said he doesn't have the
hardware anymore.

Signed-off-by: Alexandre Belloni <[email protected]>
Acked-by: Pavel Machek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2021-07-09

The following pull-request contains BPF updates for your *net* tree.

We've added 9 non-merge commits during the last 9 day(s) which contain
a total of 13 files changed, 118 insertions(+), 62 deletions(-).

The main changes are:

1) Fix runqslower task->state access from BPF, from SanjayKumar Jeyakumar.

2) Fix subprog poke descriptor tracking use-after-free, from John Fastabend.

3) Fix sparse complaint from prior devmap RCU conversion, from Toke Høiland-Jørgensen.

4) Fix missing va_end in bpftool JIT json dump's error path, from Gu Shengxian.

5) Fix tools/bpf install target from missing runqslower install, from Wei Li.

6) Fix xdpsock BPF sample to unload program on shared umem option, from Wang Hai.
====================

Signed-off-by: David S. Miller <[email protected]>

net: validate lwtstate->data before returning from skb_tunnel_info()

skb_tunnel_info() returns pointer of lwtstate->data as ip_tunnel_info
type without validation. lwtstate->data can have various types such as
mpls_iptunnel_encap, etc and these are not compatible.
So skb_tunnel_info() should validate before returning that pointer.

Splat looks like:
BUG: KASAN: slab-out-of-bounds in vxlan_get_route+0x418/0x4b0 [vxlan]
Read of size 2 at addr ffff888106ec2698 by task ping/811

CPU: 1 PID: 811 Comm: ping Not tainted 5.13.0+ #1195
Call Trace:
dump_stack_lvl+0x56/0x7b
print_address_description.constprop.8.cold.13+0x13/0x2ee
? vxlan_get_route+0x418/0x4b0 [vxlan]
? vxlan_get_route+0x418/0x4b0 [vxlan]
kasan_report.cold.14+0x83/0xdf
? vxlan_get_route+0x418/0x4b0 [vxlan]
vxlan_get_route+0x418/0x4b0 [vxlan]
[ ... ]
vxlan_xmit_one+0x148b/0x32b0 [vxlan]
[ ... ]
vxlan_xmit+0x25c5/0x4780 [vxlan]
[ ... ]
dev_hard_start_xmit+0x1ae/0x6e0
__dev_queue_xmit+0x1f39/0x31a0
[ ... ]
neigh_xmit+0x2f9/0x940
mpls_xmit+0x911/0x1600 [mpls_iptunnel]
lwtunnel_xmit+0x18f/0x450
ip_finish_output2+0x867/0x2040
[ ... ]

Fixes: 61adedf3e3f1 ("route: move lwtunnel state to dst_entry")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ip_tunnel: fix mtu calculation for ETHER tunnel devices

Commit 28e104d00281 ("net: ip_tunnel: fix mtu calculation") removed
dev->hard_header_len subtraction when calculate MTU for tunnel devices
as there is an overhead for device that has header_ops.

But there are ETHER tunnel devices, like gre_tap or erspan, which don't
have header_ops but set dev->hard_header_len during setup. This makes
pkts greater than (MTU - ETH_HLEN) could not be xmited. Fix it by
subtracting the ETHER tunnel devices' dev->hard_header_len for MTU
calculation.

Fixes: 28e104d00281 ("net: ip_tunnel: fix mtu calculation")
Reported-by: Jianlin Shi <[email protected]>
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'io_uring-5.14-2021-07-09' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
"A few fixes that should go into this merge.

  One fixes a regression introduced in this release, others are just
  generic fixes, mostly related to handling fallback task_work"

* tag 'io_uring-5.14-2021-07-09' of git://git.kernel.dk/linux-block:
  io_uring: remove dead non-zero 'poll' check
  io_uring: mitigate unlikely iopoll lag
  io_uring: fix drain alloc fail return code
  io_uring: fix exiting io_req_task_work_add leaks
  io_uring: simplify task_work func
  io_uring: fix stuck fallback reqs

Merge tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
"A combination of changes that ended up depending on both the driver
  and core branch (and/or the IDE removal), and a few late arriving
  fixes. In detail:

   - Fix io ticks wrap-around issue (Chunguang)

   - nvme-tcp sock locking fix (Maurizio)

   - s390-dasd fixes (Kees, Christoph)

   - blk_execute_rq polling support (Keith)

   - blk-cgroup RCU iteration fix (Yu)

   - nbd backend ID addition (Prasanna)

   - Partition deletion fix (Yufen)

   - Use blk_mq_alloc_disk for mmc, mtip32xx, ubd (Christoph)

   - Removal of now dead block request types due to IDE removal
     (Christoph)

   - Loop probing and control device cleanups (Christoph)

   - Device uevent fix (Christoph)

   - Misc cleanups/fixes (Tetsuo, Christoph)"

* tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block: (34 commits)
  blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs
  block: fix the problem of io_ticks becoming smaller
  nvme-tcp: can't set sk_user_data without write_lock
  loop: remove unused variable in loop_set_status()
  block: remove the bdgrab in blk_drop_partitions
  block: grab a device refcount in disk_uevent
  s390/dasd: Avoid field over-reading memcpy()
  dasd: unexport dasd_set_target_state
  block: check disk exist before trying to add partition
  ubd: remove dead code in ubd_setup_common
  nvme: use return value from blk_execute_rq()
  block: return errors from blk_execute_rq()
  nvme: use blk_execute_rq() for passthrough commands
  block: support polling through blk_execute_rq
  block: remove REQ_OP_SCSI_{IN,OUT}
  block: mark blk_mq_init_queue_data static
  loop: rewrite loop_exit using idr_for_each_entry
  loop: split loop_lookup
  loop: don't allow deleting an unspecified loop device
  loop: move loop_ctl_mutex locking into loop_add
  ...

Merge tag 'sound-fix-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Just a collection of small fixes here: the most outstanding one is the
  re-application of USB-audio lowlatency support that was reverted in
  the previous PR. The rest are device-specific quirks/fixes, spelling
  fixes and a regression fix for the old intel8x0 driver"

* tag 'sound-fix-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: intel8x0: Fix breakage at ac97 clock measurement
  ALSA: usb-audio: Reduce latency at playback start, take#2
  ALSA: isa: Fix error return code in snd_cmi8330_probe()
  ALSA: emux: fix spelling mistakes
  ALSA: usb-audio: fix spelling mistakes
  ALSA: bebob: correct duplicated entries with TerraTec OUI
  ALSA: usx2y: fix spelling mistakes
  ALSA: x86: fix spelling mistakes
  ALSA: hda/realtek: fix mute led of the HP Pavilion 15-eh1xxx series

perf test: Add free() calls for scandir() returned dirent entries

ASan reported a memory leak for items of the entlist returned from scandir().

In fact, scandir() returns a malloc'd array of malloc'd dirents.

This patch adds the missing (z)frees.

Fixes: da963834fe6975a1 ("perf test: Iterate over shell tests in alphabetical order")
Signed-off-by: Riccardo Mancini <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Fabian Hemmer <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Remi Bernon <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

libperf: Add tests for perf_evlist__set_leader()

Add a test for the newly added perf_evlist__set_leader() function.

Committer testing:

  $ cd tools/lib/perf/
  $ sudo make tests
  [sudo] password for acme:
  running static:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...OK
  - running tests/test-evsel.c...OK
  running dynamic:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...OK
  - running tests/test-evsel.c...OK
  $

Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

libperf: Remove BUG_ON() from library code in get_group_fd()

We shouldn't just panic, return a value that doesn't clash with what
perf_evsel__open() was already returning in case of error, i.e. errno
when sys_perf_event_open() fails.

Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

net: do not reuse skbuff allocated from skbuff_fclone_cache in the skb cache

Some socket buffers allocated in the fclone cache (in __alloc_skb) can
end-up in the following path[1]:

napi_skb_finish
  __kfree_skb_defer
    napi_skb_cache_put

The issue is napi_skb_cache_put is not fclone friendly and will put
those skbuff in the skb cache to be reused later, although this cache
only expects skbuff allocated from skbuff_head_cache. When this happens
the skbuff is eventually freed using the wrong origin cache, and we can
see traces similar to:

[ 1223.947534] cache_from_obj: Wrong slab cache. skbuff_head_cache but object is from skbuff_fclone_cache
[ 1223.948895] WARNING: CPU: 3 PID: 0 at mm/slab.h:442 kmem_cache_free+0x251/0x3e0
[ 1223.950211] Modules linked in:
[ 1223.950680] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.13.0+ #474
[ 1223.951587] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-3.fc34 04/01/2014
[ 1223.953060] RIP: 0010:kmem_cache_free+0x251/0x3e0

Leading sometimes to other memory related issues.

Fix this by using __kfree_skb for fclone skbuff, similar to what is done
the other place __kfree_skb_defer is called.

[1] At least in setups using veth pairs and tunnels. Building a kernel
    with KASAN we can for example see packets allocated in
    sk_stream_alloc_skb hit the above path and later the issue arises
    when the skbuff is reused.

Fixes: 9243adfc311a ("skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing")
Cc: Alexander Lobakin <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path

sk_wmem_schedule makes sure that sk_forward_alloc has enough
bytes for charging that is going to be done by sk_mem_charge.

In the transmit zerocopy path, there is sk_mem_charge but there was
no call to sk_wmem_schedule. This change adds that call.

Without this call to sk_wmem_schedule, sk_forward_alloc can go
negetive which is a bug because sk_forward_alloc is a per-socket
space that has been forward charged so this can't be negative.

Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Talal Ahmad <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Wei Wang <[email protected]>
Reviewed-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: send SYNACK packet with accepted fwmark

commit e05a90ec9e16 ("net: reflect mark on tcp syn ack packets")
fixed IPv4 only.

This part is for the IPv6 side.

Fixes: e05a90ec9e16 ("net: reflect mark on tcp syn ack packets")
Signed-off-by: Alexander Ovechkin <[email protected]>
Acked-by: Dmitry Yakunin <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix and cleanup from Steven Rostedt:
"Tracing fix for histograms and a clean up in ftrace:

   - Fixed a bug that broke the .sym-offset modifier and added a test to
     make sure nothing breaks it again.

   - Replace a list_del/list_add() with a list_move()"

* tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Use list_move instead of list_del/list_add
  tracing/selftests: Add tests to test histogram sym and sym-offset modifiers
  tracing/histograms: Fix parsing of "sym-offset" modifier

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio,vhost,vdpa updates from Michael Tsirkin:

- Doorbell remapping for ifcvf, mlx5

- virtio_vdpa support for mlx5

- Validate device input in several drivers (for SEV and friends)

- ZONE_MOVABLE aware handling in virtio-mem

- Misc fixes, cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (48 commits)
  virtio-mem: prioritize unplug from ZONE_MOVABLE in Big Block Mode
  virtio-mem: simplify high-level unplug handling in Big Block Mode
  virtio-mem: prioritize unplug from ZONE_MOVABLE in Sub Block Mode
  virtio-mem: simplify high-level unplug handling in Sub Block Mode
  virtio-mem: simplify high-level plug handling in Sub Block Mode
  virtio-mem: use page_zonenum() in virtio_mem_fake_offline()
  virtio-mem: don't read big block size in Sub Block Mode
  virtio/vdpa: clear the virtqueue state during probe
  vp_vdpa: allow set vq state to initial state after reset
  virtio-pci library: introduce vp_modern_get_driver_features()
  vdpa: support packed virtqueue for set/get_vq_state()
  virtio-ring: store DMA metadata in desc_extra for split virtqueue
  virtio: use err label in __vring_new_virtqueue()
  virtio_ring: introduce virtqueue_desc_add_split()
  virtio_ring: secure handling of mapping errors
  virtio-ring: factor out desc_extra allocation
  virtio_ring: rename vring_desc_extra_packed
  virtio-ring: maintain next in extra state for packed virtqueue
  vdpa/mlx5: Clear vq ready indication upon device reset
  vdpa/mlx5: Add support for doorbell bypassing
  ...

cifs: update internal version number

To 2.33

Signed-off-by: Steve French <[email protected]>

net: ti: fix UAF in tlan_remove_one

priv is netdev private data and it cannot be
used after free_netdev() call. Using priv after free_netdev()
can cause UAF bug. Fix it by moving free_netdev() at the end of the
function.

Fixes: 1e0a8b13d355 ("tlan: cancel work at remove path")
Signed-off-by: Pavel Skripkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: qcom/emac: fix UAF in emac_remove

adpt is netdev private data and it cannot be
used after free_netdev() call. Using adpt after free_netdev()
can cause UAF bug. Fix it by moving free_netdev() at the end of the
function.

Fixes: 54e19bc74f33 ("net: qcom/emac: do not use devm on internal phy pdev")
Signed-off-by: Pavel Skripkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: moxa: fix UAF in moxart_mac_probe

In case of netdev registration failure the code path will
jump to init_fail label:

init_fail:
netdev_err(ndev, "init failed\n");
moxart_mac_free_memory(ndev);
irq_map_fail:
free_netdev(ndev);
return ret;

So, there is no need to call free_netdev() before jumping
to error handling path, since it can cause UAF or double-free
bug.

Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver")
Signed-off-by: Pavel Skripkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:

- Regression fix in drbg due to missing self-test for new default
   algorithm

- Add ratelimit on user-triggerable message in qat

- Fix build failure due to missing dependency in sl3516

- Remove obsolete PageSlab checks

- Fix bogus hardware register writes on Kunpeng920 in hisilicon/sec

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: hisilicon/sec - fix the process of disabling sva prefetching
  crypto: sl3516 - Add dependency on ARCH_GEMINI
  crypto: sl3516 - Typo s/Stormlink/Storlink/
  crypto: drbg - self test for HMAC(SHA-512)
  crypto: omap - Drop obsolete PageSlab check
  crypto: scatterwalk - Remove obsolete PageSlab check
  crypto: qat - ratelimit invalid ioctl message and print the invalid cmd

cifs: prevent NULL deref in cifs_compose_mount_options()

The optional @ref parameter might contain an NULL node_name, so
prevent dereferencing it in cifs_compose_mount_options().

Addresses-Coverity: 1476408 ("Explicit null dereferenced")
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>

libperf: Add group support to perf_evsel__open()

Add support to set group_fd in perf_evsel__open() and make it follow the
group setup.

Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

SMB3.1.1: Add support for negotiating signing algorithm

Support for faster packet signing (using GMAC instead of CMAC) can
now be negotiated to some newer servers, including Windows.
See MS-SMB2 section 2.2.3.17.

This patch adds support for sending the new negotiate context
with the first of three supported signing algorithms (AES-CMAC)
and decoding the response. A followon patch will add support
for sending the other two (including AES-GMAC, which is fastest)
and changing the signing algorithm used based on what was
negotiated.

To allow the client to request GMAC signing set module parameter
"enable_negotiate_signing" to 1.

Reviewed-by: Ronnie Sahlberg <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge branch 'arm/fixes' into arm/soc

Merging in the last batch of fixes that didn't go in before previous
release, just a few smaller DT fixups and a MAINTAINERS update

* arm/fixes: (29 commits)
  MAINTAINERS: Add myself as TEE subsystem reviewer
  ARM: dts: ux500: Fix LED probing
  arm64: dts: rockchip: Update RK3399 PCI host bridge window to 32-bit address memory
  arm64: dts: allwinner: a64-sopine-baseboard: change RGMII mode to TXID
  arm64: meson: select COMMON_CLK
  soc: amlogic: meson-clk-measure: remove redundant dev_err call in meson_msr_probe()
  ARM: dts: qcom: sdx55-telit: Represent secure-regions as 64-bit elements
  ARM: dts: qcom: sdx55-t55: Represent secure-regions as 64-bit elements
  ARM: dts: sun8i: h3: orangepi-plus: Fix ethernet phy-mode
  ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells
  ARM: dts: imx7d-pico: Fix the 'tuning-step' property
  ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property
  arm64: dts: freescale: sl28: var1: fix RGMII clock and voltage
  arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage
  ARM: imx: pm-imx27: Include "common.h"
  arm64: dts: zii-ultra: fix 12V_MAIN voltage
  arm64: dts: zii-ultra: remove second GEN_3V3 regulator instance
  arm64: dts: ls1028a: fix memory node
  optee: use export_uuid() to copy client UUID
  arm64: dts: ti: k3*: Introduce reg definition for interrupt routers
  ...

Merge tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
"We have a handful of new features for 5.14:

   - Support for transparent huge pages.

   - Support for generic PCI resources mapping.

   - Support for the mem= kernel parameter.

   - Support for KFENCE.

   - A handful of fixes to avoid W+X mappings in the kernel.

   - Support for VMAP_STACK based overflow detection.

   - An optimized copy_{to,from}_user"

* tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits)
  riscv: xip: Fix duplicate included asm/pgtable.h
  riscv: Fix PTDUMP output now BPF region moved back to module region
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
  riscv: add VMAP_STACK overflow detection
  riscv: ptrace: add argn syntax
  riscv: mm: fix build errors caused by mk_pmd()
  riscv: Introduce structure that group all variables regarding kernel mapping
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Enable KFENCE for riscv64
  RISC-V: Use asm-generic for {in,out}{bwlq}
  riscv: add ASID-based tlbflushing methods
  riscv: pass the mm_struct to __sbi_tlb_flush_range
  riscv: Add mem kernel parameter support
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: Only initialize swiotlb when necessary
  riscv: fix typo in init.c
  riscv: Cleanup unused functions
  riscv: mm: Use better bitmap_zalloc()
  ...

Merge tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
"Fix crashes on 64-bit Book3E due to use of Book3S only mtmsrd
  instruction.

  Fix "scheduling while atomic" warnings at boot due to preempt count
  underflow.

  Two commits fixing our handling of BPF atomic instructions.

  Fix error handling in xive when allocating an IPI.

  Fix lockup on kernel exec fault on 603.

  Thanks to Bharata B Rao, Cédric Le Goater, Christian Zigotzky,
  Christophe Leroy, Guenter Roeck, Jiri Olsa, Naveen N. Rao, Nicholas
  Piggin, and Valentin Schneider"

* tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
  powerpc/64e: Fix system call illegal mtmsrd instruction
  powerpc/xive: Fix error handling when allocating an IPI
  powerpc/bpf: Reject atomic ops in ppc32 JIT
  powerpc/bpf: Fix detecting BPF atomic instructions
  powerpc/mm: Fix lockup on kernel exec fault

Merge tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull UML updates from Richard Weinberger:

- Support for optimized routines based on the host CPU

- Support for PCI via virtio

- Various fixes

* tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: remove unneeded semicolon in um_arch.c
  um: Remove the repeated declaration
  um: fix error return code in winch_tramp()
  um: fix error return code in slip_open()
  um: Fix stack pointer alignment
  um: implement flush_cache_vmap/flush_cache_vunmap
  um: add a UML specific futex implementation
  um: enable the use of optimized xor routines in UML
  um: Add support for host CPU flags and alignment
  um: allow not setting extra rpaths in the linux binary
  um: virtio/pci: enable suspend/resume
  um: add PCI over virtio emulation driver
  um: irqs: allow invoking time-travel handler multiple times
  um: time-travel/signals: fix ndelay() in interrupt
  um: expose time-travel mode to userspace side
  um: export signals_enabled directly
  um: remove unused smp_sigio_handler() declaration
  lib: add iomem emulation (logic_iomem)
  um: allow disabling NO_IOMEM

Merge tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBIFS updates from Richard Weinberger:

- Fix for a race xattr list and modification

- Various minor fixes (spelling, return codes, ...)

* tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubifs: Set/Clear I_LINKABLE under i_lock for whiteout inode
  ubifs: Fix spelling mistakes
  ubifs: Remove ui_mutex in ubifs_xattr_get and change_xattr
  ubifs: Fix races between xattr_{set|get} and listxattr operations
  ubifs: fix snprintf() checking
  ubifs: journal: Fix error return code in ubifs_jnl_write_inode()

perf tools: Fix pattern matching for same substring in different PMU type

Some different PMU types may have the same substring. For example, on
Icelake server we have PMU types "uncore_imc" and
"uncore_imc_free_running". Both PMU types have the substring
"uncore_imc".  But the parser wrongly thinks they are the same PMU type.

We enable an imc event,
perf stat -e uncore_imc/event=0xe3/ -a -- sleep 1

Perf actually expands the event to:

  uncore_imc_0/event=0xe3/
  uncore_imc_1/event=0xe3/
  uncore_imc_2/event=0xe3/
  uncore_imc_3/event=0xe3/
  uncore_imc_4/event=0xe3/
  uncore_imc_5/event=0xe3/
  uncore_imc_6/event=0xe3/
  uncore_imc_7/event=0xe3/
  uncore_imc_free_running_0/event=0xe3/
  uncore_imc_free_running_1/event=0xe3/
  uncore_imc_free_running_3/event=0xe3/
  uncore_imc_free_running_4/event=0xe3/

That's because the "uncore_imc_free_running" matches the
pattern "uncore_imc*".

Now we check that the last characters of PMU name is '_<digit>'.

For example, for pattern "uncore_imc*", "uncore_imc_0" is parsed ok, but
"uncore_imc_free_running_0" fails.

Fixes: b2b9d3a3f0211c5d ("perf pmu: Support wildcards on pmu name in dynamic pmu events")
Signed-off-by: Jin Yao <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Agustin Vega-Frias <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf record: Add a dummy event on hybrid systems to collect metadata records

Some symbols may not be resolved if a user only monitors one type of
PMU.

  $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
  $ sudo perf report –stdio
  # Overhead  Command    Shared Object      Symbol
  # ........  .........  .................  .....................
  #
     28.02%  perf-exec  [unknown]          [.] 0x0000000000401cf6
     11.32%  perf-exec  [unknown]          [.] 0x0000000000401d04
     10.90%  perf-exec  [unknown]          [.] 0x0000000000401d11
     10.61%  perf-exec  [unknown]          [.] 0x0000000000401cfc

To parse symbols the metadata records, e.g., PERF_RECORD_COMM, which are
generated by the kernel, are required.

To decide whether to generate the metadata records, the kernel relies on
the event_filter_match() to filter the unrelated events.

On a hybrid system, event_filter_match() further checks the CPU mask of
the current enabled PMU. If an event is collected on the CPU which
doesn't have an enabled PMU, it's treated as an unrelated event.

The "big_small_workload" is created in a big core, but runs on a small
core. The metadata records are filtered, because the user only monitors
the PMU of the small core. The big core PMU is not enabled.

For a hybrid system, a dummy event is required to generate the complete
side-band events.

Signed-off-by: Kan Liang <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf stat: Add Topdown metrics L2 events as default events

The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors.

The Topdown metrics L1 event was added as default in 42641d6f4d15e6db
("perf stat: Add Topdown metrics events as default events")

From the Sapphire Rapids server and later platforms, the same dedicated
"metrics" register is extended to support both L1 and L2 events.

Add both L1 and L2 Topdown metrics events as default to enrich the
default measuring information if the new measurement register is
available.

On legacy systems there is no change to avoid extra multiplexing.

The topdown_level indicates the max metrics level for the top-down
statistics. Set it to 2 to display all L1 and L2 Topdown metrics events.

With the patch:

  $ perf stat sleep 1

  Performance counter stats for 'sleep 1':

           0.59 msec task-clock             #   0.001 CPUs utilized
              1      context-switches       #   1.687 K/sec
              0      cpu-migrations         #   0.000 /sec
             76      page-faults            # 128.198 K/sec
      1,405,318      cycles                 #   2.371 GHz
      1,471,136      instructions           #   1.05  insn per cycle
        310,132      branches               # 523.136 M/sec
         10,435      branch-misses          #   3.36% of all branches
      8,431,908      slots                  #  14.223 G/sec
      1,554,116      topdown-retiring       #    18.4% retiring
      1,289,585      topdown-bad-spec       #    15.2% bad speculation
      2,810,636      topdown-fe-bound       #    33.2% frontend bound
      2,810,636      topdown-be-bound       #    33.2% backend bound
        231,464      topdown-heavy-ops      #     2.7% heavy operations   #  15.6% light operations
      1,223,453      topdown-br-mispredict  #    14.5% branch mispredict  #   0.8% machine clears
      1,884,779      topdown-fetch-lat      #    22.3% fetch latency      #  10.9% fetch bandwidth
      1,454,917      topdown-mem-bound      #    17.2% memory bound       #  16.0% Core bound

    1.001179699 seconds time elapsed

    0.000000000 seconds user
    0.001238000 seconds sys

Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()

Move the implementation of evlist__set_leader() to a new libperf
perf_evlist__set_leader() function with the same functionality make it a
libperf exported API.

Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups

Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group
interface to libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

libperf: Move 'leader' from tools/perf to perf_evsel::leader

Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

  struct evsel *evsel__leader(struct evsel *evsel);
  - get leader evsel

  bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
  - true if evsel has leader as leader

  bool evsel__is_leader(struct evsel *evsel);
  - true if evsel is itw own leader

  void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
  - set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

  tools/perf/util/bpf_counter.c

  -       if (evsel->leader->core.nr_members > 1) {
  +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

libperf: Move 'idx' from tools/perf to perf_evsel::idx

Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.

Committer notes:

Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.

Also fixed up these:

$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c: evsel->idx + i);
tools/perf/ui/gtk/annotate.c: evsel->idx);
$

That running 'make -C tools/perf build-test' caught.

Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
"Ext4 regression and bug fixes"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: inline jbd2_journal_[un]register_shrinker()
  ext4: fix flags validity checking for EXT4_IOC_CHECKPOINT
  ext4: fix possible UAF when remounting r/o a mmp-protected file system
  ext4: use ext4_grp_locked_error in mb_find_extent
  ext4: fix WARN_ON_ONCE(!buffer_uptodate) after an error writing the superblock
  Revert "ext4: consolidate checks for resize of bigalloc into ext4_resize_begin"

Merge tag 'ceph-for-5.14-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"We have new filesystem client metrics for reporting I/O sizes from
  Xiubo, two patchsets from Jeff that begin to untangle some heavyweight
  blocking locks in the filesystem and a bunch of code cleanups"

* tag 'ceph-for-5.14-rc1' of git://github.com/ceph/ceph-client:
  ceph: take reference to req->r_parent at point of assignment
  ceph: eliminate ceph_async_iput()
  ceph: don't take s_mutex in ceph_flush_snaps
  ceph: don't take s_mutex in try_flush_caps
  ceph: don't take s_mutex or snap_rwsem in ceph_check_caps
  ceph: eliminate session->s_gen_ttl_lock
  ceph: allow ceph_put_mds_session to take NULL or ERR_PTR
  ceph: clean up locking annotation for ceph_get_snap_realm and __lookup_snap_realm
  ceph: add some lockdep assertions around snaprealm handling
  ceph: decoding error in ceph_update_snap_realm should return -EIO
  ceph: add IO size metrics support
  ceph: update and rename __update_latency helper to __update_stdev
  ceph: simplify the metrics struct
  libceph: fix doc warnings in cls_lock_client.c
  libceph: remove unnecessary ret variable in ceph_auth_init()
  libceph: fix some spelling mistakes
  libceph: kill ceph_none_authorizer::reply_buf
  ceph: make ceph_queue_cap_snap static
  ceph: make ceph_netfs_read_ops static
  ceph: remove bogus checks and WARN_ONs from ceph_set_page_dirty

Merge tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Highlights include:

  Features:

   - Multiple patches to add support for fcntl() leases over NFSv4.

   - A sysfs interface to display more information about the various
     transport connections used by the RPC client

   - A sysfs interface to allow a suitably privileged user to offline a
     transport that may no longer point to a valid server

   - A sysfs interface to allow a suitably privileged user to change the
     server IP address used by the RPC client

  Stable fixes:

   - Two sunrpc fixes for deadlocks involving privileged rpc_wait_queues

  Bugfixes:

   - SUNRPC: Avoid a KASAN slab-out-of-bounds bug in xdr_set_page_base()

   - SUNRPC: prevent port reuse on transports which don't request it.

   - NFSv3: Fix memory leak in posix_acl_create()

   - NFS: Various fixes to attribute revalidation timeouts

   - NFSv4: Fix handling of non-atomic change attribute updates

   - NFSv4: If a server is down, don't cause mounts to other servers to
     hang as well

   - pNFS: Fix an Oops in pnfs_mark_request_commit() when doing O_DIRECT

   - NFS: Fix mount failures due to incorrect setting of the
     has_sec_mnt_opts filesystem flag

   - NFS: Ensure nfs_readpage returns promptly when an internal error
     occurs

   - NFS: Fix fscache read from NFS after cache error

   - pNFS: Various bugfixes around the LAYOUTGET operation"

* tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits)
  NFSv4/pNFS: Return an error if _nfs4_pnfs_v3_ds_connect can't load NFSv3
  NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times
  NFSv4/pnfs: Clean up layout get on open
  NFSv4/pnfs: Fix layoutget behaviour after invalidation
  NFSv4/pnfs: Fix the layout barrier update
  NFS: Fix fscache read from NFS after cache error
  NFS: Ensure nfs_readpage returns promptly when internal error occurs
  sunrpc: remove an offlined xprt using sysfs
  sunrpc: provide showing transport's state info in the sysfs directory
  sunrpc: display xprt's queuelen of assigned tasks via sysfs
  sunrpc: provide multipath info in the sysfs directory
  NFSv4.1 identify and mark RPC tasks that can move between transports
  sunrpc: provide transport info in the sysfs directory
  SUNRPC: take a xprt offline using sysfs
  sunrpc: add dst_attr attributes to the sysfs xprt directory
  SUNRPC for TCP display xprt's source port in sysfs xprt_info
  SUNRPC query transport's source port
  SUNRPC display xprt's main value in sysfs's xprt_info
  SUNRPC mark the first transport
  sunrpc: add add sysfs directory per xprt under each xprt_switch
  ...

Merge tag 'f2fs-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've improved the compression support especially for
  Android such as allowing compression for mmap files, replacing the
  immutable bit with internal bit to prohibits data writes explicitly,
  and adding a mount option, "compress_cache", to improve random reads.
  And, we added "readonly" feature to compact the partition w/
  compression enabled, which will be useful for Android RO partitions.

  Enhancements:
   - support compression for mmap file
   - use an f2fs flag instead of IMMUTABLE bit for compression
   - support RO feature w/ extent_cache
   - fully support swapfile with file pinning
   - improve atgc tunability
   - add nocompress extensions to unselect files for compression

  Bug fixes:
   - fix false alaram on iget failure during GC
   - fix race condition on global pointers when there are multiple f2fs
     instances
   - add MODULE_SOFTDEP for initramfs

  As usual, we've also cleaned up some places for better code
  readability (e.g., sysfs/feature, debugging messages, slab cache
  name, and docs)"

* tag 'f2fs-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits)
  f2fs: drop dirty node pages when cp is in error status
  f2fs: initialize page->private when using for our internal use
  f2fs: compress: add nocompress extensions support
  MAINTAINERS: f2fs: update my email address
  f2fs: remove false alarm on iget failure during GC
  f2fs: enable extent cache for compression files in read-only
  f2fs: fix to avoid adding tab before doc section
  f2fs: introduce f2fs_casefolded_name slab cache
  f2fs: swap: support migrating swapfile in aligned write mode
  f2fs: swap: remove dead codes
  f2fs: compress: add compress_inode to cache compressed blocks
  f2fs: clean up /sys/fs/f2fs/<disk>/features
  f2fs: add pin_file in feature list
  f2fs: Advertise encrypted casefolding in sysfs
  f2fs: Show casefolding support only when supported
  f2fs: support RO feature
  f2fs: logging neatening
  f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bit
  f2fs: compress: remove unneeded preallocation
  f2fs: atgc: export entries for better tunability via sysfs
  ...

Merge branch 'akpm' (patches from Andrew)

Pull yet more updates from Andrew Morton:
"54 patches.

  Subsystems affected by this patch series: lib, mm (slub, secretmem,
  cleanups, init, pagemap, and mremap), and debug"

* emailed patches from Andrew Morton <[email protected]>: (54 commits)
  powerpc/mm: enable HAVE_MOVE_PMD support
  powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache
  mm/mremap: allow arch runtime override
  mm/mremap: hold the rmap lock in write mode when moving page table entries.
  mm/mremap: use pmd/pud_poplulate to update page table entries
  mm/mremap: don't enable optimized PUD move if page table levels is 2
  mm/mremap: convert huge PUD move to separate helper
  selftest/mremap_test: avoid crash with static build
  selftest/mremap_test: update the test to handle pagesize other than 4K
  mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *
  mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *
  kdump: use vmlinux_build_id to simplify
  buildid: fix kernel-doc notation
  buildid: mark some arguments const
  scripts/decode_stacktrace.sh: indicate 'auto' can be used for base path
  scripts/decode_stacktrace.sh: silence stderr messages from addr2line/nm
  scripts/decode_stacktrace.sh: support debuginfod
  x86/dumpstack: use %pSb/%pBb for backtrace printing
  arm64: stacktrace: use %pSb for backtrace printing
  module: add printk formats to add module build ID to stacktraces
  ...

io_uring: remove dead non-zero 'poll' check

Colin reports that Coverity complains about checking for poll being
non-zero after having dereferenced it multiple times. This is a valid
complaint, and actually a leftover from back when this code was based
on the aio poll code.

Kill the redundant check.

Link: https://lore.kernel.org/io-uring/[email protected]/
Reported-by: Colin Ian King <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Merge tag 'irqchip-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip fixes from Marc Zyngier:

- Fix a MIPS bug where irqdomain loopkups could occur in a context
where RCU is not allowed

- Fix a documentation bug for handle_domain_irq

MIPS: vdso: Invalid GIC access through VDSO

Accessing raw timers (currently only CLOCK_MONOTONIC_RAW) through VDSO
doesn't return the correct time when using the GIC as clock source.
The address of the GIC mapped page is in this case not calculated
correctly. The GIC mapped page is calculated from the VDSO data by
subtracting PAGE_SIZE:

  void *get_gic(const struct vdso_data *data) {
    return (void __iomem *)data - PAGE_SIZE;
  }

However, the data pointer is not page aligned for raw clock sources.
This is because the VDSO data for raw clock sources (CS_RAW = 1) is
stored after the VDSO data for coarse clock sources (CS_HRES_COARSE = 0).
Therefore, only the VDSO data for CS_HRES_COARSE is page aligned:

  +--------------------+
  |                    |
  | vd[CS_RAW]         | ---+
  | vd[CS_HRES_COARSE] |    |
  +--------------------+    | -PAGE_SIZE
  |                    |    |
  |  GIC mapped page   | <--+
  |                    |
  +--------------------+

When __arch_get_hw_counter() is called with &vd[CS_RAW], get_gic returns
the wrong address (somewhere inside the GIC mapped page). The GIC counter
values are not returned which results in an invalid time.

Fixes: a7f4df4e21dd ("MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()")
Signed-off-by: Martin Fäcknitz <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>

bpf: Selftest to verify mixing bpf2bpf calls and tailcalls with insn patch

This adds some extra noise to the tailcall_bpf2bpf4 tests that will cause
verify to patch insns. This then moves around subprog start/end insn
index and poke descriptor insn index to ensure that verify and JIT will
continue to track these correctly.

If done correctly verifier should pass this program same as before and
JIT should emit tail call logic.

Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

bpf: Track subprog poke descriptors correctly and fix use-after-free

Subprograms are calling map_poke_track(), but on program release there is no
hook to call map_poke_untrack(). However, on program release, the aux memory
(and poke descriptor table) is freed even though we still have a reference to
it in the element list of the map aux data. When we run map_poke_run(), we then
end up accessing free'd memory, triggering KASAN in prog_array_map_poke_run():

  [...]
  [  402.824689] BUG: KASAN: use-after-free in prog_array_map_poke_run+0xc2/0x34e
  [  402.824698] Read of size 4 at addr ffff8881905a7940 by task hubble-fgs/4337
  [  402.824705] CPU: 1 PID: 4337 Comm: hubble-fgs Tainted: G          I       5.12.0+ #399
  [  402.824715] Call Trace:
  [  402.824719]  dump_stack+0x93/0xc2
  [  402.824727]  print_address_description.constprop.0+0x1a/0x140
  [  402.824736]  ? prog_array_map_poke_run+0xc2/0x34e
  [  402.824740]  ? prog_array_map_poke_run+0xc2/0x34e
  [  402.824744]  kasan_report.cold+0x7c/0xd8
  [  402.824752]  ? prog_array_map_poke_run+0xc2/0x34e
  [  402.824757]  prog_array_map_poke_run+0xc2/0x34e
  [  402.824765]  bpf_fd_array_map_update_elem+0x124/0x1a0
  [...]

The elements concerned are walked as follows:

    for (i = 0; i < elem->aux->size_poke_tab; i++) {
           poke = &elem->aux->poke_tab[i];
    [...]

The access to size_poke_tab is a 4 byte read, verified by checking offsets
in the KASAN dump:

  [  402.825004] The buggy address belongs to the object at ffff8881905a7800
                 which belongs to the cache kmalloc-1k of size 1024
  [  402.825008] The buggy address is located 320 bytes inside of
                 1024-byte region [ffff8881905a7800, ffff8881905a7c00)

The pahole output of bpf_prog_aux:

  struct bpf_prog_aux {
    [...]
    /* --- cacheline 5 boundary (320 bytes) --- */
    u32                        size_poke_tab;        /*   320     4 */
    [...]

In general, subprograms do not necessarily manage their own data structures.
For example, BTF func_info and linfo are just pointers to the main program
structure. This allows reference counting and cleanup to be done on the latter
which simplifies their management a bit. The aux->poke_tab struct, however,
did not follow this logic. The initial proposed fix for this use-after-free
bug further embedded poke data tracking into the subprogram with proper
reference counting. However, Daniel and Alexei questioned why we were treating
these objects special; I agree, its unnecessary. The fix here removes the per
subprogram poke table allocation and map tracking and instead simply points
the aux->poke_tab pointer at the main programs poke table. This way, map
tracking is simplified to the main program and we do not need to manage them
per subprogram.

This also means, bpf_prog_free_deferred(), which unwinds the program reference
counting and kfrees objects, needs to ensure that we don't try to double free
the poke_tab when free'ing the subprog structures. This is easily solved by
NULL'ing the poke_tab pointer. The second detail is to ensure that per
subprogram JIT logic only does fixups on poke_tab[] entries it owns. To do
this, we add a pointer in the poke structure to point at the subprogram value
so JITs can easily check while walking the poke_tab structure if the current
entry belongs to the current program. The aux pointer is stable and therefore
suitable for such comparison. On the jit_subprogs() error path, we omit
cleaning up the poke->aux field because these are only ever referenced from
the JIT side, but on error we will never make it to the JIT, so its fine to
leave them dangling. Removing these pointers would complicate the error path
for no reason. However, we do need to untrack all poke descriptors from the
main program as otherwise they could race with the freeing of JIT memory from
the subprograms. Lastly, a748c6975dea3 ("bpf: propagate poke descriptors to
subprograms") had an off-by-one on the subprogram instruction index range
check as it was testing 'insn_idx >= subprog_start && insn_idx <= subprog_end'.
However, subprog_end is the next subprogram's start instruction.

Fixes: a748c6975dea3 ("bpf: propagate poke descriptors to subprograms")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry

Since d4a45c68dc81 ("irqdomain: Protect the linear revmap with RCU"),
any irqdomain lookup requires the RCU read lock to be held.

This assumes that the architecture code will be structured such as
irq_enter() will be called *before* the interrupt is looked up
in the irq domain. However, this isn't the case for MIPS, and a number
of drivers are structured to do it the other way around when handling
an interrupt in their root irqchip (secondary irqchips are OK by
construction).

This results in a RCU splat on a lockdep-enabled kernel when the kernel
takes an interrupt from idle, as reported by Guenter Roeck.

Note that this could have fired previously if any driver had used
tree-based irqdomain, which always had the RCU requirement.

To solve this, provide a MIPS-specific helper (do_domain_IRQ())
as the pendent of do_IRQ() that will do thing in the right order
(and maybe save some cycles in the process).

Ideally, MIPS would be moved over to using handle_domain_irq(),
but that's much more ambitious.

Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
[maz: add dependency on CONFIG_IRQ_DOMAIN after report from the kernelci bot]
Signed-off-by: Marc Zyngier <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Serge Semin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]

net: bcmgenet: Ensure all TX/RX queues DMAs are disabled

Make sure that we disable each of the TX and RX queues in the TDMA and
RDMA control registers. This is a correctness change to be symmetrical
with the code that enables the TX and RX queues.

Tested-by: Maxime Ripard <[email protected]>
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cifs: use helpers when parsing uid/gid mount options and validate them

Use the nice helpers to initialize and the uid/gid/cred_uid when passed as mount arguments.

Signed-off-by: Ronnie Sahlberg <[email protected]>
Acked-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge branch 'ncsi-phy-link-up'

Ivan Mikhaylov says:

====================
net/ncsi: Add NCSI Intel OEM command to keep PHY link up

Add NCSI Intel OEM command to keep PHY link up and prevents any channel
resets during the host load on i210. Also includes dummy response handler for
Intel manufacturer id.

Changes from v1:
1. sparse fixes about casts
2. put it after ncsi_dev_state_probe_cis instead of
ncsi_dev_state_probe_channel because sometimes channel is not ready
after it
3. inl -> intel
====================

Signed-off-by: David S. Miller <[email protected]>

net/ncsi: add dummy response handler for Intel boards

Add the dummy response handler for Intel boards to prevent incorrect
handling of OEM commands.

Signed-off-by: Ivan Mikhaylov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ncsi: add NCSI Intel OEM command to keep PHY up

This allows to keep PHY link up and prevents any channel resets during
the host load.

It is KEEP_PHY_LINK_UP option(Veto bit) in i210 datasheet which
block PHY reset and power state changes.

Signed-off-by: Ivan Mikhaylov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/ncsi: fix restricted cast warning of sparse

Sparse reports:
net/ncsi/ncsi-rsp.c:406:24: warning: cast to restricted __be32
net/ncsi/ncsi-manage.c:732:33: warning: cast to restricted __be32
net/ncsi/ncsi-manage.c:756:25: warning: cast to restricted __be32
net/ncsi/ncsi-manage.c:779:25: warning: cast to restricted __be32

Signed-off-by: Ivan Mikhaylov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: microchip: sparx5: fix kconfig warning

PHY_SPARX5_SERDES depends on OF so SPARX5_SWITCH should also depend
on OF since 'select' does not follow any dependencies.

WARNING: unmet direct dependencies detected for PHY_SPARX5_SERDES
  Depends on [n]: (ARCH_SPARX5 || COMPILE_TEST [=n]) && OF [=n] && HAS_IOMEM [=y]
  Selected by [y]:
  - SPARX5_SWITCH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && NET_SWITCHDEV [=y] && HAS_IOMEM [=y]

Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Lars Povlsen <[email protected]>
Cc: Steen Hegelund <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>

cxgb4: fix IRQ free race during driver unload

IRQs are requested during driver's ndo_open() and then later
freed up in disable_interrupts() during driver unload.
A race exists where driver can set the CXGB4_FULL_INIT_DONE
flag in ndo_open() after the disable_interrupts() in driver
unload path checks it, and hence misses calling free_irq().

Fix by unregistering netdevice first and sync with driver's
ndo_open(). This ensures disable_interrupts() checks the flag
correctly and frees up the IRQs properly.

Fixes: b37987e8db5f ("cxgb4: Disable interrupts and napi before unregistering netdev")
Signed-off-by: Shahjada Abul Husain <[email protected]>
Signed-off-by: Raju Rangoju <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mt76: mt7921: continue to probe driver when fw already downloaded

When reboot system, no power cycles, firmware is already downloaded,
return -EIO will break driver as error:
mt7921e: probe of 0000:03:00.0 failed with error -5

Skip firmware download and continue to probe.

Signed-off-by: Aaron Ma <[email protected]>
Fixes: 1c099ab44727c ("mt76: mt7921: add MCU support")
Signed-off-by: David S. Miller <[email protected]>

atl1c: fix Mikrotik 10/25G NIC detection

Since Mikrotik 10/25G NIC MDIO op emulation is not 100% reliable,
on rare occasions it can happen that some physical functions of
the NIC do not get initialized due to timeouted early MDIO op.

This changes the atl1c probe on Mikrotik 10/25G NIC not to
depend on MDIO op emulation.

Signed-off-by: Gatis Peisenieks <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

s390: preempt: Fix preempt_count initialization

S390's init_idle_preempt_count(p, cpu) doesn't actually let us initialize the
preempt_count of the requested CPU's idle task: it unconditionally writes
to the current CPU's. This clearly conflicts with idle_threads_init(),
which intends to initialize *all* the idle tasks, including their
preempt_count (or their CPU's, if the arch uses a per-CPU preempt_count).

Unfortunately, it seems the way s390 does things doesn't let us initialize
every possible CPU's preempt_count early on, as the pages where this
resides are only allocated when a CPU is brought up and are freed when it
is brought down.

Let the arch-specific code set a CPU's preempt_count when its lowcore is
allocated, and turn init_idle_preempt_count() into an empty stub.

Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vasily Gorbik <[email protected]>

s390/linkage: increase asm symbols alignment to 16

Both clang and gcc (for -march=z13 and later) align functions to 16
bytes at -O2 to benefit branch prediction.

Make asm symbols alignment consistent with that.

This also benefits potential ftrace code patching, which is currently
able to patch 8 aligned bytes at once.

With defconfig this currently increases .text size by 4104 bytes.

Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: rename CALL_ON_STACK_NORETURN() to call_on_stack_noreturn()

Lower case matches the call_on_stack() macro and is easier to read.

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: add type checking to CALL_ON_STACK_NORETURN() macro

Make sure the to be called function takes no arguments (and returns void).
Otherwise usage of CALL_ON_STACK_NORETURN() would generate broken code.

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: remove old CALL_ON_STACK() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/softirq: use call_on_stack() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/lib: use call_on_stack() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/smp: use call_on_stack() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/kexec: use call_on_stack() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/irq: use call_on_stack() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/mm: use call_on_stack() macro

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: introduce proper type handling call_on_stack() macro

The existing CALL_ON_STACK() macro allows for subtle bugs:

- There is no type checking of the function that is being called. That
  is: missing or too many arguments do not cause any compile error or
  warning. The same is true if the return type of the called function
  changes. This can lead to quite random bugs.

- Sign and zero extension of arguments is missing. Given that the s390
  C ABI requires that the caller of a function performs proper sign
  and zero extension this can also lead to subtle bugs.

- If arguments to the CALL_ON_STACK() macros contain functions calls
  register corruption can happen due to register asm constructs being
  used.

Therefore introduce a new call_on_stack() macro which is supposed to
fix all these problems.

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/irq: simplify on_async_stack()

Make on_async_stack() a bit more readable, even though as usual it
depends if one considers "!!!" readable or not.
At least the new construct to check if the async stack is in use or
not is a bit shorter and generates slightly better code.

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/irq: inline do_softirq_own_stack()

Move do_softirq_own_stack() to proper header file so it can be
inlined; saving a few cycles.

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/irq: simplify do_softirq_own_stack()

do_softirq_own_stack() is always called from task context and
therefore it is not necessary to check if the async stack is
currently used.
Remove the check and directly switch to async stack.

Reviewed-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/ap: get rid of register asm in ap_dqap()

This is the second part of the cleanup for the header file ap.h
to remove the register asm statements. This patch deals with
the inline ap_dqap() function where within the assembler code
an odd register of an register pair is to be addressed.

[[email protected]: this intentionally breaks compilation with any
clang compilers prior to llvm-project commit 458eac257377 ("[SystemZ]
Support the 'N' code for the odd register in inline-asm.").
This is hopefully the last clang kernel compile breakage caused by
incompatibilities between gcc and clang.]

Signed-off-by: Harald Freudenberger <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: rename PIF_SYSCALL_RESTART to PIF_EXECVE_PGSTE_RESTART

PIF_SYSCALL_RESTART is now only used to restart execve when loading
PGSTE binaries. Rename the flag to reflect that, and avoid people
thinking that this bit has anything to do with generic syscall
restarting.

Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: move restart of execve() syscall

On s390, execve might have to be restarted for PGSTE binaries
like kvm. In the past this was done via the PIF_SYSCALL_RESTART
bit. However, with the recent changes, syscalls are now restarted
differently. Now that execve() is the only call that might get
restarted via PIF_SYSCALL_RESTART, move the loop to do_syscall().
This also has the advantage that the restart is no longer visible
to userspace.

Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/signal: remove sigreturn on stack

{rt_}sigreturn is now called from the vdso, so we no longer
need the svc on the stack, and therefore no hack to support that
mechanism on machines with non-executable stack.

Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390/signal: switch to using vdso for sigreturn and syscall restart

with generic entry, there's a bug when it comes to restarting of signals.
The failing sequence is:

a) a signal is coming in, and no handler is registered, so the lower
   part of arch_do_signal_or_restart() in arch/s390/kernel/signal.c
   sets PIF_SYSCALL_RESTART.

b) a second signal gets pending while the kernel is still in the exit
   loop, and for that one, a handler exists.

c) The first part of arch_do_signal_or_restart() is called. That part
   calls handle_signal(), which sets up stack + registers for handling
   the signal.

d) __do_syscall() in arch/s390/kernel/syscall.c checks for
   PIF_SYSCALL_RESTART right before leaving to userspace. If it is set,
   it restart's the syscall. However, the registers are already setup
   for handling a signal from c). The syscall is now restarted with the
   wrong arguments.

Change the code to:

- use vdso for syscall_restart() instead of PIF_SYSCALL_RESTART because
  we cannot rewind and go back to userspace on s390 because the system call
  number might be encoded in the svc instruction.
- for all other syscalls we rewind the PSW and return to userspace.

Cc: <[email protected]> # v5.12+ d57778feb987: s390/vdso: always enable vdso
Cc: <[email protected]> # v5.12+ 686341f2548b: s390/vdso64: add sigreturn,rt_sigreturn and restart_syscall
Cc: <[email protected]> # v5.12+ 43e1f76b0b69: s390/vdso: rename VDSO64_LBASE to VDSO_LBASE
Cc: <[email protected]> # v5.12+ 779df2248739: s390/vdso: add minimal compat vdso
Cc: <[email protected]> # v5.12+
Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

io_uring: mitigate unlikely iopoll lag

We have requests like IORING_OP_FILES_UPDATE that don't go through
->iopoll_list but get completed in place under ->uring_lock, and so
after dropping the lock io_iopoll_check() should expect that some CQEs
might have get completed in a meanwhile.

Currently such events won't be accounted in @nr_events, and the loop
will continue to poll even if there is enough of CQEs. It shouldn't be a
problem as it's not likely to happen and so, but not nice either. Just
return earlier in this case, it should be enough.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/66ef932cc66a34e3771bbae04b2953a8058e9d05.1625747741.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

ptp: Relocate lookup cookie to correct block.

An earlier commit set the pps_lookup cookie, but the line
was somehow added to the wrong code block. Correct this.

Fixes: 8602e40fc813 ("ptp: Set lookup cookie when creating a PTP PPS source.")
Signed-off-by: Jonathan Lemon <[email protected]>
Signed-off-by: Dario Binacchi <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'drm-next-2021-07-08-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Some fixes for rc1 that came in the past weeks, mainly a bunch of
  amdgpu fixes, some i915 and the rest are misc around the place. I'm
  sending this a bit early so some more stuff may show up, but I'll
  probably take tomorrow off.

  dma-buf:
   - doc fixes

  amdgpu:
   - Misc Navi fixes
   - Powergating fix
   - Yellow Carp updates
   - Beige Goby updates
   - S0ix fix
   - Revert overlay validation fix
   - GPU reset fix for DC
   - PPC64 fix
   - Add new dimgrey cavefish DID
   - RAS fix
   - TTM fixes

  amdkfd:
   - SVM fixes

  radeon:
   - Fix missing drm_gem_object_put in error path
   - NULL ptr deref fix

  i915:
   - display DP VSC fix
   - DG1 display fix
   - IRQ fixes
   - IRQ demidlayering

  gma500:
   - bo leaks in error paths fixed"

* tag 'drm-next-2021-07-08-1' of git://anongit.freedesktop.org/drm/drm: (52 commits)
  drm/i915: Drop all references to DRM IRQ midlayer
  drm/i915: Use the correct IRQ during resume
  drm/i915/display/dg1: Correctly map DPLLs during state readout
  drm/i915/display: Do not zero past infoframes.vsc
  drm/amdgpu: Conditionally reset SDMA RAS error counts
  drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
  drm/amdkfd: add invalid pages debug at vram migration
  drm/amdkfd: skip migration for pages already in VRAM
  drm/amdkfd: skip invalid pages during migrations
  drm/amdkfd: classify and map mixed svm range pages in GPU
  drm/amdkfd: use hmm range fault to get both domain pfns
  drm/amdgpu: get owner ref in validate and map
  drm/amdkfd: set owner ref to svm range prefault
  drm/amdkfd: add owner ref param to get hmm pages
  drm/amdkfd: device pgmap owner at the svm migrate init
  drm/amdkfd: inc counter on child ranges with xnack off
  drm/amd/display: Extend DMUB diagnostic logging to DCN3.1
  drm/amdgpu: Update NV SIMD-per-CU to 2
  drm/amdgpu: add new dimgrey cavefish DID
  drm/amd/pm: skip PrepareMp1ForUnload message in s0ix
  ...

ipv6: tcp: drop silly ICMPv6 packet too big messages

While TCP stack scales reasonably well, there is still one part that
can be used to DDOS it.

IPv6 Packet too big messages have to lookup/insert a new route,
and if abused by attackers, can easily put hosts under high stress,
with many cpus contending on a spinlock while one is stuck in fib6_run_gc()

ip6_protocol_deliver_rcu()
icmpv6_rcv()
  icmpv6_notify()
   tcp_v6_err()
    tcp_v6_mtu_reduced()
     inet6_csk_update_pmtu()
      ip6_rt_update_pmtu()
       __ip6_rt_update_pmtu()
        ip6_rt_cache_alloc()
         ip6_dst_alloc()
          dst_alloc()
           ip6_dst_gc()
            fib6_run_gc()
             spin_lock_bh() ...

Some of our servers have been hit by malicious ICMPv6 packets
trying to _increase_ the MTU/MSS of TCP flows.

We believe these ICMPv6 packets are a result of a bug in one ISP stack,
since they were blindly sent back for _every_ (small) packet sent to them.

These packets are for one TCP flow:
09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240

TCP stack can filter some silly requests :

1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.

This tests happen before the IPv6 routing stack is entered, thus
removing the potential contention and route exhaustion.

Note that IPv6 stack was performing these checks, but too late
(ie : after the route has been added, and after the potential
garbage collect war)

v2: fix typo caught by Martin, thanks !
v3: exports tcp_mtu_to_mss(), caught by David, thanks !

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Maciej Żenczykowski <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'pwm/for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
"This contains mostly various fixes, cleanups and some conversions to
  the atomic API. One noteworthy change is that PWM consumers can now
  pass a hint to the PWM core about the PWM usage, enabling PWM
  providers to implement various optimizations.

  There's also a fair bit of simplification here with the addition of
  some device-managed helpers as well as unification between the DT and
  ACPI firmware interfaces"

* tag 'pwm/for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (50 commits)
  pwm: Remove redundant assignment to pointer pwm
  pwm: ep93xx: Fix read of uninitialized variable ret
  pwm: ep93xx: Prepare clock before using it
  pwm: ep93xx: Unfold legacy callbacks into ep93xx_pwm_apply()
  pwm: ep93xx: Implement .apply callback
  pwm: vt8500: Only unprepare the clock after the pwmchip was removed
  pwm: vt8500: Drop if with an always false condition
  pwm: tegra: Assert reset only after the PWM was unregistered
  pwm: tegra: Don't needlessly enable and disable the clock in .remove()
  pwm: tegra: Don't modify HW state in .remove callback
  pwm: tegra: Drop an if block with an always false condition
  pwm: core: Simplify some devm_*pwm*() functions
  pwm: core: Remove unused devm_pwm_put()
  pwm: core: Unify fwnode checks in the module
  pwm: core: Reuse fwnode_to_pwmchip() in ACPI case
  pwm: core: Convert to use fwnode for matching
  docs: firmware-guide: ACPI: Add a PWM example
  dt-bindings: pwm: pwm-tiecap: Add compatible string for AM64 SoC
  dt-bindings: pwm: pwm-tiecap: Convert to json schema
  pwm: sprd: Don't check the return code of pwmchip_remove()
  ...

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull more clk updates from Stephen Boyd:

- A handful of fixes for lmk04832 driver

- Migrate the basic clk divider to use determine rate ops

- Fix modpost build for hisilicon hi3559a driver

- Actually set the parent in k210_clk_set_parent()

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  Revert "clk: divider: Switch from .round_rate to .determine_rate by default"
  clk: hisilicon: hi3559a: Drop __init markings everywhere
  clk: meson: regmap: switch to determine_rate for the dividers
  clk: divider: Switch from .round_rate to .determine_rate by default
  clk: divider: Add re-usable determine_rate implementations
  clk: k210: Fix k210_clk_set_parent()
  clk: lmk04832: Fix spelling mistakes in dev_err messages and comments
  clk: lmk04832: fix return value check in lmk04832_probe()
  clk: stm32mp1: fix missing spin_lock_init()

Merge tag 'pci-v5.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
"Enumeration:
   - Fix dsm_label_utf16s_to_utf8s() buffer overrun (Krzysztof
     Wilczyński)
   - Rely on lengths from scnprintf(), dsm_label_utf16s_to_utf8s()
     (Krzysztof Wilczyński)
   - Use sysfs_emit() and sysfs_emit_at() in "show" functions (Krzysztof
     Wilczyński)
   - Fix 'resource_alignment' newline issues (Krzysztof Wilczyński)
   - Add 'devspec' newline (Krzysztof Wilczyński)
   - Dynamically map ECAM regions (Russell King)

  Resource management:
   - Coalesce host bridge contiguous apertures (Kai-Heng Feng)

  PCIe native device hotplug:
   - Ignore Link Down/Up caused by DPC (Lukas Wunner)

  Power management:
   - Leave Apple Thunderbolt controllers on for s2idle or standby
     (Konstantin Kharlamov)

  Virtualization:
   - Work around Huawei Intelligent NIC VF FLR erratum (Chiqijun)
   - Clarify error message for unbound IOV devices (Moritz Fischer)
   - Add pci_reset_bus_function() Secondary Bus Reset interface (Raphael
     Norwitz)

  Peer-to-peer DMA:
   - Simplify distance calculation (Christoph Hellwig)
   - Finish RCU conversion of pdev->p2pdma (Eric Dumazet)
   - Rename upstream_bridge_distance() and rework doc (Logan Gunthorpe)
   - Collect acs list in stack buffer to avoid sleeping (Logan
     Gunthorpe)
   - Use correct calc_map_type_and_dist() return type (Logan Gunthorpe)
   - Warn if host bridge not in whitelist (Logan Gunthorpe)
   - Refactor pci_p2pdma_map_type() (Logan Gunthorpe)
   - Avoid pci_get_slot(), which may sleep (Logan Gunthorpe)

  Altera PCIe controller driver:
   - Add Joyce Ooi as Altera PCIe maintainer (Joyce Ooi)

  Broadcom iProc PCIe controller driver:
   - Fix multi-MSI base vector number allocation (Sandor Bodo-Merle)
   - Support multi-MSI only on uniprocessor kernel (Sandor Bodo-Merle)

  Freescale i.MX6 PCIe controller driver:
   - Limit DBI register length for imx6qp PCIe (Richard Zhu)
   - Add "vph-supply" for PHY supply voltage (Richard Zhu)
   - Enable PHY internal regulator when supplied >3V (Richard Zhu)
   - Remove imx6_pcie_probe() redundant error message (Zhen Lei)

  Intel Gateway PCIe controller driver:
   - Fix INTx enable (Martin Blumenstingl)

  Marvell Aardvark PCIe controller driver:
   - Fix checking for PIO Non-posted Request (Pali Rohár)
   - Implement workaround for the readback value of VEND_ID (Pali Rohár)

  MediaTek PCIe controller driver:
   - Remove redundant error printing in mtk_pcie_subsys_powerup() (Zhen
     Lei)

  MediaTek PCIe Gen3 controller driver:
   - Add missing MODULE_DEVICE_TABLE (Zou Wei)

  Microchip PolarFlare PCIe controller driver:
   - Make struct event_descs static (Krzysztof Wilczyński)

  Microsoft Hyper-V host bridge driver:
   - Fix race condition when removing the device (Long Li)
   - Remove bus device removal unused refcount/functions (Long Li)

  Mobiveil PCIe controller driver:
   - Remove unused readl and writel functions (Krzysztof Wilczyński)

  NVIDIA Tegra PCIe controller driver:
   - Add missing MODULE_DEVICE_TABLE (Zou Wei)

  NVIDIA Tegra194 PCIe controller driver:
   - Fix tegra_pcie_ep_raise_msi_irq() ill-defined shift (Jon Hunter)
   - Fix host initialization during resume (Vidya Sagar)

  Rockchip PCIe controller driver:
   - Register IRQ handlers after device and data are ready (Javier
     Martinez Canillas)"

* tag 'pci-v5.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
  PCI/P2PDMA: Finish RCU conversion of pdev->p2pdma
  PCI: xgene: Annotate __iomem pointer
  PCI: Fix kernel-doc formatting
  PCI: cpcihp: Declare cpci_debug in header file
  MAINTAINERS: Add Joyce Ooi as Altera PCIe maintainer
  PCI: rockchip: Register IRQ handlers after device and data are ready
  PCI: tegra194: Fix tegra_pcie_ep_raise_msi_irq() ill-defined shift
  PCI: aardvark: Implement workaround for the readback value of VEND_ID
  PCI: aardvark: Fix checking for PIO Non-posted Request
  PCI: tegra194: Fix host initialization during resume
  PCI: tegra: Add missing MODULE_DEVICE_TABLE
  PCI: imx6: Enable PHY internal regulator when supplied >3V
  dt-bindings: imx6q-pcie: Add "vph-supply" for PHY supply voltage
  PCI: imx6: Limit DBI register length for imx6qp PCIe
  PCI: imx6: Remove imx6_pcie_probe() redundant error message
  PCI: intel-gw: Fix INTx enable
  PCI: iproc: Support multi-MSI only on uniprocessor kernel
  PCI: iproc: Fix multi-MSI base vector number allocation
  PCI: mediatek-gen3: Add missing MODULE_DEVICE_TABLE
  PCI: Dynamically map ECAM regions
  ...

scripts: add generic syscallnr.sh

Like syscallhdr.sh and syscalltbl.sh, add a simple script to generate
the __NR_syscalls, which should not be exported to userspace.

This script is useful to replace arch/mips/kernel/syscalls/syscallnr.sh,
refactor arch/s390/kernel/syscalls/syscalltbl, and eliminate the code
surrounded by #ifdef __KERNEL__ / #endif from exported uapi/asm/unistd_*.h
files.

Signed-off-by: Masahiro Yamada <[email protected]>

scripts: check duplicated syscall number in syscall table

Currently, syscall{hdr,tbl}.sh sorts the entire syscall table, but you
can assume it is already sorted by the syscall number.

The generated syscall table does not work if the same syscall number
appears twice. Check it in the script.

Signed-off-by: Masahiro Yamada <[email protected]>

powerpc/mm: enable HAVE_MOVE_PMD support

mremap HAVE_MOVE_PMD/PUD optimization time comparison for 1GB region:
1GB mremap - Source PTE-aligned, Destination PTE-aligned
  mremap time:      2292772ns
1GB mremap - Source PMD-aligned, Destination PMD-aligned
  mremap time:      1158928ns
1GB mremap - Source PUD-aligned, Destination PUD-aligned
  mremap time:        63886ns

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache

flush_tlb_range is special in that we don't specify the page size used for
the translation.  Hence when flushing TLB we flush the translation cache
for all possible page sizes.  The kernel also uses the same interface when
moving page tables around.  Such a move requires us to flush the page walk
cache.

Instead of adding another interface to force page walk cache flush, update
flush_tlb_range to flush page walk cache if the range flushed is more than
the PMD range.  A page table move will always involve an invalidate range
more than PMD_SIZE.

Running microbenchmark with mprotect and parallel memory access didn't
show any observable performance impact.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/mremap: allow arch runtime override

Patch series "Speedup mremap on ppc64", v8.

This patchset enables MOVE_PMD/MOVE_PUD support on power.  This requires
the platform to support updating higher-level page tables without updating
page table entries.  This also needs to invalidate the Page Walk Cache on
architecture supporting the same.

This patch (of 3):

Architectures like ppc64 support faster mremap only with radix
translation.  Hence allow a runtime check w.r.t support for fast mremap.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/mremap: hold the rmap lock in write mode when moving page table entries.

To avoid a race between rmap walk and mremap, mremap does
take_rmap_locks().  The lock was taken to ensure that rmap walk don't miss
a page table entry due to PTE moves via move_pagetables().  The kernel
does further optimization of this lock such that if we are going to find
the newly added vma after the old vma, the rmap lock is not taken.  This
is because rmap walk would find the vmas in the same order and if we don't
find the page table attached to older vma we would find it with the new
vma which we would iterate later.

As explained in commit eb66ae030829 ("mremap: properly flush TLB before
releasing the page") mremap is special in that it doesn't take ownership
of the page.  The optimized version for PUD/PMD aligned mremap also
doesn't hold the ptl lock.  This can result in stale TLB entries as show
below.

This patch updates the rmap locking requirement in mremap to handle the race condition
explained below with optimized mremap::

Optmized PMD move

    CPU 1                           CPU 2                                   CPU 3

    mremap(old_addr, new_addr)      page_shrinker/try_to_unmap_one

    mmap_write_lock_killable()

                                    addr = old_addr
                                    lock(pte_ptl)
    lock(pmd_ptl)
    pmd = *old_pmd
    pmd_clear(old_pmd)
    flush_tlb_range(old_addr)

    *new_pmd = pmd
                                                                            *new_addr = 10; and fills
                                                                            TLB with new addr
                                                                            and old pfn

    unlock(pmd_ptl)
                                    ptep_clear_flush()
                                    old pfn is free.
                                                                            Stale TLB entry

Optimized PUD move also suffers from a similar race.  Both the above race
condition can be fixed if we force mremap path to take rmap lock.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 2c91bd4a4e2e ("mm: speed up mremap by 20x on large regions")
Fixes: c49dd3401802 ("mm: speedup mremap on 1GB or larger regions")
Link: https://lore.kernel.org/linux-mm/CAHk-=wgXVR04eBNtxQfevontWnP6FDm+oj5vauQXP3S-huwbPw@mail.gmail.com
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/mremap: use pmd/pud_poplulate to update page table entries

pmd/pud_populate is the right interface to be used to set the respective
page table entries. Some architectures like ppc64 do assume that
set_pmd/pud_at can only be used to set a hugepage PTE. Since we are not
setting up a hugepage PTE here, use the pmd/pud_populate interface.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/mremap: don't enable optimized PUD move if page table levels is 2

With two level page table don't enable move_normal_pud.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/mremap: convert huge PUD move to separate helper

With TRANSPARENT_HUGEPAGE_PUD enabled the kernel can find huge PUD
entries. Add a helper to move huge PUD entries on mremap().

This will be used by a later patch to optimize mremap of PUD_SIZE aligned
level 4 PTE mapped address

This also make sure we support mremap on huge PUD entries even with
CONFIG_HAVE_MOVE_PUD disabled.

[[email protected]: fix build failure with clang-10]
Link: https://lore.kernel.org/lkml/YMuOSnJsL9qkxweY@archlinux-ax161
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

selftest/mremap_test: avoid crash with static build

With a large mmap map size, we can overlap with the text area and using
MAP_FIXED results in unmapping that area. Switch to MAP_FIXED_NOREPLACE
and handle the EEXIST error.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Kalesh Singh <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

selftest/mremap_test: update the test to handle pagesize other than 4K

Patch series "mrermap fixes", v2.

This patch (of 6):

Instead of hardcoding 4K page size fetch it using sysconf(). For the
performance measurements test still assume 2M and 1G are hugepage sizes.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Kalesh Singh <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *

No functional change in this patch.

[[email protected]: m68k build error reported by kernel robot]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *

No functional change in this patch.

[[email protected]: fix]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: another fix]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kdump: use vmlinux_build_id to simplify

We can use the vmlinux_build_id array here now instead of open coding it.
This mostly consolidates code.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Evan Green <[email protected]>
Cc: Hsin-Yi Wang <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>