Git Repo - linux.git/log

perf parse-events: Use uintptr_t when casting numbers to pointers

To address these errors found when cross building from x86_64 to MIPS
little endian 32-bit:

    CC       /tmp/build/perf/util/parse-events-bison.o
  util/parse-events.y: In function 'parse_events_parse':
  util/parse-events.y:514:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    514 |      (void *) $2, $6, $4);
        |      ^
  util/parse-events.y:531:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    531 |       (void *) $2, NULL, $4)) {
        |       ^
  util/parse-events.y:547:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    547 |      (void *) $2, $4, 0);
        |      ^
  util/parse-events.y:564:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    564 |       (void *) $2, NULL, 0)) {
        |       ^

Fixes: cabbf26821aa210f ("perf parse: Before yyabort-ing free components")
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Yonghong Song <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

doc: net: dsa: Fix typo in config code sample

In the "single port" example code for configuring a DSA switch without
tagging support from userspace the command to bring up the "lan2" link
was typo'd.

Signed-off-by: Paul Barker <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'fixes-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull misc build failure fixes from Mike Rapoport:
"Fix min_low_pfn/max_low_pfn build errors on ia64 and microblaze.

  Some configurations of ia64 and microblaze use min_low_pfn and
  max_low_pfn in pfn_valid(). This causes build failures for modules
  that use pfn_valid().

  The fix is to add EXPORT_SYMBOL() for these variables on ia64 and
  microblaze"

* tag 'fixes-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  ia64: fix min_low_pfn/max_low_pfn build errors
  microblaze: fix min_low_pfn/max_low_pfn build errors

Merge tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull affs fix from David Sterba:
"One fix to make permissions work the same way as on AmigaOS"

* tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
affs: fix basic permission bits to actually work

xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files

The realtime flag only applies to the data fork, so don't use the
realtime block number checks on the attr fork of a realtime file.

Fixes: 30b0984d9117 ("xfs: refactor bmap record validation")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Eric Sandeen <[email protected]>

Merge tag 'media/v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

- a compilation fix issue with ti-vpe on arm 32 bits

- two Kconfig fixes for imx214 and max9286 drivers

- a kernel information leak at v4l2-core on time32 compat ioctls

- some fixes at rc core unbind logic

- a fix at mceusb driver for it to not use GFP_ATOMIC

- fixes at cedrus and vicodec drivers at the control handling logic

- a fix at gpio-ir-tx to avoid disabling interruts on a spinlock

* tag 'media/v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: mceusb: Avoid GFP_ATOMIC where it is not needed
  media: gpio-ir-tx: spinlock is not needed to disable interrupts
  media: rc: do not access device via sysfs after rc_unregister_device()
  media: rc: uevent sysfs file races with rc_unregister_device()
  media: max9286: Depend on OF_GPIO
  media: i2c: imx214: select V4L2_FWNODE
  media: cedrus: Add missing v4l2_ctrl_request_hdl_put()
  media: vicodec: add missing v4l2_ctrl_request_hdl_put()
  media: media/v4l2-core: Fix kernel-infoleak in video_put_user()
  media: ti-vpe: cal: Fix compilation on 32-bit ARM

ALSA: hda/realtek - Improved routing for Thinkpad X1 7th/8th Gen

There've been quite a few regression reports about the lowered volume
(reduced to ca 65% from the previous level) on Lenovo Thinkpad X1
after the commit d2cd795c4ece ("ALSA: hda - fixup for the bass speaker
on Lenovo Carbon X1 7th gen").  Although the commit itself does the
right thing from HD-audio POV in order to have a volume control for
bass speakers, it seems that the machine has some secret recipe under
the hood.

Through experiments, Benjamin Poirier found out that the following
routing gives the best result:
* DAC1 (NID 0x02) -> Speaker pin (NID 0x14)
* DAC2 (NID 0x03) -> Shared by both Bass Speaker pin (NID 0x17) &
                     Headphone pin (0x21)
* DAC3 (NID 0x06) -> Unused

DAC1 seems to have some equalizer internally applied, and you'd get
again the output in a bad quality if you connect this to the
headphone pin.  Hence the headphone is connected to DAC2, which is now
shared with the bass speaker pin.  DAC3 has no volume amp, hence it's
not connected at all.

For achieving the routing above, this patch introduced a couple of
workarounds:

* The connection list of bass speaker pin (NID 0x17) is reduced not to
  include DAC3 (NID 0x06)
* Pass preferred_pairs array to specify the fixed connection

Here, both workarounds are needed because the generic parser prefers
the individual DAC assignment over others.

When the routing above is applied, the generic parser creates the two
volume controls "Front" and "Bass Speaker".  Since we have only two
DACs for three output pins, those are not fully controlling each
output individually, and it would confuse PulseAudio.  For avoiding
the pitfall, in this patch, we rename those volume controls to some
unique ones ("DAC1" and "DAC2").  Then PulseAudio ignore them and
concentrate only on the still good-working "Master" volume control.
If a user still wants to control each DAC volume, they can still
change manually via "DAC1" and "DAC2" volume controls.

Fixes: d2cd795c4ece ("ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen")
Reported-by: Benjamin Poirier <[email protected]>
Reviewed-by: Jaroslav Kysela <[email protected]>
Tested-by: Benjamin Poirier <[email protected]>
Cc: <[email protected]>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207407#c10
BugLink: https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3214171
BugLink: https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3276276
Link: https://lore/kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

MIPS: SNI: Fix SCSI interrupt

On RM400(a20r) machines ISA and SCSI interrupts share the same interrupt
line. Commit 49e6e07e3c80 ("MIPS: pass non-NULL dev_id on shared
request_irq()") accidently dropped the IRQF_SHARED bit, which breaks
registering SCSI interrupt. Put back IRQF_SHARED and add dev_id for
ISA interrupt.

Fixes: 49e6e07e3c80 ("MIPS: pass non-NULL dev_id on shared request_irq()")
Signed-off-by: Thomas Bogendoerfer <[email protected]>

MIPS: add missing MSACSR and upper MSA initialization

In cc97ab235f3f ("MIPS: Simplify FP context initialization), init_fp_ctx
just initialize the fp/msa context, and own_fp_inatomic just restore
FCSR and 64bit FP regs from it, but miss MSACSR and upper MSA regs for
MSA, so MSACSR and MSA upper regs's value from previous task on current
cpu can leak into current task and cause unpredictable behavior when MSA
context not initialized.

Fixes: cc97ab235f3f ("MIPS: Simplify FP context initialization")
Signed-off-by: Huang Pei <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>

x86/mm/32: Bring back vmalloc faulting on x86_32

One can not simply remove vmalloc faulting on x86-32. Upstream

commit: 7f0a002b5a21 ("x86/mm: remove vmalloc faulting")

removed it on x86 alltogether because previously the
arch_sync_kernel_mappings() interface was introduced. This interface
added synchronization of vmalloc/ioremap page-table updates to all
page-tables in the system at creation time and was thought to make
vmalloc faulting obsolete.

But that assumption was incredibly naive.

It turned out that there is a race window between the time the vmalloc
or ioremap code establishes a mapping and the time it synchronizes
this change to other page-tables in the system.

During this race window another CPU or thread can establish a vmalloc
mapping which uses the same intermediate page-table entries (e.g. PMD
or PUD) and does no synchronization in the end, because it found all
necessary mappings already present in the kernel reference page-table.

But when these intermediate page-table entries are not yet
synchronized, the other CPU or thread will continue with a vmalloc
address that is not yet mapped in the page-table it currently uses,
causing an unhandled page fault and oops like below:

BUG: unable to handle page fault for address: fe80c000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
*pde = 33183067 *pte = a8648163
Oops: 0002 [#1] SMP
CPU: 1 PID: 13514 Comm: cve-2017-17053 Tainted: G
...
Call Trace:
ldt_dup_context+0x66/0x80
dup_mm+0x2b3/0x480
copy_process+0x133b/0x15c0
_do_fork+0x94/0x3e0
__ia32_sys_clone+0x67/0x80
__do_fast_syscall_32+0x3f/0x70
do_fast_syscall_32+0x29/0x60
do_SYSENTER_32+0x15/0x20
entry_SYSENTER_32+0x9f/0xf2
EIP: 0xb7eef549

So the arch_sync_kernel_mappings() interface is racy, but removing it
would mean to re-introduce the vmalloc_sync_all() interface, which is
even more awful. Keep arch_sync_kernel_mappings() in place and catch
the race condition in the page-fault handler instead.

Do a partial revert of above commit to get vmalloc faulting on x86-32
back in place.

Fixes: 7f0a002b5a21 ("x86/mm: remove vmalloc faulting")
Reported-by: Naresh Kamboju <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

x86/cmdline: Disable jump tables for cmdline.c

When CONFIG_RETPOLINE is disabled, Clang uses a jump table for the
switch statement in cmdline_find_option (jump tables are disabled when
CONFIG_RETPOLINE is enabled). This function is called very early in boot
from sme_enable() if CONFIG_AMD_MEM_ENCRYPT is enabled. At this time,
the kernel is still executing out of the identity mapping, but the jump
table will contain virtual addresses.

Fix this by disabling jump tables for cmdline.c when AMD_MEM_ENCRYPT is
enabled.

Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

dmaengine: ti: k3-udma: Update rchan_oes_offset for am654 SYSFW ABI 3.0

SYSFW ABI 3.0 has changed the rchan_oes_offset value for am654 to support
SR2.

Since the kernel now needs SYSFW API 3.0 to work because the merged irqchip
update, we need to also update the am654 rchan_oes_offset.

Signed-off-by: Peter Ujfalusi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>

drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug

Thanks to NVIDIA for confirming this workaround, and clarifying which HW
is affected.

Signed-off-by: Ben Skeggs <[email protected]>
Tested-by: Alexander Kapshuk <[email protected]>

drm/nouveau/kms/nv50-gp1xx: disable notifies again after core update

This was lost during the header conversion.

Signed-off-by: Ben Skeggs <[email protected]>

drm/nouveau/kms/nv50-: add some whitespace before debug message

Signed-off-by: Ben Skeggs <[email protected]>

drm/nouveau/kms/gv100-: Include correct push header in crcc37d.c

Looks like when we converted everything over to Nvidia's class headers,
we mistakenly included the nvif/push507b.h instead of nvif/pushc37b.h,
which resulted in breaking CRC reporting for volta+:

nouveau 0000:1f:00.0: disp: chid 0 stat 10003361 reason 3
[RESERVED_METHOD] mthd 0d84 data 00000000 code 00000000
nouveau 0000:1f:00.0: disp: chid 0 stat 10003360 reason 3
[RESERVED_METHOD] mthd 0d80 data 00000000 code 00000000
nouveau 0000:1f:00.0: DRM: CRC notifier ctx for head 3 not finished
after 50ms

So, fix that.

Signed-off-by: Lyude Paul <[email protected]>
Fixes: c4b27bc8682c ("drm/nouveau/kms/nv50-: convert core crc_set_src() to new push macros")
Signed-off-by: Ben Skeggs <[email protected]>

drm/radeon: Prefer lower feedback dividers

Commit 2e26ccb119bd ("drm/radeon: prefer lower reference dividers")
fixed screen flicker for HP Compaq nx9420 but breaks other laptops like
Asus X50SL.

Turns out we also need to favor lower feedback dividers.

Users confirmed this change fixes the regression and doesn't regress the
original fix.

Fixes: 2e26ccb119bd ("drm/radeon: prefer lower reference dividers")
BugLink: https://bugs.launchpad.net/bugs/1791312
BugLink: https://bugs.launchpad.net/bugs/1861554
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Fix bug in reporting voltage for CIK

On my R9 390, the voltage was reported as a constant 1000 mV.
This was due to a bug in smu7_hwmgr.c, in the smu7_read_sensor()
function, where some magic constants were used in a condition,
to determine whether the voltage should be read from PLANE2_VID
or PLANE1_VID. The VDDC mask was incorrectly used, instead of
the VDDGFX mask.

This patch changes the code to use the correct defined constants
(and apply the correct bitshift), thus resulting in correct voltage reporting.

Signed-off-by: Sandeep Raghuraman <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Specify get_argument function for ci_smu_funcs

Starting in Linux 5.8, the graphics and memory clock frequency were not being
reported for CIK cards. This is a regression, since they were reported correctly
in Linux 5.7.

After investigation, I discovered that the smum_send_msg_to_smc() function,
attempts to call the corresponding get_argument() function of ci_smu_funcs.
However, the get_argument() function is not defined in ci_smu_funcs.

This patch fixes the bug by specifying the correct get_argument() function.

Fixes: a0ec225633d9f6 ("drm/amd/powerplay: unified interfaces for message issuing and response checking")
Signed-off-by: Sandeep Raghuraman <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amd/pm: enable MP0 DPM for sienna_cichlid

Enable MP0 clock DPM for sienna_cichlid.

Signed-off-by: Jiansong Chen <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting

Normally softwareshutdowntemp should be greater than Thotspotlimit.
However, on some VEGA10 ASIC, the softwareshutdowntemp is 91C while
Thotspotlimit is 105C. This seems not right and may trigger some
false alarms.

Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/amd/pm: fix is_dpm_running() run error on 32bit system

v1:
the C type "unsigned long" size is 32bit on 32bit system,
it will cause code logic error, so replace it with "uint64_t".

v2:
remove duplicate cast operation.

Signed-off-by: Kevin <[email protected]>
Suggest-by: Jiansong Chen <[email protected]>
Reviewed-by: Jiansong Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

kconfig: remove redundant assignment prompt = prompt

Semi-automatic removing of localization macros changed the line
from "prompt = _(prompt);" to "prompt = prompt;". Drop the
reduntand assignment.

Fixes: 694c49a7c01c ("kconfig: drop localization support")
Signed-off-by: Denis Efremov <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: Documentation: clean up makefiles.rst

This is a general cleanup of kbuild/makefiles.rst:

* Use "Chapter" for major heading references and use "section" for
the next-level heading references, for consistency.
* Section 3.8 was deleted long ago.
* Drop the ending ':' in section names in the contents list.
* Correct some section numbering references.
* Correct verb agreement typo.
* Fix run-on sentence punctuation.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

kconfig: streamline_config.pl: check defined(ENV variable) before using it

A user reported:
'Use of uninitialized value $ENV{"LMC_KEEP"} in split at
./scripts/kconfig/streamline_config.pl line 596.'

so first check that $ENV{LMC_KEEP} is defined before trying
to use it.

Fixes: c027b02d89fd ("streamline_config.pl: add LMC_KEEP to preserve some kconfigs")
Signed-off-by: Randy Dunlap <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>

block: allow for_each_bvec to support zero len bvec

Block layer usually doesn't support or allow zero-length bvec. Since
commit 1bdc76aea115 ("iov_iter: use bvec iterator to implement
iterate_bvec()"), iterate_bvec() switches to bvec iterator. However,
Al mentioned that 'Zero-length segments are not disallowed' in iov_iter.

Fixes for_each_bvec() so that it can move on after seeing one zero
length bvec.

Fixes: 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()")
Reported-by: syzbot <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: <[email protected]>
Link: https://www.mail-archive.com/[email protected]/msg2262077.html
Signed-off-by: Jens Axboe <[email protected]>

net: dp83867: Fix WoL SecureOn password

Fix the registers being written to as the values were being over written
when writing the same registers.

Fixes: caabee5b53f5 ("net: phy: dp83867: support Wake on LAN")
Signed-off-by: Dan Murphy <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nfp: flower: fix ABI mismatch between driver and firmware

Fix an issue where the driver wrongly detected ipv6 neighbour updates
from the NFP as corrupt. Add a reserved field on the kernel side so
it is similar to the ipv4 version of the struct and has space for the
extra bytes from the card.

Fixes: 9ea9bfa12240 ("nfp: flower: support ipv6 tunnel keep-alive messages from fw")
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: fix shutdown() of connectionless socket

syzbot is reporting hung task at nbd_ioctl() [1], for there are two
problems regarding TIPC's connectionless socket's shutdown() operation.

----------
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <linux/nbd.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        const int fd = open("/dev/nbd0", 3);
        alarm(5);
        ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0));
        ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */
        return 0;
}
----------

One problem is that wait_for_completion() from flush_workqueue() from
nbd_start_device_ioctl() from nbd_ioctl() cannot be completed when
nbd_start_device_ioctl() received a signal at wait_event_interruptible(),
for tipc_shutdown() from kernel_sock_shutdown(SHUT_RDWR) from
nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl()
is failing to wake up a WQ thread sleeping at wait_woken() from
tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from
nbd_read_stat() from recv_work() scheduled by nbd_start_device() from
nbd_start_device_ioctl(). Fix this problem by always invoking
sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is
called.

The other problem is that tipc_wait_for_rcvmsg() cannot return when
tipc_shutdown() is called, for tipc_shutdown() sets sk->sk_shutdown to
SEND_SHUTDOWN (despite "how" is SHUT_RDWR) while tipc_wait_for_rcvmsg()
needs sk->sk_shutdown set to RCV_SHUTDOWN or SHUTDOWN_MASK. Fix this
problem by setting sk->sk_shutdown to SHUTDOWN_MASK (like inet_shutdown()
does) when the socket is connectionless.

[1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b2

Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: Fix sysctl max for fib_multipath_hash_policy

Cited commit added the possible value of '2', but it cannot be set. Fix
it by adjusting the maximum value to '2'. This is consistent with the
corresponding IPv4 sysctl.

Before:

# sysctl -w net.ipv6.fib_multipath_hash_policy=2
sysctl: setting key "net.ipv6.fib_multipath_hash_policy": Invalid argument
net.ipv6.fib_multipath_hash_policy = 2
# sysctl net.ipv6.fib_multipath_hash_policy
net.ipv6.fib_multipath_hash_policy = 0

After:

# sysctl -w net.ipv6.fib_multipath_hash_policy=2
net.ipv6.fib_multipath_hash_policy = 2
# sysctl net.ipv6.fib_multipath_hash_policy
net.ipv6.fib_multipath_hash_policy = 2

Fixes: d8f74f0975d8 ("ipv6: Support multipath hashing on inner IP pkts")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Stephen Suryaputra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers/net/wan/hdlc: Change the default of hard_header_len to 0

Change the default value of hard_header_len in hdlc.c from 16 to 0.

Currently there are 6 HDLC protocol drivers, among them:

hdlc_raw_eth, hdlc_cisco, hdlc_ppp, hdlc_x25 set hard_header_len when
attaching the protocol, overriding the default. So this patch does not
affect them.

hdlc_raw and hdlc_fr don't set hard_header_len when attaching the
protocol. So this patch will change the hard_header_len of the HDLC
device for them from 16 to 0.

This is the correct change because both hdlc_raw and hdlc_fr don't have
header_ops, and the code in net/packet/af_packet.c expects the value of
hard_header_len to be consistent with header_ops.

In net/packet/af_packet.c, in the packet_snd function,
for AF_PACKET/DGRAM sockets it would reserve a headroom of
hard_header_len and call dev_hard_header to fill in that headroom,
and for AF_PACKET/RAW sockets, it does not reserve the headroom and
does not call dev_hard_header, but checks if the user has provided a
header of length hard_header_len (in function dev_validate_header).

Cc: Krzysztof Halasa <[email protected]>
Cc: Martin Schiller <[email protected]>
Signed-off-by: Xie He <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: gemini: Fix another missing clk_disable_unprepare() in probe

We recently added some calls to clk_disable_unprepare() but we missed
the last error path if register_netdev() fails.

I made a couple cleanups so we avoid mistakes like this in the future.
First I reversed the "if (!ret)" condition and pulled the code in one
indent level. Also, the "port->netdev = NULL;" is not required because
"port" isn't used again outside this function so I deleted that line.

Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- data sanitization and validtion fixes for report descriptor parser
   from Marc Zyngier

- memory leak fix for hid-elan driver from Dinghao Liu

- two device-specific quirks

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: core: Sanitize event code and type when mapping input
  HID: core: Correctly handle ReportSize being zero
  HID: elan: Fix memleak in elan_input_configured
  HID: microsoft: Add rumble support for the 8bitdo SN30 Pro+ controller
  HID: quirks: Set INCREMENT_USAGE_ON_DUPLICATE for all Saitek X52 devices

net: bcmgenet: fix mask check in bcmgenet_validate_flow()

VALIDATE_MASK(eth_mask->h_source) is checked twice in a row in
bcmgenet_validate_flow(). Add VALIDATE_MASK(eth_mask->h_dest)
instead.

Fixes: 3e370952287c ("net: bcmgenet: add support for ethtool rxnfc flows")
Signed-off-by: Denis Efremov <[email protected]>
Acked-by: Doug Berger <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

amd-xgbe: Add support for new port mode

Add support for a new port mode that is a backplane connection without
support for auto negotiation.

Signed-off-by: Shyam Sundar S K <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'for-5.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

- writecache fix to allow dax_direct_access() to partitioned pmem
   devices.

- multipath fix to avoid any Path Group initialization if
   'pg_init_in_progress' isn't set.

- crypt fix to use DECLARE_CRYPTO_WAIT() for onstack wait structures.

- integrity fix to properly check integrity after device creation when
   in bitmap mode.

- thinp and cache target __create_persistent_data_objects() fixes to
   reset the metadata's dm_block_manager pointer from PTR_ERR to NULL
   before returning from error path.

- persistent-data block manager fix to guard against dm_block_manager
   NULL pointer dereference in dm_bm_is_read_only() and update various
   opencoded bm->read_only checks to use dm_bm_is_read_only() instead.

* tag 'for-5.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm thin metadata: Fix use-after-free in dm_bm_set_read_only
  dm thin metadata:  Avoid returning cmd->bm wild pointer on error
  dm cache metadata: Avoid returning cmd->bm wild pointer on error
  dm integrity: fix error reporting in bitmap mode after creation
  dm crypt: Initialize crypto wait structures
  dm mpath: fix racey management of PG initialization
  dm writecache: handle DAX to partitions on persistent memory correctly

Merge tag 'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
"Various small corruption fixes that have come in during the past
  month:

   - Avoid a log recovery failure for an insert range operation by
     rolling deferred ops incrementally instead of at the end.

   - Fix an off-by-one error when calculating log space reservations for
     anything involving an inode allocation or free.

   - Fix a broken shortform xattr verifier.

   - Ensure that the shortform xattr header padding is always
     initialized to zero"

* tag 'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: initialize the shortform attr header padding entry
  xfs: fix boundary test in xfs_attr_shortform_verify
  xfs: fix off-by-one in inode alloc block reservation calculation
  xfs: finish dfops on every insert range shift iteration

Merge branch 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull epoll fixup from Al Viro:
"Fixup for epoll regression; there's a better solution longer term, but
this is the least intrusive fix"

* 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix regression in "epoll: Keep a reference on files added to the check list"

dm thin metadata: Fix use-after-free in dm_bm_set_read_only

The following error ocurred when testing disk online/offline:

[  301.798344] device-mapper: thin: 253:5: aborting current metadata transaction
[  301.848441] device-mapper: thin: 253:5: failed to abort metadata transaction
[  301.849206] Aborting journal on device dm-26-8.
[  301.850489] EXT4-fs error (device dm-26) in __ext4_new_inode:943: Journal has aborted
[  301.851095] EXT4-fs (dm-26): Delayed block allocation failed for inode 398742 at logical offset 181 with max blocks 19 with error 30
[  301.854476] BUG: KASAN: use-after-free in dm_bm_set_read_only+0x3a/0x40 [dm_persistent_data]

Reason is:

metadata_operation_failed
    abort_transaction
        dm_pool_abort_metadata
    __create_persistent_data_objects
        r = __open_or_format_metadata
        if (r) --> If failed will free pmd->bm but pmd->bm not set NULL
    dm_block_manager_destroy(pmd->bm);
    set_pool_mode
dm_pool_metadata_read_only(pool->pmd);
dm_bm_set_read_only(pmd->bm);  --> use-after-free

Add checks to see if pmd->bm is NULL in dm_bm_set_read_only and
dm_bm_set_read_write functions.  If bm is NULL it means creating the
bm failed and so dm_bm_is_read_only must return true.

Signed-off-by: Ye Bin <[email protected]>
Cc: [email protected]
Signed-off-by: Mike Snitzer <[email protected]>

dm thin metadata: Avoid returning cmd->bm wild pointer on error

Maybe __create_persistent_data_objects() caller will use PTR_ERR as a
pointer, it will lead to some strange things.

Signed-off-by: Ye Bin <[email protected]>
Cc: [email protected]
Signed-off-by: Mike Snitzer <[email protected]>

dm cache metadata: Avoid returning cmd->bm wild pointer on error

Maybe __create_persistent_data_objects() caller will use PTR_ERR as a
pointer, it will lead to some strange things.

Signed-off-by: Ye Bin <[email protected]>
Cc: [email protected]
Signed-off-by: Mike Snitzer <[email protected]>

ALSA: hda: use consistent HDAudio spelling in comments/docs

We use HDaudio and HDAudio, pick one to make searches easier.
No functionality change

Also fix timestamping typo in documentation.

Reported-by: Guennadi Liakhovetski <[email protected]>
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Reviewed-by: Bard Liao <[email protected]>
Reviewed-by: Guennadi Liakhovetski <[email protected]>
Signed-off-by: Kai Vehmanen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks

All three generations of Sandisk SSDs lock up hard intermittently.
Experiments showed that disabling NCQ lowered the failure rate significantly
and the kernel has been disabling NCQ for some models of SD7's and 8's,
which is obviously undesirable.

Karthik worked with Sandisk to root cause the hard lockups to trim commands
larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which
limits max trim size to 128M and applies it to all three generations of
Sandisk SSDs.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Karthik Shivaram <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>

ALSA: hda: add dev_dbg log when driver is not selected

On SKL+ Intel platforms, the driver selection is handled by the
snd_intel_dspcfg, and when the HDaudio legacy driver is not selected,
be it with the auto-selection or user preferences with a kernel
parameter, the probe aborts with no logs, only a -ENODEV return value.

Having no dmesg trace, even with dynamic debug enabled, makes support
more complicated than it needs to be, and even experienced users can
be fooled. A simple dev_dbg() trace solves this problem.

BugLink: https://github.com/thesofproject/linux/issues/2330
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Reviewed-by: Bard Liao <[email protected]>
Reviewed-by: Guennadi Liakhovetski <[email protected]>
Signed-off-by: Kai Vehmanen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hda: fix a runtime pm issue in SOF when integrated GPU is disabled

In snd_hdac_device_init pm_runtime_set_active is called to
increase child_count in parent device. But when it is failed
to build connection with GPU for one case that integrated
graphic gpu is disabled, snd_hdac_ext_bus_device_exit will be
invoked to clean up a HD-audio extended codec base device. At
this time the child_count of parent is not decreased, which
makes parent device can't get suspended.

This patch calls pm_runtime_set_suspended to decrease child_count
in parent device in snd_hdac_device_exit to match with
snd_hdac_device_init. pm_runtime_set_suspended can make sure that
it will not decrease child_count if the device is already suspended.

Signed-off-by: Rander Wang <[email protected]>
Reviewed-by: Ranjani Sridharan <[email protected]>
Reviewed-by: Pierre-Louis Bossart <[email protected]>
Reviewed-by: Bard Liao <[email protected]>
Reviewed-by: Guennadi Liakhovetski <[email protected]>
Signed-off-by: Kai Vehmanen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hda: hdmi - add Rocketlake support

Add Rocketlake HDMI codec support. Rocketlake shares
the pin-to-port mapping table with Tigerlake.

Signed-off-by: Rander Wang <[email protected]>
Reviewed-by: Pierre-Louis Bossart <[email protected]>
Reviewed-by: Ranjani Sridharan <[email protected]>
Signed-off-by: Kai Vehmanen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

io_uring: no read/write-retry on -EAGAIN error and O_NONBLOCK marked file

Actually two things that need fixing up here:

- The io_rw_reissue() -EAGAIN retry is explicit to block devices and
  regular files, so don't ever attempt to do that on other types of
  files.

- If we hit -EAGAIN on a nonblock marked file, don't arm poll handler for
  it. It should just complete with -EAGAIN.

Cc: [email protected]
Reported-by: Norman Maurer <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

fix regression in "epoll: Keep a reference on files added to the check list"

epoll_loop_check_proc() can run into a file already committed to destruction;
we can't grab a reference on those and don't need to add them to the set for
reverse path check anyway.

Tested-by: Marc Zyngier <[email protected]>
Fixes: a9ed4a6560b8 ("epoll: Keep a reference on files added to the check list")
Signed-off-by: Al Viro <[email protected]>

io_uring: set table->files[i] to NULL when io_sqe_file_register failed

While io_sqe_file_register() failed in __io_sqe_files_update(),
table->files[i] still point to the original file which may freed
soon, and that will trigger use-after-free problems.

Cc: [email protected]
Fixes: f3bd9dae3708 ("io_uring: fix memleak in __io_sqe_files_update()")
Signed-off-by: Jiufei Xue <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Merge branch 'topic/tasklet-convert' into for-linus

Pull tasklet API conversions.

Signed-off-by: Takashi Iwai <[email protected]>

ALSA: ua101: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: usb-audio: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ASoC: txx9: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Acked-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ASoC: siu: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Acked-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ASoC: fsl_esai: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Acked-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hdsp: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: riptide: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: pci/asihpi: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: firewire: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Acked-by: Takashi Sakamoto <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: core: convert tasklets to use new tasklet_setup() API

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>

s390: update defconfigs

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

s390: fix GENERIC_LOCKBREAK dependency typo in Kconfig

Commit fa686453053b ("sched/rt, s390: Use CONFIG_PREEMPTION")
changed a bunch of uses of CONFIG_PREEMPT to _PREEMPTION.
Except in the Kconfig it used two T's. That's the only place
in the system where that spelling exists, so let's fix that.

Fixes: fa686453053b ("sched/rt, s390: Use CONFIG_PREEMPTION")
Cc: <[email protected]> # 5.6
Signed-off-by: Eric Farman <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>

drm/i915: Clear the repeater bit on HDCP disable

On HDCP disable, clear the repeater bit. This ensures if we connect a
non-repeater sink after a repeater, the bit is in the state we expect.

Fixes: ee5e5e7a5e0f ("drm/i915: Add HDCP framework + base implementation")
Cc: Chris Wilson <[email protected]>
Cc: Ramalingam C <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Sean Paul <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # v4.17+
Reviewed-by: Ramalingam C <[email protected]>
Signed-off-by: Sean Paul <[email protected]>
Signed-off-by: Ramalingam C <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 2cc0c7b520bf8ea20ec42285d4e3d37b467eb7f9)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915: Fix sha_text population code

This patch fixes a few bugs:

1- We weren't taking into account sha_leftovers when adding multiple
   ksvs to sha_text. As such, we were or'ing the end of ksv[j - 1] with
   the beginning of ksv[j]

2- In the sha_leftovers == 2 and sha_leftovers == 3 case, bstatus was
   being placed on the wrong half of sha_text, overlapping the leftover
   ksv value

3- In the sha_leftovers == 2 case, we need to manually terminate the
   byte stream with 0x80 since the hardware doesn't have enough room to
   add it after writing M0

The upside is that all of the HDCP supported HDMI repeaters I could
find on Amazon just strip HDCP anyways, so it turns out to be _really_
hard to hit any of these cases without an MST hub, which is not (yet)
supported. Oh, and the sha_leftovers == 1 case works perfectly!

Fixes: ee5e5e7a5e0f ("drm/i915: Add HDCP framework + base implementation")
Cc: Chris Wilson <[email protected]>
Cc: Ramalingam C <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Sean Paul <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # v4.17+
Reviewed-by: Ramalingam C <[email protected]>
Signed-off-by: Sean Paul <[email protected]>
Signed-off-by: Ramalingam C <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 1f0882214fd0037b74f245d9be75c31516fed040)
Signed-off-by: Jani Nikula <[email protected]>

drm/i915/display: Ensure that ret is always initialized in icl_combo_phy_verify_state

Clang warns:

drivers/gpu/drm/i915/display/intel_combo_phy.c:268:3: warning: variable
'ret' is uninitialized when used here [-Wuninitialized]
                ret &= check_phy_reg(dev_priv, phy, ICL_PORT_TX_DW8_LN0(phy),
                ^~~
drivers/gpu/drm/i915/display/intel_combo_phy.c:261:10: note: initialize
the variable 'ret' to silence this warning
        bool ret;
                ^
                 = 0
1 warning generated.

In practice, the bug this warning appears to be concerned with would not
actually matter because ret gets initialized to the return value of
cnl_verify_procmon_ref_values. However, that does appear to be a bug
since it means the first hunk of the patch this fixes won't actually do
anything (since the values of check_phy_reg won't factor into the final
ret value). Initialize ret to true then make all of the assignments a
bitwise AND with itself so that the function always does what it should
do.

Fixes: 239bef676d8e ("drm/i915/display: Implement new combo phy initialization step")
Link: https://github.com/ClangBuiltLinux/linux/issues/1094
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: José Roberto de Souza <[email protected]>
(cherry picked from commit 2034c2129bc4a91d471815d4dc7a2a69eaa5338d)
Signed-off-by: Jani Nikula <[email protected]>

arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE

In the arm64 module linker script, the section .text.ftrace_trampoline
is specified unconditionally regardless of whether CONFIG_DYNAMIC_FTRACE
is enabled (this is simply due to the limitation that module linker
scripts are not preprocessed like the vmlinux one).

Normally, for .plt and .text.ftrace_trampoline, the section flags
present in the module binary wouldn't matter since module_frob_arch_sections()
would assign them manually anyway. However, the arm64 module loader only
sets the section flags for .text.ftrace_trampoline when CONFIG_DYNAMIC_FTRACE=y.
That's only become problematic recently due to a recent change in
binutils-2.35, where the .text.ftrace_trampoline section (along with the
.plt section) is now marked writable and executable (WAX).

We no longer allow writable and executable sections to be loaded due to
commit 5c3a7db0c7ec ("module: Harden STRICT_MODULE_RWX"), so this is
causing all modules linked with binutils-2.35 to be rejected under arm64.
Drop the IS_ENABLED(CONFIG_DYNAMIC_FTRACE) check in module_frob_arch_sections()
so that the section flags for .text.ftrace_trampoline get properly set to
SHF_EXECINSTR|SHF_ALLOC, without SHF_WRITE.

Signed-off-by: Jessica Yu <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Link: http://lore.kernel.org/r/20200831094651.GA16385@linux-8ccs
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>

arm64: Remove exporting cpu_logical_map symbol

Commit eaecca9e7710 ("arm64: Fix __cpu_logical_map undefined issue")
exported cpu_logical_map in order to fix tegra194-cpufreq module build
failure.

As this might potentially cause problem while supporting physical CPU
hotplug, tegra194-cpufreq module was reworded to avoid use of
cpu_logical_map() via the commit 93d0c1ab2328 ("cpufreq: replace
cpu_logical_map() with read_cpuid_mpir()")

Since cpu_logical_map was exported to fix the module build temporarily,
let us remove the same before it gains any user again.

Signed-off-by: Sudeep Holla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>

Merge tag 'perf-tools-fixes-for-v5.9-2020-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Fix infinite loop in the TUI for grouped events in 'perf top/record',
   eg when using "perf top -e '{cycles,instructions,cache-misses}'".

- Fix segfault by skipping side-band event setup if HAVE_LIBBPF_SUPPORT
   is not set.

- Fix synthesized branch stacks generated from CoreSight ETM trace and
   Intel PT hardware traces.

- Fix error when synthesizing events from ARM SPE hardware trace.

- The SNOOPX and REMOTE offsets in the data_src bitmask in perf records
   were were both 37, SNOOPX is 38, fix it.

- Fix use of CPU list with summary option in 'perf sched timehist'.

- Avoid an uninitialized read when using fake PMUs.

- Set perf_event_attr.exclude_guest=1 for user-space counting.

- Don't order events when doing a 'perf report -D' raw dump of
   perf.data records.

- Set NULL sentinel in pmu_events table in "Parse and process metrics"
   'perf test'

- Fix basic bpf filtering 'perf test' on s390x.

- Fix out of bounds array access in the 'perf stat' print_counters()
   evlist method.

- Add mwait_idle_with_hints.constprop.0 to the list of idle symbols.

- Use %zd for size_t printf formats on 32-bit.

- Correct the help info of "perf record --no-bpf-event" option.

- Add entries for CoreSight and Arm SPE tooling to MAINTAINERS.

* tag 'perf-tools-fixes-for-v5.9-2020-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf report: Disable ordered_events for raw dump
  perf tools: Correct SNOOPX field offset
  perf intel-pt: Fix corrupt data after perf inject from
  perf cs-etm: Fix corrupt data after perf inject from
  perf top/report: Fix infinite loop in the TUI for grouped events
  perf parse-events: Avoid an uninitialized read when using fake PMUs
  perf stat: Fix out of bounds array access in the print_counters() evlist method
  perf test: Set NULL sentinel in pmu_events table in "Parse and process metrics" test
  perf parse-events: Set exclude_guest=1 for user-space counting
  perf record: Correct the help info of option "--no-bpf-event"
  perf tools: Use %zd for size_t printf formats on 32-bit
  MAINTAINERS: Add entries for CoreSight and Arm SPE tooling
  perf: arm-spe: Fix check error when synthesizing events
  perf symbols: Add mwait_idle_with_hints.constprop.0 to the list of idle symbols
  perf top: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
  perf sched timehist: Fix use of CPU list with summary option
  perf test: Fix basic bpf filtering test

Merge tag 'for-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
"Two small fixes and a bunch of lockdep fixes for warnings that show up
  with an upcoming tree locking update but are valid with current locks
  as well"

* tag 'for-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: tree-checker: fix the error message for transid error
  btrfs: set the lockdep class for log tree extent buffers
  btrfs: set the correct lockdep class for new nodes
  btrfs: allocate scrub workqueues outside of locks
  btrfs: fix potential deadlock in the search ioctl
  btrfs: drop path before adding new uuid tree entry
  btrfs: block-group: fix free-space bitmap threshold

blk-stat: make q->stats->lock irqsafe

blk-iocost calls blk_stat_enable_accounting() while holding an irqsafe lock
which triggers a lockdep splat because q->stats->lock isn't irqsafe. Let's
make it irqsafe.

Signed-off-by: Tejun Heo <[email protected]>
Fixes: cd006509b0a9 ("blk-iocost: account for IO size when testing latencies")
Cc: [email protected] # v5.8+
Signed-off-by: Jens Axboe <[email protected]>

blk-iocost: ioc_pd_free() shouldn't assume irq disabled

ioc_pd_free() grabs irq-safe ioc->lock without ensuring that irq is disabled
when it can be called with irq disabled or enabled. This has a small chance
of causing A-A deadlocks and triggers lockdep splats. Use irqsave operations
instead.

Signed-off-by: Tejun Heo <[email protected]>
Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Cc: [email protected] # v5.4+
Signed-off-by: Jens Axboe <[email protected]>

net: usb: dm9601: Add USB ID of Keenetic Plus DSL

Keenetic Plus DSL is a xDSL modem that uses dm9620 as its USB interface.

Signed-off-by: Kamil Lorenc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vhost: fix typo in error message

"enable" should be "disable" when the function name is
vhost_disable_notify(), which does the disabling work.

Signed-off-by: Yunsheng Lin <[email protected]>
Acked-by: Jason Wang <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Three minor fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: scsi_debug: Remove superfluous close zone in resp_open_zone()
  scsi: libcxgbi: Fix a use after free in cxgbi_conn_xmit_pdu()
  scsi: qedf: Fix null ptr reference in qedf_stag_change_work

dm integrity: fix error reporting in bitmap mode after creation

The dm-integrity target did not report errors in bitmap mode just after
creation. The reason is that the function integrity_recalc didn't clean up
ic->recalc_bitmap as it proceeded with recalculation.

Fix this by updating the bitmap accordingly -- the double shift serves
to rounddown.

Signed-off-by: Mikulas Patocka <[email protected]>
Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode")
Cc: [email protected] # v5.2+
Signed-off-by: Mike Snitzer <[email protected]>

dm crypt: Initialize crypto wait structures

Use the DECLARE_CRYPTO_WAIT() macro to properly initialize the crypto
wait structures declared on stack before their use with
crypto_wait_req().

Fixes: 39d13a1ac41d ("dm crypt: reuse eboiv skcipher for IV generation")
Fixes: bbb1658461ac ("dm crypt: Implement Elephant diffuser for Bitlocker compatibility")
Cc: [email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm mpath: fix racey management of PG initialization

Commit 935fcc56abc3 ("dm mpath: only flush workqueue when needed")
changed flush_multipath_work() to avoid needless workqueue
flushing (of a multipath global workqueue). But that change didn't
realize the surrounding flush_multipath_work() code should also only
run if 'pg_init_in_progress' is set.

Fix this by only doing all of flush_multipath_work()'s PG init related
work if 'pg_init_in_progress' is set.

Otherwise multipath_wait_for_pg_init_completion() will run
unconditionally but the preceeding flush_workqueue(kmpath_handlerd)
may not. This could lead to deadlock (though only if kmpath_handlerd
never runs a corresponding work to decrement 'pg_init_in_progress').

It could also be, though highly unlikely, that the kmpath_handlerd
work that does PG init completes before 'pg_init_in_progress' is set,
and then an intervening DM table reload's multipath_postsuspend()
triggers flush_multipath_work().

Fixes: 935fcc56abc3 ("dm mpath: only flush workqueue when needed")
Cc: [email protected]
Reported-by: Ben Marzinski <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm writecache: handle DAX to partitions on persistent memory correctly

The function dax_direct_access doesn't take partitions into account,
it always maps pages from the beginning of the device. Therefore,
persistent_memory_claim() must get the partition offset using
get_start_sect() and add it to the page offsets passed to
dax_direct_access().

Signed-off-by: Mikulas Patocka <[email protected]>
Fixes: 48debafe4f2f ("dm: add writecache target")
Cc: [email protected] # 4.18+
Signed-off-by: Mike Snitzer <[email protected]>

net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()

On machines with much memory (> 2 TByte) and log_mtts_per_seg == 0, a
max_order of 31 will be passed to mlx_buddy_init(), which results in
s = BITS_TO_LONGS(1 << 31) becoming a negative value, leading to
kvmalloc_array() failure when it is converted to size_t.

mlx4_core 0000:b1:00.0: Failed to initialize memory region table, aborting
mlx4_core: probe of 0000:b1:00.0 failed with error -12

Fix this issue by changing the left shifting operand from a signed literal to
an unsigned one.

Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Shung-Hsi Yu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled

This fixes the behavior of the scaling_max_freq and scaling_min_freq
sysfs files in systems which had turbo disabled by the BIOS.

Caleb noticed that the HWP is programmed to operate in the wrong
P-state range on his system when the CPUFREQ policy min/max frequency
is set via sysfs. This seems to be because in his system
intel_pstate_get_hwp_max() is returning the maximum turbo P-state even
though turbo was disabled by the BIOS, which causes intel_pstate to
scale kHz frequencies incorrectly e.g. setting the maximum turbo
frequency whenever the maximum guaranteed frequency is requested via
sysfs.

Tested-by: Caleb Callaway <[email protected]>
Signed-off-by: Francisco Jerez <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>
[ rjw: Minor subject edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: intel_pstate: Free memory only when turning off

When intel_pstate switches the operation mode from "active" to
"passive" or the other way around, freeing its data structures
representing CPUs and allocating them again from scratch is not
necessary and wasteful. Moreover, if these data structures are
preserved, the cached HWP Request MSR value from there may be
written to the MSR to start with to reinitialize it and help to
restore the EPP value set previously (it is set to 0xFF when CPUs
go offline to allow their SMT siblings to use the full range of
EPP values and that also happens when the driver gets unregistered).

Accordingly, modify the driver to only do a full cleanup on driver
object registration errors and when its status is changed to "off"
via sysfs and to write the cached HWP Request MSR value back to
the MSR on CPU init if the data structure representing the given
CPU is still there.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>

cpufreq: intel_pstate: Add ->offline and ->online callbacks

Add ->offline and ->online driver callbacks to prepare for taking a
CPU offline and to restore its working configuration when it goes
back online, respectively, to avoid invoking the ->init callback on
every CPU online which is quite a bit of unnecessary overhead.

Define ->offline and ->online so that they can be used in the
passive mode as well as in the active mode and because ->offline
will do the majority of ->stop_cpu work, the passive mode does
not need that callback any more, so drop it from there.

Also modify the active mode ->suspend and ->resume callbacks to
prevent them from interfering with the new ->offline and ->online
ones in case the latter are invoked withing the system-wide suspend
and resume code flow and make the passive mode use them too.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>

cpufreq: intel_pstate: Tweak the EPP sysfs interface

Modify the EPP sysfs interface to reject attempts to change the EPP
to values different from 0 ("performance") in the active mode with
the "performance" policy (ie. scaling_governor set to "performance"),
to avoid situations in which the kernel appears to discard data
passed to it via the EPP sysfs attribute.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Artem Bityutskiy <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>

cpufreq: intel_pstate: Update cached EPP in the active mode

Make intel_pstate update the cached EPP value when setting the EPP
via sysfs in the active mode just like it is the case in the passive
mode, for consistency, but also for the benefit of subsequent
changes.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>

cpufreq: intel_pstate: Refuse to turn off with HWP enabled

After commit f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive
mode with HWP enabled") it is possible to change the driver status
to "off" via sysfs with HWP enabled, which effectively causes the
driver to unregister itself, but HWP remains active and it forces the
minimum performance, so even if another cpufreq driver is loaded,
it will not be able to control the CPU frequency.

For this reason, make the driver refuse to change the status to
"off" with HWP enabled.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>

ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id

HSDK board has Micrel KSZ9031, recent commit
bcf3440c6dd ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY")
caused a breakdown of Ethernet.
Using 'phy-mode = "rgmii"' is not correct because accodring RGMII
specification it is necessary to have delay on RX (PHY to MAX)
which is not generated in case of "rgmii".
Using "rgmii-id" adds necessary delay and solves the issue.

Also adding name of PHY placed on HSDK board.

Signed-off-by: Evgeniy Didin <[email protected]>
Cc: Eugeniy Paltsev <[email protected]>
Cc: Alexey Brodkin <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

pktgen: fix error message with wrong function name

Error on calling kthread_create_on_node prints wrong function name,
kernel_thread.

Fixes: 94dcf29a11b3 ("kthread: use kthread_create_on_node()")
Signed-off-by: Leesoo Ahn <[email protected]>
Acked-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

arc: fix memory initialization for systems with two memory banks

Rework of memory map initialization broke initialization of ARC systems
with two memory banks. Before these changes, memblock was not aware of
nodes configuration and the memory map was always allocated from the
"lowmem" bank. After the addition of node information to memblock, the core
mm attempts to allocate the memory map for the "highmem" bank from its
node. The access to this memory using __va() fails because it can be only
accessed using kmap.

Anther problem that was uncovered is that {min,max}_high_pfn are calculated
from u64 high_mem_start variable which prevents truncation to 32-bit
physical address and the PFN values are above the node and zone boundaries.

Use phys_addr_t type for high_mem_start and high_mem_size to ensure
correspondence between PFNs and highmem zone boundaries and reserve the
entire highmem bank until mem_init() to avoid accesses to it before highmem
is enabled.

To test this:
1. Enable HIGHMEM in ARC config
2. Enable 2 memory banks in haps_hs.dts (uncomment the 2nd bank)

Fixes: 51930df5801e ("mm: free_area_init: allow defining max_zone_pfn in descending order")
Cc: [email protected] [5.8]
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
[vgupta: added instructions to test highmem]

Merge branch 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Pull operating performance points (OPP) framework fixes for 5.9-rc4 from
Viresh Kumar:

"This fixes reference counting for OPP tables. Few patches are getting
queued (for various subsystems) for 5.10 which depend on this to be
fixed first."

* 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Don't drop reference for an OPP table that was never parsed

ia64: fix min_low_pfn/max_low_pfn build errors

Fix min_low_pfn/max_low_pfn build errors for arch/ia64/: (e.g.)

ERROR: "max_low_pfn" [drivers/rpmsg/virtio_rpmsg_bus.ko] undefined!
ERROR: "min_low_pfn" [drivers/rpmsg/virtio_rpmsg_bus.ko] undefined!
ERROR: "min_low_pfn" [drivers/hwtracing/intel_th/intel_th_msu.ko] undefined!
ERROR: "max_low_pfn" [drivers/hwtracing/intel_th/intel_th_msu.ko] undefined!
ERROR: "min_low_pfn" [drivers/crypto/cavium/nitrox/n5pf.ko] undefined!
ERROR: "max_low_pfn" [drivers/crypto/cavium/nitrox/n5pf.ko] undefined!
ERROR: "max_low_pfn" [drivers/md/dm-integrity.ko] undefined!
ERROR: "min_low_pfn" [drivers/md/dm-integrity.ko] undefined!
ERROR: "max_low_pfn" [crypto/tcrypt.ko] undefined!
ERROR: "min_low_pfn" [crypto/tcrypt.ko] undefined!
ERROR: "min_low_pfn" [security/keys/encrypted-keys/encrypted-keys.ko] undefined!
ERROR: "max_low_pfn" [security/keys/encrypted-keys/encrypted-keys.ko] undefined!
ERROR: "min_low_pfn" [arch/ia64/kernel/mca_recovery.ko] undefined!
ERROR: "max_low_pfn" [arch/ia64/kernel/mca_recovery.ko] undefined!

David suggested just exporting min_low_pfn & max_low_pfn in
mm/memblock.c:
https://lore.kernel.org/lkml/alpine.DEB.2.22.394.2006291911220.1118534@chino.kir.corp.google.com/

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Acked-by: David Rientjes <[email protected]>
Acked-by: Tony Luck <[email protected]>
Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Mike Rapoport <[email protected]>

perf report: Disable ordered_events for raw dump

Disable ordered_events for report raw dump, because for raw dump we want
to see events as they are stored in the perf.data file, not sorted by
time.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf tools: Correct SNOOPX field offset

perf_event.h has macros that define the field offsets in the data_src
bitmask in perf records. The SNOOPX and REMOTE offsets were both 37.

These are distinct fields, and the bitfield layout in perf_mem_data_src
confirms that SNOOPX should be at offset 38.

Committer notes:

This was extracted from a larger patch that also contained kernel
changes.

Fixes: 52839e653b5629bd ("perf tools: Add support for printing new mem_info encodings")
Signed-off-by: Al Grant <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf intel-pt: Fix corrupt data after perf inject from

Commit 42bbabed09ce6208 ("perf tools: Add hw_idx in struct branch_stack")
changed the format of branch stacks in perf samples. When samples use
this new format, a flag must be set in the corresponding event.

Synthesized branch stacks generated from Intel PT were using the new
format, but not setting the event attribute, leading to consumers
seeing corrupt data. This patch fixes the issue by setting the event
attribute to indicate use of the new format.

Fixes: 42bbabed09ce6208 ("perf tools: Add hw_idx in struct branch_stack")
Signed-off-by: Al Grant <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Leo Yan <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf cs-etm: Fix corrupt data after perf inject from

Commit 42bbabed09ce6208 ("perf tools: Add hw_idx in struct branch_stack")
changed the format of branch stacks in perf samples. When samples use
this new format, a flag must be set in the corresponding event.

Synthesized branch stacks generated from CoreSight ETM trace were using
the new format, but not setting the event attribute, leading to
consumers seeing corrupt data. This patch fixes the issue by setting the
event attribute to indicate use of the new format.

Fixes: 42bbabed09ce6208 ("perf tools: Add hw_idx in struct branch_stack")
Signed-off-by: Al Grant <[email protected]>
Reviewed-by: Andrea Brunato <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Signed-off-by: Leo Yan <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf top/report: Fix infinite loop in the TUI for grouped events

For a while we need to have a dummy event for doing things like
receiving PERF_RECORD_COMM, PERF_RECORD_EXEC, etc for threads being
created and dying while we synthesize the pre-existing ones at tool
start.

This 'dummy' event is needed for keeping track of thread lifetime events
early in the session but are uninteresting otherwise, i.e. no need to
have it in a initial events menu for the non-grouped case, i.e. for:

# perf top -e cycles,instructions

or even for plain:

# perf top

When 'cycles' and that 'dummy' event are in place.

The code to remove that 'dummy' event ended up creating an endless loop
for the grouped case, i.e.:

# perf top -e '{cycles,instructions}'

Fix it.

Fixes: bee9ca1c8a237ca1 ("perf report TUI: Remove needless 'dummy' event from menu")
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf parse-events: Avoid an uninitialized read when using fake PMUs

With a fake_pmu the pmu_info isn't populated by perf_pmu__check_alias.
In this case, don't try to copy the uninitialized values to the evsel.

Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf stat: Fix out of bounds array access in the print_counters() evlist method

Fix a compile error on F32 and gcc version 10.1 on s390 in file
utils/stat-display.c.  The error does not show up with make DEBUG=y.  In
fact the issue shows up when using both compiler options -O6 and
-D_FORTIFY_SOURCE=2 (which are omitted with DEBUG=Y).

This is the offending call chain:

print_counter_aggr()
  printout(config, -1, 0, ...)  with 2nd parm id set to -1
    aggr_printout(config, x, id --> -1, ...) which leads to this code:
case AGGR_NONE:
                if (evsel->percore && !config->percore_show_thread) {
                        ....
                } else {
                        fprintf(config->output, "CPU%*d%s",
                                config->csv_output ? 0 : -7,
                                evsel__cpus(evsel)->map[id],
                        ^^ id is -1 !!!!
                                config->csv_sep);
                }

This is a compiler inlining issue which is detected on s390 but not on
other plattforms.

Output before:

# make util/stat-display.o
    .....

  util/stat-display.c: In function ‘perf_evlist__print_counters’:
  util/stat-display.c:121:4: error: array subscript -1 is below array
      bounds of ‘int[]’ [-Werror=array-bounds]
  121 |    fprintf(config->output, "CPU%*d%s",
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  122 |     config->csv_output ? 0 : -7,
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  123 |     evsel__cpus(evsel)->map[id],
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  124 |     config->csv_sep);
      |     ~~~~~~~~~~~~~~~~
  In file included from util/evsel.h:13,
                 from util/evlist.h:13,
                 from util/stat-display.c:9:
  /root/linux/tools/lib/perf/include/internal/cpumap.h:10:7:
  note: while referencing ‘map’
   10 |  int  map[];
      |       ^~~
  cc1: all warnings being treated as errors
  mv: cannot stat 'util/.stat-display.o.tmp': No such file or directory
  make[3]: *** [/root/linux/tools/build/Makefile.build:97: util/stat-display.o]
  Error 1
  make[2]: *** [Makefile.perf:716: util/stat-display.o] Error 2
  make[1]: *** [Makefile.perf:231: sub-make] Error 2
  make: *** [Makefile:110: util/stat-display.o] Error 2
  [root@t35lp46 perf]#

Output after:

  # make util/stat-display.o
    .....
  CC       util/stat-display.o
  [root@t35lp46 perf]#

Committer notes:

Removed the removal of {} enclosing the multiline else block, as pointed
out by Jiri Olsa.

Suggested-by: Jiri Olsa <[email protected]>
Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf test: Set NULL sentinel in pmu_events table in "Parse and process metrics" test

Linux 5.9 introduced perf test case "Parse and process metrics" and
on s390 this test case always dumps core:

  [root@t35lp67 perf]# ./perf test -vvvv -F 67
  67: Parse and process metrics                             :
  --- start ---
  metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
  parsing metric: inst_retired.any / cpu_clk_unhalted.thread
  Segmentation fault (core dumped)
  [root@t35lp67 perf]#

I debugged this core dump and gdb shows this call chain:

  (gdb) where
   #0  0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
   #1  0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
   #2  0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any",
            n=<optimized out>)
       at util/metricgroup.c:368
   #3  find_metric (map=<optimized out>, map=<optimized out>,
           metric=0x1e6ea20 "inst_retired.any")
      at util/metricgroup.c:765
   #4  __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0,
           metric_no_group=<optimized out>, m=<optimized out>)
      at util/metricgroup.c:844
   #5  resolve_metric (ids=0x0, map=0x0, metric_list=0x0,
          metric_no_group=<optimized out>)
      at util/metricgroup.c:881
   #6  metricgroup__add_metric (metric=<optimized out>,
        metric_no_group=metric_no_group@entry=false, events=<optimized out>,
        events@entry=0x3ffd84fb878, metric_list=0x0,
        metric_list@entry=0x3ffd84fb868, map=0x0)
      at util/metricgroup.c:943
   #7  0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>,
        metric_list=0x3ffd84fb868, events=0x3ffd84fb878,
        metric_no_group=<optimized out>, list=<optimized out>)
      at util/metricgroup.c:988
   #8  parse_groups (perf_evlist=perf_evlist@entry=0x1e70260,
          str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>,
          metric_no_merge=<optimized out>,
          fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>,
          metric_events=0x3ffd84fba58, map=0x1)
      at util/metricgroup.c:1040
   #9  0x0000000001103eb2 in metricgroup__parse_groups_test(
   evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>,
   str=str@entry=0x12f34b2 "IPC",
   metric_no_group=metric_no_group@entry=false,
   metric_no_merge=metric_no_merge@entry=false,
   metric_events=0x3ffd84fba58)
      at util/metricgroup.c:1082
   #10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0,
          ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC",
   vals=0x3ffd84fbad8, name=0x12f34b2 "IPC")
      at tests/parse-metric.c:159
   #11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8,
   name=0x12f34b2 "IPC")
      at tests/parse-metric.c:189
   #12 test_ipc () at tests/parse-metric.c:208
.....
..... omitted many more lines

This test case was added with
commit 218ca91df477 ("perf tests: Add parse metric test for frontend metric").

When I compile with make DEBUG=y it works fine and I do not get a core dump.

It turned out that the above listed function call chain worked on a struct
pmu_event array which requires a trailing element with zeroes which was
missing. The marco map_for_each_event() loops over that array tests for members
metric_expr/metric_name/metric_group being non-NULL. Adding this element fixes
the issue.

Output after:

  [root@t35lp46 perf]# ./perf test 67
  67: Parse and process metrics                             : Ok
  [root@t35lp46 perf]#

Committer notes:

As Ian remarks, this is not s390 specific:

<quote Ian>
  This also shows up with address sanitizer on all architectures
  (perhaps change the patch title) and perhaps add a "Fixes: <commit>"
  tag.

  =================================================================
  ==4718==ERROR: AddressSanitizer: global-buffer-overflow on address
  0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp
  0x7ffd24327c58
  READ of size 8 at 0x55c93b4d59e8 thread T0
      #0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
      #1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
      #2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
      #3 0x55c93a1528db in metricgroup__add_metric
  tools/perf/util/metricgroup.c:943:9
      #4 0x55c93a151996 in metricgroup__add_metric_list
  tools/perf/util/metricgroup.c:988:9
      #5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
      #6 0x55c93a1513e1 in metricgroup__parse_groups_test
  tools/perf/util/metricgroup.c:1082:9
      #7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
      #8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
      #9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
      #10 0x55c93a00f1e8 in test__parse_metric
  tools/perf/tests/parse-metric.c:345:2
      #11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
      #12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
      #13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
      #14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
      #15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
      #16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
      #17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
      #18 0x55c939e45f7e in main tools/perf/perf.c:539:3

  0x55c93b4d59e8 is located 0 bytes to the right of global variable
  'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25'
  (0x55c93b4d54a0) of size 1352
  SUMMARY: AddressSanitizer: global-buffer-overflow
  tools/perf/util/metricgroup.c:764:2 in find_metric
  Shadow bytes around the buggy address:
    0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  =>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
    0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
    0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
    0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
    0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:    fa
    Freed heap region:    fd
    Stack left redzone:    f1
    Stack mid redzone:    f2
    Stack right redzone:     f3
    Stack after return:    f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:    f6
    Poisoned by user:        f7
    Container overflow:    fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    Shadow gap:              cc
</quote>

I'm also adding the missing "Fixes" tag and setting just .name to NULL,
as doing it that way is more compact (the compiler will zero out
everything else) and the table iterators look for .name being NULL as
the sentinel marking the end of the table.

Fixes: 0a507af9c681ac2a ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Sumanth Korikkar <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf parse-events: Set exclude_guest=1 for user-space counting

Currently if we run 'perf record -e cycles:u', exclude_guest=0.

But it doesn't make sense in most cases that we request for
user-space counting but we also get the guest report.

Of course, we also need to consider 'perf kvm' usage case that
authorized perf users on the host may only want to count guest user
space events. For example,

  # perf kvm --guest record -e cycles:u

When we have 'exclude_guest=1' for 'perf kvm' usage, we may get nothing
from guest events.

To keep perf semantics consistent and clear, this patch sets
exclude_guest=1 for user-space counting but except for 'perf kvm' usage.

Before:

  perf record -e cycles:u ./div
  perf evlist -v
  cycles:u: ..., exclude_kernel: 1, exclude_hv: 1, ...

After:
  perf record -e cycles:u ./div
  perf evlist -v
  cycles:u: ..., exclude_kernel: 1, exclude_hv: 1,  exclude_guest: 1, ...

Before:
  perf kvm --guest record -e cycles:u -vvv

perf_event_attr:

  size                             120
  { sample_period, sample_freq }   4000
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD
  read_format                      ID
  disabled                         1
  inherit                          1
  exclude_kernel                   1
  exclude_hv                       1
  freq                             1
  sample_id_all                    1

After:

  perf kvm --guest record -e cycles:u -vvv

perf_event_attr:
  size                             120
  { sample_period, sample_freq }   4000
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD
  read_format                      ID
  disabled                         1
  inherit                          1
  exclude_kernel                   1
  exclude_hv                       1
  freq                             1
  sample_id_all                    1

For Before/After, exclude_guest are both 0 for perf kvm usage.

perf test 6

6: Parse event definition strings             : Ok

Signed-off-by: Jin Yao <[email protected]>
Tested-by: Like Xu <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf record: Correct the help info of option "--no-bpf-event"

The help info of option "--no-bpf-event" is wrongly described as "record
bpf events", correct it.

Committer testing:

  $ perf record -h bpf

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

          --clang-opt <clang options>
                            options passed to clang when compiling BPF scriptlets
          --clang-path <clang path>
                            clang binary to use for compiling BPF scriptlets
          --no-bpf-event    do not record bpf events

  $

Fixes: 71184c6ab7e6 ("perf record: Replace option --bpf-event with --no-bpf-event")
Signed-off-by: Wei Li <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Hanjun Guo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Li Bin <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>