Git Repo - linux.git/log

bpf: skip unnecessary capability check

The current check statement in BPF syscall will do a capability check
for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
code path will trigger unnecessary security hooks on capability checking
and cause false alarms on unprivileged process trying to get CAP_SYS_ADMIN
access. This can be resolved by simply switch the order of the statement
and CAP_SYS_ADMIN is not required anyway if unprivileged bpf syscall is
allowed.

Signed-off-by: Chenbo Feng <[email protected]>
Acked-by: Lorenzo Colitti <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

trace/bpf: remove helper bpf_perf_prog_read_value from tracepoint type programs

Commit 4bebdc7a85aa ("bpf: add helper bpf_perf_prog_read_value")
added helper bpf_perf_prog_read_value so that perf_event type program
can read event counter and enabled/running time.
This commit, however, introduced a bug which allows this helper
for tracepoint type programs. This is incorrect as bpf_perf_prog_read_value
needs to access perf_event through its bpf_perf_event_data_kern type context,
which is not available for tracepoint type program.

This patch fixed the issue by separating bpf_func_proto between tracepoint
and perf_event type programs and removed bpf_perf_prog_read_value
from tracepoint func prototype.

Fixes: 4bebdc7a85aa ("bpf: add helper bpf_perf_prog_read_value")
Reported-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other arches

Function bpf_fill_maxinsns11 is designed to not be able to be JITed on
x86_64. So, it fails when CONFIG_BPF_JIT_ALWAYS_ON=y, and
commit 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when
CONFIG_BPF_JIT_ALWAYS_ON=y") makes sure that failure is detected on that
case.

However, it does not fail on other architectures, which have a different
JIT compiler design. So, test_bpf has started to fail to load on those.

After this fix, test_bpf loads fine on both x86_64 and ppc64el.

Fixes: 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Reviewed-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

kvm/x86: fix icebp instruction handling

The undocumented 'icebp' instruction (aka 'int1') works pretty much like
'int3' in the absense of in-circuit probing equipment (except,
obviously, that it raises #DB instead of raising #BP), and is used by
some validation test-suites as such.

But Andy Lutomirski noticed that his test suite acted differently in kvm
than on bare hardware.

The reason is that kvm used an inexact test for the icebp instruction:
it just assumed that an all-zero VM exit qualification value meant that
the VM exit was due to icebp.

That is not unlike the guess that do_debug() does for the actual
exception handling case, but it's purely a heuristic, not an absolute
rule. do_debug() does it because it wants to ascribe _some_ reasons to
the #DB that happened, and an empty %dr6 value means that 'icebp' is the
most likely casue and we have no better information.

But kvm can just do it right, because unlike the do_debug() case, kvm
actually sees the real reason for the #DB in the VM-exit interruption
information field.

So instead of relying on an inexact heuristic, just use the actual VM
exit information that says "it was 'icebp'".

Right now the 'icebp' instruction isn't technically documented by Intel,
but that will hopefully change. The special "privileged software
exception" information _is_ actually mentioned in the Intel SDM, even
though the cause of it isn't enumerated.

Reported-by: Andy Lutomirski <[email protected]>
Tested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

RDMA/ucma: Ensure that CM_ID exists prior to access it

Prior to access UCMA commands, the context should be initialized
and connected to CM_ID with ucma_create_id(). In case user skips
this step, he can provide non-valid ctx without CM_ID and cause
to multiple NULL dereferences.

Also there are situations where the create_id can be raced with
other user access, ensure that the context is only shared to
other threads once it is fully initialized to avoid the races.

[  109.088108] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[  109.090315] IP: ucma_connect+0x138/0x1d0
[  109.092595] PGD 80000001dc02d067 P4D 80000001dc02d067 PUD 1da9ef067 PMD 0
[  109.095384] Oops: 0000 [#1] SMP KASAN PTI
[  109.097834] CPU: 0 PID: 663 Comm: uclose Tainted: G    B 4.16.0-rc1-00062-g2975d5de6428 #45
[  109.100816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[  109.105943] RIP: 0010:ucma_connect+0x138/0x1d0
[  109.108850] RSP: 0018:ffff8801c8567a80 EFLAGS: 00010246
[  109.111484] RAX: 0000000000000000 RBX: 1ffff100390acf50 RCX: ffffffff9d7812e2
[  109.114496] RDX: 1ffffffff3f507a5 RSI: 0000000000000297 RDI: 0000000000000297
[  109.117490] RBP: ffff8801daa15600 R08: 0000000000000000 R09: ffffed00390aceeb
[  109.120429] R10: 0000000000000001 R11: ffffed00390aceea R12: 0000000000000000
[  109.123318] R13: 0000000000000120 R14: ffff8801de6459c0 R15: 0000000000000118
[  109.126221] FS:  00007fabb68d6700(0000) GS:ffff8801e5c00000(0000) knlGS:0000000000000000
[  109.129468] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  109.132523] CR2: 0000000000000020 CR3: 00000001d45d8003 CR4: 00000000003606b0
[  109.135573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  109.138716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  109.142057] Call Trace:
[  109.144160]  ? ucma_listen+0x110/0x110
[  109.146386]  ? wake_up_q+0x59/0x90
[  109.148853]  ? futex_wake+0x10b/0x2a0
[  109.151297]  ? save_stack+0x89/0xb0
[  109.153489]  ? _copy_from_user+0x5e/0x90
[  109.155500]  ucma_write+0x174/0x1f0
[  109.157933]  ? ucma_resolve_route+0xf0/0xf0
[  109.160389]  ? __mod_node_page_state+0x1d/0x80
[  109.162706]  __vfs_write+0xc4/0x350
[  109.164911]  ? kernel_read+0xa0/0xa0
[  109.167121]  ? path_openat+0x1b10/0x1b10
[  109.169355]  ? fsnotify+0x899/0x8f0
[  109.171567]  ? fsnotify_unmount_inodes+0x170/0x170
[  109.174145]  ? __fget+0xa8/0xf0
[  109.177110]  vfs_write+0xf7/0x280
[  109.179532]  SyS_write+0xa1/0x120
[  109.181885]  ? SyS_read+0x120/0x120
[  109.184482]  ? compat_start_thread+0x60/0x60
[  109.187124]  ? SyS_read+0x120/0x120
[  109.189548]  do_syscall_64+0xeb/0x250
[  109.192178]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  109.194725] RIP: 0033:0x7fabb61ebe99
[  109.197040] RSP: 002b:00007fabb68d5e98 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[  109.200294] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fabb61ebe99
[  109.203399] RDX: 0000000000000120 RSI: 00000000200001c0 RDI: 0000000000000004
[  109.206548] RBP: 00007fabb68d5ec0 R08: 0000000000000000 R09: 0000000000000000
[  109.209902] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fabb68d5fc0
[  109.213327] R13: 0000000000000000 R14: 00007fff40ab2430 R15: 00007fabb68d69c0
[  109.216613] Code: 88 44 24 2c 0f b6 84 24 6e 01 00 00 88 44 24 2d 0f
b6 84 24 69 01 00 00 88 44 24 2e 8b 44 24 60 89 44 24 30 e8 da f6 06 ff
31 c0 <66> 41 83 7c 24 20 1b 75 04 8b 44 24 64 48 8d 74 24 20 4c 89 e7
[  109.223602] RIP: ucma_connect+0x138/0x1d0 RSP: ffff8801c8567a80
[  109.226256] CR2: 0000000000000020

Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

ipv6: old_dport should be a __be16 in __ip6_datagram_connect()

Fixes: 2f987a76a977 ("net: ipv6: keep sk status consistent after datagram connect failure")
Signed-off-by: Stefano Brivio <[email protected]>
Acked-by: Paolo Abeni <[email protected]>
Acked-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'linux-can-fixes-for-4.16-20180319' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2018-03-19

this is a pull reqeust of one patch for net/master.

The patch is by Andri Yngvason and fixes a potential use-after-free bug
in the cc770 driver introduced in the previous pull-request.
====================

Signed-off-by: David S. Miller <[email protected]>

net: gemini: fix memory leak

cppcheck report:
[drivers/net/ethernet/cortina/gemini.c:543]: (error) Memory leak: skb_tab

Signed-off-by: Igor Pylypiv <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: arc: Fix a potential memory leak if an optional regulator is deferred

If the optional regulator is deferred, we must release some resources.
They will be re-allocated when the probe function will be called again.

Fixes: 6eacf31139bf ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

devlink: Remove redundant free on error path

The current code performs unneeded free. Remove the redundant skb freeing
during the error path.

Fixes: 1555d204e743 ("devlink: Support for pipeline debug (dpipe)")
Signed-off-by: Arkadi Sharshevsky <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vmxnet3: remove unused flag "rxcsum" from struct vmxnet3_adapter

Signed-off-by: Igor Pylypiv <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Johan Hedberg says:

====================
Here are a few more important Bluetooth driver fixes for the 4.16
kernel.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <[email protected]>

x86/vsyscall/64: Use proper accessor to update P4D entry

Writing to it directly does not work for Xen PV guests.

Fixes: 49275fef986a ("x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy")
Signed-off-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

drm/sun4i: hdmi: Fix another error handling path in 'sun4i_hdmi_bind()'

If we can not get the HDMI DDC clock, we still need to free some
resources before returning.

Fixes: 939d749ad664 ("drm/sun4i: hdmi: Add support for controller hardware variants")
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/5e0084af4ad57e9eea3bca5bd8e2e95970cd6714.1521413031.git.christophe.jaillet@wanadoo.fr

drm/sun4i: hdmi: Fix an error handling path in 'sun4i_hdmi_bind()'

If we can not allocate the HDMI encoder regmap, we still need to free some
resources before returning.

Fixes: 4b1c924b1fc1 ("drm/sun4i: hdmi: create a regmap for later use")
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/14c42391e1b562c7495bda6ad6fa1d24ec8dc052.1521413031.git.christophe.jaillet@wanadoo.fr

x86/cpu: Remove the CONFIG_X86_PPRO_FENCE=y quirk

There were only a few Pentium Pro multiprocessors systems where this
errata applied. They are more than 20 years old now, and we've slowly
dropped places which put the workarounds in and discouraged anyone
from enabling the workaround.

Get rid of it for good.

Tested-by: Tom Lendacky <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jon Mason <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Muli Ben-Yehuda <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched/debug: Adjust newlines for better alignment

Scheduler debug stats include newlines that display out of alignment
when prefixed by timestamps.  For example, the dmesg utility:

  % echo t > /proc/sysrq-trigger
  % dmesg
  ...
  [   83.124251]
  runnable tasks:
   S           task   PID         tree-key  switches  prio     wait-time
  sum-exec        sum-sleep
  -----------------------------------------------------------------------------------------------------------

At the same time, some syslog utilities (like rsyslog by default) don't
like the additional newlines control characters, saving lines like this
to /var/log/messages:

  Mar 16 16:02:29 localhost kernel: #012runnable tasks:#012 S           task   PID         tree-key ...
                                    ^^^^               ^^^^
Clean these up by moving newline characters to their own SEQ_printf
invocation.  This leaves the /proc/sched_debug unchanged, but brings the
entire output into alignment when prefixed:

  % echo t > /proc/sysrq-trigger
  % dmesg
  ...
  [   62.410368] runnable tasks:
  [   62.410368]  S           task   PID         tree-key  switches  prio     wait-time             sum-exec        sum-sleep
  [   62.410369] -----------------------------------------------------------------------------------------------------------
  [   62.410369]  I  kworker/u12:0     5      1932.215593       332   120         0.000000         3.621252         0.000000 0 0 /

and no escaped control characters from rsyslog in /var/log/messages:

  Mar 16 16:15:06 localhost kernel: runnable tasks:
  Mar 16 16:15:06 localhost kernel: S           task   PID         tree-key  ...

Signed-off-by: Joe Lawrence <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched/debug: Fix per-task line continuation for console output

When the SEQ_printf() macro prints to the console, it runs a simple
printk() without KERN_CONT "continued" line printing.  The result of
this is oddly wrapped task info, for example:

  % echo t > /proc/sysrq-trigger
  % dmesg
  ...
  runnable tasks:
  ...
  [   29.608611]  I
  [   29.608613]       rcu_sched     8      3252.013846      4087   120
  [   29.608614]         0.000000        29.090111         0.000000
  [   29.608615]  0 0
  [   29.608616]  /

Modify SEQ_printf to use pr_cont() for expected one-line results:

  % echo t > /proc/sysrq-trigger
  % dmesg
  ...
  runnable tasks:
  ...
  [  106.716329]  S        cpuhp/5    37      2006.315026        14   120         0.000000         0.496893         0.000000 0 0 /

Signed-off-by: Joe Lawrence <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/cgroup: Fix child event counting bug

When a perf_event is attached to parent cgroup, it should count events
for all children cgroups:

   parent_group   <---- perf_event
     \
      - child_group  <---- process(es)

However, in our tests, we found this perf_event cannot report reliable
results. Here is an example case:

  # create cgroups
  mkdir -p /sys/fs/cgroup/p/c
  # start perf for parent group
  perf stat -e instructions -G "p"

  # on another console, run test process in child cgroup:
  stressapptest -s 2 -M 1000 & echo $! > /sys/fs/cgroup/p/c/cgroup.procs

  # after the test process is done, stop perf in the first console shows

       <not counted>      instructions              p

The instruction should not be "not counted" as the process runs in the
child cgroup.

We found this is because perf_event->cgrp and cpuctx->cgrp are not
identical, thus perf_event->cgrp are not updated properly.

This patch fixes this by updating perf_cgroup properly for ancestor
cgroup(s).

Reported-by: Ephraim Park <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel/uncore: Fix multi-domain PCI CHA enumeration bug on Skylake servers

The number of CHAs is miscalculated on multi-domain PCI Skylake server systems,
resulting in an uncore driver initialization error.

Gary Kroening explains:

"For systems with a single PCI segment, it is sufficient to look for the
  bus number to change in order to determine that all of the CHa's have
  been counted for a single socket.

  However, for multi PCI segment systems, each socket is given a new
  segment and the bus number does NOT change.  So looking only for the
  bus number to change ends up counting all of the CHa's on all sockets
  in the system.  This leads to writing CPU MSRs beyond a valid range and
  causes an error in ivbep_uncore_msr_init_box()."

To fix this bug, query the number of CHAs from the CAPID6 register:
it should read bits 27:0 in the CAPID6 register located at
Device 30, Function 3, Offset 0x9C. These 28 bits form a bit vector
of available LLC slices and the CHAs that manage those slices.

Reported-by: Kroening, Gary <[email protected]>
Tested-by: Kroening, Gary <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel: Rename confusing 'freerunning PEBS' API and implementation to 'large PEBS'

The 'freerunning PEBS' and 'large PEBS' are the same thing. Both of these
names appear in the code and in the API, which causes confusion.

Rename 'freerunning PEBS' to 'large PEBS' to unify the code,
which eliminates the confusion.

No functional change.

Reported-by: Vince Weaver <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

jump_label: Disable jump labels in __exit code

With the following commit:

  333522447063 ("jump_label: Explicitly disable jump labels in __init code")

... we explicitly disabled jump labels in __init code, so they could be
detected and not warned about in the following commit:

  dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt")

In-kernel __exit code has the same issue.  It's never used, so it's
freed along with the rest of initmem.  But jump label entries in __exit
code aren't explicitly disabled, so we get the following warning when
enabling pr_debug() in __exit code:

  can't patch jump_label at dmi_sysfs_exit+0x0/0x2d
  WARNING: CPU: 0 PID: 22572 at kernel/jump_label.c:376 __jump_label_update+0x9d/0xb0

Fix the warning by disabling all jump labels in initmem (which includes
both __init and __exit code).

Reported-and-tested-by: Li Wang <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt")
Link: http://lkml.kernel.org/r/7121e6e595374f06616c505b6e690e275c0054d1.1521483452.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel/uncore: Add missing filter constraint for SKX CHA event

Adding a filter constraint for Intel Skylake CHA event
UNC_CHA_UPI_CREDITS_ACQUIRED (0x38).

The event supports core-id/thread-id and link filtering.

Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel: Don't accidentally clear high bits in bdw_limit_period()

We intended to clear the lowest 6 bits but because of a type bug we
clear the high 32 bits as well. Andi says that periods are rarely more
than U32_MAX so this bug probably doesn't have a huge runtime impact.

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Fixes: 294fe0f52a44 ("perf/x86/intel: Add INST_RETIRED.ALL workarounds")
Link: http://lkml.kernel.org/r/20180317115216.GB4035@mwanda
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel: Disable userspace RDPMC usage for large PEBS

Userspace RDPMC cannot possibly work for large PEBS, which was introduced in:

b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS interrupt threshold)")

When the PEBS interrupt threshold is larger than one, there is no way
to get exact auto-reload times and value for userspace RDPMC. Disable
the userspace RDPMC usage when large PEBS is enabled.

The only exception is when the PEBS interrupt threshold is 1, in which
case user-space RDPMC works well even with auto-reload events.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Fixes: b8241d20699e ("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS interrupt threshold)")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
(cherry picked from commit 1af22eba248efe2de25658041a80a3d40fb3e92e)

locking/mutex: Improve documentation

On Wed, Mar 14, 2018 at 01:56:31PM -0700, Andrew Morton wrote:

> My memory is weak and our documentation is awful. What does
> mutex_lock_killable() actually do and how does it differ from
> mutex_lock_interruptible()?

Add kernel-doc for mutex_lock_killable() and mutex_lock_io(). Reword the
kernel-doc for mutex_lock_interruptible().

Signed-off-by: Matthew Wilcox <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

x86/boot/64: Verify alignment of the LOAD segment

Since the x86-64 kernel must be aligned to 2MB, refuse to boot the
kernel if the alignment of the LOAD segment isn't a multiple of 2MB.

Signed-off-by: H.J. Lu <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/CAMe9rOrR7xSJgUfiCoZLuqWUwymRxXPoGBW38%2BpN%3D9g%[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

x86/build/64: Force the linker to use 2MB page size

Binutils 2.31 will enable -z separate-code by default for x86 to avoid
mixing code pages with data to improve cache performance as well as
security.  To reduce x86-64 executable and shared object sizes, the
maximum page size is reduced from 2MB to 4KB.  But x86-64 kernel must
be aligned to 2MB.  Pass -z max-page-size=0x200000 to linker to force
2MB page size regardless of the default page size used by linker.

Tested with Linux kernel 4.15.6 on x86-64.

Signed-off-by: H.J. Lu <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/CAMe9rOp4_%[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

Merge branch 'phy-relax-error-checking'

Grygorii Strashko says:

====================
net: phy: relax error checking when creating sysfs link netdev->phydev

Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY.
As result, second CPSW external port will became unusable.
This regression was introduced by commits:
5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"

Patch 1: exports sysfs_create_link_nowarn() function as preparation for Patch 2.
Patch 2: relaxes error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppresses warning by using
sysfs_create_link_nowarn() and adds error message instead, so links creation
failure is not fatal any more and system can continue working,
which fixes TI CPSW issue and makes boot logs accessible
in case of NFS boot, for example.

This can be stable material 4.13+.

Changes in v2:
- commit messages updated.

v1:
https://patchwork.ozlabs.org/cover/886058/
====================

Signed-off-by: David S. Miller <[email protected]>

net: phy: relax error checking when creating sysfs link netdev->phydev

Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY. As result, second CPSW external
port will became unusable.

Fix it by relaxing error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppressing warning by using
sysfs_create_link_nowarn() and adding error message instead.
After this change links (phy->netdev and netdev->phy) creation failure is not
fatal any more and system can continue working, which fixes TI CPSW issue.

Cc: Florian Fainelli <[email protected]>
Cc: Andrew Lunn <[email protected]>
Fixes: a3995460491d ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sysfs: symlink: export sysfs_create_link_nowarn()

The sysfs_create_link_nowarn() is going to be used in phylib framework in
subsequent patch which can be built as module. Hence, export
sysfs_create_link_nowarn() to avoid build errors.

Cc: Florian Fainelli <[email protected]>
Cc: Andrew Lunn <[email protected]>
Fixes: a3995460491d ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.

If bios sets up an MST output and hardware state readout code sees this is
an SST configuration, when disabling the encoder we end up calling
->post_disable_dp() hook instead of the MST version. Consequently, we write
to the DP_SET_POWER dpcd to set it D3 state. Further along when we try
enable the encoder in MST mode, POWER_UP_PHY transaction fails to power up
the MST hub. This results in continuous link training failures which keep
the system busy delaying boot. We could identify bios MST boot discrepancy
and handle it accordingly but a simple way to solve this is to write to the
DP_SET_POWER dpcd for MST too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105470
Cc: Ville Syrjälä <[email protected]>
Cc: Jani Nikula <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Reported-by: Laura Abbott <[email protected]>
Cc: [email protected]
Fixes: 5ea2355a100a ("drm/i915/mst: Use MST sideband message transactions for dpms control")
Tested-by: Laura Abbott <[email protected]>
Signed-off-by: Dhinakaran Pandiyan <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit ad260ab32a4d94fa974f58262f8000472d34fd5b)
Signed-off-by: Rodrigo Vivi <[email protected]>

Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"Two commits to fix the following subtle cgroup2 behavior bugs:

   - cpu.max was rejecting config when it shouldn't

   - thread mode enable was allowed when it shouldn't"

* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix rule checking for threaded mode switching
  sched, cgroup: Don't reject lower cpu.max on ancestors

ACPI / watchdog: Fix off-by-one error at resource assignment

The resource allocation in WDAT watchdog has off-one-by error, it sets
one byte more than the actual end address. This may eventually lead
to unexpected resource conflicts.

Fixes: 058dfc767008 (ACPI / watchdog: Add support for WDAT hardware watchdog)
Cc: 4.9+ <[email protected]> # 4.9+
Signed-off-by: Takashi Iwai <[email protected]>
Acked-by: Mika Westerberg <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:
"Two low-impact workqueue commits.

  One fixes workqueue creation error path and the other removes the
  unused cancel_work()"

* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: remove unused cancel_work()
  workqueue: use put_device() instead of kfree()

Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

Pull percpu fixes from Tejun Heo:
"Late percpu pull request for v4.16-rc6.

   - percpu allocator pool replenishing no longer triggers OOM or
     warning messages.

     Also, the alloc interface now understands __GFP_NORETRY and
     __GFP_NOWARN. This is to allow avoiding OOMs from userland
     triggered actions like bpf map creation.

     Also added cond_resched() in alloc loop.

   - perpcu allocation now can be interrupted by kill sigs to avoid
     deadlocking OOM killer.

   - Added Dennis Zhou as a co-maintainer.

     He has rewritten the area map allocator, understands most of the
     code base and has been responsive for all bug reports"

* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
  mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()
  percpu: include linux/sched.h for cond_resched()
  percpu: add a schedule point in pcpu_balance_workfn()
  percpu: allow select gfp to be passed to underlying allocators
  percpu: add __GFP_NORETRY semantics to the percpu balancing path
  percpu: match chunk allocator declarations with definitions
  percpu: add Dennis Zhou as a percpu co-maintainer

Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata

Pull libata fixes from Tejun Heo:
"I sat on them too long and it's quite a few this late, but nothing has
  a wide blast area. The changes are...

   - Fix corner cases in SG command handling.

   - Recent introduction of default powersaving mode config option
     exposed several devices with broken powersaving behaviors. A number
     of patches to update the blacklist accordingly.

   - Fix a kernel panic on SAS hotplug.

   - Other misc and device specific updates"

* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version
  libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions
  libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs
  libata: Enable queued TRIM for Samsung SSD 860
  PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L
  ahci: Add PCI-id for the Highpoint Rocketraid 644L card
  ata: do not schedule hot plug if it is a sas host
  libata: disable LPM for Crucial BX100 SSD 500GB drive
  libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs
  libata: update documentation for sysfs interfaces
  ata: sata_rcar: Remove unused variable in sata_rcar_init_controller()
  libata: transport: cleanup documentation of sysfs interface
  sata_rcar: Reset SATA PHY when Salvator-X board resumes
  libata: don't try to pass through NCQ commands to non-NCQ devices
  libata: remove WARN() for DMA or PIO command without data
  libata: fix length validation of ATAPI-relayed SCSI commands
  ata: libahci: fix comment indentation
  ahci: Add check for device presence (PCIe hot unplug) in ahci_stop_engine()
  libata: Fix compile warning with ATA_DEBUG enabled

nfsd: remove blocked locks on client teardown

We had some reports of panics in nfsd4_lm_notify, and that showed a
nfs4_lockowner that had outlived its so_client.

Ensure that we walk any leftover lockowners after tearing down all of
the stateids, and remove any blocked locks that they hold.

With this change, we also don't need to walk the nbl_lru on nfsd_net
shutdown, as that will happen naturally when we tear down the clients.

Fixes: 76d348fadff5 (nfsd: have nfsd4_lock use blocking locks for v4.1+ locks)
Reported-by: Frank Sorenson <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Cc: [email protected] # 4.9
Signed-off-by: J. Bruce Fields <[email protected]>

RDMA/verbs: Remove restrack entry from XRCD structure

XRCD object is not implemented in the restrack, so lets remove it.

Fixes: 02d8883f520e ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

RDMA/ucma: Fix use-after-free access in ucma_close

The error in ucma_create_id() left ctx in the list of contexts belong
to ucma file descriptor. The attempt to close this file descriptor causes
to use-after-free accesses while iterating over such list.

Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Sean Hefty <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>

percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods

percpu_ref internally uses sched-RCU to implement the percpu -> atomic
mode switching and the documentation suggested that this could be
depended upon.  This doesn't seem like a good idea.

* percpu_ref uses sched-RCU which has different grace periods regular
  RCU.  Users may combine percpu_ref with regular RCU usage and
  incorrectly believe that regular RCU grace periods are performed by
  percpu_ref.  This can lead to, for example, use-after-free due to
  premature freeing.

* percpu_ref has a grace period when switching from percpu to atomic
  mode.  It doesn't have one between the last put and release.  This
  distinction is subtle and can lead to surprising bugs.

* percpu_ref allows starting in and switching to atomic mode manually
  for debugging and other purposes.  This means that there may not be
  any grace periods from kill to release.

This patch makes it clear that the grace periods are percpu_ref's
internal implementation detail and can't be depended upon by the
users.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>

mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()

In case of memory deficit and low percpu memory pages,
pcpu_balance_workfn() takes pcpu_alloc_mutex for a long
time (as it makes memory allocations itself and waits
for memory reclaim). If tasks doing pcpu_alloc() are
choosen by OOM killer, they can't exit, because they
are waiting for the mutex.

The patch makes pcpu_alloc() to care about killing signal
and use mutex_lock_killable(), when it's allowed by GFP
flags. This guarantees, a task does not miss SIGKILL
from OOM killer.

Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>

percpu: include linux/sched.h for cond_resched()

microblaze build broke due to missing declaration of the
cond_resched() invocation added recently. Let's include linux/sched.h
explicitly.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: kbuild test robot <[email protected]>

clk: bcm2835: Protect sections updating shared registers

CM_PLLx and A2W_XOSC_CTRL registers are accessed by different clock
handlers and must be accessed with ->regs_lock held.
Update the sections where this protection is missing.

Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>

clk: bcm2835: Fix ana->maskX definitions

ana->maskX values are already '~'-ed in bcm2835_pll_set_rate(). Remove
the '~' in the definition to fix ANA setup.

Note that this commit fixes a long standing bug preventing one from
using an HDMI display if it's plugged after the FW has booted Linux.
This is because PLLH is used by the HDMI encoder to generate the pixel
clock.

Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>

drm/amd/display: fix dereferencing possible ERR_PTR()

This patch fixes static checker warning caused by
"36cc549d5986: "drm/amd/display: disable CRTCs with
NULL FB on their primary plane (V2)"

Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Shirish S <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amd/display: Refine disable VGA

bad case won't follow normal sense, it will not enable vga1 as usual, but vga2,3,4 is on.

Signed-off-by: Clark Zheng <[email protected]>
Reviewed-by: Tony Cheng <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

ALSA: usb-audio: Fix parsing descriptor of UAC2 processing unit

Currently, the offsets in the UAC2 processing unit descriptor are
calculated incorrectly. It causes an issue when connecting the device which
provides such a feature:

~~~~
[84126.724420] usb 1-1.3.1: invalid Processing Unit descriptor (id 18)
~~~~

After this patch is applied, the UAC2 processing unit inits w/o this error.

Fixes: 23caaf19b11e ("ALSA: usb-mixer: Add support for Audio Class v2.0")
Signed-off-by: Kirill Marinushkin <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version

When commit 9c7be59fc519af ("libata: Apply NOLPM quirk to Crucial MX100
512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk
from the existing "Crucial_CT*MX100*" entry, but that entry sets model_rev
to "MU01", where as the entry adding the NOLPM quirk sets it to NULL.

This means that after this commit we no apply the NO_NCQ_TRIM quirk to
all "Crucial_CT512MX100*" SSDs even if they have the fixed "MU02"
firmware. This commit splits the "Crucial_CT512MX100*" quirk into 2
quirks, one for the "MU01" firmware and one for all other firmware
versions, so that we once again only apply the NO_NCQ_TRIM quirk to the
"MU01" firmware version.

Fixes: 9c7be59fc519af ("libata: Apply NOLPM quirk to ... MX100 512GB SSDs")
Cc: [email protected]
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>

libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions

Commit b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB
drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs
but limited this to the MU02 firmware version, according to:
http://www.crucial.com/usa/en/support-ssd-firmware

MU02 is the last version, so there are no newer possibly fixed versions
and if the MU02 version has broken LPM then the MU01 almost certainly
also has broken LPM, so this commit changes the quirk to apply to all
firmware versions.

Fixes: b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB...")
Cc: [email protected]
Cc: Kai-Heng Feng <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>

libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs

There have been reports of the Crucial M500 480GB model not working
with LPM set to min_power / med_power_with_dipm level.

It has not been tested with medium_power, but that typically has no
measurable power-savings.

Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03
and there is a MU05 update available, but that update does not mention any
LPM fixes in its changelog, so the quirk matches all firmware versions.

In my experience the LPM problems with (older) Crucial SSDs seem to be
limited to higher capacity versions of the SSDs (different firmware?),
so this commit adds a NOLPM quirk for the 480 and 960GB versions of the
M500, to avoid LPM causing issues with these SSDs.

Cc: [email protected]
Reported-and-tested-by: Martin Steigerwald <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>

staging: ncpfs: memory corruption in ncp_read_kernel()

If the server is malicious then *bytes_read could be larger than the
size of the "target" buffer. It would lead to memory corruption when we
do the memcpy().

Reported-by: Dr Silvio Cesare of InfoSect <Silvio Cesare <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge tag 'iio-fixes-for-4.16b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

Second set of IIO fixes for the 4.16 cycle.

A slightly fiddly revert then fix pair in here as the bug lead to
an unused local variable that was then removed without us noticing
the bug.  The revert should only be needed on 4.16 - the fix
goes back futher.

* ccs811
  - Fix the transition from 'boot' to 'application' mode.  Fixes the case
    where the power is not cut between boot cycles.
* meson-saradc
  - Fix missing mutex_unlock in an error path.
* sd-modulator
  - Fix bindings doc to have the right value of io-channel-cells to reflect
    that this device type only ever outputs one channel.
* st-accel
  - Revert drop of redundant pointer patch.
  - Use the now available pointer to avoid overwriting the platform data
    pointer and causing trouble on reprobing the driver.
* st-pressure
  - Use local copy of the platform data pointer to avoid overwriting the
    one associated with the device, which would cause issues on reprobing
    the driver.
* stm32-dfsdm
  - Use the right regmap_cfg for the type of device.
  - Correct the ID passed to stop channel to be the channel one.
  - Correct which clock is used to allow for the 'audio' clock.
  - Fix allocation of channels when more than one is enabled.

can: cc770: Fix use after free in cc770_tx_interrupt()

This fixes use after free introduced by the last cc770 patch.

Signed-off-by: Andri Yngvason <[email protected]>
Fixes: 746201235b3f ("can: cc770: Fix queue stall & dropped RTR reply")
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>

mtdchar: fix usage of mtd_ooblayout_ecc()

Section was not properly computed. The value of OOB region definition is
always ECC section 0 information in the OOB area, but we want to get all
the ECC bytes information, so we should call
mtd_ooblayout_ecc(mtd, section++, &oobregion) until it returns -ERANGE.

Fixes: c2b78452a9db ("mtd: use mtd_ooblayout_xxx() helpers where appropriate")
Cc: <[email protected]>
Signed-off-by: OuYang ZhiZhong <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>

Revert "ACPI / battery: Add quirk for Asus GL502VSK and UX305LA"

Revert commit c68f0676ef7d ("ACPI / battery: Add quirk for Asus
GL502VSK and UX305LA") and commit 4446823e2573 ("ACPI / battery: Add
quirk for Asus UX360UA and UX410UAK").

On many many Asus products, the battery is sometimes reported as
charging or discharging even when it is full and you are on AC power.
This change quirked the kernel to avoid advertising the discharging
state when this happens on 4 laptop models, under the belief that
this was incorrect information.  I presume it originates from user
reports who are confused that their battery status icon says that it
is discharging.

However, the reported information is indeed correct, and the quirk
approach taken is inadequate and more thought is needed first.
Specifically:

1. It only quirks discharging state, not charging

2. There are so many different Asus products and DMI naming variants
    within those product families that behave this way; Linux could
    grow to quirk hundreds of products and still not even be close at
    "winning" this battle.

3. Asus previously clarified that this behaviour is intentional. The
    platform will periodically do a partial discharge/charge cycle
    when the battery is full, because this is one way to extend the
    lifetime of the battery (leaving a battery at 100% charge and
    unused will decrease its usable capacity over time).

    My understanding is that any decent consumer product will have
    this behaviour, but it appears that Asus is different in that
    they expose this info through ACPI.

    However, the behaviour seems correct. The ACPI spec does not
    suggest in that the platform should hide the truth.  It lets you
    report that the battery is full of charge, and discharging, and
    with external power connected; and Asus does this.

4. In terms of not confusing the user, this seems like something that
    could/should be handled by userspace, which can also detect these
    same (accurate) conditions in the general case.

Revert this quirk before it gets included in a release, while we look
for better approaches.

Signed-off-by: Daniel Drake <[email protected]>
Acked-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

drm/tegra: Shutdown on driver unbind

Since commit 846c7dfc1193 ("drm/atomic: Try to preserve the crtc enabled
state in drm_atomic_remove_fb, v2."), removing the last framebuffer will
no longer disable the corresponding pipeline, which causes the KMS core
to complain about leaked connectors on driver unbind.

Fix this by calling drm_atomic_helper_shutdown() on driver unbind, which
will cause all display pipelines to be shut down and therefore drop the
extra references on the connectors.

Signed-off-by: Thierry Reding <[email protected]>

drm/tegra: dsi: Don't disable regulator on ->exit()

The regulator is controlled as part of runtime PM, so it should not be
additionally disabled from the ->exit() callback.

Signed-off-by: Thierry Reding <[email protected]>

drm/tegra: dc: Detach IOMMU group from domain only once

Detaching from an IOMMU group multiple times can lead to a crash. This
could potentially be fixed in the IOMMU driver, but it's easy to avoid
the subsequent detach operations in this driver, so do that as well.

Signed-off-by: Thierry Reding <[email protected]>

selftests/x86/ptrace_syscall: Fix for yet more glibc interference

glibc keeps getting cleverer, and my version now turns raise() into
more than one syscall. Since the test relies on ptrace seeing an
exact set of syscalls, this breaks the test. Replace raise(SIGSTOP)
with syscall(SYS_tgkill, ...) to force glibc to get out of our way.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/bc80338b453afa187bc5f895bd8e2c8d6e264da2.1521300271.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>

Linux 4.16-rc6

dt-bindings: exynos: Document #sound-dai-cells property of the HDMI node

The #sound-dai-cells DT property is required to describe link between
the HDMI IP block and the SoC's audio subsystem.

Signed-off-by: Sylwester Nawrocki <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Inki Dae <[email protected]>

net: fec: Fix unbalanced PM runtime calls

When unbinding/removing the driver, we will run into the following warnings:

[  259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator
[  259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable!
[  259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00
[  259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1
[  259.696239] libphy: fec_enet_mii_bus: probed

Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().

Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/pti updates from Thomas Gleixner:
"Another set of melted spectrum updates:

   - Iron out the last late microcode loading issues by actually
     checking whether new microcode is present and preventing the CPU
     synchronization to run into a timeout induced hang.

   - Remove Skylake C2 from the microcode blacklist according to the
     latest Intel documentation

   - Fix the VM86 POPF emulation which traps if VIP is set, but VIF is
     not. Enhance the selftests to catch that kind of issue

   - Annotate indirect calls/jumps for objtool on 32bit. This is not a
     functional issue, but for consistency sake its the right thing to
     do.

   - Fix a jump label build warning observed on SPARC64 which uses 32bit
     storage for the code location which is casted to 64 bit pointer w/o
     extending it to 64bit first.

   - Add two new cpufeature bits. Not really an urgent issue, but
     provides them for both x86 and x86/kvm work. No impact on the
     current kernel"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Fix CPU synchronization routine
  x86/microcode: Attempt late loading only when new microcode is present
  x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
  jump_label: Fix sparc64 warning
  x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels
  x86/vm86/32: Fix POPF emulation
  selftests/x86/entry_from_vm86: Add test cases for POPF
  selftests/x86/entry_from_vm86: Exit with 1 if we fail
  x86/cpufeatures: Add Intel PCONFIG cpufeature
  x86/cpufeatures: Add Intel Total Memory Encryption cpufeature

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Thomas Gleixner:
"A single fix for vmalloc_fault() which uses p*d_huge() unconditionally
  whether CONFIG_HUGETLBFS is set or not. In case of CONFIG_HUGETLBFS=n
  this results in a crash as p*d_huge() returns 0 in that case"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix vmalloc_fault to use pXd_large

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"Three fixes for irq chip drivers:

   - Make sure the allocations in the GIC-V3 ITS driver are large enough
     to accomodate the interrupt space

   - Fix a misplaced __iomem annotation which causes a splat of 26
     sparse warnings

   - Remove an unused function in the IMX GPCV2 driver which causes
     build warnings"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/irq-imx-gpcv2: Remove unused function
  irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis
  irqchip/gic-v3-its: Fix misplaced __iomem annotations

Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI fix from Thomas Gleixner:
"A single fix to prevent partially initialized pointers in mixed mode
(64bit kernel on 32bit UEFI)"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub/tpm: Initialize pointer variables to zero for mixed mode

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"PPC:
   - fix bug leading to lost IPIs and smp_call_function_many() lockups
     on POWER9

  ARM:
   - locking fix
   - reset fix
   - GICv2 multi-source SGI injection fix
   - GICv2-on-v3 MMIO synchronization fix
   - make the console less verbose.

  x86:
   - fix device passthrough on AMD SME"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix device passthrough when SME is active
  kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
  KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  KVM: arm/arm64: Reduce verbosity of KVM init log
  KVM: arm/arm64: Reset mapped IRQs on VM reset
  KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN
  KVM: arm/arm64: vgic: Add missing irq_lock to vgic_mmio_read_pending
  KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry

batman-adv: Fix skbuff rcsum on packet reroute

batadv_check_unicast_ttvn may redirect a packet to itself or another
originator. This involves rewriting the ttvn and the destination address in
the batadv unicast header. These field were not yet pulled (with skb rcsum
update) and thus any change to them also requires a change in the receive
checksum.

Reported-by: Matthias Schiffer <[email protected]>
Fixes: a73105b8d4c7 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: Add missing include for EPOLL* constants

Fixes: a9a08845e9ac ("vfs: do bulk POLL* -> EPOLL* replacement")
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

vmxnet3: use correct flag to indicate LRO feature

'Commit 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2
(fwd)")' introduced a flag "lro" in structure vmxnet3_adapter which is
used to indicate whether LRO is enabled or not. However, the patch
did not set the flag and hence it was never exercised.

So, when LRO is enabled, it resulted in poor TCP performance due to
delayed acks. This issue is seen with packets which are larger than
the mss getting a delayed ack rather than an immediate ack, thus
resulting in high latency.

This patch removes the lro flag and directly uses device features
against NETIF_F_LRO to check if lro is enabled.

Fixes: 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")
Reported-by: Rachel Lunnon <[email protected]>
Signed-off-by: Ronak Doshi <[email protected]>
Acked-by: Shrikrishna Khare <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vmxnet3: avoid xmit reset due to a race in vmxnet3

The field txNumDeferred is used by the driver to keep track of the number
of packets it has pushed to the emulation. The driver increments it on
pushing the packet to the emulation and the emulation resets it to 0 at
the end of the transmit.

There is a possibility of a race either when (a) ESX is under heavy load or
(b) workload inside VM is of low packet rate.

This race results in xmit hangs when network coalescing is disabled. This
change creates a local copy of txNumDeferred and uses it to perform ring
arithmetic.

Reported-by: Noriho Tanaka <[email protected]>
Signed-off-by: Ronak Doshi <[email protected]>
Acked-by: Shrikrishna Khare <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'tcf_foo_init-NULL-deref'

Davide Caratti says:

====================
net/sched: fix NULL dereference in the error path of .init()

with several TC actions it's possible to see NULL pointer dereference,
when the .init() function calls tcf_idr_alloc(), fails at some point and
then calls tcf_idr_release(): this series fixes all them introducing
non-NULL tests in the .cleanup() function.
====================

Signed-off-by: David S. Miller <[email protected]>

net/sched: fix NULL dereference on the error path of tcf_skbmod_init()

when the following command

# tc action replace action skbmod swap mac index 100

is run for the first time, and tcf_skbmod_init() fails to allocate struct
tcf_skbmod_params, tcf_skbmod_cleanup() calls kfree_rcu(NULL), thus
causing the following error:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: __call_rcu+0x23/0x2b0
PGD 8000000034057067 P4D 8000000034057067 PUD 74937067 PMD 0
Oops: 0002 [#1] SMP PTI
Modules linked in: act_skbmod(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec crct10dif_pclmul mbcache jbd2 crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep pcbc snd_seq snd_seq_device snd_pcm aesni_intel snd_timer crypto_simd glue_helper snd cryptd virtio_balloon joydev soundcore pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_net virtio_blk ata_piix libata crc32c_intel virtio_pci serio_raw virtio_ring virtio i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_skbmod]
CPU: 3 PID: 3144 Comm: tc Tainted: G            E    4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__call_rcu+0x23/0x2b0
RSP: 0018:ffffbd2e403e7798 EFLAGS: 00010246
RAX: ffffffffc0872080 RBX: ffff981d34bff780 RCX: 00000000ffffffff
RDX: ffffffff922a5f00 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000021f
R10: 000000003d003000 R11: 0000000000aaaaaa R12: 0000000000000000
R13: ffffffff922a5f00 R14: 0000000000000001 R15: ffff981d3b698c2c
FS:  00007f3678292740(0000) GS:ffff981d3fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000007c57a006 CR4: 00000000001606e0
Call Trace:
  __tcf_idr_release+0x79/0xf0
  tcf_skbmod_init+0x1d1/0x210 [act_skbmod]
  tcf_action_init_1+0x2cc/0x430
  tcf_action_init+0xd3/0x1b0
  tc_ctl_action+0x18b/0x240
  rtnetlink_rcv_msg+0x29c/0x310
  ? _cond_resched+0x15/0x30
  ? __kmalloc_node_track_caller+0x1b9/0x270
  ? rtnl_calcit.isra.28+0x100/0x100
  netlink_rcv_skb+0xd2/0x110
  netlink_unicast+0x17c/0x230
  netlink_sendmsg+0x2cd/0x3c0
  sock_sendmsg+0x30/0x40
  ___sys_sendmsg+0x27a/0x290
  ? filemap_map_pages+0x34a/0x3a0
  ? __handle_mm_fault+0xbfd/0xe20
  __sys_sendmsg+0x51/0x90
  do_syscall_64+0x6e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f36776a3ba0
RSP: 002b:00007fff4703b618 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff4703b740 RCX: 00007f36776a3ba0
RDX: 0000000000000000 RSI: 00007fff4703b690 RDI: 0000000000000003
RBP: 000000005aaaba36 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff4703b0a0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff4703b754 R14: 0000000000000001 R15: 0000000000669f60
Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89
RIP: __call_rcu+0x23/0x2b0 RSP: ffffbd2e403e7798
CR2: 0000000000000008

Fix it in tcf_skbmod_cleanup(), ensuring that kfree_rcu(p, ...) is called
only when p is not NULL.

Fixes: 86da71b57383 ("net_sched: Introduce skbmod action")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: fix NULL dereference in the error path of tcf_sample_init()

when the following command

# tc action add action sample rate 100 group 100 index 100

is run for the first time, and psample_group_get(100) fails to create a
new group, tcf_sample_cleanup() calls psample_group_put(NULL), thus
causing the following error:

BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
IP: psample_group_put+0x15/0x71 [psample]
PGD 8000000075775067 P4D 8000000075775067 PUD 7453c067 PMD 0
Oops: 0002 [#1] SMP PTI
Modules linked in: act_sample(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core mbcache jbd2 crct10dif_pclmul snd_hwdep crc32_pclmul snd_seq ghash_clmulni_intel pcbc snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev pcspkr i2c_piix4 soundcore virtio_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_net ata_piix virtio_console virtio_blk libata serio_raw crc32c_intel virtio_pci i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_tunnel_key]
CPU: 2 PID: 5740 Comm: tc Tainted: G            E    4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:psample_group_put+0x15/0x71 [psample]
RSP: 0018:ffffb8a80032f7d0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000024
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc06d93c0
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
R10: 00000000bd003000 R11: ffff979fba04aa59 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff979fbba3f22c
FS:  00007f7638112740(0000) GS:ffff979fbfd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 00000000734ea001 CR4: 00000000001606e0
Call Trace:
  __tcf_idr_release+0x79/0xf0
  tcf_sample_init+0x125/0x1d0 [act_sample]
  tcf_action_init_1+0x2cc/0x430
  tcf_action_init+0xd3/0x1b0
  tc_ctl_action+0x18b/0x240
  rtnetlink_rcv_msg+0x29c/0x310
  ? _cond_resched+0x15/0x30
  ? __kmalloc_node_track_caller+0x1b9/0x270
  ? rtnl_calcit.isra.28+0x100/0x100
  netlink_rcv_skb+0xd2/0x110
  netlink_unicast+0x17c/0x230
  netlink_sendmsg+0x2cd/0x3c0
  sock_sendmsg+0x30/0x40
  ___sys_sendmsg+0x27a/0x290
  ? filemap_map_pages+0x34a/0x3a0
  ? __handle_mm_fault+0xbfd/0xe20
  __sys_sendmsg+0x51/0x90
  do_syscall_64+0x6e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f7637523ba0
RSP: 002b:00007fff0473ef58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff0473f080 RCX: 00007f7637523ba0
RDX: 0000000000000000 RSI: 00007fff0473efd0 RDI: 0000000000000003
RBP: 000000005aaaac80 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff0473e9e0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff0473f094 R14: 0000000000000001 R15: 0000000000669f60
Code: be 02 00 00 00 48 89 df e8 a9 fe ff ff e9 7c ff ff ff 0f 1f 40 00 0f 1f 44 00 00 53 48 89 fb 48 c7 c7 c0 93 6d c0 e8 db 20 8c ef <83> 6b 1c 01 74 10 48 c7 c7 c0 93 6d c0 ff 14 25 e8 83 83 b0 5b
RIP: psample_group_put+0x15/0x71 [psample] RSP: ffffb8a80032f7d0
CR2: 000000000000001c

Fix it in tcf_sample_cleanup(), ensuring that calls to psample_group_put(p)
are done only when p is not NULL.

Fixes: cadb9c9fdbc6 ("net/sched: act_sample: Fix error path in init")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: fix NULL dereference in the error path of tunnel_key_init()

when the following command

# tc action add action tunnel_key unset index 100

is run for the first time, and tunnel_key_init() fails to allocate struct
tcf_tunnel_key_params, tunnel_key_release() dereferences NULL pointers.
This causes the following error:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: tunnel_key_release+0xd/0x40 [act_tunnel_key]
PGD 8000000033787067 P4D 8000000033787067 PUD 74646067 PMD 0
Oops: 0000 [#1] SMP PTI
Modules linked in: act_tunnel_key(E) act_csum ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel pcbc snd_hda_codec snd_hda_core snd_hwdep snd_seq aesni_intel snd_seq_device crypto_simd glue_helper snd_pcm cryptd joydev snd_timer pcspkr virtio_balloon snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk drm virtio_console crc32c_intel ata_piix serio_raw i2c_core virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
CPU: 2 PID: 3101 Comm: tc Tainted: G            E    4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:tunnel_key_release+0xd/0x40 [act_tunnel_key]
RSP: 0018:ffffba46803b7768 EFLAGS: 00010286
RAX: ffffffffc09010a0 RBX: 0000000000000000 RCX: 0000000000000024
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff99ee336d7480
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
R10: 0000000000000220 R11: ffff99ee79d73131 R12: 0000000000000000
R13: ffff99ee32d67610 R14: ffff99ee7671dc38 R15: 00000000fffffff4
FS:  00007febcb2cd740(0000) GS:ffff99ee7fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000007c8e4005 CR4: 00000000001606e0
Call Trace:
  __tcf_idr_release+0x79/0xf0
  tunnel_key_init+0xd9/0x460 [act_tunnel_key]
  tcf_action_init_1+0x2cc/0x430
  tcf_action_init+0xd3/0x1b0
  tc_ctl_action+0x18b/0x240
  rtnetlink_rcv_msg+0x29c/0x310
  ? _cond_resched+0x15/0x30
  ? __kmalloc_node_track_caller+0x1b9/0x270
  ? rtnl_calcit.isra.28+0x100/0x100
  netlink_rcv_skb+0xd2/0x110
  netlink_unicast+0x17c/0x230
  netlink_sendmsg+0x2cd/0x3c0
  sock_sendmsg+0x30/0x40
  ___sys_sendmsg+0x27a/0x290
  __sys_sendmsg+0x51/0x90
  do_syscall_64+0x6e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7febca6deba0
RSP: 002b:00007ffe7b0dd128 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe7b0dd250 RCX: 00007febca6deba0
RDX: 0000000000000000 RSI: 00007ffe7b0dd1a0 RDI: 0000000000000003
RBP: 000000005aaa90cb R08: 0000000000000002 R09: 0000000000000000
R10: 00007ffe7b0dcba0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe7b0dd264 R14: 0000000000000001 R15: 0000000000669f60
Code: 44 00 00 8b 0d b5 23 00 00 48 8b 87 48 10 00 00 48 8b 3c c8 e9 a5 e5 d8 c3 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 9f b0 00 00 00 <83> 7b 10 01 74 0b 48 89 df 31 f6 5b e9 f2 fa 7f c3 48 8b 7b 18
RIP: tunnel_key_release+0xd/0x40 [act_tunnel_key] RSP: ffffba46803b7768
CR2: 0000000000000010

Fix this in tunnel_key_release(), ensuring 'param' is not NULL before
dereferencing it.

Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: fix NULL dereference in the error path of tcf_csum_init()

when the following command

# tc action add action csum udp continue index 100

is run for the first time, and tcf_csum_init() fails allocating struct
tcf_csum, tcf_csum_cleanup() calls kfree_rcu(NULL,...). This causes the
following error:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: __call_rcu+0x23/0x2b0
PGD 80000000740b4067 P4D 80000000740b4067 PUD 32e7f067 PMD 0
Oops: 0002 [#1] SMP PTI
Modules linked in: act_csum(E) act_vlan ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer aesni_intel crypto_simd glue_helper cryptd snd joydev pcspkr virtio_balloon i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console ata_piix crc32c_intel libata virtio_pci serio_raw i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan]
CPU: 2 PID: 5763 Comm: tc Tainted: G            E    4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__call_rcu+0x23/0x2b0
RSP: 0018:ffffb275803e77c0 EFLAGS: 00010246
RAX: ffffffffc057b080 RBX: ffff9674bc6f5240 RCX: 00000000ffffffff
RDX: ffffffff928a5f00 RSI: 0000000000000008 RDI: 0000000000000008
RBP: 0000000000000008 R08: 0000000000000001 R09: 0000000000000044
R10: 0000000000000220 R11: ffff9674b9ab4821 R12: 0000000000000000
R13: ffffffff928a5f00 R14: 0000000000000000 R15: 0000000000000001
FS:  00007fa6368d8740(0000) GS:ffff9674bfd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000073dec001 CR4: 00000000001606e0
Call Trace:
  __tcf_idr_release+0x79/0xf0
  tcf_csum_init+0xfb/0x180 [act_csum]
  tcf_action_init_1+0x2cc/0x430
  tcf_action_init+0xd3/0x1b0
  tc_ctl_action+0x18b/0x240
  rtnetlink_rcv_msg+0x29c/0x310
  ? _cond_resched+0x15/0x30
  ? __kmalloc_node_track_caller+0x1b9/0x270
  ? rtnl_calcit.isra.28+0x100/0x100
  netlink_rcv_skb+0xd2/0x110
  netlink_unicast+0x17c/0x230
  netlink_sendmsg+0x2cd/0x3c0
  sock_sendmsg+0x30/0x40
  ___sys_sendmsg+0x27a/0x290
  ? filemap_map_pages+0x34a/0x3a0
  ? __handle_mm_fault+0xbfd/0xe20
  __sys_sendmsg+0x51/0x90
  do_syscall_64+0x6e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fa635ce9ba0
RSP: 002b:00007ffc185b0fc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffc185b10f0 RCX: 00007fa635ce9ba0
RDX: 0000000000000000 RSI: 00007ffc185b1040 RDI: 0000000000000003
RBP: 000000005aaa85e0 R08: 0000000000000002 R09: 0000000000000000
R10: 00007ffc185b0a20 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc185b1104 R14: 0000000000000001 R15: 0000000000669f60
Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89
RIP: __call_rcu+0x23/0x2b0 RSP: ffffb275803e77c0
CR2: 0000000000000010

fix this in tcf_csum_cleanup(), ensuring that kfree_rcu(param, ...) is
called only when param is not NULL.

Fixes: 9c5f69bbd75a ("net/sched: act_csum: don't use spinlock in the fast path")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/sched: fix NULL dereference in the error path of tcf_vlan_init()

when the following command

# tc actions replace action vlan pop index 100

is run for the first time, and tcf_vlan_init() fails allocating struct
tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes
the following error:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: __call_rcu+0x23/0x2b0
PGD 80000000760a2067 P4D 80000000760a2067 PUD 742c1067 PMD 0
Oops: 0002 [#1] SMP PTI
Modules linked in: act_vlan(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel mbcache snd_hda_codec jbd2 snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev soundcore virtio_balloon pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk virtio_net ata_piix crc32c_intel libata virtio_pci i2c_core virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan]
CPU: 3 PID: 3119 Comm: tc Tainted: G            E    4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__call_rcu+0x23/0x2b0
RSP: 0018:ffffaac3005fb798 EFLAGS: 00010246
RAX: ffffffffc0704080 RBX: ffff97f2b4bbe900 RCX: 00000000ffffffff
RDX: ffffffffabca5f00 RSI: 0000000000000010 RDI: 0000000000000010
RBP: 0000000000000010 R08: 0000000000000001 R09: 0000000000000044
R10: 00000000fd003000 R11: ffff97f2faab5b91 R12: 0000000000000000
R13: ffffffffabca5f00 R14: ffff97f2fb80202c R15: 00000000fffffff4
FS:  00007f68f75b4740(0000) GS:ffff97f2ffd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 0000000072b52001 CR4: 00000000001606e0
Call Trace:
  __tcf_idr_release+0x79/0xf0
  tcf_vlan_init+0x168/0x270 [act_vlan]
  tcf_action_init_1+0x2cc/0x430
  tcf_action_init+0xd3/0x1b0
  tc_ctl_action+0x18b/0x240
  rtnetlink_rcv_msg+0x29c/0x310
  ? _cond_resched+0x15/0x30
  ? __kmalloc_node_track_caller+0x1b9/0x270
  ? rtnl_calcit.isra.28+0x100/0x100
  netlink_rcv_skb+0xd2/0x110
  netlink_unicast+0x17c/0x230
  netlink_sendmsg+0x2cd/0x3c0
  sock_sendmsg+0x30/0x40
  ___sys_sendmsg+0x27a/0x290
  ? filemap_map_pages+0x34a/0x3a0
  ? __handle_mm_fault+0xbfd/0xe20
  __sys_sendmsg+0x51/0x90
  do_syscall_64+0x6e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f68f69c5ba0
RSP: 002b:00007fffd79c1118 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fffd79c1240 RCX: 00007f68f69c5ba0
RDX: 0000000000000000 RSI: 00007fffd79c1190 RDI: 0000000000000003
RBP: 000000005aaa708e R08: 0000000000000002 R09: 0000000000000000
R10: 00007fffd79c0ba0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffd79c1254 R14: 0000000000000001 R15: 0000000000669f60
Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89
RIP: __call_rcu+0x23/0x2b0 RSP: ffffaac3005fb798
CR2: 0000000000000018

fix this in tcf_vlan_cleanup(), ensuring that kfree_rcu(p, ...) is called
only when p is not NULL.

Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update")
Acked-by: Jiri Pirko <[email protected]>
Acked-by: Manish Kurup <[email protected]>
Signed-off-by: Davide Caratti <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface

According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and
DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only
available when PHY is configured in RGMII mode with 10Mbps speed. It will
cause some networking issues without RGMII mode, such as carrier sense
errors and low throughput. TI also mentioned this issue in their forum[4].

This patch adds the check mechanism for PHY interface with RGMII interface
type, the in-band mode can only be set in RGMII mode with 10Mbps speed.

References:
[1]: https://www.ti.com/lit/ug/spruh73p/spruh73p.pdf
[2]: http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf
[3]: http://www.ti.com/lit/ug/spruic2b/spruic2b.pdf
[4]: https://e2e.ti.com/support/arm/sitara_arm/f/791/p/640765/2392155

Suggested-by: Holsety Chen (陳憲輝) <[email protected]>
Signed-off-by: SZ Lin (林上智) <[email protected]>
Signed-off-by: Schuyler Patton <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: Fix ethtool private flags

The driver implementation returns support for private flags, while
no private flags are present. When asked for the number of private
flags it returns the number of statistic flag names.

Fix this by returning EOPNOTSUPP for not implemented ethtool flags.

Signed-off-by: Matthias Brugger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ALSA: hda/realtek - Always immediately update mute LED with pin VREF

Some HP laptops have a mute mute LED controlled by a pin VREF.  The
Realtek codec driver updates the VREF via vmaster hook by calling
snd_hda_set_pin_ctl_cache().

This works fine as long as the driver is running in a normal mode.
However, when the VREF change happens during the codec being in
runtime PM suspend, the regmap access will skip and postpone the
actual register change.  This ends up with the unchanged LED status
until the next runtime PM resume even if you change the Master mute
switch.  (Interestingly, the machine keeps the LED status even after
the codec goes into D3 -- but it's another story.)

For improving this usability, let the driver temporarily powering up /
down only during the pin VREF change.  This can be achieved easily by
wrapping the call with snd_hda_power_up_pm() / *_down_pm().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199073
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

mlxsw: spectrum_buffers: Set a minimum quota for CPU port traffic

In commit 9ffcc3725f09 ("mlxsw: spectrum: Allow packets to be trapped
from any PG") I fixed a problem where packets could not be trapped to
the CPU due to exceeded shared buffer quotas. The mentioned commit
explains the problem in detail.

The problem was fixed by assigning a minimum quota for the CPU port and
the traffic class used for scheduling traffic to the CPU.

However, commit 117b0dad2d54 ("mlxsw: Create a different trap group list
for each device") assigned different traffic classes to different
packet types and rendered the fix useless.

Fix the problem by assigning a minimum quota for the CPU port and all
the traffic classes that are currently in use.

Fixes: 117b0dad2d54 ("mlxsw: Create a different trap group list for each device")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Eddie Shklaer <[email protected]>
Tested-by: Eddie Shklaer <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sched: fix uses after free

syzbot reported one use-after-free in pfifo_fast_enqueue() [1]

Issue here is that we can not reuse skb after a successful skb_array_produce()
since another cpu might have consumed it already.

I believe a similar problem exists in try_bulk_dequeue_skb_slow()
in case we put an skb into qdisc_enqueue_skb_bad_txq() for lockless qdisc.

[1]
BUG: KASAN: use-after-free in qdisc_pkt_len include/net/sch_generic.h:610 [inline]
BUG: KASAN: use-after-free in qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
BUG: KASAN: use-after-free in pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
Read of size 4 at addr ffff8801cede37e8 by task syzkaller717588/5543

CPU: 1 PID: 5543 Comm: syzkaller717588 Not tainted 4.16.0-rc4+ #265
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23c/0x360 mm/kasan/report.c:412
__asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
qdisc_pkt_len include/net/sch_generic.h:610 [inline]
qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
__dev_xmit_skb net/core/dev.c:3216 [inline]

Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Cc: John Fastabend <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Jiri Pirko <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

parisc: Handle case where flush_cache_range is called with no context

Just when I had decided that flush_cache_range() was always called with
a valid context, Helge reported two cases where the
"BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd:

kernel BUG at /mnt/sdb6/linux/linux-4.15.4/arch/parisc/kernel/cache.c:587!
CPU: 1 PID: 3254 Comm: kworker/1:2 Tainted: G D 4.15.0-1-parisc64-smp #1 Debian 4.15.4-1+b1
Workqueue: events free_ioctx
  IAOQ[0]: flush_cache_range+0x164/0x168
  IAOQ[1]: flush_cache_page+0x0/0x1c8
  RP(r2): unmap_page_range+0xae8/0xb88
Backtrace:
  [<00000000404a6980>] unmap_page_range+0xae8/0xb88
  [<00000000404a6ae0>] unmap_single_vma+0xc0/0x188
  [<00000000404a6cdc>] zap_page_range_single+0x134/0x1f8
  [<00000000404a702c>] unmap_mapping_range+0x1cc/0x208
  [<0000000040461518>] truncate_pagecache+0x98/0x108
  [<0000000040461624>] truncate_setsize+0x9c/0xb8
  [<00000000405d7f30>] put_aio_ring_file+0x80/0x100
  [<00000000405d803c>] aio_free_ring+0x8c/0x290
  [<00000000405d82c0>] free_ioctx+0x80/0x180
  [<0000000040284e6c>] process_one_work+0x21c/0x668
  [<00000000402854c4>] worker_thread+0x20c/0x778
  [<0000000040291d44>] kthread+0x2d4/0x2e0
  [<0000000040204020>] end_fault_vector+0x20/0xc0

This indicates that we need to handle the no context case in
flush_cache_range() as we do in flush_cache_mm().

In thinking about this, I realized that we don't need to flush the TLB
when there is no context.  So, I added context checks to the large flush
cases in flush_cache_mm() and flush_cache_range().  The large flush case
occurs frequently in flush_cache_mm() and the change should improve fork
performance.

The v2 version of this change removes the BUG_ON from flush_cache_page()
by skipping the TLB flush when there is no context.  I also added code
to flush the TLB in flush_cache_mm() and flush_cache_range() when we
have a context that's not current.  Now all three routines handle TLB
flushes in a similar manner.

Signed-off-by: John David Anglin <[email protected]>
Cc: [email protected] # 4.9+
Signed-off-by: Helge Deller <[email protected]>

Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
"There's an important revert in this pull request that needs to go to
  stable as it causes a corruption on big endian machines.

  The other fix is for FIEMAP incorrectly reporting shared extents
  before a sync and one fix for a crash in raid56.

  So far we got only one report about the BE corruption, the stable
  kernels were out for like a week, so hopefully the scope of the damage
  is low"

* tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Revert "btrfs: use proper endianness accessors for super_copy"
  btrfs: add missing initialization in btrfs_check_shared
  btrfs: Fix NULL pointer exception in find_bio_stripe

Merge tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze fixes from Michal Simek:

- Use NO_BOOTMEM to fix boot issue

- Fix opt lib endian dependencies

* tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: switch to NO_BOOTMEM
  microblaze: remove unused alloc_maybe_bootmem
  microblaze: Setup dependencies for ASM optimized lib functions

x86/microcode: Fix CPU synchronization routine

Emanuel reported an issue with a hang during microcode update because my
dumb idea to use one atomic synchronization variable for both rendezvous
- before and after update - was simply bollocks:

  microcode: microcode_reload_late: late_cpus: 4
  microcode: __reload_late: cpu 2 entered
  microcode: __reload_late: cpu 1 entered
  microcode: __reload_late: cpu 3 entered
  microcode: __reload_late: cpu 0 entered
  microcode: __reload_late: cpu 1 left
  microcode: Timeout while waiting for CPUs rendezvous, remaining: 1

CPU1 above would finish, leave and the others will still spin waiting for
it to join.

So do two synchronization atomics instead, which makes the code a lot more
straightforward.

Also, since the update is serialized and it also takes quite some time per
microcode engine, increase the exit timeout by the number of CPUs on the
system.

That's ok because the moment all CPUs are done, that timeout will be cut
short.

Furthermore, panic when some of the CPUs timeout when returning from a
microcode update: we can't allow a system with not all cores updated.

Also, as an optimization, do not do the exit sync if microcode wasn't
updated.

Reported-by: Emanuel Czirai <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Emanuel Czirai <[email protected]>
Tested-by: Ashok Raj <[email protected]>
Tested-by: Tom Lendacky <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

x86/microcode: Attempt late loading only when new microcode is present

Return UCODE_NEW from the scanning functions to denote that new microcode
was found and only then attempt the expensive synchronization dance.

Reported-by: Emanuel Czirai <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Emanuel Czirai <[email protected]>
Tested-by: Ashok Raj <[email protected]>
Tested-by: Tom Lendacky <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

Merge tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"i915, amd and nouveau fixes.

  i915:
   - backlight fix for some panels
   - pm fix
   - fencing fix
   - some GVT fixes

  amdgpu:
   - backlight fix across suspend/resume
   - object destruction ordering issue fix
   - displayport fix

  nouveau:
   - two backlight fixes
   - fix for some lockups

  Pretty quiet week, seems like everyone was fixing backlights"

* tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau/bl: fix backlight regression
  drm/nouveau/bl: Fix oops on driver unbind
  drm/nouveau/mmu: ALIGN_DOWN correct variable
  drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field
  drm/i915/gvt: Correct the privilege shadow batch buffer address
  drm/amdgpu/dce: Don't turn off DP sink when disconnected
  drm/amdgpu: save/restore backlight level in legacy dce code
  drm/radeon: fix prime teardown order
  drm/amdgpu: fix prime teardown order
  drm/i915: Kick the rps worker when changing the boost frequency
  drm/i915: Only prune fences after wait-for-all
  drm/i915: Enable VBT based BL control for DP
  drm/i915/gvt: keep oa config in shadow ctx
  drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio

batman-adv: fix header size check in batadv_dbg_arp()

Checking for 0 is insufficient: when an SKB without a batadv header, but
with a VLAN header is received, hdr_size will be 4, making the following
code interpret the Ethernet header as a batadv header.

Fixes: be1db4f6615b ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: Matthias Schiffer <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

batman-adv: update data pointers after skb_cow()

batadv_check_unicast_ttvn() calls skb_cow(), so pointers into the SKB data
must be (re)set after calling it. The ethhdr variable is dropped
altogether.

Fixes: 7cdcf6dddc42 ("batman-adv: add UNICAST_4ADDR packet type")
Signed-off-by: Matthias Schiffer <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>

net: dsa: mv88e6xxx: Fix binding documentation for MDIO busses

The MDIO busses are switch properties and so should be inside the
switch node. Fix the examples in the binding document.

Reported-by: 尤晓杰 <[email protected]>
Signed-off-by: Andrew Lunn <[email protected]>
Fixes: a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses")
Signed-off-by: David S. Miller <[email protected]>

skbuff: Fix not waking applications when errors are enqueued

When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the waiting application is not notified.

Calling 'sk->sk_data_ready()' would not notify applications that
selected only POLLERR events in poll() (for example).

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Randy E. Witt <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netlink: avoid a double skb free in genlmsg_mcast()

nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.

Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
Reported-by: Ben Hutchings <[email protected]>
Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qede: Fix qedr link update

Link updates were not reported to qedr correctly.
Leading to cases where a link could be down, but qedr
would see it as up.
In addition, once qede was loaded, link state would be up,
regardless of the actual link state.

Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'qed-iWARP-related-fixes'

Michal Kalderon says:

====================
qed: iWARP related fixes

This series contains two fixes related to iWARP flow.
====================

Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>

qed: Fix non TCP packets should be dropped on iWARP ll2 connection

FW workaround. The iWARP LL2 connection did not expect TCP packets
to arrive on it's connection. The fix drops any non-tcp packets

Fixes b5c29ca ("qed: iWARP CM - setup a ll2 connection for handling
SYN packets")

Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qed: Fix MPA unalign flow in case header is split across two packets.

There is a corner case in the MPA unalign flow where a FPDU header is
split over two tcp segments. The length of the first fragment in this
case was not initialized properly and should be '1'

Fixes: c7d1d839 ("qed: Add support for MPA header being split over two tcp packets")
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/iucv: Free memory obtained by kzalloc

Free memory by calling put_device(), if afiucv_iucv_init is not
successful.

Signed-off-by: Arvind Yadav <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: systemport: Rewrite __bcm_sysport_tx_reclaim()

There is no need for complex checking between the last consumed index
and current consumed index, a simple subtraction will do.

This also eliminates the possibility of a permanent transmit queue stall
under the following conditions:

- one CPU bursts ring->size worth of traffic (up to 256 buffers), to the
  point where we run out of free descriptors, so we stop the transmit
  queue at the end of bcm_sysport_xmit()

- because of our locking, we have the transmit process disable
  interrupts which means we can be blocking the TX reclamation process

- when TX reclamation finally runs, we will be computing the difference
  between ring->c_index (last consumed index by SW) and what the HW
  reports through its register

- this register is masked with (ring->size - 1) = 0xff, which will lead
  to stripping the upper bits of the index (register is 16-bits wide)

- we will be computing last_tx_cn as 0, which means there is no work to
  be done, and we never wake-up the transmit queue, leaving it
  permanently disabled

A practical example is e.g: ring->c_index aka last_c_index = 12, we
pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12,
so last_tx_cn == 0, nothing happens.

Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>