Git Repo - linux.git/log

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
"This fixes a group scheduling related performance/interactivity
  regression introduced in v4.8, which affects certain hardware
  environments where cpu_possible_mask != cpu_present_mask"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix incorrect task group ->load_avg

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

Pull HID fixes from Jiri Kosina:

- hid-dr regression fix for certain dragonrise gamepads (device ID
   0079:0006), from Ioan-Adrian Ratiu

- dma-on-stack fix for hid-led driver, from Heiner Kallweit

- quirk for Akai MIDImix device

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: add quirk for Akai MIDImix.
  Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"
  HID: hid-dr: add input mapping for axis selection
  HID: hid-led: fix issue with transfer buffer not being dma capable

Merge tag 'pinctrl-v4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull first round of pin control fixes from Linus Walleij:

- a bunch of barnsjukdomar/kinderkrankheiten/maladie infantile in the
   Aspeed driver. (Why doesn't English have a word for this?)

   [ Maybe "teething problems" is the closest English idiom? - Linus T ]

- fix a lockdep bug on the Intel BayTrail.

- fix a few special laptop issues on the Intel pin controller solving
   suspend issues.

* tag 'pinctrl-v4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: intel: Only restore pins that are used by the driver
  pinctrl: baytrail: Fix lockdep
  pinctrl: aspeed-g5: Fix pin association of SPI1 function
  pinctrl: aspeed-g5: Fix GPIOE1 typo
  pinctrl: aspeed-g5: Fix names of GPID2 pins
  pinctrl: aspeed: "Not enabled" is a significant mux state

printk: suppress empty continuation lines

We have a fairly common pattern where you print several things as
continuations on one single line in a loop, and then at the end you do

printk(KERN_CONT "\n");

to flush the buffered output.

But if the output was flushed by something else (concurrent printk
activity, or just system logging), we don't want that final flushing to
just print an empty line.

So just suppress empty continuation lines when they couldn't be merged
into the line they are a continuation of.

Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'gup_flag-cleanups'

Merge the gup_flags cleanups from Lorenzo Stoakes:
"This patch series adjusts functions in the get_user_pages* family such
  that desired FOLL_* flags are passed as an argument rather than
  implied by flags.

  The purpose of this change is to make the use of FOLL_FORCE explicit
  so it is easier to grep for and clearer to callers that this flag is
  being used.  The use of FOLL_FORCE is an issue as it overrides missing
  VM_READ/VM_WRITE flags for the VMA whose pages we are reading
  from/writing to, which can result in surprising behaviour.

  The patch series came out of the discussion around commit 38e088546522
  ("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
  which addressed a BUG_ON() being triggered when a page was faulted in
  with PROT_NONE set but having been overridden by FOLL_FORCE.
  do_numa_page() was run on the assumption the page _must_ be one marked
  for NUMA node migration as an actual PROT_NONE page would have been
  dealt with prior to this code path, however FOLL_FORCE introduced a
  situation where this assumption did not hold.

  See

      https://marc.info/?l=linux-mm&m=147585445805166

  for the patch proposal"

Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.

[ This branch was rebased recently to add a few more acked-by's and
  reviewed-by's ]

* gup_flag-cleanups:
  mm: replace access_process_vm() write parameter with gup_flags
  mm: replace access_remote_vm() write parameter with gup_flags
  mm: replace __access_remote_vm() write parameter with gup_flags
  mm: replace get_user_pages_remote() write/force parameters with gup_flags
  mm: replace get_user_pages() write/force parameters with gup_flags
  mm: replace get_vaddr_frames() write/force parameters with gup_flags
  mm: replace get_user_pages_locked() write/force parameters with gup_flags
  mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
  mm: remove write/force parameters from __get_user_pages_unlocked()
  mm: remove write/force parameters from __get_user_pages_locked()
  mm: remove gup_flags FOLL_WRITE games from __get_user_pages()

x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS features

AVX512_4VNNIW - Vector instructions for deep learning enhanced word
variable precision.
AVX512_4FMAPS - Vector instructions for deep learning floating-point
single precision.

These new instructions are to be used in future Intel Xeon & Xeon Phi
processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new
instructions are supported by a processor.

The spec can be found in the Intel Software Developer Manual (SDM) or in
the Instruction Set Extensions Programming Reference (ISE).

Define new feature flags to enumerate the new instructions in /proc/cpuinfo
accordingly to CPUID bits and add the required xsave extensions which are
required for proper operation.

Signed-off-by: Piotr Luc <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

x86/vmware: Skip timer_irq_works() check on VMware

The timer_irq_works() boot check may sometimes fail in a VM, when
the Host is overcommitted or when the Guest is running nested.

Since the intended check is unnecessary on VMware's virtual
hardware, by-pass it.

Signed-off-by: Renat Valiullin <[email protected]>
Acked-by: Alok N Kataria <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/20161013184539.GA11497@rvaliullin-vm
Signed-off-by: Thomas Gleixner <[email protected]>

mm: replace access_process_vm() write parameter with gup_flags

This removes the 'write' argument from access_process_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.

We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Jesper Nilsson <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/numa: Remove duplicated include from mprotect.c

Signed-off-by: Wei Yongjun <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

mm: replace access_remote_vm() write parameter with gup_flags

This removes the 'write' argument from access_remote_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.

We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: replace __access_remote_vm() write parameter with gup_flags

This removes the 'write' argument from __access_remote_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.

We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: replace get_user_pages_remote() write/force parameters with gup_flags

This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: replace get_user_pages() write/force parameters with gup_flags

This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Jesper Nilsson <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: replace get_vaddr_frames() write/force parameters with gup_flags

This removes the 'write' and 'force' from get_vaddr_frames() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: replace get_user_pages_locked() write/force parameters with gup_flags

This removes the 'write' and 'force' use from get_user_pages_locked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

arm64: percpu: rewrite ll/sc loops in assembly

Writing the outer loop of an LL/SC sequence using do {...} while
constructs potentially allows the compiler to hoist memory accesses
between the STXR and the branch back to the LDXR. On CPUs that do not
guarantee forward progress of LL/SC loops when faced with memory
accesses to the same ERG (up to 2k) between the failed STXR and the
branch back, we may end up livelocking.

This patch avoids this issue in our percpu atomics by rewriting the
outer loop as part of the LL/SC inline assembly block.

Cc: <[email protected]>
Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations")
Reviewed-by: Mark Rutland <[email protected]>
Tested-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: swp emulation: bound LL/SC retries before rescheduling

If a CPU does not implement a global monitor for certain memory types,
then userspace can attempt a kernel DoS by issuing SWP instructions
targetting the problematic memory (for example, a framebuffer mapped
with non-cacheable attributes).

The SWP emulation code protects against these sorts of attacks by
checking for pending signals and potentially rescheduling when the STXR
instruction fails during the emulation. Whilst this is good for avoiding
livelock, it harms emulation of legitimate SWP instructions on CPUs
where forward progress is not guaranteed if there are memory accesses to
the same reservation granule (up to 2k) between the failing STXR and
the retry of the LDXR.

This patch solves the problem by retrying the STXR a bounded number of
times (4) before breaking out of the LL/SC loop and looking for
something else to do.

Cc: <[email protected]>
Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

sched/fair: Fix incorrect task group ->load_avg

A scheduler performance regression has been reported by Joseph Salisbury,
which he bisected back to:

  3d30544f0212 ("sched/fair: Apply more PELT fixes)

The regression triggers when several levels of task groups are involved
(read: SystemD) and cpu_possible_mask != cpu_present_mask.

The root cause is that group entity's load (tg_child->se[i]->avg.load_avg)
is initialized to scale_load_down(se->load.weight). During the creation of
a child task group, its group entities on possible CPUs are attached to
parent's cfs_rq (tg_parent) and their loads are added to the parent's load
(tg_parent->load_avg) with update_tg_load_avg().

But only the load on online CPUs will then be updated to reflect real load,
whereas load on other CPUs will stay at the initial value.

The result is a tg_parent->load_avg that is higher than the real load, the
weight of group entities (tg_parent->se[i]->load.weight) on online CPUs is
smaller than it should be, and the task group gets a less running time than
what it could expect.

( This situation can be detected with /proc/sched_debug. The ".tg_load_avg"
  of the task group will be much higher than sum of ".tg_load_avg_contrib"
  of online cfs_rqs of the task group. )

The load of group entities don't have to be intialized to something else
than 0 because their load will increase when an entity is attached.

Reported-by: Joseph Salisbury <[email protected]>
Tested-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: <[email protected]> # 4.8.x
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes)
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

efi/arm: Fix absolute relocation detection for older toolchains

When building the ARM kernel with CONFIG_EFI=y, the following build
error may occur when using a less recent version of binutils (2.23 or
older):

   STUBCPY drivers/firmware/efi/libstub/lib-sort.stub.o
00000000 R_ARM_ABS32       sort
00000004 R_ARM_ABS32       __ksymtab_strings
drivers/firmware/efi/libstub/lib-sort.stub.o: absolute symbol references not allowed in the EFI stub

(and when building with debug symbols, the list above is much longer, and
contains all the internal references between the .debug sections and the
actual code)

This issue is caused by the fact that objcopy v2.23 or earlier does not
support wildcards in its -R and -j options, which means the following
line from the Makefile:

  STUBCOPY_FLAGS-y := -R .debug* -R *ksymtab* -R *kcrctab*

fails to take effect, leaving harmless absolute relocations in the binary
that are indistinguishable from relocations that may cause crashes at
runtime due to the fact that these relocations are resolved at link time
using the virtual address of the kernel, which is always different from
the address at which the EFI firmware loads and invokes the stub.

So, as a workaround, disable debug symbols explicitly when building the
stub for ARM, and strip the ksymtab and kcrctab symbols for the only
exported symbol we currently reuse in the stub, which is 'sort'.

Tested-by: Jon Hunter <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

irqchip/eznps: Drop pointless static qualifier in nps400_of_init()

There is no need to have the 'struct irq_domain *nps400_root_domain'
variable static since new value is always assigned before use.

Fixes: 44df427c894a ("irqchip: add nps Internal and external irqchips")
Signed-off-by: Wei Yongjun <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

powerpc: Ignore the pkey system calls for now

Eliminates warning messages:

<stdin>:1316:2: warning: #warning syscall pkey_mprotect not implemented [-Wcpp]
<stdin>:1319:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp]
<stdin>:1322:2: warning: #warning syscall pkey_free not implemented [-Wcpp]

Hopefully we will remember to revert this commit if we ever implement
them.

Signed-off-by: Stephen Rothwell <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc: Fix numa topology console print

With recent update to printk, we get console output like below:

[    0.550639] Brought up 160 CPUs
[    0.550718] Node 0 CPUs:
[    0.550721]  0
[    0.550754] -39

[    0.550794] Node 1 CPUs:
[    0.550798]  40
[    0.550817] -79

[    0.550856] Node 16 CPUs:
[    0.550860]  80
[    0.550880] -119

[    0.550917] Node 17 CPUs:
[    0.550923]  120
[    0.550942] -159

Fix this by properly using pr_cont(), ie. KERN_CONT.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc/mm: Drop dump_numa_memory_topology()

At boot we dump the NUMA memory topology in dump_numa_memory_topology(),
at KERN_DEBUG level, resulting in output like:

  Node 0 Memory: 0x0-0x100000000
  Node 1 Memory: 0x100000000-0x200000000

Which is nice enough, but immediately after that we iterate over each
node and call setup_node_data(), which also prints out the node ranges,
at KERN_INFO, giving eg:

  numa: Initmem setup node 0 [mem 0x00000000-0xffffffff]
  numa: Initmem setup node 1 [mem 0x100000000-0x1ffffffff]

Additionally dump_numa_memory_topology() does not use KERN_CONT
correctly, resulting in split output lines on recent kernels.

So drop dump_numa_memory_topology() as superfluous chatter.

Signed-off-by: Michael Ellerman <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

cxl: Prevent adapter reset if an active context exists

This patch prevents resetting the cxl adapter via sysfs in presence of
one or more active cxl_context on it. This protects against an
unrecoverable error caused by PSL owning a dirty cache line even after
reset and host tries to touch the same cache line. In case a force reset
of the card is required irrespective of any active contexts, the int
value -1 can be stored in the 'reset' sysfs attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active context attached to the card
, which is checked against '0' before proceeding with the reset. To
prevent against a race condition where a context is activated just after
reset check is performed, the contexts_num is atomically set to '-1'
after reset-check to indicate that no more contexts can be activated on
the card anymore.

Before activating a context we atomically test if contexts_num is
non-negative and if so, increment its value by one. In case the value of
contexts_num is negative then it indicates that the card is about to be
reset and context activation is error-ed out at that point.

Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card")
Cc: [email protected] # v4.0+
Acked-by: Frederic Barrat <[email protected]>
Reviewed-by: Andrew Donnellan <[email protected]>
Signed-off-by: Vaibhav Jain <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc/boot: Fix boot on systems with uncompressed kernel image

This commit broke boot on systems with an uncompressed kernel image,
namely systems using a cuImage. On such systems the compressed boot
image (boot wrapper, uncompressed kernel image, ..) is decompressed
by u-boot already, therefore the boot wrapper code sees an
uncompressed kernel image.

The old decompression code silently assumed an uncompressed kernel
image if it found no valid gzip signature, whilst the new code
bailed out in this case.

Fix this by re-introducing such a fallback if no valid compressed
image is found.

Fixes: 1b7898ee276b ("Use the pre-boot decompression API")
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc/mm: Prevent unlikely crash in copro_calculate_slb()

If a cxl adapter faults on an invalid address for a kernel context, we
may enter copro_calculate_slb() with a NULL mm pointer (kernel
context) and an effective address which looks like a user
address. Which will cause a crash when dereferencing mm. It is clearly
an AFU bug, but there's no reason to crash either. So return an error,
so that cxl can ack the interrupt with an address error.

Fixes: 73d16a6e0e51 ("powerpc/cell: Move data segment faulting code out of cell platform")
Cc: [email protected] # v3.18+
Signed-off-by: Frederic Barrat <[email protected]>
Acked-by: Ian Munsie <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

KVM: MIPS: Add missing uaccess.h include

MIPS KVM uses user memory accessors but mips.c doesn't directly include
uaccess.h, so include it now.

This wasn't too much of a problem before v4.9-rc1 as asm/module.h
included asm/uaccess.h, however since commit 29abfbd9cbba ("mips:
separate extable.h, switch module.h to it") this is no longer the case.

This resulted in build failures when trace points were disabled, as
trace/define_trace.h includes trace/trace_events.h only ifdef
TRACEPOINTS_ENABLED, which goes on to include asm/uaccess.h via a couple
of other headers.

Fixes: 29abfbd9cbba ("mips: separate extable.h, switch module.h to it")
Signed-off-by: James Hogan <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: "Radim Krčmář" <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: [email protected]
Cc: [email protected]

sh: add earlycon support to j2_defconfig

Signed-off-by: Rich Felker <[email protected]>

sh: add Kconfig option for J-Core SoC core drivers

Signed-off-by: Rich Felker <[email protected]>

Merge tag 'for-f2fs-4.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs bugfix from Jaegeuk Kim:
"This fixes a bug which referenced the wrong pointer, sum_page, in
f2fs_gc. It was newly introduced in 4.9-rc1.

* tag 'for-f2fs-4.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: fix wrong sum_page pointer in f2fs_gc

mm: replace get_user_pages_unlocked() write/force parameters with gup_flags

This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: remove write/force parameters from __get_user_pages_unlocked()

This removes the redundant 'write' and 'force' parameters from
__get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: remove write/force parameters from __get_user_pages_locked()

This removes the redundant 'write' and 'force' parameters from
__get_user_pages_locked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: remove gup_flags FOLL_WRITE games from __get_user_pages()

This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").

In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.

Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.

To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.

Reported-and-tested-by: Phil "not Paul" Oester <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"Misc fixes, plus hw-enablement changes:

   - fix persistent RAM handling
   - remove pkeys warning
   - remove duplicate macro
   - fix debug warning in irq handler
   - add new 'Knights Mill' CPU related constants and enable the perf bits"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Add Knights Mill CPUID
  perf/x86/intel/rapl: Add Knights Mill CPUID
  perf/x86/intel: Add Knights Mill CPUID
  x86/cpu/intel: Add Knights Mill to Intel family
  x86/e820: Don't merge consecutive E820_PRAM ranges
  pkeys: Remove easily triggered WARN
  x86: Remove duplicate rtit status MSR macro
  x86/smp: Add irq_enter/exit() in smp_reschedule_interrupt()

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixlet from Ingo Molnar:
"Remove an unused variable"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
alarmtimer: Remove unused but set variable

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
"Fix a crash that can trigger when racing with CPU hotplug: we didn't
use sched-domains data structures carefully enough in select_idle_cpu()"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix sched domains NULL dereference in select_idle_sibling()

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Four tooling fixes, two kprobes KASAN related fixes and an x86 PMU
  driver fix/cleanup"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf jit: Fix build issue on Ubuntu
  perf jevents: Handle events including .c and .o
  perf/x86/intel: Remove an inconsistent NULL check
  kprobes: Unpoison stack in jprobe_return() for KASAN
  kprobes: Avoid false KASAN reports during stack copy
  perf header: Set nr_numa_nodes only when we parsed all the data
  perf top: Fix refreshing hierarchy entries on TUI

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
"Two fixes:

   - a file locks fix (missing critical section, bug introduced in this
     merge window)

   - an x86 down_write() stack frame annotation"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking, fs/locks: Add missing file_sem locks
  locking/rwsem/x86: Add stack frame dependency for ____down_write()

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
"Three irqchip driver fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gicv3: Handle loop timeout proper
  irqchip/jcore: Fix lost per-cpu interrupts
  irqchip/eznps: Acknowledge NPS_IPI before calling the handler

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc fixes from Ingo Molnar:
"A CPU hotplug debuggability fix and three objtool false positive
  warnings fixes for new GCC6 code generation patterns"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Use distinct name for cpu_hotplug.dep_map
  objtool: Skip all "unreachable instruction" warnings for gcov kernels
  objtool: Improve rare switch jump table pattern detection
  objtool: Support '-mtune=atom' stack frame setup instruction

Merge tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394

Pull firewire fixlet from Stefan Richter:
"IEEE 1394 subsystem patch: catch an initialization error in the packet
sniffer nosy"

* tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: nosy: do not ignore errors in ioremap_nocache()

MAINTAINERS: Add myself as EFI maintainer

At the request of Matt, I am taking up co-maintainership of the EFI
subsystem. So add my name to the EFI section in MAINTAINERS, and
change the SCM tree reference to point to the new shared Git repo.

Signed-off-by: Ard Biesheuvel <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Matt Fleming <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

Merge tag 'drm-fixes-for-v4.9-rc2' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Just had a couple of amdgpu fixes and one core fix I wanted to get out
  early to fix some regressions.

  I'm sure I'll have more stuff this week for -rc2"

* tag 'drm-fixes-for-v4.9-rc2' of git://people.freedesktop.org/~airlied/linux: (22 commits)
  drm: Print device information again in debugfs
  drm/amd/powerplay: fix bug stop dpm can't work on Vi.
  drm/amd/powerplay: notify smu no display by default.
  drm/amdgpu/dpm: implement thermal sensor for CZ/ST
  drm/amdgpu/powerplay: implement thermal sensor for CZ/ST
  drm/amdgpu: disable smu hw first on tear down
  drm/amdgpu: fix amdgpu_need_full_reset (v2)
  drm/amdgpu/si_dpm: Limit clocks on HD86xx part
  drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c
  drm/amdgpu: potential NULL dereference in debugfs code
  drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c
  drm/amd/powerplay: fix static checker warnings in iceland_smc.c
  drm/radeon: change vblank_time's calculation method to reduce computational error.
  drm/amdgpu: change vblank_time's calculation method to reduce computational error.
  drm/amdgpu: clarify UVD/VCE special handling for CG
  drm/amd/amdgpu: enable clockgating only after late init
  drm/radeon: allow TA_CS_BC_BASE_ADDR on SI
  drm/amdgpu: initialize the context reset_counter in amdgpu_ctx_init
  drm/amdgpu/gfx8: fix CGCG_CGLS handling
  drm/radeon: fix modeset tear down code
  ...

pinctrl: intel: Only restore pins that are used by the driver

Dell XPS 13 (and maybe some others) uses a GPIO (CPU_GP_1) during suspend
to explicitly disable USB touchscreen interrupt. This is done to prevent
situation where the lid is closed the touchscreen is left functional.

The pinctrl driver (wrongly) assumes it owns all pins which are owned by
host and not locked down. It is perfectly fine for BIOS to use those pins
as it is also considered as host in this context.

What happens is that when the lid of Dell XPS 13 is closed, the BIOS
configures CPU_GP_1 low disabling the touchscreen interrupt. During resume
we restore all host owned pins to the known state which includes CPU_GP_1
and this overwrites what the BIOS has programmed there causing the
touchscreen to fail as no interrupts are reaching the CPU anymore.

Fix this by restoring only those pins we know are explicitly requested by
the kernel one way or other.

Cc: [email protected]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=176361
Reported-by: AceLan Kao <[email protected]>
Tested-by: AceLan Kao <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: baytrail: Fix lockdep

Initialize the spinlock before using it.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.8.0-dwc-bisect #4
Hardware name: Intel Corp. VALLEYVIEW C0 PLATFORM/BYT-T FFD8, BIOS BLAKFF81.X64.0088.R10.1403240443 FFD8_X64_R_2014_13_1_00 03/24/2014
0000000000000000 ffff8800788ff770 ffffffff8133d597 0000000000000000
0000000000000000 ffff8800788ff7e0 ffffffff810cfb9e 0000000000000002
ffff8800788ff7d0 ffffffff8205b600 0000000000000002 ffff8800788ff7f0
Call Trace:
[<ffffffff8133d597>] dump_stack+0x67/0x90
[<ffffffff810cfb9e>] register_lock_class+0x52e/0x540
[<ffffffff810d2081>] __lock_acquire+0x81/0x16b0
[<ffffffff810cede1>] ? save_trace+0x41/0xd0
[<ffffffff810d33b2>] ? __lock_acquire+0x13b2/0x16b0
[<ffffffff810cf05a>] ? __lock_is_held+0x4a/0x70
[<ffffffff810d3b1a>] lock_acquire+0xba/0x220
[<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80
[<ffffffff81631567>] _raw_spin_lock_irqsave+0x47/0x60
[<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80
[<ffffffff8136f1fe>] byt_gpio_get_direction+0x3e/0x80
[<ffffffff813740a9>] gpiochip_add_data+0x319/0x7d0
[<ffffffff81631723>] ? _raw_spin_unlock_irqrestore+0x43/0x70
[<ffffffff8136fe3b>] byt_pinctrl_probe+0x2fb/0x620
[<ffffffff8142fb0c>] platform_drv_probe+0x3c/0xa0
...

Based on the diff it looks like the problem was introduced in
commit 71e6ca61e826 ("pinctrl: baytrail: Register pin control handling")
but I wasn't able to verify that empirically as the parent commit
just oopsed when I tried to boot it.

Cc: Cristina Ciocan <[email protected]>
Cc: [email protected]
Fixes: 71e6ca61e826 ("pinctrl: baytrail: Register pin control handling")
Signed-off-by: Ville Syrjälä <[email protected]>
Acked-by: Mika Westerberg <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: aspeed-g5: Fix pin association of SPI1 function

The SPI1 function was associated with the wrong pins: The functions that
those pins provide is either an SPI debug or passthrough function
coupled to SPI1. Make the SPI1 mux function configure the relevant pins
and associate new SPI1DEBUG and SPI1PASSTHRU functions with the pins
that were already defined.

The notation used in the datasheet's multi-function pin table for the SoC is
often creative: in this case the SYS* signals are enabled by a single bit,
which is nothing unusual on its own, but in this case the bit was also
participating in a multi-bit bitfield and therefore represented multiple
functions. This fact was overlooked in the original patch.

Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver)
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: aspeed-g5: Fix GPIOE1 typo

This prevented C20 from successfully being muxed as GPIO.

Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver)
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: aspeed-g5: Fix names of GPID2 pins

Fixes simple typos in the initial commit. There is no behavioural
change.

Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver)
Reported-by: Xo Wang <[email protected]>
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: aspeed: "Not enabled" is a significant mux state

Consider a scenario with one pin P that has two signals A and B, where A
is defined to be higher priority than B: That is, if the mux IP is in a
state that would consider both A and B to be active on P, then A will be
the active signal.

To instead configure B as the active signal we must configure the mux so
that A is inactive. The mux state for signals can be described by
logical operations on one or more bits from one or more registers (a
"signal expression"), which in some cases leads to aliased mux states for
a particular signal. Further, signals described by multi-bit bitfields
often do not only need to record the states that would make them active
(the "enable" expressions), but also the states that makes them inactive
(the "disable" expressions). All of this combined leads to four possible
states for a signal:

         1. A signal is active with respect to an "enable" expression
         2. A signal is not active with respect to an "enable" expression
         3. A signal is inactive with respect to a "disable" expression
         4. A signal is not inactive with respect to a "disable" expression

In the case of P, if we are looking to activate B without explicitly
having configured A it's enough to consider A inactive if all of A's
"enable" signal expressions evaluate to "not active". If any evaluate to
"active" then the corresponding "disable" states must be applied so it
becomes inactive.

For example, on the AST2400 the pins composing GPIO bank H provide
signals ROMD8 through ROMD15 (high priority) and those for UART6 (low
priority). The mux states for ROMD8 through ROMD15 are aliased, i.e.
there are two mux states that result in the respective signals being
configured:

         A. SCU90[6]=1
         B. Strap[4,1:0]=100

Further, the second mux state is a 3-bit bitfield that explicitly
defines the enabled state but the disabled state is implicit, i.e. if
Strap[4,1:0] is not exactly "100" then ROMD8 through ROMD15 are not
considered active. This requires the mux function evaluation logic to
use approach 2. above, however the existing code was using approach 3.
The problem was brought to light on the Palmetto machines where the
strap register value is 0x120ce416, and prevented GPIO requests in bank
H from succeeding despite the hardware being in a position to allow
them.

Fixes: 318398c09a8d ("pinctrl: Add core pinctrl support for Aspeed SoCs")
Signed-off-by: Andrew Jeffery <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

ceph: fix non static symbol warning

Fixes the following sparse warning:

fs/ceph/xattr.c:19:28: warning:
symbol 'ceph_other_xattr_handler' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>

locking, fs/locks: Add missing file_sem locks

I overlooked a few code-paths that can lead to
locks_delete_global_locks().

Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Bruce Fields <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: syzkaller <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

locking/rwsem/x86: Add stack frame dependency for ____down_write()

Arnd reported the following objtool warning:

kernel/locking/rwsem.o: warning: objtool: down_write_killable()+0x16: call without frame pointer save/setup

The warning means gcc placed the ____down_write() inline asm (and its
call instruction) before the frame pointer setup in
down_write_killable(), which breaks frame pointer convention and can
result in incorrect stack traces.

Force the stack frame to be created before the call instruction by
listing the stack pointer as an output operand in the inline asm
statement.

Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/1188b7015f04baf361e59de499ee2d7272c59dce.1476393828.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

ceph: fix uninitialized dentry pointer in ceph_real_mount()

fs/ceph/super.c: In function ‘ceph_real_mount’:
fs/ceph/super.c:818: warning: ‘root’ may be used uninitialized in this function

If s_root is already valid, dentry pointer root is never initialized,
and returned by ceph_real_mount(). This will cause a crash later when
the caller dereferences the pointer.

Fixes: ce2728aaa82bbeba ("ceph: avoid accessing / when mounting a subpath")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Yan, Zheng <[email protected]>

ceph: fix readdir vs fragmentation race

following sequence of events tigger the race

- client readdir frag 0* -> got item 'A'
- MDS merges frag 0* and frag 1*
- client send readdir request (frag 1*, offset 2, readdir_start 'A')
- MDS reply items (that are after item 'A') in frag *

Link: http://tracker.ceph.com/issues/17286
Signed-off-by: Yan, Zheng <[email protected]>

ext2: avoid bogus -Wmaybe-uninitialized warning

On ARM, we get this false-positive warning since the rework of
the ext2_get_blocks interface:

fs/ext2/inode.c: In function 'ext2_get_block':
include/linux/buffer_head.h:340:16: error: 'bno' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The calling conventions for this function are rather complex, and it's
not surprising that the compiler gets this wrong, I spent a long time
trying to understand how it all fits together myself.

This change to avoid the warning makes sure the compiler sees that we
always set 'bno' pointer whenever we have a positive return code.
The transformation is correct because we always arrive at the 'got_it'
label with a positive count that gets used as the return value, while
any branch to the 'cleanup' label has a negative or zero 'err'.

Fixes: 6750ad71986d ("ext2: stop passing buffer_head to ext2_get_blocks")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Dave Chinner <[email protected]>
Signed-off-by: Jan Kara <[email protected]>

isofs: Do not return EACCES for unknown filesystems

When isofs_mount() is called to mount a device read-write, it returns
EACCES even before it checks that the device actually contains an isofs
filesystem. This may confuse mount(8) which then tries to mount all
subsequent filesystem types in read-only mode.

Fix the problem by returning EACCES only once we verify that the device
indeed contains an iso9660 filesystem.

CC: [email protected]
Fixes: 17b7f7cf58926844e1dd40f5eb5348d481deca6a
Reported-by: Kent Overstreet <[email protected]>
Reported-by: Karel Zak <[email protected]>
Signed-off-by: Jan Kara <[email protected]>

btrfs: assign error values to the correct bio structs

Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio")
Signed-off-by: Junjie Mao <[email protected]>
Acked-by: David Sterba <[email protected]>
Cc: [email protected] # 4.3+
Signed-off-by: Linus Torvalds <[email protected]>

x86, pkeys: remove cruft from never-merged syscalls

pkey_set() and pkey_get() were syscalls present in older versions
of the protection keys patches.  The syscall number definitions
were inadvertently left in place.  This patch removes them.

I did a git grep and verified that these are the last places in
the tree that these appear, save for the protection_keys.c tests
and Documentation.  Those spots talk about functions called
pkey_get/set() which are wrappers for the direct PKRU
instructions, not the syscalls.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: f9afc6197e9bb ("x86: Wire up protection keys system calls")
Signed-off-by: Linus Torvalds <[email protected]>

arm64: sysreg: Fix use of XZR in write_sysreg_s

Commit 8a71f0c656e0 ("arm64: sysreg: replace open-coded mrs_s/msr_s with
{read,write}_sysreg_s") introduced a write_sysreg_s macro for writing
to system registers that are not supported by binutils.

Unfortunately, this was implemented with the wrong template (%0 vs %x0),
so in the case that we are writing a constant 0, we will generate
invalid instruction syntax and bail with a cryptic assembler error:

| Error: constant expression required

This patch fixes the template.

Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

hwmon: (max31790) potential ERR_PTR dereference

We should only dereference "data" after we check if it is an error
pointer.

Fixes: 54187ff9d766 ('hwmon: (max31790) Convert to use new hwmon registration API')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

hwmon: (adm9240) handle temperature readings below 0

Unlike the temperature thresholds the temperature data is a 9-bit signed
value. This allows and additional 0.5 degrees of precision on the
reading but makes handling negative values slightly harder. In order to
have sign-extension applied correctly the 9-bit value is stored in the
upper bits of a signed 16-bit value. When presenting this in sysfs the
value is shifted and scaled appropriately.

Signed-off-by: Chris Packham <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

generic syscalls: kill cruft from removed pkey syscalls

pkey_set() and pkey_get() were syscalls present in older versions
of the protection keys patches.  They were fully excised from the
x86 code, but some cruft was left in the generic syscall code.  The
C++ comments were intended to help to make it more glaring to me to
fix them before actually submitting them.  That technique worked,
but later than I would have liked.

I test-compiled this for arm64.

Fixes: a60f7b69d92c0 ("generic syscalls: Wire up memory protection keys syscalls")
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

irqchip/gic-v3-its: Fix entry size mask for GITS_BASER

Entry Size in GITS_BASER<n> occupies 5 bits [52:48], but we mask out 8
bits.

Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue")
Cc: [email protected]
Signed-off-by: Vladimir Murzin <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

arm64: kaslr: keep modules close to the kernel when DYNAMIC_FTRACE=y

The RANDOMIZE_MODULE_REGION_FULL Kconfig option allows KASLR to be
configured in such a way that kernel modules and the core kernel are
allocated completely independently, which implies that modules are likely
to require branches via PLT entries to reach the core kernel. The dynamic
ftrace code does not expect that, and assumes that it can patch module
code to perform a relative branch to anywhere in the core kernel. This
may result in errors such as

  branch_imm_common: offset out of range
  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 196 at kernel/trace/ftrace.c:1995 ftrace_bug+0x220/0x2e8
  Modules linked in:

  CPU: 3 PID: 196 Comm: systemd-udevd Not tainted 4.8.0-22-generic #24
  Hardware name: AMD Seattle/Seattle, BIOS 10:34:40 Oct  6 2016
  task: ffff8d1bef7dde80 task.stack: ffff8d1bef6b0000
  PC is at ftrace_bug+0x220/0x2e8
  LR is at ftrace_process_locs+0x330/0x430

So make RANDOMIZE_MODULE_REGION_FULL mutually exclusive with DYNAMIC_FTRACE
at the Kconfig level.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: kernel: Init MDCR_EL2 even in the absence of a PMU

Commit f436b2ac90a0 ("arm64: kernel: fix architected PMU registers
unconditional access") made sure we wouldn't access unimplemented
PMU registers, but also left MDCR_EL2 uninitialized in that case,
leading to trap bits being potentially left set.

Make sure we always write something in that register.

Fixes: f436b2ac90a0 ("arm64: kernel: fix architected PMU registers unconditional access")
Cc: Lorenzo Pieralisi <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

perf: xgene: Remove bogus IS_ERR() check

In acpi_get_pmu_hw_inf we pass the address of a local variable to IS_ERR(),
which doesn't make sense, as the pointer must be a real, valid pointer.
This doesn't cause a functional problem, as IS_ERR() will evaluate as
false, but the check is bogus and causes static checkers to complain.

Remove the bogus check.

The bug is reported by Dan Carpenter <[email protected]> in [1]

[1] https://www.spinics.net/lists/arm-kernel/msg535957.html

Signed-off-by: Tai Nguyen <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: kernel: numa: fix ACPI boot cpu numa node mapping

Commit 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must
bind to node0") removed the numa cpu<->node mapping restriction whereby
logical cpu 0 always corresponds to numa node 0; removing the
restriction was correct, in that it does not really exist in practice
but the commit only updated the early mapping of logical cpu 0 to its
real numa node for the DT boot path, missing the ACPI one, leading to
boot failures on ACPI systems owing to missing node<->cpu map for
logical cpu 0.

Fix the issue by updating the ACPI boot path with code that carries out
the early cpu<->node mapping also for the boot cpu (ie cpu 0), mirroring
what is currently done in the DT boot path.

Fixes: 7ba5f605f3a0 ("arm64/numa: remove the limitation that cpu0 must bind to node0")
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Tested-by: Laszlo Ersek <[email protected]>
Reported-by: Laszlo Ersek <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Laszlo Ersek <[email protected]>
Cc: Hanjun Guo <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: Zhen Lei <[email protected]>
Cc: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

Merge tag 'perf-urgent-for-mingo-20161017' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix handling of NUMA nodes in perf.data files (Jiri Olsa)

- Fix scrolling when refreshing 'perf top --tui --hierarchy' entries (Namhyung Kim)

- Fix building of JIT support on Ubuntu 16.04 (Anton Blanchard)

- Fix handling of events including .c and .o, that were being treated as
BPF scripts instead of vendor ones (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

perf jit: Fix build issue on Ubuntu

When building on Ubuntu 16.04, I get the following error:

Makefile:49: *** the openjdk development package appears to me missing, install and try again. Stop.

The problem is that update-java-alternatives has multiple spaces between
fields, and cut treats each space as a new delimiter:

java-1.8.0-openjdk-ppc64el 1081 /usr/lib/jvm/java-1.8.0-openjdk-ppc64el

Fix this by using awk, which handles this fine.

Signed-off-by: Anton Blanchard <[email protected]>
Reviewed-by: Stephane Eranian <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf jevents: Handle events including .c and .o

This patch helps with Sukadev's vendor event tree where such events can happen.

>From Andi Kleen:
Any event including a .c/.o/.bpf currently triggers BPF compilation or loading
and then an error. This can happen for some Intel vendor events, which cannot
be used.

This patch fixes this problem by forbidding BPF file patch containing '{', '}'
and ',', make sure flex consumes the leading '{', instead of matching it using
a BPF file path.

Tested result:

  $ perf stat -e '{unc_p_clockticks,unc_p_power_state_occupancy.cores_c0}' -a -I 1000
  invalid or unsupported event: '{unc_p_clockticks,unc_p_power_state_occupancy.cores_c0}'
  Run 'perf list' for a list of valid events
  (as expected, interperted as event)

  $ perf stat -e 'aaa.c' -a -I 1000
  ERROR: problems with path aaa.c: No such file or directory
  (as expected, interpreted as BPF source)

  $ perf stat -e 'aaa.ccc' -a -I 1000
  invalid or unsupported event: 'aaa.ccc'
  (as expected, interpreted as event)

  $ perf stat -e '{aaa.c}' -a -I 1000
  ERROR: problems with path aaa.c: No such file or directory
  event syntax error: '{aaa.c}'
  <SKIP>
  (as expected, interpreted as BPF source)

  $ perf stat -e '{cycles,aaa.c}' -a -I 1000
  ERROR: problems with path aaa.c: No such file or directory
  event syntax error: '{cycles,aaa.c}'
  (as expected, interpreted as BPF source)

Signed-off-by: Wang Nan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Reported-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Sukadev Bhattiprolu <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

MAINTAINERS: mmc: Move the mmc tree to kernel.org

Signed-off-by: Ulf Hansson <[email protected]>

memstick: rtsx_usb_ms: Manage runtime PM when accessing the device

Accesses to the rtsx usb device, which is the parent of the rtsx memstick
device, must not be done unless it's runtime resumed. This is currently not
the case and it could trigger various errors.

Fix this by properly deal with runtime PM in this regards. This means
making sure the device is runtime resumed, when serving requests via the
->request() callback or changing settings via the ->set_param() callbacks.

Cc: <[email protected]>
Cc: Ritesh Raj Sarraf <[email protected]>
Cc: Alan Stern <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

memstick: rtsx_usb_ms: Runtime resume the device when polling for cards

Accesses to the rtsx usb device, which is the parent of the rtsx memstick
device, must not be done unless it's runtime resumed.

Therefore when the rtsx_usb_ms driver polls for inserted memstick cards,
let's add pm_runtime_get|put*() to make sure accesses is done when the
rtsx usb device is runtime resumed.

Reported-by: Ritesh Raj Sarraf <[email protected]>
Tested-by: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Alan Stern <[email protected]>
Cc: <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: rtsx_usb_sdmmc: Handle runtime PM while changing the led

Accesses of the rtsx sdmmc's parent device, which is the rtsx usb device,
must be done when it's runtime resumed. Currently this isn't case when
changing the led, so let's fix this by adding a pm_runtime_get_sync() and
a pm_runtime_put() around those operations.

Reported-by: Ritesh Raj Sarraf <[email protected]>
Tested-by: Ritesh Raj Sarraf <[email protected]>
Cc: <[email protected]>
Cc: Alan Stern <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: rtsx_usb_sdmmc: Avoid keeping the device runtime resumed when unused

The rtsx_usb_sdmmc driver may bail out in its ->set_ios() callback when no
SD card is inserted. This is wrong, as it could cause the device to remain
runtime resumed when it's unused. Fix this behaviour.

Tested-by: Ritesh Raj Sarraf <[email protected]>
Cc: <[email protected]>
Cc: Alan Stern <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci: cast unsigned int to unsigned long long to avoid unexpeted error

Potentially overflowing expression 1000000 * data->timeout_clks with
type unsigned int is evaluated using 32-bit arithmetic, and then used
in a context that expects an expression of type unsigned long long.

To avoid overflow, cast 1000000U to type unsigned long long.
Special thanks to Coverity.

Fixes: 7f05538af71c ("mmc: sdhci: fix data timeout (part 2)")
Signed-off-by: Haibo Chen <[email protected]>
Cc: [email protected] # v3.15+
Acked-by: Adrian Hunter <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

PCI: layerscape: Fix drvdata usage before assignment

Commit fefe6733e516 ("PCI: layerscape: Move struct pcie_port setup
to probe function") changed the init ordering of the pcie structure,
but started to use the pcie->drvdata field before initializing it.
Mayhem follows.

Fix this by moving the drvdata assignment right before the first use.
Tested on LS2085a.

Fixes: efe6733e516 ("PCI: layerscape: Move struct pcie_port setup to probe function")
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

PCI: designware-plat: Change maintainer to Jose Abreu

Change designware-plat maintainer to Jose Abreu.

Signed-off-by: Joao Pinto <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

mailbox: PCC: Fix return value of pcc_mbox_request_channel()

When CONFIG_PCC is disabled, pcc_mbox_request_channel() needs to
return ERR_PTR(-ENODEV), not a NULL pointer, as the callers of
this function use IS_ERR() to check for error code.

Signed-off-by: Duc Dang <[email protected]>
Signed-off-by: Hoan Tran <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=y

As it turns out, the KASLR code breaks CONFIG_MODVERSIONS, since the
kcrctab has an absolute address field that is relocated at runtime
when the kernel offset is randomized.

This has been fixed already for PowerPC in the past, so simply wire up
the existing code dealing with this issue.

Cc: <[email protected]>
Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR")
Tested-by: Timur Tabi <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

irqchip/gic-v3-its: Fix 64bit GIC{R,ITS}_TYPER accesses

The GICv3 architecture specification mentions that a 64bit
register can be accessed using two 32bit accesses. What it
doesn't mention is that this is only guaranteed on a system
that implements AArch32, and a pure AArch64 system is allowed
not to support this. This causes issues with the GICR_TYPER
and GITS_TYPER registers, which are both RO 64bit registers.

In order to solve this, this patch switches the TYPER accesses
to the gic_read_typer macro already used in other parts of the
driver. This makes sure that we always use a 64bit access on
64bit systems, and two 32bit accesses on 32bit system.

Signed-off-by: Marc Zyngier <[email protected]>

alarmtimer: Remove unused but set variable

Remove the set but unused variable base in alarm_clock_get to fix the
following warning when building with 'W=1':

kernel/time/alarmtimer.c: In function ‘alarm_timer_create’:
kernel/time/alarmtimer.c:545:21: warning: variable ‘base’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Tobias Klauser <[email protected]>
Cc: John Stultz <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

Merge branch 'mm/pkeys' into x86/urgent, to pick up pkeys fix

Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel/uncore: Add Knights Mill CPUID

Add Knights Mill (KNM) to the list of CPUIDs supported by PMU.

Signed-off-by: Piotr Luc <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel/rapl: Add Knights Mill CPUID

Add Knights Mill (KNM) to the list of CPUIDs supported by rapl.

Signed-off-by: Piotr Luc <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/intel: Add Knights Mill CPUID

Add Knights Mill (KNM) to the list of CPUIDs supported by PMU.

Signed-off-by: Piotr Luc <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

x86/cpu/intel: Add Knights Mill to Intel family

Add CPUID of Knights Mill (KNM) processor to Intel family list.

Signed-off-by: Piotr Luc <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

drm: Print device information again in debugfs

I was a bit over-eager in my cleanup in

commit 95c081c17f284de50eaca60d4d55643a64d39019
Author: Daniel Vetter <[email protected]>
Date: Tue Jun 21 10:54:12 2016 +0200

drm: Move master pointer from drm_minor to drm_device

Noticed by Chris Wilson.

Fixes: 95c081c17f28 ("drm: Move master pointer from drm_minor to drm_device")
Cc: Chris Wilson <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Emil Velikov <[email protected]>
Cc: Julia Lawall <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Tested-by: Chris Wilson <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

Merge branch 'drm-next-4.9' of git://people.freedesktop.org/~agd5f/linux into drm-next

Fixes for radeon and amdgpu for 4.9:
- allow an additional reg in the SI reg checker
- fix thermal sensor readback on CZ/ST
- misc bug fixes

* 'drm-next-4.9' of git://people.freedesktop.org/~agd5f/linux: (21 commits)
  drm/amd/powerplay: fix bug stop dpm can't work on Vi.
  drm/amd/powerplay: notify smu no display by default.
  drm/amdgpu/dpm: implement thermal sensor for CZ/ST
  drm/amdgpu/powerplay: implement thermal sensor for CZ/ST
  drm/amdgpu: disable smu hw first on tear down
  drm/amdgpu: fix amdgpu_need_full_reset (v2)
  drm/amdgpu/si_dpm: Limit clocks on HD86xx part
  drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c
  drm/amdgpu: potential NULL dereference in debugfs code
  drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c
  drm/amd/powerplay: fix static checker warnings in iceland_smc.c
  drm/radeon: change vblank_time's calculation method to reduce computational error.
  drm/amdgpu: change vblank_time's calculation method to reduce computational error.
  drm/amdgpu: clarify UVD/VCE special handling for CG
  drm/amd/amdgpu: enable clockgating only after late init
  drm/radeon: allow TA_CS_BC_BASE_ADDR on SI
  drm/amdgpu: initialize the context reset_counter in amdgpu_ctx_init
  drm/amdgpu/gfx8: fix CGCG_CGLS handling
  drm/radeon: fix modeset tear down code
  drm/radeon: fix up dp aux tear down (v2)
  ...

Merge remote-tracking branch 'mkp-scsi/4.9/scsi-fixes' into fixes

perf/x86/intel: Remove an inconsistent NULL check

Smatch complains that we don't check "event->ctx" consistently. It's
never NULL so we can just remove the check.

Signed-off-by: Dan Carpenter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Carrillo-Cisneros <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>

Merge tag 'v4.9-rc1' into x86/urgent, to pick up updates

Signed-off-by: Ingo Molnar <[email protected]>

x86/e820: Don't merge consecutive E820_PRAM ranges

Commit:

917db484dc6a ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation")

... fixed up the broken manipulations of max_pfn in the presence of
E820_PRAM ranges.

However, it also broke the sanitize_e820_map() support for not merging
E820_PRAM ranges.

Re-introduce the enabling to keep resource boundaries between
consecutive defined ranges. Otherwise, for example, an environment that
boots with memmap=2G!8G,2G!10G will end up with a single 4G /dev/pmem0
device instead of a /dev/pmem0 and /dev/pmem1 device 2G in size.

Reported-by: Dave Chinner <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Cc: <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Zhang Yi <[email protected]>
Cc: [email protected]
Fixes: 917db484dc6a ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation")
Link: http://lkml.kernel.org/r/147629530854.10618.10383744751594021268.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <[email protected]>

cpu/hotplug: Use distinct name for cpu_hotplug.dep_map

Use distinctive name for cpu_hotplug.dep_map to avoid the actual
cpu_hotplug.lock appearing as cpu_hotplug.lock#2 in lockdep splats.

Signed-off-by: Joonas Lahtinen <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Acked-by: Gautham R. Shenoy <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Gautham R . Shenoy <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>

kprobes: Unpoison stack in jprobe_return() for KASAN

I observed false KSAN positives in the sctp code, when
sctp uses jprobe_return() in jsctp_sf_eat_sack().

The stray 0xf4 in shadow memory are stack redzones:

[     ] ==================================================================
[     ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
[     ] Read of size 1 by task syz-executor/18535
[     ] page:ffffea00017923c0 count:0 mapcount:0 mapping:          (null) index:0x0
[     ] flags: 0x1fffc0000000000()
[     ] page dumped because: kasan: bad access detected
[     ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
[     ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[     ]  ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
[     ]  ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
[     ]  ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
[     ] Call Trace:
[     ]  [<ffffffff82d2b849>] dump_stack+0x12e/0x185
[     ]  [<ffffffff817d3169>] kasan_report+0x489/0x4b0
[     ]  [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
[     ]  [<ffffffff82d49529>] memcmp+0xe9/0x150
[     ]  [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
[     ]  [<ffffffff817d2031>] save_stack+0xb1/0xd0
[     ]  [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
[     ]  [<ffffffff817d05b8>] kfree+0xc8/0x2a0
[     ]  [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
[     ]  [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
[     ]  [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
[     ]  [<ffffffff85b11348>] consume_skb+0x138/0x370
[     ]  [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
[     ]  [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
[     ]  [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
[     ]  [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
[     ]  [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
[     ]  [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
[ ... ]
[     ] Memory state around the buggy address:
[     ]  ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]  ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]                    ^
[     ]  ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]  ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ] ==================================================================

KASAN stack instrumentation poisons stack redzones on function entry
and unpoisons them on function exit. If a function exits abnormally
(e.g. with a longjmp like jprobe_return()), stack redzones are left
poisoned. Later this leads to random KASAN false reports.

Unpoison stack redzones in the frames we are going to jump over
before doing actual longjmp in jprobe_return().

Signed-off-by: Dmitry Vyukov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

kprobes: Avoid false KASAN reports during stack copy

Kprobes save and restore raw stack chunks with memcpy().
With KASAN these chunks can contain poisoned stack redzones,
as the result memcpy() interceptor produces false
stack out-of-bounds reports.

Use __memcpy() instead of memcpy() for stack copying.
__memcpy() is not instrumented by KASAN and does not lead
to the false reports.

Currently there is a spew of KASAN reports during boot
if CONFIG_KPROBES_SANITY_TEST is enabled:

[   ] Kprobe smoke test: started
[   ] ==================================================================
[   ] BUG: KASAN: stack-out-of-bounds in setjmp_pre_handler+0x17c/0x280 at addr ffff88085259fba8
[   ] Read of size 64 by task swapper/0/1
[   ] page:ffffea00214967c0 count:0 mapcount:0 mapping:          (null) index:0x0
[   ] flags: 0x2fffff80000000()
[   ] page dumped because: kasan: bad access detected
[...]

Reported-by: CAI Qian <[email protected]>
Tested-by: CAI Qian <[email protected]>
Signed-off-by: Dmitry Vyukov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
[ Improved various details. ]
Signed-off-by: Ingo Molnar <[email protected]>

objtool: Skip all "unreachable instruction" warnings for gcov kernels

Recently objtool has started reporting a few "unreachable instruction"
warnings when CONFIG_GCOV is enabled for newer versions of GCC.  Usually
this warning means there's some new control flow that objtool doesn't
understand.  But in this case, objtool is correct and the instructions
really are inaccessible.  It's an annoying quirk of gcov, but it's
harmless, so it's ok to just silence the warnings.

With older versions of GCC, it was relatively easy to detect
gcov-specific instructions and to skip any unreachable warnings produced
by them.  But GCC 6 has gotten craftier.

Instead of continuing to play whack-a-mole with gcov, just use a bigger,
more permanent hammer and disable unreachable warnings for the whole
file when gcov is enabled.  This is fine to do because a) unreachable
warnings are usually of questionable value; and b) gcov isn't used for
production kernels and we can relax the checks a bit there.

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/38d5c87d61d9cd46486dd2c86f46603dff0df86f.1476393584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

objtool: Improve rare switch jump table pattern detection

GCC 6 added a new switch statement jump table optimization which makes
objtool's life harder.  It looks like:

  mov [rodata addr],%reg1
  ... some instructions ...
  jmpq *(%reg1,%reg2,8)

The optimization is quite rare, but objtool still needs to be able to
identify the pattern so that it can follow all possible control flow
paths related to the switch statement.

In order to detect the pattern, objtool starts from the indirect jump
and scans backwards through the function until it finds the first
instruction in the pattern.  If it encounters an unconditional jump
along the way, it stops and considers the pattern to be not found.

As it turns out, unconditional jumps can happen, as long as they are
small forward jumps within the range being scanned.

This fixes the following warnings:

  drivers/infiniband/sw/rxe/rxe_comp.o: warning: objtool: rxe_completer()+0x2f4: sibling call from callable instruction with changed frame pointer
  drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0x10f: sibling call from callable instruction with changed frame pointer

Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/8a9ed68ae1780e8d3963e4ee13f2f257fe3a3c33.1476393584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>

ceph: fix error handling in ceph_read_iter

In case __ceph_do_getattr returns an error and the retry_op in
ceph_read_iter is not READ_INLINE, then it's possible to invoke
__free_page on a page which is NULL, this naturally leads to a crash.
This can happen when, for example, a process waiting on a MDS reply
receives sigterm.

Fix this by explicitly checking whether the page is set or not.

Cc: [email protected] # 3.19+
Signed-off-by: Nikolay Borisov <[email protected]>
Reviewed-by: Yan, Zheng <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>