Git Repo - linux.git/log

scripts/gdb: add version command

lx-version Report the Linux Version of the current kernel.

Add a command to identify the version specified by the banner in the
debugged kernel.

This lets the user identify the kernel of the running kernel, and will
let later scripts compare the banner of the attached kernel against the
banner in the vmlinux symbols files to verify that the files are
correct.

[[email protected]: remove blank line from help output and fix pep8 warning]
Signed-off-by: Kieran Bingham <[email protected]>
Signed-off-by: Jan Kiszka <[email protected]>
Cc: Jason Wessel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel: add kcov code coverage

kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing).  Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system.  A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/).  However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.

kcov does not aim to collect as much coverage as possible.  It aims to
collect more or less stable coverage that is function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts and instrumentation of some inherently non-deterministic or
non-interesting parts of kernel is disbled (e.g.  scheduler, locking).

Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes.  Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch).  I've
dropped the second mode for simplicity.

This patch adds the necessary support on kernel side.  The complimentary
compiler support was added in gcc revision 231296.

We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:

  https://github.com/google/syzkaller/wiki/Found-Bugs

We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation".  For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.

Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
typical coverage can be just a dozen of basic blocks (e.g.  an invalid
input).  In such context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M).  Cost of
kcov depends only on number of executed basic blocks/edges.  On top of
that, kernel requires per-thread coverage because there are always
background threads and unrelated processes that also produce coverage.
With inlined gcov instrumentation per-thread coverage is not possible.

kcov exposes kernel PCs and control flow to user-space which is
insecure.  But debugfs should not be mapped as user accessible.

Based on a patch by Quentin Casasnovas.

[[email protected]: make task_struct.kcov_mode have type `enum kcov_mode']
[[email protected]: unbreak allmodconfig]
[[email protected]: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: syzkaller <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Tavis Ormandy <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Quentin Casasnovas <[email protected]>
Cc: Kostya Serebryany <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: David Drysdale <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

profile: hide unused functions when !CONFIG_PROC_FS

A couple of functions and variables in the profile implementation are
used only on SMP systems by the procfs code, but are unused if either
procfs is disabled or in uniprocessor kernels.  gcc prints a harmless
warning about the unused symbols:

  kernel/profile.c:243:13: error: 'profile_flip_buffers' defined but not used [-Werror=unused-function]
   static void profile_flip_buffers(void)
               ^
  kernel/profile.c:266:13: error: 'profile_discard_flip_buffers' defined but not used [-Werror=unused-function]
   static void profile_discard_flip_buffers(void)
               ^
  kernel/profile.c:330:12: error: 'profile_cpu_callback' defined but not used [-Werror=unused-function]
   static int profile_cpu_callback(struct notifier_block *info,
              ^

This adds further #ifdef to the file, to annotate exactly in which cases
they are used.  I have done several thousand ARM randconfig kernels with
this patch applied and no longer get any warnings in this file.

Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hpwdt: use nmi_panic() when kernel panics in NMI handler

Commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to panic
on NMI") introduced nmi_panic() which prevents concurrent and recursive
execution of panic(). It also saves registers for the crash dump on x86
by later commit 58c5661f2144 ("panic, x86: Allow CPUs to save registers
even if looping in NMI context").

hpwdt driver can call panic() from NMI handler, so replace it with
nmi_panic(). Also, do some cleanups.

Signed-off-by: Hidehiro Kawai <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Cc: Thomas Mingarelli <[email protected]>
Cc: Wim Van Sebroeck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ipmi/watchdog: use nmi_panic() when kernel panics in NMI handler

Commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to panic
on NMI") introduced nmi_panic() which prevents concurrent and recursive
execution of panic(). It also saves registers for the crash dump on x86
by later commit 58c5661f2144 ("panic, x86: Allow CPUs to save registers
even if looping in NMI context").

ipmi_watchdog driver can call panic() from NMI handler, so replace it
with nmi_panic().

Signed-off-by: Hidehiro Kawai <[email protected]>
Acked-by: Corey Minyard <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

panic: change nmi_panic from macro to function

Commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to panic
on NMI") and commit 58c5661f2144 ("panic, x86: Allow CPUs to save
registers even if looping in NMI context") introduced nmi_panic() which
prevents concurrent/recursive execution of panic().  It also saves
registers for the crash dump on x86.

However, there are some cases where NMI handlers still use panic().
This patch set partially replaces them with nmi_panic() in those cases.

Even this patchset is applied, some NMI or similar handlers (e.g.  MCE
handler) continue to use panic().  This is because I can't test them
well and actual problems won't happen.  For example, the possibility
that normal panic and panic on MCE happen simultaneously is very low.

This patch (of 3):

Convert nmi_panic() to a proper function and export it instead of
exporting internal implementation details to modules, for obvious
reasons.

Signed-off-by: Hidehiro Kawai <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Acked-by: Michal Nazarewicz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Nicolas Iooss <[email protected]>
Cc: Javi Merino <[email protected]>
Cc: Gobinda Charan Maji <[email protected]>
Cc: "Steven Rostedt (Red Hat)" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: HATAYAMA Daisuke <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

eventfd: document lockless access in eventfd_poll

Since commit e22553e2a25e ("eventfd: don't take the spinlock in
eventfd_poll", 2015-02-17), eventfd is reading ctx->count outside
ctx->wqh.lock.

However, things aren't as simple as the read barrier in eventfd_poll
would suggest.  In fact, the read barrier, besides lacking a comment, is
not paired in any obvious manner with another read barrier, and it is
pointless because it is sitting between a write (deep in poll_wait) and
the read of ctx->count.  The read barrier is acting just as a compiler
barrier, for which we can use READ_ONCE instead.  This is what the code
change in this patch does.

The documentation change is just as important, however.  The question,
posed by Andrea Arcangeli, is then why the thing is safe on
architectures where spin_unlock does not imply a store-load memory
barrier.  The answer is that it's safe because writes of ctx->count use
the same lock as poll_wait, and hence an acquire barrier implicit in
poll_wait provides the necessary synchronization between eventfd_poll
and callers of wake_up_locked_poll.  This is sort of mentioned in the
commit message with respect to eventfd_ctx_read ("eventfd_read is
similar, it will do a single decrement with the lock held") but it
applies to all other callers too.  It's tricky enough that it should be
documented in the code.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Davide Libenzi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cred/userns: define current_user_ns() as a function

The current_user_ns() macro currently returns &init_user_ns when user
namespaces are disabled, and that causes several warnings when building
with gcc-6.0 in code that compares the result of the macro to
&init_user_ns itself:

  fs/xfs/xfs_ioctl.c: In function 'xfs_ioctl_setattr_check_projid':
  fs/xfs/xfs_ioctl.c:1249:22: error: self-comparison always evaluates to true [-Werror=tautological-compare]
    if (current_user_ns() == &init_user_ns)

This is a legitimate warning in principle, but here it isn't really
helpful, so I'm reprasing the definition in a way that shuts up the
warning.  Apparently gcc only warns when comparing identical literals,
but it can figure out that the result of an inline function can be
identical to a constant expression in order to optimize a condition yet
not warn about the fact that the condition is known at compile time.
This is exactly what we want here, and it looks reasonable because we
generally prefer inline functions over macros anyway.

Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Serge Hallyn <[email protected]>
Cc: David Howells <[email protected]>
Cc: Yaowei Bai <[email protected]>
Cc: James Morris <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add mport char device driver

Add mport character device driver to provide user space interface to
basic RapidIO subsystem operations.

See included Documentation/rapidio/mport_cdev.txt for more details.

[[email protected]: fix printk warning on i386]
[[email protected]: mport_cdev: fix some error codes]
Signed-off-by: Alexandre Bounine <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Tested-by: Barry Wood <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Cc: Barry Wood <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721_dma: fix hardware error handling

Add DMA channel re-initialization after an error to avoid termination of
all pending transfer requests.

Signed-off-by: Alexandre Bounine <[email protected]>
Reported-by: Barry Wood <[email protected]>
Tested-by: Barry Wood <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Cc: Barry Wood <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721_dma: fix synchronization issues

Fix synchronization issues found during testing using multiple DMA
transfer requests to the same channel:

- lost MSI-X interrupt notifications
- non-synchronized attempts to start DMA channel HW resulting in error
message from the driver
- cookie tracking/update race conditions resulting in incorrect DMA
transfer status report

Signed-off-by: Alexandre Bounine <[email protected]>
Reported-by: Barry Wood <[email protected]>
Tested-by: Barry Wood <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Cc: Barry Wood <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721_dma: update error reporting from prep_sg callback

Switch to returning error-valued pointer instead of simple NULL pointer.
This allows to properly identify situation when request queue is full
and therefore gives to upper layer an option to retry operation later.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add filtered debug output

Replace "all-or-nothing" debug output with controlled debug output using
functional block masks. This allows run time control of debug messages
through 'dbg_level' module parameter.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add outbound windows mapping support

Add device-specific callback functions to support outbound windows
mapping and release.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add outbound window support

Add RapidIO controller (mport) outbound window configuration operations.

This patch is a part of the original patch submitted by Li Yang:

   https://lists.ozlabs.org/pipermail/linuxppc-dev/2009-April/071210.html

For some reason the original part was not applied to mainline code
tree.  The inbound window mapping part has been applied later during
tsi721 mport driver submission.  Now goes the second part with
corresponding HW support.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Li Yang <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: fix locking in OB_MSG processing

- Add spinlock protection into outbound message queuing routine.

- Change outbound message interrupt handler to avoid deadlock when
  calling registered callback routine.

- Allow infinite retries for outbound messages to avoid retry threshold
  error signaling in systems with nodes that have slow message receive
  queue processing.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add global inbound port write interfaces

Add new Port Write handler registration interfaces that attach PW
handlers to local mport device objects. This is different from old
interface that attaches PW callback to individual RapidIO device. The
new interfaces are intended for use for common event handling (e.g.
hot-plug notifications) while the old interface is available for
individual device drivers.

This patch is based on patch proposed by Andre van Herk but preserves
existing per-device interface and adds lock protection for list
handling.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: move rio_pw_enable into core code

Make rio_pw_enable() routine available to other RapidIO drivers.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: move rio_local_set_device_id function to the common core

Make function rio_local_set_device_id() common for all components of
RapidIO subsystem.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add lock protection for doorbell list

Add lock protection around doorbell list handling to prevent list
corruption on SMP platforms.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/rionet: add mport removal handling

Add handling of a local mport device removal.

RIONET driver registers itself as class interface that supports only
removal notification, 'add_device' callback is not provided because
RIONET network device can be initialized only after enumeration is
completed and the existing method (using remote peer addition) satisfies
this condition.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/rionet: add locking into add/remove device

Add spinlock protection when handling list of connected peers and
ability to handle new peer device addition after the RIONET device was
open. Before his update RIONET was sending JOIN requests only when it
have been opened, peer devices added later have been missing from this
process.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

powerpc/fsl_rio: changes to mport registration

Change mport object initialization/registration sequence to match
reworked version of rio_register_mport() in the core code.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add HW specific mport removal

Add hardware-specific device removal support for Tsi721 PCIe-to-RapidIO
bridge. To avoid excessive data type conversions, parameters passed to
some internal functions have been revised. Dynamic memory allocations
of rio_mport and rio_ops have been replaced to reduce references between
data structures.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add core mport removal support

Add common mport removal support functions into the RapidIO subsystem
core.

Changes to the existing mport registration process have been made to
avoid race conditions with active subsystem interfaces immediately after
mport device registration: part of initialization code from
rio_register_mport() have been moved into separate function
rio_mport_initialize() to allow to perform mport registration as the
final step of setup process.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: move net allocation into core code

Make net allocation/release routines available to all components of
RapidIO subsystem by moving code from rio-scan enumerator.

Make destination ID allocation method private to existing enumerator
because other enumeration methods can use their own algorithm.

Setup net device object as a parent of all RapidIO devices residing in
it and register net as a child of active mport device.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: rework common RIO device add/delete routines

This patch moves per-net device list handling from rio-scan to common
RapidIO core and adds a matching device deletion routine. This makes
device object creation/removal available to other implementations of
enumeration/discovery process.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/rionet: add shutdown event handling

Add shutdown notification handler which terminates active connections
with remote RapidIO nodes. This prevents remote nodes from sending
packets to the powered off node and eliminates hardware error events on
remote nodes.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add shutdown notification callback

Add device driver specific shutdown notification callback.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add shutdown notification for RapidIO devices

Add bus-specific callback to stop RapidIO devices during a system
shutdown.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add query_mport callback

Add device-specific implementation of query_mport callback function.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: add query_mport operation

Add mport query operation to report master port RapidIO capabilities and
run time configuration to upper level drivers.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721_dma: fix pending transaction queue handling

Fix pending DMA request queue handling to avoid broken ordering during
concurrent request submissions.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add option to configure direct mapping of IB window

Add an option to configure mapping of Inbound Window without RIO-to-PCIe
address translation.

If a local memory buffer is not properly aligned to meet HW requirements
for RapidIO address mapping with address translation, caller can request
an inbound window with matching RapidIO address assigned to it. This
implementation selects RapidIO base address and size for inbound window
that are capable to accommodate the local memory buffer.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: add check for overlapped IB window mappings

Add check for attempts to request mapping of inbound RapidIO address
space that overlaps with existing active mapping windows.

Tsi721 device does not support overlapped inbound windows and SRIO
address decoding behavior is not defined in such cases.

This patch is applicable to kernel versions starting from v3.7.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: fix hardcoded MRRS setting

Remove use of hardcoded setting for Maximum Read Request Size (MRRS)
value and use one set by PCIe bus driver.

Using hardcoded value can cause PCIe bus errors on platforms that have
tsi721 device on PCIe path that allows only smaller read request sizes.

This fix is applicable to kernel versions starting from v3.2.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Aurelien Jacquiot <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/rionet: add capability to change MTU

These patches are the result of extensive collaboration within the
RapidIO.org Software Task Group between Texas Instruments, Freescale,
Prodrive Technologies, Nokia Networks, BAE and IDT.  Additional input
was received from other members of RapidIO.org.  The objective was to
create a character mode driver interface which exposes the capabilities
of RapidIO devices directly to applications, in a manner that allows the
numerous and varied RapidIO implementations to interoperate.

The Software Task Group has also developed fabric management, Remote
Memory Access, and sockets applications which make use of these
interfaces in user space.  Intensive testing with these applications
prompted the RapidIO subsystem updates provided within this set of
patches.

This patch (of 29):

Replace default Ethernet-specific routine by the custom one to allow
setting of larger MTU supported by RapidIO messaging (max RIO packet
size is 4096 bytes).

Signed-off-by: Aurelien Jacquiot <[email protected]>
Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Andre van Herk <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/rionet: fix deadlock on SMP

Fix deadlocking during concurrent receive and transmit operations on SMP
platforms caused by the use of incorrect lock: on transmit 'tx_lock'
spinlock should be used instead of 'lock' which is used for receive
operation.

This fix is applicable to kernel versions starting from v2.15.

Signed-off-by: Aurelien Jacquiot <[email protected]>
Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Andre van Herk <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cpumask: remove incorrect information from comment

Since commit cdfdef75e795 ("cpumask: only allocate nr_cpumask_bits."),
this comment above cpumask_size() is no longer relevant.

Signed-off-by: Eric Biggers <[email protected]>
Cc: Rusty Russell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/coredump: prevent fsuid=0 dumps into user-controlled directories

This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:

- The fs.suid_dumpable sysctl is set to 2.
- The kernel.core_pattern sysctl's value starts with "/". (Systems
   where kernel.core_pattern starts with "|/" are not affected.)
- Unprivileged user namespace creation is permitted. (This is
   true on Linux >=3.8, but some distributions disallow it by
   default using a distro patch.)

Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.

To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.

Signed-off-by: Jann Horn <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock

This test-case (simplified version of generated by syzkaller)

#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

void test(void)
{
for (;;) {
if (fork()) {
wait(NULL);
continue;
}

ptrace(PTRACE_SEIZE, getppid(), 0, 0);
ptrace(PTRACE_INTERRUPT, getppid(), 0, 0);
_exit(0);
}
}

int main(void)
{
int np;

for (np = 0; np < 8; ++np)
if (!fork())
test();

while (wait(NULL) > 0)
;
return 0;
}

triggers the 2nd WARN_ON_ONCE(!signr) warning in do_jobctl_trap(). The
problem is that __ptrace_unlink() clears task->jobctl under siglock but
task->ptrace is cleared without this lock held; this fools the "else"
branch which assumes that !PT_SEIZED means PT_PTRACED.

Note also that most of other PTRACE_SEIZE checks can race with detach
from the exiting tracer too. Say, the callers of ptrace_trap_notify()
assume that SEIZED can't go away after it was checked.

Signed-off-by: Oleg Nesterov <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: syzkaller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fat: add config option to set UTF-8 mount option by default

FAT has long supported its own default file name encoding config
setting, separate from CONFIG_NLS_DEFAULT.

However, if UTF-8 encoded file names are desired FAT character set
should not be set to utf8 since this would make file names case
sensitive even if case insensitive matching is requested. Instead,
"utf8" mount options should be provided to enable UTF-8 file names in
FAT file system.

Unfortunately, there was no possibility to set the default value of this
option so on UTF-8 system "utf8" mount option had to be added manually
to most FAT mounts.

This patch adds config option to set such default value.

Signed-off-by: Maciej S. Szmigiero <[email protected]>
Acked-by: OGAWA Hirofumi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

x86/compat: remove is_compat_task()

x86's is_compat_task always checked the current syscall type, not the
task type. It has no non-arch users any more, so just remove it to
avoid confusion.

On x86, nothing should really be checking the task ABI. There are
legitimate users for the syscall ABI and for the mm ABI.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/hid/uhid.c: check write() bitness using in_compat_syscall

uhid changes the format expected in write() depending on bitness. It
should check the syscall bitness directly.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: David Herrmann <[email protected]>
Acked-by: Jiri Kosina <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

input: redefine INPUT_COMPAT_TEST as in_compat_syscall()

The input compat code should work like all other compat code: for 32-bit
syscalls, use the 32-bit ABI and for 64-bit syscalls, use the 64-bit
ABI. We have a helper for that (in_compat_syscall()): just use it.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/gpu/drm/amd/amdkfd: use in_compat_syscall to check open() caller type

amdkfd wants to know syscall type, not task type. Check directly.

Unfortunately, amdkfd is making nasty assumptions that a process'
bitness is a well-defined constant thing. This isn't the case on x86.
I don't know how much this matters, but this patch has no effect on
generated code on x86, so amdkfd is equally broken with and without this
patch.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Oded Gabbay <[email protected]>
Cc: David Airlie <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/firmware/efi/efivars.c: use in_compat_syscall() to check for compat callers

This should make no difference on any architecture, as x86's historical
is_compat_task behavior really did check whether the calling syscall was
a compat syscall. x86's is_compat_task is going away, though.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

firewire: use in_compat_syscall to check ioctl compatness

Firewire was using is_compat_task to check whether it was in a compat
ioctl or a non-compat ioctl. Use is_compat_syscall instead so it works
properly on all architectures.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Clemens Ladisch <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

net/xfrm_user: use in_compat_syscall to deny compat syscalls

The code wants to prevent compat code from receiving messages. Use
in_compat_syscall for this.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Steffen Klassert <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

net/sctp: use in_compat_syscall for sctp_getsockopt_connectx3

SCTP unfortunately has a different ABI for SCTP_SOCKOPT_CONNECTX3 for
32-bit and 64-bit callers. Use in_compat_syscall to correctly
distinguish them on all architectures.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: Neil Horman <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ext4: in ext4_dir_llseek, check syscall bitness directly

ext4 treats directory offsets differently for 32-bit and 64-bit callers.
Check the caller type using in_compat_syscall, not is_compat_task. This
changes behavior on SPARC slightly.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

staging/lustre: switch from is_compat_task to in_compat_syscall

AFAICT, lustre is trying to determine syscall bitness. Use the new
accessor.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Oleg Drokin <[email protected]>
Cc: Andreas Dilger <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

auditsc: for seccomp events, log syscall compat state using in_compat_syscall

Except on SPARC, this is what the code always did. SPARC compat seccomp
was buggy, although the impact of the bug was limited because SPARC
32-bit and 64-bit syscall numbers are the same.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ptrace: in PEEK_SIGINFO, check syscall bitness, not task bitness

Users of the 32-bit ptrace() ABI expect the full 32-bit ABI. siginfo
translation should check ptrace() ABI, not caller task ABI.

This is an ABI change on SPARC. Let's hope that no one relied on the
old buggy ABI.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

seccomp: check in_compat_syscall, not is_compat_task, in strict mode

Seccomp wants to know the syscall bitness, not the caller task bitness,
when it selects the syscall whitelist.

As far as I know, this makes no difference on any architecture, so it's
not a security problem. (It generates identical code everywhere except
sparc, and, on sparc, the syscall numbering is the same for both ABIs.)

Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sparc/syscall: fix syscall_get_arch

Sparc's syscall_get_arch was buggy: it returned the task arch, not the
syscall arch. This could confuse seccomp and audit.

I don't think this is as bad for seccomp as it looks: sparc's 32-bit and
64-bit syscalls are numbered the same.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sparc/compat: provide an accurate in_compat_syscall implementation

On sparc64 compat-enabled kernels, any task can make 32-bit and 64-bit
syscalls. is_compat_task returns true in 32-bit tasks, which does not
necessarily imply that the current syscall is 32-bit.

Provide an in_compat_syscall implementation that checks whether the
current syscall is compat.

As far as I know, sparc is the only architecture on which is_compat_task
checks the compat status of the task and on which the compat status of a
syscall can differ from the compat status of the task. On x86,
is_compat_task checks the syscall type, not the task type.

[[email protected]: add comment, per Sam]
[[email protected]: update comment, per Andy]
Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: David S. Miller <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

compat: add in_compat_syscall to ask whether we're in a compat syscall

A lot of code currently abuses is_compat_task to determine this.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: Clemens Ladisch <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Herrmann <[email protected]>
Cc: David Miller <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Acked-by: Jiri Kosina <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Neil Horman <[email protected]>
Cc: Oded Gabbay <[email protected]>
Cc: Oleg Drokin <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Steffen Klassert <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/hung_task.c: use timeout diff when timeout is updated

When new timeout is written to /proc/sys/kernel/hung_task_timeout_secs,
khungtaskd is interrupted and again sleeps for full timeout duration.

This means that hang task will not be checked if new timeout is written
periodically within old timeout duration and/or checking of hang task
will be delayed for up to previous timeout duration. Fix this by
remembering last time khungtaskd checked hang task.

This change will allow other watchdog tasks (if any) to share khungtaskd
by sleeping for minimal timeout diff of all watchdog tasks. Doing more
watchdog tasks from khungtaskd will reduce the possibility of printk()
collisions by multiple watchdog threads.

Signed-off-by: Tetsuo Handa <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Aaron Tomlin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

zram: revive swap_slot_free_notify

Commit b430e9d1c6d4 ("remove compressed copy from zram in-memory")
applied swap_slot_free_notify call in *end_swap_bio_read* to remove
duplicated memory between zram and memory.

However, with the introduction of rw_page in zram: 8c7f01025f7b ("zram:
implement rw_page operation of zram"), it became void because rw_page
doesn't need bio.

Memory footprint is really important in embedded platforms which have
small memory, for example, 512M) recently because it could start to kill
processes if memory footprint exceeds some threshold by LMK or some
similar memory management modules.

This patch restores the function for rw_page, thereby eliminating this
duplication.

Signed-off-by: Minchan Kim <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: karam.lee <[email protected]>
Cc: <[email protected]>
Cc: Chan Jeong <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: add feature document for online file check

This document will describe OCFS2 online file check feature.  OCFS2 is
often used in high-availaibility systems.  However, OCFS2 usually
converts the filesystem to read-only when encounters an error.  This may
not be necessary, since turning the filesystem read-only would affect
other running processes as well, decreasing availability.

Then, a mount option (errors=continue) is introduced, which would return
the -EIO errno to the calling process and terminate furhter processing
so that the filesystem is not corrupted further.  The filesystem is not
converted to read-only, and the problematic file's inode number is
reported in the kernel log.  The user can try to check/fix this file via
online filecheck feature.

Signed-off-by: Gang He <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: check/fix inode block for online file check

Implement online check or fix inode block during reading a inode block
to memory.

Signed-off-by: Gang He <[email protected]>
Reviewed-by: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: create/remove sysfile for online file check

Create online file check sysfile when ocfs2 mount, remove the related
sysfile when ocfs2 umount.

Signed-off-by: Gang He <[email protected]>
Reviewed-by: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: sysfile interfaces for online file check

Implement online file check sysfile interfaces, e.g. how to create the
related sysfile according to device name, how to display/handle file
check request from the sysfile.

Signed-off-by: Gang He <[email protected]>
Reviewed-by: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: export ocfs2_kset for online file check

When there are errors in the ocfs2 filesystem, they are usually
accompanied by the inode number which caused the error.  This inode
number would be the input to fixing the file.  One of these options
could be considered:

A file in the sys filesytem which would accept inode numbers.  This
could be used to communication back what has to be fixed or is fixed.
You could write:

  $# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/check

or

  $# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/fix

Compare with second version, I re-design filecheck sysfs interfaces,
there are three sysfs files (check, fix and set) under filecheck
directory (see above), sysfs will accept only one argument <inode>.
Second, I adjust some code in ocfs2_filecheck_repair_inode_block()
function according to upstream feedback, we cannot just add VALID_FL
flag back as a inode block fix, then we will not fix this field
corruption currently until having a complete solution.  Compare with
first version, I use strncasecmp instead of double strncmp functions.
Second, update the source file contribution vendor.

This patch (of 4):

Export ocfs2_kset object from ocfs2_stackglue kernel module, then online
file check code will create the related sysfiles under ocfs2_kset
object.  We're exporting this because it's built in ocfs2_stackglue.ko.

Signed-off-by: Gang He <[email protected]>
Reviewed-by: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cpufreq: governor: Always schedule work on the CPU running update

Modify dbs_irq_work() to always schedule the process-context work
on the current CPU which also ran the dbs_update_util_handler()
that the irq_work being handled came from.

This causes the entire frequency update handling (involving the
"ondemand" or "conservative" governors) to be carried out by the
CPU whose frequency is to be updated and reduces the overall amount
of inter-CPU noise related to cpufreq.

Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: Always update current frequency before startig governor

Make policy->cur match the current frequency returned by the driver's
->get() callback before starting the governor in case they went out of
sync in the meantime and drop the piece of code attempting to
resync policy->cur with the real frequency of the boot CPU from
cpufreq_resume() as it serves no purpose any more (and it's racy and
super-ugly anyway).

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

cpufreq: Introduce cpufreq_update_current_freq()

Move the part of cpufreq_update_policy() that obtains the current
frequency from the driver and updates policy->cur if necessary to
a separate function, cpufreq_get_current_freq().

That should not introduce functional changes and subsequent change
set will need it.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

cpufreq: Introduce cpufreq_start_governor()

Starting a governor in cpufreq always follows the same pattern
involving two calls to cpufreq_governor(), one with the event
argument set to CPUFREQ_GOV_START and one with that argument set to
CPUFREQ_GOV_LIMITS.

Introduce cpufreq_start_governor() that will carry out those two
operations and make all places where governors are started use it.

That slightly modifies the behavior of cpufreq_set_policy() which
now also will go back to the old governor if the second call to
cpufreq_governor() (the one with event equal to CPUFREQ_GOV_LIMITS)
fails, but that really is how it should work in the first place.

Also cpufreq_resume() will now pring an error message if the
CPUFREQ_GOV_LIMITS call to cpufreq_governor() fails, but that
makes it follow cpufreq_add_policy_cpu() and cpufreq_offline()
in that respect.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

cpufreq: powernv: Add sysfs attributes to show throttle stats

Create sysfs attributes to export throttle information in
/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats directory. The
newly added sysfs files are as follows:

1)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/turbo_stat
2)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/sub-turbo_stat
3)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/unthrottle
4)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/powercap
5)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overtemp
6)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/supply_fault
7)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overcurrent
8)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/occ_reset

Detailed explanation of each attribute is added to
Documentation/ABI/testing/sysfs-devices-system-cpu

Signed-off-by: Shilpasri G Bhat <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: acpi-cpufreq: make Intel/AMD MSR access, io port access static

These frequency register read/write operations' implementations for the
given processor (Intel/AMD MSR access or I/O port access) are only used
internally in acpi-cpufreq, so make them static.

Signed-off-by: Jisheng Zhang <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

PCI: ACPI: IA64: fix IO port generic range check

The [0 - 64k] ACPI PCI IO port resource boundary check in:

acpi_dev_ioresource_flags()

is currently applied blindly in the ACPI resource parsing to all
architectures, but only x86 suffers from that IO space limitation.

On arches (ie IA64 and ARM64) where IO space is memory mapped,
the PCI root bridges IO resource windows are firstly initialized from
the _CRS (in acpi_decode_space()) and contain the CPU physical address
at which a root bridge decodes IO space in the CPU physical address
space with the offset value representing the offset required to translate
the PCI bus address into the CPU physical address.

The IO resource windows are then parsed and updated in arch code
before creating and enumerating PCI buses (eg IA64 add_io_space())
to map in an arch specific way the obtained CPU physical address range
to a slice of virtual address space reserved to map PCI IO space,
ending up with PCI bridges resource windows containing IO
resources like the following on a working IA64 configuration:

PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x1000000-0x100ffff window] (bus
address [0x0000-0xffff])
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window]
pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window]
pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window]
pci_bus 0000:00: root bus resource [bus 00]

This implies that the [0 - 64K] check in acpi_dev_ioresource_flags()
leaves platforms with memory mapped IO space (ie IA64) broken (ie kernel
can't claim IO resources since the host bridge IO resource is disabled
and discarded by ACPI core code, see log on IA64 with missing root bridge
IO resource, silently filtered by current [0 - 64k] check in
acpi_dev_ioresource_flags()):

PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window]
pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window]
pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window]
pci_bus 0000:00: root bus resource [bus 00]

[...]

pci 0000:00:03.0: [1002:515e] type 00 class 0x030000
pci 0000:00:03.0: reg 0x10: [mem 0x80000000-0x87ffffff pref]
pci 0000:00:03.0: reg 0x14: [io  0x1000-0x10ff]
pci 0000:00:03.0: reg 0x18: [mem 0x88020000-0x8802ffff]
pci 0000:00:03.0: reg 0x30: [mem 0x88000000-0x8801ffff pref]
pci 0000:00:03.0: supports D1 D2
pci 0000:00:03.0: can't claim BAR 1 [io  0x1000-0x10ff]: no compatible
bridge window

For this reason, the IO port resources boundaries check in generic ACPI
parsing code should be guarded with a CONFIG_X86 guard so that more arches
(ie ARM64) can benefit from the generic ACPI resources parsing interface
without incurring in unexpected resource filtering, fixing at the same
time current breakage on IA64.

This patch factors out IO ports boundary [0 - 64k] check in generic ACPI
code and makes the IO space check X86 specific to make sure that IO
space resources are usable on other arches too.

Fixes: 3772aea7d6f3 (ia64/PCI/ACPI: Use common ACPI resource parsing interface for host bridge)
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Cc: 4.4+ <[email protected]> # 4.4+
Signed-off-by: Rafael J. Wysocki <[email protected]>

tracing: Record and show NMI state

The latency tracer format has a nice column to indicate IRQ state, but
this is not able to tell us about NMI state.

When tracing perf interrupt handlers (which often run in NMI context)
it is very useful to see how the events nest.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>

tracing: Fix trace_printk() to print when not using bprintk()

The trace_printk() code will allocate extra buffers if the compile detects
that a trace_printk() is used. To do this, the format of the trace_printk()
is saved to the __trace_printk_fmt section, and if that section is bigger
than zero, the buffers are allocated (along with a message that this has
happened).

If trace_printk() uses a format that is not a constant, and thus something
not guaranteed to be around when the print happens, the compiler optimizes
the fmt out, as it is not used, and the __trace_printk_fmt section is not
filled. This means the kernel will not allocate the special buffers needed
for the trace_printk() and the trace_printk() will not write anything to the
tracing buffer.

Adding a "__used" to the variable in the __trace_printk_fmt section will
keep it around, even though it is set to NULL. This will keep the string
from being printed in the debugfs/tracing/printk_formats section as it is
not needed.

Reported-by: Vlastimil Babka <[email protected]>
Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()"
Cc: [email protected] # v3.5+
Signed-off-by: Steven Rostedt <[email protected]>

Merge branch 'AF_VSOCK-missed-wakeups'

Claudio Imbrenda says:

====================
AF_VSOCK: Shrink the area influenced by prepare_to_wait

This patchset applies on net-next.

I think I found a problem with the patch submitted by Laura Abbott
( https://lkml.org/lkml/2016/2/4/711 ): we might miss wakeups.
Since the condition is not checked between the prepare_to_wait and the
schedule(), if a wakeup happens after the condition is checked but before
the sleep happens, and we miss it. ( A description of the problem can be
found here: http://www.makelinux.net/ldd3/chp-6-sect-2 ).

The first patch reverts the previous broken patch, while the second patch
properly fixes the sleep-while-waiting issue.
====================

Signed-off-by: David S. Miller <[email protected]>

AF_VSOCK: Shrink the area influenced by prepare_to_wait

When a thread is prepared for waiting by calling prepare_to_wait, sleeping
is not allowed until either the wait has taken place or finish_wait has
been called. The existing code in af_vsock imposed unnecessary no-sleep
assumptions to a broad list of backend functions.
This patch shrinks the influence of prepare_to_wait to the area where it
is strictly needed, therefore relaxing the no-sleep restriction there.

Signed-off-by: Claudio Imbrenda <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Revert "vsock: Fix blocking ops call in prepare_to_wait"

This reverts commit 5988818008257ca42010d6b43a3e0e48afec9898 ("vsock: Fix
blocking ops call in prepare_to_wait")

The commit reverted with this patch caused us to potentially miss wakeups.
Since the condition is not checked between the prepare_to_wait and the
schedule(), if a wakeup happens after the condition is checked but before
the sleep happens, we will miss it. ( A description of the problem can be
found here: http://www.makelinux.net/ldd3/chp-6-sect-2 ).

By reverting the patch, the behaviour is still incorrect (since we
shouldn't sleep between the prepare_to_wait and the schedule) but at least
it will not miss wakeups.

The next patch in the series actually fixes the behaviour.

Signed-off-by: Claudio Imbrenda <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Highlights include:

  Features:
   - Add support for multiple NFSv4.1 callbacks in flight
   - Initial patchset for RPC multipath support
   - Adapt RPC/RDMA to use the new completion queue API

  Bugfixes and cleanups:
   - nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
   - Cleanups to remove nfs_inode_dio_wait and nfs4_file_fsync
   - Fix RPC/RDMA credit accounting
   - Properly handle RDMA_ERROR replies
   - xprtrdma: Do not wait if ib_post_send() fails
   - xprtrdma: Segment head and tail XDR buffers on page boundaries
   - xprtrdma cleanups for dprintk, physical_op_map and unused macros"

* tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (35 commits)
  nfs/blocklayout: make sure making a aligned read request
  nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
  nfs: remove nfs_inode_dio_wait
  nfs: remove nfs4_file_fsync
  xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
  xprtrdma: Use an anonymous union in struct rpcrdma_mw
  xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
  xprtrdma: Serialize credit accounting again
  xprtrdma: Properly handle RDMA_ERROR replies
  rpcrdma: Add RPCRDMA_HDRLEN_ERR
  xprtrdma: Do not wait if ib_post_send() fails
  xprtrdma: Segment head and tail XDR buffers on page boundaries
  xprtrdma: Clean up dprintk format string containing a newline
  xprtrdma: Clean up physical_op_map()
  xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro
  NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback
  pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3
  SUNRPC: Allow addition of new transports to a struct rpc_clnt
  NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections
  SUNRPC: Make NFS swap work with multipath
  ...

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
"Various fixes and tweaks"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: cleanup unused var in rename2
  ovl: rename is_merge to is_lowest
  ovl: fixed coding style warning
  ovl: Ensure upper filesystem supports d_type
  ovl: Warn on copy up if a process has a R/O fd open to the lower file
  ovl: honor flag MS_SILENT at mount
  ovl: verify upper dentry before unlink and rename

macb: fix PHY reset

The driver calls gpiod_set_value() with GPIOD_OUT_* instead of 0 and 1, as
a result the PHY isn't really put back into reset state in macb_remove().
Moreover, the driver assumes that something else has set the GPIO direction
to output, so if it has not, the PHY may not be taken out of reset in
macb_probe() either...

Signed-off-by: Sergei Shtylyov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse update from Miklos Szeredi:
"This contains direct I/O fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: return patrial success from fuse_direct_io()
  fuse: Add reference counting for fuse_io_priv
  fuse: do not use iocb after it may have been freed

drm/radeon/mst: cleanup code indentation

This was all sorts of ugly from when I hacked it up,
just clean it up now and remove the extra indents.

Signed-off-by: Dave Airlie <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/mst: fix regression in lane/link handling.

The function this used changed in
092c96a8ab9d1bd60ada2ed385cc364ce084180e
drm/radeon: fix dp link rate selection (v2)

However for MST we should just always train to the
max link/rate. Though we probably need to limit this
for future hw, in theory radeon won't support it.

This fixes my 30" monitor with MST enabled.

Cc: [email protected] # v4.4
Signed-off-by: Dave Airlie <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

ipv4: initialize flowi4_flags before calling fib_lookup()

Field fl4.flowi4_flags is not initialized in fib_compute_spec_dst()
before calling fib_lookup(), which means fib_table_lookup() is
using non-deterministic data at this line:

if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {

Fix by initializing the entire fl4 structure, which will prevent
similar issues as fields are added in the future by ensuring that
all fields are initialized to zero unless explicitly initialized
to another value.

Fixes: 58189ca7b2741 ("net: Fix vti use case with oif in dst lookups")
Suggested-by: David Ahern <[email protected]>
Signed-off-by: Lance Richardson <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

fsl/fman: Workaround for Errata A-007273

Errata A-007273 (For FMan V3 devices only):
FMan soft reset is not finished properly if one
of the Ethernet MAC clocks is disabled

Workaround:
Re-enable all disabled MAC clocks through the DCFG_CCSR_DEVDISR2
register prior to issuing an FMAN soft reset.
Re-disable the MAC clocks after the FMAN soft reset is done.

Signed-off-by: Igal Liberman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
"Features and fixes for 4.6:

  - Make earlyprintk=xen work for HVM guests

  - Remove module support for things never built as modules"

* tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  drivers/xen: make platform-pci.c explicitly non-modular
  drivers/xen: make sys-hypervisor.c explicitly non-modular
  drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular
  drivers/xen: make [xen-]ballon explicitly non-modular
  xen: audit usages of module.h ; remove unnecessary instances
  xen/x86: Drop mode-selecting ifdefs in startup_xen()
  xen/x86: Zero out .bss for PV guests
  hvc_xen: make early_printk work with HVM guests
  hvc_xen: fix xenboot for DomUs
  hvc_xen: add earlycon support

ipv4: fix broadcast packets reception

Currently, ingress ipv4 broadcast datagrams are dropped since,
in udp_v4_early_demux(), ip_check_mc_rcu() is invoked even on
bcast packets.

This patch addresses the issue, invoking ip_check_mc_rcu()
only for mcast packets.

Fixes: 6e5403093261 ("ipv4/udp: Verify multicast group is ours in upd_v4_early_demux()")
Signed-off-by: Paolo Abeni <[email protected]>
Acked-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
"Mostly usual driver updates and improvements.  The changelog should
  give an idea.  Standing out is the i2c-qup driver with lots of new
  capabilities and we also have now an i2c-demuxer.

  I'd especially like to welcome Peter Rosin as the i2c-mux maintainer.
  He has an interesting series for muxes in the queue and agreed to look
  after this part of the subsystem.  Thank you, Peter, and welcome
  again!

  The octeon changes were applied pretty recently before the merge
  window.  I am aware.  They are the first (and relatively simple)
  patches of a larger overhaul to this driver.  In case something goes
  wrong with them, they are easy to fix (or revert).  The advantage I
  see is that they are out of the way, and I can concentrate on the next
  block of patches.  I really would like to apply the overhaul in
  smaller batches to avoid regressions.  And waiting a cycle for the
  introductory patches seemed too much of a delay for me"

* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (39 commits)
  i2c: octeon: Support I2C_M_RECV_LEN
  i2c: octeon: Cleanup resource allocation code
  i2c: octeon: Cleanup i2c-octeon driver
  MAINTAINERS: add Peter Rosin as i2c mux maintainer
  dt-bindings: i2c: Spelling s/propoerty/property/
  i2c: immediately mark ourselves as registered
  i2c: i801: sort IDs alphabetically
  MAINTAINERS: Mika and me are designated reviewers for I2C DESIGNWARE
  i2c: octeon: Cleanup kerneldoc comments
  i2c: do not use internal data from driver core
  i2c: cadence: Fix the kernel-doc warnings
  i2c: imx: remove extra spaces.
  i2c: rcar: don't open code of_device_get_match_data()
  i2c: qup: Fix fifo handling after adding V2 support
  i2c: xiic: Implement power management
  i2c: piix4: Pre-shift the port number
  i2c: piix4: Always use the same type for port
  i2c: piix4: Support alternative port selection register
  i2c: tegra: don't open code of_device_get_match_data()
  i2c: riic, sh_mobile, rcar: Use ARCH_RENESAS
  ...

Merge branch 'hns-fixes'

Yisen Zhuang says:

====================
net: hns: bugs fixed for hns

This series includes some bug fixes and updates for hns driver.

>from Daode, one fix about mss.

>from Kejian, one fix about ping6 issue, one fix about mac address setting,
two fix for RSS setting, two fix about mtu setting.

>from qianqian, fixed HNS v2 xge statistic reg issue.

>from Sheng, one fix about manage packets sending, one fix about GMACs mac
setting.

For more details, please see individual patches.

Thanks a lot!

---
change log:
Series V2:
  - fix the comments as below:
    1) modifies the wrong charator "whick" to "which" in commit log
    2) use the "eth_hdr()" help to get source mac of packets
    3) fix the wrong cast
    4) use tabs instead of spaces to indent the value

Series V1:
  - first submit
====================

Signed-off-by: David S. Miller <[email protected]>

net: hns: bug fix about the overflow of mss

When set MTU to the minimum value 68, there are increasing number
of error packets occur, which is caused by the overflowed value of
mss. This patch fix the bug.

Signed-off-by: Daode Huang <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: adds limitation for debug port mtu

If mtu for debug port is set more than 1500, it may cause that packets
are dropped by ppe. So maximum value for debug port should be 1500.

Signed-off-by: Kejian Yan <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: fix the bug about mtu setting

In chip V1, the maximum mtu value is 9600. But in chip V2, it is 9728.
And it is always configurates as 9600 before this patch.

Signed-off-by: Kejian Yan <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: fixes a bug of RSS

If trying to get receive flow hash indirection table by ethtool, it needs
to call .get_rxnfc to get ring number first. So this patch implements the
.get_rxnfc of ethtool. And the data type of rss_indir_table is u32, it has
to be multiply by the width of data type when using memcpy.

Signed-off-by: Kejian Yan <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: fix return value of the function about rss

Both .get_rxfh and .set_rxfh are always return 0, it should return result
from hardware when getting or setting rss. And the rss function should
return the correct data type.

Signed-off-by: Kejian Yan <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: set xge statistic reg as read only

As the user manual of HNS V2 describs, XGE_DFX_CTRL_CFG.xge_dfx_ctrl_cfg
should be configed as zero if we want xge statistic reg to be read only.
But HNS V1 gets the other meanings. It needs to be identified the process
and then config it rightly.

Signed-off-by: Qianqian Xie <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: fixed the bug about GMACs mac setting

When sending a pause frame out from GMACs, the packets' source MAC address
does not match the GMACs' MAC address. It causes by the condition before
the mac address setting routine for GMACs, the mac address cannot be set
into loacal mac table for service ports. It obviously the condition needs
to be deleted.

Signed-off-by: Sheng Li <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: add uc match for debug ports

Debug ports receives lots of packets with dest mac addr does not match
local mac addr, because the filter is close, and it does not drop the
useless packets. This patch adds ON/OFF switch of filtering the packets
whose dest mac addr do not match the local addr in mac table. And the
switch is ON in initialization.

Signed-off-by: Kejian Yan <[email protected]>
Signed-off-by: Peng Li <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: fixed portid bug in sending manage pkt

In chip V2, the default value of port id in tx BD is Zero. If it is not
configurated to the other value, all management packets will be sent out
from port0. So port_id in the tx BD needs to be updated when sending a
management packet.

In V2 chip, when sending mamagement packets, the driver should
config the port id to BD descs.

Signed-off-by: Sheng Li <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: hns: bug fix about ping6

The current upstreaming code fails to ping other IPv6 net device, because
the enet receives the multicast packets with the src mac addr which is the
same as its mac addr. These packets need to be dropped.

Signed-off-by: Kejian Yan <[email protected]>
Signed-off-by: Yisen Zhuang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6: remove unused in6_addr struct

struct in6_addr isn't used anymore in inet6_connection_sock.h, removing
the forward declaration.

Fixes: 1b33bc3e9e90 ("ipv6: remove obsolete inet6 functions")
Signed-off-by: Luis de Bethencourt <[email protected]>
Signed-off-by: David S. Miller <[email protected]>