Git Repo - linux.git/log

vfs: remove might_sleep() from clear_inode()

Commit 7994e6f72543 ("vfs: Move waiting for inode writeback from
end_writeback() to evict_inode()") removed inode_sync_wait() from
end_writeback() and commit dbd5768f87ff ("vfs: Rename end_writeback() to
clear_inode()") renamed end_writeback() to clear_inode().

After these patches there is no sleeping operation in clear_inode().
So, remove might_sleep() from it.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Shakeel Butt <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

arch/score/kernel/setup.c: combine two seq_printf() calls into one call in show_cpuinfo()

Some data were printed into a sequence by two separate function calls.
Print the same data by a single function call instead.

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Cc: Chen Liqin <[email protected]>
Cc: Lennox Wu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ipc/mqueue.c: have RT tasks queue in by priority in wq_add()

Previous behavior added tasks to the work queue using the static_prio
value instead of the dynamic priority value in prio.  This caused RT tasks
to be added to the work queue in a FIFO manner rather than by priority.
Normal tasks were handled by priority.

This fix utilizes the dynamic priority of the task to ensure that both RT
and normal tasks are added to the work queue in priority order.  Utilizing
the dynamic priority (prio) rather than the base priority (normal_prio)
was chosen to ensure that if a task had a boosted priority when it was
added to the work queue, it would be woken sooner to to ensure that it
releases any other locks it may be holding in a more timely manner.  It is
understood that the task could have a lower priority when it wakes than
when it was added to the queue in this (unlikely) case.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jonathan Haws <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Deepa Dinamani <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ipc: fix ipc data structures inconsistency

As described in the title, this patch fixes <ipc>id_ds inconsistency when
<ipc>ctl_stat executes concurrently with some ds-changing function, e.g.
shmat, msgsnd or whatever.

For instance, if shmctl(IPC_STAT) is running concurrently
with shmat, following data structure can be returned:
{... shm_lpid = 0, shm_nattch = 1, ...}

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philippe Mikoyan <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/ubsan: remove returns-nonnull-attribute checks

Similarly to type mismatch checks, new GCC 8.x and Clang also changed for
ABI for returns_nonnull checks. While we can update our code to conform
the new ABI it's more reasonable to just remove it. Because it's just
dead code, we don't have any single user of returns_nonnull attribute in
the whole kernel.

And AFAIU the advantage that this attribute could bring would be mitigated
by -fno-delete-null-pointer-checks cflag that we use to build the kernel.
So it's unlikely we will have a lot of returns_nonnull attribute in
future.

So let's just remove the code, it has no use.

[[email protected]: fix warning]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrey Ryabinin <[email protected]>
Cc: Sodagudi Prasad <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/ubsan: add type mismatch handler for new GCC/Clang

UBSAN=y fails to build with new GCC/clang:

    arch/x86/kernel/head64.o: In function `sanitize_boot_params':
    arch/x86/include/asm/bootparam_utils.h:37: undefined reference to `__ubsan_handle_type_mismatch_v1'

because Clang and GCC 8 slightly changed ABI for 'type mismatch' errors.
Compiler now uses new __ubsan_handle_type_mismatch_v1() function with
slightly modified 'struct type_mismatch_data'.

Let's add new 'struct type_mismatch_data_common' which is independent from
compiler's layout of 'struct type_mismatch_data'.  And make
__ubsan_handle_type_mismatch[_v1]() functions transform compiler-dependent
type mismatch data to our internal representation.  This way, we can
support both old and new compilers with minimal amount of change.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrey Ryabinin <[email protected]>
Reported-by: Sodagudi Prasad <[email protected]>
Cc: <[email protected]> [4.5+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/ubsan.c: s/missaligned/misaligned/

A vist from the spelling fairy.

Cc: David Laight <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

build_bug.h: remove BUILD_BUG_ON_NULL()

This macro is only used by net/ipv6/mcast.c, but there is no reason
why it must be BUILD_BUG_ON_NULL().

Replace it with BUILD_BUG_ON_ZERO(), and remove BUILD_BUG_ON_NULL()
definition from <linux/build_bug.h>.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Masahiro Yamada <[email protected]>
Cc: Ian Abbott <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/genl_magic_func.h: remove own BUILD_BUG_ON*() defines

Do not duplicate BUILD_BUG_ON*. Use ones from <linux/build_bug.h>.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Masahiro Yamada <[email protected]>
Cc: Ian Abbott <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: "David S. Miller" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kcov: detect double association with a single task

Currently KCOV_ENABLE does not check if the current task is already
associated with another kcov descriptor. As the result it is possible
to associate a single task with more than one kcov descriptor, which
later leads to a memory leak of the old descriptor. This relation is
really meant to be one-to-one (task has only one back link).

Extend validation to detect such misuse.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 5c9a8750a640 ("kernel: add kcov code coverage")
Signed-off-by: Dmitry Vyukov <[email protected]>
Reported-by: Shankara Pailoor <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: syzbot <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/relay.c: revert "kernel/relay.c: fix potential memory leak"

This reverts commit ba62bafe942b ("kernel/relay.c: fix potential memory leak").

This commit introduced a double free bug, because 'chan' is already
freed by the line:

kref_put(&chan->kref, relay_destroy_channel);

This bug was found by syzkaller, using the BLKTRACESETUP ioctl.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: ba62bafe942b ("kernel/relay.c: fix potential memory leak")
Signed-off-by: Eric Biggers <[email protected]>
Reported-by: syzbot <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Zhouyi Zhou <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: <[email protected]> [4.7+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

pps: parport: use timespec64 instead of timespec

getnstimeofday() is deprecated, so I'm converting this to use
ktime_get_real_ts64() as a safe replacement.  I considered using
ktime_get_real() instead, but since the algorithm here depends on the
exact timing, I decided to introduce fewer changes and leave the code
that determines the nanoseconds since the last seconds wrap untouched.

It's not entirely clear to me whether we should also change the time
base to CLOCK_BOOTTIME or CLOCK_TAI.  With boottime, we would be
independent of changes due to settimeofday() and only see the speed
adjustment from the upstream clock source, with the downside of having
the signal be at an arbirary offset from the start of the UTC second
signal.  With CLOCK_TAI, we would use the same offset from the UTC
second as before and still suffer from settimeofday() adjustments, but
would be less confused during leap seconds.

Both boottime and tai only offer usable (i.e.  avoiding ktime_t to
timespec64 conversion) interfaces for ktime_t though, so either way,
changing it wouldn't take significantly more work.  CLOCK_MONOTONIC
could be used with ktime_get_ts64(), but would lose synchronization
across a suspend/resume cycle, which seems worse.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Rodolfo Giometti <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

pids: introduce find_get_task_by_vpid() helper

There are several functions that do find_task_by_vpid() followed by
get_task_struct(). We can use a helper function instead.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/rapidio/devices/tsi721_dma.c: adjust six checks for null pointers

checkpatch pointed out the following:

Comparison to NULL could be written !...

Thus fix the affected source code places.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/rapidio/devices/tsi721_dma.c: delete an unnecessary variable initialisation in tsi721_alloc_chan_resources()

The local variable "desc" will eventually be set to an appropriate pointer
a bit later. Thus omit the explicit initialisation at the beginning.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/rapidio/devices/tsi721_dma.c: delete an error message for a failed memory allocation in tsi721_alloc_chan_resources()

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: move 12 EXPORT_SYMBOL_GPL() calls to function implementations

checkpatch pointed information out like the following.

WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable

Thus fix the affected source code places.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: return an error code only as a constant in two functions

* Return an error code without storing it in an intermediate variable.

* Delete the label "out" and local variable "rc" which became unnecessary
with this refactoring.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: delete an unnecessary variable initialisation in three functions

The local variable "rc" will be set to an appropriate value a bit later.
Thus omit the explicit initialisation at the beginning.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: improve a size determination in five functions

Replace the specification of data structures by pointer dereferences as
the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style
convention.

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: adjust five function calls together with a variable assignment

checkpatch pointed information out like the following.

ERROR: do not use assignment in if condition

Thus fix the affected source code places.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: adjust 12 checks for null pointers

checkpatch pointed information out like the following.

Comparison to NULL could be written ...

Thus fix the affected source code places.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: delete an error message for a failed memory allocation in rio_init_mports()

Patch series "RapidIO: Adjustments for some function implementations".

This patch (of 7):

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cpumask: make cpumask_size() return "unsigned int"

CPUmasks are never big enough to warrant 64-bit code.

Space savings:

add/remove: 0/0 grow/shrink: 1/4 up/down: 3/-17 (-14)
Function                                     old     new   delta
sched_init_numa                             1530    1533      +3
compat_sys_sched_setaffinity                 160     159      -1
sys_sched_getaffinity                        197     195      -2
sys_sched_setaffinity                        183     176      -7
compat_sys_sched_getaffinity                 179     172      -7

Link: http://lkml.kernel.org/r/20171204165531.GA8221@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/fork.c: add comment about usage of CLONE_FS flags and namespaces

All other places that deals with namespaces have an explanation of why
the restriction is there.

The description added in this commit was based on commit e66eded8309e
("userns: Don't allow CLONE_NEWUSER | CLONE_FS").

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Marcos Paulo de Souza <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/fork.c: check error and return early

Thus reducing one indentation level while maintaining the same rationale.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Marcos Paulo de Souza <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

<asm-generic/siginfo.h>: fix language in comments

Fix grammar and add an omitted word.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: f9886bc50a8e ("signal: Document the strange si_codes used by ptrace event stops")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hfsplus: honor setgid flag on directories

When creating a file inside a directory that has the setgid flag set, give
the new file the group ID of the parent, and also the setgid flag if it is
a directory itself.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ernesto A. Fernandez <[email protected]>
Reviewed-by: Vyacheslav Dubeyko <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

nilfs2: use time64_t internally

The superblock and segment timestamps are used only internally in nilfs2
and can be read out using sysfs.

Since we are using the old 'get_seconds()' interface and store the data
as timestamps, the behavior differs slightly between 64-bit and 32-bit
kernels, the latter will show incorrect timestamps after 2038 in sysfs,
and presumably fail completely in 2106 as comparisons go wrong.

This changes nilfs2 to use time64_t with ktime_get_real_seconds() to
handle timestamps, making the behavior consistent and correct on both
32-bit and 64-bit machines.

The on-disk format already uses 64-bit timestamps, so nothing changes
there.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Ryusuke Konishi <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jan Kara <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kallsyms: let print_ip_sym() print raw addresses

print_ip_sym() is mostly used for debugging, so I think it should print
the raw addresses.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Huacai Chen <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Fuxin Zhang <[email protected]>
Cc: "Tobin C. Harding" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

elf: fix NT_FILE integer overflow

If vm.max_map_count bumped above 2^26 (67+ mil) and system has enough RAM
to allocate all the VMAs (~12.8 GB on Fedora 27 with 200-byte VMAs), then
it should be possible to overflow 32-bit "size", pass paranoia check,
allocate very little vmalloc space and oops while writing into vmalloc
guard page...

But I didn't test this, only coredump of regular process.

Link: http://lkml.kernel.org/r/20180112203427.GA9109@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: improve OPEN_BRACE test

Some structure definitions that use macros trip the OPEN_BRACE test.

e.g. +struct bpf_map_def SEC("maps") control_map = {

Improve the test by using $balanced_parens instead of a .*

Miscellanea:

o Use $sline so any comments are ignored
o Correct the message output from declaration to definition
o Remove unnecessary parentheses

Link: http://lkml.kernel.org/r/db9b772999d1d2fbda3b9ee24bbca81a87837e13.1517543491.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Reported-by: Song Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: avoid some false positives for TABSTOP declaration test

Using an open bracket after what seems to be a declaration can also be a
function definition and declaration argument line continuation so remove
the open bracket from the possible declaration/definition matching.

e.g.:
int foobar(int a;
int *b[]);

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Joe Perches <[email protected]>
Reported-by: Sven Eckelmann <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: exclude drivers/staging from if with unnecessary parentheses test

Greg KH doesn't like this test so exclude the staging directory from the
implied --strict only test unless --strict is actually used on the
command-line.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Joe Perches <[email protected]>
Cc: Greg KH <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: improve the TABSTOP test to include declarations

Declarations should start on a tabstop too.

Link: http://lkml.kernel.org/r/1b5f97673f36595956ad43329f77bf1a5546d2ff.1513976662.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: add a few DEVICE_ATTR style tests

DEVICE_ATTR is a declaration macro that has a few alternate and
preferred forms like DEVICE_ATTR_RW, DEVICE_ATTR_RO, and DEVICE_ATTR.

As well, many uses of DEVICE_ATTR could use the preferred forms when the
show or store functions are also named in a regular form.

Suggest the preferred forms when appropriate.

Also emit a permissions warning if the the permissions are not the
typical 0644, 0444, or 0200.

Link: http://lkml.kernel.org/r/725864f363d91d1e1e6894a39fb57662eabd6d65.1513803306.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: improve quoted string and line continuation test

Given this patch context,

+#define EFI_ST_DISK_IMG { \
+ 0x00000240, "\xbe\x5b\x7c\xac\x22\xc0\x74\x0b" /* .[|.".t. */ \
+ }

the current code misreports a quoted string line continuation defect as
there is a single quote in comment.

The 'raw' line should not be tested for quote count, the comment
substituted line should be instead.

Link: http://lkml.kernel.org/r/13f2735df10c33ca846e26f42f5cce6618157200.1513698599.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Reported-by: Heinrich Schuchardt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: ignore some octal permissions of 0

module_param and create_proc uses with a permissions use of a single 0 are
"special" and should not emit any warning.

module_param uses with permission 0 are not visible in sysfs

create_proc uses with permission 0 use a default permission

Link: http://lkml.kernel.org/r/b6583611bb529ea6f6d43786827fddbabbab0a71.1513190059.git.joe@perches.com
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: allow long lines containing URL

Allow lines with URL to exceed the 80 char limit for improved interaction
in adaption to ongoing but undocumented practice.

$ git grep -E '://\S{77}.*' -- '*.[ch]'

As per RFC3986 [1], the URL format allows for alphanum, +, - and .
characters in the scheme before the separator :// as long as it starts
with a letter (e.g. https, git, f.-+).

Recognition of URIs without more context information is prone to false
positives and thus currently left out of the heuristics.

$rawline is used in the check as comments are removed from $line.

[1] https://tools.ietf.org/html/rfc3986#section-3.1

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andreas Brauchli <[email protected]>
Acked-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/test_sort.c: add module unload support

test_sort.c performs array-based and linked list sort test. Code allows
to compile either as a loadable modules or builtin into the kernel.

Current code is not allow to unload the test_sort.ko module after
successful completion.

This patch adds support to unload the "test_sort.ko" module by adding
module_exit support.

Previous patch was implemented auto unload support by returning -EAGAIN
from module_init() function on successful case, but this approach is not
ideal.

The auto-unload might seem like a nice optimization, but it encourages
inconsistent behaviour. And behaviour that is different from all other
normal modules.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Pravin Shedge <[email protected]>
Cc: Kostenzer Felix <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/: make RUNTIME_TESTS a menuconfig to ease disabling it all

No need to get into the submenu to disable all related config entries.

This makes it easier to disable all RUNTIME_TESTS config options without
entering the submenu. It will also enable one to see that en/dis-abled
state from the outside menu.

This is only intended to change menuconfig UI, not change the config
dependencies.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vincent Legoll <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Byungchul Park <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: "Luis R. Rodriguez" <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib: optimize cpumask_next_and()

We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
It's essentially a joined iteration in search for a non-zero bit, which is
currently implemented as a lookup join (find a nonzero bit on the lhs,
lookup the rhs to see if it's set there).

Implement a direct join (find a nonzero bit on the incrementally built
join).  Also add generic bitmap benchmarks in the new `test_find_bit`
module for new function (see `find_next_and_bit` in [2] and [3] below).

For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
faster with a geometric mean of 2.1 on 32 CPUs [1].  No impact on memory
usage.  Note that on Arm, the new pure-C implementation still outperforms
the old one that uses a mix of C and asm (`find_next_bit`) [3].

[1] Approximate benchmark code:

```
  unsigned long src1p[nr_cpumask_longs] = {pattern1};
  unsigned long src2p[nr_cpumask_longs] = {pattern2};
  for (/*a bunch of repetitions*/) {
    for (int n = -1; n <= nr_cpu_ids; ++n) {
      asm volatile("" : "+rm"(src1p)); // prevent any optimization
      asm volatile("" : "+rm"(src2p));
      unsigned long result = cpumask_next_and(n, src1p, src2p);
      asm volatile("" : "+rm"(result));
    }
  }
```

Results:
pattern1    pattern2     time_before/time_after
0x0000ffff  0x0000ffff   1.65
0x0000ffff  0x00005555   2.24
0x0000ffff  0x00001111   2.94
0x0000ffff  0x00000000   14.0
0x00005555  0x0000ffff   1.67
0x00005555  0x00005555   1.71
0x00005555  0x00001111   1.90
0x00005555  0x00000000   6.58
0x00001111  0x0000ffff   1.46
0x00001111  0x00005555   1.49
0x00001111  0x00001111   1.45
0x00001111  0x00000000   3.10
0x00000000  0x0000ffff   1.18
0x00000000  0x00005555   1.18
0x00000000  0x00001111   1.17
0x00000000  0x00000000   1.25
-----------------------------
               geo.mean  2.06

[2] test_find_next_bit, X86 (skylake)

[ 3913.477422] Start testing find_bit() with random-filled bitmap
[ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations
[ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations
[ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations
[ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations
[ 3913.480216] Start testing find_next_and_bit() with random-filled
bitmap
[ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations
[ 3913.481075] Start testing find_bit() with sparse bitmap
[ 3913.481078] find_next_bit: 2536 cycles, 66 iterations
[ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations
[ 3913.481255] find_last_bit: 2006 cycles, 66 iterations
[ 3913.481265] find_first_bit: 17488 cycles, 66 iterations
[ 3913.481266] Start testing find_next_and_bit() with sparse bitmap
[ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations

[3] test_find_next_bit, arm (v7 odroid XU3).

[  267.206928] Start testing find_bit() with random-filled bitmap
[  267.214752] find_next_bit: 4474 cycles, 16419 iterations
[  267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations
[  267.229294] find_last_bit: 4209 cycles, 16419 iterations
[  267.279131] find_first_bit: 1032991 cycles, 16420 iterations
[  267.286265] Start testing find_next_and_bit() with random-filled
bitmap
[  267.302386] find_next_and_bit: 2290 cycles, 8140 iterations
[  267.309422] Start testing find_bit() with sparse bitmap
[  267.316054] find_next_bit: 191 cycles, 66 iterations
[  267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations
[  267.329803] find_last_bit: 84 cycles, 66 iterations
[  267.336169] find_first_bit: 4118 cycles, 66 iterations
[  267.342627] Start testing find_next_and_bit() with sparse bitmap
[  267.356919] find_next_and_bit: 91 cycles, 1 iterations

[[email protected]: v6]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: m68k/bitops: always include <asm-generic/bitops/find.h>]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Clement Courbet <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Cc: Yury Norov <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/find_bit_benchmark.c: improvements

As suggested in review comments:
* printk: align numbers using whitespaces instead of tabs;
* return error value from init() to avoid calling rmmod if testing again;
* use ktime_get instead of get_cycles as some arches don't support it;

The output in dmesg (on QEMU arm64):
[   38.823430] Start testing find_bit() with random-filled bitmap
[   38.845358] find_next_bit:                20138448 ns, 163968 iterations
[   38.856217] find_next_zero_bit:           10615328 ns, 163713 iterations
[   38.863564] find_last_bit:                 7111888 ns, 163967 iterations
[   40.944796] find_first_bit:             2081007216 ns, 163968 iterations
[   40.944975]
[   40.944975] Start testing find_bit() with sparse bitmap
[   40.945268] find_next_bit:                   73216 ns,    656 iterations
[   40.967858] find_next_zero_bit:           22461008 ns, 327025 iterations
[   40.968047] find_last_bit:                   62320 ns,    656 iterations
[   40.978060] find_first_bit:                9889360 ns,    656 iterations

Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
Signed-off-by: Yury Norov <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Clement Courbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/test_find_bit.c: rename to find_bit_benchmark.c

As suggested in review comments, rename test_find_bit.c to
find_bit_benchmark.c.

Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
Signed-off-by: Yury Norov <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Clement Courbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/stackdepot.c: use a non-instrumented version of memcmp()

stackdepot used to call memcmp(), which compiler tools normally
instrument, therefore every lookup used to unnecessarily call instrumented
code. This is somewhat ok in the case of KASAN, but under KMSAN a lot of
time was spent in the instrumentation.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexander Potapenko <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/bitmap.h: make bitmap_fill() and bitmap_zero() consistent

Behaviour of bitmap_fill() differs from bitmap_zero() in a way how bits
behind bitmap are handed. bitmap_zero() clears entire bitmap by unsigned
long boundary, while bitmap_fill() mimics bitmap_set().

Here we change bitmap_fill() behaviour to be consistent with bitmap_zero()
and add a note to documentation.

The change might reveal some bugs in the code where unused bits are
handled differently and in such cases bitmap_set() has to be used.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Suggested-by: Rasmus Villemoes <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Yury Norov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/test_bitmap.c: clean up test_zero_fill_copy() test case and rename

Since we have separate explicit test cases for bitmap_zero() /
bitmap_clear() and bitmap_fill() / bitmap_set(), clean up
test_zero_fill_copy() to only test bitmap_copy() functionality and thus
rename a function to reflect the changes.

While here, replace bitmap_fill() by bitmap_set() with proper values.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/test_bitmap.c: add bitmap_fill()/bitmap_set() test cases

Explicitly test bitmap_fill() and bitmap_set() functions.

For bitmap_fill() we expect a consistent behaviour as in bitmap_zero(),
i.e. the trailing bits will be set up to unsigned long boundary.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/test_bitmap.c: add bitmap_zero()/bitmap_clear() test cases

Explicitly test bitmap_zero() and bitmap_clear() functions.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

bitmap: replace bitmap_{from,to}_u32array

with bitmap_{from,to}_arr32 over the kernel. Additionally to it:
* __check_eq_bitmap() now takes single nbits argument.
* __check_eq_u32_array is not used in new test but may be used in
future. So I don't remove it here, but annotate as __used.

Tested on arm64 and 32-bit BE mips.

[[email protected]: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: fix net/core/ethtool.c]
Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: David Decotigny <[email protected]>,
Cc: David S. Miller <[email protected]>,
Cc: Geert Uytterhoeven <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Heiner Kallweit <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

bitmap: new bitmap_copy_safe and bitmap_{from,to}_arr32

This patchset replaces bitmap_{to,from}_u32array with more simple and
standard looking copy-like functions.

bitmap_from_u32array() takes 4 arguments (bitmap_to_u32array is similar):
- unsigned long *bitmap, which is destination;
- unsigned int nbits, the length of destination bitmap, in bits;
- const u32 *buf, the source; and
- unsigned int nwords, the length of source buffer in ints.

In description to the function it is detailed like:
* copy min(nbits, 32*nwords) bits from @buf to @bitmap, remaining
* bits between nword and nbits in @bitmap (if any) are cleared.

Having two size arguments looks unneeded and potentially dangerous.

It is unneeded because normally user of copy-like function should take
care of the size of destination and make it big enough to fit source
data.

And it is dangerous because function may hide possible error if user
doesn't provide big enough bitmap, and data becomes silently dropped.

That's why all copy-like functions have 1 argument for size of copying
data, and I don't see any reason to make bitmap_from_u32array()
different.

One exception that comes in mind is strncpy() which also provides size
of destination in arguments, but it's strongly argued by the possibility
of taking broken strings in source. This is not the case of
bitmap_{from,to}_u32array().

There is no many real users of bitmap_{from,to}_u32array(), and they all
very clearly provide size of destination matched with the size of
source, so additional functionality is not used in fact. Like this:
bitmap_from_u32array(to->link_modes.supported,
__ETHTOOL_LINK_MODE_MASK_NBITS,
link_usettings.link_modes.supported,
__ETHTOOL_LINK_MODE_MASK_NU32);
Where:
#define __ETHTOOL_LINK_MODE_MASK_NU32 \
DIV_ROUND_UP(__ETHTOOL_LINK_MODE_MASK_NBITS, 32)

In this patch, bitmap_copy_safe and bitmap_{from,to}_arr32 are introduced.

'Safe' in bitmap_copy_safe() stands for clearing unused bits in bitmap
beyond last bit till the end of last word. It is useful for hardening
API when bitmap is assumed to be exposed to userspace.

bitmap_{from,to}_arr32 functions are replacements for
bitmap_{from,to}_u32array. They don't take unneeded nwords argument, and
so simpler in implementation and understanding.

This patch suggests optimization for 32-bit systems - aliasing
bitmap_{from,to}_arr32 to bitmap_copy_safe.

Other possible optimization is aliasing 64-bit LE bitmap_{from,to}_arr32 to
more generic function(s). But I didn't end up with the function that would
be helpful by itself, and can be used to alias 64-bit LE
bitmap_{from,to}_arr32, like bitmap_copy_safe() does. So I preferred to
leave things as is.

The following patch switches kernel to new API and introduces test for it.

Discussion is here: https://lkml.org/lkml/2017/11/15/592

[[email protected]: rename bitmap_copy_safe to bitmap_copy_clear_tail]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yury Norov <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: David Decotigny <[email protected]>,
Cc: David S. Miller <[email protected]>,
Cc: Geert Uytterhoeven <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: update sboyd's email address

Replace my codeaurora.org address with my kernel.org address so that
emails don't bounce.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/async.c: revert "async: simplify lowest_in_progress()"

This reverts commit 92266d6ef60c ("async: simplify lowest_in_progress()")
which was simply wrong: In the case where domain is NULL, we now use the
wrong offsetof() in the list_first_entry macro, so we don't actually
fetch the ->cookie value, but rather the eight bytes located
sizeof(struct list_head) further into the struct async_entry.

On 64 bit, that's the data member, while on 32 bit, that's a u64 built
from func and data in some order.

I think the bug happens to be harmless in practice: It obviously only
affects callers which pass a NULL domain, and AFAICT the only such
caller is

  async_synchronize_full() ->
  async_synchronize_full_domain(NULL) ->
  async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL)

and the ASYNC_COOKIE_MAX means that in practice we end up waiting for
the async_global_pending list to be empty - but it would break if
somebody happened to pass (void*)-1 as the data element to
async_schedule, and of course also if somebody ever does a
async_synchronize_cookie_domain(, NULL) with a "finite" cookie value.

Maybe the "harmless in practice" means this isn't -stable material.  But
I'm not completely confident my quick git grep'ing is enough, and there
might be affected code in one of the earlier kernels that has since been
removed, so I'll leave the decision to the stable guys.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 92266d6ef60c "async: simplify lowest_in_progress()"
Signed-off-by: Rasmus Villemoes <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Adam Wallis <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: <[email protected]> [3.10+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

tools/lib/subcmd/pager.c: do not alias select() params

Use a separate fd set for select()-s exception fds param to fix the
following gcc warning:

  pager.c:36:12: error: passing argument 2 to restrict-qualified parameter aliases with argument 4 [-Werror=restrict]
    select(1, &in, NULL, &in, NULL);
              ^~~        ~~~

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Sergey Senozhatsky <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

uuid: cleanup <uapi/linux/uuid.h>

Exported header doesn't use anything from <linux/string.h>,
it is <linux/uuid.h> which uses memcmp().

Link: http://lkml.kernel.org/r/20171225171121.GA22754@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Makefile: introduce CONFIG_CC_STACKPROTECTOR_AUTO

Nearly all modern compilers support a stack-protector option, and nearly
all modern distributions enable the kernel stack-protector, so enabling
this by default in kernel builds would make sense.  However, Kconfig does
not have knowledge of available compiler features, so it isn't safe to
force on, as this would unconditionally break builds for the compilers or
architectures that don't have support.  Instead, this introduces a new
option, CONFIG_CC_STACKPROTECTOR_AUTO, which attempts to discover the best
possible stack-protector available, and will allow builds to proceed even
if the compiler doesn't support any stack-protector.

This option is made the default so that kernels built with modern
compilers will be protected-by-default against stack buffer overflows,
avoiding things like the recent BlueBorne attack.  Selection of a specific
stack-protector option remains available, including disabling it.

Additionally, tiny.config is adjusted to use CC_STACKPROTECTOR_NONE, since
that's the option with the least code size (and it used to be the default,
so we have to explicitly choose it there now).

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>
Tested-by: Laura Abbott <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Josh Triplett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Makefile: move stack-protector availability out of Kconfig

Various portions of the kernel, especially per-architecture pieces,
need to know if the compiler is building with the stack protector.
This was done in the arch/Kconfig with 'select', but this doesn't
allow a way to do auto-detected compiler support. In preparation for
creating an on-if-available default, move the logic for the definition of
CONFIG_CC_STACKPROTECTOR into the Makefile.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>
Tested-by: Laura Abbott <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Josh Triplett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Makefile: move stack-protector compiler breakage test earlier

In order to make stack-protector failures warn instead of unconditionally
breaking the build, this moves the compiler output sanity-check earlier,
and sets a flag for later testing. Future patches can choose to warn or
fail, depending on the flag value.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>
Tested-by: Laura Abbott <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Josh Triplett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/consoles.c: use seq_putc() in show_console_dev()

A single character (line break) should be put into a sequence. Thus use
the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

proc: rearrange args

Rearrange args for smaller code.

lookup revolves around memcmp() which gets len 3rd arg, so propagate
length as 3rd arg.

readdir and lookup add additional arg to VFS ->readdir and ->lookup, so
better add it to the end.

Space savings on x86_64:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-18 (-18)
Function                                     old     new   delta
proc_readdir                                  22      13      -9
proc_lookup                                   18       9      -9

proc_match() is smaller if not inlined, I promise!

Link: http://lkml.kernel.org/r/20180104175958.GB5204@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

proc: spread likely/unlikely a bit

use_pde() is used at every open/read/write/...  of every random /proc
file.  Negative refcount happens only if PDE is being deleted by module
(read: never).  So it gets "likely".

unuse_pde() gets "unlikely" for the same reason.

close_pdeo() gets unlikely as the completion is filled only if there is a
race between PDE removal and close() (read: never ever).

It even saves code on x86_64 defconfig:

add/remove: 0/0 grow/shrink: 1/2 up/down: 2/-20 (-18)
Function                                     old     new   delta
close_pdeo                                   183     185      +2
proc_reg_get_unmapped_area                   119     111      -8
proc_reg_poll                                 85      73     -12

Link: http://lkml.kernel.org/r/20180104175657.GA5204@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc: use __ro_after_init

/proc/self inode numbers, value of proc_inode_cache and st_nlink of
/proc/$TGID are fixed constants.

Link: http://lkml.kernel.org/r/20180103184707.GA31849@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/internal.h: fix up comment

Document what ->pde_unload_lock actually does.

Link: http://lkml.kernel.org/r/20180103185120.GB31849@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/internal.h: rearrange struct proc_dir_entry

struct proc_dir_entry became bit messy over years:

* move 16-bit ->mode_t before namelen to get rid of padding
* make ->in_use first field: it seems to be most used resulting in
  smaller code on x86_64 (defconfig):

add/remove: 0/0 grow/shrink: 7/13 up/down: 24/-67 (-43)
Function                                     old     new   delta
proc_readdir_de                              451     455      +4
proc_get_inode                               282     286      +4
pde_put                                       65      69      +4
remove_proc_subtree                          294     297      +3
remove_proc_entry                            297     300      +3
proc_register                                295     298      +3
proc_notify_change                            94      97      +3
unuse_pde                                     27      26      -1
proc_reg_write                                89      85      -4
proc_reg_unlocked_ioctl                       85      81      -4
proc_reg_read                                 89      85      -4
proc_reg_llseek                               87      83      -4
proc_reg_get_unmapped_area                   123     119      -4
proc_entry_rundown                           139     135      -4
proc_reg_poll                                 91      85      -6
proc_reg_mmap                                 79      73      -6
proc_get_link                                 55      49      -6
proc_reg_release                             108     101      -7
proc_reg_open                                298     291      -7
close_pdeo                                   228     218     -10

* move writeable fields together to a first cacheline (on x86_64),
  those include
* ->in_use: reference count, taken every open/read/write/close etc
* ->count: reference count, taken at readdir on every entry
* ->pde_openers: tracks (nearly) every open, dirtied
* ->pde_unload_lock: spinlock protecting ->pde_openers
* ->proc_iops, ->proc_fops, ->data: writeonce fields,
  used right together with previous group.

* other rarely written fields go into 1st/2nd and 2nd/3rd cacheline on
  32-bit and 64-bit respectively.

Additionally on 32-bit, ->subdir, ->subdir_node, ->namelen, ->name go
fully into 2nd cacheline, separated from writeable fields.  They are all
used during lookup.

Link: http://lkml.kernel.org/r/20171220215914.GA7877@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/kcore.c: use probe_kernel_read() instead of memcpy()

Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext
data") added a bounce buffer to avoid hardened usercopy checks. Copying
to the bounce buffer was implemented with a simple memcpy() assuming
that it is always valid to read from kernel memory iff the
kern_addr_valid() check passed.

A simple, but pointless, test case like "dd if=/proc/kcore of=/dev/null"
now can easily crash the kernel, since the former execption handling on
invalid kernel addresses now doesn't work anymore.

Also adding a kern_addr_valid() implementation wouldn't help here. Most
architectures simply return 1 here, while a couple implemented a page
table walk to figure out if something is mapped at the address in
question.

With DEBUG_PAGEALLOC active mappings are established and removed all the
time, so that relying on the result of kern_addr_valid() before
executing the memcpy() also doesn't work.

Therefore simply use probe_kernel_read() to copy to the bounce buffer.
This also allows to simplify read_kcore().

At least on s390 this fixes the observed crashes and doesn't introduce
warnings that were removed with df04abfd181a ("fs/proc/kcore.c: Add
bounce buffer for ktext data"), even though the generic
probe_kernel_read() implementation uses uaccess functions.

While looking into this I'm also wondering if kern_addr_valid() could be
completely removed...(?)

Link: http://lkml.kernel.org/r/[email protected]
Fixes: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Signed-off-by: Heiko Carstens <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Al Viro <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/array.c: delete children_seq_release()

It is 1:1 wrapper around seq_release().

Link: http://lkml.kernel.org/r/20171122171510.GA12161@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Acked-by: Cyrill Gorcunov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

proc: less memory for /proc/*/map_files readdir

dentry name can be evaluated later, right before calling into VFS.

Also, spend less time under ->mmap_sem.

Link: http://lkml.kernel.org/r/20171110163034.GA2534@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/vmcore.c: simpler /proc/vmcore cleanup

Iterators aren't necessary as you can just grab the first entry and delete
it until no entries left.

Link: http://lkml.kernel.org/r/20171121191121.GA20757@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

proc: fix /proc/*/map_files lookup

Current code does:

if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2)

However sscanf() is broken garbage.

It silently accepts whitespace between format specifiers
(did you know that?).

It silently accepts valid strings which result in integer overflow.

Do not use sscanf() for any even remotely reliable parsing code.

OK
# readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
/lib/systemd/systemd

broken
# readlink '/proc/1/map_files/               55a23af39000-55a23b05b000'
/lib/systemd/systemd

broken
# readlink '/proc/1/map_files/55a23af39000-55a23b05b000    '
/lib/systemd/systemd

very broken
# readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000'
/lib/systemd/systemd

Andrei said:

: This patch breaks criu.  It was a bug in criu.  And this bug is on a minor
: path, which works when memfd_create() isn't available.  It is a reason why
: I ask to not backport this patch to stable kernels.
:
: In CRIU this bug can be triggered, only if this patch will be backported
: to a kernel which version is lower than v3.16.

Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Andrei Vagin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

proc: don't use READ_ONCE/WRITE_ONCE for /proc/*/fail-nth

READ_ONCE and WRITE_ONCE are useless when there is only one read/write
is being made.

Link: http://lkml.kernel.org/r/20171120204033.GA9446@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Akinobu Mita <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

proc: use %u for pid printing and slightly less stack

PROC_NUMBUF is 13 which is enough for "negative int + \n + \0".

However PIDs and TGIDs are never negative and newline is not a concern,
so use just 10 per integer.

Link: http://lkml.kernel.org/r/20171120203005.GA27743@avx2
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Alexander Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: remove redundant initialization of variable 'real_size'

Variable real_size is initialized with a value that is never read, it is
re-assigned a new value later on, hence the initialization is redundant
and can be removed.

Cleans up clang warning:

lib/test_kasan.c:422:21: warning: Value stored to 'real_size' during its initialization is never read

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage

Right now the fact that KASAN uses a single shadow byte for 8 bytes of
memory is scattered all over the code.

This change defines KASAN_SHADOW_SCALE_SHIFT early in asm include files
and makes use of this constant where necessary.

[[email protected]: coding-style fixes]
Link: http://lkml.kernel.org/r/34937ca3b90736eaad91b568edf5684091f662e3.1515775666.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: fix prototype author email address

Use the new one.

Link: http://lkml.kernel.org/r/de3b7ffc30a55178913a7d3865216aa7accf6c40.1515775666.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: detect invalid frees

Detect frees of pointers into middle of heap objects.

Link: http://lkml.kernel.org/r/cb569193190356beb018a03bb8d6fbae67e7adbc.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>a
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: unify code between kasan_slab_free() and kasan_poison_kfree()

Both of these functions deal with freeing of slab objects.
However, kasan_poison_kfree() mishandles SLAB_TYPESAFE_BY_RCU
(must also not poison such objects) and does not detect double-frees.

Unify code between these functions.

This solves both of the problems and allows to add more common code
(e.g. detection of invalid frees).

Link: http://lkml.kernel.org/r/385493d863acf60408be219a021c3c8e27daa96f.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>a
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: detect invalid frees for large mempool objects

Detect frees of pointers into middle of mempool objects.

I did a one-off test, but it turned out to be very tricky, so I reverted
it.  First, mempool does not call kasan_poison_kfree() unless allocation
function fails.  I stubbed an allocation function to fail on second and
subsequent allocations.  But then mempool stopped to call
kasan_poison_kfree() at all, because it does it only when allocation
function is mempool_kmalloc().  We could support this special failing
test allocation function in mempool, but it also can't live with kasan
tests, because these are in a module.

Link: http://lkml.kernel.org/r/bf7a7d035d7a5ed62d2dd0e3d2e8a4fcdf456aa7.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>a
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: don't use __builtin_return_address(1)

__builtin_return_address(1) is unreliable without frame pointers.
With defconfig on kmalloc_pagealloc_invalid_free test I am getting:

BUG: KASAN: double-free or invalid-free in (null)

Pass caller PC from callers explicitly.

Link: http://lkml.kernel.org/r/9b01bc2d237a4df74ff8472a3bf6b7635908de01.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>a
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: detect invalid frees for large objects

Patch series "kasan: detect invalid frees".

KASAN detects double-frees, but does not detect invalid-frees (when a
pointer into a middle of heap object is passed to free). We recently had
a very unpleasant case in crypto code which freed an inner object inside
of a heap allocation. This left unnoticed during free, but totally
corrupted heap and later lead to a bunch of random crashes all over kernel
code.

Detect invalid frees.

This patch (of 5):

Detect frees of pointers into middle of large heap objects.

I dropped const from kasan_kfree_large() because it starts propagating
through a bunch of functions in kasan_report.c, slab/slub nearest_obj(),
all of their local variables, fixup_red_left(), etc.

Link: http://lkml.kernel.org/r/1b45b4fe1d20fc0de1329aab674c1dd973fee723.1514378558.git.dvyukov@google.com
Signed-off-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>a
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: add functions for unpoisoning stack variables

As a code-size optimization, LLVM builds since r279383 may bulk-manipulate
the shadow region when (un)poisoning large memory blocks. This requires
new callbacks that simply do an uninstrumented memset().

This fixes linking the Clang-built kernel when using KASAN.

[[email protected]: add declarations for internal functions]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: __asan_set_shadow_00 can be static]
Link: http://lkml.kernel.org/r/20171223125943.GA74341@lkp-ib03
[[email protected]: fix memset() parameters, and tweak commit message to describe new callbacks]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexander Potapenko <[email protected]>
Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: add tests for alloca poisoning

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: support alloca() poisoning

clang's AddressSanitizer implementation adds redzones on either side of
alloca()ed buffers. These redzones are 32-byte aligned and at least 32
bytes long.

__asan_alloca_poison() is passed the size and address of the allocated
buffer, *excluding* the redzones on either side. The left redzone will
always be to the immediate left of this buffer; but AddressSanitizer may
need to add padding between the end of the buffer and the right redzone.
If there are any 8-byte chunks inside this padding, we should poison
those too.

__asan_allocas_unpoison() is just passed the top and bottom of the dynamic
stack area, so unpoisoning is simpler.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan/Makefile: support LLVM style asan parameters

LLVM doesn't understand GCC-style paramters ("--param asan-foo=bar"), thus
we currently we don't use inline/globals/stack instrumentation when
building the kernel with clang.

Add support for LLVM-style parameters ("-mllvm -asan-foo=bar") to enable
all KASAN features.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Reviewed-by: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Greg Hackmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: add compiler support for clang

Patch series "kasan: support alloca, LLVM", v4.

This patch (of 5):

For now we can hard-code ASAN ABI level 5, since historical clang builds
can't build the kernel anyway. We also need to emulate gcc's
__SANITIZE_ADDRESS__ flag, or memset() calls won't be instrumented.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Hackmann <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: don't emit builtin calls when sanitization is off

With KASAN enabled the kernel has two different memset() functions, one
with KASAN checks (memset) and one without (__memset).  KASAN uses some
macro tricks to use the proper version where required.  For example
memset() calls in mm/slub.c are without KASAN checks, since they operate
on poisoned slab object metadata.

The issue is that clang emits memset() calls even when there is no
memset() in the source code.  They get linked with improper memset()
implementation and the kernel fails to boot due to a huge amount of KASAN
reports during early boot stages.

The solution is to add -fno-builtin flag for files with KASAN_SANITIZE :=
n marker.

Link: http://lkml.kernel.org/r/8ffecfffe04088c52c42b92739c2bd8a0bcb3f5e.1516384594.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Nick Desaulniers <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

netfilter: nf_tables: fix flowtable free

Every flow_offload entry is added into the table twice. Because of this,
rhashtable_free_and_destroy can't be used, since it would call kfree for
each flow_offload object twice.

This patch cleans up the flowtable via nf_flow_table_iterate() to
schedule removal of entries by setting on the dying bit, then there is
an explicitly invocation of the garbage collector to release resources.

Based on patch from Felix Fietkau.

Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: nft_flow_offload: move flowtable cleanup routines to nf_flow_table

Move the flowtable cleanup routines to nf_flow_table and expose the
nf_flow_table_cleanup() helper function.

Signed-off-by: Pablo Neira Ayuso <[email protected]>

netfilter: xt_RATEEST: acquire xt_rateest_mutex for hash insert

rateest_hash is supposed to be protected by xt_rateest_mutex,
and, as suggested by Eric, lookup and insert should be atomic,
so we should acquire the xt_rateest_mutex once for both.

So introduce a non-locking helper for internal use and keep the
locking one for external.

Reported-by: <[email protected]>
Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
Signed-off-by: Cong Wang <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>

Merge tag 'platform-drivers-x86-v4.16-1' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform-driver updates from Darren Hart:
"New model support added for Dell, Ideapad, Acer, Asus, Thinkpad, and
  GPD laptops. Improvements to the common intel-vbtn driver, including
  tablet mode, rotate, and front button support. Intel CPU support added
  for Cannonlake and platform support for Dollar Cove power button.

  Overhaul of the mellanox platform driver, creating a new
  platform/mellanox directory for the newly multi-architecture regmap
  interface.

  Significant Intel PMC update with CannonLake support, Coffeelake
  update, CPUID enumeration, module support, new read64 API, refactoring
  and cleanups.

  Revert the apple-gmux iGP IO lock, addressing reported issues with
  non-binary drivers, leaving Nvidia binary driver users to comment out
  conflicting code.

  Miscellaneous fixes and cleanups"

* tag 'platform-drivers-x86-v4.16-1' of git://git.infradead.org/linux-platform-drivers-x86: (81 commits)
  platform/x86: mlx-platform: Fix an ERR_PTR vs NULL issue
  platform/x86: intel_pmc_core: Special case for Coffeelake
  platform/x86: intel_pmc_core: Add CannonLake PCH support
  x86/cpu: Add Cannonlake to Intel family
  platform/x86: intel_pmc_core: Read base address from LPIT
  ACPI / LPIT: Export lpit_read_residency_count_address()
  platform/x86: intel-vbtn: Replace License by SDPX identifier
  platform/x86: intel-vbtn: Remove redundant inclusions
  platform/x86: intel-vbtn: Support tablet mode switch
  platform/x86: dell-laptop: Allocate buffer on heap rather than globally
  platform/x86: intel_pmc_core: Remove unused header file
  platform/x86: mlx-platform: Add hotplug device unregister to error path
  platform/x86: mlx-platform: fix module aliases
  platform/mellanox: mlxreg-hotplug: Add check for negative adapter number
  platform/x86: mlx-platform: Add IO access verification callbacks
  platform/x86: mlx-platform: Document pdev_hotplug field
  platform/x86: mlx-platform: Allow compilation for 32 bit arch
  platform/mellanox: mlxreg-hotplug: Enable building for ARM
  platform/mellanox: mlxreg-hotplug: Modify to use a regmap interface
  platform/mellanox: Group create/destroy with attribute functions
  ...

Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux

Pull thermal management updates from Zhang Rui:

- fix a race condition issue in power allocator governor (Yi Zeng).

- add support for AP806 and CP110 in armada thermal driver, together
   with several improvements (Baruch Siach, Miquel Raynal)

- add support for r8z7743 in rcar thermal driver (Biju Das)

- convert thermal core to use new hwmon API to avoid warning (Fabio
   Estevam)

- small fixes and cleanups in thermal core and x86_pkg_thermal,
   int3400_thermal, hisi_thermal, mtk_thermal and imx_thermal drivers
   (Pravin Shedge, Geert Uytterhoeven, Alexey Khoroshilov, Brian Bian,
   Matthias Brugger, Nicolin Chen, Uwe Kleine-König)

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
  thermal: thermal_hwmon: Convert to hwmon_device_register_with_info()
  thermal/x86 pkg temp: Remove debugfs_create_u32() casts
  thermal: int3400_thermal: fix error handling in int3400_thermal_probe()
  thermal/drivers/hisi: Remove bogus const from function return type
  thermal: armada: Give meaningful names to the thermal zones
  thermal: armada: Wait sensors validity before exiting the init callback
  thermal: armada: Change sensors trim default value
  thermal: armada: Update Kconfig and module description
  thermal: armada: Add support for Armada CP110
  thermal: armada: Add support for Armada AP806
  thermal: armada: Use real status register name
  thermal: armada: Clarify control registers accesses
  thermal: armada: Simplify the check of the validity bit
  thermal: armada: Use msleep for long delays
  dt-bindings: thermal: Describe Armada AP806 and CP110
  dt-bindings: thermal: rcar: Add device tree support for r8a7743
  thermal: mtk: Cleanup unused defines
  thermal: imx: update to new formula according to NXP AN5215
  thermal: imx: use consistent style to write temperatures
  thermal: imx: improve comments describing algorithm for temp calculation
  ...

media: videobuf2: fix up for "media: annotate ->poll() instances"

Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'linus' into sched/urgent, to resolve conflicts

Conflicts:
arch/arm64/kernel/entry.S
arch/x86/Kconfig
include/linux/sched/mm.h
kernel/fork.c

Signed-off-by: Ingo Molnar <[email protected]>

Merge tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

- videobuf2 was moved to a media/common dir, as it is now used by the
   DVB subsystem too

- Digital TV core memory mapped support interface

- new sensor driver: ov7740

- several improvements at ddbridge driver

- new V4L2 driver: IPU3 CIO2 CSI-2 receiver unit, found on some Intel
   SoCs

- new tuner driver: tda18250

- finally got rid of all LIRC staging drivers

- as we don't have old lirc drivers anymore, restruct the lirc device
   code

- add support for UVC metadata

- add a new staging driver for NVIDIA Tegra Video Decoder Engine

- DVB kAPI headers moved to include/media

- synchronize the kAPI and uAPI for the DVB subsystem, removing the gap
   for non-legacy APIs

- reduce the kAPI gap for V4L2

- lots of other driver enhancements, cleanups, etc.

* tag 'media/v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (407 commits)
  media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs
  media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
  media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors
  media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
  media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
  media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
  media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
  media: v4l2-compat-ioctl32.c: avoid sizeof(type)
  media: v4l2-compat-ioctl32.c: move 'helper' functions to __get/put_v4l2_format32
  media: v4l2-compat-ioctl32.c: fix the indentation
  media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF
  media: v4l2-ioctl.c: don't copy back the result for -ENOTTY
  media: v4l2-ioctl.c: use check_fmt for enum/g/s/try_fmt
  media: vivid: fix module load error when enabling fb and no_error_inj=1
  media: dvb_demux: improve debug messages
  media: dvb_demux: Better handle discontinuity errors
  media: cxusb, dib0700: ignore XC2028_I2C_FLUSH
  media: ts2020: avoid integer overflows on 32 bit machines
  media: i2c: ov7740: use gpio/consumer.h instead of gpio.h
  media: entity: Add a nop variant of media_entity_cleanup
  ...

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull more rdma updates from Doug Ledford:
"Items of note:

   - two patches fix a regression in the 4.15 kernel. The 4.14 kernel
     worked fine with NVMe over Fabrics and mlx5 adapters. That broke in
     4.15. The fix is here.

   - one of the patches (the endian notation patch from Lijun) looks
     like a lot of lines of change, but it's mostly mechanical in
     nature. It amounts to the biggest chunk of change in it (it's about
     2/3rds of the overall pull request).

  Summary:

   - Clean up some function signatures in rxe for clarity

   - Tidy the RDMA netlink header to remove unimplemented constants

   - bnxt_re driver fixes, one is a regression this window.

   - Minor hns driver fixes

   - Various fixes from Dan Carpenter and his tool

   - Fix IRQ cleanup race in HFI1

   - HF1 performance optimizations and a fix to report counters in the right units

   - Fix for an IPoIB startup sequence race with the external manager

   - Oops fix for the new kabi path

   - Endian cleanups for hns

   - Fix for mlx5 related to the new automatic affinity support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (38 commits)
  net/mlx5: increase async EQ to avoid EQ overrun
  mlx5: fix mlx5_get_vector_affinity to start from completion vector 0
  RDMA/hns: Fix the endian problem for hns
  IB/uverbs: Use the standard kConfig format for experimental
  IB: Update references to libibverbs
  IB/hfi1: Add 16B rcvhdr trace support
  IB/hfi1: Convert kzalloc_node and kcalloc to use kcalloc_node
  IB/core: Avoid a potential OOPs for an unused optional parameter
  IB/core: Map iWarp AH type to undefined in rdma_ah_find_type
  IB/ipoib: Fix for potential no-carrier state
  IB/hfi1: Show fault stats in both TX and RX directions
  IB/hfi1: Remove blind constants from 16B update
  IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times
  IB/hfi1: Do not override given pcie_pset value
  IB/hfi1: Optimize process_receive_ib()
  IB/hfi1: Remove unnecessary fecn and becn fields
  IB/hfi1: Look up ibport using a pointer in receive path
  IB/hfi1: Optimize packet type comparison using 9B and bypass code paths
  IB/hfi1: Compute BTH only for RDMA_WRITE_LAST/SEND_LAST packet
  IB/hfi1: Remove dependence on qp->s_hdrwords
  ...

Merge tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Ross Zwisler:

- Require struct page by default for filesystem DAX to remove a number
   of surprising failure cases. This includes failures with direct I/O,
   gdb and fork(2).

- Add support for the new Platform Capabilities Structure added to the
   NFIT in ACPI 6.2a. This new table tells us whether the platform
   supports flushing of CPU and memory controller caches on unexpected
   power loss events.

- Revamp vmem_altmap and dev_pagemap handling to clean up code and
   better support future future PCI P2P uses.

- Deprecate the ND_IOCTL_SMART_THRESHOLD command whose payload has
   become out-of-sync with recent versions of the NVDIMM_FAMILY_INTEL
   spec, and instead rely on the generic ND_CMD_CALL approach used by
   the two other IOCTL families, NVDIMM_FAMILY_{HPE,MSFT}.

- Enhance nfit_test so we can test some of the new things added in
   version 1.6 of the DSM specification. This includes testing firmware
   download and simulating the Last Shutdown State (LSS) status.

* tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (37 commits)
  libnvdimm, namespace: remove redundant initialization of 'nd_mapping'
  acpi, nfit: fix register dimm error handling
  libnvdimm, namespace: make min namespace size 4K
  tools/testing/nvdimm: force nfit_test to depend on instrumented modules
  libnvdimm/nfit_test: adding support for unit testing enable LSS status
  libnvdimm/nfit_test: add firmware download emulation
  nfit-test: Add platform cap support from ACPI 6.2a to test
  libnvdimm: expose platform persistence attribute for nd_region
  acpi: nfit: add persistent memory control flag for nd_region
  acpi: nfit: Add support for detect platform CPU cache flush on power loss
  device-dax: Fix trailing semicolon
  libnvdimm, btt: fix uninitialized err_lock
  dax: require 'struct page' by default for filesystem dax
  ext2: auto disable dax instead of failing mount
  ext4: auto disable dax instead of failing mount
  mm, dax: introduce pfn_t_special()
  mm: Fix devm_memremap_pages() collision handling
  mm: Fix memory size alignment in devm_memremap_pages_release()
  memremap: merge find_dev_pagemap into get_dev_pagemap
  memremap: change devm_memremap_pages interface to use struct dev_pagemap
  ...

Merge tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

- skip AER driver error recovery callbacks for correctable errors
   reported via ACPI APEI, as we already do for errors reported via the
   native path (Tyler Baicar)

- fix DPC shared interrupt handling (Alex Williamson)

- print full DPC interrupt number (Keith Busch)

- enable DPC only if AER is available (Keith Busch)

- simplify DPC code (Bjorn Helgaas)

- calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn
   Helgaas)

- enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn
   Helgaas)

- move ASPM internal interfaces out of public header (Bjorn Helgaas)

- allow hot-removal of VGA devices (Mika Westerberg)

- speed up unplug and shutdown by assuming Thunderbolt controllers
   don't support Command Completed events (Lukas Wunner)

- add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling,
   Jay Cornwall)

- expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes)

- clean up PCI DMA interface usage (Christoph Hellwig)

- remove PCI pool API (replaced with DMA pool) (Romain Perier)

- deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan
   Kaya)

- move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring)

- add PCI-specific wrappers for dev_info(), etc (Frederick Lawler)

- remove warnings on sysfs mmap failure (Bjorn Helgaas)

- quiet ROM validation messages (Alex Deucher)

- remove redundant memory alloc failure messages (Markus Elfring)

- fill in types for compile-time VGA and other I/O port resources
   (Bjorn Helgaas)

- make "pci=pcie_scan_all" work for Root Ports as well as Downstream
   Ports to help AmigaOne X1000 (Bjorn Helgaas)

- add SPDX tags to all PCI files (Bjorn Helgaas)

- quirk Marvell 9128 DMA aliases (Alex Williamson)

- quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas)

- fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas
   Cassel)

- use DMA API to get MSI address for DesignWare IP (Niklas Cassel)

- fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I)

- fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun)

- add support for ARTPEC-7 SoC (Niklas Cassel)

- add endpoint-mode support for ARTPEC (Niklas Cassel)

- add Cadence PCIe host and endpoint controller driver (Cyrille
   Pitchen)

- handle multiple INTx status bits being set in dra7xx (Vignesh R)

- translate dra7xx hwirq range to fix INTD handling (Vignesh R)

- remove deprecated Exynos PHY initialization code (Jaehoon Chung)

- fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu)

- fix NULL pointer dereference in iProc BCMA driver (Ray Jui)

- fix Keystone interrupt-controller-node lookup (Johan Hovold)

- constify qcom driver structures (Julia Lawall)

- rework Tegra config space mapping to increase space available for
   endpoints (Vidya Sagar)

- simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy)

- remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy)

- add support for Global Fabric Manager Server (GFMS) event to
   Microsemi Switchtec switch driver (Logan Gunthorpe)

- add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao)

* tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
  PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller
  dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller
  PCI: endpoint: Fix EPF device name to support multi-function devices
  PCI: endpoint: Add the function number as argument to EPC ops
  PCI: cadence: Add host driver for Cadence PCIe controller
  dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller
  PCI: Add vendor ID for Cadence
  PCI: Add generic function to probe PCI host controllers
  PCI: generic: fix missing call of pci_free_resource_list()
  PCI: OF: Add generic function to parse and allocate PCI resources
  PCI: Regroup all PCI related entries into drivers/pci/Makefile
  PCI/DPC: Reformat DPC register definitions
  PCI/DPC: Add and use DPC Status register field definitions
  PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error()
  PCI/DPC: Remove unnecessary RP PIO register structs
  PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info()
  PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info()
  PCI/DPC: Make RP PIO log size check more generic
  PCI/DPC: Rename local "status" to "dpc_status"
  PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error()
  ...

Merge branch 'be2net-patch-set'

Suresh Reddy says:

====================
be2net: patch-set

Hi Dave, Please consider applying these two patches to net
====================

Signed-off-by: David S. Miller <[email protected]>

be2net: Handle transmit completion errors in Lancer

If the driver receives a TX CQE with status as 0x1 or 0x9 or 0xb,
the completion indexes should not be used. The driver must stop
consuming CQEs from this TXQ/CQ. The TXQ from this point on-wards
to be in a bad state. Driver should destroy and recreate the TXQ.

0x1: LANCER_TX_COMP_LSO_ERR
0x9 LANCER_TX_COMP_SGE_ERR
0xb: LANCER_TX_COMP_PARITY_ERR

Reset the adapter if driver sees this error in TX completion. Also
adding sge error counter in ethtool stats.

Signed-off-by: Suresh Reddy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: Fix HW stall issue in Lancer

Lancer HW cannot handle a TSO packet with a single segment.
Disable TSO/GSO for such packets.

Signed-off-by: Suresh Reddy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

RDS: IB: Fix null pointer issue

Scenario:
1. Port down and do fail over
2. Ap do rds_bind syscall

PID: 47039  TASK: ffff89887e2fe640  CPU: 47  COMMAND: "kworker/u:6"
#0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9
#1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3
#2 [ffff898e35f15b30] oops_end at ffffffff8150f518
#3 [ffff898e35f15b60] no_context at ffffffff8104854c
#4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675
#5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3
#6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8
#7 [ffff898e35f15d10] page_fault at ffffffff8150ea95
    [exception RIP: unknown or invalid address]
    RIP: 0000000000000000  RSP: ffff898e35f15dc8  RFLAGS: 00010282
    RAX: 00000000fffffffe  RBX: ffff889b77f6fc00  RCX:ffffffff81c99d88
    RDX: 0000000000000000  RSI: ffff896019ee08e8  RDI:ffff889b77f6fc00
    RBP: ffff898e35f15df0   R8: ffff896019ee08c8  R9:0000000000000000
    R10: 0000000000000400  R11: 0000000000000000  R12:ffff896019ee08c0
    R13: ffff889b77f6fe68  R14: ffffffff81c99d80  R15: ffffffffa022a1e0
    ORIG_RAX: ffffffffffffffff  CS: 0010 SS: 0018
#8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm]
#9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6
#10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0
#11 [ffff898e35f15ee8] kthread at ffffffff81090fe6

PID: 45659  TASK: ffff880d313d2500  CPU: 31  COMMAND: "oracle_45659_ap"
#0 [ffff881024ccfc98] __schedule at ffffffff8150bac4
#1 [ffff881024ccfd40] schedule at ffffffff8150c2cf
#2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7
#3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb
#4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm]
#5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma]
#6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds]
#7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds]
#8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670

PID: 45659                          PID: 47039
rds_ib_laddr_check
  /* create id_priv with a null event_handler */
  rdma_create_id
  rdma_bind_addr
    cma_acquire_dev
      /* add id_priv to cma_dev->id_list */
      cma_attach_to_dev
                                    cma_ndev_work_handler
                                      /* event_hanlder is null */
                                      id_priv->id.event_handler

Signed-off-by: Guanglei Li <[email protected]>
Signed-off-by: Honglei Wang <[email protected]>
Reviewed-by: Junxiao Bi <[email protected]>
Reviewed-by: Yanjun Zhu <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Acked-by: Doug Ledford <[email protected]>
Signed-off-by: David S. Miller <[email protected]>