Git Repo - linux.git/log

arc: remove redundant UTS_MACHINE define in arch/arc/Makefile

The top-level Makefile sets the default of UTS_MACHINE to $(ARCH).

If ARCH and UTS_MACHINE match, arch/$(ARCH)/Makefile need not specify
UTS_MACHINE explicitly.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

ARC: [plat-eznps] Update platform maintainer as Noam left

Signed-off-by: Vineet Gupta <[email protected]>

ARC: [plat-hsdk] use actual clk driver to manage cpu clk

With corresponding clk driver now merged upstream, switch to it.

- core_clk now represent the PLL (vs. fixed clk before)
- input_clk represent the clk signal src for PLL (basically xtal)

Signed-off-by: Eugeniy Paltsev <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

ARC: [*defconfig] Reenable soft lock-up detector

Commit 92e5aae45778 "kernel/watchdog: split up config options"
introduced SOFTLOCKUP_DETECTOR which selects LOCKUP_DETECTOR
instead of the latter to be selected itself.

We need to adjust our defconfigs accordingly.

Signed-off-by: Alexey Brodkin <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

ARC: [plat-axs10x] sdio: Temporary fix of sdio ciu frequency

DW sdio controller has external ciu clock divider controlled
via register in SDIO IP. It divides sdio_ref_clk
(which comes from CGU) by 16 for default. So default mmcclk
clock (which comes to sdk_in) is 25000000 Hz.

So fix wrong current value (50000000 Hz) to actual 25000000 Hz.

Note this is a preventive fix, in line with similar change for HSDK
where this was actually needed. see:
http://lists.infradead.org/pipermail/linux-snps-arc/2017-September/002924.html

Signed-off-by: Eugeniy Paltsev <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

ARC: [plat-hsdk] sdio: Temporary fix of sdio ciu frequency

DW sdio controller has external ciu clock divider controlled via
register in SDIO IP. Due to its unexpected default value
(it should divide by 1 but it divides by 8)
SDIO IP uses wrong ciu clock and works unstable

So add temporary fix and change clock frequency from 100000000
to 12500000 Hz until we fix dw sdio driver itself.

Fixes SNPS STAR 9001204800

Signed-off-by: Eugeniy Paltsev <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

ARC: [plat-axs103] Add temporary quirk to reset ethernet IP

DW ethernet controller on AXS10x hangs sometimes after SW reset, so
add temporary quirk to reset DW ethernet controller IP core.
This quirk can be removed after axs10x reset driver
(see http://patchwork.ozlabs.org/patch/800273/)
or simple reset driver
(see https://patchwork.kernel.org/patch/9903375/)
will be available in upstream.

Signed-off-by: Eugeniy Paltsev <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>

Merge tag 'omap-for-v4.14/fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Fixes for omaps for v4.14-rc cycle

Few minor fixes for omaps, mostly just boot time warning fixes:

- Drop undocumented camera binding that got merged during the merge window by
  accident as I applied before Sakari's comments

- Fix soft reset warning for dra7 kexec boot for gpio1 as the optional clocks
  need to be enabled for reset

- Fix dra7 kexec boot clock rate for McASP as the rate is no longer the default
  rate after kexec

- Fix omap3 pandora MMC warning during boot

- Add am33xx SPI alias like we have on other SoCs

- Remove node for non-existing CPSW EMAC Ethernet on am43xx-epos-evm

* tag 'omap-for-v4.14/fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: am43xx-epos-evm: Remove extra CPSW EMAC entry
  ARM: dts: am33xx: Add spi alias to match SOC schematics
  ARM: OMAP2+: hsmmc: fix logic to call either omap_hsmmc_init or omap_hsmmc_late_init but not both
  ARM: dts: dra7: Set a default parent to mcasp3_ahclkx_mux
  ARM: OMAP2+: dra7xx: Set OPT_CLKS_IN_RESET flag for gpio1
  ARM: dts: nokia n900: drop unneeded/undocumented parts of the dts

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'v4.14-rockchip-dts64fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes

Adding the operating points on rk3368 like they were did not end up well
for the boards as all of them are missing their cpu supplies, the OPPs
actually need to follow the <target min max> format as the regulator is
shared between both clusters and the one rk3368 board I have, somehow also
doesn't like the higher opps at all - all of which I only realized after
I brought my rk3368 board online again, after its bootloader broke.
So we revert that OPP addition for now.

And also two fixes for the mipi dsi controller on rk3399, which was
referencing a clock to high up in the clock-tree so that an intermediate
gate could be disabled inadvertently and also needs a clock for its area
in the general register files of the rk3399 soc.

* tag 'v4.14-rockchip-dts64fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: add the grf clk for dw-mipi-dsi on rk3399
  arm64: dts: rockchip: Correct MIPI DPHY PLL clock on rk3399
  Revert "arm64: dts: rockchip: Add basic cpu frequencies for RK3368"

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'mvebu-fixes-4.14-1' of git://git.infradead.org/linux-mvebu into fixes

mvebu fixes for 4.14 (part 1)

Update MAINTAINERS for the Macchiatobin board (Armada 8K based)
Fix AP806 system controller size on Armada 7K/8K

* tag 'mvebu-fixes-4.14-1' of git://git.infradead.org/linux-mvebu:
arm64: dt marvell: Fix AP806 system controller size
MAINTAINERS: add Macchiatobin maintainers entry

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'reset-fixes-for-4.14' of git://git.pengutronix.de/git/pza/linux into fixes

Reset controller fixes for v4.14

- Remove misleading HSDK v1 suffix, as there is no v2 planned
- Add missing DT binding documentation for HSDK reset driver
- Fix HSDK reset driver dependencies

* tag 'reset-fixes-for-4.14' of git://git.pengutronix.de/git/pza/linux:
  reset: Restrict RESET_HSDK to ARC_SOC_HSDK or COMPILE_TEST
  ARC: reset: remove the misleading v1 suffix all over
  ARC: reset: add missing DT binding documentation for HSDKv1 reset driver
  ARC: reset: Only build on archs that have IOMEM

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'davinci-fixes-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into fixes

A fix for random ethernet mac address problem
on DA850 EVM.

* tag 'davinci-fixes-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: dts: da850-evm: add serial and ethernet aliases

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91 into fixes

Fixes for 4.14:
- three DT fixes for the newly introduced sama5d27_som1_ek board
- one treewide modification that didn't touch this new PM code: we
  synchronize now to be coherent with the other ARM platforms

* tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91:
  ARM: at91: Replace uses of virt_to_phys with __pa_symbol
  ARM: dts: at91: sama5d27_som1_ek: fix USB host vbus
  ARM: dts: at91: sama5d27_som1_ek: fix typos
  ARM: dts: at91: sama5d27_som1_ek: update pinmux/pinconf for LEDs and USB

Signed-off-by: Olof Johansson <[email protected]>

ARM: defconfig: update Gemini defconfig

This updates the Gemini defconfig with drivers merged
for v4.13 or v4.14:
- ATA driver is merged
- DMA driver is merged
- RTC driver gets selected from default Kconfig

Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

ARM: defconfig: FRAMEBUFFER_CONSOLE can no longer be =m

It is no longer possible to load this at runtime, so let's
change the few remaining users to have it built-in all
the time.

arch/arm/configs/zeus_defconfig:115:warning: symbol value 'm' invalid for FRAMEBUFFER_CONSOLE
arch/arm/configs/viper_defconfig:116:warning: symbol value 'm' invalid for FRAMEBUFFER_CONSOLE
arch/arm/configs/pxa_defconfig:474:warning: symbol value 'm' invalid for FRAMEBUFFER_CONSOLE

Reported-by: kernelci.org bot <[email protected]>
Fixes: 6104c37094e7 ("fbcon: Make fbcon a built-time depency for fbdev")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

include/linux/fs.h: fix comment about struct address_space

Before commit 9c5d760b8d22 ("mm: split gfp_mask and mapping flags into
separate fields") the private_* fields of struct adrress_space were
grouped together and using "ditto" in comments describing the last
fields was correct.

With introduction of gpf_mask between private_lock and private_list
"ditto" references the wrong description.

Fix it by using the elaborate description.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

checkpatch: fix ignoring cover-letter logic

Currently running checkpatch on a directory with a cover-letter.patch
file reports the following error:

  -----------------------------------------
  patches/smp-v2/v2-0000-cover-letter.patch
  -----------------------------------------

  ERROR: Does not appear to be a unified-diff format patch

The logic to suppress the unified-diff check for cover letters is there
but is checking $file instead of $filename.  Fix the variable to use the
correct one.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Stafford Horne <[email protected]>
Acked-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

m32r: fix build failure

The allmodconfig build of m32r is failing with the error:

  lib/mpi/mpih-div.o: In function 'mpihelp_divrem':
  mpih-div.c:(.text+0x40): undefined reference to 'abort'
  mpih-div.c:(.text+0x40): relocation truncated to fit:
R_M32R_26_PCREL_RELA against undefined symbol 'abort'

The function 'abort' was never defined for the m32r architecture.

Create 'abort' as is done in other arch like 'arm' and 'unicore32'.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/ratelimit.c: use deferred printk() version

printk_ratelimit() invokes ___ratelimit() which may invoke a normal
printk() (pr_warn() in this particular case) to warn about suppressed
output.  Given that printk_ratelimit() may be called from anywhere, that
pr_warn() is dangerous - it may end up deadlocking the system.  Fix
___ratelimit() by using deferred printk().

Sasha reported the following lockdep error:

: Unregister pv shared memory for cpu 8
: select_fallback_rq: 3 callbacks suppressed
: process 8583 (trinity-c78) no longer affine to cpu8
:
: ======================================================
: WARNING: possible circular locking dependency detected
: 4.14.0-rc2-next-20170927+ #252 Not tainted
: ------------------------------------------------------
: migration/8/62 is trying to acquire lock:
: (&port_lock_key){-.-.}, at: serial8250_console_write()
:
: but task is already holding lock:
: (&rq->lock){-.-.}, at: sched_cpu_dying()
:
: which lock already depends on the new lock.
:
:
: the existing dependency chain (in reverse order) is:
:
: -> #3 (&rq->lock){-.-.}:
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock()
: task_fork_fair()
: sched_fork()
: copy_process.part.31()
: _do_fork()
: kernel_thread()
: rest_init()
: start_kernel()
: x86_64_start_reservations()
: x86_64_start_kernel()
: verify_cpu()
:
: -> #2 (&p->pi_lock){-.-.}:
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock_irqsave()
: try_to_wake_up()
: default_wake_function()
: woken_wake_function()
: __wake_up_common()
: __wake_up_common_lock()
: __wake_up()
: tty_wakeup()
: tty_port_default_wakeup()
: tty_port_tty_wakeup()
: uart_write_wakeup()
: serial8250_tx_chars()
: serial8250_handle_irq.part.25()
: serial8250_default_handle_irq()
: serial8250_interrupt()
: __handle_irq_event_percpu()
: handle_irq_event_percpu()
: handle_irq_event()
: handle_level_irq()
: handle_irq()
: do_IRQ()
: ret_from_intr()
: native_safe_halt()
: default_idle()
: arch_cpu_idle()
: default_idle_call()
: do_idle()
: cpu_startup_entry()
: rest_init()
: start_kernel()
: x86_64_start_reservations()
: x86_64_start_kernel()
: verify_cpu()
:
: -> #1 (&tty->write_wait){-.-.}:
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock_irqsave()
: __wake_up_common_lock()
: __wake_up()
: tty_wakeup()
: tty_port_default_wakeup()
: tty_port_tty_wakeup()
: uart_write_wakeup()
: serial8250_tx_chars()
: serial8250_handle_irq.part.25()
: serial8250_default_handle_irq()
: serial8250_interrupt()
: __handle_irq_event_percpu()
: handle_irq_event_percpu()
: handle_irq_event()
: handle_level_irq()
: handle_irq()
: do_IRQ()
: ret_from_intr()
: native_safe_halt()
: default_idle()
: arch_cpu_idle()
: default_idle_call()
: do_idle()
: cpu_startup_entry()
: rest_init()
: start_kernel()
: x86_64_start_reservations()
: x86_64_start_kernel()
: verify_cpu()
:
: -> #0 (&port_lock_key){-.-.}:
: check_prev_add()
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock_irqsave()
: serial8250_console_write()
: univ8250_console_write()
: console_unlock()
: vprintk_emit()
: vprintk_default()
: vprintk_func()
: printk()
: ___ratelimit()
: __printk_ratelimit()
: select_fallback_rq()
: sched_cpu_dying()
: cpuhp_invoke_callback()
: take_cpu_down()
: multi_cpu_stop()
: cpu_stopper_thread()
: smpboot_thread_fn()
: kthread()
: ret_from_fork()
:
: other info that might help us debug this:
:
: Chain exists of:
:   &port_lock_key --> &p->pi_lock --> &rq->lock
:
:  Possible unsafe locking scenario:
:
:        CPU0                    CPU1
:        ----                    ----
:   lock(&rq->lock);
:                                lock(&p->pi_lock);
:                                lock(&rq->lock);
:   lock(&port_lock_key);
:
:  *** DEADLOCK ***
:
: 4 locks held by migration/8/62:
: #0: (&p->pi_lock){-.-.}, at: sched_cpu_dying()
: #1: (&rq->lock){-.-.}, at: sched_cpu_dying()
: #2: (printk_ratelimit_state.lock){....}, at: ___ratelimit()
: #3: (console_lock){+.+.}, at: vprintk_emit()
:
: stack backtrace:
: CPU: 8 PID: 62 Comm: migration/8 Not tainted 4.14.0-rc2-next-20170927+ #252
: Call Trace:
: dump_stack()
: print_circular_bug()
: check_prev_add()
: ? add_lock_to_list.isra.26()
: ? check_usage()
: ? kvm_clock_read()
: ? kvm_sched_clock_read()
: ? sched_clock()
: ? check_preemption_disabled()
: __lock_acquire()
: ? __lock_acquire()
: ? add_lock_to_list.isra.26()
: ? debug_check_no_locks_freed()
: ? memcpy()
: lock_acquire()
: ? serial8250_console_write()
: _raw_spin_lock_irqsave()
: ? serial8250_console_write()
: serial8250_console_write()
: ? serial8250_start_tx()
: ? lock_acquire()
: ? memcpy()
: univ8250_console_write()
: console_unlock()
: ? __down_trylock_console_sem()
: vprintk_emit()
: vprintk_default()
: vprintk_func()
: printk()
: ? show_regs_print_info()
: ? lock_acquire()
: ___ratelimit()
: __printk_ratelimit()
: select_fallback_rq()
: sched_cpu_dying()
: ? sched_cpu_starting()
: ? rcutree_dying_cpu()
: ? sched_cpu_starting()
: cpuhp_invoke_callback()
: ? cpu_disable_common()
: take_cpu_down()
: ? trace_hardirqs_off_caller()
: ? cpuhp_invoke_callback()
: multi_cpu_stop()
: ? __this_cpu_preempt_check()
: ? cpu_stop_queue_work()
: cpu_stopper_thread()
: ? cpu_stop_create()
: smpboot_thread_fn()
: ? sort_range()
: ? schedule()
: ? __kthread_parkme()
: kthread()
: ? sort_range()
: ? kthread_create_on_node()
: ret_from_fork()
: process 9121 (trinity-c78) no longer affine to cpu8
: smpboot: CPU 8 is now offline

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 6b1d174b0c27b ("ratelimit: extend to print suppressed messages on release")
Signed-off-by: Sergey Senozhatsky <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/params.c: improve STANDARD_PARAM_DEF readability

Align the parameters passed to STANDARD_PARAM_DEF for clarity.

Link: http://lkml.kernel.org/r/20170928162728.756143cc@endymion
Signed-off-by: Jean Delvare <[email protected]>
Suggested-by: Ingo Molnar <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/params.c: fix an overflow in param_attr_show

Function param_attr_show could overflow the buffer it is operating on.

The buffer size is PAGE_SIZE, and the string returned by
attribute->param->ops->get is generated by scnprintf(buffer, PAGE_SIZE,
...) so it could be PAGE_SIZE - 1 long, with the terminating '\0' at the
very end of the buffer. Calling strcat(..., "\n") on this isn't safe, as
the '\0' will be replaced by '\n' (OK) and then another '\0' will be added
past the end of the buffer (not OK.)

Simply add the trailing '\n' when writing the attribute contents to the
buffer originally. This is safe, and also faster.

Credits to Teradata for discovering this issue.

Link: http://lkml.kernel.org/r/20170928162602.60c379c7@endymion
Signed-off-by: Jean Delvare <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/params.c: fix the maximum length in param_get_string

The length parameter of strlcpy() is supposed to reflect the size of the
target buffer, not of the source string. Harmless in this case as the
buffer is PAGE_SIZE long and the source string is always much shorter than
this, but conceptually wrong, so let's fix it.

Link: http://lkml.kernel.org/r/20170928162515.24846b4f@endymion
Signed-off-by: Jean Delvare <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/memory_hotplug: define find_{smallest|biggest}_section_pfn as unsigned long

find_{smallest|biggest}_section_pfn()s find the smallest/biggest section
and return the pfn of the section. But the functions are defined as int.
So the functions always return 0x00000000 - 0xffffffff. It means if
memory address is over 16TB, the functions does not work correctly.

To handle 64 bit value, the patch defines
find_{smallest|biggest}_section_pfn() as unsigned long.

Fixes: 815121d2b5cd ("memory_hotplug: clear zone when removing the memory")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yasuaki Ishimatsu <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Xishi Qiu <[email protected]>
Cc: Reza Arbab <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/memory_hotplug: change pfn_to_section_nr/section_nr_to_pfn macro to inline function

pfn_to_section_nr() and section_nr_to_pfn() are defined as macro.
pfn_to_section_nr() has no issue even if it is defined as macro.  But
section_nr_to_pfn() has overflow issue if sec is defined as int.

section_nr_to_pfn() just shifts sec by PFN_SECTION_SHIFT.  If sec is
defined as unsigned long, section_nr_to_pfn() returns pfn as 64 bit value.
But if sec is defined as int, section_nr_to_pfn() returns pfn as 32 bit
value.

__remove_section() calculates start_pfn using section_nr_to_pfn() and
scn_nr defined as int.  So if hot-removed memory address is over 16TB,
overflow issue occurs and section_nr_to_pfn() does not calculate correct
pfn.

To make callers use proper arg, the patch changes the macros to inline
functions.

Fixes: 815121d2b5cd ("memory_hotplug: clear zone when removing the memory")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yasuaki Ishimatsu <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Xishi Qiu <[email protected]>
Cc: Reza Arbab <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/kcmp.c: drop branch leftover typo

The else branch been left over and escaped the source code refresh. Not
a problem but better clean it up.

Fixes: 0791e3644e5e ("kcmp: add KCMP_EPOLL_TFD mode to compare epoll target files")
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Eugene Syromiatnikov <[email protected]>
Signed-off-by: Cyrill Gorcunov <[email protected]>
Acked-by: Andrei Vagin <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

memremap: add scheduling point to devm_memremap_pages

devm_memremap_pages is initializing struct pages in for_each_device_pfn
and that can take quite some time.  We have even seen a soft lockup
triggering on a non preemptive kernel

  NMI watchdog: BUG: soft lockup - CPU#61 stuck for 22s! [kworker/u641:11:1808]
  [...]
  RIP: 0010:[<ffffffff8118b6b7>]  [<ffffffff8118b6b7>] devm_memremap_pages+0x327/0x430
  [...]
  Call Trace:
    pmem_attach_disk+0x2fd/0x3f0 [nd_pmem]
    nvdimm_bus_probe+0x64/0x110 [libnvdimm]
    driver_probe_device+0x1f7/0x420
    bus_for_each_drv+0x52/0x80
    __device_attach+0xb0/0x130
    bus_probe_device+0x87/0xa0
    device_add+0x3fc/0x5f0
    nd_async_device_register+0xe/0x40 [libnvdimm]
    async_run_entry_fn+0x43/0x150
    process_one_work+0x14e/0x410
    worker_thread+0x116/0x490
    kthread+0xc7/0xe0
    ret_from_fork+0x3f/0x70

fix this by adding cond_resched every 1024 pages.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Johannes Thumshirn <[email protected]>
Tested-by: Johannes Thumshirn <[email protected]>
Cc: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm, page_alloc: add scheduling point to memmap_init_zone

memmap_init_zone gets a pfn range to initialize and it can be really
large resulting in a soft lockup on non-preemptible kernels

  NMI watchdog: BUG: soft lockup - CPU#31 stuck for 23s! [kworker/u642:5:1720]
  [...]
  task: ffff88ecd7e902c0 ti: ffff88eca4e50000 task.ti: ffff88eca4e50000
  RIP: move_pfn_range_to_zone+0x185/0x1d0
  [...]
  Call Trace:
    devm_memremap_pages+0x2c7/0x430
    pmem_attach_disk+0x2fd/0x3f0 [nd_pmem]
    nvdimm_bus_probe+0x64/0x110 [libnvdimm]
    driver_probe_device+0x1f7/0x420
    bus_for_each_drv+0x52/0x80
    __device_attach+0xb0/0x130
    bus_probe_device+0x87/0xa0
    device_add+0x3fc/0x5f0
    nd_async_device_register+0xe/0x40 [libnvdimm]
    async_run_entry_fn+0x43/0x150
    process_one_work+0x14e/0x410
    worker_thread+0x116/0x490
    kthread+0xc7/0xe0
    ret_from_fork+0x3f/0x70

Fix this by adding a scheduling point once per page block.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Johannes Thumshirn <[email protected]>
Tested-by: Johannes Thumshirn <[email protected]>
Cc: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm, memory_hotplug: add scheduling point to __add_pages

Patch series "mm, memory_hotplug: fix few soft lockups in memory
hotadd".

Johannes has noticed few soft lockups when adding a large nvdimm device.
All of them were caused by a long loop without any explicit cond_resched
which is a problem for !PREEMPT kernels.

The fix is quite straightforward.  Just make sure that cond_resched gets
called from time to time.

This patch (of 3):

__add_pages gets a pfn range to add and there is no upper bound for a
single call.  This is usually a memory block aligned size for the
regular memory hotplug - smaller sizes are usual for memory balloning
drivers, or the whole NUMA node for physical memory online.  There is no
explicit scheduling point in that code path though.

This can lead to long latencies while __add_pages is executed and we
have even seen a soft lockup report during nvdimm initialization with
!PREEMPT kernel

  NMI watchdog: BUG: soft lockup - CPU#11 stuck for 23s! [kworker/u641:3:832]
  [...]
  Workqueue: events_unbound async_run_entry_fn
  task: ffff881809270f40 ti: ffff881809274000 task.ti: ffff881809274000
  RIP: _raw_spin_unlock_irqrestore+0x11/0x20
  RSP: 0018:ffff881809277b10  EFLAGS: 00000286
  [...]
  Call Trace:
    sparse_add_one_section+0x13d/0x18e
    __add_pages+0x10a/0x1d0
    arch_add_memory+0x4a/0xc0
    devm_memremap_pages+0x29d/0x430
    pmem_attach_disk+0x2fd/0x3f0 [nd_pmem]
    nvdimm_bus_probe+0x64/0x110 [libnvdimm]
    driver_probe_device+0x1f7/0x420
    bus_for_each_drv+0x52/0x80
    __device_attach+0xb0/0x130
    bus_probe_device+0x87/0xa0
    device_add+0x3fc/0x5f0
    nd_async_device_register+0xe/0x40 [libnvdimm]
    async_run_entry_fn+0x43/0x150
    process_one_work+0x14e/0x410
    worker_thread+0x116/0x490
    kthread+0xc7/0xe0
    ret_from_fork+0x3f/0x70
  DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

Fix this by adding cond_resched once per each memory section in the
given pfn range.  Each section is constant amount of work which itself
is not too expensive but many of them will just add up.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Johannes Thumshirn <[email protected]>
Tested-by: Johannes Thumshirn <[email protected]>
Cc: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/idr.c: fix comment for idr_replace()

idr_replace() returns the old value on success, not 0.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: memcontrol: use vmalloc fallback for large kmem memcg arrays

For quick per-memcg indexing, slab caches and list_lru structures
maintain linear arrays of descriptors.  As the number of concurrent
memory cgroups in the system goes up, this requires large contiguous
allocations (8k cgroups = order-5, 16k cgroups = order-6 etc.) for every
existing slab cache and list_lru, which can easily fail on loaded
systems.  E.g.:

  mkdir: page allocation failure: order:5, mode:0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
  CPU: 1 PID: 6399 Comm: mkdir Not tainted 4.13.0-mm1-00065-g720bbe532b7c-dirty #481
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
  Call Trace:
   ? __alloc_pages_direct_compact+0x4c/0x110
   __alloc_pages_nodemask+0xf50/0x1430
   alloc_pages_current+0x60/0xc0
   kmalloc_order_trace+0x29/0x1b0
   __kmalloc+0x1f4/0x320
   memcg_update_all_list_lrus+0xca/0x2e0
   mem_cgroup_css_alloc+0x612/0x670
   cgroup_apply_control_enable+0x19e/0x360
   cgroup_mkdir+0x322/0x490
   kernfs_iop_mkdir+0x55/0x80
   vfs_mkdir+0xd0/0x120
   SyS_mkdirat+0x6c/0xe0
   SyS_mkdir+0x14/0x20
   entry_SYSCALL_64_fastpath+0x18/0xad
  Mem-Info:
  active_anon:2965 inactive_anon:19 isolated_anon:0
   active_file:100270 inactive_file:98846 isolated_file:0
   unevictable:0 dirty:0 writeback:0 unstable:0
   slab_reclaimable:7328 slab_unreclaimable:16402
   mapped:771 shmem:52 pagetables:278 bounce:0
   free:13718 free_pcp:0 free_cma:0

This output is from an artificial reproducer, but we have repeatedly
observed order-7 failures in production in the Facebook fleet.  These
systems become useless as they cannot run more jobs, even though there
is plenty of memory to allocate 128 individual pages.

Use kvmalloc and kvzalloc to fall back to vmalloc space if these arrays
prove too large for allocating them physically contiguous.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/sysctl.c: remove duplicate UINT_MAX check on do_proc_douintvec_conv()

do_proc_douintvec_conv() has two UINT_MAX checks, we can remove one.
This has no functional changes other than fixing a compiler warning:

kernel/sysctl.c:2190]: (warning) Identical condition '*lvalp>UINT_MAX', second condition is always false

Fixes: 4f2fec00afa60 ("sysctl: simplify unsigned int support")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Luis R. Rodriguez <[email protected]>
Reported-by: David Binderman <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/bitfield.h: remove 32bit from FIELD_GET comment block

I do not see anything that restricts this macro to 32 bit width.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/lz4: make arrays static const, reduces object code size

Don't populate the read-only arrays dec32table and dec64table on the
stack, instead make them both static const.  Makes the object code
smaller by over 10K bytes:

  Before:
     text    data     bss     dec     hex filename
    31500       0       0   31500    7b0c lib/lz4/lz4_decompress.o

  After:
     text    data     bss     dec     hex filename
    20237     176       0   20413    4fbd lib/lz4/lz4_decompress.o

(gcc version 7.2.0 x86_64)

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Cc: Christophe JAILLET <[email protected]>
Cc: Sven Schmidt <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exec: binfmt_misc: kill the onstack iname[BINPRM_BUF_SIZE] array

After the previous change "fmt" can't go away, we can kill
iname/iname_addr and use fmt->interpreter.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Ben Woodard <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jim Foraker <[email protected]>
Cc: <[email protected]>
Cc: Travis Gummels <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exec: binfmt_misc: fix race between load_misc_binary() and kill_node()

load_misc_binary() makes a local copy of fmt->interpreter under
entries_lock to avoid the race with kill_node() but this is not enough;
the whole Node can be freed after we drop entries_lock, not only the
->interpreter string.

Add dget/dput(fmt->dentry) to ensure bm_evict_inode() can't destroy/free
this Node.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Ben Woodard <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jim Foraker <[email protected]>
Cc: Travis Gummels <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exec: binfmt_misc: remove the confusing e->interp_file != NULL checks

If MISC_FMT_OPEN_FILE flag is set e->interp_file must be valid or we
have a bug which should not be silently ignored.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Ben Woodard <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jim Foraker <[email protected]>
Cc: <[email protected]>
Cc: Travis Gummels <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()

To ensure that load_misc_binary() can't use the partially destroyed
Node, see also the next patch.

The current logic looks wrong in any case, once we close interp_file it
doesn't make any sense to delay kfree(inode->i_private), this Node is no
longer valid. Even if the MISC_FMT_OPEN_FILE/interp_file checks were
not racy (they are), load_misc_binary() should not try to reopen
->interpreter if MISC_FMT_OPEN_FILE is set but ->interp_file is NULL.

And I can't understand why do we use filp_close(), not fput().

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Ben Woodard <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jim Foraker <[email protected]>
Cc: <[email protected]>
Cc: Travis Gummels <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exec: binfmt_misc: don't nullify Node->dentry in kill_node()

kill_node() nullifies/checks Node->dentry to avoid double free.  This
complicates the next changes and this is very confusing:

- we do not need to check dentry != NULL under entries_lock,
   kill_node() is always called under inode_lock(d_inode(root)) and we
   rely on this inode_lock() anyway, without this lock the
   MISC_FMT_OPEN_FILE cleanup could race with itself.

- if kill_inode() was already called and ->dentry == NULL we should not
   even try to close e->interp_file.

We can change bm_entry_write() to simply check !list_empty(list) before
kill_node.  Again, we rely on inode_lock(), in particular it saves us
from the race with bm_status_write(), another caller of kill_node().

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Ben Woodard <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jim Foraker <[email protected]>
Cc: <[email protected]>
Cc: Travis Gummels <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exec: load_script: kill the onstack interp[BINPRM_BUF_SIZE] array

Patch series "exec: binfmt_misc: fix use-after-free, kill
iname[BINPRM_BUF_SIZE]".

It looks like this code was always wrong, then commit 948b701a607f
("binfmt_misc: add persistent opened binary handler for containers")
added more problems.

This patch (of 6):

load_script() can simply use i_name instead, it points into bprm->buf[]
and nobody can change this memory until we call prepare_binprm().

The only complication is that we need to also change the signature of
bprm_change_interp() but this change looks good too.

While at it, do whitespace/style cleanups.

NOTE: the real motivation for this change is that people want to
increase BINPRM_BUF_SIZE, we need to change load_misc_binary() too but
this looks more complicated because afaics it is very buggy.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Travis Gummels <[email protected]>
Cc: Ben Woodard <[email protected]>
Cc: Jim Foraker <[email protected]>
Cc: <[email protected]>
Cc: Al Viro <[email protected]>
Cc: James Bottomley <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

userfaultfd: non-cooperative: fix fork use after free

When reading the event from the uffd, we put it on a temporary
fork_event list to detect if we can still access it after releasing and
retaking the event_wqh.lock.

If fork aborts and removes the event from the fork_event all is fine as
long as we're still in the userfault read context and fork_event head is
still alive.

We've to put the event allocated in the fork kernel stack, back from
fork_event list-head to the event_wqh head, before returning from
userfaultfd_ctx_read, because the fork_event head lifetime is limited to
the userfaultfd_ctx_read stack lifetime.

Forgetting to move the event back to its event_wqh place then results in
__remove_wait_queue(&ctx->event_wqh, &ewq->wq); in
userfaultfd_event_wait_completion to remove it from a head that has been
already freed from the reader stack.

This could only happen if resolve_userfault_fork failed (for example if
there are no file descriptors available to allocate the fork uffd). If
it succeeded it was put back correctly.

Furthermore, after find_userfault_evt receives a fork event, the forked
userfault context in fork_nctx and uwq->msg.arg.reserved.reserved1 can
be released by the fork thread as soon as the event_wqh.lock is
released. Taking a reference on the fork_nctx before dropping the lock
prevents an use after free in resolve_userfault_fork().

If the fork side aborted and it already released everything, we still
try to succeed resolve_userfault_fork(), if possible.

Fixes: 893e26e61d04eac9 ("userfaultfd: non-cooperative: Add fork() event")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrea Arcangeli <[email protected]>
Reported-by: Mark Rutland <[email protected]>
Tested-by: Mark Rutland <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: "Dr. David Alan Gilbert" <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/device-public-memory: fix edge case in _vm_normal_page()

With device public pages at the end of my memory space, I'm getting
output from _vm_normal_page():

  BUG: Bad page map in process migrate_pages  pte:c0800001ffff0d06 pmd:f95d3000
  addr:00007fff89330000 vm_flags:00100073 anon_vma:c0000000fa899320 mapping:          (null) index:7fff8933
  file:          (null) fault:          (null) mmap:          (null) readpage:          (null)
  CPU: 0 PID: 13963 Comm: migrate_pages Tainted: P    B      OE 4.14.0-rc1-wip #155
  Call Trace:
     dump_stack+0xb0/0xf4 (unreliable)
     print_bad_pte+0x28c/0x340
     _vm_normal_page+0xc0/0x140
     zap_pte_range+0x664/0xc10
     unmap_page_range+0x318/0x670
     unmap_vmas+0x74/0xe0
     exit_mmap+0xe8/0x1f0
     mmput+0xac/0x1f0
     do_exit+0x348/0xcd0
     do_group_exit+0x5c/0xf0
     SyS_exit_group+0x1c/0x20
     system_call+0x58/0x6c

The pfn causing this is the very last one.  Correct the bounds check
accordingly.

Fixes: df6ad69838fc ("mm/device-public-memory: device memory cache coherent with CPU")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Reza Arbab <[email protected]>
Reviewed-by: Jérôme Glisse <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: fix data corruption caused by lazyfree page

MADV_FREE clears pte dirty bit and then marks the page lazyfree (clear
SwapBacked).  There is no lock to prevent the page is added to swap
cache between these two steps by page reclaim.  If page reclaim finds
such page, it will simply add the page to swap cache without pageout the
page to swap because the page is marked as clean.  Next time, page fault
will read data from the swap slot which doesn't have the original data,
so we have a data corruption.  To fix issue, we mark the page dirty and
pageout the page.

However, we shouldn't dirty all pages which is clean and in swap cache.
swapin page is swap cache and clean too.  So we only dirty page which is
added into swap cache in page reclaim, which shouldn't be swapin page.
As Minchan suggested, simply dirty the page in add_to_swap can do the
job.

Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
Link: http://lkml.kernel.org/r/08c84256b007bf3f63c91d94383bd9eb6fee2daa.1506446061.git.shli@fb.com
Signed-off-by: Shaohua Li <[email protected]>
Reported-by: Artem Savkov <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: <[email protected]> [4.12+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: avoid marking swap cached page as lazyfree

MADV_FREE clears pte dirty bit and then marks the page lazyfree (clear
SwapBacked).  There is no lock to prevent the page is added to swap
cache between these two steps by page reclaim.  Page reclaim could add
the page to swap cache and unmap the page.  After page reclaim, the page
is added back to lru.  At that time, we probably start draining per-cpu
pagevec and mark the page lazyfree.  So the page could be in a state
with SwapBacked cleared and PG_swapcache set.  Next time there is a
refault in the virtual address, do_swap_page can find the page from swap
cache but the page has PageSwapCache false because SwapBacked isn't set,
so do_swap_page will bail out and do nothing.  The task will keep
running into fault handler.

Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
Link: http://lkml.kernel.org/r/6537ef3814398c0073630b03f176263bc81f0902.1506446061.git.shli@fb.com
Signed-off-by: Shaohua Li <[email protected]>
Reported-by: Artem Savkov <[email protected]>
Tested-by: Artem Savkov <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: <[email protected]> [4.12+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: have filemap_check_and_advance_wb_err clear AS_EIO/AS_ENOSPC

Eryu noticed that he could sometimes get a leftover error reported when
it shouldn't be on fsync with ext2 and non-journalled ext4.

The problem is that writeback_single_inode still uses filemap_fdatawait.
That picks up a previously set AS_EIO flag, which would ordinarily have
been cleared before.

Since we're mostly using this function as a replacement for
filemap_check_errors, have filemap_check_and_advance_wb_err clear AS_EIO
and AS_ENOSPC when reporting an error. That should allow the new
function to better emulate the behavior of the old with respect to these
flags.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jeff Layton <[email protected]>
Reported-by: Eryu Guan <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

m32r: define CPU_BIG_ENDIAN

The build of m32r allmodconfig is giving lots of build warnings about:

include/linux/byteorder/big_endian.h:7:2:
warning: #warning inconsistent configuration,
needs CONFIG_CPU_BIG_ENDIAN [-Wcpp]
#warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN

Define CPU_BIG_ENDIAN like the way CPU_LITTLE_ENDIAN is defined.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

zram: fix null dereference of handle

In testing I found handle passed to zs_map_object in __zram_bvec_read is
NULL so eh kernel goes oops in pin_object().

The reason is there is no routine to check the slot's freeing after
getting the slot's lock. This patch fixes it.

[[email protected]: v2]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 1f7319c74275 ("zram: partial IO refactoring")
Signed-off-by: Minchan Kim <[email protected]>
Reviewed-by: Sergey Senozhatsky <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: fix RODATA_TEST failure "rodata_test: test data was not read only"

On powerpc, RODATA_TEST fails with message the following messages:

  Freeing unused kernel memory: 528K
  rodata_test: test data was not read only

This is because GCC allocates it to .data section:

  c0695034 g     O .data 00000004 rodata_test_data

Since commit 056b9d8a7692 ("mm: remove rodata_test_data export, add
pr_fmt"), rodata_test_data is used only inside rodata_test.c By
declaring it static, it gets properly allocated into .rodata section
instead of .data:

  c04df710 l     O .rodata 00000004 rodata_test_data

Fixes: 056b9d8a7692 ("mm: remove rodata_test_data export, add pr_fmt")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Jinbum Park <[email protected]>
Cc: Segher Boessenkool <[email protected]>
Cc: David Laight <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio: remove global irq spinlocks from the subsystem

Locking of config and doorbell operations should be done only if the
underlying hardware requires it.

This patch removes the global spinlocks from the rapidio subsystem and
moves them to the mport drivers (fsl_rio and tsi721), only to the
necessary places. For example, local config space read and write
operations (lcread/lcwrite) are atomic in all existing drivers, so there
should be no need for locking, while the cread/cwrite operations which
generate maintenance transactions need to be synchronized with a lock.

Later, each driver could chose to use a per-port lock instead of a
global one, or even more granular locking.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ioan Nicu <[email protected]>
Signed-off-by: Frank Kunz <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: meminit: mark init_reserved_page as __meminit

The function is called from __meminit context and calls other __meminit
functions but isn't it self mark as such today:

  WARNING: vmlinux.o(.text.unlikely+0x4516): Section mismatch in reference from the function init_reserved_page() to the function .meminit.text:early_pfn_to_nid()
  The function init_reserved_page() references the function __meminit early_pfn_to_nid().
  This is often because init_reserved_page lacks a __meminit annotation or the annotation of early_pfn_to_nid is wrong.

On most compilers, we don't notice this because the function gets
inlined all the time.  Adding __meminit here fixes the harmless warning
for the old versions and is generally the correct annotation.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 7e18adb4f80b ("mm: meminit: initialise remaining struct pages in parallel with kswapd")
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

z3fold: fix stale list handling

Fix the situation when clear_bit() is called for page->private before
the page pointer is actually assigned. While at it, remove work_busy()
check because it is costly and does not give 100% guarantee anyway.

Signed-off-by: Vitaly Wool <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm,compaction: serialize waitqueue_active() checks (for real)

Andrea brought to my attention that the L->{L,S} guarantees are
completely bogus for this case. I was looking at the diagram, from the
offending commit, when that _is_ the race, we had the load reordered
already.

What we need is at least S->L semantics, thus simply use
wq_has_sleeper() to serialize the call for good.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 46acef048a6 (mm,compaction: serialize waitqueue_active() checks)
Signed-off-by: Davidlohr Bueso <[email protected]>
Reported-by: Andrea Parri <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

android: binder: drop lru lock in isolate callback

Drop the global lru lock in isolate callback before calling
zap_page_range which calls cond_resched, and re-acquire the global lru
lock before returning. Also change return code to LRU_REMOVED_RETRY.

Use mmput_async when fail to acquire mmap sem in an atomic context.

Fix "BUG: sleeping function called from invalid context"
errors when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.

Also restore mmput_async, which was initially introduced in commit
ec8d7c14ea14 ("mm, oom_reaper: do not mmput synchronously from the oom
reaper context"), and was removed in commit 212925802454 ("mm: oom: let
oom_reap_task and exit_mmap run concurrently").

Link: http://lkml.kernel.org/r/[email protected]
Fixes: f2517eb76f1f2 ("android: binder: Add global lru shrinker to binder")
Signed-off-by: Sherry Yang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Reported-by: Kyle Yan <[email protected]>
Acked-by: Arve Hjønnevåg <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Martijn Coenen <[email protected]>
Cc: Todd Kjos <[email protected]>
Cc: Riley Andrews <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Hoeun Ryu <[email protected]>
Cc: Christopher Lameter <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/memcg: avoid page count check for zone device

Fix for 4.14, zone device page always have an elevated refcount of one
and thus page count sanity check in uncharge_page() is inappropriate for
them.

[[email protected]: nano-optimize VM_BUG_ON in uncharge_page]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jérôme Glisse <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Evgeny Baskakov <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm, memcg: remove hotplug locking from try_charge

The following lockdep splat has been noticed during LTP testing

  ======================================================
  WARNING: possible circular locking dependency detected
  4.13.0-rc3-next-20170807 #12 Not tainted
  ------------------------------------------------------
  a.out/4771 is trying to acquire lock:
   (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff812b4668>] drain_all_stock.part.35+0x18/0x140

  but task is already holding lock:
   (&mm->mmap_sem){++++++}, at: [<ffffffff8106eb35>] __do_page_fault+0x175/0x530

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (&mm->mmap_sem){++++++}:
         lock_acquire+0xc9/0x230
         __might_fault+0x70/0xa0
         _copy_to_user+0x23/0x70
         filldir+0xa7/0x110
         xfs_dir2_sf_getdents.isra.10+0x20c/0x2c0 [xfs]
         xfs_readdir+0x1fa/0x2c0 [xfs]
         xfs_file_readdir+0x30/0x40 [xfs]
         iterate_dir+0x17a/0x1a0
         SyS_getdents+0xb0/0x160
         entry_SYSCALL_64_fastpath+0x1f/0xbe

  -> #2 (&type->i_mutex_dir_key#3){++++++}:
         lock_acquire+0xc9/0x230
         down_read+0x51/0xb0
         lookup_slow+0xde/0x210
         walk_component+0x160/0x250
         link_path_walk+0x1a6/0x610
         path_openat+0xe4/0xd50
         do_filp_open+0x91/0x100
         file_open_name+0xf5/0x130
         filp_open+0x33/0x50
         kernel_read_file_from_path+0x39/0x80
         _request_firmware+0x39f/0x880
         request_firmware_direct+0x37/0x50
         request_microcode_fw+0x64/0xe0
         reload_store+0xf7/0x180
         dev_attr_store+0x18/0x30
         sysfs_kf_write+0x44/0x60
         kernfs_fop_write+0x113/0x1a0
         __vfs_write+0x37/0x170
         vfs_write+0xc7/0x1c0
         SyS_write+0x58/0xc0
         do_syscall_64+0x6c/0x1f0
         return_from_SYSCALL_64+0x0/0x7a

  -> #1 (microcode_mutex){+.+.+.}:
         lock_acquire+0xc9/0x230
         __mutex_lock+0x88/0x960
         mutex_lock_nested+0x1b/0x20
         microcode_init+0xbb/0x208
         do_one_initcall+0x51/0x1a9
         kernel_init_freeable+0x208/0x2a7
         kernel_init+0xe/0x104
         ret_from_fork+0x2a/0x40

  -> #0 (cpu_hotplug_lock.rw_sem){++++++}:
         __lock_acquire+0x153c/0x1550
         lock_acquire+0xc9/0x230
         cpus_read_lock+0x4b/0x90
         drain_all_stock.part.35+0x18/0x140
         try_charge+0x3ab/0x6e0
         mem_cgroup_try_charge+0x7f/0x2c0
         shmem_getpage_gfp+0x25f/0x1050
         shmem_fault+0x96/0x200
         __do_fault+0x1e/0xa0
         __handle_mm_fault+0x9c3/0xe00
         handle_mm_fault+0x16e/0x380
         __do_page_fault+0x24a/0x530
         do_page_fault+0x30/0x80
         page_fault+0x28/0x30

  other info that might help us debug this:

  Chain exists of:
    cpu_hotplug_lock.rw_sem --> &type->i_mutex_dir_key#3 --> &mm->mmap_sem

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&mm->mmap_sem);
                                 lock(&type->i_mutex_dir_key#3);
                                 lock(&mm->mmap_sem);
    lock(cpu_hotplug_lock.rw_sem);

   *** DEADLOCK ***

  2 locks held by a.out/4771:
   #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8106eb35>] __do_page_fault+0x175/0x530
   #1:  (percpu_charge_mutex){+.+...}, at: [<ffffffff812b4c97>] try_charge+0x397/0x6e0

The problem is very similar to the one fixed by commit a459eeb7b852
("mm, page_alloc: do not depend on cpu hotplug locks inside the
allocator").  We are taking hotplug locks while we can be sitting on top
of basically arbitrary locks.  This just calls for problems.

We can get rid of {get,put}_online_cpus, fortunately.  We do not have to
be worried about races with memory hotplug because drain_local_stock,
which is called from both the WQ draining and the memory hotplug
contexts, is always operating on the local cpu stock with IRQs disabled.

The only thing to be careful about is that the target memcg doesn't
vanish while we are still in drain_all_stock so take a reference on it.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Artem Savkov <[email protected]>
Tested-by: Artem Savkov <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm, oom_reaper: skip mm structs with mmu notifiers

Andrea has noticed that the oom_reaper doesn't invalidate the range via
mmu notifiers (mmu_notifier_invalidate_range_start/end) and that can
corrupt the memory of the kvm guest for example.

tlb_flush_mmu_tlbonly already invokes mmu notifiers but that is not
sufficient as per Andrea:

"mmu_notifier_invalidate_range cannot be used in replacement of
  mmu_notifier_invalidate_range_start/end. For KVM
  mmu_notifier_invalidate_range is a noop and rightfully so. A MMU
  notifier implementation has to implement either ->invalidate_range
  method or the invalidate_range_start/end methods, not both. And if you
  implement invalidate_range_start/end like KVM is forced to do, calling
  mmu_notifier_invalidate_range in common code is a noop for KVM.

  For those MMU notifiers that can get away only implementing
  ->invalidate_range, the ->invalidate_range is implicitly called by
  mmu_notifier_invalidate_range_end(). And only those secondary MMUs
  that share the same pagetable with the primary MMU (like AMD iommuv2)
  can get away only implementing ->invalidate_range"

As the callback is allowed to sleep and the implementation is out of
hand of the MM it is safer to simply bail out if there is an mmu
notifier registered.  In order to not fail too early make the
mm_has_notifiers check under the oom_lock and have a little nap before
failing to give the current oom victim some more time to exit.

[[email protected]: coding-style fixes]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: aac453635549 ("mm, oom: introduce oom reaper")
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Andrea Arcangeli <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

z3fold: fix potential race in z3fold_reclaim_page

It is possible that on a (partially) unsuccessful page reclaim,
kref_put() called in z3fold_reclaim_page() does not yield page release,
but the page is released shortly afterwards by another thread. Then
z3fold_reclaim_page() would try to list_add() that (released) page again
which is obviously a bug.

To avoid that, spin_lock() has to be taken earlier, before the
kref_put() call mentioned earlier.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vitaly Wool <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sh: sh7269: remove nonexistent GPIO_PH[0-7] to fix pinctrl registration

Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices.  If enum
values are defined, but never used, pinmux_pins[] contains (zero-filled)
holes.  Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:

    sh-pfc pfc-sh7722: pin 0 already registered
    sh-pfc pfc-sh7722: error during pin registration
    sh-pfc pfc-sh7722: could not register: -22
    sh-pfc: probe of pfc-sh7722 failed with error -22

Remove GPIO_PH[0-7] from the enum to fix this.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: ef0fa5331a73e479 ("sh: Add pinmux for sh7269")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Yoshihiro Shimoda <[email protected]>
Cc: Jacopo Mondi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sh: sh7264: remove nonexistent GPIO_PH[0-7] to fix pinctrl registration

Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices.  If enum
values are defined, but never used, pinmux_pins[] contains (zero-filled)
holes.  Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:

    sh-pfc pfc-sh7722: pin 0 already registered
    sh-pfc pfc-sh7722: error during pin registration
    sh-pfc pfc-sh7722: could not register: -22
    sh-pfc: probe of pfc-sh7722 failed with error -22

Remove GPIO_PH[0-7] from the enum to fix this.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 41797f75486d8ca3 ("sh: Add pinmux for sh7264")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Cc: Jacopo Mondi <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Yoshihiro Shimoda <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sh: sh7757: remove nonexistent GPIO_PT[JLNQ]7_RESV to fix pinctrl registration

Commit 3810e96056ff ("sh: modify pinmux for SH7757 2nd cut") renamed
GPIO_PT[JLNQ]7 to GPIO_PT[JLNQ]7_RESV, and removed the existing users
from the pinmux_pins[] array.

However, pinmux_pins[] is initialized through PINMUX_GPIO(), using
designated array initializers, where the GPIO_* enums serve as indices.
Hence entries were not really removed, but replaced by (zero-filled)
holes.  Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:

    sh-pfc pfc-sh7722: pin 0 already registered
    sh-pfc pfc-sh7722: error during pin registration
    sh-pfc pfc-sh7722: could not register: -22
    sh-pfc: probe of pfc-sh7722 failed with error -22

Remove GPIO_PT[JLNQ]7_RESV from the enum to fix this.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 3810e96056ffddf6 ("sh: modify pinmux for SH7757 2nd cut")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Cc: Jacopo Mondi <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Yoshihiro Shimoda <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sh: sh7722: remove nonexistent GPIO_PTQ7 to fix pinctrl registration

Patch series "sh: sh7722/sh7757i/sh7264/sh7269: Fix pinctrl registration",
v2.

Magnus Damm reported that on sh7722/Migo-R, pinctrl registration fails
with:

    sh-pfc pfc-sh7722: pin 0 already registered
    sh-pfc pfc-sh7722: error during pin registration
    sh-pfc pfc-sh7722: could not register: -22
    sh-pfc: probe of pfc-sh7722 failed with error -22

pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices.  Apparently
GPIO_PTQ7 was defined in the enum, but never used.  If enum values are
defined, but never used, pinmux_pins[] contains (zero-filled) holes.
Hence such entries are treated as pin zero, which was registered before,
and pinctrl registration fails.

I can't see how this ever worked, as at the time of commit f5e25ae52fef
("sh-pfc: Add sh7722 pinmux support"), pinmux_gpios[] in
drivers/pinctrl/sh-pfc/pfc-sh7722.c already had the hole, and
drivers/pinctrl/core.c already had the check.

Some scripting revealed a few more broken drivers:
  - sh7757 has four holes, due to nonexistent GPIO_PT[JLNQ]7_RESV.
  - sh7264 and sh7269 define GPIO_PH[0-7], but don't use it with
    PINMUX_GPIO().

Patch 1 fixes the issue on sh7722, and was tested.  Patches 3-4 should
fix the issue on the other 3 SoCs, but was untested due to lack of
hardware.

This patch (of 4):

On sh7722/Migo-R, pinctrl registration fails with:

    sh-pfc pfc-sh7722: pin 0 already registered
    sh-pfc pfc-sh7722: error during pin registration
    sh-pfc pfc-sh7722: could not register: -22
    sh-pfc: probe of pfc-sh7722 failed with error -22

pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array
initializers, where the GPIO_* enums serve as indices.  As GPIO_PTQ7 is
defined in the enum, but never used, pinmux_pins[] contains a
(zero-filled) hole.  Hence this entry is treated as pin zero, which was
registered before, and pinctrl registration fails.

According to the datasheet, port PTQ7 does not exist.  Hence remove
GPIO_PTQ7 from the enum to fix this.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 8d7b5b0af7e070b9 ("sh: Add sh7722 pinmux code")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reported-by: Magnus Damm <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Tested-by: Jacopo Mondi <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Yoshihiro Shimoda <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm, hugetlb, soft_offline: save compound page order before page migration

This fixes a bug in madvise() where if you'd try to soft offline a
hugepage via madvise(), while walking the address range you'd end up,
using the wrong page offset due to attempting to get the compound order
of a former but presently not compound page, due to dissolving the huge
page (since commit c3114a84f7f9: "mm: hugetlb: soft-offline: dissolve
source hugepage after successful migration").

As a result I ended up with all my free pages except one being offlined.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: c3114a84f7f9 ("mm: hugetlb: soft-offline: dissolve source hugepage after successful migration")
Signed-off-by: Alexandru Moise <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Shaohua Li <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ksm: fix unlocked iteration over vmas in cmp_and_merge_page()

In this place mm is unlocked, so vmas or list may change. Down read
mmap_sem to protect them from modifications.

Link: http://lkml.kernel.org/r/150512788393.10691.8868381099691121308.stgit@localhost.localdomain
Fixes: e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
Signed-off-by: Kirill Tkhai <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: zhong jiang <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/mm.h: fix typo in VM_MPX definition

There's a typo in recent change of VM_MPX definition.  We want it to be
VM_HIGH_ARCH_4, not VM_HIGH_ARCH_BIT_4.

This bug does cause visible regressions.  In arch_vma_name the vmflags
are tested against VM_MPX.  With the incorrect value of VM_MPX, a number
of vmas (such as the stack) test positive and end up being marked as
"[mpx]" in /proc/N/maps instead of their correct names.

This confuses tools like rr which expect to be able to find familiar
vmas.

Fixes: df3735c5b40f ("x86,mpx: make mpx depend on x86-64 to free up VMA flag")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Kyle Huey <[email protected]>
Cc: <[email protected]> [4.14+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

scripts/spelling.txt: add more spelling mistakes to spelling.txt

Here are some of the more spelling mistakes and typos that I've found
while fixing up spelling mistakes in kernel error message text over the
past eight weeks.

[[email protected]: s/|/||/, per Joe]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Ross Zwisler <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/params.c: align add_sysfs_param documentation with code

This parameter is named kp, so the documentation should use that.

Fixes: 9b473de87209 ("param: Fix duplicate module prefixes")
Link: http://lkml.kernel.org/r/20170919142656.64aea59e@endymion
Signed-off-by: Jean Delvare <[email protected]>
Acked-by: Rusty Russell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

alpha: fix build failures

The build of alpha allmodconfig is giving error:

arch/alpha/include/asm/mmu_context.h: In function 'ev5_switch_mm':
arch/alpha/include/asm/mmu_context.h:160:2: error:
implicit declaration of function 'task_thread_info';
did you mean 'init_thread_info'? [-Werror=implicit-function-declaration]

The file 'mmu_context.h' needed an extra header file.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Sudip Mukherjee <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

bpf: fix bpf_tail_call() x64 JIT

- bpf prog_array just like all other types of bpf array accepts 32-bit index.
Clarify that in the comment.
- fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes
- tighten corresponding check in the interpreter to stay consistent

The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag
in commit 96eabe7a40aa in 4.14. Before that the map_flags would stay zero and
though JIT code is wrong it will check bounds correctly.
Hence two fixes tags. All other JITs don't have this problem.

Signed-off-by: Alexei Starovoitov <[email protected]>
Fixes: 96eabe7a40aa ("bpf: Allow selecting numa node during map creation")
Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper")
Acked-by: Daniel Borkmann <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: dwmac-rk: Add RK3128 GMAC support

Add constants and callback functions for the dwmac on rk3128 soc.
As can be seen, the base structure is the same, only registers
and the bits in them moved slightly.

Signed-off-by: David Wu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

blk-mq-debugfs: fix device sched directory for default scheduler

In blk_mq_debugfs_register(), I remembered to set up the per-hctx sched
directories if a default scheduler was already configured by
blk_mq_sched_init() from blk_mq_init_allocated_queue(), but I didn't do
the same for the device-wide sched directory. Fix it.

Fixes: d332ce091813 ("blk-mq-debugfs: allow schedulers to register debugfs attributes")
Signed-off-by: Omar Sandoval <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

null_blk: change configfs dependency to select

A recent commit made null_blk depend on configfs, which is kind of
annoying since you now have to find this dependency and enable that
as well. Discovered this since I no longer had null_blk available
on a box I needed to debug, since it got killed when the config
updated after the configfs change was merged.

Fixes: 3bf2bd20734e ("nullb: add configfs interface")
Reviewed-by: Shaohua Li <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

blk-throttle: fix possible io stall when upgrade to max

There is a case which will lead to io stall. The case is described as
follows.
/test1
  |-subtest1
/test2
  |-subtest2
And subtest1 and subtest2 each has 32 queued bios already.

Now upgrade to max. In throtl_upgrade_state, it will try to dispatch
bios as follows:
1) tg=subtest1, do nothing;
2) tg=test1, transfer 32 queued bios from subtest1 to test1; no pending
left, no need to schedule next dispatch;
3) tg=subtest2, do nothing;
4) tg=test2, transfer 32 queued bios from subtest2 to test2; no pending
left, no need to schedule next dispatch;
5) tg=/, transfer 8 queued bios from test1 to /, 8 queued bios from
test2 to /, 8 queued bios from test1 to /, and 8 queued bios from test2
to /; note that test1 and test2 each still has 16 queued bios left;
6) tg=/, try to schedule next dispatch, but since disptime is now
(update in tg_update_disptime, wait=0), pending timer is not scheduled
in fact;
7) In throtl_upgrade_state it totally dispatches 32 queued bios and with
32 left. test1 and test2 each has 16 queued bios;
8) throtl_pending_timer_fn sees the left over bios, but could do
nothing, because throtl_select_dispatch returns 0, and test1/test2 has
no pending tg.

The blktrace shows the following:
8,32   0        0     2.539007641     0  m   N throtl upgrade to max
8,32   0        0     2.539072267     0  m   N throtl /test2 dispatch nr_queued=16 read=0 write=16
8,32   7        0     2.539077142     0  m   N throtl /test1 dispatch nr_queued=16 read=0 write=16

So force schedule dispatch if there are pending children.

Reviewed-by: Shaohua Li <[email protected]>
Signed-off-by: Joseph Qi <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

rndis_host: support Novatel Verizon USB730L

Treat the ef/04/01 interface class/subclass/protocol combination used
by the Novatel Verizon USB730L (1410:9030) as a possible RNDIS
interface.

T:  Bus=01 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#= 17 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  3
P:  Vendor=1410 ProdID=9030 Rev=03.10
S:  Manufacturer=Novatel Wireless
S:  Product=MiFi USB730L
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 3 Cfg#= 1 Atr=80 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=ef(misc ) Sub=04 Prot=01 Driver=rndis_host
I:  If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
I:  If#= 2 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid

Once the network interface is brought up, the user just needs to run a
DHCP client to get IP address and routing setup.

As a side note, other Novatel Verizon USB730L models with the same
vid:pid end up exposing a standard ECM interface which doesn't require
any other kernel update to make it work.

Signed-off-by: Aleksander Morgado <[email protected]>
Reviewed-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm/i915: Fix DDI PHY init if it was already on

The common lane power down flag of a DPIO PHY has a funky semantic:
after the initial enabling of the PHY (so from a disabled state) this
flag will be clear. It will be set only after the PHY will be used for
the first time (for instance due to enabling the corresponding pipe) and
then become unused (due to disabling the pipe). During the initial PHY
enablement we don't know which of the above phases we are in, so move
the check for the flag where this is known, the HW readout code. This is
where the rest of lane power down status checks are done anyway.

This fixes at least a problem on GLK where after module reloading, the
common lane power down flag of PHY1 is set, but the PHY is actually
powered-on and properly set up. The GRC readout code for other PHYs will
hence think that PHY1 is not powered initially and disable it after the
GRC readout. This will cause the AUX power well related to PHY1 to get
disabled in a stuck state, timing out when we try to enable it later.

Cc: Ville Syrjala <[email protected]>
Fixes: e93da0a0137b ("drm/i915/bxt: Sanitiy check the PHY lane power down status")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102777
Signed-off-by: Imre Deak <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit e19c1eb885ac4186e64c7e484424124f3145318e)
Signed-off-by: Rodrigo Vivi <[email protected]>

ide: fix IRQ assignment for PCI bus order probing

We used to assign IRQs for all devices at boot-time, before any drivers
claimed devices.  The following commits:

  30fdfb929e82 ("PCI: Add a call to pci_assign_irq() in pci_device_probe()")
  0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks")

changed this so we now call pci_assign_irq() from pci_device_probe() when
we call a driver's probe method.

The ide_scan_pcibus() path (enabled by CONFIG_IDEPCI_PCIBUS_ORDER) bypasses
pci_device_probe() so it can guarantee devices are claimed in order of PCI
bus address.  It calls the driver's probe method directly, so it misses the
pci_assign_irq() call (and other PCI initialization functions), which
causes failures like this:

  ide0: disabled, no IRQ
  ide0: failed to initialize IDE interface
  ide0: disabling port
  cmd64x 0000:00:02.0: IDE controller (0x1095:0x0646 rev 0x07)
  CMD64x_IDE 0000:00:02.0: BAR 0: can't reserve [io  0x8050-0x8057]
  cmd64x 0000:00:02.0: can't reserve resources
  CMD64x_IDE: probe of 0000:00:02.0 failed with error -16
  ide_generic: please use "probe_mask=0x3f" module parameter for probing
  all legacy ISA IDE ports
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x94/0xd0
  sysfs: cannot create duplicate filename '/class/ide_port/ide0'
  ...

  Trace:
  [<fffffc000048c9f4>] sysfs_warn_dup+0x94/0xd0
  [<fffffc0000330928>] warn_slowpath_fmt+0x58/0x70
  [<fffffc000048c9f4>] sysfs_warn_dup+0x94/0xd0
  [<fffffc0000486d40>] kernfs_path_from_node+0x30/0x60
  [<fffffc00004874ac>] kernfs_put+0x16c/0x2c0
  [<fffffc00004874ac>] kernfs_put+0x16c/0x2c0
  [<fffffc000048d010>] sysfs_do_create_link_sd.isra.2+0x100/0x120
  [<fffffc00005b9d64>] device_add+0x2a4/0x7c0
  [<fffffc00005ba5cc>] device_create_groups_vargs+0x14c/0x170
  [<fffffc00005ba518>] device_create_groups_vargs+0x98/0x170
  [<fffffc00005ba690>] device_create+0x50/0x70
  [<fffffc00005df36c>] ide_host_register+0x48c/0xa00
  [<fffffc00005df330>] ide_host_register+0x450/0xa00
  [<fffffc00005ba2a0>] device_register+0x20/0x50
  [<fffffc00005df330>] ide_host_register+0x450/0xa00
  [<fffffc00005df944>] ide_host_add+0x64/0xe0
  [<fffffc000079b41c>] kobject_uevent_env+0x16c/0x710
  [<fffffc0000310288>] do_one_initcall+0x68/0x260
  [<fffffc00007b13bc>] kernel_init+0x1c/0x1a0
  ...
  ---[ end trace 24a70433c3e4d374 ]---
  ide0: disabling port

Fix the IRQ allocation issue by calling pci_assign_irq() from
ide_scan_pcidev() before probing the IDE PCI drivers, so that IRQs for a
given PCI device are allocated for the IDE PCI drivers to use them for
device configuration.

Fixes: 30fdfb929e82 ("PCI: Add a call to pci_assign_irq() in pci_device_probe()")
Fixes: 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks")
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Bartlomiej Zolnierkiewicz <[email protected]>
Acked-by: David S. Miller <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>

ide: pci: free PCI BARs on initialization failure

Recent pci_assign_irq() changes uncovered a problem with missing freeing of
PCI BARs on PCI IDE host initialization failure:

  ide0: disabled, no IRQ
  ide0: failed to initialize IDE interface
  ide0: disabling port
  cmd64x 0000:00:02.0: IDE controller (0x1095:0x0646 rev 0x07)
  CMD64x_IDE 0000:00:02.0: BAR 0: can't reserve [io  0x8050-0x8057]
  cmd64x 0000:00:02.0: can't reserve resources
  CMD64x_IDE: probe of 0000:00:02.0 failed with error -16

Fix the problem by adding missing freeing of PCI BARs to
ide_setup_pci_controller() and ide_pci_init_two().

Fixes: 30fdfb929e82 ("PCI: Add a call to pci_assign_irq() in pci_device_probe()")
Fixes: 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks")
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
[bhelgaas: add Fixes:]
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>

ide: free hwif->portdev on hwif_init() failure

Recent pci_assign_irq() changes uncovered a problem with missing freeing of
ide_port class instance on hwif_init() failure in ide_host_register():

  ide0: disabled, no IRQ
  ide0: failed to initialize IDE interface
  ide0: disabling port
  cmd64x 0000:00:02.0: IDE controller (0x1095:0x0646 rev 0x07)
  CMD64x_IDE 0000:00:02.0: BAR 0: can't reserve [io  0x8050-0x8057]
  cmd64x 0000:00:02.0: can't reserve resources
  CMD64x_IDE: probe of 0000:00:02.0 failed with error -16
  ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x94/0xd0
  sysfs: cannot create duplicate filename '/class/ide_port/ide0'
  ...

  Trace:
  [<fffffc00003308a0>] __warn+0x160/0x190
  [<fffffc000048c9f4>] sysfs_warn_dup+0x94/0xd0
  [<fffffc0000330928>] warn_slowpath_fmt+0x58/0x70
  [<fffffc000048c9f4>] sysfs_warn_dup+0x94/0xd0
  [<fffffc0000486d40>] kernfs_path_from_node+0x30/0x60
  [<fffffc00004874ac>] kernfs_put+0x16c/0x2c0
  [<fffffc00004874ac>] kernfs_put+0x16c/0x2c0
  [<fffffc000048d010>] sysfs_do_create_link_sd.isra.2+0x100/0x120
  [<fffffc00005b9d64>] device_add+0x2a4/0x7c0
  [<fffffc00005ba5cc>] device_create_groups_vargs+0x14c/0x170
  [<fffffc00005ba518>] device_create_groups_vargs+0x98/0x170
  [<fffffc00005ba690>] device_create+0x50/0x70
  [<fffffc00005df36c>] ide_host_register+0x48c/0xa00
  [<fffffc00005df330>] ide_host_register+0x450/0xa00
  [<fffffc00005ba2a0>] device_register+0x20/0x50
  [<fffffc00005df330>] ide_host_register+0x450/0xa00
  [<fffffc00005df944>] ide_host_add+0x64/0xe0
  [<fffffc000079b41c>] kobject_uevent_env+0x16c/0x710
  [<fffffc0000310288>] do_one_initcall+0x68/0x260
  [<fffffc00007b13bc>] kernel_init+0x1c/0x1a0
  [<fffffc00007b13a0>] kernel_init+0x0/0x1a0
  [<fffffc0000311868>] ret_from_kernel_thread+0x18/0x20
  [<fffffc00007b13a0>] kernel_init+0x0/0x1a0

  ---[ end trace 24a70433c3e4d374 ]---
  ide0: disabling port

Fix the problem by adding missing code to ide_host_register().

Fixes: 30fdfb929e82 ("PCI: Add a call to pci_assign_irq() in pci_device_probe()")
Fixes: 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks")
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
[bhelgaas: add Fixes:]
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>

MAINTAINERS: update list for NBD

[email protected] becomes [email protected], because
sourceforge is just a spamtrap these days.

Signed-off-by: Wouter Verhelst <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixlet from Tejun Heo:
"Minor documentation update"

* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
Documentation: core-api: minor workqueue.rst cleanups

Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
"The recent migration code updates assumed that migrations always
  execute from the top to the bottom once and didn't clean up internal
  states after each migration round; however, cgroup_transfer_tasks()
  repeats the inner steps multiple times and the garbage internal states
  from the previous iteration led to OOPS.

  Waiman fixed the bug by reinitializing the relevant states at the end
  of each migration round"

* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Reinit cgroup_taskset structure before cgroup_migrate_execute() returns

net: rtnetlink: fix info leak in RTM_GETSTATS call

When RTM_GETSTATS was added the fields of its header struct were not all
initialized when returning the result thus leaking 4 bytes of information
to user-space per rtnl_fill_statsinfo call, so initialize them now. Thanks
to Alexander Potapenko for the detailed report and bisection.

Reported-by: Alexander Potapenko <[email protected]>
Fixes: 10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump link stats")
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Acked-by: Roopa Prabhu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

Pull percpu fixes from Tejun Heo:
"Rather important fixes this time.

   - The new percpu area allocator had a subtle bug in how it iterates
     the memory regions and could skip viable areas, which led to
     allocation failures for module static percpu variables. Dennis
     fixed the bug and another non-critical one in stat calculation.

   - Mark noticed that the generic implementations of percpu local
     atomic reads aren't properly protected against irqs and there's a
     (slim) chance for split reads on some 32bit systems. Generic
     implementations are updated to disable irq when read size is larger
     than ulong size. This may have made some 32bit archs which can do
     atomic local 64bit accesses generate sub-optimal code. We need to
     find them out and implement arch-specific overrides"

* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: fix iteration to prevent skipping over block
  percpu: fix starting offset for chunk statistics traversal
  percpu: make this_cpu_generic_read() atomic w.r.t. interrupts

Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata

Pull libata fixes from Tejun Heo:
"Nothing too interesting.

  Arnd's gcc-7 warning fixes that slipped through the cracks for two
  release cycles (my bad), and two minor low level driver updates"

* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: don't ignore result code of ahci_reset_controller()
  ata_piix: Add Fujitsu-Siemens Lifebook S6120 to short cable IDs
  ata: avoid gcc-7 warning in ata_timing_quantize

Merge tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 4.14-rc4 to resolved reported
  issues.

  There's a bunch of stuff in here based on the great work Andrey
  Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
  dealing with corrupted USB descriptors that we've never seen in
  "normal" operation, but is now ensuring the stack is much more
  hardened overall.

  There's also the usual XHCI and gadget driver fixes as well, and a
  build error fix, and a few other minor things, full details in the
  shortlog.

  All of these have been in linux-next with no reported issues"

* tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (38 commits)
  usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
  usb: gadget: udc: atmel: set vbus irqflags explicitly
  usb: gadget: ffs: handle I/O completion in-order
  usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
  usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
  usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
  usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
  usb: gadget: udc: renesas_usb3: fix for no-data control transfer
  USB: dummy-hcd: Fix erroneous synchronization change
  USB: dummy-hcd: fix infinite-loop resubmission bug
  USB: dummy-hcd: fix connection failures (wrong speed)
  USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse
  USB: devio: Don't corrupt user memory
  USB: devio: Prevent integer overflow in proc_do_submiturb()
  USB: g_mass_storage: Fix deadlock when driver is unbound
  USB: gadgetfs: Fix crash caused by inadequate synchronization
  USB: gadgetfs: fix copy_to_user while holding spinlock
  USB: uas: fix bug in handling of alternate settings
  usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives
  usb-storage: fix bogus hardware error messages for ATA pass-thru devices
  ...

Merge tag 'tty-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
"Here are a small number (5) of patches for some reported TTY and
  serial issues. Nothing major, a documentation update, timing fix,
  error handling fix, name reporting fix, and a timeout issue resolved.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: sccnxp: Fix error handling in sccnxp_probe()
  tty: serial: lpuart: avoid report NULL interrupt
  serial: bcm63xx: fix timing issue.
  mxser: fix timeout calculation for low rates
  serial: sh-sci: document R8A77970 bindings

Merge tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO fixes from Greg KH:
"Here are some small staging/IIO driver fixes for 4.14-rc4

  Most of these have been in my tree for a while due to travels, sorry
  for the delay. They resolve a number of small issues reported by
  people, mostly for the iio drivers. Nothing major in here, full
  details are in the shortlog.

  All have been linux-next for a few weeks with no reported issues"

* tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (23 commits)
  staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack.
  iio: core: Return error for failed read_reg
  iio: ad7793: Fix the serial interface reset
  iio: ad_sigma_delta: Implement a dedicated reset function
  IIO: BME280: Updates to Humidity readings need ctrl_reg write!
  iio: adc: mcp320x: Fix readout of negative voltages
  iio: adc: mcp320x: Fix oops on module unload
  iio: adc: stm32: fix bad error check on max_channels
  iio: trigger: stm32-timer: fix a corner case to write preset
  iio: trigger: stm32-timer: preset shouldn't be buffered
  iio: adc: twl4030: Return an error if we can not enable the vusb3v1 regulator in 'twl4030_madc_probe()'
  iio: adc: twl4030: Disable the vusb3v1 rugulator in the error handling path of 'twl4030_madc_probe()'
  iio: adc: twl4030: Fix an error handling path in 'twl4030_madc_probe()'
  staging: rtl8723bs: avoid null pointer dereference on pmlmepriv
  staging: rtl8723bs: add missing range check on id
  staging: vchiq_2835_arm: Fix NULL ptr dereference in free_pagelist
  staging: speakup: fix speakup-r empty line lockup
  staging: pi433: Move limit check to switch default to kill warning
  staging: r8822be: fix null pointer dereferences with a null driver_adapter
  staging: mt29f_spinand: Enable the read ECC before program the page
  ...

KVM: PPC: Book3S: Fix server always zero from kvmppc_xive_get_xive()

In KVM's XICS-on-XIVE emulation, kvmppc_xive_get_xive() returns the
value of state->guest_server as "server". However, this value is not
set by it's counterpart kvmppc_xive_set_xive(). When the guest uses
this interface to migrate interrupts away from a CPU that is going
offline, it sees all interrupts as belonging to CPU 0, so they are
left assigned to (now) offline CPUs.

This patch removes the guest_server field from the state, and returns
act_server in it's place (that is, the CPU actually handling the
interrupt, which may differ from the one requested).

Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: [email protected]
Signed-off-by: Sam Bobroff <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

Merge tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
"Here are a few small fixes for 4.14-rc4.

  The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one
  straggler made it in through some other tree (odds are, one of
  mine...) So there's a simple removal of the last user, and then
  finally the macro is removed from the tree.

  There's a fix for old crazy udev instances that insist on reloading a
  module when it is removed from the kernel due to the new uevents for
  bind/unbind. This fixes the reported regression, hopefully some year
  in the future we can drop the workaround, once users update to the
  latest version, but I'm not holding my breath.

  And then there's a build fix for a linker warning, and a buffer
  overflow fix to match the PCI fixes you took through the PCI tree in
  the same area.

  All of these have been in linux-next for a few weeks while I've been
  traveling, sorry for the delay"

* tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: remove DRIVER_ATTR
  fpga: altera-cvp: remove DRIVER_ATTR() usage
  driver core: platform: Don't read past the end of "driver_override" buffer
  base: arch_topology: fix section mismatch build warnings
  driver core: suppress sending MODALIAS in UNBIND uevents

Merge tag 'char-misc-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
"Here are a handful of char/misc driver fixes for 4.14-rc4.

  Nothing major, some binder fixups, hyperv fixes, and other tiny
  things.

  All of these have been sitting in my tree for way too long, sorry for
  the delay in getting them to you. All have been in linux-next for a
  few weeks, and despite some people's feeling about if linux-next
  actually tests things, I think it's a good "soak test" for patches"

* tag 'char-misc-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: fcopy: restore correct transfer length
  vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()
  intel_th: pci: Add Lewisburg PCH support
  intel_th: pci: Add Cedar Fork PCH support
  stm class: Fix a use-after-free
  nvmem: add missing of_node_put() in of_nvmem_cell_get()
  nvmem: core: return EFBIG on out-of-range write
  auxdisplay: charlcd: properly restore atomic counter on error path
  binder: fix memory corruption in binder_transaction binder
  binder: fix an ret value override
  android: binder: fix type mismatch warning

rcu: Remove extraneous READ_ONCE()s from rcu_irq_{enter,exit}()

The read of ->dynticks_nmi_nesting in rcu_irq_enter() and rcu_irq_exit()
is currently protected with READ_ONCE(). However, this protection is
unnecessary because (1) ->dynticks_nmi_nesting is updated only by the
current CPU, (2) Although NMI handlers can update this field, they reset
it back to its old value before return, and (3) Interrupts are disabled,
so nothing else can modify it. The value of ->dynticks_nmi_nesting is
thus effectively constant, and so no protection is required.

This commit therefore removes the READ_ONCE() protection from these
two accesses.

Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

ftrace: Fix kmemleak in unregister_ftrace_graph

The trampoline allocated by function tracer was overwriten by function_graph
tracer, and caused a memory leak. The save_global_trampoline should have
saved the previous trampoline in register_ftrace_graph() and restored it in
unregister_ftrace_graph(). But as it is implemented, save_global_trampoline was
only used in unregister_ftrace_graph as default value 0, and it overwrote the
previous trampoline's value. Causing the previous allocated trampoline to be
lost.

kmmeleak backtrace:
    kmemleak_vmalloc+0x77/0xc0
    __vmalloc_node_range+0x1b5/0x2c0
    module_alloc+0x7c/0xd0
    arch_ftrace_update_trampoline+0xb5/0x290
    ftrace_startup+0x78/0x210
    register_ftrace_function+0x8b/0xd0
    function_trace_init+0x4f/0x80
    tracing_set_tracer+0xe6/0x170
    tracing_set_trace_write+0x90/0xd0
    __vfs_write+0x37/0x170
    vfs_write+0xb2/0x1b0
    SyS_write+0x55/0xc0
    do_syscall_64+0x67/0x180
    return_from_SYSCALL_64+0x0/0x6a

[
  Looking further into this, I found that this was left over from when the
  function and function graph tracers shared the same ftrace_ops. But in
  commit 5f151b2401 ("ftrace: Fix function_profiler and function tracer
  together"), the two were separated, and the save_global_trampoline no
  longer was necessary (and it may have been broken back then too).
  -- Steven Rostedt
]

Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 5f151b2401 ("ftrace: Fix function_profiler and function tracer together")
Signed-off-by: Shu Wang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>

powerpc/4xx: Fix compile error with 64K pages on 40x, 44x

The mmu context on the 40x, 44x does not define pte_frag entry. This
causes gcc abort the compilation due to:

  setup-common.c: In function ‘setup_arch’:
  setup-common.c:908: error: ‘mm_context_t’ has no ‘pte_frag’

This patch fixes the issue by removing the pte_frag initialization in
setup-common.c.

This is possible, because the compiler will do the initialization,
since the mm_context is a sub struct of init_mm. init_mm is declared
in mm_types.h as external linkage.

According to C99 6.2.4.3:
  An object whose identifier is declared with external linkage
  [...] has static storage duration.

C99 defines in 6.7.8.10 that:
  If an object that has static storage duration is not
  initialized explicitly, then:
  - if it has pointer type, it is initialized to a null pointer

Fixes: b1923caa6e64 ("powerpc: Merge 32-bit and 64-bit setup_arch()")
Signed-off-by: Christian Lamparter <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc: Fix action argument for cpufeatures-based TLB flush

Commit 41d0c2ecde19 ("powerpc/powernv: Fix local TLB flush for boot
and MCE on POWER9") introduced calls to __flush_tlb_power[89] from the
cpufeatures code, specifying the number of sets to flush.

However, these functions take an action argument, not a number of
sets. This means we hit the BUG() in __flush_tlb_{206,300} when using
cpufeatures-style configuration.

This change passes TLB_INVAL_SCOPE_GLOBAL instead.

Fixes: 41d0c2ecde19 ("powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9")
Cc: [email protected] # v4.13+
Signed-off-by: Jeremy Kerr <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

scsi: ibmvscsis: Fix write_pending failure path

For write_pending if the queue is down or client failed then return -EIO
so that LIO can properly process the completed command. Prior we
returned 0 since LIO could not handle it properly. Now with commit
fa7e25cf13a6 ("target: Fix unknown fabric callback queue-full errors")
that patch addresses LIO's ability to handle things right.

Signed-off-by: Bryant G. Ly <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: libiscsi: Remove iscsi_destroy_session

iscsi_session_teardown was the only user of this function. Function
currently is just short for iscsi_remove_session + iscsi_free_session.

Signed-off-by: Khazhismel Kumykov <[email protected]>
Acked-by: Chris Leech <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: libiscsi: Fix use-after-free race during iscsi_session_teardown

Session attributes exposed through sysfs were freed before the device
was destroyed, resulting in a potential use-after-free. Free these
attributes after removing the device.

Signed-off-by: Khazhismel Kumykov <[email protected]>
Acked-by: Chris Leech <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: sd: Do not override max_sectors_kb sysfs setting

A user may lower the max_sectors_kb setting in sysfs to accommodate
certain workloads. Previously we would always set the max I/O size to
either the block layer default or the optional preferred I/O size
reported by the device.

Keep the current heuristics for the initial setting of max_sectors_kb.
For subsequent invocations, only update the current queue limit if it
exceeds the capabilities of the hardware.

Cc: <[email protected]>
Reported-by: Don Brace <[email protected]>
Reviewed-by: Martin Wilck <[email protected]>
Tested-by: Don Brace <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

scsi: sd: Implement blacklist option for WRITE SAME w/ UNMAP

SBC-4 states:

  "A MAXIMUM UNMAP LBA COUNT field set to a non-zero value indicates the
   maximum number of LBAs that may be unmapped by an UNMAP command"

  "A MAXIMUM WRITE SAME LENGTH field set to a non-zero value indicates
   the maximum number of contiguous logical blocks that the device server
   allows to be unmapped or written in a single WRITE SAME command."

Despite the spec being clear on the topic, some devices incorrectly
expect WRITE SAME commands with the UNMAP bit set to be limited to the
value reported in MAXIMUM UNMAP LBA COUNT in the Block Limits VPD.

Implement a blacklist option that can be used to accommodate devices
with this behavior.

Cc: <[email protected]>
Reported-by: Bill Kuzeja <[email protected]>
Reported-by: Ewan D. Milne <[email protected]>
Reviewed-by: Ewan D. Milne <[email protected]>
Tested-by: Laurence Oberman <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

Merge tag 'drm-intel-fixes-2017-09-27' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

drm/i915 fixes for 4.14-rc3

Couple fixes for stable:

- Fix ELD connector types and consequently audio on DP (Jani).
- Ignore HDMI on Port A and consequently fix an ops on i915 probe
  when VBT advertises HDMI on Port A (Jani).

And a small fix:

- That removes a reduntant hw_check on modeset. (Colin)

* tag 'drm-intel-fixes-2017-09-27' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915/bios: ignore HDMI on port A
  drm/i915: remove redundant variable hw_check
  drm/i915: always update ELD connector type after get modes

socket, bpf: fix possible use after free

Starting from linux-4.4, 3WHS no longer takes the listener lock.

Since this time, we might hit a use-after-free in sk_filter_charge(),
if the filter we got in the memcpy() of the listener content
just happened to be replaced by a thread changing listener BPF filter.

To fix this, we need to make sure the filter refcount is not already
zero before incrementing it again.

Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

nbd: fix -ERESTARTSYS handling

Christoph made it so that if we return'ed BLK_STS_RESOURCE whenever we
got ERESTARTSYS from sending our packets we'd return BLK_STS_OK, which
means we'd never requeue and just hang. We really need to return the
right value from the upper layer.

Fixes: fc17b6534eb8 ("blk-mq: switch ->queue_rq return value to blk_status_t")
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>