]> Git Repo - linux.git/log
linux.git
6 years agoigmp: fix incorrect unsolicit report count after link down and up
Hangbin Liu [Wed, 29 Aug 2018 10:06:10 +0000 (18:06 +0800)]
igmp: fix incorrect unsolicit report count after link down and up

After link down and up, i.e. when call ip_mc_up(), we doesn't init
im->unsolicit_count. So after igmp_timer_expire(), we will not start
timer again and only send one unsolicit report at last.

Fix it by initializing im->unsolicit_count in igmp_group_added(), so
we can respect igmp robustness value.

Fixes: 24803f38a5c0b ("igmp: do not remove igmp souce list info when set link down")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoigmp: fix incorrect unsolicit report count when join group
Hangbin Liu [Wed, 29 Aug 2018 10:06:08 +0000 (18:06 +0800)]
igmp: fix incorrect unsolicit report count when join group

We should not start timer if im->unsolicit_count equal to 0 after decrease.
Or we will send one more unsolicit report message. i.e. 3 instead of 2 by
default.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agobpf: avoid misuse of psock when TCP_ULP_BPF collides with another ULP
John Fastabend [Fri, 31 Aug 2018 04:25:02 +0000 (21:25 -0700)]
bpf: avoid misuse of psock when TCP_ULP_BPF collides with another ULP

Currently we check sk_user_data is non NULL to determine if the sk
exists in a map. However, this is not sufficient to ensure the psock
or the ULP ops are not in use by another user, such as kcm or TLS. To
avoid this when adding a sock to a map also verify it is of the
correct ULP type. Additionally, when releasing a psock verify that
it is the TCP_ULP_BPF type before releasing the ULP. The error case
where we abort an update due to ULP collision can cause this error
path.

For example,

  __sock_map_ctx_update_elem()
     [...]
     err = tcp_set_ulp_id(sock, TCP_ULP_BPF) <- collides with TLS
     if (err)                                <- so err out here
        goto out_free
     [...]
  out_free:
     smap_release_sock() <- calling tcp_cleanup_ulp releases the
                            TLS ULP incorrectly.

Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
6 years agotools/bpf: bpftool, add xskmap in map types
Prashant Bhole [Fri, 31 Aug 2018 06:32:42 +0000 (15:32 +0900)]
tools/bpf: bpftool, add xskmap in map types

When listed all maps, bpftool currently shows (null) for xskmap.
Added xskmap type in map_type_name[] to show correct type.

Signed-off-by: Prashant Bhole <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
6 years agobpf: Fix bpf_msg_pull_data()
Tushar Dave [Fri, 31 Aug 2018 21:45:16 +0000 (23:45 +0200)]
bpf: Fix bpf_msg_pull_data()

Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while
linearizing multiple scatterlist elements. Variable 'offset' is used
to find first starting scatterlist element
    i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset"

Use different variable name while linearizing multiple scatterlist
elements so that value contained in variable 'offset' won't get
overwritten.

Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Signed-off-by: Tushar Dave <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
6 years agoMerge tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Sep 2018 17:56:01 +0000 (10:56 -0700)]
Merge tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "A couple of new helper functions in preparation for some tree wide
  clean-ups.

  I'm sending these new helpers now for rc2 in order to simplify the
  dependencies on subsequent cleanups across the tree in 4.20"

* tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: Add device_type access helper functions
  of: add node name compare helper functions
  of: add helper to lookup compatible child node

6 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Sun, 2 Sep 2018 17:44:28 +0000 (10:44 -0700)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "First batch of fixes post-merge window:

   - A handful of devicetree changes for i.MX2{3,8} to change over to
     new panel bindings. The platforms were moved from legacy
     framebuffers to DRM and some development board panels hadn't yet
     been converted.

   - OMAP fixes related to ti-sysc driver conversion fallout, fixing
     some register offsets, no_console_suspend fixes, etc.

   - Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.

   - Fixed 0755->0644 permissions on a newly added file.

   - Defconfig changes to make ARM Versatile more useful with QEMU
     (helps testing).

   - Enable defconfig options for new TI SoC platform that was merged
     this window (AM6)"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: defconfig: Enable TI's AM6 SoC platform
  ARM: defconfig: Update the ARM Versatile defconfig
  ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
  ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
  ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
  ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
  ARM: dts: imx23-evk: Convert to the new display bindings
  ARM: dts: imx23-evk: Move regulators outside simple-bus
  ARM: dts: imx28-evk: Convert to the new display bindings
  ARM: dts: imx28-evk: Move regulators outside simple-bus
  Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"
  arm: dts: am4372: setup rtc as system-power-controller
  ARM: dts: omap4-droid4: fix vibrations on Droid 4
  bus: ti-sysc: Fix no_console_suspend handling
  bus: ti-sysc: Fix module register ioremap for larger offsets
  ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
  ARM: OMAP2+: Fix null hwmod for ti-sysc debug

6 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Sep 2018 17:11:30 +0000 (10:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "Speculation:

   - Make the microcode check more robust

   - Make the L1TF memory limit depend on the internal cache physical
     address space and not on the CPUID advertised physical address
     space, which might be significantly smaller. This avoids disabling
     L1TF on machines which utilize the full physical address space.

   - Fix the GDT mapping for EFI calls on 32bit PTI

   - Fix the MCE nospec implementation to prevent #GP

  Fixes and robustness:

   - Use the proper operand order for LSL in the VDSO

   - Prevent NMI uaccess race against CR3 switching

   - Add a lockdep check to verify that text_mutex is held in
     text_poke() functions

   - Repair the fallout of giving native_restore_fl() a prototype

   - Prevent kernel memory dumps based on usermode RIP

   - Wipe KASAN shadow stack before rewinding the stack to prevent false
     positives

   - Move the AMS GOTO enforcement to the actual build stage to allow
     user API header extraction without a compiler

   - Fix a section mismatch introduced by the on demand VDSO mapping
     change

  Miscellaneous:

   - Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pti: Fix section mismatch warning/error
  x86/vdso: Fix lsl operand order
  x86/mce: Fix set_mce_nospec() to avoid #GP fault
  x86/efi: Load fixmap GDT in efi_call_phys_epilog()
  x86/nmi: Fix NMI uaccess race against CR3 switching
  x86: Allow generating user-space headers without a compiler
  x86/dumpstack: Don't dump kernel memory based on usermode RIP
  x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
  x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
  x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
  x86/irqflags: Mark native_restore_fl extern inline
  x86/build: Remove jump label quirk for GCC older than 4.5.2
  x86/Kconfig: Fix trivial typo
  x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
  x86/spectre: Add missing family 6 check to microcode check

6 years agoMerge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Sep 2018 17:09:35 +0000 (10:09 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull CPU hotplug fix from Thomas Gleixner:
 "Remove the stale skip_onerr member from the hotplug states"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Remove skip_onerr field from cpuhp_step structure

6 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Sep 2018 16:41:45 +0000 (09:41 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fixes from Thomas Gleixner:
 "A small set of updates for core code:

   - Prevent tracing in functions which are called from trace patching
     via stop_machine() to prevent executing half patched function trace
     entries.

   - Remove old GCC workarounds

   - Remove pointless includes of notifier.h"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Remove workaround for unreachable warnings from old GCC
  notifier: Remove notifier header file wherever not used
  watchdog: Mark watchdog touch functions as notrace

6 years agox86/pti: Fix section mismatch warning/error
Randy Dunlap [Sun, 2 Sep 2018 04:01:28 +0000 (21:01 -0700)]
x86/pti: Fix section mismatch warning/error

Fix the section mismatch warning in arch/x86/mm/pti.c:

WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
The function pti_clone_pgtable() references
the function __init pti_user_pagetable_walk_pte().
This is often because pti_clone_pgtable lacks a __init
annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
FATAL: modpost: Section mismatches detected.

Fixes: 85900ea51577 ("x86/pti: Map the vsyscall page if needed")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
6 years agoof/platform: initialise AMBA default DMA masks
Linus Walleij [Fri, 31 Aug 2018 14:13:07 +0000 (16:13 +0200)]
of/platform: initialise AMBA default DMA masks

This addresses a v4.19-rc1 regression in the PL111 DRM driver in
drivers/gpu/pl111/*

The driver uses the CMA KMS helpers and will thus at some point call
down to dma_alloc_attrs() to allocate a chunk of contigous DMA memory
for the framebuffer.

It appears that in v4.18, it was OK that this (and other DMA mastering
AMBA devices) left dev->coherent_dma_mask blank (zero).

In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in
dma_alloc_attrs() in include/linux/dma-mapping.h is triggered.  The
allocation later fails when get_coherent_dma_mask() is called from
__dma_alloc() and __dma_alloc() returns NULL:

drm-clcd-pl111 dev:20: coherent DMA mask is unset
drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR*
                  Failed to set fbdev configuration

It turns out that in commit 4d8bde883bfb ("OF: Don't set default
coherent DMA mask") the OF core stops setting the default DMA mask on
new devices, especially those lines of the patch:

- if (!dev->coherent_dma_mask)
-               dev->coherent_dma_mask = DMA_BIT_MASK(32);

Robin Murphy solved a similar problem in a5516219b102 ("of/platform:
Initialise default DMA masks") by simply assigning dev.coherent_dma_mask
and the dev.dma_mask to point to the same when creating devices from the
device tree, and introducing the same code into the code path creating
AMBA/PrimeCell devices solved my problem, graphics now come up.

The code simply assumes that the device can access all of the system
memory by setting the coherent DMA mask to 0xffffffff when creating a
device from the device tree, which is crude, but seems to be what kernel
v4.18 assumed.

The AMBA PrimeCells do not differ between coherent and streaming DMA so
we can just assign the same to any DMA mask.

Possibly drivers should augment their coherent DMA mask in accordance
with "dma-ranges" from the device tree if more finegranular masking is
needed.

Reported-by: Russell King <[email protected]>
Fixes: 4d8bde883bfb ("OF: Don't set default coherent DMA mask")
Cc: Russell King <[email protected]>
Cc: Robin Murphy <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
6 years agosparc: set a default 32-bit dma mask for OF devices
Christoph Hellwig [Tue, 28 Aug 2018 09:25:51 +0000 (11:25 +0200)]
sparc: set a default 32-bit dma mask for OF devices

This keeps the historic default behavior for devices without a DMA mask,
but removes the warning about a lacking DMA mask for doing DMA without
a mask.

Reported-by: Meelis Roos <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
6 years agoMerge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux...
Olof Johansson [Sun, 2 Sep 2018 01:22:19 +0000 (18:22 -0700)]
Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Fixes for omap variants against v4.19-rc1

These are mostly fixes related to using ti-sysc interconnect target module
driver for accessing right register offsets for sgx and cpsw and for
no_console_suspend regression.

There is also a droid4 emmc fix where emmc may not get detected for some
models, and vibrator dts mismerge fix.

And we have a file permission fix for am335x-osd3358-sm-red.dts that
just got added. And we must tag RTC as system-power-controller for
am437x for PMIC to shut down during poweroff.

* tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
  ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
  arm: dts: am4372: setup rtc as system-power-controller
  ARM: dts: omap4-droid4: fix vibrations on Droid 4
  bus: ti-sysc: Fix no_console_suspend handling
  bus: ti-sysc: Fix module register ioremap for larger offsets
  ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
  ARM: OMAP2+: Fix null hwmod for ti-sysc debug

Signed-off-by: Olof Johansson <[email protected]>
6 years agoipv6: don't get lwtstate twice in ip6_rt_copy_init()
Alexey Kodanev [Thu, 30 Aug 2018 16:11:24 +0000 (19:11 +0300)]
ipv6: don't get lwtstate twice in ip6_rt_copy_init()

Commit 80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info")
partially fixed the kmemleak [1], lwtstate can be copied from fib6_info,
with ip6_rt_copy_init(), and it should be done only once there.

rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
ip6_rt_copy_init(), so there is no need to get it again at the end.

With this patch, lwtstate also isn't copied from RTF_REJECT routes.

[1]:
unreferenced object 0xffff880b6aaa14e0 (size 64):
  comm "ip", pid 10577, jiffies 4295149341 (age 1273.903s)
  hex dump (first 32 bytes):
    01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000018664623>] lwtunnel_build_state+0x1bc/0x420
    [<00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0
    [<00000000ee2c5d1f>] ip6_route_add+0x14/0x70
    [<000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0
    [<000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0
    [<000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0
    [<000000004c893c76>] netlink_unicast+0x417/0x5a0
    [<00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30
    [<00000000890ff0aa>] sock_sendmsg+0xb1/0xf0
    [<00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950
    [<000000001e7426c8>] __sys_sendmsg+0xde/0x170
    [<00000000fe411443>] do_syscall_64+0x9f/0x4a0
    [<000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<000000006d21f353>] 0xffffffffffffffff

Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path")
Signed-off-by: Alexey Kodanev <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agox86/vdso: Fix lsl operand order
Samuel Neves [Sat, 1 Sep 2018 20:14:52 +0000 (21:14 +0100)]
x86/vdso: Fix lsl operand order

In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.

Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agoMerge tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Sat, 1 Sep 2018 20:17:15 +0000 (13:17 -0700)]
Merge tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixlet from Wim Van Sebroeck:
 "Document support for r8a774a1"

* tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog:
  dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support

6 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 1 Sep 2018 20:03:32 +0000 (13:03 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Two small fixes, one for the x86 Stoney SoC to get a more accurate clk
  frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX
  driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: x86: Set default parent to 48Mhz
  clk: npcm7xx: fix memory allocation

6 years agokernel/dma/direct: take DMA offset into account in dma_direct_supported
Christoph Hellwig [Fri, 24 Aug 2018 06:07:52 +0000 (08:07 +0200)]
kernel/dma/direct: take DMA offset into account in dma_direct_supported

When a device has a DMA offset the dma capable result will change due
to the difference between the physical and DMA address.  Take that into
account.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Benjamin Herrenschmidt <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
6 years agox86/mce: Fix set_mce_nospec() to avoid #GP fault
LuckTony [Fri, 31 Aug 2018 16:55:06 +0000 (09:55 -0700)]
x86/mce: Fix set_mce_nospec() to avoid #GP fault

The trick with flipping bit 63 to avoid loading the address of the 1:1
mapping of the poisoned page while the 1:1 map is updated used to work when
unmapping the page. But it falls down horribly when attempting to directly
set the page as uncacheable.

The problem is that when the cache mode is changed to uncachable, the pages
needs to be flushed from the cache first. But the decoy address is
non-canonical due to bit 63 flipped, and the CLFLUSH instruction throws a
#GP fault.

Add code to change_page_attr_set_clr() to fix the address before calling
flush.

Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Peter Anvin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: linux-edac <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Jiang <[email protected]>
Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk
6 years agoibmvnic: Include missing return code checks in reset function
Thomas Falcon [Thu, 30 Aug 2018 18:19:53 +0000 (13:19 -0500)]
ibmvnic: Include missing return code checks in reset function

Check the return codes of these functions and halt reset
in case of failure. The driver will remain in a dormant state
until the next reset event, when device initialization will be
re-attempted.

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoselftests: pmtu: detect correct binary to ping ipv6 addresses
Sabrina Dubroca [Thu, 30 Aug 2018 14:01:18 +0000 (16:01 +0200)]
selftests: pmtu: detect correct binary to ping ipv6 addresses

Some systems don't have the ping6 binary anymore, and use ping for
everything. Detect the absence of ping6 and try to use ping instead.

Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test")
Signed-off-by: Sabrina Dubroca <[email protected]>
Acked-by: Stefano Brivio <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoselftests: pmtu: maximum MTU for vti4 is 2^16-1-20
Sabrina Dubroca [Thu, 30 Aug 2018 14:01:17 +0000 (16:01 +0200)]
selftests: pmtu: maximum MTU for vti4 is 2^16-1-20

Since commit 82612de1c98e ("ip_tunnel: restore binding to ifaces with a
large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of
the mysterious constant 0xFFF8.  This makes this selftest fail.

Fixes: 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu")
Signed-off-by: Sabrina Dubroca <[email protected]>
Acked-by: Stefano Brivio <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agotcp: do not restart timewait timer on rst reception
Florian Westphal [Thu, 30 Aug 2018 12:24:29 +0000 (14:24 +0200)]
tcp: do not restart timewait timer on rst reception

RFC 1337 says:
 ''Ignore RST segments in TIME-WAIT state.
   If the 2 minute MSL is enforced, this fix avoids all three hazards.''

So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
expire rather than removing it instantly when a reset is received.

However, Linux will also re-start the TIME-WAIT timer.

This causes connect to fail when tying to re-use ports or very long
delays (until syn retry interval exceeds MSL).

packetdrill test case:
// Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
`sysctl net.ipv4.tcp_rfc1337=1`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4

// Receive first segment
0.310 < P. 1:1001(1000) ack 1 win 46

// Send one ACK
0.310 > . 1:1(0) ack 1001

// read 1000 byte
0.310 read(4, ..., 1000) = 1000

// Application writes 100 bytes
0.350 write(4, ..., 100) = 100
0.350 > P. 1:101(100) ack 1001

// ACK
0.500 < . 1001:1001(0) ack 101 win 257

// close the connection
0.600 close(4) = 0
0.600 > F. 101:101(0) ack 1001 win 244

// Our side is in FIN_WAIT_1 & waits for ack to fin
0.7 < . 1001:1001(0) ack 102 win 244

// Our side is in FIN_WAIT_2 with no outstanding data.
0.8 < F. 1001:1001(0) ack 102 win 244
0.8 > . 102:102(0) ack 1002 win 244

// Our side is now in TIME_WAIT state, send ack for fin.
0.9 < F. 1002:1002(0) ack 102 win 244
0.9 > . 102:102(0) ack 1002 win 244

// Peer reopens with in-window SYN:
1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>

// Therefore, reply with ACK.
1.000 > . 102:102(0) ack 1002 win 244

// Peer sends RST for this ACK.  Normally this RST results
// in tw socket removal, but rfc1337=1 setting prevents this.
1.100 < R 1002:1002(0) win 244

// second syn. Due to rfc1337=1 expect another pure ACK.
31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
31.0 > . 102:102(0) ack 1002 win 244

// .. and another RST from peer.
31.1 < R 1002:1002(0) win 244
31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`

// third syn after one minute.  Time-Wait socket should have expired by now.
63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>

// so we expect a syn-ack & 3whs to proceed from here on.
63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>

Without this patch, 'ss' shows restarts of tw timer and last packet is
thus just another pure ack, more than one minute later.

This restores the original code from commit 283fd6cf0be690a83
("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .

For some reason the else branch was removed/lost in 1f28b683339f7
("Merge in TCP/UDP optimizations and [..]") and timer restart became
unconditional.

Reported-by: Michal Tesar <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet/rds: RDS is not Radio Data System
Pavel Machek [Thu, 30 Aug 2018 11:30:03 +0000 (13:30 +0200)]
net/rds: RDS is not Radio Data System

Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is
easily confused with Radio Data System (which we may want to support
in kernel, too).

Signed-off-by: Pavel Machek <[email protected]>
Acked-by: Sowmini Varadhan <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Acked-by: Sowmini Varadhan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agohv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()
Dexuan Cui [Thu, 30 Aug 2018 05:42:13 +0000 (05:42 +0000)]
hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()

This patch fixes the race between netvsc_probe() and
rndis_set_subchannel(), which can cause a deadlock.

These are the related 3 paths which show the deadlock:

path #1:
    Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
    Call Trace:
     schedule
     schedule_preempt_disabled
     __mutex_lock
     __device_attach
     bus_probe_device
     device_add
     vmbus_device_register
     vmbus_onoffer
     vmbus_onmessage_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path #2:
    schedule
     schedule_preempt_disabled
     __mutex_lock
     netvsc_probe
     vmbus_probe
     really_probe
     __driver_attach
     bus_for_each_dev
     driver_attach_async
     async_run_entry_fn
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path #3:
    Workqueue: events netvsc_subchan_work [hv_netvsc]
    Call Trace:
     schedule
     rndis_set_subchannel
     netvsc_subchan_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

Before path #1 finishes, path #2 can start to run, because just before
the "bus_probe_device(dev);" in device_add() in path #1, there is a line
"object_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can
immediately try to load hv_netvsc and hence path #2 can start to run.

Next, path #2 offloads the subchannal's initialization to a workqueue,
i.e. path #3, so we can end up in a deadlock situation like this:

Path #2 gets the device lock, and is trying to get the rtnl lock;
Path #3 gets the rtnl lock and is waiting for all the subchannel messages
to be processed;
Path #1 is trying to get the device lock, but since #2 is not releasing
the device lock, path #1 has to sleep; since the VMBus messages are
processed one by one, this means the sub-channel messages can't be
procedded, so #3 has to sleep with the rtnl lock held, and finally #2
has to sleep... Now all the 3 paths are sleeping and we hit the deadlock.

With the patch, we can make sure #2 gets both the device lock and the
rtnl lock together, gets its job done, and releases the locks, so #1
and #3 will not be blocked for ever.

Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Dexuan Cui <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonfp: wait for posted reconfigs when disabling the device
Jakub Kicinski [Wed, 29 Aug 2018 19:46:08 +0000 (12:46 -0700)]
nfp: wait for posted reconfigs when disabling the device

To avoid leaking a running timer we need to wait for the
posted reconfigs after netdev is unregistered.  In common
case the process of deinitializing the device will perform
synchronous reconfigs which wait for posted requests, but
especially with VXLAN ports being actively added and removed
there can be a race condition leaving a timer running after
adapter structure is freed leading to a crash.

Add an explicit flush after deregistering and for a good
measure a warning to check if timer is running just before
structures are freed.

Fixes: 3d780b926a12 ("nfp: add async reconfiguration mechanism")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Dirk van der Merwe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoRevert "packet: switch kvzalloc to allocate memory"
Eric Dumazet [Wed, 29 Aug 2018 18:50:12 +0000 (11:50 -0700)]
Revert "packet: switch kvzalloc to allocate memory"

This reverts commit 71e41286203c017d24f041a7cd71abea7ca7b1e0.

mmap()/munmap() can not be backed by kmalloced pages :

We fault in :

    VM_BUG_ON_PAGE(PageSlab(page), page);

    unmap_single_vma+0x8a/0x110
    unmap_vmas+0x4b/0x90
    unmap_region+0xc9/0x140
    do_munmap+0x274/0x360
    vm_munmap+0x81/0xc0
    SyS_munmap+0x2b/0x40
    do_syscall_64+0x13e/0x1c0
    entry_SYSCALL_64_after_hwframe+0x42/0xb7

Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: John Sperbeck <[email protected]>
Bisected-by: John Sperbeck <[email protected]>
Cc: Zhang Yu <[email protected]>
Cc: Li RongQing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoblkcg: use tryget logic when associating a blkg with a bio
Dennis Zhou (Facebook) [Fri, 31 Aug 2018 20:22:44 +0000 (16:22 -0400)]
blkcg: use tryget logic when associating a blkg with a bio

There is a very small change a bio gets caught up in a really
unfortunate race between a task migration, cgroup exiting, and itself
trying to associate with a blkg. This is due to css offlining being
performed after the css->refcnt is killed which triggers removal of
blkgs that reach their blkg->refcnt of 0.

To avoid this, association with a blkg should use tryget and fallback to
using the root_blkg.

Fixes: 08e18eab0c579 ("block: add bi_blkg to the bio for cgroups")
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Dennis Zhou <[email protected]>
Cc: Jiufei Xue <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
6 years agoblkcg: delay blkg destruction until after writeback has finished
Dennis Zhou (Facebook) [Fri, 31 Aug 2018 20:22:43 +0000 (16:22 -0400)]
blkcg: delay blkg destruction until after writeback has finished

Currently, blkcg destruction relies on a sequence of events:
  1. Destruction starts. blkcg_css_offline() is called and blkgs
     release their reference to the blkcg. This immediately destroys
     the cgwbs (writeback).
  2. With blkgs giving up their reference, the blkcg ref count should
     become zero and eventually call blkcg_css_free() which finally
     frees the blkcg.

Jiufei Xue reported that there is a race between blkcg_bio_issue_check()
and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent
on the completion of all writeback associated with the blkcg. A count of
the number of cgwbs is maintained and once that goes to zero, blkg
destruction can follow. This should prevent premature blkg destruction
related to writeback.

The new process for blkcg cleanup is as follows:
  1. Destruction starts. blkcg_css_offline() is called which offlines
     writeback. Blkg destruction is delayed on the cgwb_refcnt count to
     avoid punting potentially large amounts of outstanding writeback
     to root while maintaining any ongoing policies. Here, the base
     cgwb_refcnt is put back.
  2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called
     and handles destruction of blkgs. This is where the css reference
     held by each blkg is released.
  3. Once the blkcg ref count goes to zero, blkcg_css_free() is called.
     This finally frees the blkg.

It seems in the past blk-throttle didn't do the most understandable
things with taking data from a blkg while associating with current. So,
the simplification and unification of what blk-throttle is doing caused
this.

Fixes: 08e18eab0c579 ("block: add bi_blkg to the bio for cgroups")
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Dennis Zhou <[email protected]>
Cc: Jiufei Xue <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
6 years agoRevert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()"
Dennis Zhou (Facebook) [Fri, 31 Aug 2018 20:22:42 +0000 (16:22 -0400)]
Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()"

This reverts commit 4c6994806f708559c2812b73501406e21ae5dcd0.

Destroying blkgs is tricky because of the nature of the relationship. A
blkg should go away when either a blkcg or a request_queue goes away.
However, blkg's pin the blkcg to ensure they remain valid. To break this
cycle, when a blkcg is offlined, blkgs put back their css ref. This
eventually lets css_free() get called which frees the blkcg.

The above commit (4c6994806f70) breaks this order of events by trying to
destroy blkgs in css_free(). As the blkgs still hold references to the
blkcg, css_free() is never called.

The race between blkcg_bio_issue_check() and cgroup_rmdir() will be
addressed in the following patch by delaying destruction of a blkg until
all writeback associated with the blkcg has been finished.

Fixes: 4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()")
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Dennis Zhou <[email protected]>
Cc: Jiufei Xue <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
6 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 31 Aug 2018 16:20:30 +0000 (09:20 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "A few arm64 fixes came in this week, specifically fixing some nasty
  truncation of return values from firmware calls and resolving a
  VM_BUG_ON due to accessing uninitialised struct pages corresponding to
  NOMAP pages.

  Summary:

   - Fix typos in SVE documentation

   - Fix type-checking and implicit truncation for SMCCC calls

   - Force CONFIG_HOLES_IN_ZONE=y so that SLAB doesn't fall over NOMAP
     regions"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: always enable CONFIG_HOLES_IN_ZONE
  arm/arm64: smccc-1.1: Handle function result as parameters
  arm/arm64: smccc-1.1: Make return values unsigned long
  Documentation/arm64/sve: Couple of improvements and typos

6 years agox86/efi: Load fixmap GDT in efi_call_phys_epilog()
Joerg Roedel [Fri, 31 Aug 2018 08:05:38 +0000 (10:05 +0200)]
x86/efi: Load fixmap GDT in efi_call_phys_epilog()

When PTI is enabled on x86-32 the kernel uses the GDT mapped in the fixmap
for the simple reason that this address is also mapped for user-space.

The efi_call_phys_prolog()/efi_call_phys_epilog() wrappers change the GDT
to call EFI runtime services and switch back to the kernel GDT when they
return. But the switch-back uses the writable GDT, not the fixmap GDT.

When that happened and and the CPU returns to user-space it switches to the
user %cr3 and tries to restore user segment registers. This fails because
the writable GDT is not mapped in the user page-table, and without a GDT
the fault handlers also can't be launched. The result is a triple fault and
reboot of the machine.

Fix that by restoring the GDT back to the fixmap GDT which is also mapped
in the user page-table.

Fixes: 7757d607c6b3 x86/pti: ('Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32')
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agoMerge tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 31 Aug 2018 15:45:16 +0000 (08:45 -0700)]
Merge tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - minor cleanup avoiding a warning when building with new gcc

 - a patch to add a new sysfs node for Xen frontend/backend drivers to
   make it easier to obtain the state of a pv device

 - two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable
   PTEs

* tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: remove redundant variable save_pud
  xen: export device state to sysfs
  x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  x86/xen: don't write ptes directly in 32-bit PV guests

6 years agoMerge tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 31 Aug 2018 15:42:46 +0000 (08:42 -0700)]
Merge tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fix from Geert Uytterhoeven:
 "Just a single fix for a bug introduced during the merge window: fix
  wrong date and time on PMU-based Macs"

* tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/mac: Use correct PMU response format

6 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 31 Aug 2018 15:38:53 +0000 (08:38 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - regression fixes for i801 and designware

 - better API and leak fix for releasing DMA safe buffers

 - better greppable strings for the bitbang algorithm

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: sh_mobile: fix leak when using DMA bounce buffer
  i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow
  i2c: refactor function to release a DMA safe buffer
  i2c: algos: bit: make the error messages grepable
  i2c: designware: Re-init controllers with pm_disabled set on resume
  i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus

6 years agox86/nmi: Fix NMI uaccess race against CR3 switching
Andy Lutomirski [Wed, 29 Aug 2018 15:47:18 +0000 (08:47 -0700)]
x86/nmi: Fix NMI uaccess race against CR3 switching

A NMI can hit in the middle of context switching or in the middle of
switch_mm_irqs_off().  In either case, CR3 might not match current->mm,
which could cause copy_from_user_nmi() and friends to read the wrong
memory.

Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.

Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org
6 years agox86: Allow generating user-space headers without a compiler
Ben Hutchings [Wed, 29 Aug 2018 19:43:17 +0000 (20:43 +0100)]
x86: Allow generating user-space headers without a compiler

When bootstrapping an architecture, it's usual to generate the kernel's
user-space headers (make headers_install) before building a compiler.  Move
the compiler check (for asm goto support) to the archprepare target so that
it is only done when building code for the target.

Fixes: e501ce957a78 ("x86: Force asm-goto")
Reported-by: Helmut Grohne <[email protected]>
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agox86/dumpstack: Don't dump kernel memory based on usermode RIP
Jann Horn [Tue, 28 Aug 2018 15:49:01 +0000 (17:49 +0200)]
x86/dumpstack: Don't dump kernel memory based on usermode RIP

show_opcodes() is used both for dumping kernel instructions and for dumping
user instructions. If userspace causes #PF by jumping to a kernel address,
show_opcodes() can be reached with regs->ip controlled by the user,
pointing to kernel code. Make sure that userspace can't trick us into
dumping kernel memory into dmesg.

Fixes: 7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function")
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agoof: Add device_type access helper functions
Rob Herring [Tue, 28 Aug 2018 20:10:48 +0000 (15:10 -0500)]
of: Add device_type access helper functions

In preparation to remove direct access to device_node.type, add
of_node_is_type() and of_node_get_device_type() helpers to check and
retrieve the device type.

Cc: Frank Rowand <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
6 years agocpu/hotplug: Remove skip_onerr field from cpuhp_step structure
Mukesh Ojha [Tue, 28 Aug 2018 06:54:54 +0000 (12:24 +0530)]
cpu/hotplug: Remove skip_onerr field from cpuhp_step structure

When notifiers were there, `skip_onerr` was used to avoid calling
particular step startup/teardown callbacks in the CPU up/down rollback
path, which made the hotplug asymmetric.

As notifiers are gone now after the full state machine conversion, the
`skip_onerr` field is no longer required.

Remove it from the structure and its usage.

Signed-off-by: Mukesh Ojha <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
6 years agoarm64: mm: always enable CONFIG_HOLES_IN_ZONE
James Morse [Thu, 30 Aug 2018 15:05:32 +0000 (16:05 +0100)]
arm64: mm: always enable CONFIG_HOLES_IN_ZONE

Commit 6d526ee26ccd ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA")
only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was
choking on the missing zone for nomap pages. This problem doesn't just
apply to NUMA systems.

If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will
return true if the pfn is part of a valid sparsemem section.

When working with multiple pages, the mm code uses pfn_valid_within()
to test each page it uses within the sparsemem section is valid. On
most systems memory comes in MAX_ORDER_NR_PAGES chunks which all
have valid/initialised struct pages. In this case pfn_valid_within()
is optimised out.

Systems where this isn't true (e.g. due to nomap) should set
HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each
page as it works with it.

Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to
a VM_BUG_ON():

| page:fffffdff802e1780 is uninitialized and poisoned
| raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
| ------------[ cut here ]------------
| kernel BUG at include/linux/mm.h:978!
| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
| CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 40000085 (nZcv daIf -PAN -UAO)
| pc : move_freepages_block+0x144/0x248
| lr : move_freepages_block+0x144/0x248
| sp : fffffe0071177680
[...]
| Process dd (pid: 25236, stack limit = 0x0000000094cc07fb)
| Call trace:
|  move_freepages_block+0x144/0x248
|  steal_suitable_fallback+0x100/0x16c
|  get_page_from_freelist+0x440/0xb20
|  __alloc_pages_nodemask+0xe8/0x838
|  new_slab+0xd4/0x418
|  ___slab_alloc.constprop.27+0x380/0x4a8
|  __slab_alloc.isra.21.constprop.26+0x24/0x34
|  kmem_cache_alloc+0xa8/0x180
|  alloc_buffer_head+0x1c/0x90
|  alloc_page_buffers+0x68/0xb0
|  create_empty_buffers+0x20/0x1ec
|  create_page_buffers+0xb0/0xf0
|  __block_write_begin_int+0xc4/0x564
|  __block_write_begin+0x10/0x18
|  block_write_begin+0x48/0xd0
|  blkdev_write_begin+0x28/0x30
|  generic_perform_write+0x98/0x16c
|  __generic_file_write_iter+0x138/0x168
|  blkdev_write_iter+0x80/0xf0
|  __vfs_write+0xe4/0x10c
|  vfs_write+0xb4/0x168
|  ksys_write+0x44/0x88
|  sys_write+0xc/0x14
|  el0_svc_naked+0x30/0x34
| Code: aa1303e0 90001a01 91296421 94008902 (d4210000)
| ---[ end trace 1601ba47f6e883fe ]---

Remove the NUMA dependency.

Link: https://www.spinics.net/lists/arm-kernel/msg671851.html
Cc: <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Reported-by: Mikulas Patocka <[email protected]>
Reviewed-by: Pavel Tatashin <[email protected]>
Tested-by: Mikulas Patocka <[email protected]>
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
6 years agogpio: Fix crash due to registration race
Vincent Whitchurch [Fri, 31 Aug 2018 07:04:18 +0000 (09:04 +0200)]
gpio: Fix crash due to registration race

gpiochip_add_data_with_key() adds the gpiochip to the gpio_devices list
before of_gpiochip_add() is called, but it's only the latter which sets
the ->of_xlate function pointer.  gpiochip_find() can be called by
someone else between these two actions, and it can find the chip and
call of_gpiochip_match_node_and_xlate() which leads to the following
crash due to a NULL ->of_xlate().

 Unhandled prefetch abort: page domain fault (0x01b) at 0x00000000
 Modules linked in: leds_gpio(+) gpio_generic(+)
 CPU: 0 PID: 830 Comm: insmod Not tainted 4.18.0+ #43
 Hardware name: ARM-Versatile Express
 PC is at   (null)
 LR is at of_gpiochip_match_node_and_xlate+0x2c/0x38
 Process insmod (pid: 830, stack limit = 0x(ptrval))
  (of_gpiochip_match_node_and_xlate) from  (gpiochip_find+0x48/0x84)
  (gpiochip_find) from  (of_get_named_gpiod_flags+0xa8/0x238)
  (of_get_named_gpiod_flags) from  (gpiod_get_from_of_node+0x2c/0xc8)
  (gpiod_get_from_of_node) from  (devm_fwnode_get_index_gpiod_from_child+0xb8/0x144)
  (devm_fwnode_get_index_gpiod_from_child) from  (gpio_led_probe+0x208/0x3c4 [leds_gpio])
  (gpio_led_probe [leds_gpio]) from  (platform_drv_probe+0x48/0x9c)
  (platform_drv_probe) from  (really_probe+0x1d0/0x3d4)
  (really_probe) from  (driver_probe_device+0x78/0x1c0)
  (driver_probe_device) from  (__driver_attach+0x120/0x13c)
  (__driver_attach) from  (bus_for_each_dev+0x68/0xb4)
  (bus_for_each_dev) from  (bus_add_driver+0x1a8/0x268)
  (bus_add_driver) from  (driver_register+0x78/0x10c)
  (driver_register) from  (do_one_initcall+0x54/0x1fc)
  (do_one_initcall) from  (do_init_module+0x64/0x1f4)
  (do_init_module) from  (load_module+0x2198/0x26ac)
  (load_module) from  (sys_finit_module+0xe0/0x110)
  (sys_finit_module) from  (ret_fast_syscall+0x0/0x54)

One way to fix this would be to rework the hairy registration sequence
in gpiochip_add_data_with_key(), but since I'd probably introduce a
couple of new bugs if I attempted that, simply add a check for a
non-NULL of_xlate function pointer in
of_gpiochip_match_node_and_xlate().  This works since the driver looking
for the gpio will simply fail to find the gpio and defer its probe and
be reprobed when the driver which is registering the gpiochip has fully
completed its probe.

Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
6 years agom68k/mac: Use correct PMU response format
Finn Thain [Fri, 24 Aug 2018 02:02:06 +0000 (12:02 +1000)]
m68k/mac: Use correct PMU response format

Now that the 68k Mac port has adopted the via-pmu driver, it must decode
the PMU response accordingly otherwise the date and time will be wrong.

Fixes: ebd722275f9cfc67 ("macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver")
Signed-off-by: Finn Thain <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
6 years agoMerge tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 31 Aug 2018 04:18:05 +0000 (21:18 -0700)]
Merge tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular fixes pull:

   - Mediatek has a bunch of fixes to their RDMA and Overlay engines.

   - i915 has some Cannonlake/Geminilake watermark workarounds, LSPCON
     fix, HDCP free fix, audio fix and a ppgtt reference counting fix.

   - amdgpu has some SRIOV, Kasan, memory leaks and other misc fixes"

* tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm: (35 commits)
  drm/i915/audio: Hook up component bindings even if displays are disabled
  drm/i915: Increase LSPCON timeout
  drm/i915: Stop holding a ref to the ppgtt from each vma
  drm/i915: Free write_buf that we allocated with kzalloc.
  drm/i915: Fix glk/cnl display w/a #1175
  drm/amdgpu: Need to set moved to true when evict bo
  drm/amdgpu: Remove duplicated power source update
  drm/amd/display: Fix memory leak caused by missed dc_sink_release
  drm/amdgpu: fix holding mn_lock while allocating memory
  drm/amdgpu: Power on uvd block when hw_fini
  drm/amdgpu: Update power state at the end of smu hw_init.
  drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins
  drm/amdgpu: Enable/disable gfx PG feature in rlc safe mode
  drm/amdgpu: Adjust the VM size based on system memory size v2
  drm/mediatek: fix connection from RDMA2 to DSI1
  drm/mediatek: update some variable name from ovl to comp
  drm/mediatek: use layer_nr function to get layer number to init plane
  drm/mediatek: add function to return RDMA layer number
  drm/mediatek: add function to return OVL layer number
  drm/mediatek: add function to get layer number for component
  ...

6 years agodisable stringop truncation warnings for now
Stephen Rothwell [Thu, 30 Aug 2018 21:47:28 +0000 (07:47 +1000)]
disable stringop truncation warnings for now

They are too noisy

Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
6 years agoMerge tag 'pm-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 31 Aug 2018 01:02:02 +0000 (18:02 -0700)]
Merge tag 'pm-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These address a corner case in the menu cpuidle governor and fix error
  handling in the PM core's generic clock management code.

  Specifics:

   - Make the menu cpuidle governor avoid stopping the scheduler tick if
     the predicted idle duration exceeds the tick period length, but the
     selected idle state is shallow and deeper idle states with high
     target residencies are available (Rafael Wysocki).

   - Make the PM core's generic clock management code use a proper data
     type for one variable to make error handling work (Dan Carpenter)"

* tag 'pm-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: menu: Retain tick when shallow state is selected
  PM / clk: signedness bug in of_pm_clk_add_clks()

6 years agoMerge branch 'pm-core'
Rafael J. Wysocki [Thu, 30 Aug 2018 23:23:31 +0000 (01:23 +0200)]
Merge branch 'pm-core'

Merge a generic clock management fix for 4.19-rc2.

* pm-core:
  PM / clk: signedness bug in of_pm_clk_add_clks()

6 years agoclk: x86: Set default parent to 48Mhz
Akshu Agrawal [Tue, 21 Aug 2018 06:51:57 +0000 (12:21 +0530)]
clk: x86: Set default parent to 48Mhz

System clk provided in ST soc can be set to:
48Mhz, non-spread
25Mhz, spread
To get accurate rate, we need it to set it at non-spread
option which is 48Mhz.

Signed-off-by: Akshu Agrawal <[email protected]>
Reviewed-by: Daniel Kurtz <[email protected]>
Fixes: 421bf6a1f061 ("clk: x86: Add ST oscout platform clock")
Signed-off-by: Stephen Boyd <[email protected]>
6 years agoi2c: sh_mobile: fix leak when using DMA bounce buffer
Wolfram Sang [Fri, 24 Aug 2018 14:52:46 +0000 (16:52 +0200)]
i2c: sh_mobile: fix leak when using DMA bounce buffer

We only freed the bounce buffer after successful DMA, missing the cases
where DMA setup may have gone wrong. Use a better location which always
gets called after each message and use 'stop_after_dma' as a flag for a
successful transfer.

Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Niklas Söderlund <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
6 years agoi2c: sh_mobile: define start_ch() void as it only returns 0 anyhow
Wolfram Sang [Fri, 24 Aug 2018 14:52:45 +0000 (16:52 +0200)]
i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow

After various refactoring over the years, start_ch() doesn't return
errno anymore, so make the function return void. This saves the error
handling when calling it which in turn eases cleanup of resources of a
future patch.

Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Niklas Söderlund <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
6 years agoi2c: refactor function to release a DMA safe buffer
Wolfram Sang [Fri, 24 Aug 2018 14:52:44 +0000 (16:52 +0200)]
i2c: refactor function to release a DMA safe buffer

a) rename to 'put' instead of 'release' to match 'get' when obtaining
   the buffer
b) change the argument order to have the buffer as first argument
c) add a new argument telling the function if the message was
   transferred. This allows the function to be used also in cases
   where setting up DMA failed, so the buffer needs to be freed without
   syncing to the message buffer.

Also convert the only user.

Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Niklas Söderlund <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
6 years agoi2c: algos: bit: make the error messages grepable
Jan Kundrát [Tue, 28 Aug 2018 08:07:40 +0000 (10:07 +0200)]
i2c: algos: bit: make the error messages grepable

Yep, I went looking for one of these, and I wasn't able to find it
easily.  That's worse than a line which is 82-chars long, IMHO.

Signed-off-by: Jan Kundrát <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
6 years agoi2c: designware: Re-init controllers with pm_disabled set on resume
Hans de Goede [Wed, 29 Aug 2018 13:06:31 +0000 (15:06 +0200)]
i2c: designware: Re-init controllers with pm_disabled set on resume

On Bay Trail and Cherry Trail devices we set the pm_disabled flag for I2C
busses which the OS shares with the PUNIT as these need special handling.
Until now we called dev_pm_syscore_device(dev, true) for I2C controllers
with this flag set to keep these I2C controllers always on.

After commit 12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and
resume from hibernation"), this no longer works. This commit modifies
lpss_iosf_exit_d3_state() to only run if lpss_iosf_enter_d3_state() has ran
before it, so that it does not run on a resume from hibernate (or from S3).

On these systems the conditions for lpss_iosf_enter_d3_state() to run
never become true, so lpss_iosf_exit_d3_state() never gets called and
the 2 LPSS DMA controllers never get forced into D0 mode, instead they
are left in their default automatic power-on when needed mode.

The not forcing of D0 mode for the DMA controllers enables these systems
to properly enter S0ix modes, which is a good thing.

But after entering S0ix modes the I2C controller connected to the PMIC
no longer works, leading to e.g. broken battery monitoring.

The _PS3 method for this I2C controller looks like this:

            Method (_PS3, 0, NotSerialized)  // _PS3: Power State 3
            {
                If ((((PMID == 0x04) || (PMID == 0x05)) || (PMID == 0x06)))
                {
                    Return (Zero)
                }

                PSAT |= 0x03
                Local0 = PSAT /* \_SB_.I2C5.PSAT */
            }

Where PMID = 0x05, so we enter the Return (Zero) path on these systems.

So even if we were to not call dev_pm_syscore_device(dev, true) the
I2C controller will be left in D0 rather then be switched to D3.

Yet on other Bay and Cherry Trail devices S0ix is not entered unless *all*
I2C controllers are in D3 mode. This combined with the I2C controller no
longer working now that we reach S0ix states on these systems leads to me
believing that the PUNIT itself puts the I2C controller in D3 when all
other conditions for entering S0ix states are true.

Since now the I2C controller is put in D3 over a suspend/resume we must
re-initialize it afterwards and that does indeed fix it no longer working.

This commit implements this fix by:

1) Making the suspend_late callback a no-op if pm_disabled is set and
making the resume_early callback skip the clock re-enable (since it now was
not disabled) while still doing the necessary I2C controller re-init.

2) Removing the dev_pm_syscore_device(dev, true) call, so that the suspend
and resume callbacks are actually called. Normally this would cause the
ACPI pm code to call _PS3 putting the I2C controller in D3, wreaking havoc
since it is shared with the PUNIT, but in this special case the _PS3 method
is a no-op so we can safely allow a "fake" suspend / resume.

Fixes: 12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and resume ...")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200861
Cc: 4.15+ <[email protected]> # 4.15+
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Jarkko Nikula <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
6 years agoi2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus
Mika Westerberg [Thu, 30 Aug 2018 08:50:13 +0000 (11:50 +0300)]
i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus

Commit 7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict
with PCI BAR") made it possible for AML code to access SMBus I/O ports
by installing custom SystemIO OpRegion handler and blocking i80i driver
access upon first AML read/write to this OpRegion.

However, while ThinkPad T560 does have SystemIO OpRegion declared under
the SMBus device, it does not access any of the SMBus registers:

    Device (SMBU)
    {
        ...

        OperationRegion (SMBP, PCI_Config, 0x50, 0x04)
        Field (SMBP, DWordAcc, NoLock, Preserve)
        {
            ,   5,
            TCOB,   11,
            Offset (0x04)
        }

        Name (TCBV, 0x00)
        Method (TCBS, 0, NotSerialized)
        {
            If ((TCBV == 0x00))
            {
            TCBV = (\_SB.PCI0.SMBU.TCOB << 0x05)
            }

            Return (TCBV) /* \_SB_.PCI0.SMBU.TCBV */
        }

        OperationRegion (TCBA, SystemIO, TCBS (), 0x10)
        Field (TCBA, ByteAcc, NoLock, Preserve)
        {
            Offset (0x04),
            ,   9,
            CPSC,   1
        }
    }

Problem with the current approach is that it blocks all I/O port access
and because this system has touchpad connected to the SMBus controller
after first AML access (happens during suspend/resume cycle) the
touchpad fails to work anymore.

Fix this so that we allow ACPI AML I/O port access if it does not touch
the region reserved for the SMBus.

Fixes: 7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200737
Reported-by: Yussuf Khalil <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
6 years agoMerge tag 'for-linus-20180830' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 30 Aug 2018 20:39:04 +0000 (13:39 -0700)]
Merge tag 'for-linus-20180830' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Small collection of fixes that should go into this series. This pull
  contains:

   - NVMe pull request with three small fixes (via Christoph)

   - Kill useless NULL check before kmem_cache_destroy (Chengguang Xu)

   - Xen block driver pull request with persistent grant flushing fixes
     (Juergen Gross)

   - Final wbt fixes, wrapping up the changes for this series. These
     have been heavily tested (me)

   - cdrom info leak fix (Scott Bauer)

   - ATA dma quirk for SQ201 (Linus Walleij)

   - Straight forward bsg refcount_t conversion (John Pittman)"

* tag 'for-linus-20180830' of git://git.kernel.dk/linux-block:
  cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
  nvmet: free workqueue object if module init fails
  nvme-fcloop: Fix dropped LS's to removed target port
  nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
  block: bsg: move atomic_t ref_count variable to refcount API
  block: remove unnecessary condition check
  ata: ftide010: Add a quirk for SQ201
  blk-wbt: remove dead code
  blk-wbt: improve waking of tasks
  blk-wbt: abstract out end IO completion handler
  xen/blkback: remove unused pers_gnts_lock from struct xen_blkif_ring
  xen/blkback: move persistent grants flags to bool
  xen/blkfront: reorder tests in xlblk_init()
  xen/blkfront: cleanup stale persistent grants
  xen/blkback: don't keep persistent grants too long

6 years agoof: add node name compare helper functions
Rob Herring [Mon, 27 Aug 2018 12:50:47 +0000 (07:50 -0500)]
of: add node name compare helper functions

In preparation to remove device_node.name pointer, add helper functions
for node name comparisons which are a common pattern throughout the kernel.

Cc: Frank Rowand <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
6 years agoMerge tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd
Linus Torvalds [Thu, 30 Aug 2018 17:05:12 +0000 (10:05 -0700)]
Merge tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd

Pull mtd fixes from Boris Brezillon:
 "Raw NAND fixes:

   - denali: Fix a regression caused by the nand_scan() rework

   - docg4: Fix a build error when gcc decides to not iniline some
     functions (can be reproduced with gcc 4.1.2):

* tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd:
  mtd: rawnand: denali: do not pass zero maxchips to nand_scan()
  mtd: rawnand: docg4: Remove wrong __init annotations

6 years agoMerge tag 'mmc-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Thu, 30 Aug 2018 16:50:15 +0000 (09:50 -0700)]
Merge tag 'mmc-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Fix unsupported parallel dispatch of requests

  MMC host:
   - atmel-mci/android-goldfish: Fixup logic of sg_copy_{from,to}_buffer
   - renesas_sdhi_internal_dmac: Prevent IRQ-storm due of DMAC IRQs
   - renesas_sdhi_internal_dmac: Fixup bad register offset"

* tag 'mmc-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: renesas_sdhi_internal_dmac: mask DMAC interrupts
  mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS
  mmc: block: Fix unsupported parallel dispatch of requests
  mmc: android-goldfish: fix bad logic of sg_copy_{from,to}_buffer conversion
  mmc: atmel-mci: fix bad logic of sg_copy_{from,to}_buffer conversion

6 years agoarm/arm64: smccc-1.1: Handle function result as parameters
Marc Zyngier [Fri, 24 Aug 2018 14:08:30 +0000 (15:08 +0100)]
arm/arm64: smccc-1.1: Handle function result as parameters

If someone has the silly idea to write something along those lines:

extern u64 foo(void);

void bar(struct arm_smccc_res *res)
{
arm_smccc_1_1_smc(0xbad, foo(), res);
}

they are in for a surprise, as this gets compiled as:

0000000000000588 <bar>:
 588:   a9be7bfd        stp     x29, x30, [sp, #-32]!
 58c:   910003fd        mov     x29, sp
 590:   f9000bf3        str     x19, [sp, #16]
 594:   aa0003f3        mov     x19, x0
 598:   aa1e03e0        mov     x0, x30
 59c:   94000000        bl      0 <_mcount>
 5a0:   94000000        bl      0 <foo>
 5a4:   aa0003e1        mov     x1, x0
 5a8:   d4000003        smc     #0x0
 5ac:   b4000073        cbz     x19, 5b8 <bar+0x30>
 5b0:   a9000660        stp     x0, x1, [x19]
 5b4:   a9010e62        stp     x2, x3, [x19, #16]
 5b8:   f9400bf3        ldr     x19, [sp, #16]
 5bc:   a8c27bfd        ldp     x29, x30, [sp], #32
 5c0:   d65f03c0        ret
 5c4:   d503201f        nop

The call to foo "overwrites" the x0 register for the return value,
and we end up calling the wrong secure service.

A solution is to evaluate all the parameters before assigning
anything to specific registers, leading to the expected result:

0000000000000588 <bar>:
 588:   a9be7bfd        stp     x29, x30, [sp, #-32]!
 58c:   910003fd        mov     x29, sp
 590:   f9000bf3        str     x19, [sp, #16]
 594:   aa0003f3        mov     x19, x0
 598:   aa1e03e0        mov     x0, x30
 59c:   94000000        bl      0 <_mcount>
 5a0:   94000000        bl      0 <foo>
 5a4:   aa0003e1        mov     x1, x0
 5a8:   d28175a0        mov     x0, #0xbad
 5ac:   d4000003        smc     #0x0
 5b0:   b4000073        cbz     x19, 5bc <bar+0x34>
 5b4:   a9000660        stp     x0, x1, [x19]
 5b8:   a9010e62        stp     x2, x3, [x19, #16]
 5bc:   f9400bf3        ldr     x19, [sp, #16]
 5c0:   a8c27bfd        ldp     x29, x30, [sp], #32
 5c4:   d65f03c0        ret

Reported-by: Julien Grall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
6 years agox86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
Uros Bizjak [Tue, 14 Aug 2018 16:59:51 +0000 (18:59 +0200)]
x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()

Replace open-coded set instructions with CC_SET()/CC_OUT().

Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
6 years agox86/alternatives: Lockdep-enforce text_mutex in text_poke*()
Jiri Kosina [Tue, 28 Aug 2018 06:55:14 +0000 (08:55 +0200)]
x86/alternatives: Lockdep-enforce text_mutex in text_poke*()

text_poke() and text_poke_bp() must be called with text_mutex held.

Put proper lockdep anotation in place instead of just mentioning the
requirement in a comment.

Reported-by: Peter Zijlstra <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
6 years agoobjtool: Remove workaround for unreachable warnings from old GCC
Masahiro Yamada [Mon, 27 Aug 2018 03:39:43 +0000 (12:39 +0900)]
objtool: Remove workaround for unreachable warnings from old GCC

Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
bumped the minimum GCC version to 4.6 for all architectures.

This effectively reverts commit da541b20021c ("objtool: Skip unreachable
warnings for GCC 4.4 and older"), which was a workaround for GCC 4.4 or
older.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agonotifier: Remove notifier header file wherever not used
Mukesh Ojha [Fri, 24 Aug 2018 12:33:53 +0000 (18:03 +0530)]
notifier: Remove notifier header file wherever not used

The conversion of the hotplug notifiers to a state machine left the
notifier.h includes around in some places. Remove them.

Signed-off-by: Mukesh Ojha <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
6 years agowatchdog: Mark watchdog touch functions as notrace
Vincent Whitchurch [Tue, 21 Aug 2018 15:25:07 +0000 (17:25 +0200)]
watchdog: Mark watchdog touch functions as notrace

Some architectures need to use stop_machine() to patch functions for
ftrace, and the assumption is that the stopped CPUs do not make function
calls to traceable functions when they are in the stopped state.

Commit ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after
MULTI_STOP_PREPARE") added calls to the watchdog touch functions from
the stopped CPUs and those functions lack notrace annotations.  This
leads to crashes when enabling/disabling ftrace on ARM kernels built
with the Thumb-2 instruction set.

Fix it by adding the necessary notrace annotations.

Fixes: ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE")
Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agox86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
Jann Horn [Tue, 28 Aug 2018 18:40:33 +0000 (20:40 +0200)]
x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()

Reset the KASAN shadow state of the task stack before rewinding RSP.
Without this, a kernel oops will leave parts of the stack poisoned, and
code running under do_exit() can trip over such poisoned regions and cause
nonsensical false-positive KASAN reports about stack-out-of-bounds bugs.

This does not wipe the exception stacks; if an oops happens on an exception
stack, it might result in random KASAN false-positives from other tasks
afterwards. This is probably relatively uninteresting, since if the kernel
oopses on an exception stack, there are most likely bigger things to worry
about. It'd be more interesting if vmapped stacks and KASAN were
compatible, since then handle_stack_overflow() would oops from exception
stack context.

Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()")
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agox86/irqflags: Mark native_restore_fl extern inline
Nick Desaulniers [Mon, 27 Aug 2018 21:40:09 +0000 (14:40 -0700)]
x86/irqflags: Mark native_restore_fl extern inline

This should have been marked extern inline in order to pick up the out
of line definition in arch/x86/kernel/irqflags.S.

Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl")
Reported-by: Ben Hutchings <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agox86/build: Remove jump label quirk for GCC older than 4.5.2
Masahiro Yamada [Mon, 27 Aug 2018 05:45:14 +0000 (14:45 +0900)]
x86/build: Remove jump label quirk for GCC older than 4.5.2

Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
bumped the minimum GCC version to 4.6 for all architectures.

Remove the workaround code.

It was the only user of cc-if-fullversion.  Remove the macro as well.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
6 years agomac80211: always account for A-MSDU header changes
Johannes Berg [Thu, 30 Aug 2018 08:55:49 +0000 (10:55 +0200)]
mac80211: always account for A-MSDU header changes

In the error path of changing the SKB headroom of the second
A-MSDU subframe, we would not account for the already-changed
length of the first frame that just got converted to be in
A-MSDU format and thus is a bit longer now.

Fix this by doing the necessary accounting.

It would be possible to reorder the operations, but that would
make the code more complex (to calculate the necessary pad),
and the headroom expansion should not fail frequently enough
to make that worthwhile.

Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support")
Signed-off-by: Johannes Berg <[email protected]>
Acked-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
6 years agomac80211: do not convert to A-MSDU if frag/subframe limited
Lorenzo Bianconi [Wed, 29 Aug 2018 19:03:25 +0000 (21:03 +0200)]
mac80211: do not convert to A-MSDU if frag/subframe limited

Do not start to aggregate packets in a A-MSDU frame (converting the
first subframe to A-MSDU, adding the header) if max_tx_fragments or
max_amsdu_subframes limits are already exceeded by it. In particular,
this happens when drivers set the limit to 1 to avoid A-MSDUs at all.

Signed-off-by: Lorenzo Bianconi <[email protected]>
[reword commit message to be more precise]
Signed-off-by: Johannes Berg <[email protected]>
6 years agocfg80211: nl80211_update_ft_ies() to validate NL80211_ATTR_IE
Arunk Khandavalli [Wed, 29 Aug 2018 21:40:16 +0000 (00:40 +0300)]
cfg80211: nl80211_update_ft_ies() to validate NL80211_ATTR_IE

nl80211_update_ft_ies() tried to validate NL80211_ATTR_IE with
is_valid_ie_attr() before dereferencing it, but that helper function
returns true in case of NULL pointer (i.e., attribute not included).
This can result to dereferencing a NULL pointer. Fix that by explicitly
checking that NL80211_ATTR_IE is included.

Fixes: 355199e02b83 ("cfg80211: Extend support for IEEE 802.11r Fast BSS Transition")
Signed-off-by: Arunk Khandavalli <[email protected]>
Signed-off-by: Jouni Malinen <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
6 years agoMerge branch 'net_sched-reject-unknown-tcfa_action-values'
David S. Miller [Thu, 30 Aug 2018 05:10:59 +0000 (22:10 -0700)]
Merge branch 'net_sched-reject-unknown-tcfa_action-values'

Paolo Abeni says:

====================
net_sched: reject unknown tcfa_action values

As agreed some time ago, this changeset reject unknown tcfa_action values,
instead of changing such values under the hood.

A tdc test is included to verify the new behavior.

v1 -> v2:
 - helper is now static and renamed according to act_* convention
 - updated extack message, according to the new behavior
====================

Reviewed-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agotc-testing: add test-cases for numeric and invalid control action
Paolo Abeni [Wed, 29 Aug 2018 08:22:34 +0000 (10:22 +0200)]
tc-testing: add test-cases for numeric and invalid control action

Only the police action allows us to specify an arbitrary numeric value
for the control action. This change introduces an explicit test case
for the above feature and then leverage it for testing the kernel behavior
for invalid control actions (reject).

Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet_sched: reject unknown tcfa_action values
Paolo Abeni [Wed, 29 Aug 2018 08:22:33 +0000 (10:22 +0200)]
net_sched: reject unknown tcfa_action values

After the commit 802bfb19152c ("net/sched: user-space can't set
unknown tcfa_action values"), unknown tcfa_action values are
converted to TC_ACT_UNSPEC, but the common agreement is instead
rejecting such configurations.

This change also introduces a helper to simplify the destruction
of a single action, avoiding code duplication.

v1 -> v2:
 - helper is now static and renamed according to act_* convention
 - updated extack message, according to the new behavior

Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values")
Signed-off-by: Paolo Abeni <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: mvpp2: initialize port of_node pointer
Baruch Siach [Wed, 29 Aug 2018 06:44:39 +0000 (09:44 +0300)]
net: mvpp2: initialize port of_node pointer

Without a valid of_node in struct device we can't find the mvpp2 port
device by its DT node. Specifically, this breaks
of_find_net_device_by_node().

For example, the Armada 8040 based Clearfog GT-8K uses Marvell 88E6141
switch connected to the &cp1_eth2 port:

&cp1_mdio {
...

switch0: switch0@4 {
compatible = "marvell,mv88e6085";
...

ports {
...

port@5 {
reg = <5>;
label = "cpu";
ethernet = <&cp1_eth2>;
};
};
};
};

Without this patch, dsa_register_switch() returns -EPROBE_DEFER because
of_find_net_device_by_node() can't find the device_node of the &cp1_eth2
device.

Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Baruch Siach <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: bcmgenet: use MAC link status for fixed phy
Doug Berger [Tue, 28 Aug 2018 19:33:15 +0000 (12:33 -0700)]
net: bcmgenet: use MAC link status for fixed phy

When using the fixed PHY with GENET (e.g. MOCA) the PHY link
status can be determined from the internal link status captured
by the MAC. This allows the PHY state machine to use the correct
link state with the fixed PHY even if MAC link event interrupts
are missed when the net device is opened.

Fixes: 8d88c6ebb34c ("net: bcmgenet: enable MoCA link state change detection")
Signed-off-by: Doug Berger <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: stmmac: build the dwmac-socfpga platform driver for Stratix10
Dinh Nguyen [Tue, 28 Aug 2018 15:19:20 +0000 (10:19 -0500)]
net: stmmac: build the dwmac-socfpga platform driver for Stratix10

The Stratix10 SoC is an AARCH64 based platform that shares the same ethernet
controller that is on other SoCFPGA platforms. Build the platform driver.

Signed-off-by: Dinh Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'ipv6-fix-error-path-of-inet6_init'
David S. Miller [Thu, 30 Aug 2018 02:28:55 +0000 (19:28 -0700)]
Merge branch 'ipv6-fix-error-path-of-inet6_init'

Sabrina Dubroca says:

====================
ipv6: fix error path of inet6_init()

The error path of inet6_init() can trigger multiple kernel panics,
mostly due to wrong ordering of cleanups. This series fixes those
issues.
====================

Reviewed-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: rtnl: return early from rtnl_unregister_all when protocol isn't registered
Sabrina Dubroca [Tue, 28 Aug 2018 11:40:53 +0000 (13:40 +0200)]
net: rtnl: return early from rtnl_unregister_all when protocol isn't registered

rtnl_unregister_all(PF_INET6) gets called from inet6_init in cases when
no handler has been registered for PF_INET6 yet, for example if
ip6_mr_init() fails. Abort and avoid a NULL pointer deref in that case.

Example of panic (triggered by faking a failure of
 register_pernet_subsys):

    general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
    [...]
    RIP: 0010:rtnl_unregister_all+0x17e/0x2a0
    [...]
    Call Trace:
     ? rtnetlink_net_init+0x250/0x250
     ? sock_unregister+0x103/0x160
     ? kernel_getsockopt+0x200/0x200
     inet6_init+0x197/0x20d

Fixes: e2fddf5e96df ("[IPV6]: Make af_inet6 to check ip6_route_init return value.")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoipv6: fix cleanup ordering for pingv6 registration
Sabrina Dubroca [Tue, 28 Aug 2018 11:40:52 +0000 (13:40 +0200)]
ipv6: fix cleanup ordering for pingv6 registration

Commit 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
contains an error in the cleanup path of inet6_init(): when
proto_register(&pingv6_prot, 1) fails, we try to unregister
&pingv6_prot. When rawv6_init() fails, we skip unregistering
&pingv6_prot.

Example of panic (triggered by faking a failure of
 proto_register(&pingv6_prot, 1)):

    general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
    [...]
    RIP: 0010:__list_del_entry_valid+0x79/0x160
    [...]
    Call Trace:
     proto_unregister+0xbb/0x550
     ? trace_preempt_on+0x6f0/0x6f0
     ? sock_no_shutdown+0x10/0x10
     inet6_init+0x153/0x1b8

Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoipv6: fix cleanup ordering for ip6_mr failure
Sabrina Dubroca [Tue, 28 Aug 2018 11:40:51 +0000 (13:40 +0200)]
ipv6: fix cleanup ordering for ip6_mr failure

Commit 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()")
moved the cleanup label for ipmr_fail, but should have changed the
contents of the cleanup labels as well. Now we can end up cleaning up
icmpv6 even though it hasn't been initialized (jump to icmp_fail or
ipmr_fail).

Simply undo things in the reverse order of their initialization.

Example of panic (triggered by faking a failure of icmpv6_init):

    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
    [...]
    RIP: 0010:__list_del_entry_valid+0x79/0x160
    [...]
    Call Trace:
     ? lock_release+0x8a0/0x8a0
     unregister_pernet_operations+0xd4/0x560
     ? ops_free_list+0x480/0x480
     ? down_write+0x91/0x130
     ? unregister_pernet_subsys+0x15/0x30
     ? down_read+0x1b0/0x1b0
     ? up_read+0x110/0x110
     ? kmem_cache_create_usercopy+0x1b4/0x240
     unregister_pernet_subsys+0x1d/0x30
     icmpv6_cleanup+0x1d/0x30
     inet6_init+0x1b5/0x23f

Fixes: 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge tag 'riscv-for-linus-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 30 Aug 2018 01:41:48 +0000 (18:41 -0700)]
Merge tag 'riscv-for-linus-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V fixes from Palmer Dabbelt:
 "RISC-V Fixes and Cleanups for 4.19-rc2

  This contains a handful of patches that filtered their way in during
  the merge window but just didn't make the deadline. It includes:

   - Additional documentation in the riscv,cpu-intc device tree binding
     that resulted from some feedback I missed in the original patch
     set.

   - A build fix that provides the definition of tlb_flush() before
     including tlb.h, which fixes a RISC-V build regression introduced
     during this merge window.

   - A cosmetic cleanup to sys_riscv_flush_icache()"

* tag 'riscv-for-linus-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  RISC-V: Use a less ugly workaround for unused variable warnings
  riscv: tlb: Provide definition of tlb_flush() before including tlb.h
  dt-bindings: riscv,cpu-intc: Cleanups from a missed review

6 years agoMerge tag 'drm-intel-fixes-2018-08-29' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 30 Aug 2018 01:34:55 +0000 (11:34 +1000)]
Merge tag 'drm-intel-fixes-2018-08-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- fix for GLK and CNL watermark workaround
- fix for display affecting NUCs with LSPCON
- freeing an allocated write_buf on hdcp
- audio hook when display is disabled
- vma stop holding ppgtt reference

Signed-off-by: Dave Airlie <[email protected]>
From: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
6 years agoMerge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Thu, 30 Aug 2018 01:30:02 +0000 (11:30 +1000)]
Merge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Fixes for 4.19:
- SR-IOV fixes
- Kasan and page fault fix on device removal
- S3 stability fix for CZ/ST
- VCE regression fixes for CIK parts
- Avoid holding the mn_lock when allocating memory
- DC memory leak fix
- BO eviction fix

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
6 years agoMerge branch 'mediatek-drm-fixes-4.19' of https://github.com/ckhu-mediatek/linux...
Dave Airlie [Thu, 30 Aug 2018 00:25:47 +0000 (10:25 +1000)]
Merge branch 'mediatek-drm-fixes-4.19' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes

"Here are some fixes for mediatek drm driver."

Mostly fixes around the RDMA and Overlay

Signed-off-by: Dave Airlie <[email protected]>
From: CK Hu <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/1535346194.27648.5.camel@mtksdaap41
6 years agonet/sched: act_pedit: fix dump of extended layered op
Davide Caratti [Mon, 27 Aug 2018 20:56:22 +0000 (22:56 +0200)]
net/sched: act_pedit: fix dump of extended layered op

in the (rare) case of failure in nla_nest_start(), missing NULL checks in
tcf_pedit_key_ex_dump() can make the following command

 # tc action add action pedit ex munge ip ttl set 64

dereference a NULL pointer:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
 Oops: 0002 [#1] SMP PTI
 CPU: 0 PID: 3336 Comm: tc Tainted: G            E     4.18.0.pedit+ #425
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
 Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
 RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
 RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
 RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
 R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
 R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
 FS:  00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
 Call Trace:
  ? __nla_reserve+0x38/0x50
  tcf_action_dump_1+0xd2/0x130
  tcf_action_dump+0x6a/0xf0
  tca_get_fill.constprop.31+0xa3/0x120
  tcf_action_add+0xd1/0x170
  tc_ctl_action+0x137/0x150
  rtnetlink_rcv_msg+0x263/0x2d0
  ? _cond_resched+0x15/0x40
  ? rtnl_calcit.isra.30+0x110/0x110
  netlink_rcv_skb+0x4d/0x130
  netlink_unicast+0x1a3/0x250
  netlink_sendmsg+0x2ae/0x3a0
  sock_sendmsg+0x36/0x40
  ___sys_sendmsg+0x26f/0x2d0
  ? do_wp_page+0x8e/0x5f0
  ? handle_pte_fault+0x6c3/0xf50
  ? __handle_mm_fault+0x38e/0x520
  ? __sys_sendmsg+0x5e/0xa0
  __sys_sendmsg+0x5e/0xa0
  do_syscall_64+0x5b/0x180
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f75a4583ba0
 Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
 RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
 RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
 RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
 R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
 Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
 CR2: 0000000000000000

Like it's done for other TC actions, give up dumping pedit rules and return
an error if nla_nest_start() returns NULL.

Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agosh_eth: Add R7S9210 support
Chris Brandt [Mon, 27 Aug 2018 17:42:02 +0000 (12:42 -0500)]
sh_eth: Add R7S9210 support

Add support for the R7S9210 which is part of the RZ/A2 series.

Signed-off-by: Chris Brandt <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge branch 'hns-fixes'
David S. Miller [Thu, 30 Aug 2018 01:08:20 +0000 (18:08 -0700)]
Merge branch 'hns-fixes'

Peng Li says:

====================
net: hns: fix some bugs about speed and duplex change

If there are packets in hardware when changing the spped
or duplex, it may cause hardware hang up.

This patchset adds the code for waiting chip to clean the all
pkts(TX & RX) in chip when the driver uses the function named
"adjust link".

This patchset cleans the pkts as follows:
1) close rx of chip, close tx of protocol stack.
2) wait rcb, ppe, mac to clean.
3) adjust link
4) open rx of chip, open tx of protocol stack.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns: add netif_carrier_off before change speed and duplex
Peng Li [Mon, 27 Aug 2018 01:59:30 +0000 (09:59 +0800)]
net: hns: add netif_carrier_off before change speed and duplex

If there are packets in hardware when changing the speed
or duplex, it may cause hardware hang up.

This patch adds netif_carrier_off before change speed and
duplex in ethtool_ops.set_link_ksettings, and adds
netif_carrier_on after complete the change.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agonet: hns: add the code for cleaning pkt in chip
Peng Li [Mon, 27 Aug 2018 01:59:29 +0000 (09:59 +0800)]
net: hns: add the code for cleaning pkt in chip

If there are packets in hardware when changing the speed
or duplex, it may cause hardware hang up.

This patch adds the code for waiting chip to clean the all
pkts(TX & RX) in chip when the driver uses the function named
"adjust link".

This patch cleans the pkts as follows:
1) close rx of chip, close tx of protocol stack.
2) wait rcb, ppe, mac to clean.
3) adjust link
4) open rx of chip, open tx of protocol stack.

Signed-off-by: Peng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agor8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices
Azat Khuzhin [Sun, 26 Aug 2018 14:03:09 +0000 (17:03 +0300)]
r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices

I have two Ethernet adapters:
  r8169 0000:03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 10000000, IRQ 18
  r8169 0000:01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c200000, IRQ 30
And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to
receive any packets. tcpdump shows a lot of checksum mismatch.

  [1]: a0f79386a4968b4925da6db2d1daffd0605a4402
  [2]: 0519359784328bfa92bf0931bf0cff3b58c16932 (4.19 merge window opened)

I started bisecting and the found that [3] breaks it. According to [4]:
  "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig
  needs to be set after the tx/rx is enabled."
So I moved rtl_init_rxcfg() after enabling tx/rs and now my adapter works
(RTL8168e works too).

  [3]: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace
  [4]: e542a2269f232d61270ceddd42b73a4348dee2bb ("r8169: adjust the RxConfig
settings.")

Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing
with it already.

Fixes: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace ("r8169: simplify
rtl_hw_start_8169")

Cc: Heiner Kallweit <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
Cc: Realtek linux nic maintainers <[email protected]>
Signed-off-by: Azat Khuzhin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agotipc: switch to rhashtable iterator
Cong Wang [Fri, 24 Aug 2018 19:28:06 +0000 (12:28 -0700)]
tipc: switch to rhashtable iterator

syzbot reported a use-after-free in tipc_group_fill_sock_diag(),
where tipc_group_fill_sock_diag() still reads tsk->group meanwhile
tipc_group_delete() just deletes it in tipc_release().

tipc_nl_sk_walk() aims to lock this sock when walking each sock
in the hash table to close race conditions with sock changes like
this one, by acquiring tsk->sk.sk_lock.slock spinlock, unfortunately
this doesn't work at all. All non-BH call path should take
lock_sock() instead to make it work.

tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu()
where RCU read lock is required, this is the reason why lock_sock()
can't be taken on this path. This could be resolved by switching to
rhashtable iterator API's, where taking a sleepable lock is possible.
Also, the iterator API's are friendly for restartable calls like
diag dump, the last position is remembered behind the scence,
all we need to do here is saving the iterator into cb->args[].

I tested this with parallel tipc diag dump and thousands of tipc
socket creation and release, no crash or memory leak.

Reported-by: [email protected]
Cc: Jon Maloy <[email protected]>
Cc: Ying Xue <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoRevert "net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit"
Jerome Brunet [Fri, 24 Aug 2018 09:04:40 +0000 (11:04 +0200)]
Revert "net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit"

This reverts commit 4ae0169fd1b3c792b66be58995b7e6b629919ecf.

This change in the handling of the coalesce timer is causing regression on
(at least) amlogic platforms.

Network will break down very quickly (a few seconds) after starting
a download. This can easily be reproduced using iperf3 for example.

The problem has been reported on the S805, S905, S912 and A113 SoCs
(Realtek and Micrel PHYs) and it is likely impacting all Amlogics
platforms using Gbit ethernet

No problem was seen with the platform using 10/100 only PHYs (GXL internal)

Reverting change brings things back to normal and allows to use network
again until we better understand the problem with the coalesce timer.

Cc: Jose Abreu <[email protected]>
Cc: Joao Pinto <[email protected]>
Cc: Vitor Soares <[email protected]>
Cc: Giuseppe Cavallaro <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Corentin Labbe <[email protected]>
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agotipc: fix a missing rhashtable_walk_exit()
Cong Wang [Thu, 23 Aug 2018 23:19:44 +0000 (16:19 -0700)]
tipc: fix a missing rhashtable_walk_exit()

rhashtable_walk_exit() must be paired with rhashtable_walk_enter().

Fixes: 40f9f4397060 ("tipc: Fix tipc_sk_reinit race conditions")
Cc: Herbert Xu <[email protected]>
Cc: Ying Xue <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agovti6: remove !skb->ignore_df check from vti6_xmit()
Alexey Kodanev [Thu, 23 Aug 2018 16:49:54 +0000 (19:49 +0300)]
vti6: remove !skb->ignore_df check from vti6_xmit()

Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting
on xmit") '!skb->ignore_df' check was always true because the function
skb_scrub_packet() was called before it, resetting ignore_df to zero.

In the commit, skb_scrub_packet() was moved below, and now this check
can be false for the packet, e.g. when sending it in the two fragments,
this prevents successful PMTU updates in such case. The next attempts
to send the packet lead to the same tx error. Moreover, vti6 initial
MTU value relies on PMTU adjustments.

This issue can be reproduced with the following LTP test script:
    udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000

Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.")
Signed-off-by: Alexey Kodanev <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Wed, 29 Aug 2018 23:19:38 +0000 (16:19 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-08-29

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a build error in sk_reuseport_convert_ctx_access() when
   compiling with clang which cannot resolve hweight_long() at
   build time inside the BUILD_BUG_ON() assertion, from Stefan.

2) Several fixes for BPF sockmap, four of them in getting the
   bpf_msg_pull_data() helper to work, one use after free case
   in bpf_tcp_close() and one refcount leak in bpf_tcp_recvmsg(),
   from Daniel.

3) Another fix for BPF sockmap where we misaccount sk_mem_uncharge()
   in the socket redirect error case from unwinding scatterlist
   twice, from John.
====================

Signed-off-by: David S. Miller <[email protected]>
6 years agopowerpc: disable support for relative ksymtab references
Ard Biesheuvel [Wed, 29 Aug 2018 06:47:53 +0000 (08:47 +0200)]
powerpc: disable support for relative ksymtab references

The newly added code that emits ksymtab entries as pairs of 32-bit
relative references interacts poorly with the way powerpc lays out its
address space: when a module exports a per-CPU variable, the primary
module region covering the ksymtab entry -and thus the 32-bit relative
reference- is too far away from the actual per-CPU variable's base
address (to which the per-CPU offsets are applied to obtain the
respective address of each CPU's copy), resulting in corruption when the
module loader attempts to resolve symbol references of modules that are
loaded on top and link to the exported per-CPU symbol.

So let's disable this feature on powerpc.  Even though it implements
CONFIG_RELOCATABLE, it does not implement CONFIG_RANDOMIZE_BASE and so
KASLR kernels (which are the main target of the feature) do not exist on
powerpc anyway.

Reported-by: Andreas Schwab <[email protected]>
Suggested-by: Nicholas Piggin <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
6 years agoMerge tag 'hwmon-for-linus-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 29 Aug 2018 23:03:45 +0000 (16:03 -0700)]
Merge tag 'hwmon-for-linus-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - Fix potential Spectre v1 in nct6775

 - Add error checking to adt7475 driver

 - Fix reading shunt resistor value in ina2xx driver

* tag 'hwmon-for-linus-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (nct6775) Fix potential Spectre v1
  hwmon: (adt7475) Make adt7475_read_word() return errors
  hwmon: (adt7475) Potential error pointer dereferences
  hwmon: (ina2xx) fix sysfs shunt resistor read access

6 years agoMerge tag 'for_v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Wed, 29 Aug 2018 21:56:45 +0000 (14:56 -0700)]
Merge tag 'for_v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull misc fs fixes from Jan Kara:

 - make UDF to properly mount media created by Win7

 - make isofs to properly refuse devices with large physical block size

 - fix a Spectre gadget in quotactl(2)

 - fix a warning in fsnotify code hit by syzkaller

* tag 'for_v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix mounting of Win7 created UDF filesystems
  udf: Remove dead code from udf_find_fileset()
  fs/quota: Fix spectre gadget in do_quotactl
  fs/quota: Replace XQM_MAXQUOTAS usage with MAXQUOTAS
  isofs: reject hardware sector size > 2048 bytes
  fsnotify: fix false positive warning on inode delete

6 years agoMerge tag 'nios2-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan...
Linus Torvalds [Wed, 29 Aug 2018 21:51:32 +0000 (14:51 -0700)]
Merge tag 'nios2-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2

Pull nios2 fix from Ley Foon Tan:
 "remove duplicate DEBUG_STACK_USAGE symbol defintions"

* tag 'nios2-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
  nios2: kconfig: remove duplicate DEBUG_STACK_USAGE symbol defintions

This page took 0.146708 seconds and 4 git commands to generate.