Git Repo - linux.git/log

i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled

If interface is connected to switch port configured for DCB there are
TX timeouts when bringing up interface. Problem started appearing after
adding in i40e driver code mqprio hardware offload mode. In function
i40e_vsi_configure_bw_alloc was added resetting BW rate which should
be executing when mqprio qdisc is removed but was also when there was
no mqprio qdisc added and DCB was enabled. In this patch was added
additional check for DCB flag so now when DCB is enabled the correct
DCB configs from before mqprio patch are restored.

Signed-off-by: Martyna Szapar <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: fix driver behaviour after issuing VFLR

Since VFLR doesn't clear VFMBMEM (VF Mailbox Memory)
and is not re-enabling queues correctly we should fix
this behavior.

Signed-off-by: Sebastian Basierski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: Prevent unsupported configurations with XDP

These changes address comments by Jakub Kicinski on
commit 38b7e7f8ae82 ("ixgbe: Do not allow LRO or MTU change with XDP").

Change the MTU check with XDP to allow any supported value and only
reject those outside of the range as opposed to rejecting any change
when XDP is active. In situations where MTU size is not supported,
return -EINVAL instead of -EPERM.

Add checks when enabling SRIOV, DCB, or adding L2FW offloaded device
as they are not supported with XDP.

CC: Jakub Kicinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgbe: Replace GFP_ATOMIC with GFP_KERNEL

ixgbe_fcoe_ddp_setup(), ixgbe_setup_fcoe_ddp_resources() and
ixgbe_sw_init() are never called in atomic context.
They call kmalloc(), dma_pool_alloc() and kzalloc() with GFP_ATOMIC,
which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <[email protected]>
Acked-by: Sebastian Basierski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback()

igb_integrated_phy_loopback() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init()

igb_sw_init() is never called in atomic context.
It calls kzalloc() and kcalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

igb: Use an advanced ctx descriptor for launchtime

On i210, Launchtime (TxTime) requires the usage of an "Advanced
Transmit Context Descriptor" for retrieving the timestamp of a packet.

The igb driver correctly builds such descriptor on the segmentation
flow (i.e. igb_tso()) or on the checksum one (i.e. igb_tx_csum()), but the
feature is broken for AF_PACKET if the IGB_TX_FLAGS_VLAN is not set,
which happens due to an early return on igb_tx_csum().

This flag is only set by the kernel when a VLAN interface is used,
thus we can't just rely on it. Here we are fixing this issue by checking
if launchtime is enabled for the current tx_ring before performing the
early return.

Signed-off-by: Jesus Sanchez-Palencia <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000: ensure to free old tx/rx rings in set_ringparam()

In 'e1000_set_ringparam()', the tx_ring and rx_ring are updated with new value
and the old tx/rx rings are freed only when the device is up. There are resource
leaks on old tx/rx rings when the device is not up. This bug is reported by COD,
a tool for testing kernel module binaries I am building.

This patch fixes the bug by always calling 'kfree()' on old tx/rx rings in
'e1000_set_ringparam()'.

Signed-off-by: Bo Chen <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

e1000: check on netif_running() before calling e1000_up()

When the device is not up, the call to 'e1000_up()' from the error handling path
of 'e1000_set_ringparam()' causes a kernel oops with a null-pointer
dereference. The null-pointer dereference is triggered in function
'e1000_alloc_rx_buffers()' at line 'buffer_info = &rx_ring->buffer_info[i]'.

This bug was reported by COD, a tool for testing kernel module binaries I am
building. This bug was also detected by KFI from Dr. Kai Cong.

This patch fixes the bug by checking on 'netif_running()' before calling
'e1000_up()' in 'e1000_set_ringparam()'.

Signed-off-by: Bo Chen <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ixgb: use dma_zalloc_coherent instead of allocator/memset

Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

iommu/rockchip: Move irq request past pm_runtime_enable

Enabling the interrupt early, before power has been applied to the
device, can result in an interrupt being delivered too early if:

- the IOMMU shares an interrupt with a VOP
- the VOP has a pending interrupt (after a kexec, for example)

In these conditions, we end-up taking the interrupt without
the IOMMU being ready to handle the interrupt (not powered on).

Moving the interrupt request past the pm_runtime_enable() call
makes sure we can at least access the IOMMU registers. Note that
this is only a partial fix, and that the VOP interrupt will still
be screaming until the VOP driver kicks in, which advocates for
a more synchronized interrupt enabling/disabling approach.

Fixes: 0f181d3cf7d98 ("iommu/rockchip: Add runtime PM support")
Reviewed-by: Heiko Stuebner <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

iommu/rockchip: Handle errors returned from PM framework

pm_runtime_get_if_in_use can fail: either PM has been disabled
altogether (-EINVAL), or the device hasn't been enabled yet (0).
Sadly, the Rockchip IOMMU driver tends to conflate the two things
by considering a non-zero return value as successful.

This has the consequence of hiding other bugs, so let's handle this
case throughout the driver, with a WARN_ON_ONCE so that we can try
and work out what happened.

Fixes: 0f181d3cf7d98 ("iommu/rockchip: Add runtime PM support")
Reviewed-by: Heiko Stuebner <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

arm64: rockchip: Force CONFIG_PM on Rockchip systems

A number of the Rockchip-specific drivers (IOMMU, display controllers)
are now assuming that CONFIG_PM is set, and may completely misbehave
if that's not the case.

Since there is hardly any reason for this configuration option not
to be selected anyway, let's require it (in the same way Tegra already
does).

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

ARM: rockchip: Force CONFIG_PM on Rockchip systems

A number of the Rockchip-specific drivers (IOMMU, display controllers)
are now assuming that CONFIG_PM is set, and may completely misbehave
if that's not the case.

Since there is hardly any reason for this configuration option not
to be selected anyway, let's require it (in the same way Tegra already
does).

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

arm64: dts: Fix various entry-method properties to reflect documentation

The idle-states binding documentation[1] mentions that the
'entry-method' property is required on 64-bit platforms and must be
set to "psci".

commit a13f18f59d26 ("Documentation: arm: Fix typo in the idle-states
bindings examples") attempted to fix this earlier but clearly more is
needed.

Fix the cpu-capacity.txt documentation that uses the incorrect value so
we don't get copy-paste errors like these. Clarify the language in
idle-states.txt by removing the reference to the psci bindings that
might be causing this confusion.

Finally, fix devicetrees of various boards to reflect current
documentation.

[1] Documentation/devicetree/bindings/arm/idle-states.txt (see
idle-states node)

Signed-off-by: Amit Kucheria <[email protected]>
Acked-by: Sudeep Holla <[email protected]>
Acked-by: Li Yang <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'reset-fixes-for-4.18' of git://git.pengutronix.de/git/pza/linux into next/late

Reset controller fixes for v4.18

This tag fixes reset assertion on i.MX7 for all non-inverted reset
control bits. Currently only PCIE controller and PHY resets are used.

* tag 'reset-fixes-for-4.18' of git://git.pengutronix.de/git/pza/linux:
reset: imx7: Fix always writing bits as 0

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'reset-for-4.19-2' of git://git.pengutronix.de/git/pza/linux into next/late

Reset controller changes for v4.19, part 2

This adds a single new driver for the Amlogic Meson Audio Memory Arbiter
resets.

* tag 'reset-for-4.19-2' of git://git.pengutronix.de/git/pza/linux:
reset: meson: add meson audio arb driver
reset: meson: add dt-bindings for meson-axg audio arb

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
"virtio, vhost: fixes, tweaks

  No new features but a bunch of tweaks such as switching balloon from
  oom notifier to shrinker"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048
  vhost: allow vhost-scsi driver to be built-in
  virtio: pci-legacy: Validate queue pfn
  virtio: mmio-v1: Validate queue PFN
  virtio_balloon: replace oom notifier with shrinker
  virtio-balloon: kzalloc the vb struct
  virtio-balloon: remove BUG() in init_vqs

i2c: don't use any __deprecated handling anymore

This can be dropped with commit 771c035372a036f83353eef46dbb829780330234
("deprecate the '__deprecated' attribute warnings entirely and for good")
now in upstream.

And we got rid of the last __deprecated use, too.

Signed-off-by: Sedat Dilek <[email protected]>
[wsa: shortened commit message to reflect the current situation]
Signed-off-by: Wolfram Sang <[email protected]>

Merge tag 'irqchip-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip updates for 4.19, take #2 from Marc Zyngier:

- bcm7038: compilation fix for !SMP
- stm32: fix teardown on probe error
- s3c24xx: fix compilation warning
- renesas-irqc: r8a774a1 support
- tango: chained irq setup simplification
- gic-v3: allow wake-up sources

x86/speculation/l1tf: Suggest what to do on systems with too much RAM

Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective.

Make the warning more helpful by suggesting the proper mem=X kernel boot
parameter to make it effective and a link to the L1TF document to help
decide if the mitigation is worth the unusable RAM.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536

Suggested-by: Michal Hocko <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

i2c: use SPDX identifier for Renesas drivers

Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

i2c: ocores: update my email address

The old @sunsite.dk address is no longer active, so update the references.

Signed-off-by: Peter Korsgaard <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

i2c: remove deprecated attach_adapter callback

There aren't any users left. Remove this callback from the 2.4 times.
Phew, finally, that took years to reach...

Signed-off-by: Wolfram Sang <[email protected]>

macintosh: therm_windtunnel: drop using attach_adapter

As we now have deferred probing, we can use a custom mechanism and
finally get rid of the legacy interface from the i2c core.

Signed-off-by: Wolfram Sang <[email protected]>
Acked-by: Michael Ellerman <[email protected]>

ubifs: Remove empty file.h

This empty file sneaked into the tree by mistake.
Remove it.

Fixes: 6eb61d587f45 ("ubifs: Pass struct ubifs_info to ubifs_assert()")
Signed-off-by: Richard Weinberger <[email protected]>

PM / clk: signedness bug in of_pm_clk_add_clks()

"count" needs to be signed for the error handling to work. I made "i"
signed as well so they match.

Fixes: 02113ba93ea4 (PM / clk: Add support for obtaining clocks from device-tree)
Cc: 4.6+ <[email protected]> # 4.6+
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

udf: Fix mounting of Win7 created UDF filesystems

Win7 is creating UDF filesystems with single partition with number 8192.
Current partition descriptor scanning code does not handle this well as
it incorrectly assumes that partition numbers will form mostly contiguous
space of small numbers. This results in unmountable media due to errors
like:

UDF-fs: error (device dm-1): udf_read_tagged: tag version 0x0000 != 0x0002 || 0x0003, block 0
UDF-fs: warning (device dm-1): udf_fill_super: No fileset found

Fix the problem by handling partition descriptors in a way that sparse
partition numbering does not matter.

Reported-and-tested-by: jean-luc malet <[email protected]>
CC: [email protected]
Fixes: 7b78fd02fb19530fd101ae137a1f46aa466d9bb6
Signed-off-by: Jan Kara <[email protected]>

udf: Remove dead code from udf_find_fileset()

Remove dead code and slightly simplify code in udf_find_fileset().

Signed-off-by: Jan Kara <[email protected]>

x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM

Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective. In
fact it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to
holes in the e820 map, the main region is almost 500MB over the 32GB limit:

[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable

Suggestions to use 'mem=32G' to enable the L1TF mitigation while losing the
500MB revealed, that there's an off-by-one error in the check in
l1tf_select_mitigation().

l1tf_pfn_limit() returns the last usable pfn (inclusive) and the range
check in the mitigation path does not take this into account.

Instead of amending the range check, make l1tf_pfn_limit() return the first
PFN which is over the limit which is less error prone. Adjust the other
users accordingly.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536

Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Reported-by: George Anchev <[email protected]>
Reported-by: Christopher Snowhill <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2018-08-24

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix BPF sockmap and tls where we get a hang in do_tcp_sendpages()
   when sndbuf is full due to missing calls into underlying socket's
   sk_write_space(), from John.

2) Two BPF sockmap fixes to reject invalid parameters on map creation
   and to fix a map element miscount on allocation failure. Another fix
   for BPF hash tables to use per hash table salt for jhash(), from Daniel.

3) Fix for bpftool's command line parsing in order to terminate on bad
   arguments instead of keeping looping in some border cases, from Quentin.

4) Fix error value of xdp_umem_assign_dev() in order to comply with
   expected bind ops error codes, from Prashant.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge tag 'drm-misc-next-fixes-2018-08-23-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

- Add quirk to Lenovo B50-80 to use 6 bpc instead of 8 (Feng)

Signed-off-by: Dave Airlie <[email protected]>
From: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20180823205434.GA137644@art_vandelay

Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:

- the rest of MM

- various misc fixes and tweaks

* emailed patches from Andrew Morton <[email protected]>: (22 commits)
  mm: Change return type int to vm_fault_t for fault handlers
  lib/fonts: convert comments to utf-8
  s390: ebcdic: convert comments to UTF-8
  treewide: convert ISO_8859-1 text comments to utf-8
  drivers/gpu/drm/gma500/: change return type to vm_fault_t
  docs/core-api: mm-api: add section about GFP flags
  docs/mm: make GFP flags descriptions usable as kernel-doc
  docs/core-api: split memory management API to a separate file
  docs/core-api: move *{str,mem}dup* to "String Manipulation"
  docs/core-api: kill trailing whitespace in kernel-api.rst
  mm/util: add kernel-doc for kvfree
  mm/util: make strndup_user description a kernel-doc comment
  fs/proc/vmcore.c: hide vmcoredd_mmap_dumps() for nommu builds
  treewide: correct "differenciate" and "instanciate" typos
  fs/afs: use new return type vm_fault_t
  drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t
  mm: soft-offline: close the race against page allocation
  mm: fix race on soft-offlining free huge pages
  namei: allow restricted O_CREAT of FIFOs and regular files
  hfs: prevent crash on exit from failed search
  ...

mm: Change return type int to vm_fault_t for fault handlers

Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")

The aim is to change the return type of finish_fault() and
handle_mm_fault() to vm_fault_t type.  As part of that clean up return
type of all other recursively called functions have been changed to
vm_fault_t type.

The places from where handle_mm_fault() is getting invoked will be
change to vm_fault_t type but in a separate patch.

vmf_error() is the newly introduce inline function in 4.17-rc6.

[[email protected]: don't shadow outer local `ret' in __do_huge_pmd_anonymous_page()]
Link: http://lkml.kernel.org/r/20180604171727.GA20279@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/fonts: convert comments to utf-8

The font files contain bit masks for characters in the cp437 character
set, and comments showing what character this is supposed to be.

This only makes sense when the terminal used to view the files is set to
the same codepage, but all other files in the kernel now use utf-8
encoding.

This changes those comments to utf-8 as well, for consistency.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

s390: ebcdic: convert comments to UTF-8

The ebcdic.c file contains tables for converting between ebcdic and PC
codepage 437. I could however not identify which encoding was used for
the comments. This seems to be some variation of ISO_8859-1 with
non-UTF-8 escape characters.

I have converted this to UTF-8 by manually removing the escape
characters and then running it through recode, to get the same encoding
that we use for the rest of the kernel.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

treewide: convert ISO_8859-1 text comments to utf-8

Almost all files in the kernel are either plain text or UTF-8 encoded. A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.

This converts them all to UTF-8 for consistency.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Simon Horman <[email protected]> [IPVS portion]
Acked-by: Jonathan Cameron <[email protected]> [IIO]
Acked-by: Michael Ellerman <[email protected]> [powerpc]
Acked-by: Rob Herring <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Samuel Ortiz <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Rob Herring <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/gpu/drm/gma500/: change return type to vm_fault_t

Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> 1c8f422059ae ("mm: change return type to vm_fault_t")

Previously vm_insert_{pfn,mixed} returns err which driver mapped into
VM_FAULT_* type.  The new function vmf_insert_{pfn,mixed} will replace
this inefficiency by returning VM_FAULT_* type.

vmf_error() is the newly introduce inline function in 4.17-rc6.

Link: http://lkml.kernel.org/r/20180713154541.GA3345@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Patrik Jakobsson <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

docs/core-api: mm-api: add section about GFP flags

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

docs/mm: make GFP flags descriptions usable as kernel-doc

This patch adds DOC: headings for GFP flag descriptions and adjusts the
formatting to fit sphinx expectations of paragraphs.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

docs/core-api: split memory management API to a separate file

This is basically copy-paste of the memory management section from
kernel-api.rst with some minor adjustments:

* The "User Space Memory Access" is moved to the beginning
* The get_user_pages_fast reference is now a part of "User Space Memory
Access"
* And, of course, headings are adjusted with section being promoted to
chapters

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

docs/core-api: move *{str,mem}dup* to "String Manipulation"

The string and memory duplication routines fit better to the "String
Manipulation" section than to "The SLAB Cache".

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

docs/core-api: kill trailing whitespace in kernel-api.rst

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/util: add kernel-doc for kvfree

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/util: make strndup_user description a kernel-doc comment

Patch series "memory management documentation updates", v3.

Here are several updates to the mm documentation.

Aside from really minor changes in the first three patches, the updates
are:

* move the documentation of kstrdup and friends to "String Manipulation"
  section
* split memory management API into a separate .rst file
* adjust formating of the GFP flags description and include it in the
  reference documentation.

This patch (of 7):

The description of the strndup_user function misses '*' character at the
beginning of the comment to be proper kernel-doc.  Add the missing
character.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/proc/vmcore.c: hide vmcoredd_mmap_dumps() for nommu builds

Without CONFIG_MMU, we get a build warning:

fs/proc/vmcore.c:228:12: error: 'vmcoredd_mmap_dumps' defined but not used [-Werror=unused-function]
static int vmcoredd_mmap_dumps(struct vm_area_struct *vma, unsigned long dst,

The function is only referenced from an #ifdef'ed caller, so
this uses the same #ifdef around it.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 7efe48df8a3d ("vmcore: append device dumps to vmcore as elf notes")
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Ganesh Goudar <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Rahul Lakkireddy <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

treewide: correct "differenciate" and "instanciate" typos

Also add these typos to spelling.txt so checkpatch.pl will look for them.

Link: http://lkml.kernel.org/r/88af06b9de34d870cb0afc46cfd24e0458be2575.1529471371.git.fthain@telegraphics.com.au
Signed-off-by: Finn Thain <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/afs: use new return type vm_fault_t

Use new return type vm_fault_t for fault handler in struct
vm_operations_struct. For now, this is just documenting that the
function returns a VM_FAULT value rather than an errno. Once all
instances are converted, vm_fault_t will become a distinct type.

See 1c8f422059ae ("mm: change return type to vm_fault_t") for reference.

Link: http://lkml.kernel.org/r/20180702152017.GA3780@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: David Howells <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t

Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.

See 1c8f422059ae ("mm: change return type to vm_fault_t") for reference.

Link: http://lkml.kernel.org/r/20180702155801.GA4010@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: soft-offline: close the race against page allocation

A process can be killed with SIGBUS(BUS_MCEERR_AR) when it tries to
allocate a page that was just freed on the way of soft-offline.  This is
undesirable because soft-offline (which is about corrected error) is
less aggressive than hard-offline (which is about uncorrected error),
and we can make soft-offline fail and keep using the page for good
reason like "system is busy."

Two main changes of this patch are:

- setting migrate type of the target page to MIGRATE_ISOLATE. As done
  in free_unref_page_commit(), this makes kernel bypass pcplist when
  freeing the page. So we can assume that the page is in freelist just
  after put_page() returns,

- setting PG_hwpoison on free page under zone->lock which protects
  freelists, so this allows us to avoid setting PG_hwpoison on a page
  that is decided to be allocated soon.

[[email protected]: tweak set_hwpoison_free_buddy_page() comment]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reported-by: Xishi Qiu <[email protected]>
Tested-by: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Cc: Mike Kravetz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: fix race on soft-offlining free huge pages

Patch series "mm: soft-offline: fix race against page allocation".

Xishi recently reported the issue about race on reusing the target pages
of soft offlining.  Discussion and analysis showed that we need make
sure that setting PG_hwpoison should be done in the right place under
zone->lock for soft offline.  1/2 handles free hugepage's case, and 2/2
hanldes free buddy page's case.

This patch (of 2):

There's a race condition between soft offline and hugetlb_fault which
causes unexpected process killing and/or hugetlb allocation failure.

The process killing is caused by the following flow:

  CPU 0               CPU 1              CPU 2

  soft offline
    get_any_page
    // find the hugetlb is free
                      mmap a hugetlb file
                      page fault
                        ...
                          hugetlb_fault
                            hugetlb_no_page
                              alloc_huge_page
                              // succeed
      soft_offline_free_page
      // set hwpoison flag
                                         mmap the hugetlb file
                                         page fault
                                           ...
                                             hugetlb_fault
                                               hugetlb_no_page
                                                 find_lock_page
                                                   return VM_FAULT_HWPOISON
                                           mm_fault_error
                                             do_sigbus
                                             // kill the process

The hugetlb allocation failure comes from the following flow:

  CPU 0                          CPU 1

                                 mmap a hugetlb file
                                 // reserve all free page but don't fault-in
  soft offline
    get_any_page
    // find the hugetlb is free
      soft_offline_free_page
      // set hwpoison flag
        dissolve_free_huge_page
        // fail because all free hugepages are reserved
                                 page fault
                                   ...
                                     hugetlb_fault
                                       hugetlb_no_page
                                         alloc_huge_page
                                           ...
                                             dequeue_huge_page_node_exact
                                             // ignore hwpoisoned hugepage
                                             // and finally fail due to no-mem

The root cause of this is that current soft-offline code is written based
on an assumption that PageHWPoison flag should be set at first to avoid
accessing the corrupted data.  This makes sense for memory_failure() or
hard offline, but does not for soft offline because soft offline is about
corrected (not uncorrected) error and is safe from data lost.  This patch
changes soft offline semantics where it sets PageHWPoison flag only after
containment of the error page completes successfully.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reported-by: Xishi Qiu <[email protected]>
Suggested-by: Xishi Qiu <[email protected]>
Tested-by: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Cc: Mike Kravetz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

namei: allow restricted O_CREAT of FIFOs and regular files

Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of the
directory or the file is opened without the O_CREAT flag.  The purpose
is to make data spoofing attacks harder.  This protection can be turned
on and off separately for FIFOs and regular files via sysctl, just like
the symlinks/hardlinks protection.  This patch is based on Openwall's
"HARDEN_FIFO" feature by Solar Designer.

This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:

CVE-2000-1134
CVE-2007-3852
CVE-2008-0525
CVE-2009-0416
CVE-2011-4834
CVE-2015-1838
CVE-2015-7442
CVE-2016-7489

This list is not meant to be complete.  It's difficult to track down all
vulnerabilities of this kind because they were often reported without any
mention of this particular attack vector.  In fact, before
hardlinks/symlinks restrictions, fifos/regular files weren't the favorite
vehicle to exploit them.

[[email protected]: fix bug reported by Dan Carpenter]
Link: https://lkml.kernel.org/r/20180426081456.GA7060@mwanda
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: drop pr_warn_ratelimited() in favor of audit changes in the future]
[[email protected]: adjust commit subjet]
Link: http://lkml.kernel.org/r/20180416175918.GA13494@beast
Signed-off-by: Salvatore Mesoraca <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Suggested-by: Solar Designer <[email protected]>
Suggested-by: Kees Cook <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hfs: prevent crash on exit from failed search

hfs_find_exit() expects fd->bnode to be NULL after a search has failed.
hfs_brec_insert() may instead set it to an error-valued pointer. Fix
this to prevent a crash.

Link: http://lkml.kernel.org/r/53d9749a029c41b4016c495fc5838c9dba3afc52.1530294815.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernández <[email protected]>
Cc: Anatoly Trosinenko <[email protected]>
Cc: Viacheslav Dubeyko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hfsplus: prevent crash on exit from failed search

hfs_find_exit() expects fd->bnode to be NULL after a search has failed.
hfs_brec_insert() may instead set it to an error-valued pointer. Fix
this to prevent a crash.

Link: http://lkml.kernel.org/r/803590a35221fbf411b2c141419aea3233a6e990.1530294813.git.ernesto.mnd.fernandez@gmail.com
Signed-off-by: Ernesto A. Fernandez <[email protected]>
Reported-by: Anatoly Trosinenko <[email protected]>
Reviewed-by: Vyacheslav Dubeyko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hfsplus: fix NULL dereference in hfsplus_lookup()

An HFS+ filesystem can be mounted read-only without having a metadata
directory, which is needed to support hardlinks.  But if the catalog
data is corrupted, a directory lookup may still find dentries claiming
to be hardlinks.

hfsplus_lookup() does check that ->hidden_dir is not NULL in such a
situation, but mistakenly does so after dereferencing it for the first
time.  Reorder this check to prevent a crash.

This happens when looking up corrupted catalog data (dentry) on a
filesystem with no metadata directory (this could only ever happen on a
read-only mount).  Wen Xu sent the replication steps in detail to the
fsdevel list: https://bugzilla.kernel.org/show_bug.cgi?id=200297

Link: http://lkml.kernel.org/r/20180712215344.q44dyrhymm4ajkao@eaf
Signed-off-by: Ernesto A. Fernández <[email protected]>
Reported-by: Wen Xu <[email protected]>
Cc: Viacheslav Dubeyko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

arm64: tlb: Provide forward declaration of tlb_flush() before including tlb.h

As of commit fd1102f0aade ("mm: mmu_notifier fix for tlb_end_vma"),
asm-generic/tlb.h now calls tlb_flush() from a static inline function,
so we need to make sure that it's declared before #including the
asm-generic header in the arch header.

Signed-off-by: Will Deacon <[email protected]>
Acked-by: Nicholas Piggin <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kbuild: rename LDFLAGS to KBUILD_LDFLAGS

Commit a0f97e06a43c ("kbuild: enable 'make CFLAGS=...' to add
additional options to CC") renamed CFLAGS to KBUILD_CFLAGS.

Commit 222d394d30e7 ("kbuild: enable 'make AFLAGS=...' to add
additional options to AS") renamed AFLAGS to KBUILD_AFLAGS.

Commit 06c5040cdb13 ("kbuild: enable 'make CPPFLAGS=...' to add
additional options to CPP") renamed CPPFLAGS to KBUILD_CPPFLAGS.

For some reason, LDFLAGS was not renamed.

Using a well-known variable like LDFLAGS may result in accidental
override of the variable.

Kbuild generally uses KBUILD_ prefixed variables for the internally
appended options, so here is one more conversion to sanitize the
naming convention.

I did not touch Makefiles under tools/ since the tools build system
is a different world.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Palmer Dabbelt <[email protected]>

kbuild: pass LDFLAGS to recordmcount.pl

Since commit 0fbe9a245c60 ("microblaze: add endianness options to
LDFLAGS instead of LD"), you cannot build the kernel for microblaze
with CONFIG_DYNAMIC_FTRACE.

Fixes: 0fbe9a245c60 ("microblaze: add endianness options to LDFLAGS instead of LD")
Signed-off-by: Masahiro Yamada <[email protected]>

kbuild: test dead code/data elimination support in Kconfig

This config option should be enabled only when both the compiler and
the linker support necessary flags. Add proper dependencies to Kconfig.

Signed-off-by: Masahiro Yamada <[email protected]>

Merge tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
"These patches include adding async support for the v4.2 COPY
  operation. I think Bruce is planning to send the server patches for
  the next release, but I figured we could get the client side out of
  the way now since it's been in my tree for a while. This shouldn't
  cause any problems, since the server will still respond with
  synchronous copies even if the client requests async.

  Features:
   - Add support for asynchronous server-side COPY operations

  Stable bufixes:
   - Fix an off-by-one in bl_map_stripe() (v3.17+)
   - NFSv4 client live hangs after live data migration recovery (v4.9+)
   - xprtrdma: Fix disconnect regression (v4.18+)
   - Fix locking in pnfs_generic_recover_commit_reqs (v4.14+)
   - Fix a sleep in atomic context in nfs4_callback_sequence() (v4.9+)

  Other bugfixes and cleanups:
   - Optimizations and fixes involving NFS v4.1 / pNFS layout handling
   - Optimize lseek(fd, SEEK_CUR, 0) on directories to avoid locking
   - Immediately reschedule writeback when the server replies with an
     error
   - Fix excessive attribute revalidation in nfs_execute_ok()
   - Add error checking to nfs_idmap_prepare_message()
   - Use new vm_fault_t return type
   - Return a delegation when reclaiming one that the server has
     recalled
   - Referrals should inherit proto setting from parents
   - Make rpc_auth_create_args a const
   - Improvements to rpc_iostats tracking
   - Fix a potential reference leak when there is an error processing a
     callback
   - Fix rmdir / mkdir / rename nlink accounting
   - Fix updating inode change attribute
   - Fix error handling in nfsn4_sp4_select_mode()
   - Use an appropriate work queue for direct-write completion
   - Don't busy wait if NFSv4 session draining is interrupted"

* tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (54 commits)
  pNFS: Remove unwanted optimisation of layoutget
  pNFS/flexfiles: ff_layout_pg_init_read should exit on error
  pNFS: Treat RECALLCONFLICT like DELAY...
  pNFS: When updating the stateid in layoutreturn, also update the recall range
  NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence()
  NFSv4: Fix locking in pnfs_generic_recover_commit_reqs
  NFSv4: Fix a typo in nfs4_init_channel_attrs()
  NFSv4: Don't busy wait if NFSv4 session draining is interrupted
  NFS recover from destination server reboot for copies
  NFS add a simple sync nfs4_proc_commit after async COPY
  NFS handle COPY ERR_OFFLOAD_NO_REQS
  NFS send OFFLOAD_CANCEL when COPY killed
  NFS export nfs4_async_handle_error
  NFS handle COPY reply CB_OFFLOAD call race
  NFS add support for asynchronous COPY
  NFS COPY xdr handle async reply
  NFS OFFLOAD_CANCEL xdr
  NFS CB_OFFLOAD xdr
  NFS: Use an appropriate work queue for direct-write completion
  NFSv4: Fix error handling in nfs4_sp4_select_mode()
  ...

Merge tag 'nfsd-4.19-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
"Chuck Lever fixed a problem with NFSv4.0 callbacks over GSS from
  multi-homed servers.

  The only new feature is a minor bit of protocol (change_attr_type)
  which the client doesn't even use yet.

  Other than that, various bugfixes and cleanup"

* tag 'nfsd-4.19-1' of git://linux-nfs.org/~bfields/linux: (27 commits)
  sunrpc: Add comment defining gssd upcall API keywords
  nfsd: Remove callback_cred
  nfsd: Use correct credential for NFSv4.0 callback with GSS
  sunrpc: Extract target name into svc_cred
  sunrpc: Enable the kernel to specify the hostname part of service principals
  sunrpc: Don't use stack buffer with scatterlist
  rpc: remove unneeded variable 'ret' in rdma_listen_handler
  nfsd: use true and false for boolean values
  nfsd: constify write_op[]
  fs/nfsd: Delete invalid assignment statements in nfsd4_decode_exchange_id
  NFSD: Handle full-length symlinks
  NFSD: Refactor the generic write vector fill helper
  svcrdma: Clean up Read chunk path
  svcrdma: Avoid releasing a page in svc_xprt_release()
  nfsd: Mark expected switch fall-through
  sunrpc: remove redundant variables 'checksumlen','blocksize' and 'data'
  nfsd: fix leaked file lock with nfs exported overlayfs
  nfsd: don't advertise a SCSI layout for an unsupported request_queue
  nfsd: fix corrupted reply to badly ordered compound
  nfsd: clarify check_op_ordering
  ...

Merge tag 'upstream-4.19-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:

- Year 2038 preparations

- New UBI feature to skip CRC checks of static volumes

- A new Kconfig option to disable xattrs in UBIFS

- Lots of fixes in UBIFS, found by our new test framework

* tag 'upstream-4.19-rc1' of git://git.infradead.org/linux-ubifs: (21 commits)
  ubifs: Set default assert action to read-only
  ubifs: Allow setting assert action as mount parameter
  ubifs: Rework ubifs_assert()
  ubifs: Pass struct ubifs_info to ubifs_assert()
  ubifs: Turn two ubifs_assert() into a WARN_ON()
  ubi: expose the volume CRC check skip flag
  ubi: provide a way to skip CRC checks
  ubifs: Use kmalloc_array()
  ubifs: Check data node size before truncate
  Revert "UBIFS: Fix potential integer overflow in allocation"
  ubifs: Add comment on c->commit_sem
  ubifs: introduce Kconfig symbol for xattr support
  ubifs: use swap macro in swap_dirty_idx
  ubifs: tnc: use monotonic znode timestamp
  ubifs: use timespec64 for inode timestamps
  ubifs: xattr: Don't operate on deleted inodes
  ubifs: gc: Fix typo
  ubifs: Fix memory leak in lprobs self-check
  ubi: Initialize Fastmap checkmapping correctly
  ubifs: Fix synced_i_size calculation for xattr inodes
  ...

Merge tag 'pwm/for-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
"This contains mostly minor bug fixes as well as some new chip support
  for existing drivers"

* tag 'pwm/for-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: mediatek: Add MT7628 support
  dt-bindings: pwm: Add MT7628 information
  dt-bindings: pwm: rcar: Add bindings for R-Car E3 support
  pwm: meson: Fix mux clock names
  pwm: stm32-lp: Remove useless loop in stm32_pwm_lp_remove()
  pwm: omap-dmtimer: Return -EPROBE_DEFER if no dmtimer platform data
  pwm: mxs: Switch to SPDX identifier
  dt-bindings: pwm: fsl-ftm: Add compatible string for i.MX8QM
  pwm: fsl-ftm: Enable support for the new SoC i.MX8QM
  pwm: fsl-ftm: Added the support of per-compatible data
  pwm: fsl-ftm: Added a dedicated IP interface clock
  pwm: cros-ec: Switch to SPDX identifier
  pwm: imx: Switch to SPDX identifier
  pwm: tiehrpwm: Fix disabling of output of PWMs
  pwm: tiehrpwm: Don't use emulation mode bits to control PWM output
  pwm: berlin: Don't use broken prescaler values

Merge tag 'fbdev-v4.19' of https://github.com/bzolnier/linux

Pull fbdev updates from Bartlomiej Zolnierkiewicz:
"Mostly small fixes and cleanups for fb drivers (the biggest updates
  are for udlfb and pxafb drivers). This also adds deferred console
  takeover support to the console code and efifb driver.

  Summary:

   - add support for deferred console takeover, when enabled defers
     fbcon taking over the console from the dummy console until the
     first text is displayed on the console - together with the "quiet"
     kernel commandline option this allows fbcon to still be used
     together with a smooth graphical bootup (Hans de Goede)

   - improve console locking debugging code (Thomas Zimmermann)

   - copy the ACPI BGRT boot graphics to the framebuffer when deferred
     console takeover support is used in efifb driver (Hans de Goede)

   - update udlfb driver - fix lost console when the user unplugs a USB
     adapter, fix the screen corruption issue, fix locking and add some
     performance optimizations (Mikulas Patocka)

   - update pxafb driver - fix using uninitialized memory, switch to
     devm_* API, handle initialization errors and add support for
     lcd-supply regulator (Daniel Mack)

   - add support for boards booted with a DeviceTree in pxa3xx_gcu
     driver (Daniel Mack)

   - rename omap2 module to omap2fb.ko to avoid conflicts with omap1
     driver (Arnd Bergmann)

   - enable ACPI-based enumeration for goldfishfb driver (Yu Ning)

   - fix goldfishfb driver to make user space Android code use 60 fps
     (Christoffer Dall)

   - print big fat warning when nomodeset kernel parameter is used in
     vgacon driver (Lyude Paul)

   - remove VLA usage from fsl-diu-fb driver (Kees Cook)

   - misc fixes (Julia Lawall, Geert Uytterhoeven, Fredrik Noring,
     Yisheng Xie, Dan Carpenter, Daniel Vetter, Anton Vasilyev, Randy
     Dunlap, Gustavo A. R. Silva, Colin Ian King, Fengguang Wu)

   - misc cleanups (Roman Kiryanov, Yisheng Xie, Colin Ian King)"

* tag 'fbdev-v4.19' of https://github.com/bzolnier/linux: (54 commits)
  Documentation/fb: corrections for fbcon.txt
  fbcon: Do not takeover the console from atomic context
  dummycon: Stop exporting dummycon_[un]register_output_notifier
  fbcon: Only defer console takeover if the current console driver is the dummycon
  fbcon: Only allow FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER if fbdev is builtin
  fbdev: omap2: omapfb: fix ifnullfree.cocci warnings
  fbdev: omap2: omapfb: fix bugon.cocci warnings
  fbdev: omap2: omapfb: fix boolreturn.cocci warnings
  fb: amifb: fix build warnings when not builtin
  fbdev/core: Disable console-lock warnings when fb.lockless_register_fb is set
  console: Replace #if 0 with atomic var 'ignore_console_lock_warning'
  udlfb: use spin_lock_irq instead of spin_lock_irqsave
  udlfb: avoid prefetch
  udlfb: optimization - test the backing buffer
  udlfb: allow reallocating the framebuffer
  udlfb: set line_length in dlfb_ops_set_par
  udlfb: handle allocation failure
  udlfb: set optimal write delay
  udlfb: make a local copy of fb_ops
  udlfb: don't switch if we are switching to the same videomode
  ...

Merge tag 'sound-fix-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"No surprises here: a regression fix for virmidi code refactoring,
  three fixes for the new AC97 bus compat and runtime PM, and a usual
  HD-audio quirk"

* tag 'sound-fix-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Fix HP Headset Mic can't record
  ALSA: ac97: fix unbalanced pm_runtime_enable
  ALSA: ac97: fix check of pm_runtime_get_sync failure
  ALSA: ac97: fix device initialization in the compat layer
  ALSA: seq: virmidi: Fix discarding the unsubscribed output

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull more rdma updates from Jason Gunthorpe:
"This is the SMC cleanup promised, a randconfig regression fix, and
  kernel oops fix.

  Summary:

   - Switch SMC over to rdma_get_gid_attr and remove the compat

   - Fix a crash in HFI1 with some BIOS's

   - Fix a randconfig failure"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/ucm: fix UCM link error
  IB/hfi1: Invalid NUMA node information can cause a divide by zero
  RDMA/smc: Replace ib_query_gid with rdma_get_gid_attr

Merge branch 'tlb-fixes'

Merge fixes for missing TLB shootdowns.

This fixes a couple of cases that involved us possibly freeing page
table structures before the required TLB shootdown had been done.

There are a few cleanup patches to make the code easier to follow, and
to avoid some of the more problematic cases entirely when not necessary.

To make this easier for backports, it undoes the recent lazy TLB
patches, because the cleanups and fixes are more important, and Rik is
ok with re-doing them later when things have calmed down.

The missing TLB flush was only delayed, and the wrong ordering only
happened under memory pressure (and in theory under a couple of other
fairly theoretical situations), so this may have been all very unlikely
to have hit people in practice.

But getting the TLB shootdown wrong is _so_ hard to debug and see that I
consider this a crticial fix.

Many thanks to Jann Horn for having debugged this.

* tlb-fixes:
  x86/mm: Only use tlb_remove_table() for paravirt
  mm: mmu_notifier fix for tlb_end_vma
  mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
  mm/tlb: Remove tlb_remove_table() non-concurrent condition
  mm: move tlb_table_flush to tlb_flush_mmu_free
  x86/mm/tlb: Revert the recent lazy TLB patches

Merge tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes and cleanups from Juergen Gross:
"Some cleanups, some minor fixes and a fix for a bug introduced in this
  merge window hitting 32-bit PV guests"

* tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: enable early use of set_fixmap in 32-bit Xen PV guest
  xen: remove unused hypercall functions
  x86/xen: remove unused function xen_auto_xlated_memory_setup()
  xen/ACPI: don't upload Px/Cx data for disabled processors
  x86/Xen: further refine add_preferred_console() invocations
  xen/mcelog: eliminate redundant setting of interface version
  x86/Xen: mark xen_setup_gdt() __init

Merge tag 'mips_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:

  - Fix microMIPS build failures by adding a .insn directive to the
    barrier_before_unreachable() asm statement in order to convince the
    toolchain that the asm statement is a valid branch target rather
    than a bogus attempt to switch ISA.

  - Clean up our declarations of TLB functions that we overwrite with
    generated code in order to prevent the compiler making assumptions
    about alignment that cause microMIPS kernels built with GCC 7 &
    above to die early during boot.

  - Fix up a regression for MIPS32 kernels which slipped into the main
    MIPS pull for 4.19, causing CONFIG_32BIT=y kernels to contain
    inappropriate MIPS64 instructions.

  - Extend our existing workaround for MIPSr6 builds that end up using
    the __multi3 intrinsic to GCC 7 & below, rather than just GCC 7.

* tag 'mips_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7
  MIPS: Workaround GCC __builtin_unreachable reordering bug
  compiler.h: Allow arch-specific asm/compiler.h
  MIPS: Avoid move psuedo-instruction whilst using MIPS_ISA_LEVEL
  MIPS: Consistently declare TLB functions
  MIPS: Export tlbmiss_handler_setup_pgd near its definition

Merge tag 'for-linus' of git://github.com/openrisc/linux

Pull OpenRISC update from Stafford Horne:
"Just one change for 4.19: refactoring from Christoph Hellwig to use
  generic DMA facilities"

* tag 'for-linus' of git://github.com/openrisc/linux:
  openrisc: use generic dma_noncoherent_ops
  openrisc: fix cache maintainance the the sync_single_for_device DMA operation
  openrisc: remove the no-op unmap_page and unmap_sg DMA operations
  openrisc: remove the sync_single_for_cpu DMA operation

Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM device-tree updates from Olof Johansson:
"Business as usual -- the bulk of our changes are to devicetree files
  with new hardware support, new SoCs and platforms, and new board
  types.

  New SoCs/platforms:
   - Raspberry Pi Compute Module (CM1) and IO board
   - i.MX6SSL from NXP
   - Renesas RZ/N1D SoC (R9A06G032), Dual Cortex-A7 with Ethernet, CAN
     and PLC interfaces
   - TI AM654 SoC, Quad Cortex-A53, safety subsystem with Cortex-R5
     controllers, communication and PRU subsystem and lots of other
     interfaces (PCIe, USB3, etc).

  New boards and systems:
   - Several Atmel at91-based boards from Laird
   - Marvell Armada388-based Helios4 board from SolidRun
   - Samsung Aires-based phones (s5pv210)
   - Allwinner A64-based Pinebook laptop

  In addition to the above, there's the usual amount of new devices
  described on existing platforms, fixes and tweaks and new minor
  variants of boards/platforms"

* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (478 commits)
  arm64: dts: sdm845: Add tsens nodes
  arm64: dts: msm8996: thermal: Initialise via DT and add second controller
  arm64: dts: sprd: Add one suspend timer
  arm64: dts: sprd: Add SC27XX ADC device
  arm64: dts: sprd: Add SC27XX eFuse device
  arm64: dts: sprd: Add SC27XX vibrator device
  arm64: dts: sprd: Add SC27XX breathing light controller device
  arm64: dts: meson-axg: add spdif-dit codec
  arm64: dts: meson-axg: add lineout codec
  arm64: dts: meson-axg: add linein codec
  arm64: dts: meson-axg: add tdm interfaces
  arm64: dts: meson-axg: add tdmout formatters
  arm64: dts: meson-axg: add tdmin formatters
  arm64: dts: meson-axg: add spdifout
  arm64: dts: rockchip: add led support for Firefly-RK3399
  arm64: dts: rockchip: remove deprecated Type-C PHY properties on rk3399
  arm64: dts: rockchip: add power button support for Firefly-RK3399
  ARM: dts: aspeed: Add coprocessor interrupt controller
  arm64: dts: meson-axg: add audio arb reset controller
  arm64: dts: meson-axg: add usb power regulator
  ...

Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC defconfig updates from Olof Johansson:
"We keep these separate since some files are shared and conflict-prone,
  but there isn't really much to write about here.

  Some of the churnier pieces is for the Aspeed platforms, which did an
  overdue refresh of the defconfig, and enabled USB gadget and some
  drivers from there. Most of the rest are minor additions here and
  there to turn on drivers that are needed or useful on the various
  platforms"

* tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
  ARM: multi_v7_defconfig: add CONFIG_UNIPHIER_THERMAL and CONFIG_SNI_AVE
  ARM: config: aspeed: Enable new FSI drivers
  ARM: config: multi_v5: Enable ASPEED drivers
  ARM: config: multi_v5: Refresh configuration
  ARM: config: aspeed: Update defconfig
  ARM: multi_v7_defconfig: Enable support for RZN1D-DB
  ARM: shmobile: defconfig: Disable /sbin/hotplug fork-bomb
  ARM: shmobile: defconfig: Enable support for RZN1D-DB
  ARM: shmobile: defconfig: Enable reset controller support
  ARM: shmobile: defconfig: Drop NET_VENDOR_<FOO>=n
  arm64: defconfig: Enable more peripherals for Samsung Chromebook Plus.
  arm64: defconfig: Enable CONFIG_MTD_NAND_QCOM for IPQ8074
  ARM: qcom_defconfig: Enable QCOM NAND related configs
  ARM: imx_v6_v7_defconfig: add DMATEST support
  ARM: mvebu_v7_defconfig: enable SFP support
  ARM: mvebu_v7_defconfig: sync defconfig
  ARM: multi_v7_defconfig: Add Marvell NAND controller support
  arm: configs: Add USB gadget to Aspeed G5 defconfig
  arm: configs: Add USB gadget to Aspeed G4 defconfig
  arm64: defconfig: enable HiSilicon PMU driver
  ...

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
"Some of the larger changes this merge window:

   - Removal of drivers for Exynos5440, a Samsung SoC that never saw
     widespread use.

   - Uniphier support for USB3 and SPI reset handling

   - Syste control and SRAM drivers and bindings for Allwinner platforms

   - Qualcomm AOSS (Always-on subsystem) reset controller drivers

   - Raspberry Pi hwmon driver for voltage

   - Mediatek pwrap (pmic) support for MT6797 SoC"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (52 commits)
  drivers/firmware: psci_checker: stash and use topology_core_cpumask for hotplug tests
  soc: fsl: cleanup Kconfig menu
  soc: fsl: dpio: Convert DPIO documentation to .rst
  staging: fsl-mc: Remove remaining files
  staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl
  staging: fsl-dpaa2: eth: move generic FD defines to DPIO
  soc: fsl: qe: gpio: Add qe_gpio_set_multiple
  usb: host: exynos: Remove support for Exynos5440
  clk: samsung: Remove support for Exynos5440
  soc: sunxi: Add the A13, A23 and H3 system control compatibles
  reset: uniphier: add reset control support for SPI
  cpufreq: exynos: Remove support for Exynos5440
  ata: ahci-platform: Remove support for Exynos5440
  soc: imx6qp: Use GENPD_FLAG_ALWAYS_ON for PU errata
  soc: mediatek: pwrap: add mt6351 driver for mt6797 SoCs
  soc: mediatek: pwrap: add pwrap driver for mt6797 SoCs
  soc: mediatek: pwrap: fix cipher init setting error
  dt-bindings: pwrap: mediatek: add pwrap support for MT6797
  reset: uniphier: add USB3 core reset control
  dt-bindings: reset: uniphier: add USB3 core reset support
  ...

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM 32-bit SoC platform updates from Olof Johansson:
"Most of the SoC updates in this cycle are cleanups and moves to more
  modern infrastructure:

   - Davinci was moved to common clock framework

   - OMAP1-based Amstrad E3 "Superphone" saw a bunch of cleanups to the
     keyboard interface (bitbanged AT keyboard via GPIO).

   - Removal of some stale code for Renesas platforms

   - Power management improvements for i.MX6LL"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (112 commits)
  ARM: uniphier: select RESET_CONTROLLER
  arm64: uniphier: select RESET_CONTROLLER
  ARM: uniphier: remove empty Makefile
  ARM: exynos: Clear global variable on init error path
  ARM: exynos: Remove outdated maintainer information
  ARM: shmobile: Always enable ARCH_TIMER on SoCs with A7 and/or A15
  ARM: shmobile: r8a7779: hide unused r8a7779_platform_cpu_kill
  soc: r9a06g032: don't build SMP files for non-SMP config
  ARM: shmobile: Add the R9A06G032 SMP enabler driver
  ARM: at91: pm: configure wakeup sources for ULP1 mode
  ARM: at91: pm: add PMC fast startup registers defines
  ARM: at91: pm: Add ULP1 mode support
  ARM: at91: pm: Use ULP0 naming instead of slow clock
  ARM: hisi: handle of_iomap and fix missing of_node_put
  ARM: hisi: check of_iomap and fix missing of_node_put
  ARM: hisi: fix error handling and missing of_node_put
  ARM: mx5: Set the DBGEN bit in ARM_GPC register
  ARM: imx51: Configure M4IF to avoid visual artifacts
  ARM: imx: call imx6sx_cpuidle_init() conditionally for 6sll
  ARM: imx: fix i.MX6SLL build
  ...

Merge tag 'riscv-for-linus-4.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V fixes from Palmer Dabbelt:
"This contains a pair of fixes to the RISC-V port:

   - The removal of our compat.h, which didn't do anything.

   - Fixes to sys_riscv_flush_icache to ensure it actually shows up.

     We're going to just call this a bug in the ABI, as it was always
     supposed to be there.

  I've given these a simple build+boot test, both individually and as
  the actual tag"

* tag 'riscv-for-linus-4.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  riscv: Delete asm/compat.h
  RISC-V: Don't use a global include guard for uapi/asm/syscalls.h
  RISC-V: Define sys_riscv_flush_icache when SMP=n

cifs: update internal module version number for cifs.ko to 2.12

Signed-off-by: Steve French <[email protected]>

cifs: check kmalloc before use

The kmalloc was not being checked - if it fails issue a warning
and return -ENOMEM to the caller.

Signed-off-by: Nicholas Mc Guire <[email protected]>
Fixes: b8da344b74c8 ("cifs: dynamic allocation of ntlmssp blob")
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
cc: Stable <[email protected]>`

cifs: check if SMB2 PDU size has been padded and suppress the warning

Some SMB2/3 servers, Win2016 but possibly others too, adds padding
not only between PDUs in a compound but also to the final PDU.
This padding extends the PDU to a multiple of 8 bytes.

Check if the unexpected length looks like this might be the case
and avoid triggering the log messages for :

"SMB2 server sent bad RFC1001 len %d not %d\n"

Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>

cifs: create a define for how many iovs we need for an SMB2_open()

Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>

Merge tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"Masami found an off by one bug in the code that keeps "notrace"
  functions from being traced by kprobes. During my testing, I found
  that there's places that we may want to add kprobes to notrace, thus
  we may end up changing this code before 4.19 is released.

  The history behind this change is that we found that adding kprobes to
  various notrace functions caused the kernel to crashed. We took the
  safe route and decided not to allow kprobes to trace any notrace
  function.

  But because notrace is added to functions that just cause weird side
  effects to the function tracer, but are still safe, preventing kprobes
  for all notrace functios may be too much of a big hammer.

  One such place is __schedule() is marked notrace, to keep function
  tracer from doing strange recursive loops when it gets traced with
  NEED_RESCHED set. With this change, one can not add kprobes to the
  scheduler.

  Masami also added code to use gcov on ftrace"

* tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/kprobes: Fix to check notrace function with correct range
  tracing: Allow gcov profiling on only ftrace subsystem

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Fixes 2018-08-23

This series contains bug fixes to the ice driver.

Anirudh provides several fixes, starting with static analysis fixes by
replacing a bitwise-and with a constant value and replace "magic"
numbers with defines.  Fixed the control queue processing by removing
unnecessary read/writes to registers, as well as getting a accurate
value for "pending".  Added additional checks to avoid NULL pointer
dereferences.  Fixed up code formatting issues, by cleaning up code
comments and coding style.

Bruce cleans up a duplicate check for owner, within the same function.
Also cleans up interrupt causes that are not handled or applicable.
Fix checkpatch warning about the use of bool in structures due to the
wasted space and size of bool, so convert struct members to u8 instead.

Jake fixes a number of potential bugs in the reporting of stats via
ethtool, by simply reporting all the queue statistics, even for the
queues that are not activated.  Fixed a compiler warning, as well as
make the code a bit cleaner but just using order_base_2() for
calculating the power-of-2.

Preethi adds a check to avoid a NULL pointer dereference crash during
initialization.

Brett clarifies the code when it comes to port VLANs and regular VLANs,
by renaming defines and field values to match their intended use and
purpose.

Jesse initializes a variable to avoid garbage values being returned to
the caller.
====================

Signed-off-by: David S. Miller <[email protected]>

x86/mm: Only use tlb_remove_table() for paravirt

If we don't use paravirt; don't play unnecessary and complicated games
to free page-tables.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: mmu_notifier fix for tlb_end_vma

The generic tlb_end_vma does not call invalidate_range mmu notifier, and
it resets resets the mmu_gather range, which means the notifier won't be
called on part of the range in case of an unmap that spans multiple
vmas.

ARM64 seems to be the only arch I could see that has notifiers and uses
the generic tlb_end_vma.  I have not actually tested it.

[ Catalin and Will point out that ARM64 currently only uses the
  notifiers for KVM, which doesn't use the ->invalidate_range()
  callback right now, so it's a bug, but one that happens to
  not affect them.  So not necessary for stable.  - Linus ]

Signed-off-by: Nicholas Piggin <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

Jann reported that x86 was missing required TLB invalidates when he
hit the !*batch slow path in tlb_remove_table().

This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
invalidates, the PowerPC-hash where this code originated and the
Sparc-hash where this was subsequently used did not need that. ARM
which later used this put an explicit TLB invalidate in their
__p*_free_tlb() functions, and PowerPC-radix followed that example.

But when we hooked up x86 we failed to consider this. Fix this by
(optionally) hooking tlb_remove_table() into the TLB invalidate code.

NOTE: s390 was also needing something like this and might now
be able to use the generic code again.

[ Modified to be on top of Nick's cleanups, which simplified this patch
now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]

Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: David Miller <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/tlb: Remove tlb_remove_table() non-concurrent condition

Will noted that only checking mm_users is incorrect; we should also
check mm_count in order to cover CPUs that have a lazy reference to
this mm (and could do speculative TLB operations).

If removing this turns out to be a performance issue, we can
re-instate a more complete check, but in tlb_table_flush() eliding the
call_rcu_sched().

Fixes: 267239116987 ("mm, powerpc: move the RCU page-table freeing into generic code")
Reported-by: Will Deacon <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: David Miller <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: move tlb_table_flush to tlb_flush_mmu_free

There is no need to call this from tlb_flush_mmu_tlbonly, it logically
belongs with tlb_flush_mmu_free.  This makes future fixes simpler.

[ This was originally done to allow code consolidation for the
  mmu_notifier fix, but it also ends up helping simplify the
  HAVE_RCU_TABLE_INVALIDATE fix.    - Linus ]

Signed-off-by: Nicholas Piggin <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

getxattr: use correct xattr length

When running in a container with a user namespace, if you call getxattr
with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
silently skips the user namespace fixup that it normally does resulting in
un-fixed-up data being returned.
This is caused by posix_acl_fix_xattr_to_user() being passed the total
buffer size and not the actual size of the xattr as returned by
vfs_getxattr().
This commit passes the actual length of the xattr as returned by
vfs_getxattr() down.

A reproducer for the issue is:

  touch acl_posix

  setfacl -m user:0:rwx acl_posix

and the compile:

  #define _GNU_SOURCE
  #include <errno.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/types.h>
  #include <unistd.h>
  #include <attr/xattr.h>

  /* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
  int main(int argc, void **argv)
  {
          ssize_t ret1, ret2;
          char buf1[128], buf2[132];
          int fret = EXIT_SUCCESS;
          char *file;

          if (argc < 2) {
                  fprintf(stderr,
                          "Please specify a file with "
                          "\"system.posix_acl_access\" permissions set\n");
                  _exit(EXIT_FAILURE);
          }
          file = argv[1];

          ret1 = getxattr(file, "system.posix_acl_access",
                          buf1, sizeof(buf1));
          if (ret1 < 0) {
                  fprintf(stderr, "%s - Failed to retrieve "
                                  "\"system.posix_acl_access\" "
                                  "from \"%s\"\n", strerror(errno), file);
                  _exit(EXIT_FAILURE);
          }

          ret2 = getxattr(file, "system.posix_acl_access",
                          buf2, sizeof(buf2));
          if (ret2 < 0) {
                  fprintf(stderr, "%s - Failed to retrieve "
                                  "\"system.posix_acl_access\" "
                                  "from \"%s\"\n", strerror(errno), file);
                  _exit(EXIT_FAILURE);
          }

          if (ret1 != ret2) {
                  fprintf(stderr, "The value of \"system.posix_acl_"
                                  "access\" for file \"%s\" changed "
                                  "between two successive calls\n", file);
                  _exit(EXIT_FAILURE);
          }

          for (ssize_t i = 0; i < ret2; i++) {
                  if (buf1[i] == buf2[i])
                          continue;

                  fprintf(stderr,
                          "Unexpected different in byte %zd: "
                          "%02x != %02x\n", i, buf1[i], buf2[i]);
                  fret = EXIT_FAILURE;
          }

          if (fret == EXIT_SUCCESS)
                  fprintf(stderr, "Test passed\n");
          else
                  fprintf(stderr, "Test failed\n");

          _exit(fret);
  }
and run:

  ./tester acl_posix

On a non-fixed up kernel this should return something like:

  root@c1:/# ./t
  Unexpected different in byte 16: ffffffa0 != 00
  Unexpected different in byte 17: ffffff86 != 00
  Unexpected different in byte 18: 01 != 00

and on a fixed kernel:

  root@c1:~# ./t
  Test passed

Cc: [email protected]
Fixes: 2f6f0654ab61 ("userns: Convert vfs posix_acl support to use kuids and kgids")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199945
Reported-by: Colin Watson <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Acked-by: Serge Hallyn <[email protected]>
Signed-off-by: Eric W. Biederman <[email protected]>

ice: Trivial formatting fixes

1) Add missing "\n" when printing link event error message.

2) Update dev_err statement in probe.

3) Add function description for ice_clear_pf_cfg.

4) Fix coding style for ice_acquire_nvm.

5) netdev->mtu is unsigned so use %u.

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Change struct members from bool to u8

Recent versions of checkpatch have a new warning based on a documented
preference of Linus to not use bool in structures due to wasted space and
the size of bool is implementation dependent. For more information, see
the email thread at https://lkml.org/lkml/2017/11/21/384.

Signed-off-by: Bruce Allan <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix potential return of uninitialized value

In ice_vsi_setup_[tx|rx]_rings, err is uninitialized which can result in
a garbage value return to the caller. Fix that.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix a few null pointer dereference issues

1) When ice_ena_msix_range() fails to reserve vectors, a devm_kfree()
warning was seen in the error flow path. So check pf->irq_tracker
before use in ice_clear_interrupt_scheme().

2) In ice_vsi_cfg(), check vsi->netdev before use.

3) In ice_get_link_status, check link_up before use.

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Update to interrupts enabled in OICR

Remove the following interrupt causes that are not applicable or not
handled:
- PFINT_OICR_HLP_RDY_M
- PFINT_OICR_CPM_RDY_M
- PFINT_OICR_GPIO_M
- PFINT_OICR_STORM_DETECT_M

Add the following interrupt cause that's actually handled in ice_misc_intr:
- PFINT_OICR_PE_CRITERR_M

Signed-off-by: Bruce Allan <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

tools: bpftool: return from do_event_pipe() on bad arguments

When command line parsing fails in the while loop in do_event_pipe()
because the number of arguments is incorrect or because the keyword is
unknown, an error message is displayed, but bpftool remains stuck in
the loop. Make sure we exit the loop upon failure.

Fixes: f412eed9dfde ("tools: bpftool: add simple perf event output reader")
Signed-off-by: Quentin Monnet <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

gcc-plugins: Disable when building under Clang

Prior to doing compiler feature detection in Kconfig, attempts to build
GCC plugins with Clang would fail the build, much in the same way missing
GCC plugin headers would fail the build. However, now that this logic
has been lifted into Kconfig, add an explicit test for GCC (instead of
duplicating it in the feature-test script).

Reported-by: Stefan Agner <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Reviewed-by: Masahiro Yamada <[email protected]>

ice: Set VLAN flags correctly

In the struct ice_aqc_vsi_props the field port_vlan_flags is an
overloaded term because it is used for both port VLANs (PVLANs) and
regular VLANs. This is an issue and is very confusing especially when
dealing with VFs because normal VLANs and port VLANs are not the same.
To fix this the field was renamed to vlan_flags and all of the #define's
labeled *_PVLAN_* were renamed to *_VLAN_* if they are not specific to
port VLANs.

Also in ice_vsi_manage_vlan_stripping, set the ICE_AQ_VSI_VLAN_MODE_ALL
bit to allow the driver to add a VLAN tag to all packets it sends.

Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Use order_base_2 to calculate higher power of 2

Currently, we use a combination of ilog2 and is_power_of_2() to
calculate the next power of 2 for the qcount. This appears to be causing
a warning on some combinations of GCC and the Linux kernel:

MODPOST 1 modules
WARNING: "____ilog2_NaN" [ice.ko] undefined!

This appears to because because GCC realizes that qcount could be zero
in some circumstances and thus attempts to link against the
intentionally undefined ___ilog2_NaN function.

The order_base_2 function is intentionally defined to return 0 when
passed 0 as an argument, and thus will be safe to use here.

This not only fixes the warning but makes the resulting code slightly
cleaner, and is really what we should have used originally.

Also update the comment to make it more clear that we are rounding up,
not just incrementing the ilog2 of qcount unconditionally.

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Fix bugs in control queue processing

This patch is a consolidation of multiple bug fixes for control queue
processing.

1)  In ice_clean_adminq_subtask() remove unnecessary reads/writes to
    registers. The bits PFINT_FW_CTL, PFINT_MBX_CTL and PFINT_SB_CTL
    are not set when an interrupt arrives, which means that clearing them
    again can be omitted.

2)  Get an accurate value in "pending" by re-reading the control queue
    head register from the hardware.

3)  Fix a corner case involving lost control queue messages by checking
    for new control messages (using ice_ctrlq_pending) before exiting the
    cleanup routine.

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Clean control queues only when they are initialized

Clean control queues only when they are initialized. One of the ways to
validate if the basic initialization is done is by checking value of
cq->sq.head and cq->rq.head variables that specify the register address.
This patch adds a check to avoid NULL pointer dereference crash when tried
to shutdown uninitialized control queue.

Signed-off-by: Preethi Banala <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Report stats for allocated queues via ethtool stats

It is not safe to have the string table for statistics change order or
size over the lifetime of a given netdevice. This is because of the
nature of the 3-step process for obtaining stats. First, user space
performs a request for the size of the strings table. Second it performs
a separate request for the strings themselves, after allocating space
for the table. Third, it requests the stats themselves, also allocating
space for the table.

If the size decreased, there is potential to see garbage data or stats
values. In the worst case, we could potentially see stats values become
mis-aligned with their strings, so that it looks like a statistic is
being reported differently than it actually is.

Even worse, if the size increased, there is potential that the strings
table or stats table was not allocated large enough and the stats code
could access and write to memory it should not, potentially resulting in
undefined behavior and system crashes.

It isn't even safe if the size always changes under the RTNL lock. This
is because the calls take place over multiple user space commands, so it
is not possible to hold the RTNL lock for the entire duration of
obtaining strings and stats. Further, not all consumers of the ethtool
API are the user space ethtool program, and it is possible that one
assumes the strings will not change (valid under the current contract),
and thus only requests the stats values when requesting stats in a loop.

Finally, it's not possible in the general case to detect when the size
changes, because it is quite possible that one value which could impact
the stat size increased, while another decreased. This would result in
the same total number of stats, but reordering them so that stats no
longer line up with the strings they belong to. Since only size changes
aren't enough, we would need some sort of hash or token to determine
when the strings no longer match. This would require extending the
ethtool stats commands, but there is no more space in the relevant
structures.

The real solution to resolve this would be to add a completely new API
for stats, probably over netlink.

In the ice driver, the only thing impacting the stats that is not
constant is the number of queues. Instead of reporting stats for each
used queue, report stats for each allocated queue. We do not change the
number of queues allocated for a given netdevice, as we pass this into
the alloc_etherdev_mq() function to set the num_tx_queues and
num_rx_queues.

This resolves the potential bugs at the slight cost of displaying many
queue statistics which will not be activated.

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

ice: Cleanup magic number

Use define for the unit size shift of the Rx LAN context descriptor base
address instead of the magic number 7.

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>