Git Repo - linux.git/log

KVM: arm/arm64: vgic: Use READ_ONCE fo cmpxchg

There is a small chance that the compiler could generate separate loads
for the dist->propbaser which could be modified from another CPU. As we
want to make sure we atomically update the entire value, and don't race
with other updates, guarantee that the cmpxchg operation compares
against the original value.

Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"

------------[ cut here ]------------
WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
Call Trace:
  vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
  ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
  ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x8f/0x750
  ? trace_hardirqs_on_thunk+0x1a/0x1c
  entry_SYSCALL64_slow_path+0x25/0x25

This can be reproduced by booting L1 guest w/ 'noapic' grub parameter, which
means that tells the kernel to not make use of any IOAPICs that may be present
in the system.

Actually external_intr variable in nested_vmx_vmexit() is the req_int_win
variable passed from vcpu_enter_guest() which means that the L0's userspace
requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) &&
L0's userspace reqeusts an irq window) is true, so there is no interrupt which
L1 requires to inject to L2, we should not attempt to emualte "Acknowledge
interrupt on exit" for the irq window requirement in this scenario.

This patch fixes it by not attempt to emulate "Acknowledge interrupt on exit"
if there is no L1 requirement to inject an interrupt to L2.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
[Added code comment to make it obvious that the behavior is not correct.
We should do a userspace exit with open interrupt window instead of the
nested VM exit.  This patch still improves the behavior, so it was
accepted as a (temporary) workaround.]
Signed-off-by: Radim Krčmář <[email protected]>

rtlwifi: Replace hardcode value with macro

In _rtl_init_mac80211(), hardcoded value for hw->max_listen_interval
and hw->max_rate_tries are replaced by macro and removed the comment.

Signed-off-by: Souptick Joarder <[email protected]>
Acked-by: Larry Finger <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

mwifiex: pcie: compatible with wifi-only image while extract wifi-part fw

Sometimes, we might using wifi-only firmware with a combo firmware name,
in this case, do not need to filter bluetooth part from header.

Signed-off-by: Xinming Hu <[email protected]>
Signed-off-by: Cathy Luo <[email protected]>
Reviewed-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

mwifiex: make addba request command clean

uninitilized variable, such as .add_req_result might be magic stack
value. Initialize the structure to make it clean.

Signed-off-by: Xinming Hu <[email protected]>
Signed-off-by: Cathy Luo <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

net: qtnfmac: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <[email protected]>
Reviewed-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

hostap: Fix outdated comment about dev->destructor

After commit cf124db566e6 ("net: Fix inconsistent teardown and
release of private netdev state."), setting
'dev->needs_free_netdev' ensures device data is released, and
'dev->destructor' is not used anymore.

Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Stefano Brivio <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8192ee: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   1899     928       0    2827     b0b realtek/rtlwifi/rtl8192ee/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   1963     864       0    2827     b0b realtek/rtlwifi/rtl8192ee/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8188ee: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   3090     912       0    4002     fa2 realtek/rtlwifi/rtl8188ee/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   3154     848       0    4002     fa2 realtek/rtlwifi/rtl8188ee/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8723be: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   3032     912       0    3944     f68 realtek/rtlwifi/rtl8723be/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   3096     848       0    3944     f68 realtek/rtlwifi/rtl8723be/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8723ae: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   2775     912       0    3687     e67 realtek/rtlwifi/rtl8723ae/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   2839     848       0    3687     e67 realtek/rtlwifi/rtl8723ae/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8821ae: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   2491     960       0    3451     d7b realtek/rtlwifi/rtl8821ae/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   2587     864       0    3451     d7b realtek/rtlwifi/rtl8821ae/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8192se: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   2817    1040       0    3857     f11 realtek/rtlwifi/rtl8192se/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   3009     848       0    3857     f11 realtek/rtlwifi/rtl8192se/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl8192de: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   2833     945      12    3790     ece realtek/rtlwifi/rtl8192de/sw.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   2929     849      12    3790     ece realtek/rtlwifi/rtl8192de/sw.o

Signed-off-by: Arvind Yadav <[email protected]>
Acked-by: Larry Finger <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: Tidy up DMA mask setting

As the only caller of dma_supported() outside of DMA API internals, the
qtfnmac driver stands out and invites scrutiny. Thankfully, it's not
being used for evil, but it is entirely redundant, since it open-codes a
check that the DMA mask setting functions are going to perform anyway.
In fact, the whole qtnf_pcie_init_dma_mask() function is nothing more
than a rather long-winded implementation of dma_set_mask_and_coherent(),
so let's just use that directly.

Signed-off-by: Robin Murphy <[email protected]>
Acked-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: prepare for AP_VLAN interface type support

Modify qlink command structures and interface types handling
to prepare adding AP_VLAN support to qtnfmac driver.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: remove function qtnf_cmd_skb_put_action

This function is not used anymore, so remove it.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: fix handling of iftype mask reported by firmware

Firmware sends supported interface type rather than mask. As a result,
types field of ieee80211_iface_limit structure may end up having
multiple iftype bits set. This leads to WARN_ON from
wiphy_verify_combinations.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: implement scan timeout

Userspace tools may hang on scan in the case when scan completion event
is not returned by firmware. This patch implements the scan timeout
to avoid such situation.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: implement cfg80211 channel_switch handler

This patch implements cfg80211 channel_switch handler enabling CSA
channel-switch procedure.

Driver performs only basic validation of the requested new channel
and then sends command to firmware. Beacon IEs are not sent since
beacon update is handled by firmware.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: move current channel info from vif to mac

Wireless cfg80211 core supplies channel settings in cfg80211_ap_settings
structure for each BSS in multiple BSS configuration. On the other hand
all the virtual interfaces on one radio are using the same PHY settings
including channel.

Move chandef structure from vif to mac structure in order to mantain
the only instance of cfg80211_chan_def structure in qtnf_wmac
rather than its multiple copies in qtnf_vif.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: fix station leave reason endianness

Use proper endianness conversion for client station leave reason.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: implement reporting current channel

Implement current channel reporting functionality. Current operating
channel can be obtained either directly using cfg80211 get_channel
callback or from stats reported by cfg80211 survey_dump callback.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: implement cfg80211 dump_survey handler

This patch implements cfg80211 dump_survey handler enabling
per-channel survey data reports.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: add missing bus lock

Add missing bus lock into get_mac_chan_info command.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: regulatory configuration for self-managed setup

Regdomain information needs to be registered with cfg80211
for devices with REGULATORY_WIPHY_SELF_MANAGED flag set.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

qtnfmac: updates for regulatory support

On startup driver obtains regulatory rules from firmware and
enables them during wiphy registration. Later on regulatory
domain change can be requested by host. In this case firmware
is notified about the upcoming changes. If the change is valid,
then firmware updates hardware channel configuration and host
driver receives updated channel info for each band.

Signed-off-by: Igor Mitsyanko <[email protected]>
Signed-off-by: Sergey Matyukevich <[email protected]>
Signed-off-by: Avinash Patil <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: Fix fallback firmware loading

Commit f70e4df2b384 ("rtlwifi: Add code to read new versions of
firmware") added code to load an old firmware file if the new one is
not available. Unfortunately that code is never reached because
request_firmware_nowait() does not wait for the firmware to show up
and returns 0 even if the file is not there.

Use the existing fallback mechanism introduced by commit 62009b7f1279
("rtlwifi: rtl8192cu: Add new firmware") instead.

Fixes: f70e4df2b384 ("rtlwifi: Add code to read new versions of firmware")
Cc: [email protected]
Signed-off-by: Sven Joachim <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

mwifiex: correct IE parse during association

It is observed that some IEs get missed during association.
This patch correct the old IE parse code. sme->ie will be
store as wpa ie, wps ie, wapi ie and gen ie accordingly.

Signed-off-by: Xinming Hu <[email protected]>
Signed-off-by: Cathy Luo <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

rtlwifi: rtl_pci_probe: Fix fail path of _rtl_pci_find_adapter

_rtl_pci_find_adapter fail path will jump to label fail3 for
unsupported adapter types.

However, on course for fail3 there will be call rtl_deinit_core
before rtl_init_core.

For the inclusion of checking pci_iounmap this fail can be moved to
fail2.

Fixes
[ 4.492963] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 4.493067] IP: rtl_deinit_core+0x31/0x90 [rtlwifi]

Signed-off-by: Malcolm Priestley <[email protected]>
Cc: <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>

Merge tag 'iwlwifi-next-for-kalle-2017-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

First batch of iwlwifi patches for 4.14

* Reorganization of the code into separate directories continues;
* A couple of new minor features;
* Fixes and cleanups here and there.

mmc: block: bypass the queue even if usage is present for hotplug

The commit 304419d8a7e9 ("mmc: core: Allocate per-request data using the
block layer core") refactored mechanism of queue handling caused
mmc_init_request() can be called just after mmc_cleanup_queue() caused null
pointer dereference.

Another commit bbdc74dc19e0 ("mmc: block: Prevent new req entering queue
after its cleanup") tried to fix the problem. However it actually miss one
corner case.

We could still reproduce the issue mentioned with these steps:
(1) insert a SD card and mount it
(2) hotplug it, so it will leave md->usage still be counted
(3) reboot the system which will sync data and umount the card

[Unable to handle kernel NULL pointer dereference at virtual address
00000000
[user pgtable: 4k pages, 48-bit VAs, pgd = ffff80007bab3000
[[0000000000000000] *pgd=000000007a828003, *pud=0000000078dce003,
*pmd=000000007aab6003, *pte=0000000000000000
[Internal error: Oops: 96000007 [#1] PREEMPT SMP
[Modules linked in:
[CPU: 3 PID: 3507 Comm: umount Tainted: G W
4.13.0-rc1-next-20170720-00012-g9d9bf45 #33
[Hardware name: Firefly-RK3399 Board (DT)
[task: ffff80007a1de200 task.stack: ffff80007a01c000
[PC is at mmc_init_request+0x14/0xc4
[LR is at alloc_request_size+0x4c/0x74
[pc : [<ffff0000087d7150>] lr : [<ffff000008378fe0>] pstate: 600001c5
[sp : ffff80007a01f8f0

....

[[<ffff0000087d7150>] mmc_init_request+0x14/0xc4
[[<ffff000008378fe0>] alloc_request_size+0x4c/0x74
[[<ffff00000817ac28>] mempool_create_node+0xb8/0x17c
[[<ffff00000837aadc>] blk_init_rl+0x9c/0x120
[[<ffff000008396580>] blkg_alloc+0x110/0x234
[[<ffff000008396ac8>] blkg_create+0x424/0x468
[[<ffff00000839877c>] blkg_lookup_create+0xd8/0x14c
[[<ffff0000083796bc>] generic_make_request_checks+0x368/0x3b0
[[<ffff00000837b050>] generic_make_request+0x1c/0x240

So mmc_blk_put wouldn't calling blk_cleanup_queue which actually the
QUEUE_FLAG_DYING and QUEUE_FLAG_BYPASS should stay. Block core expect
blk_queue_bypass_{start, end} internally to bypass/drain the queue before
actually dying the queue, so it didn't expose API to set the queue bypass.
I think we should set QUEUE_FLAG_BYPASS whenever queue is removed, although
the md->usage is still counted, as no dispatch queue could be found then.

Fixes: 304419d8a7e9 ("mmc: core: Allocate per-request data using the block layer core")
Signed-off-by: Shawn Lin <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

mmc: sdhci-of-at91: force card detect value for non removable devices

When the device is non removable, the card detect signal is often used
for another purpose i.e. muxed to another SoC peripheral or used as a
GPIO. It could lead to wrong behaviors depending the default value of
this signal if not muxed to the SDHCI controller.

Fixes: bb5f8ea4d514 ("mmc: sdhci-of-at91: introduce driver for the Atmel SDMMC")
Signed-off-by: Ludovic Desroches <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>

Merge tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
"Two fixes from Trond this time, now that he's back from his vacation.
  The first is a stable fix for the EXCHANGE_ID issue on the mailing
  list, and the other fixes a double-free situation that he found at the
  same time.

  Stable fix:
   - Fix EXCHANGE_ID corrupt verifier issue

  Other fix:
   - Fix double frees in nfs4_test_session_trunk()"

* tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4: Fix double frees in nfs4_test_session_trunk()
  NFSv4: Fix EXCHANGE_ID corrupt verifier issue

isdn/i4l: fix buffer overflow

This fixes a potential buffer overflow in isdn_net.c caused by an
unbounded strcpy.

[ ISDN seems to be effectively unmaintained, and the I4L driver in
particular is long deprecated, but in case somebody uses this..
- Linus ]

Signed-off-by: Jiten Thakkar <[email protected]>
Signed-off-by: Annie Cherkaev <[email protected]>
Cc: Karsten Keil <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>

clk: keystone: sci-clk: Fix sci_clk_get

Currently a bug in the sci_clk_get implementation causes it to always
return a clock belonging to the last device in the static list of clock
data. This is due to a bug in the init code that causes the array
used by sci_clk_get to only be populated with the clocks for the last
device, as each device overwrites the entire array with its own clocks.

Fix this by calculating the actual number of clocks for the SoC, and
allocating the whole array in one go. Also, we don't need the handle
to the init data array anymore after doing this, instead we can
just compare the dev_id / clk_id against the registered clocks and
use binary search for speed.

Signed-off-by: Tero Kristo <[email protected]>
Reported-by: Dave Gerlach <[email protected]>
Fixes: b745c0794e2f ("clk: keystone: Add sci-clk driver support")
Cc: Nishanth Menon <[email protected]>
Tested-by: Franklin Cooper <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>

ocfs2: don't clear SGID when inheriting ACLs

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0').  However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by moving posix_acl_update_mode() out of ocfs2_set_acl()
into ocfs2_iop_set_acl().  That way the function will not be called when
inheriting ACLs which is what we want as it prevents SGID bit clearing
and the mode has been properly set by posix_acl_create() anyway.  Also
posix_acl_chmod() that is calling ocfs2_set_acl() takes care of updating
mode itself.

Fixes: 073931017b4 ("posix_acl: Clear SGID bit when setting file permissions")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: allow page_cache_get_speculative in interrupt context

Kernel panic when calling the IRQ-safe __get_user_pages_fast in NMI
handler.

The bug was introduced by commit 2947ba054a4d ("x86/mm/gup: Switch GUP
to the generic get_user_page_fast() implementation").

The original x86 __get_user_page_fast used plain get_page() or
page_ref_add().  However, the generic __get_user_page_fast uses
page_cache_get_speculative(), which has VM_BUG_ON(in_interrupt()).

There is no reason to prevent page_cache_get_speculative from using in
interrupt context.  According to the author, putting a BUG_ON there is
just because the code is not verifying correctness of interrupt races.
I did some tests in interrupt context.  There is no issue found.

Removing VM_BUG_ON(in_interrupt()) for page_cache_get_speculative().

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 2947ba054a4d ("x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation")
Signed-off-by: Kan Liang <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Ying Huang <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

userfaultfd: non-cooperative: flush event_wqh at release time

There may still be threads waiting on event_wqh at the time the
userfault file descriptor is closed. Flush the events wait-queue to
prevent waiting threads from hanging.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 9cd75c3cd4c3d ("userfaultfd: non-cooperative: add ability to report
non-PF events from uffd descriptor")
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Dr. David Alan Gilbert" <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ipc: add missing container_of()s for randstruct

When building with the randstruct gcc plugin, the layout of the IPC
structs will be randomized, which requires any sub-structure accesses to
use container_of().  The proc display handlers were missing the needed
container_of()s since the iterator is passing in the top-level struct
kern_ipc_perm.

This would lead to crashes when running the "lsipc" program after the
system had IPC registered (e.g. after starting up Gnome):

  general protection fault: 0000 [#1] PREEMPT SMP
  ...
  RIP: 0010:shm_add_rss_swap.isra.1+0x13/0xa0
  ...
  Call Trace:
    sysvipc_shm_proc_show+0x5e/0x150
    sysvipc_proc_show+0x1a/0x30
    seq_read+0x2e9/0x3f0
  ...

Link: http://lkml.kernel.org/r/20170730205950.GA55841@beast
Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")
Signed-off-by: Kees Cook <[email protected]>
Reported-by: Dominik Brodowski <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Acked-by: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()

In codepaths that use the begin/retry interface for reading
mems_allowed_seq with irqs disabled, there exists a race condition that
stalls the patch process after only modifying a subset of the
static_branch call sites.

This problem manifested itself as a deadlock in the slub allocator,
inside get_any_partial.  The loop reads mems_allowed_seq value (via
read_mems_allowed_begin), performs the defrag operation, and then
verifies the consistency of mem_allowed via the read_mems_allowed_retry
and the cookie returned by xxx_begin.

The issue here is that both begin and retry first check if cpusets are
enabled via cpusets_enabled() static branch.  This branch can be
rewritted dynamically (via cpuset_inc) if a new cpuset is created.  The
x86 jump label code fully synchronizes across all CPUs for every entry
it rewrites.  If it rewrites only one of the callsites (specifically the
one in read_mems_allowed_retry) and then waits for the
smp_call_function(do_sync_core) to complete while a CPU is inside the
begin/retry section with IRQs off and the mems_allowed value is changed,
we can hang.

This is because begin() will always return 0 (since it wasn't patched
yet) while retry() will test the 0 against the actual value of the seq
counter.

The fix is to use two different static keys: one for begin
(pre_enable_key) and one for retry (enable_key).  In cpuset_inc(), we
first bump the pre_enable key to ensure that cpuset_mems_allowed_begin()
always return a valid seqcount if are enabling cpusets.  Similarly, when
disabling cpusets via cpuset_dec(), we first ensure that callers of
cpuset_mems_allowed_retry() will start ignoring the seqcount value
before we let cpuset_mems_allowed_begin() return 0.

The relevant stack traces of the two stuck threads:

  CPU: 1 PID: 1415 Comm: mkdir Tainted: G L  4.9.36-00104-g540c51286237 #4
  Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
  task: ffff8817f9c28000 task.stack: ffffc9000ffa4000
  RIP: smp_call_function_many+0x1f9/0x260
  Call Trace:
    smp_call_function+0x3b/0x70
    on_each_cpu+0x2f/0x90
    text_poke_bp+0x87/0xd0
    arch_jump_label_transform+0x93/0x100
    __jump_label_update+0x77/0x90
    jump_label_update+0xaa/0xc0
    static_key_slow_inc+0x9e/0xb0
    cpuset_css_online+0x70/0x2e0
    online_css+0x2c/0xa0
    cgroup_apply_control_enable+0x27f/0x3d0
    cgroup_mkdir+0x2b7/0x420
    kernfs_iop_mkdir+0x5a/0x80
    vfs_mkdir+0xf6/0x1a0
    SyS_mkdir+0xb7/0xe0
    entry_SYSCALL_64_fastpath+0x18/0xad

  ...

  CPU: 2 PID: 1 Comm: init Tainted: G L  4.9.36-00104-g540c51286237 #4
  Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
  task: ffff8818087c0000 task.stack: ffffc90000030000
  RIP: int3+0x39/0x70
  Call Trace:
    <#DB> ? ___slab_alloc+0x28b/0x5a0
    <EOE> ? copy_process.part.40+0xf7/0x1de0
    __slab_alloc.isra.80+0x54/0x90
    copy_process.part.40+0xf7/0x1de0
    copy_process.part.40+0xf7/0x1de0
    kmem_cache_alloc_node+0x8a/0x280
    copy_process.part.40+0xf7/0x1de0
    _do_fork+0xe7/0x6c0
    _raw_spin_unlock_irq+0x2d/0x60
    trace_hardirqs_on_caller+0x136/0x1d0
    entry_SYSCALL_64_fastpath+0x5/0xad
    do_syscall_64+0x27/0x350
    SyS_clone+0x19/0x20
    do_syscall_64+0x60/0x350
    entry_SYSCALL64_slow_path+0x25/0x25

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 46e700abc44c ("mm, page_alloc: remove unnecessary taking of a seqlock when cpusets are disabled")
Signed-off-by: Dima Zavin <[email protected]>
Reported-by: Cliff Spradlin <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Christopher Lameter <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

userfaultfd_zeropage: return -ENOSPC in case mm has gone

In the non-cooperative userfaultfd case, the process exit may race with
outstanding mcopy_atomic called by the uffd monitor. Returning -ENOSPC
instead of -EINVAL when mm is already gone will allow uffd monitor to
distinguish this case from other error conditions.

Unfortunately I overlooked userfaultfd_zeropage when updating
userfaultd_copy().

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 96333187ab162 ("userfaultfd_copy: return -ENOSPC in case mm has gone")
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Dr. David Alan Gilbert" <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: take memory hotplug lock within numa_zonelist_order_handler()

Andre Wild reported the following warning:

  WARNING: CPU: 2 PID: 1205 at kernel/cpu.c:240 lockdep_assert_cpus_held+0x4c/0x60
  Modules linked in:
  CPU: 2 PID: 1205 Comm: bash Not tainted 4.13.0-rc2-00022-gfd2b2c57ec20 #10
  Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
  task: 00000000701d8100 task.stack: 0000000073594000
  Krnl PSW : 0704f00180000000 0000000000145e24 (lockdep_assert_cpus_held+0x4c/0x60)
  ...
  Call Trace:
   lockdep_assert_cpus_held+0x42/0x60)
   stop_machine_cpuslocked+0x62/0xf0
   build_all_zonelists+0x92/0x150
   numa_zonelist_order_handler+0x102/0x150
   proc_sys_call_handler.isra.12+0xda/0x118
   proc_sys_write+0x34/0x48
   __vfs_write+0x3c/0x178
   vfs_write+0xbc/0x1a0
   SyS_write+0x66/0xc0
   system_call+0xc4/0x2b0
   locks held by bash/1205:
   #0:  (sb_writers#4){.+.+.+}, at: vfs_write+0xa6/0x1a0
   #1:  (zl_order_mutex){+.+...}, at: numa_zonelist_order_handler+0x44/0x150
   #2:  (zonelists_mutex){+.+...}, at: numa_zonelist_order_handler+0xf4/0x150
  Last Breaking-Event-Address:
    lockdep_assert_cpus_held+0x48/0x60

This can be easily triggered with e.g.

    echo n > /proc/sys/vm/numa_zonelist_order

In commit 3f906ba23689a ("mm/memory-hotplug: switch locking to a percpu
rwsem") memory hotplug locking was changed to fix a potential deadlock.

This also switched the stop_machine() invocation within
build_all_zonelists() to stop_machine_cpuslocked() which now expects
that online cpus are locked when being called.

This assumption is not true if build_all_zonelists() is being called
from numa_zonelist_order_handler().

In order to fix this simply add a mem_hotplug_begin()/mem_hotplug_done()
pair to numa_zonelist_order_handler().

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 3f906ba23689a ("mm/memory-hotplug: switch locking to a percpu rwsem")
Signed-off-by: Heiko Carstens <[email protected]>
Reported-by: Andre Wild <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/page_io.c: fix oops during block io poll in swapin path

When a thread is OOM-killed during swap_readpage() operation, an oops
occurs because end_swap_bio_read() is calling wake_up_process() based on
an assumption that the thread which called swap_readpage() is still
alive.

  Out of memory: Kill process 525 (polkitd) score 0 or sacrifice child
  Killed process 525 (polkitd) total-vm:528128kB, anon-rss:0kB, file-rss:4kB, shmem-rss:0kB
  oom_reaper: reaped process 525 (polkitd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
  general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
  Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter coretemp ppdev pcspkr vmw_balloon sg shpchp vmw_vmci parport_pc parport i2c_piix4 ip_tables xfs libcrc32c sd_mod sr_mod cdrom ata_generic pata_acpi vmwgfx ahci libahci drm_kms_helper ata_piix syscopyarea sysfillrect sysimgblt fb_sys_fops mptspi scsi_transport_spi ttm e1000 mptscsih drm mptbase i2c_core libata serio_raw
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0-rc2-next-20170725 #129
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
  task: ffffffffb7c16500 task.stack: ffffffffb7c00000
  RIP: 0010:__lock_acquire+0x151/0x12f0
  Call Trace:
   <IRQ>
   lock_acquire+0x59/0x80
   _raw_spin_lock_irqsave+0x3b/0x4f
   try_to_wake_up+0x3b/0x410
   wake_up_process+0x10/0x20
   end_swap_bio_read+0x6f/0xf0
   bio_endio+0x92/0xb0
   blk_update_request+0x88/0x270
   scsi_end_request+0x32/0x1c0
   scsi_io_completion+0x209/0x680
   scsi_finish_command+0xd4/0x120
   scsi_softirq_done+0x120/0x140
   __blk_mq_complete_request_remote+0xe/0x10
   flush_smp_call_function_queue+0x51/0x120
   generic_smp_call_function_single_interrupt+0xe/0x20
   smp_trace_call_function_single_interrupt+0x22/0x30
   smp_call_function_single_interrupt+0x9/0x10
   call_function_single_interrupt+0xa7/0xb0
   </IRQ>
  RIP: 0010:native_safe_halt+0x6/0x10
   default_idle+0xe/0x20
   arch_cpu_idle+0xa/0x10
   default_idle_call+0x1e/0x30
   do_idle+0x187/0x200
   cpu_startup_entry+0x6e/0x70
   rest_init+0xd0/0xe0
   start_kernel+0x456/0x477
   x86_64_start_reservations+0x24/0x26
   x86_64_start_kernel+0xf7/0x11a
   secondary_startup_64+0xa5/0xa5
  Code: c3 49 81 3f 20 9e 0b b8 41 bc 00 00 00 00 44 0f 45 e2 83 fe 01 0f 87 62 ff ff ff 89 f0 49 8b 44 c7 08 48 85 c0 0f 84 52 ff ff ff <f0> ff 80 98 01 00 00 8b 3d 5a 49 c4 01 45 8b b3 18 0c 00 00 85
  RIP: __lock_acquire+0x151/0x12f0 RSP: ffffa01f39e03c50
  ---[ end trace 6c441db499169b1e ]---
  Kernel panic - not syncing: Fatal exception in interrupt
  Kernel Offset: 0x36000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Fix it by holding a reference to the thread.

[[email protected]: add comment]
Fixes: 23955622ff8d231b ("swap: add block io poll in swapin path")
Signed-off-by: Tetsuo Handa <[email protected]>
Reviewed-by: Shaohua Li <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'per-nexthop-offload'

Jiri Pirko says:

====================
ipv4: fib: Provide per-nexthop offload indication

Ido says:

Offload indication for IPv4 routes is currently set in the FIB info's
flags. When multipath routes are employed, this can lead to a route being
marked as offloaded although only one of its nexthops is actually
offloaded.

Instead, this patchset aims to proivde a higher resolution for the offload
indication and report it on a per-nexthop basis.

Example output from patched iproute:

$ ip route show 192.168.200.0/24
192.168.200.0/24
        nexthop via 192.168.100.2 dev enp3s0np7 weight 1 offload
        nexthop via 192.168.101.3 dev enp3s0np8 weight 1

And once the second gateway is resolved:

$ ip route show 192.168.200.0/24
192.168.200.0/24
        nexthop via 192.168.100.2 dev enp3s0np7 weight 1 offload
        nexthop via 192.168.101.3 dev enp3s0np8 weight 1 offload

First patch teaches the kernel to look for the offload indication in the
nexthop flags. Patches 2-5 adjust current capable drivers to provide
offload indication on a per-nexthop basis. Last patch removes no longer
used functions to set offload indication in the FIB info's flags.
====================

Signed-off-by: David S. Miller <[email protected]>

ipv4: fib: Remove unused functions

Previous patches converted users of these functions to provide offload
indication using the nexthop's flags instead of the FIB info's.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum_router: Refresh offload indication upon group refresh

Now that we provide offload indication using the nexthop's flags we must
refresh the offload indication whenever the offload state within the
group changes.

This didn't matter until now, as offload indication was provided using
the FIB info flags and multipath routes were marked as offloaded as long
as one of the nexthops was offloaded.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Tested-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum_router: Don't check state when refreshing offload indication

Previous patch removed the reliance on the counter in the FIB info to
set the offload indication, so we no longer need to keep an offload
state on each FIB entry and can just set or unset the RTNH_F_OFFLOAD
flag in each nexthop.

This is also necessary because we're going to need to refresh the
offload indication whenever the nexthop group associated with the FIB
entry is refreshed. Current check would prevent us from marking a newly
resolved nexthop as offloaded if the FIB entry is already marked as
offloaded.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Tested-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: spectrum_router: Provide offload indication using nexthop flags

In a similar fashion to previous patch, use the nexthop flags to provide
offload indication instead of the FIB info's flags.

In case a nexthop in a multipath route can't be offloaded (gateway's MAC
can't be resolved, for example), then its offload flag isn't set.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Tested-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rocker: Provide offload indication using nexthop flags

We want to stop using the FIB info's flags to provide the offlaod
indication and instead do that on a per-nexthop basis.

Convert rocker to do just that. It only supports one nexthop per-route,
so conversion is simple.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv4: fib: Set offload indication according to nexthop flags

We're going to have capable drivers indicate route offload using the
nexthop flags, but for non-multipath routes these flags aren't dumped to
user space.

Instead, set the offload indication in the route message flags.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

mlxsw: core: Use correct EMAD transaction ID in debug message

'trans->tid' is only assigned later in the function, resulting in a zero
transaction ID. Use 'tid' instead.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'netvsc-transparent-VF-support'

Stephen Hemminger says:

====================
netvsc: transparent VF support

This patch set changes how SR-IOV Virtual Function devices are managed
in the Hyper-V network driver. This version is rebased onto current net-next.

Background

In Hyper-V SR-IOV can be enabled (and disabled) by changing guest settings
on host. When SR-IOV is enabled a matching PCI device is hot plugged and
visible on guest. The VF device is an add-on to an existing netvsc
device, and has the same MAC address.

How is this different?

The original support of VF relied on using bonding driver in active
standby mode to handle the VF device.

With the new netvsc VF logic, the Linux hyper-V network
virtual driver will directly manage the link to SR-IOV VF device.
When VF device is detected (hot plug) it is automatically made a
slave device of the netvsc device. The VF device state reflects
the state of the netvsc device; i.e. if netvsc is set down, then
VF is set down. If netvsc is set up, then VF is brought up.

Packet flow is independent of VF status; all packets are sent and
received as if they were associated with the netvsc device. If VF is
removed or link is down then the synthetic VMBUS path is used.

What was wrong with using bonding script?

A lot of work went into getting the bonding script to work on all
distributions, but it was a major struggle. Linux network devices
can be configured many, many ways and there is no one solution from
userspace to make it all work. What is really hard is when
configuration is attached to synthetic device during boot (eth0) and
then the same addresses and firewall rules needs to also work later if
doing bonding. The new code gets around all of this.

How does VF work during initialization?

Since all packets are sent and received through the logical netvsc
device, initialization is much easier. Just configure the regular
netvsc Ethernet device; when/if SR-IOV is enabled it just
works. Provisioning and cloud init only need to worry about setting up
netvsc device (eth0). If SR-IOV is enabled (even as a later step), the
address and rules stay the same.

What devices show up?

Both netvsc and PCI devices are visible in the system. The netvsc
device is active and named in usual manner (eth0). The PCI device is
visible to Linux and gets renamed by udev to a persistent name
(enP2p3s0). The PCI device name is now irrelevant now.

The logic also sets the PCI VF device SLAVE flag on the network
device so network tools can see the relationship if they are smart
enough to understand how layered devices work.

This is a lot like how I see Windows working.
The VF device is visible in Device Manager, but is not configured.

Is there any performance impact?
There is no visible change in performance. The bonding
and netvsc driver both have equivalent steps.

Is it compatible with old bonding script?

It turns out that if you use the old bonding script, then everything
still works but in a sub-optimum manner. What happens is that bonding
is unable to steal the VF from the netvsc device so it creates a one
legged bond. Packet flow then is:
bond0 <--> eth0 <- -> VF (enP2p3s0).
In other words, if you get it wrong it still works, just
awkward and slower.

What if I add address or firewall rule onto the VF?

Same problems occur with now as already occur with bonding, bridging,
teaming on Linux if user incorrectly does configuration onto
an underlying slave device. It will sort of work, packets will come in
and out but the Linux kernel gets confused and things like ARP don’t
work right. There is no way to block manipulation of the slave
device, and I am sure someone will find some special use case where
they want it.
====================

Signed-off-by: David S. Miller <[email protected]>

netvsc: remove bonding setup script

No longer needed, now all managed by transparent VF logic.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netvsc: add documentation

Add some background documentation on netvsc device options
and limitations.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netvsc: transparent VF management

This patch implements transparent fail over from synthetic NIC to
SR-IOV virtual function NIC in Hyper-V environment. It is a better
alternative to using bonding as is done now. Instead, the receive and
transmit fail over is done internally inside the driver.

Using bonding driver has lots of issues because it depends on the
script being run early enough in the boot process and with sufficient
information to make the association. This patch moves all that
functionality into the kernel.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

atm: solos-pci: constify attribute_group structures

Functions working with attribute_groups provided by <linux/sysfs.h>
work with const attribute_group. These attribute_group structures do not
change at runtime so mark them as const.

File size before:
text      data     bss     dec     hex filename
35740    28424     832   64996    fde4 drivers/atm/solos-pci.o

File size after:
text      data     bss     dec     hex filename
35932    28232     832   64996    fde4 drivers/atm/solos-pci.o

This change was made with the help of Coccinelle.

Signed-off-by: Amitoj Kaur Chawla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

atm: adummy: constify attribute_group structure

Functions working with attribute_groups provided by <linux/sysfs.h>
work with const attribute_group. These attribute_group structures do not
change at runtime so mark them as const.

File size before:
text      data     bss     dec     hex filename
2033      1448       0    3481     d99 drivers/atm/adummy.o

File size after:
text      data     bss     dec     hex filename
2129      1352       0    3481     d99 drivers/atm/adummy.o

This change was made with the help of Coccinelle.

Signed-off-by: Amitoj Kaur Chawla <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

liquidio: set sriov_totalvfs correctly

The file /sys/devices/pci000.../sriov_totalvfs is showing a wrong value.
Fix it by calling pci_sriov_set_totalvfs() to set the total number of VFs
available after calculations for the number of PF and VF queues are made.

Signed-off-by: Derek Chickles <[email protected]>
Signed-off-by: Raghu Vatsavayi <[email protected]>
Signed-off-by: Felix Manlunas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: Add support for 64-bit statistics

DSA slave network devices maintain a pair of bytes and packets counters
for each directions, but these are not 64-bit capable. Re-use
pcpu_sw_netstats which contains exactly what we need for that purpose
and update the code path to report 64-bit capable statistics.

Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

zram: do not free pool->size_class

Mike reported kernel goes oops with ltp:zram03 testcase.

  zram: Added device: zram0
  zram0: detected capacity change from 0 to 107374182400
  BUG: unable to handle kernel paging request at 0000306d61727a77
  IP: zs_map_object+0xb9/0x260
  PGD 0
  P4D 0
  Oops: 0000 [#1] SMP
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Modules linked in: zram(E) xfs(E) libcrc32c(E) btrfs(E) xor(E) raid6_pq(E) loop(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) af_packet(E) br_netfilter(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_powerclamp(E) coretemp(E) cdc_ether(E) kvm_intel(E) usbnet(E) mii(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) iTCO_wdt(E) ghash_clmulni_intel(E) bnx2(E) iTCO_vendor_support(E) pcbc(E) ioatdma(E) ipmi_ssif(E) aesni_intel(E) i5500_temp(E) i2c_i801(E) aes_x86_64(E) lpc_ich(E) shpchp(E) mfd_core(E) crypto_simd(E) i7core_edac(E) dca(E) glue_helper(E) cryptd(E) ipmi_si(E) button(E) acpi_cpufreq(E) ipmi_devintf(E) pcspkr(E) ipmi_msghandler(E)
   nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) ata_generic(E) i2c_algo_bit(E) ata_piix(E) drm_kms_helper(E) ahci(E) syscopyarea(E) sysfillrect(E) libahci(E) sysimgblt(E) fb_sys_fops(E) uhci_hcd(E) ehci_pci(E) ttm(E) ehci_hcd(E) libata(E) drm(E) megaraid_sas(E) usbcore(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) autofs4(E) [last unloaded: zram]
  CPU: 6 PID: 12356 Comm: swapon Tainted: G            E   4.13.0.g87b2c3f-default #194
  Hardware name: IBM System x3550 M3 -[7944K3G]-/69Y5698     , BIOS -[D6E150AUS-1.10]- 12/15/2010
  task: ffff880158d2c4c0 task.stack: ffffc90001680000
  RIP: 0010:zs_map_object+0xb9/0x260
  Call Trace:
   zram_bvec_rw.isra.26+0xe8/0x780 [zram]
   zram_rw_page+0x6e/0xa0 [zram]
   bdev_read_page+0x81/0xb0
   do_mpage_readpage+0x51a/0x710
   mpage_readpages+0x122/0x1a0
   blkdev_readpages+0x1d/0x20
   __do_page_cache_readahead+0x1b2/0x270
   ondemand_readahead+0x180/0x2c0
   page_cache_sync_readahead+0x31/0x50
   generic_file_read_iter+0x7e7/0xaf0
   blkdev_read_iter+0x37/0x40
   __vfs_read+0xce/0x140
   vfs_read+0x9e/0x150
   SyS_read+0x46/0xa0
   entry_SYSCALL_64_fastpath+0x1a/0xa5
  Code: 81 e6 00 c0 3f 00 81 fe 00 00 16 00 0f 85 9f 01 00 00 0f b7 13 65 ff 05 5e 07 dc 7e 66 c1 ea 02 81 e2 ff 01 00 00 49 8b 54 d4 08 <8b> 4a 48 41 0f af ce 81 e1 ff 0f 00 00 41 89 c9 48 c7 c3 a0 70
  RIP: zs_map_object+0xb9/0x260 RSP: ffffc90001683988
  CR2: 0000306d61727a77

He bisected the problem is [1].

After commit cf8e0fedf078 ("mm/zsmalloc: simplify zs_max_alloc_size
handling"), zram doesn't use double pointer for pool->size_class any
more in zs_create_pool so counter function zs_destroy_pool don't need to
free it, either.

Otherwise, it does kfree wrong address and then, kernel goes Oops.

Link: http://lkml.kernel.org/r/20170725062650.GA12134@bbox
Fixes: cf8e0fedf078 ("mm/zsmalloc: simplify zs_max_alloc_size handling")
Signed-off-by: Minchan Kim <[email protected]>
Reported-by: Mike Galbraith <[email protected]>
Tested-by: Mike Galbraith <[email protected]>
Reviewed-by: Sergey Senozhatsky <[email protected]>
Cc: Jerome Marchand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kthread: fix documentation build warning

The kerneldoc comment for kthread_create() had an incorrect argument
name, leading to a warning in the docs build.

Correct it, and make one more small step toward a warning-free build.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jonathan Corbet <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kasan: avoid -Wmaybe-uninitialized warning

gcc-7 produces this warning:

  mm/kasan/report.c: In function 'kasan_report':
  mm/kasan/report.c:351:3: error: 'info.first_bad_addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     print_shadow_for_address(info->first_bad_addr);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  mm/kasan/report.c:360:27: note: 'info.first_bad_addr' was declared here

The code seems fine as we only print info.first_bad_addr when there is a
shadow, and we always initialize it in that case, but this is relatively
hard for gcc to figure out after the latest rework.

Adding an intialization to the most likely value together with the other
struct members shuts up that warning.

Fixes: b235b9808664 ("kasan: unify report headers")
Link: https://patchwork.kernel.org/patch/9641417/
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Suggested-by: Alexander Potapenko <[email protected]>
Suggested-by: Andrey Ryabinin <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

userfaultfd: non-cooperative: notify about unmap of destination during mremap

When mremap is called with MREMAP_FIXED it unmaps memory at the
destination address without notifying userfaultfd monitor.

If the destination were registered with userfaultfd, the monitor has no
way to distinguish between the old and new ranges and to properly relate
the page faults that would occur in the destination region.

Fixes: 897ab3e0c49e ("userfaultfd: non-cooperative: add event for memory unmaps")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Pavel Emelyanov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries

Nadav Amit identified a theoritical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.

He described the race as follows:

        CPU0                            CPU1
        ----                            ----
                                        user accesses memory using RW PTE
                                        [PTE now cached in TLB]
        try_to_unmap_one()
        ==> ptep_get_and_clear()
        ==> set_tlb_ubc_flush_pending()
                                        mprotect(addr, PROT_READ)
                                        ==> change_pte_range()
                                        ==> [ PTE non-present - no flush ]

                                        user writes using cached RW PTE
        ...

        try_to_unmap_flush()

The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such
as munmap, mremap and madvise.

For some operations like mprotect, it's not necessarily a data integrity
issue but it is a correctness issue as there is a window where an
mprotect that limits access still allows access.  For munmap, it's
potentially a data integrity issue although the race is massive as an
munmap, mmap and return to userspace must all complete between the
window when reclaim drops the PTL and flushes the TLB.  However, it's
theoritically possible so handle this issue by flushing the mm if
reclaim is potentially currently batching TLB flushes.

Other instances where a flush is required for a present pte should be ok
as either the page lock is held preventing parallel reclaim or a page
reference count is elevated preventing a parallel free leading to
corruption.  In the case of page_mkclean there isn't an obvious path
that userspace could take advantage of without using the operations that
are guarded by this patch.  Other users such as gup as a race with
reclaim looks just at PTEs.  huge page variants should be ok as they
don't race with reclaim.  mincore only looks at PTEs.  userfault also
should be ok as if a parallel reclaim takes place, it will either fault
the page back in or read some of the data before the flush occurs
triggering a fault.

Note that a variant of this patch was acked by Andy Lutomirski but this
was for the x86 parts on top of his PCID work which didn't make the 4.13
merge window as expected.  His ack is dropped from this version and
there will be a follow-on patch on top of PCID that will include his
ack.

[[email protected]: tweak comments]
[[email protected]: fix spello]
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Nadav Amit <[email protected]>
Signed-off-by: Mel Gorman <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: <[email protected]> [v4.4+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

pid: kill pidhash_size in pidhash_init()

After commit 3d375d78593c ("mm: update callers to use HASH_ZERO flag"),
drop unused pidhash_size in pidhash_init().

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Reviewed-by: Pavel Tatashin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors

Commit 9a291a7c9428 ("mm/hugetlb: report -EHWPOISON not -EFAULT when
FOLL_HWPOISON is specified") causes __get_user_pages to ignore certain
errors from follow_hugetlb_page.  After such error, __get_user_pages
subsequently calls faultin_page on the same VMA and start address that
follow_hugetlb_page failed on instead of returning the error immediately
as it should.

In follow_hugetlb_page, when hugetlb_fault returns a value covered under
VM_FAULT_ERROR, follow_hugetlb_page returns it without setting nr_pages
to 0 as __get_user_pages expects in this case, which causes the
following to happen in __get_user_pages: the "while (nr_pages)" check
succeeds, we skip the "if (!vma..." check because we got a VMA the last
time around, we find no page with follow_page_mask, and we call
faultin_page, which calls hugetlb_fault for the second time.

This issue also slightly changes how __get_user_pages works.  Before, it
only returned error if it had made no progress (i = 0).  But now,
follow_hugetlb_page can clobber "i" with an error code since its new
return path doesn't check for progress.  So if "i" is nonzero before a
failing call to follow_hugetlb_page, that indication of progress is lost
and __get_user_pages can return error even if some pages were
successfully pinned.

To fix this, change follow_hugetlb_page so that it updates nr_pages,
allowing __get_user_pages to fail immediately and restoring the "error
only if no progress" behavior to __get_user_pages.

Tested that __get_user_pages returns when expected on error from
hugetlb_fault in follow_hugetlb_page.

Fixes: 9a291a7c9428 ("mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Daniel Jordan <[email protected]>
Acked-by: Punit Agrawal <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: James Morse <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: zhong jiang <[email protected]>
Cc: <[email protected]> [4.12.x]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

KVM: nVMX: mark vmcs12 pages dirty on L2 exit

The host physical addresses of L1's Virtual APIC Page and Posted
Interrupt descriptor are loaded into the VMCS02. The CPU may write
to these pages via their host physical address while L2 is running,
bypassing address-translation-based dirty tracking (e.g. EPT write
protection). Mark them dirty on every exit from L2 to prevent them
from getting out of sync with dirty tracking.

Also mark the virtual APIC page and the posted interrupt descriptor
dirty when KVM is virtualizing posted interrupt processing.

Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

kvm: nVMX: don't flush VMCS12 during VMXOFF or VCPU teardown

According to the Intel SDM, software cannot rely on the current VMCS to be
coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12
flushes.

24.11.1 Software Use of Virtual-Machine Control Structures
...
  If a logical processor leaves VMX operation, any VMCSs active on
  that logical processor may be corrupted (see below). To prevent
  such corruption of a VMCS that may be used either after a return
  to VMX operation or on another logical processor, software should
  execute VMCLEAR for that VMCS before executing the VMXOFF instruction
  or removing power from the processor (e.g., as part of a transition
  to the S3 and S4 power states).
...

This fixes a "suspicious rcu_dereference_check() usage!" warning during
kvm_vm_release() because nested_release_vmcs12() calls
kvm_vcpu_write_guest_page() without holding kvm->srcu.

Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

KVM: nVMX: do not pin the VMCS12

Since the current implementation of VMCS12 does a memcpy in and out
of guest memory, we do not need current_vmcs12 and current_vmcs12_page
anymore.  current_vmptr is enough to read and write the VMCS12.

And David Matlack noted:

  This patch also fixes dirty tracking (memslot->dirty_bitmap) of the
  VMCS12 page by using kvm_write_guest. nested_release_page() only marks
  the struct page dirty.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
[Added David Matlack's note and nested_release_page_clean() fix.]
Signed-off-by: Radim Krčmář <[email protected]>

KVM: avoid using rcu_dereference_protected

During teardown, accesses to memslots and buses are using
rcu_dereference_protected with an always-true condition because
these accesses are done outside the usual mutexes. This
is because the last reference is gone and there cannot be any
concurrent modifications, but rcu_dereference_protected is
ugly and unobvious.

Instead, check the refcount in kvm_get_bus and __kvm_memslots.

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

KVM: X86: init irq->level in kvm_pv_kick_cpu_op

'lapic_irq' is a local variable and its 'level' field isn't
initialized, so 'level' is random, it doesn't matter but
makes UBSAN unhappy:

UBSAN: Undefined behaviour in .../lapic.c:...
load of value 10 is not a valid value for type '_Bool'
...
Call Trace:
[<ffffffff81f030b6>] dump_stack+0x1e/0x20
[<ffffffff81f03173>] ubsan_epilogue+0x12/0x55
[<ffffffff81f03b96>] __ubsan_handle_load_invalid_value+0x118/0x162
[<ffffffffa1575173>] kvm_apic_set_irq+0xc3/0xf0 [kvm]
[<ffffffffa1575b20>] kvm_irq_delivery_to_apic_fast+0x450/0x910 [kvm]
[<ffffffffa15858ea>] kvm_irq_delivery_to_apic+0xfa/0x7a0 [kvm]
[<ffffffffa1517f4e>] kvm_emulate_hypercall+0x62e/0x760 [kvm]
[<ffffffffa113141a>] handle_vmcall+0x1a/0x30 [kvm_intel]
[<ffffffffa114e592>] vmx_handle_exit+0x7a2/0x1fa0 [kvm_intel]
...

Signed-off-by: Longpeng(Mike) <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

KVM: X86: Fix loss of pending INIT due to race

When SMP VM start, AP may lost INIT because of receiving INIT between
kvm_vcpu_ioctl_x86_get/set_vcpu_events.

       vcpu 0                             vcpu 1
                                   kvm_vcpu_ioctl_x86_get_vcpu_events
                                     events->smi.latched_init = 0
  send INIT to vcpu1
    set vcpu1's pending_events
                                   kvm_vcpu_ioctl_x86_set_vcpu_events
                                      if (events->smi.latched_init == 0)
                                        clear INIT in pending_events

This patch fixes it by just update SMM related flags if we are in SMM.

Thanks Peng Hao for the report and original commit message.

Reported-by: Peng Hao <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>

drm/amdgpu: Use list_del_init in amdgpu_mn_unregister

Otherwise bo->shadow_list (which is aliased by bo->mn_list) will not
appear empty in amdgpu_ttm_bo_destroy and cause an oops when freeing
former userptr BOs.

Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/amdgpu: Fix undue fallthroughs in golden registers initialization

As I was staring at the si_init_golden_registers code, I noticed that
the Pitcairn initialization silently falls through the Cape Verde
initialization, and the Oland initialization falls through the Hainan
initialization. However there is no comment stating that this is
intentional, and the radeon driver doesn't have any such fallthrough,
so I suspect this is not supposed to happen.

Signed-off-by: Jean Delvare <[email protected]>
Fixes: 62a37553414a ("drm/amdgpu: add si implementation v10")
Cc: Ken Wang <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Marek Olšák" <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: Flora Cui <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

flow_dissector: remove unused functions

They are introduced by commit f70ea018da06
("net: Add functions to get skb->hash based on flow structures")
but never gets used in tree.

Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tcp: tcp_data_queue() cleanup

Commit c13ee2a4f03f ("tcp: reindent two spots after prequeue removal")
removed code in tcp_data_queue().

We can go a little farther, removing an always true test,
and removing initializers for fragstolen and eaten variables.

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: bcmgenet: drop COMPILE_TEST dependency

The last patch added the dependency on 'OF && HAS_IOMEM' but left
COMPILE_TEST as an alternative, which kind of defeats the purpose
of adding the dependency, we still get randconfig build warnings:

warning: (NET_DSA_BCM_SF2 && BCMGENET) selects MDIO_BCM_UNIMAC which has unmet direct dependencies (NETDEVICES && MDIO_BUS && HAS_IOMEM && OF_MDIO)

For compile-testing purposes, we don't really need this anyway,
as CONFIG_OF can be enabled on all architectures, and HAS_IOMEM
is present on all architectures we do meaningful compile-testing on
(the exception being arch/um).

This makes both OF and HAS_IOMEM hard dependencies.

Fixes: 5af74bb4fcf8 ("net: bcmgenet: Add dependency on HAS_IOMEM && OF")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hyperv: netvsc: Neaten netvsc_send_pkt by using a temporary

Repeated dereference of nvmsg.msg.v1_msg.send_rndis_pkt can be
shortened by using a temporary. Do so.

No change in object code.

Miscellanea:

o Use * const for rpkt and nvchan

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'platform-drivers-x86-v4.13-3' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver fixes from Darren Hart:
"Fix two bugs under error or abnormal usage conditions. Correct a
  config dependency:

  dell-wmi:
   - Fix driver interface version query

  wmi:
   - Fix error handling in acpi_wmi_init()

  peaq-wmi:
   - select INPUT_POLLDEV"

* tag 'platform-drivers-x86-v4.13-3' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: dell-wmi: Fix driver interface version query
  platform/x86: wmi: Fix error handling in acpi_wmi_init()
  platform/x86: peaq-wmi: select INPUT_POLLDEV

Merge tag 'sunxi-clk-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes

Pull one Allwinner clock fix from Chen-Yu Tsai:

One critical clock fix for sun5i (A10s/A13/R8) which enables propagation
of clock rate changes from the "cpu" clock to it's parent PLL clock.
This fixes cpufreq related crashes that have been observed on KernelCI
with the C.H.I.P. and multi_v7_defconfig.

* tag 'sunxi-clk-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
clk: sunxi-ng: sun5i: Add clk_set_rate_parent to the CPU clock

Merge tag 'meson-clk-fixes-for-4.13-rc4-v2' of git://github.com/baylibre/clk-meson into clk-fixes

Pull one Meson clock fix from Neil Armstrong

* tag 'meson-clk-fixes-for-4.13-rc4-v2' of git://github.com/baylibre/clk-meson:
clk: meson: mpll: fix mpll0 fractional part ignored

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"These seven patches are mostly minor build, Kconfig and error leg
  fixes"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qedi: Fix return code in qedi_ep_connect()
  scsi: lpfc: fix linking against modular NVMe support
  scsi: scsi_transport_fc: return -EBUSY for deleted vport
  scsi: libcxgbi: add check for valid cxgbi_task_data
  scsi: aic7xxx: fix firmware build with O=path
  scsi: megaraid_sas: fix memleak in megasas_alloc_cmdlist_fusion
  scsi: qedi: Add ISCSI_BOOT_SYSFS to Kconfig

Merge tag 'asoc-fix-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.13

Quite a few fixes here that have been sent since the merge window, the
biggest one is the fix from Tony for some confusion with the device
property API which was causing issues with the of-graph card. This is
fixed with some changes in the graph API itself as it seemed very likely
to be error prone.

ARM64: dts: marvell: armada-37xx: Fix the number of GPIO on south bridge

The number of pins in South Bridge is 30 and not 29. There is a fix for
the driver for the pinctrl, but a fix is also need at device tree level
for the GPIO.

Fixes: afda007feda5 ("ARM64: dts: marvell: Add pinctrl nodes for Armada
3700")
Cc: <[email protected]>
Signed-off-by: Gregory CLEMENT <[email protected]>

NFSv4: Fix double frees in nfs4_test_session_trunk()

rpc_clnt_add_xprt() expects the callback function to be synchronous, and
expects to release the transport and switch references itself.

Fixes: 04fa2c6bb51b1 ("NFS pnfs data server multipath session trunking")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>

ALSA: hda - Fix speaker output from VAIO VPCL14M1R

Sony VAIO VPCL14M1R needs the quirk to make the speaker working properly.

Tested-by: Dmitriy <[email protected]>
Cc: <[email protected]>
Signed-off-by: Sergei A. Trusov <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

powerpc/83xx/mpc832x_rdb: fix of_irq_to_resource() error check

of_irq_to_resource() has recently been fixed to return negative error #'s
along with 0 in case of failure, however the Freescale MPC832x RDB board
code still only regards 0 as a failure indication -- fix it up.

Fixes: 7a4228bbff76 ("of: irq: use of_irq_get() in of_irq_to_resource()")
Signed-off-by: Sergei Shtylyov <[email protected]>
Acked-by: Scott Wood <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

gpio: tegra: fix unbalanced chained_irq_enter/exit

When more than one GPIO IRQs are triggered simultaneously,
tegra_gpio_irq_handler() called chained_irq_exit() multiple
times for one chained_irq_enter().

Fixes: 3c92db9ac0ca3eee8e46e2424b6c074e2e394ad9
Signed-off-by: Michał Mirosław <[email protected]>
[Also changed the variable to a bool]
Signed-off-by: Linus Walleij <[email protected]>

Merge branch 'dsa-rework-EEE-support'

Vivien Didelot says:

====================
net: dsa: rework EEE support

EEE implies configuring the port's PHY and MAC of both ends of the wire.

The current EEE support in DSA mixes PHY and MAC configuration, which is
bad because PHYs must be configured through a proper PHY driver. The DSA
switch operations for EEE are only meant for configuring the port's MAC,
which are integrated in the Ethernet switch device.

This patchset fixes the EEE support in qca8k driver, makes the DSA layer
call phy_init_eee for all drivers, and remove the EEE support from the
mv88e6xxx driver since the Marvell PHY driver should be enough for it.

Changes in v2:
- make PHY device and DSA EEE ops mandatory for slave EEE operations.
- simply return 0 in drivers which don't need to do anything to
configure the port' MAC. Subsequent PHY calls will be enough.
====================

Signed-off-by: David S. Miller <[email protected]>

net: dsa: rename switch EEE ops

To avoid confusion with the PHY EEE settings, rename the .set_eee and
.get_eee ops to respectively .set_mac_eee and .get_mac_eee.

Signed-off-by: Vivien Didelot <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: remove EEE support

The PHY's EEE settings are already accessed by the DSA layer through the
Marvell PHY driver and there is nothing to be done for switch's MACs.

Remove all EEE support from the mv88e6xxx driver and simply return 0
from the EEE ops.

Signed-off-by: Vivien Didelot <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: remove PHY device argument from .set_eee

The DSA switch operations for EEE are only meant to configure a port's
MAC EEE settings. The port's PHY EEE settings are accessed by the DSA
layer and must be made available via a proper PHY driver.

In order to reduce this confusion, remove the phy_device argument from
the .set_eee operation.

Signed-off-by: Vivien Didelot <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: call phy_init_eee in DSA layer

All DSA drivers are calling phy_init_eee if eee_enabled is true.

Move up this statement in the DSA layer to simplify the DSA drivers.
qca8k does not require to cache the ethtool_eee structures from now on.

Signed-off-by: Vivien Didelot <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: mv88e6xxx: call phy_init_eee

It is safer to init the EEE before the DSA layer call
phy_ethtool_set_eee, as sf2 and qca8k are doing.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: bcm_sf2: remove unneeded supported flags

The SF2 driver is masking the supported bitfield of its private copy of
the ports' ethtool_eee structures. It is used nowhere, thus remove it.

Signed-off-by: Vivien Didelot <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: qca8k: empty qca8k_get_eee

phy_ethtool_get_eee is already called by the DSA layer, thus remove the
duplicated call in the qca8k driver.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: qca8k: do not cache unneeded EEE fields

The qca8k driver is currently caching a bitfield of the supported member
of a ethtool_eee private structure, which is unused.

Only the eee_enabled field of the private ethtool_eee copy is updated,
thus using p->advertised and p->lp_advertised is also erroneous.

Remove the usage of these private ethtool_eee members and only rely on
phy_ethtool_get_eee to assign the eee_active member.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: qca8k: enable EEE once

If EEE is queried enabled, qca8k_set_eee calls qca8k_eee_enable_set
twice (because it is already called in qca8k_eee_init). Fix that.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: dsa: qca8k: fix EEE init

The qca8k obviously copied code from the sf2 driver as how to set EEE:

    if (e->eee_enabled) {
        p->eee_enabled = qca8k_eee_init(ds, port, phydev);
        if (!p->eee_enabled)
            ret = -EOPNOTSUPP;
    }

But it did not use the same logic for the EEE init routine, which is
"Returns 0 if EEE was not enabled, or 1 otherwise". This results in
returning -EOPNOTSUPP on success and caching EEE enabled on failure.

This patch fixes the returned value of qca8k_eee_init.

Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>