Git Repo - linux.git/log

parisc: Ignore the pkey system calls for now

Signed-off-by: Helge Deller <[email protected]>

parisc: Use LINUX_GATEWAY_ADDR define instead of hardcoded value

LINUX_GATEWAY_ADDR is defined in unistd.h. Let's use it.

Signed-off-by: Helge Deller <[email protected]>

parisc: Ensure consistent state when switching to kernel stack at syscall entry

We have one critical section in the syscall entry path in which we switch from
the userspace stack to kernel stack. In the event of an external interrupt, the
interrupt code distinguishes between those two states by analyzing the value of
sr7. If sr7 is zero, it uses the kernel stack. Therefore it's important, that
the value of sr7 is in sync with the currently enabled stack.

This patch now disables interrupts while executing the critical section. This
prevents the interrupt handler to possibly see an inconsistent state which in
the worst case can lead to crashes.

Interestingly, in the syscall exit path interrupts were already disabled in the
critical section which switches back to the userspace stack.

Cc: <[email protected]>
Signed-off-by: John David Anglin <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: Avoid trashing sr2 and sr3 in LWS code

There is no need to trash sr2 and sr3 in the Light-weight syscall (LWS). sr2
already points to kernel space (it's zero in userspace, otherwise syscalls
wouldn't work), and since the LWS code is executed in userspace, we can simply
ignore to preload sr3.

Signed-off-by: John David Anglin <[email protected]>
Signed-off-by: Helge Deller <[email protected]>

parisc: use KERN_CONT when printing device inventory

Recent changes to printk require KERN_CONT uses to continue logging messages.
So add KERN_CONT to output of device inventory.

Signed-off-by: Helge Deller <[email protected]>

kvm: x86: Check memopp before dereference (CVE-2016-8630)

Commit 41061cdb98 ("KVM: emulate: do not initialize memopp") removes a
check for non-NULL under incorrect assumptions. An undefined instruction
with a ModR/M byte with Mod=0 and R/M-5 (e.g. 0xc7 0x15) will attempt
to dereference a null pointer here.

Fixes: 41061cdb98a0bec464278b4db8e894a3121671f5
Message-Id: <1477592752 [email protected]>
Signed-off-by: Owen Hofmann <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: nVMX: VMCLEAR an active shadow VMCS after last use

After a successful VM-entry with the "VMCS shadowing" VM-execution
control set, the shadow VMCS referenced by the VMCS link pointer field
in the current VMCS becomes active on the logical processor.

A VMCS that is made active on more than one logical processor may become
corrupted. Therefore, before an active VMCS can be migrated to another
logical processor, the first logical processor must execute a VMCLEAR
for the active VMCS. VMCLEAR both ensures that all VMCS data are written
to memory and makes the VMCS inactive.

Signed-off-by: Jim Mattson <[email protected]>
Reviewed-By: David Matlack <[email protected]>
Message-Id: <1477668579 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK

Since commit a545ab6a0085 ("kvm: x86: add tsc_offset field to struct
kvm_vcpu_arch", 2016-09-07) the offset between host and L1 TSC is
cached and need not be fished out of the VMCS or VMCB.  This means
that we can implement adjust_tsc_offset_guest and read_l1_tsc
entirely in generic code.  The simplification is particularly
significant for VMX code, where vmx->nested.vmcs01_tsc_offset
was duplicating what is now in vcpu->arch.tsc_offset.  Therefore
the vmcs01_tsc_offset can be dropped completely.

More importantly, this fixes KVM_GET_CLOCK/KVM_SET_CLOCK
which, after commit 108b249c453d ("KVM: x86: introduce get_kvmclock_ns",
2016-09-01) called read_l1_tsc while the VMCS was not loaded.
It thus returned bogus values on Intel CPUs.

Fixes: 108b249c453dd7132599ab6dc7e435a7036c193f
Reported-by: Roman Kagan <[email protected]>
Reviewed-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge tag 'gcc-plugins-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull gcc plugin fixes from Kees Cook:
- make sure required exports from gcc plugins are visible to gcc
- switch latent_entropy to unsigned long to avoid stack frame bloat

* tag 'gcc-plugins-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Fix wrong gcc code generation with 64 bit variables
gcc-plugins: Export symbols needed by gcc

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
"Tests, fixes and cleanups.

  Just minor tweaks, there's nothing major in this cycle"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_ring: mark vring_dma_dev inline
  virtio/vhost: add Jason to list of maintainers
  virtio_blk: Delete an unnecessary initialisation in init_vq()
  virtio_blk: Use kmalloc_array() in init_vq()
  virtio: remove config.c
  virtio: console: Unlock vqs while freeing buffers
  ringtest: poll for new buffers once before updating event index
  ringtest: commonize implementation of poll_avail/poll_used
  ringtest: use link-time optimization
  virtio: update balloon size in balloon "probe"
  virtio_ring: Make interrupt suppression spec compliant
  virtio_pci: Limit DMA mask to 44 bits for legacy virtio devices

Merge tag 'vfio-v4.9-rc4' of git://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:
"SET_IRQS ioctl parameter sanitization (Vlad Tsyrklevich)"

* tag 'vfio-v4.9-rc4' of git://github.com/awilliam/linux-vfio:
vfio/pci: Fix integer overflows, bitmask check

nfsd: Fix general protection fault in release_lock_stateid()

When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
I get this crash on the server:

Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
Oct 28 22:04:30 klimt kernel: Stack:
Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
Oct 28 22:04:30 klimt kernel: Call Trace:
Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---

Jeff Layton says:
> Hm...now that I look though, this is a little suspicious:
>
>    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
>
> I wonder if it's possible for the openstateid to have already been
> destroyed at this point.
>
> We might be better off doing something like this to get the client pointer:
>
>    stp->st_stid.sc_client;
>
> ...which should be more direct and less dependent on other stateids
> staying valid.

With the suggested change, I am no longer able to reproduce the above oops.

v2: Fix unhash_lock_stateid() as well

Fix-suggested-by: Jeff Layton <[email protected]>
Fixes: 42691398be08 ('nfsd: Fix race between FREE_STATEID and LOCK')
Signed-off-by: Chuck Lever <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>

svcrdma: backchannel cannot share a page for send and rcv buffers

The underlying transport releases the page pointed to by rq_buffer
during xprt_rdma_bc_send_request. When the backchannel reply arrives,
rq_rbuffer then points to freed memory.

Fixes: 68778945e46f ('SUNRPC: Separate buffer pointers for RPC ...')
Signed-off-by: Chuck Lever <[email protected]>
Cc: Jeff Layton <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

gpio/mvebu: Use irq_domain_add_linear

This fixes the irq allocation in this driver to not print:
irq: Cannot allocate irq_descs @ IRQ34, assuming pre-allocated
irq: Cannot allocate irq_descs @ IRQ66, assuming pre-allocated

Which happens because the driver already called irq_alloc_descs()
and so the change to use irq_domain_add_simple resulted in calling
irq_alloc_descs() twice.

Modernize the irq allocation in this driver to use the
irq_domain_add_linear flow directly and eliminate the use of
irq_domain_add_simple/legacy

Fixes: ce931f571b6d ("gpio/mvebu: convert to use irq_domain_add_simple()")
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

fork: Add task stack refcounting sanity check and prevent premature task stack freeing

If something goes wrong with task stack refcounting and a stack
refcount hits zero too early, warn and leak it rather than
potentially freeing it early (and silently).

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/f29119c783a9680a4b4656e751b6123917ace94b.1477926663.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>

drm/nouveau/acpi: fix check for power resources support

Check whether the kernel really supports power resources for a device,
otherwise the power might not be removed when the device is runtime
suspended (DSM should still work in these cases where PR does not).

This is a workaround for a problem where ACPICA and Windows 10 differ in
behavior. ACPICA does not correctly enumerate power resources within a
conditional block (due to delayed execution of such blocks) and as a
result power_resources is set to false even if _PR3 exists.

Fixes: 692a17dcc292 ("drm/nouveau/acpi: fix lockup with PCIe runtime PM")
Link: https://bugs.freedesktop.org/show_bug.cgi?id=98398
Reported-and-tested-by: Rick Kerkhof <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Cc: [email protected] # v4.8+
Signed-off-by: Peter Wu <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

Merge branch 'drm-fixes-staging' of ssh://people.freedesktop.org/~/linux into drm-fixes

Pull the staging fixes tree I had into rc3 to make real -fixes again.

gpio: of: fix GPIO drivers with multiple gpio_chip for a single node

Sylvain Lemieux reports the LPC32xx GPIO driver is broken since
commit 762c2e46c059 ("gpio: of: remove of_gpiochip_and_xlate() and
struct gg_data").  Probably, gpio-etraxfs.c and gpio-davinci.c are
broken too.

Those drivers register multiple gpio_chip that are associated to a
single OF node, and their own .of_xlate() checks if the passed
gpio_chip is valid.

Now, the problem is of_find_gpiochip_by_node() returns the first
gpio_chip found to match the given node.  So, .of_xlate() fails,
except for the first GPIO bank.

Reverting the commit could be a solution, but I do not want to go
back to the mess of struct gg_data.  Another solution here is to
take the match by a node pointer and the success of .of_xlate().
It is a bit clumsy to call .of_xlate twice; for gpio_chip matching
and for really getting the gpio_desc index.  Perhaps, our long-term
goal might be to convert the drivers to single chip registration,
but this commit will solve the problem until then.

Fixes: 762c2e46c059 ("gpio: of: remove of_gpiochip_and_xlate() and struct gg_data")
Signed-off-by: Masahiro Yamada <[email protected]>
Reported-by: Sylvain Lemieux <[email protected]>
Tested-by: David Lechner <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

gpio: GPIO_GET_LINE{HANDLE,EVENT}_IOCTL: Fix file descriptor leak

When allocating a new line handle or event a file is allocated that it is
associated to. The file is attached to a file descriptor of the current
process and the file descriptor is returned to userspace using
copy_to_user(). If this copy operation fails the line handle or event
allocation is aborted, all acquired resources are freed and an error is
returned.

But the file struct is not freed and left attached to the userspace
application and even though the file descriptor number was not copied it is
trivial to guess. If a userspace application performs a IOCTL on such a
left over file descriptor it will trigger a use-after-free and if the file
descriptor is closed (latest when the application exits) a double-free is
triggered.

anon_inode_getfd() performs 3 tasks, allocate a file struct, allocate a
file descriptor for the current process and install the file struct in the
file descriptor. As soon as the file struct is installed in the file
descriptor it is accessible by userspace (even if the IOCTL itself hasn't
completed yet), this means uninstalling the fd on the error path is not an
option, since userspace might already got a reference to the file.

Instead anon_inode_getfd() needs to be broken into its individual steps.
The allocation of the file struct and file descriptor is done first, then
the copy_to_user() is executed and only if it succeeds the file is
installed.

Since the file struct is reference counted it can not be just freed, but
its reference needs to be dropped, which will also call the release()
callback, which will free the state attached to the file. So in this case
the normal error cleanup path should not be taken.

Cc: [email protected]
Fixes: d932cd49182f ("gpio: free handles in fringe cases")
Signed-off-by: Lars-Peter Clausen <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

Merge tag 'spi-fix-v4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown: "A few small fixes for SPI, one core fix
  that only applies in cases where we're handling DT overlays and a
  couple of driver specific fixes:

   - Fix handling of error cases when instantiating DT overlays so we
     don't end up just ignoring devices that encountered an error during
     instantiation.

   - Avoid reading uninitialized data when handing spurious interrupts
     in the espi driver.

   - A driver specific fix for the dspi driver to fix a bad interaction
     with u-boot"

* tag 'spi-fix-v4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: dspi: clear SPI_SR before enable interrupt
  spi: fsl-espi: avoid processing uninitalized data on error
  spi: mark device nodes only in case of successful instantiation

latent_entropy: Fix wrong gcc code generation with 64 bit variables

The stack frame size could grow too large when the plugin used long long
on 32-bit architectures when the given function had too many basic blocks.

The gcc warning was:

drivers/pci/hotplug/ibmphp_ebda.c: In function 'ibmphp_access_ebda':
drivers/pci/hotplug/ibmphp_ebda.c:409:1: warning: the frame size of 1108 bytes is larger than 1024 bytes [-Wframe-larger-than=]

This switches latent_entropy from u64 to unsigned long.

Thanks to PaX Team and Emese Revfy for the patch.

Signed-off-by: Kees Cook <[email protected]>

gcc-plugins: Export symbols needed by gcc

This explicitly exports symbols that gcc expects from plugins.

Based on code from Emese Revfy.

Signed-off-by: Kees Cook <[email protected]>

Merge tag 'regulator-fix-v4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
"Fix ramp_delay warnings for v4.9

  A new warning was introduced for missing information about the time
  that regulators take to power on in v4.9. This is in theory a real
  issue but for most practical regulators the communication overhead of
  talking to the device is greater than the ramp time so a lot of
  drivers don't set it and the warning is far too noisy without
  identifying practical issues.

  Just remove the warning for now"

* tag 'regulator-fix-v4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: silence warning: "VDD1: ramp_delay not set"

Merge tag 'regmap-fix-v4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
"A couple of small build fixes here, nothing major.

  The missing include is triggered in some configurations and the
  renaming of ret is defensive for the benefit of some drivers people
  are in the process of mainlining"

* tag 'regmap-fix-v4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Rename ret variable in regmap_read_poll_timeout
  regmap: include <linux/delay.h> from include/linux/regmap.h

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull TPM fix from James Morris.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: remove invalid min length check from tpm_do_selftest()

tpm: remove invalid min length check from tpm_do_selftest()

Removal of this check was not properly amended to the original commit.

Cc: [email protected]
Fixes: 0c541332231e ("tpm: use tpm_pcr_read_dev() in tpm_do_selftest()")
Signed-off-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: James Morris <[email protected]>

Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"A fix for a regression on ARMv4T CPUs, and wiring up the new pkey
  syscalls for ARM"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: wire up new pkey syscalls
  ARM: fix oops when using older ARMv4T CPUs

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
"Several sparc64 bug fixes here:

  1) Make the user copy routines on sparc64 return a properly accurate
     residual length when an exception occurs.

  2) We can get enormous kernel TLB range flush requests from vmalloc
     unmaps, so handle these more gracefully by doing full flushes
     instead of going page-by-page.

  3) Cope properly with negative branch offsets in sparc jump-label
     support, from James Clarke.

  4) Some old-style decl GCC warning fixups from Tobias Klauser"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Handle extremely large kernel TLB range flushes more gracefully.
  sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.
  sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.
  sparc64: Handle extremely large kernel TSB range flushes sanely.
  sparc: Handle negative offsets in arch_jump_label_transform
  sparc64: Fix illegal relative branches in hypervisor patched TLB code.
  sparc64: Delete now unused user copy fixup functions.
  sparc64: Delete now unused user copy assembler helpers.
  sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NG2copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert U1copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert GENcopy_{from,to}_user to accurate exception reporting.
  sparc64: Convert copy_in_user to accurate exception reporting.
  sparc64: Prepare to move to more saner user copy exception handling.
  sparc64: Delete __ret_efault.
  sparc32: Fix old style declaration GCC warnings
  sparc64: Fix old style declaration GCC warnings
  sparc64: Setup a scheduling domain for highest level cache.

ovl: fsync after copy-up

Make sure the copied up file hits the disk before renaming to the final
destination. If this is not done then the copy-up may corrupt the data in
the file in case of a crash.

Signed-off-by: Miklos Szeredi <[email protected]>
Cc: <[email protected]>

ovl: fix get_acl() on tmpfs

tmpfs doesn't have ->get_acl() because it only uses cached acls.

This fixes the acl tests in pjdfstest when tmpfs is used as the upper layer
of the overlay.

Reported-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes")
Cc: <[email protected]> # v4.8

ovl: update S_ISGID when setting posix ACLs

This change fixes xfstest generic/375, which failed to clear the
setgid bit in the following test case on overlayfs:

  touch $testfile
  chown 100:100 $testfile
  chmod 2755 $testfile
  _runas -u 100 -g 101 -- setfacl -m u::rwx,g::rwx,o::rwx $testfile

Reported-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Tested-by: Amir Goldstein <[email protected]>
Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting")
Cc: <[email protected]> # v4.8

virtio_ring: mark vring_dma_dev inline

This inline function is unused on configurations
where dma_map/unmap are empty macros.

Make the function inline to avoid gcc errors because
of an unused static function.

Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio/vhost: add Jason to list of maintainers

Jason's been one of the mst active contributors
to virtio and vhost, it will help to formalize this
and list him as co-maintainer.

Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio_blk: Delete an unnecessary initialisation in init_vq()

The local variable "err" will be set to an appropriate value
by a following statement.
Thus omit the explicit initialisation at the beginning.

Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio_blk: Use kmalloc_array() in init_vq()

Multiplications for the size determination of memory allocations
indicated that array data structures should be processed.
Thus use the corresponding function "kmalloc_array".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: remove config.c

Remove unused file config.c

Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: console: Unlock vqs while freeing buffers

Commit c6017e793b93 ("virtio: console: add locks around buffer removal
in port unplug path") added locking around the freeing of buffers in the
vq. However, when free_buf() is called with can_sleep = true and rproc
is enabled, it calls dma_free_coherent() directly, requiring interrupts
to be enabled. Currently a WARNING is triggered due to the spin locking
around free_buf, with a call stack like this:

WARNING: CPU: 3 PID: 121 at ./include/linux/dma-mapping.h:433
free_buf+0x1a8/0x288
Call Trace:
[<8040c538>] show_stack+0x74/0xc0
[<80757240>] dump_stack+0xd0/0x110
[<80430d98>] __warn+0xfc/0x130
[<80430ee0>] warn_slowpath_null+0x2c/0x3c
[<807e7c6c>] free_buf+0x1a8/0x288
[<807ea590>] remove_port_data+0x50/0xac
[<807ea6a0>] unplug_port+0xb4/0x1bc
[<807ea858>] virtcons_remove+0xb0/0xfc
[<807b6734>] virtio_dev_remove+0x58/0xc0
[<807f918c>] __device_release_driver+0xac/0x134
[<807f924c>] device_release_driver+0x38/0x50
[<807f7edc>] bus_remove_device+0xfc/0x130
[<807f4b74>] device_del+0x17c/0x21c
[<807f4c38>] device_unregister+0x24/0x38
[<807b6b50>] unregister_virtio_device+0x28/0x44

Fix this by restructuring the loops to allow the locks to only be taken
where it is necessary to protect the vqs, and release it while the
buffer is being freed.

Fixes: c6017e793b93 ("virtio: console: add locks around buffer removal in port unplug path")
Cc: [email protected]
Signed-off-by: Matt Redfearn <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ringtest: poll for new buffers once before updating event index

Updating the event index has a memory barrier and causes more work
on the other side to actually signal the event. It is unnecessary
if a new buffer has already appeared on the ring, so poll once before
doing the update.

The effect of this on the 0.9 ring implementation is pretty much
invisible, but on the new-style ring it provides a consistent 3%
performance improvement.

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ringtest: commonize implementation of poll_avail/poll_used

Provide new primitives used_empty/avail_empty and
build poll_avail/poll_used on top of it.

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

ringtest: use link-time optimization

By using -flto and -fwhole-program, all functions from the ring implementation
can be treated as static and possibly inlined. Force this to happen through
the GCC flatten attribute.

Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: update balloon size in balloon "probe"

The following commit 'fad7b7b27b6a (virtio_balloon: Use a workqueue
instead of "vballoon" kthread)' has added a regression. Original code with
kthread starts the thread inside probe and checks the necessity to update
balloon inside the thread immediately.

Nowadays the code behaves differently. Work is queued only on the first
command from the host after the negotiation. Thus there is a window
especially at the guest startup or the module reloading when the balloon
size is not updated until the notification from the host.

This patch adds balloon size check at the end of the probe to match
original behaviour.

Signed-off-by: Konstantin Neumoin <[email protected]>
Signed-off-by: Denis V. Lunev <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio_ring: Make interrupt suppression spec compliant

According to the spec, if the VIRTIO_RING_F_EVENT_IDX feature bit is
negotiated the driver MUST set flags to 0. Not dirtying the available
ring in virtqueue_disable_cb also has a minor positive performance
impact, improving L1 dcache load missed by ~0.5% in vring_bench.

Writes to the used event field (vring_used_event) are still unconditional.

Cc: Michael S. Tsirkin <[email protected]>
Cc: <[email protected]> # f277ec4 virtio_ring: shadow available
Cc: <[email protected]>
Signed-off-by: Ladi Prosek <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio_pci: Limit DMA mask to 44 bits for legacy virtio devices

Legacy virtio defines the virtqueue base using a 32-bit PFN field, with
a read-only register indicating a fixed page size of 4k.

This can cause problems for DMA allocators that allocate top down from
the DMA mask, which is set to 64 bits. In this case, the addresses are
silently truncated to 44-bit, leading to IOMMU faults, failure to read
from the queue or data corruption.

This patch restricts the coherent DMA mask for legacy PCI virtio devices
to 44 bits, which matches the specification.

Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Benjamin Serebrin <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Lots of fixes, mostly drivers as is usually the case.

   1) Don't treat zero DMA address as invalid in vmxnet3, from Alexey
      Khoroshilov.

   2) Fix element timeouts in netfilter's nft_dynset, from Anders K.
      Pedersen.

   3) Don't put aead_req crypto struct on the stack in mac80211, from
      Ard Biesheuvel.

   4) Several uninitialized variable warning fixes from Arnd Bergmann.

   5) Fix memory leak in cxgb4, from Colin Ian King.

   6) Fix bpf handling of VLAN header push/pop, from Daniel Borkmann.

   7) Several VRF semantic fixes from David Ahern.

   8) Set skb->protocol properly in ip6_tnl_xmit(), from Eli Cooper.

   9) Socket needs to be locked in udp_disconnect(), from Eric Dumazet.

  10) Div-by-zero on 32-bit fix in mlx4 driver, from Eugenia Emantayev.

  11) Fix stale link state during failover in NCSCI driver, from Gavin
      Shan.

  12) Fix netdev lower adjacency list traversal, from Ido Schimmel.

  13) Propvide proper handle when emitting notifications of filter
      deletes, from Jamal Hadi Salim.

  14) Memory leaks and big-endian issues in rtl8xxxu, from Jes Sorensen.

  15) Fix DESYNC_FACTOR handling in ipv6, from Jiri Bohac.

  16) Several routing offload fixes in mlxsw driver, from Jiri Pirko.

  17) Fix broadcast sync problem in TIPC, from Jon Paul Maloy.

  18) Validate chunk len before using it in SCTP, from Marcelo Ricardo
      Leitner.

  19) Revert a netns locking change that causes regressions, from Paul
      Moore.

  20) Add recursion limit to GRO handling, from Sabrina Dubroca.

  21) GFP_KERNEL in irq context fix in ibmvnic, from Thomas Falcon.

  22) Avoid accessing stale vxlan/geneve socket in data path, from
      Pravin Shelar"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (189 commits)
  geneve: avoid using stale geneve socket.
  vxlan: avoid using stale vxlan socket.
  qede: Fix out-of-bound fastpath memory access
  net: phy: dp83848: add dp83822 PHY support
  enic: fix rq disable
  tipc: fix broadcast link synchronization problem
  ibmvnic: Fix missing brackets in init_sub_crq_irqs
  ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context
  Revert "ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context"
  arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold
  net/mlx4_en: Save slave ethtool stats command
  net/mlx4_en: Fix potential deadlock in port statistics flow
  net/mlx4: Fix firmware command timeout during interrupt test
  net/mlx4_core: Do not access comm channel if it has not yet been initialized
  net/mlx4_en: Fix panic during reboot
  net/mlx4_en: Process all completions in RX rings after port goes up
  net/mlx4_en: Resolve dividing by zero in 32-bit system
  net/mlx4_core: Change the default value of enable_qos
  net/mlx4_core: Avoid setting ports to auto when only one port type is supported
  net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec
  ...

geneve: avoid using stale geneve socket.

This patch is similar to earlier vxlan patch.
Geneve device close operation frees geneve socket. This
operation can race with geneve-xmit function which
dereferences geneve socket. Following patch uses RCU
mechanism to avoid this situation.

Signed-off-by: Pravin B Shelar <[email protected]>
Acked-by: John W. Linville <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vxlan: avoid using stale vxlan socket.

When vxlan device is closed vxlan socket is freed. This
operation can race with vxlan-xmit function which
dereferences vxlan socket. Following patch uses RCU
mechanism to avoid this situation.

Signed-off-by: Pravin B Shelar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qede: Fix out-of-bound fastpath memory access

Driver allocates a shadow array for transmitted SKBs with X entries;
That means valid indices are {0,...,X - 1}. [X == 8191]
Problem is the driver also uses X as a mask for a
producer/consumer in order to choose the right entry in the
array which allows access to entry X which is out of bounds.

To fix this, simply allocate X + 1 entries in the shadow array.

Signed-off-by: Yuval Mintz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: phy: dp83848: add dp83822 PHY support

This PHY has a compatible register set with DP83848x so
add support for it.

Acked-by: Andrew F. Davis <[email protected]>
Signed-off-by: Roger Quadros <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

enic: fix rq disable

When MTU is changed from 9000 to 1500 while there is burst of inbound 9000
bytes packets, adaptor sometimes delivers 9000 bytes packets to 1500 bytes
buffers. This causes memory corruption and sometimes crash.

This is because of a race condition in adaptor between "RQ disable"
clearing descriptor mini-cache and mini-cache valid bit being set by
completion of descriptor fetch. This can result in stale RQ desc being
cached and used when packets arrive. In this case, the stale descriptor
have old MTU value.

Solution is to write RQ->disable twice. The first write will stop any
further desc fetches, allowing the second disable to clear the mini-cache
valid bit without danger of a race.

Also, the check for rq->running becoming 0 after writing rq->enable to 0
is not done properly. When incoming packets are flooding the interface,
rq->running will pulse high for each dropped packet. Since the driver was
waiting for 10us between each poll, it is possible to see rq->running = 1
1000 times in a row, even though it is not actually stuck running.
This results in false failure of vnic_rq_disable(). Fix is to try more
than 1000 time without delay between polls to ensure we do not miss when
running goes low.

In old adaptors rq->enable needs to be re-written to 0 when posted_index
is reset in vnic_rq_clean() in order to keep rq->prefetch_index in sync.

Signed-off-by: Govindarajulu Varadarajan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: fix broadcast link synchronization problem

In commit 2d18ac4ba745 ("tipc: extend broadcast link initialization
criteria") we tried to fix a problem with the initial synchronization
of broadcast link acknowledge values. Unfortunately that solution is
not sufficient to solve the issue.

We have seen it happen that LINK_PROTOCOL/STATE packets with a valid
non-zero unicast acknowledge number may bypass BCAST_PROTOCOL
initialization, NAME_DISTRIBUTOR and other STATE packets with invalid
broadcast acknowledge numbers, leading to premature opening of the
broadcast link. When the bypassed packets finally arrive, they are
inadvertently accepted, and the already correctly initialized
acknowledge number in the broadcast receive link is overwritten by
the invalid (zero) value of the said packets. After this the broadcast
link goes stale.

We now fix this by marking the packets where we know the acknowledge
value is or may be invalid, and then ignoring the acks from those.

To this purpose, we claim an unused bit in the header to indicate that
the value is invalid. We set the bit to 1 in the initial BCAST_PROTOCOL
synchronization packet and all initial ("bulk") NAME_DISTRIBUTOR
packets, plus those LINK_PROTOCOL packets sent out before the broadcast
links are fully synchronized.

This minor protocol update is fully backwards compatible.

Reported-by: John Thompson <[email protected]>
Tested-by: John Thompson <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Fix missing brackets in init_sub_crq_irqs

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context

Schedule these XPORT event tasks in the shared workqueue
so that IRQs are not freed in an interrupt context when
sub-CRQs are released.

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Revert "ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context"

This reverts commit 8d7533e5aaad1c94386a8101a36b0617987966b7.

It introduced kbuild failures, new version coming.

Signed-off-by: David S. Miller <[email protected]>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2016-10-27

This series contains fixes to ixgbe and i40e.

Emil fixes a NULL pointer dereference when a macvlan interface is brought
up while the PF is still down.

David root caused the original panic that was fixed by commit id
(a036244c068612 "i40e: Fix kernel panic on enable/disable LLDP") and the
fix was not quite correct, so removed the get_default_tc() and replaced
it with a #define since there is only one TC supported as a default.

Guilherme Piccoli fixes an issue where if we modprobe the driver module
without enough MSI-X interrupts, then unload the module and reload it
again, the kernel would crash. So if we fail to allocate enough MSI-X
interrupts, we should disable them since they were previously enabled.

Huaibin Wang found that the order of the arguments for
ndo_dflt_bridge_getlink() were in the correct order, so fix the order.
====================

Signed-off-by: David S. Miller <[email protected]>

arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold

Commit 01cfbad "ipv4: Update parameters for csum_tcpudp_magic to their
original types" changed parameters for csum_tcpudp_magic and
csum_tcpudp_nofold for many platforms but not for PowerPC.

Fixes: 01cfbad "ipv4: Update parameters for csum_tcpudp_magic to their original types"
Cc: Alexander Duyck <[email protected]>
Signed-off-by: Ivan Vecera <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Linux 4.9-rc3

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 bugfix from Thomas Gleixner:
"A single bugfix for the recent changes related to registering the boot
  cpu when this has not happened before prefill_possible_map().

  The main problem with this change got fixed already, but we missed the
  case where the local APIC is not yet mapped, when prefill_possible_map()
  is invoked, so the registration of the boot cpu which has the APIC bit
  set in CPUID will explode.

  I should have seen that issue earlier, but all I can do now is feeling
  embarassed"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smpboot: Init apic mapping before usage

Merge branch 'mlx4-fixes'

Tariq Toukan says:

====================
mlx4 misc fixes for 4.9

This patchset contains several bug fixes from the team to the
mlx4 Eth and Core drivers.

Series generated against net commit:
ecc515d7238f 'sctp: fix the panic caused by route update'
====================

Signed-off-by: David S. Miller <[email protected]>

net/mlx4_en: Save slave ethtool stats command

Following the previous patch, as an optimization, the slave will
not even bother sending the DUMP_ETH_STATS command over the
comm channel.

Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_en: Fix potential deadlock in port statistics flow

mlx4_en_DUMP_ETH_STATS took the *counter mutex* and then
called the FW command, with WRAPPED attribute. As a result, the fw command
is wrapped on the Hypervisor when it calls mlx4_en_DUMP_ETH_STATS.
The FW command wrapper flow on the hypervisor takes the *slave_cmd_mutex*
during processing.

At the same time, a VF could be in the process of coming up, and could
call mlx4_QUERY_FUNC_CAP. On the hypervisor, the command flow takes the
*slave_cmd_mutex*, then executes mlx4_QUERY_FUNC_CAP_wrapper.
mlx4_QUERY_FUNC_CAP wrapper calls mlx4_get_default_counter_index(),
which takes the *counter mutex*. DEADLOCK.

The fix is that the DUMP_ETH_STATS fw command should be called with
the NATIVE attribute, so that on the hypervisor, this command does not
enter the wrapper flow.

Since the Hypervisor no longer goes through the wrapper code, we also
simply return 0 in mlx4_DUMP_ETH_STATS_wrapper (i.e.the function succeeds,
but the returned data will be all zeroes).
No need to test if it is the Hypervisor going through the wrapper.

Fixes: f9baff509f8a ("mlx4_core: Add "native" argument to mlx4_cmd ...")
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4: Fix firmware command timeout during interrupt test

Currently interrupt test that is part of ethtool selftest runs the
check over all interrupt vectors of the device.
In mlx4_en package part of interrupt vectors are uninitialized since
mlx4_ib doesn't exist. This causes NOP FW command to time out.
Change logic to test current port interrupt vectors only.

Signed-off-by: Eugenia Emantayev <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_core: Do not access comm channel if it has not yet been initialized

In the Hypervisor, there are several FW commands which are invoked
before the comm channel is initialized (in mlx4_multi_func_init).
These include MOD_STAT_CONFIG, QUERY_DEV_CAP, INIT_HCA, and others.

If any of these commands fails, say with a timeout, the Hypervisor
driver enters the internal error reset flow. In this flow, the driver
attempts to notify all slaves via the comm channel that an internal error
has occurred.

Since the comm channel has not yet been initialized (i.e., mapped via
ioremap), this will cause dereferencing a NULL pointer.

To fix this, do not access the comm channel in the internal error flow
if it has not yet been initialized.

Fixes: 55ad359225b2 ("net/mlx4_core: Enable device recovery flow with SRIOV")
Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_en: Fix panic during reboot

Fix a kernel panic that occurs as a result of an asynchronous event
handled in roce_gid_mgmt:
mlx4_en_get_drvinfo is called and accesses freed resources.

This happens in a shutdown flow only, since pci device is destroyed
while netdevice is still alive.

Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Eugenia Emantayev <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_en: Process all completions in RX rings after port goes up

Currently there is a race between incoming traffic and
initialization flow. HW is able to receive the packets
after INIT_PORT is done and unicast steering is configured.
Before we set priv->port_up NAPI is not scheduled and
receive queues become full. Therefore we never get
new interrupts about the completions.
This issue could happen if running heavy traffic during
bringing port up.
The resolution is to schedule NAPI once port_up is set.
If receive queues were full this will process all cqes
and release them.

Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Erez Shitrit <[email protected]>
Signed-off-by: Eugenia Emantayev <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_en: Resolve dividing by zero in 32-bit system

When doing roundup_pow_of_two for large enough number with
bit 31, an overflow will occur and a value equal to 1 will
be returned. In this case 1 will be subtracted from the return
value and division by zero will be reached.

Fixes: 31c128b66e5b ("net/mlx4_en: Choose time-stamping shift value according to HW frequency")
Signed-off-by: Eugenia Emantayev <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_core: Change the default value of enable_qos

Change the default status of quality of service back to disabled,
as it hurts performance in some cases.

Fixes: 38438f7c7e8c ("net/mlx4: Set enhanced QoS support by default when ...")
Signed-off-by: Moshe Lazer <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_core: Avoid setting ports to auto when only one port type is supported

When only one port type is supported, it should be read only.
We reject changing requests, even to the auto sense mode.

Fixes: 27bf91d6a0d5 ("mlx4_core: Add link type autosensing")
Signed-off-by: Maor Gottlieb <[email protected]>
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec

The resource type enum in the resource tracker was incorrect.
RES_EQ was put in the position of RES_NPORT_ID (a FC resource).

Since the remaining resources maintain their current values,
and RES_EQ is not passed from slaves to the hypervisor in any
FW command, this change affects only the hypervisor.
Therefore, there is no backwards-compatibility issue.

Fixes: 623ed84b1f95 ("mlx4_core: initial header-file changes for SRIOV support")
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'upstream-4.9-rc3' of git://git.infradead.org/linux-ubifs

Pull ubi/ubifs fixes from Richard Weinberger:
"This contains fixes for issues in both UBI and UBIFS:

   - A regression wrt overlayfs, introduced in -rc2.
   - An UBI issue, found by Dan Carpenter's static checker"

* tag 'upstream-4.9-rc3' of git://git.infradead.org/linux-ubifs:
  ubifs: Fix regression in ubifs_readdir()
  ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap()

rds: debug messages are enabled by default

rds use Kconfig option called "RDS_DEBUG" to enable rds debug messages.
This option cause the rds Makefile to add -DDEBUG to the rds gcc command
line.

When CONFIG_DYNAMIC_DEBUG is enabled, the "DEBUG" macro is used by
include/linux/dynamic_debug.h to decide if dynamic debug prints should
be sent by default to the kernel log.

rds should not enable this macro for production builds. rds dynamic
debug work as expected follow this fix.

Signed-off-by: Shamir Rabinovitch <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Reviewed-by: Wengang Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'mac80211-for-davem-2016-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just two fixes:
* a fix to process all events while suspending, so any
   potential calls into the driver are done before it is
   suspended
* small markup fixes for the sphinx documentation conversion
   that's coming into the tree via the doc tree
====================

Signed-off-by: David S. Miller <[email protected]>

ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context

Schedule these XPORT event tasks in the shared workqueue
so that IRQs are not freed in an interrupt context when
sub-CRQs are released.

Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: mv643xx_eth: Fetch the phy connection type from DT

The MAC is capable of RGMII mode and that is probably a more typical
connection type than GMII today (eg it is used by Marvell Reference
designs for several SOCs). Let DT users specify the standard

phy-connection-type = "rgmii-id";

On a phy node.

Signed-off-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"We haven't seen a whole lot of fixes for the first two weeks since the
  merge window, but here is the batch that we have at the moment.

  Nothing sticks out as particularly bad or scary, it's mostly a handful
  of smaller fixes to several platforms. The Uniphier reset controller
  changes could probably have been delayed to 4.10, but they're not
  scary and just plumbing up driver changes that went in during the
  merge window.

  We're also adding another maintainer to Marvell Berlin platforms, to
  help out when Sebastian is too busy. Yay teamwork!"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: imx: mach-imx6q: Fix the PHY ID mask for AR8031
  ARM: dts: vf610: fix IRQ flag of global timer
  ARM: imx: gpc: Fix the imx_gpc_genpd_init() error path
  ARM: imx: gpc: Initialize all power domains
  arm64: dts: Updated NAND DT properties for NS2 SVK
  arm64: dts: uniphier: change MIO node to SD control node
  ARM: dts: uniphier: change MIO node to SD control node
  reset: uniphier: rename MIO reset to SD reset for Pro5, PXs2, LD20 SoCs
  arm64: uniphier: select ARCH_HAS_RESET_CONTROLLER
  ARM: uniphier: select ARCH_HAS_RESET_CONTROLLER
  arm64: dts: Add timer erratum property for LS2080A and LS1043A
  arm64: dts: rockchip: remove the abuse of keep-power-in-suspend
  ARM: multi_v7_defconfig: Enable Intel e1000e driver
  MAINTAINERS: add myself as Marvell berlin SoC maintainer
  bus: qcom-ebi2: depend on ARCH_QCOM or COMPILE_TEST
  ARM: dts: fix the SD card on the Snowball
  arm64: dts: rockchip: remove always-on and boot-on from vcc_sd
  arm64: dts: marvell: fix clocksource for CP110 master SPI0
  ARM: mvebu: Select corediv clk for all mvebu v7 SoC

Merge tag 'batadv-net-for-davem-20161026' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here are three batman-adv bugfix patches:

- Fix RCU usage for neighbor list, by Sven Eckelmann

- Fix BATADV_DBG_ALL loglevel to include TP Meter messages, by Sven Eckelmann

- Fix possible splat when disabling an interface, by Linus Luessing
====================

Signed-off-by: David S. Miller <[email protected]>

Revert "hv_netvsc: report vmbus name in ethtool"

This reverts commit e3f74b841d48
("hv_netvsc: report vmbus name in ethtool")'
because of problem introduced by commit f9a56e5d6a0ba
("Drivers: hv: make VMBus bus ids persistent").
This changed the format of the vmbus name and this new format is too
long to fit in the bus_info field of ethtool.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

packet: on direct_xmit, limit tso and csum to supported devices

When transmitting on a packet socket with PACKET_VNET_HDR and
PACKET_QDISC_BYPASS, validate device support for features requested
in vnet_hdr.

Drop TSO packets sent to devices that do not support TSO or have the
feature disabled. Note that the latter currently do process those
packets correctly, regardless of not advertising the feature.

Because of SKB_GSO_DODGY, it is not sufficient to test device features
with netif_needs_gso. Full validate_xmit_skb is needed.

Switch to software checksum for non-TSO packets that request checksum
offload if that device feature is unsupported or disabled. Note that
similar to the TSO case, device drivers may perform checksum offload
correctly even when not advertising it.

When switching to software checksum, packets hit skb_checksum_help,
which has two BUG_ON checksum not in linear segment. Packet sockets
always allocate at least up to csum_start + csum_off + 2 as linear.

Tested by running github.com/wdebruij/kerneltools/psock_txring_vnet.c

  ethtool -K eth0 tso off tx on
  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v
  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v -N

  ethtool -K eth0 tx off
  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G
  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G -N

v2:
  - add EXPORT_SYMBOL_GPL(validate_xmit_skb_list)

Fixes: d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Willem de Bruijn <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net_sched actions: use nla_parse_nested()

Use nla_parse_nested instead of open-coding the call to
nla_parse() with the attribute data/len.

Signed-off-by: Johannes Berg <[email protected]>
Acked-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

cxgb4: Fix error handling in alloc_uld_rxqs().

Fix to release resources properly in error handling path of
alloc_uld_rxqs(), This patch also removes unwanted arguments
and avoids calling the same function twice.

Fixes: 94cdb8bb993a (cxgb4: Add support for dynamic allocation
of resources for ULD
Signed-off-by: Ganesh Goudar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

IB/mlx4: avoid a -Wmaybe-uninitialize warning

There is an old warning about mlx4_SW2HW_EQ_wrapper on x86:

ethernet/mellanox/mlx4/resource_tracker.c: In function ‘mlx4_SW2HW_EQ_wrapper’:
ethernet/mellanox/mlx4/resource_tracker.c:3071:10: error: ‘eq’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

The problem here is that gcc won't track the state of the variable
across a spin_unlock. Moving the assignment out of the lock is
safe here and avoids the warning.

Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Yishai Hadas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge remote-tracking branches 'spi/fix/dt', 'spi/fix/fsl-dspi' and 'spi/fix/fsl-espi' into spi-linus

spi: dspi: clear SPI_SR before enable interrupt

Once dspi is used in uboot, the SPI_SR have been set by some value.
At this time, if kernel enable the interrupt before clear the
status flag, that will trigger the wrong interrupt.

Signed-off-by: Yuan Yao <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()

This patch updates skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit() when an
IPv6 header is installed to a socket buffer.

This is not a cosmetic change. Without updating this value, GSO packets
transmitted through an ipip6 tunnel have the protocol of ETH_P_IP and
skb_mac_gso_segment() will attempt to call gso_segment() for IPv4,
which results in the packets being dropped.

Fixes: b8921ca83eed ("ip4ip6: Support for GSO/GRO")
Signed-off-by: Eli Cooper <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bpf: fix samples to add fake KBUILD_MODNAME

Some of the sample files are causing issues when they are loaded with tc
and cls_bpf, meaning tc bails out while trying to parse the resulting ELF
file as program/map/etc sections are not present, which can be easily
spotted with readelf(1).

Currently, BPF samples are including some of the kernel headers and mid
term we should change them to refrain from this, really. When dynamic
debugging is enabled, we bail out due to undeclared KBUILD_MODNAME, which
is easily overlooked in the build as clang spills this along with other
noisy warnings from various header includes, and llc still generates an
ELF file with mentioned characteristics. For just playing around with BPF
examples, this can be a bit of a hurdle to take.

Just add a fake KBUILD_MODNAME as a band-aid to fix the issue, same is
done in xdp*_kern samples already.

Fixes: 65d472fb007d ("samples/bpf: add 'pointer to packet' tests")
Fixes: 6afb1e28b859 ("samples/bpf: Add tunnel set/get tests.")
Fixes: a3f74617340b ("cgroup: bpf: Add an example to do cgroup checking in BPF")
Reported-by: Chandrasekar Kannan <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'char-misc-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
"Here are a few small char/misc driver fixes for reported issues.

  The "biggest" are two binder fixes for reported issues that have been
  shipping in Android phones for a while now, the others are various
  fixes for reported problems.

  And there's a MAINTAINERS update for good measure.

  All have been in linux-next with no reported issues"

* tag 'char-misc-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  MAINTAINERS: Add entry for genwqe driver
  VMCI: Doorbell create and destroy fixes
  GenWQE: Fix bad page access during abort of resource allocation
  vme: vme_get_size potentially returning incorrect value on failure
  extcon: qcom-spmi-misc: Sync the extcon state on interrupt
  hv: do not lose pending heartbeat vmbus packets
  mei: txe: don't clean an unprocessed interrupt cause.
  ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
  ANDROID: binder: Add strong ref checks

Merge remote-tracking branches 'regmap/fix/header' and 'regmap/fix/macro' into regmap-linus

Merge tag 'v4.9-rockchip-dts64-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes

Correct regulator handling on Rockchip arm64 boards to make
bind/unbind calls work correctly and remove a sdio-only
property from non-sdio mmc hosts, that accidentially was
added there.

* tag 'v4.9-rockchip-dts64-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
arm64: dts: rockchip: remove the abuse of keep-power-in-suspend
arm64: dts: rockchip: remove always-on and boot-on from vcc_sd

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'arm-soc/for-4.9/devicetree-arm64-fixes' of http://github.com/Broadcom/stblinux into fixes

This pull request contains a single fix for Broadcom ARM64-based SoCs:

- Ray adds the required bus width and OOB sector size properties to the
  Northstar 2 SVK reference board in order for the NAND controller to work
  properly

* tag 'arm-soc/for-4.9/devicetree-arm64-fixes' of http://github.com/Broadcom/stblinux:
  arm64: dts: Updated NAND DT properties for NS2 SVK

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'imx-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for 4.9:
- A couple of patches from Fabio to fix the GPC power domain regression
   which is caused by PM Domain core change 0159ec670763dd
   ("PM / Domains: Verify the PM domain is present when adding a
   provider"), and a related kernel crash seen with multi_v7_defconfig
   build.
- Correct the PHY ID mask for AR8031 to match phy driver code.
- Apply new added timer erratum A008585 for LS1043A and LS2080A SoC.
- Correct vf610 global timer IRQ flag to avoid warning from gic driver
   after commit 992345a58e0c ("irqchip/gic: WARN if setting the
   interrupt type for a PPI fails").

* tag 'imx-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: imx: mach-imx6q: Fix the PHY ID mask for AR8031
  ARM: dts: vf610: fix IRQ flag of global timer
  ARM: imx: gpc: Fix the imx_gpc_genpd_init() error path
  ARM: imx: gpc: Initialize all power domains
  arm64: dts: Add timer erratum property for LS2080A and LS1043A

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'uniphier-fixes-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-uniphier into fixes

UniPhier ARM SoC fixes for v4.9

- Add "select ARCH_HAS_RESET_CONTROLLER" in Kconfig
- Rename wrongly-named mioctrl to sdctrl

* tag 'uniphier-fixes-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-uniphier:
  arm64: dts: uniphier: change MIO node to SD control node
  ARM: dts: uniphier: change MIO node to SD control node
  reset: uniphier: rename MIO reset to SD reset for Pro5, PXs2, LD20 SoCs
  arm64: uniphier: select ARCH_HAS_RESET_CONTROLLER
  ARM: uniphier: select ARCH_HAS_RESET_CONTROLLER

Signed-off-by: Olof Johansson <[email protected]>

Merge tag 'driver-core-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
"Here are two small driver core / kernfs fixes for 4.9-rc3.

  One makes the Kconfig entry for DEBUG_TEST_DRIVER_REMOVE a bit more
  explicit that this is a crazy thing to enable for a distro kernel
  (thanks for trying Fedora!), the other resolves an issue with vim
  opening kernfs files (sysfs, configfs, etc.)

  Both have been in linux-next with no reported issues"

* tag 'driver-core-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Make Kconfig text for DEBUG_TEST_DRIVER_REMOVE stronger
  kernfs: Add noop_fsync to supported kernfs_file_fops

Merge tag 'staging-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging and IIO driver fixes from Greg KH:
"Here are some small staging and iio driver fixes for reported issues
  for 4.9-rc3. Nothing major, the "largest" being a lustre fix for a
  sysfs file that was obviously wrong, and had never been tested, so it
  was moved to debugfs as that is where it belongs. The others are small
  bug fixes for reported issues with various staging or iio drivers.

  All have been in linux-next for a while with no reported issues"

* tag 'staging-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  greybus: fix a leak on error in gb_module_create()
  greybus: es2: fix error return code in ap_probe()
  greybus: arche-platform: Add missing of_node_put() in arche_platform_change_state()
  staging: android: ion: Fix error handling in ion_query_heaps()
  iio: accel: sca3000_core: avoid potentially uninitialized variable
  iio:chemical:atlas-ph-sensor: Fix use of 32 bit int to hold 16 bit big endian value
  staging/lustre/llite: Move unstable_stats from sysfs to debugfs
  Staging: wilc1000: Fix kernel Oops on opening the device
  staging: android/ion: testing the wrong variable
  Staging: greybus: uart: Use gbphy_dev->dev instead of bundle->dev
  Staging: greybus: gpio: Use gbphy_dev->dev instead of bundle->dev
  iio: adc: ti-adc081c: Select IIO_TRIGGERED_BUFFER to prevent build errors
  iio: maxim_thermocouple: Align 16 bit big endian value of raw reads

Merge tag 'tty-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
"Here are a number of small tty and serial driver fixes for reported
  issues for 4.9-rc3. Nothing major, but they do resolve a bunch of
  problems with the tty core changes that are in 4.9-rc1, and finally
  the atmel serial driver is back working properly.

  All have been in linux-next with no reported issues"

* tag 'tty-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: serial_core: fix NULL struct tty pointer access in uart_write_wakeup
  tty: serial_core: Fix serial console crash on port shutdown
  tty/serial: at91: fix hardware handshake on Atmel platforms
  vt: clear selection before resizing
  sc16is7xx: always write state when configuring GPIO as an output
  sh-sci: document R8A7743/5 support
  tty: serial: 8250: 8250_core: NXP SC16C2552 workaround
  tty: limit terminal size to 4M chars
  tty: serial: fsl_lpuart: Fix Tx DMA edge case
  serial: 8250_lpss: enable MSI for sure
  serial: core: fix console problems on uart_close
  serial: 8250_uniphier: fix clearing divisor latch access bit
  serial: 8250_uniphier: fix more unterminated string
  serial: pch_uart: add terminate entry for dmi_system_id tables
  devicetree: bindings: uart: Add new compatible string for ZynqMP
  serial: xuartps: Add new compatible string for ZynqMP
  serial: SERIAL_STM32 should depend on HAS_DMA
  serial: stm32: Fix comparisons with undefined register
  tty: vt, fix bogus division in csi_J

Merge tag 'usb-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are a number of small USB driver fixes for 4.9-rc3.

  There is the usual number of gadget and xhci patches in here to
  resolved reported issues, as well as some usb-serial driver fixes and
  new device ids.

  All have been in linux-next with no reported issues"

* tag 'usb-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits)
  usb: chipidea: host: fix NULL ptr dereference during shutdown
  usb: renesas_usbhs: add wait after initialization for R-Car Gen3
  usb: increase ohci watchdog delay to 275 msec
  usb: musb: Call pm_runtime from musb_gadget_queue
  usb: musb: Fix hardirq-safe hardirq-unsafe lock order error
  usb: ehci-platform: increase EHCI_MAX_RSTS to 4
  usb: ohci-at91: Set RemoteWakeupConnected bit explicitly.
  USB: serial: fix potential NULL-dereference at probe
  xhci: use default USB_RESUME_TIMEOUT when resuming ports.
  xhci: workaround for hosts missing CAS bit
  xhci: add restart quirk for Intel Wildcatpoint PCH
  USB: serial: cp210x: fix tiocmget error handling
  wusb: fix error return code in wusb_prf()
  Revert "Documentation: devicetree: dwc2: Deprecate g-tx-fifo-size"
  Revert "usb: dwc2: gadget: fix TX FIFO size and address initialization"
  Revert "usb: dwc2: gadget: change variable name to more meaningful"
  USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7
  wusb: Stop using the stack for sg crypto scratch space
  usb: dwc3: Fix size used in dma_free_coherent()
  usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable
  ...

inet: Fix missing return value in inet6_hash

As part of a series to implement faster SO_REUSEPORT lookups,
commit 086c653f5862 ("sock: struct proto hash function may error")
added return values to protocol hash functions and
commit 496611d7b5ea ("inet: create IPv6-equivalent inet_hash function")
implemented a new hash function for IPv6. However, the latter does
not respect the former's convention.

This properly propagates the hash errors in the IPv6 case.

Fixes: 496611d7b5ea ("inet: create IPv6-equivalent inet_hash function")
Reported-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: Craig Gallek <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'mlx5-fixes'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-10-25

This series contains some bug fixes for the mlx5 core and mlx5e driver.

From Daniel:
    - Cache line size determination at runtime, instead of using
      L1_CACHE_BYTES hard coded value, use cache_line_size()
    - Always Query HCA caps after setting them even on reset flow

From Mohamad:
    - Reorder netdev cleanup to uregister netdev before detaching it
      for the kernel to not complain about open resources such as vlans
    - Change the acl enable prototype to return status, for better error
      resiliency
    - Clear health sick bit when starting health poll after reset flow
    - Fix race between PCI error handlers and health work
    - PCI error recovery health care simulation, in case when the kernel
      PCI error handlers are not triggered for some internal firmware errors

From Noa:
    - Avoid passing dma address 0 to firmware when mapping system pages
      to the firmware

From Paul: Some straight forward flow steering fixes
    - Keep autogroups list ordered
    - Fix autogroups groups num not decreasing
    - Correctly initialize last use of flow counters

From Saeed:
    - Choose the nearest LRO timeout to the wanted one
      instead of blindly choosing "dev_cap.lro_timeout[2]"

This series has no conflict with the for-next pull request posted
earlier today ("Mellanox mlx5 core driver updates 2016-10-25").
====================

Signed-off-by: David S. Miller <[email protected]>

net/mlx5: Avoid passing dma address 0 to firmware

Currently the firmware can't work with a page with dma address 0.
Passing such an address to the firmware will cause the give_pages
command to fail.

To avoid this, in case we get a 0 dma address of a page from the
dma engine, we avoid passing it to FW by remapping to get an address
other than 0.

Fixes: bf0bf77f6519 ('mlx5: Support communicating arbitrary host...')
Signed-off-by: Noa Osherovich <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: PCI error recovery health care simulation

In case that the kernel PCI error handlers are not called, we will
trigger our own recovery flow.

The health work will give priority to the kernel pci error handlers to
recover the PCI by waiting for a small period, if the pci error handlers
are not triggered the manual recovery flow will be executed.

We don't save pci state in case of manual recovery because it will ruin the
pci configuration space and we will lose dma sync.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: Fix race between PCI error handlers and health work

Currently there is a race between the health care work and the kernel
pci error handlers because both of them detect the error, the first one
to be called will do the error handling.
There is a chance that health care will disable the pci after resuming
pci slot.
Also create a separate WQ because now we will have two types of health
works, one for the error detection and one for the recovery.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx5: Clear health sick bit when starting health poll

The health sick status should be cleared when we start the health poll.
This is crucial for driver reload (unload + load) in order to behave
right in case of health issue.

Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>