Git Repo - linux.git/log

NFSv4: Don't call put_rpccred() under the rcu_read_lock()

put_rpccred() can sleep.

Fixes: 8f649c3762547 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()")
Cc: [email protected] # 2.6.35+
Signed-off-by: Trond Myklebust <[email protected]>

NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()

If the server does not return a valid set of attributes that we can
use to either create a file or refresh the inode, then there is no
value in calling nfs_prime_dcache().

However if we're just refreshing the inode using the attributes that
the server returned, then it shouldn't matter whether or not we have
a filehandle, as long as we check the fsid+fileid combination.

Signed-off-by: Trond Myklebust <[email protected]>

NFSv3: Use the readdir fileid as the mounted-on-fileid

When we call readdirplus, set the fileid normally returned by readdir
as the mounted-on-fileid, since that is commonly the case if there is
a mountpoint. To ensure that we get it right, we only set the flag if
the readdir fileid differs from the one returned in the readdirplus
attributes.

This again means that we can avoid the issues described in commit
2ef47eb1aee17 ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"),
which only fixed NFSv4.

Signed-off-by: Trond Myklebust <[email protected]>

NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

If we're traversing a directory which contains a submounted filesystem,
or one that has a referral, the NFS server that is processing the READDIR
request will often return information for the underlying (mounted-on)
directory. It may, or may not, also return filehandle information.

If this happens, and the lookup in nfs_prime_dcache() returns the
dentry for the submounted directory, the filehandle comparison will
fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
("vfs: Lazily remove mounts on unlinked files and directories."), this
means the entire subtree is unmounted.

The following minimal patch addresses this problem by punting on
the invalidation if there is a submount.

Kudos to Neil Brown <[email protected]> for having tracked down this
issue (see link).

Reported-by: Nix <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.18+
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4: Set a barrier in the update_changeattr() helper

Ensure that we don't regress the changes that were made to the
directory.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Fix nfs_post_op_update_inode() to set an attribute barrier

nfs_post_op_update_inode() is called after a self-induced attribute
update. Ensure that it also sets the barrier.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Remove size hack in nfs_inode_attrs_need_update()

Prior to this patch, we used to always OK attribute updates that extended
the file size on the assumption that we might be performing writeback.
Now that we have attribute barriers to protect the writeback related updates,
we should remove this hack, as it can cause truncate() operations to
apparently be reverted if/when a readahead or getattr RPC call races
with our on-the-wire SETATTR.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit

Ensure that other operations that race with delegreturn and layoutcommit
cannot revert the attribute updates that were made on the server.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Add attribute update barriers to NFS writebacks

Ensure that other operations that race with our write RPC calls
cannot revert the file size updates that were made on the server.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Set an attribute barrier on all updates

Ensure that we update the attribute barrier even if there were no
invalidations, provided that this value is newer than the old one.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Add attribute update barriers to nfs_setattr_update_inode()

Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.

The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.

Reported-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Add a helper to set attribute barriers

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

NFS: Ensure that buffered writes wait for O_DIRECT writes to complete

The O_DIRECT code will grab the inode->i_mutex and flush out buffered
writes, before scheduling a read or a write. However there is no
equivalent in the buffered write code to wait for O_DIRECT to complete.

Fixes a reported issue in xfstests generic/133, when first performing an
O_DIRECT write followed by a buffered write.

Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>

mei: make device disabled on stop unconditionally

Set the internal device state to to disabled after hardware reset in stop flow.
This will cover cases when driver was not brought to disabled state because of
an error and in stop flow we wish not to retry the reset.

Cc: <[email protected]> #3.10+
Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: comedi: adv_pci1710: fix AI INSN_READ for non-zero channel

Reading of analog input channels by the `INSN_READ` comedi instruction
is broken for all except channel 0.  `pci171x_ai_insn_read()` calls
`pci171x_ai_read_sample()` with the wrong value for the third parameter.
It is supposed to be the current index in a channel list (which is
always of length 1 in this case, so the index should be 0), but instead
it is passing the actual channel number.  `pci171x_ai_read_sample()`
checks the channel number encoded in the raw sample value read from the
hardware matches the channel number stored in the specified index of the
previously set up channel list and returns `-ENODATA` if it doesn't
match.  Since the index should always be 0 in this case, the match will
fail unless the channel number is also 0.  Fix it by passing 0 as the
channel index.

Note that when the bug first appeared, it was `pci171x_ai_dropout()`
that was called with the wrong parameter value.  `pci171x_ai_dropout()`
got replaced with `pci171x_ai_read_sample()` in commit 7fd2dae2500d
("staging: comedi: adv_pci1710: introduce pci171x_ai_read_sample()").

Fixes: 16c7eb6047bb ("staging: comedi: adv_pci1710: always enable PCI171x_PARANOIDCHECK code")
Signed-off-by: Ian Abbott <[email protected]>
Cc: stable <[email protected]> # 3.16+
Signed-off-by: Greg Kroah-Hartman <[email protected]>

android: binder: fix binder mmap failures

binder_update_page_range() initializes only addr and size
fields in 'struct vm_struct tmp_area;' and passes it to
map_vm_area().

Before 71394fe50146 ("mm: vmalloc: add flag preventing guard hole allocation")
this was because map_vm_area() didn't use any other fields
in vm_struct except addr and size.

Now get_vm_area_size() (used in map_vm_area()) reads vm_struct's
flags to determine whether vm area has guard hole or not.

binder_update_page_range() don't initialize flags field, so
this causes following binder mmap failures:
-----------[ cut here ]------------
WARNING: CPU: 0 PID: 1971 at mm/vmalloc.c:130
vmap_page_range_noflush+0x119/0x144()
CPU: 0 PID: 1971 Comm: healthd Not tainted 4.0.0-rc1-00399-g7da3fdc-dirty #157
Hardware name: ARM-Versatile Express
[<c001246d>] (unwind_backtrace) from [<c000f7f9>] (show_stack+0x11/0x14)
[<c000f7f9>] (show_stack) from [<c049a221>] (dump_stack+0x59/0x7c)
[<c049a221>] (dump_stack) from [<c001cf21>] (warn_slowpath_common+0x55/0x84)
[<c001cf21>] (warn_slowpath_common) from [<c001cfe3>]
(warn_slowpath_null+0x17/0x1c)
[<c001cfe3>] (warn_slowpath_null) from [<c00c66c5>]
(vmap_page_range_noflush+0x119/0x144)
[<c00c66c5>] (vmap_page_range_noflush) from [<c00c716b>] (map_vm_area+0x27/0x48)
[<c00c716b>] (map_vm_area) from [<c038ddaf>]
(binder_update_page_range+0x12f/0x27c)
[<c038ddaf>] (binder_update_page_range) from [<c038e857>]
(binder_mmap+0xbf/0x1ac)
[<c038e857>] (binder_mmap) from [<c00c2dc7>] (mmap_region+0x2eb/0x4d4)
[<c00c2dc7>] (mmap_region) from [<c00c3197>] (do_mmap_pgoff+0x1e7/0x250)
[<c00c3197>] (do_mmap_pgoff) from [<c00b35b5>] (vm_mmap_pgoff+0x45/0x60)
[<c00b35b5>] (vm_mmap_pgoff) from [<c00c1f39>] (SyS_mmap_pgoff+0x5d/0x80)
[<c00c1f39>] (SyS_mmap_pgoff) from [<c000ce81>] (ret_fast_syscall+0x1/0x5c)
---[ end trace 48c2c4b9a1349e54 ]---
binder: 1982: binder_alloc_buf failed to map page at f0e00000 in kernel
binder: binder_mmap: 1982 b6bde000-b6cdc000 alloc small buf failed -12

Use map_kernel_range_noflush() instead of map_vm_area() as this is better
API for binder's purposes and it allows to get rid of 'vm_struct tmp_area' at all.

Fixes: 71394fe50146 ("mm: vmalloc: add flag preventing guard hole allocation")
Signed-off-by: Andrey Ryabinin <[email protected]>
Reported-by: Amit Pundir <[email protected]>
Tested-by: Amit Pundir <[email protected]>
Acked-by: David Rientjes <[email protected]>
Tested-by: John Stultz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"A CR4-shadow 32-bit init fix, plus two typo fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too
  x86/platform/intel-mid: Fix trivial printk message typo in intel_mid_arch_setup()
  x86/cpu/intel: Fix trivial typo in intel_tlb_table[]

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Ingo Molnar:
"Three clockevents/clocksource driver fixes"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: pxa: Fix section mismatch
  clocksource: mtk: Fix race conditions in probe code
  clockevents: asm9260: Fix compilation error with sparc/sparc64 allyesconfig

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Two kprobes fixes and a handful of tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Make sparc64 arch point to sparc
  perf symbols: Define EM_AARCH64 for older OSes
  perf top: Fix SIGBUS on sparc64
  perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag
  perf tools: Fix pthread_attr_setaffinity_np build error
  perf tools: Define _GNU_SOURCE on pthread_attr_setaffinity_np feature check
  perf bench: Fix order of arguments to memcpy_alloc_mem
  kprobes/x86: Check for invalid ftrace location in __recover_probed_insn()
  kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
"An rtmutex deadlock path fixlet"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rtmutex: Set state back to running on error

Merge branch 'bcmgenet_systemport_stats'

Florian Fainelli says:

====================
net: bcmgenet and systemport statistics fixes

This two patches fix a similar problem in the GENET and SYSTEMPORT drivers
for software maintained statistics used to track DMA mapping and SKB
re-allocation failures.
====================

Signed-off-by: David S. Miller <[email protected]>

net: systemport: fix software maintained statistics

Commit 60b4ea1781fd ("net: systemport: log RX buffer allocation and RX/TX DMA
failures") added a few software maintained statistics using
BCM_SYSPORT_STAT_MIB_RX and BCM_SYSPORT_STAT_MIB_TX. These statistics are read
from the hardware MIB counters, such that bcm_sysport_update_mib_counters() was
trying to read from a non-existing MIB offset for these counters.

Fix this by introducing a special type: BCM_SYSPORT_STAT_SOFT, similar to
BCM_SYSPORT_STAT_NETDEV, such that bcm_sysport_get_ethtool_stats will read from
the software mib.

Fixes: 60b4ea1781fd ("net: systemport: log RX buffer allocation and RX/TX DMA failures")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: bcmgenet: fix software maintained statistics

Commit 44c8bc3ce39f ("net: bcmgenet: log RX buffer allocation and RX/TX dma
failures") added a few software maintained statistics using
BCMGENET_STAT_MIB_RX and BCMGENET_STAT_MIB_TX. These statistics are read from
the hardware MIB counters, such that bcmgenet_update_mib_counters() was trying
to read from a non-existing MIB offset for these counters.

Fix this by introducing a special type: BCMGENET_STAT_SOFT, similar to
BCMGENET_STAT_NETDEV, such that bcmgenet_get_ethtool_stats will read from the
software mib.

Fixes: 44c8bc3ce39f ("net: bcmgenet: log RX buffer allocation and RX/TX dma failures")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rxrpc: don't multiply with HZ twice

rxrpc_resend_timeout has an initial value of 4 * HZ; use it as-is.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rxrpc: terminate retrans loop when sending of skb fails

Typo, 'stop' is never set to true.
Seems intent is to not attempt to retransmit more packets after sendmsg
returns an error.

This change is based on code inspection only.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface.

To repeat:

$ sudo ip link del hsr0
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff8187f495>] hsr_del_port+0x15/0xa0
etc...

Bug description:

As part of the hsr master device destruction, hsr_del_port() is called for each of
the hsr ports. At each such call, the master device is updated regarding features
and mtu. When the master device is freed before the slave interfaces, master will
be NULL in hsr_del_port(), which led to a NULL pointer dereference.

Additionally, dev_put() was called on the master device itself in hsr_del_port(),
causing a refcnt error.

A third bug in the same code path was that the rtnl lock was not taken before
hsr_del_port() was called as part of hsr_dev_destroy().

The reporter (Nicolas Dichtel) also said: "hsr_netdev_notify() supposes that the
port will always be available when the notification is for an hsr interface. It's
wrong. For example, netdev_wait_allrefs() may resend NETDEV_UNREGISTER.". As a
precaution against this, a check for port == NULL was added in hsr_dev_notify().

Reported-by: Nicolas Dichtel <[email protected]>
Fixes: 51f3c605318b056a ("net/hsr: Move slave init to hsr_slave.c.")
Signed-off-by: Arvid Brodin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: pasemi: Use setup_timer and mod_timer

Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.

This is done using Coccinelle and semantic patch used for
this is as follows:

// <smpl>
@@
expression x,y,z,a,b;
@@

-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>

Signed-off-by: Vaishali Thakkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: stmmac: Use setup_timer and mod_timer

Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.

This is done using Coccinelle and semantic patch used for
this is as follows:

// <smpl>
@@
expression x,y,z,a,b;
@@

-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>

Signed-off-by: Vaishali Thakkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: 8390: axnet_cs: Use setup_timer and mod_timer

Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.

This is done using Coccinelle and semantic patch used for
this is as follows:

// <smpl>
@@
expression x,y,z,a,b;
@@

-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>

Signed-off-by: Vaishali Thakkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: 8390: pcnet_cs: Use setup_timer and mod_timer

Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.

This is done using Coccinelle and semantic patch used for
this is as follows:

// <smpl>
@@
expression x,y,z,a,b;
@@

-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>

Signed-off-by: Vaishali Thakkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: smc91c92_cs: Use setup_timer and mod_timer

Use timer API functions setup_timer and mod_timer instead
of structure assignments as they are standard way to set
the timer and to update the expire field of an active timer
respectively.

This is done using Coccinelle and semantic patch used for
this is as follows:

// <smpl>
@@
expression x,y,z,a,b;
@@

-init_timer (&x);
+setup_timer (&x, y, z);
+mod_timer (&a, b);
-x.function = y;
-x.data = z;
-x.expires = b;
-add_timer(&a);
// </smpl>

Signed-off-by: Vaishali Thakkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

netxen_nic: Fix trivial typos in comments

Change 'mutliple' to 'multiple'
Change 'Firmare' to 'Firmware'

Signed-off-by: Yannick Guerrini <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

qlcnic: Fix trivial typo in comment

Change 'Firmare' to 'Firmware'

Signed-off-by: Yannick Guerrini <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ti: cpsw: add hibernation callbacks

Setting a dev_pm_ops suspend/resume pair but not a set of
hibernation functions means those pm functions will not be
called upon hibernation.
Fix this by using SIMPLE_DEV_PM_OPS, which appropriately
assigns the suspend and hibernation handlers and move
cpsw_suspend/resume calbacks under CONFIG_PM_SLEEP
to avoid build warnings.

Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: davinci_mdio: add hibernation callbacks

Setting a dev_pm_ops suspend_late/resume_early pair but not a
set of hibernation functions means those pm functions will
not be called upon hibernation.
Fix this by using SET_LATE_SYSTEM_SLEEP_PM_OPS, which appropriately
assigns the suspend and hibernation handlers and move
davinci_mdio_x callbacks under CONFIG_PM_SLEEP to avoid build warnings.

Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

  - pthread_attr_setaffinity_np() feature detection build fixes (Adrian Hunter, Josh Boyer)

  - Fix probing for PERF_FLAG_FD_CLOEXEC flag (Adrian Hunter)

  - Fix order of arguments to memcpy_alloc_mem in 'perf bench' (Bruce Merry)

  - Sparc64 and Aarch64 build and segfault fixes (David Ahern)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

ALSA: dice: fix wrong offsets for Dice interface

For received packet stream, the offset of 'RX_SEQ_START' locates after
the offset of 'RX_NUMBER_MIDI', although current macro and proc output
includes wrong offsets.

Fortunately, this bug doesn't affect streaming functionality because
these macro is not used.

This commit fixes these wrong macro and outputs.

Signed-off-by: Takashi Sakamoto <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

locking/rtmutex: Set state back to running on error

The "usual" path is:

- rt_mutex_slowlock()
- set_current_state()
- task_blocks_on_rt_mutex() (ret 0)
- __rt_mutex_slowlock()
- sleep or not but do return with __set_current_state(TASK_RUNNING)
- back to caller.

In the early error case where task_blocks_on_rt_mutex() return
-EDEADLK we never change the task's state back to RUNNING. I
assume this is intended. Without this change after ww_mutex
using rt_mutex the selftest passes but later I get plenty of:

| bad: scheduling from the idle thread!

backtraces.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Mike Galbraith <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: afffc6c1805d ("locking/rtmutex: Optimize setting task running after being blocked")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

net: do not use rcu in rtnl_dump_ifinfo()

We did a failed attempt in the past to only use rcu in rtnl dump
operations (commit e67f88dd12f6 "net: dont hold rtnl mutex during
netlink dump callbacks")

Now that dumps are holding RTNL anyway, there is no need to also
use rcu locking, as it forbids any scheduling ability, like
GFP_KERNEL allocations that controlling path should use instead
of GFP_ATOMIC whenever possible.

This should fix following splat Cong Wang reported :

[ INFO: suspicious RCU usage. ]
3.19.0+ #805 Tainted: G        W

include/linux/rcupdate.h:538 Illegal context switch in RCU read-side critical section!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by ip/771:
  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff8182b8f4>] netlink_dump+0x21/0x26c
  #1:  (rcu_read_lock){......}, at: [<ffffffff817d785b>] rcu_read_lock+0x0/0x6e

stack backtrace:
CPU: 3 PID: 771 Comm: ip Tainted: G        W       3.19.0+ #805
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000001 ffff8800d51e7718 ffffffff81a27457 0000000029e729e6
  ffff8800d6108000 ffff8800d51e7748 ffffffff810b539b ffffffff820013dd
  00000000000001c8 0000000000000000 ffff8800d7448088 ffff8800d51e7758
Call Trace:
  [<ffffffff81a27457>] dump_stack+0x4c/0x65
  [<ffffffff810b539b>] lockdep_rcu_suspicious+0x107/0x110
  [<ffffffff8109796f>] rcu_preempt_sleep_check+0x45/0x47
  [<ffffffff8109e457>] ___might_sleep+0x1d/0x1cb
  [<ffffffff8109e67d>] __might_sleep+0x78/0x80
  [<ffffffff814b9b1f>] idr_alloc+0x45/0xd1
  [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d
  [<ffffffff814b9f9d>] ? idr_for_each+0x53/0x101
  [<ffffffff817c1383>] alloc_netid+0x61/0x69
  [<ffffffff817c14c3>] __peernet2id+0x79/0x8d
  [<ffffffff817c1ab7>] peernet2id+0x13/0x1f
  [<ffffffff817d8673>] rtnl_fill_ifinfo+0xa8d/0xc20
  [<ffffffff810b17d9>] ? __lock_is_held+0x39/0x52
  [<ffffffff817d894f>] rtnl_dump_ifinfo+0x149/0x213
  [<ffffffff8182b9c2>] netlink_dump+0xef/0x26c
  [<ffffffff8182bcba>] netlink_recvmsg+0x17b/0x2c5
  [<ffffffff817b0adc>] __sock_recvmsg+0x4e/0x59
  [<ffffffff817b1b40>] sock_recvmsg+0x3f/0x51
  [<ffffffff817b1f9a>] ___sys_recvmsg+0xf6/0x1d9
  [<ffffffff8115dc67>] ? handle_pte_fault+0x6e1/0xd3d
  [<ffffffff8100a3a0>] ? native_sched_clock+0x35/0x37
  [<ffffffff8109f45b>] ? sched_clock_local+0x12/0x72
  [<ffffffff8109f6ac>] ? sched_clock_cpu+0x9e/0xb7
  [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d
  [<ffffffff811abde8>] ? __fcheck_files+0x4c/0x58
  [<ffffffff811ac556>] ? __fget_light+0x2d/0x52
  [<ffffffff817b376f>] __sys_recvmsg+0x42/0x60
  [<ffffffff817b379f>] SyS_recvmsg+0x12/0x1c

Signed-off-by: Eric Dumazet <[email protected]>
Fixes: 0c7aecd4bde4b7302 ("netns: add rtnl cmd to add and get peer netns ids")
Cc: Nicolas Dichtel <[email protected]>
Reported-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sh_eth: Fix lost MAC address on kexec

Commit 740c7f31c094703c ("sh_eth: Ensure DMA engines are stopped before
freeing buffers") added a call to sh_eth_reset() to the
sh_eth_set_ringparam() and sh_eth_close() paths.

However, setting the software reset bit(s) in the EDMR register resets
the MAC Address Registers to zero. Hence after kexec, the new kernel
doesn't detect a valid MAC address and assigns a random MAC address,
breaking DHCP.

Set the MAC address again after the reset in sh_eth_dev_exit() to fix
this.

Tested on r8a7740/armadillo (GETHER) and r8a7791/koelsch (FAST_RCAR).

Fixes: 740c7f31c094703c ("sh_eth: Ensure DMA engines are stopped before freeing buffers")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: bcmgenet: fix throughtput regression

This patch adds bcmgenet_tx_poll for the tx_rings. This can reduce the
interrupt load and send xmit in network stack on time. This also
separated for the completion of tx_ring16 from bcmgenet_poll.

The bcmgenet_tx_reclaim of tx_ring[{0,1,2,3}] operative by an interrupt
is to be not more than a certain number TxBDs. It is caused by too
slowly reclaiming the transmitted skb. Therefore, performance
degradation of xmit after 605ad7f ("tcp: refine TSO autosizing").

Signed-off-by: Jaedon Shin <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

macvtap: make sure neighbour code can push ethernet header

Brian reported crashes using IPv6 traffic with macvtap/veth combo.

I tracked the crashes in neigh_hh_output()

-> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);

Neighbour code assumes headroom to push Ethernet header is
at least 16 bytes.

It appears macvtap has only 14 bytes available on arches
where NET_IP_ALIGN is 0 (like x86)

Effect is a corruption of 2 bytes right before skb->head,
and possible crashes if accessing non existing memory.

This fix should also increase IPv4 performance, as paranoid code
in ip_finish_output2() wont have to call skb_realloc_headroom()

Reported-by: Brian Rak <[email protected]>
Tested-by: Brian Rak <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'mac80211-for-davem-2015-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A few patches have accumulated, among them the fix for Linus's
four-way-handshake problem. The others are various small fixes
for problems all over, nothing really stands out.
====================

Signed-off-by: David S. Miller <[email protected]>

cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too

Modify cpuidle_enter_freeze() to do the sanity checks done by
cpuidle_select() to avoid crashing the suspend-to-idle code
path in case something is missing.

Fixes: 381063133246 (PM / sleep: Re-implement suspend-to-idle handling)
Original-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>

idle / sleep: Avoid excessive disabling and enabling interrupts

Disabling interrupts at the end of cpuidle_enter_freeze() is not
useful, because its caller, cpuidle_idle_call(), re-enables them
right away after invoking it.

To avoid that unnecessary back and forth dance with interrupts,
make cpuidle_enter_freeze() enable interrupts after calling
enter_freeze_proper() and drop the local_irq_disable() at its
end, so that all of the code paths in it end up with interrupts
enabled. Then, cpuidle_idle_call() will not need to re-enable
interrupts after calling cpuidle_enter_freeze() any more, because
the latter will return with interrupts enabled, in analogy with
cpuidle_enter().

Reported-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>

net: Verify permission to link_net in newlink

When applicable verify that the caller has permisson to the underlying
network namespace for a newly created network device.

Similary checks exist for the network namespace a network device will
be created in.

Fixes: 317f4810e45e ("rtnl: allow to create device with IFLA_LINK_NETNSID set")
Signed-off-by: "Eric W. Biederman" <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: Verify permission to dest_net in newlink

When applicable verify that the caller has permision to create a
network device in another network namespace. This check is already
present when moving a network device between network namespaces in
setlink so all that is needed is to duplicate that check in newlink.

This change almost backports cleanly, but there are context conflicts
as the code that follows was added in v4.0-rc1

Fixes: b51642f6d77b net: Enable a userns root rtnl calls that are safe for unprivilged users
Signed-off-by: "Eric W. Biederman" <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drivers: net: cpsw: Set SECURE for dual_emac ucast

Prior to this patch, sending a packet with the source MAC address of one
of the CPSW interfaces to one of the CPSW slave ports while it's configured in
dual_emac mode would update the port_num field of the VLAN/Unicast Address
Table Entry. This would cause it to discard all incoming traffic addressed to
that MAC address, essentially rendering the port useless until the ALE table is
cleared (by starting and stopping the interface or rebooting.)

For example, if eth0 has a MAC address of 90:59:af:8f:43:e9 it will have
an ALE table entry:

00 00 00 00 59 90 02 30 e9 43 8f af
(VLAN Addr vlan_id=2 unicast type=0 port_num=0 addr=90:59:af:8f:43:e9)

If you configure another device with the same MAC address and connect it
to the first CPSW slave port and send some traffic the ALE table entry
becomes:

04 00 00 00 59 90 02 30 e9 43 8f af
(VLAN Addr vlan_id=2 unicast type=0 port_num=1 addr=90:59:af:8f:43:e9)

>From this point forward all incoming traffic addressed to
90:59:af:8f:43:e9 will be dropped.

Setting the SECURE bit for the VLAN/Unicast address table entry for each
interface's MAC address corrects the problem.

Signed-off-by: George McCollister <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Just general fixes: radeon, i915, atmel, tegra, amdkfd and one core
  fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (28 commits)
  drm: atmel-hlcdc: remove clock polarity from crtc driver
  drm/radeon: only enable DP audio if the monitor supports it
  drm/radeon: fix atom aux payload size check for writes (v2)
  drm/radeon: fix 1 RB harvest config setup for TN/RL
  drm/radeon: enable SRBM timeout interrupt on EG/NI
  drm/radeon: enable SRBM timeout interrupt on SI
  drm/radeon: enable SRBM timeout interrupt on CIK v2
  drm/radeon: dump full IB if we hit a packet error
  drm/radeon: disable mclk switching with 120hz+ monitors
  drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh
  drm/radeon: enable native backlight control on old macs
  drm/i915: Fix frontbuffer false positve.
  drm/i915: Align initial plane backing objects correctly
  drm/i915: avoid processing spurious/shared interrupts in low-power states
  drm/i915: Check obj->vma_list under the struct_mutex
  drm/i915: Fix a use after free, and unbalanced refcounting
  drm: atmel-hlcdc: remove useless pm_runtime_put_sync in probe
  drm: atmel-hlcdc: reset layer A2Q and UPDATE bits when disabling it
  drm: Fix deadlock due to getconnector locking changes
  drm/i915: Dell Chromebook 11 has PWM backlight
  ...

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block layer fixes from Jens Axboe:
"Two smaller fixes for this cycle:

   - A fixup from Keith so that NVMe compiles without BLK_INTEGRITY,
     basically just moving the code around appropriately.

   - A fixup for shm, fixing an oops in shmem_mapping() for mapping with
     no inode.  From Sasha"

[ The shmem fix doesn't look block-layer-related, but fixes a bug that
  happened due to the backing_dev_info removal..  - Linus ]

* 'for-linus' of git://git.kernel.dk/linux-block:
  mm: shmem: check for mapping owner before dereferencing
  NVMe: Fix for BLK_DEV_INTEGRITY not set

Merge tag 'xfs-for-linus-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pull xfs fixes from Dave Chinner:
"These are fixes for regressions/bugs introduced in the 4.0 merge cycle
  and problems discovered during the merge window that need to be pushed
  back to stable kernels ASAP.

  This contains:
   - ensure quota type is reset in on-disk dquots
   - fix missing partial EOF block data flush on truncate extension
   - fix transaction leak in error handling for new pnfs block layout
     support
   - add missing target_ip check to RENAME_EXCHANGE"

* tag 'xfs-for-linus-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: cancel failed transaction in xfs_fs_commit_blocks()
  xfs: Ensure we have target_ip for RENAME_EXCHANGE
  xfs: ensure truncate forces zeroed blocks to disk
  xfs: Fix quota type in quota structures when reusing quota file

niu: fix error handling in niu_class_to_ethflow()

There is a discrepancy here because the niu_class_to_ethflow() returns
zero on failure and one on success but the caller expected zero on
success and negative on failure.

The problem means that we allow the user to pass classes and flow_types
which we don't want. I've looked at it a bit and I don't see it as a
very serious bug.

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"13 fixes"

* emailed patches from Andrew Morton <[email protected]>:
  mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines
  mm: page_alloc: revert inadvertent !__GFP_FS retry behavior change
  kernel/sys.c: fix UNAME26 for 4.0
  mm: memcontrol: use "max" instead of "infinity" in control knobs
  zram: use proper type to update max_used_pages
  drivers/rtc/rtc-ds1685.c: fix conditional in ds1685_rtc_sysfs_time_regs_{show,store}
  nilfs2: fix potential memory overrun on inode
  scripts/gdb: add empty package initialization script
  rtc: ds1685: remove superfluous checks for out-of-range u8 values
  rtc: ds1685: fix ds1685_rtc_alarm_irq_enable build error
  memcg: fix low limit calculation
  mm/nommu: fix memory leak
  ocfs2: update web page + git tree in documentation

mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines

Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
table levels folded.  Usually, these defines are provided by
<asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.

But some architectures fold page table levels in a custom way.  They
need to define these macros themself.  This patch adds missing defines.

The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
and __pud_alloc() on architectures without these page table levels.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Aaro Koskinen <[email protected]>
Cc: David Howells <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Koichi Yasutake <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: page_alloc: revert inadvertent !__GFP_FS retry behavior change

Historically, !__GFP_FS allocations were not allowed to invoke the OOM
killer once reclaim had failed, but nevertheless kept looping in the
allocator.

Commit 9879de7373fc ("mm: page_alloc: embed OOM killing naturally into
allocation slowpath"), which should have been a simple cleanup patch,
accidentally changed the behavior to aborting the allocation at that
point. This creates problems with filesystem callers (?) that currently
rely on the allocator waiting for other tasks to intervene.

Revert the behavior as it shouldn't have been changed as part of a
cleanup patch.

Fixes: 9879de7373fc ("mm: page_alloc: embed OOM killing naturally into allocation slowpath")
Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reported-by: Tetsuo Handa <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Dave Chinner <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: <[email protected]> [3.19.x]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/sys.c: fix UNAME26 for 4.0

There's a uname workaround for broken userspace which can't handle kernel
versions of 3.x. Update it for 4.x.

Signed-off-by: Jon DeVree <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: memcontrol: use "max" instead of "infinity" in control knobs

The memcg control knobs indicate the highest possible value using the
symbolic name "infinity", which is long and awkward to type.

Switch to the string "max", which is just as descriptive but shorter and
sweeter.

This changes a user interface, so do it before the release and before
the development flag is dropped from the default hierarchy.

Signed-off-by: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

zram: use proper type to update max_used_pages

max_used_pages is defined as atomic_long_t so we need to use unsigned
long to keep temporary value for it rather than int which is smaller
than unsigned long in a 64 bit system.

Signed-off-by: Joonsoo Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/rtc/rtc-ds1685.c: fix conditional in ds1685_rtc_sysfs_time_regs_{show,store}

Fix a conditional statement checking for NULL in both
ds1685_rtc_sysfs_time_regs_show and ds1685_rtc_sysfs_time_regs_store
that was using a logical AND when it should be using a logical OR so
that we fail out of the function properly if the condition ever
evaluates to true.

Fixes: aaaf5fbf56f1 ("rtc: add driver for DS1685 family of real time clocks")
Signed-off-by: Joshua Kinard <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

nilfs2: fix potential memory overrun on inode

Each inode of nilfs2 stores a root node of a b-tree, and it turned out to
have a memory overrun issue:

Each b-tree node of nilfs2 stores a set of key-value pairs and the number
of them (in "bn_nchildren" member of nilfs_btree_node struct), as well as
a few other "bn_*" members.

Since the value of "bn_nchildren" is used for operations on the key-values
within the b-tree node, it can cause memory access overrun if a large
number is incorrectly set to "bn_nchildren".

For instance, nilfs_btree_node_lookup() function determines the range of
binary search with it, and too large "bn_nchildren" leads
nilfs_btree_node_get_key() in that function to overrun.

As for intermediate b-tree nodes, this is prevented by a sanity check
performed when each node is read from a drive, however, no sanity check
has been done for root nodes stored in inodes.

This patch fixes the issue by adding missing sanity check against b-tree
root nodes so that it's called when on-memory inodes are read from ifile,
inode metadata file.

Signed-off-by: Ryusuke Konishi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

scripts/gdb: add empty package initialization script

This got lost during the initial merge process: Python requires an
__init__.py script, even if empty, in order to accept a directory as
package. Add it, this time as a non-empty file.

Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rtc: ds1685: remove superfluous checks for out-of-range u8 values

drivers/rtc/rtc-ds1685.c: In function `ds1685_rtc_read_alarm':
drivers/rtc/rtc-ds1685.c:402: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:409: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:416: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c: In function `ds1685_rtc_set_alarm':
drivers/rtc/rtc-ds1685.c:475: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:478: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:481: warning: comparison is always true due to limited range of data type

u8 cannot contain a value larger than 0xff, hence drop the checks.
Wrapping the checks in unlikely() indicated some sense of humor, though ;-)

Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Joshua Kinard <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rtc: ds1685: fix ds1685_rtc_alarm_irq_enable build error

The newly added ds1685 driver causes a build error when enabled without
CONFIG_RTC_INTF_DEV:

drivers/rtc/rtc-ds1685.c:919:22: error: 'ds1685_rtc_alarm_irq_enable' undeclared here (not in a function)
.alarm_irq_enable = ds1685_rtc_alarm_irq_enable,

Apparently the driver was incorrectly changed to reflect the interface
change from 16380c153a69c ("RTC: Convert rtc drivers to use the
alarm_irq_enable method"), which removed the respective #ifdef from all
other rtc drivers.

This does the same change that was merged for the other drivers before and
removes the #ifdef, allowing the interrupts to be enabled through the
in-kernel rtc interface independent of the existence of /dev/rtc.

Fixes: aaaf5fbf56f ("rtc: add driver for DS1685 family of real time clocks")
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Joshua Kinard <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

memcg: fix low limit calculation

A memcg is considered low limited even when the current usage is equal to
the low limit.  This leads to interesting side effects e.g.
groups/hierarchies with no memory accounted are considered protected and
so the reclaim will emit MEMCG_LOW event when encountering them.

Another and much bigger issue was reported by Joonsoo Kim.  He has hit a
NULL ptr dereference with the legacy cgroup API which even doesn't have
low limit exposed.  The limit is 0 by default but the initial check fails
for memcg with 0 consumption and parent_mem_cgroup() would return NULL if
use_hierarchy is 0 and so page_counter_read would try to dereference NULL.

I suppose that the current implementation is just an overlook because the
documentation in Documentation/cgroups/unified-hierarchy.txt says:

  "The memory.low boundary on the other hand is a top-down allocated
  reserve.  A cgroup enjoys reclaim protection when it and all its
  ancestors are below their low boundaries"

Fix the usage and the low limit comparision in mem_cgroup_low accordingly.

Fixes: 241994ed8649 (mm: memcontrol: default hierarchy interface for memory)
Reported-by: Joonsoo Kim <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/nommu: fix memory leak

Maxime reported the following memory leak regression due to commit
dbc8358c7237 ("mm/nommu: use alloc_pages_exact() rather than its own
implementation").

On v3.19, I am facing a memory leak.  Each time I run a command one page
is lost.  Here an example with busybox's free command:

  / # free
               total       used       free     shared    buffers     cached
  Mem:          7928       1972       5956          0          0        492
  -/+ buffers/cache:       1480       6448
  / # free
               total       used       free     shared    buffers     cached
  Mem:          7928       1976       5952          0          0        492
  -/+ buffers/cache:       1484       6444
  / # free
               total       used       free     shared    buffers     cached
  Mem:          7928       1980       5948          0          0        492
  -/+ buffers/cache:       1488       6440
  / # free
               total       used       free     shared    buffers     cached
  Mem:          7928       1984       5944          0          0        492
  -/+ buffers/cache:       1492       6436
  / # free
               total       used       free     shared    buffers     cached
  Mem:          7928       1988       5940          0          0        492
  -/+ buffers/cache:       1496       6432

At some point, the system fails to sastisfy 256KB allocations:

  free: page allocation failure: order:6, mode:0xd0
  CPU: 0 PID: 67 Comm: free Not tainted 3.19.0-05389-gacf2cf1-dirty #64
  Hardware name: STM32 (Device Tree Support)
    show_stack+0xb/0xc
    warn_alloc_failed+0x97/0xbc
    __alloc_pages_nodemask+0x295/0x35c
    __get_free_pages+0xb/0x24
    alloc_pages_exact+0x19/0x24
    do_mmap_pgoff+0x423/0x658
    vm_mmap_pgoff+0x3f/0x4e
    load_flat_file+0x20d/0x4f8
    load_flat_binary+0x3f/0x26c
    search_binary_handler+0x51/0xe4
    do_execveat_common+0x271/0x35c
    do_execve+0x19/0x1c
    ret_fast_syscall+0x1/0x4a
  Mem-info:
  Normal per-cpu:
  CPU    0: hi:    0, btch:   1 usd:   0
  active_anon:0 inactive_anon:0 isolated_anon:0
   active_file:0 inactive_file:0 isolated_file:0
   unevictable:123 dirty:0 writeback:0 unstable:0
   free:1515 slab_reclaimable:17 slab_unreclaimable:139
   mapped:0 shmem:0 pagetables:0 bounce:0
   free_cma:0
  Normal free:6060kB min:352kB low:440kB high:528kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:492kB isolated(anon):0ks
  lowmem_reserve[]: 0 0
  Normal: 23*4kB (U) 22*8kB (U) 24*16kB (U) 23*32kB (U) 23*64kB (U) 23*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6060kB
  123 total pagecache pages
  2048 pages of RAM
  1538 free pages
  66 reserved pages
  109 slab pages
  -46 pages shared
  0 pages swap cached
  nommu: Allocation of length 221184 from process 67 (free) failed
  Normal per-cpu:
  CPU    0: hi:    0, btch:   1 usd:   0
  active_anon:0 inactive_anon:0 isolated_anon:0
   active_file:0 inactive_file:0 isolated_file:0
   unevictable:123 dirty:0 writeback:0 unstable:0
   free:1515 slab_reclaimable:17 slab_unreclaimable:139
   mapped:0 shmem:0 pagetables:0 bounce:0
   free_cma:0
  Normal free:6060kB min:352kB low:440kB high:528kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:492kB isolated(anon):0ks
  lowmem_reserve[]: 0 0
  Normal: 23*4kB (U) 22*8kB (U) 24*16kB (U) 23*32kB (U) 23*64kB (U) 23*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6060kB
  123 total pagecache pages
  Unable to allocate RAM for process text/data, errno 12 SEGV

This problem happens because we allocate ordered page through
__get_free_pages() in do_mmap_private() in some cases and we try to free
individual pages rather than ordered page in free_page_series().  In
this case, freeing pages whose refcount is not 0 won't be freed to the
page allocator so memory leak happens.

To fix the problem, this patch changes __get_free_pages() to
alloc_pages_exact() since alloc_pages_exact() returns
physically-contiguous pages but each pages are refcounted.

Fixes: dbc8358c7237 ("mm/nommu: use alloc_pages_exact() rather than its own implementation").
Reported-by: Maxime Coquelin <[email protected]>
Tested-by: Maxime Coquelin <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
Cc: <[email protected]> [3.19]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: update web page + git tree in documentation

We (the Ocfs2 project) recently moved the location of our ocfs2-tools
git tree and project web page.  The pertinent discussion can be seen
here:

  https://oss.oracle.com/pipermail/ocfs2-devel/2015-February/010579.html

The following patch updates the Ocfs2 documentation in MAINTAINERS,
ocfs2.txt, and dlmfs.txt.  I added our new official web page, changed
the location of our tools git tree and removed the link to Joel's
ancient kernel git tree - Andrew has handled our patches for a while
now.

Signed-off-by: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

net: smc91x: use run-time configuration on all ARM machines

The smc91x driver traditionally gets configured at compile-time
for whichever hardware it runs on. This no longer works on
ARM as we continue to move to building all-in-one kernels.

Most ARM configurations with this driver already use run-time
configuration through DT or through platform_data, but a
few have not been converted yet.

I've checked all ARM boards that use this driver in their
legacy board files, and converted the ones that were using
compile-time configuration in smc91x.h to behave like the
other ones and provide the interrupt polarity along with
the MMIO configuration (width, stride) at platform device
creation time.

In particular, these combinations were previously selectable
in Kconfig but in fact broken:

- sa1100 assabet plus pleb
- msm combined with any other armv6/v7 platform
- pxa-idp combined with any non-DMA pxa variant
- LogicPD PXA270 combined with any other pxa
- nomadik combined with any other armv4/v5 platform,
  e.g. versatile.

None of these seem critical enough to warrant a backport
to stable, but it would be nice to clean this up for good.

Signed-off-by: Arnd Bergmann <[email protected]>
----
I would like the patch to get merged through netdev, after
Robert and/or Linus have verified it on at least some hardware.

There are a few other non-ARM platforms using this driver,
I could do the same patch for those if we want to take
it further.

arch/arm/mach-msm/board-halibut.c    |   8 ++++-
arch/arm/mach-msm/board-qsd8x50.c    |   8 ++++-
arch/arm/mach-pxa/idp.c              |   5 +++
arch/arm/mach-pxa/lpd270.c           |   8 ++++-
arch/arm/mach-realview/core.c        |   7 ++++
arch/arm/mach-realview/realview_eb.c |   2 +-
arch/arm/mach-sa1100/neponset.c      |   6 ++++
arch/arm/mach-sa1100/pleb.c          |   7 ++++
drivers/net/ethernet/smsc/smc91x.c   |   9 +++--
drivers/net/ethernet/smsc/smc91x.h   | 114 ++----------------------------------------------------------
10 files changed, 57 insertions(+), 117 deletions(-)
Tested-by: Robert Jarzmik <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

staging: comedi: vmk80xx: remove "firmware version" kernel messages

During the attach of this driver a couple commands are sent to the hardware
with usb_bulk_msg() to read the firmware version information. This information
is then dumped as dev_info() kernel messages. Thee messages are just added
noise and don't effect the operation of the driver.

For simplicity, remove the messages as well as the then unused functions
vmk80xx_read_eeprom() and vmk80xx_check_data_link().

This also fixes an issue reported by coverity about an out-of-bounds write
in vmk80xx_read_eeprom().

Reported-by: coverity (CID 711413)
Signed-off-by: H Hartley Sweeten <[email protected]>
Reviewed-by: Ian Abbott <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

staging: comedi: comedi_isadma: fix "stalled" detect in comedi_isadma_disable_on_sample()

The "stalled" variable this function is used to detect if the DMA operation
is stalled while trying to disable DMA on a full comedi sample. The reset
of this variable should only occur when the remaining bytes of the DMA
transfer does not equal the remaining bytes from the last check.

Reported-by: coverity (CID 1271132)
Signed-off-by: H Hartley Sweeten <[email protected]>
Reviewed-by: Ian Abbott <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge tag 'iio-fixes-for-4.0b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

Second round of IIO fixes for the 4.0 cycle (or round one part two really!)
These are fixes for patches in the recent merge window and are in a separate
branch to avoid rebasing the main fixes-togreg branch.

* jsa1212 - select missing REGMAP_I2C
* ssp_common - build warning fix for PM functions when PM not in use.
* ak8975 - the addition of a utility library for this driver (as part of
adding new device support) led to a dependency not being inforced
for the original driver (I2C and GPIOLIB).

Merge tag 'iio-fixes-for-4.0a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

First round of fixes for IIO in the 4.0 cycle. Note a followup
set dependent on patches in the recent merge windows will follow shortly.

* dht11 - fix a read off the end of an array, add some locking to prevent
          the read function being interrupted and make sure gpio/irq lines
  are not enabled for irqs during output.
* iadc - timeout should be in jiffies not msecs
* mpu6050 - avoid a null id from ACPI emumeration being dereferenced.
* mxs-lradc - fix up some interaction issues between the touchscreen driver
              and iio driver.  Mostly about making sure that the adc driver
              only affects channels that are not being used for the
              touchscreen.
* ad2s1200 - sign extension fix for a result of c type promotion.
* adis16400 - sign extension fix for a result of c type promotion.
* mcp3422 - scale table was transposed.
* ad5686 - use _optional regulator get to avoid a dummy reg being allocate
           which would cause the driver to fail to initialize.
* gp2ap020a00f - select REGMAP_I2C
* si7020 - revert an incorrect cleanup up and then fix the issue that made
           that cleanup seem like a good idea.

iio: ak8975: fix AK09911 dependencies

ak8975 depends on I2C and GPIOLIB, so any symbols that selects
ak8975 must have the same dependency, or we get build errors:

drivers/iio/magnetometer/ak8975.c: In function 'ak8975_who_i_am':
drivers/iio/magnetometer/ak8975.c:393:2: error: implicit declaration of function 'i2c_smbus_read_i2c_block_data' [-Werror=implicit-function-declaration]
  ret = i2c_smbus_read_i2c_block_data(client, AK09912_REG_WIA1,
  ^
drivers/iio/magnetometer/ak8975.c: In function 'ak8975_set_mode':
drivers/iio/magnetometer/ak8975.c:431:2: error: implicit declaration of function 'i2c_smbus_write_byte_data' [-Werror=implicit-function-declaration]
  ret = i2c_smbus_write_byte_data(data->client,

Signed-off-by: Arnd Bergmann <[email protected]>
Fixes: 57e73a423b1e85 ("iio: ak8975: add ak09911 and ak09912 support")
Signed-off-by: Jonathan Cameron <[email protected]>

x86: Init per-cpu shadow copy of CR4 on 32-bit CPUs too

Commit:

1e02ce4cccdc ("x86: Store a per-cpu shadow copy of CR4")

added a shadow CR4 such that reads and writes that do not
modify the CR4 execute much faster than always reading the
register itself.

The change modified cpu_init() in common.c, so that the
shadow CR4 gets initialized before anything uses it.

Unfortunately, there's two cpu_init()s in common.c. There's
one for 64-bit and one for 32-bit. The commit only added
the shadow init to the 64-bit path, but the 32-bit path
needs the init too.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

Merge branch 'linus' into x86/urgent, to merge dependent patch

Signed-off-by: Ingo Molnar <[email protected]>

Merge branch 'tmon-fixes' of .git into next

thermal: int340x_thermal: Ignore missing _ART, _TRT tables

It is possible that _ART/_TRT tables are missing or have errors.
Ignore those failures, as INT3400 thermal zone is still required
for _OSC or mode switch.

Signed-off-by: Srinivas Pandruvada <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

thermal/intel_powerclamp: add id for Avoton SoC

Enable Intel Powerclamp driver on Atom* Processor C2000 Product
Family for Microservers (Avoton). Avoton - SoCs for micro-servers
has package C-states which can be used for idle injection.

Reported-by: Jose Navarro <[email protected]>
Suggested-by: Jacob Pan <[email protected]>
Tested-by: Jose Carlos Venegas Munoz <[email protected]>
Signed-off-by: Miguel Bernal Marin <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: silence 'set but not used' warnings

gcc complains about the 'cols' variable being unused. This is
unavoidable, given the ncurses getmaxyx() macro-based API, which wants
to assign to a variable directly, even when we're not going to use it.

Warning:

    gcc -O1 -Wall -Wshadow -W -Wformat -Wimplicit-function-declaration -Wimplicit-int -fstack-protector -D VERSION=\"1.0\"   -c -o tui.o tui.c
    tui.c: In function ‘show_dialogue’:
    tui.c:288:12: warning: variable ‘cols’ set but not used [-Wunused-but-set-variable]
      int rows, cols;
                ^

So, add a hack to get rid of that warning.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: use pkg-config to determine library dependencies

Some distros (e.g., Arch Linux) don't package the tinfo library
separately from ncurses, so don't unconditionally include it. Instead,
use pkg-config.

The $(STATIC) ugliness is to handle the reported build case from commit
6b533269fb25 ("tools/thermal: tmon: fix compilation errors when building
statically"), where a developer wants to be able to build with:

make LDFLAGS=-static

which requires an additional pkg-config flag.

Finally, support a lowest common denominator fallback (-lpanel
-lncurses) for build systems that don't have pkg-config entries for
ncurses.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: support cross-compiling

We might want to prepare CFLAGS outside of this Makefile, so don't
overwrite its initial value.

Then, support $(CROSS_COMPILE), so we can use a cross-compile toolchain.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: add .gitignore

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: fixup tui windowing calculations

The number of rows in the dialog vary according to the number of cooling
devices. However, some of the windowing computations were assuming a
fixed number of rows. This computation is OK when we have between 4 and
9 cooling devices (and they wrap to the next column), but with fewer
devices, we end up printing off the end of the window.

This unifies the row computation into a single function and uses that
throughout the TUI code. This also accounts for increasing the number of
rows when there are more than 9 total cooling devices.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: tui: don't hard-code dialog window size assumptions

We can use the ncurses API to get the number of rows.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: add min/max macros

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

tools/thermal: tmon: add --target-temp parameter

If we launch in daemon mode (--daemon), we don't have the ncurses UI,
but we might want to set the target temperature still. For example,
someone might stick the following in their boot script:

tmon --control intel_powerclamp --target-temp 90 --log --daemon

This would turn on CPU idle injection when we're around 90 degrees
celsius, and would log temperature and throttling info to
/var/tmp/tmon.log.

Signed-off-by: Brian Norris <[email protected]>
Acked-by: Jacob Pan <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
"The arm-soc bug fixes this time around are mostly for the omap
  platform, coming from a pull request from Tony Lindgren and are almost
  entirely fixing dts files.

  The other two changes enable support for the shmobile platform in
  generic armv7 kernels and change some properties in the ARM64
  reference board dts files"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: multi_v7_defconfig: Enable shmobile platforms
  arm64: Add L2 cache topology to ARM Ltd boards/models
  ARM: dts: am335x-bone*: usb0 is hardwired for peripheral
  ARM: dts: dra7x-evm: beagle-x15: Fix USB Host
  ARM: omap2plus_defconfig: Fix SATA boot
  ARM: omap2plus_defconfig: Enable OMAP NAND BCH driver
  ARM: dts: dra7: Correct the dma controller's property names
  ARM: dts: omap5: Correct the dma controller's property names
  ARM: dts: omap4: Correct the dma controller's property names
  ARM: dts: omap3: Correct the dma controller's property names
  ARM: dts: omap2: Correct the dma controller's property names
  ARM: dts: am437x-idk: fix sleep pinctrl state
  ARM: omap2plus_defconfig: enable TPS62362 regulator
  ARM: dts: am437x-idk: fix TPS62362 i2c bus
  ARM: dts: n900: Fix offset for smc91x ethernet
  ARM: dts: n900: fix i2c bus numbering
  ARM: dts: Fix USB dts configuration for dm816x
  ARM: dts: OMAP5: Fix SATA PHY node
  ARM: dts: DRA7: Fix SATA PHY node

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
"Various arm64 fixes:
   - ftrace branch generation fix
   - branch instruction encoding fix
   - include files, guards and unused prototypes clean-up
   - minor VDSO ABI fix (clock_getres)
   - PSCI functions moved to .S to avoid compilation error with gcc 5
   - pte_modify fix to not ignore the mapping type
   - crypto: AES interleaved increased to 4x (for performance reasons)
   - text patching fix for modules
   - swiotlb increased back to 64MB
   - copy_siginfo_to_user32() fix for big endian"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: cpuidle: add asm/proc-fns.h inclusion
  arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big endian
  arm64: Increase the swiotlb buffer size 64MB
  arm64: Fix text patching logic when using fixmap
  arm64: crypto: increase AES interleave to 4x
  arm64: enable PTE type bit in the mask for pte_modify
  arm64: mm: remove unused functions and variable protoypes
  arm64: psci: move psci firmware calls out of line
  arm64: vdso: minor ABI fix for clock_getres
  arm64: guard asm/assembler.h against multiple inclusions
  arm64: insn: fix compare-and-branch encodings
  arm64: ftrace: fix ftrace_modify_graph_caller for branch replace

Merge tag 'renesas-sh-drivers-for-v4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas

Pull SH driver fix from Simon Horman:
"Disable PM runtime for multi-platform r8a7740 with genpd"

* tag 'renesas-sh-drivers-for-v4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
drivers: sh: Disable PM runtime for multi-platform r8a7740 with genpd

PCI: versatile: Update for list_for_each_entry() API change

In Linux 4.0-rc1 ARM Versatile PCI build fails to build due to what
appears to be an API update. This patch is a very simple correction,
merely posted as a heads-up to the maintainers. Hopefully a better
fix can be forwarded to Linus.

[ arnd: the patch actually looks correct, so let's take this version ]

Signed-off-by: Joachim Nilsson <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

rhashtable: use cond_resched()

If a hash table has 128 slots and 16384 elems, expand to 256 slots
takes more than one second. For larger sets, a soft lockup is detected.

Holding cpu for that long, even in a work queue is a show stopper
for non preemptable kernels.

cond_resched() at strategic points to allow process scheduler
to reschedule us.

Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-02-26

This series contains fixes for i40e and i40evf only.

Alexey Khoroshilov found a possible leak of 'cmd_buf' when copy_from_user()
failed in i40e_dbg_command_write(), so resolved by calling kfree().

Shannon provides a fix to ensure the shift and bitwise precedences do not
work backwards for us by adding parans.  Fixed the driver by preventing
the driver from allowing stray interrupts or causing system logs from
un-handled interrupts by combining the ICR0 shutdown with the standard
interrupt shutdown and add the interrupt clearing to the PCI shutdown
path.  Fixed an issue where a NVM write times out before a transaction
can complete, so Shannon added logic to make another attempt by
reacquiring the semaphore, then retry the write, if the one retry fails,
we will then give up.  Adds checks to pointers before their use to ensure
we do not try to dereference NULL pointers when returning values from the
AdminQ calls.

Akeem adds a check to bail out if the device is already down when checking
for Tx hang subtask.

Anjali fixes TSO with more than 8 frags per segment issue.  The hardware
has some limitations which the driver needs to adhere to:
  1) no more than 8 descriptors per packet on the wire
  2) no header can span more than 3 descriptors
If one of these events happens, the hardware will generate an internal
error and freeze the Tx queue, so Anjali fixes this by linearizes the skb
to avoid these situations.  Fixed an issue where the per Traffic Class
queue count was higher than queues enabled, which will fix a warning
with multiple function mode where systems regularly have more cores than
vectors.  Fixed TCP/IPv6 over VXLAN Tx checksum offload, where we were
checking the outer protocol flags and deciding the flow for the inner
header.

Jesse fixes a race condition in the transmit hang detection.  Before we
were having issues of false Tx hang detection, no the driver makes more
direct with the checks for progress forward by directly checking the head
write back address and tail register when determining progress.  This
avoids Tx hangs where the software gets behind, because we are directly
checking hardware state when determining a hang state.

Neerav fixes the transmit ring Qset handle when DCB reconfigures. The issue
was when DCB is reconfigured to a single traffic class (TC) and the driver
did not reset the Tx ring Qset handle to correct the mapping, which caused
the Tx queue to disable timeouts.  Also as part of DCB reconfiguration flow
if the Tx queue disable times out, then issue a PF reset to do some level
of recovery.

Mitch stops flow director on shutdown because, in some cases, the hardware
would continue to try to access the FDIR ring after entering D3Hot state,
which would cause either PCIe errors or NMIs, depending upon the system
configuration.

* NOTE * I have verified that this series of patches for net will not cause
any merge issues when you sync up your net tree with your net-next tree.
====================

Signed-off-by: David S. Miller <[email protected]>

amd-xgbe: Request IRQs only after driver is fully setup

It is possible that the hardware may not have been properly shutdown
before this driver gets control, through use by firmware, for example.
Until the driver is loaded, interrupts associated with the hardware
could go pending. When the IRQs are requested napi support has not
been initialized yet, but the ISR will get control and schedule napi
processing resulting in a kernel panic because the poll routine has not
been set.

Adjust the code so that the driver is fully ready to handle and process
interrupts as soon as the IRQs are requested. This involves requesting
and freeing IRQs during start and stop processing and ordering the napi
add and delete calls appropriately.

Also adjust the powerup and powerdown routines to match the start and
stop routines in regards to the ordering of tasks, including napi
related calls.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: asix: add support for the Sitecom LN-028 USB adapter

Just another AX88178-based 10/100/1000 USB-to-Ethernet dongle. This one
shows up in lsusb as: "Sitecom Europe B.V. LN-028 Network USB 2.0 Adapter".

Signed-off-by: Luca Ceresoli <[email protected]>
Cc: Francois Romieu <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>

NFSv4: nfs4_open_recover_helper() must set share access

The share access mode is now specified as an argument in the nfs4_opendata,
and so nfs4_open_recover_helper() needs to call nfs4_map_atomic_open_share()
in order to set it.

Fixes: 6ae373394c42 ("NFSv4.1: Ask for no delegation on OPEN if using O_DIRECT")
Signed-off-by: Trond Myklebust <[email protected]>

Merge branch 'rhashtable'

Daniel Borkmann says:

====================
rhashtable updates

As discussed, I'm sending out rhashtable fixups for -net.

I have a couple of more patches I was working on last week pending,
i.e. to get rid of ht->nelems and ht->shift atomic operations which
speed-up pure insertions/deletions, e.g. on my laptop I have 2 threads,
inserting 7M entries each, that will reduce insertion time from ~1,450 ms
to 865 ms (performance should even be better after removing the
grow/shrink indirections). I guess that however is rather something
for net-next.
====================

Signed-off-by: David S. Miller <[email protected]>

rhashtable: remove indirection for grow/shrink decision functions

Currently, all real users of rhashtable default their grow and shrink
decision functions to rht_grow_above_75() and rht_shrink_below_30(),
so that there's currently no need to have this explicitly selectable.

It can/should be generic and private inside rhashtable until a real
use case pops up. Since we can make this private, we'll save us this
additional indirection layer and can improve insertion/deletion time
as well.

Reference: http://patchwork.ozlabs.org/patch/443040/
Suggested-by: David S. Miller <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rhashtable: unconditionally grow when max_shift is not specified

While commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for
worker queue") rightfully moved part of the decision making of
whether we should expand or shrink from the expand/shrink functions
themselves into insert/delete functions in order to avoid unnecessary
worker wake-ups, it however introduced a regression by doing so.

Before that change, if no max_shift was specified (= 0) on rhashtable
initialization, rhashtable_expand() would just grow unconditionally
and lets the available memory be the limiting factor. After that
change, if no max_shift was specified, there would be _no_ expansion
step at all.

Given that netlink and tipc have a max_shift specified, it was not
visible there, but Josh Hunt reported that if nft that starts out
with a default element hint of 3 if not otherwise provided, would
slow i.e. inserts down trememdously as it cannot grow larger to
relax table occupancy.

Given that the test case verifies shrinks/expands manually, we also
must remove pointer to the helper functions to explicitly avoid
parallel resizing on insertions/deletions. test_bucket_stats() and
test_rht_lookup() could also be wrapped around rhashtable mutex to
explicitly synchronize a walk from resizing, but I think that defeats
the actual test case which intended to have explicit test steps,
i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
verification after each stage.

Reported-by: Josh Hunt <[email protected]>
Fixes: c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Ying Xue <[email protected]>
Cc: Josh Hunt <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vhost: drop hard-coded num_buffers size

The 2 that we use for copy_to_iter comes from sizeof(u16),
it used to be that way before the iov iter update.
Fix it up, making it obvious the size of stack access
is right.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vhost: cleanup iterator update logic

Recent iterator-related changes in vhost made it
harder to follow the logic fixing up the header.
In fact, the fixup always happens at the same
offset: sizeof(virtio_net_hdr): sometimes the
fixup iterator is updated by copy_to_iter,
sometimes-by iov_iter_advance.

Rearrange code to make this obvious.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

rocker: silence shift wrapping warning

"val" is declared as a u64 so static checkers complain that this shift
can wrap. I don't have the hardware but probably it's doesn't have over
31 ports. Still we may as well silence the warning even if it's not a
real bug.

Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Acked-by: Scott Feldman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>