Git Repo - linux.git/log

block: support embedded device command line partition

Read the block device partition table from the kernel command line.  This
is intended for embedded devices with a fixed block device (eMMC).  No MBR
is needed, which saves storage space.  The bootloader can easily access
data on the block device by absolute address, and users can easily change
the partitioning.

This code is based on the MTD partition parser, source
"drivers/mtd/cmdlinepart.c".  For a detailed description of the partition
syntax, see "Documentation/block/cmdline-partition.txt".

[[email protected]: fix printk text]
[[email protected]: fix error return code in parse_parts()]
Signed-off-by: Cai Zhiyong <[email protected]>
Cc: Karel Zak <[email protected]>
Cc: "Wanglin (Albert)" <[email protected]>
Cc: Marius Groeger <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Brian Norris <[email protected]>
Cc: Artem Bityutskiy <[email protected]>
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

block/blk-sysfs.c: replace strict_strtoul() with kstrtoul()

The usage of strict_strtoul() is not preferred, because strict_strtoul()
is obsolete. Thus, kstrtoul() should be used.
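
As a rough userspace illustration of the strict parsing style that
kstrtoul() provides (0 or a negative errno, no trailing garbage, one
optional trailing newline), here is a minimal sketch.  The helper name
parse_ulong_strict() is hypothetical and this is not the kernel
implementation:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper, not the kernel implementation.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
static int parse_ulong_strict(const char *s, int base, unsigned long *res)
{
	char *end;
	unsigned long val;

	/* strtoul() would silently skip whitespace and accept '-'; reject both. */
	if (s == NULL || *s == '\0' || *s == '-' || isspace((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, base);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s)
		return -EINVAL;		/* no digits consumed */

	/* Allow one trailing newline, as is typical for sysfs writes. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */

	*res = val;
	return 0;
}
```

The result is written only on success, so a caller can keep its old
value when parsing fails.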

Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

block: replace strict_strtoul() with kstrtoul()

The use of strict_strtoul() is not preferred, because strict_strtoul() is
obsolete. Thus, kstrtoul() should be used.

Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/sched.h: don't use task->pid/tgid in same_thread_group/has_group_leader_pid

task_struct->pid/tgid should go away.

1. Change same_thread_group() to use task->signal for comparison.

2. Change has_group_leader_pid(task) to compare task_pid(task) with
signal->leader_pid.
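
The two comparisons can be sketched in userspace as follows.  The
structures are simplified stand-ins with only the fields needed here, and
the thread_pid field is a hypothetical placeholder for whatever task_pid()
reads in the real task_struct:

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures; only the fields used here. */
struct pid { int nr; };
struct signal_struct { struct pid *leader_pid; };
struct task_struct {
	struct signal_struct *signal;
	struct pid *thread_pid;		/* hypothetical field behind task_pid() */
};

static struct pid *task_pid(const struct task_struct *t)
{
	return t->thread_pid;
}

/* Threads of one process share a single signal_struct. */
static bool same_thread_group(const struct task_struct *p1,
			      const struct task_struct *p2)
{
	return p1->signal == p2->signal;
}

/* True when this task's pid is the one recorded as the group leader's. */
static bool has_group_leader_pid(const struct task_struct *p)
{
	return task_pid(p) == p->signal->leader_pid;
}
```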

Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Sergey Dyasly <[email protected]>
Reviewed-by: "Eric W. Biederman" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: fix the end cluster offset of FIEMAP

Calling the fiemap ioctl(2) with a given start offset and a desired mapping
range should show extents if possible.  However, we compute the end offset
of the mapping via 'mapping_end -= cpos' before iterating over the extent
records, which causes problems if the given fiemap length is small
compared to the cluster size, e.g.,

Cluster size 4096:
debugfs.ocfs2 1.6.3
        Block Size Bits: 12   Cluster Size Bits: 12

The extended fiemap test utility from David:
https://gist.github.com/anonymous/6172331

# dd if=/dev/urandom of=/ocfs2/test_file bs=1M count=1000
# ./fiemap /ocfs2/test_file 4096 10
start: 4096, length: 10
File /ocfs2/test_file has 0 extents:
# Logical          Physical         Length           Flags
^^^^^ <-- No extent is shown

In this case, at ocfs2_fiemap(): cpos == mapping_end == 1. Hence the
loop of searching extent records was not executed at all.

This patch removes the 'mapping_end -= cpos' in question, and loops
until cpos is larger than mapping_end, as usual.

# ./fiemap /ocfs2/test_file 4096 10
start: 4096, length: 10
File /ocfs2/test_file has 1 extents:
# Logical          Physical         Length           Flags
0: 0000000000000000 0000000056a01000 0000000006a00000 0000
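
The arithmetic behind the example above (start 4096, length 10, cluster
size bits 12) can be checked with a small sketch.  The helper names are
hypothetical, and the loop here advances one cluster at a time purely for
illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of clusters needed to cover 'bytes' (round up). */
static uint32_t clusters_for_bytes(uint64_t bytes, unsigned int cluster_bits)
{
	return (uint32_t)((bytes + (1ULL << cluster_bits) - 1) >> cluster_bits);
}

/*
 * Count how many clusters the extent-search loop would visit.
 * 'subtract_cpos' reproduces the old 'mapping_end -= cpos' step.
 */
static uint32_t clusters_visited(uint64_t start, uint64_t len,
				 unsigned int cluster_bits, bool subtract_cpos)
{
	uint32_t cpos = (uint32_t)(start >> cluster_bits);
	uint32_t mapping_end = clusters_for_bytes(start + len, cluster_bits);
	uint32_t visited = 0;

	if (subtract_cpos)
		mapping_end -= cpos;

	while (cpos < mapping_end) {
		visited++;
		cpos++;	/* the real loop advances by extent length, not by 1 */
	}
	return visited;
}
```

With the subtraction, cpos and mapping_end are both 1 and the loop body
never runs; without it, mapping_end stays at 2 and one cluster is visited.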

Signed-off-by: Jie Liu <[email protected]>
Reported-by: David Weber <[email protected]>
Tested-by: David Weber <[email protected]>
Cc: Sunil Mushran <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: remove unused variable ip in dlmfs_get_root_inode()

Variable ip in dlmfs_get_root_inode() is defined but not used. So clean
it up.

Signed-off-by: Joseph Qi <[email protected]>
Reviewed-by: Jie Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: fix a tiny race case when firing callbacks

In o2hb_shutdown_slot() and o2hb_check_slot(), the event is a local (stack)
variable, so it is only valid for the lifetime of the call.  The following
small race can therefore happen when multiple volumes are mounted:

o2hb-vol1                         o2hb-vol2
1) o2hb_shutdown_slot
allocate local event1
2) queue_node_event
add event1 to global o2hb_node_events
                                  3) o2hb_shutdown_slot
                                  allocate local event2
                                  4) queue_node_event
                                  add event2 to global o2hb_node_events
                                  5) o2hb_run_event_list
                                  delete event1 from o2hb_node_events
6) o2hb_run_event_list
event1 empty, return
7) o2hb_shutdown_slot
event1 lifecycle ends
                                  8) o2hb_fire_callbacks
                                  event1 is already *invalid*

This patch lets it wait on o2hb_callback_sem when another thread is firing
callbacks.  And for performance consideration, we only call
o2hb_run_event_list when there is an event queued.

Signed-off-by: Joyce <[email protected]>
Signed-off-by: Joseph Qi <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Mark Fasheh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: avoid possible NULL pointer dereference in o2net_accept_one()

Since o2nm_get_node_by_num() may return NULL, we add this check in
o2net_accept_one() to avoid possible NULL pointer dereference.

Signed-off-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: adjust code style for o2net_handler_tree_lookup()

The code in o2net_handler_tree_lookup() appears to have been mangled by
mistake. Adjust it to improve readability.

Signed-off-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: free path in ocfs2_remove_inode_range()

In ocfs2_remove_inode_range(), there is a memory leak. The variable path
is allocated with ocfs2_new_path_from_et(), but it is never freed.

Signed-off-by: Younger Liu <[email protected]>
Reviewed-by: Jie Liu <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: fix possible double free in ocfs2_reflink_xattr_rec

In ocfs2_reflink_xattr_rec(), meta_ac and data_ac are allocated by calling
ocfs2_lock_reflink_xattr_rec_allocators().

If an error occurs when allocating *data_ac, it frees *meta_ac, which was
allocated earlier. Here it mistakenly sets meta_ac to NULL instead of
*meta_ac. ocfs2_reflink_xattr_rec() will then try to free meta_ac again,
although it is already invalid.
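
A minimal userspace sketch of the pattern, with hypothetical names
(lock_allocators(), reflink_rec(), struct alloc_context) standing in for
the ocfs2 functions: the callee must clear the caller's pointer through
the double pointer, otherwise the caller's cleanup path frees it a second
time.

```c
#include <errno.h>
#include <stdlib.h>

struct alloc_context { int reserved; };	/* stand-in for the real allocator context */

static int frees_done;	/* counts frees so the behaviour can be checked */

static void free_alloc_context(struct alloc_context *ac)
{
	if (ac) {
		free(ac);
		frees_done++;
	}
}

/* Allocates *meta_ac, then *data_ac; 'fail_data' simulates the second failing. */
static int lock_allocators(struct alloc_context **meta_ac,
			   struct alloc_context **data_ac, int fail_data)
{
	*meta_ac = calloc(1, sizeof(**meta_ac));
	if (!*meta_ac)
		return -ENOMEM;

	*data_ac = fail_data ? NULL : calloc(1, sizeof(**data_ac));
	if (!*data_ac) {
		free_alloc_context(*meta_ac);
		/* 'meta_ac = NULL' here would only clear the local parameter. */
		*meta_ac = NULL;
		return -ENOMEM;
	}
	return 0;
}

/* Caller with the usual "free whatever is non-NULL" cleanup path. */
static int reflink_rec(int fail_data)
{
	struct alloc_context *meta_ac = NULL, *data_ac = NULL;
	int ret = lock_allocators(&meta_ac, &data_ac, fail_data);

	free_alloc_context(meta_ac);
	free_alloc_context(data_ac);
	return ret;
}
```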

Signed-off-by: Joseph Qi <[email protected]>
Reviewed-by: Jie Liu <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2/dlm: force clean refmap when doing local cleanup

dlm_do_local_recovery_cleanup() should force clean refmap if the owner of
lockres is UNKNOWN.  Otherwise node may hang when umounting filesystems.
Here's the situation:

Node1                                    Node2
dlmlock()
  -> dlm_get_lock_resource()
send DLM_MASTER_REQUEST_MSG to
other nodes.

                                       trying to master this lockres,
                                       return MAYBE.

selected as the master of lockresA,
set mle->master to Node1,
and do assert_master,
send DLM_ASSERT_MASTER_MSG to Node2.
                                       Node 2 has interest on lockresA
                                       and return
                                       DLM_ASSERT_RESPONSE_MASTERY_REF
                                       then something happened and
                                       Node2 crashed.

Receiving DLM_ASSERT_RESPONSE_MASTERY_REF, set Node2 into refmap, and keep
sending DLM_ASSERT_MASTER_MSG to other nodes

o2hb found node2 down, calling dlm_hb_node_down() -->
dlm_do_local_recovery_cleanup() the master of lockresA is still UNKNOWN,
no need to call dlm_free_dead_locks().

Set the master of lockresA to Node1, but Node2 still remains in refmap.

When Node1 umounts, it finds that the refmap of lockresA is not empty and
attempts to migrate it to Node2.  But Node2 is already down, so umount
hangs, trying to migrate lockresA again and again.

Signed-off-by: joyce <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Jie Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: free meta_ac and data_ac when ocfs2_start_trans fails in ocfs2_xattr_set()

In ocfs2_xattr_set(), if ocfs2_start_trans fails, meta_ac and data_ac
should be freed. Otherwise, it would lead to a memory leak.

Signed-off-by: Younger Liu <[email protected]>
Cc: Joseph Qi <[email protected]>
Reviewed-by: Jie Liu <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: add the missing return value check of ocfs2_xattr_get_clusters

In ocfs2_xattr_value_attach_refcount(), if an error occurs when calling
ocfs2_xattr_get_clusters(), the code continues with undefined behavior
because the local variables p_cluster, num_clusters and ext_flags are
declared without initialization.
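
The hazard can be sketched in userspace as follows.  get_clusters() and
attach_refcount() are hypothetical stand-ins; the point is only that the
output parameters are untouched on failure, so the return value has to be
checked before they are read:

```c
#include <errno.h>

/* Hypothetical lookup: on failure it returns early and leaves outputs untouched. */
static int get_clusters(int fail, unsigned int *p_cluster,
			unsigned int *num_clusters)
{
	if (fail)
		return -EIO;
	*p_cluster = 100;
	*num_clusters = 8;
	return 0;
}

static int attach_refcount(int fail, unsigned int *total)
{
	unsigned int p_cluster, num_clusters;	/* deliberately uninitialised */
	int ret;

	ret = get_clusters(fail, &p_cluster, &num_clusters);
	if (ret)
		return ret;	/* without this check the reads below use garbage */

	*total = p_cluster + num_clusters;
	return 0;
}
```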

Signed-off-by: Joseph Qi <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Mark Fasheh <[email protected]>
Acked-by: Jie Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

[SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg

Kernel panics due to NULL lport while executing the log message because
of synchronization issues between libfc and scsi transport fc. Checking
for NULL pointers at the beginning of this routine would resolve the issue
from kernel panic point of view.

Signed-off-by: Sesidhar Baddel <[email protected]>
Signed-off-by: Hiral Patel <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

ocfs2: fix a memory leak in __ocfs2_move_extents()

The ocfs2 path is not properly freed which leads to a memory leak at
__ocfs2_move_extents().

This patch stops the leaks of the ocfs2_path structure.

Signed-off-by: Jie Liu <[email protected]>
Reviewed-by: Younger Liu <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Mark Fasheh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: add missing return value check of ocfs2_get_clusters()

In ocfs2_attach_refcount_tree() and ocfs2_duplicate_extent_list(), if an
error occurs when calling ocfs2_get_clusters(), the code continues with
undefined behavior because the local variables p_cluster, num_clusters and
ext_flags are declared without initialization.

Signed-off-by: Joseph Qi <[email protected]>
Reviewed-by: Jie Liu <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Mark Fasheh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: clean up dead code in ocfs2_acl_from_xattr()

In ocfs2_acl_from_xattr(), if size is less than sizeof(struct
posix_acl_entry), it returns ERR_PTR(-EINVAL) directly. Then assign (size
/ sizeof(struct posix_acl_entry)) to count which will be at least 1, that
means the following branch (count < 0) and (count == 0) will never be
true.

Signed-off-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Acked-by: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: use list_for_each_entry() instead of list_for_each()

[[email protected]: fix up some NULL dereference bugs]
Signed-off-by: Dong Fang <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Jeff Liu <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fs/ocfs2/cluster/tcp.c: fix possible null pointer dereferences

Fix some possible null pointer dereferences that were detected by the
static code analyser, smatch.

Signed-off-by: Sunil Mushran <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Reported-by: Guozhonghua <[email protected]>
Cc: Sunil Mushran <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: ac_bits_wanted should be local_alloc_bits when returns -ENOSPC

There is an issue in reserving and claiming space for localalloc.  When
localalloc space is not enough, it claims space from global_bitmap.
If there is not enough free space in global_bitmap, the size of the
claimed space is set to half of the original size and the claim is retried.

The issue is as follows: osb->local_alloc_bits is set to half of the
original size in ocfs2_recalc_la_window(), but ac->ac_bits_wanted is set to
osb->local_alloc_default_bits, which is not changed. localalloc always
reserves and claims local_alloc_default_bits of space and returns ENOSPC.

So, ac->ac_bits_wanted should be osb->local_alloc_bits which would be
changed.
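
A simplified model of the retry loop, assuming a hypothetical
reserve_window() helper and a single counter of free bits in place of the
real global bitmap:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Try to reserve a local-alloc window from a global bitmap that has
 * 'free_bits' available. Returns the window size obtained, or -ENOSPC.
 * 'use_current_size' selects the fixed behaviour.
 */
static int reserve_window(unsigned int default_bits, unsigned int free_bits,
			  unsigned int min_bits, bool use_current_size)
{
	unsigned int local_alloc_bits = default_bits;

	if (min_bits == 0)
		min_bits = 1;	/* guarantee the loop terminates */

	while (local_alloc_bits >= min_bits) {
		/* The fix: ask for the current (possibly halved) size. */
		unsigned int bits_wanted = use_current_size ? local_alloc_bits
							    : default_bits;

		if (bits_wanted <= free_bits)
			return (int)bits_wanted;

		local_alloc_bits /= 2;	/* shrink the window and retry */
	}
	return -ENOSPC;
}
```

When the default size is always requested, halving the window has no
effect and every retry fails the same way.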

Signed-off-by: Younger Liu <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Jeff Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: dlm_request_all_locks() should deal with the status sent from target node

dlm_request_all_locks() should deal with the status sent from target node
if DLM_LOCK_REQUEST_MSG is sent successfully, or recovery master will fall
into endless loop, waiting for other nodes to send locks and
DLM_RECO_DATA_DONE_MSG to me.

        NodeA                                  NodeB
                                     selected as recovery master
                                     dlm_remaster_locks()
                                     ->dlm_request_all_locks()
                                     send DLM_LOCK_REQUEST_MSG to nodeA

Suppose NodeA cannot allocate memory when it processes this
message.  dlm_request_all_locks_handler() does not queue
dlm_request_all_locks_worker and returns -ENOMEM.  It will never send
locks and DLM_RECO_DATA_DONE_MSG to NodeB.

                                    NodeB does not deal with the status
                                    sent from NodeA, and will fall into
                                    an endless loop waiting for the
                                    recovery state of NodeA to be
                                    changed.

Signed-off-by: joyce <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Jeff Liu <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: use i_size_read() to access i_size

Though ocfs2 uses inode->i_mutex to protect i_size, there are both
i_size_read/write() and direct accesses. Clean up all direct access to
eliminate confusion.

Signed-off-by: Junxiao Bi <[email protected]>
Cc: Jie Liu <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: lighten up allocate transaction

The issue scenario is as follows:

When fallocating a very large amount of disk space for a small file,
__ocfs2_extend_allocation attempts to get a very large transaction. For
some journal sizes, there may not be enough room for this transaction,
and the fallocate will fail.

The patch below extends & restarts the transaction as necessary while
allocating space, and should work with even the smallest journal. This
patch follows the approach used by ext4 resize.

Test:
# mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc
...(journal size is 32M)
# mount.ocfs2 /dev/sdc /mnt/ocfs2/
# touch /mnt/ocfs2/1.log
# fallocate -o 0 -l 400G /mnt/ocfs2/1.log
fallocate: /mnt/ocfs2/1.log: fallocate failed: Cannot allocate memory
# tail -f /var/log/messages
[ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048)
[ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12
[ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12
[ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12
^C
With this patch, the test works well.

Signed-off-by: Younger Liu <[email protected]>
Cc: Jie Liu <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Mark Fasheh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/iommu: remove unnecessary platform_set_drvdata()

The driver core clears the driver data to NULL after device_release or
on probe failure. Thus, it is not needed to manually clear the device
driver data to NULL.

Signed-off-by: Jingoo Han <[email protected]>
Cc: David Brown <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Suman Anna <[email protected]>
Acked-by: Libo Chen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/video/acornfb.c: remove dead code

acornfb checks for HAS_VIDC while support for that macro was removed in
v2.6.23 (when the arm26 port was removed). So we can remove a bit of
dead code.

Signed-off-by: Paul Bolle <[email protected]>
Cc: Florian Tobias Schandinat <[email protected]>
Cc: Laurent Pinchart <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks

do_fork() denies CLONE_THREAD | CLONE_PARENT if NEWUSER | NEWPID.

Then later copy_process() denies CLONE_SIGHAND if the new process will
be in a different pid namespace (task_active_pid_ns() doesn't match
current->nsproxy->pid_ns).

This looks confusing and inconsistent. CLONE_NEWPID is very similar to
the case when ->pid_ns was already unshared, we want the same
restrictions so copy_process() should also nack CLONE_PARENT.

And it would be better to deny CLONE_NEWUSER && CLONE_SIGHAND as well
just for consistency.

Kill the "CLONE_NEWUSER | CLONE_NEWPID" check in do_fork() and change
copy_process() to do the same check along with ->pid_ns check we already
have.
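
The unified check can be sketched as below.  The flag values mirror the
Linux clone flags, but the DEMO_ names, the check_clone_flags() helper and
the pid_ns_unshared parameter are illustrative assumptions, not the code
in copy_process():

```c
#include <errno.h>
#include <stdbool.h>

/* Flag values mirror the Linux UAPI; DEMO_ names avoid clashing with <sched.h>. */
#define DEMO_CLONE_SIGHAND	0x00000800UL
#define DEMO_CLONE_PARENT	0x00008000UL
#define DEMO_CLONE_NEWUSER	0x10000000UL
#define DEMO_CLONE_NEWPID	0x20000000UL

/*
 * 'pid_ns_unshared' models "the child's pid namespace will differ from
 * the caller's because the caller already unshared it".
 */
static int check_clone_flags(unsigned long clone_flags, bool pid_ns_unshared)
{
	bool new_ns = (clone_flags & (DEMO_CLONE_NEWUSER | DEMO_CLONE_NEWPID)) ||
		      pid_ns_unshared;

	/* A new user or pid namespace must not share signal handlers or parent. */
	if (new_ns && (clone_flags & (DEMO_CLONE_SIGHAND | DEMO_CLONE_PARENT)))
		return -EINVAL;
	return 0;
}
```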

Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Colin Walters <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

pidns: kill the unnecessary CLONE_NEWPID in copy_process()

Commit 8382fcac1b81 ("pidns: Outlaw thread creation after
unshare(CLONE_NEWPID)") nacks CLONE_NEWPID if the forking process
unshared pid_ns. This is correct but unnecessary, copy_pid_ns() does
the same check.

Remove the CLONE_NEWPID check to clean up the code and prepare for the
next change.

Test-case:

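#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <unistd.h>

/* Trivial child; it is never actually run because clone() must fail. */
static int child(void *arg)
{
        return 0;
}

static char stack[16 * 1024];

int main(void)
{
        pid_t pid;

        assert(unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0);

        /* Pass a pointer into the middle of the buffer as the child stack. */
        pid = clone(child, stack + sizeof(stack) / 2,
                    CLONE_NEWPID | SIGCHLD, NULL);
        assert(pid < 0 && errno == EINVAL);

        return 0;
}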

clone(CLONE_NEWPID) correctly fails with or without this change.

Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Colin Walters <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

pidns: fix vfork() after unshare(CLONE_NEWPID)

Commit 8382fcac1b81 ("pidns: Outlaw thread creation after
unshare(CLONE_NEWPID)") nacks CLONE_VM if the forking process unshared
pid_ns, this obviously breaks vfork:

#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <unistd.h>

int main(void)
{
        assert(unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0);
        assert(vfork() >= 0);
        _exit(0);
        return 0;
}

fails without this patch.

Change this check to use CLONE_SIGHAND instead.  This also forbids
CLONE_THREAD automatically, and this is what the comment implies.

We could probably even drop CLONE_SIGHAND and use CLONE_THREAD, but it
would be safer to not do this.  The current check denies CLONE_SIGHAND
implicitly and there is no reason to change this.

Eric said "CLONE_SIGHAND is fine.  CLONE_THREAD would be even better.
Having shared signal handling between two different pid namespaces is
the case that we are fundamentally guarding against."
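
To illustrate why testing CLONE_SIGHAND instead of CLONE_VM lets vfork
through while still refusing threads, here is a small model.  The flag
values mirror the Linux clone flags; the DEMO_ names and the
check_after_unshare() helper are hypothetical:

```c
#include <errno.h>

#define DEMO_CLONE_VM		0x00000100UL
#define DEMO_CLONE_SIGHAND	0x00000800UL
#define DEMO_CLONE_VFORK	0x00004000UL
#define DEMO_CLONE_THREAD	0x00010000UL

/* 'reject_mask' is the flag the check tests once pid_ns has been unshared. */
static int check_after_unshare(unsigned long clone_flags,
			       unsigned long reject_mask)
{
	/* The kernel only accepts CLONE_THREAD together with CLONE_SIGHAND. */
	if ((clone_flags & DEMO_CLONE_THREAD) &&
	    !(clone_flags & DEMO_CLONE_SIGHAND))
		return -EINVAL;
	if (clone_flags & reject_mask)
		return -EINVAL;
	return 0;
}
```

vfork passes CLONE_VFORK | CLONE_VM without CLONE_SIGHAND, so only the
CLONE_VM-based check rejects it; thread creation carries CLONE_SIGHAND
and is rejected either way.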

Signed-off-by: Oleg Nesterov <[email protected]>
Reported-by: Colin Walters <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Reviewed-by: "Eric W. Biederman" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/smp.h:on_each_cpu(): switch back to a C function

Revert commit c846ef7deba2 ("include/linux/smp.h:on_each_cpu(): switch
back to a macro"). It turns out that the problematic linux/irqflags.h
include was fixed within ia64 and mn10300.

Cc: Geert Uytterhoeven <[email protected]>
Cc: David Daney <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

[SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset

BUG_ON(io_req->abts_done) is hit in fnic_rport_exch_reset because of a
timing issue, and to some extent a locking issue, when abts and terminate
happen at around the same time.

The code changes are intended to update CMD_STATE(sc) and
io_req->abts_done together.

Signed-off-by: Sesidhar Beddel <[email protected]>
Signed-off-by: Hiral Patel <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

[SCSI] fnic: Remove QUEUE_FULL handling code

Remove the fnic driver's QUEUE_FULL handling code. Instead, let the SCSI
mid layer handle queue full and use its algorithm to ramp the queue down/up.

Signed-off-by: Suma Ramars <[email protected]>
Signed-off-by: Hiral Patel <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

[SCSI] fnic: On system with >1.1TB RAM, VIC fails multipath after boot up

The issue was seen when a SCSI buffer address is wider than 40 bits, on a
system with more than 1.1TB of RAM. When such a buffer is passed to the VIC,
it fails to map to the correct buffer address, because the DMA mask is set
to 40 bits during driver initialization. Change the DMA mask from 40 bits to
64 bits so that address bits 41-64 are no longer masked off.
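
The 1.1TB threshold follows from 2^40 bytes = 1,099,511,627,776 bytes.  A
small sketch of the mask arithmetic, where DEMO_DMA_BIT_MASK() has the same
shape as the kernel's DMA_BIT_MASK() macro and addr_fits_mask() is a
hypothetical helper:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n); renamed to mark it as a demo. */
#define DEMO_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* True if every set bit of 'addr' lies within 'mask'. */
static bool addr_fits_mask(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```

The first address above 1 TiB, 1ULL << 40, has bit 40 set and therefore
does not fit a 40-bit mask, while it fits a 64-bit mask.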

Signed-off-by: Brian Uchino <[email protected]>
Signed-off-by: Hiral Patel <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Brown paper bag fix in HTB scheduler, class options set incorrectly
    due to a typo.  Fix from Vimalkumar.

2) It's possible for the ipv6 FIB garbage collector to run before all
    the necessary data structures are set up during init, defer the
    notifier registry to avoid this problem.  Fix from Michal Kubecek.

3) New i40e ethernet driver from the Intel folks.

4) Add new qmi wwan device IDs, from Bjørn Mork.

5) Doorbell lock in bnx2x driver is not initialized properly in some
    configurations, fix from Ariel Elior.

6) Revert an ipv6 packet option padding change that broke standardized
    ipv6 implementation test suites.  From Jiri Pirko.

7) Fix synchronization of ARP information in bonding layer, from
    Nikolay Aleksandrov.

8) Fix missing error return resulting in illegal memory accesses in
    openvswitch, from Daniel Borkmann.

9) SCTP doesn't signal poll events properly due to mistaken operator
    precedence, fix also from Daniel Borkmann.

10) __netdev_pick_tx() passes wrong index to sk_tx_queue_set() which
    essentially disables caching of TX queue in sockets :-/ Fix from
    Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
  net_sched: htb: fix a typo in htb_change_class()
  net: qmi_wwan: add new Qualcomm devices
  ipv6: don't call fib6_run_gc() until routing is ready
  net: tilegx driver: avoid compiler warning
  fib6_rules: fix indentation
  irda: vlsi_ir: Remove casting the return value which is a void pointer
  irda: donauboe: Remove casting the return value which is a void pointer
  net: fix multiqueue selection
  net: sctp: fix smatch warning in sctp_send_asconf_del_ip
  net: sctp: fix bug in sctp_poll for SOCK_SELECT_ERR_QUEUE
  net: fib: fib6_add: fix potential NULL pointer dereference
  net: ovs: flow: fix potential illegal memory access in __parse_flow_nlattrs
  bcm63xx_enet: remove deprecated IRQF_DISABLED
  net: korina: remove deprecated IRQF_DISABLED
  macvlan: Move skb_clone check closer to call
  qlcnic: Fix warning reported by kbuild test robot.
  bonding: fix bond_arp_rcv setting and arp validate desync state
  bonding: fix store_arp_validate race with mode change
  ipv6/exthdrs: accept tlv which includes only padding
  bnx2x: avoid atomic allocations during initialization
  ...

cpufreq: Acquire the lock in cpufreq_policy_restore() for reading

In cpufreq_policy_restore(), the policy saved before system suspend is read
from the per-CPU cpufreq_cpu_data_fallback. This is a read operation rather
than a write, so take the lock for reading there.

Signed-off-by: Lan Tianyu <[email protected]>
Reviewed-by: Srivatsa S. Bhat <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu

If update_policy_cpu() is invoked with the existing policy->cpu itself
as the new-cpu parameter, then a lot of things can go terribly wrong.

In its present form, update_policy_cpu() always assumes that the new-cpu
is different from policy->cpu and invokes other functions to perform their
respective updates. And those functions implement the actual update like
this:

per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
per_cpu(..., last_cpu) = NULL;

Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
references vanish into thin air! (memory leak). From there, it leads to more
problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
and hence tries to create a new sysfs-group; but sysfs already had created
the group earlier, so it complains that it cannot create a duplicate filename.
In short, the repercussions of a rather innocuous invocation of
update_policy_cpu() can turn out to be pretty nasty.

Ideally update_policy_cpu() should handle this situation (new == last)
gracefully, and not lead to such severe problems. So fix it by adding an
appropriate check.
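
A minimal model of the failure and the guard, using a plain array as a
stand-in for the per-CPU pointer.  The names stats_table and
move_stats_ref() are hypothetical:

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4

static void *stats_table[DEMO_NR_CPUS];	/* stand-in for a per-CPU pointer */

static void move_stats_ref(unsigned int last_cpu, unsigned int new_cpu)
{
	if (last_cpu == new_cpu)
		return;	/* nothing to move; the lines below would lose the pointer */

	stats_table[new_cpu] = stats_table[last_cpu];
	stats_table[last_cpu] = NULL;
}
```

Without the early return, a call with last_cpu == new_cpu copies the
pointer onto itself and then overwrites it with NULL.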

Signed-off-by: Srivatsa S. Bhat <[email protected]>
Tested-by: Stephen Warren <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: Restructure if/else block to avoid unintended behavior

In __cpufreq_remove_dev_prepare(), the code which decides whether to remove
the sysfs link or nominate a new policy cpu, is governed by an if/else block
with a rather complex set of conditionals. Worse, they harbor a subtlety
which leads to certain unintended behavior.

The code looks like this:

        if (cpu != policy->cpu && !frozen) {
                sysfs_remove_link(&dev->kobj, "cpufreq");
        } else if (cpus > 1) {
                new_cpu = cpufreq_nominate_new_policy_cpu(...);
                ...
                update_policy_cpu(..., new_cpu);
        }

The original intention was:
If the CPU going offline is not policy->cpu, just remove the link.
On the other hand, if the CPU going offline is the policy->cpu itself,
handover the policy->cpu job to some other surviving CPU in that policy.

But because the 'if' condition also includes the 'frozen' check, now there
are *two* possibilities by which we can enter the 'else' block:

1. cpu == policy->cpu (intended)
2. cpu != policy->cpu && frozen (unintended)

Due to the second (unintended) scenario, we end up spuriously nominating
a CPU as the policy->cpu, even when the existing policy->cpu is alive and
well. This can cause problems further down the line, especially when we end
up nominating the same policy->cpu as the new one (ie., old == new),
because it totally confuses update_policy_cpu().

To avoid this mess, restructure the if/else block to only do what was
originally intended, and thus prevent any unwelcome surprises.
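
One way to express the intended decision, as a sketch rather than the
actual patch.  The enum and the offline_action_for() helper are
hypothetical and only model which branch is taken:

```c
#include <stdbool.h>

enum offline_action {
	ACTION_NONE,
	ACTION_REMOVE_LINK,
	ACTION_NOMINATE_NEW_CPU,
};

static enum offline_action
offline_action_for(unsigned int cpu, unsigned int policy_cpu, bool frozen,
		   unsigned int cpus)
{
	if (cpu != policy_cpu) {
		/* Not the policy owner: never nominate, only drop the link. */
		return frozen ? ACTION_NONE : ACTION_REMOVE_LINK;
	}
	if (cpus > 1)
		return ACTION_NOMINATE_NEW_CPU;	/* owner leaves, others remain */
	return ACTION_NONE;
}
```

The 'frozen' test now sits inside the cpu != policy->cpu branch, so it
can no longer redirect that case into the nomination path.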

Signed-off-by: Srivatsa S. Bhat <[email protected]>
Tested-by: Stephen Warren <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: Fix crash in cpufreq-stats during suspend/resume

Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
dereference during the second attempt to suspend a system. He also
pin-pointed the problem to commit 5302c3f "cpufreq: Perform light-weight
init/teardown during suspend/resume".

That commit actually ensured that the cpufreq-stats table and the
cpufreq-stats sysfs entries are *not* torn down (ie., not freed) during
suspend/resume, which makes it all the more surprising. However, it turns
out that the root-cause is not that we access an already freed memory, but
that the reference to the allocated memory gets moved around and we lose
track of that during resume, leading to the reported crash in a subsequent
suspend attempt.

In the suspend path, during CPU offline, the value of policy->cpu is updated
by choosing one of the surviving CPUs in that policy, as long as there is
at least one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
invoked to update the reference to the stats structure by assigning it to
the new CPU. However, in the resume path, during CPU online, we end up
assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats
know about this. Thus the reference to the stats structure remains
(incorrectly) associated with the old CPU. So, in a subsequent suspend attempt,
during CPU offline, we end up accessing an incorrect location to get the
stats structure, which eventually leads to the NULL pointer dereference.

Fix this by letting cpufreq-stats know about the update of the policy->cpu
during CPU online in the resume path. (Also, move the update_policy_cpu()
function higher up in the file, so that __cpufreq_add_dev() can invoke
it).

Reported-and-tested-by: Stephen Warren <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

net_sched: htb: fix a typo in htb_change_class()

Fix a typo added in commit 56b765b79 ("htb: improved accuracy at high
rates")

cbuffer should not be a copy of buffer.

Signed-off-by: Vimalkumar <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Cc: Jiri Pirko <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: qmi_wwan: add new Qualcomm devices

Adding the device list from the Windows driver description files
included with a new Qualcomm MDM9615 based device, "Alcatel-sbell
ASB TL131 TDD LTE", from China Mobile.  This device is tested
and verified to work.  The others are assumed to work based on
using the same Windows driver.

Many of these devices support multiple QMI/wwan ports, requiring
multiple interface matching entries.  All devices are composite,
providing a mix of one or more serial, storage or Android Debug
Bridge functions in addition to the wwan function.

This device list included an update of one previously known device,
which was incorrectly assumed to have a Gobi 2K layout.  This is
corrected.

Reported-by: 王康 <[email protected]>
Signed-off-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series implements the new i40e driver for Intel's upcoming
Intel(R) Ethernet Controller XL710 Family of devices.

V7: many changes from a few comments:
    use linux errno types
    change I40E_SUCCESS to 0, standardize returns
    change s32 return values to int
    use void return values where possible
    prefer use of int over i40e_status
V6: rename Kbuild to Makefile
    rename i40e_mem[set|cpy] to regular memset/memcpy
V5: remove sysfs support from this set, will rearchitect
    changes from community comments
V4: addresses remaining community comments, mostly trivial edits.
    major sparse based cleanup of possible endian issues
    removal of most of __func__ references
    sizeof(*var) instead of sizeof(struct ...)
    change 'NULL ==' tests to !NULL
    implement xps
    use kernel bitshift macros (upper_32_bits, etc)
V3: many more individual comments addressed, thanks reviewers!  Many
    other changes due to internal review and development.
V2: each patch has individual comments, in general, feedback from the
    list was applied and addressed. Many changes due to internal review
    and coding as well.
V1: initial send

Let me start by saying thanks and we appreciate any time spent by
those of you who review and comment on this new driver, and we will
attempt to address and respond to all issues brought to our attention.

I tried to break the patches up to ease review, but the series should
apply and still be bisectable, as the last patch adds the driver to
the kernel compile with CONFIG_I40E.

This driver is for a brand new bit of silicon that has a different
design than other Intel Ethernet silicon, and therefore needed a new
driver.

The hardware has quite a bit of capability and this driver is only
meant to provide basic functionality at first.  Future patches will
continue to add functionality and bug fixes.

This initial release is very early in the product cycle with the intent
of getting initial support into the kernel before users have the
hardware available to purchase.  A software development manual is not
ready yet but will be available when the hardware ships.

The driver development model and interaction with community submitted
patches *will not be any different* than what we are currently doing
today.  We plan to continue established processes.

An associated i40evf driver has been posted for review.

List of tools we ran in preparation:
way more sparse clean
make W=1, W=2 clean
checkpatch (almost) clean
        total: 1 errors, 4 warnings, 30461 lines checked
        NOTE: Ignored message types: LONG_LINE
        - issues have been addressed and the remainders
          are noise.
codespell clean
smatch (almost) clean with a couple minor warnings
coccicheck clean
namespacecheck clean
allmodconfig clean
ppc64 build clean (untested)

This driver is a team effort, thank you to Joseph Gasparakis,
Shannon Nelson, Anjali Singhai-Jain, Mitch Williams, Neerav
Parikh, Vasu Dev, Kavindya Deegala, Yi Zou, and PJ Waskiewicz.

TODO (known issues)
BQL implementation
finish rtnl_stat64 locking (we have a patch but debugging it)
====================

Signed-off-by: David S. Miller <[email protected]>

ipv6: don't call fib6_run_gc() until routing is ready

When loading the ipv6 module, ndisc_init() is called before
ip6_route_init(). As the former registers a handler calling
fib6_run_gc(), this opens a window to run the garbage collector
before necessary data structures are initialized. If a network
device is initialized in this window, adding MAC address to it
triggers a NETDEV_CHANGEADDR event, leading to a crash in
fib6_clean_all().

Take the event handler registration out of ndisc_init() into a
separate function ndisc_late_init() and move it after
ip6_route_init().

Signed-off-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: tilegx driver: avoid compiler warning

The "id" variable was being incremented in common code, but only
initialized and used in IPv4 code. We move the increment to the IPv4
code too, and then legitimately use the uninitialized_var() macro to
avoid the gcc 4.6 warning that 'id' may be used uninitialized.
Note that gcc 4.7 does not warn.

Signed-off-by: Chris Metcalf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

fib6_rules: fix indentation

This change just removes two tabs from the source file.

Signed-off-by: Stefan Tomanek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

irda: vlsi_ir: Remove casting the return value which is a void pointer

Casting the return value which is a void pointer is redundant.
The conversion from void pointer to any other pointer type is
guaranteed by the C programming language.
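
As a small illustration (a userspace sketch with a hypothetical structure
name, using calloc() in place of the kernel allocator), the assignment below
compiles cleanly in C without any cast:

```c
#include <stdlib.h>

/* Hypothetical buffer type, used only for this illustration. */
struct irda_buf_sketch {
	int len;
	unsigned char data[16];
};

static struct irda_buf_sketch *alloc_buf_sketch(void)
{
	/* calloc() returns void *, which converts implicitly in C. */
	struct irda_buf_sketch *buf = calloc(1, sizeof(*buf));

	return buf;
}
```

Note that this implicit conversion is a C rule; C++ does require the cast.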

Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

irda: donauboe: Remove casting the return value which is a void pointer

Casting the return value which is a void pointer is redundant.
The conversion from void pointer to any other pointer type is
guaranteed by the C programming language.

Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fix multiqueue selection

commit 416186fbf8c5b4e4465 ("net: Split core bits of netdev_pick_tx
into __netdev_pick_tx") added a bug that disables caching of queue
index in the socket.

This is the source of packet reorders for TCP flows, and
again this is happening more often when using FQ pacing.

Old code was doing

	if (queue_index != old_index)
		sk_tx_queue_set(sk, queue_index);

Alexander renamed the variables but forgot to change sk_tx_queue_set()
2nd parameter.

	if (queue_index != new_index)
		sk_tx_queue_set(sk, queue_index);

This means we store -1 over and over in sk->sk_tx_queue_mapping
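
A reduced userspace sketch of the caching logic shows the effect. The
structure and function names are hypothetical, the hash result is passed in
as a plain argument, and -1 stands for "no queue cached"; this is not the
netdev_pick_tx() code itself.

```c
/* Hypothetical socket stand-in; -1 means no queue index is cached. */
struct sock_sketch {
	int tx_queue_mapping;
};

/* Buggy variant: stores the stale queue_index instead of new_index. */
static int pick_tx_buggy(struct sock_sketch *sk, int hashed_index)
{
	int queue_index = sk->tx_queue_mapping;

	if (queue_index < 0) {
		int new_index = hashed_index;

		if (queue_index != new_index)
			sk->tx_queue_mapping = queue_index; /* writes -1 again */
		queue_index = new_index;
	}
	return queue_index;
}

/* Fixed variant: caches the freshly computed index in the socket. */
static int pick_tx_fixed(struct sock_sketch *sk, int hashed_index)
{
	int queue_index = sk->tx_queue_mapping;

	if (queue_index < 0) {
		int new_index = hashed_index;

		if (queue_index != new_index)
			sk->tx_queue_mapping = new_index;
		queue_index = new_index;
	}
	return queue_index;
}
```

With the buggy variant the cache never fills, so every packet falls back to
the hash; with the fixed variant later calls reuse the cached queue.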

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Alexander Duyck <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sctp: fix smatch warning in sctp_send_asconf_del_ip

This was originally reported in [1] and posted by Neil Horman [2], who said:

  Fix up a missed null pointer check in the asconf code. If we don't find
  a local address, but we pass in an address length of more than 1, we may
  dereference a NULL laddr pointer. Currently this can't happen, as the only
  users of the function pass in the value 1 as the addrcnt parameter, but
  its not hot path, and it doesn't hurt to check for NULL should that ever
  be the case.

The callpath from sctp_asconf_mgmt() looks okay. But this could be triggered
from sctp_setsockopt_bindx() call with SCTP_BINDX_REM_ADDR and addrcnt > 1
while passing all possible addresses from the bind list to SCTP_BINDX_REM_ADDR
so that we do *not* find a single address in the association's bind address
list that is not in the packed array of addresses. If this happens when we
have an established association with ASCONF-capable peers, then we could get
a NULL pointer dereference as we only check for laddr == NULL && addrcnt == 1
and call later sctp_make_asconf_update_ip() with NULL laddr.

BUT: this actually won't happen as sctp_bindx_rem() will catch such a case
and return with an error earlier. As this is incredibly unintuitive and error
prone, add a check to catch at least future bugs here. As Neil says, it's not
hot path. Introduced by 8a07eb0a5 ("sctp: Add ASCONF operation on the
single-homed host").

[1] http://www.spinics.net/lists/linux-sctp/msg02132.html
[2] http://www.spinics.net/lists/linux-sctp/msg02133.html

Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Neil Horman <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Michio Honda <[email protected]>
Acked-By: Neil Horman <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sctp: fix bug in sctp_poll for SOCK_SELECT_ERR_QUEUE

If we do not add parentheses around ...

	mask |= POLLERR |
		sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0;

... then this condition always evaluates to true as POLLERR is
defined as 8 and binary or'd with whatever result comes out of
sock_flag(). Hence instead of (X | Y) ? A : B, transform it into
X | (Y ? A : B). Unfortunately, commit 8facd5fb73 ("net: fix
smatch warnings inside datagram_poll") forgot about SCTP. :-(

Introduced by 7d4c04fc170 ("net: add option to enable error queue
packets waking select").
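
The precedence issue can be reproduced in a few lines of userspace C. The
helper names are hypothetical and the socket flag is reduced to a plain int;
the first function spells out, with explicit parentheses, how the compiler
groups the original expression.

```c
#include <poll.h>

/* How C parses "POLLERR | flag ? POLLPRI : 0": the | binds tighter than ?: */
static int err_mask_as_parsed(int err_queue_flag)
{
	return (POLLERR | err_queue_flag) ? POLLPRI : 0;
}

/* What was intended: always POLLERR, plus POLLPRI only if the flag is set. */
static int err_mask_intended(int err_queue_flag)
{
	return POLLERR | (err_queue_flag ? POLLPRI : 0);
}
```

Since POLLERR is non-zero, the first form yields POLLPRI regardless of the
flag and never sets POLLERR at all; the second form yields POLLERR, with
POLLPRI added only when the flag is set.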

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Jacob Keller <[email protected]>
Acked-by: Neil Horman <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Acked-by: Jacob Keller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fib: fib6_add: fix potential NULL pointer dereference

When the kernel is compiled with CONFIG_IPV6_SUBTREES, and we return
with an error in fn = fib6_add_1(), then error codes are encoded into
the return pointer e.g. ERR_PTR(-ENOENT). In such an error case, we
write the error code into err and jump to out, hence enter the if(err)
condition. Now, if CONFIG_IPV6_SUBTREES is enabled, we check for:

  if (pn != fn && pn->leaf == rt)
    ...
  if (pn != fn && !pn->leaf && !(pn->fn_flags & RTN_RTINFO))
    ...

Since pn is NULL and fn is e.g. ERR_PTR(-ENOENT), pn != fn
evaluates to true and causes a NULL-pointer dereference on further
checks on pn. Fix it by setting both to NULL in the error case, so that
pn != fn already evaluates to false and no further dereference
takes place.
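
A userspace sketch of why the guard fails, with hypothetical stand-ins for
the node type and for the kernel's ERR_PTR() helper (here simply an integer
cast to a pointer, which is an assumption about how the error is encoded):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct fib6_node. */
struct node_sketch {
	struct node_sketch *leaf;
};

/* Userspace stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static struct node_sketch *err_ptr_sketch(intptr_t err)
{
	return (struct node_sketch *)err;
}

/*
 * Returns 1 when a check of the form "pn != fn && pn->leaf ..." would go
 * on to evaluate pn->leaf, i.e. when the first operand is true.
 */
static int guard_reaches_deref(const struct node_sketch *pn,
			       const struct node_sketch *fn)
{
	return pn != fn;
}
```

With pn == NULL and fn == err_ptr_sketch(-ENOENT) the guard lets the
dereference through; once both are reset to NULL it short-circuits.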

This was first correctly implemented in 4a287eba2 ("IPv6 routing,
NLM_F_* flag support: REPLACE and EXCL flags support, warn about
missing CREATE flag"), but the bug was later introduced by
188c517a0 ("ipv6: return errno pointers consistently for fib6_add_1()").

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Matti Vaittinen <[email protected]>
Cc: Hannes Frederic Sowa <[email protected]>
Acked-by: Hannes Frederic Sowa <[email protected]>
Acked-by: Matti Vaittinen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: ovs: flow: fix potential illegal memory access in __parse_flow_nlattrs

In function __parse_flow_nlattrs(), we check for condition
(type > OVS_KEY_ATTR_MAX) and if true, print an error, but we do
not return from this function as in other checks. It seems this
has been forgotten, as otherwise, we could access beyond the
memory of ovs_key_lens, which is of ovs_key_lens[OVS_KEY_ATTR_MAX + 1].
Hence, a maliciously prepared nla_type from user space could access
beyond this upper limit.
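
The pattern is the usual "validate the index, then return early" check. A
minimal sketch, with a hypothetical table and limit standing in for
ovs_key_lens[] and OVS_KEY_ATTR_MAX:

```c
#include <errno.h>

#define KEY_ATTR_MAX_SKETCH 3

/* Hypothetical length table with KEY_ATTR_MAX_SKETCH + 1 entries. */
static const int key_lens_sketch[KEY_ATTR_MAX_SKETCH + 1] = { 0, 4, 2, 8 };

/* Returns the expected length, or -EINVAL for an out-of-range type. */
static int attr_expected_len(unsigned int type)
{
	if (type > KEY_ATTR_MAX_SKETCH)
		return -EINVAL;	/* the early return that was missing */

	return key_lens_sketch[type];
}
```

Without the early return, a type above the limit would index past the end
of the table.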

Introduced by 03f0d916a ("openvswitch: Mega flow implementation").

Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Andy Zhou <[email protected]>
Acked-by: Jesse Gross <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bcm63xx_enet: remove deprecated IRQF_DISABLED

This patch proposes to remove the IRQF_DISABLED flag from
drivers/net/ethernet/broadcom/bcm63xx_enet.c

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker <[email protected]>
Reviewed-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: korina: remove deprecated IRQF_DISABLED

This patch proposes to remove the IRQF_DISABLED flag from
drivers/net/ethernet/korina.c

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker <[email protected]>
Reviewed-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

macvlan: Move skb_clone check closer to call

Currently macvlan calls skb_clone in macvlan_broadcast but checks
for a NULL return in macvlan_broadcast_one instead. This is
needlessly confusing and may lead to bugs introduced later.

This patch moves the error check to where the skb_clone call is.

The only other caller of macvlan_broadcast_one never passes in a
NULL value so it doesn't need the check either.

Signed-off-by: Herbert Xu <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

hwmon: (tmp421) Fix return value

Propagate return value obtained from i2c_smbus_read_byte_data()
instead of hardcoding.

Signed-off-by: Sachin Kamat <[email protected]>
Cc: Andre Prendel <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

hwmon: (amc6821) Remove redundant break

'break' after return or goto has no effect. Remove it.

Signed-off-by: Sachin Kamat <[email protected]>
Cc: T. Mertelj <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

hwmon: (amc6821) Fix return value

Propagate return value obtained from i2c_smbus_read_byte_data()
instead of hardcoding.

Signed-off-by: Sachin Kamat <[email protected]>
Cc: T. Mertelj <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

hwmon: (ibmaem) Fix return value

Propagate appropriate error code obtained from ipmi_create_user()
instead of hardcoding.

Signed-off-by: Sachin Kamat <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Acked-by: Darrick J. Wong <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

hwmon: (emc2103) Fix return value

kstrtol() returns appropriate error values. Use those instead of
hardcoding. Silences several sparse messages of following type:
"why not propagate 'result' from kstrtol() instead of (-22)?"

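The difference between hardcoding and propagating can be sketched in
userspace. parse_long_sketch() is a hypothetical strtol()-based stand-in
that only mimics the 0 / -EINVAL / -ERANGE return convention of kstrtol();
it is not the kernel function, and store_limit_sketch() is an invented
caller.

```c
#include <errno.h>
#include <stdlib.h>

/* Stand-in for kstrtol(): 0 on success, -ERANGE or -EINVAL on failure. */
static int parse_long_sketch(const char *s, long *res)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s || *end != '\0')
		return -EINVAL;
	*res = val;
	return 0;
}

/* Hypothetical store routine: hand the parser's error code to the caller. */
static int store_limit_sketch(const char *buf, long *limit)
{
	long val;
	int result = parse_long_sketch(buf, &val);

	if (result < 0)
		return result;	/* propagate instead of "return -EINVAL" */

	*limit = val;
	return 0;
}
```

An overflowing input then reports -ERANGE to the caller rather than being
folded into -EINVAL.
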
Signed-off-by: Sachin Kamat <[email protected]>
Cc: Steve Glendinning <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>

qlcnic: Fix warning reported by kbuild test robot.

drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c: In function 'qlcnic_handle_fw_message':
drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c:922:4: warning: overflow in implicit constant conversion [-Woverflow]

Signed-off-by: Jitendra Kalsaria <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'bonding_arp'

Nikolay Aleksandrov says:

====================
bonding: fix arp_validate desync state & race

These two patches aim to fix the possible de-sync state which the bond
can enter if we have arp_validate without arp_interval or the other way
around. They also fix a race condition between arp_validate setting and
mode changing.

Patch 01 - fixes the race condition between store_arp_validate and bond
mode change by using rtnl for sync
Patch 02 - fixes the possible de-sync state by setting/unsetting recv_probe
if arp_interval is set/unset and also if arp_validate is set/unset

v2: Fix the mode check in store_arp_validate
====================

Signed-off-by: David S. Miller <[email protected]>

bonding: fix bond_arp_rcv setting and arp validate desync state

We make bond_arp_rcv global so it can be used in bond_sysfs if the bond
interface is up and arp_interval is being changed to a positive value
and cleared otherwise as per Jay's suggestion.
This also fixes a problem where bond_arp_rcv was set even though
arp_validate was disabled while the bond was up by unsetting recv_probe
in bond_store_arp_validate and respectively setting it if enabled.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Marcelo Ricardo Leitner <[email protected]>
Acked-by: Veaceslav Falico <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bonding: fix store_arp_validate race with mode change

We need to protect store_arp_validate via rtnl because it can race with
mode changing and we can end up having arp_validate set in a mode
different from active-backup.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Acked-by: Veaceslav Falico <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ipv6/exthdrs: accept tlv which includes only padding

In rfc4942 and rfc2460 I cannot find anything that would imply dropping
packets which have only padding in a TLV.

Current behaviour breaks TAHI Test v6LC.1.2.6.

Problem was introduced in:
9b905fe6843 "ipv6/exthdrs: strict Pad1 and PadN check"

Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

[SCSI] fnic: FC stat param seconds_since_last_reset not getting updated

Code to reset fc_host statistics.
echo 1 > /sys/class/fc_host/hostX/statistics/reset_statistics clears fc_host stats;
the code also issues a command to the fnic firmware to clear vnic stats.

Signed-off-by: Narsimhulu Musini <[email protected]>
Signed-off-by: Hiral Patel <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

bnx2x: avoid atomic allocations during initialization

During initialization bnx2x allocates significant amounts of memory
(for rx data, rx SGEs, TPA pool) using atomic allocations.

I received a report where bnx2x failed to allocate SGEs and it had
to fall back to TPA-less operation.

Let's use GFP_KERNEL allocations during initialization, which runs
in process context. Add gfp_t parameters to functions that are used
both in initialization and in the receive path.

Use an unlikely branch in bnx2x_frag_alloc() to avoid atomic allocation
by netdev_alloc_frag(). The branch is taken several thousands of times
during initialization, but then never more. Note that fp->rx_frag_size
is never greater than PAGE_SIZE, so __get_free_page() can be used here.

Signed-off-by: Michal Schmidt <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'for-linus-3.12-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull 9p updates from Eric Van Hensbergen:
"Minor 9p fixes and tweaks for 3.12 merge window

  The first fixes namespace issues which cause a kernel NULL pointer
  dereference, the second fixes uevent handling to work better with
  udev, and the third switches some code to use strlcpy instead of
  strncpy in order to be safer.

  All changes have been baking in for-next for at least 2 weeks"

* tag 'for-linus-3.12-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  fs/9p: avoid accessing utsname after namespace has been torn down
  9p: send uevent after adding/removing mount_tag attribute
  fs: 9p: use strlcpy instead of strncpy

Merge tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next

Pull squashfs updates from Phillip Lougher:
"A couple of minor additional sanity check patches for corrupted
  information, and some fixes.  Apart from that there's a minor loop
  optimisation.

  These sanity checks mainly exist to trap maliciously corrupted
  filesystems either through using a deliberately modified mksquashfs,
  or where the user has deliberately chosen to generate uncompressed
  metadata and then corrupted it.

  Normally metadata in Squashfs filesystems is compressed, which means
  corruption (either accidental or malicious) is detected when trying to
  decompress the metadata.  So corrupted data does not normally get as
  far as the code paths in question here"

* tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next:
  Squashfs: add corruption check for type in squashfs_readdir()
  Squashfs: add corruption check in get_dir_index_using_offset()
  Squashfs: fix corruption checks in squashfs_readdir()
  Squashfs: fix corruption checks in squashfs_lookup()
  Squashfs: fix corruption check in get_dir_index_using_name()
  Squashfs: Optimized uncompressed buffer loop
  Squashfs: sanity check information from disk

xen/balloon: remove BUG_ON in increase_reservation

The BUG_ON in increase_reservation is wrong: the P2M entry of a
ballooned-out page is set to the balloon scratch page, so the page might
have a valid P2M entry at that point.

Signed-off-by: Wei Liu <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>

xen/balloon: ensure preemption is disabled when using a scratch page

In decrease_reservation(), if the kernel is preempted between updating
the mapping and updating the p2m then they may end up using different
scratch pages.

Use get_balloon_scratch_page() and put_balloon_scratch_page() which use
get_cpu_var() and put_cpu_var() to correctly disable preemption.

Signed-off-by: David Vrabel <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>
Tested-by: Sander Eikelenboom <[email protected]>

[SCSI] sd: Fix potential out-of-bounds access

This patch fixes an out-of-bounds error in sd_read_cache_type(), found
by Google's AddressSanitizer tool. When the loop ends, we know that
"offset" lies beyond the end of the data in the buffer, so no Caching
mode page was found. In theory it may be present, but the buffer size
is limited to 512 bytes.

Signed-off-by: Alan Stern <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
CC: <[email protected]>
Signed-off-by: James Bottomley <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:
"This includes one bpf/jit bug fix where the jit compiler could
  sometimes write generated code out of bounds of the allocated memory
  area.

  The rest of the patches are only cleanups and minor improvements"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/irq: reduce size of external interrupt handler hash array
  s390/compat,uid16: use current_cred()
  s390/ap_bus: use and-mask instead of a cast
  s390/ftrace: avoid pointer arithmetics with function pointers
  s390: make various functions static, add declarations to header files
  s390/compat signal: add couple of __force annotations
  s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()
  s390: keep Kconfig sorted
  s390/irq: rework irq subclass handling
  s390/irq: use hlists for external interrupt handler array
  s390/dumpstack: convert print_symbol to %pSR
  s390/perf: Remove print_hex_dump_bytes() debug output
  s390: update defconfig
  s390/bpf,jit: fix address randomization

Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kconfig updates from Michal Marek:
"This is the kconfig part of kbuild for v3.12-rc1:
   - post-3.11 search code fixes and micro-optimizations
   - CONFIG_MODULES is no longer a special case; this is needed to
     eventually fix the bug that using KCONFIG_ALLCONFIG breaks
     allmodconfig
   - long long is used to store hex and int values
   - make silentoldconfig no longer warns when a symbol changes from
     tristate to bool (it's a job for make oldconfig)
   - scripts/diffconfig updated to work with newer Pythons
   - scripts/config does not rely on GNU sed extensions"

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kconfig: do not allow more than one symbol to have 'option modules'
  kconfig: regenerate bison parser
  kconfig: do not special-case 'MODULES' symbol
  diffconfig: Update script to support python versions 2.5 through 3.3
  diffconfig: Gracefully exit if the default config files are not present
  modules: do not depend on kconfig to set 'modules' option to symbol MODULES
  kconfig: silence warning when parsing auto.conf when a symbol has changed type
  scripts/config: use sed's POSIX interface
  kconfig: switch to "long long" for sanity
  kconfig: simplify symbol-search code
  kconfig: don't allocate n+1 elements in temporary array
  kconfig: minor style fixes in symbol-search code
  kconfig/[mn]conf: shorten title in search-box
  kconfig: avoid multiple calls to strlen
  Documentation/kconfig: more concise and straightforward search explanation

Merge branch 'pm-cpufreq'

* pm-cpufreq:
  intel_pstate: Add Haswell CPU models
  Revert "cpufreq: make sure frequency transitions are serialized"
  cpufreq: Use signed type for 'ret' variable, to store negative error values
  cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
  cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
  cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
  cpufreq: Split __cpufreq_remove_dev() into two parts
  cpufreq: Fix wrong time unit conversion
  cpufreq: serialize calls to __cpufreq_governor()
  cpufreq: don't allow governor limits to be changed when it is disabled

Merge branch 'acpi-bind'

* acpi-bind:
  ACPI / bind: Prefer device objects with _STA to those without it

perf: Fix up MMAP2 buffer space reservation

The ino_generation field was added in the PERF_RECORD_MMAP2 record in
the 13d7a24 cset but no space for it was allocated, corrupting the
PERF_FORMAT_{TIME,CPU,TID,etc} area (sample_type/sample_id_all), fix it.

Detected with one of the regression tests done by 'perf test':

  [root@sandy ~]# perf test -v 7
   7: Validate PERF_RECORD_* events & perf_sample fields     :
  --- start ---
  61315294449606 0 PERF_RECORD_SAMPLE
  61315294453161 0 PERF_RECORD_SAMPLE
  61315294454441 0 PERF_RECORD_SAMPLE
  61315294455709 0 PERF_RECORD_SAMPLE
  61315295600899 0 PERF_RECORD_COMM: sleep:6500
  27917287430500 342521613 PERF_RECORD_MMAP2 6500/6500: [0x400000(0x7000) @ 0 00:1d 311442 9016]: /usr/bin/sleep
  MMAP2 going backwards in time, prev=61315295600899, curr=27917287430500
  MMAP2 with unexpected cpu, expected 0, got 342521613
  MMAP2 with unexpected pid, expected 6500, got 1701606191
  MMAP2 with unexpected tid, expected 6500, got 28773
  27917287430500 342561333 PERF_RECORD_MMAP2 6500/6500: [0x3b7e000000(0x223000) @ 0 00:1d 309186 9016]: /usr/lib64/ld-2.16.so
  MMAP2 with unexpected cpu, expected 0, got 342561333
  MMAP2 with unexpected pid, expected 6500, got 1932408369
  MMAP2 with unexpected tid, expected 6500, got 111
  27917287430500 342600095 PERF_RECORD_MMAP2 6500/6500: [0x7fffbd7dc000(0x1000) @ 0x7fffbd7dc000 00:00 0 0]: [vdso]
  MMAP2 with unexpected cpu, expected 0, got 342600095
  MMAP2 with unexpected pid, expected 6500, got 1935963739
  MMAP2 with unexpected tid, expected 6500, got 23919
  27917287430500 342882834 PERF_RECORD_MMAP2 6500/6500: [0x3b7e400000(0x3b8000) @ 0 00:1d 309187 9016]: /usr/lib64/libc-2.16.so
  MMAP2 with unexpected cpu, expected 0, got 342882834
  MMAP2 with unexpected pid, expected 6500, got 909192754
  MMAP2 with unexpected tid, expected 6500, got 7303982
  61316297195411 0 PERF_RECORD_EXIT(6500:6500):(6500:6500)
  ---- end ----
  Validate PERF_RECORD_* events & perf_sample fields: FAILED!
  [root@sandy ~]#

After this patch:

  [root@sandy ~]# perf test 7
   7: Validate PERF_RECORD_* events & perf_sample fields     : Ok
  [root@sandy ~]#

Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Stephane Eranian <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf tools: Add attr->mmap2 support

This patch adds support for the new PERF_RECORD_MMAP2 record type
exposed by the kernel. This is an extended PERF_RECORD_MMAP record.

It adds for each file-backed mapping the device major, minor number and
the inode number and generation.

This triplet uniquely identifies the source of a file-backed mapping. It
can be used to detect identical virtual mappings between processes, for
instance.

The patch will prefer MMAP2 over MMAP.

Signed-off-by: Stephane Eranian <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Cope with 314add6 "Change machine__findnew_thread() to set thread pid",
  fix 'perf test' regression test entry affected,
  use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
  so that new tools can work with kernels where this feature is not present ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

NFSv4.1: sp4_mach_cred: WARN_ON -> WARN_ON_ONCE

No need to spam the logs

Signed-off-by: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4.1: sp4_mach_cred: no need to ref count creds

The cl_machine_cred doesn't need to be reference counted here -
a reference is held for the lifetime of the struct nfs_client.
Also, no need to put_rpccred the rpc_message.rpc_cred.

Signed-off-by: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4.1: fix SECINFO* use of put_rpccred

Recent SP4_MACH_CRED changes allow rpc_message.rpc_cred to change,
so keep a separate pointer to the machine cred for put_rpccred.

Signed-off-by: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4.1: sp4_mach_cred: ask for WRITE and COMMIT

Request SP4_MACH_CRED WRITE and COMMIT support in spo_must_allow list --
they're already supported by the client.

Signed-off-by: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

Merge tag 'asoc-v3.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.12

A few small fixes, nothing with any broad impact but all useful for the
affected systems. The Kirkwood compatible string change is fixing up a
string just added in the merge window so that we don't get any changes
in released kernels.

Merge remote-tracking branch 'asoc/fix/rsnd' into asoc-linus

Merge remote-tracking branch 'asoc/fix/mc13783' into asoc-linus

Merge remote-tracking branch 'asoc/fix/kirkwood' into asoc-linus

Merge remote-tracking branch 'asoc/fix/fsl' into asoc-linus

Merge remote-tracking branch 'asoc/fix/atmel' into asoc-linus

ASoC: mc13783: add spi errata fix

The MC13783 Chip Errata, Rev. 4 says that, depending on SPI clock
and main audio clock speed, the Audio Codec or Stereo DAC sometimes does
not start when programmed to do so. This is due to an internal clock
timing issue related to the loading of the SPI bits into the audio block.

On an i.MX27 based system, this issue led to switched audio channels under
certain circumstances: RTC + Touch + Audio are used and loaded at startup.

The mentioned workaround of writing registers 40 and 41 two times is implemented
here.

Signed-off-by: Steffen Trumtrar <[email protected]>
Cc: [email protected]
Signed-off-by: Mark Brown <[email protected]>

i40e: include i40e in kernel proper

This patch adds the changes for Kconfig, i40e.txt, MAINTAINERS, Kbuild
and new i40e/Makefile to build i40e with the kernel.

New driver build option is CONFIG_I40E

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: debugfs interface

This driver includes a debugfs interface for developers to get more hardware
information in real-time.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: init code and hardware support

This patch implements the hardware specific init and management.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: implement virtual device interface

While not part of this patch series, an i40evf driver is on its
way, and will use these files to communicate with the PF driver.

This patch contains the header and implementation files for the
PF to VF interface.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Mitch Williams <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: driver core headers

This patch contains the main driver header files, containing
structures and data types specific to the Linux driver.

i40e_osdep.h contains some code that helps us adapt our OS-agnostic code to
Linux.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: driver ethtool core

This patch contains the ethtool interface and implementation.

The goal in this patch series is minimal functionality while not
including much in the way of "set support."

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: transmit, receive, and NAPI

This patch contains the transmit, receive, and NAPI routines, as well
as ancillary routines.

The code in this file is (or will be) used by both the VF and PF
drivers.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

i40e: main driver core

This is the driver for the Intel(R) Ethernet Controller XL710 Family.

This driver is targeted at basic Ethernet functionality only, and will be
improved over time.

This patch contains the driver entry points but does not include transmit
and receive (see the next patch in the series) routines.

Signed-off-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
CC: PJ Waskiewicz <[email protected]>
CC: [email protected]
Tested-by: Kavindya Deegala <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>

Merge tag 'for-v3.12' of git://git.infradead.org/battery-2.6

Pull battery/power supply driver updates from Anton Vorontsov:
"New drivers:

   - APM X-Gene system reboot driver by Feng Kan and Loc Ho (APM).

   - Qualcomm MSM reboot/poweroff driver by Abhimanyu Kapur (Codeaurora).

   - Texas Instruments BQ24190 charger driver by Mark A. Greer (Animal
     Creek Technologies).

   - Texas Instruments TWL4030 MADC battery driver by Lukas Märdian and
     Marek Belisko (Golden Delicious Computers).  The driver is used on
     Freerunner GTA04 phones.

  Highlighted fixes and improvements:

   - Suspend/wakeup logic improvements: power supply objects will block
     system suspend until all power supply events are processed.  Thanks
     to Zoran Markovic (Linaro), Arve Hjonnevag and Todd Poynor (Google)"

* tag 'for-v3.12' of git://git.infradead.org/battery-2.6:
  rx51_battery: Fix channel number when reading adc value
  power: Add twl4030_madc battery driver.
  bq24190_charger: Workaround SS definition problem on i386 builds
  power_supply: Prevent suspend until power supply events are processed
  vexpress-poweroff: Should depend on the required infrastructure
  twl4030-charger: Fix compiler warning with regulator_enable()
  rx51_battery: Replace hardcoded channels values.
  bq24190_charger: Add support for TI BQ24190 Battery Charger
  ab8500-charger: We print an unintended error message
  max8925_power: Fix missing of_node_put
  power_supply: Replace strict_strtol() with kstrtol()
  power: Add APM X-Gene system reboot driver
  power_supply: tosa_battery: Get rid of irq_to_gpio usage
  power supply: collie_battery: Convert to use dev_pm_ops
  power_supply: Make goldfish_battery depend on GOLDFISH || COMPILE_TEST
  power: reset: Add msm restart support
  MAINTAINERS: drivers/power: add entry for SmartReflex AVS drivers

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Ben Herrenschmidt:
"Here are a handful of small powerpc fixes.

  A couple of section mismatches (always worth fixing), a missing export
  of a new symbol causing build failures of modules, a page fault
  deadlock fix (interestingly that bug has been around for a LONG time,
  though it seems to be more easily triggered by KVM) and fixing pseries
  default idle loop in the absence of the cpuidle drivers (such as
  during boot)"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Default arch idle could cede processor on pseries
  fbdev/ps3fb: Fix section mismatch warning for ps3fb_probe
  powerpc: Fix section mismatch warning for prom_rtas_call
  powerpc: Fix possible deadlock on page fault
  powerpc: Export cpu_to_chip_id() to fix build error

target/iscsi: Bump versions to v4.1.0

Signed-off-by: Nicholas Bellinger <[email protected]>

target: Update copyright ownership/year information to 2013

Update copyright ownership/year information for target-core,
loopback, iscsi-target, tcm_qla2xx, vhost and iser-target.

Signed-off-by: Nicholas Bellinger <[email protected]>