Git Repo - linux.git/log

tipc: fix connection refcount leak

When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
connection instance, its reference count value is increased if
it's found. But subsequently if it's found that the connection is
closed, the work of sending message is not queued into its server
send workqueue, and the connection reference count is not decreased.
This will cause a reference count leak. To reproduce this problem,
an application would need to open and closes topology server
connections with high intensity.

We fix this by immediately decrementing the connection reference
count if a send fails due to the connection being closed.

Signed-off-by: Ying Xue <[email protected]>
Acked-by: Erik Hugne <[email protected]>
Reviewed-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

tipc: allow connection shutdown callback to be invoked in advance

Currently connection shutdown callback function is called when
connection instance is released in tipc_conn_kref_release(), and
receiving packets and sending packets are running in different
threads. Even if connection is closed by the thread of receiving
packets, its shutdown callback may not be called immediately as
the connection reference count is non-zero at that moment. So,
although the connection is shut down by the thread of receiving
packets, the thread of sending packets doesn't know it. Before
its shutdown callback is invoked to tell the sending thread its
connection has been closed, the sending thread may deliver
messages by tipc_conn_sendmsg(), this is why the following error
information appears:

"Sending subscription event failed, no memory"

To eliminate it, allow connection shutdown callback function to
be called before connection id is removed in tipc_close_conn(),
which makes the sending thread know the truth in time that its
socket is closed so that it doesn't send message to it. We also
remove the "Sending XXX failed..." error reporting for topology
and config services.

Signed-off-by: Ying Xue <[email protected]>
Signed-off-by: Erik Hugne <[email protected]>
Reviewed-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

l2tp: fix userspace reception on plain L2TP sockets

As pppol2tp_recv() never queues up packets to plain L2TP sockets,
pppol2tp_recvmsg() never returns data to userspace, thus making
the recv*() system calls unusable.

Instead of dropping packets when the L2TP socket isn't bound to a PPP
channel, this patch adds them to its reception queue.

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

l2tp: fix manual sequencing (de)activation in L2TPv2

Commit e0d4435f "l2tp: Update PPP-over-L2TP driver to work over L2TPv3"
broke the PPPOL2TP_SO_SENDSEQ setsockopt. The L2TP header length was
previously computed by pppol2tp_l2t_header_len() before each call to
l2tp_xmit_skb(). Now that header length is retrieved from the hdr_len
session field, this field must be updated every time the L2TP header
format is modified, or l2tp_xmit_skb() won't push the right amount of
data for the L2TP header.

This patch uses l2tp_session_set_header_len() to adjust hdr_len every
time sequencing is (de)activated from userspace (either by the
PPPOL2TP_SO_SENDSEQ setsockopt or the L2TP_ATTR_SEND_SEQ netlink
attribute).

Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dm thin: fix Documentation for held metadata root feature

The Documentation for the thin provisioning target's held metadata root
feature was incorrect. It is now available and the value for the held
metadata root is in block units (not 512b sectors).

Signed-off-by: Mike Snitzer <[email protected]>

x86, trace: Further robustify CR2 handling vs tracing

Building on commit 0ac09f9f8cd1 ("x86, trace: Fix CR2 corruption when
tracing page faults") this patch addresses another few issues:

- Now that read_cr2() is lifted into trace_do_page_fault(), we should
   pass the address to trace_page_fault_entries() to avoid it
   re-reading a potentially changed cr2.

- Put both trace_do_page_fault() and trace_page_fault_entries() under
   CONFIG_TRACING.

- Mark both fault entry functions {,trace_}do_page_fault() as notrace
   to avoid getting __mcount or other function entry trace callbacks
   before we've observed CR2.

- Mark __do_page_fault() as noinline to guarantee the function tracer
   does get to see the fault.

Cc: <[email protected]>
Cc: <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>

mwifiex: save and copy AP's VHT capability info correctly

While preparing association request, intersection of device's
VHT capability information and corresponding field advertised
by AP is used.

This patch fixes a couple errors while saving and copying vht_cap
and vht_oper fields from AP's beacon.

Cc: <[email protected]> # 3.9+
Signed-off-by: Amitkumar Karwar <[email protected]>
Signed-off-by: Bing Zhao <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

mwifiex: copy AP's HT capability info correctly

While preparing association request, intersection of device's HT
capability information and corresponding fields advertised by AP
is used.

This patch fixes an error while copying this field from AP's
beacon.

Cc: <[email protected]>
Signed-off-by: Amitkumar Karwar <[email protected]>
Signed-off-by: Bing Zhao <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

ACPI / EC: Clear stale EC events on Samsung systems

A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
continue to log events during sleep (lid open/close, AC plug/unplug,
battery level change), which accumulate in the EC until a buffer fills.
After the buffer is full (tests suggest it holds 8 events), GPEs stop
being triggered for new events. This state persists on wake or even on
power cycle, and prevents new events from being registered until the EC
is manually polled.

This is the root cause of a number of bugs, including AC not being
detected properly, lid close not triggering suspend, and low ambient
light not triggering the keyboard backlight. The bug also seemed to be
responsible for performance issues on at least one user's machine.

Juan Manuel Cabo found the cause of bug and the workaround of polling
the EC manually on wake.

The loop which clears the stale events is based on an earlier patch by
Lan Tianyu (see referenced attachment).

This patch:
- Adds a function acpi_ec_clear() which polls the EC for stale _Q
   events at most ACPI_EC_CLEAR_MAX (currently 100) times. A warning is
   logged if this limit is reached.
- Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
   system vendor is Samsung. This check could be replaced by several
   more specific DMI vendor/product pairs, but it's likely that the bug
   affects more Samsung products than just the five series mentioned
   above. Further, it should not be harmful to run acpi_ec_clear() on
   systems without the bug; it will return immediately after finding no
   data waiting.
- Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
- Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()

References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
References: https://bugzilla.kernel.org/attachment.cgi?id=126801
Suggested-by: Juan Manuel Cabo <[email protected]>
Signed-off-by: Kieran Clancy <[email protected]>
Reviewed-by: Lan Tianyu <[email protected]>
Reviewed-by: Dennis Jansen <[email protected]>
Tested-by: Kieran Clancy <[email protected]>
Tested-by: Juan Manuel Cabo <[email protected]>
Tested-by: Dennis Jansen <[email protected]>
Tested-by: Maurizio D'Addona <[email protected]>
Tested-by: San Zamoyski <[email protected]>
Cc: All applicable <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: Initialize governor for a new policy under policy->rwsem

policy->rwsem is used to lock access to all parts of code modifying
struct cpufreq_policy, but it's not used on a new policy created by
__cpufreq_add_dev().

Because of that, if cpufreq_update_policy() is called in a tight loop
on one CPU in parallel with offline/online of another CPU, then the
following crash can be triggered:

Unable to handle kernel NULL pointer dereference at virtual address 00000020
pgd = c0003000
[00000020] *pgd=80000000004003, *pmd=00000000
Internal error: Oops: 206 [#1] PREEMPT SMP ARM

PC is at __cpufreq_governor+0x10/0x1ac
LR is at cpufreq_update_policy+0x114/0x150

---[ end trace f23a8defea6cd706 ]---
Kernel panic - not syncing: Fatal exception
CPU0: stopping
CPU: 0 PID: 7136 Comm: mpdecision Tainted: G D W 3.10.0-gd727407-00074-g979ede8 #396

[<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58)
[<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c)
[<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8)
[<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98)
[<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4)
[<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84)
[<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68)
[<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44)
[<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc)
[<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78)
[<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74)
[<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c)
[<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180)
[<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68)
[<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30)

Fix that by taking locks at appropriate places in __cpufreq_add_dev()
as well.

Reported-by: Saravana Kannan <[email protected]>
Suggested-by: Srivatsa S. Bhat <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: Initialize policy before making it available for others to use

Policy must be fully initialized before it is being made available
for use by others. Otherwise cpufreq_cpu_get() would be able to grab
a half initialized policy structure that might not have affected_cpus
(for example) populated. Then, anybody accessing those fields will get
a wrong value and that will lead to unpredictable results.

In order to fix this, do all the necessary initialization before we
make the policy structure available via cpufreq_cpu_get(). That will
guarantee that any code accessing fields of the policy will get
correct data from them.

Reported-by: Saravana Kannan <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <[email protected]>

cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions

If a module calls cpufreq_get while cpufreq is initializing, it's
possible for it to be called after cpufreq_driver is set but before
cpufreq_cpu_data is written during subsys_interface_register.  This
happens because cpufreq_get doesn't take the cpufreq_driver_lock
around its use of cpufreq_cpu_data.

Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather
than reading it out of cpufreq_cpu_data directly.  cpufreq_cpu_get()
takes the appropriate locks to prevent this race from happening.

Since it's possible for policy to be NULL if the caller passes in an
invalid CPU number or calls the function before cpufreq is initialized,
delete the BUG_ON(!policy) and simply return 0.  Don't try to return
-ENOENT because that's negative and the function returns an unsigned
integer.

References: https://bbs.archlinux.org/viewtopic.php?id=177934
Signed-off-by: Aaron Plattner <[email protected]>
Cc: 3.13+ <[email protected]> # 3.13+
Signed-off-by: Rafael J. Wysocki <[email protected]>

ARM: KVM: fix non-VGIC compilation

Add a stub for kvm_vgic_addr when compiling without
CONFIG_KVM_ARM_VGIC. The usefulness of this configurarion is extremely
doubtful, but let's fix it anyway (until we decide that we'll always
support a VGIC).

Reported-by: Michele Paolino <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

clk: shmobile: rcar-gen2: Use kick bit to allow Z clock frequency change

The Z clock frequency change is effective only after setting the kick
bit located in the FRQCRB register.
Without that, the CA15 CPUs clock rate will never change.

Fix that by checking if the kick bit is cleared and enable it to make
the clock rate change effective. The bit is cleared automatically upon
completion.

Signed-off-by: Benoit Cousson <[email protected]>
Acked-by: Laurent Pinchart <[email protected]>
Signed-off-by: Mike Turquette <[email protected]>

hyperv: Move state setting for link query

It moves the state setting for query into rndis_filter_receive_response().
All callbacks including query-complete and status-callback are synchronized
by channel->inbound_lock. This prevents pentential race between them.

Signed-off-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: macb: DMA-unmap full rx-buffer

When allocating RX buffers a fixed size is used, while freeing is based
on actually received bytes, resulting in the following kernel warning
when CONFIG_DMA_API_DEBUG is enabled:
WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1051 check_unmap+0x258/0x894()
macb e000b000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002d170040] [map size=1536 bytes] [unmap size=60 bytes]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc3-xilinx-00220-g49f84081ce4f #65
[<c001516c>] (unwind_backtrace) from [<c0011df8>] (show_stack+0x10/0x14)
[<c0011df8>] (show_stack) from [<c03c775c>] (dump_stack+0x7c/0xc8)
[<c03c775c>] (dump_stack) from [<c00245cc>] (warn_slowpath_common+0x60/0x84)
[<c00245cc>] (warn_slowpath_common) from [<c0024670>] (warn_slowpath_fmt+0x2c/0x3c)
[<c0024670>] (warn_slowpath_fmt) from [<c0227d44>] (check_unmap+0x258/0x894)
[<c0227d44>] (check_unmap) from [<c0228588>] (debug_dma_unmap_page+0x64/0x70)
[<c0228588>] (debug_dma_unmap_page) from [<c02ab78c>] (gem_rx+0x118/0x170)
[<c02ab78c>] (gem_rx) from [<c02ac4d4>] (macb_poll+0x24/0x94)
[<c02ac4d4>] (macb_poll) from [<c031222c>] (net_rx_action+0x6c/0x188)
[<c031222c>] (net_rx_action) from [<c0028a28>] (__do_softirq+0x108/0x280)
[<c0028a28>] (__do_softirq) from [<c0028e8c>] (irq_exit+0x84/0xf8)
[<c0028e8c>] (irq_exit) from [<c000f360>] (handle_IRQ+0x68/0x8c)
[<c000f360>] (handle_IRQ) from [<c0008528>] (gic_handle_irq+0x3c/0x60)
[<c0008528>] (gic_handle_irq) from [<c0012904>] (__irq_svc+0x44/0x78)
Exception stack(0xc056df20 to 0xc056df68)
df20: 00000001 c0577430 00000000 c0577430 04ce8e0d 00000002 edfce238 00000000
df40: 04e20f78 00000002 c05981f4 00000000 00000008 c056df68 c0064008 c02d7658
df60: 20000013 ffffffff
[<c0012904>] (__irq_svc) from [<c02d7658>] (cpuidle_enter_state+0x54/0xf8)
[<c02d7658>] (cpuidle_enter_state) from [<c02d77dc>] (cpuidle_idle_call+0xe0/0x138)
[<c02d77dc>] (cpuidle_idle_call) from [<c000f660>] (arch_cpu_idle+0x8/0x3c)
[<c000f660>] (arch_cpu_idle) from [<c006bec4>] (cpu_startup_entry+0xbc/0x124)
[<c006bec4>] (cpu_startup_entry) from [<c053daec>] (start_kernel+0x350/0x3b0)
---[ end trace d5fdc38641bd3a11 ]---
Mapped at:
  [<c0227184>] debug_dma_map_page+0x48/0x11c
  [<c02ab32c>] gem_rx_refill+0x154/0x1f8
  [<c02ac7b4>] macb_open+0x270/0x3e0
  [<c03152e0>] __dev_open+0x7c/0xfc
  [<c031554c>] __dev_change_flags+0x8c/0x140

Fixing this by passing the same size which is passed during mapping the
memory to the unmap function as well.

Signed-off-by: Soren Brinkmann <[email protected]>
Reviewed-by: Ben Hutchings <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: macb: Check DMA mappings for error

With CONFIG_DMA_API_DEBUG enabled the following warning is printed:
WARNING: CPU: 0 PID: 619 at lib/dma-debug.c:1101 check_unmap+0x758/0x894()
macb e000b000.ethernet: DMA-API: device driver failed to check map error[device address=0x000000002d171c02] [size=322 bytes] [mapped as single]
Modules linked in:
CPU: 0 PID: 619 Comm: udhcpc Not tainted 3.14.0-rc3-xilinx-00219-gd158fc7f36a2 #63
[<c001516c>] (unwind_backtrace) from [<c0011df8>] (show_stack+0x10/0x14)
[<c0011df8>] (show_stack) from [<c03c7714>] (dump_stack+0x7c/0xc8)
[<c03c7714>] (dump_stack) from [<c00245cc>] (warn_slowpath_common+0x60/0x84)
[<c00245cc>] (warn_slowpath_common) from [<c0024670>] (warn_slowpath_fmt+0x2c/0x3c)
[<c0024670>] (warn_slowpath_fmt) from [<c0228244>] (check_unmap+0x758/0x894)
[<c0228244>] (check_unmap) from [<c0228588>] (debug_dma_unmap_page+0x64/0x70)
[<c0228588>] (debug_dma_unmap_page) from [<c02aba64>] (macb_interrupt+0x1f8/0x2dc)
[<c02aba64>] (macb_interrupt) from [<c006c6e4>] (handle_irq_event_percpu+0x2c/0x178)
[<c006c6e4>] (handle_irq_event_percpu) from [<c006c86c>] (handle_irq_event+0x3c/0x5c)
[<c006c86c>] (handle_irq_event) from [<c006f548>] (handle_fasteoi_irq+0xb8/0x100)
[<c006f548>] (handle_fasteoi_irq) from [<c006c148>] (generic_handle_irq+0x20/0x30)
[<c006c148>] (generic_handle_irq) from [<c000f35c>] (handle_IRQ+0x64/0x8c)
[<c000f35c>] (handle_IRQ) from [<c0008528>] (gic_handle_irq+0x3c/0x60)
[<c0008528>] (gic_handle_irq) from [<c0012904>] (__irq_svc+0x44/0x78)
Exception stack(0xed197f60 to 0xed197fa8)
7f60: 00000134 60000013 bd94362e bd94362e be96b37c 00000014 fffffd72 00000122
7f80: c000ebe4 ed196000 00000000 00000011 c032c0d8 ed197fa8 c0064008 c000ea20
7fa0: 60000013 ffffffff
[<c0012904>] (__irq_svc) from [<c000ea20>] (ret_fast_syscall+0x0/0x48)
---[ end trace 478f921d0d542d1e ]---
Mapped at:
  [<c0227184>] debug_dma_map_page+0x48/0x11c
  [<c02aaca0>] macb_start_xmit+0x184/0x2a8
  [<c03143c0>] dev_hard_start_xmit+0x334/0x470
  [<c032c09c>] sch_direct_xmit+0x78/0x2f8
  [<c0314814>] __dev_queue_xmit+0x318/0x708

due to missing checks of the dma mapping. Add the appropriate checks to fix
this.

Signed-off-by: Soren Brinkmann <[email protected]>
Reviewed-by: Ben Hutchings <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk

While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to
verify if we/peer is AUTH capable"), we noticed that there's a skb
memory leakage in the error path.

Running the same reproducer as in ec0223ec48a9 and by unconditionally
jumping to the error label (to simulate an error condition) in
sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
the unfreed chunk->auth_chunk skb clone:

Unreferenced object 0xffff8800b8f3a000 (size 256):
  comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00  ..u^..X.........
  backtrace:
    [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210
    [<ffffffff81566929>] skb_clone+0x49/0xb0
    [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
    [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp]
    [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp]
    [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210
    [<ffffffff815a64af>] nf_reinject+0xbf/0x180
    [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
    [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
    [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0
    [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
    [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250
    [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0
    [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0
    [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380

What happens is that commit bbd0d59809f9 clones the skb containing
the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
that an endpoint requires COOKIE-ECHO chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

When we enter sctp_sf_do_5_1D_ce() and before we actually get to
the point where we process (and subsequently free) a non-NULL
chunk->auth_chunk, we could hit the "goto nomem_init" path from
an error condition and thus leave the cloned skb around w/o
freeing it.

The fix is to centrally free such clones in sctp_chunk_destroy()
handler that is invoked from sctp_chunk_free() after all refs have
dropped; and also move both kfree_skb(chunk->auth_chunk) there,
so that chunk->auth_chunk is either NULL (since sctp_chunkify()
allocs new chunks through kmem_cache_zalloc()) or non-NULL with
a valid skb pointer. chunk->skb and chunk->auth_chunk are the
only skbs in the sctp_chunk structure that need to be handeled.

While at it, we should use consume_skb() for both. It is the same
as dev_kfree_skb() but more appropriately named as we are not
a device but a protocol. Also, this effectively replaces the
kfree_skb() from both invocations into consume_skb(). Functions
are the same only that kfree_skb() assumes that the frame was
being dropped after a failure (e.g. for tools like drop monitor),
usage of consume_skb() seems more appropriate in function
sctp_chunk_destroy() though.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: Neil Horman <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

r8152: disable the ECM mode

There are known issues for switching the drivers between ECM mode and
vendor mode. The interrup transfer may become abnormal. The hardware
may have the opportunity to die if you change the configuration without
unloading the current driver first, because all the control transfers
of the current driver would fail after the command of switching the
configuration.

Although to use the ecm driver and vendor driver independently is fine,
it may have problems to change the driver from one to the other by
switching the configuration. Additionally, now the vendor mode driver
is more powerful than the ECM driver. Thus, disable the ECM mode driver,
and let r8152 to set the configuration to vendor mode and reset the
device automatically.

Signed-off-by: Hayes Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net/mlx4: Support shutdown() interface

In kexec scenario, we failed to load the mlx4 driver in the
second kernel because the ownership bit was hold by the first
kernel without release correctly.

The patch adds shutdown() interface so that the ownership can
be released correctly in the first kernel. It also helps avoiding
EEH error happened during boot stage of the second kernel because
of undesired traffic, which can't be handled by hardware during
that stage on Power platform.

Signed-off-by: Gavin Shan <[email protected]>
Tested-by: Wei Yang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

bridge: multicast: add sanity check for query source addresses

MLD queries are supposed to have an IPv6 link-local source address
according to RFC2710, section 4 and RFC3810, section 5.1.14. This patch
adds a sanity check to ignore such broken MLD queries.

Without this check, such malformed MLD queries can result in a
denial of service: The queries are ignored by any MLD listener
therefore they will not respond with an MLD report. However,
without this patch these malformed MLD queries would enable the
snooping part in the bridge code, potentially shutting down the
according ports towards these hosts for multicast traffic as the
bridge did not learn about these listeners.

Reported-by: Jan Stancek <[email protected]>
Signed-off-by: Linus Lüssing <[email protected]>
Reviewed-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: fix for a race condition in the inet frag code

I stumbled upon this very serious bug while hunting for another one,
it's a very subtle race condition between inet_frag_evictor,
inet_frag_intern and the IPv4/6 frag_queue and expire functions
(basically the users of inet_frag_kill/inet_frag_put).

What happens is that after a fragment has been added to the hash chain
but before it's been added to the lru_list (inet_frag_lru_add) in
inet_frag_intern, it may get deleted (either by an expired timer if
the system load is high or the timer sufficiently low, or by the
fraq_queue function for different reasons) before it's added to the
lru_list, then after it gets added it's a matter of time for the
evictor to get to a piece of memory which has been freed leading to a
number of different bugs depending on what's left there.

I've been able to trigger this on both IPv4 and IPv6 (which is normal
as the frag code is the same), but it's been much more difficult to
trigger on IPv4 due to the protocol differences about how fragments
are treated.

The setup I used to reproduce this is: 2 machines with 4 x 10G bonded
in a RR bond, so the same flow can be seen on multiple cards at the
same time. Then I used multiple instances of ping/ping6 to generate
fragmented packets and flood the machines with them while running
other processes to load the attacked machine.

*It is very important to have the _same flow_ coming in on multiple CPUs
concurrently. Usually the attacked machine would die in less than 30
minutes, if configured properly to have many evictor calls and timeouts
it could happen in 10 minutes or so.

An important point to make is that any caller (frag_queue or timer) of
inet_frag_kill will remove both the timer refcount and the
original/guarding refcount thus removing everything that's keeping the
frag from being freed at the next inet_frag_put. All of this could
happen before the frag was ever added to the LRU list, then it gets
added and the evictor uses a freed fragment.

An example for IPv6 would be if a fragment is being added and is at
the stage of being inserted in the hash after the hash lock is
released, but before inet_frag_lru_add executes (or is able to obtain
the lru lock) another overlapping fragment for the same flow arrives
at a different CPU which finds it in the hash, but since it's
overlapping it drops it invoking inet_frag_kill and thus removing all
guarding refcounts, and afterwards freeing it by invoking
inet_frag_put which removes the last refcount added previously by
inet_frag_find, then inet_frag_lru_add gets executed by
inet_frag_intern and we have a freed fragment in the lru_list.

The fix is simple, just move the lru_add under the hash chain locked
region so when a removing function is called it'll have to wait for
the fragment to be added to the lru_list, and then it'll remove it (it
works because the hash chain removal is done before the lru_list one
and there's no window between the two list adds when the frag can get
dropped). With this fix applied I couldn't kill the same machine in 24
hours with the same setup.

Fixes: 3ef0eb0db4bf ("net: frag, move LRU list maintenance outside of
rwlock")

CC: Florian Westphal <[email protected]>
CC: Jesper Dangaard Brouer <[email protected]>
CC: David S. Miller <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

dm thin: fix noflush suspend IO queueing

i) by the time DM core calls the postsuspend hook the dm_noflush flag
has been cleared.  So the old thin_postsuspend did nothing.  We need to
use the presuspend hook instead.

ii) There was a race between bios leaving DM core and arriving in the
deferred queue.

thin_presuspend now sets a 'requeue' flag causing all bios destined for
that thin to be requeued back to DM core.  Then it requeues all held IO,
and all IO on the deferred queue (destined for that thin).  Finally
postsuspend clears the 'requeue' flag.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm thin: fix deadlock in __requeue_bio_list

The spin lock in requeue_io() was held for too long, allowing deadlock.
Don't worry, due to other issues addressed in the following "dm thin:
fix noflush suspend IO queueing" commit, this code was never called.

Fix this by taking the spin lock for a much shorter period of time.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm thin: fix out of data space handling

Ideally a thin pool would never run out of data space; the low water
mark would trigger userland to extend the pool before we completely run
out of space.  However, many small random IOs to unprovisioned space can
consume data space at an alarming rate.  Adjust your low water mark if
you're frequently seeing "out-of-data-space" mode.

Before this fix, if data space ran out the pool would be put in
PM_READ_ONLY mode which also aborted the pool's current metadata
transaction (data loss for any changes in the transaction).  This had a
side-effect of needlessly compromising data consistency.  And retry of
queued unserviceable bios, once the data pool was resized, could
initiate changes to potentially inconsistent pool metadata.

Now when the pool's data space is exhausted transition to a new pool
mode (PM_OUT_OF_DATA_SPACE) that allows metadata to be changed but data
may not be allocated.  This allows users to remove thin volumes or
discard data to recover data space.

The pool is no longer put in PM_READ_ONLY mode in response to the pool
running out of data space.  And PM_READ_ONLY mode no longer aborts the
pool's current metadata transaction.  Also, set_pool_mode() will now
notify userspace when the pool mode is changed.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

dm thin: ensure user takes action to validate data and metadata consistency

If a thin metadata operation fails the current transaction will abort,
whereby causing potential for IO layers up the stack (e.g. filesystems)
to have data loss.  As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the
thin metadata's superblock which:
1) requires the user verify the thin metadata is consistent (e.g. use
   thin_check, etc)
2) suggests the user verify the thin data is consistent (e.g. use fsck)

The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
to run thin_repair.

On metadata operation failure: abort current metadata transaction, set
pool in read-only mode, and now set the needs_check flag.

As part of this change, constraints are introduced or relaxed:
* don't allow a pool to transition to write mode if needs_check is set
* don't allow data or metadata space to be resized if needs_check is set
* if a thin pool's metadata space is exhausted: the kernel will now
  force the user to take the pool offline for repair before the kernel
  will allow the metadata space to be extended.

Also, update Documentation to include information about when the thin
provisioning target commits metadata, how it handles metadata failures
and running out of space.

Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Joe Thornber <[email protected]>

NFSv4: Fail the truncate() if the lock/open stateid is invalid

If the open stateid could not be recovered, or the file locks were lost,
then we should fail the truncate() operation altogether.

Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4.1 Fail data server I/O if stateid represents a lost lock

Signed-off-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4: Fix the return value of nfs4_select_rw_stateid

In commit 5521abfdcf4d6 (NFSv4: Resend the READ/WRITE RPC call
if a stateid change causes an error), we overloaded the return value of
nfs4_select_rw_stateid() to cause it to return -EWOULDBLOCK if an RPC
call is outstanding that would cause the NFSv4 lock or open stateid
to change.
That is all redundant when we actually copy the stateid used in the
read/write RPC call that failed, and check that against the current
stateid. It is doubly so, when we consider that in the NFSv4.1 case,
we also set the stateid's seqid to the special value '0', which means
'match the current valid stateid'.

Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4: nfs4_stateid_is_current should return 'true' for an invalid stateid

When nfs4_set_rw_stateid() can fails by returning EIO to indicate that
the stateid is completely invalid, then it makes no sense to have it
trigger a retry of the READ or WRITE operation. Instead, we should just
have it fall through and attempt a recovery.

This fixes an infinite loop in which the client keeps replaying the same
bad stateid back to the server.

Reported-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.10+
Signed-off-by: Trond Myklebust <[email protected]>

mac80211: clear sequence/fragment number in QoS-null frames

Avoid leaking data by sending uninitialized memory and setting an
invalid (non-zero) fragment number (the sequence number is ignored
anyway) by setting the seq_ctrl field to zero.

Cc: [email protected]
Fixes: 3f52b7e328c5 ("mac80211: mesh power save basics")
Fixes: ce662b44ce22 ("mac80211: send (QoS) Null if no buffered frames")
Reviewed-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

ALSA: usb-audio: Add quirk for Logitech Webcam C500

Logitech C500 (046d:0807) needs the same workaround like other
Logitech Webcams.

Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hda - Use analog beep for Thinkpads with AD1984 codecs

For making the driver behavior compatible with the earlier kernels,
use the analog beep in the loopback path instead of the digital beep.

Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ALSA: hda - Add missing loopback merge path for AD1884/1984 codecs

The mixer widget (NID 0x20) of AD1884 and AD1984 codecs isn't
connected directly to the actual I/O paths but only via another mixer
widget (NID 0x21). We need a similar fix as we did for AD1882.

Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

pinctrl: sirf: fix kernel panic in gpio_lock_as_irq

commit 655dada6277991 causes kernel panic, this patch fixes it.

    [    1.197816] [ffffffee] *pgd=0d7fd821, *pte=00000000, *ppte=00000000
    [    1.204070] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
    [    1.209447] Modules linked in:
    [    1.212490] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1 #3
    [    1.218737] task: cd03c000 ti: cd040000 task.ti: cd040000
    [    1.224127] PC is at gpiod_lock_as_irq+0xc/0x64
    [    1.228634] LR is at sirfsoc_gpio_irq_startup+0x18/0x44
    [    1.233842] pc : [<c01d3990>]    lr : [<c01d1c38>]    psr: a0000193
    [    1.233842] sp : cd041d30  ip : 00000000  fp : 00000000
    [    1.245296] r10: 00000000  r9 : cd023db4  r8 : 60000113
    [    1.250505] r7 : 0000003e  r6 : cd023dd4  r5 : c06bfa54  r4 : cd023d80
    [    1.257014] r3 : 00000020  r2 : 00000000  r1 : ffffffea  r0 : ffffffea
    [    1.263526] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
    [    1.270903] Control: 10c53c7d  Table: 00004059  DAC: 00000015
    [    1.276631] Process swapper/0 (pid: 1, stack limit = 0xcd040240)
    [    1.282620] Stack: (0xcd041d30 to 0xcd042000)
    [    1.286963] 1d20:                                     cd023d80 c01d1c38 c01d1c20 cd023d80
    [    1.295124] 1d40: 00000001 c0068438 cd023d80 ccb6d880 cd023dd4 c0067044 0000718e c006719c
    [    1.286963] 1d20:                                     cd023d80 c01d1c38 c01d1c20 cd023d80
    [    1.295124] 1d40: 00000001 c0068438 cd023d80 ccb6d880 cd023dd4 c0067044 0000718e c006719c
    [    1.295124] 1d40: 00000001 c0068438 cd023d80 ccb6d880 cd023dd4 c0067044 0000718e c006719c
    [    1.303283] 1d60: 00000800 00000083 ccb6d880 cd023d80 c02b41d8 00000083 0000003e ccb7c410
    [    1.311442] 1d80: 00000000 c00671dc 00000083 0000003e c02b41d8 cd3dd5c0 0000003e ccb7c634
    [    1.319601] 1da0: cd040030 c00672a8 cd3dd5c0 ccb7c410 ccb6d340 ccb7c410 ccb6d340 cd3dd400
    [    1.327760] 1dc0: cd3dd410 c02b4434 ccb7c410 c01265a8 00000001 cd3dd410 c0687108 00000000
    [    1.335919] 1de0: c0687108 00000000 00000000 c0240170 c0240158 cd3dd410 c06c30d0 c023e8bc
    [    1.344079] 1e00: c023e9d4 00000000 cd3dd410 c023e9d4 c0682150 c023cf88 cd003e98 cd2d50c4
    [    1.352238] 1e20: cd3dd410 cd3dd444 c06822f0 c023e768 cd3dd418 cd3dd410 c06822f0 c023de14
    [    1.360397] 1e40: cd3dd418 00000000 cd3dd410 c023c398 cd041e78 cd041ea8 cd3dd400 cd3dd410
    [    1.368556] 1e60: 00000083 00000000 cd3dd400 cd3dd410 00000083 000000c8 c04e00c8 c023fee8
    [    1.376715] 1e80: 00000000 cd041ea8 cd3dd400 00000001 00000083 c024048c c0435ef8 c0434dec
    [    1.384874] 1ea0: c068da58 c04c6d04 c0682150 c0435ef8 ffffffff 00000000 00000000 c068da58
    [    1.393033] 1ec0: 00000020 00000000 00000000 00000000 c05dabb8 00000007 c068d640 c068d640
    [    1.401193] 1ee0: c04c247c c04c249c 00000000 c00088e8 cd004c00 c043bbb8 cd029180 c03812a0
    [    1.409352] 1f00: 00000000 00000000 60000113 c0673728 60000113 c0673728 00000000 00000000
    [    1.417511] 1f20: cd7fce01 c0390a54 00000065 c003a81c c049e8bc 00000007 cd7fce0e 00000007
    [    1.425670] 1f40: 00000000 c05dabb8 00000007 c068d640 c068d640 c04c050c c04e00c8 00000065
    [    1.433829] 1f60: c04e00c0 c04c0c54 00000007 00000007 c04c050c c037d8fc cd03c000 c004322c
    [    1.441988] 1f80: c0662b40 0000d640 c03737c0 00000000 00000000 00000000 00000000 00000000
    [    1.450147] 1fa0: 00000000 c03737cc 00000000 c000e478 00000000 00000000 00000000 00000000
    [    1.458307] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    [    1.466467] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 0002d481 05014092
    [    1.474640] [<c01d3990>] (gpiod_lock_as_irq) from [<c01d1c38>] (sirfsoc_gpio_irq_startup+0x18/0x44)
    [    1.483661] [<c01d1c38>] (sirfsoc_gpio_irq_startup) from [<c0068438>] (irq_startup+0x34/0x6c)
    [    1.492163] [<c0068438>] (irq_startup) from [<c0067044>] (__setup_irq+0x450/0x4b8)
    [    1.499714] [<c0067044>] (__setup_irq) from [<c00671dc>] (request_threaded_irq+0xa8/0x128)
    [    1.507960] [<c00671dc>] (request_threaded_irq) from [<c00672a8>] (request_any_context_irq+0x4c/0x7c)
    [    1.517164] [<c00672a8>] (request_any_context_irq) from [<c02b4434>] (gpio_extcon_probe+0x144/0x1d4)
    [    1.526279] [<c02b4434>] (gpio_extcon_probe) from [<c0240170>] (platform_drv_probe+0x18/0x48)
    [    1.534783] [<c0240170>] (platform_drv_probe) from [<c023e8bc>] (driver_probe_device+0x120/0x238)
    [    1.543641] [<c023e8bc>] (driver_probe_device) from [<c023cf88>] (bus_for_each_drv+0x58/0x8c)
    [    1.552143] [<c023cf88>] (bus_for_each_drv) from [<c023e768>] (device_attach+0x74/0x88)
    [    1.560126] [<c023e768>] (device_attach) from [<c023de14>] (bus_probe_device+0x84/0xa8)
    [    1.568113] [<c023de14>] (bus_probe_device) from [<c023c398>] (device_add+0x440/0x520)
    [    1.576012] [<c023c398>] (device_add) from [<c023fee8>] (platform_device_add+0xb4/0x214)
    [    1.584084] [<c023fee8>] (platform_device_add) from [<c024048c>] (platform_device_register_full+0xb8/0xdc)
    [    1.593719] [<c024048c>] (platform_device_register_full) from [<c04c6d04>] (sirfsoc_init_late+0xec/0xf4)
    [    1.603185] [<c04c6d04>] (sirfsoc_init_late) from [<c04c249c>] (init_machine_late+0x20/0x28)
    [    1.611603] [<c04c249c>] (init_machine_late) from [<c00088e8>] (do_one_initcall+0xf8/0x144)
    [    1.619934] [<c00088e8>] (do_one_initcall) from [<c04c0c54>] (kernel_init_freeable+0x13c/0x1dc)
    [    1.628620] [<c04c0c54>] (kernel_init_freeable) from [<c03737cc>] (kernel_init+0xc/0x118)

Signed-off-by: Barry Song <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

Merge tag 'drm-intel-fixes-2014-03-04' of ssh://git.freedesktop.org/git/drm-intel into drm-fixes

Small fixes all around, mostly stable material. Please pull.

* tag 'drm-intel-fixes-2014-03-04' of ssh://git.freedesktop.org/git/drm-intel:
  drm/i915: Reject >165MHz modes w/ DVI monitors
  drm/i915: fix assert_cursor on BDW
  drm/i915: vlv: reserve GT power context early
  drm/i915: fix pch pci device enumeration
  drm/i915: Resolving the memory region conflict for Stolen area
  drm/i915: use backlight legacy combination mode also for i915gm/i945gm

spi: atmel: add missing spi_master_{resume,suspend} calls to PM callbacks

The PM callbacks implemented by the spi-atmel driver don't call
spi_master_{resume,suspend}, fix that.

Signed-off-by: Wenyou Yang <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

spi: coldfire-qspi: Fix getting correct address for *mcfqspi

dev_get_drvdata() returns the address of master rather than mcfqspi.

Fixes: af361079 (spi/coldfire-qspi: Drop extra calls to spi_master_get in suspend/resume functions)
Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]

spi: fsl-dspi: Fix getting correct address for master

Current code set platform drvdata to dspi. However, the code in dspi_suspend()
and dspi_resume() assumes the drvdata is the address of master.
Fix it by setting platform drvdata to master.

Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]

ASoC: pcm: free path list before exiting from error conditions

dpcm_path_get() allocates dynamic memory to hold path list.
Corresponding dpcm_path_put() must be called to free the memory.
dpcm_path_put() is not called under several error conditions.
This leads to memory leak.

Signed-off-by: Patrick Lai <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]

pinctrl: sh-pfc: r8a7791: SD1_CLK fix

Fix the SD1_CLK handling for r8a7791. Without this patch
it is impossible to request all pins needed for SDHI1 on
the Koelsch board.

Signed-off-by: Magnus Damm <[email protected]>
Acked-by: Laurent Pinchart <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: msm: make PINCTRL_MSM bool instead of tristate

Modular builds of pinctrl-msm break due to handle_bad_irq being
unexported for module use. For now, make PINCTRL_MSM 'bool'.

Signed-off-by: Josh Cartwright <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: sunxi: Fix interrupt register offset calculation

This fixing setting the interrupt type for eints >= 8.

Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: sunxi: Fix masking when setting irq type

Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

pinctrl: sunxi: use chained_irq_{enter, exit} for GIC compatibility

On tha Allwinner A20 SoC, the external interrupts on the pin controller
device are connected to the GIC. Without chained_irq_{enter, exit},
external GPIO interrupts, such as used by mmc core card detect, cause
the system to hang.

This issue was first encountered during my attempt to get out-of-band
interrupts for WiFi on the Cubietruck working. With David's new series
of sunci-mci using mmc slot-gpio for (GPIO interrupt based) card
detection, removing the SD card also causes my Cubietruck to hang. This
problem should extend to all Allwinner A20 based boards.

With this fix, the system no longer hangs when I remove or insert the
SD card. /proc/interrupts show that the interrupt has correctly fired.
However the system still does not detect card removal/insertion. I
believe this is another unrelated issue.

Cc: [email protected]
Signed-off-by: Chen-Yu Tsai <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

iser-target: Fix command leak for tx_desc->comp_llnode_batch

This patch addresses a number of active I/O shutdown issues
related to isert_cmd descriptors being leaked that are part
of a completion interrupt coalescing batch.

This includes adding logic in isert_cq_tx_comp_err() to
drain any associated tx_desc->comp_llnode_batch, as well
as isert_cq_drain_comp_llist() to drain any associated
isert_conn->conn_comp_llist.

Also, set tx_desc->llnode_active in isert_init_send_wr()
in order to determine when work requests need to be skipped
in isert_cq_tx_work() exception path code.

Finally, update isert_init_send_wr() to only allow interrupt
coalescing when ISER_CONN_UP.

Acked-by: Sagi Grimberg <[email protected]>
Cc: Or Gerlitz <[email protected]>
Cc: <[email protected]> #3.13+
Signed-off-by: Nicholas Bellinger <[email protected]>

iser-target: Ignore completions for FRWRs in isert_cq_tx_work

This patch changes IB_WR_FAST_REG_MR + IB_WR_LOCAL_INV related
work requests to include a ISER_FRWR_LI_WRID value in order to
signal isert_cq_tx_work() that these requests should be ignored.

This is necessary because even though IB_SEND_SIGNALED is not
set for either work request, during a QP failure event the work
requests will be returned with exception status from the TX
completion queue.

v2 changes:
- Rename ISER_FRWR_LI_WRID -> ISER_FASTREG_LI_WRID (Sagi)

Acked-by: Sagi Grimberg <[email protected]>
Cc: Or Gerlitz <[email protected]>
Cc: <[email protected]> #3.12+
Signed-off-by: Nicholas Bellinger <[email protected]>

iser-target: Fix post_send_buf_count for RDMA READ/WRITE

This patch fixes the incorrect setting of ->post_send_buf_count
related to RDMA WRITEs + READs where isert_rdma_rw->send_wr_num
was not being taken into account.

This includes incrementing ->post_send_buf_count within
isert_put_datain() + isert_get_dataout(), decrementing within
__isert_send_completion() + isert_response_completion(), and
clearing wr->send_wr_num within isert_completion_rdma_read()

This is necessary because even though IB_SEND_SIGNALED is
not set for RDMA WRITEs + READs, during a QP failure event
the work requests will be returned with exception status
from the TX completion queue.

Acked-by: Sagi Grimberg <[email protected]>
Cc: Or Gerlitz <[email protected]>
Cc: <[email protected]> #3.10+
Signed-off-by: Nicholas Bellinger <[email protected]>

iscsi/iser-target: Fix isert_conn->state hung shutdown issues

This patch addresses a couple of different hug shutdown issues
related to wait_event() + isert_conn->state. First, it changes
isert_conn->conn_wait + isert_conn->conn_wait_comp_err from
waitqueues to completions, and sets ISER_CONN_TERMINATING from
within isert_disconnect_work().

Second, it splits isert_free_conn() into isert_wait_conn() that
is called earlier in iscsit_close_connection() to ensure that
all outstanding commands have completed before continuing.

Finally, it breaks isert_cq_comp_err() into seperate TX / RX
related code, and adds logic in isert_cq_rx_comp_err() to wait
for outstanding commands to complete before setting ISER_CONN_DOWN
and calling complete(&isert_conn->conn_wait_comp_err).

Acked-by: Sagi Grimberg <[email protected]>
Cc: Or Gerlitz <[email protected]>
Cc: <[email protected]> #3.10+
Signed-off-by: Nicholas Bellinger <[email protected]>

iscsi/iser-target: Use list_del_init for ->i_conn_node

There are a handful of uses of list_empty() for cmd->i_conn_node
within iser-target code that expect to return false once a cmd
has been removed from the per connect list.

This patch changes all uses of list_del -> list_del_init in order
to ensure that list_empty() returns false as expected.

Acked-by: Sagi Grimberg <[email protected]>
Cc: Or Gerlitz <[email protected]>
Cc: <[email protected]> #3.10+
Signed-off-by: Nicholas Bellinger <[email protected]>

iscsi-target: Fix iscsit_get_tpg_from_np tpg_state bug

This patch fixes a bug in iscsit_get_tpg_from_np() where the
tpg->tpg_state sanity check was looking for TPG_STATE_FREE,
instead of != TPG_STATE_ACTIVE.

The latter is expected during a normal TPG shutdown once the
tpg_state goes into TPG_STATE_INACTIVE in order to reject any
new incoming login attempts.

Cc: <[email protected]> #3.10+
Signed-off-by: Nicholas Bellinger <[email protected]>

staging/cxt1e1/linux.c: Correct arbitrary memory write in c4_ioctl()

The function c4_ioctl() writes data from user in ifr->ifr_data
to the kernel struct data arg, without any iolen bounds checking.
This can lead to a arbitrary write outside of the struct data arg.
Corrected by adding bounds-checking of iolen before the copy_from_user().

Signed-off-by: Salva Peiró <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86, trace: Fix CR2 corruption when tracing page faults

The trace_do_page_fault function trigger tracepoint
and then handles the actual page fault.

This could lead to error if the tracepoint caused page
fault. The original cr2 value gets lost and the original
page fault handler kills current process with SIGSEGV.

This happens if you record page faults with callchain
data, the user part of it will cause tracepoint handler
to page fault:

# perf record -g -e exceptions:page_fault_user ls

Fixing this by saving the original cr2 value
and using it after tracepoint handler is done.

v2: Moving the cr2 read before exception_enter, because
it could trigger tracepoint as well.

Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Reported-by: Vince Weaver <[email protected]>
Tested-by: Vince Weaver <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Seiji Aguchi <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]

Merge tag 'efi-urgent' into x86/urgent

* Disable the new EFI 1:1 virtual mapping for SGI UV because using it
causes a crash during boot - Borislav Petkov

Signed-off-by: H. Peter Anvin <[email protected]>

x86/efi: Quirk out SGI UV

Alex reported hitting the following BUG after the EFI 1:1 virtual
mapping work was merged,

kernel BUG at arch/x86/mm/init_64.c:351!
invalid opcode: 0000 [#1] SMP
Call Trace:
  [<ffffffff818aa71d>] init_extra_mapping_uc+0x13/0x15
  [<ffffffff818a5e20>] uv_system_init+0x22b/0x124b
  [<ffffffff8108b886>] ? clockevents_register_device+0x138/0x13d
  [<ffffffff81028dbb>] ? setup_APIC_timer+0xc5/0xc7
  [<ffffffff8108b620>] ? clockevent_delta2ns+0xb/0xd
  [<ffffffff818a3a92>] ? setup_boot_APIC_clock+0x4a8/0x4b7
  [<ffffffff8153d955>] ? printk+0x72/0x74
  [<ffffffff818a1757>] native_smp_prepare_cpus+0x389/0x3d6
  [<ffffffff818957bc>] kernel_init_freeable+0xb7/0x1fb
  [<ffffffff81535530>] ? rest_init+0x74/0x74
  [<ffffffff81535539>] kernel_init+0x9/0xff
  [<ffffffff81541dfc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81535530>] ? rest_init+0x74/0x74

Getting this thing to work with the new mapping scheme would need more
work, so automatically switch to the old memmap layout for SGI UV.

Acked-by: Russ Anderson <[email protected]>
Cc: Alex Thorlton <[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>

c6x: fix build failure caused by cache.h

A patch to linux/irqflags.h uncovered a problem with c6x asm/cache.h
which causes a build failure:

/arch/c6x/include/asm/cache.h:63:20: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘c6x_cache_init’
extern void __init c6x_cache_init(void);

The asm/cache.h was relying on linux/irqflags.h to pull in linux/init.h
but the recent patch changed that. The c6x header should have included
linux/init.h all along.

Signed-off-by: Mark Salter <[email protected]>

Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes

wl1251: use skb_trim to make skb shorter

the current code is directly setting skb->len, which is not correct and
brings problems with HAVE_EFFICIENT_UNALIGNED_ACCESS enabled in config

Signed-off-by: Ivaylo Dimitrov <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

iwlwifi: fix and add 7265 series HW IDs

Update of the HW IDs for the 7265 series.

Signed-off-by: Oren Givon <[email protected]>
Signed-off-by: Emmanuel Grumbach <[email protected]>

iwlwifi: mvm: don't WARN when statistics are handled late

Since the statistics handler is asynchrous, it can very well
be that we will handle the statistics (hence the RSSI
fluctuation) when we already disassociated.
Don't WARN on this case.

This solves: https://bugzilla.redhat.com/show_bug.cgi?id=1071998

Cc: <[email protected]> [3.10+]
Fixes: 2b76ef13086f ("iwlwifi: mvm: implement reduced Tx power")
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Emmanuel Grumbach <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on
    error.  From Eytan Lifshitz.

2) Unintentional switch case fallthrough in nft_reject_inet_eval(),
    from Patrick McHardy.

3) Must check if payload lenth is a power of 2 in
    nft_payload_select_ops(), from Nikolay Aleksandrov.

4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the
    correct place when we invoke skb_checksum_setup().  From Wei Liu.

5) TUN driver should not advertise HW vlan offload features in
    vlan_features.  Fix from Fernando Luis Vazquez Cao.

6) IPV6_VTI needs to select NET_IPV_TUNNEL to avoid build errors, fix
    from Steffen Klassert.

7) Add missing locking in xfrm_migrade_state_find(), we must hold the
    per-namespace xfrm_state_lock while traversing the lists.  Fix from
    Steffen Klassert.

8) Missing locking in ath9k driver, access to tid->sched must be done
    under ath_txq_lock().  Fix from Stanislaw Gruszka.

9) Fix two bugs in TCP fastopen.  First respect the size argument given
    to tcp_sendmsg() in the fastopen path, and secondly prevent
    tcp_send_syn_data() from potentially using order-5 allocations.
    From Eric Dumazet.

10) Fix handling of default neigh garbage collection params, from Jiri
    Pirko.

11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation
    is in use.  From Eric Dumazet.

12) Missing initialization of Realtek r8169 driver's statistics
    seqlocks.  Fix from Kyle McMartin.

13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding
    driver, from Ding Tianhong.

14) Bonding slave release race can cause divide by zero, fix from
    Nikolay Aleksandrov.

15) Overzealous return from neigh_periodic_work() causes reachability
    time to not be computed.  Fix from Duain Jiong.

16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when
    a specific target is specified and found.  From Hans Schillstrom.

17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera.

18) Tail loss probe can calculate bogus RTTs due to missing packet
    marking on retransmit.  Fix from Yuchung Cheng.

19) We cannot do skb_dst_drop() in iptunnel_pull_header() because
    multicast loopback detection in later code paths need access to
    skb_rtable().  Fix from Xin Long.

20) The macvlan driver regresses in that it propagates lower device
    offload support disables into itself, causing severe slowdowns when
    running over a bridge.  Provide the software offloads always on
    macvlan devices to deal with this and the regression is gone.  From
    Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
  macvlan: Add support for 'always_on' offload features
  net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
  ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
  net: cpsw: fix cpdma rx descriptor leak on down interface
  be2net: isolate TX workarounds not applicable to Skyhawk-R
  be2net: Fix skb double free in be_xmit_wrokarounds() failure path
  be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode
  be2net: Fix to reset transparent vlan tagging
  qlcnic: dcb: a couple off by one bugs
  tcp: fix bogus RTT on special retransmission
  hsr: off by one sanity check in hsr_register_frame_in()
  can: remove CAN FD compatibility for CAN 2.0 sockets
  can: flexcan: factor out soft reset into seperate funtion
  can: flexcan: flexcan_remove(): add missing netif_napi_del()
  can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze
  can: flexcan: factor out transceiver {en,dis}able into seperate functions
  can: flexcan: fix transition from and to low power mode in chip_{en,dis}able
  can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
  can: flexcan: fix shutdown: first disable chip, then all interrupts
  USB AX88179/178A: Support D-Link DUB-1312
  ...

Merge tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of fixes here which ensure that regulators using the core
  support for GPIO enables work in all cases by ensuring that helpers
  are used consistently rather than open coding in places and hence not
  having GPIO support in some of them"

* tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: Replace direct ops->disable usage
  regulator: core: Replace direct ops->enable usage

Merge branch 'akpm' (patches from Andrew Morton)

Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton [email protected]>:
  mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness
  mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS
  MAINTAINERS: add and correct types of some "T:" entries
  MAINTAINERS: use tab for separator
  rapidio/tsi721: fix tasklet termination in dma channel release
  hfsplus: fix remount issue
  zram: avoid null access when fail to alloc meta
  sh: prefix sh-specific "CCR" and "CCR2" by "SH_"
  ocfs2: fix quota file corruption
  drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX
  kallsyms: fix absolute addresses for kASLR
  scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression
  mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking
  memcg: reparent charges of children before processing parent
  memcg: fix endless loop in __mem_cgroup_iter_next()
  lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock
  dma debug: account for cachelines and read-only mappings in overlap tracking
  mm: close PageTail race
  MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors

dm thin: synchronize the pool mode during suspend

Commit b5330655 ("dm thin: handle metadata failures more consistently")
increased potential for the pool's mode to be changed in response to
metadata operation failures.

When the pool mode is changed it isn't synchronized with the mode in
pool_features stored in the target's context (ti->private) that is used
as the basis for (re)establishing the pool mode during resume via
bind_control_target.

It is important that we synchronize the pool mode when it is changed
otherwise the pool may experience and unexpected mode transition on the
next resume (especially if there was no new table load).

Signed-off-by: Mike Snitzer <[email protected]>
Acked-by: Joe Thornber <[email protected]>

mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness

Jan Stancek reports manual page migration encountering allocation
failures after some pages when there is still plenty of memory free, and
bisected the problem down to commit 81c0a2bb515f ("mm: page_alloc: fair
zone allocator policy").

The problem is that GFP_THISNODE obeys the zone fairness allocation
batches on one hand, but doesn't reset them and wake kswapd on the other
hand. After a few of those allocations, the batches are exhausted and
the allocations fail.

Fixing this means either having GFP_THISNODE wake up kswapd, or
GFP_THISNODE not participating in zone fairness at all. The latter
seems safer as an acute bugfix, we can clean up later.

Reported-by: Jan Stancek <[email protected]>
Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: <[email protected]> [3.12+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS

When doing some numa tests on powerpc, I triggered an oops bug.  I find
it is caused by using page->_last_cpupid.  It should be initialized as
"-1 & LAST_CPUPID_MASK", but not "-1".  Otherwise, in task_numa_fault(),
we will miss the checking (last_cpupid == (-1 & LAST_CPUPID_MASK)).  And
finally cause an oops bug in task_numa_group(), since the online cpu is
less than possible cpu.  This happen with CONFIG_SPARSE_VMEMMAP disabled

Call trace:

  SMP NR_CPUS=64 NUMA PowerNV
  Modules linked in:
  CPU: 24 PID: 804 Comm: systemd-udevd Not tainted3.13.0-rc1+ #32
  task: c000001e2746aa80 ti: c000001e32c50000 task.ti:c000001e32c50000
  REGS: c000001e32c53510 TRAP: 0300   Not tainted(3.13.0-rc1+)
  MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR:28024424  XER: 20000000
  CFAR: c000000000009324 DAR: 7265717569726857 DSISR:40000000 SOFTE: 1
  NIP  .task_numa_fault+0x1470/0x2370
  LR  .task_numa_fault+0x1468/0x2370
  Call Trace:
   .task_numa_fault+0x1468/0x2370 (unreliable)
   .do_numa_page+0x480/0x4a0
   .handle_mm_fault+0x4ec/0xc90
   .do_page_fault+0x3a8/0x890
   handle_page_fault+0x10/0x30
  Instruction dump:
  3c82fefb 3884b138 48d9cff1 60000000 48000574 3c62fefb3863af78 3c82fefb
  3884b138 48d9cfd5 60000000 e93f0100 <812902e4> 7d2907b45529063e 7d2a07b4
  ---[ end trace 15f2510da5ae07cf ]---

Signed-off-by: Liu Ping Fan <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: add and correct types of some "T:" entries

Tree location entries should start with the appropriate type.

Add git to some, hg to another.

Neaten tree type description.

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: use tab for separator

Convert whitespace to single tab for separators.

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

rapidio/tsi721: fix tasklet termination in dma channel release

This patch is a modification of the patch originally proposed by
Xiaotian Feng <[email protected]>: https://lkml.org/lkml/2012/11/5/413
This new version disables DMA channel interrupts and ensures that the
tasklet wil not be scheduled again before calling tasklet_kill().

Unfortunately the updated patch was not released at that time due to
planned rework of Tsi721 mport driver to use threaded interrupts (which
has yet to happen).  Recently the issue was reported again:
https://lkml.org/lkml/2014/2/19/762.

Description from the original Xiaotian's patch:

"Some drivers use tasklet_disable in device remove/release process,
  tasklet_disable will inc tasklet->count and return.  If the tasklet is
  not handled yet under some softirq pressure, the tasklet will be
  placed on the tasklet_vec, never have a chance to be excuted.  This
  might lead to a heavy loaded ksoftirqd, wakeup with pending_softirq,
  but tasklet is disabled.  tasklet_kill should be used in this case."

This patch is applicable to kernel versions starting from v3.5.

Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Xiaotian Feng <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: <[email protected]> [3.5+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hfsplus: fix remount issue

Current implementation of HFS+ driver has small issue with remount
option.  Namely, for example, you are unable to remount from RO mode
into RW mode by means of command "mount -o remount,rw /dev/loop0
/mnt/hfsplus".  Trying to execute sequence of commands results in an
error message:

  mount /dev/loop0 /mnt/hfsplus
  mount -o remount,ro /dev/loop0 /mnt/hfsplus
  mount -o remount,rw /dev/loop0 /mnt/hfsplus

  mount: you must specify the filesystem type

  mount -t hfsplus -o remount,rw /dev/loop0 /mnt/hfsplus

  mount: /mnt/hfsplus not mounted or bad option

The reason of such issue is failure of mount syscall:

  mount("/dev/loop0", "/mnt/hfsplus", 0x2282a60, MS_MGC_VAL|MS_REMOUNT, NULL) = -1 EINVAL (Invalid argument)

Namely, hfsplus_parse_options_remount() method receives empty "input"
argument and return false in such case.  As a result, hfsplus_remount()
returns -EINVAL error code.

This patch fixes the issue by means of return true for the case of empty
"input" argument in hfsplus_parse_options_remount() method.

Signed-off-by: Vyacheslav Dubeyko <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

zram: avoid null access when fail to alloc meta

zram_meta_alloc could fail so caller should check it. Otherwise, your
system will hang.

Signed-off-by: Minchan Kim <[email protected]>
Acked-by: Jerome Marchand <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sh: prefix sh-specific "CCR" and "CCR2" by "SH_"

Commit bcf24e1daa94 ("mmc: omap_hsmmc: use the generic config for
omap2plus devices"), enabled the build for other platforms for compile
testing.

sh-allmodconfig now fails with:

    include/linux/omap-dma.h:171:8: error: expected identifier before numeric constant
    make[4]: *** [drivers/mmc/host/omap_hsmmc.o] Error 1

This happens because SuperH #defines "CCR", which is one of the enum
values in include/linux/omap-dma.h.  There's a similar issue with "CCR2"
on sh2a.

As "CCR" and "CCR2" are too generic names for global #defines, prefix
them with "SH_" to fix this.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: fix quota file corruption

Global quota files are accessed from different nodes. Thus we cannot
cache offset of quota structure in the quota file after we drop our node
reference count to it because after that moment quota structure may be
freed and reallocated elsewhere by a different node resulting in
corruption of quota file.

Fix the problem by clearing dq_off when we are releasing dquot structure.
We also remove the DB_READ_B handling because it is useless -
DQ_ACTIVE_B is set iff DQ_READ_B is set.

Signed-off-by: Jan Kara <[email protected]>
Cc: Goldwyn Rodrigues <[email protected]>
Cc: Joel Becker <[email protected]>
Reviewed-by: Mark Fasheh <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX

On exynos5250, exynos5420 and exynos5260 it was observed that, after 1
cycle of S2R, the rtc-tick occurs at a very fast rate as compared to the
rtc-tick occuring before S2R.

This patch fixes the above issue by correcting the wrong way of
save/restore of S3C2410_TICNT for TYPE_S3C64XX.

Signed-off-by: Vikas Sajjan <[email protected]>
Cc: Grant Likely <[email protected]>
Cc: Rob Herring <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kallsyms: fix absolute addresses for kASLR

Currently symbols that are absolute addresses are incorrectly displayed
in /proc/kallsyms if the kernel is loaded with kASLR.

The problem was that the scripts/kallsyms.c file which generates the
array of symbol names and addresses uses an relocatable value for all
symbols, even absolute symbols.  This patch fixes that.

Several kallsyms output in different boot states for comparison:

  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.nokaslr
  0000000000000000 D __per_cpu_start
  0000000000014280 D __per_cpu_end
  ffffffff810001c8 T _stext
  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr1
  000000001f200000 D __per_cpu_start
  000000001f214280 D __per_cpu_end
  ffffffffa02001c8 T _stext
  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr2
  000000000d400000 D __per_cpu_start
  000000000d414280 D __per_cpu_end
  ffffffff8e4001c8 T _stext
  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr-fixed
  0000000000000000 D __per_cpu_start
  0000000000014280 D __per_cpu_end
  ffffffffadc001c8 T _stext

Signed-off-by: Andy Honig <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Rusty Russell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression

LZ4 as implemented in the kernel differs from the default method now
used by the reference implementation of LZ4.  Until the in-kernel method
is updated to support the new default, passing the legacy flag (-l) to
the compressor is necessary.  Without this flag the kernel-generated,
LZ4-compressed initramfs is junk.

Kyungsik said:

: It seems that lz4 supports legacy format with the same option as lz4c
: does.  Just looking at the first few bytes of lz4 compressed image, we can
: see whether it is new format or not.
:
: It shows new format magic number without this patch.  New format magic
: number is 0x184d2204.
:
: $ hexdump -C ./initramfs_data.cpio.lz4 |more
: 00000000  04 22 4d 18 64 70 b9 69 (Little Endian)
: ...
:
: Currently kernel supports legacy format only.

Signed-off-by: Daniel M. Weeks <[email protected]>
Cc: Michal Marek <[email protected]>
Acked-by: Kyungsik Lee <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking

Daniel Borkmann reported a VM_BUG_ON assertion failing:

  ------------[ cut here ]------------
  kernel BUG at mm/mlock.c:528!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: ccm arc4 iwldvm [...]
   video
  CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8
  Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012
  task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000
  RIP: 0010:[<ffffffff81171ad0>]  [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0
  Call Trace:
    do_munmap+0x18f/0x3b0
    vm_munmap+0x41/0x60
    SyS_munmap+0x22/0x30
    system_call_fastpath+0x1a/0x1f
  RIP   munlock_vma_pages_range+0x2e0/0x2f0
  ---[ end trace a0088dcf07ae10f2 ]---

because munlock_vma_pages_range() thinks it's unexpectedly in the middle
of a THP page.  This can be reproduced with default config since 3.11
kernels.  A reproducer can be found in the kernel's selftest directory
for networking by running ./psock_tpacket.

The problem is that an order=2 compound page (allocated by
alloc_one_pg_vec_page() is part of the munlocked VM_MIXEDMAP vma (mapped
by packet_mmap()) and mistaken for a THP page and assumed to be order=9.

The checks for THP in munlock came with commit ff6a6da60b89 ("mm:
accelerate munlock() treatment of THP pages"), i.e.  since 3.9, but did
not trigger a bug.  It just makes munlock_vma_pages_range() skip such
compound pages until the next 512-pages-aligned page, when it encounters
a head page.  This is however not a problem for vma's where mlocking has
no effect anyway, but it can distort the accounting.

Since commit 7225522bb429 ("mm: munlock: batch non-THP page isolation
and munlock+putback using pagevec") this can trigger a VM_BUG_ON in
PageTransHuge() check.

This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a
list of flags that make vma's non-mlockable and non-mergeable.  The
reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is
already on the VM_SPECIAL list, and both are intended for non-LRU pages
where mlocking makes no sense anyway.  Related Lkml discussion can be
found in [2].

[1] tools/testing/selftests/net/psock_tpacket
[2] https://lkml.org/lkml/2014/1/10/427

Signed-off-by: Vlastimil Babka <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reported-by: Daniel Borkmann <[email protected]>
Tested-by: Daniel Borkmann <[email protected]>
Cc: Thomas Hellstrom <[email protected]>
Cc: John David Anglin <[email protected]>
Cc: HATAYAMA Daisuke <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Carsten Otte <[email protected]>
Cc: Jared Hulbert <[email protected]>
Tested-by: Hannes Frederic Sowa <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: <[email protected]> [3.11.x+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

memcg: reparent charges of children before processing parent

Sometimes the cleanup after memcg hierarchy testing gets stuck in
mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.

There may turn out to be several causes, but a major cause is this: the
workitem to offline parent can get run before workitem to offline child;
parent's mem_cgroup_reparent_charges() circles around waiting for the
child's pages to be reparented to its lrus, but it's holding
cgroup_mutex which prevents the child from reaching its
mem_cgroup_reparent_charges().

Further testing showed that an ordered workqueue for cgroup_destroy_wq
is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched
stage on the way can mess up the order before reaching the workqueue.

Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
all its children (and grandchildren, in the correct order) to have their
charges reparented first.

Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
Signed-off-by: Filipe Brandenburger <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Reviewed-by: Tejun Heo <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: <[email protected]> [v3.10+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

memcg: fix endless loop in __mem_cgroup_iter_next()

Commit 0eef615665ed ("memcg: fix css reference leak and endless loop in
mem_cgroup_iter") got the interaction with the commit a few before it
d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully
initialized") slightly wrong, and we didn't notice at the time.

It's elusive, and harder to get than the original, but for a couple of
days before rc1, I several times saw a endless loop similar to that
supposedly being fixed.

This time it was a tighter loop in __mem_cgroup_iter_next(): because we
can get here when our root has already been offlined, and the ordering
of conditions was such that we then just cycled around forever.

Fixes: 0eef615665ed ("memcg: fix css reference leak and endless loop in mem_cgroup_iter").
Signed-off-by: Hugh Dickins <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: <[email protected]> [3.12+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock

Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of

  BUG: sleeping function called from invalid context at kernel/fork.c:606
  in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff
  1 lock held by swapoff/1394:
   #0:  (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6

followed by

  ================================================
  [ BUG: lock held when returning to user space! ]
  3.14.0-rc1 #3 Not tainted
  ------------------------------------------------
  swapoff/1394 is leaving the kernel with locks still held!
  1 lock held by swapoff/1394:
   #0:  (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6

after which the system recovered nicely.

Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch.

Fixes e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff")

Signed-off-by: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

dma debug: account for cachelines and read-only mappings in overlap tracking

While debug_dma_assert_idle() checks if a given *page* is actively
undergoing dma the valid granularity of a dma mapping is a *cacheline*.
Sander's testing shows that the warning message "DMA-API: exceeded 7
overlapping mappings of pfn..." is falsely triggering.  The test is
simply mapping multiple cachelines in a given page.

Ultimately we want overlap tracking to be valid as it is a real api
violation, so we need to track active mappings by cachelines.  Update
the active dma tracking to use the page-frame-relative cacheline of the
mapping as the key, and update debug_dma_assert_idle() to check for all
possible mapped cachelines for a given page.

However, the need to track active mappings is only relevant when the
dma-mapping is writable by the device.  In fact it is fairly standard
for read-only mappings to have hundreds or thousands of overlapping
mappings at once.  Limiting the overlap tracking to writable
(!DMA_TO_DEVICE) eliminates this class of false-positive overlap
reports.

Note, the radix gang lookup is sub-optimal.  It would be best if it
stopped fetching entries once the search passed a page boundary.
Nevertheless, this implementation does not perturb the original net_dma
failing case.  That is to say the extra overhead does not show up in
terms of making the failing case pass due to a timing change.

References:
  http://marc.info/?l=linux-netdev&m=139232263419315&w=2
  http://marc.info/?l=linux-netdev&m=139217088107122&w=2

Signed-off-by: Dan Williams <[email protected]>
Reported-by: Sander Eikelenboom <[email protected]>
Reported-by: Dave Jones <[email protected]>
Tested-by: Dave Jones <[email protected]>
Tested-by: Sander Eikelenboom <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Francois Romieu <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Wei Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: close PageTail race

Commit bf6bddf1924e ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).

This results in a very rare NULL pointer dereference on the
aforementioned page_count(page).  Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.

This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation.  This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling.  The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.

This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.

Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.

Signed-off-by: David Rientjes <[email protected]>
Cc: Holger Kiehl <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Rafael Aquini <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors

We're more or less collecting EDAC patches already anyway so let's hold it
down so that get_maintainer sees it too.

Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Mauro Carvalho Chehab <[email protected]>
Cc: Doug Thompson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ALSA: hda - add automute fix for another dell AIO model

When plugging a headphone or headset, lots of noise is heard from
internal speaker, after changing the automute via amp instead of
pinctl, the noise disappears.

BugLink: https://bugs.launchpad.net/bugs/1268468
Cc: David Henningsson <[email protected]>
Cc: [email protected]
Signed-off-by: Hui Wang <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ASoC: n810: fix init with DT boot

Since 3.14-rc1 only DT boot has been supported on N810, so this
file fails to init. Make a minimal fix to retain functionality.
This file should be properly converted to DT in longer term.

There seems to be still other unresolved issues with N810 audio support,
but this patch is needed at minimum as otherwise the machine driver
probing would completely fail.

Signed-off-by: Aaro Koskinen <[email protected]>
Acked-by: Jarkko Nikula <[email protected]>
Signed-off-by: Mark Brown <[email protected]>

tracing: Do not add event files for modules that fail tracepoints

If a module fails to add its tracepoints due to module tainting, do not
create the module event infrastructure in the debugfs directory. As the events
will not work and worse yet, they will silently fail, making the user wonder
why the events they enable do not display anything.

Having a warning on module load and the events not visible to the users
will make the cause of the problem much clearer.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT"
Acked-by: Mathieu Desnoyers <[email protected]>
Cc: [email protected] # 2.6.31+
Cc: Rusty Russell <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>

dm snapshot: fix metadata corruption

Commit 55494bf2947dccdf2 ("dm snapshot: use dm-bufio") broke snapshots.
Before that 3.14-rc1 commit, loading a snapshot's list of exceptions
involved reading exception areas one by one into ps->area and inserting
those exceptions into the hash table. Commit 55494bf2947dccdf2 changed
it so that dm-bufio with prefetch is used to load exceptions in batchs.
Exceptions are loaded correctly, but ps->area is left uninitialized.
When a new exception is allocated, it is stored in this uninitialized
ps->area which will be written to the disk. This causes metadata
corruption.

Fix this corruption by copying the last area that was read via dm-bufio
into ps->area.

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>

libata: disable queued TRIM for Crucial M500 mSATA SSDs

Queued TRIM commands cause problems and silent file system corruption
on Crucial M500 SSDs. This patch disables them for the mSATA model of
the drive.

Signed-off-by: Marios Andreopoulos <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: [email protected] # 3.12+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=71371

dm: fix Kconfig indentation

Since DM_DEBUG_BLOCK_STACK_TRACING is a DM_PERSISTENT_DATA config option
move it from drivers/md/Kconfig to drivers/md/persistent-data/Kconfig.

Doing so fixes indentation for other DM config options.

Signed-off-by: Mike Snitzer <[email protected]>

macvlan: Add support for 'always_on' offload features

Macvlan currently inherits all of its features from the lower
device.  When lower device disables offload support, this causes
macvlan to disable offload support as well.  This causes
performance regression when using macvlan/macvtap in bridge
mode.

It can be easily demonstrated by creating 2 namespaces using
macvlan in bridge mode and running netperf between them:

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    20.00    1204.61

To restore the performance, we add software offload features
to the list of "always_on" features for macvlan.  This way
when a namespace or a guest using macvtap initially sends a
packet, this packet will not be segmented at macvlan level.
It will only be segmented when macvlan sends the packet
to the lower device.

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    20.00    5507.35

Fixes: 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 (macvtap: Add support of packet capture on macvtap device.)
Fixes: 797f87f83b60685ff8a13fa0572d2f10393c50d3 (macvlan: fix netdev feature propagation from lower device)
CC: Florian Westphal <[email protected]>
CC: Christian Borntraeger <[email protected]>
CC: Jason Wang <[email protected]>
CC: Michael S. Tsirkin <[email protected]>
Tested-by: Christian Borntraeger <[email protected]>
Signed-off-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream...

For the mac80211 bits, Johannes says:

"This time I have a fix to get out of an 'infinite error state' in case
regulatory domain updates failed and two fixes for VHT associations: one
to not disconnect immediately when the AP uses more bandwidth than the
new regdomain would allow after a change due to association country
information getting used, and one for an issue in the code where
mac80211 doesn't correctly ignore a reserved field and then uses an HT
instead of VHT association."

For the iwlwifi bits, Emmanuel says:

"Johannes fixes a long standing bug in the AMPDU status reporting.
Max fixes the listen time which was way too long and causes trouble
to several APs."

Along with those, Bing Zhao marks the mwifiex_usb driver as _not_
supporting USB autosuspend after a number of problems with that have
been reported.
====================

Signed-off-by: David S. Miller <[email protected]>

net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable

RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

A special case is when an endpoint requires COOKIE-ECHO
chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895, section 6.3. Receiving Authenticated Chunks says:

  The receiver MUST use the HMAC algorithm indicated in
  the HMAC Identifier field. If this algorithm was not
  specified by the receiver in the HMAC-ALGO parameter in
  the INIT or INIT-ACK chunk during association setup, the
  AUTH chunk and all the chunks after it MUST be discarded
  and an ERROR chunk SHOULD be sent with the error cause
  defined in Section 4.1. [...] If no endpoint pair shared
  key has been configured for that Shared Key Identifier,
  all authenticated chunks MUST be silently discarded. [...]

  When an endpoint requires COOKIE-ECHO chunks to be
  authenticated, some special procedures have to be followed
  because the reception of a COOKIE-ECHO chunk might result
  in the creation of an SCTP association. If a packet arrives
  containing an AUTH chunk as a first chunk, a COOKIE-ECHO
  chunk as the second chunk, and possibly more chunks after
  them, and the receiver does not have an STCB for that
  packet, then authentication is based on the contents of
  the COOKIE-ECHO chunk. In this situation, the receiver MUST
  authenticate the chunks in the packet by using the RANDOM
  parameters, CHUNKS parameters and HMAC_ALGO parameters
  obtained from the COOKIE-ECHO chunk, and possibly a local
  shared secret as inputs to the authentication procedure
  specified in Section 6.3. If authentication fails, then
  the packet is discarded. If the authentication is successful,
  the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
  MUST be processed. If the receiver has an STCB, it MUST
  process the AUTH chunk as described above using the STCB
  from the existing association to authenticate the
  COOKIE-ECHO chunk and all the chunks after it. [...]

Commit bbd0d59809f9 introduced the possibility to receive
and verification of AUTH chunk, including the edge case for
authenticated COOKIE-ECHO. On reception of COOKIE-ECHO,
the function sctp_sf_do_5_1D_ce() handles processing,
unpacks and creates a new association if it passed sanity
checks and also tests for authentication chunks being
present. After a new association has been processed, it
invokes sctp_process_init() on the new association and
walks through the parameter list it received from the INIT
chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
and SCTP_PARAM_CHUNKS, and copies them into asoc->peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.

Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is
available, we set up a fake auth chunk and pass that on to
sctp_sf_authenticate(), which at latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() by using asoc->asoc_shared_key that is
NULL as condition key_id == asoc->active_key_id is true if
the AUTH chunk was injected correctly from remote. This
happens no matter what net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate() as no key is activated in
sctp_auth_asoc_init_active_key() for each case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we didn't even receive a parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
MUST silently discard, which we are doing from now onwards.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809f9 created a skb clone in that case.

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: Neil Horman <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge tag 'linux-can-fixes-for-3.14-20140303' of git://gitorious.org/linux-can/linux-can

linux-can-fixes-for-3.14-20140303

Marc Kleine-Budde says:

====================
this is a pull request of 8 patches. Oliver Hartkopp contributes a patch which
removes the CAN FD compatibility for CAN 2.0 sockets, as it turns out that this
compatibility has some conceptual cornercases. The remaining 7 patches are by
me, they address a problem in the flexcan driver. When shutting down the
interface ("ifconfig can0 down") under heavy network load the whole system will
hang. This series reworks the actual sequence in close() and the transition
from and to the low power modes of the CAN controller.
====================

Signed-off-by: David S. Miller <[email protected]>

ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer

when ip_tunnel process multicast packets, it may check if the packet is looped
back packet though 'rt_is_output_route(skb_rtable(skb))' in ip_tunnel_rcv(),
but before that , skb->_skb_refdst has been dropped in iptunnel_pull_header(),
so which leads to a panic.

fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: cpsw: fix cpdma rx descriptor leak on down interface

This patch fixes a CPDMA RX Descriptor leak that occurs after taking
the interface down when the CPSW is in Dual MAC mode. Previously
the CPSW_ALE port was left open up which causes packets to be received
and processed by the RX interrupt handler and were passed to the
non active network interface where they were ignored.

The fix is for the slave_stop function of the selected interface
to disable the respective CPSW_ALE Port from forwarding packets. This
blocks traffic from being received on the inactive interface.

Signed-off-by: Schuyler Patton <[email protected]>
Reviewed-by: Felipe Balbi <[email protected]>
Signed-off-by: Mugunthan V N <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: isolate TX workarounds not applicable to Skyhawk-R

Some of TX workarounds in be_xmit_workarounds() routine
are not applicable (and result in HW errors) to Skyhawk-R chip.
Isolate BE3-R/Lancer specific workarounds to a separate routine.

Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: Fix skb double free in be_xmit_wrokarounds() failure path

skb_padto(), skb_share_check() and __vlan_put_tag() routines free
skb when they return an error. This patch fixes be_xmit_workarounds()
to not free skb again in such cases.

Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Sathya Perla <[email protected]>
Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode

We should clear promiscuous bits in adapter->flags while disabling promiscuous
mode. Else we will not put interface back into VLAN promisc mode if the vlans
already added exceeds the maximum limit.

Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

be2net: Fix to reset transparent vlan tagging

For disabling transparent tagging issue SET_HSW_CONFIG with pvid_valid=1
and pvid=0xFFFF and not with the default pvid as this case would fail in Lancer.
Hence removing the get_hsw_config call from be_vf_setup() as it's
only use of getting default pvid is no longer needed.

Also do proper housekeeping only if the FW command succeeds.

Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>