Git Repo - linux.git/log

uas: Do not blacklist ASM1153 disk enclosures

Our detection logic to avoid doing UAS on ASM1051 bridge chips causes problems
with newer ASM1153 disk enclosures in 2 ways:

1) Some ASM1153 disk enclosures re-use the ASM1051 device-id of 5106, which
   we assume is always an ASM1051, so remove the quirk for 5106, and instead
   use the same detection logic as we already use for device-id 55aa, which is
   used for all of ASM1051, ASM1053 and ASM1153 devices <sigh>.

2) Our detection logic to differentiate between ASM1051 and ASM1053 sees
   ASM1153 devices as ASM1051 because they have 32 streams like ASM1051 devs.
   Luckily the ASM1153 descriptors are not 100% identical, unlike the previous
   models the ASM1153 has bMaxPower == 0, so use that to differentiate it.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: gadget: udc: avoid dereference before NULL check in ep_queue

Coverity: CID 1260069

Signed-off-by: John W. Linville <[email protected]>
Cc: Felipe Balbi <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: host: ehci-tegra: request deferred probe when failing to get phy

The commit 1290a958d48e ("usb: phy: propagate __of_usb_find_phy()'s error on
failure") changed the condition to return -EPROBE_DEFER to host driver.
Originally the Tegra host driver depended on the returned -EPROBE_DEFER to
get the phy device later when booting. Now we have to do that explicitly.

Signed-off-by: Vince Hsu <[email protected]>
Tested-by: Tomeu Vizoso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

uas: disable UAS on Apricorn SATA dongles

The Apricorn SATA dongle will occasionally return "USBSUSBSUSB" in
response to SCSI commands when running in UAS mode. Therefore,
disable UAS mode on this dongle.

Signed-off-by: Darrick J. Wong <[email protected]>
Acked-by: Hans de Goede <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS566 with usb-id 0bc2:a013

Like the JMicron JMS567 enclosures with the JMS566 choke on report-opcodes,
so avoid it.

Tested-and-reported-by: Takeo Nakayama <[email protected]>
Cc: [email protected] # 3.16
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

uas: Add US_FL_NO_ATA_1X for Seagate devices with usb-id 0bc2:a013

This is yet another Seagate device which needs the US_FL_NO_ATA_1X quirk

Reported-by: Marcin Zajączkowski <[email protected]>
Cc: [email protected] # 3.16
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Add broken-streams quirk for Fresco Logic FL1000G xhci controllers

Streams do not work reliabe on Fresco Logic FL1000G xhci controllers,
trying to use them results in errors like this:

21:37:33 kernel: xhci_hcd 0000:04:00.0: ERROR Transfer event for disabled endpoint or incorrect stream ring
21:37:33 kernel: xhci_hcd 0000:04:00.0: @00000000368b3570 9067b000 00000000 05000000 01078001
21:37:33 kernel: xhci_hcd 0000:04:00.0: ERROR Transfer event for disabled endpoint or incorrect stream ring
21:37:33 kernel: xhci_hcd 0000:04:00.0: @00000000368b3580 9067b400 00000000 05000000 01038001

As always I've ordered a pci-e addon card with a Fresco Logic controller for
myself to see if I can come up with a better fix then the big hammer, in
the mean time this will make uas devices work again (in usb-storage mode)
for FL1000G users.

Reported-by: Marcin Zajączkowski <[email protected]>
Cc: [email protected] # 3.15
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: EHCI: adjust error return code

The USB stack uses error code -ENOSPC to indicate that the periodic
schedule is too full, with insufficient bandwidth to accommodate a new
allocation. It uses -EFBIG to indicate that an isochronous transfer
could not be linked into the schedule because it would exceed the
number of isochronous packets the host controller driver can handle
(generally because the new transfer would extend too far into the
future).

ehci-hcd uses the wrong error code at one point. This patch fixes it,
along with a misleading comment and debugging message.

Signed-off-by: Alan Stern <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: EHCI: fix initialization bug in iso_stream_schedule()

Commit c3ee9b76aa93 (EHCI: improved logic for isochronous scheduling)
introduced the idea of using ehci->last_iso_frame as the origin (or
base) for the circular calculations involved in modifying the
isochronous schedule. However, the new code it added used
ehci->last_iso_frame before the value was properly initialized. This
patch rectifies the mistake by moving the initialization lines earlier
in iso_stream_schedule().

This fixes Bugzilla #72891.

Signed-off-by: Alan Stern <[email protected]>
Fixes: c3ee9b76aa93
Reported-by: Joe Bryant <[email protected]>
Tested-by: Joe Bryant <[email protected]>
Tested-by: Martin Long <[email protected]>
CC: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Check if slot is already in default state before moving it there

Solves xhci error cases with debug messages:
xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 1.
usb 1-6: hub failed to enable device, error -22

xhci will give a context state error if we try to set a slot in default
state to the same default state with a special address device command.

Turns out this happends in several cases:
- retry reading the device rescriptor in hub_port_init()
- usb_reset_device() is called for a slot in default state
- in resume path, usb_port_resume() calls hub_port_init()

The default state is usually reached from most states with a reset device
command without any context state errors, but using the address device
command with BSA bit set (block set address) only works from the enabled
state and will otherwise cause context error.

solve this by checking if we are already in the default state before issuing
a address device BSA=1 command.

Fixes: 48fc7dbd52c0 ("usb: xhci: change enumeration scheme to 'new scheme'")
Cc: <[email protected]> # v3.14+
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "usb: chipidea: remove duplicate dev_set_drvdata for host_start"

This reverts commit 14b4099c074f2ddf4d84b22d370170e61b527529

It moved platform_set_drvdata(pdev, ci) before hcd is created,
and the hcd will assign itself as ci controller's drvdata during
the hcd creation function (in usb_create_shared_hcd), so it
overwrites the real ci's drvdata which we want to use.

So, if the controller is at host mode, the system suspend
API will get the wrong struct ci_hdrc pointer, and cause the
oops.

Signed-off-by: Peter Chen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Merge tag 'for-3.19-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-linus

Kishon writes:

misc fixes in PHY drivers

scsi: ->queue_rq can't sleep

The blk-mq ->queue_rq method is always called from process context,
but might have preemption disabled. This means we still always
have to use GFP_ATOMIC for memory allocations, and thus need to
revert part of commit 3c356bde1 ("scsi: stop passing a gfp_mask
argument down the command setup path").

Signed-off-by: Christoph Hellwig <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Tested-by: Alexei Starovoitov <[email protected]>

HID: roccat: potential out of bounds in pyra_sysfs_write_settings()

This is a static checker fix. We write some binary settings to the
sysfs file. One of the settings is the "->startup_profile". There
isn't any checking to make sure it fits into the
pyra->profile_settings[] array in the profile_activated() function.

I added a check to pyra_sysfs_write_settings() in both places because
I wasn't positive that the other callers were correct.

Cc: <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

MAINTAINERS: Update maintainer list for qla4xxx

Signed-off-by: Nilesh Javali <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>

mutex: Always clear owner field upon mutex_unlock()

Currently if DEBUG_MUTEXES is enabled, the mutex->owner field is only
cleared iff debug_locks is active. This exposes a race to other users of
the field where the mutex->owner may be still set to a stale value,
potentially upsetting mutex_spin_on_owner() among others.

References: https://bugs.freedesktop.org/show_bug.cgi?id=87955
Signed-off-by: Chris Wilson <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group()

When alloc_fair_sched_group() in sched_create_group() fails,
free_sched_group() is called, and free_fair_sched_group() is called by
free_sched_group(). Since destroy_cfs_bandwidth() is called by
free_fair_sched_group() without calling init_cfs_bandwidth(),
RCU stall occurs at hrtimer_cancel():

  INFO: rcu_sched self-detected stall on CPU { 1}  (t=60000 jiffies g=13074 c=13073 q=0)
  Task dump for CPU 1:
  (fprintd)       R  running task        0  6249      1 0x00000088
  ...
  Call Trace:
   <IRQ>  [<ffffffff81094988>] sched_show_task+0xa8/0x110
   [<ffffffff81097acd>] dump_cpu_task+0x3d/0x50
   [<ffffffff810c3a80>] rcu_dump_cpu_stacks+0x90/0xd0
   [<ffffffff810c7751>] rcu_check_callbacks+0x491/0x700
   [<ffffffff810cbf2b>] update_process_times+0x4b/0x80
   [<ffffffff810db046>] tick_sched_handle.isra.20+0x36/0x50
   [<ffffffff810db0a2>] tick_sched_timer+0x42/0x70
   [<ffffffff810ccb19>] __run_hrtimer+0x69/0x1a0
   [<ffffffff810db060>] ? tick_sched_handle.isra.20+0x50/0x50
   [<ffffffff810ccedf>] hrtimer_interrupt+0xef/0x230
   [<ffffffff810452cb>] local_apic_timer_interrupt+0x3b/0x70
   [<ffffffff8164a465>] smp_apic_timer_interrupt+0x45/0x60
   [<ffffffff816485bd>] apic_timer_interrupt+0x6d/0x80
   <EOI>  [<ffffffff810cc588>] ? lock_hrtimer_base.isra.23+0x18/0x50
   [<ffffffff81193cf1>] ? __kmalloc+0x211/0x230
   [<ffffffff810cc9d2>] hrtimer_try_to_cancel+0x22/0xd0
   [<ffffffff81193cf1>] ? __kmalloc+0x211/0x230
   [<ffffffff810ccaa2>] hrtimer_cancel+0x22/0x30
   [<ffffffff810a3cb5>] free_fair_sched_group+0x25/0xd0
   [<ffffffff8108df46>] free_sched_group+0x16/0x40
   [<ffffffff810971bb>] sched_create_group+0x4b/0x80
   [<ffffffff810aa383>] sched_autogroup_create_attach+0x43/0x1c0
   [<ffffffff8107dc9c>] sys_setsid+0x7c/0x110
   [<ffffffff81647729>] system_call_fastpath+0x12/0x17

Check whether init_cfs_bandwidth() was called before calling
destroy_cfs_bandwidth().

Signed-off-by: Tetsuo Handa <[email protected]>
[ Move the check into destroy_cfs_bandwidth() to aid compilability. ]
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched/deadline: Avoid double-accounting in case of missed deadlines

The dl_runtime_exceeded() function is supposed to ckeck if
a SCHED_DEADLINE task must be throttled, by checking if its
current runtime is <= 0. However, it also checks if the
scheduling deadline has been missed (the current time is
larger than the current scheduling deadline), further
decreasing the runtime if this happens.
This "double accounting" is wrong:

- In case of partitioned scheduling (or single CPU), this
  happens if task_tick_dl() has been called later than expected
  (due to small HZ values). In this case, the current runtime is
  also negative, and replenish_dl_entity() can take care of the
  deadline miss by recharging the current runtime to a value smaller
  than dl_runtime

- In case of global scheduling on multiple CPUs, scheduling
  deadlines can be missed even if the task did not consume more
  runtime than expected, hence penalizing the task is wrong

This patch fix this problem by throttling a SCHED_DEADLINE task
only when its runtime becomes negative, and not modifying the runtime

Signed-off-by: Luca Abeni <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Juri Lelli <[email protected]>
Cc: <[email protected]>
Cc: Dario Faggioli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched/deadline: Fix migration of SCHED_DEADLINE tasks

According to global EDF, tasks should be migrated between runqueues
without checking if their scheduling deadlines and runtimes are valid.
However, SCHED_DEADLINE currently performs such a check:
a migration happens doing:

deactivate_task(rq, next_task, 0);
set_task_cpu(next_task, later_rq->cpu);
activate_task(later_rq, next_task, 0);

which ends up calling dequeue_task_dl(), setting the new CPU, and then
calling enqueue_task_dl().

enqueue_task_dl() then calls enqueue_dl_entity(), which calls
update_dl_entity(), which can modify scheduling deadline and runtime,
breaking global EDF scheduling.

As a result, some of the properties of global EDF are not respected:
for example, a taskset {(30, 80), (40, 80), (120, 170)} scheduled on
two cores can have unbounded response times for the third task even
if 30/80+40/80+120/170 = 1.5809 < 2

This can be fixed by invoking update_dl_entity() only in case of
wakeup, or if this is a new SCHED_DEADLINE task.

Signed-off-by: Luca Abeni <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Juri Lelli <[email protected]>
Cc: <[email protected]>
Cc: Dario Faggioli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched: Fix odd values in effective_load() calculations

In effective_load, we have (long w * unsigned long tg->shares) / long W,
when w is negative, it is cast to unsigned long and hence the product is
insanely large. Fix this by casting tg->shares to long.

Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

sched, fanotify: Deal with nested sleeps

As per e23738a7300a ("sched, inotify: Deal with nested sleeps").

fanotify_read is a wait loop with sleeps in. Wait loops rely on
task_struct::state and sleeps do too, since that's the only means of
actually sleeping. Therefore the nested sleeps destroy the wait loop
state and the wait loop breaks the sleep functions that assume
TASK_RUNNING (mutex_lock).

Fix this by using the new woken_wake_function and wait_woken() stuff,
which registers wakeups in wait and thereby allows shrinking the
task_state::state changes to the actual sleep part.

Reported-by: Yuanhan Liu <[email protected]>
Reported-by: Sedat Dilek <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Takashi Iwai <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Eric Paris <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes

There was another report of a boot failure with a #GP fault in the
uncore SBOX initialization. The earlier work around was not enough
for this system.

The boot was failing while trying to initialize the third SBOX.

This patch detects parts with only two SBOXes and limits the number
of SBOX units to two there.

Stable material, as it affects boot problems on 3.18.

Tested-by: Andreas Oehler <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Yan, Zheng <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

perf/x86_64: Improve user regs sampling

Perf reports user regs for kernel-mode samples so that samples can
be backtraced through user code. The old code was very broken in
syscall context, resulting in useless backtraces.

The new code, in contrast, is still dangerously racy, but it should
at least work most of the time.

Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Wu Fengguang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/243560c26ff0f739978e2459e203f6515367634d.1420396372.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <[email protected]>

perf: Move task_pt_regs sampling into arch code

On x86_64, at least, task_pt_regs may be only partially initialized
in many contexts, so x86_64 should not use it without extra care
from interrupt context, let alone NMI context.

This will allow x86_64 to override the logic and will supply some
scratch space to use to make a cleaner copy of user regs.

Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Cc: Wu Fengguang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jean Pihet <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/e431cd4c18c2e1c44c774f10758527fb2d1025c4.1420396372.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <[email protected]>

x86: Fix off-by-one in instruction decoder

Stephane reported that the PEBS fixup was broken by the recent commit to
the instruction decoder. The thing had an off-by-one which resulted in
not being able to decode the last instruction and always bail.

Reported-by: Stephane Eranian <[email protected]>
Fixes: 6ba48ff46f76 ("x86: Remove arbitrary instruction size limit in instruction decoder")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected] # 3.18
Cc: <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Liang Kan <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Jim Keniston <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Free callchains when hist entries are deleted, plugging a massive leak in
'top -g', where hist_entries (and its callchains) are decayed over time. (Namhyung Kim)

- Fix segfault when showing callchain in the hists browser (report & top) (Namhyung Kim)

- Fix children sort key behavior, and also the 'perf test 32' test that
was failing due to reliance on undefined behaviour (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

s390/bpf: Fix JMP_JGE_X (A > X) and JMP_JGT_X (A >= X)

Currently the signed COMPARE (cr) instruction is used to compare "A"
with "X". This is not correct because "A" and "X" are both unsigned.
To fix this use the unsigned COMPARE LOGICAL (clr) instruction instead.

Signed-off-by: Michael Holzheu <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

s390/bpf: Fix ALU_NEG (A = -A)

Currently the LOAD NEGATIVE (lnr) instruction is used for ALU_NEG. This
instruction always loads the negative value. Therefore, if A is already
negative, it remains unchanged. To fix this use LOAD COMPLEMENT (lcr)
instead.

Signed-off-by: Michael Holzheu <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

gpio: dln2: use bus_sync_unlock instead of scheduling work

Use the irq_chip bus_sync_unlock method to update hardware registers
instead of scheduling work from the mask/unmask methods. This simplifies
a bit the driver and make it more uniform with the other GPIO IRQ
drivers.

Signed-off-by: Octavian Purdila <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

net: fec: fix NULL pointer dereference in fec_enet_timeout_work

This patch initialises the fep->netdev pointer. This pointer was not
initialised at all, but is used in fec_enet_timeout_work and in some
error paths.

Signed-off-by: Hubert Feurstein <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sh_eth: Fix access to TRSCER register

TRSCER register is configured differently by SoCs. TRSCER of R-Car Gen2 is
RINT8 bit only valid, other bits are reserved bits. This removes access to
TRSCER register reserve bit by adding variable trscer_err_mask to
sh_eth_cpu_data structure, set the register information to each SoCs.

Signed-off-by: Nobuhiro Iwamatsu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

sh-eth: Set fdr_value of R-Car SoCs

FDR register of R-Car set in fdr_value can have the original settings.
This sets the value that is suitable for each SoCs to fdr_value of R8A777x
and R8A779x.

Signed-off-by: Nobuhiro Iwamatsu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-01-06

This series contains fixes to i40e only.

Jesse provides a fix for when the driver was polling with interrupts
disabled the hardware would occasionally not write back descriptors.
His fix causes the driver to detect this situation and force an interrupt
to fire which will flush the stuck descriptor.

Anjali provides a couple of fixes, the first corrects an issue where
the receive port checksum error counter was incrementing incorrectly with
UDP encapsulated tunneled traffic.  The second fix resolves an issue where
the driver was examining the outer protocol layer to set the inner protocol
layer checksum offload.  In the case of TCP over IPv6 over an IPv4 based
VXLAN, the inner checksum offloads would be set to look for IPv4/UDP
instead of IPv6/TCP, so fixed the issue so that the driver will look at
the proper layer for encapsulation offload settings.

v2: fixed a bug in patch 01 of the series, where the interrupt rate impacted
    4 port workloads by reducing throughput.
====================

Signed-off-by: David S. Miller <[email protected]>

Merge tag 'iio-fixes-for-3.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

First round of IIO fixes for the 3.19 cycle.

* ad799x fix ad7991/ad7995/ad7999 setup as they do not have a configuration
  register to write to.  It is written during the convesion sequence. As
  such we don't want to write to it at other times.
* Fix iio_channel_read utility function to return to ensure it is apparent
  if the relevant element is not there. This avoids using a wrong value
  if some channels have the element and others do not.

mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed

Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
stuck in a busy loop with nothing left to balance, but
kswapd_try_to_sleep() failing to sleep.  Their analysis found the cause
to be a combination of several factors:

1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait

2. The process has been killed (by OOM in this case), but has not yet been
   scheduled to remove itself from the waitqueue and die.

3. kswapd checks for throttled processes in prepare_kswapd_sleep():

        if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
                wake_up(&pgdat->pfmemalloc_wait);
return false; // kswapd will not go to sleep
}

   However, for a process that was already killed, wake_up() does not remove
   the process from the waitqueue, since try_to_wake_up() checks its state
   first and returns false when the process is no longer waiting.

4. kswapd is running on the same CPU as the only CPU that the process is
   allowed to run on (through cpus_allowed, or possibly single-cpu system).

5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
   encounters no voluntary preemption points and repeatedly fails
   prepare_kswapd_sleep(), blocking the process from running and removing
   itself from the waitqueue, which would let kswapd sleep.

So, the source of the problem is that we prevent kswapd from going to
sleep until there are processes waiting on the pfmemalloc_wait queue,
and a process waiting on a queue is guaranteed to be removed from the
queue only when it gets scheduled.  This was done to make sure that no
process is left sleeping on pfmemalloc_wait when kswapd itself goes to
sleep.

However, it isn't necessary to postpone kswapd sleep until the
pfmemalloc_wait queue actually empties.  To prevent processes from being
left sleeping, it's actually enough to guarantee that all processes
waiting on pfmemalloc_wait queue have been woken up by the time we put
kswapd to sleep.

This patch therefore fixes this issue by substituting 'wake_up' with
'wake_up_all' and removing 'return false' in the code snippet from
prepare_kswapd_sleep() above.  Note that if any process puts itself in
the queue after this waitqueue_active() check, or after the wake up
itself, it means that the process will also wake up kswapd - and since
we are under prepare_to_wait(), the wake up won't be missed.  Also we
update the comment prepare_kswapd_sleep() to hopefully more clearly
describe the races it is preventing.

Fixes: 5515061d22f0 ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
Signed-off-by: Vlastimil Babka <[email protected]>
Signed-off-by: Vladimir Davydov <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Cc: <[email protected]> [3.6+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

memcg: fix destination cgroup leak on task charges migration

We are supposed to take one css reference per each memory page and per
each swap entry accounted to a memory cgroup. However, during task
charges migration we take a reference to the destination cgroup twice
per each swap entry: first in mem_cgroup_do_precharge()->try_charge()
and then in mem_cgroup_move_swap_account(), permanently leaking the
destination cgroup.

The hunk taking the second reference seems to be a leftover from the
pre-00501b531c472 ("mm: memcontrol: rewrite charge API") era. Remove it
to fix the leak.

Fixes: e8ea14cc6ead (mm: memcontrol: take a css reference for each charged page)
Signed-off-by: Vladimir Davydov <[email protected]>
Cc: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: memcontrol: switch soft limit default back to infinity

Commit 3e32cb2e0a12 ("mm: memcontrol: lockless page counters")
accidentally switched the soft limit default from infinity to zero,
which turns all memcgs with even a single page into soft limit excessors
and engages soft limit reclaim on all of them during global memory
pressure. This makes global reclaim generally more aggressive, but also
inverts the meaning of existing soft limit configurations where unset
soft limits are usually more generous than set ones.

Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/debug_pagealloc: remove obsolete Kconfig options

These are obsolete since commit e30825f1869a ("mm/debug-pagealloc:
prepare boottime configurable") was merged. So remove them.

[[email protected]: find obsolete Kconfig options]
Signed-off-by: Joonsoo Kim <[email protected]>
Cc: Paul Bolle <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michal Nazarewicz <[email protected]>
Cc: Jungsoo Son <[email protected]>
Acked-by: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

vfs: renumber FMODE_NONOTIFY and add to uniqueness check

Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc. The
clashing O_PATH value was added in commit 5229645bdc35 ("vfs: add
nonconflicting values for O_PATH") but this can't be changed as it is
user-visible.

FMODE_NONOTIFY is only used internally in the kernel, but it is in the
same numbering space as the other O_* flags, as indicated by the comment
at the top of include/uapi/asm-generic/fcntl.h (and its use in
fs/notify/fanotify/fanotify_user.c). So renumber it to avoid the clash.

All of this has happened before (commit 12ed2e36c98a: "fanotify:
FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will
happen again -- so update the uniqueness check in fcntl_init() to
include __FMODE_NONOTIFY.

Signed-off-by: David Drysdale <[email protected]>
Acked-by: David S. Miller <[email protected]>
Acked-by: Jan Kara <[email protected]>
Cc: Heinrich Schuchardt <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Eric Paris <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

arch/blackfin/mach-bf533/boards/stamp.c: add linux/delay.h

build error

arch/blackfin/mach-bf533/boards/stamp.c:834:2: error: implicit declaration of function 'mdelay'

Signed-off-by: Oleg Nesterov <[email protected]>
Reported-by: Wu Fengguang <[email protected]>
Acked-by: Mike Frysinger <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: fix the wrong directory passed to ocfs2_lookup_ino_from_name() when link file

In ocfs2_link(), the parent directory inode passed to function
ocfs2_lookup_ino_from_name() is wrong.  Parameter dir is the parent of
new_dentry not old_dentry.  We should get old_dir from old_dentry and
lookup old_dentry in old_dir in case another node remove the old dentry.

With this change, hard linking works again, when paths are relative with
at least one subdirectory.  This is how the problem was reproducable:

  # mkdir a
  # mkdir b
  # touch a/test
  # ln a/test b/test
  ln: failed to create hard link `b/test' => `a/test': No such file or  directory

However when creating links in the same dir, it worked well.

Now the link gets created.

Fixes: 0e048316ff57 ("ocfs2: check existence of old dentry in ocfs2_link()")
Signed-off-by: joyce.xue <[email protected]>
Reported-by: Szabo Aron - UBIT <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Tested-by: Aron Szabo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

MAINTAINERS: update rydberg's addresses

My ISP finally gave up on the old mail address, so I am moving things
over to bitmath.org instead. Also change the status fields to better
reflect reality.

Signed-off-by: Henrik Rydberg <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: protect set_page_dirty() from ongoing truncation

Tejun, while reviewing the code, spotted the following race condition
between the dirtying and truncation of a page:

__set_page_dirty_nobuffers()       __delete_from_page_cache()
  if (TestSetPageDirty(page))
                                     page->mapping = NULL
     if (PageDirty())
       dec_zone_page_state(page, NR_FILE_DIRTY);
       dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
    if (page->mapping)
      account_page_dirtied(page)
        __inc_zone_page_state(page, NR_FILE_DIRTY);
__inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);

which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.

Dirtiers usually lock out truncation, either by holding the page lock
directly, or in case of zap_pte_range(), by pinning the mapcount with
the page table lock held.  The notable exception to this rule, though,
is do_wp_page(), for which this race exists.  However, do_wp_page()
already waits for a locked page to unlock before setting the dirty bit,
in order to prevent a race where clear_page_dirty() misses the page bit
in the presence of dirty ptes.  Upgrade that wait to a fully locked
set_page_dirty() to also cover the situation explained above.

Afterwards, the code in set_page_dirty() dealing with a truncation race
is no longer needed.  Remove it.

Reported-by: Tejun Heo <[email protected]>
Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm: prevent endless growth of anon_vma hierarchy

Constantly forking task causes unlimited grow of anon_vma chain.  Each
next child allocates new level of anon_vmas and links vma to all
previous levels because pages might be inherited from any level.

This patch adds heuristic which decides to reuse existing anon_vma
instead of forking new one.  It adds counter anon_vma->degree which
counts linked vmas and directly descending anon_vmas and reuses anon_vma
if counter is lower than two.  As a result each anon_vma has either vma
or at least two descending anon_vmas.  In such trees half of nodes are
leafs with alive vmas, thus count of anon_vmas is no more than two times
bigger than count of vmas.

This heuristic reuses anon_vmas as few as possible because each reuse
adds false aliasing among vmas and rmap walker ought to scan more ptes
when it searches where page is might be mapped.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 5beb49305251 ("mm: change anon_vma linking to fix multi-process server scalability issue")
[[email protected]: fix typo, per Rik]
Signed-off-by: Konstantin Khlebnikov <[email protected]>
Reported-by: Daniel Forrest <[email protected]>
Tested-by: Michal Hocko <[email protected]>
Tested-by: Jerome Marchand <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Cc: <[email protected]> [2.6.34+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

exit: fix race between wait_consider_task() and wait_task_zombie()

wait_consider_task() checks EXIT_ZOMBIE after EXIT_DEAD/EXIT_TRACE and
both checks can fail if we race with EXIT_ZOMBIE -> EXIT_DEAD/EXIT_TRACE
change in between, gcc needs to reload p->exit_state after
security_task_wait(). In this case ->notask_error will be wrongly
cleared and do_wait() can hang forever if it was the last eligible
child.

Many thanks to Arne who carefully investigated the problem.

Note: this bug is very old but it was pure theoretical until commit
b3ab03160dfa ("wait: completely ignore the EXIT_DEAD tasks"). Before
this commit "-O2" was probably enough to guarantee that compiler won't
read ->exit_state twice.

Signed-off-by: Oleg Nesterov <[email protected]>
Reported-by: Arne Goedeke <[email protected]>
Tested-by: Arne Goedeke <[email protected]>
Cc: <[email protected]> [3.15+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: remove bogus check in dlm_process_recovery_data

In dlm_process_recovery_data, only when dlm_new_lock failed the ret will
be set to -ENOMEM. And in this case, newlock is definitely NULL. So
test newlock is meaningless, remove it.

Signed-off-by: Joseph Qi <[email protected]>
Reviewed-by: Alex Chen <[email protected]>
Reviewed-by: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kbuild fix from Michal Marek:
"make mrproper / distclean stopped removing the generated debian/
directory in v3.16. This fixes it"

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Fix removal of the debian/ directory

Makefile: include arch/*/include/generated/uapi before .../generated

The introduction of the uapi directories in v3.7-rc1 moved some of the
generated headers from arch/*/include/generated to the uapi directory,
keeping the #include directives intact.

This creates a problem when bisecting, because the unversioned files are
not cleaned automatically by git and the compiler might include stale
headers as a result.  Instead of cleaning them in the Makefiles, promote
arch/*/include/generated/uapi in the search path.  Under normal
circumstances, there is no overlap between this uapi subdirectory and
its parent, so the include choices remain the same.  We keep
arch/*/include/generated/uapi in the USERINCLUDE variable so that it is
usable standalone.

Note that we cannot completely swap the order of the uapi and
kernel-only directories, since the headers in include/uapi/asm-generic
are meant to be wrapped by their include/asm-generic counterparts when
building kernel code.

Reported-by: "Nicholas A. Bellinger" <[email protected]>
Reported-by: David Drysdale <[email protected]>
Signed-off-by: Michal Marek <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'pinctrl-v3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pinctrl fixes from Linus Walleij:
"Allright allright I've been lazy over christmas and New Years.  Here
  are a few collected pin control fixes eventually.  Details:

  A set of assorted pin control fixes for the Rockchip and STi drivers"

* tag 'pinctrl-v3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: st: Add irq_disable hook to st_gpio_irqchip
  pinctrl: st: avoid multiple mutex lock
  pinctrl: rockchip: Fix enable/disable/mask/unmask
  pinctrl: rockchip: Handle wakeup pins

Merge tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
"These are an ACPI device power management initialization fix (-stable
  material), two commits renaming stuff in the ACPI processor driver to
  make it more suitable for ARM64 processors and a new ACPI backlight
  blacklist entry.

  Specifics:

   - Fix ACPI power management intialization for device objects
     corresponding to devices that are not present at the init time (the
     _STA control method returns 0 for them) and therefore should not be
     regarded as power manageable (Rafael J Wysocki).

   - Rename a structure field and two functions used by the ACPI
     processor driver to make them less tied to architectures that use
     APICs (both x86 and ia64) and more suitable for ARM64 processors
     (Hanjun Guo).

   - Add a disable_native_backlight quirk for Dell XPS15 L521X designed
     in an unusual way preventing native backlight from working on that
     machine (Hans de Goede)"

* tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / video: Add disable_native_backlight quirk for Dell XPS15 L521X
  ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()
  ACPI / processor: Convert apic_id to phys_id to make it arch agnostic
  ACPI / PM: Fix PM initialization for devices that are not present

Merge tag 'keys-fixes-20150107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull keyrings fixes from David Howells:
"Two fixes:

   - Fix for the order in which things are done during key garbage
     collection to prevent named keyrings causing a crash
     [CVE-2014-9529].

   - Fix assoc_array to explicitly #include rcupdate.h to prevent
     compilation errors under certain circumstances"

* tag 'keys-fixes-20150107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  assoc_array: Include rcupdate.h for call_rcu() definition
  KEYS: close race between key lookup and freeing

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

pull virtio/vhost fixes from Michael Tsirkin:
"This fixes a couple of bugs triggered by hot-unplug of virtio devices,
  as well as a regression in vhost-net"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost/net: length miscalculation
  virtio_pci: document why we defer kfree
  virtio_pci: defer kfree until release callback
  virtio_pci: device-specific release callback
  virtio: make del_vqs idempotent

Merge tag 'iommu-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:
"Including:

   - a domain structure leak fix in the Intel VT-d driver

   - compile error fix for the VMSA IPMMU driver because of the
     IOMMU_EXEC -> IOMMU_NOEXEC conversion

   - two small cleanups as an aftermath of the merge window and the
     domain-leak fix"

* tag 'iommu-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/rockchip: Drop owner assignment from platform_drivers
  iommu/vt-d: Remove dead code in device_notifier
  iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
  iommu/ipmmu-vmsa: Change IOMMU_EXEC to IOMMU_NOEXEC

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This fixes a build problem with sha-mb with old toolchains and an
  implementation bug in the ctr(aes)/by8 branch of aesni-intel that's
  enabled when AVX is available"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: sha-mb - Add avx2_supported check.
  crypto: aesni - fix "by8" variant for 128 bit keys

gpio: grgpio: Avoid potential NULL pointer dereference

irqmap is optional property, so priv->domain can be NULL if !irqmap.
Thus add NULL test for priv->domain before calling irq_domain_remove()
to prevent NULL pointer dereference.

Signed-off-by: Axel Lin <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>

libceph: fix sparse endianness warnings

The only real issue is the one in auth_x.c and it came with
3.19-rc1 merge.

Signed-off-by: Ilya Dryomov <[email protected]>

ceph: use %zu for len in ceph_fill_inline_data()

len is size_t, should be printed with %zu.

Signed-off-by: Ilya Dryomov <[email protected]>

arm: dts: Use pmu_system_controller phandle for dp phy

DP PHY now require pmu-system-controller to handle PMU register
to control PHY's power isolation. Adding the same to dp-phy
node.

Signed-off-by: Vivek Gautam <[email protected]>
Reviewed-by: Jingoo Han <[email protected]>
Tested-by: Javier Martinez Canillas <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>

NVMe: Fix locking on abort handling

The queues and device need to be locked when messing with them.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

NVMe: Start and stop h/w queues on reset

This freezes and stops all the queues on device shutdown and restarts
them on resume. This fixes hotplug and reset issues when the controller
is actively being used.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

NVMe: Command abort handling fixes

Aborts all requeued commands prior to killing the request_queue. For
commands that time out on a dying request queue, set the "Do Not Retry"
bit on the command status so the command cannot be requeued. Finanally, if
the driver is requested to abort a command it did not start, do nothing.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

NVMe: Admin queue removal handling

This protects admin queue access on shutdown. When the controller is
disabled, the queue is frozen to prevent new entry, and unfrozen on
resume, and fixes cq_vector signedness to not suspend a queue twice.

Since unfreezing the queue makes it available for commands, it requires
the queue be initialized, so this moves this part after that.

Special handling is done when the device is unresponsive during
shutdown. This can be optimized to not require subsequent commands to
timeout, but saving that fix for later.

This patch also removes the kill signals in this path that were left-over
artifacts from the blk-mq conversion and no longer necessary.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

selftests/vm: fix link error for transhuge-stress test

add -lrt to fix undefined reference to `clock_gettime'
error seen when the test is compiled using gcc 4.6.4.

Signed-off-by: Andrey Skvortsov <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

NVMe: Reference count admin queue usage

Since there is no gendisk associated with the admin queue, the driver
needs to hold a reference to it until all open references to the
controller are closed.

This also combines queue cleanup with freeing the tag set since these
should not be separate.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

NVMe: Start all requests

Once the nvme callback is set for a request, the driver can start it
and make it available for timeout handling. For timed out commands on a
device that is not initialized, this fixes potential deadlocks that can
occur on startup and shutdown when a device is unresponsive since they
can now be cancelled.

Asynchronous requests do not have any expected timeout, so these are
using the new "REQ_NO_TIMEOUT" request flags.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

blk-mq: End unstarted requests on a dying queue

Requests that haven't been started prior to a queue dying can be ended
in error without waiting for them to start and time out.

Signed-off-by: Keith Busch <[email protected]>
Added code comment to explain why this is done.

Signed-off-by: Jens Axboe <[email protected]>

blk-mq: Allow requests to never expire

Some types of requests may be started that are not gauranteed to ever
complete. This adds a request flag that a driver can use so mark the
request as such.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

blk-mq: Add helper to abort requeued requests

Adds a helper function a driver can use to abort requeued requests in
case any are pending when h/w queues are being removed.

Signed-off-by: Jens Axboe <[email protected]>

blk-mq: Let drivers cancel requeue_work

Kicking requeued requests will start h/w queues in a work_queue, which
may alter the driver's requested state to temporarily stop them. This
patch exports a method to cancel the q->requeue_work so a driver can be
assured stopped h/w queues won't be started up before it is ready.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

blk-mq: Export if requests were started

Drivers can iterate over all allocated request tags, but their callback
needs a way to know if the driver started the request in the first place.

Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

blk-mq: Wake tasks entering queue on dying

When the queue is set to dying, wake up tasks that are waiting on frozen
queue so they realize it is dying and abandon their request.

Signed-off-by: Keith Busch <[email protected]>
Modified by me to add a code comment on the need for the wakeup.

Signed-off-by: Jens Axboe <[email protected]>

perf hists browser: Fix segfault when showing callchain

When perf report on TUI shows callchain it checks first node has
siblings to determine whether it needs to print percentage value.

But it missed a case that first node is NULL.  So sometimes it segfaults
like below:

  $ perf top -g
  perf: Segmentation fault
  -------- backtrace --------
  perf[0x4fcefb]
  /usr/lib/libc.so.6(+0x33b20)[0x7f2a35839b20]
  perf(rb_next+0x8)[0x47d3d8]
  perf[0x4f6058]
  perf[0x4f833b]
  perf[0x4f8610]
  perf[0x4f209e]
  perf(ui_browser__run+0x3a)[0x4f2e6a]
  perf[0x4f94ee]
  perf(perf_evlist__tui_browse_hists+0x94)[0x4fbbf4]
  perf[0x444d10]
  /usr/lib/libpthread.so.0(+0x7314)[0x7f2a37070314]
  /usr/lib/libc.so.6(clone+0x6d)[0x7f2a358ee5bd]

  $ addr2line -e `which perf` 0x4f6058
  /home/namhyung/project/linux/tools/perf/ui/browsers/hists.c:553

I don't know why the backtrace didn't print some symbols..

Signed-off-by: Namhyung Kim <[email protected]>
Fixes: 4087d11cd945 ("perf hists browser: Print overhead percent value for first-level callchain")
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf callchain: Free callchains when hist entries are deleted

Markus reported that "perf top -g" can leak ~300MB per second on his
machine. This is partly because it missed to free callchains when hist
entries are deleted. Fix it.

Reported-by: Markus Trippelsdorf <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Markus Trippelsdorf <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejong
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf hists: Fix children sort key behavior

When perf report --children resorts output fields, it tries to put
caller above the callee.  But this was only meaningful for a same thread
and doing this requires callchain enabled.  So fix its check before
comparing the callchain depth.

This also changes the hist accumulation tests: In test 3, xmalloc in
bash thread should be above than other perf threads due to alphabetical
order of comm string.  Also it's under page_fault in bash thread since
alphabetical order of dso name.  The sys_perf_event_open in perf thread
is put on the last line since it's self overhead is 0.

In test 4, the sys_perf_event_open is put above other perf entries that
have same children overhead since its callchain depth is smaller.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

x86/xen: avoid freeing static 'name' when kasprintf() fails

In case kasprintf() fails in xen_setup_timer() we assign name to the
static string "<timer kasprintf failed>". We, however, don't check
that fact before issuing kfree() in xen_teardown_timer(), kernel is
supposed to crash with 'kernel BUG at mm/slub.c:3341!'

Solve the issue by making name a fixed length string inside struct
xen_clock_event_device. 16 bytes should be enough.

Suggested-by: Laszlo Ersek <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: David Vrabel <[email protected]>

x86/xen: add extra memory for remapped frames during setup

If the non-RAM regions in the e820 memory map are larger than the size
of the initial balloon, a BUG was triggered as the frames are remaped
beyond the limit of the linear p2m. The frames are remapped into the
initial balloon area (xen_extra_mem) but not enough of this is
available.

Ensure enough extra memory regions are added for these remapped
frames.

Signed-off-by: David Vrabel <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>

x86/xen: don't count how many PFNs are identity mapped

This accounting is just used to print a diagnostic message that isn't
very useful.

Signed-off-by: David Vrabel <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>

x86/xen: Free bootmem in free_p2m_page() during early boot

With recent changes in p2m we now have legitimate cases when
p2m memory needs to be freed during early boot (i.e. before
slab is initialized).

Signed-off-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: David Vrabel <[email protected]>

arm64/efi: add missing call to early_ioremap_reset()

The early ioremap support introduced by patch bf4b558eba92
("arm64: add early_ioremap support") failed to add a call to
early_ioremap_reset() at an appropriate time. Without this call,
invocations of early_ioremap etc. that are done too late will go
unnoticed and may cause corruption.

This is exactly what happened when the first user of this feature
was added in patch f84d02755f5a ("arm64: add EFI runtime services").
The early mapping of the EFI memory map is unmapped during an early
initcall, at which time the early ioremap support is long gone.

Fix by adding the missing call to early_ioremap_reset() to
setup_arch(), and move the offending early_memunmap() to right after
the point where the early mapping of the EFI memory map is last used.

Fixes: f84d02755f5a ("arm64: add EFI runtime services")
Cc: <[email protected]>
Signed-off-by: Leif Lindholm <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

s390/mm: avoid using pmd_to_page for !USE_SPLIT_PMD_PTLOCKS

pmd_to_page() is only available if USE_SPLIT_PMD_PTLOCKS is defined.
The use of pmd_to_page in the gmap code can cause compile errors if
NR_CPUS is smaller than SPLIT_PTLOCK_CPUS. Do not use pmd_to_page
outside of USE_SPLIT_PMD_PTLOCKS sections.

Reported-by: Mike Frysinger <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

"
  - 'perf probe' should fall back to find probe point in symbols when failing
    to do so in a debuginfo file (Masami Hiramatsu)

  - Fix 'perf probe' crash in dwarf_getcfi_elf (Namhyung Kim)

  - Fix shell completion with 'perf list' --raw-dump option (Taesoo Kim)

  - Fix 'perf diff' to sort by baseline field by default (Namhyung Kim)
"

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

Merge tag 'amdkfd-fixes-2015-01-06' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes

- Complete overhaul to the main IOCTL function, kfd_ioctl(), according to
  drm_ioctl() example. This includes changing the IOCTL definitions, so it
  breaks compatibility with previous versions of the userspace. However,
  because the kernel was not officialy released yet, and this the first
  kernel that includes amdkfd, I assume I can still do that at this stage.

- A couple of bug fixes for the non-HWS path (used for bring-ups and
  debugging purposes only).

* tag 'amdkfd-fixes-2015-01-06' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: rewrite kfd_ioctl() according to drm_ioctl()
  drm/amdkfd: reformat IOCTL definitions to drm-style
  drm/amdkfd: Do copy_to/from_user in general kfd_ioctl()
  drm/amdkfd: unmap VMID<-->PASID when relesing VMID (non-HWS)
  drm/radeon: Assign VMID to PASID for IH in non-HWS mode
  drm/radeon: do not leave queue acquired if timeout happens in kgd_hqd_destroy()
  drm/amdkfd: Load mqd to hqd in non-HWS mode
  drm/amd: Fixing typos in kfd<->kgd interface

Merge branch 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

some minor radeon fixes.

* 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: integer underflow in radeon_cp_dispatch_texture()
  drm/radeon: adjust default bapm settings for KV
  drm/radeon: properly filter DP1.2 4k modes on non-DP1.2 hw
  drm/radeon: fix sad_count check for dce3
  drm/radeon: KV has three PPLLs (v2)

Merge branch 'linux-3.19' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

    - Fix BUG() on !SMP builds
    - Fix for OOPS on pre-NV50 that snuck into -next
    - MCP7[789A] hang fix where firmware hasn't already setup NISO pollers
    - NV4x IGP MSI disable, it doesn't appear to work correctly
    - Add GK208B to recognised boards (no code change aside from adding
    chipset recognition)

* 'linux-3.19' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau/nouveau: Do not BUG_ON(!spin_is_locked()) on UP
  drm/nv4c/mc: disable msi
  drm/nouveau/fb/ram/mcp77: enable NISO poller
  drm/nouveau/fb/ram/mcp77: use carveout reg to determine size
  drm/nouveau/fb/ram/mcp77: subclass nouveau_ram
  drm/nouveau: wake up the card if necessary during gem callbacks
  drm/nouveau/device: Add support for GK208B, resolves bug 86935
  drm/nouveau: fix missing return statement in nouveau_ttm_tt_unpopulate
  drm/nouveau/bios: fix oops on pre-nv50 chipsets

ARM: shmobile: sh73a0 legacy: Set .control_parent for all irqpin instances

The sh73a0 INTC can't mask interrupts properly most likely due to a
hardware bug. Set the .control_parent flag to delegate masking to the
parent interrupt controller, like was already done for irqpin1.

Without this, accessing the three-axis digital accelerometer ADXL345
on kzm9g through /dev/input/event1 causes an interrupt storm, which
requires a power-cycle to recover from.

This was inspired by a patch for arch/arm/boot/dts/sh73a0.dtsi from
Laurent Pinchart <[email protected]>.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Fixes: 341eb5465f67437a ("ARM: shmobile: INTC External IRQ pin driver on sh73a0")
Signed-off-by: Simon Horman <[email protected]>

ARM: 8253/1: mm: use phys_addr_t type in map_lowmem() for kernel mem region

Now local variables kernel_x_start and kernel_x_end defined using
'unsigned long' type which is wrong because they represent physical
memory range and will be calculated wrongly if LPAE is enabled.
As result, all following code in map_lowmem() will not work correctly.

For example, Keystone 2 boot is broken because
kernel_x_start == 0x0000 0000
kernel_x_end == 0x0080 0000

instead of
kernel_x_start == 0x0000 0008 0000 0000
kernel_x_end == 0x0000 0008 0080 0000
and as result whole low memory will be mapped with MT_MEMORY_RW
permissions by code (start > kernel_x_end):
} else if (start >= kernel_x_end) {
map.pfn = __phys_to_pfn(start);
map.virtual = __phys_to_virt(start);
map.length = end - start;
map.type = MT_MEMORY_RW;

create_mapping(&map);
}

Hence, fix it by using phys_addr_t type for variables kernel_x_start
and kernel_x_end.

Tested-by: Murali Karicheri <[email protected]>
Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: Russell King <[email protected]>

ARM: 8249/1: mm: dump: don't skip regions

Currently the arm page table dumping code starts dumping page tables
from USER_PGTABLES_CEILING. This is unnecessary for skipping any entries
related to userspace as the swapper_pg_dir does not contain such
entries, and results in a couple of unfortuante side effects.

Firstly, any kernel mappings which might exist below
USER_PGTABLES_CEILING will not be accounted in the dump output. This
masks any entries erroneously created below this address.

Secondly, if the final page table entry walked is part of a valid
mapping the page table dumping code will not log the region this entry
is part of, as the final note_page call in walk_pgd will trigger an
early return when 0 < USER_PGTABLES_CEILING. Luckily this isn't seen on
contemporary systems as they typically don't have enough RAM to extend
the linear mapping right to the end of the address space.

Due to the way addr is constructed in the walk_* functions, it can never
be less than USER_PGTABLES_CEILING when walking the page tables, so it
is not necessary to avoid dereferencing invalid table addresses. The
existing checks for st->current_prot and st->marker[1].start_address are
sufficient to ensure we will not print and/or dereference garbage when
trying to log information.

This patch removes both problematic uses of USER_PGTABLES_CEILING from
the arm page table dumping code, preventing both of these issues. We
will now report any low mappings, and the final note_page call will not
return early, ensuring all regions are logged.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Russell King <[email protected]>

ARM: wire up execveat syscall

Signed-off-by: Russell King <[email protected]>

rpc: fix xdr_truncate_encode to handle buffer ending on page boundary

A struct xdr_stream at a page boundary might point to the end of one
page or the beginning of the next, but xdr_truncate_encode isn't
prepared to handle the former.

This can cause corruption of NFSv4 READDIR replies in the case that a
readdir entry that would have exceeded the client's dircount/maxcount
limit would have ended exactly on a 4k page boundary. You're more
likely to hit this case on large directories.

Other xdr_truncate_encode callers are probably also affected.

Reported-by: Holger Hoffstätte <[email protected]>
Tested-by: Holger Hoffstätte <[email protected]>
Fixes: 3e19ce762b53 "rpc: xdr_truncate_encode"
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>

nfsd: fix fi_delegees leak when fi_had_conflict returns true

Currently, nfs4_set_delegation takes a reference to an existing
delegation and then checks to see if there is a conflict. If there is
one, then it doesn't release that reference.

Change the code to take the reference after the check and only if there
is no conflict.

Signed-off-by: Jeff Layton <[email protected]>
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>

blk-mq: get rid of ->cmd_size in the hardware queue

We store it in the tag set, we don't need it in the hardware queue.
While removing cmd_size, place ->queue_num further down to avoid
a hole on 64-bit archs. It's not used in any fast paths, so we
can safely move it.

Signed-off-by: Jens Axboe <[email protected]>

vfio-pci: Fix the check on pci device type in vfio_pci_probe()

Current vfio-pci just supports normal pci device, so vfio_pci_probe() will
return if the pci device is not a normal device. While current code makes a
mistake. PCI_HEADER_TYPE is the offset in configuration space of the device
type, but we use this value to mask the type value.

This patch fixs this by do the check directly on the pci_dev->hdr_type.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Cc: [email protected] # v3.6+

assoc_array: Include rcupdate.h for call_rcu() definition

Include rcupdate.h header to provide call_rcu() definition. This was implicitly
being provided by slab.h file which include srcu.h somewhere in its include
hierarchy which in-turn included rcupdate.h.

Lately, tinification effort added support to remove srcu entirely because of
which we are encountering build errors like

lib/assoc_array.c: In function 'assoc_array_apply_edit':
lib/assoc_array.c:1426:2: error: implicit declaration of function 'call_rcu' [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors

Fix these by including rcupdate.h explicitly.

Signed-off-by: Pranith Kumar <[email protected]>
Reported-by: Scott Wood <[email protected]>

ALSA: fireworks: fix an endianness bug for transaction length

Although the 't->length' is a big-endian value, it's used without any
conversion. This means that the driver always uses 'length' parameter.

Fixes: 555e8a8f7f14("ALSA: fireworks: Add command/response functionality into hwdep interface")
Reported-by: Clemens Ladisch <[email protected]>
Signed-off-by: Takashi Sakamoto <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

ARM: dts: berlin: correct BG2Q's SM GPIO location.

The gpio4 and gpio5 are in 0xf7fc0000 apb which is located in the SM domain.
This patch moves gpio4 and gpio5 to the correct location. This patch also
renames them as the following to match the names we internally used in
marvell:
gpio4 -> sm_gpio1
gpio5 -> sm_gpio0
porte -> portf
portf -> porte

This also matches what we did for BG2 and BG2CD's SM GPIO.

Cc: [email protected] # 3.16+
Fixes: cedf57fc4f2f ("ARM: dts: berlin: add the BG2Q GPIO nodes")
Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: Sebastian Hesselbarth <[email protected]>

ARM: dts: berlin: add broken-cd and set bus width for eMMC in Marvell DMP DT

There's no card detection for the eMMC, so this patch adds the missing
broken-cd property. This patch also sets bus width as 8 to add
MMC_CAP_8_BIT_DATA in the Host capabilities.

Cc: [email protected] # 3.16+
Fixes: 3047086dfd56 ("ARM: dts: berlin: enable SD card reader and eMMC for the BG2Q DMP")
Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: Sebastian Hesselbarth <[email protected]>

ARM: dts: berlin: fix io clk and add missing core clk for BG2Q sdhci2 host

On BG2Q, the sdhci2 host uses nfcecc for "io" clk and nfc for "core" clk.
The shdci2 can't work without this patch due to the "core" clk is gated.

Cc: [email protected] # 3.16+
Fixes: 0d859a6a9d14 ("ARM: dts: berlin: add the SDHCI nodes for the BG2Q")
Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: Sebastian Hesselbarth <[email protected]>

thermal: rcar: change type of ctemp in rcar_thermal_update_temp()

Since the ctemp is used for rcar_thermal_write() in
rcar_thermal_update_temp(), the type of 'ctemp' should be "u32" instead
of "int". This patch also changes type of the helper variables 'old'
and 'new'.

Acked-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>

thermal: rcar: fix ENR register value

On R-Mobile APE6, since it has 3 thermal zones, ENR register
has enable bits in bit 19-16, bit 11-8 and bit 3-0.

However, on R-Car gen2, since it has 1 thermal zone, ENR register has
enable bits in bit 3-0. (In other words, the write value should always
be 0 for bit 31-4 of ENR register.)

So, this patch fixes the ENR register value using I/O resource sets.

Acked-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>

arm64: fix missing asm/io.h include in kernel/smp_spin_table.c

On next-20150105, defconfig compilation breaks with:

arch/arm64/kernel/smp_spin_table.c:80:2: error: implicit declaration of function ‘ioremap_cache’ [-Werror=implicit-function-declaration]
arch/arm64/kernel/smp_spin_table.c:92:2: error: implicit declaration of function ‘writeq_relaxed’ [-Werror=implicit-function-declaration]
arch/arm64/kernel/smp_spin_table.c:101:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]

Fix by including asm/io.h, which contains definitions or prototypes
for these macros or functions.

This second version incorporates a comment from Mark Rutland
<[email protected]> to keep the includes in alphabetical order
by filename.

Signed-off-by: Paul Walmsley <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>