Git Repo - linux.git/log

Merge tag 'fbdev-fixes-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux

Pull fbdev fixes from Tomi Valkeinen:
- omapdss: compilation fix and DVI fix for PandaBoard
- mxsfb: fix colors when using 18bit LCD bus

* tag 'fbdev-fixes-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
  ARM: OMAP: dss-common: fix Panda's DVI DDC channel
  video: mxsfb: fix color settings for 18bit data bus and 32bpp
  OMAPDSS: analog-tv-connector: compile fix

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Mostly radeon, more fixes for dynamic power management which is is off
  by default for this release anyways, but there are a large number of
  testers, so I'd like to keep merging the fixes.

  Otherwise, radeon UVD fixes affecting suspend/resume regressions, i915
  regression fixes, one for your mac mini, ast, mgag200, cirrus ttm fix
  and one regression fix in the core"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (25 commits)
  drm: Don't pass negative delta to ktime_sub_ns()
  drm/radeon: make missing smc ucode non-fatal
  drm/radeon/dpm: require rlc for dpm
  drm/radeon/cik: use a mutex to properly lock srbm instanced registers
  drm/radeon: remove unnecessary unpin
  drm/radeon: add more UVD CS checking
  drm/radeon: stop sending invalid UVD destroy msg
  drm/radeon: only save UVD bo when we have open handles
  drm/radeon: always program the MC on startup
  drm/radeon: fix audio dto calculation on DCE3+ (v3)
  drm/radeon/dpm: disable sclk ss on rv6xx
  drm/radeon: fix halting UVD
  drm/radeon/dpm: adjust power state properly for UVD on SI
  drm/radeon/dpm: fix spread spectrum setup (v2)
  drm/radeon/dpm: adjust thermal protection requirements
  drm/radeon: select audio dto based on encoder id for DCE3
  drm/radeon: properly handle pm on gpu reset
  drm/i915: do not disable backlight on vgaswitcheroo switch off
  drm/i915: Don't call encoder's get_config unless encoder is active
  drm/i915: avoid brightness overflow when doing scale
  ...

vxlan: fix a soft lockup in vxlan module removal

This is a regression introduced by:

commit fe5c3561e6f0ac7c9546209f01351113c1b77ec8
Author: stephen hemminger <[email protected]>
Date: Sat Jul 13 10:18:18 2013 -0700

vxlan: add necessary locking on device removal

The problem is that vxlan_dellink(), which is called with RTNL lock
held, tries to flush the workqueue synchronously, but apparently
igmp_join and igmp_leave work need to hold RTNL lock too, therefore we
have a soft lockup!

As suggested by Stephen, probably the flush_workqueue can just be
removed and let the normal refcounting work. The workqueue has a
reference to device and socket, therefore the cleanups should work
correctly.

Suggested-by: Stephen Hemminger <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: David S. Miller <[email protected]>
Tested-by: Cong Wang <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

vxlan: fix a regression of igmp join

This is a regression introduced by:

commit 3fc2de2faba387218bdf9dbc6b13f513ac3b060a
Author: stephen hemminger <[email protected]>
Date:   Thu Jul 18 08:40:15 2013 -0700

    vxlan: fix igmp races

Before this commit, the old code was:

       if (vxlan_group_used(vn, vxlan->default_dst.remote_ip))
               ip_mc_join_group(sk, &mreq);
       else
               ip_mc_leave_group(sk, &mreq);

therefore we shoud check vxlan_group_used(), not its opposite,
for igmp_join.

Cc: Stephen Hemminger <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: rename busy poll MIB counter

Rename mib counter from "low latency" to "busy poll"

v1 also moved the counter to the ip MIB (suggested by Shawn Bohrer)
Eric Dumazet suggested that the current location is better.

So v2 just renames the counter to fit the new naming convention.

Signed-off-by: Eliezer Tamir <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

8139cp: Fix skb leak in rx_status_loop failure path.

Introduced in cf3c4c03060b688cbc389ebc5065ebcce5653e96
("8139cp: Add dma_mapping_error checking")

Signed-off-by: Dave Jones <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

net: flow_dissector: add 802.1ad support

Same behavior than 802.1q : finds the encapsulated protocol and
skip 32bit header.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

ip_gre: fix ipgre_header to return correct offset

Fix ipgre_header() (header_ops->create) to return the correct
amount of bytes pushed. Most callers of dev_hard_header() seem
to care only if it was success, but af_packet.c uses it as
offset to the skb to copy from userspace only once. In practice
this fixes packet socket sendto()/sendmsg() to gre tunnels.

Regression introduced in c54419321455631079c7d6e60bc732dd0c5914c5
("GRE: Refactor GRE tunneling code.")

Cc: Pravin B Shelar <[email protected]>
Signed-off-by: Timo Teräs <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

Hostap: copying wrong data prism2_ioctl_giwaplist()

We want the data stored in "addr" and "qual", but the extra ampersands
mean we are copying stack data instead.

Signed-off-by: Dan Carpenter <[email protected]>
Cc: [email protected]
Signed-off-by: John W. Linville <[email protected]>

zd1201: do not use stack as URB transfer_buffer

Patch fixes zd1201 not to use stack as URB transfer_buffer. URB buffers need
to be DMA-able, which stack is not.

Patch is only compile tested.

Cc: [email protected]
Signed-off-by: Jussi Kivilinna <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes

dlm: kill the unnecessary and wrong device_close()->recalc_sigpending()

device_close()->recalc_sigpending() is not needed, sigprocmask() takes
care of TIF_SIGPENDING correctly.

And without ->siglock it is racy and wrong, it can wrongly clear
TIF_SIGPENDING and miss a signal.

But even with this patch device_close() is still buggy:

  1. sigprocmask() should not be used, we have set_task_blocked(),
     but this is minor.

  2. We should never block SIGKILL or SIGSTOP, and this is what
     the code tries to do.

  3. This can't protect against SIGKILL or SIGSTOP anyway. Another
     thread can do signal_wake_up(), say, do_signal_stop() or
     complete_signal() or debugger.

  4. sigprocmask(SIG_BLOCK, allsigs) doesn't necessarily clears
     TIF_SIGPENDING, say, freezing() or ->jobctl.

  5. device_write() looks equally wrong by the same reason.

Looks like, this tries to protect some wait_event_interruptible() logic
from signals, it should be turned into uninterruptible wait.  Or we need
to implement something like signals_stop/start for such a use-case.

Signed-off-by: Oleg Nesterov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

sata, highbank: fix ordering of SGPIO signals

The ACTIVITY and ERROR signals were reversed in the original commit.
Fix that so that hard drive activity does not show up on the error
light, and attempts to indicate that the hard drive is failing do
not show up as hard drive activity. This fixes a fairly serious
functional bug in the driver, but failing to apply this patch will
not cause any stability issues on the system.

Signed-off-by: Mark Langsdorf <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: [email protected]

swiotlb-xen: replace dma_length with sg_dma_len() macro

swiotlb-xen has an implicit dependency on CONFIG_NEED_SG_DMA_LENGTH.
Remove it by replacing dma_length with sg_dma_len.

Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

swiotlb: replace dma_length with sg_dma_len() macro

This patch replace dma_length in "lib/swiotlb.c" to sg_dma_len() macro,
because the build error can occur if CONFIG_NEED_SG_DMA_LENGTH is not
set, and CONFIG_SWIOTLB is set.

Singed-off-by: EunBong Song <[email protected]>
Signed-off-by: Stefano Stabellini <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

xen/balloon: set a mapping for ballooned out pages

Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY
in the p2m. These ballooned out pages are used to map foreign grants
by gntdev and blkback (see alloc_xenballooned_pages).

Allocate a page per cpu and map all the ballooned out pages to the
corresponding mfn. Set the p2m accordingly. This way reading from a
ballooned out page won't cause a kernel crash (see
http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html).

Signed-off-by: Stefano Stabellini <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

xen/evtchn: improve scalability by using per-user locks

The global array of port users and the port_user_lock limits
scalability of the evtchn device. Instead of the global array lookup,
use a per-use (per-fd) tree of event channels bound by that user and
protect the tree with a per-user lock.

This is also a prerequiste for extended the number of supported event
channels, by removing the fixed size, per-event channel array.

Signed-off-by: David Vrabel <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

xen/p2m: avoid unneccesary TLB flush in m2p_remove_override()

In m2p_remove_override() when removing the grant map from the kernel
mapping and replacing with a mapping to the original page, the grant
unmap will already have flushed the TLB and it is not necessary to do
it again after updating the mapping.

Signed-off-by: David Vrabel <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>

MAINTAINERS: Add in two extra co-maintainers of the Xen tree.

Both Boris and David have graciously volunteered to help in
maintaining the Xen subsystem tree. Cementing this in the
MAINTAINERS file so they are copied on Xen related patches.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Boris Ostrovsky <[email protected]>
Acked-by: David Vrabel <[email protected]>

MAINTAINERS: Update the Xen subsystem's with proper mailing list.

And also drop the virtualization one since we don't really use it.

Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Ian Campbell <[email protected]> [for netback]
Acked-by: Stefano Stabellini <[email protected]>

xen: replace strict_strtoul() with kstrtoul()

The usage of strict_strtoul() is not preferred, because
strict_strtoul() is obsolete. Thus, kstrtoul() should be
used.

Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

xen-gnt: prevent adding duplicate gnt callbacks

With the current implementation, the callback in the tail of the list
can be added twice, because the check done in
gnttab_request_free_callback is bogus, callback->next can be NULL if
it is the last callback in the list. If we add the same callback twice
we end up with an infinite loop, were callback == callback->next.

Replace this check with a proper one that iterates over the list to
see if the callback has already been added.

Signed-off-by: Roger Pau Monné <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: David Vrabel <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Matt Wilson <[email protected]>
Reviewed-by: David Vrabel <[email protected]>
CC: [email protected]

drivers/tpm: add xen tpmfront interface

This is a complete rewrite of the Xen TPM frontend driver, taking
advantage of a simplified frontend/backend interface and adding support
for cancellation and timeouts. The backend for this driver is provided
by a vTPM stub domain using the interface in Xen 4.3.

Signed-off-by: Daniel De Graaf <[email protected]>
Acked-by: Matthew Fioravante <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Peter Huewe <[email protected]>
Reviewed-by: Peter Huewe <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>

xen: Support 64-bit PV guest receiving NMIs

This is based on a patch that Zhenzhong Duan had sent - which
was missing some of the remaining pieces. The kernel has the
logic to handle Xen-type-exceptions using the paravirt interface
in the assembler code (see PARAVIRT_ADJUST_EXCEPTION_FRAME -
pv_irq_ops.adjust_exception_frame and and INTERRUPT_RETURN -
pv_cpu_ops.iret).

That means the nmi handler (and other exception handlers) use
the hypervisor iret.

The other changes that would be neccessary for this would
be to translate the NMI_VECTOR to one of the entries on the
ipi_vector and make xen_send_IPI_mask_allbutself use different
events.

Fortunately for us commit 1db01b4903639fcfaec213701a494fe3fb2c490b
(xen: Clean up apic ipi interface) implemented this and we piggyback
on the cleanup such that the apic IPI interface will pass the right
vector value for NMI.

With this patch we can trigger NMIs within a PV guest (only tested
x86_64).

For this to work with normal PV guests (not initial domain)
we need the domain to be able to use the APIC ops - they are
already implemented to use the Xen event channels. For that
to be turned on in a PV domU we need to remove the masking
of X86_FEATURE_APIC.

Incidentally that means kgdb will also now work within
a PV guest without using the 'nokgdbroundup' workaround.

Note that the 32-bit version is different and this patch
does not enable that.

CC: Lisa Nguyen <[email protected]>
CC: Ben Guthro <[email protected]>
CC: Zhenzhong Duan <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
[v1: Fixed up per David Vrabel comments]
Reviewed-by: Ben Guthro <[email protected]>
Reviewed-by: David Vrabel <[email protected]>

kvm guest: Add configuration support to enable debug information for KVM Guests

Signed-off-by: Srivatsa Vaddagiri <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-14-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Signed-off-by: Suzuki Poulose <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Gleb Natapov <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

kvm uapi: Add KICK_CPU and PV_UNHALT definition to uapi

These are needed by both guest and host.

Originally-from: Srivatsa Vaddagiri <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-13-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Acked-by: Gleb Natapov <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

xen, pvticketlock: Allow interrupts to be enabled while blocking

If interrupts were enabled when taking the spinlock, we can leave them
enabled while blocking to get the lock.

If we can enable interrupts while waiting for the lock to become
available, and we take an interrupt before entering the poll,
and the handler takes a spinlock which ends up going into
the slow state (invalidating the per-cpu "lock" and "want" values),
then when the interrupt handler returns the event channel will
remain pending so the poll will return immediately, causing it to
return out to the main spinlock loop.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-12-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

x86, ticketlock: Add slowpath logic

Maintain a flag in the LSB of the ticket lock tail which indicates
whether anyone is in the lock slowpath and may need kicking when
the current holder unlocks.  The flags are set when the first locker
enters the slowpath, and cleared when unlocking to an empty queue (ie,
no contention).

In the specific implementation of lock_spinning(), make sure to set
the slowpath flags on the lock just before blocking.  We must do
this before the last-chance pickup test to prevent a deadlock
with the unlocker:

Unlocker Locker
test for lock pickup
-> fail
unlock
test slowpath
-> false
set slowpath flags
block

Whereas this works in any ordering:

Unlocker Locker
set slowpath flags
test for lock pickup
-> fail
block
unlock
test slowpath
-> true, kick

If the unlocker finds that the lock has the slowpath flag set but it is
actually uncontended (ie, head == tail, so nobody is waiting), then it
clears the slowpath flag.

The unlock code uses a locked add to update the head counter.  This also
acts as a full memory barrier so that its safe to subsequently
read back the slowflag state, knowing that the updated lock is visible
to the other CPUs.  If it were an unlocked add, then the flag read may
just be forwarded from the store buffer before it was visible to the other
CPUs, which could result in a deadlock.

Unfortunately this means we need to do a locked instruction when
unlocking with PV ticketlocks.  However, if PV ticketlocks are not
enabled, then the old non-locked "add" is the only unlocking code.

Note: this code relies on gcc making sure that unlikely() code is out of
line of the fastpath, which only happens when OPTIMIZE_SIZE=n.  If it
doesn't the generated code isn't too bad, but its definitely suboptimal.

Thanks to Srivatsa Vaddagiri for providing a bugfix to the original
version of this change, which has been folded in.
Thanks to Stephan Diestelhorst for commenting on some code which relied
on an inaccurate reading of the x86 memory ordering rules.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-11-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Signed-off-by: Srivatsa Vaddagiri <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Cc: Stephan Diestelhorst <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

jump_label: Split jumplabel ratelimit

Commit b202952075f62603bea9bfb6ebc6b0420db11949 ("perf, core: Rate limit
perf_sched_events jump_label patching") introduced rate limiting
for jump label disabling. The changes were made in the jump label code
in order to be more widely available and to keep things tidier. This is
all fine, except now jump_label.h includes linux/workqueue.h, which
makes it impossible to include jump_label.h from anything that
workqueue.h needs. For example, it's now impossible to include
jump_label.h from asm/spinlock.h, which is done in proposed
pv-ticketlock patches. This patch splits out the rate limiting related
changes from jump_label.h into a new file, jump_label_ratelimit.h, to
resolve the issue.

Signed-off-by: Andrew Jones <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-10-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

x86, pvticketlock: When paravirtualizing ticket locks, increment by 2

Increment ticket head/tails by 2 rather than 1 to leave the LSB free
to store a "is in slowpath state" bit. This halves the number
of possible CPUs for a given ticket size, but this shouldn't matter
in practice - kernels built for 32k+ CPU systems are probably
specially built for the hardware rather than a generic distro
kernel.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-9-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Tested-by: Attilio Rao <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

x86, pvticketlock: Use callee-save for lock_spinning

Although the lock_spinning calls in the spinlock code are on the
uncommon path, their presence can cause the compiler to generate many
more register save/restores in the function pre/postamble, which is in
the fast path. To avoid this, convert it to using the pvops callee-save
calling convention, which defers all the save/restores until the actual
function is called, keeping the fastpath clean.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-8-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Tested-by: Attilio Rao <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

xen, pvticketlocks: Add xen_nopvspin parameter to disable xen pv ticketlocks

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-7-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

xen, pvticketlock: Xen implementation for PV ticket locks

Replace the old Xen implementation of PV spinlocks with and implementation
of xen_lock_spinning and xen_unlock_kick.

xen_lock_spinning simply registers the cpu in its entry in lock_waiting,
adds itself to the waiting_cpus set, and blocks on an event channel
until the channel becomes pending.

xen_unlock_kick searches the cpus in waiting_cpus looking for the one
which next wants this lock with the next ticket, if any. If found,
it kicks it by making its event channel pending, which wakes it up.

We need to make sure interrupts are disabled while we're relying on the
contents of the per-cpu lock_waiting values, otherwise an interrupt
handler could come in, try to take some other lock, block, and overwrite
our values.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-6-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
[ Raghavendra: use function + enum instead of macro, cmpxchg for zero status reset
Reintroduce break since we know the exact vCPU to send IPI as suggested by Konrad.]
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

xen: Defer spinlock setup until boot CPU setup

There's no need to do it at very early init, and doing it there
makes it impossible to use the jump_label machinery.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-5-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

x86, ticketlock: Collapse a layer of functions

Now that the paravirtualization layer doesn't exist at the spinlock
level any more, we can collapse the __ticket_ functions into the arch_
functions.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-4-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Tested-by: Attilio Rao <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

x86, ticketlock: Don't inline _spin_unlock when using paravirt spinlocks

The code size expands somewhat, and its better to just call
a function rather than inline it.

Thanks Jeremy for original version of ARCH_NOINLINE_SPIN_UNLOCK config patch,
which is simplified.

Suggested-by: Linus Torvalds <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Raghavendra K T <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-3-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

x86, spinlock: Replace pv spinlocks with pv ticketlocks

Rather than outright replacing the entire spinlock implementation in
order to paravirtualize it, keep the ticket lock implementation but add
a couple of pvops hooks on the slow patch (long spin on lock, unlocking
a contended lock).

Ticket locks have a number of nice properties, but they also have some
surprising behaviours in virtual environments.  They enforce a strict
FIFO ordering on cpus trying to take a lock; however, if the hypervisor
scheduler does not schedule the cpus in the correct order, the system can
waste a huge amount of time spinning until the next cpu can take the lock.

(See Thomas Friebel's talk "Prevent Guests from Spinning Around"
http://www.xen.org/files/xensummitboston08/LHP.pdf for more details.)

To address this, we add two hooks:
- __ticket_spin_lock which is called after the cpu has been
   spinning on the lock for a significant number of iterations but has
   failed to take the lock (presumably because the cpu holding the lock
   has been descheduled).  The lock_spinning pvop is expected to block
   the cpu until it has been kicked by the current lock holder.
- __ticket_spin_unlock, which on releasing a contended lock
   (there are more cpus with tail tickets), it looks to see if the next
   cpu is blocked and wakes it if so.

When compiled with CONFIG_PARAVIRT_SPINLOCKS disabled, a set of stub
functions causes all the extra code to go away.

Results:
=======
setup: 32 core machine with 32 vcpu KVM guest (HT off)  with 8GB RAM
base = 3.11-rc
patched = base + pvspinlock V12

+-----------------+----------------+--------+
dbench (Throughput in MB/sec. Higher is better)
+-----------------+----------------+--------+
|   base (stdev %)|patched(stdev%) | %gain  |
+-----------------+----------------+--------+
| 15035.3   (0.3) |15150.0   (0.6) |   0.8  |
|  1470.0   (2.2) | 1713.7   (1.9) |  16.6  |
|   848.6   (4.3) |  967.8   (4.3) |  14.0  |
|   652.9   (3.5) |  685.3   (3.7) |   5.0  |
+-----------------+----------------+--------+

pvspinlock shows benefits for overcommit ratio > 1 for PLE enabled cases,
and undercommits results are flat

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Link: http://lkml.kernel.org/r/1376058122-8248-2-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Tested-by: Attilio Rao <[email protected]>
[ Raghavendra: Changed SPIN_THRESHOLD, fixed redefinition of arch_spinlock_t]
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>

arm64: KVM: use 'int' instead of 'u32' for variable 'target' in kvm_host.h.

'target' will be set to '-1' in kvm_arch_vcpu_init(), and it need check
'target' whether less than zero or not in kvm_vcpu_initialized().

So need define target as 'int' instead of 'u32', just like ARM has done.

The related warning:

arch/arm64/kvm/../../../arch/arm/kvm/arm.c:497:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]

Signed-off-by: Chen Gang <[email protected]>
[Marc: reformated the Subject line to fit the series]
Signed-off-by: Marc Zyngier <[email protected]>

arm64: KVM: add missing dsb before invalidating Stage-2 TLBs

When performing a Stage-2 TLB invalidation, it is necessary to
make sure the write to the page tables is observable by all CPUs.

For this purpose, add dsb instructions to __kvm_tlb_flush_vmid_ipa
and __kvm_flush_vm_context before doing the TLB invalidation itself.

Signed-off-by: Marc Zyngier <[email protected]>

arm64: KVM: perform save/restore of PAR_EL1

Not saving PAR_EL1 is an unfortunate oversight. If the guest
performs an AT* operation and gets scheduled out before reading
the result of the translation from PAREL1, it could become
corrupted by another guest or the host.

Saving this register is made slightly more complicated as KVM also
uses it on the permission fault handling path, leading to an ugly
"stash and restore" sequence. Fortunately, this is already a slow
path so we don't really care. Also, Linux doesn't do any AT*
operation, so Linux guests are not impacted by this bug.

Signed-off-by: Marc Zyngier <[email protected]>

Revert "HID: hid-logitech-dj: querying_devices was never set"

This reverts commit 407a2c2a4d85100c8c67953e4bac2f4a6c942335.

Explanation provided by Benjamin Tissoires:

Commit "HID: hid-logitech-dj, querying_devices was never set" activate
a flag which guarantees that we do not ask the receiver for too many
enumeration. When the flag is set, each following enumeration call is
discarded (the usb request is not forwarded to the receiver). The flag
is then released when the driver receive a pairing information event,
which normally follows the enumeration request.
However, the USB3 bug makes the driver think the enumeration request
has been forwarded to the receiver. However, it is actually not the
case because the USB stack returns -EPIPE. So, when a new unknown
device appears, the workaround consisting in asking for a new
enumeration is not working anymore: this new enumeration is discarded
because of the flag, which is never reset.

A solution could be to trigger a timeout before releasing it, but for
now, let's just revert the patch.

Reported-by: Benjamin Tissoires <[email protected]>
Tested-by: Sune Mølgaard <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs

If a transaction is rolled back, the Target Address Register (TAR), Processor
Priority Register (PPR) and Data Stream Control Register (DSCR) should be
restored to the checkpointed values before the transaction began. Any changes
to these SPRs inside the transaction should not be visible in the abort
handler.

Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR. If
we preempt a processes inside a transaction which has modified any of these, on
process restore, that same transaction may be aborted we but we won't see the
checkpointed versions of these SPRs.

This adds checkpointed versions of these SPRs to the thread_struct and adds the
save/restore of these three SPRs to the treclaim/trechkpt code.

Without this if any of these SPRs are modified during a transaction, users may
incorrectly see a speculated SPR value even if the transaction is aborted.

Signed-off-by: Michael Neuling <[email protected]>
Cc: <[email protected]> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: Save the TAR register earlier

This moves us to save the Target Address Register (TAR) a earlier in
__switch_to. It introduces a new function save_tar() to do this.

We need to save the TAR earlier as we will overwrite it in the transactional
memory reclaim/recheckpoint path. We are going to do this in a subsequent
patch which will fix saving the TAR register when it's modified inside a
transaction.

Signed-off-by: Michael Neuling <[email protected]>
Cc: <[email protected]> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: Fix context switch DSCR on POWER8

POWER8 allows the DSCR to be accessed directly from userspace via a new SPR
number 0x3 (Rather than 0x11.  DSCR SPR number 0x11 is still used on POWER8 but
like POWER7, is only accessible in HV and OS modes).  Currently, we allow this
by setting H/FSCR DSCR bit on boot.

Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
that it knows to no longer restore the system wide version of DSCR on context
switch (ie. to set thread.dscr_inherit).

This clears the H/FSCR DSCR bit initially.  If a process then accesses the DSCR
(via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in
facility_unavailable_exception().

We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
the thread.dscr_inherit.

Signed-off-by: Michael Neuling <[email protected]>
Cc: <[email protected]> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: Rework setting up H/FSCR bit definitions

This reworks the Facility Status and Control Regsiter (FSCR) config bit
definitions so that we can access the bit numbers. This is needed for a
subsequent patch to fix the userspace DSCR handling.

HFSCR and FSCR bit definitions are the same, so reuse them.

Signed-off-by: Michael Neuling <[email protected]>
Cc: <[email protected]> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: Fix hypervisor facility unavaliable vector number

Currently if we take hypervisor facility unavaliable (from 0xf80/0x4f80) we
mark it as an OS facility unavaliable (0xf60) as the two share the same code
path.

The becomes a problem in facility_unavailable_exception() as we aren't able to
see the hypervisor facility unavailable exceptions.

Below fixes this by duplication the required macros.

Signed-off-by: Michael Neuling <[email protected]>
Cc: <[email protected]> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/kvm/book3s_pr: Return appropriate error when allocation fails

err was overwritten by a previous function call, and checked to be 0. If
the following page allocation fails, 0 is going to be returned instead
of -ENOMEM.

Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/kvm: Add signed type cast for comparation

'rmls' is 'unsigned long', lpcr_rmls() will return negative number when
failure occurs, so it need a type cast for comparing.

'lpid' is 'unsigned long', kvmppc_alloc_lpid() return negative number
when failure occurs, so it need a type cast for comparing.

Signed-off-by: Chen Gang <[email protected]>
Acked-by: Paul Mackerras <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/eeh: Add missing procfs entry for PowerNV

The procfs entry for global statistics has been missed on PowerNV
platform and the patch is going to add that.

Signed-off-by: Mike Qiu <[email protected]>
Acked-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/pseries: Add backward compatibilty to read old kernel oops-log

Older kernels has just length information in their header. Handle it
while reading old kernel oops log from pstore.

Applies on top of powerpc/pseries: Fix buffer overflow when reading from pstore

Signed-off-by: Aruna Balakrishnaiah <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc/pseries: Fix buffer overflow when reading from pstore

When reading from pstore there is a buffer overflow during decompression
due to the header added in unzip_oops. Remove unzip_oops and call
pstore_decompress directly in nvram_pstore_read. Allocate buffer of size
report_length of the oops header as header will not be deallocated in pstore.
Since we have 'openssl' command line tool to decompress the compressed data,
dump the compressed data in case decompression fails instead of not dumping
anything.

Signed-off-by: Aruna Balakrishnaiah <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

powerpc: On POWERNV enable PPC_DENORMALISATION by default

We want PPC_DENORMALISATION enabled when POWERNV is enabled,
so update the Kconfig.

Signed-off-by: Anton Blanchard <[email protected]>
Acked-by: Michael Neuling <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
CC: <[email protected]>

ext4: fix mount/remount error messages for incompatible mount options

Commit 5688978 ("ext4: improve handling of conflicting mount options")
introduced incorrect messages shown while choosing wrong mount options.

First of all, both cases of incorrect mount options,
"data=journal,delalloc" and "data=journal,dioread_nolock" result in
the same error message.

Secondly, the problem above isn't solved for remount option: the
mismatched parameter is simply ignored. Moreover, ext4_msg states
that remount with options "data=journal,delalloc" succeeded, which is
not true.

To fix it up, I added a simple check after parse_options() call to
ensure that data=journal and delalloc/dioread_nolock parameters are
not present at the same time.

Signed-off-by: Piotr Sarna <[email protected]>
Acked-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: [email protected]

ext4: allow the mount options nodelalloc and data=journal

Commit 26092bf ("ext4: use a table-driven handler for mount options")
wrongly disallows the specifying the mount options nodelalloc and
data=journal simultaneously. This is incorrect; it should have only
disallowed the combination of delalloc and data=journal
simultaneously.

Reported-by: Piotr Sarna <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: [email protected]

Merge tag 'drm-intel-fixes-2013-08-08' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes

Daniel writes:
A few bugfixes for serious stuff and regressions. Highlight is the
reinstated hack to keep the i915 backlight on when running on an optimus
machine, this prevents black screens especially with some radeon muxed
platforms. And the patch to quiet dmesg on Linus' old mac mini ;-)

* tag 'drm-intel-fixes-2013-08-08' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: do not disable backlight on vgaswitcheroo switch off
  drm/i915: Don't call encoder's get_config unless encoder is active
  drm/i915: avoid brightness overflow when doing scale
  drm/i915: update last_vblank when disabling the power well
  drm/i915: fix gen4 digital port hotplug definitions

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
This is a batch of fixes intended for the 3.11 queue...

Regarding the mac80211 (and related) bits, Johannes says:

"I have a fix from Chris for an infinite loop along with fixes from
myself to prevent it entering the loop to start with (continue using
disabled channels, many thanks to Chris for his debug/test help) and a
workaround for broken APs that advertise a bad HT primary channel in
their beacons. Additionally, a fix for another attrbuf race in mac80211
and a fix to clean up properly while P2P GO interfaces go down."

Along with that...

Solomon Peachy corrects a range check in cw1200 that would lead to
a BUG_ON when starting AP mode.

Stanislaw Gruszka provides an iwl4965 patch to power-up the device
earlier (avoiding microcode errors), and another iwl4965 fix that
resets the firmware after turning rfkill off (resolving a bug in the
Red Hat Bugzilla).

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <[email protected]>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32

Pull AVR32 build fix from Hans-Christian Egtvedt.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: boards/atngw100/mrmt.c: fix building error

userns: limit the maximum depth of user_namespace->parent chain

Ensure that user_namespace->parent chain can't grow too much.
Currently we use the hardroded 32 as limit.

Reported-by: Andy Lutomirski <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

hwmon: (adt7470) Fix incorrect return code check

In adt7470_write_word_data(), which writes two bytes using
i2c_smbus_write_byte_data(), the return codes are incorrectly AND-ed
together when they should be OR-ed together.

The return code of i2c_smbus_write_byte_data() is zero for success.

The upshot is only the first byte was ever written to the hardware.
The 2nd byte was never written out.

I noticed that trying to set the fan speed limits was not working
correctly on my system. Setting the fan speed limits is the only
code that uses adt7470_write_word_data(). After making the change
the limit settings work and the alarms work also.

Signed-off-by: Curt Brune <[email protected]>
Cc: [email protected]
Signed-off-by: Guenter Roeck <[email protected]>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bugfixes from Ted Ts'o.

Misc ext4 fixes, delayed by Ted moving mail servers and email getting
marked as spam due to bad spf records.

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: add WARN_ON to check the length of allocated blocks
  ext4: fix retry handling in ext4_ext_truncate()
  ext4: destroy ext4_es_cachep on module unload
  ext4: make sure group number is bumped after a inode allocation race

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security layer fix from James Morris:
"Smack casting fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: IPv6 casting error fix for 3.11

Merge tag 'regulator-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator DT binding fixes from Mark Brown:
"A couple of fixes to bring the DT binding documentation for Palmas
  into sync with the code"

* tag 'regulator-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: palmas-pmic: doc: remove ti,tstep
  regulator: palmas-pmic: doc: fix typo for sleep-mode

Merge tag 'regmap-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
"Two things here, one is a fix for a nasty issue where we were failing
  to sync the last register in a block when using raw writes and the
  other fixes a missing header for the !REGMAP stubs so that we don't
  rely on implicit includes in that case"

* tag 'regmap-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Add missing header for !CONFIG_REGMAP stubs
  regmap: cache: Make sure to sync the last register in a block

Merge tag 'spi-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
"Just one update for SPI, a simple fix to the davinci driver to correct
  the direction for which DMA is mapped following the dmaengine
  conversion"

* tag 'spi-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-davinci: Fix direction in dma_map_single()

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio fixes from Rusty Russell:
"More virtio console fixes than I'm happy with, but all real issues,
  and all CC:stable.."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio-scsi: Fix virtqueue affinity setup
  virtio: console: return -ENODEV on all read operations after unplug
  virtio: console: fix raising SIGIO after port unplug
  virtio: console: clean up port data immediately at time of unplug
  virtio: console: fix race in port_fops_open() and port unplug
  virtio: console: fix race with port unplug and open/close
  virtio/console: Add pipe_lock/unlock for splice_write
  virtio/console: Quit from splice_write if pipe->nrbufs is 0

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Kevin Hilman:
- MSM: GPIO fixes (includes old code removal)
- OMAP: earlyprintk regression, AM33xx cpgmac PM regression
- OMAP5: urgent fix for potentially harmful voltage regulator values
- Renesas: gpio-keys fix, fix SD card detection, fix shdma calculation
   error
- STi: critical SMP boot fix
- tegra: DTS fix for usb-phy
- a couple MAINTAINERS updates

(Arnd is on paternity leave, Kevin is stepping up to help arm-soc
maintenance)

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: add TI Keystone ARM platform
  MAINTAINERS: delete Srinidhi from ux500
  ARM: tegra: enable ULPI phy on Colibri T20
  ARM: STi: remove sti_secondary_start from INIT section.
  ARM: STi: Fix cpu nodes with correct device_type.
  ARM: shmobile: lager: do not annotate gpio_buttons as __initdata
  ARM: shmobile: BOCK-W: fix SDHI0 PFC settings
  shdma: fixup sh_dmae_get_partial() calculation error
  ARM: OMAP2+: hwmod: AM335x: fix cpgmac address space
  ARM: OMAP2+: hwmod: rt address space index for DT
  ARM: OMAP2+: Sync hwmod state with the pm_runtime and omap_device state
  ARM: OMAP2+: Avoid idling memory controllers with no drivers
  ARM: OMAP2+: hwmod: Fix a crash in _setup_reset() with DEBUG_LL
  ARM: dts: omap5-uevm: update optional/unused regulator configurations
  ARM: dts: omap5-uevm: fix regulator configurations mandatory for SoC
  ARM: dts: omap5-uevm: document regulator signals used on the actual board
  ARM: msm: Consolidate gpiomux for older architectures
  ARM: shmobile: armadillo800eva: Don't request GPIO 166 in board code
  ARM: msm: dts: Fix the gpio register address for msm8960

Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"

This reverts commit 318df36e57c0ca9f2146660d41ff28e8650af423.

This commit caused Steven Rostedt's hackbench runs to run out of memory
due to a leak.  As noted by Joonsoo Kim, it is buggy in the following
scenario:

"I guess, you may set 0 to all kmem caches's cpu_partial via sysfs,
  doesn't it?

  In this case, memory leak is possible in following case.  Code flow of
  possible leak is follwing case.

   * in __slab_free()
   1. (!new.inuse || !prior) && !was_frozen
   2. !kmem_cache_debug && !prior
   3. new.frozen = 1
   4. after cmpxchg_double_slab, run the (!n) case with new.frozen=1
   5. with this patch, put_cpu_partial() doesn't do anything,
      because this cache's cpu_partial is 0
   6. return

  In step 5, leak occur"

And Steven does indeed have cpu_partial set to 0 due to RT testing.

Joonsoo is cooking up a patch, but everybody agrees that reverting this
for now is the right thing to do.

Reported-and-bisected-by: Steven Rostedt <[email protected]>
Acked-by: Joonsoo Kim <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

avr32: boards/atngw100/mrmt.c: fix building error

there is an additional "{", which causes building error.

Signed-off-by: Cong Ding <[email protected]>
Acked-by: Hans-Christian Egtvedt <[email protected]>

ARM: Fix FIQ code on VIVT CPUs

Aaro Koskinen reports the following oops:
Installing fiq handler from c001b110, length 0x164
Unable to handle kernel paging request at virtual address ffff1224
pgd = c0004000
[ffff1224] *pgd=00000000, *pte=11fff0cb, *ppte=11fff00a
...
[<c0013154>] (set_fiq_handler+0x0/0x6c) from [<c0365d38>] (ams_delta_init_fiq+0xa8/0x160)
r6:00000164 r5:c001b110 r4:00000000 r3:fefecb4c
[<c0365c90>] (ams_delta_init_fiq+0x0/0x160) from [<c0365b14>] (ams_delta_init+0xd4/0x114)
r6:00000000 r5:fffece10 r4:c037a9e0
[<c0365a40>] (ams_delta_init+0x0/0x114) from [<c03613b4>] (customize_machine+0x24/0x30)

This is because the vectors page is now write-protected, and to change
code in there we must write to its original alias. Make that change,
and adjust the cache flushing such that the code will become visible
to the instruction stream on VIVT CPUs.

Reported-by: Aaro Koskinen <[email protected]>
Tested-by: Aaro Koskinen <[email protected]>
Signed-off-by: Russell King <[email protected]>

ALSA: usb-audio: do not trust too-big wMaxPacketSize values

The driver used to assume that the streaming endpoint's wMaxPacketSize
value would be an indication of how much data the endpoint expects or
sends, and compute the number of packets per URB using this value.

However, the Focusrite Scarlett 2i4 declares a value of 1024 bytes,
while only about 88 or 44 bytes are be actually used. This discrepancy
would result in URBs with far too few packets, which would not work
correctly on the EHCI driver.

To get correct URBs, use wMaxPacketSize only as an upper limit on the
packet size.

Reported-by: James Stone <[email protected]>
Tested-by: James Stone <[email protected]>
Cc: <[email protected]> # 2.6.35+
Signed-off-by: Clemens Ladisch <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

arm64: KVM: fix 2-level page tables unmapping

When using 64kB pages, we only have two levels of page tables,
meaning that PGD, PUD and PMD are fused. In this case, trying
to refcount PUDs and PMDs independently is a a complete disaster,
as they are the same.

We manage to get it right for the allocation (stage2_set_pte uses
{pmd,pud}_none), but the unmapping path clears both pud and pmd
refcounts, which fails spectacularly with 2-level page tables.

The fix is to avoid calling clear_pud_entry when both the pmd and
pud pages are empty. For this, and instead of introducing another
pud_empty function, consolidate both pte_empty and pmd_empty into
page_empty (the code is actually identical) and use that to also
test the validity of the pud.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

ARM: KVM: Fix unaligned unmap_range leak

The unmap_range function did not properly cover the case when the start
address was not aligned to PMD_SIZE or PUD_SIZE and an entire pte table
or pmd table was cleared, causing us to leak memory when incrementing
the addr.

The fix is to always move onto the next page table entry boundary
instead of adding the full size of the VA range covered by the
corresponding table level entry.

Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>

NFSv4: Fix up nfs4_proc_lookup_mountpoint

Currently, we do not check the return value of client = rpc_clone_client(),
nor do we shut down the resulting cloned rpc_clnt in the case where a
NFS4ERR_WRONGSEC has caused nfs4_proc_lookup_common() to replace the
original value of 'client' (causing a memory leak).

Fix both issues and simplify the code by moving the call to
rpc_clone_client() until after nfs4_proc_lookup_common() has
done its business.

Reported-by: Andy Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not match

In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield in invalid routing
table lookups when using subtrees.

Instead continue to backtrack until a valid subtree or node is found
and return this match.

Also remove unneeded NULL check.

Reported-by: Teco Boot <[email protected]>
Cc: YOSHIFUJI Hideaki <[email protected]>
Cc: David Lamparter <[email protected]>
Cc: <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

drm: Don't pass negative delta to ktime_sub_ns()

It takes an unsigned value. This happens not to blow up on 64-bit
architectures, but it does on 32-bit, causing
drm_calc_vbltimestamp_from_scanoutpos() to calculate totally bogus
timestamps for vblank events. Which in turn causes e.g. gnome-shell to
hang after a DPMS off cycle with current xf86-video-ati Git.

[airlied: regression introduced in drm: use monotonic time in drm_calc_vbltimestamp_from_scanoutpos]

Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59339
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59836
Tested-by: shui yangwei <[email protected]>
Signed-off-by: Michel Dänzer <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>

Merge branch 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux

Some more radeon fixes.  Mostly dpm and uvd fixes.  Fixes hangs
with dpm on more rv6xx asics, and fixes suspend and resume with UVD.

* 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: make missing smc ucode non-fatal
  drm/radeon/dpm: require rlc for dpm
  drm/radeon/cik: use a mutex to properly lock srbm instanced registers
  drm/radeon: remove unnecessary unpin
  drm/radeon: add more UVD CS checking
  drm/radeon: stop sending invalid UVD destroy msg
  drm/radeon: only save UVD bo when we have open handles
  drm/radeon: always program the MC on startup
  drm/radeon: fix audio dto calculation on DCE3+ (v3)
  drm/radeon/dpm: disable sclk ss on rv6xx
  drm/radeon: fix halting UVD
  drm/radeon/dpm: adjust power state properly for UVD on SI
  drm/radeon/dpm: fix spread spectrum setup (v2)
  drm/radeon/dpm: adjust thermal protection requirements
  drm/radeon: select audio dto based on encoder id for DCE3
  drm/radeon: properly handle pm on gpu reset

drm/radeon: make missing smc ucode non-fatal

The smc ucode is required for dpm (dynamic power
management), but if it's missing just skip dpm setup
and don't disable acceleration.

Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=67876

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/dpm: require rlc for dpm

The rlc is required for dpm to work properly, so if
the rlc ucode is missing, don't enable dpm. Enabling
dpm without the rlc enabled can result in hangs.

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/cik: use a mutex to properly lock srbm instanced registers

We need proper locking in the driver when accessing instanced
registers on CIK.

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: remove unnecessary unpin

We don't pin the BO on allocation, so don't unpin it on free.

Signed-off-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: add more UVD CS checking

Improve error handling in case userspace sends us
an invalid command buffer.

Signed-off-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: stop sending invalid UVD destroy msg

We also need to check the handle.

Signed-off-by: Christian König <[email protected]>
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: only save UVD bo when we have open handles

Otherwise just reinitialize from scratch on resume,
and so make it more likely to succeed.

Signed-off-by: Christian König <[email protected]>
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: always program the MC on startup

For r6xx+ asics.  This mirrors the behavior of pre-r6xx
asics.  We need to program the MC even if something
else in startup() fails.  Failure to do so results in
an unusable GPU.

Based on a fix from: Mark Kettenis <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/radeon: fix audio dto calculation on DCE3+ (v3)

Need to set the wallclock ratio and adjust the phase
and module registers appropriately. May fix problems
with audio timing at certain display timings.

v2: properly handle clocks below 24mhz
v3: rebase r600 changes

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/dpm: disable sclk ss on rv6xx

Enabling spread spectrum on the engine clock
leads to hangs on some asics.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66963

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: fix halting UVD

Removing the clock/power or resetting the VCPU can cause
hangs if that happens in the middle of a register write.

Stall the memory and register bus before putting the VCPU
into reset. Keep it in reset when unloading the module or
suspending.

Signed-off-by: Christian König <[email protected]>
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/dpm: adjust power state properly for UVD on SI

There are some hardware issue with reclocking on SI when
UVD is active, so use a stable power state when UVD is
active. Fixes possible hangs and performance issues when
using UVD on SI.

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/dpm: fix spread spectrum setup (v2)

Need to check for engine and memory clock ss separately
and only enable dynamic ss if either of them are found.

This should fix systems which have a ss table, but do
not have entries for engine or memory. On those systems
we may enable dynamic spread spectrum without enabling
it on the engine or memory clocks which can lead to a
hang in some cases.

fixes some systems reported here:
https://bugs.freedesktop.org/show_bug.cgi?id=66963

v2: fix typo

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon/dpm: adjust thermal protection requirements

On rv770 and newer, clock gating is not required
for thermal protection. The only requirement is that
the design utilizes a thermal sensor.

Signed-off-by: Alex Deucher <[email protected]>

drm/radeon: select audio dto based on encoder id for DCE3

There are two audio dtos on radeon asics that you can
select between. Normally, dto0 is used for hdmi and
dto1 for DP, but it seems that the dto is somehow
tied to the encoders on DCE3 asics.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=67435

Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

drm/radeon: properly handle pm on gpu reset

When we reset the GPU, we need to properly tear
down power management before reseting the GPU and then
set it back up again after reset. Add the missing
radeon_pm_[suspend|resume] calls to the gpu reset
function.

Signed-off-by: Alex Deucher <[email protected]>

NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget()

We only need to call it on the creation of the inode.

Reported-by: Julia Lawall <[email protected]>
Cc: Steve Dickson <[email protected]>
Cc: Dave Quigley <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFSv4: Fix the sync mount option for nfs4 mounts

The sync mount option stopped working for NFSv4 mounts after commit
c02d7adf8c5429727a98bad1d039bccad4c61c50 (NFSv4: Replace nfs4_path_walk() with
FS path lookup in a private namespace). If MS_SYNCHRONOUS is set in the
super_block that we're cloning from, then it should be set in the new
super_block as well.

Signed-off-by: Scott Mayhew <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>

NFS: Fix writeback performance issue on cache invalidation

If a cache invalidation is triggered, and we happen to have a lot of
writebacks cached at the time, then the call to invalidate_inode_pages2()
will end up calling ->launder_page() on each and every dirty page in order
to sync its contents to disk, thus defeating write coalescing.
The following patch ensures that we try to sync the inode to disk before
calling invalidate_inode_pages2() so that we do the writeback as efficiently
as possible.

Reported-by: William Dauchy <[email protected]>
Reported-by: Pascal Bouchareine <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: William Dauchy <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>

SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister

If rpcbind causes our connection to the AF_LOCAL socket to close after
we've registered a service, then we want to be careful about reconnecting
since the mount namespace may have changed.

By simply refusing to reconnect the AF_LOCAL socket in the case of
unregister, we avoid the need to somehow save the mount namespace. While
this may lead to some services not unregistering properly, it should
be safe.

Signed-off-by: Trond Myklebust <[email protected]>
Cc: Nix <[email protected]>
Cc: Jeff Layton <[email protected]>
Cc: [email protected] # 3.9.x

Merge branch 'pm-fixes'

* pm-fixes:
cpufreq: rename ignore_nice as ignore_nice_load
cpufreq: loongson2: fix regression related to clock management

Merge branch 'acpi-fixes'

* acpi-fixes:
  ACPI: Try harder to resolve _ADR collisions for bridges
  ACPI / processor: move try_offline_node() after acpi_unmap_lsapic()
  ACPI: Drop physical_node_id_bitmap from struct acpi_device
  ACPI / PM: Walk physical_node_list under physical_node_lock
  ACPI / video: improve quirk check in acpi_video_bqc_quirk()

ACPI: Try harder to resolve _ADR collisions for bridges

In theory, under a given ACPI namespace node there should be only
one child device object with _ADR whose value matches a given bus
address exactly.  In practice, however, there are systems in which
multiple child device objects under a given parent have _ADR matching
exactly the same address.  In those cases we use _STA to determine
which of the multiple matching devices is enabled, since some systems
are known to indicate which ACPI device object to associate with the
given physical (usually PCI) device this way.

Unfortunately, as it turns out, there are systems in which many
device objects under the same parent have _ADR matching exactly the
same bus address and none of them has _STA, in which case they all
should be regarded as enabled according to the spec.  Still, if
those device objects are supposed to represent bridges (e.g. this
is the case for device objects corresponding to PCIe ports), we can
try harder and skip the ones that have no child device objects in the
ACPI namespace.  With luck, we can avoid using device objects that we
are not expected to use this way.

Although this only works for bridges whose children also have ACPI
namespace representation, it is sufficient to address graphics
adapter detection issues on some systems, so rework the code finding
a matching device ACPI handle for a given bus address to implement
this idea.

Introduce a new function, acpi_find_child(), taking three arguments:
the ACPI handle of the device's parent, a bus address suitable for
the device's bus type and a bool indicating if the device is a
bridge and make it work as outlined above.  Reimplement the function
currently used for this purpose, acpi_get_child(), as a call to
acpi_find_child() with the last argument set to 'false' and make
the PCI subsystem use acpi_find_child() with the bridge information
passed as the last argument to it.  [Lan Tianyu notices that it is
not sufficient to use pci_is_bridge() for that, because the device's
subordinate pointer hasn't been set yet at this point, so use
hdr_type instead.]

This change fixes a regression introduced inadvertently by commit
33f767d (ACPI: Rework acpi_get_child() to be more efficient) which
overlooked the fact that for acpi_walk_namespace() "post-order" means
"after all children have been visited" rather than "on the way back",
so for device objects without children and for namespace walks of
depth 1, as in the acpi_get_child() case, the "post-order" callbacks
ordering is actually the same as the ordering of "pre-order" ones.
Since that commit changed the namespace walk in acpi_get_child() to
terminate after finding the first matching object instead of going
through all of them and returning the last one, it effectively
changed the result returned by that function in some rare cases and
that led to problems (the switch from a "pre-order" to a "post-order"
callback was supposed to prevent that from happening, but it was
ineffective).

As it turns out, the systems where the change made by commit
33f767d actually matters are those where there are multiple ACPI
device objects representing the same PCIe port (which effectively
is a bridge).  Moreover, only one of them, and the one we are
expected to use, has child device objects in the ACPI namespace,
so the regression can be addressed as described above.

References: https://bugzilla.kernel.org/show_bug.cgi?id=60561
Reported-by: Peter Wu <[email protected]>
Tested-by: Vladimir Lalov <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Cc: 3.9+ <[email protected]> # 3.9+