Git Repo - linux.git/log

gpu/drm: Kill off set_irq_flags usage

set_irq_flags is ARM specific with custom flags which have genirq
equivalents. Convert drivers to use the genirq interfaces directly, so we
can kill off set_irq_flags. The translation of flags is as follows:

IRQF_VALID -> !IRQ_NOREQUEST
IRQF_PROBE -> !IRQ_NOPROBE
IRQF_NOAUTOEN -> IRQ_NOAUTOEN

For IRQs managed by an irqdomain, the irqdomain core code handles clearing
and setting IRQ_NOREQUEST already, so there is no need to do this in
.map() functions and we can simply remove the set_irq_flags calls. Some
users also modify IRQ_NOPROBE and this has been maintained although it
is not clear that is really needed. There appears to be a great deal of
blind copy and paste of this code.

Signed-off-by: Rob Herring <[email protected]>
Cc: [email protected]
Cc: Russell King <[email protected]>
Cc: David Airlie <[email protected]>
Cc: [email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This fixes the following issues:

   - The selftest overreads the IV test vector.

  - Fix potential infinite loop in sunxi-ss driver.

   - Fix powerpc build failure when VMX is set without VSX"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: testmgr - don't copy from source IV too much
  crypto: sunxi-ss - Fix a possible driver hang with ciphers
  crypto: vmx - VMX crypto should depend on CONFIG_VSX

x86/platform: Fix Geode LX timekeeping in the generic x86 build

In 2007, commit 07190a08eef36 ("Mark TSC on GeodeLX reliable")
bypassed verification of the TSC on Geode LX. However, this code
(now in the check_system_tsc_reliable() function in
arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was
set.

OpenWRT has recently started building its generic Geode target
for Geode GX, not LX, to include support for additional
platforms. This broke the timekeeping on LX-based devices,
because the TSC wasn't marked as reliable:
https://dev.openwrt.org/ticket/20531

By adding a runtime check on is_geode_lx(), we can also include
the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus
fixing the problem.

Signed-off-by: David Woodhouse <[email protected]>
Cc: Andres Salomon <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

arm: KVM: Fix incorrect device to IPA mapping

A critical bug has been found in device memory stage1 translation for
VMs with more then 4GB of address space. Once vm_pgoff size is smaller
then pa (which is true for LPAE case, u32 and u64 respectively) some
more significant bits of pa may be lost as a shift operation is performed
on u32 and later cast onto u64.

Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12
expected pa(u64): 0x0000002010030000
produced pa(u64): 0x0000000010030000

The fix is to change the order of operations (casting first onto phys_addr_t
and then shifting).

Reviewed-by: Marc Zyngier <[email protected]>
[maz: fixed changelog and patch formatting]
Cc: [email protected]
Signed-off-by: Marek Majtyka <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

arm64: KVM: Fix user access for debug registers

When setting the debug register from userspace, make sure that
copy_from_user() is called with its parameters in the expected
order. It otherwise doesn't do what you think.

Fixes: 84e690bfbed1 ("KVM: arm64: introduce vcpu->arch.debug_ptr")
Reported-by: Peter Maydell <[email protected]>
Cc: Alex Bennée <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

genirq: Remove irq argument from irq flow handlers

Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.

Remove the argument.

Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Jiang Liu <[email protected]>

genirq: Move field 'msi_desc' from irq_data into irq_common_data

MSI descriptors are per-irq instead of per irqchip, so move it into
struct irq_common_data.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Kevin Cernekee <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

genirq: Move field 'affinity' from irq_data into irq_common_data

Irq affinity mask is per-irq instead of per irqchip, so move it into
struct irq_common_data.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Kevin Cernekee <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

genirq: Move field 'handler_data' from irq_data into irq_common_data

Handler data (handler_data) is per-irq instead of per irqchip, so move
it into struct irq_common_data.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Kevin Cernekee <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

genirq: Move field 'node' from irq_data into irq_common_data

NUMA node information is per-irq instead of per-irqchip, so move it into
struct irq_common_data. Also use CONFIG_NUMA to guard irq_common_data.node.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Kevin Cernekee <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag

Get rid of the handler_data abuse.

Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>

irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag

Get rid of the handler_data abuse.

Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>

genirq: Provide IRQD_FORWARDED_TO_VCPU status flag

Provide a irq data flag to mark an irq forwarded to a VCPU along with
the accessor functions.

Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>

genirq: Simplify irq_data_to_desc()

Avoid the lookup of irq_desc and use the same mechanism for
hierarchical and flat irqdomains.

Based-on-a-patch-from: Jiang Liu <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>

genirq: Remove __irq_set_handler_locked()

All users converted to irq_set_handler_locked()

Signed-off-by: Thomas Gleixner <[email protected]>

pinctrl/pistachio: Use irq_set_handler_locked

Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor. Search and replacement was done with coccinelle:

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: [email protected]

gpio: vf610: Use irq_set_handler_locked

Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor. Search and replacement was done with coccinelle:

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: [email protected]

powerpc/mpc8xx: Use irq_set_handler_locked()

Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor.

Search and replacement was done with coccinelle:

@@
struct irq_data *d;
expression E1;
@@

-__irq_set_handler_locked(d->irq, E1);
+irq_set_handler_locked(d, E1);

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]

powerpc/ipic: Use irq_set_handler_locked()

Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor.

Search and replacement was done with coccinelle:

@@
struct irq_data *d;
expression E1;
@@

-__irq_set_handler_locked(d->irq, E1);
+irq_set_handler_locked(d, E1);

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Anton Blanchard <[email protected]>
Cc: [email protected]

powerpc/cpm2: Use irq_set_handler_locked()

Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor.

Search and replacement was done with coccinelle:

@@
struct irq_data *d;
expression E1;
@@

-__irq_set_handler_locked(d->irq, E1);
+irq_set_handler_locked(d, E1);

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]

powerpc/mpc52xx: Use irq_set_handler_locked()

Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor.

Search and replacement was done with coccinelle:

@@
struct irq_data *d;
expression E1;
@@

-__irq_set_handler_locked(d->irq, E1);
+irq_set_handler_locked(d, E1);

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Anatolij Gustschin <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]

KVM: vmx: fix VPID is 0000H in non-root operation

Reference SDM 28.1:

The current VPID is 0000H in the following situations:
- Outside VMX operation. (This includes operation in system-management
  mode under the default treatment of SMIs and SMM with VMX operation;
  see Section 34.14.)
- In VMX root operation.
- In VMX non-root operation when the “enable VPID” VM-execution control
  is 0.

The VPID should never be 0000H in non-root operation when "enable VPID"
VM-execution control is 1. However, commit 34a1cd60 ("kvm: x86: vmx:
move some vmx setting from vmx_init() to hardware_setup()") remove the
codes which reserve 0000H for VMX root operation.

This patch fix it by again reserving 0000H for VMX root operation.

Cc: [email protected] # 3.19+
Fixes: 34a1cd60d17f62c1f077c1478a6c2ca8c3d17af4
Reported-by: Wincy Van <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

powerpc/mm: Recompute hash value after a failed update

If we had secondary hash flag set, we ended up modifying hash value in
the updatepp code path. Hence with a failed updatepp we will be using
a wrong hash value for the following hash insert. Fix this by
recomputing hash before insert.

Without this patch we can end up with using wrong slot number in linux
pte. That can result in us missing an hash pte update or invalidate
which can cause memory corruption or even machine check.

Fixes: 6d492ecc6489 ("powerpc/THP: Add code to handle HPTE faults for hugepages")
Cc: [email protected] # v3.11+
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

powerpc/boot: Specify ABI v2 when building an LE boot wrapper

The kernel does it, not the boot wrapper, which breaks with some
cross compilers that still default to ABI v1.

Fixes: 147c05168fc8 ("powerpc/boot: Add support for 64bit little endian wrapper")
Cc: [email protected] # v3.16+
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

genirq: Remove __irq_set_chip_handler_name_locked()

All users converted to irq_set_chip_handler_name_locked()

Signed-off-by: Thomas Gleixner <[email protected]>

pinctrl: sunxi: Use irq_set_chip_handler_name_locked()

__irq_set_chip_handler_name_locked() is about to be replaced. Use
irq_set_chip_handler_name_locked() instead.

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: [email protected]

KVM: add halt_attempted_poll to VCPU stats

This new statistic can help diagnosing VCPUs that, for any reason,
trigger bad behavior of halt_poll_ns autotuning.

For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
10+20+40+80+160+320+480 = 1110 microseconds out of every
479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
is consuming about 30% more CPU than it would use without
polling. This would show as an abnormally high number of
attempted polling compared to the successful polls.

Acked-by: Christian Borntraeger <[email protected]<
Reviewed-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

virtio/s390: handle failures of READ_VQ_CONF ccw

In virtio_ccw_read_vq_conf() the return value of ccw_io_helper()
was not checked.
If the configuration could not be read properly, we'd wrongly assume a
queue size of 0.

Let's propagate any I/O error to virtio_ccw_setup_vq() so it may
properly fail.

Signed-off-by: Pierre Morel <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

tools/virtio: propagate V=X to kernel build

Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost: move features to core

virtio 1 and any layout are core features, move them
there. This fixes vhost test.

Signed-off-by: Michael S. Tsirkin <[email protected]>

ARM: fix Thumb2 signal handling when ARMv6 is enabled

When a kernel is built covering ARMv6 to ARMv7, we omit to clear the
IT state when entering a signal handler.  This can cause the first
few instructions to be conditionally executed depending on the parent
context.

In any case, the original test for >= ARMv7 is broken - ARMv6 can have
Thumb-2 support as well, and an ARMv6T2 specific build would omit this
code too.

Relax the test back to ARMv6 or greater.  This results in us always
clearing the IT state bits in the PSR, even on CPUs where these bits
are reserved.  However, they're reserved for the IT state, so this
should cause no harm.

Cc: <[email protected]>
Fixes: d71e1352e240 ("Clear the IT state when invoking a Thumb-2 signal handler")
Acked-by: Tony Lindgren <[email protected]>
Tested-by: H. Nikolaus Schaller <[email protected]>
Tested-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Russell King <[email protected]>

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix segfault pressing -> in 'perf top' with no hist entries. (Wang Nan)

   E.g:
perf top -e page-faults --pid 11400 # 11400 generates no page-fault

- Fix propagation of thread and cpu maps, that got broken when doing incomplete
  changes to better support events with a PMU cpu mask, leading to Intel PT to
  fail with an error like:

    $ perf record -e intel_pt//u uname
    Error: The sys_perf_event_open() syscall returned with
              22 (Invalid argument) for event (sched:sched_switch).

  Because intel_pt adds that sched:sched_switch evsel to the evlist after the
  thread/cpu maps were propagated to the evsels, fix it. (Adrian Hunter)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()

cpufreq_cpu_get() called by get_cur_freq_on_cpu() is overkill,
because the ->get() callback is always invoked in a context in
which all of the conditions checked by cpufreq_cpu_get() are
guaranteed to be satisfied.

Use cpufreq_cpu_get_raw() instead of it and drop the
corresponding cpufreq_cpu_put() from get_cur_freq_on_cpu().

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>

blockdev: don't set S_DAX for misaligned partitions

The dax code doesn't currently support misaligned partitions,
so disable O_DIRECT via dax until such time as that support
materializes.

Cc: <[email protected]>
Suggested-by: Boaz Harrosh <[email protected]>
Signed-off-by: Jeff Moyer <[email protected]>
Signed-off-by: Dan Williams <[email protected]>

dax: fix O_DIRECT I/O to the last block of a blockdev

commit bbab37ddc20b (block: Add support for DAX reads/writes to
block devices) caused a regression in mkfs.xfs.  That utility
sets the block size of the device to the logical block size
using the BLKBSZSET ioctl, and then issues a single sector read
from the last sector of the device.  This results in the dax_io
code trying to do a page-sized read from 512 bytes from the end
of the device.  The result is -ERANGE being returned to userspace.

The fix is to align the block to the page size before calling
get_block.

Thanks to willy for simplifying my original patch.

Cc: <[email protected]>
Signed-off-by: Jeff Moyer <[email protected]>
Tested-by: Linda Knippers <[email protected]>
Signed-off-by: Dan Williams <[email protected]>

ia64: Enable userfaultfd and membarrier system calls

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

modsign: Fix GPL/OpenSSL licence incompatibility

The GPL does not permit us to link against the OpenSSL library. Use
LGPL for sign-file and extract-file instead.

[ The whole "openssl isn't compatible with gpl" is really just
fear-mongering, but there's no reason not to make modsign LGPL, so
nobody cares. - Linus ]

Reported-by: Julian Andres Klode <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Julian Andres Klode <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

genirq: Update the comment for generic_handle_irq_desc

__do_IRQ() was removed by commit 1c77ff2 "genirq: Remove __do_IRQ",
but the comment referring to __do_IRQ() was left.

Update the comment for generic_handle_irq_desc().

Signed-off-by: Huang Shijie <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

genirq: Remove stale comment

Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/renesas-irqc: Propagate wake-up settings to parent

The renesas-irqc interrupt controller is cascaded to the GIC, but its
driver doesn't propagate wake-up settings to the parent interrupt
controller.

Since commit aec89ef72ba6c944 ("irqchip/gic: Enable SKIP_SET_WAKE and
MASK_ON_SUSPEND"), the GIC driver masks interrupts during suspend, and
wake-up through gpio-keys now fails on r8a73a4/ape6evm.

Fix this by propagating wake-up settings to the parent interrupt
controller. There's no need to handle irq_set_irq_wake() failures, as
the renesas-irqc interrupt controller is always cascaded to a GIC, and
the GIC driver always sets SKIP_SET_WAKE since the aforementioned
commit.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Cc: Sudeep Holla <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: http://lkml.kernel.org/r/1441731636-17610-3-git-send-email-geert%[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/renesas-intc-irqpin: Propagate wake-up settings to parent

The renesas-intc-irqpin interrupt controller is cascaded to the GIC, but
its driver doesn't propagate wake-up settings to the parent interrupt
controller.

Since commit aec89ef72ba6c944 ("irqchip/gic: Enable SKIP_SET_WAKE and
MASK_ON_SUSPEND"), the GIC driver masks interrupts during suspend, and
wake-up through gpio-keys now fails on r8a7740/armadillo and
sh73a0/kzm9g.

Fix this by propagating wake-up settings to the parent interrupt
controller. There's no need to handle irq_set_irq_wake() failures, as
the renesas-intc-irqpin interrupt controller is always cascaded to a
GIC, and the GIC driver always sets SKIP_SET_WAKE since the
aforementioned commit.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Cc: Sudeep Holla <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: http://lkml.kernel.org/r/1441731636-17610-2-git-send-email-geert%[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/renesas-intc-irqpin: Use a separate lockdep class

The renesas-intc-irqpin interrupt controller is cascaded to the GIC.
Hence when propagating wake-up settings to its parent interrupt
controller, the following lockdep warning is printed:

    =============================================
    [ INFO: possible recursive locking detected ]
    4.2.0-armadillo-10725-g50fcd7643c034198 #781 Not tainted
    ---------------------------------------------
    s2ram/1179 is trying to acquire lock:
    (&irq_desc_lock_class){-.-...}, at: [<c005bb54>] __irq_get_desc_lock+0x78/0x94

    but task is already holding lock:
    (&irq_desc_lock_class){-.-...}, at: [<c005bb54>] __irq_get_desc_lock+0x78/0x94

    other info that might help us debug this:
    Possible unsafe locking scenario:

  CPU0
  ----
     lock(&irq_desc_lock_class);
     lock(&irq_desc_lock_class);

    *** DEADLOCK ***

    May be due to missing lock nesting notation

    7 locks held by s2ram/1179:
    #0:  (sb_writers#7){.+.+.+}, at: [<c00c9708>] __sb_start_write+0x64/0xb8
    #1:  (&of->mutex){+.+.+.}, at: [<c0125a00>] kernfs_fop_write+0x78/0x1a0
    #2:  (s_active#23){.+.+.+}, at: [<c0125a08>] kernfs_fop_write+0x80/0x1a0
    #3:  (autosleep_lock){+.+.+.}, at: [<c0058244>] pm_autosleep_lock+0x18/0x20
    #4:  (pm_mutex){+.+.+.}, at: [<c0057e50>] pm_suspend+0x54/0x248
    #5:  (&dev->mutex){......}, at: [<c0243a20>] __device_suspend+0xdc/0x240
    #6:  (&irq_desc_lock_class){-.-...}, at: [<c005bb54>] __irq_get_desc_lock+0x78/0x94

    stack backtrace:
    CPU: 0 PID: 1179 Comm: s2ram Not tainted 4.2.0-armadillo-10725-g50fcd7643c034198

    Hardware name: Generic R8A7740 (Flattened Device Tree)
    [<c00129f4>] (dump_backtrace) from [<c0012bec>] (show_stack+0x18/0x1c)
    [<c0012bd4>] (show_stack) from [<c03f5d94>] (dump_stack+0x20/0x28)
    [<c03f5d74>] (dump_stack) from [<c00514d4>] (__lock_acquire+0x67c/0x1b88)
    [<c0050e58>] (__lock_acquire) from [<c0052df8>] (lock_acquire+0x9c/0xbc)
    [<c0052d5c>] (lock_acquire) from [<c03fb068>] (_raw_spin_lock_irqsave+0x44/0x58)
    [<c03fb024>] (_raw_spin_lock_irqsave) from [<c005bb54>] (__irq_get_desc_lock+0x78/0x94
    [<c005badc>] (__irq_get_desc_lock) from [<c005c3d8>] (irq_set_irq_wake+0x28/0x100)
    [<c005c3b0>] (irq_set_irq_wake) from [<c01e50d0>] (intc_irqpin_irq_set_wake+0x24/0x4c)
    [<c01e50ac>] (intc_irqpin_irq_set_wake) from [<c005c17c>] (set_irq_wake_real+0x3c/0x50
    [<c005c140>] (set_irq_wake_real) from [<c005c414>] (irq_set_irq_wake+0x64/0x100)
    [<c005c3b0>] (irq_set_irq_wake) from [<c02a19b4>] (gpio_keys_suspend+0x60/0xa0)
    [<c02a1954>] (gpio_keys_suspend) from [<c023b750>] (platform_pm_suspend+0x3c/0x5c)

Avoid this false positive by using a separate lockdep class for INTC
External IRQ Pin interrupts.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Cc: Grygorii Strashko <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/1441798974-25716-3-git-send-email-geert%[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/renesas-irqc: Use a separate lockdep class

The renesas-irqc interrupt controller is cascaded to the GIC. Hence when
propagating wake-up settings to its parent interrupt controller, the
following lockdep warning is printed:

    =============================================
    [ INFO: possible recursive locking detected ]
    4.2.0-ape6evm-10725-g50fcd7643c034198 #280 Not tainted
    ---------------------------------------------
    s2ram/1072 is trying to acquire lock:
    (&irq_desc_lock_class){-.-...}, at: [<c008d3fc>] __irq_get_desc_lock+0x58/0x98

    but task is already holding lock:
    (&irq_desc_lock_class){-.-...}, at: [<c008d3fc>] __irq_get_desc_lock+0x58/0x98

    other info that might help us debug this:
    Possible unsafe locking scenario:

  CPU0
  ----
     lock(&irq_desc_lock_class);
     lock(&irq_desc_lock_class);

    *** DEADLOCK ***

    May be due to missing lock nesting notation

    6 locks held by s2ram/1072:
    #0:  (sb_writers#7){.+.+.+}, at: [<c012eb14>] __sb_start_write+0xa0/0xa8
    #1:  (&of->mutex){+.+.+.}, at: [<c019396c>] kernfs_fop_write+0x4c/0x1bc
    #2:  (s_active#24){.+.+.+}, at: [<c0193974>] kernfs_fop_write+0x54/0x1bc
    #3:  (pm_mutex){+.+.+.}, at: [<c008213c>] pm_suspend+0x10c/0x510
    #4:  (&dev->mutex){......}, at: [<c02af3c4>] __device_suspend+0xdc/0x2cc
    #5:  (&irq_desc_lock_class){-.-...}, at: [<c008d3fc>] __irq_get_desc_lock+0x58/0x98

    stack backtrace:
    CPU: 0 PID: 1072 Comm: s2ram Not tainted 4.2.0-ape6evm-10725-g50fcd7643c034198 #280
    Hardware name: Generic R8A73A4 (Flattened Device Tree)
    [<c0018078>] (unwind_backtrace) from [<c00144f0>] (show_stack+0x10/0x14)
    [<c00144f0>] (show_stack) from [<c0451f14>] (dump_stack+0x88/0x98)
    [<c0451f14>] (dump_stack) from [<c007b29c>] (__lock_acquire+0x15cc/0x20e4)
    [<c007b29c>] (__lock_acquire) from [<c007c6e0>] (lock_acquire+0xac/0x12c)
    [<c007c6e0>] (lock_acquire) from [<c0457c00>] (_raw_spin_lock_irqsave+0x40/0x54)
    [<c0457c00>] (_raw_spin_lock_irqsave) from [<c008d3fc>] (__irq_get_desc_lock+0x58/0x98)
    [<c008d3fc>] (__irq_get_desc_lock) from [<c008ebbc>] (irq_set_irq_wake+0x20/0xf8)
    [<c008ebbc>] (irq_set_irq_wake) from [<c0260770>] (irqc_irq_set_wake+0x20/0x4c)
    [<c0260770>] (irqc_irq_set_wake) from [<c008ec28>] (irq_set_irq_wake+0x8c/0xf8)
    [<c008ec28>] (irq_set_irq_wake) from [<c02cb8c0>] (gpio_keys_suspend+0x74/0xc0)
    [<c02cb8c0>] (gpio_keys_suspend) from [<c02ae8cc>] (dpm_run_callback+0x54/0x124)

Avoid this false positive by using a separate lockdep class for IRQC
interrupts.

Signed-off-by: Geert Uytterhoeven <[email protected]>
Cc: Grygorii Strashko <[email protected]>
Cc: Magnus Damm <[email protected]>
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/1441798974-25716-2-git-send-email-geert%[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/GICv2m: Fix GICv2m build warning on 32 bits

After GICv2m was enabled for 32-bit ARM kernel, a warning popped up:

drivers/irqchip/irq-gic-v2m.c: In function gicv2m_compose_msi_msg:
drivers/irqchip/irq-gic-v2m.c:100:2: warning: right shift count >= width
of type [enabled by default]
msg->address_hi = (u32) (addr >> 32);
^

This patch fixes it by using proper macros for splitting up the value.

Signed-off-by: Pavel Fedin <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: Stuart Yoder <[email protected]>
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/gic-v3-its: Add missing cache flushes

When the ITS is configured for non-cacheable transactions, make sure
that the allocated, zeroed memory is flushed to the Point of
Coherency, allowing the ITS to observe the zeros instead of random
garbage (or even get its own data overwritten by zeros being evicted
from the cache...).

Fixes: 241a386c7dbb "irqchip: gicv3-its: Use non-cacheable accesses when no shareability"
Reported-and-tested-by: Stuart Yoder <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: Pavel Fedin <[email protected]>
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

irqchip/GIC: Add workaround for aliased GIC400

The GICv2 architecture mandates that the two 4kB GIC regions are
contiguous, and on two separate physical pages (so that access to
the second page can be trapped by a hypervisor). This doesn't work
very well when PAGE_SIZE is 64kB.

A relatively common hack^Wway to work around this is to alias each
4kB region over its own 64kB page. Of course in this case, the base
address you want to use is not really the begining of the region,
but base + 60kB (so that you get a contiguous 8kB region over two
distinct pages).

Normally, this would be described in DT with a new property, but
some HW is already out there, and the firmware makes sure that
it will override whatever you put in the GIC node. Duh. And of course,
said firmware source code is not available, despite being based
on u-boot.

The workaround is to detect the case where the CPU interface size
is set to 128kB, and verify the aliasing by checking that the ID
register for GIC400 (which is the only GIC wired this way so far)
is the same at base and base + 0xF000. In this case, we update
the GIC base address and let it roll.

And if you feel slightly sick by looking at this, rest assured that
I do too...

Reported-by: Julien Grall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: Stuart Yoder <[email protected]>
Cc: Pavel Fedin <[email protected]>
Cc: Jason Cooper <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

platform-msi: Do not cache msi_desc in handler_data

The current implementation of platform MSI caches the msi_desc
pointer in irq_data::handler_data. This is a bit silly, as
we also have irq_data::msi_desc, which is perfectly valid.

Remove the useless assignment and simplify the whole flow.

Reported-by: Ma Jun <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Jiang Liu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

net/mlx4_en: Use access helper irq_data_get_affinity_mask()

This is a preparatory patch for moving irq_data struct members. Search
and replace was done with coccinelle

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Amir Vadai <[email protected]>

powerpc, irq: Use access helper irq_data_get_affinity_mask()

Use access helper irq_data_get_affinity_mask() so we can move the
affinity mask to irq_common_data.

Signed-off-by: Jiang Liu <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

kvm: fix zero length mmio searching

Currently, if we had a zero length mmio eventfd assigned on
KVM_MMIO_BUS. It will never be found by kvm_io_bus_cmp() since it
always compares the kvm_io_range() with the length that guest
wrote. This will cause e.g for vhost, kick will be trapped by qemu
userspace instead of vhost. Fixing this by using zero length if an
iodevice is zero length.

Cc: [email protected]
Cc: Gleb Natapov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: fix double free for fast mmio eventfd

We register wildcard mmio eventfd on two buses, once for KVM_MMIO_BUS
and once on KVM_FAST_MMIO_BUS but with a single iodev
instance. This will lead to an issue: kvm_io_bus_destroy() knows
nothing about the devices on two buses pointing to a single dev. Which
will lead to double free[1] during exit. Fix this by allocating two
instances of iodevs then registering one on KVM_MMIO_BUS and another
on KVM_FAST_MMIO_BUS.

CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
Stack:
ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
Call Trace:
[<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
[<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
[<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
[<ffffffff811f69f7>] __fput+0xe7/0x250
[<ffffffff811f6bae>] ____fput+0xe/0x10
[<ffffffff81093f04>] task_work_run+0xd4/0xf0
[<ffffffff81079358>] do_exit+0x368/0xa50
[<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
[<ffffffff81079ad5>] do_group_exit+0x45/0xb0
[<ffffffff81085c71>] get_signal+0x291/0x750
[<ffffffff810144d8>] do_signal+0x28/0xab0
[<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
[<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
[<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
[<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
[<ffffffff817cb9af>] int_signal+0x12/0x17
Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
RSP <ffff88020e7f3bc8>

Cc: [email protected]
Cc: Gleb Natapov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: factor out core eventfd assign/deassign logic

This patch factors out core eventfd assign/deassign logic and leaves
the argument checking and bus index selection to callers.

Cc: [email protected]
Cc: Gleb Natapov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd

We only want zero length mmio eventfd to be registered on
KVM_FAST_MMIO_BUS. So check this explicitly when arg->len is zero to
make sure this.

Cc: [email protected]
Cc: Gleb Natapov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

arm64: head.S: initialise mdcr_el2 in el2_setup

When entering the kernel at EL2, we fail to initialise the MDCR_EL2
register which controls debug access and PMU capabilities at EL1.

This patch ensures that the register is initialised so that all traps
are disabled and all the PMU counters are available to the host. When a
guest is scheduled, KVM takes care to configure trapping appropriately.

Cc: <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: enable generic idle loop

Enable generic idle loop for ARM64, so can support for hlt/nohlt
command line options to override default idle loop behavior.

Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

perf tests: Fix software clock events test setting maps

The test titled "Test software clock events have valid period values"
was setting cpu/thread maps directly. Make it use the proper function
perf_evlist__set_maps() especially now that it also propagates the maps.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf tests: Fix task exit test setting maps

The test titled "Test number of exit event of a simple workload" was
setting cpu/thread maps directly. Make it use the proper function
perf_evlist__set_maps() especially now that it also propagates the maps.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Fix create_syswide_maps() not propagating maps

Fix it by making it call perf_evlist__set_maps() instead of setting the
maps itself.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Fix add() not propagating maps

If evsels are added after maps are created, then they won't have any
maps propagated to them. Fix that.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Moved the moving of propagate_maps() to the patch before, so that this
one does _just_ the one lile fix calling in add()]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Factor out a function to propagate maps for a single evsel

Subsequent fixes will need a function that just propagates maps for a
single evsel so factor it out.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Moved them to before perf_evlist__add() to avoid having to move it in the next patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Make create_maps() use set_maps()

Since there is a function to set maps, perf_evlist__create_maps() should
use it.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Make set_maps() more resilient

Make perf_evlist__set_maps() more resilient by allowing for the
possibility that one or another of the maps isn't being changed and
therefore should not be "put".

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evsel: Add own_cpus member

perf_evlist__propagate_maps() cannot easily tell if an evsel has its own
cpu map. To make that simpler, keep a copy of the PMU cpu map and
adjust the propagation logic accordingly.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Fix missing thread_map__put in propagate_maps()

perf_evlist__propagate_maps() incorrectly assumes evsel->threads is NULL
before reassigning it, but it won't be NULL when perf_evlist__set_maps()
is used to set different (or NULL) maps. Thus thread_map__put must be
used, which works even if evsel->threads is NULL.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Fix splice_list_tail() not setting evlist

Commit d49e46950772 ("perf evsel: Add a backpointer to the evlist a
evsel is in") updated perf_evlist__add() but not
perf_evlist__splice_list_tail().

This illustrates that it is better if perf_evlist__splice_list_tail()
calls perf_evlist__add() instead of duplicating the logic, so do that.
This will also simplify a subsequent fix for propagating maps.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Add has_user_cpus member

Subsequent patches will need to call perf_evlist__propagate_maps without
reference to a "target". Add evlist->has_user_cpus to record whether
the user has specified which cpus to target (and therefore whether that
list of cpus should override the default settings for a selected event
i.e. the cpu maps should be propagated)

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Remove redundant validation from propagate_maps()

The validation checks that the values that were just assigned, got
assigned i.e. the error can't ever happen. Subsequent patches will call
this code in places where errors are not being returned. Changing those
code paths to return this non-existent error is counter-productive, so
just remove it.

That in turn results in perf_evlist__set_maps not needing to return an
error, but callers aren't checking it either, so remove that too.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Simplify set_maps() logic

Don't need to check for NULL when "putting" evlist->maps and
evlist->threads because the "put" functions already do that.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

perf evlist: Simplify propagate_maps() logic

If evsel->cpus is to be reassigned then the current value must be "put",
which works even if it is NULL. Simplify the current logic by moving
the "put" next to the assignment.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

cxl: Fix build failure due to -Wunused-variable behaviour change

A recent change in gcc caused this build failure:

/var/lib/jenkins/workspace/gcc_kernel_build/linux/drivers/misc/cxl/cxl.h:72:27:
error: ‘CXL_PSL_DLCNTL’ defined but not used [-Werror=unused-const-variable]
static const cxl_p1_reg_t CXL_PSL_DLCNTL = {0x0060};

Because of this gcc commit:

Commit 1bca8cbd0c68366f07277f98ce6963e10c2aa617 by mark
PR28901 -Wunused-variable ignores unused const initialised variables in C
12 years ago it was decided that -Wunused-variable shouldn't warn about
static const variables because some code used const static char rcsid[]
strings which were never used but wanted in the code anyway. But as the
bug points out this hides some real bugs. These days the usage of
rcsids is not very popular anymore. So this patch changes the default
to warn about unused static const variables in C with
-Wunused-variable. And it adds a new option -Wno-unused-const-variable
to turn this warning off. For C++ this new warning is off by default,
since const variables can be used as #defines in C++. New testcases for
the new defaults in C and C++ are included testing the new warning and
suppressing it with an unused attribute or using
-Wno-unused-const-variable. gcc/ChangeLog

The cxl driver uses static consts in place of #defines in some cases
for type safety, so this change causes the driver to fail to build on
new copilers as these constants are not all used in every file that
imports the header. Suppress the warning for this driver to return to
the old behaviour of -Wunused-variable.

Reported-by: Anton Blanchard <[email protected]>
Signed-off-by: Ian Munsie <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

cxl: Fix unbalanced pci_dev_get in cxl_probe

Currently the first thing we do in cxl_probe is to grab a reference
on the pci device. Later on, we call device_register on our adapter.
In our remove path, we call device_unregister, but we never call
pci_dev_put. We therefore leak the device every time we do a
reflash.

device_register/unregister is sufficient to hold the reference.
Therefore, drop the call to pci_dev_get.

Here's why this is safe.
The proposed cxl_probe(pdev) calls cxl_adapter_init:
    a) init calls cxl_adapter_alloc, which creates a struct cxl,
       conventionally called adapter. This struct contains a
       device entry, adapter->dev.

    b) init calls cxl_configure_adapter, where we set
       adapter->dev.parent = &dev->dev (here dev is the pci dev)

So at this point, the cxl adapter's device's parent is the PCI
device that I want to be refcounted properly.

    c) init calls cxl_register_adapter
       *) cxl_register_adapter calls device_register(&adapter->dev)

So now we're in device_register, where dev is the adapter device, and
we want to know if the PCI device is safe after we return.

device_register(&adapter->dev) calls device_initialize() and then
device_add().

device_add() does a get_device(). device_add() also explicitly grabs
the device's parent, and calls get_device() on it:

         parent = get_device(dev->parent);

So therefore, device_register() takes a lock on the parent PCI dev,
which is what pci_dev_get() was guarding. pci_dev_get() can therefore
be safely removed.

Fixes: f204e0b8cedd ("cxl: Driver code for powernv PCIe based cards for userspace access")
Cc: [email protected]
Signed-off-by: Daniel Axtens <[email protected]>
Acked-by: Ian Munsie <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>

locking/static_keys: Fix up the static keys documentation

Fix a few small mistakes in the static key documentation and
delete an unneeded sentence.

Suggested-by: Jason Baron <[email protected]>
Signed-off-by: Jonathan Corbet <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()

This commit removes all CONFIG_.*{,_MODULE} in ACPI code, replacing it
with IS_ENABLED().

Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

ACPI: int340x_thermal: add missing CONFIG_ prefix

This patch adds the missing CONFIG_ prefix to INTEL_SOC_DTS_THERMAL
macros.

Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"A couple build fixes for drivers introduced in the merge window and a
  handful of patches to add more critical clocks on rockchip SoCs that
  are affected by newly introduced gpio clock handling"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: rockchip: Add pclk_peri to critical clocks on RK3066/RK3188
  clk: rockchip: add pclk_cpu to the list of rk3188 critical clocks
  clk: rockchip: handle critical clocks after registering all clocks
  clk: Hi6220: separately build stub clock driver
  clk: h8s2678: Fix compile error

Merge branch 'for-rafael' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq into pm-devfreq

Pull devfreq updates for v4.3 from MyungJoo Ham.

* 'for-rafael' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq:
  PM / devfreq: Fix incorrect type issue.
  PM / devfreq: tegra: Update governor to use devfreq_update_stats()
  PM / devfreq: comments for get_dev_status usage updated
  PM / devfreq: drop comment about thermal setting max_freq
  PM / devfreq: cache the last call to get_dev_status()
  PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
  PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
  PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
  PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
  PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding

Conflicts:
drivers/devfreq/event/exynos-ppmu.c

selftests: exec: revert to default emit rule

With the previous patch, the installation method change from install
to rsync. There is no need to create subdir during test, the
default EMIT_TESTS is enough.

This patch essentially revert commit 84cbd9e4 ("selftests/exec: do not
install subdir as it is already created").

Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Bamvor Jian Zhang <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests: change install command to rsync

The command of install could not handle the special files in exec
testcases, change the default rule to rsync to fix this.

The installation is unchanged after this commit.

Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Bamvor Jian Zhang <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests: mqueue: simplify the Makefile

Use make's implict rule for building simple C programs.

Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Bamvor Jian Zhang <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests: mqueue: allow extra cflags

Change from = to += in order to allows the user to pass whatever
CFLAGS they wish(E.g. pass the proper headers and librareis
(popt.h and libpopt.so) in cross-compiling)

Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Bamvor Jian Zhang <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests: rename jump label to static_keys

Commit 2bf9e0ab08c6 ("locking/static_keys: Provide a selftest")
renamed jump_label directory to static_keys and failed to update
the Makefile, causing the selftests build to fail.

This commit fixes it by updating the Makefile with the new name
and also moves the entry into the correct position to keep the
list alphabetically sorted.

Fixes: 2bf9e0ab08c6 ("locking/static_keys: Provide a selftest")
Signed-off-by: Bamvor Jian Zhang <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

selftests/seccomp: add support for s390

This adds support for s390 to the seccomp selftests. Some improvements
were made to enhance the accuracy of failure reporting, and additional
tests were added to validate assumptions about the currently traced
syscall. Also adds early asserts for running on older kernels to avoid
noise when the seccomp syscall is not implemented.

Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

seltests/zram: fix syntax error

Not all shells define a variable UID. This is a bash and zsh feature only.
In other shells, the UID variable is not defined, so here test command
expands to [ != 0 ] which is a syntax error.

Without this patch:
root@HGH1000007090:/opt/work/linux/tools/testing/selftests/zram# sh zram.sh
zram.sh: 8: [: !=: unexpected operator
zram.sh : No zram.ko module or /dev/zram0 device file not found
zram.sh : CONFIG_ZRAM is not set

With this patch:
root@HGH1000007090:/opt/work/linux/tools/testing/selftests/zram# sh ./zram.sh
zram.sh : No zram.ko module or /dev/zram0 device file not found
zram.sh : CONFIG_ZRAM is not set

Signed-off-by: Zhang Zhen <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

clk: rockchip: add critical clock for rk3368

Again a result of the gpio-clock-liberation the rk3368 needs the
pclk_pd_pmu marked as critical, to boot successfully.

Reported-by: Mark Rutland <[email protected]>
Signed-off-by: Heiko Stuebner <[email protected]>
Tested-by: Mark Rutland <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>

Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French:
"Two small cifs fixes"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] mount option sec=none not displayed properly in /proc/mounts
CIFS: fix type confusion in copy offload ioctl

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"A number of fixes for the merge window, fixing a number of cases
  missed when testing the uaccess code, particularly cases which only
  show up with certain compiler versions"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8431/1: fix alignement of __bug_table section entries
  arm/xen: Enable user access to the kernel before issuing a privcmd call
  ARM: domains: add memory dependencies to get_domain/set_domain
  ARM: domains: thread_info.h no longer needs asm/domains.h
  ARM: uaccess: fix undefined instruction on ARMv7M/noMMU
  ARM: uaccess: remove unneeded uaccess_save_and_disable macro
  ARM: swpan: fix nwfpe for uaccess changes
  ARM: 8429/1: disable GCC SRA optimization

perf top: Fix segfault pressing -> with no hist entries

'perf top' segfaults with following operation:

# perf top -e page-faults -p 11400 # 11400 never generates page-fault

Then on the resulting empty interface, press right key:

  # ./perf top -e page-faults -p 11400
  perf: Segmentation fault
  -------- backtrace --------
  ./perf[0x535428]
  /lib64/libc.so.6(+0x3545f)[0x7f0dd360745f]
  ./perf[0x531d46]
  ./perf(perf_evlist__tui_browse_hists+0x96)[0x5340d6]
  ./perf[0x44ba2f]
  /lib64/libpthread.so.0(+0x81d0)[0x7f0dd49dc1d0]
  /lib64/libc.so.6(clone+0x6c)[0x7f0dd36b90dc]

The bug resides in perf_evsel__hists_browse() that, in the above
circumstance browser->selection can be NULL, but code after
skip_annotation doesn't consider it.

This patch fix it by checking browser->selection before fetching
browser->selection->map.

Signed-off-by: Wang Nan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

KVM: make the declaration of functions within 80 characters

After 'commit 0b8ba4a2b658 ("KVM: fix checkpatch.pl errors in
kvm/coalesced_mmio.h")', the declaration of the two function will exceed 80
characters.

This patch reduces the TAPs to make each line in 80 characters.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

x86/apic: Serialize LVTT and TSC_DEADLINE writes

The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an
MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not
guaranteed that the write to LVTT has reached the APIC before the
TSC_DEADLINE MSR is written. In such a case the write to the MSR is
ignored and as a consequence the local timer interrupt never fires.

The SDM decribes this issue for xAPIC and x2APIC modes. The
serialization methods recommended by the SDM differ.

xAPIC:
"1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b.
  2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter.
  3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2.
  4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline."

x2APIC:
"To allow for efficient access to the APIC registers in x2APIC mode,
  the serializing semantics of WRMSR are relaxed when writing to the
  APIC registers. Thus, system software should not use 'WRMSR to APIC
  registers in x2APIC mode' as a serializing instruction. Read and write
  accesses to the APIC registers will occur in program order. A WRMSR to
  an APIC register may complete before all preceding stores are globally
  visible; software can prevent this by inserting a serializing
  instruction, an SFENCE, or an MFENCE before the WRMSR."

The xAPIC method is to just wait for the memory mapped write to hit
the LVTT by checking whether the MSR write has reached the hardware.
There is no reason why a proper MFENCE after the memory mapped write would
not do the same. Andi Kleen confirmed that MFENCE is sufficient for the
xAPIC case as well.

Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done
unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE
support.

[ tglx: Massaged the changelog ]

Signed-off-by: Shaohua Li <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: [email protected] #v3.7+
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

x86/ioapic: Force affinity setting in setup_ioapic_dest()

The recent ioapic cleanups changed the affinity setting in
setup_ioapic_dest() from a direct write to the hardware to the delayed
affinity setup via irq_set_affinity().

That results in a warning from chained_irq_exit():
WARNING: CPU: 0 PID: 5 at kernel/irq/migration.c:32 irq_move_masked_irq
[<ffffffff810a0a88>] irq_move_masked_irq+0xb8/0xc0
[<ffffffff8103c161>] ioapic_ack_level+0x111/0x130
[<ffffffff812bbfe8>] intel_gpio_irq_handler+0x148/0x1c0

The reason is that irq_set_affinity() does not write directly to the
hardware. It marks the affinity setting as pending and executes it
from the next interrupt. The chained handler infrastructure does not
take the irq descriptor lock for performance reasons because such a
chained interrupt is not visible to any interfaces. So the delayed
affinity setting triggers the warning in irq_move_masked_irq().

Restore the old behaviour by calling the set_affinity function of the
ioapic chip in setup_ioapic_dest(). This is safe as none of the
interrupts can be on the fly at this point.

Fixes: aa5cb97f14a2 'x86/irq: Remove x86_io_apic_ops.set_affinity and related interfaces'
Reported-and-tested-by: Mika Westerberg <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: [email protected]

KVM: arm64: add workaround for Cortex-A57 erratum #852523

When restoring the system register state for an AArch32 guest at EL2,
writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
which can lead to the guest effectively running with junk in the DACR
and running into unexpected domain faults.

This patch works around the issue by re-ordering our restoration of the
AArch32 register aliases so that they happen before the AArch64 system
registers. Ensuring that the registers are restored in this order
guarantees that they will be correctly synchronised by the core.

Cc: <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>

Merge tag 'kvm-arm-for-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM changes for 4.3-rc2

- Fix timer interrupt injection after the rework
that went in during the merge window
- Reset the timer to zero on reboot
- Make sure the TCR_EL2 RES1 bits are really set to 1
- Fix a PSCI affinity bug for non-existing vcpus

KVM: fix polling for guest halt continued even if disable it

If there is already some polling ongoing, it's impossible to disable the
polling, since as soon as somebody sets halt_poll_ns to 0, polling will
never stop, as grow and shrink are only handled if halt_poll_ns is != 0.

This patch fix it by reset vcpu->halt_poll_ns in order to stop polling
when polling is disabled.

Reported-by: Christian Borntraeger <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

x86/paravirt: Remove the unused pv_time_ops::get_tsc_khz method

It's not used anywhere.

Signed-off-by: Juergen Gross <[email protected]>
Acked-by: Rusty Russell <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

arm64: pgtable: use a single bit for PTE_WRITE regardless of DBM

Depending on CONFIG_ARM64_HW_AFDBM, we use either bit 57 or 51 of the
pte to represent PTE_WRITE. Given that bit 51 is reserved prior to
ARMv8.1, we can just use that bit regardless of the config option. That
also matches what happens if a kernel configured with ARM64_HW_AFDBM=y
is run on a CPU without the DBM functionality.

Cc: Julien Grall <[email protected]>
Tested-by: Julien Grall <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: Fix pte_modify() to preserve the hardware dirty information

The pte_modify() function with hardware AF/DBM enabled must transfer the
hardware dirty information to the software PTE_DIRTY bit. However, it
was setting this bit in newprot and the mask does not cover such bit.
This patch sets PTE_DIRTY on the original pte which will be preserved in
the returned value.

Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits")
Cc: Julien Grall <[email protected]>
Tested-by: Julien Grall <[email protected]>
Tested-by: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: Fix the pte_hw_dirty() check when AF/DBM is enabled

Commit 2f4b829c625e ("arm64: Add support for hardware updates of the
access and dirty pte bits") introduced support for handling hardware
updates of the access flag and dirty status. The PTE is automatically
dirtied in hardware (if supported) by clearing the PTE_RDONLY bit when
the PTE_DBM/PTE_WRITE bit is set. The pte_hw_dirty() macro was added to
detect a hardware dirtied pte. The pte_dirty() macro checks for both
software PTE_DIRTY and pte_hw_dirty().

Functions like pte_modify() clear the PTE_RDONLY bit since it is meant
to be set in set_pte_at() when written to memory. In such cases,
pte_hw_dirty() would return true even though such pte is clean. This
patch changes pte_hw_dirty() to test the PTE_DBM/PTE_WRITE bit together
with PTE_RDONLY.

Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits")
Reported-by: Julien Grall <[email protected]>
Tested-by: Julien Grall <[email protected]>
Tested-by: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

arm64: dma-mapping: check whether cma area is initialized or not

If CMA is turned on and CMA size is set to zero, kernel should
behave as if CMA was not enabled at compile time.
Every dma allocation should check existence of cma area
before requesting memory.

Arm has done this by commit e464ef16c4f0 ("arm: dma-mapping: add
checking cma area initialized"), also do this for arm64.

Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: Will Deacon <[email protected]>

x86/ldt: Fix small LDT allocation for Xen

While the following commit:

37868fe113 ("x86/ldt: Make modify_ldt synchronous")

added a nice comment explaining that Xen needs page-aligned
whole page chunks for guest descriptor tables, it then
nevertheless used kzalloc() on the small size path.

As I'm unaware of guarantees for kmalloc(PAGE_SIZE, ) to return
page-aligned memory blocks, I believe this needs to be switched
back to __get_free_page() (or better get_zeroed_page()).

Signed-off-by: Jan Beulich <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: David Vrabel <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

clockevents: Remove unused set_mode() callback

All users are migrated to the per-state callbacks, get rid of the
unused interface and the core support code.

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: John Stultz <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <[email protected]>