Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Spinlock performance regression fix, plus documentation fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/static_keys: Fix up the static keys documentation
locking/qspinlock/x86: Only emit the test-and-set fallback when building guest support
locking/qspinlock/x86: Fix performance regression under unaccelerated VMs
locking/static_keys: Fix a silly typo
Paolo Bonzini [Thu, 17 Sep 2015 14:51:59 +0000 (16:51 +0200)]
Merge tag 'kvm-arm-for-4.3-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
Second set of KVM/ARM changes for 4.3-rc2
- Workaround for a Cortex-A57 erratum
- Bug fix for the debugging infrastructure
- Fix for 32bit guests with more than 4GB of address space
on a 32bit host
- A number of fixes for the (unusual) case when we don't use
the in-kernel GIC emulation
- Removal of ThumbEE handling on arm64, since these have been
dropped from the architecture before anyone actually ever
built a CPU
- Remove the KVM_ARM_MAX_VCPUS limitation which has become
fairly pointless
x86/pci/dma: Fix gfp flags for coherent DMA memory allocation
Commit 6894258eda2f reversed the order of gfp_flags adjustment in
dma_alloc_attrs() for x86 [arch/x86/kernel/pci-dma.c] As a result,
relevant flags set by dma_alloc_coherent_gfp_flags() are just
discarded and cause coherent DMA memory allocation failure on some
devices.
Ming Lei [Wed, 2 Sep 2015 06:31:21 +0000 (14:31 +0800)]
arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:
1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;
3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Will Deacon [Tue, 15 Sep 2015 16:15:33 +0000 (17:15 +0100)]
arm64: KVM: Remove all traces of the ThumbEE registers
Although the ThumbEE registers and traps were present in earlier
versions of the v8 architecture, it was retrospectively removed and so
we can do the same.
Whilst this breaks migrating a guest started on a previous version of
the kernel, it is much better to kill these (non existent) registers
as soon as possible.
Marc Zyngier [Wed, 16 Sep 2015 15:18:59 +0000 (16:18 +0100)]
arm: KVM: Disable virtual timer even if the guest is not using it
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Marc Zyngier [Wed, 16 Sep 2015 15:18:59 +0000 (16:18 +0100)]
arm64: KVM: Disable virtual timer even if the guest is not using it
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Will Deacon [Tue, 17 Mar 2015 12:15:02 +0000 (12:15 +0000)]
arm64: errata: add module build workaround for erratum #843419
Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
lead to a memory access using an incorrect address in certain sequences
headed by an ADRP instruction.
There is a linker fix to generate veneers for ADRP instructions, but
this doesn't work for kernel modules which are built as unlinked ELF
objects.
This patch adds a new config option for the erratum which, when enabled,
builds kernel modules with the mcmodel=large flag. This uses absolute
addressing for all kernel symbols, thereby removing the use of ADRP as
a PC-relative form of addressing. The ADRP relocs are removed from the
module loader so that we fail to load any potentially affected modules.
Will Deacon [Tue, 15 Sep 2015 11:07:06 +0000 (12:07 +0100)]
arm64: compat: fix vfp save/restore across signal handlers in big-endian
When saving/restoring the VFP registers from a compat (AArch32)
signal frame, we rely on the compat registers forming a prefix of the
native register file and therefore make use of copy_{to,from}_user to
transfer between the native fpsimd_state and the compat_vfp_sigframe.
Unfortunately, this doesn't work so well in a big-endian environment.
Our fpsimd save/restore code operates directly on 128-bit quantities
(Q registers) whereas the compat_vfp_sigframe represents the registers
as an array of 64-bit (D) registers. The architecture packs the compat D
registers into the Q registers, with the least significant bytes holding
the lower register. Consequently, we need to swap the 64-bit halves when
converting between these two representations on a big-endian machine.
This patch replaces the __copy_{to,from}_user invocations in our
compat VFP signal handling code with explicit __put_user loops that
operate on 64-bit values and swap them accordingly.
leds:lp55xx: Correct Kconfig dependency for f/w user helper
The commit [b67893206fc0: leds:lp55xx: fix firmware loading error]
tries to address the firmware file handling with user helper, but it
sets a wrong Kconfig CONFIG_FW_LOADER_USER_HELPER_FALLBACK. Since the
wrong option was enabled, the system got a regression -- it suffers
from the unexpected long delays for non-present firmware files.
This patch corrects the Kconfig dependency to the right one,
CONFIG_FW_LOADER_USER_HELPER. This doesn't change the fallback
behavior but only enables UMH when needed.
powerpc32: memset: only use dcbz once cache is enabled
memset() uses instruction dcbz to speed up clearing by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memset(). Allthough no part of the code
explicitly uses memset(), GCC may make calls to it.
This patch modifies memset() such that at startup, memset()
unconditionally skip the optimised bloc that uses dcbz instruction.
Once the initial MMU is set up, in machine_init() we patch memset()
by replacing this inconditional jump by a NOP
powerpc32: memcpy: only use dcbz once cache is enabled
memcpy() uses instruction dcbz to speed up copy by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memcpy(). Allthough no part of the code
explicitly uses memcpy(), GCC makes calls to it.
This patch modifies memcpy() such that at startup, memcpy()
unconditionally jumps to generic_memcpy() which doesn't use
the dcbz instruction.
Once the initial MMU is set up, in machine_init() we patch memcpy()
by replacing this inconditional jump by a NOP
Doug Anderson [Wed, 26 Aug 2015 17:26:49 +0000 (18:26 +0100)]
ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints
In (23a4e40 arm: kgdb: Handle read-only text / modules) we moved to
using patch_text() to set breakpoints so that we could handle the case
when we had CONFIG_DEBUG_RODATA. That patch used patch_text().
Unfortunately, patch_text() assumes that we're not in atomic context
when it runs since it needs to grab a mutex and also wait for other
CPUs to stop (which it does with a completion).
This would result in a stack crawl if you had
CONFIG_DEBUG_ATOMIC_SLEEP and tried to set a breakpoint in kgdb. The
crawl looked something like:
BUG: scheduling while atomic: swapper/0/0/0x00010007
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc7-00133-geb63b34 #1073
Hardware name: Rockchip (Device Tree)
(unwind_backtrace) from [<c00133d4>] (show_stack+0x20/0x24)
(show_stack) from [<c05400e8>] (dump_stack+0x84/0xb8)
(dump_stack) from [<c004913c>] (__schedule_bug+0x54/0x6c)
(__schedule_bug) from [<c054065c>] (__schedule+0x80/0x668)
(__schedule) from [<c0540cfc>] (schedule+0xb8/0xd4)
(schedule) from [<c0543a3c>] (schedule_timeout+0x2c/0x234)
(schedule_timeout) from [<c05417c0>] (wait_for_common+0xf4/0x188)
(wait_for_common) from [<c0541874>] (wait_for_completion+0x20/0x24)
(wait_for_completion) from [<c00a0104>] (__stop_cpus+0x58/0x70)
(__stop_cpus) from [<c00a0580>] (stop_cpus+0x3c/0x54)
(stop_cpus) from [<c00a06c4>] (__stop_machine+0xcc/0xe8)
(__stop_machine) from [<c00a0714>] (stop_machine+0x34/0x44)
(stop_machine) from [<c00173e8>] (patch_text+0x28/0x34)
(patch_text) from [<c001733c>] (kgdb_arch_set_breakpoint+0x40/0x4c)
(kgdb_arch_set_breakpoint) from [<c00a0d68>] (kgdb_validate_break_address+0x2c/0x60)
(kgdb_validate_break_address) from [<c00a0e90>] (dbg_set_sw_break+0x1c/0xdc)
(dbg_set_sw_break) from [<c00a2e88>] (gdb_serial_stub+0x9c4/0xba4)
(gdb_serial_stub) from [<c00a11cc>] (kgdb_cpu_enter+0x1f8/0x60c)
(kgdb_cpu_enter) from [<c00a18cc>] (kgdb_handle_exception+0x19c/0x1d0)
(kgdb_handle_exception) from [<c0016f7c>] (kgdb_compiled_brk_fn+0x30/0x3c)
(kgdb_compiled_brk_fn) from [<c00091a4>] (do_undefinstr+0x1a4/0x20c)
(do_undefinstr) from [<c001400c>] (__und_svc_finish+0x0/0x34)
It turns out that when we're in kgdb all the CPUs are stopped anyway
so there's no reason we should be calling patch_text(). We can
instead directly call __patch_text() which assumes that CPUs have
already been stopped.
Andre Przywara [Mon, 14 Sep 2015 16:49:02 +0000 (17:49 +0100)]
ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition
Commit 96231b2686b5: ("ARM: 8419/1: dma-mapping: harmonize definition
of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use
dma_addr_t, which makes the compiler barf on assigning this to an
"int" variable on ARM with LPAE enabled:
*************
In file included from /src/linux/include/linux/dma-mapping.h:86:0,
from /src/linux/arch/arm/mm/dma-mapping.c:21:
/src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping':
/src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning:
overflow in implicit constant conversion [-Woverflow]
#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
^
/src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of
macro DMA_ERROR_CODE'
int i, ret = DMA_ERROR_CODE;
^
*************
Remove the actually unneeded initialization of "ret" in
__iommu_create_mapping() and move the variable declaration inside the
for-loop to make the scope of this variable more clear.
Russell King [Wed, 16 Sep 2015 10:08:49 +0000 (11:08 +0100)]
ARM: get rid of needless #if in signal handling code
Remove the #if statement which caused trouble for kernels that support
both ARMv6 and ARMv7. Older architectures do not implement these bits,
so it should be safe to always clear them.
Mans Rullgard [Sun, 15 Feb 2015 12:33:49 +0000 (12:33 +0000)]
clk: check for invalid parent index of orphans in __clk_init()
If a mux clock is initialised (by hardware or firmware) with an
invalid parent, its ->get_parent() can return an out of range
index. For example, the generic mux clock attempts to return
-EINVAL, which due to the u8 return type ends up a rather large
number. Using this index with the parent_names[] array results
in an invalid pointer and (usually) a crash in the following
strcmp().
This patch adds a check for the parent index being in range,
ignoring clocks reporting invalid values.
The OPP list needs to be protected against concurrent accesses. Using
simple RCU read locks does the trick and gets rid of the following
lockdep warning:
===============================
[ INFO: suspicious RCU usage. ]
4.2.0-next-20150908 #1 Not tainted
-------------------------------
drivers/base/power/opp.c:460 Missing rcu_read_lock() or dev_opp_list_lock protection!
Pavel Fedin [Wed, 5 Aug 2015 10:53:57 +0000 (11:53 +0100)]
arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources
Until b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops"),
kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(),
and vgic_v2_map_resources() still has it.
But now vm_ops are not initialized until we call kvm_vgic_create().
Therefore kvm_vgic_map_resources() can being called without a VGIC,
and we die because vm_ops.map_resources is NULL.
Fixing this restores QEMU's kernel-irqchip=off option to a working state,
allowing to use GIC emulation in userspace.
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma driver move from Doug Ledford:
"This is a move only, no functional changes.
I tried to get it in prior to the rc1 release, but we were waiting on
IBM to get back to us that they were OK with the deprecation and
eventual removal of this driver. That OK didn't materialize until
last week, so integration and testing time pushed us beyond the rc1
release.
Summary:
- Move ehca driver to staging/rdma and schedule for deletion"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/ehca: Deprecate driver, move to staging, schedule deletion
Merge tag 'hwmon-for-linus-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Two patches for the nct6775 driver: add support for NCT6793D, and fix
swapped registers"
* tag 'hwmon-for-linus-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6775) Add support for NCT6793D
hwmon: (nct6775) Swap STEP_UP_TIME and STEP_DOWN_TIME registers for most chips
Merge tag 'pinctrl-v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"This is a first set of pin control fixes for the v4.3 series. Nothing
special to say, business as usual.
- Some IS_ERR() fixes from Julia Lawall. I always wanted the
compiler to catch these but error pointers by nailing them as an
err pointer intrinsic type or something seem to be a "no can do".
In any case, cocinelle is obviously up to the task, better than
bugs staying around.
- Better error handling for NULL GPIO chips.
- Fix a compile error from the big irq desc refactoring. I'm
surprised the fallout wasn't bigger than this"
* tag 'pinctrl-v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: samsung: s3c24xx: fix syntax error
pinctrl: core: Warn about NULL gpio_chip in pinctrl_ready_for_gpio_range()
pinctrl: join lines that can be a single line within 80 columns
pinctrl: digicolor: convert null test to IS_ERR test
pinctrl: qcom: ssbi: convert null test to IS_ERR test
Jason J. Herne [Wed, 16 Sep 2015 13:13:50 +0000 (09:13 -0400)]
KVM: s390: Replace incorrect atomic_or with atomic_andnot
The offending commit accidentally replaces an atomic_clear with an
atomic_or instead of an atomic_andnot in kvm_s390_vcpu_request_handled.
The symptom is that kvm guests on s390 hang on startup.
This patch simply replaces the incorrect atomic_or with atomic_andnot
Merge tag 'gpio-v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"This is the first round of GPIO fixes for v4.3. Quite a lot of
patches, but the influx of new stuff in the merge window was equally
big, so I'm not surprised.
- Return value checks and thus nicer errorpath for two drivers.
- Make GPIO_RCAR arch neutral.
- Propagate errors from GPIO chip ->get() vtable call. It turned out
these can actually fail sometimes, especially on slowpath
controllers doing I2C traffic and similar.
- Update documentation to be in sync with the massive changes in the
v4.3 merge window, phew.
- Handle deferred probe properly in the OMAP driver.
- Get rid of surplus MODULE_ALIAS() from sx150x"
* tag 'gpio-v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: omap: Fix GPIO numbering for deferred probe
Documentation: gpio: Explain that <function>-gpio is also supported
gpio: omap: Fix gpiochip_add() handling for deferred probe
gpio: sx150x: Remove unnecessary MODULE_ALIAS()
Documentation: gpio: board: describe the con_id parameter
Documentation: gpio: board: add flags parameter to gpiod_get*() functions
gpio: Propagate errors from chip->get()
gpio: rcar: GPIO_RCAR doesn't relate to ARM
gpio: mxs: need to check return value of irq_alloc_generic_chip
gpio: mxc: need to check return value of irq_alloc_generic_chip
Rob Herring [Sat, 29 Aug 2015 23:01:23 +0000 (18:01 -0500)]
sh: Kill off set_irq_flags usage
set_irq_flags is ARM specific with custom flags which have genirq
equivalents. Convert drivers to use the genirq interfaces directly, so we
can kill off set_irq_flags. The translation of flags is as follows:
For IRQs managed by an irqdomain, the irqdomain core code handles clearing
and setting IRQ_NOREQUEST already, so there is no need to do this in
.map() functions and we can simply remove the set_irq_flags calls. Some
users also modify IRQ_NOPROBE and this has been maintained although it
is not clear that is really needed. There appears to be a great deal of
blind copy and paste of this code.
Rob Herring [Sat, 29 Aug 2015 23:01:22 +0000 (18:01 -0500)]
irqchip: Kill off set_irq_flags usage
set_irq_flags is ARM specific with custom flags which have genirq
equivalents. Convert drivers to use the genirq interfaces directly, so we
can kill off set_irq_flags. The translation of flags is as follows:
For IRQs managed by an irqdomain, the irqdomain core code handles clearing
and setting IRQ_NOREQUEST already, so there is no need to do this in
.map() functions and we can simply remove the set_irq_flags calls. Some
users also modify IRQ_NOPROBE and this has been maintained although it
is not clear that is really needed. There appears to be a great deal of
blind copy and paste of this code.
Rob Herring [Sat, 29 Aug 2015 23:01:21 +0000 (18:01 -0500)]
gpu/drm: Kill off set_irq_flags usage
set_irq_flags is ARM specific with custom flags which have genirq
equivalents. Convert drivers to use the genirq interfaces directly, so we
can kill off set_irq_flags. The translation of flags is as follows:
For IRQs managed by an irqdomain, the irqdomain core code handles clearing
and setting IRQ_NOREQUEST already, so there is no need to do this in
.map() functions and we can simply remove the set_irq_flags calls. Some
users also modify IRQ_NOPROBE and this has been maintained although it
is not clear that is really needed. There appears to be a great deal of
blind copy and paste of this code.
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- The selftest overreads the IV test vector.
- Fix potential infinite loop in sunxi-ss driver.
- Fix powerpc build failure when VMX is set without VSX"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: testmgr - don't copy from source IV too much
crypto: sunxi-ss - Fix a possible driver hang with ciphers
crypto: vmx - VMX crypto should depend on CONFIG_VSX
David Woodhouse [Wed, 16 Sep 2015 13:10:03 +0000 (14:10 +0100)]
x86/platform: Fix Geode LX timekeeping in the generic x86 build
In 2007, commit 07190a08eef36 ("Mark TSC on GeodeLX reliable")
bypassed verification of the TSC on Geode LX. However, this code
(now in the check_system_tsc_reliable() function in
arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was
set.
OpenWRT has recently started building its generic Geode target
for Geode GX, not LX, to include support for additional
platforms. This broke the timekeeping on LX-based devices,
because the TSC wasn't marked as reliable:
https://dev.openwrt.org/ticket/20531
By adding a runtime check on is_geode_lx(), we can also include
the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus
fixing the problem.
Marek Majtyka [Wed, 16 Sep 2015 10:04:55 +0000 (12:04 +0200)]
arm: KVM: Fix incorrect device to IPA mapping
A critical bug has been found in device memory stage1 translation for
VMs with more then 4GB of address space. Once vm_pgoff size is smaller
then pa (which is true for LPAE case, u32 and u64 respectively) some
more significant bits of pa may be lost as a shift operation is performed
on u32 and later cast onto u64.
Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12
expected pa(u64): 0x0000002010030000
produced pa(u64): 0x0000000010030000
The fix is to change the order of operations (casting first onto phys_addr_t
and then shifting).
Marc Zyngier [Wed, 16 Sep 2015 09:54:37 +0000 (10:54 +0100)]
arm64: KVM: Fix user access for debug registers
When setting the debug register from userspace, make sure that
copy_from_user() is called with its parameters in the expected
order. It otherwise doesn't do what you think.
Wanpeng Li [Wed, 16 Sep 2015 11:31:11 +0000 (19:31 +0800)]
KVM: vmx: fix VPID is 0000H in non-root operation
Reference SDM 28.1:
The current VPID is 0000H in the following situations:
- Outside VMX operation. (This includes operation in system-management
mode under the default treatment of SMIs and SMM with VMX operation;
see Section 34.14.)
- In VMX root operation.
- In VMX non-root operation when the “enable VPID” VM-execution control
is 0.
The VPID should never be 0000H in non-root operation when "enable VPID"
VM-execution control is 1. However, commit 34a1cd60 ("kvm: x86: vmx:
move some vmx setting from vmx_init() to hardware_setup()") remove the
codes which reserve 0000H for VMX root operation.
This patch fix it by again reserving 0000H for VMX root operation.
powerpc/mm: Recompute hash value after a failed update
If we had secondary hash flag set, we ended up modifying hash value in
the updatepp code path. Hence with a failed updatepp we will be using
a wrong hash value for the following hash insert. Fix this by
recomputing hash before insert.
Without this patch we can end up with using wrong slot number in linux
pte. That can result in us missing an hash pte update or invalidate
which can cause memory corruption or even machine check.
powerpc/boot: Specify ABI v2 when building an LE boot wrapper
The kernel does it, not the boot wrapper, which breaks with some
cross compilers that still default to ABI v1.
Fixes: 147c05168fc8 ("powerpc/boot: Add support for 64bit little endian wrapper") Cc: [email protected] # v3.16+ Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
Paolo Bonzini [Tue, 15 Sep 2015 16:27:57 +0000 (18:27 +0200)]
KVM: add halt_attempted_poll to VCPU stats
This new statistic can help diagnosing VCPUs that, for any reason,
trigger bad behavior of halt_poll_ns autotuning.
For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
10+20+40+80+160+320+480 = 1110 microseconds out of every
479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
is consuming about 30% more CPU than it would use without
polling. This would show as an abnormally high number of
attempted polling compared to the successful polls.
Pierre Morel [Thu, 10 Sep 2015 14:35:08 +0000 (16:35 +0200)]
virtio/s390: handle failures of READ_VQ_CONF ccw
In virtio_ccw_read_vq_conf() the return value of ccw_io_helper()
was not checked.
If the configuration could not be read properly, we'd wrongly assume a
queue size of 0.
Let's propagate any I/O error to virtio_ccw_setup_vq() so it may
properly fail.
Russell King [Fri, 11 Sep 2015 15:44:02 +0000 (16:44 +0100)]
ARM: fix Thumb2 signal handling when ARMv6 is enabled
When a kernel is built covering ARMv6 to ARMv7, we omit to clear the
IT state when entering a signal handler. This can cause the first
few instructions to be conditionally executed depending on the parent
context.
In any case, the original test for >= ARMv7 is broken - ARMv6 can have
Thumb-2 support as well, and an ARMv6T2 specific build would omit this
code too.
Relax the test back to ARMv6 or greater. This results in us always
clearing the IT state bits in the PSR, even on CPUs where these bits
are reserved. However, they're reserved for the IT state, so this
should cause no harm.
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix segfault pressing -> in 'perf top' with no hist entries. (Wang Nan)
E.g:
perf top -e page-faults --pid 11400 # 11400 generates no page-fault
- Fix propagation of thread and cpu maps, that got broken when doing incomplete
changes to better support events with a PMU cpu mask, leading to Intel PT to
fail with an error like:
$ perf record -e intel_pt//u uname
Error: The sys_perf_event_open() syscall returned with
22 (Invalid argument) for event (sched:sched_switch).
Because intel_pt adds that sched:sched_switch evsel to the evlist after the
thread/cpu maps were propagated to the evsels, fix it. (Adrian Hunter)
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
cpufreq_cpu_get() called by get_cur_freq_on_cpu() is overkill,
because the ->get() callback is always invoked in a context in
which all of the conditions checked by cpufreq_cpu_get() are
guaranteed to be satisfied.
Use cpufreq_cpu_get_raw() instead of it and drop the
corresponding cpufreq_cpu_put() from get_cur_freq_on_cpu().
Jeff Moyer [Fri, 14 Aug 2015 20:15:31 +0000 (16:15 -0400)]
dax: fix O_DIRECT I/O to the last block of a blockdev
commit bbab37ddc20b (block: Add support for DAX reads/writes to
block devices) caused a regression in mkfs.xfs. That utility
sets the block size of the device to the logical block size
using the BLKBSZSET ioctl, and then issues a single sector read
from the last sector of the device. This results in the dax_io
code trying to do a page-sized read from 512 bytes from the end
of the device. The result is -ERANGE being returned to userspace.
The fix is to align the block to the page size before calling
get_block.
Thanks to willy for simplifying my original patch.
David Woodhouse [Tue, 15 Sep 2015 15:03:36 +0000 (16:03 +0100)]
modsign: Fix GPL/OpenSSL licence incompatibility
The GPL does not permit us to link against the OpenSSL library. Use
LGPL for sign-file and extract-file instead.
[ The whole "openssl isn't compatible with gpl" is really just
fear-mongering, but there's no reason not to make modsign LGPL, so
nobody cares. - Linus ]
irqchip/renesas-irqc: Propagate wake-up settings to parent
The renesas-irqc interrupt controller is cascaded to the GIC, but its
driver doesn't propagate wake-up settings to the parent interrupt
controller.
Since commit aec89ef72ba6c944 ("irqchip/gic: Enable SKIP_SET_WAKE and
MASK_ON_SUSPEND"), the GIC driver masks interrupts during suspend, and
wake-up through gpio-keys now fails on r8a73a4/ape6evm.
Fix this by propagating wake-up settings to the parent interrupt
controller. There's no need to handle irq_set_irq_wake() failures, as
the renesas-irqc interrupt controller is always cascaded to a GIC, and
the GIC driver always sets SKIP_SET_WAKE since the aforementioned
commit.
irqchip/renesas-intc-irqpin: Propagate wake-up settings to parent
The renesas-intc-irqpin interrupt controller is cascaded to the GIC, but
its driver doesn't propagate wake-up settings to the parent interrupt
controller.
Since commit aec89ef72ba6c944 ("irqchip/gic: Enable SKIP_SET_WAKE and
MASK_ON_SUSPEND"), the GIC driver masks interrupts during suspend, and
wake-up through gpio-keys now fails on r8a7740/armadillo and
sh73a0/kzm9g.
Fix this by propagating wake-up settings to the parent interrupt
controller. There's no need to handle irq_set_irq_wake() failures, as
the renesas-intc-irqpin interrupt controller is always cascaded to a
GIC, and the GIC driver always sets SKIP_SET_WAKE since the
aforementioned commit.
irqchip/renesas-intc-irqpin: Use a separate lockdep class
The renesas-intc-irqpin interrupt controller is cascaded to the GIC.
Hence when propagating wake-up settings to its parent interrupt
controller, the following lockdep warning is printed:
=============================================
[ INFO: possible recursive locking detected ] 4.2.0-armadillo-10725-g50fcd7643c034198 #781 Not tainted
---------------------------------------------
s2ram/1179 is trying to acquire lock:
(&irq_desc_lock_class){-.-...}, at: [<c005bb54>] __irq_get_desc_lock+0x78/0x94
but task is already holding lock:
(&irq_desc_lock_class){-.-...}, at: [<c005bb54>] __irq_get_desc_lock+0x78/0x94
other info that might help us debug this:
Possible unsafe locking scenario:
irqchip/renesas-irqc: Use a separate lockdep class
The renesas-irqc interrupt controller is cascaded to the GIC. Hence when
propagating wake-up settings to its parent interrupt controller, the
following lockdep warning is printed:
=============================================
[ INFO: possible recursive locking detected ] 4.2.0-ape6evm-10725-g50fcd7643c034198 #280 Not tainted
---------------------------------------------
s2ram/1072 is trying to acquire lock:
(&irq_desc_lock_class){-.-...}, at: [<c008d3fc>] __irq_get_desc_lock+0x58/0x98
but task is already holding lock:
(&irq_desc_lock_class){-.-...}, at: [<c008d3fc>] __irq_get_desc_lock+0x58/0x98
other info that might help us debug this:
Possible unsafe locking scenario:
Pavel Fedin [Sun, 13 Sep 2015 11:14:33 +0000 (12:14 +0100)]
irqchip/GICv2m: Fix GICv2m build warning on 32 bits
After GICv2m was enabled for 32-bit ARM kernel, a warning popped up:
drivers/irqchip/irq-gic-v2m.c: In function gicv2m_compose_msi_msg:
drivers/irqchip/irq-gic-v2m.c:100:2: warning: right shift count >= width
of type [enabled by default]
msg->address_hi = (u32) (addr >> 32);
^
This patch fixes it by using proper macros for splitting up the value.
Marc Zyngier [Sun, 13 Sep 2015 11:14:32 +0000 (12:14 +0100)]
irqchip/gic-v3-its: Add missing cache flushes
When the ITS is configured for non-cacheable transactions, make sure
that the allocated, zeroed memory is flushed to the Point of
Coherency, allowing the ITS to observe the zeros instead of random
garbage (or even get its own data overwritten by zeros being evicted
from the cache...).
Marc Zyngier [Sun, 13 Sep 2015 11:14:31 +0000 (12:14 +0100)]
irqchip/GIC: Add workaround for aliased GIC400
The GICv2 architecture mandates that the two 4kB GIC regions are
contiguous, and on two separate physical pages (so that access to
the second page can be trapped by a hypervisor). This doesn't work
very well when PAGE_SIZE is 64kB.
A relatively common hack^Wway to work around this is to alias each
4kB region over its own 64kB page. Of course in this case, the base
address you want to use is not really the begining of the region,
but base + 60kB (so that you get a contiguous 8kB region over two
distinct pages).
Normally, this would be described in DT with a new property, but
some HW is already out there, and the firmware makes sure that
it will override whatever you put in the GIC node. Duh. And of course,
said firmware source code is not available, despite being based
on u-boot.
The workaround is to detect the case where the CPU interface size
is set to 128kB, and verify the aliasing by checking that the ID
register for GIC400 (which is the only GIC wired this way so far)
is the same at base and base + 0xF000. In this case, we update
the GIC base address and let it roll.
And if you feel slightly sick by looking at this, rest assured that
I do too...
Marc Zyngier [Sun, 13 Sep 2015 12:37:03 +0000 (13:37 +0100)]
platform-msi: Do not cache msi_desc in handler_data
The current implementation of platform MSI caches the msi_desc
pointer in irq_data::handler_data. This is a bit silly, as
we also have irq_data::msi_desc, which is perfectly valid.
Remove the useless assignment and simplify the whole flow.
Jason Wang [Tue, 15 Sep 2015 06:41:57 +0000 (14:41 +0800)]
kvm: fix zero length mmio searching
Currently, if we had a zero length mmio eventfd assigned on
KVM_MMIO_BUS. It will never be found by kvm_io_bus_cmp() since it
always compares the kvm_io_range() with the length that guest
wrote. This will cause e.g for vhost, kick will be trapped by qemu
userspace instead of vhost. Fixing this by using zero length if an
iodevice is zero length.
Jason Wang [Tue, 15 Sep 2015 06:41:56 +0000 (14:41 +0800)]
kvm: fix double free for fast mmio eventfd
We register wildcard mmio eventfd on two buses, once for KVM_MMIO_BUS
and once on KVM_FAST_MMIO_BUS but with a single iodev
instance. This will lead to an issue: kvm_io_bus_destroy() knows
nothing about the devices on two buses pointing to a single dev. Which
will lead to double free[1] during exit. Fix this by allocating two
instances of iodevs then registering one on KVM_MMIO_BUS and another
on KVM_FAST_MMIO_BUS.
Will Deacon [Wed, 2 Sep 2015 17:49:28 +0000 (18:49 +0100)]
arm64: head.S: initialise mdcr_el2 in el2_setup
When entering the kernel at EL2, we fail to initialise the MDCR_EL2
register which controls debug access and PMU capabilities at EL1.
This patch ensures that the register is initialised so that all traps
are disabled and all the PMU counters are available to the host. When a
guest is scheduled, KVM takes care to configure trapping appropriately.
Adrian Hunter [Tue, 8 Sep 2015 07:59:02 +0000 (10:59 +0300)]
perf tests: Fix software clock events test setting maps
The test titled "Test software clock events have valid period values"
was setting cpu/thread maps directly. Make it use the proper function
perf_evlist__set_maps() especially now that it also propagates the maps.
Adrian Hunter [Tue, 8 Sep 2015 07:59:01 +0000 (10:59 +0300)]
perf tests: Fix task exit test setting maps
The test titled "Test number of exit event of a simple workload" was
setting cpu/thread maps directly. Make it use the proper function
perf_evlist__set_maps() especially now that it also propagates the maps.