Ard Biesheuvel [Mon, 20 Nov 2017 17:41:30 +0000 (17:41 +0000)]
arm64: ftrace: emit ftrace-mod.o contents through code
When building the arm64 kernel with both CONFIG_ARM64_MODULE_PLTS and
CONFIG_DYNAMIC_FTRACE enabled, the ftrace-mod.o object file is built
with the kernel and contains a trampoline that is linked into each
module, so that modules can be loaded far away from the kernel and
still reach the ftrace entry point in the core kernel with an ordinary
relative branch, as is emitted by the compiler instrumentation code
dynamic ftrace relies on.
In order to be able to build out of tree modules, this object file
needs to be included into the linux-headers or linux-devel packages,
which is undesirable, as it makes arm64 a special case (although a
precedent does exist for 32-bit PPC).
Given that the trampoline essentially consists of a PLT entry, let's
not bother with a source or object file for it, and simply patch it
in whenever the trampoline is being populated, using the existing
PLT support routines.
David Howells [Fri, 1 Dec 2017 11:40:43 +0000 (11:40 +0000)]
afs: Properly reset afs_vnode (inode) fields
When an AFS inode is allocated by afs_alloc_inode(), the allocated
afs_vnode struct isn't necessarily reset from the last time it was used as
an inode because the slab constructor is only invoked once when the memory
is obtained from the page allocator.
This means that information can leak from one inode to the next because
we're not calling kmem_cache_zalloc(). Some of the information isn't
reset, in particular the permit cache pointer.
David Howells [Fri, 1 Dec 2017 11:40:43 +0000 (11:40 +0000)]
afs: Fix permit refcounting
Fix four refcount bugs in afs_cache_permit():
(1) When checking the result of the kzalloc(), we can't just return, but
must put 'permits'.
(2) We shouldn't put permits immediately after hashing a new permit as we
need to keep the pointer stable so that we can check to see if
vnode->permit_cache has changed before we decide whether to assign to
it.
(3) 'permits' is being put twice.
(4) We need to put either the replacement or the thing replaced after the
assignment to vnode->permit_cache.
Without this, lots of the following are seen:
Kernel BUG at ffffffffa039857b [verbose debug info unavailable]
------------[ cut here ]------------
Kernel BUG at ffffffffa039858a [verbose debug info unavailable]
------------[ cut here ]------------
The addresses are in the .text..refcount section of the kafs.ko module.
Following the relocation records for the __ex_table section shows one to be
due to the decrement in afs_put_permits() and the other to be key_get() in
afs_cache_permit().
Occasionally, the following is seen:
refcount_t overflow at afs_cache_permit+0x57d/0x5c0 [kafs] in cc1[562], uid/euid: 0/0
WARNING: CPU: 0 PID: 562 at kernel/panic.c:657 refcount_error_report+0x9c/0xac
...
can: peak/pci: fix potential bug when probe() fails
PCI/PCIe drivers for PEAK-System CAN/CAN-FD interfaces do some access to the
PCI config during probing. In case one of these accesses fails, a POSITIVE
PCIBIOS_xxx error code is returned back. This POSITIVE error code MUST be
converted into a NEGATIVE errno for the probe() function to indicate it
failed. Using the pcibios_err_to_errno() function, we make sure that the
return code will always be negative.
Oliver Stäbler [Mon, 20 Nov 2017 13:45:15 +0000 (14:45 +0100)]
can: ti_hecc: Fix napi poll return value for repoll
After commit d75b1ade567f ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget.
So we need to return budget if there are still packets to receive.
Jimmy Assarsson [Tue, 21 Nov 2017 07:22:27 +0000 (08:22 +0100)]
can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback()
The conditon in the while-loop becomes true when actual_length is less than
2 (MSG_HEADER_LEN). In best case we end up with a former, already
dispatched msg, that got msg->len greater than actual_length. This will
result in a "Format error" error printout.
Problem seen when unplugging a Kvaser USB device connected to a vbox guest.
warning: comparison between signed and unsigned integer expressions
[-Wsign-compare]
Yonghong Song [Thu, 30 Nov 2017 21:47:55 +0000 (13:47 -0800)]
samples/bpf: add error checking for perf ioctl calls in bpf loader
load_bpf_file() should fail if ioctl with command
PERF_EVENT_IOC_ENABLE and PERF_EVENT_IOC_SET_BPF fails.
When they do fail, proper error messages are printed.
With this change, the below "syscall_tp" run shows that
the maximum number of bpf progs attaching to the same
perf tracepoint is indeed enforced.
$ ./syscall_tp -i 64
prog #0: map ids 4 5
...
prog #63: map ids 382 383
$ ./syscall_tp -i 65
prog #0: map ids 4 5
...
prog #64: map ids 388 389
ioctl PERF_EVENT_IOC_SET_BPF failed err Argument list too long
Yonghong Song [Thu, 30 Nov 2017 21:47:54 +0000 (13:47 -0800)]
bpf: set maximum number of attached progs to 64 for a single perf tp
cgropu+bpf prog array has a maximum number of 64 programs.
Let us apply the same limit here.
Fixes: e87c6bc3852b ("bpf: permit multiple bpf attachments for a single perf event") Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
Linus Torvalds [Thu, 30 Nov 2017 23:49:50 +0000 (18:49 -0500)]
Merge tag 'acpi-4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a regression related to the ACPI EC handling during system
suspend/resume on some platforms and prevent modalias from being
exposed to user space for ACPI device object with "not functional and
not present" status.
Specifics:
- Fix an ACPI EC driver regression (from the 4.9 cycle) causing the
driver's power management operations to be omitted during system
suspend/resume on platforms where the EC instance from the ECDT
table is used instead of the one from the DSDT (Lv Zheng).
- Prevent modalias from being exposed to user space for ACPI device
objects with _STA returning 0 (not present and not functional) to
prevent driver modules from being loaded automatically for hardware
that is not actually present on some platforms (Hans de Goede)"
* tag 'acpi-4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / EC: Fix regression related to PM ops support in ECDT device
ACPI / bus: Leave modalias empty for devices which are not present
Dave Airlie [Thu, 30 Nov 2017 23:15:57 +0000 (09:15 +1000)]
Merge branch 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Fixes for 4.15. Highlights:
- DC fixes for S3, gamma, audio, pageflipping, etc.
- fix a regression in radeon from kfd removal
- fix a ttm regression with swiotlb disabled
- misc other fixes
* 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux: (36 commits)
drm/radeon: remove init of CIK VMIDs 8-16 for amdkfd
drm/ttm: fix populate_and_map() functions once more
drm/amd/display: USB-C / thunderbolt dock specific workaround
drm/amd/display: Switch to drm_atomic_helper_wait_for_flip_done
drm/amd/display: fix gamma setting
drm/amd/display: Do not put drm_atomic_state on resume
drm/amd/display: Fix couple more inconsistent NULL checks in dc_resource
drm/amd/display: Fix potential NULL and mem leak in create_links
drm/amd/display: Fix hubp check in set_cursor_position
drm/amd/display: Fix use before NULL check in validate_timing
drm/amd/display: Bunch of smatch error and warning fixes in DC
drm/amd/display: Fix amdgpu_dm bugs found by smatch
drm/amd/display: try to find matching audio inst for enc inst first
drm/amd/display: fix seq issue: turn on clock before programming afmt.
drm/amd/display: fix memory leaks on error exit return
drm/amd/display: check plane state before validating fbc
drm/amd/display: Do DC mode-change check when adding CRTCs
drm/amd/display: Revert noisy assert messages
drm/amd/display: fix split viewport rounding error
drm/amd/display: Check aux channel before MST resume
...
Dave Airlie [Thu, 30 Nov 2017 23:15:31 +0000 (09:15 +1000)]
Merge branch 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-fixes
mali-dp interface cleanups.
* 'for-upstream/mali-dp' of git://linux-arm.org/linux-ld:
drm: mali-dp: Disable planes when their CRTC gets disabled.
drm: mali-dp: Separate static internal data into a read-only structure.
drm/arm: Replace instances of drm_dev_unref with drm_dev_put.
drm: mali-dp: switch to drm_*_get(), drm_*_put() helpers
Dave Airlie [Thu, 30 Nov 2017 23:14:46 +0000 (09:14 +1000)]
Merge tag 'drm-amdkfd-fixes-2017-11-26' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
This is amdkfd pull request for -rc2. It contains three small fixes to the
CIK SDMA code, compilation error fix in kfd_ioctl.h and fix to accessing
a pointer after it was released.
* tag 'drm-amdkfd-fixes-2017-11-26' of git://people.freedesktop.org/~gabbayo/linux:
uapi: fix linux/kfd_ioctl.h userspace compilation errors
drm/amdkfd: fix amdkfd use-after-free GP fault
drm/amdkfd: Fix SDMA oversubsription handling
drm/amdkfd: Fix SDMA ring buffer size calculation
drm/amdgpu: Fix SDMA load/unload sequence on HWS disabled mode
Dave Airlie [Thu, 30 Nov 2017 23:14:18 +0000 (09:14 +1000)]
Merge branch 'for-upstream/hdlcd' of git://linux-arm.org/linux-ld into drm-fixes
3 hdlcd fixes/cleanups
* 'for-upstream/hdlcd' of git://linux-arm.org/linux-ld:
drm/arm: Replace instances of drm_dev_unref with drm_dev_put.
drm: Fix checkpatch issue: "WARNING: braces {} are not necessary for single statement blocks."
drm: hdlcd: Update PM code to save/restore console.
Dave Airlie [Thu, 30 Nov 2017 23:11:13 +0000 (09:11 +1000)]
Merge tag 'imx-drm-fixes-2017-11-30' of git://git.pengutronix.de/git/pza/linux into drm-fixes
drm/imx: fix commit_tail for new drm_atomic_helper_setup_commit
Since commit 080de2e5be2d ("drm/atomic: Check for busy planes/connectors before
setting the commit"), drm_atomic_helper_setup_commit expects that blocking
commits have completed flipping before the commit_tail returns. Add the missing
wait_for_flip_done to commit_tail to ensure this.
* tag 'imx-drm-fixes-2017-11-30' of git://git.pengutronix.de/git/pza/linux:
drm/imx: always call wait_for_flip_done in commit_tail
Dave Airlie [Thu, 30 Nov 2017 23:10:32 +0000 (09:10 +1000)]
Merge tag 'drm-intel-fixes-2017-11-30' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Disable transparent huge pages for now until we have a W/A
- Building fix when CONFIG_BACKLIGHT_CLASS_DEVICE is not selected
- GMBUS communication robustness
- Fbdev hotplug handling fix
* tag 'drm-intel-fixes-2017-11-30' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Disable THP until we have a GPU read BW W/A
drm/i915/gvt: Correct ADDR_4K/2M/1G_MASK definition
drm/i915/gvt: enabled pipe A default on creating vgpu
drm/i915/gvt: Move request alloc to dispatch_workload path only
drm/i915/gvt: remove skl_misc_ctl_write handler
drm/i915/gvt: Fix unsafe locking caused by spin_unlock_bh
drm/i915: fix intel_backlight_device_register declaration
drm/i915/fbdev: Serialise early hotplug events with async fbdev config
drm/i915: Prevent zero length "index" write
drm/i915: Don't try indexed reads to alternate slave addresses
Peter Rosin [Mon, 27 Nov 2017 16:31:00 +0000 (17:31 +0100)]
hwmon: (jc42) optionally try to disable the SMBUS timeout
With a nxp,se97 chip on an atmel sama5d31 board, the I2C adapter driver
is not always capable of avoiding the 25-35 ms timeout as specified by
the SMBUS protocol. This may cause silent corruption of the last bit of
any transfer, e.g. a one is read instead of a zero if the sensor chip
times out. This also affects the eeprom half of the nxp-se97 chip, where
this silent corruption was originally noticed. Other I2C adapters probably
suffer similar issues, e.g. bit-banging comes to mind as risky...
The SMBUS register in the nxp chip is not a standard Jedec register, but
it is not special to the nxp chips either, at least the atmel chips
have the same mechanism. Therefore, do not special case this on the
manufacturer, it is opt-in via the device property anyway.
Andrew Waterman [Wed, 25 Oct 2017 21:32:16 +0000 (14:32 -0700)]
RISC-V: Allow userspace to flush the instruction cache
Despite RISC-V having a direct 'fence.i' instruction available to
userspace (which we can't trap!), that's not actually viable when
running on Linux because the kernel might schedule a process on another
hart. There is no way for userspace to handle this without invoking the
kernel (as it doesn't know the thread->hart mappings), so we've defined
a RISC-V specific system call to flush the instruction cache.
This patch adds both a system call and a VDSO entry. If possible, we'd
like to avoid having the system call be considered part of the
user-facing ABI and instead restrict that to the VDSO entry -- both just
in general to avoid having additional user-visible ABI to maintain, and
because we'd prefer that users just call the VDSO entry because there
might be a better way to do this in the future (ie, one that doesn't
require entering the kernel).
Andrew Waterman [Wed, 25 Oct 2017 21:30:32 +0000 (14:30 -0700)]
RISC-V: Flush I$ when making a dirty page executable
The RISC-V ISA allows for instruction caches that are not coherent WRT
stores, even on a single hart. As a result, we need to explicitly flush
the instruction cache whenever marking a dirty page as executable in
order to preserve the correct system behavior.
Local instruction caches aren't that scary (our implementations actually
flush the cache, but RISC-V is defined to allow higher-performance
implementations to exist), but RISC-V defines no way to perform an
instruction cache shootdown. When explicitly asked to do so we can
shoot down remote instruction caches via an IPI, but this is a bit on
the slow side.
Instead of requiring an IPI to all harts whenever marking a page as
executable, we simply flush the currently running harts. In order to
maintain correct behavior, we additionally mark every other hart as
needing a deferred instruction cache which will be taken before anything
runs on it.
Florian Fainelli [Thu, 30 Nov 2017 18:45:26 +0000 (10:45 -0800)]
net: dsa: bcm_sf2: Set correct CHAIN_ID and slice number mask
When configuring an IPv6 address mask, we should use SLICE_NUM_MASK as
the mask in order to make sure all bits are masked by the hardware.
Also, we want matching entries to have a CHAIN_ID value set to the same
value as the rule index we return to user-space for convenience, so fix
that too.
Fixes: ba0696c22e7c ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules") Fixes: dd8eff68343d ("net: dsa: bcm_sf2: Allow matching arbitrary IPv6 masks/lengths") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Yonghong Song [Thu, 30 Nov 2017 16:52:42 +0000 (08:52 -0800)]
tools/bpf: adjust rlimit RLIMIT_MEMLOCK for test_verifier_log
The default rlimit RLIMIT_MEMLOCK is 64KB. In certain cases,
e.g. in a test machine mimicking our production system, this test may
fail due to unable to charge the required memory for prog load:
# ./test_verifier_log
Test log_level 0...
ERROR: Program load returned: ret:-1/errno:1, expected ret:-1/errno:22
Changing the default rlimit RLIMIT_MEMLOCK to unlimited makes
the test always pass.
Olof Johansson [Thu, 30 Nov 2017 01:55:20 +0000 (17:55 -0800)]
RISC-V: Add missing include
Fixes:
include/asm-generic/mm_hooks.h:20:11: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/mm_hooks.h:19:38: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
Olof Johansson [Thu, 30 Nov 2017 01:55:16 +0000 (17:55 -0800)]
RISC-V: Export some expected symbols for modules
These are the ones needed by current allmodconfig, so add them instead
of everything other architectures are exporting -- the rest can be
added on demand later if needed.
Olof Johansson [Thu, 30 Nov 2017 01:55:14 +0000 (17:55 -0800)]
RISC-V: io.h: type fixes for warnings
include <linux/types.h> for __iomem definition. Also, add volatile to
iounmap() like other architectures have it to avoid "discarding
volatile" warnings from some drivers.
Finally, explicitly promote the base address for INB/OUTB functions to
avoid some old legacy drivers complaining about int-to-ptr promotions.
The drivers are unlikely to work but they're included in allmodconfig
so the warnings are noisy.
Fixes, among other warnings, these with allmodconfig:
../arch/riscv/include/asm/io.h:24:21: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
extern void __iomem *ioremap(phys_addr_t offset, unsigned long size);
sound/pci/echoaudio/echoaudio.c: In function 'snd_echo_free':
sound/pci/echoaudio/echoaudio.c:1879:10: warning: passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type [-Wdiscarded-qualifiers]
Olof Johansson [Thu, 30 Nov 2017 01:55:13 +0000 (17:55 -0800)]
RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros
INT and SHORT are used by some drivers that pull in the include files,
so prefixing helps avoid namespace conflicts. Other constructs in the
same file already uses this.
Fixes, among others, these warnings with allmodconfig:
../sound/core/pcm_misc.c:43:0: warning: "INT" redefined
#define INT __force int
Carlos Maiolino [Tue, 28 Nov 2017 16:54:10 +0000 (08:54 -0800)]
xfs: Properly retry failed dquot items in case of error during buffer writeback
Once the inode item writeback errors is already fixed, it's time to fix the same
problem in dquot code.
Although there were no reports of users hitting this bug in dquot code (at least
none I've seen), the bug is there and I was already planning to fix it when the
correct approach to fix the inodes part was decided.
This patch aims to fix the same problem in dquot code, regarding failed buffers
being unable to be resubmitted once they are flush locked.
Tested with the recently test-case sent to fstests list by Hou Tao.
Darrick J. Wong [Tue, 28 Nov 2017 05:40:19 +0000 (21:40 -0800)]
xfs: scrub inode mode properly
Since we've used up all the bits in i_mode, the existing mode check
doesn't actually do anything useful. However, we've not used all the
bit values in the format portion of i_mode, so we /do/ need to test
that for bad values.
Darrick J. Wong [Mon, 27 Nov 2017 17:50:22 +0000 (09:50 -0800)]
xfs: remove unused parameter from xfs_writepage_map
The first thing that xfs_writepage_map does is clobber the offset
parameter. Since we never use the passed-in value, turn the parameter
into a local variable. This gets rid of an UBSAN warning in generic/466.
Linus Torvalds [Thu, 30 Nov 2017 16:15:19 +0000 (08:15 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
- x86 bugfixes: APIC, nested virtualization, IOAPIC
- PPC bugfix: HPT guests on a POWER9 radix host
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
KVM: Let KVM_SET_SIGNAL_MASK work as advertised
KVM: VMX: Fix vmx->nested freeing when no SMI handler
KVM: VMX: Fix rflags cache during vCPU reset
KVM: X86: Fix softlockup when get the current kvmclock
KVM: lapic: Fixup LDR on load in x2apic
KVM: lapic: Split out x2apic ldr calculation
KVM: PPC: Book3S HV: Fix migration and HPT resizing of HPT guests on radix hosts
KVM: vmx: use X86_CR4_UMIP and X86_FEATURE_UMIP
KVM: x86: Fix CPUID function for word 6 (80000001_ECX)
KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
KVM: x86: ioapic: Preserve read-only values in the redirection table
KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
KVM: x86: ioapic: Remove redundant check for Remote IRR in ioapic_set_irq
KVM: x86: ioapic: Don't fire level irq when Remote IRR set
KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
KVM: x86: inject exceptions produced by x86_decode_insn
KVM: x86: Allow suppressing prints on RDMSR/WRMSR of unhandled MSRs
KVM: x86: fix em_fxstor() sleeping while in atomic
KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry
...
Linus Torvalds [Thu, 30 Nov 2017 16:13:36 +0000 (08:13 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
- SPDX identifiers are added to more of the s390 specific files.
- The ELF_ET_DYN_BASE base patch from Kees is reverted, with the change
some old 31-bit programs crash.
- Bug fixes and cleanups.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
s390/gs: add compat regset for the guarded storage broadcast control block
s390: revert ELF_ET_DYN_BASE base changes
s390: Remove redundant license text
s390: crypto: Remove redundant license text
s390: include: Remove redundant license text
s390: kernel: Remove redundant license text
s390: add SPDX identifiers to the remaining files
s390: appldata: add SPDX identifiers to the remaining files
s390: pci: add SPDX identifiers to the remaining files
s390: mm: add SPDX identifiers to the remaining files
s390: crypto: add SPDX identifiers to the remaining files
s390: kernel: add SPDX identifiers to the remaining files
s390: sthyi: add SPDX identifiers to the remaining files
s390: drivers: Remove redundant license text
s390: crypto: Remove redundant license text
s390: virtio: add SPDX identifiers to the remaining files
s390: scsi: zfcp_aux: add SPDX identifier
s390: net: add SPDX identifiers to the remaining files
s390: char: add SPDX identifiers to the remaining files
s390: cio: add SPDX identifiers to the remaining files
...
Eric Dumazet [Thu, 30 Nov 2017 01:43:57 +0000 (17:43 -0800)]
tcp: remove buggy call to tcp_v6_restore_cb()
tcp_v6_send_reset() expects to receive an skb with skb->cb[] layout as
used in TCP stack.
MD5 lookup uses tcp_v6_iif() and tcp_v6_sdif() and thus
TCP_SKB_CB(skb)->header.h6
This patch probably fixes RST packets sent on behalf of a timewait md5
ipv6 socket.
Before Florian patch, tcp_v6_restore_cb() was needed before jumping to
no_tcp_socket label.
Cong Wang [Thu, 30 Nov 2017 00:07:51 +0000 (16:07 -0800)]
act_sample: get rid of tcf_sample_cleanup_rcu()
Similar to commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu"),
TC actions don't need to respect RCU grace period, because it
is either just detached from tc filter (standalone case) or
it is removed together with tc filter (bound case) in which case
RCU grace period is already respected at filter layer.
David S. Miller [Thu, 30 Nov 2017 15:07:34 +0000 (10:07 -0500)]
Merge tag 'rxrpc-fixes-20171129' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fixes
Here are three patches for AF_RXRPC. One removes some whitespace, one
fixes terminal ACK generation and the third makes a couple of places
actually use the timeout value just determined rather than ignoring it.
====================
Lucas Stach [Thu, 30 Nov 2017 13:31:46 +0000 (14:31 +0100)]
drm/imx: always call wait_for_flip_done in commit_tail
drm_atomic_helper_wait_for_vblanks will go away in the future.
The new drm_atomic_helper_setup_commit in 4.15 expects that blocking commits
have completed flipping before the commit_tail returns. This must be ensured
by calling wait_for_vblanks or wait_for_flip_done, where flip_done might do
a less agressive wait, which is fine for imx-drm.
Fixes: 080de2e5be2d (drm/atomic: Check for busy planes/connectors before
setting the commit) Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Signed-off-by: Philipp Zabel <[email protected]>
Yan Markman [Thu, 30 Nov 2017 09:49:46 +0000 (10:49 +0100)]
net: mvpp2: allocate zeroed tx descriptors
Reserved and unused fields in the Tx descriptors should be 0. The PPv2
driver doesn't clear them at run-time (for performance reasons) but
these descriptors aren't zeroed when allocated, which can lead to
unpredictable behaviors. This patch fixes this by using
dma_zalloc_coherent instead of dma_alloc_coherent.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Yan Markman <[email protected]>
[Antoine: commit message] Signed-off-by: Antoine Tenart <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Laurent Pinchart [Thu, 16 Nov 2017 08:50:19 +0000 (09:50 +0100)]
drm: omapdrm: Fix DPI on platforms using the DSI VDDS
Commit d178e034d565 ("drm: omapdrm: Move FEAT_DPI_USES_VDDS_DSI feature
to dpi code") replaced usage of platform data version with SoC matching
to configure DPI VDDS. The SoC match entries were incorrect, they should
have matched on the machine name instead of the SoC family. Fix it.
The result was observed on OpenPandora with OMAP3530 where the panel only
had the Blue channel and Red&Green were missing. It was not observed on
GTA04 with DM3730.
Peter Ujfalusi [Mon, 20 Nov 2017 09:51:40 +0000 (11:51 +0200)]
omapdrm: hdmi4: Correct the SoC revision matching
I believe the intention of the commit 2c9fc9bf45f8
("drm: omapdrm: Move FEAT_HDMI_* features to hdmi4 driver")
was to identify omap4430 ES1.x, omap4430 ES2.x and other OMAP4 revisions,
like omap4460.
By using family=OMAP4 in the match the code will treat omap4460 ES1.x in a
same way as it would treat omap4430 ES1.x
This breaks HDMI audio on OMAP4460 devices (PandaES for example).
Correct the match rule so we are not going to get false positive match.
The new backlight code causes a link failure when backlight
support itself is disabled:
drivers/gpu/drm/omapdrm/displays/panel-dpi.o: In function `panel_dpi_probe_of':
panel-dpi.c:(.text+0x35c): undefined reference to `of_find_backlight_by_node'
This adds a Kconfig dependency like we have for the other OMAP
display targets.
Fixes: 39135a305a0f ("drm/omap: displays: panel-dpi: Support for handling backlight devices") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Tomi Valkeinen <[email protected]>
Joonas Lahtinen [Mon, 27 Nov 2017 09:12:33 +0000 (11:12 +0200)]
drm/i915: Disable THP until we have a GPU read BW W/A
We seem to be missing some W/A for 2M pages and are getting
a hit on raw GPU read bandwidths (even 30%) even though the
GPU write bandwidths improve (even 10%).
For now, disable THP, which is our only practical source of
2M pages until we have a W/A for the issue.
v2:
- Be explicit that we talk about GPU bandwidths (Eero)
- s/deny/never/ because that's why (Chris)
First four bytes should go to DP0_AUXWDATA0. Due to bug if
len > 4 first four bytes was writen to DP0_AUXWDATA1 and all
data get shifted by 4 bytes. Fix it.
Andrey Gusakov [Tue, 7 Nov 2017 16:56:22 +0000 (19:56 +0300)]
drm/bridge: tc358767: fix timing calculations
Fields in HTIM01 and HTIM02 regs should be even.
Recomended thresh_dly value is max_tu_symbol.
Remove set of VPCTRL0.VSDELAY as it is related to DSI input
interface. Currently driver supports only DPI.
Eric Anholt [Tue, 14 Nov 2017 19:16:47 +0000 (11:16 -0800)]
drm/bridge: Fix lvds-encoder since the panel_bridge rework.
The panel_bridge bridge attaches to the panel's OF node, not the
lvds-encoder's node. Put in a little no-op bridge of our own so that
our consumers can still find a bridge where they expect.
This also fixes an unintended unregistration and leak of the
panel-bridge on module remove.
Support the "cec" optional clock. The documentation already mentions "cec"
optional clock and it is used by several boards, but currently the driver
doesn't enable it, thus preventing cec from working on those boards.
And even worse: a /dev/cecX device will appear for those boards, but it
won't be functioning without configuring this clock.
Changes:
v4:
- Change commit message to stress the importance of this patch
v3:
- Drop useless braces
v2:
- Separate ENOENT errors from others
- Propagate other errors (especially -EPROBE_DEFER)
If the device tree for a board did not specify a cec clock, then
adv7511_cec_init would return an error, which would cause adv7511_probe()
to fail and thus there is no HDMI output.
There is no need to have adv7511_probe() fail if the CEC initialization
fails, so just change adv7511_cec_init() to a void function. In addition,
adv7511_cec_init() should just return silently if the cec clock isn't
found and show a message for any other errors.
An otherwise correct cleanup patch from Dan Carpenter turned this broken
failure handling into a kernel Oops, so bisection points to commit 7af35b0addbc ("drm/kirin: Checking for IS_ERR() instead of NULL") rather
than 3b1b975003e4 ("drm: adv7511/33: add HDMI CEC support").
Linus Torvalds [Thu, 30 Nov 2017 03:12:44 +0000 (19:12 -0800)]
Merge branch 'akpm' (patches from Andrew)
Mergr misc fixes from Andrew Morton:
"28 fixes"
* emailed patches from Andrew Morton <[email protected]>: (28 commits)
fs/hugetlbfs/inode.c: change put_page/unlock_page order in hugetlbfs_fallocate()
mm/hugetlb: fix NULL-pointer dereference on 5-level paging machine
autofs: revert "autofs: fix AT_NO_AUTOMOUNT not being honored"
autofs: revert "autofs: take more care to not update last_used on path walk"
fs/fat/inode.c: fix sb_rdonly() change
mm, memcg: fix mem_cgroup_swapout() for THPs
mm: migrate: fix an incorrect call of prep_transhuge_page()
kmemleak: add scheduling point to kmemleak_scan()
scripts/bloat-o-meter: don't fail with division by 0
fs/mbcache.c: make count_objects() more robust
Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical"
mm/madvise.c: fix madvise() infinite loop under special circumstances
exec: avoid RLIMIT_STACK races with prlimit()
IB/core: disable memory registration of filesystem-dax vmas
v4l2: disable filesystem-dax mapping support
mm: fail get_vaddr_frames() for filesystem-dax mappings
mm: introduce get_user_pages_longterm
device-dax: implement ->split() to catch invalid munmap attempts
mm, hugetlbfs: introduce ->split() to vm_operations_struct
scripts/faddr2line: extend usage on generic arch
...
Nadav Amit [Thu, 30 Nov 2017 00:11:33 +0000 (16:11 -0800)]
fs/hugetlbfs/inode.c: change put_page/unlock_page order in hugetlbfs_fallocate()
hugetlfs_fallocate() currently performs put_page() before unlock_page().
This scenario opens a small time window, from the time the page is added
to the page cache, until it is unlocked, in which the page might be
removed from the page-cache by another core. If the page is removed
during this time windows, it might cause a memory corruption, as the
wrong page will be unlocked.
It is arguable whether this scenario can happen in a real system, and
there are several mitigating factors. The issue was found by code
inspection (actually grep), and not by actually triggering the flow.
Yet, since putting the page before unlocking is incorrect it should be
fixed, if only to prevent future breakage or someone copy-pasting this
code.
Mike said:
"I am of the opinion that this does not need to be sent to stable.
Although the ordering is current code is incorrect, there is no way
for this to be a problem with current locking. In addition, I verified
that the perhaps bigger issue with sys_fadvise64(POSIX_FADV_DONTNEED)
for hugetlbfs and other filesystems is addressed in 3a77d214807c ("mm:
fadvise: avoid fadvise for fs without backing device")"
Ian Kent [Thu, 30 Nov 2017 00:11:26 +0000 (16:11 -0800)]
autofs: revert "autofs: fix AT_NO_AUTOMOUNT not being honored"
Commit 42f461482178 ("autofs: fix AT_NO_AUTOMOUNT not being honored")
allowed the fstatat(2) system call to properly honor the AT_NO_AUTOMOUNT
flag but introduced a semantic change.
In order to honor AT_NO_AUTOMOUNT a semantic change was made to the
negative dentry case for stat family system calls in follow_automount().
This changed the unconditional triggering of an automount in this case
to no longer be done and an error returned instead.
This has caused more problems than I expected so reverting the change is
needed.
In a discussion with Neil Brown it was concluded that the automount(8)
daemon can implement this change without kernel modifications. So that
will be done instead and the autofs module documentation updated with a
description of the problem and what needs to be done by module users for
this specific case.
Ian Kent [Thu, 30 Nov 2017 00:11:23 +0000 (16:11 -0800)]
autofs: revert "autofs: take more care to not update last_used on path walk"
While commit 092a53452bb7 ("autofs: take more care to not update
last_used on path walk") helped (partially) resolve a problem where
automounts were not expiring due to aggressive accesses from user space
it has a side effect for very large environments.
This change helps with the expire problem by making the expire more
aggressive but, for very large environments, that means more mount
requests from clients. When there are a lot of clients that can mean
fairly significant server load increases.
It turns out I put the last_used in this position to solve this very
problem and failed to update my own thinking of the autofs expire
policy. So the patch being reverted introduces a regression which
should be fixed.
Shakeel Butt [Thu, 30 Nov 2017 00:11:15 +0000 (16:11 -0800)]
mm, memcg: fix mem_cgroup_swapout() for THPs
Commit d6810d730022 ("memcg, THP, swap: make mem_cgroup_swapout()
support THP") changed mem_cgroup_swapout() to support transparent huge
page (THP).
However the patch missed one location which should be changed for
correctly handling THPs. The resulting bug will cause the memory
cgroups whose THPs were swapped out to become zombies on deletion.
Zi Yan [Thu, 30 Nov 2017 00:11:12 +0000 (16:11 -0800)]
mm: migrate: fix an incorrect call of prep_transhuge_page()
In https://lkml.org/lkml/2017/11/20/411, Andrea reported that during
memory hotplug/hot remove prep_transhuge_page() is called incorrectly on
non-THP pages for migration, when THP is on but THP migration is not
enabled. This leads to a bad state of target pages for migration.
By inspecting the code, if called on a non-THP, prep_transhuge_page()
will
1) change the value of the mapping of (page + 2), since it is used for
THP deferred list;
2) change the lru value of (page + 1), since it is used for THP's dtor.
Both can lead to data corruption of these two pages.
Andrea said:
"Pragmatically and from the point of view of the memory_hotplug subsys,
the effect is a kernel crash when pages are being migrated during a
memory hot remove offline and migration target pages are found in a
bad state"
This patch fixes it by only calling prep_transhuge_page() when we are
certain that the target page is THP.
Yisheng Xie [Thu, 30 Nov 2017 00:11:08 +0000 (16:11 -0800)]
kmemleak: add scheduling point to kmemleak_scan()
kmemleak_scan() will scan struct page for each node and it can be really
large and resulting in a soft lockup. We have seen a soft lockup when
do scan while compile kernel:
Andy Shevchenko [Thu, 30 Nov 2017 00:11:05 +0000 (16:11 -0800)]
scripts/bloat-o-meter: don't fail with division by 0
Under some circumstances it's possible to get a divider 0 which crashes
the script.
Traceback (most recent call last):
File "linux/scripts/bloat-o-meter", line 98, in <module>
print_result("Function", "tTdDbBrR", 2)
File "linux/scripts/bloat-o-meter", line 87, in print_result
(otot, ntot, (ntot - otot)*100.0/otot))
ZeroDivisionError: float division by zero
Michal Hocko [Thu, 30 Nov 2017 00:10:58 +0000 (16:10 -0800)]
Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical"
This reverts commit 0f6d24f87856 ("mm/page-writeback.c: print a warning
if the vm dirtiness settings are illogical") because it causes false
positive warnings during OOM situations as noticed by Tetsuo Handa:
The problem is that both thresh and bg_thresh will be 0 if
available_memory is less than 4 pages when evaluating
global_dirtyable_memory.
While this might be worked around the whole point of the warning is
dubious at best. We do rely on admins to do sensible things when
changing tunable knobs. Dirty memory writeback knobs are not any
special in that regards so revert the warning rather than adding more
hacks to work this around.
chenjie [Thu, 30 Nov 2017 00:10:54 +0000 (16:10 -0800)]
mm/madvise.c: fix madvise() infinite loop under special circumstances
MADVISE_WILLNEED has always been a noop for DAX (formerly XIP) mappings.
Unfortunately madvise_willneed() doesn't communicate this information
properly to the generic madvise syscall implementation. The calling
convention is quite subtle there. madvise_vma() is supposed to either
return an error or update &prev otherwise the main loop will never
advance to the next vma and it will keep looping for ever without a way
to get out of the kernel.
It seems this has been broken since introduction. Nobody has noticed
because nobody seems to be using MADVISE_WILLNEED on these DAX mappings.
Kees Cook [Thu, 30 Nov 2017 00:10:51 +0000 (16:10 -0800)]
exec: avoid RLIMIT_STACK races with prlimit()
While the defense-in-depth RLIMIT_STACK limit on setuid processes was
protected against races from other threads calling setrlimit(), I missed
protecting it against races from external processes calling prlimit().
This adds locking around the change and makes sure that rlim_max is set
too.
Dan Williams [Thu, 30 Nov 2017 00:10:47 +0000 (16:10 -0800)]
IB/core: disable memory registration of filesystem-dax vmas
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow RDMA to create long standing memory registrations
against filesytem-dax vmas.
Dan Williams [Thu, 30 Nov 2017 00:10:43 +0000 (16:10 -0800)]
v4l2: disable filesystem-dax mapping support
V4L2 memory registrations are incompatible with filesystem-dax that
needs the ability to revoke dma access to a mapping at will, or
otherwise allow the kernel to wait for completion of DMA. The
filesystem-dax implementation breaks the traditional solution of
truncate of active file backed mappings since there is no page-cache
page we can orphan to sustain ongoing DMA.
If v4l2 wants to support long lived DMA mappings it needs to arrange to
hold a file lease or use some other mechanism so that the kernel can
coordinate revoking DMA access when the filesystem needs to truncate
mappings.
Dan Williams [Thu, 30 Nov 2017 00:10:39 +0000 (16:10 -0800)]
mm: fail get_vaddr_frames() for filesystem-dax mappings
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow V4L2, Exynos, and other frame vector users to create
long standing / irrevocable memory registrations against filesytem-dax
vmas.
Dan Williams [Thu, 30 Nov 2017 00:10:35 +0000 (16:10 -0800)]
mm: introduce get_user_pages_longterm
Patch series "introduce get_user_pages_longterm()", v2.
Here is a new get_user_pages api for cases where a driver intends to
keep an elevated page count indefinitely. This is distinct from usages
like iov_iter_get_pages where the elevated page counts are transient.
The iov_iter_get_pages cases immediately turn around and submit the
pages to a device driver which will put_page when the i/o operation
completes (under kernel control).
In the longterm case userspace is responsible for dropping the page
reference at some undefined point in the future. This is untenable for
filesystem-dax case where the filesystem is in control of the lifetime
of the block / page and needs reasonable limits on how long it can wait
for pages in a mapping to become idle.
Fixing filesystems to actually wait for dax pages to be idle before
blocks from a truncate/hole-punch operation are repurposed is saved for
a later patch series.
Also, allowing longterm registration of dax mappings is a future patch
series that introduces a "map with lease" semantic where the kernel can
revoke a lease and force userspace to drop its page references.
I have also tagged these for -stable to purposely break cases that might
assume that longterm memory registrations for filesystem-dax mappings
were supported by the kernel. The behavior regression this policy
change implies is one of the reasons we maintain the "dax enabled.
Warning: EXPERIMENTAL, use at your own risk" notification when mounting
a filesystem in dax mode.
It is worth noting the device-dax interface does not suffer the same
constraints since it does not support file space management operations
like hole-punch.
This patch (of 4):
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow long standing memory registrations against
filesytem-dax vmas. Device-dax vmas do not have this problem and are
explicitly allowed.
This is temporary until a "memory registration with layout-lease"
mechanism can be implemented for the affected sub-systems (RDMA and
V4L2).
Dan Williams [Thu, 30 Nov 2017 00:10:32 +0000 (16:10 -0800)]
device-dax: implement ->split() to catch invalid munmap attempts
Similar to how device-dax enforces that the 'address', 'offset', and
'len' parameters to mmap() be aligned to the device's fundamental
alignment, the same constraints apply to munmap(). Implement ->split()
to fail munmap calls that violate the alignment constraint.
Otherwise, we later fail VM_BUG_ON checks in the unmap_page_range() path
with crash signatures of the form:
vma ffff8800b60c8a88 start 00007f88c0000000 end 00007f88c0e00000
next (null) prev (null) mm ffff8800b61150c0
prot 8000000000000027 anon_vma (null) vm_ops ffffffffa0091240
pgoff 0 file ffff8800b638ef80 private_data (null)
flags: 0x380000fb(read|write|shared|mayread|maywrite|mayexec|mayshare|softdirty|mixedmap|hugepage)
------------[ cut here ]------------
kernel BUG at mm/huge_memory.c:2014!
[..]
RIP: 0010:__split_huge_pud+0x12a/0x180
[..]
Call Trace:
unmap_page_range+0x245/0xa40
? __vma_adjust+0x301/0x990
unmap_vmas+0x4c/0xa0
unmap_region+0xae/0x120
? __vma_rb_erase+0x11a/0x230
do_munmap+0x276/0x410
vm_munmap+0x6a/0xa0
SyS_munmap+0x1d/0x30
Dan Williams [Thu, 30 Nov 2017 00:10:28 +0000 (16:10 -0800)]
mm, hugetlbfs: introduce ->split() to vm_operations_struct
Patch series "device-dax: fix unaligned munmap handling"
When device-dax is operating in huge-page mode we want it to behave like
hugetlbfs and fail attempts to split vmas into unaligned ranges. It
would be messy to teach the munmap path about device-dax alignment
constraints in the same (hstate) way that hugetlbfs communicates this
constraint. Instead, these patches introduce a new ->split() vm
operation.
This patch (of 2):
The device-dax interface has similar constraints as hugetlbfs in that it
requires the munmap path to unmap in huge page aligned units. Rather
than add more custom vma handling code in __split_vma() introduce a new
vm operation to perform this vma specific check.
If you had, then the powerpc code would have worked ... ;-) and many
of the other interfaces in include/asm-generic/pgtable.h are
protected that way ...
Given that some architecture already define pmd_write() as a macro, it's
a net reduction to drop the definition of __HAVE_ARCH_PMD_WRITE.
Dan Williams [Thu, 30 Nov 2017 00:10:06 +0000 (16:10 -0800)]
mm: fix device-dax pud write-faults triggered by get_user_pages()
Currently only get_user_pages_fast() can safely handle the writable gup
case due to its use of pud_access_permitted() to check whether the pud
entry is writable. In the gup slow path pud_write() is used instead of
pud_access_permitted() and to date it has been unimplemented, just calls
BUG_ON().
For now this just implements a simple check for the _PAGE_RW bit similar
to pmd_write. However, this implies that the gup-slow-path check is
missing the extra checks that the gup-fast-path performs with
pud_access_permitted. Later patches will align all checks to use the
'access_permitted' helper if the architecture provides it.
Note that the generic 'access_permitted' helper fallback is the simple
_PAGE_RW check on architectures that do not define the
'access_permitted' helper(s).
Mike Kravetz [Thu, 30 Nov 2017 00:10:01 +0000 (16:10 -0800)]
mm/cma: fix alloc_contig_range ret code/potential leak
If the call __alloc_contig_migrate_range() in alloc_contig_range returns
-EBUSY, processing continues so that test_pages_isolated() is called
where there is a tracepoint to identify the busy pages. However, it is
possible for busy pages to become available between the calls to these
two routines. In this case, the range of pages may be allocated.
Unfortunately, the original return code (ret == -EBUSY) is still set and
returned to the caller. Therefore, the caller believes the pages were
not allocated and they are leaked.
Update the comment to indicate that allocation is still possible even if
__alloc_contig_migrate_range returns -EBUSY. Also, clear return code in
this case so that it is not accidentally used or returned to caller.