]> Git Repo - linux.git/log
linux.git
8 years agonfs: make nfs4_cb_sv_ops static
Jason Yan [Fri, 10 Mar 2017 02:48:13 +0000 (10:48 +0800)]
nfs: make nfs4_cb_sv_ops static

Fixes the following sparse warning:

fs/nfs/callback.c:235:21: warning: symbol 'nfs4_cb_sv_ops' was not
declared. Should it be static?

Signed-off-by: Jason Yan <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
8 years agoxprtrdma: Squelch kbuild sparse complaint
Chuck Lever [Sat, 11 Mar 2017 20:52:47 +0000 (15:52 -0500)]
xprtrdma: Squelch kbuild sparse complaint

New complaint from kbuild for 4.9.y:

net/sunrpc/xprtrdma/verbs.c:489:19: sparse: incompatible types in
    comparison expression (different type sizes)

verbs.c:
489 max_sge = min(ia->ri_device->attrs.max_sge, RPCRDMA_MAX_SEND_SGES);

I can't reproduce this running sparse here. Likewise, "make W=1
net/sunrpc/xprtrdma/verbs.o" never indicated any issue.

A little poking suggests that because the range of its values is
small, gcc can make the actual width of RPCRDMA_MAX_SEND_SGES
smaller than the width of an unsigned integer.

Fixes: 16f906d66cd7 ("xprtrdma: Reduce required number of send SGEs")
Signed-off-by: Chuck Lever <[email protected]>
Cc: [email protected]
Signed-off-by: Anna Schumaker <[email protected]>
8 years agoNFS: fix the fault nrequests decreasing for nfs_inode COPY
Kinglong Mee [Thu, 9 Mar 2017 03:36:36 +0000 (11:36 +0800)]
NFS: fix the fault nrequests decreasing for nfs_inode COPY

The nfs_commit_file for NFSv4.2's COPY operation goes through
the commit path for normal WRITE, but without increase nrequests,
so, the nrequests decreased in nfs_commit_release_pages is fault.
After that, the nrequests will be wrong.

[ 5670.299881] ------------[ cut here ]------------
[ 5670.300295] WARNING: CPU: 0 PID: 27656 at fs/nfs/inode.c:127 nfs_clear_inode+0x66/0x90 [nfs]
[ 5670.300558] Modules linked in: nfsv4(E) nfs(E) fscache(E) tun bridge stp llc fuse ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event ppdev f2fs coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_ens1371 intel_rapl_perf gameport snd_ac97_codec vmw_balloon ac97_bus snd_seq snd_pcm joydev snd_rawmidi snd_timer snd_seq_device snd soundcore nfit parport_pc parport acpi_cpufreq tpm_tis tpm_tis_core tpm i2c_piix4 vmw_vmci shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm e1000 crc32c_intel mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi fjes [last unloaded: fscache]
[ 5670.302925] CPU: 0 PID: 27656 Comm: umount.nfs4 Tainted: G        W   E   4.11.0-rc1+ #519
[ 5670.303292] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 5670.304094] Call Trace:
[ 5670.304510]  dump_stack+0x63/0x86
[ 5670.304917]  __warn+0xcb/0xf0
[ 5670.305276]  warn_slowpath_null+0x1d/0x20
[ 5670.305661]  nfs_clear_inode+0x66/0x90 [nfs]
[ 5670.306093]  nfs4_evict_inode+0x61/0x70 [nfsv4]
[ 5670.306480]  evict+0xbb/0x1c0
[ 5670.306888]  dispose_list+0x4d/0x70
[ 5670.307233]  evict_inodes+0x178/0x1a0
[ 5670.307579]  generic_shutdown_super+0x44/0xf0
[ 5670.307985]  nfs_kill_super+0x21/0x40 [nfs]
[ 5670.308325]  deactivate_locked_super+0x43/0x70
[ 5670.308698]  deactivate_super+0x5a/0x60
[ 5670.309036]  cleanup_mnt+0x3f/0x90
[ 5670.309407]  __cleanup_mnt+0x12/0x20
[ 5670.309837]  task_work_run+0x80/0xa0
[ 5670.310162]  exit_to_usermode_loop+0x89/0x90
[ 5670.310497]  syscall_return_slowpath+0xaa/0xb0
[ 5670.310875]  entry_SYSCALL_64_fastpath+0xa7/0xa9
[ 5670.311197] RIP: 0033:0x7f1bb3617fe7
[ 5670.311545] RSP: 002b:00007ffecbabb828 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
[ 5670.311906] RAX: 0000000000000000 RBX: 0000000001dca1f0 RCX: 00007f1bb3617fe7
[ 5670.312239] RDX: 000000000000000c RSI: 0000000000000001 RDI: 0000000001dc83c0
[ 5670.312653] RBP: 0000000001dc83c0 R08: 0000000000000001 R09: 0000000000000000
[ 5670.312998] R10: 0000000000000755 R11: 0000000000000206 R12: 00007ffecbabc66a
[ 5670.313335] R13: 0000000001dc83a0 R14: 0000000000000000 R15: 0000000000000000
[ 5670.313758] ---[ end trace bf4bfe7764e4eb40 ]---

Cc: [email protected]
Fixes: 67911c8f18 ("NFS: Add nfs_commit_file()")
Signed-off-by: Kinglong Mee <[email protected]>
Cc: [email protected] # 4.7+
Signed-off-by: Anna Schumaker <[email protected]>
8 years agoMerge tag 'afs-20170316' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells...
Linus Torvalds [Fri, 17 Mar 2017 19:16:44 +0000 (12:16 -0700)]
Merge tag 'afs-20170316' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:
 "Fixes to the AFS filesystem in the kernel.

  They fix a variety of bugs. These include some issues fixed for
  consistency with other AFS implementations:

   - handle AFS mode bits better

   - use the client mtime rather than the server mtime in the protocol

   - handle the server returning more or less data than was requested in
     a FetchData call

   - distinguish mountpoints from symlinks based on the mode bits rather
     than preemptively reading every symlink to find out what it
     actually represents

  One other notable change for the user is that files are now flushed on
  close analogously with other network filesystems"

* tag 'afs-20170316' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (28 commits)
  afs: Don't wait for page writeback with the page lock held
  afs: ->writepage() shouldn't call clear_page_dirty_for_io()
  afs: Fix abort on signal while waiting for call completion
  afs: Fix an off-by-one error in afs_send_pages()
  afs: Fix afs_kill_pages()
  afs: Fix page leak in afs_write_begin()
  afs: Don't set PG_error on local EINTR or ENOMEM when filling a page
  afs: Populate and use client modification time
  afs: Better abort and net error handling
  afs: Invalid op ID should abort with RXGEN_OPCODE
  afs: Fix the maths in afs_fs_store_data()
  afs: Use a bvec rather than a kvec in afs_send_pages()
  afs: Make struct afs_read::remain 64-bit
  afs: Fix AFS read bug
  afs: Prevent callback expiry timer overflow
  afs: Migrate vlocation fields to 64-bit
  afs: security: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
  afs: inode: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
  afs: Distinguish mountpoints from symlinks by file mode alone
  afs: Flush outstanding writes when an fd is closed
  ...

8 years agoMerge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Fri, 17 Mar 2017 19:14:49 +0000 (12:14 -0700)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fix from Russell King:
 "Just one change to add the statx syscall this time around"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: wire up statx syscall

8 years agoMerge tag 'for-linus-4.11b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 17 Mar 2017 18:25:46 +0000 (11:25 -0700)]
Merge tag 'for-linus-4.11b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A minor fix for using the appropriate refcount_t instead of atomic_t"

* tag 'for-linus-4.11b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  drivers, xen: convert grant_map.users from atomic_t to refcount_t

8 years agoMerge tag 'drm-fixes-for-v4.11-rc3' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 17 Mar 2017 18:19:52 +0000 (11:19 -0700)]
Merge tag 'drm-fixes-for-v4.11-rc3' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Bunch of fixes across the drivers, in a St Patrick's day pull request
  (please turn terminal colors to green on black or black on green for
  full effect).

  On the arm side, tilcdc, omap and malidp got fixes, while amd has some
  powermanagement fixes, and intel has a set of fixes across the driver.

  Nothing seems to bad or scary at this point"

* tag 'drm-fixes-for-v4.11-rc3' of git://people.freedesktop.org/~airlied/linux: (27 commits)
  drm/amd/amdgpu:  Fix debugfs reg read/write address width
  drm/amdgpu/si: add dpm quirk for Oland
  drm/radeon/si: add dpm quirk for Oland
  drm: amd: remove broken include path
  drm/amd/powerplay: fix copy error in smu7_clockpoweragting.c
  drm/tilcdc: Set framebuffer DMA address to HW only if CRTC is enabled
  drm/tilcdc: Fix hardcoded fail-return value in tilcdc_crtc_create()
  drm/i915: Fix forcewake active domain tracking
  drm/i915: Nuke skl_update_plane debug message from the pipe update critical section
  drm/i915: use correct node for handling cache domain eviction
  uapi: fix drm/omap_drm.h userspace compilation errors
  drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
  drm/amdgpu: fix parser init error path to avoid crash in parser fini
  drm/amd/amdgpu: Disable GFX_PG on Carrizo until compute issues solved
  drm: mali-dp: Fix smart layer not going to composition
  drm: mali-dp: Remove mclk rate management
  drm/i915: Drain the freed state from the tail of the next commit
  drm/i915: Nuke debug messages from the pipe update critical section
  drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl
  drm/i915: Store a permanent error in obj->mm.pages
  ...

8 years agoMerge tag 'perf-urgent-for-mingo-4.11-20170317' of git://git.kernel.org/pub/scm/linux...
Ingo Molnar [Fri, 17 Mar 2017 14:13:28 +0000 (15:13 +0100)]
Merge tag 'perf-urgent-for-mingo-4.11-20170317' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

- Fix symbols__fixup_end heuristic for corner cases, such as JITted eBPF
  programs, that are loaded at page aligned addresses, just after the
  kernel proper (Daniel Borkmann)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
8 years agoperf symbols: Fix symbols__fixup_end heuristic for corner cases
Daniel Borkmann [Wed, 15 Mar 2017 21:53:37 +0000 (22:53 +0100)]
perf symbols: Fix symbols__fixup_end heuristic for corner cases

The current symbols__fixup_end() heuristic for the last entry in the rb
tree is suboptimal as it leads to not being able to recognize the symbol
in the call graph in a couple of corner cases, for example:

 i) If the symbol has a start address (f.e. exposed via kallsyms)
    that is at a page boundary, then the roundup(curr->start, 4096)
    for the last entry will result in curr->start == curr->end with
    a symbol length of zero.

ii) If the symbol has a start address that is shortly before a page
    boundary, then also here, curr->end - curr->start will just be
    very few bytes, where it's unrealistic that we could perform a
    match against.

Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so
that we can catch such corner cases and have a better chance to find
that specific symbol. It's still just best effort as the real end of the
symbol is unknown to us (and could even be at a larger offset than the
current range), but better than the current situation.

Alexei reported that he recently run into case i) with a JITed eBPF
program (these are all page aligned) as the last symbol which wasn't
properly shown in the call graph (while other eBPF program symbols in
the rb tree were displayed correctly). Since this is a generic issue,
lets try to improve the heuristic a bit.

Reported-and-Tested-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Fixes: 2e538c4a1847 ("perf tools: Improve kernel/modules symbol lookup")
Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
8 years agox86/perf: Clarify why x86_pmu_event_mapped() isn't racy
Andy Lutomirski [Thu, 16 Mar 2017 19:59:40 +0000 (12:59 -0700)]
x86/perf: Clarify why x86_pmu_event_mapped() isn't racy

Naively, it looks racy, but ->mmap_sem saves it.  Add a comment and a
lockdep assertion.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: http://lkml.kernel.org/r/03a1e629063899168dfc4707f3bb6e581e21f5c6.1489694270.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
8 years agox86/perf: Fix CR4.PCE propagation to use active_mm instead of mm
Andy Lutomirski [Thu, 16 Mar 2017 19:59:39 +0000 (12:59 -0700)]
x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm

If one thread mmaps a perf event while another thread in the same mm
is in some context where active_mm != mm (which can happen in the
scheduler, for example), refresh_pce() would write the wrong value
to CR4.PCE.  This broke some PAPI tests.

Reported-and-tested-by: Vince Weaver <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: 7911d3f7af14 ("perf/x86: Only allow rdpmc if a perf_event is mapped")
Link: http://lkml.kernel.org/r/0c5b38a76ea50e405f9abe07a13dfaef87c173a1.1489694270.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
8 years agopowerpc/pseries: Don't give a warning when HPT resizing isn't available
Michael Ellerman [Fri, 17 Mar 2017 05:02:35 +0000 (16:02 +1100)]
powerpc/pseries: Don't give a warning when HPT resizing isn't available

As of commit 438cc81a41e8 ("powerpc/pseries: Automatically resize HPT
for memory hot add/remove"), when running on the pseries platform, we
always attempt to use the PAPR extension to resize the hashed page
table (HPT) when we add or remove memory.

This is fine, but when the extension is not available we'll give a
harmless, but scary warning. Instead check if the firmware supports HPT
resizing before populating the mmu_hash_ops.resize_hpt pointer.

Signed-off-by: David Gibson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
8 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 17 Mar 2017 01:23:02 +0000 (18:23 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "6 fixes"

* emailed patches from Andrew Morton <[email protected]>:
  drivers core: remove assert_held_device_hotplug()
  mm: add private lock to serialize memory hotplug operations
  mm: don't warn when vmalloc() fails due to a fatal signal
  mm, x86: fix native_pud_clear build error
  kasan: add a prototype of task_struct to avoid warning
  z3fold: fix spinlock unlocking in page reclaim

8 years agodrivers core: remove assert_held_device_hotplug()
Heiko Carstens [Thu, 16 Mar 2017 23:40:33 +0000 (16:40 -0700)]
drivers core: remove assert_held_device_hotplug()

The last caller of assert_held_device_hotplug() is gone, so remove it again.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
Acked-by: Dan Williams <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Sebastian Ott <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agomm: add private lock to serialize memory hotplug operations
Heiko Carstens [Thu, 16 Mar 2017 23:40:30 +0000 (16:40 -0700)]
mm: add private lock to serialize memory hotplug operations

Commit bfc8c90139eb ("mem-hotplug: implement get/put_online_mems")
introduced new functions get/put_online_mems() and mem_hotplug_begin/end()
in order to allow similar semantics for memory hotplug like for cpu
hotplug.

The corresponding functions for cpu hotplug are get/put_online_cpus()
and cpu_hotplug_begin/done() for cpu hotplug.

The commit however missed to introduce functions that would serialize
memory hotplug operations like they are done for cpu hotplug with
cpu_maps_update_begin/done().

This basically leaves mem_hotplug.active_writer unprotected and allows
concurrent writers to modify it, which may lead to problems as outlined
by commit f931ab479dd2 ("mm: fix devm_memremap_pages crash, use
mem_hotplug_{begin, done}").

That commit was extended again with commit b5d24fda9c3d ("mm,
devm_memremap_pages: hold device_hotplug lock over mem_hotplug_{begin,
done}") which serializes memory hotplug operations for some call sites
by using the device_hotplug lock.

In addition with commit 3fc21924100b ("mm: validate device_hotplug is held
for memory hotplug") a sanity check was added to mem_hotplug_begin() to
verify that the device_hotplug lock is held.

This in turn triggers the following warning on s390:

WARNING: CPU: 6 PID: 1 at drivers/base/core.c:643 assert_held_device_hotplug+0x4a/0x58
 Call Trace:
  assert_held_device_hotplug+0x40/0x58)
  mem_hotplug_begin+0x34/0xc8
  add_memory_resource+0x7e/0x1f8
  add_memory+0xda/0x130
  add_memory_merged+0x15c/0x178
  sclp_detect_standby_memory+0x2ae/0x2f8
  do_one_initcall+0xa2/0x150
  kernel_init_freeable+0x228/0x2d8
  kernel_init+0x2a/0x140
  kernel_thread_starter+0x6/0xc

One possible fix would be to add more lock_device_hotplug() and
unlock_device_hotplug() calls around each call site of
mem_hotplug_begin/end().  But that would give the device_hotplug lock
additional semantics it better should not have (serialize memory hotplug
operations).

Instead add a new memory_add_remove_lock which has the similar semantics
like cpu_add_remove_lock for cpu hotplug.

To keep things hopefully a bit easier the lock will be locked and unlocked
within the mem_hotplug_begin/end() functions.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Heiko Carstens <[email protected]>
Reported-by: Sebastian Ott <[email protected]>
Acked-by: Dan Williams <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agomm: don't warn when vmalloc() fails due to a fatal signal
Dmitry Vyukov [Thu, 16 Mar 2017 23:40:27 +0000 (16:40 -0700)]
mm: don't warn when vmalloc() fails due to a fatal signal

When vmalloc() fails it prints a very lengthy message with all the
details about memory consumption assuming that it happened due to OOM.

However, vmalloc() can also fail due to fatal signal pending.  In such
case the message is quite confusing because it suggests that it is OOM
but the numbers suggest otherwise.  The messages can also pollute
console considerably.

Don't warn when vmalloc() fails due to fatal signal pending.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dmitry Vyukov <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agomm, x86: fix native_pud_clear build error
Arnd Bergmann [Thu, 16 Mar 2017 23:40:24 +0000 (16:40 -0700)]
mm, x86: fix native_pud_clear build error

We still get a build error in random configurations, after this has been
modified a few times:

  In file included from include/linux/mm.h:68:0,
                   from include/linux/suspend.h:8,
                   from arch/x86/kernel/asm-offsets.c:12:
  arch/x86/include/asm/pgtable.h:66:26: error: redefinition of 'native_pud_clear'
   #define pud_clear(pud)   native_pud_clear(pud)

My interpretation is that the build error comes from a typo in
__PAGETABLE_PUD_FOLDED, so fix that typo now, and remove the incorrect
#ifdef around the native_pud_clear definition.

Fixes: 3e761a42e19c ("mm, x86: fix HIGHMEM64 && PARAVIRT build config for native_pud_clear()")
Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Ackedy-by: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Thomas Garnier <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Borislav Petkov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agokasan: add a prototype of task_struct to avoid warning
Masami Hiramatsu [Thu, 16 Mar 2017 23:40:21 +0000 (16:40 -0700)]
kasan: add a prototype of task_struct to avoid warning

Add a prototype of task_struct to fix below warning on arm64.

  In file included from arch/arm64/kernel/probes/kprobes.c:19:0:
  include/linux/kasan.h:81:132: error: 'struct task_struct' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
   static inline void kasan_unpoison_task_stack(struct task_struct *task) {}

As same as other types (kmem_cache, page, and vm_struct) this adds a
prototype of task_struct data structure on top of kasan.h.

[arnd] A related warning was fixed before, but now appears in a
different line in the same file in v4.11-rc2.  The patch from Masami
Hiramatsu still seems appropriate, so let's take his version.

Fixes: 71af2ed5eeea ("kasan, sched/headers: Remove <linux/sched.h> from <linux/kasan.h>")
Link: https://patchwork.kernel.org/patch/9569839/
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Acked-by: Alexander Potapenko <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agoz3fold: fix spinlock unlocking in page reclaim
Vitaly Wool [Thu, 16 Mar 2017 23:40:19 +0000 (16:40 -0700)]
z3fold: fix spinlock unlocking in page reclaim

Commmit 5a27aa822029 ("z3fold: add kref refcounting") introduced a bug
in z3fold_reclaim_page() with function exit that may leave pool->lock
spinlock held.  Here comes the trivial fix.

Fixes: 5a27aa822029 ("z3fold: add kref refcounting")
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Alexey Khoroshilov <[email protected]>
Signed-off-by: Vitaly Wool <[email protected]>
Cc: Dan Streetman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
8 years agoMerge tag 'xfs-4.11-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Thu, 16 Mar 2017 19:30:43 +0000 (12:30 -0700)]
Merge tag 'xfs-4.11-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Here's a single fix for -rc3 to improve input validation on inline
  directory data to prevent buffer overruns due to corrupt metadata"

* tag 'xfs-4.11-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: verify inline directory data forks

8 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 16 Mar 2017 18:47:28 +0000 (11:47 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes/cleanups from Catalin Marinas:
 "In Will's absence I'm sending the arm64 fixes he queued for 4.11-rc3:

   - fix arm64 kernel boot warning when DEBUG_VIRTUAL and KASAN are
     enabled

   - enable KEYS_COMPAT for keyctl compat support

   - use cpus_have_const_cap() for system_uses_ttbr0_pan() (slight
     performance improvement)

   - update kerneldoc for cpu_suspend() rename

   - remove the arm64-specific kprobe_exceptions_notify (weak generic
     variant defined)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Update kerneldoc for cpu_suspend() rename
  arm64: use const cap for system_uses_ttbr0_pan()
  arm64: support keyctl() system call in 32-bit mode
  arm64: kasan: avoid bad virt_to_pfn()
  arm64: kprobes: remove kprobe_exceptions_notify

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Linus Torvalds [Thu, 16 Mar 2017 18:43:48 +0000 (11:43 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md

Pull MD fixes from Shaohua Li:

 - fix a parity calculation bug of raid5 cache by Song

 - fix a potential deadlock issue by me

 - fix two endian issues by Jason

 - fix a disk limitation issue by Neil

 - other small fixes and cleanup

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid1: fix a trivial typo in comments
  md/r5cache: fix set_syndrome_sources() for data in cache
  md: fix incorrect use of lexx_to_cpu in does_sb_need_changing
  md: fix super_offset endianness in super_1_rdev_size_change
  md/raid1/10: fix potential deadlock
  md: don't impose the MD_SB_DISKS limit on arrays without metadata.
  md: move funcs from pers->resize to update_size
  md-cluster: remove useless memset from gather_all_resync_info
  md-cluster: free md_cluster_info if node leave cluster
  md: delete dead code
  md/raid10: submit bio directly to replacement disk

8 years agoafs: Don't wait for page writeback with the page lock held
David Howells [Thu, 16 Mar 2017 16:27:49 +0000 (16:27 +0000)]
afs: Don't wait for page writeback with the page lock held

Drop the page lock before waiting for page writeback.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: ->writepage() shouldn't call clear_page_dirty_for_io()
David Howells [Thu, 16 Mar 2017 16:27:49 +0000 (16:27 +0000)]
afs: ->writepage() shouldn't call clear_page_dirty_for_io()

The ->writepage() op shouldn't call clear_page_dirty_for_io() as that has
already been called by the caller.

Fix afs_writepage() by moving the call out of
afs_write_back_from_locked_page() to afs_writepages_region() where it is
needed.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Fix abort on signal while waiting for call completion
David Howells [Thu, 16 Mar 2017 16:27:49 +0000 (16:27 +0000)]
afs: Fix abort on signal while waiting for call completion

Fix the way in which a call that's in progress and being waited for is
aborted in the case that EINTR is detected.  We should be sending
RX_USER_ABORT rather than RX_CALL_DEAD as the abort code.

Note that since the only two ways out of the loop are if the call completes
or if a signal happens, the kill-the-call clause after the loop has
finished can only happen in the case of EINTR.  This means that we only
have one abort case to deal with, not two, and the "KWC" case can never
happen and so can be deleted.

Note further that simply aborting the call isn't necessarily the best thing
here since at this point: the request has been entirely sent and it's
likely the server will do the operation anyway - whether we abort it or
not.  In future, we should punt the handling of the remainder of the call
off to a background thread.

Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Fix an off-by-one error in afs_send_pages()
David Howells [Thu, 16 Mar 2017 16:27:48 +0000 (16:27 +0000)]
afs: Fix an off-by-one error in afs_send_pages()

afs_send_pages() should only put the call into the AFS_CALL_AWAIT_REPLY
state if it has sent all the pages - but the check it makes is incorrect
and sometimes it will finish the loop early.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Fix afs_kill_pages()
David Howells [Thu, 16 Mar 2017 16:27:48 +0000 (16:27 +0000)]
afs: Fix afs_kill_pages()

Fix afs_kill_pages() in two ways:

 (1) If a writeback has been partially flushed, then if we try and kill the
     pages it contains, some of them may no longer be undergoing writeback
     and end_page_writeback() will assert.

     Fix this by checking to see whether the page in question is actually
     undergoing writeback before ending that writeback.

 (2) The loop that scans for pages to kill doesn't increase the first page
     index, and so the loop may not terminate, but it will try to process
     the same pages over and over again.

     Fix this by increasing the first page index to one after the last page
     we processed.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Fix page leak in afs_write_begin()
David Howells [Thu, 16 Mar 2017 16:27:48 +0000 (16:27 +0000)]
afs: Fix page leak in afs_write_begin()

afs_write_begin() leaks a ref and a lock on a page if afs_fill_page()
fails.  Fix the leak by unlocking and releasing the page in the error path.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Don't set PG_error on local EINTR or ENOMEM when filling a page
David Howells [Thu, 16 Mar 2017 16:27:48 +0000 (16:27 +0000)]
afs: Don't set PG_error on local EINTR or ENOMEM when filling a page

Don't set PG_error on a page if we get local EINTR or ENOMEM when filling a
page for writing.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Populate and use client modification time
Marc Dionne [Thu, 16 Mar 2017 16:27:47 +0000 (16:27 +0000)]
afs: Populate and use client modification time

The inode timestamps should be set from the client time
in the status received from the server, rather than the
server time which is meant for internal server use.

Set AFS_SET_MTIME and populate the mtime for operations
that take an input status, such as file/dir creation
and StoreData.  If an input time is not provided the
server will set the vnode times based on the current server
time.

In a situation where the server has some skew with the
client, this could lead to the client seeing a timestamp
in the future for a file that it just created or wrote.

Signed-off-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Better abort and net error handling
David Howells [Thu, 16 Mar 2017 16:27:47 +0000 (16:27 +0000)]
afs: Better abort and net error handling

If we receive a network error, a remote abort or a protocol error whilst
we're still transmitting data, make sure we return an appropriate error to
the caller rather than ESHUTDOWN or ECONNABORTED.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Invalid op ID should abort with RXGEN_OPCODE
David Howells [Thu, 16 Mar 2017 16:27:47 +0000 (16:27 +0000)]
afs: Invalid op ID should abort with RXGEN_OPCODE

When we are given an invalid operation ID, we should abort that with
RXGEN_OPCODE rather than RX_INVALID_OPERATION.

Also map RXGEN_OPCODE to -ENOTSUPP.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Fix the maths in afs_fs_store_data()
David Howells [Thu, 16 Mar 2017 16:27:47 +0000 (16:27 +0000)]
afs: Fix the maths in afs_fs_store_data()

afs_fs_store_data() works out of the size of the write it's going to make,
but it uses 32-bit unsigned subtraction in one place that gets
automatically cast to loff_t.

However, if to < offset, then the number goes negative, but as the result
isn't signed, this doesn't get sign-extended to 64-bits when placed in a
loff_t.

Fix by casting the operands to loff_t.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Use a bvec rather than a kvec in afs_send_pages()
David Howells [Thu, 16 Mar 2017 16:27:46 +0000 (16:27 +0000)]
afs: Use a bvec rather than a kvec in afs_send_pages()

Use a bvec rather than a kvec in afs_send_pages() as we don't then have to
call kmap() in advance.  This allows us to pass the array of contiguous
pages that we extracted through to rxrpc in one go rather than passing a
single page at a time.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Make struct afs_read::remain 64-bit
David Howells [Thu, 16 Mar 2017 16:27:46 +0000 (16:27 +0000)]
afs: Make struct afs_read::remain 64-bit

Make struct afs_read::remain 64-bit so that it can handle huge transfers if
we ever request them or the server decides to give us a bit extra data (the
other fields there are already 64-bit).

Signed-off-by: David Howells <[email protected]>
Tested-by: Marc Dionne <[email protected]>
8 years agoafs: Fix AFS read bug
David Howells [Thu, 16 Mar 2017 16:27:46 +0000 (16:27 +0000)]
afs: Fix AFS read bug

Fix a bug in AFS read whereby the request page afs_read::index isn't
incremented after calling ->page_done() if ->remain reaches 0, indicating
that the data read is complete.

Without this a NULL pointer exception happens when ->page_done() is called
twice for the last page because the page clearing loop will call it also
and afs_readpages_page_done() clears the current entry in the page list.

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: afs_readpages_page_done+0x21/0xa4 [kafs]
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: kafs(E)
CPU: 2 PID: 3002 Comm: md5sum Tainted: G            E   4.10.0-fscache #485
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8804017d86c0 task.stack: ffff8803fc1d8000
RIP: 0010:afs_readpages_page_done+0x21/0xa4 [kafs]
RSP: 0018:ffff8803fc1db978 EFLAGS: 00010282
RAX: ffff880405d39af8 RBX: 0000000000000000 RCX: ffff880407d83ed4
RDX: 0000000000000000 RSI: ffff880405d39a00 RDI: ffff880405c6f400
RBP: ffff8803fc1db988 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8803fc1db820 R11: ffff88040cf56000 R12: ffff8804088f1780
R13: ffff8804017d86c0 R14: ffff8804088f1780 R15: 0000000000003840
FS:  00007f8154469700(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004016ec000 CR4: 00000000001406e0
Call Trace:
 afs_deliver_fs_fetch_data+0x5b9/0x60e [kafs]
 ? afs_make_call+0x316/0x4e8 [kafs]
 ? afs_make_call+0x359/0x4e8 [kafs]
 afs_deliver_to_call+0x173/0x2e8 [kafs]
 ? afs_make_call+0x316/0x4e8 [kafs]
 afs_make_call+0x37a/0x4e8 [kafs]
 ? wake_up_q+0x4f/0x4f
 ? __init_waitqueue_head+0x36/0x49
 afs_fs_fetch_data+0x21c/0x227 [kafs]
 ? afs_fs_fetch_data+0x21c/0x227 [kafs]
 afs_vnode_fetch_data+0xf3/0x1d2 [kafs]
 afs_readpages+0x314/0x3fd [kafs]
 __do_page_cache_readahead+0x208/0x2c5
 ondemand_readahead+0x3a2/0x3b7
 ? ondemand_readahead+0x3a2/0x3b7
 page_cache_async_readahead+0x5e/0x67
 generic_file_read_iter+0x23b/0x70c
 ? __inode_security_revalidate+0x2f/0x62
 __vfs_read+0xc4/0xe8
 vfs_read+0xd1/0x15a
 SyS_read+0x4c/0x89
 do_syscall_64+0x80/0x191
 entry_SYSCALL64_slow_path+0x25/0x25

Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
Tested-by: Marc Dionne <[email protected]>
8 years agoafs: Prevent callback expiry timer overflow
Tina Ruchandani [Thu, 16 Mar 2017 16:27:46 +0000 (16:27 +0000)]
afs: Prevent callback expiry timer overflow

get_seconds() returns real wall-clock seconds. On 32-bit systems
this value will overflow in year 2038 and beyond. This patch changes
afs_vnode record to use ktime_get_real_seconds() instead, for the
fields cb_expires and cb_expires_at.

Signed-off-by: Tina Ruchandani <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Migrate vlocation fields to 64-bit
Tina Ruchandani [Thu, 16 Mar 2017 16:27:46 +0000 (16:27 +0000)]
afs: Migrate vlocation fields to 64-bit

get_seconds() returns real wall-clock seconds. On 32-bit systems
this value will overflow in year 2038 and beyond. This patch changes
afs's vlocation record to use ktime_get_real_seconds() instead, for the
fields time_of_death and update_at.

Signed-off-by: Tina Ruchandani <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: security: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
Andreea-Cristina Bernat [Thu, 16 Mar 2017 16:27:45 +0000 (16:27 +0000)]
afs: security: Replace rcu_assign_pointer() with RCU_INIT_POINTER()

The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (..., NULL)

Signed-off-by: Andreea-Cristina Bernat <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: inode: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
Andreea-Cristina Bernat [Thu, 16 Mar 2017 16:27:45 +0000 (16:27 +0000)]
afs: inode: Replace rcu_assign_pointer() with RCU_INIT_POINTER()

The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (..., NULL)

Signed-off-by: Andreea-Cristina Bernat <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Distinguish mountpoints from symlinks by file mode alone
David Howells [Thu, 16 Mar 2017 16:27:45 +0000 (16:27 +0000)]
afs: Distinguish mountpoints from symlinks by file mode alone

In AFS, mountpoints appear as symlinks with mode 0644 and normal symlinks
have mode 0777, so use this to distinguish them rather than reading the
content and parsing it.  In the case of a mountpoint, the symlink body is a
formatted string indicating the location of the target volume.

Note that with this, kAFS no longer 'pre-fetches' the contents of symlinks,
so afs_readpage() may fail with an access-denial because when the VFS calls
d_automount(), it wraps the call in an credentials override that sets the
initial creds - thereby preventing access to the caller's keyrings and the
authentication keys held therein.

To this end, a patch reverting that change to the VFS is required also.

Reported-by: Jeffrey Altman <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Flush outstanding writes when an fd is closed
David Howells [Thu, 16 Mar 2017 16:27:45 +0000 (16:27 +0000)]
afs: Flush outstanding writes when an fd is closed

Flush outstanding writes in afs when an fd is closed.  This is what NFS and
CIFS do.

Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Handle a short write to an AFS page
David Howells [Thu, 16 Mar 2017 16:27:44 +0000 (16:27 +0000)]
afs: Handle a short write to an AFS page

Handle the situation where afs_write_begin() is told to expect that a
full-page write will be made, but this doesn't happen (EFAULT, CTRL-C,
etc.), and so afs_write_end() sees a partial write took place.  Currently,
no attempt is to deal with the discrepency.

Fix this by loading the gap from the server.

Reported-by: Al Viro <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Kill struct afs_read::pg_offset
David Howells [Thu, 16 Mar 2017 16:27:44 +0000 (16:27 +0000)]
afs: Kill struct afs_read::pg_offset

Kill struct afs_read::pg_offset as nothing uses it.  It's unnecessary as pos
can be masked off.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Handle better the server returning excess or short data
David Howells [Thu, 16 Mar 2017 16:27:44 +0000 (16:27 +0000)]
afs: Handle better the server returning excess or short data

When an AFS server is given an FS.FetchData{,64} request to read data from
a file, it is permitted by the protocol to return more or less than was
requested.  kafs currently relies on the latter behaviour in readpage{,s}
to handle a partial page at the end of the file (we just ask for a whole
page and clear space beyond the short read).

However, we don't handle all cases.  Add:

 (1) Handle excess data by discarding it rather than aborting.  Note that
     we use a common static buffer to discard into so that the decryption
     algorithm advances the PCBC state.

 (2) Handle a short read that affects more than just the last page.

Note that if a read comes up unexpectedly short of long, it's possible that
the server's copy of the file changed - in which case the data version
number will have been incremented and the callback will have been broken -
in which case all the pages currently attached to the inode will be zapped
anyway at some point.

Signed-off-by: David Howells <[email protected]>
8 years agoafs: Deal with an empty callback array
Marc Dionne [Thu, 16 Mar 2017 16:27:44 +0000 (16:27 +0000)]
afs: Deal with an empty callback array

Servers may send a callback array that is the same size as
the FID array, or an empty array.  If the callback count is
0, the code would attempt to read (fid_count * 12) bytes of
data, which would fail and result in an unmarshalling error.
This would lead to stale data for remotely modified files
or directories.

Store the callback array size in the internal afs_call
structure and use that to determine the amount of data to
read.

Signed-off-by: Marc Dionne <[email protected]>
8 years agoafs: Adjust mode bits processing
Marc Dionne [Thu, 16 Mar 2017 16:27:44 +0000 (16:27 +0000)]
afs: Adjust mode bits processing

Mode bits for an afs file should not be enforced in the usual
way.

For files, the absence of user bits can restrict file access
with respect to what is granted by the server.

These bits apply regardless of the owner or the current uid; the
rest of the mode bits (group, other) are ignored.

Signed-off-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Populate group ID from vnode status
Marc Dionne [Thu, 16 Mar 2017 16:27:43 +0000 (16:27 +0000)]
afs: Populate group ID from vnode status

The group was hard coded to GLOBAL_ROOT_GID; use the group
ID that was received from the server.

Signed-off-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
8 years agoafs: Fix page overput in afs_fill_page()
David Howells [Thu, 16 Mar 2017 16:27:43 +0000 (16:27 +0000)]
afs: Fix page overput in afs_fill_page()

afs_fill_page() loads the page it wants to fill into the afs_read request
without incrementing its refcount - but then calls afs_put_read() to clean
up afterwards, which then releases a ref on the page.

Fix this by getting a ref on the page before calling
afs_vnode_fetch_data().

This causes sync after a write to hang in afs_writepages_region() because
find_get_pages_tag() gets confused and doesn't return.

Fixes: 196ee9cd2d04 ("afs: Make afs_fs_fetch_data() take a list of pages")
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
Tested-by: Marc Dionne <[email protected]>
8 years agoafs: Fix missing put_page()
David Howells [Thu, 16 Mar 2017 16:27:43 +0000 (16:27 +0000)]
afs: Fix missing put_page()

In afs_writepages_region(), inside the loop where we find dirty pages to
deal with, one of the if-statements is missing a put_page().

Signed-off-by: David Howells <[email protected]>
8 years agoperf/core: Better explain the inherit magic
Peter Zijlstra [Thu, 16 Mar 2017 12:47:51 +0000 (13:47 +0100)]
perf/core: Better explain the inherit magic

While going through the event inheritance code Oleg got confused.

Add some comments to better explain the silent dissapearance of
orphaned events.

So what happens is that at perf_event_release_kernel() time; when an
event looses its connection to userspace (and ceases to exist from the
user's perspective) we can still have an arbitrary amount of inherited
copies of the event. We want to synchronously find and remove all
these child events.

Since that requires a bit of lock juggling, there is the possibility
that concurrent clone()s will create new child events. Therefore we
first mark the parent event as DEAD, which marks all the extant child
events as orphaned.

We then avoid copying orphaned events; in order to avoid getting more
of them.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agoperf/core: Simplify perf_event_free_task()
Peter Zijlstra [Thu, 16 Mar 2017 12:47:50 +0000 (13:47 +0100)]
perf/core: Simplify perf_event_free_task()

We have ctx->event_list that contains all events; no need to
repeatedly iterate the group lists to find them all.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agoperf/core: Fix event inheritance on fork()
Peter Zijlstra [Thu, 16 Mar 2017 12:47:49 +0000 (13:47 +0100)]
perf/core: Fix event inheritance on fork()

While hunting for clues to a use-after-free, Oleg spotted that
perf_event_init_context() can loose an error value with the result
that fork() can succeed even though we did not fully inherit the perf
event context.

Spotted-by: Oleg Nesterov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 889ff0150661 ("perf/core: Split context's event group list into pinned and non-pinned lists")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agoperf/core: Fix use-after-free in perf_release()
Peter Zijlstra [Thu, 16 Mar 2017 12:47:48 +0000 (13:47 +0100)]
perf/core: Fix use-after-free in perf_release()

Dmitry reported syzcaller tripped a use-after-free in perf_release().

After much puzzlement Oleg spotted the below scenario:

  Task1                           Task2

  fork()
    perf_event_init_task()
    /* ... */
    goto bad_fork_$foo;
    /* ... */
    perf_event_free_task()
      mutex_lock(ctx->lock)
      perf_free_event(B)

                                  perf_event_release_kernel(A)
                                    mutex_lock(A->child_mutex)
                                    list_for_each_entry(child, ...) {
                                      /* child == B */
                                      ctx = B->ctx;
                                      get_ctx(ctx);
                                      mutex_unlock(A->child_mutex);

        mutex_lock(A->child_mutex)
        list_del_init(B->child_list)
        mutex_unlock(A->child_mutex)

        /* ... */

      mutex_unlock(ctx->lock);
      put_ctx() /* >0 */
    free_task();
                                      mutex_lock(ctx->lock);
                                      mutex_lock(A->child_mutex);
                                      /* ... */
                                      mutex_unlock(A->child_mutex);
                                      mutex_unlock(ctx->lock)
                                      put_ctx() /* 0 */
                                        ctx->task && !TOMBSTONE
                                          put_task_struct() /* UAF */

This patch closes the hole by making perf_event_free_task() destroy the
task <-> ctx relation such that perf_event_release_kernel() will no longer
observe the now dead task.

Spotted-by: Oleg Nesterov <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Link: http://lkml.kernel.org/r/20170314155949.GE32474@worktop
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agoMAINTAINERS: Add maintianer entry for crypto/s5p-sss
Krzysztof Kozlowski [Sat, 11 Mar 2017 06:11:00 +0000 (08:11 +0200)]
MAINTAINERS: Add maintianer entry for crypto/s5p-sss

Add Krzysztof Kozlowski and Vladimir Zapolskiy as maintainers of s5p-sss
driver for handling reviews, testing and getting bug reports from the
users.

Cc: Vladimir Zapolskiy <[email protected]>
Cc: Herbert Xu <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Vladimir Zapolskiy <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: doc - fix typo (struct sdesc)
Fabien DESSENNE [Thu, 9 Mar 2017 09:40:05 +0000 (10:40 +0100)]
crypto: doc - fix typo (struct sdesc)

Add missing " " in api-samples.rst

Signed-off-by: Fabien Dessenne <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - make hardware operation flow more efficient
Ryder Lee [Thu, 9 Mar 2017 02:11:19 +0000 (10:11 +0800)]
crypto: mediatek - make hardware operation flow more efficient

This patch refines data structures, which are used to control engine's
data path, to make it more efficient. Hence current change are:

- gathers the broken pieces of structures 'mtk_aes_ct''mtk_aes_tfm'
into struct mtk_aes_info hence avoiding additional DMA-mapping.

- adds 'keymode' in struct mtk_aes_base_ctx. When .setkey() callback is
called, we store keybit setting in keymode. Doing so, there is no need
to check keylen second time in mtk_aes_info_init() / mtk_aes_gcm_info_init().

Besides, this patch also removes unused macro definitions and adds helper
inline function to write security information(key, IV,...) to info->state.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - add mtk_aes_gcm_tag_verify()
Ryder Lee [Thu, 9 Mar 2017 02:11:18 +0000 (10:11 +0800)]
crypto: mediatek - add mtk_aes_gcm_tag_verify()

This patch adds mtk_aes_gcm_tag_verify() which is used to compare
authenticated tag.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - fix error handling in mtk_aes_complete()
Ryder Lee [Thu, 9 Mar 2017 02:11:17 +0000 (10:11 +0800)]
crypto: mediatek - fix error handling in mtk_aes_complete()

This patch fixes how errors should be handled by mtk_aes_complete().

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - add queue_task tasklet
Ryder Lee [Thu, 9 Mar 2017 02:11:16 +0000 (10:11 +0800)]
crypto: mediatek - add queue_task tasklet

This patch adds 'queue_task' to dequeue crypto requset. This will help to
avoid directly calling mtk_aes_handle_queue() / mtk_sha_handle_queue()
from done tasklet or error handler.

In order to avoid confusion, the new code properly renames DMA completion
"task" to "done_task".

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - simplify descriptor ring management
Ryder Lee [Thu, 9 Mar 2017 02:11:15 +0000 (10:11 +0800)]
crypto: mediatek - simplify descriptor ring management

This patch replaces cmd_pos/res_pos with pointer cmd_next/res_next.

In old code, we must to add one to shift ring to the next segment, and
then use this value to caculate current offset from ring base for each
DMA operation. Now these pointers helps us to simplify flow, so we just
need to move pointers and check the boundaries of ring.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - make mtk_sha_xmit() more generic
Ryder Lee [Thu, 9 Mar 2017 02:11:14 +0000 (10:11 +0800)]
crypto: mediatek - make mtk_sha_xmit() more generic

This is a transitional patch. It merges mtk_sha_xmit() and mtk_sha_xmit2()
to make transmit function more generic.
In addition, res->buf and cryp->tmp_dma in mtk_sha_xmit() are useless, since
crypto engine writes the result digests into ctx->tfm.digest instead of
res->buf. It's better to remove it.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - add MTK_* prefix and correct annotations.
Ryder Lee [Thu, 9 Mar 2017 02:11:13 +0000 (10:11 +0800)]
crypto: mediatek - add MTK_* prefix and correct annotations.

Dummy patch to add MTK_* prefix to ring enum and fix incorrect annotations.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: mediatek - rework interrupt handler
Ryder Lee [Thu, 9 Mar 2017 02:11:12 +0000 (10:11 +0800)]
crypto: mediatek - rework interrupt handler

This patch removes redundant task that used to handle interrupt
from ring manager, so that the same task/handler can be shared.
It also uses aes->id and sha-id to distinguish interrupt sources.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agohwrng: omap - move clock related code to omap_rng_probe()
Thomas Petazzoni [Tue, 7 Mar 2017 14:14:49 +0000 (15:14 +0100)]
hwrng: omap - move clock related code to omap_rng_probe()

Currently, the code that takes a reference to the clock and enables it
is located inside of_get_omap_rng_device_details(), called only when
probing through the Device Tree.

However, there is nothing that makes this clock logic dependent on the
Device Tree, so it makes more sense to have it in omap_rng_probe()
directly.

Moreover, we make sure to bail out if we can't enable the clock. Indeed,
while the clock is optional, if a clock is present, we really want to
succeed in enabling it. And we fix the error message to fit on one line,
so that it is grep-friendly.

Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agohwrng: meson - add clock handling to driver
Heiner Kallweit [Wed, 22 Feb 2017 07:02:31 +0000 (08:02 +0100)]
hwrng: meson - add clock handling to driver

Add handling of RNG0 clock to the driver.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agoARM64: dts: meson-gx: add clock CLKID_RNG0 to hwrng node
Heiner Kallweit [Wed, 22 Feb 2017 07:00:55 +0000 (08:00 +0100)]
ARM64: dts: meson-gx: add clock CLKID_RNG0 to hwrng node

Add clock CLKID_RNG0 to HW randon number generator node.

Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agodt-bindings: rng: add clock to DT binding documentation for hwrng
Heiner Kallweit [Wed, 22 Feb 2017 06:59:08 +0000 (07:59 +0100)]
dt-bindings: rng: add clock to DT binding documentation for hwrng

Add clock to DT binding documentation.

Signed-off-by: Heiner Kallweit <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agoclk: meson-gxbb: expose clock CLKID_RNG0
Heiner Kallweit [Wed, 22 Feb 2017 06:55:24 +0000 (07:55 +0100)]
clk: meson-gxbb: expose clock CLKID_RNG0

Expose clock CLKID_RNG0 which is needed for the HW random number generator.

Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Neil Armstrong <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agopowerpc: Wire up statx() syscall
Chandan Rajendra [Thu, 16 Mar 2017 09:07:11 +0000 (14:37 +0530)]
powerpc: Wire up statx() syscall

Test runs on a ppc64 BE guest succeeded. linux/samples/statx/test-statx
program was executed on the following file types,

1. Regular file
2. Directory
3. device file
4. symlink
5. Named pipe

The test run also included invoking test-statx with the runtime options
provided in the main() function of test-statx.c

Signed-off-by: Chandan Rajendra <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
8 years agohwrng: geode - Revert managed API changes
Prarit Bhargava [Tue, 14 Mar 2017 11:36:02 +0000 (07:36 -0400)]
hwrng: geode - Revert managed API changes

After commit e9afc746299d ("hwrng: geode - Use linux/io.h instead of
asm/io.h") the geode-rng driver uses devres with pci_dev->dev to keep
track of resources, but does not actually register a PCI driver.  This
results in the following issues:

1.  The driver leaks memory because the driver does not attach to a
device.  The driver only uses the PCI device as a reference.   devm_*()
functions will release resources on driver detach, which the geode-rng
driver will never do.  As a result,

2.  The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.

Revert the changes made by  e9afc746299d ("hwrng: geode - Use linux/io.h
instead of asm/io.h").

Cc: <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
Fixes: 6e9b5e76882c ("hwrng: geode - Migrate to managed API")
Cc: Matt Mackall <[email protected]>
Cc: Corentin LABBE <[email protected]>
Cc: PrasannaKumar Muralidharan <[email protected]>
Cc: Wei Yongjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Herbert Xu <[email protected]>
8 years agohwrng: amd - Revert managed API changes
Prarit Bhargava [Tue, 14 Mar 2017 11:36:01 +0000 (07:36 -0400)]
hwrng: amd - Revert managed API changes

After commit 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"), the
amd-rng driver uses devres with pci_dev->dev to keep track of resources,
but does not actually register a PCI driver.  This results in the
following issues:

1. The message

WARNING: CPU: 2 PID: 621 at drivers/base/dd.c:349 driver_probe_device+0x38c

is output when the i2c_amd756 driver loads and attempts to register a PCI
driver.  The PCI & device subsystems assume that no resources have been
registered for the device, and the WARN_ON() triggers since amd-rng has
already do so.

2.  The driver leaks memory because the driver does not attach to a
device.  The driver only uses the PCI device as a reference.   devm_*()
functions will release resources on driver detach, which the amd-rng
driver will never do.  As a result,

3.  The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.

Revert the changes made by 31b2a73c9c5f ("hwrng: amd - Migrate to managed
API").

Cc: <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
Fixes: 31b2a73c9c5f ("hwrng: amd - Migrate to managed API").
Cc: Matt Mackall <[email protected]>
Cc: Corentin LABBE <[email protected]>
Cc: PrasannaKumar Muralidharan <[email protected]>
Cc: Wei Yongjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Herbert Xu <[email protected]>
8 years agocrypto: ccp - Assign DMA commands to the channel's CCP
Gary R Hook [Fri, 10 Mar 2017 18:28:18 +0000 (12:28 -0600)]
crypto: ccp - Assign DMA commands to the channel's CCP

The CCP driver generally uses a round-robin approach when
assigning operations to available CCPs. For the DMA engine,
however, the DMA mappings of the SGs are associated with a
specific CCP. When an IOMMU is enabled, the IOMMU is
programmed based on this specific device.

If the DMA operations are not performed by that specific
CCP then addressing errors and I/O page faults will occur.

Update the CCP driver to allow a specific CCP device to be
requested for an operation and use this in the DMA engine
support.

Cc: <[email protected]> # 4.9.x-
Signed-off-by: Gary R Hook <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
8 years agosched/deadline: Use deadline instead of period when calculating overflow
Steven Rostedt (VMware) [Thu, 2 Mar 2017 14:10:59 +0000 (15:10 +0100)]
sched/deadline: Use deadline instead of period when calculating overflow

I was testing Daniel's changes with his test case, and tweaked it a
little. Instead of having the runtime equal to the deadline, I
increased the deadline ten fold.

Daniel's test case had:

attr.sched_runtime  = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
attr.sched_period   = 2 * 1000 * 1000 * 1000; /* 2 s */

To make it more interesting, I changed it to:

attr.sched_runtime  =  2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 20 * 1000 * 1000; /* 20 ms */
attr.sched_period   =  2 * 1000 * 1000 * 1000; /* 2 s */

The results were rather surprising. The behavior that Daniel's patch
was fixing came back. The task started using much more than .1% of the
CPU. More like 20%.

Looking into this I found that it was due to the dl_entity_overflow()
constantly returning true. That's because it uses the relative period
against relative runtime vs the absolute deadline against absolute
runtime.

  runtime / (deadline - t) > dl_runtime / dl_period

There's even a comment mentioning this, and saying that when relative
deadline equals relative period, that the equation is the same as using
deadline instead of period. That comment is backwards! What we really
want is:

  runtime / (deadline - t) > dl_runtime / dl_deadline

We care about if the runtime can make its deadline, not its period. And
then we can say "when the deadline equals the period, the equation is
the same as using dl_period instead of dl_deadline".

After correcting this, now when the task gets enqueued, it can throttle
correctly, and Daniel's fix to the throttling of sleeping deadline
tasks works even when the runtime and deadline are not the same.

Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Link: http://lkml.kernel.org/r/02135a27f1ae3fe5fd032568a5a2f370e190e8d7.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
8 years agosched/deadline: Throttle a constrained deadline task activated after the deadline
Daniel Bristot de Oliveira [Thu, 2 Mar 2017 14:10:58 +0000 (15:10 +0100)]
sched/deadline: Throttle a constrained deadline task activated after the deadline

During the activation, CBS checks if it can reuse the current task's
runtime and period. If the deadline of the task is in the past, CBS
cannot use the runtime, and so it replenishes the task. This rule
works fine for implicit deadline tasks (deadline == period), and the
CBS was designed for implicit deadline tasks. However, a task with
constrained deadline (deadine < period) might be awakened after the
deadline, but before the next period. In this case, replenishing the
task would allow it to run for runtime / deadline. As in this case
deadline < period, CBS enables a task to run for more than the
runtime / period. In a very loaded system, this can cause a domino
effect, making other tasks miss their deadlines.

To avoid this problem, in the activation of a constrained deadline
task after the deadline but before the next period, throttle the
task and set the replenishing timer to the begin of the next period,
unless it is boosted.

Reproducer:

 --------------- %< ---------------
  int main (int argc, char **argv)
  {
int ret;
int flags = 0;
unsigned long l = 0;
struct timespec ts;
struct sched_attr attr;

memset(&attr, 0, sizeof(attr));
attr.size = sizeof(attr);

attr.sched_policy   = SCHED_DEADLINE;
attr.sched_runtime  = 2 * 1000 * 1000; /* 2 ms */
attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */
attr.sched_period   = 2 * 1000 * 1000 * 1000; /* 2 s */

ts.tv_sec = 0;
ts.tv_nsec = 2000 * 1000; /* 2 ms */

ret = sched_setattr(0, &attr, flags);

if (ret < 0) {
perror("sched_setattr");
exit(-1);
}

for(;;) {
/* XXX: you may need to adjust the loop */
for (l = 0; l < 150000; l++);
/*
 * The ideia is to go to sleep right before the deadline
 * and then wake up before the next period to receive
 * a new replenishment.
 */
nanosleep(&ts, NULL);
}

exit(0);
  }
  --------------- >% ---------------

On my box, this reproducer uses almost 50% of the CPU time, which is
obviously wrong for a task with 2/2000 reservation.

Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luca Abeni <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Link: http://lkml.kernel.org/r/edf58354e01db46bf42df8d2dd32418833f68c89.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
8 years agosched/deadline: Make sure the replenishment timer fires in the next period
Daniel Bristot de Oliveira [Thu, 2 Mar 2017 14:10:57 +0000 (15:10 +0100)]
sched/deadline: Make sure the replenishment timer fires in the next period

Currently, the replenishment timer is set to fire at the deadline
of a task. Although that works for implicit deadline tasks because the
deadline is equals to the begin of the next period, that is not correct
for constrained deadline tasks (deadline < period).

For instance:

f.c:
 --------------- %< ---------------
int main (void)
{
for(;;);
}
 --------------- >% ---------------

  # gcc -o f f.c

  # trace-cmd record -e sched:sched_switch                              \
   -e syscalls:sys_exit_sched_setattr   \
   chrt -d --sched-runtime  490000000 \
           --sched-deadline 500000000 \
   --sched-period  1000000000 0 ./f

  # trace-cmd report | grep "{pid of ./f}"

After setting parameters, the task is replenished and continue running
until being throttled:

         f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0

The task is throttled after running 492318 ms, as expected:

         f-11295 [003] 13322.606094: sched_switch:   f:11295 [-1] R ==> watchdog/3:32 [0]

But then, the task is replenished 500719 ms after the first
replenishment:

    <idle>-0     [003] 13322.614495: sched_switch:   swapper/3:0 [120] R ==> f:11295 [-1]

Running for 490277 ms:

         f-11295 [003] 13323.104772: sched_switch:   f:11295 [-1] R ==>  swapper/3:0 [120]

Hence, in the first period, the task runs 2 * runtime, and that is a bug.

During the first replenishment, the next deadline is set one period away.
So the runtime / period starts to be respected. However, as the second
replenishment took place in the wrong instant, the next replenishment
will also be held in a wrong instant of time. Rather than occurring in
the nth period away from the first activation, it is taking place
in the (nth period - relative deadline).

Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Luca Abeni <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Romulo Silva de Oliveira <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tommaso Cucinotta <[email protected]>
Link: http://lkml.kernel.org/r/ac50d89887c25285b47465638354b63362f8adff.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
8 years agolocking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y
Niklas Cassel [Sat, 25 Feb 2017 00:17:53 +0000 (01:17 +0100)]
locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y

We hang if SIGKILL has been sent, but the task is stuck in down_read()
(after do_exit()), even though no task is doing down_write() on the
rwsem in question:

  INFO: task libupnp:21868 blocked for more than 120 seconds.
  libupnp         D    0 21868      1 0x08100008
  ...
  Call Trace:
  __schedule()
  schedule()
  __down_read()
  do_exit()
  do_group_exit()
  __wake_up_parent()

This bug has already been fixed for CONFIG_RWSEM_XCHGADD_ALGORITHM=y in
the following commit:

 04cafed7fc19 ("locking/rwsem: Fix down_write_killable()")

... however, this bug also exists for CONFIG_RWSEM_GENERIC_SPINLOCK=y.

Signed-off-by: Niklas Cassel <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: d47996082f52 ("locking/rwsem: Introduce basis for down_write_killable()")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agosched/loadavg: Use {READ,WRITE}_ONCE() for sample window
Matt Fleming [Fri, 17 Feb 2017 12:07:31 +0000 (12:07 +0000)]
sched/loadavg: Use {READ,WRITE}_ONCE() for sample window

'calc_load_update' is accessed without any kind of locking and there's
a clear assumption in the code that only a single value is read or
written.

Make this explicit by using READ_ONCE() and WRITE_ONCE(), and avoid
unintentionally seeing multiple values, or having the load/stores
split.

Technically the loads in calc_global_*() don't require this since
those are the only functions that update 'calc_load_update', but I've
added the READ_ONCE() for consistency.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Morten Rasmussen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agosched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
Matt Fleming [Fri, 17 Feb 2017 12:07:30 +0000 (12:07 +0000)]
sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting

If we crossed a sample window while in NO_HZ we will add LOAD_FREQ to
the pending sample window time on exit, setting the next update not
one window into the future, but two.

This situation on exiting NO_HZ is described by:

  this_rq->calc_load_update < jiffies < calc_load_update

In this scenario, what we should be doing is:

  this_rq->calc_load_update = calc_load_update      [ next window ]

But what we actually do is:

  this_rq->calc_load_update = calc_load_update + LOAD_FREQ   [ next+1 window ]

This has the effect of delaying load average updates for potentially
up to ~9seconds.

This can result in huge spikes in the load average values due to
per-cpu uninterruptible task counts being out of sync when accumulated
across all CPUs.

It's safe to update the per-cpu active count if we wake between sample
windows because any load that we left in 'calc_load_idle' will have
been zero'd when the idle load was folded in calc_global_load().

This issue is easy to reproduce before,

  commit 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")

just by forking short-lived process pipelines built from ps(1) and
grep(1) in a loop. I'm unable to reproduce the spikes after that
commit, but the bug still seems to be present from code review.

Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Morten Rasmussen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vincent Guittot <[email protected]>
Fixes: commit 5167e8d ("sched/nohz: Rewrite and fix load-avg computation -- again")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agosched/deadline: Add missing update_rq_clock() in dl_task_timer()
Wanpeng Li [Tue, 7 Mar 2017 05:51:28 +0000 (21:51 -0800)]
sched/deadline: Add missing update_rq_clock() in dl_task_timer()

The following warning can be triggered by hot-unplugging the CPU
on which an active SCHED_DEADLINE task is running on:

 ------------[ cut here ]------------
 WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
 rq->clock_update_flags < RQCF_ACT_SKIP
 CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ #24
 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
 Call Trace:
  <IRQ>
  dump_stack+0x85/0xc4
  __warn+0x172/0x1b0
  warn_slowpath_fmt+0xb4/0xf0
  ? __warn+0x1b0/0x1b0
  ? debug_check_no_locks_freed+0x2c0/0x2c0
  ? cpudl_set+0x3d/0x2b0
  replenish_dl_entity+0x71e/0xc40
  enqueue_task_dl+0x2ea/0x12e0
  ? dl_task_timer+0x777/0x990
  ? __hrtimer_run_queues+0x270/0xa50
  dl_task_timer+0x316/0x990
  ? enqueue_task_dl+0x12e0/0x12e0
  ? enqueue_task_dl+0x12e0/0x12e0
  __hrtimer_run_queues+0x270/0xa50
  ? hrtimer_cancel+0x20/0x20
  ? hrtimer_interrupt+0x119/0x600
  hrtimer_interrupt+0x19c/0x600
  ? trace_hardirqs_off+0xd/0x10
  local_apic_timer_interrupt+0x74/0xe0
  smp_apic_timer_interrupt+0x76/0xa0
  apic_timer_interrupt+0x93/0xa0

The DL task will be migrated to a suitable later deadline rq once the DL
timer fires and currnet rq is offline. The rq clock of the new rq should
be updated. This patch fixes it by updating the rq clock after holding
the new rq's rq lock.

Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agox86/mpx: Make unnecessarily global function static
Tobias Klauser [Wed, 8 Mar 2017 13:30:34 +0000 (14:30 +0100)]
x86/mpx: Make unnecessarily global function static

Make the function get_user_bd_entry() static as it is not used outside of
arch/x86/mm/mpx.c

This fixes a sparse warning.

Signed-off-by: Tobias Klauser <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
8 years agoMerge branch 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Thu, 16 Mar 2017 01:28:44 +0000 (11:28 +1000)]
Merge branch 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

A few amd fixes.

* 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/amdgpu:  Fix debugfs reg read/write address width
  drm/amdgpu/si: add dpm quirk for Oland
  drm/radeon/si: add dpm quirk for Oland
  drm: amd: remove broken include path
  drm/amd/powerplay: fix copy error in smu7_clockpoweragting.c
  drm/amdgpu: fix parser init error path to avoid crash in parser fini
  drm/amd/amdgpu: Disable GFX_PG on Carrizo until compute issues solved

8 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Wed, 15 Mar 2017 23:54:58 +0000 (16:54 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Four small fixes for this cycle:

   - followup fix from Neil for a fix that went in before -rc2, ensuring
     that we always see the full per-task bio_list.

   - fix for blk-mq-sched from me that ensures that we retain similar
     direct-to-issue behavior on running the queue.

   - fix from Sagi fixing a potential NULL pointer dereference in blk-mq
     on spurious CPU unplug.

   - a memory leak fix in writeback from Tahsin, fixing a case where
     device removal of a mounted device can leak a struct
     wb_writeback_work"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq-sched: don't run the queue async from blk_mq_try_issue_directly()
  writeback: fix memory leak in wb_queue_work()
  blk-mq: Fix tagset reinit in the presence of cpu hot-unplug
  blk: Ensure users for current->bio_list can see the full list.

8 years agocpufreq: Fix and clean up show_cpuinfo_cur_freq()
Rafael J. Wysocki [Tue, 14 Mar 2017 23:12:16 +0000 (00:12 +0100)]
cpufreq: Fix and clean up show_cpuinfo_cur_freq()

There is a missing newline in show_cpuinfo_cur_freq(), so add it,
but while at it clean that function up somewhat too.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Cc: All applicable <[email protected]>
8 years agoparisc: Avoid compiler warnings with access_ok()
Helge Deller [Wed, 15 Mar 2017 20:48:42 +0000 (21:48 +0100)]
parisc: Avoid compiler warnings with access_ok()

Commit 09b871ffd4d8 (parisc: Define access_ok() as macro) missed to mark uaddr
as used, which then gives compiler warnings about unused variables.

Fix it by comparing uaddr to uaddr which then gets optimized away by the
compiler.

Signed-off-by: Helge Deller <[email protected]>
Fixes: 09b871ffd4d8 ("parisc: Define access_ok() as macro")
8 years agodrm/amd/amdgpu: Fix debugfs reg read/write address width
Tom St Denis [Wed, 15 Mar 2017 09:34:25 +0000 (05:34 -0400)]
drm/amd/amdgpu:  Fix debugfs reg read/write address width

The MMIO space is wider now so we mask the lower 22 bits
instead of 18.

Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
8 years agodrm/amdgpu/si: add dpm quirk for Oland
Alex Deucher [Tue, 14 Mar 2017 23:24:19 +0000 (19:24 -0400)]
drm/amdgpu/si: add dpm quirk for Oland

OLAND 0x1002:0x6604 0x1028:0x066F 0x00 seems to have problems
with higher sclks.

Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
8 years agodrm/radeon/si: add dpm quirk for Oland
Alex Deucher [Tue, 14 Mar 2017 18:42:03 +0000 (14:42 -0400)]
drm/radeon/si: add dpm quirk for Oland

OLAND 0x1002:0x6604 0x1028:0x066F 0x00 seems to have problems
with higher sclks.

Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
8 years agoparisc: Wire up statx system call
Helge Deller [Wed, 15 Mar 2017 20:10:17 +0000 (21:10 +0100)]
parisc: Wire up statx system call

Signed-off-by: Helge Deller <[email protected]>
8 years agoparisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
John David Anglin [Sat, 11 Mar 2017 23:03:34 +0000 (18:03 -0500)]
parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range

The previously submitted patch did not resolve the random segmentation
faults observed on the phantom buildd system.  There are still
unresolved problems with the Debian 4.8 and 4.9 kernels on C8000.

The attached patch removes the flush of the offset map pages and does a
whole data cache flush for large ranges.  No other arch flushes the
offset map in these routines as far as I can tell.

I have not observed any random segmentation faults on rp3440 in two
weeks of testing with 4.10.0 and 4.10.1.

Signed-off-by: John David Anglin <[email protected]>
Cc: [email protected] # v4.8+
Signed-off-by: Helge Deller <[email protected]>
8 years agoparisc: support R_PARISC_SECREL32 relocation in modules
Mikulas Patocka [Tue, 14 Mar 2017 15:47:29 +0000 (11:47 -0400)]
parisc: support R_PARISC_SECREL32 relocation in modules

The parisc kernel doesn't work with CONFIG_MODVERSIONS since the commit
71810db27c1c853b335675bee335d893bc3d324b. It can't load modules with the
error: "module unix: Unknown relocation: 41".

The commit changes __kcrctab from 64-bit valus to 32-bit values. The
assembler generates R_PARISC_SECREL32 secrel relocation for them and the
module loader doesn't support this relocation.

This patch adds the R_PARISC_SECREL32 relocation to the module loader.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected] # v4.10+
Signed-off-by: Helge Deller <[email protected]>
8 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Wed, 15 Mar 2017 17:44:19 +0000 (10:44 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a rather large set of fixes. The bulk are for lpfc correcting
  a lot of issues in the new NVME driver code which just went in in the
  merge window.

  The others are:

   - fix a hang in the vmware paravirt driver caused by incorrect
     handling of the new MSI vector allocation

   - long standing bug in storvsc, which recent block changes turned
     from being a harmless annoyance into a hang

   - yet more fallout (in mpt3sas) from the changes to device blocking

  The remainder are small fixes and updates"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (34 commits)
  scsi: lpfc: Add shutdown method for kexec
  scsi: storvsc: Workaround for virtual DVD SCSI version
  scsi: lpfc: revise version number to 11.2.0.10
  scsi: lpfc: code cleanups in NVME initiator discovery
  scsi: lpfc: code cleanups in NVME initiator base
  scsi: lpfc: correct rdp diag portnames
  scsi: lpfc: remove dead sli3 nvme code
  scsi: lpfc: correct double print
  scsi: lpfc: Rename LPFC_MAX_EQ_DELAY to LPFC_MAX_EQ_DELAY_EQID_CNT
  scsi: lpfc: Rework lpfc Kconfig for NVME options
  scsi: lpfc: add transport eh_timed_out reference
  scsi: lpfc: Fix eh_deadline setting for sli3 adapters.
  scsi: lpfc: add NVME exchange aborts
  scsi: lpfc: Fix nvme allocation bug on failed nvme_fc_register_localport
  scsi: lpfc: Fix IO submission if WQ is full
  scsi: lpfc: Fix NVME CMD IU byte swapped word 1 problem
  scsi: lpfc: Fix RCTL value on NVME LS request and response
  scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters
  scsi: lpfc: fix missing spin_unlock on sql_list_lock
  scsi: lpfc: don't dereference dma_buf->iocbq before null check
  ...

8 years agoMerge tag 'gfs2-4.11-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 15 Mar 2017 16:33:15 +0000 (09:33 -0700)]
Merge tag 'gfs2-4.11-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Bob Peterson:
 "This is an emergency patch for 4.11-rc3

  The GFS2 developers uncovered a really nasty problem that can lead to
  random corruption and kernel panic, much like the last one. Andreas
  Gruenbacher wrote a simple one-line patch to fix the problem."

* tag 'gfs2-4.11-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Avoid alignment hole in struct lm_lockname

8 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Wed, 15 Mar 2017 16:26:04 +0000 (09:26 -0700)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:

 - self-test failure of crc32c on powerpc

 - regressions of ecb(aes) when used with xts/lrw in s5p-sss

 - a number of bugs in the omap RNG driver

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: s5p-sss - Fix spinlock recursion on LRW(AES)
  hwrng: omap - Do not access INTMASK_REG on EIP76
  hwrng: omap - use devm_clk_get() instead of of_clk_get()
  hwrng: omap - write registers after enabling the clock
  crypto: s5p-sss - Fix completing crypto request in IRQ handler
  crypto: powerpc - Fix initialisation of crc32c context

8 years agocpufreq: intel_pstate: Avoid percentages in limits-related computations
Rafael J. Wysocki [Tue, 14 Mar 2017 15:18:34 +0000 (16:18 +0100)]
cpufreq: intel_pstate: Avoid percentages in limits-related computations

Currently, intel_pstate_update_perf_limits() first converts the
policy minimum and maximum limits into percentages of the maximum
turbo frequency (rounding up to an integer) and then converts these
percentages to fractions (by using fixed-point arithmetic to divide
them by 100).

That introduces a rounding error unnecessarily, because the fractions
can be obtained by carrying out fixed-point divisions directly on the
input numbers.

Rework the computations in intel_pstate_hwp_set() to use fractions
instead of percentages (and drop redundant local variables from
there) and modify intel_pstate_update_perf_limits() to compute the
fractions directly and percentages out of them.

While at it, introduce percent_ext_fp() for converting percentages
to fractions (with extended number of fraction bits) and use it in
the computations.

Signed-off-by: Rafael J. Wysocki <[email protected]>
8 years agoopenrisc: Export symbols needed by modules
Stafford Horne [Tue, 14 Mar 2017 13:52:49 +0000 (22:52 +0900)]
openrisc: Export symbols needed by modules

This was detected by allmodconfig, errors reported:

 ERROR: "empty_zero_page" [net/ceph/libceph.ko] undefined!
 ERROR: "__ucmpdi2" [lib/842/842_decompress.ko] undefined!
 ERROR: "empty_zero_page" [fs/nfs/objlayout/objlayoutdriver.ko] undefined!
 ERROR: "empty_zero_page" [fs/exofs/exofs.ko] undefined!
 ERROR: "empty_zero_page" [fs/crypto/fscrypto.ko] undefined!
 ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined!
 ERROR: "pm_power_off" [drivers/regulator/act8865-regulator.ko] undefined!
 ERROR: "__ucmpdi2" [drivers/media/i2c/adv7842.ko] undefined!
 ERROR: "__clear_user" [drivers/md/dm-mod.ko] undefined!
 ERROR: "__clear_user" [net/netfilter/x_tables.ko] undefined!

Signed-off-by: Stafford Horne <[email protected]>
8 years agoopenrisc: fix issue handling 8 byte get_user calls
Stafford Horne [Sun, 12 Mar 2017 22:44:45 +0000 (07:44 +0900)]
openrisc: fix issue handling 8 byte get_user calls

Was getting the following error with allmodconfig:

  ERROR: "__get_user_bad" [lib/test_user_copy.ko] undefined!

This was simply a missing break statement, causing an unwanted fall
through.

Signed-off-by: Stafford Horne <[email protected]>
8 years agoopenrisc: xchg: fix `computed is not used` warning
Stafford Horne [Sun, 12 Mar 2017 22:41:55 +0000 (07:41 +0900)]
openrisc: xchg: fix `computed is not used` warning

When building allmodconfig this warning shows.

  fs/ocfs2/file.c: In function 'ocfs2_file_write_iter':
  ./arch/openrisc/include/asm/cmpxchg.h:81:3: warning: value computed is
  not used [-Wunused-value]
    ((typeof(*(ptr)))__xchg((unsigned long)(with), (ptr), sizeof(*(ptr))))
     ^

Applying the same patch logic that was done to the cmpxchg macro.

Signed-off-by: Stafford Horne <[email protected]>
8 years agogfs2: Avoid alignment hole in struct lm_lockname
Andreas Gruenbacher [Mon, 6 Mar 2017 17:58:42 +0000 (12:58 -0500)]
gfs2: Avoid alignment hole in struct lm_lockname

Commit 88ffbf3e03 switches to using rhashtables for glocks, hashing over
the entire struct lm_lockname instead of its individual fields.  On some
architectures, struct lm_lockname contains a hole of uninitialized
memory due to alignment rules, which now leads to incorrect hash values.
Get rid of that hole.

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Bob Peterson <[email protected]>
CC: <[email protected]> #v4.3+
8 years agoxfs: verify inline directory data forks
Darrick J. Wong [Wed, 15 Mar 2017 07:24:25 +0000 (00:24 -0700)]
xfs: verify inline directory data forks

When we're reading or writing the data fork of an inline directory,
check the contents to make sure we're not overflowing buffers or eating
garbage data.  xfs/348 corrupts an inline symlink into an inline
directory, triggering a buffer overflow bug.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
---
v2: add more checks consistent with _dir2_sf_check and make the verifier
usable from anywhere.

This page took 0.132053 seconds and 4 git commands to generate.