]> Git Repo - linux.git/commit
mm: kill vma flag VM_EXECUTABLE and mm->num_exe_file_vmas
authorKonstantin Khlebnikov <[email protected]>
Mon, 8 Oct 2012 23:28:54 +0000 (16:28 -0700)
committerLinus Torvalds <[email protected]>
Tue, 9 Oct 2012 07:22:18 +0000 (16:22 +0900)
commite9714acf8c439688884234dcac2bfc38bb607d38
tree2e21c88f855a9f5168a143fa9948141140ff02a2
parent2dd8ad81e31d0d36a5d448329c646ab43eb17788
mm: kill vma flag VM_EXECUTABLE and mm->num_exe_file_vmas

Currently the kernel sets mm->exe_file during sys_execve() and then tracks
number of vmas with VM_EXECUTABLE flag in mm->num_exe_file_vmas, as soon
as this counter drops to zero kernel resets mm->exe_file to NULL.  Plus it
resets mm->exe_file at last mmput() when mm->mm_users drops to zero.

VMA with VM_EXECUTABLE flag appears after mapping file with flag
MAP_EXECUTABLE, such vmas can appears only at sys_execve() or after vma
splitting, because sys_mmap ignores this flag.  Usually binfmt module sets
mm->exe_file and mmaps executable vmas with this file, they hold
mm->exe_file while task is running.

comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
where all this stuff was introduced:

> The kernel implements readlink of /proc/pid/exe by getting the file from
> the first executable VMA.  Then the path to the file is reconstructed and
> reported as the result.
>
> Because of the VMA walk the code is slightly different on nommu systems.
> This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
> walking the VMAs to find the first executable file-backed VMA we store a
> reference to the exec'd file in the mm_struct.
>
> That reference would prevent the filesystem holding the executable file
> from being unmounted even after unmapping the VMAs.  So we track the number
> of VM_EXECUTABLE VMAs and drop the new reference when the last one is
> unmapped.  This avoids pinning the mounted filesystem.

exe_file's vma accounting is hooked into every file mmap/unmmap and vma
split/merge just to fix some hypothetical pinning fs from umounting by mm,
which already unmapped all its executable files, but still alive.

Seems like currently nobody depends on this behaviour.  We can try to
remove this logic and keep mm->exe_file until final mmput().

mm->exe_file is still protected with mm->mmap_sem, because we want to
change it via new sys_prctl(PR_SET_MM_EXE_FILE).  Also via this syscall
task can change its mm->exe_file and unpin mountpoint explicitly.

Signed-off-by: Konstantin Khlebnikov <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Carsten Otte <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Eric Paris <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Kentaro Takeda <[email protected]>
Cc: Matt Helsley <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Suresh Siddha <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Venkatesh Pallipadi <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
include/linux/mm.h
include/linux/mm_types.h
include/linux/mman.h
kernel/fork.c
mm/mmap.c
mm/nommu.c
This page took 0.058895 seconds and 4 git commands to generate.