thp: change CoW semantics for anon-THP
Currently we have different copy-on-write semantics for anon- and
file-THP. For anon-THP we try to allocate a huge page on the write fault,
but for file-THP we split the PMD and allocate a 4k page.
Arguably, the file-THP semantics are more desirable: we don't necessarily
want to unshare the full PMD range from the parent on the first access.
This is the primary reason THP is unusable for some workloads, like Redis.
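To make the cost difference concrete, here is a minimal userspace model,
not kernel code. It assumes x86-64 sizes (4 KiB base pages, 512 PTEs per
PMD, so a 2 MiB huge page) and a huge page that is still shared with the
parent. The names cow_policy and cow_bytes_copied are hypothetical and
exist only for this illustration.

```c
/* Illustrative x86-64 sizes: 4 KiB base pages, 512 of them per PMD. */
#define BASE_PAGE_SIZE 4096UL
#define PTES_PER_PMD   512UL
#define HUGE_PAGE_SIZE (BASE_PAGE_SIZE * PTES_PER_PMD)

/* Hypothetical policy names for this sketch; not kernel identifiers. */
enum cow_policy {
	COW_COPY_HUGE_PAGE, /* old anon-THP behaviour: copy the whole huge page */
	COW_SPLIT_PMD,      /* file-THP behaviour: split the PMD, copy 4k pages */
};

/*
 * Bytes privately copied after a child writes to `touched` distinct 4k
 * pages inside one huge page that it shares with its parent.
 */
unsigned long cow_bytes_copied(enum cow_policy policy, unsigned long touched)
{
	if (touched == 0)
		return 0;                   /* no write fault, nothing copied */
	if (touched > PTES_PER_PMD)
		touched = PTES_PER_PMD;     /* cannot touch more than the range */
	if (policy == COW_COPY_HUGE_PAGE)
		return HUGE_PAGE_SIZE;      /* first write unshares the full range */
	return touched * BASE_PAGE_SIZE;    /* one 4k copy per faulting page */
}
```

Under these assumptions a single write costs 2 MiB with the old policy and
4 KiB with the split policy; the two only converge once all 512 pages have
been written.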
The original THP refcounting didn't allow PTE-mapped compound pages, so
we had no option but to allocate a huge page on CoW (with a fallback to
512 4k pages).
The current refcounting doesn't have such limitations and we can cut a lot
of complex code out of the fault path.
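The simplified decision on a write-protect fault can be sketched as a
small userspace model. This is an illustration under stated assumptions,
not the kernel implementation: the struct, enum and function names are
hypothetical, and the "sole owner reuses the page in place" branch is an
assumption about the surrounding fault handler that this message does not
spell out.

```c
#include <stdbool.h>

/* Hypothetical outcome names for this sketch; not kernel return codes. */
enum wp_fault_result {
	WP_REUSED_IN_PLACE,    /* mapping made writable, nothing copied */
	WP_SPLIT_AND_FALLBACK, /* PMD split; the PTE-level fault copies one 4k page */
};

/* Hypothetical, heavily reduced view of a PMD-mapped huge page. */
struct huge_mapping {
	bool is_zero_page; /* mapping points at the shared huge zero page */
	bool sole_owner;   /* assumed: no other process maps this page */
	bool pmd_mapped;   /* still mapped by a single PMD entry */
};

/* Model of the write-protect fault decision on a huge mapping. */
enum wp_fault_result huge_wp_fault(struct huge_mapping *m)
{
	/* A private, non-zero page can simply be written to in place. */
	if (!m->is_zero_page && m->sole_owner)
		return WP_REUSED_IN_PLACE;

	/*
	 * Otherwise no huge page is allocated: the PMD is split into PTEs
	 * and the ordinary 4k CoW path handles the faulting address.
	 */
	m->pmd_mapped = false;
	return WP_SPLIT_AND_FALLBACK;
}
```

The point of the model is what is absent: there is no branch that
allocates a new huge page, and no 512-page fallback loop.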
khugepaged is now able to recover THP from such ranges if the
configuration allows.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Zi Yan <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Acked-by: Yang Shi <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Ralph Campbell <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>