x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds
In the split page table lock case, we embed spinlock_t into struct page.
For obvious reasons, we don't want to increase the size of struct page if
spinlock_t is too big, as with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or
on an -rt kernel. So we disable the split page table lock if spinlock_t is
too big.
This patchset allows the lock to be allocated dynamically if spinlock_t is
big. In this case page->ptl stores a pointer to the spinlock instead of
the spinlock itself. It costs an additional cache line for the indirect
access, but fixes page fault scalability for multi-threaded applications.
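
As a rough user-space illustration of the indirection described above (not
the kernel implementation), the sketch below keeps only a pointer in the page
structure and allocates the lock separately. All toy_* names are hypothetical
stand-ins invented for this example, and malloc() stands in for whatever
allocator the kernel would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for spinlock_t; debug options can make the real one large. */
typedef struct {
	int locked;
} toy_spinlock_t;

/* Hypothetical stand-in for struct page: ptl is a pointer, not an embedded lock. */
struct toy_page {
	toy_spinlock_t *ptl;
};

/* Allocate and initialise the lock. Returns false if the allocation fails. */
static bool toy_ptlock_alloc(struct toy_page *page)
{
	toy_spinlock_t *lock = malloc(sizeof(*lock));

	if (!lock)
		return false;
	lock->locked = 0;
	page->ptl = lock;
	return true;
}

/* Every lock access pays one extra dereference (the additional cache line). */
static toy_spinlock_t *toy_ptlock_ptr(struct toy_page *page)
{
	return page->ptl;
}

/* Release the lock when the page table page is torn down. */
static void toy_ptlock_free(struct toy_page *page)
{
	assert(page->ptl != NULL);
	free(page->ptl);
	page->ptl = NULL;
}
```

The page structure stays pointer-sized no matter how large the lock type
grows. The price is that allocation can now fail, so callers have to check
the result.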
LOCK_STAT depends on DEBUG_SPINLOCK, so on the current kernel enabling
LOCK_STAT to analyse scalability issues breaks scalability. ;)
The patchset mostly fixes this. Results for ./thp_memscale -c 80 -b 512M
on a 4-socket machine:
baseline, no CONFIG_LOCK_STAT:   9.115460703 seconds time elapsed
baseline, CONFIG_LOCK_STAT=y:   53.890567123 seconds time elapsed
patched, no CONFIG_LOCK_STAT:    8.852250368 seconds time elapsed
patched, CONFIG_LOCK_STAT=y:    11.069770759 seconds time elapsed
The patch count is scary, but most of them are trivial. Overview:
Patches 1-4    A few bug fixes. No dependencies on other patches.
               Probably should be applied as soon as possible.
Patch 5        Changes the signature of pgtable_page_ctor(). We will use
               it for dynamic lock allocation, so it can fail.
Patches 6-8    Add missing constructor/destructor calls on a few archs.
               This fixes NR_PAGETABLE accounting and prepares for using
               split ptl.
Patches 9-33   Add pgtable_page_ctor() failure handling to all archs.
Patch 34       Finally adds support for dynamically-allocated page->ptl.
               Also contains documentation for split page table lock.
This patch (of 34):
I missed that we preallocate a few pmds in pgd_alloc() if X86_PAE is
enabled. Let's add the missing constructor/destructor calls.
I didn't notice it during testing since prep_new_page() clears
page->mapping and therefore page->ptl, which is effectively equivalent to
spin_lock_init(&page->ptl).
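
A minimal user-space sketch of the pattern this patch is about, assuming a
constructor that can fail: every preallocated pmd page gets a constructor
call, a failure unwinds whatever was already set up, and the free path runs
the destructor. The toy_* names, the count of 4 and the toy_fail_at test hook
are hypothetical and exist only for illustration. This is not the code in
arch/x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_PREALLOCATED_PMDS 4	/* hypothetical count, for illustration only */

/* Hypothetical stand-in for the page backing a pmd table. */
struct toy_pmd_page {
	bool lock_ready;
};

static int toy_ctor_calls;
static int toy_dtor_calls;
static int toy_fail_at = -1;	/* ctor call index to fail, -1 means never */

/* Constructor: sets up the lock state; may fail. */
static bool toy_pmd_page_ctor(struct toy_pmd_page *page)
{
	if (toy_ctor_calls++ == toy_fail_at)
		return false;
	page->lock_ready = true;
	return true;
}

/* Destructor: only valid on a page the constructor succeeded on. */
static void toy_pmd_page_dtor(struct toy_pmd_page *page)
{
	assert(page->lock_ready);
	page->lock_ready = false;
	toy_dtor_calls++;
}

/* Free path: destructor first, then release the page itself. */
static void toy_free_pmds(struct toy_pmd_page *pmds[])
{
	for (int i = 0; i < TOY_PREALLOCATED_PMDS; i++) {
		if (pmds[i]) {
			toy_pmd_page_dtor(pmds[i]);
			free(pmds[i]);
			pmds[i] = NULL;
		}
	}
}

/* Allocation path: returns 0 on success, -1 if any page could not be set up. */
static int toy_preallocate_pmds(struct toy_pmd_page *pmds[])
{
	bool failed = false;

	for (int i = 0; i < TOY_PREALLOCATED_PMDS; i++) {
		struct toy_pmd_page *pmd = calloc(1, sizeof(*pmd));

		if (!pmd)
			failed = true;
		if (pmd && !toy_pmd_page_ctor(pmd)) {
			/* Never constructed, so free without the destructor. */
			free(pmd);
			pmd = NULL;
			failed = true;
		}
		pmds[i] = pmd;
	}

	if (failed) {
		toy_free_pmds(pmds);
		return -1;
	}
	return 0;
}
```

A page whose constructor failed is freed directly, while pages that were
constructed successfully go through the destructor on the unwind path, so
constructor and destructor calls stay balanced.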
Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chen Liqin <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Howells <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Grant Likely <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Haavard Skinnemoen <[email protected]>
Cc: Hans-Christian Egtvedt <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Jeff Dike <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Koichi Yasutake <[email protected]>
Cc: Lennox Wu <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Richard Kuo <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>