]> Git Repo - linux.git/log
linux.git
4 years agomm: remove superfluous __ClearPageActive()
Yu Zhao [Tue, 13 Oct 2020 23:52:11 +0000 (16:52 -0700)]
mm: remove superfluous __ClearPageActive()

To activate a page, mark_page_accessed() always holds a reference on it.
It either gets a new reference when adding a page to
lru_pvecs.activate_page or reuses an existing one it previously got when
it added a page to lru_pvecs.lru_add.  So it doesn't call SetPageActive()
on a page that doesn't have any reference left.  Therefore, the race is
impossible these days (I didn't brother to dig into its history).

For other paths, namely reclaim and migration, a reference count is always
held while calling SetPageActive() on a page.

SetPageSlabPfmemalloc() also uses SetPageActive(), but it's irrelevant to
LRU pages.

Signed-off-by: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Qian Cai <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: remove activate_page() from unuse_pte()
Yu Zhao [Tue, 13 Oct 2020 23:52:08 +0000 (16:52 -0700)]
mm: remove activate_page() from unuse_pte()

We don't initially add anon pages to active lruvec after commit
b518154e59aa ("mm/vmscan: protect the workingset on anonymous LRU").
Remove activate_page() from unuse_pte(), which seems to be missed by the
commit.  And make the function static while we are at it.

Before the commit, we called lru_cache_add_active_or_unevictable() to add
new ksm pages to active lruvec.  Therefore, activate_page() wasn't
necessary for them in the first place.

Signed-off-by: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoswap: rename SWP_FS to SWAP_FS_OPS to avoid ambiguity
Gao Xiang [Tue, 13 Oct 2020 23:52:04 +0000 (16:52 -0700)]
swap: rename SWP_FS to SWAP_FS_OPS to avoid ambiguity

SWP_FS is used to make swap_{read,write}page() go through the filesystem,
and it's only used for swap files over NFS for now.  Otherwise it will
directly submit IO to blockdev according to swapfile extents reported by
filesystems in advance.

As Matthew pointed out [1], SWP_FS naming is somewhat confusing, so let's
rename to SWP_FS_OPS.

[1] https://lore.kernel.org/r/20200820113448[email protected]

Suggested-by: Matthew Wilcox <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/gup: protect unpin_user_pages() against npages==-ERRNO
John Hubbard [Tue, 13 Oct 2020 23:52:01 +0000 (16:52 -0700)]
mm/gup: protect unpin_user_pages() against npages==-ERRNO

As suggested by Dan Carpenter, fortify unpin_user_pages() just a bit,
against a typical caller mistake: check if the npages arg is really a
-ERRNO value, which would blow up the unpinning loop: WARN and return.

If this new WARN_ON() fires, then the system *might* be leaking pages (by
leaving them pinned), but probably not.  More likely, gup/pup returned a
hard -ERRNO error to the caller, who erroneously passed it here.

Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Souptick Joarder <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/gup: don't permit users to call get_user_pages with FOLL_LONGTERM
Barry Song [Tue, 13 Oct 2020 23:51:58 +0000 (16:51 -0700)]
mm/gup: don't permit users to call get_user_pages with FOLL_LONGTERM

gup prohibits users from calling get_user_pages() with FOLL_PIN.  But it
allows users to call get_user_pages() with FOLL_LONGTERM only.  It seems
insensible.

Since FOLL_LONGTERM is a stricter case of FOLL_PIN, we should prohibit
users from calling get_user_pages() with FOLL_LONGTERM while not with
FOLL_PIN.

mm/gup_benchmark.c used to be the only user who did this improperly.
But it has been fixed by moving to use pin_user_pages().

[[email protected]: fix CONFIG_MMU=n build]
Link: https://lkml.kernel.org/r/CA+G9fYuNS3k0DVT62twfV746pfNhCSrk5sVMcOcQ1PGGnEseyw@mail.gmail.com
Signed-off-by: Barry Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Naresh Kamboju <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/gup_benchmark: use pin_user_pages for FOLL_LONGTERM flag
Barry Song [Tue, 13 Oct 2020 23:51:54 +0000 (16:51 -0700)]
mm/gup_benchmark: use pin_user_pages for FOLL_LONGTERM flag

According to Documentation/core-api/pin_user_pages.rst, FOLL_PIN is a
prerequisite to FOLL_LONGTERM.  Another way of saying that is,
FOLL_LONGTERM is a specific case, more restrictive case of FOLL_PIN.

Almost all kernel modules are using pin_user_pages() with FOLL_LONGTERM,
mm/gup_benchmark.c seems to the only exception in which FOLL_PIN is not a
prerequisite to FOLL_LONGTERM.

Signed-off-by: Barry Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/gup_benchmark: update the documentation in Kconfig
Barry Song [Tue, 13 Oct 2020 23:51:51 +0000 (16:51 -0700)]
mm/gup_benchmark: update the documentation in Kconfig

In the beginning, mm/gup_benchmark.c supported get_user_pages_fast() only,
but right now, it supports the benchmarking of a couple of
get_user_pages() related calls like:

* get_user_pages_fast()
* get_user_pages()
* pin_user_pages_fast()
* pin_user_pages()

The documentation is confusing and needs update.

Signed-off-by: Barry Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED
Yafang Shao [Tue, 13 Oct 2020 23:51:47 +0000 (16:51 -0700)]
mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED

Our users reported that there're some random latency spikes when their RT
process is running.  Finally we found that latency spike is caused by
FADV_DONTNEED.  Which may call lru_add_drain_all() to drain LRU cache on
remote CPUs, and then waits the per-cpu work to complete.  The wait time
is uncertain, which may be tens millisecond.

That behavior is unreasonable, because this process is bound to a specific
CPU and the file is only accessed by itself, IOW, there should be no
pagecache pages on a per-cpu pagevec of a remote CPU.  That unreasonable
behavior is partially caused by the wrong comparation of the number of
invalidated pages and the number of the target.  For example,

        if (count < (end_index - start_index + 1))

The count above is how many pages were invalidated in the local CPU, and
(end_index - start_index + 1) is how many pages should be invalidated.
The usage of (end_index - start_index + 1) is incorrect, because they are
virtual addresses, which may not mapped to pages.  Besides that, there may
be holes between start and end.  So we'd better check whether there are
still pages on per-cpu pagevec after drain the local cpu, and then decide
whether or not to call lru_add_drain_all().

After I applied it with a hotfix to our production environment, most of
the lru_add_drain_all() can be avoided.

Suggested-by: Mel Gorman <[email protected]>
Signed-off-by: Yafang Shao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/filemap: fix filemap_map_pages for THP
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:44 +0000 (16:51 -0700)]
mm/filemap: fix filemap_map_pages for THP

We dereference page->mapping and page->index directly after calling
find_subpage() and these fields are not valid for tail pages.  While
commit 4101196b19d7 ("mm: page cache: store only head pages in i_pages")
introduced the call to find_subpage(), the problem existed prior to this;
I'm going to suggest all the way back to when THPs first existed.

The user-visible effects of this are almost negligible.  To hit it, you
have to mmap a tmpfs file at an unaligned address and then it's only a
disabled optimisation causing page faults to happen more frequently than
they otherwise would.

Fix this by keeping both head and page pointers and checking the
appropriate one.  We could use page_mapping() and page_to_index(), but
that's higher overhead.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: add find_lock_head
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:41 +0000 (16:51 -0700)]
mm: add find_lock_head

Add a new FGP_HEAD flag which avoids calling find_subpage() and add a
convenience wrapper for it.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/shmem: return head page from find_lock_entry
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:38 +0000 (16:51 -0700)]
mm/shmem: return head page from find_lock_entry

Convert shmem_getpage_gfp() (the only remaining caller of
find_lock_entry()) to cope with a head page being returned instead of
the subpage for the index.

[[email protected]: fix BUG()s]
  Link https://lore.kernel.org/linux-mm/20200912032042[email protected]/

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: convert find_get_entry to return the head page
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:34 +0000 (16:51 -0700)]
mm: convert find_get_entry to return the head page

There are only four callers remaining of find_get_entry().
get_shadow_from_swap_cache() only wants to see shadow entries and doesn't
care about which page is returned.  Push the find_subpage() call into
find_lock_entry(), find_get_incore_page() and pagecache_get_page().

[[email protected]: fix oops]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoi915: use find_lock_page instead of find_lock_entry
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:31 +0000 (16:51 -0700)]
i915: use find_lock_page instead of find_lock_entry

i915 does not want to see value entries.  Switch it to use
find_lock_page() instead, and remove the export of find_lock_entry().
Move find_lock_entry() and find_get_entry() to mm/internal.h to discourage
any future use.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoproc: optimise smaps for shmem entries
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:28 +0000 (16:51 -0700)]
proc: optimise smaps for shmem entries

Avoid bumping the refcount on pages when we're only interested in the
swap entries.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: optimise madvise WILLNEED
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:24 +0000 (16:51 -0700)]
mm: optimise madvise WILLNEED

Instead of calling find_get_entry() for every page index, use an XArray
iterator to skip over NULL entries, and avoid calling get_page(),
because we only want the swap entries.

[[email protected]: fix LTP soft lockups]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: Qian Cai <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: use find_get_incore_page in memcontrol
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:21 +0000 (16:51 -0700)]
mm: use find_get_incore_page in memcontrol

The current code does not protect against swapoff of the underlying
swap device, so this is a bug fix as well as a worthwhile reduction in
code complexity.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm: factor find_get_incore_page out of mincore_page
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:17 +0000 (16:51 -0700)]
mm: factor find_get_incore_page out of mincore_page

Patch series "Return head pages from find_*_entry", v2.

This patch series started out as part of the THP patch set, but it has
some nice effects along the way and it seems worth splitting it out and
submitting separately.

Currently find_get_entry() and find_lock_entry() return the page
corresponding to the requested index, but the first thing most callers do
is find the head page, which we just threw away.  As part of auditing all
the callers, I found some misuses of the APIs and some plain
inefficiencies that I've fixed.

The diffstat is unflattering, but I added more kernel-doc and a new wrapper.

This patch (of 8);

Provide this functionality from the swap cache.  It's useful for
more than just mincore().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm, dump_page: rename head_mapcount() --> head_compound_mapcount()
John Hubbard [Tue, 13 Oct 2020 23:51:14 +0000 (16:51 -0700)]
mm, dump_page: rename head_mapcount() --> head_compound_mapcount()

Rename head_pincount() --> head_compound_pincount().  These names are more
accurate (or less misleading) than the original ones.

Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: William Kucharski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/debug.c: do not dereference i_ino blindly
Matthew Wilcox (Oracle) [Tue, 13 Oct 2020 23:51:10 +0000 (16:51 -0700)]
mm/debug.c: do not dereference i_ino blindly

__dump_page() checks i_dentry is fetchable and i_ino is earlier in the
struct than i_ino, so it ought to work fine, but it's possible that struct
randomisation has reordered i_ino after i_dentry and the pointer is just
wild enough that i_dentry is fetchable and i_ino isn't.

Also print the inode number if the dentry is invalid.

Reported-by: Vlastimil Babka <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: add a range mapping allocation attribute
Joao Martins [Tue, 13 Oct 2020 23:51:06 +0000 (16:51 -0700)]
device-dax: add a range mapping allocation attribute

Add a sysfs attribute which denotes a range from the dax region to be
allocated.  It's an write only @mapping sysfs attribute in the format of
'<start>-<end>' to allocate a range.  @start and @end use hexadecimal
values and the @pgoff is implicitly ordered wrt to previous writes to
@mapping sysfs e.g.  a write of a range of length 1G the pgoff is
0..1G(-4K), a second write will use @pgoff for 1G+4K..<size>.

This range mapping interface is useful for:

 1) Application which want to implement its own allocation logic, and
    thus pick the desired ranges from dax_region.

 2) For use cases like VMM fast restart[0] where after kexec we want
    to the same gpa<->phys mappings (as originally created before kexec).

[0] https://static.sched.com/hosted_files/kvmforum2019/66/VMM-fast-restart_kvmforum2019.pdf

Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643106970.4062302.10402616567780784722.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/160106119570.30709.4548889722645210610.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodax/hmem: introduce dax_hmem.region_idle parameter
Joao Martins [Tue, 13 Oct 2020 23:51:00 +0000 (16:51 -0700)]
dax/hmem: introduce dax_hmem.region_idle parameter

Introduce a new module parameter for dax_hmem which initializes all region
devices as free, rather than allocating a pagemap for the region by
default.

All hmem devices created with dax_hmem.region_idle=1 will have full
available size for creating dynamic dax devices.

Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643106460.4062302.5868522341307530091.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/160106119033.30709.11249962152222193448.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: add an 'align' attribute
Dan Williams [Tue, 13 Oct 2020 23:50:55 +0000 (16:50 -0700)]
device-dax: add an 'align' attribute

Introduce a device align attribute.  While doing so, rename the region
align attribute to be more explicitly named as so, but keep it named as
@align to retain the API for tools like daxctl.

Changes on align may not always be valid, when say certain mappings were
created with 2M and then we switch to 1G.  So, we validate all ranges
against the new value being attempted, post resizing.

Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643105944.4062302.3131761052969132784.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/160106118486.30709.13012322227204800596.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: make align a per-device property
Joao Martins [Tue, 13 Oct 2020 23:50:50 +0000 (16:50 -0700)]
device-dax: make align a per-device property

Introduce @align to struct dev_dax.

When creating a new device, we still initialize to the default dax_region
@align.  Child devices belonging to a region may wish to keep a different
alignment property instead of a global region-defined one.

Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643105377.4062302.4159447829955683131.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/160106117957.30709.1142303024324655705.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: introduce 'mapping' devices
Dan Williams [Tue, 13 Oct 2020 23:50:45 +0000 (16:50 -0700)]
device-dax: introduce 'mapping' devices

In support of interrogating the physical address layout of a device with
dis-contiguous ranges, introduce a sysfs directory with 'start', 'end',
and 'page_offset' attributes.  The alternative is trying to parse
/proc/iomem, and that file will not reflect the extent layout until the
device is enabled.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643104819.4062302.13691281391423291589.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106117446.30709.2751020815463722537.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: add dis-contiguous resource support
Dan Williams [Tue, 13 Oct 2020 23:50:39 +0000 (16:50 -0700)]
device-dax: add dis-contiguous resource support

Break the requirement that device-dax instances are physically contiguous.
With this constraint removed it allows fragmented available capacity to
be fully allocated.

This capability is useful to mitigate the "noisy neighbor" problem with
memory-side-cache management for virtual machines, or any other scenario
where a platform address boundary also designates a performance boundary.
For example a direct mapped memory side cache might rotate cache colors at
1GB boundaries.  With dis-contiguous allocations a device-dax instance
could be configured to contain only 1 cache color.

It also satisfies Joao's use case (see link) for partitioning memory for
exclusive guest access.  It allows for a future potential mode where the
host kernel need not allocate 'struct page' capacity up-front.

Reported-by: Joao Martins <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lkml.kernel.org/r/159643104304.4062302.16561669534797528660.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106116875.30709.11456649969327399771.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/memremap_pages: support multiple ranges per invocation
Dan Williams [Tue, 13 Oct 2020 23:50:34 +0000 (16:50 -0700)]
mm/memremap_pages: support multiple ranges per invocation

In support of device-dax growing the ability to front physically
dis-contiguous ranges of memory, update devm_memremap_pages() to track
multiple ranges with a single reference counter and devm instance.

Convert all [devm_]memremap_pages() users to specify the number of ranges
they are mapping in their 'struct dev_pagemap' instance.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: "Jérôme Glisse" <[email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643103789.4062302.18426128170217903785.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106116293.30709.13350662794915396198.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/memremap_pages: convert to 'struct range'
Dan Williams [Tue, 13 Oct 2020 23:50:29 +0000 (16:50 -0700)]
mm/memremap_pages: convert to 'struct range'

The 'struct resource' in 'struct dev_pagemap' is only used for holding
resource span information.  The other fields, 'name', 'flags', 'desc',
'parent', 'sibling', and 'child' are all unused wasted space.

This is in preparation for introducing a multi-range extension of
devm_memremap_pages().

The bulk of this change is unwinding all the places internal to libnvdimm
that used 'struct resource' unnecessarily, and replacing instances of
'struct dev_pagemap'.res with 'struct dev_pagemap'.range.

P2PDMA had a minor usage of the resource flags field, but only to report
failures with "%pR".  That is replaced with an open coded print of the
range.

[[email protected]: mm/hmm/test: use after free in dmirror_allocate_chunk()]
Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]> [xen]
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: add resize support
Dan Williams [Tue, 13 Oct 2020 23:50:24 +0000 (16:50 -0700)]
device-dax: add resize support

Make the device-dax 'size' attribute writable to allow capacity to be
split between multiple instances in a region.  The intended consumers of
this capability are users that want to split a scarce memory resource
between device-dax and System-RAM access, or users that want to have
multiple security domains for a large region.

By default the hmem instance provider allocates an entire region to the
first instance.  The process of creating a new instance (assuming a
region-id of 0) is find the region and trigger the 'create' attribute
which yields an empty instance to configure.  For example:

    cd /sys/bus/dax/devices
    echo dax0.0 > dax0.0/driver/unbind
    echo $new_size > dax0.0/size
    echo 1 > $(readlink -f dax0.0)../dax_region/create
    seed=$(cat $(readlink -f dax0.0)../dax_region/seed)
    echo $new_size > $seed/size
    echo dax0.0 > ../drivers/{device_dax,kmem}/bind
    echo dax0.1 > ../drivers/{device_dax,kmem}/bind

Instances can be destroyed by:

    echo $device > $(readlink -f $device)../dax_region/delete

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643102625.4062302.7431838945566033852.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106115239.30709.9850106928133493138.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodrivers/base: make device_find_child_by_name() compatible with sysfs inputs
Dan Williams [Tue, 13 Oct 2020 23:50:18 +0000 (16:50 -0700)]
drivers/base: make device_find_child_by_name() compatible with sysfs inputs

Use sysfs_streq() in device_find_child_by_name() to allow it to use a
sysfs input string that might contain a trailing newline.

The other "device by name" interfaces,
{bus,driver,class}_find_device_by_name(), already account for sysfs
strings.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643102106.4062302.12229802117645312104.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106114576.30709.2960091665444712180.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: introduce 'seed' devices
Dan Williams [Tue, 13 Oct 2020 23:50:13 +0000 (16:50 -0700)]
device-dax: introduce 'seed' devices

Add a seed device concept for dynamic dax regions to be able to split the
region amongst multiple sub-instances.  The seed device, similar to
libnvdimm seed devices, is a device that starts with zero capacity
allocated and unbound to a driver.  In contrast to libnvdimm seed devices
explicit 'create' and 'delete' interfaces are added to the region to
trigger seeds to be created and unused devices to be reclaimed.  The
explicit create and delete replaces implicit create as a side effect of
probe and implicit delete when writing 0 to the size that libnvdimm
implements.

Delete can be performed on any 0-sized and idle device.  This avoids the
gymnastics of needing to move device_unregister() to its own async
context.  Specifically, it avoids the deadlock of deleting a device via
one of its own attributes.  It is also less surprising to userspace which
never sees an extra device it did not request.

For now just add the device creation, teardown, and ->probe() prevention.
A later patch will arrange for the 'dax/size' attribute to be writable to
allocate capacity from the region.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643101583.4062302.12255093902950754962.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106113873.30709.15168756050631539431.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: introduce 'struct dev_dax' typed-driver operations
Dan Williams [Tue, 13 Oct 2020 23:50:08 +0000 (16:50 -0700)]
device-dax: introduce 'struct dev_dax' typed-driver operations

In preparation for introducing seed devices the dax-bus core needs to be
able to intercept ->probe() and ->remove() operations.  Towards that end
arrange for the bus and drivers to switch from raw 'struct device' driver
operations to 'struct dev_dax' typed operations.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/160106113357.30709.4541750544799737855.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: add an allocation interface for device-dax instances
Dan Williams [Tue, 13 Oct 2020 23:50:03 +0000 (16:50 -0700)]
device-dax: add an allocation interface for device-dax instances

In preparation for a facility that enables dax regions to be sub-divided,
introduce infrastructure to track and allocate region capacity.

The new dax_region/available_size attribute is only enabled for volatile
hmem devices, not pmem devices that are defined by nvdimm namespace
boundaries.  This is per Jeff's feedback the last time dynamic device-dax
capacity allocation support was discussed.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/linux-nvdimm/[email protected]
Link: https://lkml.kernel.org/r/159643101035.4062302.6785857915652647857.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106112801.30709.14601438735305335071.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax/kmem: replace release_resource() with release_mem_region()
Dan Williams [Tue, 13 Oct 2020 23:49:58 +0000 (16:49 -0700)]
device-dax/kmem: replace release_resource() with release_mem_region()

Towards removing the mode specific @dax_kmem_res attribute from the
generic 'struct dev_dax', and preparing for multi-range support, change
the kmem driver to use the idiomatic release_mem_region() to pair with the
initial request_mem_region().  This also eliminates the need to open code
the release of the resource allocated by request_mem_region().

As there are no more dax_kmem_res users, delete this struct member.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/160106112239.30709.15909567572288425294.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax/kmem: move resource name tracking to drvdata
Dan Williams [Tue, 13 Oct 2020 23:49:53 +0000 (16:49 -0700)]
device-dax/kmem: move resource name tracking to drvdata

Towards removing the mode specific @dax_kmem_res attribute from the
generic 'struct dev_dax', and preparing for multi-range support, move
resource name tracking to driver data.  The memory for the resource name
needs to have its own lifetime separate from the device bind lifetime for
cases where the driver is unbound, but the kmem range could not be
unplugged from the page allocator.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/160106111639.30709.17624822766862009183.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax/kmem: introduce dax_kmem_range()
Dan Williams [Tue, 13 Oct 2020 23:49:48 +0000 (16:49 -0700)]
device-dax/kmem: introduce dax_kmem_range()

Towards removing the mode specific @dax_kmem_res attribute from the
generic 'struct dev_dax', and preparing for multi-range support, teach the
driver to calculate the hotplug range from the device range.  The hotplug
range is the trivially calculated memory-block-size aligned version of the
device range.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/160106111109.30709.3173462396758431559.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: make pgmap optional for instance creation
Dan Williams [Tue, 13 Oct 2020 23:49:43 +0000 (16:49 -0700)]
device-dax: make pgmap optional for instance creation

The passed in dev_pagemap is only required in the pmem case as the
libnvdimm core may have reserved a vmem_altmap for dev_memremap_pages() to
place the memmap in pmem directly.  In the hmem case there is no agent
reserving an altmap so it can all be handled by a core internal default.

Pass the resource range via a new @range property of 'struct
dev_dax_data'.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/159643099958.4062302.10379230791041872886.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106110513.30709.4303239334850606031.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: move instance creation parameters to 'struct dev_dax_data'
Dan Williams [Tue, 13 Oct 2020 23:49:38 +0000 (16:49 -0700)]
device-dax: move instance creation parameters to 'struct dev_dax_data'

In preparation for adding more parameters to instance creation, move
existing parameters to a new struct.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643099411.4062302.1337305960720423895.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agodevice-dax: drop the dax_region.pfn_flags attribute
Dan Williams [Tue, 13 Oct 2020 23:49:33 +0000 (16:49 -0700)]
device-dax: drop the dax_region.pfn_flags attribute

All callers specify the same flags to alloc_dax_region(), so there is no
need to allow for anything other than PFN_DEV|PFN_MAP, or carry a
->pfn_flags around on the region.  Device-dax instances are always page
backed.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643098829.4062302.13611520567669439046.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoACPI: HMAT: attach a device for each soft-reserved range
Dan Williams [Tue, 13 Oct 2020 23:49:28 +0000 (16:49 -0700)]
ACPI: HMAT: attach a device for each soft-reserved range

The hmem enabling in commit cf8741ac57ed ("ACPI: NUMA: HMAT: Register
"soft reserved" memory as an "hmem" device") only registered ranges to the
hmem driver for each soft-reservation that also appeared in the HMAT.
While this is meant to encourage platform firmware to "do the right thing"
and publish an HMAT, the corollary is that platforms that fail to publish
an accurate HMAT will strand memory from Linux usage.  Additionally, the
"efi_fake_mem" kernel command line option enabling will strand memory by
default without an HMAT.

Arrange for "soft reserved" memory that goes unclaimed by HMAT entries to
be published as raw resource ranges for the hmem driver to consume.

Include a module parameter to disable either this fallback behavior, or
the hmat enabling from creating hmem devices.  The module parameter
requires the hmem device enabling to have unique name in the module
namespace: "device_hmem".

The driver depends on the architecture providing phys_to_target_node()
which is only x86 via numa_meminfo() and arm64 via a generic memblock
implementation.

[[email protected]: require NUMA_KEEP_MEMINFO for phys_to_target_node()]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jia He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643098298.4062302.17587338161136144730.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/memory_hotplug: introduce default phys_to_target_node() implementation
Dan Williams [Tue, 13 Oct 2020 23:49:23 +0000 (16:49 -0700)]
mm/memory_hotplug: introduce default phys_to_target_node() implementation

In preparation to set a fallback value for dev_dax->target_node, introduce
generic fallback helpers for phys_to_target_node()

A generic implementation based on node-data or memblock was proposed, but
as noted by Mike:

    "Here again, I would prefer to add a weak default for
     phys_to_target_node() because the "generic" implementation is not really
     generic.

     The fallback to reserved ranges is x86 specfic because on x86 most of
     the reserved areas is not in memblock.memory. AFAIK, no other
     architecture does this."

The info message in the generic memory_add_physaddr_to_nid()
implementation is fixed up to properly reflect that
memory_add_physaddr_to_nid() communicates "online" node info and
phys_to_target_node() indicates "target / to-be-onlined" node info.

[[email protected]: fix CONFIG_MEMORY_HOTPLUG=n build]
Link: https://lkml.kernel.org/r/202008252130.7YrHIyMI%[email protected]
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Jia He <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643097768.4062302.3135192588966888630.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoresource: report parent to walk_iomem_res_desc() callback
Dan Williams [Tue, 13 Oct 2020 23:49:18 +0000 (16:49 -0700)]
resource: report parent to walk_iomem_res_desc() callback

In support of detecting whether a resource might have been been claimed,
report the parent to the walk_iomem_res_desc() callback.  For example, the
ACPI HMAT parser publishes "hmem" platform devices per target range.
However, if the HMAT is disabled / missing a fallback driver can attach
devices to the raw memory ranges as a fallback if it sees unclaimed /
orphan "Soft Reserved" resources in the resource tree.

Otherwise, find_next_iomem_res() returns a resource with garbage data from
the stack allocation in __walk_iomem_res_desc() for the res->parent field.

There are currently no users that expect ->child and ->sibling to be
valid, and the resource_lock would be needed to traverse them.  Use a
compound literal to implicitly zero initialize the fields that are not
being returned in addition to setting ->parent.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643097166.4062302.11875688887228572793.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoACPI: HMAT: refactor hmat_register_target_device to hmem_register_device
Dan Williams [Tue, 13 Oct 2020 23:49:13 +0000 (16:49 -0700)]
ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device

In preparation for exposing "Soft Reserved" memory ranges without an HMAT,
move the hmem device registration to its own compilation unit and make the
implementation generic.

The generic implementation drops usage acpi_map_pxm_to_online_node() that
was translating ACPI proximity domain values and instead relies on
numa_map_to_online_node() to determine the numa node for the device.

[[email protected]: CONFIG_DEV_DAX_HMEM_DEVICES should depend on CONFIG_DAX=y]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643096584.4062302.5035370788475153738.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lore.kernel.org/r/158318761484.2216124.2049322072599482736.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoefi/fake_mem: arrange for a resource entry per efi_fake_mem instance
Dan Williams [Tue, 13 Oct 2020 23:49:08 +0000 (16:49 -0700)]
efi/fake_mem: arrange for a resource entry per efi_fake_mem instance

In preparation for attaching a platform device per iomem resource teach
the efi_fake_mem code to create an e820 entry per instance.  Similar to
E820_TYPE_PRAM, bypass merging resource when the e820 map is sanitized.

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643096068.4062302.11590041070221681669.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agox86/numa: add 'nohmat' option
Dan Williams [Tue, 13 Oct 2020 23:49:02 +0000 (16:49 -0700)]
x86/numa: add 'nohmat' option

Disable parsing of the HMAT for debug, to workaround broken platform
instances, or cases where it is otherwise not wanted.

[[email protected]: fix build when CONFIG_ACPI is not set]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/159643095540.4062302.732962081968036212.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agox86/numa: cleanup configuration dependent command-line options
Dan Williams [Tue, 13 Oct 2020 23:48:57 +0000 (16:48 -0700)]
x86/numa: cleanup configuration dependent command-line options

Patch series "device-dax: Support sub-dividing soft-reserved ranges", v5.

The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM.  It is the mechanism for converting persistent
memory (pmem) to be used as another volatile memory pool i.e.  the current
Memory Tiering hot topic on linux-mm.

In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [3].  This series provides a
sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [4] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [5]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
   device-dax interface on custom address ranges. A follow-on for the VM
   use case is to teach device-dax to dynamically allocate 'struct page' at
   runtime to reduce the duplication of 'struct page' space in both the
   guest and the host kernel for the same physical pages.

[2]: http://lore.kernel.org/r/20200713160837[email protected]
[3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434[email protected]
[4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259[email protected]
[5]: http://lore.kernel.org/r/20200110190313[email protected]

This patch (of 23):

In preparation for adding a new numa= option clean up the existing ones to
avoid ifdefs in numa_setup(), and provide feedback when the option is
numa=fake= option is invalid due to kernel config.  The same does not need
to be done for numa=noacpi, since the capability is already hard disabled
at compile-time.

Suggested-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Ben Skeggs <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: David Airlie <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Hulk Robot <[email protected]>
Cc: Jason Yan <[email protected]>
Cc: "Jérôme Glisse" <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vivek Goyal <[email protected]>
Link: https://lkml.kernel.org/r/160106109960.30709.7379926726669669398.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/159643094279.4062302.17779410714418721328.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/159643094925.4062302.14979872973043772305.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm,kmemleak-test.c: move kmemleak-test.c to samples dir
Hui Su [Tue, 13 Oct 2020 23:48:53 +0000 (16:48 -0700)]
mm,kmemleak-test.c: move kmemleak-test.c to samples dir

kmemleak-test.c is just a kmemleak test module, which also can not be used
as a built-in kernel module.  Thus, i think it may should not be in mm
dir, and move the kmemleak-test.c to samples/kmemleak/kmemleak-test.c.
Fix the spelling of built-in by the way.

Signed-off-by: Hui Su <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Divya Indi <[email protected]>
Cc: Tomas Winkler <[email protected]>
Cc: David Howells <[email protected]>
Link: https://lkml.kernel.org/r/20200925183729.GA172837@rlk
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/kmemleak: rely on rcu for task stack scanning
Davidlohr Bueso [Tue, 13 Oct 2020 23:48:50 +0000 (16:48 -0700)]
mm/kmemleak: rely on rcu for task stack scanning

kmemleak_scan() currently relies on the big tasklist_lock hammer to
stabilize iterating through the tasklist.  Instead, this patch proposes
simply using rcu along with the rcu-safe for_each_process_thread flavor
(without changing scan semantics), which doesn't make use of
next_thread/p->thread_group and thus cannot race with exit.  Furthermore,
any races with fork() and not seeing the new child should be benign as
it's not running yet and can also be detected by the next scan.

Avoiding the tasklist_lock could prove beneficial for performance
considering the scan operation is done periodically.  I have seen
improvements of 30%-ish when doing similar replacements on very
pathological microbenchmarks (ie stressing get/setpriority(2)).

However my main motivation is that it's one less user of the global
lock, something that Linus has long time wanted to see gone eventually
(if ever) even if the traditional fairness issues has been dealt with
now with qrwlocks.  Of course this is a very long ways ahead.  This
patch also kills another user of the deprecated tsk->thread_group.

Signed-off-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Qian Cai <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/slub: make add_full() condition more explicit
Abel Wu [Tue, 13 Oct 2020 23:48:47 +0000 (16:48 -0700)]
mm/slub: make add_full() condition more explicit

The commit below is incomplete, as it didn't handle the add_full() part.
commit a4d3f8916c65 ("slub: remove useless kmem_cache_debug() before
remove_full()")

This patch checks for SLAB_STORE_USER instead of kmem_cache_debug(), since
that should be the only context in which we need the list_lock for
add_full().

Signed-off-by: Abel Wu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Liu Xiang <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/slub: fix missing ALLOC_SLOWPATH stat when bulk alloc
Abel Wu [Tue, 13 Oct 2020 23:48:43 +0000 (16:48 -0700)]
mm/slub: fix missing ALLOC_SLOWPATH stat when bulk alloc

The ALLOC_SLOWPATH statistics is missing in bulk allocation now.  Fix it
by doing statistics in alloc slow path.

Signed-off-by: Abel Wu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Pekka Enberg <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Hewenliang <[email protected]>
Cc: Hu Shiyuan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/slub.c: branch optimization in free slowpath
Abel Wu [Tue, 13 Oct 2020 23:48:40 +0000 (16:48 -0700)]
mm/slub.c: branch optimization in free slowpath

The two conditions are mutually exclusive and gcc compiler will optimise
this into if-else-like pattern.  Given that the majority of free_slowpath
is free_frozen, let's provide some hint to the compilers.

Tests (perf bench sched messaging -g 20 -l 400000, executed 10x
after reboot) are done and the summarized result:

un-patched patched
max. 192.316 189.851
min. 187.267 186.252
avg. 189.154 188.086
stdev. 1.37 0.99

Signed-off-by: Abel Wu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Hewenliang <[email protected]>
Cc: Hu Shiyuan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoinclude/linux/slab.h: fix a typo error in comment
tangjianqiang [Tue, 13 Oct 2020 23:48:37 +0000 (16:48 -0700)]
include/linux/slab.h: fix a typo error in comment

fix a typo error in slab.h
"allocagtor" -> "allocator"

Signed-off-by: tangjianqiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Souptick Joarder <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agomm/slab.c: clean code by removing redundant if condition
Mateusz Nosek [Tue, 13 Oct 2020 23:48:34 +0000 (16:48 -0700)]
mm/slab.c: clean code by removing redundant if condition

The removed code was unnecessary and changed nothing in the flow, since in
case of returning NULL by 'kmem_cache_alloc_node' returning 'freelist'
from the function in question is the same as returning NULL.

Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agofs_parse: mark fs_param_bad_value() as static
Luo Jiaxing [Tue, 13 Oct 2020 23:48:30 +0000 (16:48 -0700)]
fs_parse: mark fs_param_bad_value() as static

We found the following warning when build kernel with W=1:

fs/fs_parser.c:192:5: warning: no previous prototype for `fs_param_bad_value' [-Wmissing-prototypes]
int fs_param_bad_value(struct p_log *log, struct fs_parameter *param)
     ^
CC      drivers/usb/gadget/udc/snps_udc_core.o

And no header file define a prototype for this function, so we should mark
it as static.

Signed-off-by: Luo Jiaxing <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agofs/xattr.c: fix kernel-doc warnings for setxattr & removexattr
Randy Dunlap [Tue, 13 Oct 2020 23:48:27 +0000 (16:48 -0700)]
fs/xattr.c: fix kernel-doc warnings for setxattr & removexattr

Fix kernel-doc warnings in fs/xattr.c:

../fs/xattr.c:251: warning: Function parameter or member 'dentry' not described in '__vfs_setxattr_locked'
../fs/xattr.c:251: warning: Function parameter or member 'name' not described in '__vfs_setxattr_locked'
../fs/xattr.c:251: warning: Function parameter or member 'value' not described in '__vfs_setxattr_locked'
../fs/xattr.c:251: warning: Function parameter or member 'size' not described in '__vfs_setxattr_locked'
../fs/xattr.c:251: warning: Function parameter or member 'flags' not described in '__vfs_setxattr_locked'
../fs/xattr.c:251: warning: Function parameter or member 'delegated_inode' not described in '__vfs_setxattr_locked'
../fs/xattr.c:458: warning: Function parameter or member 'dentry' not described in '__vfs_removexattr_locked'
../fs/xattr.c:458: warning: Function parameter or member 'name' not described in '__vfs_removexattr_locked'
../fs/xattr.c:458: warning: Function parameter or member 'delegated_inode' not described in '__vfs_removexattr_locked'

Fixes: 08b5d5014a27 ("xattr: break delegations in {set,remove}xattr")
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Frank van der Linden <[email protected]>
Cc: Chuck Lever <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoocfs2: fix potential soft lockup during fstrim
Gang He [Tue, 13 Oct 2020 23:48:24 +0000 (16:48 -0700)]
ocfs2: fix potential soft lockup during fstrim

When we discard unused blocks on a mounted ocfs2 filesystem, fstrim
handles each block goup with locking/unlocking global bitmap meta-file
repeatedly. we should let fstrim thread take a break(if need) between
unlock and lock, this will avoid the potential soft lockup problem,
and also gives the upper applications more IO opportunities, these
applications are not blocked for too long at writing files.

Signed-off-by: Gang He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Jun Piao <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoocfs2: delete repeated words in comments
Randy Dunlap [Tue, 13 Oct 2020 23:48:21 +0000 (16:48 -0700)]
ocfs2: delete repeated words in comments

Drop duplicated words {the, and} in comments.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agontfs: add check for mft record size in superblock
Rustam Kovhaev [Tue, 13 Oct 2020 23:48:17 +0000 (16:48 -0700)]
ntfs: add check for mft record size in superblock

Number of bytes allocated for mft record should be equal to the mft record
size stored in ntfs superblock as reported by syzbot, userspace might
trigger out-of-bounds read by dereferencing ctx->attr in ntfs_attr_find()

Reported-by: [email protected]
Signed-off-by: Rustam Kovhaev <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: [email protected]
Acked-by: Anton Altaparmakov <[email protected]>
Link: https://syzkaller.appspot.com/bug?extid=aed06913f36eff9b544e
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoscripts/decodecode: add the capability to supply the program counter
Borislav Petkov [Tue, 13 Oct 2020 23:48:14 +0000 (16:48 -0700)]
scripts/decodecode: add the capability to supply the program counter

So that comparing with objdump output from vmlinux can ease pinpointing
where the trapping instruction actually is.  An example is better than a
thousand words:

  $ PC=0xffffffff8329a927 ./scripts/decodecode < ~/tmp/syz/gfs2.splat
  [ 477.379104][T23917] Code: 48 83 ec 28 48 89 3c 24 48 89 54 24 08 e8 c1 b4 4a fe 48 8d bb 00 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 97 05 00 00 48 8b 9b 00 01 00 00 48 85 db 0f 84
  All code
  ========
  ffffffff8329a8fd:       48 83 ec 28             sub    $0x28,%rsp
  ffffffff8329a901:       48 89 3c 24             mov    %rdi,(%rsp)
  ffffffff8329a905:       48 89 54 24 08          mov    %rdx,0x8(%rsp)
  ffffffff8329a90a:       e8 c1 b4 4a fe          callq  0xffffffff81745dd0
  ffffffff8329a90f:       48 8d bb 00 01 00 00    lea    0x100(%rbx),%rdi
  ffffffff8329a916:       48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
  ffffffff8329a91d:       fc ff df
  ffffffff8329a920:       48 89 fa                mov    %rdi,%rdx
  ffffffff8329a923:       48 c1 ea 03             shr    $0x3,%rdx
  ffffffff8329a927:*      80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)               <-- trapping instruction
  ffffffff8329a92b:       0f 85 97 05 00 00       jne    0xffffffff8329aec8
  ffffffff8329a931:       48 8b 9b 00 01 00 00    mov    0x100(%rbx),%rbx
  ffffffff8329a938:       48 85 db                test   %rbx,%rbx
  ffffffff8329a93b:       0f                      .byte 0xf
  ffffffff8329a93c:       84                      .byte 0x84

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Rabin Vincent <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoscripts/spelling.txt: add "arbitrary" typo
Naoki Hayama [Tue, 13 Oct 2020 23:48:11 +0000 (16:48 -0700)]
scripts/spelling.txt: add "arbitrary" typo

Add "abitrary||arbitrary".

Signed-off-by: Naoki Hayama <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Andy Whitcroft <[email protected]>
Cc: Joe Perches <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoscripts/spelling.txt: increase error-prone spell checking
Wang Qing [Tue, 13 Oct 2020 23:48:08 +0000 (16:48 -0700)]
scripts/spelling.txt: increase error-prone spell checking

Increase direcly,ununsed,manger spelling error check

Signed-off-by: Wang Qing <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Wang Qing <[email protected]>
Cc: Xiong <[email protected]>
Cc: SeongJae Park <[email protected]>
Cc: Jonathan Neuschfer <[email protected]>
Cc: Luca Ceresoli <[email protected]>
Cc: Joe Perches <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agokbuild: doc: describe proper script invocation
Lukas Bulwahn [Tue, 13 Oct 2020 23:48:05 +0000 (16:48 -0700)]
kbuild: doc: describe proper script invocation

During an investigation to fix up the execute bits of scripts in the
kernel repository, Andrew Morton and Kees Cook pointed out that the
execute bit should not matter, and that build scripts cannot rely on that.
Kees could not point to any documentation, though.

Masahiro Yamada explained the convention of setting execute bits to make
it easier for manual script invocation.

Provide some basic documentation how the build shall invoke scripts, such
that the execute bits do not matter, and acknowledge that execute bits are
useful nonetheless.

This serves as reference for further clean-up patches in the future.

Suggested-by: Andrew Morton <[email protected]>
Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Lukas Bulwahn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ujjwal Kumar <[email protected]>
Cc: Lukas Bulwahn <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/202008271102.FEB906C88@keescook/
Link: https://lore.kernel.org/linux-kbuild/CAK7LNAQdrvMkDA6ApDJCGr+5db8SiPo=G+p8EiOvnnGvEN80gA@mail.gmail.com/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoexport.h: fix section name for CONFIG_TRIM_UNUSED_KSYMS for Clang
Nick Desaulniers [Tue, 13 Oct 2020 23:48:01 +0000 (16:48 -0700)]
export.h: fix section name for CONFIG_TRIM_UNUSED_KSYMS for Clang

When enabling CONFIG_TRIM_UNUSED_KSYMS, the linker will warn about the
orphan sections:

(".discard.ksym") is being placed in '".discard.ksym"'

repeatedly when linking vmlinux.  This is because the stringification
operator, `#`, in the preprocessor escapes strings.  GCC and Clang differ
in how they treat section names that contain \".

The portable solution is to not use a string literal with the preprocessor
stringification operator.

Fixes: commit bbda5ec671d3 ("kbuild: simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS")
Reported-by: kbuild test robot <[email protected]>
Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Matthias Maennich <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://bugs.llvm.org/show_bug.cgi?id=42950
Link: https://github.com/ClangBuiltLinux/linux/issues/1166
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agocompiler.h: avoid escaped section names
Nick Desaulniers [Tue, 13 Oct 2020 23:47:58 +0000 (16:47 -0700)]
compiler.h: avoid escaped section names

The stringification operator, `#`, in the preprocessor escapes strings.
For example, `# "foo"` becomes `"\"foo\""`.  GCC and Clang differ in how
they treat section names that contain \".

The portable solution is to not use a string literal with the preprocessor
stringification operator.

In this case, since __section unconditionally uses the stringification
operator, we actually want the more verbose
__attribute__((__section__())).

Fixes: commit e04462fb82f8 ("Compiler Attributes: remove uses of __attribute__ from compiler.h")
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Luc Van Oostenryck <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Arvind Sankar <[email protected]>
Link: https://bugs.llvm.org/show_bug.cgi?id=42950
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agocompiler-gcc: improve version error
Nick Desaulniers [Tue, 13 Oct 2020 23:47:55 +0000 (16:47 -0700)]
compiler-gcc: improve version error

As Kees suggests, doing so provides developers with two useful pieces of
information:
- The kernel build was attempting to use GCC.
  (Maybe they accidentally poked the wrong configs in a CI.)
- They need 4.9 or better.
  ("Upgrade to what version?" doesn't need to be dug out of documentation,
   headers, etc.)

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Miguel Ojeda <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Reviewed-by: Sedat Dilek <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agokasan: remove mentions of unsupported Clang versions
Marco Elver [Tue, 13 Oct 2020 23:47:51 +0000 (16:47 -0700)]
kasan: remove mentions of unsupported Clang versions

Since the kernel now requires at least Clang 10.0.1, remove any mention of
old Clang versions and simplify the documentation.

Signed-off-by: Marco Elver <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoPartially revert "ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer"
Nick Desaulniers [Tue, 13 Oct 2020 23:47:48 +0000 (16:47 -0700)]
Partially revert "ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer"

This partially reverts commit b0fe66cf095016e0b238374c10ae366e1f087d11.

The minimum supported version of clang is now clang 10.0.1. We still
want to pass -meabi=gnu.

Suggested-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoRevert "arm64: vdso: Fix compilation with clang older than 8"
Nick Desaulniers [Tue, 13 Oct 2020 23:47:44 +0000 (16:47 -0700)]
Revert "arm64: vdso: Fix compilation with clang older than 8"

This reverts commit 3acf4be235280f14d838581a750532219d67facc.

The minimum supported version of clang is clang 10.0.1.

Suggested-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoRevert "arm64: bti: Require clang >= 10.0.1 for in-kernel BTI support"
Nick Desaulniers [Tue, 13 Oct 2020 23:47:40 +0000 (16:47 -0700)]
Revert "arm64: bti: Require clang >= 10.0.1 for in-kernel BTI support"

This reverts commit b9249cba25a5dce5de87e5404503a5e11832c2dd.

The minimum supported version of clang is now 10.0.1.

Suggested-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoRevert "kbuild: disable clang's default use of -fmerge-all-constants"
Nick Desaulniers [Tue, 13 Oct 2020 23:47:37 +0000 (16:47 -0700)]
Revert "kbuild: disable clang's default use of -fmerge-all-constants"

This reverts commit 87e0d4f0f37fb0c8c4aeeac46fff5e957738df79.

-fno-merge-all-constants has been the default since clang-6; the minimum
supported version of clang in the kernel is clang-10 (10.0.1).

Suggested-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Reviewed-by: Fangrui Song <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Reviewed-by: Sedat Dilek <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://reviews.llvm.org/rL329300.
Link: https://github.com/ClangBuiltLinux/linux/issues/9
Signed-off-by: Linus Torvalds <[email protected]>
4 years agocompiler-clang: add build check for clang 10.0.1
Nick Desaulniers [Tue, 13 Oct 2020 23:47:33 +0000 (16:47 -0700)]
compiler-clang: add build check for clang 10.0.1

Patch series "set clang minimum version to 10.0.1", v3.

Adds a compile time #error to compiler-clang.h setting the effective
minimum supported version to clang 10.0.1.  A separate patch has already
been picked up into the Documentation/ tree also confirming the version.

Next are a series of reverts. One for 32b arm is a partial revert.

Then Marco suggested fixes to KASAN docs.

Finally, improve the warning for GCC too as per Kees.

This patch (of 7):

During Plumbers 2020, we voted to just support the latest release of Clang
for now.  Add a compile time check for this.

We plan to remove workarounds for older versions now, which will break in
subtle and not so subtle ways.

Suggested-by: Sedat Dilek <[email protected]>
Suggested-by: Nathan Chancellor <[email protected]>
Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Miguel Ojeda <[email protected]>
Reviewed-by: Sedat Dilek <[email protected]>
Acked-by: Marco Elver <[email protected]>
Acked-by: Nathan Chancellor <[email protected]>
Acked-by: Sedat Dilek <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://github.com/ClangBuiltLinux/linux/issues/9
Link: https://github.com/ClangBuiltLinux/linux/issues/941
Signed-off-by: Linus Torvalds <[email protected]>
4 years agoMerge tag 'overflow-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Tue, 13 Oct 2020 23:37:39 +0000 (16:37 -0700)]
Merge tag 'overflow-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull overflow update from Kees Cook:
 "Just a single change to help enforce all callers are actually checking
  the results of the helpers"

* tag 'overflow-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  overflow: Add __must_check attribute to check_*() helpers

4 years agoMerge tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Tue, 13 Oct 2020 23:33:43 +0000 (16:33 -0700)]
Merge tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "The bulk of the changes are with the seccomp selftests to accommodate
  some powerpc-specific behavioral characteristics. Additional cleanups,
  fixes, and improvements are also included:

   - heavily refactor seccomp selftests (and clone3 selftests
     dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza
     Cascardo)

   - fix style issue in selftests (Zou Wei)

   - upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich
     Felker)

   - replace task_pt_regs(current) with current_pt_regs() (Denis
     Efremov)

   - fix corner-case race in USER_NOTIF (Jann Horn)

   - make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)"

* tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  seccomp: Make duplicate listener detection non-racy
  seccomp: Move config option SECCOMP to arch/Kconfig
  selftests/clone3: Avoid OS-defined clone_args
  selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit
  selftests/seccomp: Allow syscall nr and ret value to be set separately
  selftests/seccomp: Record syscall during ptrace entry
  selftests/seccomp: powerpc: Fix seccomp return value testing
  selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET
  selftests/seccomp: Avoid redundant register flushes
  selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG
  selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG
  selftests/seccomp: Remove syscall setting #ifdefs
  selftests/seccomp: mips: Remove O32-specific macro
  selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro
  selftests/seccomp: arm: Define SYSCALL_NUM_SET macro
  selftests/seccomp: mips: Define SYSCALL_NUM_SET macro
  selftests/seccomp: Provide generic syscall setting macro
  selftests/seccomp: Refactor arch register macros to avoid xtensa special case
  selftests/seccomp: Use __NR_mknodat instead of __NR_mknod
  selftests/seccomp: Use bitwise instead of arithmetic operator for flags
  ...

4 years agoMerge tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 13 Oct 2020 23:29:55 +0000 (16:29 -0700)]
Merge tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "A decent number of SELinux patches for v5.10, twenty two in total. The
  highlights are listed below, but all of the patches pass our test
  suite and merge cleanly.

   - A number of changes to how the SELinux policy is loaded and managed
     inside the kernel with the goal of improving the atomicity of a
     SELinux policy load operation.

     These changes account for the bulk of the diffstat as well as the
     patch count. A special thanks to everyone who contributed patches
     and fixes for this work.

   - Convert the SELinux policy read-write lock to RCU.

   - A tracepoint was added for audited SELinux access control events;
     this should help provide a more unified backtrace across kernel and
     userspace.

   - Allow the removal of security.selinux xattrs when a SELinux policy
     is not loaded.

   - Enable policy capabilities in SELinux policies created with the
     scripts/selinux/mdp tool.

   - Provide some "no sooner than" dates for the SELinux checkreqprot
     sysfs deprecation"

* tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (22 commits)
  selinux: provide a "no sooner than" date for the checkreqprot removal
  selinux: Add helper functions to get and set checkreqprot
  selinux: access policycaps with READ_ONCE/WRITE_ONCE
  selinux: simplify away security_policydb_len()
  selinux: move policy mutex to selinux_state, use in lockdep checks
  selinux: fix error handling bugs in security_load_policy()
  selinux: convert policy read-write lock to RCU
  selinux: delete repeated words in comments
  selinux: add basic filtering for audit trace events
  selinux: add tracepoint on audited events
  selinux: Create new booleans and class dirs out of tree
  selinux: Standardize string literal usage for selinuxfs directory names
  selinux: Refactor selinuxfs directory populating functions
  selinux: Create function for selinuxfs directory cleanup
  selinux: permit removing security.selinux xattr before policy load
  selinux: fix memdup.cocci warnings
  selinux: avoid dereferencing the policy prior to initialization
  selinux: fix allocation failure check on newpolicy->sidtab
  selinux: refactor changing booleans
  selinux: move policy commit after updating selinuxfs
  ...

4 years agoMerge tag 'audit-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoor...
Linus Torvalds [Tue, 13 Oct 2020 23:24:42 +0000 (16:24 -0700)]
Merge tag 'audit-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "A small set of audit patches for v5.10.

  There are only three patches in total, and all three are trivial fixes
  that don't really warrant any explanations beyond their descriptions.
  As usual, all three patches pass our test suite"

* tag 'audit-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: Remove redundant null check
  audit: uninitialize variable audit_sig_sid
  audit: change unnecessary globals into statics

4 years agoMerge tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next
Linus Torvalds [Tue, 13 Oct 2020 23:18:51 +0000 (16:18 -0700)]
Merge tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Two minor fixes and one performance enhancement to Smack. The
  performance improvement is significant and the new code is more like
  its counterpart in SELinux.

   - Two kernel test robot suggested clean-ups.

   - Teach Smack to use the IPv4 netlabel cache. This results in a
     12-14% improvement on TCP benchmarks"

* tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next:
  Smack: Remove unnecessary variable initialization
  Smack: Fix build when NETWORK_SECMARK is not set
  Smack: Use the netlabel cache
  Smack: Set socket labels only once
  Smack: Consolidate uses of secmark into a function

4 years agoMerge tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1
Linus Torvalds [Tue, 13 Oct 2020 23:10:37 +0000 (16:10 -0700)]
Merge tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo fix from Tetsuo HandaL
 "One patch to make it possible to execute usermode-driver's path"

* tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: Loosen pathname/domainname validation.

4 years agoMerge tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk...
Linus Torvalds [Tue, 13 Oct 2020 22:58:10 +0000 (15:58 -0700)]
Merge tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:
 "The big new thing is the fully lockless ringbuffer implementation,
  including the support for continuous lines. It will allow to store and
  read messages in any situation wihtout the risk of deadlocks and
  without the need of temporary per-CPU buffers.

  The access is still serialized by logbuf_lock. It synchronizes few
  more operations, for example, temporary buffer for formatting the
  message, syslog and kmsg_dump operations. The lock removal is being
  discussed and should be ready for the next release.

  The continuous lines are handled exactly the same way as before to
  avoid regressions in user space. It means that they are appended to
  the last message when the caller is the same. Only the last message
  can be extended.

  The data ring includes plain text of the messages. Except for an
  integer at the beginning of each message that points back to the
  descriptor ring with other metadata.

  The dictionary has to stay. journalctl uses it to filter the log. It
  allows to show messages related to a given device. The dictionary
  values are stored in the descriptor ring with the other metadata.

  This is the first part of the printk rework as discussed at Plumbers
  2019, see https://lore.kernel.org/r/[email protected]. The
  next big step will be handling consoles by kthreads during the normal
  system operation. It will require special handling of situations when
  the kthreads could not get scheduled, for example, early boot,
  suspend, panic.

  Other changes:

   - Add John Ogness as a reviewer for printk subsystem. He is author of
     the rework and is familiar with the code and history.

   - Fix locking in serial8250_do_startup() to prevent lockdep report.

   - Few code cleanups"

* tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (27 commits)
  printk: Use fallthrough pseudo-keyword
  printk: reduce setup_text_buf size to LOG_LINE_MAX
  printk: avoid and/or handle record truncation
  printk: remove dict ring
  printk: move dictionary keys to dev_printk_info
  printk: move printk_info into separate array
  printk: reimplement log_cont using record extension
  printk: ringbuffer: add finalization/extension support
  printk: ringbuffer: change representation of states
  printk: ringbuffer: clear initial reserved fields
  printk: ringbuffer: add BLK_DATALESS() macro
  printk: ringbuffer: relocate get_data()
  printk: ringbuffer: avoid memcpy() on state_var
  printk: ringbuffer: fix setting state in desc_read()
  kernel.h: Move oops_in_progress to printk.h
  scripts/gdb: update for lockless printk ringbuffer
  scripts/gdb: add utils.read_ulong()
  docs: vmcoreinfo: add lockless printk ringbuffer vmcoreinfo
  printk: reduce LOG_BUF_SHIFT range for H8300
  printk: ringbuffer: support dataless records
  ...

4 years agoMerge tag 'drm-misc-next-fixes-2020-10-13' of git://anongit.freedesktop.org/drm/drm...
Dave Airlie [Tue, 13 Oct 2020 21:25:51 +0000 (07:25 +1000)]
Merge tag 'drm-misc-next-fixes-2020-10-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

One fix for a bad revert in ingenic-drm, and one fix for panfrost to increase a timeout at power up.

Signed-off-by: Dave Airlie <[email protected]>
From: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
4 years agoMerge tag 'x86_asm_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 13 Oct 2020 20:36:07 +0000 (13:36 -0700)]
Merge tag 'x86_asm_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 asm updates from Borislav Petkov:
 "Two asm wrapper fixes:

   - Use XORL instead of XORQ to avoid a REX prefix and save some bytes
     in the .fixup section, by Uros Bizjak.

   - Replace __force_order dummy variable with a memory clobber to fix
     LLVM requiring a definition for former and to prevent memory
     accesses from still being cached/reordered, by Arvind Sankar"

* tag 'x86_asm_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Replace __force_order with a memory clobber
  x86/uaccess: Use XORL %0,%0 in __get_user_asm()

4 years agoMerge tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 13 Oct 2020 20:04:41 +0000 (13:04 -0700)]
Merge tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:
 "Here are the driver updates for 5.10.

  A few SCSI updates in here too, in coordination with Martin as they
  depend on core block changes for the shared tag bitmap.

  This contains:

   - NVMe pull requests via Christoph:
      - fix keep alive timer modification (Amit Engel)
      - order the PCI ID list more sensibly (Andy Shevchenko)
      - cleanup the open by controller helper (Chaitanya Kulkarni)
      - use an xarray for the CSE log lookup (Chaitanya Kulkarni)
      - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni)
      - fix nvme_ns_report_zones (Christoph Hellwig)
      - add a sanity check to nvmet-fc (James Smart)
      - fix interrupt allocation when too many polled queues are
        specified (Jeffle Xu)
      - small nvmet-tcp optimization (Mark Wunderlich)
      - fix a controller refcount leak on init failure (Chaitanya
        Kulkarni)
      - misc cleanups (Chaitanya Kulkarni)
      - major refactoring of the scanning code (Christoph Hellwig)

   - MD updates via Song:
      - Bug fixes in bitmap code, from Zhao Heming
      - Fix a work queue check, from Guoqing Jiang
      - Fix raid5 oops with reshape, from Song Liu
      - Clean up unused code, from Jason Yan
      - Discard improvements, from Xiao Ni
      - raid5/6 page offset support, from Yufen Yu

   - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap,
     Hannes)

   - null_blk open/active zone limit support (Niklas)

   - Set of bcache updates (Coly, Dongsheng, Qinglang)"

* tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits)
  md/raid5: fix oops during stripe resizing
  md/bitmap: fix memory leak of temporary bitmap
  md: fix the checking of wrong work queue
  md/bitmap: md_bitmap_get_counter returns wrong blocks
  md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks
  md/raid0: remove unused function is_io_in_chunk_boundary()
  nvme-core: remove extra condition for vwc
  nvme-core: remove extra variable
  nvme: remove nvme_identify_ns_list
  nvme: refactor nvme_validate_ns
  nvme: move nvme_validate_ns
  nvme: query namespace identifiers before adding the namespace
  nvme: revalidate zone bitmaps in nvme_update_ns_info
  nvme: remove nvme_update_formats
  nvme: update the known admin effects
  nvme: set the queue limits in nvme_update_ns_info
  nvme: remove the 0 lba_shift check in nvme_update_ns_info
  nvme: clean up the check for too large logic block sizes
  nvme: freeze the queue over ->lba_shift updates
  nvme: factor out a nvme_configure_metadata helper
  ...

4 years agoMerge tag 'libata-5.10-2020-10-12' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 13 Oct 2020 19:50:48 +0000 (12:50 -0700)]
Merge tag 'libata-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull libata updates from Jens Axboe:
 "Nothing major in here, just fixes or improvements collected over the
  last few months"

* tag 'libata-5.10-2020-10-12' of git://git.kernel.dk/linux-block:
  ata: ahci: mvebu: Make SATA PHY optional for Armada 3720
  MAINTAINERS: remove LIBATA PATA DRIVERS entry
  pata_cmd64x: Use fallthrough pseudo-keyword
  ahci: qoriq: enable acpi support in qoriq ahci driver
  sata, highbank: simplify the return expression of ahci_highbank_suspend
  ahci: Add Intel Rocket Lake PCH-H RAID PCI IDs

4 years agoMerge tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 13 Oct 2020 19:36:21 +0000 (12:36 -0700)]
Merge tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:

 - Add blkcg accounting for io-wq offload (Dennis)

 - A use-after-free fix for io-wq (Hillf)

 - Cancelation fixes and improvements

 - Use proper files_struct references for offload

 - Cleanup of io_uring_get_socket() since that can now go into our own
   header

 - SQPOLL fixes and cleanups, and support for sharing the thread

 - Improvement to how page accounting is done for registered buffers and
   huge pages, accounting the real pinned state

 - Series cleaning up the xarray code (Willy)

 - Various cleanups, refactoring, and improvements (Pavel)

 - Use raw spinlock for io-wq (Sebastian)

 - Add support for ring restrictions (Stefano)

* tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (62 commits)
  io_uring: keep a pointer ref_node in file_data
  io_uring: refactor *files_register()'s error paths
  io_uring: clean file_data access in files_register
  io_uring: don't delay io_init_req() error check
  io_uring: clean leftovers after splitting issue
  io_uring: remove timeout.list after hrtimer cancel
  io_uring: use a separate struct for timeout_remove
  io_uring: improve submit_state.ios_left accounting
  io_uring: simplify io_file_get()
  io_uring: kill extra check in fixed io_file_get()
  io_uring: clean up ->files grabbing
  io_uring: don't io_prep_async_work() linked reqs
  io_uring: Convert advanced XArray uses to the normal API
  io_uring: Fix XArray usage in io_uring_add_task_file
  io_uring: Fix use of XArray in __io_uring_files_cancel
  io_uring: fix break condition for __io_uring_register() waiting
  io_uring: no need to call xa_destroy() on empty xarray
  io_uring: batch account ->req_issue and task struct references
  io_uring: kill callback_head argument for io_req_task_work_add()
  io_uring: move req preps out of io_issue_sqe()
  ...

4 years agoMerge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 13 Oct 2020 19:12:44 +0000 (12:12 -0700)]
Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - Series of merge handling cleanups (Baolin, Christoph)

 - Series of blk-throttle fixes and cleanups (Baolin)

 - Series cleaning up BDI, seperating the block device from the
   backing_dev_info (Christoph)

 - Removal of bdget() as a generic API (Christoph)

 - Removal of blkdev_get() as a generic API (Christoph)

 - Cleanup of is-partition checks (Christoph)

 - Series reworking disk revalidation (Christoph)

 - Series cleaning up bio flags (Christoph)

 - bio crypt fixes (Eric)

 - IO stats inflight tweak (Gabriel)

 - blk-mq tags fixes (Hannes)

 - Buffer invalidation fixes (Jan)

 - Allow soft limits for zone append (Johannes)

 - Shared tag set improvements (John, Kashyap)

 - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

 - DM no-wait support (Mike, Konstantin)

 - Request allocation improvements (Ming)

 - Allow md/dm/bcache to use IO stat helpers (Song)

 - Series improving blk-iocost (Tejun)

 - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
   Xianting, Yang, Yufen, yangerkun)

* tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
  block: fix uapi blkzoned.h comments
  blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
  blk-mq: get rid of the dead flush handle code path
  block: get rid of unnecessary local variable
  block: fix comment and add lockdep assert
  blk-mq: use helper function to test hw stopped
  block: use helper function to test queue register
  block: remove redundant mq check
  block: invoke blk_mq_exit_sched no matter whether have .exit_sched
  percpu_ref: don't refer to ref->data if it isn't allocated
  block: ratelimit handle_bad_sector() message
  blk-throttle: Re-use the throtl_set_slice_end()
  blk-throttle: Open code __throtl_de/enqueue_tg()
  blk-throttle: Move service tree validation out of the throtl_rb_first()
  blk-throttle: Move the list operation after list validation
  blk-throttle: Fix IO hang for a corner case
  blk-throttle: Avoid tracking latency if low limit is invalid
  blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
  blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
  block: Remove redundant 'return' statement
  ...

4 years agoMerge tag 'x86_urgent_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 13 Oct 2020 19:02:57 +0000 (12:02 -0700)]
Merge tag 'x86_urgent_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix the #DE oops message string format which confused tools parsing
   crash information (Thomas Gleixner)

 - Remove an unused variable in the UV5 code which was triggering a
   build warning with clang (Mike Travis)

* tag 'x86_urgent_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/uv: Remove unused variable in UV5 NMI handler
  x86/traps: Fix #DE Oops message regression

4 years agodt: Remove booting-without-of.rst
Rob Herring [Thu, 8 Oct 2020 14:24:20 +0000 (09:24 -0500)]
dt: Remove booting-without-of.rst

booting-without-of.rst is an ancient document that first outlined
Flattened DeviceTree on PowerPC initially. The DT world has evolved a
lot in the 15 years since and booting-without-of.rst is pretty stale.
The name of the document itself is confusing if you don't understand the
evolution from real 'OpenFirmware'. Most of what booting-without-of.rst
contains is now in the DT specification (which evolved out of the
ePAPR). The few things that weren't documented in the DT specification
are now.

All that remains is the boot entry details, so let's move these to arch
specific documents. The exception is arm which already has the same
details documented.

Cc: Frank Rowand <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Acked-by: Michael Ellerman <[email protected]> (powerpc)
Signed-off-by: Rob Herring <[email protected]>
4 years agox86/platform/uv: Remove unused variable in UV5 NMI handler
Mike Travis [Tue, 13 Oct 2020 15:47:31 +0000 (10:47 -0500)]
x86/platform/uv: Remove unused variable in UV5 NMI handler

Remove an unused variable.

Signed-off-by: Mike Travis <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
4 years agox86/traps: Fix #DE Oops message regression
Thomas Gleixner [Mon, 12 Oct 2020 13:11:47 +0000 (15:11 +0200)]
x86/traps: Fix #DE Oops message regression

The conversion of #DE to the idtentry mechanism introduced a change in the
Ooops message which confuses tools which parse crash information in dmesg.

Remove the underscore from 'divide_error' to restore previous behaviour.

Fixes: 9d06c4027f21 ("x86/entry: Convert Divide Error to IDTENTRY")
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/CACT4Y+bTZFkuZd7+bPArowOv-7Die+WZpfOWnEO_Wgs3U59+oA@mail.gmail.com
4 years agoMerge tag 'hwmon-for-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Tue, 13 Oct 2020 17:15:31 +0000 (10:15 -0700)]
Merge tag 'hwmon-for-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 "New driver and chip support:
   - Moortec MR75203 PVT controller
   - MPS Multi-phase mp2975 controller
   - ADM1266
   - Zen3 CPUs
   - Intel MAX 10 BMC

  Enhancements:
   - Support for rated attributes in hwmon core
   - MAX20730:
      - Device monitoring via debugfs
      - VOUT readin adjustment vie devicetree bindings
   - LM75:
      - Devicetree support
      - Regulator support
   - Improved accumulationm logic in amd_energy driver
   - Added fan sensor to gsc-hwmon driver
   - Support for simplified I2C probing

  Various other minor fixes and improvements"

* tag 'hwmon-for-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (64 commits)
  hwmon: (pmbus/max20730) adjust the vout reading given voltage divider
  dt-bindings: hwmon: max20730: adding device tree doc for max20730
  hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller
  hwmon: Add DT bindings schema for PVT controller
  dt-bindings: hwmon: Add the +vs supply to the lm75 bindings
  dt-bindings: hwmon: Convert lm75 bindings to yaml
  docs: hwmon: (ltc2945) update datasheet link
  hwmon: (mlxreg-fan) Fix double "Mellanox"
  hwmon: (pmbus/max20730) add device monitoring via debugfs
  hwmon: (pmbus/max34440) Fix OC fault limits
  hwmon: (bt1-pvt) Wait for the completion with timeout
  hwmon: (bt1-pvt) Cache current update timeout
  hwmon: (bt1-pvt) Test sensor power supply on probe
  hwmon: (lm75) Add regulator support
  hwmon: Add hwmon driver for Intel MAX 10 BMC
  dt-bindings: Add MP2975 voltage regulator device
  hwmon: (pmbus) Add support for MPS Multi-phase mp2975 controller
  hwmon: (tmp513) fix spelling typo in comments
  hwmon: (amd_energy) Update driver documentation
  hwmon: (amd_energy) Improve the accumulation logic
  ...

4 years agoMerge tag 'gpio-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Tue, 13 Oct 2020 17:09:33 +0000 (10:09 -0700)]
Merge tag 'gpio-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This time very little driver changes but lots of core changes.

  We have some interesting cooperative work for ARM and Intel alike,
  making the GPIO subsystem more and more suitable for industrial
  systems and the like, in addition to the in-kernel users.

  We touch driver core (device properties) and lib/* by adding one
  simple string array free function, these are authored by Andy
  Shevchenko who is a well known and recognized core helpers maintainers
  so this should be fine.

  We also see some Android GKI-related modularization in the MXC
  drivers.

  Core changes:

   - The big core change is the updated (v2) userspace character device
     API.

     This corrects badly designed 64-bit alignment around the line
     events. We also add the debounce request feature. This echoes the
     often quotes passage from Frederick Brooks "The mythical man-month"
     to always throw one away, which we have seen before in things such
     as V4L2. So we put in a new one and deprecate and obsolete the old
     one.

   - All example tools in tools/gpio/* are migrated to the new API to
     set a good example. The libgpiod userspace library has been
     augmented to use this new API pretty much from day 1.

   - Some misc API hardening by using strn* function calls has been
     added as well.

   - Use the simpler IDA interface for GPIO chip instance enumeration.

   - Add device core function for counting string arrays in device
     properties.

   - Provide a generic library function kfree_strarray() that can be
     used throughout the kernel.

  Driver enhancements:

   - The DesignWare dwapb-gpio driver has been enhanced and now uses the
     IRQ handling in the gpiolib core.

   - The mockup and aggregator drivers have seen some substantial code
     clean-up and now use more of the core kernel inftrastructure.

   - Misc cleanups using dev_err_probe().

   - The MXC drivers (Freescale/NXP) can now be built modularized, which
     makes modularized GKI Android kernels happy"

* tag 'gpio-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (73 commits)
  gpiolib: Update header block in gpiolib-cdev.h
  gpiolib: cdev: switch from kstrdup() to kstrndup()
  docs: gpio: add a new document to its index.rst
  gpio: pca953x: Add support for the NXP PCAL9554B/C
  tools: gpio: add debounce support to gpio-event-mon
  tools: gpio: add multi-line monitoring to gpio-event-mon
  tools: gpio: port gpio-event-mon to v2 uAPI
  tools: gpio: port gpio-hammer to v2 uAPI
  tools: gpio: rename nlines to num_lines
  tools: gpio: port gpio-watch to v2 uAPI
  tools: gpio: port lsgpio to v2 uAPI
  gpio: uapi: document uAPI v1 as deprecated
  gpiolib: cdev: support setting debounce
  gpiolib: cdev: support GPIO_V2_LINE_SET_VALUES_IOCTL
  gpiolib: cdev: support GPIO_V2_LINE_SET_CONFIG_IOCTL
  gpiolib: cdev: support edge detection for uAPI v2
  gpiolib: cdev: support GPIO_V2_GET_LINEINFO_IOCTL and GPIO_V2_GET_LINEINFO_WATCH_IOCTL
  gpiolib: cdev: support GPIO_V2_GET_LINE_IOCTL and GPIO_V2_LINE_GET_VALUES_IOCTL
  gpiolib: add build option for CDEV v1 ABI
  gpiolib: make cdev a build option
  ...

4 years agoMerge tag 'spi-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Tue, 13 Oct 2020 17:03:12 +0000 (10:03 -0700)]
Merge tag 'spi-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "There's quite a lot of changes for SPI in this release but none in the
  core, they're all mostly small driver updates and additions. Some of
  the more notable changes include:

   - A huge set of cleanups, optimizations and improvements for the
     DesignWare driver from Serge Semin finishing up the work started
     last release.

   - Conversion of the Zynq gqspi driver to spi-mem.

   - Support for Baikal T1, Broadcom BCMSTB 7445, and Renesas R8A7742"

* tag 'spi-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (137 commits)
  spi: cadence: Add SPI transfer delays
  spi: dw: Add Baikal-T1 SPI Controller bindings
  spi: dw: Add Baikal-T1 SPI Controller glue driver
  spi: dw: Add poll-based SPI transfers support
  spi: dw: Introduce max mem-ops SPI bus frequency setting
  spi: dw: Add memory operations support
  spi: dw: Add generic DW SSI status-check method
  spi: dw: Move num-of retries parameter to the header file
  spi: dw: Explicitly de-assert CS on SPI transfer completion
  spi: dw: De-assert chip-select on reset
  spi: dw: Discard chip enabling on DMA setup error
  spi: dw: Unmask IRQs after enabling the chip
  spi: dw: Perform IRQ setup in a dedicated function
  spi: dw: Refactor IRQ-based SPI transfer procedure
  spi: dw: Refactor data IO procedure
  spi: dw: Add DW SPI controller config structure
  spi: dw: Update Rx sample delay in the config function
  spi: dw: Simplify the SPI bus speed config procedure
  spi: dw: Update SPI bus speed in a config function
  spi: dw: Detach SPI device specific CR0 config method
  ...

4 years agoMerge tag 'regulator-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Linus Torvalds [Tue, 13 Oct 2020 16:56:25 +0000 (09:56 -0700)]
Merge tag 'regulator-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "This is a fairly small release for the regulator API, there's quite a
  few new devices supported and some important improvements around
  coupled regulators in the core but mostly just small fixes and
  improvements otherwise.

  Summary:

   - Fixes and cleanups around the handling of coupled regulators.

   - A special driver for some Raspberry Pi panels with some unusually
     custom stuff around them.

   - Support for Qualcomm PM660/PM660L, PM8950 and PM8953, Richtek
     RT4801 and RTMV20, Rohm BD9576MUF and BD9573MUF"

* tag 'regulator-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (89 commits)
  regulator: bd9576: Fix print
  regulator: bd9576: fix regulator binfdings dt node names
  dt-bindings: regulator: document pm8950 and pm8953 smd regulators
  regulator: qcom_smd: add pm8953 regulators
  regulator: Make constraint debug processing conditional on DEBUG
  regulator: qcom: labibb: Constify static structs
  regulator: dt-bindings: Document the PM660/PM660L PMICs entries
  regulator: qcom_smd: Add PM660/PM660L regulator support
  regulator: dt-bindings: Document the PM660/660L SPMI PMIC entries
  regulator: qcom_spmi: Add PM660/PM660L regulators
  regulator: qcom_spmi: Add support for new regulator types
  regulator: core: Enlarge max OF property name length to 64 chars
  regulator: tps65910: use regmap accessors
  regulator: rtmv20: Add missing regcache cache only before marked as dirty
  regulator: rtmv20: Update DT binding document and property name parsing
  regulator: rtmv20: Add DT-binding document for Richtek RTMV20
  regulator: rtmv20: Adds support for Richtek RTMV20 load switch regulator
  regulator: resolve supply after creating regulator
  regulator: print symbolic errors in kernel messages
  regulator: print state at boot
  ...

4 years agoMerge tag 'regmap-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Linus Torvalds [Tue, 13 Oct 2020 16:53:59 +0000 (09:53 -0700)]
Merge tag 'regmap-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
 "Quite a busy release for regmap, mostly support for new features
  useful on fairly small subsets of devices. The user visible features
  are:

   - A new API for registering large numbers of regmap fields at once.

   - Support for Intel AVMM buses connected via SPI.

   - Support for 12/20 address/value layouts.

   - Support for yet another scheme for acknowledging interrupts used on
     some Qualcomm devices"

* tag 'regmap-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: irq: Add support to clear ack registers
  regmap: add support to regmap_field_bulk_alloc/free apis
  regmap: destroy mutex (if used) in regmap_exit()
  regmap: debugfs: use semicolons rather than commas to separate statements
  regmap: debugfs: Fix more error path regressions
  regmap: Add support for 12/20 register formatting
  regmap: Add can_sleep configuration option
  regmap: soundwire: remove unused header mod_devicetable.h
  regmap: Use flexible sleep
  regmap: add Intel SPI Slave to AVMM Bus Bridge support

4 years agoMerge tag 'media/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Tue, 13 Oct 2020 16:37:02 +0000 (09:37 -0700)]
Merge tag 'media/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - the usbvision driver was dropped from staging

 - the Zoran driver were re-added at staging. It gained lots of
   improvements, and was converted to use videobuf2 API

 - a new virtual driver (vidtv) was added in order to allow testing the
   digital TV framework and APIs

 - the media uAPI documentation gained a glossary with commonly used
   terms, helping to simplify some parts of the docs

 - more cleanups at the atomisp driver

 - Mediatek VPU gained support for MT8183

 - added support for codecs with supports doing colorspace conversion
   (CSC)

 - support for CSC API was added at vivid and rksip1 drivers

 - added a helper core support and uAPI for better supporting H.264
   codecs

 - added support for Renesas R8A774E1

 - use the new SPDX GFDL-1.1-no-invariants-or-later license on media
   uAPI docs, instead of a license text

 - Venus driver has gained VP9 codec support

 - lots of other cleanups and driver improvements

* tag 'media/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (555 commits)
  media: dvb-frontends/drxk_hard.c: fix uninitialized variable warning
  media: tvp7002: fix uninitialized variable warning
  media: s5k5baf: drop 'data' field in struct s5k5baf_fw
  media: dt-bindings: media: venus: Add an optional power domain for perf voting
  media: rcar-vin: rcar-dma: Fix setting VNIS_REG for RAW8 formats
  media: staging: rkisp1: uapi: Do not use BIT() macro
  media: v4l2-mem2mem: Fix spurious v4l2_m2m_buf_done
  media: usbtv: Fix refcounting mixup
  media: zoran.rst: place it at the right place this time
  media: add Zoran cardlist
  media: admin-guide: update cardlists
  media: siano: rename a duplicated card string
  media: zoran: move documentation file to the right place
  media: atomisp: fixes build breakage for ISP2400 due to a cleanup
  media: zoran: fix mixed case on vars
  media: zoran: get rid of an unused var
  media: zoran: use upper case for card types
  media: zoran: fix sparse warnings
  media: zoran: fix smatch warning
  media: zoran: update TODO
  ...

4 years agoALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close
Kai Vehmanen [Tue, 13 Oct 2020 15:26:28 +0000 (18:26 +0300)]
ALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close

A race exists between closing a PCM and update of ELD data. In
hdmi_pcm_close(), hinfo->nid value is modified without taking
spec->pcm_lock. If this happens concurrently while processing an ELD
update in hdmi_pcm_setup_pin(), converter assignment may be done
incorrectly.

This bug was found by hitting a WARN_ON in snd_hda_spdif_ctls_assign()
in a HDMI receiver connection stress test:

[2739.684569] WARNING: CPU: 5 PID: 2090 at sound/pci/hda/patch_hdmi.c:1898 check_non_pcm_per_cvt+0x41/0x50 [snd_hda_codec_hdmi]
...
[2739.684707] Call Trace:
[2739.684720]  update_eld+0x121/0x5a0 [snd_hda_codec_hdmi]
[2739.684736]  hdmi_present_sense+0x21e/0x3b0 [snd_hda_codec_hdmi]
[2739.684750]  check_presence_and_report+0x81/0xd0 [snd_hda_codec_hdmi]
[2739.684842]  intel_audio_codec_enable+0x122/0x190 [i915]

Fixes: 42b2987079ec ("ALSA: hda - hdmi playback without monitor in dynamic pcm bind mode")
Signed-off-by: Kai Vehmanen <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
4 years agoMerge tag 'mmc-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Tue, 13 Oct 2020 16:25:24 +0000 (09:25 -0700)]
Merge tag 'mmc-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC updates from Ulf Hansson:
 "MMC core:
   - Export SDIO revision and info strings to userspace
   - Add support for specifying mmc/mmcblk index via mmc aliases in DT

 MMC host:
   - Enable support for async probe for all mmc host drivers
   - Enable compile testing of multiple host drivers
   - dw_mmc: Enable the Synopsys DesignWare driver for RISCV and CSKY
   - mtk-sd: Fixup support for CQHCI
   - owl-mmc: Add support for the actions,s700-mmc variant
   - renesas_sdhi: Fix regression (temporary) for re-insertion of SD cards
   - renesas_sdhi: Add support for the r8a774e1 variant
   - renesas_sdhi/tmio: Improvements for tunings
   - renesas_sdhi/tmio: Rework support for reset of controller
   - sdhci-acpi: Fix HS400 tuning for devices with invalid presets on AMDI0040
   - sdhci_am654: Improve support for tunings
   - sdhci_am654: Add support for input tap delays
   - sdhci_am654: Add workaround for card detect debounce timer
   - sdhci-am654: Add support for the TI's J7200 variants
   - sdhci-esdhc-imx: Fix support for manual tuning
   - sdhci-iproc: Enable support for eMMC DDR 3.3V for bcm2711
   - sdhci-msm: Fix stability issues with HS400 for sc7180
   - sdhci-of-sparx5: Add Sparx5 SoC eMMC driver
   - sdhci-of-esdhc: Fixup reference clock source selection
   - sdhci-pci: Add LTR support for some Intel BYT controllers
   - sdhci-pci-gli: Add CQHCI Support for GL9763E"

* tag 'mmc-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (91 commits)
  mmc: sdhci_am654: Fix module autoload
  mmc: renesas_sdhi: workaround a regression when reinserting SD cards
  mmc: sdhci-pci-gli: Add CQHCI Support for GL9763E
  mmc: sdhci-acpi: AMDI0040: Set SDHCI_QUIRK2_PRESET_VALUE_BROKEN
  mmc: sdhci_am654: Enable tuning for SDR50
  mmc: sdhci_am654: Add support for software tuning
  mmc: sdhci_am654: Add support for input tap delay
  mmc: sdhci_am654: Fix hard coded otap delay array size
  dt-bindings: mmc: sdhci-am654: Add documentation for input tap delay
  dt-bindings: mmc: sdhci-am654: Convert sdhci-am654 controller documentation to json schema
  mmc: sdhci-of-esdhc: fix reference clock source selection
  mmc: host: fix depends for MMC_MESON_GX w/ COMPILE_TEST
  mmc: sdhci-s3c: hide forward declaration of of_device_id behind CONFIG_OF
  mmc: sdhci: fix indentation mistakes
  mmc: moxart: remove unneeded check for drvdata
  mmc: renesas_sdhi: drop local flag for tuning
  mmc: rtsx_usb_sdmmc: simplify the return expression of sd_change_phase()
  mmc: core: document mmc_hw_reset()
  mmc: mediatek: Drop pointer to mmc_host from msdc_host
  dt-bindings: mmc: owl: add compatible string actions,s700-mmc
  ...

4 years agoMerge tag 'erofs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang...
Linus Torvalds [Tue, 13 Oct 2020 16:09:42 +0000 (09:09 -0700)]
Merge tag 'erofs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "This cycle addresses a reported permission issue with overlay due to a
  duplicated permission check for "trusted." xattrs. Also, a REQ_RAHEAD
  flag is added now to all readahead requests in order to trace
  readahead I/Os. The others are random cleanups.

  All commits have been tested and have been in linux-next as well.

  Summary:

   - fix an issue which can cause overlay permission problem due to
     duplicated permission check for "trusted." xattrs;

   - add REQ_RAHEAD flag to readahead requests for blktrace;

   - several random cleanup"

* tag 'erofs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: remove unnecessary enum entries
  erofs: add REQ_RAHEAD flag to readahead requests
  erofs: fold in should_decompress_synchronously()
  erofs: avoid unnecessary variable `err'
  erofs: remove unneeded parameter
  erofs: avoid duplicated permission check for "trusted." xattrs

4 years agoMerge tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Linus Torvalds [Tue, 13 Oct 2020 16:01:36 +0000 (09:01 -0700)]
Merge tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "Mostly core updates with a few user visible bits and fixes.

  Hilights:

   - fsync performance improvements
      - less contention of log mutex (throughput +4%, latency -14%,
        dbench with 32 clients)
      - skip unnecessary commits for link and rename (throughput +6%,
        latency -30%, rename latency -75%, dbench with 16 clients)
      - make fast fsync wait only for writeback (throughput +10..40%,
        runtime -1..-20%, dbench with 1 to 64 clients on various
        file/block sizes)

   - direct io is now implemented using the iomap infrastructure, that's
     the main part, we still have a workaround that requires an iomap
     API update, coming in 5.10

   - new sysfs exports:
      - information about the exclusive filesystem operation status
        (balance, device add/remove/replace, ...)
      - supported send stream version

  Core:

   - use ticket space reservations for data, fair policy using the same
     infrastructure as metadata

   - preparatory work to switch locking from our custom tree locks to
     standard rwsem, now the locking context is propagated to all
     callers, actual switch is expected to happen in the next dev cycle

   - seed device structures are now using list API

   - extent tracepoints print proper tree id

   - unified range checks for extent buffer helpers

   - send: avoid using temporary buffer for copying data

   - remove unnecessary RCU protection from space infos

   - remove unused readpage callback for metadata, enabling several
     cleanups

   - replace indirect function calls for end io hooks and remove
     extent_io_ops completely

  Fixes:

   - more lockdep warning fixes

   - fix qgroup reservation for delayed inode and an occasional
     reservation leak for preallocated files

   - fix device replace of a seed device

   - fix metadata reservation for fallocate that leads to transaction
     aborts

   - reschedule if necessary when logging directory items or when
     cloning lots of extents

   - tree-checker: fix false alert caused by legacy btrfs root item

   - send: fix rename/link conflicts for orphanized inodes

   - properly initialize device stats for seed devices

   - skip devices without magic signature when mounting

  Other:

   - error handling improvements, BUG_ONs replaced by proper handling,
     fuzz fixes

   - various function parameter cleanups

   - various W=1 cleanups

   - error/info messages improved

  Mishaps:

   - commit 62cf5391209a ("btrfs: move btrfs_rm_dev_replace_free_srcdev
     outside of all locks") is a rebase leftover after the patch got
     merged to 5.9-rc8 as a466c85edc6f ("btrfs: move
     btrfs_rm_dev_replace_free_srcdev outside of all locks"), the
     remaining part is trivial and the patch is in the middle of the
     series so I'm keeping it there instead of rebasing"

* tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (161 commits)
  btrfs: rename BTRFS_INODE_ORDERED_DATA_CLOSE flag
  btrfs: annotate device name rcu_string with __rcu
  btrfs: skip devices without magic signature when mounting
  btrfs: cleanup cow block on error
  btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK
  fs: remove no longer used dio_end_io()
  btrfs: return error if we're unable to read device stats
  btrfs: init device stats for seed devices
  btrfs: remove struct extent_io_ops
  btrfs: call submit_bio_hook directly for metadata pages
  btrfs: stop calling submit_bio_hook for data inodes
  btrfs: don't opencode is_data_inode in end_bio_extent_readpage
  btrfs: call submit_bio_hook directly in submit_one_bio
  btrfs: remove extent_io_ops::readpage_end_io_hook
  btrfs: replace readpage_end_io_hook with direct calls
  btrfs: send, recompute reference path after orphanization of a directory
  btrfs: send, orphanize first all conflicting inodes when processing references
  btrfs: tree-checker: fix false alert caused by legacy btrfs root item
  btrfs: use unaligned helpers for stack and header set/get helpers
  btrfs: free-space-cache: use unaligned helpers to access data
  ...

4 years agoMerge tag 'dlm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Linus Torvalds [Tue, 13 Oct 2020 15:59:39 +0000 (08:59 -0700)]
Merge tag 'dlm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:
 "This set continues the ongoing rework of the low level communication
  layer in the dlm.

  The focus here is on improvements to connection handling, and
  reworking the receiving of messages"

* tag 'dlm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  fs: dlm: fix race in nodeid2con
  fs: dlm: rework receive handling
  fs: dlm: disallow buffer size below default
  fs: dlm: handle range check as callback
  fs: dlm: fix mark per nodeid setting
  fs: dlm: remove lock dependency warning
  fs: dlm: use free_con to free connection
  fs: dlm: handle possible othercon writequeues
  fs: dlm: move free writequeue into con free
  fs: dlm: fix configfs memory leak
  fs: dlm: fix dlm_local_addr memory leak
  fs: dlm: make connection hash lockless
  fs: dlm: synchronize dlm before shutdown

4 years agoMerge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Linus Torvalds [Tue, 13 Oct 2020 15:54:00 +0000 (08:54 -0700)]
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt updates from Eric Biggers:
 "This release, we rework the implementation of creating new encrypted
  files in order to fix some deadlocks and prepare for adding fscrypt
  support to CephFS, which Jeff Layton is working on.

  We also export a symbol in preparation for the above-mentioned CephFS
  support and also for ext4/f2fs encrypt+casefold support.

  Finally, there are a few other small cleanups.

  As usual, all these patches have been in linux-next with no reported
  issues, and I've tested them with xfstests"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: export fscrypt_d_revalidate()
  fscrypt: rename DCACHE_ENCRYPTED_NAME to DCACHE_NOKEY_NAME
  fscrypt: don't call no-key names "ciphertext names"
  fscrypt: use sha256() instead of open coding
  fscrypt: make fscrypt_set_test_dummy_encryption() take a 'const char *'
  fscrypt: handle test_dummy_encryption in more logical way
  fscrypt: move fscrypt_prepare_symlink() out-of-line
  fscrypt: make "#define fscrypt_policy" user-only
  fscrypt: stop pretending that key setup is nofs-safe
  fscrypt: require that fscrypt_encrypt_symlink() already has key
  fscrypt: remove fscrypt_inherit_context()
  fscrypt: adjust logging for in-creation inodes
  ubifs: use fscrypt_prepare_new_inode() and fscrypt_set_context()
  f2fs: use fscrypt_prepare_new_inode() and fscrypt_set_context()
  ext4: use fscrypt_prepare_new_inode() and fscrypt_set_context()
  ext4: factor out ext4_xattr_credits_for_new_inode()
  fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()
  fscrypt: restrict IV_INO_LBLK_32 to ino_bits <= 32
  fscrypt: drop unused inode argument from fscrypt_fname_alloc_buffer

4 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Tue, 13 Oct 2020 15:50:16 +0000 (08:50 -0700)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Allow DRBG testing through user-space af_alg
   - Add tcrypt speed testing support for keyed hashes
   - Add type-safe init/exit hooks for ahash

  Algorithms:
   - Mark arc4 as obsolete and pending for future removal
   - Mark anubis, khazad, sead and tea as obsolete
   - Improve boot-time xor benchmark
   - Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity

  Drivers:
   - Fixes and enhancement for XTS in caam
   - Add support for XIP8001B hwrng in xiphera-trng
   - Add RNG and hash support in sun8i-ce/sun8i-ss
   - Allow imx-rngc to be used by kernel entropy pool
   - Use crypto engine in omap-sham
   - Add support for Ingenic X1830 with ingenic"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits)
  X.509: Fix modular build of public_key_sm2
  crypto: xor - Remove unused variable count in do_xor_speed
  X.509: fix error return value on the failed path
  crypto: bcm - Verify GCM/CCM key length in setkey
  crypto: qat - drop input parameter from adf_enable_aer()
  crypto: qat - fix function parameters descriptions
  crypto: atmel-tdes - use semicolons rather than commas to separate statements
  crypto: drivers - use semicolons rather than commas to separate statements
  hwrng: mxc-rnga - use semicolons rather than commas to separate statements
  hwrng: iproc-rng200 - use semicolons rather than commas to separate statements
  hwrng: stm32 - use semicolons rather than commas to separate statements
  crypto: xor - use ktime for template benchmarking
  crypto: xor - defer load time benchmark to a later time
  crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num'
  crypto: hisilicon/zip - fix the return value when device is busy
  crypto: hisilicon/zip - fix zero length input in GZIP decompress
  crypto: hisilicon/zip - fix the uncleared debug registers
  lib/mpi: Fix unused variable warnings
  crypto: x86/poly1305 - Remove assignments with no effect
  hwrng: npcm - modify readl to readb
  ...

This page took 0.178796 seconds and 4 git commands to generate.