Git Repo - linux.git/log

mm/swapfile.c: omit a duplicate code by compare tmp and max first

There are two duplicate pieces of code that handle the case when no swap
entry is available. To avoid this, we can compare tmp and max first and let
the second guard do its job.

No functional change is expected.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: tmp is always smaller than max

If tmp were greater than or equal to max, we would already have jumped to
new_cluster, so tmp is always smaller than max at this point.

Return true directly.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: found_free could be represented by (tmp < max)

It is not necessary to use the variable found_free to record the status.
Checking tmp against max is enough.
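
The idea can be sketched in plain C. The helper below is hypothetical (it
is not the kernel's scan code, and its name and signature are assumptions):
it advances tmp through a usage array and reports success with tmp < max
instead of a separate flag.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the cluster scan: advance *tmp until a free
 * slot (value 0) is found or the end of the range is reached.
 */
static bool scan_for_free(const unsigned char *map, size_t *tmp, size_t max)
{
	while (*tmp < max) {
		if (map[*tmp] == 0)
			break;		/* free slot found at index *tmp */
		(*tmp)++;
	}

	/* No found_free flag: the loop only stops early on a hit. */
	return *tmp < max;
}
```

Because the loop leaves tmp equal to max only when nothing was found, the
comparison carries the same information the flag would.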

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: remove the extra check in scan_swap_map_slots()

scan_swap_map_slots() is only called by scan_swap_map() and
get_swap_pages(). Both ensure that nr does not exceed SWAP_BATCH, so the
check is redundant.

Remove it.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: simplify the calculation of n_goal

Use min3() to simplify the comparison and make it more self-explaining.
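
A user-space sketch of the kind of simplification meant here. The operand
names (n_goal, avail_pgs), the SWAP_BATCH value and the helper min3_long()
are assumptions for illustration; the kernel supplies its own min3() macro.

```c
/* Assumed value, for illustration only. */
#define SWAP_BATCH 64

/* Stand-in for the kernel's min3() macro, restricted to long. */
static long min3_long(long a, long b, long c)
{
	long m = a < b ? a : b;

	return m < c ? m : c;
}

/* Before: two chained comparisons clamp n_goal. */
static long n_goal_chained(long n_goal, long avail_pgs)
{
	if (n_goal > SWAP_BATCH)
		n_goal = SWAP_BATCH;
	if (n_goal > avail_pgs)
		n_goal = avail_pgs;
	return n_goal;
}

/* After: a single three-way minimum states the intent directly. */
static long n_goal_min3(long n_goal, long avail_pgs)
{
	return min3_long(n_goal, SWAP_BATCH, avail_pgs);
}
```

Both forms return the same value for any input; the second makes it obvious
that the result is bounded by all three quantities.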

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: remove the unnecessary goto for SSD case

There are now redundant gotos in the SSD case. In these two places, we can
let the code fall through to the correct label instead of explicitly
jumping to it.

Let's remove them for better readability.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Tim Chen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: explicitly show ssd/non-ssd is handled mutually exclusive

The code shows that, for an ssd, it jumps to a specific label and skips the
following code for non-ssd.

Let's use "else if" to explicitly show the mutual exclusion of
ssd/non-ssd and reduce ambiguity.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Tim Chen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile.c: offset is only used when there is more slots

scan_swap_map_slots() iterates over the swap_map[] array to find an
available swap entry.  After several optimizations, e.g.  for the ssd
case, the logic of this function has become hard to follow.

This patchset tries to clean up the logic a little:

  * show that the ssd/non-ssd cases are handled mutually exclusively
  * remove some unnecessary gotos for the ssd case

This patch (of 3):

When si->cluster_nr is zero, the function reaches done and returns.  The
incremented offset is not used any more.  This means we can move the
offset increment into the if clause.

This opens up a further code cleanup possibility.

Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Tim Chen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: swap: properly update readahead statistics in unuse_pte_range()

In unuse_pte_range() we blindly swap-in pages without checking if the
swap entry is already present in the swap cache.

By doing this, the hit/miss ratio used by the swap readahead heuristic
is not properly updated and this leads to non-optimal performance during
swapoff.

Tracing the distribution of the readahead size returned by the swap
readahead heuristic during swapoff shows that a small readahead size is
used most of the time as if we had only misses (this happens both with
cluster and vma readahead), for example:

r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
        COUNT      EVENT
        36948      $retval = 8
        44151      $retval = 4
        49290      $retval = 1
        527771     $retval = 2

Checking if the swap entry is present in the swap cache, instead, allows
the readahead statistics to be updated properly, and the heuristic behaves
better during swapoff, selecting a bigger readahead size:

r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
        COUNT      EVENT
        1618       $retval = 1
        4960       $retval = 2
        41315      $retval = 4
        103521     $retval = 8
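
Why a run of apparent misses keeps the window small can be illustrated with
a simplified model. The function below is a hypothetical sketch, not the
kernel's swapin_nr_pages(); its name, parameters and rounding rule are
assumptions chosen to produce windows of 1, 2, 4 and 8 pages like those in
the traces above.

```c
/*
 * Simplified, hypothetical model of a hit-driven readahead window.
 *   hits:      readahead pages that were later found in the cache
 *   adjacent:  non-zero if this fault is next to the previous one
 *   max_pages: upper bound on the window
 */
static unsigned int ra_window(unsigned int hits, int adjacent,
			      unsigned int max_pages)
{
	unsigned int pages = hits + 2;

	if (pages == 2) {
		/* No hits: read 2 pages only for sequential-looking faults. */
		if (!adjacent)
			pages = 1;
	} else {
		/* Some hits: round up to a power of two, starting at 4. */
		unsigned int roundup = 4;

		while (roundup < pages)
			roundup <<= 1;
		pages = roundup;
	}

	return pages > max_pages ? max_pages : pages;
}
```

In this model a caller that never records a hit can only ever get 1 or 2
pages, which matches the shape of the first distribution; once hits are
counted, the window grows towards the maximum.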

In terms of swapoff performance the result is the following:

Testing environment
===================

- Host:
   CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
   HDD: PC401 NVMe SK hynix 512GB
   MEM: 16GB

- Guest (kvm):
   8GB of RAM
   virtio block driver
   16GB swap file on ext4 (/swapfile)

Test case
=========
- allocate 85% of memory
- `systemctl hibernate` to force all the pages to be swapped-out to the
   swap file
- resume the system
- measure the time that swapoff takes to complete:
   # /usr/bin/time swapoff /swapfile

Result (swapoff time)
======
                  5.6 vanilla   5.6 w/ this patch
                  -----------   -----------------
cluster-readahead      22.09s              12.19s
    vma-readahead      18.20s              15.33s

Conclusion
==========

The specific use case this patch is addressing is to improve swapoff
performance in cloud environments when a VM has been hibernated, resumed
and all the memory needs to be forced back to RAM by disabling swap.

This change makes better use of the readahead heuristic during swapoff,
which speeds up the resume process of such VMs.

[[email protected]: update changelog]
Link: http://lkml.kernel.org/r/20200418084705.GA147642@xps-13
Signed-off-by: Andrea Righi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Anchal Agarwal <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Vineeth Remanan Pillai <[email protected]>
Cc: Kelley Nielsen <[email protected]>
Link: http://lkml.kernel.org/r/20200416180132.GB3352@xps-13
Signed-off-by: Linus Torvalds <[email protected]>

mm/swap_state: fix a data race in swapin_nr_pages

"prev_offset" is a static variable in swapin_nr_pages() that can be
accessed concurrently with only mmap_sem held in read mode as noticed by
KCSAN,

BUG: KCSAN: data-race in swap_cluster_readahead / swap_cluster_readahead

write to 0xffffffff92763830 of 8 bytes by task 14795 on cpu 17:
  swap_cluster_readahead+0x2a6/0x5e0
  swapin_readahead+0x92/0x8dc
  do_swap_page+0x49b/0xf20
  __handle_mm_fault+0xcfb/0xd70
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x715
  page_fault+0x34/0x40

1 lock held by (dnf)/14795:
  #0: ffff897bd2e98858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715
  do_user_addr_fault at arch/x86/mm/fault.c:1405
  (inlined by) do_page_fault at arch/x86/mm/fault.c:1535
irq event stamp: 83493
count_memcg_event_mm+0x1a6/0x270
count_memcg_event_mm+0x119/0x270
__do_softirq+0x365/0x589
irq_exit+0xa2/0xc0

read to 0xffffffff92763830 of 8 bytes by task 1 on cpu 22:
  swap_cluster_readahead+0xfd/0x5e0
  swapin_readahead+0x92/0x8dc
  do_swap_page+0x49b/0xf20
  __handle_mm_fault+0xcfb/0xd70
  handle_mm_fault+0xfc/0x2f0
  do_page_fault+0x263/0x715
  page_fault+0x34/0x40

1 lock held by systemd/1:
  #0: ffff897c38f14858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715
irq event stamp: 43530289
count_memcg_event_mm+0x1a6/0x270
count_memcg_event_mm+0x119/0x270
__do_softirq+0x365/0x589
irq_exit+0xa2/0xc0
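
The message above does not spell out how the race is addressed. As a
user-space illustration only, an intentionally unsynchronized
function-local static can be accessed with relaxed C11 atomics, which
guarantee tear-free loads and stores without imposing ordering; in kernel
code the READ_ONCE()/WRITE_ONCE() macros serve a comparable purpose. The
function name below is hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical analogue of a function-local static shared by all callers. */
static unsigned long record_offset(unsigned long offset)
{
	static _Atomic unsigned long prev_offset;
	unsigned long prev;

	/* Relaxed accesses: no ordering needed, only tear-free load/store. */
	prev = atomic_load_explicit(&prev_offset, memory_order_relaxed);
	atomic_store_explicit(&prev_offset, offset, memory_order_relaxed);

	return prev;
}
```

The load and store are still two separate steps, so concurrent callers may
observe each other's values in any order; the annotation only documents
that this is tolerated for a heuristic input.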

Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Hugh Dickins <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/swapfile: use list_{prev,next}_entry() instead of open-coding

Use list_{prev,next}_entry() instead of list_entry() for better
code readability.
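
For illustration, the difference between the two forms can be shown with
user-space copies of the list helpers. The struct item type and the two
wrapper functions are hypothetical; __typeof__ is a GCC/Clang extension.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* User-space copies of the kernel helpers, shown for illustration. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_next_entry(pos, member) \
	list_entry((pos)->member.next, __typeof__(*(pos)), member)
#define list_prev_entry(pos, member) \
	list_entry((pos)->member.prev, __typeof__(*(pos)), member)

/* Hypothetical element type. */
struct item {
	int id;
	struct list_head list;
};

/* Open-coded form: the type and member are both spelled out. */
static struct item *next_open_coded(struct item *it)
{
	return list_entry(it->list.next, struct item, list);
}

/* Helper form: the type is inferred from the pointer. */
static struct item *next_with_helper(struct item *it)
{
	return list_next_entry(it, list);
}
```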

Signed-off-by: chenqiwu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Baoquan He <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/gup.c: further document vma_permits_fault()

Describe the caller's responsibilities when passing
FAULT_FLAG_ALLOW_RETRY.

Link: http://lkml.kernel.org/r/1586915606.5647.5.camel@mtkswgap22
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ivtv: convert get_user_pages() --> pin_user_pages()

This code was using get_user_pages*(), in a "Case 2" scenario
(DMA/RDMA), using the categorization from [1]. That means that it's
time to convert the get_user_pages*() + put_page() calls to
pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small part
of fixing a long-standing disconnect between pinning pages, and file
systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/gup: introduce pin_user_pages_unlocked

Introduce pin_user_pages_unlocked(), which is nearly identical to the
get_user_pages_unlocked() that it wraps, except that it sets FOLL_PIN
and rejects FOLL_GET.
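
The shape of such a wrapper can be sketched in user space. Everything
below is an illustration under assumptions: the flag values are
placeholders, gup_unlocked_stub() stands in for get_user_pages_unlocked(),
and returning -EINVAL is one plausible way to "reject" a flag.

```c
#include <errno.h>

/* Illustrative flag values; the real ones live in the kernel headers. */
#define FOLL_GET 0x04u
#define FOLL_PIN 0x40000u

/* Stub for the wrapped function: returns the flags it was given. */
static long gup_unlocked_stub(unsigned int gup_flags)
{
	return (long)gup_flags;
}

static long pin_unlocked_sketch(unsigned int gup_flags)
{
	/* FOLL_GET and FOLL_PIN are mutually exclusive: reject FOLL_GET. */
	if (gup_flags & FOLL_GET)
		return -EINVAL;

	/* Callers never pass FOLL_PIN themselves; the wrapper adds it. */
	gup_flags |= FOLL_PIN;
	return gup_unlocked_stub(gup_flags);
}
```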

Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Andy Walls <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/gup.c: update the documentation

This patch is an attempt to update the documentation.

- Add/remove the extra * based on the type of function (static or global).

- Add description for functions and their input arguments.

[[email protected]: s@/*@/**@]
Signed-off-by: Souptick Joarder <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead

After an NFS page has been written it is considered "unstable" until a
COMMIT request succeeds.  If the COMMIT fails, the page will be
re-written.

These "unstable" pages are currently accounted as "reclaimable", either
in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
'reclaimable' count.  This might have made sense when sending the COMMIT
required a separate action by the VFS/MM (e.g.  releasepage() used to
send a COMMIT).  However now that all writes generated by ->writepages()
will automatically be followed by a COMMIT (since commit 919e3bd9a875
("NFS: Ensure we commit after writeback is complete")) it makes more
sense to treat them as writeback pages.

So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in
NR_WRITEBACK and WB_WRITEBACK.

A particular effect of this change is that when
wb_check_background_flush() calls wb_over_bg_threshold(), the latter
will report 'true' a lot less often as the 'unstable' pages are no
longer considered 'dirty' (as there is nothing that writeback can do
about them anyway).

Currently wb_check_background_flush() will trigger writeback to NFS even
when there are relatively few dirty pages (if there are lots of unstable
pages).  This can result in small writes going to the server (10s of
Kilobytes rather than a Megabyte), which hurts throughput.  With this
patch, there are fewer writes, each of which is larger on average.

Where the NR_UNSTABLE_NFS count was included in statistics
virtual-files, the entry is retained, but the value is hard-coded as
zero.  Static trace points and warning printks which mentioned this
counter no longer report it.

[[email protected]: re-layout comment]
[[email protected]: fix printk warning]
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Trond Myklebust <[email protected]>
Acked-by: Michal Hocko <[email protected]> [mm]
Cc: Christoph Hellwig <[email protected]>
Cc: Chuck Lever <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE

PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the
loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a
daemon needs to write to one bdi (the final bdi) in order to free up
writes queued to another bdi (the client bdi).

The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty
pages, so that it can still dirty pages after other processes have been
throttled.  The purpose of this is to avoid the deadlock that happens when
the PF_LESS_THROTTLE process must write for any dirty pages to be freed,
but it is being throttled and cannot write.

This approach was designed when all threads were blocked equally,
regardless of which device they were writing to, or how fast it was.
Since that time the writeback algorithm has changed substantially with
different threads getting different allowances based on non-trivial
heuristics.  This means the simple "add 25%" heuristic is no longer
reliable.

The important issue is not that the daemon needs a *larger* dirty page
allowance, but that it needs a *private* dirty page allowance, so that
dirty pages for the "client" bdi that it is helping to clear (the bdi
for an NFS filesystem or loop block device etc) do not affect the
throttling of the daemon writing to the "final" bdi.

This patch changes the heuristic so that the task is not throttled when
the bdi it is writing to has a dirty page count below (or equal
to) the free-run threshold for that bdi.  This ensures it will always be
able to have some pages in flight, and so will not deadlock.

In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might
still be throttled by global threshold, but that is acceptable as it is
only the deadlock state that is interesting for this flag.

This approach of "only throttle when target bdi is busy" is consistent
with the other use of PF_LESS_THROTTLE in current_may_throttle(), where
it causes attention to be focussed only on the target bdi.

So this patch
- renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE,
- removes the 25% bonus that that flag gives, and
- If PF_LOCAL_THROTTLE is set, don't delay at all unless the
   global and the local free-run thresholds are exceeded.
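
The resulting decision can be sketched as a small predicate. This is a
hypothetical model rather than the code in the writeback path: the struct,
its field names and the function name are assumptions, and it presumes
that no task is delayed while the global count is at or below the global
free-run threshold.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the counters the decision depends on. */
struct dirty_state {
	unsigned long global_dirty;	/* dirty pages system-wide */
	unsigned long global_freerun;	/* global free-run threshold */
	unsigned long bdi_dirty;	/* dirty pages on the target bdi */
	unsigned long bdi_freerun;	/* free-run threshold of that bdi */
};

/* Returns true if the writer may be delayed, false if it runs freely. */
static bool may_throttle(const struct dirty_state *s, bool local_throttle)
{
	/* At or below the global free-run threshold nobody is delayed. */
	if (s->global_dirty <= s->global_freerun)
		return false;

	/* A PF_LOCAL_THROTTLE task also runs freely while its bdi is quiet. */
	if (local_throttle && s->bdi_dirty <= s->bdi_freerun)
		return false;

	return true;
}
```

The second test is what gives the daemon a private allowance: dirty pages
queued for other bdis raise global_dirty but leave bdi_dirty untouched.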

Note that previously realtime threads were treated the same as
PF_LESS_THROTTLE threads.  This patch does *not* change the behaviour
for real-time threads, so it is now different from the behaviour of nfsd
and loop tasks.  I don't know what is wanted for realtime.

[[email protected]: coding style fixes]
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Chuck Lever <[email protected]> [nfsd]
Cc: Christoph Hellwig <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Trond Myklebust <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/page-writeback.c: remove unused variable

Commit 64081362e8ff ("mm/page-writeback.c: fix range_cyclic writeback
vs writepages deadlock") left an unused variable; remove it.

Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/filemap.c: remove misleading comment

We no longer return 0 here and the comment doesn't tell us anything that
we don't already know (SIGBUS is a pretty good indicator that things
didn't work out).

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm_types.h: change set_page_private to inline function

Change it to an inline function so that callers pass arguments of the
proper type.  There is also no need for it to be a macro, per Andrew's
comment [1].

[1] https://lore.kernel.org/lkml/20200518221235.1fa32c38e5766113f78e3f0d@linux-foundation.org/
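
A minimal user-space sketch of the difference, assuming a cut-down struct
page with only the private field. The macro name set_page_private_macro is
made up here so that both forms can sit side by side.

```c
/* Minimal stand-in for struct page; only the field used here. */
struct page {
	unsigned long private;
};

/* Macro form: arguments are substituted textually and never type-checked. */
#define set_page_private_macro(page, v)	((page)->private = (v))

/* Inline form: the compiler checks both argument types at each call. */
static inline void set_page_private(struct page *page, unsigned long private)
{
	page->private = private;
}
```

With the inline function, passing something other than a struct page
pointer as the first argument is diagnosed against the prototype, whereas
the macro accepts any expression that happens to have a private member.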

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/migrate.c: call detach_page_private to cleanup code

We can clean up the code a little by calling detach_page_private here.

[[email protected]: use attach_page_private(), per Dave]
http://lkml.kernel.org/r/20200521225220 [email protected]
[[email protected]: clear PagePrivate]
Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

buffer_head.h: remove attach_page_buffers

All the callers have replaced attach_page_buffers with the new function
attach_page_private, so remove it.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Andreas Dilger <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

orangefs: use attach/detach_page_private

Now that the new pair of functions has been introduced, we can call them
to clean up the code in orangefs.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Mike Marshall <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Martin Brandenburg <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

ntfs: replace attach_page_buffers with attach_page_private

Call the new function since attach_page_buffers will be removed.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Anton Altaparmakov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

iomap: use attach/detach_page_private

Now that the new pair of functions has been introduced, we can call them
to clean up the code in iomap.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Chinner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

f2fs: use attach/detach_page_private

Now that the new pair of functions has been introduced, we can call them
to clean up the code in f2fs.h.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Chao Yu <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

fs/buffer.c: use attach/detach_page_private

Now that the new pair of functions has been introduced, we can call them
to clean up the code in buffer.c.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

btrfs: use attach/detach_page_private

Now that the new pair of functions has been introduced, we can call them
to clean up the code in btrfs.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Acked-by: David Sterba <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

md: remove __clear_page_buffers and use attach/detach_page_private

After the introduction of attach/detach_page_private in pagemap.h, we can
remove the duplicated code and call the new functions.

Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

include/linux/pagemap.h: introduce attach/detach_page_private

Patch series "Introduce attach/detach_page_private to cleanup code".

This patch (of 10):

The logic in attach_page_buffers and __clear_page_buffers is closely
paired, but

1. they are located in different files.

2. attach_page_buffers is implemented in buffer_head.h, so it can be
   used by other files.  But __clear_page_buffers is a static function in
   buffer.c, so other potential users can't call it; md-bitmap
   even copied the function.

So, introduce the new attach/detach_page_private to replace them.  With
the new pair of functions, we will remove the usage of attach_page_buffers
and __clear_page_buffers in the next patches.  Thanks for suggestions about
the function name from Alexander Viro, Andreas Grünbacher, Christoph
Hellwig and Matthew Wilcox.
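
What such a pair does can be modelled in user space. The struct below is a
hypothetical stand-in for the three pieces of page state involved (a
reference count, the private flag and the private value); the assumption
that attaching takes a reference and detaching drops it is not stated in
this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the page state involved. */
struct page {
	int refcount;
	bool has_private;	/* models the PagePrivate flag */
	unsigned long private;
};

/* Attach: take a reference, store the data, set the flag. */
static void attach_page_private(struct page *page, void *data)
{
	page->refcount++;
	page->private = (unsigned long)data;
	page->has_private = true;
}

/* Detach: undo the three steps and hand the data back to the caller. */
static void *detach_page_private(struct page *page)
{
	void *data = (void *)page->private;

	if (!page->has_private)
		return NULL;

	page->has_private = false;
	page->private = 0;
	page->refcount--;
	return data;
}
```

Keeping the two halves next to each other makes the symmetry visible, and
returning the old data lets a caller free it without a separate lookup.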

Suggested-by: Matthew Wilcox <[email protected]>
Signed-off-by: Guoqing Jiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: "Darrick J. Wong" <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andreas Gruenbacher <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Yafang Shao <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Anton Altaparmakov <[email protected]>
Cc: Mike Marshall <[email protected]>
Cc: Martin Brandenburg <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Dave Chinner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

iomap: convert from readpages to readahead

Use the new readahead operation in iomap. Convert XFS and ZoneFS to use
it.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

fuse: convert from readpages to readahead

Implement the new readahead operation in fuse by using __readahead_batch()
to fill the array of pages in fuse_args_pages directly. This lets us
inline fuse_readpages_fill() into fuse_readahead().

[[email protected]: build fix]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Acked-by: Miklos Szeredi <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

f2fs: pass the inode to f2fs_mpage_readpages

This function now only uses the mapping argument to look up the inode, and
both callers already have the inode, so just pass the inode instead of the
mapping.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Acked-by: Jaegeuk Kim <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

f2fs: convert from readpages to readahead

Use the new readahead operation in f2fs.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Acked-by: Jaegeuk Kim <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

ext4: pass the inode to ext4_mpage_readpages

This function now only uses the mapping argument to look up the inode, and
both callers already have the inode, so just pass the inode instead of the
mapping.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

ext4: convert from readpages to readahead

Use the new readahead operation in ext4.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

erofs: convert compressed files from readpages to readahead

Use the new readahead operation in erofs.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Acked-by: Gao Xiang <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

erofs: convert uncompressed files from readpages to readahead

Use the new readahead operation in erofs.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Acked-by: Gao Xiang <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

btrfs: convert from readpages to readahead

Implement the new readahead method in btrfs using the new
readahead_page_batch() function.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

fs: convert mpage_readpages to mpage_readahead

Implement the new readahead aop and convert all callers (block_dev,
exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
reiserfs & udf).

The callers are all trivial except for GFS2 & OCFS2.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Junxiao Bi <[email protected]> # ocfs2
Reviewed-by: Joseph Qi <[email protected]> # ocfs2
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: use memalloc_nofs_save in readahead path

Ensure that memory allocations in the readahead path do not attempt to
reclaim file-backed pages, which could lead to a deadlock. It is
possible, though unlikely, that this is the root cause of a problem
observed by Cong Wang.
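
The save/restore scope can be pictured with a small userspace model. This
is only a sketch: model_task_flags, MODEL_NOFS and the model_* functions
are hypothetical stand-ins for the per-task flag handling, not the
kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task flag word; not the kernel's. */
#define MODEL_NOFS 0x1u
unsigned int model_task_flags;

/* Enter a NOFS scope and return the previous state of the flag. */
unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_NOFS;

	model_task_flags |= MODEL_NOFS;
	return old;
}

/* Leave the scope by putting back exactly what model_nofs_save() returned. */
void model_nofs_restore(unsigned int old)
{
	assert((old & ~MODEL_NOFS) == 0);
	model_task_flags = (model_task_flags & ~MODEL_NOFS) | old;
}

/* An allocation may recurse into filesystem reclaim only outside a scope. */
bool model_may_enter_fs(void)
{
	return !(model_task_flags & MODEL_NOFS);
}

/* Models a readahead-like section; reports whether fs reclaim was allowed inside. */
bool model_readahead_section(void)
{
	unsigned int saved = model_nofs_save();
	bool allowed_inside = model_may_enter_fs();

	model_nofs_restore(saved);
	return allowed_inside;
}
```

Because the old state is returned and restored, the scope nests: an inner
section does not re-enable filesystem reclaim for an outer caller that has
already entered a scope of its own.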

Reported-by: Cong Wang <[email protected]>
Suggested-by: Michal Hocko <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: document why we don't set PageReadahead

If the page is already in cache, we don't set PageReadahead on it.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: add page_cache_readahead_unbounded

ext4 and f2fs have duplicated the guts of the readahead code so they can
read past i_size. Instead, separate out the guts of the readahead code
so they can call it directly.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Eric Biggers <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: move end_index check out of readahead loop

By reducing nr_to_read, we can eliminate this check from inside the loop.
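
A minimal sketch of the idea, assuming end_index is the index of the last
page of the file (inclusive). clamp_nr_to_read() is a hypothetical helper
written for illustration, not a function from the patch:

```c
#include <assert.h>

/*
 * Clamp a readahead request so that it stops at end_index.
 * Returns the number of pages to read, or 0 if index is already
 * past the last page.
 */
unsigned long clamp_nr_to_read(unsigned long index, unsigned long nr_to_read,
			       unsigned long end_index)
{
	if (index > end_index)
		return 0;
	/* end_index - index cannot underflow after the check above. */
	if (nr_to_read > end_index - index)
		nr_to_read = end_index - index + 1;
	assert(nr_to_read == 0 || index + nr_to_read - 1 <= end_index);
	return nr_to_read;
}
```

With the count clamped once up front, a loop over 'index + i' for i below
the returned value never has to compare against end_index again.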

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: add readahead address space operation

This replaces ->readpages with a saner interface:
- Return void instead of an ignored error code.
- Page cache is already populated with locked pages when ->readahead
   is called.
- New arguments can be passed to the implementation without changing
   all the filesystems that use a common helper function like
   mpage_readahead().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: put readahead pages in cache earlier

When populating the page cache for readahead, mappings that use
->readpages must populate the page cache themselves as the pages are
passed on a linked list which would normally be used for the page
cache's LRU. For mappings that use ->readpage or the upcoming
->readahead method, we can put the pages into the page cache as soon as
they're allocated, which solves a race between readahead and direct IO.
It also lets us remove the gfp argument from read_pages().

Use the new readahead_page() API to implement the repeated calls to
->readpage(), just like most filesystems will.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: remove 'page_offset' from readahead loop

Replace the page_offset variable with 'index + i'.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: rename readahead loop variable to 'i'

Change the type of page_idx to unsigned long, and rename it -- it's just
a loop counter, not a page index.

Suggested-by: John Hubbard <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: rename various 'offset' parameters to 'index'

The word 'offset' is used ambiguously to mean 'byte offset within a
page', 'byte offset from the start of the file' and 'page offset from
the start of the file'.

Use 'index' to mean 'page offset from the start of the file' throughout
the readahead code.
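
The three meanings can be told apart with a little arithmetic. The sketch
below assumes 4 KiB pages for illustration; MODEL_PAGE_SHIFT and the
helper names are hypothetical, not kernel identifiers.

```c
#include <assert.h>

/* Assumption for illustration: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* 'index': page offset from the start of the file. */
unsigned long pos_to_index(unsigned long long pos)
{
	return (unsigned long)(pos >> MODEL_PAGE_SHIFT);
}

/* Byte offset within the page that contains pos. */
unsigned long pos_to_offset_in_page(unsigned long long pos)
{
	return (unsigned long)(pos & (MODEL_PAGE_SIZE - 1));
}

/* Byte offset from the start of the file of the first byte of page 'index'. */
unsigned long long index_to_pos(unsigned long index)
{
	return (unsigned long long)index << MODEL_PAGE_SHIFT;
}

/* Recombine an index and an in-page offset into a file position. */
unsigned long long join_pos(unsigned long index, unsigned long offset_in_page)
{
	assert(offset_in_page < MODEL_PAGE_SIZE);
	return index_to_pos(index) + offset_in_page;
}
```

Under that assumption, byte 10000 of a file lives in the page with index 2,
at byte offset 1808 within that page.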

[ We should probably rename the 'pgoff_t' type to 'pgidx_t' too - Linus ]

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: use readahead_control to pass arguments

In this patch, readahead_control is only used between
__do_page_cache_readahead() and read_pages(), but its use will be
extended in upcoming patches. The read_pages() function becomes aops
centric, as this makes the most sense by the end of the patchset.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: add new readahead_control API

Filesystems which implement the upcoming ->readahead method will get
their pages by calling readahead_page() or readahead_page_batch().
These functions support large pages, even though none of the filesystems
to be converted do yet.
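
The iteration pattern can be sketched in userspace. Everything below is a
simplified, hypothetical model (struct model_page, struct model_rac and the
model_* functions are not the kernel's definitions); it only illustrates
how a cursor can hand out pages one at a time while stepping over a large
page in a single move.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct model_page {
	unsigned long index;     /* page offset from the start of the file */
	unsigned int nr_pages;   /* 1 for a normal page, more for a large page */
};

struct model_rac {
	struct model_page *cache;  /* toy "page cache": array indexed by page index */
	unsigned long index;       /* first page index not yet consumed */
	unsigned int nr_pages;     /* pages remaining in the request */
	unsigned int batch_count;  /* pages covered by the last page returned */
};

/* Return the next page of the request, or NULL when it is exhausted. */
struct model_page *model_readahead_page(struct model_rac *rac)
{
	struct model_page *page;

	assert(rac->batch_count <= rac->nr_pages);
	/* Account for the page handed out by the previous call. */
	rac->nr_pages -= rac->batch_count;
	rac->index += rac->batch_count;
	if (rac->nr_pages == 0) {
		rac->batch_count = 0;
		return NULL;
	}
	page = &rac->cache[rac->index];
	rac->batch_count = page->nr_pages;
	return page;
}

/* A toy readahead method: returns void and just counts what it was given. */
void model_readahead(struct model_rac *rac, unsigned int *pages_seen)
{
	struct model_page *page;

	while ((page = model_readahead_page(rac)) != NULL)
		*pages_seen += page->nr_pages;
}
```

In this model the filesystem never builds or walks a list; it simply keeps
asking for the next page until it gets NULL.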

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: move readahead nr_pages check into read_pages

Simplify the callers by moving the check for nr_pages and the BUG_ON
into read_pages().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: ignore return value of ->readpages

We used to assign the return value to a variable, which we then ignored.
Remove the pretence of caring.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: return void from various readahead functions

ondemand_readahead has two callers, neither of which use the return
value. That means that both ra_submit and __do_page_cache_readahead()
can return void, and we don't need to worry that a present page in the
readahead window causes us to return a smaller nr_pages than we ought to
have.

Similarly, no caller uses the return value from
force_page_cache_readahead().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm: move readahead prototypes from mm.h

Patch series "Change readahead API", v11.

This series adds a readahead address_space operation to replace the
readpages operation.  The key difference is that pages are added to the
page cache as they are allocated (and then looked up by the filesystem)
instead of passing them on a list to the readpages operation and having
the filesystem add them to the page cache.  It's a net reduction in code
for each implementation, more efficient than walking a list, and solves
the direct-write vs buffered-read problem reported by yu kuai at
http://lkml.kernel.org/r/20200116063601 [email protected]

The only unconverted filesystems are those which use fscache.  Their
conversion is pending Dave Howells' rewrite which will make the
conversion substantially easier.  This should be completed by the end of
the year.

I want to thank the reviewers/testers; Dave Chinner, John Hubbard, Eric
Biggers, Johannes Thumshirn, Dave Sterba, Zi Yan, Christoph Hellwig and
Miklos Szeredi have done a marvellous job of providing constructive
criticism.

These patches pass an xfstests run on ext4, xfs & btrfs with no
regressions that I can tell (some of the tests seem a little flaky
before and remain flaky afterwards).

This patch (of 25):

The readahead code is part of the page cache so should be found in the
pagemap.h file.  force_page_cache_readahead is only used within mm, so
move it to mm/internal.h instead.  Remove the parameter names where they
add no value, and rename the ones which were actively misleading.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm, dump_page(): do not crash with invalid mapping pointer

We have seen the following problem on a RPi4 with 1G RAM:

    BUG: Bad page state in process systemd-hwdb  pfn:35601
    page:ffff7e0000d58040 refcount:15 mapcount:131221 mapping:efd8fe765bc80080 index:0x1 compound_mapcount: -32767
    Unable to handle kernel paging request at virtual address efd8fe765bc80080
    Mem abort info:
      ESR = 0x96000004
      Exception class = DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
    Data abort info:
      ISV = 0, ISS = 0x00000004
      CM = 0, WnR = 0
    [efd8fe765bc80080] address between user and kernel address ranges
    Internal error: Oops: 96000004 [#1] SMP
    Modules linked in: btrfs libcrc32c xor xor_neon zlib_deflate raid6_pq mmc_block xhci_pci xhci_hcd usbcore sdhci_iproc sdhci_pltfm sdhci mmc_core clk_raspberrypi gpio_raspberrypi_exp pcie_brcmstb bcm2835_dma gpio_regulator phy_generic fixed sg scsi_mod efivarfs
    Supported: No, Unreleased kernel
    CPU: 3 PID: 408 Comm: systemd-hwdb Not tainted 5.3.18-8-default #1 SLE15-SP2 (unreleased)
    Hardware name: raspberrypi rpi/rpi, BIOS 2020.01 02/21/2020
    pstate: 40000085 (nZcv daIf -PAN -UAO)
    pc : __dump_page+0x268/0x368
    lr : __dump_page+0xc4/0x368
    sp : ffff000012563860
    x29: ffff000012563860 x28: ffff80003ddc4300
    x27: 0000000000000010 x26: 000000000000003f
    x25: ffff7e0000d58040 x24: 000000000000000f
    x23: efd8fe765bc80080 x22: 0000000000020095
    x21: efd8fe765bc80080 x20: ffff000010ede8b0
    x19: ffff7e0000d58040 x18: ffffffffffffffff
    x17: 0000000000000001 x16: 0000000000000007
    x15: ffff000011689708 x14: 3030386362353637
    x13: 6566386466653a67 x12: 6e697070616d2031
    x11: 32323133313a746e x10: 756f6370616d2035
    x9 : ffff00001168a840 x8 : ffff00001077a670
    x7 : 000000000000013d x6 : ffff0000118a43b5
    x5 : 0000000000000001 x4 : ffff80003dd9e2c8
    x3 : ffff80003dd9e2c8 x2 : 911c8d7c2f483500
    x1 : dead000000000100 x0 : efd8fe765bc80080
    Call trace:
     __dump_page+0x268/0x368
     bad_page+0xd4/0x168
     check_new_page_bad+0x80/0xb8
     rmqueue_bulk.constprop.26+0x4d8/0x788
     get_page_from_freelist+0x4d4/0x1228
     __alloc_pages_nodemask+0x134/0xe48
     alloc_pages_vma+0x198/0x1c0
     do_anonymous_page+0x1a4/0x4d8
     __handle_mm_fault+0x4e8/0x560
     handle_mm_fault+0x104/0x1e0
     do_page_fault+0x1e8/0x4c0
     do_translation_fault+0xb0/0xc0
     do_mem_abort+0x50/0xb0
     el0_da+0x24/0x28
    Code: f9401025 8b8018a0 9a851005 17ffffca (f94002a0)

Besides the underlying issue with page->mapping containing a bogus value
for some reason, we can see that __dump_page() crashed by trying to read
the pointer at mapping->host, turning a recoverable warning into a full
Oops.

When a page is reported as being in a bad state for some reason, the
pointers in it should not be trusted blindly.

So this patch treats all data in __dump_page() that depends on
page->mapping as lava, using probe_kernel_read_strict().  Ideally this
would include the dentry->d_parent recursively, but that would mean
changing the printk handler for %pd.  The chances of reaching the dentry
printing part with an initially bogus mapping pointer should be rather
low, though.

Also prefix the printing of mapping->a_ops with a description of what is
being printed.  If the value is bogus, %ps will print the raw value
instead of the symbol name, and then it is not obvious at all that it is
printing a_ops.

Reported-by: Petr Tesarik <[email protected]>
Signed-off-by: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: John Hubbard <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

Documentation/vm/slub.rst: s/Toggle/Enable/

"toggle" means to change a boolean thing's state. This operation
doesn't do that - it sets it to "true".

Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Rafael Aquini <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Pekka Enberg <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

mm/slub: fix stack overruns with SLUB_STATS

There is no need to copy SLUB_STATS items from root memcg cache to new
memcg cache copies.  Doing so could result in stack overruns because the
store function only accepts 0 to clear the stat and returns an error for
everything else while the show method would print out the whole stat.

The mismatch between the lengths returned by the show and store methods
then takes effect in memcg_propagate_slab_attrs():

	else if (root_cache->max_attr_size < ARRAY_SIZE(mbuf))
		buf = mbuf;

max_attr_size is only 2 from slab_attr_store(), so mbuf[64] is later
used in show_stat(), where a bunch of sprintf() calls would overrun the
stack variable.  Fix it by always allocating a page-sized buffer for
show_stat() if SLUB_STATS=y, which should only be used for debugging.
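
To see why 64 bytes is not enough, here is a userspace sketch of a
per-CPU statistics line. model_show_stat() and its output format are
assumptions made for illustration, not the kernel's show_stat(); the
point is only that the needed length grows with the number of CPUs.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical model of a per-CPU statistics line: the total followed by
 * one " C<cpu>=<count>" field per CPU with a non-zero count.
 * Writes at most size bytes (including the terminating NUL) and returns
 * the length the full line needs, like snprintf() does.
 */
size_t model_show_stat(char *buf, size_t size, const unsigned int *data, int ncpus)
{
	unsigned long sum = 0;
	size_t len = 0;
	int cpu, n;

	for (cpu = 0; cpu < ncpus; cpu++)
		sum += data[cpu];

	n = snprintf(buf, size, "%lu", sum);
	assert(n >= 0);
	len += (size_t)n;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (!data[cpu])
			continue;
		/* Once the buffer is full, only keep counting the needed length. */
		if (len < size)
			n = snprintf(buf + len, size - len, " C%d=%u", cpu, data[cpu]);
		else
			n = snprintf(NULL, 0, " C%d=%u", cpu, data[cpu]);
		assert(n >= 0);
		len += (size_t)n;
	}
	return len;
}
```

In this model, 16 CPUs with a count of 1000000 each already need 190
bytes, so an unbounded sprintf() into a 64-byte stack buffer would write
far past its end.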

  # echo 1 > /sys/kernel/slab/fs_cache/shrink
  BUG: KASAN: stack-out-of-bounds in number+0x421/0x6e0
  Write of size 1 at addr ffffc900256cfde0 by task kworker/76:0/53251

  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
  Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
  Call Trace:
    number+0x421/0x6e0
    vsnprintf+0x451/0x8e0
    sprintf+0x9e/0xd0
    show_stat+0x124/0x1d0
    alloc_slowpath_show+0x13/0x20
    __kmem_cache_create+0x47a/0x6b0

  addr ffffc900256cfde0 is located in stack of task kworker/76:0/53251 at offset 0 in frame:
   process_one_work+0x0/0xb90

  this frame has 1 object:
   [32, 72) 'lockdep_map'

  Memory state around the buggy address:
   ffffc900256cfc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffffc900256cfd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >ffffc900256cfd80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                         ^
   ffffc900256cfe00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
   ffffc900256cfe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ==================================================================
  Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __kmem_cache_create+0x6ac/0x6b0
  Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
  Call Trace:
    __kmem_cache_create+0x6ac/0x6b0

Fixes: 107dab5c92d5 ("slub: slub-specific propagation changes")
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Glauber Costa <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

slub: remove kmalloc under list_lock from list_slab_objects() V2

list_slab_objects() is called when a slab is destroyed while objects are
still left in it, in order to list those objects in the syslog. This is
a pretty rare event.

And there it seems we take the list_lock and call kmalloc while holding
that lock.

Perform the allocation in free_partial() before the list_lock is taken.

Fixes: bbd7d57bfe852d9788bae5fb171c7edb4021d8ac ("slub: Potential stack overflow")
Signed-off-by: Christopher Lameter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Yu Zhao <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

slub: Remove userspace notifier for cache add/remove

I came across some unnecessary uevents once again, which reminded me of
this.  The patch seems to be lost in the leaves of the original
discussion [1], so resending.

[1] https://lore.kernel.org/r/alpine.DEB.2.21.2001281813130 [email protected]

Kmem caches are internal kernel structures so it is strange that
userspace notifiers would be needed.  And I am not aware of any use of
these notifiers.  These notifiers may just exist because in the initial
slub release the sysfs code was copied from another subsystem.

Signed-off-by: Christoph Lameter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Koutný <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

mm/slub.c: fix corrupted freechain in deactivate_slab()

slub_debug is able to fix a corrupted slab freelist/page.  However,
alloc_debug_processing() only checks the validity of the current and
next freepointer in the allocation path.  As a result, once some
objects have their freepointers corrupted, deactivate_slab() may lead to
a page fault.

The trace below is from a test kernel module run with
'slub_debug=PUF,kmalloc-128 slub_nomerge'.  The test module corrupts the
freepointer of one free object on purpose.  Unfortunately,
deactivate_slab() does not detect it when iterating the freechain.

  BUG: unable to handle page fault for address: 00000000123456f8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  ... ...
  RIP: 0010:deactivate_slab.isra.92+0xed/0x490
  ... ...
  Call Trace:
   ___slab_alloc+0x536/0x570
   __slab_alloc+0x17/0x30
   __kmalloc+0x1d9/0x200
   ext4_htree_store_dirent+0x30/0xf0
   htree_dirblock_to_tree+0xcb/0x1c0
   ext4_htree_fill_tree+0x1bc/0x2d0
   ext4_readdir+0x54f/0x920
   iterate_dir+0x88/0x190
   __x64_sys_getdents+0xa6/0x140
   do_syscall_64+0x49/0x170
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Therefore, this patch adds an extra consistency check in
deactivate_slab().  Once an object's freepointer is corrupted, all
following objects starting at this object are isolated.
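
The check can be illustrated with a toy freelist walk in userspace. This
is a sketch under stated assumptions: struct model_slab, the object size
and the model_* helpers are hypothetical and far simpler than SLUB; each
free object stores the address of the next free object in its first word,
with 0 ending the chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_OBJ_SIZE 128u
#define MODEL_NR_OBJS  8u

/* Hypothetical toy slab holding MODEL_NR_OBJS fixed-size objects. */
struct model_slab {
	unsigned char mem[MODEL_OBJ_SIZE * MODEL_NR_OBJS];
};

/* A pointer is valid if it is 0 or points at the start of an object in the slab. */
bool model_valid_pointer(const struct model_slab *slab, uintptr_t p)
{
	uintptr_t base = (uintptr_t)slab->mem;

	if (p == 0)
		return true;
	if (p < base || p >= base + sizeof(slab->mem))
		return false;
	return (p - base) % MODEL_OBJ_SIZE == 0;
}

uintptr_t model_get_next(uintptr_t obj)
{
	uintptr_t next;

	memcpy(&next, (const void *)obj, sizeof(next));
	return next;
}

void model_set_next(uintptr_t obj, uintptr_t next)
{
	memcpy((void *)obj, &next, sizeof(next));
}

/*
 * Walk the free chain from head and count the objects that can be kept.
 * An object whose next pointer is invalid cannot be trusted, so the walk
 * stops there: that object and everything after it are left out.
 */
size_t model_walk_freelist(const struct model_slab *slab, uintptr_t head)
{
	size_t recovered = 0;
	uintptr_t obj = head;

	assert(model_valid_pointer(slab, head));
	/* The bound on recovered also guards against a cyclic chain. */
	while (obj != 0 && recovered < MODEL_NR_OBJS) {
		uintptr_t next = model_get_next(obj);

		if (!model_valid_pointer(slab, next))
			break;
		recovered++;
		obj = next;
	}
	return recovered;
}
```

The important property is that the bogus pointer is validated before it
is followed, so the walk ends early instead of faulting.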

[[email protected]: fix build with CONFIG_SLAB_DEBUG=n]
Signed-off-by: Dongli Zhang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Joe Jin <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

usercopy: mark dma-kmalloc caches as usercopy caches

We have seen a "usercopy: Kernel memory overwrite attempt detected to
SLUB object 'dma-kmalloc-1k' (offset 0, size 11)!" error on s390x, as
IUCV uses kmalloc() with __GFP_DMA because of memory address
restrictions.  The issue has been discussed [2] and it has been noted
that if all the kmalloc caches are marked as usercopy, there's little
reason not to mark dma-kmalloc caches too.  The 'dma' part merely means
that __GFP_DMA is used to restrict memory address range.

As Jann Horn put it [3]:
"I think dma-kmalloc slabs should be handled the same way as normal
  kmalloc slabs. When a dma-kmalloc allocation is freshly created, it is
  just normal kernel memory - even if it might later be used for DMA -,
  and it should be perfectly fine to copy_from_user() into such
  allocations at that point, and to copy_to_user() out of them at the
  end. If you look at the places where such allocations are created, you
  can see things like kmemdup(), memcpy() and so on - all normal
  operations that shouldn't conceptually be different from usercopy in
  any relevant way."

Thus this patch marks the dma-kmalloc-* caches as usercopy.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1156053
[2] https://lore.kernel.org/kernel-hardening/bfca96db-bbd0-d958-7732-76e36c667c68@suse.cz/
[3] https://lore.kernel.org/kernel-hardening/CAG48ez1a4waGk9kB0WLaSbs4muSoK0AYAVk8=XYaKj4_+6e6Hg@mail.gmail.com/

Signed-off-by: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Christian Borntraeger <[email protected]>
Acked-by: Jiri Slaby <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christopher Lameter <[email protected]>
Cc: Julian Wiedmann <[email protected]>
Cc: Ursula Braun <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: David Windsor <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Dave Kleikamp <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Luis de Bethencourt <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: Michal Kubecek <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

fs/buffer.c: record blockdev write errors in super_block that it backs

When syncing out a block device (a la __sync_blockdev), any error
encountered will only be recorded in the bd_inode's mapping.  When the
blockdev contains a filesystem, however, we'd like to also record the
error in the super_block that's stored there.

Make mark_buffer_write_io_error also record the error in the
corresponding super_block when a writeback error occurs and the block
device contains a mounted superblock.

Since superblocks are RCU freed, hold the rcu_read_lock to ensure that
the superblock doesn't go away while we're marking it.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: David Howells <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Chinner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

vfs: track per-sb writeback errors and report them to syncfs

Patch series "vfs: have syncfs() return error when there are writeback
errors", v6.

Currently, syncfs does not return errors when one of the inodes fails to
be written back.  It will return errors based on the legacy AS_EIO and
AS_ENOSPC flags when syncing out the block device fails, but that's not
particularly helpful for filesystems that aren't backed by a blockdev.
It's also possible for a stray sync to lose those errors.

The basic idea in this set is to track writeback errors at the
superblock level, so that we can quickly and easily check whether
something bad happened without having to fsync each file individually.
syncfs is then changed to reliably report writeback errors after they
occur, much in the same fashion as fsync does now.

This patch (of 2):

Usually we suggest that applications call fsync when they want to ensure
that all data written to the file has made it to the backing store, but
that can be inefficient when there are a lot of open files.

Calling syncfs on the filesystem can be more efficient in some
situations, but the error reporting doesn't currently work the way most
people expect.  If a single inode on a filesystem reports a writeback
error, syncfs won't necessarily return an error.  syncfs only returns an
error if __sync_blockdev fails, and on some filesystems that's a no-op.

It would be better if syncfs reported an error if there were any
writeback failures.  Then applications could call syncfs to see if there
are any errors on any open files, and could then call fsync on all of
the other descriptors to figure out which one failed.
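
The usage pattern described above could look roughly like this from
userspace.  This is a sketch under assumptions: check_writeback() is a
hypothetical helper, all descriptors are assumed to live on the same
filesystem, and only syncfs(2) and fsync(2) are real interfaces.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>

/*
 * Flush the filesystem containing fds[0].  If syncfs() reports an error,
 * fsync() every descriptor to find out which files were affected.
 * errs[i] receives the errno from fsync(fds[i]), or 0 if it succeeded.
 * Returns the number of descriptors whose fsync() failed; 0 also covers
 * the case where syncfs() itself succeeded.
 */
static int check_writeback(const int *fds, int nfds, int *errs)
{
	int nfailed = 0;

	for (int i = 0; i < nfds; i++)
		errs[i] = 0;

	if (nfds <= 0 || syncfs(fds[0]) == 0)
		return 0;

	for (int i = 0; i < nfds; i++) {
		if (fsync(fds[i]) != 0) {
			errs[i] = errno;
			nfailed++;
		}
	}
	return nfailed;
}
```

The common case costs a single syncfs() call; the per-file fsync() calls
only happen after a failure has been reported.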

This patch adds a new errseq_t to struct super_block, and has
mapping_set_error also record writeback errors there.

To report those errors, we also need to keep an errseq_t in struct file
to act as a cursor.  This patch adds a dedicated field for that purpose,
which slots nicely into 4 bytes of padding at the end of struct file on
x86_64.
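
The cursor idea can be illustrated with a much-simplified model.  The
real errseq_t packs the error code, a "seen" flag and a counter into one
32-bit value; the toy_errseq type and toy_* helpers below are
hypothetical stand-ins that keep the error and the counter separate.

```c
/* Toy stand-in for errseq_t: the latest error plus a change counter. */
struct toy_errseq {
	int err;            /* most recent negative errno, 0 if none */
	unsigned int seq;   /* bumped every time an error is recorded */
};

/* Record a writeback error (the role of the per-superblock field). */
static void toy_errseq_set(struct toy_errseq *e, int err)
{
	e->err = err;
	e->seq++;
}

/* Take a cursor: remember how far this open file has "seen". */
static unsigned int toy_errseq_sample(const struct toy_errseq *e)
{
	return e->seq;
}

/*
 * Report an error only if one was recorded since the cursor was taken,
 * then advance the cursor so the same error is not reported twice to
 * the same open file.
 */
static int toy_errseq_check_and_advance(const struct toy_errseq *e,
					unsigned int *since)
{
	if (*since == e->seq)
		return 0;
	*since = e->seq;
	return e->err;
}
```

Because each open file holds its own cursor, two files opened before a
failure each see the error once, and neither consumes it for the other.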

An earlier version of this patch used an O_PATH file descriptor to cue
the kernel that the open file should track the superblock error and not
the inode's writeback error.

I think that API is just too weird though.  This is simpler and should
make syncfs error reporting "just work" even if someone is multiplexing
fsync and syncfs on the same fds.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: David Howells <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

arch/parisc/include/asm/pgtable.h: remove unused `old_pte'

parisc's set_pte_at() macro has a set-but-not-used variable:

include/linux/pgtable.h: In function 'pte_clear_not_present_full':
arch/parisc/include/asm/pgtable.h:96:9: warning: variable 'old_pte' set but not used [-Wunused-but-set-variable]

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Mike Rapoport <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: mount shared volume without ha stack

Usually we create and use an ocfs2 shared volume on top of an HA stack.
For a pcmk-based HA stack, that includes the DLM, corosync and pacemaker
services.

Customers complained that they could not mount an existing ocfs2 volume
on a single node without the HA stack, e.g. in a single-node
backup/restore scenario.

In such cases, the customers just want to access the data on the
existing ocfs2 volume quickly, and do not want to restart or set up the
HA stack.

So add a mount option, "nocluster".  If users mount an ocfs2 shared
volume with this option, the mount does not depend on the HA-related
services.  The command mounts the existing ocfs2 volume directly (like
a local mount), which avoids setting up the HA stack.

Signed-off-by: Gang He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Jun Piao <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

ocfs2: add missing annotation for dlm_empty_lockres()

Sparse reports a warning at dlm_empty_lockres()

warning: context imbalance in dlm_purge_lockres() - unexpected unlock

The root cause is the missing annotation at dlm_purge_lockres()

Add the missing __must_hold(&dlm->spinlock)
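
For readers unfamiliar with the annotation, here is a small standalone
sketch of how __must_hold() is attached to a function that is entered
and left with a lock held but drops it in between.  The macro definition
mirrors the usual sparse-only pattern; the toy_* types and functions are
hypothetical and are not ocfs2 code.

```c
/* Only sparse (__CHECKER__) sees the attribute; compilers see nothing. */
#ifdef __CHECKER__
# define __must_hold(x) __attribute__((context(x, 1, 1)))
#else
# define __must_hold(x)
#endif

struct toy_spinlock { int held; };
struct toy_table { struct toy_spinlock lock; int nr_entries; };

static void toy_spin_lock(struct toy_spinlock *l)   { l->held = 1; }
static void toy_spin_unlock(struct toy_spinlock *l) { l->held = 0; }

/*
 * The caller must hold t->lock on entry and still holds it on return.
 * The lock is dropped and retaken inside, which is the pattern that
 * makes sparse report a context imbalance when the annotation is absent.
 */
static void toy_purge(struct toy_table *t)
	__must_hold(&t->lock)
{
	t->nr_entries = 0;
	toy_spin_unlock(&t->lock);   /* e.g. around work done unlocked */
	toy_spin_lock(&t->lock);
}
```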

Signed-off-by: Jules Irenge <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

squashfs: migrate from ll_rw_block usage to BIO

The ll_rw_block() function has been deprecated in favor of BIO, which
appears to come with large performance improvements.

This patch decreases boot time by close to 40% when using squashfs for
the root file-system. This is observed at least in the context of
starting an Android VM on Chrome OS using crosvm. The patch was tested
on 4.19 as well as master.

This patch is largely based on Adrien Schildknecht's patch that was
originally sent as https://lkml.org/lkml/2017/9/22/814 though with some
significant changes and simplifications while also taking Phillip
Lougher's feedback into account, around preserving support for
FILE_CACHE in particular.

[[email protected]: fix build error reported by Randy]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philippe Liard <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Adrien Schildknecht <[email protected]>
Cc: Phillip Lougher <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Daniel Rosenberg <[email protected]>
Link: https://chromium.googlesource.com/chromiumos/platform/crosvm
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

gup: document and work around "COW can break either way" issue

Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.

Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.

End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.

So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.

At the same time, some users simply don't even care.

For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.

This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.

The current semantics end up being:

- __get_user_pages_fast(): no change. If you don't ask for a write,
   you won't break COW. You'd better know what you're doing.

- get_user_pages_fast(): the fast-case "look it up in the page tables
   without anything getting mmap_sem" now refuses to follow a read-only
   page, since it might need COW breaking.  Which happens in the slow
   path - the fast path doesn't know if the memory might be COW or not.

- get_user_pages() (including the slow-path fallback for gup_fast()):
   for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
   very similar semantics to FOLL_FORCE.
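
The get_user_pages() rule above can be modeled in a few lines.  This is
a userspace sketch of the decision only: the TOY_* flag values and the
toy_* helpers are hypothetical, and the real flags have different
values.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel's differ. */
#define TOY_VM_SHARED   0x1u
#define TOY_VM_MAYWRITE 0x2u

#define TOY_FOLL_WRITE  0x1u
#define TOY_FOLL_GET    0x2u
#define TOY_FOLL_PIN    0x4u

/* A COW mapping is private (not shared) but may be made writable. */
static bool toy_is_cow_mapping(unsigned int vm_flags)
{
	return (vm_flags & (TOY_VM_SHARED | TOY_VM_MAYWRITE)) ==
	       TOY_VM_MAYWRITE;
}

/*
 * If the caller wants a page reference (GET or PIN) on a COW mapping,
 * request write semantics so that COW is broken up front, even for a
 * read-only lookup.  Other lookups keep their flags unchanged.
 */
static unsigned int toy_adjust_gup_flags(unsigned int vm_flags,
					 unsigned int gup_flags)
{
	if (toy_is_cow_mapping(vm_flags) &&
	    (gup_flags & (TOY_FOLL_GET | TOY_FOLL_PIN)))
		gup_flags |= TOY_FOLL_WRITE;
	return gup_flags;
}
```

A shared mapping, or a lookup that takes no page reference, is left
alone in this model.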

If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path.  So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".

Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.

But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.

[ It might be worth noting that we've always had this ambiguity, and it
  could arguably be seen as a user-space issue.

  You only get private COW mappings that could break either way in
  situations where user space is doing cooperative things (ie fork()
  before an execve() etc), but it _is_ surprising and very subtle, and
  fork() is supposed to give you independent address spaces.

  So let's treat this as a kernel issue and make the semantics of
  get_user_pages() easier to understand. Note that obviously a true
  shared mapping will still get a page that can change under us, so this
  does _not_ mean that get_user_pages() somehow returns any "stable"
  page ]

Reported-by: Jann Horn <[email protected]>
Tested-by: Christoph Hellwig <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Acked-by: Kirill Shutemov <[email protected]>
Acked-by: Jan Kara <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge branch 'from-miklos' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs updates from Al Viro:
"Assorted patches from Miklos.

  An interesting part here is /proc/mounts stuff..."

The "/proc/mounts stuff" is using a cursor for keeping the location
data while traversing the mount listing.

Also probably worth noting is the addition of faccessat2(), which takes
an additional set of flags to specify how the lookup is done
(AT_EACCESS, AT_SYMLINK_NOFOLLOW, AT_EMPTY_PATH).

* 'from-miklos' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: add faccessat2 syscall
  vfs: don't parse "silent" option
  vfs: don't parse "posixacl" option
  vfs: don't parse forbidden flags
  statx: add mount_root
  statx: add mount ID
  statx: don't clear STATX_ATIME on SB_RDONLY
  uapi: deprecate STATX_ALL
  utimensat: AT_EMPTY_PATH support
  vfs: split out access_override_creds()
  proc/mounts: add cursor
  aio: fix async fsync creds
  vfs: allow unprivileged whiteout creation

Merge branch 'work.set_fs-exec' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/coredump updates from Al Viro:
"set_fs() removal in coredump-related area - mostly Christoph's
  stuff..."

* 'work.set_fs-exec' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  binfmt_elf_fdpic: remove the set_fs(KERNEL_DS) in elf_fdpic_core_dump
  binfmt_elf: remove the set_fs(KERNEL_DS) in elf_core_dump
  binfmt_elf: remove the set_fs in fill_siginfo_note
  signal: refactor copy_siginfo_to_user32
  powerpc/spufs: simplify spufs core dumping
  powerpc/spufs: stop using access_ok
  powerpc/spufs: fix copy_to_user while atomic

Merge branch 'uaccess.__copy_to_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/__copy_to_user updates from Al Viro:
"Getting rid of __copy_to_user() callers - stuff that doesn't fit into
  other series"

* 'uaccess.__copy_to_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  dlmfs: convert dlmfs_file_read() to copy_to_user()
  esas2r: don't bother with __copy_to_user()

Merge branch 'uaccess.__copy_from_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/__copy_from_user updates from Al Viro:
"Getting rid of __copy_from_user() callers - patches that don't fit
  into other series"

* 'uaccess.__copy_from_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  pstore: switch to copy_from_user()
  firewire: switch ioctl_queue_iso to use of copy_from_user()

Merge branch 'uaccess.__put_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/__put_user updates from Al Viro:
"Removal of __put_user() calls - misc patches that don't fit into any
  other series"

* 'uaccess.__put_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  pcm_native: result of put_user() needs to be checked
  scsi_ioctl.c: switch SCSI_IOCTL_GET_IDLUN to copy_to_user()
  compat sysinfo(2): don't bother with field-by-field copyout

Merge branch 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/readdir updates from Al Viro:
"Finishing the conversion of readdir.c to unsafe_... API.

  This includes the uaccess_{read,write}_begin series by Christophe
  Leroy"

* 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  readdir.c: get rid of the last __put_user(), drop now-useless access_ok()
  readdir.c: get compat_filldir() more or less in sync with filldir()
  switch readdir(2) to unsafe_copy_dirent_name()
  drm/i915/gem: Replace user_access_begin by user_write_access_begin
  uaccess: Selectively open read or write user access
  uaccess: Add user_read_access_begin/end and user_write_access_begin/end

Merge branch 'uaccess.access_ok' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/access_ok updates from Al Viro:
"Removals of trivially pointless access_ok() calls.

  Note: the fiemap stuff was removed from the series, since they are
  duplicates with part of ext4 series carried in Ted's tree"

* 'uaccess.access_ok' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vmci_host: get rid of pointless access_ok()
  hfi1: get rid of pointless access_ok()
  usb: get rid of pointless access_ok() calls
  lpfc_debugfs: get rid of pointless access_ok()
  efi_test: get rid of pointless access_ok()
  drm_read(): get rid of pointless access_ok()
  via-pmu: don't bother with access_ok()
  drivers/crypto/ccp/sev-dev.c: get rid of pointless access_ok()
  omapfb: get rid of pointless access_ok() calls
  amifb: get rid of pointless access_ok() calls
  drivers/fpga/dfl-afu-dma-region.c: get rid of pointless access_ok()
  drivers/fpga/dfl-fme-pr.c: get rid of pointless access_ok()
  cm4000_cs.c cmm_ioctl(): get rid of pointless access_ok()
  nvram: drop useless access_ok()
  n_hdlc_tty_read(): remove pointless access_ok()
  tomoyo_write_control(): get rid of pointless access_ok()
  btrfs_ioctl_send(): don't bother with access_ok()
  fat_dir_ioctl(): hadn't needed that access_ok() for more than a decade...
  dlmfs_file_write(): get rid of pointless access_ok()

Merge branch 'uaccess.csum' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull uaccess/csum updates from Al Viro:
"Regularize the situation with uaccess checksum primitives:

   - fold csum_partial_... into csum_and_copy_..._user()

   - on x86 collapse several access_ok()/stac()/clac() into
     user_access_begin()/user_access_end()"

* 'uaccess.csum' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  default csum_and_copy_to_user(): don't bother with access_ok()
  take the dummy csum_and_copy_from_user() into net/checksum.h
  arm: switch to csum_and_copy_from_user()
  sh32: convert to csum_and_copy_from_user()
  m68k: convert to csum_and_copy_from_user()
  xtensa: switch to providing csum_and_copy_from_user()
  sparc: switch to providing csum_and_copy_from_user()
  parisc: turn csum_partial_copy_from_user() into csum_and_copy_from_user()
  alpha: turn csum_partial_copy_from_user() into csum_and_copy_from_user()
  ia64: turn csum_partial_copy_from_user() into csum_and_copy_from_user()
  ia64: csum_partial_copy_nocheck(): don't abuse csum_partial_copy_from_user()
  x86: switch 32bit csum_and_copy_to_user() to user_access_{begin,end}()
  x86: switch both 32bit and 64bit to providing csum_and_copy_from_user()
  x86_64: csum_..._copy_..._user(): switch to unsafe_..._user()
  get rid of csum_partial_copy_to_user()

Merge tag 'docs-5.8' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
"A fair amount of stuff this time around, dominated by yet another
  massive set from Mauro toward the completion of the RST conversion. I
  *really* hope we are getting close to the end of this. Meanwhile,
  those patches reach pretty far afield to update document references
  around the tree; there should be no actual code changes there. There
  will be, alas, more of the usual trivial merge conflicts.

  Beyond that we have more translations, improvements to the sphinx
  scripting, a number of additions to the sysctl documentation, and lots
  of fixes"

* tag 'docs-5.8' of git://git.lwn.net/linux: (130 commits)
  Documentation: fixes to the maintainer-entry-profile template
  zswap: docs/vm: Fix typo accept_threshold_percent in zswap.rst
  tracing: Fix events.rst section numbering
  docs: acpi: fix old http link and improve document format
  docs: filesystems: add info about efivars content
  Documentation: LSM: Correct the basic LSM description
  mailmap: change email for Ricardo Ribalda
  docs: sysctl/kernel: document unaligned controls
  Documentation: admin-guide: update bug-hunting.rst
  docs: sysctl/kernel: document ngroups_max
  nvdimm: fixes to maintainter-entry-profile
  Documentation/features: Correct RISC-V kprobes support entry
  Documentation/features: Refresh the arch support status files
  Revert "docs: sysctl/kernel: document ngroups_max"
  docs: move locking-specific documents to locking/
  docs: move digsig docs to the security book
  docs: move the kref doc into the core-api book
  docs: add IRQ documentation at the core-api book
  docs: debugging-via-ohci1394.txt: add it to the core-api book
  docs: fix references for ipmi.rst file
  ...

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:

- remove a now unnecessary usage of the KERNEL_DS for
   sys_oabi_epoll_ctl()

- update my email address in a number of drivers

- decompressor EFI updates from Ard Biesheuvel

- module unwind section handling updates

- sparsemem Kconfig cleanups

- make act_mm macro respect THREAD_SIZE

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8980/1: Allow either FLATMEM or SPARSEMEM on the multiplatform build
  ARM: 8979/1: Remove redundant ARCH_SPARSEMEM_DEFAULT setting
  ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE
  ARM: decompressor: run decompressor in place if loaded via UEFI
  ARM: decompressor: move GOT into .data for EFI enabled builds
  ARM: decompressor: defer loading of the contents of the LC0 structure
  ARM: decompressor: split off _edata and stack base into separate object
  ARM: decompressor: move headroom variable out of LC0
  ARM: 8976/1: module: allow arch overrides for .init section names
  ARM: 8975/1: module: fix handling of unwind init sections
  ARM: 8974/1: use SPARSMEM_STATIC when SPARSEMEM is enabled
  ARM: 8971/1: replace the sole use of a symbol with its definition
  ARM: 8969/1: decompressor: simplify libfdt builds
  Update rmk's email address in various drivers
  ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"A sizeable pile of arm64 updates for 5.8.

  Summary below, but the big two features are support for Branch Target
  Identification and Clang's Shadow Call stack. The latter is currently
  arm64-only, but the high-level parts are all in core code so it could
  easily be adopted by other architectures pending toolchain support

  Branch Target Identification (BTI):

   - Support for ARMv8.5-BTI in both user- and kernel-space. This allows
     branch targets to limit the types of branch from which they can be
     called and additionally prevents branching to arbitrary code,
     although kernel support requires a very recent toolchain.

   - Function annotation via SYM_FUNC_START() so that assembly functions
     are wrapped with the relevant "landing pad" instructions.

   - BPF and vDSO updates to use the new instructions.

   - Addition of a new HWCAP and exposure of BTI capability to userspace
     via ID register emulation, along with ELF loader support for the
     BTI feature in .note.gnu.property.

   - Non-critical fixes to CFI unwind annotations in the sigreturn
     trampoline.

  Shadow Call Stack (SCS):

   - Support for Clang's Shadow Call Stack feature, which reserves
     platform register x18 to point at a separate stack for each task
     that holds only return addresses. This protects function return
     control flow from buffer overruns on the main stack.

   - Save/restore of x18 across problematic boundaries (user-mode,
     hypervisor, EFI, suspend, etc).

   - Core support for SCS, should other architectures want to use it
     too.

   - SCS overflow checking on context-switch as part of the existing
     stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.

  CPU feature detection:

   - Removed numerous "SANITY CHECK" errors when running on a system
     with mismatched AArch32 support at EL1. This is primarily a concern
     for KVM, which disabled support for 32-bit guests on such a system.

   - Addition of new ID registers and fields as the architecture has
     been extended.

  Perf and PMU drivers:

   - Minor fixes and cleanups to system PMU drivers.

  Hardware errata:

   - Unify KVM workarounds for VHE and nVHE configurations.

   - Sort vendor errata entries in Kconfig.

  Secure Monitor Call Calling Convention (SMCCC):

   - Update to the latest specification from Arm (v1.2).

   - Allow PSCI code to query the SMCCC version.

  Software Delegated Exception Interface (SDEI):

   - Unexport a bunch of unused symbols.

   - Minor fixes to handling of firmware data.

  Pointer authentication:

   - Add support for dumping the kernel PAC mask in vmcoreinfo so that
     the stack can be unwound by tools such as kdump.

   - Simplification of key initialisation during CPU bringup.

  BPF backend:

   - Improve immediate generation for logical and add/sub instructions.

  vDSO:

   - Minor fixes to the linker flags for consistency with other
     architectures and support for LLVM's unwinder.

   - Clean up logic to initialise and map the vDSO into userspace.

  ACPI:

   - Workaround for an ambiguity in the IORT specification relating to
     the "num_ids" field.

   - Support _DMA method for all named components rather than only PCIe
     root complexes.

   - Minor other IORT-related fixes.

  Miscellaneous:

   - Initialise debug traps early for KGDB and fix KDB cacheflushing
     deadlock.

   - Minor tweaks to early boot state (documentation update, set
     TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).

   - Refactoring and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
  KVM: arm64: Check advertised Stage-2 page size capability
  arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
  ACPI/IORT: Remove the unused __get_pci_rid()
  arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
  arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
  arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
  arm64/cpufeature: Introduce ID_MMFR5 CPU register
  arm64/cpufeature: Introduce ID_DFR1 CPU register
  arm64/cpufeature: Introduce ID_PFR2 CPU register
  arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
  arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
  arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
  arm64: mm: Add asid_gen_match() helper
  firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
  arm64: vdso: Fix CFI directives in sigreturn trampoline
  arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
  ...

Merge tag 'm68k-for-v5.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k updates from Geert Uytterhoeven:

- several Mac fixes

- defconfig updates

- minor cleanups and fixes

* tag 'm68k-for-v5.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: tools: Replace zero-length array with flexible-array member
  m68k: Add missing __user annotation in get_user()
  m68k: mac: Avoid stuck ISM IOP interrupt on Quadra 900/950
  m68k: mac: Remove misleading comment
  m68k: mac: Don't call via_flush_cache() on Mac IIfx
  m68k: defconfig: Update defconfigs for v5.7-rc1
  m68k: amiga: config: Replace zero-length array with flexible-array member
  m68k: amiga: config: Mark expected switch fall-through

Merge tag 'x86-vdso-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 vdso updates from Ingo Molnar:
"Clean up various aspects of the vDSO code, no change in functionality
  intended"

* tag 'x86-vdso-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso/Makefile: Add vobjs32
  x86/vdso/vdso2c: Convert iterators to unsigned
  x86/vdso/vdso2c: Correct error messages on file open

Merge tag 'x86-platform-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:
"This tree cleans up various aspects of the UV platform support code,
  it removes unnecessary functions and cleans up the rest"

* tag 'x86-platform-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic/uv: Remove code for unused distributed GRU mode
  x86/platform/uv: Remove the unused _uv_cpu_blade_processor_id() macro
  x86/platform/uv: Unexport uv_apicid_hibits
  x86/platform/uv: Remove _uv_hub_info_check()
  x86/platform/uv: Simplify uv_send_IPI_one()
  x86/platform/uv: Mark uv_min_hub_revision_id static
  x86/platform/uv: Mark is_uv_hubless() static
  x86/platform/uv: Remove the UV*_HUB_IS_SUPPORTED macros
  x86/platform/uv: Unexport symbols only used by x2apic_uv_x.c
  x86/platform/uv: Unexport sn_coherency_id
  x86/platform/uv: Remove the uv_partition_coherence_id() macro
  x86/platform/uv: Mark uv_bios_call() and uv_bios_call_irqsave() static

Merge tag 'x86-fpu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 FPU updates from Ingo Molnar:
"Most of the changes here are related to 'XSAVES supervisor state' support,
  which is a feature that allows kernel-only data to be automatically
  saved/restored by the FPU context switching code.

  CPU features that can be supported this way are Intel PT, 'PASID' and
  CET features"

* tag 'x86-fpu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/xstate: Restore supervisor states for signal return
  x86/fpu/xstate: Preserve supervisor states for the slow path in __fpu__restore_sig()
  x86/fpu: Introduce copy_supervisor_to_kernel()
  x86/fpu/xstate: Update copy_kernel_to_xregs_err() for supervisor states
  x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates
  x86/fpu/xstate: Define new functions for clearing fpregs and xstates
  x86/fpu/xstate: Introduce XSAVES supervisor states
  x86/fpu/xstate: Separate user and supervisor xfeatures mask
  x86/fpu/xstate: Define new macros for supervisor and user xstates
  x86/fpu/xstate: Rename validate_xstate_header() to validate_user_xstate_header()

Merge tag 'x86-cpu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Ingo Molnar:
"Misc updates:

   - Extend the x86 family/model macros with a steppings dimension,
     because x86 life isn't complex enough and Intel uses steppings to
     differentiate between different CPUs. :-/

   - Convert the TSC deadline timer quirks to the steppings macros.

   - Clean up asm mnemonics.

   - Fix the handling of an AMD erratum, or in other words, fix a kernel
     erratum"
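
The steppings dimension described above can be pictured as one more
field in each match-table entry, holding a bitmask of accepted
steppings. The following is a minimal userspace sketch under stated
assumptions, not the kernel's struct x86_cpu_id or its
X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS() macro: the names cpu_match,
stepping_range() and match_cpu() are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU match-table entry (not the kernel struct). */
struct cpu_match {
    uint16_t family;
    uint16_t model;
    uint16_t steppings;   /* bit N set => stepping N matches */
    uint32_t data;        /* payload attached to the entry, e.g. a quirk value */
};

/* Assumption for this sketch: an empty mask means "any stepping". */
#define STEPPING_ANY 0

/* Build a mask covering steppings lo..hi inclusive (valid range 0..15). */
static uint16_t stepping_range(unsigned int lo, unsigned int hi)
{
    assert(lo <= hi && hi <= 15);
    return (uint16_t)(((1u << (hi - lo + 1)) - 1) << lo);
}

/* Return the first entry matching family, model and stepping, or NULL. */
static const struct cpu_match *match_cpu(const struct cpu_match *tbl, size_t n,
                                         unsigned int family, unsigned int model,
                                         unsigned int stepping)
{
    for (size_t i = 0; i < n; i++) {
        const struct cpu_match *m = &tbl[i];
        bool any = (m->steppings == STEPPING_ANY);

        if (m->family != family || m->model != model)
            continue;
        if (!any && (stepping > 15 || !(m->steppings & (1u << stepping))))
            continue;
        return m;
    }
    return NULL;
}
```

With a table that lists a stepping-specific entry before a catch-all
entry for the same model, the first match wins, so a quirk can be
limited to particular steppings while other steppings fall through to
the generic entry.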

* tag 'x86-cpu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Use RDRAND and RDSEED mnemonics in archrandom.h
  x86/cpu: Use INVPCID mnemonic in invpcid.h
  x86/cpu/amd: Make erratum #1054 a legacy erratum
  x86/apic: Convert the TSC deadline timer matching to steppings macro
  x86/cpu: Add a X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS() macro
  x86/cpu: Add a steppings field to struct x86_cpu_id

Merge tag 'x86-cleanups-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Ingo Molnar:
"Misc cleanups, with an emphasis on removing obsolete/dead code"

* tag 'x86-cleanups-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/spinlock: Remove obsolete ticket spinlock macros and types
  x86/mm: Drop deprecated DISCONTIGMEM support for 32-bit
  x86/apb_timer: Drop unused declaration and macro
  x86/apb_timer: Drop unused TSC calibration
  x86/io_apic: Remove unused function mp_init_irq_at_boot()
  x86/mm: Stop printing BRK addresses
  x86/audit: Fix a -Wmissing-prototypes warning for ia32_classify_syscall()
  x86/nmi: Remove edac.h include leftover
  mm: Remove MPX leftovers
  x86/mm/mmap: Fix -Wmissing-prototypes warnings
  x86/early_printk: Remove unused includes
  crash_dump: Remove no longer used saved_max_pfn
  x86/smpboot: Remove the last ICPU() macro

Merge tag 'x86-build-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 build updates from Ingo Molnar:
"Misc dependency fixes, plus a documentation update about memory
  protection keys support"

* tag 'x86-build-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Kconfig: Update config and kernel doc for MPK feature on AMD
  x86/boot: Discard .discard.unreachable for arch/x86/boot/compressed/vmlinux
  x86/boot/build: Add phony targets in arch/x86/boot/Makefile to PHONY
  x86/boot/build: Make 'make bzlilo' not depend on vmlinux or $(obj)/bzImage
  x86/boot/build: Add cpustr.h to targets and remove clean-files

Merge tag 'x86-boot-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Ingo Molnar:
"Misc updates:

   - Add the initrdmem= boot option to specify an initrd embedded in RAM
     (flash most likely)

   - Sanitize the CS value earlier during boot, which also fixes SEV-ES

   - Various fixes and smaller cleanups"

* tag 'x86-boot-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Correct relocation destination on old linkers
  x86/boot/compressed/64: Switch to __KERNEL_CS after GDT is loaded
  x86/boot: Fix -Wint-to-pointer-cast build warning
  x86/boot: Add kstrtoul() from lib/
  x86/tboot: Mark tboot static
  x86/setup: Add an initrdmem= option to specify initrd physical address

Merge tag 'smp-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP updates from Ingo Molnar:
"Misc cleanups in the SMP hotplug and cross-call code"

* tag 'smp-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Remove __freeze_secondary_cpus()
  cpu/hotplug: Remove disable_nonboot_cpus()
  cpu/hotplug: Fix a typo in comment "broadacasted"->"broadcasted"
  smp: Use smp_call_func_t in on_each_cpu()

Merge tag 'efi-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI updates from Ingo Molnar:
"The EFI changes for this cycle are:

   - preliminary changes for RISC-V

   - Add support for setting the resolution on the EFI framebuffer

   - Simplify kernel image loading for arm64

   - Move .bss into .data via the linker script instead of relying on
     symbol annotations.

   - Get rid of __pure getters to access global variables

   - Clean up the config table matching arrays

   - Rename pr_efi/pr_efi_err to efi_info/efi_err, and use them
     consistently

   - Simplify and unify initrd loading

   - Parse the builtin command line on x86 (if provided)

   - Implement printk() support, including support for wide character
     strings

   - Simplify GDT handling in early mixed mode thunking code

   - Some other minor fixes and cleanups"

* tag 'efi-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
  efi/x86: Don't blow away existing initrd
  efi/x86: Drop the special GDT for the EFI thunk
  efi/libstub: Add missing prototype for PE/COFF entry point
  efi/efivars: Add missing kobject_put() in sysfs entry creation error path
  efi/libstub: Use pool allocation for the command line
  efi/libstub: Don't parse overlong command lines
  efi/libstub: Use snprintf with %ls to convert the command line
  efi/libstub: Get the exact UTF-8 length
  efi/libstub: Use %ls for filename
  efi/libstub: Add UTF-8 decoding to efi_puts
  efi/printf: Add support for wchar_t (UTF-16)
  efi/gop: Add an option to list out the available GOP modes
  efi/libstub: Add definitions for console input and events
  efi/libstub: Implement printk-style logging
  efi/printf: Turn vsprintf into vsnprintf
  efi/printf: Abort on invalid format
  efi/printf: Refactor code to consolidate padding and output
  efi/printf: Handle null string input
  efi/printf: Factor out integer argument retrieval
  efi/printf: Factor out width/precision parsing
  ...

Merge tag 'perf-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
"Kernel side changes:

   - Add AMD Fam17h RAPL support

   - Introduce CAP_PERFMON to kernel and user space

   - Add Zhaoxin CPU support

   - Misc fixes and cleanups

  Tooling changes:

   - perf record:

     Introduce '--switch-output-event', which sets up arbitrary events
     that are read from a side band thread. When one of them takes
     place, a signal is sent to the main 'perf record' thread, reusing
     the core for '--switch-output' to take perf.data snapshots from
     the ring buffer used for '--overwrite', e.g.:

         # perf record --overwrite -e sched:* \
               --switch-output-event syscalls:*connect* \
               workload

     will take perf.data.YYYYMMDDHHMMSS snapshots up to around the
     connect syscalls.

     Add '--num-synthesize-threads' option to control degree of
     parallelism of the synthesize_mmap() code which is scanning
     /proc/PID/task/PID/maps and can be time consuming. This mimics
     pre-existing behaviour in 'perf top'.

   - perf bench:

     Add a multi-threaded synthesize benchmark and kallsyms parsing
     benchmark.

   - Intel PT support:

     Stitch LBR records from multiple samples to get deeper backtraces,
     there are caveats, see the csets for details.

     Allow using Intel PT to synthesize callchains for regular events.

     Add support for synthesizing branch stacks for regular events
     (cycles, instructions, etc) from Intel PT data.

  Misc changes:

   - Updated perf vendor events for power9 and Coresight.

   - Add flamegraph.py script via 'perf flamegraph'

   - Misc other changes, fixes and cleanups - see the Git log for details

  Also, since over the last couple of years perf tooling has matured and
  decoupled from the kernel perf changes to a large degree, going
  forward Arnaldo is going to send perf tooling changes via direct pull
  requests"

* tag 'perf-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (163 commits)
  perf/x86/rapl: Add AMD Fam17h RAPL support
  perf/x86/rapl: Make perf_probe_msr() more robust and flexible
  perf/x86/rapl: Flip logic on default events visibility
  perf/x86/rapl: Refactor to share the RAPL code between Intel and AMD CPUs
  perf/x86/rapl: Move RAPL support to common x86 code
  perf/core: Replace zero-length array with flexible-array
  perf/x86: Replace zero-length array with flexible-array
  perf/x86/intel: Add more available bits for OFFCORE_RESPONSE of Intel Tremont
  perf/x86/rapl: Add Ice Lake RAPL support
  perf flamegraph: Use /bin/bash for report and record scripts
  perf cs-etm: Move definition of 'traceid_list' global variable from header file
  libsymbols kallsyms: Move hex2u64 out of header
  libsymbols kallsyms: Parse using io api
  perf bench: Add kallsyms parsing
  perf: cs-etm: Update to build with latest opencsd version.
  perf symbol: Fix kernel symbol address display
  perf inject: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  perf annotate: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  perf trace: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  perf script: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  ...

Merge tag 'objtool-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
"There are a lot of objtool changes in this cycle, all across the map:

   - Speed up objtool significantly, especially when there are a large
     number of sections

   - Improve objtool's understanding of special instructions such as
     IRET, to reduce the number of annotations required

   - Implement 'noinstr' validation

   - Do baby steps for non-x86 objtool use

   - Simplify/fix retpoline decoding

   - Add vmlinux validation

   - Improve documentation

   - Fix various bugs and apply smaller cleanups"

* tag 'objtool-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  objtool: Enable compilation of objtool for all architectures
  objtool: Move struct objtool_file into arch-independent header
  objtool: Exit successfully when requesting help
  objtool: Add check_kcov_mode() to the uaccess safelist
  samples/ftrace: Fix asm function ELF annotations
  objtool: optimize add_dead_ends for split sections
  objtool: use gelf_getsymshndx to handle >64k sections
  objtool: Allow no-op CFI ops in alternatives
  x86/retpoline: Fix retpoline unwind
  x86: Change {JMP,CALL}_NOSPEC argument
  x86: Simplify retpoline declaration
  x86/speculation: Change FILL_RETURN_BUFFER to work with objtool
  objtool: Add support for intra-function calls
  objtool: Move the IRET hack into the arch decoder
  objtool: Remove INSN_STACK
  objtool: Make handle_insn_ops() unconditional
  objtool: Rework allocating stack_ops on decode
  objtool: UNWIND_HINT_RET_OFFSET should not check registers
  objtool: is_fentry_call() crashes if call has no destination
  x86,smap: Fix smap_{save,restore}() alternatives
  ...

Merge tag 'locking-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
"The biggest change to core locking facilities in this cycle is the
  introduction of local_lock_t - this primitive comes from the -rt
  project and identifies CPU-local locking dependencies normally handled
  opaquely behind preempt_disable() or local_irq_save/disable() critical
  sections.

  The generated code on mainline kernels doesn't change as a result, but
  still there are benefits: improved debugging and better documentation
  of data structure accesses.

  The new local_lock_t primitives are introduced and then utilized in a
  couple of kernel subsystems. No change in functionality is intended.

  There are also other smaller changes and cleanups"

* tag 'locking-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  zram: Use local lock to protect per-CPU data
  zram: Allocate struct zcomp_strm as per-CPU memory
  connector/cn_proc: Protect send_msg() with a local lock
  squashfs: Make use of local lock in multi_cpu decompressor
  mm/swap: Use local_lock for protection
  radix-tree: Use local_lock for protection
  locking: Introduce local_lock()
  locking/lockdep: Replace zero-length array with flexible-array
  locking/rtmutex: Remove unused rt_mutex_cmpxchg_relaxed()

Merge tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
"The RCU updates for this cycle were:

   - RCU-tasks update, including addition of RCU Tasks Trace for BPF use
     and TASKS_RUDE_RCU

   - kfree_rcu() updates.

   - Remove scheduler locking restriction

   - RCU CPU stall warning updates.

   - Torture-test updates.

   - Miscellaneous fixes and other updates"

* tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
  rcu: Allow for smp_call_function() running callbacks from idle
  rcu: Provide rcu_irq_exit_check_preempt()
  rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()
  rcu: Provide __rcu_is_watching()
  rcu: Provide rcu_irq_exit_preempt()
  rcu: Make RCU IRQ enter/exit functions rely on in_nmi()
  rcu/tree: Mark the idle relevant functions noinstr
  x86: Replace ist_enter() with nmi_enter()
  x86/mce: Send #MC singal from task work
  x86/entry: Get rid of ist_begin/end_non_atomic()
  sched,rcu,tracing: Avoid tracing before in_nmi() is correct
  sh/ftrace: Move arch_ftrace_nmi_{enter,exit} into nmi exception
  lockdep: Always inline lockdep_{off,on}()
  hardirq/nmi: Allow nested nmi_enter()
  arm64: Prepare arch_nmi_enter() for recursion
  printk: Disallow instrumenting print_nmi_enter()
  printk: Prepare for nested printk_nmi_enter()
  rcutorture: Convert ULONG_CMP_LT() to time_before()
  torture: Add a --kasan argument
  torture: Save a few lines by using config_override_param initially
  ...

Merge tag 'core-kprobes-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull kprobes updates from Ingo Molnar:
"Various kprobes updates, mostly centered around cleaning up the
  no-instrumentation logic.

  Instead of the current per debug facility blacklist, use the more
  generic .noinstr.text approach, combined with a 'noinstr' marker for
  functions.

  Also add instrumentation_begin()/end() to better manage the exact
  place in entry code where instrumentation may be used.

  And add a kprobes blacklist for modules"

* tag 'core-kprobes-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Prevent probes in .noinstr.text section
  vmlinux.lds.h: Create section for protection against instrumentation
  samples/kprobes: Add __kprobes and NOKPROBE_SYMBOL() for handlers.
  kprobes: Support NOKPROBE_SYMBOL() in modules
  kprobes: Support __kprobes blacklist in modules
  kprobes: Lock kprobe_mutex while showing kprobe_blacklist

Merge tag 'x86_cache_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cache resource control updates from Borislav Petkov:
"Add support for wider Memory Bandwidth Monitoring counters by querying
  their width from CPUID.

  As a prerequisite for that, streamline and unify the CPUID detection of
  the respective resource control attributes.

  By Reinette Chatre"
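
Counter width matters because a hardware counter of limited width wraps
around, and the difference between two readings has to be computed
modulo that width. The following is a minimal userspace sketch of a
wrap-safe delta for a counter whose width is a parameter. It is an
illustration under stated assumptions, not the resctrl code: the
function name counter_delta() is hypothetical, and the widths used in
the example are arbitrary rather than values reported by CPUID.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how far a 'width'-bit counter advanced between two readings,
 * assuming it wrapped at most once in between.
 *
 * Shifting both readings up so the counter's top bit lands in bit 63
 * lets ordinary 64-bit unsigned subtraction perform the modulo-2^width
 * arithmetic; shifting back down yields the delta.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
    assert(width >= 1 && width <= 64);

    unsigned int shift = 64 - width;
    uint64_t delta = (cur << shift) - (prev << shift);

    return delta >> shift;
}
```

For example, with a 24-bit counter a reading of 0x000010 that follows
0xFFFFF0 yields a delta of 0x20, since the counter wrapped once. A wider
counter wraps less often, so the "at most one wrap between readings"
assumption holds for longer sampling intervals.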

* tag 'x86_cache_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Support wider MBM counters
  x86/resctrl: Support CPUID enumeration of MBM counter width
  x86/resctrl: Maintain MBM counter width per resource
  x86/resctrl: Query LLC monitoring properties once during boot
  x86/resctrl: Remove unnecessary RMID checks
  x86/cpu: Move resctrl CPUID code to resctrl/
  x86/resctrl: Rename asm/resctrl_sched.h to asm/resctrl.h

Merge tag 'x86_microcode_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode update from Borislav Petkov:
"A single fix for late microcode loading to handle the correct return
  value from stop_machine(), from Mihai Carabas"

* tag 'x86_microcode_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Fix return value for microcode late loading

Merge tag 'edac_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

- Fix i10nm_edac loading on some Ice Lake and Tremont/Jacobsville
   steppings due to the offset change of the bus number configuration
   register, by Qiuxu Zhuo.

- The usual cleanups and fixes all over the place.

* tag 'edac_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/amd64: Remove redundant assignment to variable ret in hw_info_get()
  EDAC/skx: Use the mcmtr register to retrieve close_pg/bank_xor_enable
  EDAC/i10nm: Update driver to support different bus number config register offsets
  EDAC, {skx,i10nm}: Make some configurations CPU model specific
  EDAC/amd8131: Remove defined but not used bridge_str
  EDAC/thunderx: Make symbols static
  MAINTAINERS: Remove sifive_l2_cache.c from EDAC-SIFIVE pattern
  EDAC/xgene: Remove set but not used address local var
  EDAC/armada_xp: Fix some log messages

Merge tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

- Benjamin Herrenschmidt solved a problem with non-matched console
   aliases by first checking consoles defined on the command line. It is
   a more conservative approach than the previous attempts.

- Benjamin also made sure that the console accessible via /dev/console
   always has CON_CONSDEV flag.

- Andy Shevchenko added the %ptT modifier for printing time64_t.
   It extends the existing %ptR handling for struct rtc_time.

- Bruno Meneguele fixed /dev/kmsg error value returned by unsupported
   SEEK_CUR.

- Tetsuo Handa removed unused pr_cont_once().

... and a few small fixes.
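
As an illustration of what printing a 64-bit seconds value "in human
readable format" can look like, here is a minimal userspace sketch. It
is not the lib/vsprintf implementation: the helper name format_time64()
is hypothetical, the ISO-8601-style layout is an assumption of this
example, and it relies on the POSIX gmtime_r() call rather than any
kernel facility.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Format 'secs' (seconds since the Unix epoch, UTC) into 'buf' as
 * YYYY-mm-ddTHH:MM:SS. Returns the snprintf() result, or -1 if the
 * value cannot be broken down into calendar fields.
 */
static int format_time64(char *buf, size_t len, int64_t secs)
{
    struct tm tm;
    time_t t = (time_t)secs;

    assert(buf != NULL && len > 0);

    if (gmtime_r(&t, &tm) == NULL)
        return -1;

    return snprintf(buf, len, "%04d-%02d-%02dT%02d:%02d:%02d",
                    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
                    tm.tm_hour, tm.tm_min, tm.tm_sec);
}
```

A caller would pass a buffer of at least 20 bytes to hold the 19
characters of output plus the terminating NUL.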

* tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove pr_cont_once()
  printk: handle blank console arguments passed in.
  kernel/printk: add kmsg SEEK_CUR handling
  printk: Fix a typo in comment "interator"->"iterator"
  usb: pulse8-cec: Switch to use %ptT
  ARM: bcm2835: Switch to use %ptT
  lib/vsprintf: Print time64_t in human readable format
  lib/vsprintf: update comment about simple_strto<foo>() functions
  printk: Correctly set CON_CONSDEV even when preferred console was not registered
  printk: Fix preferred console selection with multiple matches
  printk: Move console matching logic into a separate function
  printk: Convert a use of sprintf to snprintf in console_unlock

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fsverity updates from Eric Biggers:
"Fix kerneldoc warnings and some coding style inconsistencies.

  This mirrors the similar cleanups being done in fs/crypto/"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fs-verity: remove unnecessary extern keywords
  fs-verity: fix all kerneldoc warnings