Git Repo - linux.git/log
5 years ago libperf: Use sys/types.h to get ssize_t, not unistd.h
Arnaldo Carvalho de Melo [Mon, 23 Sep 2019 21:06:52 +0000 (18:06 -0300)]
libperf: Use sys/types.h to get ssize_t, not unistd.h

The sys/types.h header looks more sensible: from its name we can gather
it should be there because of some needed typedef, and it is much
smaller than unistd.h. So use it, and fix up the fallout in places where
unistd.h was needed for something else entirely but was being obtained
indirectly, by sheer luck.

Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
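
The header split described in the commit above can be sketched roughly as follows. This is only an illustration under assumptions: readn_sketch() is a hypothetical stand-in modelled on the readn() helper mentioned in this log, not the actual libperf source. The declaration needs nothing beyond the ssize_t and size_t typedefs, while the implementation is the part that really needs unistd.h.

```c
/* What a header needs: only the typedefs used in the prototype. */
#include <sys/types.h>  /* ssize_t, size_t */

/* Hypothetical helper, modelled on the readn() named in this log. */
ssize_t readn_sketch(int fd, void *buf, size_t n);

/* What the implementation file needs: the syscall wrapper itself. */
#include <errno.h>
#include <unistd.h>     /* read() */

/* Read exactly n bytes unless EOF or an error is hit first. */
ssize_t readn_sketch(int fd, void *buf, size_t n)
{
	size_t left = n;
	char *p = buf;

	while (left) {
		ssize_t ret = read(fd, p, left);

		if (ret < 0 && errno == EINTR)
			continue;       /* interrupted, retry */
		if (ret <= 0)
			return ret;     /* error (-1) or EOF (0) */

		left -= (size_t)ret;
		p += ret;
	}

	return (ssize_t)n;
}
```

Any file that calls read(), readlink() and similar directly would then include unistd.h itself instead of relying on another header to drag it in.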
5 years ago perf tools: No need to include internal/lib.h from util/util.h
Arnaldo Carvalho de Melo [Mon, 23 Sep 2019 19:22:22 +0000 (16:22 -0300)]
perf tools: No need to include internal/lib.h from util/util.h

That include was there only so that users of writen() and readn(), whose
prototypes used to live in util/util.h, would get them without having to
include internal/lib.h themselves. The right way is to include it
directly, and by now all users already do so.

Fix the fallout where readlink() was used but unistd.h was being obtained
by luck through util.h -> internal/lib.h. Next, check why unistd.h is
being included there...

Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'page_size' global variable to libperf
Jiri Olsa [Tue, 6 Aug 2019 13:25:25 +0000 (15:25 +0200)]
libperf: Move 'page_size' global variable to libperf

We need the 'page_size' variable in libperf, so move it there.

Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.

Committer notes:

Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
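
A minimal sketch of the pattern described in the commit above: a library-wide init function that caches the page size obtained via sysconf(). The names page_size_sketch, libperf_init_sketch() and round_up_to_page_sketch() are hypothetical stand-ins, and the 4096 fallback is an assumption for illustration, not something taken from libperf.

```c
#include <stddef.h>  /* size_t */
#include <unistd.h>  /* sysconf(), _SC_PAGE_SIZE */

/* Hypothetical stand-in for the library-wide 'page_size' global. */
unsigned int page_size_sketch;

/* Hypothetical stand-in for libperf_init(): query the value once, at start. */
void libperf_init_sketch(void)
{
	long ret = sysconf(_SC_PAGE_SIZE);

	/* sysconf() returns -1 on error; 4096 is an assumed fallback. */
	page_size_sketch = ret > 0 ? (unsigned int)ret : 4096;
}

/* Example consumer: round a length up to a whole number of pages. */
size_t round_up_to_page_sketch(size_t len)
{
	return (len + page_size_sketch - 1) / page_size_sketch * page_size_sketch;
}
```

Calling the init function once at tool start means later users, such as mmap length calculations, can read the global without repeating the sysconf() call.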
5 years ago libperf: Add perf_evlist__id_add_fd() function
Jiri Olsa [Tue, 3 Sep 2019 09:19:56 +0000 (11:19 +0200)]
libperf: Add perf_evlist__id_add_fd() function

Add the perf_evlist__id_add_fd() function to libperf as an internal
function.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add perf_evlist__id_add() function
Jiri Olsa [Tue, 3 Sep 2019 09:01:04 +0000 (11:01 +0200)]
libperf: Add perf_evlist__id_add() function

Add the perf_evlist__id_add() function to libperf as an internal
function.  We already have the 'heads' member in 'struct perf_evlist'.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add perf_evlist__read_format() function
Jiri Olsa [Tue, 3 Sep 2019 08:54:48 +0000 (10:54 +0200)]
libperf: Add perf_evlist__read_format() function

Add the perf_evlist__read_format() function to libperf as an internal
function.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add perf_evlist__first()/last() functions
Jiri Olsa [Tue, 3 Sep 2019 08:39:52 +0000 (10:39 +0200)]
libperf: Add perf_evlist__first()/last() functions

Add perf_evlist__first()/last() functions to libperf as internal
functions, and rename perf's original functions to evlist__first()/last().

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add perf_evsel__alloc_id/perf_evsel__free_id functions
Jiri Olsa [Tue, 3 Sep 2019 08:34:29 +0000 (10:34 +0200)]
libperf: Add perf_evsel__alloc_id/perf_evsel__free_id functions

Add perf_evsel__alloc_id()/perf_evsel__free_id() functions to libperf as
internal functions.

Move 'struct perf_sample_id' to internal/evsel.h header and change
'struct perf_sample_id::evsel' to 'struct perf_evsel' and the related
code that touches it.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'heads' from 'struct evlist' to 'struct perf_evlist'
Jiri Olsa [Mon, 2 Sep 2019 20:20:12 +0000 (22:20 +0200)]
libperf: Move 'heads' from 'struct evlist' to 'struct perf_evlist'

Move 'heads' hash table from 'struct evlist' to 'struct perf_evlist'.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'ids' from 'struct evsel' to 'struct perf_evsel'
Jiri Olsa [Mon, 2 Sep 2019 20:15:47 +0000 (22:15 +0200)]
libperf: Move 'ids' from 'struct evsel' to 'struct perf_evsel'

Move 'ids' from 'struct evsel' to libperf's 'struct perf_evsel'.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'id' from 'struct evsel' to 'struct perf_evsel'
Jiri Olsa [Mon, 2 Sep 2019 20:12:26 +0000 (22:12 +0200)]
libperf: Move 'id' from 'struct evsel' to 'struct perf_evsel'

Move the 'id' array from 'struct evsel' to libperf's 'struct perf_evsel'.

Committer note:

Fix the tools/perf/util/cs-etm.c build, i.e. aarch64's CoreSight.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'sample_id' from 'struct evsel' to 'struct perf_evsel'
Jiri Olsa [Mon, 2 Sep 2019 20:04:12 +0000 (22:04 +0200)]
libperf: Move 'sample_id' from 'struct evsel' to 'struct perf_evsel'

Move 'sample_id' array from 'struct evsel' to libperf's 'struct perf_evsel'.

Committer notes:

Removed the 'struct xyarray' from util/evsel.h, not needed anymore
there.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add missing 'struct xyarray' forward declaration
Arnaldo Carvalho de Melo [Mon, 23 Sep 2019 18:10:35 +0000 (15:10 -0300)]
libperf: Add missing 'struct xyarray' forward declaration

We were getting it by luck, from headers that happened to be included
before internal/evsel.h in the places where that header is included.

Fixes: 9dfcb7599084 ("libperf: Move fd array from perf's evsel to lobperf's perf_evsel class")
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
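
As an illustration of why a forward declaration is enough in a case like the one above: a header that only stores pointers to a struct does not need the struct's full definition. The sketch below is an assumption-laden example, with perf_evsel_sketch and evsel_sketch__has_fds() as hypothetical names, and is not the real internal/evsel.h.

```c
#include <stddef.h>  /* NULL */

/*
 * Forward declaration: tells the compiler the type exists, which is all
 * that is needed to declare pointers to it. Without this line the header
 * only compiles if some earlier include happened to declare the struct.
 */
struct xyarray;

/* Hypothetical, trimmed-down stand-in for a struct holding such pointers. */
struct perf_evsel_sketch {
	struct xyarray *fd;
	struct xyarray *sample_id;
};

/* Only the pointer value is inspected, so the type can stay incomplete. */
int evsel_sketch__has_fds(const struct perf_evsel_sketch *evsel)
{
	return evsel->fd != NULL;
}
```

Code that dereferences the pointers still has to include the header with the full definition; the forward declaration only makes this header self-sufficient.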
5 years ago libperf: Move 'pollfd' from 'struct evlist' to 'struct perf_evlist'
Jiri Olsa [Tue, 6 Aug 2019 09:28:02 +0000 (11:28 +0200)]
libperf: Move 'pollfd' from 'struct evlist' to 'struct perf_evlist'

Move 'pollfd' from 'struct evlist' to 'struct perf_evlist', as it will be
used in the following patches.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'mmap_len' from 'struct evlist' to 'struct perf_evlist'
Jiri Olsa [Tue, 6 Aug 2019 13:14:05 +0000 (15:14 +0200)]
libperf: Move 'mmap_len' from 'struct evlist' to 'struct perf_evlist'

Move 'mmap_len' from 'struct evlist' to 'struct perf_evlist', as it will
be used in the following patches.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist'
Jiri Olsa [Tue, 30 Jul 2019 11:04:59 +0000 (13:04 +0200)]
libperf: Move 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist'

Move 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', as it will
be used in the following patches.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'
Jiri Olsa [Tue, 6 Aug 2019 09:35:19 +0000 (11:35 +0200)]
libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'

Move the 'system_wide' member from perf's evsel to libperf's perf_evsel.

Committer notes:

Added stdbool.h as we now use bool here.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'flush' to 'struct perf_mmap'
Jiri Olsa [Tue, 27 Aug 2019 14:05:18 +0000 (16:05 +0200)]
libperf: Add 'flush' to 'struct perf_mmap'

Move 'flush' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'event_copy' to 'struct perf_mmap'
Jiri Olsa [Sat, 27 Jul 2019 20:47:58 +0000 (22:47 +0200)]
libperf: Add 'event_copy' to 'struct perf_mmap'

Move 'event_copy' from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add linux/compiler.h as we need it for '__aligned'.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'overwrite' to 'struct perf_mmap'
Jiri Olsa [Sat, 27 Jul 2019 20:42:56 +0000 (22:42 +0200)]
libperf: Add 'overwrite' to 'struct perf_mmap'

Move 'overwrite' from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add stdbool.h as we start using 'bool'.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add prev/start/end to struct perf_mmap
Jiri Olsa [Sat, 27 Jul 2019 20:39:53 +0000 (22:39 +0200)]
libperf: Add prev/start/end to struct perf_mmap

Move prev/start/end from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add linux/types.h as we use u64.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'refcnt' to struct perf_mmap
Jiri Olsa [Sat, 27 Jul 2019 20:35:35 +0000 (22:35 +0200)]
libperf: Add 'refcnt' to struct perf_mmap

Move 'refcnt' from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add the refcount.h include directive here, now it is needed.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'cpu' to struct perf_mmap
Jiri Olsa [Sat, 27 Jul 2019 20:33:20 +0000 (22:33 +0200)]
libperf: Add 'cpu' to struct perf_mmap

Move 'cpu' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'fd' to struct perf_mmap
Jiri Olsa [Sat, 27 Jul 2019 20:31:17 +0000 (22:31 +0200)]
libperf: Add 'fd' to struct perf_mmap

Move 'fd' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add 'mask' to struct perf_mmap
Jiri Olsa [Sat, 27 Jul 2019 20:27:55 +0000 (22:27 +0200)]
libperf: Add 'mask' to struct perf_mmap

Move 'mask' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Add perf_mmap struct
Jiri Olsa [Sat, 27 Jul 2019 20:07:44 +0000 (22:07 +0200)]
libperf: Add perf_mmap struct

Add the perf_mmap struct to libperf.

The definition is added into:

  include/internal/mmap.h

which is not to be included by users, but shared within perf and
libperf.

Committer notes:

Remove unnecessary includes from tools/perf/lib/include/internal/mmap.h,
those will be readded as they become necessary, later in the series.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf evlist: Adopt backwards ring buffer state enum
Arnaldo Carvalho de Melo [Mon, 23 Sep 2019 15:20:38 +0000 (12:20 -0300)]
perf evlist: Adopt backwards ring buffer state enum

This enum isn't used at all in mmap.h, only in evlist.h, so move it to
where it is used to cut down the header dependency tree.

Also add mmap.h to the places using it but previously getting it
indirectly via evlist.h.

Add missing pthread.h to evlist.h, as it has a pthread_t struct member
and was getting the header via mmap.h.

Noticed while processing one of Jiri's libperf batches touching mmap.h,
where almost everything gets rebuilt because evlist.h is so popular, so
cut down this rebuild-the-world party.

Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libperf: Link libapi.a in libperf.so
Jiri Olsa [Sun, 18 Aug 2019 21:02:58 +0000 (23:02 +0200)]
libperf: Link libapi.a in libperf.so

Link libapi.a into libperf.so, because we are about to use some of the
API functions in it.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf tools: Rename perf_evlist__purge() to evlist__purge()
Jiri Olsa [Thu, 5 Sep 2019 08:11:37 +0000 (10:11 +0200)]
perf tools: Rename perf_evlist__purge() to evlist__purge()

Rename perf_evlist__purge() to evlist__purge(), so we don't have a
name clash when we add perf_evlist__purge() in libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf tools: Rename perf_evlist__exit() to evlist__exit()
Jiri Olsa [Mon, 2 Sep 2019 12:34:52 +0000 (14:34 +0200)]
perf tools: Rename perf_evlist__exit() to evlist__exit()

Rename perf_evlist__exit() to evlist__exit(), so we don't have a name
clash when we add perf_evlist__exit() to libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf tools: Rename perf_evlist__alloc_mmap() to evlist__alloc_mmap()
Jiri Olsa [Fri, 16 Aug 2019 14:21:46 +0000 (16:21 +0200)]
perf tools: Rename perf_evlist__alloc_mmap() to evlist__alloc_mmap()

Rename perf_evlist__alloc_mmap() to evlist__alloc_mmap(), so we don't
have a name clash when we add perf_evlist__alloc_mmap() to libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf tools: Rename perf_evlist__munmap() to evlist__munmap()
Jiri Olsa [Fri, 16 Aug 2019 14:19:55 +0000 (16:19 +0200)]
perf tools: Rename perf_evlist__munmap() to evlist__munmap()

Rename perf_evlist__munmap() to evlist__munmap(), so we don't have a
name clash when we add perf_evlist__munmap() in libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf tools: Rename perf_evlist__mmap() to evlist__mmap()
Jiri Olsa [Sun, 28 Jul 2019 10:45:35 +0000 (12:45 +0200)]
perf tools: Rename perf_evlist__mmap() to evlist__mmap()

Rename perf_evlist__mmap() to evlist__mmap(), so we don't have a name
clash when we add perf_evlist__mmap() in libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf tools: Rename 'struct perf_mmap' to 'struct mmap'
Jiri Olsa [Sat, 27 Jul 2019 18:30:53 +0000 (20:30 +0200)]
perf tools: Rename 'struct perf_mmap' to 'struct mmap'

Rename 'struct perf_mmap' to 'struct mmap', so we don't have a name
clash when we add 'struct perf_mmap' to libperf.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago tools: Add missing stdio.h include to asm/bug.h header
Jiri Olsa [Thu, 12 Sep 2019 08:57:18 +0000 (10:57 +0200)]
tools: Add missing stdio.h include to asm/bug.h header

We have a direct fprintf() call in the header, so we need to include
stdio.h; otherwise compilation could fail if there is no prior stdio.h
include directive.

Signed-off-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
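
The failure mode described in the commit above is easy to miss when the fprintf() call sits inside a macro, since the header itself still compiles and the error only shows up where the macro is expanded. A small sketch, assuming a hypothetical WARN_SKETCH() macro rather than the real contents of asm/bug.h:

```c
/*
 * Header-style sketch. The macro below expands to an fprintf() call at
 * every use site, so this file includes stdio.h itself instead of hoping
 * that each user included it first.
 */
#include <stdio.h>

/*
 * Hypothetical warning macro: prints msg to stderr when cond is true and
 * evaluates to 1 in that case, 0 otherwise.
 */
#define WARN_SKETCH(cond, msg) \
	((cond) ? (fprintf(stderr, "%s\n", (msg)), 1) : 0)
```

With the include in place, a source file can include only this header and still write something like `if (WARN_SKETCH(ptr == NULL, "ptr is NULL")) return -1;` without a build failure.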
5 years ago libtraceevent: Man pages for tep plugins APIs
Tzvetomir Stoyanov [Thu, 19 Sep 2019 21:23:40 +0000 (17:23 -0400)]
libtraceevent: Man pages for tep plugins APIs

Create man pages for libtraceevent APIs:

  tep_load_plugins(),
  tep_unload_plugin()

Signed-off-by: Tzvetomir Stoyanov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libtraceevent: Move traceevent plugins in its own subdirectory
Tzvetomir Stoyanov (VMware) [Thu, 19 Sep 2019 21:23:41 +0000 (17:23 -0400)]
libtraceevent: Move traceevent plugins in its own subdirectory

All traceevent plugin code is moved to the tools/lib/traceevent/plugins
subdirectory. This makes the traceevent implementation in trace-cmd and in
the kernel tree consistent. There are no changes in the way libtraceevent
and plugins are compiled and installed.

Committer notes:

Applied fixup provided by Steven, fixing the tools/perf/Makefile.perf
target for the plugin dynamic list file. Problem noticed when cross
building to aarch64 from an Ubuntu 19.04 container.

Suggested-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libtraceevent: Add tep_get_event() in event-parse.h
Tzvetomir Stoyanov (VMware) [Thu, 19 Sep 2019 21:23:39 +0000 (17:23 -0400)]
libtraceevent: Add tep_get_event() in event-parse.h

The tep_get_event() function is an official libtraceevent API, described
in the library man pages. However, it cannot be used by library users because
it is not declared in the event-parse.h file, where all libtraceevent APIs are.
Add the function declaration to the event-parse.h file.

Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libtraceevent: Man pages fix, changes in event printing APIs
Tzvetomir Stoyanov (VMware) [Thu, 19 Sep 2019 21:23:38 +0000 (17:23 -0400)]
libtraceevent: Man pages fix, changes in event printing APIs

The APIs for printing various trace event information were redesigned to
be simpler. However, the main libtraceevent man page was not updated
with those changes. Update the documentation to describe the new
event print API.

Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libtraceevent: Man pages fix, rename tep_ref_get() to tep_get_ref()
Tzvetomir Stoyanov (VMware) [Thu, 19 Sep 2019 21:23:37 +0000 (17:23 -0400)]
libtraceevent: Man pages fix, rename tep_ref_get() to tep_get_ref()

tep_ref_get() was renamed to tep_get_ref(), to be more consistent
with the other tep_ref_* APIs. However, the man pages still list the API
under the old name. Fix the documentation to reflect the actual name of
the API.

Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Tzvetomir Stoyanov (VMware) <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libtraceevent: Man pages for libtraceevent event print related API
Tzvetomir Stoyanov [Thu, 19 Sep 2019 21:23:36 +0000 (17:23 -0400)]
libtraceevent: Man pages for libtraceevent event print related API

Add a new man page describing the tep_print_event() libtraceevent API.

Signed-off-by: Tzvetomir Stoyanov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago libtraceevent: Round up in tep_print_event() time precision
Steven Rostedt (VMware) [Thu, 19 Sep 2019 20:51:19 +0000 (16:51 -0400)]
libtraceevent: Round up in tep_print_event() time precision

When testing the output of the old trace-cmd compared to the one that
uses the updated tep_print_event() logic, it was different in that the
time stamp precision in the old format would round up to the nearest
precision, whereas the new logic truncates. Bring back the old method
of rounding up.

Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Tzvetomir Stoyanov <[email protected]>
Cc: linux trace devel <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
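
The difference between truncating and rounding at a given precision, as described in the commit above, can be sketched as below. This is a hedged illustration only: format_ts_sketch() is a hypothetical function, the nanosecond input unit and the round-half-up rule are assumptions, and none of it is the actual tep_print_event() code.

```c
#include <inttypes.h>  /* uint64_t, PRIu64 */
#include <stdio.h>     /* snprintf() */

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/*
 * Format a nanosecond timestamp as "secs.frac" with 'prec' fractional
 * digits (clamped to 0..9). With do_round set, half of the last printed
 * unit is added first, so the value is rounded instead of truncated.
 */
int format_ts_sketch(char *buf, size_t len, uint64_t ns, int prec, int do_round)
{
	uint64_t unit = 1;  /* nanoseconds per unit of the last printed digit */
	uint64_t secs, frac;
	int i;

	if (prec < 0)
		prec = 0;
	if (prec > 9)
		prec = 9;

	for (i = prec; i < 9; i++)
		unit *= 10;

	if (do_round)
		ns += unit / 2;

	secs = ns / NSEC_PER_SEC_SKETCH;
	frac = (ns % NSEC_PER_SEC_SKETCH) / unit;

	if (prec == 0)
		return snprintf(buf, len, "%" PRIu64, secs);

	return snprintf(buf, len, "%" PRIu64 ".%0*" PRIu64, secs, prec, frac);
}
```

For a timestamp of 1234567890 ns at microsecond precision, the truncating path yields 1.234567 and the rounding path yields 1.234568. Adding the half unit before splitting into seconds and fraction also lets a value such as 1999999600 ns carry over to 2.000000.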
5 years ago perf list: Allow plurals for metric, metricgroup
Kim Phillips [Thu, 19 Sep 2019 20:43:05 +0000 (15:43 -0500)]
perf list: Allow plurals for metric, metricgroup

Enhance usability by allowing the command line parameter to use the same
plural form that is used in the output title.

BEFORE, perf deceitfully acts as if there are no metrics to be had:

  $ perf list metrics

  List of pre-defined events (to be used in -e):

  Metric Groups:

  $

But singular 'metric' shows a list of metrics:

  $ perf list metric

  List of pre-defined events (to be used in -e):

  Metrics:

    IPC
         [Instructions Per Cycle (per logical thread)]
    UPI
         [Uops Per Instruction]

AFTER, when asking for 'metrics', we actually see the metrics get listed:

  $ perf list metrics

  List of pre-defined events (to be used in -e):

  Metrics:

    IPC
         [Instructions Per Cycle (per logical thread)]
    UPI
         [Uops Per Instruction]

Fixes: 71b0acce78d1 ("perf list: Add metric groups to perf list")
Signed-off-by: Kim Phillips <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Luke Mujica <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
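
One way to accept both spellings of an argument, as the commit above describes for 'metric' and 'metricgroup', is a small comparison helper. The sketch below is an assumption about how such a check could look; matches_singular_or_plural_sketch() is a hypothetical name and this is not the code from perf's list command.

```c
#include <stdbool.h>
#include <string.h>  /* strlen(), strncmp() */

/*
 * Hypothetical helper: true when 'arg' equals 'singular' exactly or
 * 'singular' followed by a single trailing 's'.
 */
bool matches_singular_or_plural_sketch(const char *arg, const char *singular)
{
	size_t len = strlen(singular);

	if (strncmp(arg, singular, len) != 0)
		return false;

	/* Either the string ends here, or it ends right after one 's'. */
	return arg[len] == '\0' || (arg[len] == 's' && arg[len + 1] == '\0');
}
```

An argument parser could then treat "metric" and "metrics" the same way, while still telling "metricgroup" apart from "metric".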
5 years ago perf vendor events: Minor fixes to the README
Kim Phillips [Thu, 19 Sep 2019 20:43:04 +0000 (15:43 -0500)]
perf vendor events: Minor fixes to the README

Some grammatical fixes, and updates to some path references that have
since changed.

Signed-off-by: Kim Phillips <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Luke Mujica <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years ago perf vendor events amd: Remove redundant '['
Kim Phillips [Thu, 19 Sep 2019 20:43:03 +0000 (15:43 -0500)]
perf vendor events amd: Remove redundant '['

Remove the redundant '['.

'perf list' output before:

  ex_ret_brn
       [[Retired Branch Instructions]

'perf list' output after:

  ex_ret_brn
       [Retired Branch Instructions]

Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h")
Signed-off-by: Kim Phillips <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Luke Mujica <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years agoperf vendor events amd: Add L3 cache events for Family 17h
Kim Phillips [Thu, 19 Sep 2019 20:43:02 +0000 (15:43 -0500)]
perf vendor events amd: Add L3 cache events for Family 17h

Allow users to symbolically specify L3 events for Family 17h processors
using the existing AMD Uncore driver.

The event descriptions come from section 2.1.15.4.1 "L3 Cache PMC
Events" of the latest Family 17h PPR, available here:

  https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip

Only BriefDescriptions are added, since they show both with and without
the -v and --details flags.

Tested with:

 # perf stat -e l3_request_g1.caching_l3_cache_accesses,amd_l3/event=0x01,umask=0x80/,l3_comb_clstr_state.request_miss,amd_l3/event=0x06,umask=0x01/ perf bench mem memcpy -s 4mb -l 100 -f default
...
         7,006,831      l3_request_g1.caching_l3_cache_accesses
         7,006,830      amd_l3/event=0x01,umask=0x80/
           366,530      l3_comb_clstr_state.request_miss
           366,568      amd_l3/event=0x06,umask=0x01/

Signed-off-by: Kim Phillips <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Luke Mujica <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
5 years agonet: macb: Remove dead code
Shubhrajyoti Datta [Mon, 23 Sep 2019 08:33:51 +0000 (14:03 +0530)]
net: macb: Remove dead code

macb_64b_desc is always called when HW_DMA_CAP_64B is defined.
So the return NULL can never be reached. Remove the dead code.

Signed-off-by: Shubhrajyoti Datta <[email protected]>
Reviewed-by: Claudiu Beznea <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agonet: stmmac: selftests: Flow Control test can also run with ASYM Pause
Jose Abreu [Mon, 23 Sep 2019 07:49:08 +0000 (09:49 +0200)]
net: stmmac: selftests: Flow Control test can also run with ASYM Pause

The Flow Control selftest is also available with ASYM Pause. Let's add
this check to the test and fix possible false positive failures.

Fixes: 091810dbded9 ("net: stmmac: Introduce selftests support")
Signed-off-by: Jose Abreu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agogianfar: Make reset_gfar static
YueHaibing [Mon, 23 Sep 2019 06:16:03 +0000 (14:16 +0800)]
gianfar: Make reset_gfar static

Fix sparse warning:

drivers/net/ethernet/freescale/gianfar.c:2070:6:
 warning: symbol 'reset_gfar' was not declared. Should it be static?

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agoatm: he: clean up an indentation issue
Colin Ian King [Sun, 22 Sep 2019 11:42:16 +0000 (13:42 +0200)]
atm: he: clean up an indentation issue

There is a statement that is indented one level too deeply. Remove
the extraneous tab.

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agoppp: Fix memory leak in ppp_write
Takeshi Misawa [Sun, 22 Sep 2019 07:45:31 +0000 (16:45 +0900)]
ppp: Fix memory leak in ppp_write

When ppp is closing, __ppp_xmit_process() fails to enqueue the skb,
and the skb allocated in ppp_write() is leaked.

syzbot reported:
BUG: memory leak
unreferenced object 0xffff88812a17bc00 (size 224):
  comm "syz-executor673", pid 6952, jiffies 4294942888 (age 13.040s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000d110fff9>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
    [<00000000d110fff9>] slab_post_alloc_hook mm/slab.h:522 [inline]
    [<00000000d110fff9>] slab_alloc_node mm/slab.c:3262 [inline]
    [<00000000d110fff9>] kmem_cache_alloc_node+0x163/0x2f0 mm/slab.c:3574
    [<000000002d616113>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:197
    [<000000000167fc45>] alloc_skb include/linux/skbuff.h:1055 [inline]
    [<000000000167fc45>] ppp_write+0x48/0x120 drivers/net/ppp/ppp_generic.c:502
    [<000000009ab42c0b>] __vfs_write+0x43/0xa0 fs/read_write.c:494
    [<00000000086b2e22>] vfs_write fs/read_write.c:558 [inline]
    [<00000000086b2e22>] vfs_write+0xee/0x210 fs/read_write.c:542
    [<00000000a2b70ef9>] ksys_write+0x7c/0x130 fs/read_write.c:611
    [<00000000ce5e0fdd>] __do_sys_write fs/read_write.c:623 [inline]
    [<00000000ce5e0fdd>] __se_sys_write fs/read_write.c:620 [inline]
    [<00000000ce5e0fdd>] __x64_sys_write+0x1e/0x30 fs/read_write.c:620
    [<00000000d9d7b370>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:296
    [<0000000006e6d506>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix this by freeing the skb if ppp is closing.
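
The ownership rule behind this fix can be sketched in user-space C. This
is only an illustration, not the driver code: struct pkt, try_enqueue()
and write_pkt() are hypothetical stand-ins for the skb, the transmit path
and ppp_write(), and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb; not a kernel type. */
struct pkt {
	size_t len;
};

/* Counts packets that are allocated but neither consumed nor freed. */
static int live_pkts;

/* Hypothetical transmit step: takes ownership only if not closing. */
bool try_enqueue(struct pkt *p, bool closing)
{
	assert(p);
	if (closing)
		return false;	/* the caller still owns p */
	free(p);		/* pretend the queue consumed the packet */
	live_pkts--;
	return true;
}

/* Returns 0 on success, -1 if the packet was not sent or not allocated. */
int write_pkt(size_t len, bool closing)
{
	struct pkt *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->len = len;
	live_pkts++;

	if (!try_enqueue(p, closing)) {
		/* The point of the fix: free the packet on refusal. */
		free(p);
		live_pkts--;
		return -1;
	}
	return 0;
}
```

With the free in the refusal branch, live_pkts returns to zero on both
the normal path and the closing path.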

Fixes: 6d066734e9f0 ("ppp: avoid loop in xmit recursion detection code")
Reported-and-tested-by: [email protected]
Signed-off-by: Takeshi Misawa <[email protected]>
Reviewed-by: Guillaume Nault <[email protected]>
Tested-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agoMerge branch 'ibmvnic-serialization-fixes'
David S. Miller [Wed, 25 Sep 2019 11:41:41 +0000 (13:41 +0200)]
Merge branch 'ibmvnic-serialization-fixes'

Juliet Kim says:

====================
net/ibmvnic: serialization fixes

This series includes two fixes. The first improves reset code to allow
linkwatch_event to proceed during reset. The second ensures that no more
than one thread runs in reset at a time.

v2:
- Separate change param reset from do_reset()
- Return IBMVNIC_OPEN_FAILED if __ibmvnic_open fails
- Remove setting wait_for_reset to false from __ibmvnic_reset(), this
  is done in wait_for_reset()
- Move the check for force_reset_recovery from patch 1 to patch 2

v3:
- Restore reset’s successful return in open failure case

v4:
- Change resetting flag access to atomic
====================

Signed-off-by: David S. Miller <[email protected]>
5 years agonet/ibmvnic: prevent more than one thread from running in reset
Juliet Kim [Fri, 20 Sep 2019 20:11:23 +0000 (16:11 -0400)]
net/ibmvnic: prevent more than one thread from running in reset

The current code allows more than one thread to run in reset. This can
corrupt struct adapter data. Check adapter->resetting before performing
a reset; if another reset is running, delay (100 msec) before trying
again.
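
A minimal user-space sketch of such a guard, assuming a C11 atomic_flag
as the "resetting" marker. The struct and function names here are
hypothetical, and the driver itself may use different primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical adapter state; only the reset guard is modelled. */
struct adapter {
	atomic_flag resetting;
};

void adapter_init(struct adapter *a)
{
	assert(a);
	atomic_flag_clear(&a->resetting);
}

/* Returns true if the caller won the right to run the reset. */
bool reset_try_begin(struct adapter *a)
{
	/* test-and-set returns the previous value: false means we got it */
	return !atomic_flag_test_and_set(&a->resetting);
}

/* Marks the reset as finished so that the next one may start. */
void reset_end(struct adapter *a)
{
	atomic_flag_clear(&a->resetting);
}
```

A caller for which reset_try_begin() returns false would wait (the
commit uses 100 msec) and then try again.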

Signed-off-by: Juliet Kim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agonet/ibmvnic: unlock rtnl_lock in reset so linkwatch_event can run
Juliet Kim [Fri, 20 Sep 2019 20:11:22 +0000 (16:11 -0400)]
net/ibmvnic: unlock rtnl_lock in reset so linkwatch_event can run

Commit a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset")
made the change to hold the RTNL lock during a reset to avoid deadlock,
but linkwatch_event is fired during the reset and needs the RTNL lock.
That keeps the linkwatch_event process from proceeding until the reset
is complete. The reset process cannot tolerate the linkwatch_event
processing after reset completes, so release the RTNL lock during the
process to allow a chance for linkwatch_event to run during reset.
This does not guarantee that the linkwatch_event will be processed as
soon as link state changes, but is an improvement over the current code
where linkwatch_event processing is always delayed, which prevents
transmissions on the device from being deactivated, leading the transmit
watchdog timer to time out.

Release the RTNL lock before link state change and re-acquire after
the link state change to allow linkwatch_event to grab the RTNL lock
and run during the reset.

Fixes: a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset")
Signed-off-by: Juliet Kim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
5 years agotracing/probe: Fix same probe event argument matching
Srikar Dronamraju [Tue, 24 Sep 2019 11:49:06 +0000 (17:19 +0530)]
tracing/probe: Fix same probe event argument matching

Commit fe60b0ce8e73 ("tracing/probe: Reject exactly same probe event")
tries to reject an event which matches an already existing probe.

However, it currently continues to match arguments and rejects adding a
probe even when the arguments don't match. Fix this by rejecting a
probe if and only if all the arguments match.
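
The intended rule can be sketched as a plain comparison helper. This is
an illustration only: probe_args_all_match() is a hypothetical name, and
arguments are modelled as strings rather than the kernel's probe
argument structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true only when both probes have the same number of arguments
 * and every argument string matches. Any mismatch means "not the same
 * probe", so the new probe must not be rejected.
 */
bool probe_args_all_match(const char *const *a, size_t na,
			  const char *const *b, size_t nb)
{
	size_t i;

	assert(a && b);
	if (na != nb)
		return false;
	for (i = 0; i < na; i++) {
		if (strcmp(a[i], b[i]) != 0)
			return false;	/* first difference settles it */
	}
	return true;
}
```

Only a true result would justify rejecting the new probe as a duplicate.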

Link: http://lkml.kernel.org/r/[email protected]
Fixes: fe60b0ce8e73 ("tracing/probe: Reject exactly same probe event")
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Srikar Dronamraju <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
5 years agonetfilter: nf_tables: bogus EBUSY when deleting flowtable after flush
Laura Garcia Liebana [Tue, 24 Sep 2019 12:42:44 +0000 (14:42 +0200)]
netfilter: nf_tables: bogus EBUSY when deleting flowtable after flush

The deletion of a flowtable after a flush in the same transaction
results in EBUSY. This patch adds an activation and deactivation of
flowtables in order to update the _use_ counter.

Signed-off-by: Laura Garcia Liebana <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agonetfilter: ebtables: use __u8 instead of uint8_t in uapi header
Masahiro Yamada [Mon, 23 Sep 2019 22:40:06 +0000 (07:40 +0900)]
netfilter: ebtables: use __u8 instead of uint8_t in uapi header

When CONFIG_UAPI_HEADER_TEST=y, exported headers are compile-tested to
make sure they can be included from user-space.

Currently, linux/netfilter_bridge/ebtables.h is excluded from the test
coverage. To make it join the compile-test, we need to fix the build
errors attached below.

For a case like this, we decided to use __u{8,16,32,64} variable types
in this discussion:

  https://lkml.org/lkml/2019/6/5/18

Build log:

  CC      usr/include/linux/netfilter_bridge/ebtables.h.s
In file included from <command-line>:32:0:
./usr/include/linux/netfilter_bridge/ebtables.h:126:4: error: unknown type name ‘uint8_t’
    uint8_t revision;
    ^~~~~~~
./usr/include/linux/netfilter_bridge/ebtables.h:139:4: error: unknown type name ‘uint8_t’
    uint8_t revision;
    ^~~~~~~
./usr/include/linux/netfilter_bridge/ebtables.h:152:4: error: unknown type name ‘uint8_t’
    uint8_t revision;
    ^~~~~~~
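
As an illustration of the convention, the sketch below declares a
UAPI-style structure with __u8. It assumes a Linux system with the
kernel UAPI headers installed; struct sketch_match_info and
sketch_revision() are hypothetical and do not mirror the real ebtables
layout.

```c
#include <assert.h>
#include <linux/types.h>	/* provides __u8, __u16, __u32, __u64 */

/*
 * Hypothetical UAPI-style structure (not the real ebtables layout).
 * It compiles without <stdint.h>, which an exported header cannot
 * assume the including program has pulled in.
 */
struct sketch_match_info {
	char name[31];
	__u8 revision;
};

/* Returns the revision as a plain unsigned int. */
unsigned int sketch_revision(const struct sketch_match_info *m)
{
	assert(m);
	return m->revision;
}
```

Because <linux/types.h> is included by the header itself, user space
does not need to include anything first for the type to resolve.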

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
5 years agoRevert "locking/pvqspinlock: Don't wait if vCPU is preempted"
Wanpeng Li [Mon, 9 Sep 2019 01:40:28 +0000 (09:40 +0800)]
Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"

This patch reverts commit 75437bb304b20 (locking/pvqspinlock: Don't
wait if vCPU is preempted), which caused a large performance regression
in over-subscription scenarios.

The test was run on a Xeon Skylake box, 2 sockets, 40 cores, 80 threads,
with three VMs of 80 vCPUs each.  The score of ebizzy -M is reduced from
13000-14000 records/s to 1700-1800 records/s:

Host                              Guest       Score

vanilla w/o kvm optimizations     upstream    1700-1800 records/s
vanilla w/o kvm optimizations     revert      13000-14000 records/s
vanilla w/ kvm optimizations      upstream    4500-5000 records/s
vanilla w/ kvm optimizations      revert      14000-15500 records/s

Exit from aggressive wait-early mechanism can result in premature yield
and extra scheduling latency.

Actually, only 6% of wait_early events are caused by vcpu_is_preempted()
being true.  However, when one vCPU voluntarily releases its vCPU, all
the subsequent waiters in the queue will do the same and the cascading
effect leads to bad performance.

kvm optimizations:
[1] commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts)
[2] commit 266e85a5ec9 (KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption)

Tested-by: [email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 75437bb304b20 (locking/pvqspinlock: Don't wait if vCPU is preempted)
Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
5 years agodt-bindings: pwm: Update bindings for MT7629 SoC
Ryder Lee [Thu, 19 Sep 2019 22:49:10 +0000 (06:49 +0800)]
dt-bindings: pwm: Update bindings for MT7629 SoC

This updates bindings for MT7629 PWM controller.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Sam Shih <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Reviewed-by: Matthias Brugger <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
5 years agopwm: mediatek: Update license and switch to SPDX tag
Sam Shih [Thu, 19 Sep 2019 22:49:06 +0000 (06:49 +0800)]
pwm: mediatek: Update license and switch to SPDX tag

Add SPDX identifiers to pwm-mediatek.c. Update MODULE_LICENSE to
correctly reflect the GNU General Public License v2.0.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Sam Shih <[email protected]>
Reviewed-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
5 years agopwm: mediatek: Use pwm_mediatek as common prefix
Sam Shih [Thu, 19 Sep 2019 22:49:05 +0000 (06:49 +0800)]
pwm: mediatek: Use pwm_mediatek as common prefix

Use pwm_mediatek as common prefix to match the filename. No functional
change intended.

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Sam Shih <[email protected]>
Acked-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
5 years agopwm: mediatek: Allocate the clks array dynamically
Sam Shih [Thu, 19 Sep 2019 22:49:04 +0000 (06:49 +0800)]
pwm: mediatek: Allocate the clks array dynamically

Instead of using fixed-size arrays, allocate the memory for them
based on the number of PWMs specified for each SoC generation.
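
A user-space sketch of the idea, with calloc() standing in for the
kernel allocator. struct soc_data, struct clk_handle and
alloc_pwm_clks() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-SoC description; only the PWM count matters here. */
struct soc_data {
	unsigned int num_pwms;
};

/* Hypothetical clock handle. */
struct clk_handle {
	unsigned long rate_hz;
};

/*
 * Allocates one zeroed clock slot per PWM channel of the given SoC.
 * Returns NULL on allocation failure; the caller releases the array
 * with free().
 */
struct clk_handle *alloc_pwm_clks(const struct soc_data *soc)
{
	assert(soc && soc->num_pwms > 0);
	return calloc(soc->num_pwms, sizeof(struct clk_handle));
}
```

The array is sized from the SoC description rather than from a
compile-time maximum.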

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Sam Shih <[email protected]>
Reviewed-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
5 years agopwm: mediatek: Remove the has_clks field
Sam Shih [Thu, 19 Sep 2019 22:49:03 +0000 (06:49 +0800)]
pwm: mediatek: Remove the has_clks field

We can use fixed clocks to repair mt7628 PWM during configure from
userspace. The SoC is legacy MIPS and has no complex clock tree. Because
we can get the clock frequency for period calculation from fixed clocks
specified in DT, we can remove the has_clks field, and directly use
devm_clk_get() and clk_get_rate().

Signed-off-by: Ryder Lee <[email protected]>
Signed-off-by: Sam Shih <[email protected]>
Acked-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
5 years agowil6210: use after free in wil_netif_rx_any()
Dan Carpenter [Sat, 21 Sep 2019 06:01:45 +0000 (09:01 +0300)]
wil6210: use after free in wil_netif_rx_any()

The debug code dereferences "skb" to print "skb->len" so we have to
print the message before we free "skb".
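
The ordering can be sketched in user-space C as follows. struct buf and
log_then_free() are hypothetical stand-ins, and the message text is
made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb. */
struct buf {
	size_t len;
};

/*
 * Consumes b and returns the length that was logged. The debug line
 * reads b->len first; only then is the buffer freed.
 */
size_t log_then_free(struct buf *b)
{
	size_t len;

	assert(b);
	len = b->len;
	fprintf(stderr, "Rx drop %zu bytes\n", len);
	free(b);		/* b must not be touched after this point */
	return len;
}
```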

Fixes: f99fe49ff372 ("wil6210: add wil_netif_rx() helper function")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
5 years agothermal: db8500: Rewrite to be a pure OF sensor
Linus Walleij [Wed, 28 Aug 2019 13:03:20 +0000 (15:03 +0200)]
thermal: db8500: Rewrite to be a pure OF sensor

This patch rewrites the DB8500 thermal sensor to be a
pure OF sensor, so that it can be used with thermal zones
defined in the device tree.

This driver was initially merged before we had generic
thermal zone device tree bindings, and now it gets
modernized to the way we do things these days.

The old driver depended on a set of trigger points
provided in the device tree or platform data to
interpolate the current temperature between trigger
points depending on whether the trend was rising or
falling. This was bad because the trigger points should
be used for defining temperature zone policies and
bind to cooling devices.

As the PRCMU (power reset control management unit) can
only issue IRQs when we pass temperature trigger points
upward or downward, we instead define a number of
temperature points inside the driver ranging from
15 to 100 degrees celsius. The effect is that when
we register the device we quickly trigger 15, 20 ... up
to the room temperature in succession and then we
get continuous event IRQs also under normal operating
conditions, and the temperature of the system is now
reported more accurately (+/- 2.5 degrees celsius).
In the past the first trigger point was at 70
degrees and the average temperature was simply reported
as 35 degrees celsius (between 70 degrees and 0) until
we passed 70 degrees, which didn't accurately represent
the temperature of the system.
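
A rough sketch of the midpoint estimate this implies. The 5-degree
spacing is inferred from the "15, 20 ..." sequence and the +/- 2.5
degree accuracy stated above; the array and function names are
hypothetical and values are in millidegrees celsius.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 5-degree spacing from 15 to 100 degrees, in millidegrees. */
static const long points_mc[] = {
	15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000, 55000,
	60000, 65000, 70000, 75000, 80000, 85000, 90000, 95000, 100000,
};
#define NUM_POINTS (sizeof(points_mc) / sizeof(points_mc[0]))

/*
 * idx is the index of the lowest point the temperature is still below
 * (1 <= idx < NUM_POINTS). The estimate is the midpoint of the interval
 * [points_mc[idx - 1], points_mc[idx]], so it is off by at most 2500.
 */
long estimate_temp_mc(size_t idx)
{
	assert(idx >= 1 && idx < NUM_POINTS);
	return (points_mc[idx - 1] + points_mc[idx]) / 2;
}
```

For example, a temperature known to lie between the 20 and 25 degree
points is reported as 22.5 degrees.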

As a result of dropping all the trigger points from the
driver and reusing the core DT thermal zone management
code we reduce the code footprint quite a bit.

Cc: Vincent Guittot <[email protected]>
Suggested-by: Daniel Lezcano <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Reviewed-by: Daniel Lezcano <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
5 years agothermal: db8500: Use dev helper variable
Linus Walleij [Wed, 28 Aug 2019 13:03:19 +0000 (15:03 +0200)]
thermal: db8500: Use dev helper variable

The code gets easier to read like this.

Cc: Vincent Guittot <[email protected]>
Reviewed-by: Daniel Lezcano <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
5 years agothermal: db8500: Finalize device tree conversion
Linus Walleij [Wed, 28 Aug 2019 13:03:18 +0000 (15:03 +0200)]
thermal: db8500: Finalize device tree conversion

At some point there was an attempt to convert the DB8500
thermal sensor to device tree: a probe path was added
and the device tree was augmented for the Snowball board.
The switchover was never completed: instead the thermal
devices came from the PRCMU MFD device and the probe
on the Snowball was confused as another set of configuration
appeared from the device tree.

Move over to a device-tree only approach, as we fixed up
the device trees.

Cc: Vincent Guittot <[email protected]>
Acked-by: Lee Jones <[email protected]>
Reviewed-by: Daniel Lezcano <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
5 years agosmb3: Add missing reparse tags
Steve French [Wed, 25 Sep 2019 04:27:34 +0000 (23:27 -0500)]
smb3: Add missing reparse tags

Additional reparse tags were described for WSL and file sync.
Add missing defines for these tags. Some will be useful for
POSIX extensions (as discussed at Storage Developer Conference).

Signed-off-by: Steve French <[email protected]>
Reviewed-by: Aurelien Aptel <[email protected]>
5 years agoMerge branch 'i2c/for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Linus Torvalds [Tue, 24 Sep 2019 23:48:02 +0000 (16:48 -0700)]
Merge branch 'i2c/for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:

 - new driver for ICY, an Amiga Zorro card :)

 - axxia driver gained slave mode support, NXP driver gained ACPI

 - the slave EEPROM backend gained 16 bit address support

 - and lots of regular driver updates and reworks

* 'i2c/for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (52 commits)
  i2c: tegra: Move suspend handling to NOIRQ phase
  i2c: imx: ACPI support for NXP i2c controller
  i2c: uniphier(-f): remove all dev_dbg()
  i2c: uniphier(-f): use devm_platform_ioremap_resource()
  i2c: slave-eeprom: Add comment about address handling
  i2c: exynos5: Remove IRQF_ONESHOT
  i2c: stm32f7: Make structure stm32f7_i2c_algo constant
  i2c: cht-wc: drop check because i2c_unregister_device() is NULL safe
  i2c-eeprom_slave: Add support for more eeprom models
  i2c: fsi: Add of_put_node() before break
  i2c: synquacer: Make synquacer_i2c_ops constant
  i2c: hix5hd2: Remove IRQF_ONESHOT
  i2c: i801: Use iTCO version 6 in Cannon Lake PCH and beyond
  watchdog: iTCO: Add support for Cannon Lake PCH iTCO
  i2c: iproc: Make bcm_iproc_i2c_quirks constant
  i2c: iproc: Add full name of devicetree node to adapter name
  i2c: piix4: Add ACPI support
  i2c: piix4: Fix probing of reserved ports on AMD Family 16h Model 30h
  i2c: ocores: use request_any_context_irq() to register IRQ handler
  i2c: designware: Fix optional reset error handling
  ...

5 years agoMerge tag 'sound-fix-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Tue, 24 Sep 2019 23:46:16 +0000 (16:46 -0700)]
Merge tag 'sound-fix-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A few small remaining wrap-up for this merge window.

  Most of the patches are device-specific (HD-audio and USB-audio quirks,
  FireWire, pcm3168a, fsl, rsnd, Atmel, and TI fixes), while there is a
  simple fix (actually two commits) for ASoC core"

* tag 'sound-fix-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Add DSD support for EVGA NU Audio
  ALSA: hda - Add laptop imic fixup for ASUS M9V laptop
  ASoC: ti: fix SND_SOC_DM365_VOICE_CODEC dependencies
  ASoC: pcm3168a: The codec does not support S32_LE
  ASoC: core: use list_del_init and move it back to soc_cleanup_component
  ALSA: hda/realtek - PCI quirk for Medion E4254
  ALSA: hda - Apply AMD controller workaround for Raven platform
  ASoC: rsnd: do error check after rsnd_channel_normalization()
  ASoC: atmel_ssc_dai: Remove wrong spinlock usage
  ASoC: core: delete component->card_list in soc_remove_component only
  ASoC: fsl_sai: Fix noise when using EDMA
  ALSA: usb-audio: Add Hiby device family to quirks for native DSD support
  ALSA: hda/realtek - Fix alienware headset mic
  ALSA: dice: fix wrong packet parameter for Alesis iO26

5 years agotpm: Wrap the buffer from the caller to tpm_buf in tpm_send()
Jarkko Sakkinen [Mon, 16 Sep 2019 08:38:34 +0000 (11:38 +0300)]
tpm: Wrap the buffer from the caller to tpm_buf in tpm_send()

tpm_send() no longer gives the result back to the caller. Doing so
would require another memcpy(), which kind of tells that the whole
approach is somewhat broken. Instead, as Mimi suggested, this commit
just wraps the caller's buffer in a tpm_buf, so the result is no
longer discarded.

Obviously this assumes that the caller passes a large enough buffer,
which makes the whole API somewhat broken because the result could be
a different size than @buflen. Since trusted keys is the only module
using this API right now, I think that this fix is sufficient for the
moment.

In the near future the plan is to replace the parameters with a tpm_buf
created by the caller.

Reported-by: Mimi Zohar <[email protected]>
Suggested-by: Mimi Zohar <[email protected]>
Cc: [email protected]
Fixes: 412eb585587a ("use tpm_buf in tpm_transmit_cmd() as the IO parameter")
Signed-off-by: Jarkko Sakkinen <[email protected]>
Reviewed-by: Jerry Snitselaar <[email protected]>
5 years agoMAINTAINERS: keys: Update path to trusted.h
Denis Efremov [Thu, 15 Aug 2019 22:12:00 +0000 (01:12 +0300)]
MAINTAINERS: keys: Update path to trusted.h

Update MAINTAINERS record to reflect that trusted.h
was moved to a different directory in commit 22447981fc05
("KEYS: Move trusted.h to include/keys [ver #2]").

Cc: Denis Kenzior <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jarkko Sakkinen <[email protected]>
Cc: Mimi Zohar <[email protected]>
Cc: [email protected]
Signed-off-by: Denis Efremov <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
5 years agoKEYS: trusted: correctly initialize digests and fix locking issue
Roberto Sassu [Fri, 13 Sep 2019 18:51:36 +0000 (20:51 +0200)]
KEYS: trusted: correctly initialize digests and fix locking issue

Commit 0b6cf6b97b7e ("tpm: pass an array of tpm_extend_digest structures to
tpm_pcr_extend()") modifies tpm_pcr_extend() to accept a digest for each
PCR bank. After modification, tpm_pcr_extend() expects that digests are
passed in the same order as the algorithms set in chip->allocated_banks.

This patch fixes two issues introduced in the last iterations of the patch
set: missing initialization of the TPM algorithm ID in the tpm_digest
structures passed to tpm_pcr_extend() by the trusted key module, and
unreleased locks in the TPM driver due to returning from tpm_pcr_extend()
without calling tpm_put_ops().
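
The locking half of the fix follows a common single-exit pattern, which
can be sketched as below. This is not the driver code: struct chip, the
get/put helpers and pcr_extend_sketch() are hypothetical, and a counter
stands in for the lock taken around the TPM operations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical chip; a counter stands in for the held ops lock. */
struct chip {
	int ops_held;
};

void chip_get_ops(struct chip *c) { c->ops_held++; }
void chip_put_ops(struct chip *c) { c->ops_held--; }

/*
 * Every exit path after chip_get_ops() goes through the "out" label,
 * so the error case can no longer return with the ops still held.
 */
int pcr_extend_sketch(struct chip *c, bool digests_valid)
{
	int rc = 0;

	assert(c);
	chip_get_ops(c);

	if (!digests_valid) {
		rc = -1;	/* placeholder error code */
		goto out;
	}
	/* ... the extend operation would run here ... */
out:
	chip_put_ops(c);
	return rc;
}
```

After either a failing or a successful call, ops_held is back at zero.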

Cc: [email protected]
Fixes: 0b6cf6b97b7e ("tpm: pass an array of tpm_extend_digest structures to tpm_pcr_extend()")
Signed-off-by: Roberto Sassu <[email protected]>
Suggested-by: Jarkko Sakkinen <[email protected]>
Reviewed-by: Jerry Snitselaar <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
5 years agoselftests/tpm2: Add log and *.pyc to .gitignore
Petr Vorel [Wed, 11 Sep 2019 09:34:42 +0000 (11:34 +0200)]
selftests/tpm2: Add log and *.pyc to .gitignore

Fixes: 6ea3dfe1e073 ("selftests: add TPM 2.0 tests")
Signed-off-by: Petr Vorel <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
5 years agoselftests/tpm2: Add the missing TEST_FILES assignment
Jarkko Sakkinen [Tue, 10 Sep 2019 20:11:37 +0000 (21:11 +0100)]
selftests/tpm2: Add the missing TEST_FILES assignment

The Python files required by the selftests are not packaged because of
the missing assignment to TEST_FILES. Add the assignment.

Cc: [email protected]
Fixes: 6ea3dfe1e073 ("selftests: add TPM 2.0 tests")
Signed-off-by: Jarkko Sakkinen <[email protected]>
Reviewed-by: Petr Vorel <[email protected]>
5 years agoMerge tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 24 Sep 2019 23:40:21 +0000 (16:40 -0700)]
Merge tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block

Pull more io_uring updates from Jens Axboe:
 "A collection of later fixes and additions, that weren't quite ready
  for pushing out with the initial pull request.

  This contains:

   - Fix potential use-after-free of shadow requests (Jackie)

   - Fix potential OOM crash in request allocation (Jackie)

   - kmalloc+memcpy -> kmemdup cleanup (Jackie)

   - Fix poll crash regression (me)

   - Fix SQ thread not being nice and giving up CPU for !PREEMPT (me)

   - Add support for timeouts, making it easier to do epoll_wait()
     conversions, for instance (me)

   - Ensure io_uring works without f_ops->read_iter() and
     f_ops->write_iter() (me)"
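
For readers unfamiliar with the kmalloc+memcpy -> kmemdup cleanup
mentioned in the list above, here is a user-space analogue of the
idiom. memdup_sketch() is a hypothetical helper built on malloc(); it
is not the kernel's kmemdup() and takes no gfp argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * User-space analogue of the kmemdup() idiom: allocate len bytes and
 * copy src into them in one helper. Returns NULL on allocation
 * failure; the caller frees the result.
 */
void *memdup_sketch(const void *src, size_t len)
{
	void *dst;

	assert(src);
	dst = malloc(len);
	if (dst)
		memcpy(dst, src, len);
	return dst;
}
```

Folding the two steps into one call removes the chance of copying with
a length that differs from the one used for the allocation.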

* tag 'for-5.4/io_uring-2019-09-24' of git://git.kernel.dk/linux-block:
  io_uring: correctly handle non ->{read,write}_iter() file_operations
  io_uring: IORING_OP_TIMEOUT support
  io_uring: use cond_resched() in sqthread
  io_uring: fix potential crash issue due to io_get_req failure
  io_uring: ensure poll commands clear ->sqe
  io_uring: fix use-after-free of shadow_req
  io_uring: use kmemdup instead of kmalloc and memcpy

5 years agoMerge tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 24 Sep 2019 23:31:50 +0000 (16:31 -0700)]
Merge tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
 "Some later additions that weren't quite done for the first pull
  request, and also a few fixes that have arrived since.

  This contains:

   - Kill silly pktcdvd warning on attempting to register a non-scsi
     passthrough device (me)

   - Use symbolic constants for the block t10 protection types, and
     switch to handling it in core rather than in the drivers (Max)

   - libahci platform missing node put fix (Nishka)

   - Small series of fixes for BFQ (Paolo)

   - Fix possible nbd crash (Xiubo)"

* tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block:
  block: drop device references in bsg_queue_rq()
  block: t10-pi: fix -Wswitch warning
  pktcdvd: remove warning on attempting to register non-passthrough dev
  ata: libahci_platform: Add of_node_put() before loop exit
  nbd: fix possible page fault for nbd disk
  nbd: rename the runtime flags as NBD_RT_ prefixed
  block, bfq: push up injection only after setting service time
  block, bfq: increase update frequency of inject limit
  block, bfq: reduce upper bound for inject limit to max_rq_in_driver+1
  block, bfq: update inject limit only after injection occurred
  block: centralize PI remapping logic to the block layer
  block: use symbolic constants for t10_pi type

5 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Tue, 24 Sep 2019 23:10:23 +0000 (16:10 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge updates from Andrew Morton:

 - a few hot fixes

 - ocfs2 updates

 - almost all of -mm (slab-generic, slab, slub, kmemleak, kasan,
   cleanups, debug, pagecache, memcg, gup, pagemap, memory-hotplug,
   sparsemem, vmalloc, initialization, z3fold, compaction, mempolicy,
   oom-kill, hugetlb, migration, thp, mmap, madvise, shmem, zswap,
   zsmalloc)

* emailed patches from Andrew Morton <[email protected]>: (132 commits)
  mm/zsmalloc.c: fix a -Wunused-function warning
  zswap: do not map same object twice
  zswap: use movable memory if zpool support allocate movable memory
  zpool: add malloc_support_movable to zpool_driver
  shmem: fix obsolete comment in shmem_getpage_gfp()
  mm/madvise: reduce code duplication in error handling paths
  mm: mmap: increase sockets maximum memory size pgoff for 32bits
  mm/mmap.c: refine find_vma_prev() with rb_last()
  riscv: make mmap allocation top-down by default
  mips: use generic mmap top-down layout and brk randomization
  mips: replace arch specific way to determine 32bit task with generic version
  mips: adjust brk randomization offset to fit generic version
  mips: use STACK_TOP when computing mmap base address
  mips: properly account for stack randomization and stack guard gap
  arm: use generic mmap top-down layout and brk randomization
  arm: use STACK_TOP when computing mmap base address
  arm: properly account for stack randomization and stack guard gap
  arm64, mm: make randomization selected by generic topdown mmap layout
  arm64, mm: move generic mmap layout functions to mm
  arm64: consider stack randomization for mmap base only when necessary
  ...

5 years agomm/zsmalloc.c: fix a -Wunused-function warning
Qian Cai [Mon, 23 Sep 2019 22:39:46 +0000 (15:39 -0700)]
mm/zsmalloc.c: fix a -Wunused-function warning

set_zspage_inuse() was introduced in the commit 4f42047bbde0 ("zsmalloc:
use accessor") but all the users of it were removed later by the commits,

bdb0af7ca8f0 ("zsmalloc: factor page chain functionality out")
3783689a1aa8 ("zsmalloc: introduce zspage structure")

so the function can be safely removed now.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years agozswap: do not map same object twice
Vitaly Wool [Mon, 23 Sep 2019 22:39:43 +0000 (15:39 -0700)]
zswap: do not map same object twice

zswap_writeback_entry() maps a handle to read swpentry first, and
then in the most common case it would map the same handle again.
This is ok when zbud is the backend since its mapping callback is
plain and simple, but it slows things down for z3fold.

Since there's hardly a point in unmapping a handle _that_ fast as
zswap_writeback_entry() does when it reads swpentry, the
suggestion is to keep the handle mapped till the end.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vitaly Wool <[email protected]>
Reviewed-by: Dan Streetman <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago zswap: use movable memory if zpool support allocate movable memory
Hui Zhu [Mon, 23 Sep 2019 22:39:40 +0000 (15:39 -0700)]
zswap: use movable memory if zpool support allocate movable memory

This is the third version that was updated according to the comments from
Sergey Senozhatsky https://lkml.org/lkml/2019/5/29/73 and Shakeel Butt
https://lkml.org/lkml/2019/6/4/973

zswap compresses swap pages into a dynamically allocated RAM-based memory
pool.  The memory pool is one of zbud, z3fold or zsmalloc.  All of them
allocate unmovable pages, which increases the number of unmovable page
blocks and is bad for anti-fragmentation.

zsmalloc supports page migration if a movable page is requested:
        handle = zs_malloc(zram->mem_pool, comp_len,
                GFP_NOIO | __GFP_HIGHMEM |
                __GFP_MOVABLE);

Commit "zpool: Add malloc_support_movable to zpool_driver" adds
zpool_malloc_support_movable(), which checks malloc_support_movable to
determine whether a zpool supports allocating movable memory.

This commit lets zswap allocate blocks with gfp
__GFP_HIGHMEM | __GFP_MOVABLE if the zpool supports allocating movable memory.
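
The flag selection described above can be sketched in user space as follows.
This is only an illustration, not the zswap code: the SKETCH_* bit values and
the function name are hypothetical, and the boolean argument stands in for the
result of zpool_malloc_support_movable() on the pool in use.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real flags live in the kernel's gfp headers. */
#define SKETCH_GFP_HIGHMEM (1u << 8)
#define SKETCH_GFP_MOVABLE (1u << 9)

/*
 * Return the allocation flags for a compressed page.
 * base_gfp is whatever mask the caller already uses (assumption);
 * pool_supports_movable models zpool_malloc_support_movable().
 */
unsigned int sketch_zswap_gfp(unsigned int base_gfp, bool pool_supports_movable)
{
	unsigned int gfp = base_gfp;

	/* Only ask for movable highmem pages when the backend can migrate them. */
	if (pool_supports_movable)
		gfp |= SKETCH_GFP_HIGHMEM | SKETCH_GFP_MOVABLE;

	return gfp;
}
```

With a zbud or z3fold pool the mask is returned unchanged, so those backends
keep allocating exactly as before.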

The following is a test log from a PC that has 8G of memory and 2G of swap.

Without this commit:
~# echo lz4 > /sys/module/zswap/parameters/compressor
~# echo zsmalloc > /sys/module/zswap/parameters/zpool
~# echo 1 > /sys/module/zswap/parameters/enabled
~# swapon /swapfile
~# cd /home/teawater/kernel/vm-scalability/
/home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024))
/home/teawater/kernel/vm-scalability# ./case-anon-w-seq
2717908992 bytes / 4826062 usecs = 549973 KB/s
2717908992 bytes / 4864201 usecs = 545661 KB/s
2717908992 bytes / 4867015 usecs = 545346 KB/s
2717908992 bytes / 4915485 usecs = 539968 KB/s
397853 usecs to free memory
357820 usecs to free memory
421333 usecs to free memory
420454 usecs to free memory
/home/teawater/kernel/vm-scalability# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      1      1      1      0      2      1      1      0      1      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type    Unmovable      6      5      8      6      6      5      4      1      1      1      0
Node    0, zone    DMA32, type      Movable     25     20     20     19     22     15     14     11     11      5    767
Node    0, zone    DMA32, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type    Unmovable   4753   5588   5159   4613   3712   2520   1448    594    188     11      0
Node    0, zone   Normal, type      Movable     16      3    457   2648   2143   1435    860    459    223    224    296
Node    0, zone   Normal, type  Reclaimable      0      0     44     38     11      2      0      0      0      0      0
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node 0, zone      DMA            1            7            0            0            0            0
Node 0, zone    DMA32            4         1652            0            0            0            0
Node 0, zone   Normal          931         1485           15            0            0            0

With this commit:
~# echo lz4 > /sys/module/zswap/parameters/compressor
~# echo zsmalloc > /sys/module/zswap/parameters/zpool
~# echo 1 > /sys/module/zswap/parameters/enabled
~# swapon /swapfile
~# cd /home/teawater/kernel/vm-scalability/
/home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024))
/home/teawater/kernel/vm-scalability# ./case-anon-w-seq
2717908992 bytes / 4689240 usecs = 566020 KB/s
2717908992 bytes / 4760605 usecs = 557535 KB/s
2717908992 bytes / 4803621 usecs = 552543 KB/s
2717908992 bytes / 5069828 usecs = 523530 KB/s
431546 usecs to free memory
383397 usecs to free memory
456454 usecs to free memory
224487 usecs to free memory
/home/teawater/kernel/vm-scalability# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      1      1      1      0      2      1      1      0      1      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type    Unmovable     10      8     10      9     10      4      3      2      3      0      0
Node    0, zone    DMA32, type      Movable     18     12     14     16     16     11      9      5      5      6    775
Node    0, zone    DMA32, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      1
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type    Unmovable   2669   1236    452    118     37     14      4      1      2      3      0
Node    0, zone   Normal, type      Movable   3850   6086   5274   4327   3510   2494   1520    934    438    220    470
Node    0, zone   Normal, type  Reclaimable     56     93    155    124     47     31     17      7      3      0      0
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node 0, zone      DMA            1            7            0            0            0            0
Node 0, zone    DMA32            4         1650            2            0            0            0
Node 0, zone   Normal           79         2326           26            0            0            0

You can see that the number of unmovable page blocks decreases
when the kernel has this commit: in the Normal zone it drops from 931 to 79.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Hui Zhu <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago zpool: add malloc_support_movable to zpool_driver
Hui Zhu [Mon, 23 Sep 2019 22:39:37 +0000 (15:39 -0700)]
zpool: add malloc_support_movable to zpool_driver

As a zpool_driver, zsmalloc can allocate movable memory because it supports
page migration.  But zbud and z3fold cannot allocate movable memory.

Add malloc_support_movable to zpool_driver.  If a zpool_driver supports
allocating movable memory, set it to true.  Also add
zpool_malloc_support_movable(), which checks malloc_support_movable to
determine whether a zpool supports allocating movable memory.
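
A minimal sketch of the capability flag and its accessor, assuming trimmed-down
stand-ins for the kernel structures.  All sketch_* names are hypothetical; only
the field name malloc_support_movable and the role of the accessor come from
the commit text.

```c
#include <stdbool.h>

/* Hypothetical, reduced version of a zpool driver description. */
struct sketch_zpool_driver {
	const char *type;
	bool malloc_support_movable;	/* true if the backend can allocate movable pages */
};

/* Hypothetical pool handle that only remembers its driver. */
struct sketch_zpool {
	const struct sketch_zpool_driver *driver;
};

/* Plays the role of zpool_malloc_support_movable(): report the driver's capability. */
bool sketch_zpool_malloc_support_movable(const struct sketch_zpool *pool)
{
	return pool->driver->malloc_support_movable;
}

/* Per the commit text: zsmalloc supports page migration, zbud and z3fold do not. */
const struct sketch_zpool_driver sketch_zsmalloc = { "zsmalloc", true };
const struct sketch_zpool_driver sketch_zbud = { "zbud", false };
const struct sketch_zpool_driver sketch_z3fold = { "z3fold", false };
```

Keeping the capability in the driver description lets callers such as zswap
query it through the pool without knowing which backend is in use.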

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Hui Zhu <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago shmem: fix obsolete comment in shmem_getpage_gfp()
Miles Chen [Mon, 23 Sep 2019 22:39:34 +0000 (15:39 -0700)]
shmem: fix obsolete comment in shmem_getpage_gfp()

Replace "fault_mm" with "vmf" in a code comment because commit cfda05267f7b
("userfaultfd: shmem: add userfaultfd hook for shared memory faults") has
changed the prototype of shmem_getpage_gfp() to pass vmf instead of
fault_mm to the function.

Before:
static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
        struct page **pagep, enum sgp_type sgp,
        gfp_t gfp, struct mm_struct *fault_mm, int *fault_type);
After:
static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
        struct page **pagep, enum sgp_type sgp,
        gfp_t gfp, struct vm_area_struct *vma,
        struct vm_fault *vmf, vm_fault_t *fault_type);

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Miles Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mm/madvise: reduce code duplication in error handling paths
Mike Rapoport [Mon, 23 Sep 2019 22:39:31 +0000 (15:39 -0700)]
mm/madvise: reduce code duplication in error handling paths

madvise_behavior() converts -ENOMEM to -EAGAIN in several places using
identical code.

Move that code to a common error handling path.

No functional changes.
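
The shape of such a cleanup can be sketched as below.  This is not the
madvise_behavior() code: the function and its two arguments are hypothetical
and simply simulate the results of two steps that may each fail with -ENOMEM.

```c
#include <errno.h>

/*
 * Hypothetical two-step operation.  first_result and second_result model
 * the return values of two internal steps (0 or a negative errno).
 */
int sketch_madvise_behavior(int first_result, int second_result)
{
	int error;

	error = first_result;
	if (error)
		goto out;

	error = second_result;

out:
	/* Single common place where -ENOMEM is converted to -EAGAIN. */
	if (error == -ENOMEM)
		error = -EAGAIN;
	return error;
}
```

Every failing step jumps to the same label, so the conversion is written once
instead of being repeated after each step.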

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Pankaj Gupta <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mm: mmap: increase sockets maximum memory size pgoff for 32bits
Ivan Khoronzhuk [Mon, 23 Sep 2019 22:39:28 +0000 (15:39 -0700)]
mm: mmap: increase sockets maximum memory size pgoff for 32bits

The AF_XDP sockets umem mapping interface uses XDP_UMEM_PGOFF_FILL_RING
and XDP_UMEM_PGOFF_COMPLETION_RING offsets.  These offsets are
established already and are part of the configuration interface.

But on 32-bit systems using the AF_XDP socket configuration, these values
are too large to pass the maximum allowed file size verification.  The
offsets could be changed, but instead of changing the existing
interface, let's extend the max allowed file size for sockets.
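
The arithmetic behind the failing check can be sketched as follows.  This is a
user-space model under stated assumptions, not the kernel code: the two ring
offsets are quoted from memory of the AF_XDP UAPI header, 4K pages are assumed,
and the "old" and "new" limits model a 32-bit ULONG_MAX and a 32-bit
ULONG_MAX << PAGE_SHIFT respectively.  All SKETCH_* and sketch_* names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K pages */

/* Byte offsets of the UMEM rings (assumed values, see lead-in). */
#define SKETCH_PGOFF_FILL_RING       0x100000000ULL
#define SKETCH_PGOFF_COMPLETION_RING 0x180000000ULL

/* Modelled 32-bit limits: the old default for sockets and the extended one. */
#define SKETCH_OLD_MAX 0xffffffffULL
#define SKETCH_NEW_MAX (0xffffffffULL << SKETCH_PAGE_SHIFT)

/* True if a mapping of len bytes at page offset pgoff stays within max_size. */
bool sketch_mmap_ok(uint64_t max_size, uint64_t pgoff, uint64_t len)
{
	if (len > max_size)
		return false;
	return pgoff <= ((max_size - len) >> SKETCH_PAGE_SHIFT);
}
```

Under these assumptions both ring offsets, converted to page offsets, exceed
the old limit and fit within the extended one.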

Nobody can have been using this on 32-bit systems before this patch,
because without this fix af_xdp sockets can't be used there at all, so it
unblocks af_xdp socket usage for 32-bit systems.

The full list of mmap callbacks for sockets was checked for side effects.
At this moment all of them use the dummy callback sock_no_mmap(), except
the following:

xsk_mmap() - it's what this fix is needed for.
tcp_mmap() - has no obvious issues with pgoff, as it does not reference it.
packet_mmap() - returns -EINVAL if pgoff is set at all.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ivan Khoronzhuk <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Björn Töpel <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Magnus Karlsson <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mm/mmap.c: refine find_vma_prev() with rb_last()
Wei Yang [Mon, 23 Sep 2019 22:39:25 +0000 (15:39 -0700)]
mm/mmap.c: refine find_vma_prev() with rb_last()

When addr is out of range of the whole rb_tree, pprev will point to the
right-most node.  rb_tree facility already provides a helper function,
rb_last(), to do this task.  We can leverage this instead of
reimplementing it.

This patch refines find_vma_prev() with rb_last() to make it a little
nicer to read.
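
What "right-most node" means can be shown on a plain binary search tree.  The
sketch below is not the kernel rb-tree API; the node type and helper name are
hypothetical and only model the job that rb_last() does for find_vma_prev()
when addr lies beyond every mapping.

```c
#include <stddef.h>

/* Hypothetical search-tree node ordered by start address. */
struct sketch_node {
	unsigned long start;
	struct sketch_node *left;
	struct sketch_node *right;
};

/* Return the right-most (highest-keyed) node, or NULL for an empty tree. */
struct sketch_node *sketch_tree_last(struct sketch_node *root)
{
	if (!root)
		return NULL;

	/* The largest key is found by following right children to the end. */
	while (root->right)
		root = root->right;

	return root;
}
```

Calling a helper like this replaces an open-coded loop that walks right
children by hand.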

[[email protected]: little cleanup, per Vlastimil]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei Yang <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago riscv: make mmap allocation top-down by default
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:21 +0000 (15:39 -0700)]
riscv: make mmap allocation top-down by default

In order to avoid wasting user address space with the bottom-up mmap
allocation scheme, prefer the top-down scheme when possible.

Before:
root@qemuriscv64:~# cat /proc/self/maps
00010000-00016000 r-xp 00000000 fe:00 6389       /bin/cat.coreutils
00016000-00017000 r--p 00005000 fe:00 6389       /bin/cat.coreutils
00017000-00018000 rw-p 00006000 fe:00 6389       /bin/cat.coreutils
00018000-00039000 rw-p 00000000 00:00 0          [heap]
1555556000-155556d000 r-xp 00000000 fe:00 7193   /lib/ld-2.28.so
155556d000-155556e000 r--p 00016000 fe:00 7193   /lib/ld-2.28.so
155556e000-155556f000 rw-p 00017000 fe:00 7193   /lib/ld-2.28.so
155556f000-1555570000 rw-p 00000000 00:00 0
1555570000-1555572000 r-xp 00000000 00:00 0      [vdso]
1555574000-1555576000 rw-p 00000000 00:00 0
1555576000-1555674000 r-xp 00000000 fe:00 7187   /lib/libc-2.28.so
1555674000-1555678000 r--p 000fd000 fe:00 7187   /lib/libc-2.28.so
1555678000-155567a000 rw-p 00101000 fe:00 7187   /lib/libc-2.28.so
155567a000-15556a0000 rw-p 00000000 00:00 0
3fffb90000-3fffbb1000 rw-p 00000000 00:00 0      [stack]

After:
root@qemuriscv64:~# cat /proc/self/maps
00010000-00016000 r-xp 00000000 fe:00 6389       /bin/cat.coreutils
00016000-00017000 r--p 00005000 fe:00 6389       /bin/cat.coreutils
00017000-00018000 rw-p 00006000 fe:00 6389       /bin/cat.coreutils
2de81000-2dea2000 rw-p 00000000 00:00 0          [heap]
3ff7eb6000-3ff7ed8000 rw-p 00000000 00:00 0
3ff7ed8000-3ff7fd6000 r-xp 00000000 fe:00 7187   /lib/libc-2.28.so
3ff7fd6000-3ff7fda000 r--p 000fd000 fe:00 7187   /lib/libc-2.28.so
3ff7fda000-3ff7fdc000 rw-p 00101000 fe:00 7187   /lib/libc-2.28.so
3ff7fdc000-3ff7fe2000 rw-p 00000000 00:00 0
3ff7fe4000-3ff7fe6000 r-xp 00000000 00:00 0      [vdso]
3ff7fe6000-3ff7ffd000 r-xp 00000000 fe:00 7193   /lib/ld-2.28.so
3ff7ffd000-3ff7ffe000 r--p 00016000 fe:00 7193   /lib/ld-2.28.so
3ff7ffe000-3ff7fff000 rw-p 00017000 fe:00 7193   /lib/ld-2.28.so
3ff7fff000-3ff8000000 rw-p 00000000 00:00 0
3fff888000-3fff8a9000 rw-p 00000000 00:00 0      [stack]

[[email protected]: v6]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Acked-by: Paul Walmsley <[email protected]> [arch/riscv]
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mips: use generic mmap top-down layout and brk randomization
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:18 +0000 (15:39 -0700)]
mips: use generic mmap top-down layout and brk randomization

mips uses a top-down layout by default that exactly fits the generic
functions, so get rid of arch specific code and use the generic version by
selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.

As ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT selects ARCH_HAS_ELF_RANDOMIZE,
use the generic version of arch_randomize_brk since it also fits.  Note
that this commit also removes the possibility for mips to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Paul Burton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mips: replace arch specific way to determine 32bit task with generic version
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:14 +0000 (15:39 -0700)]
mips: replace arch specific way to determine 32bit task with generic version

Mips uses TASK_IS_32BIT_ADDR to determine if a task is 32bit, but this
define is mips specific and other arches do not have it: instead, use
!IS_ENABLED(CONFIG_64BIT) || is_compat_task() condition.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Paul Burton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mips: adjust brk randomization offset to fit generic version
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:11 +0000 (15:39 -0700)]
mips: adjust brk randomization offset to fit generic version

This commit simply bumps the random offset of brk up to 32MB and 1GB,
from 8MB and 256MB, for 32bit and 64bit respectively.
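
The size selection can be sketched as below, combined with the generic 32bit
test "!IS_ENABLED(CONFIG_64BIT) || is_compat_task()" quoted in the entry above.
This is an illustration only: the two boolean parameters are hypothetical
stand-ins for IS_ENABLED(CONFIG_64BIT) and is_compat_task(), and the sketch
returns just the randomization range, not a randomized address.

```c
#include <stdbool.h>

#define SKETCH_SZ_32M (32UL << 20)
#define SKETCH_SZ_1G  (1UL << 30)

/* A task is 32bit on a 32bit kernel, or when it is a compat task on a 64bit one. */
bool sketch_task_is_32bit(bool kernel_is_64bit, bool compat_task)
{
	return !kernel_is_64bit || compat_task;
}

/* Range within which the brk start may be randomized, per the commit text. */
unsigned long sketch_brk_rnd_range(bool kernel_is_64bit, bool compat_task)
{
	if (sketch_task_is_32bit(kernel_is_64bit, compat_task))
		return SKETCH_SZ_32M;
	return SKETCH_SZ_1G;
}
```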

Link: http://lkml.kernel.org/r/[email protected]
Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Paul Burton <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mips: use STACK_TOP when computing mmap base address
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:07 +0000 (15:39 -0700)]
mips: use STACK_TOP when computing mmap base address

The mmap base address must be computed with respect to the stack top
address; using TASK_SIZE is wrong since STACK_TOP and TASK_SIZE are not
equivalent.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Kees Cook <[email protected]>
Acked-by: Paul Burton <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mips: properly account for stack randomization and stack guard gap
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:04 +0000 (15:39 -0700)]
mips: properly account for stack randomization and stack guard gap

This commit takes care of stack randomization and stack guard gap when
computing mmap base address and checks if the task asked for
randomization.  This fixes the problem uncovered and not fixed for arm
here: https://lkml.kernel.org/r/20170622200033[email protected]
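
A user-space sketch of an mmap base computation that accounts for both the
stack guard gap and stack randomization is given below.  It is a model under
stated assumptions rather than the kernel function: every value is passed in
as a parameter, the 4K page size, the 128MB minimum gap and the 5/6 maximum gap
are illustrative constants, and all sketch_* names are hypothetical.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL		/* assumption: 4K pages */
#define SKETCH_MIN_GAP   (128UL << 20)	/* illustrative lower bound for the gap */

/*
 * stack_top:      highest stack address (the role STACK_TOP plays)
 * stack_rlimit:   current stack size limit
 * guard_gap:      size of the stack guard gap
 * stack_rnd_span: maximum stack randomization offset
 * mmap_rnd:       random offset applied to the mmap base itself
 * randomize:      whether the task asked for randomization
 */
unsigned long sketch_mmap_base(unsigned long stack_top,
			       unsigned long stack_rlimit,
			       unsigned long guard_gap,
			       unsigned long stack_rnd_span,
			       unsigned long mmap_rnd,
			       bool randomize)
{
	unsigned long gap = stack_rlimit;
	unsigned long pad = guard_gap;
	unsigned long max_gap = stack_top / 6 * 5;	/* illustrative upper bound */
	unsigned long base;

	/* Reserve room for stack randomization only if the task wants it. */
	if (randomize)
		pad += stack_rnd_span;

	/* A near-unlimited rlimit would overflow; leave gap untouched then. */
	if (gap + pad > gap)
		gap += pad;

	if (gap < SKETCH_MIN_GAP)
		gap = SKETCH_MIN_GAP;
	else if (gap > max_gap)
		gap = max_gap;

	/* Round the result up to a page boundary. */
	base = stack_top - gap - mmap_rnd;
	return (base + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

With a 2GB stack top, a 256MB stack limit and a 1MB guard gap, turning
randomization on with an 8MB stack randomization span lowers the base by
exactly those 8MB.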

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Kees Cook <[email protected]>
Acked-by: Paul Burton <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm: use generic mmap top-down layout and brk randomization
Alexandre Ghiti [Mon, 23 Sep 2019 22:39:01 +0000 (15:39 -0700)]
arm: use generic mmap top-down layout and brk randomization

arm uses a top-down mmap layout by default that exactly fits the generic
functions, so get rid of arch specific code and use the generic version by
selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.

As ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT selects ARCH_HAS_ELF_RANDOMIZE,
use the generic version of arch_randomize_brk since it also fits.  Note
that this commit also removes the possibility for arm to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Note that it is safe to remove STACK_RND_MASK since it matches the default
value.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm: use STACK_TOP when computing mmap base address
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:57 +0000 (15:38 -0700)]
arm: use STACK_TOP when computing mmap base address

The mmap base address must be computed with respect to the stack top
address; using TASK_SIZE is wrong since STACK_TOP and TASK_SIZE are not
equivalent.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm: properly account for stack randomization and stack guard gap
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:54 +0000 (15:38 -0700)]
arm: properly account for stack randomization and stack guard gap

This commit takes care of stack randomization and stack guard gap when
computing mmap base address and checks if the task asked for
randomization.  This fixes the problem uncovered and not fixed for arm
here: https://lkml.kernel.org/r/20170622200033[email protected]

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm64, mm: make randomization selected by generic topdown mmap layout
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:50 +0000 (15:38 -0700)]
arm64, mm: make randomization selected by generic topdown mmap layout

This commit selects ARCH_HAS_ELF_RANDOMIZE when an arch uses the generic
topdown mmap layout functions so that this security feature is on by
default.

Note that this commit also removes the possibility for arm64 to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm64, mm: move generic mmap layout functions to mm
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:47 +0000 (15:38 -0700)]
arm64, mm: move generic mmap layout functions to mm

arm64 handles top-down mmap layout in a way that can be easily reused by
other architectures, so make it available in mm.  This commit then
introduces a new config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT that can be
set by other architectures to benefit from those functions.  Note that
this new config depends on MMU being enabled: if it is selected without
MMU support, a warning will be thrown.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Suggested-by: Christoph Hellwig <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm64: consider stack randomization for mmap base only when necessary
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:43 +0000 (15:38 -0700)]
arm64: consider stack randomization for mmap base only when necessary

Do not offset mmap base address because of stack randomization if current
task does not want randomization.  Note that x86 already implements this
behaviour.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago arm64: make use of is_compat_task instead of hardcoding this test
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:40 +0000 (15:38 -0700)]
arm64: make use of is_compat_task instead of hardcoding this test

Each architecture has its own way to determine if a task is a compat task.
Using is_compat_task in arch_mmap_rnd makes the code more generic and
prepares for moving it to mm/.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
5 years ago mm, fs: move randomize_stack_top from fs to mm
Alexandre Ghiti [Mon, 23 Sep 2019 22:38:37 +0000 (15:38 -0700)]
mm, fs: move randomize_stack_top from fs to mm

Patch series "Provide generic top-down mmap layout functions", v6.

This series introduces generic functions to make top-down mmap layout
easily accessible to architectures, in particular riscv which was the
initial goal of this series.  The generic implementation was taken from
arm64 and used successively by arm, mips and finally riscv.

Note that in addition the series fixes 2 issues:

- stack randomization was taken into account even if not necessary.

- [1] fixed an issue with mmap base which did not take into account
  randomization, but the fix was not carried over to arm and mips, so by
  moving arm64 into a generic library, this problem is now fixed for both
  architectures.

This work is an effort to factor out architecture functions to avoid code
duplication and oversights as in [1].

[1]: https://www.mail-archive.com/[email protected]/msg1429066.html

This patch (of 14):

This preparatory commit moves randomize_stack_top() so that further
introduction of generic topdown mmap layout is contained only in mm/util.c.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexandre Ghiti <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>